Mmap new file to existing pointer instead of munmap
Question
I am using mmap in Go. After I mmap a file, the resulting slice is shared across all goroutines.
I then want to update the file's data (new size and new data layout). If I munmap it, any other goroutine that accesses the freed memory region will cause a segfault.
So instead of calling munmap, I create a new file with the updated data and mmap that file onto the old pointer. Will this work, or will it cause a memory leak?
// mmap a file
b, err := syscall.Mmap(fdOldFile, 0, int(dataSize), syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
// mmap new file with new size
nb, e := syscall.Mmap(fdNewFile, 0, int(newSize), syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
// write data to the new file using the new data layout
// ...
// munmap b will cause a segfault if b is being used in another goroutine
// syscall.Munmap(b)
os.Remove(oldFile)
os.Rename(newFile, oldFile)
syscall.Munmap(nb)
// set b = new b instead (syscall.Mmap returns a slice and an error)
b, err = syscall.Mmap(fdNewFile, 0, int(newSize), syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
Answer 1
Score: 1
The code in your example keeps the old file memory-mapped, because the kernel keeps a mapping alive until you unmap it or the process exits. The syscall package (like golang.org/x/sys/unix) also keeps its own record of every active mapping, so the mapping is not released even if you lose your reference to the slice.
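To illustrate that a mapping outlives the file descriptor and even the directory entry, here is a minimal sketch. It assumes a Unix-like system where syscall.Mmap is available; the function name mapSurvivesUnlink and the temporary file are made up for the demonstration.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// mapSurvivesUnlink (hypothetical name) maps a temporary file, then closes
// and deletes it, and returns what is still readable through the mapping.
func mapSurvivesUnlink() (string, error) {
	f, err := os.CreateTemp("", "mmap-demo-*")
	if err != nil {
		return "", err
	}
	name := f.Name()
	if _, err := f.WriteString("hello"); err != nil {
		f.Close()
		os.Remove(name)
		return "", err
	}
	b, err := syscall.Mmap(int(f.Fd()), 0, 5, syscall.PROT_READ, syscall.MAP_SHARED)
	// Closing the descriptor and removing the file does not undo the mapping.
	f.Close()
	os.Remove(name)
	if err != nil {
		return "", err
	}
	// Only Munmap (or process exit) releases the region.
	defer syscall.Munmap(b)
	return string(b), nil
}

func main() {
	s, err := mapSurvivesUnlink()
	fmt.Println(s, err)
}
```

This is why the pattern in the question, which removes the old file but never unmaps b, leaves the old mapping in place for the lifetime of the process.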
The proper way to replace the file behind the same address is to call the mmap syscall again with that address (together with the MAP_FIXED flag; without it the address is only a hint). However, the syscall.Mmap wrapper does not let you specify the address parameter; it is always 0, which means the kernel picks an address that is not currently in use.
You can also grow or shrink the existing region with the mremap syscall, but no wrapper exists for this syscall in the stdlib. The most likely reason for these limitations is that when you change an existing mapping, the length may change. Go returns a []byte, which internally has a cap and a len value. So if the size of the underlying array changes but the len does not, you can get segfaults. And since the len and cap are passed by value, the stdlib can't update these slices when it changes the underlying memory.
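The point about len and cap being passed by value can be seen with ordinary slices. The following small sketch uses made-up names (shrink, headerLens) and a heap-allocated slice rather than a real mapping, purely to show that each copy of a slice header keeps its own length.

```go
package main

import "fmt"

// shrink reslices its own copy of the slice header; the caller's header
// is not affected.
func shrink(b []byte) int {
	b = b[:1]
	return len(b)
}

// headerLens returns the length seen inside shrink, by the caller, and by
// a subslice taken before the call.
func headerLens() (inner, outer, sub int) {
	b := make([]byte, 4)
	s := b[1:3]
	inner = shrink(b)
	return inner, len(b), len(s)
}

func main() {
	fmt.Println(headerLens()) // prints "1 4 2"
}
```

If b were a mapping that had just been shrunk, the outer slice and the subslice would still report their old lengths and could index past the end of the new mapping.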
So, in order to do this, assuming you still want to, you have to:
- Expose the internal syscall.mmap function, which does allow you to specify address:
import _ "unsafe"
//go:linkname mmap syscall.mmap
func mmap(addr uintptr, length uintptr, prot int, flags int, fd int, offset int64) (xaddr uintptr, err error)
- You should still use syscall.Mmap for the initial allocation, because there are a few requirements on the address and it is better to let the kernel pick a good one; after that you can change what is mapped there. You will need to use reflection and unsafe pointer casting to get the address from the []byte you got back from syscall.Mmap.
- If you are going to pass a different length, you must also change the len of all copies of the []byte, including subslices, to avoid segfaults. If you use the exact same length every time, this should not be an issue.
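As a rough sketch of the steps above, the following program maps one file with syscall.Mmap, takes the address of the returned slice, and maps a second file of the same length at that address with MAP_FIXED. Note the assumptions: it is Linux-only, it issues the raw syscall through syscall.Syscall6 with SYS_MMAP (which exists on amd64 and arm64) instead of linking to the internal syscall.mmap, and the helpers writeTemp, remapFixed, and demo are hypothetical names invented here. It does not address synchronization with other goroutines that read the slice during the swap.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
	"unsafe"
)

// writeTemp (hypothetical helper) creates a temporary file holding data.
func writeTemp(data []byte) (*os.File, error) {
	f, err := os.CreateTemp("", "remap-demo-*")
	if err != nil {
		return nil, err
	}
	if _, err := f.Write(data); err != nil {
		f.Close()
		os.Remove(f.Name())
		return nil, err
	}
	return f, nil
}

// remapFixed maps the first len(b) bytes of fd at the address b already
// occupies. With MAP_FIXED the kernel discards the old mapping there.
// b must be non-empty and must have come from a previous mmap call.
func remapFixed(b []byte, fd int) error {
	addr := uintptr(unsafe.Pointer(&b[0]))
	_, _, errno := syscall.Syscall6(syscall.SYS_MMAP, addr, uintptr(len(b)),
		uintptr(syscall.PROT_READ|syscall.PROT_WRITE),
		uintptr(syscall.MAP_SHARED|syscall.MAP_FIXED),
		uintptr(fd), 0)
	if errno != 0 {
		return errno
	}
	return nil
}

// demo returns what the same slice shows before and after the swap.
func demo() (before, after string, err error) {
	oldF, err := writeTemp([]byte("old!"))
	if err != nil {
		return "", "", err
	}
	defer os.Remove(oldF.Name())
	defer oldF.Close()
	newF, err := writeTemp([]byte("new!"))
	if err != nil {
		return "", "", err
	}
	defer os.Remove(newF.Name())
	defer newF.Close()

	// Let the kernel choose the address for the initial mapping.
	b, err := syscall.Mmap(int(oldF.Fd()), 0, 4,
		syscall.PROT_READ|syscall.PROT_WRITE, syscall.MAP_SHARED)
	if err != nil {
		return "", "", err
	}
	defer syscall.Munmap(b)

	before = string(b)
	if err := remapFixed(b, int(newF.Fd())); err != nil {
		return before, "", err
	}
	return before, string(b), nil
}

func main() {
	before, after, err := demo()
	fmt.Println(before, after, err)
}
```

Because the length stays the same here, existing copies of the slice header remain valid. With a different length, every copy and subslice would have to be recreated, as the last bullet describes.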
So, TL;DR: you need to be very sure of what you are doing, or you will get some nasty bugs, but it can be done.