在读取和修改切片之前,先锁定它。

huangapple go评论81阅读模式
英文:

Lock slice before reading and modifying it

问题

我最近开始接触Go语言,并且在审查一些代码时发现,虽然它是写保护的,但在读取数据时存在问题。问题不在于读取本身,而是在读取和修改切片之间可能发生的修改。

如下所示,写入是通过以下方式进行保护的:

type ConcurrentSlice struct {
	sync.RWMutex
	items []Item
}

type Item struct {
	Index int
	Value Info
}

type Info struct {
    Name        string 
	Labels      map[string]string
	Failure     bool
}

但是,在收集切片内容和修改切片之间,可能会发生修改。可能会有另一个例程修改相同的切片,当需要分配一个值时,它可能已经不存在:slice[i] = item

应该如何正确处理这种情况呢?

我已经实现了以下方法:

func GetList() *ConcurrentSlice {
	if list == nil {
		denylist = NewConcurrentSlice()
		return denylist
	}
	return denylist
}

我像这样使用它:

concurrentSlice := GetList()
concurrentSlice.UpdateOrAppend(item)

但是,我理解在获取和修改之间,即使几乎是立即的,另一个例程可能已经修改了切片。如何以原子方式执行这两个操作才是正确的?即我读取的切片是我要修改的切片的100%。因为如果我尝试将一个项目分配给一个不存在的索引,它将中断执行。

提前谢谢!

英文:

My experience working with Go is recent and in reviewing some code, I have seen that while it is write-protected, there is a problem with reading the data. Not with the reading itself, but with possible modifications that can occur between the reading and the modification of the slice.

type ConcurrentSlice struct {
	sync.RWMutex
	items []Item
}

type Item struct {
	Index int
	Value Info
}

type Info struct {
    Name        string 
	Labels      map[string]string
	Failure     bool

}

As mentioned, the writing is protected in this way:


func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
	found := false
	i := 0
	for inList := range cs.Iter() {
		if item.Name == inList.Value.Name{
			cs.items[i] = item
			found = true
		}
		i++
	}
	if !found {
		cs.Lock()
		defer cs.Unlock()

		cs.items = append(cs.items, item)
	}
}

func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
	c := make(chan ConcurrentSliceItem)

	f := func() {
		cs.Lock()
		defer cs.Unlock()
		for index, value := range cs.items {
			c <- ConcurrentSliceItem{index, value}
		}
		close(c)
	}
	go f()

	return c
}

But between collecting the content of the slice and modifying it, modifications can occur.It may be that another routine modifies the same slice and when it is time to assign a value, it no longer exists: slice[i] = item

What would be the right way to deal with this?

I have implemented this method:

func GetList() *ConcurrentSlice {
	if list == nil {
		denylist = NewConcurrentSlice()
		return denylist
	}
	return denylist
}

And I use it like this:

concurrentSlice := GetList()
concurrentSlice.UpdateOrAppend(item)

But I understand that between the get and the modification, even if it is practically immediate, another routine may have modified the slice. What would be the correct way to perform the two operations atomically? That the slice I read is 100% the one I modify. Because if I try to assign an item to a index that no longer exists, it will break the execution.

Thank you in advance!

答案1

得分: 3

你正在进行的阻塞方式是不正确的,因为它不能确保返回的项目没有被删除。在更新的情况下,数组仍然至少具有相同的长度。

一个更简单的可行解决方案可能是以下代码:

func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
    found := false
    i := 0
    cs.Lock()
    defer cs.Unlock()

    for _, it := range cs.items {
        if item.Name == it.Name{
            cs.items[i] = it
            found = true
        }
        i++
    }
    if !found {
        cs.items = append(cs.items, item)
    }
}
英文:

The way you are doing the blocking is incorrect, because it does not ensure that the items you return have not been removed. In case of an update, the array would still be at least the same length.

A simpler solution that works could be the following:

func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
    found := false
    i := 0
    cs.Lock()
    defer cs.Unlock()

    for _, it := range cs.items {
        if item.Name == it.Name{
            cs.items[i] = it
            found = true
        }
        i++
    }
    if !found {
        cs.items = append(cs.items, item)
    }
}

答案2

得分: 2

如果值的顺序不重要,可以使用sync.Map

type Items struct {
    m sync.Map
}

func (items *Items) Update(item Info) {
    items.m.Store(item.Name, item)
}

func (items *Items) Range(f func(Info) bool) {
    items.m.Range(func(key, value any) bool {
        return f(value.(Info))
    })
}
英文:

Use a sync.Map if the order of the values is not important.

type Items struct {
	m sync.Map
}

func (items *Items) Update(item Info) {
	items.m.Store(item.Name, item)
}

func (items *Items) Range(f func(Info) bool) {
	items.m.Range(func(key, value any) bool {
		return f(value.(Info))
	})
}

答案3

得分: 2

数据结构101:始终选择最适合你使用情况的数据结构。如果你要通过名称查找对象,那么map正是为此而设计的。如果你仍然需要保持项目的顺序,你可以使用treemap。

并发编程101:就像事务一样,你的互斥锁应该是原子的、一致的和隔离的。你在这里失败了隔离,因为数据结构读取不在互斥锁内。

你的代码应该类似于这样:

func {
 mutex.lock
 defer mutex.unlock
 检查map或treemap中的名称
 如果存在则更新
 否则添加
}
英文:

Data structures 101: always pick the best data structure for your use case. If you’re going to be looking up objects by name, that’s EXACTLY what map is for. If you still need to maintain the order of the items, you use a treemap

Concurrency 101: like transactions, your mutex should be atomic, consistent, and isolated. You’re failing isolation here because the data structure read does not fall inside your mutex lock.

Your code should look something like this:

func {
 mutex.lock
 defer mutex.unlock
 check map or treemap for name
 if exists update
 else add
}

答案4

得分: 1

经过一些测试,我可以说你担心的情况确实可能发生在sync.RWMutex中。我认为使用sync.Mutex也可能发生,但我无法复现。也许我缺少一些信息,或者调用的顺序是有序的,因为它们都被阻塞,而它们重新获取锁的顺序是有序的。

为了确保你的两个调用在没有其他例程“冲突”的情况下安全进行,一种方法是为每个对象上的每个任务使用另一个互斥锁。在读取和写入之前,你需要锁定该互斥锁,并在完成后释放它。你还必须在任何其他写入(或读取)该对象的调用上使用该互斥锁。你可以在这里的main.go文件中找到我所说的实现。为了复现使用RWMutex的问题,你只需注释掉startTask和endTask的调用,问题就会在终端输出中可见。

编辑:我的第一个答案是错误的,因为我错误地解释了一个测试结果,并陷入了OP描述的情况中。

英文:

After some tests, I can say that the situation you fear can indeed happen with sync.RWMutex. I think it could happen with sync.Mutex too, but I can't reproduce that. Maybe I'm missing some informations, or maybe the calls are in order because they all are blocked and the order they redeem the right to lock is ordered in some way.

One way to keep your two calls safe without other routines getting in 'conflict' would be to use an other mutex, for every task on that object. You would lock that mutex before your read and write, and release it when you're done. You would also have to use that mutex on any other call that write (or read) to that object. You can find an implementation of what I'm talking about here in the main.go file. In order to reproduce the issue with RWMutex, you can simply comment the startTask and the endTask calls and the issue is visible in the terminal output.

EDIT : my first answer was wrong as I misinterpreted a test result, and fell in the situation described by OP.

答案5

得分: 0

如果ConcurrentSlice只会从单个goroutine中使用,那么锁是不必要的,因为算法的编写方式不会导致对切片元素或切片的并发读写。

如果ConcurrentSlice将从多个goroutine中使用,现有的锁是不够的。这是因为UpdateOrAppend可能会并发修改切片元素。

一个安全的版本需要两个版本的Iter

这个版本可以被ConcurrentSlice的用户调用,但不能从UpdateOrAppend中调用:

func (cs *ConcurrentSlice) Iter() <-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)

    f := func() {
        cs.RLock()
        defer cs.RUnlock()
        for index, value := range cs.items {
            c <- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()

    return c
}

而这个版本只能从UpdateOrAppend中调用:

func (cs *ConcurrentSlice) internalIter() <-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)

    f := func() {
        // No locking
        for index, value := range cs.items {
            c <- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()

    return c
}

UpdateOrAppend应该在顶层进行同步:

func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
    cs.Lock()
    defer cs.Unlock()
    // ...
}

长篇版如下:

这是一段有趣的代码。根据我对Go内存模型的理解,Iter()中的互斥锁只在有其他goroutine在处理该代码时才是必需的,即使有其他goroutine,代码中也存在可能的竞争条件。然而,UpdateOrAppend只会修改比Iter正在处理的切片元素的索引低的元素,因此该竞争条件永远不会出现。

竞争条件可能发生如下:

  1. Iter中的for循环读取切片的第0个元素。
  2. 该元素通过通道发送。因此,切片接收发生在第一步之后。
  3. 接收端可能会并发更新切片的第0个元素。到这里还没有问题。
  4. 然后发送goroutine读取切片的第1个元素。这时可能发生竞争条件。如果第3步更新了切片的索引1,那么第4步的读取就是一个竞争条件。也就是说,如果第3步读取了第4步所做的更新,那就是一个竞争条件。如果你在UpdateOrAppend中使用i:=1开始,并使用-race标志运行它,你就会看到这一点。

但是,UpdateOrAppend始终修改Iter在i=0时已经看到的切片元素,因此即使没有锁,这段代码也是安全的。

如果将有其他goroutine访问和修改该结构,你需要互斥锁,但你需要用它来保护完整的UpdateOrAppend方法,因为只有一个goroutine应该被允许运行该方法。你需要互斥锁来保护第一个for循环中的潜在更新,并且该互斥锁还必须包括切片追加的情况,因为那可能实际上修改底层对象的切片。

如果Iter只能从UpdateOrAppend中调用,那么这个单一的互斥锁应该足够了。然而,如果Iter可以从多个goroutine中调用,那么还存在另一个竞争条件的可能性。如果一个UpdateOrAppend与多个Iter实例并发运行,那么其中一些Iter实例将同时从修改后的切片元素中读取,导致竞争条件。因此,只有在没有UpdateOrAppend调用时,才能允许多个Iter同时运行。这就是一个RWMutex。

但是,Iter可以在带锁的UpdateOrAppend中调用,所以它实际上不能调用RLock,否则会发生死锁。

因此,你需要两个版本的Iter:一个可以在UpdateOrAppend之外调用,并在goroutine中发出RLock,另一个只能从UpdateOrAppend中调用,并且不调用RLock

英文:

tl;dr;

If ConcurrentSlice is to be used from a single goroutine, the locks are unnecessary, because the way algorithm written there is not going to be any concurrent read/writes to slice elements, or the slice.

If ConcurrentSlice is to be used from multiple goroutines, existings locks are not sufficient. This is because UpdateOrAppend may modify slice elements concurrently.

A safe version woule need two versions of Iter:

This can be called by users of ConcurrentSlice, but it cannot be called from `UpdateOrAppend:

func (cs *ConcurrentSlice) Iter() &lt;-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)

    f := func() {
        cs.RLock()
        defer cs.RUnlock()
        for index, value := range cs.items {
            c &lt;- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()

    return c
}

and this is only to be called from UpdateOrAppend:

func (cs *ConcurrentSlice) internalIter() &lt;-chan ConcurrentSliceItem {
    c := make(chan ConcurrentSliceItem)

    f := func() {
        // No locking
        for index, value := range cs.items {
            c &lt;- ConcurrentSliceItem{index, value}
        }
        close(c)
    }
    go f()

    return c
}

And UpdateOrAppend should be synchronized at the top level:

func (cs *ConcurrentSlice) UpdateOrAppend(item ScalingInfo) {
cs.Lock()
defer cs.Unlock()
....
}

Here's the long version:

This is an interesting piece of code. Based on my understanding of the go memory model, the mutex lock in Iter() is only necessary if there is another goroutine working on this code, and even with that, there is a possible race in the code. However, UpdateOrAppend only modifies elements of the slice with lower indexes than what Iter is working on, so that race never manifests itself.

The race can happen as follows:

  1. The for-loop in iter reads element 0 of the slice
  2. The element is sent through the channel. Thus, the slice receive happens after the first step.
  3. The receiving end potentially updates element 0 of the slice. There is no problem up to here.
  4. Then the sending goroutine reads element 1 of the slice. This is when a race can happen. If step 3 updated index 1 of the slice, the read at step 4 is a race. That is: if step 3 reads the update done by step 4, it is a race. You can see this if you start with i:=1 in UpdateOrAppend, and running it with the -race flag.

But UpdateOrAppend always modifies slice elements that are already seen by Iter when i=0, so this code is safe, even without the lock.

If there will be other goroutines accessing and modifying the structure, you need the Mutex, but you need it to protect the complete UpdateOrAppend method, because only one goroutine should be allowed to run that. You need the mutex to protect the potential updates in the first for-loop, and that mutex has to also include the slice append case, because that may actually modify the slice of the underlying object.

If Iter is only called from UpdateOrAppend, then this single mutex should be sufficient. If however Iter can be called from multiple goroutines, then there is another race possibility. If one UpdateOrAppend is running concurrently with multiple Iter instances, then some of those Iter instances will read from the modified slice elements concurrently, causing a race. So, it should be such that multiple Iters can only run if there are no UpdateOrAppend calls. That is a RWMutex.

But Iter can be called from UpdateOrAppend with a lock, so it cannot really call RLock, otherwise it is a deadlock.

Thus, you need two versions of Iter: one that can be called outside UpdateOrAppend, and that issues RLock in the goroutine, and another that can only be called from UpdateOrAppend and does not call RLock.

huangapple
  • 本文由 发表于 2022年9月27日 03:59:26
  • 转载请务必保留本文链接:https://go.coder-hub.com/73859360.html
匿名

发表评论

匿名网友

:?: :razz: :sad: :evil: :!: :smile: :oops: :grin: :eek: :shock: :???: :cool: :lol: :mad: :twisted: :roll: :wink: :idea: :arrow: :neutral: :cry: :mrgreen:

确定