Go detects concurrent read and write on map despite locks

Question

I'm writing a simple caching mechanism that has an Add, an Evict, and a Search method. Search is not implemented yet, so there's no need to worry about it.

A relatively large number of goroutines call Add to add data, and only one goroutine runs in an evict loop to evict data. As soon as I put some serious traffic on it, the Go runtime aborts, reporting concurrent read and write access on the map metricCache, but I can't see how that can happen because every access is wrapped in a lock. I'm using Go 1.7.

File mdata/cache.go:

     57: func NewCCache() *CCache {
     58:     cc := &CCache{
     59:         lock:        sync.RWMutex{},
     60:         metricCache: make(map[string]*CCacheMetric),
     61:         accnt:       accnt.NewFlatAccnt(maxSize),
     62:     }
     63:     go cc.evictLoop()
     64:     return cc
     65: }
     66:
     67: func (c *CCache) evictLoop() {
     68:     evictQ := c.accnt.GetEvictQ()
     69:     for target := range evictQ {
     70:         c.evict(target)
     71:     }
     72: }
     73:
     74: func (c *CCache) Add(metric string, prev uint32, itergen chunk.IterGen) {
     75:     c.lock.Lock()
     76:
     77:     if ccm, ok := c.metricCache[metric]; !ok {
     78:         var ccm *CCacheMetric
     79:         ccm = NewCCacheMetric()
     80:         ccm.Init(prev, itergen)
     81:         c.metricCache[metric] = ccm
     82:     } else {
     83:         ccm.Add(prev, itergen)
     84:     }
     85:     c.lock.Unlock()
     86:
     87:     c.accnt.AddChunk(metric, itergen.Ts(), itergen.Size())
     88: }
     89:
     90: func (c *CCache) evict(target *accnt.EvictTarget) {
     91:     c.lock.Lock()
     92:
     93:     if _, ok := c.metricCache[target.Metric]; ok {
     94:         log.Debug("cache: evicting chunk %d on metric %s\n", target.Ts, target.Metric)
     95:         length := c.metricCache[target.Metric].Del(target.Ts)
     96:         if length == 0 {
     97:             delete(c.metricCache, target.Metric)
     98:         }
     99:     }
    100:
    101:     c.lock.Unlock()
    102: }

This is the error message:

    metrictank_1 | fatal error: concurrent map read and map write
    metrictank_1 |
    metrictank_1 | goroutine 3159 [running]:
    metrictank_1 | runtime.throw(0xaade7e, 0x21)
    metrictank_1 |     /usr/local/go/src/runtime/panic.go:566 +0x95 fp=0xc4216a7eb8 sp=0xc4216a7e98
    metrictank_1 | runtime.mapaccess2_faststr(0x9e22c0, 0xc42031e600, 0xc4210c2b10, 0x22, 0x28, 0xa585d5496)
    metrictank_1 |     /usr/local/go/src/runtime/hashmap_fast.go:306 +0x52b fp=0xc4216a7f18 sp=0xc4216a7eb8
    metrictank_1 | github.com/raintank/metrictank/mdata/cache.(*CCache).Add(0xc4202fa070, 0xc4210c2b10, 0x22, 0x0, 0xc421875f82, 0x25, 0x25, 0xa585d5496)
    metrictank_1 |     /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:77 +0x63 fp=0xc4216a7f80 sp=0xc4216a7f18
    metrictank_1 | runtime.goexit()
    metrictank_1 |     /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4216a7f88 sp=0xc4216a7f80
    metrictank_1 | created by github.com/raintank/metrictank/api.(*Server).getSeries
    metrictank_1 |     /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:442 +0x122b

UPDATE: I recompiled with -race and now get a different error. It looks as if the RWMutex were completely ineffective, because according to the backtraces the race is between the evict and Add methods.

    ==================
    WARNING: DATA RACE
    Read at 0x00c4201c81e0 by goroutine 215:
      runtime.mapaccess2_faststr()
          /usr/local/go/src/runtime/hashmap_fast.go:297 +0x0
      github.com/raintank/metrictank/mdata/cache.(*CCache).Add()
          /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:77 +0xaa

    Previous write at 0x00c4201c81e0 by goroutine 155:
      runtime.mapdelete()
          /usr/local/go/src/runtime/hashmap.go:558 +0x0
      github.com/raintank/metrictank/mdata/cache.(*CCache).evict()
          /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:97 +0x30e
      github.com/raintank/metrictank/mdata/cache.(*CCache).evictLoop()
          /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:70 +0xb3

    Goroutine 215 (running) created at:
      github.com/raintank/metrictank/api.(*Server).getSeries()
          /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:442 +0x17c9
      github.com/raintank/metrictank/api.(*Server).getTarget()
          /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:331 +0x9c3
      github.com/raintank/metrictank/api.(*Server).getTargetsLocal.func1()
          /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:284 +0xa9

    Goroutine 155 (running) created at:
      github.com/raintank/metrictank/mdata/cache.NewCCache()
          /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:63 +0x12f
      main.main()
          /home/mst/go/src/github.com/raintank/metrictank/metrictank.go:388 +0x246c
    ==================

Answer 1

Score: 2

A colleague of mine found the answer:

After calling NewCCache(), I copied the returned variable by value (including the lock) and then called Add() on the copy, while the evictLoop() goroutine was still referring to the original. So the two were operating on different copies of the lock, while both copies still pointed at the same underlying map.
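The effect described above can be reproduced in isolation. The following is a minimal sketch, not the metrictank code: `Cache`, `NewCache`, and the three helper functions are hypothetical names invented for illustration. It relies on `sync.RWMutex.TryLock`, which exists only from Go 1.18 onward (so not in the Go 1.7 used in the question), to show deterministically that a by-value copy of a struct gets its own lock but keeps sharing the map.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical stand-in for the CCache type in the question.
type Cache struct {
	lock  sync.RWMutex
	items map[string]int
}

// NewCache returns a pointer, so callers can share one lock and one map.
func NewCache() *Cache {
	return &Cache{items: make(map[string]int)}
}

// copyHasOwnLock reports whether a by-value copy of orig can take its
// write lock while the original's write lock is held.
func copyHasOwnLock(orig *Cache) bool {
	dup := *orig // the mistake: this copies the RWMutex (go vet's copylocks check flags it)
	orig.lock.Lock()
	defer orig.lock.Unlock()
	if !dup.lock.TryLock() {
		return false
	}
	dup.lock.Unlock()
	return true
}

// pointerSharesLock reports whether a second pointer to the same Cache
// is blocked while the original's write lock is held.
func pointerSharesLock(orig *Cache) bool {
	alias := orig // copies only the pointer, not the struct
	orig.lock.Lock()
	defer orig.lock.Unlock()
	return !alias.lock.TryLock()
}

// copySharesMap reports whether a write through a by-value copy is
// visible through the original, i.e. both refer to the same map.
func copySharesMap(orig *Cache) bool {
	dup := *orig
	dup.items["k"] = 1
	_, ok := orig.items["k"]
	return ok
}

func main() {
	c := NewCache()
	fmt.Println("copy has its own lock:", copyHasOwnLock(c))
	fmt.Println("pointer shares the lock:", pointerSharesLock(c))
	fmt.Println("copy shares the map:", copySharesMap(c))
}
```

All three checks report true: the copy locks independently, yet writes through it land in the same map. The usual remedy is to pass the `*Cache` pointer around and never dereference it into a new variable; running `go vet` catches such copies through its copylocks check.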

huangapple
  • Posted on 2016-12-24 01:02:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41305244.html