Go detects concurrent read and write on map despite locks

Question

I'm writing a simple caching mechanism which has an Add, an Evict, and a Search method. Search is not implemented yet, so there's no need to worry about it.

A relatively large number of goroutines call Add to add data, and only one goroutine runs an evict loop to evict data. As soon as I put some serious traffic on it, the Go runtime aborts, reporting concurrent read and write access on the map metricCache, but I can't see how that can happen because there are locks around every access. I'm using Go 1.7.

File mdata/cache.go:

57: func NewCCache() *CCache {
58:     cc := &CCache{
59:         lock:        sync.RWMutex{},
60:         metricCache: make(map[string]*CCacheMetric),
61:         accnt:       accnt.NewFlatAccnt(maxSize),
62:     }
63:     go cc.evictLoop()
64:     return cc
65: }
66:
67: func (c *CCache) evictLoop() {
68:     evictQ := c.accnt.GetEvictQ()
69:     for target := range evictQ {
70:         c.evict(target)
71:     }
72: }
73: 
74: func (c *CCache) Add(metric string, prev uint32, itergen chunk.IterGen) {
75:     c.lock.Lock()
76: 
77:     if ccm, ok := c.metricCache[metric]; !ok {
78:         var ccm *CCacheMetric
79:         ccm = NewCCacheMetric()
80:         ccm.Init(prev, itergen)
81:         c.metricCache[metric] = ccm
82:     } else {
83:         ccm.Add(prev, itergen)
84:     }
85:     c.lock.Unlock()
86: 
87:     c.accnt.AddChunk(metric, itergen.Ts(), itergen.Size())
88: }
89: 
90: func (c *CCache) evict(target *accnt.EvictTarget) {
91:     c.lock.Lock()
92: 
93:     if _, ok := c.metricCache[target.Metric]; ok {
94:         log.Debug("cache: evicting chunk %d on metric %s\n", target.Ts, target.Metric)
95:         length := c.metricCache[target.Metric].Del(target.Ts)
96:         if length == 0 {
97:             delete(c.metricCache, target.Metric)
98:         }
99:     }
100: 
101:     c.lock.Unlock()
102: }

This is the error message:

metrictank_1    | fatal error: concurrent map read and map write
metrictank_1    | 
metrictank_1    | goroutine 3159 [running]:
metrictank_1    | runtime.throw(0xaade7e, 0x21)
metrictank_1    |       /usr/local/go/src/runtime/panic.go:566 +0x95 fp=0xc4216a7eb8 sp=0xc4216a7e98
metrictank_1    | runtime.mapaccess2_faststr(0x9e22c0, 0xc42031e600, 0xc4210c2b10, 0x22, 0x28, 0xa585d5496)
metrictank_1    |       /usr/local/go/src/runtime/hashmap_fast.go:306 +0x52b fp=0xc4216a7f18 sp=0xc4216a7eb8
metrictank_1    | github.com/raintank/metrictank/mdata/cache.(*CCache).Add(0xc4202fa070, 0xc4210c2b10, 0x22, 0x0, 0xc421875f82, 0x25, 0x25, 0xa585d5496)
metrictank_1    |       /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:77 +0x63 fp=0xc4216a7f80 sp=0xc4216a7f18
metrictank_1    | runtime.goexit()
metrictank_1    |       /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4216a7f88 sp=0xc4216a7f80
metrictank_1    | created by github.com/raintank/metrictank/api.(*Server).getSeries
metrictank_1    |       /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:442 +0x122b

UPDATE: I recompiled with -race and now I get a different report. It looks as if the RWMutex were completely ineffective, because according to the backtraces the conflict is between the evict and Add methods.

==================
WARNING: DATA RACE
Read at 0x00c4201c81e0 by goroutine 215:
  runtime.mapaccess2_faststr()
      /usr/local/go/src/runtime/hashmap_fast.go:297 +0x0
  github.com/raintank/metrictank/mdata/cache.(*CCache).Add()
      /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:77 +0xaa

Previous write at 0x00c4201c81e0 by goroutine 155:
  runtime.mapdelete()
      /usr/local/go/src/runtime/hashmap.go:558 +0x0
  github.com/raintank/metrictank/mdata/cache.(*CCache).evict()
      /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:97 +0x30e
  github.com/raintank/metrictank/mdata/cache.(*CCache).evictLoop()
      /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:70 +0xb3

Goroutine 215 (running) created at:
  github.com/raintank/metrictank/api.(*Server).getSeries()
      /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:442 +0x17c9
  github.com/raintank/metrictank/api.(*Server).getTarget()
      /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:331 +0x9c3
  github.com/raintank/metrictank/api.(*Server).getTargetsLocal.func1()
      /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:284 +0xa9

Goroutine 155 (running) created at:
  github.com/raintank/metrictank/mdata/cache.NewCCache()
      /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:63 +0x12f
  main.main()
      /home/mst/go/src/github.com/raintank/metrictank/metrictank.go:388 +0x246c
==================

Answer 1

Score: 2

A colleague of mine has found the answer:

After calling NewCCache() I copied the returned value by value (including the lock it contains) and then called Add() on the copy, while the evictLoop() goroutine was still referring to the original. So the two goroutines were locking different copies of the RWMutex. The copy still shared the same underlying map, because copying a struct copies only the map reference, which is why the accesses collided.
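
A minimal sketch of that failure mode, assuming a hypothetical Cache type as a stand-in for CCache. The names Cache, NewCache, sharesMap, and independentLocks are illustrative and do not come from metrictank. It uses RWMutex.TryLock, which needs Go 1.18 or later, so it will not build with the Go 1.7 toolchain mentioned in the question.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical stand-in for CCache: an RWMutex guarding a map.
type Cache struct {
	lock sync.RWMutex
	data map[string]int
}

// NewCache returns a pointer, as NewCCache does in the question.
func NewCache() *Cache {
	return &Cache{data: make(map[string]int)}
}

// Add takes the write lock of whichever Cache value it is called on.
func (c *Cache) Add(key string) {
	c.lock.Lock()
	defer c.lock.Unlock()
	c.data[key]++
}

// sharesMap reports whether a write made through a is visible through b.
func sharesMap(a, b *Cache) bool {
	a.Add("probe")
	b.lock.RLock()
	defer b.lock.RUnlock()
	_, ok := b.data["probe"]
	return ok
}

// independentLocks reports whether b's lock can be taken while a's is held,
// which means the two locks do not exclude each other.
func independentLocks(a, b *Cache) bool {
	a.lock.Lock()
	defer a.lock.Unlock()
	if b.lock.TryLock() {
		b.lock.Unlock()
		return true
	}
	return false
}

func main() {
	orig := NewCache()

	// The bug: dereferencing copies the RWMutex by value, while the map
	// field in the copy still refers to the same underlying map.
	copied := *orig
	fmt.Println(sharesMap(orig, &copied), independentLocks(orig, &copied)) // prints "true true"

	// The fix: keep passing the pointer, so there is only one lock.
	alias := orig
	fmt.Println(sharesMap(orig, alias), independentLocks(orig, alias)) // prints "true false"
}
```

The first line of output shows the dangerous combination: one shared map and two locks that do not exclude each other. The copylocks check in go vet reports assignments like `copied := *orig`, since a struct containing a sync.RWMutex must not be copied after first use.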

huangapple
  • Posted on December 24, 2016, 01:02:36
  • When reposting, please keep the link to this article: https://go.coder-hub.com/41305244.html