Go detects concurrent read and write on map despite locks
Question
I'm writing a simple caching mechanism which has an Add, an Evict, and a Search method. Search is not implemented yet, so there's no need to worry about that.
A relatively large number of goroutines call Add to add data, and only one goroutine runs in an evict loop to evict data. As soon as I put some serious traffic on it, Go aborts with a fatal error reporting concurrent read and write access on the map metricCache, but I can't see how that can happen because there are locks around it. I'm using Go 1.7.
File mdata/cache.go:
57: func NewCCache() *CCache {
58:     cc := &CCache{
59:         lock:        sync.RWMutex{},
60:         metricCache: make(map[string]*CCacheMetric),
61:         accnt:       accnt.NewFlatAccnt(maxSize),
62:     }
63:     go cc.evictLoop()
64:     return cc
65: }
66:
67: func (c *CCache) evictLoop() {
68:     evictQ := c.accnt.GetEvictQ()
69:     for target := range evictQ {
70:         c.evict(target)
71:     }
72: }
73:
74: func (c *CCache) Add(metric string, prev uint32, itergen chunk.IterGen) {
75:     c.lock.Lock()
76:
77:     if ccm, ok := c.metricCache[metric]; !ok {
78:         var ccm *CCacheMetric
79:         ccm = NewCCacheMetric()
80:         ccm.Init(prev, itergen)
81:         c.metricCache[metric] = ccm
82:     } else {
83:         ccm.Add(prev, itergen)
84:     }
85:     c.lock.Unlock()
86:
87:     c.accnt.AddChunk(metric, itergen.Ts(), itergen.Size())
88: }
89:
90: func (c *CCache) evict(target *accnt.EvictTarget) {
91:     c.lock.Lock()
92:
93:     if _, ok := c.metricCache[target.Metric]; ok {
94:         log.Debug("cache: evicting chunk %d on metric %s\n", target.Ts, target.Metric)
95:         length := c.metricCache[target.Metric].Del(target.Ts)
96:         if length == 0 {
97:             delete(c.metricCache, target.Metric)
98:         }
99:     }
100:
101:     c.lock.Unlock()
102: }
That's the error message:
metrictank_1 | fatal error: concurrent map read and map write
metrictank_1 |
metrictank_1 | goroutine 3159 [running]:
metrictank_1 | runtime.throw(0xaade7e, 0x21)
metrictank_1 | /usr/local/go/src/runtime/panic.go:566 +0x95 fp=0xc4216a7eb8 sp=0xc4216a7e98
metrictank_1 | runtime.mapaccess2_faststr(0x9e22c0, 0xc42031e600, 0xc4210c2b10, 0x22, 0x28, 0xa585d5496)
metrictank_1 | /usr/local/go/src/runtime/hashmap_fast.go:306 +0x52b fp=0xc4216a7f18 sp=0xc4216a7eb8
metrictank_1 | github.com/raintank/metrictank/mdata/cache.(*CCache).Add(0xc4202fa070, 0xc4210c2b10, 0x22, 0x0, 0xc421875f82, 0x25, 0x25, 0xa585d5496)
metrictank_1 | /home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:77 +0x63 fp=0xc4216a7f80 sp=0xc4216a7f18
metrictank_1 | runtime.goexit()
metrictank_1 | /usr/local/go/src/runtime/asm_amd64.s:2086 +0x1 fp=0xc4216a7f88 sp=0xc4216a7f80
metrictank_1 | created by github.com/raintank/metrictank/api.(*Server).getSeries
metrictank_1 | /home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:442 +0x122b
UPDATE: I recompiled with -race and now I'm getting a different error. It looks as if the RWMutex were completely ineffective, because according to the backtraces the problem must be in the combination of the evict and Add methods.
==================
WARNING: DATA RACE
Read at 0x00c4201c81e0 by goroutine 215:
runtime.mapaccess2_faststr()
/usr/local/go/src/runtime/hashmap_fast.go:297 +0x0
github.com/raintank/metrictank/mdata/cache.(*CCache).Add()
/home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:77 +0xaa
Previous write at 0x00c4201c81e0 by goroutine 155:
runtime.mapdelete()
/usr/local/go/src/runtime/hashmap.go:558 +0x0
github.com/raintank/metrictank/mdata/cache.(*CCache).evict()
/home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:97 +0x30e
github.com/raintank/metrictank/mdata/cache.(*CCache).evictLoop()
/home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:70 +0xb3
Goroutine 215 (running) created at:
github.com/raintank/metrictank/api.(*Server).getSeries()
/home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:442 +0x17c9
github.com/raintank/metrictank/api.(*Server).getTarget()
/home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:331 +0x9c3
github.com/raintank/metrictank/api.(*Server).getTargetsLocal.func1()
/home/mst/go/src/github.com/raintank/metrictank/api/dataprocessor.go:284 +0xa9
Goroutine 155 (running) created at:
github.com/raintank/metrictank/mdata/cache.NewCCache()
/home/mst/go/src/github.com/raintank/metrictank/mdata/cache/cache.go:63 +0x12f
main.main()
/home/mst/go/src/github.com/raintank/metrictank/metrictank.go:388 +0x246c
==================
Answer 1
Score: 2
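A colleague of mine found the answer:
After calling NewCCache(), I copied the returned variable by value (including the lock) and then called Add() on the copy, while the evictLoop() goroutine was still referring to the original. So the two were operating on different copies of the lock. Because a map value is only a reference to its underlying data, both copies still pointed at the same metricCache map, which is why the accesses raced even though each side held "its" lock.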
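The effect can be reproduced in isolation. The following is a minimal sketch, not the metrictank code: Cache, NewCache, SharesLock, and SharesMap are hypothetical names made up for this illustration. It shows that dereferencing and copying the struct yields a second, independent mutex while the map is still shared.

```go
package main

import (
	"fmt"
	"sync"
)

// Cache is a hypothetical stand-in for CCache: a mutex guarding a map.
type Cache struct {
	lock sync.RWMutex
	data map[string]int
}

// NewCache returns a pointer, as NewCCache does in the question.
func NewCache() *Cache {
	return &Cache{data: make(map[string]int)}
}

// SharesLock reports whether a and b are guarded by the very same mutex.
func SharesLock(a, b *Cache) bool {
	return &a.lock == &b.lock
}

// SharesMap reports whether a write made through a is visible through b,
// i.e. whether both values refer to the same underlying map.
func SharesMap(a, b *Cache) bool {
	a.lock.Lock()
	a.data["probe"] = 1
	a.lock.Unlock()

	b.lock.RLock()
	_, ok := b.data["probe"]
	b.lock.RUnlock()
	return ok
}

func main() {
	orig := NewCache()

	// The bug: copying the struct duplicates the mutex,
	// but the copied map field still refers to the same map.
	copied := *orig
	fmt.Println(SharesLock(orig, &copied), SharesMap(orig, &copied)) // false true

	// The fix: pass the pointer around, so there is only one mutex.
	ptr := orig
	fmt.Println(SharesLock(orig, ptr), SharesMap(orig, ptr)) // true true
}
```

Keeping the value as a *Cache everywhere avoids the problem. The copylocks check in go vet is designed to flag copies like `copied := *orig`, so running go vet on the calling code would likely have pointed at the offending line.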