Describe the bug
When running the latest version of OCIS, which relies on go-micro, I get crashes (owncloud/ocis#10785).
The backtrace looks like this (I extracted only the relevant part):
fatal error: concurrent map writes
fatal error: concurrent map writes
goroutine 13004 [running]:
go-micro.dev/v4/registry/cache.(*cache).isValid(0xc003746c60, {0xc0099c38c0, 0x1, 0x16?}, {0xc009750fa0?, 0x0?, 0x5f85c8271d40?})
go-micro.dev/[email protected]/registry/cache/cache.go:103 +0x1aa
go-micro.dev/v4/registry/cache.(*cache).get(0xc003746c60, {0x5f85c3fe03b1, 0x16})
go-micro.dev/[email protected]/registry/cache/cache.go:145 +0x136
go-micro.dev/v4/registry/cache.(*cache).GetService(0x5f85c3fe03b1?, {0x5f85c3fe03b1?, 0x5f85c5856f90?}, {0x1?, 0x0?, 0xc00a766b80?})
go-micro.dev/[email protected]/registry/cache/cache.go:462 +0x18
github.com/cs3org/reva/v2/pkg/rgrpc/todo/pool.(*Selector[...]).Next(0x5f85c6241710, {0x0, 0x0, 0x16})
github.com/cs3org/reva/[email protected]/pkg/rgrpc/todo/pool/selector.go:117 +0x2b4
I did some research and found the following code in cache.go:
func (c *cache) get(service string) ([]*registry.Service, error) {
	// read lock
	c.RLock()

	// check the cache first
	services := c.cache[service]
	// get cache ttl
	ttl := c.ttls[service]
	// make a copy
	cp := util.Copy(services)

	// got services, nodes && within ttl so return cache
	if c.isValid(cp, ttl) {
		c.RUnlock()
		// return services
		return cp, nil
	}
From what I understand, this function takes only a read lock (c.RLock()), but isValid needs the write lock, because it modifies a map on this line:
delete(c.nttls, s.Name)
The bug might have been introduced by PR #2736.