cmd/cgo: segmentation violation with -race, only on one machine #74038

Open
@bboreham

Go version

go version go1.24.4 linux/amd64

Output of go env in your module/workspace:

AR='ar'
CC='gcc'
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_ENABLED='1'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
CXX='g++'
GCCGO='gccgo'
GO111MODULE=''
GOAMD64='v1'
GOARCH='amd64'
GOAUTH='netrc'
GOBIN=''
GOCACHE='/home/bryan/.cache/go-build'
GOCACHEPROG=''
GODEBUG=''
GOENV='/home/bryan/.config/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFIPS140='off'
GOFLAGS=''
GOGCCFLAGS='-fPIC -m64 -pthread -Wl,--no-gc-sections -fmessage-length=0 -ffile-prefix-map=/tmp/go-build156449372=/tmp/go-build -gno-record-gcc-switches'
GOHOSTARCH='amd64'
GOHOSTOS='linux'
GOINSECURE=''
GOMOD='/home/bryan/src/github.com/grafana/mimir/go.mod'
GOMODCACHE='/home/bryan/go/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='linux'
GOPATH='/home/bryan/go'
GOPRIVATE=''
GOPROXY='https://proxy.golang.org,direct'
GOROOT='/usr/local/go'
GOSUMDB='sum.golang.org'
GOTELEMETRY='local'
GOTELEMETRYDIR='/home/bryan/.config/go/telemetry'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/go/pkg/tool/linux_amd64'
GOVCS=''
GOVERSION='go1.24.4'
GOWORK=''
PKG_CONFIG='pkg-config'

What did you do?

I want to stress that this is something weird about my machine; I'm filing this in the hope that someone can give me a pointer to what's causing it.
The crash happens repeatedly on the desktop machine I normally use; it does not happen on other machines such as an AWS m7i.4xlarge I created as a test, or the CI runners that exercise the same code daily.

To try to eliminate environmental factors I created a fresh VM. Here are all the steps:

  • Created a fresh VM running Ubuntu 24.04 LTS (via wsl --install -d Ubuntu-24.04).
  • Made a directory ~/src/github.com/grafana and changed into it.
  • git clone https://github.com/grafana/mimir
  • Installed Go from https://go.dev/dl/go1.24.4.linux-amd64.tar.gz
  • Installed gcc so that -race works: sudo apt-get install gcc
  • cd ~/src/github.com/grafana/mimir
  • go test -race -run Fuzz ./pkg/frontend/querymiddleware/

Mimir code is at commit ad7ac183b4f8131fe53e669b05c9a5a3355a87f5, but lots of other commits fail the same way, for example 9fd2b19a61acec59336d677a50270806dcc4482c.

What did you see happen?

# github.com/AzureAD/microsoft-authentication-library-for-go/apps/internal/json
unexpected fault address 0xc00061f928
fatal error: fault
[signal SIGSEGV: segmentation violation code=0x2 addr=0xc00061f928 pc=0xc00061f928]

goroutine 1 gp=0xc000002380 m=0 mp=0x168a860 [running]:
runtime.throw({0xeb2b62?, 0x1042dd0?})
        runtime/panic.go:1101 +0x48 fp=0xc00061f8a8 sp=0xc00061f878 pc=0x4852e8
runtime.sigpanic()
        runtime/signal_unix.go:939 +0x26c fp=0xc00061f908 sp=0xc00061f8a8 pc=0x486ecc
cmd/compile/internal/ir.(*bottomUpVisitor).visit.Visit.func3({0x1043988, 0xc0009d5d40})
        cmd/compile/internal/ir/visit.go:118 +0x45 fp=0xc00061f938 sp=0xc00061f908 pc=0x6425c5
cmd/compile/internal/ir.doNodes(...)
        cmd/compile/internal/ir/node_gen.go:2433
cmd/compile/internal/ir.(*ReturnStmt).doChildren(0xc0009d5d80, 0xc000bc76b0)
        cmd/compile/internal/ir/node_gen.go:1702 +0xbb fp=0xc00061f968 sp=0xc00061f938 pc=0x638c9b
cmd/compile/internal/ir.DoChildren({0x1043280?, 0xc0009d5d80?}, 0xc000bc76b0?)
        cmd/compile/internal/ir/visit.go:94 +0x65 fp=0xc00061f9a0 sp=0xc00061f968 pc=0x643e65
cmd/compile/internal/ir.(*bottomUpVisitor).visit.Visit.func3({0x1043280, 0xc0009d5d80})
        cmd/compile/internal/ir/visit.go:118 +0x45 fp=0xc00061f9d0 sp=0xc00061f9a0 pc=0x6425c5
cmd/compile/internal/ir.doNodes(...)
        cmd/compile/internal/ir/node_gen.go:2433

output.txt

The symptom is repeatable (though it might report a different package) if I run go clean -cache first. I do not get the symptom if I omit -race.
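
One detail worth pulling out of the trace above: the signal line reports the same value for addr and pc (0xc00061f928), and that value falls between the sp (0xc00061f908) and fp (0xc00061f938) printed for the faulting frame at visit.go:118. In other words, the compiler process appears to have tried to execute at an address inside its own goroutine stack. The following is only a sketch for checking this across repeated failures; parseFault and the fault type are hypothetical helpers, not part of any Go tooling, and the regular expression assumes the signal line has the same shape as the one shown above.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// fault holds the two addresses printed on a Go runtime "[signal ...]" line.
type fault struct {
	Addr uint64 // faulting address (addr=...)
	PC   uint64 // program counter at the time of the fault (pc=...)
}

// signalRE matches a line shaped like the one in this report, e.g.
// "[signal SIGSEGV: segmentation violation code=0x2 addr=0x... pc=0x...]".
var signalRE = regexp.MustCompile(`\[signal \w+: .* addr=(0x[0-9a-f]+) pc=(0x[0-9a-f]+)\]`)

// parseFault is a hypothetical helper that extracts addr and pc from crash
// output. It reports false if no signal line is found or a value is malformed.
func parseFault(out string) (fault, bool) {
	m := signalRE.FindStringSubmatch(out)
	if m == nil {
		return fault{}, false
	}
	// Base 0 lets ParseUint accept the "0x" prefix.
	addr, err := strconv.ParseUint(m[1], 0, 64)
	if err != nil {
		return fault{}, false
	}
	pc, err := strconv.ParseUint(m[2], 0, 64)
	if err != nil {
		return fault{}, false
	}
	return fault{Addr: addr, PC: pc}, true
}

// inFrame reports whether addr lies in the half-open range [sp, fp).
func inFrame(addr, sp, fp uint64) bool {
	return addr >= sp && addr < fp
}

func main() {
	// Values copied from the trace in this report.
	line := "[signal SIGSEGV: segmentation violation code=0x2 addr=0xc00061f928 pc=0xc00061f928]"
	const sp, fp = 0xc00061f908, 0xc00061f938

	f, ok := parseFault(line)
	if !ok {
		fmt.Println("no signal line found")
		return
	}
	fmt.Printf("addr=%#x pc=%#x addr==pc: %v, inside faulting frame: %v\n",
		f.Addr, f.PC, f.Addr == f.PC, inFrame(f.Addr, sp, fp))
}
```

Feeding the full text of output.txt to parseFault after each go clean -cache run would show whether addr and pc coincide every time or only in this instance.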

If I repeat the command, still with -race, without cleaning the cache, I get a different symptom:

==================
WARNING: DATA RACE
Read at 0x00c00fc1ce40 by goroutine 261:
  runtime.slicecopy()
      /usr/local/go/src/runtime/slice.go:355 +0x0
  fmt.(*buffer).write()
      /usr/local/go/src/fmt/print.go:104 +0x136
  fmt.(*fmt).pad()
      /usr/local/go/src/fmt/format.go:95 +0x93
  fmt.(*fmt).fmtQ()
      /usr/local/go/src/fmt/format.go:460 +0x16b
  fmt.(*pp).fmtString()
      /usr/local/go/src/fmt/print.go:503 +0x5e
  fmt.(*pp).printArg()
      /usr/local/go/src/fmt/print.go:741 +0x8b5
  fmt.(*pp).doPrintf()
      /usr/local/go/src/fmt/print.go:1074 +0x5bc
  fmt.Errorf()
      /usr/local/go/src/fmt/errors.go:25 +0xa4
  github.com/prometheus/prometheus/util/annotations.NewPossibleNonCounterInfo()
      /home/bryan/src/github.com/grafana/mimir/vendor/github.com/prometheus/prometheus/util/annotations/annotations.go:269 +0x83ef
...

Previous write at 0x00c00fc1ce45 by goroutine 261:
  unicode/utf8.AppendRune()
      /usr/local/go/src/unicode/utf8/utf8.go:393 +0xb6f
  strconv.appendEscapedRune()
      /usr/local/go/src/strconv/quote.go:80 +0xb8a
  strconv.appendQuotedWith()
      /usr/local/go/src/strconv/quote.go:52 +0x3ad
...

fuzz.txt

It reports a race between two goroutines with the same number.
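
A data race by definition involves accesses from different goroutines, so a report where both sides carry the same goroutine number is suspect in itself. In the excerpt above the two addresses also differ by 5 bytes (0x00c00fc1ce40 and 0x00c00fc1ce45). As a rough sketch for scanning fuzz.txt for every such report, the snippet below pulls the access lines out of race detector output and flags reports whose accesses all come from one goroutine. The names access, parseAccesses and sameGoroutine are hypothetical, and the pattern only covers the "Read at"/"Write at"/"Previous ... at" forms seen in this report (not, for example, lines that say "by main goroutine").

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// access is one memory access line from a race detector report.
type access struct {
	Kind      string // e.g. "Read" or "Previous write"
	Addr      uint64
	Goroutine int
}

// accessRE matches lines such as
// "Read at 0x00c00fc1ce40 by goroutine 261:" and
// "Previous write at 0x00c00fc1ce45 by goroutine 261:".
var accessRE = regexp.MustCompile(
	`(?m)^((?:Previous )?(?:[Rr]ead|[Ww]rite)) at (0x[0-9a-f]+) by goroutine (\d+):`)

// parseAccesses is a hypothetical helper that returns every access line
// found in the report text, in order of appearance.
func parseAccesses(report string) []access {
	var out []access
	for _, m := range accessRE.FindAllStringSubmatch(report, -1) {
		addr, err := strconv.ParseUint(m[2], 0, 64) // base 0 accepts "0x"
		if err != nil {
			continue
		}
		id, err := strconv.Atoi(m[3])
		if err != nil {
			continue
		}
		out = append(out, access{Kind: m[1], Addr: addr, Goroutine: id})
	}
	return out
}

// sameGoroutine reports whether there are at least two accesses and all of
// them were attributed to a single goroutine.
func sameGoroutine(accs []access) bool {
	if len(accs) < 2 {
		return false
	}
	for _, a := range accs[1:] {
		if a.Goroutine != accs[0].Goroutine {
			return false
		}
	}
	return true
}

func main() {
	// Access lines copied from the report above; stack frames omitted.
	report := "Read at 0x00c00fc1ce40 by goroutine 261:\n" +
		"  runtime.slicecopy()\n\n" +
		"Previous write at 0x00c00fc1ce45 by goroutine 261:\n" +
		"  unicode/utf8.AppendRune()\n"

	accs := parseAccesses(report)
	for _, a := range accs {
		fmt.Printf("%-14s addr=%#x goroutine=%d\n", a.Kind, a.Addr, a.Goroutine)
	}
	fmt.Println("all accesses from one goroutine:", sameGoroutine(accs))
}
```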

What did you expect to see?

I would like it to not crash.

I figured it could be a hardware problem, so I ran memtest86+ for several hours.

The machine doesn't seem tight on memory:

$ free -h
               total        used        free      shared  buff/cache   available
Mem:            47Gi       1.0Gi        43Gi       3.5Mi       3.1Gi        46Gi
Swap:           12Gi          0B        12Gi

Labels

  • BugReport: Issues describing a possible bug in the Go implementation.
  • WaitingForInfo: Issue is not actionable because of missing required information, which needs to be provided.
  • compiler/runtime: Issues related to the Go compiler and/or runtime.
