Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(prettyCQL): prettyCQL crashes on ouf-of-bound access #433

Merged
merged 8 commits into from
Nov 12, 2024

Conversation

CodeLieutenant
Copy link
Contributor

@CodeLieutenant CodeLieutenant commented Oct 23, 2024

When gemini tries to pretty print CQL, in some cases can fail with out-of-bounds access on array.

out = append(out, queryChunks[qID])

@CodeLieutenant CodeLieutenant added bug Something isn't working crash labels Oct 23, 2024
@CodeLieutenant CodeLieutenant self-assigned this Oct 23, 2024
@CodeLieutenant
Copy link
Contributor Author

CodeLieutenant commented Oct 23, 2024

CI Failure is now related to #431 and not to the crashing on out of bounds error on prettyCQL

{"L":"INFO","T":"2024-10-23T10:29:46.940Z","N":"work cycle.mutation_job","M":"ending mutation loop"}
panic: map cql pretty, unknown type &{map timeuuid varint false}

goroutine 369282 [running]:
github.com/scylladb/gemini/pkg/typedef.(*MapType).CQLPretty(0xc00012ea80, {0xbbe060, 0xc00f77ce20})
	/home/runner/work/gemini/gemini/pkg/typedef/types.go:145 +0x3d4
github.com/scylladb/gemini/pkg/typedef.prettyCQL({0xc011b66120, 0x83}, {0xc0104ee500, 0x10, 0xccce11?}, {0xc0001802a0, 0xe, 0x0?})
	/home/runner/work/gemini/gemini/pkg/typedef/typedef.go:271 +0x36d
github.com/scylladb/gemini/pkg/typedef.(*Stmt).PrettyCQL(0xc0104adc00)
	/home/runner/work/gemini/gemini/pkg/typedef/typedef.go:99 +0x9e
github.com/scylladb/gemini/pkg/jobs.mutation({0xdf3cc0, 0xc016ee3db0}, 0x4400000015f61c80?, 0x1000000002000000?, 0x8f1ba312981d476b?, {0xdf3ba8, 0xc0002a2cc0}, 0x1000000060ca4704?, 0xb510549d53368ef8?, 0xc000184120, ...)
	/home/runner/work/gemini/gemini/pkg/jobs/jobs.go:391 +0x3f5
github.com/scylladb/gemini/pkg/jobs.mutationJob({0xdf3cc0, 0xc016ee3db0}, 0xc000400000, _, {0xc00006e088, 0xc00006e098, {0x0, 0x0, 0x0}, 0x1, ...}, ...)
	/home/runner/work/gemini/gemini/pkg/jobs/jobs.go:182 +0x2aa
github.com/scylladb/gemini/pkg/jobs.List.Run.func2()
	/home/runner/work/gemini/gemini/pkg/jobs/jobs.go:136 +0x10f
golang.org/x/sync/errgroup.(*Group).Go.func1()
	/home/runner/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:78 +0x50
created by golang.org/x/sync/errgroup.(*Group).Go in goroutine 1
	/home/runner/go/pkg/mod/golang.org/x/[email protected]/errgroup/errgroup.go:75 +0x96
make: *** [Makefile:71: integration-test] Error 2

pkg/typedef/typedef.go Outdated Show resolved Hide resolved
pkg/typedef/typedef.go Outdated Show resolved Hide resolved
pkg/typedef/typedef.go Outdated Show resolved Hide resolved
@CodeLieutenant
Copy link
Contributor Author

CodeLieutenant commented Oct 29, 2024

Benchmark results

cpu: Intel(R) Core(TM) i9-14900HX
BenchmarkPrettyCQLOLD
BenchmarkPrettyCQLOLD/New
BenchmarkPrettyCQLOLD/New-32         	  754578	      1851 ns/op	    1048 B/op	      13 allocs/op
BenchmarkPrettyCQLOLD/Old
BenchmarkPrettyCQLOLD/Old-32         	  298058	      3926 ns/op	    3648 B/op	      38 allocs/op
PASS

Another benchmark with Go Iterators

BenchmarkPrettyCQLOLD
BenchmarkPrettyCQLOLD/New
BenchmarkPrettyCQLOLD/New-32         	 1000000	      1348 ns/op	     928 B/op	       4 allocs/op
BenchmarkPrettyCQLOLD/Old
BenchmarkPrettyCQLOLD/Old-32         	  264987	      5549 ns/op	    3648 B/op	      38 allocs/op
PASS

Really don't know why Iterators optimize the allocations and where they do, but looks like it does, allocations take around 60ns to preform, and this ~500ns gap can be explained by that

@CodeLieutenant CodeLieutenant force-pushed the fix--prettyCQL branch 7 times, most recently from 8c0a31e to 9edf169 Compare October 29, 2024 23:07
When gemini tries to `pretty print` CQL, in some cases can fail
with `out-of-bounds` access on array.

```go
out = append(out, queryChunks[qID])
```

This fix uses new go1.23 iter funcs, which removes the allocations,
and is not concerened with indexes which caused the issue in the first
place.

I tried to remove the `switch .(type)` but `TuppleType` is different
so has to be still there.

Signed-off-by: Dusan Malusev <[email protected]>
@CodeLieutenant CodeLieutenant force-pushed the fix--prettyCQL branch 2 times, most recently from 63f24f2 to c66cc89 Compare October 29, 2024 23:12
Not prettyCQL uses the `strings.Builder` to generate the `CQL` statament
that was previously executed on ScyllaDB.
Instaface has been changed accomadate the `builder` being passed to
down.

Signed-off-by: Dusan Malusev <[email protected]>
pkg/typedef/tuple.go Outdated Show resolved Hide resolved
pkg/typedef/typedef.go Outdated Show resolved Hide resolved
pkg/typedef/typedef.go Outdated Show resolved Hide resolved
pkg/typedef/tuple.go Outdated Show resolved Hide resolved
@CodeLieutenant CodeLieutenant linked an issue Oct 30, 2024 that may be closed by this pull request
everything that can produce a `Pretty CQL` for logging
now returns an error, to indicate bad types pass on to it.

Signed-off-by: Dusan Malusev <[email protected]>
@CodeLieutenant CodeLieutenant force-pushed the fix--prettyCQL branch 2 times, most recently from 2390abf to 8da860f Compare November 1, 2024 14:03
Statement Logger had a flow, it assumed it has the owner ship of the
query, but statements are reused this means, it produced queries that
are not even possible to execute. (This does not fix the issue of more
columns expected then returned). Since logger works as single thread
to write to the file (Queue), it has to be reworked to format the query
and the values before they are passed to the channel (works like a
queue), this way a query is owned by the channel inside a bytes.Buffer.

PrettyCQL has been rewritten to use `bytes.Buffer` instead of
`strings.Builder`, as the intent of the PrettyCQL is to be stored inside
a file, and files work with `[]byte` rather then strings.

Signed-off-by: Dusan Malusev <[email protected]>
pkg/jobs/jobs.go Outdated Show resolved Hide resolved
pkg/typedef/tuple.go Show resolved Hide resolved
pkg/typedef/typedef.go Show resolved Hide resolved
@CodeLieutenant
Copy link
Contributor Author

I dont know why, no explanation, but these changes made gemini stable. Further investigation is needed as nothing much has been done to affect how gemini works

@fruch
Copy link
Collaborator

fruch commented Nov 6, 2024

I dont know why, no explanation, but these changes made gemini stable. Further investigation is needed as nothing much has been done to affect how gemini works

makes sense to me, you switch a few cases from panicking to erroring, it can make a lot of difference
i.e. the test you are running are not checking for existence of errors, do they ?

pkg/typedef/typedef.go Show resolved Hide resolved
@CodeLieutenant CodeLieutenant merged commit 9b15797 into scylladb:master Nov 12, 2024
9 checks passed
@CodeLieutenant CodeLieutenant deleted the fix--prettyCQL branch November 12, 2024 10:55
Comment on lines +124 to +157
func prettyCQLOld(query string, values Values, types Types) string {
if len(values) == 0 {
return query
}

k := 0
out := make([]string, 0, len(values)*2)
queryChunks := strings.Split(query, "?")
out = append(out, queryChunks[0])
qID := 1
builder := bytes.NewBuffer(nil)
for _, typ := range types {
builder.Reset()
tupleType, ok := typ.(*TupleType)
if !ok {
_ = typ.CQLPretty(builder, values[k])
out = append(out, builder.String())
out = append(out, queryChunks[qID])
qID++
k++
continue
}
for _, t := range tupleType.ValueTypes {
builder.Reset()
_ = t.CQLPretty(builder, values[k])
out = append(out, builder.String())
out = append(out, queryChunks[qID])
qID++
k++
}
}
out = append(out, queryChunks[qID:]...)
return strings.Join(out, "")
}
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should you drop this ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do it

Comment on lines +175 to +183
b.Run("Old", func(b *testing.B) {
b.ReportAllocs()
b.ResetTimer()
for i := 0; i < b.N; i++ {
query, _ := stmt.Query.ToCql()
values := stmt.Values.Copy()
prettyCQLOld(query, values, stmt.Types)
}
})
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And this

return strings.Join(out, "")
}

func BenchmarkPrettyCQLOLD(b *testing.B) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please rename it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working crash
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Failure while running integration tests
3 participants