Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zstd compression support between distributors and ingesters #8522

Closed
aallawala opened this issue Jun 25, 2024 · 24 comments · Fixed by #9322 · May be fixed by grafana/dskit#582 or #10411
Closed

zstd compression support between distributors and ingesters #8522

aallawala opened this issue Jun 25, 2024 · 24 comments · Fixed by #9322 · May be fixed by grafana/dskit#582 or #10411

Comments

@aallawala
Copy link
Contributor

Is your feature request related to a problem? Please describe.

Are there plans for zstd support between distributors and ingesters using the grpc_compression flag?

Enabling zstd compression would help significantly reduce the amount of cross-AZ traffic our org pays for running Mimir. gzip gets some wins, at the expense of CPU.

Now that there is a pure Go zstd implementation, are there any major blockers for introducing it to Mimir?

Describe the solution you'd like

Support for zstd when using the grpc_compression.
A clear and concise description of what you want to happen.

Describe alternatives you've considered

A clear and concise description of any alternative solutions or features you've considered.

Additional context

Add any other context or screenshots about the feature request here.

@aknuds1
Copy link
Contributor

aknuds1 commented Jun 26, 2024

Are you specifically asking for the -ingester.client.grpc-compression flag to also support zstd?

@aallawala
Copy link
Contributor Author

Are you specifically asking for the -ingester.client.grpc-compression flag to also support zstd?

I'm specifically looking for compression support on the ingester-client, yes. That's our most expensive client.

Though, judging by how the config is laid out, I suspect it is a shared compressor for all the gRPC clients?

@aknuds1
Copy link
Contributor

aknuds1 commented Jun 26, 2024

Though, judging by how the config is laid out, I suspect it is a shared compressor for all the gRPC clients?

In effect yes. I just needed more clarification, as your description was relatively vague. I'm trying to see whether the team has any opinion on supporting zstd.

@aknuds1
Copy link
Contributor

aknuds1 commented Jun 26, 2024

So, I found out that the Mimir team already has considered zstd support, but the zstd implementation we considered is apparently not very performant.

Is that the same zstd implementation as you're referring to?

@aallawala
Copy link
Contributor Author

aallawala commented Jun 26, 2024

So, I found out that the Mimir team already has considered zstd support, but the zstd implementation we considered is apparently not very performant.

Is that the same zstd implementation as you're referring to?

Thanks for checking in. I'm looking at https://github.com/klauspost/compress/tree/master/zstd as a possible zstd implementation. It looks like what you linked is a wrapper on top of klauspost/compress.

What were the reasons why it was not performant? I'm surprised to hear it was not compared to the other offering, gzip. Surely it strikes in the middle of gzip and snappy but with a much higher compression ratio.

@aallawala
Copy link
Contributor Author

@aknuds1, checking in - are you able to share more on the reasons why that implementation was not performant?

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 17, 2024

I just know that @bboreham considered the aforementioned zstd implementation, and found performance problems with it.

When we discussed trying another compression algorithm for cheaper AZ traffic internally, the other point made was it wasn't worth all the testing that comes along with it.

@aknuds1
Copy link
Contributor

aknuds1 commented Jul 31, 2024

I have some new context to share to shed hopefully enough light on why the Mimir team doesn't think the investment into a new compression algorithm for distributor -> ingester traffic is going to be worthwhile. I don't know if you're perhaps already familiar with this, but Mimir is moving towards a fundamentally revamped architecture, where distributors will no longer be writing directly to ingesters, but instead to a Kafka compatible back-end, which ingesters again read from. Please see [FEATURE] Experimental Kafka-based ingest storage in the changelog for reference.

@bboreham
Copy link
Contributor

bboreham commented Aug 2, 2024

I remarked on the amount of memory allocations it does.
Looks like this was improved recently, with the addition of a pool in mostynb/go-grpc-compression@629c44d.

Someone else commented mostynb/go-grpc-compression#25

@deniszh
Copy link
Contributor

deniszh commented Sep 5, 2024

@aallawala
@aknuds1
@bboreham

I decided to experiment with zstd and s2 GRPC compression from https://github.com/mostynb/go-grpc-compression.
So, I created patched dskit, with zstd and s2 compression and then built mimir 2.13.0 with that patched dskit - it's nothing fancy, just dskit pointed to patched version.

Then I ran this patched version in our test cluster, first with zstd compression, then s2. Please note, that before test test cluster already run Mimir 2.13.0 with Snappy GRPC compression enabled, so, comparison below is to snappy and NOT uncopressed:

Results:

Zstd results comparing to Snappy:

  • traffic dropped from 4MB/s per writer to 2.2MB/s per writer incoming, 3Mb/s outgoing to 1MB/s outgoing
    • i.e. 7MB to 3.2MB drop from snappy - 54% less traffic
  • cpu didn't change significantly
  • memory increased a lot - from 6.5G per writer to 15GB per writer - I think thats a show stopper
  • 99p write latency dropped from 90 to 50 ms.

Image

Image

So, results are good, but memory usage is tremendous. I checked heap, majority of memory consumed by zstd.EnsureBlock:

Image

But I also have a good news too, when I tried to switch to s2 compression, which is better version of snappy.

S2 results comparing to Snappy:

  • traffic almost similar to zstd - receive 2.4MB/s per writer, send 1.4MB/s per writer
    • i.e 7MB to 3.8MB drop from snappy - 45% less traffic
    • cpu didn't change significantly
    • memory even dropped comparing to snappy - from 6.5G per writer to 5.8G per writer
    • 99p write latency dropped from 90 to 60 ms

Image

(left is Snappy, middle - zstd, right part - S2)

So, after that I decided to drop zstd (first I wanted to test Datadog or another wrapper but then I realized that Mimir has no CGO enabled) and S2 looks really promising.

I know that Mimir architecture is revamped and will be migrated to Kafka, but it will take some time to implement that properly - and S2 patch can shave 50% of cross AZ data cost right now.

And I don't like run patched version and prefer to port my changes to upstream. I can clean up and submit PR for experimental S2 support in dskit. Should I or no sense?

@aallawala
Copy link
Contributor Author

@deniszh, thanks so much for trying it out and posting your results. The memory increase seems to match up with what @bboreham said earlier in his findings too.

Thanks for also trying out s2. The results seem much more favorable and it would be something I can also help drive in order to get this ported upstream. Do you have a PR available on the dskit side for s2?

@deniszh
Copy link
Contributor

deniszh commented Sep 5, 2024

@aallawala : grafana/dskit#582

But please take note that latest mimir is not ready to latest dskit. So, if you want to test you can build mimir from this branch https://github.com/deniszh/mimir/tree/2.13.0-grpc-s2

@deniszh
Copy link
Contributor

deniszh commented Sep 5, 2024

Tried S2 on bigger cluster - it's less impressive but I see 70MB/s instead of 105MB/s - 30% decrease.

@aknuds1
Copy link
Contributor

aknuds1 commented Sep 11, 2024

Thanks for your testing @deniszh! The results look very promising. I'm asking the Mimir team what they think.

@aknuds1
Copy link
Contributor

aknuds1 commented Sep 11, 2024

@deniszh the Mimir team is positive to using S2 for gRPC compression in Mimir. I guess we should start out by reviewing your dskit PR.

@deniszh
Copy link
Contributor

deniszh commented Sep 25, 2024

👍

@Rohlik
Copy link

Rohlik commented Nov 6, 2024

Will be S2 better than gzip 🤔 compression?

@deniszh
Copy link
Contributor

deniszh commented Nov 6, 2024

I didn't compare to gzip. Maybe by ratio it is, but not by resources and speed.

@mostynb
Copy link

mostynb commented Dec 19, 2024

FWIW I am considering a change to the zstd wrapper in mostynb/go-grpc-compression#34

@deniszh
Copy link
Contributor

deniszh commented Jan 6, 2025

Ok, I decided to tried it one more time. First, I applied @mostynb patch above, but for some reason I got panic somewhere in ztsd.go when calling close() in my tests.
So, I decided to simply take S2 compressor (which works perfectly) and change S2 readers and writers there with zstd decoder and encoder. I implemented that in https://github.com/deniszh/mimir/tree/2.14.2-grpc-s2-zstd branch (see this commit, rest is backports of 2.14.x fixes).
After that I enabled zstd grpc compressor in the same cluster where I had S2 compressor before.
Results are below:

  • no changes in write latency, read latency, rule evaluation latency (comparing to S2)

  • very slight improve for write bandwith:

Image Image

(65MB S2 vs 54MB ZSTD - 17% better)
(45MB S2 vs 35MB ZSTD - 22% better)

  • no significant changes in read bandwith

  • no significant changes in CPU for read and write components

  • no significant changes in memory for mimir-write

Image
  • significant memory increase for mimir-read
Image

(300% worse for ZSTD, similar effect is for mimir-backend)

Conclusion:

  • not sure if saving 15-20% of network bandwidth worth 3x increase of RAM for read components. OTOH, it's much better than previous implementation (now mimir-read RAM consumption is not increased)
  • We would stay on S2, though.

@deniszh
Copy link
Contributor

deniszh commented Jan 8, 2025

PS: OTOH we need ZSTD compression between distributor and ingesters, would try that, using S2 for rest.

@deniszh
Copy link
Contributor

deniszh commented Jan 9, 2025

OK, I applied patch on top of 2.15.0, then override ingester compression with "-ingester.client.grpc-compression=zstd", keeping s2 for the rest of clients.
Results:

  • 99p write latency is the same

  • cpu / memory of write component mostly the same (memory even improved a bit, but probably because of 2.15.0)

Image
  • write bandwith improved:

RX - 63MB to 50MB
TX - 43MB to 32MB
Total - 106 to 82 - 22% more (comparing to S2, which is kinda impressive):

Image

Will do more tests and give it some time for stability check and then introduce PR for zstd only in ingester client.

@deniszh
Copy link
Contributor

deniszh commented Jan 9, 2025

OK, on bigger cluster I definitely see 22% of bandwith savings, but at the same time 40% increase of CPU in distributors. Ingesters resources are the same, so, probably, still feasible in some use cases but generally speaking S2 can be probably still good compromise.

@deniszh
Copy link
Contributor

deniszh commented Jan 10, 2025

Latest update - compared zstd with gzip. Compression ratio is the same as zstd (17% better than S2), but CPU-wise zstd is better than gzip for 15%, see graph of distributor CPU, normalized to S2 consumption:

Image

Left is S2 (100), medium - ZSTD (140), right - gzip (165). So, maybe still worth to have it in ingester if anyone wants to save more traffic than S2 in exchange for CPU (but less CPU than gzip).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment