-
Notifications
You must be signed in to change notification settings - Fork 9.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
nhcb: store custom buckets in WAL, WBL #14730
Comments
Can't we use native histogram chunk format as a first step? I guess that's the alternative 1, right? |
Yes. I don't know if it's a good idea, haven't thought it through. I'm worried that storing the custom values for every histogram individually is a huge waste of space. Similar to how we don't store the labels in every sample, but once per series: prometheus/tsdb/record/record.go Line 147 in 8c7bf39
On the other hand adding a new RefCustomValues record type like
is probably more complicated? |
@bwplotka We've discussed this question in the histograms sync meeting. It was pointed out that we already keep the bucket definitions separate from the values, since the bucket definition is a label |
That's separate discussion than chunk format, or is it similar decision? |
Separate. Reusing the same formatting as in the chunks should be fine. We can always add new kind of record if we need to. |
Hm... I would challenge this. It's true we move bucketing from label + sample to complex sample. But:
To sum up, I would not try to prematurely optimize with a new pattern, especially as it's not easy to add/maintain - but instead do same as native histograms are doing for now, especially as NH literally have CustomValues slice in it. Then if profile and benchmarks suggest, we can try to be more smart and split/only add new record when it's needed or something. Started some attempt on top of @carrieedwards code here: main...bwplotka/nhcb-wal-wbl The main "counter argument" is backward compatibility. I don't see any versioning for WAL records (?), so this will require a new Histogram and float histogram record (with versioning added ideally for future use). This however can and should be FULLY abstracted by inside |
Related #15200 |
To progress here: A) We could start by adding a single different record type for RefHistogram to replace float and normal histogram, but also supports custom buckets -- all with updated histogram scheme that:
Funny enough, even without tackling versioning a lot here, we could encode ONLY NHCB in this new record type, and then switch to it if successful for other histograms in a few later releases. B) We could continue separate bucketing record. It's inconsistent and might cause pain in future, but it does not break compatibility, I think. I wonder only as compared to @carrieedwards approach here is there a room to encapsulate reading/writing a single package or abstraction, adding new records even is spreaded around the codebase with pools everywhere. Technically head/wal code does not need to be aware about I think I sill lean on A), but personally will have more time helping on this after KubeCon NA |
Working through this myself to see if I understand: RefHistogramSample and As a baseline I think you need to absolutely support the current HistogramSamples encoding, since you wouldn't be able to upgrade with an existing WAL. So an idea "A"+"B" could be to keep the current The super annoying thing about the WAL/WBL is that it orders records based on type , not on append order, so it has series first, then floats, then histograms and float histograms etc. This would mean going back end forth between the types. No idea if that's bad or not? But your idea "A" would take care of that. Another consequence of the type based ordering in WAL/WBL is that it's kind of hard to implement #15177 and keep the appended order over Commit. So I'm actually tempted to say that we should fold even cc @bwplotka @carrieedwards also @codesome |
I have another idea based on "B" to optimize away storing custom values multiple times: have a marker that says: "use the custom value from the previous sample for the same series reference". And use the lastHistogramValue/ |
Agree with the ideas in #14730 (comment) As per later comment on optimizations based on what changed I think it's def powerful but it feels we have to first DRY the record here, by switching into single, more complex one (and version by record types?), then we can iterate with "unchanged bucketing" logic 👍 It feels like a lot of work, but if we team up perhaps it's eatable this year |
@carrieedwards and @krajorama will check if the "single" part of this is easily feasible as the interface of WAL currently handles integer and float histograms separately all over the place. If not, we'll have to go for two new record types and make more substantial changes together with other things. I think we're getting to the point where we'll need a design doc->proposal on how we want to reform the code around this. |
We've paired up and came to the conclusion that if we're not going to optimize and/or fix commit ordering then it makes sense to simply reuse the existing record types (HistogramSamples=7 and FloatHistogramSamples=8). One nice thing we could do is to release a version of Prometheus that can read and ignore NHCB encoded in WAL, but ignores it and make the next version use it for write as well. So that users have something to fall back on. |
This turned out to be wrong as you would be able to load the WAL anymore on a rollback. Opting to add new record types and when writing histogram samples to WAL we write two records: one containing the non NHCB samples and one with the NHCB samples. The old code will ignore the second one, enabling rollback without NHCB. |
Proposal
Related to prometheus/proposals#31
WAL, WBL records need to be able to encode the native histogram custom values.
Alternative 1:
Encode custom values into every histogram record.
Alternative 2:
This is the preferred method for implementation. #14730 (comment)
Introduce new record type that has a series reference and the custom values. Only record once every (segment?).
The text was updated successfully, but these errors were encountered: