-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: Skip writeback for chunks fetched by queriers older than a duration #15393
Conversation
@@ -138,6 +140,10 @@ func (c *Fetcher) FetchChunks(ctx context.Context, chunks []chunk.Chunk) ([]chun | |||
l2OnlyChunks := make([]chunk.Chunk, 0, len(chunks)) | |||
|
|||
for _, m := range chunks { | |||
if c.skipQueryWritebackCacheOlderThan > 0 && m.From.Time().Before(time.Now().UTC().Add(-c.skipQueryWritebackCacheOlderThan)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this does not look like the right place for this check. It is preventing the fetcher from fetching chunks older than skipQueryWritebackCacheOlderThan
.
WriteBackCache()
method might be a better place to add this
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for catching that and I do have the check in the WriteBackCache.
What I do want to add is the ability to short circuit a cache lookup if it before this duration value. There might already be a setting for that and need to take a deeper look
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ashwanthgoli This is going to fit our needs for the time being and would like to get this merged.
If you feel there should be some work to be done to short-circuit the read path for these requests to the cache, we can look to add that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What I do want to add is the ability to short circuit a cache lookup if it before this duration value.
Sorry, I think you go it right the first time. processCacheResponse should be able to figure out the missing keys and request them from store using the chunks
request
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rest looks good! Happy to approve once you revert the latest commit
@@ -230,6 +231,10 @@ func (c *Fetcher) WriteBackCache(ctx context.Context, chunks []chunk.Chunk) erro | |||
keysL2 := make([]string, 0, len(chunks)) | |||
bufsL2 := make([][]byte, 0, len(chunks)) | |||
for i := range chunks { | |||
if c.skipQueryWritebackCacheOlderThan > 0 && chunks[i].From.Time().Before(time.Now().UTC().Add(-c.skipQueryWritebackCacheOlderThan)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: return early if the resulting keys
or keysL2
is empty
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where would we return early? There is already a check if keysL2
is greater than 0. I can add that to the main cache, but an empty slice passed to the store is a no-op
@@ -56,6 +56,11 @@ func TestPipeline_JSON(t *testing.T) { | |||
entry string | |||
expectedExtract map[string]interface{} | |||
}{ | |||
"successfully run a pipeline with 1 logfmt stage with log not using json formatted string": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
looks unrelated. can you raise a separate pr for this if needed?
@@ -44,6 +44,11 @@ func TestPipeline_Logfmt(t *testing.T) { | |||
entry string | |||
expectedExtract map[string]interface{} | |||
}{ | |||
"successfully run a pipeline with 1 logfmt stage with log not using logfmt formatted string": { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same, looks unrelated to this pr.
@@ -38,7 +39,7 @@ func (cfg *ChunkStoreConfig) RegisterFlags(f *flag.FlagSet) { | |||
f.DurationVar(&cfg.L2ChunkCacheHandoff, "store.chunks-cache-l2.handoff", 0, "Chunks will be handed off to the L2 cache after this duration. 0 to disable L2 cache.") | |||
f.BoolVar(&cfg.chunkCacheStubs, "store.chunks-cache.cache-stubs", false, "If true, don't write the full chunk to cache, just a stub entry.") | |||
cfg.WriteDedupeCacheConfig.RegisterFlagsWithPrefix("store.index-cache-write.", "", f) | |||
|
|||
f.DurationVar(&cfg.SkipQueryWritebackOlderThan, "store.skip-query-writeback-older-than", 0, "Chunks fetched from queriers before this duration will not be written to the cache. A value of 0 will write all chunks to the cache") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: wdyt about SkipCachingChunksOlderThan
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason for including Query
in the name was to distinguish that these were the chunks being written by the querier to the cache. This configuration only applies to the query operation.
If I needed to ingest historical data from 3 months ago into Loki, using SkipCachingChunksOlderThan
might imply that the ingested data is not written to the cache
…rafana#15378) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…hunks-inspect (grafana#15377) Signed-off-by: dependabot[bot] <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…ion filter prefix (grafana#15381) Exposes the filter_prefix in the lambda_function of the aws_s3_bucket_notification resource as a terraform variable.
…/v2 to v2.2.0 (grafana#15392) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…afana#15396) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…a#15400) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…not contain any streams (grafana#13706)
…ken (grafana#13195) This adds an extra nil check to verify the Azure blob storage client has received a service principal token before trying to do anything with said token. This has led to crashes if `use_federated_token is enabled` and an error message is returned instead of a token.
Signed-off-by: Christian Haudum <[email protected]>
Signed-off-by: Christian Haudum <[email protected]>
…ana#15556) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…rafana#15540) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…ana#15529) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…na#15539) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…tly (grafana#15433) Co-authored-by: Christian Haudum <[email protected]> Co-authored-by: Owen Diehl <[email protected]>
…5564) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…hile processing delete requests (grafana#15541)
…rafana#15573) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…rafana#15569) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
…fana#15581) Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Co-authored-by: renovate[bot] <29139614+renovate[bot]@users.noreply.github.com>
Signed-off-by: Owen Diehl <[email protected]> Co-authored-by: Ashwanth <[email protected]>
Closing this PR and will submit a new one as I messed up branch |
What this PR does / why we need it:
This change will prevent any chunks fetched by the querier to not be written back to the chunk cache that are before a given time range.
95% of the searches that our engineering teams execute are within a 7 day range and we have a cache enabled to support 14 days worth of chunks to aid in query performance. We recently found that we have compliance need to maintain searchable logs for 365 days. While these queries outside of the 7 day window are rare, they tend to be longer duration searches of up to 30 days for compliance, contain a lot of data, and tend to be one-off searches.
Currently these chunks fetched by queriers outside our target window are written back to the chunk cache and causing more recent data to be evicted from the cache and impacting the query performance for the most recent data.
Which issue(s) this PR fixes:
Fixes #14983
Special notes for your reviewer:
Checklist
CONTRIBUTING.md
guide (required)feat
PRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.md
deprecated-config.yaml
anddeleted-config.yaml
files respectively in thetools/deprecated-config-checker
directory. Example PR