HIVE-29465:Prevent excessive query results cache usage at runtime #6376

Open
ramitg254 wants to merge 3 commits into apache:master from ramitg254:HIVE-29465

ramitg254 commented on Mar 18, 2026

What changes were proposed in this pull request?

Introduces a safe cache write configuration. When it is enabled, query results are not written directly to the cache directory; an entry is copied to the cache directory only if it is valid.
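
A minimal sketch of the stage-then-publish idea described above, for illustration only. It uses `java.nio.file` on the local filesystem rather than the Hadoop `FileSystem` API that Hive uses, and the names `SafeCacheWriteSketch`, `publishIfValid`, and `demo` are hypothetical and do not come from this patch.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: results are staged outside the cache and published only if valid.
final class SafeCacheWriteSketch {

    // Copies the regular files in stagingDir into cacheDir/entryName, but only when
    // entryIsValid is true. Returns true if the entry was published to the cache.
    static boolean publishIfValid(Path stagingDir, Path cacheDir, String entryName, boolean entryIsValid) {
        if (!entryIsValid) {
            return false; // invalid entries never touch the cache directory
        }
        try {
            Path target = Files.createDirectories(cacheDir.resolve(entryName));
            List<Path> staged;
            try (Stream<Path> files = Files.list(stagingDir)) {
                staged = files.filter(Files::isRegularFile).collect(Collectors.toList());
            }
            for (Path file : staged) {
                Files.copy(file, target.resolve(file.getFileName()), StandardCopyOption.REPLACE_EXISTING);
            }
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Runs the flow once against temporary directories and returns how many
    // files ended up under the cache entry.
    static long demo(boolean entryIsValid) {
        try {
            Path staging = Files.createTempDirectory("staging");
            Path cache = Files.createTempDirectory("cache");
            Files.writeString(staging.resolve("part-0"), "row1\nrow2\n");
            publishIfValid(staging, cache, "entry-1", entryIsValid);
            Path entry = cache.resolve("entry-1");
            if (!Files.exists(entry)) {
                return 0;
            }
            try (Stream<Path> files = Files.list(entry)) {
                return files.count();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("valid entry, files in cache: " + demo(true));
        System.out.println("invalid entry, files in cache: " + demo(false));
    }
}
```

In this sketch the cache directory only ever receives entries that passed validation, so nothing has to be cleaned out of it when an entry is rejected.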

Why are the changes needed?

The cache directory could overfill while a query was running, because cleanup is done only in post-execution.

Does this PR introduce any user-facing change?

No

How was this patch tested?

Locally and with CI tests.

ramitg254 changed the title from "[WIP]HIVE-29465:Prevent excessive query results cache usage at runtime" to "[WIP] HIVE-29465:Prevent excessive query results cache usage at runtime" on Mar 18, 2026
ramitg254 changed the title from "[WIP] HIVE-29465:Prevent excessive query results cache usage at runtime" to "HIVE-29465:Prevent excessive query results cache usage at runtime" on Mar 23, 2026
ramitg254 changed the title from "HIVE-29465:Prevent excessive query results cache usage at runtime" to "[WIP]HIVE-29465:Prevent excessive query results cache usage at runtime" on Mar 23, 2026
ramitg254 changed the title from "[WIP]HIVE-29465:Prevent excessive query results cache usage at runtime" to "HIVE-29465:Prevent excessive query results cache usage at runtime" on Mar 23, 2026

    cacheFs.mkdirs(resultDir);

    Set<FileStatus> cacheFilesToFetch = new HashSet<>();
    rwLock.writeLock().lock();

Contributor

I can see that in other places in this class, this is handled via a separate variable:

    Lock writeLock = rwLock.writeLock();
    try{
        writeLock.lock();
    }

but that usage pattern is not consistent: feel free to pick one, and unify all the lock usages

Contributor Author

Done, moved to the withWriteLock() method.
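
For readers following along, a helper of this kind might look like the sketch below. The actual signature of `withWriteLock()` in the patch is not shown in this thread, so the `Supplier`-based shape, the class name `WriteLockSketch`, and the `entries` list are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical class; not the Hive cache class under review.
final class WriteLockSketch {
    private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
    private final List<String> entries = new ArrayList<>();

    // Runs the action while holding the write lock and always releases the lock,
    // so every caller follows the same lock/unlock pattern.
    <T> T withWriteLock(Supplier<T> action) {
        Lock writeLock = rwLock.writeLock();
        writeLock.lock(); // acquire before try, so finally never unlocks an unheld lock
        try {
            return action.get();
        } finally {
            writeLock.unlock();
        }
    }

    // Example caller: mutates shared state under the write lock.
    int addEntry(String name) {
        return withWriteLock(() -> {
            entries.add(name);
            return entries.size();
        });
    }

    // True when the current thread holds the write lock while the action runs.
    boolean lockHeldInsideAction() {
        return withWriteLock(rwLock::isWriteLockedByCurrentThread);
    }

    // True when the current thread still holds the write lock outside any action.
    boolean lockHeldOutsideAction() {
        return rwLock.isWriteLockedByCurrentThread();
    }

    public static void main(String[] args) {
        WriteLockSketch sketch = new WriteLockSketch();
        System.out.println("entries after add: " + sketch.addEntry("q1"));
        System.out.println("held inside action: " + sketch.lockHeldInsideAction());
        System.out.println("held outside action: " + sketch.lockHeldOutsideAction());
    }
}
```

Routing every write through one helper keeps the acquire and release in a single place, which is one way to get the consistency asked for in the review.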

    return false;
    }

    if (isSafeCacheWriteEnabled) {

Contributor

Maybe refactor this whole block into a separate method.

Contributor Author

Done, moved to the rewriteFetchWorkForSafeCacheWrite method.

    "If the query results cache is enabled. This will keep results of previously executed queries " +
    "to be reused if the same query is executed again."),

    HIVE_QUERY_RESULTS_SAFE_CACHE_WRITE_ENABLED("hive.query.results.safe.cache.write.enabled", false,

Contributor

For correct alphabetical order, this should rather be hive.query.results.cache.safe.write.enabled.

Contributor Author

Done, corrected the order.

4 participants