HIVE-29465: Prevent excessive query results cache usage at runtime #6376
ramitg254 wants to merge 3 commits into apache:master
cacheFs.mkdirs(resultDir);

Set<FileStatus> cacheFilesToFetch = new HashSet<>();
rwLock.writeLock().lock();
I can see that in other places in this class, this is handled via a separate variable:
Lock writeLock = rwLock.writeLock();
try {
  writeLock.lock();
}
But that usage pattern is not consistent: feel free to pick one and unify all the lock usages.
Done, moved to a withWriteLock() method.
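For readers following the thread, a helper of this kind could look roughly like the sketch below. This is not the code from the PR: the class name `LockHelperSketch`, the `Supplier`-based signature, and the `entries` counter are assumptions made for illustration. Only the `rwLock` field name and the `withWriteLock` method name come from the discussion above.

```java
import java.util.concurrent.locks.Lock;
import java.util.concurrent.locks.ReentrantReadWriteLock;
import java.util.function.Supplier;

// Hypothetical stand-in for the cache class; not the actual Hive code.
class LockHelperSketch {
  private final ReentrantReadWriteLock rwLock = new ReentrantReadWriteLock();
  private int entries = 0; // illustrative shared state guarded by rwLock

  // Runs the action while holding the write lock and always releases it,
  // so every call site uses the same lock/unlock pattern.
  <T> T withWriteLock(Supplier<T> action) {
    Lock writeLock = rwLock.writeLock();
    writeLock.lock();
    try {
      return action.get();
    } finally {
      writeLock.unlock();
    }
  }

  // Example call site: mutate shared state under the write lock.
  int addEntry() {
    return withWriteLock(() -> ++entries);
  }

  // True while any thread holds the write lock.
  boolean writeLockHeld() {
    return rwLock.isWriteLocked();
  }
}
```

Centralizing the acquire and release in one method means the `finally` block cannot be forgotten at individual call sites, which is the consistency the review comment asks for.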
return false;
}

if (isSafeCacheWriteEnabled) {
Maybe refactor this whole block into a separate method.
Done, moved to the rewriteFetchWorkForSafeCacheWrite method.
"If the query results cache is enabled. This will keep results of previously executed queries " +
"to be reused if the same query is executed again."),

HIVE_QUERY_RESULTS_SAFE_CACHE_WRITE_ENABLED("hive.query.results.safe.cache.write.enabled", false,
For correct alphabetical order, this should rather be hive.query.results.cache.safe.write.enabled.
Done, corrected the order.



What changes were proposed in this pull request?
Introduces a safe cache write configuration. When it is enabled, query results are not written directly to the cache directory; an entry is copied into the cache directory only if it is valid.
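The idea described above (write elsewhere first, then copy into the cache only for a valid entry) can be sketched as follows. This is a simplified illustration, not the PR's implementation: it uses `java.nio.file` so that it runs standalone, whereas the diff context suggests the actual code works with Hadoop file system objects. The class name `SafeCacheWriteSketch`, the method `publishIfValid`, and the `entryIsValid` flag are all hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

// Hypothetical sketch of "safe cache write"; names and file API are assumptions.
class SafeCacheWriteSketch {

  // Copies a staged result file into cacheDir only when the entry is valid.
  // Returns true if the file was published to the cache.
  static boolean publishIfValid(Path stagedFile, Path cacheDir, boolean entryIsValid) {
    if (!entryIsValid) {
      return false; // an invalid entry never touches the cache directory
    }
    try {
      Files.createDirectories(cacheDir);
      Files.copy(stagedFile, cacheDir.resolve(stagedFile.getFileName()),
          StandardCopyOption.REPLACE_EXISTING);
      return true;
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }

  // Small demo: returns {publishedWhenInvalid, publishedWhenValid, fileInCache}.
  static boolean[] demo() {
    try {
      Path staging = Files.createTempDirectory("staging");
      Path cacheDir = Files.createTempDirectory("cache").resolve("results");
      Path staged = Files.write(staging.resolve("part-0"),
          "row1\n".getBytes(StandardCharsets.UTF_8));
      boolean invalid = publishIfValid(staged, cacheDir, false);
      boolean valid = publishIfValid(staged, cacheDir, true);
      return new boolean[] {invalid, valid, Files.exists(cacheDir.resolve("part-0"))};
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

In this sketch, results that turn out to be invalid stay in the staging location and can be cleaned up there, so they never consume space in the cache directory.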
Why are the changes needed?
The cache directory could spill over while queries were running, because cleanup is done only in post-execution.
Does this PR introduce any user-facing change?
No
How was this patch tested?
Locally and with CI tests.