branch-4.0: [fix](filecache) unify TTL expiration calculation and persist pending rowset timestamps #62287

Open
freemandealer wants to merge 1 commit into apache:branch-4.0 from freemandealer:pick-branch-4.0-f3156c9ef719

@freemandealer
Contributor

What problem does this PR solve?

Issue Number: N/A

Related PR: N/A

Problem Summary:

This PR backports commit f3156c9ef71 to branch-4.0 and fixes file cache TTL/expiration inconsistency in Doris BE.

  • Add a shared helper to calculate file cache expiration with expired-value clamping and overflow protection.
  • Reuse the helper in read, write, and cloud warmup paths so the same cache object gets the same expiration_time.
  • Persist newest_write_timestamp for pending/prepared rowsets earlier to avoid timestamp drift between write and later read/warmup flows.

Release note

None

Check List (For Author)

  • Test

    • No need to test or manual test. Explain why:
      • Other reason
      • This is a targeted branch-4.0 backport. No focused test was run in this environment.
  • Behavior changed:

    • No.
    • Yes. The file cache TTL expiration path is now consistent across read, write, and warmup, and pending rowset timestamps are persisted earlier.
  • Does this need documentation?

    • No.
    • Yes.

… rowset timestamps

The file cache TTL flow used inconsistent expiration rules across write, read,
and warmup paths. Query paths clamped expired values to 0, while writer and
some warmup paths could still produce non-zero absolute expiration timestamps.
Pending rowsets also delayed persisting newest_write_timestamp, which let cache
writes use a different time base from later reads and warmup tasks.

This change adds a shared helper to calculate file cache expiration with
validation, overflow protection, and expired-value clamping. It reuses the same
logic in rowset read, rowset write, and cloud warmup paths so the same cache
hash gets the same expiration_time consistently.
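
As a rough illustration of the shared helper described above, the sketch below combines validation, overflow protection, and expired-value clamping in one function. The name `calc_file_cache_expiration_sketch`, its signature, and the choice to return 0 on overflow are assumptions made for this example; they are not taken from the Doris source.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper, for illustration only (not the actual Doris function).
// Returns an absolute expiration time in seconds since the epoch, or 0 when
// the inputs are invalid, the sum would overflow, or the entry already expired.
inline uint64_t calc_file_cache_expiration_sketch(int64_t newest_write_timestamp,
                                                  int64_t ttl_seconds,
                                                  int64_t now) {
    // Validation: without a positive TTL and a positive timestamp there is no TTL entry.
    if (ttl_seconds <= 0 || newest_write_timestamp <= 0) {
        return 0;
    }
    // Overflow protection: check before adding so signed overflow never happens.
    // Returning 0 here is an assumption of this sketch.
    if (newest_write_timestamp > std::numeric_limits<int64_t>::max() - ttl_seconds) {
        return 0;
    }
    const int64_t expiration = newest_write_timestamp + ttl_seconds;
    // Expired-value clamping: an expiration that is not in the future becomes 0.
    if (expiration <= now) {
        return 0;
    }
    return static_cast<uint64_t>(expiration);
}
```

If read, write, and warmup paths all call one function like this with the same timestamp and TTL, they get the same result for the same cache hash. With a timestamp of 1000 and a TTL of 60, this sketch yields 1060 while `now` is before 1060, and 0 from 1060 onward.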

This change also persists newest_write_timestamp into pending/prepared rowset
meta during initialization when the context already provides a valid value.
That keeps import/write, query, and warmup flows aligned on the same timestamp
base and avoids generating different TTL cache directories for the same object.
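
The early-persist step can be pictured with the minimal sketch below. The two structs are hypothetical stand-ins for the writer context and the rowset meta, and treating "valid" as "greater than zero" is an assumption of this example rather than something the PR states.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the writer context and rowset meta (illustration only).
struct RowsetWriterContextSketch {
    int64_t newest_write_timestamp = -1;  // -1 means "not provided" in this sketch
};

struct RowsetMetaSketch {
    int64_t newest_write_timestamp = -1;
};

// Copies the timestamp into the pending/prepared rowset meta at initialization
// time, but only when the context already carries a valid (positive) value.
// Returns true when the value was persisted.
inline bool persist_newest_write_timestamp_sketch(const RowsetWriterContextSketch& ctx,
                                                  RowsetMetaSketch& meta) {
    if (ctx.newest_write_timestamp <= 0) {
        return false;  // leave the meta untouched; a later stage may fill it in
    }
    meta.newest_write_timestamp = ctx.newest_write_timestamp;
    return true;
}
```

Because the meta holds the timestamp from initialization onward, a later read or warmup that derives the expiration from the meta starts from the same base as the write that populated the cache.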
@Thearas
Contributor

Thearas commented Apr 9, 2026

Thank you for your contribution to Apache Doris.
Don't know what should be done next? See How to process your PR.

Please clearly describe your PR:

  1. What problem was fixed (it's best to include specific error reporting information). How it was fixed.
  2. Which behaviors were modified. What was the previous behavior, what is it now, why was it modified, and what possible impacts might there be.
  3. What features were added. Why was this function added?
  4. Which code was refactored and why was this part of the code refactored?
  5. Which functions were optimized and what is the difference before and after the optimization?
