[BUG] Large number of SST files in locations not being cleared after upgrade #4554

Open
danpi opened this issue Feb 20, 2025 · 3 comments · May be fixed by #4555

danpi commented Feb 20, 2025

BUG REPORT

Describe the bug
After upgrading Pulsar from 2.8 to the 3.0+ LTS release, and BookKeeper correspondingly from 4.14.4 to 4.16.6, the system initially runs well. However, once throughput reaches a certain level (for example, 100MB of writes per node) and the system has been running for some time, physical memory usage grows linearly. The BookKeeper JVM itself shows no significant change in memory usage, and a large number of SST files in BookKeeper's locations directory are never cleared.

To Reproduce

Steps to reproduce the behavior:

  1. Deploy a minimal Pulsar cluster.
  2. Configure the message retention time in the Pulsar broker for the namespace, setting both retention and TTL to 6 hours.
  3. Use the Pulsar perf tool to write data, maintaining a throughput of 30MB/s or higher per BookKeeper node. The higher the throughput, the easier it is to reproduce the issue.
  4. After running continuously for a few days, you will observe that the oldest *.log files in BookKeeper are about 6 hours old, but the oldest files in the locations directory are several days old and are never cleared.
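
The comparison in step 4 can be checked with a small script. This is a minimal Python sketch, not part of BookKeeper or Pulsar: the helper name `oldest_file_age_hours` is hypothetical, and the directories you pass to it depend on your own `ledgerDirectories` layout (the entry log directory and its `locations` subdirectory are assumed here).

```python
import time
from pathlib import Path


def oldest_file_age_hours(directory, pattern, now=None):
    """Return the age in hours of the oldest file in `directory` whose name
    matches `pattern` (by modification time), or None if nothing matches.

    Hypothetical helper for illustration only.
    """
    now = time.time() if now is None else now
    # Collect modification times of regular files matching the glob pattern.
    mtimes = [
        p.stat().st_mtime
        for p in Path(directory).glob(pattern)
        if p.is_file()
    ]
    if not mtimes:
        return None
    # The oldest file has the smallest modification time.
    return (now - min(mtimes)) / 3600.0
```

Calling it with `"*.log"` on the entry log directory and with `"*.sst"` on the locations directory gives two ages to compare. With retention and TTL both set to 6 hours, an SST age of several days next to an entry log age of about 6 hours matches the behavior reported here.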

Expected behavior
The SST files and entry logs should both be retained for only 6 hours, as configured. There should not be a large accumulation of SST files, which can negatively impact query performance, increase disk storage pressure, and lead to physical memory usage issues.

Screenshots
As shown in the figure, the entry log files are retained for at most 6 hours.
Image
However, the oldest SST files in the locations directory remain uncleared.
Image

Additional context

danpi added the type/bug label Feb 20, 2025
@hezhangjian (Member)

Is this the same problem as #3605?

danpi commented Feb 20, 2025

> Is this the same problem as #3605?

In fact, the issue I encountered is more similar to issue #4145.

danpi commented Feb 20, 2025

This issue is related to PR #3653, which replaced the previous one-by-one deletion behavior with deleteRange, thereby improving the performance of entry location deletions.
