NONEVM-3034: log pruning background worker #553
Conversation
Pull request overview
This PR implements a comprehensive log pruning system for the TON log poller, enabling automatic cleanup of old or excessive logs based on configurable retention policies.
Changes:
- Added time-based and count-based log pruning mechanisms with configurable intervals and batch sizes
- Extended database schema to support `log_retention` and `max_logs_kept` fields for filters, plus `expires_at` for logs
- Implemented background pruning worker that runs periodically to clean up expired logs, excess logs, and logs from deleted filters
Reviewed changes
Copilot reviewed 20 out of 22 changed files in this pull request and generated no comments.
Show a summary per file
| File | Description |
|---|---|
| scripts/.core_version | Updated core version reference |
| pkg/logpoller/store/postgres/models.go | Added retention fields to filter and log models |
| pkg/logpoller/store/postgres/logs.go | Implemented DeleteExpiredLogs, DeleteExcessLogs, and DeleteLogsForDeletedFilters methods |
| pkg/logpoller/store/postgres/filters.go | Added validation for retention fields and DeleteEmptyFilters method |
| pkg/logpoller/store/memory/logs.go | Added no-op implementations for in-memory store |
| pkg/logpoller/store/memory/filters.go | Added no-op DeleteEmptyFilters for in-memory store |
| pkg/logpoller/service.go | Integrated background pruning worker with configurable startup delay |
| pkg/logpoller/pruning.go | New file implementing backgroundPruningRun method |
| pkg/logpoller/parser.go | Updated to populate ExpiresAt field and refactored filter handling |
| pkg/logpoller/models/models.go | Changed FilterIndex to map to Filter pointers and added retention fields |
| pkg/logpoller/metrics.go | Added pruning-specific metrics (logs_deleted, pruning_duration, pruning_errors) |
| pkg/logpoller/interfaces.go | Added method signatures for pruning operations |
| pkg/logpoller/filter.go | Updated buildFilterIndex to store Filter pointers instead of IDs |
| pkg/logpoller/config_test.go | Refactored tests and added pruning configuration validation tests |
| pkg/logpoller/config.go | Added pruning configuration fields with validation |
| pkg/ccip/chainaccessor/event.go | Set default CCIP log retention to 30 days |
| integration-tests/logpoller/testdata/create_ton_logpoller_tables.sql | Added retention columns and pruning index |
| integration-tests/logpoller/pg_pruning_test.go | New comprehensive pruning integration test |
| integration-tests/logpoller/pg_filters_test.go | Added tests for retention field validation |
| integration-tests/go.mod | Added test dependencies |
patricios-space
left a comment
LGTM
ogtownsend
left a comment
Small comments, otherwise LGTM!
```go
// DeleteExcessLogs removes logs exceeding max_logs_kept for each filter.
func (s *pgLogStore) DeleteExcessLogs(ctx context.Context, limit int64) (int64, error) {
```
This one might end up getting a little expensive/slow, but it's only running every ~10 minutes, so we can keep an eye on it. Probably overkill to create an index for it now
Yeah, and for CCIP we don't have an actual log count limit per filter.
integration-tests/go.mod (Outdated)

```go
// but pin chainlink-ton root module to avoid conflicts with indirect dependency via CLDF.
replace github.com/smartcontractkit/chainlink-ton/deployment => ../deployment

// TODO: remove on merge (TBD: local package dependency management)
```
reminder
```go
PruningInterval   *config.Duration // How often to run pruning (default 10 minutes)
PruningBatchSize  int64            // Max rows to delete per batch (default 1000)
PruningStartDelay *config.Duration // Delay before first pruning cycle (default 5 minutes)
```
nit: add the new fields to the full config test: `TestNewDecodedTOMLConfig()` in `pkg/config/toml_test.go`
```sql
WHERE chain_id = :chain_id
  AND expires_at IS NOT NULL
  AND expires_at <= NOW()
LIMIT :limit
```
This could be a post-launch follow-up, but what if:
- `PruningBatchSize = 1000`, `PruningInterval = 10 minutes`
- something goes down for an extended time
- there are ~50k+ expired logs
- each pruning cycle deletes only 1k logs; catching up would take ~8 hours, and meanwhile we'll be accumulating more and more logs
- maybe a future optimization: this could return the number of logs deleted, and if `numLogsDeleted == limit`, loop immediately and issue another delete operation right away rather than waiting another 10 minutes 🤷
Good point, noted. Also, in the short term we can try increasing the batch size.
fc6570e
krebernisak
left a comment
Great work!
```go
promTonLpPruningErrors = promauto.NewCounterVec(prometheus.CounterOpts{
	Name: "ton_logpoller_pruning_errors_total",
	Help: "Total number of pruning errors",
}, []string{"chainID"})
```
Is node metadata (id, name) added automatically to these metrics?
Copilot re-reviewed 21 out of 23 changed files in this pull request and generated no new comments.