feat(dag-protobuf): cache dag pb directory structure and block indexes #147
LGTM though I think we should set longer TTL -- we can just drop expiry and see how large the cache gets over time. the big expense is writes anyway.
I'd set a TTL of a month or something.
Needs tests.
port: 8787,
inspectorPort: 9898,
log: new Log(LogLevel.INFO),
cache: false, // Disable Worker Global Cache to test cache middlewares
If we don't disable this global cache in tests, the requests won't reach the Content Claims Dagula middleware, where we have the KV cache.
LGTM on condition we set the prod cache to 30 days.
🤖 I have created a release *beep* *boop*

---

## [2.25.0](v2.24.0...v2.25.0) (2025-01-28)

### Features

* **dag-protobuf:** cache dag pb directory structure and block indexes ([#147](#147)) ([e367852](e367852))

---

This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Context
The requests to fetch a DAG Protobuf directory structure using a CID execute the following steps:
- `bafy...cid` - which represents the folder containing the target file, so that we can determine the verifiable CID for the file (let's call that `bafy...file`)
- `bafy...file` to get the root block of the file, which in UnixFS contains NO raw data, but rather is a list of sub-blocks that contain the file (let's call those `bafy...bytes1` and `bafy...bytes2`)

This PR enables a caching strategy for steps 2 to 4: instead of fetching the directory structure from the locator and navigating the DAG on every request, it caches the DAGs that have a Protobuf structure and a content size <= 2MB.
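The caching rule above can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `KVLike`, `CacheOptions`, `isCacheable`, `getOrFetch`, and `fetchFromLocator` are hypothetical names, and the KV interface is a simplified stand-in for a key-value binding. The only external fact relied on is that `0x70` is the multicodec code for dag-pb.

```typescript
// Hypothetical sketch: none of these names come from the PR itself.
const DAG_PB_CODEC = 0x70 // multicodec code for dag-pb

// Simplified stand-in for a key-value store binding.
interface KVLike {
  get(key: string): Promise<Uint8Array | null>
  put(key: string, value: Uint8Array, opts: { expirationTtl: number }): Promise<void>
}

interface CacheOptions {
  enabled: boolean
  ttlSeconds: number
  maxSizeBytes: number
}

// Content is cacheable only when the cache is enabled, the root is dag-pb,
// and the content fits within the configured size limit.
function isCacheable(codec: number, size: number, opts: CacheOptions): boolean {
  return opts.enabled && codec === DAG_PB_CODEC && size <= opts.maxSizeBytes
}

// Read-through cache: serve from KV on a hit, otherwise fetch via the
// locator and store the result if it qualifies.
async function getOrFetch(
  kv: KVLike,
  key: string,
  codec: number,
  opts: CacheOptions,
  fetchFromLocator: () => Promise<Uint8Array>
): Promise<Uint8Array> {
  if (opts.enabled && codec === DAG_PB_CODEC) {
    const hit = await kv.get(key)
    if (hit) return hit
  }
  const bytes = await fetchFromLocator()
  if (isCacheable(codec, bytes.byteLength, opts)) {
    await kv.put(key, bytes, { expirationTtl: opts.ttlSeconds })
  }
  return bytes
}
```

With this shape, a second request for the same key skips the locator entirely, while non-dag-pb or oversized content always takes the uncached path.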
Changes
- `withContentClaimsDagula` middleware to cache DAG PB content requests
- `DAGPB_CONTENT_CACHE`
- `FF_DAGPB_CONTENT_CACHE_TTL_SECONDS`: The number of seconds from now after which the key-value pair expires. The minimum value is 60 seconds. Any value less than 60 seconds will not be used. We will use a 30-day TTL by default for the Production environment.
- `FF_DAGPB_CONTENT_CACHE_MAX_SIZE_MB`: The maximum size of the key-value pair in MB. The minimum value is 1 MB. Any value less than 1 MB will not be used. We will use a 2MB max file size by default.
- `FF_DAGPB_CONTENT_CACHE_ENABLED`: The flag that enables the DAGPB content cache. The cache is disabled in prod by default.

Samples
2MB file - no cache
[Screenshot: 2mb-file-no-cache]
2MB file - cached
[Screenshot: 2mb-file-cached]
KV Limits
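The 60-second TTL floor described under Changes matches the minimum expiration TTL that Workers KV accepts. As a rough sketch of how the three flags could be read with those floors applied: the flag names come from the PR description, but `DagPbCacheEnv`, `parseAtLeast`, and `readCacheConfig` are hypothetical, and falling back to the default when a value is below the minimum is an assumed reading of "will not be used".

```typescript
// Hypothetical helper: flag names are from the PR description, the parsing
// rules are an interpretation of it rather than the actual implementation.
interface DagPbCacheEnv {
  FF_DAGPB_CONTENT_CACHE_ENABLED?: string
  FF_DAGPB_CONTENT_CACHE_TTL_SECONDS?: string
  FF_DAGPB_CONTENT_CACHE_MAX_SIZE_MB?: string
}

const MIN_TTL_SECONDS = 60
const DEFAULT_TTL_SECONDS = 30 * 24 * 60 * 60 // 30 days
const MIN_MAX_SIZE_MB = 1
const DEFAULT_MAX_SIZE_MB = 2

// Return the parsed number when it is finite and at least `min`;
// otherwise ignore it and use `fallback` (assumed behavior).
function parseAtLeast(raw: string | undefined, min: number, fallback: number): number {
  if (raw === undefined || raw.trim() === '') return fallback
  const n = Number(raw)
  return Number.isFinite(n) && n >= min ? n : fallback
}

function readCacheConfig(env: DagPbCacheEnv) {
  return {
    // Assumes the flag is the literal string 'true' when enabled.
    enabled: env.FF_DAGPB_CONTENT_CACHE_ENABLED === 'true',
    ttlSeconds: parseAtLeast(
      env.FF_DAGPB_CONTENT_CACHE_TTL_SECONDS,
      MIN_TTL_SECONDS,
      DEFAULT_TTL_SECONDS
    ),
    maxSizeBytes:
      parseAtLeast(
        env.FF_DAGPB_CONTENT_CACHE_MAX_SIZE_MB,
        MIN_MAX_SIZE_MB,
        DEFAULT_MAX_SIZE_MB
      ) * 1024 * 1024
  }
}
```

Under these assumptions, an empty environment yields a disabled cache with a 30-day TTL and a 2MB size limit, and a TTL of `'30'` is ignored in favor of the default.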
resolves storacha/project-tracking#301