Add cached find support#899
Conversation
|
Caution Review failedAn error occurred during the review process. Please try again later. 📝 WalkthroughWalkthroughAdds a TTL-backed ChangesfindCached() Implementation and Validation
Sequence Diagram(s)sequenceDiagram
participant Caller
participant findCached as Database::findCached
participant Cache
participant find as Database::find
Caller->>findCached: findCached(collection, queries, ttl, key, touchOnHit)
alt ttl=0 or authorization active or random order or malformed queries
findCached->>find: find(collection, queries)
find-->>Caller: documents
else cache path
findCached->>findCached: getCachedFindKeys(collection, queries, key)
findCached->>Cache: load(indexKey, entryKey, ttl)
alt cache hit
Cache-->>findCached: serialized payload
findCached->>findCached: decode documents, check isTtlExpired() per document
alt any document expired
findCached->>Cache: purge(indexKey, entryKey)
findCached->>find: find(collection, queries)
find-->>findCached: fresh documents
findCached->>Cache: save(indexKey, entryKey, payload)
findCached-->>Caller: fresh documents
else all valid
findCached->>Cache: touch(indexKey, entryKey) if touchOnHit
findCached-->>Caller: cached documents + EVENT_DOCUMENT_FIND
end
else cache miss
findCached->>find: find(collection, queries)
find-->>findCached: documents
findCached->>Cache: save(indexKey, entryKey, payload)
findCached-->>Caller: documents
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Greptile Summary
Confidence Score: 5/5The cached find changes appear safe to merge based on the reviewed code and accompanying targeted tests. The implementation is covered by focused unit and adapter tests for cache keys, purge behavior, empty result caching, TTL-backed refresh, and relationship/filter dimensions.
What T-Rex did
Reviews (4): Last reviewed commit: "Add list cache key helper" | Re-trigger Greptile |
| if ($key !== null) { | ||
| return [$findIndexKey, $collectionKey . ':' . $key]; |
There was a problem hiding this comment.
The caller-owned $key branch drops the payload dimensions that were just added for database, schema hash, relationship mode, and active filters. When the same namespace, collection, and stable key such as rules_v1 are reused across setDatabase() scopes or after a schema/filter/relationship-mode change, both calls read and write the same {$collectionKey}:{$key} entry. That can return stale results with the wrong shape or filter semantics until TTL expiry or purge. Keep the readable caller key if needed, but include the non-query dimensions in the final cache entry key.
There was a problem hiding this comment.
Actionable comments posted: 5
🧹 Nitpick comments (1)
tests/unit/FindCacheTest.php (1)
346-371: ⚖️ Poor tradeoffConsider simplifying
setCachedDocumentAttribute()complexity.This method has nested conditionals and handles multiple payload formats. While functional, consider extracting helper methods to improve readability.
💡 Example refactoring approach
+ private function extractDocuments(mixed $payload): ?array + { + if (!\is_array($payload)) { + return null; + } + return \is_array($payload['documents'] ?? null) ? $payload['documents'] : $payload; + } + + private function updateDocumentInPayload(array $payload, array $documents): array + { + if (\array_key_exists('documents', $payload)) { + $payload['documents'] = $documents; + return $payload; + } + return $documents; + } + public function setCachedDocumentAttribute(string $key, string $hash, string $documentId, string $attribute, mixed $value): void { $usesEntryKey = isset($this->store[$hash][$hash]); $cacheKey = $usesEntryKey ? $hash : $key; $cacheHash = $hash; $payload = $this->store[$cacheKey][$cacheHash]['data'] ?? []; - $documents = \is_array($payload) && \is_array($payload['documents'] ?? null) ? $payload['documents'] : $payload; - if (!\is_array($documents)) { + $documents = $this->extractDocuments($payload); + if ($documents === null) { return; } foreach ($documents as $index => $document) { if (!\is_array($document) || ($document['$id'] ?? '') !== $documentId) { continue; } $documents[$index][$attribute] = $value; - if (\is_array($payload) && \array_key_exists('documents', $payload)) { - $payload['documents'] = $documents; - $this->store[$cacheKey][$cacheHash]['data'] = $payload; - } else { - $this->store[$cacheKey][$cacheHash]['data'] = $documents; - } + $this->store[$cacheKey][$cacheHash]['data'] = $this->updateDocumentInPayload($payload, $documents); return; } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@tests/unit/FindCacheTest.php` around lines 346 - 371, The `setCachedDocumentAttribute()` method has high complexity due to nested conditionals and handling multiple payload formats. Refactor this method by extracting helper methods such as one for extracting the documents array from the payload (to handle both array and object formats), one for finding the target document by ID, and one for updating the store with the modified payload. This will improve readability and make the logic flow more clear while reducing nesting levels.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@src/Database/Database.php`:
- Around line 9670-9687: The cache key generation at line 9684-9685 when $key is
not null drops important cache dimensions by returning only the collectionKey
with the caller key appended, ignoring the database, schema, relationships, and
filters in the payload. This causes the same caller key to potentially return
wrong cached results across different scopes. Fix this by computing a SHA-256
digest of the full $payload (similar to line 9687 for the null key case) and
appending it to the collectionKey along with the caller key, ensuring the cache
key is properly scoped by all relevant dimensions. Also replace the MD5 digest
on line 9687 with SHA-256 for consistency.
- Around line 8749-8756: The cached documents need to be re-cast to ensure
schema values are handled consistently since cache backends may round-trip
values differently than the adapter path. After the document is created with
createDocumentInstance and passes the TTL expiration check in isTtlExpired, call
the casting() method on the document before appending it to the $results array
to match the single-document cache path behavior.
- Around line 8639-8646: The caching mechanism in the find operation does not
properly handle relationships within documents. When loadCachedFind returns
cached results for collections that contain relationships, the nested
relationship values are not recursively hydrated and may return scalars or
arrays instead of Document objects, causing inconsistency with non-cached
results. Modify the logic after loadCachedFind to either bypass the cache
entirely for collections containing relationships by checking if the
collectionDocument has relationships before calling getCachedFindKeys and
loadCachedFind, or alternatively ensure that any cached results are recursively
hydrated to properly reconstruct relationship Document instances before
returning them.
In `@tests/e2e/Adapter/Scopes/DocumentTests.php`:
- Around line 114-117: The test in DocumentTests uses a hardcoded collection
name 'findCached' that is not cleaned up after the test runs, which can cause
flaky behavior in shared test environments where reruns may encounter stale
collections. Replace the hardcoded collection name with a dynamically generated
unique identifier (such as a UUID or timestamp-based name) assigned to the
$collection variable, and add teardown or cleanup logic at the end of the test
to call $database->deleteCollection() to remove the test collection after
assertions are complete, ensuring the test does not leave behind artifacts that
could interfere with subsequent runs.
In `@tests/unit/FindCacheTest.php`:
- Around line 410-413: The getName method in the FindCacheTest.php file has an
unused parameter $key that should be removed. Update the method signature of
getName to remove the ?string $key = null parameter while keeping the return
type as string and the return statement unchanged.
---
Nitpick comments:
In `@tests/unit/FindCacheTest.php`:
- Around line 346-371: The `setCachedDocumentAttribute()` method has high
complexity due to nested conditionals and handling multiple payload formats.
Refactor this method by extracting helper methods such as one for extracting the
documents array from the payload (to handle both array and object formats), one
for finding the target document by ID, and one for updating the store with the
modified payload. This will improve readability and make the logic flow more
clear while reducing nesting levels.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
Run ID: 9841ee36-135f-4a09-9935-3cb3dbeb3388
📒 Files selected for processing (6)
src/Database/Database.phptests/e2e/Adapter/Base.phptests/e2e/Adapter/RedisTest.phptests/e2e/Adapter/Scopes/DocumentTests.phptests/unit/CacheKeyTest.phptests/unit/FindCacheTest.php
| $collectionDocument = $this->silent(fn () => $this->getCollection($collection)); | ||
|
|
||
| if ($collectionDocument->isEmpty()) { | ||
| throw new NotFoundException('Collection not found'); | ||
| } | ||
|
|
||
| [$findIndexKey, $findKey] = $this->getCachedFindKeys($collectionDocument->getId(), $queries, $key, $collectionDocument); | ||
| $cached = $this->loadCachedFind($findKey, $ttl); |
There was a problem hiding this comment.
Avoid caching relationship-populated results.
find() can return nested relationship Document instances, but cached hits only rebuild the top-level document. For collections with relationships, cache hits can return arrays/scalars where find() returns Document objects; either bypass caching there or recursively hydrate relationship values.
Proposed safe bypass for relationship collections
if ($collectionDocument->isEmpty()) {
throw new NotFoundException('Collection not found');
}
+ $relationships = \array_filter(
+ $collectionDocument->getAttribute('attributes', []),
+ fn (Document $attribute) => $attribute->getAttribute('type') === self::VAR_RELATIONSHIP
+ );
+
+ if ($this->resolveRelationships && !empty($relationships)) {
+ return $this->find($collectionDocument->getId(), $queries, $forPermission);
+ }
+
[$findIndexKey, $findKey] = $this->getCachedFindKeys($collectionDocument->getId(), $queries, $key, $collectionDocument);📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| $collectionDocument = $this->silent(fn () => $this->getCollection($collection)); | |
| if ($collectionDocument->isEmpty()) { | |
| throw new NotFoundException('Collection not found'); | |
| } | |
| [$findIndexKey, $findKey] = $this->getCachedFindKeys($collectionDocument->getId(), $queries, $key, $collectionDocument); | |
| $cached = $this->loadCachedFind($findKey, $ttl); | |
| $collectionDocument = $this->silent(fn () => $this->getCollection($collection)); | |
| if ($collectionDocument->isEmpty()) { | |
| throw new NotFoundException('Collection not found'); | |
| } | |
| $relationships = \array_filter( | |
| $collectionDocument->getAttribute('attributes', []), | |
| fn (Document $attribute) => $attribute->getAttribute('type') === self::VAR_RELATIONSHIP | |
| ); | |
| if ($this->resolveRelationships && !empty($relationships)) { | |
| return $this->find($collectionDocument->getId(), $queries, $forPermission); | |
| } | |
| [$findIndexKey, $findKey] = $this->getCachedFindKeys($collectionDocument->getId(), $queries, $key, $collectionDocument); | |
| $cached = $this->loadCachedFind($findKey, $ttl); |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/Database/Database.php` around lines 8639 - 8646, The caching mechanism in
the find operation does not properly handle relationships within documents. When
loadCachedFind returns cached results for collections that contain
relationships, the nested relationship values are not recursively hydrated and
may return scalars or arrays instead of Document objects, causing inconsistency
with non-cached results. Modify the logic after loadCachedFind to either bypass
the cache entirely for collections containing relationships by checking if the
collectionDocument has relationships before calling getCachedFindKeys and
loadCachedFind, or alternatively ensure that any cached results are recursively
hydrated to properly reconstruct relationship Document instances before
returning them.
| $document = $this->createDocumentInstance($collection->getId(), $document); | ||
|
|
||
| if ($this->isTtlExpired($collection, $document)) { | ||
| $hasExpiredDocuments = true; | ||
| continue; | ||
| } | ||
|
|
||
| $results[] = $document; |
There was a problem hiding this comment.
Re-cast cached documents before returning them.
Cache backends can round-trip schema values differently from the adapter path, e.g. 1.0 as 1. Match the single-document cache path by running casting() after rehydrating each cached find result.
Proposed cached-hit casting fix
$document = $this->createDocumentInstance($collection->getId(), $document);
+ $document = $this->casting($collection, $document);
if ($this->isTtlExpired($collection, $document)) {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| $document = $this->createDocumentInstance($collection->getId(), $document); | |
| if ($this->isTtlExpired($collection, $document)) { | |
| $hasExpiredDocuments = true; | |
| continue; | |
| } | |
| $results[] = $document; | |
| $document = $this->createDocumentInstance($collection->getId(), $document); | |
| $document = $this->casting($collection, $document); | |
| if ($this->isTtlExpired($collection, $document)) { | |
| $hasExpiredDocuments = true; | |
| continue; | |
| } | |
| $results[] = $document; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/Database/Database.php` around lines 8749 - 8756, The cached documents
need to be re-cast to ensure schema values are handled consistently since cache
backends may round-trip values differently than the adapter path. After the
document is created with createDocumentInstance and passes the TTL expiration
check in isTtlExpired, call the casting() method on the document before
appending it to the $results array to match the single-document cache path
behavior.
| $payload = [ | ||
| 'version' => 1, | ||
| 'database' => $this->getDatabase(), | ||
| 'schema' => $this->getCachedFindSchemaHash($collection), | ||
| 'key' => $key, | ||
| 'queries' => $key === null ? \array_map( | ||
| fn (Query $query): array => $this->serializeCachedFindQuery($query), | ||
| $queries | ||
| ) : null, | ||
| 'relationships' => $this->resolveRelationships, | ||
| 'filters' => $this->getActiveFilterSignatures(), | ||
| ]; | ||
|
|
||
| if ($key !== null) { | ||
| return [$findIndexKey, $collectionKey . ':' . $key]; | ||
| } | ||
|
|
||
| return [$findIndexKey, $findIndexKey . ':' . \md5(\json_encode($payload) ?: '')]; |
There was a problem hiding this comment.
Keep caller-owned keys scoped by the full cache payload.
Line 9684 drops the database, schema, relationship, and active-filter dimensions whenever $key is provided, so the same caller key can serve the wrong cached result across scopes. Include the caller key in the readable segment, but still append a digest of the full payload; also replace the flagged MD5 digests with SHA-256.
Proposed scoped key digest fix
- if ($key !== null) {
- return [$findIndexKey, $collectionKey . ':' . $key];
- }
-
- return [$findIndexKey, $findIndexKey . ':' . \md5(\json_encode($payload) ?: '')];
+ $payloadHash = \hash('sha256', \json_encode($payload, \JSON_THROW_ON_ERROR));
+ $keySegment = $key === null ? 'query' : 'key:' . \rawurlencode($key);
+
+ return [$findIndexKey, "{$findIndexKey}:{$keySegment}:{$payloadHash}"];- return \md5(
- \json_encode($collection->getAttribute('attributes', []))
- . \json_encode($collection->getAttribute('indexes', []))
- );
+ return \hash('sha256', \json_encode([
+ 'attributes' => $collection->getAttribute('attributes', []),
+ 'indexes' => $collection->getAttribute('indexes', []),
+ ], \JSON_THROW_ON_ERROR));Also applies to: 9741-9744
🧰 Tools
🪛 ast-grep (0.43.0)
[error] 9686-9686: Do not use a weak hash algorithm
Context: \md5(\json_encode($payload) ?: '')
Note: [CWE-328] Use of Weak Hash.
(weak-hash-algorithm)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@src/Database/Database.php` around lines 9670 - 9687, The cache key generation
at line 9684-9685 when $key is not null drops important cache dimensions by
returning only the collectionKey with the caller key appended, ignoring the
database, schema, relationships, and filters in the payload. This causes the
same caller key to potentially return wrong cached results across different
scopes. Fix this by computing a SHA-256 digest of the full $payload (similar to
line 9687 for the null key case) and appending it to the collectionKey along
with the caller key, ensuring the cache key is properly scoped by all relevant
dimensions. Also replace the MD5 digest on line 9687 with SHA-256 for
consistency.
Source: Linters/SAST tools
| $collection = 'findCached'; | ||
|
|
||
| $database->createCollection($collection); | ||
| $database->createAttribute($collection, 'name', Database::VAR_STRING, 255, true); |
There was a problem hiding this comment.
Use an isolated collection name (and/or cleanup) to avoid flaky reruns.
Line 114 uses a fixed collection id ('findCached') and this test never deletes it. In shared/reused e2e environments, reruns can fail on collection/document conflicts or stale leftovers.
Suggested fix
- $collection = 'findCached';
+ $collection = __FUNCTION__;
- $database->createCollection($collection);
- $database->createAttribute($collection, 'name', Database::VAR_STRING, 255, true);
+ $database->createCollection($collection);
+ $database->createAttribute($collection, 'name', Database::VAR_STRING, 255, true);
+
+ try {
+ // ... existing test body ...
+ } finally {
+ $database->deleteCollection($collection);
+ }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/e2e/Adapter/Scopes/DocumentTests.php` around lines 114 - 117, The test
in DocumentTests uses a hardcoded collection name 'findCached' that is not
cleaned up after the test runs, which can cause flaky behavior in shared test
environments where reruns may encounter stale collections. Replace the hardcoded
collection name with a dynamically generated unique identifier (such as a UUID
or timestamp-based name) assigned to the $collection variable, and add teardown
or cleanup logic at the end of the test to call $database->deleteCollection() to
remove the test collection after assertions are complete, ensuring the test does
not leave behind artifacts that could interfere with subsequent runs.
| public function getName(?string $key = null): string | ||
| { | ||
| return 'hash-memory'; | ||
| } |
There was a problem hiding this comment.
Remove unused parameter.
The $key parameter is never used in the method body.
🔧 Proposed fix
- public function getName(?string $key = null): string
+ public function getName(): string
{
return 'hash-memory';
}📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| public function getName(?string $key = null): string | |
| { | |
| return 'hash-memory'; | |
| } | |
| public function getName(): string | |
| { | |
| return 'hash-memory'; | |
| } |
🧰 Tools
🪛 PHPMD (2.15.0)
[warning] 410-410: Avoid unused parameters such as '$key'. (undefined)
(UnusedFormalParameter)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@tests/unit/FindCacheTest.php` around lines 410 - 413, The getName method in
the FindCacheTest.php file has an unused parameter $key that should be removed.
Update the method signature of getName to remove the ?string $key = null
parameter while keeping the return type as string and the return statement
unchanged.
Source: Linters/SAST tools
Summary
Tests
Notes
Summary by CodeRabbit
Release Notes