Optimize FindQuery.update() to use key-only search and partial HSET #777

@abrookins

Summary

Optimize FindQuery.update() to avoid loading full documents. Instead, fetch only keys and apply partial field updates directly.

Problem

Current implementation:

async def update(self, use_transaction=True, **field_values):
    for model in await self.all():       # Loads ALL documents into Python
        for field, value in field_values.items():
            setattr(model, field, value)
        await model.save(pipeline=pipeline)  # Full document HSET

Issues:

  1. self.all() loads full documents when we only need keys
  2. save() writes all fields when we only changed specific ones
  3. Pydantic validation runs N times (once per model) with identical values

Proposed Implementation

async def update(self, use_transaction=True, **field_values) -> int:
    # 1. Validate field values once upfront
    validate_model_fields(self.model, field_values)
    serialized = self._serialize_field_values(field_values)
    
    # 2. Get matching keys only (no document content)
    keys = await self._search_keys_only()
    
    # 3. Pipeline partial updates (preserve existing use_transaction behavior)
    pipeline = await self.model.db().pipeline(transaction=use_transaction)
    for key in keys:
        if self.model._is_json_model():
            for field, value in serialized.items():
                pipeline.json().set(key, f"$.{field}", value)
        else:
            pipeline.hset(key, mapping=serialized)
    
    await pipeline.execute()
    return len(keys)

Key Changes

1. Key-only search

async def _search_keys_only(self) -> List[str]:
    # Use NOCONTENT to get keys without document data
    results = await self._execute_search(nocontent=True)
    return [doc.id for doc in results.docs]
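
One detail the helper above would need to handle: FT.SEARCH returns only the first 10 matches unless LIMIT is given, so a key-only search has to page through the result set. A rough sketch of that loop follows. It assumes `send` is some async callable that issues one raw command and returns the RESP2 reply (total count followed by the matching keys); the function name, `send`, and `page_size` are illustrative, not part of the existing code.

```python
import asyncio
from typing import Awaitable, Callable, List

# Hypothetical: any async callable that sends one raw Redis command and
# returns its RESP2 reply, e.g. a client's execute_command method.
SendCommand = Callable[..., Awaitable[list]]


async def search_keys_only(
    send: SendCommand, index: str, query: str, page_size: int = 1000
) -> List[str]:
    """Collect every matching key using FT.SEARCH ... NOCONTENT, page by page."""
    keys: List[str] = []
    offset = 0
    while True:
        # With NOCONTENT the RESP2 reply is [total, key1, key2, ...].
        reply = await send(
            "FT.SEARCH", index, query, "NOCONTENT", "LIMIT", offset, page_size
        )
        total, page = reply[0], reply[1:]
        # Keys arrive as bytes unless the client decodes responses.
        keys.extend(k.decode() if isinstance(k, bytes) else k for k in page)
        offset += len(page)
        # Stop on an empty page or once every reported match has been read.
        if not page or offset >= total:
            return keys
```

All keys are collected before any write is queued, so updating a field that appears in the query filter does not shift the pages mid-scan.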

2. Validate once, not N times

def _serialize_field_values(self, field_values: Dict[str, Any]) -> Dict[str, Any]:
    # Validate and serialize each field value once
    # Uses Pydantic field validation
    ...

3. Partial HSET

# Only the fields being updated are written
pipeline.hset("user:123", mapping={"status": "active"})  # Only status
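
To make the write pattern concrete, here is a small sketch of the raw commands a partial update would queue per key: one HSET carrying only the changed fields for hash models, and one JSON.SET per changed field for JSON models. `partial_update_commands` is a hypothetical helper for illustration only. It assumes `serialized` already holds validated values, and that JSON values are encoded with `json.dumps`.

```python
import json
from typing import Any, Dict, List, Tuple


def partial_update_commands(
    keys: List[str], serialized: Dict[str, Any], is_json: bool
) -> List[Tuple[Any, ...]]:
    """Build the raw Redis commands for a partial update of each key."""
    # HSET with no field/value pairs is an error, so queue nothing.
    if not serialized:
        return []
    commands: List[Tuple[Any, ...]] = []
    for key in keys:
        if is_json:
            # JSON models: one JSON.SET per field, value JSON-encoded.
            for field, value in serialized.items():
                commands.append(("JSON.SET", key, f"$.{field}", json.dumps(value)))
        else:
            # Hash models: a single HSET with only the changed fields.
            flat: List[Any] = []
            for field, value in serialized.items():
                flat.extend((field, value))
            commands.append(("HSET", key, *flat))
    return commands
```

Fields that are not named in `serialized` never appear in a command, so they are left untouched on the server.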

API

Preserves existing signature:

async def update(self, use_transaction=True, **field_values) -> int:

The only change is the return value: update() now returns the number of records updated.

Performance Comparison

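| Metric                   | Current      | Proposed       |
|--------------------------|--------------|----------------|
| Documents loaded         | N            | 0              |
| Pydantic validations     | N            | 1              |
| Fields written per doc   | All          | Only updated   |
| Data transferred (read)  | N × doc_size | N × key_size   |
| Data transferred (write) | N × doc_size | N × field_size |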
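
The two data-transfer rows can be turned into a quick back-of-the-envelope estimate. The sketch below only encodes the formulas from the comparison above. The sizes used in the example are made-up inputs for illustration, not measurements, and the function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Transfer:
    read_bytes: int
    write_bytes: int


def current_transfer(n: int, doc_size: int) -> Transfer:
    # Current: every document is read in full and written back in full.
    return Transfer(read_bytes=n * doc_size, write_bytes=n * doc_size)


def proposed_transfer(n: int, key_size: int, field_size: int) -> Transfer:
    # Proposed: only keys are read, only the changed fields are written.
    return Transfer(read_bytes=n * key_size, write_bytes=n * field_size)


# Illustrative inputs only: 10,000 matches, 2 KiB documents,
# 24-byte keys, 16 bytes of updated field data per document.
before = current_transfer(10_000, 2048)
after = proposed_transfer(10_000, 24, 16)
```

These figures ignore protocol overhead and the cost of the search itself, so treat them as a rough order-of-magnitude guide.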
