Merged
59 changes: 58 additions & 1 deletion .github/workflows/release.yml
@@ -188,6 +188,55 @@ jobs:
path: hindsight-integrations/openclaw/*.tgz
retention-days: 1

release-ai-sdk-integration:
runs-on: ubuntu-latest
environment: npm

steps:
- uses: actions/checkout@v4

- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '22'
registry-url: 'https://registry.npmjs.org'

- name: Install dependencies
working-directory: ./hindsight-integrations/ai-sdk
run: npm ci

- name: Build
working-directory: ./hindsight-integrations/ai-sdk
run: npm run build

- name: Publish to npm
working-directory: ./hindsight-integrations/ai-sdk
run: |
set +e
OUTPUT=$(npm publish --access public 2>&1)
EXIT_CODE=$?
echo "$OUTPUT"
if [ $EXIT_CODE -ne 0 ]; then
if echo "$OUTPUT" | grep -q "cannot publish over"; then
echo "Package version already published, skipping..."
exit 0
fi
exit $EXIT_CODE
fi
env:
NODE_AUTH_TOKEN: ${{ secrets.NPM_TOKEN }}

- name: Pack for GitHub release
working-directory: ./hindsight-integrations/ai-sdk
run: npm pack

- name: Upload artifacts
uses: actions/upload-artifact@v4
with:
name: ai-sdk-integration
path: hindsight-integrations/ai-sdk/*.tgz
retention-days: 1

release-control-plane:
runs-on: ubuntu-latest
environment: npm
@@ -415,7 +464,7 @@ jobs:

create-github-release:
runs-on: ubuntu-latest
needs: [release-python-packages, release-typescript-client, release-openclaw-integration, release-control-plane, release-rust-cli, release-docker-images, release-helm-chart]
needs: [release-python-packages, release-typescript-client, release-openclaw-integration, release-ai-sdk-integration, release-control-plane, release-rust-cli, release-docker-images, release-helm-chart]
permissions:
contents: write

@@ -444,6 +493,12 @@ jobs:
name: openclaw-integration
path: ./artifacts/openclaw-integration

- name: Download AI SDK Integration
uses: actions/download-artifact@v4
with:
name: ai-sdk-integration
path: ./artifacts/ai-sdk-integration

- name: Download Control Plane
uses: actions/download-artifact@v4
with:
@@ -487,6 +542,8 @@ jobs:
cp artifacts/typescript-client/*.tgz release-assets/ || true
# OpenClaw Integration
cp artifacts/openclaw-integration/*.tgz release-assets/ || true
# AI SDK Integration
cp artifacts/ai-sdk-integration/*.tgz release-assets/ || true
# Control Plane
cp artifacts/control-plane/*.tgz release-assets/ || true
# Rust CLI binaries
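The new publish step tolerates re-runs by treating npm's "cannot publish over" error as success and propagating every other failure. A runnable sketch of the same pattern in Python — `tolerant_publish` and the `echo` stub standing in for the workflow's `npm publish` are illustrative, not part of this PR:

```python
# Sketch of the release step's "skip if already published" pattern.
# The stub command below mimics npm's failure output; it is NOT npm.
import subprocess

def tolerant_publish(cmd: str) -> int:
    """Run cmd; return 0 on success, or when the only failure is
    npm's 'cannot publish over' (version already released)."""
    proc = subprocess.run(cmd, shell=True, capture_output=True, text=True)
    output = proc.stdout + proc.stderr
    if proc.returncode != 0 and "cannot publish over" in output:
        print("Package version already published, skipping...")
        return 0  # swallow only the "already published" error
    return proc.returncode

# Stub mimicking npm's "already published" failure (exit code 1):
code = tolerant_publish('echo "npm ERR! cannot publish over 1.2.3"; exit 1')
assert code == 0  # treated as success, so the release job continues
```

This keeps the release workflow idempotent: re-running the job after a partial failure does not abort on packages that already shipped.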
23 changes: 23 additions & 0 deletions .github/workflows/test.yml
@@ -74,6 +74,29 @@ jobs:
working-directory: ./hindsight-integrations/openclaw
run: npm run build

build-ai-sdk-integration:
runs-on: ubuntu-latest

steps:
- uses: actions/checkout@v4

- name: Set up Node.js
uses: actions/setup-node@v4
with:
node-version: '22'

- name: Install dependencies
working-directory: ./hindsight-integrations/ai-sdk
run: npm ci

- name: Run tests
working-directory: ./hindsight-integrations/ai-sdk
run: npm test

- name: Build
working-directory: ./hindsight-integrations/ai-sdk
run: npm run build

build-control-plane:
runs-on: ubuntu-latest

6 changes: 4 additions & 2 deletions hindsight-api/hindsight_api/api/http.py
@@ -866,6 +866,7 @@ class ListDocumentsResponse(BaseModel):
"updated_at": "2024-01-15T10:30:00Z",
"text_length": 5420,
"memory_unit_count": 15,
"tags": ["user_a", "session_123"],
}
],
"total": 50,
@@ -1160,7 +1161,8 @@ class CreateMentalModelRequest(BaseModel):
class CreateMentalModelResponse(BaseModel):
"""Response model for mental model creation."""

operation_id: str = Field(description="Operation ID to track progress")
mental_model_id: str = Field(description="ID of the created mental model")
operation_id: str = Field(description="Operation ID to track refresh progress")


class UpdateMentalModelRequest(BaseModel):
@@ -2423,7 +2425,7 @@ async def api_create_mental_model(
mental_model_id=mental_model["id"],
request_context=request_context,
)
return CreateMentalModelResponse(operation_id=result["operation_id"])
return CreateMentalModelResponse(mental_model_id=mental_model["id"], operation_id=result["operation_id"])
except ValueError as e:
raise HTTPException(status_code=400, detail=str(e))
except (AuthenticationError, HTTPException):
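With this change, create calls return the new mental model's id immediately, alongside the operation id for polling refresh progress. A minimal stand-in (a plain dataclass rather than the actual Pydantic model, with ids invented for illustration):

```python
# Dataclass stand-in for the reshaped CreateMentalModelResponse: callers
# get the created model's id right away, plus an operation id to poll the
# refresh. The id values here are made up for illustration.
from dataclasses import dataclass, asdict

@dataclass
class CreateMentalModelResponse:
    mental_model_id: str  # ID of the created mental model
    operation_id: str     # Operation ID to track refresh progress

resp = CreateMentalModelResponse(mental_model_id="mm_123", operation_id="op_456")
assert asdict(resp) == {"mental_model_id": "mm_123", "operation_id": "op_456"}
```

Previously clients had to wait for the operation to resolve before learning the model's id; now they can store it synchronously.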
85 changes: 70 additions & 15 deletions hindsight-api/hindsight_api/engine/consolidation/consolidator.py
@@ -143,6 +143,9 @@ async def run_consolidation_job(
"skipped": 0,
}

# Track all unique tags from consolidated memories for mental model refresh filtering
consolidated_tags: set[str] = set()

batch_num = 0
last_progress_timings = {} # Track timings at last progress log
while True:
@@ -176,6 +179,11 @@
for memory in memories:
mem_start = time.time()

# Track tags from this memory for mental model refresh filtering
memory_tags = memory.get("tags") or []
if memory_tags:
consolidated_tags.update(memory_tags)

# Process the memory (uses its own connection internally)
async with pool.acquire() as conn:
result = await _process_memory(
@@ -284,10 +292,12 @@
perf.log(f"[4] Timing breakdown: {', '.join(timing_parts)}")

# Trigger mental model refreshes for models with refresh_after_consolidation=true
# SECURITY: Only refresh mental models with matching tags (or all if no tags were consolidated)
mental_models_refreshed = await _trigger_mental_model_refreshes(
memory_engine=memory_engine,
bank_id=bank_id,
request_context=request_context,
consolidated_tags=list(consolidated_tags) if consolidated_tags else None,
perf=perf,
)
stats["mental_models_refreshed"] = mental_models_refreshed
@@ -301,15 +311,20 @@ async def _trigger_mental_model_refreshes(
memory_engine: "MemoryEngine",
bank_id: str,
request_context: "RequestContext",
consolidated_tags: list[str] | None = None,
perf: ConsolidationPerfLog | None = None,
) -> int:
"""
Trigger refreshes for mental models with refresh_after_consolidation=true.

SECURITY: Only triggers refresh for mental models whose tags overlap with the
consolidated memory tags, preventing unnecessary refreshes across security boundaries.

Args:
memory_engine: MemoryEngine instance
bank_id: Bank identifier
request_context: Request context for authentication
consolidated_tags: Tags from memories that were consolidated (None = refresh all)
perf: Performance logging

Returns:
@@ -318,22 +333,52 @@ async def _trigger_mental_model_refreshes(
pool = memory_engine._pool

# Find mental models with refresh_after_consolidation=true
# SECURITY: Control which mental models get refreshed based on tags
async with pool.acquire() as conn:
rows = await conn.fetch(
f"""
SELECT id, name
FROM {fq_table("mental_models")}
WHERE bank_id = $1
AND (trigger->>'refresh_after_consolidation')::boolean = true
""",
bank_id,
)
if consolidated_tags:
# Tagged memories were consolidated - refresh:
# 1. Mental models with overlapping tags (security boundary)
# 2. Untagged mental models (they're "global" and available to all contexts)
# DO NOT refresh mental models with different tags
rows = await conn.fetch(
f"""
SELECT id, name, tags
FROM {fq_table("mental_models")}
WHERE bank_id = $1
AND (trigger->>'refresh_after_consolidation')::boolean = true
AND (
(tags IS NOT NULL AND tags != '{{}}' AND tags && $2::varchar[])
OR (tags IS NULL OR tags = '{{}}')
)
""",
bank_id,
consolidated_tags,
)
else:
# Untagged memories were consolidated - only refresh untagged mental models
# SECURITY: Tagged mental models are NOT refreshed when untagged memories are consolidated
rows = await conn.fetch(
f"""
SELECT id, name, tags
FROM {fq_table("mental_models")}
WHERE bank_id = $1
AND (trigger->>'refresh_after_consolidation')::boolean = true
AND (tags IS NULL OR tags = '{{}}')
""",
bank_id,
)

if not rows:
return 0

if perf:
perf.log(f"[5] Triggering refresh for {len(rows)} mental models with refresh_after_consolidation=true")
if consolidated_tags:
perf.log(
f"[5] Triggering refresh for {len(rows)} mental models with refresh_after_consolidation=true "
f"(filtered by tags: {consolidated_tags})"
)
else:
perf.log(f"[5] Triggering refresh for {len(rows)} mental models with refresh_after_consolidation=true")

# Submit refresh tasks for each mental model
refreshed_count = 0
@@ -385,14 +430,16 @@ async def _process_memory(
memory_id = memory["id"]
fact_tags = memory.get("tags") or []

# Find related observations using the full recall system (NO tag filtering)
# Find related observations using the full recall system
# SECURITY: Pass tags to ensure observations don't leak across security boundaries
t0 = time.time()
related_observations = await _find_related_observations(
conn=conn,
memory_engine=memory_engine,
bank_id=bank_id,
query=fact_text,
request_context=request_context,
tags=fact_tags, # Pass source memory's tags for security
)
if perf:
perf.record_timing("recall", time.time() - t0)
@@ -666,17 +713,20 @@ async def _find_related_observations(
bank_id: str,
query: str,
request_context: "RequestContext",
tags: list[str] | None = None,
) -> list[dict[str, Any]]:
"""
Find observations related to the given query using optimized recall.

IMPORTANT: We do NOT filter by tags here. Consolidation needs to see ALL
potentially related observations regardless of scope, so the LLM can
decide on tag routing (same scope update vs cross-scope create).
SECURITY: Filters by tags using all_strict matching to prevent cross-tenant/cross-user
information leakage. Observations are only consolidated within the same tag scope.

Uses max_tokens to naturally limit observations (no artificial count limit).
Includes source memories with dates for LLM context.

Args:
tags: Optional tags to filter observations (uses all_strict matching for security)

Returns:
List of related observations with their tags, source memories, and dates
"""
@@ -685,14 +735,19 @@ async def _find_related_observations(
from ...config import get_config

config = get_config()

# SECURITY: Use all_strict matching if tags provided to prevent cross-scope consolidation
tags_match = "all_strict" if tags else "any"

recall_result = await memory_engine.recall_async(
bank_id=bank_id,
query=query,
max_tokens=config.consolidation_max_tokens, # Token budget for observations (configurable)
fact_type=["observation"], # Only retrieve observations
request_context=request_context,
tags=tags, # Filter by source memory's tags
tags_match=tags_match, # Use strict matching for security
_quiet=True, # Suppress logging
# NO tags parameter - intentionally get ALL observations
)

# If no observations returned, return empty list
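The two SQL branches implement one filtering rule, with `tags && $2::varchar[]` doing the array-overlap test. A pure-Python sketch of the same decision — `should_refresh` is a hypothetical helper, not a function in this PR:

```python
# Mirrors the refresh-filtering rule of the SQL above: when tagged memories
# were consolidated, refresh models whose tags overlap the consolidated tags
# plus untagged ("global") models; when only untagged memories were
# consolidated, refresh untagged models only.
def should_refresh(model_tags, consolidated_tags):
    model_tags = set(model_tags or [])
    if consolidated_tags:
        # Untagged models are global; tagged models need an overlap.
        return not model_tags or bool(model_tags & set(consolidated_tags))
    # Untagged consolidation run: tagged models stay untouched.
    return not model_tags

assert should_refresh(["user_a"], ["user_a", "session_1"])  # overlapping tags
assert should_refresh([], ["user_a"])                       # untagged = global
assert not should_refresh(["user_b"], ["user_a"])           # different scope
assert not should_refresh(["user_a"], None)                 # untagged run
```

The last two assertions are the security property the PR comments call out: a consolidation run in one tag scope never triggers refreshes of mental models scoped to another.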
9 changes: 6 additions & 3 deletions hindsight-api/hindsight_api/engine/consolidation/prompts.py
@@ -32,13 +32,16 @@

## MERGE RULES (when comparing to existing observations):
1. REDUNDANT: Same information worded differently → update existing
2. CONTRADICTION: Opposite information about same topic → update with history (e.g., "used to X, now Y")
3. UPDATE: New state replacing old state → update with history
2. CONTRADICTION: Opposite information about same topic → update with temporal markers showing change
Example: "Alex used to love pizza but now hates it" OR "Alex's pizza preference changed from love to hate"
3. UPDATE: New state replacing old state → update showing the transition with "used to", "now", "changed from X to Y"

## CRITICAL RULES:
- NEVER merge facts about DIFFERENT people
- NEVER merge unrelated topics (food preferences vs work vs hobbies)
- When merging contradictions, capture the CHANGE (before → after)
- When merging contradictions, the "text" field MUST capture BOTH states with temporal markers:
* Use "used to X, now Y" OR "changed from X to Y" OR "X but now Y"
* DO NOT just state the new fact - you MUST show the change
- Keep observations focused on ONE specific topic per person
- The "text" field MUST contain durable knowledge, not ephemeral state
- Do NOT include "tags" in output - tags are handled automatically"""
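A small, hypothetical illustration of the tightened contradiction-merge rule (the observation dicts are invented; the real merge is performed by the LLM following the prompt above):

```python
# The merged "text" must encode BOTH states with temporal markers,
# not just restate the new fact. Example data is made up.
existing = {"text": "Alex loves pizza"}
incoming = {"text": "Alex hates pizza"}

bad_merge = {"text": "Alex hates pizza"}  # drops the history -> not allowed
good_merge = {"text": "Alex used to love pizza but now hates it"}

assert "used to" in good_merge["text"] and "now" in good_merge["text"]
assert "used to" not in bad_merge["text"]  # fails the rule's requirement
```

Making the before/after requirement explicit (with concrete phrasings like "used to X, now Y") is what the prompt change above enforces.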