Releases: dev07060/mobile_rag_engine

v0.16.0

@dev07060 dev07060 released this 26 Mar 15:05
3cbc7e2

Highlights

  • sharp-edge cleanup for context packing, heading heuristics, markdown table contract alignment, and parser regression coverage
  • exact-budget and contextual rendering work shipped in the 0.16.0 release train
  • README wording clarified to describe the current architecture as a copy-minimized Rust core rather than end-to-end zero-copy FFI

Packages

  • rag_engine_flutter 0.16.0 published on pub.dev
  • mobile_rag_engine 0.16.0 published on pub.dev

Notes

  • this release tag reflects the repository metadata and docs synced to the already-published 0.16.0 packages

Release Notes v0.14.0

@dev07060 dev07060 released this 23 Feb 17:55

Key Features

1. Vector Math Refactor

  • Replaced ndarray with a zero-allocation vector_math module optimized for mobile cosine similarity, dot product, and L2 norm.
  • Added optional SIMD-accelerated backend via faer under the vector_faer feature flag.
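As a rough illustration of what allocation-free similarity kernels over slices can look like, here is a minimal sketch. It is not the crate's actual vector_math code; the function names and the zero-norm convention are assumptions made for the example.

```rust
/// Dot product of two equal-length slices. Works on borrowed data only,
/// so no heap allocation takes place.
pub fn dot(a: &[f32], b: &[f32]) -> f32 {
    debug_assert_eq!(a.len(), b.len());
    a.iter().zip(b).map(|(x, y)| x * y).sum()
}

/// Euclidean (L2) norm of a vector.
pub fn l2_norm(a: &[f32]) -> f32 {
    dot(a, a).sqrt()
}

/// Cosine similarity. Returns 0.0 when either vector has zero norm
/// (an assumed convention, chosen to avoid dividing by zero).
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    let denom = l2_norm(a) * l2_norm(b);
    if denom == 0.0 {
        0.0
    } else {
        dot(a, b) / denom
    }
}
```

Operating on `&[f32]` keeps the caller in charge of storage, which is the usual way to avoid per-call allocations in a hot search loop.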

2. i8 Scalar Quantization (Feature-Gated)

  • Added vector_quant_i8 feature support for scalar i8 quantization in search paths.
  • Added schema migration for embedding_i8 and embedding_scale columns in docs and chunks.
  • Added quantized cosine similarity for linear scan paths with automatic f32 fallback.
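The quantization scheme itself is not spelled out in these notes. The sketch below assumes symmetric per-vector scalar quantization (scale = max |x| / 127) and shows one way a quantized cosine with an f32 fallback could be arranged. All type and function names are hypothetical; only the idea of storing i8 values next to a scale mirrors the embedding_i8 and embedding_scale columns.

```rust
/// An i8-quantized embedding plus the scale needed to recover f32 values.
pub struct QuantizedVec {
    pub values: Vec<i8>,
    pub scale: f32,
}

/// Symmetric scalar quantization. Returns None for all-zero or
/// non-finite input so the caller can stay on the f32 path.
pub fn quantize_i8(v: &[f32]) -> Option<QuantizedVec> {
    let max_abs = v.iter().fold(0.0f32, |m, x| m.max(x.abs()));
    if max_abs == 0.0 || !max_abs.is_finite() {
        return None;
    }
    let scale = max_abs / 127.0;
    let values = v
        .iter()
        .map(|x| (x / scale).round().clamp(-127.0, 127.0) as i8)
        .collect();
    Some(QuantizedVec { values, scale })
}

/// Cosine similarity on quantized values. The per-vector scales cancel
/// in the ratio, so integer accumulators are sufficient.
pub fn cosine_i8(a: &QuantizedVec, b: &QuantizedVec) -> f32 {
    let (mut dot, mut na, mut nb) = (0i64, 0i64, 0i64);
    for (&x, &y) in a.values.iter().zip(&b.values) {
        let (x, y) = (x as i64, y as i64);
        dot += x * y;
        na += x * x;
        nb += y * y;
    }
    if na == 0 || nb == 0 {
        return 0.0;
    }
    dot as f32 / ((na as f32).sqrt() * (nb as f32).sqrt())
}

/// Plain f32 cosine similarity used as the fallback path.
pub fn cosine_f32(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum();
    let nb: f32 = b.iter().map(|x| x * x).sum();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na.sqrt() * nb.sqrt())
    }
}

/// Uses the quantized path only when both sides have quantized data.
pub fn cosine_with_fallback(
    a: &[f32],
    qa: Option<&QuantizedVec>,
    b: &[f32],
    qb: Option<&QuantizedVec>,
) -> f32 {
    match (qa, qb) {
        (Some(qa), Some(qb)) => cosine_i8(qa, qb),
        _ => cosine_f32(a, b),
    }
}
```

In this sketch the scale is only needed to reconstruct approximate f32 values; the cosine itself never touches it.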

3. Benchmark Service Expansion

  • Added DetailedBenchmarkStats including warmup stats, p50/p95, standard deviation, and raw sample capture.
  • Added benchmarkDetailed(), collectSamples(), summarizeSamples(), and aggregateRoundStats() APIs.
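To make the reported statistics concrete, the following sketch summarizes a set of raw latency samples into mean, standard deviation, p50, and p95. It is an illustration only: the struct and function names are hypothetical, and the nearest-rank percentile and population standard deviation are assumptions, since the notes do not say which definitions the benchmark service uses.

```rust
/// Summary of raw latency samples (units are whatever the caller recorded).
#[derive(Debug, Clone, PartialEq)]
pub struct SampleSummary {
    pub count: usize,
    pub mean: f64,
    pub std_dev: f64,
    pub p50: f64,
    pub p95: f64,
}

/// Nearest-rank percentile over an ascending, non-empty slice.
fn percentile(sorted: &[f64], p: f64) -> f64 {
    let rank = ((p / 100.0) * sorted.len() as f64).ceil() as usize;
    sorted[rank.clamp(1, sorted.len()) - 1]
}

/// Returns None when there are no samples to summarize.
pub fn summarize(samples: &[f64]) -> Option<SampleSummary> {
    if samples.is_empty() {
        return None;
    }
    let mut sorted = samples.to_vec();
    sorted.sort_by(|a, b| a.total_cmp(b));
    let n = sorted.len() as f64;
    let mean = sorted.iter().sum::<f64>() / n;
    // Population variance: divide by n rather than n - 1.
    let variance = sorted.iter().map(|x| (x - mean).powi(2)).sum::<f64>() / n;
    Some(SampleSummary {
        count: sorted.len(),
        mean,
        std_dev: variance.sqrt(),
        p50: percentile(&sorted, 50.0),
        p95: percentile(&sorted, 95.0),
    })
}
```

Warmup runs would typically be summarized separately so they do not skew the measured percentiles.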

4. Benchmark FFI Coverage

  • Added benchmarkSearchLinearScan() for deterministic linear-scan benchmarking without HNSW.
  • Added benchmarkSearchChunksLinearInCollection() for collection-scoped linear-scan benchmark scenarios.

API & Internal Changes

Quality & Maintenance

  • Applied rustfmt formatting across Rust sources.
  • Sorted module declarations alphabetically in mod.rs.
  • Updated package dependencies for the 0.14.0 release line.

Release Notes v0.13.0

@dev07060 dev07060 released this 19 Feb 09:21
2b5adb3

Key Features

1. Multi-Collection Core

First-class collection-scoped architecture was introduced:

  • Collection-Aware Schema: Added collection_id support on sources/chunks and collection index state.
  • Scoped Data Operations: Added collection-aware APIs for list/add/delete/stats/rebuild/search flows.
  • Backward Compatibility: Preserved legacy behavior via default collection mapping.

2. Hybrid Search Isolation by Collection

Search execution now aligns with active collection boundaries:

  • Filter Extension: Extended SearchFilter with collection_id.
  • Exact-Scan Rule Update: Updated exact-scan switching for source/metadata filters while preserving collection post-filter behavior.
  • Index Activation Hook: Added collection-scoped activation hook to align BM25/HNSW in-memory indexes before hybrid search.

3. Reliability & Recovery

Operational stability was reinforced for multi-collection environments:

  • Index State Handling: Improved collection activation and index-state transitions for load/rebuild flows.
  • Test Expansion: Added/expanded tests for collection isolation, scoped dedupe, and filter semantics.

API & Internal Changes

Platform Stability

  • Improved consistency of collection-scoped lifecycle transitions across initialization, rebuild, and search paths.

Release Notes v0.12.0

@dev07060 dev07060 released this 19 Feb 09:20
2b5adb3

Key Features

1. Logger Stability on Hot Restart

Significant stability updates were applied to Dart/Rust log bridge lifecycle:

  • Safer Sink Ownership: Reworked Dart log sink ownership to Arc<StreamSink<_>> for safer cross-thread access.
  • Lock Contention Reduction: Avoided holding logger locks while sending logs to Dart stream.
  • Non-Blocking Teardown: Switched log stream teardown to non-blocking cleanup to prevent restart deadlocks.
  • Stale Sink Recovery: Added stale sink recovery on stream send failures.
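The locking pattern described above can be sketched with standard-library types. This is an assumption-laden illustration, not the package's code: `std::sync::mpsc::Sender<String>` stands in for the Dart StreamSink, and `LogBridge` is a hypothetical name.

```rust
use std::sync::mpsc::Sender;
use std::sync::{Arc, Mutex};

/// Holds the current log sink. `Sender<String>` is a stand-in for the
/// Dart stream sink used by the real bridge.
pub struct LogBridge {
    sink: Mutex<Option<Arc<Sender<String>>>>,
}

impl LogBridge {
    pub fn new() -> Self {
        Self { sink: Mutex::new(None) }
    }

    /// Installs a new sink, replacing any previous one.
    pub fn set_sink(&self, tx: Sender<String>) {
        *self.sink.lock().unwrap() = Some(Arc::new(tx));
    }

    pub fn has_sink(&self) -> bool {
        self.sink.lock().unwrap().is_some()
    }

    /// Sends a message; returns true if it was delivered.
    pub fn log(&self, msg: &str) -> bool {
        // Clone the Arc while holding the lock, then release the lock
        // before sending so a slow receiver cannot block other loggers.
        let sink = {
            let guard = self.sink.lock().unwrap();
            guard.clone()
        };
        let sink = match sink {
            Some(s) => s,
            None => return false,
        };
        if sink.send(msg.to_string()).is_ok() {
            return true;
        }
        // Stale sink recovery: clear the slot, but only if it still holds
        // the sink that failed (a newer sink may have been installed).
        let mut guard = self.sink.lock().unwrap();
        if guard.as_ref().map_or(false, |cur| Arc::ptr_eq(cur, &sink)) {
            *guard = None;
        }
        false
    }
}
```

Dropping the receiver here plays the role of a hot restart: the next send fails, and the bridge discards the stale sink instead of failing on every subsequent log call.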

API & Internal Changes

Runtime Reliability

  • Improved log pipeline resilience under frequent restart/re-init cycles in development and hot-restart workflows.

Release Notes v0.11.0

@dev07060 dev07060 released this 19 Feb 09:20
2b5adb3

Key Features

1. Hybrid Search Quality Improvements

Hybrid retrieval behavior was improved for source-filtered scenarios:

  • Scoped BM25 Preservation: Improved source-filter exact-scan path to keep scoped BM25 ranking.
  • Regression Coverage: Added regression tests for source-filter + exact-keyword behavior.

2. Tokenization & Chunking Accuracy

Retrieval inputs are now handled more consistently across query lengths and languages:

  • Dynamic Truncation Policy: Added input-length based truncation tiers (256/384/512).
  • Semantic Overlap Fix: Applied overlap prefix logic in semantic_chunk_with_overlap.
  • CJK/Code Token Support: Improved BM25 tokenization to retain meaningful single-char CJK/code tokens.
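One plausible reading of the truncation tiers and the single-character token rule is sketched below. The tier selection rule (smallest tier that fits, capped at 512), the Unicode ranges treated as CJK, and both function names are assumptions. The notes do not define what counts as a code token, so that part is left out.

```rust
/// Truncation tiers named in the release notes.
const TIERS: [usize; 3] = [256, 384, 512];

/// Picks the smallest tier that fits the input, capped at the largest
/// tier. The selection rule is an assumption made for this sketch.
pub fn truncation_limit(token_count: usize) -> usize {
    TIERS
        .iter()
        .copied()
        .find(|&t| token_count <= t)
        .unwrap_or(512)
}

/// Rough CJK check covering common ideograph, kana, and Hangul blocks.
fn is_cjk(c: char) -> bool {
    matches!(
        c as u32,
        0x4E00..=0x9FFF       // CJK Unified Ideographs
            | 0x3040..=0x30FF // Hiragana and Katakana
            | 0xAC00..=0xD7AF // Hangul Syllables
    )
}

/// Keeps multi-character tokens, and single-character tokens only when
/// they are CJK, where one character can carry a full word's meaning.
pub fn keep_token(tok: &str) -> bool {
    let mut chars = tok.chars();
    match (chars.next(), chars.next()) {
        (None, _) => false,
        (Some(c), None) => is_cjk(c),
        (Some(_), Some(_)) => true,
    }
}
```

A length filter that drops every single-character token is a common BM25 noise reduction for Latin text, and a rule like `keep_token` is one way to relax it for CJK.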

API & Internal Changes

Quality & Reliability

  • Strengthened test coverage around exact keyword retrieval behavior when source filters are active.

Release Notes v0.10.0

@dev07060 dev07060 released this 12 Feb 16:59

Key Features

1. Robust Index Persistence & Initialization

Critical fixes to ensure search indices are reliably saved and loaded:

  • Immediate Persistence: Fixed a bug where the HNSW index was not correctly saved to disk upon creation. The engine now ensures the index is persisted immediately after rebuilding.
  • Path Resolution Fix: Corrected the load_hnsw_index logic in the native Rust layer to resolve the index file path accurately.
  • Crash Prevention: Added safeguards in save_hnsw_index to handle uninitialized or empty indices gracefully, preventing runtime crashes.

2. Background Processing & Responsiveness

Major performance improvements for document handling:

  • Isolate Offloading: CPU-intensive tasks including PDF text extraction, chunking, and embedding are now fully offloaded to a background Isolate.
  • Prevents UI Freeze: Users can process large documents without blocking the main UI thread.
  • Restored Progress Reporting: Implemented Isolate-based communication to provide granular progress updates (e.g., "Embedding chunks: 10/50") during background execution.

API & Internal Changes

Quality of Life

  • Log Noise Reduction: Filtered out excessive granular debug logs from the internal hnsw_rs engine (e.g., load_point_graph), resulting in a much cleaner console output.
  • Pub Score Optimization: Refactored SourceRagService to remove unnecessary imports (dart:typed_data), improving the package health score.

Release Notes v0.9.0

@dev07060 dev07060 released this 06 Feb 04:45

Key Features

1. Memory Optimized Model Loading

Significant reduction in memory footprint during initialization:

  • Direct File Loading: The ONNX model is now initialized directly from the file system path instead of loading into a Dart memory buffer first.
  • Reduces Dart heap usage by approximately 20-50MB (depending on model size) by eliminating double buffering.

2. Simplified Thread Management

New high-level API for easier CPU resource management:

  • Introduced ThreadUseLevel enum with presets: low (~20%), medium (~40%), and high (~80%).
  • Allows developers to configure ONNX Runtime performance without manually calculating thread counts based on device cores.

API Changes

New & Improved Functions

  • MobileRag.initialize(): Now accepts threadLevel parameter for streamlined configuration.
  • Logging Update: Replaced all internal print() calls with debugPrint() to ensure reliable log output on Android and prevent message dropping.

Fixed

  • Configuration Safety: Added validation to throw AssertionError if both threadLevel (high-level) and embeddingIntraOpNumThreads (low-level) are provided simultaneously, ensuring clear configuration intent.
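The preset percentages and the conflict check in this release can be illustrated with a small resolver. The actual API is Dart; this Rust sketch only mirrors the logic, and the rounding rule, the minimum of one thread, and all names are assumptions.

```rust
/// Presets mirroring the documented levels: low (~20%), medium (~40%),
/// high (~80%) of the available cores.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ThreadUseLevel {
    Low,
    Medium,
    High,
}

impl ThreadUseLevel {
    fn fraction(self) -> f64 {
        match self {
            ThreadUseLevel::Low => 0.2,
            ThreadUseLevel::Medium => 0.4,
            ThreadUseLevel::High => 0.8,
        }
    }
}

/// Resolves the intra-op thread count.
/// - Both options set: error, because the intent is ambiguous.
/// - Neither set: Ok(None), leaving the choice to the runtime.
pub fn resolve_threads(
    cores: usize,
    level: Option<ThreadUseLevel>,
    explicit: Option<usize>,
) -> Result<Option<usize>, String> {
    match (level, explicit) {
        (Some(_), Some(_)) => Err(
            "provide either a thread level or an explicit thread count, not both".to_string(),
        ),
        (None, Some(n)) => Ok(Some(n.max(1))),
        (Some(l), None) => {
            // Round to the nearest whole thread, but never below one.
            let n = (cores as f64 * l.fraction()).round() as usize;
            Ok(Some(n.max(1)))
        }
        (None, None) => Ok(None),
    }
}
```

On an 8-core device this sketch resolves the three presets to 2, 3, and 6 threads respectively.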

Release Notes v0.8.0

@dev07060 dev07060 released this 04 Feb 18:55

Key Features

1. Independent Source Search (Exact Scan)

New search strategy optimization for source-specific filtering:

  • Switches to Brute Force (Exact Scan) when sourceIds filter is active.
  • Guarantees Perfect Recall within the selected document by scanning every chunk, bypassing HNSW approximation limits.
  • Ensures local relevance within a document is not overshadowed by global relevance scores.
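A brute-force scan restricted to selected sources might look like the sketch below. It is a simplified illustration under stated assumptions: the `Chunk` layout, the function names, and the use of plain cosine similarity as the only score are hypothetical, and the real engine also combines BM25 in hybrid search.

```rust
/// Minimal chunk record for the sketch (hypothetical layout).
pub struct Chunk {
    pub source_id: i64,
    pub text: String,
    pub embedding: Vec<f32>,
}

fn cosine(a: &[f32], b: &[f32]) -> f32 {
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let na: f32 = a.iter().map(|x| x * x).sum();
    let nb: f32 = b.iter().map(|x| x * x).sum();
    if na == 0.0 || nb == 0.0 {
        0.0
    } else {
        dot / (na.sqrt() * nb.sqrt())
    }
}

/// Scores every chunk belonging to the selected sources and returns the
/// top `k`. Because no candidate is skipped, recall within the selection
/// is exact, unlike an approximate HNSW lookup.
pub fn exact_scan<'a>(
    chunks: &'a [Chunk],
    query: &[f32],
    source_ids: &[i64],
    k: usize,
) -> Vec<(&'a Chunk, f32)> {
    let mut scored: Vec<(&'a Chunk, f32)> = chunks
        .iter()
        .filter(|c| source_ids.contains(&c.source_id))
        .map(|c| (c, cosine(&c.embedding, query)))
        .collect();
    // Highest similarity first.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

The cost is linear in the number of chunks in the selected sources, which is usually acceptable because a source filter narrows the candidate set considerably.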

2. Smart PDF Dehyphenation (Korean Support)

Improved text extraction logic for non-Latin languages:

  • Smart Dehyphenation now correctly handles Korean text where words are split by line breaks.
  • Intelligently merges broken lines to improve semantic chunking quality.

API Changes

New & Improved Functions

  • tryLoadCachedIndex(): Optimizes startup by loading the existing HNSW index from disk instead of rebuilding it.
  • searchHybridWithContext(): Retrieves formatted context prompts directly for LLM applications.
  • getStats(): Now returns the exported SourceStats type for better usability.

Documentation

  • Added Advanced Features section to Quick Start guide covering startup optimization, LLM context assembly, and database stats.

Fixed

  • App Crash: Resolved No MaterialLocalizations found error when deleting sources in the example app.
  • Imports: Fixed missing type exports in mobile_rag.dart.

Release Notes v0.5.3 (Since 0.5.0)

@dev07060 dev07060 released this 23 Jan 17:04

What's Changed

This release focuses on significantly improving the Developer Experience (DX) by introducing a Singleton pattern, allowing you to initialize the engine with a single line of code.

⚠️ Breaking Changes

  • Simplified Initialization: The manual initialization process (RustLib.init, EmbeddingService.init, SourceRagService) has been deprecated in the documentation in favor of the new MobileRag singleton. While the low-level APIs are still available, the recommended and documented path has changed.
  • API Exports: Low-level Rust APIs (e.g., RustLib) are no longer exported by default to improve IDE auto-completion. Advanced users needing these must import them from src/.

🚀 Features

  • Singleton Pattern: Introduced MobileRag class for simplified, global access to the engine.
  • One-Line Init: MobileRag.initialize() handles Rust FFI, Tokenizer, ONNX Runtime, and Database setup automatically.
  • Auto-Initialization: Eliminated the need to manually call RustLib.init().

📚 Documentation

  • Quick Start Overhaul: Updated README, Quick Start Guide, and Examples to reflect the new 5-minute setup process.
  • Example App: Refactored the example application to demonstrate the best-practice Singleton usage.

🐛 Bug Fixes

  • Fixed stale test configurations in the example app.
  • Improved internal structure for better maintainability.

Migration Guide

Before (v0.4.x):

// Manual initialization (Deprecated approach)
await RustLib.init();
final dir = await getApplicationDocumentsDirectory();
await initTokenizer(tokenizerPath: ...);
await EmbeddingService.init(...);
final service = SourceRagService(dbPath: ...);
await service.init();

After (v0.5.x):

// Modern initialization (Recommended)
await MobileRag.initialize(
  tokenizerAsset: 'assets/tokenizer.json',
  modelAsset: 'assets/model.onnx',
);

// Access anywhere
await MobileRag.instance.addDocument("Hello RAG!");

v0.4.0: PDF/DOCX extraction, cross-platform support

@dev07060 dev07060 released this 12 Jan 16:00

Key Features

1. Document Parsing (Rust Core)

New Rust-based document extraction with native performance:

  • PDF text extraction via pdf-extract crate
  • DOCX parsing via docx-lite crate
  • Smart dehyphenation: automatically rejoins words split by line breaks and page boundaries
  • Page number stripping from extracted PDF text
  • 50MB file size limit to prevent OOM on mobile devices
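The dehyphenation step in the list above can be illustrated with a deliberately simple rule: a line ending in a hyphen is joined to the next line when that line starts with a lowercase letter. The engine's actual heuristics are not described in these notes, so this rule and the function name are assumptions. Page-boundary handling and the later Korean support are not covered.

```rust
/// Rejoins words split across line breaks, e.g. "infor-\nmation".
/// Assumed rule: join only when the hyphen directly follows a letter and
/// the next line starts with a lowercase letter.
pub fn dehyphenate(text: &str) -> String {
    let mut out = String::with_capacity(text.len());
    let mut lines = text.lines().peekable();
    let mut joined_prev = false;

    while let Some(raw) = lines.next() {
        // After a join, drop the indentation of the continuation line.
        let line = if joined_prev { raw.trim_start() } else { raw };

        let next_starts_lower = lines
            .peek()
            .and_then(|n| n.trim_start().chars().next())
            .map_or(false, |c| c.is_lowercase());

        match line.trim_end().strip_suffix('-') {
            Some(stem)
                if next_starts_lower
                    && stem.chars().last().map_or(false, |c| c.is_alphabetic()) =>
            {
                // Drop the hyphen and the line break.
                out.push_str(stem);
                joined_prev = true;
            }
            _ => {
                out.push_str(line);
                if lines.peek().is_some() {
                    out.push('\n');
                }
                joined_prev = false;
            }
        }
    }
    out
}
```

A rule this simple will also merge genuinely hyphenated compounds that happen to break at the hyphen, which is why production extractors tend to add dictionary or frequency checks.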

2. Cross-Platform Support

  • Android platform support added
  • macOS entitlements for file picker integration

API Changes

New Functions

  • extractTextFromPdf(), extractTextFromDocx(), extractTextFromDocument()
  • SourceRagService.removeSource(id) - Delete documents from index

New Parameters

  • EmbeddingService.init(useGpuAcceleration: bool) - Enable GPU acceleration

Full Changelog

v0.3.0...v0.4.0