feat(perf): Add comprehensive benchmarking framework #3029
[APPROVALNOTIFIER] This PR is **APPROVED**

This pull-request has been approved by: XuanYang-cn
…nsive benchmarking

This commit includes a comprehensive benchmark suite to identify remaining bottlenecks.

**New Benchmarks**

Created a comprehensive benchmark suite under `tests/benchmark/`:

- **Access patterns**: 23 tests measuring real-world usage patterns
  - First-access (UI display), iterate-all (export), random-access (pagination)
- **Search benchmarks**: Various vector types, dimensions, output fields
- **Query benchmarks**: Scalars, JSON, all output types
- **Hybrid search**: Multiple requests, varying top-k

**Profiling Infrastructure**

- Mock framework for client-only testing (no server required)
- Integrated `pytest-memray` for memory profiling
- Added helper scripts:
  - `profile_cpu.sh`: CPU profiling with py-spy
  - `profile_memory.sh`: Memory profiling with pytest-memray

**Profiling Tools**

- `pytest-benchmark`: Timing measurements
- `py-spy`: CPU profiling and flamegraphs
- `memray`: Memory allocation tracking

**Key Discoveries**

1. **Lazy loading inefficiency** (CRITICAL)
   - Accessing the first result materializes ALL results (+77% overhead)
   - Example: `result[0][0]` loads all 10,000 results
   - Impact: 423ms → 749ms for 10K results
2. **Vector materialization dominates** (HIGH PRIORITY)
   - 76% of memory usage (326 MiB of 431 MiB for 65K results)
   - 8x slower than scalars (337ms vs 42ms for 10K results)
   - Scales linearly with dimensions (128d: 8 MiB, 1536d: 68 MiB)
3. **Struct fields are slow** (MEDIUM PRIORITY)
   - 10x slower than scalars (435ms vs 42ms for 10K results)
   - Column-to-row conversion overhead
   - Linear O(n) scaling with high constant factor
4. **Scalars are efficient** (NO OPTIMIZATION NEEDED)
   - 64.6 MiB for 65K rows × 4 fields
   - ~1 KB per entity (acceptable dict overhead)

Signed-off-by: yangxuan <[email protected]>
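The lazy-loading finding in the commit message (reading `result[0][0]` materializes every result) can be illustrated with a small, self-contained sketch. This is not the client's actual code: `EagerHits`, `LazyHits`, and the `rows_built` counter are hypothetical names introduced only for this example, and the column-oriented input is an assumed stand-in for a search response. The sketch contrasts converting every column-oriented row on first access with converting only the row that was requested.

```python
from typing import Any, Dict, List, Optional


class _ColumnarHits:
    """Hypothetical base: holds column-oriented data and counts row builds."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        self._columns = columns
        self._size = len(next(iter(columns.values()), []))
        self.rows_built = 0  # how many column-to-row conversions have run

    def __len__(self) -> int:
        return self._size

    def _build_row(self, index: int) -> Dict[str, Any]:
        # Column-to-row conversion: one dict per entity.
        self.rows_built += 1
        return {name: values[index] for name, values in self._columns.items()}


class EagerHits(_ColumnarHits):
    """Builds every row the first time any single row is read."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        super().__init__(columns)
        self._rows: Optional[List[Dict[str, Any]]] = None

    def __getitem__(self, index: int) -> Dict[str, Any]:
        if self._rows is None:
            self._rows = [self._build_row(i) for i in range(len(self))]
        return self._rows[index]


class LazyHits(_ColumnarHits):
    """Builds only the requested row and caches it for later reads."""

    def __init__(self, columns: Dict[str, List[Any]]) -> None:
        super().__init__(columns)
        self._cache: Dict[int, Dict[str, Any]] = {}

    def __getitem__(self, index: int) -> Dict[str, Any]:
        if index < 0:
            index += len(self)
        if not 0 <= index < len(self):
            raise IndexError(index)
        if index not in self._cache:
            self._cache[index] = self._build_row(index)
        return self._cache[index]


# Assumed sample data: 10,000 results with two scalar fields.
columns = {"id": list(range(10_000)), "score": [0.5] * 10_000}
eager, lazy = EagerHits(columns), LazyHits(columns)

first_eager, first_lazy = eager[0], lazy[0]
print(eager.rows_built, lazy.rows_built)  # 10000 1
```

Both containers return the same first row, but the eager one pays for all 10,000 conversions up front. Whether per-row caching is the right fix for the real client depends on its access patterns; the iterate-all case, for instance, gains nothing from laziness.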
1a5ab37 to 6961059
Codecov Report

✅ All modified and coverable lines are covered by tests.

Coverage Diff

| | master | #3029 | +/- |
|---|---|---|---|
| Coverage | ? | 46.41% | |
| Files | ? | 65 | |
| Lines | ? | 13614 | |
| Branches | ? | 0 | |
| Hits | ? | 6319 | |
| Misses | ? | 7295 | |
| Partials | ? | 0 | |