@wjones127 wjones127 commented Nov 21, 2025

Summary

Previously, IvfModel::partition_size would panic when accessing a partition that doesn't exist in an index. This occurred during index optimization when multiple indices had different partition counts due to incremental partition splitting.

Root Cause

The should_split function in builder.rs:1174-1177 iterates over all partitions based on the first index's partition count, then calls partition_size on all existing indices, including older ones with fewer partitions. When partitions are split during optimization, newer indices can have more partitions than older ones (e.g., 116 → 117 → 118).
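The mismatch described above can be illustrated with a small sketch. This is not the code from builder.rs: `out_of_bounds_lookups` is a hypothetical helper, and it assumes the first index in the slice is the one with the most partitions. It only reports which (index, partition) lookups would fall past the end of an older index's partition list, which is where direct indexing would panic.

```rust
/// Hypothetical helper (not part of Lance): given the partition count of
/// each existing index, iterate partitions by the FIRST index's count and
/// report every (index position, partition id) lookup that would be out
/// of bounds for that index.
fn out_of_bounds_lookups(partition_counts: &[usize]) -> Vec<(usize, usize)> {
    // Assumption: the loop bound comes from the first index only.
    let Some(&num_partitions) = partition_counts.first() else {
        return Vec::new();
    };

    let mut bad = Vec::new();
    for part in 0..num_partitions {
        for (i, &count) in partition_counts.iter().enumerate() {
            // Direct indexing such as `lengths[part]` would panic here.
            if part >= count {
                bad.push((i, part));
            }
        }
    }
    bad
}
```

With partition counts of 118, 117, and 116 (the example from the description, newest first), the lookups for partitions 116 and 117 on the older indices are the ones that fall out of bounds.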

Changes

  • Changed IvfModel::partition_size to use .get().cloned().unwrap_or(0) instead of direct array indexing
  • Returns 0 for non-existent partitions, which is semantically correct
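A minimal sketch of the bounds-checked lookup, under stated assumptions: `IvfSketch`, its `lengths` field, and `total_partition_size` are hypothetical stand-ins for illustration, not the actual `IvfModel` definition. Only the `.get().cloned().unwrap_or(0)` pattern is taken from the change described above.

```rust
/// Hypothetical stand-in for an IVF model: one row count per partition.
struct IvfSketch {
    lengths: Vec<u32>,
}

impl IvfSketch {
    /// Bounds-checked lookup: a partition this index does not have
    /// is treated as empty instead of panicking.
    fn partition_size(&self, part: usize) -> usize {
        self.lengths.get(part).cloned().unwrap_or(0) as usize
    }
}

/// Hypothetical helper: sum one partition's size across several indices,
/// some of which may have fewer partitions than others.
fn total_partition_size(indices: &[IvfSketch], part: usize) -> usize {
    indices.iter().map(|ivf| ivf.partition_size(part)).sum()
}
```

Treating a missing partition as size 0 means an older index simply contributes nothing for partitions created by later splits, so a sum across indices stays well defined.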

Fixes #5312

🤖 Generated with Claude Code

Previously, IvfModel::partition_size would panic when accessing a
partition that doesn't exist in an index. This occurred during index
optimization when multiple indices had different partition counts due
to incremental partition splitting.

The should_split function would iterate over all partitions based on
the first index's partition count, then call partition_size on all
existing indices, including older ones with fewer partitions.

Changed partition_size to use .get().cloned().unwrap_or(0) instead of
direct array indexing, returning 0 for non-existent partitions.

Added regression test that verifies out-of-bounds accesses return 0
instead of panicking.

Fixes lance-format#5312

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@github-actions

ACTION NEEDED
Lance follows the Conventional Commits specification for release automation.

The PR title and description are used as the merge commit message. Please update your PR title and description to match the specification.

For details on the error please inspect the "PR Title Check" action.

Added test_optimize_with_mismatched_partition_counts to verify that
the full optimization pipeline handles indices with different partition
counts without panicking. This test simulates the real-world scenario
where incremental partition splitting creates indices with varying
partition counts.

The test:
- Creates an initial IVF-PQ index with 4 partitions
- Appends data and optimizes multiple times
- Verifies optimization succeeds even with mismatched partition counts
- Confirms the index remains functional after optimization

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <[email protected]>
@wjones127 wjones127 force-pushed the fix-5312-partition-size-bounds branch from eca4f1d to a64adbc on November 21, 2025 18:45
@wjones127 wjones127 changed the title from "Fix panic in partition_size with mismatched partition counts" to "fix: don't panic in IVF_PQ optimize if segments have different number of partitions" on Nov 21, 2025
@github-actions github-actions bot added the bug Something isn't working label Nov 21, 2025
@wjones127 wjones127 closed this Nov 24, 2025
Development

Successfully merging this pull request may close these issues.

Panic while optimizing vector indices
