pglance is a PostgreSQL extension built with the pgrx framework that implements full-table scanning functionality for directly reading and querying Lance format tables within PostgreSQL.
This is the first open-source project to seamlessly integrate the modern columnar storage format Lance with PostgreSQL database.
Bring Lance's high-performance columnar storage and vector search capabilities into the PostgreSQL ecosystem, providing users with:
- Efficient large-scale data analytics capabilities
- Native vector search support (planned)
- Unified SQL interface for accessing Lance data
- ๐ Lance Table Scanning: Complete table data reading and traversal
- ๐ Schema Inspection: Automatic parsing of Lance table structure and column types
- ๐ Statistics: Get table metadata including version, row count, column count
- ๐ Type Conversion: Intelligent type mapping from Arrow/Lance to PostgreSQL
- ๐ฆ JSONB Output: JSON serialization for complex data structures
- โก Async Processing: Integration of async Lance APIs within sync PostgreSQL interface
- ๐ฏ Vector Search: KNN and ANN search support
- ๐ง FDW Support: Foreign Data Wrapper interface
- โ๏ธ Write Operations: INSERT/UPDATE/DELETE support
- ๐ Query Optimization: Predicate pushdown and column projection optimization
| Component | Version | Description |
|---|---|---|
| PostgreSQL | 13-17 | Support for all actively maintained versions |
| Rust | 1.70+ | Modern systems programming language |
| pgrx | 0.14.3 | PostgreSQL extension development framework |
| Lance | 0.29 | Latest version of Lance storage engine |
| Arrow | 55.1 | Latest version of Apache Arrow |
Install required tools:
- Rust (latest stable) - https://rustup.rs/
- PostgreSQL (13-17) with development headers
- Protocol Buffers compiler (protoc)
# Clone the project
git clone <repository-url>
cd pglance
# Setup development environment
cargo install cargo-pgrx --version=0.14.3 --locked
cargo pgrx init
# Build and install extension
cargo pgrx install --features pg16
# Enable extension in PostgreSQL
psql -c "CREATE EXTENSION pglance;"-- Test basic functionality
SELECT hello_pglance();
-- Should return: "Hello, pglance"-- View complete Lance table structure information
SELECT
column_name,
data_type,
CASE WHEN nullable THEN 'YES' ELSE 'NO' END as is_nullable
FROM lance_table_info('/path/to/your/lance/table')
ORDER BY column_name;Example Output:
column_name | data_type | is_nullable
-------------+-----------+-------------
id | int8 | NO
embedding | float4[] | YES
metadata | jsonb | YES
name | text | YES
-- Get detailed table statistics
SELECT
'Lance Table Version: ' || version as info,
'Total Rows: ' || num_rows as row_info,
'Total Columns: ' || num_columns as col_info
FROM lance_table_stats('/path/to/your/lance/table');-- View first 5 rows of data (recommended for large tables)
SELECT
(row_data->>'id')::bigint as id,
row_data->>'name' as name,
jsonb_array_length(row_data->'embedding') as embedding_dim
FROM lance_scan_jsonb('/path/to/your/lance/table', 5);
-- Data quality statistics
SELECT
COUNT(*) as total_rows,
COUNT(CASE WHEN row_data ? 'id' THEN 1 END) as has_id,
COUNT(CASE WHEN row_data ? 'embedding' THEN 1 END) as has_embedding
FROM lance_scan_jsonb('/path/to/your/lance/table', 1000);Returns a simple greeting to verify extension installation.
Returns: TEXT - "Hello, pglance"
Returns Lance table structure information.
Parameters:
table_path: File system path to the Lance table
Returns:
column_name: Column namedata_type: PostgreSQL data typenullable: Whether null values are allowed
Returns Lance table statistics.
Parameters:
table_path: File system path to the Lance table
Returns:
version: Lance table versionnum_rows: Total number of rowsnum_columns: Total number of columns
Scans Lance table and returns data in JSONB format.
Parameters:
table_path: File system path to the Lance tablelimit: Limit number of rows returned (optional)
Returns:
row_data: Row data in JSONB format
| Arrow/Lance Type | PostgreSQL Type |
|---|---|
| Boolean | boolean |
| Int8 | char |
| Int16 | int2 |
| Int32 | int4 |
| Int64 | int8 |
| Float32 | float4 |
| Float64 | float8 |
| Utf8/LargeUtf8 | text |
| Binary | bytea |
| Date32/Date64 | date |
| Timestamp | timestamp |
| List/Struct | jsonb |
| FixedSizeList(float) | float4[]/float8[] |
# Setup development environment
cargo install cargo-pgrx --version=0.14.3 --locked
cargo pgrx init
# Clone and setup project
git clone <repository-url>
cd pglance
# Run all quality checks
cargo fmt --all -- --check
cargo clippy --features pg16 -- -D warnings
cargo test --features pg16
# Build and install
cargo pgrx install --features pg16
# Start PostgreSQL with extension
cargo pgrx run --features pg16If you have just installed:
# Show all available commands
just
# Run all quality checks
just check
# Auto-format code
just fmt
# Build extension
just build
# Run tests
just test
# Start PostgreSQL with extension
just run
# Simulate CI locally
just ciSpecify PostgreSQL version for commands:
cargo pgrx install --features pg15 # PostgreSQL 15
cargo pgrx install --features pg17 # PostgreSQL 17
# Or with just:
just build pg=15
just test pg=17Supported versions: 13, 14, 15, 16, 17 (default: 16)
For detailed development information, see DEVELOPMENT.md.
pglance uses a pure Rust testing approach with comprehensive unit and integration tests.
# Run all tests
cargo test --features pg16
# Or with just:
just testAll tests are written in Rust using the pgrx testing framework. For detailed testing information, see TESTING.md.
pglance/
โโโ src/
โ โโโ lib.rs # Main entry, PostgreSQL function definitions
โ โโโ types/ # Type conversion module
โ โ โโโ mod.rs # Module exports
โ โ โโโ conversion.rs # Arrow to PostgreSQL type mapping
โ โ โโโ arrow_convert.rs # Arrow value conversion utilities
โ โโโ scanner/ # Lance scanner implementation
โ โโโ mod.rs # Module exports
โ โโโ lance_scanner.rs # Lance table scanning logic
โโโ sql/ # SQL scripts (if any)
โโโ .github/ # GitHub workflows
โ โโโ workflows/
โ โโโ rust-checks.yml # CI/CD pipeline
โ โโโ release.yml # Release automation
โโโ Cargo.toml # Rust dependency configuration
โโโ justfile # Development commands
โโโ pglance.control # PostgreSQL extension metadata
โโโ README.md # This file
โโโ DEVELOPMENT.md # Development guide
โโโ TESTING.md # Testing guide
- File Paths: Currently requires full file system path to Lance tables
- Permissions: PostgreSQL process needs read permissions for Lance files
- Memory Usage: Large table scans may consume significant memory
- Type Support: Complex nested types are converted to JSONB
- Concurrency: Current implementation uses synchronous access
- Foreign Data Wrapper (FDW) support
- Vector search functionality (KNN/ANN)
- Write support (INSERT/UPDATE/DELETE)
- Partitioned table support
- Query pushdown optimization
- Streaming scans for large datasets
- Custom vector types
- Index creation and management
Issues and Pull Requests are welcome! Please see our development guidelines in [DEVELOPMENT.md](DEVELOPMENT.m d).
Apache License 2.0
- Lance - Modern columnar data format
- pgrx - PostgreSQL extension development framework
- Apache Arrow - In-memory columnar data format
- LanceDB - Vector database built on Lance