
Commit bc7c00c

phillipleblanc, sgrebnov, lukekim, peasee, Sevenannn authored
Improve DuckDB insert performance; support Postgres materialized views; MySQL connection metrics (#311)
* Revert "Revert "Bump secrecy version (#248)" (#252)" (#254)
  This reverts commit 3c38758.
* Upgrade datafusion-federation (#258)
* Upgrade federation (#259)
* Add primary keys mismatch verification for DuckDB table creation (#260)
* Add indexes mismatch detection for DuckDB table creation (#261)
* Add indexes mismatch detection for DuckDB table creation
* Update src/duckdb.rs
  Co-authored-by: Luke Kim <[email protected]>
* Update src/duckdb.rs
  Co-authored-by: Luke Kim <[email protected]>
* Update src/duckdb.rs
---------
Co-authored-by: Luke Kim <[email protected]>
* Add test for DuckDB memory_limit option (#263)
* Use quote_identifier for table name when fetching existing primary keys (#265)
* Defer index/constraint creation until after the initial data load. (#280)
* Defer index/constraint creation until after the initial data load.
* revert: Apply primary keys only at table creation
---------
Co-authored-by: peasee <[email protected]>
* refactor: New TableManager interfaces for DuckDB (#287)
* wip
* refactor: New TableCreator interfaces for DuckDB
* wip: Add some stubbed tests
* wip: #[ignore] not #[skip]
* test: Add creator.rs tests
* test: Add overwrite and append tests
* fix: Add schema validation back, add more unit tests
* refactor: Rename TableCreator::list_internal_tables
* chore: Remove commented out code
* Switch from Appender to duckdb_arrow_scan (#288)
* Switch from Appender to duckdb_arrow_scan
* Fix duckdb-rs commit
* Update src/duckdb.rs
  Co-authored-by: Luke Kim <[email protected]>
---------
Co-authored-by: Luke Kim <[email protected]>
* duckdb-rs commit from spiceai-1.1.3-backported
* refactor: Append reuses existing base table, more tests, ViewCreator
* fix: InsertOp::Replace needs Append
* Add temp directory parameter (#289)
* refactor: Make TableDefinition PartialEq all of the time
* review: Rename TableCreator to TableManager, use option for internal table state
* refactor: Update error messaging on indexes
* Apply suggestions from code review
  Co-authored-by: Phillip LeBlanc <[email protected]>
* fix: Address internal tables that are subsets or other table names
* feat: Add TableDefinition function to determine if it exists
* fix: Allow TableDefinition::name() to be public
* clippy: must_use
---------
Co-authored-by: Phillip LeBlanc <[email protected]>
Co-authored-by: Luke Kim <[email protected]>
* fix: Make an initial table during `TableProviderFactory::create()` for DuckDB (#290)
* fix: Make initial table if none exist for DuckDB
* refactor: Defer indexes to after first append load
* test: Update tests for append
* fix: Index detection for overwrite mode
* clippy: Reference immediately dereferenced
* test: Fix table creator tests
* Fixes a warning about missing indexes that shouldn't show up for newly created tables. (#292)
* De-duplicate attachments in DuckDBAttachments (#294)
* Allow connection pool size configuration for duckdb connection pool (#275)
* Add connection_pool_size parameter for duckdb table provider
* fix tests
* remove user configured connection_pool param, add min_idle configuration to pool
* fix
* apply suggestions
* fix tests
* add DuckDBConnectionPoolBuilder method
* remove idle size calculation, introduce non-breaking get_or_init instance methods
* fix tests
* Add mode to DuckDbConnectionPoolBuilder, remove lifetime for DuckDBConnectionPoolBuilder
* move build pool functionality into the DuckDB ConnectionPool Builder
* Use builder when get_or_init instance
* only pass pool to the get_or_init_instance_with_builder
* remove unnecessary error
* Add DuckDB setting `preserve_insertion_order` (#298)
* Update the PostgreSQL query to support inferring the schema from materialized views (#300)
* Update the PostgreSQL query to support inferring the schema from materialized views
* update postgres schema tests, to use complex table for views
* Update examples/postgres.rs
* Update examples/postgres.rs
* Upgrade mysql_async, expose metrics and remove unnecessary type parameters (#302)
* Upgrade mysql_async, expose metrics and remove unnecessary type parameters
* Fix lint issue
* Expose the connection pool metrics via the table factory (#304)
* Support the `pool_min` and `pool_max` MySQL connection parameters (#305)
* Support the `pool_min` and `pool_max` parameters
* Support the `pool_min` and `pool_max` MySQL connection parameters
* Exercise the pool_min/pool_max options
* Fixing merge issues + use spiceai_duckdb_fork
---------
Co-authored-by: Sergei Grebnov <[email protected]>
Co-authored-by: Luke Kim <[email protected]>
Co-authored-by: peasee <[email protected]>
Co-authored-by: Qianqian <[email protected]>
Co-authored-by: Evgenii Khramkov <[email protected]>
2 parents 6fd7aa2 + a37be8f commit bc7c00c

22 files changed: +3721 -820 lines changed

Cargo.lock

Lines changed: 26 additions & 26 deletions

Cargo.toml

Lines changed: 1 addition & 1 deletion
@@ -48,4 +48,4 @@ datafusion-proto = { version = "46" }
 datafusion-physical-expr = { version = "46" }
 datafusion-physical-plan = { version = "46" }
 datafusion-table-providers = { path = "core" }
-duckdb = { version = "=1.2.1" }
+duckdb = { version = "=1.2.1", package = "spiceai_duckdb_fork" } # Forked to add support for duckdb_scan_arrow, pending: https://github.com/duckdb/duckdb-rs/pull/488

README.md

Lines changed: 12 additions & 1 deletion
@@ -64,7 +64,18 @@ CREATE TABLE companies (
   name VARCHAR(100)
 );
 
-INSERT INTO companies (id, name) VALUES (1, 'Acme Corporation');
+INSERT INTO companies (id, name) VALUES
+  (1, 'Acme Corporation'),
+  (2, 'Widget Inc.'),
+  (3, 'Gizmo Corp.'),
+  (4, 'Tech Solutions'),
+  (5, 'Data Innovations');
+
+CREATE VIEW companies_view AS
+  SELECT id, name FROM companies;
+
+CREATE MATERIALIZED VIEW companies_materialized_view AS
+  SELECT id, name FROM companies;
 EOF
 ```
 

core/Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -109,6 +109,7 @@ rstest = "0.25.0"
 test-log = { version = "0.2", features = ["trace"] }
 tokio-stream = { version = "0.1", features = ["net"] }
 tracing-subscriber = { version = "0.3", features = ["env-filter"] }
+tempfile = "3.19.1"
 
 [features]
 duckdb = [

core/examples/postgres.rs

Lines changed: 29 additions & 0 deletions
@@ -76,6 +76,21 @@ async fn main() {
     )
     .expect("failed to register table");
 
+    let companies_view = table_factory
+        .table_provider(TableReference::bare("companies_view"))
+        .await
+        .expect("to create table provider for view");
+
+    let companies_materialized_view = table_factory
+        .table_provider(TableReference::bare("companies_materialized_view"))
+        .await
+        .expect("to create table provider for materialized view");
+
+    ctx.register_table("companies_view", companies_view)
+        .expect("to register view");
+    ctx.register_table("companies_materialized_view", companies_materialized_view)
+        .expect("to register materialized view");
+
     // Query Example 1: Query the renamed table through default catalog
     let df = ctx
         .sql("SELECT * FROM datafusion.public.companies_v2")
@@ -89,4 +104,18 @@ async fn main() {
         .await
         .expect("select failed");
     df.show().await.expect("show failed");
+
+    let df = ctx
+        .sql("SELECT * FROM companies_view")
+        .await
+        .expect("select from view failed");
+
+    df.show().await.expect("show failed");
+
+    let df = ctx
+        .sql("SELECT * FROM companies_materialized_view")
+        .await
+        .expect("select from materialized view failed");
+
+    df.show().await.expect("show failed");
 }
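
Note on the materialized-view change (#300): the updated schema query itself is not part of the hunks shown here. For context, PostgreSQL's `information_schema.columns` does not list materialized views, so inferring their schema generally means reading `pg_catalog` directly. The sketch below is only an illustration of that idea, not the query this commit uses; the function name `columns_query` and the choice of `relkind` values are assumptions made for the example.

```rust
/// Hypothetical helper (not taken from this commit): builds a catalog query
/// that returns column names and types for one relation, restricted to the
/// given `pg_class.relkind` codes.
///
/// Common relkind codes: 'r' = ordinary table, 'v' = view,
/// 'm' = materialized view, 'p' = partitioned table, 'f' = foreign table.
fn columns_query(relkinds: &[char]) -> String {
    // Render the relkind codes as a SQL IN-list, e.g. 'r', 'v', 'm'.
    let in_list = relkinds
        .iter()
        .map(|k| format!("'{k}'"))
        .collect::<Vec<_>>()
        .join(", ");

    // $1 = schema name, $2 = relation name (bound by the caller).
    // attnum > 0 skips system columns; attisdropped skips dropped columns.
    format!(
        "SELECT a.attname, pg_catalog.format_type(a.atttypid, a.atttypmod) \
         FROM pg_catalog.pg_attribute a \
         JOIN pg_catalog.pg_class c ON c.oid = a.attrelid \
         JOIN pg_catalog.pg_namespace n ON n.oid = c.relnamespace \
         WHERE n.nspname = $1 AND c.relname = $2 \
         AND c.relkind IN ({in_list}) \
         AND a.attnum > 0 AND NOT a.attisdropped \
         ORDER BY a.attnum"
    )
}

fn main() {
    // Include 'm' so that materialized views such as
    // companies_materialized_view are matched as well.
    println!("{}", columns_query(&['r', 'v', 'm']));
}
```

Binding the schema and relation names as `$1` and `$2` keeps identifiers out of the SQL text, which avoids quoting issues in the sketch.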
