DuckDB provider for ServiceStack.OrmLite - A fast, simple, and typed ORM for .NET.
This is an independent, community-maintained provider for ServiceStack.OrmLite. It is not officially maintained or endorsed by ServiceStack.
This package enables ServiceStack.OrmLite to work with DuckDB, an in-process analytical database management system. DuckDB excels at:
- Fast analytical queries - Columnar storage and vectorized execution
- In-process - No separate server process required
- SQL standard - Familiar SQL syntax with PostgreSQL compatibility
- Data processing - Native support for Parquet, CSV, JSON
- OLAP workloads - Aggregations, window functions, complex analytics
✅ Full OrmLite Support
- Complete CRUD operations (sync + async)
- LINQ query expressions
- Transactions
- AutoIncrement with sequences and INSERT...RETURNING
- Complex queries (JOINs, aggregations, subqueries)
- Parameterized queries
- Batch operations
- Async/await support (pseudo-async)
- High-performance bulk insert - 10-100x faster using DuckDB's Appender API
- Bulk insert with deduplication - Production-safe staging table pattern for massive tables
- Type-safe LINQ expressions - IntelliSense support for unique column selection
- Multi-database support - Query across multiple DuckDB files transparently
✅ Complete Type Support
- All .NET primitive types
- DateTime, DateTimeOffset, TimeSpan
- Decimal and all integer types
- Guid (UUID)
- byte[] (BLOB)
- Nullable types
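As a quick illustration, a model exercising these types maps directly (a hypothetical example, not taken from the package's test suite):

```csharp
public class Measurement
{
    [AutoIncrement]
    public int Id { get; set; }
    public Guid DeviceId { get; set; }             // stored as UUID
    public DateTime RecordedAt { get; set; }
    public DateTimeOffset? SyncedAt { get; set; }  // nullable types supported
    public TimeSpan Duration { get; set; }
    public decimal Value { get; set; }
    public byte[] RawPayload { get; set; }         // stored as BLOB
}
```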
✅ Production Ready
- 90 comprehensive tests (100% passing)
- Optimized for DuckDB 1.3.2
- SQL injection prevention
- Robust error handling
- Connection timeout/retry for multi-process scenarios
Install via NuGet:

```bash
dotnet add package DuckDB.OrmLite
```

```csharp
using ServiceStack.OrmLite;
using DuckDB.OrmLite;
// Create connection factory
var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db");
// Define your models
public class Customer
{
[AutoIncrement]
public int Id { get; set; }
[Required]
public string Name { get; set; }
public string Email { get; set; }
public decimal CreditLimit { get; set; }
public DateTime RegisteredAt { get; set; }
}
// Use OrmLite
using var db = dbFactory.Open();
// Create table
db.CreateTable<Customer>();
// Insert
var customer = new Customer
{
Name = "Acme Corp",
Email = "[email protected]",
CreditLimit = 50000,
RegisteredAt = DateTime.UtcNow
};
db.Insert(customer);
// Query with LINQ
var highValueCustomers = db.Select<Customer>(c =>
c.CreditLimit > 10000 && c.RegisteredAt > DateTime.UtcNow.AddMonths(-6)
);
// Update
customer.CreditLimit = 75000;
db.Update(customer);
// Aggregations
var totalCredit = db.Scalar<decimal>(
db.From<Customer>().Select(c => Sql.Sum(c.CreditLimit))
);
// JOINs
var orders = db.Select(db.From<Order>()
.Join<Customer>((o, c) => o.CustomerId == c.Id)
.Where<Customer>(c => c.Name == "Acme Corp")
);
// Transactions
using (var trans = db.OpenTransaction())
{
db.Insert(customer);
db.Insert(order);
trans.Commit();
}
```

DuckDB.OrmLite supports async/await for all operations through ServiceStack.OrmLite's built-in async APIs.
Important: Since DuckDB.NET.Data does not provide native async I/O operations, these are pseudo-async implementations (similar to SQLite). Operations will block the calling thread but provide API compatibility with other OrmLite providers.
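Conceptually, a pseudo-async method runs the synchronous operation and hands back an already-completed task. A minimal sketch of the idea (illustrative only, not the provider's actual implementation):

```csharp
using System;
using System.Collections.Generic;
using System.Data;
using System.Linq.Expressions;
using System.Threading;
using System.Threading.Tasks;
using ServiceStack.OrmLite;

public static class PseudoAsyncSketch
{
    // Blocks the calling thread, then returns a completed Task -
    // API compatibility without real async I/O underneath.
    public static Task<List<T>> SelectPseudoAsync<T>(
        this IDbConnection db,
        Expression<Func<T, bool>> predicate,
        CancellationToken token = default)
    {
        token.ThrowIfCancellationRequested();
        return Task.FromResult(db.Select(predicate)); // synchronous call under the hood
    }
}
```

Usage is identical to other OrmLite providers: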
```csharp
using var db = dbFactory.Open();
// SELECT operations
var customers = await db.SelectAsync<Customer>(c => c.CreditLimit > 10000);
var customer = await db.SingleAsync<Customer>(c => c.Id == 1);
var count = await db.CountAsync<Customer>();
// INSERT operations
await db.InsertAsync(new Customer { Name = "New Corp", ... });
await db.InsertAllAsync(customers);
// UPDATE operations
customer.CreditLimit = 75000;
await db.UpdateAsync(customer);
await db.UpdateAllAsync(customers);
// DELETE operations
await db.DeleteAsync<Customer>(c => c.CreditLimit < 1000);
await db.DeleteByIdAsync<Customer>(customerId);
// SQL operations
var results = await db.SqlListAsync<Customer>("SELECT * FROM Customer WHERE CreditLimit > 50000");
var total = await db.SqlScalarAsync<decimal>("SELECT SUM(CreditLimit) FROM Customer");
// Transactions work with async too
using (var trans = db.OpenTransaction())
{
await db.InsertAsync(customer);
await db.InsertAsync(order);
trans.Commit();
}
```

Considerations:
- Suitable for maintaining a consistent async/await code style
- API-compatible with other OrmLite providers
- Not suitable for high-concurrency scenarios expecting true async I/O benefits
- Consider using synchronous methods if async benefits are not needed
Query across multiple DuckDB database files transparently - perfect for time-series data, archival scenarios, or partitioned datasets.
```csharp
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=main.db")
.WithAdditionalDatabases("archive_2024.db", "archive_2023.db")
.WithMultiDatabaseTables("CmcPrice", "FiatPrice");
// Queries automatically span all databases
using var db = factory.Open();
var allPrices = db.Select<CmcPrice>(x => x.Symbol == "BTC");
// Returns data from main.db, archive_2024.db, and archive_2023.db
// Writes go to main database only
using var writeDb = factory.OpenForWrite();
writeDb.Insert(new CmcPrice { Date = DateTime.Today, Symbol = "ETH", Price = 3500 });
```

```csharp
// Generic factory - automatically configures multi-database support for CmcPrice
var factory = new DuckDbOrmLiteConnectionFactory<CmcPrice>("Data Source=main.db")
.WithAdditionalDatabases("archive.db");
// Or explicit configuration
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=main.db")
.WithAdditionalDatabases("archive.db")
.WithMultiDatabaseTable<CmcPrice>();
```

```csharp
// Setup: Current year + archives
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=prices_2025.db")
.WithAdditionalDatabases("prices_2024.db", "prices_2023.db", "prices_2022.db")
.WithMultiDatabaseTables("CmcPrice");
// Read: Query spans all years transparently
using (var db = factory.Open())
{
// Get Bitcoin prices for last 2 years
var btcPrices = db.Select<CmcPrice>(x =>
x.Symbol == "BTC" &&
x.Date >= DateTime.Today.AddYears(-2));
// Aggregations work across all databases
var avgPrice = db.Scalar<decimal>(
db.From<CmcPrice>()
.Where(x => x.Symbol == "BTC")
.Select(x => Sql.Avg(x.Price))
);
}
// Write: New data goes to current year database
using (var writeDb = factory.OpenForWrite())
{
writeDb.Insert(new CmcPrice
{
Date = DateTime.Today,
Symbol = "BTC",
Price = 95000
});
}
```

How it works:
- Automatic ATTACH: Additional databases are attached on connection open
- Unified Views: Creates `{TableName}_Unified` views with `UNION ALL` across all databases
- Query Routing: Read queries automatically use unified views; writes go directly to the main database
- Zero Code Changes: Application code using `db.Select<T>()` works unchanged
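For a table spread across a main file and one archive, the setup is conceptually equivalent to the following (illustrative only - the alias `db1` and the exact DDL are assumptions; the real statements are generated internally):

```csharp
// Conceptual equivalent of what the provider arranges on Open():
//
//   ATTACH 'archive_2024.db' AS db1;
//   CREATE VIEW CmcPrice_Unified AS
//       SELECT * FROM main.CmcPrice
//       UNION ALL
//       SELECT * FROM db1.CmcPrice;
//
// Reads of CmcPrice are routed to CmcPrice_Unified;
// writes always target main.CmcPrice.
```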
✅ All OrmLite operations work across databases:
- SELECT with WHERE, ORDER BY, LIMIT
- Aggregations (COUNT, SUM, AVG, MAX, MIN)
- JOINs (multi-db table with single-db table) - see the sketch after this list
- Complex LINQ expressions
- Async operations
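For example, a multi-database table can be joined against a lookup table that lives only in the main database (a sketch - the `Exchange` model and the `ExchangeId` column are assumptions for illustration):

```csharp
// CmcPrice resolves to its unified view; Exchange exists only in main.db
var btcOnBinance = db.Select(db.From<CmcPrice>()
    .Join<Exchange>((p, e) => p.ExchangeId == e.Id)
    .Where<Exchange>(e => e.Name == "Binance")
    .And<CmcPrice>(p => p.Symbol == "BTC"));
```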
✅ Smart table detection:
- Only creates views for tables that exist in databases
- Handles tables existing in subset of databases
✅ Flexible configuration:
- Mix multi-db and single-db tables in same factory
- Toggle auto-configuration with `.WithAutoConfigureViews(false)`
Limitations:
- Schema consistency: Tables must have identical schemas across all databases
- No cross-database transactions: Transactions only work with `OpenForWrite()` (single database)
- Read-only archives: Additional databases should be read-only for data consistency
- No automatic deduplication: `UNION ALL` doesn't deduplicate - ensure your partitioning prevents duplicates
DuckDB uses an exclusive file lock when opening a database. If another process has the database open, subsequent attempts will fail immediately. The timeout feature provides automatic retry with exponential backoff:
```csharp
// Default behavior - fail immediately if database is locked
using var db = factory.Open();
// Retry for up to 30 seconds with exponential backoff
using var db = factory.Open(TimeSpan.FromSeconds(30));
// Also available for write connections
using var writeDb = factory.OpenForWrite(TimeSpan.FromSeconds(30));
// Customize retry delays (useful for high-contention scenarios)
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db")
{
RetryDelayMs = 100, // Start with 100ms delays (default: 50ms)
MaxRetryDelayMs = 5000 // Cap at 5 seconds (default: 1000ms)
};
```

How it works:
- Detects lock errors: "Could not set lock", "database is locked", "IO Error"
- Exponential backoff: starts at `RetryDelayMs`, doubles each retry, caps at `MaxRetryDelayMs`
- Random jitter (up to 25% of the delay) to avoid the thundering-herd problem
- Throws `TimeoutException` if the lock is not acquired within the timeout
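The retry loop is conceptually similar to this sketch (an illustration of the algorithm under the stated defaults, not the provider's internals; `open` and `isLockError` stand in for the real open attempt and lock-error detection):

```csharp
using System;
using System.Threading;

public static class LockRetrySketch
{
    // Illustrative retry-with-backoff loop (a sketch, not the provider's actual code).
    public static T OpenWithRetry<T>(
        Func<T> open,                  // attempts to open the connection
        Func<Exception, bool> isLockError,
        TimeSpan timeout,
        int retryDelayMs = 50,         // default starting delay
        int maxRetryDelayMs = 1000)    // default cap
    {
        var deadline = DateTime.UtcNow + timeout;
        var delayMs = retryDelayMs;
        var rng = new Random();

        while (true)
        {
            try { return open(); }
            catch (Exception ex) when (isLockError(ex))
            {
                if (DateTime.UtcNow >= deadline)
                    throw new TimeoutException("Could not acquire database lock within timeout.", ex);

                var jitter = rng.Next(0, delayMs / 4 + 1);         // up to ~25% jitter
                Thread.Sleep(delayMs + jitter);
                delayMs = Math.Min(delayMs * 2, maxRetryDelayMs);  // double each retry, capped
            }
        }
    }
}
```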
Use cases:
- Multi-process scenarios where processes may temporarily lock the database
- Automated scripts that need to wait for other processes to finish
- Services that need resilient database access
Concurrency notes:
- Within same process: Multiple threads can read/write concurrently (MVCC) - see the sketch after this list
- Across processes: Only one process can open at a time (exclusive lock)
- Use timeout feature for cross-process coordination
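For example, within one process several threads can open connections from the same factory concurrently (a sketch; error handling omitted, `Customer` as defined in the quick start):

```csharp
using System.Linq;
using System.Threading.Tasks;

var factory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db");

// Same-process concurrency: multiple threads reading in parallel (MVCC)
var counts = await Task.WhenAll(Enumerable.Range(0, 4).Select(_ => Task.Run(() =>
{
    using var db = factory.Open();
    return db.Count<Customer>();
})));
```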
Recommended partitioning strategies:
- Time-based: One database per year/month
- Categorical: Separate databases by data type or category
- Archival: Current database + historical archives
Workflow pattern:
```csharp
// Daily process: Keep current database for writes
using (var writeDb = factory.OpenForWrite())
{
writeDb.InsertAll(todaysData);
}
// Analytics: Query across all time periods
using (var readDb = factory.Open())
{
var historicalTrends = readDb.Select<CmcPrice>(x =>
x.Date >= new DateTime(2020, 1, 1));
}
// Year-end: Rotate current to archive
// 1. Copy prices_2025.db to archive location
// 2. Update factory configuration to include new archive
// 3. Create fresh prices_2026.db for new year
```

```csharp
// DuckDB excels at analytical queries
var salesByMonth = db.SqlList<dynamic>(@"
SELECT
DATE_TRUNC('month', OrderDate) as Month,
COUNT(*) as OrderCount,
SUM(TotalAmount) as Revenue
FROM Orders
GROUP BY Month
ORDER BY Month DESC
");// High-performance bulk insert (10-100x faster than InsertAll)
db.BulkInsert(largeDataset); // Uses DuckDB's native Appender API
// For smaller datasets or when transaction control is needed
db.InsertAll(smallDataset);
// Process with SQL
db.ExecuteSql(@"
INSERT INTO ProcessedOrders
SELECT * FROM Orders
WHERE Status = 'completed'
AND OrderDate > CURRENT_DATE - INTERVAL '30 days'
");// Use in-memory database for fast processing
var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=:memory:");
```

For maximum performance when inserting large datasets, use `BulkInsert()`, which leverages DuckDB's native Appender API:
```csharp
using var db = dbFactory.Open();
var products = new List<Product>();
for (int i = 0; i < 100000; i++)
{
products.Add(new Product {
Id = i,
Name = $"Product {i}",
Price = i * 1.5m
});
}
// BulkInsert is 10-100x faster than InsertAll for large datasets
db.BulkInsert(products); // Uses DuckDB Appender API
```

| Method | 1,000 rows | 10,000 rows | 100,000 rows |
|---|---|---|---|
| `InsertAll()` | ~100ms | ~1s | ~10s |
| `BulkInsert()` | ~10ms | ~50ms | ~500ms |
| Speed improvement | 10x | 20x | 20-100x |
✅ Blazing fast - Uses DuckDB's native Appender API for direct memory-to-database transfer
✅ All data types supported - DateTime, Guid, decimal, byte[], TimeSpan, etc.
✅ Async support - BulkInsertAsync() available (pseudo-async wrapper)
✅ Simple API - Drop-in replacement for InsertAll()
Limitations:
- No transaction participation - the Appender auto-commits on completion
- No return values - Generated IDs are not returned (unlike `Insert()`)
- All-or-nothing - If any row fails, the entire batch fails
- For transactions, use `InsertAll()` - standard inserts support explicit transactions
Use BulkInsert() when:
- Inserting 100+ rows
- Performance is critical
- You don't need generated ID values returned
- Auto-commit behavior is acceptable
Use InsertAll() when:
- Inserting < 100 rows
- You need transaction control (explicit BEGIN/COMMIT)
- You need to mix inserts with other operations in a transaction
- You need generated ID values returned
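To make the trade-off concrete, here is a hedged sketch (variable names like `smallBatch` and `largeBatch` are placeholders):

```csharp
using var db = dbFactory.Open();

// InsertAll() participates in an explicit transaction and can be
// mixed with other operations that commit or roll back together
using (var trans = db.OpenTransaction())
{
    db.InsertAll(smallBatch);   // rolled back if Commit() is never reached
    db.Update(customer);
    trans.Commit();
}

// BulkInsert() does not participate in transactions -
// the Appender auto-commits when the batch completes
db.BulkInsert(largeBatch);
```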
Async bulk insert:

```csharp
using var db = dbFactory.Open();
var products = GetLargeDataset();
await db.BulkInsertAsync(products);
// Note: This is pseudo-async (wraps sync operation)
// since DuckDB.NET doesn't provide a native async Appender
```

For large tables where indexes cannot fit in memory (a known DuckDB limitation), use `BulkInsertWithDeduplication()`. This is the recommended approach for production use with tables containing hundreds of millions of rows.
DuckDB's ART indexes must fit in memory. With 845+ million rows, maintaining PRIMARY KEY or UNIQUE constraints becomes impractical. The staging table pattern provides:
✅ Zero risk to main table - Data is validated in staging before touching the main table
✅ Atomic duplicate detection - A SQL JOIN ensures no duplicates
✅ Fast rollback - Just drop the staging table on error
✅ Minimal lock time - The main table is locked only during the final INSERT SELECT
✅ High performance - Uses the Appender API for staging-table loading
```csharp
using var db = dbFactory.Open();
// Your 845M row table with composite unique key
public class CryptoPrice
{
public DateTime Timestamp { get; set; }
public string Symbol { get; set; }
public long ExchangeId { get; set; }
public decimal Price { get; set; }
}
// Load 70,000 new records (internally unique, but may overlap with existing)
var newPrices = LoadNewPrices(); // 70K records
// Option 1: LINQ expression (type-safe, recommended)
var insertedCount = db.BulkInsertWithDeduplication(
newPrices,
x => new { x.Timestamp, x.Symbol, x.ExchangeId }
);
// Option 2: String column names (flexible)
var insertedCount = db.BulkInsertWithDeduplication(
newPrices,
"Timestamp", "Symbol", "ExchangeId"
);
Console.WriteLine($"Inserted {insertedCount} new records (duplicates filtered)");// Step 1: Create temporary staging table (same schema as main)
CREATE TABLE CryptoPrice_Staging_<guid> AS SELECT * FROM CryptoPrice LIMIT 0
-- Step 2: BulkInsert into staging (fast, isolated from the main table);
--         uses the Appender API - 10-100x faster than InsertAll

-- Step 3: Atomic INSERT ... SELECT with deduplication
--         (handles both internal and external duplicates)
INSERT INTO CryptoPrice
WITH DeduplicatedStaging AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Timestamp, Symbol, ExchangeId ORDER BY (SELECT NULL)) as rn
FROM CryptoPrice_Staging_<guid>
)
SELECT columns  -- (full column list elided)
FROM DeduplicatedStaging s
LEFT JOIN CryptoPrice m ON
s.Timestamp = m.Timestamp AND
s.Symbol = m.Symbol AND
s.ExchangeId = m.ExchangeId
WHERE m.Timestamp IS NULL -- Not in main table
AND s.rn = 1 -- First occurrence from incoming data
-- Step 4: DROP the staging table (always executed, even on error)
```

Duplicate Handling:
- Internal duplicates (within incoming data): Keeps first occurrence only
- External duplicates (with main table): Filters out all matches
- Deterministic behavior ensures consistent results
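To make both cases concrete, a small sketch using the `CryptoPrice` model from above (timestamps and counts are illustrative):

```csharp
var t1 = new DateTime(2025, 1, 1);
var t2 = new DateTime(2025, 1, 2);
// Assume a row with key (t1, "BTC", 1) already exists in the main table.

var incoming = new List<CryptoPrice>
{
    new() { Timestamp = t1, Symbol = "BTC", ExchangeId = 1, Price = 95001m }, // external duplicate: filtered out
    new() { Timestamp = t2, Symbol = "BTC", ExchangeId = 1, Price = 95100m }, // new key: inserted
    new() { Timestamp = t2, Symbol = "BTC", ExchangeId = 1, Price = 95101m }, // internal duplicate: first occurrence wins
};

var inserted = db.BulkInsertWithDeduplication(
    incoming, x => new { x.Timestamp, x.Symbol, x.ExchangeId });

// inserted == 1: only the first occurrence of the genuinely new key (t2, "BTC", 1) is inserted
```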
Option 1: LINQ Expression (Type-Safe, Recommended)
```csharp
// Single column
var insertedCount = db.BulkInsertWithDeduplication(
users,
x => x.Email
);
// Multiple columns
var insertedCount = db.BulkInsertWithDeduplication(
prices,
x => new { x.Timestamp, x.Symbol, x.ExchangeId }
);
// Async
var insertedCount = await db.BulkInsertWithDeduplicationAsync(
prices,
x => new { x.Timestamp, x.Symbol }
);
```

Option 2: String Column Names (Flexible)

```csharp
var insertedCount = db.BulkInsertWithDeduplication(
prices,
"Timestamp", "Symbol", "ExchangeId"
);
```

Option 3: Auto-Detect from Attributes

```csharp
// CompositeIndex attribute
[CompositeIndex(nameof(Timestamp), nameof(Symbol), nameof(ExchangeId), Unique = true)]
public class CryptoPrice { ... }
// Auto-detects unique columns from attribute
var insertedCount = db.BulkInsertWithDeduplication(newPrices);
// Single [Unique] attribute
public class User
{
[Unique]
public string Email { get; set; }
...
}
var insertedCount = db.BulkInsertWithDeduplication(newUsers);
```

Performance:

| Operation | Time (70K records) | Notes |
|---|---|---|
| Append to staging | ~5-10ms | Appender API - blazing fast |
| INSERT SELECT with JOIN | ~50-200ms | Depends on main table size |
| Drop staging | ~1ms | Cleanup |
| Total overhead | ~60-210ms | Minimal cost for safety |
Compare this overhead to the risk of a direct append:
- Direct append failure: Hours to recover 845M row table
- Duplicate corruption: Potentially unfixable without backups
```csharp
// Daily ETL process for a massive time-series table
public async Task LoadDailyPrices()
{
using var db = dbFactory.Open();
// Main table: 845,000,000 rows
// New batch: 70,000 rows (may contain internal duplicates)
// Expected: ~500 duplicates with main table, ~69,500 new records
var newBatch = await FetchDailyPricesAsync(); // 70K records
var sw = Stopwatch.StartNew();
var insertedCount = db.BulkInsertWithDeduplication(
newBatch,
"Timestamp", "Symbol", "ExchangeId"
);
sw.Stop();
logger.LogInformation(
"Inserted {Inserted} of {Total} records in {Ms}ms (duplicates filtered: {Duplicates})",
insertedCount,
newBatch.Count,
sw.ElapsedMilliseconds,
newBatch.Count - insertedCount
);
// Typical output:
// Inserted 69,500 of 70,000 records in 180ms (duplicates filtered: 500)
}
```

The async variant:

```csharp
var insertedCount = await db.BulkInsertWithDeduplicationAsync(
newRecords,
"Timestamp", "Symbol", "ExchangeId"
);
```

Use BulkInsertWithDeduplication() when:
- ✅ Table has 100M+ rows (indexes can't fit in memory)
- ✅ You need duplicate prevention without UNIQUE constraints
- ✅ Main table safety is critical (zero corruption risk)
- ✅ You're doing ETL/data loading with potential duplicates
- ✅ You want automatic cleanup and transactional safety
Use BulkInsert() when:
- ✅ Data is guaranteed unique (no duplicates possible)
- ✅ Table is small enough for UNIQUE constraints
- ✅ Maximum speed is priority (no duplicate checking needed)
- ✅ You're loading into a temporary/staging table
Use InsertAll() when:
- ✅ Small datasets (< 100 rows)
- ✅ Need explicit transaction control
- ✅ Need generated ID values returned
Connection strings:
- File-based database: `"Data Source=myapp.db"`
- In-memory database: `"Data Source=:memory:"`
- Read-only mode: `"Data Source=myapp.db;Read Only=true"`

DuckDB requires parameter handling that differs slightly from other databases. The provider includes a `BeforeExecFilter` that handles this automatically:
```csharp
// Automatically configured by DuckDbOrmLiteConnectionFactory
// Handles:
// - Parameter name conversion ($ prefix handling)
// - 1-based positional parameter indexing
// - DbType.Currency → DbType.Decimal conversion
```

This package depends on:
- ServiceStack.OrmLite (>= 8.5.2) - The ORM framework
- DuckDB.NET.Data.Full (>= 1.3.0) - .NET bindings for DuckDB
Both dependencies are automatically installed when you add this package.
- .NET: 8.0+
- DuckDB: 1.3.2+
- ServiceStack.OrmLite: 8.5.2+
DuckDB is optimized for analytical workloads:
- Fast aggregations - Columnar storage enables efficient aggregations
- Vectorized execution - SIMD optimizations for bulk operations
- Memory efficient - Optimized for large datasets
- Zero-copy reads - Direct memory access where possible
- TimeSpan: Limited to ~24 hours when using HH:MM:SS format
- Concurrent writes: DuckDB uses single-writer model
- DuckDB Official Documentation
- ServiceStack.OrmLite Documentation
- Development Documentation - Implementation details and history
```bash
# Run all tests
dotnet test
# Run specific test category
dotnet test --filter "FullyQualifiedName~AdvancedFeatureTests"
```

Test coverage:
- 25 core OrmLite functionality tests
- 15 advanced feature tests (JOINs, aggregations, edge cases)
- 17 async/await tests
- 18 multi-database tests
- 11 concurrency tests (thread safety, optimistic concurrency)
- 4 generic factory tests
- 90 total tests (100% passing)
- Production-ready error handling and SQL injection prevention
Contributions are welcome! Please feel free to submit issues or pull requests.
This project is licensed under the MIT License - see the LICENSE.md file for details.
- ServiceStack.OrmLite - The excellent ORM framework
- DuckDB - The fast in-process analytical database
- DuckDB.NET - .NET bindings for DuckDB
- Issues: GitHub Issues
- ServiceStack OrmLite: ServiceStack Support
- DuckDB: DuckDB Discord