Unified VFS-backed artifact and workspace storage with scope-based isolation—built for AI apps and MCP servers
CHUK Artifacts provides a unified namespace architecture where everything—blobs (artifacts) and workspaces (file collections)—is VFS-backed. Store ephemeral session files, persistent user projects, and shared resources with automatic access control, checkpoints, and a clean API that works the same for single files and entire directory trees.
The v0.9 architecture unifies blobs and workspaces under a single API:
- Blobs = Single-file VFS-backed namespaces (artifacts, documents, data)
- Workspaces = Multi-file VFS-backed namespaces (projects, collections, repos)
- Same API for both types (only the `type` parameter differs)
- Same features for both (checkpoints, scoping, VFS access, metadata)
from chuk_artifacts import ArtifactStore, NamespaceType, StorageScope
async with ArtifactStore() as store:
# Create a blob (single file)
blob = await store.create_namespace(
type=NamespaceType.BLOB,
scope=StorageScope.SESSION
)
await store.write_namespace(blob.namespace_id, data=b"Hello, World!")
# Create a workspace (file tree)
workspace = await store.create_namespace(
type=NamespaceType.WORKSPACE,
name="my-project",
scope=StorageScope.USER,
user_id="alice"
)
# Write files to workspace
await store.write_namespace(workspace.namespace_id, path="/main.py", data=b"print('hello')")
await store.write_namespace(workspace.namespace_id, path="/config.json", data=b'{"version": "1.0"}')
# Get VFS for advanced operations (works for BOTH!)
vfs = store.get_namespace_vfs(workspace.namespace_id)
files = await vfs.ls("/") # ['.workspace', 'main.py', 'config.json']
# Create checkpoint (works for BOTH!)
checkpoint = await store.checkpoint_namespace(workspace.namespace_id, name="v1.0")

One API. Two types. Zero complexity.
CHUK Artifacts is the unified storage substrate for the entire CHUK AI stack:
chuk-ai-planner → uses artifacts as workspaces for multi-step plans
chuk-mcp-server → exposes artifacts as remote filesystems via MCP
chuk-virtual-fs → underlying filesystem engine for all namespaces
chuk-sessions → session-based scope isolation for namespaces
Why this matters:
- Consistent storage across all CHUK components
- Unified access patterns for AI tools, planners, and MCP servers
- Automatic isolation prevents cross-session data leakage
- Production-ready from development to deployment
- Why This Exists
- Architecture
- Install
- Quick Start
- Core Concepts
- API Reference
- VFS Operations
- Examples
- Advanced Features
- Legacy Compatibility
- Configuration
- Testing
Most platforms offer object storage (S3, filesystem)—but not a unified namespace architecture with automatic access control.
CHUK Artifacts provides:
- ✅ Unified API - Same code for single files (blobs) and file trees (workspaces)
- ✅ Three storage scopes - SESSION (ephemeral), USER (persistent), SANDBOX (shared)
- ✅ VFS-backed - Full filesystem operations on all namespaces
- ✅ Checkpoints - Snapshot and restore for both blobs and workspaces
- ✅ Grid architecture - Predictable, auditable storage organization
- ✅ Access control - Automatic scope-based isolation
- ✅ Provider-agnostic - Memory, Filesystem, SQLite, S3—same API
- ✅ Async-first - Built for FastAPI, MCP servers, modern Python
Use cases:
- 📝 AI chat applications (session artifacts + user documents)
- 🔧 MCP servers (tool workspaces + shared templates)
- 🚀 CI/CD systems (build artifacts + project workspaces)
- 📊 Data platforms (user datasets + shared libraries)
What you get with raw storage:
- S3 → objects (not namespaces)
- Filesystem → files (not isolated storage units)
- SQLite → durability (not structured filesystem trees)
What CHUK Artifacts adds:
| Feature | S3 Alone | Filesystem Alone | CHUK Artifacts |
|---|---|---|---|
| Namespace abstraction | ❌ | ❌ | ✅ |
| Scope-based isolation | ❌ | ❌ | ✅ |
| Unified API across backends | ❌ | ❌ | ✅ |
| Checkpoints/snapshots | ❌ | ❌ | ✅ |
| Grid path organization | Manual | Manual | Automatic |
| VFS operations | ❌ | Partial | ✅ Full |
| Session lifecycle | Manual | Manual | Automatic |
CHUK Artifacts provides:
- VFS + scopes + namespaces + checkpoints + unified API + grid paths
This is fundamentally more powerful than raw storage.
┌─────────────────────────────────────────────────────────────────┐
│ Your Application │
└────────────────────────────┬────────────────────────────────────┘
│
│ create_namespace(type=BLOB|WORKSPACE)
│ write_namespace(), read_namespace()
│ checkpoint_namespace(), restore_namespace()
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ ArtifactStore │
│ (Unified Namespace Management) │
│ │
│ • Manages both BLOB and WORKSPACE namespaces │
│ • Enforces scope-based access control │
│ • Provides VFS access to all namespaces │
│ • Handles checkpoints and restoration │
└──────────┬────────────────────────────────────┬─────────────────┘
│ │
│ session management │ VFS operations
▼ ▼
┌────────────┐ ┌─────────────────────────┐
│ Sessions │ │ chuk-virtual-fs │
│ (Memory/ │ │ (Unified VFS Layer) │
│ Redis) │ │ │
└────────────┘ │ • ls(), mkdir(), rm() │
│ • cp(), mv(), find() │
│ • Metadata management │
│ • Batch operations │
└────────┬────────────────┘
│
│ provider calls
▼
┌────────────────────────────┐
│ Storage Providers │
│ │
│ Memory │ Filesystem │ S3 │ │
│ SQLite │
└─────────────┬───────────────┘
│
▼
grid/{sandbox}/{scope}/{namespace_id}/
- Everything is VFS - Both blobs and workspaces are VFS-backed
- Unified API - One set of methods for all namespace types
- Scope-based isolation - SESSION, USER, and SANDBOX scopes
- Grid organization - Predictable, auditable storage paths
- Provider-agnostic - Swap storage backends via configuration
pip install chuk-artifacts

Dependencies:
- `chuk-virtual-fs` - VFS layer (automatically installed)
- `chuk-sessions` - Session management (automatically installed)
Optional:
- `redis` - For Redis session provider
- `boto3` - For S3 storage backend
- `ibm-cos-sdk` - For IBM Cloud Object Storage
from chuk_artifacts import ArtifactStore, NamespaceType, StorageScope
store = ArtifactStore()
# Create a blob namespace (single file)
blob = await store.create_namespace(
type=NamespaceType.BLOB,
scope=StorageScope.SESSION
)
# Write data to the blob
await store.write_namespace(blob.namespace_id, data=b"My important data")
# Read data back
data = await store.read_namespace(blob.namespace_id)
print(data)  # b"My important data"

# Create a workspace namespace (file tree)
workspace = await store.create_namespace(
type=NamespaceType.WORKSPACE,
name="my-project",
scope=StorageScope.USER,
user_id="alice"
)
# Write multiple files
await store.write_namespace(workspace.namespace_id, path="/README.md", data=b"# My Project")
await store.write_namespace(workspace.namespace_id, path="/src/main.py", data=b"print('hello')")
# Get VFS for advanced operations
vfs = store.get_namespace_vfs(workspace.namespace_id)
# List files
files = await vfs.ls("/") # ['.workspace', 'README.md', 'src']
src_files = await vfs.ls("/src") # ['main.py']
# Copy files
await vfs.cp("/src/main.py", "/src/backup.py")
# Search for files
python_files = await vfs.find(pattern="*.py", path="/", recursive=True)

# Create a checkpoint
checkpoint = await store.checkpoint_namespace(
workspace.namespace_id,
name="initial-version",
description="First working version"
)
# Make changes
await store.write_namespace(workspace.namespace_id, path="/README.md", data=b"# Updated")
# Restore from checkpoint
await store.restore_namespace(workspace.namespace_id, checkpoint.checkpoint_id)

A namespace is a VFS-backed storage unit. There are two types:
| Type | Description | Use Cases |
|---|---|---|
| BLOB | Single file at `/_data` | Artifacts, documents, data files, caches |
| WORKSPACE | Full file tree | Projects, collections, code repos, datasets |
Both types:
- Use the same unified API
- Support checkpoints
- Have VFS access
- Support all three scopes
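The "blob is just a single-file namespace" idea can be sketched without the library at all. The snippet below is an illustrative toy model, not chuk-artifacts code: `MiniNamespace` and `BLOB_DATA_PATH` are hypothetical names, and the only point is that when the single-file path is fixed, one write/read API serves both types.

```python
# Toy model of the unified namespace idea -- NOT the chuk-artifacts
# implementation. A blob is a namespace whose one file lives at a
# fixed path, so blobs and workspaces share the same write/read shape.

BLOB_DATA_PATH = "/_data"  # fixed single-file path, per the table above


class MiniNamespace:
    """Dict-backed namespace: maps path -> bytes."""

    def __init__(self, kind):
        self.kind = kind   # "blob" or "workspace"
        self.files = {}

    def write(self, data, path=None):
        # Blobs default to the fixed path; workspaces need an explicit one.
        if path is None:
            if self.kind != "blob":
                raise ValueError("workspaces require an explicit path")
            path = BLOB_DATA_PATH
        self.files[path] = data

    def read(self, path=None):
        return self.files[path if path is not None else BLOB_DATA_PATH]


blob = MiniNamespace("blob")
blob.write(b"Hello")                        # no path needed for a blob
ws = MiniNamespace("workspace")
ws.write(b"print('hi')", path="/main.py")   # path required for a workspace

print(blob.read())       # b'Hello'
print(sorted(ws.files))  # ['/main.py']
```

This is why the real API needs only one optional `path` argument to cover both types.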
Every namespace has a scope that determines its lifecycle and access:
| Scope | Lifecycle | Access | Grid Path | Use Cases |
|---|---|---|---|---|
| SESSION | Ephemeral (session lifetime) | Same session only | `grid/{sandbox}/sess-{session_id}/{ns_id}` | Temporary files, caches, current work |
| USER | Persistent | Same user only | `grid/{sandbox}/user-{user_id}/{ns_id}` | User projects, personal docs, settings |
| SANDBOX | Persistent | All users | `grid/{sandbox}/shared/{ns_id}` | Templates, shared libraries, documentation |
Example:
# Session-scoped (ephemeral)
temp_blob = await store.create_namespace(
type=NamespaceType.BLOB,
scope=StorageScope.SESSION
)
# User-scoped (persistent)
user_project = await store.create_namespace(
type=NamespaceType.WORKSPACE,
name="my-docs",
scope=StorageScope.USER,
user_id="alice"
)
# Sandbox-scoped (shared)
shared_templates = await store.create_namespace(
type=NamespaceType.WORKSPACE,
name="templates",
scope=StorageScope.SANDBOX
)

All namespaces are organized in a grid structure:
grid/
├── {sandbox_id}/
│ ├── sess-{session_id}/ # SESSION scope
│ │ ├── {namespace_id}/ # Blob or workspace
│ │ │ ├── _data # For blobs
│ │ │ ├── _meta.json # For blobs
│ │ │ ├── file1.txt # For workspaces
│ │ │ └── ...
│ ├── user-{user_id}/ # USER scope
│ │ └── {namespace_id}/
│ └── shared/ # SANDBOX scope
│ └── {namespace_id}/
Benefits:
- Predictable paths
- Easy auditing
- Clear isolation
- Efficient listing
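The grid layout above is mechanical enough to write down as a function. This is a hypothetical helper for illustration (the library computes these prefixes internally); it just makes the scope-to-path mapping from the tree explicit.

```python
# Illustrative helper -- chuk-artifacts derives these paths internally.
# Maps (sandbox, scope, namespace) to the grid storage prefix shown above.

def grid_path(sandbox_id, scope, namespace_id, session_id=None, user_id=None):
    """Return the grid prefix for a namespace under the given scope."""
    if scope == "SESSION":
        segment = f"sess-{session_id}"   # ephemeral, per-session
    elif scope == "USER":
        segment = f"user-{user_id}"      # persistent, per-user
    elif scope == "SANDBOX":
        segment = "shared"               # persistent, shared by all users
    else:
        raise ValueError(f"unknown scope: {scope}")
    return f"grid/{sandbox_id}/{segment}/{namespace_id}/"


print(grid_path("app1", "SESSION", "ns-42", session_id="s-9"))
# grid/app1/sess-s-9/ns-42/
print(grid_path("app1", "SANDBOX", "ns-7"))
# grid/app1/shared/ns-7/
```

Because the prefix is a pure function of (sandbox, scope, namespace), listing or auditing any slice of storage is a prefix scan.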
Everything works for both namespace types across all scopes:
| Feature | BLOB | WORKSPACE | SESSION | USER | SANDBOX |
|---|---|---|---|---|---|
| VFS access | ✅ | ✅ | ✅ | ✅ | ✅ |
| Checkpoints/restore | ✅ | ✅ | ✅ | ✅ | ✅ |
| Metadata (custom) | ✅ | ✅ | ✅ | ✅ | ✅ |
| Batch operations | ✅ | ✅ | ✅ | ✅ | ✅ |
| Search/find | ✅ | ✅ | ✅ | ✅ | ✅ |
| Grid placement | Auto | Auto | Auto | Auto | Auto |
| Access control | Auto | Auto | Auto | Auto | Auto |
| TTL expiration | ✅ | ✅ | ✅ | ❌ | ❌ |
Key insight: The unified architecture means you get full feature parity regardless of namespace type or scope.
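The access rules in the scope table reduce to a small decision function. The sketch below is illustrative only (the real enforcement lives inside `ArtifactStore`); `can_access` and its parameter names are made up for this example.

```python
# Sketch of the scope-isolation rule from the table -- illustrative,
# not the library's access-control code.

def can_access(scope, owner_session=None, owner_user=None,
               requester_session=None, requester_user=None):
    """Apply the access rule for each storage scope."""
    if scope == "SESSION":
        return requester_session == owner_session  # same session only
    if scope == "USER":
        return requester_user == owner_user        # same user only
    if scope == "SANDBOX":
        return True                                # shared with all users
    raise ValueError(f"unknown scope: {scope}")


print(can_access("SESSION", owner_session="s1", requester_session="s2"))  # False
print(can_access("USER", owner_user="alice", requester_user="alice"))     # True
print(can_access("SANDBOX"))                                              # True
```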
# Create namespace
namespace = await store.create_namespace(
type: NamespaceType, # BLOB or WORKSPACE
scope: StorageScope, # SESSION, USER, or SANDBOX
name: str | None = None, # Optional name (workspaces only)
user_id: str | None = None, # Required for USER scope
ttl_hours: int | None = None, # Session TTL (SESSION scope only)
provider_type: str = "vfs-memory", # VFS provider
provider_config: dict | None = None # Provider configuration
) -> NamespaceInfo
# Write data
await store.write_namespace(
namespace_id: str,
data: bytes,
path: str | None = None # Required for workspaces, optional for blobs
)
# Read data
data: bytes = await store.read_namespace(
namespace_id: str,
path: str | None = None # Required for workspaces, optional for blobs
)
# Get VFS access
vfs: AsyncVirtualFileSystem = store.get_namespace_vfs(namespace_id: str)
# List namespaces
namespaces: list[NamespaceInfo] = store.list_namespaces(
session_id: str | None = None,
user_id: str | None = None,
type: NamespaceType | None = None
)
# Destroy namespace
await store.destroy_namespace(namespace_id: str)

# Create checkpoint
checkpoint: CheckpointInfo = await store.checkpoint_namespace(
namespace_id: str,
name: str,
description: str | None = None
)
# List checkpoints
checkpoints: list[CheckpointInfo] = await store.list_checkpoints(
namespace_id: str
)
# Restore from checkpoint
await store.restore_namespace(
namespace_id: str,
checkpoint_id: str
)
# Delete checkpoint
await store.delete_checkpoint(
namespace_id: str,
checkpoint_id: str
)

All namespaces provide full VFS access:
vfs = store.get_namespace_vfs(namespace_id)
# File operations
await vfs.write_file(path, data)
data = await vfs.read_file(path)
await vfs.rm(path)
await vfs.cp(src, dst)
await vfs.mv(src, dst)
exists = await vfs.exists(path)
# Directory operations
await vfs.mkdir(path)
await vfs.rmdir(path)
await vfs.cd(path)
files = await vfs.ls(path)
is_dir = await vfs.is_dir(path)
is_file = await vfs.is_file(path)
# Metadata
await vfs.set_metadata(path, metadata)
metadata = await vfs.get_metadata(path)
node_info = await vfs.get_node_info(path)
# Search
results = await vfs.find(pattern="*.py", path="/", recursive=True)
# Batch operations
await vfs.batch_create_files(file_specs)
data_dict = await vfs.batch_read_files(paths)
await vfs.batch_write_files(file_data)
await vfs.batch_delete_paths(paths)
# Text/Binary
await vfs.write_text(path, text, encoding="utf-8")
text = await vfs.read_text(path, encoding="utf-8")
await vfs.write_binary(path, data)
data = await vfs.read_binary(path)
# Stats
stats = await vfs.get_storage_stats()
provider = await vfs.get_provider_name()

See examples/05_advanced_vfs_features.py for comprehensive VFS examples.
We provide 9 comprehensive examples covering all features:
- 00_quick_start.py - Quick introduction to unified API
- 01_blob_namespace_basics.py - Blob operations
- 02_workspace_namespace_basics.py - Workspace operations
- 03_unified_everything_is_vfs.py - Unified architecture
- 04_legacy_api_compatibility.py - Legacy compatibility
- 05_advanced_vfs_features.py - Advanced VFS features
- 06_session_isolation.py - Session isolation and scoping
- 07_large_files_streaming.py - Large file handling
- 08_batch_operations.py - Batch operations
Run any example:
python examples/00_quick_start.py
python examples/02_workspace_namespace_basics.py
python examples/05_advanced_vfs_features.py

See examples/README.md for complete documentation.
Create snapshots of any namespace (blob or workspace):
# Create checkpoint
cp1 = await store.checkpoint_namespace(workspace.namespace_id, name="v1.0")
# Make changes...
await store.write_namespace(workspace.namespace_id, path="/new_file.txt", data=b"new")
# Restore to checkpoint
await store.restore_namespace(workspace.namespace_id, cp1.checkpoint_id)

Process multiple files efficiently:
vfs = store.get_namespace_vfs(workspace.namespace_id)
# Batch create with metadata
file_specs = [
{"path": "/file1.txt", "content": b"data1", "metadata": {"tag": "important"}},
{"path": "/file2.txt", "content": b"data2", "metadata": {"tag": "draft"}},
]
await vfs.batch_create_files(file_specs)
# Batch read
data = await vfs.batch_read_files(["/file1.txt", "/file2.txt"])
# Batch delete
await vfs.batch_delete_paths(["/file1.txt", "/file2.txt"])

Attach rich metadata to files:
await vfs.set_metadata("/document.pdf", {
"author": "Alice",
"tags": ["important", "reviewed"],
"custom": {"project_id": 123}
})
metadata = await vfs.get_metadata("/document.pdf")

Find files by pattern:
# Find all Python files
py_files = await vfs.find(pattern="*.py", path="/", recursive=True)
# Find specific file
results = await vfs.find(pattern="config.json", path="/")

The legacy store() and retrieve() APIs still work:
# Legacy API (still supported)
artifact_id = await store.store(
b"data",
mime="text/plain",
summary="My artifact"
)
data = await store.retrieve(artifact_id)
# But unified API is recommended for new code
blob = await store.create_namespace(type=NamespaceType.BLOB)
await store.write_namespace(blob.namespace_id, data=b"data")
data = await store.read_namespace(blob.namespace_id)

See examples/04_legacy_api_compatibility.py for details.
Choose the right storage backend for your use case:
Development / Testing:
# Memory provider - instant, ephemeral
store = ArtifactStore()  # Uses vfs-memory by default

Small Deployments / Edge:
# Filesystem provider with container volumes
export ARTIFACT_PROVIDER=vfs-filesystem
export VFS_ROOT_PATH=/data/artifacts
# Good for: Docker containers, edge devices, local-first apps

Portable / Embedded:
# SQLite provider - single file, queryable
export ARTIFACT_PROVIDER=vfs-sqlite
export SQLITE_DB_PATH=/data/artifacts.db
# Good for: Desktop apps, portable storage, offline-first

Production / Cloud:
# S3 provider with Redis sessions
export ARTIFACT_PROVIDER=vfs-s3
export SESSION_PROVIDER=redis
export AWS_S3_BUCKET=my-artifacts
export REDIS_URL=redis://prod-redis:6379/0
# Good for: Multi-tenant SaaS, distributed systems, high scale

Hybrid Deployments:
# Different scopes, different backends
# - SESSION: vfs-memory (ephemeral, fast)
# - USER: vfs-filesystem (persistent, local)
# - SANDBOX: vfs-s3 (persistent, shared, cloud)
# Configure per namespace:
await store.create_namespace(
type=NamespaceType.BLOB,
scope=StorageScope.SESSION,
provider_type="vfs-memory" # Fast ephemeral
)
await store.create_namespace(
type=NamespaceType.WORKSPACE,
scope=StorageScope.USER,
provider_type="vfs-s3" # Persistent cloud
)

Configure via environment variables:
# Memory (default, for development)
export ARTIFACT_PROVIDER=vfs-memory
# Filesystem (for local persistence)
export ARTIFACT_PROVIDER=vfs-filesystem
# SQLite (for portable database)
export ARTIFACT_PROVIDER=vfs-sqlite
# S3 (for production)
export ARTIFACT_PROVIDER=vfs-s3
export AWS_ACCESS_KEY_ID=your_key
export AWS_SECRET_ACCESS_KEY=your_secret
export AWS_DEFAULT_REGION=us-east-1

# Memory (default)
export SESSION_PROVIDER=memory
# Redis (for production)
export SESSION_PROVIDER=redis
export REDIS_URL=redis://localhost:6379/0

from chuk_artifacts.config import configure_memory, configure_s3
# Development
config = configure_memory()
store = ArtifactStore(**config)
# Production
config = configure_s3(
bucket="my-artifacts",
region="us-east-1"
)
store = ArtifactStore(**config)

CHUK Artifacts is designed for production performance:
Memory Provider:
- Nanosecond to microsecond operations
- Zero I/O overhead
- Perfect for testing and development
Filesystem Provider:
- Depends on OS filesystem (typically microseconds to milliseconds)
- Uses async I/O for non-blocking operations
- Good for local deployments
S3 Provider:
- Uses streaming + zero-copy writes
- Parallel uploads for large files
- Production-proven at scale
SQLite Provider:
- Fast for small to medium workspaces
- Queryable storage with indexes
- Good for embedded/desktop apps
Checkpoints:
- Use copy-on-write semantics where supported
- Snapshot-based for minimal overhead
- Incremental when possible
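To make the snapshot idea concrete, here is a minimal sketch of checkpoint/restore over a dict-backed file map. It is a conceptual illustration only, not the library's snapshot code: copying the mapping (while sharing the immutable byte payloads) stands in for the copy-on-write behavior described above.

```python
# Conceptual checkpoint/restore sketch -- NOT chuk-artifacts internals.
# A checkpoint freezes the current path->bytes mapping; the bytes objects
# themselves are shared, not duplicated, so snapshots are cheap.

class CowFiles:
    def __init__(self):
        self.files = {}        # path -> bytes (live state)
        self.checkpoints = {}  # name -> frozen mapping

    def write(self, path, data):
        self.files[path] = data

    def checkpoint(self, name):
        # Shallow copy: O(entries); payloads are shared with the live state.
        self.checkpoints[name] = dict(self.files)

    def restore(self, name):
        # Replace live state with a copy of the frozen mapping.
        self.files = dict(self.checkpoints[name])


ws = CowFiles()
ws.write("/a.txt", b"v1")
ws.checkpoint("v1.0")
ws.write("/a.txt", b"v2")   # change after the snapshot
ws.restore("v1.0")
print(ws.files["/a.txt"])   # b'v1'
```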
VFS Layer:
- Batch operations reduce round trips
- Streaming for large files (no memory buffering)
- Provider-specific optimizations
Benchmarks (from examples):
- Batch operations: 1.7x faster than individual operations
- Large file writes: 577 MB/s (memory provider)
- Large file reads: 1103 MB/s (memory provider)
- Batch dataset creation: 250+ files/sec
See examples/08_batch_operations.py and examples/07_large_files_streaming.py for detailed benchmarks.
CHUK Artifacts includes 778 passing tests with 92% coverage:
# Run all tests
pytest
# Run with coverage
pytest --cov=chuk_artifacts --cov-report=html
# Run specific test file
pytest tests/test_namespace.py -v

Memory provider makes testing instant:
import pytest
from chuk_artifacts import ArtifactStore, NamespaceType, StorageScope
@pytest.mark.asyncio
async def test_my_feature():
store = ArtifactStore() # Uses memory provider by default
blob = await store.create_namespace(
type=NamespaceType.BLOB,
scope=StorageScope.SESSION
)
await store.write_namespace(blob.namespace_id, data=b"test")
data = await store.read_namespace(blob.namespace_id)
assert data == b"test"

- Examples - 9 comprehensive examples
- VFS API Reference - Quick VFS API guide
MIT License - see LICENSE for details.
Contributions welcome! Please:
- Fork the repository
- Create a feature branch
- Add tests for new functionality
- Ensure all tests pass (`pytest`)
- Run linters (`make check`)
- Submit a pull request
- Issues: GitHub Issues
- Documentation: examples/
- Discussions: GitHub Discussions
Built with ❤️ for AI applications and MCP servers