Skip to content

SGNL-ai/fabricator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

Β 

History

68 Commits
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 
Β 

Repository files navigation

Fabricator

CI Security Go Report Card License Go Version

A modern, enterprise-grade CSV data generator for system-of-record testing

Fabricator is a powerful command-line tool that generates realistic CSV test data for system-of-record (SOR) platforms. Built with a robust pipeline architecture and comprehensive validation, it transforms YAML definitions into consistent, relationship-aware CSV datasets.

πŸš€ Quick Start

# Download the latest release for your platform
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-linux -o fabricator
chmod +x fabricator

# Generate test data from a YAML definition
./fabricator -f examples/sample.yaml -n 1000 -o ./test-data

# Output:
# βœ“ Generated 16 CSV files with 1000 rows each
# βœ“ All relationships consistent across files
# βœ“ Entity-relationship diagram created

πŸ—οΈ Architecture

β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   YAML Input    │───▢│   Validation     │───▢│   Pipeline      β”‚
β”‚                 β”‚    β”‚                  β”‚    β”‚                 β”‚
β”‚ β€’ Entities      β”‚    β”‚ β€’ JSON Schema    β”‚    β”‚ β€’ Phase 1: IDs  β”‚
β”‚ β€’ Attributes    β”‚    β”‚ β€’ Business Logic β”‚    β”‚ β€’ Phase 2: Rels β”‚
β”‚ β€’ Relationships β”‚    β”‚ β€’ 96% Template   β”‚    β”‚ β€’ Phase 3: Data β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β”‚   Compatibility  β”‚    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜
                       β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜             β”‚
                                                        β–Ό
β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”    β”Œβ”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”
β”‚   CSV Output    │◀───│    Validation    │◀───│  Data Model     β”‚
β”‚                 β”‚    β”‚                  β”‚    β”‚                 β”‚
β”‚ β€’ Multi-file    β”‚    β”‚ β€’ Referential    β”‚    β”‚ β€’ Graph         β”‚
β”‚ β€’ Consistent    β”‚    β”‚   Integrity      β”‚    β”‚ β€’ Entities      β”‚
β”‚ β€’ Realistic     β”‚    β”‚ β€’ Uniqueness     β”‚    β”‚ β€’ Relationships β”‚
β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜    β””β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”€β”˜

✨ Features

πŸ—οΈ Robust Architecture

  • Pipeline-based processing with clean separation of concerns
  • Comprehensive validation with JSON Schema + business logic layers
  • Graph-based dependency resolution with topological sorting

πŸ“Š Data Generation

  • Realistic test data with type-aware field generation
  • Relationship consistency across all CSV files
  • Variable cardinalities (1:1, 1:N, N:1, N:N) with auto-detection
  • Configurable data volume from small samples to large datasets

πŸ” Validation & Quality

  • YAML schema validation using industry-standard JSON Schema
  • Relationship integrity checking across entities
  • Uniqueness constraint validation
  • Production template compatibility (96% of SGNL catalog templates supported)

🎨 User Experience

  • Colorful CLI output with progress indicators
  • SVG diagram generation for entity-relationship visualization
  • Detailed error messages with actionable guidance
  • Multiple operation modes (generate, validate-only, diagram-only)

Installation

Requires Go 1.24.3 or higher.

From Source

# Clone the repository
git clone https://github.com/SGNL-ai/fabricator.git
cd fabricator

# Build the project
make build

The binary will be built to build/fabricator.

From GitHub Releases

Pre-built binaries for Linux, macOS (Intel and Apple Silicon), and Windows are automatically generated for each release and can be downloaded from the GitHub Releases page.

# For macOS Intel
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-macos-intel -o fabricator
chmod +x fabricator
./fabricator --version

# For macOS Apple Silicon (M1/M2/M3)
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-macos-apple-silicon -o fabricator
chmod +x fabricator
./fabricator --version

# For Linux
curl -L https://github.com/SGNL-ai/fabricator/releases/latest/download/fabricator-linux -o fabricator
chmod +x fabricator
./fabricator --version

For Windows users, download the fabricator-windows.exe file from the releases page.

Usage

# Basic usage (short options)
./build/fabricator -f <yaml-file> [-o <dir>] [-n <count>] [-a]

# Basic usage (long options)
./build/fabricator --file <yaml-file> [--output <dir>] [--num-rows <count>] [--auto-cardinality]

# View version information
./build/fabricator -v

Command Line Options

Short Flag Long Flag Description Default
-f --file Path to the YAML definition file (required) -
-o --output Directory to store generated CSV files "output"
-n --num-rows Number of rows to generate for each entity 100
-a --auto-cardinality Enable automatic cardinality detection false
-d --diagram Generate Entity-Relationship diagram true
--validate Validate relationships in CSV files true
--validate-only Validate existing CSV files without generation false
-v --version Display version information -

Examples

# Generate CSV files from example.yaml with 500 rows per entity
./build/fabricator -f example.yaml -n 500 -o data/sgnl

# Using long-form options
./build/fabricator --file example.yaml --num-rows 1000 --output export/data

# Generate CSV files with automatic cardinality detection for relationships
./build/fabricator -f example.yaml -n 200 -a

# Using long-form options with auto-cardinality
./build/fabricator --file example.yaml --num-rows 500 --auto-cardinality --output data/variable-cardinality

# Generate CSV files but disable ER diagram generation
./build/fabricator -f example.yaml --diagram=false

# Validate existing CSV files without generating new data
./build/fabricator -f example.yaml -o existing/csv/data --validate-only

# Validate existing CSV files and generate an ER diagram
./build/fabricator -f example.yaml -o existing/csv/data --validate-only --diagram

YAML Format

The YAML file should define a system-of-record structure, including:

  • Entities with attributes
  • Relationships between entities
  • External IDs that will be used for CSV filenames

Each entity in the YAML file will result in a corresponding CSV file, with the filename derived from the entity's externalId.

Generated Data & Validation

The tool provides the following functionality:

  1. CSV Generation:

    • CSV files named after each entity's external ID (without the namespace prefix)
    • Headers matching the entity's attribute external IDs
    • Consistent data across relationships between entities
    • Variable cardinality relationships (with the -a flag)
    • Realistic test data based on attribute names and types
  2. CSV Validation (via --validate-only):

    • Checks existing CSV files against a YAML definition
    • Validates relationship consistency across entities
    • Verifies unique constraint requirements are met
    • Helpful for validating production or manually-created data exports
    • Use with the existing output directory containing CSV files
  3. Entity-Relationship Diagram (enabled by default):

    • SVG visualization of all entities and their relationships
    • Color-coded entities with attributes listed
    • Primary keys (uniqueId attributes) highlighted
    • Relationship cardinality indicators (1:1, 1:N, N:1, N:M)
    • Can be disabled with --diagram=false
    • Works in both generation and validation-only modes

The data generator intelligently creates appropriate values based on field names:

  • ID fields get unique identifiers
  • Name fields get contextual names based on entity types (e.g., person names for users, company names for organizations)
  • Date fields get properly formatted dates
  • Email fields get valid email addresses
  • Boolean fields get true/false values
  • Numeric fields get appropriate numbers

Relationship Cardinality

When the auto-cardinality feature is enabled (-a flag), Fabricator automatically detects and generates appropriate cardinality for entity relationships:

  • 1:1 relationships - Simple one-to-one mappings between entities
  • 1:N relationships - One entity related to multiple instances of another entity
  • N:1 relationships - Multiple entities related to a single instance of another entity

Cardinality detection is based on:

  1. Entity metadata (primary detection method)

    • Fields with uniqueId: true are used to identify key relationships
    • When a relationship links a unique ID to a non-unique field, cardinality is automatically determined
  2. Field naming patterns (fallback method)

    • Field names ending with "Id" typically indicate N:1 relationships
    • Plural field names or names ending with "Ids" suggest 1:N relationships

Without the -a flag, all relationships default to 1:1 cardinality.

πŸ“ˆ Performance

Fabricator is designed for efficiency and can handle large datasets:

Dataset Size Entities Time Memory
Small 5 <1s <50MB
Medium 16 2-5s <100MB
Large 50 10-30s <500MB

Benchmarks (16 entities, complex relationships):

  • 1,000 rows/entity: ~3 seconds, 16 CSV files
  • 10,000 rows/entity: ~15 seconds, consistent relationships
  • 100,000 rows/entity: ~2 minutes, 1.6M total records

πŸ› οΈ Development

Prerequisites for Development

  • Go 1.23+ (tested with 1.23 and 1.24)
  • golangci-lint for code quality
  • Pre-commit hooks (optional but recommended)

Development Commands

# Run tests
make test

# Run tests with coverage
make coverage

# Format code
make fmt

# Static analysis
make vet

# Run linter
make lint

# Run all checks (CI pipeline)
make ci

# Security scanning
gosec ./...
govulncheck ./...

Contributing

See CONTRIBUTING.md for detailed development guidelines, architecture documentation, and contribution workflow.

License

This project is licensed under the MIT License - see the LICENSE file for details.

About

No description or website provided.

Topics

Resources

License

Contributing

Security policy

Stars

Watchers

Forks

Packages

No packages published

Contributors 3

  •  
  •  
  •