A tool-first architecture that uses AI as a translator rather than an orchestrator. Instead of using LLMs to drive workflow logic, Intentive uses lightweight ONNX models to map human intent to specific tools, with LLM escalation only for unsupported requests.
| Source | Key Point | Quote (Workflow-Orchestration Emphasis) |
|---|---|---|
| Retool - State of AI (H1 2024) | Workflow automation jumped YoY from 13% → 18% | "We saw a big jump in AI used for workflow automation... the fastest-growing category of adoption this year." |
| LangChain - Is LangGraph Used in Production? | Enterprises use LangGraph for reliable/observable workflows | "The key driver for LangGraph adoption is making agents reliable, observable, and controllable in production workflows." |
AI as Translator, Not Orchestrator: Convert human intent into deterministic tool execution paths instead of using LLMs for business logic.
Self-Configuring System: Automatically discovers tools from tools.json configuration, trains ONNX models on discovered capabilities, and provides fast deterministic execution.
- Tool Discovery: Automatically connects to MCP servers and local tools
- Intent Training: Generates training data from discovered tool capabilities
- ONNX Classification: Lightweight models (86MB) classify requests to specific tools
- Direct Execution: Fast tool execution (~10-50ms) without LLM overhead
- LLM Escalation: Only for unsupported or complex requests
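Concretely, these pieces can be pictured as a few small components. The interface shapes below are an illustrative sketch of that decomposition, not Intentive's actual API:

```csharp
// Hypothetical component shapes for the pipeline described above.
// These names are illustrative; they are not Intentive's actual API.
using System.Threading.Tasks;

public interface ITool
{
    string Name { get; }                        // e.g. "GetOrder"
    string[] Capabilities { get; }              // e.g. ["order", "status", "track"]
    Task<string> ExecuteAsync(string request);  // deterministic business logic
}

// Result of local classification: which tool, how sure, how risky.
public record IntentResult(ITool? Tool, float Confidence, float Risk);

public interface IIntentClassifier
{
    // ONNX-backed: embeds the request and scores it against known tools.
    IntentResult Classify(string request);
}

public interface IEscalationClient
{
    // LLM-backed fallback for unsupported or complex requests.
    Task<string> HandleAsync(string request);
}
```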
Try the implementation immediately without any setup using Docker:
# Run with Groq (fast, free API)
docker run -e OPENAI_API_KEY=your-groq-key -e OPENAI_BASE_URL=https://api.groq.com/openai/v1 ghcr.io/katasec/intentive:latest
# Run with OpenAI
docker run -e OPENAI_API_KEY=your-openai-key ghcr.io/katasec/intentive:latest
# Without API key (shows usage)
docker run ghcr.io/katasec/intentive:latest

Test different execution paths:
> what is the status of order 12345? # Deterministic tool execution
> hello # Fast rule-based response
> what's today's date? # LLM escalation
> help me with something complex # Quality-driven refinement
Get a Groq API key (free, fast):
- Visit console.groq.com
- Sign up and create an API key
- Use with the Docker command above
Simple 3-stage pipeline optimized for speed and cost-efficiency:
USER REQUEST
      ↓
1. RULE GATE (~5ms)
   • Pattern matching (hi/hello)
   • Input validation (length)
   • Fast-path responses
      ↓  [if no direct match, continue]
2. ONNX INTENT CLASSIFIER (~50ms)
   • MiniLM-L6-v2 (86MB local model)
   • Embedding-based classification
   • Confidence + risk scoring
      ↓
   HIGH CONFIDENCE                 LOW CONFIDENCE / RISKY
      ↓                                 ↓
3. DETERMINISTIC EXECUTOR          3. LLM ESCALATION (~200-800ms)
   (~10ms)                            • GPT-4o-mini / Groq
   • GetOrder lookup                  • Plan generation
   • Data validation                  • Tool orchestration
   • Fast business logic              • Response composition
      ↓                                 ↓
      └────────────────┬────────────────┘
                       ↓
                RESPONSE TO USER
- Rule Gate (0-5ms): pattern matching for common cases like greetings, plus input validation
- ONNX Classifier (~50ms): 86MB MiniLM model for local intent classification with confidence scoring
- Tool Executor (~10ms): deterministic business logic - order lookups, data queries, calculations
- LLM Escalation (200-800ms): GPT-4o-mini or Groq models for complex reasoning and plan generation
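A minimal sketch of the classification step, assuming Microsoft.ML.OnnxRuntime, a typical MiniLM ONNX export, and per-tool prototype embeddings computed during training. The input names and the Tokenize() placeholder are assumptions about common MiniLM exports, not Intentive's actual code:

```csharp
// Sketch: embedding-based intent classification via Microsoft.ML.OnnxRuntime.
// Input names ("input_ids", "attention_mask") match a typical MiniLM export;
// some exports also require "token_type_ids". Tokenize() is a placeholder.
using System;
using System.Collections.Generic;
using System.Linq;
using Microsoft.ML.OnnxRuntime;
using Microsoft.ML.OnnxRuntime.Tensors;

public sealed class OnnxEmbedder
{
    private readonly InferenceSession _session = new("models/minilm-l6-v2.onnx");

    public float[] Embed(string text)
    {
        long[] ids = Tokenize(text); // placeholder: use a real WordPiece tokenizer
        var inputIds = new DenseTensor<long>(ids, new[] { 1, ids.Length });
        var mask = new DenseTensor<long>(
            Enumerable.Repeat(1L, ids.Length).ToArray(), new[] { 1, ids.Length });

        using var results = _session.Run(new[]
        {
            NamedOnnxValue.CreateFromTensor("input_ids", inputIds),
            NamedOnnxValue.CreateFromTensor("attention_mask", mask),
        });

        // Mean-pool the token vectors into a single sentence embedding.
        var hidden = results.First().AsTensor<float>(); // shape [1, tokens, 384]
        int tokens = hidden.Dimensions[1], dim = hidden.Dimensions[2];
        var pooled = new float[dim];
        for (int t = 0; t < tokens; t++)
            for (int d = 0; d < dim; d++)
                pooled[d] += hidden[0, t, d] / tokens;
        return pooled;
    }

    // Classification = nearest tool prototype by cosine similarity;
    // the similarity score doubles as the confidence value.
    public static (string Tool, float Confidence) Nearest(
        float[] query, Dictionary<string, float[]> prototypes) =>
        prototypes.Select(p => (p.Key, Cosine(query, p.Value))).MaxBy(r => r.Item2);

    private static float Cosine(float[] a, float[] b)
    {
        float dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        { dot += a[i] * b[i]; na += a[i] * a[i]; nb += b[i] * b[i]; }
        return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
    }

    private static long[] Tokenize(string text) =>
        throw new NotImplementedException("Plug in a MiniLM-compatible tokenizer.");
}
```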
Execution paths:
- Fast Path: Rule Gate → Response (greetings, simple queries)
- Deterministic Path: Rule Gate → ONNX → Tool Executor (high-confidence classifications)
- LLM Path: Rule Gate → ONNX → LLM Escalation → Tools (low-confidence or high-risk requests)
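Tied together, choosing between these paths amounts to a small routing decision. A sketch using the interfaces from earlier; the 0.80/0.20 thresholds are made-up illustrative values, not Intentive's actual settings:

```csharp
// Illustrative routing over the interfaces sketched earlier;
// the 0.80 / 0.20 thresholds are invented, not Intentive's settings.
using System.Threading.Tasks;

public sealed class Router
{
    private readonly IIntentClassifier _classifier;
    private readonly IEscalationClient _llm;

    public Router(IIntentClassifier classifier, IEscalationClient llm)
        => (_classifier, _llm) = (classifier, llm);

    public async Task<string> RouteAsync(string request)
    {
        // Fast path: rule gate answers greetings and rejects invalid input.
        if (RuleGate(request) is string fast) return fast;

        // Deterministic path: high confidence, low risk -> run the tool directly.
        IntentResult intent = _classifier.Classify(request);
        if (intent.Tool is not null && intent.Confidence >= 0.80f && intent.Risk <= 0.20f)
            return await intent.Tool.ExecuteAsync(request);

        // LLM path: everything low-confidence or risky is escalated.
        return await _llm.HandleAsync(request);
    }

    private static string? RuleGate(string request) =>
        request.Trim().ToLowerInvariant() switch
        {
            "hi" or "hello" => "Hello! How can I help?",
            { Length: > 2000 } => "Request too long.",
            _ => null,
        };
}
```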
Edit tools.json to define MCP servers and local tools:
{
"mcpServers": [
{
"name": "weather-server",
"enabled": true,
"transport": {
"type": "stdio",
"command": "docker",
"args": ["run", "--rm", "-i", "mcp/weather-server"]
},
"capabilities": ["weather", "forecast", "temperature"]
}
],
"localTools": [
{
"name": "GetOrder",
"enabled": true,
"class": "Intentive.Core.Tools.GetOrderTool",
"capabilities": ["order", "status", "track"]
}
]
}
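Loading this file could look like the sketch below, using System.Text.Json. The record shapes simply mirror the JSON above; they are assumptions, not Intentive's actual configuration types:

```csharp
// Sketch: deserialize tools.json with System.Text.Json.
// These record shapes mirror the JSON above; they are assumptions,
// not Intentive's actual configuration types.
using System;
using System.Collections.Generic;
using System.IO;
using System.Text.Json;

public record ToolsConfig(List<McpServerConfig> McpServers, List<LocalToolConfig> LocalTools);
public record McpServerConfig(string Name, bool Enabled, TransportConfig Transport, string[] Capabilities);
public record TransportConfig(string Type, string Command, string[] Args);
public record LocalToolConfig(string Name, bool Enabled, string Class, string[] Capabilities);

public static class ToolsConfigLoader
{
    private static readonly JsonSerializerOptions Options =
        new() { PropertyNameCaseInsensitive = true }; // "mcpServers" -> McpServers

    public static ToolsConfig Load(string path = "tools.json") =>
        JsonSerializer.Deserialize<ToolsConfig>(File.ReadAllText(path), Options)
            ?? throw new InvalidOperationException($"Could not parse {path}");
}
```

Auto-discover and train from your configured tools: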
# Train ONNX model from discovered tools
./intentive --train-tools
# With custom parameters
./intentive --train-tools --examples 200 --model models/custom.onnx

Training Process:
- 🔍 Discovers tools from tools.json (local + MCP servers)
- 🎨 Generates training examples for each discovered capability
- 🧠 Trains a lightweight ONNX model (2-5 minutes)
- ✅ System ready - now accurately classifies user input to tools
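Conceptually, the example-generation step expands each capability keyword through phrase templates to produce labeled training pairs. The templates in this sketch are invented for illustration; Intentive's actual generation strategy may differ:

```csharp
// Sketch: expand discovered capabilities into labeled training examples.
// The phrase templates are invented for illustration.
using System;
using System.Collections.Generic;

public static class TrainingDataGenerator
{
    private static readonly string[] Templates =
    {
        "what is the {0} of my order?",
        "can you check the {0}?",
        "I need help with {0}",
        "show me the {0} please",
    };

    // Produces (utterance, toolName) pairs for classifier training.
    public static IEnumerable<(string Text, string Label)> Generate(
        string toolName, string[] capabilities, int examplesPerCapability = 50)
    {
        var rng = new Random(42); // fixed seed for reproducible datasets
        foreach (string capability in capabilities)
            for (int i = 0; i < examplesPerCapability; i++)
            {
                string template = Templates[rng.Next(Templates.Length)];
                yield return (string.Format(template, capability), toolName);
            }
    }
}
```

For GetOrder with capabilities ["order", "status", "track"], this yields pairs like ("can you check the status?", "GetOrder").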
Add new capabilities without changing code:
# 1. Add new MCP server to tools.json
vim tools.json
# 2. Retrain system
./intentive --train-tools
# 3. Use new capabilities immediately
./intentive

For development (requires .NET 9.0 SDK):
git clone https://github.com/katasec/intentive.git
cd intentive
make build
# Set API credentials
export OPENAI_API_KEY="your-key"
export OPENAI_BASE_URL="https://api.groq.com/openai/v1" # Optional
# First-time setup: train your model
./intentive --train-tools
# Run the system
make run

Docker is recommended for trying the implementation; see the Quick Start section above.
- ONNX Classification: ~50ms (local inference)
- Rule Gate: <5ms (pattern matching)
- LLM Escalation: 200-800ms (network dependent)
- Memory Usage: ~186MB (base + ONNX model)
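Putting those numbers together: a fully deterministic request costs roughly 5 + 50 + 10 ≈ 65ms end to end, while an escalated request adds the 200-800ms LLM round trip on top of the gate and classifier stages.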
This is an experimental exploration of alternatives to LLM-first architectures. The implementation uses Microsoft Semantic Kernel for LLM integration and Microsoft.ML.OnnxRuntime for local model inference.