Skip to content

intel/enterprise-agent-toolkit

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

4 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Intel® AI for Enterprise Agent Toolkit — Open Source Agentic AI Platform on Kubernetes

License: Apache 2.0 Platform: Intel Xeon Deployment: Kubernetes Gateway: LiteLLM Frameworks: LangGraph · CrewAI · Flowise

Deploy a production-grade, open source agentic AI stack on Intel® Xeon® in hours — not months. A batteries-included Kubernetes platform with API gateway, intelligent inference routing, agent memory, sandboxed code execution, MCP tool integration, and full agent and system observability — all pre-integrated and ready to run.

Keywords: agentic AI platform · enterprise AI agents · open source AI agent stack · Kubernetes AI deployment · LLM inference on CPU · Intel Xeon AI · LiteLLM gateway · AI agent memory RAG · on-premise LLM deployment · private AI infrastructure · AI agent observability · MCP server Kubernetes · vLLM sglang Intel · Enterprise Agnet Toolkit


Table of Contents


Why This Matters

AI agents are not just smarter chatbots — they are autonomous systems that plan, act, and iterate across multi-step tasks. Building a production-ready agentic AI platform from scratch requires solving a dozen infrastructure problems simultaneously: secure API access, LLM inference routing, agent memory and state management, sandboxed code execution, LLM observability, and enterprise governance.

Most organizations waste months stitching together LiteLLM, Redis, pgvector, Kubernetes, Langfuse, and agent sandboxes. This open source toolkit delivers them pre-integrated, pre-optimized for Intel® Xeon®, and deployable in a single automated workflow.

The core insight: GPUs accelerate reasoning. But the value of agentic AI comes from end-to-end task execution — and that control plane (state, tools, policy, coordination) runs efficiently on CPUs. This toolkit is designed around that reality, making it the ideal platform for cost-efficient, on-premise, and private AI deployments.


Business Value at a Glance

Problem What the Toolkit Delivers
Long time-to-production One-click Kubernetes deployment — base stack live in hours
Integration complexity All layers pre-integrated: gateway, memory, sandbox, observability
Vendor lock-in risk Provider Design Pattern — swap or extend any component without re-architecting
Governance & compliance gaps Policy-driven API gateway, sandboxed execution, full audit trail
GPU cost dependency CPU-efficient LLM inference by default on Xeon; GPU addable on demand
Observability blind spots Every token, latency, and agent step captured and queryable
Adoption friction Works with LangGraph, CrewAI, Flowise, Microsoft Agent Framework, and any OpenAI-compatible framework

Architecture

Intel® AI for Enterprise Agent Toolkit Architecture

The toolkit is organized into seven pluggable provider blocks, each with a default implementation and a defined interface. Customers can use the defaults on day one and swap or extend individual providers as their needs evolve — without touching any other layer.


Provider Design Pattern

Each functional block in the toolkit follows a Provider Design Pattern: the block defines the capability contract (what it does), not the implementation (how it does it). The default provider is Intel-validated and optimized for Xeon out of the box. Any standard-compliant alternative can be substituted.

This means:

  • No big-bang replacements. Swap one provider at a time.
  • No lock-in. The toolkit never mandates a proprietary runtime.
  • Incremental adoption. Start with all defaults; introduce your own components over time.

Provider Blocks

Each functional block ships with an Intel-validated default. Because the toolkit is built on the Provider Design Pattern, any block can be swapped or extended independently — without touching the rest of the stack.

API Gateway & Access Control

Unified, secure entry point for all agents, tools, and applications. Routes every request through a policy-driven layer that enforces authentication, authorization, and rate limiting before anything reaches a model or tool. Supports any OpenAI-compatible backend — cloud, on-premise, or hybrid.

Default provider: LiteLLM


Intelligent Inference Routing

Automatically dispatches LLM inference workloads to the optimal compute target based on task intent. Reasoning and planning tasks are routed to GPU or cloud/MaaS endpoints; execution, encoding, and coordination tasks are routed to CPU on Xeon. Rules are configurable per application and model-agnostic — works with OpenAI, Anthropic Claude, Google Gemini, and any self-hosted model.

Default provider: LiteLLM Semantic Router · Intel Native Inferencing Service (vLLM, SGLang)


Actions & Tools

Gives agents the ability to take actions in the world — call APIs, run code, query databases, and invoke external services — through a governed and extensible tool layer. MCP (Model Context Protocol) server templates let teams register domain-specific tools without modifying the core stack.

Default provider: Intel MCP (Model Context Protocol) · ETL / Ingestion Services


Code Execution & Sandbox

Enables agents to safely execute code and run arbitrary actions in isolated environments, preventing runaway agents from affecting production systems. Each sandbox is a fully self-contained ephemeral Kubernetes pod with its own filesystem and process tree. Policy controls limit the blast radius of any single agent action.

Default provider: kubernetes-sigs/agent-sandbox


Memory, State & Context

Maintains agent memory across the full lifecycle — short-term working context within a session and long-term knowledge across sessions and users. Enables RAG (Retrieval-Augmented Generation), semantic search over past interactions, and stateful multi-turn workflows. Agents maintain context across tasks, sessions, and workflows.

Default provider: Redis + RediSearch (short-term) · PostgreSQL + pgvector (long-term)


Observability & Telemetry

Every token, latency measurement, and agent step is captured, stored, and queryable. Provides LLM-native tracing for debugging agent reasoning chains, out-of-the-box dashboards, and seamless integration with existing enterprise monitoring infrastructure — no custom instrumentation required.

Default provider: Langfuse · Prometheus · Grafana · Loki


Deployment & Orchestration

Manages the full lifecycle of the stack — provisioning, scaling, updates, and teardown — across single-node and multi-node environments. Supports both production Kubernetes clusters and lightweight Docker Compose for development and demos, using the same configuration.

Default provider: Kubernetes · Helm · Docker Compose


Infrastructure Core

All provider blocks run on a common infrastructure foundation:

Layer Details
Operating System Ubuntu 24.04 LTS / RHEL
CPU Configuration Intel® Xeon® — NUMA tuning, firmware, drivers, AMX enabled
Compute & Runtime Intel® Xeon® · Intel® AI Accelerators · Intel® Arc™
Supported Model APIs OpenAI · Anthropic Claude · Google Gemini · SambaNova · local models via vLLM / SGLang

Validated Orchestration Frameworks

Framework Style
LangGraph State-machine, multi-step agent workflows
Flowise Low-code visual pipeline builder
CrewAI Role-based multi-agent collaboration
OpenClaw Intel-optimized agentic framework

Example Business Use Cases (Not pre-bundled)

Use Case Description
Coding Agent Automated code generation, review, test, and fix loops — accelerates developer productivity
Deep Research Agent Multi-step web and knowledge base research with citation; replaces hours of analyst work
Flowise Agents Low-code visual agent pipeline construction for business teams
Customer Support Automation RAG-augmented, multi-turn resolution with memory — reduces ticket handle time
Enterprise Data Analyst Natural language to SQL/report across internal databases — self-serve analytics
IT Operations Agent Infrastructure monitoring, alerting, and self-healing workflows — reduces MTTR

Deployment Model

Deployment Options

Option Stack Best For
Multi-node Kubernetes Kubespray + Helm + KubeRay (for Parallel Tool Calling) Production, HA, scale-out
Single-node Kubernetes K8s + Helm Pilot, staging, edge
Docker Compose Docker Compose Development, demos, single machine

How to Deploy

1. Clone the repository

git clone https://github.com/intel/enterprise-agent-toolkit.git
cd enterprise-agent-toolkit

2. Complete prerequisites

Hardware requirements, SSH key setup, and DNS/TLS configuration are covered in Prerequisites Guide.

3. Quick Start

The full step-by-step deployment guide — covering the base stack, semantic routing, Redis, pgvector, and Agent Sandbox — is in Quick Start Guide.

Estimated time to first running agent: hours, not months.


Backend Connection Reference

In-cluster URLs for all shared backends (Redis, Agent Sandbox, pgvector, KubeRay, Flowise). See Backend Connection Reference for the full reference.


Decommission / Reset and Redeploy from Scratch

To tear down the cluster and redeploy from scratch, see Decommission and Redeploy Guide.


Build vs. Build-from-Scratch

Dimension Build from Scratch Intel® Toolkit
Time to production 3–6 months Hours to days
Vendor lock-in None None (Apache 2.0)
Component flexibility Full (you build it) Full (Provider Pattern)
GPU dependency Your choice Optional
Data residency Full control Full control
Enterprise governance You build it Built-in
Intel® Xeon® optimization Manual Native
Total cost of ownership High Low (open source + Xeon)

Frequently Asked Questions

What is the Intel® AI for Enterprise Agent Toolkit? An open source, Kubernetes-based platform that gives enterprises everything they need to deploy, run, and govern production AI agents on Intel® Xeon® — including LLM gateway, inference routing, agent memory, sandboxed execution, MCP tool integration, and full observability.

How is this different from LangChain, LangGraph, or CrewAI? LangGraph, CrewAI, and similar frameworks handle agent orchestration logic. This toolkit provides the production infrastructure underneath them — the API gateway, inference routing, memory store, sandbox, and observability layer. You use LangGraph or CrewAI on top of this toolkit.

Does it require a GPU? No. The toolkit is designed to run AI agent workloads efficiently on Intel® Xeon® CPUs using AMX acceleration. GPU or cloud LLM endpoints (OpenAI, Anthropic, Google) can be added to the same gateway for reasoning-heavy tasks.

Can I use it with OpenAI, Anthropic Claude, or Google Gemini? Yes. The LiteLLM API gateway is fully OpenAI-compatible and supports all major model providers — OpenAI, Anthropic Claude, Google Gemini, SambaNova, and any self-hosted model via vLLM or SGLang.

Is this suitable for on-premise or air-gapped deployments? Yes. The full stack runs on bare metal or private cloud Intel® Xeon® servers with no external dependencies required. Ideal for regulated industries requiring data residency and private AI deployments.

Can I replace individual components? Yes — this is the core design principle. The toolkit is built on the Provider Design Pattern: each block (gateway, memory, observability, etc.) has a default but can be swapped independently without touching other layers.

What agent frameworks does it support? LangGraph, CrewAI, Flowwise, OpenClaw, and any Python-based or OpenAI-compatible orchestration framework.

How do I get started? Clone the repo, run ./deploy-agentic-stack.sh, and follow the Quick Start Guide.

License: Apache 2.0 · Primary languages: Shell (88%), Jinja (10%), Go Template (2%)


Links


Intel®, Intel® Xeon®, and Intel® Arc™ are registered trademarks of Intel Corporation or its subsidiaries. Licensed under the Apache License, Version 2.0.

About

A comprehensive, enterprise-ready toolkit for deploying Agentic AI systems on Intel® Xeon processors and Intel accelerators. Built for organizations that need to move from prototype to production quickly, the toolkit bundles all the components required to operate trustworthy, observable, and governable AI agents at Scale

Resources

License

Code of conduct

Contributing

Security policy

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors