Commit 73707af

JAORMX and claude authored
Add MCPGroup CRD proposal for Kubernetes operator (#2207)
This proposal introduces MCPGroup support to the Kubernetes operator, enabling Virtual MCP Server and logical grouping of MCPServer resources.

Key design decisions:
- Explicit groupRef field in MCPServer spec (follows K8s naming conventions)
- Simple MCPGroup CRD with minimal spec (description) and status tracking
- Namespace-scoped groups for security/isolation
- No webhooks - controller-based validation for simplicity
- Optional group membership (unlike CLI where groups are required)

Design rationale:
- MCPGroup as first-class construct (not just labels) enables meta-mcp and virtual MCP to discover and aggregate backend servers
- Provides seamless CLI-to-Kubernetes transition with consistent API
- Explicit group lifecycle management and validation
- Foundation for growing ecosystem of ToolHive constructs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-authored-by: Claude <[email protected]>
1 parent 71e2934 commit 73707af

1 file changed: +320 -0 lines changed
@@ -0,0 +1,320 @@
# MCPGroup CRD for Kubernetes Operator

## Problem Statement

The CLI supports runtime groups for organizing MCP servers, but this is missing in Kubernetes. The Virtual MCP Server feature (PR #2106) requires groups to discover and aggregate backend servers. Without groups, Kubernetes users cannot use the Virtual MCP or organize their servers logically.

## Goals

- Add MCPGroup support to Kubernetes matching CLI runtime group behavior
- Enable Virtual MCP Server to discover servers in a group
- Maintain API consistency between CLI and Kubernetes
- Keep implementation simple and predictable

## Non-Goals

- Registry groups (CLI-only feature)
- Cross-namespace groups
- Multi-group membership per server
- Client configuration management (not applicable in Kubernetes)

## Design

### Design Decision: MCPGroup CRD vs Labels/Annotations

**Question:** Could we use labels/annotations on MCPServer instead of creating an MCPGroup CRD?

**Answer:** We need MCPGroup as a first-class construct for several reasons:

1. **Meta-MCP and Virtual MCP requirements**: These features need to aggregate multiple MCP servers. They need a way to:
   - Discover which servers belong to a group
   - Reference groups in their configuration
   - Watch for group membership changes

2. **Seamless CLI-to-Kubernetes transition**: The CLI has an explicit Group concept that workloads belong to. Users migrating from CLI to Kubernetes expect the same mental model and API patterns.

3. **Growing ecosystem of constructs**: As we build more features on top of ToolHive (meta-mcp, virtual MCP, future aggregation patterns), we need a consistent way to represent server collections.

4. **Group as an explicit concept**: Labels are meant for flexible, ad-hoc categorization. Groups are a core organizational concept in ToolHive's architecture, deserving explicit representation.

While labels could technically provide grouping, they lack:
- Discoverability (no list of available groups without scanning all servers)
- A place for group-level metadata or status
- Explicit lifecycle management
- Ability to validate references before use

**Conclusion:** MCPGroup CRD provides the foundation for meta-mcp, virtual MCP, and future aggregation features while maintaining consistency with CLI semantics.

### MCPGroup CRD

Simple CRD for grouping servers:

```yaml
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
kind: MCPGroup
metadata:
  name: engineering-team
  namespace: default
spec:
  # Optional human-readable description
  description: "Engineering team MCP servers"

status:
  # Number of servers in this group
  serverCount: 3

  # List of server names for quick reference
  servers:
    - github-server
    - jira-server
    - slack-server

  phase: Ready
  conditions:
    - type: Ready
      status: "True"
      lastTransitionTime: "2025-10-15T10:30:00Z"
```

### MCPServer Spec Addition

Add explicit group field to MCPServer:

```yaml
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
kind: MCPServer
metadata:
  name: github-server
  namespace: default
spec:
  # Existing fields...
  image: ghcr.io/stackloklabs/github-server:latest

  # New: explicit group membership
  groupRef: engineering-team
```

**Rationale for explicit groupRef field:**
- Matches CLI behavior (workload has `Group` field)
- Follows Kubernetes naming conventions for references (`groupRef` instead of `group`)
- Simple and predictable
- Easy to query: `list MCPServers where spec.groupRef = X`
- No confusion about membership
- API consistency with CLI

105+
### API Consistency
106+
107+
CLI runtime groups store membership on the workload:
108+
```go
109+
type Workload struct {
110+
Name string
111+
Group string // Explicit group membership
112+
}
113+
```
114+
115+
Kubernetes should match this pattern:
116+
```go
117+
type MCPServerSpec struct {
118+
// Existing fields...
119+
120+
// GroupRef is the name of the MCPGroup this server belongs to
121+
// +optional
122+
GroupRef string `json:"groupRef,omitempty"`
123+
}
124+
```
125+
126+
### Controller Behavior
127+
128+
**MCPGroup Controller:**
129+
- Watches MCPGroup and MCPServer resources
130+
- Updates `status.servers` list when servers join/leave group
131+
- Updates `status.serverCount`
132+
- Validates referenced group exists when MCPServer is created
133+
134+
**MCPServer Controller:**
135+
- Existing reconciliation logic
136+
- Validates `spec.groupRef` references an existing MCPGroup (if specified)
137+
- Adds condition if group reference is invalid
138+
139+
### Discovery API
140+
141+
Virtual MCP (and other features) can discover servers in a group:
142+
143+
```go
144+
// List all MCPServers in a group
145+
servers, err := clientset.McpV1alpha1().MCPServers(namespace).List(ctx, metav1.ListOptions{
146+
FieldSelector: "spec.groupRef=engineering-team",
147+
})
148+
```
149+
150+
## Implementation
151+
152+
### Phase 1: Core CRD
153+
1. Add `GroupRef` field to MCPServer spec
154+
2. Create MCPGroup CRD types
155+
3. Implement MCPGroup controller
156+
4. Add field selector support for group queries
157+
5. Update CRD manifests and documentation
158+
159+
### Phase 2: Integration
160+
1. Virtual MCP integration with groups
161+
2. kubectl plugin support
162+
163+
## Examples
164+
165+
### Standalone MCPServer (No Group)
166+
167+
MCPServers can run without belonging to a group:
168+
169+
```yaml
170+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
171+
kind: MCPServer
172+
metadata:
173+
name: standalone-server
174+
namespace: default
175+
spec:
176+
image: ghcr.io/stackloklabs/filesystem:latest
177+
# No groupRef - server runs independently
178+
```
179+
180+
### MCPServer with Group Membership
181+
182+
```yaml
183+
# Create group
184+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
185+
kind: MCPGroup
186+
metadata:
187+
name: engineering-team
188+
namespace: default
189+
spec:
190+
description: "Engineering team servers"
191+
---
192+
# Create servers in group
193+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
194+
kind: MCPServer
195+
metadata:
196+
name: github-server
197+
spec:
198+
image: ghcr.io/stackloklabs/github:latest
199+
groupRef: engineering-team
200+
---
201+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
202+
kind: MCPServer
203+
metadata:
204+
name: jira-server
205+
spec:
206+
image: ghcr.io/company/jira:latest
207+
groupRef: engineering-team
208+
```
209+
210+
### Virtual MCP Usage
211+
212+
```yaml
213+
# Virtual MCP references the group
214+
# NOTE: This is an example of future MCPVirtualServer API (not yet implemented)
215+
apiVersion: mcp.toolhive.stacklok.io/v1alpha1
216+
kind: MCPVirtualServer
217+
metadata:
218+
name: engineering-virtual
219+
spec:
220+
# References existing group
221+
groupRef: engineering-team
222+
223+
# Virtual MCP configuration
224+
aggregation:
225+
conflictResolution: prefix
226+
```
227+
228+
### Querying Servers in Group
229+
230+
```bash
231+
# List all servers in a group
232+
kubectl get mcpservers -n default --field-selector spec.groupRef=engineering-team
233+
234+
# Check group status
235+
kubectl get mcpgroup engineering-team -o jsonpath='{.status.servers}'
236+
```
237+
238+
## Migration from CLI
239+
240+
CLI groups and Kubernetes groups are separate concepts:
241+
- **CLI groups**: Local runtime groups (`.toolhive/` directory)
242+
- **K8s groups**: Namespace-scoped groups (etcd)
243+
244+
**Key differences from CLI:**
245+
- In CLI: All servers must belong to a group (defaults to "default" group if not specified)
246+
- In K8s: Servers can optionally belong to a group (`spec.groupRef` is optional)
247+
248+
No automatic migration - users manually create MCPGroup resources and set `spec.groupRef` on MCPServers.
249+
250+
## Type Definitions
251+
252+
```go
253+
// MCPGroupSpec defines the desired state of MCPGroup
254+
type MCPGroupSpec struct {
255+
// Description provides human-readable context
256+
// +optional
257+
Description string `json:"description,omitempty"`
258+
}
259+
260+
// MCPGroupStatus defines observed state
261+
type MCPGroupStatus struct {
262+
// Phase indicates current state
263+
// +optional
264+
Phase MCPGroupPhase `json:"phase,omitempty"`
265+
266+
// Servers lists server names in this group
267+
// +optional
268+
Servers []string `json:"servers,omitempty"`
269+
270+
// ServerCount is the number of servers
271+
// +optional
272+
ServerCount int `json:"serverCount"`
273+
274+
// Conditions represent observations
275+
// +optional
276+
Conditions []metav1.Condition `json:"conditions,omitempty"`
277+
}
278+
279+
type MCPGroupPhase string
280+
281+
const (
282+
MCPGroupPhaseReady MCPGroupPhase = "Ready"
283+
)
284+
285+
// Add to MCPServerSpec
286+
type MCPServerSpec struct {
287+
// Existing fields...
288+
289+
// GroupRef is the MCPGroup this server belongs to
290+
// Must reference an existing MCPGroup in the same namespace
291+
// +optional
292+
GroupRef string `json:"groupRef,omitempty"`
293+
}
294+
```
295+
296+
## Open Questions
297+
298+
1. **Should groupRef be immutable after creation?**
299+
- Recommendation: Allow changes, easier for user corrections
300+
301+
2. **What happens if group is deleted?**
302+
- Recommendation: Servers continue running, `spec.groupRef` becomes dangling reference
303+
- Controller will log errors and add conditions to affected MCPServer resources
304+
305+
3. **Should we validate group exists on MCPServer create?**
306+
- Recommendation: Yes, via controller reconciliation
307+
- Controller validates groupRef and adds status conditions if invalid
308+
- No webhook needed - keep implementation simple
309+
310+
## Future Enhancements
311+
312+
- Group-level policies and authorization
313+
- Cross-namespace groups (with security review)
314+
- Group quotas and resource limits
315+
316+
## Testing
317+
318+
- **Unit**: Group validation, status updates
319+
- **Integration (envtest)**: Controller reconciliation, field selectors
320+
- **E2E (Chainsaw)**: Complete group lifecycle, Virtual MCP integration
