Skip to content

Commit d1799c4

Browse files
schrocknclaude
andcommitted
Add JSON output support for CLI commands with infrastructure and framework
This change establishes the foundation for machine-parseable CLI output by introducing JSON serialization helpers, a renderer abstraction framework, and migrating the status command to the new infrastructure. Created JSON output infrastructure (Phase 1-2 complete) with comprehensive test coverage. Implemented the core serialization layer, renderer framework, and migrated status command to demonstrate the pattern. - `src/erk/cli/json_output.py` - JSON serialization helpers and error boundary decorator - `src/erk/cli/rendering.py` - OutputRenderer framework with Text/Json implementations - `tests/cli/test_json_output.py` - 22 tests for JSON serialization and error handling - `tests/cli/test_rendering.py` - 17 tests for renderer framework - `src/erk/cli/commands/status.py` - Migrated to renderer pattern with --format json flag - **JSON Serialization Infrastructure**: Automatic handling of Path, datetime, and dataclass objects with serialize_for_json() helper - **Renderer Framework**: Abstract OutputRenderer interface separating data collection from presentation, enabling text/JSON format selection - **Error Boundary Pattern**: json_error_boundary decorator for structured error output in JSON mode - **Status Command Migration**: Complete refactor with --format flag and comprehensive JSON schema documentation - **Test Coverage**: 39 new tests validating serialization edge cases, output routing, and renderer behavior 🤖 Generated with [Claude Code](https://claude.com/claude-code) Co-Authored-By: Claude <[email protected]>
1 parent 7936f8a commit d1799c4

File tree

12 files changed

+1953
-40
lines changed

12 files changed

+1953
-40
lines changed

pyproject.toml

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -18,6 +18,7 @@ dependencies = [
1818
"click>=8.1.7",
1919
"python-frontmatter>=1.0.0",
2020
"tomlkit>=0.12.0",
21+
"pydantic>=2.0.0",
2122
]
2223

2324
[project.scripts]

src/erk/cli/AGENTS.md

Lines changed: 380 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,380 @@
1+
# CLI Module - Pydantic Serialization Patterns
2+
3+
## Quick Reference: When Pydantic Auto-Serializes Types
4+
5+
| Type | In Model Fields | In Arbitrary Dicts | Solution for Dicts |
6+
| ----------- | -------------------------- | ---------------------- | ----------------------- |
7+
| `Path` | ✅ Auto → `str` | ❌ Requires custom | `str(path)` |
8+
| `datetime` | ✅ Auto → ISO format | ❌ Requires custom | `dt.isoformat()` |
9+
| `dataclass` | ✅ Auto → dict (recursive) | ❌ Requires conversion | `asdict()` then recurse |
10+
11+
**Key insight:** Pydantic's automatic serialization only applies to **typed model fields**, not arbitrary Python objects nested in dicts.
12+
13+
---
14+
15+
## Design Decision: json_output.py Architecture
16+
17+
### Dict-Only Approach with Pydantic Validation Where Needed
18+
19+
The `emit_json()` function accepts only dicts:
20+
21+
```python
22+
def emit_json(data: dict[str, Any]) -> None:
23+
serialized = _serialize_for_json(data)
24+
json_str = json.dumps(serialized, indent=2)
25+
machine_output(json_str)
26+
```
27+
28+
**Rationale:**
29+
30+
1. **Single contract** - Clear expectation: "give me a dict"
31+
- No union types (`BaseModel | dict`)
32+
- No branching logic based on type
33+
- Simple, explicit interface
34+
35+
2. **Pydantic models convert first** - Commands using schemas call `.model_dump(mode='json')`
36+
- Example: `emit_json(status_model.model_dump(mode='json'))`
37+
- Pydantic handles Path/datetime serialization during conversion
38+
- Validation happens at model construction, not at emit time
39+
40+
3. **Custom serialization for dict values** - `_serialize_for_json()` handles special types
41+
- Path → str conversion
42+
- datetime → ISO format
43+
- dataclass → dict recursion
44+
45+
**Why not accept Pydantic models directly?**
46+
47+
- Eliminates dual-path complexity
48+
- Makes conversion explicit at call sites
49+
- Pydantic `.model_dump(mode='json')` already does serialization
50+
- No performance benefit to accepting BaseModel (still converts to dict internally)
51+
52+
---
53+
54+
## Why NOT Use RootModel for Dict Serialization?
55+
56+
### The Tempting (But Wrong) Approach
57+
58+
```python
59+
# ❌ This seems like it would work, but doesn't
60+
from pydantic import RootModel
61+
62+
data = {"path": Path("/test"), "timestamp": datetime.now()}
63+
model = RootModel[dict[str, Any]](data)
64+
json_str = model.model_dump_json() # ← TypeError: Path not JSON serializable
65+
```
66+
67+
**Why it fails:**
68+
69+
- `RootModel[dict[str, Any]]` tells Pydantic "this is a dict"
70+
- Pydantic only serializes Path/datetime when they're **typed fields**
71+
- Generic `Any` type provides no serialization hints
72+
- The Path/datetime are **dict values**, not model fields
73+
74+
### What Actually Works
75+
76+
**Option 1: Typed model fields** (what Pydantic excels at)
77+
78+
```python
79+
# ✅ Pydantic knows path is a Path field
80+
class MyModel(BaseModel):
81+
path: Path
82+
timestamp: datetime
83+
84+
model = MyModel(path=Path("/test"), timestamp=datetime.now())
85+
json_str = model.model_dump_json() # ← Works: {"path": "/test", "timestamp": "..."}
86+
```
87+
88+
**Option 2: Custom serialization function** (for arbitrary dicts)
89+
90+
```python
91+
# ✅ Explicit control over serialization
92+
def _serialize_for_json(obj: Any) -> Any:
93+
if isinstance(obj, Path):
94+
return str(obj)
95+
if isinstance(obj, datetime):
96+
return obj.isoformat()
97+
# ... handle dicts, lists, nested structures
98+
99+
serialized = _serialize_for_json(data)
100+
json_str = json.dumps(serialized, indent=2)
101+
```
102+
103+
---
104+
105+
## Performance Characteristics
106+
107+
### RootModel Construction Overhead
108+
109+
**Creating a Pydantic model has significant overhead:**
110+
111+
```python
112+
# Every emit_json() call would do this:
113+
model = RootModel[dict[str, Any]](data) # ← Model construction
114+
json_str = model.model_dump_json() # ← Validation + serialization
115+
```
116+
117+
**Overhead includes:**
118+
119+
- Schema construction/validation
120+
- Field inspection and type checking
121+
- Serialization setup
122+
- Internal model bookkeeping
123+
124+
**From Pydantic documentation:**
125+
126+
> "Schema construction carries significant overhead...it is recommended to create a TypeAdapter for a given type just once and reuse it"
127+
128+
### Custom Function Performance
129+
130+
```python
131+
# Direct Python recursion - minimal overhead
132+
serialized = _serialize_for_json(data) # ← Pure Python recursion
133+
json_str = json.dumps(serialized) # ← Standard library
134+
```
135+
136+
**Characteristics:**
137+
138+
- No model construction
139+
- No validation overhead
140+
- Direct type checking with `isinstance()`
141+
- Minimal memory allocation
142+
143+
**Performance comparison:**
144+
145+
- RootModel: ~100-1000x slower (model construction dominates)
146+
- Custom function: Near-zero overhead beyond the serialization itself
147+
148+
---
149+
150+
## Common Mistakes to Avoid
151+
152+
### Mistake 1: Assuming RootModel Auto-Serializes Nested Types
153+
154+
```python
155+
# ❌ WRONG: Assumes RootModel handles Path in dict values
156+
data = {"worktree": {"path": Path("/test")}}
157+
model = RootModel[dict[str, Any]](data)
158+
json_str = model.model_dump_json() # ← Fails: Path not serializable
159+
```
160+
161+
**Why it fails:** `dict[str, Any]` doesn't tell Pydantic that values contain Path objects.
162+
163+
**Correct approach:**
164+
165+
```python
166+
# ✅ CORRECT: Define schema with typed fields
167+
class WorktreeInfo(BaseModel):
168+
path: Path
169+
170+
class Output(BaseModel):
171+
worktree: WorktreeInfo
172+
173+
model = Output(worktree=WorktreeInfo(path=Path("/test")))
174+
json_str = model.model_dump_json() # ← Works: field typing provides serialization hint
175+
```
176+
177+
### Mistake 2: Using model_dump() Then json.dumps()
178+
179+
```python
180+
# ❌ INEFFICIENT: Two-step serialization
181+
model = MyPydanticModel(...)
182+
dict_data = model.model_dump(mode='json') # ← Pydantic converts to dict
183+
json_str = json.dumps(dict_data, indent=2) # ← json.dumps formats it
184+
```
185+
186+
**Why inefficient:** Pydantic already has formatted JSON serialization.
187+
188+
**Correct approach:**
189+
190+
```python
191+
# ✅ CORRECT: One-step serialization
192+
model = MyPydanticModel(...)
193+
json_str = model.model_dump_json(indent=2) # ← Single call, already formatted
194+
```
195+
196+
### Mistake 3: Not Converting Pydantic Model to Dict
197+
198+
```python
199+
# ❌ WRONG: emit_json() only accepts dicts
200+
def my_command():
201+
model = MyResponseModel(name="test", path=Path("/foo"))
202+
emit_json(model) # ← Type error: Expected dict, got BaseModel
203+
```
204+
205+
**Why wrong:** `emit_json()` signature is `dict[str, Any]`, not `BaseModel | dict`.
206+
207+
**Correct approach:**
208+
209+
```python
210+
# ✅ CORRECT: Convert Pydantic model to dict first
211+
def my_command():
212+
model = MyResponseModel(name="test", path=Path("/foo"))
213+
emit_json(model.model_dump(mode="json")) # ← Explicit conversion
214+
```
215+
216+
---
217+
218+
## Design Rationale Summary
219+
220+
### Why json_output.py Uses This Approach
221+
222+
1. **Dict-only interface** - Single, simple contract
223+
- All callers pass dicts to `emit_json()`
224+
- No union types or branching logic
225+
- Explicit conversions at call sites
226+
227+
2. **Pydantic models call .model_dump(mode='json')** - Explicit conversion
228+
- Validation happens at model construction
229+
- Serialization happens during .model_dump()
230+
- emit_json() receives pre-serialized dict
231+
232+
3. **Custom serialization for dict values** - Handles special types
233+
- Path → str, datetime → ISO format
234+
- Explicit, performant, testable
235+
- Follows LBYL (Look Before You Leap) principles
236+
237+
4. **ErrorResponse still uses Pydantic** - Provides validation value
238+
- Validates exit_code range (0-255)
239+
- Converts to dict before emit: `emit_json(error.model_dump(mode='json'))`
240+
241+
5. **No RootModel** - Avoids anti-pattern
242+
- Doesn't solve the serialization problem
243+
- Adds performance overhead
244+
- More complex without benefit
245+
246+
### When to Use Pydantic Models
247+
248+
**Use Pydantic models before calling emit_json() when:**
249+
250+
- ✅ Output has a well-defined schema
251+
- ✅ Want runtime validation (e.g., field constraints)
252+
- ✅ Multiple commands share the same structure
253+
- ✅ Need to document JSON schema
254+
255+
**Pattern:**
256+
257+
```python
258+
model = MyResponseModel(...) # ← Validation happens here
259+
emit_json(model.model_dump(mode='json')) # ← Serialization happens here
260+
```
261+
262+
**Use plain dicts when:**
263+
264+
- ✅ Output structure is dynamic or simple
265+
- ✅ Don't need validation
266+
- ✅ One-off command output
267+
268+
---
269+
270+
## Code Examples
271+
272+
### Example 1: Pydantic Model with Explicit Conversion
273+
274+
```python
275+
from pathlib import Path
276+
from datetime import datetime
277+
from pydantic import BaseModel
278+
from erk.cli.json_output import emit_json
279+
280+
class WorktreeResponse(BaseModel):
281+
name: str
282+
path: Path # ← Typed field: auto-converts to str
283+
created: datetime # ← Typed field: auto-converts to ISO format
284+
285+
# Usage
286+
response = WorktreeResponse(
287+
name="feature",
288+
path=Path("/repo/worktrees/feature"),
289+
created=datetime.now(),
290+
)
291+
292+
# Convert to dict then emit
293+
emit_json(response.model_dump(mode="json"))
294+
# Output: {
295+
# "name": "feature",
296+
# "path": "/repo/worktrees/feature",
297+
# "created": "2025-11-17T10:30:45.123456Z"
298+
# }
299+
```
300+
301+
### Example 2: Dict with Custom Serialization
302+
303+
```python
304+
from pathlib import Path
305+
from datetime import datetime
306+
307+
def _serialize_for_json(obj: Any) -> Any:
308+
"""Recursively serialize special types."""
309+
if isinstance(obj, Path):
310+
return str(obj)
311+
if isinstance(obj, datetime):
312+
return obj.isoformat()
313+
if isinstance(obj, dict):
314+
return {k: _serialize_for_json(v) for k, v in obj.items()}
315+
if isinstance(obj, list):
316+
return [_serialize_for_json(item) for item in obj]
317+
return obj
318+
319+
# Usage
320+
data = {
321+
"name": "feature",
322+
"path": Path("/repo/worktrees/feature"),
323+
"created": datetime.now(),
324+
}
325+
326+
serialized = _serialize_for_json(data)
327+
json_str = json.dumps(serialized, indent=2)
328+
# Output: {
329+
# "name": "feature",
330+
# "path": "/repo/worktrees/feature",
331+
# "created": "2025-11-17T10:30:45.123456"
332+
# }
333+
```
334+
335+
### Example 3: ErrorResponse with Validation
336+
337+
```python
338+
from pydantic import BaseModel, Field
339+
340+
class ErrorResponse(BaseModel):
341+
error: str
342+
error_type: str
343+
exit_code: int = Field(default=1, ge=0, le=255) # ← Validates range
344+
345+
# Valid
346+
error = ErrorResponse(error="File not found", error_type="FileNotFoundError", exit_code=1)
347+
348+
# Invalid - raises ValidationError
349+
error = ErrorResponse(error="Bad", error_type="Error", exit_code=999) # ← Fails: > 255
350+
```
351+
352+
---
353+
354+
## Related Files
355+
356+
**Implementation:**
357+
358+
- `src/erk/cli/json_output.py` - Core JSON output utilities
359+
- `src/erk/cli/json_schemas.py` - Pydantic schemas for command responses
360+
- `src/erk/cli/rendering.py` - Output rendering framework
361+
362+
**Tests:**
363+
364+
- `tests/cli/test_json_output.py` - Tests for emit_json() and serialization
365+
- `tests/cli/test_rendering.py` - Tests for renderer framework
366+
367+
---
368+
369+
## Further Reading
370+
371+
**Pydantic documentation:**
372+
373+
- [Serialization](https://docs.pydantic.dev/latest/concepts/serialization/) - Model dump, JSON mode
374+
- [Custom Serializers](https://docs.pydantic.dev/latest/concepts/serialization/#custom-serializers) - Field and model serializers
375+
- [JSON Schema](https://docs.pydantic.dev/latest/concepts/json_schema/) - Generating schemas
376+
377+
**Erk coding standards:**
378+
379+
- `AGENTS.md` - Top 6 critical rules
380+
- `dignified-python` skill - Complete Python standards

src/erk/cli/CLAUDE.md

Lines changed: 1 addition & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1 @@
1+
@AGENTS.md

0 commit comments

Comments
 (0)