Building deterministic AI systems, not demos.
Software engineer focused on agentic AI systems.
I build:
- multi-agent systems with structured orchestration
- deterministic harnesses for reliability
- evaluation-aware AI systems (focused on correctness, not just output)
Focus areas:
- agent orchestration (LangChain, LangGraph, custom harnesses)
- context + tool design (ACI-style systems)
- AI evaluation (failure detection, reproducibility, ranking)
→ AI systems (LLM workflows, agent orchestration, evaluation)
→ Distributed systems + automation tooling
Featured Project: ci-rootcause
Deterministic multi-agent CI debugging engine.
Most AI CI tools summarize logs.
They fall short because CI failures are execution problems, not text problems.
ci-rootcause reconstructs execution and identifies the actual root cause.
- builds a failure graph from CI logs
- detects the first failure (not downstream symptoms)
- analyzes diffs to link code changes to breakages
- ranks root causes using deterministic scoring
- generates evidence-backed fixes (constrained LLM generation)
- produces structured outputs: `ci-rca.json`, `ci-rca.md`
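The first-failure step above can be sketched roughly as follows. This is an illustrative sketch only, assuming a parsed event model; the names (`LogEvent`, `first_failure`, `caused_by`) are hypothetical and not ci-rootcause's actual API:

```python
# Hypothetical sketch: reconstruct a failure graph from parsed CI log
# events and return the first failure rather than downstream symptoms.
from dataclasses import dataclass, field

@dataclass
class LogEvent:
    timestamp: float
    step: str
    message: str
    caused_by: list = field(default_factory=list)  # names of upstream steps

def first_failure(events):
    """Return the earliest failing event with no failing upstream cause."""
    failing = {e.step for e in events}
    roots = [e for e in events
             if not any(cause in failing for cause in e.caused_by)]
    return min(roots, key=lambda e: e.timestamp) if roots else None

events = [
    LogEvent(3.0, "deploy", "deploy failed", caused_by=["test"]),
    LogEvent(2.0, "test", "tests failed", caused_by=["build"]),
    LogEvent(1.0, "build", "compile error"),
]
print(first_failure(events).step)  # "build", not the downstream "deploy"
```

The point of the graph traversal is that the loudest error in a log (often the last one) is usually a symptom; walking causal edges back to a node with no failing parent isolates the actual origin.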
LLMs are used selectively for:
- explanation
- fix suggestions
Supported providers:
- Ollama (local models)
- OpenAI
- Anthropic
- Google Gemini
LLMs are never used for scoring or confidence.
- no hallucinated root causes
- reproducible outputs across runs
- confidence is computed, not generated
- works with both local and hosted models
- designed for real CI workflows, not demos
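"Confidence is computed, not generated" can be illustrated with a minimal sketch: a fixed weighting over observed evidence, so identical inputs always yield identical scores. The weights and evidence fields here are illustrative assumptions, not the project's actual scoring model:

```python
# Hypothetical sketch: deterministic confidence from observed evidence.
# No LLM call is involved; same evidence always yields the same score.
EVIDENCE_WEIGHTS = {
    "first_failure_in_graph": 0.4,     # failure has no failing upstream cause
    "diff_touches_failing_file": 0.4,  # a changed file appears in the error
    "error_matches_known_pattern": 0.2,
}

def confidence(evidence: dict) -> float:
    """Sum the weights of evidence that is actually present."""
    return round(sum(w for key, w in EVIDENCE_WEIGHTS.items()
                     if evidence.get(key)), 2)

print(confidence({"first_failure_in_graph": True,
                  "diff_touches_failing_file": True}))  # 0.8
```

Because the score is a pure function of the evidence, runs are reproducible and a low score is auditable: you can see exactly which evidence was missing.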
- determinism over heuristics
- systems over prompts
- evaluation before optimization
- evidence over plausibility
- `ci-rca.json` → machine-readable root cause
- `ci-rca.md` → human-readable explanation
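A hypothetical shape for the machine-readable output (field names are illustrative, not the project's actual schema):

```json
{
  "root_cause": {
    "step": "build",
    "file": "src/parser.py",
    "error": "SyntaxError: invalid syntax",
    "introduced_by_commit": "abc1234"
  },
  "confidence": 0.8,
  "evidence": [
    "first failing step in the failure graph",
    "diff touches src/parser.py"
  ],
  "suggested_fix": "Revert or correct the change to src/parser.py"
}
```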
- contributing to real AI systems (not toy projects)
- building production-grade agent workflows
- focusing on correctness, determinism, and evaluation
- shipping systems that can be reasoned about and verified
Approach:
→ consistent, high-frequency contributions
→ focus on real issues that get merged
- [Open] #6535 fix(ethereum): handle trace_filter traces missing result.output via c… in graphprotocol/graph-node
- [Open] #2331 fix(langgraph): handle null thread checkpoint in RemoteGraph.getState in langchain-ai/langgraphjs
- [Open] #5461 fix(converter): fall back on invalid JSON-like partial matches in crewAIInc/crewAI
- [Open] #5545 fix(flow,task): handle pydantic outputs in guardrail retries and checkpoint serialization in crewAIInc/crewAI
- [Open] #2316 fix(sdk): Backfill truncated history for regenerate branching in langchain-ai/langgraphjs
- [Open] #21386 fix(azureaisearch): preserve falsy metadata values in index mapping in run-llama/llama_index
- [Open] #21336 fix(elasticsearch): split sync and async store paths in run-llama/llama_index
- Determinism over heuristics where possible
- Systems over prompts
- Evaluation before optimization
- Evidence over plausibility
- LangChain
- LangGraph
- multi-agent orchestration
- LLM tool + context design
- Python
- TypeScript
- Rust
- agent harness design
- evaluation systems
- reproducibility + reliability



