sshkeda/agent-history

agent-history

Find, view, and losslessly convert recorded agent sessions across Claude Code, Codex, and Pi.

Each of those tools stores its session context in its own verbose JSONL shape. This repo imports any of them into one canonical event model, which powers two things: human/LLM-readable views for mining what an agent did, and lossless conversion between providers so you can hot-swap a session from one tool to another.

The primary use is finding and viewing sessions. Conversion is secondary.

Install

git clone https://github.com/sshkeda/agent-history
cd agent-history
bun install
mkdir -p ~/.local/bin   # make sure the symlink's target directory exists
ln -sf "$PWD/bin/agent-history" ~/.local/bin/agent-history

agent-history <args> is equivalent to bun src/cli/index.ts <args> from the checkout.

Find and view sessions

agent-history <session.jsonl>           # digest: timeline, tool ledger, token growth
cat session.jsonl | agent-history -     # stdin; provider auto-detected

agent-history find fix-bridge-dialog    # find by id, branch, cwd, or opening task
agent-history find --since 2026-06-01   # filter by recency (--since / --before)
agent-history find keychain --deep --since 2026-06-01   # full-text search inside sessions
agent-history find --sort tokens        # rank recent sessions by context blowup

agent-history view session.jsonl --event <id>   # show one omitted span in full
agent-history view session.jsonl --full         # digest without omission truncation
agent-history view session.jsonl --stats        # header + token growth + tool ledger only
agent-history find <query> --digest --limit 3   # find the newest N and digest each

agent-history usage --since 2026-06-01  # most-used tools/skills across recent sessions

The digest is a lossy filter over the canonical events: a session header, token growth (input → peak, for both Claude and Codex), a user/assistant/reasoning/tool timeline, and a per-tool call/error ledger. Long spans are truncated and tagged with their source event id (evt:<id>). Nothing is lost: recover any span with --event <id>.

find scans the Codex (~/.codex/sessions), Claude Code (~/.claude/projects), Pi (~/.pi/agent/sessions), cursor-agent (~/.cursor/projects), and agy / Antigravity (~/.gemini/antigravity-cli/conversations) stores, and groups Claude subagent matches under their parent session.

usage rolls the per-session tool ledger up across recent sessions (or a find query / recency window) into one ranked ledger of tool calls and skill use, with distinct-session counts. Skill use covers Claude's Skill tool plus skills/<name>/SKILL.md reads from any harness.

To discover sessions that a provider writes to a non-standard location (e.g. an agent fleet pointed at a custom dir), add extra dirs per provider via AGENT_HISTORY_PI_DIRS, AGENT_HISTORY_CODEX_DIRS, AGENT_HISTORY_CLAUDE_DIRS, AGENT_HISTORY_CURSOR_DIRS, or AGENT_HISTORY_AGY_DIRS (colon-separated).
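
As a rough illustration of the colon-separated format, the sketch below shows one way such a variable could be split and combined with a provider's default store. The helper names (parseExtraDirs, searchDirs), the example paths, and the ordering and de-duplication rules are assumptions made for illustration, not agent-history's actual code.

```typescript
// Hypothetical helper, not agent-history's real implementation.
// Splits a colon-separated directory list and drops empty entries.
function parseExtraDirs(value: string | undefined): string[] {
  if (!value) return [];
  return value
    .split(":")
    .map((dir) => dir.trim())
    .filter((dir) => dir.length > 0);
}

// Assumption: the default store is searched first, extras follow,
// and a directory listed twice is only scanned once.
function searchDirs(defaultDir: string, extra: string | undefined): string[] {
  return Array.from(new Set([defaultDir, ...parseExtraDirs(extra)]));
}

// Example value as it might appear in AGENT_HISTORY_PI_DIRS (paths are made up).
const piDirs = searchDirs("~/.pi/agent/sessions", "/srv/fleet/pi:/mnt/archive/pi");
// piDirs: ["~/.pi/agent/sessions", "/srv/fleet/pi", "/mnt/archive/pi"]
```

Tolerating empty segments means a trailing colon or an unset variable simply yields no extra directories.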

A skill (skills/agent-history/SKILL.md) documents this for agents.

Convert between providers

agent-history convert session.jsonl --to claude-code -o claude.jsonl   # provider -> provider
cat session.jsonl | agent-history convert - --to codex                 # stdin, source auto-detected
agent-history convert session.jsonl --from pi --to codex               # explicit source
agent-history convert conversation.db --to agy -o copy.db              # agy/cursor: byte-identical .db
agent-history convert session.jsonl --to agy --install                 # install for `agy --conversation <id>`
agent-history convert session.jsonl --to codex --install               # install for `codex resume <id>`
agent-history convert session.jsonl --to pi --install                  # install for `pi --session <id>`
agent-history convert session.jsonl --to cursor --install              # install for `cursor-agent --resume <id>`

--install (agy, codex, pi, and cursor) writes the converted session straight into that provider's own store and prints the ready-to-run resume command, so continuing a converted session natively is a one-liner. Install locations:

  • agy: ~/.gemini/antigravity-cli/conversations/<cascade>.db, under the cascade id agy opens by.
  • codex: ~/.codex/sessions/<y>/<m>/<d>/rollout-…-<id>.jsonl, under a fresh session id (codex resumes by the id inside the file).
  • pi: ~/.pi/agent/sessions/<cwd-slug>/<ts>_<id>.jsonl. Resume with pi --session <id> from the session's cwd; from elsewhere pi offers to fork.
  • cursor: ~/.cursor/chats/<md5(cwd)>/<chat-id>/store.db. The chat id is the directory name, so a fresh id never collides. cursor-sourced sessions replay their store.db byte-identically, and foreign sessions get a store constructed from scratch (a content-addressed blob DAG, lossless via a carry meta row that cursor ignores).

The target CLI must be logged in and under its usage limit to actually continue the session.

Providers: pi | claude-code | codex. --from is auto-detected when omitted. cursor and agy are read via the sqlite3 binary (--from cursor / --from agy, or auto-detected). cursor has two stores: the lossy JSONL transcript (~/.cursor/projects/.../agent-transcripts/, assistant text + tool_use only) and the authoritative store.db (~/.cursor/chats/<hash>/<id>/store.db) — a content-addressed blob DAG with the full conversation including tool results, read by pointing agent-history at the .db. agy stores each conversation as a binary SQLite database (~/.gemini/antigravity-cli/conversations/<id>.db) with the timeline in protobuf.

Both can also be written back byte-identically (--to agy / --to cursor, binary, requires -o). agy's per-step blobs are crypto signatures/embeddings and cursor's store is content-addressed, so neither can be reconstructed from semantics — instead the importer preserves the exact original .db and the exporter replays it, the same preserve-and-replay the JSONL providers use via __agent_history_foreign. This survives a round-trip through another provider: agy → claude → agy and cursor → claude → cursor are byte-identical.
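
Since cursor's store.db and agy's .db are SQLite files while the other providers write JSONL, the two kinds of input can be told apart from their first bytes. The sketch below relies on one documented fact (every SQLite 3 file starts with the 16-byte string "SQLite format 3" followed by a NUL byte); the helper names and the classification labels are hypothetical, and this is not necessarily how agent-history's auto-detection works.

```typescript
// Every SQLite 3 database file begins with this 16-byte header string.
const SQLITE_MAGIC = "SQLite format 3\u0000";

// Hypothetical demo helper: encode an ASCII-only string as bytes.
function asciiBytes(text: string): Uint8Array {
  const bytes = new Uint8Array(text.length);
  for (let i = 0; i < text.length; i++) bytes[i] = text.charCodeAt(i);
  return bytes;
}

// True when the buffer starts with the SQLite header.
function looksLikeSqlite(bytes: Uint8Array): boolean {
  if (bytes.length < SQLITE_MAGIC.length) return false;
  for (let i = 0; i < SQLITE_MAGIC.length; i++) {
    if (bytes[i] !== SQLITE_MAGIC.charCodeAt(i)) return false;
  }
  return true;
}

// Labels invented for this sketch.
type InputKind = "sqlite-db" | "jsonl" | "unknown";

function classifyInput(bytes: Uint8Array): InputKind {
  if (looksLikeSqlite(bytes)) return "sqlite-db";
  for (let i = 0; i < bytes.length; i++) {
    const b = bytes[i];
    // Skip leading whitespace (space, tab, CR, LF).
    if (b === 0x20 || b === 0x09 || b === 0x0d || b === 0x0a) continue;
    // A JSONL session starts with a JSON object, i.e. "{".
    return b === 0x7b ? "jsonl" : "unknown";
  }
  return "unknown";
}

// The JSON line here is a made-up example, not a real provider record.
const sampleKind = classifyInput(asciiBytes('{"type":"message"}\n'));
// sampleKind === "jsonl"
```

Telling the individual JSONL providers apart would need knowledge of each provider's record shapes, which this sketch deliberately leaves out.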

Both --to cursor and --to agy additionally accept a foreign session (one that didn't come from that provider):

  • cursor's transcript is open JSON, so agent-history constructs a valid agent-transcripts JSONL from any canonical stream. The visible transcript is the same lossy view cursor itself writes (text + tool_use, no results), but the full canonical session rides along in a __agent_history_session field cursor ignores.
  • agy has no open format, so agent-history instead clones the largest local agy .db as a structural skeleton and transplants the foreign session into the plaintext step fields the agy model actually reads. agy reconstructs the model's history from those plaintext fields (not the encrypted per-step blob, which it ignores), so the foreign session is rendered as ordered narration ("The user asked: …", "I used the bash tool: …", "The result was: …") and written into each skeleton step's prose, including every duplicated copy, with the cascade/trajectory/executor ids rewritten to a fresh set (shared cross-conversation constants are preserved). The result is model-resumable: agy loads it, and the model inherits the foreign conversation as real context and can continue it. This was verified end-to-end against the authenticated CLI: asked to recall the prior commands, the model lists them accurately. Constructing needs at least one real agy conversation on the machine to clone (agy can't be written from a schema); without one, --to agy errors clearly. The visible narration is best-effort; the full canonical session still rides in an agent_history_carry table that agy ignores, so re-import is exact regardless. The skeleton is always the largest real local conversation, never another agent-history-constructed .db, which carries an agent_history_carry table and would clone a broken structure. The foreign session is truncated to the steps it narrates, so agy's executor loads with no id collisions.

Both carries make the round-trip lossless: pi → cursor → pi, codex → cursor → codex, pi → agy → pi, and agy → cursor → agy all come back byte-identical, because re-import reconstructs the exact canonical from the carry rather than the lossy visible view.
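
The carry idea can be sketched in a few lines: the exporter writes a lossy view for the target tool and stores the full session in a field the tool ignores, and the importer reads that field back instead of the view. Only the field name __agent_history_session comes from this README; the event type, the ExportedSession shape, and both function names are simplified assumptions. The sketch shows event-level fidelity only; the real byte-identical round-trip additionally depends on the canonical model preserving the native bytes.

```typescript
// Hypothetical, simplified canonical event. The real schema lives in src/core/.
type SketchEvent =
  | { kind: "text"; role: "user" | "assistant"; text: string }
  | { kind: "tool_use"; id: string; name: string }
  | { kind: "tool_result"; id: string; output: string };

// Carry field name taken from the README; the rest is illustrative.
interface ExportedSession {
  visible: SketchEvent[]; // lossy view the target tool reads
  __agent_history_session: string; // full session, ignored by the target tool
}

function exportWithCarry(events: SketchEvent[]): ExportedSession {
  return {
    // Drop tool results, mirroring the "text + tool_use, no results" view.
    visible: events.filter((e) => e.kind !== "tool_result"),
    __agent_history_session: JSON.stringify(events),
  };
}

function importWithCarry(session: ExportedSession): SketchEvent[] {
  // Read the carry: the visible view alone cannot restore dropped results.
  return JSON.parse(session.__agent_history_session) as SketchEvent[];
}

const sessionEvents: SketchEvent[] = [
  { kind: "text", role: "user", text: "list the files" },
  { kind: "tool_use", id: "t1", name: "bash" },
  { kind: "tool_result", id: "t1", output: "README.md" },
];
const exported = exportWithCarry(sessionEvents);
const restored = importWithCarry(exported);
```

Here exported.visible holds two events while restored holds all three, which is the property the round-trip claims above rest on.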

How it works

raw native JSONL  ↔  canonical event model  ↔  views (lossy)
  (pi/claude/codex)            │                convert (lossless)
                               └─────────────→  raw native JSONL

The canonical model is append-only and provider-agnostic. Views are lossy filters over it; exporters take canonical events back to any of the three native shapes losslessly.
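
A minimal sketch of "append-only model, lossy view": events are only ever appended, the digest-style view truncates long spans and tags them with evt:<id>, and the full span stays recoverable by id. The LogEvent shape, the EventLog class, and digestLines are hypothetical simplifications, not the schema in src/core/ or the view in src/views/.

```typescript
// Hypothetical, simplified event shape.
interface LogEvent {
  id: string;
  role: "user" | "assistant" | "reasoning" | "tool";
  text: string;
}

// Append-only log: there is no update or delete operation.
class EventLog {
  private readonly events: LogEvent[] = [];

  append(entry: LogEvent): void {
    this.events.push(entry);
  }

  all(): readonly LogEvent[] {
    return this.events;
  }

  // Full, untruncated event, comparable to `view --event <id>`.
  byId(id: string): LogEvent | undefined {
    return this.events.find((e) => e.id === id);
  }
}

// Lossy view: long spans are cut and tagged with their source event id.
function digestLines(log: EventLog, maxChars: number): string[] {
  return log.all().map((e) =>
    e.text.length <= maxChars
      ? `${e.role}: ${e.text}`
      : `${e.role}: ${e.text.slice(0, maxChars)}… [evt:${e.id}]`,
  );
}

const sketchLog = new EventLog();
sketchLog.append({ id: "e1", role: "user", text: "fix the dialog" });
sketchLog.append({ id: "e2", role: "tool", text: "x".repeat(500) });
const digestOutput = digestLines(sketchLog, 20);
// digestOutput[0] === "user: fix the dialog"
```

Because the view never mutates the log, truncation is a presentation choice and the second event's full 500 characters remain available through byId("e2").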

When going cross-provider (e.g. Pi → Claude Code), exporters embed foreign native line envelopes as __agent_history_foreign / __agent_history_canonical fields when the target provider can safely carry them. Claude Code resume seeds are stricter than generic conversion JSONL, so prepare-claude-code-resume writes an adjacent recovery sidecar at <file>.jsonl.lossless.json. Keep that file next to the JSONL when converting back; the CLI reads it automatically for file-based Claude Code imports.
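
The sidecar convention can be pictured as a pure path rule plus a lookup. In the sketch below only the .lossless.json suffix comes from this README; the helper names and the injected exists callback (used to keep the example free of filesystem calls) are assumptions, and the CLI's real lookup may differ.

```typescript
// Suffix taken from the README; helper names are hypothetical.
const SIDECAR_SUFFIX = ".lossless.json";

// "<file>.jsonl" -> "<file>.jsonl.lossless.json"
function sidecarPathFor(jsonlPath: string): string {
  return jsonlPath + SIDECAR_SUFFIX;
}

interface ImportInputs {
  jsonl: string;
  sidecar?: string;
}

// `exists` is injected so the sketch stays independent of any filesystem API.
function resolveImportInputs(
  jsonlPath: string,
  exists: (path: string) => boolean,
): ImportInputs {
  const candidate = sidecarPathFor(jsonlPath);
  return exists(candidate)
    ? { jsonl: jsonlPath, sidecar: candidate }
    : { jsonl: jsonlPath };
}

// Pretend directory listing (file names are made up).
const onDisk = new Set(["seed.jsonl", "seed.jsonl.lossless.json", "other.jsonl"]);
const withSidecar = resolveImportInputs("seed.jsonl", (p) => onDisk.has(p));
const withoutSidecar = resolveImportInputs("other.jsonl", (p) => onDisk.has(p));
```

Keying the sidecar purely off the JSONL path is why the two files need to stay next to each other: move or rename one and the lookup no longer finds the other.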

Layout

  • bin/agent-history — launcher (symlink onto PATH)
  • src/cli/ — the CLI: find, view, usage, convert
  • src/core/ — canonical event schema
  • src/adapters/ — Pi / Claude Code / Codex JSONL importers + exporters
  • src/find/ — session discovery across provider stores
  • src/views/ — the digest and usage views
  • src/tokens.ts, src/json.ts — shared token-usage and JSON-narrowing helpers
  • skills/agent-history/ — agent-facing skill
  • test/e2e/ — fixture-driven integration tests

Conversion coverage

  • Pi JSONL ↔ canonical
  • Claude Code JSONL ↔ canonical
  • Codex JSONL ↔ canonical
  • cursor store.db (content-addressed DAG) and agy .db (protobuf-in-SQLite) ↔ canonical, byte-identical via preserve-and-replay of the original .db
  • cursor-agent transcript JSONL → canonical (lossy subset of the store.db; no .db to replay)
  • Foreign → cursor (constructed agent-transcripts JSONL) and foreign → agy (an agy .db built by cloning a real local agy conversation as a structural skeleton, so at least one local agy conversation is needed), each carrying the full canonical losslessly in a field/table the provider ignores (__agent_history_session / agent_history_carry), so pi → agy → pi and pi → cursor → pi are byte-identical
  • Cross-provider export (e.g. Pi → Claude Code) with lossless __agent_history_foreign / __agent_history_canonical carry-through
  • Deterministic recovery sidecars (*.lossless.json) for transforms that provider JSONL cannot safely carry directly, such as demoted reasoning markers
  • Native Codex response items including messages, reasoning, function/custom tool calls, web search calls, and image generation calls
  • Semantic Pi → Claude Code export validated by the real @anthropic-ai/claude-agent-sdk: getSessionMessages parses the converted output and returns the original Pi user/assistant chain

Provider versions

Each adapter is written against a specific shape of a provider's on-disk session format. The JSONL providers (Claude Code / Codex / Pi) are relatively stable; the reverse-engineered binary formats (cursor's SQLite store, agy's protobuf-in-SQLite) can change shape with no notice. agent-history doctor pins the CLI version each adapter was verified against and flags drift, so a CLI update is a visible "re-check the format" signal instead of a silent breakage:

agent-history doctor
# ok    Claude Code        2.1.170
# ok    agy (Antigravity)  1.0.7  [reverse-engineered]
# DRIFT cursor-agent       verified 2026.06.04-…, installed 2026.07.… — re-check: …

The pinned versions live in src/provider-versions.ts; bump them (and re-run test:real-logs) when you re-verify against a new CLI release.
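
The drift check amounts to comparing a pinned version with the installed one. The sketch below is a guess at the shape of that comparison, not the contents of src/provider-versions.ts: the types, the checkDrift name, the "missing" status, and the exact-string comparison are all assumptions, and the 1.0.8 version is invented for the example.

```typescript
// Hypothetical record of what an adapter was verified against.
interface PinnedVersion {
  provider: string;
  verified: string;
  reverseEngineered?: boolean;
}

// "missing" is an assumption; the README only shows "ok" and "DRIFT".
type DriftStatus = "ok" | "DRIFT" | "missing";

interface DoctorRow {
  provider: string;
  status: DriftStatus;
  detail: string;
}

// Assumption: versions are compared as exact strings.
function checkDrift(pin: PinnedVersion, installed: string | undefined): DoctorRow {
  if (installed === undefined) {
    return { provider: pin.provider, status: "missing", detail: `verified ${pin.verified}, not installed` };
  }
  if (installed === pin.verified) {
    const tag = pin.reverseEngineered ? "  [reverse-engineered]" : "";
    return { provider: pin.provider, status: "ok", detail: installed + tag };
  }
  return {
    provider: pin.provider,
    status: "DRIFT",
    detail: `verified ${pin.verified}, installed ${installed}`,
  };
}

const doctorRows: DoctorRow[] = [
  checkDrift({ provider: "Claude Code", verified: "2.1.170" }, "2.1.170"),
  checkDrift({ provider: "agy (Antigravity)", verified: "1.0.7", reverseEngineered: true }, "1.0.8"),
];
```

Treating any difference as drift is the conservative choice for reverse-engineered formats, where even a patch release might change the on-disk shape.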

Development

bun run check         # typecheck + eslint + prettier
bun run test          # portable fixture-driven test suite
bun run verify:portable  # check + test in series (what CI proves)

Before cutting a release from a machine that has the local CLIs and session stores available, also run the real-log gate:

bun run test:real-logs

test:real-logs reads recent sessions from ~/.pi/agent/sessions, ~/.claude/projects, and ~/.codex/archived_sessions, validates target-native output at each hop, and checks byte-identical same-provider round-trips. Override picked files with AGENT_HISTORY_REAL_PI_SESSION / AGENT_HISTORY_REAL_CLAUDE_SESSION / AGENT_HISTORY_REAL_CODEX_SESSION.

Design standard

Strict native fidelity:

  • provider -> agent-history -> other format -> agent-history -> provider must round-trip back to the original native session bytes.
  • This applies uniformly to Claude Code, Codex, and Pi.
  • If a rebuilt provider session is not byte-for-byte identical to the original native session, that is a bug.
  • Provider-specific workarounds that rely on replaying preserved raw files instead of reconstructing them are not the intended end state — reconstruction itself must be lossless.
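
The byte-for-byte rule above translates into a very small check. The following is a sketch of such a check, not the repo's test code; firstDifference and assertByteIdentical are hypothetical names.

```typescript
// Index of the first differing byte, or -1 when the buffers are identical.
// If one buffer is a strict prefix of the other, the shorter length is returned.
function firstDifference(a: Uint8Array, b: Uint8Array): number {
  const shared = Math.min(a.length, b.length);
  for (let i = 0; i < shared; i++) {
    if (a[i] !== b[i]) return i;
  }
  return a.length === b.length ? -1 : shared;
}

// Throws with the mismatch position so a failing round-trip is easy to locate.
function assertByteIdentical(original: Uint8Array, rebuilt: Uint8Array): void {
  const at = firstDifference(original, rebuilt);
  if (at !== -1) {
    throw new Error(
      `round-trip mismatch at byte ${at} ` +
        `(original ${original.length} bytes, rebuilt ${rebuilt.length} bytes)`,
    );
  }
}

// Stand-in buffers; a real check would compare the original session file
// with the output of provider -> agent-history -> other format -> provider.
const originalBytes = new Uint8Array([123, 34, 97, 34, 58, 49, 125, 10]);
const rebuiltBytes = new Uint8Array([123, 34, 97, 34, 58, 49, 125, 10]);
assertByteIdentical(originalBytes, rebuiltBytes); // passes silently
```

Comparing raw bytes rather than parsed JSON is what makes key order, whitespace, and trailing newlines part of the contract.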

Tool-use concurrency: Claude's API requires every tool_use in an assistant message to be answered in the single immediately-following user message. A cross-provider seed used to split a parallel tool_use batch's results across consecutive one-result user messages, which the docs flag as a source of "API Error: 400 due to tool use concurrency issues". prepareClaudeCodeResumeSeed now merges those split results back into one user message. Offline verification: a real pi→claude seed drops from 16 ungrouped batches to 0, and the merge is a no-op for same-provider seeds, so byte fidelity holds. Note: a live A/B claude --resume of the split vs. merged seed on the current CLI accepted both, so today's resume path appears to tolerate or internally regroup the split. The merge therefore stands as defensive conformance to the documented API rule rather than a fix for a reproducible 400.
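
The regrouping can be sketched as a single pass over the messages. This is an illustration of the idea under simplified types, not the body of prepareClaudeCodeResumeSeed: the ContentBlock and ChatMessage shapes are pared-down assumptions modeled on the tool_use / tool_result block types, and mergeSplitToolResults is a hypothetical name.

```typescript
// Simplified message shapes, assumed for this sketch.
type ContentBlock =
  | { type: "text"; text: string }
  | { type: "tool_use"; id: string; name: string; input: unknown }
  | { type: "tool_result"; tool_use_id: string; content: string };

interface ChatMessage {
  role: "user" | "assistant";
  content: ContentBlock[];
}

// Ids answered by a user message that holds only tool_result blocks, else null.
function resultOnlyIds(message: ChatMessage): string[] | null {
  if (message.role !== "user" || message.content.length === 0) return null;
  const ids: string[] = [];
  for (const block of message.content) {
    if (block.type !== "tool_result") return null;
    ids.push(block.tool_use_id);
  }
  return ids;
}

// Folds consecutive result-only user messages that answer the preceding
// assistant message's tool_use batch into one user message.
function mergeSplitToolResults(messages: ChatMessage[]): ChatMessage[] {
  const out: ChatMessage[] = [];
  let pending = new Set<string>(); // tool_use ids from the last assistant message
  let target: ChatMessage | null = null; // user message collecting their results

  for (const message of messages) {
    const ids = resultOnlyIds(message);
    if (ids !== null && ids.every((id) => pending.has(id))) {
      if (target === null) {
        target = { role: "user", content: [...message.content] };
        out.push(target);
      } else {
        target.content.push(...message.content);
      }
      continue;
    }
    // Any other message ends the current batch.
    target = null;
    pending = new Set<string>();
    if (message.role === "assistant") {
      for (const block of message.content) {
        if (block.type === "tool_use") pending.add(block.id);
      }
    }
    out.push({ role: message.role, content: [...message.content] });
  }
  return out;
}
```

A seed whose results are already grouped passes through unchanged, which matches the no-op behaviour for same-provider seeds described above.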

License

MIT © 2026 Stephen Shkeda

About

Lossless session switching between Claude Code, Codex, and Pi. Convert JSONL sessions between any two formats without dropping context.
