T3 Code has one server-side observability model:
- pretty logs go to stdout for humans
- completed spans go to a local NDJSON trace file
- traces and metrics can also be exported over OTLP to a real backend like Grafana LGTM
The local trace file is the persisted source of truth. There is no separate persisted server log file anymore.
Logs are human-facing only:
- destination: stdout
- format: `Logger.consolePretty()`
- persistence: none
If you want a log message to show up in the trace file, emit it inside an active span with Effect.log.... Logger.tracerLogger will attach it as a span event.
Completed spans are written as NDJSON records to `serverTracePath` (by default, `~/.t3/userdata/logs/server.trace.ndjson`).
Important fields in each record:
- `name`: span name
- `traceId`, `spanId`, `parentSpanId`: correlation
- `durationMs`: elapsed time
- `attributes`: structured context
- `events`: embedded logs and custom events
- `exit`: `Success`, `Failure`, or `Interrupted`
The schema lives in `apps/server/src/observability/TraceRecord.ts`.
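As a reference, here is a synthetic record in that shape (the field values are invented for illustration), with jq pulling out a few of the fields listed above:

```shell
# Synthetic record in the trace-file shape; jq extracts a few fields from it
echo '{"name":"sql.execute","traceId":"t1","spanId":"s2","parentSpanId":"s1","durationMs":12.5,"attributes":{},"events":[],"exit":{"_tag":"Success"}}' \
  | jq -c '{name, durationMs, exit: .exit._tag}'
# → {"name":"sql.execute","durationMs":12.5,"exit":"Success"}
```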
Metrics are not written to a local file.
- local persistence: none
- remote export: OTLP only, when configured
- current definitions: `apps/server/src/observability/Metrics.ts`
If OTLP is not configured, metrics still exist in-process, but you will not have a local artifact to inspect.
Provider event NDJSON files still exist for provider runtime streams. Those are separate from the main server trace file.
There are two useful modes:
- local-only: stdout + local `server.trace.ndjson`
- full local observability: stdout + local trace file + OTLP export to Grafana/Tempo/Prometheus
The local trace file is always on. OTLP export is opt-in.
You do not need any extra env vars. Just run the app normally and inspect server.trace.ndjson.
Examples:
```shell
npx t3
bun dev
bun dev:desktop
```

```shell
docker run --name lgtm \
  -p 3000:3000 \
  -p 4317:4317 \
  -p 4318:4318 \
  --rm -ti \
  grafana/otel-lgtm
```

Then open http://localhost:3000.
Default Grafana login:
- username: `admin`
- password: `admin`
```shell
export T3CODE_OTLP_TRACES_URL=http://localhost:4318/v1/traces
export T3CODE_OTLP_METRICS_URL=http://localhost:4318/v1/metrics
export T3CODE_OTLP_SERVICE_NAME=t3-local
```

Optional:

```shell
export T3CODE_TRACE_MIN_LEVEL=Info
export T3CODE_TRACE_TIMING_ENABLED=true
```

CLI:

```shell
npx t3
```

Monorepo web/server dev:

```shell
bun dev
```

Monorepo desktop dev:

```shell
bun dev:desktop
```

Packaged desktop app:
Launch the actual app executable from the same shell so the desktop app and embedded backend inherit T3CODE_OTLP_*.
macOS app bundle example:

```shell
T3CODE_OTLP_TRACES_URL=http://localhost:4318/v1/traces \
T3CODE_OTLP_METRICS_URL=http://localhost:4318/v1/metrics \
T3CODE_OTLP_SERVICE_NAME=t3-desktop \
"/Applications/T3 Code.app/Contents/MacOS/T3 Code"
```

Direct binary example:

```shell
T3CODE_OTLP_TRACES_URL=http://localhost:4318/v1/traces \
T3CODE_OTLP_METRICS_URL=http://localhost:4318/v1/metrics \
T3CODE_OTLP_SERVICE_NAME=t3-desktop \
./path/to/your/desktop-app-binary
```

Do not rely on launching from Finder, Spotlight, the Dock, or the Start menu after setting shell env vars. Those launches usually will not pick them up.
The backend reads observability config at process start. If you change OTLP env vars, stop the app completely and start it again.
The trace file is the fastest way to inspect raw span data.
Tail it:
```shell
tail -f "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

In monorepo dev, use:

```shell
tail -f ./dev/logs/server.trace.ndjson
```

Show failed spans:
```shell
jq -c 'select(.exit._tag != "Success") | {
  name,
  durationMs,
  exit,
  attributes
}' "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

Show slow spans:
```shell
jq -c 'select(.durationMs > 1000) | {
  name,
  durationMs,
  traceId,
  spanId
}' "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

Inspect embedded log events:
```shell
jq -c 'select(any(.events[]?; .attributes["effect.logLevel"] != null)) | {
  name,
  durationMs,
  events: [
    .events[]
    | select(.attributes["effect.logLevel"] != null)
    | {
        message: .name,
        level: .attributes["effect.logLevel"]
      }
  ]
}' "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

Follow one trace:
```shell
jq -r 'select(.traceId == "TRACE_ID_HERE") | [
  .name,
  .spanId,
  (.parentSpanId // "-"),
  .durationMs
] | @tsv' "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

Filter orchestration commands:
```shell
jq -c 'select(.attributes["orchestration.command_type"] != null) | {
  name,
  durationMs,
  commandType: .attributes["orchestration.command_type"],
  aggregateKind: .attributes["orchestration.aggregate_kind"]
}' "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

Filter git activity:
```shell
jq -c 'select(.attributes["git.operation"] != null) | {
  name,
  durationMs,
  operation: .attributes["git.operation"],
  cwd: .attributes["git.cwd"],
  hookEvents: [
    .events[]
    | select(.name == "git.hook.started" or .name == "git.hook.finished")
  ]
}' "$T3CODE_HOME/userdata/logs/server.trace.ndjson"
```

Tempo is better than raw NDJSON when you want to:
- search across many traces
- inspect parent/child relationships visually
- compare many slow traces
- drill into one failing request without hand-joining by `traceId`
Recommended flow in Grafana:
- Open `Explore`.
- Pick the `Tempo` data source.
- Set the time range to something recent, like `Last 15 minutes`.
- Start broad. Do not begin with a very narrow query.
- Look for spans from your configured service name, then narrow by span name or attributes.
Good first searches:
- service names such as `t3-local`, `t3-dev`, or `t3-desktop`
- span names like `sql.execute`, `git.runCommand`, or `provider.sendTurn`
- orchestration spans with attributes like `orchestration.command_type`
Once you know traces are arriving, narrower TraceQL queries like name = "sql.execute" become useful.
Traces are best for one request. Metrics are best for trends.
Good metric families to watch:
- `t3_rpc_request_duration`
- `t3_orchestration_command_duration`
- `t3_orchestration_command_ack_duration`
- `t3_provider_turn_duration`
- `t3_git_command_duration`
- `t3_db_query_duration`
Counters tell you volume and failure rate:
- `t3_rpc_requests_total`
- `t3_orchestration_commands_total`
- `t3_provider_turns_total`
- `t3_git_commands_total`
- `t3_db_queries_total`
Use metrics when the question is:
- "is this always slow?"
- "did this get worse after a change?"
- "which command type is failing most often?"
Use traces when the question is:
- "what happened in this specific request?"
- "which child span caused this one slow interaction?"
- "what logs were emitted inside the failing flow?"
t3_orchestration_command_ack_duration measures:
- start: command dispatch enters the orchestration engine
- end: the first committed domain event for that command is published by the server
That is a server-side acknowledgment metric. It does not measure:
- websocket transit to the browser
- client receipt
- React render time
If you need those later, add client-side instrumentation or a dedicated server fanout metric.
- Start with the local NDJSON file.
- Find spans where `exit._tag != "Success"`.
- Group by `traceId`.
- Inspect sibling spans and span events.
- If needed, move to Tempo for the full trace tree.
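The steps above can be sketched end to end against a tiny synthetic trace file (the span names, IDs, and `/tmp` path here are illustrative, not real output):

```shell
# Two synthetic spans in one trace; the parent failed, the child succeeded
cat > /tmp/sample.trace.ndjson <<'EOF'
{"name":"rpc.handle","traceId":"t1","spanId":"s1","parentSpanId":null,"durationMs":40,"exit":{"_tag":"Failure"}}
{"name":"sql.execute","traceId":"t1","spanId":"s2","parentSpanId":"s1","durationMs":35,"exit":{"_tag":"Success"}}
EOF

# Steps 1-3: find a failing span and take its traceId
failed=$(jq -r 'select(.exit._tag != "Success") | .traceId' /tmp/sample.trace.ndjson | sort -u | head -n 1)

# Step 4: dump every span in that trace, including healthy siblings
jq -c --arg t "$failed" \
  'select(.traceId == $t) | {name, spanId, parent: (.parentSpanId // "-")}' \
  /tmp/sample.trace.ndjson
```

To run this against real data, swap `/tmp/sample.trace.ndjson` for `"$T3CODE_HOME/userdata/logs/server.trace.ndjson"`.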
- Search for slow top-level spans in the trace file or Tempo.
- Check child spans for sqlite, git, provider, or terminal work.
- Look at the matching duration metrics to see whether the slowness is systemic.
- Check `t3_orchestration_command_ack_duration` by `commandType`.
- If it is high, inspect the corresponding orchestration trace.
- Look at child spans for projection, sqlite, provider, or git work.
- Filter `git.operation` spans.
- Inspect `git.hook.started` and `git.hook.finished` events.
- Compare hook timing to the enclosing git span duration.
Usually one of these is true:
- `T3CODE_OTLP_TRACES_URL` was not set
- the app was launched from a different environment than the one where you exported the vars
- the app was not fully restarted after changing env
- Grafana is looking at the wrong time range or service name
If the local NDJSON file is updating, local tracing is working. The problem is almost always OTLP export configuration or process startup.
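A quick sanity check along those lines (the `$HOME/.t3` fallback is an assumption based on the default path mentioned earlier):

```shell
# Is local tracing writing at all, and are OTLP vars set in THIS shell?
TRACE_FILE="${T3CODE_HOME:-$HOME/.t3}/userdata/logs/server.trace.ndjson"
if [ -f "$TRACE_FILE" ]; then
  ls -l "$TRACE_FILE"
else
  echo "trace file not found: $TRACE_FILE"
fi
env | grep '^T3CODE_OTLP_' || echo "no T3CODE_OTLP_* vars set in this shell"
```

If the file is updating but no vars print, the export step never reached the shell that launched the app.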
Good span boundaries:
- RPC methods
- orchestration command handling
- provider adapter calls
- external process calls
- persistence writes
- queue handoffs
Avoid tracing every tiny helper. Most helpers should inherit the active span rather than create a new one.
The codebase already uses `Effect.fn("name")` heavily. That should usually be your first tracing boundary.
For ad hoc work:
```ts
import { Effect } from "effect";

const runThing = Effect.gen(function* () {
  yield* Effect.annotateCurrentSpan({
    "thing.id": "abc123",
    "thing.kind": "example",
  });
  yield* Effect.logInfo("starting thing");
  return yield* doWork();
}).pipe(Effect.withSpan("thing.run"));
```

Use span annotations for IDs, paths, and other detailed context:
```ts
yield* Effect.annotateCurrentSpan({
  "provider.thread_id": input.threadId,
  "provider.request_id": input.requestId,
  "git.cwd": input.cwd,
});
```

Good metric labels:
- operation kind
- method name
- provider kind
- aggregate kind
- outcome
Bad metric labels:
- raw thread IDs
- command IDs
- file paths
- cwd
- full prompts
- full model strings when a normalized family label would do
Detailed context belongs on spans, not metrics.
Logs inside a span become part of the trace story:
```ts
yield* Effect.logInfo("starting provider turn");
yield* Effect.logDebug("waiting for approval response");
```

Those messages show up as span events because `Logger.tracerLogger` is installed.
`withMetrics(...)` is the default way to attach a counter and timer to an effect:
```ts
import { someCounter, someDuration, withMetrics } from "../observability/Metrics.ts";

const program = doWork().pipe(
  withMetrics({
    counter: someCounter,
    timer: someDuration,
    attributes: {
      operation: "work",
    },
  }),
);
```

The server observability layer is assembled in `apps/server/src/observability/Layers/Observability.ts`.
It provides:
- pretty stdout logger
- `Logger.tracerLogger`
- local NDJSON tracer
- optional OTLP trace exporter
- optional OTLP metrics exporter
- Effect trace-level and timing refs
Local trace file:
- `T3CODE_TRACE_FILE`: override trace file path
- `T3CODE_TRACE_MAX_BYTES`: per-file rotation size, default `10485760`
- `T3CODE_TRACE_MAX_FILES`: rotated file count, default `10`
- `T3CODE_TRACE_BATCH_WINDOW_MS`: flush window, default `200`
- `T3CODE_TRACE_MIN_LEVEL`: minimum trace level, default `Info`
- `T3CODE_TRACE_TIMING_ENABLED`: enable timing metadata, default `true`
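For example, a sketch of overriding the rotation settings above (the values are illustrative, not recommendations):

```shell
# Keep fewer, larger trace files and capture Debug-level spans
export T3CODE_TRACE_MAX_BYTES=52428800   # 50 MiB per file
export T3CODE_TRACE_MAX_FILES=5
export T3CODE_TRACE_MIN_LEVEL=Debug
```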
OTLP export:
- `T3CODE_OTLP_TRACES_URL`: OTLP trace endpoint
- `T3CODE_OTLP_METRICS_URL`: OTLP metric endpoint
- `T3CODE_OTLP_EXPORT_INTERVAL_MS`: export interval, default `10000`
- `T3CODE_OTLP_SERVICE_NAME`: service name, default `t3-server`
If the OTLP URLs are unset, local tracing still works and metrics stay in-process only.
Current high-value span and metric boundaries include:
- Effect RPC websocket request spans from `effect/rpc`
- RPC request metrics in `apps/server/src/observability/RpcInstrumentation.ts`
- startup phases
- orchestration command processing
- orchestration command acknowledgment latency
- provider session and turn operations
- git command execution and git hook events
- terminal session lifecycle
- sqlite query execution
- logs outside spans are not persisted
- metrics are not snapshotted locally
- the old `serverLogPath` still exists in config for compatibility, but the trace file is the persisted artifact that matters