Python harness for agentic development of embodied control policies using perception-less motion control, computer vision, and motion generation models.
Project website: ikatsov.github.io/phy-harness
The current development workflow is:

    ┌──────────────────────────────────────────────────────────────────┐
    │  INPUTS                                                          │
    │  Start with policies/impl/<task>/<task>.yaml (task spec)         │
    └───────────────────────────────┬──────────────────────────────────┘
                                    ▼
    ┌──────────────────────────────────────────────────────────────────┐
    │  DESIGN & IMPLEMENT                                              │
    │  Coding model implements/updates policy, analyzers, and tests    │
    └───────────────────────────────┬──────────────────────────────────┘
                                    ▼
    ┌──────────────────────────────────────────────────────────────────┐
    │  SIMULATE                                                        │
    │  simulate_policy.py generates artifacts/<task>/ with:            │
    │    - augmented rollout video (overlays)                          │
    │    - joints.csv                                                  │
    │    - VLM transcript JSON (if enabled)                            │
    │    - custom analyzer outputs                                     │
    └───────────────────────────────┬──────────────────────────────────┘
                                    ▼
    ┌──────────────────────────────────────────────────────────────────┐
    │  CODING FEEDBACK LOOP                                            │
    │  Coding model analyzes simulation outputs (video frames, logs,   │
    │  transcripts, analyzer results), adds focused analyzers/tests,   │
    │  and loops back to DESIGN & IMPLEMENT.                           │
    └──────────────────────────────────────────────────────────────────┘

Create a virtual environment and install the package with its development dependencies:

    python3 -m venv .venv
    source .venv/bin/activate
    pip install -e ".[dev]"

For rollout validation with Gemini / VLM analyzers (and `.env` loading):

    pip install -e ".[vlm]"
    cp .env.example .env  # set GEMINI_API_KEY; do not commit .env

MuJoCo uses an OpenGL backend for offscreen RGB (`mujoco.Renderer`). On a normal desktop, the default platform GL is fine. In headless CI, use `UR5GripperEnv(enable_rgb=False)` or configure OSMesa/EGL per the MuJoCo rendering docs. Run full RGB rollouts (`simulate_policy.py`) in a terminal where rendering already works.

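MuJoCo's Python bindings read the `MUJOCO_GL` environment variable (`glfw`, `egl`, or `osmesa`) when the module is imported. The sketch below shows one way to pick a backend before that import. The helper `choose_mujoco_gl` and its display-detection heuristic are assumptions for illustration and are not part of this harness.

```python
import os
import sys


def choose_mujoco_gl(environ, platform=sys.platform):
    """Return a value for MUJOCO_GL, or None to keep MuJoCo's default.

    Hypothetical helper: the heuristic (no display on Linux means headless)
    is an assumption, not something the harness does for you.
    """
    explicit = environ.get("MUJOCO_GL")
    if explicit:
        return explicit  # respect an explicit user choice
    has_display = bool(environ.get("DISPLAY") or environ.get("WAYLAND_DISPLAY"))
    if platform.startswith("linux") and not has_display:
        return "egl"  # GPU offscreen; switch to "osmesa" for software rendering
    return None


backend = choose_mujoco_gl(os.environ)
if backend:
    # Must happen before `import mujoco`, which reads the variable at import.
    os.environ["MUJOCO_GL"] = backend
```

If neither EGL nor OSMesa is available on the CI machine, disabling RGB as described above avoids the renderer entirely.
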
Differential IK on the same simulator MJCF uses mink (MuJoCo-based) plus a QP backend:

    pip install -e ".[ik]"

Create a task bundle at `policies/impl/<task>/`:

| File | Purpose |
|---|---|
| `<task>.yaml` | `task_spec.inline` (intent), optional `policy_module` |
| `<task>.py` | `policy(obs, step, env) -> ctrl` (and optional `reset`) |
| `<analyzer_type>.py` | Optional policy-specific analyzer (`build(params) -> analyzer`) |
| `tests/test_<task>.py` | Headless unit tests (paired with the task) |

From the repo root with the venv active:

    # 1. Headless unit tests
    pytest -q tests/test_<task>.py

    # 2. Simulation rollout (writes artifacts/<task>/)
    python scripts/simulate_policy.py --config policies/simulate_policy.example.yaml \
        policies/impl/<task>/<task>.py --run-dir artifacts/<task>

    # 3. Coding agent analyzes the simulation outputs and makes changes

- Design & implement: done in policy/analyzer/test files for the task.
- Simulate: produces augmented rollout video, joints log, and analyzer artifacts.
- Coding feedback loop: analyze artifacts (including extracted video frames and analyzer JSON), implement fixes, add focused tests/analyzers, and repeat.

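The headless unit tests in step 1 can stay small and fast if they check only the policy's output contract. The following pytest-style sketch uses a stand-in `policy` so that it runs on its own; in a real task you would import the policy from the task module instead. The actuator count and control limit are hypothetical.

```python
import math

# Hypothetical limits for illustration; adjust to the actual robot.
N_ACTUATORS = 7
CTRL_LIMIT = 3.15


def policy(obs, step, env):
    """Stand-in policy; replace with an import from the task module."""
    return [0.1 * math.sin(0.01 * step)] * N_ACTUATORS


def test_ctrl_has_one_value_per_actuator():
    ctrl = policy(obs=None, step=0, env=None)
    assert len(ctrl) == N_ACTUATORS


def test_ctrl_is_finite_and_bounded():
    for step in range(0, 1000, 50):
        ctrl = policy(obs=None, step=step, env=None)
        assert all(math.isfinite(u) and abs(u) <= CTRL_LIMIT for u in ctrl)


def test_policy_is_deterministic():
    # Same inputs should give the same command for an open-loop policy.
    assert policy(None, 10, None) == policy(None, 10, None)
```

Tests like these need no renderer, so they are suitable for headless CI.
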
Outputs under `artifacts/<task>/`: `rollout.mp4`, `metrics.txt`, `joints.csv`, `rollout.vlm_transcript.json` (when the VLM transcriber is enabled), and optional custom analyzer JSON files.

`joints.csv` logs joint `qpos` (and free-joint pose columns), followed by two columns per actuator: `target_*` (the value the policy returned for that step) and `ctrl_*` (`data.ctrl` after the physics step, i.e. the command actually applied). Floats are rounded to five decimal places to keep files small.
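
One simple check on `joints.csv` during the feedback loop is to compare each `target_*` column with its `ctrl_*` counterpart. The sketch below computes the largest absolute gap per actuator. The actuator suffixes, the `step` column, and the sample rows are made up for illustration; only the `target_` and `ctrl_` prefixes come from the description above.

```python
import csv
import io


def max_tracking_gap(csv_file):
    """Return {actuator: max |target - ctrl|} over all rows of a joints log."""
    gaps = {}
    for row in csv.DictReader(csv_file):
        for col, value in row.items():
            if not col.startswith("target_"):
                continue
            name = col[len("target_"):]
            ctrl_col = "ctrl_" + name
            if ctrl_col not in row:
                continue  # no matching applied-command column
            gap = abs(float(value) - float(row[ctrl_col]))
            gaps[name] = max(gaps.get(name, 0.0), gap)
    return gaps


# Made-up sample rows, not real harness output.
SAMPLE = """step,target_shoulder_pan,ctrl_shoulder_pan,target_gripper,ctrl_gripper
0,0.10000,0.10000,300.00000,255.00000
1,0.20000,0.20000,100.00000,100.00000
"""

gaps = max_tracking_gap(io.StringIO(SAMPLE))
print(gaps)
```

For a real run, pass `open("artifacts/<task>/joints.csv", newline="")` instead of the in-memory sample. A nonzero gap may indicate that the command was altered between the policy and the simulator, for example by clipping.
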

The robot uses vendored MuJoCo Menagerie `universal_robots_ur5e` OBJ meshes under `src/robot_manipulation_sim/mjcf/menagerie_ur5e/` (see `NOTICE.txt` and `MENAGERIE_LICENSE` there). Default scene: `ur5e_two_finger_scene.xml`.