PhyHarness - A Dev Harness for Physical AI

Python harness for agentic development of embodied control policies using perception-less motion control, computer vision, and motion generation models.

Project website: ikatsov.github.io/phy-harness

The current development workflow is:

  ┌──────────────────────────────────────────────────────────────────┐
  │ INPUTS                                                           │
  │  Start with policies/impl/<task>/<task>.yaml (task spec)         │
  └───────────────────────────────┬──────────────────────────────────┘
                                  ▼
  ┌──────────────────────────────────────────────────────────────────┐
  │ DESIGN & IMPLEMENT                                               │
  │  Coding model implements/updates policy, analyzers, and tests    │
  └───────────────────────────────┬──────────────────────────────────┘
                                  ▼
  ┌──────────────────────────────────────────────────────────────────┐
  │ SIMULATE                                                         │
  │  simulate_policy.py generates artifacts/<task>/ with:            │
  │  - augmented rollout video (overlays)                            │
  │  - joints.csv                                                    │
  │  - VLM transcript JSON (if enabled)                              │
  │  - custom analyzer outputs                                       │
  └───────────────────────────────┬──────────────────────────────────┘
                                  ▼
  ┌──────────────────────────────────────────────────────────────────┐
  │ CODING FEEDBACK LOOP                                             │
  │  Coding model analyzes simulation outputs (video frames, logs,   │
  │  transcripts, analyzer results), adds focused analyzers/tests,   │
  │  and loops back to DESIGN & IMPLEMENT.                           │
  └──────────────────────────────────────────────────────────────────┘

Setup

python3 -m venv .venv
source .venv/bin/activate
pip install -e ".[dev]"

For rollout validation with Gemini / VLM analyzers (and .env loading):

pip install -e ".[vlm]"
cp .env.example .env   # set GEMINI_API_KEY; do not commit .env

MuJoCo renders offscreen RGB (mujoco.Renderer) through an OpenGL backend. On a normal desktop, the default platform GL works. In headless CI, either disable rendering with UR5GripperEnv(enable_rgb=False) or configure OSMesa/EGL as described in the MuJoCo rendering docs. Run full RGB rollouts (simulate_policy.py) only in a terminal where rendering already works.
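
For headless machines, the MuJoCo Python bindings choose their GL backend from the MUJOCO_GL environment variable (glfw, egl, or osmesa), which they read when mujoco is first imported. The sketch below shows one way to set it from Python. `configure_headless_gl` is a hypothetical helper, not part of this repo, and exporting the variable in the shell works just as well.

```python
import os


def configure_headless_gl(backend: str = "egl") -> str:
    """Select MuJoCo's offscreen GL backend through MUJOCO_GL.

    Hypothetical helper (not part of this repo). Call it before the first
    `import mujoco`, since the bindings read MUJOCO_GL at import time.
    """
    if backend not in {"egl", "osmesa", "glfw"}:
        raise ValueError(f"unsupported MUJOCO_GL backend: {backend!r}")
    # Keep a value the user already exported in the shell.
    return os.environ.setdefault("MUJOCO_GL", backend)


chosen = configure_headless_gl("egl")
print(f"MUJOCO_GL={chosen}")
```

EGL needs a GPU driver that supports it; osmesa is the software fallback and is usually slower.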

Optional: inverse kinematics (`mink`)

Differential IK on the same simulator MJCF uses mink (MuJoCo-based) plus a QP backend:

pip install -e ".[ik]"

Task workflow

Inputs

Create a task bundle at policies/impl/<task>/:

| File | Purpose |
| --- | --- |
| `<task>.yaml` | `task_spec.inline` (intent), optional `policy_module` |
| `<task>.py` | `policy(obs, step, env) -> ctrl` (and optional `reset`) |
| `<analyzer_type>.py` | Optional policy-specific analyzer (`build(params) -> analyzer`) |
| `tests/test_<task>.py` | Headless unit tests (paired with the task) |

Canonical loop (per task)

From the repo root with the venv active:

# 1 — Headless unit tests
pytest -q tests/test_<task>.py

# 2 — Simulation rollout (writes artifacts/<task>/)
python scripts/simulate_policy.py --config policies/simulate_policy.example.yaml \
  policies/impl/<task>/<task>.py --run-dir artifacts/<task>

# 3 — Coding agent analyzes the simulation outputs and makes changes

Design & implement: done in policy/analyzer/test files for the task.

Simulate: produces augmented rollout video, joints log, and analyzer artifacts.

Coding feedback loop: analyze artifacts (including extracted video frames and analyzer JSON), implement fixes, add focused tests/analyzers, and repeat.

Outputs under artifacts/<task>/: rollout.mp4, metrics.txt, joints.csv, rollout.vlm_transcript.json (when VLM transcriber is enabled), and optional custom analyzer JSON files.
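
A quick way to confirm that a rollout produced the files listed above is to check the run directory before analyzing anything. This is a sketch: `check_run_dir` and `missing_required` are hypothetical helpers, and treating the VLM transcript as the only optional file is an assumption based on the list above.

```python
import tempfile
from pathlib import Path
from typing import Dict, List

# File names taken from the outputs list; the transcript is only written
# when the VLM transcriber is enabled, so it is treated as optional.
REQUIRED = ("rollout.mp4", "metrics.txt", "joints.csv")
OPTIONAL = ("rollout.vlm_transcript.json",)


def check_run_dir(run_dir: Path) -> Dict[str, bool]:
    """Map each expected artifact name to whether it exists in run_dir."""
    return {name: (run_dir / name).is_file() for name in REQUIRED + OPTIONAL}


def missing_required(run_dir: Path) -> List[str]:
    """List the required artifacts that are absent from run_dir."""
    status = check_run_dir(run_dir)
    return [name for name in REQUIRED if not status[name]]


# Demo against a throwaway directory that contains only joints.csv.
with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp)
    (demo_dir / "joints.csv").write_text("placeholder\n")
    missing = missing_required(demo_dir)
print(missing)
```

In practice, pass the same path you gave to --run-dir, for example `Path("artifacts") / "<task>"` with the task name filled in.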

joints.csv logs joint qpos (including free-joint pose columns), followed by two columns per actuator: target_* (that actuator's component of the vector the policy returned for the step) and ctrl_* (data.ctrl after the physics step, i.e. the command actually applied). Floats are rounded to five decimal places to keep the file small.
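
Comparing each target_* column with its ctrl_* column shows where the applied command differed from what the policy requested. The sketch below computes the largest gap per actuator using only the standard library. It assumes every target_<name> column has a ctrl_<name> column with the same suffix, and the sample rows and column names are made up for illustration.

```python
import csv
import io
from typing import Dict, TextIO


def max_tracking_gap(csv_file: TextIO) -> Dict[str, float]:
    """Return the largest |target_* - ctrl_*| per actuator.

    Assumes matching suffixes between target_<name> and ctrl_<name>;
    the actual suffixes depend on the scene's actuators.
    """
    gaps: Dict[str, float] = {}
    for row in csv.DictReader(csv_file):
        for column, value in row.items():
            if column is None or not column.startswith("target_"):
                continue
            name = column[len("target_"):]
            ctrl = row.get(f"ctrl_{name}")
            if not ctrl or not value:
                continue  # skip missing or empty cells
            gap = abs(float(value) - float(ctrl))
            gaps[name] = max(gaps.get(name, 0.0), gap)
    return gaps


# Made-up sample rows, only to show the assumed column shape.
sample = io.StringIO(
    "step,target_shoulder,ctrl_shoulder,target_gripper,ctrl_gripper\n"
    "0,0.10000,0.10000,0.00000,0.00000\n"
    "1,0.50000,0.25000,1.00000,1.00000\n"
)
gaps = max_tracking_gap(sample)
print(gaps)
```

For a real run, open the file with `open(path, newline="")` and pass the handle to `max_tracking_gap`.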

Third-party assets

The robot uses vendored MuJoCo Menagerie universal_robots_ur5e OBJ meshes under src/robot_manipulation_sim/mjcf/menagerie_ur5e/ (see NOTICE.txt and MENAGERIE_LICENSE there). Default scene: ur5e_two_finger_scene.xml.
