Feature/dmcontrol env by mreso · Pull Request #319 · meta-pytorch/OpenEnv

mreso · 2026-01-22T00:37:37Z

Summary

Adds dm_control_env, a new OpenEnv environment wrapping https://github.com/google-deepmind/dm_control for MuJoCo-based continuous control tasks.

Key features:

Supports 40+ environments across 18 domains (cartpole, walker, humanoid, cheetah, hopper, quadruped, etc.)
Dynamic environment switching via reset(domain_name="...", task_name="...")
Optional visual observations (base64-encoded PNG rendering)
macOS compatibility with threading-safe async methods
Full client-server architecture following OpenEnv patterns
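
For anyone reviewing the visual observations, here is a minimal sketch of turning a base64-encoded PNG frame back into bytes on the client side, using only the standard library. The helper name `decode_png_frame` is hypothetical and is not part of this PR. The sketch makes no assumption about which field of DMControlObservation carries the frame; it only shows the decode step and a signature check.

```python
import base64

# Every PNG file starts with this fixed 8-byte signature.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def decode_png_frame(encoded: str) -> bytes:
    """Decode a base64 string and check that it looks like a PNG.

    Hypothetical helper for illustration; not part of dm_control_env.
    Only the signature is checked, not the full image structure.
    """
    raw = base64.b64decode(encoded, validate=True)
    if not raw.startswith(PNG_SIGNATURE):
        raise ValueError("payload is not a PNG image")
    return raw
```

The returned bytes can be written to a .png file or handed to an image library for display.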

Files added:

envs/dm_control_env/client.py - WebSocket client with from_direct() factory
envs/dm_control_env/models.py - Pydantic models (DMControlAction, DMControlObservation, DMControlState)
envs/dm_control_env/server/ - FastAPI server and Environment implementation
envs/dm_control_env/examples/ - Control examples for cartpole, hopper, and quadruped
envs/dm_control_env/README.md - Documentation with usage examples

Type of Change

Alignment Checklist

Before submitting, verify:

I have read .claude/docs/PRINCIPLES.md and this PR aligns with our principles
I have checked .claude/docs/INVARIANTS.md and no invariants are violated
I have run /pre-submit-pr (or bash .claude/hooks/lint.sh and tests) and addressed all issues

RFC Status

Not required (bug fix, docs, minor refactoring)
RFC exists: #___
RFC needed (will create before merge)

Test Plan

cd envs/dm_control_env

Test client-server mode:
PYTHONPATH=src:envs uvicorn envs.dm_control_env.server.app:app --port 8765

In another terminal:

python -c "from dm_control_env import DMControlEnv; c = DMControlEnv('http://localhost:8765'); print(c.reset())"

Claude Code Review

Alignment Review Report

Automated Checks

Lint: PASS - 77 files already formatted
Debug code: CLEAN - No debugger statements found in dm_control_env

Open RFCs Context
┌───────────────────────┬─────────────┬──────────────────────────────────────────────┐
│ RFC │ Status │ Relevance │
├───────────────────────┼─────────────┼──────────────────────────────────────────────┤
│ 000-project-phases.md │ Implemented │ Design principles - foundational │
├───────────────────────┼─────────────┼──────────────────────────────────────────────┤
│ 001-abstractions.md │ Implemented │ Environment/Client abstractions │
├───────────────────────┼─────────────┼──────────────────────────────────────────────┤
│ 002-env-spec.md │ Implemented │ Environment specification │
├───────────────────────┼─────────────┼──────────────────────────────────────────────┤
│ 003-mcp-support.md │ Implemented │ MCP integration (not used by dm_control_env) │
└───────────────────────┴─────────────┴──────────────────────────────────────────────┘
No Draft or In Review RFCs that would conflict with dm_control_env.

Tier 1: Fixes Required

None identified. The dm_control_env code passes all automated checks.

Tier 2: Alignment Discussion

Principle Conflicts

None identified. The dm_control_env implementation follows OpenEnv principles:
┌────────────────────────────┬────────┬────────────────────────────────────────────────────────────────────┐
│ Principle │ Status │ Evidence │
├────────────────────────────┼────────┼────────────────────────────────────────────────────────────────────┤
│ Gymnasium-style API │ ✅ │ Uses reset(), step(), state │
├────────────────────────────┼────────┼────────────────────────────────────────────────────────────────────┤
│ Container isolation │ ✅ │ Has server/Dockerfile │
├────────────────────────────┼────────┼────────────────────────────────────────────────────────────────────┤
│ Type safety with generics │ ✅ │ Environment[DMControlAction, DMControlObservation, DMControlState] │
├────────────────────────────┼────────┼────────────────────────────────────────────────────────────────────┤
│ Pydantic serialization │ ✅ │ All models extend Action, Observation, State │
├────────────────────────────┼────────┼────────────────────────────────────────────────────────────────────┤
│ Rewards inside environment │ ✅ │ Reward from dm_control passed through, not computed externally │
├────────────────────────────┼────────┼────────────────────────────────────────────────────────────────────┤
│ Client-server separation │ ✅ │ client.py does not import from server/ │
└────────────────────────────┴────────┴────────────────────────────────────────────────────────────────────┘
RFC Conflicts

None identified. The dm_control_env is a standard environment implementation that:

Does not introduce new core APIs
Does not change existing interfaces
Follows established patterns from echo_env
Does not require MCP support (uses standard Gym-like API only)

Per RFC README: "You generally don't need an RFC for new example environments (unless they introduce new patterns)." dm_control_env follows existing patterns.

Invariant Check
┌──────────────────────────┬────────┬────────────────────────────────────┐
│ Invariant │ Status │ Notes │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Gymnasium API signatures │ ✅ │ Standard reset(), step(), state │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Generic type safety │ ✅ │ Proper generic types used │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Pydantic serialization │ ✅ │ All wire types are Pydantic models │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Agent isolation │ ✅ │ No MCP tools exposing reset/step │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Container isolation │ ✅ │ Dockerfile provided │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Client-server separation │ ✅ │ No cross-imports │
├──────────────────────────┼────────┼────────────────────────────────────┤
│ Rewards in environment │ ✅ │ Uses dm_control's native reward │
└──────────────────────────┴────────┴────────────────────────────────────┘
Summary

0 mechanical issues to fix
0 alignment points for human review
0 RFC conflicts to discuss

Verdict: READY FOR REVIEW - The dm_control_env follows all OpenEnv principles and invariants. It is a standard environment implementation without any architectural deviations.

Add quadruped example; Added hopper examples; Align example cli; Fix libglx installation in docker and enable exiting with ctrl + c inside docker; Add screenshots; Increase random forces

Rename directory from dmcontrol_env to dm_control_env
Update all internal import paths and module references
Add screenshots to README (cartpole.png, quadruped.png)
Update examples with consistent CLI args (--visual, --headless, --task)
Increase random force magnitude in hopper/quadruped examples

greptile-apps · 2026-01-22T00:42:40Z

Greptile Summary

This PR adds dm_control_env, a new OpenEnv environment wrapping Google DeepMind's dm_control library to provide access to 40+ MuJoCo-based continuous control tasks across 18 domains (cartpole, walker, humanoid, cheetah, hopper, quadruped, etc.).

Key Features Implemented:

Full client-server architecture following OpenEnv patterns with WebSocket communication
Dynamic environment switching via reset(domain_name="...", task_name="...") without restarting the server
Optional visual observations (base64-encoded PNG rendering) controlled by render flag
macOS compatibility with threading-safe async methods (MuJoCo crashes when run in background threads on macOS, so synchronous fallback is used)
Proper reward passthrough from dm_control's native reward computation (not externally computed)
Type-safe generics with Pydantic models extending OpenEnv base types
Concurrent session support enabled (SUPPORTS_CONCURRENT_SESSIONS = True)
Comprehensive documentation with three interactive examples (cartpole, hopper, quadruped) demonstrating OpenEnv step/observation pattern
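
The macOS workaround described in the list above can be pictured with a small sketch: run the blocking call inline on macOS and hand it to a worker thread elsewhere. This is an illustration under stated assumptions, not the code from this PR. `blocking_step`, `step_async`, and the `force_sync` flag are hypothetical names, and the stand-in function does no physics.

```python
import asyncio
import platform
from typing import Optional, Sequence


def blocking_step(action: Sequence[float]) -> dict:
    """Stand-in for a blocking physics step (hypothetical, no MuJoCo involved)."""
    return {"reward": 0.0, "action": list(action)}


async def step_async(action: Sequence[float], force_sync: Optional[bool] = None) -> dict:
    """Run the blocking step inline on macOS, in a worker thread elsewhere.

    force_sync overrides the platform check; it exists only so both paths
    can be exercised on any machine.
    """
    run_inline = platform.system() == "Darwin" if force_sync is None else force_sync
    if run_inline:
        # Synchronous fallback: stay on the calling thread.
        return blocking_step(action)
    # Other platforms: keep the event loop free while the step runs.
    return await asyncio.to_thread(blocking_step, action)
```

The inline path blocks the event loop for the duration of the step, which is the trade-off accepted for not calling into the simulator from a background thread.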

Implementation Quality:

Follows all OpenEnv principles from PRINCIPLES.md (Gymnasium API, container isolation, type safety, rewards in environment)
No invariant violations found - proper client-server separation, no MCP tools exposing reset/step to agents
Extensive fallback import handling for flexible deployment contexts
Proper error handling with helpful macOS-specific guidance for MuJoCo/OpenGL issues
Multi-stage Dockerfile with appropriate MuJoCo/OpenGL dependencies

Additional Improvements:

Applied exec to CMD in Dockerfiles across multiple environments (echo_env, repl_env, textarena_env, unity_env, websearch_env) for proper SIGINT/SIGTERM signal handling

Confidence Score: 5/5

This PR is safe to merge with no identified issues
Score reflects exemplary adherence to all OpenEnv principles and invariants, comprehensive testing as evidenced by the automated alignment review, proper handling of platform-specific issues (macOS threading), clean architecture with no client-server boundary violations, and high-quality documentation. The implementation follows established patterns from existing environments and introduces no new architectural concerns.
No files require special attention

Important Files Changed

Filename	Overview
envs/dm_control_env/client.py	WebSocket client implementing DMControlEnv with proper type safety, flexible import handling, and from_direct() factory for embedded server - follows OpenEnv patterns correctly
envs/dm_control_env/models.py	Pydantic models for Action/Observation/State extending core OpenEnv types, includes comprehensive list of 40+ available environments
envs/dm_control_env/server/dm_control_environment.py	Environment implementation wrapping dm_control.suite with dynamic environment switching, macOS threading workarounds, and proper reward passthrough from dm_control
envs/dm_control_env/server/app.py	FastAPI application using create_app factory with concurrent session support enabled
envs/dm_control_env/server/Dockerfile	Multi-stage Docker build with MuJoCo/OpenGL dependencies, proper exec usage for signal handling
envs/dm_control_env/pyproject.toml	Package configuration with dm_control and mujoco dependencies, optional interactive/dev dependencies

Sequence Diagram

sequenceDiagram
    participant Client as DMControlEnv Client
    participant WS as WebSocket Connection
    participant Server as FastAPI Server
    participant Env as DMControlEnvironment
    participant DMC as dm_control.suite
    
    Note over Client,DMC: Initialization
    Client->>Server: HTTP GET /health
    Server-->>Client: 200 OK
    Client->>Server: WebSocket Connect
    Server->>Env: Create Environment Instance
    Env->>DMC: Load domain/task
    DMC-->>Env: Environment ready
    Server-->>Client: WebSocket Connected
    
    Note over Client,DMC: Reset Episode
    Client->>WS: reset(domain_name, task_name, render=True)
    WS->>Server: WebSocket message
    Server->>Env: reset_async()
    Env->>DMC: reset()
    DMC-->>Env: TimeStep (observations, reward)
    Env->>DMC: render() [if render=True]
    DMC-->>Env: RGB pixels
    Env-->>Server: DMControlObservation (obs, pixels, reward, done)
    Server-->>WS: JSON response
    WS-->>Client: StepResult[DMControlObservation]
    
    Note over Client,DMC: Step Loop
    loop Until done
        Client->>WS: step(DMControlAction)
        WS->>Server: WebSocket message
        Server->>Env: step_async(action)
        Env->>DMC: step(action_array)
        DMC-->>Env: TimeStep (observations, reward, done)
        Env->>DMC: render() [if render enabled]
        DMC-->>Env: RGB pixels
        Env-->>Server: DMControlObservation
        Server-->>WS: JSON response
        WS-->>Client: StepResult[DMControlObservation]
    end
    
    Note over Client,DMC: State Query
    Client->>Server: HTTP GET /state
    Server->>Env: state property
    Env-->>Server: DMControlState (domain, task, specs)
    Server-->>Client: JSON response
    
    Note over Client,DMC: Cleanup
    Client->>Server: WebSocket Close
    Server->>Env: close()
    Env->>DMC: close()
    DMC-->>Env: Cleanup complete
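
The reset path in the diagram, where reset may carry a domain_name and task_name, can be sketched as a wrapper that reloads its backend only when the requested pair changes. This is a toy illustration, not the implementation in dm_control_environment.py. `SwitchableEnv`, its `loader` argument, and `load_count` are hypothetical names. With dm_control installed, the loader could plausibly be `lambda d, t: suite.load(domain_name=d, task_name=t)`.

```python
from typing import Any, Callable, Optional, Tuple


class SwitchableEnv:
    """Toy wrapper that reloads its backend only when the task changes."""

    def __init__(self, loader: Callable[[str, str], Any],
                 domain_name: str = "cartpole", task_name: str = "balance") -> None:
        self._loader = loader  # hypothetical factory: (domain, task) -> env
        self._requested: Tuple[str, str] = (domain_name, task_name)
        self._loaded: Optional[Tuple[str, str]] = None
        self._env: Any = None
        self.load_count = 0  # number of backend loads, for inspection

    def reset(self, domain_name: Optional[str] = None,
              task_name: Optional[str] = None) -> Tuple[str, str]:
        # Arguments left as None keep the previously requested value.
        key = (domain_name or self._requested[0], task_name or self._requested[1])
        if key != self._loaded:
            # Only pay the loading cost when the pair actually changed.
            self._env = self._loader(*key)
            self._loaded = key
            self.load_count += 1
        self._requested = key
        return key
```

One caveat of this sketch: changing only the domain keeps the previous task name, which may not exist in the new domain, so a real implementation would need to validate the pair.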

greptile-apps · 2026-01-22T00:42:41Z

Greptile found no issues!

From now on, if a review finishes and we haven't found any issues, we will not post anything, but you can confirm that we reviewed your changes in the status check section.

This feature can be toggled off in your Code Review Settings by deselecting "Create a status check for each PR".

zkwentz · 2026-01-22T10:54:54Z

Nice! Will take a closer look on desktop in a few hours.

burtenshaw · 2026-02-05T13:44:56Z

Hey @mreso. Thanks for this, and sorry for going quiet on it. Some high-level changes, please:

Can you also deploy it to the HF Hub?
Then update the environments page in the docs.
I would remove all of the exec uvicorn changes in the other envs and open a separate PR for those.

Thanks.

mreso added 3 commits January 21, 2026 14:59

Adds initial version of dm_control_env

511f403

Add quadruped example; Added hopper examples; Align example cli; Fix libglx installation in docker and enable exiting with ctrl + c inside docker; Add screenshots; Increase random forces

Fix formatting in dm_control_env files

3e8397e

meta-cla bot added the CLA Signed label Jan 22, 2026

Merge branch 'main' into feature/dmcontrol_env

fbeb1b4

mreso wants to merge 4 commits into meta-pytorch:main from mreso:feature/dmcontrol_env
