nick1231321/maintenance-bench

title: My Env Environment Server
emoji: 💿
colorFrom: yellow
colorTo: green
sdk: docker
pinned: false
app_port: 8000
base_path: /web
tags: openenv

Maintenance-Bench

An OpenEnv-based reinforcement learning environment for industrial maintenance diagnostics.

This repository provides the initial implementation of a diagnostic decision-making environment designed to evaluate AI research agents on industrial troubleshooting tasks.

The environment is inspired by real-world industrial maintenance logs and is part of the Maintenance-Bench benchmark.


Overview

The environment simulates an industrial diagnostic process where an agent must decide which diagnostic steps to perform to identify a fault.

Each episode corresponds to a maintenance case scenario stored as a JSON file.

A scenario contains:

  • machine metadata
  • symptoms
  • possible diagnostic steps
  • observations for each step
  • rewards
  • operational costs
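
To make the list above concrete, here is a minimal sketch of what a scenario file could look like and how it might be sanity-checked before use. Every key name (`machine`, `symptoms`, `steps`, `action`, `observation`, `reward`, `cost`) and the machine metadata values are assumptions made for illustration; the actual schema of the files in data/ may differ. The symptom, action, and observation texts are borrowed from the example interaction in this README.

```python
import json

# Hypothetical scenario layout: every key name and the machine metadata
# below are illustrative assumptions, not the repository's actual schema.
SCENARIO_JSON = """
{
  "machine": {"type": "placeholder-machine-type", "id": "placeholder-id"},
  "symptoms": ["High Bearing temperature."],
  "steps": [
    {
      "action": "Monitored bearing temperatures at DE and NDE using RTD.",
      "observation": "Overload relay found tripped at the time of breakdown.",
      "reward": 1,
      "cost": 5
    }
  ]
}
"""

scenario = json.loads(SCENARIO_JSON)


def validate_scenario(s):
    """Return a list of problems found in a scenario dict (empty if none)."""
    problems = []
    for key in ("machine", "symptoms", "steps"):
        if key not in s:
            problems.append(f"missing key: {key}")
    for i, step in enumerate(s.get("steps", [])):
        for field in ("action", "observation", "reward", "cost"):
            if field not in step:
                problems.append(f"step {i} missing field: {field}")
    return problems
```

A check like `validate_scenario(scenario)` returning an empty list is a cheap way to catch malformed cases before an episode starts.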

The agent interacts with the environment through a step-by-step diagnostic process, receiving observations, rewards, accumulated costs, and the current health status of the system.


Environment API

The environment follows the OpenEnv interface:

  • reset() – loads a new diagnostic scenario and initializes an episode
  • step(action) – performs a diagnostic action and returns the resulting observation together with the reward, accumulated cost, health status, and step history

Episode Flow

  1. The environment loads a diagnostic scenario from the data/ directory.
  2. The agent receives the initial symptom description.
  3. The agent chooses a diagnostic action.
  4. The environment returns:
       • the observation
       • the reward
       • the accumulated cost
       • the health status of the system
       • the history of performed steps
  5. Steps 3 and 4 repeat until the environment reports that the episode is done.
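
The flow above can be sketched as a small in-memory stand-in. This is not the repository's environment class: `ToyDiagnosticEnv`, the scenario keys, and the `resolves_fault` flag are hypothetical names chosen for illustration, and the real environment loads its scenarios from data/ rather than from a literal dict.

```python
from dataclasses import dataclass, field

# Hypothetical scenario dict; the key names are assumptions for illustration.
SCENARIO = {
    "symptoms": ["High Bearing temperature."],
    "steps": [
        {
            "action": "Monitored bearing temperatures at DE and NDE using RTD.",
            "observation": "Overload relay found tripped at the time of breakdown.",
            "reward": 1,
            "cost": 5,
        }
    ],
}


@dataclass
class ToyDiagnosticEnv:
    """In-memory stand-in for the environment (not the repository's class)."""

    scenario: dict
    total_cost: int = 0
    history: list = field(default_factory=list)

    def reset(self):
        # Start a fresh episode and hand back the initial symptom description.
        self.total_cost = 0
        self.history = []
        return {"observation": self.scenario["symptoms"][0], "done": False}

    def step(self, action):
        # Actions look like "diagnose:<index>", as in the README example.
        index = int(action.split(":", 1)[1])
        chosen = self.scenario["steps"][index]
        self.total_cost += chosen["cost"]
        self.history.append(
            {"action": chosen["action"], "observation": chosen["observation"]}
        )
        return {
            "observation": chosen["observation"],
            "reward": chosen["reward"],
            "total_cost": self.total_cost,
            # "resolves_fault" is a made-up flag standing in for health status.
            "done": chosen.get("resolves_fault", False),
            "history": list(self.history),
        }


env = ToyDiagnosticEnv(SCENARIO)
first = env.reset()
result = env.step("diagnose:0")
```

With this toy scenario, `result` carries the same kinds of fields listed in step 4: an observation, a reward, the running cost, a done flag, and the history so far.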

Example Interaction

Example sequence of actions:

reset()

Agent observation:

"High Bearing temperature."

Agent performs diagnostic step:

action = "diagnose:0"

Environment response:

Observation:
"Overload relay found tripped at the time of breakdown."

Reward: 1
Total Cost: 5
done: False  # equivalent to health status 0

History:
[
  {
    "action": "Monitored bearing temperatures at DE and NDE using RTD.",
    "observation": "Overload relay found tripped at the time of breakdown."
  }
]
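
The action string in the example has the form `diagnose:0`. A minimal sketch of how an agent or wrapper might validate such strings before sending them is shown below. It assumes the format is `<kind>:<index>` and that the index selects one of the scenario's diagnostic steps; `parse_action` is a hypothetical helper, not a function provided by this repository.

```python
from typing import Tuple


def parse_action(action: str) -> Tuple[str, int]:
    """Split an action such as "diagnose:0" into its kind and step index.

    Assumes the "<kind>:<index>" format seen in the example interaction.
    """
    kind, sep, raw_index = action.partition(":")
    if not sep or not kind:
        raise ValueError(f"expected '<kind>:<index>', got {action!r}")
    index = int(raw_index)  # raises ValueError for non-numeric text
    if index < 0:
        raise ValueError(f"step index must be non-negative, got {index}")
    return kind, index
```

Rejecting malformed strings on the agent side keeps invalid actions from consuming an environment step.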

Project Goal

The goal of this project is to provide a benchmark environment for evaluating deep research agents on industrial maintenance diagnostic tasks.

This environment supports research in:

  • autonomous troubleshooting agents
  • cost-aware reasoning
  • industrial decision making
  • long-horizon diagnostic planning


Running the Environment (OpenEnv + Docker)

The environment is built using OpenEnv, which automatically packages the environment into a Docker image that exposes an HTTP API.

Build the Environment

OpenEnv builds the environment and creates a Docker image:

openenv build

This process:

  • packages the environment
  • creates a Docker image
  • prepares the HTTP server interface

Run the Environment

After building, run the environment container:

docker run -p 8000:8000 my_env_env:latest

This starts the environment server and exposes it at:

http://localhost:8000

Available Endpoints

Once running, the following endpoints are available:

POST /reset
POST /step
GET  /state
GET  /health

You can interact with the environment using:

  • OpenEnv clients
  • HTTP requests
  • Swagger UI
  • custom research agents
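
For plain HTTP access, a minimal sketch using only the Python standard library is shown below. The endpoint paths and the base URL come from this README; the JSON request bodies (an empty object for /reset and `{"action": "diagnose:0"}` for /step) are assumptions, so check the server's API schema for the exact shape it expects. `build_request`, `call`, and `demo` are hypothetical helper names.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"


def build_request(path, payload=None, base_url=BASE_URL):
    """Build a JSON POST request when a payload is given, a GET otherwise."""
    if payload is None:
        return urllib.request.Request(base_url + path, method="GET")
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        base_url + path,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def call(path, payload=None):
    """Send the request and decode the JSON response (needs a running server)."""
    with urllib.request.urlopen(build_request(path, payload), timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))


def demo():
    """Walk through one reset and one step against a local container."""
    print(call("/health"))
    print(call("/reset", {}))  # assumed: empty JSON body
    print(call("/step", {"action": "diagnose:0"}))  # assumed body shape
    print(call("/state"))
```

Calling `demo()` only makes sense once the container from the previous section is running; the request builder itself can be exercised without a server.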

License

This project builds on OpenEnv and follows the license terms of the respective components.
