AIDE ML — The Machine Learning Engineering Agent

LLM‑driven agent that writes, evaluates & improves machine‑learning code.

What Is AIDE ML?

AIDE ML is the open‑source “reference build” of the AIDE algorithm, a tree‑search agent that autonomously drafts, debugs and benchmarks code until a user‑defined metric is maximised (or minimised). It ships as a research‑friendly Python package with batteries‑included utilities (CLI, visualisation, config presets) so that academics and engineer‑researchers can replicate the paper, test new ideas, or prototype ML pipelines.

Layer	Description	Where to find it
AIDE algorithm	LLM‑guided agentic tree search in the space of code.	Described in our paper.
AIDE ML repo (this repo)	Lean implementation for experimentation & extension.	`pip install aideml`
Weco product	The platform generalizes AIDE's capabilities to broader code optimization scenarios, providing experiment tracking and enhanced user control.	weco.ai

Who should use it?

Agent‑architecture researchers – swap in new search heuristics, evaluators or LLM back‑ends.
ML practitioners – quickly build high‑performance ML pipelines from a dataset.

Key Capabilities

Natural‑language task specification – Point the agent at a dataset and describe the goal and metric in plain English. No YAML grids or bespoke wrappers. Example: `aide data_dir=… goal="Predict churn" eval="AUROC"`
Iterative agentic tree search – Each Python script becomes a node in a solution tree; LLM‑generated patches spawn children; metric feedback prunes and guides the search. OpenAI’s MLE‑Bench (75 Kaggle competitions) found that AIDE’s tree search wins 4× more medals than the best linear agent (OpenHands).
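
The tree‑search idea can be illustrated with a minimal sketch. This is not AIDE's actual implementation: the names `Node`, `tree_search`, `draft`, `improve` and `evaluate` are hypothetical, the three callables stand in for LLM calls and script execution, the sketch assumes a higher metric is better, and the "always patch the best node so far" policy is a deliberate simplification of the real search.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """One candidate script in the solution tree (hypothetical structure)."""
    code: str
    metric: Optional[float] = None  # None means the script failed to produce a score
    parent: Optional["Node"] = None
    children: List["Node"] = field(default_factory=list)


def tree_search(
    draft: Callable[[], str],
    improve: Callable[[str], str],
    evaluate: Callable[[str], Optional[float]],
    num_drafts: int = 2,
    steps: int = 10,
) -> Node:
    """Greedy sketch: create a few root drafts, then keep patching the best node."""
    nodes: List[Node] = []
    for _ in range(steps):
        if len(nodes) < num_drafts:
            # Start a new root: a fresh draft with no parent.
            node = Node(code=draft())
        else:
            # Pick the best scored node; fall back to the latest node if none scored.
            scored = [n for n in nodes if n.metric is not None]
            parent = max(scored, key=lambda n: n.metric) if scored else nodes[-1]
            node = Node(code=improve(parent.code), parent=parent)
            parent.children.append(node)
        node.metric = evaluate(node.code)
        nodes.append(node)

    scored = [n for n in nodes if n.metric is not None]
    if not scored:
        raise RuntimeError("no candidate produced a valid metric")
    return max(scored, key=lambda n: n.metric)
```

To try it without an LLM, pass toy callables such as `draft=lambda: "1"`, `improve=lambda c: str(int(c) + 1)` and `evaluate=lambda c: float(int(c))`; following `parent` links from the returned node recovers the chain of patches that led to it.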

Utility features provided by this repo

HTML visualiser – inspect the full solution tree and code attached to each node.
Streamlit UI – prototype ML solutions.
Model‑neutral plumbing – OpenAI, Anthropic, Gemini, or any local LLM that speaks the OpenAI API.

Featured Research built on/with AIDE

Institution	Paper / Project Name	Links
OpenAI	MLE-bench: Evaluating Machine-Learning Agents on Machine-Learning Engineering	Paper, GitHub
METR	RE-Bench: Evaluating frontier AI R&D capabilities of language-model agents against human experts	Paper, GitHub
Sakana AI	The AI Scientist-v2: Workshop-Level Automated Scientific Discovery via Agentic Tree Search	Paper, GitHub
Meta	The Automated LLM Speedrunning Benchmark: Reproducing NanoGPT Improvements	Paper, GitHub
Meta	AI Research Agents for Machine Learning: Search, Exploration, and Generalization in MLE-bench	Paper, GitHub
SJTU	ML-Master: Towards AI-for-AI via Integration of Exploration and Reasoning	Paper, GitHub

Know another public project that cites or forks AIDE?
Open a PR and add it to the table!

How to Use AIDE ML

Quick Start

# 1  Install
pip install -U aideml

# 2  Set an LLM key
export OPENAI_API_KEY=<your‑key>  # https://platform.openai.com/api-keys

# 3  Run an optimisation
aide data_dir="example_tasks/house_prices" \
     goal="Predict the sales price for each house" \
     eval="RMSE between log‑prices"

After the run finishes you’ll find:

logs/<id>/best_solution.py – best code found
logs/<id>/tree_plot.html – click to inspect the solution tree
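
When several runs accumulate under `logs/`, a small helper can pick out the newest result. The sketch below assumes only the `logs/<id>/best_solution.py` layout listed above; the function name `latest_best_solution` is hypothetical and not part of the `aide` package.

```python
from pathlib import Path
from typing import Optional


def latest_best_solution(logs_dir: str = "logs") -> Optional[Path]:
    """Return the most recently modified <logs_dir>/<id>/best_solution.py, or None."""
    candidates = list(Path(logs_dir).glob("*/best_solution.py"))
    if not candidates:
        return None
    # Use the file's modification time as a proxy for "latest run".
    return max(candidates, key=lambda p: p.stat().st_mtime)
```

Calling `latest_best_solution()` from the directory where you ran `aide` returns a `Path` you can open or copy, or `None` if no run has finished yet.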

Web UI

pip install -U aideml   # adds streamlit
cd aide/webui
streamlit run app.py

Use the sidebar to paste your API key, upload data, set Goal & Metric, then press Run AIDE.

The UI shows live logs, the solution tree, and the best code.

Advanced CLI Options

# Choose a different coding model and run 50 steps
aide agent.code.model="claude-4-sonnet" \
     agent.steps=50 \
     data_dir=… goal=… eval=…

Common flags

Flag	Purpose	Default
`agent.code.model`	LLM used to write code	`gpt-4-turbo`
`agent.steps`	Improvement iterations	`20`
`agent.search.num_drafts`	Drafts per step	`5`
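
If you launch runs from a script, the `key=value` arguments can be assembled programmatically. The following is a sketch under assumptions: `build_aide_command` is a hypothetical helper, not an `aide` API, and it uses only the flags listed in the table above.

```python
import shlex
from typing import Dict, List, Optional


def build_aide_command(
    data_dir: str,
    goal: str,
    eval_metric: str,
    overrides: Optional[Dict[str, object]] = None,
) -> List[str]:
    """Build an argv list for the `aide` CLI from key=value settings."""
    args = ["aide", f"data_dir={data_dir}", f"goal={goal}", f"eval={eval_metric}"]
    # Dotted keys such as "agent.steps" are passed through unchanged.
    for key, value in (overrides or {}).items():
        args.append(f"{key}={value}")
    return args


cmd = build_aide_command(
    "example_tasks/house_prices",
    "Predict price",
    "RMSE",
    overrides={"agent.steps": 50, "agent.search.num_drafts": 5},
)
# shlex.join quotes arguments containing spaces so the line can be pasted into a shell.
print(shlex.join(cmd))
```

Passing the list to `subprocess.run(cmd, check=True)` avoids shell quoting issues entirely, since each `key=value` pair is already a separate argument.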

Use AIDE ML Inside Python

import aide
import logging

def main():
    logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(name)s - %(levelname)s - %(message)s')
    aide_logger = logging.getLogger("aide")
    aide_logger.setLevel(logging.INFO)
    print("Starting experiment...")
    exp = aide.Experiment(
        data_dir="example_tasks/bitcoin_price",  # replace this with your own directory
        goal="Build a time series forecasting model for bitcoin close price.",  # replace with your own goal description
        eval="RMSLE"  # replace with your own evaluation metric
    )

    best_solution = exp.run(steps=2)

    print(f"Best solution has validation metric: {best_solution.valid_metric}")
    print(f"Best solution code: {best_solution.code}")
    print("Experiment finished.")

if __name__ == '__main__':
    main()

Power‑User Extras

Local LLM (Ollama example)

export OPENAI_BASE_URL="http://localhost:11434/v1"
aide agent.code.model="qwen2.5" data_dir=… goal=… eval=…
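
The same setup can be prepared from Python before launching the CLI as a subprocess. This is a sketch, not part of `aide`: `local_llm_env` is a hypothetical helper, and the placeholder API key reflects an assumption that the OpenAI‑compatible client expects some key to be set even when the local server ignores its value.

```python
import os
from typing import Dict, Mapping, Optional


def local_llm_env(
    base_url: str = "http://localhost:11434/v1",
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Return a copy of the environment that points OpenAI clients at a local server."""
    env = dict(os.environ if base_env is None else base_env)
    env["OPENAI_BASE_URL"] = base_url
    # Assumption: a non-empty key is needed by the client; keep a real one if present.
    env.setdefault("OPENAI_API_KEY", "local-placeholder")
    return env
```

Pass the result as the `env` argument of `subprocess.run` when invoking `aide`, so the override applies to that run only and your shell environment stays untouched.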

Docker

docker build -t aide .
docker run -it --rm \
  -v "${LOGS_DIR:-$(pwd)/logs}:/app/logs" \
  -v "${WORKSPACE_BASE:-$(pwd)/workspaces}:/app/workspaces" \
  -v "$(pwd)/aide/example_tasks:/app/data" \
  -e OPENAI_API_KEY="your-actual-api-key" \
  aide data_dir=/app/data/house_prices goal="Predict price" eval="RMSE"

Development install

git clone https://github.com/WecoAI/aideml.git
cd aideml && pip install -e .

Citation

If you use AIDE in your work, please cite the following paper:

@article{aide2025,
      title={AIDE: AI-Driven Exploration in the Space of Code}, 
      author={Zhengyao Jiang and Dominik Schmidt and Dhruv Srikanth and Dixing Xu and Ian Kaplan and Deniss Jacenko and Yuxiang Wu},
      year={2025},
      eprint={2502.13138},
      archivePrefix={arXiv},
      primaryClass={cs.AI},
      url={https://arxiv.org/abs/2502.13138}, 
}
