DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats

DoomArena is a modular, configurable, plug-in security testing framework for AI agents that supports many agentic frameworks including $\tau$-bench, Browsergym, OSWorld and TapeAgents (see Mail agent example). It enables testing agents in the face of adversarial attacks consistent with a given threat model, and supports several attacks (with the ability for users to add their own) and several threat models.

🚀 Quick Start

The DoomArena Intro Notebook is a good place for learning hands-on about the core concepts of DoomArena. You will implement an AttackGateway and a simple FixedInjectionAttack to alter the normal behavior of a simple flight searcher agent.

If you only want to use the library just run

pip install doomarena  # core library, minimal dependencies

If you want to run DoomArena integrated with TauBench, additionally run

pip install doomarena-taubench  # optional

If you want to run DoomArena integrated with Browsergym, additionally run

pip install doomarena-browsergym  # optional

If you want to test attacks on a Mail Agent (which can summarize and send emails on your behalf) inspired by the LLMail Challenge run

pip install -e doomarena/mailinject  # optional

If you want to run DoomArena integrated with OSWorld run

pip install -e doomarena/osworld

and follow our setup instructions here.

Export relevant API keys into your environment or .env file.

OPENAI_API_KEY="<your api key>"
OPENROUTER_API_KEY="<your api key>"

🛠️ Advanced Setup

To actively develop DoomArena, please create a virtual environment and install the package locally in editable mode using

pip install -e doomarena/core
pip install -e doomarena/taubench
pip install -e doomarena/browsergym
pip install -e doomarena/mailinject
pip install -e doomarena/osworld

Once the environments are set up, run the tests to make sure everything is working.

make ci-tests
make tests  # requires openai key

💻 Running Experiments

Follow the environment-specific instructions for TauBench and BrowserGym

🌟 Contributors

Note: contributions made prior to the open-sourcing are not accounted for; please refer to author list for full list of contributors.

📝 Paper

If you found DoomArena helpful, please cite us

@misc{boisvert2025doomarenaframeworktestingai,
      title={DoomArena: A framework for Testing AI Agents Against Evolving Security Threats}, 
      author={Leo Boisvert and Mihir Bansal and Chandra Kiran Reddy Evuru and Gabriel Huang and Abhay Puri and Avinandan Bose and Maryam Fazel and Quentin Cappart and Jason Stanley and Alexandre Lacoste and Alexandre Drouin and Krishnamurthy Dvijotham},
      year={2025},
      eprint={2504.14064},
      archivePrefix={arXiv},
      primaryClass={cs.CR},
      url={https://arxiv.org/abs/2504.14064}, 
}

Name		Name	Last commit message	Last commit date
Latest commit History 82 Commits
doomarena		doomarena
notebooks		notebooks
.gitignore		.gitignore
CITATION.cff		CITATION.cff
CODE_OF_CONDUCT.md		CODE_OF_CONDUCT.md
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
SECURITY.md		SECURITY.md
pytest.ini		pytest.ini

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats

🚀 Quick Start

🛠️ Advanced Setup

💻 Running Experiments

🌟 Contributors

📝 Paper

About

Uh oh!

Releases

Packages

Uh oh!

Contributors 4

Uh oh!

Languages

License

ServiceNow/DoomArena

Folders and files

Latest commit

History

Repository files navigation

DoomArena: A Framework for Testing AI Agents Against Evolving Security Threats

🚀 Quick Start

🛠️ Advanced Setup

💻 Running Experiments

🌟 Contributors

📝 Paper

About

Topics

Resources

License

Code of conduct

Security policy

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors 4

Uh oh!

Languages

Packages