ServiceNow/NOWAI-Bench

NOWAI-Bench

NOW as Enterprise AI Industry Gold Standard

NOWAI-Bench is ServiceNow's enterprise AI benchmarking suite, designed to measure whether AI agents perform reliably across the real workflows, domains, and governance demands of the world's largest organizations. Grounded in production-grade enterprise tasks across ITSM, HR, CSM, and cross-domain scenarios, it provides a rigorous, open standard that enables enterprises to make informed model selection decisions, validate AI deployments with confidence, and meet emerging regulatory requirements for AI transparency and accountability.

Benchmark Suite

| Benchmark | Description | Repo |
| --- | --- | --- |
| EnterpriseOps-Gym | 1,150 tasks across core enterprise domains (IT, HR, Finance, Customer Service, Procurement). Submitted to ICML. | → EnterpriseOps-Gym |
| EVA-Bench | Voice agent evaluation benchmark targeting enterprise contact center and service desk scenarios. | → EVA-Bench |

Getting Started

Each benchmark lives in its own repository with self-contained setup instructions. See the individual repos linked above.
