Introducing HallOumi, a state-of-the-art claim verification model, outperforming DeepSeek R1, OpenAI o1, Google Gemini 1.5 Pro, and Anthropic Sonnet 3.5 at only 8 billion parameters!
HallOumi, the hallucination detection model built with Oumi, is a system built specifically to enable per-sentence verification of any content (either AI or human-generated) with sentence-level citations and human-readable explanations.
Try a hosted version of this demo on our website!
Read more in our blog post here!
You can easily build and run the HallOumi demo via docker. While inside the repo directory:
docker build -t halloumi-demo .
docker run -p 3000:3000 halloumi-demo
Open http://localhost:3000 with your browser to access the demo!
-
Install NPM
brew install npm
-
NextJS and React
npm install next@latest react@latest react-dom@latest
First, run the development server:
npm run dev
# or
yarn dev
# or
pnpm dev
# or
bun dev
Open http://localhost:3000 with your browser to access the demo!
If you'd like to point the demo to your own self-hosted version of the models, simply modify data.json.
Below is a sample data.json
assuming you've self hosted the generative model at localhost:8000
:
{
"models": [
{
"displayName": "My custom hosted model",
"name": "mymodel",
"apiUrl": "http://localhost:8000/chat/completions",
"isEmbeddingModel": false
}
],
"examples": [
{
"displayName": "Getting Started",
"claim": "Text here appears in the 'Claims to verify' box.",
"context": "This text will appear in the 'Context' box."
}
]
}
The demo assumes that the target endpoint supports the standard OpenAI API for each model.
pip install sglang
python3 -m sglang.launch_server --model-path oumi-ai/HallOumi-8B --port 8000 --dtype auto --mem-fraction-static 0.9 --trust-remote-code
Note that the classifier is hosted as an embeddings model. You can query it via the standard /embeddings
endpoint in OpenAI-compatible API servers.
pip install sglang
python3 -m sglang.launch_server --model-path oumi-ai/HallOumi-8B-classifier --port 8001 --dtype auto --mem-fraction-static 0.9 --trust-remote-code --is-embedding