Cleanlab Trustworthy Language Model (TLM) - Reliability and explainability added to every LLM output


In one line of code, Cleanlab TLM adds real-time evaluation of every response in GenAI, RAG, LLM, and Agent systems.

Setup

This tutorial requires a TLM API key. Get one here.

export CLEANLAB_TLM_API_KEY=<YOUR_API_KEY_HERE>

Install the package:

pip install cleanlab-tlm

Usage

To get started, copy the code below to try your own prompt, or use it to score your existing prompt/response pairs.

from cleanlab_tlm import TLM

tlm = TLM(options={"log": ["explanation"], "model": "gpt-4o-mini"})  # supports GPT, Claude, and other models
out = tlm.prompt("What's the third month of the year alphabetically?")
print(out)

TLM returns a dictionary containing response, trustworthiness_score, and any requested optional fields like explanation.

{
  "response": "March.",
  "trustworthiness_score": 0.4590804375945598,
  "explanation": "Found an alternate response: December"
}
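In an application, the trustworthiness score can gate which responses are served automatically. Below is a minimal sketch of that pattern; the `triage` helper and the 0.8 threshold are illustrative assumptions, not part of the cleanlab-tlm library.

```python
def triage(result: dict, threshold: float = 0.8) -> str:
    """Route a TLM result: serve high-trust answers, escalate the rest.

    `threshold` is an illustrative cutoff; tune it for your application.
    """
    if result["trustworthiness_score"] >= threshold:
        return "serve"      # confident answer: return it to the user
    return "escalate"       # possible hallucination: fall back or flag for review

# Example using the sample output shown above:
result = {
    "response": "March.",
    "trustworthiness_score": 0.4590804375945598,
    "explanation": "Found an alternate response: December",
}
print(triage(result))  # low score, so this response is escalated
```

The right threshold depends on how costly a hallucinated answer is in your use case; stricter gates trade coverage for reliability.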

Why TLM?

  • Trustworthiness Scores: Each response comes with a trustworthiness score, helping you reliably gauge the likelihood of hallucinations.
  • Higher accuracy: Rigorous benchmarks show TLM consistently produces more accurate results than other LLMs like o3/o1, GPT-4o, and Claude.
  • Scalable API: Designed to handle large datasets, TLM is suitable for most enterprise applications, including data extraction, tagging/labeling, Q&A (RAG), and more.

Documentation

Comprehensive documentation along with tutorials and examples can be found here.

License

cleanlab-tlm is distributed under the terms of the MIT license.