Legible

Legible is an agent harness for analyzing government statutes and regulations. It helps policy analysts inside government agencies find what the rules actually say — surfacing ambiguities, applicant burdens, and the gaps that accumulate over decades of layered amendments.

Wait, what's an agent harness?

An agent harness is the scaffolding that wraps around a large language model to give it structure, tools, and a purpose. A model on its own takes input and produces output. A harness is what lets it act — retrieving documents, following analytical rubrics, calling external tools, and working through a problem across multiple steps before producing a structured result. Legible provides that scaffolding specifically for regulatory analysis: it connects an LLM to a corpus of governing documents, equips it with search and parsing tools, and directs its reasoning through plain-text skill files that domain experts write and maintain. The result is a system that applies consistent analytical judgment across a large body of legal text — work that would take a team of policy analysts weeks to do by hand.

The problem

Government benefit program rules are not written as coherent wholes. They grow through successive amendments — federal mandates layered onto state statutes, emergency provisions patched onto original language, guidance memos interpreting regulations that interpret laws. Over time, eligibility criteria become genuinely unclear. Caseworkers fill the gaps with judgment calls. Similarly situated applicants get different outcomes. The rules become, in a word, illegible.

Legible makes them readable again.

How it works

Legible is built around a simple idea: analytical expertise lives in plain-text skill files, not in code. Domain experts — policy analysts, benefits lawyers, program administrators — author rubric files that define exactly what to look for and how to classify findings. The agent applies those rubrics systematically across an entire corpus of governing documents.

Adding a new type of analysis means writing new skill files. No code changes required.

Reference scenarios

Two analytical scenarios are included out of the box:

Ambiguity detection — Identifies eligibility provisions that cannot produce a deterministic pass/fail outcome for all applicant situations. Flags undefined terms, unspecified thresholds, discretionary language, temporal gaps, and categorical holes. Produces a prioritized audit report for policy review.

Burden analysis — Maps requirements placed on applicants and beneficiaries. Classifies burden by type (physical, documentary, temporal, technological, cognitive, relational, recurring), assesses whether each requirement is legally mandated or agency discretion, and identifies which populations each burden disproportionately affects.

Getting started

See SPEC.md for the full software specification including repository structure, database schema, orchestrator design, tool definitions, and skill file authoring guidance.

The prototype corpus is built from the New York State Open Legislation API — no PDF parsing, no manual ingestion. A free API key is required and can be obtained at legislation.nysenate.gov/public. Once you have a key, seed the corpus with the Social Services Law volume and you have a real working dataset immediately:

# Seed the corpus with NY Social Services Law
python -m corpus.ingestion.sync --source ny_senate --law-ids SOC

# Generate embeddings for semantic search
python -m corpus.ingestion.build_embeddings

# Run an analysis
python main.py \
  --scenario ambiguity_detection \
  --program SNAP \
  --jurisdiction NY \
  --output-dir ./outputs/run_001

Adding additional law volumes or data sources is handled through the source adapter layer — see corpus/ingestion/base.py and config.yaml.

Built for extension

Legible is a harness, not a fixed tool. The two included scenarios are reference implementations. If you need a third type of analysis — contradiction detection, plain-language rewriting, cross-jurisdiction comparison — create a new folder under skills/scenarios/ and write the rubric. The infrastructure handles the rest.

Making the rules legible — for the people who administer them, and the people who depend on them.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
spec		spec
README.md		README.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Legible

Wait, what's an agent harness?

The problem

How it works

Reference scenarios

Getting started

Built for extension

About

Uh oh!

Releases

Packages

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

Legible

Wait, what's an agent harness?

The problem

How it works

Reference scenarios

Getting started

Built for extension

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Uh oh!

Contributors

Uh oh!

Packages