Skip to content

usi-verification-and-security/athena

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

26 Commits
 
 
 
 
 
 
 
 

Repository files navigation

ATHENA

The modulAr consTrained Horn clauses modEl validatioN frAmework, ATHENA for short, is a framework for the validation of models produced by CHC solvers. It executes the selected CHC solver(s) for a given set of inputs and if a result is SAT it produces and solves the necessary SMT instances to validate the CHC model. Different SMT solvers can be selected for validation, and if the selected solver produces UNSAT proofs capable of being checked, the framework automatically executes the necessary proof checker.

CHC solvers supported: Eldarica, Golem, and Spacer.

SMT solvers supported: cvc5, OpenSMT, SMTInterpol, veriT, and Z3.

Proof checkers supported: alfc 1, Carcara, LFSC checker, SMTInterpol checker, and TSWC.

Dependencies

ATHENA has been developed for a Linux environment and requires the following dependencies:

Instructions

To execute the framework, go to the scripts folder and execute:

evaluate_all.sh [CHC solver] [SMT solver] [SMT mode] [Target] [Num. of threads]

[CHC solver] is one of all, none, eldarica, golem, or spacer.

[SMT solver] is one of all, none, cvc5, opensmt, smtinterpol, verit, or z3.

[SMT mode] is one of all, proof, or noProof.

[Target] is the folder containing the CHC benchmarks, e.g., test.

[Num. of threads] is a positive integer.

By default ATHENA searches for the tools it needs in a local folder named binaries, containing subfolders for chc_solvers, smt_solvers, and smt_checkers. The tools' paths and options, as well as the time and memory limits, can be set in the run_*.sh files. Please follow, or update, the current tool paths before executing; the tools' binaries have to be gathered separately.

The input CHC files have to follow the CHC-COMP format. If your files are following a different format, please use the CHC-COMP formatter to adjust them.

Metrics

To gather the average sizes of models and proofs, as well as the average runtimes and memory consumptions, execute log_stats_all.sh followed by python3 avg_stats_all.py in the scripts folder. After that, the classification of individual tool results obtained, e.g., SAT or UNSAT, can be seen by executing stats_all.sh. To generate plots displaying the sizes of models and proofs execute make_gnuplot_proof-size.sh.

To aggregate all individual tool results and to establish the validity of the CHC benchmarks execute log_results_all.sh. After that, the aggregate results can be displayed by executing results_all.sh.

Notes

The framework is currently capable of validating models produced for LIA and ALIA benchmarks, and has been used in the validation of the benchmarks from CHC-COMP 2022 and CHC-COMP 2024.

A large part of the SMT instance generator was implemented by Fedor Gromov.

Publication

To know more about the theory behind ATHENA, as well as its applications, have a look at our iFM23 paper.

Footnotes

  1. The alfc checker has been recently rebranded ethos. According to @ajreynol this was only a name change on the checker side, with the codebase carrying over from alfc. Together with this change, however, the relation between cvc5 and alfc/ethos has been refactored and ATHENA has not yet been updated to accommodate for it. For the time being, support is available for cvc5 v1.1.2 and alfc eca2cbd.