
elle-elle-aime ❤️

Framework to use LLMs for automated program repair.

Supported benchmarks:

  • Defects4J
  • GitBug-Java
  • HumanEval-Java
  • QuixBugs

If you use this code, please cite:

@techreport{elle-elle-aime,
  title = {RepairLLaMA: Efficient Representations and Fine-Tuned Adapters for Program Repair},
  year = {2023},
  author = {André Silva and Sen Fang and Martin Monperrus},
  url = {http://arxiv.org/pdf/2312.15698},
  number = {2312.15698},
  institution = {arXiv},
}

Installation

Requires Python 3.11 (or later) and python-poetry.
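
If Poetry is not already installed, one common way to get it (an assumption on our part; setup.sh does not install it for you) is via pipx, and you can confirm the Python version at the same time:

pipx install poetry          # or: pip install --user poetry
python3.11 --version         # confirm Python 3.11 is available on your PATH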

To set up elle-elle-aime, run the following command:

./setup.sh

Note: By default, GitBug-Java is installed. This benchmark is heavy (it requires ~130 GiB of free disk space). If you do not need GitBug-Java, comment out the commands in setup.sh that refer to it before running the script.

Execution

Make sure you are in the correct environment:

poetry shell
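
Alternatively, instead of spawning a shell you can prefix each command with poetry run (standard Poetry usage, not part of the project's own instructions), for example:

poetry run python generate_samples.py defects4j instruct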

Example of how to generate samples for Defects4J using the instruct strategy:

python generate_samples.py defects4j instruct

Example of how to generate patches for the samples:

python generate_patches.py samples_defects4j_instruct_.jsonl gpt-4o-mini --n_workers 1 --num_return_sequences 10 --temperature 1.0

Example of how to evaluate the generated patches:

python evaluate_patches.py defects4j candidates_defects4j_instruct_gpt-4o-mini.jsonl.gz --strategy openai

Example of how to export the evaluated patches:

python export_results.py defects4j evaluation_defects4j_instruct_openai.jsonl --model_name gpt-4o-mini

Development

How to run tests:

pytest -s tests/
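
To run only a subset of the tests, pytest's standard -k keyword filter can be used; the keyword below is only an illustrative guess at a test name, not a documented target:

pytest -s tests/ -k defects4j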

Check out the results

We store all the results (prompts, patches, evaluation) in a separate repository.

Please visit https://github.com/ASSERT-KTH/repairbench to browse them.