
DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs


🤗 Models | 🤗 Dataset | 📃 Paper

This repository implements DARA, an LLM-based agent for KGQA, as described in the paper DARA: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs.

Abstract: Answering Questions over Knowledge Graphs (KGQA) is key to well-functioning autonomous language agents in various real-life applications. To improve the neural-symbolic reasoning capabilities of language agents powered by Large Language Models (LLMs) in KGQA, we propose the Decomposition-Alignment-Reasoning Agent (DARA) framework. DARA effectively parses questions into formal queries through a dual mechanism: high-level iterative task decomposition and low-level task grounding. Importantly, DARA can be efficiently trained with a small number of high-quality reasoning trajectories. Our experimental results demonstrate that DARA fine-tuned on LLMs (e.g. Llama-2-7B, Mistral) outperforms both in-context learning-based agents with GPT-4 and alternative fine-tuned agents across different benchmarks in zero-shot evaluation, making such models more accessible for real-life applications. We also show that DARA attains performance comparable to state-of-the-art enumerating-and-ranking-based methods for KGQA.

Contact person: Haishuo Fang

UKP Lab | TU Darmstadt

Don't hesitate to send us an e-mail or report an issue if something is broken (and it shouldn't be) or if you have further questions.

🚀 Setup

  • Environment
> python -m venv .dara
> source ./.dara/bin/activate
> pip install -r requirements.txt
  • Freebase Setup: To install Freebase, please follow the instructions here.

Once the service is up, replace the URL on line 6 of ./kg_querier/sparql_executor.py:

sparql = SPARQLWrapper("url/to/service")
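
To sanity-check the connection before running anything else, a minimal query such as the following should return a non-empty result (a sketch: the endpoint URL is a placeholder for the address your service actually exposes, and m.0f8l9c is just an example Freebase MID, the one for France):

from SPARQLWrapper import SPARQLWrapper, JSON

sparql = SPARQLWrapper("http://localhost:3001/sparql")  # placeholder endpoint URL
sparql.setQuery("""
    SELECT ?name WHERE {
        <http://rdf.freebase.com/ns/m.0f8l9c> <http://rdf.freebase.com/ns/type.object.name> ?name .
    } LIMIT 1
""")
sparql.setReturnFormat(JSON)
results = sparql.query().convert()
print(results["results"]["bindings"])  # non-empty bindings mean the service is reachable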

Models

Models              Training Data           Model Card
DARA-Llama-2-7B     UKPLab/dara             UKPLab/dara-llama-2-7b
DARA-Llama-2-13B    UKPLab/dara             UKPLab/dara-llama-2-13b
DARA-Mistral-7B     UKPLab/dara             UKPLab/dara-mistral-7b
Agentbench-7B       UKPLab/dara-agentbench  UKPLab/agentbench-7b
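
The released checkpoints are hosted on the Hugging Face Hub; assuming they are in the standard transformers format, they can be loaded directly (a sketch: dtype and device placement are assumptions about your hardware):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "UKPLab/dara-llama-2-7b"  # any model card from the table above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto"
)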

Fine-tuning

# $1: name of the output model directory, $2: wandb run name
torchrun --nproc_per_node=2 --master_port=8889 finetune.py \
    --model_name_or_path /path/to/model \
    --data_dir ./data/finetune_data/dara.json \
    --output_dir ./models/$1 \
    --wandb_project kgqa-dara \
    --run_name $2 \
    --report_to wandb \
    --num_train_epochs 10 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --gradient_accumulation_steps 2 \
    --evaluation_strategy "no" \
    --save_strategy "epoch" \
    --logging_steps 20 \
    --cutoff_len 2500 \
    --learning_rate 2e-6 \
    --lr_scheduler_type cosine \
    --save_total_limit 4 \
    --weight_decay 0.00 \
    --warmup_ratio 0.1 \
    --tf32 True \
    --bf16 True \
    --deepspeed "./configs/default_offload_opt_param.json" \
    --gradient_checkpointing True

As we use DeepSpeed to fine-tune the model, run python zero_to_fp32.py . pytorch_model.bin inside the checkpoint directory to consolidate the sharded DeepSpeed checkpoint into a single pytorch_model.bin.
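
Equivalently, the consolidation can be done from Python via DeepSpeed's checkpoint utilities (a sketch: the checkpoint path is a placeholder for your own run directory):

import torch
from deepspeed.utils.zero_to_fp32 import get_fp32_state_dict_from_zero_checkpoint

ckpt_dir = "./models/my_run"  # placeholder: your --output_dir from fine-tuning
state_dict = get_fp32_state_dict_from_zero_checkpoint(ckpt_dir)  # merges ZeRO shards
torch.save(state_dict, f"{ckpt_dir}/pytorch_model.bin")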

Generation

# $1: mode (dara | agentbench), $2: test set name, $3: path to the base model,
# $4: number of questions, $5: start index, $6: output subdirectory
if [ "$1" == "dara" ]; then
    python generate.py \
    --pred_file_path ./data/test_data/$2.json \
    --base_model $3 \
    --num_q $4 \
    --start_ix $5 \
    --output_dir ./outputs/dara/$6 \
    --batch_size 1

elif [ "$1" == "agentbench" ]; then
    python -m baseline.agentbench \
    --pred_file_path ./data/test_data/$2.json \
    --base_model $3 \
    --num_q $4 \
    --start_ix $5 \
    --output_dir ./outputs/agentbench/$6 \
    --batch_size 1 \
    --use_gpt

fi
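
For a quick check outside the script, a checkpoint can also be queried directly (a sketch reusing the model and tokenizer loaded in the Models section; note that DARA proper interleaves generation with live KG calls, and the prompt below is only a placeholder, not the exact format used by the training data):

prompt = "Question: what fuels are used by bipropellant rocket engines?"  # placeholder prompt
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
output_ids = model.generate(**inputs, max_new_tokens=256, do_sample=False)
print(tokenizer.decode(output_ids[0], skip_special_tokens=True))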

Evaluation

The evaluation datasets are under ./data/test_data/

python evaluate.py \
--gold_data_path ./data/test_data/grailqa.json \
--predict_data_dir ./outputs/dara \
--metric_output_path ./eval/grailqa/
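
For reference, KGQA evaluation typically compares the predicted answer set against the gold answer set. A minimal set-based F1 (an illustration of the metric only, not the repository's evaluate.py):

def answer_f1(predicted: set, gold: set) -> float:
    # Both sets empty counts as a perfect match; exactly one empty set scores 0.
    if not predicted or not gold:
        return 1.0 if predicted == gold else 0.0
    overlap = len(predicted & gold)
    if overlap == 0:
        return 0.0
    precision = overlap / len(predicted)
    recall = overlap / len(gold)
    return 2 * precision * recall / (precision + recall)

print(answer_f1({"m.0f8l9c"}, {"m.0f8l9c"}))  # 1.0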

Cite

@inproceedings{fang-etal-2024-dara,
    title = "$\texttt{DARA}$: Decomposition-Alignment-Reasoning Autonomous Language Agent for Question Answering over Knowledge Graphs",
    author = "Fang, Haishuo  and
      Zhu, Xiaodan  and
      Gurevych, Iryna",
    editor = "Ku, Lun-Wei  and
      Martins, Andre  and
      Srikumar, Vivek",
    booktitle = "Findings of the Association for Computational Linguistics ACL 2024",
    month = aug,
    year = "2024",
    address = "Bangkok, Thailand and virtual meeting",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-acl.203",
    pages = "3406--3432"
}

Disclaimer

This repository contains experimental software and is published for the sole purpose of giving additional background details on the respective publication.
