
Enhancing Transformers for Generalizable First-Order Logical Entailment

Official GitHub repository for the ACL 2025 paper Enhancing Transformers for Generalizable First-Order Logical Entailment (arXiv).

Our Benchmark

We provide a sample dataset in the folder ./sample_query_data, containing a small number of queries from each of the 55 query types. Dataset sampling is accelerated with multiprocessing. To replicate our experiments, you can sample your own dataset with the following commands:

cd efo_code

# Replace FB15K237 with your target KG, and set --num_queries to the split size you want.
python sample_query_multi.py --sample_formula_scope SEEN23 --mode train \
    --output_folder ../your_folder_name/training_queries \
    --num_queries <your_training_data_size> \
    --num_processes 80 --dataset FB15K237

python sample_query_multi.py --sample_formula_scope FULL55 --mode valid \
    --output_folder ../your_folder_name/validation_queries \
    --num_queries <your_validation_data_size> \
    --num_processes 80 --dataset FB15K237

python sample_query_multi.py --sample_formula_scope FULL55 --mode test \
    --output_folder ../your_folder_name/testing_queries \
    --num_queries <your_testing_data_size> \
    --num_processes 80 --dataset FB15K237

Query Answering Models

All transformers are trained on 4 NVIDIA A100 GPUs for two days with a batch size of 1024. To replicate the experimental results, you can train the models in the folder ./model with the following commands:

Transformer + Absolute PE

cd model

# Replace FB15k-237 with your target KG and the three query directories with your own folders.
python train.py \
    -dn FB15k-237 \
    -m transformer \
    --train_query_dir ../sample_query_data/fb237-23-train \
    --valid_query_dir ../sample_query_data/fb237-55-valid \
    --test_query_dir ../sample_query_data/fb237-55-test \
    --checkpoint_path ../checkpoint/logs \
    -b 1024 \
    --log_steps 50000 \
    -lr 0.0001

Transformer + Relative PE

cd model

# Replace FB15k-237 with your target KG and the three query directories with your own folders.
python train.py \
    -dn FB15k-237 \
    -m transformer \
    --train_query_dir ../sample_query_data/fb237-23-train \
    --valid_query_dir ../sample_query_data/fb237-55-valid \
    --test_query_dir ../sample_query_data/fb237-55-test \
    --checkpoint_path ../checkpoint/logs \
    -b 1024 \
    --log_steps 50000 \
    -lr 0.0001 \
    --rpe

TEGA (Transformer Encoder with Guided Attention)

cd model

# Replace FB15k-237 with your target KG and the three query directories with your own folders.
python train.py \
    -dn FB15k-237 \
    -m transformertega \
    --train_query_dir ../sample_query_data/fb237-23-train \
    --valid_query_dir ../sample_query_data/fb237-55-valid \
    --test_query_dir ../sample_query_data/fb237-55-test \
    --checkpoint_path ../checkpoint/logs \
    -b 1024 \
    --log_steps 50000 \
    -lr 0.0001 \
    --rpe \
    --num_categories 6 \
    --pooling sum \
    --self_dist
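For intuition, guided attention can be thought of as shifting the attention logits by a learned bias per token-pair category (here six categories, matching --num_categories 6). The PyTorch sketch below is one plausible reading of that idea, not the repository's implementation; GuidedSelfAttention and the category-assignment scheme are illustrative assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class GuidedSelfAttention(nn.Module):
    """Single-head self-attention whose logits are shifted by a learned
    scalar bias per token-pair category (a sketch of guided attention)."""

    def __init__(self, dim, num_categories=6):
        super().__init__()
        self.qkv = nn.Linear(dim, 3 * dim)
        self.out = nn.Linear(dim, dim)
        # One learned scalar bias per pair category.
        self.category_bias = nn.Embedding(num_categories, 1)
        self.scale = dim ** -0.5

    def forward(self, x, pair_categories):
        # x: (batch, seq, dim); pair_categories: (batch, seq, seq) int64
        q, k, v = self.qkv(x).chunk(3, dim=-1)
        logits = torch.einsum("bid,bjd->bij", q, k) * self.scale
        # Guide attention: add each pair's category bias to its logit.
        logits = logits + self.category_bias(pair_categories).squeeze(-1)
        attn = F.softmax(logits, dim=-1)
        return self.out(torch.einsum("bij,bjd->bid", attn, v))

# Toy usage: 2 queries, 5 tokens, random pair categories.
x = torch.randn(2, 5, 32)
cats = torch.randint(0, 6, (2, 5, 5))
print(GuidedSelfAttention(32)(x, cats).shape)  # torch.Size([2, 5, 32])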

Checking Model Performance

Evaluation results are logged with tensorboardX to ./checkpoint/logs/gradient_tape. To inspect them, launch TensorBoard:

tensorboard --logdir ./checkpoint/logs/gradient_tape --port 6006

If training runs on a remote server, forward the TensorBoard port so the dashboard is viewable locally (use the same port as --port above):

ssh -N -f -L localhost:port_number:localhost:port_number your_server_location
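If you prefer to read the logged scalars without the TensorBoard UI, the event files under ./checkpoint/logs/gradient_tape can also be parsed with TensorBoard's EventAccumulator. This is a sketch; the exact run directory and scalar tag names depend on your training run.

from tensorboard.backend.event_processing.event_accumulator import EventAccumulator

# Point at one run directory under ./checkpoint/logs/gradient_tape.
acc = EventAccumulator("./checkpoint/logs/gradient_tape")
acc.Reload()

# List the scalar tags the run actually logged, then dump their values.
print(acc.Tags()["scalars"])
for tag in acc.Tags()["scalars"]:
    for event in acc.Scalars(tag):
        print(tag, event.step, event.value)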
