Generalizing Question Answering System with Pre-trained Language Model Fine-tuning.

HLTCHKUST/HLTC-MRQA

HLTC-MRQA

License: MIT

This is the TensorFlow implementation of the paper: Generalizing Question Answering System with Pre-trained Language Model Fine-tuning.

Dan Su*, Yan Xu*, Genta Indra Winata, Peng Xu, Hyeondey Kim, Zihan Liu, Pascale Fung. MRQA@EMNLP 2019 [PDF]

* Dan Su and Yan Xu contributed equally to this work.

The trained HLTC-MRQA model can be downloaded from this link.

If you use our HLTC-MRQA model in your work, please cite the following paper. The BibTeX entry is listed below:

@inproceedings{su2019generalizing,
  title={Generalizing Question Answering System with Pre-trained Language Model Fine-tuning},
  author={Su, Dan and Xu, Yan and Winata, Genta Indra and Xu, Peng and Kim, Hyeondey and Liu, Zihan and Fung, Pascale},
  booktitle={Proceedings of the 2nd Workshop on Machine Reading for Question Answering},
  pages={203--211},
  year={2019}
}

Dependency

Check the required packages listed in requirements.txt, or simply install them all by running:

❱❱❱ pip install -r requirements.txt

Abstract

With a large number of datasets being released and new techniques being proposed, question answering (QA) systems have witnessed great breakthroughs in reading comprehension (RC) tasks. However, most existing methods focus on improving in-domain performance, leaving open the research question of how these models and techniques can generalize to out-of-domain and unseen RC tasks. To enhance the generalization ability, we propose a multi-task learning framework that learns the shared representation across different tasks. Our model is built on top of a large pre-trained language model, such as XLNet, and then fine-tuned on multiple RC datasets. Experimental results show the effectiveness of our methods, with an average Exact Match score of 56.59 and an average F1 score of 68.98, which significantly improves the BERT-Large baseline by 8.39 and 7.22, respectively.
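The abstract reports Exact Match and F1 without defining them. As a rough illustration, the sketch below follows the SQuAD-style definitions commonly used for extractive QA: answers are lowercased, stripped of punctuation and English articles, and then compared as strings (Exact Match) or as bags of tokens (F1). The function names are hypothetical and are not taken from this repository; for reported numbers, the official MRQA evaluation script should be used instead.

```python
import re
import string
from collections import Counter


def normalize_answer(text):
    """Lowercase, drop punctuation and English articles, collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction, gold):
    """1.0 if the normalized strings are identical, otherwise 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(gold))


def f1_score(prediction, gold):
    """Token-level F1 between the normalized prediction and gold answer."""
    pred_tokens = normalize_answer(prediction).split()
    gold_tokens = normalize_answer(gold).split()
    # Multiset intersection counts each shared token at most as often
    # as it appears in both answers.
    overlap = sum((Counter(pred_tokens) & Counter(gold_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(gold_tokens)
    return 2 * precision * recall / (precision + recall)


def best_over_golds(metric, prediction, golds):
    """Score a prediction against several gold answers and keep the best."""
    return max(metric(prediction, gold) for gold in golds)
```

A dataset-level score under these definitions is the mean of `best_over_golds` over all questions, multiplied by 100.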

Data

The required data can be downloaded from the MRQA GitHub repository.
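For a quick look at the downloaded files, a small reader such as the one below may help. It is only a sketch under assumptions: it presumes the MRQA 2019 shared-task layout (gzip-compressed JSON Lines, a first line holding a `header` object, and subsequent lines holding a `context` string plus a `qas` list with `qid`, `question`, and `answers` fields). These field names are not stated in this README, so verify them against the files you download. The function `read_mrqa` is hypothetical and is not part of this repository.

```python
import gzip
import json


def read_mrqa(path):
    """Yield one flat example per question from an MRQA-style .jsonl.gz file.

    Assumed layout (check against your data): each line is a JSON object;
    the header line contains a "header" key and is skipped.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            if "header" in record:
                continue  # dataset metadata, not a training example
            context = record["context"]
            for qa in record["qas"]:
                yield {
                    "qid": qa["qid"],
                    "question": qa["question"],
                    "context": context,
                    "answers": qa.get("answers", []),
                }
```

Calling `list(read_mrqa(path))` on one of the downloaded files, with `path` pointing to wherever you saved it, gives a list of dictionaries that can be inspected before running the training scripts.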

Methodology

Experiment

To fine-tune the XLNet-based model on the MRQA dataset with a TPU, modify the paths in the script file and run:

❱❱❱ sh scripts/run_mrqa_TPU.sh

To fine-tune the XLNet-based model on the MRQA dataset with a single GPU, modify the paths in the script file and run:

❱❱❱ sh scripts/run_mrqa_GPU.sh

Save (export) the trained HLTC-MRQA model for inference:

❱❱❱ sh scripts/save_mrqa_model.sh

Use the un-exported HLTC-MRQA model for inference:

❱❱❱ sh scripts/predict_mrqa.sh

Use the exported HLTC-MRQA model for inference:

❱❱❱ sh scripts/predictor.sh

Results

Acknowledgement

The code is partially modified from the original XLNet repository.
