
Few-shot KLUE-DST as DS2

This is a Korean implementation of "Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking" on the KLUE-DST (a.k.a. WoS) dataset.

Overview

I wrote the Korean heuristic converter myself, and it is less natural than the English converter because of the structural differences between Korean and English. The model can therefore achieve better performance if you write a more natural converter.
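For illustration, a template-based converter in the spirit of DS2 maps each `도메인-슬롯-값` (domain-slot-value) triple in the dialogue state to a natural-language sentence. The templates and function below are a hypothetical sketch, not the repository's actual converter:

```python
# Hypothetical templates -- the repository's converter uses its own phrasing
# and covers every slot in the KLUE-DST ontology.
TEMPLATES = {
    "관광-종류": "사용자는 {value} 종류의 관광지를 찾고 있습니다.",
    "식당-지역": "사용자는 {value} 지역의 식당을 찾고 있습니다.",
}

def state_to_summary(state):
    """Render a KLUE-DST state (list of 'domain-slot-value' strings) as a summary.

    Slots without a template are skipped here; a complete converter would
    handle all slots and order the sentences naturally.
    """
    sentences = []
    for triple in state:
        # Split off the value at the last '-'; note this would misparse
        # values that themselves contain a hyphen.
        domain_slot, _, value = triple.rpartition("-")
        template = TEMPLATES.get(domain_slot)
        if template:
            sentences.append(template.format(value=value))
    return " ".join(sentences)
```

At inference time the process runs in reverse: the summary generated by the model is parsed back into slot-value pairs, which is where an unnatural template set hurts accuracy.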

Leaderboard

Few-shot JGA (%) is computed on the validation set because the test set is not publicly available.

| Model | Split | 관광 | 식당 | 숙소 | 지하철 | 택시 |
|---|---|---|---|---|---|---|
| Kolang-T5-base | 1% | 52.6 | 38.7 | 30.6 | 63.0 | 44.9 |
| Kolang-T5-base | 5% | 72.2 | 50.0 | 64.1 | 81.6 | 77.9 |
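Joint goal accuracy (JGA) counts a turn as correct only when the entire predicted state matches the gold state exactly. A minimal sketch (the function name and signature are mine, not the repository's):

```python
def joint_goal_accuracy(predicted_states, gold_states):
    """Fraction of turns whose predicted state equals the gold state exactly.

    Each state is a collection of 'domain-slot-value' strings; order is
    ignored, but a single wrong or missing slot makes the whole turn wrong.
    """
    assert len(predicted_states) == len(gold_states)
    if not gold_states:
        return 0.0
    correct = sum(
        1 for pred, gold in zip(predicted_states, gold_states)
        if set(pred) == set(gold)
    )
    return correct / len(gold_states)
```

Because one slot error fails the whole turn, JGA is a strict metric, which is why few-shot scores in the 30-80% range are typical.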

The pre-trained LM used in this repository is Kolang-T5-base.

Installation

This repository is known to work on Ubuntu 20.04 LTS; it has not been tested on other operating systems.

conda create -n klue_dst python=3.7.10
conda activate klue_dst
cd KLUE_DST_as_DS2
pip install -r requirements.txt

Download KLUE-DST Dataset

You can download the dataset from KLUE-Benchmark or with the following commands:

cd kluewos11
wget https://aistages-prod-server-public.s3.amazonaws.com/app/Competitions/000073/data/wos-v1.1.tar.gz
tar -xvf wos-v1.1.tar.gz
cd wos-v1.1/
mv ontology.json wos-v1.1_dev.json wos-v1.1_dev_sample_10.json wos-v1.1_train.json ..
cd ..
rm wos-v1.1.tar.gz
rm -r wos-v1.1

Preprocess Data

The KLUE-DST data format must be converted to the MultiWOZ format so that the original English code can be reused. After preprocessing, you will find dev.json and train.json in ./kluewos11.

cd ..
sh convert_data_format.sh
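As a rough illustration of what convert_data_format.sh does, the sketch below pairs KLUE-DST turns into MultiWOZ-style records, one per user turn. The input field names follow the released wos-v1.1 JSON, but the output schema is a simplified stand-in, not the exact format the script emits:

```python
def klue_dialogue_to_multiwoz(dialogue):
    """Convert one KLUE-DST dialogue into MultiWOZ-style turn records.

    KLUE-DST stores alternating 'user'/'sys' turns; MultiWOZ-style code
    expects one record per user turn carrying the preceding system
    utterance and the belief state at that point.
    """
    turns = []
    prev_system = ""  # the first user turn has no preceding system utterance
    for turn in dialogue["dialogue"]:
        if turn["role"] == "user":
            turns.append({
                "system_transcript": prev_system,
                "transcript": turn["text"],
                "belief_state": turn.get("state", []),
            })
        else:  # role == "sys"
            prev_system = turn["text"]
    return {"dialogue_idx": dialogue["guid"], "dialogue": turns}

# A made-up two-turn example in the wos-v1.1 shape:
sample = {
    "guid": "wos-v1.1_train_00000",
    "dialogue": [
        {"role": "user", "text": "서울 중앙에 있는 박물관을 찾아주세요.",
         "state": ["관광-종류-박물관", "관광-지역-서울 중앙"]},
        {"role": "sys", "text": "문화역서울 284는 어떠세요?"},
        {"role": "user", "text": "좋네요, 거기로 할게요.",
         "state": ["관광-종류-박물관", "관광-지역-서울 중앙", "관광-이름-문화역서울 284"]},
    ],
}
converted = klue_dialogue_to_multiwoz(sample)
```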

Few-shot learning

First set the training arguments --dev_batch_size, --test_batch_size, --train_batch_size, --grad_acc_steps, and --num_gpus in train.sh to suit your training environment, then run:

sh train.sh

Training takes approximately 5 minutes per epoch on two NVIDIA Titan RTX GPUs with 1% of the train set. After following the commands above, you can check the metrics in metrics.csv under ./log.
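To inspect the results programmatically, something like the following reads the log; the column names inside metrics.csv depend on the run, so this sketch just returns whatever the header defines:

```python
import csv

def read_metrics(path):
    """Load a metrics CSV as a list of row dicts keyed by its header line."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# Example usage (path assumes the default log directory):
# rows = read_metrics("log/metrics.csv")
# if rows:
#     print(rows[-1])  # metrics from the final logged step
```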

Citation and Contact

This repository is based on the following paper:

@article{shin2022dialogue,
  title={Dialogue Summaries as Dialogue States (DS2), Template-Guided Summarization for Few-shot Dialogue State Tracking},
  author={Shin, Jamin and Yu, Hangyeol and Moon, Hyeongdon and Madotto, Andrea and Park, Juneyoung},
  journal={arXiv preprint arXiv:2203.01552},
  year={2022}
}
