BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling

This repository includes the dataset and baselines of the paper:

BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling (Accepted in NeurIPS 2021 Track on Datasets and Benchmarks) [PDF].

Authors: Zhaojiang Lin, Andrea Madotto, Genta Indra Winata, Peng Xu, Feijun Jiang, Yuxiang Hu, Chen Shi, Pascale Fung

Abstract:

Task-oriented dialogue (ToD) benchmarks provide an important avenue to measure progress and develop better conversational agents. However, existing datasets for end-to-end ToD modelling are limited to a single language, hindering the development of robust end-to-end ToD systems for multilingual countries and regions. Here we introduce BiToD, the first bilingual multi-domain dataset for end-to-end task-oriented dialogue modeling. BiToD contains over 7k multi-domain dialogues (144k utterances) with a large and realistic parallel knowledge base. It serves as an effective benchmark for evaluating bilingual ToD systems and cross-lingual transfer learning approaches. We provide state-of-the-art baselines under three evaluation settings (monolingual, bilingual and cross-lingual). The analysis of our baselines in different settings highlights 1) the effectiveness of training a bilingual ToD system compared to two independent monolingual ToD systems, and 2) the potential of leveraging a bilingual knowledge base and cross-lingual transfer learning to improve system performance in low-resource conditions.

Leaderboard

Monolingual

              English (EN)                         Chinese (ZH)
              TSR    DSR    API_ACC  BLEU   JGA    TSR    DSR    API_ACC  BLEU   JGA
MinTL(mBART)  56     33.71  57.03    35.34  67.36  56.82  29.35  71.89    20.06  72.18
MinTL(mT5)    69.13  47.51  67.92    38.48  69.19  53.77  31.09  63.25    19.03  67.35

Bi-lingual

              English (EN)                         Chinese (ZH)
              TSR    DSR    API_ACC  BLEU   JGA    TSR    DSR    API_ACC  BLEU   JGA
MinTL(mBART)  42.45  17.87  65.35    28.76  69.37  40.39  16.96  65.37    5.23   69.5
MinTL(mT5)    71.18  51.13  71.87    40.71  72.16  57.24  34.78  65.54    22.45  68.7

Cross-lingual

              ZH→EN (10%)                          EN→ZH (10%)
              TSR    DSR    API_ACC  BLEU   JGA    TSR    DSR    API_ACC  BLEU   JGA
MinTL(mBART)  1.11   0.23   0.6      3.17   4.64   0      0      0        0.01   2.14
  +CPT        36.19  16.06  41.51    22.5   42.84  24.64  11.96  29.04    8.29   28.57
  +MLT        33.62  11.99  41.08    20.01  55.39  44.71  21.96  54.87    14.19  60.71
MinTL(mT5)    6.78   1.36   17.75    10.35  19.86  4.16   2.2    6.67     3.3    12.63
  +CPT        44.94  24.66  47.6     29.53  48.77  43.27  23.7   49.7     13.89  51.4
  +MLT        56.78  33.71  56.78    32.43  58.31  49.2   27.17  50.55    14.44  55.05

Dataset

Training, validation and test data are available in the data folder. We also provide the data split for the cross-lingual few-shot setting. Each dialogue follows the format below:

{
    dialogue_id: {
        "Scenario": {
            "WizardCapabilities": [
            ],
            "User_Goal": {
            }
        },
        "Events": [
            {
                "Agent": "User",
                "Actions": [
                    {
                        "act": "inform_intent",
                        "slot": "intent",
                        "relation": "equal_to",
                        "value": [
                            "restaurants_en_US_search"
                        ]
                    }
                ],
                "active_intent": "restaurants_en_US_search",
                "state": {
                    "restaurants_en_US_search": {}
                },
                "Text": "Hi, I'd like to find a restaurant to eat"
            },
            {
                "Agent": "Wizard",
                "Actions": [
                    {
                        "act": "request",
                        "slot": "price_level",
                        "relation": "",
                        "value": []
                    }
                ],
                "Text": "Hi there. Would you like a cheap or expensive restaurant?",
                "PrimaryItem": null,
                "SecondaryItem": null
            },
            ...
        ]
    }
}
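
For quick inspection, here is a minimal sketch that loads one split and iterates over its dialogues. The file name is a placeholder (point it at one of the split files in the data folder), and it assumes Events is a list of turn objects as in the example above:

import json

# Placeholder path; replace it with an actual split file from the data folder.
with open("data/en_train.json", encoding="utf-8") as f:
    dialogues = json.load(f)

# Each top-level key is a dialogue id; each dialogue has a Scenario and a list of Events.
for dialogue_id, dialogue in dialogues.items():
    user_goal = dialogue["Scenario"]["User_Goal"]
    for event in dialogue["Events"]:
        agent = event["Agent"]                      # "User" or "Wizard"
        acts = [a["act"] for a in event.get("Actions", [])]
        print(dialogue_id, agent, acts, event.get("Text", ""))
    break  # only inspect the first dialogue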

Experimental Setup

Dependency

Check the required packages in requirements.txt, or simply install them by running

❱❱❱ pip install -r requirements.txt

Setup MongoDB server

Install the MongoDB server; please check the official MongoDB installation documentation.

Then, restore the DB dump by running the following command

❱❱❱ cd ./db && bash restore.sh
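
To sanity-check that the dump was restored, here is a minimal sketch using pymongo that simply lists the databases and collections on the local server (no BiToD-specific names are assumed):

from pymongo import MongoClient

# Connect to the local MongoDB server on the default port.
client = MongoClient("mongodb://localhost:27017/")

# List whatever the restore created; the knowledge-base collections should appear here.
for db_name in client.list_database_names():
    print(db_name, client[db_name].list_collection_names())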

Preprocessing

❱❱❱ python preprocess.py --setting zh
  • --setting: data preprocessing for the monolingual, bilingual, and cross-lingual settings. Options: [en, zh, en_zh, en2zh, zh2en] (a sketch that runs all of them follows below)
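
If you want every setting at once, a small sketch that simply shells out to preprocess.py for each supported option:

import subprocess

# Run the documented preprocessing script once per setting.
for setting in ["en", "zh", "en_zh", "en2zh", "zh2en"]:
    subprocess.run(["python", "preprocess.py", "--setting", setting], check=True)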

Baselines

Here we show one example of training and evaluation. Check run.sh to run all the baselines.

mT5(zh)

Train

CUDA_VISIBLE_DEVICES=0,1 python -m torch.distributed.launch --nproc_per_node=2 train.py \
--model_name_or_path google/mt5-small \
--do_train \
--do_eval \
--train_file data/preprocessed/zh_train.json \
--validation_file data/preprocessed/zh_valid.json \
--learning_rate 5e-4  \
--num_train_epochs 8 \
--source_lang en_XX \
--target_lang en_XX \
--logging_steps 100 \
--save_steps 2000 \
--output_dir save/zh_mt5_5e-4 \
--per_device_train_batch_size=8 \
--per_device_eval_batch_size=8 \
--gradient_accumulation_steps 8 \
--overwrite_output_dir \
--predict_with_generate \
--fp16 \
--sharded_ddp zero_dp_3
  • --model_name_or_path: path of the pre-trained model
  • --train_file: preprocessed training file
  • --output_dir: directory where checkpoints and outputs are saved

Evaluate Model

❱❱❱ CUDA_VISIBLE_DEVICES=0 python evaluate.py --model_path save/zh_mt5_5e-4 --setting zh --reference_file_path data/zh_test.json --save_prefix t5_
  • --model_path: path of the trained model
  • --reference_file_path: test set data path
  • --save_prefix: prefix of result file

Evaluate File

We also support evaluating a prediction file directly:

❱❱❱ python evaluate.py --eval_mode eval_file --prediction_file_path result/zh_end2end_predictions.json --setting zh --reference_file_path data/zh_test.json

Citation:

The BibTeX entry is listed below:

@article{lin2021bitod,
  title={BiToD: A Bilingual Multi-Domain Dataset For Task-Oriented Dialogue Modeling},
  author={Lin, Zhaojiang and Madotto, Andrea and Winata, Genta Indra and Xu, Peng and Jiang, Feijun and Hu, Yuxiang and Shi, Chen and Fung, Pascale},
  journal={arXiv preprint arXiv:2106.02787},
  year={2021}
}
