Knowledge Graph-based Multilingual Question Answering (KG-MLQA), one of the essential subtasks of Knowledge Graph-based Question Answering (KGQA), addresses questions expressed in different languages and aims to bridge the lexical gap between questions and knowledge graph(s). However, existing KG-MLQA work mainly focuses on the semantic parsing of multilingual questions and ignores questions that require integrating information from cross-lingual knowledge graphs (CLKG). This paper extends KG-MLQA to Cross-lingual KG-based Multilingual Question Answering (CLKGQA) and constructs MLPQ, the first CLKGQA dataset over multilingual DBpedia, which contains 300K questions in English, Chinese, and French. We further propose a novel KG sampling algorithm based on subgraph structural features and use it to obtain KGs for MLPQ, making the evaluated methods compatible with our datasets. To evaluate the dataset, we put forward a general question answering framework whose core idea is to transform CLKGQA into KG-MLQA: we first use a Cross-lingual Entity Alignment (CLEA) model to merge the CLKG into a single KG, and then obtain the answer to the question with a multi-hop QA model combined with a multilingual pre-trained model. We then establish two baselines for MLPQ, one of which uses Google Translate to obtain aligned entities, while the other adopts a recent CLEA model. Experiments show that a simple combination of existing QA and CLEA methods fails to achieve ideal performance on CLKGQA. Moreover, the availability of our benchmark contributes to the question answering and entity alignment communities.
MLPQ contains a total of 300K questions covering three language pairs (English-Chinese, English-French, and Chinese-French). Answering each question requires inference over a 2-hop or 3-hop cross-lingual path.
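To make the notion of a cross-lingual path concrete, here is a minimal sketch of a 2-hop question whose first hop lies in one KG and whose second hop lies in another. The triples, the entity prefixes, and the helper names `merge_kgs` and `follow_path` are hypothetical and are not taken from MLPQ or its baselines; the alignment dictionary stands in for whatever a CLEA model or a translation step would produce.

```python
from collections import defaultdict

# Toy triples for illustration only (hypothetical, not taken from MLPQ).
EN_KG = [("en:Marie_Curie", "spouse", "en:Pierre_Curie")]
FR_KG = [("fr:Pierre_Curie", "lieu_de_naissance", "fr:Paris")]

# Alignment pairs as an entity alignment step might output them (assumed).
ALIGNMENT = {"fr:Pierre_Curie": "en:Pierre_Curie"}


def merge_kgs(kg_a, kg_b, alignment):
    """Merge two KGs into one (head, relation) -> tails index.

    Entities of kg_b that have an aligned counterpart are rewritten to it,
    so that paths can cross from one KG into the other.
    """
    index = defaultdict(set)
    for head, rel, tail in kg_a:
        index[(head, rel)].add(tail)
    for head, rel, tail in kg_b:
        head = alignment.get(head, head)
        tail = alignment.get(tail, tail)
        index[(head, rel)].add(tail)
    return index


def follow_path(index, start, relations):
    """Follow a relation path hop by hop and return the reachable entities."""
    frontier = {start}
    for rel in relations:
        frontier = {t for e in frontier for t in index.get((e, rel), ())}
    return frontier


merged = merge_kgs(EN_KG, FR_KG, ALIGNMENT)
# "Where was the spouse of Marie Curie born?" as a 2-hop relation path.
answers = follow_path(merged, "en:Marie_Curie", ["spouse", "lieu_de_naissance"])
print(answers)
```

Without the alignment pair the second hop has no starting point in the French KG and the answer set is empty, which is why alignment quality matters for this kind of question.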
We establish MLPQ through a semi-automatic process shown in the following picture:
Statistics of the generated questions are shown below. Each subset contains English, Chinese, and French versions, for a total of 314,479 questions:
| KG pair | Language | Questions (2-hop) | Questions (3-hop) | Relation pairs in questions (2-hop) | Relation pairs in questions (3-hop) | Average length (2-hop) | Average length (3-hop) |
|---|---|---|---|---|---|---|---|
| en-zh | English | 14,656 | 29,815 | 1,250 | 2,628 | 12.4 | 15.5 |
| | Chinese | 14,852 | 29,643 | 1,251 | 2,637 | 17.2 | 21.7 |
| | French | 15,169 | 30,360 | 1,251 | 2,626 | 11.3 | 16.1 |
| en-fr | English | 15,289 | 18,154 | 1,138 | 3,575 | 12.3 | 15.5 |
| | Chinese | 15,831 | 18,035 | 1,141 | 3,578 | 17.8 | 21.8 |
| | French | 15,867 | 17,993 | 1,144 | 3,580 | 11.7 | 14.7 |
| zh-fr | English | 8,373 | 17,800 | 759 | 1,674 | 11.6 | 16.0 |
| | Chinese | 8,414 | 17,877 | 758 | 1,677 | 17.5 | 21.4 |
| | French | 8,495 | 17,856 | 758 | 1,668 | 12.1 | 14.9 |
| Sum | - | 116,946 | 197,533 | 3,157 | 9,484 | 12.2/17.5/11.6 | 15.6/21.6/15.4 |

Average lengths in the Sum row are listed as English/Chinese/French.
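As a quick sanity check, the per-subset question counts in the table can be summed and compared with the reported totals. The snippet below is only an arithmetic check on the numbers copied from the table; the `COUNTS` dictionary layout is an illustrative choice, not a format used by the datasets.

```python
# (2-hop, 3-hop) question counts copied from the table above,
# keyed by KG pair and question language.
COUNTS = {
    "en-zh": {"English": (14656, 29815), "Chinese": (14852, 29643), "French": (15169, 30360)},
    "en-fr": {"English": (15289, 18154), "Chinese": (15831, 18035), "French": (15867, 17993)},
    "zh-fr": {"English": (8373, 17800), "Chinese": (8414, 17877), "French": (8495, 17856)},
}

two_hop = sum(c[0] for langs in COUNTS.values() for c in langs.values())
three_hop = sum(c[1] for langs in COUNTS.values() for c in langs.values())
total = two_hop + three_hop

print(two_hop, three_hop, total)  # 116946 197533 314479
```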
- The datasets are available in two formats: RDF, and a custom format similar to that of the datasets used in IRN.
- All the datasets are in the datasets directory. For an explanation of the file naming conventions and our custom format, please refer to that directory.
- We established three baseline models for MLPQ.
  - The latest baseline combines NMN and UHop and runs on our latest datasets, which integrate bilingual KGs. It achieves the highest scores on our datasets.
  - The other two, older models use MTransE and were tested on version 1.0 of our datasets:
    - MIRN is based on the popular multi-hop reasoning model IRN.
    - CL-MKQA is based on a multiple KGQA model.
- Baseline code is in the baselines directory. To try these baselines, please refer to that directory.
Using the KGT model (https://github.com/bisheng/KTG4KBQG), we generated additional paraphrases for the questions in MLPQ. We randomly replaced 50% of the original questions with these paraphrases, which further enhances the diversity of MLPQ. Version 1.3 also provides a train/dev/test split.
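The random replacement and the split described here can be sketched as follows. This is a minimal illustration under stated assumptions, not the script used to build MLPQ: the function names, the fixed seed, and the 8:1:1 split ratio are all hypothetical, since the actual ratio is not stated in this document.

```python
import random


def replace_with_paraphrases(questions, paraphrases, ratio=0.5, seed=0):
    """Return a copy of `questions` with a `ratio` share swapped for paraphrases.

    `paraphrases[i]` is assumed to be a paraphrase of `questions[i]`.
    """
    rng = random.Random(seed)
    n_replace = int(len(questions) * ratio)
    chosen = set(rng.sample(range(len(questions)), n_replace))
    return [paraphrases[i] if i in chosen else q for i, q in enumerate(questions)]


def split_dataset(items, ratios=(0.8, 0.1, 0.1), seed=0):
    """Shuffle and split into train/dev/test (the ratio is an assumption)."""
    rng = random.Random(seed)
    shuffled = list(items)
    rng.shuffle(shuffled)
    n_train = int(len(shuffled) * ratios[0])
    n_dev = int(len(shuffled) * ratios[1])
    train = shuffled[:n_train]
    dev = shuffled[n_train:n_train + n_dev]
    test = shuffled[n_train + n_dev:]
    return train, dev, test


# Placeholder strings stand in for real questions and their paraphrases.
questions = [f"q{i}" for i in range(10)]
paraphrases = [f"p{i}" for i in range(10)]

mixed = replace_with_paraphrases(questions, paraphrases)
train, dev, test = split_dataset(mixed)
print(len(train), len(dev), len(test))
```

Seeding a local `random.Random` instance keeps the replacement and the split reproducible without touching the global random state.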
We recreated the datasets to address their diversity and redundancy problems. As a result, they now contain fewer questions. We also added a new baseline framework combining NMN and UHop with m-BERT.
In this slightly improved version, we corrected many grammatical errors and added the RDF version of all the datasets.
- The current MLPQ version is 1.3. We expect to continue this work and provide datasets of higher quality and more variety in the future.
- Because the generation of MLPQ is semi-automatic and relies to some degree on manually crafted templates and machine translation, there might be some minor problems in the text. We have tried to improve the quality of MLPQ by post-editing, and there should be very few problems now. However, if you find any errors in the dataset, please contact us.
For now, MLPQ mainly contains 2-hop and 3-hop path questions. In the future, we plan to adopt paraphrase generation based on web resources to create a wider variety of question expressions. Path questions are merely one subset of complex questions; we also plan to add fact-oriented complex questions with property information and to explore aggregation-type complex questions.
This project is licensed under the GPLv3 License. See the LICENSE file for details.