answer-selection

New datasets for Answer Sentence Selection.

Why?

Common datasets for Answer Sentence Selection (AS2) such as WikiQA and TREC-QA are very small (a few thousand QA pairs) and are no longer challenging: some systems achieve MAP > 92% on both datasets.

A recent large-scale dataset (ASNQ) shows that more data are needed to reach SOTA performance. Inspired by how ASNQ was built starting from Google's NQ, we release four large-scale datasets for AS2 derived from NewsQA, TriviaQA, SearchQA and HotpotQA.

We named these new datasets NewsAS2, TriviaAS2, SearchAS2 and HotpotAS2.

NOTICE: in all datasets, the original validation set has been split into a dev set and a test set, so that both have non-hidden labels.

How to

The datasets are available from the Hugging Face datasets repository.

First, install the datasets library with pip install datasets --upgrade.

Then, download the datasets with:

from datasets import load_dataset

news_as2 = load_dataset('lucadiliello/news_as2')
trivia_as2 = load_dataset('lucadiliello/trivia_as2')
search_as2 = load_dataset('lucadiliello/search_as2')
hotpot_as2 = load_dataset('lucadiliello/hotpot_as2')
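Once loaded, it can help to check how many examples each split contains. The helper below is a minimal sketch: `split_sizes` is a hypothetical name, not part of the datasets library, and it only assumes that the loaded object maps split names to objects supporting `len()`, which is how `load_dataset` normally behaves when no split is requested. It makes no assumption about split or column names, since those are not listed here.

```python
def split_sizes(dataset_dict):
    """Return {split_name: number_of_examples} for a dict-like of splits.

    Hypothetical helper: works on any mapping whose values support len(),
    such as the object returned by load_dataset(...) without a split argument.
    """
    return {name: len(split) for name, split in dataset_dict.items()}
```

For example, `split_sizes(news_as2)` should return one entry per available split, which can be compared against the QA pair counts in the Statistics section.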

Statistics

Dataset	Train # Q	Train # QA pairs	Validation # Q	Validation # QA pairs	Test # Q	Test # QA pairs
NewsAS2	71561	1840533	2102	51844	2083	51472
TriviaAS2	61688	1843349	3933	117012	3852	114853
SearchAS2	117220	3281909	8509	236360	8470	236792
HotpotAS2	72921	489238	2989	25295	2912	24846
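The counts above also give the average number of candidate sentences per question, which differs considerably between datasets. The sketch below only divides the QA pair counts by the question counts copied from the table; `STATS` and `candidates_per_question` are illustrative names. For the training sets this gives roughly 26 candidates per question for NewsAS2 and roughly 7 for HotpotAS2.

```python
# (number of questions, number of QA pairs) per split, copied from the table above
STATS = {
    "NewsAS2":   {"train": (71561, 1840533),  "validation": (2102, 51844),  "test": (2083, 51472)},
    "TriviaAS2": {"train": (61688, 1843349),  "validation": (3933, 117012), "test": (3852, 114853)},
    "SearchAS2": {"train": (117220, 3281909), "validation": (8509, 236360), "test": (8470, 236792)},
    "HotpotAS2": {"train": (72921, 489238),   "validation": (2989, 25295),  "test": (2912, 24846)},
}


def candidates_per_question(stats):
    """Average number of QA pairs (candidate sentences) per question, per split."""
    return {
        name: {split: pairs / questions for split, (questions, pairs) in splits.items()}
        for name, splits in stats.items()
    }


ratios = candidates_per_question(STATS)
for name, splits in ratios.items():
    print(name, {split: round(value, 1) for split, value in splits.items()})
```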

Baselines performance

The best checkpoint is selected by MAP on the development set.
Each configuration is run 5 times with different random seeds.
The standard deviation of the results is given in parentheses.
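For reference, the three ranking metrics reported in the tables below (MAP, MRR and P@1) can be computed per question from the candidate labels sorted by model score. This is a minimal sketch, not the evaluation code used for the baselines: the function names and input layout are assumptions, and questions without any correct candidate simply score 0 here, whereas some evaluation setups filter such questions out.

```python
from statistics import mean


def average_precision(labels):
    """labels: 0/1 relevance flags of the candidates, best-scored first."""
    hits, precisions = 0, []
    for rank, label in enumerate(labels, start=1):
        if label:
            hits += 1
            precisions.append(hits / rank)  # precision at each correct candidate
    return mean(precisions) if precisions else 0.0


def reciprocal_rank(labels):
    """1 / rank of the first correct candidate, or 0 if there is none."""
    for rank, label in enumerate(labels, start=1):
        if label:
            return 1.0 / rank
    return 0.0


def precision_at_1(labels):
    """1 if the top-ranked candidate is correct, else 0."""
    return 1.0 if labels and labels[0] else 0.0


def evaluate(scored):
    """scored: {question: [(score, label), ...]} with label in {0, 1}."""
    ranked = [
        [label for _, label in sorted(cands, key=lambda pair: pair[0], reverse=True)]
        for cands in scored.values()
    ]
    return {
        "MAP": mean([average_precision(r) for r in ranked]),
        "MRR": mean([reciprocal_rank(r) for r in ranked]),
        "P@1": mean([precision_at_1(r) for r in ranked]),
    }


# Toy example with two questions (made-up scores and labels)
example = {
    "q1": [(0.9, 0), (0.8, 1), (0.1, 1)],
    "q2": [(0.7, 1), (0.2, 0)],
}
metrics = evaluate(example)
```

The values in the tables are percentages, so the fractions returned here would be multiplied by 100 before comparison.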

`NewsAS2`

Model	MAP	MRR	P@1
RoBERTa Base	82.4 (0.2)	85.2 (0.3)	76.4 (0.6)
ELECTRA Base	82.0 (0.2)	84.8 (0.2)	76.0 (0.2)

`TriviaAS2`

Model	MAP	MRR	P@1
RoBERTa Base	76.9 (0.6)	82.2 (0.5)	73.1 (0.5)
ELECTRA Base	73.3 (0.7)	79.1 (1.1)	68.9 (1.3)

`SearchAS2`

Model	MAP	MRR	P@1
RoBERTa Base	84.1 (0.2)	88.1 (0.3)	82.1 (0.5)
ELECTRA Base	83.0 (0.1)	87.3 (0.2)	80.3 (0.4)

`HotpotAS2`

Model	MAP	MRR	P@1
RoBERTa Base	92.6 (0.2)	93.5 (0.2)	90.4 (0.3)
ELECTRA Base	92.9 (0.1)	93.5 (0.1)	89.5 (0.1)
