VARSHAJOSHY/multi-lingual-stance-dataset

multi-lingual-stance-dataset

The objective of stance detection is to determine the attitude that a text (or its author) expresses toward a given topic or statement. It is a key component of many NLP tasks, such as rumour verification, fact-checking, and fake news detection.
Because annotated data is scarce in other languages, most stance detection research has focused on English.

I looked for a multilingual dataset for stance detection as part of my dissertation. This project presents the statistical analysis I carried out on four datasets that I found useful for my thesis. For reference, the details of each dataset, including its repository or download URL and the published paper, are listed below.

1. Catalonia independence dataset
   Dataset with 2 languages: Spanish and Catalan.
   GitHub Repository - https://github.com/ixa-ehu/catalonia-independence-corpus
   Related Paper -

2. The x-stance Dataset
   Dataset with 3 languages: Swiss Standard German, French, and Italian.
   Contains more than 150 political questions and 67k comments written by candidates on those questions.
   GitHub Repository - https://github.com/ZurichNLP/xstance/tree/v1.0.0
   Related Paper - https://arxiv.org/pdf/2003.08385.pdf

3. SemEval-2016 Dataset
   Dataset with 1 language: English.
   Location - https://alt.qcri.org/semeval2016/task6/index.php?id=data-and-tools

4. Dataset 4
   Dataset with 2 languages: French and Italian.
   GitHub Repository - https://github.com/mirkolai/MultilingualStanceDetection
   Related Paper - https://doi.org/10.1016/j.csl.2020.101075
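
As a rough illustration of the kind of statistics one might compute on these datasets, the sketch below counts stance labels per language for records stored as JSON Lines. It is not the analysis code from this project. The field names `language` and `label`, the helper names, and the sample rows are all assumptions made for the example, so adjust them to each dataset's actual format.

```python
import json
from collections import Counter, defaultdict


def parse_jsonl(lines):
    """Parse an iterable of JSON Lines strings into dicts, skipping blank lines."""
    return [json.loads(line) for line in lines if line.strip()]


def label_distribution(records, language_key="language", label_key="label"):
    """Count stance labels per language.

    The default key names are assumptions, not a documented schema.
    """
    counts = defaultdict(Counter)
    for record in records:
        counts[record[language_key]][record[label_key]] += 1
    return {language: dict(counter) for language, counter in counts.items()}


def label_shares(distribution):
    """Convert per-language label counts into proportions that sum to 1."""
    shares = {}
    for language, counter in distribution.items():
        total = sum(counter.values())
        shares[language] = {label: n / total for label, n in counter.items()}
    return shares


# Made-up sample rows for illustration only (not taken from any dataset).
sample_lines = [
    '{"language": "de", "label": "FAVOR"}',
    '{"language": "de", "label": "AGAINST"}',
    '{"language": "fr", "label": "FAVOR"}',
    '{"language": "de", "label": "FAVOR"}',
]

distribution = label_distribution(parse_jsonl(sample_lines))
print(distribution)
print(label_shares(distribution))
```

Comparing the per-language shares makes class imbalance visible before any model is trained. For a real file, the lines could be read with `open(path, encoding="utf-8")` and passed to `parse_jsonl`.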

Purpose

When I was working on a multilingual stance detection task, I did not find any open-source datasets to work with. I believe there are others working on similar tasks, and I hope this collection helps them.
