# AfriHate Shared Task Co-located with AfricaNLP 2025: Hate and Offensive Language Detection in African Languages
Join Discord | Contact Us | Download Dataset | How to Participate
- Overview
- Importance and Impact
- Participants
- Languages
- Tracks
  - Track 1: Category Classification
  - Track 2: Target Classification
- Pilot Task
- Baselines and Evaluation
- Task Organizers
- Ethical Statement
## Overview

Online hate speech is a growing problem worldwide, causing harm to users, polluting online communities, and potentially leading to offline violence. Social media platforms facilitate the rapid propagation of hateful content. However, in regions like Africa, interventions primarily focus on high-profile individuals, often through labour-intensive human moderation which is not scalable. Furthermore, there is limited availability of machine learning tools for African languages due to the scarcity of labeled datasets.
AfriHate introduces the first collection of high-quality labeled social media datasets for detecting hate and abusive language in 17 African languages. These datasets form the basis of the AfriHate Shared Task, co-located with AfricaNLP 2025, which covers hate speech and abusive language detection in African languages across two tracks.
## Importance and Impact

Africa is home to over 2000 languages, including 75 languages with at least one million speakers each. Despite this linguistic richness, African languages are under-served in NLP. Our datasets aim to create a more inclusive digital landscape by supporting African languages and addressing language-specific challenges and nuances in hate speech detection.
## Participants

This shared task should be of interest to:

- Researchers working on low-resource African languages and socio-linguistic phenomena.
- Researchers studying machine learning approaches for hate speech detection.
## Languages

Language | Country |
---|---|
Hausa, Igbo, Yoruba, Nigerian Pidgin | Nigeria |
Amharic, Tigrinya, Oromo, Somali | Ethiopia |
Swahili | Kenya |
Moroccan Arabic | Morocco |
Sudanese Arabic | Sudan |
Twi | Ghana |
Mozambican Portuguese | Mozambique |
Kinyarwanda | Rwanda |
IsiZulu, Afrikaans, IsiXhosa | South Africa |
## Tracks

### Track 1: Category Classification

Given a social media post, classify it into one of three categories: Hate, Offensive, or Normal. This track focuses on identifying the type of language used in a post and categorizing it accordingly.
### Track 2: Target Classification

Given a social media post, determine the attribute targeted by its offensive content. The possible targets are Ethnicity, Religion, Sexual Orientation, Disability, and Gender. This track aims to identify which group the offensive content is directed at.
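Both tracks can be treated as standard multi-class text classification. As a minimal illustrative sketch (not the official baseline), the snippet below uses a character n-gram TF-IDF representation with logistic regression via scikit-learn; the file paths and the `text`/`label` column names are assumptions, so adjust them to the released data format:

```python
# Minimal baseline sketch for either track; only the label set differs
# (Track 1: Hate/Offensive/Normal, Track 2: Ethnicity/Religion/...).
import pandas as pd
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Hypothetical file names and column names; check the released data files.
train = pd.read_csv("train.tsv", sep="\t")
dev = pd.read_csv("dev.tsv", sep="\t")

# Character n-grams are a reasonable default for low-resource languages
# where word-level tokenization can be unreliable.
model = make_pipeline(
    TfidfVectorizer(analyzer="char_wb", ngram_range=(2, 5), min_df=2),
    LogisticRegression(max_iter=1000),
)
model.fit(train["text"], train["label"])
predictions = model.predict(dev["text"])
```

Fine-tuning a multilingual pretrained model will likely perform better; this sketch only establishes a simple floor without requiring a GPU.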
## Baselines and Evaluation

Baseline models will be shared, and performance will be evaluated using accuracy, precision, recall, and macro-F1.
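For local validation, these metrics can be computed with scikit-learn as sketched below; macro averaging is assumed for precision and recall as well as F1, and the official CodaBench scorer remains authoritative:

```python
from sklearn.metrics import accuracy_score, precision_recall_fscore_support

def evaluate(y_true, y_pred):
    # Accuracy plus macro-averaged precision, recall, and F1,
    # matching the metrics listed above.
    p, r, f1, _ = precision_recall_fscore_support(
        y_true, y_pred, average="macro", zero_division=0
    )
    return {
        "accuracy": accuracy_score(y_true, y_pred),
        "precision": p,
        "recall": r,
        "macro_f1": f1,
    }

print(evaluate(["Hate", "Normal", "Offensive"], ["Hate", "Offensive", "Offensive"]))
```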
## Important Dates

Description | Deadline |
---|---|
Training Data Ready | 25 May 2025 |
Evaluation Start | 15 June 2025 |
Evaluation End | 30 June 2025 |
System Description Paper Due | 07 July 2025 |
Notification to Authors | 14 July 2025 |
Camera-Ready Due | 20 July 2025 |
Workshop | 31 July 2025 |
## How to Participate

The task will be divided into three phases: Development, Evaluation, and Post-Evaluation.
- Register: Sign up on the CodaBench competition platform.
- Track: Decide on the track(s) you want to participate in (Track 1 and/or Track 2).
- Download: Access the datasets for each track provided in this repository.
- Develop: Build your models using the provided data.
- Submit: Submit your predictions on the CodaBench competition platform (an illustrative prediction-file sketch follows this list).
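As a sketch of the final step, the snippet below writes predictions to a tab-separated file. The file name, `id`/`label` header, and any archiving requirements are assumptions; the CodaBench competition page defines the actual submission format:

```python
# Hypothetical submission writer: one prediction per instance ID.
import csv

def write_predictions(ids, labels, path="pred.tsv"):
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f, delimiter="\t")
        writer.writerow(["id", "label"])  # assumed header, not confirmed
        writer.writerows(zip(ids, labels))

write_predictions(["post_001", "post_002"], ["Hate", "Normal"])
```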
## Competition Rules and Terms

1. Consent to Public Release of Scores
- By submitting results, you consent to the public release of your scores:
  - on the competition website,
  - at the AfricaNLP workshop,
  - in associated proceedings.
2. Score Release and Validity
- Task organizers reserve the right to withhold scores for:
  - incomplete submissions,
  - erroneous submissions,
  - deceptive submissions,
  - rule-violating submissions.
- Inclusion of a submission's scores does not constitute endorsement.
3. Team Participation Rules
- Participants may be involved in only one team.
- Exceptions may be granted with prior approval from organizers.
4. Account Management
- Each team must create and use exactly one account on the Codabench platform.
5. Team Constitution
- Team membership cannot be changed after the evaluation period begins.
6. Development Period Rules
- Teams can submit up to 999 submissions.
- Results are visible only to the submitting team.
- Leaderboard is disabled.
- Warnings and errors are visible for each submission.
7. Evaluation Period Rules
- Teams are limited to 3 submissions.
- Only the final submission will be considered official.
- Warnings and errors are visible for each submission.
8. Post-Competition
- The gold labels will be released after the competition.
- Teams are encouraged to report results for all of their system variants in their description paper.
- The official submission results must be clearly indicated.
9. Public Release of Submissions
- Final team submissions may be made public after the evaluation period.
10. Disclaimer about the Datasets
- Organizers and affiliated institutions provide no warranties on dataset correctness or completeness.
- They are not liable for dataset access or usage.
11. Peer Review Process
- Each participant will review another team's system description paper.
12. Dataset Usage Restrictions
- Datasets should only be used for scientific or research purposes.
- Any other use is explicitly prohibited.
- Datasets must not be redistributed or shared with third parties.
- Interested parties should be directed to the official website.
13. Final ranking
- To be included in the official task ranking, you MUST submit a system description paper.
## Dataset Paper

The datasets are described in the following paper: AfriHate: A Multilingual Collection of Hate Speech and Abusive Language Datasets for African Languages by Shamsuddeen Hassan Muhammad, Idris Abdulmumin, Abinew Ali Ayele, David Ifeoluwa Adelani, Ibrahim Said Ahmad, Saminu Mohammad Aliyu, Nelson Odhiambo Onyango, Lilian D. A. Wanzare, Samuel Rutunda, Lukman Jibril Aliyu, Esubalew Alemneh, Oumaima Hourrane, Hagos Tesfahun Gebremichael, Elyas Abdi Ismail, Meriem Beloucif, Ebrahim Chekol Jibril, Andiswa Bukula, Rooweither Mabuya, Salomey Osei, Abigail Oppong, Tadesse Destaw Belay, Tadesse Kebede Guge, Tesfa Tegegne Asfaw, Chiamaka Ijeoma Chukwuneke, Paul Röttger, Seid Muhie Yimam, and Nedjma Ousidhoum. It provides the details necessary for understanding the data collection, annotation process, and baseline experiments.
## Communication

- Join our Discord Channel to ask questions and receive updates (coming soon).
- If you have any questions or issues, please feel free to create an issue.
- Contact organizers at: afrihate-sharedtask-organizers[at]googlegroups[dot]com
## FAQs

Do I have to participate in all languages for a given track?
- No, you can participate in one or more languages.
How will you verify my submitted model?
- To be included in the final team rankings, participants must submit a system description paper describing their approach and methodology in detail; this requirement helps ensure the scientific integrity of the results.
When will you release the gold labels?
- The gold labels for the dev set will be released when the evaluation phase starts; the gold labels for the test sets will be released after the competition ends.
Can I use LLMs in the different tracks?
- Yes.
Can I use additional datasets (e.g., publicly available ones from other sources)?
- Yes. Please cite them in your system description paper.
How was the data collected?
- The data collection process was a standard one; see previous papers in the area for an idea (e.g., https://aclanthology.org/S18-1001.pdf). Each data instance (a text snippet) was annotated by three or more annotators, who decided whether the text conveys hate or offensive content. For details about the data sources, annotation guidelines, number of annotators per language, etc., please check our paper, Muhammad et al. (2025).
How was the data annotated and did you use LLMs to annotate it?
- No. The data instances were annotated by three or more native speakers, and no LLMs were involved in the process. The annotators labeled whole sentences, not individual words.
Will I be included in the final ranking if I do not write a system description paper?
- No. You MUST write a system description paper to be included in the final ranking.
I have never written a system description paper. How can I write one?
- We will have an online writing tutorial and share resources to help you write a system description paper.
Our system did not perform very well, should I still write a system description paper?
- We want to hear from all of you, even if you did not outperform other systems! Write about the details of your system. (Yes, we want your insights from negative results too!)
## Resources

- SemEval 2025 Shared Tasks
- Frequently Asked Questions about SemEval
- Paper Submission Requirements
- Guidelines for Writing Papers
- Paper style files
- Paper submission link (to be added)
## Task Organizers

Organizer | Affiliation | Role |
---|---|---|
Shamsuddeen Hassan Muhammad | Bayero University; Masakhane | Ph.D. Candidate in NLP |
Idris Abdulmumin | Ahmadu Bello University; Masakhane | Ph.D. in NLP |
David Ifeoluwa Adelani | Masakhane | Ph.D. in NLP |
Ibrahim Said Ahmad | Bayero University; Masakhane | Ph.D. in NLP |
Saminu Mohammad Aliyu | Bayero University; Masakhane | Ph.D. Candidate |
Tadesse Destaw Belay | IPN; EthioNLP | Ph.D. Candidate |
Abinew Ali Ayele | Bahir Dar University; EthioNLP | Ph.D. Candidate |
Seid Muhie Yimam | Universität Hamburg; Masakhane; EthioNLP | Ph.D. in NLP |
NOTE: Some of the content on this page is adapted from our previous shared task pages, such as SemEval2025-Task11.