🎲 Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological Behavior

This repository accompanies the paper:
"Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological Behavior"
📍 Accepted at Findings of EMNLP 2025.

📂 Dataset

The main contribution of this work is a Spanish sentence retrieval dataset focused on symptoms associated with pathological gambling.

Corpus: resources/dataset/corpus.jsonl
Queries & Qrels: also available in the same directory. Four assessors took part in the labelling; a sentence is marked relevant in the qrels when at least two of them considered it relevant.
All files follow the BEIR-compatible format, so they can be used directly with standard baselines (see Section 4 of the paper).
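
As a rough illustration of what the BEIR-compatible layout implies, the sketch below parses a JSONL corpus and a TSV qrels file with the standard library only. It assumes the usual BEIR conventions (`_id`, `title`, `text` fields in the corpus; `query-id`, `corpus-id`, `score` columns in the qrels). The in-memory sample data is invented for the example, and the exact names of the queries and qrels files in this repository are not stated here, so check them before adapting the code.

```python
import csv
import io
import json


def load_corpus(lines):
    """Parse BEIR-style JSONL lines into {doc_id: {"title": ..., "text": ...}}."""
    corpus = {}
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        corpus[record["_id"]] = {
            "title": record.get("title", ""),
            "text": record["text"],
        }
    return corpus


def load_qrels(lines):
    """Parse BEIR-style TSV qrels (query-id, corpus-id, score) into nested dicts."""
    reader = csv.DictReader(lines, delimiter="\t")
    qrels = {}
    for row in reader:
        qrels.setdefault(row["query-id"], {})[row["corpus-id"]] = int(row["score"])
    return qrels


# Tiny in-memory stand-ins (hypothetical data) so the sketch runs on its own.
sample_corpus = io.StringIO(
    '{"_id": "s1", "title": "", "text": "Frase de ejemplo uno."}\n'
    '{"_id": "s2", "title": "", "text": "Frase de ejemplo dos."}\n'
)
sample_qrels = io.StringIO("query-id\tcorpus-id\tscore\nq1\ts1\t1\nq1\ts2\t0\n")

corpus = load_corpus(sample_corpus)
qrels = load_qrels(sample_qrels)
```

To read the real corpus, pass an open file instead of the sample, for example `open("resources/dataset/corpus.jsonl", encoding="utf-8")`.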

Additionally, a subfolder with pools is provided, containing the material used by both human annotators and LLMs for dataset labeling.
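
The relevance rule used for the qrels (relevant when at least two assessors agree) can be sketched as a simple vote count. The tuple layout, the assessor names A to D, and the function name are hypothetical; the pool files in this repository may be organised differently.

```python
from collections import defaultdict

MIN_VOTES = 2  # threshold stated in this README: at least two assessors


def aggregate_votes(judgements, min_votes=MIN_VOTES):
    """Collapse per-assessor binary labels into one label per (query, sentence).

    `judgements` is an iterable of (query_id, sentence_id, assessor, label)
    tuples, with label 1 for relevant and 0 for not relevant.
    """
    votes = defaultdict(int)
    for query_id, sentence_id, _assessor, label in judgements:
        votes[(query_id, sentence_id)] += int(label)
    return {pair: int(count >= min_votes) for pair, count in votes.items()}


# Hypothetical judgements from four assessors.
judgements = [
    ("q1", "s1", "A", 1), ("q1", "s1", "B", 1), ("q1", "s1", "C", 0), ("q1", "s1", "D", 0),
    ("q1", "s2", "A", 1), ("q1", "s2", "B", 0), ("q1", "s2", "C", 0), ("q1", "s2", "D", 0),
]
labels = aggregate_votes(judgements)
```

Here `s1` receives two positive votes and is kept as relevant, while `s2` receives only one and is not.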

⚙️ Code

The src folder is structured as follows:

train/ → Training scripts for our domain-adapted ludoBETO model.
labelling/ → Statistics and analysis of human vs. automatic label generation. The user "David Gallego" is the trained psychologist.
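
One common way to compare human and automatic labels is Cohen's kappa, which corrects raw agreement for chance. The sketch below is a generic implementation on invented labels; it is an assumption that this statistic is relevant here, and the scripts in `labelling/` may compute different measures.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Cohen's kappa for two equally long sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of items where both raters gave the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement under independence, from each rater's label frequencies.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical labels: 1 = relevant, 0 = not relevant.
human = [1, 1, 0, 0, 1, 0]
automatic = [1, 0, 0, 0, 1, 1]
kappa = cohen_kappa(human, automatic)
```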

🤖 Model

We introduce ludoBETO, a BETO-based model adapted to the pathological gambling domain.
This model is publicly available on HuggingFace for further research and fine-tuning.

🔧 In our paper, we also implemented a cross-encoder using the SimCSE strategy with custom parameters over ludoBETO.
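
For readers unfamiliar with SimCSE, the sketch below shows the generic contrastive (InfoNCE) objective it is built on: two embeddings of the same sentence should be more similar to each other than to embeddings of other sentences in the batch. This is not the paper's configuration. The custom parameters are not listed in this README, the toy vectors are invented, and the temperature of 0.05 is the default from the original SimCSE work, used here only as a placeholder.

```python
import math


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm


def simcse_loss(view_a, view_b, temperature=0.05):
    """Mean InfoNCE loss: view_a[i] should match view_b[i] and no other row."""
    losses = []
    for i, anchor in enumerate(view_a):
        logits = [cosine(anchor, other) / temperature for other in view_b]
        max_logit = max(logits)  # subtract the max for numerical stability
        log_sum = max_logit + math.log(sum(math.exp(x - max_logit) for x in logits))
        losses.append(log_sum - logits[i])  # cross-entropy with target index i
    return sum(losses) / len(losses)


# Toy embeddings: matching pairs aligned vs. deliberately mismatched.
batch = [[1.0, 0.0], [0.0, 1.0]]
loss_aligned = simcse_loss(batch, batch)
loss_swapped = simcse_loss(batch, batch[::-1])
```

The aligned batch yields a loss close to zero, while the mismatched batch is penalised heavily, which is the behaviour the objective is meant to produce.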

📖 Citation

If you use this resource, please cite:

@inproceedings{couto-etal-2025,
    title = "Analyzing Gambling Addictions: A Spanish Corpus for Understanding Pathological Behavior",
    author = "Couto-Pintos, Manuel and
              Fernández-Pichel, Marcos and
              Aragón, Mario Ezra and
              Losada, David E.",
    booktitle = "Findings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)"
}

Repository: citiususc/ludosym
