CNER: Concept and Named Entity Recognition

NAACL 2024 · License: CC BY-NC 4.0 · Hugging Face Datasets · Hugging Face Model

This is the official repository for CNER: Concept and Named Entity Recognition.

Citation

This work has been published at NAACL 2024 (main conference). If you use any part of this work, please consider citing our paper as follows:

@inproceedings{martinelli-etal-2024-cner,
    title = "{CNER}: Concept and Named Entity Recognition",
    author = "Martinelli, Giuliano  and
      Molfese, Francesco  and
      Tedeschi, Simone  and
      Fern{\'a}ndez-Castro, Alberte  and
      Navigli, Roberto",
    editor = "Duh, Kevin  and
      Gomez, Helena  and
      Bethard, Steven",
    booktitle = "Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (Volume 1: Long Papers)",
    month = jun,
    year = "2024",
    address = "Mexico City, Mexico",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.naacl-long.461",
    pages = "8329--8344",
}

Description

This repository contains the scripts to evaluate CNER models, together with the official outputs of the CNER system, which can be used to reproduce the results reported in the paper. We also release:

  • Our silver training and gold evaluation data on Hugging Face.
  • A Concept and Named Entity Recognition model trained on CNER-silver on the Hugging Face 🤗 Models hub. Specifically, we fine-tuned a pretrained DeBERTa-v3-base for token classification using the default hyperparameters, optimizer, and architecture of Hugging Face (see the Tutorial Notebook); therefore, the results of this model may differ from those presented in the paper (see the loading sketch below).
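
As a quick start, both the data and the model can be loaded programmatically from the Hugging Face hub. The snippet below is a minimal sketch rather than official usage: the hub identifiers Babelscape/cner and Babelscape/cner-base are assumptions, so check the linked hub pages for the exact names.

    # Minimal sketch: loading CNER data and model from the Hugging Face hub.
    # The hub IDs ("Babelscape/cner", "Babelscape/cner-base") are assumptions;
    # check the dataset and model pages linked above for the exact names.
    from datasets import load_dataset
    from transformers import pipeline

    # Silver training / gold evaluation data (assumed dataset ID)
    dataset = load_dataset("Babelscape/cner")

    # Token-classification pipeline over the released DeBERTa-v3-base model
    # (assumed model ID); "simple" aggregation merges B-/I- tags into spans
    nlp = pipeline(
        "token-classification",
        model="Babelscape/cner-base",
        aggregation_strategy="simple",
    )

    for entity in nlp("Commander Donald S. Smith visited Mexico City."):
        print(entity["entity_group"], entity["word"], round(entity["score"], 3))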

Setup

  1. Clone the repository and enter it:
    git clone https://github.com/Babelscape/cner.git
    cd cner

  2. Create and activate a conda environment:
    conda create -n env-name python=3.9
    conda activate env-name

  3. Install the requirements:
    pip install -r requirements.txt

Evaluate CNER models

To evaluate a CNER model, run the following script:

    python scripts/evaluate.py --predictions_path path_to_predictions

where path_to_predictions is the path to a file containing CNER predictions over the CNER-gold dataset split.

Supported formats:

  • .jsonl

    {"sentence_id": "55705165.21", "tokens": ["Commander", ..., "."],  "predictions": ["B-PER", ... , "O"]}
    
  • .tsv

    Sentence_id	Tokens	predictions
    "55705165.21"	['Commander', 'Donald', 'S.', ..., '.']	['B-PER', 'I-PER', ..., 'O']

Reproduce Paper Results

At outputs/cner_output.jsonl you can find the official outputs of our CNER system. To reproduce our CNER results, run:

    python scripts/evaluate.py --predictions_path outputs/cner_output.jsonl
