OnlpLab/AlephBERT

AlephBERT

Overview

A large pre-trained language model for Modern Hebrew.

Based on the BERT-base architecture: 12 hidden layers, with a 52K-token vocabulary.
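To give a rough sense of what a 52K vocabulary means for model size, the sketch below counts the parameters of a BERT-base style encoder. Only the 12 layers and the 52K vocabulary come from this README. The remaining dimensions (hidden size 768, feed-forward size 3072, 512 positions, 2 segment types), the exact vocabulary size of 52,000, and the function name `bert_param_count` are assumptions taken from the standard BERT-base configuration, so treat the result as an estimate rather than the official figure.

```python
def bert_param_count(
    vocab_size: int,
    hidden: int = 768,           # assumed: standard BERT-base hidden size
    layers: int = 12,            # from the README: 12 hidden layers
    intermediate: int = 3072,    # assumed: standard BERT-base feed-forward size
    max_positions: int = 512,    # assumed: standard BERT-base position table
    type_vocab: int = 2,         # assumed: two segment types
) -> int:
    """Count weights and biases of a BERT-base style encoder with pooler."""
    layer_norm = 2 * hidden  # scale + bias

    # Token, position and segment embedding tables, plus one LayerNorm.
    embeddings = (vocab_size + max_positions + type_vocab) * hidden + layer_norm

    # Self-attention: query, key, value and output projections, plus LayerNorm.
    attention = 4 * (hidden * hidden + hidden) + layer_norm

    # Feed-forward: up-projection, down-projection, plus LayerNorm.
    feed_forward = (
        hidden * intermediate + intermediate
        + intermediate * hidden + hidden
        + layer_norm
    )

    # Pooler: a single dense layer applied to the first token.
    pooler = hidden * hidden + hidden

    return embeddings + layers * (attention + feed_forward) + pooler


# Assumed vocabulary size of exactly 52,000 ("52K" in the README).
total = bert_param_count(vocab_size=52_000)
print(f"Estimated parameters: {total:,} (~{total / 1e6:.0f}M)")
```

Under these assumptions the embedding table accounts for roughly a third of the total, which is why a larger vocabulary noticeably increases the parameter count compared with the original English BERT-base.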

Trained for 10 epochs on 95M sentences from OSCAR, Wikipedia, and Twitter data.

Evaluation

We evaluated AlephBERT for the following prediction tasks:

  • Morphological Segmentation
  • Part of Speech Tagging
  • Morphological Features
  • Named Entity Recognition
  • Sentiment Analysis

On the following benchmarks:

  • The SPMRL Treebank (for: Segmentation, POS, Feats, NER)
  • The Universal Dependency Treebanks (for: Segmentation, POS, Feats, NER)
  • The Hebrew Facebook Corpus (for: Sentiment Analysis)

Citation

@misc{alephBert2021,
  title={AlephBERT: a Pre-trained Language Model to Start Off your Hebrew NLP Application},
  author={Amit Seker and Elron Bandel and Dan Bareket and Idan Brusilovsky and Shaked Refael Greenfeld and Reut Tsarfaty},
  year={2021}
}

Contributors:

The ONLP Lab at Bar Ilan University

PI: Prof. Reut Tsarfaty

Contributors: Amit Seker, Elron Bandel, Dan Bareket, Idan Brusilovsky, Shaked Refael Greenfeld

Advisors: Dr. Roee Aharoni, Prof. Yoav Goldberg

Credits
