@IndoNLP

IndoNLP

We are researchers working to raise the lower bound of the Indonesian NLP standard. We collaborate to release new data resources and benchmarks.

Pinned

  1. indonlu Public

    The first-ever vast natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

    Jupyter Notebook, 535 stars, 190 forks

  2. nusa-crowd Public

    A collaborative project to collect datasets in Indonesian languages.

    Jupyter Notebook, 260 stars, 60 forks

  3. nusax Public

    High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

    Jupyter Notebook, 87 stars, 9 forks

  4. indonlg Public

    The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code!…

    Python, 69 stars, 11 forks

Repositories

Showing 10 of 10 repositories
  • .github Public

    Landing page

    1 star, 0 forks, 0 issues, 0 pull requests, updated Oct 2, 2024
  • nusa-writes Public

    NusaWrites is an in-depth analysis of corpora collection strategy and a comprehensive language modeling benchmark for underrepresented and extremely low-resource Indonesian local languages.

    Jupyter Notebook, 25 stars, Apache-2.0, 2 forks, 0 issues, 0 pull requests, updated Sep 27, 2024
  • cendol Public

    Indonesian T0 | Instruction-tuning for low-resource and extremely low-resource Austronesian languages

    Jupyter Notebook, 10 stars, Apache-2.0, 1 fork, 0 issues, 1 pull request, updated Jun 24, 2024
  • nusa-crowd Public

    A collaborative project to collect datasets in Indonesian languages.

    Jupyter Notebook, 260 stars, Apache-2.0, 60 forks, 35 issues (5 issues need help), 2 pull requests, updated Jun 2, 2024
  • nusa-catalogue Public

    Dataset Catalogue Homepage for Indonesian Languages

    JavaScript, 7 stars, Apache-2.0, 8 forks, 1 issue, 0 pull requests, updated Feb 19, 2024
  • nusax Public

    High-quality parallel resource on sentiment analysis for 10 low-resource Indonesian languages, English, and Indonesian (Outstanding Paper at EACL 2023)

    Jupyter Notebook, 87 stars, Apache-2.0, 9 forks, 0 issues, 0 pull requests, updated May 8, 2023
  • nusacrowd-asr Public

    NusaCrowd ASR Experiment

    Jupyter Notebook, 2 stars, Apache-2.0, 0 forks, 0 issues, 0 pull requests, updated Jan 5, 2023
  • indonlu Public

    The first-ever vast natural language processing benchmark for the Indonesian language. We provide multiple downstream tasks, pre-trained IndoBERT models, and starter code! (AACL-IJCNLP 2020)

    Jupyter Notebook, 535 stars, Apache-2.0, 190 forks, 4 issues, 1 pull request, updated Dec 3, 2022
  • indonlg Public

    The first-ever vast natural language generation benchmark for Indonesian, Sundanese, and Javanese. We provide multiple downstream tasks, pre-trained IndoGPT and IndoBART models, and starter code! (EMNLP 2021)

    Python, 69 stars, Apache-2.0, 11 forks, 1 issue, 0 pull requests, updated Dec 3, 2022
  • indonlp.github.io Public
    SCSS, 1 star, Apache-2.0, 1 fork, 0 issues, 0 pull requests, updated Jun 12, 2022