Orginal Authors main focus was to program a light-weight library to determine a identical word for same type of inflected word as stem to run on any model. Currently, generated stems efficacy for Data analysis and NLP tasks is yet to be determined.
- using git
git clone https://github.com/SaifurOWL/Bangla-stemmer-Usable-Variant-for-self.git
cd Bangla-stemmer-Usable-Variant-for-self
python setup.py install
- for colab (cloning to colabs own drive)
!git clone https://github.com/SaifurOWL/Bangla-stemmer-Usable-Variant-for-self.git
%cd /content/Bangla-stemmer-Usable-Variant-for-self
!python setup.py install
Example 01:
from bangla_stemmer.stemmer import stemmer wordlist = ['কবিরগুলিকে', 'আমাকে', 'নামাবার'] stmr = stemmer.BanglaStemmer() stm = stmr.stem(word) print(stm)
output: ['কবির', 'আমা', 'নামা']
Example 01:
from bangla_stemmer.stemmer.stemmer import BanglaStemmer word = 'কবিরগুলিকে' stm = BanglaStemmer().stem(word) print(stm)
output: ['কবির', 'আমা', 'নামা']
The grammar rules applied here are based on the algorithm of Rafi Kamal. However some major and required modification has been made in this library in order to make it compatible with any model.
Reference: ⮑ BanglaKit