📝 Persian Text Processing with Parsivar

This project demonstrates how to process Persian (Farsi) text using the Parsivar NLP library. It includes text normalization, tokenization, stemming, and spell checking, with additional tools to handle proper display of Persian characters.

🔍 Features

✅ Normalization – Cleans and standardizes Persian text.
✅ Tokenization – Splits text into sentences and words.
✅ Stemming – Converts words to their root forms.
✅ Spell Checking – Detects and corrects misspellings in Persian.
✅ Display Support – Uses arabic_reshaper and python-bidi to fix RTL display issues.

🧰 Libraries Used

parsivar – NLP tools for Persian.
arabic_reshaper – For reshaping characters to correct forms.
python-bidi – Ensures proper display of RTL scripts like Persian.
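
As a rough illustration of how the two display libraries are usually combined, the sketch below reshapes the text and then reorders it for right-to-left output. It assumes the standard `arabic_reshaper.reshape` and `bidi.algorithm.get_display` entry points; the helper name `to_display` and its `reshape`/`reorder` parameters are hypothetical and not taken from this project's code.

```python
def to_display(text, reshape=None, reorder=None):
    """Return `text` prepared for a terminal without native RTL support.

    `reshape` and `reorder` default to the arabic_reshaper and python-bidi
    functions. They are parameters only so the helper can be tried with
    stand-in functions (an assumption of this sketch, not a project API).
    """
    if reshape is None:
        import arabic_reshaper  # third-party: pip install arabic_reshaper
        reshape = arabic_reshaper.reshape
    if reorder is None:
        from bidi.algorithm import get_display  # third-party: pip install python-bidi
        reorder = get_display
    # Reshape first (join letters into their contextual forms),
    # then reorder the characters for right-to-left display.
    return reorder(reshape(text))


# Example call (needs both libraries installed):
# print(to_display("سلام دنیا"))
```

Reshaping before reordering matters: the reshaper picks letter forms based on logical neighbours, which the bidi step would otherwise shuffle.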

📌 How It Works

Read Persian text from a .txt file.
Normalize the text using Parsivar.
Tokenize the normalized text into words and sentences.
Apply stemming to get root forms of words.
Use spell correction on custom input.
Display reshaped output for better readability in terminals.
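
The steps above can be sketched roughly as follows. This is a minimal outline, not the project's actual script: the function names `load_parsivar_tools`, `process_text`, and `process_file` are hypothetical, while the Parsivar calls (`Normalizer.normalize`, `Tokenizer.tokenize_sentences`, `Tokenizer.tokenize_words`, `FindStems.convert_to_stem`) follow Parsivar's documented usage. Spell correction is left out here because it needs extra data files.

```python
from pathlib import Path


def load_parsivar_tools():
    """Build the Parsivar objects used below (requires `pip install parsivar`)."""
    from parsivar import FindStems, Normalizer, Tokenizer
    return Normalizer(), Tokenizer(), FindStems()


def process_text(text, normalizer, tokenizer, stemmer):
    """Normalize, tokenize, and stem `text`; return every intermediate result."""
    normalized = normalizer.normalize(text)
    sentences = tokenizer.tokenize_sentences(normalized)
    words = tokenizer.tokenize_words(normalized)
    stems = [stemmer.convert_to_stem(word) for word in words]
    return {
        "normalized": normalized,
        "sentences": sentences,
        "words": words,
        "stems": stems,
    }


def process_file(path, encoding="utf-8"):
    """Read a Persian .txt file and run it through the pipeline."""
    text = Path(path).read_text(encoding=encoding)
    return process_text(text, *load_parsivar_tools())


# Example call (needs parsivar installed and a UTF-8 text file):
# result = process_file("PersianText.txt")
# print(result["stems"])
```

Passing the tools into `process_text` keeps the Parsivar objects reusable across many inputs, since constructing them can be slow.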

🚀 Usage

Install dependencies:

pip install parsivar arabic_reshaper python-bidi

Or install from the repository's requirement file:

pip install -r requirement.txt

In the code, the text is first normalized, then tokenized, and then stemmed. Spell checking comes last and needs two extra data files, which you must download and place in the path below.

First, create a spell folder in this path:
venv\Lib\site-packages\parsivar\resource

Then place these two files in the spell folder (replacing any existing copies):
- onegram.pckl
- mybigram_lm.pckl
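
The manual copy can also be scripted. The sketch below is one possible approach, assuming the two files have already been downloaded into a local folder. The helper names `parsivar_resource_dir` and `install_spell_files` are hypothetical, and the resource location is derived from wherever `parsivar` is installed rather than hard-coding the `venv\Lib\site-packages` path.

```python
import importlib.util
import shutil
from pathlib import Path

# The two data files named in this README.
SPELL_FILES = ("onegram.pckl", "mybigram_lm.pckl")


def parsivar_resource_dir():
    """Locate parsivar's `resource` folder in the active environment."""
    spec = importlib.util.find_spec("parsivar")
    if spec is None or spec.origin is None:
        raise RuntimeError("parsivar is not installed in this environment")
    return Path(spec.origin).parent / "resource"


def install_spell_files(download_dir, resource_dir):
    """Create `resource_dir/spell` and copy both spell files into it."""
    spell_dir = Path(resource_dir) / "spell"
    spell_dir.mkdir(parents=True, exist_ok=True)
    for name in SPELL_FILES:
        # copy2 overwrites an existing file of the same name.
        shutil.copy2(Path(download_dir) / name, spell_dir / name)
    return spell_dir


# Example call (the "downloads" folder name is a placeholder):
# install_spell_files("downloads", parsivar_resource_dir())
```

Once the files are in place, Parsivar's `SpellCheck().spell_corrector(text)` should be able to load them.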

🔽 Download the two files from here

🎥 Preview

📳 Technology

python
nltk
parsivar
bidi
arabic_reshaper

Repository: farhad-here/Persian_Text_Processing
