Full-stack self-education NLP website for summarization, named entity recognition, etc.

aimanyounises1/NLP_WEB

Web Application with NLP:

Full-stack NLP project using Django as the backend and Django REST framework for the API, with a BERT model. The frontend is built with React.js in 3 languages (English, Hebrew, Arabic).

I assume you have heard of Hugging Face models based on transformer encoders and decoders:

The goal of this website is to collect datasets for Hebrew and Arabic so that we can do advanced NLU, such as summarization and sentiment analysis. The main problem is the lack of datasets for Arabic and Hebrew for tasks like summarization. For example, we can use BERT to summarize English text, but what about summarizing Hebrew and Arabic texts?
We can use Google Translate to translate the text from Hebrew to English, summarize it, and then translate the summary back. This approach works, but it is not quite good enough. The main idea of this site is to build a large dataset for Hebrew and Arabic from data submitted by clients: if the client rates a result as good, the result is stored in the database under the language chosen by the user. We can then train a BERT model on this dataset, with the aim of reaching good accuracy.
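The two ideas above (summarizing through English as a pivot language, and storing only client-approved results under the chosen language) can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the function names, the `samples` table, and the language codes are assumptions, and `translate` and `summarize` are hypothetical callables standing in for a translation service and a summarization model. The standard-library `sqlite3` module is used here only to keep the sketch self-contained.

```python
import sqlite3

# Assumed language codes for English, Hebrew and Arabic.
SUPPORTED_LANGUAGES = {"en", "he", "ar"}


def summarize_via_english(text, source_lang, translate, summarize):
    """Pivot approach: translate to English, summarize, translate back.

    `translate(text, src, dst)` and `summarize(text)` are hypothetical
    callables supplied by the caller (e.g. wrappers around real services).
    """
    if source_lang == "en":
        return summarize(text)
    english = translate(text, source_lang, "en")
    return translate(summarize(english), "en", source_lang)


def init_db(conn):
    """Create the (assumed) table that holds approved samples."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS samples (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               language TEXT NOT NULL,
               source_text TEXT NOT NULL,
               result_text TEXT NOT NULL
           )"""
    )


def store_if_approved(conn, language, source_text, result_text, approved):
    """Store a result under its language only if the client rated it good.

    Returns True when a row was written, False when the result was rejected.
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"unsupported language: {language}")
    if not approved:
        return False
    conn.execute(
        "INSERT INTO samples (language, source_text, result_text) VALUES (?, ?, ?)",
        (language, source_text, result_text),
    )
    conn.commit()
    return True
```

With a connection from `sqlite3.connect(":memory:")` and `init_db(conn)`, a call such as `store_if_approved(conn, "he", text, summary, approved=True)` adds one training pair for Hebrew, while rejected results are simply dropped.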

NLP WEB:

RunWEB.mov

Example of Arabic NER:

arab_ner

Example of English NER with Summarizer:

Eng_ner_sum

TODO list:

  • Initializing the user's profile.
  • Connecting all NLP models to the app.
  • Improving the design.
  • Adding Hebrew and Arabic languages.
  • Trying to migrate big models like GPT-2.
  • Adding Named Entity Recognition for Hebrew.
  • Adding BIO-NER in English, Arabic and Hebrew.
  • Trying to find a dataset for summarizing Hebrew and Arabic.
  • Adding a ChatBox for customers.
  • Improving the English NER dataset.
  • Adding mask completion for 3 languages.
  • Adding Firebase/Django SQLite3 for each language to collect data.
  • Adding a masked BERT model for 3 languages.
  • Adding a summarizer in Arabic and Hebrew using TF-IDF/bag-of-words for client evaluation.
  • Adding text generation with GPT-2 for Hebrew, Arabic and English.
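As a rough idea of what the TF-IDF summarizer item in the list above could look like, here is a minimal extractive sketch in plain Python. It is an assumption about one possible approach, not the project's implementation: each sentence is treated as a document, scored by the average TF-IDF weight of its words (with a smoothed IDF), and the top-scoring sentences are returned in their original order. The function names and the naive regex tokenizer are illustrative only; real Hebrew and Arabic text would likely need proper normalization and tokenization.

```python
import math
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence splitter; also handles the Arabic question mark."""
    parts = re.split(r"(?<=[.!?؟])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Naive tokenizer: \\w is Unicode-aware, so Hebrew/Arabic letters match."""
    return re.findall(r"\w+", sentence.lower())


def tfidf_summary(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in original order."""
    sentences = split_sentences(text)
    tokenized = [tokenize(s) for s in sentences]
    n = len(sentences)

    # Document frequency: in how many sentences each word appears.
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))

    scores = []
    for tokens in tokenized:
        if not tokens:
            scores.append(0.0)
            continue
        counts = Counter(tokens)
        # Sum of tf * idf, with tf normalized by sentence length
        # and a smoothed idf so the weight is always positive.
        score = sum(
            (count / len(tokens)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in counts.items()
        )
        scores.append(score)

    top = sorted(range(n), key=lambda i: scores[i], reverse=True)[:max_sentences]
    return [sentences[i] for i in sorted(top)]
```

Because the output consists of sentences copied from the input, a client can judge the result directly, which fits the evaluation step described earlier.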