Spell Correction using Language Model (LM)

This experiment uses n-gram language models trained on the news genre of the Brown corpus to find the correct spelling of misspelled words in the Birkbeck corpus. For each n in {1, 2, 3, 5, 10}, the average success at k (s@k) is calculated for k in {1, 5, 10}.
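To make the metric concrete, here is a minimal sketch of how average success at k could be computed. It assumes each misspelled word comes with a ranked list of candidate corrections; the function names and the toy data are hypothetical and are not taken from the utils folder.

```python
from typing import Sequence


def success_at_k(ranked: Sequence[str], correct: str, k: int) -> int:
    """Return 1 if the correct word is among the top-k candidates, else 0."""
    return int(correct in ranked[:k])


def average_success_at_k(
    predictions: Sequence[Sequence[str]], targets: Sequence[str], k: int
) -> float:
    """Mean of success_at_k over all misspelled-word entries."""
    if len(predictions) != len(targets):
        raise ValueError("predictions and targets must have the same length")
    if not targets:
        return 0.0
    hits = sum(success_at_k(r, c, k) for r, c in zip(predictions, targets))
    return hits / len(targets)


# Hypothetical toy data: one ranked candidate list per misspelled word.
predictions = [
    ["their", "there", "three"],
    ["recieve", "relieve", "receive"],
    ["apple", "ample", "amply"],
]
targets = ["there", "receive", "maple"]

for k in (1, 5, 10):
    print(f"s@{k} = {average_success_at_k(predictions, targets, k):.3f}")
```

A larger k can only keep or raise the score, since the top-k list for a larger k always contains the list for a smaller k.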

Keywords: Spell correction, Language Model, Corpus, n-Gram, Probability, Natural Language Processing.

The Data

The APPLING1DAT.643 file from the Birkbeck spelling error corpus, compiled by Roger Mitton, was used for this experiment. It contains 198 entries in total, each giving a misspelled word and its correct spelling.

The news genre of the Brown corpus, which is used to train the models, contains 100,554 words and 4,623 sentences.
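As an illustration of how an n-gram model trained on such sentences can rank candidate words, the sketch below counts n-grams and ranks words by their maximum-likelihood probability given the preceding n-1 words. It is an assumption about the general approach, not the repository's code: the function names and toy sentences are hypothetical, and a practical model would also need smoothing for unseen contexts.

```python
from collections import Counter, defaultdict


def train_ngram_counts(sentences, n):
    """Count how often each word follows each (n-1)-word context."""
    counts = defaultdict(Counter)
    for sent in sentences:
        # Pad the start so the first real word also has a full context.
        tokens = ["<s>"] * (n - 1) + [w.lower() for w in sent] + ["</s>"]
        for i in range(n - 1, len(tokens)):
            context = tuple(tokens[i - n + 1:i])
            counts[context][tokens[i]] += 1
    return counts


def top_k_candidates(counts, context, n, k):
    """Rank words by maximum-likelihood P(word | last n-1 context words)."""
    padded = ["<s>"] * (n - 1) + [w.lower() for w in context]
    key = tuple(padded[len(padded) - (n - 1):])
    following = counts.get(key, Counter())
    total = sum(following.values())
    return [(word, c / total) for word, c in following.most_common(k)]


# Hypothetical toy sentences standing in for the tokenized training corpus.
sentences = [
    ["The", "jury", "said", "it"],
    ["The", "jury", "praised", "the", "city"],
    ["A", "city", "said", "it"],
]

bigram_counts = train_ngram_counts(sentences, n=2)
print(top_k_candidates(bigram_counts, ["the"], n=2, k=2))
```

With n=1 the context is empty, so the ranking reduces to overall word frequency. Larger n gives sharper predictions but leaves many contexts unseen in a corpus of this size.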

Requirements

The modules and libraries used in this project are listed in the requirements.txt file. To install them, run the command below.

pip install -r requirements.txt

Structure

Data: contains the Birkbeck corpus file used for this project.
utils: contains the essential functions for this project.
models: contains the trained models.
Assignment_#2.ipynb and Assignment_#2.py: the Python notebook and script that use the functions in the utils folder to generate the results.

Contact

Glory Odeyemi is currently pursuing a Master's degree in Computer Science (Artificial Intelligence specialization) at the University of Windsor, Windsor, ON, Canada. You can connect with her on LinkedIn.

gloryodeyemi/COMP_8730_Assignment2
