SCENE-TEXT-RECOGNITION

ABSTRACT

The field of scene text recognition has become increasingly popular in recent years, as recognizing text in natural scenes presents a significant challenge compared to scanned documents. This paper proposes a novel technique that utilizes first-order Histogram of Oriented Gradient (HOG) through a spatial pyramid,EAST,EASY OCR to address the issue of text recognition in photos of natural scenes and on the internet.Further More,Deep learning-based approaches, such as convolutional neural networks (CNNs) and recurrent neural networks (RNNs), have shown promising results in scene text recognition, and have been combined with various techniques to handle specific challenges, such as attention mechanisms, geometric transformations, and multi-scale processing. The importance of scene text recognition is further underscored by its applications in various domains, including autonomous driving, content-based image retrieval, and cultural heritage preservation.In This paper we have reviewed recent advances in scene text recognition, with a particular focus on deep learning-based methods and their applications.

Optical Character Recognition

Finding Text In the Scene Images In other words, OCR software converts a two-dimensional image of text—which could be handwritten or machine printed—into language that a computer can understand. To accomplish OCR as a procedure as accurately as feasible, it typically consists of multiple sub-processes. The subsequent steps are:

1.Preprocessing of the Image
2.Text Localization
3.Character Segmentation
4.Character Recognition
5.Post Processing

Tesseract OCR

Tesseract is an Apache 2.0-licensed open source text recognition (OCR) engine. For programmers, it can be applied directly or through the use of an API to extract written text from photos. Many different languages are supported. Tesseract doesn't come with a built-in graphical user interface, however there are plenty on the 3rdParty website. Through the use of wrappers, which may be obtained here, Tesseract is compatible with a wide range of programming languages and frameworks. It can be used in conjunction with the current layout analysis to find text within a large document or with an outside text detector to find text.

Methods Implemented

EAST
EASY OCR
FOTS

1.EAST

EAST uses a fully convolutional neural network (FCN) to detect text regions in natural scene images. The network is designed to predict the score map and the geometry map of the text regions simultaneously.The score map indicates the likelihood of a pixel being part of a text region, while the geometry map provides information about the shape and orientation of the text region.

OUTPUT OF EAST METHOD

2.EASY OCR

Easy OCR is based on deep learning algorithms and pre-trained models that can recognize text in various languages, fonts, and styles.EasyOCR works by first performing image preprocessing to enhance the image quality, such as de-skewing, binarization, and noise reduction.Then, the image is fed into a pre-trained neural network that is capable of recognizing text from the image. The neural network outputs the recognized text along with a confidence score for each character.EasyOCR uses a combination of convolutional neural networks (CNN) and recurrent neural networks (RNN) to recognize text from images.

OUTPUT OF EASY OCR METHOD

3.FOTS

FOTS is an end-to-end trainable framework that detects and recognizes all words in a natural scene image simultaneously. For FOTS we have used ICDAR15 dataset.The overall architecture of FOTS consists of four parts: 1.Shared Convolutions 2.Text Detection Branch 3.RoI Rotate 4.Text Recognition Branch

OUTPUT OF FOTS METHOD

Steps to run the project files

Select the Scene text recognition Model as per the required Applications
Download the pretrained model weights available online and use them in the code.
Use "EASY_OCR_EAST.ipynb" to select the EAST or EASY_OCR model and input image to this model.
For FOTS first use "FOTS_Data_Analysis.ipynb" to preprocess the data set
Next use "FOTS_Text_Detection_Modelling.ipynb" for finding text in the image.Finally use "FOTS_End_to_End_Modelling.ipnb" to extract text.

Important Links

Report Link: https://drive.google.com/file/d/1eHwcYX24QPWXkDgfgwVKEw2Qs5F2spEG/view?usp=share_link Youtube Link:https://youtu.be/ThntCvbJrYU
Custom Data set:https://drive.google.com/drive/folders/1WPx8VuB0pWuWwILQnr9Y44RzeJ56g0Lk?usp=share_link

Dependencies

1.OpenCV
2.Numpy
3.Matplotlib
4.pytesseract
5.easy ocr

Name		Name	Last commit message	Last commit date
Latest commit History 31 Commits
EASY_OCR_EAST.ipynb		EASY_OCR_EAST.ipynb
FOTS_Data_Analysis.ipynb		FOTS_Data_Analysis.ipynb
FOTS_End_to_End_Modelling.ipynb		FOTS_End_to_End_Modelling.ipynb
FOTS_Text_Detection_Modelling.ipynb		FOTS_Text_Detection_Modelling.ipynb
README.md		README.md
SCENE_TEXT_RECOGNITION.pdf		SCENE_TEXT_RECOGNITION.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

SCENE-TEXT-RECOGNITION

ABSTRACT

Optical Character Recognition

Tesseract OCR

Methods Implemented