- André Moreira Souza - N°USP: 9778985
- Josué Grâce Kabongo Kalala - N°USP: 9770382
This project aims to build a digital business card wallet as a mobile application. The application takes a picture of a business card, automatically extracts the information on the card, and saves it in the user's digital wallet. This process uses digital image processing techniques, such as (1) image denoising/deblurring and (2) image segmentation, together with natural language processing techniques for categorizing the extracted text (name, company, job position, email, location, ...). The application should be able to process and extract information from images taken from different perspectives and possibly containing undesired objects, such as in the examples below.
For this project, we will build a collection of images for testing purposes. Initially, the collected images are photos of business cards of companies in São Paulo, taken with cellphones. These will be used to test each step of the project.
[x] - Corner detection - Detect/approximate corners of the business card of the input image. The user should be able to make adjustments when the detection is not accurate.
[x] - Perspective transform - Some images may be taken from different perspectives. This stage's objective is to normalize the perspective, using the corner points found in the corner detection step, to make the later operations more reliable.
[x] - Text and character recognition - Recognize the characters present in the text of the business card, and build strings with the sequences of recognized characters.
[ ] - Text categorization for the Portuguese language - For each sentence, use NLP methods to assign it to one of the following categories: entity, phone number, email, location
[ ] - Logo detection
[ ] - Web scraping for more info about the extracted entities
[ ] - Text categorization for the English language
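The perspective transform step can be illustrated with a minimal sketch, assuming NumPy is available. It estimates the 3x3 homography that maps the four detected corners to an upright rectangle. The function names, corner coordinates, and target size below are hypothetical, and this is not necessarily how the project implements the step; OpenCV's `cv2.getPerspectiveTransform` and `cv2.warpPerspective` cover the same ground.

```python
import numpy as np

def homography_from_corners(src, dst):
    """Solve for the 3x3 matrix H that maps each src corner to its dst corner."""
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        # u = (h11*x + h12*y + h13) / (h31*x + h32*y + 1), and likewise for v
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows, dtype=float), np.array(rhs, dtype=float))
    return np.append(h, 1.0).reshape(3, 3)

def apply_homography(H, point):
    """Map one (x, y) point through H using homogeneous coordinates."""
    x, y = point
    u, v, w = H @ np.array([x, y, 1.0])
    return u / w, v / w

# Hypothetical corners of a card in a photo (clockwise from top-left)
corners = [(120, 80), (520, 110), (500, 380), (100, 340)]
# Hypothetical upright target rectangle, 450 x 250 pixels
target = [(0, 0), (450, 0), (450, 250), (0, 250)]

H = homography_from_corners(corners, target)
```

With `H` in hand, every pixel of the normalized card can be looked up in the original photo by applying the inverse matrix, which is what a warping routine does internally.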
String | Category |
---|---|
HOLBORN & MOORGATE ENGINEERING | Entity |
+1 801 566-1800 | Phone |
+1 801 566-1801 | Phone |
hme.com | URL |
2011 S 1100 E | Unknown (coordinates) |
Salt Lake City, UT 84106 | Location |
String | Category |
---|---|
lush | Entity |
LAWN + PROPERTY ENHANCEMENT | Entity |
JON | Entity |
248-343-5976 | Phone |
KAYLEN | Entity |
734-552-8728 | Phone |
6811 | Number |
CLINTOVILLE ROAD | Location |
CLARKSTON, MICHIGAN 48348 | Location |
WWW.LUSHMICHIGAN.COM | URL |
String | Category |
---|---|
Daniel Whitton | Entity |
Cell: | Entity |
817-228-6401 | Phone |
Email: | Entity |
[email protected] | |
Website: | Entity |
www.DFWCarpentry.com | URL |
Address: | Entity |
18011 Bruno Road | Location |
Justin, TX 76247 | Location |
As the examples of inputs and outputs have shown, the final application must be able to identify the text and categorize it based on the sentences found. The user must be able to edit the recognized strings and categories, when the results are not accurate.
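To make the categorization in the tables above concrete, here is a minimal rule-based sketch using only Python's standard library. It is a simple regular-expression baseline, not the NLP method planned for the project. The patterns, the `categorize` function, and the US-style "state plus ZIP code" location heuristic are all assumptions made for illustration.

```python
import re

# Illustrative patterns (assumptions, not the project's rules)
EMAIL = re.compile(r"^[\w.+-]+@[\w-]+(\.[\w-]+)+$")
URL = re.compile(r"^(https?://)?(www\.)?[\w-]+(\.[a-z]{2,})+(/\S*)?$", re.IGNORECASE)
PHONE_CHARS = re.compile(r"^\+?[\d\s().-]+$")
LOCATION = re.compile(r",\s*[A-Za-z ]+\s\d{5}(-\d{4})?$")  # e.g. ", UT 84106"
NUMBER = re.compile(r"^\d+$")

def categorize(line):
    """Return a coarse category for one recognized line of text."""
    text = line.strip()
    if EMAIL.match(text):
        return "Email"
    if URL.match(text):
        return "URL"
    # Treat a line as a phone number if it has only phone characters
    # and at least 7 digits.
    if PHONE_CHARS.match(text) and sum(ch.isdigit() for ch in text) >= 7:
        return "Phone"
    if LOCATION.search(text):
        return "Location"
    if NUMBER.match(text):
        return "Number"
    # Anything else falls back to the generic category.
    return "Entity"

for sample in ["HOLBORN & MOORGATE ENGINEERING", "+1 801 566-1800",
               "hme.com", "Salt Lake City, UT 84106"]:
    print(f"{sample} | {categorize(sample)}")
```

Rules like these cannot tell a person's name from a company name or recognize a street address without a postal code, which is where the NLP-based categorization would be needed.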
As of this commit (29/05/2019), we have built the image collection for testing and implemented corner detection methods using the Harris and Shi-Tomasi corner detectors.
For the corner detection, we wrote functions for converting RGB images to grayscale, for computing image derivatives with the Sobel operator, and for the corner detection itself. We used OpenCV's functions for denoising, and scikit-image's functions for filtering and thresholding.
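The three pieces described above (grayscale conversion, Sobel derivatives, and a corner response) can be sketched as follows, assuming NumPy. This is an illustrative version, not the project's code: the function names, the BT.601 luma weights, the 3x3 box window, and `k=0.05` are all assumptions.

```python
import numpy as np

def to_grayscale(rgb):
    """Weighted sum of the R, G, B channels (BT.601 luma weights, an assumption)."""
    return rgb[..., :3] @ np.array([0.299, 0.587, 0.114])

def filter3x3(image, kernel):
    """Correlate a 2-D image with a 3x3 kernel, replicating edge pixels."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=float)
    for i in range(3):
        for j in range(3):
            out += kernel[i, j] * padded[i:i + h, j:j + w]
    return out

SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=float)
BOX = np.ones((3, 3)) / 9.0  # simple averaging window

def harris_response(gray, k=0.05):
    """Harris score det(M) - k * trace(M)^2 for every pixel."""
    ix = filter3x3(gray, SOBEL_X)    # horizontal derivative
    iy = filter3x3(gray, SOBEL_X.T)  # vertical derivative
    sxx = filter3x3(ix * ix, BOX)
    syy = filter3x3(iy * iy, BOX)
    sxy = filter3x3(ix * iy, BOX)
    return sxx * syy - sxy ** 2 - k * (sxx + syy) ** 2

# Synthetic "card": a white square on a black background
card = np.zeros((20, 20, 3))
card[5:15, 5:15] = 1.0
response = harris_response(to_grayscale(card))
peak = np.unravel_index(np.argmax(response), response.shape)
print("strongest corner response at", peak)
```

The response is positive near the corners of the square, negative along its straight edges, and zero in flat regions. The Shi-Tomasi detector uses the same structure matrix but scores each pixel by its smaller eigenvalue instead.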
A demonstration of the project, with a detailed explanation and a discussion of the results, can be seen in this Python notebook.
The entire project can be executed using the shell program "shell.py".
The following programs and Python packages must be installed on the system for the shell.py script to run properly:
- Python packages:
- External programs:
The shell supports the following commands:
- `exit`: exit the shell.
- `samples`: show 9 business cards selected at random from the sample dataset.
- `selectcard <number>`: select a business card by giving a number between 1 and 9.
- `run`: analyze the selected business card and extract its information.
- `help`: show the usage help message.
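A command loop like the one described above could be built on Python's standard `cmd` module. The sketch below is hypothetical and is not taken from shell.py: the `CardShell` class, its attributes, and the placeholder `run` step are assumptions, and image display is left out.

```python
import cmd
import random

class CardShell(cmd.Cmd):
    """Hypothetical command loop mirroring the commands listed above."""
    prompt = "(cards) "

    def __init__(self, dataset, **kwargs):
        super().__init__(**kwargs)
        self.dataset = list(dataset)  # file names of the sample cards
        self.shown = []               # cards listed by the last 'samples'
        self.selected = None

    def do_samples(self, arg):
        """samples: list up to 9 cards picked at random from the dataset."""
        self.shown = random.sample(self.dataset, min(9, len(self.dataset)))
        for number, name in enumerate(self.shown, start=1):
            self.stdout.write(f"{number}: {name}\n")

    def do_selectcard(self, arg):
        """selectcard <number>: select one of the listed cards."""
        try:
            number = int(arg)
        except ValueError:
            self.stdout.write("usage: selectcard <number>\n")
            return
        if not 1 <= number <= len(self.shown):
            self.stdout.write("run 'samples' first and pick a listed number\n")
            return
        self.selected = self.shown[number - 1]
        self.stdout.write(f"selected {self.selected}\n")

    def do_run(self, arg):
        """run: process the selected card (placeholder for the real pipeline)."""
        if self.selected is None:
            self.stdout.write("no card selected\n")
            return
        self.stdout.write(f"processing {self.selected}\n")

    def do_exit(self, arg):
        """exit: leave the shell."""
        return True  # a true return value stops cmd.Cmd's loop
```

Calling `CardShell(file_names).cmdloop()` starts the interactive prompt. The `help` command comes for free from `cmd.Cmd`, which builds it from the method docstrings.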
Just run `python3 shell.py`.
Enter these commands:
- `samples`. This will show 9 images randomly selected from the sample dataset.
- `selectcard 4`. This will select the fourth card.
- `run`. This will run all the project steps.
- `exit`. This will exit the shell program.
Note that you have to close the window that shows the image before you can continue using the shell in the terminal.