Welcome to the Real-time Sign Language Recognition project! This project aims to recognize American Sign Language (ASL) alphabets in real-time using computer vision techniques and machine learning. Below are the details on how to run the project locally and an overview of its components:
The dataset used for training and testing the model is available on Kaggle at the following link: ASL Alphabet Dataset. It contains images of ASL alphabet signs.
If you wish to collect your own data for training, you can use the `collect_imgs.py` script. It captures images from your webcam and saves them in a specified directory, collecting 100 images per class for up to 29 classes.
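For orientation, here is a minimal sketch of what such a collection loop can look like. The directory name, key binding, and constants below are assumptions for illustration, not necessarily what `collect_imgs.py` actually uses:

```python
# Sketch of a per-class webcam capture loop (assumed names and constants).
import os
import cv2

DATA_DIR = './data'       # assumed output directory
NUM_CLASSES = 29          # one folder per ASL class
SAMPLES_PER_CLASS = 100   # images captured for each class

cap = cv2.VideoCapture(0)  # open the default webcam
for class_id in range(NUM_CLASSES):
    class_dir = os.path.join(DATA_DIR, str(class_id))
    os.makedirs(class_dir, exist_ok=True)
    print(f'Collecting data for class {class_id} -- press "q" to start')
    while True:  # wait until the user is ready for this class
        _, frame = cap.read()
        cv2.imshow('frame', frame)
        if cv2.waitKey(25) & 0xFF == ord('q'):
            break
    for i in range(SAMPLES_PER_CLASS):  # capture the samples
        _, frame = cap.read()
        cv2.imshow('frame', frame)
        cv2.waitKey(25)
        cv2.imwrite(os.path.join(class_dir, f'{i}.jpg'), frame)

cap.release()
cv2.destroyAllWindows()
```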
To convert the images into numeric data, run the `create_dataset.py` script. It uses the Mediapipe framework to extract hand landmarks from the images; the landmarks are saved in pickle format and used as features for training the model.
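A minimal sketch of this extraction step, assuming one sub-folder of images per class and a `data.pickle` output file (these names are assumptions; the actual details live in `create_dataset.py`):

```python
# Sketch: extract Mediapipe hand landmarks from collected images.
import os
import pickle
import cv2
import mediapipe as mp

DATA_DIR = './data'  # assumed: one sub-folder of images per class
hands = mp.solutions.hands.Hands(static_image_mode=True, min_detection_confidence=0.3)

data, labels = [], []
for class_dir in os.listdir(DATA_DIR):
    for img_name in os.listdir(os.path.join(DATA_DIR, class_dir)):
        img = cv2.imread(os.path.join(DATA_DIR, class_dir, img_name))
        results = hands.process(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        if not results.multi_hand_landmarks:
            continue  # skip images where no hand was detected
        lm = results.multi_hand_landmarks[0].landmark
        # 21 landmarks -> 42 features (x, y per landmark)
        data.append([coord for p in lm for coord in (p.x, p.y)])
        labels.append(class_dir)

with open('data.pickle', 'wb') as f:
    pickle.dump({'data': data, 'labels': labels}, f)
```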
Before proceeding with model training, I conducted a comparative evaluation of several classifiers to determine their performance on the dataset; see `evaluate.ipynb`:
- Data Preparation: The dataset was split into features (`X`) and target labels (`y`), with an 80/20 train-test split.
- Classifiers: The following classifiers were evaluated:
- Logistic Regression: Configured for multi-class classification using the 'lbfgs' solver.
- Random Forest: An ensemble method utilizing multiple decision trees.
- Support Vector Machine: Employed with probability estimates.
- K-Nearest Neighbors: A non-parametric method that classifies based on nearest neighbors.
- Decision Tree: A model that makes decisions based on feature splits.
- Metrics: For each classifier, I computed the following evaluation metrics:
- Accuracy: The proportion of true results among the total number of cases examined.
- Precision: The ratio of correctly predicted positive observations to the total predicted positives.
- Recall: The ratio of correctly predicted positive observations to all actual positives.
- F1 Score: The harmonic mean of Precision and Recall.
- AUC ROC: The area under the receiver operating characteristic curve.
The results for each classifier are summarized below.
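For reference, here is a minimal sketch of how such a comparison can be scripted with Scikit-learn, assuming the pickled landmark features produced by `create_dataset.py` (file names are assumptions; `evaluate.ipynb` is the authoritative version):

```python
# Sketch of the classifier comparison on the pickled landmark features.
import pickle
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import (accuracy_score, precision_score, recall_score,
                             f1_score, roc_auc_score)

dataset = pickle.load(open('data.pickle', 'rb'))  # assumed output of create_dataset.py
X, y = np.asarray(dataset['data']), np.asarray(dataset['labels'])
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)  # 80/20 split

classifiers = {
    'Logistic Regression': LogisticRegression(solver='lbfgs', max_iter=1000),
    'Random Forest': RandomForestClassifier(),
    'SVM': SVC(probability=True),
    'KNN': KNeighborsClassifier(),
    'Decision Tree': DecisionTreeClassifier(),
}

for name, clf in classifiers.items():
    clf.fit(X_train, y_train)
    y_pred = clf.predict(X_test)
    y_proba = clf.predict_proba(X_test)
    print(name,
          'acc:', accuracy_score(y_test, y_pred),
          'prec:', precision_score(y_test, y_pred, average='weighted'),
          'rec:', recall_score(y_test, y_pred, average='weighted'),
          'f1:', f1_score(y_test, y_pred, average='weighted'),
          'auc:', roc_auc_score(y_test, y_proba, multi_class='ovr'))
```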
For predicting alphabet classes, a Random Forest model from the Scikit-learn library is used. The `train_classifier.py` script loads the numeric data, trains the Random Forest model, and saves it in pickle format for later use.
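A minimal sketch of this training step, assuming the input pickle is named `data.pickle` and the model is saved as `model.p` (file names are assumptions; see `train_classifier.py` for the actual script):

```python
# Sketch: train a Random Forest on the landmark features and pickle it.
import pickle
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

dataset = pickle.load(open('data.pickle', 'rb'))
X, y = np.asarray(dataset['data']), np.asarray(dataset['labels'])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, stratify=y)

model = RandomForestClassifier()
model.fit(X_train, y_train)
print('test accuracy:', accuracy_score(y_test, model.predict(X_test)))

with open('model.p', 'wb') as f:
    pickle.dump({'model': model}, f)  # saved for the real-time script to load
```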
To demonstrate real-time recognition, the project uses OpenCV and Mediapipe. OpenCV is used for accessing the webcam and displaying video streams, while Mediapipe is utilized for detecting hand landmarks in real-time. The trained Random Forest model predicts the alphabet based on the detected hand landmarks, and the results are displayed on the screen.
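The real-time loop can be sketched roughly as follows, assuming the trained model was saved as `model.p` and that the raw landmark (x, y) coordinates are used as features (assumed names; see `sign_detection.py` for the actual implementation):

```python
# Sketch: webcam capture -> Mediapipe landmarks -> Random Forest prediction.
import pickle
import cv2
import mediapipe as mp
import numpy as np

model = pickle.load(open('model.p', 'rb'))['model']
hands = mp.solutions.hands.Hands(static_image_mode=False, min_detection_confidence=0.3)

cap = cv2.VideoCapture(0)
while True:
    ok, frame = cap.read()
    if not ok:
        break
    results = hands.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
    if results.multi_hand_landmarks:
        lm = results.multi_hand_landmarks[0].landmark
        features = np.array([c for p in lm for c in (p.x, p.y)]).reshape(1, -1)
        prediction = model.predict(features)[0]  # predicted class label
        cv2.putText(frame, str(prediction), (30, 50),
                    cv2.FONT_HERSHEY_SIMPLEX, 1.5, (0, 255, 0), 3)
    cv2.imshow('Sign detection', frame)
    if cv2.waitKey(1) & 0xFF == ord('q'):  # quit on 'q'
        break

cap.release()
cv2.destroyAllWindows()
```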
To run the Real-time Sign Language Recognition system locally using your dataset and model weights, follow these steps:
- Clone the Repository:
git clone https://github.com/adiren7/Real_time_sign_language_recognition.git
cd Real_time_sign_language_recognition
- Install Dependencies:
pip install -r requirements.txt
- Run the Application:
python sign_detection.py
Here's an example of how to use the Real-time Sign Language Recognition system:
- Run the `sign_detection.py` script.
- Position your hand in front of the webcam.
- The system will detect your hand landmarks and predict the corresponding ASL alphabet.
- The predicted alphabet will be displayed on the screen in real-time.
Contributions to this project are welcome! If you have any ideas for improvements or feature suggestions, feel free to open an issue or submit a pull request.
- Special thanks to Kaggle user grassknoted for providing the ASL Alphabet Dataset.