An individual university project: a real-time American Sign Language (ASL) recognition system built in Python using Convolutional Neural Networks (CNNs) and live webcam video processing.
- Data Collection — custom script captures hand gesture images via webcam
- Data Augmentation — augments training data to improve generalisation
- Model Training — trains multiple CNN architectures and selects the best
- Ensemble Inference — combines predictions from multiple models in real time
- Live Translation — processes webcam feed and overlays predicted sign letter
| Stage | File |
|---|---|
| Data collection | `collect_training_data.py` |
| Augmentation | `data_augmentation.py` |
| Dataset loading | `dataset_loader.py` |
| Model definition | `model.py` / `enhanced_model.py` |
| Training | `train_cnn.py` / `train_enhanced.py` |
| Fine-tuning | `fine_tune_model.py` |
| Ensemble training | `ensemble_train.py` |
| Live inference | `ensemble_inference.py` |
| Entry point | `main.py` |
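Of the stages above, augmentation expands the captured dataset with label-preserving transforms. The transforms in `data_augmentation.py` may differ; this is a minimal NumPy sketch of two typical ones (horizontal flip for left/right handedness, brightness jitter) on images scaled to [0, 1]:

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img):
    """Return a randomly perturbed copy of an HxWx3 image in [0, 1]."""
    out = img.copy()
    if rng.random() < 0.5:           # horizontal flip (left/right hand)
        out = out[:, ::-1, :]
    gain = rng.uniform(0.8, 1.2)     # brightness jitter
    return np.clip(out * gain, 0.0, 1.0)

batch = rng.random((4, 64, 64, 3))
augmented = np.stack([augment(im) for im in batch])
```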
- EfficientNet — primary high-accuracy model
- MobileNetV3 — lightweight, optimised for real-time performance
- ResNet — used in ensemble for robustness
- Ensemble — combines all three for final prediction
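A common way to combine several networks is soft voting: average each model's per-class probabilities, then take the argmax. A minimal sketch, assuming each model exposes a Keras-style `predict` (the exact combination rule in `ensemble_inference.py` may differ); the stub classes below stand in for the three trained networks:

```python
import numpy as np

def ensemble_predict(models, x):
    """Average the per-class probabilities of several models (soft voting).

    Returns (winning class index, averaged probability array).
    """
    probs = np.mean([m.predict(x) for m in models], axis=0)
    return int(np.argmax(probs, axis=-1)[0]), probs

# Hypothetical stubs standing in for EfficientNet / MobileNetV3 / ResNet:
class Stub:
    def __init__(self, p):
        self.p = np.asarray(p, dtype="float32")
    def predict(self, x):
        return self.p[None, :]  # fixed softmax output, shape (1, n_classes)

models = [Stub([0.7, 0.2, 0.1]), Stub([0.4, 0.5, 0.1]), Stub([0.6, 0.3, 0.1])]
idx, probs = ensemble_predict(models, np.zeros((1, 64, 64, 3)))
# class 0 wins after averaging, even though one model preferred class 1
```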
Training history and confusion matrices are included in the repo as .png files.
Python · TensorFlow/Keras · OpenCV · NumPy · CNN · Transfer Learning
- Python 3.9+
- Webcam
```bash
git clone https://github.com/Mololola/DeepSign_CNN_2.git
cd DeepSign_CNN_2
pip install tensorflow opencv-python numpy
python main.py
```

The model was trained and optimised to run on a standard laptop CPU; architecture choices reflect the hardware constraints of the development environment.