An accent detection application built with Streamlit that analyzes English speech accents from audio files or video URLs. The app uses a fine-tuned Wav2Vec2 model to classify the accent and reports a confidence score for each prediction.
- Audio File Analysis: Upload audio files (WAV, MP3, FLAC, M4A, OGG) for accent detection
- Video URL Analysis: Extract audio from video URLs (YouTube and other platforms) and analyze accents
- Real-time Results: Get instant accent predictions with confidence scores
- Multiple Accent Support: Detects various English accents including American, British, Australian, and more
- Detailed Analytics: View probability distributions for all detected accents
- Clean Interface: Modern, user-friendly Streamlit interface
This application uses the HamzaSidhu786/speech-accent-detection model from Hugging Face: 🔗 Model Link: https://huggingface.co/HamzaSidhu786/speech-accent-detection
You can check it out in action here: 🔗 Space Link: https://huggingface.co/spaces/HeyChriss/accent-detector-ai
- Base Architecture: Fine-tuned Facebook Wav2Vec2-base model
- Training Dataset: CSTR-Edinburgh/VCTK dataset
- Accuracy: 99.55% on evaluation set
- Framework: PyTorch + Transformers
- Model Size: 94.6M parameters
- License: Apache 2.0
The model reaches a validation loss of 0.0441 on the evaluation set. It was trained for 10 epochs with mixed precision training.
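As a rough illustration of how a model like this can be queried, the sketch below wraps the Transformers `audio-classification` pipeline. It assumes the checkpoint loads through that pipeline and that ffmpeg is available for decoding file paths; `load_classifier` and `top_accents` are hypothetical helper names, not functions from this repository.

```python
from typing import Callable, Dict, List

MODEL_ID = "HamzaSidhu786/speech-accent-detection"  # model named in this README


def load_classifier():
    """Build an audio-classification pipeline (downloads weights on first call)."""
    # Imported lazily so top_accents() can be used without transformers installed.
    from transformers import pipeline

    return pipeline("audio-classification", model=MODEL_ID)


def top_accents(classifier: Callable, audio_path: str, top_k: int = 5) -> List[Dict]:
    """Return up to top_k predictions, sorted by descending score."""
    # The pipeline returns a list of {"label": ..., "score": ...} dicts.
    predictions = classifier(audio_path, top_k=top_k)
    return sorted(predictions, key=lambda p: p["score"], reverse=True)
```

With the real dependencies installed, `top_accents(load_classifier(), "sample.wav")` would return the highest-scoring labels first; `sample.wav` is a placeholder path.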
- Python 3.8+
- pip package manager
- Clone the repository:
  git clone https://github.com/heychriss/accent-detector-ai.git
  cd accent-detector-ai
- Install dependencies:
  pip install -r requirements.txt
- Run the application:
  streamlit run streamlit_app.py
- Open your browser and navigate to http://localhost:8501
- Click on the "📁 Upload Audio File" tab
- Choose an audio file from your device
- Click the "Analyze Accent" button
- View the results with confidence scores
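The upload steps above only accept the formats listed in the features (WAV, MP3, FLAC, M4A, OGG). A minimal sketch of that check follows; `is_supported_audio` is a hypothetical helper, and the app itself may simply rely on the `type` argument of Streamlit's `st.file_uploader` instead.

```python
from pathlib import Path

# Extensions taken from the supported formats listed in this README.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".flac", ".m4a", ".ogg"}


def is_supported_audio(filename: str) -> bool:
    """Return True if the file name ends in a supported audio extension."""
    # Lower-case the suffix so "CLIP.WAV" and "clip.wav" are treated alike.
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS
```

Checking the extension only guards against obviously wrong uploads; a file can still fail to decode later, so the analysis step should handle that error as well.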
- Click on the "🔗 Video URL" tab
- Paste a video URL (YouTube, etc.)
- Click the "Analyze Accent" button
- The app will download, extract audio, and analyze the accent
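The download and extraction step can be sketched with yt-dlp's Python API. This is an assumption about how `video_downloader.py` might work, not its actual code: `build_ydl_options` and `download_audio` are hypothetical names, and the `FFmpegExtractAudio` postprocessor requires ffmpeg to be installed.

```python
from pathlib import Path


def build_ydl_options(output_dir: str) -> dict:
    """Options asking yt-dlp for the best audio stream, converted to WAV."""
    return {
        "format": "bestaudio/best",
        "outtmpl": str(Path(output_dir) / "%(id)s.%(ext)s"),
        "noplaylist": True,  # analyze a single video, not a whole playlist
        "quiet": True,
        "postprocessors": [
            {"key": "FFmpegExtractAudio", "preferredcodec": "wav"}
        ],
    }


def download_audio(url: str, output_dir: str) -> Path:
    """Download one video's audio track and return the path of the WAV file."""
    # Imported lazily so the options helper works without yt-dlp installed.
    from yt_dlp import YoutubeDL

    with YoutubeDL(build_ydl_options(output_dir)) as ydl:
        info = ydl.extract_info(url, download=True)
    # After postprocessing, the file is named <video id>.wav.
    return Path(output_dir) / f"{info['id']}.wav"
```

The returned path can then be passed to the same analysis routine used for uploaded files.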
- Predicted Accent: The most likely accent detected
- Confidence: Percentage confidence in the prediction
- Probability Distribution: Shows all possible accents with their probabilities
- Source Information: Details about the analyzed file or URL
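To make the relationship between these fields concrete, here is a small sketch of how a classifier's raw scores (logits) are commonly turned into a predicted accent, a confidence percentage, and a probability distribution using softmax. The function names and labels are illustrative, and the app may compute these values differently.

```python
import math
from typing import Dict, List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def summarize(logits: List[float], labels: List[str]) -> Tuple[str, float, Dict[str, float]]:
    """Return (predicted label, confidence in percent, full distribution)."""
    distribution = dict(zip(labels, softmax(logits)))
    predicted = max(distribution, key=distribution.get)
    return predicted, distribution[predicted] * 100.0, distribution
```

The confidence is simply the largest entry of the distribution expressed as a percentage, which is why the two always agree.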
accent-detector-ai/
├── streamlit_app.py # Main Streamlit application
├── accent_detector.py # Core accent detection logic
├── model.py # Model loading and management
├── video_downloader.py # Video URL processing
├── logger.py # Logging configuration
├── requirements.txt # Python dependencies
├── test/ # Test files
│ ├── test_accent_detector.py
│ ├── test_model.py
│ └── test_video_downloader.py
├── .streamlit/ # Streamlit configuration
│ └── config.toml
└── README.md # This file
The application includes comprehensive tests that run automatically on startup:
- Test Coverage: Accent detector, model loading, video downloader
- Automated Testing: Tests run silently in background during app initialization
- Console Logging: Test results are logged to console for development monitoring
- Frontend: Streamlit for web interface
- ML Framework: PyTorch + Transformers
- Audio Processing: librosa for audio analysis
- Video Processing: yt-dlp for video URL handling
- Logging: Python logging with custom configuration
- AccentDetector: Main class handling accent prediction
- ModelManager: Handles model loading and caching
- VideoDownloader: Manages video URL processing and audio extraction
- Logger: Centralized logging system
- Model Caching: Models are cached using Streamlit's @st.cache_resource
- Efficient Processing: Optimized audio preprocessing and inference
- Memory Management: Proper cleanup of temporary files
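The temporary-file cleanup mentioned above can be sketched with the standard library's `tempfile.TemporaryDirectory`, which removes its contents even if analysis raises. `analyze_upload` and its `analyze` callback are hypothetical names used for illustration only.

```python
import tempfile
from pathlib import Path
from typing import Callable


def analyze_upload(data: bytes, suffix: str, analyze: Callable[[str], dict]) -> dict:
    """Write uploaded bytes to a temp file, analyze it, then clean up."""
    # The directory and everything in it are deleted when the block exits,
    # whether analyze() returns normally or raises.
    with tempfile.TemporaryDirectory() as workdir:
        audio_path = Path(workdir) / f"upload{suffix}"
        audio_path.write_bytes(data)
        return analyze(str(audio_path))
```

Because the file is gone once the function returns, the callback must finish reading the audio before it returns its result.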
The model can detect various English accents including but not limited to:
- American English
- British English
- Australian English
- Canadian English
- Irish English
- Scottish English
- And more regional variations
For the complete accent list and detailed model performance, visit the model page.
Contributions are welcome! Please feel free to submit a Pull Request. For major changes, please open an issue first to discuss what you would like to change.
When run locally, this version works correctly and has no issues with YouTube cookies. In the Hugging Face Space, however, downloads from YouTube may fail with cookie-related errors because there is no browser in the Space environment. I am working on a way to supply cookies manually in the Space.
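One possible direction for the manual-cookie workaround is yt-dlp's `cookiefile` option, which reads cookies from a Netscape-format text file. The sketch below is an assumption about how that could be wired in, not the author's planned fix; `with_cookies` is a hypothetical helper.

```python
from pathlib import Path
from typing import Optional


def with_cookies(options: dict, cookie_path: Optional[str]) -> dict:
    """Return a copy of yt-dlp options, adding a cookie file if one exists."""
    merged = dict(options)  # copy so the caller's dict is left unchanged
    if cookie_path and Path(cookie_path).is_file():
        # "cookiefile" is the YoutubeDL option for a cookies text file.
        merged["cookiefile"] = cookie_path
    return merged
```

If the file is missing, the options are returned unchanged, so the same code path works both locally and in a hosted environment.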
This project is licensed under the MIT License - see the LICENSE file for details.
- Model Creator: HamzaSidhu786 for the excellent accent detection model
- Base Model: Facebook's Wav2Vec2 team for the foundational architecture
- Dataset: CSTR-Edinburgh for the VCTK dataset
- Community: Hugging Face community for model hosting and tools