This project demonstrates real-time object detection and distance estimation using YOLOv3, OpenCV, and Python. The application can detect objects in a video stream (e.g., from a webcam) and estimate their distance from the camera. It also uses text-to-speech functionality to announce the detected objects and their distances.
- Python 3.x
- OpenCV
- NumPy
- PyTesseract
- pyttsx3
- YOLOv3 weights and configuration files
- coco.names (class names for YOLOv3)
Install the required packages:
pip install opencv-python numpy pytesseract pyttsx3
Download the YOLOv3 weights and configuration files from the official YOLO website and place them in the project directory.
Download the coco.names file from the official YOLO repository and place it in the project directory.
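Before running the application, it can help to confirm that the downloaded files are where the script expects them. The sketch below is only an illustration: the names `yolov3.weights` and `yolov3.cfg` are assumed (they are the usual Darknet release names, not names confirmed by this project), and `missing_model_files` is a hypothetical helper.

```python
from pathlib import Path

# Assumed file names: the usual Darknet release names plus coco.names.
# Adjust them to match the files you actually downloaded.
REQUIRED_FILES = ("yolov3.weights", "yolov3.cfg", "coco.names")


def missing_model_files(project_dir=".", required=REQUIRED_FILES):
    """Return the required file names that are not present in project_dir."""
    base = Path(project_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files found.")
```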
Run the detection script:
python main.py
Press 'q' to exit the application.
The code consists of the following main components:
- Video capture: Captures video frames from a webcam using OpenCV.
- Object detection: Detects objects in the video frames using YOLOv3 and OpenCV.
- Distance estimation: Estimates each detected object's distance from the camera from the width of its bounding box in pixels, the camera's focal length, and the object's known real-world width.
- Text-to-speech: Announces the detected objects and their distances using the pyttsx3 library.
- Non-Maximum Suppression (NMS): Applies NMS to discard overlapping bounding boxes for the same object, keeping the highest-confidence box so that each object is reported once.
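To illustrate the NMS step, here is a minimal plain-Python sketch of greedy suppression. It is not the project's code: in practice OpenCV's `cv2.dnn.NMSBoxes` does this job. The function names and the default thresholds (0.5 and 0.4) are assumptions chosen for the example. Boxes are given as `(x, y, w, h)`.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x, y, w, h) boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = min(ax + aw, bx + bw) - max(ax, bx)
    inter_h = min(ay + ah, by + bh) - max(ay, by)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0  # the boxes do not overlap
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union


def non_max_suppression(boxes, scores, score_threshold=0.5, iou_threshold=0.4):
    """Return the indices of the boxes to keep, highest score first.

    The threshold defaults are illustrative assumptions, not project values.
    """
    # Drop low-confidence detections, then visit the rest best-first.
    candidates = [i for i, s in enumerate(scores) if s >= score_threshold]
    candidates.sort(key=lambda i: scores[i], reverse=True)

    kept = []
    for i in candidates:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in kept):
            kept.append(i)
    return kept


if __name__ == "__main__":
    boxes = [(100, 100, 50, 50), (105, 105, 50, 50), (300, 300, 40, 40)]
    scores = [0.9, 0.8, 0.7]
    print(non_max_suppression(boxes, scores))
```

In the example, the second box overlaps the first heavily and has a lower score, so only the first and third boxes survive.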
You can configure the following parameters in the code:
- known_width: The real-world width of the object being detected (in centimeters).
- focal_length: The focal length of the camera (experimentally determined or obtained from camera specifications).
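The two parameters above fit the common pinhole-camera approximation, where distance = (known_width × focal_length) / width in pixels. The sketch below assumes the project uses this relation and that `focal_length` is expressed in pixels. The function names and the sample numbers are hypothetical. It also shows one way to determine the focal length experimentally, from a reference image of the object taken at a measured distance.

```python
def calibrate_focal_length(ref_distance_cm, known_width_cm, ref_pixel_width):
    """Estimate the focal length (in pixels) from one reference measurement.

    ref_distance_cm: measured camera-to-object distance in the reference image
    known_width_cm:  real-world width of the object
    ref_pixel_width: width of the object's bounding box in the reference image
    """
    return ref_pixel_width * ref_distance_cm / known_width_cm


def estimate_distance(known_width_cm, focal_length_px, pixel_width):
    """Estimate the distance (in cm) from the bounding-box width in pixels."""
    if pixel_width <= 0:
        raise ValueError("pixel_width must be positive")
    return known_width_cm * focal_length_px / pixel_width


if __name__ == "__main__":
    # Made-up numbers: a 20 cm wide object placed 100 cm from the camera
    # appears 120 px wide in the reference frame.
    focal_length = calibrate_focal_length(100, 20, 120)
    # Later, the same object appears 60 px wide.
    print(estimate_distance(20, focal_length, 60))
```

Because the estimate scales with `known_width`, a single value is only accurate for objects of roughly that width. A per-class width table would be one way to extend this.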
This project is licensed under the MIT License - see the LICENSE.md file for details.