VisionVoice AI

This is an AI-powered chatbot designed to improve communication between patients and doctors by combining speech recognition and image analysis.

Project Overview

This chatbot helps bridge gaps in healthcare by:

Listening & Understanding:: Uses OpenAI’s Whisper to convert speech into text.
Analyzing Medical Images:: Uses Meta’s Llama 3.2 to interpret images and provide insights.
Responding Naturally: Generating human-like, empathetic responses to patient inquiries.

Features

Speech-to-Text (STT):Converts spoken words into text using Whisper.
Medical Image Analysis: Processes images with Llama 3.2 for insights.
Text-to-Speech (TTS): Replies with natural-sounding speech via gTTS & ElevenLabs.
User-Friendly Interface: Built with Gradio for a smooth experience.

Getting Started

Prerequisites

Python 3.8+: Ensure Python is installed on your system.
Virtual Environment: It's recommended to use a virtual environment to manage dependencies.

Installation

Clone the Repository:

git clone [email protected]:jamesoyanna/voice_vision_ai.git
cd vision_voice_ai

Set Up Virtual Environment:

python -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate

Install Dependencies:
```
pip install -r requirements.txt
```
Set Up Environment Variables:

Create a .env file in the project root directory with the following content:
```
GROQ_API_KEY=your_groq_api_key
ELEVENLABS_API_KEY=your_elevenlabs_api_key
```
Install FFmpeg:
- Windows: Download from FFmpeg Official Website and add to system PATH.
- macOS: Install via Homebrew: brew install ffmpeg.
- Linux: Use the package manager: sudo apt-get install ffmpeg.

Running the Application

After completing the installation steps:

python gradio_app.py

Open your browser and navigate to http://localhost:7860 to interact with the chatbot.

License

This project is licensed under the MIT License. See the LICENSE file for details.

Acknowledgments

OpenAI: For the Whisper model.
Meta: For the Llama 3.2 model.
Gradio: For the user interface framework.

Name		Name	Last commit message	Last commit date
Latest commit History 3 Commits
.gitignore		.gitignore
README.md		README.md
brain_of_the_doctor.py		brain_of_the_doctor.py
gradio_app.py		gradio_app.py
requirements.txt		requirements.txt
voice_of_the_doctor.py		voice_of_the_doctor.py
voice_of_the_patient.py		voice_of_the_patient.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

VisionVoice AI

Project Overview

Features

Getting Started

Prerequisites

Installation

Running the Application

License

Acknowledgments

About

Uh oh!

Releases

Packages

Languages

jamesoyanna/voice_vision_ai

Folders and files

Latest commit

History

Repository files navigation

VisionVoice AI

Project Overview

Features

Getting Started

Prerequisites

Installation

Running the Application

License

Acknowledgments

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages