AI-Voice-Assistant-AIVA-

Voice-enabled AI assistant with voice activity detection.

AI Voice Assistant Pipeline

This project implements an end-to-end AI voice assistant pipeline that converts voice queries into text, processes them with a Large Language Model (LLM), and converts the response back into speech.

Features

- Voice-to-text conversion using VAD (Voice Activity Detection) and Whisper
- Text processing using Google's Gemini AI
- Text-to-speech conversion with adjustable parameters
- Low-latency design
- Output restricted to 2 sentences
- Tunable parameters for voice output (pitch, gender, speed)
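The two-sentence output restriction listed above could be enforced with a small post-processing step on the LLM reply. The following is a minimal sketch, not the project's actual code: the function name `limit_sentences` and the regex-based splitting are assumptions made for illustration, and the heuristic will miscount abbreviations such as "e.g.".

```python
import re

# Split after '.', '!' or '?' when followed by whitespace.
# This is a simple heuristic, not a full sentence tokenizer.
_SENTENCE_END = re.compile(r"(?<=[.!?])\s+")


def limit_sentences(text: str, max_sentences: int = 2) -> str:
    """Return at most `max_sentences` leading sentences of `text`."""
    sentences = [s for s in _SENTENCE_END.split(text.strip()) if s]
    return " ".join(sentences[:max_sentences])


if __name__ == "__main__":
    reply = "Paris is the capital of France. It sits on the Seine. It has many museums."
    print(limit_sentences(reply))
```

Truncating after generation keeps the spoken reply short even when the model ignores a length instruction in the prompt, which also helps the low-latency goal because less audio has to be synthesized.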

Technologies Used

- Python
- Transformer (Pipeline)
- Torch
- NumPy
- speech_recognition
- VAD (Voice Activity Detection)
- Whisper
- google.generativeai (for Gemini)
- edge-tts
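As an illustration of how the tunable voice parameters (pitch, gender, speed) might map onto edge-tts, here is a hedged sketch. It assumes a recent edge-tts release whose `Communicate` class accepts `rate` and `pitch` strings such as `"+10%"` and `"-5Hz"`. The helper names `tts_options` and `speak`, the `VOICES` table, and the choice of voices per gender are hypothetical and not taken from the project.

```python
# Hypothetical gender-to-voice table. The short names are edge-tts voice
# identifiers; which voice to use for which gender is an illustrative choice.
VOICES = {"female": "en-US-AriaNeural", "male": "en-US-GuyNeural"}


def tts_options(gender: str = "female", speed_percent: int = 0, pitch_hz: int = 0) -> dict:
    """Build keyword arguments for edge_tts.Communicate.

    edge-tts expects signed strings, e.g. rate="+10%" and pitch="-5Hz".
    """
    return {
        "voice": VOICES[gender],
        "rate": f"{speed_percent:+d}%",
        "pitch": f"{pitch_hz:+d}Hz",
    }


async def speak(text: str, out_path: str = "reply.mp3", **options) -> None:
    """Synthesize `text` to an MP3 file (requires edge-tts and network access)."""
    import edge_tts  # imported lazily so tts_options works without the package

    communicate = edge_tts.Communicate(text, **tts_options(**options))
    await communicate.save(out_path)


# Example call (not run here because it needs network access):
#   import asyncio
#   asyncio.run(speak("Hello there.", gender="male", speed_percent=10, pitch_hz=-5))
```

Keeping the parameter formatting in a separate pure function makes it easy to check the strings without calling the speech service.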

Pipeline Steps

1. Voice-to-Text Conversion
2. Text Input into LLM
3. Text-to-Speech Conversion
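The three steps above can be sketched as a chain of interchangeable stages. This is only a structural outline under stated assumptions: the class name `VoicePipeline` and its fields are hypothetical, and the stub stages in the demo stand in for the real components (VAD plus Whisper, Gemini, and edge-tts).

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class VoicePipeline:
    """Chains the three stages; each stage is any callable with the given shape."""

    transcribe: Callable[[bytes], str]   # audio -> text (e.g. VAD + Whisper)
    generate: Callable[[str], str]       # text -> reply (e.g. Gemini)
    synthesize: Callable[[str], bytes]   # reply -> audio (e.g. edge-tts)

    def run(self, audio: bytes) -> bytes:
        query = self.transcribe(audio)
        if not query.strip():
            # Nothing was recognized, so skip the LLM and TTS calls entirely.
            return b""
        reply = self.generate(query)
        return self.synthesize(reply)


if __name__ == "__main__":
    # Stub stages so the sketch runs without any models or network access.
    demo = VoicePipeline(
        transcribe=lambda audio: audio.decode("utf-8"),
        generate=lambda text: f"You said: {text}",
        synthesize=lambda reply: reply.encode("utf-8"),
    )
    print(demo.run(b"hello"))
```

Passing the stages in as callables keeps each step independently testable and lets a slow component be swapped out without touching the others.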

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.