Voice Assistant with Azure Cognitive Speech Services, Azure OpenAI (GPT-4) and Langchain

This project demonstrates how to build a voice assistant using Azure Cognitive Speech Services, Azure OpenAI GPT-4, and Langchain. The voice assistant can perform a variety of tasks such as opening applications, fixing bugs in code, locking your computer, and fetching stock prices.

The voice assistant use Langchain Agents to perform task against the terminal and run Python code. You can easily add more tools by adding tools to load_tools().

Read more about Langchain and Langchain agents here: https://python.langchain.com/en/latest/modules/agents.html

Example Prompts

Once the voice assistant is running, you can use the following example prompts to interact with it:

"Hey Astra, open Microsoft Edge."
"Hey Astra, fix the bug in the code in my clipboard and set the solution to my clipboard."
"Hey Astra, lock my computer."
"Hey Astra, what is the stock price of MSFT?"
"Hey Astra, what time is it?"

It can also respond to any general queries that the GPT model is capable of answering. However, its capabilities extend beyond that. Since it has access to the terminal and Python through a Langchain agent, it can also interact with your system.

Prerequisites

Azure account
Azure Speech Service: https://learn.microsoft.com/en-us/azure/cognitive-services/speech-service/
Azure OpenAI Service: https://azure.microsoft.com/en-us/products/cognitive-services/openai-service

Setup

Clone this repository to your local machine.
Install the required dependencies:

pip install -r requirements.txt

Create a .env file in the root directory of the project and fill it with your own keys and other details. Use the example.env file in the repository as a reference.

# Azure Cognitive Speech Services
AZURE_SPEECH_KEY="Replace with your subscription key for Azure Speech Service"
AZURE_SPEECH_REGION="Replace with the region for your Azure Speech Service"
AZURE_WAKEUP_MODEL="astra.table" # This is the default wakeup keyword model for "Hey Astra"
AZURE_SPEECH_VOICE="en-US-AriaNeural" # This is the voice used by Azure Speech Service

# Azure OpenAI GPT-4
OPENAI_API_MODEL="gpt-4-32k" # This can be changed to any other available GPT model that has chat capabilities
OPENAI_API_DEPLOYMENTNAME="Replace with the name of the model you created in Azure Portal"
OPENAI_API_VERSION="2023-03-15-preview" # This is the version of the OpenAI API
OPENAI_API_BASE="https://REPLACETHISWITHYOURPREFIX.openai.azure.com/" # Replace with your OpenAI API base URL
OPENAI_API_KEY="Replace with your Azure OpenAI API key"
OPENAI_API_TYPE="azure" # This is the type of the OpenAI API

Run the script:

python main.py

Usage

Once the voice assistant is running, it will continuously listen for the wakeup keyword "Hey Astra". Once the keyword is detected, it will start listening for speech input, which it will then pass to the OpenAI model for processing. The model's response will be spoken out loud using text-to-speech.

Contributions

Contributions are welcome! Please feel free to submit a Pull Request.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Voice Assistant with Azure Cognitive Speech Services, Azure OpenAI (GPT-4) and Langchain

Example Prompts

Prerequisites

Setup

Usage

Contributions

About

Releases

Packages

Languages

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.MD		README.MD
astra.table		astra.table
example.env		example.env
main.py		main.py
requirements.txt		requirements.txt

DennizSvens/azure-voice-assistant

Folders and files

Latest commit

History

Repository files navigation

Voice Assistant with Azure Cognitive Speech Services, Azure OpenAI (GPT-4) and Langchain

Example Prompts

Prerequisites

Setup

Usage

Contributions

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages