How do you stream audio bidirectionally? #1

akdeb · 2024-08-19T16:53:35Z

Hey! Your project is super cool and we are using it to draw inspiration for our own open source project.
We are trying to stream audio bidirectionally with a full-duplex websocket with STT and TTS with deepgram on a py server.

In your demo picture are you using two ESP32 boards? Are you using multiple FreeRTOS tasks to handle audio streaming?

kaloprojects · 2024-08-19T18:17:31Z

Hi, thx for the compliment .. full duplex websocket streaming sounds interesting too ! Might be a follow up on my side too instead sending prerecoded audio (but pretty complex on ESP using C++/Arduino IDE, py server might be an easier option).

The reason i am using 2 ESP32 is more simple .. the right one (we are speaking here) is a pure (and versatile) STT and TTS device, the left ESP handles any other tasks (in my current projects e.g. an Open AI device via chat 4o API). Both communicate via Serial Tx/Rx (UART2) text/commands. So the right one is just an 'voice-assistant', just an I/O extension (covering STT and TTS) for dedicated other (existing) projects. Maybe i will combine all in one, but using 2 ESP is just easier and more flexible (and structured) for my use cases in moment

kaloprojects · 2024-11-01T16:41:26Z

Hi @akdeb !

.. i just reopened this issue to keep & get in contact with you. I just found you in this Starmoon project https://github.com/StarmoonAI/Starmoon and https://www.starmoon.app/.
This project is awesome !, i love it !! .. well well done, so cool :)

You know, seen this is pretty similar what i started (but until today never finalized). The whole KALO-ESP32-Voice-Assistant code was more a starter toolkit for my own (private) Open AI Chat project (using Speech-To-Speech on ESP32) .. to chat with virtual friends (just for fun):

so i built a pcb for my ESP32, same as on my picture (using I2S microphone INMP441, I2S audio amp MAX98357A)
implemented an Open AI chat device (STT via deepgram, TTS via Open AI TTS and meanwhile SpeechGen.IO TTS
btw: i use SpeechGen.IO because i LOVE the (German) child voice Gisela (same as your Azure 'Twinkle') .. easy to use as they respond with an url to a generated wav ;)
coded several agents (via System prompts) i call them via their name .. then they respond in their role (with their voice), Gisela is one of them
one detail more: meanwhile also coded optional access to Perplexity LLM / model 'llama-3.1-sonar-small-128k-online' (on top to OpenAI model gpt-4o-mini) , allowing me to ask actual real time questions about today (weather, politics etc)

=> all is done with pure C code on ESP32, works well .. BUT the big issue is the latency as i never realized real STT 'streaming' .. all is done via sending pre-recorded wav to Deepgram via http POST request, similar as i did in my KALO-ESP32-Voice-Assistant. But this concept reached the limit for 'human' conversations. Streaming in C++ /ESP32 is a nightmare .. you need a py Server and websockets, i do not have this skill set

then i found your Starmoon project. So amazing !! :) .. so i might stop my privats (Open AI Chat) and go with your idea :)

.. that’s exactly what I was looking for: 'compact AI-enabled device, you can take anywhere and converse with one of your 10 friends' .. in am emphatic conversation, just as a lovely friend, btw: I also planned to build into cuddly toys for the kinds of my friends :). So nice.

I will join your Discord for sure, also planning to order one of those lovely Starmoon AI devices .. and I might ask you ‘thousand’ questions more (in Discord’?) how to setup your device (saying this as I have some skills in hardware and ESP32 C coding, but I am a newbie with Docker, py server, github clones .. etc LOL)

Well well done @akdeb !! .. and I am happy that my KALO-ESP32-Voice-Assistant project could help you maybe a small bit in past (assuming in the I2S coding, right ?)

kaloprojects closed this as completed Aug 20, 2024

kaloprojects reopened this Aug 20, 2024

kaloprojects closed this as completed Aug 24, 2024

kaloprojects reopened this Nov 1, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How do you stream audio bidirectionally? #1

How do you stream audio bidirectionally? #1

akdeb commented Aug 19, 2024

kaloprojects commented Aug 19, 2024 •

edited

Loading

kaloprojects commented Nov 1, 2024 •

edited

Loading

How do you stream audio bidirectionally? #1

How do you stream audio bidirectionally? #1

Comments

akdeb commented Aug 19, 2024

kaloprojects commented Aug 19, 2024 • edited Loading

kaloprojects commented Nov 1, 2024 • edited Loading

kaloprojects commented Aug 19, 2024 •

edited

Loading

kaloprojects commented Nov 1, 2024 •

edited

Loading