
1s Latency Definition #14

Open
RuchirB opened this issue Feb 9, 2024 · 6 comments

Comments

@RuchirB commented Feb 9, 2024

Tried out the project, very impressed. Thanks for open sourcing. Quick question on latency.

Noticed a minimum latency of at least 3-4s. I am measuring latency as delay between when the human speaks and when the AI responds. This was with everything deployed on fly.io in Ashburn using the exact demo as instructed.

Looks like the biggest bottleneck is the request from Twilio -> Fly.io and Fly.io -> Twilio. Second biggest bottleneck looks like transcription via deepgram.

The ReadMe suggests a latency of 1s—can you clarify the definition of latency here? Is that just looking at gpt response + TTS?

Any ideas on how to reduce latency? Is there a roadmap for this project we can follow somewhere?
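In case it helps with debugging, here is a minimal per-stage timing sketch; the run_stt / run_llm / run_tts functions are hypothetical stand-ins for this repo's actual STT, GPT, and TTS hooks, not its real API:

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(stage):
    """Record the wall-clock duration of one pipeline stage."""
    start = time.monotonic()
    try:
        yield
    finally:
        timings[stage] = time.monotonic() - start

# Stand-ins so the sketch runs; replace with the project's real calls.
def run_stt(audio): time.sleep(0.05); return "hello"
def run_llm(text):  time.sleep(0.05); return "hi there"
def run_tts(text):  time.sleep(0.05); return b"\x00" * 160

with timed("stt"):
    transcript = run_stt(b"...")
with timed("llm"):
    reply = run_llm(transcript)
with timed("tts"):
    audio = run_tts(reply)

total = sum(timings.values())
print({k: f"{v * 1000:.0f} ms" for k, v in timings.items()},
      f"total={total * 1000:.0f} ms")
```

Logging these on every turn makes it easy to see whether the network hops (Twilio <-> Fly.io) or the STT/TTS services dominate.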

@ansario commented Mar 8, 2024

You could use the gpt-4-turbo-preview (or gpt-3.5-turbo) model for a small speed boost.
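Streaming the completion can also cut perceived latency, since TTS can start on the first tokens instead of waiting for the full reply. A minimal sketch with the openai Python SDK (the prompt and model choice here are illustrative):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Stream tokens so downstream TTS can begin before the reply is complete.
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",  # or "gpt-4-turbo-preview" for higher quality
    messages=[{"role": "user", "content": "Say hello in one short sentence."}],
    stream=True,
)

for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)  # hand these tokens to TTS instead
print()
```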

@ANIL-KADURKA

I am achieving the same latency. I am using Groq for faster inference, but the latency is still around 4s. STT and TTS are taking the most time: STT takes about 1.5s and TTS about 1.2s. Please help me out with the best configuration for Deepgram (or whatever else). How are you getting 1s?
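In case it helps, Deepgram's streaming endpoint exposes a couple of parameters that directly affect latency: interim_results streams partial transcripts immediately, and endpointing controls how many milliseconds of silence count as end-of-speech. A rough sketch of a raw WebSocket connection tuned for Twilio-style mulaw audio; the endpointing value of 300 is a starting point to tune, not a verified optimum:

```python
import asyncio
import json
import os
import websockets  # pip install websockets

DG_URL = (
    "wss://api.deepgram.com/v1/listen"
    "?encoding=mulaw&sample_rate=8000&channels=1"  # Twilio media format
    "&interim_results=true&endpointing=300"        # latency-related knobs
)

async def transcribe(audio_chunks):
    headers = {"Authorization": f"Token {os.environ['DEEPGRAM_API_KEY']}"}
    # NOTE: keyword is extra_headers in websockets<14, additional_headers later.
    async with websockets.connect(DG_URL, extra_headers=headers) as ws:
        async def sender():
            for chunk in audio_chunks:
                await ws.send(chunk)
            await ws.send(json.dumps({"type": "CloseStream"}))
        asyncio.ensure_future(sender())
        async for message in ws:
            result = json.loads(message)
            print(result)  # check is_final / speech_final for turn-taking

# asyncio.run(transcribe(list_of_mulaw_chunks))
```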

@devsalman247

@RuchirB Same case here... I am using Groq instead of OpenAI GPT models, but I am still experiencing some delay between the human speaking and receiving the response audio packets from Twilio...

@badereddineqodia

I think using OpenAI's Realtime API is now the way to go, as it eliminates the intermediate services that would otherwise add latency.
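For anyone exploring that route, here is a bare-bones sketch of opening a Realtime session over WebSocket; the model name and event shapes match the public docs at the time of writing, but check the current docs before relying on them:

```python
import asyncio
import json
import os
import websockets  # pip install websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"

async def main():
    headers = {
        "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
        "OpenAI-Beta": "realtime=v1",
    }
    # NOTE: keyword is extra_headers in websockets<14, additional_headers later.
    async with websockets.connect(URL, extra_headers=headers) as ws:
        # Ask the server for a spoken + text response in one round trip.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["audio", "text"],
                "instructions": "Greet the caller briefly.",
            },
        }))
        async for raw in ws:
            event = json.loads(raw)
            print(event["type"])  # e.g. response.audio.delta
            if event["type"] == "response.done":
                break

asyncio.run(main())
```

Since speech-to-speech happens inside one model, the STT -> LLM -> TTS handoffs (and their network hops) disappear, which is where most of the 3-4s was going.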

@boxed-dev

I created an application using the OpenAI Realtime API with function calling and Twilio integration, but it seems too rigid and robotic. Additionally, I find it too expensive to be feasible for real-world use, at least for now.

@devsalman247

Try ElevenLabs Conversational AI.
