Sound effects #67

Open
wants to merge 6 commits into
base: main

Conversation

Sobsz
Contributor

@Sobsz Sobsz commented Mar 15, 2024

User description

Placeholder sounds made with the online editor BeepBox (project; cut every 2 seconds, then normalized to -1 dB, then converted to mono).

  • Received (as soon as a stop event is detected)
  • Listening (after the bot's done talking)
  • Generating (looping hum until TTS starts)
  • Error (currently played when the chat model thinks there's "technical difficulties")

⚠ The sounds can get backlogged; I still need to figure out how to play them truly asynchronously (some sort of mixer? preemptible like send_audio_loop?).
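
One possible reading of the "some sort of mixer?" idea: rather than queueing each sound behind the previous one, overlapping sounds are summed into a single buffer before being sent. A minimal sketch, assuming the effects are mono 16-bit PCM bytes (the PR only says "byte format"); mix_pcm16 is a hypothetical helper, not something in this PR.

```python
from array import array


def mix_pcm16(a: bytes, b: bytes) -> bytes:
    """Mix two mono 16-bit PCM buffers sample by sample, clipping on overflow."""
    sa, sb = array("h"), array("h")
    sa.frombytes(a)
    sb.frombytes(b)
    # Pad the shorter buffer with silence so both have the same length.
    n = max(len(sa), len(sb))
    sa.extend([0] * (n - len(sa)))
    sb.extend([0] * (n - len(sb)))
    # Sum each pair of samples and clamp to the signed 16-bit range.
    mixed = array("h", (max(-32768, min(32767, x + y)) for x, y in zip(sa, sb)))
    return mixed.tobytes()
```

This only covers the summing step; deciding when sounds overlap, and preempting a sound that is already queued, would still need to happen wherever the audio is sent.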


Description

  • Added sound effects support for voice interaction events such as "listening" and "received".
  • Sound effects are loaded using librosa, converted to the correct sample rate, and played at appropriate times during voice processing.
  • Introduced a new setting SFX_VOLUME to control the volume of sound effects.
  • Updated the Docker Compose file to sync the entire 'prototype' directory from AWS S3 instead of just 'styletts2'.
  • Minor cleanup in daily_bot.py by removing an unnecessary blank line.
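
The load-and-convert step described above could look roughly like this. It is a sketch, not the PR's code: it assumes 16-bit PCM output, uses a plain list of floats in place of the array librosa would return, and to_pcm16_bytes and the SFX_VOLUME value are made up for illustration.

```python
from array import array

SFX_VOLUME = 0.5  # assumed value; the PR adds the setting but its default is not shown here


def to_pcm16_bytes(samples, volume=SFX_VOLUME):
    """Scale float samples in [-1.0, 1.0] by volume and pack them as 16-bit PCM."""
    pcm = array("h")
    for s in samples:
        scaled = max(-1.0, min(1.0, s * volume))  # clip after applying the volume
        pcm.append(int(scaled * 32767))
    return pcm.tobytes()


# In the PR the samples would come from librosa, for example:
# samples, _ = librosa.load(path, sr=OUTPUT_SAMPLE_RATE, mono=True)
```

Applying the volume once at load time means playback only has to copy bytes, at the cost of needing a reload if SFX_VOLUME changes.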

Changes walkthrough

Relevant files
Formatting
daily_bot.py
Minor Cleanup in PyAudioApp Class

clients/daily/daily_bot.py

  • Removed an unnecessary blank line.

+0/-1

Enhancement
voice.py
Implement Sound Effects for Voice Interaction

openduck-py/openduck_py/routers/voice.py

  • Imported librosa library for audio processing.
  • Added a dictionary to store sound effects.
  • Loaded sound effects from WAV files and converted them to byte format.
  • Implemented play_sfx method to play sound effects.
  • Commented out play_sfx_loop method with a TODO note for future implementation.
  • Added sound effect playback at the end of speech detection and after the assistant's response.

+39/-0

Configuration changes
__init__.py
Add Sound Effects Volume Setting

openduck-py/openduck_py/settings/__init__.py

  • Added a new setting SFX_VOLUME for sound effects volume control.

+1/-0

docker-compose.yml
Update Docker Compose S3 Sync Path

docker-compose.yml

  • Updated the AWS S3 sync command to target the 'prototype' directory instead of 'styletts2'.

+4/-4

    @codeant-ai codeant-ai bot added the enhancement New feature or request label Mar 15, 2024
    @Sobsz
    Contributor Author

    Sobsz commented Mar 15, 2024

    @CodeAnt-AI ask: sure let's try this, how would you approach play_sfx_loop?

    Contributor

    codeant-ai bot commented Mar 15, 2024

    Question: sure let's try this, how would you approach play_sfx_loop?

    Answer:
    The play_sfx_loop function is meant to play a sound effect in a loop until a certain condition is met or the task is canceled. The current implementation is commented out, with a TODO note saying it is unclear how to keep the loop from blocking Text-to-Speech (TTS) audio.

    To approach the play_sfx_loop function, we need to ensure that it can run concurrently with other tasks and not block them. Here's a possible approach:

    1. Use an asyncio.Event to control the loop. This event can be set when we want to stop the sound effect loop (e.g., when TTS starts).
    2. Instead of using await self.response_queue.join(), which waits for the queue to be empty, we should simply put the sound chunk into the queue without waiting. This will allow the loop to continue adding sound chunks to the queue without blocking.
    3. Replace asyncio.sleep(len(chunk) / OUTPUT_SAMPLE_RATE) with await asyncio.sleep(len(chunk) / OUTPUT_SAMPLE_RATE). Without await, the call only creates a coroutine object and the loop never actually pauses.
    4. Ensure that the loop checks the event at each iteration and exits if the event is set.

    Here's a revised version of the play_sfx_loop function with these changes:

    async def play_sfx_loop(self, name):
        sound = sound_effects[name]
        # Local for now, so nothing outside this coroutine can set it yet.
        loop_event = asyncio.Event()

        try:
            while not loop_event.is_set():  # Repeat the sound until the event is set
                for i in range(0, len(sound), CHUNK_SIZE):
                    if loop_event.is_set():  # Also check between chunks (step 4)
                        break
                    chunk = sound[i : i + CHUNK_SIZE]
                    await self.response_queue.put(chunk)  # Queue the chunk for playback
                    # Assumes len(chunk) counts samples; for raw PCM bytes,
                    # also divide by the number of bytes per sample.
                    await asyncio.sleep(len(chunk) / OUTPUT_SAMPLE_RATE)
        except asyncio.CancelledError:
            pass  # Cancelled through hold_sound_task.cancel(); stop quietly
        finally:
            loop_event.set()  # Mark the loop as stopped on exit

        # This function should be called with something like:
        # self.hold_sound_task = asyncio.create_task(self.play_sfx_loop("generating"))

    Additionally, you would need a way to set the loop_event from outside the function when you want to stop the loop. This could be done by making loop_event an attribute of the class and setting it when TTS starts or when the loop should be stopped for any other reason.

    Remember to handle the cancellation of self.hold_sound_task properly when the loop needs to be stopped. This can be done by calling self.hold_sound_task.cancel() followed by await self.hold_sound_task to allow the task to clean up.
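
Putting those two notes together, a start/stop pair might look like the sketch below. Everything here is a stand-in so it runs on its own: SfxDemo, start_hum, stop_hum and the tiny CHUNK_SIZE and OUTPUT_SAMPLE_RATE values are assumptions for the demo, not names or values from voice.py.

```python
import asyncio

CHUNK_SIZE = 4            # assumed tiny values so the demo runs quickly
OUTPUT_SAMPLE_RATE = 400  # samples per second (assumption)
sound_effects = {"generating": list(range(8))}  # stand-in samples, not real audio


class SfxDemo:
    """Hypothetical stand-in for the handler class in voice.py."""

    def __init__(self):
        self.response_queue = asyncio.Queue()
        self.loop_event = asyncio.Event()  # set from outside to stop the loop
        self.hold_sound_task = None

    async def play_sfx_loop(self, name):
        sound = sound_effects[name]
        while not self.loop_event.is_set():
            for i in range(0, len(sound), CHUNK_SIZE):
                if self.loop_event.is_set():
                    break
                chunk = sound[i : i + CHUNK_SIZE]
                await self.response_queue.put(chunk)
                await asyncio.sleep(len(chunk) / OUTPUT_SAMPLE_RATE)

    def start_hum(self):
        self.loop_event.clear()
        self.hold_sound_task = asyncio.create_task(self.play_sfx_loop("generating"))

    async def stop_hum(self):
        self.loop_event.set()          # ask the loop to stop at the next check
        self.hold_sound_task.cancel()  # and interrupt any pending sleep right away
        try:
            await self.hold_sound_task
        except asyncio.CancelledError:
            pass  # expected when the task was cancelled mid-sleep


async def demo():
    bot = SfxDemo()
    bot.start_hum()
    await asyncio.sleep(0.05)  # pretend the chat model is generating
    await bot.stop_hum()       # TTS is about to start
    return bot.response_queue.qsize(), bot.hold_sound_task.done()
```

Running asyncio.run(demo()) should leave a few hum chunks in the queue and a finished task. Chunks already queued when stop_hum is called would still play before TTS unless the queue is drained as well, which is the backlog issue mentioned in the PR description.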
