speech prob and settings refactor #117

Merged 1 commit on Apr 1, 2024

@matthewkennedy5 matthewkennedy5 commented Apr 1, 2024

User description

Whisper outputs a no_speech_prob, which we can use to reduce false positives from VAD. If the no_speech_prob is above 0.8, we can cancel the response, because the audio is likely silence or background noise without speech from the user.

Also reorganized settings.py a bit.


Description

  • Implemented a new feature to reduce false positives in voice activity detection by using a no_speech_prob threshold.
  • If the no_speech_prob from Whisper's transcription is above 0.8, the response is cancelled to avoid processing likely silence or noise.
  • Reorganized settings.py to categorize different types of settings for better readability and maintenance.
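
The check described above can be sketched roughly as follows. This is a minimal illustration, not the code from the PR: the helper name `apply_no_speech_threshold` and its signature are hypothetical, and it assumes the caller has already obtained a transcript and a single no_speech_prob value from Whisper.

```python
# Value taken from the PR description; in the PR it is defined in settings.py.
NO_SPEECH_PROB_THRESHOLD = 0.8


def apply_no_speech_threshold(
    transcript: str,
    no_speech_prob: float,
    threshold: float = NO_SPEECH_PROB_THRESHOLD,
) -> str:
    """Return the transcript, or "" when the audio is likely not speech.

    Hypothetical helper: the name and signature are assumptions for illustration.
    """
    # Strictly greater than the threshold, matching "above 0.8" in the description.
    if no_speech_prob > threshold:
        return ""
    return transcript
```

With this sketch, a VAD false positive on background noise (say, no_speech_prob of 0.95) yields an empty transcription, which the caller can treat as a signal to cancel the response.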

Changes walkthrough

Relevant files

Enhancement

ml.py: Implement no speech probability threshold check
openduck-py/openduck_py/routers/ml.py (+7/-3)

  • Added a check for no_speech_prob using a new threshold, NO_SPEECH_PROB_THRESHOLD, from settings.
  • If no_speech_prob is greater than the threshold, the transcription is set to an empty string.
  • Imported NO_SPEECH_PROB_THRESHOLD from settings.

settings.py: Add no speech probability threshold and reorganize settings
openduck-py/openduck_py/settings.py (+15/-8)

  • Introduced NO_SPEECH_PROB_THRESHOLD with a value of 0.8.
  • Reorganized environment and settings variables into categorized sections.
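
A settings module organized into categorized sections might look like the sketch below. Only NO_SPEECH_PROB_THRESHOLD and its value of 0.8 come from the PR; the section headings and the other names (`EXAMPLE_ENV`, `EXAMPLE_API_KEY`) are hypothetical placeholders, not the project's actual settings.

```python
import os

# --- Environment (hypothetical placeholders, not from the PR) ---
EXAMPLE_ENV = os.environ.get("EXAMPLE_ENV", "development")
EXAMPLE_API_KEY = os.environ.get("EXAMPLE_API_KEY", "")

# --- Speech recognition ---
# Transcriptions whose no_speech_prob exceeds this value are discarded.
NO_SPEECH_PROB_THRESHOLD = 0.8
```

Keeping the threshold in one place lets ml.py import it rather than hard-coding 0.8 at the call site.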
    @codeant-ai codeant-ai bot added the enhancement and bug_fix labels Apr 1, 2024

    @zachwe zachwe left a comment

    Optional comment: we could return the actual transcript along with the no_speech_prob, then add logging so that we can better understand what's happening from the Streamlit logs.
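
One way to act on this suggestion is sketched below. It is an assumption-laden illustration rather than anything from the PR: the helper name `build_transcription_response` and the response shape are hypothetical, and it uses Python's standard `logging` module without modeling how the logs reach Streamlit.

```python
import logging

logger = logging.getLogger(__name__)

NO_SPEECH_PROB_THRESHOLD = 0.8  # value from the PR description


def build_transcription_response(transcript: str, no_speech_prob: float) -> dict:
    """Return the raw transcript and no_speech_prob, plus the filtering decision.

    Hypothetical helper: the name and the response keys are assumptions.
    """
    # Treat the audio as speech unless the probability is above the threshold.
    is_speech = no_speech_prob <= NO_SPEECH_PROB_THRESHOLD
    logger.info(
        "transcript=%r no_speech_prob=%.3f is_speech=%s",
        transcript,
        no_speech_prob,
        is_speech,
    )
    return {
        "text": transcript,
        "no_speech_prob": no_speech_prob,
        "is_speech": is_speech,
    }
```

Returning the unfiltered transcript leaves the cancel decision to the caller, while the log line records what Whisper produced even for responses that end up cancelled.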

    @matthewkennedy5 matthewkennedy5 merged commit 1460119 into main Apr 1, 2024
    9 checks passed
    @matthewkennedy5 matthewkennedy5 deleted the matthew-no-speech-threshold branch April 1, 2024 19:46
    @matthewkennedy5 matthewkennedy5 restored the matthew-no-speech-threshold branch April 1, 2024 21:37