Provide a feature/property to disable the default "Disfluency Removal" thereby producing verbatim transcripts for presentation development, improvement, refinement, and practice #2637

Open
ahotrod opened this issue Oct 21, 2024 · 0 comments

ahotrod commented Oct 21, 2024

Our industry use case for Azure AI Speech, which is popular and in demand, is analyzing important high-level presentations for development, improvement, refinement, and practice. Part of our NLTK analysis of the transcribed presentations builds concordances of filler words (um, uh, er, hmm, so, etc.), which requires a verbatim transcript with no "Disfluency Removal".
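To illustrate why the transcript has to be verbatim, here is a minimal sketch of the kind of filler-word analysis described above. It uses only the standard library as a stand-in for the NLTK concordance step; the function names, the tokenizer regex, and the filler list are assumptions for illustration, not our production code.

```python
import re
from collections import Counter

# Illustrative filler list taken from the examples above; a real list would be longer.
FILLERS = {"um", "uh", "er", "hmm", "so"}


def tokenize(transcript: str) -> list[str]:
    """Lowercase the transcript and split it into simple word tokens."""
    return re.findall(r"[a-z']+", transcript.lower())


def filler_counts(transcript: str) -> Counter:
    """Count how often each filler word occurs in the transcript."""
    return Counter(tok for tok in tokenize(transcript) if tok in FILLERS)


def concordance(transcript: str, word: str, width: int = 3) -> list[str]:
    """Return each occurrence of `word` with up to `width` tokens of context per side."""
    tokens = tokenize(transcript)
    return [
        " ".join(tokens[max(0, i - width): i + width + 1])
        for i, tok in enumerate(tokens)
        if tok == word
    ]


if __name__ == "__main__":
    sample = "Um, so the, uh, results are, um, good."
    print(filler_counts(sample))
    print(concordance(sample, "uh", width=1))
```

If the service strips "um" and "uh" before returning the transcript, both functions come back empty for those words, which is exactly the information this analysis is meant to surface.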

We are porting our application to Azure by adopting the Python code in this sample: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/batch/python/python-client/main.py which uses the Speech to Text REST API v3.2 through a swagger-client configuration. Unfortunately, this API applies Disfluency Removal by default, stripping many filler words, and we cannot find any way to disable it. Our question: does this API allow disabling Disfluency Removal, and if not, can it be updated to do so?

Our use case requires batch transcription and custom speech model management, which in turn require the Speech to Text REST API v3.2 (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-sdk). Apart from this disfluency issue, the Speech to Text REST API v3.2 works perfectly for our use case (speaker diarization, word-level timestamps, BYOS, etc.).

One possible solution would be to keep "Disfluency Removal" as the default, so the installed code base is unaffected, and to provide a transcription property that disables it, such as:

`properties.verbatim = True`

-OR-

`properties.disfluencyremoval = False`
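As a rough sketch of how such an opt-out might look from the caller's side, the snippet below builds a JSON body for a batch transcription request. The `verbatim` property is hypothetical: it is the feature being requested here and, as far as we can tell, does not exist in the v3.2 API. The helper name, the example URL, and the display name are also made up for illustration; the other field names follow our understanding of the v3.2 request body.

```python
import json


def build_transcription_request(content_urls, locale="en-US", verbatim=None):
    """Build a batch transcription request body as a plain dict.

    `verbatim` is a HYPOTHETICAL property (the one proposed in this issue).
    It is only added when the caller sets it, so the default request, and
    therefore the default Disfluency Removal behavior, stays unchanged.
    """
    properties = {
        "wordLevelTimestampsEnabled": True,
        "diarizationEnabled": True,
    }
    if verbatim is not None:
        # HYPOTHETICAL: not part of the current API, shown only to illustrate the request.
        properties["verbatim"] = bool(verbatim)

    return {
        "contentUrls": list(content_urls),
        "locale": locale,
        "displayName": "Presentation transcription (illustrative)",
        "properties": properties,
    }


if __name__ == "__main__":
    # Placeholder URL for illustration only.
    body = build_transcription_request(
        ["https://example.com/audio/presentation.wav"], verbatim=True
    )
    print(json.dumps(body, indent=2))
```

Leaving the property out by default means existing callers would see no change, while callers who need filler words preserved could opt in per transcription.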

Thanks for your consideration.
