Provide a feature/property to disable the default "Disfluency Removal" thereby producing verbatim transcripts for presentation development, improvement, refinement, and practice #2637

ahotrod · 2024-10-21T01:14:28Z

Our popular & in-demand industry use-case for Azure AI Speech includes analyzing important high-level presentations for development, improvement, refinement, and practice. A portion of our NLTK analysis on transcribed presentations identifies concordances of filler words (um, uh, er, hmm, so, etc.) which requires a verbatim transcript, with no "Disfluency Removal".
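To illustrate why the transcript must be verbatim, here is a minimal sketch of a filler-word count with surrounding context. It uses only the Python standard library as a stand-in for our NLTK concordance step; the filler list comes from the examples above, and both sample sentences are invented for illustration.

```python
import re
from collections import Counter

# Filler words named in this issue; illustrative, not exhaustive.
FILLERS = {"um", "uh", "er", "hmm", "so"}

def filler_concordance(transcript, width=2):
    """Count filler words and collect `width` tokens of context on each side."""
    tokens = re.findall(r"[a-z']+", transcript.lower())
    counts = Counter()
    contexts = []
    for i, tok in enumerate(tokens):
        if tok in FILLERS:
            counts[tok] += 1
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            contexts.append((left, tok, right))
    return counts, contexts

# Invented sample sentences: the same utterance with and without disfluencies.
verbatim = "So, um, our revenue grew, uh, twelve percent this quarter."
cleaned = "Our revenue grew twelve percent this quarter."

verbatim_counts, verbatim_contexts = filler_concordance(verbatim)
cleaned_counts, cleaned_contexts = filler_concordance(cleaned)
```

With the disfluencies removed, the analysis finds nothing to report, which is why the default behavior blocks this use case.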

We are porting our application to Azure by adopting the Python code in this sample: https://github.com/Azure-Samples/cognitive-services-speech-sdk/blob/master/samples/batch/python/python-client/main.py which uses the Speech to Text REST API v3.2 in a swagger-client configuration. Unfortunately, this API removes many filler words by default (Disfluency Removal), and we cannot find any way to disable that behavior. Our question: does this API allow Disfluency Removal to be disabled, and if not, can it be updated to do so?

Our use-case requires batch transcription & custom speech model management, which in turn require the Speech to Text REST API v3.2 (https://learn.microsoft.com/en-us/azure/ai-services/speech-service/speech-sdk). Except for this Disfluency issue, the Speech to Text REST API v3.2 works perfectly for our use-case (speaker diarization, word-level timestamps, BYOS, etc.).

A possible solution would be to keep "Disfluency Removal" as the default for the installed code base and provide a transcription property to disable it, for example:

`properties.verbatim = True`

-OR-

`properties.disfluencyremoval = False`
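As a rough sketch of how such an opt-in might look in a batch transcription request body: the `verbatim` key below is the hypothetical property proposed in this issue and does not exist in the v3.2 API today. The helper name and the audio URL are also placeholders for illustration.

```python
import json

def build_transcription_request(content_urls, locale, display_name, verbatim=False):
    """Build a v3.2-style transcription request body as a plain dict."""
    properties = {
        "wordLevelTimestampsEnabled": True,
        "diarizationEnabled": True,
    }
    if verbatim:
        # HYPOTHETICAL: proposed in this issue, not an existing API property.
        properties["verbatim"] = True
    return {
        "contentUrls": list(content_urls),
        "locale": locale,
        "displayName": display_name,
        "properties": properties,
    }

# Placeholder URL, for illustration only.
urls = ["https://example.com/presentation.wav"]
default_body = build_transcription_request(urls, "en-US", "Presentation review")
verbatim_body = build_transcription_request(urls, "en-US", "Presentation review", verbatim=True)
request_json = json.dumps(verbatim_body, indent=2)
```

Because the key is only sent when requested, existing clients would keep the current default behavior unchanged.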

Thanks for your consideration.
