How do I change the language recognized by the Deepgram API? I want it to recognize Chinese instead of English. I tried modifying the language in DeepgramSTTModel in the transfer_models.py file, but it still recognizes only English #189

Closed
willt0 opened this issue Mar 29, 2024 · 3 comments

Comments

willt0 commented Mar 29, 2024

    def __init__(self, stt_model_config: dict):
        # Check for api_key
        if stt_model_config["api_key"] is None:
            raise Exception("Attempt to create Deepgram STT Model without an api key.")  # pylint: disable=W0719
        # self.lang = 'en-US'
        self.lang = 'zh-CN'

        print('[INFO] Using Deepgram API for transcription.')
        self.audio_model = DeepgramClient(stt_model_config["api_key"])

@abhinavuppal1 (Collaborator)

The configuration is not clear from the issue description. Are you using command line parameters or override.yaml to select Deepgram?

Your observation is correct: with the current code, Deepgram does not recognize any language other than English.

I believe the following change will resolve the issue. Add the line

detect_language=True

after this line in the PrerecordedOptions call:

paragraphs=True)

With the additional language-detection option, the method will look like this:

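    def get_transcription(self, wav_file_path: str):
        """Transcribe a WAV file with Deepgram STT, letting the API detect the language.

        Returns the Deepgram response, or None if the request fails.
        """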
        try:
            with open(wav_file_path, "rb") as audio_file:
                buffer_data = audio_file.read()

            payload: FileSource = {
                "buffer": buffer_data
                }

            options = PrerecordedOptions(
                model="nova",
                smart_format=True,
                utterances=True,
                punctuate=True,
                paragraphs=True,
                detect_language=True)

            response = self.audio_model.listen.prerecorded.v("1").transcribe_file(payload, options)
            # This is not necessary and just a debugging aid
            with open('logs/deep.json', mode='a', encoding='utf-8') as deep_log:
                deep_log.write(response.to_json(indent=4))

            return response
        except Exception as exception:
            print(exception)

        return None

This should resolve the issue.
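To confirm that detection is working, the debugging log written by the method above can be inspected. The sketch below is a hedged illustration, not project code: the helper names `iter_json_objects` and `summarize_response` are hypothetical, and the response shape (`results.channels[].detected_language` and `results.channels[].alternatives[].transcript`) is an assumption that should be checked against an actual logs/deep.json. Because the log is opened in append mode, the file can hold several JSON objects back to back, so it is decoded one object at a time.

```python
import json
from typing import Iterator, List, Optional, Tuple


def iter_json_objects(text: str) -> Iterator[dict]:
    """Yield each JSON object from text that holds several objects back to back."""
    decoder = json.JSONDecoder()
    pos = 0
    while pos < len(text):
        if text[pos].isspace():
            # raw_decode does not skip leading whitespace, so do it here.
            pos += 1
            continue
        obj, pos = decoder.raw_decode(text, pos)
        yield obj


def summarize_response(response: dict) -> List[Tuple[Optional[str], str]]:
    """Return (detected_language, transcript) for each channel.

    The key names used here are an assumed response shape.
    """
    summary = []
    for channel in response.get("results", {}).get("channels", []):
        alternatives = channel.get("alternatives", [])
        transcript = alternatives[0].get("transcript", "") if alternatives else ""
        summary.append((channel.get("detected_language"), transcript))
    return summary


# Hypothetical sample that mirrors the assumed response shape.
sample_response = {
    "results": {
        "channels": [
            {
                "detected_language": "zh",
                "alternatives": [{"transcript": "你好"}],
            }
        ]
    }
}

if __name__ == "__main__":
    # Simulate a log file to which two responses were appended.
    log_text = json.dumps(sample_response, indent=4) + json.dumps(sample_response, indent=4)
    for logged in iter_json_objects(log_text):
        print(summarize_response(logged))
```

With a real log, `log_text` would instead be the contents of logs/deep.json read with `encoding='utf-8'`.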

willt0 (Author) commented Mar 29, 2024

Thank you!!! The problem has been resolved.

willt0 closed this as completed Mar 29, 2024

@abhinavuppal1 (Collaborator)

Resolved in #190.
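In the snippets in this thread, `self.lang` is set in `__init__` but the options built in `get_transcription` never reference it, which would explain why changing it alone had no effect. If the language is known in advance, one alternative to detection is to forward it explicitly. The sketch below is a minimal illustration under stated assumptions: `build_option_kwargs` is a hypothetical helper, and it assumes that `PrerecordedOptions` accepts a `language` keyword and that the chosen model supports the requested language code, both of which should be verified against the Deepgram documentation.

```python
from typing import Any, Dict, Optional


def build_option_kwargs(language: Optional[str] = None) -> Dict[str, Any]:
    """Build keyword arguments for the transcription options.

    If a language code is given, request it explicitly; otherwise fall back
    to automatic detection, as in the fix discussed in this thread.
    """
    kwargs: Dict[str, Any] = {
        "model": "nova",
        "smart_format": True,
        "utterances": True,
        "punctuate": True,
        "paragraphs": True,
    }
    if language:
        # Assumption: the options class accepts a `language` keyword.
        kwargs["language"] = language
    else:
        kwargs["detect_language"] = True
    return kwargs


if __name__ == "__main__":
    print(build_option_kwargs("zh-CN"))
    print(build_option_kwargs())
```

Inside `get_transcription`, the result could then be passed as `PrerecordedOptions(**build_option_kwargs(self.lang))`, again assuming the option names above are accepted.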
