-
I like this idea, and it could be extended to add new languages to the
tiptoi universe. One problem might be telling the different speakers apart
so that each one gets its own generated voice.
Interestingly, the GME files support English as a language, but there seem
to be no products on the market that actually use it.
On Tue, Oct 8, 2024 at 9:43 AM, Germling wrote:
Hello, my idea is to build a Python script to identify all OGG files that
have speech, convert them to text, then translate that text and generate
audio files again. This way, I could easily translate common books like
"Auf dem Bauernhof" into English. I just have to overwrite the OGG files
and build a new GME.
Main challenges:
- Identify only those OGG files with speech
- Batch process these files using various TTS and translation APIs
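
The workflow described above could be sketched roughly as follows. This is only a minimal outline under stated assumptions, not the script discussed in this thread: `has_speech`, `transcribe`, `translate`, and `synthesize` are hypothetical callables standing in for whichever speech detection, speech-to-text, translation, and TTS tools end up being used, and the function name `translate_media_folder` is made up for illustration.

```python
from pathlib import Path
from typing import Callable, List


def translate_media_folder(
    src_dir: Path,
    dst_dir: Path,
    has_speech: Callable[[Path], bool],
    transcribe: Callable[[Path], str],
    translate: Callable[[str], str],
    synthesize: Callable[[str, Path], None],
) -> List[Path]:
    """Process every OGG file in src_dir that contains speech.

    All four callables are hypothetical placeholders; plug in real
    speech-detection, speech-to-text, translation and TTS backends.
    Returns the list of output files that were written.
    """
    dst_dir.mkdir(parents=True, exist_ok=True)
    written: List[Path] = []
    for ogg in sorted(src_dir.glob("*.ogg")):
        if not has_speech(ogg):
            continue  # leave music and sound effects untouched
        text = transcribe(ogg)
        translated = translate(text)
        out = dst_dir / ogg.name  # same file name, so it can replace the original
        synthesize(translated, out)
        written.append(out)
    return written
```

Writing the results to a separate folder under the same file names keeps the original media intact, and the translated files can then be copied over the originals before building the new GME.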
-
Maybe this could be solved using something like
https://github.com/noisetorch/NoiseTorch
On Tue, Oct 8, 2024 at 11:40 AM, Germling wrote:
ogg_batch_translation_silero.zip
<https://github.com/user-attachments/files/17291251/ogg_batch_translation_silero.zip>
I wrote a Python script that can run through the whole media folder, scan
for speech via Silero VAD, convert German speech to text, translate the
German text to English and convert the English text to speech. An example
is attached.
The output file is temp.mp3; as you can hear, it uses a female voice by
default, and there are nicer-sounding voices available.
I think the speaker types could be identified via VAD and mapped in some
way. But the challenge is large OGG files with lots of background noise. I
don't know how the background audio could be preserved... You would need
to isolate only the speech somehow.
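
One possible way to decide whether a file counts as "speech" is to look at how much of it the VAD marks as spoken. The sketch below assumes timestamps shaped like the list of `{'start': ..., 'end': ...}` dictionaries (in samples) that Silero VAD's `get_speech_timestamps` returns. The function names and the 0.2 threshold are illustrative assumptions, not values taken from the attached script.

```python
from typing import Dict, List


def speech_ratio(timestamps: List[Dict[str, int]], total_samples: int) -> float:
    """Fraction of the audio covered by detected speech segments."""
    if total_samples <= 0:
        return 0.0
    speech = sum(seg["end"] - seg["start"] for seg in timestamps)
    return min(speech / total_samples, 1.0)


def looks_like_speech(
    timestamps: List[Dict[str, int]],
    total_samples: int,
    threshold: float = 0.2,  # assumed cut-off, tune on real files
) -> bool:
    """True if enough of the file is speech to be worth translating."""
    return speech_ratio(timestamps, total_samples) >= threshold
```

Files that fall below the threshold (music, animal sounds, jingles) could simply be copied over unchanged. The same ratio also flags the hard cases mentioned above: a long file with a low but non-zero speech ratio is likely speech mixed into background audio.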