audio segmentation #61

Open
Maria-Habib opened this issue Oct 10, 2021 · 1 comment

Comments

@Maria-Habib

Hi...
As recommended on GitHub, the best chunk size is 10 to 30 seconds. However, the LibriSpeech dataset is split into utterances of various lengths, starting from about 2 seconds.
My question is: what is the optimal chunk size? And is it okay to pre-train on audio of different lengths and fine-tune on chunks of a fixed length, or the opposite (fixed for pre-training and variable for fine-tuning)?

Further, when splitting the audio into chunks (e.g. at a fixed size of 3 s), some spoken words might be cut off and lost. What would be a better approach for splitting the audio, given that relying on silences results in larger chunk sizes?
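
To make the splitting question concrete, here is a minimal sketch of one possible compromise between fixed-size cuts and pure silence-based cuts: aim for a target chunk length, but move each cut to the quietest short frame near the nominal boundary. This is not the method used by this repository; the function name `split_near_quiet_points` and all parameters (`target_s`, `search_s`, `frame_s`) are hypothetical and chosen only for illustration.

```python
import numpy as np


def split_near_quiet_points(samples, sample_rate, target_s=10.0,
                            search_s=2.0, frame_s=0.02):
    """Cut a mono signal into chunks of roughly `target_s` seconds.

    Each cut is moved to the centre of the lowest-energy frame found
    within +/- `search_s` seconds of the nominal boundary, so cuts tend
    to land in pauses rather than in the middle of a word.
    All names and defaults here are illustrative assumptions.
    """
    samples = np.asarray(samples, dtype=np.float64)
    frame = max(1, int(frame_s * sample_rate))    # samples per energy frame
    target = int(target_s * sample_rate)          # nominal chunk length
    if target < 2 * frame:
        raise ValueError("target_s is too short for the chosen frame_s")
    # Keep the search window at least one frame wide and inside the chunk.
    search = max(frame, min(int(search_s * sample_rate), target // 2))

    chunks = []
    start = 0
    n = len(samples)
    while n - start > target + search:
        lo = start + target - search
        hi = start + target + search
        window = samples[lo:hi]
        n_frames = len(window) // frame
        # Mean squared amplitude of each frame in the search window.
        energies = (window[:n_frames * frame]
                    .reshape(n_frames, frame) ** 2).mean(axis=1)
        quietest = int(np.argmin(energies))
        cut = lo + quietest * frame + frame // 2
        chunks.append(samples[start:cut])
        start = cut
    chunks.append(samples[start:])                # remainder, may be short
    return chunks
```

With this approach, chunk lengths vary only within `target_s` ± `search_s`, except for the last chunk, which can be shorter. If a search window contains no pause, the cut still falls on the quietest frame available, so a word may occasionally be split.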

Thanks in advance.

@blessyyyu

Hello, I am sorry that I cannot answer your question. I would like to ask: how were you able to git checkout c8a0....
