Hi...
As recommended on GitHub, the best chunk size is 10 to 30 seconds. However, the LibriSpeech dataset is split into utterances of various lengths, starting from around 2 seconds.
My questions are: what is the optimal chunk size? And is it okay to pre-train on audio of varying lengths and fine-tune on chunks of a fixed length, or the opposite (fixed for pre-training and variable for fine-tuning)?
Furthermore, when splitting the audio into chunks (e.g., at a fixed length of 3 s), some spoken words might be cut off at the boundaries. What would be a better approach for splitting the audio, given that relying on silences results in larger chunks?
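For context, this is roughly the kind of silence-based splitting I have in mind (a minimal sketch using librosa; the `top_db` threshold, the 30 s cap, and the file names are just placeholder values, not recommendations):

```python
# Minimal sketch: split an audio file on silences, then cap each non-silent
# span at a maximum duration so chunks stay within the recommended range.
import librosa
import soundfile as sf

def split_on_silence_with_cap(path, max_len_s=30.0, top_db=30):
    y, sr = librosa.load(path, sr=None)                    # keep the original sample rate
    intervals = librosa.effects.split(y, top_db=top_db)    # non-silent (start, end) sample indices
    max_len = int(max_len_s * sr)
    chunks = []
    for start, end in intervals:
        # If a non-silent span exceeds the cap, cut it into fixed-size pieces;
        # this is exactly where words can still be clipped at the boundaries.
        for s in range(start, end, max_len):
            chunks.append(y[s:min(s + max_len, end)])
    return chunks, sr

chunks, sr = split_on_silence_with_cap("sample.flac")
for i, c in enumerate(chunks):
    sf.write(f"chunk_{i:04d}.flac", c, sr)
```

The trade-off I mean is visible here: splitting only on silences can give chunks much longer than 30 s, while the fixed-size cap can cut through speech.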
Thanks in advance.