What was the exact format during stage 1 and stage 2 #125

choyakawa · 2024-12-06T00:19:22Z

What special tokens were used here? Could you please share one example?

darkacorn · 2024-12-08T12:57:08Z

That looks like the regular tokenizer, since it produces audio tokens and text tokens at the same time. It's always 12.5 audio tokens per 1 second of audio, from what I've seen so far.
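As a rough sanity check on that rate, here is a minimal sketch of the arithmetic. It assumes the 12.5 tokens/second figure observed above holds in general, which is not confirmed by the maintainers, and the function name `estimate_audio_tokens` is hypothetical, not part of any library.

```python
import math

# Assumption: rate observed in this thread, not an official specification.
AUDIO_TOKENS_PER_SECOND = 12.5


def estimate_audio_tokens(duration_seconds: float) -> int:
    """Estimate the audio token count for a clip, rounding up to a whole token."""
    if duration_seconds < 0:
        raise ValueError("duration_seconds must be non-negative")
    return math.ceil(duration_seconds * AUDIO_TOKENS_PER_SECOND)


if __name__ == "__main__":
    for seconds in (1, 2, 10):
        print(seconds, "s ->", estimate_audio_tokens(seconds), "audio tokens")
```

Rounding up is a choice made here for illustration; the actual tokenizer may pad or truncate differently.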

choyakawa · 2024-12-08T17:23:18Z

I mean the exact usage of those unlisted special tokens, <|begin_of_transcription|>, <|end_of_transcription|>, etc.

darkacorn · 2024-12-08T17:27:39Z

The transcript is the audio input coming from the Whisper backbone. The tokenizer is pretty easy.

choyakawa changed the title ~~What was the exact format during stage 1~~ What was the exact format during stage 1 and stage 2 Dec 6, 2024
