AssemblyAI Batch Transcribe Tool

Chunk audio files into smaller clips to transcribe them faster with AssemblyAI.

Transcription usually takes about 20% of the original audio file's duration. One way to shorten this turnaround is to chunk the file into shorter clips and send them to AssemblyAI to be transcribed concurrently. AssemblyAI allows 32 concurrent jobs by default, and this limit can be increased based on customer requirements. Once the jobs are complete, the transcripts for the individual clips can be joined back together and sorted by timestamp to match the original audio file.
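The flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in clip-chunker.py or assembly.py: the names chunk_ranges, fake_transcribe, and transcribe_all are hypothetical, the transcription call is a local stand-in rather than a real AssemblyAI request, and timestamps are assumed to be in seconds relative to the start of each clip.

```python
import math
from concurrent.futures import ThreadPoolExecutor


def chunk_ranges(total_seconds, clip_seconds=120):
    """Return (start, end) offsets in seconds that cover the whole file."""
    count = math.ceil(total_seconds / clip_seconds)
    return [
        (i * clip_seconds, min((i + 1) * clip_seconds, total_seconds))
        for i in range(count)
    ]


def fake_transcribe(clip_range):
    """Hypothetical stand-in for submitting one clip and waiting for its transcript.

    Returns a list of segments whose timestamps are relative to the clip start.
    """
    start, end = clip_range
    return [{"text": f"clip@{start}", "start": 0.0, "end": float(end - start)}]


def transcribe_all(total_seconds, clip_seconds=120, max_workers=32,
                   transcribe=fake_transcribe):
    """Transcribe all clips concurrently, then merge the results in order."""
    ranges = chunk_ranges(total_seconds, clip_seconds)

    # Run up to max_workers jobs at once (32 mirrors the default limit above).
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = list(pool.map(transcribe, ranges))

    # Shift each clip's timestamps by the clip's offset in the original file.
    merged = []
    for (offset, _), segments in zip(ranges, results):
        for seg in segments:
            merged.append({
                "text": seg["text"],
                "start": seg["start"] + offset,
                "end": seg["end"] + offset,
            })

    # Sort by start time so the combined transcript matches the original audio.
    merged.sort(key=lambda seg: seg["start"])
    return merged
```

In a real run, the transcribe argument would be replaced by a function that uploads the clip and polls for the finished transcript. Threads are a reasonable fit here because each job spends most of its time waiting on the network.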

Steps to run this tool

Install the libraries listed in requirements.txt: pip3 install -r requirements.txt
Add your AssemblyAI API key as an environment variable: export ASSEMBLYAI_API_TOKEN={YOUR_KEY}
Run the tool: python3 clip-chunker.py <path_to_input_file> <clip_length_in_seconds>. The clip length is optional and defaults to 120 seconds.
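For readers wondering how the command-line arguments and the API key might be picked up, here is a hedged sketch. It is an assumption about a plausible approach, not a copy of clip-chunker.py: the helper names parse_args and read_api_token are hypothetical, and only the ASSEMBLYAI_API_TOKEN variable name and the 120-second default come from the steps above.

```python
import os
import sys

DEFAULT_CLIP_SECONDS = 120  # default clip length stated in the steps above


def parse_args(argv):
    """Return (input_path, clip_seconds) from a sys.argv-style list."""
    if len(argv) < 2:
        raise SystemExit(
            "usage: python3 clip-chunker.py <path_to_input_file> "
            "<clip_length_in_seconds>"
        )
    input_path = argv[1]
    # The second argument is optional; fall back to the default when absent.
    clip_seconds = int(argv[2]) if len(argv) > 2 else DEFAULT_CLIP_SECONDS
    if clip_seconds <= 0:
        raise SystemExit("clip length must be a positive number of seconds")
    return input_path, clip_seconds


def read_api_token(environ=os.environ):
    """Read the API key from the environment, failing early if it is missing."""
    token = environ.get("ASSEMBLYAI_API_TOKEN")
    if not token:
        raise SystemExit("ASSEMBLYAI_API_TOKEN is not set")
    return token


if __name__ == "__main__":
    path, seconds = parse_args(sys.argv)
    read_api_token()
    print(f"Would chunk {path} into {seconds}-second clips")
```

Failing early on a missing key or an invalid clip length avoids starting any chunking work that cannot be completed.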