- Content and Feed Inputs
- Language Model (LLM) Options
- Transcription Options
- Prompt Options
- Alternative Runtimes
- Test Suite
## Content and Feed Inputs

Run on a single YouTube video:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk"
Run on multiple YouTube videos in a playlist:
npm run as -- \
--playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr"
Run on a playlist URL and generate a JSON info file with markdown metadata for each video in the playlist:
npm run as -- \
--playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr" \
--info
Process all videos from a YouTube channel (both live and non-live):
npm run as -- \
--channel "https://www.youtube.com/@ajcwebdev"
Process videos starting from the oldest instead of newest:
npm run as -- \
--channel "https://www.youtube.com/@ajcwebdev" \
--order oldest
Skip a certain number of videos before beginning processing (starts from newest by default and can be used with `--order oldest`):
npm run as -- \
--channel "https://www.youtube.com/@ajcwebdev" \
--skip 1
Process a certain number of the most recent videos, for example the last three videos released on the channel:
npm run as -- \
--channel "https://www.youtube.com/@ajcwebdev" \
--last 3
Run on a YouTube channel and generate a JSON info file with markdown metadata for each video:
npm run as -- \
--channel "https://www.youtube.com/@ajcwebdev" \
--info
Run on an arbitrary list of URLs in `example-urls.md`:
npm run as -- \
--urls "content/example-urls.md"
Run on a URLs file and generate a JSON info file with markdown metadata for each video:
npm run as -- \
--urls "content/example-urls.md" \
--info
Run on `audio.mp3` in the `content` directory:
npm run as -- \
--file "content/audio.mp3"
Process an RSS feed from newest to oldest (default behavior):
npm run as -- \
--rss "https://ajcwebdev.substack.com/feed"
Process an RSS feed from oldest to newest:
npm run as -- \
--rss "https://feeds.transistor.fm/fsjam-podcast/" \
--order oldest
Start processing a different episode by selecting a number of episodes to skip:
npm run as -- \
--rss "https://feeds.transistor.fm/fsjam-podcast/" \
--skip 1
Process a certain number of the most recent items, for example the last three episodes released on the feed:
npm run as -- \
--rss "https://feeds.transistor.fm/fsjam-podcast/" \
--last 3
Process a single specific episode from a podcast RSS feed by providing the episode's audio URL with the `--item` option:
npm run as -- \
--rss "https://ajcwebdev.substack.com/feed" \
--item "https://api.substack.com/feed/podcast/36236609/fd1f1532d9842fe1178de1c920442541.mp3"
Run on a podcast RSS feed and generate a JSON info file with markdown metadata for each item:
npm run as -- \
--rss "https://ajcwebdev.substack.com/feed" \
--info
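Each of the RSS examples above processes a single feed per invocation. As a convenience sketch (a hypothetical shell wrapper, not a built-in feature), you could loop over several feeds using only the flags documented above:

```bash
# Hypothetical helper: process the latest episode from several feeds in sequence.
for feed in \
  "https://ajcwebdev.substack.com/feed" \
  "https://feeds.transistor.fm/fsjam-podcast/"
do
  npm run as -- --rss "$feed" --last 1
done
```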
## Language Model (LLM) Options

Create a `.env` file and set the API key as demonstrated in `.env.example` for any of the following providers:
- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `GEMINI_API_KEY`
- `COHERE_API_KEY`
- `MISTRAL_API_KEY`
- `TOGETHER_API_KEY`
- `FIREWORKS_API_KEY`
- `GROQ_API_KEY`
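For reference, a minimal `.env` might look like the following (placeholder values shown; set only the keys for the providers you intend to use):

```bash
# .env (keep this file out of version control)
OPENAI_API_KEY=sk-...
ANTHROPIC_API_KEY=sk-ant-...
GEMINI_API_KEY=...
```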
For each model available for each provider, I have collected the following details:
- Context Window, the maximum number of tokens a model can process at once.
- Max Output, the upper limit of tokens a model can generate in a response, influencing response length and detail.
- Cost of input and output tokens per million tokens.
- Some model providers also offer a Batch API with input/output tokens at half the price.
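To get a feel for how these numbers translate into per-run cost, here is a back-of-the-envelope calculation (hypothetical token counts; GPT-4o mini rates from the table below):

```bash
# cost = (tokens / 1,000,000) * price-per-million, summed over input and output
awk 'BEGIN {
  input  = 8000   # hypothetical transcript size in tokens
  output = 1200   # hypothetical show-notes size in tokens
  printf "~ $%.4f\n", input/1e6 * 0.15 + output/1e6 * 0.60   # prints ~ $0.0019
}'
```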
Run with OpenAI's ChatGPT:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--chatgpt
Select ChatGPT model:
# Select GPT-4o mini model - https://platform.openai.com/docs/models/gpt-4o-mini
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--chatgpt GPT_4o_MINI
# Select GPT-4o model - https://platform.openai.com/docs/models/gpt-4o
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--chatgpt GPT_4o
# Select GPT-4 Turbo model - https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--chatgpt GPT_4_TURBO
# Select GPT-4 model - https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--chatgpt GPT_4
| Model | Context Window | Max Output | Input Cost (per 1M tokens) | Output Cost (per 1M tokens) | Batch Input | Batch Output |
|---|---|---|---|---|---|---|
| GPT-4o mini | 128,000 | 16,384 | $0.15 | $0.60 | $0.075 | $0.30 |
| GPT-4o | 128,000 | 4,096 | $5 | $15 | $2.50 | $7.50 |
| GPT-4 Turbo | 128,000 | 4,096 | $10 | $30 | $5 | $15 |
| GPT-4 | 8,192 | 8,192 | $30 | $60 | $15 | $30 |
Run with Anthropic's Claude:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--claude
Select Claude model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--claude CLAUDE_3_5_SONNET
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--claude CLAUDE_3_OPUS
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--claude CLAUDE_3_SONNET
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--claude CLAUDE_3_HAIKU
Run with Google's Gemini:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--gemini
Select Gemini model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--gemini GEMINI_1_5_FLASH
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--gemini GEMINI_1_5_PRO
Run with Cohere:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--cohere
Select Cohere model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--cohere COMMAND_R
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--cohere COMMAND_R_PLUS
Run with Mistral:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--mistral
Select Mistral model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--mistral MIXTRAL_8x7b
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--mistral MIXTRAL_8x22b
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--mistral MISTRAL_LARGE
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--mistral MISTRAL_NEMO
Run with Fireworks AI:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks
Select Fireworks model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks LLAMA_3_1_405B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks LLAMA_3_1_70B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks LLAMA_3_1_8B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks LLAMA_3_2_3B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks LLAMA_3_2_1B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--fireworks QWEN_2_5_72B
Run with Together AI:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together
Select Together model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together LLAMA_3_2_3B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together LLAMA_3_1_405B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together LLAMA_3_1_70B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together LLAMA_3_1_8B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together GEMMA_2_27B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together GEMMA_2_9B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together QWEN_2_5_72B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--together QWEN_2_5_7B
Run with Groq:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--groq
Select Groq model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--groq LLAMA_3_1_70B_VERSATILE
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--groq LLAMA_3_1_8B_INSTANT
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--groq LLAMA_3_2_1B_PREVIEW
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--groq LLAMA_3_2_3B_PREVIEW
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--groq MIXTRAL_8X7B_32768
Run locally with Ollama:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama
Select Ollama model:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama LLAMA_3_2_1B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama LLAMA_3_2_3B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama GEMMA_2_2B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama PHI_3_5
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama QWEN_2_5_1B
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--ollama QWEN_2_5_3B
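Ollama runs models locally, so an Ollama server must be available before these commands will work. A quick sanity check using the standard Ollama CLI (separate from autoshow):

```bash
ollama list     # shows which models have already been pulled locally
# If the server isn't running, start it in another terminal:
# ollama serve
```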
## Transcription Options

If neither the `--deepgram` nor the `--assembly` option is included for transcription, `autoshow` will default to running the largest Whisper.cpp model. To configure the size of the Whisper model, use the `--whisper` option and select one of the following:
# tiny model
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisper tiny
# base model
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisper base
# small model
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisper small
# medium model
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisper medium
# large-v2 model
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisper large-v2
# large-v3-turbo model
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisper large-v3-turbo
Run `whisper.cpp` in a Docker container with `--whisperDocker`:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisperDocker base
Use the original `openai/whisper` Python library with the newly released `turbo` model:
npm run as -- \
--file "content/audio.mp3" \
--whisperPython turbo
Use `whisper-diarization` to provide speaker labels:
npm run as -- \
--file "content/audio.mp3" \
--whisperDiarization tiny
Create a `.env` file and set the API key as demonstrated in `.env.example` for `DEEPGRAM_API_KEY`.
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--deepgram
Create a `.env` file and set the API key as demonstrated in `.env.example` for `ASSEMBLY_API_KEY`.
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--assembly
Include speaker labels and the number of speakers:
npm run as -- \
--video "https://ajc.pics/audio/fsjam-short.mp3" \
--assembly \
--speakerLabels
## Prompt Options

The default prompt includes a summary and long chapters, equivalent to running this:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt summary longChapters
Create five title ideas:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt titles
Create a one-sentence and a one-paragraph summary:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt summary
Create a short, one-sentence description for each chapter (25 words or fewer):
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt shortChapters
Create a one-paragraph description for each chapter (around 50 words):
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt mediumChapters
Create a two-paragraph description for each chapter (over 75 words):
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt longChapters
Create three key takeaways about the content:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt takeaways
Create ten questions about the content to check for comprehension:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt questions
Include all prompt options:
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--prompt titles summary longChapters takeaways questions
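Prompt selection also composes with the transcription and LLM flags shown earlier. A sketch combining options that are each documented above (assuming the flags can be freely combined in one run):

```bash
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper base \
  --chatgpt GPT_4o_MINI \
  --prompt titles summary
```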
## Alternative Runtimes

Start `whisper.cpp`, Ollama, and the AutoShow Commander CLI in their own Docker containers:
npm run docker-up
Inspect various aspects of the containers, images, and volumes:
docker images && docker ps -a && docker system df -v && docker volume ls
docker volume inspect autoshow_ollama    # details of the volume holding Ollama data
du -sh ./whisper.cpp/models              # disk space used by downloaded Whisper models
docker history autoshow-autoshow:latest  # layer history of the AutoShow image
docker history autoshow-whisper:latest   # layer history of the Whisper image
Replace `as` with `docker` to run most other commands explained in this document. The Docker runtime does not support all options at this time, notably `--whisperPython` and `--whisperDiarization`.
npm run docker -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk"
npm run docker -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisperDocker tiny
The Docker setup currently supports Ollama's official Docker image, so the entire project can be encapsulated in one local Docker Compose file:
npm run docker -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--whisperDocker tiny \
--ollama
To reset your Docker images and containers, run:
npm run prune
Run the CLI with Bun:
npm run bun -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk"
Run the CLI with Deno:
npm run deno -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk"
## Test Suite

Integration test:
- You'll need API keys for all services to make it through this entire command.
- Mostly uses transcripts of videos around one minute long and cheaper models when possible, so the total cost of running this for any given service should be at most a few cents.
npm run test-integrations
Local services test; uses only Whisper for transcription and Ollama for LLM operations:
npm run test-local
Docker test; also uses Whisper for transcription and Ollama for LLM operations, but inside Docker containers:
npm run test-docker
Benchmark test; compares different model sizes for `whisper.cpp`, `openai-whisper`, and `whisper-diarization`:
npm run test-bench