Example CLI Commands

Content and Feed Inputs

Process Single Video URLs

Run on a single YouTube video.

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk"

Process Multiple Videos in YouTube Playlist

Run on multiple YouTube videos in a playlist.

npm run as -- \
  --playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr"

Run on a playlist URL and generate a JSON info file with the markdown metadata of each video in the playlist:

npm run as -- \
  --playlist "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr" \
  --info

Process All Videos from a YouTube Channel

Process all videos from a YouTube channel (both live and non-live):

npm run as -- \
  --channel "https://www.youtube.com/@ajcwebdev"

Process videos starting from the oldest instead of newest:

npm run as -- \
  --channel "https://www.youtube.com/@ajcwebdev" \
  --order oldest

Skip a certain number of videos before beginning processing (processing starts from the newest video by default, and --skip can be combined with --order oldest, as shown in the second example below):

npm run as -- \
  --channel "https://www.youtube.com/@ajcwebdev" \
  --skip 1
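
For example, to skip the two oldest videos while processing from oldest to newest (the skip count here is arbitrary):

npm run as -- \
  --channel "https://www.youtube.com/@ajcwebdev" \
  --order oldest \
  --skip 2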

Process a certain number of the most recent videos, for example the last three videos released on the channel:

npm run as -- \
  --channel "https://www.youtube.com/@ajcwebdev" \
  --last 3

Run on a YouTube channel and generate a JSON info file with the markdown metadata of each video:

npm run as -- \
  --channel "https://www.youtube.com/@ajcwebdev" \
  --info

Process Multiple Videos Specified in a URLs File

Run on an arbitrary list of URLs in example-urls.md.

npm run as -- \
  --urls "content/example-urls.md"

Run on a URLs file and generate a JSON info file with the markdown metadata of each video:

npm run as -- \
  --urls "content/example-urls.md" \
  --info

Process Single Audio or Video File

Run on audio.mp3 in the content directory:

npm run as -- \
  --file "content/audio.mp3"

Process Podcast RSS Feed

Process RSS feed from newest to oldest (default behavior):

npm run as -- \
  --rss "https://ajcwebdev.substack.com/feed"

Process RSS feed from oldest to newest:

npm run as -- \
  --rss "https://feeds.transistor.fm/fsjam-podcast/" \
  --order oldest

Start processing from a later episode by choosing a number of episodes to skip (a combined example with --order follows below):

npm run as -- \
  --rss "https://feeds.transistor.fm/fsjam-podcast/" \
  --skip 1
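
Assuming --skip composes with --order for RSS feeds the same way it does for channels, the two flags should combine like this:

npm run as -- \
  --rss "https://feeds.transistor.fm/fsjam-podcast/" \
  --order oldest \
  --skip 2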

Process a certain number of the most recent items, for example the last three episodes released on the feed:

npm run as -- \
  --rss "https://feeds.transistor.fm/fsjam-podcast/" \
  --last 3

Process a single specific episode from a podcast RSS feed by providing the episode's audio URL with the --item option:

npm run as -- \
  --rss "https://ajcwebdev.substack.com/feed" \
  --item "https://api.substack.com/feed/podcast/36236609/fd1f1532d9842fe1178de1c920442541.mp3"

Run on a podcast RSS feed and generate a JSON info file with the markdown metadata of each item:

npm run as -- \
  --rss "https://ajcwebdev.substack.com/feed" \
  --info

Language Model (LLM) Options

Create a .env file and set the API key, as demonstrated in .env.example, for whichever of the following providers you plan to use:

  • OPENAI_API_KEY
  • ANTHROPIC_API_KEY
  • GEMINI_API_KEY
  • COHERE_API_KEY
  • MISTRAL_API_KEY
  • TOGETHER_API_KEY
  • FIREWORKS_API_KEY
  • GROQ_API_KEY
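
A minimal .env might look like this (the values are placeholders for your own keys, not real credentials):

OPENAI_API_KEY=your-openai-api-key
ANTHROPIC_API_KEY=your-anthropic-api-key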

For each model available for each provider, I have collected the following details:

  • Context Window, the limit of tokens a model can process at once.
  • Max Output, the upper limit of tokens a model can generate in a response, influencing response length and detail.
  • Cost of input and output tokens per million tokens.
    • Some model providers also offer a Batch API with input/output tokens at half the price.

OpenAI's ChatGPT Models

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --chatgpt

Select ChatGPT model:

# Select GPT-4o mini model - https://platform.openai.com/docs/models/gpt-4o-mini
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --chatgpt GPT_4o_MINI

# Select GPT-4o model - https://platform.openai.com/docs/models/gpt-4o
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --chatgpt GPT_4o

# Select GPT-4 Turbo model - https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --chatgpt GPT_4_TURBO

# Select GPT-4 model - https://platform.openai.com/docs/models/gpt-4-turbo-and-gpt-4
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --chatgpt GPT_4

| Model | Context Window | Max Output | Input Tokens | Output Tokens | Batch Input | Batch Output |
|-------|----------------|------------|--------------|---------------|-------------|--------------|
| GPT-4o mini | 128,000 | 16,384 | $0.15 | $0.60 | $0.075 | $0.30 |
| GPT-4o | 128,000 | 4,096 | $5 | $15 | $2.50 | $7.50 |
| GPT-4 Turbo | 128,000 | 4,096 | $10 | $30 | $5 | $15 |
| GPT-4 | 8,192 | 8,192 | $30 | $60 | $15 | $30 |

Anthropic's Claude Models

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --claude

Select Claude model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --claude CLAUDE_3_5_SONNET

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --claude CLAUDE_3_OPUS

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --claude CLAUDE_3_SONNET

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --claude CLAUDE_3_HAIKU

Google's Gemini Models

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --gemini

Select Gemini model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --gemini GEMINI_1_5_FLASH

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --gemini GEMINI_1_5_PRO

Cohere's Command Models

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --cohere

Select Cohere model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --cohere COMMAND_R

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --cohere COMMAND_R_PLUS

Mistral's Mistral Models

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --mistral

Select Mistral model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --mistral MIXTRAL_8x7b

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --mistral MIXTRAL_8x22b

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --mistral MISTRAL_LARGE

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --mistral MISTRAL_NEMO

Fireworks

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks

Select Fireworks model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks LLAMA_3_1_405B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks LLAMA_3_1_70B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks LLAMA_3_1_8B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks LLAMA_3_2_3B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks LLAMA_3_2_1B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --fireworks QWEN_2_5_72B

Together

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together

Select Together model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together LLAMA_3_2_3B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together LLAMA_3_1_405B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together LLAMA_3_1_70B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together LLAMA_3_1_8B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together GEMMA_2_27B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together GEMMA_2_9B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together QWEN_2_5_72B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --together QWEN_2_5_7B

Groq

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --groq

Select Groq model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --groq LLAMA_3_1_70B_VERSATILE

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --groq LLAMA_3_1_8B_INSTANT

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --groq LLAMA_3_2_1B_PREVIEW

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --groq LLAMA_3_2_3B_PREVIEW

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --groq MIXTRAL_8X7B_32768

Ollama

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama

Select Ollama model:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama LLAMA_3_2_1B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama LLAMA_3_2_3B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama GEMMA_2_2B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama PHI_3_5

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama QWEN_2_5_1B

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --ollama QWEN_2_5_3B

Transcription Options

Whisper.cpp

If neither the --deepgram nor the --assembly option is included for transcription, autoshow defaults to running the largest Whisper.cpp model. To configure the size of the Whisper model, use the --whisper option and select one of the following:

# tiny model
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper tiny

# base model
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper base

# small model
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper small

# medium model
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper medium

# large-v2 model
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper large-v2

# large-v3-turbo model
npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper large-v3-turbo

Run whisper.cpp in a Docker container with --whisperDocker:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisperDocker base

Whisper Python

Use the original openai/whisper Python library with the newly released turbo model:

npm run as -- \
  --file "content/audio.mp3" \
  --whisperPython turbo

Whisper Diarization

Use whisper-diarization to provide speaker labels:

npm run as -- \
  --file "content/audio.mp3" \
  --whisperDiarization tiny

Deepgram

Create a .env file and set DEEPGRAM_API_KEY as demonstrated in .env.example.

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --deepgram

Assembly

Create a .env file and set ASSEMBLY_API_KEY as demonstrated in .env.example.

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --assembly

Include speaker labels and the number of speakers:

npm run as -- \
  --video "https://ajc.pics/audio/fsjam-short.mp3" \
  --assembly \
  --speakerLabels

Prompt Options

The default prompt includes a summary and long chapters, equivalent to running this:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt summary longChapters

Create five title ideas:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt titles

Create a one-sentence and one-paragraph summary:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt summary

Create a short, one-sentence description for each chapter (25 words or fewer):

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt shortChapters

Create a one-paragraph description for each chapter (around 50 words):

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt mediumChapters

Create a two-paragraph description for each chapter (over 75 words):

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt longChapters

Create three key takeaways about the content:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt takeaways

Create ten questions about the content to check for comprehension:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt questions

Include all prompt options:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --prompt titles summary longChapters takeaways questions
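
Transcription, LLM, and prompt options can also be combined in a single run. A sketch, assuming the flags compose the same way they behave individually, that transcribes with the base Whisper model and writes titles and a summary with GPT-4o mini:

npm run as -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisper base \
  --chatgpt GPT_4o_MINI \
  --prompt titles summary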

Alternative Runtimes

Docker Compose

This will start whisper.cpp, Ollama, and the AutoShow Commander CLI in their own Docker containers.

npm run docker-up

Inspect various aspects of the containers, images, and volumes:

docker images && docker ps -a && docker system df -v && docker volume ls
docker volume inspect autoshow_ollama
du -sh ./whisper.cpp/models
docker history autoshow-autoshow:latest
docker history autoshow-whisper:latest

Replace as with docker to run most of the other commands explained in this document. The Docker runner does not support all options at this time, notably --whisperPython and --whisperDiarization.

npm run docker -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk"

npm run docker -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisperDocker tiny

The Compose setup currently includes Ollama's official Docker image, so the entire project can be encapsulated in one local Docker Compose file:

npm run docker -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk" \
  --whisperDocker tiny \
  --ollama

To reset your Docker images and containers, run:

npm run prune

Bun

npm run bun -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk"

Deno

npm run deno -- \
  --video "https://www.youtube.com/watch?v=MORMZXEaONk"

Test Suite

Integration test:

  • You'll need API keys for all services to make it through this entire command.
  • Mostly uses transcripts of videos around one minute long and cheaper models when possible, so the total cost of running this for any given service should be at most a few cents.

npm run test-integrations

Local services test, which only uses Whisper for transcription and Ollama for LLM operations:

npm run test-local

Docker test, which also uses Whisper for transcription and Ollama for LLM operations, but in Docker containers:

npm run test-docker

Benchmark test, which compares models of different sizes across whisper.cpp, openai-whisper, and whisper-diarization:

npm run test-bench