Merge pull request #38 from ajcwebdev/blog
Migrate server to TypeScript and remove Llama.cpp/Octo
ajcwebdev authored Oct 30, 2024
2 parents d73573c + aa40e77 commit 144aac9
Showing 64 changed files with 1,153 additions and 1,525 deletions.
2 changes: 1 addition & 1 deletion .env.example
@@ -3,9 +3,9 @@ ANTHROPIC_API_KEY=""
GEMINI_API_KEY=""
COHERE_API_KEY=""
MISTRAL_API_KEY=""
OCTOAI_API_KEY=""
TOGETHER_API_KEY=""
FIREWORKS_API_KEY=""
GROQ_API_KEY=""

DEEPGRAM_API_KEY=""
ASSEMBLY_API_KEY=""
33 changes: 0 additions & 33 deletions .github/llama.Dockerfile

This file was deleted.

3 changes: 2 additions & 1 deletion .gitignore
@@ -14,4 +14,5 @@ types
dist
NEW.md
TODO.md
nemo_msdd_configs
nemo_msdd_configs
temp_outputs
87 changes: 33 additions & 54 deletions README.md
@@ -8,9 +8,6 @@
- [Project Overview](#project-overview)
- [Key Features](#key-features)
- [Setup](#setup)
- [Copy Environment Variable File](#copy-environment-variable-file)
- [Install Local Dependencies](#install-local-dependencies)
- [Clone Whisper Repo](#clone-whisper-repo)
- [Run Autoshow Node Scripts](#run-autoshow-node-scripts)
- [Project Structure](#project-structure)

@@ -29,8 +26,10 @@ The Autoshow workflow includes the following steps:
### Key Features

- Support for multiple input types (YouTube links, RSS feeds, local video and audio files)
- Integration with various LLMs (ChatGPT, Claude, Cohere, Mistral) and transcription services (Whisper.cpp, Deepgram, Assembly)
- Local LLM support (Llama 3.1, Phi 3, Qwen 2, Mistral)
- Integration with various:
- LLMs (ChatGPT, Claude, Gemini, Cohere, Mistral, Fireworks, Together, Groq)
- Transcription services (Whisper.cpp, Deepgram, Assembly)
- Local LLM support with Ollama
- Customizable prompts for generating titles, summaries, chapter titles/descriptions, key takeaways, and questions to test comprehension
- Markdown output with metadata and formatted content
- Command-line interface for easy usage
@@ -40,36 +39,12 @@ See [`docs/roadmap.md`](/docs/roadmap.md) for details about current development

## Setup

### Copy Environment Variable File

`npm run autoshow` expects a `.env` file even for commands that don't require API keys. You can create a blank `.env` file or use the default provided:

```bash
cp .env.example .env
```

### Install Local Dependencies

Install `yt-dlp`, `ffmpeg`, and run `npm i`.
`scripts/setup.sh` checks that a `.env` file exists, Node dependencies are installed, and the `whisper.cpp` repository is cloned and built. Run it via the `setup` script in `package.json`.

```bash
brew install yt-dlp ffmpeg
npm i
npm run setup
```

### Clone Whisper Repo

Run the following commands to clone `whisper.cpp` and build the `base` model:

```bash
git clone https://github.com/ggerganov/whisper.cpp.git && \
bash ./whisper.cpp/models/download-ggml-model.sh base && \
make -C whisper.cpp && \
cp .github/whisper.Dockerfile whisper.cpp/Dockerfile
```

> Replace `base` with `large-v2` for the largest model, `medium` for a mid-sized model, or `tiny` for the smallest model.
## Run Autoshow Node Scripts

Run on a single YouTube video.
@@ -105,7 +80,7 @@ npm run as -- --rss "https://ajcwebdev.substack.com/feed"
Use local LLM.

```bash
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --llama
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --ollama
```

Use 3rd party LLM providers.
@@ -116,45 +91,49 @@ npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLA
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --gemini GEMINI_1_5_PRO
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --cohere COMMAND_R_PLUS
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral MISTRAL_LARGE
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo LLAMA_3_1_405B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --fireworks
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --together
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --groq
```

Example commands for all available CLI options can be found in [`docs/examples.md`](/docs/examples.md).

## Project Structure

- Main Entry Point (`src/autoshow.js`)
- Main Entry Point (`src/autoshow.ts`)
- Defines the command-line interface using Commander.js (see the sketch after this list)
- Handles various input options (video, playlist, URLs, file, RSS)
- Manages LLM and transcription options

- Command Processors (`src/commands`)
- `processVideo.js`: Handles single YouTube video processing
- `processPlaylist.js`: Processes all videos in a YouTube playlist
- `processURLs.js`: Processes videos from a list of URLs in a file
- `processFile.js`: Handles local audio/video file processing
- `processRSS.js`: Processes podcast RSS feeds
- `processVideo.ts`: Handles single YouTube video processing
- `processPlaylist.ts`: Processes all videos in a YouTube playlist
- `processURLs.ts`: Processes videos from a list of URLs in a file
- `processFile.ts`: Handles local audio/video file processing
- `processRSS.ts`: Processes podcast RSS feeds

- Utility Functions (`src/utils`)
- `downloadAudio.js`: Downloads audio from YouTube videos
- `runTranscription.js`: Manages the transcription process
- `runLLM.js`: Handles LLM processing for summarization and chapter generation
- `generateMarkdown.js`: Creates initial markdown files with metadata
- `cleanUpFiles.js`: Removes temporary files after processing
- `downloadAudio.ts`: Downloads audio from YouTube videos
- `runTranscription.ts`: Manages the transcription process
- `runLLM.ts`: Handles LLM processing for summarization and chapter generation
- `generateMarkdown.ts`: Creates initial markdown files with metadata
- `cleanUpFiles.ts`: Removes temporary files after processing

- Transcription Services (`src/transcription`)
- `whisper.js`: Uses Whisper.cpp for transcription
- `deepgram.js`: Integrates Deepgram transcription service
- `assembly.js`: Integrates AssemblyAI transcription service
- `whisper.ts`: Uses Whisper.cpp, openai-whisper, or whisper-diarization for transcription
- `deepgram.ts`: Integrates Deepgram transcription service
- `assembly.ts`: Integrates AssemblyAI transcription service

- Language Models (`src/llms`)
- `chatgpt.js`: Integrates OpenAI's GPT models
- `claude.js`: Integrates Anthropic's Claude models
- `cohere.js`: Integrates Cohere's language models
- `mistral.js`: Integrates Mistral AI's language models
- `octo.js`: Integrates OctoAI's language models
- `llama.js`: Integrates Llama models (local inference)
- `prompt.js`: Defines the prompt structure for summarization and chapter generation
- `chatgpt.ts`: Integrates OpenAI's GPT models
- `claude.ts`: Integrates Anthropic's Claude models
- `gemini.ts`: Integrates Google's Gemini models
- `cohere.ts`: Integrates Cohere's language models
- `mistral.ts`: Integrates Mistral AI's language models
- `fireworks.ts`: Integrates Fireworks's open source models
- `together.ts`: Integrates Together's open source models
- `groq.ts`: Integrates Groq's open source models
- `prompt.ts`: Defines the prompt structure for summarization and chapter generation

- Web Interface (`web`) and Server (`server`)
- Web interface built with React and Vite
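
For orientation, here is a minimal sketch of how a Commander.js entry point along these lines can be wired up. It is illustrative only, not the project's actual `src/autoshow.ts`: the input options follow the command processors listed above, the LLM flag mirrors the example commands, and the dispatch logic is reduced to placeholders, so exact flag names and behavior in the real CLI may differ.

```typescript
// Hypothetical sketch of a Commander.js CLI entry point (not the real src/autoshow.ts).
import { Command } from 'commander'

const program = new Command()

program
  .name('autoshow')
  .description('Generate show notes and transcripts from videos, playlists, URL lists, files, and RSS feeds')
  .option('--video <url>', 'process a single YouTube video')
  .option('--playlist <url>', 'process all videos in a YouTube playlist')
  .option('--urls <filePath>', 'process videos from a list of URLs in a file')
  .option('--file <filePath>', 'process a local audio or video file')
  .option('--rss <url>', 'process a podcast RSS feed')
  .option('--ollama [model]', 'use a local LLM served by Ollama')
  .action(async (options) => {
    // The real CLI would dispatch to processVideo.ts, processPlaylist.ts, etc.;
    // placeholders keep this sketch self-contained.
    if (options.video) {
      console.log(`would process video: ${options.video}`)
    } else if (options.rss) {
      console.log(`would process RSS feed: ${options.rss}`)
    } else {
      program.help()
    }
  })

await program.parseAsync(process.argv)
```

A real implementation would presumably import the processors from `src/commands` and hand the parsed options to helpers like `runTranscription` and `runLLM` described above.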
35 changes: 5 additions & 30 deletions docs/examples.md
@@ -143,9 +143,12 @@ Create a `.env` file and set API key as demonstrated in `.env.example` for eithe

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `GEMINI_API_KEY`
- `COHERE_API_KEY`
- `MISTRAL_API_KEY`
- `OCTOAI_API_KEY`
- `TOGETHER_API_KEY`
- `FIREWORKS_API_KEY`
- `GROQ_API_KEY`

For each model available for each provider, I have collected the following details:

@@ -401,34 +404,6 @@ npm run as -- \
--groq MIXTRAL_8X7B_32768
```

### Llama.cpp

```bash
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama
```

Select Llama model:

```bash
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama GEMMA_2_2B

npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama LLAMA_3_2_1B

npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama PHI_3_5

npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama QWEN_2_5_3B
```

### Ollama

```bash
@@ -644,7 +619,7 @@ This will start `whisper.cpp`, Ollama, and the AutoShow Commander CLI in their o
npm run docker-up
```

Replace `as` with `docker` to run most other commands explained in this document. Does not support all options at this time, notably `--llama`, `--whisperPython`, and `--whisperDiarization`.
Replace `as` with `docker` to run most other commands explained in this document. Does not support all options at this time, notably `--whisperPython` and `--whisperDiarization`.

```bash
npm run docker -- \
43 changes: 13 additions & 30 deletions docs/server.md
@@ -43,7 +43,7 @@ Use LLM.
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```
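
As a hedged illustration of the receiving side, the sketch below shows how a server route could accept this request body. It is hypothetical rather than the project's actual server code: the framework (Fastify), handler shape, and defaults are assumptions; only the route path and body fields come from the examples in this document.

```typescript
// Hypothetical sketch of a /video route (assumes Fastify; not the actual server implementation).
import Fastify from 'fastify'

const fastify = Fastify({ logger: true })

// Request body shape mirroring the curl examples in this document.
interface VideoRequestBody {
  youtubeUrl: string
  whisperModel?: string
  llm?: string
  prompts?: string[]
}

fastify.post('/video', async (request, reply) => {
  const { youtubeUrl, whisperModel = 'base', llm } = request.body as VideoRequestBody
  if (!youtubeUrl) {
    return reply.code(400).send({ error: 'youtubeUrl is required' })
  }
  // A real handler would run download, transcription, and optional LLM processing here.
  return { received: { youtubeUrl, whisperModel, llm } }
})

await fastify.listen({ port: 3000 })
```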

@@ -59,7 +59,7 @@ curl --json '{
curl --json '{
"playlistUrl": "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/playlist
```

@@ -75,7 +75,7 @@ curl --json '{
curl --json '{
"filePath": "content/example-urls.md",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/urls
```

@@ -84,7 +84,7 @@ curl --json '{
"filePath": "content/example-urls.md",
"prompts": ["titles", "mediumChapters"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/urls
```

@@ -100,7 +100,7 @@ curl --json '{
curl --json '{
"filePath": "content/audio.mp3",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/file
```

@@ -109,7 +109,7 @@ curl --json '{
"filePath": "content/audio.mp3",
"prompts": ["titles"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/file
```

@@ -125,7 +125,7 @@ curl --json '{
curl --json '{
"rssUrl": "https://feeds.transistor.fm/fsjam-podcast/",
"whisperModel": "tiny",
"llm": "llama",
"llm": "ollama",
"order": "newest",
"skip": 0
}' http://localhost:3000/rss
@@ -236,23 +236,6 @@ curl --json '{
}' http://localhost:3000/video
```

### Octo

```bash
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"llm": "octo"
}' http://localhost:3000/video
```

```bash
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"llm": "octo",
"llmModel": "LLAMA_3_1_8B"
}' http://localhost:3000/video
```

## Transcription Options

### Whisper.cpp
@@ -277,7 +260,7 @@ curl --json '{
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"transcriptServices": "deepgram",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -294,7 +277,7 @@ curl --json '{
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"transcriptServices": "assembly",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -311,7 +294,7 @@ curl --json '{
"youtubeUrl": "https://ajc.pics/audio/fsjam-short.mp3",
"transcriptServices": "assembly",
"speakerLabels": true,
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -336,7 +319,7 @@ curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"prompts": ["titles", "summary", "shortChapters", "takeaways", "questions"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -345,7 +328,7 @@ curl --json '{
"playlistUrl": "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr",
"prompts": ["titles", "mediumChapters"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/playlist
```

@@ -354,6 +337,6 @@ curl --json '{
"playlistUrl": "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr",
"prompts": ["titles", "mediumChapters"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/playlist
```