Merge pull request #38 from ajcwebdev/blog
Migrate server to TypeScript and remove Llama.cpp/Octo
ajcwebdev authored Oct 30, 2024
2 parents d73573c + aa40e77 commit 144aac9
Showing 64 changed files with 1,153 additions and 1,525 deletions.
2 changes: 1 addition & 1 deletion .env.example
@@ -3,9 +3,9 @@ ANTHROPIC_API_KEY=""
GEMINI_API_KEY=""
COHERE_API_KEY=""
MISTRAL_API_KEY=""
OCTOAI_API_KEY=""
TOGETHER_API_KEY=""
FIREWORKS_API_KEY=""
GROQ_API_KEY=""

DEEPGRAM_API_KEY=""
ASSEMBLY_API_KEY=""
33 changes: 0 additions & 33 deletions .github/llama.Dockerfile

This file was deleted.

3 changes: 2 additions & 1 deletion .gitignore
@@ -14,4 +14,5 @@ types
dist
NEW.md
TODO.md
nemo_msdd_configs
nemo_msdd_configs
temp_outputs
87 changes: 33 additions & 54 deletions README.md
@@ -8,9 +8,6 @@
- [Project Overview](#project-overview)
- [Key Features](#key-features)
- [Setup](#setup)
- [Copy Environment Variable File](#copy-environment-variable-file)
- [Install Local Dependencies](#install-local-dependencies)
- [Clone Whisper Repo](#clone-whisper-repo)
- [Run Autoshow Node Scripts](#run-autoshow-node-scripts)
- [Project Structure](#project-structure)

@@ -29,8 +26,10 @@ The Autoshow workflow includes the following steps:
### Key Features

- Support for multiple input types (YouTube links, RSS feeds, local video and audio files)
- Integration with various LLMs (ChatGPT, Claude, Cohere, Mistral) and transcription services (Whisper.cpp, Deepgram, Assembly)
- Local LLM support (Llama 3.1, Phi 3, Qwen 2, Mistral)
- Integration with various:
- LLMs (ChatGPT, Claude, Gemini, Cohere, Mistral, Fireworks, Together, Groq)
- Transcription services (Whisper.cpp, Deepgram, Assembly)
- Local LLM support with Ollama
- Customizable prompts for generating titles, summaries, chapter titles/descriptions, key takeaways, and questions to test comprehension
- Markdown output with metadata and formatted content
- Command-line interface for easy usage
@@ -40,36 +39,12 @@ See [`docs/roadmap.md`](/docs/roadmap.md) for details about current development

## Setup

### Copy Environment Variable File

`npm run autoshow` expects a `.env` file even for commands that don't require API keys. You can create a blank `.env` file or use the default provided:

```bash
cp .env.example .env
```

### Install Local Dependencies

Install `yt-dlp`, `ffmpeg`, and run `npm i`.
`scripts/setup.sh` checks that a `.env` file exists, Node dependencies are installed, and the `whisper.cpp` repository is cloned and built. Run it via the `setup` script in `package.json`.

```bash
brew install yt-dlp ffmpeg
npm i
npm run setup
```

### Clone Whisper Repo

Run the following commands to clone `whisper.cpp` and build the `base` model:

```bash
git clone https://github.com/ggerganov/whisper.cpp.git && \
bash ./whisper.cpp/models/download-ggml-model.sh base && \
make -C whisper.cpp && \
cp .github/whisper.Dockerfile whisper.cpp/Dockerfile
```

> Replace `base` with `large-v2` for the largest model, `medium` for a mid-sized model, or `tiny` for the smallest model.
## Run Autoshow Node Scripts

Run on a single YouTube video.
@@ -105,7 +80,7 @@ npm run as -- --rss "https://ajcwebdev.substack.com/feed"
Use local LLM.

```bash
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --llama
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --ollama
```

Use 3rd party LLM providers.
@@ -116,45 +91,49 @@ npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --claude CLA
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --gemini GEMINI_1_5_PRO
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --cohere COMMAND_R_PLUS
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --mistral MISTRAL_LARGE
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --octo LLAMA_3_1_405B
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --fireworks
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --together
npm run as -- --video "https://www.youtube.com/watch?v=MORMZXEaONk" --groq
```

Example commands for all available CLI options can be found in [`docs/examples.md`](/docs/examples.md).

## Project Structure

- Main Entry Point (`src/autoshow.js`)
- Main Entry Point (`src/autoshow.ts`)
- Defines the command-line interface using Commander.js (see the sketch after this list)
- Handles various input options (video, playlist, URLs, file, RSS)
- Manages LLM and transcription options

- Command Processors (`src/commands`)
- `processVideo.js`: Handles single YouTube video processing
- `processPlaylist.js`: Processes all videos in a YouTube playlist
- `processURLs.js`: Processes videos from a list of URLs in a file
- `processFile.js`: Handles local audio/video file processing
- `processRSS.js`: Processes podcast RSS feeds
- `processVideo.ts`: Handles single YouTube video processing
- `processPlaylist.ts`: Processes all videos in a YouTube playlist
- `processURLs.ts`: Processes videos from a list of URLs in a file
- `processFile.ts`: Handles local audio/video file processing
- `processRSS.ts`: Processes podcast RSS feeds

- Utility Functions (`src/utils`)
- `downloadAudio.js`: Downloads audio from YouTube videos
- `runTranscription.js`: Manages the transcription process
- `runLLM.js`: Handles LLM processing for summarization and chapter generation
- `generateMarkdown.js`: Creates initial markdown files with metadata
- `cleanUpFiles.js`: Removes temporary files after processing
- `downloadAudio.ts`: Downloads audio from YouTube videos
- `runTranscription.ts`: Manages the transcription process
- `runLLM.ts`: Handles LLM processing for summarization and chapter generation
- `generateMarkdown.ts`: Creates initial markdown files with metadata
- `cleanUpFiles.ts`: Removes temporary files after processing

- Transcription Services (`src/transcription`)
- `whisper.js`: Uses Whisper.cpp for transcription
- `deepgram.js`: Integrates Deepgram transcription service
- `assembly.js`: Integrates AssemblyAI transcription service
- `whisper.ts`: Uses Whisper.cpp, openai-whisper, or whisper-diarization for transcription
- `deepgram.ts`: Integrates Deepgram transcription service
- `assembly.ts`: Integrates AssemblyAI transcription service

- Language Models (`src/llms`)
- `chatgpt.js`: Integrates OpenAI's GPT models
- `claude.js`: Integrates Anthropic's Claude models
- `cohere.js`: Integrates Cohere's language models
- `mistral.js`: Integrates Mistral AI's language models
- `octo.js`: Integrates OctoAI's language models
- `llama.js`: Integrates Llama models (local inference)
- `prompt.js`: Defines the prompt structure for summarization and chapter generation
- `chatgpt.ts`: Integrates OpenAI's GPT models
- `claude.ts`: Integrates Anthropic's Claude models
- `gemini.ts`: Integrates Google's Gemini models
- `cohere.ts`: Integrates Cohere's language models
- `mistral.ts`: Integrates Mistral AI's language models
- `fireworks.ts`: Integrates Fireworks's open source models
- `together.ts`: Integrates Together's open source models
- `groq.ts`: Integrates Groq's open source models
- `prompt.ts`: Defines the prompt structure for summarization and chapter generation

- Web Interface (`web`) and Server (`server`)
- Web interface built with React and Vite
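
For orientation, here is a minimal sketch of how a Commander.js entry point along these lines can be wired up. It is illustrative only, not the project's actual `src/autoshow.ts`: the input options follow the command processors listed above, the LLM flag mirrors the example commands, and the dispatch logic is reduced to placeholders, so exact flag names and behavior in the real CLI may differ.

```typescript
// Hypothetical sketch of a Commander.js CLI entry point (not the real src/autoshow.ts).
import { Command } from 'commander'

const program = new Command()

program
  .name('autoshow')
  .description('Generate show notes and transcripts from videos, playlists, URL lists, files, and RSS feeds')
  .option('--video <url>', 'process a single YouTube video')
  .option('--playlist <url>', 'process all videos in a YouTube playlist')
  .option('--urls <filePath>', 'process videos from a list of URLs in a file')
  .option('--file <filePath>', 'process a local audio or video file')
  .option('--rss <url>', 'process a podcast RSS feed')
  .option('--ollama [model]', 'use a local LLM served by Ollama')
  .action(async (options) => {
    // The real CLI would dispatch to processVideo.ts, processPlaylist.ts, etc.;
    // placeholders keep this sketch self-contained.
    if (options.video) {
      console.log(`would process video: ${options.video}`)
    } else if (options.rss) {
      console.log(`would process RSS feed: ${options.rss}`)
    } else {
      program.help()
    }
  })

await program.parseAsync(process.argv)
```

A real implementation would presumably import the processors from `src/commands` and hand the parsed options to helpers like `runTranscription` and `runLLM` described above.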
35 changes: 5 additions & 30 deletions docs/examples.md
@@ -143,9 +143,12 @@ Create a `.env` file and set API key as demonstrated in `.env.example` for eithe

- `OPENAI_API_KEY`
- `ANTHROPIC_API_KEY`
- `GEMINI_API_KEY`
- `COHERE_API_KEY`
- `MISTRAL_API_KEY`
- `OCTOAI_API_KEY`
- `TOGETHER_API_KEY`
- `FIREWORKS_API_KEY`
- `GROQ_API_KEY`

For each model available for each provider, I have collected the following details:

@@ -401,34 +404,6 @@ npm run as -- \
--groq MIXTRAL_8X7B_32768
```

### Llama.cpp

```bash
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama
```

Select Llama model:

```bash
npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama GEMMA_2_2B

npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama LLAMA_3_2_1B

npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama PHI_3_5

npm run as -- \
--video "https://www.youtube.com/watch?v=MORMZXEaONk" \
--llama QWEN_2_5_3B
```

### Ollama

```bash
@@ -644,7 +619,7 @@ This will start `whisper.cpp`, Ollama, and the AutoShow Commander CLI in their o
npm run docker-up
```

Replace `as` with `docker` to run most other commands explained in this document. Does not support all options at this time, notably `--llama`, `--whisperPython`, and `--whisperDiarization`.
Replace `as` with `docker` to run most other commands explained in this document. Does not support all options at this time, notably `--whisperPython` and `--whisperDiarization`.

```bash
npm run docker -- \
43 changes: 13 additions & 30 deletions docs/server.md
@@ -43,7 +43,7 @@ Use LLM.
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```
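
As a hedged illustration of the receiving side, the sketch below shows how a server route could accept this request body. It is hypothetical rather than the project's actual server code: the framework (Fastify), handler shape, and defaults are assumptions; only the route path and body fields come from the examples in this document.

```typescript
// Hypothetical sketch of a /video route (assumes Fastify; not the actual server implementation).
import Fastify from 'fastify'

const fastify = Fastify({ logger: true })

// Request body shape mirroring the curl examples in this document.
interface VideoRequestBody {
  youtubeUrl: string
  whisperModel?: string
  llm?: string
  prompts?: string[]
}

fastify.post('/video', async (request, reply) => {
  const { youtubeUrl, whisperModel = 'base', llm } = request.body as VideoRequestBody
  if (!youtubeUrl) {
    return reply.code(400).send({ error: 'youtubeUrl is required' })
  }
  // A real handler would run download, transcription, and optional LLM processing here.
  return { received: { youtubeUrl, whisperModel, llm } }
})

await fastify.listen({ port: 3000 })
```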

@@ -59,7 +59,7 @@ curl --json '{
curl --json '{
"playlistUrl": "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/playlist
```

@@ -75,7 +75,7 @@ curl --json '{
curl --json '{
"filePath": "content/example-urls.md",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/urls
```

@@ -84,7 +84,7 @@ curl --json '{
"filePath": "content/example-urls.md",
"prompts": ["titles", "mediumChapters"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/urls
```

@@ -100,7 +100,7 @@ curl --json '{
curl --json '{
"filePath": "content/audio.mp3",
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/file
```

@@ -109,7 +109,7 @@ curl --json '{
"filePath": "content/audio.mp3",
"prompts": ["titles"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/file
```

@@ -125,7 +125,7 @@ curl --json '{
curl --json '{
"rssUrl": "https://feeds.transistor.fm/fsjam-podcast/",
"whisperModel": "tiny",
"llm": "llama",
"llm": "ollama",
"order": "newest",
"skip": 0
}' http://localhost:3000/rss
@@ -236,23 +236,6 @@ curl --json '{
}' http://localhost:3000/video
```

### Octo

```bash
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"llm": "octo"
}' http://localhost:3000/video
```

```bash
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"llm": "octo",
"llmModel": "LLAMA_3_1_8B"
}' http://localhost:3000/video
```

## Transcription Options

### Whisper.cpp
@@ -277,7 +260,7 @@ curl --json '{
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"transcriptServices": "deepgram",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -294,7 +277,7 @@ curl --json '{
curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"transcriptServices": "assembly",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -311,7 +294,7 @@ curl --json '{
"youtubeUrl": "https://ajc.pics/audio/fsjam-short.mp3",
"transcriptServices": "assembly",
"speakerLabels": true,
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -336,7 +319,7 @@ curl --json '{
"youtubeUrl": "https://www.youtube.com/watch?v=MORMZXEaONk",
"prompts": ["titles", "summary", "shortChapters", "takeaways", "questions"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/video
```

@@ -345,7 +328,7 @@ curl --json '{
"playlistUrl": "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr",
"prompts": ["titles", "mediumChapters"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/playlist
```

@@ -354,6 +337,6 @@ curl --json '{
"playlistUrl": "https://www.youtube.com/playlist?list=PLCVnrVv4KhXPz0SoAVu8Rc1emAdGPbSbr",
"prompts": ["titles", "mediumChapters"],
"whisperModel": "tiny",
"llm": "llama"
"llm": "ollama"
}' http://localhost:3000/playlist
```