Ollama Client is a powerful, privacy-first Chrome extension that lets you chat with locally hosted LLMs using Ollama — no cloud, no tracking. It’s lightweight, open source, and designed for fast, offline-friendly AI conversations.
- 🔌 Local Ollama Integration – Connect to a local Ollama server (no API keys required)
- 🌐 LAN/Local Network Support – Connect to Ollama servers on your local network using IP addresses (e.g., `http://192.168.x.x:11434`)
- 🔄 Model Switcher – Switch between models in real time with a beautiful UI
- 🔍 Model Search & Pull – Search and pull models directly from Ollama.com in the UI (with progress indicator)
- 🗑️ Model Deletion – Clean up unused models with confirmation dialogs
- 🧳 Load/Unload Models – Manage Ollama memory footprint efficiently
- 📦 Model Version Display – View and compare model versions easily
- 🎛️ Advanced Parameter Tuning – Per-model configuration: temperature, top_k, top_p, repeat penalty, stop sequences, system prompts
- 💬 Beautiful Chat UI – Modern, polished interface built with Shadcn UI
- 🗂️ Multi-Chat Sessions – Create, manage, and switch between multiple chat sessions
- 📤 Export Chat Sessions – Export single or all chat sessions as PDF or JSON
- 📥 Import Chat Sessions – Import single or multiple chat sessions from JSON files
- 📋 Copy & Regenerate – Quickly rerun or copy AI responses
- ⚡ Streaming Responses – Real-time streaming with typing indicators
- 🔍 Semantic Chat Search – Search chat history by meaning, not just keywords
- 📊 Vector Database – IndexedDB-based vector storage with optimized cosine similarity
- 🎯 Smart Chunking – 3 strategies: fixed, semantic, hybrid (configurable)
- 🚀 Optimized Search – Pre-normalized vectors, caching, early termination
- 🔧 Configurable – Chunk size, overlap, similarity threshold, search limits
- 📁 Context-Aware – Search across all chats or within current session
- 📄 Text Files – Support for `.txt`, `.md`, and other text-based files
- 📁 PDF Support – Extract and process text from PDF documents
- 📘 DOCX Support – Extract text from Word documents
- 📊 CSV Support – Parse CSV, TSV, PSV with custom delimiters and column extraction (Beta v0.5.0)
- 🌐 HTML Support – Convert HTML to Markdown for clean text extraction, with 50+ language support (Beta v0.5.0)
- ⚙️ Auto-Embedding – Automatic embedding generation for uploaded files
- 📊 Progress Tracking – Real-time progress indicators during processing
- 🎛️ Configurable Limits – User-defined max file size in settings
- 🧠 Enhanced Content Extraction – Advanced extraction with multiple scroll strategies (none, instant, gradual, smart)
- 🔄 Lazy Loading Support – Automatically waits for dynamic content to load
- 📄 Site-Specific Overrides – Configure extraction settings per domain (scroll strategies, delays, timeouts)
- 🎯 Defuddle Integration – Smart content extraction with Defuddle fallback
- 📖 Mozilla Readability – Fallback extraction using Mozilla Readability
- 🎬 YouTube Transcripts – Automated YouTube transcript extraction
- 📊 Extraction Metrics – View scroll steps, mutations detected, and content length
- 🎨 Professional UI – Modern design system with glassmorphism effects, gradients, and smooth animations
- 🌓 Dark Mode – Beautiful dark theme with smooth transitions
- 📝 Prompt Templates – Create, manage, and use custom prompt templates (Ctrl+/)
- 🔊 Advanced Text-to-Speech – Searchable voice selector with adjustable speech rate & pitch
- 🌍 Internationalization (i18n) – Full multi-language support with 9 languages: English, Hindi, Spanish, French, German, Italian, Chinese (Simplified), Japanese, Russian
- 🎚️ Cross-Browser Compatibility – Works with Chrome, Brave, Edge, Opera, Firefox
- 🧪 Voice Testing – Test voices before using them
- 🛡️ 100% Local and Private – All storage and inference happen on your device
- 🧯 Declarative Net Request (DNR) – Automatic CORS handling
- 💾 IndexedDB Storage – Efficient local storage for chat sessions
- ⚡ Performance Optimized – Lazy loading, debounced operations, optimized re-renders
- 🔄 State Management – Clean Zustand-based state management
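The optimized semantic search above relies on a standard trick: if embeddings are normalized to unit length when they are stored, cosine similarity at query time reduces to a plain dot product. A minimal sketch of the idea (illustrative only, not the extension's actual code; `topK` and `minScore` are made-up names):

```typescript
// Normalize once at storage time so each search is a cheap dot product.
function normalize(v: number[]): Float32Array {
  const norm = Math.hypot(...v);
  return Float32Array.from(v, (x) => x / norm);
}

// Cosine similarity of two pre-normalized vectors is just their dot product.
function cosineSim(a: Float32Array, b: Float32Array): number {
  let dot = 0;
  for (let i = 0; i < a.length; i++) dot += a[i] * b[i];
  return dot;
}

// Rank stored chunks against a query, dropping low-similarity hits early.
function topK(
  query: Float32Array,
  docs: Float32Array[],
  k: number,
  minScore = 0.3
): { index: number; score: number }[] {
  return docs
    .map((d, index) => ({ index, score: cosineSim(query, d) }))
    .filter((r) => r.score >= minScore) // early rejection of weak matches
    .sort((a, b) => b.score - a.score)
    .slice(0, k);
}
```

In practice the extension also caches results and chunks the work asynchronously, but the dot-product-over-unit-vectors core is what makes brute-force search viable.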
- TypeScript – Type‑safe development
- React 18 – Modern UI framework
- Plasmo – Chrome‑extension framework
- Shadcn UI – Professional component library (Radix UI primitives)
- Radix UI – Accessible UI primitives
- Tailwind CSS – Utility‑first styling
- Lucide React – Icon library
- Zustand – Lightweight state management
- Dexie – IndexedDB wrapper for chat storage
- webextension‑polyfill – Promise‑based browser extension API wrapper
- Ollama – Local LLM backend
- Chrome Extension APIs – `declarativeNetRequest`, `storage`, `sidePanel`, `tabs`
- Defuddle – Advanced content extraction
- Mozilla Readability – Content extraction fallback
- highlight.js – Code syntax highlighting
- markdown-it – Markdown rendering
- pdfjs‑dist – PDF parsing and rendering
- dompurify – HTML sanitization
- html2pdf.js – Convert HTML to PDF
- mammoth – DOCX to HTML conversion
- Biome – Fast formatter & linter
- TypeScript – Strict type checking
- Husky – Git hooks
```bash
brew install ollama   # macOS
# or visit https://ollama.com for Windows/Linux installers
ollama serve          # starts at http://localhost:11434
```

💡 Quick Setup Script (Cross-platform):

For easier setup with LAN access and Firefox CORS support:

```bash
# Cross-platform bash script (macOS/Linux/Windows with Git Bash)
./tools/ollama-env.sh firefox   # Firefox with CORS + LAN access
./tools/ollama-env.sh chrome    # Chrome with LAN access
```

📄 Script file: `tools/ollama-env.sh`
This script automatically:
- Configures Ollama for LAN access (`0.0.0.0`)
- Sets up CORS for Firefox extensions (if needed)
- Shows your local IP address for network access
- Detects your OS (macOS, Linux, Windows) automatically
- Stops any running Ollama instances before starting
If you don't have the script file, you can download it directly or see the full setup guide: Ollama Setup Guide
More info: https://ollama.com
```bash
ollama pull gemma3:1b
```

Other options: `mistral`, `llama3:8b`, `codellama`, etc.
- Click the Ollama Client icon
- Open ⚙️ Settings
- Set your:
  - Base URL: `http://localhost:11434` (default) or your local network IP (e.g., `http://192.168.1.100:11434`)
  - Default model (e.g., `gemma:2b`)
  - Theme & appearance
  - Model parameters
  - Prompt templates

💡 Tip: You can use Ollama on a local network server by entering its IP address (e.g., `http://192.168.x.x:11434`) in the Base URL field. Make sure Ollama is configured with `OLLAMA_HOST=0.0.0.0` for LAN access.
Advanced parameters like system prompts and stop sequences are available per model.
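For the curious: the streaming responses listed in the features come from Ollama's HTTP API, which emits newline-delimited JSON, one token-sized fragment per line. A rough sketch of assembling a streamed `/api/chat` reply, assuming the documented stream shape (illustrative; the real extension also handles aborts, errors, and partial lines):

```typescript
// Each line of an Ollama /api/chat stream is a JSON object carrying a small
// "message.content" fragment, with "done": true on the final line.
interface ChatStreamChunk {
  message?: { role: string; content: string };
  done: boolean;
}

// Fold a batch of NDJSON lines into the assembled assistant reply.
function accumulateChunks(ndjson: string): { text: string; done: boolean } {
  let text = "";
  let done = false;
  for (const line of ndjson.split("\n")) {
    if (!line.trim()) continue; // skip blank lines between chunks
    const chunk = JSON.parse(line) as ChatStreamChunk;
    text += chunk.message?.content ?? "";
    if (chunk.done) done = true;
  }
  return { text, done };
}
```

Wiring this up to a real request means reading the `fetch` response body incrementally and buffering incomplete lines, but the per-line shape is the same.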
Want to contribute or customize? You can run and modify the Ollama Client extension locally using Plasmo.
```bash
git clone https://github.com/Shishir435/ollama-client.git
cd ollama-client
```

Using pnpm (recommended):

```bash
pnpm install
```

Or with npm:

```bash
npm install
```

Start development mode with hot reload:

```bash
pnpm dev
```

Or with npm:

```bash
npm run dev
```

This launches the Plasmo dev server and gives instructions for loading the unpacked extension in Chrome:
- Open `chrome://extensions`
- Enable Developer mode
- Click Load unpacked
- Select the `dist/` folder generated by Plasmo
```bash
pnpm build
pnpm package
```

Setup Ollama for Firefox:
Firefox requires manual CORS configuration. Use the helper script:
```bash
# Cross-platform bash script (macOS/Linux/Windows with Git Bash)
./tools/ollama-env.sh firefox
```

This configures `OLLAMA_ORIGINS` for Firefox extension support.
Build and run:
```bash
pnpm dev --target=firefox
pnpm build --target=firefox
pnpm package --target=firefox
```

Or with npm:

```bash
npm run dev -- --target=firefox
```

Load as a temporary extension.
```
src/
├── background/  # Background service worker & API handlers
├── sidepanel/   # Main chat UI
├── options/     # Settings page
├── features/    # Feature modules
│   ├── chat/     # Chat components, hooks, semantic search
│   ├── model/    # Model management & settings
│   ├── sessions/ # Chat session management
│   ├── prompt/   # Prompt templates
│   └── tabs/     # Browser tab integration
├── lib/         # Shared utilities
│   └── embeddings/ # Vector embeddings & semantic search
├── components/  # Shared UI components (Shadcn)
└── hooks/       # Shared React hooks
```
Architecture: Feature-based organization with separation of concerns (components, hooks, stores). Zustand for global state, React hooks for local state.
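The Zustand pattern mentioned above is, at its core, a subscribable store wrapped around a single state object. A dependency-free sketch of that pattern (not the Zustand library itself, and not the extension's actual stores; the chat-session field names at the end are hypothetical):

```typescript
// Minimal subscribe/setState store, in the spirit of Zustand's vanilla API.
type Listener<S> = (state: S) => void;

function createStore<S extends object>(initial: S) {
  let state = initial;
  const listeners = new Set<Listener<S>>();
  return {
    getState: () => state,
    setState(partial: Partial<S>) {
      state = { ...state, ...partial }; // shallow-merge, like Zustand's set()
      listeners.forEach((l) => l(state));
    },
    subscribe(listener: Listener<S>) {
      listeners.add(listener);
      return () => listeners.delete(listener); // returned fn unsubscribes
    },
  };
}

// Example: a tiny chat-session store (field names are hypothetical).
const sessionStore = createStore({ activeSessionId: "", streaming: false });
```

React bindings then become a thin `useSyncExternalStore` wrapper over `subscribe`/`getState`, which is why this style of store stays framework-agnostic.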
- Change manifest settings in `package.json`
- PRs welcome! Check issues for open tasks
| System Specs | Suggested Models |
|---|---|
| 💻 8GB RAM (no GPU) | gemma:2b, mistral:7b-q4 |
| 💻 16GB RAM (no GPU) | gemma:3b-q4, mistral |
| 🎮 16GB+ with GPU (6GB VRAM) | llama3:8b-q4, gemma:3b |
| 🔥 RTX 3090+ or Apple M3 Max | llama3:70b, mixtral |
📦 Prefer quantized models (q4_0, q5_1, etc.) for better performance.
Explore: Ollama Model Library
Ollama Client is a Chrome Manifest V3 extension. To use in Firefox:
- Go to `about:debugging`
- Click "Load Temporary Add-on"
- Select the `manifest.json` from the extension folder
- Manually allow CORS access (see setup guide)
- "Stop Pull" during model downloads may glitch
- Large chat histories in IndexedDB can impact performance
Here’s what’s coming up next in Ollama Client, grouped by priority:
- Migrate state management to Zustand for cleaner logic and global state control
- Add Export / Import Chat History (JSON, TXT, or PDF format)
- Add Reset App Data button ("Reset All") under Options → Reset (clears IndexedDB + localStorage)
- Enhanced Content Extraction – Phase 1 implementation with lazy loading support, site-specific overrides, and Defuddle integration
- Advanced Text-to-Speech – Searchable voice selector with rate/pitch controls and cross-browser compatibility
- Automated YouTube Transcript Extraction – Automatic button clicking for transcript access
- GitHub Content Extraction – Special handling for repository and profile pages
- Implement Ollama Embedding Models:
  - Integration with Ollama embedding models (e.g., `nomic-embed-text`, `mxbai-embed-large`)
  - Generate embeddings for chat messages and store in IndexedDB
  - Semantic search over chat history (global and per-chat)
  - Auto-embedding toggle and backfill functionality
- Vector Search Optimization (Phase 1 - Completed):
  - Brute-force cosine similarity with optimized computation
  - Pre-normalize embeddings on storage
  - Use Float32Array for better memory locality
  - Implement early termination for low similarity scores
  - Add search result caching (configurable TTL & max size)
  - Non-blocking computation (async chunking with yields)
- Semantic Chat Search UI (Beta v0.3.0 - Completed):
  - Search dialog with debounced input
  - Search scope toggle (all chats / current chat)
  - Grouped results by session
  - Similarity scores with % match display
  - Click to navigate and highlight message
  - Real-time loading indicators
- Advanced Vector Search (Phase 2 - Completed):
  - Optimized vector indexing for faster searches
  - Service Worker-compatible implementation
  - Hybrid search strategy (indexed + brute-force fallback)
  - Incremental index updates
  - Expected performance: 5-10x faster than Phase 1 for datasets >1000 vectors
  - WASM upgrade path documented in `docs/HNSW_WASM_UPGRADE.md` (optional for >50K vectors)
- Enable Local RAG over chats, PDFs, and uploaded files
- Browser Search Feature:
  - Contextual search within webpage content
  - Semantic search over extracted content
  - Search result highlighting
  - Search history
- Optional Web Search Enrichment:
  - Offline-first architecture
  - Opt-in Brave / DuckDuckGo API (user-provided key)
  - WASM fallback (e.g., tinysearch) when no key
Note: Hybrid embeddings with client-side transformers (`@xenova/transformers`) have been tested and show degraded model response quality compared to direct text prompts. The focus will be on Ollama-hosted embedding models instead.
- Text File Support (Beta v0.3.0 - Completed):
  - Plain text-based formats: `.txt`, `.md`, and more
  - Direct UTF-8 reading
- PDF Support (Beta v0.3.0 - Completed):
  - Full text extraction via pdf.js
  - Multi-page document support
- DOCX Support (Beta v0.3.0 - Completed):
  - Extract text from Word documents via mammoth.js
  - Handle formatting and structure
- Auto-Embedding (Beta v0.3.0 - Completed):
  - Automatic chunking with configurable strategies
  - Background embedding generation via port messaging
  - Progress tracking with real-time updates
  - Batch processing for performance
- File Upload Settings (Beta v0.3.0 - Completed):
  - Configurable max file size
  - Auto-embed toggle
  - Embedding batch size configuration
- CSV Support (Beta v0.5.0 - Completed):
  - CSV parsing with d3-dsv
  - Custom delimiter support (comma, tab, pipe, semicolon)
  - Column extraction
  - TSV and PSV file support
- HTML Support (Beta v0.5.0 - Completed):
  - HTML to Markdown conversion via Turndown
  - Structure and link preservation
- Track Per-Session Token Usage and display in chat metadata (duration, token count)
- Enable Semantic Chat Search / Filter once embeddings are in place
- Add Export/Import UI buttons in the chat selector UI
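The fixed-size chunking strategy referenced throughout (a chunk size plus an overlap, feeding the auto-embedding pipeline) can be sketched as follows. The defaults here are illustrative, not the extension's actual settings:

```typescript
// Split text into fixed-size chunks with overlap, so content that straddles
// a chunk boundary still appears intact in at least one chunk.
function chunkText(text: string, chunkSize = 512, overlap = 64): string[] {
  if (overlap >= chunkSize) throw new Error("overlap must be smaller than chunk size");
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each chunk's start advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // final chunk reached
  }
  return chunks;
}
```

The semantic and hybrid strategies build on the same idea but prefer sentence or paragraph boundaries over hard character offsets, trading a little size uniformity for cleaner embedding inputs.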
- 🌐 Install Extension: Chrome Web Store
- 📘 Docs & Landing Page: ollama-client
- 🐙 GitHub Repo: Github Repo
- 📖 Setup Guide: Ollama Setup Guide
- 🔒 Privacy Policy: Privacy Policy
- 🐞 Issue Tracker: Report a Bug
- 🙋‍♂️ Portfolio: shishirchaurasiya.in
- 💡 Feature Requests: Email Me
If you find Ollama Client helpful, please consider:
- ⭐ Starring the repo
- 📝 Leaving a review on the Chrome Web Store
- 💬 Sharing on socials (tag #OllamaClient)
Built with ❤️ by @Shishir435