
Migrate to RubyLLM with TUI, profiles, and hybrid search #21

Merged
cpetersen merged 8 commits into main from migrate-to-ruby-llm
Mar 29, 2026
Conversation

@cpetersen
Member

Summary

Major upgrade migrating ragnar from direct red-candle LLM usage to RubyLLM, adding a TUI, LLM profiles, and hybrid search.

TUI Interactive Mode

  • Ratatui-based TUI launches by default with ragnar (no args)
  • Auto-completion, persistent history, live output
  • All commands available via /command syntax
  • /verbose toggle and /profile switching mid-session
  • Progress bars gracefully fall back to text in TUI mode (no ioctl crash)

RubyLLM Migration

  • Replace direct Candle::LLM with RubyLLM.chat() — supports any provider
  • Red-candle retained for embeddings and reranking
  • Default local model: Qwen3-4B (MaziyarPanahi/Qwen3-4B-GGUF)
  • Strip <think> tags from Qwen3 responses; add /no_think to prompts
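The `<think>`-tag stripping could look like this minimal Ruby sketch. The method name `strip_think_tags` appears in this PR's spec list; the exact regexes and nil handling are assumptions, not the gem's actual implementation:

```ruby
# Strip Qwen3 thinking-mode output from a model response.
def strip_think_tags(text)
  return text if text.nil?
  # Remove complete <think>...</think> blocks (/m lets them span lines)
  cleaned = text.gsub(%r{<think>.*?</think>}m, "")
  # Also drop an unclosed trailing <think> block
  cleaned.sub(%r{<think>.*\z}m, "").strip
end
```

This covers the four cases the specs below mention: normal, multiline, nil, and unclosed tags.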

LLM Profiles

  • YAML-based profiles: switch between local and cloud models via config
  • --profile / -p global flag on all commands
  • /profile command to list/switch in TUI
  • API keys from config or environment variables
  • Sample profiles: red_candle, opus, sonnet
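A plausible shape for such a profiles file (field names are illustrative; only the profile names, provider/model selection, and API keys via config or environment are confirmed by this PR):

```yaml
# .ragnar.yml — hypothetical sketch, not the gem's documented schema
profiles:
  red_candle:
    provider: red_candle
    model: MaziyarPanahi/Qwen3-4B-GGUF
  opus:
    provider: anthropic
    model: claude-opus-4
    # api_key may come from config or from an environment variable
```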

Hybrid Search (Vector + FTS)

  • Add full-text search alongside vector search in retrieval pipeline
  • Both signals combined via RRF fusion
  • FTS catches exact keyword matches that vector search misses
  • Pass original user query (not clarified_intent) to retrieval
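RRF fusion of the two ranked lists can be sketched as follows (a generic sketch, not ragnar's actual code; `k = 60` is the conventional damping constant from the RRF literature):

```ruby
# Reciprocal Rank Fusion: merge two ranked ID lists into one ranking.
# A document appearing high in both lists accumulates the most score.
def rrf_fuse(vector_ids, fts_ids, k: 60)
  scores = Hash.new(0.0)
  [vector_ids, fts_ids].each do |ranking|
    ranking.each_with_index do |id, rank|
      scores[id] += 1.0 / (k + rank + 1)
    end
  end
  scores.sort_by { |_, score| -score }.map(&:first)
end
```

A document matching both semantically and by keyword (like `:b` below) outranks a top hit from either signal alone.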

UMAP Subcommands

  • ragnar umap train / ragnar umap apply (was train-umap / apply-umap)
  • Better LAPACK error handling with retry logic
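The retry idea might look like this generic sketch (the rescued error class and attempt count are assumptions; the real code depends on how the LAPACK binding surfaces failures):

```ruby
# Retry a block a few times before giving up, for transient numeric
# failures such as LAPACK convergence errors during UMAP training.
def with_retries(max_attempts: 3)
  attempts = 0
  begin
    attempts += 1
    yield
  rescue RuntimeError => e
    retry if attempts < max_attempts
    raise e
  end
end
```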

Configurable Reranking

  • query.enable_reranking and query.reranker_model in config
  • Supports BAAI/bge-reranker-base (XLM-RoBERTa) via updated red-candle
  • Default: on (configurable per-project)
  • --rerank CLI flag overrides config
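The precedence described above (CLI flag overrides config, default on) can be sketched as (helper name and config hash shape are illustrative):

```ruby
# Resolve whether reranking is enabled: an explicit CLI flag wins,
# then query.enable_reranking from config, then the default (true).
def reranking_enabled?(cli_flag, config)
  return cli_flag unless cli_flag.nil?
  config.fetch("query", {}).fetch("enable_reranking", true)
end
```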

Other

  • Profile-aware /model command
  • Updated README with all new features
  • 336 specs, 0 failures (was 293)

Test plan

  • bundle exec rspec — 336 examples, 0 failures
  • TUI mode launches and accepts commands
  • Profile switching works in CLI and TUI
  • Hybrid search finds keyword matches vector search misses
  • Qwen3-4B local model generates answers without <think> leakage
  • Opus profile works with Anthropic API
  • UMAP subcommands work in both CLI and TUI
  • Progress bars work in both CLI and TUI modes

🤖 Generated with Claude Code

cpetersen and others added 8 commits March 26, 2026 19:18
Replace direct Candle::LLM usage with RubyLLM, enabling any supported
provider (red_candle for local, openai, anthropic, ollama, etc.) via
config. Embeddings and reranking remain on red-candle directly.

Changes:
- Add ruby_llm and ruby_llm-red_candle dependencies
- Rewrite LLMManager to use RubyLLM.chat(provider:, model:) API
- Update QueryRewriter to use chat.with_schema().ask() for structured gen
- Update QueryProcessor to use fresh chat per query (prevents conversation
  bleed) with proper system instructions via with_instructions()
- Remove hardcoded TinyLlama chat template from build_prompt — RubyLLM
  applies model-specific templates automatically
- Update CLI topic summarization to use LLMManager instead of direct Candle
- Add llm.provider and llm.api_key config options
- Switch default model to Qwen3-4B (MaziyarPanahi/Qwen3-4B-GGUF)
- Update all specs and mock helpers for RubyLLM API

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Profiles:
- Add profile system to Config with YAML-based profiles (red_candle, opus, sonnet)
- Global --profile / -p flag on all CLI commands
- /profile command in TUI to show or switch profiles mid-session
- Config.create_chat centralizes provider API key configuration
- Backwards compatible with flat llm.provider/llm.default_model config

Reranker:
- Switch to BAAI/bge-reranker-base (XLM-RoBERTa, via local red-candle)
- Use raw logits (apply_sigmoid: false) for better score separation
- Pass original user query to reranker instead of clarified_intent

Query pipeline:
- Strip <think>...</think> tags from Qwen3 model responses
- Add /no_think to system prompts to disable thinking mode on local models
- Switch default LLM to Qwen3-4B (MaziyarPanahi/Qwen3-4B-GGUF)
- Bump minimum context docs from 2 to 3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Reranking:
- Make reranking configurable via query.enable_reranking and
  query.reranker_model in .ragnar.yml
- Default reranking to on (true) for new projects; --rerank CLI flag overrides
- Always include original user query in sub-queries for robust retrieval
- Strip <think> tags from model responses (Qwen3 thinking mode)

Specs (20 new, 334 total):
- strip_think_tags: normal, multiline, nil, unclosed tags
- enable_reranking parameter: on/off paths
- Original query prepended to sub-queries
- CLI profile command: list, switch, error handling, global --profile option
- Config: create_chat with providers/api_keys, reranking settings

README:
- Full config example with LLM profiles (red_candle, opus, sonnet, ollama)
- New LLM Profiles section with --profile flag and /profile TUI command
- Query config with enable_reranking and reranker_model options
- Updated Supported Models: RubyLLM providers, Qwen3-4B default, reranker options
- Environment variables for API keys

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- /verbose command toggles verbose output on/off for subsequent queries
- @@verbose_mode class variable persists across commands in TUI session
- Query command respects toggle when -v flag isn't explicitly passed
- 2 new specs for toggle on/off behavior (336 total, 0 failures)
- Documented in README TUI section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Hybrid search:
- Add full-text search alongside vector search in retrieval pipeline
- Lancelot's FTS finds exact keyword matches (e.g., "password policy")
  that pure vector search misses in noisy embedding spaces
- Both signals combined via RRF fusion — docs matching both semantically
  and by keywords rank highest

/verbose toggle:
- New /verbose command toggles verbose mode on/off in TUI sessions
- Persists across queries via @@verbose_mode class variable

/model command:
- Now profile-aware: shows active profile's provider and model
- Adapts display for local (cache status) vs cloud (API key status)
- Removed hardcoded GGUF file references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- RRF now prefers documents with complete metadata when merging duplicates
  from vector and FTS results
- Filter out sources with nil file_path to prevent empty "Sources:" display
- Add string key fallback for file_path/chunk_index lookups

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove local red-candle path dependency, use released gem
- Add specs for configurable reranker model and load failure fallback
- 338 examples, 0 failures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
cpetersen merged commit 51c6965 into main Mar 29, 2026
1 check passed
cpetersen deleted the migrate-to-ruby-llm branch March 29, 2026 00:28