Migrate to RubyLLM with TUI, profiles, and hybrid search #21
Merged
Replace direct Candle::LLM usage with RubyLLM, enabling any supported provider (red_candle for local, openai, anthropic, ollama, etc.) via config. Embeddings and reranking remain on red-candle directly.

Changes:
- Add ruby_llm and ruby_llm-red_candle dependencies
- Rewrite LLMManager to use the RubyLLM.chat(provider:, model:) API
- Update QueryRewriter to use chat.with_schema().ask() for structured generation
- Update QueryProcessor to use a fresh chat per query (prevents conversation bleed) with proper system instructions via with_instructions()
- Remove the hardcoded TinyLlama chat template from build_prompt — RubyLLM applies model-specific templates automatically
- Update CLI topic summarization to use LLMManager instead of direct Candle
- Add llm.provider and llm.api_key config options
- Switch the default model to Qwen3-4B (MaziyarPanahi/Qwen3-4B-GGUF)
- Update all specs and mock helpers for the RubyLLM API

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
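The new flat config keys described above might look like this in a project's config file (a sketch only — key names come from the commit message; the value layout and comments are illustrative, not the gem's documented schema):

```yaml
# .ragnar.yml — flat LLM config (illustrative sketch)
llm:
  provider: red_candle                        # or openai, anthropic, ollama, ...
  default_model: MaziyarPanahi/Qwen3-4B-GGUF  # new default from this PR
  api_key: YOUR_KEY_HERE                      # only needed for cloud providers
```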
Profiles:
- Add a profile system to Config with YAML-based profiles (red_candle, opus, sonnet)
- Global --profile / -p flag on all CLI commands
- /profile command in TUI to show or switch profiles mid-session
- Config.create_chat centralizes provider API key configuration
- Backwards compatible with flat llm.provider/llm.default_model config

Reranker:
- Switch to BAAI/bge-reranker-base (XLM-RoBERTa, via local red-candle)
- Use raw logits (apply_sigmoid: false) for better score separation
- Pass the original user query to the reranker instead of clarified_intent

Query pipeline:
- Strip <think>...</think> tags from Qwen3 model responses
- Add /no_think to system prompts to disable thinking mode on local models
- Switch the default LLM to Qwen3-4B (MaziyarPanahi/Qwen3-4B-GGUF)
- Bump the minimum context docs from 2 to 3

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
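The `<think>`-tag stripping mentioned above can be sketched roughly as follows. This is a hypothetical reconstruction, not ragnar's actual code; the method name matches the spec name cited later in this PR, and it covers the same cases the specs list (normal, multiline, nil, unclosed tags):

```ruby
# Strip Qwen3 "thinking mode" output from a model response.
# Handles closed (possibly multiline) blocks, nil input, and an
# unclosed trailing <think> tag.
def strip_think_tags(text)
  return nil if text.nil?

  text
    .gsub(%r{<think>.*?</think>}m, "") # closed blocks, multiline-safe
    .sub(/<think>.*\z/m, "")           # unclosed trailing tag
    .strip
end
```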
Reranking:
- Make reranking configurable via query.enable_reranking and query.reranker_model in .ragnar.yml
- Default reranking to on (true) for new projects; the --rerank CLI flag overrides
- Always include the original user query in sub-queries for robust retrieval
- Strip <think> tags from model responses (Qwen3 thinking mode)

Specs (20 new, 334 total):
- strip_think_tags: normal, multiline, nil, unclosed tags
- enable_reranking parameter: on/off paths
- Original query prepended to sub-queries
- CLI profile command: list, switch, error handling, global --profile option
- Config: create_chat with providers/api_keys, reranking settings

README:
- Full config example with LLM profiles (red_candle, opus, sonnet, ollama)
- New LLM Profiles section with the --profile flag and /profile TUI command
- Query config with enable_reranking and reranker_model options
- Updated Supported Models: RubyLLM providers, Qwen3-4B default, reranker options
- Environment variables for API keys

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
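Combining the profile and reranking settings above, a `.ragnar.yml` might look roughly like this (an illustrative sketch — key names are taken from the commit messages, but the exact schema and the cloud model names are placeholders, not the gem's documented values):

```yaml
# .ragnar.yml — illustrative shape only
profiles:
  red_candle:
    provider: red_candle
    model: MaziyarPanahi/Qwen3-4B-GGUF   # local default from this PR
  opus:
    provider: anthropic
    model: your-preferred-opus-model      # placeholder

query:
  enable_reranking: true                  # default on for new projects
  reranker_model: BAAI/bge-reranker-base  # local, via red-candle
```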
- /verbose command toggles verbose output on/off for subsequent queries
- @@verbose_mode class variable persists across commands in a TUI session
- Query command respects the toggle when the -v flag isn't explicitly passed
- 2 new specs for toggle on/off behavior (336 total, 0 failures)
- Documented in the README TUI section

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
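The toggle described above can be sketched as follows. The class and method names here are assumptions for illustration; only the @@verbose_mode class variable and the "-v overrides the toggle" behavior come from the commit message:

```ruby
# Hypothetical sketch of the TUI verbose toggle (names assumed).
class TUI
  @@verbose_mode = false # persists across commands in one session

  def self.verbose?
    @@verbose_mode
  end

  # "/verbose" flips the flag and reports the new state.
  def self.toggle_verbose
    @@verbose_mode = !@@verbose_mode
    @@verbose_mode ? "verbose on" : "verbose off"
  end

  # The query command uses the session toggle unless -v was
  # passed explicitly on the command line.
  def self.effective_verbose(cli_flag)
    cli_flag.nil? ? @@verbose_mode : cli_flag
  end
end
```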
Hybrid search:
- Add full-text search alongside vector search in the retrieval pipeline
- Lancelot's FTS finds exact keyword matches (e.g., "password policy") that pure vector search misses in noisy embedding spaces
- Both signals combined via RRF fusion — docs matching both semantically and by keyword rank highest

/verbose toggle:
- New /verbose command toggles verbose mode on/off in TUI sessions
- Persists across queries via the @@verbose_mode class variable

/model command:
- Now profile-aware: shows the active profile's provider and model
- Adapts the display for local (cache status) vs. cloud (API key status)
- Removed hardcoded GGUF file references

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
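The RRF fusion step can be sketched in a few lines. This is a generic Reciprocal Rank Fusion sketch, not ragnar's implementation; k = 60 is the conventional constant from the RRF literature, and ragnar's actual parameters are not stated in this PR:

```ruby
# Fuse two rankings (arrays of doc ids, best first) with
# Reciprocal Rank Fusion: score(d) = sum over rankings of 1/(k + rank).
def rrf_fuse(vector_hits, fts_hits, k: 60)
  scores = Hash.new(0.0)
  [vector_hits, fts_hits].each do |ranking|
    ranking.each_with_index do |doc_id, rank|
      scores[doc_id] += 1.0 / (k + rank + 1)
    end
  end
  # Docs appearing in both rankings accumulate two contributions,
  # so semantic-AND-keyword matches rise to the top.
  scores.sort_by { |_, score| -score }.map(&:first)
end
```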
- RRF now prefers documents with complete metadata when merging duplicates from vector and FTS results
- Filter out sources with a nil file_path to prevent an empty "Sources:" display
- Add a string-key fallback for file_path/chunk_index lookups

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
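The three fixes above can be sketched together as follows. Helper names and the "most non-nil fields wins" heuristic are assumptions for illustration, not ragnar's actual code:

```ruby
# String-key fallback: hits may carry symbol or string keys.
def file_path(doc)
  doc[:file_path] || doc["file_path"]
end

# When vector and FTS results contain the same document, keep the
# duplicate carrying the most metadata (here: most non-nil fields).
def merge_duplicates(docs)
  docs.group_by { |d| file_path(d) }.map do |_, dupes|
    dupes.max_by { |d| d.compact.size }
  end
end

# Drop sources without a file_path so the "Sources:" list is never empty-looking.
def presentable_sources(docs)
  docs.reject { |d| file_path(d).nil? }
end
```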
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Remove the local red-candle path dependency; use the released gem
- Add specs for the configurable reranker model and load-failure fallback
- 338 examples, 0 failures

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Summary
Major upgrade migrating ragnar from direct red-candle LLM usage to RubyLLM, adding a TUI, LLM profiles, and hybrid search.
TUI Interactive Mode
- ragnar (no args) launches interactive mode
- /command syntax
- /verbose toggle and /profile switching mid-session

RubyLLM Migration
- Replaced Candle::LLM with RubyLLM.chat() — supports any provider
- Strip <think> tags from Qwen3 responses, /no_think in prompts

LLM Profiles
- --profile / -p global flag on all commands
- /profile command to list/switch in TUI

Hybrid Search (Vector + FTS)
- Vector and full-text search results combined via RRF fusion
UMAP Subcommands
- ragnar umap train / ragnar umap apply (was train-umap / apply-umap)

Configurable Reranking
- query.enable_reranking and query.reranker_model in config
- --rerank CLI flag overrides config

Other
- Profile-aware /model command

Test plan
- bundle exec rspec — 336 examples, 0 failures
- No <think> leakage

🤖 Generated with Claude Code