
[Critical] - Fix ollama_chat reasoning content#20750

Merged
2 commits merged into BerriAI:litellm_oss_staging_02_09_2026 from DenisStefanAndrei:patch-1
Feb 10, 2026

Conversation

DenisStefanAndrei (Contributor) commented Feb 9, 2026

Relevant issues

For ollama_chat models, reasoning content is ignored after 2 consecutive thinking chunks.
Fixes #20737

Pre-Submission checklist

Please complete all items before asking a LiteLLM maintainer to review your PR

  • [x] I have added testing in the tests/litellm/ directory. Adding at least 1 test is a hard requirement (see details)
  • My PR passes all unit tests on make test-unit
  • [x] My PR's scope is as isolated as possible; it only solves 1 specific problem

CI (LiteLLM team)

CI status guideline:

  • 50-55 passing tests: main is stable with minor issues.
  • 45-49 passing tests: acceptable but needs attention.
  • <= 40 passing tests: unstable; be careful with your merges and assess the risk.
  • Branch creation CI run
    Link:

  • CI run for the last commit
    Link:

  • Merge / cherry-pick CI run
    Links:

Type

🆕 New Feature
🐛 Bug Fix
🧹 Refactoring
📖 Documentation
🚄 Infrastructure
✅ Test

Changes

For ollama_chat models, reasoning content is ignored after 2 consecutive thinking chunks.
vercel Bot commented Feb 9, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

| Project | Deployment | Actions | Updated (UTC) |
| --- | --- | --- | --- |
| litellm | Ready | Preview, Comment | Feb 9, 2026 11:21am |


greptile-apps Bot (Contributor) commented Feb 9, 2026

Greptile Overview

Greptile Summary

Fixed critical bug in Ollama chat streaming where reasoning_content (thinking chunks) was ignored after the first 2 consecutive chunks. The old code incorrectly set finished_reasoning_content = True after processing just 2 thinking chunks, causing all subsequent thinking content to be lost. The fix removes the flawed counter logic and only sets finished_reasoning_content = True when transitioning from thinking chunks to regular content chunks.

Key changes:

  • Removed the buggy conditional logic that limited thinking chunks to 2
  • Now all message.thinking chunks are properly returned as reasoning_content
  • Transition detection moved to the content handling branch to properly mark when reasoning ends
  • Added comprehensive test coverage with 4 test cases covering multiple thinking chunks, transitions, <think> tags, and done chunks
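The corrected branching can be sketched as a minimal, self-contained simulation of the streaming state machine. Note this is illustrative only: the class name `ReasoningChunkParser` and its `parse` method are hypothetical simplifications, not the actual `chunk_parser` code in `litellm/llms/ollama/chat/transformation.py`; only the two flag names mirror the PR description.

```python
class ReasoningChunkParser:
    """Illustrative sketch of the fixed Ollama chat streaming logic.

    The old code counted thinking chunks and flipped
    finished_reasoning_content after just 2 of them; the fix emits
    every thinking chunk and only marks reasoning as finished when a
    regular-content chunk arrives.
    """

    def __init__(self):
        self.started_reasoning_content = False
        self.finished_reasoning_content = False

    def parse(self, thinking=None, content=None):
        """Return a (reasoning_content, content) pair for one chunk."""
        if thinking is not None:
            # Every thinking chunk is surfaced as reasoning_content;
            # there is no counter that cuts it off after two chunks.
            self.started_reasoning_content = True
            return (thinking, None)
        if content is not None:
            # Transition: the first regular-content chunk after
            # thinking marks the end of the reasoning phase.
            if self.started_reasoning_content:
                self.finished_reasoning_content = True
            return (None, content)
        return (None, None)


parser = ReasoningChunkParser()
reasoning = [parser.parse(thinking=f"Chunk {i}")[0] for i in (1, 2, 3)]
answer = parser.parse(content="Answer")
```

Feeding three consecutive thinking chunks followed by a content chunk yields all three reasoning chunks (the third would have been dropped by the old logic) and then flips `finished_reasoning_content`.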

Confidence Score: 5/5

  • This PR is safe to merge with high confidence
  • The fix is straightforward and clearly addresses the reported issue. The old logic had an obvious bug limiting thinking chunks to 2, and the new logic correctly processes all thinking chunks. Comprehensive test coverage validates all edge cases including multiple chunks, transitions, tag parsing, and done chunks. The change is isolated to the streaming response iterator with no impact on other components.
  • No files require special attention

Important Files Changed

Filename Overview
litellm/llms/ollama/chat/transformation.py Fixed bug where reasoning_content was ignored after 2 consecutive thinking chunks - now all thinking chunks are properly captured
tests/test_litellm/llms/ollama/test_ollama_chat_transformation.py Added comprehensive test coverage for reasoning_content streaming with multiple thinking chunks and edge cases

Sequence Diagram

sequenceDiagram
    participant Client
    participant OllamaChatCompletionResponseIterator
    participant chunk_parser
    participant Delta

    Note over OllamaChatCompletionResponseIterator: started_reasoning_content = False<br/>finished_reasoning_content = False

    Client->>OllamaChatCompletionResponseIterator: chunk 1 (thinking: "Chunk 1")
    OllamaChatCompletionResponseIterator->>chunk_parser: Parse chunk
    chunk_parser->>chunk_parser: Check message.thinking != None
    chunk_parser->>chunk_parser: Set reasoning_content = "Chunk 1"
    chunk_parser->>chunk_parser: Set started_reasoning_content = True
    chunk_parser->>Delta: Create delta with reasoning_content
    Delta-->>Client: Return chunk with reasoning_content

    Client->>OllamaChatCompletionResponseIterator: chunk 2 (thinking: "Chunk 2")
    OllamaChatCompletionResponseIterator->>chunk_parser: Parse chunk
    chunk_parser->>chunk_parser: Check message.thinking != None
    chunk_parser->>chunk_parser: Set reasoning_content = "Chunk 2"
    chunk_parser->>Delta: Create delta with reasoning_content
    Delta-->>Client: Return chunk with reasoning_content

    Client->>OllamaChatCompletionResponseIterator: chunk 3 (thinking: "Chunk 3")
    OllamaChatCompletionResponseIterator->>chunk_parser: Parse chunk
    chunk_parser->>chunk_parser: Check message.thinking != None
    chunk_parser->>chunk_parser: Set reasoning_content = "Chunk 3"
    Note right of chunk_parser: OLD BUG: Would skip this<br/>because finished_reasoning_content<br/>was set to True after 2 chunks
    chunk_parser->>Delta: Create delta with reasoning_content
    Delta-->>Client: Return chunk with reasoning_content

    Client->>OllamaChatCompletionResponseIterator: chunk 4 (content: "Answer")
    OllamaChatCompletionResponseIterator->>chunk_parser: Parse chunk
    chunk_parser->>chunk_parser: Check message.content != None
    chunk_parser->>chunk_parser: Set finished_reasoning_content = True
    chunk_parser->>chunk_parser: Set content = "Answer"
    chunk_parser->>Delta: Create delta with content
    Delta-->>Client: Return chunk with content

greptile-apps Bot left a comment


2 files reviewed, no comments


DenisStefanAndrei (Contributor, Author) commented Feb 9, 2026

@Sameerlite, @ishaan-jaff, @krrishdholakia could you please take a look at this small fix? We really need it in our product. 🙏

@ghost ghost changed the base branch from main to litellm_oss_staging_02_09_2026 February 10, 2026 02:53
@ghost ghost merged commit 5adee48 into BerriAI:litellm_oss_staging_02_09_2026 Feb 10, 2026
10 of 13 checks passed
Sameerlite pushed a commit that referenced this pull request Feb 10, 2026
* Fix ollama_chat reasoning_context.

For ollama_chat models, reasoning context is ignored after 2 consecutive thinking chunks.

* add test
Sameerlite (Collaborator) commented Feb 18, 2026 via email

This pull request was closed.

Labels

None yet


2 participants