
[Bug] ChatAdapter fails to parse field markers not separated by newlines #8901

@enitrat


What happened?

Description

The ChatAdapter.parse() method incorrectly handles field markers ([[ ## field_name ## ]]) when they appear mid-line rather than at the start of a line. The parser then includes subsequent markers and their content as part of the previous field's value, and ultimately fails to parse the full output because a field appears to be missing.

Root Cause

The parsing logic in dspy/adapters/chat_adapter.py:169 uses field_header_pattern.match(), which only checks whether a line starts with the field marker pattern. When a marker appears anywhere else in the line, it is not detected and gets included in the previous field's content.
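
To illustrate, here is a minimal sketch of the difference. The regex below approximates the field_header_pattern in chat_adapter.py; the exact pattern in the source may differ:

  import re

  # Approximation of ChatAdapter's field_header_pattern (assumed, not copied from source).
  field_header_pattern = re.compile(r"\[\[ ## (\w+) ## \]\]")

  line = "Current Weather[[ ## answer ## ]]"

  # match() anchors at the start of the line, so a mid-line marker is missed
  # and the whole line is treated as content of the previous field.
  print(field_header_pattern.match(line))             # None

  # search() (or finditer()) would detect the marker anywhere in the line.
  print(field_header_pattern.search(line).group(1))   # 'answer'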

Expected Behavior

Field markers should be recognized anywhere in a line, not just at the beginning. The parser should correctly split content when multiple markers appear on the same line.
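
As a rough sketch of what this could look like (illustrative only, not a proposed patch; split_fields and the pattern below are not the actual ChatAdapter internals):

  import re

  field_header_pattern = re.compile(r"\[\[ ## (\w+) ## \]\]")

  def split_fields(completion: str) -> dict[str, str]:
      # Splitting on a pattern with a capturing group yields
      # [prefix, name1, body1, name2, body2, ...], regardless of where
      # the markers sit within a line.
      parts = field_header_pattern.split(completion)
      return {name: body.strip() for name, body in zip(parts[1::2], parts[2::2])}

  text = (
      "[[ ## topic ## ]]\n"
      "Current Weather[[ ## answer ## ]]\n"
      "The current weather is sunny.[[ ## completed ## ]]"
  )
  print(split_fields(text))
  # {'topic': 'Current Weather', 'answer': 'The current weather is sunny.', 'completed': ''}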

Actual Behavior

When a field marker appears in the middle of a line (e.g., Current Weather[[ ## answer ## ]]), the marker and subsequent content are included as part of the previous field's value.

Impact

This can occur when:

  • A field marker (or the [[ ## completed ## ]] marker) appears on the same line as the preceding field's value
  • LLMs generate compact responses without newlines between fields (quite often, in my experience, with gemini/gemini-flash-latest). This forces a fallback to the JSONAdapter, which is terrible for end-to-end latency.

Environment

  • DSPy version: latest (main branch)
  • Affected component: dspy/adapters/chat_adapter.py

Steps to reproduce

  import dspy
  from unittest import mock
  from litellm.utils import Choices, Message, ModelResponse

  def test_chat_adapter_inline_markers():
      # LM response with markers on the same line as content
      MOCK_ANSWER = """
  [[ ## reasoning ## ]]
  The user asked a question about the weather. The topic is the current weather.
  [[ ## topic ## ]]
  Current Weather[[ ## answer ## ]]
  The current weather is sunny.[[ ## completed ## ]]
  """

      class MySignature(dspy.Signature):
          question: str = dspy.InputField()
          reasoning: str = dspy.OutputField()
          topic: str = dspy.OutputField()
          answer: str = dspy.OutputField()

      adapter = dspy.ChatAdapter()
      with mock.patch("litellm.completion") as mock_completion:
          mock_completion.return_value = ModelResponse(
              choices=[Choices(message=Message(content=MOCK_ANSWER))],
              model="openai/gpt-4o-mini",
          )
          result = adapter(
              dspy.LM(model="openai/gpt-4o-mini", cache=False),
              {},
              MySignature,
              [],
              {"question": "What is the weather?"}
          )

      # This fails with the bug:
      # Expected: "Current Weather"
      # Actual: "Current Weather[[ ## answer ## ]]\nThe current weather is sunny.[[ ## completed ## ]]"
      assert result[0]["topic"] == "Current Weather"
      assert result[0]["answer"] == "The current weather is sunny."

DSPy version

3.0.3

Labels

bug (Something isn't working)