
Conversation

@sarahegler
Contributor

Add support for CoT in the gptoss renderer based on https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers and https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot

An example like this:

{"role": "user", "content": "What is 2+2?"},
{
    "role": "assistant",
    "thinking": "The user wants me to calculate 2+2. Let me compute: 2+2 = 4.",
    "content": "The answer is 4.",
},

Now renders to the harmony format like this:

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-10-28

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>What is 2+2?<|end|><|start|>assistant<|channel|>analysis<|message|>The user wants me to calculate 2+2. Let me compute: 2+2 = 4.<|end|><|channel|>final<|message|>The answer is 4.<|return|>

Left commentary channel / tooling for a separate PR.

@sarahegler sarahegler marked this pull request as draft October 28, 2025 23:35
@sarahegler sarahegler marked this pull request as ready for review October 28, 2025 23:45
@joschu joschu requested review from kzl-tml and opherlieber October 29, 2025 06:34
@opherlieber
Contributor

opherlieber commented Oct 29, 2025

Thanks for the PR!
I think each channel should be formatted as a new assistant message, so the <|start|>assistant is missing from the final channel, i.e. in your example it should be:

<|start|>assistant<|channel|>analysis<|message|>The user wants me to calculate 2+2. Let me compute: 2+2 = 4.<|end|><|start|>assistant<|channel|>final<|message|>The answer is 4.<|return|>

That's according to this, and it's also what I get from the HF chat template for this example. (If you can also add a version of test_against_hf_chat_templates that tests the thinking field, that would be great too.)

Also, if you can add an assert in the other render_message functions that the new message.thinking field is None, at least until we support it there too. (It may also be worth asserting that it's None in gpt-oss when the message is a user message.)
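
A rough sketch of both suggestions, using a plain dict in place of the Message type; render_gptoss_assistant and render_message_without_cot are hypothetical names, not the PR's actual functions (a user-message assert in gpt-oss would be analogous):

def render_gptoss_assistant(message: dict) -> str:
    # Each harmony channel is rendered as its own assistant message.
    parts = []
    if message.get("thinking"):
        parts.append(
            f"<|start|>assistant<|channel|>analysis<|message|>{message['thinking']}<|end|>"
        )
    parts.append(
        f"<|start|>assistant<|channel|>final<|message|>{message['content']}<|return|>"
    )
    return "".join(parts)

def render_message_without_cot(message: dict) -> str:
    # Renderers that don't support CoT yet should reject a populated thinking
    # field rather than silently dropping it.
    assert message.get("thinking") is None, "thinking is not supported by this renderer"
    ...  # existing rendering logic unchanged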

@joschu
Collaborator

joschu commented Oct 30, 2025

According to @opherlieber's comment, the gpt-oss thinking block is in its own Message. So are you sure we want thinking as a separate attribute of the Message, as opposed to just using the content field?

That said, having thinking as a separate field of the Message could actually make sense for models like the Qwen3 ones, where every message includes thinking followed by text. Currently, for these models, the content field includes the tags rendered as text, which requires parsing the string and could lead to prompt injection issues.
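
A minimal illustration of the two representations, assuming Qwen3-style <think>...</think> tags when the reasoning is embedded in content (the exact tag format is an assumption here):

import re

# With thinking embedded in the content string, it has to be parsed back out,
# and user-supplied text containing literal tags can masquerade as reasoning.
content_with_tags = "<think>Compute 2+2 = 4.</think>The answer is 4."
match = re.match(r"<think>(.*?)</think>(.*)", content_with_tags, flags=re.DOTALL)
thinking, reply = match.groups() if match else (None, content_with_tags)

# With a separate field, no string parsing is needed and the boundary between
# reasoning and visible text is unambiguous.
message = {
    "role": "assistant",
    "thinking": "Compute 2+2 = 4.",
    "content": "The answer is 4.",
}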

@joschu
Collaborator

joschu commented Oct 30, 2025

It's worth looking at other libraries like https://github.com/UKGovernmentBEIS/inspect_ai, which work with various models with different thinking conventions, to see how they deal with these issues.

@opherlieber
Contributor

I think it might make sense to have the channel as a separate message in this specific gpt-oss case, but it might still be preferable to keep it in a single message as the current PR does, since:

  • It allows the same thinking message format to be supported for other models, as you wrote.
  • It matches the format of the HF chat template, i.e. a single message with 'content' and 'thinking' fields (see the sketch after this list).
  • The harmony spec says to drop the analysis/thinking from all but the last assistant message, which complicates things and can be confusing if the thinking is a separate message rather than part of the last one.
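
A hedged way to check the second point against the reference, assuming the openai/gpt-oss-20b tokenizer is available and that its chat template accepts a "thinking" key on assistant messages, as the cookbook articles describe:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

messages = [
    {"role": "user", "content": "What is 2+2?"},
    {
        "role": "assistant",
        "thinking": "The user wants me to calculate 2+2. Let me compute: 2+2 = 4.",
        "content": "The answer is 4.",
    },
]

# Render without tokenizing so the raw harmony string can be compared with the
# renderer's output (and used in a test like test_against_hf_chat_templates).
print(tokenizer.apply_chat_template(messages, tokenize=False))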

@sarahegler
Contributor Author

Thanks for the review @opherlieber @joschu! I was probably oversimplifying the gpt-oss case a bit. I based this on the gpt-oss chat template and fine-tuning cookbook, where thinking is a separate field within the message, parallel to content. OpenRouter also has a reasoning_details field in parallel to the content field in the message.

It looks like Inspect used to do the same, but switched to defining a ContentReasoning type alongside the other Content blocks (ContentText, ContentToolUse, ContentImage, etc.). (They changed it to support multiple fields in the reasoning content, as with Anthropic's API.) In that case, the reasoning is still part of the same message, just a separate content block. Happy to try something more like that, as 1) I like the extensibility for things like multimodal content, 2) user messages won't have a "thinking" field, so it might make more sense not to define that explicitly on the message (or else to define a separate subclass for assistant messages, as Inspect also does), and 3) like you said, it might fit nicely with some of the other models where the tags are currently part of the content field.

So for now my thinking is to either 1) keep reasoning in a separate field like this, or 2) define a separate Content block type similar to Inspect and allow the content field to be an array of such blocks (sketched below), but let me know which you prefer! Happy to also include changes for the other renderers in this PR if it makes sense to see it across the board before approving.
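
For reference, a hypothetical sketch of option 2, loosely following Inspect's content-block approach; the names below are illustrative, not this library's actual API:

from dataclasses import dataclass
from typing import Union

@dataclass
class ContentText:
    text: str

@dataclass
class ContentReasoning:
    reasoning: str

ContentBlock = Union[ContentText, ContentReasoning]

@dataclass
class Message:
    role: str
    # Plain strings stay supported; richer messages use a list of blocks.
    content: Union[str, list[ContentBlock]]

msg = Message(
    role="assistant",
    content=[
        ContentReasoning(reasoning="The user wants me to calculate 2+2. Let me compute: 2+2 = 4."),
        ContentText(text="The answer is 4."),
    ],
)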

@sarahegler
Contributor Author

jinx, what @opherlieber said :)

@opherlieber opherlieber merged commit 15bedd0 into thinking-machines-lab:main Oct 31, 2025
2 checks passed