
Conversation

@sarahegler
Contributor

Add support for CoT in the gptoss renderer based on https://cookbook.openai.com/articles/gpt-oss/fine-tune-transfomers and https://cookbook.openai.com/articles/gpt-oss/handle-raw-cot

An example like this:

{"role": "user", "content": "What is 2+2?"},
{
    "role": "assistant",
    "thinking": "The user wants me to calculate 2+2. Let me compute: 2+2 = 4.",
    "content": "The answer is 4.",
},

Now renders to the harmony format like this:

<|start|>system<|message|>You are ChatGPT, a large language model trained by OpenAI.
Knowledge cutoff: 2024-06
Current date: 2025-10-28

Reasoning: medium

# Valid channels: analysis, commentary, final. Channel must be included for every message.<|end|><|start|>user<|message|>What is 2+2?<|end|><|start|>assistant<|channel|>analysis<|message|>The user wants me to calculate 2+2. Let me compute: 2+2 = 4.<|end|><|channel|>final<|message|>The answer is 4.<|return|>

Left commentary channel / tooling for a separate PR.

@sarahegler sarahegler marked this pull request as draft October 28, 2025 23:35
@sarahegler sarahegler marked this pull request as ready for review October 28, 2025 23:45
@joschu joschu requested review from kzl-tml and opherlieber October 29, 2025 06:34
@opherlieber
Contributor

opherlieber commented Oct 29, 2025

Thanks for the PR!
I think each channel should be formatted as a new assistant message, so the <|start|>assistant is missing from the final channel, i.e. in your example it should be:

<|start|>assistant<|channel|>analysis<|message|>The user wants me to calculate 2+2. Let me compute: 2+2 = 4.<|end|><|start|>assistant<|channel|>final<|message|>The answer is 4.<|return|>

That's according to this, and it's also what I get from the HF chat template for this example. (If you can also add a version of test_against_hf_chat_templates that tests the thinking field, that would be great too.)

Also, if you can add an assert in the other render_message functions that the new message.thinking field is None, at least until we support it there too. (It may also be worth asserting that it's None in gpt-oss when the message is a user message.)
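
A rough sketch of both suggestions, using a plain dict in place of the Message type; render_gptoss_assistant and render_message_without_cot are hypothetical names, not the PR's actual functions (a user-message assert in gpt-oss would be analogous):

def render_gptoss_assistant(message: dict) -> str:
    # Each harmony channel is rendered as its own assistant message.
    parts = []
    if message.get("thinking"):
        parts.append(
            f"<|start|>assistant<|channel|>analysis<|message|>{message['thinking']}<|end|>"
        )
    parts.append(
        f"<|start|>assistant<|channel|>final<|message|>{message['content']}<|return|>"
    )
    return "".join(parts)

def render_message_without_cot(message: dict) -> str:
    # Renderers that don't support CoT yet should reject a populated thinking
    # field rather than silently dropping it.
    assert message.get("thinking") is None, "thinking is not supported by this renderer"
    ...  # existing rendering logic unchanged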

@joschu
Collaborator

joschu commented Oct 30, 2025

According to @opherlieber's comment, the gpt-oss thinking block is in its own Message. So are you sure we want thinking as a separate attribute of the Message, as opposed to just using the content field?

That said, having thinking as a separate field of the Message could actually make sense for models like the Qwen3 ones, where every message includes thinking followed by text. Currently, for these models, the content field includes the tags rendered as text, which requires parsing the string and could lead to prompt injection issues.
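
A minimal illustration of the two representations, assuming Qwen3-style <think>...</think> tags when the reasoning is embedded in content (the exact tag format is an assumption here):

import re

# With thinking embedded in the content string, it has to be parsed back out,
# and user-supplied text containing literal tags can masquerade as reasoning.
content_with_tags = "<think>Compute 2+2 = 4.</think>The answer is 4."
match = re.match(r"<think>(.*?)</think>(.*)", content_with_tags, flags=re.DOTALL)
thinking, reply = match.groups() if match else (None, content_with_tags)

# With a separate field, no string parsing is needed and the boundary between
# reasoning and visible text is unambiguous.
message = {
    "role": "assistant",
    "thinking": "Compute 2+2 = 4.",
    "content": "The answer is 4.",
}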

@joschu
Collaborator

joschu commented Oct 30, 2025

It's worth looking at other libraries like https://github.com/UKGovernmentBEIS/inspect_ai, which work with various models with different thinking conventions, to see how they deal with these issues.

@opherlieber
Contributor

I think it might make sense to have the channel as a separate message in this specific gpt-oss case, but it might still be preferable to keep it in a single message as the current PR does, since:

  • It allows the same thinking message format to be supported for other models, as you wrote.
  • It matches the format of the HF chat template, i.e. a single message with 'content' and 'thinking' fields (see the sketch after this list).
  • The harmony spec says to drop the analysis/thinking from all but the last assistant message, which complicates things and can be confusing if the thinking is a separate message rather than part of the last one.
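
A hedged way to check the second point against the reference, assuming the openai/gpt-oss-20b tokenizer is available and that its chat template accepts a "thinking" key on assistant messages, as the cookbook articles describe:

from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("openai/gpt-oss-20b")

messages = [
    {"role": "user", "content": "What is 2+2?"},
    {
        "role": "assistant",
        "thinking": "The user wants me to calculate 2+2. Let me compute: 2+2 = 4.",
        "content": "The answer is 4.",
    },
]

# Render without tokenizing so the raw harmony string can be compared with the
# renderer's output (and used in a test like test_against_hf_chat_templates).
print(tokenizer.apply_chat_template(messages, tokenize=False))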

@sarahegler
Contributor Author

Thanks for the review @opherlieber @joschu! I was probably oversimplifying the gpt-oss case a bit. I based this on the gpt-oss chat template and fine-tuning cookbook, where thinking is a separate field within the message, parallel to content. OpenRouter also has a reasoning_details field in parallel to the content field in the message.

It looks like Inspect used to do the same, but switched to defining a ContentReasoning type alongside the other Content blocks (ContentText, ContentToolUse, ContentImage, etc.). (They changed it to support multiple fields in the reasoning content, as with Anthropic's API.) In that case, the reasoning is still part of the same message, just a separate content block. Happy to try something more like that, as 1) I like the extensibility for things like multimodal content, 2) user messages won't have a "thinking" field, so it might make more sense not to define that explicitly on the message (or else to define a separate subclass for assistant messages, as Inspect also does), and 3) like you said, it might fit nicely with some of the other models where the tags are currently part of the content field.

So for now my thinking is to either 1) keep reasoning in a separate field like this, or 2) define a separate Content block type similar to Inspect and allow the content field to be an array of such blocks (sketched below), but let me know which you prefer! Happy to also include changes for the other renderers in this PR if it makes sense to see it across the board before approving.
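
For reference, a hypothetical sketch of option 2, loosely following Inspect's content-block approach; the names below are illustrative, not this library's actual API:

from dataclasses import dataclass
from typing import Union

@dataclass
class ContentText:
    text: str

@dataclass
class ContentReasoning:
    reasoning: str

ContentBlock = Union[ContentText, ContentReasoning]

@dataclass
class Message:
    role: str
    # Plain strings stay supported; richer messages use a list of blocks.
    content: Union[str, list[ContentBlock]]

msg = Message(
    role="assistant",
    content=[
        ContentReasoning(reasoning="The user wants me to calculate 2+2. Let me compute: 2+2 = 4."),
        ContentText(text="The answer is 4."),
    ],
)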

@sarahegler
Contributor Author

jinx, what @opherlieber said :)

@opherlieber opherlieber merged commit 15bedd0 into thinking-machines-lab:main Oct 31, 2025
2 checks passed