Reject assisted generation for LFM2 and LFM2-MoE (set _is_stateful) by Sunt-ing · Pull Request #46937 · huggingface/transformers

Sunt-ing · 2026-06-27T18:45:44Z

What does this PR do?

LFM2 and LFM2-MoE are conv/attention hybrids that keep recurrent conv state, but they inherited the default _is_stateful = False from Llama. Assisted and prompt-lookup decoding are therefore not rejected for them, and because their conv state cannot be rolled back during speculative verification, they silently produce tokens that diverge from greedy instead of raising the clear error the other stateful models raise:

ValueError: assisted generation is not supported with stateful models, such as Lfm2ForCausalLM
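
To illustrate why a conv state is harder to roll back than an attention cache, here is a toy sketch. It assumes the conv state behaves like a fixed-size rolling window over recent tokens, which is a simplification and not the actual LFM2 cache implementation; `ToyKVCache`, `ToyConvState`, and `run` are hypothetical names introduced only for this example.

```python
from collections import deque


class ToyKVCache:
    """Attention-style cache: one entry per token, so it can be cropped."""

    def __init__(self):
        self.entries = []

    def update(self, token):
        self.entries.append(token)

    def crop(self, length):
        # Dropping rejected draft tokens is just truncation.
        self.entries = self.entries[:length]


class ToyConvState:
    """Conv-style state (assumed): only the last `width` tokens are kept."""

    def __init__(self, width=3):
        self.window = deque(maxlen=width)

    def update(self, token):
        # Appending to a full window discards the oldest token for good.
        self.window.append(token)


def run(tokens, width=3):
    """Feed tokens through both toy caches and return them."""
    kv, conv = ToyKVCache(), ToyConvState(width)
    for t in tokens:
        kv.update(t)
        conv.update(t)
    return kv, conv


accepted = [10, 11, 12, 13]
rejected_drafts = [98, 99]

# Reference state: only the accepted tokens were ever processed.
kv_ref, conv_ref = run(accepted)

# Speculative state: drafts were processed during verification, then rejected.
kv_spec, conv_spec = run(accepted + rejected_drafts)
kv_spec.crop(len(accepted))  # the KV cache can be rolled back by cropping

kv_matches = kv_spec.entries == kv_ref.entries
conv_matches = list(conv_spec.window) == list(conv_ref.window)
print("KV cache restored:", kv_matches)
print("conv state restored:", conv_matches)
```

In this toy, the cropped KV cache matches the reference, while the conv window still holds the rejected drafts and has already lost older accepted tokens. Restoring it would require a snapshot taken before verification, which is the kind of rollback support this PR does not add.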

Speculative decoding is supposed to be lossless (token-identical to greedy), so silently diverging is a correctness bug. This PR sets _is_stateful = True on both models so the existing guard rejects assisted/prompt-lookup decoding cleanly. (This does not add speculative-decoding support for LFM2, which would require rolling back the conv cache; it makes the unsupported path fail loudly instead of silently.)
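
The effect of the flag can be sketched with toy classes. This is a minimal illustration of the guard pattern, assuming the check simply tests a class attribute before entering an assisted path; `ToyGenerationMixin` and the `Toy*` model classes are hypothetical and do not reproduce the actual transformers code.

```python
class ToyGenerationMixin:
    # Default inherited by every model that does not override it.
    _is_stateful = False

    def generate(self, prompt, assistant_model=None, prompt_lookup_num_tokens=None):
        wants_assisted = (
            assistant_model is not None or prompt_lookup_num_tokens is not None
        )
        # Guard: stateful models must not enter the assisted path.
        if wants_assisted and self._is_stateful:
            raise ValueError(
                "assisted generation is not supported with stateful models, "
                f"such as {self.__class__.__name__}"
            )
        return "assisted" if wants_assisted else "greedy"


class ToyLlama(ToyGenerationMixin):
    """Stateless base model; keeps the default flag."""


class ToyLfm2Before(ToyLlama):
    """Before the fix: inherits _is_stateful = False, so the guard is skipped."""


class ToyLfm2After(ToyLlama):
    """After the fix: the flag is set, so the guard fires."""

    _is_stateful = True


print(ToyLfm2Before().generate("hi", prompt_lookup_num_tokens=2))
try:
    ToyLfm2After().generate("hi", prompt_lookup_num_tokens=2)
except ValueError as err:
    print(err)
```

Plain greedy generation is unaffected in the sketch: `ToyLfm2After().generate("hi")` still returns normally, since the guard only applies when an assisted option is requested.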

Reproduction (real LFM2-1.2B, fp32) and before/after

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the real checkpoint in fp32 so that low-precision ties cannot explain a divergence.
tok = AutoTokenizer.from_pretrained("LiquidAI/LFM2-1.2B")
m = AutoModelForCausalLM.from_pretrained("LiquidAI/LFM2-1.2B", dtype=torch.float32).eval()
ids = tok("The capital of France is Paris, and the capital of Germany is", return_tensors="pt").input_ids

# Reference: plain greedy decoding.
greedy = m.generate(ids, max_new_tokens=40, do_sample=False)
# Prompt-lookup decoding; if lossless, it must reproduce the greedy tokens exactly.
lookup = m.generate(ids, max_new_tokens=40, do_sample=False, prompt_lookup_num_tokens=2)

Before this PR, _is_stateful is False. Greedy decoding is deterministic, but prompt-lookup decoding diverges from greedy for every value of prompt_lookup_num_tokens, producing completely different tokens (the run is fp32, so the difference is not a floating-point tie). After this PR, prompt-lookup decoding (and any assistant_model) raises the ValueError above, matching the other stateful hybrids (Falcon-H1, Qwen3.5, mamba2, ...).
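
A small helper can locate where two token sequences first disagree. This is a sketch for inspecting the reproduction output, not part of the PR; `first_divergence` is a hypothetical name and it works on plain Python lists of token ids.

```python
def first_divergence(a, b):
    """Return the index of the first differing token, or None if identical."""
    for i, (x, y) in enumerate(zip(a, b)):
        if x != y:
            return i
    # One sequence is a strict prefix of the other.
    if len(a) != len(b):
        return min(len(a), len(b))
    return None


# Toy token ids standing in for real generate() outputs.
print(first_divergence([5, 6, 7, 8], [5, 6, 9, 8]))
print(first_divergence([5, 6, 7], [5, 6, 7]))
```

With the reproduction above, it could be called as `first_divergence(greedy[0].tolist(), lookup[0].tolist())`, assuming both outputs are the usual batch-first tensors of token ids. A result of `None` would mean prompt-lookup decoding matched greedy.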

A regression test is added for each model (Lfm2ModelTest / Lfm2MoeModelTest ::test_assisted_generation_rejected_as_stateful), asserting prompt-lookup decoding raises a stateful-model error. Each fails on main (decoding silently runs) and passes with this fix. The lfm2 model test file passes (137 passed, 122 skipped); ruff is clean. The fix edits the modular files and regenerates the modeling files.
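
The shape of such a regression test might look like the sketch below. It is not the test added in this PR: `stub_generate` is a hypothetical stand-in for `model.generate` on a stateful model, so the example runs without downloading a checkpoint, and the real test presumably exercises an actual small LFM2 model instead.

```python
import unittest


def stub_generate(prompt_lookup_num_tokens=None):
    """Hypothetical stand-in for model.generate on a stateful model."""
    if prompt_lookup_num_tokens is not None:
        raise ValueError(
            "assisted generation is not supported with stateful models, "
            "such as StubModel"
        )
    return [0]  # placeholder for greedy output


class AssistedRejectedTest(unittest.TestCase):
    def test_assisted_generation_rejected_as_stateful(self):
        # Prompt-lookup decoding must fail loudly with a stateful-model error.
        with self.assertRaisesRegex(ValueError, "stateful"):
            stub_generate(prompt_lookup_num_tokens=2)
        # Plain greedy generation must keep working.
        self.assertEqual(stub_generate(), [0])


def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(AssistedRejectedTest)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_tests()
```

Matching on the word "stateful" rather than the full message keeps the assertion robust to the model class name that appears at the end of the error.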

Code Agent Policy

I confirm that this is not a pure code agent PR.

Before submitting

This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
Did you read the contributor guideline and the Pull Request checks?
Was this discussed/approved via a Github issue or the forum? Please add a link to it if that's the case.
Did you make sure to update the documentation with your changes according to the guidelines?
Did you write any new necessary tests?

Who can review?

@Cyrilvallez

LFM2 and LFM2-MoE are conv/attention hybrids that keep recurrent conv state, but inherited the default _is_stateful = False from Llama. Assisted and prompt-lookup decoding were therefore not rejected, and since their conv state cannot be rolled back during speculative verification they silently produced wrong tokens (divergent from greedy) instead of raising the clear "assisted generation is not supported with stateful models" error the other stateful models raise. Set _is_stateful = True on both.

github-actions · 2026-06-27T18:46:52Z

[For maintainers] Suggested jobs to run (before merge)

run-slow: lfm2, lfm2_moe

github-actions · 2026-06-27T18:53:25Z

CI Dashboard: View test results in Grafana

Sunt-ing wants to merge 1 commit into huggingface:main from Sunt-ing:6
