Proposal
Add a TransformerBridge adapter for RavenForCausalLM (tomg-group-umd Huginn), a depth-recurrent transformer that does latent reasoning by iterating a weight-tied core block a runtime-variable number of times.
Motivation
Huginn is the canonical open artifact of the latent-reasoning subfield: a prelude, followed by a weight-tied recurrent core, followed by a coda. The core block is applied a variable number of times at inference, refining the residual stream in latent space rather than emitting chain-of-thought tokens. None of the models TransformerLens supports shares weights across depth, so this is a genuinely new surface, and it is a rare candidate with independent mech-interp literature already targeting it (logit-/coda-lens probing across recurrence steps, latent-trajectory analysis). It is also an AI-safety priority: reasoning that happens in latent iterations bypasses legible CoT, which is exactly the regime hooks are built to inspect.
Gap scan (2026-06-25): ~15 models, ~39K downloads.
Scope note (higher-effort adapter)
The recurrence is custom control flow, not a static layer list: the number of core iterations is a runtime forward argument, the model re-injects the prelude embedding each step, and it uses a bespoke HuginnDynamicCache. The bridge's blocks.{i} assumption must be extended to expose the same core block across iterations as a hookable per-step stream. Remote-code loading itself is already a supported pattern (see openelm.py).
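To make the control flow concrete, here is a minimal pure-Python sketch of a depth-recurrent forward pass with one hook point per iteration. It is a toy, not Huginn's implementation: the arithmetic in prelude, core_block and coda, the zero initial state, and the hook name core.step{i}.hook_resid_post are all assumptions made for illustration.

```python
from typing import Callable, Dict, List, Optional

Vector = List[float]
Hook = Callable[[str, Vector], Optional[Vector]]


def prelude(x: Vector) -> Vector:
    # Stand-in for the prelude layers: embeds the input once.
    return [0.5 * v for v in x]


def core_block(state: Vector, injected: Vector) -> Vector:
    # Stand-in for the weight-tied core: the same function (same weights)
    # runs at every step, and the prelude output is re-injected each time.
    return [0.5 * s + e for s, e in zip(state, injected)]


def coda(state: Vector) -> Vector:
    # Stand-in for the coda layers that map the final state to outputs.
    return [2.0 * s for s in state]


def forward(x: Vector, num_steps: int, hook: Optional[Hook] = None) -> Vector:
    embedded = prelude(x)
    state = [0.0 for _ in embedded]  # toy initial latent state (assumption)
    for step in range(num_steps):  # iteration count is a runtime argument
        state = core_block(state, embedded)
        if hook is not None:
            # Hypothetical naming: one hook point per recurrence step,
            # even though every step executes the same block.
            replaced = hook(f"core.step{step}.hook_resid_post", state)
            if replaced is not None:
                state = replaced  # a hook may patch the per-step stream
    return coda(state)


cache: Dict[str, Vector] = {}


def save(name: str, value: Vector) -> None:
    # Caching hook: records a copy of each step's state, patches nothing.
    cache[name] = list(value)


out = forward([2.0, 4.0], num_steps=3, hook=save)
print(out)
print(sorted(cache))
```

The point of the sketch is that hook names are keyed by step rather than by layer index, which is the extension to the blocks.{i} assumption described above.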
Pitch
Map prelude / recurrent-core / coda, and expose the core block's residual stream per iteration so researchers can run cross-step logit lens, fixed-point/convergence analysis, and cross-step activation patching.
- Claude Code users can scaffold with /add-model-support tomg-group-umd/huginn-0125.
- Register at the four sites listed in contributing.md.
- Verify: tomg-group-umd/huginn-0125 (~3.5B; the single released checkpoint).
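As a rough illustration of the cross-step analyses named in the pitch, the sketch below runs a toy logit lens and a convergence check over per-iteration states. The states and the unembedding matrix are made-up numbers, not real Huginn activations, and a real coda lens would presumably pass each state through the coda and final norm before unembedding; that step is omitted here.

```python
import math
from typing import Dict, List

Vector = List[float]

# Hypothetical per-step residual states, as a per-iteration cache might
# hold them (toy numbers chosen to approach the fixed point [2.0, 4.0]).
states: Dict[int, Vector] = {
    0: [1.0, 2.0],
    1: [1.5, 3.0],
    2: [1.75, 3.5],
    3: [1.875, 3.75],
}

# Toy unembedding: one row per vocabulary item (assumption).
unembed: List[Vector] = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.5]]


def logits(state: Vector) -> Vector:
    # Project a residual state onto every vocabulary row.
    return [sum(w * s for w, s in zip(row, state)) for row in unembed]


def top_token(state: Vector) -> int:
    # Index of the highest-scoring vocabulary item for this state.
    scores = logits(state)
    return max(range(len(scores)), key=scores.__getitem__)


def step_deltas(per_step: Dict[int, Vector]) -> List[float]:
    # L2 distance between consecutive recurrence steps; shrinking values
    # suggest the latent state is settling toward a fixed point.
    steps = sorted(per_step)
    return [math.dist(per_step[a], per_step[b]) for a, b in zip(steps, steps[1:])]


lens = [top_token(states[s]) for s in sorted(states)]
deltas = step_deltas(states)
print(lens)
print(deltas)
```

Cross-step activation patching would use the same per-step indexing: replace the state at one step with a state cached from another run and compare the downstream outputs.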
Additional context
hf_scraper architecture-gaps pass (2026-06-25).
Checklist