DFlash speculative decoding for MiniMax-M2.7 (FSDP2): auto mask-token, FSDP2 resume fixes, per-checkpoint draft export #1621
Open
yeyu-nvidia wants to merge 6 commits into