Pull requests: ROCm/Megatron-LM
forked from NVIDIA/Megatron-LM
fix: set missing qkv_bias for Mixtral HF model during conversion
#62 opened Feb 17, 2025 by mpashkovskii • Draft
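For context on what the "qkv_bias" here refers to: Megatron keeps attention biases in a single fused QKV tensor, while Hugging Face checkpoints store separate q/k/v projection biases, so a converter that only copies weights can silently drop the bias. The sketch below is illustrative only; the per-KV-group interleaved layout and the helper name are assumptions, not the converter code from this PR.

```python
# Illustrative only: splitting a Megatron-style fused QKV bias into the
# separate q/k/v biases a Hugging Face Mixtral checkpoint would expect.
# The interleaved-per-group layout and helper name are assumptions.
import torch


def split_qkv_bias(fused_bias: torch.Tensor,
                   num_heads: int, num_kv_heads: int, head_dim: int):
    """Split a fused QKV bias laid out per KV group as [q .. q, k, v]."""
    heads_per_group = num_heads // num_kv_heads
    group = fused_bias.view(num_kv_heads, (heads_per_group + 2) * head_dim)
    q_end = heads_per_group * head_dim
    q_bias = group[:, :q_end].reshape(-1)                  # [num_heads * head_dim]
    k_bias = group[:, q_end:q_end + head_dim].reshape(-1)  # [num_kv_heads * head_dim]
    v_bias = group[:, q_end + head_dim:].reshape(-1)       # [num_kv_heads * head_dim]
    return q_bias, k_bias, v_bias
```

In a conversion loop, the three pieces would then be written to the q_proj.bias / k_proj.bias / v_proj.bias entries of the target state dict, assuming the HF model was instantiated with attention bias enabled.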
feat: add Grok-1 transformer layer and training scripts
#55 opened Feb 7, 2025 by mpashkovskii • Draft
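Grok-1 is a mixture-of-experts architecture, reportedly 8 experts with 2 active per token. The block below is a plain-PyTorch sketch of what such a "Grok-1 transformer layer" involves; it is not the Megatron implementation added by this PR, the sizes are placeholders, LayerNorm stands in for RMSNorm, and causal masking is omitted.

```python
# Plain-PyTorch sketch of a Grok-1-style MoE transformer block (top-2 routing).
# Illustrative only; not the PR's Megatron implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F


class Top2MoE(nn.Module):
    def __init__(self, hidden: int, ffn: int, num_experts: int = 8):
        super().__init__()
        self.router = nn.Linear(hidden, num_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(hidden, ffn), nn.GELU(), nn.Linear(ffn, hidden))
            for _ in range(num_experts)
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: [tokens, hidden]; each token goes to its two highest-scoring experts.
        weights, idx = torch.topk(F.softmax(self.router(x), dim=-1), k=2, dim=-1)
        weights = weights / weights.sum(dim=-1, keepdim=True)
        out = torch.zeros_like(x)
        for e, expert in enumerate(self.experts):
            for slot in range(2):
                mask = idx[:, slot] == e
                if mask.any():
                    out[mask] += weights[mask, slot].unsqueeze(-1) * expert(x[mask])
        return out


class GrokStyleLayer(nn.Module):
    def __init__(self, hidden: int = 1024, heads: int = 8, ffn: int = 4096):
        super().__init__()
        self.attn_norm = nn.LayerNorm(hidden)
        self.attn = nn.MultiheadAttention(hidden, heads, batch_first=True)
        self.moe_norm = nn.LayerNorm(hidden)
        self.moe = Top2MoE(hidden, ffn)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Pre-norm residual attention followed by a pre-norm residual MoE MLP.
        h = self.attn_norm(x)
        attn_out, _ = self.attn(h, h, h, need_weights=False)
        x = x + attn_out
        flat = self.moe_norm(x).reshape(-1, x.size(-1))
        return x + self.moe(flat).reshape_as(x)
```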
feat: add LoRA adapter layer and Mixtral LoRA training
#53 opened Jan 31, 2025 by mpashkovskii
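LoRA freezes the pretrained weight and learns a low-rank update ΔW = B·A scaled by α/r, so only a small number of adapter parameters are trained. A minimal plain-PyTorch sketch is shown below; the class and argument names are illustrative and this is not the Megatron adapter added by the PR.

```python
# Minimal LoRA adapter sketch: frozen base linear plus a low-rank update
# B @ A scaled by alpha / r. Names and defaults are illustrative.
import math
import torch
import torch.nn as nn


class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, r: int = 8, alpha: int = 16, dropout: float = 0.0):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)   # freeze the pretrained weight
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        self.lora_a = nn.Parameter(torch.empty(r, base.in_features))
        self.lora_b = nn.Parameter(torch.zeros(base.out_features, r))  # B starts at zero
        nn.init.kaiming_uniform_(self.lora_a, a=math.sqrt(5))
        self.scaling = alpha / r
        self.dropout = nn.Dropout(dropout)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Base projection plus the low-rank residual path.
        lora = self.dropout(x) @ self.lora_a.T @ self.lora_b.T
        return self.base(x) + self.scaling * lora
```

For Mixtral-style LoRA training, such wrappers would typically be applied to the attention projections while the base and expert weights stay frozen.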
[Perf] Skip creating attention mask in llama dataloader
#40 opened Dec 13, 2024 by billishyahao
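The optimization behind this title: for ordinary causal language-model batches, the attention mask is the same lower-triangular pattern for every sample, so materializing a [1, 1, S, S] boolean tensor in the dataloader only costs memory and host-to-device transfer; a fused or flash-attention path can apply causality itself. Below is a hedged sketch of a collate function with such a switch; the field names and mask convention are illustrative, not the PR's code.

```python
# Sketch of skipping attention-mask creation in a causal-LM dataloader.
# Field names and the mask convention (True = attend) are illustrative.
import torch


def collate_causal_lm(samples, create_attention_mask: bool = False):
    tokens = torch.stack([s["tokens"] for s in samples])   # [B, S]
    labels = torch.stack([s["labels"] for s in samples])   # [B, S]
    batch = {"tokens": tokens, "labels": labels}
    if create_attention_mask:
        # Materialized mask: [1, 1, S, S] lower-triangular, broadcast over the batch.
        seq_len = tokens.size(1)
        batch["attention_mask"] = torch.tril(
            torch.ones(1, 1, seq_len, seq_len, dtype=torch.bool)
        )
    else:
        # Skip the allocation; downstream attention applies causal masking itself.
        batch["attention_mask"] = None
    return batch
```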