From NVIDIA Megatron-LM for visibility #18

Open
wants to merge 4,603 commits into
base: multi-query-attention
from

Conversation

RaymondLi0
Collaborator

No description provided.

RaymondLi0 changed the base branch from multi-query-attention to before-merge June 20, 2023 20:12
RaymondLi0 changed the base branch from before-merge to multi-query-attention June 20, 2023 20:12
ko3n1g and others added 28 commits April 15, 2025 18:18
Fix `post_training/test_get_gpt_modelopt_spec_interface`

See merge request ADLR/megatron-lm!3118
Remove legacy bert tests

See merge request ADLR/megatron-lm!3023
Co-authored-by: Ali Taghibakhshi <[email protected]>
Co-authored-by: Mcore Bot <[email protected]>
Alit/config mamba head

See merge request ADLR/megatron-lm!2601
Update CODEOWNERS to make modelopt review only for QAT.

See merge request ADLR/megatron-lm!3125
Run nemo2 tests instead of nemo1

See merge request ADLR/megatron-lm!3119
…attn for dynamic batching.

Co-authored-by: Shanmugam Ramasamy <[email protected]>
Co-authored-by: root <[email protected]>
Co-authored-by: Vijay Korthikanti <[email protected]>
Co-authored-by: Mcore Bot <[email protected]>
Integrating paged attention feature of flash_attn for dynamic batching.

See merge request ADLR/megatron-lm!2955
Co-authored-by: Mcore Bot <[email protected]>
Co-authored-by: yaoyu-33 <[email protected]>
Co-authored-by: Chenhan Yu <[email protected]>
add l2 norm in torch_norm.py for LLAMA-4 support

See merge request ADLR/megatron-lm!2960
fix: Improvements to the auto-reminder bot

See merge request ADLR/megatron-lm!3126
Fix Gemma TRTLLM export

See merge request ADLR/megatron-lm!2475
Co-authored-by: Yuzhong Wang <[email protected]>
Co-authored-by: Shunkang <[email protected]>
Fix MLA THD format support

See merge request ADLR/megatron-lm!2691
Dynamic inference example | Control checkpoint load strictness.

See merge request ADLR/megatron-lm!2914
patch for fp8 primary weight custom fsdp support

See merge request ADLR/megatron-lm!3057
ci: Track info about MR

See merge request ADLR/megatron-lm!3129
ci: Handle nargs

See merge request ADLR/megatron-lm!3105
…h --no-optim-load

Co-authored-by: jianbinc <[email protected]>
Co-authored-by: 胡凯文 <[email protected]>
ko3n1g and others added 30 commits May 13, 2025 12:00
ci: Run on multiple clusters

See merge request ADLR/megatron-lm!3292
ci: Allow specific TE-ref

See merge request ADLR/megatron-lm!3302
ci(fix): Write logs to log_dir

See merge request ADLR/megatron-lm!3299
Address dist checkpointing PyT 24.08 failure

See merge request ADLR/megatron-lm!3253
ci(hotfix): Downstream pipeline

See merge request ADLR/megatron-lm!3307
…nal argparse flag to clear GPU...

Co-authored-by: Szymon Migacz <[email protected]>
MR feedback: added units for arguments, optional argparse flag to clear GPU...

See merge request ADLR/megatron-lm!3308
Allow process group as optional argument for mamba class constructor

See merge request ADLR/megatron-lm!2966
Add NVTX ranges to categorize execution

See merge request ADLR/megatron-lm!2588
Move fsdp 2 import from _composable to public

See merge request ADLR/megatron-lm!3116
ci: Add nemo-image to `ci-rebuild-mcore-nemo-image`

See merge request ADLR/megatron-lm!3321
ci: Re-enable tests that failed on memory

See merge request ADLR/megatron-lm!3197
Signed-off-by: oliver könig <[email protected]>
Co-authored-by: Shanmugam Ramasamy <[email protected]>
Engine updates

See merge request ADLR/megatron-lm!3254
ci: Onboard mr-slim to h100

See merge request ADLR/megatron-lm!3312
chore: Deprecate T5 tests

See merge request ADLR/megatron-lm!3334