Actions: NVIDIA/TransformerEngine

Deploy nightly docs

565 workflow runs

[common/PyTorch] Add cuDNN SWA (left, 0) + padding + bottom right cau…
Deploy nightly docs #756: Commit 838345e pushed by cyanguwa
December 20, 2024 05:32 1m 15s main
[JAX] Move parallel encoder tests to L0 distributed test set. (#1356)
Deploy nightly docs #755: Commit a3b32ec pushed by phu0ngng
December 18, 2024 15:47 1m 38s main
[PyTorch] Fix get_swa_mask() for padding masks (#1281)
Deploy nightly docs #754: Commit f033498 pushed by cyanguwa
December 18, 2024 02:15 1m 39s main
[PyTorch] Add weights_only=False for torch.load (#1374)
Deploy nightly docs #753: Commit 83dac8c pushed by cyanguwa
December 18, 2024 02:15 1m 40s main
[JAX] Fused attention unit tests fixes and refinements (#1352)
Deploy nightly docs #752: Commit 7f5c784 pushed by zlsh80826
December 17, 2024 07:41 1m 23s main
[common] Add max_t support for KV in THD (#1370)
Deploy nightly docs #751: Commit f4f35c2 pushed by cyanguwa
December 17, 2024 03:57 1m 20s main
Enabling FP8 all-gather for TE Float8Tensor when using Torch FSDP2 (#…
Deploy nightly docs #750: Commit 0196ed4 pushed by youngeunkwon0405
December 16, 2024 23:39 1m 16s main
[JAX] Bug Fix: Softmax FFIs with correct Encapsulates (#1375)
Deploy nightly docs #749: Commit 1975ace pushed by phu0ngng
December 14, 2024 17:09 1m 23s main
Fix an invalid reference in the doc (#1362)
Deploy nightly docs #748: Commit 1ae8190 pushed by denera
December 14, 2024 02:09 1m 28s main
Add user to CI (#1371)
Deploy nightly docs #747: Commit e7bfc0c pushed by phu0ngng
December 12, 2024 22:16 1m 22s main
[JAX] Bug fix for distributed normalization (#1366)
Deploy nightly docs #746: Commit 0e1d9fa pushed by phu0ngng
December 12, 2024 13:00 1m 41s main
[JAX] Use default factory for not sharing mutable default values (#1364)
Deploy nightly docs #745: Commit e4c99b0 pushed by phu0ngng
December 10, 2024 17:31 1m 38s main
[C] Normalization Refactor + Adding CUDNN backend (#1315)
Deploy nightly docs #744: Commit 3102fdd pushed by phu0ngng
December 6, 2024 18:59 1m 17s main
Disable FP8 in Mcore integration test on older GPUs (#1357)
Deploy nightly docs #743: Commit d8b13cb pushed by timmoon10
December 6, 2024 05:42 1m 17s main
Fix attention mask type for Flash Attention + CP + THD (#1354)
Deploy nightly docs #742: Commit d978e80 pushed by xrennvidia
December 5, 2024 21:44 1m 16s main
[PyTorch] Store module extra state in tensor (#1335)
Deploy nightly docs #741: Commit 8c00424 pushed by timmoon10
December 5, 2024 21:19 1m 20s main
Debug nightly docs (#1338)
Deploy nightly docs #740: Commit 71ada55 pushed by timmoon10
December 5, 2024 21:18 1m 33s main
[JAX] Scale sequence length in CP tests to avoid tiny sizes. (#1347)
Deploy nightly docs #739: Commit d3cbccd pushed by mgoldfarb-nvidia
December 4, 2024 15:52 1m 17s main
Improving communication overlap for the case of multi kernel queue us…
Deploy nightly docs #738: Commit 64126aa pushed by denera
December 2, 2024 20:26 1m 19s main
Update list of CI users (#1340)
Deploy nightly docs #737: Commit 0951971 pushed by timmoon10
December 2, 2024 19:07 1m 14s main
Fix cuda graph capture for grouped gemm (#1345)
Deploy nightly docs #736: Commit a132ac4 pushed by xrennvidia
November 27, 2024 17:35 1m 13s main
[Common] Moved framework agnostic THD kernels to common. (#1339)
Deploy nightly docs #735: Commit 60ce21f pushed by mgoldfarb-nvidia
November 25, 2024 14:43 1m 12s main
Support CUDA Graph for MoE models (#1233)
Deploy nightly docs #734: Commit ae393e8 pushed by yaox12
November 25, 2024 08:02 1m 12s main
[Core] Add function to convert container to string (#1342)
Deploy nightly docs #733: Commit 8952bc4 pushed by timmoon10
November 22, 2024 02:15 1m 26s main
[PyTorch] Integration test for Megatron-LM (#1329)
Deploy nightly docs #732: Commit 6b98768 pushed by timmoon10
November 21, 2024 02:47 1m 19s main