
Actions: NVIDIA/TransformerEngine

Deploy nightly docs

155 workflow run results
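Not part of the original Actions page: for reference, a minimal Python sketch that fetches the same kind of run listing programmatically. It assumes only the standard library and the public GitHub REST API endpoint `GET /repos/{owner}/{repo}/actions/runs`; the filtering by workflow name is an illustrative choice, not part of the page.

```python
# Hypothetical helper (not part of TransformerEngine): list recent runs of the
# "Deploy nightly docs" workflow via the public GitHub REST API.
import json
import urllib.request

REPO = "NVIDIA/TransformerEngine"
# Unauthenticated requests are rate-limited; add an Authorization header for heavier use.
URL = f"https://api.github.com/repos/{REPO}/actions/runs?per_page=20"

with urllib.request.urlopen(URL) as resp:
    data = json.load(resp)

for run in data["workflow_runs"]:
    if run["name"] != "Deploy nightly docs":
        continue  # keep only the workflow shown on this page
    print(
        f'{run["name"]} #{run["run_number"]}: '
        f'Commit {run["head_sha"][:7]} pushed by {run["actor"]["login"]} '
        f'({run["head_branch"]}, {run["created_at"]})'
    )
```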

Support FP16 for user buffer (#690)
Deploy nightly docs #372: Commit 8255f87 pushed by ksivaman
March 7, 2024 18:01 · 1m 36s · main

[PyTorch] Adjusted the logic of MHA and DPA to enable speculative dec…
Deploy nightly docs #371: Commit b459ccc pushed by ksivaman
March 6, 2024 20:37 · 1m 49s · main

Fix types for forward attention for JAX. (#704)
Deploy nightly docs #370: Commit 728e335 pushed by ksivaman
March 6, 2024 19:16 · 1m 35s · main

Return layernorm output in the gathered form (#697)
Deploy nightly docs #369: Commit d8f678d pushed by ksivaman
March 6, 2024 14:15 · 1m 23s · main

Disable UB bulk wgrad when weights are frozen (#702)
Deploy nightly docs #368: Commit b0f6535 pushed by ksivaman
March 5, 2024 21:06 · 1m 36s · main

Update README.rst to show the table in FP8 Convergence. (#678)
Deploy nightly docs #367: Commit 3f8baf9 pushed by ksivaman
March 5, 2024 01:39 · 1m 29s · main

[PyTorch] Update doc for checkpoint API (#695)
Deploy nightly docs #366: Commit 24f78ac pushed by ksivaman
March 4, 2024 23:41 · 1m 37s · main

Enable incremental CMake build (#684)
Deploy nightly docs #365: Commit 509ab0b pushed by ksivaman
March 4, 2024 23:40 · 1m 16s · main

[PyTorch] Use dummy amax for Float8Tensor cast (#693)
Deploy nightly docs #364: Commit 4e2ce51 pushed by ksivaman
March 1, 2024 14:33 · 1h 13m 19s · main

Create a small tutorial on how to accelerate HF Llama models with Tra…
Deploy nightly docs #363: Commit 0bd84ed pushed by sudhakarsingh27
March 1, 2024 07:57 · 1m 17s · main

Slightly more explicit error message for invalid FP8 GEMM dims (#692)
Deploy nightly docs #362: Commit df4bf79 pushed by timmoon10
February 29, 2024 23:40 · 1m 28s · main

[C/PyTorch/Jax] Add support for more bias shapes (#677)
Deploy nightly docs #361: Commit b8eea8a pushed by cyanguwa
February 28, 2024 18:37 · 2m 6s · main

[JAX] Bugfix for softmax primitives accepting invalid input sharding …
Deploy nightly docs #360: Commit 0404095 pushed by denera
February 28, 2024 15:45 · 1m 38s · main

[JAX] Support various implementations of RoPE. (#655)
Deploy nightly docs #359: Commit 8bba5ee pushed by denera
February 27, 2024 15:14 · 1m 35s · main

[PyTorch] Non-reentrant mode for activation recompute (#670)
Deploy nightly docs #358: Commit 82bc797 pushed by ksivaman
February 24, 2024 00:19 · 1m 52s · main

[JAX] Refine MHA API and add DPA API (#653)
Deploy nightly docs #357: Commit 9b2fed5 pushed by denera
February 22, 2024 18:09 · 1m 37s · main

Add __version__ attribute to Python module (#675)
Deploy nightly docs #356: Commit fb2f952 pushed by timmoon10
February 21, 2024 18:23 · 2m 0s · main

[Paddle] Add RMSNorm, RoPE and SwiGLU (#599)
Deploy nightly docs #355: Commit 7172509 pushed by timmoon10
February 21, 2024 17:41 · 2m 25s · main

Move distributed tests to L1 (#673)
Deploy nightly docs #354: Commit 2187a8f pushed by ksivaman
February 20, 2024 18:21 · 1m 12s · main

Changed VERSION to 1.5.0dev
Deploy nightly docs #353: Commit a3840c1 pushed by ptrendx
February 20, 2024 16:22 · 1m 22s · main

QuickGELU activation from HuggingFace/Transformers (#475)
Deploy nightly docs #352: Commit 0e116d5 pushed by ptrendx
February 17, 2024 16:25 · 2m 31s · main

Use unoptimized layernorm kernel if pointers are not aligned (#490)
Deploy nightly docs #351: Commit d5c088d pushed by ptrendx
February 17, 2024 05:31 · 1m 15s · main

Use fused implementation of RoPE in MultiHeadAttention (#658)
Deploy nightly docs #350: Commit 8d62d5c pushed by ksivaman
February 15, 2024 19:06 · 1m 43s · main

[PyTorch] Add Float8Tensor option to avoid updating transpose cache w…
Deploy nightly docs #349: Commit 1e78094 pushed by timmoon10
February 15, 2024 18:53 · 1m 34s · main

Use arguments instead of env vars for TP comm overlap (#649)
Deploy nightly docs #348: Commit bdf1afe pushed by timmoon10
February 14, 2024 21:45 · 1m 22s · main