v0.6.1_rocm
Pre-release
github-actions released this 19 Sep 15:16 · 686 commits to main since this release
What's Changed
- [fix] moe padding for reading correct tuned config by @divakar-amd in #172
- Upstream merge 24/9/9 by @gshtras in #174
- Restoring deleted .buildkite/test-template.j2 by @Alexei-V-Ivanov-AMD in #177
- Support commandr on ROCm by @shajrawi in #180
- Correct type hint by @gshtras in #173
- update custom PA kernel with support for fp8 kv cache dtype by @sanyalington in #87
- Support Grok-1 by @kkHuang-amd in #181
- Adding MLPerf optimization to 0.6.0 by @charlifu in #182
- 6.2 dockerfile by @gshtras in #176
- [Grok1] fix the name of input scale factor for autofp8 run by @kkHuang-amd in #183
- [Grok-1] fix the run-time error "Can't pickle <class 'transformers_mo… by @kkHuang-amd in #184
- Upstream merge 24/09/16 by @gshtras in #187
- Perf improvement: remove redundant torch slice; Match decode PA partition size to csrc by @sanyalington in #188
- refactor dbrx experts to use FusedMoe layer by @divakar-amd in #186
- Disable moe padding by default and enable fp8 padding by default. by @charlifu in #190
- Enabling Splitting HW by Buildkite Agents by @Alexei-V-Ivanov-AMD in #191
- Revert "remove redundant slice; match decode PA partition size with csrc (#188)" by @gshtras in #194
- [Grok-1] 1. upload moe configuration file for moe kernel optimization… by @kkHuang-amd in #193
- Removing the original text in reminder_comment.yml by @Alexei-V-Ivanov-AMD in #195
- Fix PA custom and PA v2 tests and partition sizes by @mawong-amd in #196
New Contributors
- @kkHuang-amd made their first contribution in #181
Full Changelog: v0.6.0_rocm...v0.6.1_rocm