Release v0.4.1
What's Changed
- fix: fix the failed sampling unittest on 5090 by @yzh119 in #1886
- Updated to latest docker tag by @nvmbreughe in #1889
- Fix: Prevent race condition in cubin loader when file is being consumed by @yzh119 in #1852
- Improve graph caching of cudnn graph by @Anerudhan in #1887
- misc: Various Updates to Attention Microbenchmark Suite by @bkryu in #1891
- docs: Fix installation instructions for CUDA-specific package URLs by @yzh119 in #1893
- docker image improvements by @nvmbreughe in #1890
- tests: Add batch size 1 cases to test_trtllm_gen_attention.py that fail, marked xfail by @bkryu in #1897
- Ensure docker installs the torch version we need by @nvmbreughe in #1901
- bugfix: exclude `tests/utils/test_load_cubin_compile_race_condition.py` from pytest by @yzh119 in #1907
- ci: use self-hosted runner for building docker containers by @yzh119 in #1908
- feat: Add FP4 TRTLLM-Gen throughput MOE batched gemms by @jiahanc in #1882
- Update Docker CI tags to 20251010-8d072e6 by @github-actions[bot] in #1915
- ci/cd: consolidate release workflow by @yzh119 in #1910
- bugfix: fix cli error when cuda toolkit is not installed by @yzh119 in #1905
- feat: trtllm-gen global scaled FP8 GEMMs by @hypdeb in #1829
- feat: enable fp8 blockscale moe for fused cutlass for sm90 by @djmmoss in #1819
- use `ffi::TensorView` instead of `ffi::Tensor` by @cyx-6 in #1844
- Minor updates to cubin_loader.py download_file to avoid race condition on temporary file by @nvjullin in #1918
- chore: make cache directory flashinfer-version specific by @yzh119 in #1920
- misc: checksum check when downloading artifacts by @jimmyzho in #1761
- release: bump version v0.4.1 by @yzh119 in #1921
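Several of the changes above (#1852, #1918) address a race condition when concurrent processes download the same cubin file. A standard way to avoid exposing partially written files, shown here as a minimal sketch rather than FlashInfer's actual cubin_loader implementation (the `fetch` callable and function name are illustrative), is to write to a unique temporary file in the destination directory and atomically rename it into place:

```python
import os
import tempfile

def download_file_atomic(fetch, dest_path):
    """Write fetched bytes to dest_path without exposing partial files.

    `fetch` is a hypothetical callable returning the file contents as
    bytes. Readers either see the old file or the complete new one,
    because os.replace() is an atomic rename.
    """
    dest_dir = os.path.dirname(dest_path) or "."
    os.makedirs(dest_dir, exist_ok=True)
    # Create the temp file in the same directory so the rename stays on
    # one filesystem (cross-device renames are not atomic).
    fd, tmp_path = tempfile.mkstemp(dir=dest_dir, suffix=".part")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(fetch())
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp_path, dest_path)  # atomic swap into place
    except BaseException:
        os.unlink(tmp_path)  # clean up the partial temp file
        raise
```

Because the temporary file has a unique name per process, concurrent downloaders never write to the same file, and the last `os.replace` wins with a complete copy.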
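The artifact checksum check added in #1761 follows a common pattern: hash the downloaded file and compare against a known digest before using it. A minimal sketch, assuming a SHA-256 digest and an illustrative function name (not FlashInfer's actual API):

```python
import hashlib

def verify_sha256(path, expected_hex):
    """Return True if the file at `path` hashes to expected_hex.

    Reads in 1 MiB chunks so large artifacts do not need to fit
    in memory.
    """
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            h.update(chunk)
    return h.hexdigest() == expected_hex
```

A caller would delete and re-download the artifact when this returns False, protecting against truncated or corrupted downloads.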
Full Changelog: v0.4.0...v0.4.1