rocFFT-1.0.11 for ROCm 4.2.0

@saadrahim released this 10 May 23:13
a470ba6

Optimizations

  • Improved performance for single-precision kernels that use any butterfly radix other than radix-2 or radix-7.
  • Minor optimization for C2R 3D transforms with cube sizes 100 and 200.
  • Optimized some C2C/R2C 3D 64, 81, 100, 128, 200, 256 rectangular sizes.
  • When factoring, check whether the remaining length is explicitly supported.
  • Explicitly add radix-7 lengths 14, 21, and 224 to list of supported lengths.
  • Optimized R2C 2D/3D 128, 200, 256 cube sizes.
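
The factoring change noted above (checking whether the remaining length is explicitly supported) can be illustrated with a small sketch. This is not rocFFT's actual code: the function name `factorize`, the `EXPLICIT_LENGTHS` table, and the `RADICES` order are all hypothetical, chosen only to show the idea. The only values taken from the release notes are the radix-7 lengths 14, 21, and 224.

```python
# Hypothetical table of lengths with a dedicated kernel (illustrative only;
# 14, 21 and 224 are the radix-7 lengths named in the release notes).
EXPLICIT_LENGTHS = {14, 21, 224}

# Hypothetical radices, tried in this order (an assumption, not rocFFT's list).
RADICES = (16, 8, 7, 5, 4, 3, 2)


def factorize(length):
    """Split `length` into factors, stopping early when the remaining
    length is itself explicitly supported."""
    if length < 1:
        raise ValueError("length must be positive")
    factors = []
    remaining = length
    while remaining > 1:
        # Check the remainder first: if it is explicitly supported,
        # keep it whole instead of breaking it down further.
        if remaining in EXPLICIT_LENGTHS:
            factors.append(remaining)
            break
        for radix in RADICES:
            if remaining % radix == 0:
                factors.append(radix)
                remaining //= radix
                break
        else:
            # No radix divides the remainder, so this sketch gives up.
            raise ValueError(f"no supported radix divides {remaining}")
    return factors


print(factorize(224))  # [224]
print(factorize(336))  # [16, 21]
```

Without the early check, 336 would be split as 16, 7, 3 in this sketch; with it, the remainder 21 is kept as a single explicitly supported length.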

Fixed

  • Fixed potential crashes in small 3D transforms with unusual strides. (ROCm#311)
  • Fixed potential crashes when executing transforms on multiple devices. (ROCm#310)