Releases: diku-dk/futhark

nightly

17 Sep 23:11
Pre-release

Commits

  • eeaa870: WithAcc inputs are consumed inside the body. (Troels Henriksen)

0.25.22

10 Sep 08:15

Added

  • futhark script now supports an -f option.

  • futhark script now supports the builtin procedure $store.

Fixed

  • An error in tuning file validation.

  • Constant folding for loops that produce floating point results could
    result in different numerical behaviour.

  • Compiler crash in memory short circuiting (#2176).
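
The constant folding fix above concerns a general hazard: floating-point arithmetic is not associative, so evaluating a loop in a different way at compile time can change its result. The following Python sketch is only an illustration of that hazard, not a reproduction of the compiler bug; `loop_sum` and `folded_sum` are hypothetical names, and the closed form stands in for whatever a folding optimiser might substitute.

```python
# Illustration only: replacing a floating-point loop by a precomputed
# closed form can change the numerical result.


def loop_sum(x: float, n: int) -> float:
    """Add x to an accumulator n times, rounding after every addition."""
    acc = 0.0
    for _ in range(n):
        acc += x
    return acc


def folded_sum(x: float, n: int) -> float:
    """Closed form that a (hypothetical) optimiser might use instead of the loop."""
    return x * n


looped = loop_sum(0.1, 10)
folded = folded_sum(0.1, 10)

# The two values are mathematically equal but differ in the last bit
# when computed in IEEE 754 double precision.
print(looped, folded, looped == folded)
```

An optimisation that preserves numerical behaviour has to perform the same operations, in the same order and precision, as the original loop would at run time.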

0.25.21

01 Sep 13:24

Added

  • Logging now prints more GPU information on context initialisation.

  • GPU cache size can now be configured (tuning param: default_cache).

  • GPU shared memory can now be configured (tuning param: default_shared_memory).

  • GPU register capacity can now be configured.

  • futhark script now accepts a -b option for producing binary
    output.

Fixed

  • Type names for element types of array indexing functions in the C
    interface are now often better, although there are still cases
    where you end up with hashed names (#2172).

  • In some cases, GPU failures would not be reported properly if a
    previous failure was pending.

  • auto output didn't work if the .fut file did not have any path
    components.

  • Improved detection of malformed tuning files.
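
As a rough idea of what detecting a malformed tuning file involves, here is a minimal Python sketch. It assumes the tuning file consists of one `name=value` assignment per line with non-negative integer values; `parse_tuning` is a hypothetical helper, not part of Futhark, and the numbers in the example are made up. Only the parameter names `default_cache` and `default_shared_memory` come from the notes above.

```python
# Hypothetical validator for a tuning file, assuming one name=value
# assignment per line. This is not Futhark's implementation.


def parse_tuning(text: str) -> dict[str, int]:
    """Parse tuning assignments, raising ValueError on malformed lines."""
    params: dict[str, int] = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line:
            continue  # tolerate blank lines
        name, sep, value = line.partition("=")
        name, value = name.strip(), value.strip()
        if not sep or not name:
            raise ValueError(f"line {lineno}: expected name=value, got {raw!r}")
        if not (value.isascii() and value.isdigit()):
            raise ValueError(f"line {lineno}: value for {name} is not a non-negative integer")
        params[name] = int(value)
    return params


# Example values are invented for illustration.
example = "default_cache=1048576\ndefault_shared_memory=49152\n"
print(parse_tuning(example))
```

Rejecting a bad line with its line number, rather than silently skipping it, is the behaviour one would want from stricter validation.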

0.25.20

15 Aug 18:51

Added

  • Better error message when in-place updates fail at runtime due to a
    shape mismatch.

Fixed

  • #[unroll] on an outer loop no longer causes unrolling of all
    loops nested inside the loop body.

  • Obscure issue related to replications of constants in complex
    intrablock kernels.

  • Interpreter no longer crashes on attributes in patterns.

  • Fixes to array indexing through C API when using GPU backends.

0.25.19

26 Jul 17:11

Added

  • The compiler now does slightly less aggressive inlining. Use the
    #[inline] attribute if you want to force inlining of some
    function.

  • Arrays of opaque types now support indexing through the C API.
    Arrays of records can also be constructed. (#2082)

Fixed

  • The opencl backend now always passes
    -cl-fp32-correctly-rounded-divide-sqrt to the kernel compiler, in
    order to match CUDA and HIP behaviour.

0.25.18

19 Jul 09:54

Added

  • New prelude function: rep, an implicit form of replicate.

  • Improved handling of large monomorphic single-dimensional array
    literals (#2160).

Fixed

  • futhark repl no longer asks for confirmation on EOF.

  • Obscure oversight related to abstract size-lifted types (#2120).

  • Accidental exponential-time algorithm in layout optimisation for
    multicore backends (#2151).

0.25.17

12 Jun 08:49

Added

  • Faster device-to-device copies on CUDA.

  • "More correctly" detect L2 cache size for OpenCL backend on AMD GPUs.

Fixed

  • Handling of .. in import paths (again).

  • Detection of impossible loop parameter sizes (#2144).

  • Rare case where GPU histograms would use slightly too much shared
    memory and fail at run-time.

  • Rare crash in layout optimisation.

0.25.16

01 May 11:49

Added

  • futhark test: --no-terminal now prints status messages even when
    no failures occur.

  • futhark test no longer runs structure tests by default. Pass
    -s to run them.

  • Rewritten array layout optimisation pass by Bjarke Pedersen and
    Oscar Nelin. It gives a minor speedup for some programs but, more
    importantly, is a principled foundation for further improvements.

  • Better error message when exceeding shared memory limits.

  • Better dead code removal for the GPU representation (minor impact on
    some programs).

Fixed

  • Bugs related to deduplication of array payloads in sum types.
    Unfortunately, fixed by just not deduplicating in those cases.

  • Frontend bug related to turning size expressions into variables
    (#2136).

  • Another exotic monomorphisation bug.

0.25.15

27 Mar 13:27

Added

  • Incremental Flattening generates fewer redundant code versions.

  • Better simplification of slices. (#2125)

Fixed

  • Ignore type suffixes when unifying expressions (#2124).

  • In the C API, opaque types that correspond to an array of an opaque
    type are now once again named futhark_opaque_arr_....

  • cuda backend did not correctly profile CPU-to-GPU scalar copies.

0.25.14

13 Mar 23:01

Added

  • The prelude definition of filter is now more memory efficient,
    particularly when the output is much smaller than the input. (#2109)

  • New configuration for GPU backends:
    futhark_context_config_set_unified_memory, also available on
    executables as --unified-memory.

  • The "raw" API functions now do something potentially useful, but are
    still considered experimental.

  • futhark --version now reports GHC version.

Fixed

  • Incorrect type checking of let-bound sizes occurring multiple times
    in a pattern (#2103).

  • A concatenation simplification would sometimes mess up sizes.
    (#2104)

  • Bug related to monomorphisation of polymorphic local functions
    (#2106).

  • Rare crash in short circuiting.

  • Referencing an unbound type parameter could crash the type checker
    (#2113, #2114).

  • Futhark now works with GHC 9.8 (#2105).