Releases: diku-dk/futhark

nightly

28 Aug 10:36
Pre-release

Commits

  • deac456: futhark script: add -b option. (Troels Henriksen)

0.25.20

15 Aug 18:51

Added

  • Better error message when in-place updates fail at runtime due to a
    shape mismatch.

Fixed

  • #[unroll] on an outer loop no longer causes unrolling of all
    loops nested inside the loop body.

  • Obscure issue related to replications of constants in complex
    intrablock kernels.

  • Interpreter no longer crashes on attributes in patterns.

  • Fixes to array indexing through C API when using GPU backends.
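
As an illustration of the #[unroll] fix above, here is a minimal sketch. The function name sum_grid and the array shape are hypothetical and chosen only for this example; the attribute is placed on the outer loop alone.

```futhark
-- Hypothetical example: only the outer loop carries #[unroll].
def sum_grid (xss: [4][8]i32) : i32 =
  #[unroll]
  loop acc = 0 for i < 4 do
    -- The inner loop has no attribute, so with this fix it should
    -- stay an ordinary loop rather than being unrolled as well.
    loop acc' = acc for j < 8 do
      acc' + xss[i, j]
```

If the inner loop should be unrolled too, it presumably needs its own #[unroll] attribute.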

0.25.19

26 Jul 17:11

Added

  • The compiler now does slightly less aggressive inlining. Use the
    #[inline] attribute if you want to force inlining of some
    function.

  • Arrays of opaque types now support indexing through the C API.
    Arrays of records can also be constructed. (#2082)
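
The inlining note above can be sketched as follows. The function names saxpy_elem and saxpy are hypothetical; the only part taken from the release note is the #[inline] attribute on a definition.

```futhark
-- Hypothetical helper. #[inline] asks the compiler to inline calls
-- to this function even though inlining is now less aggressive.
#[inline]
def saxpy_elem (a: f32) (x: f32) (y: f32) : f32 = a * x + y

-- Caller that applies the helper elementwise.
def saxpy [n] (a: f32) (xs: [n]f32) (ys: [n]f32) : [n]f32 =
  map2 (saxpy_elem a) xs ys
```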

Fixed

  • The opencl backend now always passes
    -cl-fp32-correctly-rounded-divide-sqrt to the kernel compiler, in
    order to match CUDA and HIP behaviour.

0.25.18

19 Jul 09:54

Added

  • New prelude function: rep, an implicit form of replicate.

  • Improved handling of large monomorphic single-dimensional array
    literals (#2160).
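
A small sketch of how rep compares with replicate. The function names are hypothetical, and the example assumes that rep takes only the element and leaves the array size to be inferred from the expected type.

```futhark
-- With replicate, the size is passed explicitly.
def zeros_explicit (n: i64) : [n]i32 = replicate n 0

-- With rep, the size is taken from the surrounding type
-- (here, the return type annotation [n]i32).
def zeros_implicit (n: i64) : [n]i32 = rep 0
```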

Fixed

  • futhark repl no longer asks for confirmation on EOF.

  • Obscure oversight related to abstract size-lifted types (#2120).

  • Accidental exponential-time algorithm in layout optimisation for
    multicore backends (#2151).

0.25.17

12 Jun 08:49

Added

  • Faster device-to-device copies on CUDA.

  • "More correctly" detect L2 cache size for OpenCL backend on AMD GPUs.

Fixed

  • Handling of .. in import paths (again).

  • Detection of impossible loop parameter sizes (#2144).

  • Rare case where GPU histograms would use slightly too much shared
    memory and fail at run-time.

  • Rare crash in layout optimisation.

0.25.16

01 May 11:49

Added

  • futhark test: --no-terminal now prints status messages even when
    no failures occur.

  • futhark test no longer runs structure tests by default. Pass
    -s to run them.

  • Rewritten array layout optimisation pass by Bjarke Pedersen and
    Oscar Nelin. It gives a minor speedup for some programs but, more
    importantly, is a principled foundation for further improvements.

  • Better error message when exceeding shared memory limits.

  • Better dead code removal for the GPU representation (minor impact on
    some programs).

Fixed

  • Bugs related to deduplication of array payloads in sum types.
    Unfortunately, fixed by just not deduplicating in those cases.

  • Frontend bug related to turning size expressions into variables
    (#2136).

  • Another exotic monomorphisation bug.

0.25.15

27 Mar 13:27

Added

  • Incremental Flattening generates fewer redundant code versions.

  • Better simplification of slices. (#2125)

Fixed

  • Ignore type suffixes when unifying expressions (#2124).

  • In the C API, opaque types that correspond to an array of an opaque
    type are now once again named futhark_opaque_arr_....

  • cuda backend did not correctly profile CPU-to-GPU scalar copies.

0.25.14

13 Mar 23:01

Added

  • The prelude definition of filter is now more memory efficient,
    particularly when the output is much smaller than the input. (#2109)

  • New configuration for GPU backends:
    futhark_context_config_set_unified_memory, also available on
    executables as --unified-memory.

  • The "raw" API functions now do something potentially useful, but are
    still considered experimental.

  • futhark --version now reports GHC version.

Fixed

  • Incorrect type checking of let-bound sizes occurring multiple times
    in a pattern (#2103).

  • A concatenation simplification would sometimes mess up sizes.
    (#2104)

  • Bug related to monomorphisation of polymorphic local functions
    (#2106).

  • Rare crash in short circuiting.

  • Referencing an unbound type parameter could crash the type checker
    (#2113, #2114).

  • Futhark now works with GHC 9.8 (#2105).

0.25.13

25 Jan 10:50

Added

  • Incremental flattening of map-scan compositions with nested
    parallelism (similar to the logic for map-reduce compositions
    that we have had for years).

  • futhark script, for running FutharkScript expressions from the
    command line.

  • futhark repl now prints out a message when it ignores a breakpoint
    during initialisation. (#2098)

Fixed

  • Flattening of scatter with multi-dimensional elements (#2089).

  • Some instances of not-actually-irregular allocations were mistakenly
    interpreted as irregular. Fixing this was a dividend of the memory
    representation simplifications of 0.25.12.

  • Obscure issue related to expansion of shared memory allocations (#2092).

  • A crash in alias checking under some rare circumstances (#2096).

  • Mishandling of existential sizes for top level constants. (#2099)

  • Compiler crash when generating code for copying nothing at all. (#2100)

0.25.12

16 Jan 08:26

Added

  • f16.copysign, f32.copysign, f64.copysign.

  • Trailing commas are now allowed for all syntactical elements that
    involve comma-separation. (#2068)

  • The C API now allows destruction and construction of sum types (with
    some caveats). (#2074)

  • An overall reduction in memory copies, through simplifying the
    internal representation.
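
Two of the additions above, copysign and trailing commas, can be sketched briefly. All names below are hypothetical, and the example assumes copysign follows the usual convention of returning the magnitude of its first argument with the sign of its second.

```futhark
-- copysign x y: magnitude of x, sign of y (assumed C-like semantics).
def negated_three : f32 = f32.copysign 3.0 (-1.0)

-- Trailing commas in an array literal and in a record expression.
def xs : [3]i32 = [1, 2, 3,]
def point : {x: f32, y: f32} = {x = 1.0, y = 2.0,}
```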

Fixed

  • C API would define distinct entry point types for Futhark types that
    differed only in naming of sizes (#2080).

  • == and != on sum types with array payloads. Constructing them is
    now a bit slower, though. (#2081)

  • Somewhat obscure simplification error caused by neglecting to update
    metadata when removing dead scatter outputs.

  • Compiler crash due to the type checker forgetting to respect the
    explicitly ascribed non-consuming diet of loop parameters (#2067).

  • Size inference did incomplete level/scope checking, which could
    result in circular sizes, which usually manifested as the type
    checker going into an infinite loop (#2073).

  • The OpenCL backend now more gracefully handles lack of platform.