
Releases: intel/yask

Version 2.15.09

07 Nov 18:48
e6476ed

New features:

  • Added the ability to use the memkind library to allocate selected grid variables on "pmem" (persistent-memory) devices (build with pmem=1).
  • Overlapping MPI communication with computation now works when using wave-front tiling and/or temporal block tiling. On by default; turn off with -no-overlap_comms.
  • MPI exchanges between ranks on the same node can now use shared memory to avoid buffer copies. Off by default; turn on with -use_shm. (A usage sketch follows this list.)
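
As a quick sketch of how these options might be combined: the commands below assume the bundled yask.sh launcher and an example stencil/architecture pair (iso3dfd on skx), which are illustrative choices rather than part of this release note.

    # Build the kernel with pmem support enabled (stencil and arch are
    # illustrative; substitute your own configuration).
    make -j stencil=iso3dfd arch=skx pmem=1

    # Run two ranks on one node using shared-memory MPI exchanges;
    # overlapped MPI communication is on by default and can be
    # disabled with -no-overlap_comms.
    bin/yask.sh -stencil iso3dfd -arch skx -ranks 2 -use_shm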

Version 2.14.03

27 Sep 03:53
245e61b

Adds a mini-block level to the tiling hierarchy, below blocks and above sub-blocks.
This separates the unit of work for OpenMP threads from the cache-block size:

  • Blocks, as before, are the units of work for top-level OpenMP threads and are evaluated in parallel within each region.
  • Mini-blocks are evaluated sequentially within each block and are typically sized for L2 caches.
    By default, mini-blocks are the same size as blocks, so most users will see no difference.
    Temporal blocking can be applied to both blocks and mini-blocks; using '-bt' sets both by default (see the sketch after this entry).

Also removes the loop-grouping parameters because they did not show performance gains and were confusing to users.
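
For illustration, a run with explicit spatial and temporal block sizes might look like the following; the stencil, architecture, and size values are assumed for the example, not recommendations from this release.

    # Set a 64^3 spatial block and a temporal block size of 2; by
    # default, the same temporal blocking applies to mini-blocks.
    bin/yask.sh -stencil iso3dfd -arch skx -bx 64 -by 64 -bz 64 -bt 2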

Version 2.13.02

10 Sep 17:30
1384d01

Added temporal tiling at the cache-block level and the ability to specify temporal conditions on equations; more statistics are reported.

Temporal tiling in this version works only with up to three spatial dimensions; use version 2.14.03 or later for four or more.

Version 2.11.00

14 Aug 15:52
0838946

Added the ability to overlap MPI communication with computation. Disable with -no-overlap_comms.

Version 2.10.02

07 Jun 00:53
dd613b0

  • Overhaul of the Makefiles; build with make YASK_OUTPUT_DIR=dirname to specify the output location.
  • Fixed some bugs in the combination of scratch grids, wave-fronts, and MPI.
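
For example, the output directory can be combined with the usual build variables; the stencil, architecture, and directory below are assumed for illustration.

    # Place all generated build output under a separate directory.
    make -j stencil=iso3dfd arch=skx YASK_OUTPUT_DIR=/tmp/yask-build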

Version 2.9.0

19 May 16:54
8e75498

Improvements to decrease compile time and binary size.

Important change that may require your intervention: examples in src/stencils are now in .cpp (not .hpp) files. Running git pull will likely fail if any existing .hpp files have been modified.

  • If you do not need any of your local changes, just run git stash before pulling.
  • If you have modified any example stencils and wish to keep the changes, commit them to your local repository before running git pull (see the example sequence below).
  • If you have any new stencils, change their suffixes to .cpp so that they are picked up by the YASK compiler build.
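
A minimal sequence for the second case might look like this (the commit message is illustrative):

    # Keep local edits to the example stencils, then update.
    git add src/stencils
    git commit -m "local stencil changes before updating to v2.9.0"
    git pull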

Version 2.8.3

07 May 23:23
815708d

Provides operator overloading in the API for all operations in the YASK compiler.

  • This required changing the return types of some new-node operations in the compiler API to more generic types. It should not affect Python code or any C++ code using auto types.

Updated best-known settings on "Skylake" Xeon Scalable processors for several example stencils.

Version 2.7.3

01 May 23:54
29e6317

MPI improvements, especially with temporal tiling and/or scratch grids.
Added compiler APIs to create full grid-index expressions.

Version 2.6.2

28 Apr 20:33
47d1115

Added compiler APIs for creating sub-domains and manual dependency graphs.
Several fixes for MPI halo exchanges with sub-domains and/or scratch grids.

Version 2.5.4

12 Apr 16:58
a2ffe23

Added the ability to specify the NUMA node for each grid separately via an API.

Several bug fixes for corner cases such as unaligned data when using MPI and temporal wave-fronts.