Releases: intel/llvm

DPC++ daily 2022-06-30

30 Jun 16:24
c91f955
Pre-release
[SYCL] Use std::ignore for all unused args in bfloat builtins (#6381)

Signed-off-by: Larsen, Steffen <[email protected]>
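
The idiom named in this commit title can be sketched in plain C++ as follows. This is only an illustration: `fmax_sketch` and the `HYPOTHETICAL_DEVICE_PATH` macro are made-up names, not the actual bfloat builtins or macros touched by the change. The point is that assigning an argument to `std::ignore` marks it as deliberately unused on a code path that never reads it, which avoids unused-parameter warnings without changing the signature.

```cpp
#include <stdexcept>
#include <tuple>  // std::ignore

// Hypothetical builtin: has a real implementation on one path and only an
// "unsupported" fallback on the other, where the arguments are never read.
inline float fmax_sketch(float x, float y) {
#if defined(HYPOTHETICAL_DEVICE_PATH)
  return x > y ? x : y;
#else
  // Discard the arguments explicitly so the compiler does not warn
  // about unused parameters on the fallback path.
  std::ignore = x;
  std::ignore = y;
  throw std::runtime_error("fmax_sketch is not supported on this target");
#endif
}
```

Built without the hypothetical macro, calling `fmax_sketch(1.0f, 2.0f)` throws `std::runtime_error`, and the translation unit compiles cleanly with `-Wunused-parameter` enabled.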

DPC++ daily 2022-06-29

29 Jun 16:20
f51357f
Pre-release
sycl-nightly/20220629

[SYCL][FPGA][NFC] Refactor [[intel::num_simd_work_items()]] attribute…

oneAPI DPC++ Compiler 2022-06

28 Jun 06:26
4043dda

New features

SYCL Compiler

  • Added -fcuda-prec-sqrt frontend compiler option which enables a higher precision version of sqrt in the device code. [ebf9ea8]
  • Added support for local memory accessors for the HIP backend. [58508ba]
  • Added initial support of -lname processing when searching for fat static libraries. [35e32d8] [a33f9c8]
  • Added -fsycl-fp32-prec-sqrt flag which enables correctly rounded sycl::sqrt. [5c8b7e7]
  • Added support for [[intel::loop_count()]] attribute. [c536e76]
  • Added support for passing driver options to JIT compiler and linker. [1c93bfe]
  • Added default argument support for work_group_size_hint attribute. [0cff80e]
  • Added support for float and double exchange and compare exchange atomic operations in CUDA libclc. [1d84c99]
  • Added --ffast-math support for CUDA libclc. [0f0c5d1]
  • Added support for software atomics (except for the ones using system scope) for lower sm versions of CUDA architecture. Enabled SYCL_USE_NATIVE_FP_ATOMICS by default. [7bc8447]
  • Added support for the global offset for AMDGPU. [2dc3c06]
  • Added support for asynchronous barrier for CUDA backend sm 80+. [6770421]
  • Added -f[no-]sycl-device-lib-jit-link option to control JIT linking of SYCL device libraries. [dfb37a8] [c946286]
  • Added support for the new FPGA attribute [[intel::fpga_pipeline(N)]] for loop pipelining. [92aadf3]
  • Added assert support for Windows NVPTX. [f29b498]
  • Added support for sycl_ext_oneapi_properties extension. [87f60f6][1984e74][a2583ec][cdf561a][d2982c6][35c2e00]
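
As background for the two sqrt-related options above, here is a host-only sketch of what "correctly rounded" means for a single-precision square root. It is not how the compiler or any device library implements the flag. It assumes an IEEE-754 host where `double` square root is itself correctly rounded, and the helper names `reference_sqrt` and `ulp_distance` are invented for this illustration.

```cpp
#include <cmath>
#include <cstdint>
#include <cstring>

// Reference value: take the root in double, then round once to float.
// For IEEE-754 binary32/binary64 this gives the correctly rounded
// single-precision result.
inline float reference_sqrt(float x) {
  return static_cast<float>(std::sqrt(static_cast<double>(x)));
}

// Distance, in units in the last place, between two positive finite floats.
// Positive floats are ordered the same way as their bit patterns.
inline std::int64_t ulp_distance(float a, float b) {
  std::uint32_t ia, ib;
  std::memcpy(&ia, &a, sizeof ia);
  std::memcpy(&ib, &b, sizeof ib);
  return ia > ib ? static_cast<std::int64_t>(ia - ib)
                 : static_cast<std::int64_t>(ib - ia);
}
```

A correctly rounded implementation is 0 ULP away from `reference_sqrt` for every positive finite input, while a faster approximate version may differ by one or more ULPs on some inputs.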

SYCL Library

  • Added support for Nvidia MMA for bf16, mixed precision int ((u)int8/int32), and mixed precision float (half/float). [5373362]
  • Added a mode for the Level Zero plugin where only the last command in each batch yields a host-visible event. Enabled this mode by default. [c6b7b8e]
  • Added an option to query for atomic scope capabilities for the CUDA backend. Updated returns for atomics memory order capabilities. [43a4192]
  • Added support for an experimental Level Zero API for host pointer import into USM. The feature can be enabled using SYCL_USM_HOSTPTR_IMPORT environment variable. [844d7b6]
  • Added support for the wi_element for bf16 type. [9f2b7bd]
  • Added complex support for the reduce and scan group algorithms. [90a4dc7]
  • Added support for SYCL 2020 methods in the group class. [73d59ce]
  • Added SYCL_RT_WARNING_LEVEL environment variable which controls the amount of warnings and performance hints the runtime library may print. [2741010]
  • Added tanh (for floats/halfs) and exp2 (for halfs) native definitions for CUDA backend. [250c498]
  • Added bf16 builtins for fma, fmin and fmax on CUDA backend. [62651dd]
  • Added support for USM buffer location properties which specify the memory location in which a device USM allocation should be placed. [12c988a]
  • Added support for buffer_location property to the sycl::buffer. [9808525]
  • Added single_task support for ESIMD_EMULATOR backend. [2331160]
  • Added support for SVM 1,2,4-elements gather/scatter for ESIMD. [e200720]
  • Added support for bf16 builtins operating on storage types for CUDA backend. [413a9ef]
  • Added support for backend_version device property for CUDA backend. [4b1a4bc]
  • Added support for round-robin submissions to multiple compute CCS for the Level Zero backend. Disabled by default, can be controlled using SYCL_PI_LEVEL_ZERO_USE_COMPUTE_ENGINE. [a836c87]
  • Added support for buffer migration for contexts with multiple devices in the Level Zero plugin. [7baf152]
  • Added a mode where the Level Zero plugin uses immediate command-lists instead of standard command-lists. This mode is disabled by default and can be enabled using the SYCL_PI_LEVEL_ZERO_USE_IMMEDIATE_COMMANDLISTS environment variable. [b9cb1d1]
  • Added support for sycl::get_native(sycl::buffer) for OpenCL and CUDA backends. [8b3c8c4]
  • Added reduction overloads accepting span. [863383b]
  • Added LSC support for ESIMD_EMULATOR backend. [b78bf00]
  • Added half type support for __esimd_convertvector_to/from. [0bfffd6]
  • Added a SYCL 2020 conformant variant of buffer_allocator. [53430c8]
  • Added support for the USM buffer location property in malloc_shared. [6e89821] [9f61c8e][8c4d9a5]
  • Added support for the USM buffer location property in malloc_host. [2c7caab]
  • Added experimental context and device interoperability support for CUDA. [f0df89a]
  • Added support for memory intrinsics for the ESIMD_EMULATOR plugin. [1a8f501]
  • Added support for named barrier APIs for ESIMD. [1df0038]
  • Added support for DPAS API for ESIMD. [5881938]
  • Added support for LSC memory access APIs for ESIMD. [4bd50e7]
  • Added support for the invoke_simd feature. [4072557][8471ff3][8c7bb45][62afb59][3e1c1bf]
  • Added support for info::device::atomic64 for OpenCL and Level Zero backends. [8feb558]
  • Added support for sycl_ext_oneapi_usm_device_read_only extension [644c614][58c9d3a]
  • Added support for mapping/unmapping operations for ESIMD_EMULATOR plugin. [bc0579a]
  • Added support for make_buffer API for the Level Zero backend. [7c49984]
  • Added interoperability support for HIP backend. [e06d1b5]
  • Added missing +-*/ operations for half. [059efbc]
  • Introduced new environment variable SYCL_PI_CUDA_MAX_LOCAL_MEM_SZ to control the max local memory allowed to be allocated per kernel on CUDA backend. [2e24304]
  • Added ext_intel_global_host_space in accordance with sycl_ext_intel_usm_address_spaces extension. [7a2f44b]
  • Added aspect for bfloat16. [f84fc32]
  • Introduced "Intel math functions" device library with support of type cast util functions for float, double and integer type. [a310952]
  • Added bfloat16 support for joint_matrix [6ac62ab]
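
To make the "complex support for the reduce and scan group algorithms" entry above more concrete, the following host-only sketch shows the results such operations produce for complex values, using the standard library rather than SYCL. It is an analogue only: it does not call the SYCL group algorithms, and the names `cfloat`, `complex_sum` and `complex_inclusive_scan` are invented for this illustration.

```cpp
#include <complex>
#include <functional>
#include <numeric>
#include <vector>

using cfloat = std::complex<float>;

// Reduction: combine all elements with complex addition.
inline cfloat complex_sum(const std::vector<cfloat>& v) {
  return std::reduce(v.begin(), v.end(), cfloat{0.0f, 0.0f}, std::plus<>{});
}

// Inclusive scan: element i of the result is the sum of inputs 0..i.
inline std::vector<cfloat> complex_inclusive_scan(const std::vector<cfloat>& v) {
  std::vector<cfloat> out(v.size());
  std::inclusive_scan(v.begin(), v.end(), out.begin(), std::plus<>{});
  return out;
}
```

For the input (1,2), (3,4), (5,6), the sum is (9,12) and the inclusive scan is (1,2), (4,6), (9,12). In a work-group setting each work-item would contribute one element and receive either the total or its own prefix.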

Documentation

Tools

  • Implemented property set generation for device globals in the sycl-post-link. Added the `--device-gl...

DPC++ daily 2022-06-28

28 Jun 16:32
8a4777d
Pre-release
sycl-nightly/20220628

[SYCL] Reset signalled command list if there no available command lis…

DPC++ daily 2022-06-27

27 Jun 16:22
2baf1de
Pre-release
[SYCL][CUDA] Retain context with CUDA event interop (#6361)

This PR fixes an issue found in the SYCL-CTS CUDA interop tests, where the context had not been properly retained when creating a native event with CUDA interop.

DPC++ daily 2022-06-26

26 Jun 16:20
1afa98f
Pre-release
[SYCL] June 2022 Release Notes (#6317)

* Release notes for commit range f34ba2c..4043dda
* Update known issues:
1. cuda prefetch issue seems to be fixed by:
https://github.com/intel/llvm/pull/5043
2. Performance issues with assert seem to be fixed by:
https://github.com/intel/llvm/pull/4505
https://github.com/intel/llvm/pull/4516

DPC++ daily 2022-06-25

25 Jun 16:18
1afa98f
Pre-release
[SYCL] June 2022 Release Notes (#6317)

* Release notes for commit range f34ba2c..4043dda
* Update known issues:
1. cuda prefetch issue seems to be fixed by:
https://github.com/intel/llvm/pull/5043
2. Performance issues with assert seem to be fixed by:
https://github.com/intel/llvm/pull/4505
https://github.com/intel/llvm/pull/4516

DPC++ daily 2022-06-24

24 Jun 16:21
e848c15
Pre-release
[SYCL][PI][CUDA] Fix transfer stream reuse (#6354)

Fixed a bug that can cause transfer stream to be reused for compute, messing up the synchronization.

DPC++ daily 2022-06-23

23 Jun 16:19
78bd66a
Pre-release
[SYCL] Removes more uses of OpenCL header definitions (#6328)

This commit adds various PI definitions and replaces some uses of OpenCL header definitions from the SYCL runtime library.

To further isolate the remaining uses of the OpenCL headers, the includes of cl.h are moved from common.hpp to the dependent files.

Test updates at: intel/llvm-test-suite#1061

DPC++ daily 2022-06-22

22 Jun 16:19
1811162
Pre-release
[ESIMD] Implement stateless memory accesses enforcement (#6287)

The driver option -f[no-]sycl-esimd-force-stateless-mem is added.

-fsycl-esimd-force-stateless-mem enables the automatic conversion of stateful memory
accesses via SYCL accessors or surface-index to stateless within ESIMD kernels.
It also disables those ESIMD intrinsics that use stateful accesses that cannot be converted to stateless.
-fsycl-esimd-force-stateless-mem defines the macro __ESIMD_FORCE_STATELESS_MEM
to map the calls of ESIMD API using accessors to calls of API using pointers.
It also passes a switch to sycl-post-link telling it to
ignore the buffer_t attribute and use svmptr_t.

-fno-sycl-esimd-force-stateless-mem tells the compiler not to convert
stateful memory accesses to stateless. This is the default behavior.

Draft of the design document/proposal for this change-set: #6187
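
The macro-based mapping described in this commit message can be pictured with a small plain C++ sketch. Only the macro name `__ESIMD_FORCE_STATELESS_MEM` comes from the text above. `FakeAccessor` and both `load` overloads are hypothetical stand-ins, not ESIMD API, and the real headers may be organized quite differently.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an accessor: owns storage and can expose a pointer.
struct FakeAccessor {
  std::vector<int> data;
  const int* get_pointer() const { return data.data(); }
};

// Pointer-based ("stateless") load.
inline int load(const int* base, std::size_t index) { return base[index]; }

// Accessor-based entry point.
inline int load(const FakeAccessor& acc, std::size_t index) {
#ifdef __ESIMD_FORCE_STATELESS_MEM
  // Macro defined (by the driver option in the real toolchain):
  // forward the accessor call to the pointer-based overload.
  return load(acc.get_pointer(), index);
#else
  // Default: keep the accessor ("stateful") path.
  return acc.data[index];
#endif
}
```

Either way the caller writes `load(acc, i)`. The macro only decides which underlying path serves the call, so user code does not need to change when the option is toggled.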