From 3944e4b649b70865e6f9934a874e4167612ae357 Mon Sep 17 00:00:00 2001 From: Ben Ashbaugh Date: Wed, 8 Jun 2022 12:09:21 -0700 Subject: [PATCH 1/5] relax error behavior for clSetKernelArgMemPointerINTEL --- .../cl_intel_unified_shared_memory.asciidoc | 24 +++++++++++++++---- 1 file changed, 19 insertions(+), 5 deletions(-) diff --git a/extensions/cl_intel_unified_shared_memory.asciidoc b/extensions/cl_intel_unified_shared_memory.asciidoc index 0430935fc..4e5279905 100644 --- a/extensions/cl_intel_unified_shared_memory.asciidoc +++ b/extensions/cl_intel_unified_shared_memory.asciidoc @@ -52,7 +52,7 @@ Shipping == Version Built On: {docdate} + -Revision: 1.0.0 +Revision: 1.1.0 == Dependencies @@ -744,8 +744,14 @@ Arguments to the kernel are referred to by indices that go from 0 for the leftmo _arg_value_ is the pointer value that should be used as the argument specified by _arg_index_. The pointer value will be used as the argument by all API calls that enqueue a kernel until the argument value is set to a different pointer value by a subsequent call. A pointer into Unified Shared Memory allocation may only be set as an argument value for an argument declared to be a pointer to `global` or `constant` memory. + +The definition of a valid argument value was changed in extension version 1.1.0: + +* For extension versions prior to version 1.1.0: For devices supporting shared system allocations, any pointer value is valid. Otherwise, the pointer value must be `NULL` or must point into a Unified Shared Memory allocation returned by *clHostMemAllocINTEL*, *clDeviceMemAllocINTEL*, or *clSharedMemAllocINTEL*. +* For extension versions 1.1.0 and newer: +For all devices, any pointer value is valid and may be set as an argument to a kernel. *clSetKernelArgMemPointerINTEL* returns `CL_SUCCESS` if the function is executed successfully. 
Otherwise, it will return one of the following errors: @@ -1236,12 +1242,16 @@ Note that some flags will not be valid, such as `CL_MEM_USE_HOST_PTR`. . Should it be an error to set an unknown pointer as a kernel argument using *clSetKernelArgMemPointerINTEL* if no devices support shared system allocations? + -- -*UNRESOLVED*: -Returning an error for an unknown pointer is helpful to identify and diagnose possible programming errors sooner, but passing a pointer to arbitrary memory to a function on the host is not an error until the pointer is dereferenced. +`RESOLVED`: +The behavior of *clSetKernelArgMemPointerINTEL* and was changed in version 1.1.0 of this extension. + +Prior to version 1.1.0, it was considered an error to set an arbitrary pointer value as an argument to a kernel if no devices support system USM. +This was helpful to identify possible programming errors, however it did not match the behavior of passing a pointer to a function on the host, where it is only a programming error if an invalid pointer is dereferenced. +To help provide a similar programming experience, the error condition was relaxed in version 1.1.0, and any arbitrary pointer value may be passed to a kernel. -If we relax the error condition for *clSetKernelArgMemPointerINTEL* then we could also consider relaxing the error condition for *clSetKernelExecInfo*(`CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL`) similarly. +The behavior was also changed for *clSetKernelExecInfo*(`CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL`), similarly. -Note that if the error condition is removed we can still check for possible programming errors via optional USM checking layers, such as the https://github.com/intel/opencl-intercept-layer/blob/master/docs/controls.md#usmchecking-bool[USMChecking] functionality in the https://github.com/intel/opencl-intercept-layer[OpenCL Intercept Layer]. 
+If desired, checking to identify possible programming errors may still be provided via optional USM checking layers, such as the https://github.com/intel/opencl-intercept-layer/blob/master/docs/controls.md#usmchecking-bool[USMChecking] functionality in the https://github.com/intel/opencl-intercept-layer[OpenCL Intercept Layer]. -- . Should we support a "rect" memcpy similar to *clEnqueueCopyBufferRect*? @@ -1280,7 +1290,11 @@ Note that there is no similar SVM "rect" memcpy. |R|2020-08-21|Ben Ashbaugh|Fixed enum name typo in table. |S|2020-08-26|Maciej Dziuban|Added initial placement flags for shared allocations. |1.0.0|2021-11-07|Ben Ashbaugh|Added version and other minor updates prior to posting on the OpenCL registry. +<<<<<<< HEAD |1.0.0|2022-11-08|Ben Ashbaugh|Added new issues regarding error behavior for clSetKernelArgMemPointerINTEL and rect copies. +======= +|1.1.0|2022-06-08|Ben Ashbaugh|Modified error behavior for `clSetKernelArgMemPointerINTEL`. +>>>>>>> c5966770 (relax error behavior for clSetKernelArgMemPointerINTEL) |======================================== //************************************************************************ From 645b8fe5e1bf17021b870a74073333ddeae4c757 Mon Sep 17 00:00:00 2001 From: Ben Ashbaugh Date: Fri, 26 Aug 2022 15:54:01 -0700 Subject: [PATCH 2/5] also relax error behavior for clSetKernelExecInfo(USM_PTRS) --- extensions/cl_intel_unified_shared_memory.asciidoc | 13 +++++++++---- 1 file changed, 9 insertions(+), 4 deletions(-) diff --git a/extensions/cl_intel_unified_shared_memory.asciidoc b/extensions/cl_intel_unified_shared_memory.asciidoc index 4e5279905..6abe0c493 100644 --- a/extensions/cl_intel_unified_shared_memory.asciidoc +++ b/extensions/cl_intel_unified_shared_memory.asciidoc @@ -794,6 +794,14 @@ The new _param_name_ values described below may be used with the existing *clSet |==== +The definition of a valid Unified Shared Memory allocation specified using `CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL` was changed in 
extension version 1.1.0: + +* For extension versions prior to version 1.1.0: +For devices supporting shared system allocations, any pointer value is valid. +Otherwise, the pointer value must point into a Unified Shared Memory allocation returned by *clHostMemAllocINTEL*, *clDeviceMemAllocINTEL*, or *clSharedMemAllocINTEL*. +* For extension versions 1.1.0 and newer: +For all devices, any pointer value is valid. + ==== Filling and Copying Unified Shared Memory The function @@ -1290,11 +1298,8 @@ Note that there is no similar SVM "rect" memcpy. |R|2020-08-21|Ben Ashbaugh|Fixed enum name typo in table. |S|2020-08-26|Maciej Dziuban|Added initial placement flags for shared allocations. |1.0.0|2021-11-07|Ben Ashbaugh|Added version and other minor updates prior to posting on the OpenCL registry. -<<<<<<< HEAD |1.0.0|2022-11-08|Ben Ashbaugh|Added new issues regarding error behavior for clSetKernelArgMemPointerINTEL and rect copies. -======= -|1.1.0|2022-06-08|Ben Ashbaugh|Modified error behavior for `clSetKernelArgMemPointerINTEL`. ->>>>>>> c5966770 (relax error behavior for clSetKernelArgMemPointerINTEL) +|1.1.0|2023-04-03|Ben Ashbaugh|Modified error behavior for clSetKernelArgMemPointerINTEL and clSetKernelExecInfo. 
|======================================== //************************************************************************ From 6b8fadbf2005287aa74f85d68d80ed36cea6de14 Mon Sep 17 00:00:00 2001 From: Ben Ashbaugh Date: Wed, 5 Apr 2023 15:06:22 -0700 Subject: [PATCH 3/5] add a new issue regarding sub-device accessibility --- extensions/cl_intel_unified_shared_memory.asciidoc | 14 ++++++++++++++ 1 file changed, 14 insertions(+) diff --git a/extensions/cl_intel_unified_shared_memory.asciidoc b/extensions/cl_intel_unified_shared_memory.asciidoc index 6abe0c493..da52d9eb7 100644 --- a/extensions/cl_intel_unified_shared_memory.asciidoc +++ b/extensions/cl_intel_unified_shared_memory.asciidoc @@ -1269,7 +1269,21 @@ If desired, checking to identify possible programming errors may still be provid This would be a fairly straightforward addition if it is useful. Note that there is no similar SVM "rect" memcpy. + +We could also support a "rect" memset, though there are no similar functions for `cl_mem` buffers or SVM. +-- + +. Can a device USM allocation for a parent device be accessed by its sub-devices? +Can a single device shared USM allocation associated with a parent device be accessed by its sub-devices? ++ -- +*UNRESOLVED*: +Since a sub-device is a partition of a parent device a USM allocation against a parent device should be accessible by its sub-devices. +We could document this expectation explicitly in this extension if it is not already covered by the main OpenCL specification. + +Note that a USM allocation against a sub-device need not be accessible by its parent device or by other sibling sub-devices, though some implementations may support this, just like some implementations may optionally support access to USM allocations from other devices. 
+-- + == Revision History From 9b8595ab8de9317cfe5d618e87d6f7d4b3b58780 Mon Sep 17 00:00:00 2001 From: Ben Ashbaugh Date: Fri, 14 Apr 2023 16:04:24 -0700 Subject: [PATCH 4/5] a few more wordsmithing fixes --- .../cl_intel_unified_shared_memory.asciidoc | 35 +++++-------------- 1 file changed, 8 insertions(+), 27 deletions(-) diff --git a/extensions/cl_intel_unified_shared_memory.asciidoc b/extensions/cl_intel_unified_shared_memory.asciidoc index da52d9eb7..865dc251b 100644 --- a/extensions/cl_intel_unified_shared_memory.asciidoc +++ b/extensions/cl_intel_unified_shared_memory.asciidoc @@ -1251,26 +1251,26 @@ Note that some flags will not be valid, such as `CL_MEM_USE_HOST_PTR`. + -- `RESOLVED`: -The behavior of *clSetKernelArgMemPointerINTEL* and was changed in version 1.1.0 of this extension. +The behavior of *clSetKernelArgMemPointerINTEL* was changed in version 1.1.0 of this extension. Prior to version 1.1.0, it was considered an error to set an arbitrary pointer value as an argument to a kernel if no devices support system USM. This was helpful to identify possible programming errors, however it did not match the behavior of passing a pointer to a function on the host, where it is only a programming error if an invalid pointer is dereferenced. -To help provide a similar programming experience, the error condition was relaxed in version 1.1.0, and any arbitrary pointer value may be passed to a kernel. +To provide a similar programming experience, the error condition was relaxed in version 1.1.0, and any arbitrary pointer value may be passed to a kernel. The behavior was also changed for *clSetKernelExecInfo*(`CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL`), similarly. 
-If desired, checking to identify possible programming errors may still be provided via optional USM checking layers, such as the https://github.com/intel/opencl-intercept-layer/blob/master/docs/controls.md#usmchecking-bool[USMChecking] functionality in the https://github.com/intel/opencl-intercept-layer[OpenCL Intercept Layer]. +If desired, additional checks to identify possible programming errors may still be provided via optional USM checking layers, such as the https://github.com/intel/opencl-intercept-layer/blob/master/docs/controls.md#usmchecking-bool[USMChecking] functionality in the https://github.com/intel/opencl-intercept-layer[OpenCL Intercept Layer]. -- -. Should we support a "rect" memcpy similar to *clEnqueueCopyBufferRect*? +. Should we support a 2D "rect" memcpy similar to *clEnqueueCopyBufferRect*? + -- *UNRESOLVED*: This would be a fairly straightforward addition if it is useful. -Note that there is no similar SVM "rect" memcpy. +Note that there is no similar 2D "rect" memcpy for SVM. -We could also support a "rect" memset, though there are no similar functions for `cl_mem` buffers or SVM. +We could also support a 2D "rect" fill or memset, though there are no similar functions for `cl_mem` buffers or SVM. -- . Can a device USM allocation for a parent device be accessed by its sub-devices? @@ -1281,7 +1281,7 @@ Can a single device shared USM allocation associated with a parent device be acc Since a sub-device is a partition of a parent device a USM allocation against a parent device should be accessible by its sub-devices. We could document this expectation explicitly in this extension if it is not already covered by the main OpenCL specification. -Note that a USM allocation against a sub-device need not be accessible by its parent device or by other sibling sub-devices, though some implementations may support this, just like some implementations may optionally support access to USM allocations from other devices. 
+Note that a USM allocation against a sub-device need not be accessible by its parent device or by other sibling sub-devices, though some implementations may support this, just like some implementations optionally support access to USM allocations from other devices. -- @@ -1292,28 +1292,9 @@ Note that a USM allocation against a sub-device need not be accessible by its pa [options="header"] |======================================== |Rev|Date|Author|Changes -|A|2019-01-18|Ben Ashbaugh|*Initial revision* -|B|2019-03-25|Ben Ashbaugh|Minor name changes. -|C|2019-06-18|Ben Ashbaugh|Moved flags argument into properties. -|D|2019-07-19|Ben Ashbaugh|Editorial fixes. -|E|2019-07-22|Ben Ashbaugh|Allocation properties should be const. -|F|2019-07-26|Ben Ashbaugh|Removed DEFAULT mem alloc flag. -|G|2019-08-23|Ben Ashbaugh|Added mem alloc query for associated device. -|H|2019-10-11|Ben Ashbaugh|Added initial list and description of error codes. -|I|2019-11-14|Ben Ashbaugh|Switched from a memset to a memfill API. -|J|2019-11-18|Ben Ashbaugh|Updated a few more error conditions. -|K|2019-12-18|Krzysztof Gibala|Updated write combine description. -|L|2020-01-15|Ben Ashbaugh|Added invalid arg case to setkernelarg API. -|M|2020-01-17|Ben Ashbaugh|Minor name changes, removed const from memfree API. -|N|2020-01-22|Ben Ashbaugh|Updated write combine description. -|O|2020-01-23|Ben Ashbaugh|Added aliases for USM migration flags. -|P|2020-02-28|Ben Ashbaugh|Added blocking memfree API. -|Q|2020-03-12|Ben Ashbaugh|Name tweak for blocking memfree API, added comparison to SVM, allow zero memory advice. -|R|2020-08-21|Ben Ashbaugh|Fixed enum name typo in table. -|S|2020-08-26|Maciej Dziuban|Added initial placement flags for shared allocations. |1.0.0|2021-11-07|Ben Ashbaugh|Added version and other minor updates prior to posting on the OpenCL registry. |1.0.0|2022-11-08|Ben Ashbaugh|Added new issues regarding error behavior for clSetKernelArgMemPointerINTEL and rect copies. 
-|1.1.0|2023-04-03|Ben Ashbaugh|Modified error behavior for clSetKernelArgMemPointerINTEL and clSetKernelExecInfo. +|1.1.0|2023-04-14|Ben Ashbaugh|Modified error behavior for clSetKernelArgMemPointerINTEL and clSetKernelExecInfo. |======================================== //************************************************************************ From ec1cf6020ea5658663fc7825a5a9133ef216db22 Mon Sep 17 00:00:00 2001 From: Ben Ashbaugh Date: Thu, 9 Nov 2023 16:41:30 -0800 Subject: [PATCH 5/5] further clarify what is meant by a valid pointer value --- .../cl_intel_unified_shared_memory.asciidoc | 16 +++++++--------- 1 file changed, 7 insertions(+), 9 deletions(-) diff --git a/extensions/cl_intel_unified_shared_memory.asciidoc b/extensions/cl_intel_unified_shared_memory.asciidoc index 865dc251b..7099d3385 100644 --- a/extensions/cl_intel_unified_shared_memory.asciidoc +++ b/extensions/cl_intel_unified_shared_memory.asciidoc @@ -743,9 +743,10 @@ Arguments to the kernel are referred to by indices that go from 0 for the leftmo _arg_value_ is the pointer value that should be used as the argument specified by _arg_index_. The pointer value will be used as the argument by all API calls that enqueue a kernel until the argument value is set to a different pointer value by a subsequent call. -A pointer into Unified Shared Memory allocation may only be set as an argument value for an argument declared to be a pointer to `global` or `constant` memory. +A pointer may only be set as an argument value for an argument declared to be a pointer to `global` or `constant` memory. -The definition of a valid argument value was changed in extension version 1.1.0: +[[valid-usm-pointer-argument-definition]] +The definition of a valid pointer value was changed in extension version 1.1.0: * For extension versions prior to version 1.1.0: For devices supporting shared system allocations, any pointer value is valid. 
@@ -753,6 +754,9 @@ Otherwise, the pointer value must be `NULL` or must point into a Unified Shared * For extension versions 1.1.0 and newer: For all devices, any pointer value is valid and may be set as an argument to a kernel. +In this definition, a valid pointer value means that the function will not return an error. +It still may not be valid to dereference the pointer inside of a kernel if the memory that the pointer points to is not accessible on the device. + *clSetKernelArgMemPointerINTEL* returns `CL_SUCCESS` if the function is executed successfully. Otherwise, it will return one of the following errors: @@ -794,13 +798,7 @@ The new _param_name_ values described below may be used with the existing *clSet |==== -The definition of a valid Unified Shared Memory allocation specified using `CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL` was changed in extension version 1.1.0: - -* For extension versions prior to version 1.1.0: -For devices supporting shared system allocations, any pointer value is valid. -Otherwise, the pointer value must point into a Unified Shared Memory allocation returned by *clHostMemAllocINTEL*, *clDeviceMemAllocINTEL*, or *clSharedMemAllocINTEL*. -* For extension versions 1.1.0 and newer: -For all devices, any pointer value is valid. +The <<valid-usm-pointer-argument-definition,definition of a valid pointer value>> specified using `CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL` was changed in extension version 1.1.0. ==== Filling and Copying Unified Shared Memory
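
Reviewer note: the "valid pointer value" rule that this series settles on for *clSetKernelArgMemPointerINTEL* can be summarized with a small host-side model. This is only a sketch under stated assumptions, not the implementation of any OpenCL runtime: it does not call OpenCL at all, and the names `usm_alloc_model`, `points_into_usm`, and `pointer_value_is_valid` are hypothetical helpers introduced for illustration. It models only whether the pointer value itself would be accepted; the function's other error conditions are not modeled, and acceptance says nothing about whether the pointer may be dereferenced inside a kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one USM allocation: base address and size in bytes. */
typedef struct {
    const void *base;
    size_t size;
} usm_alloc_model;

/* Returns true if ptr points into one of the recorded allocations. */
bool points_into_usm(const void *ptr, const usm_alloc_model *allocs, size_t count)
{
    uintptr_t p = (uintptr_t)ptr;
    for (size_t i = 0; i < count; ++i) {
        uintptr_t base = (uintptr_t)allocs[i].base;
        if (p >= base && p - base < allocs[i].size) {
            return true;
        }
    }
    return false;
}

/*
 * Models the "valid pointer value" definition for the kernel argument case.
 *   relaxed_1_1_0:        true for extension version 1.1.0 and newer.
 *   system_usm_supported: true if devices support shared system allocations.
 * "Valid" only means the set-argument call would not reject the pointer value.
 */
bool pointer_value_is_valid(bool relaxed_1_1_0, bool system_usm_supported,
                            const void *ptr,
                            const usm_alloc_model *allocs, size_t count)
{
    assert(allocs != NULL || count == 0);

    if (relaxed_1_1_0) {
        return true;   /* 1.1.0 and newer: any pointer value is valid */
    }
    if (system_usm_supported) {
        return true;   /* before 1.1.0, system USM: any pointer value is valid */
    }
    if (ptr == NULL) {
        return true;   /* before 1.1.0, no system USM: NULL is accepted */
    }
    return points_into_usm(ptr, allocs, count);
}
```

Under this model, the only case that is rejected is a non-`NULL` pointer outside every recorded allocation, on a pre-1.1.0 implementation without shared system allocation support. The pre-1.1.0 rule for `CL_KERNEL_EXEC_INFO_USM_PTRS_INTEL` differs slightly, since the removed text in patch 5 does not list `NULL` as acceptable there.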