ocl: pointer-arithmetic for device-pointers #758

Merged
merged 2 commits into from
Mar 1, 2024

Conversation

hfp
Member

@hfp hfp commented Jan 30, 2024

  • Fall back to the main thread's stream (c_dbcsr_acc_opencl_stream_default).
  • Fixed c_dbcsr_acc_opencl_stream_default and reduced one level of indirection.
  • Reworked entire memory allocation (determining offsets).
  • Improved error checks and introduced more assertions.
  • ACC_OPENCL_MEM_OFFSET is now mandatory.
  • Tightened memory facility (locks).
  • Improved locking in the stream facility.
  • Adjusted UNROLL-control.
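
To illustrate what "determining offsets" can mean for device pointers: without unified shared memory, an OpenCL buffer is an opaque handle, so an address derived by pointer arithmetic has to be mapped back to the owning allocation plus a byte offset. The following is a minimal sketch of that lookup and not the code from this PR. All names (`sketch_alloc_t`, `sketch_resolve`) are hypothetical, the `handle` member merely stands in for a buffer handle such as `cl_mem`, and a linear search is assumed for brevity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical registry entry describing one device allocation
 * (illustration only, not a DBCSR type). */
typedef struct sketch_alloc_t {
  void* handle;   /* stands in for an opaque buffer handle, e.g., cl_mem */
  uintptr_t base; /* address handed out to the caller for this allocation */
  size_t size;    /* size of the allocation in bytes */
} sketch_alloc_t;

/* Map an address (possibly the result of pointer arithmetic) to the
 * allocation containing it and to the byte offset inside that allocation.
 * Returns NULL if the address lies outside of every registered allocation. */
static const sketch_alloc_t* sketch_resolve(
  const sketch_alloc_t* registry, size_t nallocs, const void* ptr, size_t* offset)
{
  const uintptr_t addr = (uintptr_t)ptr;
  size_t i;
  assert(NULL != offset);
  for (i = 0; i < nallocs; ++i) {
    const sketch_alloc_t* const entry = &registry[i];
    /* the subtraction cannot wrap since base <= addr is checked first */
    if (entry->base <= addr && (addr - entry->base) < entry->size) {
      *offset = (size_t)(addr - entry->base);
      return entry;
    }
  }
  return NULL;
}
```

A caller would then pass the resolved handle together with the offset to whatever consumes the buffer, e.g., as an additional kernel argument. With USM-style allocations the pointer can be used directly and no such lookup is needed.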

@hfp hfp marked this pull request as draft February 1, 2024 13:15
@hfp hfp force-pushed the ocl branch 5 times, most recently from 22b0c67 to 87e867f Compare February 12, 2024 13:00
@hfp hfp marked this pull request as ready for review February 12, 2024 13:55
@hfp hfp force-pushed the ocl branch 3 times, most recently from 3666953 to 04fd925 Compare February 16, 2024 13:33
@hfp hfp force-pushed the ocl branch 6 times, most recently from 7994173 to 522a1fa Compare March 1, 2024 11:49
* Implemented pointer-arithmetic for device-pointers using Intel's USM as well as fallback code.
* Fall back to the main thread's stream (c_dbcsr_acc_opencl_stream_default).
* Fixed c_dbcsr_acc_opencl_stream_default and reduced one level of indirection.
* Account for an apparent bug or accuracy issue with AL=1.
* Turned assertion into runtime error (set_active_device).
* Improved finding OpenCL header file.
* Reworked entire memory allocation (determining offsets).
* Consolidated compile-time decisions about LIBXSMM_VERSION_NUMBER.
* Removed runtime decisions accounting for pooled allocations.
* Removed support for performance estimation and suitability.
* Support older LIBXSMM (pooled memory allocations).
* Set ACC_OPENCL_ATOMIC_KIND to sequentially consistent; set ACC_OPENCL_NLOCKS=1.
* Complemented ACC_OPENCL_NLOCKS with environment variable.
* Introduced ACC_OPENCL_OMPLOCKS, ACC_OPENCL_MEM_DEBUG, ACC_OPENCL_EVENT_FLUSH.
* Implemented behavior of c_dbcsr_acc_opencl_stream_default already in c_dbcsr_acc_opencl_stream.
* Cache active device-ID to avoid determining context/properties (c_dbcsr_acc_set_active_device).
* Support event chain (dependency), improved handling errors (c_dbcsr_acc_stream_wait_event).
* Support event chain (dependency), improved handling errors (c_dbcsr_acc_event_record).
* Introduced lock-arguments (internal, e.g., c_dbcsr_acc_opencl_set_active_device).
* Consolidated domain-locks into c_dbcsr_acc_opencl_config.
* Made build-log available (c_dbcsr_acc_opencl_kernel).
* Reworked stream-registry and stream-info facility.
* Consolidated and updated tuned parameters.
* Use "int" instead of "cl_int" when taking the return-code.
* Consistently use EXIT_SUCCESS instead of CL_SUCCESS.
* Removed support for ACC_OPENCL_OVERMALLOC.
* Removed support for per-thread device.
* Removed ACC_OPENCL_EVENT_BARRIER.
* Introduced ACC_OPENCL_MEM_TLS (disabled).
* Simplified c_dbcsr_acc_opencl_memset.
* Support ACC_OPENCL_STREAM_NULL in event facility.
* Introduced assertion (dbcsr_acc_devmem.F).
* Fixed using size_t as kernel argument.
* Introduced UNROLL_AUTO.
* Adjusted Daint-CI.
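
Several items above pair a compile-time setting with an environment variable (e.g., ACC_OPENCL_NLOCKS). A common way to implement such an override is sketched below. This is an assumption about the general pattern, not the code from this commit: the environment variable name `SKETCH_NLOCKS`, the upper bound `SKETCH_NLOCKS_MAX`, and both function names are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical compile-time default (the list above mentions NLOCKS=1). */
#define SKETCH_NLOCKS_DEFAULT 1
/* Hypothetical upper bound used to reject nonsensical values. */
#define SKETCH_NLOCKS_MAX 1024

/* Parse the textual value of the setting: a plain positive decimal integer
 * within bounds overrides the default; anything else is ignored. */
static int sketch_parse_nlocks(const char* text)
{
  int result = SKETCH_NLOCKS_DEFAULT;
  if (NULL != text && '\0' != *text) {
    char* end = NULL;
    const long value = strtol(text, &end, 10);
    /* require the entire string to be consumed, e.g., reject "4x" */
    if ('\0' == *end && 0 < value && value <= SKETCH_NLOCKS_MAX) {
      result = (int)value;
    }
  }
  return result;
}

/* Runtime value: the environment (if set and valid) wins over the default. */
static int sketch_nlocks(void)
{
  assert(0 < SKETCH_NLOCKS_DEFAULT);
  return sketch_parse_nlocks(getenv("SKETCH_NLOCKS"));
}
```

Keeping the parsing separate from `getenv` makes the fallback behavior easy to check without modifying the process environment.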
@hfp hfp merged commit 24f24f2 into cp2k:develop Mar 1, 2024
23 checks passed