Skip to content

Cuda and OpenCL fixes #8

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

LourensVeen
Copy link

Here are some minimal updates to make sapporo2 compile with newer versions of CUDA and OpenCL.

The cmalloc function seems to be called by the OpenCL code, but is missing. Possibly it's present in the codes that use sapporo, and since everything is linked statically it gets taken from there when that code is linked? It seems like the allocate member function is what should have been called there, but I'm not sure.

I've tested this with CUDA, and can run the test codes, but with nVidia OpenCL they crash with

oclSafeCall() Runtime API error in file <./include/ocldev.h>, line 531 : Out of resources
.
test_gravity_block_ocl: ./include/ocldev.h:179: void dev::__oclsafeCall(cl_int, const char*, int): Assertion `false' failed.

Since OpenCL is obsolete anyway, maybe we don't bother going into this?

(I have more and bigger changes in the pipeline, but I'm going to try to split them into small, easy to review PRs to hopefully not take too much of your time. Thanks!)

@LourensVeen LourensVeen changed the title Cuda opencl fixes Cuda and OpenCL fixes Oct 31, 2023
Copy link
Collaborator

@jbedorf jbedorf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for these changes and fixes. It's been a while since anyone worked on this, glad that it's not completely broken. Just one comment on the ifdef/else construct.

fprintf(stderr, "ERROR: is not implemented in OpenCL, only in CUDA. Please");
fprintf(stderr, "ERROR: file an issue on GitHub if you need this combination.");
exit(1);
#else
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess there's no need for the else? Given that there's the exit above? In that case maybe change the #else into #endif

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The problem here actually was that float3 was undefined when compiling with OpenCL, causing a compiler error. So we could move the perThreadSM = ... line outside of the #ifdef, but then it wouldn't compile for OpenCL.

I think I later saw a header somewhere that aliases some OpenCL types to CUDA-like names, so maybe float3 can be added there to fix it instead, I'll have a look.

ocl_free();
cmalloc(src.n, DeviceMemFlags);
ocl_free();
allocate(src.n, DeviceMemFlags);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this looks to be the right thing todo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants