
Is --expt-extended-lambda and --expt-relaxed-constexpr not supported? #30

Open
anshumang opened this issue Aug 9, 2019 · 5 comments

@anshumang

From the nvrtc docs, it seems that the above options are not supported. Can you please confirm?
@maddyscientist

@maddyscientist
Collaborator

@anshumang these restrictions stem from the fact that they rely on host-device compiler interactions, and since nvrtc is a pure GPU compiler, I guess it shouldn't be surprising that they are not supported.

  • The lack of support for --expt-relaxed-constexpr is not in any way a restriction, since nvrtc is capable of interpreting all functions presented to it as __device__ functions. This option is enabled by jitify, so things will just work as is.
  • For --expt-extended-lambda, in the general case the lambda will be defined in offline code and not visible to the run-time compiler, so it would not be possible for nvrtc to support it. You would need to define the lambda in code visible to the runtime compiler in order for nvrtc to compile it. I suppose it would be possible to compile the offline lambda to a PTX function; then, so long as that PTX was available at runtime, nvrtc could inline it into the kernel. There would likely be a multitude of headaches to solve with such an approach, but I think it's at least possible.

@benbarsdell agree with the above?
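To make the first bullet above concrete, here is a minimal sketch (file name and kernel contents are invented for illustration; error handling abbreviated) of compiling a source string in which an unannotated constexpr function is called from a __global__ kernel. No --expt-relaxed-constexpr flag is involved; instead the sketch passes nvrtc's --device-as-default-execution-space option, which is the behavior the comment says Jitify enables for you:

```cpp
#include <cstdio>
#include <string>
#include <vector>
#include <nvrtc.h>

// Kernel source handed to nvrtc at runtime. `square` carries no
// __host__/__device__ annotation; with --device-as-default-execution-space
// nvrtc treats it as a __device__ function, so the call below is legal.
static const char* kSource = R"(
constexpr int square(int x) { return x * x; }

__global__ void kernel(int* out) {
  out[threadIdx.x] = square(threadIdx.x);
}
)";

int main() {
  nvrtcProgram prog;
  nvrtcCreateProgram(&prog, kSource, "kernel.cu", 0, nullptr, nullptr);

  // Note: no --expt-relaxed-constexpr here.
  const char* opts[] = {"--device-as-default-execution-space"};
  nvrtcResult res = nvrtcCompileProgram(prog, 1, opts);

  // Always fetch the log: it explains any compilation failure.
  size_t log_size = 0;
  nvrtcGetProgramLogSize(prog, &log_size);
  std::string log(log_size, '\0');
  nvrtcGetProgramLog(prog, &log[0]);
  if (res != NVRTC_SUCCESS) {
    std::fprintf(stderr, "nvrtc failed:\n%s\n", log.c_str());
    return 1;
  }

  size_t ptx_size = 0;
  nvrtcGetPTXSize(prog, &ptx_size);
  std::vector<char> ptx(ptx_size);
  nvrtcGetPTX(prog, ptx.data());  // PTX ready for cuModuleLoadData + launch

  nvrtcDestroyProgram(&prog);
  return 0;
}
```

Link the host side against nvrtc (e.g. `-lnvrtc`); the resulting PTX can then be loaded with the driver API and launched as usual.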

@anshumang
Author

Thanks for the detailed answer @maddyscientist
That helps.
For moving a kernel from offline to runtime compilation in a code base that has --expt-relaxed-constexpr and --expt-extended-lambda among its nvcc options, I am considering invoking nvcc at runtime instead of going the nvrtc/jitify path. This keeps code changes (to account for the differences between nvcc and nvrtc) to a minimum. The host code is performance sensitive, and I do not yet know whether invoking nvcc rather than nvrtc would have an impact. There would also be more boilerplate in the host code when using nvcc, but that is less of a concern. If you have thoughts on problems with the nvcc approach, please do share. Thanks!

@maddyscientist
Collaborator

maddyscientist commented Aug 10, 2019

As I said, the --expt-relaxed-constexpr issue isn't a problem at all. Things will just work without this flag.

For the lambda issue, though, could you show me an example of what the code that uses extended lambdas looks like? Jitify does have lambda-like functionality via the JITIFY_LAMBDA macro; I wonder if that could satisfy your need.

The problem with using nvcc at runtime is that you have to ensure that wherever the code runs, you have the full CUDA toolkit and host compiler available. So, for example, if you know that your development system and deployment system are the same, this isn't an issue.
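For completeness, the nvcc-at-runtime route being considered might look something like the sketch below (paths, kernel name, and flags are illustrative assumptions, not a recommendation; it requires the full toolkit plus a host compiler on the machine where it runs, which is exactly the deployment constraint noted above):

```cpp
#include <cstdio>
#include <cstdlib>
#include <cuda.h>

int main() {
  // Stage the kernel source at /tmp/kernel.cu beforehand, then shell out to
  // nvcc. The extended-lambda / relaxed-constexpr flags are fine here because
  // this is the full offline compiler, not nvrtc. The kernel is assumed to be
  // declared extern "C" so its name is not mangled in the PTX.
  int rc = std::system(
      "nvcc --ptx --expt-extended-lambda --expt-relaxed-constexpr "
      "-o /tmp/kernel.ptx /tmp/kernel.cu");
  if (rc != 0) { std::fprintf(stderr, "nvcc invocation failed\n"); return 1; }

  // Load the freshly built PTX with the CUDA driver API.
  cuInit(0);
  CUdevice dev;
  cuDeviceGet(&dev, 0);
  CUcontext ctx;
  cuCtxCreate(&ctx, 0, dev);

  CUmodule mod;
  cuModuleLoad(&mod, "/tmp/kernel.ptx");
  CUfunction fn;
  cuModuleGetFunction(&fn, mod, "my_kernel");  // hypothetical kernel name
  // ... set up arguments and launch with cuLaunchKernel ...

  cuModuleUnload(mod);
  cuCtxDestroy(ctx);
  return 0;
}
```

The process-spawn cost of nvcc (it drives the host compiler internally) is typically much higher than an in-process nvrtc compile, which may matter for the performance-sensitive host code mentioned above.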

@anshumang
Author

Let me see if I can share a representative snippet.
Yes, toolkit installation on the deployment host is something I missed. That is possible.
Good to know that, otherwise, invoking nvcc vs. using nvrtc would be the same.

@maddyscientist
Collaborator

Great, it will be good to see a snippet. In general, I would say it is worth understanding the cases where nvrtc / Jitify isn't a good match. E.g., if you need to shell out to nvcc at runtime, perhaps that represents a weakness in the nvrtc model that needs to be addressed. Thx.
