-
Notifications
You must be signed in to change notification settings - Fork 10
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add NRT Memsys #78
base: main
Are you sure you want to change the base?
Add NRT Memsys #78
Conversation
…ariable in nrt.py
rtsys.ensure_initialized(stream_handle) | ||
if config.CUDA_NRT_STATS: | ||
rtsys.memsys_enable_stats(stream_handle) | ||
if hasattr(self, "target_context") and self.target_context.enable_nrt: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check for target_context
indicates that for cached kernels that doesn't have a target_context attribute, NRT is always considered off. (Is this correct?)
}; | ||
|
||
/* The Memory System object */ | ||
__device__ NRT_MemSys* TheMSys; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is initialized to 0 (and therefore the checks in nrt.cu like if(TheMSys && ...)
are always safe. Am I thinking about that correctly?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also, if this is included into both memsys.cu and nrt.cu, is there a potential ODR violation here? Should this be extern
in the header and defined in one of those two files?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is initialized to 0 (and therefore the checks in nrt.cu like if(TheMSys && ...) are always safe. Am I thinking about that correctly?
Yes.
Also, if this is included into both memsys.cu and nrt.cu, is there a potential ODR violation here? Should this be extern in the header and defined in one of those two files?
Currently memsys.cu
and nrt.cu
are compiled into two separate modules, so far ODR violation isn't an issue. This is a deliberate design decision to separate memsys stats method from NRT methods. I was hoping this can save us some compilation time when we turn on NRT, but do not turn on statistics. This design is debatable and could change.
|
||
namespace detail | ||
{ | ||
void __device__ check_memsys() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good idea - do we have things set up sufficiently correctly in numba-cuda such that this assertion can fire (e.g. if debug=True
for a @cuda.jit
dewcorator) when TheMSys
is null?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think the tests currently cover a use case where the memsys is null. This check only fires when the NRT is enabled, but the memsys is not allocated. However currently we always promise memsys is allocated if NRT is enabled.
This PR adds memsys to NRT. A follow up to #17