-
Notifications
You must be signed in to change notification settings - Fork 57
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Python bindings to cuFileDriverOpen()
and cuFileDriverClose()
#514
base: branch-24.12
Are you sure you want to change the base?
Conversation
47a0452
to
4b8181c
Compare
4967be7
to
e7bde28
Compare
3d02555
to
0b4537f
Compare
Thank you @madsbk for your efforts! |
We want to trigger CI error if a package was built without cufile support
I've built and reran my small segfault reproducer script without explicitly opening and closing the driver. If I guess my segfault (#497) is unrelated to the driver initialization and destruction. I have tested this on my local machine where I currently don't have a GDS-supported file system. So no actual writing happened. |
37d318f
to
3f8f8b7
Compare
return cufile_driver.driver_close() | ||
|
||
|
||
def initialize() -> None: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: Whose job is it to call initialize
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The user for now. If we find a Python example that segfaults because of cuFile's termination issues, we should consider calling it in __init__.py
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tiny suggestions, none are blocking, I think
cpp/include/kvikio/shim/cufile.hpp
Outdated
if (!stream_available) { // The stream API was introduced in CUDA 12.2. | ||
driver_open(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we gate this behind the thing we actually mean cuda < 12
? It seems we could use something like:
#if CUDA_VERSION_LT(12, 0)
driver_open();
#endif
Using a hypothetical macro.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, the version of CTK and cuFile might not match, updated the comment:
// cuFile is supposed to open and close the driver automatically but
// because of a bug in cuFile v1.4 (CUDA v11.8), it sometimes segfault:
// <https://github.com/rapidsai/kvikio/issues/159>.
// We use the stream API as an version indicator of cuFile, it was introduced
// in cuFile v1.7 (CUDA v12.2).
std::cerr << "Unable to close GDS file driver: " << cufileop_status_error(error.err) | ||
<< std::endl; | ||
} | ||
if (!stream_available) { driver_close(); } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same here.
* cuFile accept multiple calls to `cufileDriverOpen()`, only the first call opens | ||
* the driver, but every call should have a matching call to `cufileDriverClose()`. | ||
*/ | ||
void driver_open() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question/doc: Should we note that it should be unnecessary to call these functions and that this is a workaround for 11.8 bug? Or does it turn out because the automatic open/close that cufile implements is wrong we do need to do this manually?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is used by kvikio::DriverInitializer
, which can be used to pay for init overhead up front
from kvikio._lib import cufile_driver # type: ignore | ||
|
||
# TODO: Wrap nicely, maybe as a dataclass? | ||
DriverProperties = cufile_driver.DriverProperties |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: Add a short docstring?
"""
What are the DriverProperties?
"""
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
defer to #526
…o cufile_driver
Co-authored-by: Lawrence Mitchell <[email protected]>
Co-authored-by: Lawrence Mitchell <[email protected]>
Changes:
cuFileDriverOpen()
andcuFileDriverClose()
.kvikio.cufile_driver.initialize()
, which open the cuFile driver and close it again at module exit.