-
Notifications
You must be signed in to change notification settings - Fork 769
[NFCI][SYCL] Copy handler::MQueue
to handler_impl
#18830
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Ideally, it should be just moved there, but doing the whole move in one PR would be harder to review as it would require multiple internal API changes to accept `queue_impl` by raw ptr/ref instead of `std::shared_ptr<queue_impl>` value/ref. As such, this PR does the following: * Store `std::variant` of queue/graph instead of just graph as two are mutually exclusive * Store them by reference (through `std::reference_wrapper` as 'std::variant' can't store raw references directly) instead of `std::shared_ptr<graph_impl>`. The reason for that is that the object `handler` is submitted to is guaranteed to be alive for the whole lifetime of the `handler` and we don't need to extend the lifetime of queue/graph. Corresponding changes for `handler::MQueue` have been implemented recently already (although via `std::shared_ptr<queue_impl> &` and not raw reference) but are limited to prevew-only. Do the same for graphs here, essentially. * Update all uses of the old `handler_impl::MGraph` data member as it needs to go through new getters accessing the `std::variant` described above. * Update some of the direct usages of `handler::MQueue` that don't require APIs refactoring elsewhere. The remaining uses are left to the subsequent PR(s). * We'll probably need to keep the `handler::MQueue` initialized properly even after the move is complete and all internal SYCL RT accesses are through `handler_impl` as some direct `handler::MQueue` accesses might have been inlined into the users' applications (I'd be especially worried about reductions).
3ea60a9
to
5281e90
Compare
@@ -14,15 +14,6 @@ | |||
namespace sycl { | |||
inline namespace _V1 { | |||
namespace detail { | |||
device getDeviceFromHandler(handler &cgh) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is exported, I've moved it to handler.cpp
.
Just for a bit of extra context, #18767 is a draft/wip/poc showing something close to the targeted state. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, it looks good to me. Just a couple of nits, and a couple of questions:
- What is the motivation behind these changes?
- Is this not breaking ABI?
@@ -43,13 +43,15 @@ inline namespace _V1 { | |||
|
|||
namespace detail { | |||
|
|||
#ifdef __INTEL_PREVIEW_BREAKING_CHANGES | |||
// TODO: Check if two ABI exports below are still necessary. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we address this here or in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IMO, during the ABI breaking window would be most effective, hence the preview guards around the comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The question was not very precise, sorry. Let me clarify: should we try to address this within the guards already, so it's ready and tested for the next ABI breaking window?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think it's necessary and I'd rather focus on more complicated things (like changes in this PR) first. Having single cleanup for trivial cases like this later, after I do most of the hard refactoring, is my current plan.
range<2> ItemLimit = Dev.get_info<info::device::max_work_item_sizes<2>>() * | ||
Dev.get_info<info::device::max_compute_units>(); | ||
return id<2>{std::min(ItemLimit[0], Height), std::min(ItemLimit[1], Width)}; | ||
} | ||
|
||
// TODO: do we need this still? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we address this here or in the future?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Likewise.
Looks like a much clear design to me. If properties are mutually exclusive, then it should be an
Don't see how would it... |
@intel/sycl-graphs-reviewers , could you take a look before EOD in your TZ? I was hoping I'd be able to proceed with subsequent PRs during my working hours today. |
I agree. Just wanted to know if this was also expected to have any performance implications, as we are doing some efforts on that direction.
Sorry, I mixed up the title of this PR with the draft PR you linked, so in my head this was already effectively moving, not copying. |
We've had multiple PRs that were eliminating unnecessary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One meaningful comment about handler_impl(ext::oneapi::experimental::detail::graph_impl &Graph)
usage with __INTEL_PREVIEW_BREAKING_CHANGES
. All other comments are superficial nitpicks.
handler_impl(ext::oneapi::experimental::detail::graph_impl &Graph) | ||
: MQueueOrGraph{Graph} {} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we have any %if preview-breaking-changes-supported
Graph E2E tests to catch this. When we call this constructor with under __INTEL_PREVIEW_BREAKING_CHANGES
https://github.com/intel/llvm/blob/sycl/sycl/source/detail/graph_impl.cpp#L507 it will pass a std::shared_ptr<ext::oneapi::experimental::detail::graph_impl>
from shared_from_this()
can that compile after this change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
source/detail
is compiled unconditionally in two modes, because we provide two variants of libsycl.so
/libsycl-preview.so
. All existing E2E tests are run in preview mode in this nightly task as well:
llvm/.github/workflows/sycl-nightly.yml
Lines 111 to 115 in 57cc5fe
- name: Preview mode on SPR/PVC | |
runner: '["Linux", "pvc"]' | |
image_options: -u 1001 --device=/dev/dri -v /dev/dri/by-path:/dev/dri/by-path --privileged --cap-add SYS_ADMIN | |
target_devices: level_zero:gpu | |
extra_lit_opts: --param test-preview-mode=True |
That said, I'll take another look locally to be 100% sure, but I wouldn't expect any failures here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cool, in that case I've approved the PR, but if you could double check locally that would be great.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is already modified as part of this PR - see changes in the above file (graph_impl.cpp
).
Initially started in intel#18830 Subsequent PRs before this final one: intel#18794 intel#18834 intel#18748
Ideally, it should be just moved there, but doing the whole move in one PR would be harder to review as it would require multiple internal API changes to accept
queue_impl
by raw ptr/ref instead ofstd::shared_ptr<queue_impl>
value/ref. As such, this PR does the following:std::variant
of queue/graph instead of just graph as two are mutually exclusivestd::reference_wrapper
as 'std::variant' can't store raw references directly) instead ofstd::shared_ptr<graph_impl>
. The reason for that is that the objecthandler
is submitted to is guaranteed to be alive for the whole lifetime of thehandler
and we don't need to extend the lifetime of queue/graph. Corresponding changes forhandler::MQueue
have been implemented recently already (although viastd::shared_ptr<queue_impl> &
and not raw reference) but are limited to prevew-only. Do the same for graphs here, essentially.handler_impl::MGraph
data member as it needs to go through new getters accessing thestd::variant
described above.handler::MQueue
that don't require APIs refactoring elsewhere. The remaining uses are left to the subsequent PR(s).handler::MQueue
initialized properly even after the move is complete and all internal SYCL RT accesses are throughhandler_impl
as some directhandler::MQueue
accesses might have been inlined into the users' applications (I'd be especially worried about reductions).