Skip to content

[SYCL][UR][L0 v2] Move all enqueue-related logic to command_list_manager #18837

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 4 commits into
base: sycl
Choose a base branch
from

Conversation

igchor
Copy link
Member

@igchor igchor commented Jun 5, 2025

Almost all in_order_queue functions are implemented by
just forwarding the call to the command_list_manager.

This is needed to implement out-of-order queue without
duplicating the implementation.

@igchor igchor requested review from a team as code owners June 5, 2025 18:17
@igchor igchor requested a review from konradkusiak97 June 5, 2025 18:17
@igchor igchor temporarily deployed to WindowsCILock June 5, 2025 18:18 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 5, 2025 18:53 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 5, 2025 18:53 — with GitHub Actions Inactive
igchor added 2 commits June 5, 2025 19:16
Almost all in_order_queue functions are implemented by
just forwarding the call to the command_list_manager.

This is needed to implement out-of-order queue without
duplicating the implementation.
all command list manager functions now expect ur_event_handle_t
as signal event, not ur_event_handle_t*
Copy link
Contributor

@Xewar313 Xewar313 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall makes sense, just a few nitpicks

ur_result_t ur_command_list_manager::enqueueKernelLaunch(
ur_kernel_handle_t hKernel, uint32_t workDim,
const size_t *pGlobalWorkOffset, const size_t *pGlobalWorkSize,
const size_t *pLocalWorkSize, uint32_t numPropsInLaunchPropList,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This behavior that adds multiple kernels to the list, depending on launchPropList seems quite counterintuitive. Currently, the numPropsInLaunchPropList and launchPropList seems like any other argument of the enqueueKernelLaunch that would modify the added kernel, but wouldn't add multiple kernels at once.
Wouldn't it make more sense to either add a separate command_list_manager method that would allow adding multiple kernels, or create a separate structure that would describe the launch parameters, to have a clearer distinction between these two?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think multiple kernels are added - we only iterate over properties to see if there is cooperative property and if so, we need to call different L0 function to enqueue the kernel.

I agree that this part might use a refactor - at least we could extract the logic for checking the properties to a separate function - let's do that in a separate PR.

const ur_event_handle_t *phEventWaitList,
ur_event_handle_t *phEvent);
/************ Helper methods *************/
ur_result_t enqueueGenericCommandListsExp(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be private?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it should. Done.

template <typename Base> locked<Base> lock() {
std::unique_lock lock{mut_};
return locked<Base>(&object_, std::move(lock));
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we introduce more generic approach, wouldn't it work to express the default one using the new one, to not duplicate code?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, this was a leftover from a previous implementation, this is not needed anymore.

ze_event_handle_t getSignalEvent(ur_event_handle_t hUserEvent,
ur_command_t commandType);

/************ Generic queue methods *************/
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a nitpick, so feel free to ignore, but I preferred the previous appendFoo naming to enqueueFoo. As the methods append to a command-list object rather than submit to a queue object. I understand that a immediate command-list object is effectively a queue, but the command-list might not always be immediate (as in the case where it's being called by the command-buffer API)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree. I prefer append naming as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, changed back to append.

@igchor igchor temporarily deployed to WindowsCILock June 6, 2025 15:16 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 6, 2025 15:41 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 6, 2025 15:41 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 6, 2025 16:40 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 6, 2025 18:09 — with GitHub Actions Inactive
@igchor igchor temporarily deployed to WindowsCILock June 6, 2025 18:09 — with GitHub Actions Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants