@steffenlarsen
Contributor

The MHostKernel member in CGExecKernel was unused, so this commit removes it.

Signed-off-by: Larsen, Steffen <[email protected]>
@steffenlarsen
Contributor Author

Looks like the host kernel copy is implicitly being used to keep the contents of it alive during scheduling. Converting this to draft for now.

@Pennycook
Contributor

> Looks like the host kernel copy is implicitly being used to keep the contents of it alive during scheduling. Converting this to draft for now.

Which tests failed? If you revert my changes from #18387, does this work?

My fast path currently depends on the HostKernel member for lifetime extension -- I figured that since it was already there, I could make it do something useful! If my fast path is the only thing using this mechanism, that would be good to know, because it would be fairly straightforward to store the arguments somewhere else.
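
To illustrate what "storing the arguments somewhere else" could look like, here is a minimal sketch, not code from this PR or from the SYCL runtime. `CommandSketch`, `addArg`, `argPtr`, and `roundTrip` are hypothetical names; the assumption is that the command copies the argument bytes into storage it owns, so it no longer needs the host kernel object to stay alive.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical stand-in for a command-group record; not the real CGExecKernel.
struct CommandSketch {
  std::vector<std::byte> ArgStorage;   // bytes owned by the command itself
  std::vector<std::size_t> ArgOffsets; // offsets stay valid when storage grows

  // Copy Size bytes from Ptr into command-owned storage.
  void addArg(const void *Ptr, std::size_t Size) {
    const std::size_t Offset = ArgStorage.size();
    ArgStorage.resize(Offset + Size);
    std::memcpy(ArgStorage.data() + Offset, Ptr, Size);
    ArgOffsets.push_back(Offset);
  }

  // Address of the I-th stored argument (read it back with memcpy).
  const void *argPtr(std::size_t I) const {
    return ArgStorage.data() + ArgOffsets[I];
  }
};

// The original argument goes out of scope before it is read back.
int roundTrip() {
  CommandSketch Cmd;
  {
    int X = 42;
    Cmd.addArg(&X, sizeof(X));
  } // X is gone here; the command still holds a copy of its bytes
  int Out = 0;
  std::memcpy(&Out, Cmd.argPtr(0), sizeof(Out));
  return Out;
}
```

Offsets are stored instead of pointers because growing the vector may reallocate it, which would invalidate any pointers into the old buffer.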

@steffenlarsen
Contributor Author

> Which tests failed?

  SYCL :: Basic/empty_command.cpp
  SYCL :: Basic/out_of_order_queue_status_ext_oneapi_empty.cpp
  SYCL :: Basic/out_of_order_queue_status_khr_empty.cpp
  SYCL :: Graph/Explicit/add_nodes_after_finalize.cpp
  SYCL :: Graph/Explicit/basic_usm.cpp
  SYCL :: Graph/Explicit/basic_usm_host.cpp
  SYCL :: Graph/Explicit/basic_usm_mixed.cpp
  SYCL :: Graph/Explicit/basic_usm_shared.cpp
  SYCL :: Graph/Explicit/compile_time_local_memory.cpp
  SYCL :: Graph/Explicit/depends_on.cpp
  SYCL :: Graph/Explicit/empty_node.cpp
  SYCL :: Graph/Explicit/enqueue_ordering.cpp
  SYCL :: Graph/Explicit/host_task.cpp
  SYCL :: Graph/Explicit/host_task2.cpp
  SYCL :: Graph/Explicit/host_task_last.cpp
  SYCL :: Graph/Explicit/host_task_multiple_deps.cpp
  SYCL :: Graph/Explicit/host_task_multiple_roots.cpp
  SYCL :: Graph/Explicit/host_task_successive.cpp
  SYCL :: Graph/Explicit/local_accessor.cpp
  SYCL :: Graph/Explicit/local_accessor_multiple_accessors.cpp
  SYCL :: Graph/Explicit/local_accessor_multiple_nodes.cpp
  SYCL :: Graph/Explicit/memadvise.cpp
  SYCL :: Graph/Explicit/multiple_exec_graphs.cpp
  SYCL :: Graph/Explicit/node_ordering.cpp
  SYCL :: Graph/Explicit/prefetch.cpp
  SYCL :: Graph/Explicit/queue_constructor_usm.cpp
  SYCL :: Graph/Explicit/queue_shortcuts.cpp
  SYCL :: Graph/Explicit/repeated_exec.cpp
  SYCL :: Graph/Explicit/single_node.cpp
  SYCL :: Graph/Explicit/sub_graph.cpp
  SYCL :: Graph/Explicit/sub_graph_execute_without_parent.cpp
  SYCL :: Graph/Explicit/sub_graph_multiple_submission.cpp
  SYCL :: Graph/Explicit/sub_graph_nested.cpp
  SYCL :: Graph/Explicit/sub_graph_two_parent_graphs.cpp
  SYCL :: Graph/Explicit/usm_copy.cpp
  SYCL :: Graph/Explicit/work_group_memory.cpp
  SYCL :: Graph/NativeCommand/hip_explicit_usm.cpp
  SYCL :: Graph/NativeCommand/hip_record_usm.cpp
  SYCL :: Graph/RecordReplay/add_nodes_after_finalize.cpp
  SYCL :: Graph/RecordReplay/after_use.cpp
  SYCL :: Graph/RecordReplay/barrier_multi_graph.cpp
  SYCL :: Graph/RecordReplay/barrier_multi_queue.cpp
  SYCL :: Graph/RecordReplay/barrier_with_work.cpp
  SYCL :: Graph/RecordReplay/basic_usm.cpp
  SYCL :: Graph/RecordReplay/basic_usm_host.cpp
  SYCL :: Graph/RecordReplay/basic_usm_mixed.cpp
  SYCL :: Graph/RecordReplay/basic_usm_shared.cpp
  SYCL :: Graph/RecordReplay/compile_time_local_memory.cpp
  SYCL :: Graph/RecordReplay/dotp_in_order.cpp
  SYCL :: Graph/RecordReplay/dotp_in_order_pause.cpp
  SYCL :: Graph/RecordReplay/dotp_in_order_with_empty_nodes.cpp
  SYCL :: Graph/RecordReplay/dotp_multiple_queues.cpp
  SYCL :: Graph/RecordReplay/empty_node.cpp
  SYCL :: Graph/RecordReplay/ext_oneapi_enqueue_functions.cpp
  SYCL :: Graph/RecordReplay/ext_oneapi_enqueue_functions_submit_with_event.cpp
  SYCL :: Graph/RecordReplay/host_task.cpp
  SYCL :: Graph/RecordReplay/host_task2.cpp
  SYCL :: Graph/RecordReplay/host_task2_multiple_roots.cpp
  SYCL :: Graph/RecordReplay/host_task_in_order.cpp
  SYCL :: Graph/RecordReplay/host_task_last.cpp
  SYCL :: Graph/RecordReplay/host_task_multiple_deps.cpp
  SYCL :: Graph/RecordReplay/host_task_multiple_roots.cpp
  SYCL :: Graph/RecordReplay/host_task_successive.cpp
  SYCL :: Graph/RecordReplay/in_order_queue_with_host_managed_dependencies.cpp
  SYCL :: Graph/RecordReplay/in_order_queue_with_host_managed_dependencies_memcpy.cpp
  SYCL :: Graph/RecordReplay/in_order_queue_with_host_managed_dependencies_memset.cpp
  SYCL :: Graph/RecordReplay/local_accessor.cpp
  SYCL :: Graph/RecordReplay/local_accessor_multiple_accessors.cpp
  SYCL :: Graph/RecordReplay/local_accessor_multiple_nodes.cpp
  SYCL :: Graph/RecordReplay/memadvise.cpp
  SYCL :: Graph/RecordReplay/multiple_exec_graphs.cpp
  SYCL :: Graph/RecordReplay/prefetch.cpp
  SYCL :: Graph/RecordReplay/queue_constructor_usm.cpp
  SYCL :: Graph/RecordReplay/queue_shortcuts.cpp
  SYCL :: Graph/RecordReplay/repeated_exec.cpp
  SYCL :: Graph/RecordReplay/sub_graph.cpp
  SYCL :: Graph/RecordReplay/sub_graph_execute_without_parent.cpp
  SYCL :: Graph/RecordReplay/sub_graph_in_order.cpp
  SYCL :: Graph/RecordReplay/sub_graph_multiple_submission.cpp
  SYCL :: Graph/RecordReplay/sub_graph_nested.cpp
  SYCL :: Graph/RecordReplay/sub_graph_two_parent_graphs.cpp
  SYCL :: Graph/RecordReplay/temp_scope.cpp
  SYCL :: Graph/RecordReplay/transitive_queue.cpp
  SYCL :: Graph/RecordReplay/transitive_queue_barrier.cpp
  SYCL :: Graph/RecordReplay/usm_copy.cpp
  SYCL :: Graph/RecordReplay/usm_copy_in_order.cpp
  SYCL :: Graph/RecordReplay/usm_memset_shortcut.cpp
  SYCL :: Graph/RecordReplay/work_group_memory.cpp
  SYCL :: Graph/Threading/submit.cpp
  SYCL :: Graph/Update/Explicit/whole_update_host_task.cpp
  SYCL :: Graph/Update/Explicit/whole_update_local_acc.cpp
  SYCL :: Graph/Update/Explicit/whole_update_local_acc_multi.cpp
  SYCL :: Graph/Update/Explicit/whole_update_subgraph.cpp
  SYCL :: Graph/Update/Explicit/whole_update_usm.cpp
  SYCL :: Graph/Update/Explicit/whole_update_work_group_memory.cpp
  SYCL :: Graph/Update/RecordReplay/whole_update_host_task.cpp
  SYCL :: Graph/Update/RecordReplay/whole_update_local_acc.cpp
  SYCL :: Graph/Update/RecordReplay/whole_update_local_acc_multi.cpp
  SYCL :: Graph/Update/RecordReplay/whole_update_subgraph.cpp
  SYCL :: Graph/Update/RecordReplay/whole_update_usm.cpp
  SYCL :: Graph/Update/RecordReplay/whole_update_work_group_memory.cpp
  SYCL :: Graph/Update/dyn_cgf_accessor_deps.cpp
  SYCL :: Graph/Update/dyn_cgf_accessor_deps2.cpp
  SYCL :: Graph/Update/dyn_cgf_different_arg_nums.cpp
  SYCL :: Graph/Update/dyn_cgf_event_deps.cpp
  SYCL :: Graph/Update/dyn_cgf_get_active_index.cpp
  SYCL :: Graph/Update/dyn_cgf_host_task_usm.cpp
  SYCL :: Graph/Update/dyn_cgf_ndrange.cpp
  SYCL :: Graph/Update/dyn_cgf_ndrange_3D.cpp
  SYCL :: Graph/Update/dyn_cgf_overwrite_range.cpp
  SYCL :: Graph/Update/dyn_cgf_parameters.cpp
  SYCL :: Graph/Update/dyn_cgf_shared_nodes.cpp
  SYCL :: Graph/Update/dyn_cgf_update_before_finalize.cpp
  SYCL :: Graph/Update/dyn_cgf_usm.cpp
  SYCL :: Graph/Update/dyn_cgf_with_all_dyn_params.cpp
  SYCL :: Graph/Update/dyn_cgf_with_different_type_dyn_params.cpp
  SYCL :: Graph/Update/dyn_cgf_with_dyn_work_group_mem.cpp
  SYCL :: Graph/Update/dyn_cgf_with_some_dyn_params.cpp
  SYCL :: Graph/Update/dyn_work_group_memory_multiple.cpp
  SYCL :: Graph/Update/regression_padded_parameter.cpp
  SYCL :: Graph/Update/update_before_finalize.cpp
  SYCL :: Graph/Update/update_nd_range.cpp
  SYCL :: Graph/Update/update_ndrange_to_range.cpp
  SYCL :: Graph/Update/update_nullptr.cpp
  SYCL :: Graph/Update/update_range.cpp
  SYCL :: Graph/Update/update_range_to_ndrange.cpp
  SYCL :: Graph/Update/update_with_indices_multiple_exec_graphs.cpp
  SYCL :: Graph/Update/update_with_indices_ordering.cpp
  SYCL :: Graph/Update/update_with_indices_ptr.cpp
  SYCL :: Graph/Update/update_with_indices_ptr_3D.cpp
  SYCL :: Graph/Update/update_with_indices_ptr_double_update.cpp
  SYCL :: Graph/Update/update_with_indices_ptr_multiple_nodes.cpp
  SYCL :: Graph/Update/update_with_indices_ptr_multiple_nodes_different_indices.cpp
  SYCL :: Graph/Update/update_with_indices_ptr_multiple_params.cpp
  SYCL :: Graph/Update/update_with_indices_ptr_subgraph.cpp
  SYCL :: Graph/Update/update_with_indices_scalar.cpp
  SYCL :: Graph/Update/whole_update_barrier_node.cpp
  SYCL :: Graph/Update/whole_update_dynamic_cgf.cpp
  SYCL :: Graph/Update/whole_update_dynamic_param.cpp
  SYCL :: Graph/Update/whole_update_empty_node.cpp
  SYCL :: Graph/ValidUsage/linear_graph_copy.cpp
  SYCL :: Graph/ValidUsage/linear_graph_copy_D2H.cpp
  SYCL :: InorderQueue/in_order_ext_oneapi_submit_barrier.cpp

🫠

> If you revert my changes from #18387, does this work?

Let me try. I haven't seen these failures before, so if things changed recently it might explain it.

@sarnex
Contributor

sarnex commented Jul 23, 2025

I think this PR caused a hang on Linux BMG so I cancelled the job. Please double check before rerunning CI, thanks!

@sarnex
Contributor

sarnex commented Jul 23, 2025

I think Linux PVC has a hang from this too :)

@aelovikov-intel
Contributor

Don't we need to preserve the kernel until enqueue? Because the arguments are part of it. If so, then the enqueue might be delayed if the kernel depends on the (yet unfinished) host task, and the scheduler has to pick up the ownership for it until it's enqueued.
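
The ownership concern above can be sketched roughly as follows. This is an illustration under stated assumptions, not the actual scheduler code: `HostKernelSketch`, `SchedulerSketch`, `defer`, `enqueueNext`, and `delayedEnqueue` are all hypothetical names. It assumes the kernel object is held through a `std::shared_ptr`, and that a blocked command keeps a copy of that pointer until it is finally enqueued.

```cpp
#include <functional>
#include <memory>
#include <queue>
#include <utility>

// Hypothetical stand-in for the host-side kernel object holding the arguments.
struct HostKernelSketch {
  int Arg;
};

// Hypothetical scheduler holding commands whose enqueue is blocked,
// for example on an unfinished host task.
class SchedulerSketch {
  std::queue<std::function<int()>> Pending;

public:
  // The lambda captures the shared_ptr by value, so the scheduler co-owns
  // the kernel (and its arguments) until the command is enqueued.
  void defer(std::shared_ptr<HostKernelSketch> Kernel) {
    Pending.push([K = std::move(Kernel)] { return K->Arg; });
  }

  // Called once the dependency has finished: run the oldest pending command.
  int enqueueNext() {
    auto Cmd = std::move(Pending.front());
    Pending.pop();
    return Cmd(); // the captured kernel is released when Cmd is destroyed
  }
};

int delayedEnqueue() {
  SchedulerSketch Sched;
  {
    auto Kernel = std::make_shared<HostKernelSketch>(HostKernelSketch{7});
    Sched.defer(Kernel);
  } // the submitter's reference is dropped here, before the enqueue
  return Sched.enqueueNext(); // still safe: the scheduler kept the kernel alive
}
```

If the deferred command held only a raw pointer or reference instead, the read in `enqueueNext` would touch a destroyed object, which would be consistent with the failures and hangs reported above once the member was removed.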
