-
Notifications
You must be signed in to change notification settings - Fork 48
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Split the CUDA CKF into different TUs #742
base: main
Are you sure you want to change the base?
Conversation
bbeb4f7
to
30aebf1
Compare
My plan to finally get rid of those .ipp files has failed. 🫡 Anyway, updated. |
24a9c9e
to
4ec0290
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make sure there's no hastiness on this one...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, let's go with this general setup at the moment. As we discussed in person, I have some further ideas for how to tweak the code later on even further, but these changes do not make those future changes any more difficult. (The general direction of my idea is very similar.)
typename detector_t::view_type det_data; | ||
const int n_params; | ||
bound_track_parameters_collection_types::view params_view; | ||
vecmem::data::vector_view<const unsigned int> params_liveness_view; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some Doxygen documentation should be put on the variables, and the struct as well. Since now that documentation is not on the main function anymore.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
4ec0290
to
54e7489
Compare
Great! I've rebased and incorporated the change requests |
This commit splits the monstrously large CUDA track finding translation unit up into smaller ones, one for each of the kernels. This should speed up compilation times and decrease memory usage. Also groups the payloads for each of the functions into convenient structs, so we don't need to pass 20+ arguments for some of the kernel calls. Does not change the functionality of the code.
Quality Gate passedIssues Measures |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a little more to go...
measurement_collection_types::const_view measurements_view, | ||
bound_track_parameters_collection_types::const_view seeds_view, | ||
vecmem::data::jagged_vector_view<const candidate_link> links_view, | ||
vecmem::data::vector_view<const typename candidate_link::link_index_type> | ||
tips_view, | ||
track_candidate_container_types::view track_candidates_view, | ||
vecmem::data::vector_view<unsigned int> valid_indices_view, | ||
unsigned int* n_valid_tracks); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Forgot to use traccc::device::build_tracks_payload
here.
unsigned int* shared_num_candidates; | ||
std::pair<unsigned int, unsigned int>* shared_candidates; | ||
unsigned int& shared_candidates_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As long as I complain anyway, these variables should also get some description.
This commit splits the monstrously large CUDA track finding (and some CCL) translation unit up into smaller ones, one for each of the kernels. This should speed up compilation times and decrease memory usage.
Also groups the payloads for each of the functions into convenient structs, so we don't need to pass 20+ arguments for some of the kernel calls.
Does not change the functionality of the code.