Increase robustness of clustering partitioning #567

Open
stephenswat opened this issue May 4, 2024 · 0 comments
Labels
good first issue Good for newcomers improvement Improve an existing feature

Comments

@stephenswat
Member

Writing this down here as a suggestion for any students or anyone else wanting to get started on traccc.

The clustering algorithm relies on being able to partition the hits into segments that are separated by at least one full row (or column) of zero-activation cells on a 2D pixel-like detector. This guarantees that no cluster spans two partitions. The algorithm uses shared memory, which is of limited size; the maximum partition size $n_\text{max}$ determines the amount of shared memory used and, as a result, the performance of the algorithm: as $n_\text{max}$ increases, performance decreases. Capping the partition size, however, means that the partitioning only succeeds with some probability. For a hit density $d$ and a module width or height $n$, the success probability for a given partition is approximated by $p = 1 - (1 - (1 - d)^n)^{\lfloor\frac{n_\text{max}}{dn}\rfloor+1}$. Although the failure probability $1 - p$ is tiny, it is not zero.
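To get a feel for the formula above, it can be evaluated numerically. The sketch below is only an illustration: the function name and the input values are assumptions, not part of traccc. It reads $(1 - d)^n$ as the probability that one row of $n$ cells is completely empty, and the exponent as the approximate number of rows spanned by $n_\text{max}$ hits.

```python
import math


def partition_success_probability(d: float, n: int, n_max: int) -> float:
    """Evaluate p = 1 - (1 - (1 - d)^n)^(floor(n_max / (d * n)) + 1).

    d     : hit density, i.e. the fraction of activated cells, in (0, 1]
    n     : module width or height in cells
    n_max : maximum partition size in hits
    """
    if not 0.0 < d <= 1.0:
        raise ValueError("d must lie in (0, 1]")
    if n <= 0 or n_max <= 0:
        raise ValueError("n and n_max must be positive")

    # Probability that a single row of n cells contains no hit at all.
    p_empty_row = (1.0 - d) ** n
    # Approximate number of candidate rows before n_max hits are exceeded.
    rows = math.floor(n_max / (d * n)) + 1
    # Success means that at least one of those rows is empty.
    return 1.0 - (1.0 - p_empty_row) ** rows


if __name__ == "__main__":
    # Hypothetical values, chosen only to show how p responds to n_max.
    for n_max in (20, 200, 2000):
        print(n_max, partition_success_probability(0.05, 100, n_max))
```

Under this reading, a larger $n_\text{max}$ gives more candidate rows and therefore a higher success probability, at the cost of more shared memory.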

There are two projects here. First, the success probability can be increased by making the partition algorithm smarter. Second, there needs to be some mechanism to rescue the clustering in the unlikely event that a partition fails to be created.

Increasing the success probability can be done using the knowledge that a full empty row is actually a bit excessive; in reality, we only need to ensure that there is no cluster sharing between two adjacent rows. We can verify this by reifying adjacent rows and checking if they overlap under an 8-adjacency rule. This will lower performance, but probably not by much. Additional kudos if you can come up with a robust estimate of the success probability under this new rule.
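The relaxed condition can be sketched as follows. This is a host-side Python illustration, not the traccc kernel code; the function name and the representation of a row as a list of activated column indices are assumptions made for the example.

```python
from typing import Iterable


def can_cut_between(cols_upper: Iterable[int], cols_lower: Iterable[int]) -> bool:
    """Return True if no cluster can span two adjacent rows.

    Each argument holds the activated column indices of one row. Under
    8-adjacency a cell in column c touches columns c - 1, c and c + 1 of
    the neighbouring row, so the cut is safe only if no such pair exists.
    """
    lower = set(cols_lower)
    for c in cols_upper:
        if (c - 1) in lower or c in lower or (c + 1) in lower:
            return False  # a cluster would be shared between the rows
    return True


if __name__ == "__main__":
    print(can_cut_between([1, 2], [4, 5]))  # columns 2 and 4 do not touch
    print(can_cut_between([1, 2], [3]))     # columns 2 and 3 touch diagonally
```

A fully empty row is the special case in which one of the two lists is empty, so the original rule remains a valid cut under this check.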

Secondly, we need some logic to allocate memory in order to finish the clustering if we have an oversized cluster. This can be done fairly easily by allocating some scratch space from the device. You can allocate global memory in kernels using malloc; although this is not recommended for performance reasons, the overhead should be acceptable for this extremely rare edge case. The memory should be used to salvage the partitioning and then be deallocated.
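The control flow of such a rescue path might look like the sketch below. It is a pure-Python analogy and not device code: the fixed-size list stands in for shared memory, the on-demand list stands in for memory obtained with `malloc` inside the kernel, and `N_MAX`, `label_clusters` and `cluster_partition` are hypothetical names. The labelling itself is a simple union-find under 8-adjacency, chosen here for brevity.

```python
N_MAX = 8  # hypothetical capacity of the fast, fixed-size buffer

_fixed_buffer = [0] * N_MAX  # stands in for shared memory


def label_clusters(cells, parent):
    """Label cells (row, col) by cluster, using the caller's buffer."""
    n = len(cells)
    for i in range(n):
        parent[i] = i

    def find(i):
        # Follow parent links to the root, halving the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(n):
        for j in range(i):
            touching = (abs(cells[i][0] - cells[j][0]) <= 1
                        and abs(cells[i][1] - cells[j][1]) <= 1)
            if touching:
                ri, rj = find(i), find(j)
                if ri != rj:
                    parent[max(ri, rj)] = min(ri, rj)
    return [find(i) for i in range(n)]


def cluster_partition(cells):
    """Return (labels, used_scratch) for one partition."""
    if len(cells) <= N_MAX:
        # Common case: the partition fits into the fixed buffer.
        return label_clusters(cells, _fixed_buffer), False
    # Rare case: the partition is oversized, so reserve scratch space.
    scratch = [0] * len(cells)  # stands in for malloc on the device
    labels = label_clusters(cells, scratch)
    scratch = None  # stands in for free on the device
    return labels, True


if __name__ == "__main__":
    print(cluster_partition([(0, 0), (0, 1), (2, 2)]))
    print(cluster_partition([(0, c) for c in range(10)]))
```

The point of the design is that the slow path is only taken when the partition exceeds the fixed capacity, so its cost is paid only in the rare failure case.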

@stephenswat stephenswat added good first issue Good for newcomers improvement Improve an existing feature labels May 4, 2024
stephenswat added a commit to stephenswat/traccc that referenced this issue May 28, 2024
This commit partially addresses acts-project#567. In the past, the CCL kernel was
unable to deal with extremely large partitions. Although this is very
unlikely to happen, our ODD samples contain a few partitions so large
that they crash the code. This commit equips the CCL code with some
scratch memory which it can reserve using a mutex. This allows it enough
space to do its work in global memory. Although this is, of course,
slower, it should happen very infrequently. Parameters can be tuned to
determine that frequency. This commit also contains a few optimizations
to the code which reduce the running time on a μ = 200 event from about
1100 microseconds to 700 microseconds on an RTX A5000.
stephenswat added 24 further commits with the same message to stephenswat/traccc that referenced this issue between Jun 5, 2024 and Jul 31, 2024 (Jun 5 ×3, Jun 6 ×3, Jun 7, Jun 14, Jun 24 ×4, Jun 28, Jul 1, Jul 2 ×2, Jul 3, Jul 10 ×3, Jul 16, Jul 22, Jul 24, Jul 31).