Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merge sort and AVX512 #101

Open
sergii-mamedov opened this issue Nov 5, 2023 · 3 comments
Open

Merge sort and AVX512 #101

sergii-mamedov opened this issue Nov 5, 2023 · 3 comments

Comments

@sergii-mamedov
Copy link

Hi!

Many thanks for your contribution to speeding up sorting in numpy. Wanted to ask if there are any plans to speed up merge/tim sort with AVX2/AVX512?

@r-devulap
Copy link
Contributor

Hi @sergii-mamedov that isn't in my to do list (at least not yet) unless there is a good enough justification to work on those. Is there a reason why merge/tim sort will be preferred over quicksort?

@sergii-mamedov
Copy link
Author

Hi @r-devulap
I was supported the METASPACE project. Sorting is one of the most time-consuming steps. We have from a hundred to ten thousand already sorted arrays (m/z spectra), which we need to combine into one array and sort. The total size of arrays ranges from hundreds of megabytes to hundreds of gigabytes.
I tested quicksort vs mergesort (in numpy terminology) two years ago - mergesort was 5-10 times faster. Last weekend I tested the latest release of numpy - mergesort is still about twice as fast on my data.
Also, a feature of merge/tim sort is that it is a stable variant of sorting, that is, the order of elements is determined. This is important for this task.

@r-devulap
Copy link
Contributor

curious as to what data type and distribution you have in your application. On NumPy benchmarks, quicksort is slower than merge sort only for ordered and reverse arrays:

             quick   float64        ('random',)         37.9±0.02μs
              quick   float64        ('ordered',)         37.3±0.4μs
              quick   float64       ('reversed',)         37.8±0.2μs
              quick   float64        ('uniform',)        5.44±0.04μs
              quick   float64    ('sorted_block', 10)    36.2±0.08μs
              quick   float64   ('sorted_block', 100)    39.7±0.05μs
              quick   float64   ('sorted_block', 1000)    43.5±0.1μs
              merge   float64        ('random',)          609±0.7μs
              merge   float64        ('ordered',)        18.3±0.03μs
              merge   float64       ('reversed',)        12.7±0.04μs
              merge   float64        ('uniform',)        18.3±0.03μs
              merge   float64    ('sorted_block', 10)      169±1μs
              merge   float64   ('sorted_block', 100)     123±0.8μs
              merge   float64   ('sorted_block', 1000)   68.2±0.06μs

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants