-
Notifications
You must be signed in to change notification settings - Fork 915
Speedup arithmetic kernels (up to -25%) / not (-30%) #7457
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Could you also ru the boolean kernel bench? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks very cool @Dandandan -- I had a question about the performance with subtraction with no null and one potential API change, but otherwise looks great
pub fn try_binary<A: ArrayAccessor, B: ArrayAccessor, F, O>( | ||
a: A, | ||
b: B, | ||
pub fn try_binary<A, B, F, O>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this an API change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good question 🤔, I believe it isn't?
I think it was bounded by primitive arrays (via ArrowPrimitiveType
already?
I.e. you can not use it for a binary or boolean arrays.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a breaking change, you could use this with any array provided the output was primitive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm yeah, you are right!
Are we willing to make this change given that no test seems to break and other methods seem to be on PrimitiveArray
instead (unary, unary_mut, try_unary, try_unary_mut, binary, etc. are on Primitive array)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it would be a non-trivial functionality regression, even if in arrow-rs we aren't exploiting it - I could see this being useful for processing primitive dictionaries, for example.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: I checked for DataFusion, here it is also only used (1 usage) for a primitive array.
This comment was marked as outdated.
This comment was marked as outdated.
It looks like performance difference is bigger for the arithmetic changes on my computer 🤔 I think it might be something related to the osx allocator. Other than that I think this is still improving the kernels on the use of safety / showing the from_trusted_len_iter is not needed as much. |
This comment was marked as outdated.
This comment was marked as outdated.
I agree this PR is an improvement in general. I'll rerun the benchmarks to see if they are repeatable |
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
🤔 second benchmark runs look very good performance wise. I'll run one more |
🤖 |
🤖: Benchmark completed Details
|
🤖 |
🤖: Benchmark completed Details
|
This seems like similar to my results (although slightly less) which often shows better performance and sometimes 0% and sometimes 10-25% change for similar set of benchmarks. |
This comment was marked as outdated.
This comment was marked as outdated.
sorry -- clicked the wrong button |
What do we want to do with the API breakage? At least I can break out the |
Which issue does this PR close?
Closes #.
Rationale for this change
What changes are included in this PR?
Are there any user-facing changes?