-
Notifications
You must be signed in to change notification settings - Fork 35
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
improve performance of recode! for array dest #355
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, and sorry for the delay. I think if we add a special method, better use the most efficient implementation (see my comment).
for p in opt_pairs | ||
if x ≅ p.first | ||
return p.second | ||
end | ||
end | ||
for p in opt_pairs | ||
if recode_in(x, p.first) | ||
return p.second | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This could change the behavior in case of overlap between pairs. Why did you change this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this was a while a go I'll need to take some time with it and rerun my (micro-)benchmarks to be sure but unless my memory fails me recode_in
was a performance bottleneck and splitting the checks (in addition to switching to map!
) made a noticeable difference for highly optimizable cases. You're absolutely right that it is a breaking change, and should have been highlighted in the PR since it warrants discussion. I spent some time trying to get recode_in
to optimize away but was not satisfied with the result. The most troublesome part is of course the any(x ≅ y for y in collection)
for the case when collection
is a primitive. I'll get back to you with data.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK. Maybe better do this in a separate PR since it's a bit more tricky.
pairs = map(pairs) do p | ||
p.first => convert(T, p.second) | ||
end | ||
recoded = recode(src, default, pairs...) | ||
if T >: Missing | ||
dest .= unwrap.(recoded) | ||
else | ||
dest .= missing_check.(unwrap.(recoded)) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than doing this, to avoid making a copy and two passes over the data, we should call recode
on levels(src)
, and then do something like:
@inbounds for i in eachindex(dest, src)
dest[i] = newlevels[src.refs[i]+1]
end
The actual implementation needs to be a bit more complex so that the first entry in newlevels
is missing
(to handle the case when src.refs
is 0).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than doing this, to avoid making a copy and two passes over the data
By copy
do you mean copy of the src.levels
? In this implementation no copy of the actual array (or refs
) is made which is the main reason why it is so much faster (as outlined in my StackOverflow answer) is because all the actual copying of the refs
happens only once at the last line dest .= unwrap.(recoded)
the recoded
variable shares the refs with src
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
recode(src, default, pairs...)
allocates a new vector, right? That's relatively fast, but it's even better to avoid it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, I remembered the details wrong. In the StackOverflow example I did:
mapping = Dict("X"=>1, "Y"=>2, "Z"=>3)
b = CategoricalArray{Int64,1,UInt32}(undef, 0)
b.refs = a.refs
levels!(b.pool, [mapping[l] for l in levels(a.pool)])
which is similar to what you're suggesting. However, in this PR we initialize the CategoricalArray that will be put in the recoded
variable with something like CategoricalArray{S, N, R}(undef, size(a))
so the refs
are not shared between src and recoded.
EDIT: Like I noted on SO using levels!
does not work in the general case
Any news here? |
Performance improvements to fix #354.
Before:
After:
Todo: