Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix argmin and argmax #98

Merged
merged 8 commits into from
May 10, 2024
Merged

Fix argmin and argmax #98

merged 8 commits into from
May 10, 2024

Conversation

gdalle
Copy link
Contributor

@gdalle gdalle commented May 10, 2024

Draft PR to fix #97

How do we add proper tests for this in the least disruptive way?

Copy link

codecov bot commented May 10, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 94.85%. Comparing base (fa840f9) to head (9851910).

❗ Current head 9851910 differs from pull request most recent head 0178dde. Consider uploading reports for the commit 0178dde to get more accurate results

Additional details and impacted files
@@            Coverage Diff             @@
##             main      #98      +/-   ##
==========================================
+ Coverage   94.82%   94.85%   +0.03%     
==========================================
  Files           5        5              
  Lines        1043     1030      -13     
==========================================
- Hits          989      977      -12     
+ Misses         54       53       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

src/chainedvector.jl Outdated Show resolved Hide resolved
@mkitti
Copy link
Member

mkitti commented May 10, 2024

I'm working on more tests in #99

@Seelengrab
Copy link

Seelengrab commented May 10, 2024

Might I suggest an approach based on property based testing? A good one here would be

isargmax_maximum(cv) = cv[argmax(cv)] == maximum(cv)

I.e., indexing with the result of argmax should give the maximum value.

(substitute with argmin/minimum respectively for the minimum case.)

This catches the problem immediately, if used with random data. Such a property can also be used automatically with (shameless plug) Supposition.jl:

julia> using Supposition

julia> data_vector = Data.Vectors(Data.Integers{UInt8}(); max_size=100);

julia> vecs = Data.Vectors(data_vector; min_size=1, max_size=10);

julia> chainedvecs = map(vecs) do vs
           all(isempty, vs) && reject!()
           ChainedVector(vs)
       end;

julia> isargmax_maximum(cv) = cv[argmax(cv)] == maximum(cv)

julia> @check db=false isargmax_maximum(chainedvecs);
┌ Error: Property errored!
│   Description = "isargmax_maximum"
│   arg_1 =1-element ChainedVector{UInt8, Vector{UInt8}}:0x00
│   exception =
│    BoundsError: attempt to access 1-element ChainedVector{UInt8, Vector{UInt8}} at index [0x00]
│    Stacktrace:
│     [1] throw_boundserror(A::ChainedVector{UInt8, Vector{UInt8}}, I::Tuple{UInt8})
│       @ Base ./essentials.jl:14
│     [2] checkbounds
│       @ ./abstractarray.jl:699 [inlined]
│     [3] getindex
│       @ ~/.julia/packages/SentinelArrays/1kRo4/src/chainedvector.jl:94 [inlined]
│     [4] isargmax_maximum(cv::ChainedVector{UInt8, Vector{UInt8}})
│       @ Main ./REPL[39]:1
└ @ Supposition ~/Documents/projects/Supposition.jl/src/testset.jl:287
Test Summary:    | Error  Total  Time
isargmax_maximum |     1      1  0.7s

Of course, generating some random data and testing that would be sufficient for catching the problem. Supposition.jl just tries to give a minimal example, in this case ChainedVector([[0]]).

@mkitti
Copy link
Member

mkitti commented May 10, 2024

I found another discrepancy.

julia> c = ChainedVector([[7, 21, 31], [45], [53, 53]])
6-element ChainedVector{Int64, Vector{Int64}}:
  7
 21
 31
 45
 53
 53

julia> findmax(c)
(53, 6)

julia> findmax(collect(c))
(53, 5)

@gdalle
Copy link
Contributor Author

gdalle commented May 10, 2024

Did you try to see if my fix resolves it?

I agree that adding a flurry of tests, property testing and other stuff seems like a good idea in the medium run. My goal with this PR is simpler: there was a thing that was badly broken, it is now fixed with a minimal set of tests. I don't thing merging this and thinking about future, better tests are two incompatible actions.

@mkitti
Copy link
Member

mkitti commented May 10, 2024

I agree that these are orthogonal approaches. I have merged this branch into mine on #99.

The question is what is the problem identified in #97 ? To me the problem is that testing is severely insufficient.

@gdalle
Copy link
Contributor Author

gdalle commented May 10, 2024

And that argmax is wrong. My PR only resolves the latter

Comment on lines +289 to +296
@test findmax(x) == (95, 8)
@test findmin(x) == (14, 7)
@test argmax(x) == 8
@test argmin(x) == 7
@test findmax(inv, x) == (inv(14), 7)
@test findmin(inv, x) == (inv(95), 8)
@test argmax(inv, x) == 14
@test argmin(inv, x) == 95

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@test findmax(x) == (95, 8)
@test findmin(x) == (14, 7)
@test argmax(x) == 8
@test argmin(x) == 7
@test findmax(inv, x) == (inv(14), 7)
@test findmin(inv, x) == (inv(95), 8)
@test argmax(inv, x) == 14
@test argmin(inv, x) == 95
@test let (val, idx) = findmax(x)
max_val = maximum(x)
val == max_val == x[idx]
end
@test let (val, idx) = findmin(x)
min_val = minimum(x)
val == min_val == x[idx]
end
@test x[argmax(x)] == maximum(x)
@test x[argmin(x)] == minimum(x)
@test let (val, idx) = findmax(inv, x)
max_val = maximum(inv, x)
val == max_val == inv(x[idx])
end
@test let (val, idx) = findmin(inv, x)
min_val = minimum(inv, x)
val == min_val == inv(x[idx])
end
@test x[argmax(inv, x)] == maximum(inv, x)
@test x[argmin(inv, x)] == minimum(inv, x)

IMO writing it like this would be a bit more robust against changing x in the future and forgetting to change the hardcoded data. This also ends up testing that argmax actually has to return an index, and not just the right number for this test case :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll adopt this in #99

Copy link
Member

@quinnj quinnj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @gdalle!

@quinnj quinnj merged commit c047384 into JuliaData:main May 10, 2024
4 checks passed
@Moelf
Copy link

Moelf commented May 10, 2024

can we make a release?

@gdalle gdalle deleted the findmaxmin branch May 11, 2024 11:18
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

argmax returns incorrect results
5 participants