-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
API Refactor - MatcherResults and metrics #70
Conversation
Codecov ReportAttention:
Additional details and impacted files@@ Coverage Diff @@
## master #70 +/- ##
==========================================
+ Coverage 86.36% 87.15% +0.79%
==========================================
Files 37 40 +3
Lines 1679 1760 +81
==========================================
+ Hits 1450 1534 +84
+ Misses 229 226 -3 ☔ View full report in Codecov by Sentry. |
@Archer6621 Is this ready to review? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Thanks Shaad!
This PR refactors the API #61 so that a
MatcherResults
object is returned which inherits fromdict
, instead of a plaindict
, when using eithervalentine_match
orvalentine_batch_match
. This should not break the existing API too much, as the return type is still adict
as before.This dictionary is sorted upon instantiation according to its values, from high similarity to low similarity (dictionaries preserve sorting/insertion order starting from Python 3.6).
This
MatcherResults
object exposes the following API methods:get_metrics
- gets metrics according to the matchesone_to_one
- transforms the matches so that they are one-to-one and returns a newMatcherResults
with thistake_top_percent
- takes the topn
percent of the matches and returns this as a newMatcherResults
take_top_n
- takes the topn
matches and returns this as a newMatcherResults
Aside from this new
MatcherResults
object, the metrics API has been overhauled as well. Metrics are now classes that inherit from the abstractMetric
class. These need to be instantiated with the appropriate parameters in order to be used, although all of these parameters should be keyword arguments and thus have a default.The API for this is as follows:
A final minor change is that the
Match
class got converted to a dataclass.Tests and
numpy
-style documentation have been added for the new additions, and the example + readme has been updated as well.Furthermore, with this change, it will become easy to also integrate #55 into the
MatcherResults
class, where it fits much better.