Experiment: switch to link by log odds window #229
base: main
Conversation
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #229 +/- ##
==========================================
+ Coverage 97.69% 97.72% +0.02%
==========================================
Files 32 32
Lines 1651 1714 +63
==========================================
+ Hits 1613 1675 +62
- Misses 38 39 +1
☔ View full report in Codecov by Sentry.
@ericbuckley @m-goggins Updated the code to make all tests pass (there's still one set of three tests involving belongingness that I'm not quite sure what to do with, but I'll continue noodling). I also ran the
Updating to mark all code here complete. All tests have been adjusted to match the new algorithm and everything passes. This is, from my view, "merge ready" code.
1. If both grades (previously seen and newly processed) are equal,
updating is easy: just take the result with the higher RMS.
2. If the existing grade is certain but the new grade is not, we
*don't* update: being above the Certain Match Threshold is a stricter
inequality than being within the Possible Match Window, so we don't
want to overwrite with less information. (Example: suppose the DIBBs
algorithm passes were switched, and Cluster A had an RMS of 0.918.
This would be a match grade of 'certain'. If Pass 1 ran after Pass 2,
and Cluster A scored an RMS of 0.922, that would grade as 'possible'.
But despite the higher RMS, Cluster A had already accumulated more points
and stronger separation, so we don't want to downgrade.)
3. If the new grade is certain but the existing grade is not, we
*always* update: being a 'certain' match is a stronger statement and
thus more worth saving. Consider the example above with the passes
as normal. It would be better to save the Pass 2 RMS of 0.918 that
graded as 'certain' than to keep the Pass 1 RMS of 0.922 that only
graded 'possible', since the user's previous profiling found these
matches higher quality. (A minimal sketch of these rules is shown below.)
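A minimal sketch of how these three rules could be expressed, assuming 'possible' and 'certain' are the only grades a tracked result can hold; the names below are illustrative, not the actual classes in this diff:

```python
# Illustrative only: `existing_*` is the previously saved result for a cluster,
# `new_*` is the result from the pass that just ran.
GRADE_RANK = {"possible": 0, "certain": 1}

def should_update(existing_grade: str, existing_rms: float,
                  new_grade: str, new_rms: float) -> bool:
    # Rule 1: equal grades -> keep whichever result has the higher RMS.
    if existing_grade == new_grade:
        return new_rms > existing_rms
    # Rule 2: an existing 'certain' is never overwritten by a new 'possible'.
    # Rule 3: a new 'certain' always replaces an existing 'possible'.
    return GRADE_RANK[new_grade] > GRADE_RANK[existing_grade]
```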
I think it's fine to keep the examples in here for now, but we might want to move them to a consolidated place eventually, maybe in design.md when it gets updated in #142
cmt: float
grade: str

def _do_update(self, earned_points, rms, mmt, cmt, grade):
What about update_score_tracking_row?
self.cmt = cmt
self.mmt = mmt

def handle_update(self, earned_points, rms, mmt, cmt, grade):
What about check_for_score_tracking_updates?
return rule_result


def grade_result(rms: float, mmt: float, cmt: float) -> str:
How about assign_match_grade, grade_rms_result, or grade_rms?
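For context on what any of these names would describe, here's a rough sketch of the grading logic under discussion, assuming mmt is the lower bound of the Possible Match Window and cmt is the Certain Match Threshold; the exact boundary conditions and the label used below the window are assumptions, not the code in this diff:

```python
def grade_result(rms: float, mmt: float, cmt: float) -> str:
    # Assumed semantics: at or above the Certain Match Threshold -> 'certain',
    # within the Possible Match Window -> 'possible', otherwise no match.
    if rms >= cmt:
        return "certain"
    if rms >= mmt:
        return "possible"
    return "no-match"  # placeholder label; the real grade name may differ
```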
certain_results = [x for x in results if x.grade == 'certain']
# re-assign the results array since we already have the higher-priority
# 'certain' grades if we need them
results = [x for x in results if x.grade == 'possible']
Suggested change:
results = [x for x in results if x.grade == 'possible']
possible_results = [x for x in results if x.grade == 'possible']
result_counts["above_lower_bound"] = len(results)
if not results:

if not results and not certain_results:
Suggested change:
if not results and not certain_results:
if not possible_results and not certain_results:
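Taken together, the two suggestions above would read roughly like this (the wrapper function and its parameters are hypothetical context just to show them side by side; the list comprehensions are as quoted from the diff):

```python
def summarize_pass(results, result_counts):
    # Hypothetical wrapper: `results` is the list of graded candidates
    # from a pass, `result_counts` the running tally dict.
    certain_results = [x for x in results if x.grade == 'certain']
    possible_results = [x for x in results if x.grade == 'possible']

    result_counts["above_lower_bound"] = len(possible_results)
    if not possible_results and not certain_results:
        return None  # no candidates in the Possible Match Window or above the CMT
    return certain_results, possible_results
```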
0.8,
0.925
Can you remind me where these numbers come from?
)

assert link.compare(rec, pat, algorithm_pass) is True
assert link.compare(rec, pat, algorithm_pass) == 0.35
Suggested change:
assert link.compare(rec, pat, algorithm_pass) == 0.35
assert link.compare(rec, pat, algorithm_pass) == algorithm_pass.kwargs["log_odds"]["IDENTIFIER"]
nit: this makes it clearer that the resulting score is expected to equal the log odds rather than be normalized to 1.0. Might consider it for other tests as well if you think it's a good idea.
)

# should pass as MR is the same for both
assert link.compare(rec, pat, algorithm_pass) is True
assert link.compare(rec, pat, algorithm_pass) == 0.35
Hm, this is beyond the scope of this PR, but I'm wondering if we should be treating all identifiers equally. It makes sense to me to give full points if the patient agrees with any identifier of the same type in a record, since you can have multiple MRNs, driver's licenses, etc. However, SSN is different because it's a national ID. I'm wondering if we should treat SSN differently.
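A rough sketch of the distinction being raised, purely illustrative and not how recordlinker actually compares identifiers (the type codes, data shapes, and any-match rule below are assumptions):

```python
def identifier_agrees(record_ids: set[tuple[str, str]],
                      patient_ids: set[tuple[str, str]],
                      id_type: str) -> bool:
    # Inputs are assumed (type, value) pairs. MRNs, driver's licenses, etc.
    # can legitimately have multiple values per person, so any shared value
    # of the same type earns full points.
    rec_vals = {v for t, v in record_ids if t == id_type}
    pat_vals = {v for t, v in patient_ids if t == id_type}
    if id_type == "SSN":
        # A national ID: a person should only have one, so require the
        # single values to agree exactly rather than any-overlap.
        return len(rec_vals) == 1 and rec_vals == pat_vals
    return bool(rec_vals & pat_vals)
```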
Description
Experimental branch showing how the guts of the code might change when switching from a windowed belongingness to a windowed log odds, with Dan's suggestion to frame the whole thing like a medical test result with interpretation layered on top (i.e. normalization pushed as far up the pipeline as possible, so all values the user deals with are between 0 and 1). Code is by no means PR ready or finalized (e.g. tests not handled, updates to schemas.Prediction and schemas.LinkResult not fully made, etc.), but I wanted to share the ideas.
NOTE: I think we should de-couple the log odds work from missing fields or changes to blocking to avoid scope creep and pushing this feature farther out. I think this milestone should just be about replacing belongingness with log odds.
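A hedged sketch of the normalization idea described above, assuming the score a pass reports is the sum of log-odds points earned divided by the maximum points available for the fields it evaluated (the names and exact formula here are assumptions for illustration, not the branch's actual code):

```python
def normalized_score(earned_points: float, log_odds: dict[str, float],
                     evaluated_fields: list[str]) -> float:
    # Push normalization up the pipeline: divide the points a candidate
    # earned by the maximum it could have earned in this pass, so every
    # value the user sees is already in [0, 1].
    max_points = sum(log_odds[f] for f in evaluated_fields)
    return earned_points / max_points if max_points else 0.0
```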
Related Issues
[Link any related issues or tasks from your project management system.]
Additional Notes
[Add any additional context or notes that reviewers should know about.]
<--------------------- REMOVE THE LINES BELOW BEFORE MERGING --------------------->
Checklist
Please review and complete the following checklist before submitting your pull request:
Checklist for Reviewers
Please review and complete the following checklist during the review process: