Background
In a number of cases, a user has org units in their profile that don't come close to matching the org unit on the article. So far, we've ignored such cases. But we may be able to use this data to cut down on false positives.
An example is personIdentifier = sue2002 and PMID = 36630615. Psychiatry (sue2002's org unit) is very different from Cell and Developmental Biology.
For our data set, I estimate this will improve accuracy by 0.5% by reducing the number of false positives. But given our use of organizational synonyms, the only way to tell for certain would be to run this for everyone.
Requirements
This Java file outputs, among other values, strategy.orgUnitScoringStrategy.organizationalUnitDepartmentMatchingScore, which is the score for a positive departmental match. I want to update the code so it also outputs an organizationalUnitDepartmentNegativeMatchingScore when all of these conditions hold:
identity.getOrganizationalUnits() != null
articleAffiliation != null
The articleAffiliation string contains "Department of ", "Division of ", etc., but the departmental match fails.
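The three conditions above could be sketched roughly as follows. This is a minimal illustration, not the code from the PR or the existing scoring strategy: the class name, the marker list, the penalty value of -1.0, the empty-list check, and the plain substring comparison are all assumptions made for the example. In particular, the real matcher uses organizational synonyms, which this sketch ignores.

```java
import java.util.List;
import java.util.Locale;
import java.util.Objects;

public class NegativeDepartmentMatchSketch {

    // Hypothetical marker phrases; the issue lists "Department of ", "Division of ", etc.
    static final List<String> UNIT_MARKERS = List.of("department of ", "division of ");

    // Hypothetical penalty value; the issue does not state the actual downweight.
    static final double NEGATIVE_MATCHING_SCORE = -1.0;

    // True if the affiliation names an org unit at all (contains one of the markers).
    static boolean mentionsOrgUnit(String articleAffiliation) {
        String lower = articleAffiliation.toLowerCase(Locale.ROOT);
        return UNIT_MARKERS.stream().anyMatch(lower::contains);
    }

    // Simplified stand-in for the real departmental match: case-insensitive substring
    // comparison only, with no synonym handling.
    static boolean matchesAnyOrgUnit(List<String> organizationalUnits, String articleAffiliation) {
        String lower = articleAffiliation.toLowerCase(Locale.ROOT);
        return organizationalUnits.stream()
                .filter(Objects::nonNull)
                .anyMatch(unit -> lower.contains(unit.toLowerCase(Locale.ROOT)));
    }

    // Returns the penalty only when all three conditions hold; otherwise 0.0.
    static double negativeMatchingScore(List<String> organizationalUnits, String articleAffiliation) {
        // Conditions 1 and 2: both inputs are present.
        // Treating an empty list like null is an assumption of this sketch.
        if (organizationalUnits == null || organizationalUnits.isEmpty() || articleAffiliation == null) {
            return 0.0;
        }
        // Condition 3a: the affiliation mentions a department, division, etc.
        if (!mentionsOrgUnit(articleAffiliation)) {
            return 0.0;
        }
        // Condition 3b: none of the person's org units match that affiliation.
        if (matchesAnyOrgUnit(organizationalUnits, articleAffiliation)) {
            return 0.0;
        }
        return NEGATIVE_MATCHING_SCORE;
    }

    public static void main(String[] args) {
        // Illustrative affiliation string, not the actual text for any PMID.
        String affiliation = "Department of Cell and Developmental Biology";
        System.out.println(negativeMatchingScore(List.of("Psychiatry"), affiliation));
    }
}
```

Returning 0.0 when the affiliation names no department at all keeps the penalty from firing on affiliations that simply carry no org unit information, which seems to be the intent of the third condition.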
See this PR. It hasn't been "tested" and it probably doesn't "work," but I think it's on the right track.
Here's how a particular downweight affects the number of true / false positives / negatives. This is from a set of ~200,000 articles.
Test case
The combination of personIdentifier = sue2002 and PMID = 36630615 should return this...