feat!: introduce confidence scores for check facts #620
Signed-off-by: behnazh-w <[email protected]>
Under this new schema, the confidence score is recorded for a |
For this particular example, I think of |
I think the PR should be good to merge.
Thanks for the PR.
Looks good to me.
This PR changes the data model and allows specifying confidence scores for check results, which is especially useful when a check reports multiple candidate results. All of these confidence scores are added to the check tables in the database. However, only the fact with the highest confidence is shown in the HTML/JSON report. Justifications no longer need to be added manually to the CheckResultData. Instead, they are curated directly from the results in the table. If a column specifies a JustificationType in the column mapping, it is picked up automatically and rendered as plain text or href depending on the specified type. If a check fails or is skipped, we show a default "Not Available." justification. This makes it possible to create HTML/JSON reports reproducibly from the database. Signed-off-by: behnazh-w <[email protected]>
Confidence score and justification type
This PR allows specifying confidence scores for check results, which is especially useful when a check reports multiple candidate results. All of these confidence scores are added to the check tables in the database. However, only the fact with the highest confidence is shown in the HTML/JSON report.
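As a rough illustration of the "store every candidate, report only the most confident one" behaviour described above, here is a minimal sketch. `CandidateFact` and `fact_for_report` are hypothetical names invented for this example and are not the classes introduced by this PR; the confidence range is also an assumption.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class CandidateFact:
    """Hypothetical stand-in for one row of a check table."""

    value: str
    confidence: float  # assumed to lie in [0.0, 1.0]


def fact_for_report(facts: list[CandidateFact]) -> CandidateFact | None:
    """Return the highest-confidence fact, or None if there are no facts."""
    # All facts are assumed to be persisted elsewhere; only this one
    # would be surfaced in the HTML/JSON report.
    return max(facts, key=lambda fact: fact.confidence, default=None)


candidates = [
    CandidateFact("https://example.org/repo-a", 0.4),
    CandidateFact("https://example.org/repo-b", 0.9),
]
best = fact_for_report(candidates)
```

With ties, `max` keeps the first fact it encounters, so a real implementation may want an explicit tie-breaking rule.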
Justifications no longer need to be added manually to the `CheckResultData`. Instead, they are curated directly from the results in the table. If a column specifies a `JustificationType` in the column mapping, it is picked up automatically and rendered as plain text or href depending on the specified type. If a check fails or is skipped, we show a default `Not Available.` justification. This makes it possible to create HTML/JSON reports reproducibly from the database.
Refactoring
For this feature to work, the following refactorings were also made:
- Updated the `CheckFacts` ORM mapping, which all the check mappings inherit from. That caused some circular dependency issues for the `Expectation` ORM mapping, which I have refactored and improved accordingly.
- Added `asset_url` to `ProvenanceAvailableFacts`. That required adding a new `SLSAProvenanceData` wrapper class for GitHub release provenances.
Documentation
I have added elaborate instructions for adding a new check and explained the current interface.
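To make the justification handling from the "Confidence score and justification type" section more concrete, the sketch below derives a justification from a table row using per-column type hints. It deliberately avoids the real ORM: the `JustificationType` member names (`TEXT`, `HREF`), the `COLUMN_TYPES` dictionary, the column names, and `build_justification` are all assumptions made for illustration, not the interface added by this PR.

```python
from __future__ import annotations

from enum import Enum


class JustificationType(Enum):
    """Assumed members; the PR only says 'plain text or href'."""

    TEXT = "text"
    HREF = "href"


# Hypothetical column mapping: column name -> how to render it (None = skip).
COLUMN_TYPES: dict[str, JustificationType | None] = {
    "build_tool": JustificationType.TEXT,
    "asset_url": JustificationType.HREF,
    "internal_id": None,
}

DEFAULT_JUSTIFICATION = ["Not Available."]


def build_justification(row: dict | None, passed: bool) -> list:
    """Curate a justification from a row instead of writing it by hand."""
    if not passed or row is None:
        # Failed or skipped checks fall back to the default justification.
        return list(DEFAULT_JUSTIFICATION)
    rendered: list = []
    for column, kind in COLUMN_TYPES.items():
        value = row.get(column)
        if kind is None or value is None:
            continue
        if kind is JustificationType.HREF:
            rendered.append({column: value})  # label -> link target
        else:
            rendered.append(f"{column}: {value}")
    return rendered or list(DEFAULT_JUSTIFICATION)


row = {"build_tool": "maven", "asset_url": "https://example.org/a.jsonl", "internal_id": 7}
justification = build_justification(row, passed=True)
```

Because the justification is computed only from stored columns, regenerating a report from the same database rows yields the same text, which is the reproducibility property the description mentions.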