-
-
Notifications
You must be signed in to change notification settings - Fork 416
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: quality test for nutriscore on olive oils #8360
Conversation
What is the difference between this new test (en-olive-oil-no-ingredients) and the already existing test named "olive-oil"? Maybe we could try with Virgin olive oils or/and Extra-virgin olive oils instead EDIT: just tested by replacing categories by "extra-virgin olive oil". It also results in: |
# the line should contain a single ingredient | ||
my $expected_ingredients = $'; # everything after the matched string | ||
|
||
if ($expected_ingredients =~ /,/i) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@stephanegigandet, is there a function that could be used there to make sure that the ingredient provided is in the ingredients taxonomy?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
there's an exist_taxonomy_tag() function, but that would imply that the ingredients taxonomy would need to be loaded and built before the categories taxonomy.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
then, I will leave it as is for now
Fixes #8353 1/ Remark: nutriscore grades are a, b, c, d, e letters. Whereas nutriscore scores are some some numbers. Hence, I renamed expected_nutriscore:en:c (suggested in the issue) to expected_nutriscore_grade:en:c 2/ Remark 2: following ingredients exist: en:extra-virgin olive oil, en:virgin olive oil and en:olive oil. 3/ Tried to prevent incorrect tags in Tags.pm (although these errors are ignored) 4/ Tried to differentiate between cases (missing nutriscore grade, which grade instead of which one) in the quality error names to allow more "fine-grained" monitoring. @CharlesNepote will that be a burden for data_quality taxonomy? (we can have variants for all categories that will have these 2 tags, different nutriscore grade values) see DataQualityFood.pm. |
Codecov Report
@@ Coverage Diff @@
## main #8360 +/- ##
==========================================
+ Coverage 48.61% 48.68% +0.07%
==========================================
Files 117 117
Lines 21751 21798 +47
Branches 4852 4859 +7
==========================================
+ Hits 10574 10613 +39
- Misses 9880 9888 +8
Partials 1297 1297
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
Great! Thanks a lot for the PR! :)
I would check that the actual ingredient is either the expected ingredient, or a child of it. There's a function for it: is_a("ingredients","en:virgin-olive-oil","en:olive-oil")
I would suggest to have only one generic error, that does not specify the category, something like "Nutri-Score does not match the Nutri-Score expected for the category". For monitoring, we can always add a /categories facet. If we put too many things in the tag, then it becomes impossible to translate them. One thing to note is that we show data errors very prominently in the producers platform, that's why we try to translate them. |
Reduced to 1 error for nutrition grade and 1 error for ingredient Included is_a:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If @stephanegigandet agrees, I think we can go on. Thanks A LOT @benbenben2, this is a very important improvement IMHO.
# add following tag for category having always same nutriscore grade | ||
# only 1 letter is allowed | ||
# expected_nutriscore_grade:en:c | ||
# add following tag for category having always same ingredient |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# add following tag for category having always same ingredient | |
# add following tag for category having always same ingredient |
Updated the tests results to make them pass. |
@stephanegigandet merge conflict |
Thank you @stephanegigandet! I was stuck |
Need help on this @stephanegigandet @alexgarel @teolemon |
@benbenben2 I pushed 5084698 which:
In your previous version, lots of expected_tests_results had "en:ingredients-single-ingredient-from-category-does-not-match-actual-ingredients" added, without any reason for that (like in cookies for example). |
Kudos, SonarCloud Quality Gate passed! |
Thanks a lot @alexgarel 🥇 |
Added a test to verify that we estimate the fruits/vegetables content of olive oils without ingredients to 100% for nutriscore computation.