Automatically set sugars or saturated-fat to 0 when carbohydrates or fat is 0 #4561

Closed
stephanegigandet opened this issue Nov 27, 2020 · 9 comments · Fixed by #7781
Labels
🧽 Data quality https://wiki.openfoodfacts.org/Quality ✨ Feature Features or enhancements to Open Food Facts server 🚦 Nutri-Score Nutrition facts 🎯 P1

Comments

@stephanegigandet
Contributor

stephanegigandet commented Nov 27, 2020

What

Part of

@stephanegigandet stephanegigandet added ✨ Feature Features or enhancements to Open Food Facts server Nutrition facts labels Nov 27, 2020
@github-actions github-actions bot added the ⏰ Stale This issue hasn't seen activity in a while. You can try documenting more to unblock it. label Feb 27, 2021
@teolemon teolemon removed the ⏰ Stale This issue hasn't seen activity in a while. You can try documenting more to unblock it. label Jul 19, 2021
@CharlesNepote CharlesNepote added the 🧽 Data quality https://wiki.openfoodfacts.org/Quality label Oct 25, 2022
@CharlesNepote
Member

CharlesNepote commented Oct 25, 2022

As of 2022-10-25, more than 44,000 products fall into this case, representing 1.7% of all products.

More than 35,000 of them (1.38% of all products) also have a category, which means Nutri-Score could probably be computed for them.

@CharlesNepote
Member

A related question is whether it is worth knowing if the 0 is printed on the product or computed.

I find it interesting to know:

  • what is actually printed on the product and what is not (we can use the hyphen "-" to indicate that a value is not on the packaging)
  • what is computed and what is not

But that makes things more complicated.

Other nutrient values could also be computed from rules. For example, if a product is virgin olive oil and lacks some nutrient values, the missing values could be computed or deduced from category means or from other databases such as CIQUAL.

@CharlesNepote CharlesNepote added the 🚦Nutri-Score https://world.openfoodfacts.org/nutriscore label Oct 25, 2022
@alexgarel alexgarel moved this to 🔖 Sprint (max 10) in Product Opener - Sprint Jan 4, 2023
@alexgarel
Member

@stephanegigandet:

  • I imagine I can do this along the lines of fix_salt_equivalent (Food.pm), which is called in analyze_and_enrich_product_data (Products.pm)?

    I would write a fix_zero_carbohydrates_sugar and a fix_zero_fat_saturated function.

  • One question I have: if carbohydrates is 0 but I have a value for sugars (and likewise for fat), should I keep the wrong value? (I would say yes?)

  • Moreover, should I put "-" if I automatically fix sugars/saturated fat? It seems better to me than putting a 0. (I would verify with a test that the Nutri-Score computation treats it as 0 in this case.)

@stephanegigandet
Contributor Author

@alexgarel in fact I think the safest thing would be to not set values to 0, but instead to assume they are 0 when we compute the Nutri-Score.

If carbohydrates is 0 but sugars is not 0, I think we should not do anything: there is no way to know which value is correct.

Putting "-" for sugars is not a good idea. "-" indicates that the value is not on the package, but we have no way to know that; it could just be that the value was not entered.

So in short, I think we should just change Nutriscore.pm to assume sugars is 0 when we don't have a sugars value but we have a carbohydrates value.

@CharlesNepote
Member

CharlesNepote commented Jan 5, 2023

There are many cases where we would benefit from having xxx_computed or xxx_normalized fields, kept distinct from the values entered by users:

  • polyols and erythritol
  • carbs and polyols-sugar-erythritol
  • etc.

The logic of these fields would be:

  • only compute when the data is missing; when the data has been filled in by the user, just copy it(?)
  • only compute when possible: when too much data is missing, don't compute; this should be different from an estimation
  • do not keep the computed value if it triggers a data-quality error
  • do not keep the computed value if there is a coherence issue, e.g. carbs is 0 but sugars is not 0

Thus, our algorithms for Nutri-Score, Eco-Score, Nova, Energy, etc. should be based on the _computed values.
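To make the rules above concrete, here is a minimal sketch in Python (not the project's Perl, and not the code that was eventually merged). The function name add_computed, the PAIRS table and the plain nutrient keys are assumptions made for illustration. It covers the "copy when entered", "compute only when possible" and "skip on coherence issue" rules; the data-quality-error rule is left out.

```python
# Hypothetical mapping: each sub-nutrient and the nutrient that bounds it.
PAIRS = {"sugars": "carbohydrates", "saturated-fat": "fat"}


def add_computed(nutriments: dict) -> dict:
    """Return only the *_computed fields that can safely be derived."""
    computed = {}
    for child, parent in PAIRS.items():
        entered = nutriments.get(child)
        parent_value = nutriments.get(parent)
        if entered is not None:
            # Coherence issue (e.g. carbohydrates 0 but sugars 3): keep nothing.
            if parent_value is not None and entered > parent_value:
                continue
            # Value entered by the user: just copy it.
            computed[f"{child}_computed"] = entered
        elif parent_value == 0:
            # Missing value, but the parent is 0, so the child must be 0 too.
            computed[f"{child}_computed"] = 0.0
        # Otherwise too much data is missing: do not compute anything.
    return computed
```

With this shape, the entered values are never overwritten, and a consumer such as the Nutri-Score computation would read only the _computed keys.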

@stephanegigandet
Contributor Author

We could indeed have *_computed and *_estimated. It might be useful to think a bit more about it. The quick and dirty solution would be to suffix the fields and add them to the nutriments hash, but it's already very messy. Maybe we could add a bit more structure.

e.g.

nutrients->{as_sold|prepared}{per 100g|per serving}{listed|computed|estimated}{carbohydrates}
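As an illustration of how such a nested structure could be read, here is a small Python sketch (the project itself is Perl). The get_nutrient helper and the lookup order listed, then computed, then estimated are assumptions, not something decided in this thread.

```python
from typing import Optional, Tuple

# Assumed precedence: a value listed on the package wins over a derived one.
SOURCES = ("listed", "computed", "estimated")


def get_nutrient(
    nutrients: dict, preparation: str, basis: str, name: str
) -> Optional[Tuple[float, str]]:
    """Return (value, source) for the first source that has the nutrient."""
    by_source = nutrients.get(preparation, {}).get(basis, {})
    for source in SOURCES:
        value = by_source.get(source, {}).get(name)
        if value is not None:
            return value, source
    return None


# Example product: carbohydrates were entered, sugars were derived.
nutrients = {
    "as_sold": {
        "per 100g": {
            "listed": {"carbohydrates": 0.0},
            "computed": {"sugars": 0.0},
        }
    }
}
```

Returning the source alongside the value lets a caller tell a printed 0 from a computed one, which is the distinction raised earlier in the thread.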

@stephanegigandet
Contributor Author

@alexgarel As a short term solution, I think we could create a Nutrients.pm module (to keep Food.pm from growing) with a compute_nutrients($nutrients_ref) function that returns computed nutrients (with things like sugars and saturated fat set to 0).

We would then call it when building the input for the Nutri-Score computation, when checking nutrient data quality, etc.

Then we can take a bit more time to see if we should also store the result in the product data, and how to do it.
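A rough Python rendering of that short-term idea, for illustration only: the real function would live in a Perl Nutrients.pm module and take a hash reference, and the plain keys used here are assumptions. The point of the sketch is that the function returns a new structure and leaves the stored product data untouched.

```python
def compute_nutrients(nutrients: dict) -> dict:
    """Return a copy of the nutrients with implied zero values filled in."""
    computed = dict(nutrients)  # shallow copy: the input is not modified
    # Sugars cannot exceed carbohydrates, so 0 carbohydrates implies 0 sugars.
    if computed.get("sugars") is None and computed.get("carbohydrates") == 0:
        computed["sugars"] = 0.0
    # Same reasoning for saturated fat and fat.
    if computed.get("saturated-fat") is None and computed.get("fat") == 0:
        computed["saturated-fat"] = 0.0
    return computed


# Usage: derive the Nutri-Score input without touching the entered values.
entered = {"carbohydrates": 0, "fat": 0, "proteins": 1.2}
nutriscore_input = compute_nutrients(entered)
```

An entered sugars value is kept as-is even when carbohydrates is 0, in line with the earlier remark that there is no way to know which of the two values is correct.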

@alexgarel
Member

For the moment I decided to tweak the nutriscore_data structure.

@alexgarel
Member

It's done in #7947, waiting for reviews.

@alexgarel alexgarel moved this from 🔖 Sprint (max 10) to 🏗 In progress in Product Opener - Sprint Jan 5, 2023
@CharlesNepote CharlesNepote moved this from To do to In progress in 🧽 Ensuring Data Quality Jan 5, 2023
@github-project-automation github-project-automation bot moved this from In progress to Done in 🧽 Ensuring Data Quality Jan 10, 2023
@teolemon teolemon removed the 🚦Nutri-Score https://world.openfoodfacts.org/nutriscore label May 11, 2024

4 participants