Add food affordability data #1381
Looks good! I mostly added comments in the metadata suggesting that the units of the monetary measures be changed to constant international-$.
etl/steps/data/meadow/wb/2023-07-24/food_prices_for_nutrition.py
"NAC": "North America (WB)",
"SAS": "South Asia (WB)",
"SSF": "Sub-Saharan Africa (WB)",
"UMC": "Upper-middle-income countries",
The "(WB)" suffix seems to be applied inconsistently across the WB income groups: "Low-income countries" has it, but the high-income, upper-middle-income, and lower-middle-income groups do not.
Good catch! By the way, I'm assuming that their income groups are consistent (and up to date) across all World Bank datasets, and hence consistent with OWID's. Let me know if that's not a good assumption.
I assume that as well, but it will depend on when that dataset was released. If it's sufficiently old, it may use a previous definition of income groups. I see they change every year.
Now that you say it, the groups probably change dynamically from year to year, so that would be consistent. I don't know that with 100% certainty, though.
If they assume a definition of income groups that changes over the years, then those income groups are inconsistent with ours, because we assume the current definition of income groups. But for this dataset (with just a few years of data), I suppose it should not matter much.
Right, I wasn't sure about that. I would assume theirs is dynamic, but I don't know.
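To make the concern in this thread concrete, here is a minimal sketch of how a year-by-year income classification can disagree with a "current definition" one. The country names, years, and group assignments are made up for illustration, and the helper names are hypothetical; this is not how the World Bank or the ETL actually stores classifications.

```python
# Hypothetical data: income group assigned to each country in each year.
# Country names, years, and groups are invented for illustration only.
YEARLY_GROUPS = {
    ("Country A", 2017): "Lower-middle-income countries",
    ("Country A", 2018): "Lower-middle-income countries",
    ("Country A", 2019): "Upper-middle-income countries",
    ("Country B", 2017): "Low-income countries",
    ("Country B", 2018): "Low-income countries",
    ("Country B", 2019): "Low-income countries",
}


def current_groups(yearly):
    """Return each country's group in its latest available year."""
    latest = {}
    for (country, year), group in yearly.items():
        if country not in latest or year > latest[country][0]:
            latest[country] = (year, group)
    return {country: group for country, (_, group) in latest.items()}


def mismatches(yearly):
    """List (country, year) pairs whose yearly group differs from the current one."""
    current = current_groups(yearly)
    return sorted(key for key, group in yearly.items() if group != current[key[0]])


print(mismatches(YEARLY_GROUPS))
```

In this toy example only the years before Country A's reclassification differ, which matches the intuition that the issue should be small for a dataset covering just a few years.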
etl/steps/data/garden/wb/2023-07-24/food_prices_for_nutrition.countries.json
"BON": "Bonaire (WB)",
"EAS": "East Asia & Pacific (WB)",
"ECS": "Europe & Central Asia (WB)",
"HIC": "High-income countries",
In general, we use the hyphen only within "lower-middle" or "upper-middle". I don't think I have seen a hyphen before "income" (as in "high-income", "low-income", or "lower-middle-income") much.
This is currently how we name income groups:
MAPPING_CLASSIFICATION = {
    "..": np.nan,  # no available classification for country-year (maybe country didn't exist yet/anymore)
    "L": "Low-income countries",
    "H": "High-income countries",
    "UM": "Upper-middle-income countries",
    "LM": "Lower-middle-income countries",
    "LM*": "Lower-middle-income countries",
}
For consistency, those are the names we should use. We could consider changing them, but it would be a significant refactor, similar to the one for East Timor.
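A check along these lines could flag income-group names in a countries mapping that deviate from the names above. This is only a sketch: the function name and the example mapping are hypothetical, and the "LIC" entry is an invented illustration of the inconsistent "(WB)" suffix discussed in this review.

```python
# Canonical names, copied from the values of MAPPING_CLASSIFICATION quoted above.
CANONICAL_INCOME_GROUPS = {
    "Low-income countries",
    "High-income countries",
    "Upper-middle-income countries",
    "Lower-middle-income countries",
}


def nonstandard_income_groups(mapping):
    """Return entries that look like income groups but do not use a canonical name."""
    return {
        code: name
        for code, name in mapping.items()
        if "income" in name.lower() and name not in CANONICAL_INCOME_GROUPS
    }


# Hypothetical excerpt of a countries mapping (not the actual file contents).
example = {
    "HIC": "High-income countries",
    "UMC": "Upper-middle-income countries",
    "LIC": "Low-income countries (WB)",
    "SAS": "South Asia (WB)",
}
print(nonstandard_income_groups(example))
```

Regions such as "South Asia (WB)" are left alone because the check only looks at names containing "income".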
etl/steps/data/garden/wb/2023-07-24/food_prices_for_nutrition.meta.yml
The indicator expresses the total number of people who cannot afford an energy-sufficient diet in a given country and year. The indicator is computed by multiplying the percentage of the population in a country unable to afford a healthy diet by population data taken from the World Development Indicators (WDI) of the World Bank. A value of zero indicates a null or a small number rounded down at the current precision level.
cost_of_a_healthy_diet:
  title: Cost of a healthy diet
  unit: current PPP$/person/day
It's odd that they use these units here. Comparing countries won't be very useful unless only the year 2017 is shown.
I see they use the ratio between this and the cost of an energy-sufficient diet below (and many other costs afterward). That ratio is not wrong, but only because the latter has data just for 2017.
This one is unclear to me. Currently in our chart we assume it's 2017$ (like all other variables), but that may not be accurate. I'll merge for now and consider restricting the chart to 2017.
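For reference, restricting the data to 2017 could look roughly like the sketch below. The rows, country names, and cost values are invented, and `restrict_to_year` is a hypothetical helper rather than anything in the ETL; the actual restriction might instead be done in the chart configuration.

```python
# Hypothetical rows; countries and cost values are made up for illustration.
ROWS = [
    {"country": "Country A", "year": 2017, "cost_of_a_healthy_diet": 3.10},
    {"country": "Country A", "year": 2018, "cost_of_a_healthy_diet": 3.25},
    {"country": "Country B", "year": 2017, "cost_of_a_healthy_diet": 2.80},
    {"country": "Country B", "year": 2019, "cost_of_a_healthy_diet": 3.00},
]


def restrict_to_year(rows, year=2017):
    """Keep only the rows for one year, so all values share the same price basis."""
    return [row for row in rows if row["year"] == year]


for row in restrict_to_year(ROWS):
    print(row["country"], row["cost_of_a_healthy_diet"])
```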
Add steps for World Bank's Food prices for nutrition dataset.
Also, I added a small improvement to `etl.harmonize`: it now includes the region code as another alias.
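The idea of treating the region code as one more alias can be sketched as follows. This is an assumption-laden illustration, not the actual `etl.harmonize` implementation: the `REGIONS` structure, the `build_alias_lookup` helper, and the example entries are all hypothetical.

```python
# Hypothetical regions table: code -> harmonized name and known aliases.
REGIONS = {
    "FRA": {"name": "France", "aliases": ["French Republic"]},
    "TLS": {"name": "East Timor", "aliases": ["Timor-Leste"]},
}


def build_alias_lookup(regions):
    """Map every alias (lower-cased), including the region code, to the harmonized name."""
    lookup = {}
    for code, info in regions.items():
        for alias in [code, info["name"], *info.get("aliases", [])]:
            lookup[alias.lower()] = info["name"]
    return lookup


lookup = build_alias_lookup(REGIONS)
print(lookup["tls"])          # matched via the region code
print(lookup["timor-leste"])  # matched via a regular alias
```

With the code included, a dataset that identifies entities by code (as the World Bank mapping in this PR does) could be matched without listing each code by hand.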