Add food affordability data #1381

Merged
merged 4 commits into from
Jul 26, 2023

pabloarosado
Contributor

Add steps for the World Bank's Food prices for nutrition dataset.
Also, I made a small improvement to etl.harmonize: it now includes the region code as another alias.
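
As a rough illustration of the alias idea (not the actual `etl.harmonize` code), the sketch below builds a lookup in which a region's code resolves to the same harmonized name as its other aliases. The `REGIONS` structure, the `build_alias_index` and `harmonize` helpers, and the sample alias strings are all hypothetical; only the two code/name pairs come from the mapping in this PR.

```python
# Hypothetical region definitions: code -> harmonized name and known aliases.
REGIONS = {
    "SAS": {"name": "South Asia (WB)", "aliases": ["South Asia"]},
    "NAC": {"name": "North America (WB)", "aliases": ["North America"]},
}


def build_alias_index(regions):
    """Map every alias, and the region code itself, to the harmonized name."""
    index = {}
    for code, info in regions.items():
        # Treat the harmonized name and the region code as aliases too.
        for alias in [info["name"], code, *info["aliases"]]:
            index[alias.lower()] = info["name"]
    return index


ALIAS_INDEX = build_alias_index(REGIONS)


def harmonize(raw_name):
    """Return the harmonized name, or None if the input is unknown."""
    return ALIAS_INDEX.get(raw_name.strip().lower())
```

With this index, `harmonize("SAS")` and `harmonize("South Asia")` would both resolve to `"South Asia (WB)"`.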

@pabloarosado marked this pull request as ready for review July 25, 2023 13:20
Contributor

@paarriagadap left a comment

Looks good! I mostly added comments in the metadata to change the units of the monetary measures to constant international-$.

    "NAC": "North America (WB)",
    "SAS": "South Asia (WB)",
    "SSF": "Sub-Saharan Africa (WB)",
    "UMC": "Upper-middle-income countries",

Contributor

The (WB) suffix is applied inconsistently across the WB income groups: Low-income countries has it, but High income, Upper-middle income, and Lower-middle income do not.

Contributor Author

Good catch! By the way, I'm assuming that their income groups are consistent (and up to date) in all World Bank datasets (and hence consistent with OWID). Let me know if that's not a good assumption.

Contributor

I assume that as well, but it will depend on when that dataset was released. If it's sufficiently old, it may use a previous definition of income groups. I see the definitions change every year.

Contributor

Now that you mention it, the groups are probably reassigned dynamically each year, so that would be consistent. I'm not 100% certain of that, though.

Contributor Author

If they assume a definition of income groups that changes over the years, then those income groups are inconsistent with ours, because we assume the current definition of income groups. But for this dataset (with just a few years of data), I suppose it should not matter much.
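
To make the concern concrete, here is a minimal sketch comparing a year-by-year classification with a fixed, current one. The country name, years, and group assignments are invented for illustration and are not taken from World Bank or OWID data; `find_mismatches` is a hypothetical helper.

```python
# Hypothetical year-specific classification (a "dynamic" definition).
DYNAMIC_GROUPS = {
    ("Examplestan", 2017): "Lower-middle-income countries",
    ("Examplestan", 2018): "Lower-middle-income countries",
    ("Examplestan", 2019): "Upper-middle-income countries",
}

# Hypothetical fixed classification (the "current" definition, applied to all years).
CURRENT_GROUPS = {"Examplestan": "Upper-middle-income countries"}


def find_mismatches(dynamic, current):
    """Return (country, year) pairs where the two definitions disagree."""
    return sorted(
        (country, year)
        for (country, year), group in dynamic.items()
        if current.get(country) != group
    )


mismatches = find_mismatches(DYNAMIC_GROUPS, CURRENT_GROUPS)
print(mismatches)
```

In this made-up case, the two definitions disagree for 2017 and 2018. With only a few years of data, the number of affected country-years would likely stay small.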

Contributor

Right, I wasn't sure about that. I would assume theirs is dynamic, but I don't know.

    "BON": "Bonaire (WB)",
    "EAS": "East Asia & Pacific (WB)",
    "ECS": "Europe & Central Asia (WB)",
    "HIC": "High-income countries",

Contributor

In general, we use the hyphen only in "lower-middle" or "upper-middle". I don't think I have seen "high-income", "low-income", or "lower-middle-income" (with a hyphen before "income") much.

Contributor Author

This is currently how we name income groups:

    MAPPING_CLASSIFICATION = {
        "..": np.nan,  # no available classification for country-year (maybe country didn't exist yet/anymore)
        "L": "Low-income countries",
        "H": "High-income countries",
        "UM": "Upper-middle-income countries",
        "LM": "Lower-middle-income countries",
        "LM*": "Lower-middle-income countries",
    }

For consistency, those are the names we should use. We could consider changing them, but it would be a significant refactor, similar to the East Timor one.
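
One way to catch naming drift such as a stray `(WB)` suffix would be to compare the income-group names in a countries mapping against the canonical ones. The sketch below assumes the mapping is a plain dict; `CANONICAL_INCOME_GROUPS`, `find_nonstandard_income_groups`, and the sample entries (including the `LIC` code) are illustrative and not part of the actual ETL code.

```python
# Canonical names, copied from the values of MAPPING_CLASSIFICATION.
CANONICAL_INCOME_GROUPS = {
    "Low-income countries",
    "Lower-middle-income countries",
    "Upper-middle-income countries",
    "High-income countries",
}


def find_nonstandard_income_groups(mapping):
    """Return {code: name} for names that mention income but are not canonical."""
    return {
        code: name
        for code, name in mapping.items()
        if "income" in name.lower() and name not in CANONICAL_INCOME_GROUPS
    }


# Sample entries; the "(WB)" suffix on "LIC" mimics the inconsistency discussed.
sample_mapping = {
    "HIC": "High-income countries",
    "UMC": "Upper-middle-income countries",
    "LIC": "Low-income countries (WB)",
    "SAS": "South Asia (WB)",
}

nonstandard = find_nonstandard_income_groups(sample_mapping)
print(nonstandard)
```

Region names such as "South Asia (WB)" are skipped because they do not mention income, so only the suffixed income group is flagged.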

The indicator expresses the total number of people who cannot afford an energy-sufficient diet in a given country and year. The indicator is computed by multiplying the percentage of the population in a country unable to afford a healthy diet by population data taken from the World Development Indicators (WDI) of the World Bank. A value of zero indicates a null or a small number rounded down at the current precision level.

    cost_of_a_healthy_diet:
      title: Cost of a healthy diet
      unit: current PPP$/person/day

Contributor

It's weird that they use these units here. They won't be of much use for comparing countries unless only the year 2017 is shown.

Contributor

I see they use the ratio between this and the cost of an energy-sufficient diet below (and many other costs afterward), which is not wrong only because the latter has data just for 2017.

Contributor Author

This one is unclear to me. Currently in our chart we assume it's 2017$ (like all other variables), but that may not be accurate. I'll merge for now and consider restricting the chart to 2017.
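
Restricting the chart to 2017 could look roughly like the sketch below. The record layout and the values are made up for illustration; only the variable name `cost_of_a_healthy_diet` comes from the metadata, and `restrict_to_year` is a hypothetical helper.

```python
# Hypothetical records; the values are placeholders, not real data.
records = [
    {"country": "A", "year": 2017, "cost_of_a_healthy_diet": 3.0},
    {"country": "A", "year": 2018, "cost_of_a_healthy_diet": 3.2},
    {"country": "B", "year": 2017, "cost_of_a_healthy_diet": 2.5},
]


def restrict_to_year(rows, year):
    """Keep only the rows for a single year."""
    return [row for row in rows if row["year"] == year]


rows_2017 = restrict_to_year(records, 2017)
print([row["country"] for row in rows_2017])
```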

@pabloarosado merged commit b71be83 into master Jul 26, 2023
3 checks passed
@pabloarosado deleted the add-food-affordability-data branch July 26, 2023 09:54
2 participants