Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

🐛 Fix filter_to_subgraph when passing the only argument #3197

Merged
merged 2 commits into from
Aug 27, 2024

Conversation

pabloarosado
Copy link
Contributor

@pabloarosado pabloarosado commented Aug 26, 2024

I noticed that to_dependency_order(dag=dag, includes=STEPS, only=True) was returning a list of steps that was larger than STEPS. But if I understand correctly, only should avoid that, right?
This PR may fix it (but I'm not sure if that behaviour was as expected, and I'm misinterpreting the meaning of only).

As an example:

from etl.steps import load_dag, to_dependency_order
STEPS = ['data://grapher/war/2023-09-21/ucdp', 'data://garden/war/2023-09-21/ucdp']
steps = to_dependency_order(
    dag=load_dag(),
    includes=STEPS,
    excludes=[],
    downstream=False,
    only=True,
)

as far as I understand, should return ['data://garden/war/2023-09-21/ucdp', 'data://grapher/war/2023-09-21/ucdp'], but instead it was returning ['data://garden/war/2023-09-21/ucdp', 'data://grapher/war/2023-09-21/ucdp', 'data://garden/war/2023-09-21/ucdp_prio'].

(this issue affected @lucasrodes while using StepUpdater)

@pabloarosado pabloarosado self-assigned this Aug 26, 2024
@pabloarosado pabloarosado marked this pull request as ready for review August 26, 2024 13:35
@owidbot
Copy link
Contributor

owidbot commented Aug 26, 2024

Quick links (staging server):

Site Admin Wizard

Login: ssh owid@staging-site-fix-filter-to-subgraph

chart-diff: ✅ No charts for review.
data-diff: ❌ Found differences
~ Dataset garden/who/latest/monkeypox
-   - title: Mpox confirmed cases and deaths
+   + title: Monkeypox
  = Table monkeypox
    ~ Column annotation (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column iso_code (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_cases (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_cases_per_million (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_cases_smoothed (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_cases_smoothed_per_million (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_deaths (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_deaths_per_million (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_deaths_smoothed (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column new_deaths_smoothed_per_million (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column total_cases (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column total_cases_per_million (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column total_deaths (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/
    ~ Column total_deaths_per_million (changed metadata)
-       -     title: Mpox confirmed cases and deaths
+       +     title: Monkeypox
-       -     citation_full: Mpox confirmed cases and deaths. World Health Organization; 2024.
+       +     citation_full: |-
+       +       Global Health Estimates 2021: Deaths by Cause, Age, Sex, by Country and by Region, 2000-2021. Geneva, World Health Organization; 2024.
-       -     url_main: https://frontdoor-l4uikgap6gz3m.azurefd.net/MPX/V_MPX_VALIDATED_DAILY
+       +     url_main: https://extranet.who.int/publicemergency/


Legend: +New  ~Modified  -Removed  =Identical  Details
Hint: Run this locally with etl diff REMOTE data/ --include yourdataset --verbose --snippet

Automatically updated datasets matching weekly_wildfires|excess_mortality|covid|fluid|flunet|country_profile|garden/ihme_gbd/2019/gbd_risk are not included

Edited: 2024-08-27 07:36:56 UTC
Execution time: 14.06 seconds

@Marigold
Copy link
Collaborator

Marigold commented Aug 26, 2024

It actually works correctly. It runs all steps that re.findall-match strings from includes. data://garden/war/2023-09-21/ucdp_prio is matched because it contains data://garden/war/2023-09-21/ucdp as a substring.

To only get the matching steps, you have to add $ as the suffix (to match the end of the string). Running

etlr data://grapher/war/2023-09-21/ucdp$ data://garden/war/2023-09-21/ucdp$ --dry-run --only

works as expected.

@pabloarosado
Copy link
Contributor Author

Thanks @Marigold! I have added an exact_match argument, to use when we want to "include" only an exact list of steps. What do you think?

Copy link
Collaborator

@Marigold Marigold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't is easier to add $ than to use --exact_match? I'd personally never use it since I always type just a part of the step and almost always get what I want, but if you find it useful, feel free to merge it.

@pabloarosado
Copy link
Contributor Author

pabloarosado commented Aug 27, 2024

Isn't is easier to add $ than to use --exact_match? I'd personally never use it since I always type just a part of the step and almost always get what I want, but if you find it useful, feel free to merge it.

Using $ is indeed easier, if you are always aware that you are passing a regular expression pattern. I think exact_match is much more understandable, and should be included in etlr as well. You can keep using $ if you find that more convenient.

@pabloarosado pabloarosado merged commit a10b1f6 into master Aug 27, 2024
8 checks passed
@pabloarosado pabloarosado deleted the fix-filter-to-subgraph branch August 27, 2024 07:50
Marigold pushed a commit that referenced this pull request Aug 30, 2024
* Fix filter_to_subgraph when passing the 'only' argument

* Add exact_match as an argument
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants