Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

12 generalize downloader #13

Open
wants to merge 14 commits into
base: main
Choose a base branch
from
Open

12 generalize downloader #13

wants to merge 14 commits into from

Conversation

liellnima
Copy link
Collaborator

@liellnima liellnima commented Oct 26, 2024

I am creating a pull request, so the branches are not splitting up too much over time.

Current state:

  • removed some parts that were blocking generalizing. e.g. sticking to the selected_scenario file instead of the broader files. removed other files (or moved them somewhere else) if I realized we are not using them and don't need them in the future.
  • simplified the constant files: simple model list instead of dictionary. generalized to one node-link per project.
  • extended options: we can use all desired models for cmip6 now
  • extended scenarios: we can use all desired scenarios/experiments for cmip6 now
  • extended projects: first naive approach to extend the downloader to different projects. This will need some major rewriting though, which is why I stopped this and would like to do this in a separate PR. There is a first naive implementation. I think we need an abstract Downlaoder, and then children for the different projects (input4mips, cmip6, cmip7). This way the user can implement Downloader classes for the projects we are not supporting.
  • max_ensemble number of ensemble_members: this was already supported by the previous code and works now as well.
  • removed some other things that we don't need to simplify the code
  • implemented some failure/breaking points where needed (because code is otherwise downloading stuff the user doesn't need). still needs to be done for the input4mips part as well.
  • each project gets its own constant files (not sure this is elegant). this way we can load the relevant constants for this project and a separate node link (necessary!). each project might have different models / variables or e.g. no models at all (see input4mips)
  • cmip7 (or cmip6plus) config is there, and I tested it - unclear if it's an server issue / pyesgf issue / or issue on our side
  • updated the yaml downloader configs accordingly to the named changes (tag "project" - if ommited it should default to cmip6)

The other future_cases are not working yet and have not been implemented yet. Please ignore those.

Next immediate steps: (Input4Mips handling)

  • input4mips: rewrite some funcs, such as generate_variables
  • split the raw / model vars in the cmip6 constant files for input4mips vs cmip6
  • get input4mips configs running as expected
  • remove default downloader behaviour

Other next steps:

  • Rewrite downloader class, so that the project types are handled in a clean way. Each project needs its own downloading function, different handling for special variables if desired, and init_globs (or sth like that) imo.
  • grid handling: we should get rid of the explicit grid mentions. first of all, different variables will need different grids (oceans need gr instead of gn e.g.). secondly, each variable should only have one grid available, i.e. we want to retrieve it automatically. This is a potential failure point for downloading certain data.
  • extend configs for testing the downloader so each model component is tested once (not just atmosphere and sea-ice as it is now)
  • think about reducing logger output / terminal output. Is it too much?

Backlog:

  • time-window (we can calmly ignore that for now)

@liellnima liellnima linked an issue Oct 26, 2024 that may be closed by this pull request
@liellnima
Copy link
Collaborator Author

I haven't run the tests yet and I can already see that the linters are failing here (it worked locally for me though).

I just wanted to make sure that you have access to the latest version and can look into the changes. And I think it would be good to merge current changes before I continue adapting the downloader further.

With those changes all my cmip6 test cases are now working (6/6), so that's already a big step :D

@f-PLT f-PLT self-requested a review November 5, 2024 15:55
liellnima and others added 7 commits November 19, 2024 20:23
…sgf.py. split up raw and model vars. remove unused constants.
…g. update attribute handling of class. rewrite some if-else blocks. unify model and raw input vars handling. update constants. rename emission handling funcs. add comments for attributes in downloader class.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Generalize Downloader
2 participants