Automated Field Solver #2070

Draft
wants to merge 9 commits into
base: main

dopplershift
Member

Description Of Changes

Finally have a really solid cut at the last major deliverable from the last grant, and at long last close out #3. The implementation, once I got my head around it, was remarkably straightforward--owing to how much other infrastructure we have developed in xarray + units. The major pieces:

  • A registry for the functions, with a decorator to mark them. This decorator can take a full list of all return values and input fields. I also allowed it to just use the parameter names (since we've been nicely consistent) where possible, as well as use the function name for the output parameter (possible less frequently, e.g. heat_index vs. relative_humidity_from_mixing_ratio). Aside: why does wind_chill take speed and not wind_speed? 🤦‍♂️
  • Breadth-First Search (BFS) through the "graph" of functions. This isn't really a full graph as such, as the path at each node depends on what's needed next (and nothing out there, e.g. NetworkX, seemed to make this easier). I was actually quite happy with how that came together--owing in all seriousness to so many Advent of Code problems.
  • Trickier is calling each of the needed functions, automatically mapping the fields in the dataset to the function parameters. Weird naming, case sensitivity, and xarray coords vs. variables make this more complicated than I'd like.
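
To make the three pieces above concrete, here is a minimal standalone sketch of the idea. It is not the code in this PR: `REGISTRY`, `register`, `plan`, and `calculate` are hypothetical names, the decorated functions are toy stand-ins with placeholder arithmetic (no real physics), and it assumes only one registered function per output field, which is simpler than the search described above.

```python
import inspect
from collections import deque

# Hypothetical registry: output field name -> (function, input field names).
REGISTRY = {}


def register(output=None, inputs=None):
    """Mark a function as able to calculate `output` from `inputs`."""
    def decorator(func):
        # Default to the function name and its parameter names.
        params = tuple(inspect.signature(func).parameters)
        REGISTRY[output or func.__name__] = (func, tuple(inputs or params))
        return func
    return decorator


@register(output='relative_humidity')
def relative_humidity_from_mixing_ratio(pressure, temperature, mixing_ratio):
    return mixing_ratio / 0.02  # placeholder arithmetic, not physics


@register()
def heat_index(temperature, relative_humidity):
    return temperature + 5 * relative_humidity  # placeholder arithmetic


def plan(target, available):
    """Return the fields to calculate, in an order that respects dependencies."""
    # BFS outward from the target to collect everything that is missing.
    needed, queue = [], deque([target])
    while queue:
        name = queue.popleft()
        if name in available or name in needed:
            continue
        if name not in REGISTRY:
            raise KeyError(f'No way to calculate {name!r}')
        needed.append(name)
        queue.extend(REGISTRY[name][1])

    # Order the steps: repeatedly take whatever has all of its inputs ready.
    have, order = set(available), []
    while len(order) < len(needed):
        ready = [n for n in needed
                 if n not in have and have.issuperset(REGISTRY[n][1])]
        if not ready:
            raise ValueError('Circular dependency between calculations')
        order.extend(ready)
        have.update(ready)
    return order


def calculate(target, data):
    """Run the planned functions, mapping field names to parameters."""
    fields = dict(data)
    for name in plan(target, fields):
        func, inputs = REGISTRY[name]
        fields[name] = func(*(fields[i] for i in inputs))
    return fields[target]
```

With `pressure`, `temperature`, and `mixing_ratio` available, `plan('heat_index', ...)` yields `relative_humidity` first and then `heat_index`. The "nothing is ready" check doubles as a crude way to detect call cycles.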

I've also already had to fix heat_index to allow broadcasting so that when we calculate e.g. relative_humidity using 1D isobaric, things flow through the whole pipeline fine. I'm sure there's more waiting--I already tried and failed to do equivalent_potential_temperature.
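
As a rough illustration of the kind of shape mismatch involved, here is a plain NumPy analogue. It is not the actual heat_index fix, and the shapes and values are made up. A 1D isobaric coordinate cannot be combined with a 3D field by trailing-axis alignment alone, so it needs explicit length-1 axes first (xarray can instead align by dimension name).

```python
import numpy as np

# Made-up shapes: 3 pressure levels on a 2 x 4 horizontal grid.
pressure = np.array([1000.0, 850.0, 700.0])   # dims: (level,)
temperature = np.full((3, 2, 4), 300.0)       # dims: (level, y, x)

# NumPy aligns trailing axes, so (3,) against (3, 2, 4) would fail.
# Give pressure length-1 y and x axes so it lines up on the level axis.
pressure_3d = pressure[:, np.newaxis, np.newaxis]

# Both arrays now share the full (level, y, x) shape.
p, t = np.broadcast_arrays(pressure_3d, temperature)
```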

Left to do:

  • Documentation and including this in the xarray/declarative materials
  • More heuristics and options for handling naming of fields. Right now this is kind of a crapshoot
  • Should we expose this through the xarray accessor?
  • Many more fixes for xarray indexing/broadcasting (e.g. Problems with xarray broadcasting/dimensions #2069)
  • Should we try to be efficient and do indexing in declarative before the calculation?
  • More tests (like in declarative)
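
For the field-naming item, one possible heuristic could look like the sketch below. This is only an assumption about what such matching might do, not what the PR implements: `match_field` is a hypothetical helper that tries a case-insensitive exact match, then accepts a single name that starts with the parameter name, and gives up when the choice is ambiguous.

```python
def match_field(param, names):
    """Return the entry of `names` that best matches `param`, or None."""
    def normalize(name):
        return name.lower().replace('-', '_')

    wanted = normalize(param)

    # 1. Exact match, ignoring case.
    for name in names:
        if normalize(name) == wanted:
            return name

    # 2. Otherwise accept a name that starts with the parameter name,
    #    but only if exactly one candidate exists.
    candidates = [n for n in names if normalize(n).startswith(wanted + '_')]
    return candidates[0] if len(candidates) == 1 else None
```

For example, `match_field('temperature', ['Temperature_isobaric', 'isobaric'])` picks `'Temperature_isobaric'`, while a list holding both `Temperature_isobaric` and `Temperature_sigma` returns `None` rather than guessing.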

Checklist

This is a pretty straightforward Breadth-First Search (BFS) through a
registry of calculations. The path accounts for what's available as we
traverse through the "graph".
Add a basic test for the calculate functionality.
Need to broadcast temperature and relative_humidity together.
Need to not include calculated parameters in what we "have" because it's
not actually available higher up in the call stack--instead manually
remove from the "need" list. This also means we now need the code that
detects and breaks call cycles.
This allows us to automatically calculate heat_index from NARR output.
@dopplershift
Member Author

If you want to see what this looks like currently:

import xarray as xr
from metpy.cbook import get_test_data
from metpy.plots import ContourPlot, ImagePlot, MapPanel, PanelContainer
from metpy.units import units

narr = xr.open_dataset(get_test_data('narr_example.nc', as_file_obj=False))

contour = ContourPlot()
contour.data = narr
contour.field = 'heat_index'
contour.level = 1000 * units.hPa
contour.linecolor = 'red'
contour.contours = 15

panel = MapPanel()
panel.area = 'us'
panel.layers = ['coastline', 'borders', 'states', 'rivers', 'ocean', 'land']
panel.plots = [contour]

pc = PanelContainer()
pc.size = (10, 8)
pc.panels = [panel]
pc.show()

image

@jthielen
Collaborator

🎉 🎉 🎉

Definitely +1 on exposing through the Dataset accessor.


Given that this implementation is centered around parameter names, there are a few potentially problematic use cases that I wanted to get your thoughts on. I have no expectation that these are handled right away (if at all), but I wanted to bring them up now so that 1) we don't bake in any assumptions we'd regret later and 2) I can get on the same page with regard to what is in-scope and out-of-scope for the solver.

  • CAPE (and other profile calculations)
    • e.g., what do we do if a CF string of "atmosphere_convective_available_potential_energy" is given?
    • how are parcel options specified/defaults chosen?
    • how to enforce the need for data with a vertical dimension?
  • divergence (and other "generic" calculations)
    • a few example CF strings that'd be good to handle
      • "divergence_of_wind"
      • "derivative_of_air_temperature_wrt_time"
    • how to construct these generic things when CF naming doesn't provide guidance?
      • advection of some quantity?
      • Laplacian of some quantity?
      • Q-Vector divergence?
  • fields that need more input than just other fields
    • optional arguments (like Q-Vector's static_stability)
    • depth argument (e.g., "0-3 km Storm Relative Helicity")
  • multiple (or conflicting) variables of same (or similar) physical quantity
    • how to handle if on different levels (e.g., Temperature_isobaric, Temperature_sigma, Temperature_tropopause)
    • how to handle height (which is often the parameter name) vs. geopotential height (which is often what we mean by height, except in the geopotential conversion functions)?

@dopplershift dopplershift added this to the 1.3.0 milestone Nov 5, 2021
@dopplershift dopplershift modified the milestones: 1.3.0, May 2022 Mar 31, 2022
@dopplershift dopplershift modified the milestones: May 2022, July 2022 May 16, 2022
@dopplershift dopplershift removed this from the September 2022 milestone Sep 15, 2022
Development

Successfully merging this pull request may close these issues.

Automatic field calculation
2 participants