
Derivatives & map factors #2743

Merged · 83 commits · Dec 23, 2022
Conversation

@dcamron (Member) commented Oct 17, 2022:

This is a draft expanding @jthielen's implementation (drafted separately in #893) of derivatives with support for map factors, along with the machinery to calculate (or accept) those map factors. These should be the minimum commits needed to test vorticity calculations on most well-described xarray DataArrays. We will use this draft to expand tests/validation and covered functions, un-break the preprocessing, and clean up the implementation for inclusion in 1.4 this week.

@dcamron added labels: Type: Enhancement, Area: Calc, Area: Projections, Type: API Change, Area: Xarray (Oct 17, 2022)
@dcamron added this to the September 2022 milestone (Oct 17, 2022)
@github-advanced-security (bot) left a comment:

CodeQL found more than 10 potential problems in the proposed changes. Check the Files changed tab for more details.

(The CodeQL annotations on src/metpy/calc/tools.py and src/metpy/xarray.py were subsequently marked as fixed.)
@kgoebber (Collaborator) commented:
I did a quick check of this PR on the vorticity calculation, and it checks out exactly (except for the boundary) for projected coords, but there remains a very small difference (error on the order of 0.01) between the GEMPAK calculation and this map-projection-aware vorticity calculation for a lat/lon grid.

It appears the difference comes from the default latlong projection in PyProj using the WGS84 ellipsoid, where GEMPAK uses a generic sphere. If you create the CRS using `CRS(proj='latlong', ellps='sphere')`, you can almost perfectly recreate the GEMPAK calculation.

I don't know that one (the generic sphere vs. WGS84) is more correct than the other, as projecting a three-dimensional sphere onto a 2D plane involves many trade-offs. I would be comfortable moving ahead with the vector derivatives as done in this draft PR.
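For reference, a minimal sketch of the two CRS definitions being compared here, with the ellipsoid made explicit rather than relying on defaults (pyproj):

from pyproj import CRS

# Ellipsoidal lat/lon, per the WGS84 default discussed above
wgs84_crs = CRS(proj='latlong', ellps='WGS84')

# Generic sphere, matching GEMPAK's assumption
sphere_crs = CRS(proj='latlong', ellps='sphere')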

@dopplershift force-pushed the spherical-calculations branch 3 times, most recently from 10307bb to f052ea2 (October 19, 2022 18:57)
@dopplershift (Member) commented:

@kgoebber I can cut that to ~0.0041 in your test notebook:

import xarray as xr
import metpy.xarray  # noqa: F401, registers the .metpy accessor

gfs_data = xr.open_dataset("data/gfs_test_data.nc").metpy.assign_crs(
    {"grid_mapping_name": "latitude_longitude",
     "earth_radius": 6371229}
)

That's the value (now) assigned by the TDS when decoding the GRIB. It's also close to a value of 6371221.3 I found in some GEMPAK projection code. I think PROJ's default of WGS84 is fine when metadata are missing. Eventually we may want to be smarter and fill in values for known grids (à la #1289), but I think our math here is sound.

Full details:

Mean Comparison
  Mean Values (MetPy): -0.040905677429023034
  Mean Values (GEMPAK): -0.04004379610999328

Max Comparison
  Max Values (MetPy): 37.24095589762783
  Max Values (GEMPAK): 37.241

Min Comparison
  Min Values (MetPy): -109.24897749426847
  Min Values (GEMPAK): -109.249

Various Statistical Analyses
  Average Absolute Difference: 0.004197746873401458
  RMS Error: 7.436825124739824e-05
  Standard Deviation of Difference: 0.03743137278627013
  Max Diff: 0.8858499824667784
  Min Diff: -0.7658125597139265
  Correlation: 0.99996073096731
  Relative Magnitude Difference: 0.00011271855870028423

@dcamron (Member, Author) commented Nov 21, 2022:

@kgoebber whenever you get a chance, can you drop us any new comparisons you've put together? We're testing some functions today, and more sanity checks are always better.

@kgoebber (Collaborator) commented Nov 21, 2022 via email.

@kgoebber (Collaborator) commented:
Okay, so I was able to do a bit more digging on frontogenesis, and it appears that the new decorator (parse_grid_arguments) is giving different dx and dy values from those given by the old decorator (add_grid_arguments_from_xarray). So it is giving the correct information for divergence, shearing, stretching, and total deformation, but not for the other derivatives.

"""
# u, v, *, dx=None, dy=None, x_dim=-1, y_dim=-2, longitude=None, latitude=None, crs=None,
# parallel_scale=None, meridional_scale=None, return_only=None
from ..xarray import dataarray_arguments

Check notice

Code scanning / CodeQL

Cyclic import

Import of module [metpy.xarray](1) begins an import cycle.
@dopplershift (Member) commented Nov 23, 2022:

Current todo list, mainly to capture context before I go off for the Thanksgiving holiday and forget all of this:

  • Clean up (implement, or more likely remove) the latitude_from_xarray flag (and any todos), which was originally aimed at handling e.g. absolute_vorticity, which needs latitude for reasons other than mapping
  • Look more at frontogenesis, which may be a product of needing to account for map factors (especially in the projected case) for scalar gradients (GEMPAK says Del(f) = <mx * df/dx, my * df/dy>; see the sketch after this list)
  • Implement some more testing for frontogenesis/gradient
  • Remove unused testing fixtures
  • Implement a test for calculating vorticity across the pole (regression test for incorrect sign in metpy.calc.vorticity when using lambert_azimuthal_equal_area projection #2582)
  • Based on gradient/frontogenesis: what about q-vector and inertial advective wind? Based on discussion: YES.
  • geostrophic_wind and advection
  • potential_vorticity_baroclinic
  • ageostrophic_wind
  • inertial_advective_wind
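For reference, a minimal sketch of the map-factor-scaled scalar gradient in the GEMPAK note above. This is illustrative only: np.gradient stands in for MetPy's first_derivative, and mx/my are assumed to be map factors already broadcast to the grid shape.

import numpy as np

def scalar_gradient(f, dx, dy, mx, my):
    """GEMPAK-style Del(f) = <mx * df/dx, my * df/dy> on a regular grid."""
    # np.gradient returns derivatives along axis 0 (y) first, then axis 1 (x)
    dfdy, dfdx = np.gradient(f, dy, dx)
    return mx * dfdx, my * dfdy

# Example on a synthetic field with unit map factors (Cartesian case)
f = np.random.rand(5, 7)
gx, gy = scalar_gradient(f, dx=3000.0, dy=3000.0, mx=1.0, my=1.0)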

@kgoebber (Collaborator) commented:

After further testing, the issues I was having with frontogenesis and potential_vorticity_baroclinic resulted from using the new method for deriving dx and dy values without having the map factor incorporated into the other derivatives occurring in the calculation. For example, in frontogenesis, the calculations of dthetadx and dthetady were no longer capturing the varying grid spacing with the new parse_grid_arguments decorator. Once you include parallel_scale and meridional_scale in those first_derivative calculations, you get the expected result from GEMPAK.

For example, I added the scale factors as follows within the frontogenesis function:

# Get gradients of potential temperature in both x and y
ddy_theta = meridional_scale * first_derivative(potential_temperature, delta=dy,
                                                axis=y_dim)
ddx_theta = parallel_scale * first_derivative(potential_temperature, delta=dx,
                                              axis=x_dim)

Comparing the calculations of gradient on NAM data, again, if not using lat/lon values to do derivatives, parallel_scale and meridional_scale are needed to get the "correct" values. An added complexity with NAM data is that, for the wind components, it very much matters whether the components are grid-relative or north-relative. It appears that GEMPAK converts the wind components to north-relative before doing any calculation. I was able to reproduce the GEMPAK output for gradient very faithfully for a non-vector scalar field, and got reasonably close for the u-component of the wind by first converting that component from its original grid-relative value to a north-relative wind. There are some lingering differences that I believe are attributable to the different methods for converting the winds to north-relative.
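For context, a rough sketch of the kind of grid-relative to north-relative rotation involved, using PROJ's meridian convergence. This is illustrative only, not GEMPAK's method; the grid parameters are hypothetical, and the sign convention of the angle should be validated against known data.

import numpy as np
from pyproj import Proj

# Hypothetical Lambert conformal grid broadly similar to the NAM's
proj = Proj(proj='lcc', lat_1=25.0, lat_2=25.0, lon_0=-95.0)

# Angle between grid north and true north at a sample point (degrees)
gamma = np.deg2rad(proj.get_factors(-80.0, 40.0).meridian_convergence)

# Rotate grid-relative wind components to north-relative ones
u_grid, v_grid = 10.0, 5.0
u_north = u_grid * np.cos(gamma) + v_grid * np.sin(gamma)
v_north = -u_grid * np.sin(gamma) + v_grid * np.cos(gamma)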

So what does all of this mean for our implementation? I think we should go with a method where all calculations do NOT use lat/lon, and instead bake parallel_scale and meridional_scale into all of our gridded calculations. The challenge is how to deal with Cartesian data, where we would otherwise not set the scale values. We could have the decorator return a value of 1, or default the scales to 1 instead of None.
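A minimal sketch of that defaulting idea (the helper name is hypothetical; the actual decorator logic may differ):

def resolve_scale_factors(parallel_scale=None, meridional_scale=None):
    """Fall back to unit map factors so Cartesian grids share the same code path."""
    if parallel_scale is None:
        parallel_scale = 1.0
    if meridional_scale is None:
        meridional_scale = 1.0
    return parallel_scale, meridional_scale

# With no projection information (plain Cartesian data), the scaled
# derivatives reduce to the unscaled finite differences
mx, my = resolve_scale_factors()
assert mx == my == 1.0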

@dopplershift (Member) commented:

Ok, I've updated absolute vorticity and adjusted the decorator infrastructure to account for its new needs (we were dynamically injecting latitude, etc. into the function signature, but it already had it and needed it passed through). This also resulted in a few places using squeeze(), because absolute_vorticity needs to be passed a broadcast latitude array so it can be added to vorticity.

@kgoebber (Collaborator) commented:

When I run absolute vorticity against my comparison tests, I get a broadcast error. The latitude variable is still one-dimensional.

@dopplershift (Member) commented:

@kgoebber Good catch. I've fixed it (we were prematurely using lat/lon as unit arrays rather than DataArrays) and added a test. Interestingly, the test didn't crash (using our 4D data) but had incorrect values, due to lat and lon both having the same size.

@kgoebber (Collaborator) commented Dec 1, 2022:

Absolute vorticity checks out now for GFS.

I'm running into problems with projected datasets (e.g., NAM, GFS 104 grid) that have two-dimensional index variables (e.g., x/y or lat/lon). Here is the error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
Cell In [5], line 10
      7 data_var_u = nam_data['u-component_of_wind_isobaric'].metpy.sel(time=datetime(2018, 3, 8, 0), vertical=500*units.hPa).metpy.quantify()
      8 data_var_v = nam_data['v-component_of_wind_isobaric'].metpy.sel(time=datetime(2018, 3, 8, 0), vertical=500*units.hPa).metpy.quantify()
---> 10 vor = mpcalc.vorticity(data_var_u, data_var_v)
     12 print('Vorticity \n')
     13 error_stats(vor[3:-3,3:-3]*1e5, out_var[3:-3,3:-3]*1e5)

File ~/unidata_stuff/MetPy/src/metpy/calc/tools.py:1162, in parse_grid_arguments.<locals>.wrapper(*args, **kwargs)
   1160 if grid_prototype is not None:
   1161     coords = list(grid_prototype.metpy.coordinates('latitude', 'longitude'))
-> 1162     p_scale = xr.DataArray(p_scale,
   1163                            coords=coords).broadcast_like(grid_prototype)
   1164     m_scale = xr.DataArray(m_scale,
   1165                            coords=coords).broadcast_like(grid_prototype)
   1167 bound_args.arguments['parallel_scale'] = p_scale

File ~/miniconda3/envs/devel/lib/python3.10/site-packages/xarray/core/dataarray.py:412, in DataArray.__init__(self, data, coords, dims, name, attrs, indexes, fastpath)
    410 data = _check_data_shape(data, coords, dims)
    411 data = as_compatible_data(data)
--> 412 coords, dims = _infer_coords_and_dims(data.shape, coords, dims)
    413 variable = Variable(dims, data, attrs, fastpath=True)
    414 indexes, coords = _create_indexes_from_coords(coords)

File ~/miniconda3/envs/devel/lib/python3.10/site-packages/xarray/core/dataarray.py:125, in _infer_coords_and_dims(shape, coords, dims)
    123     else:
    124         for n, (dim, coord) in enumerate(zip(dims, coords)):
--> 125             coord = as_variable(coord, name=dims[n]).to_index_variable()
    126             dims[n] = coord.name
    127 dims = tuple(dims)

File ~/miniconda3/envs/devel/lib/python3.10/site-packages/xarray/core/variable.py:543, in Variable.to_index_variable(self)
    541 def to_index_variable(self):
    542     """Return this variable as an xarray.IndexVariable"""
--> 543     return IndexVariable(
    544         self.dims, self._data, self._attrs, encoding=self._encoding, fastpath=True
    545     )

File ~/miniconda3/envs/devel/lib/python3.10/site-packages/xarray/core/variable.py:2720, in IndexVariable.__init__(self, dims, data, attrs, encoding, fastpath)
   2718 super().__init__(dims, data, attrs, encoding, fastpath)
   2719 if self.ndim != 1:
-> 2720     raise ValueError(f"{type(self).__name__} objects must be 1-dimensional")
   2722 # Unlike in Variable, always eagerly load values into memory
   2723 if not isinstance(self._data, PandasIndexingAdapter):

ValueError: IndexVariable objects must be 1-dimensional

Something goes wrong in xarray when we set the coordinate values for the parallel and meridional scale DataArrays.

@dopplershift (Member) commented Dec 1, 2022:

I can see why this doesn't work when lat/lon aren't the native coords (we're not creating the DataArray with any 1D coords):

        coords = list(grid_prototype.metpy.coordinates('latitude', 'longitude'))
        p_scale = xr.DataArray(p_scale, coords=coords).broadcast_like(grid_prototype)

Now the question is how to test that case...
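One possible way around it (a sketch, not necessarily the fix adopted here) is to build the scale DataArray on the prototype's dimensions and attach the 2D latitude/longitude as non-index coordinates, so xarray never tries to construct a 1D IndexVariable from them:

import numpy as np
import xarray as xr

# Stand-ins for a projected grid carrying 2D lat/lon auxiliary coordinates
ny, nx = 3, 4
lat2d = np.linspace(30, 50, ny * nx).reshape(ny, nx)
lon2d = np.linspace(-110, -80, ny * nx).reshape(ny, nx)

# Keyed (dims, values) coords become non-index coordinates, avoiding the
# "IndexVariable objects must be 1-dimensional" error seen above
p_scale = xr.DataArray(
    np.ones((ny, nx)),
    dims=('y', 'x'),
    coords={'latitude': (('y', 'x'), lat2d),
            'longitude': (('y', 'x'), lon2d)},
)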

@kgoebber (Collaborator) commented Dec 2, 2022:

Comparisons are looking good for GFS at this point, including frontogenesis and potential vorticity. There is more of a difference for potential vorticity (the correlation is 0.993, which is on the low end for our comparisons), but I feel that is likely due to the vertical differencing.
