Generalize vorticity (and related functions) to allow for map projections #893

Closed
sgdecker opened this issue Jul 18, 2018 · 51 comments
Labels: Area: Calc (pertains to calculations), Type: Enhancement (enhancement to existing functionality)

@sgdecker (Contributor)

Some MetPy functions, such as gradient and advection, allow the user to supply arrays of deltas rather than a single value, but vorticity (and its relatives) does not. Vorticity is more complicated since derivatives of the "map factors" are also required, but it should be doable.

Here is the necessary computation according to the GEMPAK documentation (Appendix B2):
$$\zeta = m_x \frac{\partial v}{\partial x} - m_y \frac{\partial u}{\partial y} - \frac{m_x}{m_y}\,v\,\frac{\partial m_y}{\partial x} + \frac{m_y}{m_x}\,u\,\frac{\partial m_x}{\partial y}$$

Since the map factors (m_x and m_y) are equivalent to the "nominal" grid spacing (\partial x and \partial y in the above equation) divided by the actual delta, an additional "nominal_delta" argument to vorticity would allow for the map factors to be computed. Since the WRF model provides the map factors in its output, allowing the map factors (and the nominal grid spacing) as optional arguments to vorticity in lieu of deltas would be useful as well.
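For concreteness, here is a rough NumPy sketch of that GEMPAK-style computation. The function name, the finite-difference scheme via `np.gradient`, and the axis convention (y first) are my own choices; this is not MetPy's implementation:

```python
import numpy as np

def vorticity_with_map_factors(u, v, mx, my, dx, dy):
    """Illustrative sketch of vorticity on a projected grid.

    mx, my are the map factors (nominal grid spacing / true distance),
    dx, dy the nominal grid spacings; axis 0 is y and axis 1 is x.
    The last two terms are the map-factor corrections discussed above.
    """
    dvdx = np.gradient(v, dx, axis=1)
    dudy = np.gradient(u, dy, axis=0)
    dmy_dx = np.gradient(my, dx, axis=1)
    dmx_dy = np.gradient(mx, dy, axis=0)
    return (mx * dvdx - my * dudy
            - (mx / my) * v * dmy_dx
            + (my / mx) * u * dmx_dy)
```

On a truly Cartesian grid (mx = my = 1) the correction terms vanish and this reduces to the usual dv/dx - du/dy.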

@dopplershift dopplershift added the Area: Calc Pertains to calculations label Jul 24, 2018
@dopplershift (Member)

So while the docs for vorticity currently say that dx/dy have to be float (oops, that needs fixing), with the new finite differencing in 0.7, we can now handle vorticity for arbitrarily-spaced points. If you've tried to pass an array in for dx/dy, and it failed, that's a bug.

As far as the mapping/projection issues are concerned, that's still an open issue (covered in #174). So I'm a little unclear--is the map factor there to handle irregularly-spaced points or to handle issues from projection? or both?

@sgdecker (Contributor, Author) commented Jul 27, 2018

OK, I see that vorticity does compute something when I provide arrays for dx/dy.

Regarding the map factors, they are fundamentally there to handle issues from projection. One of those issues is that the data array is irregularly spaced, but another issue is that computing vorticity (more generally, the curl of a vector) in a quantitatively correct manner requires the map factors.

The current implementation of vorticity is correct if the underlying data is truly in Cartesian coordinates (even if unequally spaced). I suppose output from a cloud model would fit this description.

However, when the data is not truly Cartesian (e.g., NWP output on some map projection), the third and fourth terms above are necessary for a quantitatively correct calculation. In many (most?) instances, the third and fourth terms will be quite small, certainly not noticeable qualitatively (say, when looking at contours), but one case where the difference is noticeable is when vorticity is computed from data on a lat/lon grid and examined in polar regions (where, due to convergence of the meridians, the dmx/dy term gets large).

For instance, compare the GEMPAK and MetPy-derived plots below (Python code is at https://gist.github.com/sgdecker/496f7ea7edd98b428ac3adab0b5e0842; GEMPAK is at https://gist.github.com/sgdecker/1a0bbf2e18dd0a2f9b4b02c83e07509b):
[Figures: np_vort, gfs_vort]
Look closely and you'll see, for instance, that the 4 contour to the left (south, technically) of the vort max in the Arctic Ocean differs noticeably.

A mathematical way to see this is to note that vorticity in spherical coordinates is given by (with $\theta$ as colatitude):

$$\zeta = \frac{1}{r\sin\theta}\left[\frac{\partial v}{\partial \phi} + \frac{\partial (u\sin\theta)}{\partial \theta}\right]$$

which expands to:

$$\zeta = \frac{1}{r\sin\theta}\frac{\partial v}{\partial \phi} + \frac{1}{r}\frac{\partial u}{\partial \theta} + \frac{u\cot\theta}{r}$$
The final term (which corresponds to the dmx/dy term in GEMPAK's general expression) blows up at the pole, but if we are 1 degree away, the cotangent is order 10^2, so using U ~ 10 m/s, and r ~ 10^7 m, the term is on the order of 10^-4 s^(-1).
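As a quick numerical sanity check of that estimate (the values below are just the rough scales quoted above):

```python
import numpy as np

# u * cot(theta) / r at 1 degree colatitude, with U ~ 10 m/s and r ~ 6.4e6 m
theta = np.deg2rad(1.0)
u_scale = 10.0          # m/s
r = 6.371e6             # m, approximate Earth radius
pole_term = u_scale / (r * np.tan(theta))
# pole_term comes out on the order of 1e-4 per second
```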

The divergence and Laplacian operators will have the same issue.

@dopplershift (Member)

Thanks for the detailed analysis. It's what I figured, and definitely on our roadmap. Due to the nicely detailed analysis, let's leave this open for addressing the map factors, and we can close out #174.

(Putting the link to the NOAA writeup about map projections here: https://www.arl.noaa.gov/hysplit/cmapf-mapping-routines/)

@dopplershift dopplershift changed the title Generalize vorticity (and related functions) to allow for unequal grid spacing Generalize vorticity (and related functions) to allow for map projections Jul 31, 2018
@dopplershift dopplershift added the Type: Enhancement Enhancement to existing functionality label Jul 31, 2018
@deeplycloudy (Collaborator)

Nice resource in the ARL guide. I do want to suggest caution here and advocate retaining 3D coordinates as much as possible, as I did in the Cartopy PlateCarree saga: while the ARL guide has nice routines for 2D, and that's fine for maps, a 2D-only approach can make forward/inverse transforms and mixed-coordinate plotting difficult.

Anyway, proj.4 has a 3D-aware forward / inverse, and a quick search turns up partial derivative support through the proj_factors function - maybe that's useful in ensuring we account for what @sgdecker points out above. This issue on OSGeo also seems related.

@sgdecker (Contributor, Author) commented Aug 1, 2018

Since I mentioned divergence and the Laplacian operator, I went ahead and made plots of those as well to show the difference between GEMPAK and MetPy.
Here is 500-mb divergence:
[Figures: div500, gfs_div]
Here is the Laplacian of 500-mb height:
[Figures: lap500, gfs_lap]

Perusing the GEMPAK documentation, I found other calculations that also require taking derivatives of map factors to get absolutely correct results:

  • Horizontal partial derivatives of vectors (but not scalars), which in turn affects:
    • Q-vectors
  • Deformation (total, stretching, and shearing), which in turn affects:
    • Frontogenesis (also depends on divergence)
  • Inertial advection, which in turn affects:
    • Rossby number
  • Divergence-dependent calculations, such as:
    • Flux divergence
    • Layer-average mass divergence
  • Vorticity-dependent calculations, such as:
    • Absolute vorticity
    • Potential vorticity
  • Poisson equation solver

@kgoebber (Collaborator)

Okay, so I've been looking into this and have the following to report on absolute vorticity...

Here is the difference between GEMPAK and MetPy v0.10; note the very small range of color values.
[Figure: GEMPAK_difference_ABSVORT_v0.10]

The average difference is 0.0051 /s. If the following term is added, `+ (uwnd / (6375000 * units.meter)) * np.tan(np.deg2rad(lats[:, None]))`, the difference is reduced by an order of magnitude to essentially zero.
[Figure: GEMPAK_difference_ABSVORT_spherical]

The average difference is -0.00042. Note: The single strip of color at the poles is because MetPy's derivative does not assess vorticity at the top and bottom of the array. Excluding those rows does not appreciably affect the mean difference calculations.

The additive factor depends on the u-component of the wind and latitude, and it is not a universal or 3D solution to the issue, as it has the potential to be slightly different for some of the calculations (e.g., the Laplacian). Since the list is relatively small, this can be done for the various cases and added; however, with our current implementation it would require the addition of potentially two arguments: one signaling whether to use the spherical calculation, and another that contains the latitude array. This would not be a problem for absolute_vorticity, since it already requires the latitudes for the calculation of coriolis_parameter, so we would only need to add the spherical designation. I would additionally promote defaulting spherical to True, since that would be the much larger use case for our current user base.
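The correction term described above, sketched in plain NumPy (the function name, array shapes, and the broadcasting over a 1-D latitude vector are my assumptions):

```python
import numpy as np

EARTH_RADIUS = 6375000.0  # meters, the radius used in the comparison above

def spherical_correction(uwnd, lats_deg):
    """Additive term (u / a) * tan(lat) accounting for meridian convergence
    when vorticity is computed on a lat/lon grid.

    uwnd has shape (nlat, nlon) in m/s; lats_deg is a 1-D latitude array
    in degrees.
    """
    return (uwnd / EARTH_RADIUS) * np.tan(np.deg2rad(lats_deg))[:, None]
```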

@sgdecker (Contributor, Author)

I haven't tried implementing this yet, but I think the general solution would be to accept the data CRS object as an optional argument, rather than more specific things such as latitude. If that argument is not provided, assume the grid is Cartesian, and set the map factors to one. Otherwise, use the CRS object to obtain the map factors via proj4 (as discussed by @deeplycloudy). In either case, use the formula in my original comment as the basis for the computation. The part that is unclear to me without actually trying this yet is how difficult obtaining the map factors from the CRS is.
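If pyproj is available, the map scale factors can indeed be queried from a projection via `Proj.get_factors` (pyproj >= 2.6). A quick sketch; the specific stereographic parameters here are just an example, not tied to any grid in this thread:

```python
from pyproj import Proj

# Polar stereographic, true at 60N, on a sphere. For a conformal projection
# the meridional and parallel scales are equal and act as the map factor.
stere = Proj(proj='stere', lat_0=90, lat_ts=60, ellps='sphere')
factors = stere.get_factors(longitude=0.0, latitude=60.0)
# At the true-scale latitude the scale factor should be ~1.
```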

@jthielen (Collaborator)

This question is more related to the information needed for projection-aware derivative-based calculations than to the implementation, but I still wanted to ask, since it may end up being useful for the implementation.

I've recently been planning out a collection of "dataset helpers" to flesh out any missing CF metadata in xarray Datasets, so that all of the MetPy-xarray integration functionality (projections, coordinates, units, and hopefully soon variable identification and the field solver) works as smoothly as possible. WRF output has been the motivating example (see #1004).

With that in mind, what would be the best set of information to standardize having in a DataArray and/or Dataset to ensure that these projection-aware calculations can be carried out? Is it sufficient to have

  • projection x and y coordinates with a crs?
  • latitude and longitude dimension coordinates (if plate carrée/equirectangular) or auxiliary coordinates (if not)?

or am I missing something else?

The goal, I would think, would be to be able to calculate something like vorticity with a call such as

vorticity = mpcalc.vorticity(data['u_wind'], data['v_wind'])

and have everything else be obtained automatically from the DataArrays. This, if combined with some way of annotating the function and its arguments, would also really set the stage for the field solver.

@dopplershift (Member)

Correct, my ideal is to have the CRS information, as well as coordinates like x or lon, attached to the data arrays. If we can't get map factors from the CRS...that would be bad.

@jthielen (Collaborator)

While I had hoped that #1260 (which allows for easy calculation of both nominal and actual grid deltas from a parsed DataArray) would be enough to make this feasible for quick implementation for v1.0, #1275 raised the point about grid-relative vs. earth-relative winds that I had not fully considered with respect to this issue. So,

  • What is the correct behavior for these kinematic calculations with respect to earth-relative vs. grid-relative winds?
  • When working with geostrophic wind, how do we know when the result should be grid-relative (like it is now) or earth-relative (like would be needed for ageostrophic wind when the observed wind is earth-relative)?
  • Do we have any test data with expected results for these calculations that we can rely on here?

With all this extra complexity, though, I unfortunately don't see a way that this can feasibly be resolved with a stable API for v1.0 in the immediate future. So, will this issue need to be punted from v1.0, and instead, for now, just use a simple xarray input/output wrapper that fills in the default grid spacing (dx and dy) and coordinate-dependent parameters (f and latitude), and calculates these kinematic functions as they are now?

@sgdecker (Contributor, Author)

The equation way back in the first comment is valid for grid-relative u and v. If you used earth-relative winds on, say, a polar stereographic projection where North America is right-side up but Asia is upside down (so that grid-relative and earth-relative winds are nearly equal and opposite), I'd suspect adding the map factor terms would actually degrade the computation.

@dopplershift (Member)

@jthielen I agree about punting from 1.0. I'll do some milestone work later, but it's pretty clear we're not going to get the 1.0 we want, but the one we deserve. 😉 We can talk more when you get into AMS. Let me know.

@jthielen (Collaborator)

> @jthielen I agree about punting from 1.0. I'll do some milestone work later, but it's pretty clear we're not going to get the 1.0 we want, but the one we deserve. We can talk more when you get into AMS. Let me know.

Sounds good. I'll be getting in tomorrow afternoon. We can chat via email if you want to find a more specific time, otherwise I'll plan on stopping by the Unidata table at the career fair that evening.

@dopplershift (Member)

This notebook from @deeplycloudy may help shed some light on using PyProj for some complicated reprojections.

@kgoebber (Collaborator)

Looking at @sgdecker's previous notebook, I think the error he was seeing occurs in the cell run as In[37]: there was a call to an lcc_proj variable where that should have been a stereographic projection. The Map Factors graphic then makes a little sense, as it appears to make a cone with the highest map factors at the center top position.

Anyway, I think we have it solved (for vorticity at least); now we have to decide how this would best be implemented within MetPy. Currently, we do assign lat/lon values to all data arrays before calculation, but then we also add the dx/dy based on lat_lon_grid_deltas, which we don't want to do in this case. There is also a difference in calculating the grid dx (dx_grid) between the GFS, which natively comes with lat/lon (since it doesn't have projection x and y coordinates), and all of the others. As I get a few more cycles I will attempt to get versions of all of the affected equations (listed in a previous comment) working with map factors, unless someone else wants to take the baton from here.

The PR that I had previously started on this topic, #1963, should be closed and a new one completed, as we'll need to go in a completely different direction.

@sgdecker (Contributor, Author)

> Looking at @sgdecker's previous notebook, I think the error he was seeing occurs in the cell run as In[37]: there was a call to an lcc_proj variable where that should have been a stereographic projection. The Map Factors graphic then makes a little sense, as it appears to make a cone with the highest map factors at the center top position.

Thanks, @kgoebber! That was a bug for sure. Indeed, when I put grid104 there instead (which should have been the case all along), all the discrepancies in my notebook disappear.

I guess these functions will need conditionals that determine when to apply the map factors and when not to. Or, perhaps they set the map factors to 1 in the cases where they aren't needed so that the actual calculation is the same in all cases.

@kgoebber (Collaborator)

Okay, so I have compiled the vast majority of calculations affected by the spherical calculations. What this really all comes down to is whether we are doing a derivative on a vector quantity...so I have created a notebook that contains a vector_gradient function that computes the appropriate vector derivatives and then I use that throughout the notebook to calculate the various quantities.

I've primarily focused on the GFS calculations since I had them all done for the comparisons to GEMPAK calculations. I have also included the vorticity calculation for the NAM projection, NAM Stereographic projection, and GFS stereographic projection. There is a slight nuance to calculating the grid dx and dy between the lat/lon grid of the GFS and the other projected grids, which is captured in the vector_gradient function.

We can discuss potential implementation options on the next dev call.

Notebook is hosted at: https://gist.github.com/kgoebber/b3546977e55fef96d4b36f4ef573a788

@jthielen (Collaborator)

@kgoebber Question on the most recent notebook: In the dx and dy calculation for the latitude_longitude grid, you have

earth_radius = 6371200 * units.meter
dx = 2*np.pi*earth_radius / u.lon.size
dy = -dx

However, this looks to only be valid when assuming a spherical earth with radius 6371.2 km, longitude with global extent, and equal spacing in latitude and longitude. Would it be reasonable to generalize this to the following?

geod = u.metpy.pyproj_crs.get_geod()
dx = units.Quantity(
    geod.a * np.diff(u.metpy.longitude.metpy.unit_array.m_as('radian')),
    'meter'
)
lat = u.metpy.latitude.metpy.unit_array.m_as('radian')
lon_meridian_diff = np.zeros(len(lat) - 1)
forward_az, _, dy = geod.inv(
    lon_meridian_diff, lat[:-1], lon_meridian_diff, lat[1:], radians=True
)
dy[(forward_az < -90.) | (forward_az > 90.)] *= -1
dy = units.Quantity(dy, 'meter')

Essentially, no matter what subset or spacing of longitudes and latitudes, this calculates the dx from longitude on the equator of the ellipsoid and dy from latitude on the 0 degree meridian of the ellipsoid.
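For comparison (my sketch, not from the notebook): on a perfect sphere the same logic reduces to simple arc lengths, with the sign of dy following the latitude ordering:

```python
import numpy as np

def latlon_grid_deltas_sphere(lats_deg, lons_deg, radius=6371200.0):
    """dx from longitude differences along the equator, dy from latitude
    differences along a meridian; descending latitudes give negative dy."""
    dx = radius * np.deg2rad(np.diff(lons_deg))
    dy = radius * np.deg2rad(np.diff(lats_deg))
    return dx, dy
```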

@kgoebber (Collaborator)

Ah yes, I have only been working with the global GFS, so the logic would break down for smaller subsets.

A quick test of your generalization was good, except the dy values were not negative as they needed to be. All forward azimuths were 3.1415926 (pi, because of the 1-degree spacing?), so I don't know if there is some different logic needed for forward_az, or if just multiplying dy by -1 is best.

I don't match the GEMPAK values as well, but that would be expected because we are not using the radius that they define. Despite that, it is different by only a very, very small amount (what would be expected from using a different radius value).

@jthielen (Collaborator)

> A quick test of your generalization was good, except the dy values were not negative as they needed to be. All forward azimuths were 3.1415926 (pi, because of the 1-degree spacing?), so I don't know if there is some different logic needed for forward_az, or if just multiplying dy by -1 is best.

Yeah, not sure where that sign/direction error would be coming from. I'd be tempted to just reverse the directionality in the inv call (swap lat[:-1] and lat[1:]) to take the latitude differences backwards, but that feels wrong without understanding why the forward differences are giving the wrong result.

> I don't match the GEMPAK values as well, but that would be expected because we are not using the radius that they define. Despite that, it is different by only a very, very small amount (what would be expected from using a different radius value).

Does specifying the matching ellipsoid fix it? I.e., replace

gfs_data = xr.open_dataset('gfs_test_data.nc').metpy.parse_cf()

with

gfs_data = xr.open_dataset('gfs_test_data.nc').metpy.assign_crs({
    'grid_mapping_name': 'latitude_longitude',
    'earth_radius': 6371200
})

Also, I hacked away at a possible implementation: jthielen@aac4e8e. Right now,

  • vorticity is the only modified calculation,
  • the grid argument decorator has blown up in complexity,
  • I think it will not work right on functions where latitude is only required for map factor calculations,
  • and there are no tests whatsoever.

But, it's at least getting the ideas I had down in code, so if anyone wanted to take advantage of it as a head start towards an actual PR, go ahead! If not, I can try further iterating on it, but not sure when I'd be able to do so next.

@kgoebber (Collaborator)

All checks out with the specific radius. Actually improves it marginally!

@sgdecker (Contributor, Author)

@kgoebber Looks great! Thanks for putting all this together. It looks like there is some difference between the GEMPAK and MetPy PV calculations that isn't attributable to the map projection. Is that a known discrepancy (however minor it may be)?

@dcamron (Member) commented Nov 29, 2021

@dopplershift and I chatted about this earlier today; I'm going to spin off @jthielen's commit above into a draft PR this week and begin the process of implementing it across functions, examining existing tests, and adding new tests. If you are already working on a PR, let me know!

@jthielen (Collaborator) commented Feb 23, 2022

Not sure if this was the impression of other folks or not, but I previously conceptualized these map factor corrections as only occurring in derivatives on vector quantities. #2356 (comment) made me realize this does not seem to be the case; map factors need to be considered in any vector operation (so, also gradients of scalar fields, dot products, cross products, etc.). However, these corrections are much simpler (just coefficients on terms and they sometimes cancel out).

@jthielen (Collaborator)

I keep on getting nerd sniped with this topic of "vector calculus on orthogonal coordinates," and while I don't have much new to show for it yet, one particular concern came up:

Do we need to delineate between vector component arguments that are scaled according to the covariant basis versus the normalized basis? (See https://en.wikipedia.org/wiki/Orthogonal_coordinates#Covariant_basis for discussion.) Is it safe to assume we will always have things in the normalized basis?
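For reference, the standard relations between the component conventions (following the linked article, with scale factors $h_i$): a physical (normalized) component $\hat{v}_i$ relates to the contravariant component $v^i$ and the covariant component $v_i$ by

```latex
\hat{v}_i = h_i\, v^i = \frac{v_i}{h_i} \qquad \text{(no sum; orthogonal coordinates)}
```

so wind components reported as true speeds in m/s are already normalized components.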

@sgdecker (Contributor, Author)

> Do we need to delineate between vector component arguments that are scaled according to the covariant basis versus the normalized basis? (See https://en.wikipedia.org/wiki/Orthogonal_coordinates#Covariant_basis for discussion.) Is it safe to assume we will always have things in the normalized basis?

For what it's worth, the Pielke textbook says (p. 129 of the second edition) "By convention, the equations [the set of equations solved in an NWP model] are written in the contravariant form, using the covariant differentiation operation..." but that is in the context of a whole bunch of derivations. I think in practice all of the vector quantities MetPy would be dealing with are normalized (e.g., if the u-component of the wind is given as 20 m/s, it really is 20 m/s at that point).

@jthielen (Collaborator) commented Mar 4, 2022

On the topic of frequently having this problem in the back of my mind, I ended up writing up a rough proposal for addressing this "grid-correct vector calculus" problem on the data model level: pydata/xarray#6331. I doubt that all would be practical on the timeline we want to have this functionality implemented in MetPy, but, it could be a cool future goal?

@dopplershift (Member) commented Mar 4, 2022

> I doubt that all would be practical on the timeline we want to have this functionality implemented in MetPy, but, it could be a cool future goal?

👍 to all of that

@dopplershift dopplershift modified the milestones: 1.3.0, May 2022 Mar 31, 2022
@dopplershift dopplershift modified the milestones: May 2022, July 2022 May 16, 2022
@dopplershift (Member)

Implemented in #2743.
