Laplacian and Second Derivative #1389
Thank you for digging into this (and for prompting some flashbacks to numerical analysis)! While unfortunate, I am not surprised by this at all from a numerical analysis point of view, and I'm not sure I'd even label it a bug. Of course, on a mathematical level, the Laplacian (in Cartesian coordinates) is equivalently the sum of second derivatives and the divergence of the gradient. But they naturally differ if you are using an explicit n-point higher-order derivative formulation like MetPy does for its second derivative. I'd argue that MetPy's formulation is more correct (or at least more numerically well-behaved/rigorous), and that iteratively applying first derivatives composes in a kind of extra smoothing operation (since you're incorporating more non-local information at any given point). Also, unless I've missed something, I don't think GEMPAK even has true higher-order derivatives. All this being said, if MetPy relying on explicit second-order derivatives does not meet most users' expectations, I would also understand something being changed here. Also, @kgoebber, your toy example data unfortunately do not have enough non-linearity to expose the differences. Here's an example that does, however, showing both calculations.
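The "extra smoothing" point can be made concrete with a small, self-contained numpy sketch (illustrative only, not MetPy's implementation; the field here is made up): on a uniform grid, applying a centered first difference twice is algebraically identical to applying the compact three-point second difference to a [1, 2, 1]/4-smoothed version of the field.

```python
import numpy as np

rng = np.random.default_rng(0)
dx = 1.0
q = rng.standard_normal(64)  # a rough field, so the implicit smoothing is visible

def d1(f):
    """Centered first difference, valid on interior points."""
    return (f[2:] - f[:-2]) / (2.0 * dx)

# Applying the first difference twice yields the wide stencil
# ((q[i+2] - q[i]) - (q[i] - q[i-2])) / (4 dx^2), valid at i = 2 .. n-3.
composed = d1(d1(q))

# A [1, 2, 1]/4 smoother, then the compact second difference
# (s[i+1] - 2 s[i] + s[i-1]) / dx^2, trimmed to the same interior points.
smoothed = (q[2:] + 2.0 * q[1:-1] + q[:-2]) / 4.0
compact_on_smoothed = (smoothed[2:] - 2.0 * smoothed[1:-1] + smoothed[:-2]) / dx**2

# The two operators agree exactly (to round-off).
print(np.allclose(composed, compact_on_smoothed))
```

In other words, the composed operator carries a built-in smoother, which is exactly the non-local information mentioned above.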
Thanks for the great example @jthielen, I had a sneaking suspicion it had something to do with the numerical differencing on non-uniform grid spacing. Yes, I would agree it's not a bug; I should have chosen a different initial issue tag. I don't think we have to match GEMPAK, but we'll need to make it explicit that we are doing the calculation differently. If there appears to be demand, we can always add a function to do the calculation that way, but I would prefer to stay with the more numerically robust solution we currently have. You are correct that GEMPAK does not have true higher-order derivatives, just the first derivative/gradient. I've used the MetPy Laplacian calculation in some of my courses and never thought it was off or wrong, and there have only been a handful of notable exceptions between MetPy and GEMPAK calculations, so I wanted to make sure we went down this path to have a clear record of it. The other exceptions are wind speed (with GEMPAK's [outdated?] conversion factor) and potential temperature using the ratio 2/7 instead of 287/1004.
Okay, so here are some global maps of the differences between the two calculations. This image takes the Laplacian of the 500 hPa height field; they give largely the same answer, except for the noise discussed below. Honestly, I'm a bit torn as to which way we should go. The noise in the second-derivative form is not insubstantial and overwhelms the "signal" over northeast Russia (as an example). A simple test using the 5-point smoother with one pass greatly reduces the noise and provides smoothing similar to the iterative first-derivative calculation. However, some prime meridian artifacts then appear (see below). Other thoughts?
See also #893
I'm not so sure whether this is related to the type of grid. Does the difference scheme take into account the different types of model grids, like the five Arakawa staggered grids? For an Arakawa C grid, a first-order finite difference may have a half-grid offset relative to its original location, and a second-order finite difference will shift back to the original location.
These functions assume that the coordinate values (e.g. x) that you provide correspond directly to the scalar values, with no shift. The calculated derivative estimates are valid at these locations as well--no half-grid shift. I'm open to adding such capabilities, though I have no idea how readily that math can fit within our existing implementation.
I think you are right. It is not easy to generalize such calculations to various kinds of grids. Although most reanalysis data are on the Arakawa A grid (no shift), many model outputs are defined on staggered Arakawa grids. Since difference schemes may depend on the underlying grid, using the first-order derivative as a building block for higher-order derivatives may not be as good as calculating higher-order derivatives directly. If I understand it right, a second-order derivative computed by applying the first-order derivative twice involves `((q[i+2] - q[i]) - (q[i] - q[i-2])) / (4*dx*dx)`, while it may also be calculated as `((q[i+1] - q[i]) - (q[i] - q[i-1])) / (dx*dx)` (assuming a constant dx). The magnitude of the difference between the two depends on the smoothness of the q field. So I guess the best way is to use a higher-order function directly, rather than using the first-order one recursively.
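Those two stencils can be compared numerically with a quick numpy sketch (illustrative only; the test field and spacing are made up). The wide stencil from repeated first differences is effectively a compact second difference with spacing 2·dx, so its leading truncation error is about four times larger:

```python
import numpy as np

dx = 0.1
x = np.arange(0.0, 2.0 * np.pi, dx)
q = np.sin(3.0 * x)                       # enough curvature to expose the gap

def d1(f):
    """Centered first difference, valid on interior points."""
    return (f[2:] - f[:-2]) / (2.0 * dx)

# Wide stencil: ((q[i+2] - q[i]) - (q[i] - q[i-2])) / (4 dx^2)
wide = d1(d1(q))                          # valid at x[2:-2]

# Compact stencil: ((q[i+1] - q[i]) - (q[i] - q[i-1])) / dx^2,
# trimmed to the same interior points as `wide`
compact = (q[3:-1] - 2.0 * q[2:-2] + q[1:-3]) / dx**2

exact = -9.0 * np.sin(3.0 * x[2:-2])      # d^2/dx^2 of sin(3x)
err_wide = np.max(np.abs(wide - exact))
err_compact = np.max(np.abs(compact - exact))
print(err_wide / err_compact)             # roughly 4, per the dx^2 error scaling
```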
I've roughly imagined delegating this to something like xgcm (which has more robust handling of staggered grids), and then providing the option to plug in xgcm's derivative functions in place of MetPy's. But that's just one possibility among many for handling staggered grids. At the very least I think it would require leveraging xarray coordinate support and some measure of bounds checking. Also, @miniufo, if you're curious, MetPy's first and second derivatives use the three-point versions from Bowen and Smith (2005).
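For reference, the standard three-point second-derivative estimate on a non-uniform grid (in the style of the Bowen and Smith formulas; writing \(h_1 = x_i - x_{i-1}\) and \(h_2 = x_{i+1} - x_i\)) is

```latex
f''(x_i) \approx \frac{2\left[h_2\, f_{i-1} - (h_1 + h_2)\, f_i + h_1\, f_{i+1}\right]}{h_1 h_2 (h_1 + h_2)}
```

which reduces to the familiar \((f_{i-1} - 2 f_i + f_{i+1})/h^2\) when \(h_1 = h_2 = h\).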
And to be clear, while @kgoebber was doing some testing with repeated first-order derivatives, the Laplacian and second derivative functions in MetPy currently are not using first-order differences.
@jthielen @dopplershift Thanks for the paper. I understand much better now.
I've already used it.
Have you run into one of our calculations that doesn't work for your staggered dataset?
I have not used them, but I would like to give it a try. However, I am a little confused about their differences and usage. Here is what my staggered data look like:

```
In [62]: print(u); print(v)
<xarray.DataArray 'UVEL' (YC: 720, XG: 1440)>
dask.array<getitem, shape=(720, 1440), dtype=float32, chunksize=(720, 1440), chunktype=numpy.ndarray>
Coordinates:
  * YC       (YC) >f4 -89.75 -89.5 -89.25 -89.0 -88.75 ... 89.25 89.5 89.75 90.0
    iter     int32 dask.array<chunksize=(), meta=np.ndarray>
    time     int64 0
    crs      object Projection: latitude_longitude
  * XG       (XG) >f4 0.0 0.25 0.5 0.75 1.0 ... 358.75 359.0 359.25 359.5 359.75
    dyG      (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    dxC      (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    rAw      (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    hFacW    (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    maskW    (YC, XG) bool dask.array<chunksize=(720, 1440), meta=np.ndarray>
Attributes:
    standard_name:  UVEL
    long_name:      Zonal Component of Velocity (m/s)
    units:          m/s
    mate:           VVEL
<xarray.DataArray 'VVEL' (YG: 720, XC: 1440)>
dask.array<getitem, shape=(720, 1440), dtype=float32, chunksize=(720, 1440), chunktype=numpy.ndarray>
Coordinates:
  * XC       (XC) >f4 0.125 0.375 0.625 0.875 ... 359.375 359.625 359.875
    iter     int32 dask.array<chunksize=(), meta=np.ndarray>
    time     int64 0
    crs      object Projection: latitude_longitude
  * YG       (YG) >f4 -89.875 -89.625 -89.375 -89.125 ... 89.375 89.625 89.875
    dxG      (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    dyC      (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    rAs      (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    hFacS    (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    maskS    (YG, XC) bool dask.array<chunksize=(720, 1440), meta=np.ndarray>
Attributes:
    standard_name:  VVEL
    long_name:      Meridional Component of Velocity (m/s)
    units:          m/s
    mate:           UVEL
```

How do I calculate divergence using the MetPy functions? Following @kgoebber, I tried:

```python
dx, dy = mpcalc.lat_lon_grid_deltas(u.metpy.x, v.metpy.y)
div = mpcalc.divergence(u, v, dx, dy)
```

but I am not sure if this is correct.
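As an aside, the underlying computation can be sketched without MetPy at all. The following plain-numpy example (all names and the test field are made up for illustration, and the C-grid staggering is ignored, i.e. an A grid is assumed) computes divergence on a regular latitude/longitude grid via the spherical flux form, and checks it against a field with a known analytic divergence:

```python
import numpy as np

a = 6.371e6                                      # Earth radius [m]
lon = np.deg2rad(np.arange(0.0, 360.0, 1.0))     # longitude λ [rad], 1° grid
lat = np.deg2rad(np.arange(-80.0, 81.0, 1.0))    # latitude φ [rad], poles excluded
dlon = lon[1] - lon[0]
dlat = lat[1] - lat[0]
coslat = np.cos(lat)[:, None]

# Hypothetical test field: u = U sin(λ), v = 0, whose analytic divergence
# on the sphere is U cos(λ) / (a cos(φ)).
U = 10.0
u = U * np.sin(lon)[None, :] * np.ones((lat.size, lon.size))
v = np.zeros_like(u)

# Spherical flux-form divergence:
#   div = (∂u/∂λ + ∂(v cosφ)/∂φ) / (a cosφ)
dudlon = np.gradient(u, dlon, axis=1)
dvcosdlat = np.gradient(v * coslat, dlat, axis=0)
div = (dudlon + dvcosdlat) / (a * coslat)

div_exact = U * np.cos(lon)[None, :] / (a * coslat)
err = np.max(np.abs(div - div_exact)) / np.max(np.abs(div_exact))
print(err)  # small relative error from the finite differencing
```

MetPy's real functions additionally handle units and projections, which this sketch does not attempt.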
My comment from July was too terse. I think GEMPAK has it right. In spherical coordinates (which the problematic GFS data employs), second derivatives are not enough to compute the Laplacian. There are extra terms (which emerge if you expand out the divergence of the gradient in spherical coordinates). Treating spherical coordinates as a nonuniform Cartesian grid and then applying the Bowen and Smith expressions is insufficient, I believe, as it does not account for the fact that the unit vectors are not constant in spherical coordinates. (I'm no mathematician, but all the math in that paper seems to assume constant unit vectors.) So, to me it is no surprise that @kgoebber's calculations show better agreement when the divergence of the gradient approach is taken (although the divergence may be slightly off in spherical coordinates as well), as that will indirectly include approximations to some of the terms that are missing from the "add up second derivatives" approach. When I find some time, I would like to try to compute the Laplacian in spherical coordinates directly using some GFS data to see how it compares with what Kevin found.
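For concreteness, expanding the divergence of the gradient on a sphere of radius \(a\) (longitude \(\lambda\), latitude \(\phi\)) gives

```latex
\nabla^2 f
= \frac{1}{a^2 \cos^2\phi}\frac{\partial^2 f}{\partial \lambda^2}
+ \frac{1}{a^2 \cos\phi}\frac{\partial}{\partial \phi}\!\left(\cos\phi\,\frac{\partial f}{\partial \phi}\right)
= \frac{1}{a^2 \cos^2\phi}\frac{\partial^2 f}{\partial \lambda^2}
+ \frac{1}{a^2}\frac{\partial^2 f}{\partial \phi^2}
- \frac{\tan\phi}{a^2}\frac{\partial f}{\partial \phi}
```

so a plain sum of second derivatives omits the \(-(\tan\phi/a^2)\,\partial f/\partial \phi\) first-derivative (metric) term.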
I've been browsing through issues on the 1.3.0 milestone and wanted to check in about how this could be resolved as part of #893. Should we replace the implementation?
(Also, I put a reference to the side discussion about staggered grids in #1360, as I'd think any staggered grid handling in MetPy would come about through xgcm compatibility.)
I think going down the "vector derivative-based" route is the way to go.
I concur with @dopplershift. The first option + resolving #893 should fix this issue for free.
Closed by #2743.
Problem:
Trying to reproduce the output from GEMPAK for the `laplacian` calculation. MetPy has a different implementation of calculating the Laplacian, using the sum of the second derivatives of the values; GEMPAK takes the divergence of the gradient of the values.

If I use the gradient and divergence calculations from MetPy, I get a solution that correlates very well (0.999) but is slightly off calculation-wise. If I use the Laplacian (or natively use the `second_derivative` function), then I get an answer that doesn't correlate nearly as well (0.73).

Interestingly, if I use `first_derivative` twice to get a second derivative, I get the same solution as if I were using the `gradient` and `divergence` functions. Not surprising, since both of those functions use `first_derivative` under the hood. But using `first_derivative` twice does not give me the same solution as using `second_derivative`.

When I use made-up 1D or 2D data, where I know the answer to the first and second derivative, they all work out fine.

Examples below to play with.
Example 1
Output:
Example 2
Output:
Example 3
Output (differences are not insubstantial):