Laplacian and Second Derivative #1389

kgoebber · 2020-06-05T22:16:20Z

Problem:

Trying to reproduce the output from GEMPAK for the laplacian calculation. MetPy has a different implementation of calculating the Laplacian by using the sum of the second derivative of the values. GEMPAK takes the divergence of the gradient of the values.

If I use the gradient and divergence calculations from MetPy, I get a solution that correlates very well (0.999), but is slightly off calculation wise. If I use the Laplacian (or natively use the second_derivative function), then I get an answer that doesn't correlate nearly as well (0.73).

Interestingly, if I use the first_derivative twice to get a second derivative, I get the same solution as if I were using the gradient and divergence functions. Not surprising since both of those functions use first_derivative under the hood. But using the first_derivative twice does not give me the same solution as using second_derivative.

When I use made up 1D or 2D data, where I would know the answer to the first and second derivative, they all work out fine.

Examples below to play with.

Example 1

import metpy.calc as mpcalc
import numpy as np

y = -2*np.arange(0, 5)**2 + 5
print(mpcalc.second_derivative(y, delta=1))
print(mpcalc.first_derivative(mpcalc.first_derivative(y, delta=1), delta=1))

Output:

[-4.0 -4.0 -4.0 -4.0 -4.0] dimensionless
[-4.0 -4.0 -4.0 -4.0 -4.0] dimensionless

Example 2

import metpy.calc as mpcalc
import numpy as np

x = np.arange(-5, 6)
y = np.arange(-5, 6)
xx, yy = np.meshgrid(x, y)
f = xx**2 + yy**2
print(mpcalc.second_derivative(f, delta=1))
print(mpcalc.first_derivative(mpcalc.first_derivative(f, delta=1), delta=1))

Output:

[[2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0]] dimensionless
[[2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0] [2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0 2.0]] dimensionless

Example 3

from datetime import datetime
import metpy.calc as mpcalc
import numpy as np
import xarray as xr

date = datetime(2020, 6, 1, 0)
gfs_data = xr.open_dataset('https://thredds.ucar.edu/thredds/dodsC/grib/NCEP/GFS/Global_onedeg_ana/'
                           f'GFS_Global_onedeg_ana_{date:%Y%m%d_%H}00.grib2').metpy.parse_cf()
data_var_hght = gfs_data.Geopotential_height_isobaric.metpy.sel(time=date, vertical=500 * units.hPa).metpy.quantify()

data_var_lat = gfs_data.lat
data_var_lon = gfs_data.lon

dx, dy = mpcalc.lat_lon_grid_deltas(data_var_lon, data_var_lat)

data_test = mpcalc.laplacian(data_var_hght)

dzdy, dzdx = mpcalc.gradient(data_var_hght, deltas=(dy, dx))
data_test2 = mpcalc.divergence(dzdx, dzdy, dx, dy)

print(data_test.values)
print(data_test2.m)

Output (differences are not insubstantial):

[[            nan             nan             nan ...             nan
              nan             nan]
 [-5.50590864e-09 -5.50907902e-09  5.15984316e-09 ... -5.63441096e-09
   5.03455036e-09  5.03294556e-09]
 [ 2.60302066e-09  2.60458607e-09 -1.29180946e-09 ...  1.31759610e-09
  -1.35130490e-09 -1.35126594e-09]
 ...
 [-1.94440638e-09 -1.95399582e-09 -6.44458985e-10 ... -1.92992423e-09
  -5.83437519e-10 -5.88291002e-10]
 [-5.74953538e-09 -5.75114012e-09 -4.82535019e-10 ... -5.73834144e-09
  -4.74550480e-10 -4.79325510e-10]
 [            nan             nan             nan ...             nan
              nan             nan]]
[[            nan             nan             nan ...             nan
              nan             nan]
 [-8.18597714e-09 -2.80715478e-09  1.18025786e-09 ... -1.54098051e-09
   2.44806115e-09  7.82768321e-09]
 [ 3.61531068e-09  1.65605756e-09 -3.12910951e-10 ...  1.55354859e-23
  -6.86196855e-10 -2.02896995e-09]
 ...
 [-2.28371231e-09 -1.62142904e-09 -9.49835163e-10 ... -1.27248125e-09
  -9.25932002e-10 -2.47061318e-10]
 [-7.16707052e-09 -4.51270505e-09 -3.18572965e-09 ... -3.16794873e-09
  -1.84340204e-09  8.09758940e-10]
 [            nan             nan             nan ...             nan
              nan             nan]]

The text was updated successfully, but these errors were encountered:

jthielen · 2020-06-06T02:48:22Z

Thank you for digging into this (and for prompting some flashbacks to numerical analysis)!

While unfortunate, I am not surprised by this at all from a numerical analysis point-of-view, and I'm not sure I'd even label it a bug. Of course, on a mathematical level, the Laplacian (in Cartesian coordinates) is equivalently the sum of second derivatives and the divergence of the gradient. But, they naturally differ if you are using an explicit n-point higher-order derivative formulation like MetPy does for its second derivative. I'd argue that MetPy's formulation is more correct (or at least more numerically well-behaved/rigorous), and that iteratively applying first derivatives is composing in a kind of extra smoothing operation (since you're incorporating more non-local information at any given point). Also, unless I've missed something, I don't think GEMPAK even has true higher-order derivatives?

Though, all this being said, if MetPy relying on explicit second-order derivatives does not meet most users' expectations, I would also understand something being changed here.

Also, @kgoebber, your toy example data unfortunately do not have enough non-linearity to expose the differences. Here's an example that does, however, showing both nan propagation and value differences: https://nbviewer.jupyter.org/urls/dl.dropbox.com/s/5kt5gfkbm0kmsz7/metpy_issue1389_1.ipynb

kgoebber · 2020-06-08T13:56:56Z

Thanks for the great example @jthielen, I had a sneaking suspicion that it had something to do with the numerical differencing of non-uniform grid spacing. Yes, I would agree not a bug, I should have chosen a different initial issue tag. I don't think we have to match GEMPAK, but we'll need to make it explicit that we are doing the calculation differently and if there appears to be demand, we can always add a function to do that calculation in that way, but I would prefer to stay with the more numerically robust solution as we currently have it.

You are correct that GEMPAK does not have a true higher-order derivatives, just first derivative/gradient.

I've used the MetPy Laplacian calculation in some of my courses and I've never thought it was off or wrong, and there have only been the handful of notable exceptions between MetPy and GEMPAK calculations, so I wanted to make sure we rolled down this path to have a clear record of it. The other exceptions are wind speed (with GEMPAKs [outdated?] conversion factor) and potential temperature using the ratio of 2/7 instead of 287/1004.

kgoebber · 2020-06-12T14:54:31Z

Okay, so here are some global maps of the differences between a native second_derivative calculation of the Laplacian or iterative first derivatives through use of gradient and divergence.

This image is taking the Laplacian of the 500 hPa height field and they give largely the same answer, except that the second_derivative calculation is considerably noisier. This aligns well with @jthielen indicating that in a numerical analysis sense, there would be a bit of smoothing that happens when using an iterative first derivative method.

Honestly, I'm a bit torn as to which way we should go. The noise in the second derivative form is not insubstantial and overwhelms the "signal" over northeast Russia (as an example). A simple test of using the 5 point smoother with one pass greatly reduces the noise and provides a similar smoothing to the iterative first derivative calculation. However, there are some prime meridian artifacts that appear (see below).

Others thoughts?

sgdecker · 2020-07-24T17:36:43Z

See also #893

miniufo · 2020-08-22T12:48:17Z

Not so sure if this is related to the type of the grid. Does the difference scheme take into account different type of model grids? Like five Arakawa stagged grids. For Arakawa C grid, 1st-order finite difference may have half-grid offset relative to its original location, and 2nd-order finite difference will shift back to the original location.

dopplershift · 2020-08-22T20:19:17Z

These functions assume that the coordinate values (e.g. x) that you provide correspond directly to the scalar values, no shift. The calculated derivative estimates are valid at these locations as well--no half-grid shift.

I'm open to adding such capabilities, though I have no idea how readily that math can fit within our existing implementation.

miniufo · 2020-08-23T16:40:22Z

I think you are right. It is not easy to generalize such calculation to various kinds of grids. Although most reanalysis data are on Arakawa A grid (no shift), many model outputs are defined on stagged Arakawa grids. Since difference schemes may depend on the underlying grid, using 1st-order derivative as a building block for higher-order derivatives may not be so good as calculating higher-order derivatives directly.

If I understand it right, 2nd-order derivative using twice 1st-order derivative involves ((q[i+2] - q[i]) - (q[i] - q[i-2])) /dx/dx/4, while it may also be calculated as ((q[i+1] - q[i]) - (q[i] - q[i-1])) /dx/dx (assuming a constant dx). The magnitude of the differences between the two ways depends on the smoothness of the q field.

So I guess the best way is to use higher-order function directly, rather than using 1st-order one recursively.

jthielen · 2020-08-23T16:50:36Z

These functions assume that the coordinate values (e.g. x) that you provide correspond directly to the scalar values, no shift. The calculated derivative estimates are valid at these locations as well--no half-grid shift.

I'm open to adding such capabilities, though I have no idea how readily that math can fit within our existing implementation.

I've roughly imagined delegating this to something like xgcm (that has more robust handling of staggered grids), and then providing the option to plug in xgcm's derivative functions in place of MetPy's. But that's just one possibility among many for handling staggered grids. At the very least I think it would require leveraging xarray coordinate support and some measure of bounds checking.

Also, @miniufo, if you're curious, MetPy's first and second derivatives use the three point versions of Bowen and Smith (2005).

dopplershift · 2020-08-23T17:22:55Z

And to be clear, while @kgoebber was doing some testing with repeated 1st-order, the Lacplacian and second derivative functions in MetPy currently are not using the first-order differences--the second_derivative function is a direct implementation of the appropriate calculation, including accounting for variable delta_x as described in the Bowen and Smith paper referenced above.

miniufo · 2020-08-25T09:01:06Z

@jthielen @dopplershift Thanks for the paper. I understand much better now.

I've roughly imagined delegating this to something like xgcm (that has more robust handling of staggered grids), and then providing the option to plug in xgcm's derivative functions in place of MetPy's. But that's just one possibility among many for handling staggered grids.

I've already used xgcm in ocean-data analyses for a while, and hope there would be a way to apply wonderful metpy functions to stagged datasets.

dopplershift · 2020-08-25T09:10:22Z

Have you run into one of our calculations that doesn't work for your staggered dataset?

miniufo · 2020-08-28T07:21:53Z

I have not used metpy.calc before. Since the results of the derivative calculation is defined at the same grid point as the input, I just expected the result by metpy would be different from that by xgcm.

But I do would like a try. However, I am a little confused by the difference and usage between xarray and metpy.xarray. Given u and v as two xarray.DataArrays on a stagged grid over the globe:

In [62]: print(u); print(v)
<xarray.DataArray 'UVEL' (YC: 720, XG: 1440)>
dask.array<getitem, shape=(720, 1440), dtype=float32, chunksize=(720, 1440), chunktype=numpy.ndarray>
Coordinates:
  * YC       (YC) >f4 -89.75 -89.5 -89.25 -89.0 -88.75 ... 89.25 89.5 89.75 90.0
    iter     int32 dask.array<chunksize=(), meta=np.ndarray>
    time     int64 0
    crs      object Projection: latitude_longitude
  * XG       (XG) >f4 0.0 0.25 0.5 0.75 1.0 ... 358.75 359.0 359.25 359.5 359.75
    dyG      (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    dxC      (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    rAw      (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    hFacW    (YC, XG) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    maskW    (YC, XG) bool dask.array<chunksize=(720, 1440), meta=np.ndarray>
Attributes:
    standard_name:  UVEL
    long_name:      Zonal Component of Velocity (m/s)
    units:          m/s
    mate:           VVEL

<xarray.DataArray 'VVEL' (YG: 720, XC: 1440)>
dask.array<getitem, shape=(720, 1440), dtype=float32, chunksize=(720, 1440), chunktype=numpy.ndarray>
Coordinates:
  * XC       (XC) >f4 0.125 0.375 0.625 0.875 ... 359.375 359.625 359.875
    iter     int32 dask.array<chunksize=(), meta=np.ndarray>
    time     int64 0
    crs      object Projection: latitude_longitude
  * YG       (YG) >f4 -89.875 -89.625 -89.375 -89.125 ... 89.375 89.625 89.875
    dxG      (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    dyC      (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    rAs      (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    hFacS    (YG, XC) >f4 dask.array<chunksize=(720, 1440), meta=np.ndarray>
    maskS    (YG, XC) bool dask.array<chunksize=(720, 1440), meta=np.ndarray>
Attributes:
    standard_name:  VVEL
    long_name:      Meridional Component of Velocity (m/s)
    units:          m/s
    mate:           UVEL

how to calculate divergence using 'metpy' function? I tried:

dx, dy = mpcalc.lat_lon_grid_deltas(u.metpy.x, v.metpy.y)

div = mpcalc.divergence(u, v, dx, dy)

following @kgoebber but not sure if dx and dy are calculated properly, as the coordinates for u and v are slightly different. Besides, how to convert div back to xarray.DataArray. I am currently only farmiliar with xarray...

sgdecker · 2020-12-16T04:09:49Z

My comment from July was too terse. I think GEMPAK has it right. In spherical coordinates (which the problematic GFS data employs), second derivatives are not enough to compute the Laplacian. There are extra terms (which emerge if you expand out the divergence of the gradient in spherical coordinates). Treating spherical coordinates as a nonuniform Cartesian grid and then applying the Bowen and Smith expressions is insufficient, I believe, as it does not account for the fact that the unit vectors are not constant in spherical coordinates. (I'm no mathematician, but all the math in that paper seems to assume constant unit vectors.) So, to me it is no surprise that @kgoebber's calculations show better agreement when the divergence of the gradient approach is taken (although the divergence may be slightly off in spherical coordinates as well), as that will indirectly include approximations to some of the terms that are missing from the "add up second derivatives" approach. When I find some time, I would like to try to compute the Laplacian in spherical coordinates directly using some GFS data to see how it compares with what Kevin found.

jthielen · 2022-02-09T17:34:14Z

I've been browsing through issues on the 1.3.0 milestone, and wanted to check in about how this could be resolved as a part of #893. Should we replace the implementation of laplacian with

using the vector derivative-based divergence of the scalar gradient,
use a separate form of map factor based corrections on top of the second-order scalar derivatives, or
something else?

(Also, I put a reference to the side discussion about staggered grids in #1360, as I'd think any staggered grid handling in MetPy would come about through xgcm compatibility.)

dopplershift · 2022-02-11T20:32:53Z

I think going down the "vector derivative-based divergence of the scalar gradient" is the right choice here since it's the truest to the physics and involves the least amount of re-building the corrections. second_derivative still exists for the other case.

sgdecker · 2022-02-11T21:36:49Z

I concur with @dopplershift. The first option + resolving #893 should fix this issue for free.

dopplershift · 2022-12-23T22:02:54Z

Closed by #2743, which added geospatial_laplacian, which calculates the Laplacian as the divergence of the scalar gradient, including all of the projection/spherical corrections (if the coordinate/projection information is present).

kgoebber added the Type: Bug Something is not working like it should label Jun 5, 2020

jthielen added this to the 1.1 milestone Aug 12, 2020

dcamron modified the milestones: 1.1.0, 1.2.0 Aug 3, 2021

dopplershift modified the milestones: 1.2.0, 1.3.0 Jan 14, 2022

jthielen mentioned this issue Feb 9, 2022

Interoperability with xgcm and xcape #1360

Open

jthielen mentioned this issue Feb 22, 2022

Implement Jacobian determinant #2356

Open

dopplershift modified the milestones: 1.3.0, May 2022 Mar 31, 2022

dopplershift added Type: Enhancement Enhancement to existing functionality and removed Type: Maintenance Updates and clean ups (but not wrong) labels May 16, 2022

dopplershift modified the milestones: May 2022, July 2022 May 16, 2022

dopplershift mentioned this issue Dec 2, 2022

Derivatives & map factors #2743

Merged

dopplershift closed this as completed Dec 23, 2022

kgoebber mentioned this issue Sep 1, 2023

second_derivative shows different results from that of applying first_derivative twice #3175

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Laplacian and Second Derivative #1389

Laplacian and Second Derivative #1389

kgoebber commented Jun 5, 2020

jthielen commented Jun 6, 2020 •

edited

Loading

kgoebber commented Jun 8, 2020

kgoebber commented Jun 12, 2020

sgdecker commented Jul 24, 2020

miniufo commented Aug 22, 2020

dopplershift commented Aug 22, 2020

miniufo commented Aug 23, 2020

jthielen commented Aug 23, 2020 •

edited

Loading

dopplershift commented Aug 23, 2020 •

edited

Loading

miniufo commented Aug 25, 2020

dopplershift commented Aug 25, 2020

miniufo commented Aug 28, 2020

sgdecker commented Dec 16, 2020

jthielen commented Feb 9, 2022 •

edited

Loading

dopplershift commented Feb 11, 2022

sgdecker commented Feb 11, 2022

dopplershift commented Dec 23, 2022

Laplacian and Second Derivative #1389

Laplacian and Second Derivative #1389

Comments

kgoebber commented Jun 5, 2020

jthielen commented Jun 6, 2020 • edited Loading

kgoebber commented Jun 8, 2020

kgoebber commented Jun 12, 2020

sgdecker commented Jul 24, 2020

miniufo commented Aug 22, 2020

dopplershift commented Aug 22, 2020

miniufo commented Aug 23, 2020

jthielen commented Aug 23, 2020 • edited Loading

dopplershift commented Aug 23, 2020 • edited Loading

miniufo commented Aug 25, 2020

dopplershift commented Aug 25, 2020

miniufo commented Aug 28, 2020

sgdecker commented Dec 16, 2020

jthielen commented Feb 9, 2022 • edited Loading

dopplershift commented Feb 11, 2022

sgdecker commented Feb 11, 2022

dopplershift commented Dec 23, 2022

jthielen commented Jun 6, 2020 •

edited

Loading

jthielen commented Aug 23, 2020 •

edited

Loading

dopplershift commented Aug 23, 2020 •

edited

Loading

jthielen commented Feb 9, 2022 •

edited

Loading