NextGen Forcings Engine Statistical Analysis on Regridding Methods
ExactExtract (isciences/exactextract: Fast and accurate raster zonal statistics (github.com))
Weighting Scheme
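weighted_mean = SUM(Value(i,j) * coverage_fraction(i,j)) / SUM(coverage_fraction(i,j)), where coverage_fraction(i,j) is the fraction of raster cell (i,j) covered by the polygon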
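The coverage-fraction weighting above can be illustrated with a minimal sketch. It is not the exactextract API; the function name and the sample values are hypothetical and only show the arithmetic of a coverage-weighted mean.

```python
def coverage_weighted_mean(values, coverage_fractions):
    """Mean of raster cell values, each weighted by the fraction of the cell covered by the polygon."""
    if len(values) != len(coverage_fractions):
        raise ValueError("values and coverage_fractions must have the same length")
    total_coverage = sum(coverage_fractions)
    if total_coverage == 0:
        raise ValueError("polygon does not cover any cell")
    weighted_sum = sum(v * f for v, f in zip(values, coverage_fractions))
    return weighted_sum / total_coverage


# Hypothetical example: three cells, the last one only half covered by the polygon.
cell_values = [10.0, 20.0, 40.0]
cell_coverage = [1.0, 1.0, 0.5]
catchment_mean = coverage_weighted_mean(cell_values, cell_coverage)
print(catchment_mean)
```

The half-covered cell contributes half as much as a fully covered cell, which is the partial-coverage handling that distinguishes this scheme from cell-center tests.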
Previous Research
- Results from exactextractr are more accurate than other common implementations because raster pixels that are partially covered by polygons are taken into account. Partial coverage matters more for polygons that are small or irregularly shaped. For the 5500 Brazilian municipalities used in the example, the error introduced by incorrectly handling partial coverage is less than 1% for 88% of municipalities and reaches a maximum of 9%. Note that this analysis focused on land surface cover types (static data). Our results (see below) reveal much larger errors for regridding estimates with dynamic meteorological forcing data.
ESMF (Regridding - Earth System Modeling Framework)
Different options that we tested within the ESMF:
- ESMF-Bilinear: Linear interpolation in 2 or 3 dimensions.
- ESMF-PATCH: Higher-order patch recovery: Patch rendezvous method of taking the least squares fit of the surrounding surface patches. This is a higher order method that may produce interpolation weights that may be slightly less than 0 or slightly greater than 1.
- ESMF-Conserve*: First-order conservative: Preserves the integral of the source field across the regridding. For this method, weight calculation is based on the ratio of source cell area overlapped with the corresponding destination cell area.
- ESMF-Conserve_2nd*: Second-order conservative: Like first-order conservative, this method preserves the integral of the source field across the regridding. Also, like the first-order, weight calculation is based on the ratio of source cell area overlapped with the corresponding destination cell area and allows the user to provide their own areas if desired. However, the second-order conservative calculation also includes the gradient across the source cell, so in general it gives a smoother, more accurate representation of the source field. This is particularly true when going from a coarse to finer grid.
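To make "preserves the integral of the source field" concrete, here is a small one-dimensional sketch of first-order conservative regridding. It is an illustrative analog only, not ESMF's implementation (which works on 2D/3D grids and meshes); the function names and sample grids are hypothetical, and cell "area" is reduced to interval length.

```python
def overlap_length(a_lo, a_hi, b_lo, b_hi):
    """Length of the overlap between intervals [a_lo, a_hi] and [b_lo, b_hi]."""
    return max(0.0, min(a_hi, b_hi) - max(a_lo, b_lo))


def conservative_regrid_1d(src_edges, src_values, dst_edges):
    """First-order conservative remap: weight = overlap length / destination cell length."""
    dst_values = []
    for j in range(len(dst_edges) - 1):
        d_lo, d_hi = dst_edges[j], dst_edges[j + 1]
        dst_len = d_hi - d_lo
        total = 0.0
        for i in range(len(src_edges) - 1):
            ov = overlap_length(src_edges[i], src_edges[i + 1], d_lo, d_hi)
            total += src_values[i] * ov / dst_len
        dst_values.append(total)
    return dst_values


def integral(edges, values):
    """Sum of value * cell length over all cells."""
    return sum(v * (edges[k + 1] - edges[k]) for k, v in enumerate(values))


# Hypothetical grids covering the same extent [0, 4].
src_edges = [0.0, 1.0, 2.0, 3.0, 4.0]
src_values = [1.0, 3.0, 2.0, 6.0]
dst_edges = [0.0, 1.5, 4.0]
dst_values = conservative_regrid_1d(src_edges, src_values, dst_edges)

print(dst_values)
print(integral(src_edges, src_values), integral(dst_edges, dst_values))
```

Both integrals agree because every piece of every source cell is assigned to exactly one destination cell in proportion to its overlap.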
Weighting Schemes
- Nearest Neighbor: This is the simplest method. It assigns the value of the nearest grid point in the source grid to the target grid point.
- w(i,j) = 1 if (i,j) is the nearest neighbor to (x,y)
- w(i,j) = 0 otherwise
- Bilinear Interpolation: This method uses linear interpolation in both the x and y directions to estimate the value at the target grid point.
- w(i,j) = (1 - |x - x_i| / dx) * (1 - |y - y_j| / dy), for each of the four source grid points surrounding (x,y)
- (x,y) is the target grid point
- (x_i, y_j) are the coordinates of the source grid point
- dx and dy are the grid spacings in the x and y directions
- Bicubic Interpolation: This method uses cubic interpolation in both the x and y directions, providing a more accurate interpolation than bilinear interpolation.
- w(i,j) = a_i * b_j
- a_i and b_j are coefficients determined by solving a system of linear equations.
- Conservative Regridding: This method ensures that the total mass or energy is conserved during the regridding process.
- w(i,j) = A_ij / A_target
- A_ij is the area of the source grid cell (i,j) that overlaps with the target grid cell
- A_target is the area of the target grid cell
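The bilinear case in the list above can be sketched for a regular rectangular grid as follows. This is a minimal illustration under that assumption, not ESMF code; the function names, corner labels, and sample values are hypothetical.

```python
def bilinear_weights(x, y, x0, y0, dx, dy):
    """Weights of the four source points around target (x, y).

    (x0, y0) is the lower-left source point of the enclosing cell;
    dx and dy are the grid spacings.
    """
    tx = (x - x0) / dx  # fractional position across the cell in x
    ty = (y - y0) / dy  # fractional position across the cell in y
    if not (0.0 <= tx <= 1.0 and 0.0 <= ty <= 1.0):
        raise ValueError("target point lies outside the source cell")
    return {
        "lower_left": (1.0 - tx) * (1.0 - ty),
        "lower_right": tx * (1.0 - ty),
        "upper_left": (1.0 - tx) * ty,
        "upper_right": tx * ty,
    }


def interpolate(weights, corner_values):
    """Weighted sum of the four corner values."""
    return sum(weights[name] * corner_values[name] for name in weights)


# Hypothetical unit cell sampling the plane f(x, y) = 10 + 10x + 20y.
weights = bilinear_weights(x=0.25, y=0.75, x0=0.0, y0=0.0, dx=1.0, dy=1.0)
corner_values = {
    "lower_left": 10.0,
    "lower_right": 20.0,
    "upper_left": 30.0,
    "upper_right": 40.0,
}
interpolated = interpolate(weights, corner_values)
print(weights, interpolated)
```

The four weights always sum to 1, and each source point receives more weight the closer the target point is to it.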
Previous Research
No prior full-scale statistical analysis of the quality of the ESMF regridding methods has been found.
- ASOS (https://mesonet.agron.iastate.edu/request/download.phtml)
- Mesonet (Index of /madisPublic1/data/archive/2023/08/29/LDAD/mesonet/ (noaa.gov))
- Ameriflux (AmeriFlux: Measuring carbon, water and energy flux across the Americas. (lbl.gov))
- Analysis of Record for Calibration (AORC) Dataset (AORC-Version1.1-SourcesMethodsandVerifications.pdf (noaa.gov))
Timeline of Analysis
- Start time: 2023-08-30 12:00:00
- End time: 2023-09-04-01 00:00:00
Statistical Method
- Performed the analysis during the landfall of Hurricane Idalia in North Florida and its movement into the southeastern US. This period spans a range of summertime meteorological conditions (dry western US, wet eastern US), which lets us evaluate the quality of the regridding methods against ground-truth observations across varied conditions.
- Linked an observation only to nearby model grid cells or elements whose centroid lies within 10 kilometers of the observation station location.
- Mesonet data quality filters: used the data quality flags to extract hourly observations only when the data passed at least 3 of the 4 data quality filtering methods (MADIS-mesonet data and quality memo - Google Docs).
- ASOS data quality filters: developed filters analogous to the Mesonet methods that essentially filter out noisy observations by applying a locally weighted scatterplot smoothing (LOWESS) filter over a 2-month period of ASOS data.
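The 10 km station-to-centroid linking and the "3 of 4" quality rule described above could be sketched as follows. The analysis code itself is not shown in this page, so everything here is an assumption: great-circle (haversine) distance on a spherical Earth, choosing the nearest centroid when several qualify, and representing each quality check as a boolean (actual MADIS flags are encoded differently). All names and coordinates are hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0   # mean Earth radius, spherical approximation
MAX_DISTANCE_KM = 10.0     # linking threshold stated in the text
MIN_QC_PASSES = 3          # "3 out of 4 or more" quality checks


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def nearest_centroid_within(station, centroids, max_km=MAX_DISTANCE_KM):
    """Return (centroid_id, distance_km) of the closest centroid within max_km, or None."""
    best = None
    for centroid_id, (lat, lon) in centroids.items():
        d = haversine_km(station[0], station[1], lat, lon)
        if d <= max_km and (best is None or d < best[1]):
            best = (centroid_id, d)
    return best


def passes_qc(check_results, min_passes=MIN_QC_PASSES):
    """True when at least min_passes of the boolean quality checks passed."""
    return sum(1 for passed in check_results if passed) >= min_passes


# Hypothetical station and catchment centroids as (latitude, longitude).
station = (30.0, -84.0)
centroids = {"cat-1": (30.05, -84.0), "cat-2": (30.5, -84.0)}
match = nearest_centroid_within(station, centroids)
print(match, passes_qc([True, True, True, False]))
```

Here only `cat-1` (about 5.6 km away) is linked to the station; `cat-2` lies well beyond the 10 km threshold and is ignored.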
Results and Implications
- Performance level: AORC > ESMF cases (Bilinear, Conserve, Patch) > ExactExtract cases (Feature, Raster). Across all meteorological variables, the statistical correlation between the 1 km AORC retrospective dataset and surface observation stations is nearly unchanged when the AORC data is regridded to the NextGen catchment polygons with the ESMF methods. In contrast, the statistical quality of the original AORC retrospective dataset is largely degraded when the meteorological forcings are regridded with the ExactExtract regridding method. This indicates a clear advantage to using the ESMF regridding methods, which are already the default regridding implementation in the NextGen Forcings Engine.
- The three ESMF cases performed similarly to one another and better than the ExactExtract cases; the two ExactExtract cases also performed similarly to each other.
- The differences between the ExactExtract and ESMF weighting schemes indicate that accounting for the area of, or the physical distance between, overlapping grid cell features of the AORC data captures the downscaling or upscaling of dynamic meteorological variables to catchment polygons better than calculating weights from the percentage overlap of grid cells with a given catchment feature.
- In general, rainfall rates are captured reasonably well by AORC and the ESMF regridding methods in regions with widespread rainfall events (such as the southeast region that Hurricane Idalia tracked through in late August and early September).
- Statistical correlations are strong across CONUS for the temperature, specific humidity, and surface pressure fields, while correlations for the wind vector fields are slightly more variable over the southeastern United States (Hurricane Idalia influence). Relative to the raw AORC wind vector fields, the ESMF regridding methods show a slight overall reduction in statistical correlations with surface observations, but the correlations remain substantially higher than those of the ExactExtract regridding methods.
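The comparisons above rest on a correlation coefficient computed per station between observed and regridded values. The page does not state which coefficient was used; the sketch below assumes the Pearson correlation coefficient, and the function name and sample temperatures are hypothetical.

```python
import math


def pearson_r(observed, modeled):
    """Pearson correlation coefficient between paired observed and modeled values."""
    n = len(observed)
    if n != len(modeled) or n < 2:
        raise ValueError("need two equally sized series with at least 2 points")
    mean_o = sum(observed) / n
    mean_m = sum(modeled) / n
    cov = sum((o - mean_o) * (m - mean_m) for o, m in zip(observed, modeled))
    var_o = sum((o - mean_o) ** 2 for o in observed)
    var_m = sum((m - mean_m) ** 2 for m in modeled)
    if var_o == 0 or var_m == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_o * var_m)


# Hypothetical hourly 2 m temperatures (K) at one station.
station_obs = [290.0, 292.0, 295.0, 293.0]
regridded_biased = [value + 1.5 for value in station_obs]   # constant warm bias
regridded_inverted = [-value for value in station_obs]      # perfectly anti-correlated

r_biased = pearson_r(station_obs, regridded_biased)
r_inverted = pearson_r(station_obs, regridded_inverted)
print(r_biased, r_inverted)
```

Note that a constant bias leaves the Pearson coefficient unchanged, so a correlation-based ranking of regridding methods measures how well variability is tracked rather than absolute error.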
Scatter Plots
Rainrate Analysis
Histogram, Correlation, and Spatial Plots
- In the following correlation coefficient maps and corresponding station count histograms, the columns from left to right are PSFC, Q2D, T2D, U2D and V2D, and the rows from top to bottom are AORC, ESMF_Bilinear, ESMF_Conserve, ESMF_Patch, ExactExtract_Feature and ExactExtract_Raster.
Timeline of Analysis
- Start time: 2019-08-30 12:00:00
- End time: 2019-09-04-01 00:00:00
Statistical Method
- Performed the analysis during late summer 2019 to cover a range of summertime meteorological conditions (dry western US, wet eastern US), which lets us evaluate the quality of the regridding methods against ground-truth observations across varied conditions.
- Linked an observation only to nearby model grid cells or elements whose centroid lies within 10 kilometers of the observation station location.
- Applied Ameriflux filters to include only direct observations in this analysis and to exclude values produced by quality-control gap-filling methods.
- Ameriflux is the only observation station dataset with high-frequency eddy covariance flux measurements, which allows the regridded radiative fluxes (downward shortwave radiation, downward longwave radiation) to be compared with the retrospective AORC dataset and the regridding methods.
Results and Implications
- Performance level: AORC > ESMF cases (Bilinear, Conserve, Conserve2nd, Patch) > ExactExtract cases (Feature, Raster). Overall, for the meteorological variables also covered in the ASOS/Mesonet analysis, the statistical correlations for AORC, the ESMF regridding methods, and the ExactExtract regridding methods are nearly identical to those found in that analysis.
- All four ESMF cases performed similarly, with CONSERVE and CONSERVE_2ND slightly better for temperature and wind. The two CONSERVE methods showed essentially identical results for surface radiative fluxes, where in theory a user could see some statistical benefit from conserving flux-based variables with the ESMF “CONSERVE” weighting schemes. The CONSERVE regridding methods may offer some benefit in upscaling meteorological forcings for NextGen hydrofabric catchments, although the sample size for this analysis was relatively small. Future research should separate small and large catchment basins in the NextGen hydrofabric to show potential downscaling/upscaling benefits of the ESMF weighting schemes.
- ExactExtract regridding methods revealed low statistical correlations overall across all meteorological variables (including radiative fluxes), but especially for wind components and surface pressure.
- An interesting finding of this analysis is a clear negative bias between the Ameriflux specific humidity values and the AORC 1 km specific humidity values in the retrospective dataset. This points to a limitation of this statistical analysis with Ameriflux observations, because eddy covariance flux towers are located at the canopy height of various vegetation types. Because evapotranspiration contributions are relatively higher at canopy height, we caution that scaling retrospective specific humidity data down to heterogeneous land-surface types within a given NextGen hydrofabric catchment will tend to underestimate specific humidity, since the retrospective values are averaged over a 1 km grid cell.
Scatter Plots
Histogram, Correlation, and Spatial Plots
- In the following correlation coefficient maps and corresponding station count histograms, the columns from left to right are LWDOWN, PSFC, Q2D, SWDOWN, T2D, U2D and V2D, and the rows from top to bottom are AORC, ESMF_Bilinear, ESMF_Conserve, ESMF_Conserve2nd, ESMF_Patch, ExactExtract_Feature and ExactExtract_Raster.