Brainstorming with @betolink and @andypbarrett:
Many of the issues our earthaccess users face are caused not by the library itself but by reading their results into xarray. Most problems seem to stem from grouped (hierarchical) HDF data, which requires knowing the group structure and naming conventions a priori. An example from ICESat-2 ATL06:
ds = xr.open_mfdataset(earthaccess.open(results), group='/gt1l/land_ice_segments')
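To make the "a priori" knowledge concrete: ATL06 stores its land ice segments under one group per ground track, so reading a whole granule means knowing and looping over six group paths. A minimal sketch of building those paths follows; `atl06_group_paths` is a hypothetical helper for illustration, not part of earthaccess or xarray.

```python
# ICESat-2 ground-track (beam) group names as they appear in ATL06 files.
ATL06_BEAMS = ("gt1l", "gt1r", "gt2l", "gt2r", "gt3l", "gt3r")


def atl06_group_paths(subgroup="land_ice_segments", beams=ATL06_BEAMS):
    """Return the HDF5 group path for `subgroup` under each beam.

    Hypothetical helper, for illustration only.
    """
    return [f"/{beam}/{subgroup}" for beam in beams]


if __name__ == "__main__":
    for path in atl06_group_paths():
        # Each path is a value a user would have to pass as `group=`.
        print(path)
```

Each of these strings is something the user currently has to know before the file can be opened usefully.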
We know that datatree is a solution and is being actively incorporated into xarray. However, earthaccess should reduce complexity and could also support data dictionaries, so that users can open their files without knowing the nuances of the data structure or the limitations of existing libraries. cf_xarray is another library that could be used under the hood to better interpret or apply CF conventions to data upon loading. A vision could look something like:
results = earthaccess.open(cmr_results)
Then loading would be seamless, abstracting away xarray and any other underlying libraries. Some ideas:
ds = earthaccess.load_dataset(results, mapping=False)  # default
If the default approach raises an xarray error, provide a warning message explaining how to mitigate it, with an option to print the JSON or YAML dictionary.
ds = earthaccess.load_dataset(results, mapping=True)  # uses a JSON or YAML dictionary under the hood (possibly in conjunction with cf_xarray) to pass the correct parameters to xarray
ds = earthaccess.load_datatree(results, mapping=True)  # uses a JSON or YAML dictionary to load grouped datasets using datatree
ds = earthaccess.load_dataframe(results, mapping=True, var=<user-inputted-variable>)  # single-variable dataframe
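One way the `mapping` option and the warning fallback could fit together is sketched below. Everything here is an assumption for illustration: the dictionary layout, the contents of `DATA_DICTIONARY`, and the `load_with_mapping` function are hypothetical. The opener is passed in as a plain callable (for example `xarray.open_dataset`) so the sketch does not depend on any particular library.

```python
import json
import warnings

# Hypothetical data dictionary: collection short name -> keyword arguments
# for the opener. In practice this could live in a JSON or YAML file.
DATA_DICTIONARY = json.loads("""
{
  "ATL06": {"group": "/gt1l/land_ice_segments"}
}
""")


def load_with_mapping(source, short_name, opener, mapping=False):
    """Open `source` with `opener`, optionally using the data dictionary.

    mapping=True: pass the dictionary's keyword arguments to the opener.
    mapping=False (default): try a plain open; if it fails, warn with the
    suggested keyword arguments (when known) and re-raise the original error.
    """
    suggested = DATA_DICTIONARY.get(short_name, {})
    if mapping:
        return opener(source, **suggested)
    try:
        return opener(source)
    except Exception:
        if suggested:
            warnings.warn(
                f"Default open failed for {short_name}; try mapping=True, "
                f"which would pass: {json.dumps(suggested)}"
            )
        raise
```

With xarray this might be called as `load_with_mapping(file_obj, "ATL06", xr.open_dataset, mapping=True)`. Whether a single default group per collection is the right design, or whether the dictionary should list every group, is left open here.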