To be fair, this is the wild west and nobody seems to agree on what to do for the two cases of:
- xmin/xmax/ymin/ymax for EMPTY
- Geometry that contains one or more NaN values
Sedona seems to return a near-infinite value for the lower bound of an EMPTY (or maybe infinity just prints this way from Spark?) and seems to propagate a NaN encountered in the coordinates, like Math.min/max():
from sedona.spark import *
config = SedonaContext.builder().getOrCreate()
sedona = SedonaContext.create(config)
sedona.sql("SELECT ST_XMin(ST_GeomFromText('POINT EMPTY'))").show()
#> +---------------------------------------+
#> |st_xmin(st_geomfromwkt(POINT EMPTY, 0))|
#> +---------------------------------------+
#> | 1.797693134862315...|
#> +---------------------------------------+
sedona.sql("SELECT ST_XMin(ST_GeomFromText('LINESTRING (0 1, nan nan, 2 3)'))").show()
#> +-----------------------------------------------------------+
#> |st_xmin(st_geomfromtext(LINESTRING (0 1, nan nan, 2 3), 0))|
#> +-----------------------------------------------------------+
#> | NaN|
#> +-----------------------------------------------------------+
sedona.sql("SELECT ST_XMin(ST_GeomFromText('LINESTRING EMPTY'))").show()
#> +---------------------------------------------+
#> |st_xmin(st_geomfromtext(LINESTRING EMPTY, 0))|
#> +---------------------------------------------+
#> | 1.797693134862315...|
#> +---------------------------------------------+
GeoPandas seems to give NaN for empty bounds and ignores NaN if encountered in a coordinate sequence (like std::min/max()):
import geopandas
geopandas.GeoSeries.from_wkt(["POINT EMPTY", "POINT (0 1)", "LINESTRING (0 1, nan nan, 2 3)"]).total_bounds
#> array([0., 1., 2., 3.])
geopandas.GeoSeries.from_wkt(["POINT EMPTY", "POINT (0 1)", "LINESTRING (0 1, nan nan, 2 3)"]).bounds
#> minx miny maxx maxy
#> 0 NaN NaN NaN NaN
#> 1 0.0 1.0 0.0 1.0
#> 2 0.0 1.0 2.0 3.0
geopandas.GeoSeries.from_wkt(["POINT EMPTY"]).total_bounds
#> array([nan, nan, nan, nan])
PostGIS gives NULL for the bounds of an EMPTY:
# docker run --rm -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=password -p "5432:5432" postgis/postgis
# psql -h 127.0.0.1 --user postgres
postgres=# SELECT ST_XMin(ST_GeomFromText('POINT EMPTY')) IS NULL;
?column?
----------
t
(1 row)
...and when coordinates contain NaN, it appears that the bounds are reset when the first NaN occurs:
postgres=# SELECT ST_XMin(ST_GeomFromtext('LINESTRING (1 2, nan nan, 3 4)'));
st_xmin
---------
3
(1 row)
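For comparison, here is a minimal pure-Python sketch of three ways a running minimum could be folded over x coordinates so that it reproduces the outputs above. None of this is taken from the Sedona, GeoPandas, or PostGIS sources: the function names are hypothetical, the `sys.float_info.max` sentinel is an assumption about what the truncated Spark output shows, and the comparison-based fold is only one way the PostGIS result of 3 could arise.

```python
import math
import sys


def xmin_propagate(xs, sentinel=sys.float_info.max):
    """NaN-propagating minimum; empty input returns the sentinel unchanged.

    The sentinel value is an assumption, chosen to resemble the Sedona output.
    """
    acc = sentinel
    for x in xs:
        if math.isnan(x) or math.isnan(acc):
            acc = math.nan  # once a NaN is seen, the result stays NaN
        elif x < acc:
            acc = x
    return acc


def xmin_skip_nan(xs):
    """NaN-ignoring minimum; NaN only when no usable coordinate exists."""
    valid = [x for x in xs if not math.isnan(x)]
    return min(valid) if valid else math.nan


def xmin_compare_fold(xs):
    """Hypothetical comparison-only fold; None stands in for SQL NULL.

    Every comparison against NaN is False, so a NaN replaces the running
    value and the next coordinate then replaces the NaN.
    """
    acc = None
    for x in xs:
        if acc is None or not (acc < x):
            acc = x
    return acc


print(xmin_propagate([]))                       # sentinel, as for EMPTY
print(xmin_propagate([0.0, math.nan, 2.0]))     # nan
print(xmin_skip_nan([0.0, math.nan, 2.0]))      # 0.0
print(xmin_skip_nan([]))                        # nan
print(xmin_compare_fold([1.0, math.nan, 3.0]))  # 3.0
print(xmin_compare_fold([]))                    # None
```

The point of the sketch is that the three observed results differ only in two choices: what the accumulator starts as for an empty geometry, and whether NaN is checked explicitly or left to fall through an ordinary comparison.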