Skip to content

Conversation

@yutannihilation
Copy link
Contributor

@yutannihilation yutannihilation commented Oct 26, 2025

As many people are adding geos-related functions, I pick this to avoid collision.

At first, I thought this can be implemented via geo_traits::CoordTrait. But, I read the implementation of ST_Point and felt it's preferable to handle the raw bytes of WKB for performance. While I'm not good at performance, I'm trying to implement it.

The remaining tasks are:

  • Add Python tests
  • Add benchmark

@paleolimbot
Copy link
Member

As many people are adding geos-related functions, I pick this to avoid collision.

Thank you!

At first, I thought this can be implemented via geo_traits::CoordTrait. But, I read the implementation of ST_Point and felt it's preferable to handle the raw bytes of WKB for performance.

I think using GeoTraits or the Wkb object directly would be roughly equivalent in terms of performance and would be a whole lot easier to write. While we do care about performance, we also care about bug surface area and maintainability. There are some functions were we work with WKB bytes directly but all of them had simpler implementations first that we benchmarked to ensure that the complexity had measurable benefit.

@yutannihilation
Copy link
Contributor Author

Thanks for the hint! Now I figured out how to do it with geo-traits and the functions in wkb_factory.rs.

@yutannihilation yutannihilation marked this pull request as ready for review October 27, 2025 16:01
@yutannihilation
Copy link
Contributor Author

It seems I still have troubles on running Python tests on my Windows, so I don't run these Python tests yet. I'll try on macOS tomorrow. Anyway, I believe this pull request is ready for review now.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few things to consider...thank you!

I'll try to work on the Windows development situation soon 😬

Comment on lines 147 to 168
match dim.size() {
2 => {
let coords_tuple = coords.x_y();
write_wkb_coord(buf, coords_tuple)
}
3 => {
let coords_tuple = (coords.x(), coords.y(), coords.nth_or_panic(2));
write_wkb_coord(buf, coords_tuple)
}
4 => {
let coords_tuple = (
coords.x(),
coords.y(),
coords.nth_or_panic(2),
coords.nth_or_panic(3),
);
write_wkb_coord(buf, coords_tuple)
}
_ => Err(SedonaGeometryError::Invalid(
"Unsupported number of dimensions".to_string(),
)),
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This would fit nicely as wkb_factory::write_wkb_coord_trait() (I'm sure we'll need it again!)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, moved the implementation to wkb_factory now.

@yutannihilation
Copy link
Contributor Author

Thanks for reviewing!

Sorry, this was unexpected, but I just found the document has this small note

Enhanced: 3.2.0 returns a point for all geometries. Prior behavior returns NULLs if input was not a LineString.

and this is the actual behavior (while the example on the document returns t...)

postgres=# SELECT ST_StartPoint('POINT(0 1)'::geometry) IS NULL AS is_null;
 is_null 
---------
 f
(1 row)

Probably I need some time to update the implementation.

@yutannihilation
Copy link
Contributor Author

yutannihilation commented Oct 28, 2025

I think this should be ready for review again now!

Benchmark result:

--------------------------------------------------------------------------------------------------------- benchmark: 6 tests --------------------------------------------------------------------------------------------------------
Name (time in ms)                                                     Min                 Max                Mean            StdDev              Median               IQR            Outliers       OPS            Rounds  Iterations
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
test_st_start_point[segments_large-SedonaDBSingleThread]           1.9418 (1.0)        2.3150 (1.0)        2.0126 (1.0)      0.0812 (1.0)        1.9881 (1.0)      0.0475 (1.0)           2;2  496.8624 (1.0)          26           1
test_st_start_point[collections_simple-DuckDBSingleThread]         6.3318 (3.26)       7.5675 (3.27)       6.7992 (3.38)     0.4783 (5.89)       6.6988 (3.37)     0.6017 (12.66)         1;0  147.0758 (0.30)          5           1
test_st_start_point[segments_large-DuckDBSingleThread]             7.6638 (3.95)       8.6637 (3.74)       7.9898 (3.97)     0.3350 (4.12)       7.8963 (3.97)     0.2996 (6.30)          1;1  125.1591 (0.25)          7           1
test_st_start_point[collections_simple-SedonaDBSingleThread]      10.4538 (5.38)      11.3186 (4.89)      10.6016 (5.27)     0.2383 (2.93)      10.5149 (5.29)     0.0893 (1.88)          2;3   94.3255 (0.19)         20           1
test_st_start_point[segments_large-PostGISSingleThread]          240.1957 (123.70)   243.3285 (105.11)   241.3920 (119.94)   1.1893 (14.64)    241.1900 (121.32)   1.3549 (28.50)         2;0    4.1426 (0.01)          5           1
test_st_start_point[collections_simple-PostGISSingleThread]      352.2113 (181.39)   357.8562 (154.58)   355.9990 (176.88)   2.3396 (28.80)    357.1065 (179.63)   3.0995 (65.20)         1;0    2.8090 (0.01)          5           1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Legend:
  Outliers: 1 Standard Deviation from Mean; 1.5 IQR (InterQuartile Range) from 1st Quartile and 3rd Quartile.
  OPS: Operations Per Second, computed as 1 / Mean

fn st_start_point_doc() -> Documentation {
Documentation::builder(
DOC_SECTION_OTHER,
"Returns the start point of a LINESTRING geometry. Returns NULL if the geometry is not a LINESTRING.",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to take care of the documentation and test suggestions here but I this is a great implementation and we can also handle that in a follow-on ticket if needed. Thank you!

Comment on lines 1050 to 1051
[
("LINESTRING (1 2, 3 4, 5 6)", "POINT (5 6)"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NULL and empties would be good to test here, too.

Comment on lines +219 to +221
let input = create_array(
&[
Some("LINESTRING (1 2, 3 4, 5 6)"),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For full test coverage of your implementation I think we would need to test empty geometries here as well.

@yutannihilation
Copy link
Contributor Author

yutannihilation commented Oct 28, 2025

Thanks for the great review! Adding tests about empty geometry actually helped me to find a bug in my implementation of ST_EndPoint :) I think I addressed all the comments now.

Copy link
Member

@paleolimbot paleolimbot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you!

@paleolimbot paleolimbot merged commit 0551583 into apache:main Oct 29, 2025
12 checks passed
@yutannihilation yutannihilation deleted the feat/st_start_point branch October 29, 2025 03:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants