[BUG]: Sliced GeoSeries have invalid ._meta.input_types #1142

Closed
thomcom opened this issue May 19, 2023 · 1 comment · Fixed by #1166
Labels
bug Something isn't working Python Related to Python code

thomcom commented May 19, 2023

Version

23.04

On which installation method(s) does this occur?

Rapids-Compose

Describe the issue

While working on the binary predicates notebook I am building large GeoSeries objects by repeat-index-slicing a small starting GeoSeries:

import cuspatial
from shapely.geometry import Point

base = cuspatial.GeoSeries([Point(0, 0), Point(1, 0)])
lhs = base[[0, 1, 0, 1, 0, 1, 0, 1, 0, 1]]
rhs = base[[0, 0, 0, 0, 0, 1, 1, 1, 1, 1]]
lhs.geom_equals(rhs)  # raises ValueError on 23.04

Some binary predicates fail on these test objects. For example, `lhs.geom_equals(rhs)` raises a `ValueError`: the `input_types` of the two GeoSeries are compared, but slicing does not reset their indices, so the two index sets no longer match. I'm writing the fix for this so I can make progress on the binary predicates example notebook.
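
As a rough illustration of the failure mode, here is a toy model in plain Python. It is not cuSpatial's code: `LabeledColumn` and its methods are hypothetical names, and the only assumption is that `input_types` behaves like a labeled column whose comparison requires identical index labels.

```python
from dataclasses import dataclass


@dataclass
class LabeledColumn:
    """Hypothetical stand-in for a per-row type column that carries index labels."""
    labels: list
    values: list

    def take(self, positions):
        # Slicing keeps the labels of the selected rows, like a labeled Series.
        return LabeledColumn([self.labels[p] for p in positions],
                             [self.values[p] for p in positions])

    def reset_index(self):
        # Replace whatever labels survived slicing with 0..n-1.
        return LabeledColumn(list(range(len(self.values))), list(self.values))

    def equals_elementwise(self, other):
        # Label-aligned comparison: refuse to compare differently labeled columns.
        if self.labels != other.labels:
            raise ValueError("can only compare identically labeled columns")
        return [a == b for a, b in zip(self.values, other.values)]


base_types = LabeledColumn(labels=[0, 1], values=["point", "point"])
lhs_types = base_types.take([0, 1, 0, 1, 0, 1, 0, 1, 0, 1])
rhs_types = base_types.take([0, 0, 0, 0, 0, 1, 1, 1, 1, 1])

try:
    lhs_types.equals_elementwise(rhs_types)
    failed = False
except ValueError:
    failed = True  # labels differ: [0, 1, 0, 1, ...] vs [0, 0, 0, 0, 0, 1, ...]

# After resetting the labels, the same comparison goes through.
result = lhs_types.reset_index().equals_elementwise(rhs_types.reset_index())
```

In this model the values are identical in every row; only the labels left over from slicing make the comparison fail.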

Minimum reproducible example

No response

Relevant log output

No response

Environment details

No response

Other/Misc.

No response

@thomcom thomcom added bug Something isn't working Needs Triage Need team to review and classify labels May 19, 2023
@thomcom thomcom self-assigned this May 19, 2023

thomcom commented May 22, 2023

PR for this coming up shortly.

@harrism harrism added Python Related to Python code and removed Needs Triage Need team to review and classify labels May 22, 2023
@harrism harrism moved this from Todo to In Progress in cuSpatial May 22, 2023
@rapids-bot rapids-bot bot closed this as completed in #1166 Jun 8, 2023
rapids-bot bot pushed a commit that referenced this issue Jun 8, 2023
Closes #1142 

This PR adds a few bugfixes and optimizations that improve performance when large `GeoSeries` are used with binary predicates. It also corrects a few errors in the predicate logic that were revealed when the size of the feature space increased by combining all possible features in the `dispatch_list`.

Changes:
`contains.py`
- Add `pairwise_point_in_polygon` and steps to resemble `quadtree` results.

`contains_geometry_processor.py`
- Drop `is True` and add a TODO for future optimization.

`feature_contains.py`
- Refactor `_compute_polygon_linestring_contains` to handle `GeoSeries` containing `LineStrings` of varying lengths.

`feature_contains_properly.py`
- Add `pairwise_point_in_polygon` as default mode with documentation.
- Add `PointMultiPointContains` which is needed by internal methods.

`feature_crosses.py`
- Drop extraneous `intersection`.

`feature_disjoint.py`
- Add `PointPointDisjoint` and drop extraneous `intersections`.

`feature_equals.py`
- Fix `LineStringLineStringEquals`, which did not properly handle `LineStrings` of varying lengths.

`feature_intersects.py`
- Drop extraneous `intersection`.

`feature_touches.py`
- Fix `LineStringLineStringTouches`. It is slow and needs further optimization.
- Fix `PolygonPolygonTouches`. It is also slow and needs further optimization.

`geoseries.py`
- Drop index from `input_types`.
- Fix `point_indices` for `Point` type.
- Optimize `reset_index` which was doing a host->device copy.

`binpred_test_dispatch.py`
- Add test case.

`test_binpred_large_examples.py`
- Test large sets of all the dispatched tests together.
- Use the features from `test_dispatch` to create large `GeoSeries` and compare results with `GeoPandas`.

`test_equals_only_binpreds.py`
- Test corrections to `input_types` indexes.

`test_feature_groups.py`
- Test each of the `dispatch_list` feature sets combined into a single GeoSeries.

`binpred_utils.py`
- Don't count hits when point and polygon indexes don't match (a bug in `_basic_contains_count`).
- Optimize mask generation in `_points_and_lines_to_multipoints`.

`column_utils.py`
- Optimize `contains_only` calls.
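
The `_basic_contains_count` fix noted under `binpred_utils.py` can be sketched roughly as follows. This is a minimal illustration in plain Python, not the code in the PR: the function name `pairwise_contains_count`, the `hits` list of `(point_row, polygon_row)` pairs, and the idea that the hits come from an all-pairs point-in-polygon test are all assumptions made for the example.

```python
from collections import Counter


def pairwise_contains_count(hits, n_polygons):
    """Count hits per polygon, keeping only pairs whose row indices match.

    `hits` is a hypothetical list of (point_row, polygon_row) pairs, where
    point_row is the GeoSeries row the point came from.
    """
    # A pairwise predicate compares row i with row i only, so a hit between
    # a point from row 0 and the polygon in row 1 must not be counted.
    counts = Counter(poly for point, poly in hits if point == poly)
    return [counts.get(i, 0) for i in range(n_polygons)]


hits = [(0, 0), (0, 1), (1, 1), (2, 0)]
counts = pairwise_contains_count(hits, n_polygons=3)
```

Here the pairs `(0, 1)` and `(2, 0)` are dropped because their row indices differ, which leaves one hit each for polygons 0 and 1 and none for polygon 2.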

Authors:
  - H. Thomson Comer (https://github.com/thomcom)

Approvers:
  - Mark Harris (https://github.com/harrism)
  - Michael Wang (https://github.com/isVoid)

URL: #1166
@github-project-automation github-project-automation bot moved this from In Progress to Done in cuSpatial Jun 8, 2023
Projects
Status: Done
2 participants