`osm_to_pq.py` clips OSM `.pbf` files of `ways` to a bounding box and converts the `ways` to `GeoDataFrames` of `edges`, split at junctions. It also tries to label which edges have `termini` (starts or ends) resulting from being clipped.

An OSM `way` may be split into several `segments` where `nodes` from other `ways` intersect the `way` in question. Previously, it was assumed that any `segment` ending in a `node` that wasn't 'shared' (i.e. didn't have degree >= 2) must have been clipped. In fact, that condition also matches dead ends, so many edges were erroneously labelled as having been clipped.

This PR reworks the script, using an index of all nodes (not just shared ones) to check if a segment end is new (and therefore resulting from a clip operation). It also hopefully improves the readability (although it's still a bit hard to follow).

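
To illustrate the idea, here is a minimal, self-contained sketch in plain Python. It is not the code in `osm_to_pq.py`: the function names (`split_at_junctions`, `label_clipped_termini`), the use of coordinate tuples as node keys, and the assumption that clipping creates a new end point on the bounding box that is absent from the original node index are all hypothetical choices made for the example.

```python
from collections import Counter


def split_at_junctions(ways):
    """Split each way (a list of node keys) into segments at shared nodes.

    A node counts as 'shared' when it appears in at least two ways.
    """
    degree = Counter(node for way in ways for node in set(way))
    segments = []
    for way in ways:
        start = 0
        for i in range(1, len(way)):
            is_last = i == len(way) - 1
            if is_last or degree[way[i]] >= 2:
                segments.append(way[start:i + 1])
                start = i
    return segments


def label_clipped_termini(segments, original_nodes):
    """Flag segment ends that are missing from the index of ALL original nodes.

    Assumption: a clip creates a new end point that is not an original node,
    whereas a dead end is an original node and so is not flagged.
    """
    return [
        {
            "nodes": seg,
            "start_clipped": seg[0] not in original_nodes,
            "end_clipped": seg[-1] not in original_nodes,
        }
        for seg in segments
    ]


# Hypothetical data: nodes are (x, y) tuples; the bounding box ends at x = 2.5.
A, B, C, D, E = (0, 0), (1, 0), (2, 0), (1, 1), (3, 0)
original_nodes = {A, B, C, D, E}   # index of every node before clipping
X = (2.5, 0)                       # new point created where the way C-E was cut

clipped_ways = [
    [A, B, C, X],  # originally A-B-C-E, cut at the boundary
    [B, D],        # side street ending in a genuine dead end at D
]

segments = split_at_junctions(clipped_ways)
labels = label_clipped_termini(segments, original_nodes)
```

In this toy network only the segment ending at `X` is flagged as clipped. The dead end at `D` has degree 1 but is present in the full node index, so it is left unflagged, which is the case the earlier shared-nodes-only check got wrong.
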
Fixing this is a step towards network component labeling at scale, i.e. on sections of networks, before joining the sections on the clipped boundaries later.