Onboarding new city: Baltimore #121
@alicefeng I can't remember the specific issues you encountered when trying to run Baltimore through the pipeline. Is it working now? It doesn't look as though OSM has a polygon for the city, so it will probably need to be handled through a separate approach, like Brisbane.
@terryf82 Oh, this was the issue where the individual crash ids were alphanumeric rather than strictly numeric, which clashed with our data standards (at the time; not sure if we've modified the standards since then).
@alicefeng I've updated the crashes & concerns standards in the data_standards branch to allow for both string and numeric ids. Give it a run on Baltimore when you get a chance and let me know how it goes!
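For readers following along, here is a minimal sketch of what accepting both string and numeric ids could look like. The function name `normalize_crash_id` and the choice to store every id as a string are assumptions made for illustration; they are not taken from the data_standards branch.

```python
from typing import Union


def normalize_crash_id(raw_id: Union[str, int]) -> str:
    """Hypothetical helper: accept a string or integer id and return a string.

    Storing every id as a string lets alphanumeric ids (as in the Baltimore
    data) and strictly numeric ids share one representation.
    """
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(raw_id, bool) or not isinstance(raw_id, (str, int)):
        raise TypeError(f"unsupported id type: {type(raw_id).__name__}")
    text = str(raw_id).strip()
    if not text:
        raise ValueError("crash id must not be empty")
    return text
```

With a helper like this, `normalize_crash_id(1001)` and `normalize_crash_id("AB123")` both produce values that can be compared and joined on in the same way.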
Hey @alicefeng, the latest commits to the data_standards branch should allow you to get past the graph_from_place() problem that was preventing us from onboarding Baltimore. Basically, there's a new function there that checks the OSM API (nominatim) for a polygon. If it finds one, it returns the polygon's position in the results, which is fed into graph_from_place() as which_result=x (sometimes the polygon isn't the first result). If there's no polygon for a city, we use graph_from_point() against the city lat+lng instead.
I've been testing the Baltimore pipeline using crash data from https://data.maryland.gov/Public-Safety/MDTA-Accidents/rqid-652u (not sure if this is the same source you were using?). Even though the map is now built properly, the pipeline still breaks in train_model. I tried a few different config file setups (start_year, end_year, etc.) but had no luck; mostly I hit this error (@bpben any thoughts?)
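The polygon check and fallback described in the comment above can be sketched roughly as follows. This is not the code from the data_standards branch: the function names are hypothetical, the results are assumed to be a parsed Nominatim search response requested with polygon geometry (so each entry may carry a `geojson` field), and `which_result` is assumed to be 1-based. The sketch only decides which OSMnx call to make and with which arguments; it does not call OSMnx or the network.

```python
from typing import Any, Dict, Mapping, Optional, Sequence, Tuple

# Geometry types treated as a usable city boundary (assumption).
POLYGON_TYPES = {"Polygon", "MultiPolygon"}


def find_polygon_position(results: Sequence[Mapping[str, Any]]) -> Optional[int]:
    """Return the 1-based position of the first polygon result, or None."""
    for position, result in enumerate(results, start=1):
        geometry = result.get("geojson") or {}
        if geometry.get("type") in POLYGON_TYPES:
            return position
    return None


def choose_graph_call(
    results: Sequence[Mapping[str, Any]],
    city: str,
    latitude: float,
    longitude: float,
) -> Tuple[str, Dict[str, Any]]:
    """Pick graph_from_place when a polygon exists, else graph_from_point."""
    position = find_polygon_position(results)
    if position is not None:
        # The polygon is not always the first result, so pass its position.
        return "graph_from_place", {"query": city, "which_result": position}
    # No polygon for this city: build the graph around its coordinates.
    return "graph_from_point", {"center_point": (latitude, longitude)}
```

The returned name and keyword arguments could then be passed on to the matching OSMnx function; any further arguments (such as a search distance for the point-based call) are left out here because the thread does not specify them.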
@terryf82 Awesome about the function for checking if there's a polygon. And yes, that looks to be the dataset I was using.
I fixed an error on my end and tried rerunning the pipeline for Baltimore. It's still failing at the model training script, though this time I got a different error from before:
Either there are no crashes or very few. I just took a look at the canonical dataset you sent me and there are zero crashes in it. Maybe send me the original crash dataset, or just the full baltimore folder?
Yeah, that was due to an error on my part. I fixed it, reran the pipeline, and now have a canonical dataset with non-zero crashes in it. Using that dataset led me to the second error posted here. I'll send you that file. But you said even having all zeroes shouldn't lead to the first error, right? (@terryf82's dataset had non-zero weeks and he also got the first error.)