Data type error #122

Open
yuqiww opened this issue Mar 9, 2020 · 3 comments


yuqiww commented Mar 9, 2020

I ran the model under Python 2.7 using the new 'parcels_geography' data and got a data type error at the 'proportional jobs model for gov/edu' step. The fields in the new parcels_geography table have the same data types as in the previous table.

[Attached screenshot: "Capture"]

@smmaurer

Hi @yuqiww, thanks for posting this! A couple questions:

  1. Have you gotten the same error with other 'parcels_geography' data, or is it just the new file?

  2. I'm currently using '07_11_2019_parcels_geography.csv' -- if there's a newer one, could you send it to me so I can troubleshoot? ([email protected])

This looks easy to "fix" in the sense of getting the code to work, because we can add a parameter to update_col_from_series() to automatically cast the values appropriately (see here). But it would be nice to diagnose what exactly is going on.
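The casting idea can be sketched in plain pandas. This is only an illustration under stated assumptions, not the library's actual implementation: the helper name `update_col_with_cast`, its `cast` flag, and the sample data are all hypothetical.

```python
import pandas as pd


def update_col_with_cast(df, column_name, series, cast=False):
    """Update df[column_name] at the index labels found in `series`.

    Hypothetical helper for illustration only. If the dtypes differ,
    either cast the incoming values to the column's dtype (cast=True)
    or raise an error (cast=False).
    """
    col_dtype = df[column_name].dtype
    if series.dtype != col_dtype:
        if not cast:
            raise ValueError(
                "Data type mismatch: column is {}, update is {}".format(
                    col_dtype, series.dtype))
        series = series.astype(col_dtype)
    df.loc[series.index, column_name] = series
    return df


# Made-up data: an int64 column updated with int32 values.
jobs = pd.DataFrame({"building_id": pd.Series([10, 20, 30], dtype="int64")})
update = pd.Series([40, 50], index=[0, 2], dtype="int32")
update_col_with_cast(jobs, "building_id", update, cast=True)
```

With `cast=False` the same call raises, which is useful while diagnosing where the mismatch comes from; `cast=True` only hides it.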

Here's a similar section of the log file from one of my recent runs. I don't see anything different in the diagnostics except that the cities are being indexed by name rather than id:

Running step 'proportional_elcm'
Running proportional jobs model for retail
Need more jobs total: 51880
Available jobs: 65815
Need more jobs
 Alameda                 375
Alameda County          264
Albany                   84
American Canyon          81
Antioch                 477
Atherton                  2
Belmont                 202
Belvedere                19
...
Length: 88, dtype: int64
Excess demand
 Atherton            2.000
Belvedere          10.000
Healdsburg         13.000
Napa County       640.000
...

I've been running the model on Mac and Linux, so one possibility is that this is a Windows-specific bug that we just haven't run into recently. Another is that some minor difference in the new data file is causing the data types to no longer match by this point in the run.
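One way the Windows hypothesis could arise: NumPy releases before 2.0 use a 32-bit default integer on Windows and a 64-bit one on Mac and Linux, so identical code can produce `int32` values on one platform and `int64` on another. The sketch below is a hedged diagnostic, not part of the model code; the function names and sample series are hypothetical.

```python
import platform

import numpy as np
import pandas as pd


def describe_int_defaults():
    """Report the operating system and NumPy's default integer dtype on it."""
    return {
        "system": platform.system(),
        "default_int": str(np.array([1, 2, 3]).dtype),
    }


def same_dtype(a, b):
    """True if two pandas Series share exactly the same dtype."""
    return a.dtype == b.dtype


info = describe_int_defaults()

# A column stored explicitly as int64, and values computed with the
# platform's default integer type (made-up data).
stored = pd.Series([101, 102, 103], dtype="int64")
computed = pd.Series(np.arange(3))

print(info)
print("dtypes match:", same_dtype(stored, computed))
```

Running this on both machines would show whether the default integer type differs between them, which would point toward the platform rather than the data file.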


yuqiww commented Mar 20, 2020

Hi @smmaurer, thank you so much for looking into this.
I got this error with updated 'parcels_geography' data ('03_06_2020_parcels_geography.csv'). I'll send it to you shortly.
Yes, I ran it on Windows. If your test shows the new file runs OK on Mac, I'll try installing a Linux environment on my computer.
As for the city name/id, it seems that the 'Running proportional jobs model for retail' step uses city names:

location_options = building_subset.juris.repeat(
while the 'Running proportional jobs model for gov/edu' step uses the zone ID:
location_options = building_subset.zone_id.repeat(
My run log had the same output as yours in the 'Running proportional jobs model for retail' part.
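To illustrate the difference between the two steps, here is a small sketch of `Series.repeat` on a name column versus an ID column. The argument to `repeat(` is truncated in the snippets above, so the counts used here, the `vacant_job_spaces` column name, and the sample rows are assumptions made purely for illustration.

```python
import pandas as pd

# Made-up stand-in for building_subset.
building_subset = pd.DataFrame({
    "juris": ["Alameda", "Albany", "Antioch"],
    "zone_id": [101, 102, 103],
    "vacant_job_spaces": [2, 0, 1],  # hypothetical column name
})

# Assumed repeat counts: one entry per available job space.
counts = building_subset["vacant_job_spaces"].values

# Retail-style options: values are city names (strings).
by_name = building_subset.juris.repeat(counts)

# Gov/edu-style options: values are integer zone IDs.
by_id = building_subset.zone_id.repeat(counts)

print(by_name.tolist(), by_name.dtype)
print(by_id.tolist(), by_id.dtype)
```

Because the repeated values inherit the dtype of the source column, an integer-width mismatch can only show up in the ID-based variant, which would be consistent with the retail step running cleanly while gov/edu fails.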


theocharides commented Mar 25, 2020

This is similar to an error that gets discussed in this issue and numerous connected discussions. It seems like another case where data type checks would be useful and, more broadly, where consistent installations across operating systems would help.
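A data type check of the kind suggested here could look roughly like the following. It is a sketch only: the function name, the expected schema, and the sample table are hypothetical and do not reflect the project's actual column list.

```python
import pandas as pd


def find_dtype_mismatches(df, expected):
    """Return {column: (expected, actual)} for columns whose dtype differs.

    Columns listed in `expected` but absent from `df` are reported
    with an actual value of "missing".
    """
    mismatches = {}
    for column, want in expected.items():
        if column not in df.columns:
            mismatches[column] = (want, "missing")
            continue
        got = str(df[column].dtype)
        if got != want:
            mismatches[column] = (want, got)
    return mismatches


# Made-up table and schema for illustration.
parcels_geography = pd.DataFrame({
    "parcel_id": pd.Series([1, 2, 3], dtype="int64"),
    "zone_id": pd.Series([101, 102, 103], dtype="int32"),
})
expected = {"parcel_id": "int64", "zone_id": "int64", "juris": "object"}

problems = find_dtype_mismatches(parcels_geography, expected)
print(problems)
```

Running such a check right after loading each input table would surface a mismatch at the start of a run instead of deep inside a model step.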
