Can I search through all zipcodes or bounding boxes in the U.S.? #33
Comments
Hi Liviayi! I would probably handle this by pre-processing the bounding boxes (e.g. from shapefiles) into a local database or a CSV file, and then looping over them with the bounding box search option. In PostGIS you can find the bounding box with the ST_Envelope function: https://postgis.net/docs/ST_Envelope.html
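As a rough illustration of that pre-processing step, here is a minimal Python sketch that computes the envelope (the same axis-aligned box ST_Envelope would return for a polygon) of each area and writes the results as CSV. The area names, coordinates, and column names are made up for the example; in practice the outlines would come from a shapefile or a database, and the output would go to a file rather than an in-memory buffer.

```python
import csv
import io


def envelope(points):
    """Return (min_lng, min_lat, max_lng, max_lat) for (lng, lat) pairs."""
    lngs, lats = zip(*points)
    return (min(lngs), min(lats), max(lngs), max(lats))


# Hypothetical input: area name -> outline vertices as (lng, lat).
areas = {
    "area_a": [(-105.0, 39.0), (-104.0, 39.5), (-104.5, 40.0)],
    "area_b": [(-9.23, 38.69), (-9.09, 38.70), (-9.10, 38.80)],
}


def write_bboxes(areas, out):
    """Write one CSV row per area with its bounding box."""
    writer = csv.writer(out)
    # Column names are an assumption, not a format the project requires.
    writer.writerow(["name", "min_lng", "min_lat", "max_lng", "max_lat"])
    for name, points in sorted(areas.items()):
        writer.writerow([name, *envelope(points)])


buf = io.StringIO()  # swap for open("bboxes.csv", "w", newline="") to write a file
write_bboxes(areas, buf)
print(buf.getvalue())
```

Each row can then be fed, one at a time, to whatever bounding box search you run.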
Thanks! That would be awesome. Is this difficult to implement? When do you plan to implement this?
I meant that it might be sensible to do this as a separate pre-processing script rather than implementing it in this project. That's just my suggestion.
I see. Thanks. In your code, is it possible to run a bounding box search using the bounding box of a state instead of a city?
Hi Livi.ayi: The word "city" appears in tables etc. purely for historical reasons. So long as you can get a bounding box and are prepared to be patient, you should be able to run it on other areas. I've run it on Switzerland and Sri Lanka, for example. That said, please note the status message on https://github.com/tomslee/airbnb-data-collection. The script currently misses some listings because of changes on the Airbnb site (about 10% for some cities) and I am unlikely to fix that problem. If you are lucky, the bounding box method gets about 8 or 9 new listings per request on average: let's be generous and say 10 (I don't know if that would hold for an area the size of the USA). Each request takes a few seconds, so let's say you get 200 listings per minute. If there are about 650K listings in the US (http://www.businessinsider.com/airbnb-total-worldwide-listings-2017-8) then that's about 3,000 minutes, or 50 hours. It may be possible! Caveat: you would also collect a lot of non-USA listings from southern Canada and northern Mexico, so it may be more like four days or so. I've started an exploratory run and I'll post a note when/if it finishes. Maybe this weekend. There may be other questions about zoom levels and bounding boxes for boxes that big...
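For anyone who wants to redo that back-of-the-envelope estimate with their own numbers, here is a small Python sketch. The inputs are the figures from the comment above; the 3 seconds per request is an assumption chosen to match "200 listings per minute" at 10 listings per request. The result ignores IP blocks, wait times, and listings picked up outside the target area.

```python
def estimate_hours(total_listings, listings_per_request, seconds_per_request):
    """Rough run time in hours for a bounding box survey."""
    requests_needed = total_listings / listings_per_request
    return requests_needed * seconds_per_request / 3600


# 650K listings, 10 new listings per request, 3 s per request (assumed).
hours = estimate_hours(650_000, 10, 3)
print(f"{hours:.0f} hours, or {hours / 24:.1f} days")
```

With these inputs the estimate comes out a little over 50 hours, in line with the rough figure above.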
Thanks so much for the detailed answer. I tried running a bounding box search around one particular (roughly rectangular) state and it worked well. It took about 2 hours. I did not use proxies, so my IP was occasionally blocked for five minutes or so partway through. The regular wait time was a fantastic idea. That leads to my last question: what is the procedure for using proxy servers with your code? Do I just obtain a list of hosts/ports and add them to the user.config file, and that's it?
Hello to all! I am getting this problem when searching by bbox: "Warning HTTP Status 400 from web site: IP address blocked. Waiting 1.0 minutes..." It seems my university IP is blocked. Do you have any recommendation for working around this issue? When searching by zipcode or neighborhood, the process finishes but no data ends up in the DB. I am looking for data within the Lisbon boundaries. Thanks in advance!
Thanks for the amazing project!! Is there a way for me to search over all zip codes in the U.S.? Or maybe divide the U.S. into several bounding boxes and search over all bounding boxes? It seems that your code is based on cities (regardless of whether the search is being done through bounding boxes, neighborhoods or zipcodes). Thank you very much.
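One way to picture the "several bounding boxes" idea is to cut a large box into a regular grid of smaller tiles and search each tile separately. The Python sketch below does only the splitting. The coordinates are a very rough, illustrative box around the contiguous U.S., and the 4 by 2 grid is an arbitrary choice; neither comes from the project.

```python
def split_bbox(min_lng, min_lat, max_lng, max_lat, n_cols, n_rows):
    """Split a bounding box into n_cols * n_rows equal tiles.

    Returns a list of (min_lng, min_lat, max_lng, max_lat) tuples,
    ordered row by row from the south-west corner.
    """
    width = (max_lng - min_lng) / n_cols
    height = (max_lat - min_lat) / n_rows
    tiles = []
    for row in range(n_rows):
        for col in range(n_cols):
            tiles.append((
                min_lng + col * width,
                min_lat + row * height,
                min_lng + (col + 1) * width,
                min_lat + (row + 1) * height,
            ))
    return tiles


# Illustrative box only; adjust to the area you actually need.
tiles = split_bbox(-125.0, 24.0, -66.0, 50.0, n_cols=4, n_rows=2)
for tile in tiles:
    print(tile)
```

Tiles that share an edge can return the same listing twice, so results would need de-duplicating by listing id afterwards.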