Gently scraped one of the biggest property listing websites for the condo market in Bangkok, Thailand. The end result is a csv file with 50k+ listings, each with its Lat/Long location and current asking price.
-
In the 'Sale' section, there were 1,484 pages. Retrieved all the href links from each page. In total, 53,413 links were retrieved.
-
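
The link collection step could look roughly like the sketch below. It is not the original script: it uses only the Python standard library, and the `/listing/` URL marker, the `example.com` base URL and the sample HTML are all made up for illustration. In a real run, the HTML of each of the 1,484 pages would be passed in one page at a time.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collect href values from <a> tags whose URL contains a marker substring."""

    def __init__(self, base_url, marker):
        super().__init__()
        self.base_url = base_url
        self.marker = marker
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        href = dict(attrs).get("href")
        if href and self.marker in href:
            # Turn relative links into absolute URLs.
            self.links.append(urljoin(self.base_url, href))


def extract_listing_links(html, base_url, marker="/listing/"):
    """Return the unique listing URLs found in one results page.

    `marker` is a hypothetical pattern; the real site may use another one.
    """
    parser = LinkCollector(base_url, marker)
    parser.feed(html)
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(parser.links))


# Made-up page content standing in for one results page.
sample_html = """
<a href="/listing/101">Condo A</a>
<a href="/listing/102">Condo B</a>
<a href="/about">About</a>
<a href="/listing/101">Condo A (duplicate)</a>
"""
links = extract_listing_links(sample_html, "https://example.com")
print(links)
```
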
Scraped the data for each property and stored it in csv files, starting a new csv for every 1,000 listings. This step is crucial because the web scraping was done gently, with sleep time between requests, so it took quite a bit of time to complete. Since this project was done on my personal laptop, I could not let it run 24/7, and it took several days to complete the whole run. These small chunks were combined in the next step.
-
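
A minimal sketch of the chunked, resumable scraping loop, under some assumptions: the column names in `FIELDS`, the `listings_0000.csv` file naming, the 2 second default delay and the `scrape_one` callback are all hypothetical, not taken from the original script. The demo at the bottom uses a fake scraper and no delay, so it runs without touching any website.

```python
import csv
import tempfile
import time
from pathlib import Path

# Hypothetical column names; the real csv files may use different ones.
FIELDS = ["url", "price", "lat", "lon"]


def scrape_in_chunks(links, scrape_one, out_dir, chunk_size=1000, delay_seconds=2.0):
    """Scrape links chunk by chunk, writing one csv per chunk.

    Chunks whose csv already exists are skipped, so the job can be
    stopped and restarted over several days without redoing work.
    Returns the list of chunk files written during this call.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for start in range(0, len(links), chunk_size):
        chunk_path = out_dir / f"listings_{start // chunk_size:04d}.csv"
        if chunk_path.exists():
            continue  # finished in an earlier session
        rows = []
        for url in links[start:start + chunk_size]:
            rows.append(scrape_one(url))
            time.sleep(delay_seconds)  # be gentle with the server
        # Write to a temporary file first, then rename, so an interrupted
        # write is never mistaken for a completed chunk.
        tmp_path = chunk_path.with_suffix(".tmp")
        with open(tmp_path, "w", newline="", encoding="utf-8") as f:
            writer = csv.DictWriter(f, fieldnames=FIELDS)
            writer.writeheader()
            writer.writerows(rows)
        tmp_path.replace(chunk_path)
        written.append(chunk_path)
    return written


def fake_scrape(url):
    """Stand-in for the real page scraper; returns made-up values."""
    return {"url": url, "price": 1000000, "lat": 13.75, "lon": 100.5}


demo_dir = tempfile.mkdtemp()
demo_links = [f"https://example.com/listing/{i}" for i in range(5)]
first_run = scrape_in_chunks(demo_links, fake_scrape, demo_dir, chunk_size=2, delay_seconds=0)
second_run = scrape_in_chunks(demo_links, fake_scrape, demo_dir, chunk_size=2, delay_seconds=0)
print(len(first_run), len(second_run))
```

The second call writes nothing because every chunk file is already on disk. That is what makes it possible to close the laptop and pick the run up again later.
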
Combined all csv files into one and dropped missing values.
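
The combine step might look something like this. It is a standard-library sketch, not the original code: it assumes every chunk file has the same header, treats any empty cell as a missing value and drops the whole row. The `listings_*.csv` pattern, the `all_listings.csv` output name and the demo rows are made up. With pandas installed, `pandas.concat` followed by `DataFrame.dropna` is a common way to do much the same thing.

```python
import csv
import tempfile
from pathlib import Path


def has_missing(row):
    """True if any cell in the row is absent or blank."""
    return any(v is None or str(v).strip() == "" for v in row.values())


def combine_chunks(chunk_dir, output_path, pattern="listings_*.csv"):
    """Merge chunk csv files into one, skipping rows with missing values.

    Assumes all chunks share the header of the first chunk.
    Returns a (kept, dropped) tuple of row counts.
    """
    chunk_paths = sorted(Path(chunk_dir).glob(pattern))
    kept = dropped = 0
    writer = None
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        for path in chunk_paths:
            with open(path, newline="", encoding="utf-8") as f:
                reader = csv.DictReader(f)
                for row in reader:
                    if writer is None:
                        # Take the header from the first chunk that has rows.
                        writer = csv.DictWriter(
                            out, fieldnames=reader.fieldnames, extrasaction="ignore"
                        )
                        writer.writeheader()
                    if has_missing(row):
                        dropped += 1
                        continue
                    writer.writerow(row)
                    kept += 1
    return kept, dropped


# Two tiny made-up chunk files; the second row of the first one has no latitude.
demo_dir = Path(tempfile.mkdtemp())
(demo_dir / "listings_0000.csv").write_text(
    "url,price,lat,lon\na,100,13.7,100.5\nb,200,,100.6\n", encoding="utf-8"
)
(demo_dir / "listings_0001.csv").write_text(
    "url,price,lat,lon\nc,300,13.8,100.4\nd,400,13.9,100.3\n", encoding="utf-8"
)
kept, dropped = combine_chunks(demo_dir, demo_dir / "all_listings.csv")
print(kept, dropped)
```
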