This project involves scraping real estate data, cleaning and inserting it into a PostgreSQL database, and performing clustering analysis with visualization on a map. The main components of the project are:
- Data Scraping and Cleaning
- Database Setup and Data Insertion
- Clustering and Geospatial Visualization of Real Estate Data
To get started with this project, first make sure the following prerequisites are installed:
- Python 3.7+
- Docker
- Docker Compose
git clone https://github.com/Malek-logh/RealEstateMapperTool.git
cd RealEstateMapperTool
pip install -r requirements.txt
Required Python packages:
- pandas
- beautifulsoup4
- selenium
- sqlalchemy
- psycopg2
- scikit-learn
- folium
Run the scraping scripts to collect data from real estate websites. Make sure you have the necessary drivers for Selenium (e.g., ChromeDriver).
python scrapingTerrain.py
python scrapingMaison.py
python scrapingAppartement.py
Start the PostgreSQL database using Docker Compose and insert the scraped data.
docker-compose up -d
python insertdata.py
Run the clustering analysis and generate the map with clustered data points.
python clustering.py
real-estate-data/
├── scrapingTerrain.py
├── scrapingMaison.py
├── scrapingAppartement.py
├── MubawabTerrain.csv
├── MubawabMaison.csv
├── MubawabAppartement.csv
├── insertdata.py
├── clustering.py
├── docker-compose.yml
├── requirements.txt
└── README.md
The scripts scrapingTerrain.py, scrapingMaison.py, and scrapingAppartement.py scrape data from https://www.mubawab.tn/ and save it into CSV files. They use BeautifulSoup for parsing HTML and selenium for web interactions.
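As a rough illustration of the parse-and-save step, the sketch below extracts listings from a small HTML sample and serializes them as CSV. The markup, the class names (listing, title, price), and the helpers parse_listings and to_csv are hypothetical; the real pages and scripts will differ. In practice the HTML string could come from the page source of a Selenium driver.

```python
import csv
import io

from bs4 import BeautifulSoup

# Hypothetical markup: the real pages use their own tags and class names.
SAMPLE_HTML = """
<ul>
  <li class="listing"><h2 class="title">Terrain Sousse</h2>
      <span class="price">120 000 TND</span></li>
  <li class="listing"><h2 class="title">Maison Tunis</h2>
      <span class="price">350 000 TND</span></li>
</ul>
"""


def parse_listings(html):
    """Return one dict per listing card found in the HTML."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for card in soup.select("li.listing"):
        title = card.select_one(".title").get_text(strip=True)
        price = card.select_one(".price").get_text(strip=True)
        rows.append({"title": title, "price": price})
    return rows


def to_csv(rows):
    """Serialize the rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=["title", "price"])
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


listings = parse_listings(SAMPLE_HTML)
csv_text = to_csv(listings)
```

Keeping parsing separate from fetching makes the parser easy to check against saved HTML without launching a browser.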
insertdata.py reads the data from the CSV files, cleans it, and inserts it into a PostgreSQL database. It uses pandas for data manipulation and sqlalchemy for database interactions.
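The clean-then-insert flow might look roughly like the sketch below. The column names, the cleaning rules, the table name listings, and the helpers clean_listings and insert_listings are assumptions for illustration, not the actual contents of insertdata.py. An in-memory SQLite engine stands in for PostgreSQL so the sketch runs without Docker.

```python
import pandas as pd
from sqlalchemy import create_engine, text


def clean_listings(df):
    """Drop duplicates and rows without a usable price, and make price numeric."""
    df = df.drop_duplicates().copy()
    # Strip the currency suffix and thousands spaces: "120 000 TND" -> 120000.
    digits = df["price"].astype(str).str.replace(r"\D", "", regex=True)
    df["price"] = pd.to_numeric(digits, errors="coerce")
    return df.dropna(subset=["price"]).reset_index(drop=True)


def insert_listings(df, engine, table="listings"):
    """Write the cleaned frame to the database, replacing any existing table."""
    df.to_sql(table, engine, if_exists="replace", index=False)


# Hypothetical sample rows standing in for one of the scraped CSV files.
raw = pd.DataFrame(
    {
        "title": ["Terrain Sousse", "Terrain Sousse", "Maison Tunis", "Appartement Sfax"],
        "price": ["120 000 TND", "120 000 TND", "350 000 TND", "Prix sur demande"],
    }
)
cleaned = clean_listings(raw)

# SQLite in memory is a stand-in. For the project's database the URL would look
# like "postgresql+psycopg2://USER:PASSWORD@localhost:5432/DBNAME" (placeholders).
engine = create_engine("sqlite:///:memory:")
insert_listings(cleaned, engine)

with engine.connect() as conn:
    row_count = conn.execute(text("SELECT COUNT(*) FROM listings")).scalar()
```

With real data, raw would instead come from pd.read_csv on one of the scraped CSV files.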
clustering.py performs clustering analysis on the real estate data and visualizes the results on a map of Tunisia. It uses scikit-learn for clustering and folium for map visualization.
docker-compose.yml sets up the PostgreSQL database and pgAdmin using Docker Compose.
Use Docker Compose to set up and run the PostgreSQL database and pgAdmin.
docker-compose up -d
- Access pgAdmin at http://localhost:8081
- PostgreSQL will be running on port 5432
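To confirm that both containers are accepting connections after startup, a small check like the following could be used. It relies only on the standard library and the two ports listed above; the helper names port_is_open and check_services are hypothetical and not part of the project.

```python
import socket


def port_is_open(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def check_services(host="localhost"):
    """Report whether the two ports named in this README accept connections."""
    services = {"pgAdmin": 8081, "PostgreSQL": 5432}
    return {name: port_is_open(host, port) for name, port in services.items()}
```

Calling print(check_services()) after docker-compose up -d prints a dictionary with one True or False entry per service. An open port only shows that something is listening; it does not verify credentials or that the database is ready for queries.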
Feel free to fork this project, make your changes, and submit a pull request. Any contributions are highly appreciated!
This project is licensed under the MIT License. See the LICENSE file for details.