A simple vacation project done in the summer of 2021 to:
- Scrape food recipes from a website using Scrapy
- Build a NoSQL database using MongoDB
- Build machine learning models using scikit-learn
- Build a web app to visualize the data and model predictions using Streamlit
- Deploy the app on Streamlit Share or another cloud service
I scraped 5000+ food recipes from a website using Scrapy, stored them in a MongoDB database, trained machine learning models with scikit-learn, and built a Streamlit web app to visualize the data and model predictions. The main goal of this project was to document the tools and techniques that I taught myself so that I can reference them in the future.
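As a rough illustration of the scraping step, here is a minimal sketch of a Scrapy-style item pipeline that tidies a scraped recipe before it is stored. It is not the code from this repository: the class name `RecipeCleaningPipeline` and the field names `title`, `ingredients`, and `cook_time` are assumptions made for the example. It relies only on the fact that Scrapy calls `process_item(item, spider)` on each enabled pipeline, so the class itself needs nothing beyond the standard library.

```python
import re


class RecipeCleaningPipeline:
    """Scrapy-style item pipeline; Scrapy calls process_item() for each scraped item."""

    def process_item(self, item, spider):
        # Field names ("title", "ingredients", "cook_time") are hypothetical.
        cleaned = dict(item)

        # Collapse repeated whitespace in the title.
        cleaned["title"] = " ".join(cleaned.get("title", "").split())

        # Normalize each ingredient line and drop empty ones.
        cleaned["ingredients"] = [
            " ".join(text.split())
            for text in cleaned.get("ingredients", [])
            if text.strip()
        ]

        # Take the first integer in the cook time; assumes it is given in minutes.
        match = re.search(r"\d+", str(cleaned.get("cook_time", "")))
        cleaned["cook_time_minutes"] = int(match.group()) if match else None

        return cleaned


if __name__ == "__main__":
    raw = {
        "title": "  Banana   Bread ",
        "ingredients": [" 2 ripe bananas ", "", "1 cup  flour"],
        "cook_time": "60 mins",
    }
    print(RecipeCleaningPipeline().process_item(raw, spider=None))
```

In a real Scrapy project a class like this would be enabled through the `ITEM_PIPELINES` setting, and a second pipeline could then write the cleaned item to MongoDB.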
Links to some of my code in this repository:
- Scrapy, middlewares, pipelines, spiders
- MongoDB, mongosh, NoSQLBooster
- Database management, securing database following industry standards, querying, and data visualization
- Scikit-learn, logistic regression, random forest, k-means, agglomerative clustering & regression, preprocessing, model selection, feature engineering, t-SNE, PCA, and more
- Streamlit
- EDA, Data wrangling
- Deployment (didn't have time to do this for this particular project)
- The DOM, inspecting the DOM, querying the DOM, and scraping the DOM
- Selenium
Credit to www.allrecipe.com and the many authors of the recipes.