Predicting beer ratings

I scraped Beer Advocate's website and used supervised learning models to predict beer ratings. The final model is a low-complexity random forest regressor with a high R-squared.

The best model proved to be a random forest, though linear regression did surprisingly well (R-squared of 0.69). That makes sense: the data show that people tend to like beers with more alcohol (higher ABV) and give higher ratings to more recent beers. More importantly, the average rating of a brewery's other beers appears to be linearly related to the rating of a new beer from that same brewery (see scatterplot below). In short, the brewery is key.
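The brewery effect can be illustrated with a minimal sketch. It uses plain Python rather than the sklearn pipeline in the notebooks, and the toy data, function names, and brewery labels are all hypothetical. For each beer it computes the average rating of the other beers from the same brewery, then fits a one-variable least-squares line to see how much of the rating that single feature explains.

```python
from collections import defaultdict

# Hypothetical toy data as (brewery, rating) pairs; NOT the scraped dataset.
beers = [
    ("A", 4.2), ("A", 4.0), ("A", 4.4),
    ("B", 3.1), ("B", 3.3), ("B", 2.9),
    ("C", 3.8), ("C", 3.6), ("C", 4.0),
]


def other_beers_mean(rows):
    """For each beer, average the ratings of the brewery's other beers.

    Assumes every brewery has at least two beers.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for brewery, rating in rows:
        totals[brewery] += rating
        counts[brewery] += 1
    # Exclude the beer itself so its own rating does not leak into the feature.
    return [
        (totals[brewery] - rating) / (counts[brewery] - 1)
        for brewery, rating in rows
    ]


def fit_line(x, y):
    """Ordinary least squares for a single feature: y = slope * x + intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def r_squared(x, y, slope, intercept):
    """Share of the variance in y explained by the fitted line."""
    mean_y = sum(y) / len(y)
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - mean_y) ** 2 for b in y)
    return 1 - ss_res / ss_tot


features = other_beers_mean(beers)
ratings = [rating for _, rating in beers]
slope, intercept = fit_line(features, ratings)
r2 = r_squared(features, ratings, slope, intercept)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, R-squared={r2:.2f}")
```

Leaving each beer out of its own brewery average matters: if the beer's rating were included in the feature, the fit would look better than it could be for a genuinely new beer.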

All code is in the notebooks folder. Presentation slides are also included in Predicting_beer_ratings.pdf.

Languages: Python, HTML
Libraries: BeautifulSoup, sklearn, pandas, matplotlib, seaborn
Methods: Web scraping, regression
