Performing analysis on dataset of active MLB players in R

cadedupont/mlb-data-analysis

MLB Data Analysis

A project intended to familiarize myself with data analysis in R. The data comes from the Lahman database, which contains a wide variety of statistics for Major League Baseball (MLB).

Creates a scatter plot of the earned run average (ERA) of MLB pitchers against their age in the 2022 season. The Pitching table is left-joined with the People table in the database to get each pitcher's age.

To qualify for the plot, a pitcher must have thrown at least 100 innings and appeared in at least 20 games in the season. This ensures that the pitcher had a significant amount of playing time (e.g. it excludes position players who pitched and pitchers who missed most of the season with injuries).
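The join and qualification filter described above can be sketched in base R roughly as follows. This is a minimal illustration, not the repository's actual script: the rows are made-up toy data standing in for the Lahman Pitching and People tables, the column names (`playerID`, `yearID`, `G`, `IPouts`, `ER`, `birthYear`) follow the Lahman schema, and approximating age as season year minus birth year is an assumption.

```r
# Toy stand-ins for the Lahman Pitching and People tables (made-up rows).
# IPouts is outs recorded, so 100 innings corresponds to 300 outs.
pitching <- data.frame(
  playerID = c("aaa01", "bbb01", "ccc01", "ddd01", "eee01"),
  yearID   = c(2022, 2022, 2022, 2021, 2022),
  G        = c(32, 5, 28, 30, 25),
  IPouts   = c(600, 30, 270, 540, 450),
  ER       = c(60, 4, 40, 70, 75)
)
people <- data.frame(
  playerID  = c("aaa01", "bbb01", "ccc01", "ddd01", "eee01"),
  birthYear = c(1990, 1995, 1988, 1992, 1996)
)

# Left join: keep every pitching row and attach the birth year where available.
joined <- merge(pitching, people, by = "playerID", all.x = TRUE)

# Qualification: 2022 season, at least 100 innings (300 outs), at least 20 games.
qualified <- subset(joined, yearID == 2022 & IPouts >= 300 & G >= 20)

# ERA = 9 * earned runs / innings pitched.
qualified$ERA <- 9 * qualified$ER / (qualified$IPouts / 3)

# Assumption: age is approximated as season year minus birth year.
qualified$age <- qualified$yearID - qualified$birthYear

plot(qualified$age, qualified$ERA,
     xlab = "Age", ylab = "ERA", main = "ERA vs Age (2022)")
```

With the real tables, a pitcher who changed teams mid-season appears in several rows (one per stint), so the rows would likely need to be summed per player before applying the thresholds.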

ERA vs Age

Creates a scatter plot of the win percentage of MLB teams in 2016 against their total expenditure on player salaries for that season.
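One way to build this comparison is sketched below in base R. It is an illustration under stated assumptions rather than the repository's code: the rows are made-up toy data standing in for the Lahman Teams and Salaries tables, the column names (`teamID`, `yearID`, `W`, `L`, `salary`) follow the Lahman schema, and win percentage is assumed to mean wins divided by wins plus losses.

```r
# Toy stand-ins for the Lahman Teams and Salaries tables (made-up rows).
teams <- data.frame(
  yearID = c(2016, 2016, 2016),
  teamID = c("AAA", "BBB", "CCC"),
  W      = c(100, 81, 60),
  L      = c(62, 81, 102)
)
salaries <- data.frame(
  yearID   = 2016,
  teamID   = c("AAA", "AAA", "BBB", "BBB", "CCC"),
  playerID = c("p1", "p2", "p3", "p4", "p5"),
  salary   = c(20e6, 10e6, 15e6, 5e6, 8e6)
)

# Restrict both tables to the 2016 season.
teams_2016    <- subset(teams, yearID == 2016)
salaries_2016 <- subset(salaries, yearID == 2016)

# Total payroll per team: sum the individual player salaries.
payroll <- aggregate(salary ~ teamID, data = salaries_2016, FUN = sum)

# Assumption: win percentage = wins / (wins + losses).
teams_2016$win_pct <- teams_2016$W / (teams_2016$W + teams_2016$L)

# Attach each team's payroll to its record.
team_stats <- merge(teams_2016, payroll, by = "teamID")

plot(team_stats$salary / 1e6, team_stats$win_pct,
     xlab = "Payroll (millions of USD)", ylab = "Win %",
     main = "Win % vs Salary (2016)")
```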

Win % vs Salary
