Salary-Prediction-with-Linear-Regression

Dataset: Hitters

Sport: Baseball

Description: The dataset contains 1986 season statistics and career statistics for a group of baseball players. Salary, the variable to be predicted, comes from Sports Illustrated. The workflow is as follows: first, check the dataset for missing values and outliers; next, create new features; then fit a Linear Regression model and report the train and test scores. Finally, use the model to predict salaries for the players whose salary is missing from the dataset.
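The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual notebook: it assumes pandas and scikit-learn, runs on a small synthetic stand-in for the Hitters data, and the IQR clipping rule, the derived features (`HitRate`, `CHitsPerYear`), and the helper names `make_toy_hitters` and `run_pipeline` are all hypothetical choices made for the example.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split


def make_toy_hitters(n=200, seed=0):
    """Synthetic stand-in for Hitters (made-up values, not real players)."""
    rng = np.random.default_rng(seed)
    df = pd.DataFrame({
        "AtBat": rng.integers(100, 600, n),
        "Hits": rng.integers(20, 200, n),
        "Years": rng.integers(1, 20, n),
        "CHits": rng.integers(50, 3000, n),
        "League": rng.choice(["A", "N"], n),
    })
    df["Salary"] = 50 + 2.0 * df["Hits"] + 0.2 * df["CHits"] + rng.normal(0, 25, n)
    # Blank out 20% of salaries to mimic players with no salary information.
    df.loc[df.sample(frac=0.2, random_state=seed).index, "Salary"] = np.nan
    return df


def run_pipeline(df):
    df = df.copy()

    # 1. Missing values: count rows where the target is unknown.
    n_missing = int(df["Salary"].isna().sum())

    # 2. Outliers (assumed rule): clip numeric predictors to the 1.5 * IQR fences.
    num_cols = df.select_dtypes("number").columns.drop("Salary")
    for col in num_cols:
        q1, q3 = df[col].quantile([0.25, 0.75])
        iqr = q3 - q1
        df[col] = df[col].clip(q1 - 1.5 * iqr, q3 + 1.5 * iqr)

    # 3. Feature creation (hypothetical examples).
    df["HitRate"] = df["Hits"] / df["AtBat"]
    df["CHitsPerYear"] = df["CHits"] / df["Years"]

    # Encode the categorical column as a 0/1 indicator.
    df = pd.get_dummies(df, columns=["League"], drop_first=True, dtype=float)

    # 4. Fit on players with a known salary and score on a held-out split.
    known = df[df["Salary"].notna()]
    unknown = df[df["Salary"].isna()]
    X, y = known.drop(columns="Salary"), known["Salary"]
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=42
    )
    model = LinearRegression().fit(X_train, y_train)
    scores = {
        "train_r2": model.score(X_train, y_train),
        "test_r2": model.score(X_test, y_test),
    }

    # 5. Predict salaries for the players whose salary is missing.
    predicted = pd.Series(
        model.predict(unknown.drop(columns="Salary")), index=unknown.index
    )
    return n_missing, scores, predicted


if __name__ == "__main__":
    n_missing, scores, predicted = run_pipeline(make_toy_hitters())
    print(n_missing, scores)
    print(predicted.head())
```

With the real data, `make_toy_hitters` would be replaced by loading the Hitters file, and the outlier rule and features would follow whatever the exploratory analysis suggests.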

Columns of Dataset

  • AtBat: Number of times at bat in 1986
  • Hits: Number of hits in 1986
  • HmRun: Number of home runs in 1986
  • Runs: Number of runs in 1986
  • RBI: Number of runs batted in in 1986
  • Walks: Number of walks in 1986
  • Years: Number of years in the major leagues
  • CAtBat: Number of times at bat during his career
  • CHits: Number of hits during his career
  • CHmRun: Number of home runs during his career
  • CRuns: Number of runs during his career
  • CRBI: Number of runs batted in during his career
  • CWalks: Number of walks during his career
  • League: A factor with levels A and N indicating player’s league at the end of 1986
  • Division: A factor with levels E and W indicating player’s division at the end of 1986
  • PutOuts: Number of put outs in 1986
  • Assists: Number of assists in 1986
  • Errors: Number of errors in 1986
  • Salary: 1987 annual salary on opening day in thousands of dollars
  • NewLeague: A factor with levels A and N indicating player’s league at the beginning of 1987
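The column list above mixes 1986 season counts, career counts, three two-level factors, and the Salary target. A small sketch of how one raw record could be turned into numeric values is shown below. It uses only the standard library; the grouping constants, the `encode_row` helper, the sample record, and the assumption that a missing salary appears as an empty string or `"NA"` are illustrative and not taken from the project.

```python
# Column groups, following the list above.
SEASON_1986 = ["AtBat", "Hits", "HmRun", "Runs", "RBI", "Walks",
               "PutOuts", "Assists", "Errors"]
CAREER = ["Years", "CAtBat", "CHits", "CHmRun", "CRuns", "CRBI", "CWalks"]
FACTOR_LEVELS = {
    "League": ("A", "N"),
    "Division": ("E", "W"),
    "NewLeague": ("A", "N"),
}
TARGET = "Salary"


def encode_row(row):
    """Convert one raw record (dict of strings) into numeric values."""
    out = {}
    for col in SEASON_1986 + CAREER:
        out[col] = float(row[col])
    for col, (first, second) in FACTOR_LEVELS.items():
        if row[col] not in (first, second):
            raise ValueError(f"unexpected level {row[col]!r} in {col}")
        # First level maps to 0, second level maps to 1.
        out[col] = 1 if row[col] == second else 0
    raw = row.get(TARGET, "")
    # Assumption: a missing salary is stored as "" or "NA".
    out[TARGET] = None if raw in ("", "NA") else float(raw)
    return out


# Made-up record for illustration only.
sample = {
    "AtBat": "300", "Hits": "80", "HmRun": "7", "Runs": "35", "RBI": "40",
    "Walks": "30", "Years": "5", "CAtBat": "1500", "CHits": "400",
    "CHmRun": "30", "CRuns": "180", "CRBI": "170", "CWalks": "140",
    "League": "N", "Division": "W", "PutOuts": "250", "Assists": "20",
    "Errors": "5", "Salary": "NA", "NewLeague": "N",
}

if __name__ == "__main__":
    print(encode_row(sample))
```

Rows where `Salary` encodes to `None` are the ones the model would later be asked to predict.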