The purpose of this project is to build intuition for processing and analyzing data in a simple linear regression problem.
- Kenji Alford Git
Notebook(s) for this project can be found here.
Table of Contents
Project introduction. As part of a series of basic projects, I'm taking a simple dataset and applying the general data science process: loading, wrangling, transformation and encoding, EDA, ML, and evaluation. Goals:
- The dataset used by the project provides 33 features for 395 students.
- Among those features are values describing the populations of two schools, as well as grades for a third test.
- The goal is to develop and train a model on the features (X), excluding the last test score, so that student success on that test can be predicted from the other features.
- Goals: Project Objectives, Evaluation Metrics (if the model can predict with 99% accuracy...)
- Inferential Statistics
- Machine Learning
- Why linear regression? The data lends itself to a straightforward solution with this algorithm because the target and the independent features are simple numeric values. This simplicity lets me focus more on the application of each data science step.
- Data Visualization
- Although I primarily use Seaborn for visual analysis, I also wanted to observe possible groupings among more than two variables, so during EDA I use plotly.express to generate 3D scatterplots.
- Predictive Modeling
- etc.
- Python
- Numpy, Pandas
- Scikit-learn
- Jupyter, VS Code
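
The modeling goal described above, predicting a final test score from the remaining features, can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the data is synthetic, the two numeric features and the names `features`, `final_score`, `fit_linear_regression`, `predict`, and `r_squared` are all hypothetical, and plain NumPy least squares is used here so the mechanics stay visible. In the project itself, scikit-learn's `LinearRegression` would presumably play this role.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for the student data (hypothetical values):
# 395 rows, two numeric features, and a final score as the target.
n_students = 395
features = rng.uniform(0, 20, size=(n_students, 2))
final_score = (
    0.2 * features[:, 0]
    + 0.8 * features[:, 1]
    + rng.normal(0, 1.0, size=n_students)
)


def fit_linear_regression(X, y):
    """Return least-squares coefficients, intercept first."""
    # Prepend a column of ones so the first coefficient is the intercept.
    X_design = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(X_design, y, rcond=None)
    return coef


def predict(coef, X):
    """Apply the fitted intercept and slopes to a feature matrix."""
    return coef[0] + X @ coef[1:]


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot


coef = fit_linear_regression(features, final_score)
predictions = predict(coef, features)
print("intercept and slopes:", coef)
print("R^2 on the training data:", r_squared(final_score, predictions))
```

The score printed here is computed on the same rows the model was fitted to, so it is optimistic; an honest estimate would come from held-out data.
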
This project is broken into two parts.
- stages of what you’re doing
- can be broken down
- into
- a tree
- target
- the control/test split
- the validation set
- ML algorithm stack
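
The split and validation set listed above could be produced along these lines. This is a sketch under assumptions: "control/test split" is read here as a train/test split, the 15% fractions are arbitrary illustrative values, and `split_indices` is a hypothetical helper rather than anything from the notebook. scikit-learn's `train_test_split` offers the same idea as a library call.

```python
import numpy as np


def split_indices(n_rows, val_fraction=0.15, test_fraction=0.15, seed=0):
    """Shuffle row indices and cut them into train, validation, and test sets."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_rows)
    n_test = int(round(n_rows * test_fraction))
    n_val = int(round(n_rows * val_fraction))
    test_idx = shuffled[:n_test]
    val_idx = shuffled[n_test:n_test + n_val]
    train_idx = shuffled[n_test + n_val:]
    return train_idx, val_idx, test_idx


# 395 rows, matching the number of students in the dataset.
train_idx, val_idx, test_idx = split_indices(395)
print(len(train_idx), len(val_idx), len(test_idx))  # 277 59 59
```

Fixing the seed keeps the split reproducible between notebook runs, and the three index sets never overlap, so no row used for fitting leaks into validation or testing.
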
Description of Data
Acquisition
Date of collection
Description of each data source
- Source
How Sources May Be Related
Variables Directory
- column headings
- types
- number of variables
- units of measurement
- Definition of missing data
Directory Tree
Description of Methods of Data Processing
- Wrangling
- Transformation
- Encoding
- Scaling
- this
- that
- and another thing
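
As a rough illustration of the encoding and scaling steps in the list above, the sketch below one-hot encodes a categorical column and standardizes a numeric one. The column names and values (`school`, `absences`) and the helpers `one_hot_encode` and `standardize` are hypothetical, chosen only for the example; scikit-learn's `OneHotEncoder` and `StandardScaler` cover the same ground as library components.

```python
import numpy as np


def one_hot_encode(values):
    """Return (categories, matrix) with one indicator column per category."""
    categories = sorted(set(values))
    column_of = {category: i for i, category in enumerate(categories)}
    encoded = np.zeros((len(values), len(categories)))
    for row, value in enumerate(values):
        encoded[row, column_of[value]] = 1.0
    return categories, encoded


def standardize(column):
    """Rescale a numeric column to zero mean and unit variance."""
    column = np.asarray(column, dtype=float)
    return (column - column.mean()) / column.std()


school = ["A", "B", "A", "A"]  # hypothetical categorical feature
absences = [0, 4, 10, 2]       # hypothetical numeric feature

categories, school_encoded = one_hot_encode(school)
absences_scaled = standardize(absences)

# Combine the encoded and scaled columns into one feature matrix.
X = np.column_stack([school_encoded, absences_scaled])
print(categories)  # ['A', 'B']
print(X.shape)     # (4, 3)
```

Two cautions apply when this feeds a linear regression. A full set of indicator columns is perfectly collinear with an intercept term, so one indicator per categorical feature is usually dropped. The mean and standard deviation used for scaling should also be computed on the training rows only and then reused for validation and test rows.
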
Information About Model
Model Evaluation - Model Card
Predictions
Real World Applications

