Before beginning, make sure your environment is properly set up, as documented in Hands-On 0: https://github.com/ucsc-cse-40/HO0
In this assignment, you will pre-process and analyze synthetic data (i.e., data produced by an algorithm rather than collected from the real world).
The objective of this assignment is for you to learn about:
- Data manipulation (selecting, adding, removing rows and columns).
- Data exploration (understanding the structure and contents of a dataset).
- Data selection (filtering rows and columns).
- Feature engineering (pre-processing data for use with machine learning).
- Basic data visualization (plotting data to explore functional relationships between variables).
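
As a rough illustration of the first four objectives, here is a minimal sketch in plain Python. It is not the assignment's data or required approach: the column names (id, x, y), the helper make_rows, the seed, and the linear relationship between x and y are all assumptions made up for this example, and the assignment itself may use a data-analysis library instead of lists of dictionaries. Plotting is left out because it needs a third-party library.

```python
import random
import statistics


def make_rows(n, seed=40):
    """Generate n synthetic rows (hypothetical columns: id, x, y)."""
    rng = random.Random(seed)  # fixed seed so the data is repeatable
    rows = []
    for i in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 2.0 * x + rng.gauss(0.0, 1.0)  # assumed linear relation plus noise
        rows.append({"id": i, "x": x, "y": y})
    return rows


rows = make_rows(20)

# Data exploration: inspect the structure and a summary statistic.
columns = sorted(rows[0])
x_mean = statistics.mean(r["x"] for r in rows)

# Data selection: keep rows with x above the mean, and only columns x and y.
selected = [{"x": r["x"], "y": r["y"]} for r in rows if r["x"] > x_mean]

# Feature engineering: add a min-max scaled copy of x in the range [0, 1].
x_min = min(r["x"] for r in rows)
x_max = max(r["x"] for r in rows)
for r in rows:
    r["x_scaled"] = (r["x"] - x_min) / (x_max - x_min)

# Data manipulation: remove a column that is no longer needed.
for r in rows:
    del r["id"]

print(columns, len(selected), sorted(rows[0]))
```

Min-max scaling is only one of many possible pre-processing steps. It was chosen here because it is short and easy to verify by hand.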
Submit your assignment by running python3 -m cse40.autograder submit from your local repository directory.
This script reads your cruzid, password, and assignment id from config.json and submits your work to a server controlled by the TAs. The server runs the tests and reports the results back to you.