Files for the Python for Data Science workshops, presented by the IDEA Student Center at UC San Diego.
Data Science is relevant to a wide range of disciplines and applications, too many to cover in a single workshop. Instead, the files in this repository focus on some of the more common Data Science topics and skills:
- data files: reading/writing
- data visualization
- time-series analysis
- regression
- classification
- statistics
Software used in the workshops:
- Python 2 or 3
- Jupyter Notebook
- NumPy
- SciPy
- Matplotlib
- Pandas
- scikit-learn
We highly recommend installing Python using Anaconda, an all-in-one installer that includes everything we will need for the workshop (Python + essential packages). To install:
- go to the Anaconda download page: https://www.anaconda.com/download/
- download the Python 3.x graphical installer for your OS
- run the installer
NOTE: The installer download is large (approx. 600 MB), and installation can take 10–20 minutes after the download finishes.
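Once the installer has finished, a quick way to confirm the setup is to try importing each package. The script below is a minimal sketch, not part of the workshop files: the helper name `check_packages` is hypothetical, and the list uses the usual import names for the packages above (scikit-learn is imported as `sklearn`).

```python
import importlib

# Usual import names for the workshop packages (assumption: scikit-learn -> "sklearn").
PACKAGES = ["numpy", "scipy", "matplotlib", "pandas", "sklearn"]


def check_packages(names):
    """Map each import name to its version string, or None if it cannot be imported."""
    versions = {}
    for name in names:
        try:
            module = importlib.import_module(name)
        except ImportError:
            # Package is missing from this Python environment.
            versions[name] = None
        else:
            # Some modules do not define __version__; report "unknown" for those.
            versions[name] = getattr(module, "__version__", "unknown")
    return versions


if __name__ == "__main__":
    for name, version in check_packages(PACKAGES).items():
        status = version if version is not None else "NOT INSTALLED"
        print("{}: {}".format(name, status))
```

Save it to a file and run it with the Python that Anaconda installed. Any package reported as NOT INSTALLED is missing from that environment.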
Online courses:
- Learn Python for Data Science: https://www.datacamp.com/learn-python-with-anaconda
Datasets:
- FiveThirtyEight data repository: https://github.com/fivethirtyeight/data
- Kaggle: https://www.kaggle.com/datasets
Articles:
- Using Facebook to track sleeping habits: https://medium.com/@sqrendk/how-you-can-use-facebook-to-track-your-friends-sleeping-habits-505ace7fffb6