webvalley/scientific-python

# Introduction to Data Science in Python @ WebValley 2019

Introduction

(adapted from Step by step approach to perform data analysis in Python)

So you have decided to learn Python, but you have no prior programming experience, and you are unsure where to start and how much Python to learn.

These are some of the common questions a beginner has when getting started with Python (for data-centric applications):

  • “How long does it take to learn Python?”
  • “How much Python should I learn to perform data analysis?”
  • “What are the best books/courses to learn Python?”
  • “Should I be an expert Python programmer in order to work with data sets?”

It is good to be confused when beginning to learn a new skill: that is what the author of “learn anything in 20 hours” says.

However, the key phrase here is: Don’t Panic! This tutorial has been conceived and designed to show you just that.

What do you need to get started

A common misconception is that performing data analysis in Python requires being proficient in Python programming.

Coding is fun, but you don't really need to be a coding ninja in Python to do data analysis.

All you need to get started is some basics of (Python) programming and some very elementary software engineering concepts, enough to avoid disasters when you go into production, whatever production means to you (e.g. deploying a system online, or sharing the code of your prototype or experiments on a public repo for reproducibility).

What you won't find in this tutorial

In this tutorial, you won't learn how to program in Python. If you are looking for a quick tutorial on Python programming, maybe this is the tutorial for you: Python Programming Tutorial

What you will find in this tutorial

For a glimpse of what to expect from this tutorial, I suggest this 5-minute read: 5 amazingly powerful Python libraries for Data Science

Jupyter Notebook Format

(Most of) The materials in this tutorial will be provided as Jupyter Notebooks.

If you don't know what a Jupyter notebook is, or how to use it, please take a look at this quick introductory tour: IPython Notebook Beginner Guide.
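As a rough illustration of what a notebook file is under the hood, here is a minimal sketch that builds one by hand. It assumes the version 4 on-disk format (a JSON document holding a list of cells); the cell contents are made up for the example, and in practice Jupyter writes this file for you.

```python
import json

# A minimal notebook in the (assumed) version 4 format: a JSON object
# with a list of cells plus some top-level metadata.
notebook = {
    "nbformat": 4,
    "nbformat_minor": 4,
    "metadata": {},
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": ["# Hello, notebook"],
        },
        {
            "cell_type": "code",
            "execution_count": None,  # the cell has not been run yet
            "metadata": {},
            "outputs": [],
            "source": ["print(1 + 1)"],
        },
    ],
}

# Serialise to the text that would be stored in an .ipynb file,
# then read it back as any JSON-aware tool could.
text = json.dumps(notebook, indent=1)
loaded = json.loads(text)
cell_types = [cell["cell_type"] for cell in loaded["cells"]]
print(cell_types)
```

Because the file is plain JSON, notebooks can be stored in a repository and inspected with ordinary tools, although the notebook interface remains the intended way to edit them.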

For additional details and materials on Jupyter and IPython, here are some other suggested readings:

Learning Path

The lecture materials are organised as follows:

  • Introduction to Jupyter and IPython notebook format notebook
  • Introduction to numpy for numerical computation notebook
  • Data Representation in Machine Learning and scipy.sparse notebook
  • Dataset for Machine Learning: pmlb notebook
  • Introduction to pandas for data analysis notebook
  • Data Science case study:
  • Introduction to Data Visualisation using bokeh notebook
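
To give a flavour of three of the libraries listed above, here is a small sketch that looks at the same data through numpy, scipy.sparse, and pandas. The matrix values and column names are invented for illustration and are not taken from the notebooks.

```python
import numpy as np
import pandas as pd
from scipy import sparse

# numpy: vectorised numerical computation on a small, made-up matrix
X = np.array([[1.0, 0.0, 2.0],
              [0.0, 0.0, 3.0],
              [4.0, 0.0, 0.0]])
column_means = X.mean(axis=0)

# scipy.sparse: keep only the non-zero entries of the same matrix
X_sparse = sparse.csr_matrix(X)
n_stored = X_sparse.nnz  # number of explicitly stored values

# pandas: a labelled, tabular view of the same data
df = pd.DataFrame(X, columns=["a", "b", "c"])
summary = df.describe()

print(column_means)
print(n_stored)
print(summary.loc["mean"])
```

The point is only that a few lines are enough to compute, store, and summarise data; each notebook covers its library in far more depth.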

Further Readings

If you want an introductory overview of Python for Data Science, I strongly recommend the Scipy Lecture Notes: a community-driven project where you can find tutorials (for non-experts) on the scientific Python ecosystem.

Additional books for further reading:

About

Tutorial on Scientific Python and Pythonic Data Science
