SQL is a core language for working with data, and people who can write both SQL and Python can work across a broader range of data tasks. How does SQL combine with Python?
In this workshop we will use Fugue and DuckDB to practice manipulating data with SQL.
🍴Fork this repo!
Requirements:
- python > 3.10
If you use virtual environments, create a new one and activate it. Then install the libraries:
macOS/Linux:
python3 -m pip install -r requirements.txt
Windows:
py -m pip install -r requirements.txt
GITPOD!
Alternatively, you can create a Gitpod account and use Gitpod for the workshop.
- create a Gitpod account (you can select GitHub to authenticate)
- create a new workspace and paste the link to this repo (or to your fork, if you forked it)
- choose your editor; you can also use the browser version
- open VS Code from Gitpod and follow all the prompts (you need to have VS Code installed locally). Alternatively, if you are a PyCharm user, open that.
- local: open SSH with the token, then install the libraries with
pip install -r requirements.txt
- browser: activate ipykernel, create a new environment (python -m venv .venv), activate it (source .venv/bin/activate), install the requirements (pip install -r requirements.txt), and select the environment as the notebook kernel.
Our data science team needs features for predicting customer lifetime value. The features they've asked for are recency, frequency, and monetary value.
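As a rough illustration of what these three features mean, here is a minimal sketch of a recency/frequency/monetary query. It makes several assumptions: the `orders` table, its column names, the sample rows, and the `SNAPSHOT_DATE` reference date are all hypothetical, and Python's built-in `sqlite3` stands in for DuckDB so the snippet runs without installing anything. The `julianday` date arithmetic is SQLite-specific and would need adjusting for another engine.

```python
import sqlite3

# Hypothetical sample rows; the workshop's real data lives in the data folder.
orders = [
    ("alice", "2023-01-05", 20.0),
    ("alice", "2023-03-01", 35.0),
    ("bob", "2023-02-10", 50.0),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Assumed reference date against which recency is measured.
SNAPSHOT_DATE = "2023-03-31"

rfm_query = """
SELECT
    customer_id,
    -- recency: days between the snapshot date and the latest order
    CAST(julianday(?) - julianday(MAX(order_date)) AS INTEGER) AS recency_days,
    -- frequency: number of orders per customer
    COUNT(*) AS frequency,
    -- monetary value: total amount spent per customer
    SUM(amount) AS monetary_value
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""

rfm = con.execute(rfm_query, (SNAPSHOT_DATE,)).fetchall()
for row in rfm:
    print(row)
```

Each output row holds one customer's recency in days, order count, and total spend. The notebooks may define the features differently, for example by counting distinct purchase days instead of orders.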
The data is in the data folder in several formats. Open the notebooks below to build the features with two different approaches.