SQL in Python

Mary Cassatt, In the Loge (1878), MFA Boston.

SQL is a core language for working with data, and people who can write both SQL and Python can take on a broader range of data work. So how does SQL combine with Python?

In this workshop we will use Fugue and DuckDB to practice manipulating data with SQL.

Start here:

🍴Fork this repo!

Setup

Requirements:

  • Python > 3.10 and <= 3.11.3
  • As of October 2023, Python 3.12 does not yet support all the dependencies in requirements.txt; this may change later.
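To confirm that an interpreter falls inside this range before installing anything, a small check like the one below may help. It is only a sketch: the helper name `is_supported` is made up for illustration, and it assumes "> 3.10" means any release after 3.10.0.

```python
import sys

# Assumed reading of the requirement: newer than 3.10.0, no newer than 3.11.3.
MIN_EXCLUSIVE = (3, 10, 0)
MAX_INCLUSIVE = (3, 11, 3)


def is_supported(version=None):
    """Return True if a (major, minor, micro) version is in the assumed range.

    Defaults to the version of the running interpreter.
    """
    major, minor, micro = (version or sys.version_info)[:3]
    return MIN_EXCLUSIVE < (major, minor, micro) <= MAX_INCLUSIVE


if __name__ == "__main__":
    current = ".".join(map(str, sys.version_info[:3]))
    status = "inside" if is_supported() else "outside"
    print(f"Python {current} is {status} the range listed above")
```

Running the file with the interpreter you plan to use prints whether it is inside or outside the range.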

If you use virtual environments, create a new one and activate it. Then install the libraries:

macOS/Linux:

python3 -m pip install -r requirements.txt

Windows:

py -m pip install -r requirements.txt

Alternatively, you can use Gitpod for the workshop:

  • create a Gitpod account (you can authenticate with GitHub)
  • create a new workspace and paste in the link to this repo
  • open an SSH connection with the access token, install Python with pyenv install 3.11.3, set it as the default with pyenv global 3.11.3, then install the libraries with pip install -r requirements.txt
  • open VS Code from Gitpod and follow all the prompts (VS Code must be installed locally). Alternatively, if you are a PyCharm user, open that instead.

On Gitpod, change the Python version first (the libraries don't work with 3.12 yet):

pyenv install 3.11.3
pyenv global 3.11.3

Problem:

Our data science team needs features for predicting customer lifetime value. The features they've asked for are recency, frequency, and monetary value.
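As a rough illustration of what these three features look like in SQL, here is a minimal sketch. It deliberately uses Python's built-in sqlite3 module rather than Fugue or DuckDB so that it runs without any extra installs. The `orders` table, its columns, the sample rows, and the reference date are all invented for the example; the workshop's real data may look different.

```python
import sqlite3

# Hypothetical orders data: (customer_id, order_date, amount).
orders = [
    ("alice", "2023-09-01", 20.0),
    ("alice", "2023-10-10", 35.0),
    ("bob", "2023-08-15", 50.0),
]

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE orders (customer_id TEXT, order_date TEXT, amount REAL)")
con.executemany("INSERT INTO orders VALUES (?, ?, ?)", orders)

# Recency: days between a reference date and the customer's latest order.
# Frequency: number of orders. Monetary value: total amount spent.
RFM_QUERY = """
SELECT
    customer_id,
    CAST(julianday(:as_of) - julianday(MAX(order_date)) AS INTEGER) AS recency_days,
    COUNT(*) AS frequency,
    SUM(amount) AS monetary_value
FROM orders
GROUP BY customer_id
ORDER BY customer_id
"""

rfm = con.execute(RFM_QUERY, {"as_of": "2023-10-31"}).fetchall()
for row in rfm:
    print(row)
```

The aggregation itself (MAX, COUNT, SUM grouped by customer) is standard SQL. The date arithmetic is the part most likely to differ between engines, since `julianday` is specific to SQLite.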

Notebooks: