
k-bosko/cohort_analysis


Cohort Analysis

Cohort analysis tracks user behavior over time and is a stepping stone to calculating retention rates.

Installation

File Descriptions

Because the dataset is large and publicly available, I did not upload it here.

The analysis is available as a Jupyter Notebook here:

Project Description

In this project, I analyzed customer behavior for an online retail store that sells unique all-occasion giftware in the UK.

The dataset consists of 1,067,371 transactions and has the following variables:

| Variable | Description |
| --- | --- |
| InvoiceNo | Invoice number. Nominal. A 6-digit integral number uniquely assigned to each transaction. If this code starts with the letter 'c', it indicates a cancellation. |
| StockCode | Product (item) code. Nominal. A 5-digit integral number uniquely assigned to each distinct product. |
| Description | Product (item) name. Nominal. |
| Quantity | The quantities of each product (item) per transaction. Numeric. |
| InvoiceDate | Invoice date and time. Numeric. The day and time when a transaction was generated. |
| UnitPrice | Unit price. Numeric. Product price per unit in sterling. |
| CustomerID | Customer number. Nominal. A 5-digit integral number uniquely assigned to each customer. |
| Country | Country name. Nominal. The name of the country where a customer resides. |

I created monthly cohorts from data spanning 2009 to 2011, calculated retention rates for each cohort, and visualized them as a heatmap.
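To make the cohort and retention steps concrete, here is a minimal pandas sketch. It is not the notebook's actual code: the function name `retention_table` and the toy `sample` data are hypothetical, and it assumes `InvoiceDate` has already been parsed as a datetime column. It also does not filter out cancellations, which a real analysis may want to handle first.

```python
import pandas as pd


def retention_table(transactions: pd.DataFrame) -> pd.DataFrame:
    """Return a cohort-by-month matrix of retention rates (hypothetical helper)."""
    # Rows without a customer cannot be assigned to a cohort.
    df = transactions.dropna(subset=["CustomerID"]).copy()

    # A customer's cohort is the month of their first purchase.
    first_purchase = df.groupby("CustomerID")["InvoiceDate"].transform("min")
    df["CohortMonth"] = first_purchase.dt.strftime("%Y-%m")

    # Calendar months elapsed between the first purchase and this invoice.
    df["CohortIndex"] = (
        (df["InvoiceDate"].dt.year - first_purchase.dt.year) * 12
        + (df["InvoiceDate"].dt.month - first_purchase.dt.month)
    )

    # Count distinct active customers per cohort and month offset.
    counts = (
        df.groupby(["CohortMonth", "CohortIndex"])["CustomerID"]
        .nunique()
        .unstack("CohortIndex")
    )

    # Column 0 is the cohort size, so dividing by it gives retention rates.
    return counts.div(counts[0], axis=0)


# Toy data (made up for illustration, not taken from the real dataset).
sample = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 3, 3],
    "InvoiceDate": pd.to_datetime([
        "2010-01-05", "2010-02-10", "2010-01-20",
        "2010-03-02", "2010-02-15", "2010-03-01",
    ]),
})
retention = retention_table(sample)
print(retention)
```

Each row of the result is a cohort and each column is the number of months since the first purchase. A matrix like this could then be passed to a plotting function such as `seaborn.heatmap` to produce the kind of heatmap described above.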

Results

Retention Rates

Acknowledgement

This project is part of the "Customer Segmentation in Python" course on DataCamp, taught by Karolis Urbonas, Global Head of Machine Learning and Science at Amazon Web Services (AWS).
