Analysis of bike rentals in NYC during the summer of 2013. Numerous visualizations.

SavannahWithAnH/CitiBike-Summer

CitiBike-Summer

Please see my completed presentation here.

Getting Started

The dataset is large, with over 1 million rows. To keep the file small enough to publish on GitHub and to keep Tableau responsive, I used Python (pandas) to read in the CSV file and take a random 30% sample for visualization. I then wrote the smaller dataset to a new CSV file, CitiBikeoutput.csv. The process is shown below:

import pandas as pd

# Read the full trip data CSV (July and August 2013)
df = pd.read_csv("./2013-07+08 - Citi Bike trip data.csv")

# Take a random 30% sample; random_state makes the sample reproducible
df_sample = df.sample(frac=0.30, random_state=42)

# Write the sample to a new CSV, leaving out the index column
df_sample.to_csv("CitiBikeoutput.csv", index=False)
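As a quick sanity check on the sampling step, the sketch below wraps the same `df.sample` call in a small helper and runs it on a stand-in DataFrame. The helper name `sample_fraction` and the `trip_id` column are hypothetical and used only for illustration; the real input would be the trip data CSV.

```python
import pandas as pd


def sample_fraction(df, frac=0.30, seed=42):
    """Return a reproducible random sample containing `frac` of the rows."""
    return df.sample(frac=frac, random_state=seed)


# Stand-in data (hypothetical); replace with the DataFrame read from the CSV.
demo = pd.DataFrame({"trip_id": range(1000)})
sample = sample_fraction(demo)

# The sample should hold 30% of the rows, with no row drawn twice.
print(len(sample), len(demo))
print(sample["trip_id"].is_unique)
```

Because the seed is fixed, running the helper again on the same input returns the same rows, which keeps the published CSV and any later re-runs consistent.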

Visualizing the data

The questions I wanted to answer with this data are as follows:

  • Who is cycling?
  • What are the busiest rental hours during the summer months (July and August)?
  • What are the most popular locations to end cycling trips?
  • Which bikes are used the most? (This could help with maintenance and upkeep.)

Additionally, I wanted to visualize all of the stations with their details, both as a snapshot and over a period of time.
Finally, I created dashboards and a story to summarize my findings.
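The visualizations themselves were built in Tableau. For readers who want to explore the same questions directly in pandas, here is a minimal sketch. It is not the method used for the dashboards, and the column names (`starttime`, `end station name`, `bikeid`, `usertype`) are assumptions about the CSV header that should be checked against the actual file. The function names and the small demo frame are hypothetical.

```python
import pandas as pd


def rider_mix(trips):
    """Share of trips by rider type (assumed column: 'usertype')."""
    return trips["usertype"].value_counts(normalize=True)


def busiest_hours(trips):
    """Trips per start hour, busiest first (assumed column: 'starttime')."""
    hours = pd.to_datetime(trips["starttime"]).dt.hour
    return hours.value_counts()


def top_end_stations(trips, n=10):
    """Most common trip destinations (assumed column: 'end station name')."""
    return trips["end station name"].value_counts().head(n)


def most_used_bikes(trips, n=10):
    """Bikes with the most trips (assumed column: 'bikeid')."""
    return trips["bikeid"].value_counts().head(n)


# Made-up rows for illustration only; use pd.read_csv("CitiBikeoutput.csv")
# to run the same functions on the sampled data.
demo = pd.DataFrame({
    "starttime": ["2013-07-01 08:05:00", "2013-07-01 08:40:00", "2013-07-01 17:15:00"],
    "end station name": ["Station A", "Station B", "Station A"],
    "bikeid": [101, 102, 101],
    "usertype": ["Subscriber", "Customer", "Subscriber"],
})

print(rider_mix(demo))
print(busiest_hours(demo))
print(top_end_stations(demo, n=2))
print(most_used_bikes(demo, n=2))
```

Each function returns a pandas Series sorted from most to least frequent, so the results can be exported or plotted as needed.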


Questions?

Please refer to the following:
My LinkedIn Page
My Email Contact
