CitiBike-Summer

Please see my completed presentation here.

Getting Started

This dataset is large, with over 1 million rows. To keep the files small enough to publish on GitHub and to keep Tableau responsive, I used Python (pandas) to read in the CSV file and take a random 30% sample for visualization. I then wrote the smaller dataset to a new CSV file named CitiBikeoutput.csv. See the process below:

import pandas as pd

# Read in the full trip data (July and August 2013)
df = pd.read_csv("./2013-07+08 - Citi Bike trip data.csv")

# Take a random 30% sample; random_state makes the sample reproducible
df_sample = df.sample(frac=0.30, random_state=42)

# Write the sample to a new CSV without the index column
df_sample.to_csv("CitiBikeoutput.csv", index=False)
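
As a quick sanity check on the sampling step, the sketch below shows what frac and random_state do. It uses a made-up 1,000-row DataFrame rather than the real trip file, so the data and the single column in it are purely illustrative.

```python
import pandas as pd

# Hypothetical stand-in for the trip data: 1,000 rows, one made-up column.
df = pd.DataFrame({"bikeid": range(1000)})

# Draw the same 30% sample twice with the same seed.
first = df.sample(frac=0.30, random_state=42)
second = df.sample(frac=0.30, random_state=42)

print(len(first))            # number of rows in the sample
print(first.equals(second))  # whether both draws selected identical rows
```

With a fixed random_state, rerunning the script should select the same rows, which keeps the Tableau workbook and the published CSV in step.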

Visualizing the data

The questions I wanted to answer with this data are as follows:

Who is cycling?
What are the busiest rental hours during the summer months (July and August)?
What are the most popular locations to end cycling trips?
What bikes are utilized the most? (This could help with maintenance and upkeep)

Additionally, I wanted to visualize all of the stations, with their details, over a period of time.
Finally, I created dashboards and a story to summarize my findings.
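
The visualizations themselves were built in Tableau, but the questions above can also be spot-checked in pandas. The following is a minimal sketch on a tiny made-up table, not the analysis behind the dashboards. The column names (starttime, end station name, bikeid, usertype) are assumptions about the CSV layout, and the station names and bike IDs are invented.

```python
import pandas as pd

# Tiny made-up sample; column names are assumed, values are invented.
trips = pd.DataFrame({
    "starttime": [
        "2013-07-01 08:05:00",
        "2013-07-01 08:40:00",
        "2013-07-01 17:15:00",
        "2013-08-02 17:30:00",
        "2013-08-02 17:55:00",
    ],
    "end station name": ["Station A", "Station B", "Station A", "Station A", "Station C"],
    "bikeid": [101, 102, 101, 103, 101],
    "usertype": ["Subscriber", "Customer", "Subscriber", "Subscriber", "Customer"],
})
trips["starttime"] = pd.to_datetime(trips["starttime"])

# Who is cycling? Count trips per rider type.
riders_by_type = trips["usertype"].value_counts()

# Busiest rental hours: count trips by the hour the trip started.
rides_by_hour = trips["starttime"].dt.hour.value_counts().sort_index()

# Most popular locations to end a trip.
top_end_stations = trips["end station name"].value_counts()

# Most-used bikes, e.g. to prioritize maintenance.
rides_per_bike = trips["bikeid"].value_counts()

print(riders_by_type, rides_by_hour, top_end_stations, rides_per_bike, sep="\n\n")
```

To try this on the sampled data, replace the made-up table with pd.read_csv("CitiBikeoutput.csv") and adjust the column names to match the file.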

Questions?

Please refer to the following:
My LinkedIn Page
My Email Contact
