This GitHub repository contains code and resources for a real-world case study of New York taxi demand prediction using machine learning.

princebari/New-York-Taxi-Demand-Prediction

New-York-Taxi-Demand-Prediction

Introduction

Medallion (yellow) cabs are concentrated in the borough of Manhattan but can be hailed anywhere throughout the five boroughs of New York City with a raised hand or from a taxi stand. In New York City, people use taxis far more often than in most other places. Most New York taxi drivers still pick up passengers on the street rather than taking bookings by phone ahead of time.

The New York City Taxi Demand Prediction project aims to develop accurate predictive models to forecast taxi demand across various areas of the city. By applying machine learning algorithms to historical taxi trip data, the project uncovers the temporal and spatial patterns of taxi demand. Taxi service providers, city planners, and transportation authorities can use this information to make informed decisions and optimize resource allocation. This repository covers the full workflow for taxi demand prediction: data preprocessing, exploratory data analysis (EDA), model training, and evaluation.

About Dataset

The project uses the TLC Trip Record data provided by the New York City Taxi and Limousine Commission (TLC). This dataset contains detailed information about taxi trips, including pickup and drop-off locations, timestamps, and other relevant attributes. The dataset is publicly available and can be downloaded from the New York City TLC website: https://www.nyc.gov/site/tlc/about/tlc-trip-record-data.page

Business Problem

The business problem is to accurately predict taxi demand in different areas of New York City. Accurate demand prediction helps optimize taxi services, enhance service quality, reduce costs, improve traffic management, and aid urban planning, to the benefit of taxi companies, customers, and city authorities.

ML Problem Formulation

  • Time-series forecasting and regression.

  • Predict the number of pickups in the query region and its surrounding regions, given location coordinates (latitude and longitude) and a time.

  • To solve this, we use data collected in Jan - Mar 2015 to predict the pickups in Jan - Mar 2016.
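One common way to treat a forecasting task as regression is to use the most recent pickup counts of a region as input features for predicting the next count. The sketch below illustrates that idea only; the function name `make_lag_features`, the parameter `n_lags`, and the sample counts are hypothetical and are not taken from this repository's code.

```python
from typing import List, Tuple


def make_lag_features(
    counts: List[int], n_lags: int = 5
) -> Tuple[List[List[int]], List[int]]:
    """Turn one region's pickup-count series into (features, target) pairs.

    Each feature row holds the n_lags counts that precede the target count.
    """
    if n_lags < 1:
        raise ValueError("n_lags must be at least 1")
    features: List[List[int]] = []
    targets: List[int] = []
    for t in range(n_lags, len(counts)):
        features.append(counts[t - n_lags:t])  # previous n_lags intervals
        targets.append(counts[t])              # count to predict
    return features, targets


# Made-up counts for a single region, one value per time interval.
counts = [12, 15, 9, 20, 18, 22, 17]
X, y = make_lag_features(counts, n_lags=3)
```

Each row of `X` can then be passed to any regression model, with `y` as the target. Under this framing, rows built from the 2015 data would serve for training and rows built from the 2016 data for evaluation.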

Performance metrics

  • Mean absolute percentage error (MAPE).

  • Mean squared error (MSE).
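Both metrics can be computed in a few lines. The sketch below uses the standard textbook definitions and assumes every actual value is non-zero for MAPE, since the percentage error is undefined otherwise. The function names and sample values are illustrative, and the repository may compute MAPE with a different variant.

```python
from typing import Sequence


def _check_inputs(actual: Sequence[float], predicted: Sequence[float]) -> None:
    """Reject empty or mismatched inputs before computing a metric."""
    if len(actual) == 0 or len(actual) != len(predicted):
        raise ValueError("inputs must be non-empty and of equal length")


def mean_squared_error(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Average of squared differences between actual and predicted values."""
    _check_inputs(actual, predicted)
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mean_absolute_percentage_error(
    actual: Sequence[float], predicted: Sequence[float]
) -> float:
    """Average of |actual - predicted| / |actual|, returned as a fraction."""
    _check_inputs(actual, predicted)
    if any(a == 0 for a in actual):
        # Assumption of this sketch: zero-demand intervals are not handled.
        raise ValueError("MAPE is undefined when an actual value is zero")
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Made-up pickup counts for three intervals.
actual = [100, 200, 50]
predicted = [110, 180, 50]
mse = mean_squared_error(actual, predicted)
mape = mean_absolute_percentage_error(actual, predicted)
```

Intervals with zero pickups need separate handling in practice, for example by dividing the mean absolute error by the mean actual value instead.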

Machine Learning Objective

The machine learning objective is to develop models that accurately predict the number of taxi pickups in a given cluster or region of New York City for each 10-minute time interval.
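The prediction target, pickups per cluster per 10-minute interval, can be derived from raw trip records by counting. The sketch below assumes each pickup has already been assigned a cluster ID; the function `count_pickups`, the constant `BIN_SECONDS`, and the sample records are hypothetical and only illustrate the binning step.

```python
from collections import Counter
from datetime import datetime
from typing import Dict, Iterable, Tuple

BIN_SECONDS = 600  # length of one 10-minute interval


def count_pickups(
    pickups: Iterable[Tuple[int, datetime]], start: datetime
) -> Dict[Tuple[int, int], int]:
    """Count pickups per (cluster_id, bin_index).

    bin_index is the number of whole 10-minute intervals elapsed since start.
    Pickups earlier than start are ignored.
    """
    counts: Counter = Counter()
    for cluster_id, pickup_time in pickups:
        offset = (pickup_time - start).total_seconds()
        if offset < 0:
            continue
        counts[(cluster_id, int(offset // BIN_SECONDS))] += 1
    return dict(counts)


# Made-up (cluster_id, pickup_time) records.
start = datetime(2016, 1, 1)
pickups = [
    (3, datetime(2016, 1, 1, 0, 4)),
    (3, datetime(2016, 1, 1, 0, 9, 59)),
    (3, datetime(2016, 1, 1, 0, 10)),
    (7, datetime(2016, 1, 1, 0, 25)),
]
demand = count_pickups(pickups, start)
```

Here the first two pickups in cluster 3 fall into bin 0, the third into bin 1, and the pickup in cluster 7 into bin 2.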
