Practicum by Yandex Project 2: This Exploratory Data Analysis (EDA) project analyzes Crankshaft List car listings to determine which factors influence the price of a vehicle.

chuksoo/vehicle_price_analytics

Research on car sales ads

Chukwuemeka Okoli
Practicum by Yandex Project 2
April 16, 2021

Project description
You're an analyst at Crankshaft List. Hundreds of free advertisements for vehicles are published on your site every day.

You need to study data collected over the last few years and determine which factors influence the price of a vehicle.

Guiding Question
What factors influence the price of a vehicle?

Table of contents


Objectives

The objectives of this project are to:
  • Determine which factors influence the price of a vehicle.
  • Apply Exploratory Data Analysis to a real-life analytical case study.

Data Source

Description of the data

The dataset contains the following fields:

  • price
  • model_year
  • model
  • condition
  • cylinders
  • fuel — gas, diesel, etc.
  • odometer — the vehicle's mileage when the ad was published
  • transmission
  • paint_color
  • is_4wd — whether the vehicle has 4-wheel drive (Boolean type)
  • date_posted — the date the ad was published
  • days_listed — from publication to removal

Technology Used

  • Python
  • Jupyter Notebook
  • Pandas
  • NumPy
  • Matplotlib
  • Seaborn
  • Plotly

Structure of Notebook

  1. Open the data file and study the general information
  2. Data preprocessing
    • Processing missing values
    • Data type replacement
  3. Make calculations and add them to the table
  4. Carry out exploratory data analysis
  5. Overall Conclusion
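
The preprocessing and calculation steps (2 and 3) might look roughly like the sketch below. It is illustrative only and is not taken from the notebook: the sample rows are made up, the `vehicle_age` column name is hypothetical, and treating a missing `is_4wd` value as "no 4-wheel drive" is an assumption.

```python
import pandas as pd

# Tiny made-up sample using field names from the data description (not the real data).
ads = pd.DataFrame({
    "price": [9400, 25500, 5500],
    "model_year": [2011.0, None, 2013.0],
    "is_4wd": [1.0, None, None],
    "date_posted": ["2018-06-23", "2018-10-19", "2019-02-07"],
})

# Assumption: a missing is_4wd flag means the vehicle has no 4-wheel drive.
ads["is_4wd"] = ads["is_4wd"].fillna(0).astype(bool)

# Data type replacement: convert the posting date from text to datetime.
ads["date_posted"] = pd.to_datetime(ads["date_posted"], format="%Y-%m-%d")

# New feature (hypothetical name): vehicle age in years when the ad was placed.
# The result stays NaN wherever model_year is missing.
ads["vehicle_age"] = ads["date_posted"].dt.year - ads["model_year"]

print(ads)
```

Leaving `vehicle_age` as NaN for rows without a `model_year` keeps the gap visible instead of hiding it behind a guessed value; whether to fill or drop such rows is a separate decision.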

Executive Summary

Introduction

As a data scientist working for a car listing site, how do you determine which factors influence vehicle price? The answer is to look at the data. In this project, Crankshaft List, a car listing company, publishes free advertisements every day and hopes you can apply your analytical skills to data collected over the last few years to support business decision-making. The goal is to determine which factors influence the price of a vehicle.

Methods

I first inspected the data with the pandas library to obtain general information about it. I processed the missing values and converted columns to the appropriate data types. I then made calculations and added new features to the data. I investigated the following parameters: price, the vehicle's age when the ad was placed, mileage, number of cylinders, and condition. I plotted a histogram for each of these parameters. Before analyzing the data, I determined the upper limits for outliers and removed the outliers. I used the filtered data to plot new histograms and compared them with the earlier ones. I then studied how many days advertisements were displayed (days_listed), plotting a histogram and calculating summary statistics to describe the typical lifetime of an ad. From these I determined when ads were removed quickly and when they were listed for an abnormally long time.
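
As an illustration of the outlier step, here is a minimal sketch that assumes the common 1.5 × IQR rule for the upper limit. The summary does not state which rule the notebook actually uses, so the rule, the `upper_limit` helper, and the sample values are all assumptions.

```python
import pandas as pd


def upper_limit(series: pd.Series) -> float:
    """Upper outlier bound using the 1.5 * IQR rule (an assumed choice)."""
    q1 = series.quantile(0.25)
    q3 = series.quantile(0.75)
    return q3 + 1.5 * (q3 - q1)


# Made-up sample with one obviously extreme price.
ads = pd.DataFrame({
    "price": [4000, 5000, 6000, 7000, 8000, 9000, 10000, 250000],
    "odometer": [120000, 90000, 150000, 60000, 110000, 80000, 70000, 95000],
})

# Keep only rows at or below the upper limit for price.
limit = upper_limit(ads["price"])
filtered = ads[ads["price"] <= limit]

print(f"upper limit: {limit}, rows kept: {len(filtered)} of {len(ads)}")
```

The same helper could be applied to each parameter in turn (for example `odometer`), and histograms of `ads` and `filtered` compared side by side.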

I then analyzed the number of ads and the average price for each type of vehicle. I studied whether the price depends on age, mileage, condition, transmission type, and color. I plotted box-and-whisker charts for the categorical variables and created scatterplots for the rest, using the Matplotlib and Seaborn libraries. Analyzing the data was important in answering some of the business needs.
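
The per-type summary and the price dependence check could be sketched as follows. This is a hedged example, not the notebook's code: the `type` column does not appear in the field list above and is assumed here, `vehicle_age` is a hypothetical derived column, and the rows are invented.

```python
import pandas as pd

# Made-up sample; "type" and "vehicle_age" are assumed column names.
ads = pd.DataFrame({
    "type": ["SUV", "sedan", "SUV", "truck", "sedan", "SUV"],
    "price": [12000, 6000, 15000, 20000, 8000, 9000],
    "vehicle_age": [6, 10, 4, 3, 8, 9],
})

# Number of ads and average price for each vehicle type, most common first.
summary = (
    ads.groupby("type")["price"]
    .agg(ads_count="count", mean_price="mean")
    .sort_values("ads_count", ascending=False)
)

# Pearson correlation between price and age as a first numeric check.
age_price_corr = ads["price"].corr(ads["vehicle_age"])

print(summary)
print(f"price vs. age correlation: {age_price_corr:.2f}")
```

A correlation coefficient only captures linear association, so it complements the scatterplots and box plots rather than replacing them.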

Key Findings

Deployment and Application

Future Development

Accomplishments
