rohitgotecha/BikeSharingAssignment

Bike Sharing Assignment

A bike-sharing system is a service in which bikes are made available to individuals for shared use on a short-term basis, either for a price or for free. Many bike-share systems let people borrow a bike from a "dock", which is usually computer-controlled: the user enters payment information and the system unlocks the bike. The bike can then be returned to another dock belonging to the same system.

BoomBikes, a US bike-sharing provider, has recently suffered considerable dips in revenue due to the ongoing Corona pandemic. The company is finding it very difficult to sustain itself in the current market. It has therefore decided to come up with a mindful business plan to accelerate its revenue as soon as the ongoing lockdown ends and the economy returns to a healthy state.

To that end, BoomBikes wants to understand the demand for shared bikes once the nationwide Covid-19 quarantine ends. The aim is to be ready to cater to people's needs when the situation improves, to stand out from other service providers, and to make huge profits.

They have contracted a consulting company to understand the factors on which the demand for these shared bikes depends. Specifically, they want to understand the factors affecting the demand for these shared bikes in the American market. The company wants to know:

  • Which variables are significant in predicting the demand for shared bikes.
  • How well those variables describe the bike demands.

Based on various meteorological surveys and people's styles, the service provider firm has gathered a large dataset on daily bike demands across the American market based on some factors.

Business Goal:

You are required to model the demand for shared bikes using the available independent variables. Management will use the model to understand exactly how demand varies with the different features, so that it can adjust the business strategy to meet demand levels and customers' expectations. The model will also be a good way for management to understand the demand dynamics of a new market.

Table of Contents

General Information

  • General information about the project: A bike-sharing system is a service that allows people to check out bikes for free or at a cost for short-term shared use. Many bike-share programs let users check out bikes from "docks", which are often computerized locations where users enter payment details and the system unlocks the bike. The bike can then be returned to any other dock in the system.

  • Background of project: The company wants to understand the demand for shared bikes once the country's current Covid-19-related quarantine ends. The aim is to be ready to meet people's needs when things improve, to differentiate itself from other service providers, and to make significant profits.

  • Business problem: The company needs a strategic solution because the ongoing pandemic is causing significant revenue decreases.

  • Dataset: Some variables in the dataset, such as 'weathersit' and 'season', take the values 1, 2, 3, 4, each of which has a specific label (see the data dictionary). Numeric codes like these suggest that the categories are ordered, which is not the case, so it is advisable to convert them into categorical string values before model building. Please refer to the data dictionary for a better understanding of all the independent variables. Note also the column 'yr', whose two values 0 and 1 indicate the years 2018 and 2019 respectively. At first it may seem reasonable to drop this column because it has only two values and might not add much to the model. In reality, bike-sharing systems are slowly gaining popularity and demand is increasing every year, so 'yr' might be a good variable for prediction. Think twice before dropping it.
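
The conversion described in the Dataset note can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the label mappings are assumptions that should be checked against the data dictionary, the helper name `encode_categoricals` is hypothetical, and the sample rows are made-up values used only to show the shape of the result.

```python
import pandas as pd

# Assumed label mappings; verify them against the data dictionary.
SEASON_LABELS = {1: "spring", 2: "summer", 3: "fall", 4: "winter"}
WEATHER_LABELS = {1: "clear", 2: "mist", 3: "light_precip", 4: "heavy_precip"}


def encode_categoricals(df):
    """Replace numeric codes with string labels, then one-hot encode them."""
    out = df.copy()
    out["season"] = out["season"].map(SEASON_LABELS)
    out["weathersit"] = out["weathersit"].map(WEATHER_LABELS)
    # drop_first=True removes one level per variable so the dummy columns
    # are not perfectly collinear with the regression intercept.
    return pd.get_dummies(
        out, columns=["season", "weathersit"], drop_first=True, dtype=int
    )


# Made-up rows, for illustration only (not taken from the real dataset).
sample = pd.DataFrame(
    {
        "season": [1, 2, 3, 4],
        "weathersit": [1, 2, 3, 1],
        "yr": [0, 0, 1, 1],
        "cnt": [100, 200, 300, 400],
    }
)
encoded = encode_categoricals(sample)
print(encoded.columns.tolist())
```

The binary 'yr' column is left untouched: it is already a 0/1 indicator, so it can go into the model as-is.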

Conclusions

  • The R² values of the linear regression model on the training and test sets are close (0.847 vs. 0.804 respectively), which indicates that the model generalizes well. It is likely to perform consistently on fresh, unseen data and does not overfit the training set.
  • Factors like year, workingday, temp, hum, windspeed, summer, winter, September, and Sunday affect the demand for bikes.
  • The three features with the greatest coefficient values (temp, yr, and winter) have a considerable influence on demand.
  • The RMSE values on the training and test sets are close, which shows that the model fits the training data well and generalizes to new, unseen data with a negligible performance gap.
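
The two metrics compared in the conclusions can be computed as in the sketch below. It uses plain Python rather than the project's libraries so that the formulas are visible. The function names `r_squared` and `rmse` and the toy numbers are illustrative assumptions, not values from the project.

```python
import math


def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def rmse(y_true, y_pred):
    """Root mean squared error, in the same units as the target."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Toy targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.1, 1.9, 3.2, 3.8]

print(round(r_squared(y_true, y_pred), 3))
print(round(rmse(y_true, y_pred), 3))
```

Computing both metrics separately on the training and test predictions gives the comparison described above: a small gap between the two sets suggests that the model is not overfitting.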

Technologies Used

Language

  • Python

Numerical Analysis and Data Analysis

  • NumPy
  • Pandas

Data Visualization

  • Seaborn
  • Matplotlib (pyplot)

Regression libraries

  • scikit-learn (sklearn)
  • statsmodels

Acknowledgements

  • The upGrad live session on the industry relevance of linear regression models served as the inspiration for this project.
  • Linear regression tutorials on the upGrad learning platform.

Contact

Created by [@rohitgotecha] - feel free to contact me!
