[Group 11] Boston Housing Price

Groups

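曾偉綱, Computer Science, second-year master's student, 108753122
盧禹叡, Economics, second-year master's student, 109258026
張修誠, Computer Science, first-year master's student, 110753165
邱顯安, Computer Science, first-year master's student, 110753110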

Goal

Predict Boston housing prices (the `medv` target) from the Kaggle Boston Housing data.

Demo

Command to reproduce our results

Rscript performance.R --fold <k> --train data/training --test data/test --report results/performance.csv --predict result/predict.csv
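
The source of `performance.R` is not reproduced here, so the following is only a minimal sketch of how `--flag value` pairs like the ones above could be read in base R. The helper name `parse_args` is hypothetical, and the sketch assumes every flag is followed by exactly one value.

```r
# Hypothetical helper (not taken from performance.R): turn a vector such as
# c("--fold", "5", "--train", "data/training") into a named list.
parse_args <- function(args) {
  stopifnot(length(args) %% 2 == 0)          # flags and values come in pairs
  idx    <- seq_len(length(args) %/% 2)
  flags  <- args[2 * idx - 1]
  values <- args[2 * idx]
  stopifnot(all(startsWith(flags, "--")))    # every odd element must be a flag
  setNames(as.list(values), sub("^--", "", flags))
}

# Inside a script run with Rscript, the vector would come from
# commandArgs(trailingOnly = TRUE); a fixed example is used here instead.
example_args <- c("--fold", "5", "--train", "data/training", "--test", "data/test")
opts <- parse_args(example_args)
k <- as.integer(opts$fold)                   # values arrive as strings
```

Values are kept as character strings, so numeric options such as the fold count need an explicit conversion.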

Shiny app on shinyapps.io: https://brianchiu.shinyapps.io/finalproject/

Folder organization and its related information

docs

Your presentation, 1101_datascience_FP_<yourID|groupName>.ppt/pptx/pdf, by Jan. 13
Any related document for the final project
- papers
- software user guide

data

From the Kaggle API: $ kaggle competitions download boston-housing
Input format: CSV
Preprocessing
- Check for missing values
```
colSums(is.na(data))
```
No missing values were found.
- Check for outliers with box plots and remove them
- Check skewness and correct it
- Plot a correlation heat map between features
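
The code for the outlier, skewness, and correlation steps is not shown in this README. As a rough sketch of what they might look like in base R, the example below uses a small synthetic data frame; the column choice, the 1.5 * IQR rule from `boxplot.stats()`, and the `log1p()` transform are all assumptions, not the project's actual settings.

```r
# Toy data standing in for the training CSV (values are synthetic).
set.seed(1)
data <- data.frame(
  crim = rexp(200, rate = 0.3),              # deliberately right-skewed
  rm   = rnorm(200, mean = 6, sd = 0.7),
  medv = rnorm(200, mean = 22, sd = 8)
)

# 1. Outliers: boxplot.stats() returns the points a box plot would draw
#    beyond the whiskers (1.5 * IQR from the hinges by default).
out_rm <- boxplot.stats(data$rm)$out
clean  <- data[!data$rm %in% out_rm, ]

# 2. Skewness: third standardized moment, before and after a log transform.
skewness    <- function(x) mean((x - mean(x))^3) / sd(x)^3
skew_before <- skewness(clean$crim)
clean$crim  <- log1p(clean$crim)
skew_after  <- skewness(clean$crim)

# 3. Correlation matrix: the numbers behind a correlation heat map.
cor_mat <- cor(clean)
# heatmap(cor_mat, symm = TRUE)  # uncomment to draw it
```

Whether to drop or cap outliers, and which features to transform, depends on the data and is a judgment call.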

code

Which method do you use?

We use a random forest model for prediction.

train_control <- trainControl(method = "none")

model <- train(medv~., data = train_data, method = "rf", trControl = train_control)

What is a null model for comparison?
- We compare our model against k-nearest neighbors (KNN) and linear regression, each trained the same way:

model <- train(medv~., data = train_data, method = "knn", trControl = train_control)
model <- train(medv~., data = train_data, method = "lm", trControl = train_control)
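
For reference, a null model in the strict sense ignores all features and predicts the training-set mean of `medv` for every row. The sketch below is an illustration on synthetic data, not part of the project's code; it compares such a baseline with a plain `lm()` fit using RMSE, and the toy relationship between `rm` and `medv` is made up.

```r
# Synthetic stand-in for the housing data (coefficients are invented).
set.seed(42)
n   <- 100
toy <- data.frame(rm = rnorm(n, mean = 6, sd = 0.7))
toy$medv <- 9 * toy$rm - 32 + rnorm(n, sd = 3)

# Simple hold-out split.
train_idx  <- sample(n, 70)
train_data <- toy[train_idx, ]
test_data  <- toy[-train_idx, ]

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Null model: always predict the training mean.
null_pred <- rep(mean(train_data$medv), nrow(test_data))
null_rmse <- rmse(test_data$medv, null_pred)

# A fitted model should beat the null model if the features carry signal.
fit     <- lm(medv ~ rm, data = train_data)
lm_rmse <- rmse(test_data$medv, predict(fit, newdata = test_data))
```

Any model whose RMSE is not clearly below `null_rmse` is learning little from the features.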

How do you perform evaluation? i.e., cross-validation, or an additional independent data set
- We use cross-validation to evaluate our performance, and we also check our predictions on Kaggle's additional independent data set.
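
The cross-validation described above can be sketched by hand in base R. This is an assumption about the general shape of the procedure, not the project's script: it uses synthetic data and swaps in `lm()` for the random forest so that the example has no package dependencies.

```r
# Synthetic stand-in for the training data (coefficients are invented).
set.seed(7)
n   <- 120
toy <- data.frame(rm = rnorm(n, 6, 0.7), lstat = runif(n, 2, 30))
toy$medv <- 9 * toy$rm - 0.5 * toy$lstat - 25 + rnorm(n, sd = 3)

k       <- 5                                          # plays the role of --fold <k>
fold_id <- sample(rep(seq_len(k), length.out = n))    # random, balanced folds

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))

# Train on k - 1 folds, validate on the held-out fold, repeat for each fold.
fold_rmse <- sapply(seq_len(k), function(i) {
  train_data <- toy[fold_id != i, ]
  valid_data <- toy[fold_id == i, ]
  fit <- lm(medv ~ ., data = train_data)
  rmse(valid_data$medv, predict(fit, newdata = valid_data))
})

cv_rmse <- mean(fold_rmse)                            # average validation RMSE
```

Replacing the `lm()` call with another learner leaves the rest of the loop unchanged, which makes it easy to compare models on identical folds.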

results

Which metric do you use?
- We use RMSE (root mean squared error) as our metric.
Is your improvement significant?
- Yes, we improve the RMSE from 3.45 to 3.41.
What is the challenging part of your project?
- Choosing the most appropriate features

References

Packages you use

library(caret)
library(randomForest)
library(ggvis)
library(shiny)

Related publications
- Kaggle
  - https://www.kaggle.com/c/boston-housing/overview
- Shiny template
  - https://shiny.rstudio.com/gallery/
- Package documentation
  - https://www.rdocumentation.org/packages/randomForest/versions/4.6-14/topics/importance
