You should create one R script called `run_analysis.R` that does the following:
- Merges the training and the test sets to create one data set.
- Extracts only the measurements on the mean and standard deviation for each measurement.
- Uses descriptive activity names to name the activities in the data set.
- Appropriately labels the data set with descriptive activity names.
- Creates a second, independent tidy data set with the average of each variable for each activity and each subject.
- Writes the tidy data set to disk using `write.table()`.

To run the script:

- Download the data from the source and unzip it. You'll get a `UCI HAR Dataset` folder.
- Put `run_analysis.R` in the parent folder of `UCI HAR Dataset`, then set that folder as your working directory using the `setwd()` function.
- Run `source("run_analysis.R")`; it will generate a file named `tidy_data.txt` in your working directory. The generated file will be in long format.
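
The processing steps the script is asked to perform (merge, select mean and standard deviation columns, label activities, average per subject and activity, write to disk) can be sketched in base R on a toy table. This is a minimal illustration, not the contents of `run_analysis.R`: the data values, the cleaned column names, and the output location in `tempdir()` are all assumptions made for the example, and the sketch stops at a wide table rather than the long format mentioned above.

```r
# Toy stand-ins for the training and test sets (made-up values).
train <- data.frame(subject = c(1, 1, 1), activity = c(1, 1, 2),
                    `tBodyAcc-mean()-X` = c(0.2, 0.4, 0.6),
                    `tBodyAcc-std()-X`  = c(-0.9, -0.7, -0.5),
                    `tBodyAcc-max()-X`  = c(0.5, 0.6, 0.7),
                    check.names = FALSE)
test <- data.frame(subject = c(2, 2), activity = c(1, 2),
                   `tBodyAcc-mean()-X` = c(0.8, 1.0),
                   `tBodyAcc-std()-X`  = c(-0.3, -0.1),
                   `tBodyAcc-max()-X`  = c(0.9, 1.1),
                   check.names = FALSE)

# 1. Merge the training and test sets.
merged <- rbind(train, test)

# 2. Keep the id columns plus the mean() and std() measurements.
keep <- grepl("^(subject|activity)$|mean\\(\\)|std\\(\\)", names(merged))
selected <- merged[, keep]

# 3. Replace activity codes with descriptive names (toy subset of labels).
selected$activity <- factor(selected$activity, levels = c(1, 2),
                            labels = c("WALKING", "WALKING_UPSTAIRS"))

# 4. Tidy up the variable names (one possible convention, assumed here).
names(selected) <- gsub("\\(\\)", "", names(selected))
names(selected) <- gsub("-", "_", names(selected))

# 5. Average every variable for each subject and each activity.
tidy <- aggregate(. ~ subject + activity, data = selected, FUN = mean)

# 6. Write the result to disk (a temp file here, to avoid clobbering real output).
out_file <- file.path(tempdir(), "tidy_data.txt")
write.table(tidy, out_file, row.names = FALSE)
```

`aggregate()` is used here only to keep the sketch free of package dependencies; the actual script may compute the averages differently.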

`run_analysis.R` depends on the `reshape2` package to convert wide-format data to long format. Alternatively, you can use the `tidyr` package to do the same.
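
As an illustration of the wide-to-long conversion, the sketch below reshapes a small made-up table with `reshape2::melt()`, falls back to `tidyr::pivot_longer()` if only `tidyr` is installed, and otherwise uses plain base R. The helper name `to_long`, the toy data, and the output column names `feature` and `average` are hypothetical and not taken from `run_analysis.R`.

```r
# Toy wide table: one row per subject/activity, one column per measurement.
wide <- data.frame(
  subject  = c(1, 1, 2),
  activity = c("WALKING", "SITTING", "WALKING"),
  tBodyAcc_mean_X = c(0.28, 0.27, 0.29),
  tBodyAcc_std_X  = c(-0.98, -0.99, -0.97)
)

# Hypothetical helper: stack all measurement columns into feature/average pairs.
to_long <- function(wide, id_vars = c("subject", "activity")) {
  if (requireNamespace("reshape2", quietly = TRUE)) {
    reshape2::melt(wide, id.vars = id_vars,
                   variable.name = "feature", value.name = "average")
  } else if (requireNamespace("tidyr", quietly = TRUE)) {
    tidyr::pivot_longer(wide, cols = -c(subject, activity),
                        names_to = "feature", values_to = "average")
  } else {
    # Base R fallback: repeat the id columns once per measurement column.
    measure <- setdiff(names(wide), id_vars)
    data.frame(
      subject  = rep(wide$subject, times = length(measure)),
      activity = rep(wide$activity, times = length(measure)),
      feature  = rep(measure, each = nrow(wide)),
      average  = unlist(wide[measure], use.names = FALSE)
    )
  }
}

long <- to_long(wide)
print(long)
```

Each of the three branches yields one row per subject, activity, and measurement, although row order and column types (factor versus character for `feature`) differ between them.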

## About the Code Book
The `CodeBook.md` file explains the transformations performed and the resulting data and variables.