Getting-and-Cleaning-Data

The "Data Science" Specialization

Course Instructions:

You should create one R script called run_analysis.R that does the following.
Merges the training and the test sets to create one data set.
Extracts only the measurements on the mean and standard deviation for each measurement.
Uses descriptive activity names to name the activities in the data set.
Appropriately labels the data set with descriptive activity names.
Creates a second, independent tidy data set with the average of each variable for each activity and each subject.

Documentation for run_analysis.R

Initial Housekeeping

Data labels are read into R and scrubbed according to tidy data standards.
Subject vectors are read into R and scrubbed.
Data sets for X (raw and calculated values) and Y (activity) are read into R.
Subject vectors are column-bound (cbind) to data set X for both the test and training sets.
The activity vector (Y) is column-bound to data set X for both the test and training sets.
rbind is used to "merge" the two data sets. This was chosen for simplicity, since a true merge was unnecessary and would have been rather costly in this case.
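
The housekeeping steps above can be sketched roughly as follows. This is a minimal illustration, not the contents of run_analysis.R: the small data frames are made-up stand-ins for the files in "UCI HAR Dataset", and the names `raw_labels`, `x_test`, `x_train`, `test`, `train`, and `merged` are assumptions. The scrubbing rule (lower case, strip punctuation) is inferred from sample column names such as tbodyaccmeanx.

```r
# Toy stand-ins for the label file; the real script reads these from disk.
raw_labels <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")

# Scrub labels: lower case, then drop everything that is not a letter or digit.
features <- gsub("[^a-z0-9]", "", tolower(raw_labels))

# Hypothetical measurements for the test and training sets (2 rows each).
x_test  <- data.frame(matrix(c(0.1, 0.2, 0.3, 0.4, 0.5, 0.6), nrow = 2, byrow = TRUE))
x_train <- data.frame(matrix(c(0.7, 0.8, 0.9, 1.0, 1.1, 1.2), nrow = 2, byrow = TRUE))
names(x_test)  <- features
names(x_train) <- features

# Column-bind the subject and activity vectors to each measurement set.
test  <- cbind(subject = c(2, 4), activity = c(5, 1), x_test)
train <- cbind(subject = c(1, 3), activity = c(4, 2), x_train)

# Stack the two sets; both share the same columns, so no key-based merge is needed.
merged <- rbind(train, test)
```

Because the test and training sets have identical columns and no shared rows, stacking them with rbind yields the combined data set without the key matching that merge() would perform.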

Part 1:

Grep the features vector for the text "mean" or "std".
Use the filtered features vector to select the matching columns of the merged data set.
Output this data frame to a tab-separated .txt file.
Included in GitHub is the head of this output, containing 1000 rows of data (part1_output_head.txt).
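
A rough sketch of the Part 1 steps, assuming a merged data frame whose column names have already been scrubbed. The toy data, the variable names `keep`, `part1`, and `out_file`, and the temporary output path are illustrative assumptions; the sketch keeps only the matching measurement columns.

```r
# Hypothetical merged data frame with scrubbed column names.
merged <- data.frame(
  subject       = c(1, 1, 2),
  activity      = c(1, 2, 1),
  tbodyaccmeanx = c(0.1, 0.2, 0.3),
  tbodyaccstdx  = c(0.4, 0.5, 0.6),
  tbodyaccmaxx  = c(0.7, 0.8, 0.9)
)

# Find the column names that contain "mean" or "std".
keep <- grep("mean|std", names(merged), value = TRUE)

# Select only those columns from the merged data set.
part1 <- merged[, keep, drop = FALSE]

# Write a tab-separated text file; a temporary path is used for illustration.
out_file <- tempfile(fileext = ".txt")
write.table(part1, out_file, sep = "\t", row.names = FALSE, quote = FALSE)
```

The file can be read back with `read.delim(out_file)` to check the result.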

File details of part1_output.txt :

size: 10.1 MB
rows: 10299
cols: 86
sample col names: tbodyaccmeanx, tbodyaccstdx, etc.
only part1_output_head.txt is included due to size limitations (100 lines)

Part 2:

Using the original merged data frame, split on subject number and then calculate the average of each variable.
Output this data frame to a tab-separated .txt file.
Included in GitHub is the output file (part2_output.txt).
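
The split-and-average step might look like the following sketch. The toy data frame and the names `measure_cols`, `by_subject`, `part2`, and `out_file` are assumptions made for illustration; run_analysis.R may compute the averages differently.

```r
# Hypothetical merged data frame: two subjects, two measurement columns.
merged <- data.frame(
  subject       = c(1, 1, 2, 2),
  tbodyaccmeanx = c(0.25, 0.75, 1, 3),
  tbodyaccmeany = c(2, 4, 6, 8)
)

# Split the measurement columns into one data frame per subject.
measure_cols <- setdiff(names(merged), "subject")
by_subject   <- split(merged[measure_cols], merged$subject)

# Average every column within each subject; one row per subject in the result.
part2 <- do.call(rbind, lapply(by_subject, colMeans))

# Write a tab-separated text file; a temporary path is used for illustration.
out_file <- tempfile(fileext = ".txt")
write.table(part2, out_file, sep = "\t", quote = FALSE)
```

Here `part2` is a numeric matrix whose row names are the subject numbers, so the real data set with 30 subjects would yield 30 rows.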

File details of part2_output.txt :

size: 321.9 KB
rows: 30
cols: 561
sample col names: tbodyaccmeanx, tbodyaccmeany, etc.
