Crosstabs #13

Merged
merged 70 commits into from
May 1, 2017
gregmacfarlane
Contributor

This adds functionality for publication-ready cross-tabulations. The function determines, for instance, whether a variable is numeric or character and labels the rows and columns appropriately. The function also lets the user specify whether marginal totals should be appended to the table, or whether the table should be presented as percentages. Weighting the table by a third variable in the data is also supported.

The single and colored plots need to use the same x-axis.
This should close #10; text fields must either have explicit factor levels, or the output falls back to alphabetical order. That is not too bad.
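
As a rough base R illustration of the ordering behaviour described above (this is not the package's code, and the values are made up), a character vector tabulates alphabetically, while a factor with explicit levels keeps the order given:

```r
# Hypothetical text field; the values are illustrative only.
x <- c("Low", "High", "Medium", "Low")

# Without explicit levels, table() converts to a factor with sorted levels.
alpha_order <- names(table(x))

# With explicit levels, the declared order is preserved.
f <- factor(x, levels = c("Low", "Medium", "High"))
level_order <- names(table(f))
```

Declaring levels up front is therefore the way to control row and column order in any table built on top of factors.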
This function is a generic cross-tabulation function like 
`table`, but it contains options to weight the data and 
to place marginals on the sides. It also labels the 
columns descriptively and gives an option to present
the table as percentages. And because the output is a data_frame, it can be wrapped with kable() for print-quality output.
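
For readers unfamiliar with the idea, here is a minimal base R sketch of a weighted cross-tabulation with margins and percentages. It only approximates what the commit describes: the data frame and column names are hypothetical, and the package's own function may work differently.

```r
# Hypothetical data: two categorical fields and a weight column.
df <- data.frame(
  area = c("Urban", "Urban", "Rural", "Rural"),
  type = c("Freeway", "Arterial", "Freeway", "Freeway"),
  w    = c(2, 1, 3, 4)
)

# Weighted counts: xtabs() sums the left-hand side within each cell.
tab <- xtabs(w ~ area + type, data = df)

# Append marginal totals ("Sum") to the rows and columns.
with_margins <- addmargins(tab)

# Express each cell as a percentage of the grand total.
pct <- prop.table(tab) * 100

# Keep the two-way layout as a data frame, ready for a table printer.
out <- as.data.frame.matrix(with_margins)
```

A data frame such as `out` is the kind of object that can then be handed to `kable()` for printing.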

This was migrated from the (somewhat stale) nhtsHelper repository.
This helps avoid NOTEs in R CMD check.
Previously the variable name would be prepended to both the row and column names. This is good when the variable field is ambiguous (1, 2, 3, ...) but bad when the field is already a factor or a character variable. This commit allows the function to determine what is happening independently.
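
One way to picture the labeling rule is the sketch below. `label_levels()` is a hypothetical helper written for illustration, not the function in this pull request: it prepends the variable name only when the values are numeric and otherwise uses the existing level names.

```r
# Hypothetical helper: build row or column labels for one variable.
label_levels <- function(x, var_name) {
  if (is.numeric(x)) {
    # Ambiguous values such as 1, 2, 3 get the variable name prepended.
    paste(var_name, sort(unique(x)))
  } else {
    # Factor or character values are already descriptive.
    levels(factor(x))
  }
}

numeric_labels <- label_levels(c(2, 1, 3), "lanes")
text_labels    <- label_levels(c("Urban", "Rural"), "area")
```

Here `numeric_labels` reads "lanes 1", "lanes 2", "lanes 3", while `text_labels` is just the level names.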
@gregmacfarlane
Contributor Author

This pull request addresses #12

gregmacfarlane and others added 19 commits April 5, 2016 10:45
Can reduce code duplication with a single table function that can handle both flow and RMSE statistics. The user selects which to use with a function argument.

Additionally, restructured it and added a volume_breaks argument so that the user can specify the volume groups.
Add the factored and named area type group to the table.
Use count as x axis on mdd plot
I use these in many other places, so we'll keep them available.
Only need to include the aggregation function in the ifelse tree; everything else can go at the end. This will make fixing and improving simpler later on.
gregmacfarlane and others added 24 commits January 31, 2017 16:48
This new function is simply a wrapper to https://github.com/mattflor/chorddiag so that the user can pass in a table object instead of a matrix.
Update with "count" rather than "volume" as the "group_field" when do…
R2 is actually not an appropriate measurement for this
case. We should use overall PRMSE instead. Close #19
It really doesn't add to interpretation. Addresses #22
The user can specify an ID column, else it will use the row names already on the data frame.
Also remove the entry for it in data.R
Given from, to, and step arguments, generates
a table of volumes and their respective
MDD.
Correct the case_when() equations to give the right results.
Update the default input values.
Replaces static lookup table with exposed dynamic function to calculate MDD curves.
Now supports a regression line and equation.
Also shows link IDs appropriately and handles
a null "id" variable correctly.
Improve plotly_validation()
# Conflicts:
#	DESCRIPTION
#	NAMESPACE
#	R/data.R
#	data-raw/links.R
#	data/links.rda
#	inst/rmarkdown/templates/assignment_validation/skeleton/skeleton.Rmd
#	man/links.Rd
@dkyleward
Contributor

Had to do some manual conflict resolution and regenerate the package documentation using roxygen. I merged locally, but couldn't push without the Travis checks. Instead, I merged the resolutions back into this branch to update the pull request.

Kyle Ward added 3 commits May 1, 2017 14:03
Lost this field during conflict resolution, which led to
errors with the link_measures_table() function.
Added a distance field with a placeholder value.
Took links.rda from an earlier commit on the master branch.
The crosstab function required at least one link have an area
type of "CBD", so I made that change. This is because the field
is defined as a factor with five levels, one of which is "CBD".
@dkyleward dkyleward merged commit d53012f into master May 1, 2017
@dkyleward dkyleward deleted the crosstabs branch May 1, 2017 18:50

2 participants