Merge branch 'main' of https://github.com/jospueyo/janitor
jospueyo committed Feb 2, 2024
2 parents 279f796 + 4ea9f7a commit 723d3ff
Showing 54 changed files with 540 additions and 338 deletions.
2 changes: 1 addition & 1 deletion DESCRIPTION
@@ -53,4 +53,4 @@ Config/testthat/edition: 3
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.2.3
RoxygenNote: 7.3.0
2 changes: 1 addition & 1 deletion NEWS.md
@@ -28,7 +28,7 @@ These are all minor breaking changes resulting from enhancements and are not exp

* Remove dplyr verbs superseded in dplyr 1.0.0 (#547, @olivroy)

* Restyle the package and vignettes according to the [tidyverse style guide](style.tidyverse.org) (#548, olivroy)
* Restyle the package and vignettes according to the [tidyverse style guide](https://style.tidyverse.org) (#548, olivroy)

# janitor 2.2.0 (2023-02-02)

25 changes: 18 additions & 7 deletions R/adorn_ns.R
@@ -1,14 +1,25 @@
#' Add underlying Ns to a tabyl displaying percentages.
#'
#' This function adds back the underlying Ns to a `tabyl` whose percentages were calculated using `adorn_percentages()`, to display the Ns and percentages together. You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#' This function adds back the underlying Ns to a `tabyl` whose percentages were
#' calculated using [adorn_percentages()], to display the Ns and percentages together.
#' You can also call it on a non-tabyl data.frame to which you wish to append Ns.
#'
#' @param dat a data.frame of class `tabyl` that has had `adorn_percentages` and/or `adorn_pct_formatting` called on it. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param position should the N go in the front, or in the rear, of the percentage?
#' @param ns the Ns to append. The default is the "core" attribute of the input tabyl `dat`, where the original Ns of a two-way `tabyl` are stored. However, if your Ns are stored somewhere else, or you need to customize them beyond what can be done with `format_func`, you can supply them here.
#' @param format_func a formatting function to run on the Ns. Consider defining with [base::format()].
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all columns are adorned except for the first column and columns not of class `numeric`, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#' @param dat A data.frame of class `tabyl` that has had `adorn_percentages` and/or
#' `adorn_pct_formatting` called on it. If given a list of data.frames,
#' this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param position Should the N go in the front, or in the rear, of the percentage?
#' @param ns The Ns to append. The default is the "core" attribute of the input tabyl
#' `dat`, where the original Ns of a two-way `tabyl` are stored. However, if your Ns
#' are stored somewhere else, or you need to customize them beyond what can be done
#' with `format_func`, you can supply them here.
#' @param format_func A formatting function to run on the Ns. Consider defining
#' with [base::format()].
#' @param ... Columns to adorn. This takes a tidyselect specification. By default,
#' all columns are adorned except for the first column and columns not of class
#' `numeric`, but this allows you to manually specify which columns should be adorned,
#' for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return a data.frame with Ns appended
#' @return A `data.frame` with Ns appended
#' @export
#' @examples
#' mtcars %>%
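For orientation, the workflow described in the `adorn_ns()` documentation above can be sketched as follows. This is a minimal illustration and not part of the commit; it assumes the janitor package is installed and uses the built-in `mtcars` data. `position = "front"` is one of the two placements the `position` parameter describes.

```r
library(janitor)

# Build a two-way tabyl, convert the counts to column percentages,
# format them as percent strings, then add the underlying Ns back.
out <- mtcars %>%
  tabyl(am, cyl) %>%
  adorn_percentages("col") %>%
  adorn_pct_formatting() %>%
  adorn_ns(position = "front")

out
```

After formatting, the adorned columns are character strings that combine the N with the percentage, so this is a display step rather than something to compute on further.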
20 changes: 14 additions & 6 deletions R/adorn_percentages.R
@@ -1,13 +1,21 @@
#' Convert a data.frame of counts to percentages.
#'
#' This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to adorn in the `...` argument.
#' This function defaults to excluding the first column of the input data.frame,
#' assuming that it contains a descriptive variable, but this can be overridden
#' by specifying the columns to adorn in the `...` argument.
#'
#' @param dat a `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param denominator the direction to use for calculating percentages. One of "row", "col", or "all".
#' @param na.rm should missing values (including NaN) be omitted from the calculations?
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#' @param dat A `tabyl` or other data.frame with a tabyl-like layout.
#' If given a list of data.frames, this function will apply itself to each
#' `data.frame` in the list (designed for 3-way `tabyl` lists).
#' @param denominator The direction to use for calculating percentages.
#' One of "row", "col", or "all".
#' @param na.rm should missing values (including `NaN`) be omitted from the calculations?
#' @param ... columns to adorn. This takes a <[`tidy-select`][dplyr::dplyr_tidy_select]>
#' specification. By default, all numeric columns (besides the initial column, if numeric)
#' are adorned, but this allows you to manually specify which columns should
#' be adorned, for use on a `data.frame` that does not result from a call to [tabyl()].
#'
#' @return Returns a data.frame of percentages, expressed as numeric values between 0 and 1.
#' @return A `data.frame` of percentages, expressed as numeric values between 0 and 1.
#' @export
#' @examples
#'
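The `denominator` behaviour documented above can be checked with a short example. This is a sketch that is not part of the commit; it assumes janitor is installed, and the variable names `counts`, `pct`, and `row_totals` are illustrative only.

```r
library(janitor)

counts <- mtcars %>% tabyl(am, cyl)

# denominator = "row": each count is divided by the total of its row
pct <- adorn_percentages(counts, denominator = "row")

# The first column (am) is left untouched; the remaining columns hold
# proportions between 0 and 1, so each row should sum to 1.
row_totals <- rowSums(pct[, -1])
row_totals
```

With `denominator = "col"` the same check would apply to column sums instead.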
31 changes: 23 additions & 8 deletions R/adorn_rounding.R
@@ -1,16 +1,29 @@
#' Round the numeric columns in a data.frame.
#'
#' @description
#' Can run on any data.frame with at least one numeric column. This function defaults to excluding the first column of the input data.frame, assuming that it contains a descriptive variable, but this can be overridden by specifying the columns to round in the `...` argument.
#' Can run on any `data.frame` with at least one numeric column.
#' This function defaults to excluding the first column of the input data.frame,
#' assuming that it contains a descriptive variable, but this can be overridden by
#' specifying the columns to round in the `...` argument.
#'
#' If you're formatting percentages, e.g., the result of `adorn_percentages()`, use `adorn_pct_formatting()` instead. This is a more flexible variant for ad-hoc usage. Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the numbers with spaces for alignment in the results data.frame. This function retains the class of numeric input columns.
#' If you're formatting percentages, e.g., the result of [adorn_percentages()],
#' use [adorn_pct_formatting()] instead. This is a more flexible variant for ad-hoc usage.
#' Compared to `adorn_pct_formatting()`, it does not multiply by 100 or pad the
#' numbers with spaces for alignment in the results `data.frame`.
#' This function retains the class of numeric input columns.
#'
#' @param dat a `tabyl` or other data.frame with similar layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param digits how many digits should be displayed after the decimal point?
#' @param rounding method to use for rounding - either "half to even", the base R default method, or "half up", where 14.5 rounds up to 15.
#' @param ... columns to adorn. This takes a tidyselect specification. By default, all numeric columns (besides the initial column, if numeric) are adorned, but this allows you to manually specify which columns should be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#' @param dat A `tabyl` or other `data.frame` with similar layout.
#' If given a list of data.frames, this function will apply itself to each
#' `data.frame` in the list (designed for 3-way `tabyl` lists).
#' @param digits How many digits should be displayed after the decimal point?
#' @param rounding Method to use for rounding - either "half to even"
#' (the base R default method), or "half up", where 14.5 rounds up to 15.
#' @param ... Columns to adorn. This takes a tidyselect specification.
#' By default, all numeric columns (besides the initial column, if numeric)
#' are adorned, but this allows you to manually specify which columns should
#' be adorned, for use on a data.frame that does not result from a call to `tabyl`.
#'
#' @return Returns the data.frame with rounded numeric columns.
#' @return The `data.frame` with rounded numeric columns.
#' @export
#' @examples
#'
@@ -54,7 +67,9 @@ adorn_rounding <- function(dat, digits = 1, rounding = "half to even", ...) {
}
numeric_cols <- which(vapply(dat, is.numeric, logical(1)))
non_numeric_cols <- setdiff(1:ncol(dat), numeric_cols)
numeric_cols <- setdiff(numeric_cols, 1) # assume 1st column should not be included so remove it from numeric_cols. Moved up to this line so that if only 1st col is numeric, the function errors
# assume 1st column should not be included so remove it from numeric_cols.
# Moved up to this line so that if only 1st col is numeric, the function errors
numeric_cols <- setdiff(numeric_cols, 1)

if (rlang::dots_n(...) == 0) {
cols_to_round <- numeric_cols
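The two `rounding` methods named in the documentation above differ only on exact halves. The following base R sketch illustrates the difference; `half_up_sketch()` is a hypothetical helper written for this example and is not janitor's implementation.

```r
x <- c(0.5, 1.5, 2.5, 14.5)

# Base R rounds exact halves to the nearest even integer ("half to even")
half_even <- round(x)

# Hypothetical helper, for illustration only: round halves away from zero
half_up_sketch <- function(x, digits = 0) {
  mult <- 10^digits
  sign(x) * floor(abs(x) * mult + 0.5) / mult
}
half_up <- half_up_sketch(x)

half_even  # 0 2 2 14
half_up    # 1 2 3 15
```

In janitor itself, the second behaviour is selected by passing `rounding = "half up"` to `adorn_rounding()`, as the parameter documentation describes.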
51 changes: 37 additions & 14 deletions R/adorn_title.R
@@ -1,13 +1,30 @@
#' @title Add column name to the top of a two-way tabyl.
#' Add column name to the top of a two-way tabyl.
#'
#' @description
#' This function adds the column variable name to the top of a `tabyl` for a complete display of information. This makes the tabyl prettier, but renders the data.frame less useful for further manipulation.
#' This function adds the column variable name to the top of a `tabyl` for a
#' complete display of information. This makes the tabyl prettier, but renders
#' the `data.frame` less useful for further manipulation.
#'
#' @param dat a data.frame of class `tabyl` or other data.frame with a tabyl-like layout. If given a list of data.frames, this function will apply itself to each data.frame in the list (designed for 3-way `tabyl` lists).
#' @param placement whether the column name should be added to the top of the tabyl in an otherwise-empty row `"top"` or appended to the already-present row name variable (`"combined"`). The formatting in the `"top"` option has the look of base R's `table()`; it also wipes out the other column names, making it hard to further use the data.frame besides formatting it for reporting. The `"combined"` option is more conservative in this regard.
#' @param row_name (optional) default behavior is to pull the row name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from the attributes of the input `tabyl` object. If you wish to override that text, or if your input is not a `tabyl`, supply a string here.
#' @return the input tabyl, augmented with the column title. Non-tabyl inputs that are of class `tbl_df` are downgraded to basic data.frames so that the title row prints correctly.
#' The `placement` argument indicates whether the column name should be added to
#' the `top` of the tabyl in an otherwise-empty row `"top"` or appended to the
#' already-present row name variable (`"combined"`). The formatting in the `"top"`
#' option has the look of base R's `table()`; it also wipes out the other column
#' names, making it hard to further use the `data.frame` besides formatting it for reporting.
#' The `"combined"` option is more conservative in this regard.
#'
#' @param dat A `data.frame` of class `tabyl` or other `data.frame` with a tabyl-like layout.
#' If given a list of data.frames, this function will apply itself to each `data.frame`
#' in the list (designed for 3-way `tabyl` lists).
#' @param placement The title placement, one of `"top"`, or `"combined"`.
#' See **Details** for more information.
#' @param row_name (optional) default behavior is to pull the row name from the
#' attributes of the input `tabyl` object. If you wish to override that text,
#' or if your input is not a `tabyl`, supply a string here.
#' @param col_name (optional) default behavior is to pull the column_name from
#' the attributes of the input `tabyl` object. If you wish to override that text,
#' or if your input is not a `tabyl`, supply a string here.
#' @return The input `tabyl`, augmented with the column title. Non-tabyl inputs
#' that are of class `tbl_df` are downgraded to basic data.frames so that the
#' title row prints correctly.
#'
#' @export
#' @examples
@@ -38,12 +55,14 @@ adorn_title <- function(dat, placement = "top", row_name, col_name) {

if (inherits(dat, "tabyl")) {
if (attr(dat, "tabyl_type") == "one_way") {
warning("adorn_title is meant for two-way tabyls, calling it on a one-way tabyl may not yield a meaningful result")
warning(
"adorn_title is meant for two-way tabyls, calling it on a one-way tabyl may not yield a meaningful result"
)
}
}
if (missing(col_name)) {
if (!inherits(dat, "tabyl")) {
stop("When input is not a data.frame of class tabyl, a value must be specified for the col_name argument")
stop("When input is not a data.frame of class tabyl, a value must be specified for the col_name argument.")
}
col_var <- attr(dat, "var_names")$col
} else {
@@ -63,13 +82,15 @@ adorn_title <- function(dat, placement = "top", row_name, col_name) {
if (inherits(dat, "tabyl")) {
row_var <- attr(dat, "var_names")$row
} else {
row_var <- names(dat)[1] # for non-tabyl input, if no row_name supplied, use first existing name
# for non-tabyl input, if no row_name supplied, use first existing name
row_var <- names(dat)[1]
}
}


if (placement == "top") {
dat[, ] <- lapply(dat[, ], as.character) # to handle factors, problematic in first column and at bind_rows.
# to handle factors, problematic in first column and at bind_rows.
dat[, ] <- lapply(dat[, ], as.character)
# Can't use mutate_all b/c it strips attributes
top <- dat[1, ]

@@ -82,8 +103,10 @@ adorn_title <- function(dat, placement = "top", row_name, col_name) {
out <- dat
names(out)[1] <- paste(row_var, col_var, sep = "/")
}
if (inherits(out, "tbl_df")) { # "top" text doesn't print if input (and thus the output) is a tibble
out <- as.data.frame(out) # but this prints row numbers, so don't apply to non-tbl_dfs like tabyls
# "top" text doesn't print if input (and thus the output) is a tibble
if (inherits(out, "tbl_df")) {
# but this prints row numbers, so don't apply to non-tbl_dfs like tabyls
out <- as.data.frame(out)
}
out
}
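The `"combined"` placement handled in the code above can be illustrated with a small example. This is a sketch that is not part of the commit; it assumes janitor is installed, and the name `combined` is illustrative only.

```r
library(janitor)

combined <- mtcars %>%
  tabyl(am, cyl) %>%
  adorn_title(placement = "combined")

# The row and column variable names are joined with "/" and used as the
# name of the first column; the other column names are left as they were.
names(combined)[1]
```

Unlike `placement = "top"`, this keeps the original column names, so the result stays easier to manipulate afterwards.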