Commit

no longer allow data.frame() input to make_clean_names() (#535)
billdenney authored Mar 14, 2023
1 parent bb78f34 commit d64c8bb
Showing 4 changed files with 37 additions and 20 deletions.
2 changes: 2 additions & 0 deletions NEWS.md
@@ -4,6 +4,8 @@

* `adorn_totals("row")` now succeeds if the new `name` of the totals row is already a factor level of the input data.frame (#529, thanks @egozoglu for reporting).

* `make_clean_names()` no longer accepts a data.frame or tibble as input; use `clean_names()` instead (fix #532, **@billdenney**).
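
A minimal sketch of what this change means for calling code, assuming an installed janitor version that includes this commit. The data.frame `df` and its column names are made up for illustration; only `make_clean_names()` and `clean_names()` come from the package.

```r
library(janitor)

# Illustrative data.frame with messy column names (hypothetical example data).
df <- data.frame(`First Name` = 1, `Last Name` = 2, check.names = FALSE)

# Passing the data.frame itself now stops with an error; capture the message
# instead of letting it abort the script.
result <- tryCatch(
  make_clean_names(df),
  error = function(e) conditionMessage(e)
)

# Option 1: clean the names of the data.frame directly.
cleaned <- clean_names(df)

# Option 2: clean the character vector of names and assign it back.
names(df) <- make_clean_names(names(df))
```

Both options should yield the same column names; `make_clean_names()` remains the tool for character vectors, while `clean_names()` handles data.frame-like objects.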

# janitor 2.2.0 (2023-02-02)

## Breaking changes
30 changes: 16 additions & 14 deletions R/make_clean_names.R
@@ -15,21 +15,21 @@
#'
#' The order of operations is: make replacements, (optional) ASCII conversion,
#' remove initial spaces and punctuation, apply \code{base::make.names()},
#' apply \code{snakecase::to_any_case}, and add numeric suffixes
#' to resolve any duplicated names.
#'
#' This function relies on \code{snakecase::to_any_case} and can take advantage of
#' its versatility. For instance, an abbreviation like "ID" can have its
#' capitalization preserved by passing the argument \code{abbreviations = "ID"}.
#' See the documentation for \code{\link[snakecase:to_any_case]{snakecase::to_any_case}}
#' for more about how to use its features.
#'
#' On some systems, not all transliterators to ASCII are available. If this is
#' the case on your system, all available transliterators will be used, and a
#' warning will be issued once per session indicating that results may be
#' different when run on a different system. That warning can be disabled with
#' \code{options(janitor_warn_transliterators=FALSE)}.
#'
#' If the objective of your call to \code{make_clean_names()} is only to translate to
#' ASCII, try the following instead:
#' \code{stringi::stri_trans_general(x, id="Any-Latin;Greek-Latin;Latin-ASCII")}.
@@ -93,7 +93,9 @@ make_clean_names <- function(string,
parsing_option = 1,
numerals = "asis",
...) {

if (is.data.frame(string)) {
stop("`string` must not be a data.frame, use clean_names()")
}
# Handling "old_janitor" case for backward compatibility
if (case == "old_janitor") {
return(old_make_clean_names(string))
@@ -154,7 +156,7 @@ make_clean_names <- function(string,
numerals = numerals,
...
)

# Handle duplicated names by appending an incremental counter to repeats
if (!allow_dupes) {
while (any(duplicated(cased_names))) {
@@ -165,21 +167,21 @@ make_clean_names <- function(string,
},
1L
)

cased_names[dupe_count > 1] <-
paste(
cased_names[dupe_count > 1],
dupe_count[dupe_count > 1],
sep = "_"
)
}
}

cased_names
}

#' Warn if micro or mu are going to be replaced with make_clean_names()
#'
#' @inheritParams make_clean_names
#' @param character Which character should be tested for ("micro" or "mu", or both)?
#' @return TRUE if a warning was issued or FALSE if no warning was issued
@@ -247,7 +249,7 @@ warn_micro_mu <- function(string, replace) {

# copy of clean_names from janitor v0.3 on CRAN, to preserve old behavior
old_make_clean_names <- function(string) {

# Takes a character vector of names, returns the cleaned names
old_names <- string
new_names <- old_names %>%
@@ -260,13 +262,13 @@ old_make_clean_names <- function(string) {
gsub("[_]+", "_", .) %>% # fix rare cases of multiple consecutive underscores
tolower(.) %>%
gsub("_$", "", .) # remove string-final underscores

# Handle duplicated names - they mess up dplyr pipelines
# This appends the column number to repeated instances of duplicate variable names
dupe_count <- vapply(seq_along(new_names), function(i) {
sum(new_names[i] == new_names[1:i])
}, integer(1))

new_names[dupe_count > 1] <- paste(
new_names[dupe_count > 1],
dupe_count[dupe_count > 1],
10 changes: 5 additions & 5 deletions man/make_clean_names.Rd

Some generated files are not rendered by default.

15 changes: 14 additions & 1 deletion tests/testthat/test-clean-names.R
@@ -213,6 +213,19 @@ test_that("warnings are issued when micro/mu are not handled (fix #448)", {
)
})

test_that("make_clean_names error for data.frame input", {
expect_error(
make_clean_names(data.frame(a = 1)),
regexp = "`string` must not be a data.frame, use clean_names()",
fixed = TRUE
)
expect_error(
make_clean_names(tibble::tibble(a = 1)),
regexp = "`string` must not be a data.frame, use clean_names()",
fixed = TRUE
)
})

# Tests for warn_micro_mu ####

test_that("warn_micro_mu", {
@@ -628,7 +641,7 @@ test_that("tbl_lazy/dbplyr", {
testing_vector[6:7],
"repeated_2",
testing_vector[9:22]))

# create a database object with clean names
# warning due to unhandled mu
expect_warning(clean_db <- clean_names(test_db, case = "snake"))