BREAKING CHANGES
- The following deprecated functions have been removed: data_cut(), data_recode(), data_shift(), data_reverse(), data_rescale(), data_to_factor(), data_to_numeric().
- New text_format() alias is introduced for format_text(), the latter of which will be removed in the next release.
- New recode_values() alias is introduced for change_code(), the latter of which will be removed in the next release.
CHANGES
- The regex argument was added to functions that use select-helpers and did not already have this argument.
- Fixes failing tests due to the {poorman} update.
MAJOR CHANGES
- The following statistical transformation functions have been renamed to drop the data_*() prefix. They do not work exclusively with data frames and are typically used with vectors first, so the old names were misleading:
  - data_cut() -> categorize()
  - data_recode() -> change_code()
  - data_shift() -> slide()
  - data_reverse() -> reverse()
  - data_rescale() -> rescale()
  - data_to_factor() -> to_factor()
  - data_to_numeric() -> to_numeric()

  Note that these functions also have .data.frame() methods and still work for data frames as well. The former function names are still available as aliases, but will be deprecated and removed in a future release.
- Bumps the minimum required R version to 3.5.
- Removed deprecated function data_findcols(). Please use its replacement, data_find().
- Removed alias extract() for the data_extract() function, since it collided with tidyr::extract().
- Argument training_proportion in data_partition() is deprecated. Please use proportion now.
- Given his continued and significant contributions to the package, Etienne Bacher (@etiennebacher) is now included as an author.
- unstandardise() now works for center(x).
- unnormalize() now works for change_scale(x).
- reshape_wider() now follows tidyr::pivot_wider() syntax more consistently. Arguments colnames_from, sep, and rows_from are deprecated and should be replaced by names_from, names_sep, and id_cols, respectively. reshape_wider() also gains an argument names_glue (#182, #198).
- Similarly, reshape_longer() now follows tidyr::pivot_longer() syntax more consistently. Argument colnames_to is deprecated and should be replaced by names_to. reshape_longer() also gains new arguments: names_prefix, names_sep, names_pattern, and values_drop_na (#189).
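As a migration aid for the renamed transformation functions listed at the top of this section, the old-to-new mapping can be captured in a lookup table. The helper rename_calls() below is hypothetical and not part of the package. It is only a minimal sketch that rewrites call names in a line of code, and it does not account for later renames such as the recode_values() alias for change_code().

```r
# Old -> new names, copied from the renaming list in this changelog.
renamed <- c(
  data_cut        = "categorize",
  data_recode     = "change_code",
  data_shift      = "slide",
  data_reverse    = "reverse",
  data_rescale    = "rescale",
  data_to_factor  = "to_factor",
  data_to_numeric = "to_numeric"
)

# Hypothetical helper (not part of the package): replace "old(" with "new("
# in each line of code. Matching on the opening parenthesis avoids touching
# longer names that merely start with an old name.
rename_calls <- function(code) {
  for (old in names(renamed)) {
    code <- gsub(paste0(old, "("), paste0(renamed[[old]], "("), code, fixed = TRUE)
  }
  code
}

rename_calls("x <- data_to_numeric(data_cut(y))")
```

Because the match is a plain substring match, a user-defined name that ends in one of the old names (for example a hypothetical my_data_cut()) would also be rewritten, so the result should be reviewed by hand.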
CHANGES
- Some of the text formatting helpers (like text_concatenate()) gain an enclose argument, to wrap text elements with surrounding characters.
- winsorize() now accepts "raw" and "zscore" methods (in addition to "percentile"). Additionally, when robust is set to TRUE together with method = "zscore", it winsorizes via the median and median absolute deviation (MAD); otherwise via the mean and standard deviation. (@rempsyc, #177, #49, #47).
- convert_na_to() now accepts numeric replacements on character vectors and a single replacement for multiple vector classes. (@rempsyc, #214).
- data_partition() now allows creating multiple partitions from the data, returning multiple training sets and a remaining test set.
- Functions like center(), normalize() or standardize() no longer fail when data contains infinite values (Inf).
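The difference between the robust and non-robust z-score variants of winsorizing mentioned above can be illustrated in base R. This is a sketch of the general idea only, not the package's implementation: the function name winsorize_zscore_sketch() and the threshold argument are assumptions made for this example.

```r
# Hypothetical base-R sketch: clamp values that lie more than `threshold`
# units of spread away from the centre.
# robust = TRUE  -> centre = median, spread = MAD
# robust = FALSE -> centre = mean,   spread = standard deviation
winsorize_zscore_sketch <- function(x, threshold = 3, robust = FALSE) {
  if (robust) {
    centre <- stats::median(x, na.rm = TRUE)
    spread <- stats::mad(x, na.rm = TRUE)
  } else {
    centre <- mean(x, na.rm = TRUE)
    spread <- stats::sd(x, na.rm = TRUE)
  }
  lower <- centre - threshold * spread
  upper <- centre + threshold * spread
  # Values outside [lower, upper] are pulled back to the nearest bound.
  pmin(pmax(x, lower), upper)
}

x <- c(1, 2, 3, 4, 100)
winsorize_zscore_sketch(x, threshold = 2, robust = TRUE)
winsorize_zscore_sketch(x, threshold = 2, robust = FALSE)
```

With a single extreme value, the mean and standard deviation are themselves inflated by the outlier, so the non-robust bounds can be wide enough to leave it untouched, while the median/MAD bounds stay close to the bulk of the data.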
NEW FUNCTIONS
- row_to_colnames() and colnames_to_row() to move a row to column names, and column names to a row (@etiennebacher, #169).
- data_arrange() to sort the rows of a data frame according to the values of the selected columns.
BUG FIXES
- Fixed wrong column names in data_to_wide() (#173).
BREAKING
- Added the standardize.default() method (moved from package effectsize), so that the default method now lives in the same package as the generic. standardize.default() behaves exactly as it did in effectsize and, in particular, works for regression model objects. effectsize now re-exports standardize() from datawizard.
NEW FUNCTIONS
- data_shift() to shift the value range of numeric variables.
- data_recode() to recode old into new values.
- data_to_factor() as counterpart to data_to_numeric().
- data_tabulate() to create frequency tables of variables.
- data_read() to read (import) data files (from text, or foreign statistical packages).
- unnormalize() as counterpart to normalize(). This function only works for variables that have been normalized with normalize().
- data_group() and data_ungroup() to create grouped data frames, or to remove the grouping information from grouped data frames.
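One plausible reading of why unnormalize() only works on output of normalize() is that the original range has to travel with the normalized values. The base-R sketch below illustrates that idea under this assumption. The function names and the attribute name original_range are made up for the example and do not describe the package's internals.

```r
# Hypothetical sketch: rescale to [0, 1] and remember the original range.
normalize_sketch <- function(x) {
  rng <- range(x, na.rm = TRUE)
  out <- (x - rng[1]) / (rng[2] - rng[1])
  attr(out, "original_range") <- rng
  out
}

# Hypothetical sketch: undo the rescaling using the stored range.
unnormalize_sketch <- function(x) {
  rng <- attr(x, "original_range")
  if (is.null(rng)) {
    stop("No stored range found; `x` was not produced by normalize_sketch().")
  }
  # as.numeric() drops the attribute so a plain vector is returned.
  as.numeric(x) * (rng[2] - rng[1]) + rng[1]
}

x <- c(2, 5, 11)
z <- normalize_sketch(x)
unnormalize_sketch(z)
```

A plain numeric vector such as c(0, 0.5, 1) carries no stored range, so the sketch stops with an error instead of guessing the original scale.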
CHANGES
- data_find() was added as alias to find_columns(), to have consistent name patterns for the datawizard functions. data_findcols() will be removed in a future update and usage is discouraged.
- The select argument (and thus, also the exclude argument) now also accepts functions testing for logical conditions, e.g. is.numeric() (or is.numeric), or any user-defined function that selects the variables for which the function returns TRUE (like: foo <- function(x) mean(x) > 3).
- Arguments select and exclude now allow the negation of select-helpers, like -ends_with(""), -is.numeric or -Sepal.Width:Petal.Length.
- Many functions now get a .default method, to capture unsupported classes. This now yields a message and returns the original input, and hence, the .data.frame methods won't stop due to an error.
- The filter argument in data_filter() can also be a numeric vector, to indicate row indices of those rows that should be returned.
- convert_to_na() gets methods for variables of class logical and Date.
- convert_to_na() for factors (and data frames) gains a drop_levels argument, to drop unused levels that have been replaced by NA.
- data_to_numeric() gains two more arguments, preserve_levels and lowest, to give better control of conversion of factors.
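The predicate-based selection described for the select argument above can be pictured with a small base-R sketch. The helper select_by_predicate() is hypothetical and only mimics the idea of keeping the columns for which a function returns TRUE. It is not how the package implements select. The example predicate adds an is.numeric() guard, which is an assumption made here so that the factor column of iris is skipped cleanly.

```r
# Hypothetical helper: return the names of all columns for which
# `predicate` returns TRUE.
select_by_predicate <- function(data, predicate) {
  keep <- vapply(data, function(col) isTRUE(predicate(col)), logical(1))
  names(data)[keep]
}

# Built-in test function, as in select = is.numeric
select_by_predicate(iris, is.numeric)

# User-defined function, similar to the foo example in the changelog
foo <- function(x) is.numeric(x) && mean(x) > 3
select_by_predicate(iris, foo)

# Negation, similar in spirit to select = -is.numeric
setdiff(names(iris), select_by_predicate(iris, is.numeric))
```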
BUG FIXES
- When logicals were passed to center() or standardize() and force = TRUE, these were not properly converted to numeric variables.
MAJOR CHANGES
- data_match() now returns filtered data by default. The old behavior (returning row indices) can be restored by setting return_indices = TRUE.
- The following functions are now re-exported from the {insight} package: object_has_names(), object_has_rownames(), is_empty_object(), compact_list(), compact_character().
- data_findcols() will become deprecated in future updates. Please use the new replacements find_columns() and get_columns().
- The vignette Analysing Longitudinal or Panel Data has now moved to the parameters package.
NEW FUNCTIONS
- To convert rownames to a column, and vice versa: rownames_as_column() and column_as_rownames() (@etiennebacher, #80).
- find_columns() and get_columns() to find column names or retrieve subsets of data frames, based on various select-methods (including select-helpers). These functions will supersede data_findcols() in the future.
- data_filter() as complement for data_match(), which works with logical expressions for filtering rows of data frames.
- For computing weighted centrality measures and dispersion: weighted_mean(), weighted_median(), weighted_sd() and weighted_mad().
- To replace NA in vectors and data frames: convert_na_to() (@etiennebacher, #111).
MINOR CHANGES
- The select argument in several functions (like data_remove(), reshape_longer(), or data_extract()) now allows the use of select-helpers for selecting variables based on specific patterns.
- data_extract() gains new arguments to allow type-safe return values, i.e. always return a vector or a data frame. Thus, data_extract() can now be used to select multiple variables or pull a single variable from data frames.
- data_match() gains a match argument, to indicate with which logical operation matching results should be combined.
- Improved support for labelled data for many functions, i.e. the returned data frame will preserve value and variable label attributes, where possible and applicable.
- describe_distribution() now works with lists (@etiennebacher, #105).
- data_rename() doesn't use pattern anymore to rename the columns if replacement is not provided (@etiennebacher, #103).
- data_rename() now adds a suffix to duplicated names in replacement (@etiennebacher, #103).
BUG FIXES
- data_to_numeric() produced wrong results for factors when dummy_factors = TRUE and the factor contained missing values.
- data_match() produced wrong results when data contained missing values.
- Fixed CRAN check issues in data_extract() when more than one variable was extracted from a data frame.
NEW FUNCTIONS
- To find or remove empty rows and columns in a data frame: empty_rows(), empty_columns(), remove_empty_rows(), remove_empty_columns(), and remove_empty().
- To check for names: object_has_names() and object_has_rownames().
- To rotate data frames: data_rotate().
- To reverse score variables: data_reverse().
- To merge/join multiple data frames: data_merge() (or its alias data_join()).
- To cut (recode) data into groups: data_cut().
- To replace specific values with NAs: convert_to_na().
- To replace Inf and NaN values with NAs: replace_nan_inf().
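What replacing Inf and NaN with NA amounts to can be shown in a few lines of base R. The function name replace_nan_inf_sketch() is made up for this illustration and the code is not taken from the package.

```r
# Hypothetical sketch: set NaN, Inf and -Inf to NA in numeric vectors,
# and leave other types untouched.
replace_nan_inf_sketch <- function(x) {
  if (is.numeric(x)) {
    x[is.nan(x) | is.infinite(x)] <- NA
  }
  x
}

replace_nan_inf_sketch(c(1, NaN, Inf, -Inf, 4))

# Applied column-wise, the same idea extends to data frames.
df <- data.frame(a = c(1, Inf), b = c("x", "y"))
df[] <- lapply(df, replace_nan_inf_sketch)
df
```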
- Arguments cols, before and after in data_relocate() can now also be numeric values, indicating the position of the destination column.
- New functions:
  - to work with lists: is_empty_object() and compact_list()
  - to work with strings: compact_character()
- New function data_extract() (or its alias extract()) to pull single variables from a data frame, possibly naming each value by the row names of that data frame.
- reshape_ci() gains a ci_type argument, to reshape data frames where CI-columns have prefixes other than "CI".
- standardize() and center() gain arguments center and scale, to define references for centrality and deviation that are used when centering or standardizing variables.
- center() gains the arguments force and reference, similar to standardize().
- The functionality of the append argument in center() and standardize() was revised. This made the suffix argument redundant, and thus it was removed.
- Fixed issue in standardize().
- Fixed issue in data_findcols().
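The effect of supplying custom references for centrality and deviation, as described for the center and scale arguments above, can be sketched in base R. The function standardize_sketch() is hypothetical and only shows the arithmetic. The package functions do considerably more, for example handling data frames.

```r
# Hypothetical sketch: subtract a centrality reference and divide by a
# deviation reference. By default, both are estimated from `x` itself.
standardize_sketch <- function(x,
                               center = mean(x, na.rm = TRUE),
                               scale = stats::sd(x, na.rm = TRUE)) {
  (x - center) / scale
}

x <- c(1, 2, 3)
standardize_sketch(x)                        # references estimated from x
standardize_sketch(x, center = 0, scale = 2) # references supplied by the user
```

Supplying the references explicitly is useful, for instance, when new data should be put on the scale of an earlier sample.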
- Exports plot method for visualisation_recipe() objects from the {see} package.
- centre(), standardise(), unstandardise() are exported as aliases for center(), standardize(), unstandardize(), respectively.
- This is mainly a maintenance release that addresses some issues with conflicting namespaces.
- New function: visualisation_recipe().
- The following function has now moved to the performance package: check_multimodal().
- Minor updates to documentation, including a new vignette about demean().
- First release.