wompwomp

Make alluvial plots with node order and colors optimized to minimize edge crossings with wompwomp!

wompwomp solves the Weighted (permutation) Optimization of Multiple Partitions-Weighted (label) Optimization of Multiple Partitions (W_POMP--W_LOMP) problem.
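To make "edge crossings" concrete, here is a minimal base R sketch that counts weighted crossings for a grouped two-column table. It is not the package's implementation. The function name naive_weighted_crossings is hypothetical, and the sketch assumes that each label's integer value is its vertical position on the axis and that a crossing between two edges costs the product of their weights.

```r
# Hypothetical helper, not part of wompwomp: naive O(E^2) weighted crossing count.
# Assumption: two edges cross when their endpoints are ordered oppositely on the
# two axes, and each crossing contributes weight_i * weight_j to the total.
naive_weighted_crossings <- function(df, col1, col2, weight) {
    left <- as.integer(df[[col1]])
    right <- as.integer(df[[col2]])
    w <- df[[weight]]
    n <- nrow(df)
    if (n < 2) {
        return(0)
    }
    total <- 0
    for (i in seq_len(n - 1)) {
        for (j in (i + 1):n) {
            # Opposite ordering on the two axes means the edges cross
            if ((left[i] - left[j]) * (right[i] - right[j]) < 0) {
                total <- total + w[i] * w[j]
            }
        }
    }
    total
}

df <- data.frame(
    method1 = c(1, 1, 2, 2),
    method2 = c(1, 2, 1, 2),
    weight = c(5, 3, 4, 6)
)
crossings_original <- naive_weighted_crossings(df, "method1", "method2", "weight")

# Swap the order of the two method2 nodes and recount
df_swapped <- df
df_swapped$method2 <- 3 - df_swapped$method2
crossings_swapped <- naive_weighted_crossings(df_swapped, "method1", "method2", "weight")
```

In this toy table the original node order gives a weighted crossing count of 12, while swapping the two method2 nodes raises it to 30, which is why node order matters for the final plot.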

wompwomp functions/commands

Installation:

R - Requires system R to be installed

Bioconductor (not yet released on Bioconductor - please install from GitHub)

if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install("wompwomp")
wompwomp::setup_python_env()

GitHub

if (!require("remotes", quietly = TRUE))
    install.packages("remotes")
remotes::install_github("pachterlab/wompwomp")
wompwomp::setup_python_env()

Command line - Does not require system R to be installed if using conda.

git clone https://github.com/pachterlab/wompwomp
cd wompwomp
conda env create -f environment.yml  # or to avoid conda: Rscript inst/install.R
conda activate wompwomp_env  # skip if used install.R above
Rscript -e 'remotes::install_local(".")'  # or use --dev flag in commands

The first time any command is run on the command line, a prompt will appear asking to install any missing R dependencies.

Python is not strictly required to use the package, but some options depend on it, including several defaults: the NeighborNet algorithm (sorting_algorithm == "neighbornet" or column_sorting_algorithm == "neighbornet"), Leiden clustering (coloring_algorithm == "advanced"), and the Fenwick tree optimization used for objective calculation.

Usage

The I/O for each of wompwomp's functions is as follows:

plot_alluvial: dataframe, csv, or tibble (grouped or ungrouped) --> plot
data_preprocess: dataframe, csv, or tibble (grouped or ungrouped) --> dataframe (grouped)
data_sort: dataframe, csv, or tibble (grouped or ungrouped) --> dataframe (grouped)
plot_alluvial_internal: dataframe, csv, or tibble (grouped) --> plot
determine_crossing_edges: dataframe, csv, or tibble (grouped or ungrouped) --> list
determine_weighted_layer_free_objective: dataframe, csv, or tibble (grouped or ungrouped) --> integer

The input table can have one of two formats:

Ungrouped: columns specified by column1 and column2, where each row corresponds to a separate entity
Grouped: columns specified by column1, column2, and column_weights, where each row corresponds to a combination of column1 and column2, and column_weights specifies the number of items in this combination

Examples in R

Ungrouped input

library("wompwomp")
df <- data.frame(method1 = sample(1:3, 100, TRUE), method2 = sample(1:3, 100, TRUE))
head(df)
#>   method1 method2
#> 1       1       1
#> 2       1       3
#> 3       1       2
#> 4       1       1
#> 5       2       1
#> 6       2       2

p <- plot_alluvial(df)
p

Grouped input

set.seed(42)
raw_df <- data.frame(
    method1 = sample(1:3, 100, TRUE),
    method2 = sample(1:3, 100, TRUE)
)

# Aggregate by combination
df <- as.data.frame(dplyr::count(raw_df, method1, method2, name = "weight"))
head(df)

#>   method1 method2 weight
#> 1       1       1     13
#> 2       1       2     15
#> 3       1       3     12
#> 4       2       1     12
#> 5       2       2     17
#> 6       2       3     10

p <- plot_alluvial(df, column_weights = "weight")
p

Examples in Command Line:

./exec/wompwomp plot_alluvial --df mydata.csv --graphing_columns column1 column2

For help on any command, run ./exec/wompwomp COMMAND --help

Notes about command line usage:

All parameter values should be separated from their flag by a space, e.g., ./exec/wompwomp plot_alluvial --df data.csv, NOT --df=data.csv.
Parameters that take a single argument have identical names in R and on the command line, with the value immediately following the flag, e.g., plot_alluvial(df = "data.csv") corresponds to ./exec/wompwomp plot_alluvial --df data.csv.
Parameters that take a vector/list of arguments have identical names in R and on the command line, with the values immediately following the flag, separated by spaces, e.g., plot_alluvial(graphing_columns = c("tissue", "cluster")) corresponds to ./exec/wompwomp plot_alluvial --graphing_columns tissue cluster.
Boolean parameters are passed as a flag with no following argument. Those that default to FALSE have identical names in R and on the command line; those that default to TRUE have "disable_" prepended on the command line. For example, given the defaults include_group_sizes = FALSE and include_axis_titles = TRUE, plot_alluvial(include_group_sizes = TRUE, include_axis_titles = FALSE) corresponds to ./exec/wompwomp plot_alluvial --include_group_sizes --disable_include_axis_titles.
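The naming rules above can be sketched as a small base R function that turns R-style arguments into the matching command line flags. The helper to_cli_args and its true_by_default parameter are hypothetical and not part of wompwomp. The sketch also assumes that values need no shell quoting.

```r
# Hypothetical helper, not part of wompwomp: map R-style arguments to CLI flags.
# `true_by_default` names the boolean parameters whose R default is TRUE.
to_cli_args <- function(args, true_by_default = character()) {
    out <- character()
    for (name in names(args)) {
        value <- args[[name]]
        if (is.logical(value)) {
            default <- name %in% true_by_default
            if (value && !default) {
                # FALSE-by-default boolean switched on: bare flag
                out <- c(out, paste0("--", name))
            } else if (!value && default) {
                # TRUE-by-default boolean switched off: "disable_" prefix
                out <- c(out, paste0("--disable_", name))
            }
            # A boolean left at its default needs no flag
        } else {
            # Single values and vectors: flag followed by space-separated values
            out <- c(out, paste0("--", name), as.character(value))
        }
    }
    paste(out, collapse = " ")
}

cmd <- to_cli_args(
    list(
        df = "data.csv",
        graphing_columns = c("tissue", "cluster"),
        include_group_sizes = TRUE,
        include_axis_titles = FALSE
    ),
    true_by_default = "include_axis_titles"
)
cat("./exec/wompwomp plot_alluvial", cmd, "\n")
```

Here cmd holds the string --df data.csv --graphing_columns tissue cluster --include_group_sizes --disable_include_axis_titles, which combines the individual examples from the notes above.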

See a full tutorial in our introductory vignette wompwomp-intro.Rmd

Read our preprint on arXiv here.
