NeighborFinder

NeighborFinder is an R package that reconstructs the local neighborhood of a species of interest in a microbial interaction network from microbiome abundance data. Unlike most methods, NeighborFinder does not reconstruct the full network before extracting a local neighborhood; it focuses on local interactions from the start to gain statistical power.

Using cross-validated multiple linear regression with ℓ1 penalty and microbiome-specific filters, our approach infers interpretable species-centered interactions, with F1 score ≥ 0.95 on simulated datasets ranging from 250 to 1000 samples.
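To make the idea of ℓ1-penalized neighborhood selection concrete, here is a minimal base-R sketch on simulated data. It is an illustration only, not the NeighborFinder implementation: the helpers `soft_threshold()`, `lasso_cd()` and `cv_lasso()`, the simulated species names and the lambda grid are all hypothetical, and the microbiome-specific filters are not reproduced.

```r
# Illustrative sketch: regress one focal species on all others with a lasso
# penalty and keep the species with non-zero coefficients as its neighbors.

soft_threshold <- function(z, gamma) sign(z) * pmax(abs(z) - gamma, 0)

# Coordinate-descent lasso without intercept (data are centered beforehand).
# Minimizes (1 / (2n)) * ||y - X b||^2 + lambda * ||b||_1.
lasso_cd <- function(X, y, lambda, n_iter = 200) {
  n <- nrow(X)
  beta <- rep(0, ncol(X))
  resid <- y  # residual when all coefficients are zero
  for (iter in seq_len(n_iter)) {
    for (j in seq_len(ncol(X))) {
      xj_sq <- sum(X[, j]^2) / n
      # Correlation of predictor j with the partial residual (excluding j)
      rho <- sum(X[, j] * resid) / n + beta[j] * xj_sq
      new_bj <- soft_threshold(rho, lambda) / xj_sq
      resid <- resid - X[, j] * (new_bj - beta[j])
      beta[j] <- new_bj
    }
  }
  beta
}

# K-fold cross-validation over a small grid of penalties.
cv_lasso <- function(X, y, lambdas, k = 5) {
  folds <- sample(rep(seq_len(k), length.out = nrow(X)))
  cv_err <- sapply(lambdas, function(lam) {
    mean(sapply(seq_len(k), function(f) {
      train <- folds != f
      b <- lasso_cd(X[train, , drop = FALSE], y[train], lam)
      mean((y[!train] - X[!train, , drop = FALSE] %*% b)^2)
    }))
  })
  best <- lambdas[which.min(cv_err)]
  list(lambda = best, beta = lasso_cd(X, y, best))
}

# Simulated data: the focal species depends on sp3 and sp7 only.
set.seed(1)
n <- 300
p <- 20
X <- scale(matrix(rnorm(n * p), n, p))
colnames(X) <- paste0("sp", seq_len(p))
y <- 0.8 * X[, "sp3"] - 0.6 * X[, "sp7"] + rnorm(n, sd = 0.5)
y <- y - mean(y)

fit <- cv_lasso(X, y, lambdas = c(0.3, 0.2, 0.1, 0.05))
neighbors <- colnames(X)[fit$beta != 0]
print(neighbors)
```

With a cross-validated penalty, the true neighbors are recovered but a few spurious species may also be selected, which is one reason additional filtering can be useful.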

Furthermore, when multiple abundance datasets are available for the species of interest, NeighborFinder integrates the results obtained on each dataset to produce a robust shortlist of high-likelihood companion species.

NeighborFinder is tailored to microbiome data. It was specifically developed for shotgun metagenomic data and includes a default normalization step for such datasets, but it can accommodate metabarcoding data (and other count-based inputs) when that normalization step is skipped.

Installation

The latest version of NeighborFinder is available from the public GitHub repository.

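# Install remotes first if it is not already available
if (!requireNamespace("remotes", quietly = TRUE)) {
  install.packages("remotes")
}
# Install NeighborFinder from GitHub
remotes::install_github("metagenopolis/NeighborFinder")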

Getting started

1. Download data

We use the data provided in the package: abundance tables from three datasets. NeighborFinder requires an abundance table (data), with species as rows and samples as columns. It can also use a taxonomic affiliation table (taxo) to add details on the taxa when visualizing the results.

library(NeighborFinder)
data(data)
data(taxo)
# Assumption: the metadata object used in step 4 also ships with the package
data(metadata)

2. Apply NeighborFinder on a species of interest

Let’s find the neighborhood of Escherichia coli using the abundance data from the Japanese patients in this cohort (data$CRC_JPN).

res_CRC_JPN <- apply_NeighborFinder(
  data_with_annotation = data$CRC_JPN,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  prev_level = 0.30,
  filtering_top = 30
)
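
As a rough illustration of what a prevalence filter does, the base-R sketch below computes, for each species, the fraction of samples in which it is detected and keeps those at or above 30%. This assumes that prev_level acts as a minimum prevalence threshold; the toy table, its identifiers and its values are made up, and the exact filtering performed by apply_NeighborFinder() (including filtering_top) is not reproduced here.

```r
# Toy abundance table: species as rows, samples as columns.
# All identifiers and values are hypothetical.
toy <- data.frame(
  msp_id = c("msp_0001", "msp_0002", "msp_0003", "msp_0004"),
  S1 = c(0.10, 0.00, 0.02, 0.00),
  S2 = c(0.05, 0.00, 0.00, 0.00),
  S3 = c(0.20, 0.01, 0.00, 0.00),
  S4 = c(0.00, 0.00, 0.03, 0.00),
  S5 = c(0.15, 0.00, 0.00, 0.04)
)

# Numeric matrix of abundances, indexed by species identifier
abund <- as.matrix(toy[, -1])
rownames(abund) <- toy$msp_id

# Prevalence = fraction of samples with a non-zero abundance
prevalence <- rowMeans(abund > 0)

# Keep species detected in at least 30% of the samples
kept <- names(prevalence)[prevalence >= 0.30]
print(kept)
```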

3. Visualize the corresponding network

The species identified as neighbors can then be visualized using visualize_network(), with or without taxonomic annotation.

plot_JPN <- visualize_network(
  res_CRC_JPN,
  taxo,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  label_size = 5
)

plot_JPN_annot <- visualize_network(
  res_CRC_JPN,
  taxo,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  label_size = 5,
  annotation_option = TRUE,
  seed = 2
)

library(patchwork)
plot_JPN + plot_JPN_annot + plot_layout(widths=c(1,1.3))

4. Use different datasets

We can repeat the process on the two other datasets: the Chinese patients (data$CRC_CHN) and the European patients (data$CRC_EUR).

# CHINA
res_CRC_CHN <- apply_NeighborFinder(
  data$CRC_CHN,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  prev_level = 0.30,
  filtering_top = 30,
  covar = ~study_accession,
  meta_df = metadata$CRC_CHN,
  sample_col = "secondary_sample_accession"
)

plot_CHN <- visualize_network(
  res_CRC_CHN,
  taxo,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  label_size = 5
)

# EUROPE
res_CRC_EUR <- apply_NeighborFinder(
  data$CRC_EUR,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  prev_level = 0.30,
  filtering_top = 30,
  covar = ~study_accession,
  meta_df = metadata$CRC_EUR,
  sample_col = "secondary_sample_accession"
)

plot_EUR <- visualize_network(
  res_CRC_EUR,
  taxo,
  object_of_interest = "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  label_size = 5
)

plot_JPN | plot_CHN | plot_EUR

5. Aggregate the results

The results from all three datasets are then combined. In the aggregated network, we keep only the edges detected in at least 2 of the 3 datasets.

final_net <- intersections_network(
  res_list = list(res_CRC_JPN, res_CRC_CHN, res_CRC_EUR),
  taxo,
  threshold = 2,
  "Escherichia coli",
  col_module_id = "msp_id",
  annotation_level = "species",
  label_size = 7,
  edge_label_size = 4,
  node_size = 15
)

plot_spacer() + final_net + plot_spacer() + plot_layout(widths=c(0.5,1,0.5))
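
The "at least 2 out of 3" rule can be sketched in a few lines of base R. This is only an illustration of the thresholding idea, not the internals of intersections_network(): the neighbor lists and species names below are hypothetical.

```r
# Hypothetical neighbors found for the same focal species in three datasets
neighbors_by_dataset <- list(
  JPN = c("species_A", "species_B", "species_C"),
  CHN = c("species_B", "species_C", "species_D"),
  EUR = c("species_C", "species_E")
)

# Count in how many datasets each candidate neighbor was detected
counts <- table(unlist(lapply(neighbors_by_dataset, unique)))

# Keep the neighbors detected in at least `threshold` datasets
threshold <- 2
shortlist <- names(counts)[counts >= threshold]
print(shortlist)
```

Raising the threshold to 3 would keep only the neighbors shared by every dataset, trading sensitivity for robustness.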

Full tutorial

The vignette provides an overview of the various use cases of NeighborFinder through examples based on real data extracted from this repository.
