
BEpipeR: a user-friendly, flexible, and scalable data synthesis pipeline for the Biodiversity Exploratories and other research consortia

Marcel Glück | Oliver Bossdorf | Henri A. Thomassen



Motivation

The wealth of (a)biotic environmental data generated in the Biodiversity Exploratories (BE) continues to grow steadily, and so does the effort required to use these data in our statistical frameworks. Unsurprisingly, many BE projects restrict their analyses to a handful of frequently used data sets and neglect the wealth of information at their fingertips. Often, this may be due to the stringent quality control and (pre-)processing that many data sets still require; however, such a narrow focus can prevent us from obtaining a holistic understanding of our complex study systems. To remedy this, the project provides a user-friendly, flexible, scalable, reproducible, and easy-to-expand R pipeline for the streamlined synthesis of experimental plot data generated by the Exploratories. We are convinced that this framework will benefit many scientists in the Exploratories, as the data it generates can serve as input to many types of environmental association studies. Moreover, the pipeline can readily be adapted to process plot-based data generated by other research consortia.

This project is a registered Biodiversity Exploratories synthesis project.

Features and functionalities

✔️ Ease of use: Parse aggregation information hassle-free through CSV parameter files (see the sketch after this list).

✔️ Flexibility: One pipeline, three modes. Toggle effortlessly between processing forest or grassland data, or a combination thereof.

✔️ Deployability: Run this pipeline on your infrastructure effortlessly, thanks to a reproducible environment.

✔️ Participatory: Shape the future of this project by providing suggestions or code.
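
For illustration, a parameter file might look roughly like the sketch below. The file path, column names, and values are hypothetical, meant only to convey how aggregation settings could be supplied; they are not the pipeline's actual schema.

```r
# Hypothetical excerpt of a CSV parameter file (e.g. parameters/aggregation.csv);
# column names and values are assumptions for illustration only:
#
#   dataset, variable,      metrics
#   soil,    soil_moisture, mean;sd
#   plants,  cover,         median;mad

# Reading and inspecting such a file in R:
params <- read.csv("parameters/aggregation.csv", stringsAsFactors = FALSE)
str(params)
```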

Processing steps

  1. Pre-processing: Template creation, plot location harmonization, data correction and subsetting, taxonomic fallbacks, data reshaping, and normalization by variable
  2. Quality control: Multi-mode outlier detection
  3. Aggregation: Within and across data sets (metrics: mean, median, standard deviation (SD), median absolute deviation (MAD)); processing of yearly climate aggregates (incl. the removal of weakly supported data points; metrics: mean, median, SD, MAD, min, and max); see the first sketch after this list
  4. Diversity calculations: Normalization by repeated rarefaction; calculation of alpha diversity indices (species richness, Simpson, Shannon-Wiener, Margalef, Menhinick, ...); see the second sketch after this list
  5. Post-processing: Data joining, quality control, and variable selection by variance inflation factor (VIF) analyses; see the third sketch after this list
  6. Data export and metadata compilation: Export of composite data sets and VIF-produced subsets; fetching of metadata for the variables produced, to assist in preparing the data for publication
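
To make the aggregation step concrete, the sketch below computes per-plot mean, median, SD, and MAD with dplyr. The data frame and column names are placeholders, not the pipeline's own objects.

```r
library(dplyr)

# Hypothetical long-format measurements: several values per plot
measurements <- data.frame(
  plot  = rep(c("plot_1", "plot_2"), each = 4),
  value = c(3.1, 2.8, 3.5, 3.0, 4.2, 4.0, 3.9, 4.4)
)

# Per-plot aggregation metrics
measurements %>%
  group_by(plot) %>%
  summarise(mean   = mean(value),
            median = median(value),
            sd     = sd(value),
            mad    = mad(value))
```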
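The diversity step can be illustrated with the vegan package: rarefy a plot-by-species abundance matrix repeatedly and derive alpha diversity indices from the averaged result. This is a minimal stand-in under assumed data, not the pipeline's actual code.

```r
library(vegan)  # provides rrarefy(), specnumber(), diversity()

set.seed(42)
# Hypothetical plot-by-species abundance matrix (rows = plots, columns = species)
abund <- matrix(rpois(5 * 10, lambda = 3), nrow = 5,
                dimnames = list(paste0("plot_", 1:5), paste0("sp_", 1:10)))

# Repeated rarefaction to the smallest plot total, averaged over 100 draws
n_min    <- min(rowSums(abund))
rarefied <- Reduce(`+`, lapply(1:100, function(i) rrarefy(abund, n_min))) / 100

# Alpha diversity indices per plot
richness  <- specnumber(rarefied)                    # species richness
shannon   <- diversity(rarefied, index = "shannon")  # Shannon-Wiener
simpson   <- diversity(rarefied, index = "simpson")  # Simpson
N         <- rowSums(rarefied)
margalef  <- (richness - 1) / log(N)                 # Margalef
menhinick <- richness / sqrt(N)                      # Menhinick
```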
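VIF-based variable selection, as in the post-processing step, could be approximated with the usdm package, which iteratively excludes variables above a chosen VIF threshold. The data, threshold, and package choice are assumptions; the pipeline's actual implementation may differ.

```r
library(usdm)  # provides vif(), vifstep(), and exclude()

set.seed(1)
# Hypothetical data frame of aggregated environmental predictors
env <- data.frame(
  temperature   = rnorm(50, 10, 2),
  precipitation = rnorm(50, 800, 100),
  soil_ph       = rnorm(50, 6, 0.5)
)
env$heat_index <- env$temperature * 1.1 + rnorm(50, 0, 0.1)  # deliberately collinear

vif(env)                      # VIF for each variable
vs <- vifstep(env, th = 10)   # iteratively flag variables with VIF > 10
vs                            # summary of excluded and retained variables
env_kept <- exclude(env, vs)  # predictors restricted to the retained set
```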

Acknowledgements

People/institutions we are indebted to: