laminlabs/arrayloader-benchmarks

Data loader benchmarks for scRNA-seq counts et al.

A collaboration between scverse, Lamin, and anyone interested in contributing!

This repository contains benchmarking scripts and utilities for scRNA-seq data loaders and lets contributors collaboratively add new benchmarking results.

Quickstart

Setup:

git clone https://github.com/laminlabs/arrayloader-benchmarks
cd arrayloader-benchmarks
uv pip install -e ".[scdataset,annbatch]"  # provide tools you'd like to install
lamin connect laminlabs/arrayloader-benchmarks  # contributes results to the hosted lamindb instance; alternatively, call `lamin init` to create a new lamindb instance

Typical calls of the main benchmarking script are:

cd scripts
python run_loading_benchmark_on_collection.py annbatch   # run annbatch on collection Tahoe100M_tiny, n_datasets = 1
python run_loading_benchmark_on_collection.py MappedCollection   # run MappedCollection
python run_loading_benchmark_on_collection.py scDataset   # run scDataset
python run_loading_benchmark_on_collection.py annbatch --n_datasets -1  # run against all datasets in the collection
python run_loading_benchmark_on_collection.py annbatch --collection Tahoe100M --n_datasets -1  # run against the full 100M cells
python run_loading_benchmark_on_collection.py annbatch --collection Tahoe100M --n_datasets 1  # run against the first dataset, 2M cells
python run_loading_benchmark_on_collection.py annbatch --collection Tahoe100M --n_datasets 5  # run against the first 5 datasets, 10M cells
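
To compare loaders under identical settings, the calls above can be wrapped in a small loop. This is only a sketch: it assumes you are already in the `scripts` directory, and `DRY_RUN` is a hypothetical helper variable introduced here for illustration, not an option of the benchmarking script. With the default `DRY_RUN=1` the commands are printed rather than executed.

```shell
# Sketch: run each loader from the examples above with the same settings.
# DRY_RUN is a hypothetical variable for this example only (not part of the repo):
#   DRY_RUN=1 (default) prints the commands, DRY_RUN=0 executes them.
DRY_RUN="${DRY_RUN:-1}"
n_planned=0

for loader in annbatch MappedCollection scDataset; do
  if [ "$DRY_RUN" = "1" ]; then
    # Print the command that would be run.
    echo "python run_loading_benchmark_on_collection.py $loader --n_datasets 1"
  else
    # Run the benchmark for this loader on the first dataset of the collection.
    python run_loading_benchmark_on_collection.py "$loader" --n_datasets 1
  fi
  n_planned=$((n_planned + 1))
done
```

Printing the commands first makes it easy to check the loader names and flags before starting a long run against a larger collection.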

You can choose between different benchmarking dataset collections.

When running the script, parameters and results are automatically tracked in a parquet file, along with source code, run environment, and input and output datasets.

Note: A previous version of this repo contained the benchmarking scripts accompanying the 2024 blog post: lamin.ai/blog/arrayloader-benchmarks.
