LAPIS-SILO

High-performance analytical database for sequence alignment data

For information on how to build, test, and contribute to SILO, see Contributing.

Python Bindings

SILO provides Python bindings via Cython. The bindings wrap the core C++ Database and are installable by pip install silodb.

See Contributing for build instructions.

Usage

from silodb import Database

# Create a new database
db = Database()

# Or load from a saved state
db = Database("/path/to/saved/database")

# Create a nucleotide sequence table
db.create_nucleotide_sequence_table(
    table_name="sequences",
    primary_key_name="id",
    sequence_name="main",
    reference_sequence="ACGT..."
)

# Append data from file
db.append_data_from_file("sequences", "/path/to/data.ndjson")

# Get reference sequence
ref = db.get_nucleotide_reference_sequence("sequences", "main")

# Get filtered bitmap (list of matching row indices)
indices = db.get_filtered_bitmap("sequences", "some_filter")

# Get prevalent mutations
mutations = db.get_prevalent_mutations(
    table_name="sequences",
    sequence_name="main",
    prevalence_threshold=0.05,
    filter_expression=""
)

# Save database state
db.save_checkpoint("/path/to/save/directory")

# Print all data (to stdout)
db.print_all_data("sequences")

Configuration Files

For SILO, there are three different configuration files:

DatabaseConfig described in file database_config.h
PreprocessingConfig used when started with preprocessing and described in file preprocessing_config.h. For details see silo preprocessing --help.
RuntimeConfig used when started with api and described in file runtime_config.h For details see silo api --help.

The database config contains the schema of the database and is always required when preprocessing data. The database config will be saved together with the output of the preprocessing and is therefore not required when starting SILO as an API.

An example configuration file can be seen in testBaseData/exampleDataset/database_config.yaml.

By default, the config files are expected to be YAML files in the current working directory in snake_case (database_config.yaml, preprocessing_config.yaml, runtime_config.yaml), but their location can be overridden using the options --database-config=X, --preprocessing-config=X, and --runtime-config=X.

Preprocessing and Runtime configurations contain default values for all fields and are thus only optional. Their parameters can also be provided as command-line arguments in snake_case and as environment variables prefixed with SILO_ in capital SNAKE_CASE. (e.g. SILO_INPUT_DIRECTORY).

The precendence is CLI argument > Environment Variable > Configuration File > Default Value

Run The Preprocessing

The preprocessing acts as a program that takes an input directory that contains the to-be-processed data and an output directory where the processed data will be stored. Both need to be mounted to the container.

SILO expects a preprocessing config that can to be mounted to the default location /app/preprocessing_config.yaml.

Additionally, a database config and a ndjson file containing the data are required. They should typically be mounted in /preprocessing/input.

docker run \
  -v your/input/directory:/preprocessing/input \
  -v your/preprocessing/output:/preprocessing/output \
  -v your/preprocessing_config.yaml:/app/preprocessing_config.yaml
  silo preprocessing

Both config files can also be provided in custom locations:

silo preprocessing --preprocessing-config=./custom/preprocessing_config.yaml --database-config=./custom/database_config.yaml

The Docker image contains a default preprocessing config that sets defaults specific for running SILO in Docker. Apart from that, there are default values if neither user-provided nor default config specify fields. The user-provided preprocessing config can be used to overwrite the default values. For a full reference, see the help text.

Run docker container (api)

After preprocessing the data, the api can be started with the following command:

docker run
  -p 8081:8081
  -v your/preprocessing/output:/data
  silo api

The directory where SILO expects the preprocessing output can be overwritten via silo api --data-directory=/custom/data/directory or in a corresponding configuration file.

Acknowledgments

Original genome indexing logic with roaring bitmaps by Prof. Neumann: https://db.in.tum.de/~neumann/gi/

Name		Name	Last commit message	Last commit date
Latest commit History 1,649 Commits
.github		.github
.run		.run
.vscode		.vscode
benchmarking		benchmarking
buildScripts		buildScripts
documentation		documentation
endToEndTests		endToEndTests
performance		performance
python		python
src		src
testBaseData		testBaseData
tools		tools
.clang-format		.clang-format
.clang-tidy		.clang-tidy
.clangd		.clangd
.commitlintrc.js		.commitlintrc.js
.dir-locals.el		.dir-locals.el
.dockerignore		.dockerignore
.gitignore		.gitignore
.release-please-manifest.json		.release-please-manifest.json
CHANGELOG.md		CHANGELOG.md
CMakeLists.txt		CMakeLists.txt
Dockerfile		Dockerfile
Dockerfile_dependencies		Dockerfile_dependencies
Dockerfile_linter		Dockerfile_linter
LICENSE		LICENSE
Makefile		Makefile
README.md		README.md
conanfile.py		conanfile.py
docker-compose-for-tests-api.yml		docker-compose-for-tests-api.yml
docker-compose-for-tests-preprocessing-from-ndjson.yml		docker-compose-for-tests-preprocessing-from-ndjson.yml
docker_default_preprocessing_config.yaml		docker_default_preprocessing_config.yaml
docker_runtime_config.yaml		docker_runtime_config.yaml
linter_changes.bash		linter_changes.bash
package-lock.json		package-lock.json
package.json		package.json
pyproject.toml		pyproject.toml
release-please-config.json		release-please-config.json
setup.py		setup.py
version.txt		version.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

LAPIS-SILO

Python Bindings

Usage

Configuration Files

Run The Preprocessing

Run docker container (api)

Acknowledgments

About

Uh oh!

Releases 77

Packages

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Folders and files

Latest commit

History

Repository files navigation

LAPIS-SILO

Python Bindings

Usage

Configuration Files

Run The Preprocessing

Run docker container (api)

Acknowledgments

About

Resources

License

Uh oh!

Stars

Watchers

Forks

Releases 77

Packages 0

Uh oh!

Uh oh!

Contributors

Uh oh!

Languages

Packages