orcestra-downloader

Simplified access to download data from orcestra.ca

Installation

1. Recommended CLI access

The recommended way to use orcestra-downloader is through its CLI, which you can run without installing anything on your system by using pixi exec or uvx.

pixi exec orcestra-downloader --help

uvx via PyPI

uvx orcestra-downloader --help

2. Install into pixi project

To use orcestra-downloader in a pixi project, add it as a dependency from either conda-forge or PyPI:

pixi add orcestra-downloader         # from conda-forge

pixi add --pypi orcestra-downloader  # from pypi

3. Install with pip

If you have a Python virtual environment set up, you can install orcestra-downloader with pip (or python -m pip):

pip install orcestra-downloader

Usage

orcestra-downloader provides a command-line interface to the orcestra.ca API. The CLI lets you list, view, and download datasets.

Available Dataset Types

🔬 Seven different dataset types are available through orcestra.ca:

Dataset Type       Description
pharmacosets       Pharmacological screening datasets
icbsets            Immune checkpoint blockade datasets
radiosets          Radiotherapy response datasets
xevasets           Xenograft-derived datasets
toxicosets         Toxicological screening datasets
radiomicsets       Radiomics datasets
clinicalgenomics   Clinical genomics datasets

Basic Commands

πŸ§‘β€πŸ’» Each dataset type supports these common commands:

# List all datasets of a given type
orcestra-downloader [dataset_type] list

# Print a table of all datasets of a type, or details for one dataset if a name is given
orcestra-downloader [dataset_type] table [DATASET_NAME]

# Download one or more datasets by name
orcestra-downloader [dataset_type] download [DATASET_NAME]

# Download all datasets of a given type
orcestra-downloader [dataset_type] download-all

Examples

📋 Basic listing and table commands

# List all radiosets
orcestra-downloader radiosets list

# Print a table of all xevasets after refreshing the cache
orcestra-downloader xevasets table --force

# Print a table of a specific dataset with more details
# (quote the name so the shell does not interpret the parentheses)
orcestra-downloader pharmacosets table 'GDSC_2020(v2-8.2)'

Refreshing Cache

💡 orcestra-downloader caches dataset metadata from the Orcestra API. The cache should be located at ~/.cache/orcestra-downloader.

By default, the tool only updates the cache when it is used 7 or more days after the last update. To refresh the cache manually, use the --refresh flag.

orcestra-downloader --refresh
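
The 7-day rule above can be approximated from a script. The following is only a sketch: the function name cache_is_stale is hypothetical, and it assumes that file modification times inside the cache directory reflect the last update, which may not match how the tool itself tracks cache age.

```shell
# Hypothetical helper, not part of orcestra-downloader.
# Returns 0 (true) when the cache looks stale, 1 otherwise.
cache_is_stale() {
    # $1: cache directory (defaults to the location mentioned above)
    # $2: maximum age in days (defaults to 7)
    cache_dir="${1:-$HOME/.cache/orcestra-downloader}"
    max_days="${2:-7}"

    # A missing cache directory counts as stale.
    [ -d "$cache_dir" ] || return 0

    # Assumption: the cache is fresh if at least one file in it was
    # modified less than $max_days days ago.
    [ -z "$(find "$cache_dir" -type f -mtime "-$max_days" -print 2>/dev/null | head -n 1)" ]
}

# Possible usage:
#   cache_is_stale && orcestra-downloader --refresh
```

Checking file timestamps rather than the directory's own timestamp avoids treating an in-place rewrite of a cache file as "no update".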

Downloading Datasets

⬇️ Download specific datasets or entire collections:

# Download a specific pharmacoset
orcestra-downloader pharmacosets download 'GDSC_2020(v2-8.2)'

# Download multiple datasets at once
orcestra-downloader radiomicsets download HNSCC_Features RADCURE_Features

# Specify a custom download directory
orcestra-downloader toxicosets download 'DrugMatrix Rat' --directory ./my-data-folder

# Download all datasets of a specific type (with progress bar)
orcestra-downloader xevasets download-all

# Force overwrite of existing files
orcestra-downloader icbsets download-all --overwrite
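
To fetch several dataset types in one go, the download-all command can be wrapped in a small loop. This is a sketch rather than a feature of the tool: the function name download_types and the ORCESTRA_CMD variable are hypothetical, and the per-type subdirectory layout is just one possible choice. Only the download-all command and the --directory option shown above are used.

```shell
# Hypothetical wrapper, not part of orcestra-downloader.
# Usage: download_types OUT_DIR DATASET_TYPE [DATASET_TYPE ...]
download_types() {
    out_dir="$1"
    shift
    for dataset_type in "$@"; do
        # Create one subdirectory per dataset type up front, in case
        # the tool does not create missing directories (an assumption).
        mkdir -p "$out_dir/$dataset_type" || return 1
        # ORCESTRA_CMD (hypothetical) lets you swap the launcher, e.g.
        # "uvx orcestra-downloader", or "echo" for a dry run.
        ${ORCESTRA_CMD:-orcestra-downloader} "$dataset_type" download-all \
            --directory "$out_dir/$dataset_type" || return 1
    done
}

# Possible usage:
#   download_types ./my-data-folder radiosets xevasets
```

The loop stops at the first failing download so that a network or permission problem is not buried under later output.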

Command Reference

βš™οΈ Global options available for all commands:

Options:
  -r, --refresh  Fetch all datasets and hydrate the cache.
  -h, --help     Show this message and exit.
  -q, --quiet    Suppress all logging except errors.
  -v, --verbose  Increase verbosity of logging (0-3: ERROR, WARNING, INFO, DEBUG).

⌨️ Dataset-specific command options

For the list command:

Options:
  --force      Force fetch new data.
  --no-pretty  Disable pretty printing.

For the table command:

Arguments:
  [NAME OF DATASET]  Optional dataset name for detailed information.

Options:
  --force      Force fetch new data.

For the download command:

Arguments:
  [ORCESTRA DATASET NAME]  Required dataset name(s) to download.

Options:
  -o, --overwrite          Overwrite existing file if it exists.
  -d, --directory PATH     Directory to save the file to.
  --force                  Force fetch new data from the API.

For the download-all command:

Options:
  -o, --overwrite          Overwrite existing files if they exist.
  -d, --directory PATH     Directory to save the files to.
  --force                  Force fetch new data from the API.

Troubleshooting

❓ Common issues and solutions:

  • Cache issues: If you're getting outdated information, try using the --refresh flag or --force option.
  • Download errors: Check your internet connection and make sure the orcestra.ca API is accessible.
  • Permission errors: Ensure you have write permissions to the download directory.
  • Dataset not found: Make sure the dataset name is correct and exists on orcestra.ca.
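
For the permission case, a quick pre-flight check before a long download might look like the sketch below. The function name can_write_to is hypothetical, and the check assumes the download directory either exists already or would be created under its nearest existing parent.

```shell
# Hypothetical helper, not part of orcestra-downloader.
# Returns 0 if the target directory, or its nearest existing
# ancestor, is a writable directory.
can_write_to() {
    target="$1"
    # Walk up until we reach a path that exists, since the download
    # directory itself may not have been created yet.
    while [ ! -e "$target" ]; do
        target="$(dirname "$target")"
    done
    [ -d "$target" ] && [ -w "$target" ]
}

# Possible usage:
#   can_write_to ./my-data-folder || echo "cannot write to ./my-data-folder" >&2
```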

Contributing

Contributions are welcome! Please feel free to submit a Pull Request.

If you encounter any issues or have questions, please open an issue on the GitHub repository: https://github.com/bhklab/orcestra-downloader/issues
