Authors: Petar Todorov¹, Artem Sokolov¹
¹Laboratory of Systems Pharmacology, Harvard Medical School
This repo aims to offer an easy way to test gene set hypotheses against a background of predictions made from randomly chosen gene sets. A gene set that is more predictive than the distribution of randomly chosen gene sets may indicate that its constituent genes are important for the prediction task.
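The comparison described above can be illustrated with an empirical p-value: the fraction of random gene sets that score at least as well as the hypothesis gene set. This is a minimal sketch of the general idea, not necessarily how btr computes its statistics. The function name `empirical_p_value`, the score values, and the use of a performance score such as AUC are all assumptions made for illustration.

```python
import random


def empirical_p_value(hypothesis_score, background_scores):
    """Fraction of background scores at least as high as the hypothesis score.

    One is added to the numerator and the denominator so that the estimate
    is never exactly zero with a finite background.
    """
    n_as_good = sum(1 for s in background_scores if s >= hypothesis_score)
    return (n_as_good + 1) / (len(background_scores) + 1)


# Hypothetical performance scores for 1000 randomly chosen gene sets.
rng = random.Random(0)
background = [rng.gauss(0.60, 0.05) for _ in range(1000)]

# Hypothetical score for the gene set under test.
p = empirical_p_value(0.75, background)
print(f"empirical p-value: {p:.4f}")
```

A small p-value here means that few random gene sets matched the hypothesis gene set, which is the situation the paragraph above describes as potentially interesting.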
If you are running on a Linux system that has multiple versions of python, gcc, and git, make sure you are using the correct ones by running which <module name>. To discover which versions of these modules are available, use module avail <module name>, and then module load <module name> to load the correct one before proceeding. The btr package has been tested with python/3.6.0, gcc/6.2.0, and git/2.14.2 or higher.
We get started by creating a Python 3 virtual environment.
virtualenv nameyourenvhere
In case you are setting this up on a cluster, you may want to use the packages compiled by the cluster. To do that:
virtualenv nameyourenvhere --system-site-packages
To activate your environment:
source nameyourenvhere/bin/activate
Then clone this repo and install it in editable mode using pip:
git clone https://github.com/pvtodorov/btr.git
cd btr
pip install -e .
You're ready to go!
Installing the repo also makes several commands available in the terminal: btr-predict, btr-score, and btr-stats.
To specify how the software should run, a settings file is needed. An example can be seen in this repo's example_settings.json. If a background is being generated, a file such as example_background_params.json must be provided. If a hypothesis is being used as the feature set, a .gmt file must be provided, such as this one from Pathway Commons.
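For readers unfamiliar with the .gmt format: each line describes one gene set as tab-separated fields, with the set name first, a description second, and the member genes after that. The sketch below shows one way to read such a file in Python. It is not taken from btr's own code, and the function name `parse_gmt` and the example pathway names are hypothetical.

```python
def parse_gmt(lines):
    """Parse GMT-formatted lines into a dict of {set_name: set_of_genes}."""
    gene_sets = {}
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank lines and lines with no genes
        name, _description, *genes = fields
        gene_sets[name] = {g for g in genes if g}
    return gene_sets


# Hypothetical two-line GMT content; real files come from sources
# such as Pathway Commons.
example = [
    "HYPOTHETICAL_PATHWAY_A\texample description\tTP53\tMDM2\tCDKN1A\n",
    "HYPOTHETICAL_PATHWAY_B\texample description\tEGFR\tKRAS\n",
]

gene_sets = parse_gmt(example)
for name, genes in gene_sets.items():
    print(name, sorted(genes))
```

To read from disk, pass an open file handle instead of the list, for example `parse_gmt(open(path))`.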
To generate background predictions:
btr-predict <path to settings file> -b <path to background parameters>
To generate geneset predictions:
btr-predict <path to settings file> -g <path to GMT file or folder with txt gene lists>
To evaluate predictions, first score the background runs:
btr-score <path to settings file>
To evaluate gene set files, score them and then run btr-stats on the same input:
btr-score <path to settings file> -g <path to GMT file or folder with txt gene lists>
btr-stats <path to settings file> -g <path to GMT file or folder with txt gene lists>
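The commands above can be collected into a small driver script. The sketch below only assembles the five commands, in the order they appear in this README, and prints them. It uses no flags beyond the -b and -g shown above. The function name `build_pipeline` and the file name my_gene_sets.gmt are hypothetical, and whether your workflow needs every step is an assumption to check against your own settings.

```python
import shlex


def build_pipeline(settings, background_params, gene_sets):
    """Return the btr commands from this README as argv lists, in order."""
    return [
        ["btr-predict", settings, "-b", background_params],
        ["btr-predict", settings, "-g", gene_sets],
        ["btr-score", settings],
        ["btr-score", settings, "-g", gene_sets],
        ["btr-stats", settings, "-g", gene_sets],
    ]


commands = build_pipeline(
    "example_settings.json",
    "example_background_params.json",
    "my_gene_sets.gmt",  # hypothetical path to a GMT file
)

# Dry run: print each command with shell-safe quoting.
for argv in commands:
    print(" ".join(shlex.quote(arg) for arg in argv))
```

To execute the steps rather than print them, each list can be passed to `subprocess.run(argv, check=True)` from an activated environment where btr is installed.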