How to use Pyinstaller on UMI-tools and make it reproducible? #349

Closed
christianbioinf opened this issue Jul 8, 2019 · 1 comment

Comments

@christianbioinf
Contributor

Hi,

I ran into an issue when using UMI-tools as a standalone (one-file) executable generated with PyInstaller. I use this to distribute UMI-tools more easily to machines with the same architecture, but I noticed that the output was no longer reproducible, even when setting the --random-seed option. It seems to be related to PYTHONHASHSEED, since when using UMI-tools normally (e.g. from a virtualenv), setting this variable made the output reproducible.
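As a rough illustration of why PYTHONHASHSEED can matter here: string hashes are randomized per interpreter start, so the iteration order of a set of strings can change between runs unless the seed is fixed. The sketch below is not UMI-tools code; the UMI-like strings and the helper name `set_order` are made up for the example.

```python
import os
import subprocess
import sys

# Child program: print the iteration order of a small set of strings.
# The strings are arbitrary UMI-like placeholders.
CHILD = "print(list({'ACGT', 'TTGA', 'GGCA', 'CATG', 'AAAA'}))"


def set_order(seed):
    """Run CHILD in a fresh interpreter with the given PYTHONHASHSEED."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return result.stdout.strip()


# With a fixed seed, every fresh interpreter yields the same order.
same_seed_runs = {set_order(0) for _ in range(3)}
print("distinct orders with PYTHONHASHSEED=0:", len(same_seed_runs))

# With a random seed, the order may differ from run to run.
random_runs = {set_order("random") for _ in range(5)}
print("distinct orders with PYTHONHASHSEED=random:", len(random_runs))
```

Any code path whose result depends on iterating over such a set would then be sensitive to the hash seed, independently of --random-seed.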

This is how I first installed UMI-tools (using python3.6, but I guess it is the same with other versions of Python 3) and ran the functional tests:

git clone https://github.com/CGATOxford/UMI-tools
cd UMI-tools/ && virtualenv -p python3.6 venv && . venv/bin/activate
pip install "numpy==1.16.4" "pandas==0.24.2" "future==0.17.1" "regex==2019.6.8" "scipy==1.3.0" "matplotlib==3.1.1" "pysam==0.15.2"
pip install .
export PYTHONHASHSEED=0 && nosetests -v tests/test_umi_tools.py
deactivate

=> All 53 tests were successful

This is my procedure to build a one-file executable for UMI-tools with PyInstaller, including a patch to circumvent an issue with the Tcl/Tk libraries on OSX (pyinstaller/pyinstaller#3753), and then run the functional tests against it:

. venv/bin/activate && pip install "pyinstaller==3.4"
patch venv/lib/python3.6/site-packages/PyInstaller/hooks/hook-_tkinter.py pyinstaller_hook-_tkinter.patch
rm -rf build dist && pyinstaller --clean $(find venv/lib/python3.6/site-packages/umi_tools/ -maxdepth 1 \( -name "*.py" -o -name "*.c" \) | grep -v "__init__.py" | grep -v "umi_tools_exe.py" | xargs -n 1 -I {} echo "--add-data "{}":umi_tools") --hidden-import pysam --hidden-import pysam.libctabixproxies --hidden-import numpy --hidden-import numpy.matlib --hidden-import pandas  --hidden-import regex  --hidden-import scipy --hidden-import scipy._lib.messagestream --hidden-import matplotlib --hidden-import future.utils -F -n umi_tools venv/lib/python3.6/site-packages/umi_tools/umi_tools.py
deactivate
export PYTHONHASHSEED=0 && OLD_PATH=${PATH} && export PATH=${PATH}:${PWD}/dist && nosetests -v tests/test_umi_tools.py; export PATH=${OLD_PATH}

=> 17 of 53 tests failed

To fix this, I tried setting PYTHONHASHSEED explicitly in the modules (e.g. in umi_tools/umi_tools.py) and also created a run-time hook that simply sets this environment variable, but neither worked. Maybe it is related to the way I am creating the one-file executable, but adding all Python source files and the one C file as data was the only way I succeeded in generating a functional one-file executable.
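One likely reason the in-module assignment has no effect: CPython reads PYTHONHASHSEED once, at interpreter startup, so setting it from inside an already running process comes too late. The sketch below shows this with a plain interpreter. Whether the PyInstaller bootloader reads the variable at all is not established in this thread and is not tested here; the helper name `run_child` is hypothetical.

```python
import os
import subprocess
import sys

# Child program: set the variable from inside the process, then hash a string.
CHILD = (
    "import os\n"
    "os.environ['PYTHONHASHSEED'] = '0'  # too late: hashing is already seeded\n"
    "print(hash('ACGT'))\n"
)


def run_child(extra_env):
    """Run CHILD in a fresh interpreter; PYTHONHASHSEED comes only from extra_env."""
    env = dict(os.environ)
    env.pop("PYTHONHASHSEED", None)  # start from an unseeded environment
    env.update(extra_env)
    out = subprocess.run(
        [sys.executable, "-c", CHILD],
        env=env,
        stdout=subprocess.PIPE,
        universal_newlines=True,
        check=True,
    )
    return out.stdout.strip()


# Seed set only inside the child: hashes still vary between runs.
set_inside = {run_child({}) for _ in range(4)}
# Seed set in the environment before startup: hashes are stable.
set_outside = {run_child({"PYTHONHASHSEED": "0"}) for _ in range(4)}

print("distinct hashes, seed set inside the process:", len(set_inside))
print("distinct hashes, seed set before startup:", len(set_outside))
```

This suggests the more robust route is to remove the dependence on hash order in the code itself rather than to rely on the environment.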

For completeness, I am using Python 3.6.7 and umi_tools (e8c2b47) on Mac OSX.

Best regards,
Christian

christianbioinf added a commit to christianbioinf/UMI-tools that referenced this issue Aug 22, 2019
…ries (only true for python3.5+) in cases where the order of elements influences the results. Also updating test results to compensate for using dicts as ordered structures. Note that it is now not necessary to set the environment variable PYTHONHASHSEED anymore.
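The idea described in the commit message, using a dict as an ordered structure where element order influences the result, can be sketched as follows. This is only an illustration, not the code from the commit; `unique_in_order` and the sample list are hypothetical. Note that dict insertion order is a language guarantee from Python 3.7 and a CPython implementation detail in 3.6.

```python
def unique_in_order(items):
    """Deduplicate items while keeping first-seen order.

    Unlike set(items), the result does not depend on string hashing,
    so it is the same regardless of PYTHONHASHSEED.
    """
    return list(dict.fromkeys(items))


# Hypothetical UMI-like strings with duplicates.
umis = ["ACGT", "TTGA", "ACGT", "GGCA", "TTGA"]
ordered = unique_in_order(umis)
print(ordered)  # ['ACGT', 'TTGA', 'GGCA']
```

Because the order now follows the input rather than the hash values, results are stable across runs, at the cost of possibly differing from the order a set would have produced, which is why the expected test outputs had to be updated.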
@TomSmithCGAT
Member

See #550 for an outstanding PR to make UMI-tools deterministic
