MPNST samples generation is failing due to a dependency issue #204

Open
jjacobson95 opened this issue Aug 28, 2024 · 7 comments

Comments

@jjacobson95
Collaborator

Reproduced twice, same error each time.

Error:

b'Error in library(synapser) : there is no package called \xe2\x80\x98synapser\xe2\x80\x99\nExecution halted\n'

Context:
The synapser R package is not being installed correctly in the Docker image, which causes the build pipeline to fail. The cause is unknown, but it looks like you may have dealt with a related dependency issue recently.

Tracing the code, we can see that the mpnst Dockerfile has an updated R base image and instructions to install the packages listed in requirements.r. The requirements.r file includes the synapser package, although its 'repos' argument differs from the documentation, which was most recently updated on 8/21/24.

The Docker build logs confirm that synapser is not installing correctly.

#106 215.4 ERROR: dependency ‘rjson’ is not available for package ‘synapser’
#106 215.4 * removing ‘/usr/local/lib/R/site-library/synapser’
#106 215.4 
#106 215.4 The downloaded source packages are in
#106 215.4      ‘/tmp/Rtmp0MtMyd/downloaded_packages’
#106 215.4 Warning message:
#106 215.4 In install.packages("synapser", repos = c("http://ran.synapse.org",  :
#106 215.4   installation of package ‘synapser’ had non-zero exit status
#106 215.4 Installing package into ‘/usr/local/lib/R/site-library’
#106 215.4 (as ‘lib’ is unspecified)
#106 215.8 also installing the dependencies ‘R.oo’, ‘R.methodsS3’
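
Failures like the one in this log are easy to miss because the Docker build keeps going after the R warning. A minimal sketch of a log scan that pulls out every "dependency is not available" line is shown here. The function name find_missing_r_deps is hypothetical and not part of coderdata, and the pattern assumes R prints the curly quotes seen above.

```shell
# Print "<missing dependency> <package>" for each R install failure of the
# form "ERROR: dependency ‘x’ is not available for package ‘y’".
# Build-step prefixes such as "#106 215.4" are ignored.
# Reads the files given as arguments, or stdin when none are given.
find_missing_r_deps() {
  sed -n 's/.*ERROR: dependency ‘\([^’]*\)’ is not available for package ‘\([^’]*\)’.*/\1 \2/p' "$@"
}
```

Run against a saved build log (for example `find_missing_r_deps build.log`, where build.log is an assumed file name), a non-empty result could be used to fail the pipeline early instead of waiting for library(synapser) to fail at run time.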

To fix, I will try updating the 'repos' argument in install.packages, and adding rjson to the requirements.r file.

Note:
If we create stable docker images, we should make sure to add a test for this.
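
One possible shape for such a test is a smoke check that tries to load each required R package inside the built image. This is only a sketch: check_r_packages and run_in_image are hypothetical names, and it assumes the image has Rscript on its PATH and that a failed library() call makes Rscript exit with a non-zero status, as the "Execution halted" error above suggests.

```shell
# Run a command inside an image. Kept as a separate function so it can be
# swapped out (for example in tests or for a different container runtime).
run_in_image() {
  docker run --rm "$@"
}

# check_r_packages IMAGE PKG...
# Returns non-zero if any listed R package fails to load in IMAGE.
check_r_packages() {
  local image="$1"
  shift
  local pkg failed=0
  for pkg in "$@"; do
    if ! run_in_image "$image" Rscript -e "library($pkg)" >/dev/null 2>&1; then
      echo "FAIL: R package $pkg does not load in image $image" >&2
      failed=1
    fi
  done
  return "$failed"
}
```

A call such as `check_r_packages mpnst rjson synapser` (the image name is an assumption) could run right after the image build so that a missing package stops the pipeline before sample generation starts.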

To Reproduce:

  • Environment:
    • AWS EC2 Instance:
      • Amazon Linux 2 AMI (HVM) – Kernel 5.10, SSD Volume Type
      • Architecture: 64-bit x86
      • Instance Type: t2.2xlarge
      • Storage Amount: 60Gb
      • vCPUs: 8
      • Subnet: coderdata-Public Subnet A
      • Public IP
    • Setup:
      • I ran the two setup scripts (aws_1.sh and aws_2.sh) provided at the end of this issue, refreshing the page between them so that the Docker setup takes effect. I also used the requirements.txt file from an older version of the repo. Then I set the auth tokens.
      • Python Version 3.10.4
      • Docker Engine Version: 25.0.6
      • Docker Compose version: v2.29.2
    • Command:
      • nohup python3.10 build/build_all.py --all --pypi --figshare --version 0.1.41 > output_2.log 2>&1 &

aws_1.sh: Setup script 1


#!/bin/bash

# Install Git
sudo yum install git -y

# Clone the repository
git clone https://github.com/PNNL-CompBio/coderdata.git

# Install Docker and configure it
sudo amazon-linux-extras install docker -y
sudo service docker start
sudo systemctl enable docker
sudo usermod -a -G docker ec2-user

# Install the standalone Docker Compose binary for this OS and architecture
sudo curl -L "https://github.com/docker/compose/releases/latest/download/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose

echo "Part 1 completed successfully!"

aws_2.sh: Setup script 2

#!/bin/bash

# Display Docker info
docker info

# Install development tools
sudo yum groupinstall "Development Tools" -y

# Erase previous OpenSSL development files
sudo yum erase openssl-devel -y

# Install required libraries
sudo yum install openssl11 openssl11-devel libffi-devel bzip2-devel wget -y

# Download and install Python 3.10
wget https://www.python.org/ftp/python/3.10.4/Python-3.10.4.tgz
tar -xf Python-3.10.4.tgz
cd Python-3.10.4
./configure --enable-optimizations
make -j $(nproc)
sudo make altinstall

# Install pip for the system Python 3 (make altinstall normally provides pip3.10 via ensurepip)
sudo yum install python3-pip -y

# Change directory to coderdata and install requirements
cd ~/coderdata
pip3.10 install -r requirements.txt

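# Set SYNAPSE_AUTH_TOKEN if provided.
# The export only reaches the calling shell if this script is sourced;
# the token value is not echoed so that it does not end up in logs.
if [ -n "$1" ]; then
  export SYNAPSE_AUTH_TOKEN="$1"
  echo "SYNAPSE_AUTH_TOKEN set"
else
  echo "No token provided; proceeding without authentication token."
fi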

echo "Part 2 completed successfully!"
@jjacobson95
Collaborator Author

Looks like "rjson" requires R version 4.4.0, which is higher than the R version in the mpnst Dockerfile (r-base:4.3.2). When I upgrade the base image, rjson installs correctly. However, installing synapser then fails with the error message below. Note that the log reports numpy as not found and tries to install a version through reticulate, even though numpy should already be installed in the environment.
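
The mismatch described here (a package needing R 4.4.0 while the image ships R 4.3.2) can be checked before any install is attempted. The sketch below compares dotted version strings. version_ge is a hypothetical helper, and it relies on the -V option of GNU sort being available.

```shell
# version_ge A B: succeed when dotted version A is greater than or equal to B.
# Sorts the two versions and checks that B is the smaller (or equal) one.
version_ge() {
  [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

r_base="4.3.2"    # R version from the r-base tag mentioned above
required="4.4.0"  # minimum R version reported for rjson
if ! version_ge "$r_base" "$required"; then
  echo "R $r_base is older than the required R $required" >&2
fi
```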

synapser installation error:

* installing *source* package ‘synapser’ ...
** using staged installation
[1] "*** Using Python Configuration:"
python:         /root/.virtualenvs/r-reticulate/bin/python
libpython:      /usr/lib/python3.12/config-3.12-x86_64-linux-gnu/libpython3.12.so
pythonhome:     /root/.virtualenvs/r-reticulate:/root/.virtualenvs/r-reticulate
version:        3.12.5 (main, Aug 22 2024, 13:11:09) [GCC 14.2.0]
numpy:           [NOT FOUND]
Using virtual environment '/root/.virtualenvs/r-reticulate' ...
+ /root/.virtualenvs/r-reticulate/bin/python -m pip install --upgrade --no-user 'pandas>=1.5,<=2.0.3' jinja2 markupsafe 'numpy<=1.24.4'
Collecting pandas<=2.0.3,>=1.5
  Using cached pandas-2.0.3.tar.gz (5.3 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done
Collecting jinja2
  Using cached jinja2-3.1.4-py3-none-any.whl.metadata (2.6 kB)
Collecting markupsafe
  Using cached MarkupSafe-2.1.5-cp312-cp312-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (3.0 kB)
Collecting numpy<=1.24.4
  Using cached numpy-1.24.4.tar.gz (10.9 MB)
  Installing build dependencies ... done
  Getting requirements to build wheel ... error
  error: subprocess-exited-with-error
  
  × Getting requirements to build wheel did not run successfully.
  │ exit code: 1
  ╰─> [33 lines of output]
      Traceback (most recent call last):
        File "/root/.virtualenvs/r-reticulate/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 353, in <module>
          main()
        File "/root/.virtualenvs/r-reticulate/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 335, in main
          json_out['return_val'] = hook(**hook_input['kwargs'])
                                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "/root/.virtualenvs/r-reticulate/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 112, in get_requires_for_build_wheel
          backend = _build_backend()
                    ^^^^^^^^^^^^^^^^
        File "/root/.virtualenvs/r-reticulate/lib/python3.12/site-packages/pip/_vendor/pyproject_hooks/_in_process/_in_process.py", line 77, in _build_backend
          obj = import_module(mod_path)
                ^^^^^^^^^^^^^^^^^^^^^^^
        File "/usr/lib/python3.12/importlib/__init__.py", line 90, in import_module
          return _bootstrap._gcd_import(name[level:], package, level)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
        File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
        File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
        File "<frozen importlib._bootstrap>", line 1310, in _find_and_load_unlocked
        File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
        File "<frozen importlib._bootstrap>", line 1387, in _gcd_import
        File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
        File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
        File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
        File "<frozen importlib._bootstrap_external>", line 995, in exec_module
        File "<frozen importlib._bootstrap>", line 488, in _call_with_frames_removed
        File "/tmp/pip-build-env-orwb5ow2/overlay/lib/python3.12/site-packages/setuptools/__init__.py", line 16, in <module>
          import setuptools.version
        File "/tmp/pip-build-env-orwb5ow2/overlay/lib/python3.12/site-packages/setuptools/version.py", line 1, in <module>
          import pkg_resources
        File "/tmp/pip-build-env-orwb5ow2/overlay/lib/python3.12/site-packages/pkg_resources/__init__.py", line 2172, in <module>
          register_finder(pkgutil.ImpImporter, find_on_path)
                          ^^^^^^^^^^^^^^^^^^^
      AttributeError: module 'pkgutil' has no attribute 'ImpImporter'. Did you mean: 'zipimporter'?
      [end of output]
  
  note: This error originates from a subprocess, and is likely not a problem with pip.
error: subprocess-exited-with-error

× Getting requirements to build wheel did not run successfully.
│ exit code: 1
╰─> See above for output.

note: This error originates from a subprocess, and is likely not a problem with pip.
Error: Error installing package(s): "'pandas>=1.5,<=2.0.3'", "jinja2", "markupsafe", "'numpy<=1.24.4'"
Execution halted
ERROR: configuration failed for package ‘synapser’
* removing ‘/usr/local/lib/R/site-library/synapser’
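
The traceback ends in an AttributeError on pkgutil.ImpImporter, an attribute that was removed in Python 3.12. The log also shows pip building numpy 1.24.4 from a source tarball under Python 3.12.5. A guard that fails fast on such an interpreter might look like the sketch below. py_version_lt is a hypothetical helper, and the 3.12 cutoff is an assumption drawn from this traceback rather than from synapser's documentation.

```shell
# py_version_lt VERSION MAJOR MINOR
# Succeed when VERSION (for example "3.10.4") is older than MAJOR.MINOR.
py_version_lt() {
  local major minor rest
  IFS=. read -r major minor rest <<EOF
$1
EOF
  [ "$major" -lt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -lt "$3" ]; }
}

detected="3.12.5"  # version shown in the Python configuration block above
if ! py_version_lt "$detected" 3 12; then
  echo "Python $detected is 3.12 or newer; the pinned numpy source build is likely to fail" >&2
fi
```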

@sgosline
Member

Yes, synapser failed to catch up to the numpy bug. This was finally addressed about 4 days ago; see the comment on the synapser GitHub site.

@jjacobson95
Collaborator Author

I don't think this was fully addressed, as I ran into this issue after the latest synapser release. I used R version 4.4.0 and Python 3.10.

@sgosline
Member

I think it is, since the update to the synapser docs came this week, and I haven't touched this code for over a month. Something shifted and requires updating to account for the new release.

@thomasyu888

Hi all, just following a link here, but we also noticed the rjson issue, as outlined here: https://github.com/Sage-Bionetworks/synapser/blob/develop/DESCRIPTION.

It's a bit unfortunate but since we ship for all 4 R versions, we pin the dependency of rjson to [email protected].

@jjacobson95
Collaborator Author

Thanks @thomasyu888. I've pinned rjson to [email protected], used R 4.3.2 and Python 3.10, and synapser now appears to be installing correctly. If you are interested in a stable Docker container, I have one working using the following files in commit 61965a5: build/docker/Dockerfile.mpnst, build/mpnst/requirements.r, and build/mpnst/requirements.txt, though the commit has some extra files that should be removed.

However, the MPNST dataset is still not generating, so I'll continue to track this here. This is likely due to other dependencies or to things that have shifted in the code since the last version.

@thomasyu888

thomasyu888 commented Sep 24, 2024

Thanks @jjacobson95 for tagging me! You may also want to use [email protected], because later versions of reticulate have unintended consequences, as outlined here: https://r-docs.synapse.org/articles/troubleshooting.html#using-synapser-with-reticulate
