Embedded Python interpreter on macOS puts system site-packages path before venv paths #628

Open
mattw-nws opened this issue Sep 5, 2023 · 3 comments
Labels
bug Something isn't working

Comments

@mattw-nws
Contributor

In testing #626, it was discovered that on macOS, when a venv is active, the interpreter embedded in ngen reports sys.path directories that include a system-level site-packages path like */Frameworks/Python.framework/Versions/3.*/lib/python3.*/site-packages for the same Python version as the venv. This does not occur (or has not been observed) on Linux.

Initially it was thought that this might be a result of #540 (or possibly #592), but checking out the commit before #540 (c261af4) shows the same behavior. This can cause several problems. In particular, it was observed that if the NumPy version in the system path differs from the one in the venv, this triggers the mismatch exception added in #558.

Other (worse) things could happen because of this, for example if different BMI module versions were installed in a venv vs. the system/Python.framework site-packages paths, then an unexpected version could be transparently used (this could be one source of the problems described in #441). Generally installing things in the system Python path (or user site) is likely to cause problems like this, but is not uncommon--especially NumPy seems to end up there frequently, given search results.
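To illustrate the shadowing risk described above, here is a minimal sketch using a throwaway package. The name `fakepkg` and the two directory names are hypothetical stand-ins for a BMI module or NumPy installed in both locations. It only shows that Python's standard path-based finder returns the first match in path order, which is why an earlier system site-packages entry would silently win over the venv copy.

```python
import importlib.machinery
import pathlib
import tempfile
from typing import List, Tuple


def make_fake_package(root: pathlib.Path, version: str) -> None:
    """Create root/fakepkg/__init__.py exposing the given __version__."""
    pkg = root / "fakepkg"
    pkg.mkdir(parents=True)
    (pkg / "__init__.py").write_text('__version__ = "{}"\n'.format(version))


def resolve_origin(name: str, search_path: List[str]) -> str:
    """Return the file an import of `name` would load for this path order."""
    spec = importlib.machinery.PathFinder.find_spec(name, search_path)
    if spec is None or spec.origin is None:
        raise ImportError("{} not found on the given path".format(name))
    return spec.origin


def demo() -> Tuple[str, str]:
    """Return the directory that wins for the embedded-style and venv-only orders."""
    with tempfile.TemporaryDirectory() as tmp:
        # Hypothetical stand-ins for the two site-packages directories.
        system_site = pathlib.Path(tmp) / "system-site-packages"
        venv_site = pathlib.Path(tmp) / "venv-site-packages"
        make_fake_package(system_site, "1.0")
        make_fake_package(venv_site, "2.0")

        # Order seen in the embedded interpreter: system entry first.
        embedded = resolve_origin("fakepkg", [str(system_site), str(venv_site)])
        # Order seen from the venv's own python: venv entry only.
        expected = resolve_origin("fakepkg", [str(venv_site)])

        # origin is .../<site dir>/fakepkg/__init__.py, so go up two levels.
        return (
            pathlib.Path(embedded).parent.parent.name,
            pathlib.Path(expected).parent.parent.name,
        )
```

Calling `demo()` reports which of the two directories each path order resolves to, without importing anything into the running process.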

It seems as if this may be different behavior of pybind11 (or the embedded interpreter tooling of CPython, which is used by pybind11?) on macOS, but the cause is largely unknown. Some interesting info on pybind11's embedding settings that may be relevant (or not).

Current behavior

ngen built on macOS will include a path like the 4th line in the paths returned from sys.path in the embedded pybind11 Python interpreter:

Python Environment Info:
  VIRTUAL_ENV environment variable: /Volumes/Expendable/venv-trm-311
  Discovered venv: /Volumes/Expendable/venv-trm-311
  System paths:
    
    /usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python311.zip
    /usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11
    /usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/lib-dynload
    /usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/site-packages
    /Volumes/Expendable/venv-trm-311/lib/python3.11
    /Volumes/Expendable/venv-trm-311/lib/python3.11/site-packages

...While in the same environment, python itself reports different paths:

(venv-trm-311) > python -c 'import sys; print("\n".join(sys.path))'

/usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python311.zip
/usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11
/usr/local/Cellar/python@3.11/3.11.1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/lib-dynload
/Volumes/Expendable/venv-trm-311/lib/python3.11/site-packages

Notably, ngen explicitly adds the venv paths in InterpreterUtil.hpp ... but it does not add the system site-packages path, and its origin is unknown.
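As a rough way to spot the stray entry when comparing the two listings, the sketch below flags any site-packages directory that does not live under the active venv. The function name and the `FRAMEWORK` prefix are hypothetical placeholders, not ngen code or the real Homebrew path; the venv path is the one from the report.

```python
from pathlib import PurePosixPath
from typing import Iterable, List

# Hypothetical placeholder for the framework's lib directory.
FRAMEWORK = "/opt/example/Python.framework/Versions/3.11/lib"
VENV = "/Volumes/Expendable/venv-trm-311"

# Shape of the list reported by the embedded interpreter on macOS.
EMBEDDED_PATHS = [
    "",
    FRAMEWORK + "/python311.zip",
    FRAMEWORK + "/python3.11",
    FRAMEWORK + "/python3.11/lib-dynload",
    FRAMEWORK + "/python3.11/site-packages",
    VENV + "/lib/python3.11",
    VENV + "/lib/python3.11/site-packages",
]

# Shape of the list reported by the venv's own python.
PYTHON_PATHS = [
    "",
    FRAMEWORK + "/python311.zip",
    FRAMEWORK + "/python3.11",
    FRAMEWORK + "/python3.11/lib-dynload",
    VENV + "/lib/python3.11/site-packages",
]


def foreign_site_packages(paths: Iterable[str], venv: str) -> List[str]:
    """Return site-packages entries that are not located inside `venv`."""
    venv_root = PurePosixPath(venv)
    flagged = []
    for entry in paths:
        if not entry:
            continue  # the empty entry stands for the current directory
        path = PurePosixPath(entry)
        is_site_dir = path.name in ("site-packages", "dist-packages")
        if is_site_dir and venv_root not in path.parents:
            flagged.append(entry)
    return flagged
```

In an actual session one would pass `sys.path` and `os.environ["VIRTUAL_ENV"]` instead of the hard-coded lists.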

Expected behavior

The system site-packages path for the venv python version should not be included. This is an example from a Linux (RHEL7) system:

Python Environment Info:
  VIRTUAL_ENV environment variable: /local/ngen/workdir/.venv
  Discovered venv: /local/ngen/workdir/.venv
  System paths:
    
    /usr/lib64/python2.7/site-packages/openmpi
    /opt/rh/devtoolset-8/root/usr/lib64/python2.7/site-packages
    /opt/rh/devtoolset-8/root/usr/lib/python2.7/site-packages
    /opt/rh/rh-python38/root/usr/lib64/python38.zip
    /opt/rh/rh-python38/root/usr/lib64/python3.8
    /opt/rh/rh-python38/root/usr/lib64/python3.8/lib-dynload
    /local/ngen/workdir/.venv/lib64/python3.8/site-packages
    /local/ngen/workdir/extern/t-route/src/ngen_routing/src
    /local/ngen/workdir/extern/t-route/src/nwm_routing/src
    /local/ngen/workdir/extern/t-route/src/python_framework_v02
    /local/ngen/workdir/extern/t-route/src/python_routing_v02
    /local/ngen/workdir/.venv/lib/python3.8/site-packages
    /local/ngen/workdir/.venv/lib/python3.8

(note that the python2.7 site-packages directories are also not desirable but don't seem to cause the same problems)

Steps to replicate behavior (include URLs)

  1. Build ngen on macOS with Python support
  2. Run ngen without any command line arguments and inspect the Python system paths
  3. Note the differences between those and python -c 'import sys; print("\n".join(sys.path))'

Or...

  1. (on macOS) Deactivate the venv, and run the same Python executable to install an old numpy with pip, e.g. python3 -m pip install numpy==1.24.3
  2. Activate the venv (same Python version) and install a newer numpy with python -m pip install 'numpy>1.24' (quoted so the shell does not treat > as a redirect).
  3. Run ngen with the venv active, and you should see a NumPy version mismatch exception.

Screenshots

@aaraney
Member

aaraney commented Jul 4, 2024

To work around this, you can specify PYTHONPATH=$(python -c "import sysconfig; print(sysconfig.get_path('purelib'))") ngen <args...> when running ngen. python -c "import sysconfig; print(sysconfig.get_path('purelib'))" expands to the site-packages directory of the active environment (more info here).

I've tested this on pybind11>=v2.10.4 using Python 3.7 and 3.9.18.
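For anyone launching ngen from a Python wrapper rather than a shell, the same workaround could be sketched as below. This is only an illustration: `env_with_venv_pythonpath` is a hypothetical helper, and it assumes the wrapper itself runs under the venv's interpreter so that `sysconfig.get_path("purelib")` points at the venv's site-packages.

```python
import os
import sysconfig
from typing import Dict, Mapping, Optional


def env_with_venv_pythonpath(base_env: Optional[Mapping[str, str]] = None) -> Dict[str, str]:
    """Copy an environment and put this interpreter's purelib first on PYTHONPATH."""
    env = dict(os.environ if base_env is None else base_env)
    purelib = sysconfig.get_path("purelib")
    existing = env.get("PYTHONPATH")
    if existing:
        # Keep any PYTHONPATH the user already set, but after the venv entry.
        env["PYTHONPATH"] = os.pathsep.join([purelib, existing])
    else:
        env["PYTHONPATH"] = purelib
    return env
```

The resulting mapping can be passed as `env=` to `subprocess.run` together with the ngen command line.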

@aaraney
Member

aaraney commented Jul 4, 2024

pybind11>=v2.11.0 picks up the expected Python interpreter in both a virtual environment and a global environment on Linux. It still seems that the PYTHONPATH workaround is necessary on macOS, however.

As for a long-term solution, it is possible to programmatically specify PYTHONPATH by setting the pythonpath_env field on the PyConfig struct. In theory we could look for the presence of the VIRTUAL_ENV env var and append /lib/python{version}/site-packages to its value to determine a sensible PYTHONPATH to set programmatically. pybind11 is abstracting all of this away from us at the moment (see our usage of py::scoped_interpreter). I've not found an elegant way to pass the relevant additional options to PyConfig via py::scoped_interpreter's constructor, though (hopefully I'm missing something). In any case, I've found PEP 587 to be valuable while investigating this.
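The path derivation suggested above might look roughly like this. It is a Python sketch of the string logic only, not the C++/PyConfig wiring; `venv_site_packages` is a hypothetical name, and the POSIX `lib/python{major}.{minor}/site-packages` layout is an assumption that does not hold on Windows.

```python
import sys
from pathlib import PurePosixPath
from typing import Optional, Sequence


def venv_site_packages(
    virtual_env: Optional[str],
    version: Optional[Sequence[int]] = None,
) -> Optional[str]:
    """Build <venv>/lib/python<major>.<minor>/site-packages, or None without a venv."""
    if not virtual_env:
        return None  # VIRTUAL_ENV unset or empty: nothing to add
    # Default to the running interpreter's version when none is given.
    major, minor = tuple(version if version is not None else sys.version_info)[:2]
    return str(
        PurePosixPath(virtual_env)
        / "lib"
        / "python{}.{}".format(major, minor)
        / "site-packages"
    )
```

For the venv in the original report and Python 3.11, this yields the same venv site-packages path shown in the embedded interpreter's listing. An embedding application would take the first argument from the VIRTUAL_ENV environment variable.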

@aaraney
Member

aaraney commented Jul 15, 2024

Update: this doesn't seem to be a pybind11 issue but rather a behavioral discrepancy across operating systems at the Python C API level. See pybind/pybind11#5238 for more in-depth detail.

TL;DR

Run ngen on macOS like:

PYTHONEXECUTABLE=$(which python) ngen <args>

sys.executable is set to the path of the Python interpreter on Linux, but to the path of the binary embedding Python (e.g. ngen) on macOS and presumably Windows. Coincidentally, this is also why embedded interpreters on macOS and presumably Windows don't pick up virtual environments correctly.

PYTHONEXECUTABLE changes this behavior only on macOS.
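If ngen is started from a Python launcher script, the TL;DR above could be applied conditionally as in the sketch below. `env_for_embedded_run` is a hypothetical helper; it assumes the launcher runs under the venv's interpreter, so that `sys.executable` matches what `which python` would return.

```python
import os
import sys
from typing import Dict, Mapping, Optional


def env_for_embedded_run(
    platform: Optional[str] = None,
    python_exe: Optional[str] = None,
    base_env: Optional[Mapping[str, str]] = None,
) -> Dict[str, str]:
    """Copy an environment, setting PYTHONEXECUTABLE only when targeting macOS."""
    platform = sys.platform if platform is None else platform
    python_exe = sys.executable if python_exe is None else python_exe
    env = dict(os.environ if base_env is None else base_env)
    if platform == "darwin":
        # Respect a value the user already exported.
        env.setdefault("PYTHONEXECUTABLE", python_exe)
    return env
```

On other platforms the environment is returned unchanged, in line with the note that the variable only has an effect on macOS.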
