Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Automatically import binary extensions whose suffixes include the python version #15

Merged
merged 1 commit into from
Oct 25, 2024

Conversation

ian-hurst
Copy link
Contributor

@ian-hurst ian-hurst commented Oct 18, 2024

TODO:

  • Write new tests or update the old ones to cover new functionality.
  • Update doc-strings where appropriate.
  • Update or write new documentation in packit/packit.dev.

For an edge case where the current interpreter version differs from that of the rpm library, and where that library's extension suffixes are included in the _rpm.so filename (e.g. _rpm.cpython-39-x86_64-linux-gnu.so on RHEL 9.x), add a new reload_module method that finds the matching .so and imports it directly.

The basic operation is:

  • gather a list of valid extension suffixes for the rpm library's interpreter, via importlib.machinery.EXTENSION_SUFFIXES
  • try the import but catch any ModuleNotFoundError
  • on ModuleNotFoundError, loop through the suffix list, looking for a matching {module}.{suffix}.so on disk
  • when a match is found, import it as rpm.{module} directly by path

This is probably naive or unwise in some way, but I wanted to take a stab at it so I can use rpm-shim in on RHEL 9.x systems at my company without needing to use the sudo ln workaround

Example from python3.11 on el9 (Oracle 9.x):

DEBUG:rpm-shim:Collected sitepackages and extension suffixes for /usr/libexec/platform-python:
{'sitepackages': ['/usr/local/lib64/python3.9/site-packages',
                  '/usr/local/lib/python3.9/site-packages',
                  '/usr/lib64/python3.9/site-packages',
                  '/usr/lib/python3.9/site-packages'],
 'suffixes': ['.cpython-39-x86_64-linux-gnu.so', '.abi3.so', '.so']}
DEBUG:rpm-shim:Collected sitepackages and extension suffixes for /usr/bin/python3:
{'sitepackages': ['/usr/local/lib64/python3.9/site-packages',
                  '/usr/local/lib/python3.9/site-packages',
                  '/usr/lib64/python3.9/site-packages',
                  '/usr/lib/python3.9/site-packages'],
 'suffixes': ['.cpython-39-x86_64-linux-gnu.so', '.abi3.so', '.so']}
DEBUG:rpm-shim:Collected sitepackages and extension suffixes for /usr/bin/python3.11:
{'sitepackages': ['/usr/local/lib64/python3.11/site-packages',
                  '/usr/local/lib/python3.11/site-packages',
                  '/usr/lib64/python3.11/site-packages',
                  '/usr/lib/python3.11/site-packages'],
 'suffixes': ['.cpython-311-x86_64-linux-gnu.so', '.abi3.so', '.so']}
DEBUG:rpm-shim:Trying /usr/local/lib64/python3.9/site-packages
DEBUG:rpm-shim:Trying /usr/local/lib/python3.9/site-packages
DEBUG:rpm-shim:Trying /usr/lib64/python3.9/site-packages
DEBUG:rpm-shim:Module rpm._rpm not found in /usr/lib64/python3.9/site-packages/rpm, looking for alternative filenames trying valid extension suffixes
DEBUG:rpm-shim:Trying to load rpm._rpm from /usr/lib64/python3.9/site-packages/rpm/_rpm.cpython-39-x86_64-linux-gnu.so
DEBUG:rpm-shim:Loaded rpm._rpm from /usr/lib64/python3.9/site-packages/rpm/_rpm.cpython-39-x86_64-linux-gnu.so
DEBUG:rpm-shim:Reloaded rpm
DEBUG:rpm-shim:Import successfull

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Member

@nforro nforro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, this looks interesting!
I left some comments. It would be also nice to have some test cases for this.

rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Show resolved Hide resolved
@ian-hurst
Copy link
Contributor Author

Thanks for the feedback! Going through it now. I'm happy to add testcases if this is ultimately seen as viable

Copy link
Contributor

@ian-hurst
Copy link
Contributor Author

ian-hurst commented Oct 23, 2024

Ok, I reworked this to just loop through *.cpython*.so on ModuleNotFoundError exception. If the glob matches any binary extensions, I just load them. One benefit of this is I no longer need any of the suffix stuff. Debug output ends up:

DEBUG:rpm-shim:Collected sitepackages for /usr/libexec/platform-python:
/usr/local/lib64/python3.9/site-packages
/usr/local/lib/python3.9/site-packages
/usr/lib64/python3.9/site-packages
/usr/lib/python3.9/site-packages
DEBUG:rpm-shim:Collected sitepackages for /usr/bin/python3:
/usr/local/lib64/python3.9/site-packages
/usr/local/lib/python3.9/site-packages
/usr/lib64/python3.9/site-packages
/usr/lib/python3.9/site-packages
DEBUG:rpm-shim:Collected sitepackages for /usr/bin/python3.11:
/usr/local/lib64/python3.11/site-packages
/usr/local/lib/python3.11/site-packages
/usr/lib64/python3.11/site-packages
/usr/lib/python3.11/site-packages
DEBUG:rpm-shim:Trying /usr/lib/python3.11/site-packages
DEBUG:rpm-shim:Trying /usr/lib64/python3.9/site-packages
DEBUG:rpm-shim:The rpm._rpm module was not found in /usr/lib64/python3.9/site-packages/rpm, try loading binary extensions directly
DEBUG:rpm-shim:Found binary extensions:
[PosixPath('/usr/lib64/python3.9/site-packages/rpm/_rpm.cpython-39-x86_64-linux-gnu.so')]
DEBUG:rpm-shim:importing /usr/lib64/python3.9/site-packages/rpm/_rpm.cpython-39-x86_64-linux-gnu.so as rpm._rpm
DEBUG:rpm-shim:imported extensions, reloading rpm
DEBUG:rpm-shim:Import successfull

Copy link
Contributor

Copy link
Contributor

@nforro
Copy link
Member

nforro commented Oct 23, 2024

Ok, I reworked this to just loop through *.cpython*.so on ModuleNotFoundError exception. If the glob matches any binary extensions, I just load them. One benefit of this is I no longer need any of the suffix stuff.

While this is more straightforward, I think I liked the previous solution better. Doing it this way it could attempt to load unnecessary modules and that could cause problems, I can imagine a situation where in an unversioned sitepackages path (I think Debian/Ubuntu use those) there are binary modules in multiple versions and loading all of them would be impossible. Getting module name from the exception and constructing filename by trying valid suffixes seems like a better way.

@ian-hurst
Copy link
Contributor Author

While this is more straightforward, I think I liked the previous solution better. Doing it this way it could attempt to load unnecessary modules and that could cause problems, I can imagine a situation where in an unversioned sitepackages path (I think Debian/Ubuntu use those) there are binary modules in multiple versions and loading all of them would be impossible. Getting module name from the exception and constructing filename by trying valid suffixes seems like a better way.

yeah, definitely a drawback to this method. The alternative, I think, would be to do something like reload() in a loop, and break when either:

  • reload() succeeds, or
  • the helper has exhausted all binary extensions, and reload() is still failing

If that sounds good (or if there's a better way) I'm happy to go with it.

@nforro
Copy link
Member

nforro commented Oct 23, 2024

The alternative, I think, would be to do something like reload() in a loop, and break when either:

  • reload() succeeds, or
  • the helper has exhausted all binary extensions, and reload() is still failing

If that sounds good (or if there's a better way) I'm happy to go with it.

Yes, that's what I had in mind - loop until there are no more recoverable ModuleNotFoundError exceptions or until it's not possible to recover.

@ian-hurst
Copy link
Contributor Author

Yes, that's what I had in mind - loop until there are no more recoverable ModuleNotFoundError exceptions or until it's not possible to recover.

Alright, super! Working on it now

Copy link
Contributor

Copy link
Contributor

Copy link
Member

@nforro nforro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These are all style suggestions, the code looks fine 👍

rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
rpm/__init__.py Outdated Show resolved Hide resolved
@ian-hurst
Copy link
Contributor Author

These are all style suggestions, the code looks fine 👍

Alright, thanks for the feedback! I've taken all the suggested changes now

Copy link
Contributor

Copy link
Contributor

Copy link
Member

@nforro nforro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, thanks. Only some test cases are missing, but I'll try to come up with some, unless you want to do it yourself.

Could you please squash the commits into one?

@ian-hurst
Copy link
Contributor Author

Alright, thanks so much for your review and feedback. I love this little shim and I'm so happy to be able to contribute. As an internet rando, I don't actually have write access, so I can't do the merge step myself.

Only some test cases are missing, but I'll try to come up with some, unless you want to do it yourself.

I'm not familiar with your test framework, but I can think of a few basic designs. The obvious one is the positive test case, on rhel9, install python3.11, create a virtualenv, then import rpm. It might also be good to test that the binary extension import by path works when multiple extensions are used. Can't do that on rhel9 which only has one. But on rhel8 you could create that condition by running sudo rm -f /usr/lib64/python3.6/site-packages/rpm/_{rpm,rpmb,rpms}.so. The .cpython-36m-x86_64-linux-gnu.so files would remain and then you'd be able to test your loop. I did this manually and it works.

But seems like it might be good to have some negative test cases:

  • if _rpm*.so are absent, import should fail
  • if _rpm*.so are present, but have invalid suffixes for the original interpreter (probably never occurs in the real world but a good test that things fail)

@nforro
Copy link
Member

nforro commented Oct 25, 2024

Alright, thanks so much for your review and feedback. I love this little shim and I'm so happy to be able to contribute. As an internet rando, I don't actually have write access, so I can't do the merge step myself.

I will merge the PR once you squash the commits into one (or more if you feel that's necessary, but I think one is fine). Also, could you please follow the guidelines when doing it? 🙂

I'm not familiar with your test framework

There isn't really any framework, just a script that imports rpm, either in a virtualenv or directly on a system.

The obvious one is the positive test case, on rhel9, install python3.11, create a virtualenv, then import rpm.

Yes, that's what I had in mind, and it shouldn't be diffcult to modify existing test case to install and run Python 3.11.

It might also be good to test that the binary extension import by path works when multiple extensions are used. Can't do that on rhel9 which only has one. But on rhel8 you could create that condition by running sudo rm -f /usr/lib64/python3.6/site-packages/rpm/_{rpm,rpmb,rpms}.so. The .cpython-36m-x86_64-linux-gnu.so files would remain and then you'd be able to test your loop. I did this manually and it works.

Nice idea.

@ian-hurst
Copy link
Contributor Author

I will merge the PR once you squash the commits into one (or more if you feel that's necessary, but I think one is fine). Also, could you please follow the guidelines when doing it? 🙂

sure, will do!

The obvious one is the positive test case, on rhel9, install python3.11, create a virtualenv, then import rpm.

Yes, that's what I had in mind, and it shouldn't be diffcult to modify existing test case to install and run Python 3.11.

Ok, I'm happy to write tests. Just looking through the workflow now, the same test is run on all flavors of linux. How would you approach adding a test that only runs on one of them?

@nforro
Copy link
Member

nforro commented Oct 25, 2024

Ok, I'm happy to write tests.

Thanks!

Just looking through the workflow now, the same test is run on all flavors of linux. How would you approach adding a test that only runs on one of them?

There could be another environment variable that would be passed to the container and the scripts would react accordingly. Though technically for the simplest case of installing and running in Python 3.11 on RHEL9 (ubi9) it can be simply added to INSTALL_DEPS_CMD, tox should run for all interpreters present on the system.

Actually, even the removal of binary modules could be done in INSTALL_DEPS_CMD, there is probably no need to edit tests/entrypoint.sh.

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

Copy link
Contributor

@ian-hurst
Copy link
Contributor Author

Ok, I have this squashed and a couple scenarios added. Unfortunately (or really, fortunately) the scenario I set up with python36 reveals an issue. _rpms and _rpmb aren't being imported. Why not? Because no exception is raised when they fail. Here is /usr/lib64/python3.6/site-packages/rpm/__init__.py:

# try to import build bits but dont require it
try:
    from rpm._rpmb import *
except ImportError:
    pass

# try to import signing bits but dont require it
try:
    from rpm._rpms import *
except ImportError:
    pass

Nothing is raised, so nothing is caught, and then nothing is helped.

@nforro
Copy link
Member

nforro commented Oct 25, 2024

Ok, I have this squashed and a couple scenarios added.

Excellent, thanks. The commit message title is a bit long, but that's ok. It shouldn't end with a dot though.

Unfortunately (or really, fortunately) the scenario I set up with python36 reveals an issue. _rpms and _rpmb aren't being imported. Why not? Because no exception is raised when they fail. Here is /usr/lib64/python3.6/site-packages/rpm/__init__.py:

# try to import build bits but dont require it
try:
    from rpm._rpmb import *
except ImportError:
    pass

# try to import signing bits but dont require it
try:
    from rpm._rpms import *
except ImportError:
    pass

Nothing is raised, so nothing is caught, and then nothing is helped.

Ok, it's a bit hackish, but I suppose we could do something like sed -i 's/^ pass/ raise/' /usr/lib64/python3.6/site-packages/rpm/__init__.py after installation.

@ian-hurst ian-hurst changed the title Discover extension modules with different suffixes than the current interpreter's Automatically import binary extensions with unexpected suffixes. Oct 25, 2024
@ian-hurst
Copy link
Contributor Author

Ok, it's a bit hackish, but I suppose we could do something like sed -i 's/^ pass/ raise/' /usr/lib64/python3.6/site-packages/rpm/__init__.py after installation.

Yeah, and this is still an improvement, since none of the existing python-rpm versions require this. All old versions include the unversioned .so files, and all new versions have just one extension, _rpm, which works. But still, it's agonizing to be so close to a robust solution and then have to settle.

One thing I toyed with was modifying importlib.machinery.EXTENSION_SUFFIXES. If you make that the union of the list from both interpreters, you can use pkgutil to inspect all modules cleanly:

>>> importlib.machinery.EXTENSION_SUFFIXES
['.cpython-36m-x86_64-linux-gnu.so', '.cpython-311-x86_64-linux-gnu.so', '.abi3.so', '.so']
>>> [print(i) for i in pkgutil.iter_modules(['/usr/lib64/python3.6/site-packages/rpm'])]
ModuleInfo(module_finder=FileFinder('/usr/lib64/python3.6/site-packages/rpm'), name='_rpm', ispkg=False)
ModuleInfo(module_finder=FileFinder('/usr/lib64/python3.6/site-packages/rpm'), name='_rpmb', ispkg=False)
ModuleInfo(module_finder=FileFinder('/usr/lib64/python3.6/site-packages/rpm'), name='_rpms', ispkg=False)
ModuleInfo(module_finder=FileFinder('/usr/lib64/python3.6/site-packages/rpm'), name='transaction', ispkg=False)

So you could use that to load _rpmb and _rpms dynamically. But you would have to assume that all modules should be imported. It's really too bad the actual loader machinery doesn't care what's in EXTENSION_SUFFIXES. If it did, we'd be home free. It's clear python just does not want you to use anything but native suffixes, haha.

@ian-hurst ian-hurst changed the title Automatically import binary extensions with unexpected suffixes. Automatically import binary extensions whose suffixes include the python version Oct 25, 2024
Copy link
Contributor

@nforro
Copy link
Member

nforro commented Oct 25, 2024

Ok, let me merge this. Thanks again for your contribution.

@nforro nforro added the mergeit Zuul, merge it! label Oct 25, 2024
Copy link
Contributor

Build succeeded (gate pipeline).
https://softwarefactory-project.io/zuul/t/packit-service/buildset/a0075131843e41f595a87c4eefd25551

✔️ pre-commit SUCCESS in 1m 25s

Copy link
Contributor

Pull request merge failed: Resource not accessible by integration, You may need to manually rebase your PR and retry.

@ian-hurst
Copy link
Contributor Author

Ok, let me merge this. Thanks again for your contribution.

My pleasure, thank you so much for taking this change, and for walking me through your process

@nforro nforro merged commit 15c2742 into packit:main Oct 25, 2024
8 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
mergeit Zuul, merge it!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants