Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Manage dependencies with PDM #250

Merged

Conversation

vbaltrusaitis-reef
Copy link

@vbaltrusaitis-reef vbaltrusaitis-reef commented Feb 1, 2024

This is a draft PR for two reasons:

Changes:

  • pdm is used to manage dependencies, test and build b2.
  • Docker image and bundles now use locked dependencies, providing consistency in tests vs publishing.
  • All tests use locked dependencies. There is an unfortunate consequence - b2sdk is no longer installed from git master in CI. It would be possible to install b2sdk from master on top of locked dependencies but there are tradeoffs here:
    • If we use b2sdk from master for everything, there's no indication that CLI depends on unreleased features of b2sdk. If developed CLI features depend on new b2sdk features, we will need to make a commit that updates pyproject.toml/pdm.lock anyways before release.
    • With pdm, it seems natural to pdm add b2sdk @ git+https://github.com/Backblaze/b2-sdk-python while developing a feature together with sdk.
    • If we're concerned with ensuring that new b2sdk are not somehow breaking the CLI, it might be better to resurrect Run CLI tests in CI Backblaze/b2-sdk-python#260
    • It would make sense to run tests on both latest release and master of b2sdk but that would blow up our already long-running workflow (I think tests are parallelized only up to some number of workers).

jobs:
post_create_environment:
- pip install pdm
- pdm export --format requirements --prod --group doc --output requirements-doc.txt --no-hashes

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why no hashes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a non-PyPI dependency (like a dependency on a git repo), pip install -r requirements.doc will error out:

ERROR: Can't verify hashes for these requirements because we don't have a way to hash version control repositories:
    b2sdk@ git+https://github.com/vbaltrusaitis-reef/b2-sdk-python@ac1e1355f1aaac10a398618214021cc7ecb6a1eb from git+https://github.com/vbaltrusaitis-reef/b2-sdk-python@ac1e1355f1aaac10a398618214021cc7ecb6a1eb (from -r requirements-doc.txt (line 21))

To be fair, non-PyPI dependencies shouldn't get into master (and definitely definitely not release), so this is only a convenience for testing. But for this use-case this seems pretty harmless, I'm now on the fence if I should remove --no-hashes.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

don't think it matters much - this is run on readthedocs side afterall and only for docs generation;
so the worst case (i.e. someone hacked pypi depdency) our docs get defaced - but at that point it would very cheap price to pay for knowing one of our dependencies were hacked :D

ENV PDM_BUILD_SCM_VERSION=${version}
RUN --mount=type=bind,source=./b2,target=/b2/b2 \
--mount=type=bind,source=./pyproject.toml,target=/b2/pyproject.toml \
--mount=type=bind,source=./pdm.lock,target=/b2/pdm.lock \

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

at first I was wondering why not use wheel/tar, but it does make sense to use lock indeed 👍

Copy link
Author

@vbaltrusaitis-reef vbaltrusaitis-reef Feb 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, lock is half the point. I did consider (and try out) a wheel+requirements approach: install dependencies using exported requirement file (pdm export --no-self [...]) and then install a b2 wheel on top of that with --no-deps . This would get rid of pdm within the container, less files to copy, it's nice. But in this case, I feel like I should definitely export requirements with hashes (see comment about the --no-hashes flag in readthedocs config). However, this would fail the docker build when you have b2sdk from git as a dependency. And that makes for an unpleasant development experience in case you're trying out unreleased b2sdk features, you want to make a draft PR that uses them and see if all tests pass.

This was a bit of a dilemma, I feel that exported requirements + wheel is nicer in many ways.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

multi-layered dockerfilerfile is the way to go long term, not necessary now

How big image is after this change? our last b2 image was 170MB, so unless this one is much bigger (>250MB) then I feel we can easily try doing it later down the line.

When using requirements.txt I tried doing multilayer, but other than it being more error prone, the image size didn't differ much at all.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked the image size and yes, it increased by a lot after these changes, to 300Mb. So I implemented a multi-stage build and now it's 146M.

noxfile.py Outdated Show resolved Hide resolved
noxfile.py Outdated Show resolved Hide resolved
pyproject.toml Outdated
dependencies = [
"argcomplete>=2,<4",
"arrow>=1.0.2,<2.0.0",
"b2sdk @ git+https://github.com/vbaltrusaitis-reef/b2-sdk-python",

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this for sure has to be b2sdk>=1.29.0,<2 like it was before

we cannot have pip install b2 users install stuff from git after this PR

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, absolutely. I'll change this before merging and submit for a review once b2sdk SyntaxWarning problem is fixed.

nox -s make_release_commit would detect this dependency and complain. That's, unfortunately, not the same as preventing this from reaching master, but it's something.

Co-authored-by: Maciej Urbański <[email protected]>
Dockerfile.template Outdated Show resolved Hide resolved
@vbaltrusaitis-reef
Copy link
Author

In the last few commits:

Requesting re-review @mjurbanski-reef

@vbaltrusaitis-reef vbaltrusaitis-reef marked this pull request as ready for review February 2, 2024 18:02
# due https://github.com/moby/moby/issues/47021 we cannot have /root/.cache leftover as it causes random errors in CI
RUN --mount=type=bind,source=${tar_path}/${tar_name},target=/tmp/${tar_name} \
pip install --no-cache-dir /tmp/${tar_name}[full] && rm -rf /root/.cache
ENV PYTHONPATH=/b2/pkgs

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/opt/b2 will be slightly better IMO as per https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch03s13.html , but do as you will

@@ -0,0 +1 @@
Use pdm for building, testing and managing dependencies.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think seeing at issue reported in other repo it is worth mentioning here that tests and other superfluous files have been removed from sdist tarball.

@vbaltrusaitis-reef vbaltrusaitis-reef merged commit 0693535 into reef-technologies:master Feb 5, 2024
30 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants