PHEP 3: PyHC Python & Upstream Package Support Policy #29

Open: wants to merge 16 commits into base: main

Conversation

sapols
Contributor

@sapols sapols commented Jun 6, 2024

Overview

This is the initial draft of PHEP 3, which proposes adopting a Python version & upstream package support policy for the PyHC ecosystem, inspired by SPEC-0. The goal is to standardize the support duration for Python versions and popular packages across all PyHC packages, ensuring a balance between stability and the incorporation of new features.

Specifically, this PHEP recommends that projects:

  1. Support Python versions for at least 36 months (3 years) after their initial release.
  2. Support upstream Scientific Python packages for at least 24 months (2 years) after their initial release.
  3. Adopt support for new versions of these dependencies within 6 months of their release.

The upstream Scientific Python packages are: numpy, scipy, matplotlib, pandas, scikit-image, networkx, scikit-learn, xarray, ipython, zarr.
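
For illustration, the support windows above can be checked with a few lines of date arithmetic. This is only a minimal sketch, not part of the PHEP: the helper names (`add_months`, `in_support_window`) and the chosen "today" are assumptions made for the example, and the dates in the table are the CPython x.y.0 release dates.

```python
import calendar
from datetime import date

# Minimum support durations proposed in this PHEP (taken from SPEC 0).
PYTHON_SUPPORT_MONTHS = 36
PACKAGE_SUPPORT_MONTHS = 24


def add_months(start, months):
    """Return `start` shifted forward by `months`, clamping the day if needed."""
    total = start.year * 12 + (start.month - 1) + months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    day = min(start.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def in_support_window(released, today, months):
    """True while `today` is earlier than `released` plus `months`."""
    return today < add_months(released, months)


# CPython feature-release (x.y.0) dates, used here as example input.
python_releases = {
    "3.9": date(2020, 10, 5),
    "3.10": date(2021, 10, 4),
    "3.11": date(2022, 10, 24),
    "3.12": date(2023, 10, 2),
}

today = date(2024, 6, 6)  # assumed evaluation date for the example
supported = [
    version
    for version, released in python_releases.items()
    if in_support_window(released, today, PYTHON_SUPPORT_MONTHS)
]
print(supported)  # ['3.10', '3.11', '3.12']
```

The list is the minimum set a package should support on that date; nothing in the policy prevents a package from supporting older versions for longer.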

This policy aims to replace the current standard #11, which mandates only Python 3 support, with a more structured timeline that supports consistent and predictable maintenance across the community.

This closes Issue #21.

Renders

Rendered current text of the PHEP

Render of PHEP before scope was expanded to include upstream packages

Inspiration

This PHEP was inspired by the Python version support policies listed in:

Open questions and comments

  • What should go in the "How to Teach This" section? Should we expand on the ideas already there or take it a different direction?

Resolved questions and comments

  • We decided to use SPEC 0's 36-month policy instead of NEP 29's 42 months, because SPEC 0 officially supersedes NEP 29.
  • We decided this policy has to be a "should" not a "must."
  • We decided against a firm "drop" policy (e.g. requiring bumping python_requires to 3.X) in favor of softer language that allows packages to support older dependencies for longer if they want.
  • We decided to expand the scope of this PHEP to include upstream package support from SPEC 0, rather than save such a policy for a potential future PHEP (it originally only described a policy for minor versions of Python).
  • Use one line per sentence in the file to improve git diff/commentability

@sapols changed the title from "Initial draft of new PHEP" to "PHEP 3: PyHC Python Support Policy" Jun 6, 2024
@sapols sapols marked this pull request as ready for review June 6, 2024 23:24
@jameswilburlewis
Collaborator

jameswilburlewis commented Jun 7, 2024

For Python versions that age out of the proposed support window -- how firm is the expectation that package maintainers will drop support for the old Python release, in the case where there are no known incompatibilities? Could that take the form of documentation stating "Recommended Python version >= 3.X, but still works under Python 3.Y as of this writing", or would you want us to take more definitive action (bump python_requires to 3.X)? For example, if someone depends on a non-PyHC package that wants an older Python release, it could be a problem for them to upgrade Python to continue using PyHC packages.

I've read some of the discussion around NEP 29, and I see the merit in the arguments about "who's going to take the plunge first and bump their package requirements?", and general community cohesion and predictability. Just wondering what the repercussions might be, in the event one of these messy real-world edge cases collides with what is otherwise sound policy.

@sapols
Contributor Author

sapols commented Jun 7, 2024

@jameswilburlewis that's an important question I'm wrestling with myself. I know some core packages like PlasmaPy and SunPy already go as far as bumping requires-python = ">=3.10" (as is strictly suggested in NEP 29). But I'm open to feedback here.

@jtniehof
Contributor

Just commenting and not formal review yet, since I think we're a bit more in a "discussion" phase than wordsmithing.

As far as I can tell, PHEP 1 doesn't explicitly require the editor be distinct from the author, but I'd think it would generally be a good idea.

I'd like to suggest expanding the scope to close #21: packages probably should be able to think about Python and other dependencies in the same context even if the principles are slightly different. I appreciate trying to keep scope reasonable but these seem interconnected to me.

I really dislike the "everything not compulsory is forbidden" nature of SPEC 0. I don't think forcing our users to upgrade dependencies is a good idea. And given the difficulties with HelioCloud, we should probably err on being looser with "permitted" versions than tighter. This isn't something like Python 2 where a dedicated "kill the beast" plan was in order.

So here's the sort of thing I'd like to see:

  1. Packages must have a description of their dependency version policy, e.g. PlasmaPy, SpacePy
  2. Packages must support dependencies at least up to the timelines of SPEC 0, i.e. at the time a package version is released, it should support Python feature versions (x.y.0) released in the previous 36 months and feature versions of other dependencies released in the previous 24 months; for dependencies that do not use semantic versioning, simply versions released in the previous 24 months. (The specific numbers should be in this PHEP, with the note, as Shawn has, that it's inspired by SPEC 0).
  3. Packages may drop support immediately after those times, or may choose to continue support after, potentially in a reduced capacity.
  4. Packages that use semantic versioning should consider using their version number to indicate versions that drop support for older dependencies.
  5. There is no expectation (not even a "should") that a package "deprecate" an older dependency before dropping support for it.
  6. Packages must explicitly support (and test for) new versions of dependencies within six? twelve? months of their release. (This doesn't mean CI tests going into all eternity, just that it's been verified to work and will install).
  7. Packages which specify a maximum version number for dependencies must (terrible wording) use a carefully selected maximum, not merely specifying the current release as a maximum. (Also potentially some wording about being more aggressive about updating releases when dependencies are released?) Suggested policies include:
    a. Specifying the release after the current as the maximum, e.g. if numpy 1.26 is the current release, specify numpy<1.28. This should usually be reasonable if the package is clean of deprecation warnings and the dependency has a deprecation
    b. For dependencies using semantic versioning, specify a version that is likely to have breaking changes based on the version number, e.g. if numpy 1.26 is current, specify numpy<2.
    c. I'm sure people can come up with others
  8. Packages should test against release candidate versions of dependencies to facilitate support for future versions. Testing in CI is encouraged but ad-hoc testing is acceptable; testing against earlier pre-releases is also encouraged.
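
To make the two suggested maximum-version policies in item 7 concrete, here is a rough sketch that derives both caps from a current release number. The function name `upper_bounds` and the dictionary keys are hypothetical, and it assumes a simple `major.minor` version string; it is not a recommendation to add caps.

```python
def upper_bounds(name, current):
    """Return candidate upper-bound requirement strings for a dependency.

    `current` is the current release as "major.minor" (extra parts are ignored).
    """
    major, minor = (int(part) for part in current.split(".")[:2])
    return {
        # Policy (a): allow the release after the current one, exclude the one after that.
        "skip_one_release": f"{name}<{major}.{minor + 2}",
        # Policy (b): exclude the next major version, where breaking changes are likely.
        "next_major": f"{name}<{major + 1}",
    }


bounds = upper_bounds("numpy", "1.26")
print(bounds["skip_one_release"])  # numpy<1.28
print(bounds["next_major"])        # numpy<2
```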

I can make edit suggestions to flow into Shawn's writing, but figured kicking the ideas around for a bit first would make sense. If any of these prove really controversial, we can just drop it out of the scope.

tldr: support for a reasonable amount of time. Be clear to your users. Don't leave your package uninstallable.

@jtniehof jtniehof mentioned this pull request Jun 11, 2024
@nabobalis

> I really dislike the "everything not compulsory is forbidden" nature of SPEC 0. I don't think forcing our users to upgrade dependencies is a good idea. And given the difficulties with HelioCloud, we should probably err on being looser with "permitted" versions than tighter. This isn't something like Python 2 where a dedicated "kill the beast" plan was in order.

SPEC 0 is the high-level plan from the broader scientific Python community. I don't see the need to be separate from that push; we rely on all of their packages. Reducing the scope of what we need to support reduces the burden on all package maintainers within PyHC.

We should also be telling users to create separate environments for each piece of work and that way can avoid pitfalls of updates breaking or messing with their current code or environment.

> Packages must explicitly support (and test for) new versions of dependencies within six? twelve? months of their release. (This doesn't mean CI tests going into all eternity, just that it's been verified to work and will install).

Typically for sunpy since we test with upstream on a cron job schedule, we don't need to worry about at least a smaller subset of package updates.

We don't test the full suite so package updates that do break, will and do slip through, so we still have to patch and release at times for those.

The main bottleneck is typically new python versions since we have a large dependency stack and we need to wait for those to explicitly support that python version but we try to push towards 3-6 months after release. Thankfully more core packages are testing sooner with python versions and their RCs so that timeframe is getting shorter.

>   7. Packages which specify a maximum version number for dependencies must (terrible wording) use a carefully selected maximum, not merely specifying the current release as a maximum. (Also potentially some wording about being more aggressive about updating releases when dependencies are released?) Suggested policies include:
>     a. Specifying the release after the current as the maximum, e.g. if numpy 1.26 is the current release, specify numpy<1.28. This should usually be reasonable if the package is clean of deprecation warnings and the dependency has a deprecation
>     b. For dependencies using semantic versioning, specify a version that is likely to have breaking changes based on the version number, e.g. if numpy 1.26 is current, specify numpy<2.
>     c. I'm sure people can come up with others

I am hesitant to suggest max pinning of packages unless the package itself suggests it. In the numpy case, due to the massive set of changes in the coming 2.0 release, it makes sense, and it's pretty common in the Sphinx world due to how often they can break items in a release.

But in my view, pinning either a max or a specific version should be discouraged unless you have really specific requirements in your package.

>   8. Packages should test against release candidate versions of dependencies to facilitate support for future versions. Testing in CI is encouraged but ad-hoc testing is acceptable; testing against earlier pre-releases is also encouraged.

Ideally packages should add something like a weekly or monthly cron job to test against the "main" version of the core set of dependencies they use. It won't need to be all of them, but it should at least cover the install dependencies.

I don't think that ad-hoc testing is good enough for this, especially with how fast the Python ecosystem moves.

@sapols
Contributor Author

sapols commented Jun 13, 2024

Thank you for the thoughtful comments, @jtniehof. And I appreciate the view from SunPy, @nabobalis! To what @jtniehof said, I definitely think it's best for PyHC's long-term success if we adopt the dependency version policy from SPEC 0. I was gonna push for it eventually, so I started questioning now whether it should be in scope for this PHEP. If people are game I'd like to include it here, but if it'll be a point of contention I'm more on the fence. I like your ideas though, especially having packages explicitly document their version policies. I plan to lead a discussion about this at Monday's telecon where hopefully I can start to get a sense of community consensus. If people seem onboard, I'd welcome and appreciate your edit suggestions. Let's see how people feel in the telecon then go from there?

UPDATE: there ended up not being time to discuss this PHEP last telecon, so we'll have that discussion next telecon in two weeks instead.

@aburrell

I just want to add that there is frequently a need for our science packages to support old versions of Python. One example essential for some of my packages is that they need to work in an operational environment, and we can't make them use a modern version of Python. However, I do think it's reasonable to request users ensure their code works with the actively supported, non-beta versions of Python.

@rebeccaringuette

While I happily give kudos on this for a step towards purposeful interoperability, I must throw a word of caution in here against unfunded mandates. We have a careful line to walk here between requiring things and not funding them. As such, I would not want to see dependency requirements ("must") be added to the lowest level of PyHC packages, especially the "you get listed on our webpage" level. My suggestion here is that the Python version support requirements be a requirement for the package level above "you get listed on our webpage" and all higher levels. Other dependency requirements (e.g. numpy versions) be a "should" at that level. For the next level up (2 levels above "you get listed on our webpage"), those "should" dependency requirements become required.
We also need consequences spelled out for what happens when a requirement is not complied with, but perhaps that should go in the PHEP on PyHC package levels and not this one.

@jtniehof
Contributor

@nabobalis, I totally agree we should work in the framework and timelines of SPEC 0. But requiring packages to drop support for old versions instead of allowing it is, IMO, overly prescriptive. Within minimum bounds, packages can make their own decisions about the tradeoffs of supporting old versions for their users vs. maintenance burden. As @aburrell points out, there are many environments where jumping to the latest is not always practicable, and these are often users that fund some of this work.

As far as "must" vs. "should", I think it makes sense to have some granularity above the PHEP level. So it might be "PHEP x has must a, b, c" but we don't require full compliance with PHEP x for being a core package (or a listed package, or whatever). This of course interacts with the question of exactly how we tier packages...

@nabobalis

> @nabobalis, I totally agree we should work in the framework and timelines of SPEC 0. But requiring packages to drop support for old versions instead of allowing it is, IMO, overly prescriptive. Within minimum bounds, packages can make their own decisions about the tradeoffs of supporting old versions for their users vs. maintenance burden. As @aburrell points out, there are many environments where jumping to the latest is not always practicable, and these are often users that fund some of this work.

I think that's totally fair. In that case we should turn this PHEP into a more relaxed version:

  • Try to support new Python releases within a time frame.
  • Support older versions of Python based on the package maintainers' needs.

If the PHEP is just that, I guess that's more informational than a requirement/standard?

@nabobalis

> While I happily give kudos on this for a step towards purposeful interoperability, I must throw a word of caution in here against unfunded mandates.

While this is a great point, I would say that the maintenance of a package is almost always unfunded. This is a problem with almost any library: it requires the dedicated time of a small group of maintainers or community contributions to keep a package ticking along.

I would personally argue that if you want to release and advertise your code/library/package, then support from the authors/maintainers, including making sure the package is kept in a working state (meaning checking support for newer dependencies and Python versions, package metadata changes as the ecosystem moves, etc.), is the bare minimum required. This would normally be unfunded work, and I have little knowledge about what funding opportunities are available for this type of maintenance.

@sapols changed the title from "PHEP 3: PyHC Python Support Policy" to "PHEP 3: PyHC Python & Upstream Package Support Policy" Jul 3, 2024
@sapols
Contributor Author

sapols commented Jul 3, 2024

Okay! I just pushed a change that I believe incorporates all the feedback from the comments here, while also expanding the scope of the PHEP to include the upstream package support policy from SPEC 0. The "drop" policy language has been softened to allow packages to continue supporting older versions if they choose to. The upstream packages touched by this new policy are clearly defined. Further recommendations have been added to the "Specification" section.

If I missed something obvious please yell at me. Otherwise we're moving into more word-smithy territory now and I'd appreciate nit-picky wording comments and other such things. Also still seeking feedback about the How to Teach This section.

@jameswilburlewis
Collaborator

I have a thought, which may or may not be in scope for this PHEP... The theme here seems to be promoting interoperability by setting expectations on how package maintainers will deal with upstream dependencies, specifically Python itself, and PyHC-core or scientific-python-core packages. But there are other considerations that come into play, especially for packages that supply binary wheels: OS versions and CPU architectures. New OS releases and new CPU architectures (e.g. Mac Intel -> Apple Silicon M1, M2, M3 etc) can both trigger a need to recompile non-Python library code.

Should we have a policy on a timeline for supporting new OS releases or new CPU architectures with compatible wheels for PyHC packages? If I'm going to introduce a dependency on some other PyHC package for the sake of having a common way to handle coordinate systems, times, units, etc., I would hate to have a situation where installing my package on the hot new platform requires end users to compile their own C (or, God forbid, FORTRAN) libraries because binary wheels aren't available yet...

@jtniehof
Contributor

jtniehof commented Jul 3, 2024

Good point, @jameswilburlewis. We don't have any explicit statement requiring binary wheels right now and that feels out of scope for this discussion, but where we're talking about supporting new versions of Python in a timely manner, that seems to put issues like OS and arch in-scope.

Maybe just wording that suggests support for the new must be at the same level as for the old--so if a package never does binary wheels (or conda, say) that's "okay" and users and potential downstream dependencies can make their decision, but if they've been releasing binary wheels people can reasonably rely on that in the future?

@sapols
Contributor Author

sapols commented Jul 9, 2024

I'm tempted to say that OS/architecture support is mostly out of scope here. The crux of this PHEP is really just "PyHC is jumping on the SPEC 0 bandwagon". Plus we already have PyHC standard 4: Operating System Support: Packages must strive to support all major operating systems (e.g., OS X, Linux, Windows).

@jameswilburlewis @jtniehof Would it be sufficient to simply add a sentence like "Additionally, if a package has been releasing binary wheels, this support should continue for new OS versions and CPU architectures to maintain the same level of support as for previous environments."?

@jameswilburlewis
Collaborator

@sapols Sounds good to me!

@jtniehof
Contributor

jtniehof commented Jul 9, 2024

> @jameswilburlewis @jtniehof Would it be sufficient to simply add a sentence like "Additionally, if a package has been releasing binary wheels, this support should continue for new OS versions and CPU architectures to maintain the same level of support as for previous environments."?

I might be even less specific: "packages should support new OS versions and CPU architectures to the same level as for previous environments". So whatever you were doing before, thou shalt do now. Up to the point of reason, of course...I don't deliver installer .exes for SpacePy anymore.

@sapols
Contributor Author

sapols commented Jul 9, 2024

Nice. With that change the paragraph becomes:

"PyHC packages should clearly document their dependency version policy (e.g., like PlasmaPy and SpacePy) and be tested against the minimum and maximum supported versions. Testing with CI against release candidates is encouraged, too, as a way to stay ahead of future releases. Packages that use semantic versioning should consider using their version number to indicate versions that drop support for older dependencies. There is no expectation that a package "deprecate" an older dependency before dropping support for it. However, there is an expectation that maximum or exact requirements (e.g., numpy<2 or matplotlib==3.5.3) be set only when absolutely necessary (and that GitHub issues be immediately created to remove such requirements). Additionally, packages should support new OS versions and CPU architectures to the same level as previous environments."

Is it clear what we mean by that without explicitly calling out binary wheels etc? I like how succinct it is, just wanna make sure it's clear too.
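
As an aside on the maximum/exact requirement wording in that paragraph, a maintainer could flag such requirements with something like the sketch below. It is a simplified, hypothetical helper (`restrictive_pins`) that only pattern-matches the common comparison operators; it is not a full parser for Python version specifiers, and a real project would more likely rely on the `packaging` library.

```python
import re

# Comparison operators followed by a version, e.g. "<2" or "==3.5.3".
_SPECIFIER = re.compile(r"(===|==|~=|!=|<=|>=|<|>)\s*([^,;\s]+)")

# Operators that cap or freeze the allowed version (assumed scope of this check).
_CAPPING_OPERATORS = {"<", "<=", "==", "==="}


def restrictive_pins(requirement):
    """Return the maximum or exact version clauses found in a requirement string."""
    specifier_part = requirement.split(";", 1)[0]  # ignore environment markers
    return [
        operator + version
        for operator, version in _SPECIFIER.findall(specifier_part)
        if operator in _CAPPING_OPERATORS
    ]


for requirement in ["numpy<2", "matplotlib==3.5.3", "scipy>=1.10", "pandas>=1.5,<3"]:
    print(requirement, restrictive_pins(requirement))
```

Anything this reports would be a candidate for a tracking issue to remove the restriction later.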

@namurphy
Contributor

namurphy commented Jul 9, 2024

> Is it clear what we mean by that without explicitly calling out binary wheels etc? I like how succinct it is, just wanna make sure it's clear too.

Perhaps we could add a link to the corresponding page in the Python documentation or PyPA's packaging guide?


@Cadair Cadair left a comment

Please use one line per sentence; it makes it so much easier to git diff and comment on.

2. Support upstream Scientific Python packages for at least **24 months** (2 years) after their initial release.
3. Adopt support for new versions of these dependencies within **6 months** of their release.

The upstream Scientific Python packages are: `numpy, scipy, matplotlib, pandas, scikit-image, networkx, scikit-learn, xarray, ipython, zarr`.

Why this list of packages?

I know that they are the SP core packages, but why is that relevant to PyHC?

Contributor Author

Simply because they're the packages controlled by SPEC 0. I wouldn't call it a comprehensive list of upstream packages PyHC cares about, but it's many of them and a great start. I was avoiding extending beyond the bounds of SPEC 0.


# Rationale
<a name="rationale"></a>
Following [SPEC 0](https://scientific-python.org/specs/spec-0000/)'s 24/36-month support timeline keeps PyHC in better sync with the broader Scientific Python community, maintaining compatibility with newer Python features and key upstream dependencies, while providing adequate time for package maintainers to adapt. Allowing 6 months to adopt new versions ensures packages stay current with development cycles while providing a reasonable timeframe for testing and integration.

I think it should be worth adding to the rationale that reducing the number of versions supported by projects reduces the maintenance burden. It's almost impossible to test oldest supported versions of all packages in a pre-SPEC 0 land.

SunPy basically used to bump our minimum versions when our CI failed, since adopting NEP 29 / SPEC 0 we haven't had to do that at all. So we have a lot more confidence that our code actually works with a concrete set of minimum deps which are clear to everyone.

Contributor

@namurphy namurphy Jul 11, 2024

> I think it should be worth adding to the rationale that reducing the number of versions supported by projects reduces the maintenance burden.

💯 Absolutely right! This is personally my main motivation for adopting SPEC 0.

> It's almost impossible to test oldest supported versions of all packages in a pre-SPEC 0 land.

🦀 uv has resolution strategies for --lowest and --lowest-direct which makes it pretty straightforward nowadays to test against the oldest supported versions of either all dependencies or direct dependencies. (We use --lowest-direct for PlasmaPy because some dependencies don't list minimum required versions of their dependencies.)

> SunPy basically used to bump our minimum versions when our CI failed, since adopting NEP 29 / SPEC 0 we haven't had to do that at all. So we have a lot more confidence that our code actually works with a concrete set of minimum deps which are clear to everyone.

Adopting NEP 29 / SPEC 0 has worked really well for PlasmaPy too in much the same way!

Contributor Author

@sapols sapols Jul 16, 2024

The last sentence of the Motivation section immediately preceding this Rationale one is:

> Additionally, limiting the scope of supported versions is an effective way for packages to limit maintenance burden while promoting interoperability.

We probably don't need to say that twice in back-to-back sections. @Cadair @namurphy would you prefer I move that sentence from Motivation to Rationale? Or leave it as-is?

Contributor

Good point! Personally, I'll be happy with it so long as it's mentioned in at least one place in the document.

Contributor Author

My recent push kept it in the same place. @Cadair if you'd like me to move it under Rationale instead I happily will.

@Cadair

Cadair commented Jul 11, 2024

Another thought from #31:

Why don't we add a line or something here strengthening the maximum or hard pin requirement to say that "you MUST not require versions of any dependency older than 24 months?" This would go a long way to removing conflicts when trying to install all compliant packages in the same env?

…rt; remove "indirectly" & "GitHub"; no pinning outdated deps
@sapols
Contributor Author

sapols commented Jul 17, 2024

Okay I pushed another round of changes that capture all comments added since the last push:

  • Started using one line per sentence in the file to improve the git diff/commentability
  • Added language clarifying we're adopting SPEC 0
  • Added a sentence about OS version/CPU architecture support
  • Removed the words "indirectly" and "GitHub"
  • Added language clarifying that maximum/exact version pins shouldn't force dependencies older than 24 months

I think this document is looking pretty strong now. I still want feedback on the "How to Teach This" section (basically, do people like the two ideas in there already and should I expand them, or does someone have a better idea?). Otherwise, we may be approaching a point where I could see putting this to a first vote.

8 participants