Add support for package url fields #7
Package URLs are useful for referring to packages in other ecosystem namespaces, and for being referred to from them. With this, we can introduce a packaging- and ecosystem-agnostic way to refer to CPAN packages.
Is the |
I added it to make it easier for clients/tooling in other package ecosystems (without knowledge of the inner workings of CPAN) to put together a valid download URL from the supplied package URL. |
in this commit, I still get a few failing tests:
|
I think there are other concerns as well for generating URLs to artifacts directly from purls, like:
If AUTHOR, DISTRIBUTION, VERSION map to one unique downloadable artifact, then lookups could be done against some index like 02packages or MetaCPAN that knows the correct artifact URL. Some examples that might need to be considered:
|
I'd also love support for checksum qualifiers identifying the artifact a purl resolves to: Like: |
CPAN::DistnameInfo is (afaik) only a producer of PURLs, and hence is limited to things with the fields it can extract from a module's full path on CPAN (e.g. A discussion around a PURL checksum field probably belongs in the https://github.com/giterlizzi/perl-URI-PackageURL issue tracker? |
I don't think fetching things online is an option available for this module. As I see it, the purpose of this module is to pick out all necessary bits of the distro path so these can be used for something else. My adding a In this regard, I'm thinking our task is to make sure we can extract everything we can from the provided path (however weird it is), and then provide the necessary fields so we can reproduce it (with the
The full URL is not possible to create with the information available to CPAN::DistnameInfo as it is. The best we can do is to try and produce the path (which later can be consumed by eg. cpan(1)) and let this consumer pick a server. The spec itself needs to support a hostname though, but that's not a conversation for this module. Other than the server part, I think your
This isn't tested for, it seems. I'll add one and see what happens. I don't think the
This isn't tested in (Strictly speaking, I'd argue it's time for crazy filenames to be stopped in PAUSE 😠 ) |
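The idea of producing only the path and letting the consumer pick a server can be sketched as follows. This is a minimal illustration, not part of CPAN::DistnameInfo: the function names are hypothetical, the `authors/id/X/XY/AUTHOR/` layout is taken from the test path quoted in this thread, and `mirror.example` is a placeholder host.

```python
def cpan_author_path(author: str, distvname: str, ext: str = "tar.gz") -> str:
    """Build the mirror-relative path for a release file (hypothetical helper)."""
    pauseid = author.upper()
    if len(pauseid) < 2:
        raise ValueError("PAUSE ID must have at least two characters")
    # Layout: authors/id/<first letter>/<first two letters>/<PAUSE ID>/<file>
    return f"authors/id/{pauseid[0]}/{pauseid[:2]}/{pauseid}/{distvname}.{ext}"


def artifact_url(mirror: str, path: str) -> str:
    """Join a consumer-chosen mirror base URL with the relative path."""
    return mirror.rstrip("/") + "/" + path


path = cpan_author_path("MINGYILIU", "Bio-ASN1-EntrezGene-1.10-withoutworldwriteables")
url = artifact_url("https://mirror.example/CPAN/", path)  # placeholder mirror
```

The server part stays entirely outside the first function, which mirrors the separation argued for above.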
... or, maybe CPAN or MetaCPAN could serve assets directly from URLs that are more friendly to PURLs. This rests on the assumption that the AUTHOR, DISTRO, VERSION triple always resolves to the same artifact, of course. A benefit of this would be that a URL to the asset could be derived directly from the purl. I'm imagining something like:
Which could return the artifact directly like:
I'm thinking this could be possible to implement with some map in nginx, for instance. |
For www.cpan.org, that would involve a lot of work. It is currently a fully static site, where supporting URLs like that would involve some kind of index lookup. And doing a lookup is complicated more by the fact that in PAUSE, there isn't any data stored connected to a distribution. It's something MetaCPAN could provide though. |
The main point of package URLs (as far as I understand) is to make it possible to refer to packages from one ecosystem in another. How these URLs are resolved is entirely up to that package system's tooling, e.g. I guess these are capable of downloading 02packages and doing whatever is necessary to figure out which release the pkg URL refers to. If we keep a strict separation of concerns in mind, then I suggest our task here to be this:
CPAN Distro -> PackageURL
To me, it seems the "hard" bits are what to do with the crazy stuff in 1.4 and 1.7, and I guess this can be resolved by taking a quick look at how the clients do their disambiguation and just adding some exceptions to the CPAN::DistnameInfo code. All the problematic distros seem to be in BackPAN, so I guess we can safely assume no more funky filenames are likely to be uploaded to CPAN? This leaves only the situation where a dev produces crazy filenames on their own DarkPAN mirror... I'm thinking this can be somewhat avoided by adding some sanity checks to whatever tooling manages these? Also, I'm starting to think some of this is relevant to package-url/purl-spec#155 and maybe giterlizzi/perl-URI-PackageURL#2 ?
PackageURL -> CPAN Distro
Finally, there's the question of verifying that a given PackageURL actually identifies the correct CPAN distro. This is especially important when producing SBOM objects, where correctly identifying the source of the software is critical. This is probably not a task for CPAN::DistnameInfo though. |
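As a rough illustration of the "CPAN Distro -> PackageURL" direction, the sketch below assembles a purl string from the AUTHOR, DISTRIBUTION, VERSION triple. It is not this PR's implementation: the function name is hypothetical, the `pkg:cpan/AUTHOR/Dist@version` layout with an upper-cased author is an assumption, and the `ext` qualifier follows the rule from this PR's commit notes (only emitted when it differs from `tar.gz`).

```python
from urllib.parse import quote


def cpan_purl(author: str, dist: str, version: str, ext: str = "tar.gz") -> str:
    """Assemble a purl string for a CPAN distribution (assumed layout)."""
    # Percent-encode each component; letters, digits and "-._~" pass through.
    namespace = quote(author.upper(), safe="")
    name = quote(dist, safe="")
    purl = f"pkg:cpan/{namespace}/{name}@{quote(version, safe='')}"
    # Assumption: the extension is only recorded when it is not the default.
    if ext != "tar.gz":
        purl += f"?ext={quote(ext, safe='')}"
    return purl


default_ext = cpan_purl("MINGYILIU", "Bio-ASN1-EntrezGene", "1.10")
other_ext = cpan_purl("mingyiliu", "Bio-ASN1-EntrezGene", "1.10", ext="zip")
```

Going the other way (PackageURL -> CPAN Distro) needs an index lookup, which is why it is kept out of this sketch.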
This is a very interesting discussion, which cuts across different areas such as packaging, SBOM security, etc. For example:
Using the
Supported "qualifiers" for CPAN for the moment I only added |
@giterlizzi good you're working on this! 😄 One thought – I'd like to suggest that we (the Perl Toolchain Gang + the CPAN Security WG + yourself, if you're up for it) make an effort to add purl support to as many of the CPAN clients as we can. Since the SBOM thing has come to stay due to the CRA and NIS2 directives coming to the EU in the coming year, and PURLs are a central component in these, I'm thinking we might as well make the necessary changes to make it a first-class citizen in the CPAN/Perl world. :-) Would you be up for that? 😁 |
Would you mind using sha256 checksums by default, btw? sha1 has been considered completely unsafe since 2017. |
Even better, use sha3. Sha256 is yet another MD-based narrow-pipe hash whose predecessors are all broken. Might as well switch to a wide-pipe hash function instead. |
Totally agree ;) |
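For reference, attaching a checksum qualifier to a purl could look roughly like the sketch below. It assumes the `checksum=<algorithm>:<hex digest>` qualifier form and uses Python's standard `hashlib`; the function name is hypothetical, and the sketch neither sorts qualifiers nor merges an existing `checksum` entry.

```python
import hashlib


def with_checksum(purl: str, artifact: bytes, algorithm: str = "sha256") -> str:
    """Append a checksum qualifier computed over the artifact bytes."""
    # hashlib.new raises ValueError for algorithm names it does not know.
    digest = hashlib.new(algorithm, artifact).hexdigest()
    # Use "&" when the purl already carries qualifiers, "?" otherwise.
    separator = "&" if "?" in purl else "?"
    return f"{purl}{separator}checksum={algorithm}:{digest}"


# Empty input used only so the digest is easy to verify independently.
example = with_checksum("pkg:cpan/MINGYILIU/Bio-ASN1-EntrezGene@1.10", b"")
```

Making the algorithm a parameter keeps the default at sha256 while leaving room for the sha3 family, which `hashlib` exposes under names such as `sha3_256`.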
- Add "fullversion" parameter
- Don't add ext field to the pkgurl unless it's different than tar.gz
- Add support for specifying t/path.t tests as TODO or SKIP
- Use fullversion to produce pkgurl versions
- Skip CGI.pm test, as it's only available on BackPAN
- Skip Bio-ASN1-EntrezGene-1.10-withoutworldwriteables test, as it's only available on BackPAN
Some reasons I can think of to use SHA2 (at the moment):
|
- Add tests for "Strange" versions discovered at SuSE by Tina Müller
|
CPAN/authors/id/M/MI/MINGYILIU/Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz
filename Bio-ASN1-EntrezGene-1.10-withoutworldwriteables.tar.gz
dist Bio-ASN1-EntrezGene
maturity released
distvname Bio-ASN1-EntrezGene-1.10-withoutworldwriteables
version 1.10
fullversion 1.10-withoutworldwritables
Typo in "fullversion" (1.10-withoutworldwritables --> 1.10-withoutworldwriteables)
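How the dist, version and fullversion fields relate to distvname in the test data can be sketched as below. This is a deliberately simplified stand-in, not CPAN::DistnameInfo's actual parsing rules (which handle many more odd cases); the regex and the function name are assumptions made for illustration.

```python
import re

# Simplified assumption: the version starts at the first "-<digit>" (or "-v<digit>")
# and any trailing "-suffix" belongs to fullversion only.
_DISTVNAME = re.compile(r"^(?P<dist>.+?)-(?P<version>v?\d[\w.]*)(?P<suffix>-.+)?$")


def split_distvname(distvname: str) -> dict:
    """Split a distvname into dist, version and fullversion (hypothetical helper)."""
    match = _DISTVNAME.match(distvname)
    if match is None:
        # No recognisable version part: return the name unchanged.
        return {"dist": distvname, "version": None, "fullversion": None}
    version = match.group("version")
    return {
        "dist": match.group("dist"),
        "version": version,
        "fullversion": version + (match.group("suffix") or ""),
    }


fields = split_distvname("Bio-ASN1-EntrezGene-1.10-withoutworldwriteables")
```

With this input, `fullversion` keeps the `-withoutworldwriteables` suffix exactly as spelled in the filename, which is the value the test data should carry.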
Implements #6