New feature: Package URL support (e.g. a "purl" method?) #6

Open
sjn opened this issue May 13, 2023 · 6 comments
Comments

@sjn

sjn commented May 13, 2023

Hi!

Would it be sensible for this package to support the creation of a Package URL for distros?

For example, a package identified as IBMTORDB2/DBD-DB2-0.99.tar.bz2 might get a PURL like this: pkg:cpan/IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2.

The purpose of this is to have a canonical naming of packages that works across ecosystems and is suitable for use in SBOM (Software Bill of Materials) documents.
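
A minimal sketch of the mapping described above, written in Python purely for illustration. The function name `distfile_to_purl`, the list of archive extensions, and the rule for splitting name from version are all assumptions made for this example; they are not part of this package or of the PURL spec.

```python
import re
from urllib.parse import quote

# Archive suffixes to recognise; an illustrative list, not an exhaustive one.
_EXTENSIONS = ("tar.gz", "tar.bz2", "tgz", "zip")


def distfile_to_purl(path: str) -> str:
    """Turn 'AUTHOR/Dist-Name-1.23.tar.gz' into a 'pkg:cpan/...' string."""
    author, _, filename = path.rpartition("/")
    if not author:
        raise ValueError(f"expected AUTHOR/filename, got: {path}")

    # Strip the archive extension and remember which one matched.
    for ext in _EXTENSIONS:
        if filename.endswith("." + ext):
            stem = filename[: -len(ext) - 1]
            break
    else:
        raise ValueError(f"unrecognised archive extension: {filename}")

    # Assumption: the version is the last hyphen-separated part and starts
    # with a digit (optionally prefixed by 'v'). Real CPAN names have more
    # edge cases than this handles.
    match = re.match(r"^(?P<name>.+)-(?P<version>v?\d[^-]*)$", stem)
    if match is None:
        raise ValueError(f"cannot split name and version: {stem}")

    name, version = match["name"], match["version"]
    return f"pkg:cpan/{quote(author)}/{quote(name)}@{quote(version)}?ext={ext}"


print(distfile_to_purl("IBMTORDB2/DBD-DB2-0.99.tar.bz2"))
```

The point of the sketch is only that the author, name, version, and extension each end up in their own clearly delimited slot.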

@Leont
Member

Leont commented May 14, 2023

For example, a package identified as IBMTORDB2/DBD-DB2-0.99.tar.bz2 might get a PURL like this: pkg:cpan/IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2.

I'm not sure I understand the difference between these two, in particular the reasoning behind it.

@sjn
Author

sjn commented May 14, 2023

The purpose is to create a standard way to refer to dependencies across packaging systems.

For example, with a common schema like this, an SBOM can easily list dependencies from whatever mix of sources was used to put together the application it describes. Maybe there are a few CPAN packages downloaded from the company DarkPAN, a bunch from CPAN, and a few that were already installed with the system perl package (which might come from an RPM repo)... The idea is to make it possible to represent all these sources in a standard and common way.

I guess there may be cases where package URLs can't offer enough nuance (e.g. how do we specify that a DEB package came from an internal mirror?), but I guess this is something we can figure out later.

My hope with this is to open the door for a conversation on how to represent software packages across ecosystems, and the Package URL spec seems to be a good place to start.

The spec helps clarify:

When tools, APIs and databases process or store multiple package types, it is difficult to reference the same software package across tools in a uniform way.

For example, these tools, specifications and API use relatively similar approaches to identify and locate software packages, each with subtle differences in syntax, naming and conventions:

@sjn
Author

sjn commented May 14, 2023

For example, CycloneDX (one SBOM standard worth exploring) needs some standard way to refer to packages when it links security advisories with whatever is installed. PURL is one way, and Software ID (SWID) is another (defined in ISO/IEC 19770-2:2015).

https://cyclonedx.org/use-cases/#known-vulnerabilities

@sjn
Author

sjn commented May 14, 2023

Thinking a little more about how to refer to internal CPAN repos; this may be possible to do with a repository_url=hostname parameter...

pkg:cpan/IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2&repository_url=cpan.org # default repository_url can be skipped
pkg:cpan/IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2&repository_url=internacpan.mycompany.example

(I guess this may require some new input to the new method to work)

Source: https://github.com/package-url/purl-spec/blob/master/PURL-SPECIFICATION.rst
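
One way the repository_url idea could look in code, again as a Python sketch under stated assumptions: `build_purl` and `DEFAULT_REPOSITORY` are hypothetical names, and treating `cpan.org` as the default that gets dropped follows the comment above rather than anything confirmed by the spec for this package type. Qualifiers are emitted sorted by key, which is how the PURL spec describes the canonical form.

```python
from typing import Optional
from urllib.parse import quote

# Assumption taken from the comment above: this host is the default and
# therefore does not need to be spelled out in the PURL.
DEFAULT_REPOSITORY = "cpan.org"


def build_purl(
    author: str,
    name: str,
    version: Optional[str] = None,
    ext: Optional[str] = None,
    repository_url: Optional[str] = None,
) -> str:
    """Assemble a pkg:cpan PURL, omitting the default repository."""
    qualifiers = {}
    if ext:
        qualifiers["ext"] = ext
    if repository_url and repository_url != DEFAULT_REPOSITORY:
        qualifiers["repository_url"] = repository_url

    purl = f"pkg:cpan/{quote(author, safe='')}/{quote(name, safe='')}"
    if version:
        purl += "@" + quote(version, safe="")
    if qualifiers:
        # Sort by key so the same inputs always give the same string.
        pairs = sorted(qualifiers.items())
        purl += "?" + "&".join(f"{k}={quote(v, safe='')}" for k, v in pairs)
    return purl


print(build_purl("IBMTORDB2", "DBD-DB2", "0.99", "tar.bz2", "cpan.org"))
print(build_purl("IBMTORDB2", "DBD-DB2", "0.99", "tar.bz2",
                 "internacpan.mycompany.example"))
```

With this shape, the repository is the "new input" a purl method would need; everything else can be derived from the distribution path.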

@Leont
Member

Leont commented May 14, 2023

I should rephrase that. Why /IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2 instead of IBMTORDB2/DBD-DB2-0.99.tar.bz2? I would prefer a 1-to-1 mapping unless there's a reason otherwise. Filenames are unique on CPAN if that's the requirement.

@sjn
Author

sjn commented May 14, 2023

Why /IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2 instead of IBMTORDB2/DBD-DB2-0.99.tar.bz2?

Mostly because the spec says the version number is optional. Also, there are a few oddities around version numbers on CPAN, so having it clearly and unambiguously delimited is useful.

I'm thinking this is so the same URL can be used to both specify individual versioned objects and the projects themselves (which basically would mean "download the latest from here", I guess).
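
To illustrate why an explicit `@` delimiter helps, here is a small Python sketch of a parser that accepts the same URL with or without a version. `CpanPurl` and `parse_cpan_purl` are hypothetical names for this example; the sketch ignores the optional `#subpath` part of a PURL and does no validation beyond the `pkg:cpan/` prefix.

```python
from typing import NamedTuple, Optional
from urllib.parse import parse_qsl, unquote


class CpanPurl(NamedTuple):
    author: str
    name: str
    version: Optional[str]  # None means "the project, no specific release"
    qualifiers: dict


def parse_cpan_purl(purl: str) -> CpanPurl:
    prefix = "pkg:cpan/"
    if not purl.startswith(prefix):
        raise ValueError(f"not a pkg:cpan PURL: {purl}")

    # Split off the qualifiers, then the version, then author and name.
    rest, _, query = purl[len(prefix):].partition("?")
    path, _, version = rest.partition("@")
    author, _, name = path.partition("/")

    return CpanPurl(
        author=unquote(author),
        name=unquote(name),
        version=unquote(version) if version else None,
        qualifiers=dict(parse_qsl(query)),
    )


print(parse_cpan_purl("pkg:cpan/IBMTORDB2/DBD-DB2@0.99?ext=tar.bz2"))
print(parse_cpan_purl("pkg:cpan/IBMTORDB2/DBD-DB2"))
```

Because the version sits behind `@` rather than inside the filename, the parser never has to guess where a hyphenated name ends and the version begins.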
