[rpm] Add VMWare Photon Example #214

Open
wants to merge 1 commit into
base: master

@captn3m0 captn3m0 commented Jan 5, 2023

just for example purposes, to avoid confusion elsewhere.

There are lots of ways to represent this that aren't as good:

  • pkg:rpm/photon/systemd?distro=vmware-photon-1
  • pkg:rpm/photon/systemd?distro=photon-1
  • pkg:rpm/systemd?distro=vmware-photon-1
  • pkg:rpm/systemd?distro=photon-1

However, the representation in the example should hopefully be clear and provide some guidance.
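To make the difference between these candidates concrete, here is a minimal sketch that splits a purl into its parts. The `split_purl` helper is hypothetical and deliberately simplified (it ignores subpaths and most of the spec's normalization rules); it is not a reference parser.

```python
from urllib.parse import parse_qsl, unquote


def split_purl(purl):
    """Split a purl into type, namespace, name, version and qualifiers.

    Simplified sketch: subpaths ('#...') and spec normalization are ignored.
    """
    scheme, _, rest = purl.partition(":")
    if scheme != "pkg":
        raise ValueError("not a purl: " + purl)
    path, _, query = rest.partition("?")
    segments = [unquote(s) for s in path.strip("/").split("/")]
    name, _, version = segments[-1].partition("@")
    return {
        "type": segments[0],
        # Everything between the type and the name is the namespace.
        "namespace": "/".join(segments[1:-1]) or None,
        "name": name,
        "version": version or None,
        "qualifiers": dict(parse_qsl(query)),
    }


candidates = [
    "pkg:rpm/photon/systemd?distro=vmware-photon-1",
    "pkg:rpm/photon/systemd?distro=photon-1",
    "pkg:rpm/systemd?distro=vmware-photon-1",
    "pkg:rpm/systemd?distro=photon-1",
]
for candidate in candidates:
    parts = split_purl(candidate)
    print(parts["namespace"], parts["qualifiers"].get("distro"))
```

All four refer to the same package name, but a consumer comparing namespaces or `distro` qualifiers as plain strings would treat them as four different packages.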

bureado commented Jan 10, 2023

@captn3m0, I'm drawing a bunch of inferences on what this one-line PR intends to convey, so please call me out if I inferred anything incorrectly or incompletely.

The proposed example implies that, in the case of Photon, the value for namespace should be vmware. The PR further implies that a value of photon would be suboptimal.

For the record, we're discussing namespace as defined in https://github.com/package-url/purl-spec/blob/master/PURL-TYPES.rst#rpm, which says the namespace is the vendor, with vendor implied to be the "provider" of the distro that ships the package being referenced.

"Fedora" and "OpenSUSE" are used in the original examples. But is "Fedora" the "vendor" of "Fedora Linux"? Or is it "Fedora Project"? And "OpenSUSE Project" for the "OpenSUSE" distribution? In the case of Fedora and OpenSUSE, these might not be critical considerations since people know what "fedora" stands for.

In the case of Photon, though, it means that one would consider "VMWare" as the namespace, hence this PR.

However, I think that interpretation actually suggests the spec should evolve so the namespace is not the vendor but the distributor ID as per lsb_release or the distro ID as per systemd.

This would mean that the proposed example would actually read pkg:rpm/photon/systemd?distro=photon-4.0 (the distro value being derived from Photon's actual os-release)
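As a rough sketch of what "derive it from os-release" could look like in practice: the snippet below reads `ID` and `VERSION_ID` from os-release text and assembles the purl. The sample content and the `parse_os_release` / `rpm_purl` helpers are assumptions for illustration only, and the spec does not currently prescribe this derivation.

```python
import shlex

# Illustrative os-release content (an assumption, not copied from a real image).
SAMPLE_OS_RELEASE = '''NAME="VMware Photon OS"
ID=photon
VERSION_ID=4.0
'''


def parse_os_release(text):
    """Parse KEY=value lines, honouring shell-style quoting."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, raw = line.partition("=")
        tokens = shlex.split(raw)
        fields[key] = tokens[0] if tokens else ""
    return fields


def rpm_purl(name, os_release):
    """Use ID as the namespace and ID-VERSION_ID as the distro qualifier.

    No percent-encoding is applied; this is a sketch, not a purl library.
    """
    distro_id = os_release["ID"]
    version_id = os_release.get("VERSION_ID")
    distro = f"{distro_id}-{version_id}" if version_id else distro_id
    return f"pkg:rpm/{distro_id}/{name}?distro={distro}"


print(rpm_purl("systemd", parse_os_release(SAMPLE_OS_RELEASE)))
```

With the sample above this yields the purl quoted in the previous paragraph.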

An alternative to this approach would be to use the vendor as defined, for example, in a CPE string. Under this definition vmware is acceptable, but it introduces a number of problems, including the fact that the CPE 2.3 dictionary names Photon differently: photon_os. I would find using CPE string parts a restrictive approach for a number of other reasons, too.

So this brings up an interesting purl-wide question. What is a vendor, particularly when it comes to Linux packages? There are 7 mentions of vendor in the types specification, one being rpm and the others being:

  • For alpm (pacman), some are distro names, some are something else (e.g., msys)
  • For apk and deb, it's distro names
  • For composer, conan and qpkg, it appears to be the publisher name

The spirit behind this seems to be that vendor is a part that helps locate the package univocally on the Internet. To that extent, some ecosystems use publisher (like qpkg) and others use the distro name.

Following this spirit, the appropriate namespace for Photon would be, IMHO, photon.

If this is in fact the spirit and the logic that purl wants for using "vendor as namespace", then I suggest it's made clearer in places where it's ambiguous, perhaps rpm being one of them (I don't see deb as being particularly ambiguous, but it's situational and ecosystem-dependent).

Of course, ambiguity can always be resolved with repository_url. Maybe we also say that if someone intends to use a purl to assemble a URL to fetch a package from a repository (which is not a great idea considering snapshot repositories aren't too common) then they must use a repository_url or they risk ambiguity.

We might also want to recommend that purl users do some due diligence on what the preferred namespace is for the ecosystem they're referencing. Maybe photon users prefer to use vmware as their namespace.

Thank you for reading. I think we owe it to ourselves to do this analysis, but I recognize deciding between CPE vendors, distro names, entity names, etc., might be overthinking the problem. In general, I don't think rpm users are having trouble right now filling out the namespace part.

@captn3m0

I agree with most of the analysis, but this shouldn't be left to end users (I face these conundrums every day while generating PURLs). The specification should make clear where the vendor/namespace is expected to be picked up from (such as by deferring to /etc/os-release).

Even if we pick /etc/os-release, there are still more issues. For example, CentOS Linux and CentOS Stream (which are quite different operating systems) both use the exact same details in the lsb-release, even down to the same CPE: endoflife-date/endoflife.date#1255 (comment)

To differentiate between CentOS 8 and CentOS Stream 8 packages, you need to encode this in the distro field (such as centos-8, centos-stream-8), but RPM users currently get no guidance on what form this should take.

Even in case of debian packages, there are scenarios where the PURL spec falls short, such as in the case of Debian ELTS:

  1. The /etc/os-release doesn't change
  2. However, the package repository changes to the ELTS repository

In such a case, imo, the distro should remain jessie, as before, with the repository_url field for upgraded packages pointing to the new repository. However, since such a PURL might not be clear enough, we're discussing whether to change the ecosystem field to accommodate this additional information.
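A minimal sketch of that idea, for illustration: keep the namespace and distro unchanged and add a repository_url qualifier only for packages that come from the ELTS repository. The `deb_purl` helper and the `https://elts.example.org/debian` URL are hypothetical placeholders, and fully percent-encoding the qualifier value is a conservative assumption rather than something the spec mandates in this exact form.

```python
from urllib.parse import quote


def deb_purl(name, distro, repository_url=None):
    """Build a Debian purl, adding repository_url only when it is given."""
    qualifiers = {"distro": distro}
    if repository_url:
        qualifiers["repository_url"] = repository_url
    # Sort qualifier keys so the output is stable across calls.
    query = "&".join(
        f"{key}={quote(value, safe='')}" for key, value in sorted(qualifiers.items())
    )
    return f"pkg:deb/debian/{name}?{query}"


# Package unchanged from the regular archive: no repository_url needed.
print(deb_purl("nano", "jessie"))

# Package upgraded from the ELTS repository (placeholder URL).
print(deb_purl("nano", "jessie", "https://elts.example.org/debian"))
```

The two purls differ only in the extra qualifier, so a consumer that ignores repository_url would still see both as jessie packages.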

> The spirit behind this seems to be that vendor is a part that helps locate the package univocally on the Internet.

From my usual understanding of the word "vendor", I'd just assumed this to mean "the supplier behind the package". "Locate the package unambiguously" is something other fields also do, not just the vendor field, so it's not a clear definition.

Either the specification should provide a clear definition of what vendor is supposed to be, or it should provide guidance as to how the field can be inferred (with appropriate fallbacks).

imo, it is quite impossible for PURL to fulfill both of the obligations at the same time:

  1. Not maintain a dictionary of keywords/mappings for fields like namespaces/distro/os/arch etc
  2. Become a widely interoperable and easily-used specification.

bureado commented Jan 13, 2023

I'm not convinced the Debian ELTS example you provide is a shortcoming in purl. With Debian ELTS, I don't expect ID to change in os-release. In fact, I'm OK with a largely untouched os-release (maybe the *_URL or SUPPORT_END fields are augmented). That's because, for example, the package nano provided in Freexian's current ELTS for amd64 is bit-for-bit the same package as the one provided in stretch from an official Debian mirror. I would certainly not namespace that nano package under freexian instead of debian. I could see a case for a security update calling for repository_url to disambiguate from the Debian archives, where that update won't be found after the LTS period. I think the example you provided on Gitter disambiguates sufficiently for the case you presented, and that's what the spec reads today.

I used the Debian ELTS example since I think the Stream example is extemporary post 8 and I don't understand enough about Stream to reason about how different the nano 2.9.8 in c8 was from the nano 2.9.8 in c8s. It's possible there's an RPM-specific tag that might help with such cases, but given what I've seen I'm not sure that's a namespace.

Both of these examples are different from the vendor/provider problem discussed above and in the PR.

I won't weigh in on your premises for purl success, but I appreciate the discussion and the edge cases because they will help users make the best of a specification that is also improving. I'm also keeping an eye on https://github.com/scanoss/purl2cpe. These are exactly the type of ecosystem efforts I would expect to happen.

bureado commented Jan 20, 2023

See #195
