Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a package orderer that matches Python's version specifier spec (PEP440) #1706

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

isohedronpipeline
Copy link
Contributor

@isohedronpipeline isohedronpipeline commented Apr 2, 2024

This PR adds a package order to sort packages according to pypa rules, with an option for risk tolerance of prerelease options, appropriate for how users consume packages from PyPI. By default, users only get release versions, but depending on their risk tolerance, they can opt into the latest RC, beta, or alpha releases as well.

PEP440PackageOrder

This adds support for sorting packages according to PyPA / Packaging Version Specifiers

Specifically, such that 1.1.0 > 1.1.0rc2 > 1.1.0b3 > 1.1.0a4.
As a refresher, PyPA prerelease suffixes are work towards creating the main version. i.e. 1.1.0a1 is the first version of the branch that will eventually be 1.1.0. 1.1.0 and 1.1.0a1 will not be developed at the same time.

This package orderer allows the user to specify a prerelease risk tolerance, defaulting to sorting full release versions to the front. The package order may be configured to allow prerelease of a certain level ahead of fully released versions as well.
For example, given the versions [1.0b2, 1.0a1, 1.0, 1.1, 1.0rc1, 1.1b2, 1.2b1],
an orderer initialized with prerelease=None would give the order:
[1.1, 1.0, 1.2b1, 1.1b2, 1.0rc1, 1.0b2, 1.0a1].
an orderer initialized with prerelease="b" would give the order:
[1.2b1, 1.1, 1.1b2, 1.0, 1.0rc1, 1.0b2, 1.0a1].

Copy link

codecov bot commented Apr 2, 2024

Codecov Report

Attention: Patch coverage is 93.93939% with 2 lines in your changes are missing coverage. Please review.

Project coverage is 58.33%. Comparing base (3c75a19) to head (0a139ce).
Report is 2 commits behind head on main.

Files Patch % Lines
src/rez/package_order.py 93.93% 1 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1706      +/-   ##
==========================================
+ Coverage   58.25%   58.33%   +0.08%     
==========================================
  Files         126      126              
  Lines       17157    17189      +32     
  Branches     3504     3509       +5     
==========================================
+ Hits         9995    10028      +33     
+ Misses       6496     6495       -1     
  Partials      666      666              

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Member

@JeanChristopheMorinPerso JeanChristopheMorinPerso left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for creating this. I'm a little bit puzzled by the implementation of the PEP440 orderers and left some questions/comments. Also note how I refer to PEP440 and not PyPA. I'm not sure if we should call it PyPA or Python or PEP440.

I didn't review any of the custom orderer stuff.

Could you split both in two different pull requests please? They don't seem related and one (the PEP440 one) will be much easier to get consensus on if we want that or not than the other (the custom orderer).

import rez.vendor.packaging.version as pypa
pypa_version = pypa.parse(str(version))

key = pypa_version._key

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not a fan of relying on private attributes.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, but the sort key isn't made public and it's explicitly the thing we want. We can reproduce what currently goes into _key, but that's fragile to changes.

Unfortunately, the key is assembled in the __init__ and there is no other way to access it.

And our goal here is to get a tuple that is sortable, we don't want to sort the Versions themselves at this point, which is handled by the packaging Version.__lt__ internally, based on that _key private member. It's unfortunate, but unavoidable :-/

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This PR now only includes PyPA package ordering.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CustomPackageOrder now found here:
#1709

'1.0.0.a2',
'1.0.0.a1',
]
self._test_reorder(orderer, "pypa", expected)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we test that the the implementation matches the PEP440 implementation by comparing the result from rez with the result from packaging.version?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's essentially what this does, doesn't it? Maybe i'm not following.
This package orderer uses the pep440/packaging/pypa sort key directly, so we know that we're getting the right thing.

self._test_reorder(orderer, "pypa", expected)

# Test that we can allow RC versions ahead of release
orderer = PyPAPackageOrder(prerelease="rc")

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure to understand why we would select what is a prerelease. Isn't the whole goal of adding a PEP440 orderer to get the exact same behavior than when using pip?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is a risk allowance setting.
Consider 2.0.a1, 1.3.rc1, 1.2.b1, and 1.1.
Typically users do not want the alpha version of some software. Most users want the release version. This would not prevent a user from using 1.1, but would allow them to opt into it.

PyPA sorting is almost a two dimensional version scheme, where you may want to sort the release versions as preferred or you may want the alpha versions as preferred.

The idea here is to mirror how pip works, where you need to specifically ask for an alpha version. If you just do pip install requests you will get the latest release version.

Comment on lines +788 to +793
1. First separate versions into two groups based on the value of
``prerelease``: a disallowed/non-preferential group and those which are allowed/preferred
2. Sort each group from step 1 in the order defined by the pep440 spec.
e.g. 1.1 > 1.1rc1 > 1.1b1 > 1.1a1
3. Concatenate the two groups: placing the non-preferred group -- in sorted order --
after the preferred group.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm quite lost as to why you have to do all this. What's a disallowed or non-preferential group? And what's an allowed and preferred group? Why is there a concept of what's allowed or not or even preferences?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully the above example with pip install helps clarify it.
Essentially, most of the time you want users to get the latest release version, not the absolute latest version. But you also don't want to specifically disallow them from using it. The prerelease argument allows the package sorter to set a risk tolerance for the consumer, so that if I am comfortable with rc, but not beta then I am happy with the highest version number of either rc or release, but not beta or alpha.

@isohedronpipeline
Copy link
Contributor Author

Thanks for creating this. I'm a little bit puzzled by the implementation of the PEP440 orderers and left some questions/comments. Also note how I refer to PEP440 and not PyPA. I'm not sure if we should call it PyPA or Python or PEP440.

Yeah, I thought about the name of it for a little while too. I settled on pypa because:

  1. Calling it python or packaging would be pretty confusing considering that lots of (most?) rez packages are in python and a package has a more well understood meaning in the context of rez.
  2. pep440 was another option, but since the official name for the version scheme is actually pypa, I thought it was more appropriate. Pep440 is the proposal number, but pypa is the actual rules that covern the scheme: https://www.pypa.io/en/latest/
  3. Even though pypa is not immediately recognizable for most people, neither is pep440 and in time "pypa" will be more recgonizable than the historical document that spawned it.

I didn't review any of the custom orderer stuff.

Could you split both in two different pull requests please? They don't seem related and one (the PEP440 one) will be much easier to get consensus on if we want that or not than the other (the custom orderer).

Sure, I'll split it out later this week.

I'm not sure if there is any specific issue or worry that you want to discuss about why it would be hard to get consensus on it or why we specifically would not want it. It doesn't do anything different from the existing version_split package orderer, it just solves the problem in a more general/flexible way.

They both serve pretty important functions and I'd love to see those features available to the rest of the community. I can split them up for the sake of isolating the conversation, but the custom sort order unit tests won't work until pypa is also merged in.

Signed-off-by: Ben Andersen <[email protected]>
@JeanChristopheMorinPerso JeanChristopheMorinPerso changed the title Added CustomPackageOrder and PyPAPackageOrder Added PyPAPackageOrder Apr 6, 2024
@JeanChristopheMorinPerso JeanChristopheMorinPerso changed the title Added PyPAPackageOrder Add a package orderer that matches Python's version specifier spec (PEP440) Apr 6, 2024
@JeanChristopheMorinPerso
Copy link
Member

JeanChristopheMorinPerso commented Apr 6, 2024

pep440 was another option, but since the official name for the version scheme is actually pypa, I thought it was more appropriate. Pep440 is the proposal number, but pypa is the actual rules that covern the scheme

I'll be pedantic, but PyPA is a group of people (to be a member, you need to be a maintainer of a project under the PyPA GitHub organization), not rules. They host pip, setuptools, flit, etc and they also host the packaging specs. They also don't "govern" it any specs. Any change to a spec needs a new PEP (unless the change is really really minor/cosmetic) and the packaging specs/PEPs fall under one (or two I don't remember correctly) PEP delegates that the Steering Council chose. (details are maybe slightly different, but what I wrote gives a general idea).

All that to say that Python or PEP440 would probably be a more appropriate/representative name.

I'm not sure if there is any specific issue or worry that you want to discuss about why it would be hard to get consensus on it or why we specifically would not want it. It doesn't do anything different from the existing version_split package orderer, it just solves the problem in a more general/flexible way.

The custom orderer would potentially open the door to managing a production configuration inside rez, which as we have discussed, is something that has been discussed multiple times and is going to be a hot topic.

Ordering based on PEP440 is also not necessarily guaranteed to be accepted, because it also raises multiple questions, but it still has more chances to appear in a release than the custom orderer. Additionally, the two are completely unrelated.

So splitting the two will at least simplify the discussions and it will also help us decide what we want to see released or not.

Signed-off-by: Ben Andersen <[email protected]>
@isohedronpipeline
Copy link
Contributor Author

pep440 was another option, but since the official name for the version scheme is actually pypa, I thought it was more appropriate. Pep440 is the proposal number, but pypa is the actual rules that covern the scheme

I'll be pedantic, but PyPA is a group of people (to be a member, you need to be a maintainer of a project under the PyPA GitHub organization), not rules. They host pip, setuptools, flit, etc and they also host the packaging specs. They also don't "govern" it any specs. Any change to a spec needs a new PEP (unless the change is really really minor/cosmetic) and the packaging specs/PEPs fall under one (or two I don't remember correctly) PEP delegates that the Steering Council chose. (details are maybe slightly different, but what I wrote gives a general idea).

All that to say that Python or PEP440 would probably be a more appropriate/representative name.

I've renamed it to PEP440PackageOrder, but I'm not it's the right choice. It directly uses the packaging rules, which are dictated by pypa, instead of recreating the pep440 rules internally. Calling it python is even worse than calling it pypa I think, since that is much more likely to support a different versioning scheme than PyPA, to say nothing of the obvious confusion that would arise from it.

I'm not sure if there is any specific issue or worry that you want to discuss about why it would be hard to get consensus on it or why we specifically would not want it. It doesn't do anything different from the existing version_split package orderer, it just solves the problem in a more general/flexible way.

The custom orderer would potentially open the door to managing a production configuration inside rez, which as we have discussed, is something that has been discussed multiple times and is going to be a hot topic.

I suppose so. I still don't understand the sentiment, since the other package orderers are literally for configuration of package resolution and there is no other way to achieve the goal of configuring second order requirements without explicitly fully baking environments, but I guess that's what a discussion is for.

Ordering based on PEP440 is also not necessarily guaranteed to be accepted, because it also raises multiple questions, but it still has more chances to appear in a release than the custom orderer. Additionally, the two are completely unrelated.

So splitting the two will at least simplify the discussions and it will also help us decide what we want to see released or not.

The custom package orderer can take advantage of the pep440 sorting. Since pep440 sorting is the first package orderer that actually not a configuration level orderer, there aren't other orderers that are suitable for that part of the test. I'm happy to separate them though. Custom will just need to come after pep440.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants