GKE Autopilot: Add support for Extended Duration pods #3387

Merged: 2 commits merged on Dec 8, 2023

Conversation

zmerlynn
Collaborator

@zmerlynn zmerlynn commented Sep 19, 2023

Introduces Extended Duration (ED) Pod support for Autopilot under the feature gate "GKEAutopilotExtendedDurationPods". This feature has some rough edges with Agones on Autopilot 1.27, so I'm introducing it as a dev feature gate until we sort everything out.

ED pods use the same interface that blocks autoscaler eviction on GKE Standard [1]: annotate the pod with `cluster-autoscaler.kubernetes.io/safe-to-evict=false` and it is protected from eviction. However, Autopilot handles node upgrades specially and just autoscales new nodes at new versions, so the concept of "safe to evict on upgrade" is unnecessary. Since we can support the concept without doing anything special, we do: we pull the "generic" implementation of SetEviction out into cloudproduct/eviction and call it.

[1] https://cloud.google.com/kubernetes-engine/docs/how-to/extended-duration-pods
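
For readers unfamiliar with the annotation, here is a minimal sketch of what "annotate and your pod is protected" could look like in Go. It is not the code in this PR: `Pod` is a hypothetical stand-in for the Kubernetes pod type, `protectFromEviction` is an invented helper name, and leaving a user-supplied annotation untouched is an assumption made for illustration.

```go
package main

import "fmt"

// safeToEvictAnnotation is the cluster autoscaler annotation named in the description.
const safeToEvictAnnotation = "cluster-autoscaler.kubernetes.io/safe-to-evict"

// Pod is a hypothetical stand-in for corev1.Pod; it keeps only the
// field this sketch needs.
type Pod struct {
	Annotations map[string]string
}

// protectFromEviction marks the pod as not safe to evict.
// Assumption: an annotation that is already present is left untouched.
func protectFromEviction(pod *Pod) {
	if pod.Annotations == nil {
		pod.Annotations = map[string]string{}
	}
	if _, exists := pod.Annotations[safeToEvictAnnotation]; exists {
		return
	}
	pod.Annotations[safeToEvictAnnotation] = "false"
}

func main() {
	pod := &Pod{}
	protectFromEviction(pod)
	fmt.Println(pod.Annotations[safeToEvictAnnotation])
}
```

Because the same annotation is described for both GKE Standard and Autopilot, a single shared helper of this shape is one way a "generic" implementation could serve both cloud products.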

Towards #3386

Note to reviewers: This change is not as big as it seems - it's mostly refactoring.

@google-oss-prow

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: zmerlynn
Once this PR has been reviewed and has the lgtm label, please assign roberthbailey for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@agones-bot
Collaborator

Build Failed 😱

Build Id: 03978250-f46d-4c1f-b726-02454fc89e07

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 027d34f9-a726-4a5b-b540-ba41bdb2a57b

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 3495db2c-89e7-49e6-ab11-774e33026414

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn
Collaborator Author

The e2e issue seems real, moving this back to draft for now.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 08e903d9-71a9-47d9-9a4f-217a28d5f793

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 182dc4e2-92c7-4030-9ae6-b122fb9b4dfd

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn zmerlynn changed the title from "GKE Autopilot: Add support for Extended Duration pods" to "[WIP] GKE Autopilot: Add support for Extended Duration pods" on Sep 21, 2023
@zmerlynn
Collaborator Author

(This is shelved for the time being.)

@agones-bot
Collaborator

Build Failed 😱

Build Id: a1bd1e0c-30ae-487d-b14c-0d4a2419e02d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@agones-bot
Collaborator

Build Failed 😱

Build Id: 75c88fd2-9b08-49b5-b945-3d332d6e786d

To get permission to view the Cloud Build view, join the agones-discuss Google Group.

@zmerlynn zmerlynn force-pushed the version-aware-cloud-product branch 2 times, most recently from 893638c to 10973dd, on December 5, 2023 20:46
@zmerlynn zmerlynn changed the title from "[WIP] GKE Autopilot: Add support for Extended Duration pods" to "GKE Autopilot: Add support for Extended Duration pods" on Dec 5, 2023
@zmerlynn zmerlynn marked this pull request as ready for review December 5, 2023 20:53
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 9d1d6f1f-2dce-4007-a444-b1e377867acf

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3387/head:pr_3387 && git checkout pr_3387
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.37.0-dev-10973dd-amd64

… gate

Introduces Extended Duration Pod support for Autopilot under feature
gate "GKEAutopilotExtendedDurationPods". This feature has some rough
edges with Agones on Autopilot 1.27, so I'm introducing this as a dev
feature gate until we sort everything out.

ED pods use the same interface that blocks autoscaler eviction on GKE
Standard [1]: annotate the pod with
`cluster-autoscaler.kubernetes.io/safe-to-evict=false` and it is
protected from eviction. However, Autopilot handles node upgrades
specially and just autoscales new nodes at new versions, so the concept
of "safe to evict on upgrade" is unnecessary. Since we can support the
concept without doing anything special, we do: we pull the "generic"
implementation of SetEviction out into cloudproduct/eviction and call
it.

[1] https://cloud.google.com/kubernetes-engine/docs/how-to/extended-duration-pods

Towards googleforgames#3366
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: bf69aace-21f3-4e6f-9b1a-9e97db021ab8

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3387/head:pr_3387 && git checkout pr_3387
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.37.0-dev-fa7c006-amd64

Member

@markmandel markmandel left a comment

Non-blocking question; I was more curious than anything else.

pkg/cloudproduct/eviction/eviction.go (resolved)
@github-actions github-actions bot added the size/M label Dec 8, 2023
@zmerlynn zmerlynn enabled auto-merge (squash) December 8, 2023 18:27
@agones-bot
Collaborator

Build Succeeded 👏

Build Id: 1a2e33c3-c7b4-444a-a30e-44b04cde1b7a

The following development artifacts have been built, and will exist for the next 30 days:

A preview of the website (the last 30 builds are retained):

To install this version:

  • git fetch https://github.com/googleforgames/agones.git pull/3387/head:pr_3387 && git checkout pr_3387
  • helm install agones ./install/helm/agones --namespace agones-system --set agones.image.registry=us-docker.pkg.dev/agones-images/ci --set agones.image.tag=1.37.0-dev-95305d3-amd64

@zmerlynn zmerlynn merged commit 0dc9654 into googleforgames:main Dec 8, 2023
4 checks passed
@Kalaiselvi84 Kalaiselvi84 added kind/feature New features for Agones and removed kind/other labels Dec 12, 2023
Labels: kind/feature (New features for Agones), size/L, size/M

4 participants