Add the VPAAndHPAForAPIServer feature gate for the gardener-operator #9735

Merged: 6 commits into gardener:master from enh/vpaandhpa-for-operator, May 28, 2024

Conversation

ialidzhikov
Member

@ialidzhikov ialidzhikov commented May 10, 2024

How to categorize this PR?

/area auto-scaling
/kind enhancement

What this PR does / why we need it:

Which issue(s) this PR fixes:
Part of #9562
A follow-up of #9678

Special notes for your reviewer:

Release note:

The `VPAAndHPAForAPIServer` feature gate is now also implemented for the gardener-operator. When enabled, the virtual-garden-kube-apiserver and gardener-apiserver are scaled simultaneously by VPA and HPA on the same metric (CPU and memory usage).
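To illustrate how a feature gate like this could translate into an autoscaling mode, here is a minimal sketch. It is not the gardener-operator code: the `autoscalingMode` function, the map-based gate lookup, and the `Mode*` constants are hypothetical, and the precedence of `VPAAndHPAForAPIServer` over an `HVPA` gate is an assumption made for the example.

```go
package main

import "fmt"

// AutoscalingMode is a simplified, hypothetical stand-in for the
// autoscaling mode of an API server component.
type AutoscalingMode string

const (
	ModeBaseline  AutoscalingMode = "Baseline"
	ModeHVPA      AutoscalingMode = "HVPA"
	ModeVPAAndHPA AutoscalingMode = "VPAAndHPA"
)

// autoscalingMode derives the mode from a set of enabled feature gates.
// Assumption: VPAAndHPAForAPIServer wins when both gates are enabled.
func autoscalingMode(gates map[string]bool) AutoscalingMode {
	switch {
	case gates["VPAAndHPAForAPIServer"]:
		return ModeVPAAndHPA
	case gates["HVPA"]:
		return ModeHVPA
	default:
		return ModeBaseline
	}
}

func main() {
	gates := map[string]bool{"HVPA": true, "VPAAndHPAForAPIServer": true}
	fmt.Println(autoscalingMode(gates))
}
```

A missing key in the map reads as `false`, so gates that are not listed are treated as disabled.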

Contributor

gardener-prow bot commented May 10, 2024

Skipping CI for Draft Pull Request.
If you want CI signal for your change, please convert it to an actual PR.
You can still manually trigger a test run with /test all

gardener-prow bot added the do-not-merge/work-in-progress, area/auto-scaling, kind/enhancement, and cla: yes labels on May 10, 2024
gardener-prow bot added the size/XXL label on May 10, 2024
gardener-prow bot added the size/L label and removed the size/XXL label on May 15, 2024
ialidzhikov marked this pull request as ready for review on May 15, 2024 13:16
gardener-prow bot removed the do-not-merge/work-in-progress label on May 15, 2024
@voelzmo
Member

voelzmo commented May 17, 2024

Hey @ialidzhikov, thanks for the PR! While looking at the changes, I was wondering if we're missing the removal code for HVPA, HPA and VPA objects for the cases when the autoscalingMode is changed? This seems to have been broken already for any switch between HVPA enabled or HVPA disabled, but we never saw this?
Once we merge this and the mode gets changed from HVPA to VPAAndHPA, I guess we would still have an HVPA object and the corresponding VPA and HPA objects created by the hvpa-controller, right?

@ialidzhikov
Member Author

> While looking at the changes, I was wondering if we're missing the removal code for HVPA, HPA and VPA objects for the cases when the autoscalingMode is changed? This seems to have been broken already for any switch between HVPA enabled or HVPA disabled, but we never saw this?
> Once we merge this and the mode gets changed from HVPA to VPAAndHPA, I guess we would still have an HVPA object and the corresponding VPA and HPA objects created by the hvpa-controller, right?

The kubernetes apiserver component (pkg/component/kubernetes/apiserver, used for the Shoot kube-apiserver and the virtual-garden-kube-apiserver) is NOT deployed via GRM, but directly with a client. Hence, it contains explicit client invocations that delete the objects that are no longer needed:

  • if k.values.Autoscaling.Mode != apiserver.AutoscalingModeHVPA ||
        k.values.Autoscaling.Replicas == nil ||
        *k.values.Autoscaling.Replicas == 0 {
        return kubernetesutils.DeleteObject(ctx, k.client.Client(), hvpa)
    }
  • func (k *kubeAPIServer) reconcileVerticalPodAutoscaler(ctx context.Context, verticalPodAutoscaler *vpaautoscalingv1.VerticalPodAutoscaler, deployment *appsv1.Deployment) error {
        switch k.values.Autoscaling.Mode {
        case apiserver.AutoscalingModeHVPA:
            return kubernetesutils.DeleteObject(ctx, k.client.Client(), verticalPodAutoscaler)
        case apiserver.AutoscalingModeVPAAndHPA:
            return k.reconcileVerticalPodAutoscalerInVPAAndHPAMode(ctx, verticalPodAutoscaler, deployment)
        default:
            return k.reconcileVerticalPodAutoscalerInBaselineMode(ctx, verticalPodAutoscaler, deployment)
        }
    }
  • func (k *kubeAPIServer) reconcileHorizontalPodAutoscaler(ctx context.Context, hpa *autoscalingv2.HorizontalPodAutoscaler, deployment *appsv1.Deployment) error {
        if k.values.Autoscaling.Mode == apiserver.AutoscalingModeHVPA ||
            k.values.Autoscaling.Replicas == nil ||
            *k.values.Autoscaling.Replicas == 0 {
            return kubernetesutils.DeleteObject(ctx, k.client.Client(), hpa)
        }
        if k.values.Autoscaling.Mode == apiserver.AutoscalingModeVPAAndHPA {
            return k.reconcileHorizontalPodAutoscalerInVPAAndHPAMode(ctx, hpa, deployment)
        }
        return k.reconcileHorizontalPodAutoscalerInBaselineMode(ctx, hpa, deployment)
    }
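The three snippets above can be condensed into a small decision table. The following is a self-contained sketch, not the gardener code: `planAutoscalers`, `Plan`, `Action`, and the `Mode*` constants are hypothetical names, and the assumption is that any branch not shown in the snippets reconciles the object instead of deleting it.

```go
package main

import "fmt"

// AutoscalingMode mirrors the three modes named in the snippets above.
// The type and constants are simplified stand-ins, not the gardener API.
type AutoscalingMode string

const (
	ModeBaseline  AutoscalingMode = "Baseline"
	ModeHVPA      AutoscalingMode = "HVPA"
	ModeVPAAndHPA AutoscalingMode = "VPAAndHPA"
)

// Action is what the reconciler does with one autoscaler object.
type Action string

const (
	ActionDelete    Action = "delete"
	ActionReconcile Action = "reconcile"
)

// Plan holds the action for each of the three autoscaler objects.
type Plan struct {
	HVPA, VPA, HPA Action
}

// planAutoscalers reproduces the conditions from the snippets:
//   - HVPA is deleted unless the mode is HVPA and replicas > 0.
//   - VPA is deleted only in HVPA mode.
//   - HPA is deleted in HVPA mode or when replicas is nil or 0.
func planAutoscalers(mode AutoscalingMode, replicas *int32) Plan {
	scaledDown := replicas == nil || *replicas == 0
	plan := Plan{HVPA: ActionDelete, VPA: ActionReconcile, HPA: ActionReconcile}
	if mode == ModeHVPA {
		plan.VPA = ActionDelete
		if !scaledDown {
			plan.HVPA = ActionReconcile
		}
	}
	if mode == ModeHVPA || scaledDown {
		plan.HPA = ActionDelete
	}
	return plan
}

func main() {
	replicas := int32(3)
	// Switching from HVPA to VPAAndHPA: the HVPA object is scheduled for deletion.
	fmt.Printf("%+v\n", planAutoscalers(ModeVPAAndHPA, &replicas))
}
```

Under these assumptions, a switch from HVPA to VPAAndHPA deletes the HVPA object while the VPA and HPA objects are reconciled by the component itself.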

The gardener apiserver (pkg/component/gardener/apiserver), in contrast, is a component deployed via GRM:

runtimeResources, err := runtimeRegistry.AddAllAndSerialize(
    g.podDisruptionBudget(),
    g.serviceRuntime(),
    g.horizontalPodAutoscaler(),
    g.verticalPodAutoscaler(),
    g.hvpa(),
    g.deployment(secretCAETCD, secretETCDClient, secretGenericTokenKubeconfig, secretServer, secretAdmissionKubeconfigs, secretETCDEncryptionConfiguration, secretAuditWebhookKubeconfig, secretVirtualGardenAccess, configMapAuditPolicy, configMapAdmissionConfigs),
    g.serviceMonitor(),
)

Hence, for the gardener apiserver component it is enough to return nil from the verticalPodAutoscaler/horizontalPodAutoscaler/hvpa funcs. GRM takes care of deleting the objects that are no longer desired.
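The idea of returning nil and letting a controller prune the rest can be sketched as follows. This is a simplified model and not the GRM implementation: `Object`, `collect`, `prune`, and the `hvpa` builder are hypothetical, and the pruning is modelled as a plain set difference between what was applied before and what is desired now.

```go
package main

import (
	"fmt"
	"sort"
)

// Object is a hypothetical stand-in for a Kubernetes object,
// identified by kind and name.
type Object struct {
	Kind, Name string
}

// collect builds the desired set and skips nil entries, so a builder
// func can opt out of an object by returning nil.
func collect(objs ...*Object) map[string]bool {
	desired := map[string]bool{}
	for _, o := range objs {
		if o == nil {
			continue
		}
		desired[o.Kind+"/"+o.Name] = true
	}
	return desired
}

// prune returns the keys of previously applied objects that are no
// longer part of the desired set, sorted for stable output.
func prune(applied, desired map[string]bool) []string {
	var stale []string
	for key := range applied {
		if !desired[key] {
			stale = append(stale, key)
		}
	}
	sort.Strings(stale)
	return stale
}

// hvpa returns the object only while the HVPA mode is active.
func hvpa(enabled bool) *Object {
	if !enabled {
		return nil
	}
	return &Object{Kind: "Hvpa", Name: "gardener-apiserver"}
}

func main() {
	deployment := &Object{Kind: "Deployment", Name: "gardener-apiserver"}
	applied := collect(hvpa(true), deployment)  // previous reconciliation
	desired := collect(hvpa(false), deployment) // mode changed, hvpa() returns nil
	fmt.Println(prune(applied, desired))
}
```

In this model the component never issues a delete call itself. The object simply disappears from the desired set, and the pruning step removes it.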

@rfranzke
Member

/assign

Resolved review threads:
docs/deployment/feature_gates.md (outdated)
pkg/component/gardener/apiserver/hpa.go
pkg/component/gardener/apiserver/hpa.go (outdated)
@rfranzke
Member

/lgtm

gardener-prow bot added the lgtm label on May 27, 2024
Contributor

gardener-prow bot commented May 27, 2024

LGTM label has been added.

Git tree hash: 3fb2e5501e072f30b500c92d4353875f1e590096

Member

@voelzmo voelzmo left a comment

/lgtm

@ialidzhikov
Member Author

/approve

Contributor

gardener-prow bot commented May 28, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: ialidzhikov, voelzmo


gardener-prow bot added the approved label on May 28, 2024
gardener-prow bot merged commit c72cab2 into gardener:master on May 28, 2024
18 checks passed
ialidzhikov deleted the enh/vpaandhpa-for-operator branch on May 28, 2024 15:05
Labels
approved: Indicates a PR has been approved by an approver from all required OWNERS files.
area/auto-scaling: Auto-scaling (CA/HPA/VPA/HVPA, predominantly control plane, but also otherwise) related
cla: yes: Indicates the PR's author has signed the cla-assistant.io CLA.
kind/enhancement: Enhancement, improvement, extension
lgtm: Indicates that a PR is ready to be merged.
size/L: Denotes a PR that changes 100-499 lines, ignoring generated files.