Missing some validations for ConfigAPI #2084

Closed
3 of 4 tasks
tenzen-y opened this issue Apr 27, 2024 · 1 comment · Fixed by #2309
Labels
kind/bug Categorizes issue or PR as related to a bug.

tenzen-y commented Apr 27, 2024

What happened:
The kubebuilder or controller-gen markers don't work well because the ConfigAPI is not a CRD.
So we need to implement the validations ourselves, but validations are still missing for the following APIs:

  • MultiKueue: ConfigAPI: Implement the MultiKueue validations #2129
    type MultiKueue struct {
    // GCInterval defines the time interval between two consecutive garbage collection runs.
    // Defaults to 1min. If 0, the garbage collection is disabled.
    // +optional
    GCInterval *metav1.Duration `json:"gcInterval"`
    // Origin defines a label value used to track the creator of workloads in the worker
    // clusters.
    // This is used by multikueue in components like its garbage collector to identify
    // remote objects that were created by this multikueue manager cluster and delete
    // them if their local counterpart no longer exists.
    // +optional
    Origin *string `json:"origin,omitempty"`
    // WorkerLostTimeout defines the time a local workload's multikueue admission check state is kept Ready
    // if the connection with its reserving worker cluster is lost.
    //
    // Defaults to 15 minutes.
    // +optional
    WorkerLostTimeout *metav1.Duration `json:"workerLostTimeout,omitempty"`
    }
  • InternalCertManagement: ConfigAPI: Implement validations for the internalCertManagement #2169
    type InternalCertManagement struct {
    // Enable controls whether to enable internal cert management or not.
    // Defaults to true. If you want to use a third-party management, e.g. cert-manager,
    // set it to false. See the user guide for more information.
    Enable *bool `json:"enable,omitempty"`
    // WebhookServiceName is the name of the Service used as part of the DNSName.
    // Defaults to kueue-webhook-service.
    WebhookServiceName *string `json:"webhookServiceName,omitempty"`
    // WebhookSecretName is the name of the Secret used to store CA and server certs.
    // Defaults to kueue-webhook-server-cert.
    WebhookSecretName *string `json:"webhookSecretName,omitempty"`
    }
  • ClusterQueueVisibility: ConfigAPI: Implement the non negative validation for the queueVisibility.clusterQueues.maxCount #2309
    type ClusterQueueVisibility struct {
    // MaxCount indicates the maximal number of pending workloads exposed in the
    // cluster queue status. When the value is set to 0, then ClusterQueue
    // visibility updates are disabled.
    // The maximal value is 4000.
    // Defaults to 10.
    MaxCount int32 `json:"maxCount,omitempty"`
    }
  • waitForPodsReady: configAPI: Implement waitForPodsReady validations #2214
    type WaitForPodsReady struct {
    // Enable when true, indicates that each admitted workload
    // blocks the admission of all other workloads from all queues until it is in the
    // `PodsReady` condition. If false, all workloads start as soon as they are
    // admitted and do not block admission of other workloads. The PodsReady
    // condition is only added if this setting is enabled. It defaults to false.
    Enable bool `json:"enable,omitempty"`
    // Timeout defines the time for an admitted workload to reach the
    // PodsReady=true condition. When the timeout is reached, the workload admission
    // is cancelled and requeued in the same cluster queue. Defaults to 5min.
    // +optional
    Timeout *metav1.Duration `json:"timeout,omitempty"`

What you expected to happen:
Once a KueueConfig with invalid parameters is created, the kueue-controller-manager should raise errors and stop while the manager is booting.
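
The expected fail-fast behavior could be sketched roughly as follows. Everything here is hypothetical and heavily reduced: `Configuration`, `validate`, and `run` are illustrative names rather than Kueue's actual types or functions, and a single maxCount check stands in for the full set of validations.

```go
package main

import (
	"errors"
	"fmt"
)

// Configuration is a hypothetical, heavily reduced stand-in for the Kueue
// configuration. MaxCount stands in for queueVisibility.clusterQueues.maxCount.
type Configuration struct {
	MaxCount int32
}

// validate is a placeholder for the full set of ConfigAPI checks.
func validate(cfg Configuration) error {
	if cfg.MaxCount < 0 {
		return errors.New("queueVisibility.clusterQueues.maxCount must not be negative")
	}
	return nil
}

// run validates the configuration first and starts the manager only when
// the configuration is valid.
func run(cfg Configuration, startManager func() error) error {
	if err := validate(cfg); err != nil {
		return fmt.Errorf("invalid configuration: %w", err)
	}
	return startManager()
}

func main() {
	err := run(Configuration{MaxCount: -1}, func() error {
		fmt.Println("manager started") // not reached for an invalid configuration
		return nil
	})
	if err != nil {
		// A real binary would log this and exit with a non-zero status.
		fmt.Println("startup aborted:", err)
	}
}
```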

How to reproduce it (as minimally and precisely as possible):

Anything else we need to know?:

Environment:

  • Kubernetes version (use kubectl version):
  • Kueue version (use git describe --tags --dirty --always):
  • Cloud provider or hardware configuration:
  • OS (e.g: cat /etc/os-release):
  • Kernel (e.g. uname -a):
  • Install tools:
  • Others: