
Support non-Istio deployment, using Cilium support as a use case #7553

Open
adriantorrie opened this issue Apr 19, 2024 · 0 comments
/kind feature

Why you need this feature:

Use case: We are using Cilium + NVIDIA Network Operator (Multus, SR-IOV under the covers), and adding Istio would just be bloat for an HPC/AI cluster.

  • Cilium is a graduated CNCF project (Isovalent is the primary maintainer)
  • Cilium is increasingly used for high-performance networking, with eBPF and BGP support
  • Cilium supports K8S Gateway API and K8S Ingress
  • Cilium supports cluster mesh and service mesh
  • Cilium can integrate with Keycloak via K8S Gateway API

Describe the solution you'd like:

  • Abstract away the dependency on Istio Ingress/Gateway, so that the backend serving the Gateway API/Ingress can be any provider that conforms to the Kubernetes APIs, e.g. Cilium
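To illustrate the ask: if the project exposed its services through standard Gateway API resources instead of Istio-specific ones, a Cilium-backed deployment would only need manifests like the sketch below. This is a hypothetical example, not the project's actual configuration; the `kubeflow-gateway` name and the route are illustrative, while `gatewayClassName: cilium` is the class Cilium's Gateway API support registers.

```yaml
# Hedged sketch: a vendor-neutral Gateway API setup that Cilium (or any
# conformant implementation) could serve, replacing an Istio Gateway.
apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: kubeflow-gateway        # hypothetical name
  namespace: kubeflow
spec:
  gatewayClassName: cilium      # provided by Cilium's Gateway API support
  listeners:
    - name: http
      protocol: HTTP
      port: 80
      allowedRoutes:
        namespaces:
          from: All
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: central-dashboard       # hypothetical route
  namespace: kubeflow
spec:
  parentRefs:
    - name: kubeflow-gateway
      namespace: kubeflow
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /
      backendRefs:
        - name: centraldashboard  # assumed Service name
          port: 80
```

Because these are upstream Gateway API kinds, swapping Cilium for another conformant implementation would only change `gatewayClassName`.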

Anything else you would like to add:

  • We are building an HPC/AI cluster with a dual network stack
    • Management network
      • Cilium with eBPF (kube-proxy replacement) and BGP
    • Data network: NVIDIA Network Operator (Multus, with choices of SR-IOV, RoCE, etc. under the covers)
      • Mellanox switches
      • BlueField DPUs
      • GPUDirect Storage (VAST, Quantum, Weka, etc.)
      • NVIDIA GPU Operator
    • Some clusters run purely batch workloads (YuniKorn as the scheduler replacement, with the Kubeflow Training Operator, Spark Operator, MPI, etc.)
    • Other clusters require mixed workloads: interactive development and Exploratory Data Analysis (EDA) by Data Scientists/Researchers (e.g. Jupyter Notebooks for devs, Grafana and Keycloak services for admins) alongside batch workloads (YuniKorn plugin scheduling)
  • Isovalent also offers
    • Hubble for eBPF-based network monitoring and observability
    • Tetragon for eBPF-based runtime security enforcement
  • Isovalent is building a well-defined stack, similar to what Grafana Labs has done with its Grafana/Loki/Mimir products
    • Therefore, Cilium uptake is likely to grow on the strength of this eBPF product stack
    • Also, Istio could be considered redundant when Cilium is the CNI
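For context on the management-network setup described above, a Cilium install covering the requested features (kube-proxy replacement, BGP, Gateway API, Hubble) can be expressed as Helm values. This is a minimal sketch assuming a recent Cilium chart; cluster-specific values such as the API server address are placeholders.

```yaml
# Hedged sketch of Helm values for the Cilium deployment described above.
# Placeholders: k8sServiceHost/k8sServicePort must match your cluster.
kubeProxyReplacement: true      # eBPF replaces kube-proxy
k8sServiceHost: <api-server-ip> # placeholder, required without kube-proxy
k8sServicePort: 6443
bgpControlPlane:
  enabled: true                 # BGP peering for the management network
gatewayAPI:
  enabled: true                 # Cilium serves Gateway API resources
hubble:
  relay:
    enabled: true               # eBPF-based observability
  ui:
    enabled: true
```

With `gatewayAPI.enabled: true`, Cilium registers a `cilium` GatewayClass, which is what a non-Istio Kubeflow deployment would target.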