allowedCIDRs breaks when not using the reference neutron implementation due to the api being accessed over public IP #1851

Open
huxcrux opened this issue Feb 1, 2024 · 3 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

@huxcrux
Contributor

huxcrux commented Feb 1, 2024

/kind bug

What steps did you take and what happened:
I am trying to use allowedCIDRs for my clusters by setting allowedCidrs under apiServerLoadBalancer in the OpenStackCluster spec.

Example:

spec:
  allowAllInClusterTraffic: true
  apiServerLoadBalancer:
    allowedCidrs:
    - <listofips-redacted>
    enabled: true
  cloudName: elastx
  controlPlaneEndpoint:
    host: <redacted>
    port: 6443
  disableAPIServerFloatingIP: false
  disableExternalNetwork: false

The problem is that our cloud does not use the reference Neutron implementation, in which all nodes SNAT through the router's IP. As a result, bootstrapping the first node fails: the kubelet cannot start because its connections to the API are blocked by the load balancer.

I could manually add all of our SNAT pool IPs, and then everything works. The problem is that the SNAT IPs are shared between multiple customers, and even if I used the new IPAM code that is under review, I would still need to add those IPs to the allowedCIDRs list manually.
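For illustration, the manual workaround described above would look roughly like the fragment below. This is only a sketch: the two CIDRs are placeholders taken from the RFC 5737 documentation ranges and stand in for the provider's SNAT pool, which is not listed in this issue. The field names are the ones used in the example spec above.

```yaml
spec:
  apiServerLoadBalancer:
    enabled: true
    allowedCidrs:
    # Existing allow-list entries (redacted in the original report).
    - <listofips-redacted>
    # Hypothetical SNAT pool ranges, added by hand so that node traffic
    # leaving through the shared pool can reach the API server.
    - 203.0.113.0/24
    - 198.51.100.0/24
```

Because the SNAT pool is shared, entries like these also admit traffic from other tenants that egress through the same addresses, which is why this is not a satisfying fix.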

What did you expect to happen:
I expected all cluster nodes to use the internal LB endpoint for API traffic. It seems odd to use the external IP for in-cluster traffic when an internal endpoint exists.

Anything else you would like to add:

Another alternative is to use the IPAM IP pool. The problem with this is that we would need to watch another object and trigger an LB reconcile whenever it changes. However, that object contains a list of valid IPs that could simply be appended. I think it would make more sense to migrate to the internal endpoint for in-cluster API traffic.

Environment:

  • Cluster API Provider OpenStack version (Or git rev-parse HEAD if manually built): latest master (commit: 5cc483b)
  • Cluster-API version: 1.6.1
  • OpenStack version: Ussuri
  • Minikube/KIND version: kind 0.20.0
  • Kubernetes version (use kubectl version): 1.29.1
  • OS (e.g. from /etc/os-release): ubuntu 22.04
k8s-ci-robot added the kind/bug label Feb 1, 2024
@mdbooth
Contributor

mdbooth commented Feb 2, 2024

This is a CAPI limitation (which may in turn be based on a kubeadm limitation?): it is only possible to configure a single control plane endpoint, so it is the public one. I completely agree that it would be ideal to have separate internal and external endpoints, but I don't think there's currently anywhere to configure them.

It's tracked here: kubernetes-sigs/cluster-api#5295

Reading through the comments on that issue and also kubernetes-sigs/cluster-api#8500, it sounds like some other providers may have various degrees of workaround/hack for the issue which it might be worth investigating until we can implement it properly.

@k8s-triage-robot

The Kubernetes project currently lacks enough contributors to adequately respond to all issues.

This bot triages un-triaged issues according to the following rules:

  • After 90d of inactivity, lifecycle/stale is applied
  • After 30d of inactivity since lifecycle/stale was applied, lifecycle/rotten is applied
  • After 30d of inactivity since lifecycle/rotten was applied, the issue is closed

You can:

  • Mark this issue as fresh with /remove-lifecycle stale
  • Close this issue with /close
  • Offer to help out with Issue Triage

Please send feedback to sig-contributor-experience at kubernetes/community.

/lifecycle stale

k8s-ci-robot added the lifecycle/stale label May 2, 2024
@huxcrux
Contributor Author

huxcrux commented May 2, 2024

/remove-lifecycle stale

k8s-ci-robot removed the lifecycle/stale label May 2, 2024