High Availability Considerations docs outdated #3052

Closed
nijave opened this issue Apr 24, 2024 · 20 comments · Fixed by #3060
Labels
area/HA kind/documentation Categorizes issue or PR as related to documentation. priority/backlog Higher priority than priority/awaiting-more-evidence.
Milestone

Comments

@nijave
Contributor

nijave commented Apr 24, 2024

Is this a BUG REPORT or FEATURE REQUEST?

Choose one: BUG REPORT

Versions

kubeadm version (use kubeadm version):

kubeadm version: &version.Info{Major:"1", Minor:"29", GitVersion:"v1.29.3", GitCommit:"6813625b7cd706db5bc7388921be03071e1a492d", GitTreeState:"clean", BuildDate:"2024-03-15T00:06:16Z", GoVersion:"go1.21.8", Compiler:"gc", Platform:"linux/amd64"}

Environment:

  • Kubernetes version (use kubectl version):
Client Version: v1.29.2
Kustomize Version: v5.0.4-0.20230601165947-6ce0bf390ce3
Server Version: v1.29.2
  • Cloud provider or hardware configuration:
    Ubuntu VMs
  • OS (e.g. from /etc/os-release):
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy
  • Kernel (e.g. uname -a):
Linux vmubtkube-a01 5.15.0-1054-kvm #59-Ubuntu SMP Thu Mar 14 16:03:41 UTC 2024 x86_64 x86_64 x86_64 GNU/Linux
  • Container runtime (CRI) (e.g. containerd, cri-o):
    cri-o
  • Container networking plugin (CNI) (e.g. Calico, Cilium):
    Calico
  • Others:

What happened?

haproxy was still including nodes that returned non-200 health checks. I attempted to troubleshoot, but there have been significant changes since haproxy v2.1, so documentation isn't readily available. It seems most likely the health check was running over HTTP (plaintext) and ignoring the returned 400 error code. In addition, haproxy v2.1 is no longer officially supported.

I also observed that the very low timeouts in haproxy lead to frequent termination of kubectl ... -w and kubectl logs -f.

Edit: I think it may also be possible that the ssl-hello-chk option in the guide is overriding httpchk, which would also explain the behavior I was seeing. https://cbonte.github.io/haproxy-dconv/2.1/configuration.html#5.2-check
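
For context, the backend in the current guide combines both options, roughly like this (paraphrased, with a placeholder server entry). If I'm reading the haproxy docs right, only one check type applies per backend, so the httpchk/expect lines may never take effect:

backend apiserver
    option httpchk GET /healthz
    http-check expect status 200
    mode tcp
    option ssl-hello-chk
    balance roundrobin
    server control-plane-1 192.0.2.10:6443 check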

https://endoflife.date/haproxy

What you expected to happen?

The guide should provide a setup using software that is still supported (patched) by the vendor.

How to reproduce it (as minimally and precisely as possible)?

Attempt to use the config from the docs with haproxy v2.8 (currently the default LTS); it fails due to syntax changes.

Anything else we need to know?

For keepalived check, I ended up with

#!/bin/sh

errorExit() {
    echo "*** $*" 1>&2
    exit 1
}

curl -sfk --max-time 2 https://localhost:5443/healthz -o /dev/null || errorExit "Error GET https://localhost:5443/healthz"

It's unclear in the guide why haproxy is only health checked when it's running on the node holding the VIP. The guide's configuration could allow keepalived to move the VIP to a node with a working API server but a broken haproxy, where it will fail and be moved again. Additionally, the check doesn't hit the health endpoint, so the VIP could be moved to a node that's misconfigured but still responds to requests.
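
For illustration, a keepalived configuration that tracks the check script above on every node could look like this (the interface, router id, priority, and VIP are placeholders; the script is assumed to be saved as /etc/keepalived/check_apiserver.sh):

vrrp_script check_apiserver {
    # assumes the script above is installed as /etc/keepalived/check_apiserver.sh
    script "/etc/keepalived/check_apiserver.sh"
    interval 3
    weight -2
    fall 10
    rise 2
}

vrrp_instance VI_1 {
    state BACKUP
    interface eth0              # placeholder network interface
    virtual_router_id 51
    priority 101                # give one node a higher priority for a preferred VIP holder
    authentication {
        auth_type PASS
        auth_pass 42
    }
    virtual_ipaddress {
        172.16.1.64             # placeholder VIP
    }
    track_script {
        check_apiserver
    }
}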

For haproxy 2.8, I ended up with the following (static pod)

global
    log stdout format raw local0
    daemon

#---------------------------------------------------------------------
# common defaults that all the 'listen' and 'backend' sections will
# use if not designated in their block
#---------------------------------------------------------------------
defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 1
    timeout http-request    10s
    timeout queue           20s
    timeout connect         5s
    timeout client          1800s
    timeout server          1800s
    timeout http-keep-alive 10s
    timeout check           10s

#---------------------------------------------------------------------
# apiserver frontend which proxies to the control plane nodes
#---------------------------------------------------------------------
frontend apiserver
    bind *:5443
    mode tcp
    option tcplog
    default_backend apiserverbackend

#---------------------------------------------------------------------
# round robin balancing for apiserver
#---------------------------------------------------------------------
backend apiserverbackend
    option httpchk

    http-check connect ssl
    http-check send meth GET uri /healthz
    http-check expect status 200

    mode tcp
    balance roundrobin
    server vmubtkube-a01 172.16.1.65:6443 check verify none
    server vmubtkube-a02 172.16.1.66:6443 check verify none
    server vmubtkube-a03 172.16.1.67:6443 check verify none
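
A static pod manifest for this, dropped into /etc/kubernetes/manifests/, could look roughly like the following (a sketch only; the image tag and the host path for the config are assumptions, adjust to your environment):

apiVersion: v1
kind: Pod
metadata:
  name: haproxy
  namespace: kube-system
spec:
  hostNetwork: true
  containers:
  - name: haproxy
    image: haproxy:2.8          # placeholder image tag
    volumeMounts:
    - name: haproxyconf
      mountPath: /usr/local/etc/haproxy/haproxy.cfg
      readOnly: true
  volumes:
  - name: haproxyconf
    hostPath:
      path: /etc/haproxy/haproxy.cfg   # assumed location of the config above on the host
      type: File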

I don't know much about haproxy, but I'm not sure how the health check worked before unless haproxy v2.1 didn't validate certificates by default.

I reached these conclusions with the following tests:

  • kill etcd on a control plane node, rendering the api-server inoperable (it responds 500 to /healthz; see the quick check after this list)
  • kill haproxy on a node while the control plane is running (other haproxy instances should still route to the node, since the api-server is operable)
  • reboot the node holding the VIP; assuming the reboot takes a non-trivial amount of time (30 seconds or more), the VIP should move to a new node and haproxy on the remaining nodes should remove the rebooting node's api-server from the backend pool. Once the reboot completes, haproxy instances should be available on all nodes with all api-server backends active
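
For the first scenario, a quick way to check (assuming etcd runs as a static pod and crictl is available on the node; the kubelet will restart the container shortly, so check promptly):

# stop the etcd container on one control plane node
sudo crictl stop $(sudo crictl ps --name etcd -q)

# the api-server should still answer, but report unhealthy (non-200)
curl -k https://localhost:6443/healthz -w '\n%{http_code}\n'

# haproxy should mark this backend DOWN and stop routing to it until etcd is back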
@nijave
Contributor Author

nijave commented Apr 24, 2024

I can create a PR but a sanity check on the above conclusions first would be appreciated

@neolit123
Member

@nijave FTR, we haven't gotten complaints from other users about the guide. and yes, it has users.

@mbert do the above suggestions seem good to you?

/kind documentation
/area ha

@k8s-ci-robot k8s-ci-robot added kind/documentation Categorizes issue or PR as related to documentation. area/HA labels Apr 24, 2024
@neolit123 neolit123 added this to the v1.31 milestone Apr 24, 2024
@neolit123 neolit123 added the priority/backlog Higher priority than priority/awaiting-more-evidence. label Apr 24, 2024
@sftim

sftim commented Apr 24, 2024

Is this also relevant to SIG Docs? I can't tell.

@sftim

sftim commented Apr 24, 2024

If kubeadm is OK and a page inside https://k8s.io/docs/ is out of date: transfer this to k/website and label it for SIG Cluster Lifecycle.

@nijave
Contributor Author

nijave commented Apr 24, 2024

If kubeadm is OK and a page inside https://k8s.io/docs/ is out of date: transfer this to k/website and label it for SIG Cluster Lifecycle.

It's this one, which I think the official docs link to (but I'm not seeing the link at the moment): https://github.com/kubernetes/kubeadm/blob/main/docs/ha-considerations.md

@sftim

sftim commented Apr 24, 2024

If we can move that doc into the main website, I think that benefits our users. Not required though.

@neolit123
Member

Moving it might break existing URL references.

Also, frankly, I would prefer if the kubeadm docs eventually start moving in the other direction, i.e. to this repo, if this repo starts hosting the kubeadm source code and has versioned branches.

The kubeadm docs could then be hosted similarly to other projects with Netlify / Hugo.

But that's a discussion for another time, and it's a lot of work.

@mbert
Contributor

mbert commented Apr 25, 2024

@nijave FTR, we haven't gotten complaints from other users about the guide. and yes, it has users.

@mbert do the above suggestions seem good to you?

/kind documentation /area ha

At first glance (I have so far been unable to thoroughly review things in my development environment): it is true that HAProxy has undergone some changes leading to different configuration file syntax. If this turns out to be an issue here, then providing two sets of example configurations would make sense.

As far as the rest of the report is concerned, I will need to take a closer look. Improvements are always welcome. I hope to be able to do this over the weekend.

@neolit123
Member

thanks @mbert
if we can include only the latest config, that seems better to me, so that we can avoid being in the business of tracking N versions.

@mbert
Contributor

mbert commented Apr 25, 2024

thanks @mbert if we can include only the latest config, that seems better to me, so that we can avoid being in the business of tracking N versions.

Let's see. For now, both versions are still used in the field, because the older one is still present in EL distros.

@neolit123
Member

thanks @mbert if we can include only the latest config, that seems better to me, so that we can avoid being in the business of tracking N versions.

Let's see. For now, both versions are still used in the field, because the older one is still present in EL distros.

that's a good point. users sometimes just install from the distro packaging.

@mbert
Contributor

mbert commented Apr 27, 2024

I have now had some time to read through everything. First of all: I totally agree with @nijave - the guide is outdated here, and I think a PR with the proposed changes would be very welcome.

Actually the examples in the guide were, IIRC, created using HAProxy 1.8 on an EL7 platform, and given the changes in HAProxy since then, the configuration example should really be updated. Since what I still have is all based on that "ancient" HAProxy (I am not actively using the setup in my environment, so experimenting would require setting things up again first), I cannot quickly provide a configuration for version 2.1, because I have never had one in use. Long story short: I propose providing the configuration example for 2.8 as seen above (assuming that it has been tested and works), along with mentioning the version and the fact that it may not work for other versions.

Regarding the health check: again, I totally agree that HAProxy should be checked on all nodes, not only on the one holding the VIP. Thank you for noticing!

@neolit123
Member

@nijave would you send a PR for this as you suggested earlier?

@larsskj

larsskj commented Apr 29, 2024

I'm running HA Proxy 2.4 (default in Ubuntu 22.04) without any custom healthcheck setups - and it works just fine:

frontend kube-cph
        bind :::6443 v4v6
        mode tcp
        timeout client 600s
        option tcplog
        default_backend kube-cph-api

backend kube-cph-api
        mode tcp
        timeout server 600s
        option log-health-checks
        server kube01 kube01:6443 check
        server kube02 kube02:6443 check
        server kube03 kube03:6443 check

I've been running an HA Proxy loadbalancer in front of my K8s clusters with similar configurations for at least four years, on several clusters, never had any problems.

@nijave
Contributor Author

nijave commented Apr 29, 2024

@nijave would you send a PR for this as you suggested earlier?

Yeah, give me a few days

@nijave
Contributor Author

nijave commented Apr 29, 2024

I'm running HA Proxy 2.4 (default in Ubuntu 22.04) without any custom healthcheck setups - and it works just fine:

frontend kube-cph
        bind :::6443 v4v6
        mode tcp
        timeout client 600s
        option tcplog
        default_backend kube-cph-api

backend kube-cph-api
        mode tcp
        timeout server 600s
        option log-health-checks
        server kube01 kube01:6443 check
        server kube02 kube02:6443 check
        server kube03 kube03:6443 check

I've been running an HA Proxy loadbalancer in front of my K8s clusters with similar configurations for at least four years, on several clusters, never had any problems.

I was able to get a working setup using the current guide; however, it didn't handle failure scenarios correctly (so it was load balanced, but not really highly available). With your setup, it doesn't look like you're checking api-server health, only whether it accepts TCP connections. If the api-server health check is failing, haproxy will still route traffic to those instances. An easy test is killing etcd on a node and observing that the api-server is still running but returning an error code for /healthz (in which case it should be removed from the active backends in haproxy).

@larsskj

larsskj commented Apr 29, 2024

You're probably right. At home I have a bare metal cluster, and I routinely update the nodes, meaning that Kubernetes will be shut down during reboots.

So far HAProxy has handled this situation without any hiccups - but let me have a look when I return home.

@larsskj

larsskj commented Apr 30, 2024

Did some further testing: This works for me.

backend kube-cph-api
        mode tcp
        timeout server 600s

        option log-health-checks
        option httpchk GET /healthz

        server kube01 kube01:6443 check check-ssl verify none
        server kube02 kube02:6443 check check-ssl verify none
        server kube03 kube03:6443 check check-ssl verify none

@nijave
Contributor Author

nijave commented May 4, 2024

Poked around and it looks like Ubuntu and recent versions of RHEL (and clones) are on HAProxy v2.4 (LTS), which appears to also work with the config I mentioned above.

OS - version (EOL date for community support)

RHEL 7 - v1.5.18 (30 Jun 2024)
RHEL 8 - v1.8.27 (31 May 2029)
RHEL 9 - v2.4.22 (31 May 2032)

Ubuntu 20.04 - v2.0.33 (02 Apr 2025)
Ubuntu 22.04 - v2.4.24 (01 Apr 2027)
Ubuntu 24.04 - v2.8.5 (25 Apr 2036)
