This repository has been archived by the owner on Jan 4, 2022. It is now read-only.

cluster fails to start when running with systemd >= v240 #339

Open
dongsupark opened this issue Feb 19, 2019 · 1 comment

@dongsupark
Member

When testing kube-spawn with systemd v240 or newer, the whole cluster fails to start.

Only one of the nodes becomes active, and only that node gets registered with systemd-machined, even though all of the nodes should be. Which node becomes active seems to be fairly random: sometimes it is the master, sometimes one of the workers, and in some cases every node ends up inactive.

On the other hand, when running with systemd v239, everything works fine.

I have not managed to track down the cause. A git bisect between v239 and v240 is probably needed.

@dongsupark dongsupark added the bug label Feb 19, 2019
@invidian
Member

https://github.com/systemd/systemd/blob/401faa3533280b05fee972e0c64885caf4b31e4c/units/var-lib-machines.mount#L10 might be the reason why it's not spawning.

When I try to spawn it on Ubuntu 19.04 with systemd version 240, I get this error:

    ubuntu: Download of https://alpha.release.flatcar-linux.net/amd64-usr/current/flatcar_developer_container.bin.bz2 complete.
    ubuntu: Failed to set file attributes on /var/lib/machines/.#flatcar.raw43b38c6ebee86770: Operation not supported
    ubuntu: Created new local image 'flatcar'.
    ubuntu: Operation completed successfully.
    ubuntu: Exiting.
    ubuntu: + sudo GOPATH=/home/vagrant/go ./kube-spawn create --cni-plugin-dir=/home/vagrant/go/bin
    ubuntu: Downloading kubectl
    ubuntu: Downloading 10-kubeadm.conf
    ubuntu: Downloading kubelet.service
    ubuntu: Downloading kubelet
    ubuntu: Downloading kubeadm
    ubuntu: downloading socat
    ubuntu: Copying files for cluster ...
    ubuntu: Generating configuration files from templates ...
    ubuntu: Cluster default created
    ubuntu: + sudo GOPATH=/home/vagrant/go ./kube-spawn start --cni-plugin-dir=/home/vagrant/go/bin --nodes=2
    ubuntu: Warning: kube-proxy could crash due to insufficient nf_conntrack hashsize.
    ubuntu: setting nf_conntrack hashsize to 131072... 
    ubuntu: making iptables FORWARD chain defaults to ACCEPT...
    ubuntu: Failed to start cluster: stat /var/lib/machines.raw: no such file or directory

I'm currently testing this workaround:

diff --git a/pkg/bootstrap/node.go b/pkg/bootstrap/node.go
index 890e974..9272b7e 100644
--- a/pkg/bootstrap/node.go
+++ b/pkg/bootstrap/node.go
@@ -39,7 +39,7 @@ const (
        ctHashsizeValue        string = "131072"
        ctMaxSysctl            string = "/proc/sys/net/nf_conntrack_max"
        machinesDir            string = "/var/lib/machines"
-       machinesImage          string = "/var/lib/machines.raw"
+       machinesImage          string = "/var/lib/machines/flatcar.raw"
        baseImageStableVersion string = "1478.0.0"
 )

invidian added a commit to invidian/kube-spawn that referenced this issue May 24, 2019
Systemd prior to version 240 created /var/lib/machines.raw as a btrfs volume for storing machined disk images. This functionality has since been removed, and images are now stored as regular files. More details here:
https://github.com/systemd/systemd/blob/401faa3533280b05fee972e0c64885caf4b31e4c/units/var-lib-machines.mount#L10

Additionally, if someone already has /var/lib/machines backed by btrfs, machined won't create machines.raw, which will also cause setting up the cluster to fail.

This patch adds a check: if machinesImage exists, it is used; if not, the code falls back to just checking the disk space available for /var/lib/machines.

Closes kinvolk#339 kinvolk#293

Signed-off-by: Mateusz Gozdek <[email protected]>
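
For reference, a minimal sketch of the fallback described in the commit message above, assuming hypothetical helper names and Linux-only syscalls (the actual change lives in pkg/bootstrap/node.go): if the legacy machines.raw image is missing, check the free space of /var/lib/machines instead.

    // Minimal sketch (not the actual kube-spawn code) of the fallback described
    // in the commit message above. Helper names are hypothetical; Linux-only.
    package main

    import (
    	"fmt"
    	"os"

    	"golang.org/x/sys/unix"
    )

    const (
    	machinesDir   = "/var/lib/machines"
    	machinesImage = "/var/lib/machines.raw"
    )

    // pathToCheck returns the path whose free space should be inspected:
    // the legacy machines.raw image if it exists (systemd < 240), otherwise
    // the machines directory itself (systemd >= 240 stores raw images there).
    func pathToCheck() string {
    	if _, err := os.Stat(machinesImage); err == nil {
    		return machinesImage
    	}
    	return machinesDir
    }

    // freeSpaceBytes reports the free space available on the filesystem
    // backing the given path.
    func freeSpaceBytes(path string) (uint64, error) {
    	var st unix.Statfs_t
    	if err := unix.Statfs(path, &st); err != nil {
    		return 0, err
    	}
    	return st.Bavail * uint64(st.Bsize), nil
    }

    func main() {
    	p := pathToCheck()
    	free, err := freeSpaceBytes(p)
    	if err != nil {
    		fmt.Fprintf(os.Stderr, "checking %s: %v\n", p, err)
    		os.Exit(1)
    	}
    	fmt.Printf("%s: %d bytes free\n", p, free)
    }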
invidian added a commit to invidian/kube-spawn that referenced this issue May 27, 2019
invidian added a commit to invidian/kube-spawn that referenced this issue May 27, 2019
invidian added a commit to invidian/kube-spawn that referenced this issue May 27, 2019