Calico node crashing without error message on Raspberry Pi 4 connected with wireless wlan0 #8819

Open
tkislan opened this issue May 14, 2024 · 2 comments

tkislan commented May 14, 2024

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  # Configures Calico networking.
  calicoNetwork:
    # Note: The ipPools section cannot be modified post-install.
    ipPools:
      - blockSize: 26
        cidr: 10.244.0.0/16
        encapsulation: VXLANCrossSubnet
        natOutgoing: Enabled
        nodeSelector: all()
    nodeAddressAutodetectionV4:
      interface: tun0

I'm using an OpenVPN network to connect edge devices to the master node running in the cloud.
An Intel NUC on the same network as the problematic Raspberry Pi works as expected.

ip addr output

1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN group default qlen 1000
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether dc:a6:32:9f:c1:27 brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    link/ether dc:a6:32:9f:c1:28 brd ff:ff:ff:ff:ff:ff
    inet 192.168.1.23/24 metric 600 brd 192.168.1.255 scope global dynamic wlan0
       valid_lft 11832sec preferred_lft 11832sec
4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 10.8.0.7/24 scope global tun0
       valid_lft forever preferred_lft forever
5: docker0: <NO-CARRIER,BROADCAST,MULTICAST,UP> mtu 1500 qdisc noqueue state DOWN group default
    link/ether 02:42:02:f2:d8:dc brd ff:ff:ff:ff:ff:ff
    inet 172.17.0.1/16 brd 172.17.255.255 scope global docker0
       valid_lft forever preferred_lft forever
9: tunl0@NONE: <NOARP,UP,LOWER_UP> mtu 1480 qdisc noqueue state UNKNOWN group default qlen 1000
    link/ipip 0.0.0.0 brd 0.0.0.0
    inet 10.244.210.192/32 scope global tunl0
       valid_lft forever preferred_lft forever
50: calib9ebbc1fedc@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default qlen 1000
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-d27bd61c-6107-d514-2f04-31a40d632e19
54: cali82b6a9674c3@if3: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1480 qdisc noqueue state UP group default qlen 1000
    link/ether ee:ee:ee:ee:ee:ee brd ff:ff:ff:ff:ff:ff link-netns cni-b6010d39-9714-044e-86b7-7d308d8f310c

The Ethernet port (eth0) is not used. wlan0 is the interface connected to the internet, and tun0 is the interface Calico should use, as configured through nodeAddressAutodetectionV4.
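To double-check which address interface-based autodetection should end up with, here is a minimal Python sketch that parses `ip addr` text and picks the first IPv4 address on an interface whose name matches a regex. The helper names (`ipv4_by_interface`, `pick_address`) are hypothetical, the sample is trimmed from the output above, and full-name matching is an assumption of this sketch; it does not claim to reproduce Calico's own selection logic.

```python
import re

# Sample trimmed from the `ip addr` output in this issue.
IP_ADDR_SAMPLE = """\
2: eth0: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN group default qlen 1000
    link/ether dc:a6:32:9f:c1:27 brd ff:ff:ff:ff:ff:ff
3: wlan0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP group default qlen 1000
    inet 192.168.1.23/24 metric 600 brd 192.168.1.255 scope global dynamic wlan0
4: tun0: <POINTOPOINT,MULTICAST,NOARP,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 500
    link/none
    inet 10.8.0.7/24 scope global tun0
"""

# "4: tun0: <...>" or "50: calib9ebbc1fedc@if3: <...>" -> interface name
HEADER = re.compile(r"^\d+:\s+([^:@\s]+)")
# "    inet 10.8.0.7/24 ..." -> address with prefix (does not match inet6)
INET = re.compile(r"^\s+inet\s+(\S+)")


def ipv4_by_interface(ip_addr_text):
    """Map each interface name to the list of IPv4 addresses shown for it."""
    result = {}
    current = None
    for line in ip_addr_text.splitlines():
        header = HEADER.match(line)
        if header:
            current = header.group(1)
            result.setdefault(current, [])
            continue
        inet = INET.match(line)
        if inet and current is not None:
            result[current].append(inet.group(1))
    return result


def pick_address(ip_addr_text, interface_regex):
    """Return (name, address) for the first matching interface that has an
    IPv4 address, or None. Full-name matching is an assumption of this sketch."""
    pattern = re.compile(interface_regex)
    for name, addrs in ipv4_by_interface(ip_addr_text).items():
        if pattern.fullmatch(name) and addrs:
            return name, addrs[0]
    return None
```

With the sample above, `pick_address(IP_ADDR_SAMPLE, "tun0")` returns `("tun0", "10.8.0.7/24")`, while `pick_address(IP_ADDR_SAMPLE, "eth0")` returns `None` because eth0 has no IPv4 address.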

There are no logs indicating any kind of error. The calico-node pod just ends up in the Completed state and gets restarted.
Other pods fail DNS resolution, probably because the kube-proxy pod is crashing as well.

What is very suspicious is that calico-node logs multiple lines with EndpointId=eth0, which doesn't make sense to me, because eth0 is down and not used.

logs:
calico-node-describe.txt
calico-node.log
csi-node-driver.log
kube-proxy-describe.txt
kube-proxy.log

Expected Behavior

Current Behavior

Endless CrashLoopBackOff, no pods working on the node

Possible Solution

Steps to Reproduce (for bugs)

  1. Install the Calico Tigera operator
  2. Run kubeadm join on a Raspberry Pi connected through the wlan0 interface

Context

Your Environment

  • Calico version: quay.io/tigera/operator:v1.32.7 docker.io/calico/cni:v3.27.3
  • Orchestrator version (e.g. kubernetes, mesos, rkt): Kubernetes bootstrapped with kubeadm
  • Operating System and version: Ubuntu 24.04 LTS
  • Link to your project (optional):
@tomastigera
Contributor

@tkislan could it be related to #8726?

@tkislan
Author

tkislan commented May 14, 2024

Unloading the kernel module and restarting the pods doesn't seem to have helped.

  Warning  Unhealthy       20s (x2 over 21s)  kubelet            Readiness probe failed: calico/node is not ready: BIRD is not ready: Error querying BIRD: unable to connect to BIRDv4 socket: dial unix /var/run/calico/bird.ctl: connect: connection refused
  Warning  Unhealthy       16s                kubelet            Readiness probe failed: 2024-05-14 16:56:55.526 [INFO][243] confd/health.go 180: Number of node(s) with BGP peering established = 0
calico/node is not ready: BIRD is not ready: BGP not established with 10.8.0.13,10.8.0.1,10.8.0.6

At least here it seems that it's getting killed because of the health check.
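As a quick sanity check on the probe message, here is a minimal Python sketch that extracts the peers BIRD reports as not established and confirms they all sit in the tun0 subnet (10.8.0.7/24 from the `ip addr` output). The helper name `unestablished_peers` is hypothetical, and the regex assumes the IPv4-only message format quoted above.

```python
import ipaddress
import re

# Readiness probe message copied from the kubelet event above.
PROBE_MESSAGE = (
    "calico/node is not ready: BIRD is not ready: "
    "BGP not established with 10.8.0.13,10.8.0.1,10.8.0.6"
)


def unestablished_peers(message):
    """Return the peer IPs listed after 'BGP not established with'.

    Assumes a comma-separated list of IPv4 addresses, as in the message above.
    """
    match = re.search(r"BGP not established with ([\d.,]+)", message)
    if not match:
        return []
    # Strip stray trailing punctuation before splitting on commas.
    raw = match.group(1).strip(".,")
    return [ipaddress.ip_address(part) for part in raw.split(",") if part]


# Network of the tun0 address reported by `ip addr` (10.8.0.7/24).
vpn_network = ipaddress.ip_interface("10.8.0.7/24").network
peers = unestablished_peers(PROBE_MESSAGE)
peers_on_vpn = all(peer in vpn_network for peer in peers)
```

Here `peers_on_vpn` is `True`: all three peers are in 10.8.0.0/24, so the node is trying to peer over the VPN interface rather than over wlan0.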

# ls -l /var/run/calico
total 0
srw-rw---- 1 root root  0 May 14 19:00 bird.ctl
srw-rw---- 1 root root  0 May 14 19:00 bird6.ctl
drwx------ 2 root root 40 May 14 18:56 cgroup
-rw------- 1 root root  0 May 14 18:56 ipam.lock

But the socket files do exist on the host.

# ./calicoctl node checksystem
Checking kernel version...
		6.8.0-1004-raspi    					OK
Checking kernel modules...
		nf_conntrack_netlink					OK
		xt_addrtype         					OK
		xt_icmp             					OK
		ip_set              					OK
		ip6_tables          					OK
		ip_tables           					OK
		ipt_rpfilter        					OK
		xt_mark             					OK
		xt_multiport        					OK
		vfio-pci            					OK
		xt_bpf              					OK
		ipt_REJECT          					OK
		xt_rpfilter         					OK
		ipt_set             					OK
		xt_icmp6            					OK
		ipt_ipvs            					OK
		xt_conntrack        					OK
		xt_set              					OK
		xt_u32              					OK
System meets minimum system requirements to run Calico!

Let me know what more information I can provide. I'm really desperate here; I have been trying to figure this out for the past 3 days.
