Support IP fragmentation in eBPF #8821

Open
nick-oconnor opened this issue May 14, 2024 · 5 comments
Labels
area/bpf eBPF Dataplane issues kind/enhancement

Comments

nick-oconnor commented May 14, 2024

Expected Behavior

UDP packet fragments destined for a pod's IP which are not denied by policy arrive on the pod's interface.

Current Behavior

The eBPF data plane appears to be dropping UDP packet fragments by policy. The initial fragment is correctly forwarded from the node interface to the pod interface, but subsequent fragments never appear on the pod's interface. Each time a UDP packet fragment is dropped, Calico's dropped-by-policy counter for the interface is incremented. The pod interface eventually responds with "fragment reassembly time exceeded".

The only policies I have defined are k8s network policies. This problem does not occur when using the IPTables data plane.

Possible Solution

No idea. There may be a bug in Calico's eBPF policy code.

Steps to Reproduce (for bugs)

  1. Enable the eBPF data plane (kube-proxy not running, with or without DSR)
    • BGP w/ no encapsulation + dual stack (I'm unsure if this is relevant, packet captures were all IPv4)
  2. Deploy a pod
  3. Start a packet capture on the node running the pod
  4. Send a fragmented UDP packet to the pod IP (I'm unsure how to replicate this outside of SNMP)
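Step 4 can probably be reproduced without SNMP by sending any UDP datagram larger than the path MTU. The following is a minimal sketch, assuming a Linux sender with default path MTU discovery settings (where the kernel fragments oversized UDP datagrams itself), an IPv4 header without options, and a 1500-byte MTU. The function names are hypothetical helpers, not part of Calico or any SNMP tool.

```python
import socket

IPV4_HEADER_LEN = 20  # bytes, assuming no IP options
UDP_HEADER_LEN = 8    # bytes


def min_udp_payload_to_fragment(mtu: int = 1500) -> int:
    """Smallest UDP payload that no longer fits in a single IPv4 packet."""
    return mtu - IPV4_HEADER_LEN - UDP_HEADER_LEN + 1


def build_payload(mtu: int = 1500, extra: int = 1000) -> bytes:
    """Payload that exceeds the single-packet limit by `extra` bytes."""
    return b"x" * (min_udp_payload_to_fragment(mtu) - 1 + extra)


def send_oversized_udp(host: str, port: int, mtu: int = 1500, extra: int = 1000) -> int:
    """Send one oversized datagram; the sender's IP stack is expected to fragment it.

    Returns the number of payload bytes handed to the kernel.
    """
    payload = build_payload(mtu, extra)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        return sock.sendto(payload, (host, port))
```

Calling `send_oversized_udp("<pod-ip>", 161)` from another node while the capture from step 3 is running should show the first fragment on the pod's interface and the later fragments only on the node's interface, if the behavior described above is reproduced.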

Context

I experienced this behavior after migrating from the IPTables data plane to the eBPF data plane. All SNMP responses exceeding the network's MTU caused my SNMP collector to time out. I used captures from various points to determine where the packets were being dropped.

Your Environment

  • Calico version: v3.28.0
  • Orchestrator version (e.g. kubernetes, mesos, rkt): kubernetes 1.29.4
  • Operating System and version: Ubuntu 24.04 (6.8.0-31 kernel)
  • Relevant Calico config: BGP w/ no encapsulation, dual stack (added in v3.28.0)
@tomastigera
Contributor

That is a correct observation. Unfortunately, the eBPF data plane does not support IP fragmentation, because only the first fragment contains the UDP ports. Subsequent fragments cannot be matched reliably with the ongoing flow, and we cannot easily reassemble the fragments in eBPF (that is a limitation of the technology). That said, we might consider some improvements or workarounds in a future release.
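To illustrate why later fragments are hard to match, here is a minimal Python sketch (not Calico's code, and not eBPF) that parses the IPv4 fragment fields and remembers the UDP ports seen in the first fragment, keyed by source, destination, protocol, and IP identification. The packet builder, the `FragmentTracker` class, and the sample addresses are all hypothetical and exist only for the demonstration.

```python
import struct

IPV4_FMT = "!BBHHHBBH4s4s"  # fixed 20-byte IPv4 header, no options
UDP = 17


def make_packet(ident, offset_bytes, more, payload,
                src=b"\x0a\x00\x00\x01", dst=b"\x0a\x00\x00\x02"):
    """Build a toy IPv4 packet (checksum left at 0) for experimenting."""
    flags_frag = (0x2000 if more else 0) | (offset_bytes // 8)
    header = struct.pack(IPV4_FMT, 0x45, 0, 20 + len(payload), ident,
                         flags_frag, 64, UDP, 0, src, dst)
    return header + payload


def inspect(packet):
    """Return (flow_key, offset_bytes, more_fragments, ports_or_None)."""
    ver_ihl, _, _, ident, flags_frag, _, proto, _, src, dst = struct.unpack(
        IPV4_FMT, packet[:20])
    ihl = (ver_ihl & 0x0F) * 4
    offset = (flags_frag & 0x1FFF) * 8   # fragment offset is in 8-byte units
    more = bool(flags_frag & 0x2000)     # "more fragments" flag
    ports = None
    if proto == UDP and offset == 0:
        # Only a packet at offset 0 carries the UDP header, and so the ports.
        ports = struct.unpack("!HH", packet[ihl:ihl + 4])
    return (src, dst, proto, ident), offset, more, ports


class FragmentTracker:
    """Remember the ports from a first fragment so later ones can reuse them."""

    def __init__(self):
        self.ports_by_key = {}

    def lookup_ports(self, packet):
        key, _offset, more, ports = inspect(packet)
        if ports is not None:
            if more:  # first fragment of a fragmented datagram
                self.ports_by_key[key] = ports
            return ports
        # Later fragment: ports are known only if the first one was seen.
        return self.ports_by_key.get(key)
```

In this sketch a later fragment that arrives before the first one (or after the entry would have been evicted in a real, size-limited map) yields `None`, so a port-based policy has nothing to evaluate. That is one way to read "cannot be matched reliably" above.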

@tomastigera added the kind/enhancement and area/bpf (eBPF Dataplane issues) labels on May 14, 2024
@tomastigera changed the title from "eBPF data plane fails to forward UDP packet fragments" to "Support IP fragmentation in eBPF" on May 14, 2024
@nick-oconnor
Author

@tomastigera Wow, thanks for the quick reply! Very interesting. Looks like I have some homework regarding eBPF APIs. Adding this to the eBPF docs for Calico would probably save folks some time.

@nick-oconnor
Author

Related: cilium/cilium#25709 (comment)

@tomastigera
Contributor

Thanks for the pointer. The problem with kfuncs is that they exist in "newer" kernels only and are not necessarily a stable API. But we could perhaps add it for kernels that have that feature! 👍

@tomastigera
Contributor

It seems the patch ⬆️ is not present in any released kernel :(
