Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v1.13 without the deny precedence patch + ipsec #32459

Draft
wants to merge 13 commits into
base: v1.13
Choose a base branch
from

Conversation

julianwiedmann
Copy link
Member

Please ensure your pull request adheres to the following guidelines:

  • For first time contributors, read Submitting a pull request
  • All code is covered by unit and/or runtime tests where feasible.
  • All commits contain a well written commit description including a title,
    description and a Fixes: #XXX line if the commit addresses a particular
    GitHub issue.
  • If your commit description contains a Fixes: <commit-id> tag, then
    please add the commit author[s] as reviewer[s] to this issue.
  • All commits are signed off. See the section Developer’s Certificate of Origin
  • Provide a title or release-note blurb suitable for the release notes.
  • Are you a user of Cilium? Please add yourself to the Users doc
  • Thanks for contributing!

Fixes: #issue-number

<!-- Enter the release note text here if needed or remove this section! -->

pippolo84 and others added 8 commits March 28, 2024 17:14
This reverts commit b0a4f37.

Signed-off-by: Fabio Falzoi <[email protected]>
This reverts commit 0c4dcb2.

Signed-off-by: Fabio Falzoi <[email protected]>
This reverts commit 90f8859.

Signed-off-by: Fabio Falzoi <[email protected]>
This reverts commit 80f7fef.

Signed-off-by: Fabio Falzoi <[email protected]>
No functional change, just to make further revision easier.

Signed-off-by: gray <[email protected]>
@maintainer-s-little-helper maintainer-s-little-helper bot added backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master. labels May 10, 2024
@julianwiedmann
Copy link
Member Author

/test-backport-1.13

This patch introduces a workaround to avoid kernel issue when deleting
xfrm states.

Let's start from the kernel issue.

After installing two xfrm states on the same host using below commands
(please note the differences on the mark, mask, src):

```
ip x s a src 10.244.1.43 dst 10.244.3.114 proto esp spi 0x00000003 reqid 1 mode tunnel replay-window 0 mark 0x2a450d00 mask 0xffff0f00 output-mark 0xd00 mask 0xffffff00 aead 'rfc4106(gcm(aes))' 0x42a89014074c49243219a20cc87cadd7b9c0d7d1 128 sel src 0.0.0.0/0 dst 0.0.0.0/0
ip x s a src 0.0.0.0 dst 10.244.3.114 proto esp spi 0x00000003 reqid 1 mode tunnel replay-window 0 mark 0xd00 mask 0xf00 output-mark 0xd00 mask 0xffffff00 aead 'rfc4106(gcm(aes))' 0x42a89014074c49243219a20cc87cadd7b9c0d7d1 128 sel src 0.0.0.0/0 dst 0.0.0.0/0
```

When trying to delete the first xfrm state using the following command,
Linux kernel will instead remove the second xfrm state but keep the
first:

```
ip x s d src 10.244.1.43 dst 10.244.3.114 proto esp spi 0x00000003 mark 0x2a450d00 mask 0xffff0f00
```

This causes troubles for cilium upgrade.

A real world scenario for cilium upgrade from 1.13.12 to 1.13.14 could
be like:
1. Before upgrade, the node has "old-style" xfrm state to catch mark
   "0xd00/0xf00" for ingress traffic; old bpf programs also set "0xd00"
   mark to ingress skbs;
2. Upgrade begins, bpf programs are reloaded to new version, thereafter
   ingress skbs are marked with "0xXXXX0d00";
3. After a short while, cilium-agent installs new xfrm states to catch
   traffic with specific mark "0xXXXX0d00";

During window between step 2 and 3, cilium relies on "old-style" xfrm
states "0xd00/0xf00" to catch traffic with specific mark "0xXXXX0d00".

So far so good.

However, in a large scale cluster it's inevitable to receive
NodeDeletion events during upgrade due to node churn. Once seeing a
NodeDeletion event, cilium-agent will remove the xfrm state for that
gone-away remote node.

Now we hit the aforementioned kernel issue: cilium-agent tries to delete
the xfrm state catching more specific mark, but kernel wrongly removes
the one catching general mark.

This causing traffic disruption until upgrade completes with all new xfrm
states installed.

This patch provides an elegant solution at low cost: if cilium-agent
wants to remove a xfrm state catching specific mark, it has to
temporarily remove the xfrm state catching general mark first and add it
back after:
1. Temporarily remove the xfrm states catching the general mark;
2. Remove the xfrm state we really care abot;
3. Add back the temporaily removed one on step 1;

Indeed there will be a small window between temporary removing and
adding back, but our past test shows the window lasts 200-900µs only, so short
that we shoudn't see many drops.

Suggested-by: Julian Wiedmann <[email protected]>
Signed-off-by: gray <[email protected]>
@jschwinger233 jschwinger233 force-pushed the hf/v1.13/v1.13.14-without-deny-precedence-with-xfrm-fix branch from ae6454d to 33ebbca Compare May 15, 2024 10:42
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Active
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233 jschwinger233 temporarily deployed to release-developer-images May 15, 2024 10:43 — with GitHub Actions Inactive
@jschwinger233
Copy link
Member

/test-backport-1.13

@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:11 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:11 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:11 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:11 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:11 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:12 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:12 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:12 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 22, 2024 14:12 — with GitHub Actions Waiting
[ upstream commit 197b763 ]

The new box versions include Go 1.20.1 which is now required for
development.

Backporter note: 4.9 box was not updated.

Signed-off-by: Tobias Klauser <[email protected]>
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 07:57 — with GitHub Actions Waiting
Reduces GC CPU usage and memory allocations coming from XfrmStateList.
To ensure we have up-to-date cache, wrap all XfrmState related
functions inside cache, which is invalidated whenever XfrmState changes.

This is follow-up to #32577
While that PR averages out CPU usage over time, in large cluster 100+
nodes amount of allocations coming from netlink.XfrmStateList() is high
due to backgroundSync where we usually don't change any Xfrm states.
This becomes more and more expensive as number of nodes increases.

Added CI test to make sure that we accidentally don't add calls that
modify XFRMState without going through cache.

Also, added hidden option that allows to turn of caching.

Signed-off-by: Marcel Zieba <[email protected]>
Signed-off-by: Marcel Zieba <[email protected]>
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
@marseel marseel requested a deployment to release-developer-images May 23, 2024 20:20 — with GitHub Actions Waiting
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.13 This PR represents a backport for Cilium 1.13.x of a PR that was merged to main. kind/backports This PR provides functionality previously merged into master.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants