Docs - EVPN: extend the information about arp suppression support in case of L2 #16015

Open
2 tasks done
fedepaol opened this issue May 15, 2024 · 0 comments
Labels
triage Needs further investigation

Description

With the premise that I am still learning about EVPN, I tried to set up an EVPN L2 topology to see how it works. I was able to make everything work except ARP suppression, so I wonder whether my setup is a corner case or whether we could enhance the docs a bit.

ARP suppression is meant to reduce the amount of multicast (or replicated) traffic going through the underlay.
The docs say:

"ARP/ND suppression is enabled per bridge_slave via neigh_suppress." and suggest that setting the parameter will enable ARP suppression in a MAC VRF. The problem is that the parameter alone is not enough (or I am missing something).

Given a topology like the one I have in my containerlab-based example:

 

            ┌─────────┐
            │         │
            │  64612  │
            │         │
            └────┬────┘
                 │
                 │
         ┌───────┴────────┐
         │                │
    ┌────┴────┐      ┌────┴────┐
    │         │      │         │
    │  64512  │      │  64512  ├───────┐
    │         │      │         │       │
    └────┬────┘      └────┬────┘       │
         │                │            │        L2
    ┌────┴────┐     ┌─────┴────┐  ┌────┴─────┐
    │         │     │          │  │          │
    │  Host1  │     │   Host2  │  │   Host3  │
    │         │     │          │  │          │
    └─────────┘     └──────────┘  └──────────┘

192.168.10.2/24    192.168.10.3/24   192.168.10.4/24

When I arping Host2 from Host1, everything works except ARP suppression: every ARP request still goes through the VXLAN tunnel.

My understanding is that:

  • zebra looks at the local FDB / neighbor table to advertise the MAC / IP to the other side
  • the other leaf sees the type 2 (MAC/IP advertisement) route and adds the entries to its FDB / ARP table
  • the VXLAN interface (with the neigh_suppress on parameter) proxies the ARP request

But the ARP request is broadcast, and the reply comes back to the leaf's bridge and is unicast towards Host1. So the leaf's own network stack never sees the reply, and the leaf's ARP cache is never filled.

Note: I don't think this is an FRR bug, but I think the docs should be clearer about what is provided by FRR and what is provided by the kernel in this scenario.

Relevant discussion here https://frrouting.slack.com/archives/CP5NXU36G/p1715767012284389

Version

not relevant

How to reproduce

Set up an L2 EVPN and generate some traffic between two hosts connected to the Linux bridges. An example can be found here https://github.com/fedepaol/evpnlab/tree/main/02_clab_l2

Expected behavior

After the first arping from Host1 to Host2, the local leaf replies to the ARP request itself instead of forwarding it.

Actual behavior

The neigh_suppress parameter relies on the neighbor table to proxy the ARP request.
The neighbor table of the leaf is populated either because a type 2 route is received, or by the kernel when it receives an ARP reply.

So, if the traffic is pure L2 between the hosts, no ARP cache is filled and ARP suppression never kicks in. I was reading #12574 (comment) where neighmgrd is mentioned, which would probably solve the issue.
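
To make the reasoning above concrete, here is a deliberately simplified toy model in Python. It is only a sketch of my understanding, not how FRR or the kernel are implemented: the `Leaf` class, its methods and the MAC address are hypothetical, and the model assumes that a leaf can only attach an IP to an advertised MAC if that IP is present in its own neighbor table.

```python
class Leaf:
    """Toy leaf: a bridge FDB (MACs) plus a neighbor table (IP -> MAC)."""

    def __init__(self):
        self.fdb = set()   # MACs learned by the bridge from data traffic
        self.neigh = {}    # IP -> MAC entries (ARP cache / neighbor table)

    def advertise(self):
        """Return (mac, ip) pairs; ip is None when only the MAC is known."""
        ip_by_mac = {mac: ip for ip, mac in self.neigh.items()}
        return [(mac, ip_by_mac.get(mac)) for mac in sorted(self.fdb)]

    def install(self, routes):
        """Install remote routes; only MAC+IP routes create neighbor entries."""
        for mac, ip in routes:
            self.fdb.add(mac)
            if ip is not None:
                self.neigh[ip] = mac

    def handle_arp_request(self, target_ip):
        """With suppression on, answer from the neighbor table or flood."""
        mac = self.neigh.get(target_ip)
        if mac is None:
            return "flood"
        return f"proxy-reply {mac}"


HOST2_IP = "192.168.10.3"
HOST2_MAC = "aa:bb:cc:00:00:03"  # made-up MAC for illustration

leaf1, leaf2 = Leaf(), Leaf()

# Pure L2: leaf2's bridge learns Host2's MAC, but its ARP cache stays empty.
leaf2.fdb.add(HOST2_MAC)
leaf1.install(leaf2.advertise())
print(leaf1.handle_arp_request(HOST2_IP))  # flood

# If something fills leaf2's neighbor table, the IP can be advertised too.
leaf2.neigh[HOST2_IP] = HOST2_MAC
leaf1.install(leaf2.advertise())
print(leaf1.handle_arp_request(HOST2_IP))  # proxy-reply aa:bb:cc:00:00:03
```

In this model the first request is flooded because nothing ever puts Host2's IP into a neighbor table, which matches what I observe in the lab. The second case stands in for whatever would populate that table (a leaf acting as gateway, or a component such as the neighmgrd mentioned above).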

Additional context

I am not sure whether there are scenarios with mixed L2 and L3, where the leaf acts as gateway and ARP requests therefore go from the bridge to the leaf's own stack, which would let them fill the local ARP table. But in a pure L2 scenario I think it just doesn't work.

Checklist

  • I have searched the open issues for this bug.
  • I have not included sensitive information in this report.
@fedepaol fedepaol added the triage Needs further investigation label May 15, 2024