Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix _bgpv4Rib and _ebgpv4Rib inconsistency in BgpRoutingProcess #8985

Closed
wants to merge 2 commits into from

Conversation

heroinedd
Copy link

When a BGP process is multipath-ebgp disabled and multipath-ibgp enabled, for a BgpRoutingProcess and several equal-cost EBGP routes, its _bgpv4Rib will select them all while its _ebgpv4Rib will select only one of them. This commit fixes this inconsistency.

@batfish-bot
Copy link

This change is Reviewable

Copy link
Member

@arifogel arifogel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewed 4 of 4 files at r1, all commit messages.
Reviewable status: all files reviewed, 7 unresolved discussions (waiting on @heroinedd, @progwriter, and @SLarkworthy)


projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 137 at r1 (raw file):

  }

  protected BgpRib(

Why did you add a new constructor instead of changing the old one? This presents several problems:

  1. I cannot easily see what you changed, since there is no concise diff from the old constructor
  2. Any other callsites of the old constructor are at risk of the inconsistency this PR is intended to fix.
  3. We now have to maintain unrelated changes to these constructors separately.

projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 156 at r1 (raw file):

    boolean multipathIbgp = maxPathsIbgp == null || maxPathsIbgp > 1;
    boolean multipathEbgp = maxPathsEbgp == null || maxPathsEbgp > 1;
    _maxPaths = (multipathIbgp || multipathEbgp) ? null : 1;

Looks like you are projecting the maxPaths information down to a boolean and than inverting the projection to null, removing the actual max number of paths and setting to null in the case the input was greater than 1. What is the justification for removing this information?


projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 227 at r1 (raw file):

            .thenComparing(this::compareRouteAsPath)
            .compare(lhs, rhs);
    if (multipathCompare != 0 || isMultipath(lhs, rhs)) {

Such a change should be tested exhaustively via tests of the containing function, i.e. comparePreference.


projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 234 at r1 (raw file):

  }

  private boolean isMultipath(R lhs, R rhs) {

Can you please document the purpose and logic of this function? As is I am having trouble understanding the intended behavior.

For instance, why would multipath only apply to LEARNED routes? This seems like a behavior change in general that is not sufficiently motivated by the stated scope of this PR.


projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 238 at r1 (raw file):

        && lhs.getOriginMechanism() == LEARNED
        && rhs.getOriginMechanism() == LEARNED) {
      return lhs.getProtocol() == RoutingProtocol.IBGP ? isMultipathIbgp() : isMultipathEbgp();

What if isMultipathIbgp() is true but isMultipath() is false?


projects/batfish/src/main/java/org/batfish/dataplane/rib/Bgpv4Rib.java line 182 at r1 (raw file):

  }

  public Bgpv4Rib(

Same question about why you created a new constructor as for BgpRib


projects/batfish/src/test/java/org/batfish/dataplane/ibdp/BgpRoutingProcessTest.java line 989 at r1 (raw file):

  @Test
  public void testBgpv4RibConsistency() {

What is being tested here? Please document. Also, add link to github issue in comment if applicable.

Copy link
Member

@arifogel arifogel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @heroinedd, @progwriter, and @SLarkworthy)

a discussion (no related file):

This commit fixes this inconsistency.

  1. Nit: commit -> PR
  2. Please update the description to say what the new behavior is.

@heroinedd
Copy link
Author

heroinedd commented Apr 9, 2024

Hi, arifogel

In BgpRoutingProcess, there is a combined BGPv4 RIB named _bgpv4Rib to store IBGP/EBGP routes LEARNED from peers, and routes generated locally (e.g., redistributed from other protocol, aggregated routes, generated by BGP network command). There're also two specific BGPv4 RIB, i.e., _ibgpv4Rib and _ebgpv4Rib, they are helper RIBs to store IBGP and EBGP routes LEARNED from peers.

  1. I'm adding a new constructor since for _ibgpv4Rib and _ebgpv4Rib, only maxPathIbgp or maxPathEbgp matters while for _bgpv4Rib, both maxPathIbgp and maxPathEbgp matters. I think using two constructors to differentiate this is appropriate.
  2. In the new constructor, I set _maxPaths = (multipathIbgp || multipathEbgp) ? null : 1 since there's no accurate value for _maxPath. However, if you want to have an accurate value, it can be set to the maximum of maxPathIbgp and maxPathEbgp. Actually, the accurate maxPath value for _bgpv4Rib is already lost when initialized in BgpRoutingProcess.
  3. I change if (multipathCompare != 0 || isMultipath()) to if (multipathCompare != 0 || isMultipath(lhs, rhs)) since we should handle the multipath compare result differently for different cases (suppose now we have multipathIbgp enabled and multipathEbgp disabled, therefore multipath = multipathIbgp || multipathEbgp = true):
    (1) lhs and rhs are both LEARNED from IBGP/EBGP peers, then we have to check whether multipathIbgp/multipathEbgp is enabled.
    (2) At least one of lhs and rhs is not LEARNED from peers, then we have to check whether multipath is enabled.
  4. The test is testing the consistency of a BgpRoutingProcess's _bgpv4Rib and _ebgpv4Rib when multipathIbgp is enabled but multipathEbgp is disabled. That is, when multipathIbgp = true, multipathEbgp = false and multipath = multipathIbgp || multipathEbgp = true:
    (1) If multipath EBGPv4 routes are equal cost, then only one of them is merged onto the _tree of _bgpv4Rib and _ebgpv4Rib.
    (2) If some IBGPv4, EBGPv4 and locally generated routes are equal cost, them all IBGPv4 routes, one of the EBGPv4 routes and all locally generated routes are merged onto the _tree of _bgpv4Rib.

Copy link
Member

@arifogel arifogel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@heroinedd please reply to individual comment threads, and quote relevant text to keep the review organized.

Reviewable status: all files reviewed, 8 unresolved discussions (waiting on @heroinedd, @progwriter, and @SLarkworthy)


projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 137 at r1 (raw file):

  1. I'm adding a new constructor since for _ibgpv4Rib and _ebgpv4Rib, only maxPathIbgp or maxPathEbgp matters while for _bgpv4Rib, both maxPathIbgp and maxPathEbgp matters. I think using two constructors to differentiate this is appropriate.

With this statement you are answering the initial question of this thread, but not the numbered issues I have identified. Please ensure you have addressed all concerns in your replies, and preferably by quoting them inline.

I am still not convinced that a new constructor is the appropriate solution for the reasons I have identified. Furthermore, even if we were to go with your solution, the semantics are now unclear for each constructor, as you have not provided any additional documentation to clarify them.

Given the already sprawling complexity of this class, I instead urge you to consider an API change that:

  1. Does not involve an additional constructor
  2. Provides uniform semantics for all constructions of BgpRib
  3. Includes documentation for the parameters and fields implicated by the change.

projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 156 at r1 (raw file):

  1. In the new constructor, I set _maxPaths = (multipathIbgp || multipathEbgp) ? null : 1 since there's no accurate value for _maxPath. However, if you want to have an accurate value, it can be set to the maximum of maxPathIbgp and maxPathEbgp. Actually, the accurate maxPath value for _bgpv4Rib is already lost when initialized in BgpRoutingProcess.

Sorry, it looks like that may be reasonable and essentially what was already done for the old constructor; however, this is one example of a maintenance problem that also makes review difficult when you make another constructor - the comment wasn't automatically nearby in the diff, and you neglected to copy it:

      /*
       * Essentially make this infinite. This is a design choice to enable more comprehensive path
       * analysis up the stack (e.g., reachability, multipath consistency, etc.)
       */

projects/batfish/src/main/java/org/batfish/dataplane/rib/BgpRib.java line 234 at r1 (raw file):

  1. I change if (multipathCompare != 0 || isMultipath()) to if (multipathCompare != 0 || isMultipath(lhs, rhs)) since we should handle the multipath compare result differently for different cases (suppose now we have multipathIbgp enabled and multipathEbgp disabled, therefore multipath = multipathIbgp || multipathEbgp = true):
    (1) lhs and rhs are both LEARNED from IBGP/EBGP peers, then we have to check whether multipathIbgp/multipathEbgp is enabled.
    (2) At least one of lhs and rhs is not LEARNED from peers, then we have to check whether multipath is enabled.

I was asking you to document the function, not just to explain to me in the review what it does. I'm asking for a code change.

To be clear I understand perfectly what your code does, because it's there in front of me. What I am asking for is the intended behavior and explanation, which I need in order to judge if your code is correct.

Right now I am of the opinion that this code likely does the wrong thing when e.g. lhs is LEARNED but rhs is not (as well as the reverse).

For instance, why would multipath only apply to LEARNED routes? This seems like a behavior change in general that is not sufficiently motivated by the stated scope of this PR.

(emphasis added)
I asked a "why" question. The answer should not be what your code does, but why it work this way.


projects/batfish/src/test/java/org/batfish/dataplane/ibdp/BgpRoutingProcessTest.java line 989 at r1 (raw file):

The test is testing the consistency of a BgpRoutingProcess's _bgpv4Rib and _ebgpv4Rib when multipathIbgp is enabled but multipathEbgp is disabled. That is, when multipathIbgp = true, multipathEbgp = false and multipath = multipathIbgp || multipathEbgp = true:
(1) If multipath EBGPv4 routes are equal cost, then only one of them is merged onto the _tree of _bgpv4Rib and _ebgpv4Rib.
(2) If some IBGPv4, EBGPv4 and locally generated routes are equal cost, them all IBGPv4 routes, one of the EBGPv4 routes and all locally generated routes are merged onto the _tree of _bgpv4Rib.

Please document. Also, add link to github issue in comment if applicable.

Still missing requested documentation.

@heroinedd
Copy link
Author

Hi fogel,

I have made a new commit to resolve your concerns, which (1) keeps one constructor in BgpRib and Bgpv4Rib, and (2) adds documentation. Please have a look. Thanks.


Why did you add a new constructor instead of changing the old one? This presents several problems:
1. I cannot easily see what you changed, since there is no concise diff from the old constructor
2. Any other callsites of the old constructor are at risk of the inconsistency this PR is intended to fix.
3. We now have to maintain unrelated changes to these constructors separately.

Now I have only one constructor for BgpRib and Bgpv4Rib.


Looks like you are projecting the maxPaths information down to a boolean and than inverting the projection to null, removing the actual max number of paths and setting to null in the case the input was greater than 1. What is the justification for removing this information?

Now it is consistent with your original code.


Can you please document the purpose and logic of this function? As is I am having trouble understanding the intended behavior.
For instance, why would multipath only apply to LEARNED routes? This seems like a behavior change in general that is not sufficiently motivated by the stated scope of this PR.

This is intended to solve #8990. Multipath does not only apply to LEARNED routes, it applies to all routes. The condition

if (lhs.getProtocol() == rhs.getProtocol()
        && lhs.getOriginMechanism() == LEARNED
        && rhs.getOriginMechanism() == LEARNED)

is a special case of multipath comparing. Since in BgpRoutingProcess, _ebgpv4Rib and _ibgpv4Rib are used to store eBGP and iBGP routes learned (i.e., route.getOriginMechanism() == OriginMechanism.LEARNED) from BGP peers, given two iBGP/eBGP routes learned from BGP peers, the multipath comparing result should be the same for the _ibgpv4Rib/ebgpv4Rib and the _bgpv4Rib.


What if isMultipathIbgp() is true but isMultipath() is false?

If isMultipathIbgp() is true, then isMultipath() must be true, since multipath = multipathIbgp || multipathEbgp. In BgpRoutingProcess:

_bgpv4Rib =
        new Bgpv4Rib(
            _mainRib,
            bestPathTieBreaker,
            _process.getMultipathEbgp() || _process.getMultipathIbgp() ? null : 1,
            multiPathMatchMode,
            clusterListAsIbgpCost,
            _process.getLocalOriginationTypeTieBreaker(),
            _process.getNetworkNextHopIpTieBreaker(),
            _process.getRedistributeNextHopIpTieBreaker(),
            nextHopIpResolverRestriction);

Same question about why you created a new constructor as for BgpRib.

Only one constructor now.


What is being tested here? Please document. Also, add link to github issue in comment if applicable.

This is testing for #8990.

@heroinedd heroinedd closed this May 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants