policy(feat): GrpcRoute status support #12508

the-wondersmith

Subject

Enable handling of Gateway GRPCRoute resources in linkerd-policy-controller-k8s-status.

Problem

To support driving traffic via Gateway GRPCRoute resources, the policy controller's status component
must be made aware of the different route types (and their associated "sources") and of how to handle
them.

Solution

  • new route types implemented
  • existing test suite expanded to cover new route types
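
As a rough illustration of what "new route types" could look like, here is a minimal sketch that distinguishes routes by their Kubernetes `kind`. The `RouteKind` enum and its methods are hypothetical names invented for this example and are not taken from the PR; only the `HTTPRoute` and `GRPCRoute` kind strings come from the Gateway API.

```rust
/// Hypothetical route kinds a status controller might distinguish.
/// This type is an assumption for illustration, not the PR's actual type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum RouteKind {
    Http,
    Grpc,
}

impl RouteKind {
    /// The Gateway API `kind` string for this route type.
    pub fn kind(self) -> &'static str {
        match self {
            RouteKind::Http => "HTTPRoute",
            RouteKind::Grpc => "GRPCRoute",
        }
    }

    /// Parses a `kind` string, returning `None` for kinds not handled here.
    pub fn from_kind(kind: &str) -> Option<Self> {
        match kind {
            "HTTPRoute" => Some(RouteKind::Http),
            "GRPCRoute" => Some(RouteKind::Grpc),
            _ => None,
        }
    }
}
```

Keeping the mapping in one place means code that is generic over route types only needs to match on the enum, while type-specific handling can live in per-type submodules.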

Validation

  • tests pass
[Screenshot: 2024-04-25, 5:23:46 PM]

Partially Addresses

@the-wondersmith the-wondersmith requested a review from a team as a code owner April 25, 2024 21:35
@the-wondersmith the-wondersmith force-pushed the policy-feat-grpcroute-status-support branch 2 times, most recently from 2800b51 to f4a2649 Compare April 25, 2024 21:38

@olix0r olix0r left a comment

This is looking generally right to me.

Small nit: We generally have moved to stop using 'mod.rs' files, so in the case of policy-controller/k8s/status/src/tests/routes/mod.rs, we would typically name that controller/k8s/status/src/tests/routes.rs.

I assume we need to (temporarily, just for the sake of seeing it work end-to-end) add a patch to our Cargo.toml to take your branch of the k8s-gateway-api? Something like

[patch.crates-io]
k8s-gateway-api = { git = "https://github.com/the-wondersmith/k8s-gateway-api-rs", branch = "..." }


@adleong adleong left a comment

This is looking great!

policy-controller/k8s/status/src/index.rs (3 review threads, outdated and resolved)
@the-wondersmith

@olix0r

> This is looking generally right to me.

💪😁

> Small nit: We generally have moved to stop using 'mod.rs' files, so in the case of policy-controller/k8s/status/src/tests/routes/mod.rs, we would typically name that controller/k8s/status/src/tests/routes.rs.

I want to make sure I action this correctly. For context, I organized the tests as a mirror of the crate itself: src/routes/ is structured so that anything generic (anything that applies to any route type) lives at crate::routes::GenericRouteThing, while everything that applies to a specific route subtype lives at crate::routes::subtype::SpecificRouteThing.

Is the nit about the partitioning of the route types into discrete submodules? Or does the nit just mean that this is the preferred style / convention for linkerd crates?

├── Cargo.toml
└── src
   ├── ...
   ├── routes
   │  ├── grpc.rs
   │  └── http.rs
   ├── routes.rs
   └── tests
      ├── mod.rs
      ├── routes
      │  ├── grpc.rs
      │  └── http.rs
      └── routes.rs

> I assume we need to (temporarily, just for the sake of seeing it work end-to-end) add a patch to our Cargo.toml to take your branch of the k8s-gateway-api? Something like
>
> [patch.crates-io]
> k8s-gateway-api = { git = "https://github.com/the-wondersmith/k8s-gateway-api-rs", branch = "..." }

Yes. Locally, I've got essentially that in a .cargo/config.toml (just pointing to the local directory instead of the remote repo). There's an open PR in that repo as well.

the-wondersmith and others added 15 commits April 27, 2024 14:34
…rom linkerd-policy-controller-k8s-api

Signed-off-by: Mark S <[email protected]>
Bumps [lock_api](https://github.com/Amanieu/parking_lot) from 0.4.11 to 0.4.12.
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](Amanieu/parking_lot@lock_api-0.4.11...lock_api-0.4.12)

---
updated-dependencies:
- dependency-name: lock_api
  dependency-type: indirect
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Mark S <[email protected]>
Bumps [parking_lot](https://github.com/Amanieu/parking_lot) from 0.12.1 to 0.12.2.
- [Changelog](https://github.com/Amanieu/parking_lot/blob/master/CHANGELOG.md)
- [Commits](Amanieu/parking_lot@0.12.1...0.12.2)

---
updated-dependencies:
- dependency-name: parking_lot
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
Signed-off-by: Mark S <[email protected]>
…inkerd#12500)

Subject
Fixes a bug where headless endpoint mirrors get cleaned up during GC

Problem
When GC is triggered (which also happens at startup or when the link watch disconnects), the service mirror controller attempts to look for services that can be GC'ed. This is done by looping through the local mirrored services on the cluster, then extracting the name of the original service in the remote (by dropping the target name suffix).

However, this check doesn't account for the headless endpoint service mirrors (the per pod cluster IP services). For example, if you have nginx-svc in the west cluster and two replicas, the source cluster will end up with nginx-svc-west, nginx-set-0-west and nginx-set-1-west. The logic would then parse the resource name for the latter two services as nginx-set-0 and nginx-set-1 which won't exist on the remote and ends up deleting them as part of GC.

The next sync would recreate those mirrors but you end up with downtime.

Solution
For those cases, instead of parsing the remote resource from the local service name, retrieve the info from the `mirror.linkerd.io/headless-mirror-svc-name` label.
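
The lookup described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the service mirror's actual code (which is not Rust): the function name and signature are hypothetical, and the sketch assumes the label value is the local name of the parent mirror service (for example `nginx-svc-west`), so the cluster suffix is still stripped afterwards. Only the label key comes from the commit message.

```rust
use std::collections::BTreeMap;

/// Label key named in the commit message above.
const HEADLESS_MIRROR_LABEL: &str = "mirror.linkerd.io/headless-mirror-svc-name";

/// Hypothetical helper: resolves the remote service name for a local mirror.
///
/// Assumption: the label value is the local name of the parent mirror
/// service, so the "-<target_cluster>" suffix is stripped in both cases.
pub fn remote_service_name<'a>(
    local_name: &'a str,
    labels: &'a BTreeMap<String, String>,
    target_cluster: &str,
) -> Option<&'a str> {
    // Prefer the label on headless endpoint mirrors; fall back to the
    // service's own name for regular mirrors.
    let mirror_name = labels
        .get(HEADLESS_MIRROR_LABEL)
        .map(String::as_str)
        .unwrap_or(local_name);

    // Drop the target cluster suffix; `None` if the name is not a mirror name.
    let suffix = format!("-{}", target_cluster);
    mirror_name.strip_suffix(suffix.as_str())
}
```

With this shape, a per-pod mirror such as `nginx-set-0-west` resolves to the remote service it belongs to rather than to a non-existent `nginx-set-0`, so the GC check no longer treats it as orphaned.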

Validation
Unit tests

Fixes linkerd#12499

Signed-off-by: Marwan Ahmed <[email protected]>
Signed-off-by: Mark S <[email protected]>
Closes linkerd#12395

Failing to iterate over init containers as well as regular containers for finding the proxy in various parts of the code when the proxy is injected as a native sidecar resulted in:

- `Get` Destination API failing in the presence of opaque ports
- The injector failing to detect already-injected pods
- Various CLI issues

This PR is split into the following commits addressing each issue separately:

a8ebe76 - Fix injection check for existing sidecars
44e9625 - Fix 'linkerd uninject'
6269496 - Fix 'linkerd version --proxy'
42dbdad - Fix 'linkerd identity'
39db823 - Fix 'linkerd check'
7359f37 - Fix 'linkerd dg proxy-metrics'
f8f73c4 - Fix destination controller
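
The common thread in these fixes is searching both container lists. A minimal sketch of that idea, using simplified stand-in types rather than the real Kubernetes API types, and assuming the proxy container is named `linkerd-proxy` (the original fixes are not written in Rust; all names here are illustrative):

```rust
/// Simplified stand-in for a Kubernetes container (hypothetical type).
#[derive(Debug, Default)]
pub struct Container {
    pub name: String,
}

/// Simplified stand-in for a Kubernetes pod spec (hypothetical type).
#[derive(Debug, Default)]
pub struct PodSpec {
    pub init_containers: Vec<Container>,
    pub containers: Vec<Container>,
}

/// Assumed name of the proxy container.
const PROXY_CONTAINER_NAME: &str = "linkerd-proxy";

/// Finds the proxy whether it was injected as a regular container or as a
/// native sidecar (that is, as an init container).
pub fn find_proxy(spec: &PodSpec) -> Option<&Container> {
    spec.init_containers
        .iter()
        .chain(spec.containers.iter())
        .find(|c| c.name == PROXY_CONTAINER_NAME)
}
```

Searching only `containers` would return `None` for a pod whose proxy runs as a native sidecar, which is the failure mode the commits above address.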

Signed-off-by: Mark S <[email protected]>
@the-wondersmith the-wondersmith force-pushed the policy-feat-grpcroute-status-support branch from 3b8aad6 to fa79a34 Compare April 27, 2024 18:34
@olix0r

olix0r commented Apr 29, 2024

> Is the nit about the partitioning of the route types into discrete submodules? Or does the nit just mean that this is the preferred style / convention for linkerd crates?

It's limited to the mod.rs files. We usually avoid mod.rs files now, e.g. tests.rs instead of tests/mod.rs, etc.
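
To illustrate why the convention is only about file naming, here is a small single-file sketch. Inline modules stand in for the files; in the crate itself, `src/tests.rs` would contain `mod routes;` and `src/tests/routes.rs` would contain `mod grpc;` and `mod http;`. The `path` functions are hypothetical and exist only to show the resulting module paths.

```rust
// Stand-in for `src/tests.rs` (the replacement for `src/tests/mod.rs`).
pub mod tests {
    // Stand-in for `src/tests/routes.rs` (the replacement for
    // `src/tests/routes/mod.rs`).
    pub mod routes {
        // Stand-in for `src/tests/routes/grpc.rs`.
        pub mod grpc {
            /// Returns this module's fully qualified path.
            pub fn path() -> &'static str {
                module_path!()
            }
        }

        // Stand-in for `src/tests/routes/http.rs`.
        pub mod http {
            /// Returns this module's fully qualified path.
            pub fn path() -> &'static str {
                module_path!()
            }
        }
    }
}
```

Either file layout produces the same module tree, so `tests::routes::grpc` stays addressable at the same path; only the file that holds the `mod` declarations moves.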

@the-wondersmith

@olix0r

> > Is the nit about the partitioning of the route types into discrete submodules? Or does the nit just mean that this is the preferred style / convention for linkerd crates?
>
> It's limited to the mod.rs files. We usually avoid mod.rs files now, e.g. tests.rs instead of tests/mod.rs, etc.

Gotcha. Given that the tests/mod.rs predates these changes, I want to make sure I action the feedback correctly.

If you'll confirm that this is the desired structure for the crate, I'll push the relevant changes:

├── Cargo.toml
└── src
   ├── index.rs
   ├── lib.rs
   ├── resource_id.rs
   ├── routes
   │  ├── grpc.rs
   │  └── http.rs
   ├── routes.rs
   ├── service.rs
   ├── tests
   │  ├── routes
   │  │  ├── grpc.rs
   │  │  └── http.rs
   │  └── routes.rs
   └── tests.rs


@adleong adleong left a comment

Nice!

the-wondersmith and others added 24 commits May 7, 2024 16:54
Adds index metrics to the outbound policy index.

```
# HELP outbound_index_service_index_size The number of entires in service index
# TYPE outbound_index_service_index_size gauge
outbound_index_service_index_size 20
# HELP outbound_index_service_info_index_size The number of entires in the service info index
# TYPE outbound_index_service_info_index_size gauge
outbound_index_service_info_index_size 23
# HELP outbound_index_service_route_index_size The number of entires in the service route index
# TYPE outbound_index_service_route_index_size gauge
outbound_index_service_route_index_size{namespace="kube-system"} 0
outbound_index_service_route_index_size{namespace="cert-manager"} 0
outbound_index_service_route_index_size{namespace="default"} 0
outbound_index_service_route_index_size{namespace="linkerd"} 0
outbound_index_service_route_index_size{namespace="emojivoto"} 0
outbound_index_service_route_index_size{namespace="linkerd-viz"} 0
# HELP outbound_index_service_port_route_index_size The number of entires in the service port route index
# TYPE outbound_index_service_port_route_index_size gauge
outbound_index_service_port_route_index_size{namespace="kube-system"} 0
outbound_index_service_port_route_index_size{namespace="cert-manager"} 0
outbound_index_service_port_route_index_size{namespace="default"} 1
outbound_index_service_port_route_index_size{namespace="linkerd"} 0
outbound_index_service_port_route_index_size{namespace="emojivoto"} 3
outbound_index_service_port_route_index_size{namespace="linkerd-viz"} 0
```
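
For readers unfamiliar with this exposition format, the following is a minimal sketch of how per-namespace gauge lines like those above could be rendered by hand. It is not how the policy controller produces its metrics; the `render_gauge` function is hypothetical and written only to show the line format.

```rust
use std::collections::BTreeMap;
use std::fmt::Write;

/// Hypothetical helper: renders one gauge family with a `namespace` label
/// in the text format shown above.
pub fn render_gauge(name: &str, help: &str, sizes: &BTreeMap<String, usize>) -> String {
    let mut out = String::new();
    // Writing to a `String` cannot fail, so unwrapping is safe here.
    writeln!(out, "# HELP {} {}", name, help).unwrap();
    writeln!(out, "# TYPE {} gauge", name).unwrap();
    for (namespace, size) in sizes {
        // `{{` and `}}` emit literal braces around the label set.
        writeln!(out, "{}{{namespace=\"{}\"}} {}", name, namespace, size).unwrap();
    }
    out
}
```

A `BTreeMap` is used so namespaces are emitted in a stable, sorted order, which keeps the output easy to diff.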

Signed-off-by: Alex Leong <[email protected]>
(cherry picked from commit 405aabb)
…inkerd#12427)" (linkerd#12589)

This reverts commit 4fccf3e.

The early return was causing `pp.addresses = newAddressSet` to not be run when the list of addresses is empty; but setting that is still necessary so that labels are tracked correctly.
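
The pitfall can be sketched as follows. This is an illustration of the pattern only, not the destination controller's code (which is not Rust); the `PortPublisher` struct, its fields, and the `update` method are hypothetical stand-ins for `pp` and `newAddressSet`.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the publisher referred to as `pp` above.
#[derive(Debug, Default)]
pub struct PortPublisher {
    /// The most recently observed address set.
    pub addresses: BTreeSet<String>,
    /// Counts updates that carried at least one address (illustrative only).
    pub non_empty_updates: usize,
}

impl PortPublisher {
    /// Records a new address set.
    pub fn update(&mut self, new_addresses: BTreeSet<String>) {
        // Work that only applies to a non-empty set is guarded by a
        // condition rather than by returning early from the function...
        if !new_addresses.is_empty() {
            self.non_empty_updates += 1;
        }
        // ...so the stored state is always replaced, even with an empty set.
        self.addresses = new_addresses;
    }
}
```

An early `return` on the empty case would skip the final assignment and leave stale addresses in place, which is the kind of regression the revert describes.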

This was caught by the tap (viz) integration test run in the release workflow.

(cherry picked from commit 9bd8c00)
Bumps [thiserror](https://github.com/dtolnay/thiserror) from 1.0.59 to 1.0.60.
- [Release notes](https://github.com/dtolnay/thiserror/releases)
- [Commits](dtolnay/thiserror@1.0.59...1.0.60)

---
updated-dependencies:
- dependency-name: thiserror
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 9d5994c)
Bumps [github.com/prometheus/client_golang](https://github.com/prometheus/client_golang) from 1.19.0 to 1.19.1.
- [Release notes](https://github.com/prometheus/client_golang/releases)
- [Changelog](https://github.com/prometheus/client_golang/blob/main/CHANGELOG.md)
- [Commits](prometheus/client_golang@v1.19.0...v1.19.1)

---
updated-dependencies:
- dependency-name: github.com/prometheus/client_golang
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 5156aa8)
…d#12587)

Bumps [sinon](https://github.com/sinonjs/sinon) from 17.0.1 to 17.0.2.
- [Release notes](https://github.com/sinonjs/sinon/releases)
- [Changelog](https://github.com/sinonjs/sinon/blob/main/docs/changelog.md)
- [Commits](sinonjs/sinon@v17.0.1...v17.0.2)

---
updated-dependencies:
- dependency-name: sinon
  dependency-type: direct:development
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit efdc4c8)
We instrumented conditional execution of some of our flakier integration test
suites: viz, multicluster, and policy. Since then, we have introduced retries
that ameliorate flakiness. The current state allows these tests to regress on
main, to be discovered only at release time.

This commit removes the conditional execution of these tests, and instead
runs all integration tests uniformly.

(cherry picked from commit 14d259a)
…rd#12590)

Bumps [github.com/fatih/color](https://github.com/fatih/color) from 1.16.0 to 1.17.0.
- [Release notes](https://github.com/fatih/color/releases)
- [Commits](fatih/color@v1.16.0...v1.17.0)

---
updated-dependencies:
- dependency-name: github.com/fatih/color
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 7cfe5d9)
…kerd#12588)

Bumps [tj-actions/changed-files](https://github.com/tj-actions/changed-files) from 44.3.0 to 44.4.0.
- [Release notes](https://github.com/tj-actions/changed-files/releases)
- [Changelog](https://github.com/tj-actions/changed-files/blob/main/HISTORY.md)
- [Commits](tj-actions/changed-files@0874344...a29e8b5)

---
updated-dependencies:
- dependency-name: tj-actions/changed-files
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
(cherry picked from commit 21dd252)
@the-wondersmith

Just commenting for historical context purposes -

This PR was opened to facilitate code review, but its overall size and content don't mesh with our internal change management strategy. I'm closing it now that code review is complete and will break the changes out into smaller, mergeable chunks.

@the-wondersmith the-wondersmith deleted the policy-feat-grpcroute-status-support branch June 5, 2024 18:30