
Update aws-node from 1.12.6 to 1.18.1 #7756

Merged: 3 commits merged into eksctl-io:main on May 16, 2024

Conversation

@consideRatio (Contributor) commented May 15, 2024

1.18.1 is recommended for EKS clusters; it's documented in aws/amazon-vpc-cni-k8s that "For all Kubernetes releases, we recommend installing the latest VPC CNI release."

The latest available addon versions for the various k8s minor versions are listed here, and the list currently says 1.18.1 for k8s 1.23 to 1.29.

This PR mimics what was done in #6692, which previously updated this version from 1.11 to 1.12.
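As a quick check before and after the bump, the deployed version can be read off the aws-node DaemonSet image tag with something like the following (a sketch, assuming the first container is the CNI image as in the default manifest):

kubectl -n kube-system get daemonset aws-node \
  -o jsonpath='{.spec.template.spec.containers[0].image}'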

NetworkPolicy enforcement (not enabled by default, and doesn't work if enabled)

A noteworthy change introduced with this newer version of aws-node / amazon-vpc-cni is that NetworkPolicy resources are meant to become enforced. As I understand it, this was introduced in v1.14.0 according to the changelog.

I tested network policy enforcement, and it didn't get enabled by default by the change in this PR. In a blog post we see that they also configure the addon like:

addons:
  - name: vpc-cni
    version: 1.14.0
    attachPolicyARNs: #optional
    - arn:aws:iam::aws:policy/AmazonEKS_CNI_Policy 
    configurationValues: |-
      enableNetworkPolicy: "true"

I don't understand why there is an aws-node.yaml in this project with a hardcoded version, or why it's separate from using an EKS addon - but in this case, since eksctl utils update-aws-node doesn't relate to the listed addons, I'm not sure how we would do the equivalent configuration.
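For completeness: if the cluster were running the EKS managed vpc-cni add-on rather than this self-managed manifest, the equivalent configuration would presumably go through the EKS API, roughly like the untested sketch below (my-cluster is a placeholder, and the managed add-on has to be installed for update-addon to apply):

# List the managed add-on versions available for a given k8s minor version
# (the exact eksbuild suffix varies, so check rather than assume).
aws eks describe-addon-versions --addon-name vpc-cni --kubernetes-version 1.29 \
  --query 'addons[].addonVersions[].addonVersion'

# Untested: set the configuration value on the managed add-on.
aws eks update-addon \
  --cluster-name my-cluster \
  --addon-name vpc-cni \
  --configuration-values '{"enableNetworkPolicy": "true"}'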

A container in the aws-node daemonset has the following args, so I figure we would then toggle --enable-network-policy from =false to =true.

      - args:
        - --enable-ipv6=false
        - --enable-network-policy=false
        - --enable-cloudwatch-logs=false
        - --enable-policy-event-logs=false
        - --metrics-bind-addr=:8162
        - --health-probe-bind-addr=:8163
        - --conntrack-cache-cleanup-period=300

I tested changing this, but network policy still wasn't enforced.
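For reference, the toggle itself can be done with a patch along these lines - a sketch, where the container index (1) and arg index (1) are assumptions based on the args listed above, so verify against the live DaemonSet first:

kubectl -n kube-system get daemonset aws-node -o yaml   # confirm container/arg positions
kubectl -n kube-system patch daemonset aws-node --type=json \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/1/args/1", "value": "--enable-network-policy=true"}]'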

eksctl utils update-aws-node tested

I tested building an eksctl binary on top of this commit, and I used it to successfully update from 1.12.6 to 1.18.1 in a cluster. The aws-node pod entered a running state successfully and the cluster's workloads still behaved as expected.
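For reference, reproducing that would look roughly like the sketch below (the pull/7756/head fetch is the generic GitHub pattern for checking out a PR, and make build is assumed to be the repo's usual build target producing an ./eksctl binary); the commands further down were then run with the resulting binary:

git clone https://github.com/eksctl-io/eksctl.git && cd eksctl
git fetch origin pull/7756/head && git checkout FETCH_HEAD   # this PR's commits
make build   # assumed to drop an ./eksctl binary in the repo root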

$ eksctl utils update-aws-node --config-file=$CLUSTER_NAME.eksctl.yaml -v 4

2024-05-15 16:23:27 [▶]  Setting credentials expiry window to 30 minutes
2024-05-15 16:23:29 [▶]  role ARN for the current session is "arn:aws:sts::redacted:assumed-role/AWSReservedSSO_AdministratorAccess_redacted/redacted"
2024-05-15 16:23:30 [▶]  cluster = &types.Cluster{AccessConfig:(*types.AccessConfigResponse)(0xc000f6a940), Arn:(*string)(0xc000c81590), CertificateAuthority:(*types.Certificate)(0xc000c81550), ClientRequestToken:(*string)(nil), ConnectorConfig:(*types.ConnectorConfigResponse)(nil), CreatedAt:time.Date(2024, time.March, 25, 15, 53, 23, 337000000, time.UTC), EncryptionConfig:[]types.EncryptionConfig(nil), Endpoint:(*string)(0xc000c814f0), Health:(*types.ClusterHealth)(0xc000f6a960), Id:(*string)(nil), Identity:(*types.Identity)(0xc000c815a0), KubernetesNetworkConfig:(*types.KubernetesNetworkConfigResponse)(0xc000f7c180), Logging:(*types.Logging)(0xc000f6a900), Name:(*string)(0xc000c81570), OutpostConfig:(*types.OutpostConfigResponse)(nil), PlatformVersion:(*string)(0xc000c815d0), ResourcesVpcConfig:(*types.VpcConfigResponse)(0xc0005173b0), RoleArn:(*string)(0xc000c815e0), Status:"ACTIVE", Tags:map[string]string{"Name":"eksctl-bican-cluster/ControlPlane", "alpha.eksctl.io/cluster-name":"bican", "alpha.eksctl.io/cluster-oidc-enabled":"true", "alpha.eksctl.io/eksctl-version":"0.174.0-dev+3c1a5c4c2.2024-03-15T18:46:40Z", "aws:cloudformation:logical-id":"ControlPlane", "aws:cloudformation:stack-id":"arn:aws:cloudformation:us-east-2:redacted:stack/eksctl-bican-cluster/bff93ee0-eabf-11ee-8dcb-0a467cbfbedd", "aws:cloudformation:stack-name":"eksctl-bican-cluster", "eksctl.cluster.k8s.io/v1alpha1/cluster-name":"bican"}, Version:(*string)(0xc000c814e0), noSmithyDocumentSerde:document.NoSerde{}}
2024-05-15 16:23:31 [ℹ]  (plan) would have replaced "CustomResourceDefinition.apiextensions.k8s.io/eniconfigs.crd.k8s.amazonaws.com"
2024-05-15 16:23:31 [ℹ]  (plan) would have replaced "CustomResourceDefinition.apiextensions.k8s.io/policyendpoints.networking.k8s.aws"
2024-05-15 16:23:32 [ℹ]  (plan) would have skipped existing "kube-system:ServiceAccount/aws-node"
2024-05-15 16:23:32 [ℹ]  (plan) would have replaced "kube-system:ConfigMap/amazon-vpc-cni"
2024-05-15 16:23:32 [ℹ]  (plan) would have replaced "ClusterRole.rbac.authorization.k8s.io/aws-node"
2024-05-15 16:23:32 [ℹ]  (plan) would have replaced "ClusterRoleBinding.rbac.authorization.k8s.io/aws-node"
2024-05-15 16:23:32 [ℹ]  (plan) would have replaced "kube-system:DaemonSet.apps/aws-node"
2024-05-15 16:23:32 [✖]  (plan) "aws-node" is not up-to-date
2024-05-15 16:23:32 [!]  no changes were applied, run again with '--approve' to apply the changes

$ eksctl utils update-aws-node --config-file=$CLUSTER_NAME.eksctl.yaml -v 4 --approve

2024-05-15 16:23:40 [▶]  Setting credentials expiry window to 30 minutes
2024-05-15 16:23:42 [▶]  role ARN for the current session is "arn:aws:sts::redacted:assumed-role/AWSReservedSSO_AdministratorAccess_redacted/redacted"
2024-05-15 16:23:42 [▶]  cluster = &types.Cluster{AccessConfig:(*types.AccessConfigResponse)(0xc0000cd320), Arn:(*string)(0xc000d8cd10), CertificateAuthority:(*types.Certificate)(0xc000d8cd80), ClientRequestToken:(*string)(nil), ConnectorConfig:(*types.ConnectorConfigResponse)(nil), CreatedAt:time.Date(2024, time.March, 25, 15, 53, 23, 337000000, time.UTC), EncryptionConfig:[]types.EncryptionConfig(nil), Endpoint:(*string)(0xc000d8cd20), Health:(*types.ClusterHealth)(0xc0000cd340), Id:(*string)(nil), Identity:(*types.Identity)(0xc000d8cd30), KubernetesNetworkConfig:(*types.KubernetesNetworkConfigResponse)(0xc000ddc5a0), Logging:(*types.Logging)(0xc0000cd260), Name:(*string)(0xc000d8cca0), OutpostConfig:(*types.OutpostConfigResponse)(nil), PlatformVersion:(*string)(0xc000d8cc90), ResourcesVpcConfig:(*types.VpcConfigResponse)(0xc000037030), RoleArn:(*string)(0xc000d8ccb0), Status:"ACTIVE", Tags:map[string]string{"Name":"eksctl-bican-cluster/ControlPlane", "alpha.eksctl.io/cluster-name":"bican", "alpha.eksctl.io/cluster-oidc-enabled":"true", "alpha.eksctl.io/eksctl-version":"0.174.0-dev+3c1a5c4c2.2024-03-15T18:46:40Z", "aws:cloudformation:logical-id":"ControlPlane", "aws:cloudformation:stack-id":"arn:aws:cloudformation:us-east-2:redacted:stack/eksctl-bican-cluster/bff93ee0-eabf-11ee-8dcb-0a467cbfbedd", "aws:cloudformation:stack-name":"eksctl-bican-cluster", "eksctl.cluster.k8s.io/v1alpha1/cluster-name":"bican"}, Version:(*string)(0xc000d8cd60), noSmithyDocumentSerde:document.NoSerde{}}
2024-05-15 16:23:44 [ℹ]  replaced "CustomResourceDefinition.apiextensions.k8s.io/eniconfigs.crd.k8s.amazonaws.com"
2024-05-15 16:23:44 [ℹ]  replaced "CustomResourceDefinition.apiextensions.k8s.io/policyendpoints.networking.k8s.aws"
2024-05-15 16:23:44 [ℹ]  skipped existing "kube-system:ServiceAccount/aws-node"
2024-05-15 16:23:45 [ℹ]  replaced "kube-system:ConfigMap/amazon-vpc-cni"
2024-05-15 16:23:46 [ℹ]  replaced "ClusterRole.rbac.authorization.k8s.io/aws-node"
2024-05-15 16:23:46 [ℹ]  replaced "ClusterRoleBinding.rbac.authorization.k8s.io/aws-node"
2024-05-15 16:23:47 [ℹ]  replaced "kube-system:DaemonSet.apps/aws-node"
2024-05-15 16:23:47 [ℹ]  "aws-node" is now up-to-date

1.18.1 is recommended for EKS clusters, where it's documented that "For
all Kubernetes releases, we recommend installing the latest VPC CNI
release." as read at https://github.com/aws/amazon-vpc-cni-k8s?tab=readme-ov-file#recommended-version.

The latest available addon versions for the various k8s minor versions are listed at
https://docs.aws.amazon.com/eks/latest/userguide/managing-vpc-cni.html#updating-vpc-cni-add-on,
and the list currently says 1.18.1 for k8s 1.23 to 1.29.
@github-actions bot left a comment:

Hello consideRatio 👋 Thank you for opening a Pull Request in eksctl project. The team will review the Pull Request and aim to respond within 1-10 business days. Meanwhile, please read about the Contribution and Code of Conduct guidelines here. You can find out more information about eksctl on our website

@consideRatio (Contributor, Author)

A key confusion point is why aws-node.yaml is hardcoded inside the eksctl codebase, and why we may want to use that over having the addons list include the addon named vpc-cni, etc.

Maybe the way forward isn't to bump this, but to help users transition to using addons as declared in the eksctl config file?

@TiberiuGC (Collaborator)

Ideally, we'd have our GitHub Actions workflow open a PR with the latest version of aws-node weekly. Unfortunately, that's not working as intended.

I think the reason why curl --silent --location https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/v1.18.1/config/master/aws-k8s-cni.yaml?raw=1 --output assets/aws-node.yaml is being used over the EKS API is that the latest version of the self-managed add-on may not coincide with the latest version of the EKS managed addon. Indeed, this hardcoded URL is not being updated as part of the aforementioned workflow, which seems to be the problem.
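In other words, the refresh step that workflow should be performing boils down to something like the following - just the command above with the version parameterized (the version value here is an example):

VERSION=v1.18.1
curl --silent --location \
  "https://raw.githubusercontent.com/aws/amazon-vpc-cni-k8s/${VERSION}/config/master/aws-k8s-cni.yaml?raw=1" \
  --output assets/aws-node.yaml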

@consideRatio (Contributor, Author) commented May 16, 2024

I pushed an update meant to fix the unit tests -- we now have two containers instead of just one in the aws-node pod. The added container relates to NetworkPolicy enforcement - which is disabled by default in the daemonset, and doesn't seem to become functional even after enabling it.
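A quick way to see both containers and their images (a sketch):

kubectl -n kube-system get daemonset aws-node \
  -o jsonpath='{range .spec.template.spec.containers[*]}{.name}{"\t"}{.image}{"\n"}{end}'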

It would be great if NetworkPolicy enforcement could be configured and enabled successfully, but that could of course be considered separate from this PR. I lack the eksctl knowledge to drive this work =/

Expect(awsNode.Spec.Template.Spec.Containers[0].Image).To(
	Equal("602401143452.dkr.ecr.us-east-1.amazonaws.com/amazon-k8s-cni:v1.18.1"),
)
Expect(awsNode.Spec.Template.Spec.Containers[1].Image).To(
Collaborator review comment on the lines above:

I'm working on a script that automatically updates the unit test as well, as part of the workflow. Let's remove this assertion; the one above should be enough for testing purposes.

@TiberiuGC merged commit e7b9846 into eksctl-io:main May 16, 2024
9 checks passed
@consideRatio (Contributor, Author)

Thank you @TiberiuGC!

For reference about the network policy stuff, the steps to enable it are listed here under "self-managed add-on", I think - I've followed them but haven't seen a result so far, though.

IdanShohamNetApp pushed a commit to spotinst/weaveworks-eksctl that referenced this pull request Jun 2, 2024
* Update aws-node from 1.12.6 to 1.18.1


* Update tests for aws-node 1.18.1

* Reduce complexity of aws-node test

Successfully merging this pull request may close these issues.

[Bug] eksctl utils update-aws-node downgraded aws-node significantly instead of upgrading it