
Issues with LB Creation #6172

Closed
nidomoko opened this issue May 9, 2024 · 5 comments · Fixed by #6250
Labels
kind/bug Categorizes issue or PR as related to a bug.


nidomoko commented May 9, 2024

What happened:

Prior to 1.28.X, LB creation succeeded as expected. After upgrading the cluster to 1.28.5 and subsequently to 1.29.4, every new LB Service fails to correctly add backend pools to the LB because the pools are missing their VNet/Subnet properties.

What you expected to happen:

The payload generated when creating LB resources in K8s should contain all properties needed for ANP.

How to reproduce it (as minimally and precisely as possible):

On 1.29.4, while using a backend pool configuration type of NodeIP and multiple standard load balancers, attempt to create an LB Service in K8s: the LB backend pool is created with empty values, and the LB is not properly updated because it is missing the VNet/Subnet properties.
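For reference, a Service targeting one of the dedicated load balancers would look roughly like this (a sketch only; the Service name, app selector, and port are illustrative, while the `dedicatedlb` label is what the `serviceLabelSelector` in the cloud provider config below matches on):

```json
{
  "apiVersion": "v1",
  "kind": "Service",
  "metadata": {
    "name": "debug-service",
    "labels": {
      "dedicatedlb": "k8s-sbox-mnx387-2-radios-nodes-0327"
    }
  },
  "spec": {
    "type": "LoadBalancer",
    "selector": {
      "app": "debug"
    },
    "ports": [
      {
        "port": 8080,
        "protocol": "TCP"
      }
    ]
  }
}
```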

Environment:

  • Kubernetes version (use kubectl version): 1.29.4

  • Cloud provider or hardware configuration:

```json
{
  "cloud": "AzurePublicCloud",
  "tenantId": "XXXXXX",
  "subscriptionId": "XXXXXX",
  "aadClientId": "XXXXXX",
  "aadClientSecret": "XXXXXX",
  "resourceGroup": "mnx387-2-sandbox",
  "securityGroupName": "k8s-default-nsg",
  "securityGroupResourceGroup": "mnx387-2-sandbox",
  "location": "eastus",
  "vmType": "vmss",
  "vnetName": "k8s-vnet",
  "vnetResourceGroup": "mnx387-2-sandbox",
  "subnetName": "k8s",
  "routeTableName": "routes",
  "loadBalancerSku": "Standard",
  "cloudProviderBackoff": true,
  "cloudProviderBackoffRetries": 6,
  "cloudProviderBackoffExponent": 1.5,
  "cloudProviderBackoffDuration": 5,
  "cloudProviderBackoffJitter": 1,
  "cloudProviderRatelimit": true,
  "cloudProviderRateLimitQPS": 3,
  "cloudProviderRateLimitBucket": 10,
  "maximumLoadBalancerRuleCount": 250,
  "useManagedIdentityExtension": true,
  "useInstanceMetadata": true,
  "disableOutboundSnat": true,
  "preConfiguredBackendPoolLoadBalancerTypes": "external",
  "primaryScaleSetName": "k8s-sbox-mnx387-2-nodes-0330",
  "loadBalancerBackendPoolConfigurationType": "nodeIP",
  "multipleStandardLoadBalancerConfigurations": [
    {
      "name": "kubernetes",
      "primaryVMSet": "k8s-sbox-mnx387-2-nodes-0330"
    },
    {
      "name": "k8s-sbox-mnx387-2-ingress2-nodes-0327",
      "primaryVMSet": "k8s-sbox-mnx387-2-ingress2-nodes-0327",
      "serviceLabelSelector": {
        "matchLabels": {
          "dedicatedlb": "k8s-sbox-mnx387-2-ingress2-nodes-0327"
        }
      }
    },
    {
      "name": "k8s-sbox-mnx387-2-mediabptt-nodes-0327",
      "primaryVMSet": "k8s-sbox-mnx387-2-mediabptt-nodes-0327",
      "serviceLabelSelector": {
        "matchLabels": {
          "dedicatedlb": "k8s-sbox-mnx387-2-mediabptt-nodes-0327"
        }
      }
    },
    {
      "name": "k8s-sbox-mnx387-2-radios-nodes-0327",
      "primaryVMSet": "k8s-sbox-mnx387-2-radios-nodes-0327",
      "serviceLabelSelector": {
        "matchLabels": {
          "dedicatedlb": "k8s-sbox-mnx387-2-radios-nodes-0327"
        }
      }
    }
  ]
}
```

    The following VMSSs and load balancers share the same names:
    k8s-sbox-mnx387-2-ingress2-nodes-0327
    k8s-sbox-mnx387-2-mediabptt-nodes-0327
    k8s-sbox-mnx387-2-radios-nodes-0327

  • Others:
    - Sanitized screenshot from Portal:
    [screenshot attached]

Example of a working config:
```json
{
  "etag": "W/\"XXXXXX\"",
  "id": "/subscriptions/XXXXXX/resourceGroups/mnx387-2-sandbox/providers/Microsoft.Network/loadBalancers/k8s-sbox-mnx387-2-mediabptt-nodes-0327/backendAddressPools/monitoring-cie-infra-debug-service-dedicated-lb-mediabptt",
  "loadBalancerBackendAddresses": [
    {
      "ipAddress": "XXXXXX",
      "name": "k8s-sbox-mnx387-2-mediabptt-nodes-0327000001"
    },
    {
      "ipAddress": "XXXXXX",
      "name": "k8s-sbox-mnx387-2-mediabptt-nodes-0327000000"
    }
  ],
  "loadBalancingRules": [
    {
      "id": "/subscriptions/XXXXXX/resourceGroups/mnx387-2-sandbox/providers/Microsoft.Network/loadBalancers/k8s-sbox-mnx387-2-mediabptt-nodes-0327/loadBalancingRules/afc102b4bf54c4ea2b1e885eecdf12df-TCP-8080",
      "resourceGroup": "mnx387-2-sandbox"
    }
  ],
  "name": "monitoring-cie-infra-debug-service-dedicated-lb-mediabptt",
  "provisioningState": "Succeeded",
  "resourceGroup": "mnx387-2-sandbox",
  "type": "Microsoft.Network/loadBalancers/backendAddressPools",
  "virtualNetwork": {
    "id": "/subscriptions/XXXXXX/resourceGroups/mnx387-2-sandbox/providers/Microsoft.Network/virtualNetworks/k8s-vnet",
    "resourceGroup": "mnx387-2-sandbox"
  }
}
```

Example of a new Service of type LoadBalancer that fails; note it is missing the VNet config for some reason:

```json
{
  "etag": "W/\"XXXXXX\"",
  "id": "/subscriptions/XXXXXX/resourceGroups/mnx387-2-sandbox/providers/Microsoft.Network/loadBalancers/k8s-sbox-mnx387-2-radios-nodes-0327/backendAddressPools/monitoring-cie-infra-debug-service-dedicated-lb-radios",
  "loadBalancerBackendAddresses": [
    {
      "ipAddress": "XXXXXX",
      "name": "k8s-sbox-mnx387-2-radios-nodes-0327000000"
    },
    {
      "ipAddress": "XXXXXX",
      "name": "k8s-sbox-mnx387-2-radios-nodes-0327000001"
    }
  ],
  "loadBalancingRules": [
    {
      "id": "/subscriptions/XXXXXX/resourceGroups/mnx387-2-sandbox/providers/Microsoft.Network/loadBalancers/k8s-sbox-mnx387-2-radios-nodes-0327/loadBalancingRules/abf42a3fc91b748a0b91139c490cc4fe-TCP-8080",
      "resourceGroup": "mnx387-2-sandbox"
    }
  ],
  "name": "monitoring-cie-infra-debug-service-dedicated-lb-radios",
  "provisioningState": "Succeeded",
  "resourceGroup": "mnx387-2-sandbox",
  "type": "Microsoft.Network/loadBalancers/backendAddressPools"
}
```

@nidomoko nidomoko added the kind/bug Categorizes issue or PR as related to a bug. label May 9, 2024
nilo19 commented May 10, 2024

@nidomoko you set "preConfiguredBackendPoolLoadBalancerTypes" which means we don't proactively configure the backend pool for you. Can you remove this and retry?
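Concretely, if I read the suggestion right, the line to delete from the cloud provider config is this one (a sketch; the rest of the config stays as posted above):

```json
"preConfiguredBackendPoolLoadBalancerTypes": "external"
```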

nidomoko (Author) replied:

> @nidomoko you set "preConfiguredBackendPoolLoadBalancerTypes" which means we don't proactively configure the backend pool for you. Can you remove this and retry?

I tested the k8s upgrade process again in a sandbox cluster going from k8s v1.27.11 to k8s v1.28.9 and then finally to k8s v1.29.4. This time I removed the configuration you requested from the cloud controller manager (CCM) configuration.

Unfortunately, I observed the exact same behavior as before.

CCM configuration:

```json
{
  "cloud": "AzurePublicCloud",
  "tenantId": "xxxxx",
  "subscriptionId": "xxxxx",
  "aadClientId": "msi",
  "aadClientSecret": "msi",
  "resourceGroup": "mnx387-2-sandbox",
  "securityGroupName": "k8s-default-nsg",
  "securityGroupResourceGroup": "mnx387-2-sandbox",
  "location": "eastus",
  "vmType": "vmss",
  "vnetName": "k8s-vnet",
  "vnetResourceGroup": "mnx387-2-sandbox",
  "subnetName": "k8s",
  "routeTableName": "routes",
  "loadBalancerSku": "Standard",
  "cloudProviderBackoff": true,
  "cloudProviderBackoffRetries": 6,
  "cloudProviderBackoffExponent": 1.5,
  "cloudProviderBackoffDuration": 5,
  "cloudProviderBackoffJitter": 1,
  "cloudProviderRatelimit": true,
  "cloudProviderRateLimitQPS": 3,
  "cloudProviderRateLimitBucket": 10,
  "maximumLoadBalancerRuleCount": 250,
  "useManagedIdentityExtension": true,
  "useInstanceMetadata": true,
  "disableOutboundSnat": true,
  "enableMigrateToIPBasedBackendPoolAPI": true,
  "primaryScaleSetName": "k8s-sbox-mnx387-2-nodes-1842",
  "loadBalancerBackendPoolConfigurationType": "nodeIP",
  "multipleStandardLoadBalancerConfigurations": [
    {
      "name": "kubernetes",
      "primaryVMSet": "k8s-sbox-mnx387-2-nodes-1842"
    },
    {
      "name": "k8s-sbox-mnx387-2-ingress2-nodes-1842",
      "primaryVMSet": "k8s-sbox-mnx387-2-ingress2-nodes-1842",
      "serviceLabelSelector": {
        "matchLabels": {
          "dedicatedlb": "k8s-sbox-mnx387-2-ingress2-nodes-1842"
        }
      }
    },
    {
      "name": "k8s-sbox-mnx387-2-mediabptt-nodes-1842",
      "primaryVMSet": "k8s-sbox-mnx387-2-mediabptt-nodes-1842",
      "serviceLabelSelector": {
        "matchLabels": {
          "dedicatedlb": "k8s-sbox-mnx387-2-mediabptt-nodes-1842"
        }
      }
    },
    {
      "name": "k8s-sbox-mnx387-2-radios-nodes-1844",
      "primaryVMSet": "k8s-sbox-mnx387-2-radios-nodes-1844",
      "serviceLabelSelector": {
        "matchLabels": {
          "dedicatedlb": "k8s-sbox-mnx387-2-radios-nodes-1844"
        }
      }
    }
  ]
}
```

Same result as before: the existing Service of type LoadBalancer (created when the cluster was at k8s v1.27.11) was properly configured, but creating a new Service of type LoadBalancer after the upgrade to k8s v1.29.4 failed in the same way.

[screenshots attached]

nilo19 commented May 15, 2024

@nidomoko Can you share more information? Thanks in advance.

  1. the cloud-controller-manager logs from both 1.27.11 and 1.29.4
  2. the Service manifests (one that was created successfully and one that failed)

nidomoko (Author) replied:

ccm-v1.27.11.log
config-v1.27.11.txt
config-v1.29.4.txt
load_balancer_activity_log

> @nidomoko Can you share more information? Thanks in advance.
>
>   1. the cloud-controller-manager log of both 1.27.11 and 1.29.4
>   2. the service manifest (created successfully and fail)

Attached requested files.

Following the upgrade from k8s 1.27.11, I created two new services of type load balancer:
cie-infra-debug-service-shared-lb-worker-1 (this new service on the shared load balancer 'kubernetes' worked)
cie-infra-debug-service-dedicated-lb-radios-2 (this new service on the dedicated load balancer 'k8s-sbox-mnx387-2-radios-nodes-2155' failed)

The following Services of type LoadBalancer were created while the cluster was at k8s v1.27.11 (all of them continued to work after the upgrade to k8s v1.29.4):
cie-infra-debug-service
cie-infra-debug-service-dedicated-lb-mediabptt
cie-infra-debug-service-dedicated-lb-radios
cie-infra-debug-service-shared-lb-elastic
nginx-ingress
nginx-ingress-udp

Attached is the error message from the activity log for load balancer 'k8s-sbox-mnx387-2-radios-nodes-2155'.

nidomoko (Author) added:

ccm-v1.29.4_Redacted.txt
Also uploading the CCM v1.29.4 logs.
