dial tcp: lookup oidc.eks.eu-central-1.amazonaws.com on 10.144.197.2:53: no such host #2992

Closed
1 task done
ahoehma opened this issue Mar 28, 2024 · 5 comments

@ahoehma
ahoehma commented Mar 28, 2024

Description

Please provide a clear and concise description of the issue you are encountering, and a reproduction of your configuration (see the examples/* directory for references that you can copy+paste and tailor to match your configs if you are unable to copy your exact configuration). The reproduction MUST be executable by running terraform init && terraform apply without any further changes.

If your request is for a new feature, please use the Feature request template.

  • ✋ I have searched the open/closed issues and my issue is not listed.

Versions

  • Module version [Required]: 20.8.4

  • Terraform version:
    1.4.4

  • Provider version(s):

required_providers {
  aws = {
    # https://registry.terraform.io/providers/hashicorp/aws/latest
    source  = "hashicorp/aws"
    version = "5.40.0"
  }
  kubernetes = {
    # https://registry.terraform.io/providers/hashicorp/kubernetes/latest
    source  = "hashicorp/kubernetes"
    version = "2.27.0"
  }
  helm = {
    # https://registry.terraform.io/providers/hashicorp/helm/latest
    source  = "hashicorp/helm"
    version = "2.12.1"
  }
  kubectl = {
    # https://registry.terraform.io/providers/alekc/kubectl/latest
    source  = "alekc/kubectl"
    version = "2.0.4"
  }
}

Reproduction Code [Required]

Plain tf code I guess ...

Steps to reproduce the behavior:

Running tf plan locally works fine but when I run the tf plan in a gitlab pipeline I got this:

Planning failed. Terraform encountered an error while generating this plan.

│ Error: Failed to identify fetch peer certificates

│ with module.euce1_apps_100794_eks_cluster_01.data.tls_certificate.this[0],
│ on .terraform/modules/euce1_apps_100794_eks_cluster_01/main.tf line 329, in data "tls_certificate" "this":
│ 329: data "tls_certificate" "this" {

│ failed to fetch certificates from URL 'https': Get
│ "https://oidc.eks.eu-central-1.amazonaws.com:443/id/xxxxxxxxxxxxxxxxx":
│ dial tcp: lookup oidc.eks.eu-central-1.amazonaws.com on 10.144.197.2:53: no
│ such host

@ForbiddenEra
Hi,

I also use GitLab but haven't seen this issue. That said, I typically don't run infra/EKS Terraform in pipelines, since it doesn't need to run often and I don't fully trust some things (at least in my config). I'm pretty familiar with GitLab and EKS, and with a fair number of the intricacies of GitLab pipelines.

Can you clarify whether you're using GitLab SaaS (e.g. GitLab.com), Dedicated, or self-hosted? And, more importantly, what type of runner(s) are you using: GitLab SaaS shared runners (the ones provided automatically/free on GitLab.com) or self-hosted? If self-hosted, which executor are you using (shell, docker, k8s, etc.)? I'd assume you're using GitLab.com with shared runners, since you haven't mentioned otherwise.

Either way; one thing is clear in your error message:
dial tcp: lookup oidc.eks.eu-central-1.amazonaws.com on 10.144.197.2:53: no such host

This means your runner is using a DNS server at 10.144.197.2 to look up the domain oidc.eks.eu-central-1.amazonaws.com; if you're using SaaS with shared runners, that would be one of GitLab's own internal DNS servers.

I'm able to look up this domain without issue using Cloudflare's DNS server (1.1.1.1) and Google's (8.8.8.8). This means the domain is publicly resolvable, and the issue is specifically that the DNS server at 10.144.197.2 (from your runner's perspective) was unable to perform the lookup. If you're using GitLab SaaS with shared runners, it's quite likely GitLab's responsibility to look into.
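To compare what the runner's resolver returns with what a laptop returns, a small check along these lines could be run in both places. This is only a minimal Python sketch: `resolve_ips` is a hypothetical helper name, and it uses whatever resolver the host is configured with (on the runner, presumably the 10.144.197.2 server from the error message).

```python
import socket


def resolve_ips(hostname, resolver=socket.getaddrinfo):
    """Return the sorted, de-duplicated IPs the resolver yields for hostname.

    An empty list means the lookup failed, which corresponds to the
    "no such host" case in the Terraform error.
    """
    try:
        results = resolver(hostname, 443, proto=socket.IPPROTO_TCP)
    except socket.gaierror:
        return []
    # Each result is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({entry[4][0] for entry in results})
```

Calling `resolve_ips("oidc.eks.eu-central-1.amazonaws.com")` inside a pipeline job and again locally should show whether the name fails only behind the runner's DNS server.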

If you're using your own runners, check your DNS servers to ensure they're working as expected and have proper egress access; 10.*.*.* addresses are of course internal-only IPs that are not routable on the internet.

GitLab does sometimes have temporary or sporadic issues, so it may be worth simply trying again. If it happens continuously and you're using SaaS with shared runners, a potential workaround would be to modify the /etc/resolv.conf of the image you use for your pipelines so that it uses a public resolver, such as the Cloudflare or Google ones mentioned above.

If you're using an off-the-shelf/public image in your runner, the user your job runs as likely won't have permission to modify that file, so you likely can't update it inside a job script. However, you could fairly easily write a Dockerfile that uses the same image as a base with FROM and adds a few lines of script to point that file at a different DNS server. It should work, but I can't say for certain that GitLab doesn't have network/security policies that would prevent it.

Either way, I'm confident this issue is absolutely not related to the Terraform AWS EKS module; hope this helps, best of luck!

@ahoehma

ahoehma commented Mar 28, 2024

@ForbiddenEra thanks for all the info. I'm running this on a company GitLab installation. We use our own gitlab-runner, which runs in our own AWS account. This account should have all the access to AWS it needs, I hope :) But I will take this feedback with me and ask the infra crew!

@cgoedicke
Hi there,
The problem is the EKS VPC endpoint with private DNS enabled; it has nothing to do with SaaS runners or anything else.
The private DNS captures all DNS queries to the zone rather than only the record "eks.eu-central-1.amazonaws.com", so it catches all queries for *.eks..amazonaws.com.

The solution would be to create a dedicated OIDC EKS endpoint inside the VPC, but that isn't an available feature yet if you have a private VPC. I've asked whether a feature request already exists and, if not, that one be opened.

Problem is described inside AWS docs:
https://docs.aws.amazon.com/eks/latest/userguide/enable-iam-roles-for-service-accounts.html

Workarounds can be:

  • set a local hosts entry on the runner, pointing to one of the actual IPs of the endpoint
  • put a runner inside a public subnet so it uses a public DNS instead of the private one
  • disable the private DNS feature for this endpoint
  • wait for the new endpoint....
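
For the first workaround, the hosts entries could be generated along these lines. This is a minimal Python sketch: `hosts_lines` is a hypothetical helper, and the 192.0.2.x addresses in the usage note are documentation placeholders, not real endpoint IPs. The endpoint's addresses can change, so a pinned entry may go stale.

```python
import ipaddress


def hosts_lines(hostname, ips):
    """Format /etc/hosts lines that map each validated IP address to hostname."""
    lines = []
    for ip in ips:
        # ip_address() raises ValueError on malformed input,
        # so a bad value never ends up in the hosts file.
        lines.append(f"{ipaddress.ip_address(ip)} {hostname}")
    return lines
```

For example, `hosts_lines("oidc.eks.eu-central-1.amazonaws.com", ["192.0.2.10"])` yields a single line that could be appended to /etc/hosts on the runner, with the placeholder replaced by an address resolved from a public DNS server.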

so agreed, issue can be closed.

This issue has been automatically marked as stale because it has been open 30 days
with no activity. Remove stale label or comment or this issue will be closed in 10 days

@github-actions github-actions bot added the stale label Apr 28, 2024
github-actions bot commented May 9, 2024

This issue was automatically closed because of stale in 10 days

@github-actions github-actions bot closed this as not planned May 9, 2024