
Pulumi thinks that K8S deployments exist when they don't, and refresh doesn't change it either #8281

Open
alex-hunt-materialize opened this issue Oct 22, 2021 · 5 comments
Labels
kind/bug Some behavior is incorrect or out of spec

Comments

@alex-hunt-materialize

alex-hunt-materialize commented Oct 22, 2021

Hello!

  • Vote on this issue by adding a 👍 reaction
  • To contribute a fix for this issue, leave a comment (and link to your pull request, if you've opened one already)

Issue details

Pulumi seems confused about the state of two K8S deployment resources. It thinks they exist, but they do not. Running pulumi refresh does not update the state of these resources (pulumi still thinks they exist when they clearly do not).

Pulumi version v3.16.0

pulumi up fails to update the resources, since they don't exist.

     Type                                  Name                                     Status               
     pulumi:pulumi:Stack                   mzcloud-alexhunt                         **failed**           
     ├─ kubernetes:batch/v1:Job            backend-migrations                                        
     ├─ kubernetes:core/v1:ConfigMap       mzcloud-alexhunt                                          
 ~   ├─ frontegg:index:Webhook             webhook                                  updated          
     ├─ eks:index:Cluster                  mzcloud-alexhunt-system                                   
     │  └─ aws:eks:Cluster                 mzcloud-alexhunt-system-eksCluster                      
 ~   ├─ kubernetes:core/v1:Service         mzcloud-alexhunt                         **updating failed
     ├─ eks:index:Cluster                  mzcloud-alexhunt-us-east-1                                
     │  └─ aws:eks:Cluster                 mzcloud-alexhunt-us-east-1-eksCluster                   
 ~   ├─ kubernetes:apps/v1:Deployment      deployment-controller                    **updating failed
 ~   ├─ kubernetes:apps/v1:Deployment      webapp                                   **updating failed
 ~   ├─ kubernetes:core/v1:ServiceAccount  eip-operator-mzcloud-alexhunt-us-east-1  updated          
 ~   └─ kubernetes:apps/v1:Deployment      eip-operator-mzcloud-alexhunt-us-east-1  updated          
 
Diagnostics:
  kubernetes:apps/v1:Deployment (deployment-controller):
    error: update of resource mz-system/deployment-controller failed because the Kubernetes API server reported that it failed to fully initialize or become live: deployments.apps "deployment-controller" not found
 
  pulumi:pulumi:Stack (mzcloud-alexhunt):
    Forwarding from 127.0.0.1:32123 -> 5432
    Forwarding from [::1]:32123 -> 5432
 
    error: update failed
 
  kubernetes:core/v1:Service (mzcloud-alexhunt):
    error: 2 errors occurred:
    	* the Kubernetes API server reported that "mz-system/mzcloud-alexhunt-s075offl" failed to fully initialize or become live: 'mzcloud-alexhunt-s075offl' timed out waiting to be Ready
    	* Service does not target any Pods. Selected Pods may not be ready, or field '.spec.selector' may not match labels on any Pods
 
  kubernetes:apps/v1:Deployment (webapp):
    error: update of resource mz-system/webapp failed because the Kubernetes API server reported that it failed to fully initialize or become live: deployments.apps "webapp" not found

(The behavior is the same with a full refresh, but the logs are too big to paste here.)

❯ pulumi refresh --skip-preview --target urn:pulumi:alexhunt::mzcloud::kubernetes:apps/v1:Deployment::deployment-controller --target urn:pulumi:alexhunt::mzcloud::kubernetes:apps/v1:Deployment::webapp
Refreshing (materialize/alexhunt)

View Live: https://app.pulumi.com/materialize/mzcloud/alexhunt/updates/19

     Type                              Name                   Status     
     pulumi:pulumi:Stack               mzcloud-alexhunt                  
     ├─ kubernetes:apps/v1:Deployment  webapp                            
     └─ kubernetes:apps/v1:Deployment  deployment-controller             
 
Resources:
    2 unchanged

Duration: 1s
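To double-check what Pulumi believes exists, the URNs recorded in the state can be listed directly (a sketch; jq is assumed to be installed, and pulumi stack --show-urns works as well):

    pulumi stack export | jq -r '.deployment.resources[].urn' | grep Deployment

The output can then be compared against kubectl get deployments -n mz-system.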

Steps to reproduce

I don't know if I will be able to reproduce it.

  1. I was testing a change to remove some aliases that had already been applied, and the pulumi up failed because of a mistake in the code that renamed the EKS cluster. The rename would force a delete-and-replace of the cluster, so Pulumi correctly aborted the run without modifying the EKS cluster.
  2. Switch back to the older code, which still has the aliases and the original name of the EKS cluster.
  3. Try many combinations of pulumi up and pulumi refresh, with the result shown above.

Expected: Pulumi shouldn't think these deployments exist when they don't. Even if it did get confused, pulumi refresh should remove them from the state.
Actual: Pulumi stores non-existent resources in the state, and pulumi refresh doesn't remove them from it.
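One possible manual workaround (a sketch only, assuming the URNs from the targeted refresh above are correct; untested against this stack) is to drop the stale entries from the state and let the next pulumi up recreate them:

    pulumi state delete 'urn:pulumi:alexhunt::mzcloud::kubernetes:apps/v1:Deployment::deployment-controller'
    pulumi state delete 'urn:pulumi:alexhunt::mzcloud::kubernetes:apps/v1:Deployment::webapp'

Single-quoting the URNs keeps the shell from mangling any special characters in them.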

@alex-hunt-materialize alex-hunt-materialize added the kind/bug Some behavior is incorrect or out of spec label Oct 22, 2021
@viveklak
Contributor

Could you provide the output of the full refresh? The update failure is expected in this situation.
Could you try running the refresh with detailed logging? Something like the following should work:
pulumi -d --logflow --logtostderr -v=9 refresh 2>&1 | tee /tmp/logs

If you can share the logs here or privately, it would really help.
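If the full log is too large to attach, grepping it for the affected resources is a reasonable first pass (a sketch, reusing the /tmp/logs path from the command above):

    grep -nE 'deployment-controller|webapp' /tmp/logs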

@alex-hunt-materialize
Author

I unfortunately cannot provide that, since the stack is no longer in that state, and I did not save the logs.

@supergillis

Hi @viveklak, I'm running into the same issue. How can I share the logs with you privately?

@Hugzy

Hugzy commented May 16, 2024

I'm running into this issue using the Automation API. I have a stack where Pulumi thinks a Kubernetes job exists and tries to refresh it, but the refresh fails. Is there a solution to this?

This is the only output it provides:

  kubernetes:batch/v1:Job (digizuitecore-db-upgrade-administrationservice):
    error: jobs.batch "digizuitecore-db-upgrade-administrationservice" not found

I tried to delete the resource directly from the Pulumi state using pulumi state delete, but Pulumi just tells me that it does not exist:

error: No such resource "urn:pulumi:migration-jobs:batch/v1:Job::digizuitecore-db-upgrade-administrationservice" exists in the current state

However, I can clearly see it in the Pulumi UI.
(screenshot of the resource in the Pulumi console)

@Frassle
Member

Frassle commented May 16, 2024

I tried to delete the resource directly from the Pulumi state using pulumi state delete, but Pulumi just tells me that it does not exist.

I'd bet your original URN had a dollar sign in it and it got escaped by your shell. Use single quotes when running state delete so that shell expansion doesn't eat half your URN.
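For example (a sketch with placeholder stack, project, and parent-type segments, since the real URN was already partly eaten by the shell):

    # Unquoted, $kubernetes would be expanded as an (empty) shell variable and mangle the URN.
    pulumi state delete 'urn:pulumi:dev::my-project::kubernetes:yaml:ConfigFile$kubernetes:batch/v1:Job::digizuitecore-db-upgrade-administrationservice'

The exact URN can be copied from pulumi stack --show-urns or from pulumi stack export.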
