Terminating workflow with running exit-handler leaves stepGroup of the exit-handler in Running phase #13052

Open
3 of 4 tasks
skarna987 opened this issue May 14, 2024 · 4 comments · May be fixed by #13120
@skarna987

skarna987 commented May 14, 2024

Pre-requisites

  • I have double-checked my configuration
  • I have tested with the :latest image tag (i.e. quay.io/argoproj/workflow-controller:latest) and can confirm the issue still exists on :latest. If not, I have explained why, in detail, in my description below.
  • I have searched existing issues and could not find a match for this bug
  • I'd like to contribute the fix myself (see contributing guide)

What happened/what did you expect to happen?

I created a workflow with an exit handler. While the exit handler was running, I terminated the workflow with the command argo terminate -n <namespace> <workflow>. I expected every exit-handler node to be in the "Failed" phase after termination. However, the nodes of type "StepGroup" and "Steps" were still in the "Running" phase.

"test-workflow-kqxvv-113404891": {
        "boundaryID": "test-workflow-kqxvv-3314254127",
        "children": [
          "test-workflow-kqxvv-1105926477"
        ],
        "displayName": "[0]",
        "finishedAt": null,
        "id": "test-workflow-kqxvv-113404891",
        "name": "test-workflow-kqxvv.onExit[0]",
        "nodeFlag": {},
        "phase": "Running",
        "progress": "0/1",
        "startedAt": "2024-05-14T13:04:43Z",
        "templateScope": "local/test-workflow-kqxvv",
        "type": "StepGroup"
      },
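
Nodes like the one above can be found by scanning status.nodes for entries whose phase is not terminal. The following is a minimal sketch, assuming the workflow object has been exported as JSON (for example with argo get <workflow> -o json). The helper name stuck_nodes, the set of phases treated as terminal, and the trimmed sample data are illustrative assumptions, not part of Argo.

```python
import json

# Phases assumed here to mean "finished" for an Argo Workflows node.
TERMINAL_PHASES = {"Succeeded", "Failed", "Error", "Skipped", "Omitted"}


def stuck_nodes(workflow: dict) -> list[tuple[str, str, str]]:
    """Return (id, type, phase) for every node that is not in a terminal phase."""
    nodes = workflow.get("status", {}).get("nodes", {})
    return sorted(
        (node_id, node.get("type", ""), node.get("phase", ""))
        for node_id, node in nodes.items()
        if node.get("phase") not in TERMINAL_PHASES
    )


# Trimmed-down sample modelled on the node shown above (hypothetical input).
sample = json.loads("""
{
  "status": {
    "phase": "Succeeded",
    "nodes": {
      "test-workflow-kqxvv-113404891": {"type": "StepGroup", "phase": "Running"},
      "test-workflow-kqxvv-1105926477": {"type": "Pod", "phase": "Failed"}
    }
  }
}
""")

for node_id, node_type, phase in stuck_nodes(sample):
    print(f"{node_id}\t{node_type}\t{phase}")
```

Running this against a workflow that has already completed should print nothing; any line it prints is a node left behind in a non-terminal phase.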

Version

v3.5.6

Paste a small workflow that reproduces the issue. We must be able to run the workflow; don't enter a workflow that uses private images.

apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: test-workflow-
  annotations:
    description: 'Test workflow'
spec:
  podGC:
    strategy: OnPodCompletion
  entrypoint: workflow-startup
  onExit: exit-handler

  templates:
  - name: workflow-startup
    dag:
      tasks:
      - name: say-hello
        template: echo-template
        arguments:
          parameters:
            - name: message
              value: Hi!


  - name: sleep
    inputs:
      parameters:
        - name: time-value
    container:
      image: alpine:3.19.1
      command: [sh, -c]
      args: ["sleep {{inputs.parameters.time-value}}"]

  - name: echo-template
    inputs:
      parameters:
      - name: message
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        print("{{inputs.parameters.message}}")


  - name: exit-handler
    steps:
      - - arguments:
            parameters:
              - name: time-value
                value: "600"
          name: exit-handleri
          template: sleep

Logs from the workflow controller

time="2024-05-14T13:04:34.774Z" level=info msg="Processing workflow" Phase= ResourceVersion=10124262 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.779Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=0 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.779Z" level=info msg="Updated phase  -> Running" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.779Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.780Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.780Z" level=info msg="DAG node test-workflow-kqxvv initialized Running" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.780Z" level=warning msg="was unable to obtain the node for test-workflow-kqxvv-2633604284, taskName say-hello"
time="2024-05-14T13:04:34.780Z" level=warning msg="was unable to obtain the node for test-workflow-kqxvv-2633604284, taskName say-hello"
time="2024-05-14T13:04:34.780Z" level=info msg="All of node test-workflow-kqxvv.say-hello dependencies [] completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.780Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.780Z" level=info msg="Pod node test-workflow-kqxvv-2633604284 initialized Pending" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.938Z" level=info msg="Created pod: test-workflow-kqxvv.say-hello (test-workflow-kqxvv-echo-template-2633604284)" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.939Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.939Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:34.955Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124267 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:36.941Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124267 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:36.941Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=0 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:36.941Z" level=info msg="node changed" namespace=neat-workflows new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=test-workflow-kqxvv-2633604284 old.message= old.phase=Pending old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:36.942Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:36.942Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:36.954Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124279 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.203Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124279 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.204Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.204Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.204Z" level=info msg="node changed" namespace=neat-workflows new.message= new.phase=Running new.progress=0/1 nodeID=test-workflow-kqxvv-2633604284 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.204Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.204Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:39.215Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124295 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.209Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124295 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.210Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.210Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.210Z" level=info msg="node changed" namespace=neat-workflows new.message= new.phase=Running new.progress=0/1 nodeID=test-workflow-kqxvv-2633604284 old.message= old.phase=Running old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.211Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.211Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:41.216Z" level=info msg="cleaning up pod" action=terminateContainers key=neat-workflows/test-workflow-kqxvv-echo-template-2633604284/terminateContainers
time="2024-05-14T13:04:41.221Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124305 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.355Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124305 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.355Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.355Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="node changed" namespace=neat-workflows new.message= new.phase=Succeeded new.progress=0/1 nodeID=test-workflow-kqxvv-2633604284 old.message= old.phase=Running old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="Outbound nodes of test-workflow-kqxvv set to [test-workflow-kqxvv-2633604284]" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="node test-workflow-kqxvv phase Running -> Succeeded" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="node test-workflow-kqxvv finished: 2024-05-14 13:04:43.356496069 +0000 UTC" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="Running OnExit handler: exit-handler" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="was unable to obtain node for , letting display name to be nodeName" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="Steps node test-workflow-kqxvv-3314254127 initialized Running" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=info msg="StepGroup node test-workflow-kqxvv-113404891 initialized Running" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.356Z" level=warning msg="Node was nil, will be initialized as type Skipped" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.357Z" level=info msg="Pod node test-workflow-kqxvv-1105926477 initialized Pending" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.367Z" level=info msg="Created pod: test-workflow-kqxvv.onExit[0].exit-handleri (test-workflow-kqxvv-sleep-1105926477)" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.367Z" level=info msg="Workflow step group node test-workflow-kqxvv-113404891 not yet completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:43.377Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124315 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.369Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124315 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.369Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.369Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.369Z" level=info msg="node unchanged" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.369Z" level=info msg="node changed" namespace=neat-workflows new.message=PodInitializing new.phase=Pending new.progress=0/1 nodeID=test-workflow-kqxvv-1105926477 old.message= old.phase=Pending old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.370Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.370Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.370Z" level=info msg="Running OnExit handler: exit-handler" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.371Z" level=info msg="Workflow step group node test-workflow-kqxvv-113404891 not yet completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:45.383Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124334 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.459Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124334 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.459Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=2 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.459Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.459Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.459Z" level=info msg="node changed" namespace=neat-workflows new.message= new.phase=Running new.progress=0/1 nodeID=test-workflow-kqxvv-1105926477 old.message=PodInitializing old.phase=Pending old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.459Z" level=info msg="node unchanged" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.460Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.460Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.460Z" level=info msg="Running OnExit handler: exit-handler" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.460Z" level=info msg="Workflow step group node test-workflow-kqxvv-113404891 not yet completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:47.472Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124343 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:48.380Z" level=info msg="cleaning up pod" action=deletePod key=neat-workflows/test-workflow-kqxvv-echo-template-2633604284/deletePod
time="2024-05-14T13:04:49.474Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124343 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.474Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=2 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.474Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.474Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.475Z" level=info msg="node unchanged" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.475Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.475Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.475Z" level=info msg="Running OnExit handler: exit-handler" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.475Z" level=info msg="Workflow step group node test-workflow-kqxvv-113404891 not yet completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:49.487Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124349 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.489Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124349 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.489Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=2 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.489Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.489Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.489Z" level=info msg="node unchanged" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.490Z" level=info msg="TaskSet Reconciliation" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.490Z" level=info msg=reconcileAgentPod namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.490Z" level=info msg="Running OnExit handler: exit-handler" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.490Z" level=info msg="Workflow step group node test-workflow-kqxvv-113404891 not yet completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:04:51.498Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124349 workflow=test-workflow-kqxvv
time="2024-05-14T13:05:11.216Z" level=info msg="cleaning up pod" action=killContainers key=neat-workflows/test-workflow-kqxvv-echo-template-2633604284/killContainers
time="2024-05-14T13:07:01.837Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124537 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=2 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="node unchanged" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="Terminating pod as part of workflow shutdown" namespace=neat-workflows podName=test-workflow-kqxvv-sleep-1105926477 shutdownStrategy=Terminate workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="node test-workflow-kqxvv-1105926477 phase Running -> Failed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="node test-workflow-kqxvv-1105926477 message: workflow shutdown with strategy:  Terminate" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="node test-workflow-kqxvv-1105926477 finished: 2024-05-14 13:07:01.837964427 +0000 UTC" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.843Z" level=info msg="cleaning up pod" action=terminateContainers key=neat-workflows/test-workflow-kqxvv-sleep-1105926477/terminateContainers
time="2024-05-14T13:07:01.845Z" level=info msg="https://10.233.0.1:443/api/v1/namespaces/neat-workflows/pods/test-workflow-kqxvv-sleep-1105926477/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=main&stderr=true&stdout=true&tty=false"
time="2024-05-14T13:07:01.851Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Running resourceVersion=10124541 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.974Z" level=info msg="signaled container" container=main error="<nil>" namespace=neat-workflows pod=test-workflow-kqxvv-sleep-1105926477 stderr= stdout="killing 1 with terminated\n"
time="2024-05-14T13:07:01.974Z" level=info msg="https://10.233.0.1:443/api/v1/namespaces/neat-workflows/pods/test-workflow-kqxvv-sleep-1105926477/exec?command=%2Fvar%2Frun%2Fargo%2Fargoexec&command=kill&command=15&command=1&container=wait&stderr=true&stdout=true&tty=false"
time="2024-05-14T13:07:02.101Z" level=info msg="signaled container" container=wait error="<nil>" namespace=neat-workflows pod=test-workflow-kqxvv-sleep-1105926477 stderr= stdout="killing 1 with terminated\n"
time="2024-05-14T13:07:04.546Z" level=info msg="Processing workflow" Phase=Running ResourceVersion=10124541 namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="Task-result reconciliation" namespace=neat-workflows numObjs=2 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-2633604284 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="task-result changed" namespace=neat-workflows nodeID=test-workflow-kqxvv-1105926477 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="node changed" namespace=neat-workflows new.message="workflow shutdown with strategy:  Terminate" new.phase=Running new.progress=0/1 nodeID=test-workflow-kqxvv-1105926477 old.message="workflow shutdown with strategy:  Terminate" old.phase=Failed old.progress=0/1 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="Terminating pod as part of workflow shutdown" namespace=neat-workflows podName=test-workflow-kqxvv-sleep-1105926477 shutdownStrategy=Terminate workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="node test-workflow-kqxvv-1105926477 phase Running -> Failed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="Updated phase Running -> Succeeded" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="Marking workflow completed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.552Z" level=info msg="cleaning up pod" action=terminateContainers key=neat-workflows/test-workflow-kqxvv-sleep-1105926477/terminateContainers
time="2024-05-14T13:07:04.552Z" level=info msg="cleaning up pod" action=deletePod key=neat-workflows/test-workflow-kqxvv-1340600742-agent/deletePod
time="2024-05-14T13:07:04.556Z" level=warning msg="failed to clean-up pod" action=deletePod error="pods \"test-workflow-kqxvv-1340600742-agent\" not found" key=neat-workflows/test-workflow-kqxvv-1340600742-agent/deletePod
time="2024-05-14T13:07:04.556Z" level=warning msg="Non-transient error: pods \"test-workflow-kqxvv-1340600742-agent\" not found"
time="2024-05-14T13:07:04.558Z" level=info msg="Workflow update successful" namespace=neat-workflows phase=Succeeded resourceVersion=10124556 workflow=test-workflow-kqxvv
time="2024-05-14T13:07:06.852Z" level=info msg="cleaning up pod" action=deletePod key=neat-workflows/test-workflow-kqxvv-sleep-1105926477/deletePod
time="2024-05-14T13:07:32.102Z" level=info msg="cleaning up pod" action=killContainers key=neat-workflows/test-workflow-kqxvv-sleep-1105926477/killContainers
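
To see which nodes actually changed phase, the controller log can be filtered for its "node <id> phase <old> -> <new>" messages. The following is a minimal sketch, assuming the log format shown above; the function name phase_transitions is hypothetical, and the sample lines are copied from the log in this report.

```python
import re

# Matches controller messages such as:
#   msg="node <id> phase Running -> Failed"
TRANSITION = re.compile(r'msg="node (?P<node>\S+) phase (?P<old>\w+) -> (?P<new>\w+)"')


def phase_transitions(log_text: str) -> dict[str, list[tuple[str, str]]]:
    """Group (old, new) phase transitions by node ID, in log order."""
    result: dict[str, list[tuple[str, str]]] = {}
    for line in log_text.splitlines():
        match = TRANSITION.search(line)
        if match:
            result.setdefault(match["node"], []).append((match["old"], match["new"]))
    return result


sample_log = '''\
time="2024-05-14T13:04:43.356Z" level=info msg="node test-workflow-kqxvv phase Running -> Succeeded" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:01.837Z" level=info msg="node test-workflow-kqxvv-1105926477 phase Running -> Failed" namespace=neat-workflows workflow=test-workflow-kqxvv
time="2024-05-14T13:07:04.546Z" level=info msg="node test-workflow-kqxvv-1105926477 phase Running -> Failed" namespace=neat-workflows workflow=test-workflow-kqxvv
'''

for node, changes in phase_transitions(sample_log).items():
    print(node, changes)
```

In the log above, only the root DAG node and the sleep pod node report a phase transition. The Steps node test-workflow-kqxvv-3314254127 and the StepGroup node test-workflow-kqxvv-113404891 are initialized as Running and never appear in a transition message.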

Logs from your workflow's wait container

time="2024-05-14T13:04:45.442Z" level=info msg="Starting Workflow Executor" version=v3.5.6
time="2024-05-14T13:04:45.446Z" level=info msg="Using executor retry strategy" Duration=1s Factor=1.6 Jitter=0.5 Steps=5
time="2024-05-14T13:04:45.446Z" level=info msg="Executor initialized" deadline="0001-01-01 00:00:00 +0000 UTC" includeScriptOutput=false namespace=neat-workflows podName=test-workflow-kqxvv-sleep-1105926477 templateName=sleep version="&Version{Version:v3.5.6,BuildDate:2024-04-19T20:54:43Z,GitCommit:555030053825dd61689a086cb3c2da329419325a,GitTag:v3.5.6,GitTreeState:clean,GoVersion:go1.21.9,Compiler:gc,Platform:linux/amd64,}"
time="2024-05-14T13:04:45.459Z" level=info msg="Starting deadline monitor"
time="2024-05-14T13:07:02.088Z" level=info msg="Deadline monitor stopped"
time="2024-05-14T13:07:02.088Z" level=info msg="stopping progress monitor (context done)" error="context canceled"
time="2024-05-14T13:07:02.520Z" level=warning msg="Non-transient error: context canceled"
time="2024-05-14T13:07:02.520Z" level=info msg="Main container completed" error="context canceled"
time="2024-05-14T13:07:02.520Z" level=info msg="No Script output reference in workflow. Capturing script output ignored"
time="2024-05-14T13:07:02.520Z" level=info msg="No output parameters"
time="2024-05-14T13:07:02.520Z" level=info msg="No output artifacts"
time="2024-05-14T13:07:02.521Z" level=info msg="S3 Save path: /tmp/argo/outputs/logs/main.log, key: test-workflow-kqxvv/test-workflow-kqxvv-sleep-1105926477/main.log"
time="2024-05-14T13:07:02.521Z" level=info msg="Creating minio client using static credentials" endpoint="neat-minio:9000"
time="2024-05-14T13:07:02.521Z" level=info msg="Saving file to s3" bucket=argo-artifacts endpoint="neat-minio:9000" key=test-workflow-kqxvv/test-workflow-kqxvv-sleep-1105926477/main.log path=/tmp/argo/outputs/logs/main.log
time="2024-05-14T13:07:02.530Z" level=info msg="Save artifact" artifactName=main-logs duration=9.480581ms error="<nil>" key=test-workflow-kqxvv/test-workflow-kqxvv-sleep-1105926477/main.log
time="2024-05-14T13:07:02.530Z" level=info msg="not deleting local artifact" localArtPath=/tmp/argo/outputs/logs/main.log
time="2024-05-14T13:07:02.530Z" level=info msg="Successfully saved file: /tmp/argo/outputs/logs/main.log"
time="2024-05-14T13:07:02.544Z" level=info msg="Alloc=7931 TotalAlloc=14121 Sys=24933 NumGC=5 Goroutines=7"
@miltalex
Contributor

miltalex commented May 14, 2024

I could reproduce it. I can have a look.

miltalex added a commit to miltalex/argo-workflows that referenced this issue May 30, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 30, 2024
miltalex added a commit to miltalex/argo-workflows that referenced this issue May 31, 2024
@tczhao
Member

tczhao commented May 31, 2024

Screenshot for original issue
image

@tczhao tczhao changed the title from "Terminating workflow with running exit-handler leaves some nodes to Running phase" to "Terminating workflow with running exit-handler leaves stepGroup of the exit-handler in Running phase" May 31, 2024
@tczhao
Member

tczhao commented May 31, 2024

Verify in the example logs

"test-workflow-kqxvv-113404891": {
        "boundaryID": "test-workflow-kqxvv-3314254127",
        "children": [
          "test-workflow-kqxvv-1105926477"
        ],
        "displayName": "[0]",
        "finishedAt": null,
        "id": "test-workflow-kqxvv-113404891",
        "name": "test-workflow-kqxvv.onExit[0]",
        "nodeFlag": {},
        "phase": "Running",
        "progress": "0/1",
        "startedAt": "2024-05-14T13:04:43Z",
        "templateScope": "local/test-workflow-kqxvv",
        "type": "StepGroup"
      },

StepGroup test-workflow-kqxvv-113404891 is stuck in the Running phase (incorrect behaviour)
The StepGroup's child node test-workflow-kqxvv-1105926477 is marked Failed (correct behaviour)
The workflow is marked complete (correct behaviour)
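
The inconsistency described above can be stated as a check: a node that has children should not remain unfinished once all of its children have finished. The following is a minimal sketch of that check, assuming a status.nodes mapping like the excerpt above. The function name inconsistent_parents and the set of terminal phases are illustrative assumptions; this is not the controller's logic and not the fix proposed for this issue.

```python
# Phases assumed here to mean "finished" for an Argo Workflows node.
TERMINAL_PHASES = {"Succeeded", "Failed", "Error", "Skipped", "Omitted"}


def inconsistent_parents(nodes: dict) -> list[str]:
    """Return IDs of unfinished nodes whose children have all finished."""
    bad = []
    for node_id, node in nodes.items():
        children = node.get("children", [])
        # Skip nodes that are already finished or have no children to compare.
        if node.get("phase") in TERMINAL_PHASES or not children:
            continue
        if all(nodes.get(c, {}).get("phase") in TERMINAL_PHASES for c in children):
            bad.append(node_id)
    return sorted(bad)


# Hypothetical, trimmed-down copy of the two nodes discussed above.
nodes = {
    "test-workflow-kqxvv-113404891": {
        "type": "StepGroup",
        "phase": "Running",
        "children": ["test-workflow-kqxvv-1105926477"],
    },
    "test-workflow-kqxvv-1105926477": {"type": "Pod", "phase": "Failed"},
}

print(inconsistent_parents(nodes))
```

With the sample data, the check flags the StepGroup, because its only child has already failed while the StepGroup itself is still Running.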

@miltalex
Contributor

miltalex commented May 31, 2024

@tczhao I was able to reproduce it and opened an MR with a possible solution

miltalex added a commit to miltalex/argo-workflows that referenced this issue May 31, 2024