• [FAILED] [0.459 seconds]
TFJob controller Test Normal Path [It] should create desired Pods and Services
/home/runner/work/training-operator/training-operator/go/src/github.com/kubeflow/training-operator/pkg/controller.v1/tensorflow/tfjob_controller_test.go:39
Timeline >>
STEP: Distributed TFJob (4 workers, 2 PS) is created @ 04/26/24 21:22:53.416
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-0-worker-0 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-0-worker-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-0-worker-2 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-0-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-0-worker-0 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-0-worker-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-0-worker-2 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-0-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-0-ps-0 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-0-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-0-ps-0 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-0" not found {"tfjob": {"name":"test-case-norm-0","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-0"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-0-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-0","uid":"18381da0-8cbb-465d-bf07-96de003f9a1b"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO KubeAPIWarningLogger unknown field "spec.tfReplicaSpecs.PS.template.metadata.creationTimestamp"
STEP: Distributed TFJob (4 workers, 2 PS) is created and all replicas are pending @ 04/26/24 21:22:53.458
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-1" not found {"tfjob": {"name":"test-case-norm-1","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-1"}
STEP: Distributed TFJob (4 workers, 2 PS) is created, 2 workers, 1 PS are pending @ 04/26/24 21:22:53.533
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z ERROR Reconciler error {"controller": "tfjob-controller", "object": {"name":"test-tfjob","namespace":"tfjob-ns-ls4h8"}, "namespace": "tfjob-ns-ls4h8", "name": "test-tfjob", "reconcileID": "1b2c2164-a3b1-4be5-b812-a9388b02a99a", "error": "pods \"test-tfjob-worker-1\" is forbidden: unable to create new content in namespace tfjob-ns-ls4h8 because it is being terminated"}
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
/home/runner/work/training-operator/training-operator/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:329
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
/home/runner/work/training-operator/training-operator/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:266
sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).Start.func2.2
/home/runner/work/training-operator/training-operator/go/pkg/mod/sigs.k8s.io/[email protected]/pkg/internal/controller/controller.go:227
2024-04-26T21:22:53Z DEBUG events Error creating: pods "test-tfjob-worker-1" is forbidden: unable to create new content in namespace tfjob-ns-ls4h8 because it is being terminated {"type": "Warning", "object": {"kind":"TFJob","namespace":"tfjob-ns-ls4h8","name":"test-tfjob","uid":"fcdfcb63-412d-45f5-8cd0-dddc17442233","apiVersion":"kubeflow.org/v1","resourceVersion":"227"}, "reason": "FailedCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-2-worker-2 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-2","uid":"58f59bd2-48bf-4f30-9530-2e7f7f63b434"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-2-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-2","uid":"58f59bd2-48bf-4f30-9530-2e7f7f63b434"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-2-worker-2 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-2","uid":"58f59bd2-48bf-4f30-9530-2e7f7f63b434"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-2-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-2","uid":"58f59bd2-48bf-4f30-9530-2e7f7f63b434"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-2-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-2","uid":"58f59bd2-48bf-4f30-9530-2e7f7f63b434"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-2" not found {"tfjob": {"name":"test-case-norm-2","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-2"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-2-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-2","uid":"58f59bd2-48bf-4f30-9530-2e7f7f63b434"}, "reason": "SuccessfulCreateService"}
STEP: Distributed TFJob (4 workers, 2 PS) is created, 2 workers, 1 PS are pending, 1 worker is succeeded @ 04/26/24 21:22:53.613
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-3-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-3","uid":"214c8fcb-d647-4d1b-b9f9-de1f3e8f6924"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-3-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-3","uid":"214c8fcb-d647-4d1b-b9f9-de1f3e8f6924"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-3-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-3","uid":"214c8fcb-d647-4d1b-b9f9-de1f3e8f6924"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-3" not found {"tfjob": {"name":"test-case-norm-3","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-3"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-3-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-3","uid":"214c8fcb-d647-4d1b-b9f9-de1f3e8f6924"}, "reason": "SuccessfulCreateService"}
STEP: Distributed TFJob (4 workers, 2 PS) is created and all replicas are running @ 04/26/24 21:22:53.704
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-4" not found {"tfjob": {"name":"test-case-norm-4","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-4"}
STEP: Distributed TFJob (4 workers, 2 PS) is created, 2 workers, 1 PS are pending, 1 worker is running @ 04/26/24 21:22:53.776
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-5-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-5","uid":"280513e0-f904-4854-8b62-748e7f3d1889"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-5-worker-3 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-5","uid":"280513e0-f904-4854-8b62-748e7f3d1889"}, "reason": "SuccessfulCreateService"}
2024-04-26T21:22:53Z DEBUG events Created pod: test-case-norm-5-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-5","uid":"280513e0-f904-4854-8b62-748e7f3d1889"}, "reason": "SuccessfulCreatePod"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-5" not found {"tfjob": {"name":"test-case-norm-5","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-5"}
2024-04-26T21:22:53Z DEBUG events Created service: test-case-norm-5-ps-1 {"type": "Normal", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-5","uid":"280513e0-f904-4854-8b62-748e7f3d1889"}, "reason": "SuccessfulCreateService"}
STEP: Distributed TFJob (4 workers, 2 PS) is succeeded @ 04/26/24 21:22:53.818
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z INFO TFJob.kubeflow.org "test-case-norm-6" not found {"tfjob": {"name":"test-case-norm-6","namespace":"default"}, "unable to fetch TFJob": "default/test-case-norm-6"}
2024-04-26T21:22:53Z DEBUG events Error creating: services "test-case-norm-6-ps-1" already exists {"type": "Warning", "object": {"kind":"TFJob","namespace":"default","name":"test-case-norm-6","uid":"5499f286-168c-4046-8286-3893f2fa67ea"}, "reason": "FailedCreateService"}
[FAILED] in [It] - /home/runner/work/training-operator/training-operator/go/src/github.com/kubeflow/training-operator/pkg/controller.v1/tensorflow/tfjob_controller_test.go:321 @ 04/26/24 21:22:53.876
<< Timeline
[FAILED] Expected
    <bool>: false
to be true
In [It] at: /home/runner/work/training-operator/training-operator/go/src/github.com/kubeflow/training-operator/pkg/controller.v1/tensorflow/tfjob_controller_test.go:321 @ 04/26/24 21:22:53.876
tenzen-y changed the title from "Flaky Test: [It] should create desired Pods and Services" to "Flaky Test: [It] should create desired Pods and Services: Distributed TFJob (4 workers, 2 PS) is succeeded" on Apr 27, 2024
Observed at: https://github.com/kubeflow/training-operator/actions/runs/8854426275/job/24317424179#step:4:364