[Flaking Test][sig-storage] should block a second pod from using an in-use ReadWriteOncePod volume on the same node #124784

Open
eddiezane opened this issue May 10, 2024 · 3 comments
Labels
kind/flake Categorizes issue or PR as related to a flaky test. sig/storage Categorizes an issue or PR as relevant to SIG Storage. triage/accepted Indicates an issue or PR is ready to be actively worked on.

Comments

@eddiezane
Member

eddiezane commented May 10, 2024

Which jobs are flaking?

  • e2e-kops-aws-cni-cilium-eni
  • ci-kubernetes-kind-e2e-parallel-1-29
  • pull-kubernetes-e2e-kind

Which tests are flaking?

Kubernetes e2e suite: [It] [sig-storage] CSI Volumes [Driver: csi-hostpath] [Testpattern: Dynamic PV (default fs)] read-write-once-pod [MinimumKubeletVersion:1.27] should block a second pod from using an in-use ReadWriteOncePod volume on the same node

Since when has it been flaking?

Unclear. It appears to have been flaking for a while, but I haven't been able to find an existing issue for it.

https://storage.googleapis.com/k8s-triage/index.html?date=2024-05-09&test=should%20block%20a%20second%20pod%20from%20using%20an%20in-use%20ReadWriteOncePod

Testgrid link

https://testgrid.k8s.io/presubmits-kubernetes-blocking#pull-kubernetes-e2e-kind

Reason for failure (if possible)

[FAILED] failed to wait for FailedMount event for pod2: got error while getting events: client rate limiter Wait returned an error: context deadline exceeded
In [It] at: k8s.io/kubernetes/test/e2e/storage/testsuites/readwriteoncepod.go:233
[FAILED] failed to wait for FailedMount event for pod2: got error while getting events: client rate limiter Wait returned an error: rate: Wait(n=1) would exceed context deadline
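
The phrase "would exceed context deadline" suggests the client-side rate limiter refused the request up front because the wait for a token could not fit into the time remaining. The following is a minimal sketch of that behavior, not the client-go implementation: `TokenBucket`, `WaitWouldExceedDeadline`, `requests_before_failure`, the simulated clock, and the rate and burst values are all hypothetical and chosen for illustration.

```python
class WaitWouldExceedDeadline(Exception):
    """The wait needed for a token is longer than the time left before the deadline."""


class TokenBucket:
    """Hypothetical minimal token bucket on a simulated clock (seconds as floats)."""

    def __init__(self, rate_per_sec, burst):
        self.rate = float(rate_per_sec)
        self.burst = float(burst)
        self.tokens = float(burst)  # start with a full bucket
        self.last = 0.0

    def wait(self, now, deadline):
        """Reserve one token and return the delay before it becomes usable.

        Fails fast, without consuming a token, if the delay would end after `deadline`.
        """
        # Refill for the time elapsed since the last reservation, capped at burst.
        tokens = min(self.burst, self.tokens + (now - self.last) * self.rate)
        delay = 0.0 if tokens >= 1.0 else (1.0 - tokens) / self.rate
        if now + delay > deadline:
            raise WaitWouldExceedDeadline(
                f"wait of {delay:.1f}s would exceed deadline in {deadline - now:.1f}s"
            )
        self.tokens = tokens - 1.0  # may go negative: the token is borrowed from the future
        self.last = now
        return delay


def requests_before_failure(limiter, now, deadline, attempts):
    """Count how many back-to-back requests are admitted before the limiter refuses."""
    granted = 0
    for _ in range(attempts):
        try:
            limiter.wait(now, deadline)
        except WaitWouldExceedDeadline:
            break
        granted += 1
    return granted


if __name__ == "__main__":
    # Illustrative values only: 5 requests/s, burst of 10, 1.1 s left on the context.
    print(requests_before_failure(TokenBucket(5, 10), now=0.0, deadline=1.1, attempts=100))
```

In this sketch, every request past the burst pushes the next caller's wait further out, so a client that is already heavily throttled can fail a request without ever sending it. That would be consistent with the test timing out while polling for events, though the sketch does not establish the cause here.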

Anything else we need to know?

https://prow.k8s.io/view/gs/kubernetes-jenkins/pr-logs/pull/124683/pull-kubernetes-e2e-kind/1788664943130710016

Relevant SIG(s)

/sig storage

eddiezane added the kind/flake label on May 10, 2024
k8s-ci-robot added the sig/storage and needs-triage labels on May 10, 2024
eddiezane changed the title from "[Flaking Test][sig-storage]" to "[Flaking Test][sig-storage] should block a second pod from using an in-use ReadWriteOncePod volume on the same node" on May 10, 2024
@liangyuanpeng
Contributor

liangyuanpeng commented May 10, 2024

FYI: I used GitHub Actions to run some tests against the pull-kubernetes-e2e-kind job. It always succeeds when PARALLEL=false; the problem can only be reproduced when PARALLEL=true.

@brianpursley
Member

OutOfpods: Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110

  STEP: Collecting events from namespace "read-write-once-pod-9757". @ 05/09/24 20:38:51.154
  STEP: Found 10 events. @ 05/09/24 20:38:51.156
  I0509 20:38:51.156989 65982 dump.go:53] At 2024-05-09 20:33:01 +0000 UTC - event for csi-hostpath5v7vm: {persistentvolume-controller } ExternalProvisioning: Waiting for a volume to be created either by the external provisioner 'csi-hostpath-read-write-once-pod-9757' or manually by the system administrator. If volume creation is delayed, please verify that the provisioner is running and correctly registered.
  I0509 20:38:51.157008 65982 dump.go:53] At 2024-05-09 20:33:20 +0000 UTC - event for csi-hostpath5v7vm: {csi-hostpath-read-write-once-pod-9757_csi-hostpathplugin-0_b225a3f7-b843-4de0-ac58-61bc09343ab8 } Provisioning: External provisioner is provisioning volume for claim "read-write-once-pod-9757/csi-hostpath5v7vm"
  I0509 20:38:51.157018 65982 dump.go:53] At 2024-05-09 20:33:20 +0000 UTC - event for csi-hostpath5v7vm: {csi-hostpath-read-write-once-pod-9757_csi-hostpathplugin-0_b225a3f7-b843-4de0-ac58-61bc09343ab8 } ProvisioningSucceeded: Successfully provisioned volume pvc-aef51327-f502-4525-89bb-cb3f30838c95
  I0509 20:38:51.157026 65982 dump.go:53] At 2024-05-09 20:33:21 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {default-scheduler } Scheduled: Successfully assigned read-write-once-pod-9757/pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff to kind-worker
  I0509 20:38:51.157032 65982 dump.go:53] At 2024-05-09 20:33:22 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {attachdetach-controller } SuccessfulAttachVolume: AttachVolume.Attach succeeded for volume "pvc-aef51327-f502-4525-89bb-cb3f30838c95" 
  I0509 20:38:51.157039 65982 dump.go:53] At 2024-05-09 20:33:40 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {kubelet kind-worker} Pulled: Container image "registry.k8s.io/e2e-test-images/busybox:1.36.1-1" already present on machine
  I0509 20:38:51.157046 65982 dump.go:53] At 2024-05-09 20:33:40 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {kubelet kind-worker} Created: Created container write-pod
  I0509 20:38:51.157056 65982 dump.go:53] At 2024-05-09 20:33:42 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {kubelet kind-worker} Started: Started container write-pod
  I0509 20:38:51.157063 65982 dump.go:53] At 2024-05-09 20:33:44 +0000 UTC - event for pod-da786ea3-645b-4ea1-9f62-51f97220b8be: {kubelet kind-worker} OutOfpods: Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110
  I0509 20:38:51.157069 65982 dump.go:53] At 2024-05-09 20:38:44 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {kubelet kind-worker} Killing: Stopping container write-pod
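
The dump above lists no FailedMount event at all; the only event recorded for pod-da786ea3-645b-4ea1-9f62-51f97220b8be (presumably the test's second pod) is OutOfpods. A small sketch for pulling the event reasons per object out of such a dump follows. The regular expression and the `reasons_by_object` helper are hypothetical, written against the line format shown here rather than any documented log format.

```python
import re

# Matches the tail of a dump line:
#   "... event for <object>: {<source>} <Reason>: <message>"
EVENT_RE = re.compile(r"event for (?P<obj>\S+): \{(?P<source>[^}]*)\} (?P<reason>\w+): ")


def reasons_by_object(lines):
    """Map each object name to the list of event reasons seen for it, in order."""
    seen = {}
    for line in lines:
        match = EVENT_RE.search(line)
        if match:
            seen.setdefault(match.group("obj"), []).append(match.group("reason"))
    return seen


# Three lines copied from the dump in this comment.
SAMPLE = [
    "I0509 20:38:51.157056 65982 dump.go:53] At 2024-05-09 20:33:42 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {kubelet kind-worker} Started: Started container write-pod",
    "I0509 20:38:51.157063 65982 dump.go:53] At 2024-05-09 20:33:44 +0000 UTC - event for pod-da786ea3-645b-4ea1-9f62-51f97220b8be: {kubelet kind-worker} OutOfpods: Node didn't have enough resource: pods, requested: 1, used: 110, capacity: 110",
    "I0509 20:38:51.157069 65982 dump.go:53] At 2024-05-09 20:38:44 +0000 UTC - event for pod-55dfb4dd-f05e-4776-a8e1-37283232c0ff: {kubelet kind-worker} Killing: Stopping container write-pod",
]

if __name__ == "__main__":
    for obj, reasons in reasons_by_object(SAMPLE).items():
        print(obj, reasons)
```

If the second pod is rejected by the kubelet for lack of pod capacity, it never reaches the volume mount step, so a test that waits specifically for FailedMount would time out even though the volume was never shared.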

@xing-yang
Contributor

/triage accepted

k8s-ci-robot added the triage/accepted label and removed the needs-triage label on May 15, 2024
5 participants