
app Container can't reuse its init Container cpuset in a specific condition #124797

Open
lianghao208 opened this issue May 10, 2024 · 11 comments

Labels
kind/bug Categorizes issue or PR as related to a bug. priority/backlog Higher priority than priority/awaiting-more-evidence. sig/node Categorizes an issue or PR as relevant to SIG Node. triage/accepted Indicates an issue or PR is ready to be actively worked on.

@lianghao208
Member

lianghao208 commented May 10, 2024

What happened?

We can't make sure the app container always reuses the init container's cpuset, which may waste CPUs (the init container has already exited, but its cpuset cannot be reused by other containers) and lead to a `not enough cpus available to satisfy request` error.


What did you expect to happen?

The app container should always reuse its init container's cpuset after the init container exits.

How can we reproduce it (as minimally and precisely as possible)?

This is one of the specific conditions that can cause the issue:

  • Pod A is ready to allocate cpusets; its init container and app container both request 92 CPUs.
  • Pod B is already running on the node and is about to be deleted.
  1. Pod A's init container allocates its cpuset (4-24,48-60,73-84,100-120,144-156,169-180):
I0510 16:40:21.232949   20266 state_mem.go:80] "Updated desired CPUSet" podUID="2f9922ce-df66-4b58-abd8-01187b813318" containerName="init-container" cpuSet="4-24,48-60,73-84,100-120,144-156,169-180"
  2. Pod A's init container exits.
  3. Before Pod A's app container allocates its cpuset, Pod B is deleted and releases its cpuset (0-3,25-47,61-72,85-99,121-143,157-168,181-191):
I0510 16:40:27.759335   20266 state_mem.go:107] "Deleted CPUSet assignment" podUID="74510e24-48ba-4fd7-ab85-80dd99c6df5d" containerName="deleted-container"
I0510 16:40:27.759714   20266 state_mem.go:88] "Updated default CPUSet" cpuSet="0-3,25-47,61-72,85-99,121-143,157-168,181-191"
  4. Pod A's app container allocates its cpuset.
     We expect it to reuse its init container's cpuset, but due to Pod B's deletion it allocates a different one (4-49,100-145):
I0510 16:40:27.989453   20266 state_mem.go:80] "Updated desired CPUSet" podUID="2f9922ce-df66-4b58-abd8-01187b813318" containerName="app-container" cpuSet="4-49,100-145"

Now Pod A's init container holds cpuset 4-24,48-60,73-84,100-120,144-156,169-180,
and Pod A's app container holds cpuset 4-49,100-145.
The init container's cpuset is not reused as expected.

A new Pod C that starts to allocate a cpuset may then hit the `not enough cpus available to satisfy request` error.
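The arithmetic behind this can be checked with the k8s.io/utils/cpuset helpers (the package the kubelet itself uses for cpuset math). This is a minimal sketch using the sets from the logs above; calling the leftover CPUs "stranded" follows this issue's description, it is not kubelet terminology:

```go
package main

import (
	"fmt"

	"k8s.io/utils/cpuset"
)

func main() {
	// CPU sets copied from the kubelet logs above.
	initSet := cpuset.MustParse("4-24,48-60,73-84,100-120,144-156,169-180")
	appSet := cpuset.MustParse("4-49,100-145")

	reused := initSet.Intersection(appSet) // CPUs the app container did reuse
	stranded := initSet.Difference(appSet) // init CPUs still assigned to an exited container

	fmt.Println("init:", initSet.Size(), "app:", appSet.Size()) // 92 and 92
	fmt.Println("reused:", reused.Size())                       // 46
	fmt.Println("stranded:", stranded.Size())                   // 46 CPUs unavailable to Pod C
}
```

So although both containers were sized identically, only 46 of the init container's 92 CPUs were reused; the other 46 stay pinned to a container that no longer runs.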

Anything else we need to know?

No response

Kubernetes version

1.30

Cloud provider

NONE

OS version


Install tools

Container runtime (CRI) and version (if applicable)

Related plugins (CNI, CSI, ...) and versions (if applicable)

@lianghao208 lianghao208 added the kind/bug Categorizes issue or PR as related to a bug. label May 10, 2024
@k8s-ci-robot k8s-ci-robot added needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 10, 2024
@lianghao208
Member Author

/sig node

@k8s-ci-robot k8s-ci-robot added sig/node Categorizes an issue or PR as relevant to SIG Node. and removed needs-sig Indicates an issue or PR lacks a `sig/foo` label and requires one. labels May 10, 2024
@lianghao208
Member Author

/cc @klueska Hi Klues, I noticed you have solved some similar issues, like #102014. I wonder if you have encountered this issue before.

@lianghao208
Member Author

The point is: the cpuset allocations for init containers and app containers differ because the available cpusets change in the interval between the start of the init container and the start of the app container.

@ffromani
Contributor

related: #94220

@chengjoey
Contributor

#124282
similar issue?

@lianghao208
Member Author

> related: #94220

@ffromani Thanks for the mention. The issue I describe here is a little different from #94220.
In #94220, the bug is caused by different CPU requests between the init container and the app container.
This bug is caused by changes in the available cpusets in the interval between the start of the init container and the start of the app container.

@lianghao208
Member Author

> #124282 similar issue?

@chengjoey Not exactly the same issue. In #124282, the init container and app container request different amounts of CPU (init > app), and the init container's cpuset can't be released even though it has exited (similar to #94220).

But in this case, the init container and app container request the same amount of CPU (init == app), so this is a kubelet issue rather than a kube-scheduler issue.

@ffromani
Contributor

> > related: #94220

> @ffromani Thanks for the mention. The issue I describe here is a little different from #94220. In #94220, the bug is caused by different CPU requests between the init container and the app container. This bug is caused by changes in the available cpusets in the interval between the start of the init container and the start of the app container.

Yes, I realized that after re-reading the description of this issue. I'd need to check whether the system actually guarantees maximum reuse of the init container's CPU cores when allocating the app container's cores. Nevertheless, it's a very desirable property the system should strive to ensure. My gut feeling is that there is simply a bug in this area; I remember various conversations about it over time.

@ffromani
Contributor

the core issue is here: https://github.com/kubernetes/kubernetes/blob/v1.30.0/pkg/kubelet/cm/cpumanager/policy_static.go#L394

With this line of code, all the available CPUs are put in a single pool. IOW, nothing guarantees that the reusable CPUs from the terminated init container will be consumed first, or consumed at all if the system has enough CPUs to fulfill the app container's requirement.
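To make that concrete, here is a self-contained sketch of the effect (not the actual kubelet allocator; `takeLowest` is a deliberately naive stand-in for the topology-aware picker in policy_static.go):

```go
package main

import (
	"fmt"

	"k8s.io/utils/cpuset"
)

// takeLowest stands in for the kubelet's topology-aware picker: it just
// takes the numCPUs lowest-numbered CPUs (and assumes the pool is large
// enough). The real picker is NUMA-aware, but once everything sits in one
// pool it is equally free to ignore which CPUs came from the init container.
func takeLowest(pool cpuset.CPUSet, numCPUs int) cpuset.CPUSet {
	return cpuset.New(pool.List()[:numCPUs]...)
}

func main() {
	// Default pool after Pod B's deletion and the reusable CPUs from
	// Pod A's exited init container, both taken from the logs above.
	defaultPool := cpuset.MustParse("0-3,25-47,61-72,85-99,121-143,157-168,181-191")
	reusable := cpuset.MustParse("4-24,48-60,73-84,100-120,144-156,169-180")

	// The linked line merges both into one flat pool; nothing marks the
	// reusable CPUs as preferred.
	allocatable := defaultPool.Union(reusable)

	got := takeLowest(allocatable, 92)
	fmt.Println("allocated:", got.String())
	fmt.Println("reused from init:", got.Intersection(reusable).Size(), "of", reusable.Size())
}
```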

I vaguely remember some past conversations in this area about guaranteeing optimal allocation in the context of topology-manager-enforced constraints. I also wonder whether and how we should extend this guarantee. IOW, should the reuse be best-effort (and so, arguably, there's no bug)?

Perhaps the best way to fix this would be to add a new CPU manager policy option.

@ffromani
Contributor

/triage accepted
/priority backlog

@k8s-ci-robot k8s-ci-robot added triage/accepted Indicates an issue or PR is ready to be actively worked on. priority/backlog Higher priority than priority/awaiting-more-evidence. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels May 13, 2024
@lianghao208
Member Author

@ffromani

> With this line of code, all the available CPUs are put in a single pool. IOW, nothing guarantees that the reusable CPUs from the terminated init container will be consumed first, or consumed at all if the system has enough CPUs to fulfill the app container's requirement.

In this case, should we release the init container's cpuset as soon as it exits? If an init container exits successfully and won't restart anymore, its cpuset can either be reused by its own pod's app container or by other pods' containers; otherwise this "available" cpuset will not be used at all.
From the scheduler's perspective, however, these CPUs are considered available.

> I vaguely remember some past conversations in this area about guaranteeing optimal allocation in the context of topology-manager-enforced constraints. I also wonder whether and how we should extend this guarantee. IOW, should the reuse be best-effort (and so, arguably, there's no bug)?

If we release the init container's cpuset as soon as it exits, the reuse will be guaranteed. See the hypothetical sketch below.
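A hypothetical sketch of that idea against the cpumanager state interface (the `onInitContainerExit` hook and its call site are assumptions for illustration, not existing kubelet code):

```go
package cpumanager

import (
	"k8s.io/kubernetes/pkg/kubelet/cm/cpumanager/state"
)

// onInitContainerExit is a hypothetical hook: once an init container has
// exited and is guaranteed not to restart, return its exclusive CPUs to
// the shared pool immediately instead of waiting for the pod to be removed.
func onInitContainerExit(s state.State, podUID, containerName string) {
	if set, ok := s.GetCPUSet(podUID, containerName); ok {
		s.Delete(podUID, containerName) // drop the exited container's assignment
		s.SetDefaultCPUSet(s.GetDefaultCPUSet().Union(set))
	}
}
```

Restartable (sidecar) init containers would have to be excluded, since they keep running alongside the app containers.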

@pacoxu pacoxu added this to Triage in SIG Node Bugs May 14, 2024
@ffromani ffromani moved this from Triage to Triaged in SIG Node Bugs May 14, 2024