Large-scale cluster with 3000 nodes: Pods cannot be assigned IPs when scaling from 40k to 60k Pods #4078
zhangzujian added the performance label (Anything that can make Kube-OVN faster) and removed the bug label (Something isn't working) on May 24, 2024
@cmdy Impressive — are you really already running physical-machine clusters at this scale?
Yes, pretty much. Our largest single production cluster already has more than 1,700 physical machines. We have recently been validating Kube-OVN and plan to adopt this solution.
ovn-northd.log
2024-05-24T03:33:30.635Z|283076|northd|ERR|Dropped 28983 log messages in last 72 seconds (most recently, 24 seconds ago) due to excessive rate
2024-05-24T03:33:30.636Z|283077|northd|ERR|lport fake-pod-7b99c6d54d-r87tb.fake-pod in port group node.kwok.node.545 not found.
2024-05-24T03:33:30.762Z|283078|inc_proc_eng|INFO|node: northd, recompute (forced) took 10572ms
2024-05-24T03:33:31.913Z|283079|inc_proc_eng|INFO|node: lflow, recompute (forced) took 860ms
2024-05-24T03:33:32.132Z|283080|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:33:32.193Z|283081|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:33:32.213Z|283082|timeval|WARN|Unreasonably long 12024ms poll interval (12016ms user, 7ms system)
2024-05-24T03:33:32.213Z|283083|timeval|WARN|faults: 1275 minor, 0 major
2024-05-24T03:33:32.213Z|283084|timeval|WARN|context switches: 0 voluntary, 17 involuntary
2024-05-24T03:33:32.214Z|283085|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 3 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:33:32.215Z|283086|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (10.56.64.18:24322<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:33:32.216Z|283087|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:33:32.216Z|283088|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:33:32.216Z|283089|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (100% CPU usage)
2024-05-24T03:33:32.216Z|283090|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:33:33.217Z|283091|poll_loop|INFO|wakeup due to 1001-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:33:33.217Z|283092|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:33:33.217Z|283093|reconnect|INFO|tcp:[10.56.64.16]:6642: connecting...
2024-05-24T03:33:33.217Z|283094|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:33:33.218Z|283095|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:33:33.231Z|283096|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:33.231Z|283097|reconnect|INFO|tcp:[10.56.64.16]:6642: connection attempt timed out
2024-05-24T03:33:33.231Z|283098|reconnect|INFO|tcp:[10.56.64.16]:6642: waiting 2 seconds before reconnect
2024-05-24T03:33:33.238Z|283099|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:33.238Z|283100|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:33:33.238Z|283101|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:33:35.233Z|283102|reconnect|INFO|tcp:[10.56.64.17]:6642: connecting...
2024-05-24T03:33:35.233Z|283103|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:33:35.236Z|283104|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:33:35.236Z|283105|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:35.236Z|283106|reconnect|INFO|tcp:[10.56.64.17]:6642: connection attempt timed out
2024-05-24T03:33:35.236Z|283107|reconnect|INFO|tcp:[10.56.64.17]:6642: waiting 4 seconds before reconnect
2024-05-24T03:33:35.236Z|283108|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:33:35.237Z|283109|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:33:35.237Z|283110|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:33:35.238Z|283111|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:35.238Z|283112|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:33:35.238Z|283113|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:33:39.237Z|283114|reconnect|INFO|tcp:[10.56.64.18]:6642: connecting...
2024-05-24T03:33:39.237Z|283115|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:33:39.237Z|283116|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:33:39.239Z|283117|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:33:39.239Z|283118|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:33:50.111Z|283119|inc_proc_eng|INFO|node: northd, recompute (forced) took 10843ms
2024-05-24T03:33:51.256Z|283120|inc_proc_eng|INFO|node: lflow, recompute (forced) took 860ms
2024-05-24T03:33:51.473Z|283121|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:33:51.538Z|283122|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:33:51.556Z|283123|timeval|WARN|Unreasonably long 12288ms poll interval (12250ms user, 38ms system)
2024-05-24T03:33:51.556Z|283124|timeval|WARN|faults: 1273 minor, 0 major
2024-05-24T03:33:51.556Z|283125|timeval|WARN|disk: 0 reads, 8 writes
2024-05-24T03:33:51.556Z|283126|timeval|WARN|context switches: 0 voluntary, 24 involuntary
2024-05-24T03:33:51.556Z|283127|coverage|INFO|Dropped 1 log messages in last 20 seconds (most recently, 20 seconds ago) due to excessive rate
2024-05-24T03:33:51.556Z|283128|coverage|INFO|Skipping details of duplicate event coverage for hash=ab6315f5
2024-05-24T03:33:51.556Z|283129|poll_loop|INFO|Dropped 3 log messages in last 19 seconds (most recently, 19 seconds ago) due to excessive rate
2024-05-24T03:33:51.556Z|283130|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 3 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (97% CPU usage)
2024-05-24T03:33:51.556Z|283131|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (10.56.64.18:24328<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (97% CPU usage)
2024-05-24T03:33:51.558Z|283132|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:33:51.558Z|283133|reconnect|INFO|tcp:[10.56.64.18]:6642: continuing to reconnect in the background but suppressing further logging
2024-05-24T03:33:51.558Z|283134|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:33:51.559Z|283135|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (97% CPU usage)
2024-05-24T03:33:51.559Z|283136|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:33:52.560Z|283137|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:33:52.560Z|283138|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:33:52.571Z|283139|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:52.571Z|283140|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:33:52.571Z|283141|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:33:54.572Z|283142|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:33:54.572Z|283143|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:33:54.583Z|283144|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:54.583Z|283145|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:33:54.583Z|283146|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:33:58.584Z|283147|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:33:58.584Z|283148|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:33:59.559Z|283149|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:33:59.560Z|283150|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:33:59.560Z|283151|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:59.560Z|283152|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:34:07.564Z|283153|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:34:07.565Z|283154|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:34:07.566Z|283155|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:34:07.566Z|283156|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:29.427Z|283343|inc_proc_eng|INFO|node: northd, recompute (forced) took 10600ms
2024-05-24T03:36:30.527Z|283344|inc_proc_eng|INFO|node: lflow, recompute (forced) took 808ms
2024-05-24T03:36:30.755Z|283345|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:36:30.823Z|283346|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:36:30.841Z|283347|timeval|WARN|Unreasonably long 12014ms poll interval (12002ms user, 11ms system)
2024-05-24T03:36:30.841Z|283348|timeval|WARN|faults: 1650 minor, 0 major
2024-05-24T03:36:30.841Z|283349|timeval|WARN|context switches: 0 voluntary, 14 involuntary
2024-05-24T03:36:30.841Z|283350|poll_loop|INFO|wakeup due to [POLLIN] on fd 3 (10.56.64.18:24380<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:36:30.842Z|283351|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:36:30.842Z|283352|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:30.843Z|283353|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (100% CPU usage)
2024-05-24T03:36:30.843Z|283354|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:36:31.843Z|283355|poll_loop|INFO|wakeup due to 999-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:36:31.843Z|283356|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:36:31.843Z|283357|reconnect|INFO|tcp:[10.56.64.16]:6642: connecting...
2024-05-24T03:36:31.843Z|283358|poll_loop|INFO|wakeup due to [POLLOUT] on fd 3 (10.56.64.18:50246<->10.56.64.16:6641) at ../lib/stream-fd.c:153 (100% CPU usage)
2024-05-24T03:36:31.843Z|283359|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:36:31.844Z|283360|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:36:31.858Z|283361|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:31.858Z|283362|reconnect|INFO|tcp:[10.56.64.16]:6642: connection attempt timed out
2024-05-24T03:36:31.858Z|283363|reconnect|INFO|tcp:[10.56.64.16]:6642: waiting 2 seconds before reconnect
2024-05-24T03:36:31.864Z|283364|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:31.864Z|283365|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:36:31.864Z|283366|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:36:33.859Z|283367|reconnect|INFO|tcp:[10.56.64.17]:6642: connecting...
2024-05-24T03:36:33.859Z|283368|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:36:33.860Z|283369|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:36:33.860Z|283370|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:33.860Z|283371|reconnect|INFO|tcp:[10.56.64.17]:6642: connection attempt timed out
2024-05-24T03:36:33.860Z|283372|reconnect|INFO|tcp:[10.56.64.17]:6642: waiting 4 seconds before reconnect
2024-05-24T03:36:33.860Z|283373|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:33.863Z|283374|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:36:33.863Z|283375|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:36:33.864Z|283376|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:33.864Z|283377|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:36:33.864Z|283378|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:36:37.861Z|283379|reconnect|INFO|tcp:[10.56.64.18]:6642: connecting...
2024-05-24T03:36:37.861Z|283380|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:36:37.863Z|283381|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:36:37.864Z|283382|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:36:37.864Z|283383|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:36:49.551Z|283384|inc_proc_eng|INFO|node: northd, recompute (forced) took 10650ms
2024-05-24T03:36:50.660Z|283385|inc_proc_eng|INFO|node: lflow, recompute (forced) took 820ms
2024-05-24T03:36:50.894Z|283386|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:36:50.950Z|283387|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:36:50.972Z|283388|timeval|WARN|Unreasonably long 12072ms poll interval (12065ms user, 6ms system)
2024-05-24T03:36:50.972Z|283389|timeval|WARN|faults: 378 minor, 0 major
2024-05-24T03:36:50.972Z|283390|timeval|WARN|disk: 0 reads, 8 writes
2024-05-24T03:36:50.972Z|283391|timeval|WARN|context switches: 0 voluntary, 13 involuntary
2024-05-24T03:36:50.972Z|283392|coverage|INFO|Dropped 1 log messages in last 20 seconds (most recently, 20 seconds ago) due to excessive rate
2024-05-24T03:36:50.972Z|283393|coverage|INFO|Skipping details of duplicate event coverage for hash=ab6315f5
2024-05-24T03:36:50.972Z|283394|poll_loop|INFO|Dropped 2 log messages in last 19 seconds (most recently, 19 seconds ago) due to excessive rate
2024-05-24T03:36:50.975Z|283395|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 3 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (86% CPU usage)
2024-05-24T03:36:50.977Z|283396|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (10.56.64.18:24388<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (86% CPU usage)
2024-05-24T03:36:50.980Z|283397|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:36:50.980Z|283398|reconnect|INFO|tcp:[10.56.64.18]:6642: continuing to reconnect in the background but suppressing further logging
2024-05-24T03:36:50.980Z|283399|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:50.981Z|283400|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (86% CPU usage)
2024-05-24T03:36:50.981Z|283401|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:36:51.981Z|283402|poll_loop|INFO|wakeup due to 1000-ms timeout at ../lib/reconnect.c:677 (86% CPU usage)
2024-05-24T03:36:51.981Z|283403|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:36:51.981Z|283404|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:36:51.994Z|283405|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:51.994Z|283406|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:36:51.994Z|283407|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:36:53.996Z|283408|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:36:53.996Z|283409|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:36:54.007Z|283410|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:54.007Z|283411|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:36:54.007Z|283412|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:36:58.008Z|283413|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:36:58.008Z|283414|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:36:58.981Z|283415|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:36:58.983Z|283416|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:36:58.984Z|283417|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:58.984Z|283418|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:06.987Z|283419|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:37:06.988Z|283420|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:37:06.988Z|283421|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:06.988Z|283422|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:14.989Z|283423|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:37:23.377Z|283424|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:37:23.867Z|283425|ovn_util|WARN|Dropped 21737 log messages in last 87 seconds (most recently, 35 seconds ago) due to excessive rate
2024-05-24T03:37:23.867Z|283426|ovn_util|WARN|all port tunnel ids exhausted
2024-05-24T03:37:33.870Z|283427|northd|ERR|Dropped 43475 log messages in last 88 seconds (most recently, 44 seconds ago) due to excessive rate
2024-05-24T03:37:33.871Z|283428|northd|ERR|lport fake-pod-7b99c6d54d-r87tb.fake-pod in port group node.kwok.node.545 not found.
2024-05-24T03:37:33.987Z|283429|inc_proc_eng|INFO|node: northd, recompute (forced) took 10610ms
2024-05-24T03:37:35.123Z|283430|inc_proc_eng|INFO|node: lflow, recompute (forced) took 846ms
2024-05-24T03:37:35.362Z|283431|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:37:35.422Z|283432|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:37:35.442Z|283433|timeval|WARN|Unreasonably long 12064ms poll interval (12050ms user, 14ms system)
2024-05-24T03:37:35.442Z|283434|timeval|WARN|faults: 1405 minor, 0 major
2024-05-24T03:37:35.442Z|283435|timeval|WARN|disk: 0 reads, 8 writes
2024-05-24T03:37:35.442Z|283436|timeval|WARN|context switches: 0 voluntary, 12 involuntary
2024-05-24T03:37:35.442Z|283437|poll_loop|INFO|Dropped 3 log messages in last 44 seconds (most recently, 44 seconds ago) due to excessive rate
2024-05-24T03:37:35.444Z|283438|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 17 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:37:35.444Z|283439|poll_loop|INFO|wakeup due to [POLLIN] on fd 3 (10.56.64.18:24396<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:37:35.445Z|283440|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:37:35.445Z|283441|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:35.446Z|283442|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (100% CPU usage)
2024-05-24T03:37:35.446Z|283443|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:37:36.445Z|283444|poll_loop|INFO|wakeup due to 999-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:37:36.445Z|283445|reconnect|INFO|tcp:[10.56.64.16]:6642: connecting...
2024-05-24T03:37:36.446Z|283446|poll_loop|INFO|wakeup due to [POLLOUT] on fd 3 (10.56.64.18:27608<->10.56.64.16:6642) at ../lib/stream-fd.c:153 (100% CPU usage)
2024-05-24T03:37:36.446Z|283447|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:37:36.448Z|283448|poll_loop|INFO|wakeup due to 0-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:37:36.448Z|283449|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:37:36.457Z|283450|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:36.457Z|283451|reconnect|INFO|tcp:[10.56.64.16]:6642: connection attempt timed out
2024-05-24T03:37:36.457Z|283452|reconnect|INFO|tcp:[10.56.64.16]:6642: waiting 2 seconds before reconnect
2024-05-24T03:37:36.457Z|283453|poll_loop|INFO|wakeup due to [POLLOUT] on fd 17 (10.56.64.18:50250<->10.56.64.16:6641) at ../lib/stream-fd.c:153 (100% CPU usage)
2024-05-24T03:37:36.457Z|283454|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:37:36.467Z|283455|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:36.467Z|283456|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:37:36.467Z|283457|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:37:38.458Z|283458|reconnect|INFO|tcp:[10.56.64.17]:6642: connecting...
2024-05-24T03:37:38.459Z|283459|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:37:38.459Z|283460|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:37:38.459Z|283461|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:38.459Z|283462|reconnect|INFO|tcp:[10.56.64.17]:6642: connection attempt timed out
2024-05-24T03:37:38.459Z|283463|reconnect|INFO|tcp:[10.56.64.17]:6642: waiting 4 seconds before reconnect
2024-05-24T03:37:38.459Z|283464|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:38.468Z|283465|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:37:38.468Z|283466|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:37:38.468Z|283467|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:38.468Z|283468|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:37:38.468Z|283469|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:37:42.459Z|283470|reconnect|INFO|tcp:[10.56.64.18]:6642: connecting...
2024-05-24T03:37:42.460Z|283471|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:37:42.469Z|283472|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:37:42.469Z|283473|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:37:46.215Z|283474|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
ovsdb-server-nb.log
2024-05-24T03:39:19.604Z|132818|timeval|WARN|Unreasonably long 1064ms poll interval (1055ms user, 7ms system)
2024-05-24T03:39:19.604Z|132819|timeval|WARN|faults: 1275 minor, 0 major
2024-05-24T03:39:19.604Z|132820|timeval|WARN|context switches: 0 voluntary, 3 involuntary
2024-05-24T03:39:19.604Z|132821|reconnect|ERR|tcp:10.56.64.17:29954: no response to inactivity probe after 5.18 seconds, disconnecting
2024-05-24T03:39:23.551Z|132822|timeval|WARN|Unreasonably long 1001ms poll interval (995ms user, 5ms system)
2024-05-24T03:39:23.551Z|132823|timeval|WARN|faults: 1278 minor, 0 major
2024-05-24T03:39:23.551Z|132824|timeval|WARN|context switches: 0 voluntary, 2 involuntary
2024-05-24T03:39:25.467Z|132825|reconnect|ERR|tcp:10.56.64.16:19996: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:39:27.612Z|132826|timeval|WARN|Unreasonably long 1055ms poll interval (1047ms user, 6ms system)
2024-05-24T03:39:27.612Z|132827|timeval|WARN|faults: 846 minor, 0 major
2024-05-24T03:39:27.612Z|132828|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:39:31.655Z|132829|timeval|WARN|Unreasonably long 1006ms poll interval (996ms user, 7ms system)
2024-05-24T03:39:31.655Z|132830|timeval|WARN|faults: 843 minor, 0 major
2024-05-24T03:39:31.655Z|132831|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:39:35.719Z|132832|timeval|WARN|Unreasonably long 1064ms poll interval (1060ms user, 4ms system)
2024-05-24T03:39:35.719Z|132833|timeval|WARN|faults: 846 minor, 0 major
2024-05-24T03:39:35.719Z|132834|timeval|WARN|context switches: 0 voluntary, 8 involuntary
2024-05-24T03:39:43.722Z|132835|timeval|WARN|Unreasonably long 1054ms poll interval (1049ms user, 4ms system)
2024-05-24T03:39:43.722Z|132836|timeval|WARN|faults: 812 minor, 0 major
2024-05-24T03:39:43.722Z|132837|timeval|WARN|context switches: 0 voluntary, 5 involuntary
2024-05-24T03:39:43.722Z|132838|reconnect|ERR|tcp:10.56.64.17:29956: no response to inactivity probe after 5.37 seconds, disconnecting
2024-05-24T03:39:47.684Z|132839|timeval|WARN|Unreasonably long 1010ms poll interval (1004ms user, 6ms system)
2024-05-24T03:39:47.685Z|132840|timeval|WARN|faults: 833 minor, 0 major
2024-05-24T03:39:47.685Z|132841|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:39:51.734Z|132842|timeval|WARN|Unreasonably long 1053ms poll interval (1045ms user, 6ms system)
2024-05-24T03:39:51.734Z|132843|timeval|WARN|faults: 833 minor, 0 major
2024-05-24T03:39:51.734Z|132844|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:39:53.735Z|132845|reconnect|ERR|tcp:10.56.64.16:20008: no response to inactivity probe after 5.01 seconds, disconnecting
2024-05-24T03:39:55.704Z|132846|timeval|WARN|Unreasonably long 1017ms poll interval (1012ms user, 5ms system)
2024-05-24T03:39:55.704Z|132847|timeval|WARN|faults: 838 minor, 0 major
2024-05-24T03:39:55.704Z|132848|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:39:59.788Z|132849|timeval|WARN|Unreasonably long 1096ms poll interval (1091ms user, 3ms system)
2024-05-24T03:39:59.788Z|132850|timeval|WARN|faults: 834 minor, 0 major
2024-05-24T03:39:59.788Z|132851|timeval|WARN|context switches: 0 voluntary, 2 involuntary
2024-05-24T03:40:01.737Z|132852|reconnect|ERR|tcp:10.56.64.18:24442: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:40:03.731Z|132853|timeval|WARN|Unreasonably long 1033ms poll interval (1027ms user, 7ms system)
2024-05-24T03:40:03.731Z|132854|timeval|WARN|faults: 834 minor, 0 major
2024-05-24T03:40:03.731Z|132855|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:07.776Z|132856|timeval|WARN|Unreasonably long 1071ms poll interval (1061ms user, 10ms system)
2024-05-24T03:40:07.776Z|132857|timeval|WARN|faults: 835 minor, 0 major
2024-05-24T03:40:07.776Z|132858|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:40:07.776Z|132859|reconnect|ERR|tcp:10.56.64.17:29964: no response to inactivity probe after 5.62 seconds, disconnecting
2024-05-24T03:40:12.777Z|132860|reconnect|ERR|tcp:10.56.64.16:20022: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:40:15.777Z|132861|timeval|WARN|Unreasonably long 1060ms poll interval (1051ms user, 9ms system)
2024-05-24T03:40:15.777Z|132862|timeval|WARN|faults: 794 minor, 0 major
2024-05-24T03:40:15.777Z|132863|timeval|WARN|context switches: 0 voluntary, 2 involuntary
2024-05-24T03:40:15.778Z|132864|coverage|INFO|Dropped 12 log messages in last 56 seconds (most recently, 8 seconds ago) due to excessive rate
2024-05-24T03:40:15.778Z|132865|coverage|INFO|Skipping details of duplicate event coverage for hash=77dac2d1
2024-05-24T03:40:19.726Z|132866|timeval|WARN|Unreasonably long 1002ms poll interval (996ms user, 7ms system)
2024-05-24T03:40:19.726Z|132867|timeval|WARN|faults: 690 minor, 0 major
2024-05-24T03:40:19.726Z|132868|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:22.250Z|132869|reconnect|ERR|tcp:10.56.64.18:24458: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:40:23.810Z|132870|timeval|WARN|Unreasonably long 1066ms poll interval (1061ms user, 4ms system)
2024-05-24T03:40:23.810Z|132871|timeval|WARN|faults: 685 minor, 0 major
2024-05-24T03:40:23.810Z|132872|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:40:27.770Z|132873|timeval|WARN|Unreasonably long 1020ms poll interval (1015ms user, 5ms system)
2024-05-24T03:40:27.770Z|132874|timeval|WARN|faults: 473 minor, 0 major
2024-05-24T03:40:27.770Z|132875|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:31.954Z|132876|timeval|WARN|Unreasonably long 1108ms poll interval (1100ms user, 8ms system)
2024-05-24T03:40:31.954Z|132877|timeval|WARN|faults: 801 minor, 0 major
2024-05-24T03:40:31.954Z|132878|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:31.954Z|132879|reconnect|ERR|tcp:10.56.64.17:29966: no response to inactivity probe after 5.77 seconds, disconnecting
ovsdb-server-sb.log
2024-05-24T03:41:03.258Z|58044|jsonrpc|WARN|tcp:10.16.0.2:35724: receive error: Connection reset by peer
2024-05-24T03:41:03.258Z|58045|reconnect|WARN|tcp:10.16.0.2:35724: connection dropped (Connection reset by peer)
2024-05-24T03:41:05.466Z|58046|reconnect|ERR|tcp:10.56.64.16:24968: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:11.644Z|58047|reconnect|ERR|tcp:10.56.64.17:9754: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:22.055Z|58048|jsonrpc|WARN|tcp:10.16.0.2:35730: receive error: Connection reset by peer
2024-05-24T03:41:22.055Z|58049|reconnect|WARN|tcp:10.16.0.2:35730: connection dropped (Connection reset by peer)
2024-05-24T03:41:27.614Z|58050|reconnect|ERR|tcp:10.56.64.16:24970: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:34.818Z|58051|reconnect|ERR|tcp:10.56.64.17:9756: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:40.054Z|58052|jsonrpc|WARN|tcp:10.16.0.2:35734: receive error: Connection reset by peer
2024-05-24T03:41:40.054Z|58053|reconnect|WARN|tcp:10.16.0.2:35734: connection dropped (Connection reset by peer)
2024-05-24T03:41:41.840Z|58054|reconnect|ERR|tcp:10.56.64.18:8978: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:51.101Z|58055|reconnect|ERR|tcp:10.56.64.16:24972: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:56.430Z|58056|reconnect|ERR|tcp:10.56.64.17:9758: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:58.658Z|58057|jsonrpc|WARN|tcp:10.16.0.2:35738: receive error: Connection reset by peer
2024-05-24T03:41:58.658Z|58058|reconnect|WARN|tcp:10.16.0.2:35738: connection dropped (Connection reset by peer)
2024-05-24T03:42:10.905Z|58059|reconnect|ERR|tcp:10.56.64.16:24974: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:16.955Z|58060|jsonrpc|WARN|tcp:10.16.0.2:35742: receive error: Connection reset by peer
2024-05-24T03:42:16.955Z|58061|reconnect|WARN|tcp:10.16.0.2:35742: connection dropped (Connection reset by peer)
2024-05-24T03:42:19.908Z|58062|reconnect|ERR|tcp:10.56.64.17:9760: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:25.899Z|58063|reconnect|ERR|tcp:10.56.64.18:8998: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:34.622Z|58064|reconnect|ERR|tcp:10.56.64.16:24976: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:35.160Z|58065|jsonrpc|WARN|tcp:10.16.0.2:35746: receive error: Connection reset by peer
2024-05-24T03:42:35.160Z|58066|reconnect|WARN|tcp:10.16.0.2:35746: connection dropped (Connection reset by peer)
2024-05-24T03:42:40.127Z|58067|reconnect|ERR|tcp:10.56.64.17:9762: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:49.098Z|58068|reconnect|ERR|tcp:10.56.64.18:9008: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:53.460Z|58069|jsonrpc|WARN|tcp:10.16.0.2:35750: receive error: Connection reset by peer
2024-05-24T03:42:53.460Z|58070|reconnect|WARN|tcp:10.16.0.2:35750: connection dropped (Connection reset by peer)
2024-05-24T03:42:54.784Z|58071|reconnect|ERR|tcp:10.56.64.16:24978: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:04.102Z|58072|reconnect|ERR|tcp:10.56.64.17:9764: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:12.859Z|58073|jsonrpc|WARN|tcp:10.16.0.2:35754: receive error: Connection reset by peer
2024-05-24T03:43:12.859Z|58074|reconnect|WARN|tcp:10.16.0.2:35754: connection dropped (Connection reset by peer)
2024-05-24T03:43:18.411Z|58075|reconnect|ERR|tcp:10.56.64.16:24980: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:23.613Z|58076|reconnect|ERR|tcp:10.56.64.17:9766: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:31.133Z|58077|reconnect|ERR|tcp:10.56.64.18:9028: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:31.655Z|58078|jsonrpc|WARN|tcp:10.16.0.2:35758: receive error: Connection reset by peer
2024-05-24T03:43:31.655Z|58079|reconnect|WARN|tcp:10.16.0.2:35758: connection dropped (Connection reset by peer)
2024-05-24T03:43:37.376Z|58080|reconnect|ERR|tcp:10.56.64.16:24982: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:47.256Z|58081|reconnect|ERR|tcp:10.56.64.17:9768: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:50.057Z|58082|jsonrpc|WARN|tcp:10.16.0.2:35762: receive error: Connection reset by peer
2024-05-24T03:43:50.057Z|58083|reconnect|WARN|tcp:10.16.0.2:35762: connection dropped (Connection reset by peer)
2024-05-24T03:44:00.688Z|58084|reconnect|ERR|tcp:10.56.64.16:24984: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:44:06.345Z|58085|reconnect|ERR|tcp:10.56.64.17:9770: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:44:08.262Z|58086|jsonrpc|WARN|tcp:10.16.0.2:35766: receive error: Connection reset by peer
2024-05-24T03:44:08.263Z|58087|reconnect|WARN|tcp:10.16.0.2:35766: connection dropped (Connection reset by peer)
2024-05-24T03:44:15.098Z|58088|reconnect|ERR|tcp:10.56.64.18:9046: no response to inactivity probe after 5 seconds, disconnecting
cmdy changed the title from "[BUG] Large-scale cluster with 3000 nodes: Pods cannot be assigned IPs when scaling from 40k to 60k Pods" to "Large-scale cluster with 3000 nodes: Pods cannot be assigned IPs when scaling from 40k to 60k Pods" on May 24, 2024
@cmdy https://kubeovn.github.io/docs/v1.12.x/reference/tunnel-protocol/#vxlan With VXLAN the limit is even lower: 4096 ports per datapath.
According to the OVN architecture documentation, Geneve supports at most 2**15 ports per datapath.
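These per-datapath limits line up with the "all port tunnel ids exhausted" error in the ovn-northd log above. A minimal sketch of the capacity check, using only the figures quoted in this thread (Geneve: 2**15 port ids per datapath; VXLAN: 4096) and assuming tunnel key 0 is reserved:

```python
# Per-datapath logical port ("tunnel id") capacity, per the figures
# quoted in this thread and the docs linked above.
GENEVE_PORT_BITS = 15  # 2**15 = 32768, per the OVN architecture docs
VXLAN_PORT_BITS = 12   # 4096 ports, per the Kube-OVN tunnel-protocol docs

def fits_in_datapath(port_count: int, port_bits: int) -> bool:
    """Return True if port_count logical ports can all receive distinct
    tunnel ids on one datapath (assumes tunnel key 0 is reserved)."""
    return port_count <= 2 ** port_bits - 1

# 60k logical ports on a single datapath exceed the Geneve limit,
# consistent with the tunnel-id exhaustion errors in ovn-northd.log.
print(fits_in_datapath(60_000, GENEVE_PORT_BITS))  # False
print(fits_in_datapath(30_000, GENEVE_PORT_BITS))  # True
```

This suggests the exhaustion is expected once a single logical switch holds more ports than the encapsulation can address, regardless of controller resources.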
Kube-OVN Version
v1.12.11
Kubernetes Version
Client Version: v1.28.8
Server Version: v1.28.8
Operation-system/Kernel Version
"CentOS Linux 7 (Core)"
5.16.20-3.el7.bzl.x86_64
Description
Simulated 3k nodes with kwok; when scaling pods from 40k to 60k, Pod IPs were not assigned correctly.
kube-ovn-controller logs
Steps To Reproduce
Use kwok to simulate 3k nodes and scale pods up to 60k.
kube-ovn-controller resources: 8 CPU / 8 GiB
ovn-central: 3 replicas
Current Behavior
Pod IPs are not assigned.
Expected Behavior
Pod IPs should be assigned normally.