
In a large-scale cluster with 3,000 nodes, Pods fail to be assigned IPs when scaling from 40k to 60k Pods #4078

Closed
cmdy opened this issue May 24, 2024 · 7 comments
Labels: documents (Need documents), performance (Anything that can make Kube-OVN faster)

Comments


cmdy commented May 24, 2024

Kube-OVN Version

v1.12.11

Kubernetes Version

Client Version: v1.28.8
Server Version: v1.28.8

Operation-system/Kernel Version

"CentOS Linux 7 (Core)"
5.16.20-3.el7.bzl.x86_64

Description

Used kwok to simulate 3,000 fake nodes; when scaling Pods from 40,000 to 60,000, Pod IPs were not allocated properly.

kube-ovn-controller logs

I0524 11:05:15.035224       1 pod.go:550] handle add/update pod fake-pod/fake-pod-7b99c6d54d-wbxdb
I0524 11:05:15.109086       1 pod.go:607] sync pod fake-pod/fake-pod-7b99c6d54d-wbxdb allocated
I0524 11:05:15.109119       1 ipam.go:60] allocate v4 10.16.233.106, v6 , mac 00:00:00:64:E6:A5 for fake-pod/fake-pod-7b99c6d54d-wbxdb from subnet ovn-default
E0524 11:05:15.111178       1 ovn-nb-logical_switch_port.go:91] get logical switch port fake-pod-7b99c6d54d-wbxdb.fake-pod: not connected
E0524 11:05:15.111212       1 pod.go:695] get logical switch port fake-pod-7b99c6d54d-wbxdb.fake-pod: not connected
E0524 11:05:15.111238       1 pod.go:405] error syncing 'fake-pod/fake-pod-7b99c6d54d-wbxdb': get logical switch port fake-pod-7b99c6d54d-wbxdb.fake-pod: not connected, requeuing
I0524 11:05:15.111281       1 event.go:298] Event(v1.ObjectReference{Kind:"Pod", Namespace:"fake-pod", Name:"fake-pod-7b99c6d54d-wbxdb", UID:"6ad08145-d783-4adb-89e4-fcfb427535b7", APIVersion:"v1", ResourceVersion:"452827617", FieldPath:""}): type: 'Warning' reason: 'CreateOVNPortFailed' get logical switch port fake-pod-7b99c6d54d-wbxdb.fake-pod: not connected

Steps To Reproduce

Use kwok to simulate 3,000 fake nodes and scale the Pods out to 60,000 (a sample scale command is sketched after the resource specs below).
kube-ovn-controller resources: 8 CPU / 8Gi

        resources:
          limits:
            cpu: "8"
            memory: 8Gi
          requests:
            cpu: 200m
            memory: 200Mi

ovn-central: 3 replicas

        resources:
          limits:
            cpu: "4"
            memory: 8Gi
          requests:
            cpu: 300m
            memory: 300Mi
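For reference, the scale-out step itself is an ordinary Deployment scale. A minimal sketch, assuming the workload is the fake-pod Deployment in the fake-pod namespace seen in the kube-ovn-controller logs above (names inferred from the logs, not from a verified manifest):

        # scale the kwok-backed workload from 40k to 60k replicas
        kubectl -n fake-pod scale deployment fake-pod --replicas=60000

        # count Pods that are still Pending (not yet scheduled / sandbox not ready)
        kubectl -n fake-pod get pods --field-selector=status.phase=Pending --no-headers | wc -l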

Current Behavior

Pod IPs are not allocated properly.

Expected Behavior

Pod IPs are allocated properly.

cmdy added the bug (Something isn't working) label on May 24, 2024
zhangzujian added the performance (Anything that can make Kube-OVN faster) label and removed the bug (Something isn't working) label on May 24, 2024
bobz965 (Collaborator) commented May 24, 2024

@cmdy Are you really running physical-machine clusters at this scale already?

cmdy (Author) commented May 24, 2024

@cmdy Are you really running physical-machine clusters at this scale already?

Yes, pretty much. Our largest single production cluster already has more than 1,700 physical machines, and we have recently been validating Kube-OVN with the plan of adopting this solution.

cmdy (Author) commented May 24, 2024

ovn-northd.log

2024-05-24T03:33:30.635Z|283076|northd|ERR|Dropped 28983 log messages in last 72 seconds (most recently, 24 seconds ago) due to excessive rate
2024-05-24T03:33:30.636Z|283077|northd|ERR|lport fake-pod-7b99c6d54d-r87tb.fake-pod in port group node.kwok.node.545 not found.
2024-05-24T03:33:30.762Z|283078|inc_proc_eng|INFO|node: northd, recompute (forced) took 10572ms
2024-05-24T03:33:31.913Z|283079|inc_proc_eng|INFO|node: lflow, recompute (forced) took 860ms
2024-05-24T03:33:32.132Z|283080|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:33:32.193Z|283081|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:33:32.213Z|283082|timeval|WARN|Unreasonably long 12024ms poll interval (12016ms user, 7ms system)
2024-05-24T03:33:32.213Z|283083|timeval|WARN|faults: 1275 minor, 0 major
2024-05-24T03:33:32.213Z|283084|timeval|WARN|context switches: 0 voluntary, 17 involuntary
2024-05-24T03:33:32.214Z|283085|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 3 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:33:32.215Z|283086|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (10.56.64.18:24322<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:33:32.216Z|283087|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:33:32.216Z|283088|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:33:32.216Z|283089|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (100% CPU usage)
2024-05-24T03:33:32.216Z|283090|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:33:33.217Z|283091|poll_loop|INFO|wakeup due to 1001-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:33:33.217Z|283092|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:33:33.217Z|283093|reconnect|INFO|tcp:[10.56.64.16]:6642: connecting...
2024-05-24T03:33:33.217Z|283094|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:33:33.218Z|283095|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:33:33.231Z|283096|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:33.231Z|283097|reconnect|INFO|tcp:[10.56.64.16]:6642: connection attempt timed out
2024-05-24T03:33:33.231Z|283098|reconnect|INFO|tcp:[10.56.64.16]:6642: waiting 2 seconds before reconnect
2024-05-24T03:33:33.238Z|283099|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:33.238Z|283100|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:33:33.238Z|283101|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:33:35.233Z|283102|reconnect|INFO|tcp:[10.56.64.17]:6642: connecting...
2024-05-24T03:33:35.233Z|283103|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:33:35.236Z|283104|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:33:35.236Z|283105|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:35.236Z|283106|reconnect|INFO|tcp:[10.56.64.17]:6642: connection attempt timed out
2024-05-24T03:33:35.236Z|283107|reconnect|INFO|tcp:[10.56.64.17]:6642: waiting 4 seconds before reconnect
2024-05-24T03:33:35.236Z|283108|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:33:35.237Z|283109|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:33:35.237Z|283110|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:33:35.238Z|283111|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:35.238Z|283112|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:33:35.238Z|283113|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:33:39.237Z|283114|reconnect|INFO|tcp:[10.56.64.18]:6642: connecting...
2024-05-24T03:33:39.237Z|283115|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:33:39.237Z|283116|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:33:39.239Z|283117|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:33:39.239Z|283118|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:33:50.111Z|283119|inc_proc_eng|INFO|node: northd, recompute (forced) took 10843ms
2024-05-24T03:33:51.256Z|283120|inc_proc_eng|INFO|node: lflow, recompute (forced) took 860ms
2024-05-24T03:33:51.473Z|283121|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:33:51.538Z|283122|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:33:51.556Z|283123|timeval|WARN|Unreasonably long 12288ms poll interval (12250ms user, 38ms system)
2024-05-24T03:33:51.556Z|283124|timeval|WARN|faults: 1273 minor, 0 major
2024-05-24T03:33:51.556Z|283125|timeval|WARN|disk: 0 reads, 8 writes
2024-05-24T03:33:51.556Z|283126|timeval|WARN|context switches: 0 voluntary, 24 involuntary
2024-05-24T03:33:51.556Z|283127|coverage|INFO|Dropped 1 log messages in last 20 seconds (most recently, 20 seconds ago) due to excessive rate
2024-05-24T03:33:51.556Z|283128|coverage|INFO|Skipping details of duplicate event coverage for hash=ab6315f5
2024-05-24T03:33:51.556Z|283129|poll_loop|INFO|Dropped 3 log messages in last 19 seconds (most recently, 19 seconds ago) due to excessive rate
2024-05-24T03:33:51.556Z|283130|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 3 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (97% CPU usage)
2024-05-24T03:33:51.556Z|283131|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (10.56.64.18:24328<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (97% CPU usage)
2024-05-24T03:33:51.558Z|283132|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:33:51.558Z|283133|reconnect|INFO|tcp:[10.56.64.18]:6642: continuing to reconnect in the background but suppressing further logging
2024-05-24T03:33:51.558Z|283134|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:33:51.559Z|283135|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (97% CPU usage)
2024-05-24T03:33:51.559Z|283136|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:33:52.560Z|283137|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:33:52.560Z|283138|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:33:52.571Z|283139|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:52.571Z|283140|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:33:52.571Z|283141|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:33:54.572Z|283142|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:33:54.572Z|283143|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:33:54.583Z|283144|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:54.583Z|283145|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:33:54.583Z|283146|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:33:58.584Z|283147|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:33:58.584Z|283148|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:33:59.559Z|283149|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:33:59.560Z|283150|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:33:59.560Z|283151|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:33:59.560Z|283152|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:34:07.564Z|283153|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:34:07.565Z|283154|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:34:07.566Z|283155|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:34:07.566Z|283156|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.

2024-05-24T03:36:29.427Z|283343|inc_proc_eng|INFO|node: northd, recompute (forced) took 10600ms
2024-05-24T03:36:30.527Z|283344|inc_proc_eng|INFO|node: lflow, recompute (forced) took 808ms
2024-05-24T03:36:30.755Z|283345|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:36:30.823Z|283346|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:36:30.841Z|283347|timeval|WARN|Unreasonably long 12014ms poll interval (12002ms user, 11ms system)
2024-05-24T03:36:30.841Z|283348|timeval|WARN|faults: 1650 minor, 0 major
2024-05-24T03:36:30.841Z|283349|timeval|WARN|context switches: 0 voluntary, 14 involuntary
2024-05-24T03:36:30.841Z|283350|poll_loop|INFO|wakeup due to [POLLIN] on fd 3 (10.56.64.18:24380<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:36:30.842Z|283351|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:36:30.842Z|283352|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:30.843Z|283353|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (100% CPU usage)
2024-05-24T03:36:30.843Z|283354|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:36:31.843Z|283355|poll_loop|INFO|wakeup due to 999-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:36:31.843Z|283356|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:36:31.843Z|283357|reconnect|INFO|tcp:[10.56.64.16]:6642: connecting...
2024-05-24T03:36:31.843Z|283358|poll_loop|INFO|wakeup due to [POLLOUT] on fd 3 (10.56.64.18:50246<->10.56.64.16:6641) at ../lib/stream-fd.c:153 (100% CPU usage)
2024-05-24T03:36:31.843Z|283359|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:36:31.844Z|283360|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:36:31.858Z|283361|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:31.858Z|283362|reconnect|INFO|tcp:[10.56.64.16]:6642: connection attempt timed out
2024-05-24T03:36:31.858Z|283363|reconnect|INFO|tcp:[10.56.64.16]:6642: waiting 2 seconds before reconnect
2024-05-24T03:36:31.864Z|283364|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:31.864Z|283365|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:36:31.864Z|283366|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:36:33.859Z|283367|reconnect|INFO|tcp:[10.56.64.17]:6642: connecting...
2024-05-24T03:36:33.859Z|283368|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:36:33.860Z|283369|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:36:33.860Z|283370|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:33.860Z|283371|reconnect|INFO|tcp:[10.56.64.17]:6642: connection attempt timed out
2024-05-24T03:36:33.860Z|283372|reconnect|INFO|tcp:[10.56.64.17]:6642: waiting 4 seconds before reconnect
2024-05-24T03:36:33.860Z|283373|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:33.863Z|283374|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:36:33.863Z|283375|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:36:33.864Z|283376|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:33.864Z|283377|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:36:33.864Z|283378|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:36:37.861Z|283379|reconnect|INFO|tcp:[10.56.64.18]:6642: connecting...
2024-05-24T03:36:37.861Z|283380|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:36:37.863Z|283381|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:36:37.864Z|283382|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:36:37.864Z|283383|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:36:49.551Z|283384|inc_proc_eng|INFO|node: northd, recompute (forced) took 10650ms
2024-05-24T03:36:50.660Z|283385|inc_proc_eng|INFO|node: lflow, recompute (forced) took 820ms
2024-05-24T03:36:50.894Z|283386|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:36:50.950Z|283387|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:36:50.972Z|283388|timeval|WARN|Unreasonably long 12072ms poll interval (12065ms user, 6ms system)
2024-05-24T03:36:50.972Z|283389|timeval|WARN|faults: 378 minor, 0 major
2024-05-24T03:36:50.972Z|283390|timeval|WARN|disk: 0 reads, 8 writes
2024-05-24T03:36:50.972Z|283391|timeval|WARN|context switches: 0 voluntary, 13 involuntary
2024-05-24T03:36:50.972Z|283392|coverage|INFO|Dropped 1 log messages in last 20 seconds (most recently, 20 seconds ago) due to excessive rate
2024-05-24T03:36:50.972Z|283393|coverage|INFO|Skipping details of duplicate event coverage for hash=ab6315f5
2024-05-24T03:36:50.972Z|283394|poll_loop|INFO|Dropped 2 log messages in last 19 seconds (most recently, 19 seconds ago) due to excessive rate
2024-05-24T03:36:50.975Z|283395|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 3 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (86% CPU usage)
2024-05-24T03:36:50.977Z|283396|poll_loop|INFO|wakeup due to [POLLIN] on fd 17 (10.56.64.18:24388<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (86% CPU usage)
2024-05-24T03:36:50.980Z|283397|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:36:50.980Z|283398|reconnect|INFO|tcp:[10.56.64.18]:6642: continuing to reconnect in the background but suppressing further logging
2024-05-24T03:36:50.980Z|283399|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:36:50.981Z|283400|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (86% CPU usage)
2024-05-24T03:36:50.981Z|283401|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:36:51.981Z|283402|poll_loop|INFO|wakeup due to 1000-ms timeout at ../lib/reconnect.c:677 (86% CPU usage)
2024-05-24T03:36:51.981Z|283403|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:36:51.981Z|283404|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:36:51.994Z|283405|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:51.994Z|283406|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:36:51.994Z|283407|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:36:53.996Z|283408|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:36:53.996Z|283409|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:36:54.007Z|283410|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:54.007Z|283411|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:36:54.007Z|283412|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:36:58.008Z|283413|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:36:58.008Z|283414|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:36:58.981Z|283415|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:36:58.983Z|283416|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:36:58.984Z|283417|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:36:58.984Z|283418|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:06.987Z|283419|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:37:06.988Z|283420|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:37:06.988Z|283421|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:06.988Z|283422|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:14.989Z|283423|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:37:23.377Z|283424|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:37:23.867Z|283425|ovn_util|WARN|Dropped 21737 log messages in last 87 seconds (most recently, 35 seconds ago) due to excessive rate
2024-05-24T03:37:23.867Z|283426|ovn_util|WARN|all port tunnel ids exhausted
2024-05-24T03:37:33.870Z|283427|northd|ERR|Dropped 43475 log messages in last 88 seconds (most recently, 44 seconds ago) due to excessive rate
2024-05-24T03:37:33.871Z|283428|northd|ERR|lport fake-pod-7b99c6d54d-r87tb.fake-pod in port group node.kwok.node.545 not found.
2024-05-24T03:37:33.987Z|283429|inc_proc_eng|INFO|node: northd, recompute (forced) took 10610ms
2024-05-24T03:37:35.123Z|283430|inc_proc_eng|INFO|node: lflow, recompute (forced) took 846ms
2024-05-24T03:37:35.362Z|283431|jsonrpc|WARN|tcp:[10.56.64.18]:6642: send error: Broken pipe
2024-05-24T03:37:35.422Z|283432|ovn_northd|INFO|OVNSB commit failed, force recompute next time.
2024-05-24T03:37:35.442Z|283433|timeval|WARN|Unreasonably long 12064ms poll interval (12050ms user, 14ms system)
2024-05-24T03:37:35.442Z|283434|timeval|WARN|faults: 1405 minor, 0 major
2024-05-24T03:37:35.442Z|283435|timeval|WARN|disk: 0 reads, 8 writes
2024-05-24T03:37:35.442Z|283436|timeval|WARN|context switches: 0 voluntary, 12 involuntary
2024-05-24T03:37:35.442Z|283437|poll_loop|INFO|Dropped 3 log messages in last 44 seconds (most recently, 44 seconds ago) due to excessive rate
2024-05-24T03:37:35.444Z|283438|poll_loop|INFO|wakeup due to [POLLIN][POLLHUP] on fd 17 (/var/run/ovn/ovn-northd.595.ctl<->) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:37:35.444Z|283439|poll_loop|INFO|wakeup due to [POLLIN] on fd 3 (10.56.64.18:24396<->10.56.64.18:6641) at ../lib/stream-fd.c:157 (100% CPU usage)
2024-05-24T03:37:35.445Z|283440|reconnect|WARN|tcp:[10.56.64.18]:6642: connection dropped (Broken pipe)
2024-05-24T03:37:35.445Z|283441|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:35.446Z|283442|poll_loop|INFO|wakeup due to 0-ms timeout at tcp:[10.56.64.18]:6641 (100% CPU usage)
2024-05-24T03:37:35.446Z|283443|reconnect|INFO|tcp:[10.56.64.18]:6641: connection closed by peer
2024-05-24T03:37:36.445Z|283444|poll_loop|INFO|wakeup due to 999-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:37:36.445Z|283445|reconnect|INFO|tcp:[10.56.64.16]:6642: connecting...
2024-05-24T03:37:36.446Z|283446|poll_loop|INFO|wakeup due to [POLLOUT] on fd 3 (10.56.64.18:27608<->10.56.64.16:6642) at ../lib/stream-fd.c:153 (100% CPU usage)
2024-05-24T03:37:36.446Z|283447|reconnect|INFO|tcp:[10.56.64.16]:6642: connected
2024-05-24T03:37:36.448Z|283448|poll_loop|INFO|wakeup due to 0-ms timeout at ../lib/reconnect.c:677 (100% CPU usage)
2024-05-24T03:37:36.448Z|283449|reconnect|INFO|tcp:[10.56.64.16]:6641: connecting...
2024-05-24T03:37:36.457Z|283450|ovsdb_cs|INFO|tcp:[10.56.64.16]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:36.457Z|283451|reconnect|INFO|tcp:[10.56.64.16]:6642: connection attempt timed out
2024-05-24T03:37:36.457Z|283452|reconnect|INFO|tcp:[10.56.64.16]:6642: waiting 2 seconds before reconnect
2024-05-24T03:37:36.457Z|283453|poll_loop|INFO|wakeup due to [POLLOUT] on fd 17 (10.56.64.18:50250<->10.56.64.16:6641) at ../lib/stream-fd.c:153 (100% CPU usage)
2024-05-24T03:37:36.457Z|283454|reconnect|INFO|tcp:[10.56.64.16]:6641: connected
2024-05-24T03:37:36.467Z|283455|ovsdb_cs|INFO|tcp:[10.56.64.16]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:36.467Z|283456|reconnect|INFO|tcp:[10.56.64.16]:6641: connection attempt timed out
2024-05-24T03:37:36.467Z|283457|reconnect|INFO|tcp:[10.56.64.16]:6641: waiting 2 seconds before reconnect
2024-05-24T03:37:38.458Z|283458|reconnect|INFO|tcp:[10.56.64.17]:6642: connecting...
2024-05-24T03:37:38.459Z|283459|reconnect|INFO|tcp:[10.56.64.17]:6642: connected
2024-05-24T03:37:38.459Z|283460|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.
2024-05-24T03:37:38.459Z|283461|ovsdb_cs|INFO|tcp:[10.56.64.17]:6642: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:38.459Z|283462|reconnect|INFO|tcp:[10.56.64.17]:6642: connection attempt timed out
2024-05-24T03:37:38.459Z|283463|reconnect|INFO|tcp:[10.56.64.17]:6642: waiting 4 seconds before reconnect
2024-05-24T03:37:38.459Z|283464|ovn_northd|INFO|ovn-northd lock lost. This ovn-northd instance is now on standby.
2024-05-24T03:37:38.468Z|283465|reconnect|INFO|tcp:[10.56.64.17]:6641: connecting...
2024-05-24T03:37:38.468Z|283466|reconnect|INFO|tcp:[10.56.64.17]:6641: connected
2024-05-24T03:37:38.468Z|283467|ovsdb_cs|INFO|tcp:[10.56.64.17]:6641: clustered database server is not cluster leader; trying another server
2024-05-24T03:37:38.468Z|283468|reconnect|INFO|tcp:[10.56.64.17]:6641: connection attempt timed out
2024-05-24T03:37:38.468Z|283469|reconnect|INFO|tcp:[10.56.64.17]:6641: waiting 4 seconds before reconnect
2024-05-24T03:37:42.459Z|283470|reconnect|INFO|tcp:[10.56.64.18]:6642: connecting...
2024-05-24T03:37:42.460Z|283471|reconnect|INFO|tcp:[10.56.64.18]:6642: connected
2024-05-24T03:37:42.469Z|283472|reconnect|INFO|tcp:[10.56.64.18]:6641: connecting...
2024-05-24T03:37:42.469Z|283473|reconnect|INFO|tcp:[10.56.64.18]:6641: connected
2024-05-24T03:37:46.215Z|283474|ovn_northd|INFO|ovn-northd lock acquired. This ovn-northd instance is now active.

ovsdb-server-nb.log

2024-05-24T03:39:19.604Z|132818|timeval|WARN|Unreasonably long 1064ms poll interval (1055ms user, 7ms system)
2024-05-24T03:39:19.604Z|132819|timeval|WARN|faults: 1275 minor, 0 major
2024-05-24T03:39:19.604Z|132820|timeval|WARN|context switches: 0 voluntary, 3 involuntary
2024-05-24T03:39:19.604Z|132821|reconnect|ERR|tcp:10.56.64.17:29954: no response to inactivity probe after 5.18 seconds, disconnecting
2024-05-24T03:39:23.551Z|132822|timeval|WARN|Unreasonably long 1001ms poll interval (995ms user, 5ms system)
2024-05-24T03:39:23.551Z|132823|timeval|WARN|faults: 1278 minor, 0 major
2024-05-24T03:39:23.551Z|132824|timeval|WARN|context switches: 0 voluntary, 2 involuntary
2024-05-24T03:39:25.467Z|132825|reconnect|ERR|tcp:10.56.64.16:19996: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:39:27.612Z|132826|timeval|WARN|Unreasonably long 1055ms poll interval (1047ms user, 6ms system)
2024-05-24T03:39:27.612Z|132827|timeval|WARN|faults: 846 minor, 0 major
2024-05-24T03:39:27.612Z|132828|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:39:31.655Z|132829|timeval|WARN|Unreasonably long 1006ms poll interval (996ms user, 7ms system)
2024-05-24T03:39:31.655Z|132830|timeval|WARN|faults: 843 minor, 0 major
2024-05-24T03:39:31.655Z|132831|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:39:35.719Z|132832|timeval|WARN|Unreasonably long 1064ms poll interval (1060ms user, 4ms system)
2024-05-24T03:39:35.719Z|132833|timeval|WARN|faults: 846 minor, 0 major
2024-05-24T03:39:35.719Z|132834|timeval|WARN|context switches: 0 voluntary, 8 involuntary
2024-05-24T03:39:43.722Z|132835|timeval|WARN|Unreasonably long 1054ms poll interval (1049ms user, 4ms system)
2024-05-24T03:39:43.722Z|132836|timeval|WARN|faults: 812 minor, 0 major
2024-05-24T03:39:43.722Z|132837|timeval|WARN|context switches: 0 voluntary, 5 involuntary
2024-05-24T03:39:43.722Z|132838|reconnect|ERR|tcp:10.56.64.17:29956: no response to inactivity probe after 5.37 seconds, disconnecting
2024-05-24T03:39:47.684Z|132839|timeval|WARN|Unreasonably long 1010ms poll interval (1004ms user, 6ms system)
2024-05-24T03:39:47.685Z|132840|timeval|WARN|faults: 833 minor, 0 major
2024-05-24T03:39:47.685Z|132841|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:39:51.734Z|132842|timeval|WARN|Unreasonably long 1053ms poll interval (1045ms user, 6ms system)
2024-05-24T03:39:51.734Z|132843|timeval|WARN|faults: 833 minor, 0 major
2024-05-24T03:39:51.734Z|132844|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:39:53.735Z|132845|reconnect|ERR|tcp:10.56.64.16:20008: no response to inactivity probe after 5.01 seconds, disconnecting
2024-05-24T03:39:55.704Z|132846|timeval|WARN|Unreasonably long 1017ms poll interval (1012ms user, 5ms system)
2024-05-24T03:39:55.704Z|132847|timeval|WARN|faults: 838 minor, 0 major
2024-05-24T03:39:55.704Z|132848|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:39:59.788Z|132849|timeval|WARN|Unreasonably long 1096ms poll interval (1091ms user, 3ms system)
2024-05-24T03:39:59.788Z|132850|timeval|WARN|faults: 834 minor, 0 major
2024-05-24T03:39:59.788Z|132851|timeval|WARN|context switches: 0 voluntary, 2 involuntary
2024-05-24T03:40:01.737Z|132852|reconnect|ERR|tcp:10.56.64.18:24442: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:40:03.731Z|132853|timeval|WARN|Unreasonably long 1033ms poll interval (1027ms user, 7ms system)
2024-05-24T03:40:03.731Z|132854|timeval|WARN|faults: 834 minor, 0 major
2024-05-24T03:40:03.731Z|132855|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:07.776Z|132856|timeval|WARN|Unreasonably long 1071ms poll interval (1061ms user, 10ms system)
2024-05-24T03:40:07.776Z|132857|timeval|WARN|faults: 835 minor, 0 major
2024-05-24T03:40:07.776Z|132858|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:40:07.776Z|132859|reconnect|ERR|tcp:10.56.64.17:29964: no response to inactivity probe after 5.62 seconds, disconnecting
2024-05-24T03:40:12.777Z|132860|reconnect|ERR|tcp:10.56.64.16:20022: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:40:15.777Z|132861|timeval|WARN|Unreasonably long 1060ms poll interval (1051ms user, 9ms system)
2024-05-24T03:40:15.777Z|132862|timeval|WARN|faults: 794 minor, 0 major
2024-05-24T03:40:15.777Z|132863|timeval|WARN|context switches: 0 voluntary, 2 involuntary
2024-05-24T03:40:15.778Z|132864|coverage|INFO|Dropped 12 log messages in last 56 seconds (most recently, 8 seconds ago) due to excessive rate
2024-05-24T03:40:15.778Z|132865|coverage|INFO|Skipping details of duplicate event coverage for hash=77dac2d1
2024-05-24T03:40:19.726Z|132866|timeval|WARN|Unreasonably long 1002ms poll interval (996ms user, 7ms system)
2024-05-24T03:40:19.726Z|132867|timeval|WARN|faults: 690 minor, 0 major
2024-05-24T03:40:19.726Z|132868|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:22.250Z|132869|reconnect|ERR|tcp:10.56.64.18:24458: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:40:23.810Z|132870|timeval|WARN|Unreasonably long 1066ms poll interval (1061ms user, 4ms system)
2024-05-24T03:40:23.810Z|132871|timeval|WARN|faults: 685 minor, 0 major
2024-05-24T03:40:23.810Z|132872|timeval|WARN|context switches: 0 voluntary, 7 involuntary
2024-05-24T03:40:27.770Z|132873|timeval|WARN|Unreasonably long 1020ms poll interval (1015ms user, 5ms system)
2024-05-24T03:40:27.770Z|132874|timeval|WARN|faults: 473 minor, 0 major
2024-05-24T03:40:27.770Z|132875|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:31.954Z|132876|timeval|WARN|Unreasonably long 1108ms poll interval (1100ms user, 8ms system)
2024-05-24T03:40:31.954Z|132877|timeval|WARN|faults: 801 minor, 0 major
2024-05-24T03:40:31.954Z|132878|timeval|WARN|context switches: 0 voluntary, 1 involuntary
2024-05-24T03:40:31.954Z|132879|reconnect|ERR|tcp:10.56.64.17:29966: no response to inactivity probe after 5.77 seconds, disconnecting

ovsdb-server-sb.log

2024-05-24T03:41:03.258Z|58044|jsonrpc|WARN|tcp:10.16.0.2:35724: receive error: Connection reset by peer
2024-05-24T03:41:03.258Z|58045|reconnect|WARN|tcp:10.16.0.2:35724: connection dropped (Connection reset by peer)
2024-05-24T03:41:05.466Z|58046|reconnect|ERR|tcp:10.56.64.16:24968: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:11.644Z|58047|reconnect|ERR|tcp:10.56.64.17:9754: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:22.055Z|58048|jsonrpc|WARN|tcp:10.16.0.2:35730: receive error: Connection reset by peer
2024-05-24T03:41:22.055Z|58049|reconnect|WARN|tcp:10.16.0.2:35730: connection dropped (Connection reset by peer)
2024-05-24T03:41:27.614Z|58050|reconnect|ERR|tcp:10.56.64.16:24970: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:34.818Z|58051|reconnect|ERR|tcp:10.56.64.17:9756: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:40.054Z|58052|jsonrpc|WARN|tcp:10.16.0.2:35734: receive error: Connection reset by peer
2024-05-24T03:41:40.054Z|58053|reconnect|WARN|tcp:10.16.0.2:35734: connection dropped (Connection reset by peer)
2024-05-24T03:41:41.840Z|58054|reconnect|ERR|tcp:10.56.64.18:8978: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:51.101Z|58055|reconnect|ERR|tcp:10.56.64.16:24972: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:56.430Z|58056|reconnect|ERR|tcp:10.56.64.17:9758: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:41:58.658Z|58057|jsonrpc|WARN|tcp:10.16.0.2:35738: receive error: Connection reset by peer
2024-05-24T03:41:58.658Z|58058|reconnect|WARN|tcp:10.16.0.2:35738: connection dropped (Connection reset by peer)
2024-05-24T03:42:10.905Z|58059|reconnect|ERR|tcp:10.56.64.16:24974: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:16.955Z|58060|jsonrpc|WARN|tcp:10.16.0.2:35742: receive error: Connection reset by peer
2024-05-24T03:42:16.955Z|58061|reconnect|WARN|tcp:10.16.0.2:35742: connection dropped (Connection reset by peer)
2024-05-24T03:42:19.908Z|58062|reconnect|ERR|tcp:10.56.64.17:9760: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:25.899Z|58063|reconnect|ERR|tcp:10.56.64.18:8998: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:34.622Z|58064|reconnect|ERR|tcp:10.56.64.16:24976: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:35.160Z|58065|jsonrpc|WARN|tcp:10.16.0.2:35746: receive error: Connection reset by peer
2024-05-24T03:42:35.160Z|58066|reconnect|WARN|tcp:10.16.0.2:35746: connection dropped (Connection reset by peer)
2024-05-24T03:42:40.127Z|58067|reconnect|ERR|tcp:10.56.64.17:9762: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:49.098Z|58068|reconnect|ERR|tcp:10.56.64.18:9008: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:42:53.460Z|58069|jsonrpc|WARN|tcp:10.16.0.2:35750: receive error: Connection reset by peer
2024-05-24T03:42:53.460Z|58070|reconnect|WARN|tcp:10.16.0.2:35750: connection dropped (Connection reset by peer)
2024-05-24T03:42:54.784Z|58071|reconnect|ERR|tcp:10.56.64.16:24978: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:04.102Z|58072|reconnect|ERR|tcp:10.56.64.17:9764: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:12.859Z|58073|jsonrpc|WARN|tcp:10.16.0.2:35754: receive error: Connection reset by peer
2024-05-24T03:43:12.859Z|58074|reconnect|WARN|tcp:10.16.0.2:35754: connection dropped (Connection reset by peer)
2024-05-24T03:43:18.411Z|58075|reconnect|ERR|tcp:10.56.64.16:24980: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:23.613Z|58076|reconnect|ERR|tcp:10.56.64.17:9766: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:31.133Z|58077|reconnect|ERR|tcp:10.56.64.18:9028: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:31.655Z|58078|jsonrpc|WARN|tcp:10.16.0.2:35758: receive error: Connection reset by peer
2024-05-24T03:43:31.655Z|58079|reconnect|WARN|tcp:10.16.0.2:35758: connection dropped (Connection reset by peer)
2024-05-24T03:43:37.376Z|58080|reconnect|ERR|tcp:10.56.64.16:24982: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:47.256Z|58081|reconnect|ERR|tcp:10.56.64.17:9768: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:43:50.057Z|58082|jsonrpc|WARN|tcp:10.16.0.2:35762: receive error: Connection reset by peer
2024-05-24T03:43:50.057Z|58083|reconnect|WARN|tcp:10.16.0.2:35762: connection dropped (Connection reset by peer)
2024-05-24T03:44:00.688Z|58084|reconnect|ERR|tcp:10.56.64.16:24984: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:44:06.345Z|58085|reconnect|ERR|tcp:10.56.64.17:9770: no response to inactivity probe after 5 seconds, disconnecting
2024-05-24T03:44:08.262Z|58086|jsonrpc|WARN|tcp:10.16.0.2:35766: receive error: Connection reset by peer
2024-05-24T03:44:08.263Z|58087|reconnect|WARN|tcp:10.16.0.2:35766: connection dropped (Connection reset by peer)
2024-05-24T03:44:15.098Z|58088|reconnect|ERR|tcp:10.56.64.18:9046: no response to inactivity probe after 5 seconds, disconnecting

changluyi (Collaborator) commented

It looks like the port tunnel ids have been exhausted. If you are using Geneve, there should be 2^16 of them.

cmdy (Author) commented May 24, 2024

It looks like the port tunnel ids have been exhausted. If you are using Geneve, there should be 2^16 of them.

I am testing with Geneve at the moment. Which resource is this tunnel id tied to? Is there a way to work around it once the ids are exhausted, or is this already the hard upper limit? Also, how many tunnel ids are available in VXLAN mode?
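For context, the per-port tunnel id that ovn-northd complains about corresponds to the tunnel_key column of the Port_Binding rows in the OVN southbound database, and each key is allocated within a single datapath (one logical switch). A rough way to look at how many keys are in use, assuming ovn-sbctl is run from inside an ovn-central pod (the pod name is a placeholder; --columns is standard ovsdb ctl syntax but noted here as an assumption):

        # total Port_Binding rows, i.e. port tunnel keys currently allocated
        kubectl -n kube-system exec -it <ovn-central-pod> -- ovn-sbctl list Port_Binding | grep -c '^_uuid'

        # show which datapath each tunnel_key belongs to
        kubectl -n kube-system exec -it <ovn-central-pod> -- ovn-sbctl --columns=datapath,tunnel_key list Port_Binding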

cmdy changed the title from "[BUG] In a large-scale cluster with 3,000 nodes, Pods fail to be assigned IPs when scaling from 40k to 60k Pods" to "In a large-scale cluster with 3,000 nodes, Pods fail to be assigned IPs when scaling from 40k to 60k Pods" on May 24, 2024
oilbeater (Collaborator) commented

@cmdy https://kubeovn.github.io/docs/v1.12.x/reference/tunnel-protocol/#vxlan With VXLAN the number is even smaller: 4096 ports per datapath.

oilbeater added the documents (Need documents) label on Jun 4, 2024
oilbeater (Collaborator) commented

According to the OVN architecture documentation, Geneve supports at most 2^15 ports per datapath.
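Since the port tunnel key is scoped to a single datapath and every Kube-OVN Subnet is backed by its own logical switch (its own datapath), 60,000 Pods on the single ovn-default subnet already exceeds the 2^15 per-datapath budget, which matches the "all port tunnel ids exhausted" message in the ovn-northd log above. One possible mitigation is to spread the workload across several Subnets. A minimal sketch of an extra Subnet bound to a dedicated namespace (the CIDR, gateway, and namespace values are illustrative, not taken from this cluster):

        apiVersion: kubeovn.io/v1
        kind: Subnet
        metadata:
          name: fake-pod-2
        spec:
          protocol: IPv4
          cidrBlock: 10.17.0.0/16
          gateway: 10.17.0.1
          namespaces:
            - fake-pod-2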
