etcd compatibility issues #12743

Open
elysian-gc opened this issue Mar 20, 2024 · 0 comments

What happened?

Related to #12255. I compiled from tag v5.6.0-rc.1, which contains commit 86d1f8f, but it still does not work correctly.

When etcd goes down for a few seconds and is then started again, the cluster ends up damaged.

What did you expect to happen?

No matter how long etcd is stopped, the EMQX cluster should recover after etcd restarts.

How can we reproduce it (as minimally and precisely as possible)?

  1. Boot a core node and a replicant node with etcd.
  2. Stop etcd.
  3. Start etcd again after 30 seconds.
  4. Connect an MQTT client to the replicant.
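
Steps 2 to 4 above can be sketched as a small script. This is only an illustration under assumptions that are not part of the report: etcd is assumed to run as a systemd unit named `etcd`, `mosquitto_pub` is assumed to be the MQTT client, and `REPLICANT_HOST`, `test/topic`, and the `run`/`reproduce` helpers are hypothetical names. The core and replicant nodes from step 1 are expected to be running already. By default the script only prints the commands (`DRY_RUN=1`).

```shell
#!/usr/bin/env bash
# Hypothetical reproduction sketch for steps 2-4; not taken from the report.
set -euo pipefail

ETCD_UNIT="${ETCD_UNIT:-etcd}"                  # assumed systemd unit name
REPLICANT_HOST="${REPLICANT_HOST:-127.0.0.1}"   # assumed replicant address
DOWNTIME="${DOWNTIME:-30}"                      # seconds etcd stays down (step 3)
DRY_RUN="${DRY_RUN:-1}"                         # 1 = print only, 0 = execute

run() {
  # Print the command; execute it only when DRY_RUN=0.
  echo "+ $*"
  if [ "$DRY_RUN" = "0" ]; then "$@"; fi
}

reproduce() {
  run systemctl stop "$ETCD_UNIT"     # step 2: stop etcd
  run sleep "$DOWNTIME"               # step 3: keep it down for a while
  run systemctl start "$ETCD_UNIT"    # step 3: start etcd again
  # step 4: connect an MQTT client to the replicant's TCP listener (1883)
  run mosquitto_pub -h "$REPLICANT_HOST" -p 1883 -t test/topic -m hello
}

reproduce
```

Run it with `DRY_RUN=0` on the etcd host to actually execute the commands; if etcd and the replicant live on different machines, the last step has to be run from a host that can reach the replicant.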

Anything else we need to know?

After stopping and starting etcd, the etcd key-value store is empty.
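
One way to check this is to list all keys with `etcdctl` and count them. The sketch below is an assumption about how the check could be done, not the command used in the report; `count_keys` is a hypothetical helper, and the endpoint is the etcd address that appears in the logs.

```shell
#!/usr/bin/env bash
# Hypothetical helper: count the non-empty lines read from stdin.
# Blank lines in the etcdctl output are ignored.
set -euo pipefail

count_keys() {
  # grep -c exits non-zero when nothing matches, so "|| true" keeps
  # the script alive and the function still prints 0.
  grep -c . || true
}

# Usage against a live etcd (not executed by this sketch):
#   etcdctl --endpoints=http://10.61.3.143:2379 get "" --prefix --keys-only | count_keys
```

A result of `0` after the etcd restart would match the empty key-value store described above.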


EMQX version

[root@ip-10-61-3-165 emqx]# ./bin/emqx_ctl broker
sysdescr  : EMQX
version   : 5.6.0-rc.1
datetime  : 2024-03-20T08:06:48.032697506+00:00
uptime    : 1 minutes, 40 seconds

OS version

# On Linux:
$ cat /etc/os-release
NAME="Amazon Linux"
VERSION="2"
ID="amzn"
ID_LIKE="centos rhel fedora"
VERSION_ID="2"
PRETTY_NAME="Amazon Linux 2"
ANSI_COLOR="0;33"
CPE_NAME="cpe:2.3:o:amazon:amazon_linux:2"
HOME_URL="https://amazonlinux.com/"
SUPPORT_END="2025-06-30"

$ uname -a
Linux ip-10-61-3-165.ec2.internal 5.10.210-201.852.amzn2.aarch64 #1 SMP Tue Feb 27 17:09:24 UTC 2024 aarch64 aarch64 aarch64 GNU/Linux

Log files

Erlang/OTP 26 [erts-14.2.1] [emqx] [64-bit] [smp:2:2] [ds:2:2:8] [async-threads:4] [jit]

Listener tcp:default on 0.0.0.0:1883 started.
Listener ssl:default on 0.0.0.0:8883 started.
Listener ws:default on 0.0.0.0:8083 started.
Listener wss:default on 0.0.0.0:8084 started.
Listener http:dashboard on :18083 started.
EMQX 5.6.0-rc.1 is running now!
Restricted Eshell V14.2.1 (press Ctrl+G to abort, type help(). for help)
2024-03-20T08:06:06.816144+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:06.817462+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:07.619207+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:09.220216+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.421216+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.820826+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: child_terminated. Reason: {shutdown,#{reason => eetcd_conn_unavailable,event => 'KeepAliveHalted',lease_id => 7587877467865066763}}. Offender: id=ekka_cluster_etcd,pid=<0.2359.0>.
2024-03-20T08:06:12.821578+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.821693+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.821849+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3492.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 252; neighbours:
2024-03-20T08:06:12.822138+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid=<0.2359.0>.
2024-03-20T08:06:12.823093+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.823236+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.823532+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3495.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.823926+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.824611+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.824703+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.824995+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3498.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.825269+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.826057+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.826161+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.826427+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3501.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.826797+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.827552+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.827651+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.827851+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3504.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.828177+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.828944+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.829039+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.829116+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3507.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.829551+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.830235+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.830356+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.830551+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3510.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.830884+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.831594+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.831701+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.831882+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3513.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.832226+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.833163+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.833258+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.833432+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3516.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.833735+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.834272+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by {shutdown,econnrefused}
2024-03-20T08:06:12.834362+00:00 [error] Failed to connect ETCD: {"10.61.3.143",2379} by {shutdown,econnrefused}
2024-03-20T08:06:12.834481+00:00 [error] crasher: initial call: ekka_cluster_etcd:init/1, pid: <0.3519.0>, registered_name: [], error: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}, ancestors: [ekka_cluster_sup,ekka_sup,<0.2356.0>], message_queue_len: 0, messages: [], links: [<0.2358.0>], dictionary: [], trap_exit: true, status: running, heap_size: 610, stack_size: 28, reductions: 250; neighbours:
2024-03-20T08:06:12.834737+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: start_error. Reason: {{badmatch,{error,[{{"10.61.3.143",2379},{shutdown,econnrefused}}]}},[{ekka_cluster_etcd,init,1,[{file,"ekka_cluster_etcd.erl"},{line,357}]},{gen_server,init_it,2,[{file,"gen_server.erl"},{line,980}]},{gen_server,init_it,6,[{file,"gen_server.erl"},{line,935}]},{proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,241}]}]}. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:12.834845+00:00 [error] Supervisor: {local,ekka_cluster_sup}. Context: shutdown. Reason: reached_max_restart_intensity. Offender: id=ekka_cluster_etcd,pid={restarting,<0.2359.0>}.
2024-03-20T08:06:19.822751+00:00 [warning] ekka_cluster_etcd failed to connect [10.61.3.143:2379] by timeout
