[Bug]: After deploying to a k8s cluster, MatrixOne cannot be connected to and a cn Pod keeps restarting #387

Open
1 task done
yanghaitao5000 opened this issue Jul 20, 2023 · 2 comments
Assignees
Labels
kind/bug Something isn't working

Comments


yanghaitao5000 commented Jul 20, 2023

Is there an existing issue for the same bug?

  • I have checked the existing issues.

Environment

  • Version or commit-id (e.g. v0.1.0 or 8b23a93): k8s version: v1.22.10, MatrixOne version: 0.8.0
  • Hardware parameters:
  • OS type: CentOS Linux release 7.9.2009.
  • Others:

Actual Behavior

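Installation followed the official MatrixOne 0.8.0 documentation: the cluster deployment guide.
Cluster environment: 1 master and 3 workers. All 4 virtual machines have 32 CPU cores and 36 GB of memory and run CentOS Linux release 7.9.2009.
Cluster topology: 3 logService, 3 cn, 3 dn (3 logService, 1 cn, 1 dn cannot be connected to either, with the same symptoms).
Installed software versions:
1. MatrixOne image: matrixorigin/matrixone:0.8.0
2. matrixone-operator package: matrixone-operator-0.8.0-alpha.6.tgz or matrixone-operator-0.8.0-alpha.5.tgz
3. Other components: all installed as described in the official MatrixOne 0.8.0 documentation.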
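Problem description:
Following the installation guide, after the mo cluster was deployed successfully, running the connection command "mysql -hxxx -P6001 -uroot -p111" or "mysql -h $(kubectl get svc/mo-tp-cn -n mo-hn -o jsonpath='{.spec.clusterIP}') -P 6001 -uroot -p111" fails with the following error: "……ERROR 1045 (28000): Access denied for user root. SQL parser error: table "mo_account" does not exist"
[Screenshot: 1689849430176]
Running kubectl get pods -n mo-hn afterwards shows that the cn pod keeps restarting:
[Screenshot: 1689850649862]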
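
For anyone trying to reproduce the observation above, a minimal sketch for picking out the pods that have restarted. The helper name `restarting_pods` is hypothetical, and it assumes the default `kubectl get pods --no-headers` column order (NAME, READY, STATUS, RESTARTS, AGE):

```shell
# Hypothetical helper: reads `kubectl get pods --no-headers` output on stdin
# and prints "<pod name> <restart count>" for every pod that has restarted.
# Assumes RESTARTS is the 4th column, as in the default kubectl table output.
restarting_pods() {
  awk '$4 + 0 > 0 { print $1, $4 }'
}

# Intended usage (namespace taken from the report above):
#   kubectl get pods -n mo-hn --no-headers | restarting_pods
```

Once a restarting pod is identified, `kubectl logs <pod> -n mo-hn --previous` shows the log of the container instance that exited, which is usually more informative than the log of the freshly restarted one.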

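The CN error log is in the following file:
cn.log

The matrixone-operator error log is in the following file:
matrixone-operator.log

Screenshot of some of the matrixone-operator errors:
[Screenshot: de4d5d9f8515bb6ef44bf9af5f86964]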

Expected Behavior

No response

Steps to Reproduce

No response

Additional information

No response

@yanghaitao5000 yanghaitao5000 added the kind/bug Something isn't working label Jul 20, 2023
@yanghaitao5000
Author

The k8s version in use is v1.22.10.


aronchanisme commented Jul 21, 2023

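This may be related to the k8s local DNS. Check whether the local DNS pods are healthy and whether they log an error like the one below.

This looks like a kuboard-spray problem: it added a nameserver 169.254.25.10 entry to the /etc/hosts file, which sends DNS into an endless loop: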
[FATAL] plugin/loop: Loop (169.254.25.10:58480 -> 169.254.25.10:53) detected for zone ".", see https://coredns.io/plugins/loop#troubleshooting. Query: "HINFO 3826882204040966417.8782391214162251717."
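
A rough sketch of how that check could be scripted. The helper names `find_dns_loop` and `has_nameserver` are hypothetical, and the namespace and label selector in the usage comments are assumptions that depend on how the local DNS was installed. The file to inspect is passed as an argument: the comment above mentions /etc/hosts, while nameserver lines are normally found in /etc/resolv.conf.

```shell
# Hypothetical helper: prints every CoreDNS loop-detection line found on stdin.
# Exit status follows grep: 0 if at least one line matched, 1 otherwise.
find_dns_loop() {
  grep -F 'plugin/loop: Loop'
}

# Hypothetical helper: succeeds if file $1 has a "nameserver <ip>" line for ip $2.
has_nameserver() {
  awk -v ip="$2" '$1 == "nameserver" && $2 == ip { found = 1 } END { exit !found }' "$1"
}

# Intended usage (namespace and label selector are assumptions, adjust to your cluster):
#   kubectl logs -n kube-system -l k8s-app=kube-dns | find_dns_loop
#   has_nameserver /etc/resolv.conf 169.254.25.10 && echo "suspicious nameserver entry"
```

If the loop line shows up, the troubleshooting page linked in the error message describes the usual causes and fixes.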

@aylei aylei self-assigned this Aug 4, 2023
3 participants