Orchestrator GUI incorrectly shows recovery option for intermediate database in chained replication #1463

Open
kamil-holubicki opened this issue Dec 29, 2022 · 0 comments

In a chained replication environment such as A -> B -> C, the loss of the leaf node, C, makes the Orchestrator GUI show the Recover button on B even though no recovery action is possible. This can confuse users unfamiliar with MySQL topology, who may think there is an action to take.

To reproduce this issue, you can create a cluster using 3 nodes with anydbver:

./anydbver deploy hn:ps0 ps:5.7 node1 hn:ps1 ps:5.7 master:default node2 hn:ps2 ps:5.7 master:node1 node3 hn:orc orchestrator master:default

Shut down the ps2 instance and open the GUI. It will show ps1 as an instance to recover.

kamil-holubicki added a commit to kamil-holubicki/orchestrator-openark-fork that referenced this issue Dec 29, 2022
intermediate database in chained replication

openark#1463

Problem:
If we've got replication chain A->B->C, and C is down, GUI shows
'Recover' dropdown for node B, but there is no possible recovery action
available in such a case.

Cause:
The root cause of the problem is the analysis logic in
Analysis_dao.go:GetReplicationAnalysis(). The condition for setting
AllIntermediateMasterReplicasNotReplicating does not check if there are
any replicas reachable. So the case when all replicas are dead
(no recovery action possible) and the case when some replicas are still
reachable, but are not replicating (recovery action possible) are
indistinguishable.

Solution:
Improve the analysis logic. Report
AllIntermediateMasterReplicasNotReplicating only if all replicas are not
replicating, but there are still some reachable replicas.

This commit also avoids querying a node that is not reachable
(the node is pinged before it is examined).
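
The distinction described in the commit message can be sketched in Go as follows. This is only an illustration under stated assumptions: the `replicaCounts` type, its fields, and the `allReplicasNotReplicating` function are hypothetical names invented here, not the actual code in Analysis_dao.go or the fix itself.

```go
package main

import "fmt"

// replicaCounts is a hypothetical summary of the replicas attached to an
// intermediate master (node B in the chain A->B->C). It is not an
// Orchestrator type.
type replicaCounts struct {
	Total       int // all replicas known for the node
	Reachable   int // replicas that answered the last check
	Replicating int // reachable replicas whose replication is running
}

// allReplicasNotReplicating reports the condition only when no replica is
// replicating AND at least one replica is still reachable. If every replica
// is dead, there is nothing to recover, so the condition is not reported.
func allReplicasNotReplicating(c replicaCounts) bool {
	return c.Total > 0 && c.Reachable > 0 && c.Replicating == 0
}

func main() {
	// C is down: B has one replica and it is unreachable.
	allDead := replicaCounts{Total: 1, Reachable: 0, Replicating: 0}
	// C is up but its replication is stopped.
	stopped := replicaCounts{Total: 1, Reachable: 1, Replicating: 0}

	fmt.Println(allReplicasNotReplicating(allDead))
	fmt.Println(allReplicasNotReplicating(stopped))
}
```

With the reachability check included, the first case (leaf node down) no longer qualifies, while the second case (leaf node reachable but not replicating) still does.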