-
Notifications
You must be signed in to change notification settings - Fork 475
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDDS-10789. Check policy based on original replicas instead of eligibleReplicas #6617
base: master
Are you sure you want to change the base?
Conversation
I am not sure on this. @siddhantsangwan might remember more clearly, as I think he worked on it. However, eligibleReplicas is the set of replias with pendingDelete and out of service removed. Ie, those replicas are going to go away "soon" so we should not consider them for over replication / placement or removal. So the set is trimmed down to that. Then we retain certain unhealthy replicas by also removing them from the set. At the end of the trim, we have a set of replicas that are in the system and can be removed. If we use the original set for the placement check, then there are potentially replicas that are going to be removed in it, and so the check is not consistent with the future view. By using the eligible replicas in the placement check, we are using all the healthy and not going to be deleted replicas, which seems like the correct thing to do. |
@sodonnel Thanks for the review. The case we met the issue is as follows:
For inprogress moves, this should be removed here. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the patch @symious.
From my understanding, reserveReplica
are the replicas that are excluded ("saved") from the eligible replicas to delete because it has the same origin node ID as the other container replicas (with maximum of 1 of each replica for each origin datanode ID will be excluded).
Could you help to rename the PR title and ticket to reflect this? "original replicas" are not entirely correct since some readers might think it includes all of the container replicas, including the containers that were removed because of decommission / maintenance DN or unhealthy container, while in reality this patch only includes the excluded reserveReplica
.
Additionally it is also good to highlight that this problem only affects QUASI_CLOSED
containers not CLOSED
containers since only QUASI_CLOSED
containers have such uniqueness check based on the origin node ID.
I left a few comments. Not very well-versed on this, so I'll let others with more familiarity to review this.
RatisContainerReplicaCount replicaCount, | ||
List<ContainerReplicaOp> pendingOps) { | ||
List<ContainerReplica> reserveReplicas = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: reserveReplicas
-> reservedReplicas
@@ -321,4 +325,43 @@ private int createCommands( | |||
return commandsSent; | |||
} | |||
|
|||
static class EligibleAndReserveReplicas { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you help add a short Javadoc on this class and the attributes for clarity?
|
||
/* | ||
Remove excess replicas if that does not make the container mis replicated. | ||
If the container was already mis replicated, then remove replicas if that | ||
does not change the placement count. | ||
*/ | ||
Set<ContainerReplica> replicaSet = new HashSet<>(replicas); | ||
Set<ContainerReplica> replicaSet = eligibleAndReserveReplicas.getReplicaSet(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Maybe can add a comment on why we use the eligible + reserved replica set instead of only the eligible replica set?
@@ -321,4 +325,43 @@ private int createCommands( | |||
return commandsSent; | |||
} | |||
|
|||
static class EligibleAndReserveReplicas { | |||
private List<ContainerReplica> eligibleReplicas; | |||
private List<ContainerReplica> reserveReplicas; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think reserved
is fine, but might want to consider to use "saved" instead by renaming it to savedReplicas
or savedReplicasWithUniqueOrigins
so that it's consistent with the method saveReplicasWithUniqueOrigins
and reduce the number of special terms in RM.
I think this change looks good in concept. If the comments from @ivandika3 are addressed it should be good to commit. A unit test would be nice too, but I know it can be a little tricky to setup mis-replication scenarios in the unit tests. |
What changes were proposed in this pull request?
In RatisOverReplicationHandler, the eligibleReplicas might be trimmed due to uniqueness check, and the check of placement policy should be done on the original replicas set, instead of the trimmed eligibleReplicas.
What is the link to the Apache JIRA
https://issues.apache.org/jira/browse/HDDS-10789
How was this patch tested?
Original tests.