-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HDFS-17509. RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file. #6784
Conversation
RBF: Fix ClientProtocol.concat will throw NPE if tgr is a empty file.
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
@goiri Hi,sir. Do you have time to review this PR, thanks! |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @LiuGuH for your catch.
This bug makes sense. Leave some comments
} | ||
} | ||
// Concat only effects when all files in same namespace. | ||
// And in router view, a file only exists in one RemoteLocation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally a file only exits in one Namespace. But maybe multiple namespaces contain this file and RBF returns the file in the first namespace to the client.
This first namespace is got by the OrderedResolver
.
you can refer to this case:
- /path mounts to NS1, NS2 and NS3
- NS2 and NS3 contains /path
OrderedResolver
returns NS1, NS2 and NS3
for this case, we should proxy this concat
to NS2 instead of NS1, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For the empty file, maybe we can use getFileInfo
to get the namespace that this trg belongs to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for review.
you can refer to this case:
- /path mounts to NS1, NS2 and NS3
- NS2 and NS3 contains /path
OrderedResolver
returns NS1, NS2 and NS3for this case, we should proxy this
concat
to NS2 instead of NS1, right?
This only happens when /path is a directory. And for a file ,it must only be from one exactly nameservice via dfsrouter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Emmm, there is a scene .
-
If a file is already exist in two nameservices. And the add router mount.
NS1 /user/test/file
NS2 /user/test/file -
Add router mount.
hdfs dfsrouteradmin -add /user/test NS1,NS2 /user/test -order RANDOM -
getDestination
hdfs dfsrouteradmin -getDestination /user/test/file
Will return NS1,NS2
For a file in Router view with more than one nameservices, I think should thrown Exception for concat method.
Look forward to your guidance , thanks ! @ZanderXu
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For a file in Router view with more than one nameservices, I think should thrown Exception for concat method.
For multiple namespaces contain one same file case, RBF just return the file in the first namespace currently, such as: getBlockLocation, getFileInfo, etc.
So If you want to thrown Exception for concat, maybe you need to modify all RPCs to throw Exception for this case.
May we can use getFileInfo instead of getBlockLocation to fix this bug. BTW, getFileInfo is no needed if this path only mounts to one namespace.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
May we can use getFileInfo instead of getBlockLocation to fix this bug. BTW, getFileInfo is no needed if this path only mounts to one namespace.
At the beginning , I consider use getFileInfo . But the problem is that HdfsFileStatus does not have any information about nameservices or blockpoolid, only LocatedBlock has.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we don't need to get NSId or BPId from the result of getFileInfo, you can refer to invokeSequential
to loop all namespaces one by one.
- Get all namespaces of the input path.
- Send getFileInfo to each namespace one by one
- The first namespace that the result of getFileInfo is not null is the one we need
@@ -1224,6 +1224,17 @@ public void testProxyConcatFile() throws Exception { | |||
String badPath = "/unknownlocation/unknowndir"; | |||
compareResponses(routerProtocol, nnProtocol, m, | |||
new Object[] {badPath, new String[] {routerFile}}); | |||
|
|||
// Test when concat trg is a empty file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we also need to check the empty source file.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When there is a empty source file, it will throw exception in namenode. This behaiver is as same as with dfsrouter.
And when trg is a empty file, it is diffrent . Without dfsrouter ,it success. And with dfsrouter, it will throw Exception.
So I think there is no need to check the empty source file. Or implement it in another PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the namenode and rbf throw the same Exception?
Maybe RBF throws NPE, but NN throws org.apache.hadoop.HadoopIllegalArgumentException
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When srclist has a empty file ,both namenode and dfsrouter will throw the same IOException
https://github.com/apache/hadoop/blob/trunk/hadoop-hdfs-project/hadoop-hdfs/src/main/java/org/apache/hadoop/hdfs/server/namenode/FSDirConcatOp.java#L153-L155
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you modify the UT to cover the case that one or more source files are empty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add test for a empty src file
🎊 +1 overall
This message was automatically generated. |
@@ -1224,6 +1224,17 @@ public void testProxyConcatFile() throws Exception { | |||
String badPath = "/unknownlocation/unknowndir"; | |||
compareResponses(routerProtocol, nnProtocol, m, | |||
new Object[] {badPath, new String[] {routerFile}}); | |||
|
|||
// Test when concat trg is a empty file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do the namenode and rbf throw the same Exception?
Maybe RBF throws NPE, but NN throws org.apache.hadoop.HadoopIllegalArgumentException
.
RemoteMethod method = | ||
new RemoteMethod("getFileInfo", new Class<?>[] {String.class}, new RemoteParam()); | ||
// Check for file information sequentially | ||
RemoteResult<RemoteLocation, HdfsFileStatus> result = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If locations
only contains one namespace, we can returns this namespace directly instead of getting the namespace through getFileInfo
, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This may not be true. Even locations only contains one namespace, it still cannot decide whether the file exists or not. So getFileInfo is better to execute at least once. Or it will send to namenode and namenode throw file is not found.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
RBF is simply responsible for locating the downstream namespace and then proxying the request.
So if the input path is only mounted to one namespace, RBF only needs to proxy it directly. RBF does not need to check if the file exists in this only one downstream namespace, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
...-rbf/src/main/java/org/apache/hadoop/hdfs/server/federation/router/RouterClientProtocol.java
Outdated
Show resolved
Hide resolved
String targetFile = cluster.getFederatedTestDirectoryForNS(sameNameservice) + "_targetFile"; | ||
createFile(routerFS, targetFile, existingFileSize); | ||
// Concat in same namespaces, succeeds | ||
testConcat(srcEmptyFile, targetFile, true, true, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
here please check if the exception is org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.HadoopIllegalArgumentException)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done. Thanks
💔 -1 overall
This message was automatically generated. |
💔 -1 overall
This message was automatically generated. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @LiuGuH .
LGTM +1
Merged. Thanks @LiuGuH for your contribution. |
Description of PR
As described HDFS-17509
hdfs dfs -concat /tmp/merge /tmp/t1 /tmp/t2
When /tmp/merge is a empty file, this command will throw NPE via DFSRouter.
Because a file via DFSRouter only can be exists in one namespace, so this RP can implement in this way.