-
Notifications
You must be signed in to change notification settings - Fork 8.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
HADOOP-18610: [ABFS] OAuth2 Token Provider support for Azure Workload Identity #6787
base: trunk
Are you sure you want to change the base?
Conversation
@@ -552,16 +552,7 @@ protected void assumeValidAuthConfigsPresent() { | |||
currentAuthType == AuthType.SAS); | |||
if (currentAuthType == AuthType.SharedKey) { | |||
assumeValidTestConfigPresent(getRawConfiguration(), FS_AZURE_ACCOUNT_KEY); | |||
} else if (currentAuthType == AuthType.OAuth) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this removed ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed code was to make sure that required configs to work with OAuth are present. This was a redundant check as a similar check is already present in AbfsConfiguration::getTokenProvider() method where these fields are parsed as Mandatory fields. getMandatoryPasswordField will take care of throwing relevant exception in case these are not configured.
Moreover the redundant check was working only for Client-Secret Token Provider and not for other OAuth Token Providers.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@anujmodi2021 : Can you pls give a try to add the testConfigPropNotFound
kinda unit test cases for the Oauth configs to make sure the source code validations are respected. Say, if someone tries to refactor getMandatoryPasswordString() to getPasswordString() function unknowingly, these unit test cases can make sure the behavior is retained by failing the test, isn't it?
String authority = appendSlashIfNeeded(
getTrimmedPasswordString(FS_AZURE_ACCOUNT_OAUTH_MSI_AUTHORITY,
AuthConfigurations.DEFAULT_FS_AZURE_ACCOUNT_OAUTH_MSI_AUTHORITY));
String tenantGuid =
getMandatoryPasswordString(FS_AZURE_ACCOUNT_OAUTH_MSI_TENANT);
String clientId =
getMandatoryPasswordString(FS_AZURE_ACCOUNT_OAUTH_CLIENT_ID);
String tokenFile =
getTrimmedPasswordString(FS_AZURE_ACCOUNT_OAUTH_TOKEN_FILE,
AuthConfigurations.DEFAULT_FS_AZURE_ACCOUNT_OAUTH_TOKEN_FILE);
Probably you can write the test cases into this test class.
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/TestAccountConfiguration.java#L377
Secondly, I could see some interesting test cases testGlobalAndAccountOAuthPrecedence
. Can you pls go through the class and add necessary test cases.
https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-azure/src/test/java/org/apache/hadoop/fs/azurebfs/TestAccountConfiguration.java#L405
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the suggestion Rakesh..
I see that this test is already present in the class that you pointed to. It is there for testing ClientSecret Based OAuth configs. I will add similar for other OAuth token providers as well including Workload Identity Token Provider.
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
Hi @steveloughran @mukund-thakur |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added few comments:
...-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/WorkloadIdentityTokenProvider.java
Outdated
Show resolved
Hide resolved
...-azure/src/main/java/org/apache/hadoop/fs/azurebfs/oauth2/WorkloadIdentityTokenProvider.java
Outdated
Show resolved
Hide resolved
// In case token is not refreshed for 1 hr or any clock skew issues, | ||
// refresh token. | ||
expiring = elapsedTimeSinceLastTokenRefreshInMillis >= ONE_HOUR | ||
|| elapsedTimeSinceLastTokenRefreshInMillis < 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So in case of clock skew we will be renewing during every call?
I would at least add a warning message.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a debug log added. Should I change it to warn then??
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is also same logic ported from MSI Token provider.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Retaining the logic of refreshing tokens in case of clock skews..
🎊 +1 overall
This message was automatically generated. |
Preconditions.checkNotNull(authEndpoint, "authEndpoint"); | ||
Preconditions.checkNotNull(clientId, "clientId"); | ||
Preconditions.checkNotNull(clientAssertion, "clientAssertion"); | ||
boolean isVersion2AuthenticationEndpoint = authEndpoint.contains("/oauth2/v2.0/"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you move this to a function and can re-use it in #getTokenUsingJWTAssertion() and #getTokenUsingClientCreds()
private static final String OAUTH_VERSION = "/oauth2/v2.0/";
private boolean isVersion2AuthenticationEndpoint(String authEndpoint){
authEndpoint.contains(OAUTH_VERSION);
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good suggestion.
Taken
qp.add("client_assertion_type", JWT_BEARER_ASSERTION); | ||
LOG.debug("AADToken: starting to fetch token using client assertion for client ID " + clientId); | ||
|
||
return getTokenCall(authEndpoint, qp.serialize(), null, "POST"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whats the reason for passing null headers ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Msi token fetch needs additional header "Metadata": "true"
Others don't need any specific headers, therefore null for them
🎊 +1 overall
This message was automatically generated. |
@rakeshadr @mukund-thakur Thanks. |
🎊 +1 overall
This message was automatically generated. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM pending one minor comment
expiring = elapsedTimeSinceLastTokenRefreshInMillis >= ONE_HOUR | ||
|| elapsedTimeSinceLastTokenRefreshInMillis < 0; | ||
System.currentTimeMillis() - tokenFetchTime; | ||
boolean expiring = elapsedTimeSinceLastTokenRefreshInMillis < 0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we need this expiring variable now?
Description of PR
Jira: https://issues.apache.org/jira/browse/HADOOP-18610
Code Ported from PR: #5953
Please refer to the original PR Description below. This PR take forward the work done by original author @creste.
Original PR Description
Add support for Azure Active Directory (Azure AD) workload identities which integrate with the Kubernetes's native capabilities to federate with any external identity provider.
This PR is based on Haifeng Chen's patch attached to HADOOP-18610. I fixed a few typos and linter errors but did not modify the core functionality.
How was this patch tested?
:::: AGGREGATED TEST RESULT ::::
============================================================
HNS-OAuth
[WARNING] Tests run: 142, Failures: 0, Errors: 0, Skipped: 2
[WARNING] Tests run: 617, Failures: 0, Errors: 0, Skipped: 73
[WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 54
============================================================
HNS-SharedKey
[WARNING] Tests run: 142, Failures: 0, Errors: 0, Skipped: 3
[WARNING] Tests run: 617, Failures: 0, Errors: 0, Skipped: 28
[WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 41
============================================================
NonHNS-SharedKey
[WARNING] Tests run: 142, Failures: 0, Errors: 0, Skipped: 9
[WARNING] Tests run: 601, Failures: 0, Errors: 0, Skipped: 268
[WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 44
============================================================
AppendBlob-HNS-OAuth
[WARNING] Tests run: 142, Failures: 0, Errors: 0, Skipped: 2
[WARNING] Tests run: 617, Failures: 0, Errors: 0, Skipped: 75
[WARNING] Tests run: 380, Failures: 0, Errors: 0, Skipped: 78
Time taken: 57 mins 20 secs.