datasets fail when Glue catalog is encrypted with KMS CMK #1207

Open
zsaltys opened this issue Apr 22, 2024 · 2 comments

Comments

@zsaltys
Contributor

zsaltys commented Apr 22, 2024

If Glue metadata is encrypted with a KMS customer managed key (CMK), the data.all pivot role will not have access to Glue, so the import will fail, and so will share creation if encryption was enabled after the import. data.all could probably detect that Glue is encrypted with a KMS CMK and link to instructions on how to grant the pivotRole access to that key, so that failing imports or shares can be fixed. Alternatively, we could just state that KMS CMKs are not supported and that AWS managed keys should be used. I personally don't see a reason to encrypt Glue metadata with a KMS CMK, but it is supported nonetheless, and some users in our organization ended up doing it.
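
To make the detection idea concrete, here is a rough sketch, not a proposal for the actual data.all code. It assumes boto3's Glue `get_data_catalog_encryption_settings` and KMS `describe_key` calls; the function names `catalog_kms_key_id` and `find_catalog_cmk` are made up for illustration, and the caller is assumed to be allowed to call `kms:DescribeKey` on the key.

```python
from typing import Any, Dict, Optional


def catalog_kms_key_id(settings_response: Dict[str, Any]) -> Optional[str]:
    """Return the KMS key id used for catalog encryption at rest, or None.

    `settings_response` is the dict returned by Glue's
    get_data_catalog_encryption_settings call.
    """
    at_rest = (
        settings_response
        .get("DataCatalogEncryptionSettings", {})
        .get("EncryptionAtRest", {})
    )
    if at_rest.get("CatalogEncryptionMode", "DISABLED") == "DISABLED":
        return None
    return at_rest.get("SseAwsKmsKeyId")


def find_catalog_cmk(session: Any, catalog_id: str) -> Optional[str]:
    """Return the key id if the catalog uses a customer managed key.

    `session` is expected to behave like a boto3.Session (assumption).
    Returns None when the catalog is unencrypted or uses an AWS managed key.
    """
    glue = session.client("glue")
    kms = session.client("kms")
    response = glue.get_data_catalog_encryption_settings(CatalogId=catalog_id)
    key_id = catalog_kms_key_id(response)
    if key_id is None:
        return None
    # KeyManager is "CUSTOMER" for customer managed keys and "AWS" otherwise.
    key_manager = kms.describe_key(KeyId=key_id)["KeyMetadata"]["KeyManager"]
    return key_id if key_manager == "CUSTOMER" else None
```

If `find_catalog_cmk` returns a key id, the UI could show that id together with a pointer to the key policy change that is needed for the pivotRole.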

zsaltys changed the title from "data.all does not work with Glue catalogs encrypted with KMS CMK" to "data.all does not work with Glue catalog encrypted with KMS CMK" on Apr 22, 2024
zsaltys changed the title from "data.all does not work with Glue catalog encrypted with KMS CMK" to "datasets fail with Glue catalog encrypted with KMS CMK" on Apr 22, 2024
zsaltys changed the title from "datasets fail with Glue catalog encrypted with KMS CMK" to "datasets fail when Glue catalog is encrypted with KMS CMK" on Apr 22, 2024
@zsaltys
Contributor Author

zsaltys commented Apr 22, 2024

I noticed that this gets even more confusing when importing a Glue database that is encrypted with a KMS CMK: the CloudFormation custom resource gets an exception on GetDatabase, assumes that the database does not exist, and attempts to create it, which also fails because the database already exists.
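
One way to avoid that misleading second failure would be to treat only a genuine "not found" error as "database does not exist" and let everything else surface. A minimal sketch, assuming a boto3 Glue client (or anything with the same `get_database` call) and that errors carry a botocore-style `response["Error"]["Code"]`; `get_database_or_none` is a hypothetical helper, not the existing custom resource code.

```python
from typing import Any, Dict, Optional


def get_database_or_none(
    glue_client: Any, name: str, catalog_id: str
) -> Optional[Dict[str, Any]]:
    """Return the Glue database, or None only if it truly does not exist.

    Any other failure (for example an access or encryption error) is
    re-raised so that the caller does not go on to create a database
    that is already there.
    """
    try:
        return glue_client.get_database(CatalogId=catalog_id, Name=name)["Database"]
    except Exception as exc:
        # botocore's ClientError exposes the service error code here.
        code = getattr(exc, "response", {}).get("Error", {}).get("Code")
        if code == "EntityNotFoundException":
            return None
        raise
```

With this shape, the import would fail on the original GetDatabase error, which points at the real cause instead of an "already exists" error from the create call.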

@dlpzx
Contributor

dlpzx commented Apr 24, 2024

Hi @zsaltys, thanks for opening an issue. Regardless of whether we support this use case, we need to make clear to users what is allowed and what the current limitations are. At the moment, encrypting the Glue Catalog with a CMK is not supported, and users should be aware of that. That is step number 1. The other question is: should we support CMK-encrypted Glue Catalogs? We need to understand why users choose this pattern, as well as the implications and added value of implementing the feature.

Here is a quick assessment of what would be needed:

  • Modify the pivot role permissions and grant the pivot role permissions on the CMK
    - Add IAM permissions to the CDK roles to get the Glue Catalog settings
    - Modify the Pivot Role CDK stack to add IAM permissions for the CMK
  • Modify any role that accesses or writes to an encrypted catalog (that is, any role, crawler, or job) so that it has the KMS permissions described in the AWS docs. This part is a bit more difficult.
    - Modify the IAM roles created by data.all (dataset roles, environment team roles)
    - Add extra permissions to consumption roles?
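
As a rough illustration of the permissions above, the policy attached to each affected role could be built along these lines. This is a sketch under assumptions: the KMS actions are the ones I understand the AWS Glue docs to list for roles that read or write an encrypted catalog and should be checked against the docs, and `catalog_access_policy` is a hypothetical helper, not existing data.all code.

```python
from typing import Any, Dict, List


def catalog_access_policy(key_arns: List[str]) -> Dict[str, Any]:
    """Build an IAM policy document for roles using a CMK-encrypted catalog.

    `key_arns` are the ARNs of the KMS keys used to encrypt the Data Catalog.
    """
    if not key_arns:
        raise ValueError("at least one KMS key ARN is required")
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Needed to read and write metadata encrypted with the CMK
                # (assumed action list, verify against the AWS Glue docs).
                "Effect": "Allow",
                "Action": ["kms:Decrypt", "kms:Encrypt", "kms:GenerateDataKey"],
                "Resource": list(key_arns),
            },
            {
                # Lets the role check whether the catalog is encrypted at all.
                "Effect": "Allow",
                "Action": ["glue:GetDataCatalogEncryptionSettings"],
                "Resource": "*",
            },
        ],
    }
```

The key policy of the CMK itself would also have to allow these roles, which is the part that data.all cannot do on its own when the key is owned by the user.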

It would be great to have the feedback of the affected users, so that we implement what they really need.
