Aggregation CFN template for Data Exports #787

Open
wants to merge 19 commits into
base: main

AWS-ZachErdman

This commit adds a new cloudformation template to support CUR 2.0 in Data Exports with the same functionality as cur-aggregation.yaml

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Collaborator

@yprikhodko yprikhodko left a comment

Great work! I've added a few comments inline. Also, I see that the PermissionsBoundary parameter and policy property are excluded compared to the current version. Is this intentional?

Fn::Sub: "arn:${AWS::Partition}:s3:::${ResourcePrefix}-${DestinationAccountId}-shared"
StorageClass: STANDARD
Id: ReplicationRule1
Prefix: ""
Collaborator

This needs to be managed carefully. Could you please test creating an Athena table with 2 aggregated exports?

Collaborator

+1. The prefix should point to the data folder so that the metadata folder is excluded from replication.

Author

Should it be a filter on the ReplicationConfiguration? Or how else would I exclude metadata folder replication?

Collaborator

You can try both and go with whichever works best. I'd suggest using the right prefix, since you know the path. This also means that we need to use replication even when the CUR is local (or, alternatively, prevent the CUR from writing to /metadata/).

Contributor

As discussed, we will have to use local replication because: 1) S3 replication itself doesn't support exclusion; 2) we can't exclude the metadata folder in Athena; 3) SCAD won't accept reduced bucket write permissions without the metadata folder.
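
The prefix-scoping idea from this thread could look roughly like the sketch below. Because S3 replication filters only include keys (there is no exclude), the rule names the data prefix and the metadata folder simply falls outside it. This is an illustration, not the template's actual configuration: the role ARN, bucket ARN, rule ID, and the `cur2/example-export/data/` path are all hypothetical, and `is_in_scope` is a local helper that only mimics prefix matching.

```python
def build_replication_config(role_arn, destination_bucket_arn, data_prefix):
    """Build a ReplicationConfiguration dict with one prefix-scoped rule."""
    return {
        "Role": role_arn,
        "Rules": [
            {
                "ID": "ReplicateExportData",  # hypothetical rule ID
                "Priority": 1,  # required when a Filter is used
                "Status": "Enabled",
                "Filter": {"Prefix": data_prefix},
                "DeleteMarkerReplication": {"Status": "Disabled"},
                "Destination": {
                    "Bucket": destination_bucket_arn,
                    "StorageClass": "STANDARD",
                },
            }
        ],
    }


def is_in_scope(key, config):
    """Local approximation: a key is in scope if an enabled rule's prefix matches."""
    return any(
        rule["Status"] == "Enabled" and key.startswith(rule["Filter"]["Prefix"])
        for rule in config["Rules"]
    )


# All identifiers below are made-up placeholders.
config = build_replication_config(
    role_arn="arn:aws:iam::111111111111:role/example-replication-role",
    destination_bucket_arn="arn:aws:s3:::example-222222222222-shared",
    data_prefix="cur2/example-export/data/",
)
```

A dict of this shape is what `boto3.client("s3").put_bucket_replication(Bucket=..., ReplicationConfiguration=config)` expects. In the CloudFormation template the equivalent would be a `Filter` with a `Prefix` on the replication rule instead of the empty top-level `Prefix: ""`.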

@AWS-ZachErdman
Author

Great work! I've added a few comments inline. Also, I see that the PermissionsBoundary parameter and policy property are excluded compared to the current version. Is this intentional?

@yprikhodko I had it in the version you shared with me, but it wasn't in the version published on the Well-Architected Labs website. It also kept causing errors when I tried to upload the template without specifying a value for that field, so I ended up removing it.

We can add it back if you'd like and you know how to keep it from erroring.

Screenshot 2024-04-14 at 7 37 25 PM

Made updates suggested by Yuriy and Iakov
@AWS-ZachErdman
Author

AWS-ZachErdman commented Apr 19, 2024

@yprikhodko and @iakov-aws, have you seen this error before? I'm getting it when trying to deploy the CFN template outside of us-east-1.

All other resources besides SourceS3 seem to be deploying correctly.

image

@iakov-aws
Collaborator

This might happen if you delete an S3 bucket in one region and then try to recreate it with the same name in another region too soon. You can try changing the prefix or waiting a bit longer.

res = client.create_export(
    Export=export
)
print(json.dumps(res))
Collaborator

Do you need to delete the old one as well?

Contributor

Deletion is handled in the next block.

Collaborator

If the name changes on update and you have to create a new CUR, you also need to delete the old one. Correct?
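
The create-then-delete ordering raised in this thread might be sketched as follows. This is not the PR's actual handler: `replace_export` and `FakeDataExportsClient` are hypothetical names, and the ARN format produced by the fake is invented so the example runs without AWS credentials. The only assumption about the real API is that the `bcm-data-exports` client's `create_export(Export=...)` returns an `ExportArn` and that `delete_export(ExportArn=...)` removes an export.

```python
def replace_export(client, new_export, old_export_arn=None):
    """Create the new export first, then delete the old one if it differs."""
    response = client.create_export(Export=new_export)
    new_arn = response["ExportArn"]
    # Delete only after the new export exists, so a failed create
    # never leaves the stack without any export.
    if old_export_arn and old_export_arn != new_arn:
        client.delete_export(ExportArn=old_export_arn)
    return new_arn


class FakeDataExportsClient:
    """In-memory stand-in for boto3.client('bcm-data-exports')."""

    def __init__(self):
        self.exports = {}

    def create_export(self, Export):
        # Invented ARN format, for illustration only.
        arn = "arn:aws:bcm-data-exports:us-east-1:111111111111:export/" + Export["Name"]
        self.exports[arn] = Export
        return {"ExportArn": arn}

    def delete_export(self, ExportArn):
        self.exports.pop(ExportArn)
        return {"ExportArn": ExportArn}


client = FakeDataExportsClient()
old_arn = replace_export(client, {"Name": "cid-old"})
new_arn = replace_export(client, {"Name": "cid-new"}, old_export_arn=old_arn)
remaining = sorted(client.exports)
```

Passing the previous ARN explicitly keeps the helper stateless, which suits a custom-resource update where the old identifier is available from the previous request.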

###########################################################################
# Glue Database for CID framework
###########################################################################
CIDDatabase:
Collaborator

This could be a blocker. I struggle to understand the full flow if the CID database is defined here.

4 participants