Automate the daily partitioning of your CloudTrail bucket in Athena

GorillaStack/athena-cloudtrail-partitioner


Athena CloudTrail Partitioner

AWS Athena is a serverless query service that helps you query your unstructured S3 data without all the ETL.

Athena allows you to query your CloudTrail log data from your S3 bucket on demand. However, it can be challenging to maintain sensible partitioning on the database over time.

This project helps you periodically add partitions to your Athena/Glue database for each day/month/year/region/account added to your CloudTrail log bucket.
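To make the idea concrete, here is a minimal sketch of the kind of statement that adds one such partition. It is not this project's actual implementation: the function name `build_partition_ddl`, the table name, and the partition column names (`account`, `region`, `year`, `month`, `day`) are assumptions chosen for illustration. The S3 prefix follows the standard layout CloudTrail uses for organization trails.

```shell
# Hypothetical helper: print an Athena DDL statement that registers one
# account/region/day of CloudTrail logs as a partition.
# Arguments: table bucket org_id account region date(YYYY-MM-DD)
build_partition_ddl() {
  table="$1"; bucket="$2"; org_id="$3"; account="$4"; region="$5"; date="$6"

  # Split YYYY-MM-DD into its parts using POSIX parameter expansion.
  year="${date%%-*}"
  rest="${date#*-}"
  month="${rest%%-*}"
  day="${rest#*-}"

  # Organization trails write to AWSLogs/<org-id>/<account-id>/CloudTrail/<region>/<yyyy>/<mm>/<dd>/
  location="s3://${bucket}/AWSLogs/${org_id}/${account}/CloudTrail/${region}/${year}/${month}/${day}/"

  # Column names below are assumed; match them to your own table definition.
  printf "ALTER TABLE %s ADD IF NOT EXISTS PARTITION (account='%s', region='%s', year='%s', month='%s', day='%s') LOCATION '%s'\n" \
    "$table" "$account" "$region" "$year" "$month" "$day" "$location"
}

# Example call with placeholder values.
build_partition_ddl cloudtrail_logs my-cloudtrail-bucket o-exampleorgid 111122223333 us-east-1 2024-01-15
```

A statement like this could then be submitted with `aws athena start-query-execution`. Running one per account, region, and day by hand is the repetitive work this project is meant to automate.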

Read more about why we built this, and how it can be used, in this blog post.

Prerequisite - Enable CloudTrail

CloudTrail is an audit log of every action that occurs in your AWS account. It should be on all the time.

You can now enable CloudTrail at the AWS Organization level, which means that CloudTrail for each account will be centrally logged and automatically enabled for all new accounts.

Read about how to create your organization CloudTrail here.

Installation

Install the Athena CloudTrail Partitioner through CloudFormation, either with the AWS CLI:

aws cloudformation deploy \
  --stack-name athena-cloudtrail-partitioner \
  --region ${AWS_DEFAULT_REGION} \
  --template-file cf/template.yml \
  --force-upload \
  --parameter-overrides \
    "OrganizationId=${ORGANIZATION_ID}" \
    "S3BucketName=${S3_BUCKET_NAME}" \
  --capabilities CAPABILITY_NAMED_IAM \
  --no-fail-on-empty-changeset

or click this button to deploy through the AWS Console:

Launch Stack
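
The CLI command above reads `AWS_DEFAULT_REGION`, `ORGANIZATION_ID` and `S3_BUCKET_NAME` from the environment. As a small optional sketch (the function name `check_deploy_env` is hypothetical and not part of this project), you could check that all three are set before deploying:

```shell
# Hypothetical pre-flight check: fail early if any variable used by the
# deploy command is unset or empty.
check_deploy_env() {
  missing=""
  for name in AWS_DEFAULT_REGION ORGANIZATION_ID S3_BUCKET_NAME; do
    # Indirectly read the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  if [ -n "$missing" ]; then
    echo "Missing required variables:$missing" >&2
    return 1
  fi
}

# Usage (not executed here):
#   check_deploy_env && aws cloudformation deploy ...
# The organization ID can usually be looked up with:
#   aws organizations describe-organization --query 'Organization.Id' --output text
```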