Amazon Simple Storage Service (S3)

Provides a secure, durable, highly-scalable object store.

Useful Links

Best Practices

  • Use caching for frequently accessed content, such as CloudFront or ElastiCache
  • For latency-sensitive applications, AWS advises tracking and aggressively retrying slower operations
    • When you retry a request, it's recommended to open a new connection to S3 and perform a fresh DNS lookup
  • Ensure that your S3 buckets use the correct policies and are not publicly accessible
  • Implement least privilege access
  • Use IAM roles for applications and AWS services that require S3 access
  • Enable multi-factor authentication (MFA) Delete to prevent accidental bucket deletion
  • Consider encryption of data at rest
  • Enforce encryption of data in transit
    • Allow only encrypted connections over HTTPS (TLS) using the aws:SecureTransport condition in S3 bucket policies (see the policy sketch after this list).
  • Consider S3 Object Lock to prevent accidental or inappropriate deletion of data
    • For example, prevent deletion of CloudTrail logs
  • Consider S3 cross-region replication (CRR): although S3 stores your data across multiple geographically diverse AZs by default, compliance requirements might dictate that you store data at even greater distances
  • Consider VPC endpoints for S3 access to prevent traffic from traversing the open internet and being exposed to the public internet environment
  • Identify and audit all your S3 buckets
  • Implement monitoring using AWS monitoring tools
  • Enable S3 server access logging to get detailed records of the requests made to a bucket
  • Use AWS CloudTrail to record actions taken by a user, a role, or an AWS service in S3.
    • You can use information collected by CloudTrail to determine the request that was made to S3, the IP address from which the request was made, who made the request, when it was made, and additional details
  • Enable AWS Config so you can assess, audit, and evaluate the configurations of your AWS resources, helping you simplify compliance auditing, security analysis, change management, and operational troubleshooting
  • Consider using Amazon Macie with S3; it uses machine learning to automatically discover, classify, and protect sensitive data in AWS.
    • Macie recognizes sensitive data such as personally identifiable information (PII) or intellectual property
    • It provides you with dashboards and alerts that give visibility into how this data is being accessed or moved
  • Monitor AWS security advisories regularly, in particular, note warnings about S3 buckets with “open access permissions.”
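
A minimal sketch of the HTTPS-only practice referenced above, using Python and boto3 (the bucket name my-example-bucket is hypothetical, and the snippet assumes boto3 credentials are already configured): a Deny statement keyed on the aws:SecureTransport condition rejects any request that arrives over plain HTTP.

```python
import json
import boto3

s3 = boto3.client("s3")

https_only_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyInsecureTransport",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:*",
            "Resource": [
                "arn:aws:s3:::my-example-bucket",
                "arn:aws:s3:::my-example-bucket/*",
            ],
            # Requests made over plain HTTP evaluate aws:SecureTransport to false
            "Condition": {"Bool": {"aws:SecureTransport": "false"}},
        }
    ],
}

# Attach the policy to the bucket; all non-TLS requests are now denied
s3.put_bucket_policy(
    Bucket="my-example-bucket",
    Policy=json.dumps(https_only_policy),
)
```

Denying at the bucket-policy level catches every caller, including IAM users and roles whose identity policies would otherwise allow unencrypted access.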

General Notes

  • Safe place to store your files
  • Allows you to upload files, but is not suitable for installing an OS or running a database
  • Object-based storage (not block storage)
  • Data is spread across multiple devices and facilities (+ high availability and + disaster recovery)
  • Files can be from 0 bytes to 5TB
  • Unlimited storage (pay as you go)
  • Files are stored in Buckets (similar to folders)
  • S3 uses a universal namespace, so bucket names must be unique globally
  • When you upload a file, you will receive an HTTP 200 code if the upload was successful
  • Data consistency:
    • Read after write consistency for PUTs of new objects
    • Eventual consistency for overwrite PUTs and DELETEs (can take some time to propagate)
  • S3 is key-value based, and objects consist of:
    • Key: the name of the object
    • Value: the actual data, made up of a sequence of bytes
    • Version id
    • Metadata: such as tags
    • Subresources: bucket-specific configuration, such as:
      • bucket policies, access control lists
      • CORS
      • Transfer acceleration
  • Amazon guarantees 99.9% availability and 99.999999999% durability for S3 information (11x9s)
  • Storage Tiers
    • S3: stores data redundantly across multiple facilities, and is designed to sustain the loss of 2 facilities concurrently
    • S3 IA (Infrequently Accessed): for data that is accessed less frequently, but requires rapid access when needed. There is a retrieval fee for all S3 IA objects
    • S3 One Zone IA: same as IA, but data is stored in a single AZ; still 11x9s durability, but only 99.5% availability. Costs 20% less than regular S3 IA
    • Reduced Redundancy Storage: designed to provide 99.99% durability and 99.99% availability. Used for data that can be recreated if lost, e.g. thumbnails
      • Fading out, but might still appear on the exam
    • Glacier: very cheap, but used for archival only. Optimised for data that is infrequently accessed, takes 3-5 hours to restore
  • Intelligent Tiering
    • Recommended for unknown or unpredictable access patterns
    • 2 tiers, frequent and infrequent access
    • Automatically moves your data to the most cost-effective tier based on how frequently you access each object
    • Objects are moved to the infrequent access tier after not being accessed for 30 or more days, and moved back to the frequent access tier as soon as they are accessed again
    • No fees for accessing data, but small monthly fee for monitoring and automation, based on number of objects
  • You are charged for:
    • Storage per GB
    • Requests (get, put, copy, etc)
    • Storage management pricing (inventory, analytics, object tags)
    • Data transfer pricing: data transferred out of S3
    • Transfer acceleration: use of CloudFront to optimize transfers
  • You can host static websites; S3 creates a public website endpoint for the bucket
  • If you want to share resources that can be loaded on webpages, you need to enable CORS on your bucket
    • CORS configuration is set in an XML file, where you can define rules identifying the allowed origins, the HTTP methods supported for each origin, and other operation-specific information (a configuration sketch follows this list)
  • Transfer Acceleration enables fast, easy and secure transfers of files over long distances between your end users and S3 buckets
    • It takes advantage of CloudFront's edge locations
    • As data arrives at an edge location, it is routed to S3 over an optimized network path
  • S3 automatically scales in response to sustained new request rates, dynamically optimizing performance. While this optimization is in progress, you might receive HTTP 503 responses temporarily
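
A minimal sketch of the CORS configuration mentioned above, assuming boto3 with configured credentials and a hypothetical bucket and origin: S3 stores the configuration as XML, but boto3 accepts the same rules as a Python dict and serializes them for you.

```python
import boto3

s3 = boto3.client("s3")

cors_configuration = {
    "CORSRules": [
        {
            "AllowedOrigins": ["https://www.example.com"],  # origins allowed to load resources from this bucket
            "AllowedMethods": ["GET", "HEAD"],               # HTTP methods supported for that origin
            "AllowedHeaders": ["*"],
            "MaxAgeSeconds": 3000,                           # how long browsers may cache the preflight response
        }
    ]
}

# Apply the CORS rules so webpages served from the allowed origin can load objects
s3.put_bucket_cors(
    Bucket="my-example-bucket",
    CORSConfiguration=cors_configuration,
)
```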

Security

  • By default, all newly created buckets are private (only the owner of the bucket has access to it and its contents)
  • You can set up access control to your buckets using:
    • Bucket policies: applied at the bucket level, written as a JSON document
    • Access Control Lists: applied at the object level
  • Buckets can be configured to create access logs, which logs all requests made to the bucket. These logs can be written to another bucket
  • You can allow public access to a file only if the bucket has public access enabled
  • You can create bucket policies using the AWS Policy Generator; a generated policy typically looks like the sketch below
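
A minimal sketch of the kind of JSON policy the AWS Policy Generator produces, applied with boto3 (the bucket name is hypothetical): this example allows public read of objects, and it only takes effect if the bucket's public access settings permit public policies, as noted above.

```python
import json
import boto3

s3 = boto3.client("s3")

public_read_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "PublicReadGetObject",
            "Effect": "Allow",
            "Principal": "*",                                # anyone
            "Action": "s3:GetObject",                        # read objects only
            "Resource": "arn:aws:s3:::my-example-bucket/*",  # every object in the bucket
        }
    ],
}

# Attach the policy; objects become readable via their public URLs
s3.put_bucket_policy(
    Bucket="my-example-bucket",
    Policy=json.dumps(public_read_policy),
)
```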

Encryption

  • In transit
    • SSL/TLS
  • At rest
    • Server Side Encryption
      • S3 managed keys (SSE-S3): AWS handles everything for you
      • AWS KMS (SSE-KMS): you can control some aspects, such as using your own keys, audit trails, etc.
      • Customer provided keys (SSE-C): you manage the keys; AWS only performs the encryption/decryption
  • Client Side Encryption
    • You choose your own encryption methodology and upload files already encrypted
  • Enforcing encryption on S3 buckets
    • For files that should be encrypted at upload time, the x-amz-server-side-encryption header must be included in the request, with one of two values:
      • x-amz-server-side-encryption: AES256 for SSE-S3
      • x-amz-server-side-encryption: aws:kms for SSE-KMS
    • When this header is included in the PUT request, it tells S3 to encrypt the object at upload time using the specified encryption method
    • You can enforce server side encryption by using a bucket policy that denies any S3 PUT request which doesn't include the x-amz-server-side-encryption header (as sketched below)
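
A minimal sketch of enforcing server side encryption, assuming boto3 with configured credentials and a hypothetical bucket and key: passing ServerSideEncryption makes boto3 set the x-amz-server-side-encryption header on the PUT, and a bucket policy with a Null condition denies any upload where that header is missing.

```python
import json
import boto3

s3 = boto3.client("s3")

# Upload with SSE-S3; this adds "x-amz-server-side-encryption: AES256" to the PUT request
s3.put_object(
    Bucket="my-example-bucket",
    Key="reports/example.txt",
    Body=b"hello",
    ServerSideEncryption="AES256",  # use "aws:kms" for SSE-KMS instead
)

# Bucket policy that denies unencrypted uploads: the "Null" condition matches
# PUT requests where the x-amz-server-side-encryption header is absent
deny_unencrypted_policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "DenyUnencryptedUploads",
            "Effect": "Deny",
            "Principal": "*",
            "Action": "s3:PutObject",
            "Resource": "arn:aws:s3:::my-example-bucket/*",
            "Condition": {"Null": {"s3:x-amz-server-side-encryption": "true"}},
        }
    ],
}

s3.put_bucket_policy(
    Bucket="my-example-bucket",
    Policy=json.dumps(deny_unencrypted_policy),
)
```

With the policy in place, any PUT that omits the encryption header fails with an access denied error, while uploads like the one above succeed.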