S3 Storage
Salient features
- 11 9s of durability
- 4 9s of availability, except for one-zone IA (99.5%)
- IA and Glacier tiers ave minimum duration charge. IA - 30 days, Glacier - 90 days, Glacier deep archive - 180 days
- Typical latency: 100-200 ms
- Performance: Multi-part upload, Transfer acceleration (move to edge, copied to another region privately)
Anti patterns:
- Lots of small files
- POSIX file system (use EFS instead), file locks
- Search features, queries, rapidly changing data
- Website with dynamic content
Integrations
- Integrate with Event Bridge for fine-grained filtering & multiple targets
S3 Select & Glacier Select
- S3 Select, launching in preview now generally available, enables applications to retrieve only a subset of data from an object by using simple SQL expressions.
-
By using S3 Select to retrieve only the data needed by your application, you can achieve drastic performance increases – in many cases you can get as much as a 400% improvement.
-
cold data stored in Glacier can now be easily queried within minutes.
- This unlocks a lot of exciting new business value for your archived data.
- Glacier Select allows you to to perform filtering directly against a Glacier object using standard SQL statements.
Glacier Vault with read-only vault lock policy can protect logs from tampering.
S3 Cross-Region
Suitable for dynamic content that needs to be served closer to multi-region consumers. Cloudfront data is cached for a day (typically), so not suitable for dynamic content.
The cost coming out of CloudFront is typically half a cent less per GB than data transfer for the same tier and Region. What this means is that you can take advantage of the additional performance and security of CloudFront by putting it in front of your Application Load Balancers (ALB), AWS Elastic Beanstalk, Amazon S3, and other AWS resources delivering HTTP(S) objects for next to no additional cost.
To compare the general upload speed across Amazon S3 Regions, you can use the Amazon S3 Transfer Acceleration Speed Comparison tool.
Lifecycle policies
Incomplete Multipart Uploads – S3’s multipart upload feature accelerates the uploading of large objects by allowing you to split them up into logical parts that can be uploaded in parallel. If you initiate a multipart upload but never finish it, the in-progress upload occupies some storage space and will incur storage charges. However, these uploads are not visible when you list the contents of a bucket and (until today’s release) had to be explicitly removed.
Expired Object Delete Markers – S3’s versioning feature allows you to preserve, retrieve, and restore every version of every object stored in a versioned bucket. When you delete a versioned object, a delete marker is created. If all previous versions of the object subsequently expire, an expired object delete marker is left. These markers do not incur storage charges. However, removing unneeded delete markers can improve the performance of S3’s LIST operation.
You can exercise additional control over these objects using some new lifecycle rules, lowering your costs and improving performance in the process.
Replication
- For both cross-region and same-region replication, Versioning must be enabled.
Encryption
- SSE-S3: S3 managed keys
- SSE-KMS: Keys are managed in KMS
- SSE-C: Can be own keys or KMS keys, but encryption happens at the client side.
- Prevent upload of unencrypted objects through policies. Condition: s3:x-amz-server-side-encryption. AES256: SSE-S3 or KMS: KMS
- Default encryption can be enabled for a bucket (as either SSE-S3 or SSE-KMS)
S3 Transfer Acceleration
Use Cases:
- You have customers that upload to a centralized bucket from all over the world.
- You transfer gigabytes to terabytes of data on a regular basis across continents.
- You are unable to utilize all of your available bandwidth over the Internet when uploading to Amazon S3.
You can transfer data to and from the acceleration-enabled bucket by using one of the following s3-accelerate endpoint domain names:
- s3-accelerate.amazonaws.com: to access an acceleration-enabled bucket.
- s3-accelerate.dualstack.amazonaws.com: to access an acceleration-enabled bucket over IPv6. Amazon S3 dual-stack endpoints support requests to S3 buckets over IPv6 and IPv4.
Transfer Acceleration vs Multi-part Upload
- For users distributed across the world (or regions) but accessing a centralized bucket, Transfer Acceleration is preferred since it uses AWS backbone network which can optimize the transfer from far away regions.
Website Hosting
- When you configure an Amazon S3 bucket for website hosting, you must give the bucket the same name as the record that you want to use to route traffic to the bucket.
Troubleshooting
If you want to route traffic to an S3 bucket that is configured for website hosting but the name of the bucket doesn't appear in the Alias Target list in the Amazon Route 53 console, check the following:
- The name of the bucket exactly matches the name of the record.
- The S3 bucket is correctly configured for website hosting. I.e. Public READ has been configured properly.