Restricting Access to a CloudFront S3 Website Origin
Jan 7, 2018 15:00
I had been meaning to migrate the blog you’re reading right now to AWS’s Simple Storage Service, S3, for quite some time. As usual, other work took precedence. As fate would have it, however, I was able to drive the project to completion shortly before the close of 2017.
If you’re not already aware, S3, like many cloud object storage solutions, is able to host static websites. "Static" simply refers to the fact that none of the site’s content relies on any server-side processing. Loosely speaking, if your site needs anything beyond files served as-is (HTML, CSS, client-side JavaScript, images and the like), it’s not a candidate for static website hosting. Adding CloudFront into the mix provides a few niceties, namely encryption in transit (HTTPS) and content distribution.
The process for configuring S3 and CloudFront is straightforward and won’t be covered here. AWS maintains good documentation on it, and a quick search on DDG will turn up some detailed blog posts as well.
In this post, I’d like to share a method I stumbled upon that ensures traffic to your S3-hosted website always comes in via CloudFront - never to your S3 bucket directly.
The Problem
When using CloudFront with S3 for retrieving assets (images, videos, GIFs, etc.), you would use an Origin Access Identity (OAI) to restrict access to the S3 bucket, as the AWS documentation suggests. However, when using an S3 static website as a CloudFront origin, you must configure the origin as a "custom origin" in CloudFront, pointing at the bucket’s website endpoint. An OAI only works with a regular S3 origin, so the OAI-related portion of the bucket policy you’ve configured to allow s3:GetObject (or similar permissions) on the bucket from your CloudFront distribution no longer has any effect.
I haven’t been able to find any public acknowledgement from AWS that this is, in fact, the case. However, a user on /r/AWS reported a similar experience and said AWS Support had instructed them to do something much like what I’m about to suggest.
A Solution
CloudFront gives you the ability to inject custom headers into its requests to your custom origin. If we set such a header’s value to some very long, very random string, we can check for that string in the bucket policy to ensure that every incoming request carries it. It’s akin to using password-based authentication. The only thing I was unsure of was whether there was a header I could inject that could also be referenced in the S3 bucket policy. I found a list in the AWS documentation covering the "condition" keys available in IAM policies. The user on /r/AWS, at the instruction of AWS Support, decided to use the "Referer" header (aws:Referer in the bucket policy). I decided to use the "User-Agent" header (aws:UserAgent in the bucket policy) instead.
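For the "very long, very random string", here is a minimal Python sketch of one way to produce it. The function name and the 48-byte default are my own assumptions for illustration, not anything AWS prescribes; the only real requirement is that the value be hard to guess.

```python
import secrets


def make_header_secret(num_bytes: int = 48) -> str:
    """Return a random, URL-safe string to use as the custom header value.

    num_bytes is the amount of randomness, not the resulting string length.
    The default of 48 is an arbitrary assumption, not an AWS recommendation.
    """
    return secrets.token_urlsafe(num_bytes)


# Generate a value to paste into both CloudFront and the bucket policy.
header_secret = make_header_secret()
print(header_secret)
```

The secrets module draws from the operating system’s cryptographic random source, which is why I’d reach for it rather than the random module here.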
Food for Thought
As with any secret-based (or similar) authentication mechanism, anyone who gets their hands on the secret also gets their hands on whatever the secret was protecting. There would be nothing preventing someone with the secret string you’ve chosen from simply pointing curl at your S3 website endpoint and passing in a doctored Referer, User-Agent or whichever header you’ve chosen.
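To make that risk concrete, here is a small Python sketch of what such a doctored request looks like. The endpoint URL (including its region) and the secret are hypothetical placeholders, and the request is only built, never sent.

```python
import urllib.request

# Hypothetical values: substitute your own website endpoint and secret.
WEBSITE_ENDPOINT = (
    "http://my-static-site.example.com.s3-website-us-east-1.amazonaws.com/"
)
SECRET = "VeryLongVeryRandomStringHere"


def build_direct_request(url: str, secret: str) -> urllib.request.Request:
    """Build (but do not send) a GET request that skips CloudFront entirely."""
    return urllib.request.Request(url, headers={"User-Agent": secret})


request = build_direct_request(WEBSITE_ENDPOINT, SECRET)

# urllib stores header names in capitalized form, hence "User-agent".
print(request.get_header("User-agent"))
```

Passing this object to urllib.request.urlopen would send it. From the bucket policy’s point of view it should look the same as a request from CloudFront, since the header value is the only thing being checked.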
You may want to consider whether it’s worth the (arguably small amount of) trouble to go through the exercise of locking down your bucket using this method. For compliance purposes it may be a requirement for you or your organization. Personally, my main concern is the logs: any request that goes straight to the bucket never shows up in your CloudFront access logs, which makes those logs less useful.
TL;DR
- Create a custom header in your CloudFront distribution called User-Agent. Set its value to a long, random string.
- Update your bucket policy to add a condition ensuring that the User-Agent header’s value is equal to the string you previously selected. For example:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Action": [
        "s3:GetObject"
      ],
      "Effect": "Allow",
      "Principal": "*",
      "Resource": [
        "arn:aws:s3:::my-static-site.example.com/*"
      ],
      "Condition": {
        "StringEquals": {
          "aws:UserAgent": "VeryLongVeryRandomStringHere"
        }
      }
    }
  ]
}
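If you would rather generate the policy than edit JSON by hand, the following Python sketch builds the same document from a bucket name and a secret. The function name is hypothetical and the values are the placeholders from the example above. Note that the resource is written in object form (the bucket ARN followed by /*), since s3:GetObject applies to objects rather than to the bucket itself.

```python
import json


def build_bucket_policy(bucket: str, secret: str) -> dict:
    """Return a policy allowing s3:GetObject only when User-Agent matches."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Action": ["s3:GetObject"],
                "Effect": "Allow",
                "Principal": "*",
                # Object ARN: every key in the bucket, not the bucket itself.
                "Resource": [f"arn:aws:s3:::{bucket}/*"],
                "Condition": {
                    "StringEquals": {"aws:UserAgent": secret}
                },
            }
        ],
    }


# Placeholder values; use your own bucket name and generated secret.
policy = build_bucket_policy(
    "my-static-site.example.com", "VeryLongVeryRandomStringHere"
)
print(json.dumps(policy, indent=2))
```

The printed JSON can be pasted into the bucket policy editor in the S3 console.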