Setup AWS CloudFront for WordPress | Scaling this Blog

Over the past few years, this site has gone through some pretty substantial changes. After getting hacked on BlueHost sometime in 2016, I reworked the entire site architecture using hand-rolled resources on AWS. While this taught me a lot, it has been increasingly expensive and time-consuming.

To keep my costs down, I’m really relying on some of the smallest hardware that AWS provides. When I made the call to switch, I figured it would be awhile before I outgrew my current setup, but as a result of a doubled-down focus on creating more blog content, my monthly traffic has increased a little more than 3X in the last year.

bar chart of traffic to my blog increasing on the last year

While this is a great thing all around, it also provides some pretty significant challenges given my desire to focus my development time and efforts on things other than this site. I would say I live in pretty constant fear of my server going down, which it tends to do at least once every two weeks or so. Luckily, Jetpack lets me know, I restart Apache, and things return to normal until Apache processes spawn out of control again.

You can even see some of these craters in my traffic where I may have taken a little while to get everything running again.

an image of a message from Jetpack letting me know I failed

Obviously, this is an issue for anyone claiming to be a web developer. I know there are likely a lot of things I could do to performance tune Apache to handle this, or I could switch wholesale to something like NGINX, but I figured since it is 2018 at the time of writing this post, my stuff should really be in a CDN of some kind.

So, to partly solve my server issues and add another AWS tool to my tool belt, this post will be a step-by-step guide to setting up a CloudFront distribution for my WordPress site.

Change Roadmap: Adding a CDN

So, before we start changing stuff, it worth taking a quick overview of what we have now and what we will change as we add in CloudFront.

cloudfront-diagrams

 

All of the DNS management for this domain is done using Route 53, so the DNS talks directly to an application server hosted on a Linux EC2 instance running WordPress. The MySQL database is decoupled from the server using the RDS service. Pretty standard LAMP stack.

In the new configuration, we’ll place a CloudFront distribution in between the DNS and the application server. All requests for content from the DNS will hit the CloudFront CDN first and serve a copy of that content until it needs to pull content from the application server as it becomes new or something has changed.

Since most of my content changes little after publication, aside from the occasional comment, this should greatly reduce the amount of load my server experiences at any one time. In theory, this will also offer the base for some pretty significant performance gains down the road. Since there are CloudFront caches around the world, my readers will always get the closest copy of the content.

So, let’s get started.

Creating a CloudFront Distribution

First, we’ll need to create a distribution in CloudFront. Basically, we can consider a distribution as a bundle of files pulled from our server, or our origin, that are stored in edge locations around the world.

After navigating to the CloudFront service menu in the AWS console, we’ll go ahead and create a distribution. 

AWS console menu for creating a distribution in CloudFront

From here, we’ll need to decide between the two types of possible distributions: Web and RTMP. Since we’re serving a website site, Web will be our natural choice here: 

AWS console screen allowing us to choose between  web and rtmp distributions

Once we’ve selected the type of distribution, we’ll need to tell the distribution where our origin is and how to configure some additional basic settings. While configuring the distribution, we can define how the cache will behave at a default level, but we can and will come back later to override this default behavior at specific paths. 

Setting a Content Origin

The first step in creating our CDN involves telling the CloudFront distribution where to pull our content from. The terminology AWS uses here is origin, so that is what I’ll use from here on out. 

Origin Domain Name

First, we need to set the origin domain name, which is essentially the server or store that CloudFront will pull from when someone hits your CDN. It’s important to note that this origin cannot be an IP address, as I learned on my first way through this process. 

I specified the public DNS of my EC2 instance, but you could also pull from load balancers or S3 buckets. I also just used the name of my site as the Origin ID. 

SSL and Protocols

For the SSL protocols, I pretty much left things as a default, but for the origin protocol policy, I set it to use HTTPS only, which also activated the 443 port by default. That means that CloudFront will only communicate with my origin using HTTPS, which means that all traffic to and from my site will be encrypted.  

Timeouts and Keep-alive

Again, I just left these as the sensible defaults here. This deals with how CloudFront will handle requests to your origin should something go wrong on your server. In the future, I’m going to experiment with the keep-alive timeout number to speed up my requests to the /wp-admin path, but that will be a future blog post. 

Ports 

For this, I was able to leave the defaults set, but that is because my server is configured to responded to HTTPS requests over port 443. Since I’ve decided to have CloudFront redirect all HTTP requests to HTTPS, nothing will ever actually hit port 80. 

Managing HTTP Connections

In addition to the options above, we can also specify how CloudFront will handle different HTTP methods and headers.

 

Viewer Protocol Policy 

The first things we get to configure is our viewer protocol policy, which determines how your users will connect to CloudFront. Obviously, using HTTPS will require you to configure an SSL certificate, which we will do later.

The HTTP and HTTPS option will allow your user the option of using either protocol, while the redirect option will redirect all HTTP requests to the HTTPS version of the URL. Since this is the default behavior on my Apache server, I decided to keep this setting for CloudFront as well. 

HTTPS only on the other hand will only allow your users to connect if they type https:// into the browser address bar. In my opinion, I can’t really see a huge reason to ever use this option. 

Allowed HTTP Methods 

Since WordPress is a dynamic web application and content management system, it’s really best allow all HTTP methods on your origin, as that is the only setting that will allow you to create or update any content, login, or do anything else that would require a POST request. 

Since the results of GET and HEAD methods are already cached by default, I also turned on the caching of OPTIONS requests, which should speed up any CORS requests made to the WP API.

HTTP Headers 

 

When it comes to HTTP headers, the ideal scenario is to cache content on the fewest number of headers we need to make our application function. Some of the articles I consulted for this project recommend caching based on the Host and Options headers at a minimum, so we can go ahead and set this as our default. 

As we adjust our rules for other paths, particularly content in /wp-admin, we’ll need to add some additional whitelisted headers here to make things work appropriately. 

Object Caching

With the object caching settings, we can set the cache control headers for all of the resources served by our CloudFront distribution. Here we have two options. First, we can just allow the resource served by CloudFront to inherit the save cache headers sent by your origin server. Second, we can have CloudFront set custom headers on all of our objects. 

WordPress decides to offload most of the caching configuration to plugins according to their docs on optimization, so we’ll go ahead and specify our own custom rules as a default.

 

Cookies 

Since WordPress is a PHP-based web application, it makes extensive use of cookies to manage the session state of users. Therefore, if we expect to use our WordPress site to its full extent, we need to add support for cookies. 

However, since most of your readers won’t be authenticated in any capacity, the majority of your users will all receive the same cached copy of your content. Again, as with the HTTP Headers, instead of forwarding all cookies, we’ll create a whitelist with only the cookies WordPress sets and needs to operate. 

There is a little bit of information about the cookies WordPress uses in these docs, but I had to play around a bit to get everything working. If in doubt, open up the dev tools application panel and you can look at all of the cookies that are set every time you log in or browse your site. 

In the end, I may have been too aggressive here, but I’d rather have every piece of functionality work at 100% and make a few more trips to the origin, but that is a trade-off I’m OK with. Again, I made some additional modifications to the /wp-admin/* path in behaviors to pass all cookies since I think caching objects in the dashboard is likely a no-no. More details on that later. 

Query Strings 

There are lots of places where WordPress will use query strings in the URL, so we need to instruct CloudFront how to handle those as well. Obviously, not forwarding query strings offer the best performance, but that’s not realistic given our application.

We also have the option of creating a query string whitelist like we did with the HTTP headers and cookies, but in this case since we want all of our posts, pages, and attachments available at both the permalink and raw URL variants, its better to forward all of the query strings and cache each unique combination. 

In addition, turning on the ‘Compress Objects Automatically’ will allow us to serve gzipped files when the browser will accept them, giving us an additional performance boost. 

Default Distribution Settings

In addition to all of the settings above that we can specify about our default path, there are lots of options that we can configure for the distribution as a whole. Some of these things are larger concepts which may warrant their own articles in the future, so if you have any questions, just let me know.  

Price Class

One of the things that CloudFront lets you configure is which edge locations to use to store your distributions. An edge location is essentially a data center that stores a copy of your content, and CloudFront is smart enough to serve the user the closest copy of content geographically, which is what allows for the extremely low latency associated with CloudFront.

There are a few configurations, but I configured my distribution to use all edge locations, which is slightly more expensive but more performant. Since I have lots of traffic from India and South America, this makes sense to me. If you are a US-based business or only serve a particular geographic area, then it may make sense to only use a subset of the available edge locations. 

AWS WAF

AWS WAF stands for Web Application Firewall, which is a service that allows you to specify rules to block common attacks like SQL injections or cross-site scripting (XSS). Maybe one day I will add this to the stack as well, but I’m sure configuring this service could become a blog post on its own. 

Alternate Domain Names 

This option exists for distributions that will be served using a custom domain. Each CloudFront distribution gets its own publicly addressable domain name, but if we wan’t to tie our distribution to a domain name we own, we need to include all of the variations of the domain name we’ll use in this text box. 

SSL Certificate 

If you decide to make your distribution accessible over HTTPS, you need to configure the SSL certificate that will be used to encrypt that traffic. 

 

Additional Configuration Details  

As far as I’m concerned, none of the following details are super important to cover. In my case, and in most cases, I think you’d be fine leaving these as the AWS defaults. If all of this stuff looks good to you, go ahead and click ‘Create Distribution’ to finish off this portion of the process. 

 

Creating Behaviors for CloudFront Distributions

Ok, so at this point, you should be redirected to the CloudFront Distributions menu where you can see all of your distributions and their statuses. Under the CNAME row, we can see any DNS names associated with our distribution, and the status can tell us when the distribution has been pulled into all of the edge locations. 

CloudFront Distributions Menu

Now that we have our distribution created, we want to alter our distributions behavior at different paths. For example, the content served at the index of your site and the content served anywhere in /wp-admin require two different configurations. 

We want CloudFront to heavily cache the content our readers will request, while providing us uncached copies of our admin pages. In practicality, this typically means that CloudFront will end up requesting most of the wp-admin content from our origin server. 

To add a behavior to our distribution, click into the distribution we just created to access some additional options. From here you should see a tab for ‘Behaviors,’ which should show at least the default behavior we created when we created the distribution. List of Distribution Behaviors | CloudFront

Click ‘Create Behavior’ and add the path you want to create the behavior for. You will also be asked to specify an origin, which in this case is the same server everything else is on. But, looking forward, this would let you do some cool things like offload static content to S3 or add other AWS services as possible origins. 

Creating a Behavior for a CloudFront Distribution

 

For each behavior we create, we’ll have the opportunity to configure a subset of the options we did for our default behavior. Since all of my /wp-includes and /wp-content files are on my origin, I only needed to create the following behaviors to get everything working: 

 

There is a lot of overlap in the configurations for these three different paths, so instead of outlining everything, I’ll only highlight the differences and explain why they are import in the context of WordPress. 

/wp-login.php

Since wp-login.php is the default login page for most sites, we just want to make sure that the browser doesn’t cache a version of this page, so instead of allowing CloudFront to set cache control headers, we use the headers set by the origin server instead. 

Setting Value
Object Caching Use Origin Cache Headers
Whitelist Cookies comment_author_*
comment_author_email_*
comment_author_url_*
wordpress_*
wordpress_logged_in_*
wordpress_sec_*
wordpress_test_cookie
wp-settings-*

 

/wp-admin/*

For this path, I took a slightly more aggressive path than some of the other tutorials I read. What I’m doing here is choosing to value consistency over performance for the admin dashboard by forwarding all cookies instead of just a whitelist.

While it may not matter in the long run, my usage of my site’s admin area is low in comparison to the rest of my traffic, so this seems like a pretty reasonable trade-off. 

For the additional headers, like Referer and User-Agent, I added these after some initial things went wrong.

First, it seems like the wp-json endpoint and admin-ajax.php both needed the Referer header present to communicate with the backend. This solved some initial issues uploading media. 

Second, a recognizable User-Agent header seems to be a requirement for the visual editor to display, as per this thread. CloudFront can pass its own UA headers, but they are not standard. CloudFront also griped because I added this since its many possible values meant basically no caching, but since this is just the dashboard that I access from one machine, I’m fine with that. 

Setting Value
Object Caching Use Origin Cache Headers
Whitelist Headers Host
Options
Referer
User-Agent
Foward Cookies All


/wp-json/*

The only real difference for this path is the addition of the ‘Referer’ header. I uncovered this after receiving some error messages when trying to load and save posts with the Gutenberg editor, since it does all of those transaction via the REST API. It seems like adding the ‘Referer’ header calmed that down, but I expect some additional changes to this path after I start using Gutenberg full-time. 

Setting Value
Object Caching Use Origin Cache Headers
Whitelist Headers Host
Options
Referer
Foward Cookies All

 

Cutting Over to CloudFront

Once you have all of the distribution and behavior settings in place, the last step is to test and actually make the DNS cutover to point your domain records at the new CloudFront url. 

 If you want to do some testing, the best way is to get the IP Address of the CloudFront distribution using some method on the command line (i.e. curl, ping, or dig): 

command line program to get cloudfront ip address

From here, you can change your hosts file so that your chosen domain name resolves to this IP Address on your local machine. Once you do that, you can browse your site using you full domain name and do some testing. I’d recommend testing out both the front end and the back end, including all admin areas and plugin pages that are important to your site. 

If everything looks good, you can update your actual DNS records. Since DNS registrars differ, there are lots of places where this step will differ for different people. 

Since my records are located in AWS Route53, they make it simple by allowing me to just alias my A record to the CloudFront distribution. 

updating dns records in route 53

 

Wrapping Up 

Overall, this process took me a few days or research and a few days to implement working very intermittently. If you were determined and know a little bit about the cloud and networking, this is easily something that could be accomplished in a few hours. 

While I’m still only a few weeks out from the change, I can say nothing but good things about my decision. Everything seems to be working for me, my pages are loading faster when not logged in, and most importantly, the load on my tiny server is almost non-existent at this point. 

Overall, I estimate the cost difference for adding CloudFront to be at a maximum of $5 a month, which is really impressive considering what my additional costs would be to reserve more hardware. 

I’d be remiss if I didn’t mention a few other posts I used as a reference to get this far. You’ll find that a lot of what appears here is a recap of these other sources, some of them are a bit dated in the tech world, and WordPress has evolved a lot since 2015 or even 2017. Thus, I was left to figure out some of the REST API settings on my own using the rest of the wider internet: 

 

I'm working on building the most comprehensive course available on building workflows with Google Apps Script.

Join the Course Waiting List for a Huge Discount!

* indicates required

7 thoughts on “Setup AWS CloudFront for WordPress | Scaling this Blog”

  1. Luis Carlos says:

    How much faster your pages loaded after moving to CloudFront, and also how much the CPU usage of the ec2 instance went down?
    It does a full page cache?

    1. BrownBearWhatDoYouSee says:

      It took about a full second off the full page load, but the time to first byte went through the roof, so initial html now renders almost instantly. I wouldn’t say the average CPU went down a ton, but I’ve had no crashes since switching. My issue on the t2.micro instance was never sustained high load, but high load when 10+ connections would be on the site at the same time. The EC2 instance is now low at all times. Yes, it does a full page cache, so whatever an origin responds with for a given path, that is what will be cached. Overall highly recommend doing this if you are on AWS. Thanks for reading!

  2. Amit Dunsky says:

    Specifically with WP, the described implementation method hides some drawbacks that can impose [potentially substantial] maintenance overheads. I’ll describe the major three:

    1. When installing WP plugins, you need to understand the way it works, and adjust the distribution’s behavior accordingly.
    2. When updating WP and/or Plugins and/or theme versions, you need to keep track of changes, and make sure your distribution’s behavior adhere to it.
    3. Having your server as the origin means that in the event of loosing the server, you’ll also loose all files. As a consequence, when you restore your server from your last backup, you might end up with a WP site that is missing files referred to by the DB records (for example: you created a new post with images, but your last backup is prior to that post’s creation time. Assuming your DB runs on RDS, it will hold the records to that post, but your backup won’t have the images. Same goes with plugins installation). So you’ll probably need to implement EFS for files, and have the server mount it upon starting.

    There a few more attention points…
    The method this post describes is totally doable. That said, you need to consider the maintenance overheads.

    1. BrownBearWhatDoYouSee says:

      Thanks, Amit. Yes, those are all valid concerns. I have had conflicts with a few plugins that no longer work the way they are expected to, i.e. Wordfence, but that may be due to the added network layer that CloudFront introduces. I think it goes without saying that adding functionality, especially functionality that occurs outside of the /wp-admin or REST API paths, necessitates a revision of distribution behaviors. For #3, I’m not sure that downside is specific to using CloudFront. Seems like that could happen with a single-instance installation just using built-in page caching for NGINX. As you note though, using some sort of file system abstraction like EBS or EFS would help alleviate that. I think most EC2 instances try to enforce that convention by default, so it isn’t something I decided to go into detail about here.

      Another maintenance headache in regards to the set-up I outlined is renewing SSL certs for end-to-end HTTPS. At one point I had an SSL certificate expire on the origin and block CloudFront from connecting with it. Might simplify things a bit to terminate SSL at CloudFront.

      Either way, thanks for reaching out with the comment! Thanks for reading, JE

  3. slj says:

    Thanks for your sharing.
    I notice that you are using EC2 which is expensive. Have a look at AWS Lightsail, it only needs 10$ (or even 5$) a month.

    1. BrownBearWhatDoYouSee says:

      Thanks for reading! I’ve looked at Lightsail, but don’t see a ton of benefit in switching. I use EC2 reserved instances, which brings the cost of a t2.micro instance to around $5 a month anyway. You agree to buy a year worth of time, but I typically have a few small servers running at any one time, so I go through that every few months and then just buy more reserved time. Not sure if Reserved Instances would give you any benefit, but might be worth checking out if you haven’t. Thanks again for reading and commenting! Regards, JE

  4. slj says:

    oops, didn’t know about Reserved Instances, the discount rate seems great! Thanks for your info!

Leave a Reply

Your email address will not be published. Required fields are marked *