High traffic WordPress architecture using AWS Lightsail

Here is how I built a high-performance WordPress website in AWS Lightsail for aier.org. While low-traffic blogs can be hosted on a shared hosting service or a cheap VPS, a site that serves millions of visitors each month needs a more ambitious service-oriented architecture.

The key to high-performance WordPress is a service-oriented architecture that splits the application into independent layers. Amazon provides a reference architecture for high-performance WordPress hosting on AWS. While this is a great start, all those services get expensive and complex to manage. I wanted far fewer moving parts and a setup that needs almost no maintenance. I also included important performance and management optimizations such as a dedicated editor server and git-based deployment. To lower costs, I used AWS Lightsail and Cloudflare instead of the EC2 and CloudFront services in AWS’s reference architecture.

High-performance WordPress requirements for my project:

  • Lower the cost of hosting from well over $20K/year to under $1500/year while supporting many millions of monthly users.
  • Keep backend 100% available and fast regardless of traffic.
  • Highly available and highly scalable architecture: easy recovery from failure, and ability to quickly scale without any downtime.
  • Minimal administrative management overhead (the servers should maintain themselves after I set them up).
  • Minimal configuration – the server should be set up with just a few commands: I promised to build this out in two hours.
  • Git-based deployment process. Deploy website updates via git merge.

Process

Below, I explain why I chose specific tools and configurations, then provide the technical details to help you do the same.

  1. Configure a basic WordPress hosting environment
  2. Migrate or build your WordPress site
  3. Upgrade the hosting environment for scalability
  4. Configure analytics and alerting tools

Step 1: Use environment and server management tools

For my web host, I considered two options: DigitalOcean and Amazon Web Services. I have a slight preference for the value of DigitalOcean, but my client preferred AWS, so I went with that.

Amazon Lightsail

I decided to use Amazon Lightsail rather than the more traditional AWS services such as EC2. Lightsail is a simple (and cheaper!) bundled offering for smaller projects, meant to compete with VPS services like DigitalOcean. It’s easier to get started with Lightsail and hand it over to a non-technical team, and the pricing bundle is optimized for simple websites. I was still able to use advanced AWS services when I wanted to expand beyond the scope of what Lightsail supports.

Easy Engine and WordOps

Setting up a hosting environment in Linux can be a lot of work, and there are dozens of configuration steps you should take to optimize Ubuntu, nginx, and PHP for performance.

Server management tools like Easy Engine make this easy. For this project, I used both Easy Engine 4 and WordOps 3.9.

Easy Engine or WordOps?

When Easy Engine upgraded from 3.x to 4.x, they moved their services into Docker containers, which cost them a lot of fans. Running services in Docker has certain advantages, such as being able to run on any OS that supports Docker, so that you can have a dev environment on a Mac or even Windows. However, Docker adds a lot of complexity, and when the Easy Engine team rewrote their entire code base for 4.x, they left a lot of missing features and rough edges, while reducing the level of support they provided to the community. This led to a fork of the 3.x version of Easy Engine: WordOps. Currently, both Easy Engine 4.x and WordOps are new and immature, and I hesitate to pick a clear winner.

After doing a complete configuration with Easy Engine 4.0, I decided to start over in WordOps to see the difference. Unless environment portability and service isolation are important to you, WordOps seems like a better choice for most people at this time.

Step 2: Migrate or build your WordPress site before performing advanced configuration

Before you get ambitious with building a high-performance site, it’s important to validate the basics. If you are migrating an existing site, test the migrated environment before scaling it out.

Step 3: Use managed, auto-scaling services in a service-oriented architecture

Most websites experience daily traffic cycles as people use them during the day and sign off when they go to sleep. Furthermore, websites need to be prepared for massive traffic spikes caused by media mentions or viral social media posts.

To handle the traffic spikes, we have two options: 

1: run (and pay for) the highest possible traffic level for those 1% peak traffic periods 100% of the time

2: build an auto-scaling design that automatically adds additional resources with additional load 

Obviously, we prefer #2, but it is difficult and relatively expensive to isolate and scale every part of our stack. Given a limited budget but still millions of users, I’m going to go with a hybrid approach that scales the things that are easy to scale, while isolating important components such as the back-end from high traffic.

Here is how I’m isolating each component:

  • User-facing site: AWS Lightsail load balancing
  • Editor Backend (wp-admin): dedicated editors instance, isolated from public-facing traffic, so editors can always get in
  • File storage: AWS EFS volume
  • Database: Lightsail MySQL instance (just a frontend for RDS)
  • Caching: Cloudflare and the AWS ElastiCache Redis service
  • CDN & Firewall: Cloudflare

Here is what this looks like:

Step 4: Configure analytics and alerting tools

The main tool I’m using is New Relic, which identifies WordPress performance bottlenecks. The New Relic module comes with Easy Engine, but you must configure it with your API key.

I used New Relic to troubleshoot which plugins and templates had the biggest impact on performance. I was able to quickly see that Yoast SEO was conflicting with another SEO plugin, so I disabled one of them.

WordOps comes with a pretty dashboard and a bunch of monitoring and debugging tools, including netdata:

[Screenshot: WordOps admin tools]

[Screenshot: netdata dashboard]

Detailed WordPress Build Process

I’m just going to highlight important steps that may save you some time:

1. Configure a basic WordPress hosting environment

Launch your first Ubuntu server

    • Add your ssh key to Lightsail
    • Launch new Ubuntu 18.x instance
    • Open ports 443 and 22222 (for WordOps). Don’t forget to do this for each new instance!
    • Configure static IP and bind it to your instance

Launch MySQL

    • I’m using the Amazon Lightsail database service, so I need to point WordPress at a remote DB.
    • Easy Engine allows you to specify a remote DB. WordOps has a process that did not work for me, so I created a local DB, then changed the connection settings in wp-config.php to point to the remote database.
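Switching wp-config.php from the local DB to the remote one comes down to rewriting a few `define()` lines. Here is a minimal sketch of that edit, assuming the standard `define( 'DB_HOST', '...' );` layout that WordPress generates. The helper name `set_wp_define` and the example host are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: rewrite one define() value in wp-config.php, keeping a .bak backup.
# Assumes the value contains no quotes, "|" or "&" (escape those by hand).
set_wp_define() {
  local config="$1" name="$2" value="$3"
  sed -i.bak -E \
    "s|(define\([[:space:]]*['\"]${name}['\"][[:space:]]*,[[:space:]]*)['\"][^'\"]*['\"]|\1'${value}'|" \
    "$config"
}

# Example (hypothetical path and host):
# set_wp_define /var/www/example.com/wp-config.php DB_HOST "mydb.example.internal:3306"
```

The same helper covers `DB_NAME`, `DB_USER`, and `DB_PASSWORD`. If WP-CLI is installed, `wp config set DB_HOST <host>` is an alternative that avoids editing the file by hand.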

Configure DNS and CDN in Cloudflare

The $20/month Cloudflare Pro plan is well worth it:

    • Optimize Cloudflare performance
      • I basically turn on every single feature in every single Cloudflare module. Note that Rocket Loader and Mirage add scripts that may interfere with your scripts, so test your site after adding them.
    • Optimize Cloudflare for WordPress
      • Cloudflare has specific firewall rulesets for WordPress and PHP. If you install the Cloudflare WordPress plugin, it will prompt you to turn them on.

Install Easy Engine or WordOps

Getting started is really easy.  

Here is the command I used to create my site in Easy Engine:  

ee site create aier.org --ssl=le --dbhost=[mydb].rds.amazonaws.com --dbuser=dbmasteruser --dbpass=[mypass] --dbname=[dbname] --type=wp --cache --[email protected]

and WordOps:

wo site create aier.org --wp --wpredis --letsencrypt

As you can see, WordOps requires the missing parameters to be set by editing config files.

2. Migrate or build your WordPress site

  • I migrated my site in three steps:

 1. Git pull for the code: I added my ssh key to Bitbucket and the server. 

I added a script that runs git pull to an @reboot directive in my crontab, so that instances auto-update when they start: @reboot /home/ubuntu/update.sh

 2. phpMyAdmin import for the RDS DB. phpMyAdmin actually handled a 500MB import better than a desktop MySQL client.

 3. zip/unzip for the files. I then uploaded them into EFS (see below).
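The update.sh script called from the @reboot crontab entry in step 1 is not shown in this article. A minimal sketch of what such a script could look like follows. The site directory and log file arguments are assumptions, not the paths used on this server.

```shell
#!/usr/bin/env bash
# update.sh sketch: fast-forward the site checkout to the latest remote commit.
# Usage (hypothetical paths): update.sh /var/www/example.com/htdocs /home/ubuntu/update.log

update_site() {
  local site_dir="$1" log_file="${2:-/dev/null}"
  {
    echo "update started: $(date)"
    # --ff-only refuses to merge if the server copy has diverged from the remote
    git -C "$site_dir" pull --ff-only
  } >>"$log_file" 2>&1
}

# Run only when a site directory is passed on the command line.
if [ "$#" -ge 1 ]; then
  update_site "$@"
fi
```

Using `--ff-only` is a design choice for unattended servers: a node with local changes fails loudly in the log rather than creating a merge commit. Because @reboot jobs can start before networking is ready, check the log after the first boot.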

3. Configure the WordPress architecture for scalability

Create an EFS mount point:

You want to share the uploads folder between all your WordPress instances. Previously, I used a paid AWS S3 bucket plugin, but Amazon’s EFS service is easier and more reliable.

Amazon’s Elastic File System (EFS) is a NAS solution that scales performance as you add more files. I followed this guide to mount my EFS volume in Ubuntu.  

sudo mount -t nfs4 -o nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2 [IPaddress]:/ /aier.org/wp-content/uploads

Add the mapping to /etc/fstab:

[IPaddress]:/ /aier.org/wp-content/uploads nfs4 nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2
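Since every cloned instance needs the same fstab entry, it helps to add it idempotently. The sketch below builds the entry from the mount options above and appends it only if the mount point is not already listed. The `_netdev` option and the trailing `0 0` fields are additions not present in the entry above: `_netdev` tells the system to wait for networking before mounting. The function names and example values are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: generate and append an EFS entry to an fstab file exactly once.
# _netdev is an added assumption so the mount waits for the network at boot.
EFS_OPTS="nfsvers=4.1,rsize=1048576,wsize=1048576,hard,timeo=600,retrans=2,_netdev"

# Print one fstab line for the given EFS IP and mount directory.
fstab_entry() {
  printf '%s:/ %s nfs4 %s 0 0\n' "$1" "$2" "$EFS_OPTS"
}

# Append the entry unless the mount directory already appears in the file.
add_fstab_entry() {
  local fstab="$1" ip="$2" dir="$3"
  if grep -qs "[[:space:]]${dir}[[:space:]]" "$fstab"; then
    return 0
  fi
  fstab_entry "$ip" "$dir" >>"$fstab"
}

# Example (hypothetical values; run as root against the real file):
# add_fstab_entry /etc/fstab 172.26.5.10 /var/www/example.com/htdocs/wp-content/uploads
```

After editing the real /etc/fstab, `sudo mount -a` followed by `mountpoint -q <dir>` confirms that the entry mounts cleanly before you reboot or clone the instance.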

Once the first instance was complete, I cloned it twice and named the three instances web1, web2, and admin1. Then I bound static IPs in Lightsail and assigned DNS names to them in Cloudflare DNS.

Note: I realized that I can store the WordPress code in EFS as well.  I plan to try this in the next revision.

Using fail2ban in a load-balanced environment:

24 hours after switching to load-balanced WordPress, the load-balancer detected both nodes as offline!

WordOps configures fail2ban aggressively to prevent DDoS attacks. fail2ban interpreted the load balancer’s health checks as a DDoS attack and blocked it.

I added the load balancer’s IP range to /etc/fail2ban/jail.local:

ignoreip = 172.26.0.0/16
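An alternative to editing jail.local by hand on each node is a small drop-in file, which fail2ban reads after jail.local. This is a sketch of that variant rather than the change described above. The file name `lightsail-lb.local` is hypothetical, and the loopback addresses are included because setting `ignoreip` replaces the default list.

```shell
#!/usr/bin/env bash
# Sketch: write a fail2ban drop-in that whitelists the load balancer's range.
write_ignoreip_dropin() {
  local target="$1" cidr="$2"
  cat >"$target" <<EOF
[DEFAULT]
# Never ban loopback or the load balancer's private range
ignoreip = 127.0.0.1/8 ::1 ${cidr}
EOF
}

# Example (run as root, then reload fail2ban):
# write_ignoreip_dropin /etc/fail2ban/jail.d/lightsail-lb.local 172.26.0.0/16
# fail2ban-client reload
```

After the reload, `fail2ban-client status` lists the active jails, and the load balancer should mark the nodes healthy again once any existing bans are lifted.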

Sharing the Redis Cache

There are two places where you need to point to a shared Redis server if you are using multiple servers:

  • wp-config.php to share the WordPress object cache
  • WordOps comes with the nginx redis page cache module enabled.   Configure a shared cache by changing the IP address under “# Redis cache upstream” in upstream.conf
    • I am using my admin instance for this, but you can use the ElastiCache service if the load is too high
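The nginx side of this change is a one-line edit that has to be repeated on every web node. The sketch below assumes the upstream block contains a line of the form `server 127.0.0.1:6379;`. Check your own upstream.conf first, because that layout is an assumption here. The function name, file path, and IP address in the example are hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: point the nginx Redis upstream at a shared Redis host.
# Assumes the only "server ...:6379;" line in the file is the Redis upstream.
set_redis_upstream() {
  local conf="$1" host="$2"
  # Keep a .bak copy so the change is easy to revert
  sed -i.bak -E "s|server[[:space:]]+[^;[:space:]]+:6379;|server ${host}:6379;|" "$conf"
}

# Example (hypothetical path and private IP of the admin instance):
# set_redis_upstream /etc/nginx/conf.d/upstream.conf 172.26.1.10
# nginx -t && systemctl reload nginx
```

The shared Redis instance also has to listen on its private address, not only on 127.0.0.1, and port 6379 should be reachable only from the other nodes. For the object cache, the matching wp-config.php setting depends on the Redis plugin in use, so check its documentation for the host constant.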

Note: Debugging Docker in Easy Engine 4.x:
Use ee shell [site] to log in to the Docker shell and ee shell [site] --user=root to log in as root. The log files are here:
tail /opt/easyengine/sites/aier.org/logs/nginx/access.log -f
tail /opt/easyengine/sites/aier.org/logs/php/error.log
tail /opt/easyengine/sites/aier.org/logs/nginx/error.log -f

5 thoughts on “High traffic WordPress architecture using AWS Lightsail”

  1. Hello David,
    thanks for this very nice article.
    Just a small tip to avoid issues with fail2ban and the load-balancer :
    use the Nginx directive set_real_ip_from 172.26.0.0/16; to make sure Nginx will not keep the load-balancer ip in logs.

  2. Very interesting article. Finding this via Google comes at a good time for me – I’m in the process of researching and evaluating WordPress scaling solutions right now. What you’ve laid out – the resources, the links – is all very helpful to me right now, and I sincerely thank you for sharing this.

  3. Now that EFS has throughput provisioning, I’m interested to see what the consensus is to host PHP files (with local caching) to simplify the deployment of PHP nodes.

    Maybe I missed it, but everytime you update WP core, and plugins, do you re-clone the master? Also what happens when the Master updates the DB, while the Web Workers are not updated yet?

    Thanks for your article

  4. Hi David, thanks a lot for your writeup, very interesting. One question I have :
    When using Cloudflare to manage the DNS, how do you get the Lightsail load balancer to work with SSL? connection on port 80 works out the box, 443 wont allow connections without a certificate, but I cant seem to issue a certificate without changing the DNS? cant find any support online for this anywhere!
    Thanks!
    Phil

  5. Nice to see some geeks mention Lightsail for once instead of always S3 and EC2. However why keep so much bloat? You have analytics and dashboard bloat from WordOps and then more tracking from Fail2ban and New Relic, plus the dependencies like Python. You can get rid of all that with a lighter LEMP-for-WordPress script like SlickStack and use just UFW firewall which is included in Ubuntu, and the analytics provided by Amazon or third party. Or another bare bones LEMP script. Most of the WordOps add-ons are utterly pointless in your distributed setup here, anyways. Good work David…..
