Over the last few weeks, we’ve been dealing with some heavy fallout from a recent server migration on our large WordPress MU installation. Moving everything from Linode to DigitalOcean should give us a more powerful toolset, lower costs, and more infrastructure flexibility, but I wish the cutover had been uneventful.
On Friday we spent the better part of our day working with the team at Reclaim Hosting to troubleshoot some issues we were having, and one day I or one of my other compatriots might feel brave enough to document that crusade in a post to enshrine the great display of teamwork that pulled off a clutch save. Or write the other post shaming us for our poor testing patterns : )
However, after the switch sat for a few days, we ran into a scenario where MySQL started to eat through the available memory on the server, all the way up to 96 GB of RAM. Obviously this was unsustainable, so Tim at Reclaim suggested swapping out MySQL for MariaDB, which is a drop-in replacement for MySQL.
Tim wired this up on our staging server, and it seemed to pretty immediately level out our database performance. However, we wanted to do some basic load testing of this new setup to make sure that performance would stay the same as our usage increased.
You can see the effect in the dramatic drop in the green part of the area chart below:
After that new usage level stayed stable for a day or so, we decided to do some load testing on both servers to test some of these assumptions. Once we got down to it, we felt like existing load testing tools didn’t give us a great picture of how our setup might get hit under live load.
For example, we use NGINX as a proxy to Apache, and NGINX does a great job of caching frequently used resources. However, since we have 30K sites on this multisite, we’re not talking about tons of traffic to a few popular pages; instead, we are looking at fresh loads on lots of different sites, which might actually impact SQL performance.
When NGINX responds with a cached page or asset, nothing actually gets through to MySQL at all.
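One way to check how much of a test run actually got past the cache is to tally a cache-status header across responses. This is only a sketch: NGINX does not send such a header by default, so it assumes the proxy config adds one (for example with add_header X-Cache-Status $upstream_cache_status;), and the header name and both function names are made up for illustration.

```python
from collections import Counter

# Hypothetical header name. NGINX only sends it if the config adds it, e.g.
#   add_header X-Cache-Status $upstream_cache_status;
CACHE_HEADER = "X-Cache-Status"


def tally_cache_status(header_dicts):
    """Count cache statuses (HIT, MISS, ...) across response header mappings."""
    counts = Counter()
    for headers in header_dicts:
        # Lowercase the keys so plain dicts and requests' headers both work.
        lowered = {key.lower(): value for key, value in headers.items()}
        status = lowered.get(CACHE_HEADER.lower(), "UNKNOWN")
        counts[status.upper()] += 1
    return counts


def backend_share(counts):
    """Fraction of responses that were not cache hits.

    Responses without the header are counted as non-hits, so this is an
    upper bound on how many requests could have reached Apache and the database.
    """
    total = sum(counts.values())
    if total == 0:
        return 0.0
    return (total - counts["HIT"]) / total
```

With the requests library, each response's r.headers could be collected into a list and passed to tally_cache_status once a run finishes.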
Since most existing load testing tools let you focus on throwing a lot of traffic at a few URLs, we decided to write up a quick testing script that would loop through a larger CSV file of our 100 most popular sites as per our analytics data.
You can find a link to the GitHub repo here, but I posted the meat of the code below:
import csv
import requests
import chardet
import time


def check_traffic():
    # Loop forever; stop the test with Ctrl+C.
    while True:
        # Detect the CSV's encoding first so it can be opened as text.
        with open('./page_urls.csv', 'rb') as csv_file:
            result = chardet.detect(csv_file.read())
        with open('./page_urls.csv', 'r', encoding=result['encoding']) as encoded_file:
            rows = csv.reader(encoded_file)
            for row in rows:
                # The first column holds the site path to append to the base URL.
                url = "http://staging.rampages.us/" + row[0]
                print("getting url: " + url)
                r = requests.get(url)
                res = {"url": url, "status_code": r.status_code, "test": r.text}
                print(res)
                # Brief pause between requests.
                time.sleep(.2)


check_traffic()
This quick script did a great job of getting a ton of requests past the NGINX caching on an initial pass, which let us evaluate the performance metrics we were interested in on the backend. However, this was a pretty manual process: logging into the server via SSH, running the top or htop command, kicking off the load test from a local command line, then monitoring the top output in a separate terminal window.
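Some of that terminal-watching could be replaced by a small logger running on the server. The sketch below is not what we ran. It assumes a Linux host where /proc/meminfo is available, it reports memory for the whole machine rather than for the database process alone, and all three function names are made up for illustration.

```python
import time


def parse_meminfo(text):
    """Parse /proc/meminfo-style text into a dict of {field: number}."""
    values = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        parts = rest.split()
        # Most fields look like "MemTotal:  16384256 kB"; a few have no unit.
        if sep and parts and parts[0].isdigit():
            values[name.strip()] = int(parts[0])
    return values


def used_fraction(meminfo):
    """Share of RAM in use, based on the MemTotal and MemAvailable fields."""
    total = meminfo["MemTotal"]
    return (total - meminfo["MemAvailable"]) / total


def log_memory(path="/proc/meminfo", interval=5.0, samples=3):
    """Print a timestamped memory reading every `interval` seconds."""
    for i in range(samples):
        with open(path) as handle:
            info = parse_meminfo(handle.read())
        stamp = time.strftime("%H:%M:%S")
        print(f"{stamp} used={used_fraction(info):.1%}")
        if i < samples - 1:
            time.sleep(interval)
```

Redirecting the output of log_memory() to a file while the load test runs would leave a record to compare between runs, instead of a reading that scrolls away in top.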
While this was helpful for our particular purposes, it really highlighted to us how far some of the load testing tools need to grow in terms of their usefulness, especially for the non-standard WP setup, once you get outside of caring about latency at a single URL.
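As a rough idea of what looking beyond a single URL could mean, timings can be grouped by site rather than by page. This sketch assumes each request's path and duration were recorded during the run (with the requests library, r.elapsed.total_seconds() gives the duration) and that the first path segment identifies the site, as on a subdirectory multisite. The function names are hypothetical.

```python
import statistics
from collections import defaultdict


def site_of(url_path):
    """Return the first path segment, treated here as the site slug."""
    return url_path.strip("/").split("/")[0]


def summarize_by_site(samples):
    """Group (path, seconds) pairs by site.

    Returns {site: (request_count, median_seconds, worst_seconds)}.
    """
    by_site = defaultdict(list)
    for path, seconds in samples:
        by_site[site_of(path)].append(seconds)
    return {
        site: (len(times), statistics.median(times), max(times))
        for site, times in by_site.items()
    }


def slowest_sites(summary, n=5):
    """Site slugs ordered by median latency, slowest first."""
    return sorted(summary, key=lambda site: summary[site][1], reverse=True)[:n]
```

A report like this points at which sites are slow on a fresh load, which is closer to the question we cared about than a single latency number.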
In our wrap-up discussion, we talked about where we think load testing for WordPress should head in the future:
Overall, this has been an exciting foray into a part of development I’ve never really touched, but at the end of the day I’m left feeling disappointed by the existing options out there for load/performance/stress testing, especially when related to WordPress. I’m interested in how other people in the community handle this. I’m sure I’ve overlooked more capable tools, or sussed out problems that don’t exist for the community at large.