Greg Aker

Bust 'yo cache!

Filed in: Blobbing, web-dev

August 31, 2011

If you're using a CDN such as the one we have set up at EngineHosting, your cache can get "stale", since the nodes will hold on to your static assets for as long as the Expires headers you send tell them to. There are a couple of different methods we can employ to ensure that when you push a new build of your site, the CDN comes back to your origin servers and gets the right versions of your files.

I've been over this before, but a real key to success here is managing your static assets in a sane manner. I wholeheartedly recommend keeping everything in a single static or assets directory. The name doesn't matter, just consolidate!

For purposes of this example, we're going to assume your static assets are being served from http://static.example.com/, the 'origin' is http://static-origin.example.com/, and the static directory is sitting at /var/www/public_html/static/.

You always need to be sending caching headers, otherwise using a CDN like this is completely worthless, as the distributed nodes will always ask the origin server for the file. In your /var/www/public_html/static/ directory, drop in a .htaccess file with the following:

<IfModule mod_expires.c>
    ExpiresActive On
    ExpiresByType text/html "access plus 1 second"
    ExpiresByType text/css "access plus 1 month"
    ExpiresByType image/gif "access plus 1 month"
    ExpiresByType image/png "access plus 1 month"
    ExpiresByType image/jpg "access plus 1 month"
    ExpiresByType image/jpeg "access plus 1 month"
    ExpiresByType text/javascript "access plus 1 month"
    ExpiresByType application/pdf "access plus 1 month"
    ExpiresByType application/x-download "access plus 1 month"
    ExpiresByType application/x-javascript "access plus 1 month"
    ExpiresByType application/x-shockwave-flash "access plus 1 month"
</IfModule>

Blindly copying and pasting is a bad idea, so let's discuss what this does. We're setting 1 second Expires headers on HTML documents, but for CSS, JavaScript, images, etc., we set 1 month. #knowingishalfthebattle
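If you want to sanity-check what the origin is actually sending, here's a rough Python sketch that pulls max-age out of a Cache-Control header, which mod_expires emits alongside Expires. The function name and the sample header dicts are made up for illustration; in practice you'd feed it the headers from a real response.

```python
import re

def max_age_seconds(headers):
    """Return the max-age from a Cache-Control header, or None if it's missing."""
    cache_control = headers.get("Cache-Control", "")
    match = re.search(r"max-age=(\d+)", cache_control)
    return int(match.group(1)) if match else None

# mod_expires counts a month as 30 days, so "access plus 1 month"
# works out to 30 * 24 * 60 * 60 = 2592000 seconds.
css_headers = {"Cache-Control": "max-age=2592000"}   # sample data, not a real response
html_headers = {"Cache-Control": "max-age=1"}        # sample data, not a real response

print(max_age_seconds(css_headers))
print(max_age_seconds(html_headers))
print(max_age_seconds({}))
```

A CDN node that respects these headers can hang on to the CSS file for the full 30 days, which is exactly why a plain re-upload of main.css isn't enough.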

If we're updating our CSS or JavaScript often (doing it live), this will pose issues for some of your site's visitors, because the old versions of these files will already be cached on the CDN nodes.

Solution

We can do a little trickery in the .htaccess file to ensure users will always get the latest...check it out.

# Expires stuff from above is here.

RewriteEngine on
RewriteRule ^(.*\.)v[0-9.]+\.(css|js|jpg|png|pdf)$  /$1$2   [QSA,L]

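This matches any request that has a .v followed by digits (and dots) right before the file extension, and strips that piece out on the back end, so main.v2011083100.css is served from main.css. The rewrite only runs when the CDN comes back to the origin for a file it doesn't already have cached.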
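To see what that pattern does without touching a server, here's a small Python sketch that applies the same regular expression as the RewriteRule above to a few paths. The strip_version name is mine, and this only mimics the pattern matching, not everything mod_rewrite does.

```python
import re

# Same pattern as the RewriteRule, applied to the path without its leading slash.
VERSIONED = re.compile(r"^(.*\.)v[0-9.]+\.(css|js|jpg|png|pdf)$")

def strip_version(path):
    """Map a versioned path back to the real file; leave other paths alone."""
    return VERSIONED.sub(r"\1\2", path)

print(strip_version("css/main.v2011083100.css"))  # css/main.css
print(strip_version("css/main.css"))              # css/main.css (no version, no change)
print(strip_version("img/logo.v2.gif"))           # img/logo.v2.gif (gif isn't in the rule's list)
```

Note the last case: an extension that isn't listed in the rule never gets rewritten, so a versioned URL for it would 404 at the origin. Add every extension you plan to version.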

For instance, if you have a file at http://static.example.com/css/main.css, you reference it as http://static.example.com/css/main.v2011083100.css. Bump the number when you push a new build, and the CDN sees a URL it has never cached, so it fetches the latest version of your file. It's a personal preference, but I prefer a timestamp in the form of yyyymmddnn, so I can increment 99 times in a day if I'm screwing things up that badly.
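If you'd rather not bump the stamp by hand, something like this Python sketch could do it as part of a deploy. It assumes the yyyymmddnn format described above; the next_stamp function is hypothetical and not part of any tool.

```python
from datetime import date

def next_stamp(previous, today):
    """Return the next yyyymmddnn stamp: bump nn on the same day, else start at 00."""
    day = today.strftime("%Y%m%d")
    if previous.startswith(day):
        counter = int(previous[8:]) + 1
        if counter > 99:
            # nn only has two digits, so 99 is the last stamp of the day
            raise ValueError("out of stamps for today")
        return f"{day}{counter:02d}"
    return f"{day}00"

print(next_stamp("2011083100", date(2011, 8, 31)))  # second build on the same day
print(next_stamp("2011083101", date(2011, 9, 1)))   # first build on a new day
```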

If you're running Nginx, the rewrite might look like:

location /images {
    proxy_redirect off;
    # $1 already starts with the leading slash, so don't add another one
    rewrite ^(.*\.)v[0-9.]+\.(png|jpg|jpeg|gif)$  $1$2 last;
    expires max;
}

In a web application such as ExpressionEngine, you can add a custom template variable to your index.php file. Look for the $assign_to_config section, and add something like:

<?php

$assign_to_config['assets_timestamp'] = '2011083100';

Then in your template, you'd reference your assets like:

<link rel="stylesheet" href="http://static.example.com/css/main.v{assets_timestamp}.css" type="text/css" media="screen">

and you're good to go.

In Django or Tornado, you can add this to your application settings, although abstracting it a bit more might make it look nicer. I don't have a clue how you'd do this in WordPress, as I haven't looked at that application in 3-4 years. If someone who deals with that can comment below, I'll drop it in here.
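For the Python crowd, here's a minimal sketch of that "abstract it a bit more" idea in plain Python. ASSETS_TIMESTAMP, STATIC_HOST, and static_url are names I've made up for this example; neither Django nor Tornado ships a setting or helper with these names, so you'd wire it into your own settings module and templates however you like.

```python
import posixpath

# Hypothetical settings values; bump ASSETS_TIMESTAMP on every deploy.
ASSETS_TIMESTAMP = "2011083100"
STATIC_HOST = "http://static.example.com"

def static_url(path, stamp=ASSETS_TIMESTAMP, host=STATIC_HOST):
    """Build a versioned asset URL by inserting .v<stamp> before the extension."""
    root, ext = posixpath.splitext(path.lstrip("/"))
    return f"{host}/{root}.v{stamp}{ext}"

print(static_url("css/main.css"))
print(static_url("/js/site.js", stamp="2011090100"))
```

Keeping the stamp in one place means a deploy only has to change a single value, and every template that goes through the helper picks it up.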

I find this is one of the hardest transitions for a developer to make as they start working on their first busy site. Sorry, but you have a busy site; you're lucky. Deal with it. :)