Boosting Django's performance with Nginx reverse proxy cache

This is part 1 of 3 in a series of blog posts:

  1. Boosting Django's performance with Nginx reverse proxy cache
  2. Manually invalidate URLs cached in Nginx's reverse proxy cache
  3. Invalidate the whole Nginx reverse proxy cache in production

Your website is lame. The first thing you see when you point your browser to the website is a blank screen. And then, after a small eternity, the site is finally rendered.

Before we solve this problem and make your website blazingly fast, let's assume that your website setup looks as follows:

You have a front-end Nginx server. This is the front line to your website: every user request goes through this Nginx. Nginx serves static files (like stylesheets, JavaScript files and images) and proxies dynamic requests to the application server. We assume that your application server is Gunicorn running a Django application (but it could be anything; it does not matter for our case). Your application server queries a database, assembles an HTML document and serves it to the user through Nginx.

The journey of a user request looks something like this:

A request's journey from the client through Nginx to Gunicorn, into the database, and all the way back to the client.

Mike Haertel (the author of GNU grep) once said:

“The key to making programs fast is to make them do practically nothing”

Our application (as described above) does a lot. Therefore it is slow. So we need to make it do less, and that is exactly what Nginx as a reverse proxy cache gives us.

How the reverse proxy cache works

When a request arrives at Nginx, it checks whether it has a cached response matching the request. If it does, it serves the cached response to the user. If not, it forwards the request to the application server. The application server returns the response, Nginx stores the response in its cache and serves it to the user.

Every cached response is stored in a separate file in the file system.

Base configuration

In the http block of your Nginx configuration you specify the base configuration of your cache:

proxy_cache nginx_cache;  

First you define the proxy_cache directive and assign a shared memory zone where Nginx should store the keys for the cache. In this case the zone is called nginx_cache.

proxy_cache_path /data/nginx-cache levels=1:2 max_size=10g inactive=1y keys_zone=nginx_cache:40m;  

Then you define the proxy_cache_path. You specify the path and other parameters of the cache. With the levels parameter as in our example you will get cache files like /data/nginx-cache/c/29/b7f54b2df7773722d382f4809d65029c (notice that the directory names come from the last three characters of the file name). You also specify the maximum size of the cache (10 gigabytes) and the size of the keys_zone (a one-megabyte zone can store about 8 thousand keys). Cached items that are not accessed within the time specified by the inactive parameter are removed from the cache regardless of their freshness.
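To make the naming scheme concrete, here is a small Python sketch of how Nginx derives the on-disk path for a cached item: the file name is the MD5 hex digest of the cache key, and with levels=1:2 the first directory level is the last character of the digest, the second level the two characters before it. The cache directory and the example key are illustrative assumptions.

```python
import hashlib
import os

def nginx_cache_file(cache_dir: str, key: str) -> str:
    """Reconstruct the cache file path Nginx uses with levels=1:2."""
    digest = hashlib.md5(key.encode()).hexdigest()
    level1 = digest[-1]       # last character of the digest
    level2 = digest[-3:-1]    # the two characters before it
    return os.path.join(cache_dir, level1, level2, digest)

# Example: the cache key built from "$scheme://$host$request_uri"
print(nginx_cache_file("/data/nginx-cache", "http://example.com/blog/"))
```

This also explains the file name in the example above: for a digest ending in ...029c, Nginx stores the file under /c/29/.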

proxy_temp_path /data/nginx-cache-temp 1 2;  

The proxy_temp_path directive specifies the temporary directory where a cache item is created. After it is created, it is moved to the real cache directory. The two trailing parameters (1 2) define the same directory structure as the levels=1:2 parameter of the proxy_cache_path option.
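Putting the base directives together, the http block might look like this sketch (paths and sizes are the ones used in the examples above):

```nginx
http {
    # Cache storage: two-level directory tree, at most 10 GB on disk,
    # 40 MB of keys in shared memory, items evicted after 1 year idle.
    proxy_cache_path /data/nginx-cache levels=1:2 keys_zone=nginx_cache:40m
                     max_size=10g inactive=1y;

    # Temporary files use the same directory structure (1 2 matches levels=1:2).
    proxy_temp_path /data/nginx-cache-temp 1 2;

    # Use the zone defined above for caching proxied responses.
    proxy_cache nginx_cache;
}
```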

Configuration of the cache

Put the configuration of the cache itself into a location block, so you can specify which URLs to cache. (You probably have some URLs that should not be cached.)

proxy_cache_key "$scheme://$host$request_uri";  

Define a key for caching. See proxy_cache_key.

proxy_cache_valid 200 6M;  

Specify which responses should be cached and for how long. In this case only responses with an HTTP status of 200 (OK) are cached. As our pages do not change a lot, we cache them for six months (6M). See proxy_cache_valid.

proxy_ignore_headers Cache-Control Set-Cookie;  
proxy_hide_header Set-Cookie;  

Nginx ignores certain headers sent by the application server (proxy_ignore_headers). It also does not pass the Set-Cookie header on to the client (proxy_hide_header). This is necessary because on our site you can log in on every page. Otherwise the Set-Cookie header with our session cookie could end up in the cache, and every user would be logged in as the user who happened to be logging in when the response was cached.
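A complementary safeguard, not part of the original setup but worth considering, is to bypass the cache entirely for users who already carry a session cookie, so logged-in users always get fresh responses. This sketch assumes Django's default session cookie name, sessionid:

```nginx
# Requests with a session cookie go straight to the application server,
# and their responses are not stored in the shared cache.
proxy_cache_bypass $cookie_sessionid;
proxy_no_cache $cookie_sessionid;
```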

proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;  
proxy_set_header X-Real-IP $remote_addr;  
proxy_set_header Host $http_host;  

We also set some headers on the proxied request so the application server sees the original host and the real client IP.

add_header X-Cached $upstream_cache_status;  

Finally, add a header called X-Cached to the response that shows whether it was a cache hit or a cache miss. This is great for debugging.
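Taken together, the location block from this section might look like the following sketch (the upstream name django_app is a placeholder for your Gunicorn upstream):

```nginx
location / {
    proxy_cache nginx_cache;
    proxy_cache_key "$scheme://$host$request_uri";
    proxy_cache_valid 200 6M;

    # Keep per-user data out of the shared cache.
    proxy_ignore_headers Cache-Control Set-Cookie;
    proxy_hide_header Set-Cookie;

    # Forward the original host and client IP to the application server.
    proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    proxy_set_header X-Real-IP $remote_addr;
    proxy_set_header Host $http_host;

    # HIT, MISS, EXPIRED, etc. -- handy for debugging.
    add_header X-Cached $upstream_cache_status;

    proxy_pass http://django_app;
}
```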

Voilà, now you have your reverse proxy cache set up. Just restart Nginx and see the performance improvements! (In our case the response time dropped by 62% on a cache hit.)

Conclusion

Setting up reverse proxy caching is not really hard, but as always the devil is in the details. We had problems with users being logged in as other users because the Set-Cookie header was cached. Be sure to test your setup thoroughly!

But if everything runs fine, it will boost your page rendering time, and the load on your application server and your database will drop.

Another really nice side effect: if your application server crashes, Nginx still serves all your cached pages.

If you have enough RAM, you could place the cache on a RAM disk to make it even faster.
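One way to do that is to mount the cache directory as tmpfs, for example with an /etc/fstab entry like this sketch (the size is an assumption; pick one that fits your RAM and your max_size):

```
# Keep the Nginx cache in RAM; contents are lost on reboot,
# but Nginx simply repopulates the cache on demand.
tmpfs  /data/nginx-cache  tmpfs  size=2g,mode=0755  0  0
```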

Further reading

This is a blog post in a series of blog posts about the Nginx reverse proxy feature:

  1. Boosting Django's performance with Nginx reverse proxy cache
  2. Manually invalidate URLs cached in Nginx's reverse proxy cache
  3. Invalidate the whole Nginx reverse proxy cache in production