Magento, Varnish, and Turpentine
Today we’re going to take a break from Magento 2 and talk about using Varnish as a tool for improving the performance and scalability of your Magento 1 system. While Magento 2 is on the mind of anyone connected to the Magento ecosystem, it’s important to remember that eBay, eBay Enterprises, and whatever the spun-off company will call itself have promised a long, gradual end-of-life period for Magento 1. I expect most working Magento programmers will use both Magento 1 and Magento 2 for the next few years.
The past few gigs I’ve been on have involved configuring Varnish for use with Magento. Some of these were for scaling up traffic during email campaigns, others were a duct tape situation to work around poorly performing category pages due to inefficient extension programming. This article isn’t quite a tutorial. It’s more an overview of Varnish and the Nexcess Turpentine extension for Magento, i.e. the high-level overview I wish I’d had when I first started using Varnish and Magento.
Also, as long as I’m here, I’ve got some open time up on my technical consulting/freelancing schedule. If you’re looking for some help with performance improvement on your Magento system, a complicated problem that someone like me could help solve, or even some boring bread and butter middleware business programming, reach out and say hi. I charge a fixed cost over time, and compared to a full-blown agency my rates are surprisingly affordable. For smaller merchants on a fixed budget, I’m always open to negotiating rates if you’ve got a straightforward problem to solve.
Self promotion done, let’s get to Varnish and Turpentine!
A Caching Proxy
Varnish, the computer program, fits in nicely with the old school unix philosophy that says programs should do one thing, and do it well. Varnish’s one thing is “acts as a caching HTTP proxy”. Varnish’s one thing is not “easily integrate with existing web applications”. There are several challenges you’ll need to overcome to make Varnish work with any web application.
First, you’ll need to change your web server configuration. Normally, it’s your web server that listens on port 80 for incoming requests. If you’re running Varnish, it’s Varnish that listens on port 80. Varnish basically becomes your web server. When Varnish needs to ask your application (in our case, Magento) for a page, Varnish will communicate with the application over a different network port. This means you need to set up your web server (Apache, nginx) to listen on a different port. This varies from web host to web host, and dev ops engineer to dev ops engineer, but 8080 is a common port to use in this situation.
Also, Varnish doesn’t support SSL/HTTPS on its own. This means one of two things
- Varnish will only cache http pages, and https pages will bypass the cache
- You’ll need to set up your web server, or another proxy, to listen on port 443 (the standard SSL/HTTPS port), and (in the case of the proxy) forward those requests on to Varnish.
All of these problems are easy to overcome if you own the entire stack. However, because so many smaller ecommerce store owners don’t own the entire stack, they’re stuck with what their hosts offer them. From what I’ve seen in the wild, it’s typical for Magento systems to opt out of using Varnish for HTTPS pages.
The other issue with Varnish is application state: things like a user’s personal information (name, age, etc.), any application state that’s calculated based on the same, or any application state that varies throughout the day based on time or other factors. Varnish is a very efficient caching proxy, but it’s efficient because it completely skips loading any of that stateful information.
In the world of “big tech” startups, this isn’t immediately a problem. If you consider the sort of systems Varnish grew up around, they’re applications meant to attract the maximum number of random users in the world (stateless users) and either convert some small percentage of them into paying (stateful) customers, or show them ads (ads served by a <script/> tag), whose implementation details aren’t the startup’s responsibility.
The problem with ecommerce systems is, no matter the level of traffic you’re serving, every user is a stateful user because every user can add items to a shopping cart. In addition to this, an ecommerce page is chock-full of stateful information: product prices that vary based on promotions, the aforementioned shopping cart information, CSRF-protecting form keys, etc.
Varnish has tools for dealing with this sort of state, but it’s an order of magnitude more difficult to configure Varnish to gracefully handle it. Another order of magnitude of difficulty is added because Magento wasn’t developed with Varnish proxy caching in mind.
Again, nothing that can’t be overcome, but if you’re used to a world where “enabling Varnish” means
sudo apt-get install varnish
sudo /etc/init.d/varnish start
you’re in for a rude awakening.
What is Turpentine
Turpentine is an open source Magento module that will help you implement Varnish on your system. The PHP web hosting company Nexcess controls its development.
Turpentine attempts to do a number of different things, so we’re going to cover each below. In short though, Turpentine wants to completely manage Varnish for you, and provides systems for
- Generating Varnish Configuration Files
- Communicating with Varnish via socket connections
- Swapping out stateful Magento blocks with edge side includes, or ajax includes
Turpentine: Configuration Generator
First off, Turpentine is a Varnish configuration generator. The Turpentine extension will, at the request of the Magento admin user, generate a Varnish VCL file. Files are generated based on configuration settings in
System -> Configuration -> Turpentine -> Varnish Options
System -> Configuration -> Turpentine -> Caching Options
and the base templates
app/code/community/Nexcessnet/Turpentine/misc/version-2.vcl
app/code/community/Nexcessnet/Turpentine/misc/version-3.vcl
app/code/community/Nexcessnet/Turpentine/misc/version-4.vcl
The .vcl files that ship with Turpentine are not valid configuration files on their own. Instead, they’re base Varnish configuration files that also contain simple string template variable placeholders that look like this
{{custom_vcl_include}}
The Turpentine extension will replace these placeholders with actual values, based on the above mentioned configuration settings.
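If placeholder templating is new to you, here’s a minimal sketch of the idea. This is not Turpentine’s actual generator code. The renderVclTemplate function is a hypothetical helper, and the backend_host and backend_port placeholder names are my own assumptions for illustration. Only custom_vcl_include comes from the template shown above.

```php
<?php
// Hypothetical helper: fills {{name}} placeholders in a template string.
// Illustrative only; this is not how Turpentine itself is implemented.
function renderVclTemplate($template, array $values)
{
    return preg_replace_callback(
        '/\{\{([a-z0-9_]+)\}\}/i',
        function ($matches) use ($values) {
            $key = $matches[1];
            // Leave unknown placeholders untouched so they are easy to spot.
            return array_key_exists($key, $values) ? $values[$key] : $matches[0];
        },
        $template
    );
}

// backend_host and backend_port are assumed names, not real Turpentine placeholders.
$template = "backend default {\n"
    . "    .host = \"{{backend_host}}\";\n"
    . "    .port = \"{{backend_port}}\";\n"
    . "}\n"
    . "{{custom_vcl_include}}\n";

$vcl = renderVclTemplate($template, array(
    'backend_host'       => '127.0.0.1',
    'backend_port'       => '8080',
    'custom_vcl_include' => '',
));

echo $vcl;
```

Leaving unknown placeholders in place (rather than silently blanking them) is a deliberate choice in this sketch: a stray {{...}} in the output makes a missing setting obvious.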
As an admin user, you can tell Turpentine to generate a new Varnish configuration from the
System -> Cache Management
page. Turpentine adds a few buttons to this page.
The Save Varnish Config button will let you save the Varnish configuration to a location on your server. By default this location is var/default.vcl, but is configurable in the Varnish Options configuration section mentioned above.
The Download Varnish Config button will let you download the current configuration to your computer.
In addition to the two buttons above, saving any System Configuration values (in the Varnish Options and Caching Options sections) will automatically save the Varnish configuration to the var/default.vcl file (or the configured location if you’ve changed it) and (getting a bit ahead of ourselves) automatically apply the configuration to the running Varnish instance. You can turn this behavior off via the
System -> Configuration -> Turpentine -> Varnish Options -> Apply VCL on Config Change
setting. If you’ve turned this behavior off, you’ll need to manually tell Turpentine to apply the new configuration file. You can do this via the
System -> Cache Management
page, specifically by clicking the Apply Varnish Configuration button. This will apply the file from var/default.vcl (or the configured path location if you’ve changed it).
One important thing to note about this configuration generating: depending on your hosting setup, Varnish may or may not read var/default.vcl if it needs to restart. Magento specific hosting partners (like Nexcess) will often set up their varnishd/init.d varnish services to start using the file at /path/to/magento/var/default.vcl. A generic VPS host will almost certainly not do this. Knowing how to restart Varnish and apply Turpentine’s configuration should be one of the first things you figure out when taking over an existing Magento system.
The Binary Language of Moisture Vaporators
As hinted at above, the second thing Turpentine does is talk to running Varnish instances. Things like
- Please apply this new configuration file
- Please tell me which version you are so I can generate the new configuration file
- Please remove URLs matching this pattern from your cache
Rather than use the command line tools that ship with Varnish, Turpentine talks directly to the running Varnish instance over a socket connection. You can see where Turpentine makes a connection using fsockopen right here
#File: app/code/community/Nexcessnet/Turpentine/Model/Varnish/Admin/Socket.php
protected function _connect() {
    $this->_varnishConn = fsockopen( $this->_host, $this->_port, $errno,
        $errstr, $this->_timeout );
    //...
}
By directly communicating with Varnish via sockets, Turpentine developers don’t need to worry about which specific Varnish command line tools are installed, and they avoid the usual litany of permissions issues that arise when running shell commands from a PHP instance.
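To give a feel for what “talking over a socket” involves, here’s a rough sketch of reading one reply from Varnish’s admin interface. It is not Turpentine’s code. It assumes the admin protocol’s reply format as I understand it: a header line with a three digit status and a body length, then the body, then a trailing newline. The function names are hypothetical, and the demo reads from an in-memory stream so it runs without a Varnish server.

```php
<?php
// Hypothetical helper: reads one admin-interface reply from an open stream.
// Assumes a header line of "<status> <length>" followed by <length> bytes
// of body and a terminating newline.
function readVarnishCliResponse($conn)
{
    $header = fgets($conn);
    if ($header === false || sscanf($header, '%3d %d', $status, $length) !== 2) {
        throw new RuntimeException('Malformed response header');
    }
    $body = '';
    while (strlen($body) < $length) {
        $chunk = fread($conn, $length - strlen($body));
        if ($chunk === false || $chunk === '') {
            throw new RuntimeException('Connection closed mid-response');
        }
        $body .= $chunk;
    }
    fgets($conn); // consume the newline that terminates the body
    return array('status' => $status, 'body' => $body);
}

// Hypothetical helper: sends one command line, then reads the reply.
function varnishCliCommand($conn, $command)
{
    fwrite($conn, $command . "\n");
    return readVarnishCliResponse($conn);
}

// Demo against a canned reply. Against a real server you would pass the
// resource returned by fsockopen() instead (host and port are whatever
// your varnishd admin interface is configured to use).
$fake = fopen('php://memory', 'w+');
fwrite($fake, "200 13      \nhello varnish\n");
rewind($fake);

$response = readVarnishCliResponse($fake);
echo $response['status'] . ': ' . $response['body'] . "\n";
```

A real client has more to handle than this. For example, a server started with a secret file will first answer with an authentication challenge rather than a plain success reply, and the sketch above ignores that entirely.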
You’ll configure which Varnish servers (IP address:port) Turpentine talks to at
System -> Configuration -> Turpentine -> Varnish Options -> Servers
By default, Turpentine will issue most of its commands to all the servers configured here. The one exception is when Turpentine is generating a new VCL and needs to ask Varnish which version to generate. In this case, Turpentine uses the first server configured. Your main takeaway here should be
Don’t try running different versions of Varnish if you’re in a multiple frontend/app server architecture
Stateful Includes
The third thing Turpentine does is provide a system for including stateful information on each page. Things like “how many items are in the cart”, “what’s the logged in user’s name”, etc.
From a high level, Turpentine ensures any initial page rendered for the Varnish cache (and therefore served from the Varnish cache) will have stateful Magento layout blocks stripped out, and replaced by either
- An ajax include
- An edge side (ESI) include
ESI includes are a special Varnish feature. If you render a page for Varnish that looks something like this
<esi:include src="/path/to/some/url"/>
Varnish will, each time the page is served from cache, make an HTTP request to the src URL and replace the <esi ... /> tag with the contents it finds.
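Varnish does this work itself, inside the proxy, but a toy model in PHP may make the mechanics clearer. The expandEsiIncludes function below is hypothetical, and the $fetch callback stands in for the HTTP request Varnish would make. The fragment it returns is made up for the example.

```php
<?php
// Toy model of ESI processing: replaces each <esi:include src="..."/> tag
// with whatever $fetch returns for that URL. This only illustrates the
// idea; it is not how Varnish is implemented.
function expandEsiIncludes($cachedHtml, $fetch)
{
    return preg_replace_callback(
        '/<esi:include\s+src="([^"]*)"\s*\/>/',
        function ($matches) use ($fetch) {
            return call_user_func($fetch, $matches[1]);
        },
        $cachedHtml
    );
}

// The cached page: identical for every visitor.
$cachedPage = '<div id="header"><esi:include src="/path/to/some/url"/></div>';

// Stand-in for the per-request HTTP call back to the application.
$fetch = function ($url) {
    $fragments = array('/path/to/some/url' => '<span>Cart (2)</span>');
    return isset($fragments[$url]) ? $fragments[$url] : '';
};

echo expandEsiIncludes($cachedPage, $fetch);
// prints: <div id="header"><span>Cart (2)</span></div>
```

The point to notice is that the expensive full page comes from cache, and only the small stateful fragment is fetched per request.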
Stateful blocks are “stripped out” by a core_block_abstract_to_html_before observer method that replaces the normal block contents with either the ESI include or JavaScript code that fetches stateful content via ajax. The ESI or ajax code will point to a Turpentine provided controller (controllers/EsiController.php) action method endpoint
/turpentine/esi/getBlock/...parameters...
/turpentine/esi/getFormKey/...parameters...
As a system owner, you configure which blocks are or aren’t stateful by assigning an esi_options data parameter (setEsiOptions) to the block object.
You can configure which include method (ajax or ESI) your system uses on a block-by-block basis in the esi_options data.
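Here’s a simplified sketch of that decision, outside of Magento. Everything in it is a stand-in: SketchBlock mimics just enough of a block to carry esi_options, renderForVarnish is a hypothetical function rather than Turpentine’s observer, and the 'method' key with its 'esi' and 'ajax' values is my assumption about how the choice might be expressed. The endpoint URL is passed in without its parameters.

```php
<?php
// Minimal stand-in for a Magento block. Real blocks get setEsiOptions()
// and getEsiOptions() from magic data accessors; here they are spelled out.
class SketchBlock
{
    private $_data = array();
    private $_html;

    public function __construct($html)
    {
        $this->_html = $html;
    }

    public function setEsiOptions(array $options)
    {
        $this->_data['esi_options'] = $options;
        return $this;
    }

    public function getEsiOptions()
    {
        return isset($this->_data['esi_options']) ? $this->_data['esi_options'] : null;
    }

    public function toHtml()
    {
        return $this->_html;
    }
}

// Hypothetical renderer: blocks without esi_options render normally,
// blocks with esi_options are swapped for an include placeholder.
function renderForVarnish(SketchBlock $block, $blockUrl)
{
    $options = $block->getEsiOptions();
    if ($options === null) {
        return $block->toHtml(); // stateless: safe to cache as-is
    }
    // The 'method' key is an assumed name for this sketch.
    $method = isset($options['method']) ? $options['method'] : 'esi';
    if ($method === 'ajax') {
        return '<div data-ajax-src="' . htmlspecialchars($blockUrl) . '"></div>';
    }
    return '<esi:include src="' . htmlspecialchars($blockUrl) . '"/>';
}

$url = '/turpentine/esi/getBlock/'; // parameters omitted

$footer = new SketchBlock('<p>Footer links</p>');
$cart   = new SketchBlock('<p>Cart (2)</p>');
$cart->setEsiOptions(array('method' => 'esi'));

echo renderForVarnish($footer, $url) . "\n";
echo renderForVarnish($cart, $url) . "\n";
```

The useful property is that the cached page never contains the cart’s real contents, only a pointer to where they can be fetched for the current visitor.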
Tying the Room Together
Finally, the fourth, and most important, thing Turpentine does is provide base Magento systems with a default caching strategy. As mentioned, Varnish isn’t a turnkey solution, and an ecommerce system presents a number of challenges. Via the default choices in the Varnish VCL templates, and the default categorizing of blocks as stateful in a layout update XML file
app/design/frontend/base/default/layout/turpentine_esi.xml
Turpentine attempts to create a caching strategy that will maximize Varnish cache hits, while ensuring stateful information is still displayed to the end users. This includes things like
- Telling Varnish (via VCL configuration) to skip certain URLs (like /admin)
- Attempting to pre-generate (via VCL configuration) a frontend cookie session ID to avoid initial user cache misses
- Ensuring the frontend session ID is correctly passed along between Varnish and the backend web server
- Ensuring the CSRF-protecting formKey is correctly generated
- Picking stateful blocks to ESI include based on the default Magento theme
- Optimizing HTTP headers sent by Varnish and Magento both
- etc.
This default caching strategy is a good first effort, but it can’t possibly cover every Magento system in the world. Different themes have different blocks with stateful information. Different extensions create unknown blocks with stateful information. Some systems need Varnish to generate a session ID on an initial user visit due to extremely ill performing page load times, others can afford a miss and allow Magento to generate the session ID.
Unless you start your development cycle with Turpentine and Varnish, it’s doubtful that you’ll be able to just drop the extension into place and go to town. Where Turpentine distinguishes itself is in the set of tools it provides to craft your own Varnish caching strategy. You can discard its VCL, but still use its ESI includes. The ESI include system is robust, and the registry_keys/dummy_blocks feature ensures you can render almost any individual block separate from the full Magento layout.
In the past year or so of (heavily) using Varnish and Magento, I’ve hit numerous blockers that threatened to turn the project into a mini-death march, only to find that Turpentine already had a solution. Beyond solving problems you may have now, Turpentine is the rare Magento 1 extension whose development is ongoing.
In short, if you’re planning to use Varnish as part of your performance and scalability strategy, Turpentine is a must-have extension. I can’t recommend it highly enough.