I’m playing Apache config guy this weekend, and I’d forgotten what a struggle configuring mod_rewrite can be. Here are two things you always want to do when setting up a new ruleset, even (especially) if you think it’s just a quick five-minute change that couldn’t possibly require all that setup time.
Test Locally
Use your local Apache install to test the rewrite rules, with Apache configured for full rewrite logging.
<VirtualHost *:80>
...
#turn on rewrite logging
RewriteEngine On
RewriteLog /tmp/log.txt
RewriteLogLevel 9
</VirtualHost>
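You would never want this on a production server, but when you’re writing a new ruleset you need to know why the rules aren’t doing what you think: is your regular expression bad, are other rules interfering, and so on. With logging on you’ll be able to follow each pattern-matching attempt and see whether or not it results in a rewrite. (RewriteLog and RewriteLogLevel are Apache 2.2 directives. Apache 2.4 removed them; there, rewrite logging is turned on through the LogLevel directive instead, for example LogLevel alert rewrite:trace3.)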
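At level 9 the log fills up quickly, so it helps to look only at the entries produced by a single request. Here is a minimal sketch of one way to do that. It assumes the /tmp/log.txt path from the config above and that your user is allowed to truncate that file; the function name log_for_request and the localhost URL in the usage comment are made up for illustration.

```shell
# Truncate the log, run one command, then print whatever was logged meanwhile.
# Usage: log_for_request LOGFILE COMMAND [ARGS...]
log_for_request() {
    log_file=$1
    shift
    : > "$log_file"     # empty the log in place; the file itself is not removed
    "$@" > /dev/null    # run the request and discard its own output
    cat "$log_file"     # show only the entries written since the truncate
}

# Example (hypothetical URL, needs a local Apache with the logging config above):
#   log_for_request /tmp/log.txt curl -s http://localhost/some/path
```

Emptying the file in place rather than deleting it means Apache keeps writing to the same file, so there should be no need to restart the server between attempts.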
Avoid Caching Problems
The other big problem with debugging mod_rewrite rules is dealing with a browser’s desire to cache pages and redirects. Here’s a common debug scenario:
- Check a URL, get an unexpected result
- Change a RewriteRule
- Refresh the URL to see if your change did anything
The problem lies in step 3. You can never be sure if you’re making a fresh request to the server, or if your browser has decided it doesn’t really need to. Yes, technically you should be able to refresh without hitting the cache by using a modifier key, but keeping those combinations straight between reloads is error-prone at best.
Curl, the command-line URL program, doesn’t have this problem: it keeps no cache, so every invocation is a fresh request. Using the -i option will print the response headers along with the content of the URI:
curl -i http://example.com
HTTP/1.1 200 OK
Date: Sun, 31 Aug 2008 21:59:10 GMT
Server: Apache/2.2.3 (CentOS)
Last-Modified: Tue, 15 Nov 2005 13:24:10 GMT
ETag: "280100-1b6-80bfd280"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8
<HTML>
<HEAD>
<TITLE>Example Web Page</TITLE>
</HEAD>
<body>
<p>You have reached this web page by typing "example.com",
"example.net",
or "example.org" into your web browser.</p>
<p>These domain names are reserved for use in documentation and are not available
for registration. See <a href="http://www.rfc-editor.org/rfc/rfc2606.txt">RFC
2606</a>, Section 3.</p>
</BODY>
</HTML>
In addition, when it receives a 301 or 302 redirect response, curl won’t actually follow the redirect (unless you ask it to with the -L option). Instead you’ll see the redirect response itself, something like this:
curl -i http://www.alanstorm.com/
HTTP/1.1 301 Moved Permanently
Date: Sun, 31 Aug 2008 22:00:25 GMT
Server: Apache/2.2.9
Location: http://alanstorm.com/
Content-Length: 229
Content-Type: text/html; charset=iso-8859-1
<!DOCTYPE HTML PUBLIC "-//IETF//DTD HTML 2.0//EN">
<html><head>
<title>301 Moved Permanently</title>
</head><body>
<h1>Moved Permanently</h1>
<p>The document has moved <a href="http://alanstorm.com/">here</a>.</p>
</body></html>
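When a ruleset produces more than one redirect in a row, it can be handy to see the whole chain at a glance. The sketch below is one way to do it, not the only one. It reads raw response headers on standard input and keeps just the status lines and Location headers. The function name redirect_chain is made up for illustration; the curl options in the usage comment (-s, -o, -D, -L) are standard ones.

```shell
# Print only the status lines and Location headers from raw HTTP response
# headers read on standard input.
redirect_chain() {
    # HTTP headers end in CRLF; drop the carriage returns before matching.
    tr -d '\r' | grep -i -E '^(HTTP/[0-9.]+ |Location:)'
}

# Example (needs network access):
#   -s  silences the progress meter
#   -o /dev/null  discards the response body
#   -D -  dumps the response headers to standard output
#   -L  follows each redirect, so every hop's headers are dumped in turn
#
#   curl -s -o /dev/null -D - -L http://www.alanstorm.com/ | redirect_chain
```

Each hop shows up as a status line followed by the Location it points to, ending with the status of the final response.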
Having this information is invaluable when writing mod_rewrite rules. You need to know what the server is really saying to the browser.