Apache Sucks

I've always hated Apache. It's huge and bloated, and while it can do almost everything, it doesn't seem to excel at anything. The qualified news that nginx is now the #2 web server in the world gives me some hope. To this practitioner, Apache feels like the IE6 of the web serving world: it was popular for no good reason except the early-2000s trend toward consolidation in platforms, but nowadays there are much better alternatives available.

A particular headache is Apache's kinda-like-XML-but-not-really syntax and how it will do surprising and nonintuitive things if you're not a complete expert in the system. At work we've observed 3 different strange and unexpected behaviors from our Apache install in the past few weeks. All examples are on Apache/2.2.15 as distributed with RHEL 6.1.

The first was regarding allow/deny rules in the conf. Without going into details (largely because I'm trying to forget them :), conflicting declarations of access rules on a given directory didn't interact in an expected way. It neither replaced the old with the new, nor blended the two declarations in a predictable manner. Even with 3 of us staring at the docs and trying different things it made no sense, so we simply hacked it until it worked. That's not how things are supposed to be.
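For reference, here is a minimal sketch of how the 2.2-style access directives are documented to combine. The path and address are made up for illustration, and this is not the config that bit us. The thing to note is that the order in which the Allow and Deny lines appear in the file is irrelevant; only the Order directive decides which set is evaluated first and what the default is.

    <Directory "/var/www/example">
        # Deny rules are evaluated first, then Allow rules.
        # A client matching neither is allowed; Allow overrides Deny.
        Order Deny,Allow
        Deny from all
        Allow from 192.0.2.0/24
    </Directory>

Swapping that to "Order Allow,Deny" flips the logic: Allow rules are evaluated first, and anything that matches a Deny rule or matches no Allow rule is rejected, so with the same two lines everyone is denied.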

The second was Apache picking up the second HTTPS vhost (of many) as the default. The problem here is that if you don't declare a default vhost, Apache will grab the first one in the configuration that matches the IP:port. Fine: if it's going to grab one of them by default, it should at least grab the first or the last one. Except that it didn't take the first one (which was the web property I'm responsible for; serving that would also have been wrong, but not terrible). It took the second one (an internal system stats interface, which was a more serious security issue).

Both vhosts had "ServerName" directives, which most sane programmers without Apache experience would expect to restrict a vhost declaration to that host alone. But no, Apache still decided to take the second vhost in the config, in the process ignoring the allow/deny directives that should've prevented the outside world from seeing the internal system stats page.

We were under the gun, so rather than figure out what subtle confusion Apache was suffering from, we just removed the stats interface entirely for now. A saner result on Apache's part would be to warn at startup that you don't have any "catch all" vhost set up, and then deliver a contentless 404 for any connection that doesn't match.
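A rough sketch of what an explicit catch-all could look like on 2.2, assuming name-based vhosts on port 443. The server name and certificate paths are placeholders. It relies on the documented rule that the first vhost listed for an address:port is the default, which is exactly the behavior we did not observe, so treat it as untested:

    NameVirtualHost *:443

    # Listed first so that it is the default for requests whose
    # Host header matches no other ServerName.
    <VirtualHost *:443>
        ServerName catchall.invalid
        SSLEngine on
        SSLCertificateFile /path/to/placeholder.crt
        SSLCertificateKeyFile /path/to/placeholder.key
        # Answer every path with a bare 404.
        Redirect 404 /
    </VirtualHost>

The real vhosts, each with its own ServerName, would follow this block.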

The third is the most mind-blowing. I was configuring a staging instance to allow access for a third-party developer, and I put in a line such as:

    Allow from 12.34.56.78/255.255.255.255  # John Doe 20120105

To which Apache responded: "The specified IP address is invalid." I stared at the line for a minute trying to figure out how the syntax could possibly be wrong. Knowing that Apache is easily confused, I reasoned that the digits were probably throwing it off, so I removed the date:

    Allow from 12.34.56.78/255.255.255.255  # John Doe

...and the config passed syntax check. What does that say about how shitty Apache's internals are? In writing this post I decided to look up what exactly the Apache documentation says about comments.

"Lines that begin with the hash character "#" are considered comments, and are ignored. Comments may not be included on a line after a configuration directive."
(from the Apache 2.2 docs)

Now, I understand the rationale for this. Hash characters might actually be used elsewhere in the configuration syntax. It's a very poor design choice, but it's a design choice nonetheless. But if that's what you're going to do, a syntax error should happen in both cases! Do the developers of Apache not realize that most people who are familiar with Unix will see hashes in a config file and assume that it conforms to the same rules that normal shell scripting syntax does?

"If the current character is a '#', it and all subsequent characters up to, but excluding, the next <newline> shall be discarded as a comment."
(from the POSIX docs)

Optimistic parsing that violates the published specification is not helpful. It confuses your users because it gives them the illusion that they understand something when they really don't. If you need optimistic, loose parsing to make your config files easy for users to work with, it doesn't mean that you're going the extra mile and making ergonomic software; it means your config file syntax is a failure.
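The form the 2.2 docs do allow is a comment on a line of its own, so the entry above can be written as:

    # John Doe 20120105
    Allow from 12.34.56.78/255.255.255.255

A likely explanation for the inconsistency, which is a guess and not something verified against the source: "Allow from" accepts several whitespace-separated hosts on one line, so "#", "John" and "Doe" may have been taken as (partial) host names, while the all-digits "20120105" looked like an IP address and failed to parse. If that's right, the version that "passed" was not ignoring the comment at all.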

Apache sucks.

IMHO, of course.