Server refurbishment

25 June 2013
A few days ago, I upgraded all the software on this web-server. This had been in the works for some time, but it finally went live earlier this week. The main change is that I have ditched Apache: unlike the last time I did a server build, lighter-weight alternatives now have all the features I want. I also took the opportunity to do a general tidy-up.

Motive

My usual approach to maintaining websites was to run a private webserver on my fileserver, and then mirror any changes to the production servers via rsync. This is why, when I built the news pages, I based them around flat files, as that avoids the need to synchronise databases. The problem is that when I moved to New Zealand I could not bring the fileserver with me, for reasons of space. I had all the files so was still able to do the odd update, but it was a pain. This finally got to me, so I got round to building myself a new development server setup on my desktop system. Since my production servers were in need of an update as well, I took the opportunity to make the development and production systems as similar as possible.

Bye bye Apache

Apache's name is supposedly a deliberate pun on "a patchy server", from its history as a patchwork of updates to the NCSA web-server. That aside, these days it is considered too heavyweight for modern web-hosting trends. It really belongs to the era of shared hosting, as the features it still has that more recent web-servers lack are the ones related to segregating different websites. For instance, there is suPHP, which prevents one compromised website from stamping over the files that belong to others. When I built the server, I was drawing on then-recent experience from doing rebuilds of the now-defunct Bristol IT Society server, which was semi-regularly attacked. As a result I was a bit paranoid about security, and wanted to make sure that a worst-case compromise of one of the society websites would not be able to trash the remaining sites. Those security concerns no longer apply, as the only active society website that I still host, Bristol Shooting, is now on its own server. As for the inactive sites, I converted all the PHP into static pages, as it was mostly stuff like generation of photo-gallery listings.

What took its place?

Nginx. Long ago I liked Cherokee, but it hit a development hiatus that made me give up on it. Lighttpd was promising, but I also wanted per-vhost error logs, and Lighty had started to look like a one-hit wonder, with only its CGI/FastCGI support being really polished. To be fair, for single-domain sites I think Lighty beats Nginx overall, but that was not what I was setting up in this case.

Removal of some PHP

Because I upload "news" articles by rsync-ing flat files, I decided to change from browse-time generation of the article listings to an offline script run at upload time. I had thought about switching from PHP to Python, but Python+FastCGI does not lend itself to small amounts of dynamic content in a mainly static site.
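
For illustration, an upload-time listing script could look something like the sketch below. This is not the script I actually use: the file layout (one .txt file per article, named so that it sorts by date, with the title on the first line) and the function names build_listing and write_listing are assumptions made up for the example.

```python
import html
from pathlib import Path


def build_listing(article_dir):
    """Return an HTML list of the articles in article_dir, newest first."""
    items = []
    # Assumed convention: names such as 2013-06-25-server.txt sort by date.
    for path in sorted(Path(article_dir).glob("*.txt"), reverse=True):
        with path.open(encoding="utf-8") as handle:
            # Assumed convention: the first line of each file is the title.
            title = handle.readline().strip() or path.stem
        items.append(
            '<li><a href="{}.html">{}</a></li>'.format(
                html.escape(path.stem, quote=True), html.escape(title)
            )
        )
    return "<ul>\n" + "\n".join(items) + "\n</ul>\n"


def write_listing(article_dir, output_name="index.html"):
    """Write the listing next to the articles, ready to be rsync-ed."""
    target = Path(article_dir) / output_name
    target.write_text(build_listing(article_dir), encoding="utf-8")
    return target
```

Running write_listing just before the rsync means the web-server only ever has to hand out a static file, so nothing needs to be generated at browse time.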

I was planning on getting rid of the PHP-Index view, which is a dynamically-generated tables-and-Javascript version of the site. It dates back to when the main version used frames (and later on, IFrames), and I kept it because several browsers had buggy CSS support. The wiring job it required so that I could use common content HTML was messy, particularly for images, and these days the historically CSS-crippled MSIE is in terminal decline. However, search engines still seem to pick up the PHP-Index version, so for now it is staying.

And finally, the picture gallery

This needed a re-think. The previous method was to base the gallery HTML on a file listing, while looking for side-by-side flat files for the captions, and it was loaded with exceptions. It was laden with hacks, especially for the galleries that had multiple image sizes and/or aspect ratios, and even then the galleries for the firework shows were still special cases. In the end I decided that I needed to add structured meta-data that made things like image order explicit and, more importantly, could accommodate all the special cases. At least then any future refurbishment via scripting would not have so many undefined cases, and hence no need for manual intervention.

As much as I dislike XML from a performance perspective, I opted to use it for the meta-data files, as it has the advantage that both PHP and Python have built-in XML parsers. I used a script to generate draft indices, and surprisingly it mostly Just Worked. Aside from correcting the order of some listings, the only real change needed was separating the gallery titles from the gallery synopses, stripping out redundant HTML tags in the process.
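
To give a rough idea of what reading such a meta-data file involves, here is a minimal sketch using the XML parser in the Python standard library. The element and attribute names (gallery, title, synopsis, image, caption, file, width, height) are invented for this example and are not the real format of my index files.

```python
import xml.etree.ElementTree as ET

# Hypothetical gallery index: image order is simply the document order.
SAMPLE = """\
<gallery>
  <title>Firework show</title>
  <synopsis>Photos from the display.</synopsis>
  <image file="rocket.jpg" width="800" height="600">
    <caption>Opening rocket</caption>
  </image>
  <image file="finale.jpg" width="600" height="800"/>
</gallery>
"""


def parse_gallery(xml_text):
    """Parse a gallery index into a plain dictionary."""
    root = ET.fromstring(xml_text)
    images = []
    # findall() returns matching children in document order.
    for node in root.findall("image"):
        images.append({
            "file": node.get("file"),
            "width": int(node.get("width", "0")),
            "height": int(node.get("height", "0")),
            # A missing caption becomes an empty string.
            "caption": (node.findtext("caption") or "").strip(),
        })
    return {
        "title": (root.findtext("title") or "").strip(),
        "synopsis": (root.findtext("synopsis") or "").strip(),
        "images": images,
    }
```

Because each image carries its own size and an optional caption, mixed sizes and missing captions stop being special cases, and the title and synopsis live in separate elements rather than one lump of HTML.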