Justifying Dauxite

May 6, 2004


fhwang.net has been up since 2000, and it's been software-generated from the start. Many of its pages lend themselves well to automation. One large part of the site contains pointers to more than 100 articles that I've written for magazines and newspapers, and these pointers are organized by category, date, and my recommendations for readers. As the site continued to grow I added a blog on the front page, and although initially it was only updated to reflect changes in the site or my professional life, I have plans to start using it more, which will hopefully cut down on the number of times I end up ranting to my friends about gender politics or urban planning at a bar at two in the morning.

The first program was written in Perl, because that was my favorite language at the time. The second was in Java, because I realized that I am not enough of a l33t h4x0r to maintain my own Perl code. In addition, I was starting to read the various XP bibles and I wanted to start using unit tests. The third, named simply Publisher, was in Ruby because I realized (along with a lot of other people) that once you use lots of unit testing, Java's static typing causes more pain than relief. (In my case, that's both figuratively and literally: My RSI is bad enough that I notice it more when using something verbose like Java. Though I can't explain why my prose is still so damned long.)

Many of the pages on fhwang.net are fairly static, and in Publisher, handling those was easy enough. But for pages that had any sort of dynamic behavior, I used eRuby. Since then, I've gained a lot more experience writing dynamic HTML pages both in my freelance work and at Rhizome. Rhizome is at times a densely dynamic site, with certain pages pulling from seven or eight database tables at once, and with the page contents differing drastically depending on what resource you're looking at, who you are, and (literally) what day of the week it is.

From that experience I gradually came to the conclusion that eRuby—along with other templating languages—is an abysmal way to generate HTML if that generation is at all complex. Of course, eRuby is better than PHP or ASP, since the language you use between those <% %> isn't entirely moronic, but all such implementations share some common drawbacks. When you use eRuby you're treating markup as if it were raw text, which is like doing arithmetic with bitmasks. You keep telling yourself you'll keep your presentation separate from your logic, but it's too easy to add logic to the eRuby and too hard to move it out. And given that eRuby resists the one-responsibility organizational principle of OO code; that it will spew its contents all over stdout unless you strap it up tighter than a girl in a bondage video; and that when you finally do retrieve those results, the six lines you care about are buried in miles of HTML—given all that, does anybody unit test the stuff? I sure don't.
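To make that complaint concrete, here is a minimal sketch, assuming Ruby's standard `erb` library stands in for eRuby. The `Article` struct, the sample data, and the `render_articles` helper are all made up for illustration; none of this is Publisher's actual code. It shows how easily a conditional slips into the markup, and how a test is left grepping a string afterwards.

```ruby
require 'erb'

# Hypothetical data, loosely modeled on the article pointers described earlier.
Article = Struct.new(:title, :category, :recommended)

ARTICLES = [
  Article.new('Example article one', 'tech', true),
  Article.new('Example article two', 'culture', false)
].freeze

# Logic and presentation interleave freely: the "recommended" rule
# now lives inside the markup instead of in a class that could be tested.
TEMPLATE = <<~HTML
  <ul>
  <% articles.each do |a| %>
    <li<% if a.recommended %> class="pick"<% end %>><%= a.title %></li>
  <% end %>
  </ul>
HTML

# Render to a String rather than letting the output go to stdout.
def render_articles(articles)
  ERB.new(TEMPLATE).result(binding)
end

html = render_articles(ARTICLES)

# The result is plain text, so the only way to check it is substring matching.
puts html.include?('<li class="pick">Example article one</li>')
```

Even in this small case, the check depends on the exact attribute order and whitespace of the template, so a purely cosmetic redesign could break the test without changing any behavior.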

Over on the smart and friendly ruby-talk list, a thread on HTML generation pops up at least once a month, and there are always a few people who will heartily endorse Amrita. I had looked into Amrita before, but we didn't get off to a good start. For starters, I couldn't even get it to compile on one of the machines I was trying to use it on—though that's mostly my fault since when it comes to compiling and installing I'm a little incompetent and a lot impatient. But even when Amrita did compile, it had some default behaviors that I found surprising and off-putting: Not working well with XHTML, for one, and then hijacking the "id" attribute for its own uses even though that attribute comes up a lot in a number of XML syntaxes, including XHTML itself.

Maybe that stuff isn't important, and can simply be overridden once you know what settings to tweak. But these roadblocks were enough to give me pause, and when I started seriously browsing through the docs I started to think Amrita wasn't what I wanted. It seemed to be geared towards geek-centric sites with minimal design. (For one thing, I can't figure out how you're supposed to handle highly heterogeneous mixed content.) Now, I spend a lot of time reading those minimally designed sites, but part of my rent every month is paid by work I do for online business publications or cosmetics companies, and they don't care one bit about the semantic structure of HTML. They just want the page to look good in Internet Explorer, and they reasonably expect that if they email you a new design and haven't changed what data the page depends on, you can implement it quickly. The more closely that design implementation is tied to rich data structures, the slower you'll be in responding to design changes.

Templating code like eRuby and procedural code like Amrita each offer lopsided views of the same problem. By focusing on the text output first, templating code appeals to the designer who thinks that appearance is important and that functionality is trivial. By focusing on data-driven behavior first, procedural code appeals to the programmer who thinks that functionality is important and that design is trivial.

But what if I think both are important? Sometimes I want to be a design-patterns-agile-methodologies geek, and when I'm in that mode I want to write in Ruby, which is a great way to express relations among entities. Sometimes I want to be a messy-haired design snob, and when I'm in that mode I want to write in something that looks like XHTML, so I can focus on the surface. Why can't I have both?
