fhwang.net

Justifying Dauxite

May 6, 2004

<< XML, and what it's good for | The final result >>

And then there's XSLT. XSLT is profoundly useful for a narrow set of problems. Unfortunately, it has ambitions of being a world-class multipurpose language, and if you indulge that ambition it will only repay you with treachery. XSLT can do a lot of things, but most of them can only be done in a way that's convoluted and verbose. But, hey, it's XML, and your output's XML, so this must be the right solution! Right. We've seen this delusion before, only that time it was called ColdFusion.

To be fair, let's start with where XSLT works. The SiteWrapper content filter uses an XSLT stylesheet that looks like:

<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
                xmlns:xhtml="http://www.w3.org/1999/xhtml"
                xmlns="http://www.w3.org/1999/xhtml">
  <!-- Namespace declarations are assumed: both the xhtml: prefix and the
       default namespace point at XHTML. -->
  <xsl:template match="/xhtml:page_content">
    <body>
      <p id="page_head">
        <a href="/"><img src="/img/home.png" alt="fhwang.net" width="223"
                         height="35"/></a>
        <br /><em>Francis Hwang&apos;s site</em>
      </p>
      <hr />
      <div id="page_content">
        <xsl:copy-of select="node()" />
      </div>
      <div id="site_nav">
        <ul>
          <li><a href="/blog/">Archives</a></li>
          <li>
            <a href="/rss/latest.xml"><img src="/img/rss-blue.png" width="36"
                                           height="14" alt="rss" /></a>
          </li>
        </ul>
        <ul>
          <li><a href="/bio.html">About me</a></li>
          <li><a href="/art/">Art</a></li>
          <li><a href="/writing/">Writing</a></li>
          <li><a href="mailto:sera@fhwang.net">Contact me</a></li>
        </ul>
      </div>
    </body>
  </xsl:template>
</xsl:stylesheet>

This simply says "Find everything inside the <xhtml:page_content> element, and drop it all in the middle of this navigational stuff." This content filter is focused more on presentation than on logic, so you'd rather have a format that makes the design more apparent, and easier to change.

(And yes, I'm cheating, because <page_content> isn't an element in XHTML. After the SiteWrapper runs we'll have stripped out that element, and we'll be validating the results before uploading them anyway. But when you do this sort of stuff you really should grok your XML namespaces.)
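For contrast, here's roughly the same wrapping job done in plain Ruby with REXML. This is a sketch, not code from the site: wrap_page is a made-up name, namespaces are ignored, and the navigation is cut down to two links.

```ruby
require 'rexml/document'

# Hypothetical helper: wraps the children of <page_content> in a <body>
# with a header and a navigation list. Namespaces are ignored and the
# navigation is trimmed to keep the sketch short.
def wrap_page(xml)
  source = REXML::Document.new(xml).root
  body = REXML::Element.new('body')

  head = body.add_element('p', 'id' => 'page_head')
  head.add_element('em').text = "Francis Hwang's site"
  body.add_element('hr')

  # Rough equivalent of <xsl:copy-of select="node()" />: move every child
  # node of the source element into the content <div>.
  content = body.add_element('div', 'id' => 'page_content')
  source.children.each { |node| content.add(node) }

  nav = body.add_element('div', 'id' => 'site_nav').add_element('ul')
  { '/blog/' => 'Archives', '/bio.html' => 'About me' }.each do |href, label|
    nav.add_element('li').add_element('a', 'href' => href).text = label
  end

  body.to_s
end

puts wrap_page('<page_content><h1>Hello</h1><p>Some text.</p></page_content>')
```

It does the job, but the page design is now buried in method calls, and a designer who wants to move the navigation has to read Ruby to do it.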

XSLT works well for wrapping elements in static text, and it can handle fairly simple conditions and loops. index.xsl, for example, loops through an RSS file, emitting a chunk of XHTML for each RSS <item>.

Unfortunately, XSLT can't go much further than that. As mentioned above, it's quite inept at the common task of date formatting. And it failed me completely on turning DocBook into XHTML. In DocBook, footnotes are embedded in the text they're linked to, so to process them into XHTML I had to:

  1. Save the footnote text in some sort of collection.
  2. Keep a running count of which footnote you're currently processing.
  3. Drop in a numbered link to the footnote that you'll be inserting at the bottom of the page.
  4. When you get to the bottom, dump all your footnotes.

This isn't a difficult problem, but XSLT makes stateful variables really awkward: once a variable is bound it can't be reassigned, so there's no simple counter to increment or list to append to. I went online looking for XSLT solutions to this problem (DocBook being, again, a big format with lots of users) and all I found were 100-line chunks of code that made me wonder if maybe Ted Kaczynski wasn't right in some way.

So I implemented this in Ruby instead. It would've been great to use XSLT, especially since much of the work is simple-minded element mappings: turn <citation> into <cite>, <ulink> into <a>, etc. But XSLT simply wasn't the right tool for the job. It's not the right tool for most jobs.
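What follows isn't the real Dauxite code, just a minimal sketch of the four footnote steps above, assuming REXML for parsing. The class name FootnoteConverter, the fn1-style anchors, the <sup> and <ol> markup, and the short list of element renames are all made up for the example.

```ruby
require 'rexml/document'

# Straight element renames, DocBook => XHTML. An illustrative subset only.
RENAMES = { 'para' => 'p', 'citation' => 'cite', 'emphasis' => 'em' }.freeze

# Hypothetical converter: walks a DocBook tree and returns an XHTML string.
# The root element itself is dropped; only its contents are converted.
class FootnoteConverter
  def initialize
    @footnotes = [] # step 1: footnote bodies, in the order they appear
  end

  def convert(docbook_xml)
    root = REXML::Document.new(docbook_xml).root
    children_to_xhtml(root) + footnote_list
  end

  private

  def children_to_xhtml(element)
    element.children.map { |node| node_to_xhtml(node) }.join
  end

  def node_to_xhtml(node)
    case node
    when REXML::Text then node.to_s
    when REXML::Element then element_to_xhtml(node)
    else '' # comments and processing instructions are dropped
    end
  end

  def element_to_xhtml(element)
    case element.name
    when 'footnote'
      @footnotes << children_to_xhtml(element)
      number = @footnotes.size # step 2: the running count
      %(<sup><a href="#fn#{number}">#{number}</a></sup>) # step 3
    when 'ulink'
      %(<a href="#{element.attribute('url')}">#{children_to_xhtml(element)}</a>)
    else
      tag = RENAMES.fetch(element.name, element.name)
      "<#{tag}>#{children_to_xhtml(element)}</#{tag}>"
    end
  end

  # Step 4: dump the saved footnotes as a numbered list.
  def footnote_list
    return '' if @footnotes.empty?

    items = @footnotes.each_with_index.map do |body, index|
      %(<li id="fn#{index + 1}">#{body}</li>)
    end
    "<ol>#{items.join}</ol>"
  end
end

docbook = '<article><para>See <citation>Knuth</citation>' \
          '<footnote><para>Volume 1.</para></footnote> and ' \
          '<ulink url="http://example.com/">this page</ulink>' \
          '<footnote><para>A second note.</para></footnote>.</para></article>'
puts FootnoteConverter.new.convert(docbook)
```

The whole trick is the @footnotes array: it is both the collection and the counter, and it's exactly the kind of mutable state that XSLT doesn't want you to have.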

Right now a lot of people are trying to use XSLT as a general-purpose language—apparently it's Turing-complete, whoop-di-doo—and they're spending a lot of money on those fat computer books down at the Barnes & Noble and opening another can of Mountain Dew every time they start to nod off from all those angle brackets. Ill-advised programming trends are nothing new, of course. My main hope is that when people get disgusted with XSLT they won't discard it outright, because for a narrow set of problems it's absolutely great. A language doesn't have to be general purpose to be useful. You can't use regular expressions to connect to a database, but you probably don't want to use anything else for processing raw text.

