fhwang.net

On linkrot

In 2006, I stumbled across a YouTube video of Stevie Wonder playing “Superstition” on Sesame Street. It’s a sprawling, searing funk performance of the hit single, more insistent and grotty than the original recording on Talking Book. It’s funny to think of how you can get introduced to a musician: I first became aware of Stevie Wonder in the 1980s when he lapsed into easy-listening ditties like “I Just Called to Say I Love You”. It took me more than a decade to dig back into the history and understand his contribution to pop music. I can’t remember having watched this on Sesame Street when I was young, but if I did, I was obviously too young for it to take.

Anyway, I watched the clip once or twice, bookmarked it in my del.icio.us account, then went on my way. A few weeks ago, I was reminded of it, looked it up on del.icio.us, and followed the link—only to find that the video had been taken down.

No big deal: Videos get taken down all the time. But it made me think a little more about “linkrot”, as we call it in the web business. We’ve been making web pages for more than a decade now. A lot of those pages are still around, and their links don’t go anywhere anymore. What can be done about it?

In 1998, Jakob Nielsen wrote an article on linkrot that makes for interesting reading today. He makes two recommendations: First, that webmasters maintain their sites so inbound links remain valid, and second, that webmasters check their outbound links and correct them if necessary. The first, I’d say, remains valid, since you presumably own the content on your own site and should always be able to ensure that inbound links to that content are working. But the advice about outbound links is quixotic at best, since an active blogger can easily churn out hundreds of outbound links a year. It’s not every blogger’s job to become a random monitoring service for the entire web.

This isn’t really a criticism of Nielsen so much as a sign of how quickly things have changed. In 1998, the vast majority of people who were reading Nielsen were people who were putting content on the web as part of their jobs, whether those jobs were at an online zine or a Fortune 500 company’s investor relations site.

But as has become a truism these days, web media is amateur media, and people now create new content quickly, and then just as quickly they leave it behind. They try out Blogger accounts for a month and then get bored. Or they decide Facebook is cooler than MySpace, so they leave behind an orphan profile, with “friendships” that grow stale and unmaintained.

Anyway, the Stevie Wonder video I bookmarked was this:

http://www.youtube.com/watch?v=dSC29xl6hno

... but maybe I should’ve just bookmarked this:

http://www.youtube.com/results?search_query=stevie+wonder+superstition+sesame+street&search_type=

... which, of course, isn’t one specific video record; it’s a search page. This is less convenient for whoever follows that link: You have to click at least twice to get to a video that can be watched. But it’s more resilient. An individual video can be taken down, but YouTube will probably always have the same video up at a different location.
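For what it’s worth, a search link like that is easy to build mechanically. Here is a minimal Python sketch, assuming YouTube keeps the `/results` path and the `search_query` parameter seen in the link above. The function name `search_link` is made up for illustration.

```python
from urllib.parse import urlencode

# Base URL copied from the search link in this post. Whether YouTube keeps
# this path and parameter name over time is an assumption.
YOUTUBE_SEARCH = "http://www.youtube.com/results"


def search_link(terms):
    """Build a search-page URL from a list of search terms (hypothetical helper)."""
    # urlencode escapes the value and turns spaces into '+'.
    query = urlencode({"search_query": " ".join(terms)})
    return f"{YOUTUBE_SEARCH}?{query}"


print(search_link(["stevie", "wonder", "superstition", "sesame", "street"]))
# prints http://www.youtube.com/results?search_query=stevie+wonder+superstition+sesame+street
```

The trade-off is the one described above: the link never points at a single video record, so it only stays useful as long as the search page and its arguments stay put.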

Such an approach rankles my inner perfectionist, since this screws with the tidy notion that a URI will describe one discrete thing in the world, for some sensible definition of “thing”. Different URIs reduce the sharing aspects of del.icio.us, and make it harder to do a Google Blog Search to see who else is blogging about this video. And two different video records of the same Stevie Wonder performance will have different comments, ratings, and video responses.

But do such things really matter anyway? I almost never read YouTube comments, because all they do is make me depressed about society. And who cares if two different URIs can be used to watch the same video? Whoever followed the link got to watch the video, which is what matters.

In the long term, this approach would rely on YouTube to stay in business for a while, which is likely—but also you’d have to hope that YouTube never changes the URI or arguments of its search page. That’s less likely, because who ever links to a YouTube search page?

In the more general case, it’s hard to imagine doing this in a way that will truly stand the test of time. Say your friend writes something funny about “Iron Man” on her blog and you link to it with a search … are you going to assume she’s going to keep that domain up forever? And what search terms are you going to use to uniquely link to that entry, and only that entry?

Linking strategies notwithstanding, I wonder about the word “rot”, which carries a negative connotation in general usage that doesn’t correspond with what we know about how ecosystems work. When a bird dies and falls to the forest floor, it doesn’t go to waste—its body is soon covered by scavengers like grubs that feed on rotting flesh, converting one form of biomass to another. This is bad news for the bird, but good news for the forest. Ecosystems don’t waste energy, they just keep it circulating in a panoply of forms.

So, at the risk of overusing a metaphor, we have a web full of linkrot. Can we build anything that can feed on it? Say, a Firefox plugin that auto-suggests destinations when you follow a broken link? After all, there are countless dead links out there, waiting to be harvested.
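To make the idea a little more concrete, here is a rough Python sketch of the suggestion step only, not an actual Firefox plugin. It assumes the tool already knows the HTTP status of the link and the text the link was anchored on. The names `SEARCH_PAGES`, `looks_dead`, and `suggest` are hypothetical, and the only search page it knows about is the YouTube one from earlier in this post.

```python
from urllib.parse import urlencode, urlsplit

# Hypothetical table of hosts whose search pages we know how to link to.
# Only the YouTube entry comes from this post; its stability is an assumption.
SEARCH_PAGES = {
    "www.youtube.com": ("http://www.youtube.com/results", "search_query"),
}


def looks_dead(status):
    """Treat 404 (Not Found) and 410 (Gone) as signs of a dead link."""
    return status in (404, 410)


def suggest(url, status, link_text):
    """Return a search URL to offer in place of a dead link, or None."""
    if not looks_dead(status):
        return None
    host = urlsplit(url).netloc
    if host not in SEARCH_PAGES:
        return None  # no known search page for this site
    base, param = SEARCH_PAGES[host]
    return f"{base}?{urlencode({param: link_text})}"


print(suggest(
    "http://www.youtube.com/watch?v=dSC29xl6hno",
    404,
    "stevie wonder superstition sesame street",
))
```

A real tool would need more than status codes, since a page for a removed video may still load normally and only say in its body that the video is gone. It would also need some way to come up with good search terms, which is the hard part mentioned above.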

Tagged: web
