fhwang.net

web

On linkrot
Monday, June 16, 2008

In 2006, I stumbled across a YouTube video of Stevie Wonder playing “Superstition” on Sesame Street. It’s a sprawling, searing funk performance of the hit single, more insistent and grotty than the original recording on Talking Book. It’s funny to think of how you can get introduced to a musician: I first became aware of Stevie Wonder in the 1980s when he lapsed into easy-listening ditties like “I Just Called to Say I Love You”. It took me more than a decade to dig back into the history and understand his contribution to pop music. I can’t remember having watched this on Sesame Street when I was young, but if I did, I was obviously too young for it to take.

Anyway, I watched the clip once or twice, bookmarked it in my del.icio.us account, then went on my way. A few weeks ago, I was reminded of it, looked it up on del.icio.us, and followed the link—only to find that the video had been taken down.

No big deal: Videos get taken down all the time. But it made me think a little more about “linkrot”, as we call it in the web business. We’ve been making web pages for more than a decade now. A lot of those pages are still around, and their links don’t go anywhere anymore. What can be done about it?

In 1998, Jakob Nielsen wrote an article on linkrot that makes for interesting reading today. He makes two recommendations: First, that webmasters maintain their sites so inbound links remain valid, and second, that webmasters check their outbound links and correct them if necessary. The first, I’d say, remains valid, since you presumably own the content on your own site and should always be able to ensure that inward links to that content are working. But the advice about outbound links is quixotic at best, since an active blogger can easily churn out hundreds of outbound links a year. It’s not every blogger’s job to become a random monitoring service for the entire web.

This isn’t really a criticism of Nielsen so much as a sign of how quickly things have changed. In 1998, the vast majority of people who were reading Nielsen were people who were putting content on the web as part of their jobs, whether those jobs were at an online zine or a Fortune 500 company’s investor relations site.

But as has become a truism these days, web media is amateur media, and people now create new content quickly, and then just as quickly they leave it behind. They try out Blogger accounts for a month and then get bored. Or they decide Facebook is cooler than MySpace, so they leave behind an orphan profile, with “friendships” that grow stale and unmaintained.

Anyway, the Stevie Wonder video I bookmarked was this:

http://www.youtube.com/watch?v=dSC29xl6hno

... but maybe I should’ve just bookmarked this:

http://www.youtube.com/results?
  search_query=stevie+wonder+superstition+sesame+street&search_type=

... which, of course, isn’t one specific video record, it’s a search page. This is less convenient for whoever follows that link, of course: You have to click at least twice to get to a video that can be watched. But it’s more resilient. An individual video can be taken down, but YouTube will probably always have the same video up at a different location.
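
For what it's worth, a search link like that can be built from a few descriptive words rather than pasted by hand. This is only a sketch: `youtube_search_url` is a made-up helper name, and the host, path, and `search_query` parameter are simply copied from the URL above, with no guarantee YouTube keeps them stable.

```ruby
require 'uri'

# Hypothetical helper: turns free-text search terms into a YouTube
# search-results URL. The base URL and the search_query parameter name are
# taken from the link quoted in the post and are assumptions beyond that.
def youtube_search_url(terms)
  query = URI.encode_www_form('search_query' => terms)
  "http://www.youtube.com/results?#{query}"
end

puts youtube_search_url('stevie wonder superstition sesame street')
```

The more specific the terms, the fewer clicks for whoever follows the link, but also the more likely that a re-upload with a slightly different title slips past the search.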

Such an approach rankles my inner perfectionist, since this screws with the tidy notion that a URI will describe one discrete thing in the world, for some sensible definition of “thing”. Different URIs reduce the sharing aspects of del.icio.us, and make it harder to do a Google Blog Search to see who else is blogging about this video. And two different video records of the same Stevie Wonder performance will have different comments, ratings, and video responses.

But do such things really matter anyway? I almost never read YouTube comments, because all they do is make me depressed about society. And who cares if two different URIs can be used to watch the same video? Whoever followed the link got to watch the video, which is what matters.

In the long term, this approach would rely on YouTube to stay in business for a while, which is likely—but also you’d have to hope that YouTube never changes the URI or arguments of its search page. That’s less likely, because who ever links to a YouTube search page?

In the more general case, it’s hard to imagine doing this in a way that will truly stand the test of time. Say your friend writes something funny about “Iron Man” on her blog and you link to it with a search … are you going to assume she’s going to keep that domain up forever? And what search terms are you going to use to uniquely link to that entry, and only that entry?

Linking strategies notwithstanding, I wonder about the word “rot”, which carries a negative connotation in general usage that doesn’t correspond with what we know about how ecosystems work. When a bird dies and falls to the forest floor, it doesn’t go to waste—its body is soon covered by scavengers like grubs that feed on rotting flesh, converting one form of biomass to another. This is bad news for the bird, but good news for the forest. Ecosystems don’t waste energy, they just keep it circulating in a panoply of forms.

So, at the risk of overusing a metaphor, we have a web full of linkrot. Can we build anything that feeds on it? Say, a Firefox plugin that auto-suggests destinations when you follow a broken link? After all, there are countless dead links out there, waiting to be harvested.
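
The core of such a scavenger could be tiny. Here is a rough sketch in Ruby rather than as a browser plugin; `DEAD_STATUSES`, `suggest_destination`, and the default search page are all hypothetical choices for illustration, and a real tool would also have to cope with timeouts, parked domains, and pages that return 200 while saying "not found".

```ruby
require 'uri'

# Assumption: only these HTTP status codes count as a dead link.
DEAD_STATUSES = [404, 410].freeze

# Hypothetical helper: given the status code a link returned and the text
# the link was anchored on, suggest a search URL to try instead.
# Returns nil when the link still appears to be alive.
def suggest_destination(status, link_text, search_base = 'http://www.youtube.com/results')
  return nil unless DEAD_STATUSES.include?(status)

  "#{search_base}?#{URI.encode_www_form('search_query' => link_text)}"
end

# A dead link whose anchor text described the video:
puts suggest_destination(404, 'stevie wonder superstition sesame street')
```

The guess is only as good as the link text, so "click here" links would stay dead.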

Tagged: web

The perils of image swiping
Saturday, July 29, 2006

Back in 2004, when I posted the Unauthorized iPod U2 vs. Negativland Special Edition on eBay, I wasn’t really surprised by the attention it attracted online, but I was surprised by how the facts got shaved down as the idea spread. It’s a fairly complex idea, I’ll admit (to explain it you’ve got to bring in U2 and Negativland and iPods and Island Records and Casey Kasem and Downhill Battle), and more than one entry said that the iPod was a project of Negativland or of Downhill Battle. I wasn’t personally upset by these misattributions, but they did serve as a reminder that the accuracy of many blogs is probably a little closer to that of office gossip than of high-quality journalism.

Recently I noticed a different sort of mistake: One of the iPod images was being swiped for a Chinese blog, for an entry on a new edition of the U2 iPod that had nothing to do with Negativland:

It’s easy to imagine how this happened: This blogger wanted to pass on an Apple press release but also wanted to spruce it up with some pictures, so she entered “U2 iPod” into an image search engine and stumbled upon this photo. Not knowing anything about the Unauthorized iPod U2 vs. Negativland Special Edition—if there’s been anything written about it in the Chinese language, I’m not aware of it—she blithely included it in her blog post and never gave it a second thought.

I find it amusing to imagine some random reader of this blog seeing the unfamiliar band name in the iPod display and maybe doing a little Googling on the subject. And since discovering this, I’ve been trying to figure out if this dynamic can be exploited more generally, to dupe image-swiping bloggers into carrying subtle political messages on their own blogs … nothing comes to mind, alas. But maybe I’ll come up with something.

Tagged: web, art, ipod

Fictohedron: An alternate Ten-sided reader
Friday, April 14, 2006

Ten-sided continues: We’ve been writing for almost a month now, and in our more-than-50-entries we’ve been hinting at lost loves, past crimes, and strange inventions. Two more months to go …

why the lucky stiff, one of our ten illustrious writers, has just put together Fictohedron, an alternate to reading Ten-sided through the Turbulence site.

Fictohedron, an alternate 'Ten-sided' reader

Cool features include: Bayesian analysis to see what novel words are showing up with higher frequency in Ten-sided blogs, and when you click on those words they’re highlighted in entry text … A healthy sign of an easily remixable work, I’d like to think.

Tagged: web, art, blogging

Redesigning Rhizome
Sunday, January 1, 2006

So I didn’t post to my blog for more than a month, but I suppose I had a pretty good reason: This past month at Rhizome, we launched our newly redesigned site. This was a massive project: Rhizome is the most complicated website I have ever worked on, and by my count, my intern Jason and I had to update more than 150 PHP files by hand.

Patrick helped me immensely by giving me tips on clever PHP hackery. You can use ob_start and register_shutdown_function to post-process a page’s output, and wrap the page’s unique content in standardized nav junk. When this is combined with php_value auto_prepend_file in your Apache configuration to automatically prepend your includes, you can drastically reduce the amount of cut-and-paste required in PHP.
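
For readers who think in Ruby, here is a loose analogue of the same idea, not the PHP we actually used: capture whatever a page prints, then wrap it in the shared markup. `render_with_layout` and the nav markup are invented for illustration, and it uses a block where the PHP version relies on `ob_start` plus a shutdown function.

```ruby
require 'stringio'

# Hypothetical helper: runs the block with $stdout redirected into a buffer,
# then returns the captured output wrapped in a shared header and footer.
def render_with_layout(title)
  buffer = StringIO.new
  original_stdout = $stdout
  $stdout = buffer
  begin
    yield
  ensure
    $stdout = original_stdout # always restore, even if the block raises
  end
  "<html><head><title>#{title}</title></head><body>\n" \
    "<div id=\"nav\">site navigation</div>\n" \
    "#{buffer.string}" \
    "</body></html>\n"
end

# The page itself only prints its unique content.
page = render_with_layout('About') { puts '<p>Unique page content.</p>' }
puts page
```

The point is the same as in the PHP version: each page file carries only its own content, and the wrapping happens in one place.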

PHP’s community is curious: When you first come to the language, it seems as if it were designed to cause spaghetti code, since so much of the documentation focuses on low-level hacks with no interest in design principles. But the PHP language community is a big tent, and on closer inspection there are people laboring away to make elegant components with an inelegant language. Unfortunately, they’re not easy to see for all the newbie sites, and you might spend years coding PHP without ever seeing them. As Ruby grows, PHP is an example worth studying, and in some ways, worth trying not to emulate.

Tagged: ruby, web

"London Hurts"
Monday, July 11, 2005

Via BoingBoing: A group of Americans, feeling sentimental about London in the aftermath of the bombings, started a LiveJournal group called London Hurts, only to be invaded by bona fide Londoners who flooded the place with their own sarcasm. So the image postings went from this:

to this:

I’ve never even been to London, myself, but I appreciate the backlash nonetheless. I found it fairly horrifying to watch New York be claimed by the rest of the U.S. after the September 11th attacks. The sentiment seemed well-intended but inappropriately intimate, like if somebody you barely know asks too many questions at your grandfather’s funeral. And most people who moved to New York from elsewhere in America moved there for a reason, after all.

The “We’re all New Yorkers now” sentiment wouldn’t have been so grating if the Americans didn’t then go on to act typically American. After the memorial candles went out, we Americans bombed Iraqi civilians because we thought they were members of Al Qaeda, ignored torture of foreign prisoners because somebody has to think of the children, and then re-elected an incurious President because his opponent seemed too French. Hey, if you’ve never been out of the country and the only news you consume is your local TV news reporting on the location of white women, you’re not a New Yorker.

So considering where the sympathy’s coming from, Londoners are probably right to be suspicious. Take Fox News host Brian Kilmeade, who on the day of the bombings said “I think that works to our advantage, in the Western world’s advantage, for people to experience something like this together …” Nothing bolsters popular support for a misguided foreign policy like its abject failure, right?

The thing about Americans is, we really mean well, but we’re astoundingly simple-minded about the complexity of the world. We imagine ourselves to be like Star Trek’s Federation, but just as often end up acting like Godzilla. And some of our concern is just for Londoners as an abstraction, not actually as people living different lives in different places who might actually have different opinions. If London were to take a page from Madrid, and decide that there were better ways to fight terror than by indiscriminately killing brown people in another country, how much would the Americans feel for the London victims then? Back before the bombs started dropping in Iraq, a quarter-million New Yorkers took to the streets against it. I didn’t hear any pro-war hysterical Americans claiming to be New Yorkers then.

Tagged: terrorism, web

Rhizome opens
Tuesday, May 24, 2005

If you’re one of those people who don’t exclusively read my blog for Ruby-related stuff, you might know that my day job is for new media arts organization Rhizome.org. You might even know that since January 2003, Rhizome policy had been that you had to donate $5 once a year to get access to the site. As of yesterday, that policy is no longer.

In its place is a policy that moves the wall: While we still can’t support a 100% open site, our new policy is a lot more amenable to today’s browsing habits. Access to the past year of content—our text discussions, and our ArtBase, which indexes more than 1500 works of new media art—is free to all. Posting texts and submitting your own work into the ArtBase is also free, though you have to give us your email address for that. And if you want to see anything that’s more than a year old, we ask you to be a Member, which requires an annual contribution of $25 or more.

Anyway, if you’re curious as to what Rhizome actually is—and, heck, some of my friends still don’t know—now’s a great time to come in and check it out. And maybe you’ll want to follow a discussion that Rhizomers have been having about the new policy. Hey, this marks the first time I’ve deep-linked into Rhizome from my own blog. That’s a nice feeling. I’ll have to do that more often.

Tagged: web, art

HtmlClipping 0.1.0
Sunday, May 15, 2005

More Ruby libs: I’ve just released the first version of HtmlClipping. HtmlClipping reads an HTML page that has a link pointing to a particular URI. It removes most HTML markup, bolds the link text, and trims the resulting text to a fixed number of characters. I developed it to help me track referers to my website, though I suppose it might have other uses.

For example, the following script gets the HTML at http://rubyforge.org/credits/, and forms an excerpt around the link to http://www.rubycentral.org/pledge/.

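require 'htmlclipping'
require 'net/http'

# Fetch the HTML of the page that contains the link.
contents = ''
Net::HTTP.start( 'rubyforge.org' ) do |http|
  response = http.get '/credits/'
  contents = response.body
end

# Build an excerpt around the link to the given URI, trimmed to 500 characters.
clipping = HtmlClipping.new(
  contents, 'http://www.rubycentral.org/pledge/', 500
)
puts clipping.to_s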

=> "… RubyForge takes time, effort, and money. Many thanks to the
   folks listed below who are making it possible! <br /> If RubyForge has
   been helpful to you, and you want to give something back to the Ruby
   community, please consider supporting <strong>RubyCentral</strong>.
   Thanks! <br /> InfoEther, Inc purchased the RubyForge hardware and
   provides system administration support. <br /> Several folks provide
   file mirrors to help share the bandwidth load: <br /> Evan Webb <br />
   Dennis Oelkers <br /> Austin &#8230;" 
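
If you're curious how the steps described above (find the link, strip the markup, bold the link text, trim) might fit together, here is a deliberately naive sketch. It is not HtmlClipping's actual implementation: `rough_clipping` is a hypothetical name, the regular expressions are assumptions that will trip over messy real-world HTML, and unlike the library it adds no ellipses and keeps no `<br />` tags.

```ruby
# Hypothetical, simplified stand-in for the clipping idea; not the library's code.
# Returns nil when the page contains no link to the given URI.
def rough_clipping(html, uri, max_chars)
  # Assumption: the link is a plain <a href="..."> with a quoted, exact-match URI.
  pattern = %r{<a\b[^>]*href=["']#{Regexp.escape(uri)}["'][^>]*>(.*?)</a>}mi
  match = pattern.match(html)
  return nil unless match

  # Replace tags with spaces, then collapse runs of whitespace.
  strip = ->(fragment) { fragment.gsub(/<[^>]+>/, ' ').gsub(/\s+/, ' ') }
  before = strip.call(match.pre_match)
  link   = strip.call(match[1]).strip
  after  = strip.call(match.post_match)

  # Split the remaining character budget evenly on both sides of the link.
  side = [(max_chars - link.length) / 2, 0].max
  head = before.length > side ? before[-side, side] : before
  tail = after[0, side]
  "#{head}<strong>#{link}</strong>#{tail}".strip
end

html = '<p>Please consider supporting ' \
       '<a href="http://www.rubycentral.org/pledge/">RubyCentral</a>. Thanks!</p>'
puts rough_clipping(html, 'http://www.rubycentral.org/pledge/', 500)
```

Centering the excerpt on the link is a design choice of this sketch; it means the referring sentence is usually visible even when the page is long.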

Tagged: web, ruby

Running your own auction
Sunday, April 17, 2005

When I decided to auction the Unauthorized iPod U2 vs. Negativland Special Edition on my own web site, one of my concerns was how I was going to take bids from strangers and ensure that those strangers wouldn’t back out if they won. After all, you don’t want to muddy up your 15 minutes of fame just ’cause some high school kid feels like playing around. On the other hand, you also don’t want a system that’s so cumbersome that potential bidders give up out of exasperation. So here, for posterity’s sake, are my thoughts about running an auction on your own website.

Tagged: social_software, web