This past month has brought a flurry of posts from various net-media luminaries concerning newspapers and online archives: First Simon Waldman and Jay Rosen on PressThink, followed by Dan Gillmor, and then Cory Doctorow over on BoingBoing. All three posts are worth reading in full, but the gist of what they’re saying is that newspapers should keep their archives free and open for all time, thus allowing the web to work its intertwingly magic on the results. This is inspiring stuff, but I think the discussion so far has glossed over some of the business realities facing newspaper publishers today. There’s no question that permanently free archives would be a boon for online discourse and public journalism. But can newspapers make it pay?
First, let me clear something up: Some of the discussion I’m linking to has conflated permanent URIs with free archives, but we should keep the two separate. It’s trivial to support permanent URIs while not providing free archives; that many newspaper archives don’t do so is less indicative of some money-grubbing publisher’s desire to piss off Tim Berners-Lee than of the blindly-lurching-forward nature of web programming. Personally, I think in the near-term the permanent URI is even more important than the free archive, but that’s a subject for another day.
All the news that’s fit to give away
When you think about it, it’s paradoxical that some newspaper might give away a new article for free and charge for an old article, since new content is clearly of more value than old content. But keep in mind that in business, cost and value are only vaguely related, and just because something is valuable doesn’t mean that it’s easy to charge for it. In fact, new and old articles are desired by such distinct audiences that you could think of them as two separate products: News and history.
When a newspaper article is new, it’s desired by the general public, and while some of its value is based on how well it’s written, as much if not more is based on the facts that it contains. Facts aren’t protected by copyright, so having a fact first is a competitive advantage that withers away in hours, not days or weeks. Today, many people go to the New York Times web site for breaking news because the Times is a paper they trust, so reading its articles is worth the minor hassle of free registration. But if the Times were to start charging for today’s news, many of those readers would choose other options. They’d look for a similar story in the Washington Post or the Los Angeles Times, and such a story might even offer a summary of the Times’ findings with a polite attribution. Even being the first with great research doesn’t protect you much, because the results of a journalist’s research are still just facts, so they can be copied, too.
People do pay for today’s Times, of course, in paper format, but that’s largely because people understand the cost of a physical package. Even then, this sort of package suffers from competition from free sources: Before the Internet, it was television, radio, and free weeklies. (By the way, what are we going to call newspapers if they morph into paperless web-centric news organizations? “was-papers”? “pro-blogs”?)
When an article ages, it loses the interest of the general public and becomes proportionally more valuable to a niche audience of researchers, journalists, and other professionally motivated readers. This audience studies an archived article less for the simple facts than for subtle details that can only be gleaned from the primary source. Who broke the Watergate story? Did the journalists move quickly to imply a direct link to the President, or did they wait patiently until they had assembled enough facts? Did they seem to have a sense of how big this story would become? If you’re writing a book about Watergate, you’d better get your hands on the original reportage by Bob Woodward and Carl Bernstein at the Washington Post. And you’re likely to pay for that access if necessary.
The half-life of content
To a large extent, then, this discussion depends on how the value of content diminishes as it ages: I’m going to swipe the scientific term “half-life” to describe this. Different types of content have different half-lives. Sports scores and stock quotes have a high value to some if they can be delivered quickly, but that value drops minute by minute. An essay like this one, on the other hand, has a relatively long half-life, because in theory the context that makes its content valuable will not change nearly as quickly. (It should go without saying that half-life isn’t related to whether something was any good at the moment it was published.)
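To make the borrowed term concrete, here is a minimal sketch in Python. It assumes that the value of content decays exponentially, the way a radioactive sample does; that is an assumption made for illustration, not a measured property of news. The function name and the half-lives below are hypothetical.

```python
def remaining_value(initial_value: float, age: float, half_life: float) -> float:
    """Value left after `age` time units, assuming exponential decay.

    `age` and `half_life` must use the same unit (hours, in the examples below).
    """
    if half_life <= 0:
        raise ValueError("half_life must be positive")
    return initial_value * 0.5 ** (age / half_life)


# Hypothetical half-lives, in hours, chosen only to show the contrast.
STOCK_QUOTE_HALF_LIFE = 0.25      # fifteen minutes
ESSAY_HALF_LIFE = 24 * 365        # one year

# How much of an initial value of 100 survives the first hour?
stock_quote_after_hour = remaining_value(100.0, age=1.0, half_life=STOCK_QUOTE_HALF_LIFE)
essay_after_hour = remaining_value(100.0, age=1.0, half_life=ESSAY_HALF_LIFE)

print(f"stock quote after one hour: {stock_quote_after_hour:.2f}")
print(f"essay after one hour:       {essay_after_hour:.2f}")
```

Under these made-up numbers, the stock quote passes through four half-lives in its first hour and keeps only a sixteenth of its value, while the essay is essentially untouched.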
So it’s not helpful that Waldman, writing in PressThink, uses Chris Anderson’s essay “The Long Tail” as an example of the value of longevity online:
[“The Long Tail”] first appeared nearly four months ago, but it is still resonating around the web. As I write this, it has some 545 links according to Technorati, 15 in the last 24 hours alone. I used the epithet “influential” to describe it and that influence is partly down to the quality of the idea, and the place it originally appeared. But the permanence of the Web is also at work. Every day, more people are finding it. Every day it’s value is increasing as a result.
“The Long Tail” is more of an essay than a front-page newspaper article. Its insights are profound and will take a long time to be teased out, and I’d bet that in ten years, you could read it again and still take something away. In fact, I’m not the only person betting on its longevity: Anderson has parlayed the essay into both a book deal and a standalone blog. But most news articles won’t stand this treatment. It’s hard to imagine somebody starting a blog that’s all about “Balking at Vote, Sunnis Seek Role on Constitution”.
But what about the long tail: the concept, that is, not the essay? Doesn’t the slow, steady crystallization of microaudiences mean that any underappreciated article could be rediscovered in a year or a decade? Perhaps. Again, we should not use Anderson’s essay as an example, because the same factors that make it profound also make it relatively opaque. “The Long Tail” takes a while to read, and although Anderson’s writing is clear, the issues he’s writing about necessarily take some time to explain. Bloggers pass this link along slowly in part because they process it slowly. And as we study the long tail more seriously in the years ahead, I suspect we will discover that it’s most applicable to content that cannot be evaluated quickly, like essays, novels, records, and films. News articles, by contrast, are engineered for quick consumption. They announce themselves with straightforward headlines, focus on concrete events, and avoid theoretical speculation. If you didn’t read a news article on the day it came out, the odds that you’ll rediscover it later are slim.
As manifested in sites like Amazon and eBay, the long tail has proven its ability to connect niche audiences with niche publishers, but there’s not much proof yet that it connects readers of the present with producers of the past. And the trend over the past decade has been that we have shown progressively less interest in reading from the past. Imagine that the current trend continues, and that every day for the rest of your life, more online content is produced than was produced the day before. If this keeps up, won’t we drown out the past?
Let me provide a concrete example: Right now, using Google News’ advanced search, I can search for pages about “Roe Wade anniversary” updated in the past month, and I get about 1700 articles. Even if the Times’ archives were free, what are the odds I’d have the time to read a three-week-old article on their site, considering that I might have to wade through 1000 other articles to find it? And what are the chances I’ll care about a similarly old article in ten years, when that same search might return 170,000 articles?
Where does the money come from?
Regardless of how long news stays free, it’s been a long time since such articles were solely responsible for a newspaper’s revenues, either online or offline. Ads play a large part in helping a newspaper stay in the black. In the business, there are two kinds of ads: Classifieds in the back, and pictorial ads near the articles—“lineage” and “display”, in newspaper parlance. (The first of the two terms is pronounced with two syllables, not three.) Lineage was once a dependable source of revenue, but it’s now being threatened by more efficient forms of transactional communication. Journalism’s cultural relevance may be most threatened by blogs, but its financial survival is most threatened by Craigslist.
This matters when you’re talking about newspapers, because although Google AdSense now helps bloggers around the world defray their hosting costs, there isn’t yet any evidence that such ads can be lucrative enough to, say, cover the Times’ cost of employing a full-time journalist in Iraq. Right now, the Google text ads on that aforementioned article about Sunnis in Iraq are pretty crap: One site about elections, one about President Bush winning the election, and one that says you can “Do Surveys for $250/hr”. If you’re out on the edge of the network, blogging about Pomeranians or centaur porn, Google AdSense makes your life easier. But if you’re the publisher of the New York Times digital division, and you’re basing your business on less-searchable characteristics like first-hand reportage and professional editing, Google AdSense probably gives you a headache the size of a server farm.
There is one way that a newspaper might reliably profit from its online presence today, and it’s resolutely old-fashioned: Lexis-Nexis. As reported by Wired, most of the profits of the Times’ digital division come from this archive service, more than $20 million last year, to be precise. This, according to the Wired article, is the single greatest reason that the Times won’t be opening its online archives any time soon. Not because they’d lose a little revenue from their own web site, but because they might lose a lot if Lexis-Nexis views them as less valuable for it.
Here we have two archives containing the history of the New York Times, one maintained by Lexis-Nexis and one by the Times itself. Both archives are subscription-only archives aimed at professionally motivated customers, so why is Lexis-Nexis making so much more money for the Times? Because Lexis-Nexis spans a massive number of publications: You can enter a search string and get search results on tens of thousands of publications, not just one. Lexis-Nexis is what journalists used instead of the Web, back when the Web didn’t exist.
Putting pressure into the system
Although one is a closed system and the other is open, Lexis-Nexis and Google offer the same lesson: Value can be created by decoupling distribution and production. Lexis-Nexis isn’t a media outlet. It leaves the messiness of producing articles to other companies and focuses instead on the business of putting it all in one place and making it searchable and accessible by computer. And even in the face of competition from the billions of pages in Google’s archives, Lexis-Nexis still retains a competitive value today, because most of its partners can’t figure out how to make the same kind of money from Google, so they shut out Google entirely.
As with all things that the Internet touches, the field is open for change. Off the top of my head, I can think of three places where the pressure for change can come from.
First, companies who want to compete with Lexis-Nexis could do so by creating smarter niche offerings priced for citizen journalists. After all, any business strategy based on dividing the news-reading public into two distinct groups—general readers who want it for free and professionals who will pay Lexis-Nexis-size fees—is probably quite unstable, given the growing number of web users who occupy a murky space in-between.
An individual subscription to Lexis-Nexis is way out of most people’s price range, but maybe archives could be offered in blocks grouped by interest, like cable channels: 100 progressive magazines here, 25 Midwestern newspapers there. Maybe these could even be ordered à la carte. This archive service could use ads to help cut subscription fees, since many of the customers probably wouldn’t mind as long as those ads were well-targeted. This wouldn’t get rid of the pay archive problem, but could ease the pain in the short-term if it raised from 5% to 50% the odds that readers of your blog could follow a link. Not that 50% of the internet would ever be subscribed to the same publication’s archive, but that your readers might, since the fact that they’re reading your blog increases the odds that they have other interests in common.
This is a pretty blue-sky sort of idea—as far-out, I’ll admit, as the idea that targeted ads will pay for free archives—not least because any company starting now would have to build the sort of contractual relationships that Lexis-Nexis has spent decades building, and Lexis-Nexis isn’t likely to take that lying down. On the other hand, outside companies have their own advantages in a changing field: A truly smart competitor, for example, would constantly scan article text and inbound links to mine for connections and suggest new, subtly defined niches. So maybe this is another way for Google to try to conquer the world, not that they’re lacking for ideas right now.
Second, journalists can try to get their hands on copyright, and start experimenting. Many journalists currently have no say in archiving practices because their contracts are work-for-hire, but there’s no reason it always has to be that way. It’s possible to simply try to sign different contracts, and then put up all your past writing somewhere and see how well AdSense will pay. Of course, many of the biggest publications won’t sign such a deal, and this would invariably involve being paid less cash up-front. A journalist who does this is essentially betting that she can find a value in her archives that her publisher cannot.
Third, bloggers can elevate the issue of open archives in their own linking practices. A blogger who is passionate about this issue could contribute to it by maintaining a comprehensive list of the archival policies of newspapers and magazines, complete with email addresses where you can contact publishers and (politely) try to convince them to open up.
As is often the case, these solutions start from the edges and work their way into the center. There’s no reason to expect a company like the Times to make these sorts of moves any time soon. Progress will happen first with niche political journals and small-town bloggers, with bloggers and maverick journalists and just maybe an upstart aggregation service. Who knows? In five years, maybe the web will be full of free archives, with deeply linked communities of bloggers digging up the past for one another. Maybe then, the folks at the Times will look up and take notice. But that day is, I suspect, a long way off, and in the meantime, there are many smaller goals to be met.