Web of Confusion
Kirsner, Scott, American Journalism Review
Doesn't anyone on the Internet know how to count?
Given the reputation for number-crunching that has surrounded computers since they came on the scene, you might expect that tallying the number of users who visit a given site on the World Wide Web would be a simple task. It's not.
With more than 500 daily newspapers in this country operating Web sites, according to the Newspaper Association of America, and with 400 more expected to launch by the end of the year, tracking online readership has emerged as a critical issue. Without solid information about how many readers visit the site and who those readers are, it's tough to effectively allocate resources to an online effort and even tougher to attract advertisers.
Yet a number of barriers, stemming from semantic nebulousness and technological limitations, prevent newspapers from accurately answering some key questions about their online readerships: Exactly how many readers are out there? How often do they visit my site? How long do they stay during a typical visit? And who are those readers?
To add to the confusion of dealing with an evolving technology and vague definitions of terms like "page requests" and "visits," newspapers also are being forced to decide which approach they will take to measuring online readership. They may opt to use internal software tools, like Interse's Market Focus or net.Genesis' net.Analysis, which attempt to analyze a site's server log. They may elect to be audited by organizations like MRO or Audit Bureau Verification Services (part of the Audit Bureau of Circulations), which try to validate the statistics recorded on the server log. They may choose to implement software that monitors traffic as it happens at the network level, like Accrue Insight. Or they may decide to subscribe to research from PC Meter, which follows users' clicks around the Web with software installed on their home PCs. Typically, newspapers are being forced to figure out which combination of these various measurement approaches they want to adopt.
"We need strong and precise measurement so we know how to reach people with this new medium," says Jack Fuller, president of the Tribune Co. and a 1986 Pulitzer Prize winner. "The most perfect piece of journalism which fails to reach people is a failure--it's not good journalism."
It's almost as if the county fair opened without first setting up the turnstiles. Most newspapers gave little thought to measurement issues before launching their Web sites. "Stats were really an afterthought," says Dan Peak, Webmaster at the Kansas City Star. "We knew we were going to have to track users, somehow sort of. But it wasn't a priority at the beginning."
Even sites that did plan to track usage from the start, like the New York Times, ran into problems. Since every visitor entering the Times site is required to register by getting a password and providing demographic information, the site must create a new database record for each new user. On opening day, the computer managing the database crashed due to the huge number of visitors. The paper was forced to temporarily disable its sophisticated registration system.
"You don't necessarily think about measurement as the first thing you want to do," says James Conaghan of the Newspaper Association of America. "You want to go on a shakedown cruise first, and then deal with measurement issues as they come up, or as advertisers demand certain numbers."
After the shakedown cruise, most newspapers and magazines face the challenge of dealing with the huge quantity of data generated by thousands of users visiting their site. "You quickly realize you're dealing with tons of data that are very hard to analyze," says Grady Seale, who helped the Boston Globe launch its site in 1995.
Where do all these data come from? Every single request that a user makes of a Web site generates an entry line in the server's log file. …
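To make the idea of one log entry per request concrete, here is a minimal sketch in Python. It assumes the server writes entries in the widely used Common Log Format, which the article does not specify. The host names, timestamps, and file paths are invented for illustration, and the `summarize` function is a hypothetical helper, not part of any tool named above.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident authuser [date] "method path protocol" status bytes
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)$'
)

# Invented sample entries: one reader loads a page plus its logo image,
# and a second reader loads the same page.
SAMPLE_LOG = """\
reader1.example.com - - [10/Oct/1996:13:55:36 -0700] "GET /index.html HTTP/1.0" 200 2326
reader1.example.com - - [10/Oct/1996:13:55:37 -0700] "GET /logo.gif HTTP/1.0" 200 4512
reader2.example.com - - [10/Oct/1996:14:02:10 -0700] "GET /index.html HTTP/1.0" 200 2326
"""


def summarize(log_text):
    """Count raw requests, requests per host, and requests per HTML page."""
    hits = 0
    hosts = Counter()
    pages = Counter()
    for line in log_text.splitlines():
        match = LOG_LINE.match(line.strip())
        if match is None:
            continue  # skip lines that do not fit the assumed format
        hits += 1
        hosts[match.group("host")] += 1
        # Treat only .html requests as pages; images also produce log lines.
        if match.group("path").endswith(".html"):
            pages[match.group("path")] += 1
    return hits, hosts, pages


hits, hosts, pages = summarize(SAMPLE_LOG)
print("requests logged:", hits)
print("distinct hosts:", len(hosts))
print("page requests:", dict(pages))
```

In this sample, the raw request count, the number of distinct hosts, and the number of page requests all differ, which hints at why a single log can support several competing readership figures. A host name is also only a rough stand-in for a reader, since several people can share one machine or connection.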