Magazine article Computers in Libraries

Cutting-Edge Statistics

Article excerpt

Gather statistics that tell you something you can use.

As I write this, it's the year-end, and that means it's time for the annual compilation of statistics. Since we no longer have the distraction of Y2K (a few Swedish buses didn't start up this year, is all), we can add extra energy to this process. We have our traditional measurements in libraries: circulation, cataloging throughput, live-bodies-through-the-door-counters, dollars and cents. But it's been a bit of a challenge incorporating technical measurements into the traditional service. Which ones are useful, and what do they mean?

1 Login = 1 Circ, Right?

One of our first challenges in this arena came when we began to count logins. Since we are one of the few libraries to do this at all, ALA is not likely to incorporate logins as a standard statistical measure for libraries. To ALA and libraries in general, this sort of thing is invisible. We're not talking peanuts here, either. We experienced just under 1 million logins in the year 2000, and we processed 10 million pieces of e-mail, about 25 percent of which was rejected as spam. Nearly half of the holds we processed were placed online remotely (not from within a branch), far more than through any other outlet. This is up several percentage points from last year. If a login were a circ, then "online" would be the busiest branch as well as the least expensive to operate.

Is the comparison valid? Well, just like someone walking into the library and checking out a book, someone who logs in uses library resources. In one case it's heat, lights, overhead, a few seconds of a staff member's time, and maybe a visit to the reference desk. Many people come in to use the computers and talk to no one while they're here. In the other case it's the use of one of a few modems, a couple of CPU cycles, some electricity, and maybe a call to the support desk to figure out how to hook up. Most people dial in and don't require any extra support.

The similarities between the two groups are eerie, but they aren't the same. If someone could reduce all the numbers to energy or communications units expended, maybe we'd be on to something, but that's unlikely. I maintain that these sorts of things ought to be counted, but they should be carefully compared.

Tracking Web Statistics

A second confusing area is Web statistics. How do you measure the usefulness of your Web site? What does it mean to say you've had 10,000 visitors? That doesn't mean they found your site useful. Also, how can you use the information you've gathered to best advantage?

We've been using a statistical package called WebTrends for a couple of years. These products are very much business-oriented, dealing with such things as ad clicks and e-commerce, but they still offer some interesting information. Go to [URL missing from this excerpt] to see a complete example.

The program works off the log files kept by your Web server. Herein lie the problems. With Microsoft Internet Information Server (IIS) the log file for a single day averages 2 MB in size. These things build up, so if you don't pay attention, you run out of disk space. A year's worth of statistics is approaching a gigabyte in size, which is still a pretty serious piece of real estate. WebTrends goes through each line of all 366 log files for a year and tallies everything up. On a medium-fast Pentium, running the program over a year of files takes all day long. That's right. If you start the program on one computer, you can't do anything useful on that computer for the rest of the day. WebTrends does not like to grab the files off one computer and travel over the Ethernet to another, so I've taken to copying the log files to an empty machine so it can process locally, and alone, taking as long as it wants.
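To make the line-by-line tallying concrete, here is a minimal sketch in Python. It is not how WebTrends itself works. It assumes the server writes logs in the W3C extended format, in which a "#Fields:" line names the columns of the rows that follow. The function names and the sample log lines are invented for illustration.

```python
from collections import Counter


def tally_pages(lines):
    """Count requests per URL path in W3C extended log lines (assumed format)."""
    fields = []
    hits = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        if line.startswith("#Fields:"):
            # This directive names the columns used by the rows that follow.
            fields = line[len("#Fields:"):].split()
            continue
        if line.startswith("#"):
            continue  # other directives (#Software, #Date, ...) are not hits
        # Pair each column name with its value on this row.
        row = dict(zip(fields, line.split()))
        if "cs-uri-stem" in row:
            hits[row["cs-uri-stem"]] += 1
    return hits


def yearly_log_megabytes(mb_per_day, days):
    """Rough disk space needed to keep one log file per day."""
    return mb_per_day * days


# Hypothetical sample lines, made up for this sketch.
SAMPLE_LOG = [
    "#Software: Microsoft Internet Information Server",
    "#Fields: date time c-ip cs-method cs-uri-stem sc-status",
    "2000-12-30 08:15:02 192.0.2.10 GET /index.html 200",
    "2000-12-30 08:15:09 192.0.2.11 GET /catalog/search.html 200",
    "2000-12-30 08:16:40 192.0.2.10 GET /index.html 200",
]

if __name__ == "__main__":
    for path, count in tally_pages(SAMPLE_LOG).most_common():
        print(path, count)
    # 2 MB a day over the 366 days of 2000 comes to 732 MB.
    print(yearly_log_megabytes(2, 366), "MB")
```

The arithmetic at the end matches the figures above: at about 2 MB a day, a leap year of logs comes to roughly 732 MB, which is why the total is approaching a gigabyte. Because the sketch reads one line at a time, it never needs to hold a whole year of logs in memory, though it still has to touch every line.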

Getting a usable result usually takes a few days, because inevitably the first couple of tries won't be set up right. …
