Who Goes There?: Measuring Library Web Site Usage
Bauer, Kathleen, Online
After all the work, time, and money invested in building and maintaining the library Web site, you and your staff will most likely want to know who, if anyone, is using it. What features and resources do visitors use most often? Are the people accessing the site the same people who come into the library? How do people find the Web site? Do they use a search engine?
These are usage questions, and librarians already have experience in gathering usage data. For example, librarians count the number of questions asked at a reference desk as a way of measuring its use. Like the reference desk, the library Web site represents a service point. The Web site service point, however, is electronic, and it requires new methods of measuring usage.
Understanding the basics of Web server technology and the data that servers record is a good start in developing usage measurement techniques. After that, you can explore the software that exists to help you make sense of Web site statistics, and find the right software for your system.
WEB SERVER LOG FILES
Every transaction on the Internet consists of a request from a browser client and a corresponding action from the computer server. Each individual client/server transaction is recorded on the server in what is called a server log file. Its most basic form is called the common log file. [Note: In the examples that follow, log entries typically appear as a single line of text.]
The Common Log File
The common log file format is the standard set by the World Wide Web Consortium. The syntax of an entry in a common log file looks like the following:
remotehost rfc931 authuser [date] "request" status bytes
Consider an entry in which the remote host is gateway.iso.com. The next two fields, rfc931 and authuser, are blank (represented by dashes). The request was made on May 10, 1999, at 10 minutes after midnight. The file requested was class.html. The status code 200 (OK) was returned, and the file requested was 10,000 bytes in size.
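To make the field layout concrete, here is a minimal Python sketch that splits one common log entry into its named fields. The sample line is a hypothetical reconstruction that matches the description above: the seconds, the time zone offset, and the HTTP version are assumptions, and the names CLF_PATTERN, parse_clf, and SAMPLE_ENTRY are illustrative only.

```python
import re
from datetime import datetime

# Field order of the common log file format:
#   remotehost rfc931 authuser [date] "request" status bytes
CLF_PATTERN = re.compile(
    r'^(?P<remotehost>\S+) (?P<rfc931>\S+) (?P<authuser>\S+) '
    r'\[(?P<date>[^\]]+)\] "(?P<request>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)$'
)

def parse_clf(line):
    """Return a dict of fields for one log entry, or None if it does not match."""
    match = CLF_PATTERN.match(line.strip())
    if match is None:
        return None
    entry = match.groupdict()
    # Dates look like 10/May/1999:00:10:00 -0400 (English month names assumed).
    entry["date"] = datetime.strptime(entry["date"], "%d/%b/%Y:%H:%M:%S %z")
    entry["status"] = int(entry["status"])
    # A dash in the bytes field means that no content was sent.
    entry["bytes"] = 0 if entry["bytes"] == "-" else int(entry["bytes"])
    return entry

# Hypothetical entry: seconds, time zone, and HTTP version are assumed.
SAMPLE_ENTRY = (
    'gateway.iso.com - - [10/May/1999:00:10:00 -0400] '
    '"GET /class.html HTTP/1.0" 200 10000'
)
```

Calling parse_clf(SAMPLE_ENTRY) yields a dictionary whose remotehost, status, and bytes values can then be counted or summed across the whole log.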
The common log file format may be the standard, but variations of log files exist. Additional information may be stored in referrer and agent logs.
Referrer Log File
Many servers record information about the referrer site, that is, the URL a visitor came from immediately before making a request for a page at the current Web site. In one such entry, the referring page was a search engine, ink.yahoo.com, and the search used to find the requested page was "sample log file." (Many Web designers and marketers are interested in the search words that lead users to their sites.) Note that the IP address of the computer making the request, 999.999.999.99, is also recorded in the entry.
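If you want to pull the search words out of referrer entries yourself, the idea can be sketched in a few lines of Python. This is only an illustration: the sample URL, its path, and the parameter name p are assumptions, as is the short list of parameter names in SEARCH_PARAMS, since each search engine chooses its own.

```python
from urllib.parse import urlsplit, parse_qs

# Hypothetical query parameter names that a search engine might use.
SEARCH_PARAMS = ("p", "q", "query")

def search_terms(referrer):
    """Return (referring host, search phrase or None) for a referrer URL."""
    parts = urlsplit(referrer)
    params = parse_qs(parts.query)  # decodes '+' and %xx escapes
    for name in SEARCH_PARAMS:
        if name in params:
            return parts.hostname, params[name][0]
    return parts.hostname, None

# Hypothetical referrer URL; only the host name and the phrase come from the text.
SAMPLE_REFERRER = "http://ink.yahoo.com/bin/query?p=sample+log+file"
```

For the sample URL, search_terms returns the host ink.yahoo.com and the phrase "sample log file". A referrer with no recognizable search parameter returns None in place of the phrase.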
Agent Log File
A third type of recording is the agent log. An agent log records the browser and operating system used by a visitor. It also records the names of spiders or robots used to probe your Web site, such as the crawler of the Northern Light search engine.
In an agent log entry for such a hit, in addition to the standard information about the date, time, and IP address, the agent identifier (shown here as email@example.com) tells you that this hit came from a crawler.
A hit from a Web browser would reveal the browser name and version, such as Mozilla/4.0. This probably means the visitor's browser was Netscape version 4.0. (Mozilla was the code name for Netscape and is still used for a browser compliant with the open-source Netscape code.) Browser information, however, is not always considered reliable, because other browsers may also identify themselves as Mozilla.
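A rough way to separate crawler hits from browser hits is to look for telltale words in the agent string. The Python sketch below is an assumption-laden illustration, not a reliable detector: the hint words in CRAWLER_HINTS and the sample agent strings are hypothetical, and the MSIE check simply shows why a Mozilla/4.0 label does not always mean Netscape.

```python
import re
from collections import Counter

# Hypothetical substrings that suggest a spider or robot.
CRAWLER_HINTS = ("crawler", "spider", "robot", "bot")

def classify_agent(agent):
    """Give a coarse label for one agent string."""
    lowered = agent.lower()
    if any(hint in lowered for hint in CRAWLER_HINTS):
        return "crawler"
    match = re.match(r"Mozilla/(\d+(?:\.\d+)*)", agent)
    if match:
        # Internet Explorer also announces itself as Mozilla-compatible.
        if "MSIE" in agent:
            return "Mozilla-compatible (MSIE)"
        return "Mozilla " + match.group(1)
    return "other"

def tally_agents(agents):
    """Count how many hits fall under each label."""
    return Counter(classify_agent(agent) for agent in agents)

# Hypothetical agent strings for illustration.
SAMPLE_AGENTS = [
    "Mozilla/4.0 [en] (Win95; I)",
    "Mozilla/4.0 (compatible; MSIE 5.0; Windows 98)",
    "ExampleCrawler/1.0",
]
```

Running tally_agents over the agent field of every entry gives a quick picture of how much traffic comes from people and how much from search engine robots.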
Common log files, referrer logs, and agent logs are sometimes combined into one log. Whatever format your Web server uses, the first thing you will need to do is determine what type of log file is being generated. The person responsible for the server should be able to tell you what format is used. In addition, there may be options in the log file that determine what data is recorded, and you may be able to use these options to increase or decrease the data collected, depending on your needs. …
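If nobody can tell you which format the server writes, a line from the log itself offers a clue. The Python sketch below guesses between a plain common log entry and a combined entry, assuming the combined form appends the referrer and agent as two quoted fields after the byte count. The sample lines and the names guess_log_format, COMMON_LINE, and COMBINED_LINE are hypothetical, and your server's own documentation remains the authority.

```python
import re

# Common log fields: remotehost rfc931 authuser [date] "request" status bytes
COMMON = r'\S+ \S+ \S+ \[[^\]]+\] "[^"]*" \d{3} (?:\d+|-)'

PATTERNS = [
    # Assumed combined layout: common fields plus "referrer" "agent".
    ("combined", re.compile("^" + COMMON + r' "[^"]*" "[^"]*"$')),
    ("common", re.compile("^" + COMMON + "$")),
]

def guess_log_format(line):
    """Return 'combined', 'common', or 'unknown' for one log line."""
    stripped = line.strip()
    for name, pattern in PATTERNS:
        if pattern.match(stripped):
            return name
    return "unknown"

# Hypothetical sample lines for illustration.
COMMON_LINE = (
    'gateway.iso.com - - [10/May/1999:00:10:00 -0400] '
    '"GET /class.html HTTP/1.0" 200 10000'
)
COMBINED_LINE = (
    COMMON_LINE
    + ' "http://ink.yahoo.com/bin/query?p=sample+log+file"'
    + ' "Mozilla/4.0 [en] (Win95; I)"'
)
```

Checking the first few lines of a log this way is usually enough to decide which fields an analysis program can expect to find.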