Dating the Web: The Confusion of Chronology

Exploring the Internet dating scene for the information professional means understanding the dimensions, deficiencies, and differences of the various dates associated with Web pages.

For those of us grounded in the print world of publishing, the date of publication helps identify and distinguish different editions, specific periodical issues, and even re-printings. The date of print publication rarely matches the exact date of composition but does have the distinct advantage of being basically unchangeable. Once a work has been published with a publication date on each copy, the only way authors can update or otherwise change it is to issue a new edition or a correction. Otherwise, they would have to track down every single copy and make the change.

The Web, of course, is pretty much the opposite. Site owners can change any page any time they wish. Unscrupulous but talented hackers can sometimes even change other people's pages. And anyone who has ever tried to cite a Web page knows that many have no publication date information listed at all.

Yet for all its newness as a publication medium, the Web is aging. We have had public Web sites up for more than a decade now, though few pages remain in their original format. As the Web ages, it becomes increasingly important to try to understand the origination date of certain Web content. For intellectual property cases and the historical record, among other reasons, it can be important to know when a Web page was actually written or first posted. Exploring the Internet dating scene for the information professional means understanding the dimensions, deficiencies, and differences of the various dates associated with Web pages.

DATING DIMENSIONS

With the ease of posting a Web page, which is then publicly available, and the subsequent ease of changing that page, issues of date information have several dimensions. There is the original content creation date and possibly an edited or updated date. The surrounding text and graphics may come from an entirely different day and time, while the page design may date from yet another point.

The date and time when the file containing this conglomeration of parts was last changed are reported in the date stamp. Any time a file on a computer is changed, a new date and time stamp, based on the computer's internal clock, is recorded.
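As a rough illustration of how a date stamp behaves, the Python sketch below reads a file's last-modified time and then makes a trivial edit to the file. The file name, the 1998 date, and the helper names are assumptions made up for this example, and the file is backdated by hand only to stand in for an old page.

```python
import os
import tempfile
from datetime import datetime, timezone


def last_modified(path):
    """Return the file's date stamp (last-modified time) as a UTC datetime."""
    return datetime.fromtimestamp(os.path.getmtime(path), tz=timezone.utc)


def demo():
    """Show that any change to a file replaces its old date stamp."""
    with tempfile.TemporaryDirectory() as folder:
        path = os.path.join(folder, "article.html")  # hypothetical page file
        with open(path, "w") as f:
            f.write("<p>Article text written in 1998.</p>")

        # Backdate the stamp to pretend the file was last saved in mid-1998.
        saved_1998 = datetime(1998, 6, 1, tzinfo=timezone.utc).timestamp()
        os.utime(path, (saved_1998, saved_1998))
        before = last_modified(path)

        # A trivial edit: the computer's clock overwrites the 1998 stamp.
        with open(path, "a") as f:
            f.write("<!-- new footer -->")
        after = last_modified(path)
        return before, after
```

Calling demo() returns the stamp before and after the edit. The first falls in 1998 and the second reflects the moment of the edit, even though the article text itself was not touched.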

Take, for example, an article written in 1998 and later uploaded to a Web page. The links in it may have been updated in 2000, and the page redesigned with new surrounding logo graphics in 2002. Then the whole site was redesigned using a new content management system in 2004, which reset the date stamp to 2004. Yet the bulk of the content of the article is still 6 years old, and the links have not been updated in 4 years.
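The arithmetic in this example can be laid out in a few lines of Python. This is only a sketch of the scenario described above: the component labels are invented for illustration, and the date stamp is modeled simply as the most recent change to any part of the page.

```python
# Year in which each part of the hypothetical page last changed.
component_years = {
    "article text": 1998,
    "links": 2000,
    "logo graphics": 2002,
    "site redesign": 2004,
}

# The date stamp only records the most recent change to the file.
date_stamp_year = max(component_years.values())

# Age of each part, measured against the year the date stamp reports.
ages = {part: date_stamp_year - year for part, year in component_years.items()}

for part, age in ages.items():
    print(f"{part}: last changed {age} years before the date stamp")
```

The single date stamp reports 2004, while the article text is 6 years older than that and the links are 4 years older.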

DATE DEFICIENCIES

The previous example shows the problems with dating Web content. Most articles published on the Web by news media and periodical publishers have fairly obvious creation dates posted along with the article. Many Gannett papers include an "originally published" date and label at the bottom of each article. Their URLs also include the year, month, and day of the original publication.
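When a URL does carry the year, month, and day, that date can be pulled out mechanically. The Python sketch below is one hedged way to do it. The URL layouts it recognizes (such as /2004/03/15/ or 20040315) and the example.com addresses are assumptions for illustration, not the format any particular publisher is known to use.

```python
import re
from datetime import date

# Assumed layouts: YYYY/MM/DD, YYYY-MM-DD, or YYYYMMDD, not embedded in a longer number.
URL_DATE = re.compile(r"(?<!\d)((?:19|20)\d{2})[/-]?(\d{2})[/-]?(\d{2})(?!\d)")


def date_from_url(url):
    """Return the first valid calendar date found in the URL, or None."""
    for match in URL_DATE.finditer(url):
        year, month, day = (int(part) for part in match.groups())
        try:
            return date(year, month, day)
        except ValueError:
            # Digits that look like a date but are not one (e.g. month 13).
            continue
    return None


# Hypothetical URLs, for illustration only.
print(date_from_url("http://www.example.com/news/2004/03/15/story.html"))
print(date_from_url("http://www.example.com/about.html"))
```

A date recovered this way reflects only what the publisher chose to put in the URL, so it is best treated as a clue to the original publication date rather than proof of it.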

However, other news publications often have no date listed in the article or in the URL. Still others put the current day's date at the top of every page, even when the articles were obviously published earlier. Alternatively, some list a "posted on" date, which may or may not match the date the article appeared in the newspaper.

Beyond articles, plenty of other Web pages include some kind of a date. Far too often, it is only a small copyright notice at the bottom of the page. Typically the current year or a range of years such as 1995-2004 is listed. The problem with this date statement is that, on many sites, it may just be part of a standard footer on every page. …