Academic journal article Information Technology and Libraries

Web Indexing with Meta Fields: A Survey of Web Objects in Polymer Chemistry

Four Web search engines (Alta Vista, Lycos, Excite, and Webcrawler) were used to collect data on Web objects in polymer chemistry. One thousand thirty-seven Web objects were examined for data in four categories: document information, use of meta fields, use of images, and use of chemical names. Issues raised include whether to provide metadata elements for parts of entities or for whole entities only, the use of meta syntax, problems in representing special types of objects, and whether links should be considered when encoding metadata. Use of meta fields was not widespread in the sample, and knowledge of meta fields in HTML varied greatly among Web object creators. This article reports part of the results of the metadata project funded by the OCLC Library and Information Science Research Grant Program.

As networked information expanded dramatically during the 1990s, the question of how to organize this resource and integrate it with the existing bibliographic information repertoire became a focus of study in the library and information community. New metadata schemes, such as the Dublin Core Metadata Element Set (Dublin Core hereafter) (Weibel 1995), are results of the efforts in this area. Following a series of workshops on metadata, projects have been initiated to experiment with using the Dublin Core. While these research projects aim at testing the Dublin Core, there are debates over whether the number of elements in the Dublin Core should be expanded or contracted, whether the syntax of the Core should be strictly defined or left unstructured, and whether the Core should be targeted solely at the existing WWW architecture or should extend that architecture (Lagoze 1996). These questions demonstrate a need for research into existing Web objects and repositories from the perspectives of description and representation.

Past experience in developing digital libraries shows that it is not feasible to have only "one overarching plan for cataloging, searching, and retrieving data from the many trillions of bytes of digital material that tomorrow's networked collections will contain" (Jacobson 1995). Distributed domain digital libraries can be one practical solution to this problem because the description and representation of digital material are tailored to fit the idiosyncrasies of, and the information needs in, a given subject domain. While the Dublin Core is considered the core of metadata elements common across subject domains, we must obtain a thorough understanding of the current status of digital material and metadata use in subject domains before significant amounts of time and resources are invested to implement and expand a metadata scheme.

This article reports the findings from our survey of over one thousand Web pages in polymer science and chemistry in general. The intent of this survey was to investigate the current use of meta fields in representing Web objects in polymer chemistry, and to provide firsthand data for implementing metadata embedding and scheme expansion in the next phase of the digital library development.

Literature Review

The term "metadata" refers to "machine-understandable information about Web objects" (Swick 1997). "They describe resources, indicate where the resources are located, and outline what is required in order to use them successfully" (Younger 1997). Metadata schemes, such as Dublin Core, consist of a group of codes or labels that describe the content and/or container of digital objects. When these metadata are embedded in HTML documents, they can support better automatic indexing of digital objects and thus provide better aids in networked resource discovery. Several terms have been used interchangeably to describe the digital objects that a user views through various interfaces (e.g., a Web browser). They are given names such as Web document, Web object, hypertext, and hypermedia. In the context of this study, the term "Web object" refers to any digital bibliographic entity or part that is accessible via the World Wide Web. …
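To make the idea of metadata embedded in HTML concrete, the following minimal Python sketch shows one way an indexing program might collect meta fields from a Web object. It is an illustration only, not the procedure used in this survey. The sample page, the function name extract_meta_fields, and the "DC."-prefixed field names (a common convention for writing Dublin Core elements in HTML meta tags) are assumptions introduced here.

```python
from html.parser import HTMLParser


class MetaFieldParser(HTMLParser):
    """Collect (name, content) pairs from <meta> tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.fields = []  # (name, content) tuples, in document order

    def handle_starttag(self, tag, attrs):
        # The parser lowercases tag and attribute names before calling this.
        if tag != "meta":
            return
        attr_map = dict(attrs)
        name = attr_map.get("name")
        content = attr_map.get("content")
        # Skip meta tags that lack a name or a content value
        # (for example, http-equiv declarations).
        if name and content is not None:
            self.fields.append((name, content))


def extract_meta_fields(html_text):
    """Return the meta fields found in html_text (hypothetical helper)."""
    parser = MetaFieldParser()
    parser.feed(html_text)
    parser.close()
    return parser.fields


# Hypothetical sample page; the field names and values are assumptions.
SAMPLE_PAGE = """<html><head>
<title>Polymer Synthesis Notes</title>
<meta name="DC.title" content="Polymer Synthesis Notes">
<meta name="DC.subject" content="polymer chemistry">
<meta name="keywords" content="polystyrene, polymerization">
</head><body><p>Hypothetical page body.</p></body></html>"""

if __name__ == "__main__":
    for name, content in extract_meta_fields(SAMPLE_PAGE):
        print(f"{name}: {content}")
```

A page with no meta tags yields an empty list, which is the situation the survey found for most of the sampled Web objects.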
