Academic journal article Information Technology and Libraries

Nonroman Scripts in the Bibliographic Environment

Article excerpt

The representation of nonroman scripts in Latin characters causes information to be distorted in various ways. USMARC now provides for "alternate graphic representation," so that text in the authentic script(s) may be included in bibliographic records. As more library systems with nonroman capability are developed, conformance to standards for the encoding of nonroman data becomes more critical. The development of a single global character set standard is a significant change that must be accommodated in USMARC.

In Rule 1.0E, AACR2 mandates that the bibliographic description be written in the same script as the source of information "if practicable." (1) For more than a decade, machine-readable cataloging and bibliographic transcription in a nonroman script were mutually exclusive. During this period, the only way to represent nonroman data in machine-readable form was by transcription into Latin letters (romanization). The first part of this paper criticizes romanization as information distortion.

The USMARC Format for Bibliographic Data was modified to accommodate nonroman scripts in 1984. (2) The previous September, a Chinese/Japanese/Korean (CJK) capability had been added to the Research Libraries Information Network (RLIN) system. (3) The USMARC modifications are outlined in the second part of this paper, since not all readers will be familiar with them. The remainder of this paper describes efforts to develop a universal character set, and its potential effect on USMARC.


Currently, most local systems are limited to Latin script; romanization is necessary if the automated catalog is to be a comprehensive representation of the library's holdings. The practice of romanization has two causes: the lack of the proper typographical facilities and the concept of the "universal" catalog, "the catalog in which all items in the collection are entered in a single alphabet from A to Z, regardless of language, regardless of form, regardless of subject. The American ideal." (4)

The deficiencies of romanization from the point of view of the reader have been documented. (5-7) However, many nonspecialist librarians are unaware of the deficiencies and still regard romanization as adequate for access. Language experts reject this view; they persuaded the Library of Congress (LC) to continue to provide original script cataloging on cards for material in the so-called JACKPHY languages: Japanese, Arabic, Chinese, Korean, Persian (Farsi), Hebrew, and Yiddish.

Romanization not only impedes access; it also distorts the presentation of information in several ways. The presentation of the text is unnatural. Distinctions present in the original language may be lost, or distinctions not present in the original script may be artificially created. Different transliteration schemes are used in different countries or contexts. Finally, the normalization used in automated indexing and searching, when applied to romanized text, introduces another layer of distortion.
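The last point, distortion added by index normalization, can be illustrated with a short sketch. It assumes a simple normalization rule (remove diacritical marks, fold to lowercase) of the kind often used to build search keys; the article does not specify any particular system's rule, and the function name `normalize_for_index` and the sample data are hypothetical. The four Chinese words shown are distinguished in pinyin only by tone marks.

```python
import unicodedata


def normalize_for_index(text: str) -> str:
    """Hypothetical index normalization: strip diacritics, fold case.

    This is an assumed rule for illustration, not the rule of any
    particular library system.
    """
    # NFD splits a letter such as "ā" into "a" plus a combining macron.
    decomposed = unicodedata.normalize("NFD", text)
    # Drop the combining marks, keeping only the base letters.
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return stripped.lower()


# Four distinct Chinese words and their pinyin romanizations with tone marks.
romanized = {
    "妈": "mā",  # mother
    "麻": "má",  # hemp
    "马": "mǎ",  # horse
    "骂": "mà",  # to scold
}

# After normalization, all four romanized forms share one search key.
index_keys = {normalize_for_index(form) for form in romanized.values()}
print(index_keys)
```

Under this assumed rule, a search on the normalized key cannot separate the four words, although they are unambiguous in the original script.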

Unnatural Presentation

Romanization is the presentation of language text in unfamiliar letters. Readers of a language may, in time, become used to a particular romanization scheme, and be able to read their language even when it is written in Latin letters. In the People's Republic of China, pinyin, the national standard for the romanization of Chinese, has a number of applications: it is used to show the pronunciation of ideographs (in which Chinese is normally written), and it underlies a system of finger-spelling for the blind.

A reader faced with text rendered in an unfamiliar way may find it incomprehensible. This can be illustrated by the case of alternative romanization methods. Hebraica bibliographers in the United States have become used to reading Hebrew written in Library of Congress romanization (which includes the vowels that are usually omitted in Hebrew orthography). …
