How to Use CJK Software to Read Chinese, Japanese, and Korean on the Web

Traditionally, East Asian libraries have offered distinctive collections of printed materials in Chinese, Japanese, and Korean (that is, the CJK languages). With the rapid development and wide application of the World Wide Web and other Internet services, resources in these languages are no longer confined within the physical walls of a library. Instead, they are emerging in large quantities in cyberspace. Consequently, East Asian librarianship faces a technical challenge: how to deliver these valuable and often real-time information assets to patrons effectively. This readability challenge extends to any other potential information user as well.

Ordinary Web browsers, such as Netscape and Microsoft Internet Explorer, do not decode CJK encodings into the corresponding characters. As a result, a Web user who happens to land on a site written in Chinese, Japanese, or Korean usually sees no more than a screen of gibberish. Although there are Microsoft Windows editions for Chinese, Japanese, and Korean, each of these language-specific Windows systems can handle only one particular language.(1) Moreover, each language has more than one encoding system, and therefore requires more than one version of Windows. A good example is Microsoft Chinese Windows, which has two versions: one for the Big5 encoding, popular in Hong Kong and Taiwan, and the other for the GB encoding, which is widely used in China and Singapore.
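To make the gibberish problem concrete, the following minimal Python sketch encodes a short Chinese string in both GB and Big5 and then decodes the GB bytes with the wrong codec. The sample text, the helper name decode_as, and the choice of Latin-1 as a stand-in for a browser's Western default are assumptions made for illustration; the codec names "gb2312", "big5", and "latin-1" are those of Python's standard library, not of any package discussed here.

```python
# Illustrative sketch only: the sample text and helper name are assumptions,
# not part of any CJK package discussed in this article.

def decode_as(data: bytes, encoding: str) -> str:
    """Decode bytes with the given codec, replacing undecodable bytes."""
    return data.decode(encoding, errors="replace")


text = "中文"  # the word for "Chinese (language)"

# The same two characters become different byte sequences in each encoding.
gb_bytes = text.encode("gb2312")
big5_bytes = text.encode("big5")

# What a reader gets depends entirely on which codec interprets the bytes.
gb_as_gb = decode_as(gb_bytes, "gb2312")        # correct codec
gb_as_western = decode_as(gb_bytes, "latin-1")  # Western default: gibberish
gb_as_big5 = decode_as(gb_bytes, "big5")        # wrong Chinese codec

if __name__ == "__main__":
    print("GB bytes:        ", gb_bytes.hex(" "))
    print("Big5 bytes:      ", big5_bytes.hex(" "))
    print("GB read as GB:   ", gb_as_gb)
    print("GB read as West.:", gb_as_western)
    print("GB read as Big5: ", gb_as_big5)
```

Only the first decoding recovers the original text. This is why a system built around one encoding cannot simply display pages written in another.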

Linguistically, Chinese is very different from Japanese and Korean in syntactic typology. However, Chinese logographic characters have been used widely in the Japanese and Korean languages for centuries. These characters are known as "hanzi" in China, "kanzi" or "kanji" in Japan, and "hanja" in Korea. Semantically, when Chinese characters were borrowed by Korea or Japan, their basic meanings remained the same in most cases. Orthographically, although language reforms have simplified or standardized the writing systems in China and Japan, numerous characters are still written the same way they were many centuries ago. In this sense, there is a good linguistic foundation for the development of software packages that can be used for Chinese, Japanese, and Korean.

In fact, a number of such CJK packages are available on the market. As a sales strategy, the developers of these packages have been offering the latest evaluation versions of their products on their Web sites. The general public can download and legally use them for a limited period of time. Hence, potential users have the opportunity to try out and choose the package most appropriate for their needs before purchasing an official copy, which usually offers more fonts and other word processing capabilities.

Exploring CJK Encodings and Some of the Software Packages Supporting Them

Because Chinese logographic characters are used across several languages in East Asia, each country or region has developed its own character encoding standards. In Japan and Korea, the encodings also cover the national scripts: Japanese has hiragana and katakana characters, and Korean has hangul characters.(2) In fact, more than 50 encoding systems exist for Chinese, Japanese, and Korean; they emerged at different times for various operating systems and purposes. The internal codes that follow represent only a few. However, they are widely used and are relevant to the software packages that I will discuss later.
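As a rough illustration of how these regional standards diverge, the Python sketch below tries to encode one shared logographic character and one character from each national script under four legacy codecs. The sample characters, the codec list, and the helper name encode_table are assumptions chosen for illustration; the codecs are those shipped with Python's standard library and may not match the exact code tables used by any particular CJK package.

```python
# Illustrative sketch only: sample characters, codec list, and helper name
# are assumptions, not taken from the packages discussed in this article.
from typing import Dict, List, Optional

SAMPLES: Dict[str, str] = {
    "shared logograph": "山",  # "mountain": hanzi / kanji / hanja
    "hiragana": "あ",
    "katakana": "ア",
    "hangul": "한",
}

CODECS: List[str] = ["gb2312", "big5", "shift_jis", "euc_kr"]


def encode_table(
    samples: Dict[str, str], codecs: List[str]
) -> Dict[str, Dict[str, Optional[bytes]]]:
    """Return the bytes for each sample under each codec, or None if the
    codec has no code for that character."""
    table: Dict[str, Dict[str, Optional[bytes]]] = {}
    for label, char in samples.items():
        row: Dict[str, Optional[bytes]] = {}
        for codec in codecs:
            try:
                row[codec] = char.encode(codec)
            except UnicodeEncodeError:
                row[codec] = None  # character not covered by this standard
        table[label] = row
    return table


table = encode_table(SAMPLES, CODECS)

if __name__ == "__main__":
    for label, row in table.items():
        cells = ", ".join(
            f"{codec}={value.hex() if value else 'n/a'}"
            for codec, value in row.items()
        )
        print(f"{SAMPLES[label]} ({label}): {cells}")
```

The shared logograph can be encoded under every codec in the list but generally receives a different byte sequence in each, while some national-script characters are simply absent from another region's standard.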

Chinese Internal Codes

* GB: GB stands for "Guo Biao," which is short for "guojia biaozhun," meaning "national standards." It is commonly used in China and Singapore, more often for the simplified Chinese characters than for the traditional Chinese characters.

* Big5: Developed in Taiwan, Big5 is widely used in Taiwan and Hong Kong for the traditional Chinese characters. Nevertheless, Big5 can also display simplified Chinese characters. …