Designing an Expert System for Classifying Office Documents

Can records management benefit from artificial intelligence technology, and from expert systems in particular? This article answers the question with an example: a small-scale prototype project in the automatic classification of office documents. The project methodology and the basic elements of the expert system approach are described to give guidance to potential users of this promising technology.

Expert systems technology, a category of computer-based artificial intelligence, offers a number of possibilities for further automating and improving the management of office documents. Because of this potential, expert systems are an interesting subject for study and implementation. Designing and developing an expert system application is a challenge. Expert systems are already applied in many areas of human activity, and classification within various subject domains is one of their popular targets. Yet few expert systems have been developed for office document management.

In the book "Artificial Intelligence--Programming Techniques in Basic," an elementary expert system is described as "one of the easiest computer programs to develop from a programming standpoint."(1) To develop such a system, it is only necessary to ask a series of questions; input the answers; apply a series of IF-THEN statements that eliminate any conclusions that do not fit the data provided; and print the conclusions that were not eliminated. This knowledge, together with some skill in the QUICKBASIC programming language and the author's understanding of the scheme for classifying office documents at the International Civil Aviation Organization (ICAO), gave impetus to the project. It was named the Classification of Office Documents Expert System, abbreviated as CLOD-X. The project became more than the design and development of a single application. It became a means to learn and understand expert systems technology as a category of artificial intelligence, and a gateway to exploring how expert systems could be used more widely in records and document management within the contemporary office environment.
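
The four steps quoted above (ask, input, eliminate with IF-THEN rules, print what remains) can be illustrated with a minimal sketch. It is written in Python rather than QUICKBASIC, and it is not the CLOD-X code: the candidate labels, question keys, and rules are hypothetical placeholders invented for illustration and are not taken from the ICAO file guide.

```python
# Minimal elimination-style classifier, assuming yes/no answers.
# All labels, question keys, and rules below are invented placeholders,
# not entries from the ICAO Central Registry File Guide.

CANDIDATES = {
    "sale of publications",
    "free distribution",
    "sale of audio-visual aids",
}

# Each rule reads: IF the answer to <question> equals <trigger>
# THEN eliminate <these candidates>.
RULES = [
    ("involves_payment", False, {"sale of publications", "sale of audio-visual aids"}),
    ("involves_payment", True, {"free distribution"}),
    ("is_audio_visual", True, {"sale of publications"}),
    ("is_audio_visual", False, {"sale of audio-visual aids"}),
]


def classify(answers):
    """Return the candidates that no rule eliminated, in sorted order.

    `answers` maps a question key to True or False. A question that was
    not answered triggers no rule, so it eliminates nothing.
    """
    remaining = set(CANDIDATES)
    for question, trigger, eliminated in RULES:
        if answers.get(question) == trigger:
            remaining -= eliminated
    return sorted(remaining)


# Example: a document about a paid order for a printed publication.
print(classify({"involves_payment": True, "is_audio_visual": False}))
```

In this sketch the answers are passed in as a dictionary so the logic can be run without a keyboard dialogue; an interactive version would collect the same answers by prompting the user before calling `classify`. With no answers at all, every candidate survives, which mirrors the idea that conclusions are only ever removed, never added.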

This is a small-scale prototype project with some limitations. First, from a broader perspective, the constraints of the chosen approach mean that its conclusions may be unique to this project and may not be reproduced by other, similar projects. More research is needed before sound generalizations can be made. Nevertheless, the project indicates that expert systems may be a viable means of automating office document classification. Second, from the micro perspective, an analysis of the project framework showed that limits had to be set. ICAO uses over nine thousand different file titles, so the scope had to be restricted to a manageable number, at least for the initial design of a prototype such as CLOD-X. Therefore, only the section of the ICAO Central Registry File Guide(2) dealing with Distribution and Sale of Publications and Documents (Code A10) was selected, together with the part of the registry file section dealing with the sale of Audio Visual Aids (Code AN12). The selected sample contained all the files dealing with the purchase and sale of ICAO publications. The resulting subset included over four hundred file titles, or possibilities (classification suggestions), which the expert system had to take into consideration. This was a sufficiently large database (number of files) for the design and programming of a simple expert system. It is believed that, building on this experience, a comprehensive expert system covering the whole range of ICAO file subjects could be developed.

THE FEASIBILITY STUDY

The starting point of any well-planned project is a feasibility study, and the design of an expert system for document classification should be no exception. Before any knowledge acquisition, knowledge representation, programming, or system building begins, a crucial question has to be answered: Is the project feasible? …