Database Design and Construction: Database Costs - Dollars, Staff and Resources

Parts one and two of this series (Information Today April, page 41, and May, page 7) covered the major decision points, definitions, and structure involved in creating a database. In this, the final article of the series, we will discuss the costs of creating and maintaining a database. Such costs include not only the actual dollars spent on the design and construction of a database but also software and hardware costs, maintenance, staff hours, and other resources that must be budgeted. Personnel costs in particular are often overlooked.

There are a great number of variables that affect costs--some are hidden, some obvious. These can include:

* the cost of acquiring data

* keying, scanning or otherwise converting data into a computer readable form

* the cost of preparing the data--abstracting, indexing, coding or tagging, etc.

* the volume and complexity of data to be processed

* purchase/licensing fees for software

* purchase/licensing fees for hardware

* telecommunications fees

* hardware/software maintenance

* staff

* supplies

* output, i.e., creating CD-ROMs, reports, publications, etc.

If the database is being created for use by, or sale to, others, there are market research, sales, packaging, and distribution costs to be considered as well. The extent to which these costs can be covered by anticipated revenues should be part of an initial cost analysis. In some instances, management may approve the project only if it can be shown to have a potential for producing revenue. In other instances, it may be necessary for the organization to subsidize the database when it appears that the income produced, if any, cannot cover operating costs, yet there is a perceived value to the organization or its employees, members or clientele. Marketing studies and cost/benefit analyses are themselves labor-intensive and expensive.

As we mentioned in part one, the file design should address decisions as to what to include and what to exclude. The volume and scope of the database are very important cost variables. They can also raise questions about the backfile of existing material. For example, should the database include all of the material ever collected, or cover only more recent years or specific areas? Depending on the nature of the data being captured, it may not make sense to spend the time and dollars necessary to load older or outdated material that will seldom or never be called for. On the other hand, there are some disciplines where older materials continue to be relevant and valuable. There is a close correlation between the size of the file and the cost of loading and maintaining the database.

Editorial guidelines and data entry specifications can also significantly affect the cost of converting and entering data. If well thought out and properly documented, such specifications can save considerable time and effort in what can be a very labor-intensive and thus costly operation.

On the input side, costs for entering or converting the data can be expressed in terms of cost per character, per thousand characters, per page, etc. When scanning is involved, they can also be expressed as cost per image. Generally speaking, the cost quoted per unit decreases as the number of units increases, since there are economies of scale. Another factor that affects costs is the accuracy rate required. For example, here at Access Innovations, Inc., our minimum delivered accuracy rate is 99.95 percent, and we have delivered at a 99.998 percent accuracy rate. These rates equate, respectively, to five errors per 10,000 and two errors per 100,000 characters keyed. The quality as well as the volume of data to be converted and/or entered into the database is another cost variable. For example, if all of the material is already in a machine readable format that can be readily loaded into the database being designed, input costs can be quite low relative to other factors. …
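
The relationship between an accuracy rate and the number of errors to expect can be checked with a few lines of arithmetic. The Python sketch below is only an illustration: the function name expected_errors and the one-million-character job are assumptions made for the example, not part of any quoted pricing or specification. It reproduces the two figures given above.

```python
from fractions import Fraction


def expected_errors(accuracy_percent: str, characters: int) -> Fraction:
    """Expected keying errors for a stated accuracy rate (hypothetical helper).

    accuracy_percent is passed as a string such as "99.95" so that the
    calculation is done with exact fractions rather than binary floats.
    """
    error_rate = 1 - Fraction(accuracy_percent) / 100
    return error_rate * characters


# The two accuracy rates quoted in the article.
print(expected_errors("99.95", 10_000))    # 5
print(expected_errors("99.998", 100_000))  # 2

# Assumed example: a job of one million keyed characters at each rate.
print(expected_errors("99.95", 1_000_000))   # 500
print(expected_errors("99.998", 1_000_000))  # 20
```

On the assumed one-million-character job, the gap between the two rates is 500 errors versus 20, which suggests why a higher required accuracy rate can carry a higher price.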