Disaster Recovery Then and Now: Considerations for Contingency Planning
Rehak, Richard H., Risk Management
Disaster! The word itself seems to radiate gloom. Most people associate disaster with raging fires, floods, earthquakes and explosions. Actually, seemingly minor inconveniences can spell disaster for many businesses: a leak in a subterranean tunnel flooded basements and eventually shut down the city of Chicago; loss of electrical power at a telephone company switching office shut down and restricted traffic at the major metropolitan airports in New York; and a bolt of lightning during a summer rainstorm caused an electrical surge that raced through a Florida company's coaxial terminal connections and destroyed a local area network (LAN) and the data it contained. Should one think that this was just a fluke, consider that this company suffered a repeat of this situation only a couple of weeks later during another storm.
Growing dependence on automation, coupled with the nation's failing infrastructure, is a recipe for disasters to occur with ever-increasing frequency. In fact, more than 50 percent of the disasters supported by Comdisco Disaster Recovery Services during the past 10 years took place in the past 18 months alone.
Now imagine having just started a new job in the 1970s, managing the computer operations department for a large food manufacturer. Just two weeks in, disaster strikes! The lights suddenly go out, and the silence that ensues contrasts sharply with the usual background noise of the data center.
On the way to the computer room, the new MIS manager meets the first-shift supervisor, who stammers, "The computer's down." All computer equipment is quickly switched off to protect it from an electrical surge should the power come back on; this was the usual routine prior to or during electrical storms. As it turns out, the outage was caused by a construction crew building a new road between the corporate complex and the neighboring complex. The laborers had inadvertently cut the main power cable with their bulldozer, and specialized parts had to be shipped in by air for repairs.
The computer scheduling personnel, shift supervisor and MIS manager determined where the organization stood in regard to its production schedule. Fortunately, the company was current with all applications and was a week or so away from its monthly closing requirements. While the on-line systems were down, the user groups had manual fall-back procedures and could operate for several hours without major disruption.
The plan, if the computer was down, had been to rent computer time from other companies and to use the organization's own operating system backup tapes to duplicate the home operating environment. This process would take about 35 minutes to become operational. (The procedure to do this had been successfully tested at another location six months earlier without major difficulty.) The operators would, of course, have to modify all job control language to match the hardware configuration addresses and device types of the new machines they would be operating, which can be a considerable and time-consuming undertaking.
The facility manager informed the MIS manager that the primary feed to the complex was broken and that best estimates to get the parts and repair it were 24 hours. But the department needed to process the day's work that evening and prepare reports for distribution the next morning--the company was now officially in a disaster recovery mode.
It proved to be much harder to locate available computer time than it had been just a few months earlier. Many of the companies the operations department had dealt with previously had changed equipment or no longer had excess time available. Arrangements were finally made with about five different companies, but contiguous blocks of time longer than 24 hours were still unavailable. Other companies would provide most of the second shift and all of the third shift with the understanding that the company would need to be off their systems by 9:00 a. …