IWS - The Information Warfare Site

Continuity of Operations Plan (COOP)

1.0 Identification Data
1.1 BSP Number
00008
1.2 BSP Title/Name
Continuity of Operations Plan (COOP)
1.3 Version Number
1
1.4 Adoption Date
05/19/2000
1.5 Approving Authority
CIO Council Security Practices Subgroup
1.6 Responsible Organization
US Department of Treasury
1.7 Level of BSP
Candidate
1.8 Security Processes or other Framework(s) Supported
Contingency Planning (SPF 5)
1.9 Reserved
1.10 Points of Contact
2.0 What This BSP Does
2.1 BSP's Purpose
The purpose of this BSP is to share procedures for developing organizational Continuity of Operations Plans (COOP). Organizations recognize the potential for financial and operational losses brought about by service interruptions. This BSP provides a framework for constructing plans to ensure the safety of employees and the resumption of time-sensitive operations and services in case of potential emergencies, including localized acts of nature, accidents, and technological and/or attack-related emergencies. Although the steps listed in paragraph 3.1.2 apply to all contingency plans, the creation of this particular COOP was facilitated by Strohl Systems’ automated software program, the Living Disaster Recovery Planning System (LDRPS). LDRPS is one of several automated tools available on the market that supplement the COOP process.
2.2 Requirements for this BSP
The authorization for the development of Continuity of Operations Plans is embodied in the following documentation:
  • Office of Management and Budget Circular A-130, Appendix III, Security of Federal Automated Information Resources, February 1996
  • Computer Security Act of 1987, Public Law 100-235, January 1988
  • Presidential Decision Directive 63, Critical Infrastructure Protection, May 1998
  • Presidential Decision Directive 67, Enduring Constitutional Government and Continuity of Government Operations, October 1998
  • Executive Order 12656, Assignment of Emergency Preparedness Responsibilities, November 1988
  • Federal Information Processing Standards (FIPS) Publication 87, Guidelines for ADP Contingency Planning, March 1981
3.0 What This BSP Is
3.1 Description of BSP
3.1.1 Inputs
Data input for Dictionaries
Management input
Recovery Strategies
Business Impact Analysis
3.1.2 Process
Although objectives may differ slightly from organization to organization, the processes listed in this BSP are common to the industry yet comprehensive enough to meet virtually any organization’s needs. Except in rare cases, the purpose of developing continuity plans is to protect two important assets of the organization: personnel and data. While developing the COOP, we remained focused on the protection and safety of personnel and the recovery of data.

The primary objective of the COOP is to establish policies and procedures to be used in the event of an interruption of service within a preestablished time period. The plan encompasses three goals: responding to an event or emergency, restoring the most time-sensitive operations, and eventually recovering to full functional capacity. In the IT world, this includes establishing an IT operational capability to process, store, and transmit data; implementing workaround solutions for those portions of the IT system that cannot be immediately restored; and, ultimately, restoring IT processes to normal operational status. We used the following steps to guide us through the COOP development process:

Conduct Open Discussion Meeting – A meeting was held for senior department and office leadership to inform them about the COOP and its importance, explain their participation in the planning process, and enlist their support. Our meeting was conducted offsite to reduce disruptions. The Office of the Chief Information Officer (CIO) was present.

Define Continuity Planning and Procedures – The first part of the plan consisted of developing and refining documents that define organizational policies and procedures regarding continuity planning. These documents are relatively static, in that the information contained in them is rarely changed. Once they were refined, we maintained these static documents separately from the actual COOP plans, which are living documents that require timely updates. The LDRPS software program provides a baseline text plan that can be used as a starting point for this process.

Create a Hierarchy – Next, we broke down the organizational structure to define the boundaries and responsibilities for creating the plan. Because of the size of the Treasury Department, elements within the organization were encouraged to write their own COOPs. LDRPS software was used to accommodate and support COOPs at varying structural levels. For example, Bureaus developed their own COOPs independently, but were invited to maintain them in the automated software tool on the Treasury LAN. This hierarchy helped define the perimeters and fostered mutual support within the department for the accomplishment of the common goal – well-conceived and well-maintained continuity plans at every level of the Department.

Identify Administrative Security Levels – We limited access to organizational plans to protect the integrity of the plans and provide confidentiality for personal contact information of employees. Where possible, we limited users’ access to their own plans. Security levels within the automated software tool were implemented in accordance with the hierarchy defined in the process above. When processing classified plans, a stand-alone version was employed, which included all classified plans in addition to the unclassified plans.
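The idea of tying plan access to the organizational hierarchy can be illustrated with a minimal Python sketch. It does not reflect how LDRPS implements security levels. The unit names and the rule itself (a user may view plans owned by their own unit or by units below it) are assumptions made for illustration only.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class OrgUnit:
    """A node in the planning hierarchy (all names here are hypothetical)."""
    name: str
    parent: Optional["OrgUnit"] = None


def is_within(unit: OrgUnit, ancestor: OrgUnit) -> bool:
    """Return True if `unit` is `ancestor` or sits below it in the hierarchy."""
    node: Optional[OrgUnit] = unit
    while node is not None:
        if node == ancestor:
            return True
        node = node.parent
    return False


def can_view_plan(user_unit: OrgUnit, plan_owner: OrgUnit) -> bool:
    """Assumed rule: users see plans owned by their own unit or its sub-units."""
    return is_within(plan_owner, user_unit)


department = OrgUnit("Department")
bureau_a = OrgUnit("Bureau A", parent=department)
bureau_b = OrgUnit("Bureau B", parent=department)
office_a1 = OrgUnit("Office A1", parent=bureau_a)

print(can_view_plan(bureau_a, office_a1))  # Bureau A user, plan of its own office
print(can_view_plan(bureau_a, bureau_b))   # Bureau A user, plan of a sibling bureau
```

Under this assumed rule, a department-level administrator can reach every plan, while an office user is confined to the office's own plan.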

Identify Critical Business Functions and Assets – Leverage existing Business Impact Analyses (BIA) and Risk Assessments (RA) for this process. If neither has been done for the organization, we recommend conducting at least a Risk Assessment to help facilitate the Contingency Planning process. For government systems, our priority was to leverage information gathered on critical systems as defined by PDD 63. Confidentiality, integrity, and availability were evaluated. Is the sensitivity or secrecy of data important to the organization? What about preserving the authenticity and accuracy of data? How important is ensuring that the asset or data is available? What is the organizational impact of the disclosure, loss of integrity, or unavailability of the particular asset or assets? How long would the asset have to be unavailable before the organization begins to feel the impact? What is the minimum allowable time before restoring the asset(s) and its associated business function?
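One way to turn the answers to these questions into a recovery order is sketched below in Python. The 1 to 3 rating scale, the asset names, the outage figures, and the sorting rule are all illustrative assumptions, not values or methods taken from the Treasury COOP.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Asset:
    name: str
    confidentiality: int           # 1 (low) to 3 (high); illustrative scale
    integrity: int                 # same assumed scale
    availability: int              # same assumed scale
    tolerable_outage_hours: float  # how long before the impact is felt


def recovery_order(assets: List[Asset]) -> List[Asset]:
    """Shortest tolerable outage first; ties broken by highest combined rating."""
    return sorted(
        assets,
        key=lambda a: (
            a.tolerable_outage_hours,
            -(a.confidentiality + a.integrity + a.availability),
        ),
    )


# Hypothetical assets for demonstration only.
assets = [
    Asset("Payroll system", 3, 3, 2, 72),
    Asset("Public web site", 1, 2, 2, 24),
    Asset("Payment processing", 3, 3, 3, 12),
    Asset("Case tracking", 2, 3, 3, 24),
]

for asset in recovery_order(assets):
    print(f"{asset.tolerable_outage_hours:>5} h  {asset.name}")
```

An organization might reasonably weight the three ratings differently or rank by financial impact instead; the point is only that the answers become comparable once they are recorded per asset.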

Begin Building the Plans – These plans contain the critical information needed to respond to and recover from a disruption: employee information, vendors, customers, teams and their reporting structure, software requirements, equipment and supplies, telecommunications, critical process descriptions, assets, vital records storage locations, and others. Some of this information is already available in one form or another. Once we gathered this information, we began building the plans. We used the LDRPS software format to help guide us through the Plan Construction step, where reporting structures, teams, team positions, team and individual tasks, and action checklists were created for those responsible for carrying out the plan.
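The relationship between teams, positions, tasks, and action checklists can be pictured with a small Python data model. This is a sketch under assumed names and is not the LDRPS data format; the class names, fields, and sample entries are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Task:
    description: str
    done: bool = False


@dataclass
class Position:
    title: str
    assignee: str
    tasks: List[Task] = field(default_factory=list)


@dataclass
class Team:
    name: str
    reports_to: str  # name of the team or role this team reports to
    positions: List[Position] = field(default_factory=list)


def checklist_for(team: Team, assignee: str) -> List[str]:
    """Build an action checklist for one person from the team's positions."""
    lines = []
    for position in team.positions:
        if position.assignee == assignee:
            for task in position.tasks:
                mark = "x" if task.done else " "
                lines.append(f"[{mark}] {position.title}: {task.description}")
    return lines


# Hypothetical team for demonstration only.
relocation = Team(
    name="Relocation Team",
    reports_to="Emergency Management Team",
    positions=[
        Position("Team Leader", "A. Smith", [
            Task("Confirm alternate site is available", done=True),
            Task("Notify team members"),
        ]),
        Position("Logistics", "B. Jones", [Task("Arrange transport")]),
    ],
)

for line in checklist_for(relocation, "A. Smith"):
    print(line)
```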

Gather Information and Maintain the Plan – The Question and Answer (Q&A) Processor in LDRPS is an effective information gathering tool. We used the Q&A Processor extensively to decentralize data collection and update chores while ensuring centralized database and configuration control.

Set Up Working Groups – Working groups were set up with weekly meetings to develop procedures, mark up draft products, discuss progress and next steps, and resolve issues. In a client-contractor environment such as this model, the overwhelming success of the project is attributed to close and continuous client-contractor communication and collaboration.

Determine Plan Output – When determining plan output, we thought carefully about generating reports that could be used for archiving, testing, and executing at the time of implementation. With LDRPS, we reviewed the standard reports available and determined the types of plan information most suitable, as information within these standard reports can be filtered. Although the LDRPS standard report format with its filtering capability is comprehensive, the user may desire to further customize these reports or create new reports altogether. In this case, a working knowledge of Crystal Reports will prove beneficial.

3.1.3 Outputs
  • A Treasury Department COOP fully in compliance with PDD 67, EO 12656 and FEMA guidelines
  • A Headquarters, Treasury Department Basic Plan for local and national emergencies
  • A Continuity of Government Support Plan (TS) together with a Security Classification Guide and Operations Security Plan
  • Supporting plans for alert and notification/relocation and site support for at least three alternative operating facilities
  • Office plans for 18 departmental offices, the Office of the Inspector General, and the Office of the Inspector General for Tax Administration.
3.2 Relationship to Other BSPs
  • Policies and Procedures - The COOP has a direct relationship with Standard Operating Procedures that guide personnel in carrying out their assignments. The integration of these plans and procedures facilitates the Security Risk Management process.
  • Business Impact Analysis – A BIA will help identify the organization’s critical business functions, and allow the organization to prioritize the recovery and resumption of assets needed to restore those critical business functions.
  • Risk Assessment – The RA will reveal the most pertinent threats (including cyber-based) to the organization’s business functions. Knowing what these threats are will help the organization be proactive in implementing safeguards. Also, the organization may want to design its COOP exercises around those threats most likely to occur.
4.0 How To Use This BSP
4.1 Implementation Guidance
  • A COOP is not a one-time project with an established start and end date. Rather, it is a living document, and it is essential that the information and action plans in it remain viable and current. It should be tested at least once a year, or more frequently as determined by management.
  • Since the information in the COOP describes the organization’s planning assumptions and objectives, the COOP may contain some proprietary information that needs protection. However, the success of the plan necessitates that key personnel have immediate access.
  • All individuals of the organization should be familiar with the COOP. Those who play an active role should continually ensure that adequate resources and capabilities exist for carrying out their roles.
  • The LDRPS software program stores information in a relational database, organized into logical data categories. As such, information can be shared between many plans. This facilitates the plan construction process by eliminating the need to reenter duplicate information.
  • When conducting exercises, we develop a scenario and incidents intended to test the objectives, direct the drill or exercise, and evaluate it based on comments and observations. We then incorporate "lessons learned" from each event into subsequent training and events.
  • We leveraged existing Y2K data for recovery and continuity planning whenever possible. Critical functions, vital information, call trees, rosters, business continuity teams and other elements of Y2K plans provided an excellent point of departure for COOP. Office plans were converted with information gathered in a single one-to-two-hour session with office managers, who were asked to validate and update the information. If your organization prepared thoroughly for Y2K, there is no better time to convert that useful information into a valid, functional COOP.
  • We did not forget to take a proactive look while developing the COOP. As the plan came together, existing threats and vulnerabilities were identified that management wanted to reduce or eliminate. These findings, of course, are not a substitute for, but rather an integral part of, the Risk Management process.
  • It is perhaps too obvious to mention, but too important to leave out: the COOP itself is considered a vital record, and its information must be readily available, in hard copy format, to the people who will need it. We made backup and storage arrangements accordingly.
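The benefit of keeping plan data in a relational database, as noted in the guidance above, can be shown with a short Python sketch using the standard sqlite3 module. The schema, table names, and sample records are hypothetical and are not the LDRPS schema; the sketch only illustrates how one stored contact record can serve several plans.

```python
import sqlite3

# In-memory database with an assumed, simplified schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
    CREATE TABLE plan (id INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE plan_person (
        plan_id INTEGER REFERENCES plan(id),
        person_id INTEGER REFERENCES person(id),
        role TEXT
    );
""")

# One person record is entered once and linked to two plans.
conn.execute("INSERT INTO person VALUES (1, 'J. Doe', '555-0100')")
conn.executemany("INSERT INTO plan VALUES (?, ?)",
                 [(1, "Office plan"), (2, "Relocation plan")])
conn.executemany("INSERT INTO plan_person VALUES (?, ?, ?)",
                 [(1, 1, "Team leader"), (2, 1, "Alternate")])


def contacts(plan_id):
    """Return (name, phone, role) rows for everyone assigned to a plan."""
    rows = conn.execute(
        "SELECT p.name, p.phone, pp.role "
        "FROM plan_person pp JOIN person p ON p.id = pp.person_id "
        "WHERE pp.plan_id = ? ORDER BY p.name",
        (plan_id,),
    )
    return rows.fetchall()


# A single update to the shared record is reflected in every plan that uses it.
conn.execute("UPDATE person SET phone = '555-0199' WHERE id = 1")
print(contacts(1))
print(contacts(2))
```

Because the phone number lives in one row, neither plan can drift out of date relative to the other, which is the practical meaning of not reentering duplicate information.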
4.2 Implementation Resource Estimates
Cost varies widely with a number of factors such as organization size and complexity and planning level of detail.
4.3 Performance Goals and Indicators (Metrics)
PDD 67 requires that Federal departments and agencies be able to resume performance of critical functions within 12 hours of warning.
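A test or exercise could be scored against this 12-hour goal with a few lines of Python. The sketch below is illustrative: the function names, business function names, and timestamps are hypothetical, and only the 12-hour figure comes from the requirement stated above.

```python
from datetime import datetime, timedelta

# The 12-hour resumption goal stated above.
RESUMPTION_GOAL = timedelta(hours=12)


def met_goal(warning_time: datetime, resumed_time: datetime) -> bool:
    """True if the function resumed within the goal, measured from the warning."""
    return resumed_time - warning_time <= RESUMPTION_GOAL


def functions_missing_goal(results):
    """results maps a function name to its (warning_time, resumed_time) pair."""
    return sorted(
        name for name, (warned, resumed) in results.items()
        if not met_goal(warned, resumed)
    )


# Hypothetical exercise results for demonstration only.
warning = datetime(2000, 5, 19, 8, 0)
results = {
    "Payments": (warning, datetime(2000, 5, 19, 17, 30)),
    "Records": (warning, datetime(2000, 5, 19, 21, 15)),
}
print(functions_missing_goal(results))
```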
4.4 Tools
Living Disaster Recovery Planning System (LDRPS) software program
Strohl Systems
500 North Gulph Rd
King of Prussia, PA 19406
Telephone: (800) 634-2016
Fax: (610) 768-4135
E-mail: croop@strohlsystems.com
4.5 Training Materials
Strohl Systems offers a two-day basic course and a two-day advanced course for its LDRPS product. More information can be found at http://www.strohlsystems.com/CompanyInfo/Services/Training.asp
Appendices
A Executive Overview and Briefing
Not available.
B Reference List
Contingency Planning and Management (CPM)
Disaster Recovery Information Exchange
Disaster Recovery Institute
Disaster Recovery Journal
Federal Emergency Management Agency (FEMA)
C Procurement Information
Contract Vehicles: GSA GS35F4594G (MOBIS)
GSA GS23F9806H
D Evaluation Information
E Recommended Changes
F Glossary
Alert – Advance notification that a disaster situation may occur. This forewarns participants of the possible implementation of the COOP.

Alternate Site - A location, other than the normal facility, used to process data and/or conduct critical business functions in the event that access to the primary facility is denied or the primary facility is damaged. Examples of alternate sites include: hot site, cold site, warm site, and mobile recovery.

Backup - The practice of copying information, regardless of media (paper, microfilm, audio or video tape, computer disks, etc.), to provide a duplicate copy. This is done for protection in case the active information is unreadable or destroyed. Backups to support a recovery effort must include a storage strategy that physically separates the backup data from the original data, so that there is minimal chance that the same event could destroy both copies. Backups may be of various media types.

Contingency Plan - A document containing the recovery timeline methodology, test-validated documentation, procedures, and action instructions developed specifically for use in restoring organization operations in the event of a declared disaster.

Business Impact Analysis (BIA) - The process of identifying an organization's exposure to the sudden loss of selected business functions and/or the supporting resources (threats), and analyzing the potential disruptive impact of those exposures (risks) on key business functions and critical business operations.

Business Interruption - Any event, whether anticipated or unanticipated, that disrupts the normal course of operations at a business location.

Call Tree - A list of key individuals to be contacted. Many of these individuals are responsible for contacting additional individuals linked below them on the list. With a call tree, you can help ensure that all of the employees assigned to the plan will be notified promptly.
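A call tree maps naturally onto a simple data structure. The Python sketch below, with hypothetical names, walks a tree level by level to show the order in which people would be reached and to flag anyone assigned to the plan who is missing from the tree. It is an illustration of the concept, not a feature of any particular tool.

```python
from collections import deque


def notification_order(call_tree, root):
    """Breadth-first order in which people are reached, starting from `root`."""
    order, seen, queue = [], {root}, deque([root])
    while queue:
        caller = queue.popleft()
        order.append(caller)
        for contact in call_tree.get(caller, []):
            if contact not in seen:  # guard against duplicates and loops
                seen.add(contact)
                queue.append(contact)
    return order


def unreached(call_tree, root, assigned):
    """People assigned to the plan whom the call tree never reaches."""
    return sorted(set(assigned) - set(notification_order(call_tree, root)))


# Hypothetical call tree: each key calls the people in its list.
call_tree = {
    "Coordinator": ["Lead A", "Lead B"],
    "Lead A": ["Staff 1", "Staff 2"],
    "Lead B": ["Staff 3"],
}

print(notification_order(call_tree, "Coordinator"))
print(unreached(call_tree, "Coordinator", ["Staff 1", "Staff 3", "Staff 4"]))
```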

Cold Site - Typically a fully constructed data center or similar facility without computer hardware or similar equipment. A cold site facility has necessary environmental and support systems such as access controls, raised flooring, chilled water, telecommunications access for voice and data, electrical power, and air conditioning.

Command Center - A command center will typically be a location with ample voice communications capabilities as well as office space, furniture, and office equipment to support emergency management team members. The command center can be located in an alternate recovery facility, mobile facility, in another building, or in a facility such as a hotel or conference center, remote from the normal business facilities.

Contingency Plan - The ability to sustain critical business functions in response to a disaster or interruption in supporting services by executing a contingency plan and associated capabilities that guides the orderly and timely restoration of an organization’s business capabilities.

Critical Business Function - Vital business functions necessary for the continued success of the organization. If a critical business function is non-operational, the organization could suffer serious legal, financial, goodwill, or other serious losses or penalties. Generally, critical business function(s) must operate continuously or sustain only brief interruptions.

Data Integrity - Information and data that accurately reflects the status of a business function at a given point of time, representing complete, synchronized information that has passed all data validation and error checking routines. Data integrity is critical in the post interruption environment when data is reconstructed from backups.

Disaster Recovery - The ability to respond to an interruption in services by implementing a recovery plan that ensures the orderly and timely restoration of an organization’s business capabilities and supporting resources.

Exercise - A test or drill in which actions in the contingency plan are performed or simulated as though responding to an event. It is during the exercise that planners and participants can evaluate whether the planned activities and tasks properly address potential situations.

Hot Site - A fully equipped computer facility. A hot site contains the stand-by computer equipment, environmental systems, communications capabilities, and other equipment necessary to fully support a using organization's immediate data processing requirements in the event of an emergency or a disaster.

Mitigation - Any measure taken to reduce or eliminate the exposure of assets or resources to risk.

Off-Site Storage - The process of storing vital records in a facility that is physically remote from the normal site. Usually this facility is environmentally protected for proper care and storage of magnetic media, microfilm, and paper.

Recovery - Those long-term activities and programs which are designed to be implemented beyond the initial crisis period of an emergency or disaster in order to return all systems to normal status or to reconstitute those systems to a new condition that is less vulnerable.

Restoration - The act of returning a piece of equipment or some other resource to operational status. Commercial service companies provide a restoration service with staff skilled in restoring sensitive equipment or large facilities. Such vendors often work with insurance companies and may restore equipment for a fee or may purchase damaged equipment with the intent of restoring the equipment and re-marketing the product.

Risk - The potential for harm or loss. The chance that an undesirable event will occur.

Risk Analysis - An analysis of potential threats to an organization's ability to maintain current business operations.

Stand Alone Processing - Processing typically on a PC which has no communications link with other processors.

Threat - An event that causes a risk to become a loss, such as an earthquake or fire that destroys a company's computer facility. Threats include natural phenomena such as storms and floods as well as man-made incidents such as cyber-terrorism, sabotage, power failures, and bomb threats.

Vital Records - Records or documents, regardless of media (paper, microfilm, audio or video tape, computer disks, etc.) which, if damaged or destroyed, would disrupt business operations and information flows and cause considerable inconvenience and require replacement or recreation at considerable expense.

Warm Site - An alternate recovery facility partially equipped with hardware, communications, power, and environmental support equipment.