Continuity of Operations Plan (COOP)
| 1.0 | Identification Data |
| 1.1 | BSP Number | 00008 |
| 1.2 | BSP Title/Name | Continuity of Operations Plan (COOP) |
| 1.3 | Version Number | 1 |
| 1.4 | Adoption Date | 05/19/2000 |
| 1.5 | Approving Authority | CIO Council Security Practices Subgroup |
| 1.6 | Responsible Organization | US Department of Treasury |
| 1.7 | Level of BSP | Candidate |
| 1.8 | Security Processes or other Framework(s) Supported | Contingency Planning (SPF 5) |
| 1.9 | Reserved |
| 1.10 | Points of Contact | |
- Vendor Partner:
Mr. Bob Holden
Project Manager
SRA International, Inc.
4350 Fair Lakes Court
Fairfax, VA 22033-4232
Phone: (703) 803-1891
Fax: (703) 502-1130
bob_holden@sra.com
- Mr. Tim Atkin
Director, Critical Infrastructure Protection
SRA International, Inc.
2000 15th Street, North
Arlington, VA 22201-2640
Phone: (703) 558-2009
Fax: (703) 558-4723
tim_atkin@sra.com
- Ms. Mary Ellen Condon
Director, Information Assurance Division
SRA International, Inc.
2000 15th Street, North
Arlington, VA 22201-2640
Phone: (703) 558-8439
Fax: (703) 558-4723
mary_ellen_condon@sra.com
|
| |
| 2.0 | What This BSP Does |
| 2.1 | BSP's Purpose | |
The purpose of this BSP is to share procedures for developing organizational Continuity
of Operations Plans (COOP). Organizations recognize the potential for financial
and operational losses brought about by service interruptions. This BSP provides
a framework for constructing plans to ensure the safety of employees and the resumption
of time-sensitive operations and services in case of potential emergencies, including
localized acts of nature, accidents, and technological and/or attack-related emergencies.
Although the steps listed in paragraph 3.1.2 apply to all Contingency Plans,
the creation of this particular COOP was facilitated by the use of Strohl Systems'
automated software program, the Living Disaster Recovery Planning System
(LDRPS). LDRPS is one of several automated tools available on the market that
supplement the COOP process.
| | 2.2 | Requirements for this BSP | |
The authorization for the development of Continuity of Operations Plans is embodied
in the following documentation:
- Office of Management and Budget Circular A-130, Appendix III, Security of Federal Automated
Information Resources, February 1996
- Computer Security Act of 1987, Public Law 100-235, January 1988
- Presidential Decision Directive 63, Critical Infrastructure Protection, May 1998
- Presidential Decision Directive 67, Enduring Constitutional Government and Continuity of Government
Operations, October 1998
- Executive Order 12656, Assignment of Emergency Preparedness Responsibilities,
November 1988
- Federal Information Processing Standards (FIPS) Publication 87, Guidelines for ADP Contingency Planning,
March 1981
| | |
| 3.0 | What This BSP Is |
| 3.1 | Description of BSP |
| 3.1.1 | Inputs
- Data input for Dictionaries
- Management input
- Recovery Strategies
- Business Impact Analysis
| | 3.1.2 | Process
Although objectives may differ slightly from organization to organization, the
processes listed in this BSP are common across the industry yet
comprehensive enough to meet virtually any organization's needs. Except in
rare cases, the purpose of developing continuity plans is to protect two important
assets of the organization: personnel and data. While developing the COOP, we
remained focused on the protection and safety of personnel and the recovery of
data. The primary objective of the COOP is to establish policies and procedures
to be used in the event of an interruption of service within a preestablished
time period. Responding or reacting to an event or emergency, restoring the most
time-sensitive operations, and eventually, recovering to full functional capacity
are goals that are encompassed in the plan. In the IT world, this includes establishing
an IT operational capability to process, store, and transmit data; implementing
workaround solutions for those portions of the IT system that cannot be immediately
restored; and, ultimately, restoring IT processes to normal operational status.
We used the following steps to guide us through the COOP development process:

Conduct Open Discussion Meeting

A meeting was held for senior department and office leadership to inform them about
the COOP and its importance, explain their participation in the planning process,
and enlist their support. Our meeting was conducted offsite to reduce disruptions.
The Office of the Chief Information Officer (CIO) was present.

Define Continuity Planning and Procedures

The first part of the plan consisted of developing and refining documents that define
organizational policies and procedures regarding continuity planning. These documents
are relatively static, in that the information contained in them is rarely changed.
Once refined, we maintained these static documents separate from the actual COOP
plans, which are living documents that require timely updates. The LDRPS software
program provides a baseline text plan that can be used as a starting point for
this process.

Create a Hierarchy

Next, we broke down the
structure to define the boundaries and responsibilities for creating the plan.
Because of the size of the Treasury Department, elements within the organization
were encouraged to write their own COOP. LDRPS software was used to accommodate
and support COOPs at varying structural levels. For example, Bureaus developed
their own COOPs independently but were invited to maintain them in the automated
software tool on the Treasury LAN. This hierarchy helped define the perimeters
and fostered mutual support within the department for the accomplishment of the
common goal: well-conceived and well-maintained continuity plans at every
level of the Department.

Identify Administrative Security Levels

We
limited access to organizational plans to protect the integrity of the plans and
provide confidentiality for personal contact information of employees. Where possible,
we limited users' access to their own plans. Security levels within the automated
software tool were implemented in accordance with the hierarchy defined in the
process above. When processing classified plans, a stand-alone version was employed,
which included all classified plans in addition to the unclassified plans.

Identify Critical Business Functions and Assets

Leverage existing Business Impact Analyses (BIA) and Risk Assessments (RA) for
this process. If neither has been done for the organization, we recommend conducting
at least a Risk Assessment to help facilitate the Contingency Planning process.
For government systems, our priority was to leverage information gathered on critical
systems as defined by PDD 63. Confidentiality, integrity, and availability were
evaluated. Is the sensitivity or secrecy of data important to the organization?
What about preserving the authenticity and accuracy of data? How important is
ensuring that the asset or data is available? What is the organizational impact
of the disclosure, loss of integrity, or unavailability of the particular asset
or assets? How long would the asset have to be unavailable before the organization
begins to feel the impact? What is the minimum allowable time before restoring
the asset(s) and its associated business function?

Begin Building the Plans

These plans contain the critical information needed to respond to and recover from
a disruption: employee information, vendors, customers, teams and their reporting
structure, software requirements, equipment and supplies, telecommunications,
critical process descriptions, assets, vital records storage locations, and others.
Some of this information is already available in one form or another. Once we
gathered this information, we began building the plans. We used the LDRPS software
format to help guide us through the Plan Construction step, where reporting structures,
teams, team positions, team and individual tasks, and action checklists were created
for those responsible for carrying out the plan.

Gather Information and Maintain the Plan

The Question and Answer (Q&A) Processor in LDRPS is an effective information-gathering
tool. We use the Q&A Processor extensively to decentralize data
collection and update chores, while ensuring centralized database and configuration
control.

Set Up Working Groups

Working groups
were set up with weekly meetings to develop procedures, mark up draft products,
discuss progress and next steps, and resolve issues. In a client-contractor environment
such as this model, the overwhelming success of the project is attributed to close
and continuous client-contractor communication and collaboration.

Determine Plan Output

When determining
plan output, we carefully thought about generating reports that could be used
for archiving, testing, and executing at the time of implementation. With LDRPS,
we reviewed the standard reports available and determined the types of plan information
most suitable, as information within
these standard reports can be filtered. Although the LDRPS standard report format
with its filtering capability is comprehensive, the user may want to further
customize these reports or create new reports altogether. In this case, a working
knowledge of Crystal Reports will prove beneficial.
| | 3.1.3 | Outputs
- A Treasury Department COOP fully in compliance with PDD 67, EO 12656, and FEMA
guidelines
- A Headquarters, Treasury Department Basic Plan for local and national emergencies
- A Continuity of Government Support Plan (TS) together with a Security Classification
Guide and Operations Security Plan
- Supporting plans for alert and notification/relocation and site support for at least three
alternative operating facilities
- Office plans for 18 departmental offices, the Office of the Inspector General, and the
Office of the Inspector General for Tax Administration.
|
| 3.2 | Relationship to Other BSPs | |
- Policies and Procedures - The COOP has a direct relationship with Standard Operating Procedures that guide
personnel in carrying out their assignments. The integration of these plans and
procedures facilitates the Security Risk Management process.
- Business Impact Analysis - A BIA
will help identify the organization's critical business functions and allow
the organization to prioritize the recovery and resumption of assets needed to
restore those critical business functions.
- Risk Assessment - The RA will
reveal the most pertinent threats (including cyber-based) to the organization's
business functions. Knowing what these threats are will help the organization
be proactive in implementing safeguards. Also, the organization may want to design
their COOP exercises around those threats most likely to occur.
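
The prioritization that a BIA supports can be illustrated with a short Python sketch. It is an illustration only and is not drawn from the Treasury COOP or from LDRPS: the `BusinessFunction` record, its `max_downtime_hours` and `impact` fields, and the sample entries are all hypothetical. It assumes the BIA has already produced, for each function, the longest outage the organization can tolerate and a rough impact rating.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BusinessFunction:
    name: str
    max_downtime_hours: float  # hypothetical BIA result: longest tolerable outage
    impact: int                # hypothetical rating, 1 (low) to 5 (severe)


def recovery_order(functions):
    """Order functions for recovery: shortest tolerable outage first,
    with higher impact breaking ties."""
    return sorted(functions, key=lambda f: (f.max_downtime_hours, -f.impact))


# Illustrative entries only; they are not taken from any actual plan.
FUNCTIONS = [
    BusinessFunction("Payroll processing", 72, 3),
    BusinessFunction("Emergency notification", 1, 5),
    BusinessFunction("Public web site", 24, 2),
    BusinessFunction("E-mail", 24, 4),
]

for rank, fn in enumerate(recovery_order(FUNCTIONS), start=1):
    print(f"{rank}. {fn.name} (restore within {fn.max_downtime_hours} h)")
```

Sorting on the tolerable outage first keeps the list aligned with time-sensitive operations; an organization could just as reasonably weight impact more heavily.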
|
| |
| 4.0 | How To Use This BSP |
| 4.1 | Implementation Guidance | |
- A COOP is not a one-time project with
an established start and end date. Rather, it is a living document, and it
is essential that information and action plans in the COOP remain viable and current.
It should be tested at least once a year, or more frequently as
determined by management.
- Since the information in the COOP describes the organization's planning assumptions
and objectives, the COOP may contain some proprietary information that needs protection.
However, the success of the plan necessitates that key personnel have immediate
access.
- All individuals in the organization should be familiar with the COOP. Those who play an active
role should continually ensure that adequate resources and capabilities exist
for carrying out their roles.
- The LDRPS software program stores information in a relational database, organized
into logical data categories. As such, information can be shared between many
plans. This facilitates the plan construction process by eliminating the need
to reenter duplicate information.
- When conducting exercises, we develop a scenario and incidents intended to test the
objectives, direct the drill or exercise, and evaluate it based on comments and
observations. We then incorporate "lessons learned" from each event into
subsequent training and events.
- We leveraged existing Y2K data for recovery and continuity planning whenever possible.
Critical functions, vital information, call trees, rosters, business continuity
teams, and other elements of Y2K plans provided an excellent point of departure
for the COOP. Office plans were converted with information gathered in a single one- to two-hour
session with office managers, who were asked to validate and update the information. If
your organization prepared thoroughly for Y2K, there is no better time to convert
that useful information into a valid, functional COOP.
- We made a point of taking a proactive look while developing the COOP. As the plan
came together, existing threats and vulnerabilities were identified that management
wanted to reduce or eliminate. These findings, of course, are not a substitute
for, but rather an integral part of, the Risk Management process.
- Perhaps too obvious to mention, but too important to leave out: the COOP itself
is considered a vital record, and its information must be readily
available, in hard-copy format, to the people who will need it. We made backup
and storage arrangements accordingly.
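
The point above about a relational database sharing information between plans can be sketched with Python's built-in `sqlite3` module. This is not the LDRPS schema, which this BSP does not describe; the `employee`, `plan`, and `plan_member` tables and the sample rows are assumptions made purely for illustration.

```python
import sqlite3

# In-memory database; the tables below are hypothetical, not the LDRPS schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, phone TEXT);
CREATE TABLE plan (id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE plan_member (
    plan_id INTEGER REFERENCES plan(id),
    employee_id INTEGER REFERENCES employee(id),
    role TEXT
);
""")

# One employee record is entered once and referenced by two plans.
conn.execute("INSERT INTO employee VALUES (?, ?, ?)", (1, "J. Doe", "555-0100"))
conn.executemany("INSERT INTO plan VALUES (?, ?)",
                 [(1, "Office plan"), (2, "Headquarters basic plan")])
conn.executemany("INSERT INTO plan_member VALUES (?, ?, ?)",
                 [(1, 1, "Team leader"), (2, 1, "Alternate")])

# A single update to the shared record is visible in every plan that uses it.
conn.execute("UPDATE employee SET phone = ? WHERE id = ?", ("555-0199", 1))


def roster(connection):
    """Return (plan title, employee name, phone, role) rows for all plans."""
    return connection.execute("""
        SELECT plan.title, employee.name, employee.phone, plan_member.role
        FROM plan_member
        JOIN plan ON plan.id = plan_member.plan_id
        JOIN employee ON employee.id = plan_member.employee_id
        ORDER BY plan.id
    """).fetchall()


for row in roster(conn):
    print(row)
```

Because each plan stores only a reference to the employee, a changed phone number is corrected in one place instead of once per plan.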
| | 4.2 | Implementation Resource Estimates | |
Cost varies widely with a number of factors such as organization size and complexity
and planning level of detail.
| | 4.3 | Performance Goals and Indicators (Metrics) | |
PDD 67 requires that Federal departments and agencies be able to resume performance
of critical functions within 12 hours of warning.
| | 4.4 | Tools | |
Living Disaster Recovery Planning System (LDRPS) software program
Strohl Systems
500 North Gulph Rd
King of Prussia, PA 19406
Telephone: (800) 634-2016
Fax: (610) 768-4135
E-mail: croop@strohlsystems.com |
| 4.5 | Training Materials | |
Strohl Systems offers a two-day basic course and a two-day advanced course for its LDRPS
product. More information can be found at http://www.strohlsystems.com/CompanyInfo/Services/Training.asp |
| |
| Appendices |
| A | Executive Overview and Briefing | | Not available. |
| B | Reference List | |
Contingency Planning and Management (CPM)
Disaster Recovery Information Exchange
Disaster Recovery Institute
Disaster Recovery Journal
Federal Emergency Management Agency (FEMA) |
| C | Procurement Information | |
Contract Vehicles:
GSA GS35F4594G (MOBIS)
GSA GS23F9806H |
| D | Evaluation Information | |
| E | Recommended Changes |
| F | Glossary | |
Alert - Advanced notification that a disaster situation may occur. This forewarns
participants of the possible implementation of the COOP.

Alternate Site - A location, other than the normal facility, used to process data and/or
conduct critical business functions in the event that access to the primary facility
is denied or the primary facility is damaged. Examples of alternate sites include
hot site, cold site, warm site, and mobile recovery.

Backup - The practice of copying information, regardless of media (paper, microfilm, audio
or video tape, computer disks, etc.), to provide a duplicate copy. This is done
for protection in case the active information is unreadable or destroyed. Backups to
support a recovery effort must include a storage strategy that physically separates the
backup data from the original data so that there is minimal chance that the same event
could destroy both copies. Backups may be of various media types.

Contingency Plan - A document containing the recovery timeline methodology, test-validated
documentation, procedures, and action instructions developed specifically for
use in restoring organization operations in the event of a declared disaster.

Business Impact Analysis (BIA) - The process of identifying an organization's exposure
to the sudden loss of selected business functions and/or the supporting resources (threats),
and analyzing the potential disruptive impact of those exposures (risks) on key business
functions and critical business operations.

Business Interruption - Any event, whether anticipated or unanticipated, that disrupts
the normal course of operations at a business location.

Call Tree - A list of key individuals to be contacted. Many of these individuals are
responsible for contacting additional individuals linked below them on the list.
With a call tree, you can help ensure that all of the employees assigned to the
plan will be notified promptly.

Cold Site - Typically a fully constructed data center or similar facility without computer
hardware or similar equipment. A cold site facility has necessary environmental
and support systems such as access controls, raised flooring, chilled water, telecommunications
access for voice and data, electrical power, and air conditioning.

Command Center - A command center will typically be a location with ample voice communications
capabilities as well as office space, furniture, and office equipment to support
emergency management team members. The command center can be located in an alternate
recovery facility, a mobile facility, another building, or a facility such
as a hotel or conference center, remote from the normal business facilities.

Contingency Plan - The ability to sustain critical business functions in response to a disaster
or interruption in supporting services by executing a contingency plan and associated
capabilities that guide the orderly and timely restoration of an organization's
business capabilities.

Critical Business Function - Vital business functions necessary for the continued success
of the organization. If a critical business function is non-operational, the organization
could suffer serious legal, financial, goodwill, or other losses or penalties.
Generally, critical business functions must operate continuously or sustain
only brief interruptions.

Data Integrity - Information and data that accurately reflect the status of a business
function at a given point in time, representing complete, synchronized information
that has passed all data validation and error-checking routines. Data integrity
is critical in the post-interruption environment when data is reconstructed from
backups.

Disaster Recovery - The ability to respond to an interruption in services by implementing
a recovery plan that ensures the orderly and timely restoration of an organization's
business capabilities and supporting resources.

Exercise - A test or drill in which actions in the contingency plan are performed or simulated
as though responding to an event. It is during the exercise that planners and
participants can evaluate whether the planned activities and tasks properly address
potential situations.

Hot Site - A fully equipped computer facility. A hot site contains the stand-by computer
equipment, environmental systems, communications capabilities, and other equipment
necessary to fully support a using organization's immediate data processing requirements
in the event of an emergency or a disaster.

Mitigation - Any measure taken to reduce or eliminate the exposure of assets or resources
to risk.

Off-Site Storage - The process of storing vital records in a facility that is physically remote
from the normal site. Usually this facility is environmentally protected for proper
care and storage of magnetic media, microfilm, and paper.

Recovery - Those long-term activities and programs that are designed to be implemented
beyond the initial crisis period of an emergency or disaster in order to return
all systems to normal status or to reconstitute those systems to a new condition
that is less vulnerable.

Restoration - The act of returning a piece of equipment or some other resource to operational
status. Commercial service companies provide a restoration service with staff
skilled in restoring sensitive equipment or large facilities. Such vendors often
work with insurance companies and may restore equipment for a fee or may purchase
damaged equipment with the intent of restoring the equipment and re-marketing
the product.

Risk - The potential for harm or loss. The chance that an undesirable event will occur.

Risk Analysis - An analysis of potential threats to an organization's ability to maintain
current business operations.

Stand Alone Processing - Processing, typically on a PC, that has no communications link
with other processors.

Threat - Threats are the events that cause a risk to become a loss, for example an earthquake
or fire that destroys a company's computer facility. Threats include natural phenomena
such as storms and floods as well as man-made incidents such as cyber-terrorism,
sabotage, power failures, and bomb threats.

Vital Records - Records or documents, regardless of media (paper, microfilm, audio or
video tape, computer disks, etc.), which, if damaged or destroyed, would disrupt
business operations and information flows, cause considerable inconvenience,
and require replacement or recreation at considerable expense.

Warm Site - An alternate recovery facility partially equipped with hardware, communications,
power, and environmental support equipment. |
|
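
The Call Tree entry in the glossary describes a list in which each person contacts the individuals linked below them. A minimal Python sketch of that idea follows. The names, the `CALL_TREE` mapping, and both helper functions are hypothetical and are not part of LDRPS or the Treasury plans; the sketch assumes the tree is stored as a mapping from each caller to the people that caller must notify.

```python
from collections import deque

# Hypothetical call tree: each caller maps to the people they must contact.
CALL_TREE = {
    "Plan coordinator": ["Team leader A", "Team leader B"],
    "Team leader A": ["Employee 1", "Employee 2"],
    "Team leader B": ["Employee 3"],
}


def notification_order(tree, root):
    """Return (caller, callee) pairs level by level, starting from root."""
    calls = []
    seen = {root}
    queue = deque([root])
    while queue:
        caller = queue.popleft()
        for callee in tree.get(caller, []):
            if callee in seen:  # skip duplicates or cycles in the list
                continue
            seen.add(callee)
            calls.append((caller, callee))
            queue.append(callee)
    return calls


def unreachable(tree, root, everyone):
    """List people assigned to the plan whom the call tree never reaches."""
    reached = {root} | {callee for _, callee in notification_order(tree, root)}
    return sorted(set(everyone) - reached)


for caller, callee in notification_order(CALL_TREE, "Plan coordinator"):
    print(f"{caller} calls {callee}")
```

A check such as `unreachable` is one way to confirm, when rosters are updated, that every employee assigned to the plan still appears somewhere in the tree.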