Infocon Magazine Issue One,
Continuity Planning Interview
with David Spinks, EDS
Interviewer: Wanja Eric Naef
See also Business Continuity
Planning - A safety net for businesses
David Spinks is director - Information Assurance for Europe,
Middle East and Africa (EMEA) at EDS (http://www.eds.com).
He is responsible for EDS' Portfolio of Information Assurance
services across all EMEA markets. Services provided by
Information Assurance include Cyber Intelligence Services,
Global Web Assurance & Hosting, Cyber Forensics,
Cyber Security Institute and Secure Digital Communications.
He is also chairman of the E-commerce Security Special Interest
Group, an active member of the Guild of Security Controllers,
a member of the British Computer Society Committee and
co-author of the guide "E-commerce - a World of
Opportunity". He has spoken to audiences all over
the world on subjects such as the impact of e-commerce
on the supplier chain, business continuity planning after
year 2000 and information security: the real threats.
Q: Usually there is confusion about the term BCP. Some people use
the term continuity, others contingency and some use them interchangeably.
How would you define them?
David Spinks: That
is a good question. For BCP, business continuity planning, certainly
I would like to use the widest use of BCP in that if you look
at a typical business continuity planning study we are not just
necessarily looking at the needs, wants and requirements of the
business. The typical BCP study if focused on the right area
will start looking at stakeholders and their values. For instance,
I have done some work in the oil industry. The thing with the
oil industry clearly is that they have a potential impact on
the environment they are in. Very often with many oil companies
I have worked for it was actually a beneficial impact, because
they subsidise local schools and local education, because it
is in their interest to be centred in a happy community.
Therefore the typical BCP study, if conducted in the right way,
will consider the needs and wants of the stakeholders: communities,
shareholders, owners, staff, management and all other stakeholders in
the corporate entity, including banks and insurance companies. Next the study
will normally include some form of business impact assessment.
Now if the study has been done at the right level then those
business impacts will also consider the impacts of any unplanned
event on the stakeholder as well as the business itself. So we
are not just looking at contingency planning: generally contingency
planning entails factors concerning a corporate entity, organisation
and business process and attempts to predict what may happen
if unplanned events occur. The recommendation will typically
suggest various mitigations against those particular business
impacts. But BCP is at a much higher level. It is saying that we
should consider this in terms of the impacts of these things
on the corporate entity and corporate stakeholders as well.
Obviously the other big confusion
is with either CDR, which is computer disaster recovery, or DR,
which is disaster recovery. Quite often those studies are centred
on the technology rather than the business. Clearly technology
has a major role to play in the business, but we are finding more
and more that computer disaster recovery is done
in isolation from the business continuity planning.
What we are trying to do is to bring the two things together
and thus make sure that the computer disaster recovery complements
the business continuity planning.
Q: What would you say a good BCP plan should look like?
David Spinks: I would like to step back from that question and say the plan is
important, i.e. the material bits of paper for responding to a
crisis are important. However, what is even more important than
the plan itself is the planning process, which has led to the
development of the plan. And what I would look for in the terms
of the effectiveness of the plan is:
First, have all the business
leaders been involved? Is the plan owned by the business or does
the business say we delegate that to the IT department or the
business-planning department or is the plan owned by the business
continuity manager? If the ownership of that plan is in a technical
area then it is very likely that when unexpected events happen
then that plan is not going to work. So therefore the plan should
involve the operational business leaders, so that they see it
as their plan. So that is the first critical success factor.
The second is probably the most important and that is: is there
evidence that the plan has
been tested? We would look for two types of tests: we would look
for desk based tests, simulations, but we would also look for
real evidence that plans have been tested for real, namely that
people have been taken through the motions of various crises
and that somebody has independently stress-tested those tests
themselves. We see far too many business continuity plans and/or
disaster recovery plans that, whilst they have been tested, were
tested in unrealistically ideal conditions and thus do not truly
recognise what really happens in a crisis.
There are two things which happen in a crisis: firstly, communications
break down unless
there is a good crisis communication plan, and secondly management
(this is from real case histories of big disasters) are nowhere
to be found. They disappear. Alternatively management become
obtrusive, because a crisis has two phases: the internal
phase, which is looking into the organisation and attempting
to recover that organisation using the business continuity plan,
and another phase in which the organisation is
looking out into the stakeholder world and managing the stakeholders' needs
and wants at the point of the crisis. And that is where the communication
and PR people come in because they are the people who are going
to front the press, the media, the stakeholders and the shareholders.
A very effective BCP plan
operates on three levels. The first is global: this is where the board operates,
as it looks at the long term impacts of the crisis on the business
and the board are then communicating with the press, the media
and if necessary with governments at a global level.
There is a second level, which
is looking at the recovery of either a site or geographical area.
The third level is looking at putting out the fire, because
that is quite important.
It is like a bath with the taps running. Put the plug in and
make sure the water doesn't leak out and then turn the taps off.
That is the operational response. What we find is that too many
technical people only consider the third level. They only
consider the technical response to it and they forget that in
a major crisis somebody has to be at the gates talking to the
press, talking to the media, reassuring the local community,
dealing with the longer-term aspects of an event.
Now that model builds on
the UK Cabinet Office and Home Office emergency response gold,
silver, bronze scheme and we have seen that work in operation a number
of times and it worked very successfully.
Q: Could you please give us an example?
David Spinks: The King's Cross fire in the UK was managed on a bronze, silver and
gold operational level. And there are some pretty good case histories
written about that and how to deal with it.
Another example was the operation
put into place by Greater Manchester Police after the Manchester
Bomb. And that had all sorts of lessons to be learned on how
to manage a crisis.
Q: You mentioned that it is of vital importance to have the backing
of senior management for BCP plans. Some
companies do not really care. How can you convince senior management
to support such plans?
David Spinks: There are enough codes of practice in place for senior management to
not only support business continuity planning, but actually push
people to deliver it, because at the end of the day the board
of directors is held responsible in times of crisis if there
is not adequate planning. In the UK, we have generally recognized
practices such as the Turnbull Combined Code of Practice which
makes various commitments on both executive and non executive
officers of organisations to manage not
just financial risk but also operational, environmental
and safety risks. And one of the key responses to risk
management is business continuity planning. If they are
not supporting BCP plans, the needle
will move rapidly from innocent to guilty. And I think
there are enough cases in place now, enough proof in place,
to suggest to any peer groups of executive officers that
they need to take business continuity planning seriously.
So that is a bit like a stick. The carrot--and it is important
to have a carrot as well--is that organisations that have
taken business continuity planning seriously will reap
benefits. For example, the big car manufacturers are excellent
in extracting performance benefits out of the supplier
chain and I would suggest that whilst there might well
be an initial cost for large car manufacturers in implementing
BCP, if they then put that down their supplier chain they
will see significant improvements from their suppliers.
And they also gain a higher level of confidence in key
suppliers where they have just-in-time contracts: that, come what
may, whatever happens to that supplier, they will be able
to continue the supply of spare parts through any crisis
the supplier has.
Q: Does British Standard 7799 Part I provide
a good template for designing business continuity management?
David Spinks: BS7799 is an information security standard and the latter sections
do mention BCP and they emphasise that it has to be risk based
and that testing is required, but they do not go into any detail. There
are other codes of practice available for BCP, which go beyond
7799. It is excellent that it is in there, but 7799 is really
primarily focused on information technology. So the natural progression
for people looking at that is to take section 11 into a computer
disaster recovery plan rather than a true BCP.
Q: Risk assessment is a vital part of BCP.
What method of risk assessment would you recommend? Qualitative,
quantitative risk assessment or mixture of both?
David Spinks: I think it depends on the particular organisation - which sector
they are in, and on the needs of the regulators. For
instance, if you take financial services you would look for some
quantification of risk for financial services. They will have
to do that before 2005 anyway to meet the Basel II
requirements. So unless they are prepared to put huge sums of
money into the capital pot, they have to begin to quantify risk.
On the other end of the scale, if you look at a purely commercial
organisation where it is administrative or there is a low threat
of loss then maybe just a qualitative review of risk is enough.
But the important thing is
that BCP is not just based on risk assessment. Risk assessment
is important, but what is even more important is risk management.
It has to be part of a risk management process. It is the easiest
thing in the world to assess risk, because in information security
you are looking for threats, vulnerabilities, protective measures
and you are looking for the likelihood of any particular threat.
You can break it down into two sorts of risks: very high likelihood,
low impact risks
and high impact, low likelihood risks. And it is the very high impact
low likelihood risks where you need the business continuity planning
safety net, because you are not going to be able to afford to
mitigate those collections of risks. For example, if you take
the oil company we spoke of. An oil company can do certain things
to mitigate risks of aviation crashes on their installations,
but only so much. So therefore what you do with an oil installation is
say, 'well it is not very likely that a plane will crash on my
refinery, but if it does I am going to put all these processes
in place to respond.' And similarly with a bomb attack. You can
do certain preventive measures to keep people away from boundaries.
You can put boundary fences. You can make sure that cars do not
come into the site. All those will mitigate, but at the end of
the day you will need a business continuity plan to account
for 'what happens if this and this happens.'
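The likelihood and impact split described above can be sketched in a few lines of Python. This is only an illustration, not a method from the interview: the 0 to 1 scales, the 0.5 threshold, the "accept" outcome for low likelihood, low impact risks and the names Risk and treatment are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    likelihood: float  # assumed scale: 0.0 (rare) to 1.0 (near certain)
    impact: float      # assumed scale: 0.0 (minor) to 1.0 (catastrophic)


def treatment(risk: Risk, threshold: float = 0.5) -> str:
    """Pick a response for a risk; the threshold is a hypothetical cut-off."""
    high_likelihood = risk.likelihood >= threshold
    high_impact = risk.impact >= threshold
    if high_impact and not high_likelihood:
        # Too costly to mitigate fully: rely on the BCP safety net.
        return "business continuity plan"
    if high_likelihood:
        # Frequent risks are handled by ordinary mitigation.
        return "mitigate"
    # Low likelihood and low impact: an assumption, not from the interview.
    return "accept"


risks = [
    Risk("aircraft crash on refinery", likelihood=0.01, impact=0.95),
    Risk("routine equipment fault", likelihood=0.80, impact=0.10),
]
for r in risks:
    print(f"{r.name}: {treatment(r)}")
```

In practice the scales and cut-offs would come out of the organisation's own risk management process rather than fixed numbers like these.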
Q: How has BCP changed since September 11?
David Spinks: We are in our second cycle. The big interest in BCP came in about
1998 when we were planning for the millennium (Y2K), when virtually
every utility and every major company in the UK had a pretty good
BCP plan for multiple simultaneous failures. And that was an
excellent period for BCP. Since then we have noticed that some
of these plans have actually not been maintained, the people
that built the plans have gone on to do other things. So there
has been almost an erosion in the BCP activity until September
11th. And many organisations have looked at those events and
governments have looked at those events and taken a view that
firstly we have to prevent them. If we cannot prevent them, we
have got to have plans in place to respond, because
these types of terrorist attacks are pretty
difficult to prevent if somebody is really intent on getting
into your company and setting it alight or exploding a bomb.
It is pretty difficult to defend against that, particularly when
they are taking the measures and doing the things that they are.
Therefore it is even more important to have business continuity
in place. There has been a great deal of additional activity
in BCP and there are two absolutely brilliant case histories
that we are putting together in EDS, and that is the work that
we did for a major financial institution. Our customer was based
in the World Trade Center and we had that financial service customer
up and running within 24 hours of that event. The second one
is we actually had teams of people working in the Pentagon, luckily
not in the area of where that aircraft hit, and almost within
60 minutes of that event we had teams of people working for the
U.S. government and the Department of Defense recovering some
of those systems. And I think everybody learns in those events,
but I think that we responded as a corporation extremely well.
Q: How can you now really design a BCP plan as it is very difficult
to plan for incidents like the 9/11 terrorist
attacks? How would you plan for major terrorist attacks?
David Spinks: If you look at that type of scenario, i.e. terrorists acting within
a community, certainly within the UK, we will then go to a different
group of people, because what we do is to go to the emergency planners.
And most good BCPs for most significant organisations will have
an interface with the local emergency planning community. The
emergency planners are part of the Home Office infrastructure
and their job is based in the local authority but looks at scenarios
like major explosions and other such events. They have plans in place
to respond to those events and needless to say they are based
on a gold, silver, bronze structure. They will involve the police,
ambulance and other emergency services.
So within a BCP, the BCP manager is simply going to be networking
with the emergency planning officers and saying, 'we are doing this and
we are going to take part in your emergency plan in the event
of a flood, a fire, an explosion in the community'. And those
two things will loop together.
With the big organisations, for instance, I was involved in doing
some BCP for a major oil
company client and when we did the stress testing of some of
the continuity plans we actually involved the local emergency
planning officer, the police, the ambulance, and the fire service.
My previous employer was a company called AEA Technology and
we wrote a number of plans for nuclear sites and they did exactly
the same thing.
Q: How can small companies afford
BCP plans, as they can be quite expensive?
David Spinks: Small companies may be better equipped to respond to a crisis
than larger companies because they can do things quickly. Their
communications quite often withstand the crisis and that is the
thing which causes most pain - the breakdown in corporate communications
and / or the communications between the corporate entity and
the press and the media. So in some ways smaller organisations
have less of a problem. The organisations that concern me slightly
are at the top end of the SME market, where you look at organisations
which employ a hundred to four hundred people, where they may
be multi-site and they may be largely dependent on information
technology, yet they have not invested in information
security and hence not in BCP. We have one or two examples
where SMEs at that stage in the dot.com category have suffered
because they were unable to respond to a loss of reliability, for instance
on their web sites. They only have one web site. As soon as their
web site fails for whatever reason, they are almost out of business.
And they have not invested for a number of reasons in either
information security or business continuity planning. And that
is the type of company which might be a supplier of goods and
services to a major organisation. And that is where we go back
to the supply chain assessment. It is really important to push
information security and BCP down the supplier chain.
Q: US manufacturers depend upon their suppliers. How can one guarantee
that the suppliers' BCP plan will complement
the plans of large manufacturers?
David Spinks: Organisations are now recognising the benefits of partnership.
We keep hearing that large corporate clients who are good at
partnering have good profitability and good resilience to unplanned
events. And that is what a crisis is - an unplanned event.
If you look at establishing a good partnership, it is based on trust. Trust is built on knowledge - how
does that company operate, how do we operate? And the trust is
built by knowing that whatever happens between the two entities
that the supply chain is going to survive. Quite a lot of it
is about awareness, but at the end of the day if you establish
a relatively flexible contract, part of the contract process
is an audit and / or a review. And you are not just auditing
the company for security or for a BCP, you are auditing the supplier
against a whole range of criteria which really should include
a quick look at their BCP, if they have one and whether it has
been tested as part of due diligence in operating that contract.
I would place it in that category. And
for the large suppliers this can be employed successfully. I am not saying for
every supplier: if I remember correctly, General Motors had at
one time 40,000 suppliers, and you just cannot deal with that
number. But one thing the core organisation can do is actually
to rank its suppliers on criticality and not on value, not on
revenue, but on criticality. There is a case history to help
us here: take the case of Ford in the UK and door handles. That
was a major problem for them caused by a relatively small supplier
in revenue terms, but that component loss stopped the production
line. So it is ranking the supplier on criticality and then working
with the most critical suppliers and building those suppliers
into your BCP tests and plan. That is not a huge issue as it
can be done relatively easily. But if you have 40,000 suppliers,
you probably look only at the top hundred or top two hundred
again for criticality reasons. That should be part of risk assessment.
If you look at a typical car plant, part of the risk assessment
would be to look at what
could stop the suppliers' parts, because if the suppliers' parts stop,
the line stops. So therefore that is clearly a critical risk
and part of the mitigation clearly is making sure that you either
have a duplicate supplier, standby supplier, or that supplier
has processes and procedures in place to respond to his fire,
flood, or his loss of IT. Simple stuff really.
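Ranking suppliers on criticality rather than revenue can be illustrated with a short Python sketch. Everything in it is hypothetical: the 1 to 5 criticality score, the field names, the function names and the sample suppliers are assumptions for the example, not data from the interview.

```python
from dataclasses import dataclass


@dataclass
class Supplier:
    name: str
    annual_revenue: float   # spend with this supplier (illustrative figures)
    criticality: int        # assumed score: 1 (easily replaced) to 5 (loss stops the line)
    has_tested_bcp: bool    # finding from the contract audit or review


def most_critical(suppliers: list[Supplier], top_n: int) -> list[Supplier]:
    """Rank on criticality only; revenue is deliberately ignored."""
    ranked = sorted(suppliers, key=lambda s: s.criticality, reverse=True)
    return ranked[:top_n]


def needs_attention(suppliers: list[Supplier], top_n: int) -> list[str]:
    """Names of the most critical suppliers without a tested BCP."""
    return [s.name for s in most_critical(suppliers, top_n) if not s.has_tested_bcp]


suppliers = [
    Supplier("engine maker", annual_revenue=50_000_000, criticality=4, has_tested_bcp=True),
    Supplier("door handle maker", annual_revenue=200_000, criticality=5, has_tested_bcp=False),
    Supplier("stationery vendor", annual_revenue=900_000, criticality=1, has_tested_bcp=False),
]
print(needs_attention(suppliers, top_n=2))
```

The small door handle supplier comes out on top despite its low revenue, which is the point of ranking on criticality; the suppliers flagged this way are the ones to build into BCP tests or to back with a standby supplier.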
Q: You mentioned senior management should
be involved in designing a BCP plan. Who else should be involved?
David Spinks: It goes through a number of phases and it is the initial phase,
which is the stakeholder values and business values, that needs
to be done at board level with senior management involvement.
Having defined those business objectives and stakeholders' issues,
the plan is worked through very much with each operational manager,
because effectively what you are doing is looking at each business
process. And then you must work out a plan for each business
process recognising the dependencies across them. So you will
be working with the senior managers at the operational level,
because at the end of the day those are the people who run the
company. The accountants actually do not run the company. They
make sure that the books are balanced. The IT people should
not be running the company; they should just be providing the
IT. But at the end of the day you need to involve the whole peer
group at that level to make sure that you get the impacts correctly
assessed. Accountants can help to do the quantification of value
of assets. IT people clearly have a major role. So you should
look at involving the people from the operational manager right
down to the frontline.
Q: Today's companies are highly dependent
on information systems; hence the BCP plan has to pay special
attention to them. What measures would you implement to keep the
IT systems running?
David Spinks: I would go back there and say that the starting point for looking
at resilience for IT systems is a checklist based on BS7799,
because it is a pretty good starting point for understanding.
And of course we are looking at three aspects of information
security: confidentiality, integrity and availability. Business
continuity is focused largely on reliability.
Disasters quite often impact the availability side of security
and reliability. However,
we have got a number of case histories from financial services where
the largest loss potentially is not anything to do with direct
loss of IT. So the largest losses to date have been losses of
confidentiality in the area of privacy where companies have been
severely embarrassed by losing personal data. If we look at
evaluation of assets the one asset which is most often missed
is either brand value or reputation value. What happens if we
lose the integrity of our web site and personal data gets exposed?
That is a significant loss, but it is a loss against reputation
and it is quite difficult to assign a precise value to that.
Q: How can companies guarantee perfect Crisis
Communication and Public Relations?
David Spinks: That is really an easy question to answer, believe it or not. Because
what you need to do is simply to look back on the case histories
and learn from the case histories. And I will give you two. Firstly
there was Three Mile Island. Three Mile Island was a communications
disaster, because whilst we had technically a nuclear power plant
that was going critical, the people who were managing that stopped
communicating. So much so that the state's representative responsible
for issuing an evacuation order heard the evacuation order that
he should be issuing to the local public announced on radio.
Somebody else just issued it on the radio and people just started
evacuating and the guy who was responsible for it was listening
on the radio.
The second thing is that bad communications with the press and the media
can actually hinder the operational
recovery. Because what happened in Three Mile Island is the press
and the media did two things: firstly, they hired a helicopter
to fly over the plant which is a huge risk. The last thing you
want in a nuclear incident is a helicopter flying over your plant.
The second thing they did, which can be managed, was that as workers came
out of the plant, the press & the media were there waiting
for them and a number of employees spoke to the press. That
should have been addressed at a very early stage and the whole
communications infrastructure within Three Mile Island just broke
down completely. This is a really good case history to say if
we are going to have good press & media communications there
are a huge number of lessons to be learned from that.
Another case example of serious damage being done to an organisation
by lack of good communications
was when Challenger blew up and we lost seven astronauts. NASA
took one hour from that blowing up to get in front of the press.
And do not forget the press & media were sitting there when
it happened and they took an hour to confront the press and the
media and do you know what she said? 'I think we lost seven astronauts.' And
that was the total sum of the statement to the press & media.
Now that was not so bad actually, pretty bad, but there was a man
in the Pentagon sitting watching this on television and they
had not given him any information either. NASA took two years
to recover from that lack of communication, because their
bosses at the Pentagon just went berserk. And that was because
they were just sitting there waiting for somebody to contact
them and let them know what happened and they did not. So that
was very serious for NASA.
There is a whole raft of examples here in the UK. We have seen major
incidents managed well, but messed up because the person fronting the press & media
has not had the right training and/or is an inappropriate spokesperson.
We have seen board directors and executive officers standing
up doing personal interviews live on television. From recent
personal experience you do not do that in a crisis. You let professional
Steele, EDS PR Manager: I have
been involved in a number of crises in my role. As a public
relations manager, the most important activity is the delivery of regular, accurate
information. And you have to admit responsibility wherever
it lies. If you follow those guidelines, the communication
should contain the crisis from the stakeholders' point of view. So
you are alright and you have the press on your side; if the
press is against you, the company's reputation will be stained for
a long time.
management, the biggest losses have all been in that area. And
however much people will say it is the technical recovery or
it is the IT, it is not. It is the PR guys who have a major role
at that point in time. And they allow people like me to do our
job and recover systems.
I was involved in three major incidents over the last ten years.
Two I cannot talk about,
but one I am happy to share with you. It was a non-nuclear plant
and we had a theft of equipment. Not only was the equipment stolen,
but it was stolen from a system which was a real time system
monitoring the quality of air across the UK. And I got a call
at four o'clock in the morning. And if you look at sleep patterns
they are at their deepest at four o'clock. The first question I
asked was, 'why is my member of staff going into work at four
o'clock in the morning' - I never did find that out. He went
into the office and was presented not just with a whole raft
of empty PCs, but all the chips had been taken out and the major
server systems had been taken away and, what is worse, these
guys had covered their tracks by pulling a hose. So there was
water everywhere. And he rang me up and said, 'I have got a problem.
In fact we have a problem. In fact AEA Technology has a problem.
We've had a theft.' I said, 'well tell me about it. Have you
turned the water off?' 'No.' I said, 'go and turn the tap
off.' Literally the water was still spraying around the office.
And then we put the BCP into place which was firstly: ring the
head of physical security. 'You need to get there. Don't worry
I will go down the list.' So you get the head of security in
because the first thing, which needs to happen, is to cordon
off the area. Nobody goes in because the police and investigation
officers will need to get in there first. So that is the first
thing. Second thing is a call to the head of PR, 'Kevin we have
a problem. Get out of bed and get writing a press release because
we lost the air quality monitoring and we need to issue it to
the press. Clear it with the client. Tell the client. You manage
the client and the press. Get on with it.' And that was it. We
activated that within ten minutes of finding out. That recovery
went according to plan and within eight hours we had all the
systems back up and running. That is not bad from scratch.
Q: What do you think is the future for BCPs? Will it face problems again?
David Spinks: I think BCP
may, as it stands at the moment, have to change. Where I see a
growth and where we have to put systems in place is in risk management.
We need the risk management in place and it could be that part
of that risk management is a BCP or it might also be better security,
better mitigation, better insurance, better resilience and if
we call that BCP then okay. BCP will continue to play a role
in running organisations. But I think we will need to think about
who is involved, because you have got to get it right and calling
it BCP might be too IT-ish. So we may have to call it something
else. But yes, it has a vital role particularly in organisations
where there is a threat to either the environment or a threat
There is an escalation scale
for the UK government to respond to incidents.
30 December, 2007
by Wanja Eric Naef
IWS Copyright © 2000 - 2008