Airplanes falling from the sky
When you talk with people about IT disaster recovery, their first
reaction is usually that it will not happen to them. This is quickly
followed by the description of an airplane landing on their company
Unfortunately, it does not only happen to others; it can happen to
you. And there are many disasters, not just airplanes falling from the sky.
What is a disaster?
Disaster - 'Any event or developed situation which severely disrupts
the anticipated day-to-day running of a business or institution.'
When we talk about disasters, many people will think of fire and
flood, as well as crashing airplanes. But if we only think of these
dramatic physical disasters we will miss the majority of incidents
that account for the countless disasters that happen every day.
Any event can be a disaster, from the computer breaking down to a
software virus, from a factory fire to an office flood, and so many
IT Disaster - 'An incident that severely disrupts the IT service.'
To provide the IT service requires much more than just your PC sitting
on your desk. There is the local area network wiring and wide area
network wiring along with the routers, bridges and hubs. There are the
file servers and application servers, and the many pieces of software
that run on them. There are the connections to other organisations such
as the Internet Service Provider and BACS. Many companies view
telephony as part of the IT service and include the PABX.
Any incident that affects any one of those components of the IT
service may result in a disaster situation.
Why is it important?
IT is business critical. It is a business enabler, a profit enabler. A
break in the IT service can, and does, result in the closure of the business.
80% of businesses without effective contingency arrangements will
cease trading within 18 months of a serious incident, many of them
within the first 6 months.
Without the IT applications, you do not know who your customers are.
You cannot process orders. You cannot schedule production, or know
what raw materials to order. You cannot organise deliveries. You
cannot issue invoices, or chase outstanding payments. You cannot pay
for supplies, or your staff.
Take away your IT, and you take away the ability to do business.
Understanding the risks
What can go wrong that will affect the IT service? What controls can
we put in place to prevent, contain and recover from an incident?
The Risk Assessment considers physical issues, such as the suitability
of the location of the servers, building security and fire prevention.
It considers logical issues, such as anti-virus measures. It also
considers procedural issues, such as password management, back-up
frequency and storage.
Where are you vulnerable? And what further measures can be introduced
to reduce that vulnerability? Prevention is better than cure.
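
As an illustration only, the findings of a risk assessment could be kept in a simple register and ranked so that the greatest vulnerabilities are dealt with first. The sketch below assumes a 1 to 5 scale for likelihood and impact and multiplies the two to give a score. The scale, the scoring method and the sample entries are assumptions made for this example, not part of any prescribed method.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    name: str
    category: str    # "physical", "logical" or "procedural"
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)
    impact: int      # assumed scale: 1 (minor) to 5 (severe)

    @property
    def score(self) -> int:
        # Assumed scoring rule: likelihood multiplied by impact.
        return self.likelihood * self.impact


def rank_risks(risks):
    """Return the risks sorted from highest to lowest score."""
    return sorted(risks, key=lambda r: r.score, reverse=True)


# Hypothetical entries, one for each kind of issue.
risks = [
    Risk("Server room below a water pipe", "physical", 2, 5),
    Risk("Anti-virus signatures out of date", "logical", 4, 4),
    Risk("Backups kept on site only", "procedural", 3, 5),
]

for risk in rank_risks(risks):
    print(f"{risk.score:>2}  {risk.category:<10}  {risk.name}")
```

The highest-scoring entries are the ones where further preventive measures are likely to pay off soonest.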
Counting the cost
What are the most important parts of the business? How are they
affected if there is an interruption to the IT service? How long can
you survive before you are significantly affected?
The Business Impact Analysis identifies the priority of IT systems by
considering the impact on the business if they are unavailable. It
further identifies the maximum time the business can operate without
The financial impact can be significant, and can be immediate.
Consider the situation where you cannot take orders. You do not make
deliveries or issue invoices. You have no income. How long could you survive?
But it is not just the immediate impact you have to consider. There
are longer-term issues such as the loss of customers to your
competitors, which will require an increased marketing spend to
recover from. There are legal issues as well. You are required to pay
taxes on time. How long will it be before your staff resign to go to
work for your competitors? What will happen to your share price? Will
you become vulnerable to a hostile takeover bid?
How quickly must you recover your IT services? In what order are those
services required to be available?
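
One possible way to answer these two questions is to record, for each service, the longest outage the business can tolerate and the services it depends on, then derive a recovery order from that. The sketch below is a minimal example under those assumptions. The service names, the hours and the dependencies are hypothetical, and a real Business Impact Analysis would weigh more factors than a single number.

```python
from dataclasses import dataclass


@dataclass
class Service:
    name: str
    max_outage_hours: int  # longest the business can operate without it
    depends_on: tuple = ()  # names of services that must be restored first


def recovery_order(services):
    """Order services by shortest tolerable outage, dependencies first."""
    by_name = {s.name: s for s in services}
    ordered, seen = [], set()

    def visit(service):
        if service.name in seen:
            return
        seen.add(service.name)
        # Restore whatever this service relies on before the service itself.
        for dep in service.depends_on:
            visit(by_name[dep])
        ordered.append(service.name)

    for service in sorted(services, key=lambda s: s.max_outage_hours):
        visit(service)
    return ordered


# Hypothetical figures for illustration only.
services = [
    Service("Payroll", 120),
    Service("Order processing", 4, depends_on=("Network", "File server")),
    Service("Network", 24),
    Service("File server", 8, depends_on=("Network",)),
    Service("Invoicing", 48, depends_on=("File server",)),
]

print(recovery_order(services))
# ['Network', 'File server', 'Order processing', 'Invoicing', 'Payroll']
```

Note that the network is restored first even though it has a longer tolerable outage of its own, because the most urgent service cannot run without it.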
What to do?
What can be done to help recover from an IT incident?
Many people feel they can go back to a manual system when there is a
problem with the IT. Or they will use mobile phones if the telephones
are not working. For all but a few small businesses this is simply not
Recovery Strategies identify the approach to contain an incident when
it does occur, and to recover from an incident within the required
Some of the possibilities are having suitable maintenance agreements
on equipment to ensure it is repaired quickly, having UPSs in case of
power cuts, having resilient equipment with twin power supplies and
RAID discs. You will need to ensure you have adequate data and system
backups, and that they are secured off-site. For incidents that affect
the site you will need to consider an alternative location for your
server room, and possibly alternative office space for staff. The
communications between the sites need to be arranged.
All these arrangements take time to put in place, and cannot be done
quickly, especially after the event. Plan now or pay the price later.
Who does what, when, where and how?
A business does not run itself. And an effective response to an
incident does not just happen. It requires planning.
Who will be notified? Where are the phone numbers? Who says what to
the media? How do you handle salvage? Where is the back-up media?
Which staff can rebuild a server? What is the telephone number for the
equipment supplier? Who can authorise emergency purchases? And so the
list goes on.
Plan when you are not under pressure. Keep it simple. Write it down.
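
A written plan can also be checked mechanically for gaps. The sketch below assumes the plan is held as a simple dictionary and reports which of the questions above it does not yet answer. The entry names and sample values are hypothetical, chosen only to mirror those questions.

```python
# Hypothetical entry names, one per question a plan should answer.
REQUIRED_ENTRIES = [
    "incident_contacts",
    "media_spokesperson",
    "salvage_contact",
    "backup_media_location",
    "server_rebuild_staff",
    "equipment_supplier_phone",
    "emergency_purchase_authoriser",
]


def missing_entries(plan):
    """Return the required entries that are absent or blank in the plan."""
    return [key for key in REQUIRED_ENTRIES if not plan.get(key)]


# A partly completed plan, for illustration only.
plan = {
    "incident_contacts": ["IT manager", "Facilities manager"],
    "media_spokesperson": "Managing director",
    "backup_media_location": "",
    "server_rebuild_staff": ["Senior systems administrator"],
}

print(missing_entries(plan))
```

An empty result means every question has an answer on paper. It does not mean the answers are correct, which is what testing is for.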
Will it work?
If you have not tested your plan, then you do not have a plan. You
simply have a report.
Testing your plans allows you to identify issues with them when you have
time to correct them. It also provides an opportunity for training
staff. A major incident will not mean business as usual. It will
require staff to do different tasks, and to do familiar tasks in a
A small investment in training and testing can make the difference
between a mountain and a molehill when you have a disaster.
Painting the Forth Bridge
Disaster Recovery Planning is not a one-off exercise. It is a
As your IT infrastructure changes and your use of IT changes, then so
must your plans change to reflect this. Before any change is made to
your IT, the effect on your disaster recovery arrangements must be
considered. And those arrangements must be changed in advance of the
IT change, and not some months later. Disasters will not wait for you.