Is resilience sometimes too much of a good thing?

28 May 2010

The word 'resilience' has been creeping into the business continuity industry lexicon, with varying definitions and gradually increasing prominence, over the last decade. The British Standard for Business Continuity Management, BS25999-1:2006, defines resilience as the "... ability of an organization to resist being affected by an incident."

Resilience is not a new concept for IT managers. The world has become ever more dependent upon increasingly complex and interdependent IT systems to deliver services in what I call the "Martini World" – anytime, anyplace, anywhere. This trend is set to continue and accelerate, and IT managers have embraced resilience to increase system uptime.
The idea of resilience – being able to armour-plate yourself in protective measures to make sure that you can advance unaffected by risks, which bounce off your armour like arrows – is seductive but ultimately flawed if considered a substitute for disaster recovery. There are always chinks in the armour, otherwise we would not be able to move, so there is always a chance of a lucky arrow strike, or of someone with a ballista that will punch right through your armour. In other words, resilience can fail, and then you need to recover. So what do these chinks in the armour look like?
Complexity in modern-day IT systems (many interdependent components, interfaces dynamically exchanging data both within and between organisations, and so on) means that it is virtually impossible to predict all potential failure conditions. When something does go wrong, it can trigger a cascade of consequential effects, making it difficult to diagnose the root cause quickly and put matters right. In highly complex and closely coupled systems, the scope for faults is high, and small problems can quickly build and bring systems crashing down.
Faults can arise from equipment failure, software problems, planned changes to the system, or what we might call human error, which might be more accurately described as our inability to understand the system in sufficient detail to judge the consequences of our intervention. Human error is still the root cause of a surprisingly high proportion of unplanned outages and incidents.
By seeking to make a resilient system we often engineer in more complexity, adding more potential failure conditions and making it harder to understand what has gone wrong. Sometimes the armour-plating protection works for a while, but often we do not check and maintain it properly. We do not polish it to remove the rust and repair the dents and our protection is ultimately compromised. Having a duplicate system and not noticing when one copy has failed – until the duplicate has also failed – is resilience squandered. You might think it never happens, but it is all too common.
IBM first started sharing our concept of Business Resilience in 2003 and our own definition of the term has remained almost unchanged over the last six or seven years. We believe the concept of resilience needs to be applied holistically, taking the human element into account as well as the tin and the string that make up IT systems and the bricks and mortar in which everything is located. The good news is that there are potential solutions to many of the problems we commonly see with resilience, some of which I have described here.
We define Business Resilience as…
"… the ability to rapidly adapt and respond to risks, as well as opportunities, in order to maintain continuous business operations, be a more trusted partner, and enable growth."
Business Resilience also aims to add value by positioning the organisation to exploit upside risk (opportunities), which means for example scaling to cope with additional demand, and seeking a close coupling with the business to leverage the investment and drive real benefit.
Perhaps the biggest danger of the more traditional view of resilience is that, as with armour, it leads you to think you are invincible. Our Business Resilience approach has developed with a firm grounding in real-world experience and is designed to maximise business benefit by looking not only at availability but also at security and disaster recovery.
Robin Gaddum, Senior Managing Consultant, IBM's Business Continuity and Resiliency Services
