According to a survey conducted by the Disaster Recovery Preparedness Council (DRPC) in 2014, nearly three out of four companies are failing in their disaster readiness. This is an alarming number given how unpredictable disasters are. More than a third of organisations surveyed lost one or more critical applications, virtual machines or important data files at some point last year. Nearly one in five companies lost one or more critical applications for several days.
Can your business afford to take such a big hit?
Experts estimate the cost of losing critical applications at more than $5,000 per minute, with some companies reporting even higher figures. Aside from the financial cost, outages also waste valuable staff time and damage business reputation.
Behind such enormous losses is a lack of disaster recovery (DR) planning, testing and resources. Some companies don’t have a fully documented DR plan, or their plan has not been designed to respond to the worst-case scenario. Some organisations test their DR plans only once or twice a year, while others might not test their plans from one year to the next.
How do you improve your DR preparedness?
1. Plan ahead and implement a detailed DR plan. Ask questions such as: what happens if one server goes down? What if all servers crash? Is all of the critical hardware in your server room covered by an on-site hardware warranty?
The two most important measures that organisations need to look at to improve their DR preparedness are the Recovery Point Objective (RPO) and the Recovery Time Objective (RTO). The RPO represents the maximum amount of data, measured in time, that the business can afford to lose in a major incident. For example, if there were to be a major incident at 5pm, and the last backup was at 11pm the previous night, could the business absorb the impact of losing a full day’s worth of data? The RTO represents the amount of time a business can tolerate before a service is brought back online following a major incident. If a server were to fail, what is the maximum amount of time the organisation can tolerate that server or service being offline?
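The RPO and RTO questions above come down to simple time arithmetic. The following is a minimal Python sketch of that check, not a prescribed method: the function names, the calendar date, the 4-hour RPO, the 2-hour RTO and the 3-hour outage are all hypothetical values chosen for illustration. Only the 11pm backup and 5pm incident come from the example above.

```python
from datetime import datetime, timedelta


def data_loss_window(last_backup: datetime, incident: datetime) -> timedelta:
    """Return how much data, measured in time, is lost if the incident
    happens at `incident` and the most recent backup was at `last_backup`."""
    if incident < last_backup:
        raise ValueError("the incident cannot precede the last backup")
    return incident - last_backup


def meets_objective(actual: timedelta, objective: timedelta) -> bool:
    """True when the actual loss or downtime is within the agreed objective."""
    return actual <= objective


# Example from the text: last backup at 11pm, incident at 5pm the next day.
# The calendar date itself is an arbitrary assumption.
last_backup = datetime(2015, 3, 1, 23, 0)
incident = datetime(2015, 3, 2, 17, 0)

loss = data_loss_window(last_backup, incident)

# Hypothetical objectives that a business might agree on.
rpo = timedelta(hours=4)
rto = timedelta(hours=2)

# Hypothetical measured downtime from a DR test.
downtime = timedelta(hours=3)

rpo_met = meets_objective(loss, rpo)
rto_met = meets_objective(downtime, rto)

print(f"Data loss window: {loss}")  # Data loss window: 18:00:00
print(f"RPO met: {rpo_met}")        # RPO met: False
print(f"RTO met: {rto_met}")        # RTO met: False
```

With a nightly backup and an incident late in the following afternoon, the loss window is 18 hours. If the business had agreed on a 4-hour RPO, a nightly backup schedule could not meet it, which is the kind of gap this comparison is meant to expose.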
2. Identify risks to your IT systems and data so that you can reduce or manage them, and develop a response plan in the event of a crisis. An IT risk assessment will identify current risks, check the security of your data and review operational procedures surrounding the IT systems supporting the business. You can manage IT risks by completing a business risk assessment. Having a business continuity plan can help your business recover from an IT incident. The assessment will audit equipment including the server room, data centre, servers, routers, desktops, laptops, networks (including wireless), firewalls, applications and so on.
3. Understand the environment you are trying to protect. Make adjustments, repeat tests and update your plan as the environment changes. Your plan must include everything you will need to recover: all applications, networks, documents, services, processing systems and so on. Specify the RTO for critical applications.
4. Develop a set of User Acceptance Tests (UAT) that must be confirmed as passing during DR testing. The UAT are simply a list of the functions that each part of the business needs in order to operate. Start by listing the line-of-business applications, and consult department heads to find out which IT functions are critical for their department to run.
5. Test the critical applications more frequently to see if recovery can be done within RTOs. Automating such processes would be beneficial in the long run, saving you time and money. Ensure you test the recovery of your backups. What is the use of a backup if you can’t recover it? When disaster strikes, you don’t want to be left with half of your systems running. The backup and disaster recovery plan should integrate with the business’s overall IT environment to create solid business continuity.
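As an illustration of automating a restore test, here is a minimal Python sketch under stated assumptions. It treats a "restore" as a plain file copy, which is a stand-in for whatever restore procedure your backup product provides. The function names, the file name `payroll.db` and the 60-second RTO are hypothetical. The sketch checks two things: that the restored file matches a checksum recorded when the backup was taken, and that the restore finished within the RTO.

```python
import hashlib
import shutil
import tempfile
import time
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 checksum of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def restore_test(backup: Path, restore_dir: Path,
                 expected_sha256: str, rto_seconds: float) -> dict:
    """Restore one backup file, time the restore step, and verify the result."""
    started = time.monotonic()
    restored = restore_dir / backup.name
    # Assumption: a file copy stands in for the real restore procedure.
    shutil.copy2(backup, restored)
    elapsed = time.monotonic() - started
    return {
        "intact": sha256_of(restored) == expected_sha256,
        "elapsed_seconds": elapsed,
        "within_rto": elapsed <= rto_seconds,
    }


# Self-contained demonstration using temporary files.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    backup_dir = root / "backup"
    restore_dir = root / "restore"
    backup_dir.mkdir()
    restore_dir.mkdir()

    # Hypothetical critical file and the checksum recorded at backup time.
    source = root / "payroll.db"
    source.write_bytes(b"critical business data")
    checksum_at_backup = sha256_of(source)

    backup = backup_dir / source.name
    shutil.copy2(source, backup)

    result = restore_test(backup, restore_dir, checksum_at_backup,
                          rto_seconds=60.0)

print(f"Restored data intact: {result['intact']}")
print(f"Restored within RTO: {result['within_rto']}")
```

A script along these lines could be run on a schedule so that a failed or slow restore is noticed long before a real incident. In practice the timing would need to cover the full recovery of the service, not just a file copy.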
Find a good specialist who can provide a template security policy, or tailor the risk assessment to suit your existing security policies. You should look at testing your IT environment’s resilience to security threats, while an on-site audit of your physically installed equipment should capture any specific concerns or requirements.
While the DR plan may look complete on paper, actually executing it will uncover any areas that have not been considered, so that additional detail can be added. You will then have far greater confidence that your plan will actually work. Look at allocating a budget to test your DR plan at least annually.
Finally, the key to a successful testing cycle is a repeatable process. Maintain strict internal process standards to ensure that risk assessments are sensible, repeatable, and provide the correct information your organisation requires.
About the author
Nathan Lowe is the Managing Director of ASI Solutions, a privately-owned ICT company operating across Australia.