Prevent database disasters with a simple checklist
By Angelique Smit, Client Relationship Manager at RDB Consulting
In today’s information-driven age, the database is the heart of any organisation. From running applications to processing transactions and storing customer and other mission critical data, the database underpins it all; without it, businesses simply cannot function. Despite this, many companies do not have a comprehensive backup and disaster recovery strategy in place and resort to crisis management when their database crashes, often resulting in costly downtime.
A few checklist items are worth considering for backup and disaster recovery. Together they help to ensure minimal disruption and, most importantly, continuity for the business.
Checklist Item #1: The backup and disaster recovery strategy
Whether organisations run a full disaster recovery environment or simply conduct regular backups, having a plan and processes in place to govern this in the event of an emergency can literally save a business.
A backup and disaster recovery strategy is therefore essential for every modern business of any size. This is the most important step in ensuring your database is not a disaster waiting to happen.
To develop this strategy, organisations first need to understand how critical their data is to the business. Not all organisations require a full disaster recovery environment, as these can be costly to implement. Furthermore, not all data is mission critical, and not all of it will cause the business to fail if it is lost or takes time to recover. However, at the very least, all data needs to be maintained in some form of working backup environment, and these backups need to be conducted in line with business rules. Business rules govern the backup and recovery strategy: they outline how data should be stored and restored, and set out, among other things, how long a restore is allowed to take.
A full disaster recovery environment is obviously preferable for mission critical databases, as when disaster happens the environment can simply be ‘switched over’ with minimum downtime and disruption. The disaster recovery environment should be in sync with the production environment and should also be regularly tested. If a disaster recovery environment is not in place, backups need to be stored in a minimum of three separate locations to ensure that at least one recovery copy is available for restore.
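The three-location rule above only helps if each copy is actually present and intact. The following is a minimal sketch, in Python, of how that could be checked by comparing every copy against a checksum recorded when the backup was taken. The function names, the report layout and the `minimum` parameter are illustrative assumptions, not part of any particular backup product.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    return digest.hexdigest()


def check_backup_copies(expected_digest: str, copies: list[Path], minimum: int = 3) -> dict:
    """Classify each copy as ok, missing or corrupt, and check the location rule.

    `minimum` is an assumed policy value taken from the three-location guideline.
    """
    report = {"ok": [], "missing": [], "corrupt": []}
    for copy in copies:
        if not copy.is_file():
            report["missing"].append(str(copy))
        elif sha256_of(copy) == expected_digest:
            report["ok"].append(str(copy))
        else:
            report["corrupt"].append(str(copy))
    # The policy is met only if enough intact copies exist.
    report["policy_met"] = len(report["ok"]) >= minimum
    # A restore is possible as long as at least one intact copy survives.
    report["restorable"] = len(report["ok"]) >= 1
    return report
```

A check like this could run after every backup job, with any missing or corrupt copy reported to the person responsible for the backup strategy.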
Regardless of the recovery method, the processes involved must be clearly documented. It is essential to list the order of procedures, the steps that need to be taken, the required turnaround times and who is responsible for ensuring that each function is fulfilled. All parties involved should clearly understand their role. The failover processes must be regularly tested to ensure that when a disaster happens, these processes are seamless. When testing, the failover processes should also generate a log showing which steps succeeded and which did not, so that the appropriate person can remedy any failures.
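As an illustration of the logged failover test described above, here is a minimal Python sketch. It runs the documented steps in order and records, for each one, whether it passed and who is responsible for following up. The `Step` structure, the step names and the owners are hypothetical; a real test would call the organisation's own failover procedures.

```python
import logging
from dataclasses import dataclass
from typing import Callable

logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(message)s")
log = logging.getLogger("failover_test")


@dataclass
class Step:
    name: str                     # what the step does, as written in the runbook
    owner: str                    # person responsible for this step
    action: Callable[[], bool]    # returns True on success


def run_failover_test(steps: list[Step]) -> list[tuple[str, str, bool]]:
    """Run each step in the documented order and log pass/fail with its owner."""
    results = []
    for number, step in enumerate(steps, start=1):
        try:
            passed = bool(step.action())
        except Exception as error:  # a step that crashes counts as a failure
            log.error("step %d %s raised %r", number, step.name, error)
            passed = False
        if passed:
            log.info("step %d PASSED: %s", number, step.name)
        else:
            log.warning("step %d FAILED: %s (follow up with %s)", number, step.name, step.owner)
        results.append((step.name, step.owner, passed))
    return results
```

The returned list can be filtered for failed steps and handed to the named owners, while the log itself provides the record of each test run.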
Checklist Item #2: Address database security
Building security into the database is important, from both a physical and a data perspective. This is addressed in various laws and governance codes, including Sarbanes-Oxley (SOX) and the King III guidelines, to mention a few, making database security a compliance requirement. This requirement also extends to any backup copies of data and to disaster recovery environments.
Physical security such as access control, intrusion prevention and detection, and fire detection and suppression will help to prevent unauthorised persons from accessing the physical storage areas and minimise the impact of disasters such as fire. Data security must also be implemented to prevent unauthorised data access and theft from the corporate network. This is critical given the rise in cybercrime. It is also important to ensure that the database itself and all backups receive the same protection levels. Without IT security, data can be lost, corrupted or, more frequently in today’s world, stolen for sinister purposes. Data must be protected to prevent business downtime, which results in loss of revenue and reputation.
Checklist Item #3: Database administration
Whether you use an internal Database Administrator (DBA) or the services of an outsource provider, it is vital to be 100% comfortable with the DBA and the levels of support that are delivered. The DBA has access to all company data and therefore must be highly trustworthy.
The service levels delivered must also be checked, as poor service, whether in-house or outsourced, can increase database downtime and cost the business. This can be addressed in a solid Service Level Agreement (SLA) and Operations Level Agreement (OLA). Whatever the agreement, the DBA or outsource provider should maintain the backup strategy, the frequency of testing processes, the documentation and its availability, as well as all planned failover testing. If these services are not being delivered, an organisation should question the value that the DBA is delivering.
Checklist Item #4: Check your SLAs
SLAs must fit the requirements of the business and should support disaster recovery and restore goals. The infrastructure of the database and recovery environment needs to allow for either a full disaster recovery failover to take place or regular backups which require, amongst other things, enough disk space. SLAs must factor this in and meet the specific disaster recovery needs of the organisation.
For example, an online e-commerce store cannot afford any downtime, given the 24x7x365 nature of its business. Its SLA should therefore include service levels that ensure maximum uptime and fast restore times with disaster recovery. Other businesses, such as a legal firm, may only need their data restored within a few hours or a day. This type of business won’t collapse if the data restore is completed within 24 or even 48 hours. The SLA must accommodate these factors and should also be in line with the disaster recovery strategy supported by the business rules and processes. If SLAs do not fall in line with business requirements, they need to be reassessed. However, it is also important to bear in mind that 99.999% uptime and fast recovery come at a price. The balance of expense, functionality and best possible service levels to meet the business’s needs must be considered when defining an SLA.
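To put uptime percentages in perspective, it helps to convert them into the downtime they actually permit. The short Python sketch below does this arithmetic for a non-leap year; the function name is illustrative. At 99.999% uptime, the allowance works out to roughly 5.3 minutes of downtime a year, which helps explain why that level of service is expensive.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def allowed_downtime_minutes(uptime_percent: float) -> float:
    """Minutes of downtime per year permitted by a given uptime percentage."""
    if not 0 <= uptime_percent <= 100:
        raise ValueError("uptime_percent must be between 0 and 100")
    return MINUTES_PER_YEAR * (100 - uptime_percent) / 100


# Compare a few common SLA targets side by side.
for target in (99.0, 99.9, 99.99, 99.999):
    minutes = allowed_downtime_minutes(target)
    print(f"{target}% uptime allows about {minutes:.1f} minutes of downtime a year")
```

By comparison, a 99% target allows more than three and a half days of downtime a year, which may be acceptable for a business that can wait 24 or 48 hours for a restore.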
In addition, the SLA should incorporate regular testing of the disaster recovery plan to ensure that it works, eliminating much frustration in the event of failure.
Conclusion
Ultimately, any disaster recovery solution aims to minimise downtime. Downtime costs money, and this cost is often higher than that of implementing a full disaster recovery environment. If a full environment is not possible, having a strategy in place is critical to ensure that processes are followed. Maintaining a stable database environment is equally important for business continuity. A checklist that covers these aspects of database backup and recovery will help to mitigate risk, minimise downtime and ensure businesses are up and running in the shortest possible time in the event of a disaster.