
Switching on to better disaster recovery

Automated protection for mission critical database workloads with Amazon RDS for Oracle

Sponsored Feature Efficient data management and resilience are key issues to be addressed when it comes to putting your data into the cloud. Few organizations have the luxury of being able to take their applications and services completely offline during any subsequent migration or infrastructure maintenance. So, anything that helps them maintain the availability of a primary, mission-critical database in the meantime is likely to win approval amongst database administrators (DBAs).

Amazon Web Services (AWS) first released Amazon Relational Database Service (Amazon RDS) for Oracle back in 2011. The fully managed commercial database service automates many administrative tasks, including hardware provisioning, configuration, software patches, monitoring and data backups, leaving DBAs free to spend more time concentrating on application development and other more pressing tasks.

Now, Amazon RDS for Oracle is making it easier to do just that with a key update to its proposition - Data Guard (DG) switchover. The new feature turbocharges data protection and management of Amazon RDS for Oracle by making it faster and safer for customers to test and maintain Oracle database environments both on-premises and in the AWS cloud.

Lightening the DBA load

Michael Barras, Principal Database Solutions Architect at AWS, is helping to spearhead improvements to the data handling and protection integral to RDS for Oracle while simultaneously focussing on improving its availability and migration features to help organizations keep mission critical workloads running in every circumstance.

"Relational databases are rather large software packages. A lot of work has to be done before you are able to manage them and put your data into the database," he says.

"With Data Guard switchover, we are taking some of the load off database administrators (DBAs), and delivering the system elasticity and background infrastructure to enable easier manageability and security in the cloud." DG switchover automates the role change between the primary database and a standby database (replica) to minimize downtime and data loss during planned maintenance.

As well as providing the cloud infrastructure to allow organizations to operate their databases off-premises, AWS is also ensuring that firms can quickly and efficiently scale their database operations in response to rapidly changing business needs and customer requirements, adds Barras.

"It may be an as-a-service offering, but it's more than simply spinning up a server with a credit card and that's it."

Resilience, disaster recovery and automated backups

Designed to be highly scalable and durable from the outset, RDS for Oracle was built with fast, secure transaction performance in mind. But more importantly it also offers high availability through Multi-Availability Zone (Multi-AZ) deployments across multiple data centers within a single region, and enables customers to establish and test disaster recovery (DR) plans that replicate across regions.

For example, Goldman Sachs Transaction Banking (TxB), which provides a variety of payment and cash management services to its clients, runs several key components of its payment flow platform on RDS for Oracle to ensure business continuity and resiliency across different in-region and cross-region AZs.

"Our primary resiliency method is Multi-AZ within the region, so we rely on that automated failover that is going to give us a second copy of the data in another AZ," says John Gorry, former DBA and now TxB vice president of database engineering. "And should we have a failure of that primary node, RDS is going to bring up that secondary node with a relatively short amount of downtime, typically around 120 seconds."

That means almost no application downtime at all for TxB as long as the application has been configured correctly. "And that's a really powerful feature because it means my team can focus on the value add and not have to worry about how do we orchestrate failovers, how do we rebuild secondary copies of the data and so on."

With Data Guard (DG) and DG switchover, the service also provides resilience and disaster recovery via multiple availability zones (AZs) and multiple regions which make sure data is automatically and fully backed up and always recoverable through full data replication. A common configuration would be to run across two AZs in a source region, and then replicate to an AZ in a different region. The second region can be configured to have its own second copy of the customer's data.

In earlier versions of the service, customers looking to perform a disaster recovery drill would need to promote a replica as a new standalone database before creating a new replica to maintain the configuration. Now, DG switchover simplifies that process by reversing the role of the database without having to recreate the replica, as illustrated in the chart below.

Chart 1: RDS for Oracle Data Guard switchover planned database role transition

So, the original primary database becomes the standby database, while the replica takes its place as the primary. That way, a new replica can be created in the same state as the original replica, meaning the replication configuration can be seamlessly maintained for data consistency with no subsequent data validation required.

The switchover for a target standby database can be started either from the AWS Command Line Interface or from the Amazon RDS console. It uses the Oracle Data Guard Broker to validate both the primary and replica prior to making the move, during which the original primary database ceases to process write requests whilst new transactions are blocked.
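As a rough sketch of the command-line route, the snippet below assembles an AWS CLI call in Python. It assumes the `aws rds switchover-read-replica` subcommand and its `--db-instance-identifier` option; the replica name `orcl-replica-1`, the region and the helper `build_switchover_command` are hypothetical and used for illustration only. The command is printed rather than executed, so nothing is changed in any AWS account.

```python
import shlex


def build_switchover_command(replica_id, region=None):
    """Return the argv list for an AWS CLI switchover request.

    replica_id is the identifier of the standby (replica) instance
    that should become the new primary.
    """
    if not replica_id:
        raise ValueError("replica_id must be a non-empty instance identifier")
    argv = [
        "aws", "rds", "switchover-read-replica",
        "--db-instance-identifier", replica_id,
    ]
    if region:
        # --region is the AWS CLI's global option for choosing a region.
        argv += ["--region", region]
    return argv


if __name__ == "__main__":
    # Hypothetical replica name and region, for illustration only.
    command = build_switchover_command("orcl-replica-1", region="us-east-1")
    # Print the shell-quoted command instead of running it.
    print(shlex.join(command))
```

Keeping the command as a list of arguments means it could later be handed to `subprocess.run` without shell quoting problems, once the identifier has been checked against a real environment.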

Log shipping is suspended and the standby database's Managed Recovery Process (MRP) is allowed to catch up to make sure the data is consistent. When the switchover is complete, the new primary database is restarted in read/write mode and the new replica is started in the previous replication configuration, either read-only or mounted mode. Any additional replicas which were not involved in the switchover are then reconfigured to continue asynchronous replication from the new primary database.
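The role reversal described above can be pictured with a small toy model. This is not the RDS or Data Guard API, just a sketch under stated assumptions: the `Database` class, the `switchover` function and the instance names are all hypothetical, and the model only tracks roles, open modes and replication sources.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Database:
    name: str
    role: str                     # "primary" or "replica"
    open_mode: str                # "read/write", "read-only" or "mounted"
    source: Optional[str] = None  # primary this replica replicates from


def switchover(databases, target_name):
    """Swap roles between the primary and the chosen replica (toy model)."""
    by_name = {db.name: db for db in databases}
    target = by_name[target_name]
    if target.role != "replica":
        raise ValueError(f"{target_name} is not a replica")
    old_primary = by_name[target.source]

    # The new replica inherits the mode the target replica was running in.
    previous_mode = target.open_mode

    # The target replica becomes the read/write primary.
    target.role, target.open_mode, target.source = "primary", "read/write", None

    # The original primary becomes a replica of the new primary.
    old_primary.role = "replica"
    old_primary.open_mode = previous_mode
    old_primary.source = target.name

    # Replicas not involved in the switchover now follow the new primary.
    for db in databases:
        if db.role == "replica" and db is not old_primary and db.source == old_primary.name:
            db.source = target.name
    return databases


if __name__ == "__main__":
    fleet = [
        Database("orcl-a", "primary", "read/write"),
        Database("orcl-b", "replica", "mounted", source="orcl-a"),
        Database("orcl-c", "replica", "read-only", source="orcl-a"),
    ]
    for db in switchover(fleet, "orcl-b"):
        print(db)
```

No database is promoted to a standalone instance and none is recreated in this model: the same set of instances remains, with roles and replication sources rearranged.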

DBAs can also create automated backups and manual snapshots of their RDS for Oracle replicas, a feature that cuts the time they have to spend taking backups following a role transition between primary and secondary databases during the switchover/failover process. In addition, a new instance can be created simply by restoring from an earlier snapshot or point in time recovery.

A more detailed discussion of the use cases for Data Guard switchover on Amazon RDS for Oracle is available here. The blog explains how database administrators (DBAs) and database architects can perform a managed switchover to enable role reversal between an RDS for Oracle primary database and its replica.

The net result takes away the majority of the legwork for customers who would otherwise have to configure these services for themselves.

"Doing all this themselves over on-premises systems would be a lot of work for many organizations," says Barras.

Testing Data Guard Switchover

Of course, it is one thing to be told that your mission critical primary database is online, operational, and protected – another to be absolutely certain that is the case. And no company wants to find out there is a problem after the event.

Key areas that are served by DG switchover include planned maintenance of systems, data traffic planning across different regions, and infrastructure management. It also covers the testing of new services to ensure that customer DR plans actually work and can successfully shift traffic to an alternative database whilst minimizing any chance of data loss.

"With the extra help of Data Guard switchover, organizations don't have to spend so much time and effort in making sure data workloads in the cloud are working well. They can also ensure those workloads are fully available to the business for operational needs, and that they are also well protected," says Barras.

Industry approval and compliance

And as for the aforementioned financial services market and other compliance-heavy industries, the new offering provides comprehensive compliance testing. This includes wide-ranging drills, and fully documented demonstrations of industry compliance being met by companies.

Cross-region resiliency for stateful services like RDS for Oracle is critical for TxB, for example, helping it prove to industry regulators and internal/external auditors that it can fail over and run from different physical regions. The foundations of this ability are the cross-region read-only replicas of the database, with TxB routinely promoting a replica to make sure that data is properly synchronized between the primary and secondary regions at all times. It is this routine that the company is hoping to replace with DG switchover after a thorough evaluation of the new feature.

Planning ahead and stress testing systems to make sure they will do the job expected whilst maintaining compliance before you actually come to rely on them is always a stressful task in any aspect of IT. The new DG switchover and automated backup features in RDS for Oracle will give DBAs more peace of mind.

Sponsored by AWS.
