
AWS RDS (Oracle 12c) Offsite Backups

A lot of people need offsite backups for AWS RDS. Backups within AWS are trivial to set up, but if you need protection against things like an AWS account breach or AWS-specific issues, offsite backups must include diversification of suppliers.

I am going to use the AWS Database Migration Service (DMS) to replicate AWS RDS data to a VM running in Azure, and set up snapshots/backups of the Azure host.

The new (2018) AWS Database Migration Service solves offsite RDS backup problems

The steps I used to do this are:

  1. Set up an Azure Windows 2016 VM
  2. Create an IPSec tunnel between the Azure Windows 2016 VM and my AWS Native VPN
  3. Install a matching version of Oracle on the Windows 2016 VM
  4. Configure the Database Migration Service
  5. Create a data migration and continuous replication task
  6. Snapshots/Backups and Monitoring
  7. Debug and Gotchas

1,2 – Set up Azure Windows 2016 VM and IPSec tunnel

Create a network on Azure and place a VM in it with two interfaces. One interface must have a public IP (call this one ‘external’); the other will be called ‘internal’. Once you have the public IP address of your Windows 2016 VM, create a ‘Customer Gateway’ in your AWS VPC pointing to that IP. You will also need a ‘Virtual Private Gateway’ configured for that VPC. Then create a ‘Site-to-Site VPN connection’ in your VPC (it won’t connect for now, but create it anyway). Configure your Azure Windows 2016 VM to establish an IPSec tunnel by following these instructions (they are written for 2012 R2, but the only differences are a few menu items):
https://docs.aws.amazon.com/vpc/latest/adminguide/customer-gateway-windows-2012.html#cgw-win2012-download-config

Once this is completed, both your AWS site-to-site connection and your Azure VM will be trying to connect to each other. Ensure that the Azure VM’s security groups allow your AWS site-to-site VPN to reach it (I am not sure which ports and protocols are needed specifically; I just whitelisted all traffic from the two AWS tunnel endpoints). After that it took around 5 minutes for the tunnel to come up (I was checking the status via the AWS Console). I also found that the tunnel requires traffic to be flowing over the link, so I ran ping -t <aws_internal_ip> from my Azure VM. Also note that you will need to add routes to the applicable AWS route tables and update AWS security groups for the Azure subnet as required.
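
As a possible alternative to leaving ping -t running in a console window, a small script can generate the periodic traffic that keeps the tunnel up. The sketch below is only an illustration, not what was used in this setup: it opens a TCP connection to a host on the AWS side at a fixed interval. The function names, the address 10.0.1.10 and the 10 second interval are all assumptions; even a refused connection still sends packets across the tunnel.

```python
import socket
import time
from typing import Optional


def tcp_probe(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable networks.
        return False


def keepalive(host: str, port: int, interval: float = 10.0,
              rounds: Optional[int] = None) -> int:
    """Probe host:port every `interval` seconds and return the failure count.

    rounds=None loops forever, which is the intended keepalive use;
    a finite value is handy for a quick connectivity check.
    """
    failures = 0
    done = 0
    while rounds is None or done < rounds:
        if not tcp_probe(host, port):
            failures += 1
            print(f"{time.strftime('%H:%M:%S')} probe to {host}:{port} failed")
        done += 1
        if rounds is None or done < rounds:
            time.sleep(interval)
    return failures


# Example (placeholder address on the AWS side, Oracle listener port):
# keepalive("10.0.1.10", 1521)
```

Run as a scheduled task or service on the Azure VM, this would also leave a simple log of when the link was down.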

3 – Install matching version of Oracle on the Windows 2016 VM

4,5 – Configure Database Migration Service and migration/replication

Log into your AWS console, go to ‘Database Migration Service’ / ‘DMS’ and hit get started. You will need to set up a replication instance, the VM that runs the migration (well, at least pick a size, security group, type etc.). Note that the security group you add the replication host to must have access to both your RDS and your Azure databases. I could not pick which subnet the host went into, so I had to add routes for a couple more subnets than expected. Next, add your source and target databases. When you enter the details and hit test, the wizard will confirm connectivity to both databases. I ran into issues on both of these points because of missing security group rules, the Windows firewall on the Azure VM, and my VPN link dropping due to no traffic (I am still investigating a better fix than ping -t for this). Next you will create a migration/replication task. If you are going to be doing ongoing replication, you need to run the following on your Oracle RDS database:

  • exec rdsadmin.rdsadmin_util.set_configuration('archivelog retention hours', 24);
  • exec rdsadmin.rdsadmin_util.alter_supplemental_logging('ADD','ALL');
  • exec rdsadmin.rdsadmin_util.alter_supplemental_logging('DROP','PRIMARY KEY');

You can filter by schema, which should provide you with a drop-down box to select one or more schemas. Ensure that you enable logging on the migration/replication task: if you get errors (which I did on the first couple of attempts), you won’t be fixing anything without the logs.
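
If you would rather script the task than click through the wizard, the schema filter and the logging switch map onto the task's table mappings and task settings. The following is a minimal sketch under assumptions: the helper names, the schema names and the ARNs are hypothetical, and it only builds the parameters rather than calling AWS. The selection-rule layout and the "full-load-and-cdc" migration type follow the DMS API as I understand it, so check them against the current documentation before relying on them.

```python
import json


def schema_selection_rules(schemas):
    """Build DMS table-mapping JSON that includes every table in each schema."""
    rules = []
    for i, schema in enumerate(schemas, start=1):
        rules.append({
            "rule-type": "selection",
            "rule-id": str(i),
            "rule-name": f"include-{schema.lower()}",
            # "%" is the DMS wildcard, so this selects all tables in the schema.
            "object-locator": {"schema-name": schema, "table-name": "%"},
            "rule-action": "include",
        })
    return json.dumps({"rules": rules})


def replication_task_params(task_id, source_arn, target_arn, instance_arn, schemas):
    """Assemble keyword arguments for a full load plus ongoing replication task."""
    return {
        "ReplicationTaskIdentifier": task_id,
        "SourceEndpointArn": source_arn,
        "TargetEndpointArn": target_arn,
        "ReplicationInstanceArn": instance_arn,
        "MigrationType": "full-load-and-cdc",
        "TableMappings": schema_selection_rules(schemas),
        # Task logging enabled, since errors are hard to debug without it.
        "ReplicationTaskSettings": json.dumps({"Logging": {"EnableLogging": True}}),
    }


# With boto3 installed and credentials configured, the dict could be passed as:
#   boto3.client("dms").create_replication_task(**params)
```

Keeping the parameter building separate from the API call means the mapping can be inspected or version-controlled before anything is created.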

6 – Snapshots and Monitoring

For my requirements, daily snapshots/backups of the Azure VM provide sufficient coverage. The backup vault must be upgraded to v2 if you are using a Standard SSD disk on the Azure VM, see:
https://docs.microsoft.com/en-us/azure/backup/backup-upgrade-to-vm-backup-stack-v2#upgrade

To enable email notifications for Azure backups, go to the Azure portal, select the applicable vault, click on ‘view alerts’ -> ‘Configure notifications’, enter an email address and check ‘critical’ (or whichever types of notification you want). Other recommended monitoring checks include: ping for VPN connectivity, a status check of the DMS task (using the AWS CLI), and a SQL query on the destination database confirming the latest timestamp of a table that should receive regular updates.
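
Two of these checks are easy to script. The sketch below is one possible shape, not a tested monitoring setup: unhealthy_tasks inspects the dict returned by boto3's describe_replication_tasks call on the DMS client, and is_stale compares the newest timestamp fetched from the destination database with the current time. The function names, the "running" status treated as healthy and the one hour threshold are assumptions to adjust.

```python
from datetime import datetime, timedelta, timezone


def unhealthy_tasks(response, ok_statuses=("running",)):
    """Return (identifier, status) for each DMS task not in an accepted status.

    `response` is the dict returned by
    boto3.client("dms").describe_replication_tasks().
    """
    return [
        (task["ReplicationTaskIdentifier"], task["Status"])
        for task in response.get("ReplicationTasks", [])
        if task["Status"] not in ok_statuses
    ]


def is_stale(latest, now=None, max_age=timedelta(hours=1)):
    """Return True if the newest replicated row is older than max_age.

    `latest` is the result of something like SELECT MAX(updated_at) on a
    busy table (hypothetical column name). It must be timezone-aware when
    `now` is left to default, because the default is an aware UTC time.
    """
    if now is None:
        now = datetime.now(timezone.utc)
    return now - latest > max_age
```

Either function returning a non-empty list or True would be the trigger for an alert in whatever monitoring tool is already in place.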

7 – Debug and Gotchas

  • Azure security group allowing AWS vpn tunnel endpoint to Azure VM
  • Windows firewall rule on VM allowing Oracle traffic (default port 1521) from AWS RDS private subnet
  • Route tables on AWS subnets to route traffic to your Azure subnet via the Virtual Private Gateway
  • Security groups on AWS to allow traffic from Azure subnet
  • Stability of the AWS <–> Azure VM site-to-site tunnel requires constant traffic
  • The DMS replication host seems to go into an arbitrary subnet of your VPC (there is probably some default setting I didn’t see), so check which subnet it landed in and ensure that subnet has routes for the Azure site-to-site link
  • Ensure the RDS Oracle database has the archive log retention and supplemental logs settings as per steps 4,5.
  • Azure backup job fails with ‘Currently Azure Backup does not support Standard SSD disks’. – upgrade backup vault: https://docs.microsoft.com/en-us/azure/backup/backup-upgrade-to-vm-backup-stack-v2#upgrade
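
The routing items in this list can be checked mechanically. As a rough sketch, assuming route entries shaped like the Routes list that boto3's describe_route_tables returns (with DestinationCidrBlock and GatewayId keys), the helper below reports which routes send an Azure subnet to a virtual private gateway. The function name and the example CIDRs are hypothetical, and it only looks at IPv4 routes.

```python
import ipaddress


def vgw_routes_for(routes, azure_cidr):
    """Return destination CIDRs that cover azure_cidr and point at a vgw-* target."""
    target = ipaddress.ip_network(azure_cidr)
    matches = []
    for route in routes:
        cidr = route.get("DestinationCidrBlock")
        gateway = route.get("GatewayId", "")
        # Skip IPv6 or prefix-list routes and anything not via a virtual private gateway.
        if cidr is None or not gateway.startswith("vgw-"):
            continue
        if target.subnet_of(ipaddress.ip_network(cidr)):
            matches.append(cidr)
    return matches


# Example with placeholder values:
# vgw_routes_for(route_table["Routes"], "172.16.1.0/24")
```

An empty result for the subnet that holds the DMS replication host would explain a failing endpoint connection test.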