This walkthrough implements Nutanix DR to protect guest VMs and orchestrate their disaster recovery to another Nutanix cluster when an event causes a service disruption at the primary site.
Nutanix DR Terminology:
Availability Zone (AZ): A zone that can have one or more independent datacenters inter-connected by low latency links. An AZ can either be in your office premises (on-prem) or in Xi Cloud Services. AZs are physically isolated from each other to ensure that a disaster at one AZ does not affect another AZ. An instance of Prism Central represents an on-prem AZ.
Recovery Availability Zone: An AZ where you can recover the protected guest VMs when a planned or an unplanned event occurs at the primary AZ causing its downtime. You can configure at most two recovery AZs for a guest VM.
Source Virtual Network: The virtual network from which guest VMs migrate during a failover or failback.
Recovery Virtual Network: The virtual network to which guest VMs migrate during a failover or failback operation.
Network Mapping: A mapping between two virtual networks in paired AZs. A network mapping specifies a recovery network for all guest VMs of the source network. When you perform a failover or failback, the guest VMs in the source network recover in the corresponding (mapped) recovery network.
Category: A VM category is a key-value pair that groups similar guest VMs. Associating a protection policy with a VM category ensures that the protection policy applies to all the guest VMs in the group regardless of how the group scales with time. For example, you can associate a group of guest VMs with the Department: Marketing category, where Department is a category that includes a value Marketing along with other values such as Engineering and Sales.
Recovery Point: A copy of the state of a system at a particular point in time.
Recovery Point Objective (RPO): The time interval that refers to the acceptable data loss if there is a failure. For example, if the RPO is 1 hour, the system creates a recovery point every 1 hour. On recovery, you can recover the guest VMs with data as of up to 1 hour ago. Take Snapshot Every in the Create Protection Policy GUI represents RPO.
Recovery Time Objective (RTO): The time period from the failure event to the restored service. For example, an RTO of 30 minutes means the protected guest VMs are brought back up and running within 30 minutes of the failure event.
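To make the RPO definition concrete, here is a minimal Python sketch of the arithmetic. It assumes recovery points are taken on a fixed schedule starting at a known time; the function name, the dates, and the schedule start are hypothetical and only illustrate the calculation, not how Nutanix schedules snapshots internally.

```python
from datetime import datetime, timedelta


def latest_recovery_point(first_point: datetime, rpo: timedelta, failure: datetime) -> datetime:
    """Return the newest scheduled recovery point taken at or before the failure."""
    if rpo <= timedelta(0):
        raise ValueError("rpo must be positive")
    if failure < first_point:
        raise ValueError("failure happened before the first recovery point")
    # Number of whole RPO intervals completed before the failure.
    completed_intervals = (failure - first_point) // rpo
    return first_point + completed_intervals * rpo


# Hypothetical values: hourly recovery points starting at midnight.
first_point = datetime(2024, 1, 1, 0, 0)
rpo = timedelta(hours=1)
failure = datetime(2024, 1, 1, 9, 45)

restore_point = latest_recovery_point(first_point, rpo, failure)
data_loss = failure - restore_point
print(restore_point, data_loss)
```

With a 1-hour RPO and a failure at 09:45, the newest recovery point is from 09:00, so 45 minutes of data are lost. The loss never exceeds the RPO itself.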
Protection and DR Between On-Prem Availability Zones:
Leap protects your guest VMs and orchestrates their disaster recovery (DR) to other Nutanix clusters when events causing service disruption occur at the primary AZ.
Before proceeding further, let me introduce my environment. I have two Nutanix clusters, and each cluster is registered with its own Prism Central instance, which is hosted on that same cluster. The logical design between the two clusters is as below.
Enabling Nutanix Disaster Recovery:
- Log in to Prism Central on both clusters
- Click the gear icon → click Disaster Recovery
- Click Enable
- Click Enable
Nutanix Disaster Recovery is enabled.
Connect AZ:
- Browse Navigation Bar → Administration → Availability Zones
- Click Connect to Availability Zone
- Select Physical Location, provide the second Prism Central's IP address, username, and password, and click Connect
A connection is created between the two Prism Central instances.
Creating Category:
- Browse Navigation Bar → Administration → Categories
- Click New Category
- Specify the Category Name and enter value (subcategories)
Creating Protection Policy:
- Browse Navigation Bar → Data Protection → Protection Policy
- Click Create Protection Policy
- Specify the primary location and cluster, and click Save
- Specify the recovery location (Prism Central and cluster) and click Save
- Specify the snapshot frequency and the retention on the local and remote sites
- Specify the desired category and click Add
- Click Create to create the policy
Assigning VM to Category:
- Navigate to VM
- Select the desired VM → Action → Manage Categories
- Select the desired category and click Save
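The reason for protecting by category rather than by individual VM can be illustrated with a small Python model. This is only a sketch of the idea, not the Prism Central API: the `GuestVM` and `ProtectionPolicy` classes, the VM names, and the one-value-per-key simplification are all assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class GuestVM:
    name: str
    # Simplification: one value per category key, e.g. {"Department": "Marketing"}.
    categories: Dict[str, str] = field(default_factory=dict)


@dataclass
class ProtectionPolicy:
    name: str
    category_key: str
    category_value: str

    def protects(self, vm: GuestVM) -> bool:
        """A VM is protected when it carries the policy's category key-value pair."""
        return vm.categories.get(self.category_key) == self.category_value


def protected_vm_names(policy: ProtectionPolicy, vms: List[GuestVM]) -> List[str]:
    return [vm.name for vm in vms if policy.protects(vm)]


# Hypothetical inventory and policy.
policy = ProtectionPolicy("dr-policy", "Department", "Marketing")
vms = [
    GuestVM("web01", {"Department": "Marketing"}),
    GuestVM("db01", {"Department": "Engineering"}),
    GuestVM("app01"),
]

before = protected_vm_names(policy, vms)
# Assigning the category to a VM brings it under the policy with no policy change.
vms[2].categories["Department"] = "Marketing"
after = protected_vm_names(policy, vms)
print(before, after)
```

The policy itself is never edited here. Tagging `app01` with the category is enough to bring it under protection, which is the behaviour the category steps above rely on.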
Review Protection Summary:
- Browse Navigation Bar → Data Protection → Protection Summary
The Protection Summary dashboard shows the RPO status.
- Browse Navigation Bar → VM → Recovery Points to see the recovery points and protection status of the selected VM.
Creating Recovery Plans:
- Browse Navigation Bar → Data Protection → Recovery Plans
- Create New Recovery Plan
- Specify the recovery plan name, specify the primary and recovery locations, and click Next.
- Click Add VMs
- Select the VM and click Add
- Click Next to proceed
- Select the network type (Stretch or Non-Stretch), specify the source and target network subnets, and click Done.
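The network mapping configured in the last step can be thought of as a lookup table from source network to recovery network, reversed on failback. The Python sketch below illustrates that idea only. The network names and both helper functions are hypothetical, and it assumes a one-to-one mapping.

```python
from typing import Dict


def recovery_network(source_network: str, mapping: Dict[str, str]) -> str:
    """Return the network a VM lands on when it leaves source_network."""
    try:
        return mapping[source_network]
    except KeyError:
        raise LookupError(f"no recovery network mapped for {source_network!r}") from None


def reverse_mapping(mapping: Dict[str, str]) -> Dict[str, str]:
    """Build the failback mapping by swapping source and recovery networks."""
    reversed_map = {target: source for source, target in mapping.items()}
    if len(reversed_map) != len(mapping):
        raise ValueError("mapping is not one-to-one, cannot reverse it for failback")
    return reversed_map


# Hypothetical network names for the primary and recovery AZs.
failover_map = {"Prod-VLAN10": "DR-VLAN10", "Prod-VLAN20": "DR-VLAN20"}
failback_map = reverse_mapping(failover_map)

print(recovery_network("Prod-VLAN10", failover_map))  # network used on failover
print(recovery_network("DR-VLAN10", failback_map))    # network used on failback
```

A VM on a source network with no entry in the mapping raises an error in this sketch, which mirrors why every source network used by the protected VMs needs a mapped recovery network.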
Initiating Failover:
- Select the recovery plan
- Click Failover to initiate the failover
- Select the desired failover type. In a planned failover, the source VM is shut down and, after the final sync, the VM is registered on the target cluster and powered on. In an unplanned failover, you can select the desired recovery point to restore from.
- Type Failover and click Failover
- Click Tasks to see the failover status
- The VM has failed over to the DR site successfully.
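The difference between the two failover types described above can be summarised as an ordered list of steps. The Python sketch below is a plain restatement of that description, not Nutanix's actual orchestration logic. The function name, the step strings, and the default of restoring from the "latest" recovery point are assumptions.

```python
from typing import List


def failover_steps(failover_type: str, recovery_point: str = "latest") -> List[str]:
    """Return the high-level steps for a planned or unplanned failover."""
    if failover_type == "planned":
        # The primary site is still reachable, so no data is lost.
        return [
            "shut down source VM",
            "run final sync to recovery site",
            "register VM on target cluster",
            "power on VM",
        ]
    if failover_type == "unplanned":
        # The primary site is assumed unavailable, so recovery starts from
        # a recovery point that already exists on the recovery site.
        return [f"restore VM on target cluster from recovery point {recovery_point!r}"]
    raise ValueError(f"unknown failover type: {failover_type!r}")


for step in failover_steps("planned"):
    print(step)
```

In this model only the unplanned path depends on a recovery point, which is why the data lost in an unplanned failover is bounded by the RPO, while a planned failover performs a final sync first.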