Onboarding: Azure Migration, Backup, Recovery

Topics

  • Key Concepts
  • Azure Paired Regions
  • Azure Migration Framework
  • Azure Migrate
    • Migrate servers with Azure Migrate
  • Azure Site Recovery
  • Protect Infrastructure with Site Recovery
  • Protect on-prem infrastructure from disaster by using Azure Site Recovery
    • Disaster Recovery drill
  • Protect VM by using Azure Backup

Scenarios:

  • Planning to move several datacenters to Azure; the workloads are VMware VMs and relational databases.
  • Protect Azure Infrastructure with Site Recovery
  • Protect on-prem infrastructure from disaster by using Azure Site Recovery

Key Concepts

  • BCDR: business continuity and disaster recovery.
  • Azure Region: a geographic area in which Microsoft has one or more datacenters.
  • Azure Paired Regions: Each Azure Region is paired with another Azure Region within the same geography.

Azure Paired Regions

  • Together, the two regions form a regional pair.
  • Azure serializes platform updates (planned maintenance) across a region pair.
  • Only one region of a pair is updated at a time.
  • For BCDR, we can use the paired region to benefit from Azure’s isolation and availability policies.
  • For active/active models, use paired regions to optimize application availability and minimize recovery time in the event of a disaster.

Benefits of paired regions

  • Physical isolation
    • at least 300 miles of separation where possible
    • reduces the likelihood that natural disasters, civil unrest, power outages, or physical network outages affect both regions at once
  • Platform-provided replication
    • Some services such as Geo-Redundant Storage provide automatic replication to the paired region.
  • Region recovery order
    • When applications are deployed across paired regions, one region of each pair is guaranteed to be recovered with priority. Recovery of applications deployed across regions that are not paired might be delayed.
  • Sequential updates for paired regions
  • Data residency
    • both regions of a pair reside in the same geography
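
The sequential-update behaviour can be sketched in a few lines of Python. This is only an illustration: the two pairs listed are real Azure pairings, but `REGION_PAIRS`, `paired_region`, and `update_waves` are hypothetical names made up for this note and are not part of any Azure API.

```python
# Illustrative subset of Azure region pairs (not a complete list).
REGION_PAIRS = {
    "eastus": "westus",
    "northeurope": "westeurope",
}


def paired_region(region: str) -> str:
    """Return the partner of a region, looking the pair up in either direction."""
    if region in REGION_PAIRS:
        return REGION_PAIRS[region]
    for primary, secondary in REGION_PAIRS.items():
        if secondary == region:
            return primary
    raise KeyError(f"no pair known for region {region!r}")


def update_waves() -> list:
    """Split regions into two maintenance waves so a pair is never updated together."""
    return [sorted(REGION_PAIRS.keys()), sorted(REGION_PAIRS.values())]
```

Because each wave contains only one side of every pair, an application deployed to both regions of a pair always keeps one region untouched during planned maintenance.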

Resource

Azure Migration Framework

Microsoft has a four-stage framework for a successful migration.

Assess

  • It’s about discovery and evaluation.
  • It produces a full inventory and dependency map.

Migrate

Optimize

Monitor

The cloud solution architect must consider the framework and provide a business plan for the migration.

What happens in the Assess step

Discover and evaluation
Involve key stakeholders
Estimate cost saving
Identify tools

Discover and evaluation
  • Perform a full assessment of the current environment to identify servers, applications, and services. Involve the business teams for feedback, guidance, and support.
  • Create a full inventory and dependency map of servers and services. Each application must be fully investigated.
  • Evaluate the migration options available for each application (a manual step):
    • Rehost: recreate the existing infrastructure in Azure. This has the least impact and needs minimal changes, e.g. moving VMs from a datacenter to Azure.
    • Refactor: move services running on a VM to platform-as-a-service (PaaS) services. This reduces operational requirements, improves release agility, and keeps costs low; small efficiency enhancements can have a large impact on performance.
    • Rearchitect: change the application to be cloud native, for example by moving to containers or microservices, so that it can move to the cloud.
    • Rebuild: rebuild the software when rearchitecting would cost more than starting from scratch.
    • Replace: use third-party or software-as-a-service (SaaS) options.
Involve key stakeholders

Superusers have to be involved so that the migration benefits from their experience.

Estimate cost saving

Use the Azure Total Cost of Ownership (TCO) Calculator to estimate the real costs of supporting the workloads, keeping in mind that the purpose of the migration’s business plan is to reduce costs.

Identify tools
Service/tool | Stage | Use
Azure Migrate | Assess, Migrate | Assesses and migrates VMware VMs, Hyper-V VMs, physical servers, databases, data, virtual desktops, and web applications to Azure.
Service Map | Assess | Maps communication between application components on Windows/Linux to identify dependencies.
Azure TCO Calculator | Assess | Estimates monthly running costs in Azure vs. on-prem.
Azure Database Migration Service | Migrate | Uses the Data Migration Assistant to migrate databases to Azure.
Data Migration Tool | Migrate | Migrates databases to Azure Cosmos DB.
Azure Cost Management | Optimize | Helps to monitor, control, and optimize ongoing Azure costs.
Azure Advisor | Optimize | Helps to optimize resources for better availability, performance, and cost.
Azure Monitor | Monitor | Helps to monitor overall performance and application health, and to set up notifications.
Azure Sentinel | Monitor | Provides intelligent security analytics for applications.

What happens in the Migrate step

Start with a small migration
Deploy cloud infrastructure target
Migrate workloads
Decommission on-prem infrastructure

At a high level, the steps are:

  1. Prepare the source (vCenter Server) and target (Azure) environments.
  2. Set up and start the replication between the two.
  3. Test that the replication has worked.
  4. Fail over from the source servers to Azure.

For the database migrations, the high-level steps are:

  1. Assess your on-premises databases.
  2. Migrate the schemas.
  3. Create and run an Azure Database Migration Service project to move the data.
  4. Monitor the migration.

Source


What happens in the Optimize step

Analyze running costs -> Use Azure Cost Management to start analyzing your Azure costs at different management scopes.
Review opportunities to improve -> Azure Cost Management shows you cost-reduction advice from Azure Advisor.


What happens in the Monitor step

Integrate health and performance monitoring

Azure Migrate

To use Azure Migrate, we need an Azure Migrate project.

These are the Azure Migrate steps:

Discover virtual machines

  • Azure Migrate performs agentless discovery; the Server Assessment tool guides you through the process.
  • Download and install the collector appliance (VMware, Hyper-V).
  • The collector appliance has to be connected to the Azure Migrate project.
  • The collector gathers data about VM cores, memory, disk sizes, and network adapters.
  • Where applicable, the collector also gathers performance data such as CPU and memory usage, disk IOPS, disk throughput, and network output.
  • To visualize the dependencies, we can install the Microsoft Monitoring Agent (Windows/Linux).
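
To make the collected data more concrete, here is a small Python sketch of how inventory and performance data could feed a sizing estimate. It is a simplified illustration, not the Azure Migrate algorithm: the `DiscoveredVM` record, the `recommended_size` function, and the comfort factor of 1.3 used as a default are assumptions made for this note.

```python
import math
from dataclasses import dataclass


@dataclass
class DiscoveredVM:
    """Hypothetical record holding a subset of what the collector gathers."""
    name: str
    cores: int
    memory_gb: float
    cpu_usage_pct: float     # observed CPU utilisation in percent
    memory_usage_pct: float  # observed memory utilisation in percent


def recommended_size(vm: DiscoveredVM, comfort_factor: float = 1.3) -> tuple:
    """Estimate (cores, memory in GB) from utilisation plus a safety margin."""
    cores = math.ceil(vm.cores * vm.cpu_usage_pct / 100 * comfort_factor)
    memory = math.ceil(vm.memory_gb * vm.memory_usage_pct / 100 * comfort_factor)
    # Never recommend less than one core or one GB of memory.
    return max(1, cores), max(1, memory)
```

For example, a VM with 8 cores at 25% CPU usage and 32 GB of memory at 50% usage would be sized at 3 cores and 21 GB under these assumptions.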

Create assessments

  • In the portal, select the Assessments section, and then select Create assessment
  • An assessment is created with default settings

It’s important to install the collector appliance on Windows Server; otherwise, the installer reports an error.

Screenshot showing the Azure Migrate installer checking that it is being installed on Windows Server

Source

Migrate servers with Azure Migrate
  • After the Azure Migrate assessment
    • decide which servers are the best candidates for migration
  • share the findings with the IT director
  • test the end-to-end process
  • add the Server Migration tool to Azure Migrate

Source

Migrate databases with Azure Database Migration Service
  • Azure Database Migration Service enables online and offline migrations from multiple database sources to Azure data platforms
  • The service uses the Microsoft Data Migration Assistant to generate assessment reports.
  • Azure Database Migration Service has two pricing tiers:
    • Standard: Supports only offline migrations. There’s no charge to use this tier.
    • Premium: Supports both offline and online migrations. There’s no charge for the first six months. After that period, you’ll incur charges.
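
The tier rules above can be written down as a small lookup. This Python sketch only restates the two tiers from these notes; `DMS_TIERS` and `tiers_supporting` are hypothetical names, and the current Azure pricing should be checked before relying on it.

```python
# Migration modes supported by each pricing tier, as described in the notes above.
DMS_TIERS = {
    "Standard": {"offline": True, "online": False},
    "Premium": {"offline": True, "online": True},
}


def tiers_supporting(mode: str) -> list:
    """Return the tier names that support the given migration mode."""
    if mode not in ("offline", "online"):
        raise ValueError("mode must be 'offline' or 'online'")
    return [name for name, modes in DMS_TIERS.items() if modes[mode]]
```

An online migration therefore always needs the Premium tier, while an offline migration can use either.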
Destinations

Your relational database can be migrated to a number of different destinations in Azure:

  • Single Azure SQL Database instance: A fully managed, single SQL database.
  • Azure SQL Database managed instance: Nearly 100% compatible with the SQL Server Enterprise Edition Database Engine, but missing some minor SQL Server features.
  • SQL Server on Azure Virtual Machines: An infrastructure-as-a-service (IaaS) offering that runs a full version of SQL Server and supports all the features of SQL Server.
  • Azure Database for MySQL: An Azure database service based on the MySQL Community Edition, versions 5.6 and 5.7.
  • Azure Database for PostgreSQL: An Azure database service based on the community version of the PostgreSQL database engine.
  • Azure Cosmos DB: A globally distributed, multi-model, fully managed database service.

Source

Azure Site Recovery

Every organization with cloud infrastructure should have a business continuity and disaster recovery (BCDR) plan, and each project should have its own plan as well.

About Site Recovery

  • Provides disaster recovery for Azure infrastructure by orchestrating replication, failover, and failback of Azure VMs.
  • Replicates workloads between a primary and a secondary site.
  • Can migrate VMs from on-prem to Azure.
  • Performs replication for business continuity and disaster recovery (BCDR).
  • Replicates VM workloads to a secondary region (Site Recovery can be used for migration as well).
  • Site Recovery uses an Azure Recovery Services vault. The vault uses storage to keep data backups, VM configurations, and workloads.
Diagram showing a VM environment that is unavailable and failing over to a secondary environment
Source : https://docs.microsoft.com/en-us/learn/modules/protect-infrastructure-with-site-recovery/2-what-is-site-recovery

Site Recovery Features

It orchestrates disaster recovery and replication from on-prem to Azure and vice versa. It runs tests (disaster recovery drills) seamlessly without affecting production workloads.

Azure VM Protection
– Site Recovery protects Azure VMs by creating a mirror of each VM, together with an associated resource group, storage, VNet, and availability sets that carry a Site Recovery suffix.

Snapshots and recovery points
– We can define policies that specify the retention history of recovery points and the frequency of snapshots.
– Recovery points are created from snapshots.

Snapshot types
Crash-consistent: captures the data that is on disk at the time of the snapshot. Taken every 5 minutes by default.
App-consistent: captures everything a crash-consistent snapshot does, plus in-memory data and in-process transactions, so Site Recovery can restore a VM and any running apps without data loss. Taken every 60 minutes by default.

Recovery points are kept for 24 hours by default and can be extended to 72 hours.
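
As a rough worked example of what these defaults imply, the following Python sketch computes an upper bound on the number of recovery points within the retention window. It assumes that every snapshot becomes a recovery point and that none are pruned early, which may not match what Site Recovery actually keeps; `max_recovery_points` is a hypothetical helper.

```python
def max_recovery_points(retention_hours: int = 24, interval_minutes: int = 5) -> int:
    """Upper bound on recovery points kept for one snapshot type."""
    if retention_hours <= 0 or interval_minutes <= 0:
        raise ValueError("retention and interval must be positive")
    return retention_hours * 60 // interval_minutes
```

With the defaults above this gives at most 288 crash-consistent and 24 app-consistent points per 24 hours; extending retention to 72 hours triples both numbers.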

Replication to secondary region
– After replication is enabled, the Mobility service extension is installed on the VM and the VM is registered with Site Recovery.
– Continuous replication of the VM begins: any write to the disk is immediately transferred to a local storage account that acts as a cache. Site Recovery then replicates the cache to a storage account in the destination region.

Disaster recovery (DR) drills
– A DR drill enables you to validate the replication strategy without losing data, incurring downtime, or compromising the production environment.

Flexible failover and failback

Business continuity and disaster recovery (BCDR)
This is important because a loss of service causes:
– lost revenue
– financial penalties if the company breaks its agreements

As part of BCDR, we have to identify the following objectives for each application. These objectives help determine the maximum number of hours that the business can tolerate being without specified services, and what the data recovery process should be.
– Recovery Time Objectives (RTOs)
– Recovery Point Objectives (RPOs)

Recovery Time Objective: the maximum amount of time that the business can survive after a disaster before normal service is restored. If the RTO is 12 hours, operations can continue for 12 hours without the business’s core services functioning; if the downtime lasts 24 hours, the business would be seriously harmed.

Recovery Point Objective: the maximum amount of data loss that’s acceptable during a disaster.
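
To show how the two objectives are measured against an actual incident, here is a minimal Python sketch. The function name `meets_objectives` and its parameters are hypothetical; it simply compares downtime with the RTO and the age of the last recovery point with the RPO.

```python
from datetime import datetime, timedelta


def meets_objectives(
    last_recovery_point: datetime,
    disaster_time: datetime,
    service_restored: datetime,
    rto: timedelta,
    rpo: timedelta,
) -> dict:
    """Check one incident against the recovery time and recovery point objectives."""
    downtime = service_restored - disaster_time      # measured against the RTO
    data_loss = disaster_time - last_recovery_point  # measured against the RPO
    return {"rto_met": downtime <= rto, "rpo_met": data_loss <= rpo}
```

With an RTO of 12 hours, a service restored 10 hours after the disaster meets the objective, while one restored after 24 hours does not.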

Azure Site Recovery prerequisites

  • Add a Recovery Services vault
  • Organize target resources -> the target region must be different from the source region
  • Configure outbound network connectivity
  • Set up replication on existing VMs
Site Recovery for on-prem / physical servers
  • On-prem, we need a master server
  • The master server also acts as the configuration server for the on-prem servers and VMs and hosts:
    • Configuration server
    • Process server
    • Master target server
  • Replicated servers -> the Mobility service must be installed on every server that is to be replicated

Protect Infrastructure with Site Recovery

Site Recovery installs the Mobility service on the source VMs automatically. If the installation fails, we can install it manually on the source VMs.

For example, run the following commands in Azure Cloud Shell (Bash) to prepare an environment for testing Site Recovery:

# Download the ARM template used by the Microsoft Learn exercise
curl https://raw.githubusercontent.com/MicrosoftDocs/mslearn-protect-infrastructure-with-azure-site-recovery/master/deploy.json > deploy.json

# Create the source (west) and target (east) resource groups
az group create --name west-coast-rg --location westus2
az group create --name east-coast-rg --location eastus2

# Deploy the template into the source resource group
az deployment group create \
    --name asrDeployment \
    --template-file deploy.json \
    --parameters storageAccounts_asrcache_name=asrcache$RANDOM \
    --resource-group west-coast-rg

Source : https://docs.microsoft.com/en-us/learn/modules/protect-infrastructure-with-site-recovery/3-site-recovery-setup

After creating the source environment and the Recovery Services vault, we can open the vault and use the ‘+ Replicate’ button. Only resource groups in the region chosen as ‘Source Location’ can be selected.

Then we select the VMs in the next step and confirm with OK.

Then we use the ‘Customize’ button for the resource group, and…

Change the settings as follows:

Screenshot showing selecting the east coast resource group

Finally, enable replication.

Protect on-prem infrastructure from disaster by using Azure Site Recovery

  • Site recovery
    • recover from system outages
    • replicates workloads between a primary and secondary site
    • migrate VMs from on-premises infrastructure to Azure
    • As part of your BCDR plan, identify the recovery time objectives (RTOs) and recovery point objectives (RPOs) for your applications. Both objectives help to realize the maximum tolerable hours that your business can be without specified services, and what the data recovery process should be. Let’s look closer at each one.
    • recovery time objectives (RTOs):
      • A recovery time objective is a measure of the maximum amount of time your business can survive after a disaster before normal service is restored. Let’s assume your RTO is 12 hours, which means that operations can continue for 12 hours without the business’s core services functioning. If the downtime is 24 hours, your business would be seriously harmed.
    • recovery point objectives (RPOs):
      • A recovery point objective is a measure of the maximum amount of data loss that’s acceptable during a disaster. A business can typically decide to do a backup every 24 hours, 12 hours, or even in real time. If a disaster occurs, there’s always some data loss.
    • Some notable features of Azure Site Recovery are:
      • Central management: Replication can be set up and managed and failover and failback can be invoked all from within the Azure portal.
      • On-premises virtual machine replication: On-premises virtual machines can be replicated to Azure or to a secondary on-premises datacenter, if necessary.
      • Azure virtual machine replication: Azure virtual machines can be replicated from one region to another.
      • App consistency during failover: By using recovery points and application-consistent snapshots, virtual machines are kept in a consistent state at all times during replication.
      • Flexible failover: Failovers can be run on demand as a test or triggered during an actual disaster. Tests can be run to simulate a disaster recovery scenario without interruption to your live service.
      • Network integration: Site Recovery can handle network management during a replication and disaster recovery scenario. Reserved IP addresses and load balancers are included so that the virtual machines can work in the new location.
    • Site Recovery can replicate any app that runs on a supported machine:
      • Azure VM: Replication is available for any workload that runs on a supported Azure virtual machine.
      • Hyper-V VM: Protection is available for any workload that runs on a Hyper-V virtual machine.
      • Physical servers: Protection is available for Windows and Linux operating systems.
      • VMware VM: Protection is available for any workload that runs in a VMware virtual machine
      • Some of the features offered include:
        • Near synchronous replication: Data is written to the primary storage and the replica almost simultaneously to allow for low recovery times.
        • App-consistent snapshots: Snapshots taken for recovery purposes can view information in memory and pending I/O operations, which allows for quick recovery times. The application is ready to go after switching to the replica VM.
        • Integration with SQL Always On: Always On is an enterprise-level alternative to normal database mirroring techniques.
        • Flexible recovery plans: The ability to recover an entire application stack with a single click. Both manual and scripted actions can be incorporated into the overall plan.
          • Network management: The automated ability to simplify the reservation of IP addresses, configure load balancers, and integrate with Azure Traffic Manager.
          • Automation library: Production-ready scripts that can be downloaded and integrated with the recovery plan to provide a fully automated recovery, if needed.

Site Recovery protects the virtual machine that hosts your domain controller and DNS.

Scenario: The first step in protecting your workloads from a disaster, for example an earthquake, is to review the company’s current business continuity and disaster recovery (BCDR) plan. You need to identify the different recovery objectives and scope for the systems that need protection.

BCDR plans are formal documents that companies draw up that cover the scope and actions to be taken when a disaster or large-scale outage happens.

Set up Azure Site Recovery

To use Azure Site Recovery to protect on-prem infrastructure, the following components must be created or set up:

Diagram showing the Azure Site Recovery architecture
source
  • Networking: A valid Azure virtual network is required for the replicated virtual machines to use.
  • Recovery Services vault: A vault in your Azure subscription stores the migrated VMs when a failover is run. The vault also contains the replication policy and the source and target locations for replication and failover.
  • Credentials: The credentials you use for Azure must have the Virtual Machine Contributor and Site Recovery Contributor roles to allow permission to modify both the VM and the storage that Site Recovery is connected to.
  • Configuration server: An on-premises VMware server fulfills several roles during the failover and replication process. It’s obtained from the Azure portal as an open virtual machine appliance (OVA) for easy deployment. The configuration server includes a:
    • Process server: This server acts as a gateway for the replication traffic. It caches, compresses, and encrypts the traffic before sending it over the WAN to Azure. The process server also installs the mobility service onto all the physical and virtual machines targeted for failover and replication.
    • Master target server: This machine handles the replication process during a failback from Azure.
Azure Site Recovery architecture
Azure Site Recovery replication process [source]
Test and monitor a failover [Source]

Test the configuration by doing a disaster recovery drill on an isolated VM. It’s a best practice to use an isolated network for the test so that live services aren’t disrupted (Settings > Replicated Items section).

Source

Disaster Recovery drill
  • It’s for testing the BCDR plan.
  • In the Site Recovery section of the Azure portal, select Recovery Plans > your recovery plan name > Test Failover.

Source

Failover & Failback
  • Azure Site Recovery gives you the flexibility to fail over to Azure if a disaster occurs and fail back to on-premises machines after the event is over.

Source

Protect Azure infrastructure with Azure Site Recovery

Azure Site Recovery

  • managing and orchestrating replication, failover, and failback of Azure virtual machines
  • replicates your virtual machine workloads between Azure regions
  • Site recovery features
    • VM protection
    • snapshots and recovery points
    • replication to a secondary region
    • Disaster recovery (DR) drills
    • Flexible failover and failback
    • outbound network connectivity
      • Azure Site Recovery requires outbound connectivity on the virtual machines for replication
      • Site Recovery doesn’t support controlling network connectivity via an authentication proxy.
      • If our organization is using a URL-based firewall proxy to restrict outbound connectivity, we’ll need to add access to several URLs.
URL | Description
login.microsoftonline.com | For the Azure Site Recovery URLs to authenticate
*.blob.core.windows.net | To write VM data to the source storage account cache
*.hypervrecoverymanager.windowsazure.com | For Azure Site Recovery to communicate with the VM
*.servicebus.windows.net | For Azure Site Recovery monitoring and diagnostic data from the VM
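
When reviewing firewall rules, it can help to check whether a given hostname falls under one of these patterns. The Python sketch below uses the standard `fnmatch` module for the wildcard matching; `REQUIRED_URL_PATTERNS` and `is_required_host` are hypothetical names, and the pattern list contains only the four entries from the table above.

```python
from fnmatch import fnmatch

# URL patterns that Site Recovery needs outbound access to (from the table above).
REQUIRED_URL_PATTERNS = [
    "login.microsoftonline.com",
    "*.blob.core.windows.net",
    "*.hypervrecoverymanager.windowsazure.com",
    "*.servicebus.windows.net",
]


def is_required_host(hostname: str) -> bool:
    """Return True if the hostname matches one of the required URL patterns."""
    host = hostname.lower()  # hostnames are case-insensitive
    return any(fnmatch(host, pattern) for pattern in REQUIRED_URL_PATTERNS)
```

A firewall audit script could run every blocked hostname through `is_required_host` to spot rules that would break replication.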

If you prefer to control the connectivity using IP addresses instead, then you need to add the IP address ranges for:

  • The Azure Datacenters
  • The Azure Site Recovery endpoints

  • Replicate VMs with Site Recovery
    • The latest root certificates must be installed on the VM
    • Windows VMs must have the latest updates
    • On Linux VMs, the steps differ from distribution to distribution
    • The Mobility service (which keeps an up-to-date cache of the VMs’ data) must be installed on the VM (Site Recovery installs it automatically)
  • RBAC roles
    • Site Recovery Contributor
    • Site Recovery Operator
    • Site Recovery Reader

The diagram below demonstrates what Site Recovery does.

[Source]

Source

Protect VM by using Azure Backup

  • Azure backup for protecting data
    • for on-premises servers
    • virtual machines,
    • virtualized workloads such as SQL Server or SAP HANA running in Azure VMs,
    • Azure file shares
  • used as part of a BCDR plan for disaster recovery
  • high-risk servers must have backups
  • provides secure backup for all Azure-managed data assets
  • It uses zero-infrastructure solutions to enable self-service backups and restores, with at-scale management at a lower and predictable cost
  • offers specialized backup solutions for Azure and on-premises virtual machines

Scenario: Your company has several critical virtual machine workloads running on Azure. As the lead solution architect, you’ve been asked to ensure that the company can recover these virtual machines if there’s data loss or corruption. You’ve been asked to use the built-in capabilities of Azure Backup to help protect these virtual machines.

Azure Backup is a service that allows you to back up Azure virtual machines, on-premises servers, Azure file shares, SQL Server or SAP HANA running on Azure VMs, and other application workloads.

Azure Backup
  • A resiliency solution against faults and failures.
  • Maintains copies of stateful data that allow you to go back in time.
  • Used in cases of accidental data loss, data corruption, or ransomware attacks.
  • Uses a Recovery Services vault -> the vault is a storage-management entity that keeps the data in a separate fault domain.

Azure Site Recovery
  • A resiliency solution against faults and failures.
  • Replicates the data in almost real time and allows for a failover.
  • Used for a region-wide disaster (for network or power outages, use availability zones).

Snapshot
  • In Azure Backup, a snapshot is a point-in-time backup of all disks on the virtual machine.
  • A standalone disk snapshot is created from only one disk, even when the VM has, for example, three disks.

Additionally, the choice of a recovery approach depends on the criticality of the application, RPO and RTO requirements, and the cost implications.

  • Why Azure Backup
    • Zero-infrastructure backup
    • Long-term retention
    • Security
      • Azure role-based access control
      • Encryption of backups
      • No internet connectivity required
      • Soft delete
    • High availability
    • Centralized monitoring and management

Azure VMs take advantage of Azure Backup without installing additional software.

More about snapshot

For Azure virtual machines, Azure Backup uses a different extension for each supported operating system:

Extension | OS | Description
VMSnapshot | Windows | The extension works with Volume Shadow Copy Service (VSS) to take a copy of the data on disk and in memory.
VMSnapshotLinux | Linux | The snapshot is a copy of the disk.

Depending on how the snapshot is taken and what it includes, you can achieve different levels of consistency:

  • Application consistent
    • The snapshot captures the virtual machine as a whole. It uses VSS writers to capture the content of the machine memory and any pending I/O operations.
    • For Linux machines, you’ll need to write custom pre or post scripts per app to capture the application state.
    • You can get complete consistency for the virtual machine and all running applications.
  • File system consistent
    • If VSS fails on Windows, or the pre and post scripts fail on Linux, Azure Backup will still create a file-system-consistent snapshot.
    • During a recovery, no corruption occurs within the machine. But installed applications need to do their own cleanup during startup to become consistent.
  • Crash consistent
    • This level of consistency typically occurs if the virtual machine is shut down at the time of the backup.
    • No I/O operations or memory contents are captured during this type of backup. This method doesn’t guarantee data consistency for the OS or app.
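
The three consistency levels can be summarized as a small decision function. This Python sketch follows the rules listed above; `snapshot_consistency` and its parameters are hypothetical names, and real backups may behave differently in edge cases not covered in these notes.

```python
def snapshot_consistency(os_type: str, vm_running: bool,
                         vss_ok: bool = False, scripts_ok: bool = False) -> str:
    """Return the consistency level a VM backup would typically achieve."""
    if not vm_running:
        # A VM that is shut down at backup time yields a crash-consistent snapshot.
        return "crash-consistent"
    if os_type == "windows":
        # Windows relies on VSS writers for application consistency.
        return "application-consistent" if vss_ok else "file-system-consistent"
    if os_type == "linux":
        # Linux needs custom pre/post scripts for application consistency.
        return "application-consistent" if scripts_ok else "file-system-consistent"
    raise ValueError(f"unsupported OS type: {os_type!r}")
```

For instance, a running Linux VM whose pre and post scripts fail still gets a file-system-consistent snapshot.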
Backup policy

You can define the backup frequency and retention duration for your backups. Currently, the VM backup can be triggered daily or weekly, and can be stored for multiple years. The backup policy supports two access tiers – snapshot tier and the vault tier.

Snapshot tier: All the snapshots are stored locally for a maximum period of five days. This is referred to as the snapshot tier. For all types of operational recoveries, it’s recommended that you restore from the snapshots since it’s much faster to do so. This capability is called Instant Restore.

Vault tier: All snapshots are additionally transferred to the vault for additional security and longer retention. At this point, the recovery point type changes to “snapshot and vault”.
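
The relationship between the two tiers can be illustrated with a short Python sketch. It is a simplified model under stated assumptions: `recovery_point_type` is a hypothetical helper, the local retention is a parameter capped at the five days mentioned above, and the labels "vault" and "expired" are illustrative names for states not named in these notes.

```python
def recovery_point_type(age_days: float, in_vault: bool,
                        snapshot_retention_days: int = 5) -> str:
    """Classify a recovery point by the tier(s) that currently hold it."""
    if not 1 <= snapshot_retention_days <= 5:
        raise ValueError("snapshots are kept locally for at most five days")
    in_snapshot_tier = age_days <= snapshot_retention_days
    if in_snapshot_tier and in_vault:
        return "snapshot and vault"
    if in_snapshot_tier:
        return "snapshot"  # not yet transferred to the vault
    if in_vault:
        return "vault"     # local snapshot has aged out
    return "expired"
```

Only points that are still in the snapshot tier qualify for the faster Instant Restore path.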

Backup process for an Azure virtual machine

Here’s how Azure Backup completes a backup for Azure VMs:

  1. For Azure VMs that are selected for backup, Azure Backup starts a backup job according to the backup frequency you specify in the backup policy.
  2. During the first backup, a backup extension is installed on the VM, if the VM is running.
    • For Windows VMs, the VMSnapshot extension is installed.
    • For Linux VMs, the VMSnapshotLinux extension is installed.
  3. After the snapshot is taken, it’s stored locally as well as transferred to the vault.
    • The backup is optimized by backing up each VM disk in parallel.
    • For each disk that’s being backed up, Azure Backup reads the blocks on the disk and identifies and transfers only the data blocks that changed (the delta) since the previous backup.
    • Snapshot data might not be immediately copied to the vault. It might take several hours at peak times. Total backup time for a VM will be less than 24 hours for daily backup policies.
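
The idea of transferring only the changed blocks (the delta) can be shown with a small Python sketch. It compares per-block hashes of two disk images held in memory; this is an illustration of the concept only, not how Azure Backup detects changes internally, and `block_hashes` and `changed_blocks` are hypothetical helpers.

```python
import hashlib


def block_hashes(data: bytes, block_size: int) -> list:
    """Hash each fixed-size block of the data."""
    return [
        hashlib.sha256(data[i:i + block_size]).hexdigest()
        for i in range(0, len(data), block_size)
    ]


def changed_blocks(previous: bytes, current: bytes, block_size: int = 4) -> list:
    """Return the indices of blocks that are new or differ from the previous backup."""
    old = block_hashes(previous, block_size)
    new = block_hashes(current, block_size)
    return [i for i, digest in enumerate(new) if i >= len(old) or old[i] != digest]
```

Only the blocks at the returned indices would need to be transferred, which is why incremental backups are usually much smaller than the first full backup.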

Companies that have a business continuity and disaster recovery (BCDR) plan typically schedule test runs to ensure that the business can successfully recover from disasters. Now that you have successfully backed up your VMs, you want to explore the options available for restoring them as part of your BCDR testing.

Source

