Microsoft, since 2012, has recommended Always-On Availability Groups as the Tier 1 solution for SQL Server databases.
Here are the “Always On” Architecture guides: https://blogs.msdn.microsoft.com/sqlalwayson/2012/07/03/alwayson-architecture-guides/
In particular, this guide is thorough, is written by still-famous experts in the SQL Server world, who do real world implementations of highly available and scalable solutions. https://docs.microsoft.com/en-us/previous-versions/sql/sql-server-2012/hh781257(v=msdn.10)
Additionally, the Microsoft “SQLCAT” (which stood for SQL Customer Advisory Team) (Largest Enterprises migrating to SQL Server from, shall we say, other platforms) has lots of best practice articles on AlwaysOn Availability Groups. https://blogs.msdn.microsoft.com/sqlcat/tag/availability-groups/
Specifically, 5 years ago, they documented doing Availability Groups in Azure VMs. This is far simpler nowadays. https://blogs.msdn.microsoft.com/sqlcat/2014/09/22/extending-alwayson-availability-group-to-remote-azure-datacenter-powershell/
From SQL 2005 through SQL 2008R2, “Highly Available with Disaster Recovery” would have been Failover Instance clusters, with a secondary copy delivered via Database Mirroring.
This architecture was used by a company known as the NASDAQ stock exchange, and was a case study back then. http://download.microsoft.com/documents/customerevidence/Files/49271/NASDAQ_SQLServer_Word.doc
AlwaysOn is a single-solution way to achieve even higher availability than with Failover Instances, with secondary copies in a different region. NASDAQ and others adopted AlwaysOn Availability Groups in place of old combinations of failover instances and mirroring or log shipping.
I will quote the document, and then the sections below will dive into detail. Note, the below is for SQL 2012. 2014 increased number of secondaries, and 2016 improved number of synchronous copies, as well as throughput. The best solution is on top, and goes down from there in terms of RPO and RTO.
Contrasting RPO and RTO Capabilities
The business goals for Recovery Point Objective (RPO) and Recovery Time Objective (RTO) should be key drivers in selecting a SQL Server technology for your high availability and disaster recovery solution.
This table offers a rough comparison of the type of results that those different solutions may achieve:
|High Availability and Disaster Recovery
SQL Server Solution
|Potential Data Loss (RPO)||Potential Recovery Time (RTO)||Automatic Failover||Readable Secondaries(1)|
|AlwaysOn Availability Group – synchronous-commit||
|Seconds||Yes(4)||0 – 2 (0-3 in 2016)|
|AlwaysOn Availability Group – asynchronous-commit||
|Minutes||No||0 – 4 (0-8 in 2014)|
|Failover Cluster Instance||NA(5)||Seconds-to-minutes||Yes||NA|
|Database Mirroring(2) – High-safety (sync + witness)||
|Database Mirroring(2) – High-performance (async)||
|Backup, Copy, Restore(3)||Hours(6)||Hours
(1) An AlwaysOn Availability Group can have no more than a total of four secondary replicas (eight in 2014), regardless of type.
(2) This feature will be removed in a future version of Microsoft SQL Server. Use AlwaysOn Availability Groups instead.
(3) Backup, Copy, Restore is appropriate for disaster recovery, but not for high availability.
(4) Automatic failover of an availability group is not supported to or from a failover cluster instance.
(5) The FCI itself doesn’t provide data protection; data loss is dependent upon the storage system implementation.
(6) Highly dependent upon the workload, data volume, and failover procedures.
Solutions on Azure with Virtual Machines for SQL Server.
Best Practice option: Always On Availability Groups with SQL Enterprise. Backups should also be done and quickly available.
The edition of SQL Server matters. Enterprise is the top tier, and using its features such as Always-On Availability groups are recommended for 24×7 operation. https://docs.microsoft.com/en-us/sql/sql-server/editions-and-components-of-sql-server-2017?view=sql-server-2017
Standard edition, from the link above, shows up as “Basic data management …. For departments and small organizations”. Under high availability from the same link,
“Always On Availability Groups is not available in Standard Edition.”
This automatically makes it a lower uptime. Failover instances were great technology 20 years ago, but they were surpassed by database mirroring for faster failover back in 2005. Today, it’s more of a legacy compatibility feature.
Basic availability groups is basically Database Mirroring, but modernized for SQL 2016.
All up Business continuity and disaster recovery for SQL Server is here:
When you read this, know that we want our customers and partners to be successful, for the application intended.
The modern way is via Always On Availability Groups, which permit going higher level on a secondary copy of data, testing on that secondary copy, and then failing over to the secondary.
Way lower in the article is the non-cluster way:
If recovery point and recovery time objectives are more flexible, or databases are not considered to be highly mission critical, log shipping is another proven availability feature in SQL Server. Based on SQL Server’s native backups, the process for log shipping automatically generates transaction log backups, copies them to one or more instances known as a warm standby, and automatically applies the transaction log backups to that standby. Log shipping uses SQL Server Agent jobs to automate the process of backing up, copying, and applying the transaction log backups.”
Since words matter, and clarity is key: Mission Critical means using Always On Availability Groups for the best choice.
SLAs are important. We document them clearly:
- For all Virtual Machines that have two or more instances deployed across two or more Availability Zones in the same Azure region, we guarantee you will have Virtual Machine Connectivity to at least one instance at least 99.99% of the time.
- For all Virtual Machines that have two or more instances deployed in the same Availability Set, we guarantee you will have Virtual Machine Connectivity to at least one instance at least 99.95% of the time.
- For any Single Instance Virtual Machine using premium storage for all Operating System Disks and Data Disks, we guarantee you will have Virtual Machine Connectivity of at least 99.9%.
“Availability Set” refers to two or more Virtual Machines deployed across different Fault Domains to avoid a single point of failure.
To get the best availability, the Always On Availability Group cluster would have to be in an “Availability Set”, which ensures our back end maintenance does not take all of those machines down at the same time.
Also, in reality, monthly patching will require downtime unless you are protected by Always On Availability Groups (Or, at the least, automatic failover by a failover instance).
Therefore, outside of patching and other maintenance, you can expect no better than three 9’s (99.9%) virtual machine connectivity. It might be better, but it is not guaranteed.
The differences between a failover instance and availability groups are laid out here:
If you prefer video information, David Pless, of Microsoft, my coworker for 13 years, and great presenter, presented AlwaysOn Availability groups to the PASS (Professional Association of SQL Server) High Availability Disaster Recovery virtual chapter.
Finally, for SQL Server on Azure VMs, here is a performance guide, so you can plan or adjust to a well-performing setup. https://docs.microsoft.com/en-us/azure/virtual-machines/windows/sql/virtual-machines-windows-sql-performance
GE Healthcare lifted-and-shifted to the Azure Cloud, and leveraged VMs with .NET and SQL Server in Always On Availability Groups in Virtual Machines.
GE Healthcare and Microsoft kicked off a proof of concept for transitioning apps that were running in a production environment on-premises to Microsoft Azure. The transition was seamless using Microsoft Azure Virtual Machines (VMs). The VMs host multiple application components, including database instances running on SQL Server software and business logic running in Microsoft .NET Framework assemblies. Both SQL Server and .NET on Microsoft Azure VMs provide full compatibility with on-premises infrastructure. In addition, the SQL Server AlwaysOn feature ensures high availability.
Here is a Retail example customer case study, leveraging SQL Server and Always On Availability Groups on Azure:
Outside of failover being automatic: ENTERPRISE edition can keep databases up and running and maintained longer. I collected data from many places to put this “Do you need Enterprise again” after the updates post SP1 of SQL 2016 towards SQL Standard. More features have come to enterprise edition since, for 2017 and soon to come 2019.
Leveraging either Basic Availability Groups or Failover Instances with SQL Standard, would be cheaper, but has significant performance scale limits, and risks around uptime.
Not recommended (Low-tier option):
What about using a single instance of SQL Standard?
That’s pretty much not covered in any talk above about Enterprise, Availability, or Scale.
I would warn strongly against “saving money” by being exposed by a relatively low-availability configuration such as a single virtual machine hosting SQL Server databases.
Upgrades and patching in this scenario has vastly increased risk, as rollback basically includes a reinstall and exporting and importing data. If this matches your needs, that is fine then. Be aware of this when planning your upgrades and downtimes.
With higher needs, you need better support. Here are the options. Note, Production is basic for most production needs, and then Professional or Premier support is recommended.
Someday, you might plan to migrate and update code, and get to a world of cloud-native configurations.
I have helped major companies migrate thousands of databases to Azure SQL Database. It took some code rework, but their investment in that led to equal cost as on-prem, but without the following headaches:
- Deployment issues
Granted, it took time for them and others to do that all. In the end, you can read about Paychex, a time-tracking company, who did that here:
The SLAs for this, where you DO NOT have to patch, and you get backups included for free, and easy multi-region deployment as needed is here:
- Azure SQL Database Business Critical or Premium tiers configured as Zone Redundant Deployments have an availability guarantee of at least 99.995%.
- Azure SQL Database Business Critical or Premium tiers not configured for Zone Redundant Deployments, General Purpose, Standard, or Basic tiers, or Hyperscale tier with two or more replicas have an availability guarantee of at least 99.99%.
- Azure SQL Database Hyperscale tier with one replica has an availability guarantee of at least 99.95% and 99.9% for zero replicas.