This whitepaper focuses on best practices to attain the most value for the least cost when running Microsoft SQL Server on the AWS platform. Although for many general purpose use cases, Amazon Relational Database Service (Amazon RDS) for Microsoft SQL Server provides an easy and quick solution, in this paper we focus on scenarios where you need to push the limits to satisfy your special requirements.

In particular, this whitepaper explains how you can minimize your costs, maximize availability of your SQL Server databases, and optimize your infrastructure for maximum performance. The flexibility of AWS services, combined with the power of Microsoft SQL Server, can provide expanded capabilities for those who seek innovative approaches to optimize their applications and transform their businesses.

The main focus of this paper is on the capabilities available in Microsoft SQL Server
2017, which is the most current version at the time of publication. Existing databases
that are running on previous versions (i.e., 2008, 2012, 2014, and 2016) can be
migrated to SQL Server 2017 and run in compatibility mode.

Microsoft SQL Server offers several High Availability/Disaster Recovery (HA/DR) solutions, each suitable for specific requirements. These include:

  • Log Shipping • Mirroring (Deprecated, use Availability Groups instead) 
  • Always On Availability Groups (Enterprise Edition) 
  • Always On Basic Availability Groups (Standard Edition) 
  • Always On Failover Cluster Instances 
  • Distributed Availability Groups

These solutions rely on one or more secondary servers with SQL Server running as active or passive standby. Based on the specific HA/DR requirements, these servers can be located in close proximity to each other or far apart.

Availability Zones and Multi-AZ Deployment
AWS Availability Zones (AZs) are designed to provide separate failure domains, while keeping workloads in relatively close proximity for low latency inter-communications. AZs are a good solution for synchronous replication of your databases using Mirroring, Always On Availability Groups, or Basic Availability Groups. SQL Server provides zero data-loss and, when combined with the low-latency infrastructure of AWS Availability Zones, provides high performance.

This is one of the main differences between most on-premises deployments and AWS.
For example, Always On Failover Cluster Instances (FCI) is often used inside a single
datacenter. This is because all nodes in an FCI cluster must have access to the same
shared storage. Locating these nodes in different datacenters could adversely impact
performance. However, with AWS, FCI nodes can be located in separate AZs and still
provide good performance because of the low-latency network link between all AZs
within a region.

Cluster Placement Groups and Enhanced Networking
Amazon EC2 enables you to deploy a number of EC2 instances inside a cluster placement group. This means those EC2 instances are not only inside a single AZ, but also, to ensure minimum network latency, within close physical proximity in the same datacenter. To gain the highest bandwidth on AWS, you can leverage enhanced networking and Elastic Network Adapter (ENA) . To minimize latency, you can deploy Always On Failover Cluster Instances, or Always On Availability Groups on instances that run inside an EC2 cluster placement group.

Additionally, in the case of an AZ failure, you can manually failover your database to the second AZ. Relatively low latency between AZs provides near-zero data-loss, even with asynchronous replication in place. This is an example of HA with close to zero performance penalty for applications that are very sensitive to latency, combined with a DR solution within the same AWS region to minimize the recovery point objective (RPO).

Multi-Region Deployments
For those workloads that require even more resilience against unplanned events, you can leverage the global scale of AWS to ensure availability under any circumstances. Since Amazon Virtual Private Cloud (Amazon VPC) is confined within a single region, for a multi-region deployment, you would need to establish connectivity between your VPCs in different regions. Although there are a number of ways to do this, in most cases, the best approach is using inter-region VPC peering. This approach provides security, optimized performance, and enhanced throughput by ensuring that all traffic between your VPCs is encrypted, stays on the AWS global network backbone, and never traverses the Internet.

Although it is possible to have synchronous replication in a multi-region SQL Server deployment, the farther apart your selected regions are, the more severe the performance penalty is for a synchronous replication. Often the best practice for multiregion deployments is to establish an asynchronous replication, especially for regions that are geographically distant. For those workloads that come with aggressive RPO requirements, asynchronous multi-region deployment can be combined with a multi-AZ or single-AZ synchronous replication. You can also combine all three methods into a single solution. However, these combinations would impose significant number of additional SQL Server license costs, which need to be considered as part of your inadvance planning.

Performance optimization
In some cases, maximizing performance may be your utmost priority. Both SQL Server and AWS have several options to substantially increase performance of your workloads.

The first and most effective way to improve the performance of your workloads is by optimizing your applications and database schemas. You may also be able to significantly improve application performance by changing your application to use NoSQL instead of a standard relational database. Both AWS and SQL Server provide NoSQL capabilities. Amazon DynamoDB is a managed NoSQL database service that offers provisioned throughput. You may also use the capabilities of Microsoft SQL Server, such as memory-optimized tables, XML, and JSON capabilities. Microsoft SQL Server allows you to mix relational and NoSQL queries and tailor high-performing solutions for complex problems.

Using Amazon Elastic Block Store (Amazon EBS)
Amazon EBS is a single-AZ block storage service with various flexible options, catering for diverse requirements. When it comes to maximizing performance, using a Provisioned IOPS volume type (io1) is the best choice. You can provision up to 32,000 IOPS per io1 EBS volume (based on 16 KiB I/O size), along with 500 MB/s throughput.

Instance Storage
Storage optimized EC2 instance types use fixed-size local disks and a variety of different storage technologies are available. Among these, Non-Volatile Memory express (NVMe) is the fastest technology with the highest IOPS and throughput. The i3 class of instance types provides NVMe SSD drives, e.g., i3.16xlarge, comes with 8 disks, each with 1.9 TB of storage. When selecting storage optimized EC2 instance types for maximum performance, it is essential to understand that some of the smaller instance types provide instance storage that is shared with other instances. These are virtual disks that reside on a physical disk attached to the physical host. By selecting a bigger instance type, such as i3.2xlarge, you ensure that there is a 1:1 correspondence between your instance store disk and the underlying physical disk. This ensures consistent disk performance and eliminates the noisy-neighbor problem.

Scale-Out File Server
Windows Server 2016 introduced a new service called Storage Spaces Direct (S2D). S2D allows all or selected disks attached to instances of a Windows Server Failover Cluster (WSFC) to be clustered into a storage pool and made available to all members of the cluster. S2D has flexible options to maximize performance, storage space, and durability. It removes the complexities of managing different disk technologies and creating a RAID cluster spread across several servers in a network.

Security optimization
Security is the first priority at AWS, and there are many AWS security features available to you. These features can be combined with the built-in security features of Microsoft SQL Server to satisfy even the most stringent requirements and expectations.

Amazon EBS Encryption
If you are using EBS volumes to store your SQL Server database files, you have the option to enable block-level encryption. Amazon EBS transparently handles encryption and decryption for you. This is available through a simple checkbox, with no further action necessary.

AWS Key Management Service (KMS)
AWS KMS is a fully managed service to create and store encryption keys. You can use KMS-generated keys or bring your own keys. In either case, keys never leave KMS and are protected from any unauthorized access.

Transparent Data Encryption (TDE)
TDE is a feature available in Microsoft SQL Server that provides transparent encryption of your data at rest. TDE is available on Amazon RDS for SQL Server, and you can also enable it on your SQL Server workloads on EC2 instances.

Always Encrypted
Always Encrypted is a feature that allows separation between data owners and data managers. Sensitive data that is stored in Microsoft SQL Server using Always Encrypted, stays encrypted even during query processing. Encryption keys remain with the data owners and are not revealed to the database engine.

Row-Level Security
Row-Level Security (RLS) in SQL Server enables you to control database access at the row level. This feature reduces your attack surface by filtering out all unauthorized access attempts, originating from any layer of your application, directly from the database. It could potentially simplify your applications, but you need to design your applications in a way that differentiates users at the database level. For example, if you have a web application that shares the same connection string for all database operations, this feature would not be applicable. This situation needs to be considered at application design time.

Cost optimization
SQL Server can be hosted on AWS through License Included (LI), as well as Bring Your Own License (BYOL) licensing models. With LI, you run SQL Server on AWS and pay for the licenses as a component of your AWS hourly usage bill. The advantage of this model is that you do not need to have any long-term commitments and can stop using the product at any time and stop paying for its usage.

However, many businesses already have considerable investments in SQL Server licenses and may want to reuse their existing licenses on AWS. This is possible using BYOL:

  1. If you have Software Assurance (SA), one of the benefits is license mobility. License mobility allows you to use your licenses on server instances running anywhere, including on Amazon EC2 instances.
  2. If you do not have SA, you can still use your own licenses on AWS. However, there are specific requirements for running SQL Server instances on dedicated hardware. Therefore, you would need to use Amazon EC2 Dedicated Hosts to ensure license compliance. AWS is the only major public cloud that provides this option.

In this whitepaper, we described a number of best practices for deploying SQL Server workloads on AWS. We saw how AWS services can be used to compliment Microsoft SQL Server features to address different requirements.

Each solution and approach may be embraced according to particular business requirements. SQL Server containers may be used along with EC2 Dedicated Hosts, in order to tackle over-subscription licensing issue. AWS Availability Zones, regions, EC2 cluster and spread placement groups, VPC peering, and inter-region AWS Direct Connect can be used to implement various solutions covering almost any type of HA requirements. Different Microsoft Windows, Linux, and SQL Server features can be used in conjunction with AWS services to create scalable infrastructures, satisfying high performance requirements.


To read full download the whitepaper:
Best Practices for Deploying Microsoft SQL Server on AWS