Monitoring Availability Groups—New Tools from SolarWinds

As I mentioned in my post a couple of weeks ago, monitoring the plan cache on a readable secondary replica can be a challenge. My customer was seeing dramatically different performance depending on whether a node was primary or secondary. As amazing as the Query Store in SQL Server 2016 is, it does not allow you to view statistics from the readable secondary. So that leaves you writing XQuery to mine the plan cache DMVs for the query information you are trying to identify.
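If you end up going that route, a minimal sketch of the pattern looks something like this: shred the cached plan XML with XQuery to find plans that touch a particular object. The table name here is just a placeholder for whatever you are hunting, and this can be an expensive query on a busy instance, so filter as much as you can.

-- Find cached plans that reference a particular table (placeholder name)
WITH XMLNAMESPACES (DEFAULT 'http://schemas.microsoft.com/sqlserver/2004/07/showplan')
SELECT st.text, cp.usecounts, qp.query_plan
FROM sys.dm_exec_cached_plans cp
CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE qp.query_plan.exist('//Object[@Table="[MyProblemTable]"]') = 1;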

My friends at SolarWinds (Lawyers: see disclaimer at bottom of post) have introduced version 11.0 of Database Performance Analyzer (DPA, a product you may remember as Ignite), which has full support for Availability Group monitoring. As you can see in the screenshot below, DPA gives a nice overview of the status of your AG, and also lets you dig into the performance on each node.

image

There are a host of other features in the new releases, including some new hybrid features in their flagship product, Orion. A couple jumped out at me: DPA now supports Amazon RDS and Azure SQL Database, and there is some really cool correlation data that lets you compare performance across your infrastructure. So when you, the DBA, are arguing with the SAN, network, and VM teams about the root cause of a performance problem, this tool can quickly isolate it. With less fighting. These are great products, so give them a look.

Disclaimer: I was not paid for this post, but I do paid work for SolarWinds on a regular basis.

Query Store and Availability Groups—Force Plan on Secondary Replicas

I’m still fighting with some challenges around inconsistent performance between a primary and secondary replica, so I’ve been waist deep in undocumented system views looking at temporary statistics. One of the things I thought about doing was taking advantage of the Force Plan option in the Query Store in SQL Server 2016. If you are not familiar with this feature, it allows you to force a "preferred" execution plan. In this scenario, our query was running in about 20-30 seconds on the primary, and 20-30 minutes on the secondary. The plans were reasonably close, but I wanted to see what would happen if I forced a plan on the primary.
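If you haven’t used plan forcing before, it is just a matter of finding the query_id and plan_id in the Query Store catalog views and calling a system procedure. A minimal sketch (the search string and the ids are placeholders for your own query):

-- Find the query and the plan you want to force
SELECT q.query_id, p.plan_id, qt.query_sql_text
FROM sys.query_store_query q
JOIN sys.query_store_query_text qt ON qt.query_text_id = q.query_text_id
JOIN sys.query_store_plan p ON p.query_id = q.query_id
WHERE qt.query_sql_text LIKE '%MyProblemQuery%';

-- Force the chosen plan for that query
EXEC sys.sp_query_store_force_plan @query_id = 42, @plan_id = 7;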

Primer about the Query Store and Availability Groups

Since readable secondary replicas are read-only, the query store on those secondary replicas is also read-only. This means runtime statistics for queries executed on those replicas are not recorded into the query store; all the stats there are from the primary replica. However, I wasn’t sure what would happen if I forced a plan on the primary—would the secondary replica honor that plan?
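(As an aside, a quick way to see the read-only state for yourself is to query sys.database_query_store_options in the user database on the secondary replica.)

SELECT actual_state_desc, desired_state_desc, readonly_reason
FROM sys.database_query_store_options;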

Let’s Find Out

The first thing I did was to query the query store catalog views to verify that the plan was forced.
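The check itself is simple enough; is_forced_plan in sys.query_store_plan flips to 1 once the plan is forced. Something along these lines works:

SELECT plan_id, query_id, is_forced_plan
FROM sys.query_store_plan
WHERE is_forced_plan = 1;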

image

I have two copies of the forced plan. If I run an estimated query plan on the primary, I see that the plan is forced. You can see this by looking for UsePlan in the XML of the plan.

image

I did the same thing on the secondary (in the case of the secondary, we are looking at the actual plan, but it doesn’t matter).

image

You will note that there is no UsePlan. There are extended events and a catalog view that reflect plan forcing failure (Grant Fritchey wrote about this behavior here). While I wouldn’t expect the catalog view to get updated, I was hoping that the extended event might fire. It did not.
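If you want to look for this yourself, the relevant bits are the force-failure columns in sys.query_store_plan and an extended events session on the plan forcing failure event (the event name below is my best recollection; it lives in the qds package):

-- Catalog view columns that should reflect a forcing failure
SELECT plan_id, is_forced_plan, force_failure_count, last_force_failure_reason_desc
FROM sys.query_store_plan
WHERE is_forced_plan = 1;

-- Extended events session for plan forcing failures
CREATE EVENT SESSION [QS_ForcingFailures] ON SERVER
ADD EVENT qds.query_store_plan_forcing_failed
ADD TARGET package0.ring_buffer;

ALTER EVENT SESSION [QS_ForcingFailures] ON SERVER STATE = START;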

Conclusion

The query store, as awesome as it is, doesn’t really do much for you on a readable secondary replica. It does not force plans there, nor does it record any of your runtime data.

Thanks to Grant Fritchey and Erin Stellato for helping with this post!

SQL Server on Linux Clustering—A Few Other Notes

So I was chatting with fellow MVP Allan Hirt (b|t) about the cluster build that I wrote about yesterday, and I had a few more realizations about the Linux HA process as it stands right now. I haven’t talked to the Linux product team at Microsoft about this, but I hope to in the near future to get a better idea of where things are headed. So these are my notes as of now, strictly relating to failover cluster instances (FCIs); AlwaysOn Availability Groups are coming, but are not in the latest CTPs of SQL Server on Linux.

It was faster than building a Windows cluster

It took me a while, I laughed, I cried, I cursed a lot, but if I look at the time it took for me to actually build the cluster and install SQL Server, it was a much faster process. Much of this comes down to the efficiency of the SQL Server installation process on Linux, which is as simple as running yum install mssql-server (mostly). Which leads me to my next point…

Installation options would be nice

The cluster building process is a little kludgy. Basically, you install two standalone instances of SQL Server, then remove the data files from one of them and copy them into your NFS share. Having the option to do the equivalent of an "Add Node" install would mean you wouldn’t need to worry about cleaning up your second node.

There’s no cluster validation, explicitly

This is a bit scarier, or easier, depending on your viewpoint. There are tests at various parts of the process to make sure things are working. For example, the first step of building your Linux cluster is to authorize the nodes to take part in the cluster, which validates certain security and network settings. However, the storage validation consists of starting and stopping SQL Server on each node to make sure it can talk to the storage and start up. Given that Microsoft doesn’t own the clusterware for this solution, I’m not sure how much they can enhance that, or if they will. This is a good open question.

There’s no DNS

(Happy Late Birthday Kris!) One interesting thing I realized after talking to Allan was that I did all of my networking setup through the /etc/hosts file on each individual node. I remember doing this for RAC, and I think it may be a requirement of Pacemaker, but you will still want to make a DNS entry for your cluster identifier. When you do this on Windows, if you are using Active Directory for DNS, the installation does this for you. Not so on Linux; you will need to do this yourself.


Get comfortable with command line and scripting

There’s no cluster wizard to get you through the process. I think this isn’t a huge deal—Denny and I were talking yesterday about how relatively easy it would be to script the whole process in bash (I’m holding off until I find out if Microsoft is doing this), and most Linux sysadmins are really comfortable with writing bash scripts. But if you aren’t comfortable with Linux and the command line, now is the time to brush up, before things go prod.

Summary

We are in the very early days of this process, and there is much that will likely change. From a functional and conceptual perspective, this is very similar to the way a SQL Server failover cluster works in Windows, but the implementation is quite different. I’d like to see things resemble Windows a bit more, at least from a SQL Server perspective, but we’ll see where the product heads.

SQL Server on Linux–Clustering

First, a word of warning on this post: if you are reading it and it isn’t January of 2017, I suspect things may have changed significantly in the months going forward.


So I did it: I built a SQL Server cluster on Linux. The process is documented here on BOL. I’m not going to walk you through it (I’ll probably do that in a later post); I just wanted to mention some things I ran into during the build. First, I did this using VMware Fusion on my Mac, but I think any virtualization platform that allows virtual networks should work. Secondly, even though BOL says you need Red Hat Enterprise Linux (and you do if you are doing this in prod and require support), I was able to do all of this on CentOS, which is the free-as-in-beer version of RHEL.

In my scenario, I built three VMs: one to serve as an NFS server, and the other two to be my SQL Servers. Currently, there is no clustered version of the install; it’s the standard installation for standalone SQL Server on Linux, and you then point SQL Server at the NFS mount you created, which serves as your shared storage. I had an initial permissions problem writing my data files there—I did a bad thing on the NFS server and opened up the directory to the world (777), and was then able to copy files there. I’ll follow up on that.

One other thing that wasn’t in BOL, and that I had to troubleshoot my way through: just like a cluster on Windows, you have a cluster identifier and a floating IP address. I had to add those to /etc/hosts on each of my nodes to get the name to resolve. The article mentions turning off fencing for non-prod environments—I had to do that in order to get failover working correctly in my environment.

Finally, failover was a bit wonky at first, and I had to spend too much time troubleshooting an odd problem. I wrote a Connect item for it: SELECT @@SERVERNAME and SELECT name FROM sys.servers return the name of the host, and not the cluster name. I’m sure the team will fix this in the near future.

What Happens to Pages in the Buffer Pool When Your Availability Group Fails Over?

Recently at SQL Saturday Philadelphia, we started discussing failover as it relates to mirroring and Always On Availability Groups. Specifically, we were wondering what would happen if you had a relatively busy readable secondary replica (which would have a lot of pages in the buffer pool on the secondary instance), and whether those pages would be flushed from cache on failover. So I reached out to the product group, and Kevin Farlee from Microsoft was extremely helpful:

“Pages in the buffer pool are still valid, as they are updated by the redo thread. By the time the secondary has transitioned to primary, all pending updates will have been applied, so pages in the bufferpool will all have correct contents. Note that the set of pages in bufferpool may be different between primary and secondary so you may have some cache warm up to do.”

So in a nutshell, if you have a busy readable secondary, your cache won’t take much time to warm up, since the hot pages are already there.
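If you want to see just how warm that cache is on each replica, a quick check against sys.dm_os_buffer_descriptors shows how many pages each database has in memory (it can be a little heavy on a server with a very large buffer pool, so be gentle). Run it on both replicas and compare:

SELECT d.name AS database_name,
    COUNT(*) AS pages_in_buffer_pool,
    COUNT(*) * 8 / 1024 AS buffer_pool_mb
FROM sys.dm_os_buffer_descriptors bd
JOIN sys.databases d ON d.database_id = bd.database_id
GROUP BY d.name
ORDER BY pages_in_buffer_pool DESC;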

Getting Started with Azure Hybrid Models—Point to Site Networking

I’ve done several presentations on hybrid disaster recovery (DR) using SQL Server and Windows Azure, and configuring the network has been a headache every time. Some SQL Server DR methods do not require a direct network connection (log shipping, mirroring, backups) and may be secured using certificates for encryption, but more advanced configurations require a VPN connection. Also, if domain services are required (either to connect back to an on-premises Windows Domain Controller (DC) (good) or to replicate your on-premises Domain Controller to a read-only DC (best)), you will need a VPN connection.

If you would like a little background on networking in Azure, I highly recommend Mark Russinovich’s (b|t) session from TechEd last year on Windows Azure internals.

There are three types of direct connections to Windows Azure:

  1. Site to Site Network—an approved VPN device in your DMZ connects via VPN (encrypted, but not dedicated) to a network gateway in Windows Azure. This is the best scenario for low-volume, but regularly used, connections
  2. Point to Site Network—the focus of this post; a certificate-based VPN allowing individual machines to connect directly to the network gateway in Windows Azure, using a certificate and a VPN client. Each machine must have the certificate. This is designed for demos and for admin users who may need to connect directly to Azure virtual networks
  3. ExpressRoute—this is the true enterprise-class solution for connecting to Windows Azure. You can connect either at your ISP (participating ISPs are on this list) or via a dedicated MPLS VPN from your WAN to Windows Azure. This guarantees a fixed amount of bandwidth and performance.

Figure 1 ExpressRoute Networking — http://www.windowsazure.com/en-us/services/expressroute/

Configuring Point to Site Networking

So if you are reading this, I am going to assume you have a Windows Azure account—if you don’t, get one; there are lots of interesting things to play with. Connect to the Azure management portal and scroll down to Networks.

Figure 2 Add new network in Windows Azure

From that menu select Virtual Network > Custom Create. You will be prompted to give your network a name and to select an existing affinity group or create a new one. I’m not going to write about affinity groups today, but they are an important part of Windows Azure—you can read more about them here.

Optionally, you can add a DNS server name and address. If you do not, Windows Azure will provide DNS services for your VMs. In my experience, if you want your VMs to connect to the internet, do not set up a DNS server (this mainly applies to point to site demo work—if you are doing this in prod, set up a DNS server). There is also an option to configure point to site networking—check this box, and if you have a real VPN device, you may also check site-to-site.

Figure 4 Configuring Point to Site VPN in Network

The next screen just covers your IP address space; for limited demos you can accept the default and click next. Finally, click the "Add Gateway Subnet" option.

Figure 5 Add Gateway Subnet

You now have a new network, but with no VPN gateway to connect to. From the network dashboard click the Create Gateway button on the bottom of the screen. This will take about 20-25 minutes—don’t worry, there is other stuff to do.

Figure 6 Network without gateway

You have two other tasks. The first is to create an Azure VM that is on this network (see the screenshot below—you can choose the network on the third screen of the VM creation process).

Figure 7 Creating a new Azure VM

While your VM is being created, you will need to create your certificates, both to upload to your Azure gateway and to install on the client machine. In my opinion, this isn’t clear in the Microsoft documentation, so I’ll draw a picture.

So you will create the root certificate on the main machine you want to connect to the VPN (in my case it is my on-premises Domain Controller) and then upload it to your Azure network.

Figure 8 Cert uploaded to network

You will also create a client certificate using this root certificate. You will install the client certificate on any machine you want to connect to your VPN—do not recreate and upload a new root certificate for each machine that you want to connect. In order to create the certificates, you will need makecert.exe, which you can get as part of the Windows SDK. For the detailed instructions on creating the certificates, go to this MSDN page and find "Create a self-signed root certificate." Execute those instructions all the way through "Install the client certificate."

Now you are ready to connect. On your gateway page (it has probably finished creating by now), on the right side of the screen you will see "Download the 64-bit Client VPN Package" (you aren’t really using a 32-bit OS, are you?)—click that link. It’s an exe that is not signed by Microsoft, so Windows will yell at you for installing it. It will install quickly, and if you expand the network pane in Windows, you will see a new connection.

Figure 9 Connect to Azure VPN

Click connect, and then you will see…

Figure 10 Azure VPN Connection

If you run ipconfig, you will see the connection to Azure.

Figure 11 IP Record from Azure VPN

One warning here—I’ve had this happen with a client, and it happened to me in the office. When I connect the VPN, I can’t get to the internet on any of my connections. Oddly, this has only happened on corporate networks, and hasn’t happened when I’ve been on the same connection through a virtual machine on my laptop. More to come later.

Update: This behavior is tied to Cisco firewall equipment. I will address in a follow up post.

Now (if you’ve turned Windows Firewall off) you can ping your VMs, and get started using SQL Server HA and DR functions with them.

Into the Blue—Building DR in Windows Azure

Tomorrow (at 1300 EDT/1800 GMT), I’ll be presenting to the PASS High Availability and Disaster Recovery VC on one of my favorite topics of late—building disaster recovery from your data center into Windows Azure Infrastructure as a Service. You can register for the session here. While I’m not screaming "cloud, cloud, cloud" from the rooftops, I do feel like DR is a really great point of entry to the cloud for many smaller organizations, who may lack the resources to have a second data center, or who perhaps only need their applications to be highly available during certain parts of the year. Additionally, the ability to back up SQL Server databases directly to Azure Blob Storage can also meet your offsite backup requirements with very few headaches.
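If you haven’t tried backup to Azure Blob Storage yet, the T-SQL side of it is about as simple as it gets: you create a credential holding the storage account name and access key, and then back up to a URL. A rough sketch (the storage account, container, and database names are placeholders):

CREATE CREDENTIAL AzureBackupCred
WITH IDENTITY = 'mystorageaccount',
SECRET = '<storage account access key>';

BACKUP DATABASE AdventureWorks2012
TO URL = 'https://mystorageaccount.blob.core.windows.net/backups/AdventureWorks2012.bak'
WITH CREDENTIAL = 'AzureBackupCred', COMPRESSION, STATS = 10;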

The cloud isn’t all unicorns and rainbows, however—there are some definite challenges in getting these solutions to work properly. I’ll discuss that, along with the following, in this session:

  • Choosing the right DR solution
  • Networking and the cloud
  • Economics of cloud DR
  • Implementation concerns for AlwaysOn Availability Groups

I hope you can join me on Tuesday.

AlwaysOn Availability Groups and Automation

Last week I encountered a unique application requirement—we have a database environment configured with AlwaysOn Availability Groups for high availability (HA) and disaster recovery (DR), and our application was going to be creating and dropping databases from the server on a regular basis. So I had to develop some code to handle this process automatically. I had a couple of things going in my favor: it was our custom-developed application doing this, so I knew the procedures I was writing would be called as part of the process, and additionally our availability group was only two nodes at the moment, so my version 1 code could be relatively simplistic and still work. Aaron Bertrand (b|t) posted on this Stack Exchange thread, and his code is a good start. I’m not going to put all of my code in this post—it’s ready for our prime time, but I have a few more fixes I’d like to make before releasing the code into the wild.

Dropping a Database

First of all, I need to say it’s very important to secure these procedures—they can do bad things to your environment if run out of context, particularly this one. I denied execute to everyone except the user who would be calling the procedure; I didn’t want any accidents happening. Dropping a database from an availability group is slightly different than doing it from a standalone server. The process is as follows:

  1. Remove the database from the availability group (on the primary)
  2. Drop the database from the primary
  3. Drop the database from the secondary node(s)

Since we have to initiate this from the primary instance, we need to find out two pieces of data: 1) which availability group we are removing the database from, and 2) whether the instance we are on is the primary. In my code, I didn’t actually have to drop the database from the primary server—that piece was being called from another proc—so I just had to remove the database from the availability group and drop it on the secondary. There are a number of ways to connect to the secondary; this needs to happen in SQLCMD mode, which isn’t possible in a stored procedure. We could use a linked server, or we could enable xp_cmdshell, run a SQLCMD script, and then disable xp_cmdshell. This isn’t my favorite technique from a security perspective, but I was under a time crunch and the procedure is locked down; in the future I will probably rebuild this with linked servers (created within the procedure).

SET QUOTED_IDENTIFIER ON
GO

ALTER PROCEDURE [dbo].[RemoveDBfromAG] @dbname VARCHAR(50)
AS
BEGIN
    -- Enable xp_cmdshell so we can call sqlcmd against the secondary
    EXEC sp_configure 'show advanced options', 1
    RECONFIGURE WITH OVERRIDE

    EXEC sp_configure 'xp_cmdshell', 1
    RECONFIGURE WITH OVERRIDE

    DECLARE @dbid INT
    DECLARE @groupid VARCHAR(50)
    DECLARE @groupname VARCHAR(50)
    DECLARE @server VARCHAR(50)
    DECLARE @primary VARCHAR(50)
    DECLARE @secondary VARCHAR(50)
    DECLARE @sql NVARCHAR(MAX)
    DECLARE @dropsecondary VARCHAR(500)

    SELECT @dbid = database_id
    FROM sys.databases
    WHERE NAME = @dbname

    -- Which availability group does this database belong to?
    SELECT @groupid = group_id
    FROM sys.availability_databases_cluster
    WHERE database_name = @dbname

    SELECT @groupname = NAME
    FROM sys.availability_groups
    WHERE group_id = @groupid

    SELECT @server = @@SERVERNAME

    -- Who is the primary, and (since we are only two nodes) who is the secondary?
    SELECT @primary = primary_replica
    FROM sys.dm_hadr_availability_group_states
    WHERE group_id = @groupid

    SELECT @secondary = node_name
    FROM sys.dm_hadr_availability_replica_cluster_nodes
    WHERE node_name != @primary

    SELECT @sql = 'alter availability group ' + @groupname + ' remove database [' + @dbname + ']'

    SELECT @dropsecondary = 'sqlcmd -S "' + @secondary + '" -E -Q "exec ReturnsTestInstanceDropSecondaryDB [' + @dbname + ']"'

    -- Only proceed if this instance is the primary replica for the group
    IF NOT EXISTS (
        SELECT 1
        FROM sys.dm_hadr_availability_group_states
        WHERE group_id = @groupid
            AND primary_replica = @server
        )
    BEGIN
        RETURN
    END
    ELSE
    BEGIN
        EXECUTE sp_executesql @sql

        -- The database sits in recovery for a few seconds after it leaves the AG;
        -- wait it out before asking the secondary to drop it
        WAITFOR DELAY '00:00:25'

        EXEC xp_cmdshell @dropsecondary

        EXEC sp_configure 'xp_cmdshell', 0
        RECONFIGURE WITH OVERRIDE
    END
END
GO

The one particularly unique thing you will notice in the code is a WAITFOR DELAY. What I observed is that after the database is removed from the availability group, it goes into recovery on the secondary for about 10 seconds, and we are unable to drop a database while it’s in recovery. By implementing that wait (the T-SQL equivalent of a sleep command), the database was able to be dropped.

Adding a New Database

Adding a new database has similar requirements—we have a couple of additional factors, though. We have to verify that the instance we are on is the primary for the availability group we are adding the database into. This is where I really need to fix my code—it assumes that there are only two nodes in our availability group cluster. I need to refactor the code to potentially loop through up to four secondaries (or eight if we are talking about SQL Server 2014). Also, I’m using a linked server connection, and this assumes that the drive letter and path to the data file directory are the same on the secondary. To summarize, the process is as follows:

  1. Accept the availability group and database names as input parameters
  2. Verify that the node you are working on is the primary for the availability group
  3. Back up the database and transaction log of the database you’d like to add to the availability group to a file share
  4. Add the database to the availability group on the primary
  5. Restore the database and transaction log to the secondary with NORECOVERY
  6. Alter the database on the secondary to set the availability group
My code for this isn’t totally fleshed out—it works in my environment, but I don’t think it’s ready for sharing. I’ll share it later, I promise.
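In the meantime, here is a rough sketch of the bare T-SQL behind steps 3 through 6. This is not my procedure, just the commands it wraps, assuming a two-node group called AG1, a database called NewDB, and a backup share both nodes can read (all placeholder names):

-- On the primary: back up the database and log, then add the database to the group
BACKUP DATABASE [NewDB] TO DISK = N'\\fileshare\backups\NewDB.bak' WITH INIT;
BACKUP LOG [NewDB] TO DISK = N'\\fileshare\backups\NewDB.trn' WITH INIT;
ALTER AVAILABILITY GROUP [AG1] ADD DATABASE [NewDB];

-- On the secondary: restore with NORECOVERY, then join the database to the group
RESTORE DATABASE [NewDB] FROM DISK = N'\\fileshare\backups\NewDB.bak' WITH NORECOVERY;
RESTORE LOG [NewDB] FROM DISK = N'\\fileshare\backups\NewDB.trn' WITH NORECOVERY;
ALTER DATABASE [NewDB] SET HADR AVAILABILITY GROUP = [AG1];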

Pitfalls to Avoid

This isn’t that different than just building out an availability group, and many of the same caveats apply: you need to ensure that agent jobs and logins affiliated with a database exist on all nodes in the availability group. Additionally, the procedures to add and remove databases from your availability group need to exist on all of the nodes. Also, if you are doing this programmatically, the connection should use the listener, so that your application is always connecting to the primary instance.

Disaster Recovery Starts with Backup and Recovery

I present and blog a lot about high availability and disaster recovery solutions, and in doing so I get to talk to a lot of folks about different strategies. Depending on your business needs and regulatory requirements, these can vary greatly in cost and complexity. However, no matter your DR solution, it is imperative that you have a sound backup strategy and that you test those backups on a regular basis.

I recently took part in an architectural review of several important applications. The reason for the review is that customer teams were asking for a real-time DR solution for systems that were producing multiple terabytes of transaction volume daily. This is possible, but only at great cost, so in order to craft a better solution we started asking for details around the current strategy, and to get a better understanding of the business requirements around their data. When doing so, it came out that multiple application teams had never tested a restore of their backups. Excuse my fonts here…

If you aren’t regularly testing your restores, save the drive space and don’t

back anything up

Ok, rant over—sort of. At a conference I had the pleasure of attending this spring, an infrastructure executive from Goldman Sachs was presenting about how they had zero downtime during Hurricane Sandy. Granted, it is Goldman Sachs, and their IT budget is some-huge-number billion dollars, but several things she said really stood out. In addition to the extensive risk analysis work that GS had done, they regularly tested failovers and restores in all systems (compute, power, cooling, and generation). That’s by far the most important thing you can do from a DR perspective. Even if you are not physically doing a test (you really should), get all of the teams (database, sysadmin, network, application) into one room and work out all of the moving parts and what needs to happen in the event of a disaster.
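And when you do test physically, it doesn’t have to be fancy; even an automated restore to a scratch instance followed by a CHECKDB goes a long way. A minimal sketch, with placeholder paths, logical file names, and database names:

-- Quick check that the backup file is readable
RESTORE VERIFYONLY FROM DISK = N'\\backupshare\prod\SalesDB.bak';

-- The real test: restore it somewhere disposable and run CHECKDB against it
RESTORE DATABASE [SalesDB_RestoreTest]
FROM DISK = N'\\backupshare\prod\SalesDB.bak'
WITH MOVE 'SalesDB' TO N'D:\RestoreTest\SalesDB_RestoreTest.mdf',
    MOVE 'SalesDB_log' TO N'D:\RestoreTest\SalesDB_RestoreTest_log.ldf',
    STATS = 10;

DBCC CHECKDB ('SalesDB_RestoreTest') WITH NO_INFOMSGS;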

Lastly, I know it’s hard to get resources to do test restores—enterprise storage isn’t cheap. However, there are many options you can leverage if you don’t have space in your primary location for testing:

  • Amazon Web Services—Take advantage of the cloud; DR, in my opinion, is one of the best use cases for the cloud. You can fire up a server, use it for testing, and then blow it away. You have to get your data there, and that can be painful for large data sets, but it’s not a bad solution.
  • Real Hardware—I know enterprise storage is pricey, but a lab server with some relatively slow but dense storage isn’t. You can build a really effective restore testing environment for less than $10,000, and well under that if you are willing to buy older, unsupported hardware on eBay.
  • Offsite Recovery Services—There are a number of firms that provide offsite recovery services (for example, Sungard). However, this option tends to be extremely expensive, as they guarantee hardware for you in the event of a disaster, and as part of that guarantee you are granted testing time.

I can’t express my feelings on this strongly enough—particularly if you are working with large data sets, with lots of opportunities for failure, it is absolutely critical to test those systems. Otherwise, don’t plan on being able to restore when you need to.

Group Managed Service Accounts

One of the really enjoyable things about my current job is being able to work on some of the latest technology as it comes out. My team owns the infrastructure build processes, so I get to do fun things like building proof-of-concept environments, SAN testing, and working with new versions of things. We’ve been spending a lot of time working with some of the new features of Windows Server 2012, in an effort to get it ready for production rollout. We’re ready to go, but we are still waiting on our antivirus team.

For those of you who work in organizations with any sort of regulatory concern (which is pretty much any public corporation), you know that managing the accounts that own SQL Server services can be a bit of a nightmare. You have to maintain passwords, and if a password changes at any point in time, it’s a very manual process to change it on each of your servers. It’s best security practice to change these passwords on a regular basis, and particularly if a member of the DBA team leaves. However, changing this password requires touching every server in some fashion—you can automate this process, but it’s still kind of painful. Also, if that account ever gets locked out, it can be really painful.

Starting with Windows Server 2008 R2, the concept of Managed Service Accounts came into play. These were accounts with Active Directory internal passwords—set to change on a regular basis, but controlled by the system. The best part of this was that AD did the password change and propagated it out to the server. The problem with this for large environments was that there was a 1:1 ratio of accounts to servers—which meant we could never use it on a SQL Server cluster, and it wasn’t necessarily practical (though a best security practice) to have a domain account for each SQL Server in your environment.

With Windows Server 2012, this concept has been expanded to Group Managed Service Accounts (gMSAs), where one account can be used by multiple servers. This makes it suitable for clusters* (the * is for AlwaysOn Availability Groups—more on that later), which is very important in my organization, as we are heavily clustered. Note: this process includes a number of steps that will require your Active Directory team, and the initial steps actually have to take place on a domain controller. I highly recommend showing this post to your Windows admin—it was an excellent resource.

The prerequisites for this are as follows:

  • The Active Directory schema in the gMSA domain’s forest has to be updated to Windows Server 2012 in order to create a gMSA (note this can be done on a 2008/2008R2 Domain Controller). Send your Admin here.
  • The Key Distribution Services (KDS) root key needs to be created. A domain admin needs to do this on a domain controller. See here.
  • You have to wait 10 hours after creating this key—really you do. You can advance the time in your domain for testing, but don’t do that.
  • The target servers for the gMSA need to be running Windows Server 2012.

So after your domain admin has created the key, it’s time to create the gMSA. Run the following PowerShell command as admin (it may need to be run by your domain admin):

New-ADServiceAccount gmsa1 -DNSHostName membersrv1.domain.tld -PrincipalsAllowedToRetrieveManagedPassword ShortGroupName

Now you have your Group Managed Service Account:


Here’s the AD Information on it:

One thing I didn’t mention before is setting up groups. This is going to depend on the size of your environment and your management philosophy. In our case, I’m thinking about creating three groups—production, non-production, and mission critical. Some shops may create one for each cluster or server; it really depends on your needs.

Also, by default, a gMSA will change its password every 30 days—this is modifiable, but since it’s a system-managed password, I generally don’t want to mess with it. The beautiful thing about this is that when AD changes the password, you don’t have to touch your SQL Servers at all. It just works.

One interesting note: you will need to add the Active Directory Domain Services role to your SQL Servers. This enables the PowerShell command you need to install the service account on each of your servers. You can do this by issuing the following PowerShell command:

Install-WindowsFeature -Name AD-Domain-Services -ComputerName $servername -IncludeManagementTools

After this, you need to install the account on each server, once again using PowerShell:

Install-ADServiceAccount gMSA1

After that, when installing SQL Server, just search for the account in Active Directory. You need to make sure the "Service Accounts" object type is checked off.

I’ve selected the Service Account—note that the password box is greyed out. You don’t enter the password (since only AD knows what it is).

I know a password shows up in Configuration Manager—don’t mess with this; it’s managed exclusively by Active Directory.

I have tested this configuration on standalone instances and SQL Server Failover Cluster Instances (only in SQL Server 2012, however), and it works fine. AlwaysOn Availability Groups, however, act a little weird. I will follow up with another blog post on this, but initially everything configured fine; when I tried to fail over, I got two really ugly SSPI errors, and then it just worked. I haven’t seen a statement from Microsoft on support for this feature, so stay tuned.
