AlwaysOn Availability Groups and Automation

Spread the love

Last week I encountered a unique application requirement—we have a database environment configured with AlwaysOn Availability Groups for high availability (HA) and disaster recovery (DR), and our application was going to be creating and dropping databases from the server on a regular basis. So, I had to develop some code to handle this process automatically. I had a couple of things going in my favor on this—it was our custom developed application that was doing this, so I knew the procedures I was writing would be called as part of the process, and additionally, our Availability Group was only two nodes at the moment, so my version 1 code could be relatively simplistic and work. Aaron Bertrand (b|t), posted on this stack exchange thread, his code is a good start. I’m not going to put all of my code in this post—it’s ready for our prime time, but I have a few more fixed I’d like to make before releasing the code into the wild.

Dropping a Database

First of all, I need to say it’s very important to secure these procedures—they can do bad things to your environment if run out of context—particularly this one. I denied execute on everyone except the user who would be calling the procedure. I didn’t want any accidents happening. Dropping a database from an availability group is slight different then doing it from a standalone server. The process is as follows:

  1. Remove the database from the availability group (on the primary)
  2. Drop the database from the primary
  3. Drop the database from the secondary node(s)

Since, we have to initiate this from the primary instance we need to find out two pieces of data—1) what availability group are we removing the database from and 2) is the instance we are on the primary instance. In my code—I didn’t actually have to drop the database from the primary server. That piece was being called from another proc. So, I just had to remove from the availability group, and remove on the secondary. There are a number of ways to connect the secondary database—this needs to happen in SQLCMD mode, which isn’t possible in a stored procedure. We could use a linked server, or we could enable xp_cmdshell and run a SQLCMD script, and then disable xp_cmdshell. This isn’t my favorite technique from a security perspective, but I was under a time crunch, and the procedure is locked down, but in the future I will probably rebuild this with linked servers (created within the procedure)

SET QUOTED_IDENTIFIER ON

GO

ALTER PROCEDURE [dbo].[RemoveDBfromAG] @dbname VARCHAR(50)

AS

BEGIN

EXEC sp_configure 'show advanced options'

,1

RECONFIGURE

WITH OVERRIDE

EXEC sp_configure 'xp_cmdshell'

,1

RECONFIGURE

WITH OVERRIDE

DECLARE @dbid INT

DECLARE @groupid VARCHAR(50)

DECLARE @groupname VARCHAR(50)

DECLARE @server VARCHAR(50)

DECLARE @primary VARCHAR(50)

DECLARE @secondary VARCHAR(50)

DECLARE @sql NVARCHAR(max)

DECLARE @dropsecondary VARCHAR(500)

SELECT @dbid = database_id

FROM sys.databases

WHERE NAME = @dbname

SELECT @groupid = group_id

FROM sys.availability_databases_cluster

WHERE database_name = @dbname

SELECT @groupname = NAME

FROM sys.availability_groups

WHERE group_id = @groupid

SELECT @server = @@SERVERNAME

SELECT @primary = primary_replica

FROM sys.dm_hadr_availability_group_states

WHERE group_id = @groupid

SELECT @secondary = node_name

FROM sys.dm_hadr_availability_replica_cluster_nodes

WHERE node_name != @primary

SELECT @sql = 'alter availability group ' + @groupname + ' remove database [' + @dbname + ']'

SELECT @dropsecondary = 'sqlcmd -S "' + @secondary + '" -E -Q "exec ReturnsTestInstanceDropSecondaryDB [' + @dbname + ']"'

IF NOT EXISTS (

SELECT primary_replica

FROM sys.dm_hadr_availability_group_states

WHERE primary_replica = @primary

)

BEGIN

RETURN

END

ELSE

BEGIN

EXECUTE sp_executesql @sql

WAITFOR DELAY '00:00:25'

EXEC xp_cmdshell @dropsecondary

EXEC sp_configure 'xp_cmdshell'

,0

RECONFIGURE

WITH OVERRIDE

END

END

GO

The one particularly unique thing you will notice in the code—is a “WAITFORDELAY”—what I observed is that after the secondary database is removed from the availability group, it goes into recovering for about 10 seconds—and we are unable to drop a database while it’s in recovery. By implementing that wait (the T-SQL equivalent of a sleep command) the database was able to be dropped.

Adding a New Database

Adding a new database has similar requirements—we have a couple of additional factors though. We have to verify that the instance we are on, is the primary for the availability group we are adding the database into. This is where I really need to fix my code—it assumes that there are only two nodes in our availability group cluster. I need to refactor the code to potentially loop through the other four secondaries (or 7 if we are talking about SQL Server 2014). Also, I’m using a linked server connection—this also assumes that the drive letter and path to the data file directory on the secondary are also the same. To summarize, the process is as follows:

  1. Accept the availability group and database names as input parameters
  2. Verify that the node you are working on is the primary for the availability group
  3. Backup the database and transaction log of the database to file share you’d like to add to the availability group
  4. Add the database to the availability group on the primary
  5. Restore the database and transaction log to the secondary with norecovery
  6. Alter the database to set the availability group

My code for this isn’t totally flushed out—it works in my environment, but I don’t think it’s ready for sharing. I’ll share later, I promise.

Pitfalls to Avoid

This isn’t that different than just building out an availability group, but many of the same caveats apply, you need to ensure agent jobs and logins affiliated with a database are on all nodes in the availability groups. Additionally, the procedures to add and remove databases from your availability group, need to run on all of the nodes. Also, if you are doing this programmatically, the connection should use the listener, so that your application is always connecting to the primary instance.

3 thoughts on “AlwaysOn Availability Groups and Automation

  1. Dustin Jones

    Funny you posted about this, we are scoping out something similar at my organization at the moment. We have a huge multi terabyte db that is spit up in to multiple quarterly databases (2011Q1,2011Q2,…etc). All of the dbs except the active quarterly db are read only and only the active quarterly db needs to be restored in a DR situation ASAP to bring the system online. The read only dbs can be restored from tape as time permits in a real DR. We were thinking of putting together an AG group that only mirrors the active quarterly DB to another replica that is hooked up to replicated storage. Obviously automation would need to be written to rotate in a new quarterly db, rotate the old out. I’m looking forward to your follow up post.

    Reply

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.