Recovering From a Deleted Cluster Resource

Spread the love

I recently had the joy of hearing about a security issue on one of our SQL Server Failover Cluster instances, in an attempt to resolve the problem, someone inadvertently deleted the clustered SQL service from Failover Cluster Manager. Why? I have no idea—but I set about trying to figure out what to do to recover from it. Note—all of these screen shots are from Windows Server 2012 and SQL Server 2012, but the same technique applies in 2008R2—I don’t have a 2003 Cluster to test on, and I think the procedure may be a little more manual there (you may have to make a registry entry).

So how do we delete the service?—easy, right click on it and select remove.

Windows will try to warn us, but for the purposes of science we will continue:

After we click yes, the cluster deletes the service, and SQL Server goes to sad panda mode. This happens because the SQL Server was dependent on our Virtual Server Name, which disappeared when we deleted the service. I’m not sure why the Agent looks like it’s still running—it to is down.

A quick look over at our Domain Controller, shows that our cluster object (SQL2012CL1) is still in AD, so no recovery is needed there—so we don’t need to recover it. However, our DNS entry is gone.

So where do we start the repair process? Select Roles in Failover Cluster Manager, and then select Configure Role

Select the role “Other Server”

Set your Client Access Point—use the same name and IPAddress as your previous service.

Select your applicable disks—this is my demo instance, so I only have one cluster LUN.

Here’s where it starts to get good. We select the SQL Server and SQL Server Agent resource types.

It looks like we may be done—we aren’t.

This is where we need to set our dependencies and properties. So first right click on “New SQL Server” and select Properties

SQL is going to be dependent on the network name, the IP address, and the Disk that our cluster uses all being online.

So we add those—next go to properties in that same window. We need to set VirtualServerName and InstanceName to their correct values.

Lastly, I set the service name to the right value, but I don’t think it matters.

We need to do a similar process for the SQL Agent. It is only dependent on the SQL Server service.

We also need to define the parameters for it.

Then we can try to start everything up and yes, it’s all green!!

Hopefully this never happens to you, and it shouldn’t but in the event that it does, you now know what to do.

23 thoughts on “Recovering From a Deleted Cluster Resource

  1. matt

    hi
    I have a strange issue.
    While playing with a SQL 2008R2 test cluster environment to resolve a problem we had in production, I finished (don’t know how and don’t know why, I think deleting some resources or mapping/hiding some iSCSI LUNs) with a SQL instance perfectly running, but out of the cluster.
    I can see disks under available cluster storage, but I’m not able to re-add all the resources to create a new clustered service.
    If I try to add IP and Client Access resource, it tells me that the name and address are yet in use on the network, If I try to add storage (that I see as reserved into disk manager) it fails with strange unknown error or something like “cannot write on an existing file”…
    If I try to add a new Failover Cluster SQL node through setup, it finds the running instance but cannot add it.
    If i try to create new Failover Cluster SQL node, it fails saying the resource yet exists…

    The question is:
    Is it possible to cluster a yet running SQL instance (as I do, for example, with a Hyper-v virtual machine running on a local Hyper-v Manager)?

    Thanks

    Reply
  2. Magoo

    Very helpful article. Got a deleted SQL resource group back up and running in no time after I managed to accidently delete it. Luckily it wasn’t the production instance 😀

    Reply
  3. Jr

    Hi All, I have a similar problem however I recreated the cluster after upgrading to windows 2012 r2 and I was running sql 2012 cluster before the inplace grade, now I am getting this error message.

    Error: 17182, Severity: 16, State: 1.
    2013-10-23 09:22:41.59 spid14s TDSSNIClient initialization failed with error 0x5, status code 0xa. Reason: Unable to initialize the TCP/IP listener. Access is denied.
    any ideas ?

    Reply
    1. jdanton1

      Could be a couple of things–I’d do a netstat to verify that no other process is listening on port 1433 (assuming you are running SQL there). I’d also make sure both cluster networks can ping each other. This forum post may also shed some light on the topic.

      http://social.msdn.microsoft.com/Forums/sqlserver/en-US/ac756875-eee4-422e-8701-780d519d01c0/sql-20052008-on-windows-server-2008-r2-cluster-service-only-starts-when-account-is-local-admin?forum=sqlsecurity

      Reply
  4. Pingback: Don’t $%^&ing Delete Your SQL Server Cluster Resource | The SQL Herald

  5. Michael

    This helped me out hugely this afternoon.

    In my instance even though the SQL Role was deleted from cluster management console, and the resources no longer visible, I couldn’t re-add the role as it said it was already in use.

    Sure enough running get-clusterresource in powershell, SQL server, agent, IP, etc. were all still listed. Had to remove-clusterresource to remove them all, then was able to re-setup the role fine following this guide. We also had a couple of scripts, and file server as part of the role. I didn’t remove those, just did a move-clusterresource to add them back to the newly created SQL Role.

    Thanks man!

    Reply
  6. Zolcsi

    Great article, thanks bro’! After that, my sql server and agent resources didn’t come to online, I had to run sql setup, and repair. I owe you <3

    Reply

Leave a Reply to PoojaCancel reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.