Query Store and Availability Groups—Force Plan on Secondary Replicas

I’m still fighting with some challenges around inconsistent performance between a primary and secondary replica, so I’ve been waist deep in undocumented system views looking at temporary statistics. One of the things I thought about doing was taking advantage of the Force Plan option in the Query Store in SQL Server 2016. If you are not familiar with this feature, it allows you to force a “preferred” execution plan. In this scenario, our query was running in about 20-30 seconds on the primary, and 20-30 minutes on the secondary. The plans were reasonably close, but I wanted to see what would happen if I forced a plan on the primary.

A Primer on the Query Store and Availability Groups

Since readable secondary replicas are read-only, the query store on those secondary replicas is also read-only. This means runtime statistics for queries executed on those replicas are not recorded into the query store; all of the stats you see there come from the primary replica. However, I wasn’t sure what would happen if I forced a plan on the primary—would the secondary replica honor that plan?
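
If you want to check this on your own system, the Query Store state is exposed in sys.database_query_store_options; a query along these lines (my own quick check, not something from the original troubleshooting) shows whether it is writable on a given replica:

-- Run in the user database on each replica; on a readable secondary
-- this should report READ_ONLY.
SELECT    actual_state_desc,
    desired_state_desc,
    readonly_reason
FROM    sys.database_query_store_options;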

Let’s Find Out

The first thing I did was to query the query store catalog views to verify that the plan was forced.
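
I won’t reproduce my exact query, but it was along these lines, using the documented Query Store catalog views:

-- Any plan with is_forced_plan = 1 has been forced through the Query Store.
SELECT    q.query_id,
    p.plan_id,
    p.is_forced_plan,
    qt.query_sql_text
FROM    sys.query_store_query q
    INNER JOIN sys.query_store_plan p ON p.query_id = q.query_id
    INNER JOIN sys.query_store_query_text qt ON qt.query_text_id = q.query_text_id
WHERE    p.is_forced_plan = 1;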

[image: Query Store catalog view output showing the forced plan]

I have two copies of the forced plan. If I generate an estimated query plan on the primary, I can see that the plan is forced. You can see this by looking for UsePlan in the XML of the plan.

[image: estimated plan on the primary, showing UsePlan in the plan XML]
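
If you would rather not eyeball the XML by hand, a rough query like this (a sketch of mine, not part of the original test) will surface any cached plans that carry the UsePlan attribute:

-- Scan the plan cache for plans whose XML contains UsePlan="true".
SELECT    st.[text],
    qp.query_plan
FROM    sys.dm_exec_cached_plans cp
    CROSS APPLY sys.dm_exec_query_plan(cp.plan_handle) qp
    CROSS APPLY sys.dm_exec_sql_text(cp.plan_handle) st
WHERE    CAST(qp.query_plan AS nvarchar(max)) LIKE N'%UsePlan="true"%';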

I did the same thing on the secondary (in the case of the secondary, we are looking at the actual plan, but it doesn’t matter).

[image: the plan from the secondary]

You will note that there is no UsePlan. There are extended events and a catalog view that reflect plan forcing failure (Grant Fritchey wrote about this behavior here). While I wouldn’t expect the catalog view to get updated, I was hoping that the extended event might fire. It did not.
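
For reference, the forcing-failure columns live in sys.query_store_plan; a check along these lines (run against the primary, since that is where the query store is writable) shows whether a failed forcing attempt was ever recorded:

-- Force-failure columns on the forced plan; the view only reflects
-- activity captured by the primary's query store.
SELECT    plan_id,
    query_id,
    is_forced_plan,
    force_failure_count,
    last_force_failure_reason_desc
FROM    sys.query_store_plan
WHERE    is_forced_plan = 1;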

Conclusion

The query store, as awesome as it is, doesn’t really do much for you on a readable secondary replica. It does not force plans there, nor does it record any runtime statistics for queries you run on the secondary.

Thanks to Grant Fritchey and Erin Stellato for helping with this post!

The Curious Case of the HTDELETE Wait Type

I was working with a client this week and we encountered very long HTDELETE waits on “insert as select” queries that were part of their data delivery process. This wait type isn’t documented very well; SQL Skills has it documented here and mentions this:

“Typically these waits occur when queries involve columnstore indexes, but they can also occur without columnstore indexes being involved if a hash operator runs in batch mode.”

Nacho from Microsoft also has a blog post on it here. My theory was that a bad hash join was taking place and causing the wait.
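
Before digging into plans, it is worth confirming the wait is actually accumulating; a quick check like this against the secondary will do it:

-- Cumulative HTDELETE waits since the last restart or wait-stats clear.
SELECT    wait_type,
    waiting_tasks_count,
    wait_time_ms,
    max_wait_time_ms
FROM    sys.dm_os_wait_stats
WHERE    wait_type = N'HTDELETE';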

Isolating the Plan

The thing that was very curious about the situation is that the waits were only occurring on the readable secondary replica. At first, I tried to examine the query store to understand whether there were multiple execution plans for a given query. The one problem with that is the readable secondary copy of the data is read-only, which means that on the secondary replica you only see the query store data from the primary replica. If there was a plan that was specific to the secondary, I’d have to gather it from the plan cache on the secondary (thanks to Erin Stellato (b|t) for this idea). There was one other problem—the code in question was executing as dynamic SQL from a stored procedure, which meant it was always getting a new execution plan.
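
For what it is worth, pulling plans straight from the plan cache on the secondary looks roughly like this (the LIKE filter is just a placeholder for the text of the query you care about):

-- Grab cached plans and runtime stats directly from the secondary's plan cache.
SELECT    qs.execution_count,
    qs.total_elapsed_time,
    st.[text],
    qp.query_plan
FROM    sys.dm_exec_query_stats qs
    CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) st
    CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) qp
WHERE    st.[text] LIKE N'%INSERT%SELECT%';    -- placeholder filter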

Ultimately I think any thought of the readable secondary having a vastly different plan was a red herring. Statistics are going to be the same on both instances, and if there were a missing statistic on the secondary, SQL Server would create it in TempDB. Besides, columnstore indexes don’t use statistics in the traditional sense.

Fortunately I was able to catch a query in the process of waiting on HTDELETE, so I no longer had to look for the needle in the haystack, and I could get to tuning the plans. I was able to grab the SELECT part of the query and generate an estimated plan on both the primary and secondary nodes. The plans were virtually the same on both nodes, with just a minor difference in memory grant between them.
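
Catching a session in the act is just a matter of watching live requests; this is a sketch of the kind of check I mean, not the exact query I ran:

-- Any currently executing request waiting on HTDELETE, with its text and plan.
SELECT    r.session_id,
    r.wait_type,
    r.wait_time,
    st.[text],
    qp.query_plan
FROM    sys.dm_exec_requests r
    CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) st
    CROSS APPLY sys.dm_exec_query_plan(r.plan_handle) qp
WHERE    r.wait_type = N'HTDELETE';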


[image: the estimated execution plan]

The anonymized query is here:

SELECT    Object4.Column1
FROM    Database1.Schema1.Object5 Object4
    INNER JOIN Database2.Schema1.Object6(?, ?) Object7 ON Object4.Column2 = Object7.Column3
    INNER JOIN Database2.Schema1.Object8(?) Object9 ON Object4.Column4 = Object9.Column4 
    INNER JOIN Database1.Schema1.Object10 Object11 ON Object4.Column5 = Object11.Column6
    INNER JOIN Database2.Schema1.Object12(?) Object13 ON Object11.Column7 = Object13.Column7
WHERE    1 = 1
    AND Object4.Column8 >= '01-Jan-2017'

The pattern here was that we were taking all of the rows of an ID field in a columnstore index with about 350MM rows and joining them to a function that has 3,500 rows. My gut instinct was that this was a bad match for batch mode hashing. Additionally, SQL Server was recommending I create a b-tree index on the large columnstore table. There was a key lookup in the plan that I wanted to eliminate, but my hunch was that this join was causing the waits.

[image: the original plan, showing the hash join and key lookup]


The Solution

So before I created the index, the query was taking at least 2-4 minutes when it wasn’t getting hung up on the HTDELETE wait. After I created the first index, we got down to about 15 seconds. SQL Server then recommended that I create another index on one of the join tables, which brought my query time down to sub-second. The plan looked a lot more traditional and had lots of my favorite operator, the index seek.
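
Purely for illustration, using the anonymized names from the plan above (the actual key and include columns came out of SQL Server’s missing index recommendation, so treat this as a sketch of the shape rather than the real DDL):

-- Hypothetical names and column choices, based on the anonymized query:
-- a b-tree index on the big columnstore table to support the join and filter...
CREATE NONCLUSTERED INDEX IX_Object5_Column2
    ON Schema1.Object5 (Column2, Column8)
    INCLUDE (Column1, Column4, Column5);

-- ...and a second one on the other join table.
CREATE NONCLUSTERED INDEX IX_Object10_Column6
    ON Schema1.Object10 (Column6)
    INCLUDE (Column7);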

[image: the new plan, with index seeks]


The Moral of the Story

Sometimes you need non-clustered indexes on columnstore indexes. It stinks, because they do add space, but it’s hard to argue with a performance gain like this. I need to email some friends on the product team to ask, but I’m theorizing that the join was super expensive and causing the query to hang. Anyway, the real answer is to never stop tuning and to trust your instincts.

Thanks to Sentry One for making Plan Explorer free. I used it for the screenshots and anonymization in this post.
