SQL Server 2017 Temporal Enhancements

One of the most popular features in my talks about SQL Server 2016 has been the temporal tables feature. If you aren’t familiar with this feature, you can read more about it on Books Online here. In a nutshell, you get a second table that tracks the lineage of your data. This is fantastic for all sorts of scenarios, up to and including auditing, data recovery, fraud detection, and even slowly changing dimensions.


This is implemented in SQL Server via a history table—a second copy of your data that maintains timestamps of when the data was valid. As you can imagine, this table can grow quite large, so Microsoft does us a couple of favors: the history table is page compressed by default (you can use columnstore instead), and you can put the history table on a different filegroup. The only major issue was that to truncate or delete data from the history table for pruning purposes, you had to turn off system versioning—the glue that makes this feature work.
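Before 2017, pruning looked something like the following sketch (the table, history table, column name, and retention window here are all hypothetical; adjust for your schema):

ALTER TABLE dbo.Orders SET (SYSTEM_VERSIONING = OFF);

-- with versioning off, the history table can be pruned directly
DELETE FROM dbo.Orders_History
WHERE ValidTo < DATEADD(MONTH, -6, SYSUTCDATETIME());

ALTER TABLE dbo.Orders SET (SYSTEM_VERSIONING = ON
    (HISTORY_TABLE = dbo.Orders_History, DATA_CONSISTENCY_CHECK = ON));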

Starting with SQL Server 2017 (and Azure SQL Database) you can define a retention period and have SQL Server prune records for you. This is awesome and easy—see how to implement it here.
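Using the same hypothetical table as above, the retention period is just part of the SYSTEM_VERSIONING setting:

ALTER TABLE dbo.Orders
SET (SYSTEM_VERSIONING = ON (HISTORY_RETENTION_PERIOD = 6 MONTHS));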

Always On Availability Groups transport has detected a missing log block…

If you are running SQL Server 2016 (especially a build before CU3), you may have received this error:

DATE/TIME:    8/21/2017 11:24:53 AM

DESCRIPTION:    Always On Availability Groups transport has detected a missing log block for availability database "Database". LSN of last applied log block is (99939:95847:0). Log scan will be restarted to fix the issue. This is an informational message only. No user action is required.

COMMENT:    (None)

JOB RUN:    (None)


I’ve talked with the product team about it, and it’s just something that happens—it is more of an informational message. Based on some discussions I’ve had on #sqlhelp on Twitter, it may be related to not having enough network bandwidth between nodes. But if you see these sporadically, relax, it’s nothing major.

Analyzing Your Dump Files

I’m blogging about this because A) it’s something really awesome that the SQL Server team built, and B) it seems to have terrible SEO, because it took me about three Google searches to find the page. With the introduction of SQL Server Management Studio 17, the Tiger team at Microsoft built a plugin that allows you to debug and resolve memory dumps you may have encountered during otherwise normal operations. This is really awesome, as it is something that usually requires a support case with CSS.

For those of you wearing aluminum hats, this does require you to upload the dump file to Azure, where it is analyzed for free (as in beer) on Microsoft’s software. You can even choose your region if you have data provenance concerns. And according to this blog post, the memory dumps are stored in accordance with Microsoft’s Privacy Policy.

You will need SSMS 17 for this, as well as to install the plug-in, which you can get here.


After that you can quickly get feedback on your dumps. Microsoft has even built an API, so if you want to build something automated to upload your dump files using Python or PowerShell, you can.

Query Store and Availability Groups—Force Plan on Secondary Replicas

I’m still fighting with some challenges around inconsistent performance between a primary and secondary replica, so I’ve been waist deep in undocumented system views looking at temporary statistics. One of the things I thought about doing was taking advantage of the Force Plan option in the Query Store in SQL Server 2016. If you are not familiar with this feature, it allows you to force a “preferred” execution plan. In this scenario, our query was running in about 20-30 seconds on the primary, and 20-30 minutes on the secondary. The plans were reasonably close, but I wanted to see what would happen if I forced a plan on the primary.

Primer about the Query Store and Availability Groups

Since readable secondary replicas are read-only, the query store on those secondary replicas is also read-only. This means runtime statistics for queries executed on those replicas are not recorded into the query store. All the stats there are from the primary replica. However, I wasn’t sure what would happen if I forced a plan on the primary—would the secondary replica honor that plan?

Let’s Find Out

The first thing I did was to query the query store catalog views to verify that the plan was forced.
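For reference, a check along these lines (a minimal sketch) shows which plans are currently forced:

SELECT qsq.query_id, qsp.plan_id, qsp.is_forced_plan
FROM sys.query_store_plan AS qsp
JOIN sys.query_store_query AS qsq ON qsp.query_id = qsq.query_id
WHERE qsp.is_forced_plan = 1;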


I have two copies of the forced plan. If I run an estimated query plan on the primary, I see that the plan is forced. You can see this by looking for UsePlan in the XML of the plan.


I did the same thing on the secondary (in the case of the secondary, we are looking at the actual plan, but it doesn’t matter).


You will note that there is no UsePlan. There are extended events and a catalog view that reflect plan forcing failure (Grant Fritchey wrote about this behavior here). While I wouldn’t expect the catalog view to get updated, I was hoping that the extended event might fire. It did not.
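For what it’s worth, the forcing-failure columns live in sys.query_store_plan, so a query like this (run against the primary, since that is where the query store is writable) is a sketch of what I was checking:

SELECT plan_id, is_forced_plan, force_failure_count, last_force_failure_reason_desc
FROM sys.query_store_plan
WHERE is_forced_plan = 1;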

Conclusion

The query store, as awesome as it is, doesn’t really do much for you on readable secondary replicas. It does not force plans, nor does it record any of your data.

Thanks to Grant Fritchey and Erin Stellato for helping with this post!

The Curious Case of the HTDELETE Wait Type

I was working with a client this week and we encountered very long HTDELETE waits on “insert as select” queries that were part of their data delivery process. This wait type isn’t documented very well; SQL Skills has it documented here and mentions this:

“Typically these waits occur when queries involve columnstore indexes, but they can also occur without columnstore indexes being involved if a hash operator runs in batch mode.”

Nacho from Microsoft also has a blog post on it here. My theory was that a bad hash join was taking place and causing the wait.

Isolating the Plan

The thing that was very curious about the situation is that the waits were only occurring on the readable secondary replica. At first, I tried to examine the query store to try to understand if there were multiple execution plans for a given query. The one problem with that is the readable secondary copy of the data is read-only, which means on that secondary replica you only see the query store data from the primary replica. If there was a plan that was specific to the secondary, I’d have to gather it from the plan cache on the secondary. (Thanks to Erin Stellato (b|t) for this idea.) There was one other problem—the code in question was executing as dynamic SQL from a stored procedure, which meant it was always getting a new execution plan.
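If you do need to pull plans from the secondary’s cache, something along these lines works when run against the replica (a sketch; the LIKE filter is a hypothetical placeholder for text that identifies your query):

SELECT qs.creation_time, qs.execution_count, st.text, qp.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
CROSS APPLY sys.dm_exec_query_plan(qs.plan_handle) AS qp
WHERE st.text LIKE '%your query text here%';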

Ultimately I think any thought of the readable secondary having a vastly different plan was a red herring. Statistics are going to be the same on both instances, and if there were a missing statistic on the secondary, SQL Server would create it in TempDB. Anyway, columnstore indexes don’t use statistics in the traditional sense.

Fortunately I was able to catch a query in the process of waiting on HTDELETE, so I no longer had to look for the needle in the haystack, and I could get to tuning the plans. I was able to grab the SELECT part of the query and generate an estimated plan on both the primary and secondary nodes. The plans were virtually the same on both nodes, with just a minor difference in memory grant between them.
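Catching a query in the act is straightforward; here is a sketch of the kind of check I used:

SELECT r.session_id, r.wait_type, r.wait_time, r.total_elapsed_time, t.text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.wait_type = 'HTDELETE';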

 


The query is here.

SELECT    Object4.Column1
FROM    Database1.Schema1.Object5 Object4
    INNER JOIN Database2.Schema1.Object6(?, ?) Object7 ON Object4.Column2 = Object7.Column3
    INNER JOIN Database2.Schema1.Object8(?) Object9 ON Object4.Column4 = Object9.Column4 
    INNER JOIN Database1.Schema1.Object10 Object11 ON Object4.Column5 = Object11.Column6
    INNER JOIN Database2.Schema1.Object12(?) Object13 ON Object11.Column7 = Object13.Column7
WHERE    1 = 1
    AND Object4.Column8 >= '01-Jan-2017'

The pattern here was that we were taking all of the rows of an ID field in a columnstore index with about 350MM rows and joining them to a function that has 3500 rows. My gut instinct was that this was a bad match for batch mode hashing. Additionally, SQL Server was recommending I create a b-tree index on the large columnstore table. There was a key lookup in the plan that I wanted to eliminate, but my hunch was that this join was causing the waits.


 

The Solution

So before I created the index, the query was taking at least 2-4 minutes, when it wasn’t getting hung on the HTDELETE wait. After I created the first index, we got down to about 15 seconds. SQL Server then recommended that I create another index on one of the join tables, which brought my query time down to sub-second. The plan looked a lot more traditional and had lots of my favorite operator, the index seek.
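Roughly what that first index looked like (a sketch only; the real table and column names are anonymized above, so these are hypothetical):

CREATE NONCLUSTERED INDEX IX_LargeFact_JoinKey
ON dbo.LargeFactTable (JoinKeyColumn)
INCLUDE (LookupColumn);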


 

The Moral of the Story

Sometimes you need non-clustered indexes on columnstore indexes. It stinks, because they do add space, but it’s hard to argue with a performance gain like this. I need to email some friends on the product team to ask, but I’m theorizing that the join was super expensive and causing the query to hang. Anyway, the real answer is to never stop tuning and to trust your instincts.

Thanks to Sentry One for making Plan Explorer free. I used it for the screenshots and anonymization in this post.

DBCC Clonedatabase and Very Large Databases

One of the recent feature introductions to SQL Server is dbcc clonedatabase, a feature that lets you create a “data-less” clone of your database. All of the statistics and objects come into your cloned database, but none of the data does. This is perfect for development or performance tuning exercises, where you want all the metadata, but do not want the security risk of dealing with production data.

Recently I had the opportunity to use clonedatabase on a very large database. I was concerned about the size of the data files and how this would impact space on my volumes. Books Online is fairly clear, but I wanted to see for myself.

Note: All files in the target database will inherit the size and growth settings from the model database. File name convention: the file names for the destination database will follow the source_file_name_underscore_random_number convention. If the generated file name already exists in the destination folder, DBCC CLONEDATABASE will fail.

So my thought on reading that is that the same number of data files will be created in the clone, just with the settings from model. Let’s test that out.

The first thing I did was create a new database, and then add a few data files to it. I made them 20 MB, which is a different size than model—just for testing purposes.
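Here is a sketch of that setup (database name, file names, and paths are hypothetical—adjust for your environment):

CREATE DATABASE CloneTest;
GO
ALTER DATABASE CloneTest ADD FILE
    (NAME = 'CloneTest_Data2', FILENAME = 'C:\SQLData\CloneTest_Data2.ndf', SIZE = 20MB);
ALTER DATABASE CloneTest ADD FILE
    (NAME = 'CloneTest_Data3', FILENAME = 'C:\SQLData\CloneTest_Data3.ndf', SIZE = 20MB);
GO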


Next, I ran the clone database command.
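With the same hypothetical names as above, the command is just:

DBCC CLONEDATABASE (CloneTest, CloneTest_Clone);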


Then I connected to the clone and looked at the data files.
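A quick way to see them (size is stored in 8 KB pages, hence the divide by 128):

USE CloneTest_Clone;
GO
SELECT name, physical_name, size / 128 AS size_mb
FROM sys.database_files;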


I can see that all of the files were created, in the same location as the files on the source database, except with the size settings of model. While this shouldn’t be a big deal for most, if you do as I recommend and make model a reasonable size for your environment, and you happen to be tight on drive space, you could fill up a volume. So just be aware when using clonedatabase, particularly with databases that have a lot of data files in them.

Exporting Masked Data with Dynamic Data Masking


Dynamic Data Masking is a presentation-layer feature that was added to Azure SQL DB and SQL Server 2016. In a nutshell, it prevents end users from seeing sensitive data, while letting administrators show some of the data (e.g. the last 4 digits of a social security number) for verification purposes. I’m not going to focus too much on the specifics of data masking in this post—that’s a different topic. This is about how, once you have a masking strategy, you can protect your sensitive data as it goes to other environments.

Well at PASS Summit, both in our booth and during my presentation on security in Azure DB, another idea came up—exporting data from production to development, while not releasing any sensitive data. This is a very common scenario—many DBAs have to export sensitive data from prod to dev, and frequently it is done in an insecure fashion.

Doing this requires a little bit of trickery, as dynamic data masking does not work for administrative users. So you will need a second user.

First step—let’s create a database and a masked table.

CREATE DATABASE DDM_Demo
GO

USE DDM_Demo
GO
CREATE TABLE Membership
(MemberID int IDENTITY PRIMARY KEY,
FirstName varchar(100) MASKED WITH (FUNCTION = 'partial(1,"XXXXXXX",0)') NULL,
LastName varchar(100) NOT NULL,
Phone# varchar(12) MASKED WITH (FUNCTION = 'default()') NULL,
Email varchar(100) MASKED WITH (FUNCTION = 'email()') NULL);

INSERT Membership (FirstName, LastName, Phone#, Email) VALUES
('Roberto', 'Tamburello', '555.123.4567', 'RTamburello@contoso.com'),
('Janice', 'Galvin', '555.123.4568', 'JGalvin@contoso.com.co'),
('Zheng', 'Mu', '555.123.4569', 'ZMu@contoso.net');

CREATE LOGIN TestUser WITH PASSWORD = 'P@ssw0rd!';
CREATE USER TestUser FOR LOGIN TestUser;
GO

GRANT VIEW DEFINITION TO TestUser;
ALTER ROLE db_datareader ADD MEMBER TestUser;
GO

USE msdb
GO
CREATE USER TestUser FOR LOGIN TestUser;
GO
ALTER ROLE db_datareader ADD MEMBER TestUser;
GO
ALTER ROLE db_datawriter ADD MEMBER TestUser;
GO

Next I’ll log in as this user and select from the Membership table.
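If you’d rather not open a separate connection, impersonation shows the same masked output (a sketch):

USE DDM_Demo;
GO
EXECUTE AS USER = 'TestUser';
SELECT MemberID, FirstName, LastName, Phone#, Email FROM Membership;
REVERT;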


From here, I’m going to (as the TestUser) take an export of the database. You can do this by selecting the Export Data-tier Application option from the Tasks menu in Management Studio.


I won’t bore you with clicking through the process, but this will give you an export of your database, with the data masked. Your next step is to import the .bacpac file you created. In this case I’m going to the same instance, so I changed the database name.

Right click on “Databases” in SSMS and select “Import Data-tier application”. Import the file you created in the previous step.


 
Now try selecting as your admin user.


Boom, you’ve exported and imported masked data in your lower environments.

Circles and Squares–What Do They Mean in the Query Store?

Denny Cherry and I are on a tuning mission this week, and fortunately the customer has SQL Server 2016, so we don’t have to waste a lot of time finding the problematic queries in the system. However, we were a little bit confused when we saw a graphic in one of the Query Store reports today.


You will note that plan 44 has two durations, and even though it has a single plan id, the table to the right of the plan summary window indicates two plans. You will also note that one of the plans is represented by a circle, the other by a square. So what does this mean?

If the icon is a circle, that means the query completed. If the icon is a square, that represents a query that was cancelled, but had a plan generated for it. In this case I cancelled the query in SSMS.

We can confirm by looking in the Query Store catalog views.
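Specifically, sys.query_store_runtime_stats breaks executions out by execution type, so something like this (the plan id comes from the report above) shows both the regular and the aborted executions:

SELECT plan_id, execution_type, execution_type_desc, count_executions
FROM sys.query_store_runtime_stats
WHERE plan_id = 44;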


The second run of the query shows as aborted and is represented as such in the report.

Does In-Memory OLTP Work with Always Encrypted? (AKA That time Bob Ward asked us a question)

Last week was the PASS Summit, which is the biggest confab of SQL Server professionals on the planet (and educational as ever). Denny Cherry (b|t) and I ran into Bob Ward (b|t) of Microsoft, of 500-level internals presentation fame. And for the first time ever, Bob asked us a question about SQL Server—of course we didn’t know the answer off the top of our heads, but we felt obligated to research it like we’ve made Bob do so many times. Anyway, the question came up at Bob’s internals session on Hekaton (In-Memory OLTP): does it support the new Always Encrypted feature in SQL Server 2016? I checked Books Online, but could not find a clear answer, so I fired up SSMS and set up a quick demo.

 

Set up Hekaton

Let’s get started with setting up Hekaton.

CREATE DATABASE imoltp  
GO 
 
-------------------------------------------------------- 
-- add a memory-optimized filegroup and a container. 
ALTER DATABASE imoltp ADD FILEGROUP imoltp_mod CONTAINS MEMORY_OPTIMIZED_DATA  
ALTER DATABASE imoltp ADD FILE (name='imoltp_mod1', filename='c:\temp\imoltp_mod1') TO FILEGROUP imoltp_mod  
ALTER DATABASE imoltp SET MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT=ON 
GO 

In that code, I’m just doing the setup for In-Memory OLTP—you need to add a filegroup, as well as enable MEMORY_OPTIMIZED_ELEVATE_TO_SNAPSHOT.

Always Encrypted Setup

Next, we needed to set up Always Encrypted.


In order to configure this, you’ll first need to set up a column master key, and then a column encryption key. I took a sample table from Books Online and attempted to encrypt it.

CREATE TABLE [Employee_Encrypted](
    [BusinessEntityID] [int] NOT NULL IDENTITY(1,1) ,
    [NationalIDNumber] [char](11) COLLATE Latin1_General_BIN2
        ENCRYPTED WITH (ENCRYPTION_TYPE = DETERMINISTIC,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256',
        COLUMN_ENCRYPTION_KEY = ColumnKey1)
        NOT NULL,
    [LoginID] [nvarchar](256) NULL,
    [BirthDate] [date]
        ENCRYPTED WITH (ENCRYPTION_TYPE = RANDOMIZED,
        ALGORITHM = 'AEAD_AES_256_CBC_HMAC_SHA_256',
        COLUMN_ENCRYPTION_KEY = ColumnKey1)
        NOT NULL,
         CONSTRAINT [PK_Employee_Encrypted_BusinessEntityID] PRIMARY KEY NONCLUSTERED
(
    [BusinessEntityID] ASC
)
)
WITH  (MEMORY_OPTIMIZED=ON)

With that T-SQL (and a couple of attempts at getting my parentheses in the right place) it just ran. Nothing major to do—just create the table. I confirmed this behavior with Jos de Bruijn, who is the program manager for Hekaton.

SQL Server 2016 Database Mail Not Working

One of the nice things about SQL Server 2016 is that .NET 3.5 is no longer required for installation. The .NET requirement wouldn’t have been problematic, but in order to add the .NET feature, access to the Windows installation media was required (this was particularly painful in Azure VMs). Fortunately, the product team eliminated the requirement as part of the release. Unfortunately, a few things broke: namely database mail, native log shipping, and distributed replay. You can identify this problem by a couple of symptoms:

  • Your database mail log has no entries
  • You have messages with a status of “unsent” in sysmail_allitems in MSDB (a quick check is sketched after this list)


  • If you try to execute the file C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\DatabaseMail.exe you get a .NET 3.5 error
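A quick way to check for stuck messages (a sketch; sent_status is the column sysmail_allitems exposes for this):

SELECT sent_status, COUNT(*) AS messages
FROM msdb.dbo.sysmail_allitems
GROUP BY sent_status;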

This should be fixed in CU2, but there are reports of it failing even in SP1. The XML to put in the .config files (for database mail, that is DatabaseMail.exe.config, alongside DatabaseMail.exe in the Binn directory) is:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <startup useLegacyV2RuntimeActivationPolicy="true">
    <supportedRuntime version="v4.0"/>
    <supportedRuntime version="v2.0.50727"/>
  </startup>
</configuration>
