SQL Server 2014 Columnstore Index Update Internals


I have a bad habit of wanting to know how things actually work, which often leads me down rabbit holes. I’ve taken apart way too many mechanical things in my lifetime, and have mostly gotten them back together (though if you’re a cyclist, I never recommend taking apart a Shimano STI lever). Anyway, in SQL Server 2014, we now have the ability to update a clustered columnstore index.

Given the highly compressed nature of the data, the CPU cost of making a direct update to the columnstore index (which is stored as LOBs made up of segments) would be extremely high. So in order to facilitate updates and inserts, SQL Server uses the concept of a delta store. This delta store is a b-tree (row-oriented) structure where inserts and updates are stored until they reach a threshold of 1,048,576 rows (thanks to Jimmy May for catching this; an earlier version of this post said approximately 1,000 rows, and I’m not sure where I got that from). Then, when memory is available, they are moved into the columnstore. Let’s take a look at the process in detail.

Here we update a table with a columnstore index:
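A minimal sketch of such an update, in case you want to follow along. The table, index, and column names (dbo.FactSales, cci_FactSales, SalesKey, Quantity) are made up for illustration and are not the ones used in the original test. Note that with a tiny table like this one, all of the rows start out in the delta store.

```sql
-- Hypothetical demo table; every name here is an assumption for illustration.
CREATE TABLE dbo.FactSales
(
    SalesKey int NOT NULL,
    Quantity int NOT NULL
);

-- SQL Server 2014 allows updates only against a clustered columnstore index.
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales;

-- Small (non-bulk) inserts land in the delta store.
INSERT INTO dbo.FactSales (SalesKey, Quantity)
VALUES (1, 10), (2, 20), (3, 30);

-- The update itself.
UPDATE dbo.FactSales
SET    Quantity = Quantity + 1
WHERE  SalesKey = 2;
```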

For the moment this data gets put into a delta store—which we can see from the following query:
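A query along these lines against sys.column_store_row_groups shows the row groups and their state. The table name dbo.FactSales is a hypothetical stand-in; substitute your own table.

```sql
-- dbo.FactSales is a hypothetical table name used for illustration.
SELECT OBJECT_NAME(object_id) AS table_name,
       index_id,
       partition_number,
       row_group_id,
       delta_store_hobt_id,   -- non-NULL only for delta store row groups
       state_description,     -- e.g. OPEN, CLOSED, COMPRESSED
       total_rows,
       deleted_rows,
       size_in_bytes
FROM   sys.column_store_row_groups
WHERE  object_id = OBJECT_ID(N'dbo.FactSales')
ORDER  BY row_group_id;
```

Row groups that are still in the delta store have a value in delta_store_hobt_id; compressed row groups show NULL there.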

The delta_store_hobt_id helps us here. Next I’m going to search sys.partitions for this hobt_id. Nothing.
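That lookup can be sketched roughly like this, again assuming a hypothetical table named dbo.FactSales. On the SQL Server 2014 instance described here, it came back empty.

```sql
-- dbo.FactSales is a hypothetical table name used for illustration.
DECLARE @hobt_id bigint;

-- Grab one delta store hobt_id for the table, if any exists.
SELECT TOP (1) @hobt_id = delta_store_hobt_id
FROM   sys.column_store_row_groups
WHERE  object_id = OBJECT_ID(N'dbo.FactSales')
  AND  delta_store_hobt_id IS NOT NULL;

-- Look for that hobt_id in sys.partitions.
SELECT *
FROM   sys.partitions
WHERE  hobt_id = @hobt_id;
```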

I did some more chasing, but I’ll spare you those details. I ended up at sys.syscsrowgroups. Don’t look for it; it’s in the Resource database. I told you this was a rabbit hole. That didn’t really help, so I then went to sys.dm_db_database_page_allocations. There we can see the IN_ROW_DATA allocations, which is where our delta store pages reside.
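A sketch of that page allocation query follows. Keep in mind that sys.dm_db_database_page_allocations is an undocumented function, so treat the column list as what it returns on SQL Server 2012 and 2014 rather than a supported contract. The table name dbo.FactSales is hypothetical.

```sql
-- dbo.FactSales is a hypothetical table name used for illustration.
-- Arguments: database_id, object_id, index_id, partition_id, mode.
SELECT allocation_unit_type_desc,   -- IN_ROW_DATA vs. LOB_DATA
       allocated_page_file_id,
       allocated_page_page_id,
       page_type_desc               -- populated in DETAILED mode
FROM   sys.dm_db_database_page_allocations(
           DB_ID(), OBJECT_ID(N'dbo.FactSales'), NULL, NULL, 'DETAILED')
WHERE  is_allocated = 1
ORDER  BY allocation_unit_type_desc, allocated_page_page_id;
```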

From there, we can run DBCC PAGE to look at the contents of the page:
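For reference, the calls look roughly like this. DBCC PAGE is undocumented; the database name MyDatabase and the file ID of 1 are assumptions, while the page ID is the delta store page discussed in this post.

```sql
-- Send DBCC output to the client session instead of the error log.
DBCC TRACEON (3604);

-- DBCC PAGE (database, file_id, page_id, print_option)
-- 'MyDatabase' and file_id 1 are assumptions; use the values from your
-- own page allocation output. Print option 3 includes per-row detail.
DBCC PAGE (N'MyDatabase', 1, 608240, 3);
```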

PageID 608240 is the delta store page, and as you can see, we can still read real data on it. If we look at the columnstore compressed page (522712), we see a bunch of gibberish, indicative of compression (and the fact that it’s a LOB now).

My VM was being fussy, so I forced the tuple mover (the process which moves data from the delta store into the columnstore) by doing an index rebuild.
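The rebuild and a follow-up check can be sketched as below. The index and table names (cci_FactSales, dbo.FactSales) are hypothetical, and the check relies on the undocumented sys.dm_db_database_page_allocations function.

```sql
-- cci_FactSales and dbo.FactSales are hypothetical names for illustration.
-- Rebuilding the index compresses all rows, including those in the delta store.
ALTER INDEX cci_FactSales ON dbo.FactSales REBUILD;

-- Count the allocated pages by allocation unit type after the rebuild.
SELECT allocation_unit_type_desc,
       COUNT(*) AS allocated_pages
FROM   sys.dm_db_database_page_allocations(
           DB_ID(), OBJECT_ID(N'dbo.FactSales'), NULL, NULL, 'LIMITED')
WHERE  is_allocated = 1
GROUP  BY allocation_unit_type_desc;
```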

Since every remaining page allocation is of the type LOB_DATA, all of our data has been compressed and added to the columnstore.

So that’s the path our data takes. Many thanks to Remus Rusanu (b|t) for both helping write this new feature and documenting it.
