A Little Bit More on Columnstore (SQL 2014) Internals

I wrote a post on Monday with some details about the columnstore index insert and update process. I initially posted (and have since corrected) that the tuple mover process kicked in when 1000 rows were loaded. Thanks to Jimmy May from Microsoft for pointing out that the correct answer is 2^20, or 1,048,576, rows. Remus confirms this here, but I wanted to test out the process for myself, and I found some interesting things.

Before I go too deep into my lab work, I should explain why this matters to the columnstore index performance story. First, when a columnstore index has open delta stores, the engine has to scan them in order to answer the query. This is expensive, because it prevents the query from being executed entirely in batch mode. Loading into a delta store is also generally slower than a bulk load, because Microsoft has written an optimized insert API for loading directly into the columnstore. Additionally, we would like our columnstore row groups to be as large as possible to get the most out of segment elimination when answering a query. Let’s examine some of the impact.

Batch Loading

I’m not sure if I’ve found this documented anywhere, but the threshold to trigger the bulk insert API appears to be 102,400 rows. If we insert 102,399 rows instead, we see an open delta store.
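A rough way to reproduce the comparison is sketched below. The table names dbo.FactSales_CCI (a table with a clustered columnstore index) and dbo.FactSales_Staging (a rowstore source with the same column layout) are hypothetical, and this is not the exact script behind the timings in this post. It also assumes, as the behavior described here suggests, that an INSERT ... SELECT of this size is eligible for the bulk load path.

```sql
-- Hypothetical tables: dbo.FactSales_CCI has a clustered columnstore index,
-- dbo.FactSales_Staging is a rowstore table with an identical column list.
SET STATISTICS TIME ON;

-- 102,400 rows: at the apparent bulk load threshold.
INSERT INTO dbo.FactSales_CCI
SELECT TOP (102400) *
FROM dbo.FactSales_Staging;

-- To test the other side of the threshold, empty the target table and
-- repeat with one row fewer:
-- INSERT INTO dbo.FactSales_CCI
-- SELECT TOP (102399) *
-- FROM dbo.FactSales_Staging;

SET STATISTICS TIME OFF;
```

Comparing the elapsed time reported for each run, and then checking the row group state, should show which path each load took.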

In the output of the 102,400-row load, note the message “Column 0 was omitted from VertiPaq during column index build.” That message indicates the data was processed directly into the columnstore. Also note the time: 279 ms to load 102,400 rows. Next, we check sys.column_store_row_groups for the table.
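A minimal query against that catalog view might look like the following. The table name dbo.FactSales_CCI is a placeholder for your own columnstore table.

```sql
-- List every row group for one table, with its state and row counts.
-- dbo.FactSales_CCI is a hypothetical table name.
SELECT OBJECT_NAME(rg.object_id) AS table_name,
       rg.index_id,
       rg.partition_number,
       rg.row_group_id,
       rg.state_description,   -- e.g. OPEN, CLOSED, COMPRESSED
       rg.total_rows,
       rg.deleted_rows,
       rg.size_in_bytes
FROM sys.column_store_row_groups AS rg
WHERE rg.object_id = OBJECT_ID(N'dbo.FactSales_CCI')
ORDER BY rg.partition_number, rg.row_group_id;
```

A row group that was bulk loaded shows up as COMPRESSED, while rows that went through the trickle insert path sit in an OPEN delta store.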

As we see, that load was added to the columnstore and compressed. Note that 102,400 rows in a row group is not optimal for performance. We’d really like it to be much larger for better segment elimination. In fact, if we do a rebuild on the index, we will see our rows get distributed more evenly.
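The rebuild can be sketched like this. The index name cci_FactSales and the table name dbo.FactSales_CCI are hypothetical. Keep in mind that a rebuild rewrites the whole index, so on a large table it is an expensive operation.

```sql
-- Rebuild the clustered columnstore index so that all rows, including
-- any in delta stores, are recompressed into fresh row groups.
-- Index and table names are hypothetical.
ALTER INDEX cci_FactSales ON dbo.FactSales_CCI REBUILD;

-- Re-check the row group layout after the rebuild.
SELECT row_group_id, state_description, total_rows
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID(N'dbo.FactSales_CCI')
ORDER BY row_group_id;
```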

Now let’s see what happens if we only load 102,399 rows.

So first of all, we see something huge—the execution time is 994 ms compared to 279 ms. That means it most likely got loaded to a b-tree delta store. Let’s confirm.

Sure enough—we see an open delta store. So I run my standard warehouse query against the table:

If I rebuild the index (so the delta store gets compressed), my logical reads drop, and my execution time drops slightly. Note that this is a small data set in columnstore terms. In a much larger data set, the effects would be much more dramatic.

One Last Thing

While I was playing with the tuple mover process yesterday, I noticed something unusual: a row group with the state of “invisible”.

This state is documented in this Connect item, and will be in Books Online in the near future. According to Remus:

“The Tuple Mover reads one closed delta store at a time and starts building the corresponding compressed segments. During this time scans continue to see and read the delta store. When the Tuple Mover has finished compressing the delta store, the newly created segments are made visible and the delta store is made invisible in a single, atomic operation. New scans will see and scan the compressed format. The Tuple Mover then waits for all scans still operating in the delta store to drain after which the delta store is removed.” 

I just happened to run the row groups query while the tuple mover was firing, and got to see this.

Summary

Ideally, your data warehouse loads are in perfect 1,048,576 row increments, so your row groups are as large as possible, and you avoid having open delta stores. Since the world is not ideal, you will probably want to monitor for open delta stores, as they can greatly impact performance. Additionally, if your row groups get too “fragmented” (I use that word hesitantly; it’s not really fragmentation, it’s just a large number of row groups with less than optimal row counts), you may want to consider rebuilding the columnstore periodically, or even better, rebuilding only the partition that it’s in.
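As a starting point for that kind of monitoring, the query below summarizes row groups by table, partition, and state. It is only a sketch. What counts as “too many small row groups” is a judgment call for your workload, and the index name, table name, and partition number in the rebuild statement are hypothetical.

```sql
-- Summarize row groups per table, partition, and state. OPEN or CLOSED
-- rows are delta stores; a low smallest_row_group value among COMPRESSED
-- rows suggests undersized row groups.
SELECT OBJECT_NAME(rg.object_id) AS table_name,
       rg.partition_number,
       rg.state_description,
       COUNT(*)           AS row_groups,
       SUM(rg.total_rows) AS total_rows,
       MIN(rg.total_rows) AS smallest_row_group
FROM sys.column_store_row_groups AS rg
GROUP BY rg.object_id, rg.partition_number, rg.state_description
ORDER BY table_name, rg.partition_number, rg.state_description;

-- Rebuild a single partition instead of the whole index.
-- Names and the partition number are hypothetical.
ALTER INDEX cci_FactSales ON dbo.FactSales_CCI REBUILD PARTITION = 3;
```

Rebuilding one partition keeps the work proportional to the data that actually changed, which is usually the most recently loaded partition in a warehouse.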