Performance of SQL Server 2014 Columnstore Archival Mode

I blogged about this in not a whole lot of detail a couple of months ago, but as a follow-up to my recent testing of columnstore index update and insert performance, I decided to take another shot at testing the new Columnstore Archival mode of compression in SQL 2014 CTP2. What this allows you to do is take either an entire columnstore index (a really bad idea) or selected, relatively unused partitions of your columnstore index, and compress them into an even smaller archive than a typical columnstore index. Microsoft, per conversations I had at PASS Summit, mentioned seeing compression gains of up to 26%.
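For reference, rebuilding individual partitions into archival compression looks roughly like this. This is a sketch based on the SQL 2014 syntax; the table name and partition numbers mirror the test setup described below rather than being the exact commands I ran.

-- Rebuild a single cold partition of a clustered columnstore table
-- into the smaller archival format
ALTER TABLE dbo.BigTransaction_Arch_Columnstore
REBUILD PARTITION = 1
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

-- Or rebuild everything in one pass, sending only the cold partitions
-- to the archival format and leaving the rest on regular columnstore
ALTER TABLE dbo.BigTransaction_Arch_Columnstore
REBUILD PARTITION = ALL
WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE ON PARTITIONS (1 TO 5),
      DATA_COMPRESSION = COLUMNSTORE ON PARTITIONS (6 TO 9));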

The Test

Like I did in my insert and update testing, I used the Make Big AdventureWorks script, which you can download here, courtesy of Adam Machanic (b|t). Specifically, my focus was on the BigTransaction table, which I partitioned by date. Before I compressed anything, the partitions were the following sizes:

Table                          | Partition | Rows    | Data KB
------------------------------ | --------- | ------- | -------
BigTransaction_Reg_Columnstore | 1         | 3774500 | 29144
BigTransaction_Reg_Columnstore | 2         | 3152574 | 24200
BigTransaction_Reg_Columnstore | 3         | 2909218 | 22200
BigTransaction_Reg_Columnstore | 4         | 2812282 | 21720
BigTransaction_Reg_Columnstore | 5         | 2660245 | 20664
BigTransaction_Reg_Columnstore | 6         | 2627890 | 30008
BigTransaction_Reg_Columnstore | 7         | 2461343 | 28064
BigTransaction_Reg_Columnstore | 8         | 2568732 | 29352
BigTransaction_Reg_Columnstore | 9         | 8296817 | 52512

 

After I rebuilt partitions 1-5 with archival Columnstore compression, I saw the following savings in compression:

Table                           | Partition | Rows    | Data KB | Compression %
------------------------------- | --------- | ------- | ------- | -------------
BigTransaction_Arch_Columnstore | 1         | 3774500 | 42480   | 31.39%
BigTransaction_Arch_Columnstore | 2         | 3152574 | 35552   | 31.93%
BigTransaction_Arch_Columnstore | 3         | 2909218 | 32840   | 32.40%
BigTransaction_Arch_Columnstore | 4         | 2812282 | 32072   | 32.28%
BigTransaction_Arch_Columnstore | 5         | 2660245 | 30520   | 32.29%
BigTransaction_Arch_Columnstore | 6         | 2627890 | 30008   | N/A
BigTransaction_Arch_Columnstore | 7         | 2461343 | 28064   | N/A
BigTransaction_Arch_Columnstore | 8         | 2568732 | 29352   | N/A
BigTransaction_Arch_Columnstore | 9         | 8296817 | 52512   | N/A
Average                         |           |         |         | 32.06%

 

So for the partitions I rebuilt, I see an average space savings of 32%, but at what cost to select performance?

Select Performance

The range query I ran selected the average transaction cost by date, with a group by on the date and a filter to pull in only dates from the archived partitions. In terms of pure time to execute the query, the columnstore archival query ran about 18% longer; however, it consumed 51% more CPU time. Compression is CPU costly, and the more we compress data, the more CPU cycles are required to decompress it. One other point of note: when I partition the table (with either form of compression), I see a lot more logical reads for the same query plans. Here the numbers were around 85k and 120k, but if I run the same query against an unpartitioned table with a columnstore index, the count is much closer to 40k. Aaron Bertrand (b|t) pointed out that this might be related to a bug with STATISTICS IO.
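The query itself looked something like the following sketch; the date range is simply whatever covers the archived partitions, and the column names come from the Make Big AdventureWorks schema.

-- Average transaction cost by date, restricted to dates that live in
-- the archive-compressed partitions (date range is illustrative)
SELECT TransactionDate,
       AVG(ActualCost) AS AvgTransactionCost
FROM dbo.BigTransaction_Arch_Columnstore
WHERE TransactionDate >= '2005-01-01'
  AND TransactionDate <  '2007-01-01'
GROUP BY TransactionDate;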

 

 

 

 

 

So there is some benefit to this feature—using it as a cold archive, you can make gains in storage, potentially 30+%, with a relatively minor penalty for querying. It is best utilized on data that you don’t plan on accessing on a regular basis.

Columnstore Insert and Update Performance

I had been holding off on writing this blog post with SQL 2014 CTP1, since updateable columnstores were definitely a work in progress. I did get several questions about updateable columnstores during my presentation at the recent PASS Summit, so with a fresh VM running SQL 2014 CTP2 on my laptop, I dove right in.

In order to really test any sort of compression technology, you need to work with a large volume of data. For the purposes of my presentations, I’ve been using Adam Machanic’s (b|t) Make Big AdventureWorks script, which you can download here. Specifically, the table I’m working with is the BigTransaction table, which is about 1.1 GB uncompressed and consists of 31,263,601 rows.

Read Performance

Just in case you aren’t up to speed on columnstore indexes: they offer much better read performance than traditional compression. That’s outside the scope of this post, but running a test query to pull the average transaction value for a given date, this graph shows the comparison between page compression and columnstore. I know you can’t see the elapsed time for the columnstore query in the chart—it’s 227 milliseconds (ms), as opposed to 5,627 ms for the page compressed version of the query.
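The test query was along these lines; this is a sketch, and the table name and specific date are illustrative.

-- Average transaction value for a single date; run once against the
-- page compressed copy and once against the columnstore copy
SELECT AVG(ActualCost) AS AvgTransactionValue
FROM dbo.BigTransaction
WHERE TransactionDate = '2008-06-15';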

 

 

First Things First

One of the caveats of using a clustered columnstore index is that it must include all of the columns in the table, so the syntax is very simple.

CREATE TABLE [dbo].[cluster_Columnstore_load_test](
    [TransactionID] [int] NOT NULL,
    [ProductID] [int] NOT NULL,
    [TransactionDate] [datetime] NULL,
    [Quantity] [int] NULL,
    [ActualCost] [money] NULL);

CREATE CLUSTERED COLUMNSTORE INDEX CCI_Test ON cluster_Columnstore_load_test;

Loading the Data

I decided to do a pretty basic test—I took the data from my uncompressed BigTransaction table and did an INSERT INTO ... SELECT. I also decided to fire up an Extended Events session that captured the following events (column_store_index_build_process_segment, columnstore_tuple_mover_begin_compress, columnstore_tuple_mover_end_compress), so I could see the process as it happened.
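A minimal session definition along those lines looks like this; the session name and target file name are placeholders, and you may need to adjust the event names to whatever your CTP build exposes.

CREATE EVENT SESSION ColumnstoreLoadTest ON SERVER
ADD EVENT sqlserver.column_store_index_build_process_segment,
ADD EVENT sqlserver.columnstore_tuple_mover_begin_compress,
ADD EVENT sqlserver.columnstore_tuple_mover_end_compress
ADD TARGET package0.event_file (SET filename = N'ColumnstoreLoadTest.xel');

ALTER EVENT SESSION ColumnstoreLoadTest ON SERVER STATE = START;

With the session started, the load itself was just: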

Insert into cluster_Columnstore_load_test select * from bigTransaction_UC

Here were the results.

Table 'cluster_Columnstore_load_test'. Scan count 0, logical reads 0, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'bigTransaction_UC'. Scan count 1, logical reads 131820, physical reads 473, read-ahead reads 131816, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 63500 ms, elapsed time = 69404 ms.

(31263601 row(s) affected)

I was really surprised at how quickly the data loaded (69 seconds). This morning I was also doing benchmark testing using another vendor’s columnar database, and the same load took nearly five minutes. In the xEvents session, I saw the build_process_segment event firing—each row group contains a minimum of 102,400 rows, and I observed highly variable row counts (as low as 120k, as high as 524k) in each rowgroup, which I suspect is based on compression level.

Here was my CPU during the process—high but not greater than 80%

 

Updating the Data

So my next test was to perform a large update against the data set.

    update cluster_Columnstore_load_test set actualcost = '$50.00'
    where transactionDate > '01-Jan-2009' and transactionDate < '01-Jan-2012'

The results were as follows:

Table 'cluster_Columnstore_load_test'. Scan count 3, logical reads 156967, physical reads 3636, read-ahead reads 332077, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

SQL Server Execution Times:

CPU time = 176609 ms, elapsed time = 213016 ms.

(8296817 row(s) affected)

From a process (well, xEvent) perspective, nothing happens when that update is performed. The data is moved into a special type of b-tree called a delta store, where approximately 1000 rows are stored. This gives the performance benefit of updating the columnstore index asynchronously—in my case, about 5 minutes later, when I saw all three of my events firing: the begin compress, followed by build_process_segment, followed by the end compress. This was also accompanied by some spiky CPU, which I would expect.
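One way to watch this happen alongside the xEvents is to poll the rowgroup metadata while the tuple mover works. A sketch, using the sys.column_store_row_groups DMV that ships with SQL 2014:

-- Rowgroup state moves from OPEN (delta store) to CLOSED to COMPRESSED
-- as the tuple mover catches up
SELECT row_group_id,
       state_description,
       total_rows,
       deleted_rows,
       size_in_bytes
FROM sys.column_store_row_groups
WHERE object_id = OBJECT_ID('dbo.cluster_Columnstore_load_test')
ORDER BY row_group_id;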

 

So How Does Page Compression Compare?

So I ran the same series of tests using a page compressed version of the table, with a clustered index on transactionid, productid, and transactiondate.
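The comparison table looked roughly like this. This is a sketch: I am assuming the column list matches the columnstore test table, and the object names are illustrative.

CREATE TABLE dbo.page_test_load (
    [TransactionID] [int] NOT NULL,
    [ProductID] [int] NOT NULL,
    [TransactionDate] [datetime] NULL,
    [Quantity] [int] NULL,
    [ActualCost] [money] NULL);

-- Page compression is declared on the clustered index
CREATE CLUSTERED INDEX CI_page_test_load
ON dbo.page_test_load ([TransactionID], [ProductID], [TransactionDate])
WITH (DATA_COMPRESSION = PAGE);

Here’s what happened on the initial insert: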

insert into page_test_load select * from bigtransaction_uc;

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server Execution Times:

CPU time = 0 ms, elapsed time = 0 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 2 ms.

Table 'page_test_load'. Scan count 0, logical reads 101434452, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

Table 'bigTransaction_UC'. Scan count 1, logical reads 143648, physical reads 1, read-ahead reads 143658, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(31263601 row(s) affected)

(1 row(s) affected)

SQL Server Execution Times:

CPU time = 220750 ms, elapsed time = 235712 ms.

This is nearly 4x slower than the same insert into the clustered columnstore. The reason is that the columnstore load process has been heavily optimized for bulk insert. From the Microsoft SIGMOD white paper on the subject:

Given the significant performance advantage of the bulk insert API over traditional insert statements, the query execution engine will internally use the bulk insert API for “insert into … select from …” statements that are targeting a column store. This operation is fairly frequent in ETL pipelines and this change allows it to create directly the highly efficient compressed format.

The page compression algorithm isn’t actually applied here, since I’m not doing an INSERT INTO ... WITH (TABLOCK), so these rows aren’t actually being page compressed as they are loaded; but even with only row compression being applied, the performance is worse than with the columnstore.
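For comparison, the bulk-optimized form referred to above would look something like the following sketch; whether you actually get a minimally logged, page-compressed-on-load path also depends on factors like the recovery model and whether the target table is empty.

-- TABLOCK allows the engine to take the bulk load code path in many cases
INSERT INTO dbo.page_test_load WITH (TABLOCK)
SELECT *
FROM dbo.bigTransaction_UC;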

I did do a test with the import/export wizard (which uses the bulk insert API), and my performance was far worse—464444 ms (or 7 min 44 sec).

So let’s take a look at the update on the page compressed table.

update page_test_load set actualcost = '$50.00'
where transactionDate > '01-Jan-2009' and transactionDate < '01-Jan-2012'

SQL Server parse and compile time:

CPU time = 579 ms, elapsed time = 809 ms.

SQL Server parse and compile time:

CPU time = 0 ms, elapsed time = 0 ms.

Table 'page_test_load'. Scan count 1, logical reads 247587, physical reads 1, read-ahead reads 42147, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

(8296817 row(s) affected)

(1 row(s) affected)

SQL Server Execution Times:

CPU time = 53781 ms, elapsed time = 57899 ms

Interestingly, the update was much faster (just under 4x)—that’s mainly because columnstore processes an update as both a delete and an insert.

The performance comparison is shown in the chart below.

What’s interesting about this data is that the proportion of CPU time to total time is roughly equivalent in each case, and that updating 8M rows in the columnstore takes roughly the same time as inserting 31M rows into the page compressed table, and vice versa.

So what does this tell us? Even though we can update columnstore indexes now, they are probably still best used for insert only workloads. Anything that requires frequent, quick updates isn’t a great candidate for a columnstore index.


 

PASS Summit Hybrid Availability Groups

Resources from presentation.

Also, to answer the question about cluster validation processing, I’m still waiting to hear an official answer from Microsoft, but here is what the MSDN documentation mentions:

Validation Warning: Select "No. I do not require support from Microsoft for this cluster, and therefore do not want to run the validation tests. When I click Next, continue creating the cluster."

from MSDN.

SQLPass Summit 2013 Keynote Day 1

Today I’m going to be live blogging the keynote at PASS Summit.

PASS President Bill Graziano started by talking about what a great job PASS has done with training in the last year, with programs such as SQL Saturday, Summit, Rally, and Chapters.

Quentin Clark (VP Data Platform, SQL Server) is talking about the Data Platform in SQL Server—brick and mortar compared to web vendors, then cloud solutions and how on-premises solutions can work symbiotically with them. He notes that today’s keynote is going to focus on things that are available now, not so much new tech going forward.

CTP2 of SQL 2014 is released—Quentin talks about how each employee needs to be empowered with information. Tracy Daugherty from Microsoft is on stage to demo in-memory. I have a good bit of knowledge around in-memory and columnar data stores; Oracle flat out blocks their columnar functionality unless you are running on their hardware, which is a really crappy move.

Microsoft is introducing a bunch of new features that sync with Azure—hybrid Availability Groups, and better backup integration. The coolest feature I’ve seen mentioned is native encrypted backups, so we can now encrypt backups without taking the performance hit of Transparent Data Encryption.

The city of Barcelona talks about their data management. Much mention of Hortonworks Hadoop, not so much mention of HDInsight. Demoing Power Query against demo Skype data sets—combining data from multiple sources.

Interesting that we have no release date for SQL Server 2014. More here tomorrow, same bat place, same bat channel.

Loading Data from SQL Server to Vertica

I’m in the midst of learning Vertica—and I’m trying to load some AdventureWorks data into tables. I came across this blog post from Doug Harmon (b) where he mentions using sqlcmd and a batch file to transfer data.

Create a Windows Batch file, XferData.bat, to transfer the data

sqlcmd -S <server/instance> -d <database> -u -s"|" -I -h-1 -k2 -Q "EXEC Export_Data " | ^
vsql -h <host> -d <database> -U <username> -w <password> -c "COPY public.DH_StageTable FROM LOCAL STDIN DELIMITER '|' ;"

The batch file can be called from a SQL Server Agent job using either CmdExec or PowerShell.

 

Unfortunately the comments for Doug’s blog seem to be broken, so I had to blog this myself. When I attempted to run that code as a batch file, Windows threw the following error:

 

C:\Temp>test_sql_Vertica.bat

 

C:\Temp>sqlcmd -S localhost -d AdventureWorks2012 -u -s"|" -I -h-1 -k2 -Q "EXEC Export_Data " |
' ' is not recognized as an internal or external command, operable program or batch file.

 

When I tried to run it in PowerShell, PowerShell wasn’t happy about the caret:

 

PS C:\temp> .\test_sql_Vertica.ps1
^ : The term '^' is not recognized as the name of a cmdlet, function, script file, or operable program. Check the
spelling of the name, or if a path was included, verify that the path is correct and try again.
At C:\temp\test_sql_Vertica.ps1:1 char:89
+ … Export_Data "| ^
+ ~
+ CategoryInfo : ObjectNotFound: (^:String) [], CommandNotFoundException
+ FullyQualifiedErrorId : CommandNotFoundException

 

So the solution?

 

sqlcmd -S localhost -d AdventureWorks2012 -u -s"|" -I -h-1 -k2 -Q "EXEC Export_Data "|
vsql -h 10.10.50.114 -d AdventureWorks_Practice -U dbadmin -w Anex1 -c "COPY public.Practice_StoredProc FROM LOCAL STDIN DELIMITER '|' ;"

 

PS C:\temp> .\test_sql_Vertica.ps1

Rows Loaded

————-

5

(1 row)

 

Don’t forget to add VSQL.exe to your path in Windows, as well.


 

Upcoming Presentations

In preparation for the upcoming PASS Summit, I will be presenting each of the sessions I will be doing at the Summit this week. The first will be Wednesday at the Philadelphia SQL Server Users Group, where I will be presenting my session on Hybrid Availability Groups.

Into the Blue: Extending AlwaysOn Availability Groups

For many organizations, having a second data center or co-location center doesn’t make sense, financially or logistically. Typically, this would limit options for building out a disaster recovery (DR) solution. However, now with Windows Azure virtual machines and SQL Server AlwaysOn Availability Groups, you can connect your on-premises solution to a real-time secondary replica, providing read scalability and a solid DR solution.

This session will demonstrate how to extend an Availability Group into Windows Azure, discussing the pros and cons as well as the cost of the solution. You will walk away with a solid understanding of AlwaysOn functionality within Windows Azure VMs, the costs and benefits of building a DR solution within Windows Azure, and how Azure-based backup and recovery can work.

I will also be presenting this at the PASS Summit in Room 217 D from 2:45-4 PM (1445-1600) on Friday 18 October.

 

On Friday, I will be presenting to the Albuquerque (NM) SQL Server User’s Group on my other PASS Summit topic:

Accelerate Database Performance Through Data Compression

Much like the cars of the 1970s sacrificed gas mileage for better performance, database technology has also made its share of sacrifices for efficiency. Fortunately, times have changed significantly since then. Just as adding a turbocharger to a car delivers more power while saving fuel, the addition of compression to a database accelerates read performance while saving disk space.

Come learn how, why, and when compression is the solution to your database performance problems. This session will discuss the basics of how compression and deduplication reduce your data volume. We’ll review the three different types of compression in SQL Server 2012, including the overhead and benefits of each and the situations for which each is appropriate, and examine the special type of compression used for ColumnStore indexes to help your data warehouse queries fly. As with turbo, data compression also has drawbacks, which we’ll cover as well.

I’ll be doing this at PASS Summit in Room 203A at 3-4:15 PM (1500-1615 for my non-US readers) on Wednesday 16 October.

I hope to see many of you either at these presentations or at Summit!!

 
