Where Should I Use a Columnstore Index in SQL Server?
April 7, 2014
I’ve been blogging a lot about columnstore indexes in SQL Server 2014, and for good reason: I think it is a great feature that adds a lot of value to SQL Server for analytic workloads. One of the other (and probably more talked about) features in SQL Server 2014 is In-Memory OLTP (formerly known as Hekaton). The SQL team added a neat new report, accessible to users of Management Data Warehouse (MDW), that shows which tables and procedures in a given database might be well suited for In-Memory OLTP. There is nothing like that yet for columnstore indexes.
Figure 1 SQL 2014 Management Data Warehouse In-Memory Tables Assistant
Microsoft’s official guidance is to use columnstores on large fact tables and dimensions greater than 1,000,000 rows. Typically (and this applies to regular page compression as well), you are going to look at tables that don’t have a very high update percentage as columnstore candidates. Despite the ability to update clustered columnstores, this feature remains largely focused on the append-only workloads present in data warehousing systems. The insert process has been designed around doing bulk inserts very efficiently, and not so much around singleton updates and inserts.
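As a minimal sketch of what following that guidance looks like, the statements below create a fact table and store it as a clustered columnstore index. The table `dbo.FactSales_Demo`, its columns, and the index name are hypothetical and exist only for illustration; they are not part of any real schema.

```sql
-- Hypothetical fact table, used only for illustration
CREATE TABLE dbo.FactSales_Demo
(
    SaleDateKey int NOT NULL,
    ProductKey int NOT NULL,
    Quantity int NOT NULL,
    SalesAmount decimal(18, 2) NOT NULL
);

-- Store the whole table as a clustered columnstore (SQL Server 2014 and later)
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales_Demo
    ON dbo.FactSales_Demo;
```

A table like this is best loaded in large batches rather than row by row, which matches the bulk insert design described above.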
So I wanted to put some logic together to identify candidate tables. The script below checks for tables with more than 1,000,000 rows and an update percentage below 20; both thresholds are easy to modify. The output is the list of candidate tables for columnstore indexes. Additionally, it won’t return any tables that already have a columnstore index on them.
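-- Tables with more than 5 and fewer than 1200 columns, more than 1,000,000 rows,
-- and no existing columnstore index
with widetables as
(
    select a.name, b.ColumnCount, c.Row_Count
    from sys.objects a
    inner join (select object_id, count(column_id) as ColumnCount
                from sys.columns group by object_id) b on b.object_id=a.object_id
    -- index_id 0 (heap) or 1 (clustered index) counts each row exactly once
    inner join (select object_id, sum(row_count) as Row_Count
                from sys.dm_db_partition_stats where index_id in (0, 1)
                group by object_id) c on c.object_id=a.object_id
    where a.type = 'U'
    and b.ColumnCount > 5 and b.ColumnCount < 1200
    and c.Row_Count > 1000000
    -- data_compression 3 and 4 are COLUMNSTORE and COLUMNSTORE_ARCHIVE
    and not exists (select 1 from sys.partitions d
                    where d.object_id=a.object_id and d.data_compression in (3, 4))
),
--Check Update %
--Checks for Percentage of Update Statements
updaterate as
(
    SELECT o.name AS [Table_Name]
    ,x.name AS [Index_Name]
    ,i.partition_number AS [Partition]
    ,i.index_id AS [Index_ID]
    ,x.type_desc AS [Index_Type]
    ,i.leaf_update_count * 100.0 /
    NULLIF(i.range_scan_count + i.leaf_insert_count + i.leaf_delete_count + i.leaf_update_count + i.leaf_page_merge_count + i.singleton_lookup_count, 0) AS [Percent_Update]
    FROM sys.dm_db_index_operational_stats(db_id(), NULL, NULL, NULL) i
    INNER JOIN sys.objects o ON o.object_id = i.object_id
    INNER JOIN sys.indexes x ON x.object_id = i.object_id
    AND x.index_id = i.index_id
    WHERE (i.range_scan_count + i.leaf_insert_count + i.leaf_delete_count + i.leaf_update_count + i.leaf_page_merge_count + i.singleton_lookup_count) <> 0
    AND objectproperty(i.object_id, 'IsUserTable') = 1
    -- NULLIF guards against a divide-by-zero on indexes with no recorded activity
    AND i.leaf_update_count * 100.0 /
    NULLIF(i.range_scan_count + i.leaf_insert_count + i.leaf_delete_count + i.leaf_update_count + i.leaf_page_merge_count + i.singleton_lookup_count, 0) < 20
)
select name as CandidateTables from widetables
intersect
select Table_Name from updaterate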
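To sanity-check the results, it can help to see which tables already have a columnstore index, since those should never show up as candidates. The query below is a small sketch of that check using the `type` column of `sys.indexes`; the column aliases are my own choice.

```sql
-- List user tables in the current database that already have a columnstore index
SELECT o.name AS Table_Name,
       x.name AS Index_Name,
       x.type_desc AS Index_Type
FROM sys.indexes x
INNER JOIN sys.objects o ON o.object_id = x.object_id
WHERE x.type IN (5, 6)   -- 5 = clustered columnstore, 6 = nonclustered columnstore
  AND o.type = 'U'
ORDER BY o.name;
```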
I hope this script is useful, and can provide some nice guidance as to where to use a columnstore index.