Table of Contents

Overview
Heaps (Tables without Clustered Indexes)
Clustered and Nonclustered Indexes Described
Create Clustered Indexes
Create Nonclustered Indexes
Create Unique Indexes
Create Filtered Indexes
Create Indexes with Included Columns
Delete an Index
Modify an Index
Move an Existing Index to a Different Filegroup
Indexes on Computed Columns
SORT_IN_TEMPDB Option For Indexes
Disable Indexes and Constraints
Enable Indexes and Constraints
Rename Indexes
Set Index Options
Disk Space Requirements for Index DDL Operations
Transaction Log Disk Space for Index Operations
Index Disk Space Example
Reorganize and Rebuild Indexes
Specify Fill Factor for an Index
Perform Index Operations Online
How Online Index Operations Work
Guidelines for Online Index Operations
Configure Parallel Index Operations
Index Properties F1 Help
Columnstore indexes
Overview
Architecture
Design guidance
Data loading guidance
What's new
Query performance
Real-time operational analytics
Data warehouse
Defragment
Indexes
3/24/2017 • 3 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2016), Azure SQL Database, Azure SQL Data Warehouse,
Parallel Data Warehouse
The following table lists the types of indexes available in SQL Server and provides links to additional information.

Index type: Hash
Description: With a hash index, data is accessed through an in-memory hash table. Hash indexes consume a fixed
amount of memory, which is a function of the bucket count.
Additional information: Guidelines for Using Indexes on Memory-Optimized Tables

Index type: Memory-optimized nonclustered indexes
Description: For memory-optimized nonclustered indexes, memory consumption is a function of the row count and
the size of the index key columns.
Additional information: Guidelines for Using Indexes on Memory-Optimized Tables

Index type: Clustered
Description: A clustered index sorts and stores the data rows of the table or view in order based on the clustered
index key. The clustered index is implemented as a B-tree index structure that supports fast retrieval of the rows,
based on their clustered index key values.
Additional information: Clustered and Nonclustered Indexes Described; Create Clustered Indexes

Index type: Nonclustered
Description: A nonclustered index can be defined on a table or view with a clustered index or on a heap. Each index
row in the nonclustered index contains the nonclustered key value and a row locator. This locator points to the data
row in the clustered index or heap having the key value. The rows in the index are stored in the order of the index
key values, but the data rows are not guaranteed to be in any particular order unless a clustered index is created on
the table.
Additional information: Clustered and Nonclustered Indexes Described; Create Nonclustered Indexes

Index type: Unique
Description: A unique index ensures that the index key contains no duplicate values and therefore every row in the
table or view is in some way unique. Uniqueness can be a property of both clustered and nonclustered indexes.
Additional information: Create Unique Indexes

Index type: Columnstore
Description: An in-memory columnstore index stores and manages data by using column-based data storage and
column-based query processing. Columnstore indexes work well for data warehousing workloads that primarily
perform bulk loads and read-only queries. Use the columnstore index to achieve up to 10x query performance gains
over traditional row-oriented storage, and up to 7x data compression over the uncompressed data size.
Additional information: Columnstore Indexes Guide; Using Nonclustered Columnstore Indexes

Index type: Index with included columns
Description: A nonclustered index that is extended to include nonkey columns in addition to the key columns.
Additional information: Create Indexes with Included Columns

Index type: Index on computed columns
Description: An index on a column that is derived from the value of one or more other columns, or certain
deterministic inputs.
Additional information: Indexes on Computed Columns

Index type: Filtered
Description: An optimized nonclustered index, especially suited to cover queries that select from a well-defined
subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed filtered index can
improve query performance, reduce index maintenance costs, and reduce index storage costs compared with
full-table indexes.
Additional information: Create Filtered Indexes

Index type: Spatial
Description: A spatial index provides the ability to perform certain operations more efficiently on spatial objects
(spatial data) in a column of the geometry data type. The spatial index reduces the number of objects on which
relatively costly spatial operations need to be applied.
Additional information: Spatial Indexes Overview

Index type: XML
Description: A shredded, and persisted, representation of the XML binary large objects (BLOBs) in the xml data
type column.
Additional information: XML Indexes (SQL Server)

Index type: Full-text
Description: A special type of token-based functional index that is built and maintained by the Microsoft Full-Text
Engine for SQL Server. It provides efficient support for sophisticated word searches in character string data.
Additional information: Populate Full-Text Indexes


Related Tasks
Related Content
SORT_IN_TEMPDB Option For Indexes
Disable Indexes and Constraints
Enable Indexes and Constraints
Rename Indexes
Set Index Options
Disk Space Requirements for Index DDL Operations
Reorganize and Rebuild Indexes
Specify Fill Factor for an Index
Pages and Extents Architecture Guide

See Also
Clustered and Nonclustered Indexes Described
Heaps (Tables without Clustered Indexes)
3/24/2017 • 4 min to read

A heap is a table without a clustered index. One or more nonclustered indexes can be created on tables stored as a
heap. Data is stored in the heap without specifying an order. Usually data is initially stored in the order in which
the rows are inserted into the table, but the Database Engine can move data around in the heap to store the rows
efficiently, so the data order cannot be predicted. To guarantee the order of rows returned from a heap, you must
use the ORDER BY clause. To specify the order for storage of the rows, create a clustered index on the table, so that
the table is not a heap.
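
The following is a minimal sketch of the behavior described above, not an excerpt from the product documentation. The table dbo.HeapDemo and its columns are hypothetical names chosen for illustration. It creates a table with no clustered index (a heap), shows that ORDER BY is what guarantees the order of the result set, and confirms the heap through sys.indexes.

```sql
-- Hypothetical demo table: no clustered index is declared, so it is stored as a heap.
CREATE TABLE dbo.HeapDemo
(
    OrderID  int          NOT NULL,
    Customer nvarchar(50) NOT NULL
);
GO
INSERT INTO dbo.HeapDemo (OrderID, Customer)
VALUES (3, N'Contoso'), (1, N'Fabrikam'), (2, N'Adventure Works');
GO
-- Without ORDER BY, the order of the returned rows is not guaranteed.
SELECT OrderID, Customer FROM dbo.HeapDemo;
-- With ORDER BY, the result set is returned sorted by OrderID.
SELECT OrderID, Customer FROM dbo.HeapDemo ORDER BY OrderID;
GO
-- A heap appears in sys.indexes with index_id = 0 and type_desc = 'HEAP'.
SELECT name, index_id, type_desc
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.HeapDemo');
GO
DROP TABLE dbo.HeapDemo;
GO
```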

NOTE
There are sometimes good reasons to leave a table as a heap instead of creating a clustered index, but using heaps effectively
is an advanced skill. Most tables should have a carefully chosen clustered index unless a good reason exists for leaving the
table as a heap.

When to Use a Heap


If a table is a heap and does not have any nonclustered indexes, then the entire table must be examined (a table
scan) to find any row. This can be acceptable when the table is tiny, such as a list of the 12 regional offices of a
company.
When a table is stored as a heap, individual rows are identified by reference to a row identifier (RID) consisting of
the file number, data page number, and slot on the page. The RID is a small and efficient structure. Sometimes
data architects use heaps when data is always accessed through nonclustered indexes and the RID is smaller than a
clustered index key.

When Not to Use a Heap


Do not use a heap when the data is frequently returned in a sorted order. A clustered index on the sorting column
could avoid the sorting operation.
Do not use a heap when the data is frequently grouped together. Data must be sorted before it is grouped, and a
clustered index on the sorting column could avoid the sorting operation.
Do not use a heap when ranges of data are frequently queried from the table. A clustered index on the range
column will avoid scanning the entire heap.
Do not use a heap when there are no nonclustered indexes and the table is large. In a heap, all rows of the heap
must be read to find any row.

Managing Heaps
To create a heap, create a table without a clustered index. If a table already has a clustered index, drop the clustered
index to return the table to a heap.
To remove a heap, create a clustered index on the heap.
To rebuild a heap to reclaim wasted space, create a clustered index on the heap, and then drop that clustered index.

WARNING
Creating or dropping clustered indexes requires rewriting the entire table. If the table has nonclustered indexes, all
of them must be recreated whenever the clustered index is changed. Therefore, changing from a heap to a
clustered index structure or back can take a lot of time and require disk space for reordering data in tempdb.
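
As a rough sketch of the heap management steps above, the script below creates a heap, converts it to a clustered table, and then returns it to a heap. The table and index names are hypothetical. The final statement, ALTER TABLE ... REBUILD, is not mentioned in this topic; it is shown only as an assumption-labeled alternative way to rebuild a heap in place.

```sql
-- Hypothetical demo table created as a heap.
CREATE TABLE dbo.HeapRebuildDemo
(
    ID      int       NOT NULL,
    Payload char(200) NOT NULL
);
GO
-- Remove the heap: creating a clustered index rewrites the table in key order.
CREATE CLUSTERED INDEX CIX_HeapRebuildDemo_ID
    ON dbo.HeapRebuildDemo (ID);
GO
-- Return to a heap: dropping the clustered index rewrites the table again.
-- Together with the previous step, this rebuilds the heap and reclaims space.
DROP INDEX CIX_HeapRebuildDemo_ID ON dbo.HeapRebuildDemo;
GO
-- Alternative not covered in the text above (assumption: SQL Server 2008 or later):
-- rebuild the heap directly without creating a temporary clustered index.
ALTER TABLE dbo.HeapRebuildDemo REBUILD;
GO
DROP TABLE dbo.HeapRebuildDemo;
GO
```

On a large table, each of these steps rewrites all rows, so the cost described in the warning applies to every statement that changes the table structure.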

Heap Structures
A heap is a table without a clustered index. Heaps have one row in sys.partitions, with index_id = 0 for each
partition used by the heap. By default, a heap has a single partition. When a heap has multiple partitions, each
partition has a heap structure that contains the data for that specific partition. For example, if a heap has four
partitions, there are four heap structures; one in each partition.
Depending on the data types in the heap, each heap structure will have one or more allocation units to store and
manage the data for a specific partition. At a minimum, each heap will have one IN_ROW_DATA allocation unit per
partition. The heap will also have one LOB_DATA allocation unit per partition, if it contains large object (LOB)
columns. It will also have one ROW_OVERFLOW_DATA allocation unit per partition, if it contains variable length columns
that exceed the 8,060 byte row size limit.
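
To make the partition and allocation unit layout more concrete, here is a hedged sketch that inspects the documented catalog views sys.partitions and sys.allocation_units for a hypothetical heap. The table name and column definitions are assumptions chosen so that the table can own all three allocation unit types; the exact rows returned may vary by version and by whether the table holds data.

```sql
-- Hypothetical heap with a LOB column and wide variable-length columns.
CREATE TABLE dbo.HeapAllocDemo
(
    ID    int           NOT NULL,
    Notes varchar(max)  NULL,  -- LOB column: can use a LOB_DATA allocation unit
    Wide1 varchar(5000) NULL,  -- together these can exceed 8,060 bytes per row,
    Wide2 varchar(5000) NULL   -- which requires ROW_OVERFLOW_DATA
);
GO
-- One row per partition and allocation unit of the heap (index_id = 0).
-- container_id maps to hobt_id for IN_ROW_DATA (1) and ROW_OVERFLOW_DATA (3),
-- and to partition_id for LOB_DATA (2).
SELECT p.partition_number,
       p.index_id,
       au.type_desc,
       au.total_pages
FROM sys.partitions AS p
JOIN sys.allocation_units AS au
    ON (au.type IN (1, 3) AND au.container_id = p.hobt_id)
    OR (au.type = 2 AND au.container_id = p.partition_id)
WHERE p.object_id = OBJECT_ID(N'dbo.HeapAllocDemo')
  AND p.index_id = 0
ORDER BY p.partition_number, au.type;
GO
DROP TABLE dbo.HeapAllocDemo;
GO
```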
The column first_iam_page in the sys.system_internals_allocation_units system view points to the first IAM page
in the chain of IAM pages that manage the space allocated to the heap in a specific partition. SQL Server uses the
IAM pages to move through the heap. The data pages and the rows within them are not in any specific order and
are not linked. The only logical connection between data pages is the information recorded in the IAM pages.

IMPORTANT
The sys.system_internals_allocation_units system view is reserved for Microsoft SQL Server internal use only. Future
compatibility is not guaranteed.

Table scans or serial reads of a heap can be performed by scanning the IAM pages to find the extents that are
holding pages for the heap. Because the IAM represents extents in the same order that they exist in the data files,
this means that serial heap scans progress sequentially through each file. Using the IAM pages to set the scan
sequence also means that rows from the heap are not typically returned in the order in which they were inserted.
The following illustration shows how the SQL Server Database Engine uses IAM pages to retrieve data rows in a
single partition heap.

Related Content
CREATE INDEX (Transact-SQL)
DROP INDEX (Transact-SQL)
Clustered and Nonclustered Indexes Described
Clustered and Nonclustered Indexes Described
3/24/2017 • 3 min to read

An index is an on-disk structure associated with a table or view that speeds retrieval of rows from the table or
view. An index contains keys built from one or more columns in the table or view. These keys are stored in a
structure (B-tree) that enables SQL Server to find the row or rows associated with the key values quickly and
efficiently.
A table or view can contain the following types of indexes:
Clustered
Clustered indexes sort and store the data rows in the table or view based on their key values. These
are the columns included in the index definition. There can be only one clustered index per table,
because the data rows themselves can be sorted in only one order.
The only time the data rows in a table are stored in sorted order is when the table contains a
clustered index. When a table has a clustered index, the table is called a clustered table. If a table has
no clustered index, its data rows are stored in an unordered structure called a heap.
Nonclustered
Nonclustered indexes have a structure separate from the data rows. A nonclustered index contains
the nonclustered index key values and each key value entry has a pointer to the data row that
contains the key value.
The pointer from an index row in a nonclustered index to a data row is called a row locator. The
structure of the row locator depends on whether the data pages are stored in a heap or a clustered
table. For a heap, a row locator is a pointer to the row. For a clustered table, the row locator is the
clustered index key.
You can add nonkey columns to the leaf level of the nonclustered index to bypass the existing index key
limits (1,700 bytes and 32 key columns for a nonclustered index in SQL Server 2016; 900 bytes and 16 key
columns in earlier versions) and to execute fully covered, indexed queries. For more
information, see Create Indexes with Included Columns.
Both clustered and nonclustered indexes can be unique. This means no two rows can have the same value
for the index key. Otherwise, the index is not unique and multiple rows can share the same key value. For
more information, see Create Unique Indexes.
Indexes are automatically maintained for a table or view whenever the table data is modified.
See Indexes for additional types of special purpose indexes.
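
The distinction between the two index types can be seen in the catalog. The sketch below uses a hypothetical table, dbo.IndexTypesDemo, creates one clustered and one nonclustered index on it, and lists them through sys.indexes. All object names are illustrative assumptions.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.IndexTypesDemo
(
    CustomerID int          NOT NULL,
    LastName   nvarchar(50) NOT NULL,
    City       nvarchar(50) NULL
);
GO
-- Only one clustered index is allowed: it defines the order of the data rows.
CREATE CLUSTERED INDEX CIX_IndexTypesDemo_CustomerID
    ON dbo.IndexTypesDemo (CustomerID);
-- A nonclustered index is a separate structure; because the table is clustered,
-- its row locator is the clustered index key (CustomerID).
CREATE NONCLUSTERED INDEX IX_IndexTypesDemo_LastName
    ON dbo.IndexTypesDemo (LastName);
GO
-- index_id = 1 is the clustered index; values greater than 1 are nonclustered.
SELECT name, index_id, type_desc, is_unique
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.IndexTypesDemo')
ORDER BY index_id;
GO
DROP TABLE dbo.IndexTypesDemo;
GO
```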

Indexes and Constraints


Indexes are automatically created when PRIMARY KEY and UNIQUE constraints are defined on table columns. For
example, when you create a table and identify a particular column to be the primary key, the Database Engine
automatically creates a PRIMARY KEY constraint and index on that column. For more information, see Create
Primary Keys and Create Unique Constraints.
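
A short sketch of this behavior, using hypothetical table and constraint names: defining PRIMARY KEY and UNIQUE constraints creates the supporting indexes automatically, and sys.indexes shows which index backs which constraint.

```sql
-- Hypothetical table with a PRIMARY KEY and a UNIQUE constraint.
CREATE TABLE dbo.ConstraintIndexDemo
(
    EmployeeID  int NOT NULL
        CONSTRAINT PK_ConstraintIndexDemo PRIMARY KEY,
    BadgeNumber int NOT NULL
        CONSTRAINT UQ_ConstraintIndexDemo_Badge UNIQUE
);
GO
-- No CREATE INDEX statement was issued, yet two indexes exist.
-- Each index is named after the constraint that created it.
SELECT name, type_desc, is_primary_key, is_unique_constraint
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.ConstraintIndexDemo');
GO
DROP TABLE dbo.ConstraintIndexDemo;
GO
```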

How Indexes Are Used by the Query Optimizer


Well-designed indexes can reduce disk I/O operations and consume fewer system resources, thereby improving
query performance. Indexes can be helpful for a variety of queries that contain SELECT, UPDATE, DELETE, or
MERGE statements. Consider the query
SELECT JobTitle, HireDate FROM HumanResources.Employee WHERE BusinessEntityID = 250 in the AdventureWorks2012
database. When this query is executed, the query optimizer evaluates each available method for retrieving the data
and selects the most efficient method. The method may be a table scan, or it may be a scan of one or more indexes
if they exist.
When performing a table scan, the query optimizer reads all the rows in the table, and extracts the rows that meet
the criteria of the query. A table scan generates many disk I/O operations and can be resource intensive. However,
a table scan could be the most efficient method if, for example, the result set of the query is a high percentage of
rows from the table.
When the query optimizer uses an index, it searches the index key columns, finds the storage location of the rows
needed by the query and extracts the matching rows from that location. Generally, searching the index is much
faster than searching the table because unlike a table, an index frequently contains very few columns per row and
the rows are in sorted order.
The query optimizer typically selects the most efficient method when executing queries. However, if no indexes are
available, the query optimizer must use a table scan. Your task is to design and create indexes that are best suited
to your environment so that the query optimizer has a selection of efficient indexes from which to select. SQL
Server provides the Database Engine Tuning Advisor to help with the analysis of your database environment and
in the selection of appropriate indexes.
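
One way to observe the difference between a table scan and an index lookup is to compare the logical reads reported by SET STATISTICS IO before and after an index is created. The script below is an illustrative sketch on a hypothetical table, dbo.OptimizerDemo, filled with generated rows; it does not use AdventureWorks2012, and the plan the optimizer actually chooses may differ on your system.

```sql
-- Hypothetical demo table, initially a heap with no indexes.
CREATE TABLE dbo.OptimizerDemo
(
    ID       int          NOT NULL,
    HireDate date         NOT NULL,
    Title    nvarchar(50) NOT NULL
);
GO
-- Generate 10,000 rows from a numbers CTE built on a system catalog view.
WITH n AS
(
    SELECT TOP (10000)
           CAST(ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS int) AS i
    FROM sys.all_objects AS a
    CROSS JOIN sys.all_objects AS b
)
INSERT INTO dbo.OptimizerDemo (ID, HireDate, Title)
SELECT i,
       DATEADD(DAY, i % 3650, CAST('20000101' AS date)),
       N'Title ' + CAST(i AS nvarchar(10))
FROM n;
GO
SET STATISTICS IO ON;
GO
-- No index is available, so every page of the heap must be read.
SELECT Title, HireDate FROM dbo.OptimizerDemo WHERE ID = 250;
GO
CREATE UNIQUE CLUSTERED INDEX CIX_OptimizerDemo_ID
    ON dbo.OptimizerDemo (ID);
GO
-- With the index in place, the optimizer can seek on ID;
-- compare the logical reads in the Messages output.
SELECT Title, HireDate FROM dbo.OptimizerDemo WHERE ID = 250;
GO
SET STATISTICS IO OFF;
GO
DROP TABLE dbo.OptimizerDemo;
GO
```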

Related Tasks
Create Clustered Indexes
Create Nonclustered Indexes
Create Clustered Indexes
3/24/2017 • 4 min to read

You can create clustered indexes on tables by using SQL Server Management Studio or Transact-SQL. With few
exceptions, every table should have a clustered index. Besides improving query performance, a clustered index can
be rebuilt or reorganized on demand to control table fragmentation. A clustered index can also be created on a
view. (Clustered indexes are defined in the topic Clustered and Nonclustered Indexes Described.)
In This Topic
Before you begin:
Typical Implementations
Limitations and Restrictions
Security
To create a clustered index on a table, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Typical Implementations
Clustered indexes are implemented in the following ways:
PRIMARY KEY and UNIQUE constraints
When you create a PRIMARY KEY constraint, a unique clustered index on the column or columns is
automatically created if a clustered index on the table does not already exist and you do not specify a unique
nonclustered index. The primary key column cannot allow NULL values.
When you create a UNIQUE constraint, a unique nonclustered index is created to enforce a UNIQUE
constraint by default. You can specify a unique clustered index if a clustered index on the table does not
already exist.
An index created as part of the constraint is automatically given the same name as the constraint name. For
more information, see Primary and Foreign Key Constraints and Unique Constraints and Check Constraints.
Index independent of a constraint
You can create a clustered index on a column other than primary key column if a nonclustered primary key
constraint was specified.
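
The last case can be sketched as follows, using hypothetical names: the primary key is declared NONCLUSTERED, which leaves the clustered index free to be created on a different column.

```sql
-- Hypothetical table: the PRIMARY KEY is enforced by a nonclustered index.
CREATE TABLE dbo.OrdersDemo
(
    OrderID   int  NOT NULL
        CONSTRAINT PK_OrdersDemo PRIMARY KEY NONCLUSTERED,
    OrderDate date NOT NULL
);
GO
-- The clustered index is created independently of any constraint,
-- on a column other than the primary key column.
CREATE CLUSTERED INDEX CIX_OrdersDemo_OrderDate
    ON dbo.OrdersDemo (OrderDate);
GO
-- Both indexes are visible in the catalog.
SELECT name, type_desc, is_primary_key
FROM sys.indexes
WHERE object_id = OBJECT_ID(N'dbo.OrdersDemo');
GO
DROP TABLE dbo.OrdersDemo;
GO
```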
Limitations and Restrictions
When a clustered index structure is created, disk space for both the old (source) and new (target) structures
is required in their respective files and filegroups. The old structure is not deallocated until the complete
transaction commits. Additional temporary disk space for sorting may also be required. For more
information, see Disk Space Requirements for Index DDL Operations.
If a clustered index is created on a heap with several existing nonclustered indexes, all the nonclustered
indexes must be rebuilt so that they contain the clustering key value instead of the row identifier (RID).
Similarly, if a clustered index is dropped on a table that has several nonclustered indexes, the nonclustered
indexes are all rebuilt as part of the DROP operation. This may take significant time on large tables.
The preferred way to build indexes on large tables is to start with the clustered index and then build any
nonclustered indexes. Consider setting the ONLINE option to ON when you create indexes on existing
tables. When set to ON, long-term table locks are not held. This enables queries or updates to the
underlying table to continue. For more information, see Perform Index Operations Online.
The index key of a clustered index cannot contain varchar columns that have existing data in the
ROW_OVERFLOW_DATA allocation unit. If a clustered index is created on a varchar column and the existing
data is in the IN_ROW_DATA allocation unit, subsequent insert or update actions on the column that would
push the data off-row will fail. To obtain information about tables that might contain row-overflow data, use
the sys.dm_db_index_physical_stats (Transact-SQL) dynamic management function.
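
The following sketch combines two of the points above under stated assumptions. It first checks a hypothetical table for row-overflow data by using sys.dm_db_index_physical_stats, then builds the clustered index before the nonclustered index with ONLINE = ON. Online index operations are not available in every edition of SQL Server; if the statement fails on your edition, remove the WITH clause. The DETAILED mode reads all pages and can be slow on large tables.

```sql
-- Hypothetical table whose rows can exceed 8,060 bytes.
CREATE TABLE dbo.LargeTableDemo
(
    ID    int           NOT NULL,
    Wide1 varchar(5000) NULL,
    Wide2 varchar(5000) NULL
);
GO
-- This row is too wide to fit in-row, so part of it is stored off-row.
INSERT INTO dbo.LargeTableDemo (ID, Wide1, Wide2)
VALUES (1, REPLICATE('a', 5000), REPLICATE('b', 5000));
GO
-- List allocation units that hold row-overflow data for this table.
SELECT index_id, alloc_unit_type_desc, page_count, record_count
FROM sys.dm_db_index_physical_stats
     (DB_ID(), OBJECT_ID(N'dbo.LargeTableDemo'), NULL, NULL, 'DETAILED')
WHERE alloc_unit_type_desc = N'ROW_OVERFLOW_DATA';
GO
-- Build the clustered index first (on the int column, not a varchar column),
-- then the nonclustered index. ONLINE = ON avoids long-term table locks
-- (assumption: the edition in use supports online index operations).
CREATE CLUSTERED INDEX CIX_LargeTableDemo_ID
    ON dbo.LargeTableDemo (ID)
    WITH (ONLINE = ON);
GO
CREATE NONCLUSTERED INDEX IX_LargeTableDemo_ID_Desc
    ON dbo.LargeTableDemo (ID DESC);
GO
DROP TABLE dbo.LargeTableDemo;
GO
```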
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin or db_owner fixed database roles.

Using SQL Server Management Studio


To create a clustered index by using Object Explorer
1. In Object Explorer, expand the table on which you want to create a clustered index.
2. Right-click the Indexes folder, point to New Index, and select Clustered Index….
3. In the New Index dialog box, on the General page, enter the name of the new index in the Index name
box.
4. Under Index key columns, click Add….
5. In the Select Columns from table_name dialog box, select the check box of the table column to be added to
the clustered index.
6. Click OK.
7. In the New Index dialog box, click OK.
To create a clustered index by using the Table Designer
1. In Object Explorer, expand the database on which you want to create a table with a clustered index.
2. Right-click the Tables folder and click New Table….
3. Create a new table as you normally would. For more information, see Create Tables (Database Engine).
4. Right-click the new table created above and click Design.
5. On the Table Designer menu, click Indexes/Keys.
6. In the Indexes/Keys dialog box, click Add.
7. Select the new index in the Selected Primary/Unique Key or Index text box.
8. In the grid, select Create as Clustered, and choose Yes from the drop-down list to the right of the property.
9. Click Close.
10. On the File menu, click Save table_name.
Using Transact-SQL
To create a clustered index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Create a new table with three columns.
CREATE TABLE dbo.TestTable
(TestCol1 int NOT NULL,
TestCol2 nchar(10) NULL,
TestCol3 nvarchar(50) NULL);
GO
-- Create a clustered index called IX_TestTable_TestCol1
-- on the dbo.TestTable table using the TestCol1 column.
CREATE CLUSTERED INDEX IX_TestTable_TestCol1
ON dbo.TestTable (TestCol1);
GO

For more information, see CREATE INDEX (Transact-SQL).

See Also
Create Primary Keys
Create Unique Constraints
Create Nonclustered Indexes
3/24/2017 • 3 min to read

You can create nonclustered indexes in SQL Server 2016 by using SQL Server Management Studio or Transact-
SQL. A nonclustered index is an index structure separate from the data stored in a table that reorders one or more
selected columns. Nonclustered indexes can often help you find data more quickly than searching the underlying
table; queries can sometimes be answered entirely by the data in the nonclustered index, or the nonclustered index
can point the Database Engine to the rows in the underlying table. Generally, nonclustered indexes are created to
improve the performance of frequently used queries not covered by the clustered index or to locate rows in a table
without a clustered index (called a heap). You can create multiple nonclustered indexes on a table or indexed view.
In This Topic
Before you begin:
Typical Implementations
Security
To create a nonclustered index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Typical Implementations
Nonclustered indexes are implemented in the following ways:
UNIQUE constraints
When you create a UNIQUE constraint, a unique nonclustered index is created to enforce a UNIQUE
constraint by default. You can specify a unique clustered index if a clustered index on the table does not
already exist. For more information, see Unique Constraints and Check Constraints.
Index independent of a constraint
By default, a nonclustered index is created if CLUSTERED is not specified. The maximum number of
nonclustered indexes that can be created per table is 999. This includes any indexes created by PRIMARY
KEY or UNIQUE constraints, but does not include XML indexes.
Nonclustered index on an indexed view
After a unique clustered index has been created on a view, nonclustered indexes can be created. For more
information, see Create Indexed Views.
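
A hedged sketch of the indexed view case, with hypothetical table, view, and index names: the view must be schema-bound and must have a unique clustered index before any nonclustered index can be added. The script assumes the session SET options required for indexed views (for example, ANSI_NULLS and QUOTED_IDENTIFIER set to ON), which are the defaults in SQL Server Management Studio.

```sql
-- Hypothetical base table.
CREATE TABLE dbo.SalesDemo
(
    SaleID    int   NOT NULL,
    ProductID int   NOT NULL,
    Amount    money NOT NULL
);
GO
-- The view must be created WITH SCHEMABINDING and use two-part object names.
CREATE VIEW dbo.vSalesDemo
WITH SCHEMABINDING
AS
SELECT SaleID, ProductID, Amount
FROM dbo.SalesDemo;
GO
-- Step 1: a unique clustered index materializes the view.
CREATE UNIQUE CLUSTERED INDEX CIX_vSalesDemo_SaleID
    ON dbo.vSalesDemo (SaleID);
GO
-- Step 2: nonclustered indexes can now be created on the view.
CREATE NONCLUSTERED INDEX IX_vSalesDemo_ProductID
    ON dbo.vSalesDemo (ProductID);
GO
DROP VIEW dbo.vSalesDemo;
DROP TABLE dbo.SalesDemo;
GO
```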
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin or db_owner fixed database roles.

Using SQL Server Management Studio


To create a nonclustered index by using the Table Designer
1. In Object Explorer, expand the database that contains the table on which you want to create a nonclustered
index.
2. Expand the Tables folder.
3. Right-click the table on which you want to create a nonclustered index and select Design.
4. On the Table Designer menu, click Indexes/Keys.
5. In the Indexes/Keys dialog box, click Add.
6. Select the new index in the Selected Primary/Unique Key or Index text box.
7. In the grid, select Create as Clustered, and choose No from the drop-down list to the right of the property.
8. Click Close.
9. On the File menu, click Save table_name.
To create a nonclustered index by using Object Explorer
1. In Object Explorer, expand the database that contains the table on which you want to create a nonclustered
index.
2. Expand the Tables folder.
3. Expand the table on which you want to create a nonclustered index.
4. Right-click the Indexes folder, point to New Index, and select Non-Clustered Index….
5. In the New Index dialog box, on the General page, enter the name of the new index in the Index name
box.
6. Under Index key columns, click Add….
7. In the Select Columns from table_name dialog box, select the check box or check boxes of the table
column or columns to be added to the nonclustered index.
8. Click OK.
9. In the New Index dialog box, click OK.

Using Transact-SQL
To create a nonclustered index on a table
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Find an existing index named IX_ProductVendor_VendorID on the
-- Purchasing.ProductVendor table and delete it if found. Index names are
-- unique only within a table, so the check is scoped to this table.
IF EXISTS (SELECT name FROM sys.indexes
           WHERE name = N'IX_ProductVendor_VendorID'
             AND object_id = OBJECT_ID(N'Purchasing.ProductVendor'))
    DROP INDEX IX_ProductVendor_VendorID ON Purchasing.ProductVendor;
GO
-- Create a nonclustered index called IX_ProductVendor_VendorID
-- on the Purchasing.ProductVendor table using the BusinessEntityID column.
CREATE NONCLUSTERED INDEX IX_ProductVendor_VendorID
    ON Purchasing.ProductVendor (BusinessEntityID);
GO

For more information, see CREATE INDEX (Transact-SQL).


Create Unique Indexes
3/24/2017 • 5 min to read

This topic describes how to create a unique index on a table in SQL Server 2016 by using SQL Server Management
Studio or Transact-SQL. A unique index guarantees that the index key contains no duplicate values and therefore
every row in the table is in some way unique. There are no significant differences between creating a UNIQUE
constraint and creating a unique index that is independent of a constraint. Data validation occurs in the same
manner, and the query optimizer does not differentiate between a unique index created by a constraint and one
created manually. However, creating a UNIQUE constraint on the column makes the objective of the index clear.
For more information on UNIQUE constraints, see Unique Constraints and Check Constraints.
When you create a unique index, you can set an option to ignore duplicate keys. If this option is set to Yes and you
attempt to create duplicate keys by adding data that affects multiple rows (with the INSERT statement), the row
containing a duplicate is not added. If it is set to No, the entire insert operation fails and all the data is rolled back.
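
In Transact-SQL, the option described above corresponds to IGNORE_DUP_KEY. The sketch below uses a hypothetical table, dbo.DupKeyDemo, to show both settings: with the option ON, only the duplicate row of a multirow insert is discarded; with it OFF, the whole statement fails.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.DupKeyDemo (Code int NOT NULL);
GO
-- Unique index that ignores duplicate keys instead of failing the statement.
CREATE UNIQUE INDEX AK_DupKeyDemo_Code
    ON dbo.DupKeyDemo (Code)
    WITH (IGNORE_DUP_KEY = ON);
GO
INSERT INTO dbo.DupKeyDemo (Code) VALUES (1);
-- The value 1 already exists: that row is discarded with a warning,
-- while 2 and 3 are inserted.
INSERT INTO dbo.DupKeyDemo (Code) VALUES (1), (2), (3);
SELECT Code FROM dbo.DupKeyDemo ORDER BY Code;
GO
-- Rebuild the index with the option turned off.
ALTER INDEX AK_DupKeyDemo_Code ON dbo.DupKeyDemo
    REBUILD WITH (IGNORE_DUP_KEY = OFF);
GO
-- Now the duplicate value 3 makes the entire statement fail,
-- so 4 is not inserted either.
BEGIN TRY
    INSERT INTO dbo.DupKeyDemo (Code) VALUES (3), (4);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
GO
DROP TABLE dbo.DupKeyDemo;
GO
```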

NOTE
You cannot create a unique index on a single column if that column contains NULL in more than one row. Similarly, you
cannot create a unique index on multiple columns if the combination of columns contains NULL in more than one row. These
are treated as duplicate values for indexing purposes.

In This Topic
Before you begin:
Benefits of a Unique Index
Typical Implementations
Limitations and Restrictions
Security
To create a unique index on a table, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Benefits of a Unique Index
Multicolumn unique indexes guarantee that each combination of values in the index key is unique. For
example, if a unique index is created on a combination of LastName, FirstName, and MiddleName
columns, no two rows in the table could have the same combination of values for these columns.
Provided that the data in each column is unique, you can create both a unique clustered index and multiple
unique nonclustered indexes on the same table.
Unique indexes ensure the data integrity of the defined columns.
Unique indexes provide additional information helpful to the query optimizer that can produce more
efficient execution plans.
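
The multicolumn case in the first benefit can be sketched as follows. The table dbo.PersonDemo and the index name are hypothetical; the column names follow the example in the text.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.PersonDemo
(
    LastName   nvarchar(50) NOT NULL,
    FirstName  nvarchar(50) NOT NULL,
    MiddleName nvarchar(50) NOT NULL
);
GO
-- Uniqueness applies to the combination of the three columns.
CREATE UNIQUE NONCLUSTERED INDEX AK_PersonDemo_FullName
    ON dbo.PersonDemo (LastName, FirstName, MiddleName);
GO
-- Allowed: the two rows differ in MiddleName.
INSERT INTO dbo.PersonDemo (LastName, FirstName, MiddleName)
VALUES (N'Smith', N'Ann', N'Lee'),
       (N'Smith', N'Ann', N'Marie');
GO
-- Rejected: this combination of values already exists.
BEGIN TRY
    INSERT INTO dbo.PersonDemo (LastName, FirstName, MiddleName)
    VALUES (N'Smith', N'Ann', N'Lee');
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
GO
DROP TABLE dbo.PersonDemo;
GO
```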
Typical Implementations
Unique indexes are implemented in the following ways:
PRIMARY KEY or UNIQUE constraint
When you create a PRIMARY KEY constraint, a unique clustered index on the column or columns is
automatically created if a clustered index on the table does not already exist and you do not specify a unique
nonclustered index. The primary key column cannot allow NULL values.
When you create a UNIQUE constraint, a unique nonclustered index is created to enforce a UNIQUE
constraint by default. You can specify a unique clustered index if a clustered index on the table does not
already exist.
For more information, see Unique Constraints and Check Constraints and Primary and Foreign Key
Constraints.
Index independent of a constraint
Multiple unique nonclustered indexes can be defined on a table.
For more information, see CREATE INDEX (Transact-SQL).
Indexed view
To create an indexed view, a unique clustered index is defined on one or more view columns. The view is
executed and the result set is stored in the leaf level of the index in the same way table data is stored in a
clustered index. For more information, see Create Indexed Views.
Limitations and Restrictions
A unique index, UNIQUE constraint, or PRIMARY KEY constraint cannot be created if duplicate key values
exist in the data.
A unique nonclustered index can contain included nonkey columns. For more information, see Create
Indexes with Included Columns.
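
As an illustration of the second point, the sketch below creates a unique nonclustered index with included nonkey columns. The table, columns, and index name are hypothetical. Uniqueness is enforced on the key column only; the included columns are stored at the leaf level so that the sample query can be answered from the index alone.

```sql
-- Hypothetical demo table.
CREATE TABLE dbo.ProductDemo
(
    ProductID int           NOT NULL,
    SKU       nvarchar(20)  NOT NULL,
    Name      nvarchar(100) NOT NULL,
    ListPrice money         NOT NULL
);
GO
-- SKU is the unique key; Name and ListPrice are nonkey included columns.
CREATE UNIQUE NONCLUSTERED INDEX AK_ProductDemo_SKU
    ON dbo.ProductDemo (SKU)
    INCLUDE (Name, ListPrice);
GO
-- All referenced columns are in the index, so the index covers this query.
SELECT SKU, Name, ListPrice
FROM dbo.ProductDemo
WHERE SKU = N'BK-1001';
GO
DROP TABLE dbo.ProductDemo;
GO
```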
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin or db_owner fixed database roles.

Using SQL Server Management Studio


To create a unique index by using the Table Designer
1. In Object Explorer, expand the database that contains the table on which you want to create a unique index.
2. Expand the Tables folder.
3. Right-click the table on which you want to create a unique index and select Design.
4. On the Table Designer menu, select Indexes/Keys.
5. In the Indexes/Keys dialog box, click Add.
6. Select the new index in the Selected Primary/Unique Key or Index text box.
7. In the main grid, under (General), select Type and then choose Index from the list.
8. Select Columns, and then click the ellipsis (…).
9. In the Index Columns dialog box, under Column Name, select the columns you want to index. You can
select up to 16 columns. For optimal performance, select only one or two columns per index. For each
column you select, indicate whether the index arranges values of this column in ascending or descending
order.
10. When all columns for the index are selected, click OK.
11. In the grid, under (General), select Is Unique and then choose Yes from the list.
12. Optional: In the main grid, under Table Designer, select Ignore Duplicate Keys and then choose Yes from
the list. Do this if you want to ignore attempts to add data that would create a duplicate key in the unique
index.
13. Click Close.
14. On the File menu, click Save table_name.
To create a unique index by using Object Explorer
1. In Object Explorer, expand the database that contains the table on which you want to create a unique index.
2. Expand the Tables folder.
3. Expand the table on which you want to create a unique index.
4. Right-click the Indexes folder, point to New Index, and select Non-Clustered Index….
5. In the New Index dialog box, on the General page, enter the name of the new index in the Index name
box.
6. Select the Unique check box.
7. Under Index key columns, click Add….
8. In the Select Columns from table_name dialog box, select the check box or check boxes of the table
column or columns to be added to the unique index.
9. Click OK.
10. In the New Index dialog box, click OK.

Using Transact-SQL
To create a unique index on a table
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Find an existing index named AK_UnitMeasure_Name on Production.UnitMeasure and delete it if found
IF EXISTS (SELECT name FROM sys.indexes
WHERE name = N'AK_UnitMeasure_Name'
AND object_id = OBJECT_ID(N'Production.UnitMeasure'))
DROP INDEX AK_UnitMeasure_Name ON Production.UnitMeasure;
GO
-- Create a unique index called AK_UnitMeasure_Name
-- on the Production.UnitMeasure table using the Name column.
CREATE UNIQUE INDEX AK_UnitMeasure_Name
ON Production.UnitMeasure (Name);
GO
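The Ignore Duplicate Keys setting described in the Table Designer steps corresponds to the IGNORE_DUP_KEY index option in Transact-SQL. The following is a minimal sketch of that option. The table dbo.DemoUnits and the index AK_DemoUnits_Name are hypothetical names used only for illustration; they are not part of AdventureWorks2012.

```sql
-- Hypothetical demo table; not part of AdventureWorks2012.
CREATE TABLE dbo.DemoUnits (
    UnitCode nchar(3) NOT NULL,
    Name nvarchar(50) NOT NULL
);
GO
-- Unique index that discards duplicate key values instead of failing the statement.
CREATE UNIQUE INDEX AK_DemoUnits_Name
ON dbo.DemoUnits (Name)
WITH (IGNORE_DUP_KEY = ON);
GO
-- Two rows share the key value 'Box'. With IGNORE_DUP_KEY = ON, a row that would
-- create a duplicate key is discarded with a warning and the remaining rows are inserted.
INSERT INTO dbo.DemoUnits (UnitCode, Name)
VALUES (N'BX', N'Box'), (N'B2', N'Box'), (N'CS', N'Case');
GO
SELECT UnitCode, Name FROM dbo.DemoUnits;
GO
DROP TABLE dbo.DemoUnits;
GO
```

With IGNORE_DUP_KEY = OFF (the default), the same INSERT statement would fail and no rows would be added.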
For more information, see CREATE INDEX (Transact-SQL).
Create Filtered Indexes
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to create a filtered index in SQL Server 2016 by using SQL Server Management Studio or
Transact-SQL. A filtered index is an optimized nonclustered index especially suited to cover queries that select from
a well-defined subset of data. It uses a filter predicate to index a portion of rows in the table. A well-designed
filtered index can improve query performance as well as reduce index maintenance and storage costs compared
with full-table indexes.
Filtered indexes can provide the following advantages over full-table indexes:
Improved query performance and plan quality
A well-designed filtered index improves query performance and execution plan quality because it is smaller
than a full-table nonclustered index and has filtered statistics. The filtered statistics are more accurate than
full-table statistics because they cover only the rows in the filtered index.
Reduced index maintenance costs
An index is maintained only when data manipulation language (DML) statements affect the data in the index.
A filtered index reduces index maintenance costs compared with a full-table nonclustered index because it is
smaller and is only maintained when the data in the index is changed. It is possible to have a large number
of filtered indexes, especially when they contain data that is changed infrequently. Similarly, if a filtered
index contains only the frequently modified data, the smaller size of the index reduces the cost of updating
the statistics.
Reduced index storage costs
Creating a filtered index can reduce disk storage for nonclustered indexes when a full-table index is not
necessary. You can replace a full-table nonclustered index with multiple filtered indexes without significantly
increasing the storage requirements.
In This Topic
Before you begin:
Design Considerations
Limitations and Restrictions
Security
To create a filtered index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Design Considerations
When a column only has a small number of relevant values for queries, you can create a filtered index on
the subset of values. For example, when the values in a column are mostly NULL and the query selects only
from the non-NULL values, you can create a filtered index for the non-NULL data rows. The resulting index
will be smaller and cost less to maintain than a full-table nonclustered index defined on the same key
columns.
When a table has heterogeneous data rows, you can create a filtered index for one or more categories of
data. This can improve the performance of queries on these data rows by narrowing the focus of a query to
a specific area of the table. Again, the resulting index will be smaller and cost less to maintain than a full-
table nonclustered index.
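The two design cases above can be sketched as follows. The table dbo.DemoTickets, its columns, and both index names are hypothetical and exist only to illustrate the pattern.

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.DemoTickets (
    TicketID int IDENTITY(1,1) NOT NULL PRIMARY KEY,
    Category varchar(20) NOT NULL,
    OpenedOn datetime NOT NULL,
    EscalatedOn datetime NULL   -- assumed to be NULL for most rows
);
GO
-- Case 1: a mostly-NULL column. Index only the rows that have a value.
CREATE NONCLUSTERED INDEX FI_DemoTickets_EscalatedOn
ON dbo.DemoTickets (EscalatedOn)
WHERE EscalatedOn IS NOT NULL;
GO
-- Case 2: heterogeneous rows. Index one category of data.
CREATE NONCLUSTERED INDEX FI_DemoTickets_Billing_OpenedOn
ON dbo.DemoTickets (OpenedOn)
WHERE Category = 'Billing';
GO
```

Each index stores entries only for the rows that satisfy its WHERE clause, which is what keeps it smaller than a full-table index on the same key column.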
Limitations and Restrictions
You cannot create a filtered index on a view. However, the query optimizer can benefit from a filtered index
defined on a table that is referenced in a view. The query optimizer considers a filtered index for a query that
selects from a view if the query results will be correct.
Filtered indexes have the following advantages over indexed views:
Reduced index maintenance costs. For example, the query processor uses fewer CPU resources to
update a filtered index than an indexed view.
Improved plan quality. For example, during query compilation, the query optimizer considers using a
filtered index in more situations than the equivalent indexed view.
Online index rebuilds. You can rebuild filtered indexes while they are available for queries. Online
index rebuilds are not supported for indexed views. For more information, see the REBUILD option
for ALTER INDEX (Transact-SQL).
Non-unique indexes. Filtered indexes can be non-unique, whereas indexed views must be unique.
Filtered indexes are defined on one table and only support simple comparison operators. If you need a filter
expression that references multiple tables or has complex logic, you should create a view.
A column in the filtered index expression does not need to be a key or included column in the filtered index
definition if the filtered index expression is equivalent to the query predicate and the query does not return
the column in the filtered index expression with the query results.
A column in the filtered index expression should be a key or included column in the filtered index definition
if the query predicate uses the column in a comparison that is not equivalent to the filtered index expression.
A column in the filtered index expression should be a key or included column in the filtered index definition
if the column is in the query result set.
The clustered index key of the table does not need to be a key or included column in the filtered index
definition. The clustered index key is automatically included in all nonclustered indexes, including filtered
indexes.
If the comparison operator specified in the filtered index expression of the filtered index results in an implicit
or explicit data conversion, an error will occur if the conversion occurs on the left side of a comparison
operator. A solution is to write the filtered index expression with the data conversion operator (CAST or
CONVERT) on the right side of the comparison operator.
Review the required SET options for filtered index creation in the CREATE INDEX (Transact-SQL) syntax.
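The data conversion restriction above is easier to see with a small example. This sketch uses a hypothetical table, dbo.TestTable, in which column b is varbinary(4). Comparing b with an integer constant would implicitly convert the column on the left side of the operator, so the conversion is written explicitly on the right side instead.

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.TestTable (a int, b varbinary(4));
GO
-- Expected to fail: comparing b with the int constant 1 implicitly converts
-- column b (the left side of the comparison) to int.
-- CREATE NONCLUSTERED INDEX TestTabIndex ON dbo.TestTable (a, b) WHERE b = 1;

-- Works: the conversion is applied to the constant on the right side.
CREATE NONCLUSTERED INDEX TestTabIndex
ON dbo.TestTable (a, b)
WHERE b = CONVERT(varbinary(4), 1);
GO
```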
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin and db_owner fixed database roles. To modify the filtered index expression, use CREATE INDEX
WITH DROP_EXISTING.
Using SQL Server Management Studio
To create a filtered index
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
create a filtered index.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to create a filtered index.
4. Right-click the Indexes folder, point to New Index, and select Non-Clustered Index….
5. In the New Index dialog box, on the General page, enter the name of the new index in the Index name
box.
6. Under Index key columns, click Add….
7. In the Select Columns from table_name dialog box, select the check box or check boxes of the table column
or columns to be added to the filtered index.
8. Click OK.
9. On the Filter page, under Filter Expression, enter the SQL expression that you will use to create the filtered index.
10. Click OK.

Using Transact-SQL
To create a filtered index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Looks for an existing filtered index named "FIBillOfMaterialsWithEndDate"
-- and deletes it from the table Production.BillOfMaterials if found.
IF EXISTS (SELECT name FROM sys.indexes
WHERE name = N'FIBillOfMaterialsWithEndDate'
AND object_id = OBJECT_ID (N'Production.BillOfMaterials'))
DROP INDEX FIBillOfMaterialsWithEndDate
ON Production.BillOfMaterials;
GO
-- Creates a filtered index "FIBillOfMaterialsWithEndDate"
-- on the table Production.BillOfMaterials
-- using the columns ComponentID and StartDate.
CREATE NONCLUSTERED INDEX FIBillOfMaterialsWithEndDate
ON Production.BillOfMaterials (ComponentID, StartDate)
WHERE EndDate IS NOT NULL;
GO

The filtered index above is valid for the following query. You can display the query execution plan to
determine if the query optimizer used the filtered index.
USE AdventureWorks2012;
GO
SELECT ProductAssemblyID, ComponentID, StartDate
FROM Production.BillOfMaterials
WHERE EndDate IS NOT NULL
AND ComponentID = 5
AND StartDate > '01/01/2008' ;
GO
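The Limitations and Restrictions section notes that a column in the filter expression should be a key or included column when the query predicate is not equivalent to the filter expression. The following sketch assumes that the FIBillOfMaterialsWithEndDate index created above already exists. It re-creates the index with EndDate as an included column and then runs a query whose predicate on EndDate is narrower than the filter.

```sql
USE AdventureWorks2012;
GO
-- Re-create the index with EndDate as an included column; the filter is unchanged.
-- DROP_EXISTING = ON requires that the index already exists.
CREATE NONCLUSTERED INDEX FIBillOfMaterialsWithEndDate
ON Production.BillOfMaterials (ComponentID, StartDate)
INCLUDE (EndDate)
WHERE EndDate IS NOT NULL
WITH (DROP_EXISTING = ON);
GO
-- The predicate EndDate > '20040101' is not equivalent to EndDate IS NOT NULL,
-- so the optimizer needs the EndDate values from the index to evaluate it.
SELECT ComponentID, StartDate
FROM Production.BillOfMaterials
WHERE EndDate > '20040101';
GO
```

The same CREATE INDEX ... WITH (DROP_EXISTING = ON) pattern is how the filter expression itself is changed, as mentioned in the Permissions section.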

To ensure that a filtered index is used in a SQL query


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
SELECT ComponentID, StartDate FROM Production.BillOfMaterials
WITH ( INDEX ( FIBillOfMaterialsWithEndDate ) )
WHERE EndDate IN ('20000825', '20000908', '20000918');
GO

For more information, see CREATE INDEX (Transact-SQL).


Create Indexes with Included Columns
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to add included (or nonkey) columns to extend the functionality of nonclustered indexes
in SQL Server by using SQL Server Management Studio or Transact-SQL. By including nonkey columns, you can
create nonclustered indexes that cover more queries. This is because the nonkey columns have the following
benefits:
They can be data types not allowed as index key columns.
They are not considered by the Database Engine when calculating the number of index key columns or
index key size.
An index with nonkey columns can significantly improve query performance when all columns in the query
are included in the index either as key or nonkey columns. Performance gains are achieved because the
query optimizer can locate all the column values within the index; table or clustered index data is not
accessed, resulting in fewer disk I/O operations.

NOTE
When an index contains all the columns referenced by a query, it is typically referred to as covering the query.

In This Topic
Before you begin:
Design Recommendations
Limitations and Restrictions
Security
To create an index with nonkey columns, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Design Recommendations
Redesign nonclustered indexes with a large index key size so that only columns used for searching and
lookups are key columns. Make all other columns that cover the query into nonkey columns. In this way,
you will have all columns needed to cover the query, but the index key itself is small and efficient.
Include nonkey columns in a nonclustered index to avoid exceeding the current index size limitations of a
maximum of 32 key columns and a maximum index key size of 1,700 bytes (16 key columns and 900 bytes
prior to SQL Server 2016). The Database Engine does not consider nonkey columns when calculating the
number of index key columns or index key size.
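As a sketch of these recommendations, the following example keeps only the search columns in the index key and moves the wider columns into the INCLUDE clause. The table dbo.DemoDocuments, its columns, and the index name are hypothetical.

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.DemoDocuments (
    DocumentID int NOT NULL PRIMARY KEY,
    Title nvarchar(50) NOT NULL,      -- 100 bytes
    Revision nchar(5) NOT NULL,       -- 10 bytes
    FileName nvarchar(400) NOT NULL,  -- 800 bytes: would dominate the key size
    Summary nvarchar(max) NULL        -- not allowed as an index key column
);
GO
-- The key holds only the columns used for searching (110 bytes in total).
-- FileName and Summary are nonkey columns, so they do not count toward
-- the key column limit or the index key size limit.
CREATE NONCLUSTERED INDEX IX_DemoDocuments_Title_Revision
ON dbo.DemoDocuments (Title, Revision)
INCLUDE (FileName, Summary);
GO
```

A query that filters on Title and Revision and returns FileName and Summary can then be answered from this index alone.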
Limitations and Restrictions
Nonkey columns can only be defined on nonclustered indexes.
All data types except text, ntext, and image can be used as nonkey columns.
Computed columns that are deterministic and either precise or imprecise can be nonkey columns. For more
information, see Indexes on Computed Columns.
Computed columns derived from image, ntext, and text data types can be nonkey columns as long as the
computed column data type is allowed as a nonkey index column.
Nonkey columns cannot be dropped from a table unless that table’s index is dropped first.
Nonkey columns cannot be changed, except to do the following:
Change the nullability of the column from NOT NULL to NULL.
Increase the length of varchar, nvarchar, or varbinary columns.
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio


To create an index with nonkey columns
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
create an index with nonkey columns.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to create an index with nonkey columns.
4. Right-click the Indexes folder, point to New Index, and select Non-Clustered Index….
5. In the New Index dialog box, on the General page, enter the name of the new index in the Index name
box.
6. Under the Index key columns tab, click Add….
7. In the Select Columns from table_name dialog box, select the check box or check boxes of the table
column or columns to be added to the index.
8. Click OK.
9. Under the Included columns tab, click Add….
10. In the Select Columns from table_name dialog box, select the check box or check boxes of the table
column or columns to be added to the index as nonkey columns.
11. Click OK.
12. In the New Index dialog box, click OK.

Using Transact-SQL
To create an index with nonkey columns
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.
USE AdventureWorks2012;
GO
-- Creates a nonclustered index on the Person.Address table with four included (nonkey) columns.
-- The index key column is PostalCode and the nonkey columns are
-- AddressLine1, AddressLine2, City, and StateProvinceID.
CREATE NONCLUSTERED INDEX IX_Address_PostalCode
ON Person.Address (PostalCode)
INCLUDE (AddressLine1, AddressLine2, City, StateProvinceID);
GO
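A query of the following shape is covered by the IX_Address_PostalCode index created above, because every column it references is either the key column or an included column. The postal code range is an arbitrary illustration.

```sql
USE AdventureWorks2012;
GO
-- All referenced columns are in IX_Address_PostalCode, either as the key
-- column (PostalCode) or as included columns.
SELECT AddressLine1, AddressLine2, City, StateProvinceID, PostalCode
FROM Person.Address
WHERE PostalCode BETWEEN N'98000' AND N'99999';
GO
```

You can display the execution plan to check whether the optimizer chose the index for this query.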

For more information, see CREATE INDEX (Transact-SQL).


Delete an Index
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to delete (drop) an index in SQL Server 2016 by using SQL Server Management Studio or
Transact-SQL.
In This Topic
Before you begin:
Limitations and Restrictions
Security
To delete an index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
Indexes created as the result of a PRIMARY KEY or UNIQUE constraint cannot be deleted by using this method.
Instead, the constraint must be deleted. To remove the constraint and corresponding index, use ALTER TABLE with
the DROP CONSTRAINT clause in Transact-SQL. For more information, see Delete Primary Keys.
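A minimal sketch of removing a constraint-backed index follows. The table dbo.DemoParts and the constraint names are hypothetical.

```sql
-- Hypothetical table and constraint names, for illustration only.
CREATE TABLE dbo.DemoParts (
    PartID int NOT NULL CONSTRAINT PK_DemoParts PRIMARY KEY,
    PartCode nvarchar(20) NOT NULL CONSTRAINT AK_DemoParts_PartCode UNIQUE
);
GO
-- DROP INDEX is not allowed here because the index enforces a UNIQUE constraint:
-- DROP INDEX AK_DemoParts_PartCode ON dbo.DemoParts;

-- Dropping the constraint removes the constraint and its index together.
ALTER TABLE dbo.DemoParts
DROP CONSTRAINT AK_DemoParts_PartCode;
GO
```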
Security
Permissions
Requires ALTER permission on the table or view. This permission is granted by default to the sysadmin fixed server
role and the db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio


To delete an index by using Object Explorer
1. In Object Explorer, expand the database that contains the table on which you want to delete an index.
2. Expand the Tables folder.
3. Expand the table that contains the index you want to delete.
4. Expand the Indexes folder.
5. Right-click the index you want to delete and select Delete.
6. In the Delete Object dialog box, verify that the correct index is in the Object to be deleted grid and click
OK.
To delete an index using Table Designer
1. In Object Explorer, expand the database that contains the table on which you want to delete an index.
2. Expand the Tables folder.
3. Right-click the table that contains the index you want to delete and click Design.
4. On the Table Designer menu, click Indexes/Keys.
5. In the Indexes/Keys dialog box, select the index you want to delete.
6. Click Delete.
7. Click Close.
8. On the File menu, select Save table_name.

Using Transact-SQL
To delete an index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- delete the IX_ProductVendor_BusinessEntityID index
-- from the Purchasing.ProductVendor table
DROP INDEX IX_ProductVendor_BusinessEntityID
ON Purchasing.ProductVendor;
GO
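If the script may run when the index has already been removed, the IF EXISTS clause of DROP INDEX (available starting with SQL Server 2016) avoids an error. A short sketch using the same index:

```sql
USE AdventureWorks2012;
GO
-- Drops the index only if it exists; no error is raised when it does not.
DROP INDEX IF EXISTS IX_ProductVendor_BusinessEntityID
ON Purchasing.ProductVendor;
GO
```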

For more information, see DROP INDEX (Transact-SQL).


Modify an Index
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to modify an index in SQL Server 2016 by using SQL Server Management Studio or
Transact-SQL.

IMPORTANT
Indexes created as the result of a PRIMARY KEY or UNIQUE constraint cannot be modified by using this method. Instead, the
constraint must be modified.

In This Topic
To modify an index, using:
SQL Server Management Studio
Transact-SQL

Using SQL Server Management Studio


To modify an index
1. In Object Explorer, connect to an instance of the SQL Server Database Engine and then expand that instance.
2. Expand Databases, expand the database in which the table belongs, and then expand Tables.
3. Expand the table in which the index belongs and then expand Indexes.
4. Right-click the index that you want to modify and then click Properties.
5. In the Index Properties dialog box, make the desired changes. For example, you can add or remove a
column from the index key, or change the setting of an index option.
To modify index columns
1. To add, remove, or change the position of an index column, select the General page from the Index Properties
dialog box.

Using Transact-SQL
To modify an index
1. Connect to the Database Engine.
2. From the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute. This example drops and re-
creates an existing index on the ProductID column of the Production.WorkOrder table by using the
DROP_EXISTING option. The options FILLFACTOR and PAD_INDEX are also set.
USE AdventureWorks2012;
GO
CREATE NONCLUSTERED INDEX IX_WorkOrder_ProductID
ON Production.WorkOrder(ProductID)
WITH (FILLFACTOR = 80,
PAD_INDEX = ON,
DROP_EXISTING = ON);
GO

The following example uses ALTER INDEX to set several options on the index
AK_SalesOrderHeader_SalesOrderNumber.

USE AdventureWorks2012;
GO
ALTER INDEX AK_SalesOrderHeader_SalesOrderNumber ON
Sales.SalesOrderHeader
SET (
STATISTICS_NORECOMPUTE = ON,
IGNORE_DUP_KEY = ON,
ALLOW_PAGE_LOCKS = ON
) ;
GO
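One way to confirm the result is to read the option values back from the catalog views. This sketch relies on the fact that the statistics created for an index have a stats_id equal to the index_id.

```sql
USE AdventureWorks2012;
GO
-- Reads back the options set by the ALTER INDEX statement above.
SELECT i.name AS index_name,
       i.ignore_dup_key,
       i.allow_page_locks,
       s.no_recompute
FROM sys.indexes AS i
JOIN sys.stats AS s
    ON s.object_id = i.object_id
   AND s.stats_id = i.index_id
WHERE i.object_id = OBJECT_ID(N'Sales.SalesOrderHeader')
  AND i.name = N'AK_SalesOrderHeader_SalesOrderNumber';
GO
```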

To modify index columns


1. To add, remove, or change the position of an index column, you must drop and recreate the index.
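The drop and re-create can be done in one statement with the DROP_EXISTING option. The following sketch assumes that IX_WorkOrder_ProductID already exists; adding DueDate as a second key column is only an illustration.

```sql
USE AdventureWorks2012;
GO
-- Re-creates the existing index with a different key column list.
-- DROP_EXISTING = ON requires that IX_WorkOrder_ProductID already exists.
CREATE NONCLUSTERED INDEX IX_WorkOrder_ProductID
ON Production.WorkOrder (ProductID, DueDate)
WITH (DROP_EXISTING = ON);
GO
```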

See Also
CREATE INDEX (Transact-SQL)
ALTER INDEX (Transact-SQL)
INDEXPROPERTY (Transact-SQL)
sys.indexes (Transact-SQL)
sys.index_columns (Transact-SQL)
Set Index Options
Rename Indexes
Move an Existing Index to a Different Filegroup
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to move an existing index from its current filegroup to a different filegroup in SQL Server
2016 by using SQL Server Management Studio or Transact-SQL.
In This Topic
Before you begin:
Limitations and Restrictions
Security
To move an existing index to a different filegroup, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
If a table has a clustered index, moving the clustered index to a new filegroup moves the table to that
filegroup.
You cannot move indexes created using a UNIQUE or PRIMARY KEY constraint using Management Studio.
To move these indexes use the CREATE INDEX statement with the (DROP_EXISTING=ON) option in Transact-
SQL.
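A sketch of moving a constraint-backed index follows. All names are placeholders: dbo.DemoOrders, PK_DemoOrders, the OrderID column, and the DemoFG filegroup are hypothetical, and the statement assumes that the table, its clustered PRIMARY KEY constraint, and the filegroup (with at least one file) already exist. The index definition repeats the existing one unchanged so that the constraint is preserved.

```sql
-- Hypothetical names throughout; adjust to an existing table, constraint, and filegroup.
-- The name, uniqueness, clustering, and key columns match the existing
-- PRIMARY KEY index; only the target filegroup differs.
CREATE UNIQUE CLUSTERED INDEX PK_DemoOrders
ON dbo.DemoOrders (OrderID)
WITH (DROP_EXISTING = ON)
ON DemoFG;
GO
```

Because this index is clustered, the statement also moves the table data to the new filegroup, as described in the first restriction above.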
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio


To move an existing index to a different filegroup using Table Designer
1. In Object Explorer, click the plus sign to expand the database that contains the table containing the index that
you want to move.
2. Click the plus sign to expand the Tables folder.
3. Right-click the table containing the index that you want to move and select Design.
4. On the Table Designer menu, click Indexes/Keys.
5. Select the index that you want to move.
6. In the main grid, expand Data Space Specification.
7. Select Filegroup or Partition Scheme Name and select from the list the filegroup or partition scheme to
which you want to move the index.
8. Click Close.
9. On the File menu, select Savetable_name.
To move an existing index to a different filegroup in Object Explorer
1. In Object Explorer, click the plus sign to expand the database that contains the table containing the index that
you want to move.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table containing the index that you want to move.
4. Click the plus sign to expand the Indexes folder.
5. Right-click the index that you want to move and select Properties.
6. Under Select a page, select Storage.
7. Select the filegroup to which you want to move the index.
If the table or index is partitioned, select the partition scheme to which you want to move the index. For more
information about partitioned indexes, see Partitioned Tables and Indexes.
If you are moving a clustered index, you can use online processing. Online processing allows concurrent user
access to the underlying data and to nonclustered indexes during the index operation. For more information,
see Perform Index Operations Online.
On multiprocessor computers using SQL Server 2016, you can configure the number of processors used to
execute the index statement by specifying a maximum degree of parallelism value. The Parallel indexed
operations feature is not available in every edition of SQL Server. For a list of features that are supported by
the editions of SQL Server, see Features Supported by the Editions of SQL Server 2016. For more
information about Parallel indexed operations, see Configure Parallel Index Operations.
8. Click OK.
The following information is available on the Storage page of the Index Properties – index_name dialog
box:
Filegroup
Stores the index in the specified filegroup. The list only displays standard (row) filegroups. The default list
selection is the PRIMARY filegroup of the database.
Filestream filegroup
Specifies the filegroup for FILESTREAM data. This list displays only FILESTREAM filegroups. The default list
selection is the PRIMARY FILESTREAM filegroup.
Partition scheme
Stores the index in a partition scheme. Clicking Partition Scheme enables the grid below. The default list
selection is the partition scheme that is used for storing the table data. When you select a different partition
scheme in the list, the information in the grid is updated.
The partition scheme option is unavailable if there are no partition schemes in the database.
Filestream partition scheme
Specifies the partition scheme for FILESTREAM data. The partition scheme must be symmetric with the
scheme that is specified in the Partition scheme option.
If the table is not partitioned, the field is blank.
Partition Scheme Parameter
Displays the name of the column that participates in the partition scheme.
Table Column
Select the table or view to map to the partition scheme.
Column Data Type
Displays data type information about the column.

NOTE
If the table column is a computed column, Column Data Type displays "computed column."

Allow online processing of DML statements while moving the index


Allows users to access the underlying table or clustered index data and any associated nonclustered indexes during
the index operation.

NOTE
This option is not available for XML indexes, or if the index is a disabled clustered index.

Set maximum degree of parallelism


Limits the number of processors to use during parallel plan execution. The default value, 0, uses the actual number
of available CPUs. Setting the value to 1 suppresses parallel plan generation; setting the value to a number greater
than 1 restricts the maximum number of processors used by a single query execution. This option only becomes
available if the dialog box is in the Rebuild or Recreate state.

NOTE
If a value greater than the number of available CPUs is specified, the actual number of available CPUs is used.
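The two dialog box options above correspond to the ONLINE and MAXDOP index options in Transact-SQL. The following sketch assumes that the TransactionsFG1 filegroup used in the Transact-SQL example in this topic has already been created, and that the edition of SQL Server in use supports online and parallel index operations. The MAXDOP value of 2 is arbitrary.

```sql
USE AdventureWorks2012;
GO
-- Moves the index while keeping the table available for DML (ONLINE = ON)
-- and limits the operation to at most two processors (MAXDOP = 2).
CREATE NONCLUSTERED INDEX IX_Employee_OrganizationLevel_OrganizationNode
ON HumanResources.Employee (OrganizationLevel, OrganizationNode)
WITH (DROP_EXISTING = ON, ONLINE = ON, MAXDOP = 2)
ON TransactionsFG1;
GO
```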

Using Transact-SQL
To move an existing index to a different filegroup
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.
USE AdventureWorks2012;
GO
-- Creates the TransactionsFG1 filegroup on the AdventureWorks2012 database
ALTER DATABASE AdventureWorks2012
ADD FILEGROUP TransactionsFG1;
GO
/* Adds the TransactionsFG1dat3 file to the TransactionsFG1 filegroup. Please note that you will have to
change the filename parameter in this statement to execute it without errors.
*/
ALTER DATABASE AdventureWorks2012
ADD FILE
(
NAME = TransactionsFG1dat3,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL13\MSSQL\DATA\TransactionsFG1dat3.ndf',
SIZE = 5MB,
MAXSIZE = 100MB,
FILEGROWTH = 5MB
)
TO FILEGROUP TransactionsFG1;
GO
/* Creates the IX_Employee_OrganizationLevel_OrganizationNode index
on the TransactionsFG1 filegroup and drops the original IX_Employee_OrganizationLevel_OrganizationNode
index.
*/
CREATE NONCLUSTERED INDEX IX_Employee_OrganizationLevel_OrganizationNode
ON HumanResources.Employee (OrganizationLevel, OrganizationNode)
WITH (DROP_EXISTING = ON)
ON TransactionsFG1;
GO
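To check where each index on the table is now stored, you can join sys.indexes to sys.data_spaces. This is a minimal verification sketch:

```sql
USE AdventureWorks2012;
GO
-- Lists each index on HumanResources.Employee with the filegroup
-- (or partition scheme) that stores it.
SELECT i.name AS index_name,
       ds.name AS data_space_name,
       ds.type_desc AS data_space_type
FROM sys.indexes AS i
JOIN sys.data_spaces AS ds
    ON ds.data_space_id = i.data_space_id
WHERE i.object_id = OBJECT_ID(N'HumanResources.Employee');
GO
```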

For more information, see CREATE INDEX (Transact-SQL).


Indexes on Computed Columns
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
You can define indexes on computed columns as long as the following requirements are met:
Ownership requirements
Determinism requirements
Precision requirements
Data type requirements
SET option requirements
Ownership Requirements
All function references in the computed column must have the same owner as the table.
Determinism Requirements

IMPORTANT
Expressions are deterministic if they always return the same result for a specified set of inputs. The IsDeterministic property
of the COLUMNPROPERTY function reports whether a computed_column_expression is deterministic.

The computed_column_expression must be deterministic. A computed_column_expression is deterministic when
all of the following are true:
All functions that are referenced by the expression are deterministic and precise. These functions include
both user-defined and built-in functions. For more information, see Deterministic and Nondeterministic
Functions. Functions might be imprecise if the computed column is PERSISTED. For more information, see
Creating Indexes on Persisted Computed Columns later in this topic.
All columns that are referenced in the expression come from the table that contains the computed column.
No column reference pulls data from multiple rows. For example, aggregate functions such as SUM or AVG
depend on data from multiple rows and would make a computed_column_expression nondeterministic.
The computed_column_expression has no system data access or user data access.
Any computed column that contains a common language runtime (CLR) expression must be deterministic
and marked PERSISTED before the column can be indexed. CLR user-defined type expressions are allowed
in computed column definitions. Computed columns whose type is a CLR user-defined type can be indexed
as long as the type is comparable. For more information, see CLR User-Defined Types.
NOTE
When you refer to string literals of the date data type in indexed computed columns in SQL Server, we recommend that you
explicitly convert the literal to the date type that you want by using a deterministic date format style. For a list of the date
format styles that are deterministic, see CAST and CONVERT. Expressions that involve implicit conversion of character strings
to date data types are considered nondeterministic, unless the database compatibility level is set to 80 or earlier. This is
because the results depend on the LANGUAGE and DATEFORMAT settings of the server session. For example, the results of
the expression CONVERT(datetime, '30 listopad 1996', 113) depend on the LANGUAGE setting because the string
'30 listopad 1996' means different months in different languages. Similarly, in the expression
DATEADD(mm,3,'2000-12-01'), the Database Engine interprets the string '2000-12-01' based on the DATEFORMAT
setting.
Implicit conversion of non-Unicode character data between collations is also considered nondeterministic, unless the
compatibility level is set to 80 or earlier.
When the database compatibility level setting is 90, you cannot create indexes on computed columns that contain these
expressions. However, existing computed columns that contain these expressions from an upgraded database are
maintainable. If you use indexed computed columns that contain implicit string to date conversions, to avoid possible index
corruption, make sure that the LANGUAGE and DATEFORMAT settings are consistent in your databases and applications.
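The recommendation in this note can be sketched as follows. The table dbo.DemoShipments and its columns are hypothetical. The date literal is converted explicitly with style 112 (yyyymmdd), which is one of the deterministic styles, and the session is assumed to have the SET options required for indexes on computed columns.

```sql
-- Hypothetical table for illustration.
CREATE TABLE dbo.DemoShipments (
    ShipmentID int NOT NULL PRIMARY KEY,
    ShippedOn datetime NOT NULL,
    -- The explicit CONVERT with style 112 interprets the literal the same way
    -- regardless of the LANGUAGE and DATEFORMAT settings of the session.
    DaysSinceBaseline AS DATEDIFF(day, CONVERT(datetime, '20001201', 112), ShippedOn)
);
GO
CREATE NONCLUSTERED INDEX IX_DemoShipments_DaysSinceBaseline
ON dbo.DemoShipments (DaysSinceBaseline);
GO
```

Writing the same column as DATEDIFF(day, '2000-12-01', ShippedOn) would rely on an implicit string-to-date conversion, which is the nondeterministic case the note describes.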

Precision Requirements
The computed_column_expression must be precise. A computed_column_expression is precise when one or more
of the following is true:
It is not an expression of the float or real data types.
It does not use a float or real data type in its definition. For example, in the following statement, column y
is int and deterministic but not precise.

CREATE TABLE t2 (a int, b int, c int, x float,
y AS CASE x
WHEN 0 THEN a
WHEN 1 THEN b
ELSE c
END);

NOTE
Any float or real expression is considered imprecise and cannot be a key of an index; a float or real expression can be used
in an indexed view but not as a key. This is true also for computed columns. Any function, expression, or user-defined
function is considered imprecise if it contains any float or real expressions. This includes logical ones (comparisons).

The IsPrecise property of the COLUMNPROPERTY function reports whether a computed_column_expression is
precise.
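As a sketch, the following query reads these properties for column y of the table t2 defined in the example above, assuming that table has been created in the current database. The IsIndexable property is added here for convenience.

```sql
-- Reports the determinism and precision of the computed column t2.y.
-- Each property returns 1 (true), 0 (false), or NULL (invalid input).
SELECT COLUMNPROPERTY(OBJECT_ID(N't2'), N'y', 'IsDeterministic') AS IsDeterministic,
       COLUMNPROPERTY(OBJECT_ID(N't2'), N'y', 'IsPrecise') AS IsPrecise,
       COLUMNPROPERTY(OBJECT_ID(N't2'), N'y', 'IsIndexable') AS IsIndexable;
GO
```

Based on the description above, column y should be reported as deterministic but not precise.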
Data Type Requirements
The computed_column_expression defined for the computed column cannot evaluate to the text, ntext, or
image data types.
Computed columns derived from image, ntext, text, varchar(max), nvarchar(max), varbinary(max),
and xml data types can be indexed as long as the computed column data type is allowable as an index key
column.
Computed columns derived from image, ntext, and text data types can be nonkey (included) columns in a
nonclustered index as long as the computed column data type is allowable as a nonkey index column.
SET Option Requirements
The ANSI_NULLS connection-level option must be set to ON when the CREATE TABLE or ALTER TABLE
statement that defines the computed column is executed. The OBJECTPROPERTY function reports whether
the option is on through the IsAnsiNullsOn property.
The connection on which the index is created, and all connections trying INSERT, UPDATE, or DELETE
statements that will change values in the index, must have six SET options set to ON and one option set to
OFF. The optimizer ignores an index on a computed column for any SELECT statement executed by a
connection that does not have these same option settings.
The NUMERIC_ROUNDABORT option must be set to OFF, and the following options must be set to
ON:
ANSI_NULLS
ANSI_PADDING
ANSI_WARNINGS
ARITHABORT
CONCAT_NULL_YIELDS_NULL
QUOTED_IDENTIFIER
Setting ANSI_WARNINGS to ON implicitly sets ARITHABORT to ON when the database compatibility
level is set to 90 or higher.
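The required settings can be applied explicitly at the start of a session and then read back with SESSIONPROPERTY. A minimal sketch:

```sql
-- Apply the settings required for creating and maintaining
-- indexes on computed columns.
SET NUMERIC_ROUNDABORT OFF;
SET ANSI_NULLS, ANSI_PADDING, ANSI_WARNINGS, ARITHABORT,
    CONCAT_NULL_YIELDS_NULL, QUOTED_IDENTIFIER ON;
GO
-- Read the current session settings back: 1 = ON, 0 = OFF.
SELECT SESSIONPROPERTY('NUMERIC_ROUNDABORT') AS numeric_roundabort,
       SESSIONPROPERTY('ANSI_NULLS') AS ansi_nulls,
       SESSIONPROPERTY('ANSI_PADDING') AS ansi_padding,
       SESSIONPROPERTY('ANSI_WARNINGS') AS ansi_warnings,
       SESSIONPROPERTY('ARITHABORT') AS arithabort,
       SESSIONPROPERTY('CONCAT_NULL_YIELDS_NULL') AS concat_null_yields_null,
       SESSIONPROPERTY('QUOTED_IDENTIFIER') AS quoted_identifier;
GO
```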

Creating Indexes on Persisted Computed Columns


You can create an index on a computed column that is defined with a deterministic, but imprecise, expression if the
column is marked PERSISTED in the CREATE TABLE or ALTER TABLE statement. This means that the Database
Engine stores the computed values in the table, and updates them when any other columns on which the
computed column depends are updated. The Database Engine uses these persisted values when it creates an index
on the column, and when the index is referenced in a query. This option enables you to create an index on a
computed column when the Database Engine cannot prove with accuracy whether a function that returns computed
column expressions, particularly a CLR function that is created in the .NET Framework, is both deterministic and
precise.
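The following sketch illustrates the PERSISTED case with a float expression, which is deterministic but imprecise. The table dbo.Measurement_Demo, its columns, and the index name are hypothetical, and the session is assumed to use the SET options described under SET Option Requirements.

```sql
-- Hypothetical objects: dbo.Measurement_Demo, IX_Measurement_Demo_ScaledValue.
CREATE TABLE dbo.Measurement_Demo
(
    MeasurementID int   NOT NULL PRIMARY KEY,
    RawValue      float NOT NULL,
    ScaleFactor   float NOT NULL,
    -- float arithmetic is imprecise, so the column is marked PERSISTED
    -- to make it eligible as an index key.
    ScaledValue   AS (RawValue * ScaleFactor) PERSISTED
);
GO

-- Inspect how the Database Engine classifies the computed column.
SELECT COLUMNPROPERTY(OBJECT_ID('dbo.Measurement_Demo'), 'ScaledValue', 'IsDeterministic') AS is_deterministic,
       COLUMNPROPERTY(OBJECT_ID('dbo.Measurement_Demo'), 'ScaledValue', 'IsPrecise')       AS is_precise,
       COLUMNPROPERTY(OBJECT_ID('dbo.Measurement_Demo'), 'ScaledValue', 'IsIndexable')     AS is_indexable;
GO

CREATE INDEX IX_Measurement_Demo_ScaledValue
    ON dbo.Measurement_Demo (ScaledValue);
GO
```

Checking IsIndexable before creating the index is a cheap way to confirm that marking the column PERSISTED had the intended effect.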

Related Content
COLUMNPROPERTY (Transact-SQL)
SORT_IN_TEMPDB Option For Indexes
4/29/2017

When you create or rebuild an index, by setting the SORT_IN_TEMPDB option to ON you can direct the SQL Server
Database Engine to use tempdb to store the intermediate sort results that are used to build the index. Although
this option increases the amount of temporary disk space that is used to create an index, the option could reduce
the time that is required to create or rebuild an index when tempdb is on a set of disks different from that of the
user database. For more information about tempdb, see Configure the index create memory Server Configuration
Option.
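As a brief illustration, the option is specified in the WITH clause of CREATE INDEX or ALTER INDEX REBUILD. The index name below is hypothetical, and the sketch assumes the AdventureWorks2012 sample database with its Sales.SalesOrderDetail table and ModifiedDate column.

```sql
USE AdventureWorks2012;
GO
-- IX_SalesOrderDetail_Demo_ModifiedDate is a hypothetical index name.
-- Intermediate sort runs are written to tempdb instead of the
-- destination filegroup.
CREATE NONCLUSTERED INDEX IX_SalesOrderDetail_Demo_ModifiedDate
    ON Sales.SalesOrderDetail (ModifiedDate)
    WITH (SORT_IN_TEMPDB = ON);
GO

-- The option applies per statement, so it is repeated on a rebuild.
ALTER INDEX IX_SalesOrderDetail_Demo_ModifiedDate
    ON Sales.SalesOrderDetail
    REBUILD WITH (SORT_IN_TEMPDB = ON);
GO
```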

Phases of Index Building


As the Database Engine builds an index, it goes through the following phases:
The Database Engine first scans the data pages of the base table to retrieve key values and builds an index
leaf row for each data row. When the internal sort buffers have been filled with leaf index entries, the entries
are sorted and written to disk as an intermediate sort run. The Database Engine then resumes the data page
scan until the sort buffers are again filled. This pattern of scanning multiple data pages followed by sorting
and writing a sort run continues until all the rows of the base table have been processed.
In a clustered index, the leaf rows of the index are the data rows of the table; therefore, the intermediate sort
runs contain all the data rows. In a nonclustered index, the leaf rows may contain nonkey columns, but are
generally smaller than those of a clustered index. If the index keys are large, or there are several nonkey columns
included in the index, a nonclustered sort run can be large. For more information about including nonkey
columns, see Create Indexes with Included Columns.
The Database Engine merges the sorted runs of index leaf rows into a single, sorted stream. The sort merge
component of the Database Engine starts with the first page of each sort run, finds the lowest key in all the
pages, and passes that leaf row to the index create component. The next lowest key is processed, and then
the next, and so on. When the last leaf index row is extracted from a sort run page, the process shifts to the
next page from that sort run. When all the pages in a sort run extent have been processed, the extent is
freed. As each leaf index row is passed to the index create component, it is included in a leaf index page in
the buffer. Each leaf page is written as it is filled. As leaf pages are written, the Database Engine also builds
the upper levels of the index. Each upper level index page is written when it is filled.

SORT_IN_TEMPDB Option
When SORT_IN_TEMPDB is set to OFF, the default, the sort runs are stored in the destination filegroup. During the
first phase of creating the index, the alternating reads of the base table pages and writes of the sort runs move the
disk read/write heads from one area of the disk to another. The heads are in the data page area as the data pages
are scanned. They move to an area of free space when the sort buffers fill and the current sort run has to be written
to disk, and then move back to the data page area as the table page scan is resumed. The read/write head
movement is greater in the second phase. At that time the sort process is typically alternating reads from each sort
run area. Both the sort runs and the new index pages are built in the destination filegroup. This means that at the
same time the Database Engine is spreading reads across the sort runs, it has to periodically jump to the index
extents to write new index pages as they are filled.
If the SORT_IN_TEMPDB option is set to ON and tempdb is on a separate set of disks from the destination
filegroup, during the first phase, the reads of the data pages occur on a different disk from the writes to the sort
work area in tempdb. This means the disk reads of the data keys generally proceed more serially across the disk,
and the writes to the tempdb disk are also generally serial, as are the writes to build the final index. Even if other
users are using the database and accessing separate disk addresses, the overall pattern of reads and writes is
more efficient when SORT_IN_TEMPDB is specified than when it is not.
The SORT_IN_TEMPDB option may improve the contiguity of index extents, especially if the CREATE INDEX
operation is not being processed in parallel. The sort work area extents are freed on a somewhat random basis
with regard to their location in the database. If the sort work areas are contained in the destination filegroup, as the
sort work extents are freed, they can be acquired by the requests for extents to hold the index structure as it is built.
This can randomize the locations of the index extents to a degree. If the sort extents are held separately in tempdb,
the sequence in which they are freed has no effect on the location of the index extents. Also, when the intermediate
sort runs are stored in tempdb instead of the destination filegroup, there is more space available in the destination
filegroup. This increases the chances that index extents will be contiguous.
The SORT_IN_TEMPDB option affects only the current statement. No metadata records that the index was or was
not sorted in tempdb. For example, if you create a nonclustered index using the SORT_IN_TEMPDB option, and at a
later time create a clustered index without specifying the option, the Database Engine does not use the option
when it re-creates the nonclustered index.

NOTE
If a sort operation is not required or if the sort can be performed in memory, the SORT_IN_TEMPDB option is ignored.

Disk Space Requirements


When you set the SORT_IN_TEMPDB option to ON, you must have sufficient free disk space available in tempdb to
hold the intermediate sort runs, and enough free disk space in the destination filegroup to hold the new index. The
CREATE INDEX statement fails if there is insufficient free space and the databases cannot autogrow to acquire
more space, for example because the disk is full or autogrow is set to off.
If SORT_IN_TEMPDB is set to OFF, the available free disk space in the destination filegroup must be roughly the
size of the final index. During the first phase, the sort runs are built and require about the same amount of space as
the final index. During the second phase, each sort run extent is freed after it has been processed. This means that
sort run extents are freed at about the same rate at which extents are acquired to hold the final index pages;
therefore, the overall space requirements do not greatly exceed the size of the final index. One side effect of this is
that if the amount of free space is very close to the size of the final index, the Database Engine will generally reuse
the sort run extents very quickly after they are freed. Because the sort run extents are freed in a somewhat random
manner, this reduces the continuity of the index extents in this scenario. If SORT_IN_TEMPDB is set to OFF, the
continuity of the index extents is improved if there is sufficient free space available in the destination filegroup that
the index extents can be allocated from a contiguous pool instead of from the freshly deallocated sort run extents.
When you create a nonclustered index, you must have available as free space:
If SORT_IN_TEMPDB is set to ON, there must be sufficient free space in tempdb to store the sort runs, and
sufficient free space in the destination filegroup to store the final index structure. The sort runs contain the
leaf rows of the index.
If SORT_IN_TEMPDB is set to OFF, the free space in the destination filegroup must be large enough to store
the final index structure. The continuity of the index extents may be improved if more free space is available.
When you create a clustered index on a table that does not have nonclustered indexes, you must have
available as free space:
If SORT_IN_TEMPDB is set to ON, there must be sufficient free space in tempdb to store the sort runs. These
include the data rows of the table. There must be sufficient free space in the destination filegroup to store
the final index structure. This includes the data rows of the table and the index B-tree. You may have to
adjust the estimate for factors such as having a large key size or a fill factor with a low value.
If SORT_IN_TEMPDB is set to OFF, the free space in the destination filegroup must be large enough to store
the final table. This includes the index structure. The continuity of the table and index extents may be
improved if more free space is available.
When you create a clustered index on a table that has nonclustered indexes, you must have available as free
space:
If SORT_IN_TEMPDB is set to ON, there must be sufficient free space in tempdb to store the collection of
sort runs for the largest index, typically the clustered index, and sufficient free space in the destination
filegroup to store the final structures of all the indexes. This includes the clustered index that contains the
data rows of the table.
If SORT_IN_TEMPDB is set to OFF, the free space in the destination filegroup must be large enough to store
the final table. This includes the structures of all the indexes. The continuity of the table and index extents
may be improved if more free space is available.
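One way to compare these requirements against what is currently available is to look at the free space inside the data files of the user database and of tempdb. The sketch below assumes the AdventureWorks2012 sample database. It reports only space already allocated to the files, so it does not account for autogrow or for free space on the underlying disks.

```sql
-- Free space inside the data files of the user database,
-- grouped visually by filegroup name.
USE AdventureWorks2012;
GO
SELECT FILEGROUP_NAME(data_space_id)                        AS filegroup_name,
       name                                                 AS logical_file_name,
       size / 128.0                                         AS file_size_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128.0     AS free_space_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';   -- data files only; size is in 8-KB pages
GO

-- The same check for tempdb, which holds the sort runs
-- when SORT_IN_TEMPDB is set to ON.
USE tempdb;
GO
SELECT name                                                 AS logical_file_name,
       size / 128.0                                         AS file_size_mb,
       (size - FILEPROPERTY(name, 'SpaceUsed')) / 128.0     AS free_space_mb
FROM sys.database_files
WHERE type_desc = 'ROWS';
GO
```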

Related Tasks
CREATE INDEX (Transact-SQL)
Reorganize and Rebuild Indexes

Related Content
ALTER INDEX (Transact-SQL)
Configure the index create memory Server Configuration Option
Disk Space Requirements for Index DDL Operations
Disable Indexes and Constraints
3/24/2017

This topic describes how to disable an index or constraints in SQL Server 2016 by using SQL Server Management
Studio or Transact-SQL. Disabling an index prevents user access to the index and, for clustered indexes, to the
underlying table data. The index definition remains in metadata, and index statistics are kept on nonclustered
indexes. Disabling a nonclustered or clustered index on a view physically deletes the index data. Disabling a
clustered index on a table prevents access to the data; the data still remains in the table, but is unavailable for data
manipulation language (DML) operations until the index is dropped or rebuilt.
In This Topic
Before you begin:
Limitations and Restrictions
Security
To disable an index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
The index is not maintained while it is disabled.
The query optimizer does not consider the disabled index when creating query execution plans. Also, queries
that reference the disabled index with a table hint fail.
You cannot create an index that uses the same name as an existing disabled index.
A disabled index can be dropped.
When disabling a unique index, the PRIMARY KEY or UNIQUE constraint and all FOREIGN KEY constraints
that reference the indexed columns from other tables are also disabled. When disabling a clustered index, all
incoming and outgoing FOREIGN KEY constraints on the underlying table are also disabled. The constraint
names are listed in a warning message when the index is disabled. After rebuilding the index, all constraints
must be manually enabled by using the ALTER TABLE CHECK CONSTRAINT statement.
Nonclustered indexes are automatically disabled when the associated clustered index is disabled. They
cannot be enabled until either the clustered index on the table or view is enabled or the clustered index on
the table is dropped. Nonclustered indexes must be explicitly enabled, unless the clustered index was
enabled by using the ALTER INDEX ALL REBUILD statement.
The ALTER INDEX ALL REBUILD statement rebuilds and enables all disabled indexes on the table, except for
disabled indexes on views. Indexes on views must be enabled in a separate ALTER INDEX ALL REBUILD
statement.
Disabling a clustered index on a table also disables all clustered and nonclustered indexes on views that
reference that table. These indexes must be rebuilt just as those on the referenced table.
The data rows of the disabled clustered index cannot be accessed except to drop or rebuild the clustered
index.
You can rebuild a disabled nonclustered index online when the table does not have a disabled clustered
index. However, you must always rebuild a disabled clustered index offline if you use either the ALTER
INDEX REBUILD or CREATE INDEX WITH DROP_EXISTING statement. For more information about online
index operations, see Perform Index Operations Online.
The CREATE STATISTICS statement cannot be successfully executed on a table that has a disabled clustered
index.
The AUTO_CREATE_STATISTICS database option creates new statistics on a column when the index is
disabled and the following conditions exist:
AUTO_CREATE_STATISTICS is set to ON
There are no existing statistics for the column.
Statistics are required during query optimization.
If a clustered index is disabled, DBCC CHECKDB cannot return information about the underlying table;
instead, the statement reports that the clustered index is disabled. DBCC INDEXDEFRAG cannot be used to
defragment a disabled index; the statement fails with an error message. You can use DBCC DBREINDEX to
rebuild a disabled index.
Creating a new clustered index enables previously disabled nonclustered indexes. For more information, see
Enable Indexes and Constraints.
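Because disabling a unique or clustered index also disables FOREIGN KEY constraints, it can help to list the constraints involved before running ALTER INDEX ... DISABLE. The following query is a sketch that assumes the AdventureWorks2012 sample database and uses the HumanResources.Employee table from the examples in this topic.

```sql
USE AdventureWorks2012;
GO
-- Lists FOREIGN KEY constraints defined on the table (outgoing)
-- and constraints in other tables that reference it (incoming),
-- together with their current enabled/disabled state.
SELECT fk.name AS foreign_key_name,
       OBJECT_SCHEMA_NAME(fk.parent_object_id) + '.'
           + OBJECT_NAME(fk.parent_object_id)       AS referencing_table,
       OBJECT_SCHEMA_NAME(fk.referenced_object_id) + '.'
           + OBJECT_NAME(fk.referenced_object_id)   AS referenced_table,
       fk.is_disabled
FROM sys.foreign_keys AS fk
WHERE fk.parent_object_id     = OBJECT_ID('HumanResources.Employee')
   OR fk.referenced_object_id = OBJECT_ID('HumanResources.Employee');
GO
```

Saving this output before disabling an index gives a checklist of constraints to re-enable after the index is rebuilt.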
Security
Permissions
To execute ALTER INDEX, at a minimum, ALTER permission on the table or view is required.

Using SQL Server Management Studio


To disable an index
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
disable an index.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to disable an index.
4. Click the plus sign to expand the Indexes folder.
5. Right-click the index you want to disable and select Disable.
6. In the Disable Indexes dialog box, verify that the correct index is in the Indexes to disable grid and click
OK.
To disable all indexes on a table
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
disable the indexes.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to disable the indexes.
4. Right-click the Indexes folder and select Disable All.
5. In the Disable Indexes dialog box, verify that the correct indexes are in the Indexes to disable grid and
click OK. To remove an index from the Indexes to disable grid, select the index and then press the Delete
key.
The following information is available in the Disable Indexes dialog box:
Index Name
Displays the name of the index. During execution, this column also displays an icon representing the status.
Table Name
Displays the name of the table or view that the index was created on.
Index Type
Displays the type of the index: Clustered, Nonclustered, Spatial, or XML.
Status
Displays the status of the disable operation. Possible values after execution are:
Blank
Prior to execution Status is blank.
In progress
Disabling of the indexes has been started but is not complete.
Success
The disable operation completed successfully.
Error
An error was encountered during the index disable operation, and the operation did not complete
successfully.
Stopped
The disable of the index was not completed successfully because the user stopped the operation.
Message
Provides the text of error messages during the disable operation. During execution, errors appear as
hyperlinks. The text of the hyperlinks describes the body of the error. The Message column is rarely wide
enough to read the full message text. There are two ways to get the full text:
Move the mouse pointer over the message cell to display a ToolTip with the error text.
Click the hyperlink to display a dialog box displaying the full error.

Using Transact-SQL
To disable an index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.
USE AdventureWorks2012;
GO
-- disables the IX_Employee_OrganizationLevel_OrganizationNode index
-- on the HumanResources.Employee table
ALTER INDEX IX_Employee_OrganizationLevel_OrganizationNode ON HumanResources.Employee
DISABLE;

To disable all indexes on a table


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Disables all indexes on the HumanResources.Employee table.
ALTER INDEX ALL ON HumanResources.Employee
DISABLE;
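
To confirm the result of either statement, the is_disabled column of the sys.indexes catalog view can be queried. This is a small sketch against the same sample table, not part of the procedure itself.

```sql
USE AdventureWorks2012;
GO
-- is_disabled = 1 means the index is currently disabled.
SELECT i.name      AS index_name,
       i.type_desc AS index_type,
       i.is_disabled
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('HumanResources.Employee')
  AND i.index_id > 0;   -- skip the heap entry, if any
GO
```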

For more information, see ALTER INDEX (Transact-SQL).


Enable Indexes and Constraints
3/24/2017

This topic describes how to enable a disabled index in SQL Server 2016 by using SQL Server Management Studio
or Transact-SQL. After an index is disabled, it remains in a disabled state until it is rebuilt or dropped.
In This Topic
Before you begin:
Limitations and Restrictions
Security
To enable a disabled index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
After rebuilding the index, any constraints that were disabled because of disabling the index must be
manually enabled. PRIMARY KEY and UNIQUE constraints are enabled by rebuilding the associated index.
This index must be rebuilt (enabled) before you can enable FOREIGN KEY constraints that reference the
PRIMARY KEY or UNIQUE constraint. FOREIGN KEY constraints are enabled by using the ALTER TABLE
CHECK CONSTRAINT statement.
Rebuilding a disabled clustered index cannot be performed when the ONLINE option is set to ON.
When the clustered index is disabled or enabled and the nonclustered index is disabled, the clustered index
action has the following results on the disabled nonclustered index.

CLUSTERED INDEX ACTION              DISABLED NONCLUSTERED INDEX …
ALTER INDEX REBUILD.                Remains disabled.
ALTER INDEX ALL REBUILD.            Is rebuilt and enabled.
DROP INDEX.                         Remains disabled.
CREATE INDEX WITH DROP_EXISTING.    Remains disabled.

Creating a new clustered index behaves the same as ALTER INDEX ALL REBUILD.
Allowed actions on nonclustered indexes associated with a clustered index depend on the state, whether
disabled or enabled, of both index types. The following table summarizes the allowed actions on
nonclustered indexes.
                                    WHEN BOTH THE CLUSTERED AND    WHEN THE CLUSTERED INDEX IS
                                    NONCLUSTERED INDEXES ARE       ENABLED AND THE NONCLUSTERED
NONCLUSTERED INDEX ACTION           DISABLED.                      INDEX IS IN EITHER STATE.
ALTER INDEX REBUILD.                The action fails.              The action succeeds.
DROP INDEX.                         The action succeeds.           The action succeeds.
CREATE INDEX WITH DROP_EXISTING.    The action fails.              The action succeeds.

Security
Permissions
Requires ALTER permission on the table or view. If using DBCC DBREINDEX, the user must either own the table or be a
member of the sysadmin fixed server role or the db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio


To enable a disabled index
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
enable an index.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to enable an index.
4. Click the plus sign to expand the Indexes folder.
5. Right-click the index you want to enable and select Rebuild.
6. In the Rebuild Indexes dialog box, verify that the correct index is in the Indexes to rebuild grid and click
OK.
To enable all indexes on a table
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
enable the indexes.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to enable the indexes.
4. Right-click the Indexes folder and select Rebuild All.
5. In the Rebuild Indexes dialog box, verify that the correct indexes are in the Indexes to rebuild grid and
click OK. To remove an index from the Indexes to rebuild grid, select the index and then press the Delete
key.
The following information is available in the Rebuild Indexes dialog box:

Using Transact-SQL
To enable a disabled index using ALTER INDEX
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.
USE AdventureWorks2012;
GO
-- Enables the IX_Employee_OrganizationLevel_OrganizationNode index
-- on the HumanResources.Employee table.
ALTER INDEX IX_Employee_OrganizationLevel_OrganizationNode ON HumanResources.Employee
REBUILD;
GO

To enable a disabled index using CREATE INDEX


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- re-creates the IX_Employee_OrganizationLevel_OrganizationNode index
-- on the HumanResources.Employee table
-- using the OrganizationLevel and OrganizationNode columns
-- and then deletes the existing IX_Employee_OrganizationLevel_OrganizationNode index
CREATE INDEX IX_Employee_OrganizationLevel_OrganizationNode ON HumanResources.Employee
(OrganizationLevel, OrganizationNode)
WITH (DROP_EXISTING = ON);
GO

To enable a disabled index using DBCC DBREINDEX


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- enables the IX_Employee_OrganizationLevel_OrganizationNode index
-- on the HumanResources.Employee table
DBCC DBREINDEX ('HumanResources.Employee', IX_Employee_OrganizationLevel_OrganizationNode);
GO

To enable all indexes on a table using ALTER INDEX


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- enables all indexes
-- on the HumanResources.Employee table
ALTER INDEX ALL ON HumanResources.Employee
REBUILD;
GO

To enable all indexes on a table using DBCC DBREINDEX


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- enables all indexes
-- on the HumanResources.Employee table
DBCC DBREINDEX ('HumanResources.Employee', ' ');
GO
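
Rebuilding enables the index itself, but, as noted under Limitations and Restrictions, FOREIGN KEY constraints that were disabled along with it must be enabled separately with ALTER TABLE CHECK CONSTRAINT. The sketch below generates those statements from sys.foreign_keys instead of naming constraints by hand. It only produces the statement text for review. The WITH CHECK clause is an added choice here (it revalidates existing rows) and is not required by the text above.

```sql
USE AdventureWorks2012;
GO
-- Builds one ALTER TABLE statement per disabled FOREIGN KEY constraint
-- that is defined on, or references, HumanResources.Employee.
-- Review the generated text, then run the statements you want.
SELECT 'ALTER TABLE '
       + QUOTENAME(OBJECT_SCHEMA_NAME(fk.parent_object_id)) + '.'
       + QUOTENAME(OBJECT_NAME(fk.parent_object_id))
       + ' WITH CHECK CHECK CONSTRAINT '
       + QUOTENAME(fk.name) + ';' AS enable_statement
FROM sys.foreign_keys AS fk
WHERE fk.is_disabled = 1
  AND (fk.parent_object_id     = OBJECT_ID('HumanResources.Employee')
       OR fk.referenced_object_id = OBJECT_ID('HumanResources.Employee'));
GO
```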

For more information, see ALTER INDEX (Transact-SQL), CREATE INDEX (Transact-SQL), and DBCC
DBREINDEX (Transact-SQL).
Rename Indexes
3/24/2017

This topic describes how to rename an index in SQL Server 2016 by using SQL Server Management Studio or
Transact-SQL. Renaming an index replaces the current index name with the new name that you provide. The
specified name must be unique within the table or view. For example, two tables can have an index named XPK_1,
but the same table cannot have two indexes named XPK_1. You cannot create an index with the same name as an
existing disabled index. Renaming an index does not cause the index to be rebuilt.
In This Topic
Before you begin:
Limitations and Restrictions
Security
To rename an index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
When you create a PRIMARY KEY or UNIQUE constraint on a table, an index with the same name as the constraint
is automatically created for the table. Because index names must be unique within the table, you cannot create or
rename an index to have the same name as an existing PRIMARY KEY or UNIQUE constraint on the table.
Security
Permissions
Requires ALTER permission on the index.

Using SQL Server Management Studio


To rename an index by using the Table Designer
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
rename an index.
2. Click the plus sign to expand the Tables folder.
3. Right-click the table on which you want to rename an index and select Design.
4. On the Table Designer menu, click Indexes/Keys.
5. Select the index you want to rename in the Selected Primary/Unique Key or Index text box.
6. In the grid, click Name and type a new name into the text box.
7. Click Close.
8. On the File menu, click Save table_name.
To rename an index by using Object Explorer
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
rename an index.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to rename an index.
4. Click the plus sign to expand the Indexes folder.
5. Right-click the index you want to rename and select Rename.
6. Type the index’s new name and press Enter.

Using Transact-SQL
To rename an index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Renames the IX_ProductVendor_VendorID index on the Purchasing.ProductVendor table to IX_VendorID.
EXEC sp_rename N'Purchasing.ProductVendor.IX_ProductVendor_VendorID', N'IX_VendorID', N'INDEX';
GO
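
A quick way to confirm the rename is to look up both names in sys.indexes. This is a sketch against the same sample table. Which rows come back depends on whether the rename has been run.

```sql
USE AdventureWorks2012;
GO
-- Shows whichever of the old and new index names currently exists
-- on Purchasing.ProductVendor.
SELECT i.name      AS index_name,
       i.type_desc AS index_type
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID('Purchasing.ProductVendor')
  AND i.name IN (N'IX_ProductVendor_VendorID', N'IX_VendorID');
GO
```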

For more information, see sp_rename (Transact-SQL).


Set Index Options
3/24/2017

This topic describes how to modify the properties of an index in SQL Server 2016 by using SQL Server
Management Studio or Transact-SQL.
In This Topic
Before you begin:
Limitations and Restrictions
Security
To modify the properties of an index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
The following options are immediately applied to the index by using the SET clause in the ALTER INDEX
statement: ALLOW_PAGE_LOCKS, ALLOW_ROW_LOCKS, IGNORE_DUP_KEY, and
STATISTICS_NORECOMPUTE.
The following options can be set when you rebuild an index by using either ALTER INDEX REBUILD or
CREATE INDEX WITH DROP_EXISTING: PAD_INDEX, FILLFACTOR, SORT_IN_TEMPDB, IGNORE_DUP_KEY,
STATISTICS_NORECOMPUTE, ONLINE, ALLOW_ROW_LOCKS, ALLOW_PAGE_LOCKS, MAXDOP, and
DROP_EXISTING (CREATE INDEX only).
Security
Permissions
Requires ALTER permission on the table or view.

Using SQL Server Management Studio


To modify the properties of an index in Table Designer
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
modify an index’s properties.
2. Click the plus sign to expand the Tables folder.
3. Right-click the table on which you want to modify an index’s properties and select Design.
4. On the Table Designer menu, click Indexes/Keys.
5. Select the index that you want to modify. Its properties will show up in the main grid.
6. Change the settings of any and all properties to customize the index.
7. Click Close.
8. On the File menu, select Save table_name.
To modify the properties of an index in Object Explorer
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
modify an index’s properties.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to modify an index’s properties.
4. Click the plus sign to expand the Indexes folder.
5. Right-click the index whose properties you want to modify and select Properties.
6. Under Select a page, select Options.
7. Change the settings of any and all properties to customize the index.
8. To add, remove, or change the position of an index column, select the General page from the Index
Properties - index_name dialog box. For more information, see Index Properties F1 Help.

Using Transact-SQL
To see the properties of all the indexes in a table
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
SELECT i.name AS index_name,
    i.type_desc,
    i.is_unique,
    ds.type_desc AS filegroup_or_partition_scheme,
    ds.name AS filegroup_or_partition_scheme_name,
    i.ignore_dup_key,
    i.is_primary_key,
    i.is_unique_constraint,
    i.fill_factor,
    i.is_padded,
    i.is_disabled,
    i.allow_row_locks,
    i.allow_page_locks,
    i.has_filter,
    i.filter_definition
FROM sys.indexes AS i
    INNER JOIN sys.data_spaces AS ds ON i.data_space_id = ds.data_space_id
WHERE i.is_hypothetical = 0 AND i.index_id <> 0
    AND i.object_id = OBJECT_ID('HumanResources.Employee');
GO

To set the properties of an index


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following examples into the query window and click Execute.
USE AdventureWorks2012;
GO
ALTER INDEX AK_SalesOrderHeader_SalesOrderNumber ON
Sales.SalesOrderHeader
SET (
STATISTICS_NORECOMPUTE = ON,
IGNORE_DUP_KEY = ON,
ALLOW_PAGE_LOCKS = ON
) ;
GO

USE AdventureWorks2012;
GO
ALTER INDEX ALL ON Production.Product
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON,
STATISTICS_NORECOMPUTE = ON);
GO
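
The examples above use ALTER INDEX. The Limitations and Restrictions list also allows options to be set through CREATE INDEX WITH DROP_EXISTING, which the following sketch illustrates. It reuses the IX_Employee_OrganizationLevel_OrganizationNode index and key columns shown in Enable Indexes and Constraints. The option values are arbitrary and chosen only for illustration.

```sql
USE AdventureWorks2012;
GO
-- Re-creates the index in place and sets rebuild-time options.
-- The option values are illustrative, not recommendations.
CREATE INDEX IX_Employee_OrganizationLevel_OrganizationNode
    ON HumanResources.Employee (OrganizationLevel, OrganizationNode)
    WITH (DROP_EXISTING = ON,
          PAD_INDEX = ON,
          FILLFACTOR = 80,
          ALLOW_ROW_LOCKS = ON);
GO
```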

For more information, see ALTER INDEX (Transact-SQL).


Disk Space Requirements for Index DDL Operations
3/24/2017

Disk space is an important consideration when you create, rebuild, or drop indexes. Inadequate disk space can
degrade performance or even cause the index operation to fail. This topic provides general information that can
help you determine the amount of disk space required for index data definition language (DDL) operations.

Index Operations That Require No Additional Disk Space


The following index operations require no additional disk space:
ALTER INDEX REORGANIZE; however, log space is required.
DROP INDEX when you are dropping a nonclustered index.
DROP INDEX when you are dropping a clustered index offline without specifying the MOVE TO clause and
nonclustered indexes do not exist.
CREATE TABLE (PRIMARY KEY or UNIQUE constraints)

Index Operations That Require Additional Disk Space


All other index DDL operations require additional temporary disk space to use during the operation, and
permanent disk space to store the new index structure or structures.
When a new index structure is created, disk space for both the old (source) and new (target) structures is required
in their appropriate files and filegroups. The old structure is not deallocated until the index creation transaction
commits.
The following index DDL operations create new index structures and require additional disk space:
CREATE INDEX
CREATE INDEX WITH DROP_EXISTING
ALTER INDEX REBUILD
ALTER TABLE ADD CONSTRAINT (PRIMARY KEY or UNIQUE)
ALTER TABLE DROP CONSTRAINT (PRIMARY KEY or UNIQUE) when the constraint is based on a clustered
index
DROP INDEX MOVE TO (Applies only to clustered indexes.)
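Before running one of the operations listed above, it can help to check how much free space each file of the database currently has. The following is a minimal sketch, assuming it is run in the database that holds the index; the column aliases are illustrative only.

```sql
-- Free space per file in the current database.
-- size and FILEPROPERTY(..., 'SpaceUsed') are counted in 8-KB pages,
-- so dividing by 128 converts the values to MB.
SELECT
    f.name                                               AS logical_file_name,
    f.type_desc                                          AS file_type,
    f.size / 128.0                                       AS size_mb,
    FILEPROPERTY(f.name, 'SpaceUsed') / 128.0            AS used_mb,
    (f.size - FILEPROPERTY(f.name, 'SpaceUsed')) / 128.0 AS free_mb
FROM sys.database_files AS f;
```

The result shows only space already allocated to the files. Files that are allowed to autogrow may be able to take more, provided the underlying disk has room.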

Temporary Disk Space for Sorting


Besides the disk space required for the source and target structures, temporary disk space is required for sorting,
unless the query optimizer finds an execution plan that does not require sorting.
If sorting is required, sorting occurs one new index at a time. For example, when you rebuild a clustered index and
associated nonclustered indexes within a single statement, the indexes are sorted one after the other. Therefore,
the additional temporary disk space that is required for sorting only has to be as large as the largest index in the
operation. This is almost always the clustered index.
If the SORT_IN_TEMPDB option is set to ON, the largest index must fit into tempdb. Although this option
increases the amount of temporary disk space that is used to create an index, it may reduce the time that is
required to create an index when tempdb is on a set of disks different from the user database.
If SORT_IN_TEMPDB is set to OFF (the default), each index, including partitioned indexes, is sorted in its destination
disk space, and only the disk space for the new index structures is required.
For an example of calculating disk space, see Index Disk Space Example.
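As a rough illustration of the SORT_IN_TEMPDB guidance above, the sketch below first checks the unallocated space in tempdb and then rebuilds one index with the sort run in tempdb. It assumes the AdventureWorks2012 sample database used elsewhere in this documentation; compare the reported free space with the size of your largest index before relying on it.

```sql
-- Unallocated space in the tempdb data files (pages are 8 KB).
SELECT SUM(unallocated_extent_page_count) * 8 / 1024.0 AS tempdb_free_mb
FROM tempdb.sys.dm_db_file_space_usage;
GO

USE AdventureWorks2012;
GO
-- Rebuild one index and store the intermediate sort results in tempdb.
ALTER INDEX PK_Employee_BusinessEntityID ON HumanResources.Employee
REBUILD WITH (SORT_IN_TEMPDB = ON);
GO
```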

Temporary Disk Space for Online Index Operations


When you perform index operations online, additional temporary disk space is required.
If a clustered index is created, rebuilt, or dropped online, a temporary nonclustered index is created to map old
bookmarks to new bookmarks. If the SORT_IN_TEMPDB option is set to ON, this temporary index is created in
tempdb. If SORT_IN_TEMPDB is set to OFF, the same filegroup or partition scheme as the target index is used.
The temporary mapping index contains one record for each row in the table, and its contents are the union of the
old and new bookmark columns, including uniqueifiers and record identifiers and including only a single copy of
any column used in both bookmarks. For more information about online index operations, see Perform Index
Operations Online.

NOTE
The SORT_IN_TEMPDB option cannot be set for DROP INDEX statements. The temporary mapping index is always created
in the same filegroup or partition scheme as the target index.

Online index operations use row versioning to isolate the index operation from the effects of modifications made
by other transactions. This avoids the need for requesting share locks on rows that have been read. Concurrent
user update and delete operations during online index operations require space for version records in tempdb.
For more information, see Perform Index Operations Online.
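To see how much tempdb space version records are using while an online index operation runs, a query along the following lines can be used. This is a sketch only: the figure covers all row versioning activity on the instance, not just the index operation.

```sql
-- Space reserved for the version store in tempdb (pages are 8 KB).
SELECT SUM(version_store_reserved_page_count) * 8 / 1024.0 AS version_store_mb
FROM tempdb.sys.dm_db_file_space_usage;
```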

Related Tasks
Index Disk Space Example
Transaction Log Disk Space for Index Operations
Estimate the Size of a Table
Estimate the Size of a Clustered Index
Estimate the Size of a Nonclustered Index
Estimate the Size of a Heap

Related Content
CREATE INDEX (Transact-SQL)
ALTER INDEX (Transact-SQL)
DROP INDEX (Transact-SQL)
Specify Fill Factor for an Index
Reorganize and Rebuild Indexes
Transaction Log Disk Space for Index Operations
3/24/2017 • 1 min to read

Large-scale index operations can generate large data loads that can cause the transaction log to fill quickly. To
make sure that the index operation can be rolled back, the transaction log cannot be truncated until the index
operation has completed; however, the log can be backed up during the index operation. Therefore, the transaction
log must have sufficient room to store both the index operation transactions and any concurrent user transactions
for the duration of the index operation. This is true for both offline and online index operations. Because the
underlying tables cannot be accessed during an offline index operation, there may be few user transactions and the
log may not grow as quickly. Online index operations do not prevent concurrent user activity. Therefore, large-scale
online index operations combined with significant concurrent user transactions can cause continuous growth of
the transaction log without an option to truncate the log.

Recommendations
When you run large-scale index operations, consider the following recommendations:
1. Make sure the transaction log has been backed up and truncated before running large-scale index
operations online, and that the log has sufficient space to store the projected index and user transactions.
2. Consider setting the SORT_IN_TEMPDB option to ON for the index operation. This separates the index
transactions from the concurrent user transactions. The index transactions will be stored in the tempdb
transaction log, and the concurrent user transactions will be stored in the transaction log of the user
database. This allows for the transaction log of the user database to be truncated during the index operation
if it is required. Additionally, if the tempdb log is not on the same disk as the user database log, the two
logs are not competing for the same disk space.

NOTE
Verify that the tempdb database and transaction log have sufficient disk space to handle the index operation. The
tempdb transaction log cannot be truncated until the index operation is completed.

3. Use a database recovery model that allows for minimal logging of the index operation. This may reduce the
size of the log and prevent the log from filling the log space.
4. Do not run the online index operation in an explicit transaction. The log will not be truncated until the
explicit transaction ends.
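The recommendations above can be backed by a quick check of the log before the operation starts. The sketch below assumes SQL Server 2012 or later, where sys.dm_db_log_space_usage is available, and uses the AdventureWorks2012 sample database as a stand-in for your own.

```sql
USE AdventureWorks2012;
GO
-- Current size and usage of the transaction log of this database.
SELECT
    total_log_size_in_bytes / 1048576.0 AS log_size_mb,
    used_log_space_in_bytes / 1048576.0 AS log_used_mb,
    used_log_space_in_percent
FROM sys.dm_db_log_space_usage;

-- Recovery model and the reason, if any, that log truncation is being held up.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = DB_NAME();
GO
```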

Related Content
Disk Space Requirements for Index DDL Operations
Index Disk Space Example
Index Disk Space Example
3/24/2017 • 5 min to read

Whenever an index is created, rebuilt, or dropped, disk space for both the old (source) and new (target) structures
is required in their appropriate files and filegroups. The old structure is not deallocated until the index creation
transaction commits. Additional temporary disk space for sorting operations may also be needed. For more
information, see Disk Space Requirements for Index DDL Operations.
In this example, disk space requirements to create a clustered index are determined.
Assume the following conditions are true before creating the clustered index:
The existing table (heap) contains 1 million rows. Each row is 200 bytes long.
Nonclustered index A contains 1 million rows. Each row is 50 bytes long.
Nonclustered index B contains 1 million rows. Each row is 80 bytes long.
The index create memory option is set to 2 MB.
A fill factor value of 80 is used for all existing and new indexes. This means the pages are 80 percent full.

NOTE
As a result of creating a clustered index, the two nonclustered indexes must be rebuilt to replace the row indicator
with the new clustered index key.

Disk Space Calculations for an Offline Index Operation


In the following steps, both temporary disk space to be used during the index operation and permanent disk space
to store the new indexes are calculated. The calculations shown are approximate; results are rounded up and
consider only the size of index leaf level. The tilde (~) is used to indicate approximate calculations.
1. Determine the size of the source structures.
Heap: 1 million * 200 bytes ~ 200 MB
Nonclustered index A: 1 million * 50 bytes / 80% ~ 63 MB
Nonclustered index B: 1 million * 80 bytes / 80% ~ 100 MB
Total size of existing structures: 363 MB
2. Determine the size of the target index structures. Assume that the new clustered key is 24 bytes long
including a uniqueifier. The row indicator (8 bytes long) in both nonclustered indexes will be replaced by
this clustered key.
Clustered index: 1 million * 200 bytes / 80% ~ 250 MB
Nonclustered index A: 1 million * (50 – 8 + 24) bytes / 80% ~ 83 MB
Nonclustered index B: 1 million * (80 – 8 + 24) bytes / 80% ~ 120 MB
Total size of new structures: 453 MB
Total disk space required to support both the source and target structures for the duration of the index
operation is 816 MB (363 + 453). The space currently allocated to the source structures will be deallocated
after the index operation is committed.
3. Determine additional temporary disk space for sorting.
Space requirements are shown for sorting in tempdb (with SORT_IN_TEMPDB set to ON) and sorting in the
target location (with SORT_IN_TEMPDB set to OFF).
a. When SORT_IN_TEMPDB is set to ON, tempdb must have sufficient disk space to hold the largest
index (1 million * 200 bytes ~ 200 MB). Fill factor is not considered in the sorting operation.
Additional disk space (in the tempdb location) equal to the index create memory option value
(see Configure the index create memory Server Configuration Option) = 2 MB.
Total size of temporary disk space with SORT_IN_TEMPDB set to ON ~ 202 MB.
b. When SORT_IN_TEMPDB is set to OFF (default), the 250 MB of disk space already considered for the
new index in step 2 is used for sorting.
Additional disk space (in the target location) equal to the index create memory option value
(see Configure the index create memory Server Configuration Option) = 2 MB.
Total size of temporary disk space with SORT_IN_TEMPDB set to OFF = 2 MB.
Using tempdb, a total of 1018 MB (816 + 202) would be needed to create the clustered and nonclustered
indexes. Although using tempdb increases the amount of temporary disk space used to create an index, it
may reduce the time that is required to create an index when tempdb is on a different set of disks than the
user database. For more information about using tempdb, see SORT_IN_TEMPDB Option For Indexes.
Without using tempdb, a total of 818 MB (816 + 2) would be needed to create the clustered and
nonclustered indexes.
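The arithmetic in steps 1 through 3 can be reproduced with a short script. This is only a sketch of the example's own figures: the variable names are illustrative, 1 MB is treated as 1 million bytes as in the text, and each structure is rounded up to the next whole MB.

```sql
-- Inputs taken from the example above.
DECLARE @rows         decimal(18, 0) = 1000000;  -- rows in the heap and in each index
DECLARE @fill_factor  decimal(3, 2)  = 0.80;     -- fill factor of 80
DECLARE @bytes_per_mb decimal(18, 0) = 1000000;  -- assumption: 1 MB ~ 1 million bytes
DECLARE @rid_bytes int = 8;                      -- row indicator in the old nonclustered indexes
DECLARE @key_bytes int = 24;                     -- new clustered key, including the uniqueifier
DECLARE @index_create_memory_mb int = 2;         -- index create memory option

WITH parts AS (
    SELECT
        CAST(CEILING(@rows * 200 / @bytes_per_mb) AS int)                AS heap_mb,
        CAST(CEILING(@rows * 50 / @fill_factor / @bytes_per_mb) AS int)  AS nc_a_source_mb,
        CAST(CEILING(@rows * 80 / @fill_factor / @bytes_per_mb) AS int)  AS nc_b_source_mb,
        CAST(CEILING(@rows * 200 / @fill_factor / @bytes_per_mb) AS int) AS clustered_mb,
        CAST(CEILING(@rows * (50 - @rid_bytes + @key_bytes) / @fill_factor / @bytes_per_mb) AS int) AS nc_a_target_mb,
        CAST(CEILING(@rows * (80 - @rid_bytes + @key_bytes) / @fill_factor / @bytes_per_mb) AS int) AS nc_b_target_mb
)
SELECT
    heap_mb + nc_a_source_mb + nc_b_source_mb      AS source_mb,              -- 363
    clustered_mb + nc_a_target_mb + nc_b_target_mb AS target_mb,              -- 453
    heap_mb + @index_create_memory_mb              AS sort_in_tempdb_on_mb,   -- 202
    @index_create_memory_mb                        AS sort_in_tempdb_off_mb   -- 2
FROM parts;
```

Adding source_mb, target_mb, and the relevant sort figure gives the totals quoted above: 1018 MB with tempdb and 818 MB without.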

Disk Space Calculations for an Online Clustered Index Operation


When you create, drop, or rebuild a clustered index online, additional disk space is required to build and maintain
a temporary mapping index. This temporary mapping index contains one record for each row in the table, and its
contents are the union of the old and new bookmark columns.
To calculate the disk space needed for an online clustered index operation, follow the steps shown for an offline
index operation and add those results to the results of the following step.
Determine space for the temporary mapping index.
In this example, the old bookmark is the row ID (RID) of the heap (8 bytes) and the new bookmark is the
clustering key (24 bytes including a uniqueifier). There are no overlapping columns between the old and
new bookmarks.
Temporary mapping index size = 1 million * (8 bytes + 24 bytes) / 80% ~ 40 MB.
This disk space must be added to the required disk space in the target location if SORT_IN_TEMPDB is set to
OFF, or to tempdb if SORT_IN_TEMPDB is set to ON.
For more information about the temporary mapping index, see Disk Space Requirements for Index DDL
Operations.
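The mapping index estimate follows the same pattern as the offline calculation. A minimal sketch, again treating 1 MB as 1 million bytes and using illustrative variable names:

```sql
DECLARE @rows               decimal(18, 0) = 1000000;
DECLARE @fill_factor        decimal(3, 2)  = 0.80;
DECLARE @bytes_per_mb       decimal(18, 0) = 1000000;  -- assumption: 1 MB ~ 1 million bytes
DECLARE @old_bookmark_bytes int = 8;    -- RID of the heap
DECLARE @new_bookmark_bytes int = 24;   -- clustering key, including the uniqueifier

-- One record per table row, holding both bookmarks (no overlapping columns here).
SELECT CAST(CEILING(@rows * (@old_bookmark_bytes + @new_bookmark_bytes)
                    / @fill_factor / @bytes_per_mb) AS int) AS mapping_index_mb;  -- 40
```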

Disk Space Summary


The following table summarizes the results of the disk space calculations.
INDEX OPERATION DISK SPACE REQUIREMENTS FOR THE LOCATIONS OF THE FOLLOWING STRUCTURES

Offline index operation with SORT_IN_TEMPDB = ON
Total space during the operation: 1018 MB
- Existing table and indexes: 363 MB*
- tempdb: 202 MB*
- New indexes: 453 MB
Total space required after the operation: 453 MB

Offline index operation with SORT_IN_TEMPDB = OFF
Total space during the operation: 816 MB
- Existing table and indexes: 363 MB*
- New indexes: 453 MB
Total space required after the operation: 453 MB

Online index operation with SORT_IN_TEMPDB = ON
Total space during the operation: 1058 MB
- Existing table and indexes: 363 MB*
- tempdb (includes mapping index): 242 MB*
- New indexes: 453 MB
Total space required after the operation: 453 MB

Online index operation with SORT_IN_TEMPDB = OFF
Total space during the operation: 856 MB
- Existing table and indexes: 363 MB*
- Temporary mapping index: 40 MB*
- New indexes: 453 MB
Total space required after the operation: 453 MB

*This space is deallocated after the index operation is committed.


This example does not consider any additional temporary disk space required in tempdb for version records
created by concurrent user update and delete operations.

Related Content
Disk Space Requirements for Index DDL Operations
Transaction Log Disk Space for Index Operations
Reorganize and Rebuild Indexes
4/6/2017 • 9 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2008) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to reorganize or rebuild a fragmented index in SQL Server 2016 by using SQL Server
Management Studio or Transact-SQL. The SQL Server Database Engine automatically maintains indexes whenever
insert, update, or delete operations are made to the underlying data. Over time these modifications can cause the
information in the index to become scattered in the database (fragmented). Fragmentation exists when indexes
have pages in which the logical ordering, based on the key value, does not match the physical ordering inside the
data file. Heavily fragmented indexes can degrade query performance and cause your application to respond
slowly.
You can remedy index fragmentation by reorganizing or rebuilding an index. For partitioned indexes built on a
partition scheme, you can use either of these methods on a complete index or a single partition of an index.
Rebuilding an index drops and re-creates the index. This removes fragmentation, reclaims disk space by
compacting the pages based on the specified or existing fill factor setting, and reorders the index rows in
contiguous pages. When ALL is specified, all indexes on the table are dropped and rebuilt in a single transaction.
Reorganizing an index uses minimal system resources. It defragments the leaf level of clustered and nonclustered
indexes on tables and views by physically reordering the leaf-level pages to match the logical, left to right, order of
the leaf nodes. Reorganizing also compacts the index pages. Compaction is based on the existing fill factor value.
In This Topic
Before you begin:
Detecting Fragmentation
Limitations and Restrictions
Security
To check the fragmentation of an index, using:
SQL Server Management Studio
Transact-SQL
To reorganize or rebuild an index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Detecting Fragmentation
The first step in deciding which defragmentation method to use is to analyze the index to determine the degree of
fragmentation. By using the system function sys.dm_db_index_physical_stats, you can detect fragmentation in a
specific index, all indexes on a table or indexed view, all indexes in a database, or all indexes in all databases. For
partitioned indexes, sys.dm_db_index_physical_stats also provides fragmentation information for each partition.
The result set returned by the sys.dm_db_index_physical_stats function includes the following columns.
COLUMN DESCRIPTION

avg_fragmentation_in_percent The percent of logical fragmentation (out-of-order pages in the index).

fragment_count The number of fragments (physically consecutive leaf pages) in the index.

avg_fragment_size_in_pages Average number of pages in one fragment in an index.

After the degree of fragmentation is known, use the following table to determine the best method to correct the
fragmentation.

AVG_FRAGMENTATION_IN_PERCENT VALUE CORRECTIVE STATEMENT

> 5% and <= 30% ALTER INDEX REORGANIZE

> 30% ALTER INDEX REBUILD WITH (ONLINE = ON)*

* Rebuilding an index can be executed online or offline. Reorganizing an index is always executed online. To
achieve availability similar to the reorganize option, you should rebuild indexes online.
These values provide a rough guideline for determining the point at which you should switch between ALTER
INDEX REORGANIZE and ALTER INDEX REBUILD. However, the actual values may vary from case to case. It is
important that you experiment to determine the best threshold for your environment. Very low levels of
fragmentation (less than 5 percent) should not be addressed by either of these commands because the benefit
from removing such a small amount of fragmentation is almost always vastly outweighed by the cost of
reorganizing or rebuilding the index.
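The thresholds in the table above can be applied directly to the output of sys.dm_db_index_physical_stats. The query below is a sketch rather than a maintenance script: it uses the 5 and 30 percent guideline values as given, the column aliases are illustrative, and page_count is included so that very small indexes can be judged separately.

```sql
USE AdventureWorks2012;
GO
-- Suggest a corrective statement for every index in the current database.
SELECT
    OBJECT_SCHEMA_NAME(ps.object_id) AS table_schema,
    OBJECT_NAME(ps.object_id)        AS table_name,
    i.name                           AS index_name,
    ps.partition_number,
    ps.page_count,
    ps.avg_fragmentation_in_percent,
    CASE
        WHEN ps.avg_fragmentation_in_percent > 30 THEN N'ALTER INDEX REBUILD WITH (ONLINE = ON)'
        WHEN ps.avg_fragmentation_in_percent > 5  THEN N'ALTER INDEX REORGANIZE'
        ELSE N'No action'
    END AS suggested_statement
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, N'LIMITED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.index_id > 0   -- skip heaps
ORDER BY ps.avg_fragmentation_in_percent DESC;
GO
```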

NOTE
In general, fragmentation on small indexes is often not controllable. The pages of small indexes are sometimes stored on
mixed extents. Mixed extents are shared by up to eight objects, so the fragmentation in a small index might not be reduced
after reorganizing or rebuilding the index.

Limitations and Restrictions


Indexes with more than 128 extents are rebuilt in two separate phases: logical and physical. In the logical
phase, the existing allocation units used by the index are marked for deallocation, the data rows are copied
and sorted, then moved to new allocation units created to store the rebuilt index. In the physical phase, the
allocation units previously marked for deallocation are physically dropped in short transactions that happen
in the background, and do not require many locks.
Index options cannot be specified when reorganizing an index.
The ALTER INDEX REORGANIZE statement requires the data file containing the index to have space available,
because the operation can only allocate temporary work pages on the same file, not another file within the
filegroup. So although the filegroup might have free pages available, the user can still encounter error 1105
"Could not allocate space for object <index name>.<table name> in database <database name> because
the 'PRIMARY' filegroup is full."
Creating and rebuilding nonaligned indexes on a table with more than 1,000 partitions is possible, but is
not supported. Doing so may cause degraded performance or excessive memory consumption during these
operations.
NOTE
Starting with SQL Server 2012, statistics are not created by scanning all the rows in the table when a partitioned index is
created or rebuilt. Instead, the query optimizer uses the default sampling algorithm to generate statistics. To obtain statistics
on partitioned indexes by scanning all the rows in the table, use CREATE STATISTICS or UPDATE STATISTICS with the
FULLSCAN clause.
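A full-scan statistics update, as mentioned in the note, might look like the following. The table is taken from the AdventureWorks2012 sample database and is not partitioned there, so treat the name as a placeholder for your own partitioned table.

```sql
USE AdventureWorks2012;
GO
-- Recompute all statistics on the table by scanning every row.
UPDATE STATISTICS HumanResources.Employee WITH FULLSCAN;
GO
```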

Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio


To check the fragmentation of an index
1. In Object Explorer, expand the database that contains the table on which you want to check an index’s
fragmentation.
2. Expand the Tables folder.
3. Expand the table on which you want to check an index’s fragmentation.
4. Expand the Indexes folder.
5. Right-click the index whose fragmentation you want to check and select Properties.
6. Under Select a page, select Fragmentation.
The following information is available on the Fragmentation page:
Page fullness
Indicates average fullness of the index pages, as a percentage. 100% means the index pages are completely
full. 50% means that, on average, each index page is half full.
Total fragmentation
The logical fragmentation percentage. This indicates the number of pages in an index that are not stored in
order.
Average row size
The average size of a leaf level row.
Depth
The number of levels in the index, including the leaf level.
Forwarded records
The number of records in a heap that have forward pointers to another data location. (This state occurs
during an update, when there is not enough room to store the new row in the original location.)
Ghost rows
The number of rows that are marked as deleted but not yet removed. These rows will be removed by a
clean-up thread, when the server is not busy. This value does not include rows that are being retained due
to an outstanding snapshot isolation transaction.
Index type
The type of index. Possible values are Clustered index, Nonclustered index, and Primary XML. Tables
can also be stored as a heap (without indexes), but then this Index Properties page cannot be opened.
Leaf-level rows
The number of leaf level rows.
Maximum row size
The maximum leaf-level row size.
Minimum row size
The minimum leaf-level row size.
Pages
The total number of data pages.
Partition ID
The partition ID of the b-tree containing the index.
Version ghost rows
The number of ghost records that are being retained due to an outstanding snapshot isolation transaction.
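Most of the values on the Fragmentation page have counterparts in the result set of sys.dm_db_index_physical_stats. The following sketch assumes the AdventureWorks2012 sample database and uses DETAILED mode, which reads every page and can be slow on large indexes; the mapping of columns to page fields is approximate.

```sql
USE AdventureWorks2012;
GO
SELECT
    i.name AS index_name,
    ps.index_type_desc,                 -- Index type
    ps.partition_number,
    ps.index_depth,                     -- Depth
    ps.page_count,                      -- Pages
    ps.avg_page_space_used_in_percent,  -- Page fullness
    ps.avg_fragmentation_in_percent,    -- Total fragmentation
    ps.record_count,                    -- Leaf-level rows
    ps.min_record_size_in_bytes,        -- Minimum row size
    ps.max_record_size_in_bytes,        -- Maximum row size
    ps.avg_record_size_in_bytes,        -- Average row size
    ps.forwarded_record_count,          -- Forwarded records (heaps only)
    ps.ghost_record_count,              -- Ghost rows
    ps.version_ghost_record_count       -- Version ghost rows
FROM sys.dm_db_index_physical_stats(
         DB_ID(), OBJECT_ID(N'HumanResources.Employee'), NULL, NULL, N'DETAILED') AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.index_level = 0;   -- leaf level only
GO
```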

Using Transact-SQL
To check the fragmentation of an index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Find the average fragmentation percentage of all indexes
-- in the HumanResources.Employee table.
SELECT a.index_id, name, avg_fragmentation_in_percent
FROM sys.dm_db_index_physical_stats (DB_ID(N'AdventureWorks2012'),
OBJECT_ID(N'HumanResources.Employee'), NULL, NULL, NULL) AS a
JOIN sys.indexes AS b ON a.object_id = b.object_id AND a.index_id = b.index_id;
GO

The statement above might return a result set similar to the following.

index_id    name                                                  avg_fragmentation_in_percent
----------- ----------------------------------------------------- ----------------------------
1           PK_Employee_BusinessEntityID                          0
2           IX_Employee_OrganizationalNode                        0
3           IX_Employee_OrganizationalLevel_OrganizationalNode    0
5           AK_Employee_LoginID                                   66.6666666666667
6           AK_Employee_NationalIDNumber                          50
7           AK_Employee_rowguid                                   0

(6 row(s) affected)

For more information, see sys.dm_db_index_physical_stats (Transact-SQL).

Using SQL Server Management Studio


To reorganize an index
1. In Object Explorer, expand the database that contains the table on which you want to reorganize an index.
2. Expand the Tables folder.
3. Expand the table on which you want to reorganize an index.
4. Expand the Indexes folder.
5. Right-click the index you want to reorganize and select Reorganize.
6. In the Reorganize Indexes dialog box, verify that the correct index is in the Indexes to be reorganized
grid.
7. Select the Compact large object column data check box to specify that all pages that contain large object
(LOB) data are also compacted.
8. Click OK.
To reorganize all indexes in a table
1. In Object Explorer, expand the database that contains the table on which you want to reorganize the
indexes.
2. Expand the Tables folder.
3. Expand the table on which you want to reorganize the indexes.
4. Right-click the Indexes folder and select Reorganize All.
5. In the Reorganize Indexes dialog box, verify that the correct indexes are in the Indexes to be
reorganized grid. To remove an index from the Indexes to be reorganized grid, select the index and then
press the Delete key.
6. Select the Compact large object column data check box to specify that all pages that contain large object
(LOB) data are also compacted.
7. Click OK.
To rebuild an index
1. In Object Explorer, expand the database that contains the table on which you want to rebuild an index.
2. Expand the Tables folder.
3. Expand the table on which you want to rebuild an index.
4. Expand the Indexes folder.
5. Right-click the index you want to rebuild and select Rebuild.
6. In the Rebuild Indexes dialog box, verify that the correct index is in the Indexes to be rebuilt grid.
7. Select the Compact large object column data check box to specify that all pages that contain large object
(LOB) data are also compacted.
8. Click OK.

Using Transact-SQL
To reorganize a fragmented index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.
USE AdventureWorks2012;
GO
-- Reorganize the IX_Employee_OrganizationalLevel_OrganizationalNode index on the
-- HumanResources.Employee table.
ALTER INDEX IX_Employee_OrganizationalLevel_OrganizationalNode ON HumanResources.Employee
REORGANIZE;
GO

To reorganize all indexes in a table


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
-- Reorganize all indexes on the HumanResources.Employee table.
ALTER INDEX ALL ON HumanResources.Employee
REORGANIZE ;
GO
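The Compact large object column data check box in SQL Server Management Studio corresponds to the LOB_COMPACTION option of ALTER INDEX REORGANIZE. A minimal sketch, using the same sample table:

```sql
USE AdventureWorks2012;
GO
-- Reorganize all indexes on the table and compact pages that hold LOB data.
ALTER INDEX ALL ON HumanResources.Employee
REORGANIZE WITH (LOB_COMPACTION = ON);
GO
```

LOB_COMPACTION is ON by default, so stating it explicitly mainly documents the intent. Set it to OFF to skip LOB pages.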

To rebuild a fragmented index


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute. The example rebuilds a
single index on the Employee table.

USE AdventureWorks2012;
GO
ALTER INDEX PK_Employee_BusinessEntityID ON HumanResources.Employee
REBUILD;
GO

To rebuild all indexes in a table


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute. The example specifies the
keyword ALL. This rebuilds all indexes associated with the table. Three options are specified.

USE AdventureWorks2012;
GO
ALTER INDEX ALL ON Production.Product
REBUILD WITH (FILLFACTOR = 80, SORT_IN_TEMPDB = ON,
STATISTICS_NORECOMPUTE = ON);
GO

For more information, see ALTER INDEX (Transact-SQL).

See Also
Microsoft SQL Server 2000 Index Defragmentation Best Practices
Specify Fill Factor for an Index
3/24/2017 • 4 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes what fill factor is and how to specify a fill factor value on an index in SQL Server 2016 by using
SQL Server Management Studio or Transact-SQL.
The fill-factor option is provided for fine-tuning index data storage and performance. When an index is created or
rebuilt, the fill-factor value determines the percentage of space on each leaf-level page to be filled with data,
reserving the remainder on each page as free space for future growth. For example, specifying a fill-factor value of
80 means that 20 percent of each leaf-level page will be left empty, providing space for index expansion as data is
added to the underlying table. The empty space is reserved between the index rows rather than at the end of the
index.
The fill-factor value is a percentage from 1 to 100. The server-wide default is 0, which means that the leaf-level
pages are filled to capacity.

NOTE
Fill-factor values 0 and 100 are the same in all respects.
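To see which fill factor is currently in effect, you can look at both the server-wide default and the value stored with each index. The following sketch assumes the AdventureWorks2012 sample database; a fill_factor of 0 in sys.indexes means the index uses the default.

```sql
-- Server-wide default fill factor.
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'fill factor (%)';

USE AdventureWorks2012;
GO
-- Fill factor recorded for each index on one table.
SELECT i.name AS index_name, i.type_desc, i.fill_factor
FROM sys.indexes AS i
WHERE i.object_id = OBJECT_ID(N'HumanResources.Employee');
GO
```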

In This Topic
Before you begin:
Performance Considerations
Security
To specify a fill factor in an index, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Performance Considerations
Page Splits
A correctly chosen fill-factor value can reduce potential page splits by providing enough space for index expansion
as data is added to the underlying table. When a new row is added to a full index page, the Database Engine moves
approximately half the rows to a new page to make room for the new row. This reorganization is known as a page
split. A page split makes room for new records, but can take time to perform and is a resource intensive operation.
Also, it can cause fragmentation that causes increased I/O operations. When frequent page splits occur, the index
can be rebuilt by using a new or existing fill-factor value to redistribute the data. For more information, see
Reorganize and Rebuild Indexes.
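One way to get a feel for how often page splits occur is the leaf_allocation_count column of sys.dm_db_index_operational_stats, which counts leaf-level page allocations. This is a rough indicator only: the counter also includes pages added at the end of an index, and it is reset when the metadata for the index leaves the cache, for example after a server restart.

```sql
USE AdventureWorks2012;
GO
-- Cumulative page allocations per index since the counters were last reset.
SELECT
    i.name AS index_name,
    os.partition_number,
    os.leaf_allocation_count,      -- leaf-level allocations (page splits for an index)
    os.nonleaf_allocation_count    -- allocations above the leaf level
FROM sys.dm_db_index_operational_stats(
         DB_ID(), OBJECT_ID(N'HumanResources.Employee'), NULL, NULL) AS os
JOIN sys.indexes AS i
    ON i.object_id = os.object_id AND i.index_id = os.index_id
ORDER BY os.leaf_allocation_count DESC;
GO
```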
Although a low, nonzero fill-factor value may reduce the requirement to split pages as the index grows, the index
will require more storage space and can decrease read performance. Even for an application oriented for many
insert and update operations, the number of database reads typically outnumbers database writes by a factor of 5
to 10. Therefore, specifying a fill factor other than the default can decrease database read performance by an
amount inversely proportional to the fill-factor setting. For example, a fill-factor value of 50 can cause database
read performance to decrease by two times. Read performance is decreased because the index contains more
pages, therefore increasing the disk IO operations required to retrieve the data.
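The inverse relationship between fill factor and page count can be made concrete with a small calculation. The sketch below assumes rows of uniform size and looks at the leaf level only, so the numbers are relative, not absolute.

```sql
-- Leaf pages needed at a given fill factor, relative to completely full pages.
SELECT
    v.fill_factor,
    CAST(100.0 / v.fill_factor AS decimal(5, 2)) AS relative_leaf_page_count
FROM (VALUES (100), (80), (50)) AS v(fill_factor);
-- Returns 1.00, 1.25, and 2.00: at a fill factor of 50, a scan reads twice as many pages.
```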
Adding Data to the End of the Table
A fill factor other than 0 or 100 can be good for performance if the new data is evenly distributed
throughout the table. However, if all the data is added to the end of the table, the empty space in the index pages
will not be filled. For example, if the index key column is an IDENTITY column, the key for new rows is always
increasing and the index rows are logically added to the end of the index. If existing rows will be updated with data
that lengthens the size of the rows, use a fill factor of less than 100. The extra bytes on each page will help to
minimize page splits caused by extra length in the rows.
Security
Permissions
Requires ALTER permission on the table or view. User must be a member of the sysadmin fixed server role or the
db_ddladmin and db_owner fixed database roles.

Using SQL Server Management Studio


To specify a fill factor by using Table Designer
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
specify an index’s fill factor.
2. Click the plus sign to expand the Tables folder.
3. Right-click the table on which you want to specify an index’s fill factor and select Design.
4. On the Table Designer menu, click Indexes/Keys.
5. Select the index with the fill factor that you want to specify.
6. Expand Fill Specification, select the Fill Factor row and enter the fill factor you want in the row.
7. Click Close.
8. On the File menu, select Save table_name.
To specify a fill factor in an index by using Object Explorer
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
specify an index’s fill factor.
2. Click the plus sign to expand the Tables folder.
3. Click the plus sign to expand the table on which you want to specify an index’s fill factor.
4. Click the plus sign to expand the Indexes folder.
5. Right-click the index with the fill factor that you want to specify and select Properties.
6. Under Select a page, select Options.
7. In the Fill factor row, enter the fill factor that you want.
8. Click OK.

Using Transact-SQL
To specify a fill factor in an existing index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute. The example rebuilds an
existing index and applies the specified fill factor during the rebuild operation.

USE AdventureWorks2012;
GO
-- Rebuilds the IX_Employee_OrganizationLevel_OrganizationNode index
-- with a fill factor of 80 on the HumanResources.Employee table.
ALTER INDEX IX_Employee_OrganizationLevel_OrganizationNode ON HumanResources.Employee
REBUILD WITH (FILLFACTOR = 80);
GO

Another way to specify a fill factor in an index


1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
/* Drops and re-creates the IX_Employee_OrganizationLevel_OrganizationNode index on the
HumanResources.Employee table with a fill factor of 80.
*/
CREATE INDEX IX_Employee_OrganizationLevel_OrganizationNode ON HumanResources.Employee
(OrganizationLevel, OrganizationNode)
WITH (DROP_EXISTING = ON, FILLFACTOR = 80);
GO

For more information, see ALTER INDEX (Transact-SQL).


Perform Index Operations Online
3/24/2017 • 3 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic describes how to create, rebuild, or drop indexes online in SQL Server 2016 by using SQL Server
Management Studio or Transact-SQL. The ONLINE option allows concurrent user access to the underlying table
or clustered index data and any associated nonclustered indexes during these index operations. For example,
while a clustered index is being rebuilt by one user, that user and others can continue to update and query the
underlying data. When you perform data definition language (DDL) operations offline, such as building or
rebuilding a clustered index, these operations hold exclusive locks on the underlying data and associated indexes.
This prevents modifications and queries to the underlying data until the index operation is complete.

NOTE
Online index operations are not available in every SQL Server edition. For more information, see Features Supported by the
Editions of SQL Server 2016.
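
Before scripting an online operation, it can help to check which edition the instance is running. The query below is a minimal sketch that uses the built-in SERVERPROPERTY function; whether a given edition supports ONLINE = ON still has to be checked against the feature list referenced in the note. An EngineEdition value of 3 corresponds to the Enterprise engine, which also covers the Evaluation and Developer editions.

```sql
-- Report the edition and engine edition of the connected instance.
SELECT SERVERPROPERTY('Edition') AS [Edition],
       SERVERPROPERTY('EngineEdition') AS [Engine Edition];
GO
```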

In This Topic
Before you begin:
Limitations and Restrictions
Security
To rebuild an index online, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
We recommend performing online index operations for business environments that operate 24 hours a
day, seven days a week, in which the need for concurrent user activity during index operations is vital.
The ONLINE option is available in the following Transact-SQL statements.
CREATE INDEX
ALTER INDEX
DROP INDEX
ALTER TABLE (To add or drop UNIQUE or PRIMARY KEY constraints with CLUSTERED index option)
For more limitations and restrictions concerning creating, rebuilding, or dropping indexes online, see
Guidelines for Online Index Operations.
Security
Permissions
Requires ALTER permission on the table or view.
Using SQL Server Management Studio
To rebuild an index online
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
rebuild an index online.
2. Expand the Tables folder.
3. Click the plus sign to expand the table on which you want to rebuild an index online.
4. Expand the Indexes folder.
5. Right-click the index that you want to rebuild online and select Properties.
6. Under Select a page, select Options.
7. Select Allow online DML processing, and then select True from the list.
8. Click OK.
9. Right-click the index that you want to rebuild online and select Rebuild.
10. In the Rebuild Indexes dialog box, verify that the correct index is in the Indexes to rebuild grid and click
OK.

Using Transact-SQL
To create, rebuild, or drop an index online
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute. The example rebuilds an
existing index online.

USE AdventureWorks2012;
GO
ALTER INDEX AK_Employee_NationalIDNumber ON HumanResources.Employee
REBUILD WITH (ONLINE = ON);
GO

The following example deletes a clustered index online and moves the resulting table (heap) to the
filegroup NewGroup by using the MOVE TO clause. The sys.indexes , sys.tables , and sys.filegroups
catalog views are queried to verify the index and table placement in the filegroups before and after the
move.
USE AdventureWorks2012;
GO
--Create a clustered index on the PRIMARY filegroup if the index does not exist.
IF NOT EXISTS (SELECT name FROM sys.indexes WHERE name =
N'AK_BillOfMaterials_ProductAssemblyID_ComponentID_StartDate')
CREATE UNIQUE CLUSTERED INDEX
AK_BillOfMaterials_ProductAssemblyID_ComponentID_StartDate
ON Production.BillOfMaterials (ProductAssemblyID, ComponentID,
StartDate)
ON 'PRIMARY';
GO
-- Verify filegroup location of the clustered index.
SELECT t.name AS [Table Name], i.name AS [Index Name], i.type_desc,
i.data_space_id, f.name AS [Filegroup Name]
FROM sys.indexes AS i
JOIN sys.filegroups AS f ON i.data_space_id = f.data_space_id
JOIN sys.tables as t ON i.object_id = t.object_id
AND i.object_id = OBJECT_ID(N'Production.BillOfMaterials','U');
GO
--Create filegroup NewGroup if it does not exist.
IF NOT EXISTS (SELECT name FROM sys.filegroups
WHERE name = N'NewGroup')
BEGIN
ALTER DATABASE AdventureWorks2012
ADD FILEGROUP NewGroup;
ALTER DATABASE AdventureWorks2012
ADD FILE (NAME = File1,
FILENAME = 'C:\Program Files\Microsoft SQL Server\MSSQL10.MSSQLSERVER\MSSQL\DATA\File1.ndf')
TO FILEGROUP NewGroup;
END
GO
--Verify new filegroup
SELECT * from sys.filegroups;
GO
-- Drop the clustered index and move the BillOfMaterials table to
-- the Newgroup filegroup.
-- Set ONLINE = OFF to execute this example on editions other than Enterprise Edition.
DROP INDEX AK_BillOfMaterials_ProductAssemblyID_ComponentID_StartDate
ON Production.BillOfMaterials
WITH (ONLINE = ON, MOVE TO NewGroup);
GO
-- Verify filegroup location of the moved table.
SELECT t.name AS [Table Name], i.name AS [Index Name], i.type_desc,
i.data_space_id, f.name AS [Filegroup Name]
FROM sys.indexes AS i
JOIN sys.filegroups AS f ON i.data_space_id = f.data_space_id
JOIN sys.tables as t ON i.object_id = t.object_id
AND i.object_id = OBJECT_ID(N'Production.BillOfMaterials','U');
GO

For more information, see ALTER INDEX (Transact-SQL).
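
The examples above cover rebuilding and dropping an index online. For completeness, creating an index online can be sketched as follows. The index name IX_Employee_JobTitle_Example is hypothetical and exists only for illustration; the sketch assumes the AdventureWorks2012 sample database, in which HumanResources.Employee has a JobTitle column, and an edition that supports online index operations.

```sql
USE AdventureWorks2012;
GO
-- Hypothetical index name, used only for this illustration.
-- Set ONLINE = OFF to run this on editions that do not support online operations.
CREATE NONCLUSTERED INDEX IX_Employee_JobTitle_Example
ON HumanResources.Employee (JobTitle)
WITH (ONLINE = ON);
GO
-- Remove the illustrative index again.
DROP INDEX IX_Employee_JobTitle_Example ON HumanResources.Employee;
GO
```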


How Online Index Operations Work
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic defines the structures that exist during an online index operation and shows the activities associated with
these structures.

Online Index Structures


To allow for concurrent user activity during an index data definition language (DDL) operation, the following
structures are used during the online index operation: source and preexisting indexes, target, and for rebuilding a
heap or dropping a clustered index online, a temporary mapping index.
Source and preexisting indexes
The source is the original table or clustered index data. Preexisting indexes are any nonclustered indexes that
are associated with the source structure. For example, if the online index operation is rebuilding a clustered
index that has four associated nonclustered indexes, the source is the existing clustered index and the
preexisting indexes are the nonclustered indexes.
The preexisting indexes are available to concurrent users for select, insert, update, and delete operations.
This includes bulk inserts (supported but not recommended) and implicit updates by triggers and referential
integrity constraints. All preexisting indexes are available for queries and searches. This means they may be
selected by the query optimizer and, if necessary, specified in index hints.
Target
The target or targets are the new index (or heap), or the set of new indexes, being created or rebuilt. User
insert, update, and delete operations to the source are applied by the SQL Server Database Engine to the
target during the index operation. For example, if the online index operation is rebuilding a clustered index,
the target is the rebuilt clustered index; the Database Engine does not rebuild nonclustered indexes when a
clustered index is rebuilt.
The target index is not searched while processing SELECT statements until the index operation is committed.
Internally, the index is marked as write-only.
Temporary mapping index
Online index operations that create, drop, or rebuild a clustered index also require a temporary mapping
index. This temporary index is used by concurrent transactions to determine which records to delete in the
new indexes that are being built when rows in the underlying table are updated or deleted. This
nonclustered index is created in the same step as the new clustered index (or heap) and does not require a
separate sort operation. Concurrent transactions also maintain the temporary mapping index in all their
insert, update, and delete operations.

Online Index Activities


During a simple online index operation, such as creating a clustered index on a nonindexed table (heap), the source
and target go through three phases: preparation, build, and final.
The following illustration shows the process for creating an initial clustered index online. The source object (the
heap) has no other indexes. The source and target structure activities are shown for each phase; concurrent user
select, insert, update, and delete operations are also shown. The preparation, build, and final phases are indicated
together with the lock modes used in each phase.

Source Structure Activities


The following table lists the activities involving the source structures during each phase of the index operation and
the corresponding locking strategy.

PHASE | SOURCE ACTIVITY | SOURCE LOCKS

Preparation (Very short phase) | System metadata preparation to create the new empty index structure. A snapshot of the table is defined. That is, row versioning is used to provide transaction-level read consistency. Concurrent user write operations on the source are blocked for a very short period. No concurrent DDL operations are allowed except creating multiple nonclustered indexes. | S (Shared) on the table; IS (Intent Shared); INDEX_BUILD_INTERNAL_RESOURCE**

Build (Main phase) | The data is scanned, sorted, merged, and inserted into the target in bulk load operations. Concurrent user select, insert, update, and delete operations are applied to both the preexisting indexes and any new indexes being built. | IS; INDEX_BUILD_INTERNAL_RESOURCE**

Final (Very short phase) | All uncommitted update transactions must complete before this phase starts. Depending on the acquired lock, all new user read or write transactions are blocked for a very short period until this phase is completed. System metadata is updated to replace the source with the target. The source is dropped if it is required. For example, after rebuilding or dropping a clustered index. | INDEX_BUILD_INTERNAL_RESOURCE**; S on the table if creating a nonclustered index.*; SCH-M (Schema Modification) if any source structure (index or table) is dropped.*

* The index operation will wait for any uncommitted update transactions to complete before acquiring the S lock or
SCH-M lock on the table.
** The resource lock INDEX_BUILD_INTERNAL_RESOURCE prevents the execution of concurrent data definition
language (DDL) operations on the source and preexisting structures while the index operation is in progress. For
example, this lock prevents concurrent rebuild of two indexes on the same table. Although this resource lock is
associated with the Sch-M lock, it does not prevent data manipulation statements.
The previous table shows a single Shared (S) lock acquired during the build phase of an online index operation that
involves a single index. When clustered and nonclustered indexes are built, or rebuilt, in a single online index
operation (for example, during the initial clustered index creation on a table that contains one or more nonclustered
indexes) two short-term S locks are acquired during the build phase followed by long-term Intent Shared (IS) locks.
One S lock is acquired first for the clustered index creation and when creating the clustered index is completed, a
second short-term S lock is acquired for creating the nonclustered indexes. After the nonclustered indexes are
created, the S lock is downgraded to an IS lock until the final phase of the online index operation.
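
The locks described in this section can be observed from a second session while an online index operation is running. The query below is a minimal sketch that reads the sys.dm_tran_locks dynamic management view; it assumes the AdventureWorks2012 sample database and VIEW SERVER STATE permission. Which rows appear, and in which phase, depends on timing, so the preparation and final phases may be too short to catch.

```sql
-- Run from a separate session while the online index operation is in progress.
SELECT l.request_session_id,
       l.resource_type,
       l.resource_subtype,
       l.request_mode,      -- for example S, IS, or Sch-M
       l.request_status     -- GRANT, WAIT, or CONVERT
FROM sys.dm_tran_locks AS l
WHERE l.resource_database_id = DB_ID(N'AdventureWorks2012')
ORDER BY l.request_session_id, l.resource_type;
GO
```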
Target Structure Activities
The following table lists the activities that involve the target structure during each phase of the index operation and
the corresponding locking strategy.

PHASE | TARGET ACTIVITY | TARGET LOCKS

Preparation | New index is created and set to write-only. | IS

Build | Data is inserted from source. User modifications (inserts, updates, deletes) applied to the source are applied. This activity is transparent to the user. | IS

Final | Index metadata is updated. Index is set to read/write status. | S or SCH-M

The target is not accessed by SELECT statements issued by the user until the index operation is completed.
After the preparation and final phases are completed, the query and update plans that are stored in the procedure
cache are invalidated. Subsequent queries will use the new index.
The lifetime of a cursor declared on a table that is involved in an online index operation is limited by the online
index phases. Update cursors are invalidated at each phase. Read-only cursors are invalidated only after the final
phase.

Related Content
Perform Index Operations Online
Guidelines for Online Index Operations
Guidelines for Online Index Operations
4/29/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2008) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
When you perform online index operations, the following guidelines apply:
Clustered indexes must be created, rebuilt, or dropped offline when the underlying table contains the
following large object (LOB) data types: image, ntext, and text.
Nonunique nonclustered indexes can be created online when the table contains LOB data types but none of
these columns are used in the index definition as either key or nonkey (included) columns.
Indexes on local temp tables cannot be created, rebuilt, or dropped online. This restriction does not apply to
indexes on global temp tables.
An online index rebuild can be resumed from where it stopped after an unexpected failure, database failover, or a PAUSE
command. See ALTER INDEX (Transact-SQL). This feature is in public preview for SQL Server 2017.
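
To find the tables affected by the LOB restriction above, the catalog views can be searched for columns of the image, ntext, and text data types. The query below is a minimal sketch that runs in the current database; it resolves each column to its base system type with TYPE_NAME so that alias types built on these data types are also reported.

```sql
-- List user-table columns whose base data type is image, ntext, or text.
SELECT OBJECT_SCHEMA_NAME(c.object_id) AS [Schema Name],
       OBJECT_NAME(c.object_id) AS [Table Name],
       c.name AS [Column Name],
       TYPE_NAME(c.system_type_id) AS [Base Type]
FROM sys.columns AS c
JOIN sys.tables AS t ON t.object_id = c.object_id
WHERE TYPE_NAME(c.system_type_id) IN (N'image', N'ntext', N'text')
ORDER BY [Schema Name], [Table Name], c.column_id;
GO
```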

NOTE
Online index operations are not available in every edition of Microsoft SQL Server. For a list of features that are supported by
the editions of SQL Server, see Features supported by editions.

The following table shows the index operations that can be performed online, the indexes that are excluded from
these online operations, and resumable index restrictions. Additional restrictions are also included.

ONLINE INDEX OPERATION | EXCLUDED INDEXES | OTHER RESTRICTIONS

ALTER INDEX REBUILD | Disabled clustered index or disabled indexed view; XML index; Columnstore index; Index on a local temp table | Specifying the keyword ALL may cause the operation to fail when the table contains an excluded index. Additional restrictions on rebuilding disabled indexes apply. For more information, see Disable Indexes and Constraints.

CREATE INDEX | XML index; Initial unique clustered index on a view; Index on a local temp table |

CREATE INDEX WITH DROP_EXISTING | Disabled clustered index or disabled indexed view; Index on a local temp table; XML index |

DROP INDEX | Disabled index; XML index; Nonclustered index; Index on a local temp table | Multiple indexes cannot be specified within a single statement.

ALTER TABLE ADD CONSTRAINT (PRIMARY KEY or UNIQUE) | Index on a local temp table; Clustered index | Only one subclause is allowed at a time. For example, you cannot add and drop PRIMARY KEY or UNIQUE constraints in the same ALTER TABLE statement.

ALTER TABLE DROP CONSTRAINT (PRIMARY KEY or UNIQUE) | Clustered index |

The underlying table cannot be modified, truncated, or dropped while an online index operation is in process.
The online option setting (ON or OFF) specified when you create or drop a clustered index is applied to any
nonclustered indexes that must be rebuilt. For example, if the clustered index is built online by using CREATE
INDEX WITH DROP_EXISTING, ONLINE=ON, all associated nonclustered indexes are re-created online also.
When you create or rebuild a UNIQUE index online, the index builder and a concurrent user transaction may try to
insert the same key, therefore violating uniqueness. If a row entered by a user is inserted into the new index
(target) before the original row from the source table is moved to the new index, the online index operation will
fail.
Although not common, the online index operation can cause a deadlock when it interacts with database updates
because of user or application activities. In these rare cases, the SQL Server Database Engine will select the user or
application activity as a deadlock victim.
You can perform concurrent online index DDL operations on the same table or view only when you are creating
multiple new nonclustered indexes, or reorganizing nonclustered indexes. All other online index operations
performed at the same time fail. For example, you cannot create a new index online while rebuilding an existing
index online on the same table.
An online operation cannot be performed when an index contains a column of the large object type, and in the
same transaction there are update operations before this online operation. To work around this issue, place the
online operation outside the transaction or place it before any updates in the transaction.

Disk Space Considerations


Online index operations require more disk space than offline index operations.
During index creation and index rebuild operations, additional space is required for the index being built (or
rebuilt).
In addition, disk space is required for the temporary mapping index. This temporary index is used in online
index operations that create, rebuild, or drop a clustered index.
Dropping a clustered index online requires as much space as creating (or rebuilding) a clustered index
online.
For more information, see Disk Space Requirements for Index DDL Operations.
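
A rough starting point for these estimates is the current size of each index on the table. The query below is a minimal sketch, assuming the AdventureWorks2012 sample database and the Production.BillOfMaterials table used elsewhere in this documentation; it sums used pages (8 KB each) from sys.dm_db_partition_stats. It reports current usage only and does not account for the temporary mapping index or sort space.

```sql
USE AdventureWorks2012;
GO
-- Current space used by each index (or heap, where the name is NULL) on the table.
SELECT i.name AS [Index Name],
       i.type_desc,
       SUM(ps.used_page_count) * 8 / 1024.0 AS [Used MB]
FROM sys.dm_db_partition_stats AS ps
JOIN sys.indexes AS i
    ON i.object_id = ps.object_id AND i.index_id = ps.index_id
WHERE ps.object_id = OBJECT_ID(N'Production.BillOfMaterials', N'U')
GROUP BY i.name, i.type_desc
ORDER BY [Used MB] DESC;
GO
```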

Performance Considerations
Although online index operations permit concurrent user update activity, the index operations will take longer if
the update activity is very heavy. Typically, online index operations will be slower than equivalent offline index
operations regardless of the concurrent update activity level.
Because both the source and target structures are maintained during the online index operation, the resource
usage for insert, update, and delete transactions is increased, potentially up to double. This could cause a decrease
in performance and greater resource usage, especially CPU time, during the index operation. Online index
operations are fully logged.
Although we recommend online operations, you should evaluate your environment and specific requirements. It
may be optimal to run index operations offline. In doing this, users have restricted access to the data during the
operation, but the operation finishes faster and uses fewer resources.
On multiprocessor computers that are running SQL Server 2016, index statements may use more processors to
perform the scan and sort operations associated with the index statement just like other queries do. You can use
the MAXDOP index option to control the number of processors dedicated to the online index operation. In this way,
you can balance the resources that are used by index operation with those of the concurrent users. For more
information, see Configure Parallel Index Operations. For more information about the editions of SQL Server that
support parallel index operations, see Features Supported by editions.
Because an S lock or Sch-M lock is held in the final phase of the index operation, be careful when you run an
online index operation inside an explicit user transaction, such as a BEGIN TRANSACTION...COMMIT block. Doing this
causes the lock to be held until the end of the transaction, therefore impeding user concurrency.
Online index rebuilding may increase fragmentation when it is allowed to run with the MAXDOP > 1 and
ALLOW_PAGE_LOCKS = OFF options. For more information, see How It Works: Online Index Rebuild - Can Cause
Increased Fragmentation.
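
The points above about processor usage and explicit transactions can be combined in one statement. The following is a minimal sketch, assuming the AdventureWorks2012 sample database; the MAXDOP value of 2 is illustrative only and should be chosen for your own workload.

```sql
USE AdventureWorks2012;
GO
-- Run the rebuild as its own autocommit statement rather than inside
-- BEGIN TRANSACTION...COMMIT, so that the final-phase lock is released
-- as soon as the operation completes.
-- MAXDOP = 2 is an illustrative value that limits the processors used.
ALTER INDEX AK_Employee_NationalIDNumber ON HumanResources.Employee
REBUILD WITH (ONLINE = ON, MAXDOP = 2);
GO
```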

Transaction Log Considerations


Large-scale index operations, performed offline or online, can generate large data loads that can cause the
transaction log to quickly fill. To make sure that the index operation can be rolled back, the transaction log cannot
be truncated until the index operation has been completed; however, the log can be backed up during the index
operation. Therefore, the transaction log must have sufficient space to store both the index operation transactions
and any concurrent user transactions for the duration of the index operation. For more information, see
Transaction Log Disk Space for Index Operations.
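
Log usage can be watched from another session while a large index operation runs. The statements below are a minimal sketch, assuming the AdventureWorks2012 sample database: DBCC SQLPERF(LOGSPACE) reports the log size and percentage used for every database, and the log_reuse_wait_desc column of sys.databases shows what is currently preventing log truncation.

```sql
-- Log size and percentage of log space used, for every database on the instance.
DBCC SQLPERF(LOGSPACE);
GO
-- Why the log of the sample database cannot currently be truncated, if anything.
SELECT name, recovery_model_desc, log_reuse_wait_desc
FROM sys.databases
WHERE name = N'AdventureWorks2012';
GO
```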

Resumable Index Rebuild Considerations


NOTE
See Alter Index. This feature is in public preview for SQL Server 2017.

When you perform a resumable online index rebuild, the following guidelines apply:
You can manage, plan, and extend index maintenance windows. An index rebuild operation can be paused and
restarted multiple times to fit your maintenance windows.
You can recover from index rebuild failures (such as database failovers or running out of disk space).
When an index operation is paused, both the original index and the newly created one require disk
space and need to be updated during DML operations.
Transaction log truncation is enabled during an index rebuild operation (this cannot be done
for a regular online index operation).
The SORT_IN_TEMPDB=ON option is not supported.
IMPORTANT
Resumable rebuild does not require you to keep open a long-running transaction, which allows log truncation during this
operation and better log space management. With the new design, we managed to keep the necessary data in the database
together with all references required to restart the resumable operation.

Generally, there is no performance difference between resumable and non-resumable online index rebuild. When
you update a resumable index while an index rebuild operation is paused:
For read-mostly workloads, the performance impact is insignificant.
For update-heavy workloads, you may experience some throughput degradation (our testing shows less than
10% degradation).
Generally, there is no difference in defragmentation quality between resumable and non-resumable online index
rebuild.
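
The pause and resume cycle described above can be sketched as follows. This is a minimal illustration rather than a complete maintenance script. It assumes the AdventureWorks2012 sample database and a version that supports the RESUMABLE option (SQL Server 2017 or Azure SQL Database), and the MAX_DURATION value is illustrative. The batches are meant to be run separately, because PAUSE must be issued from a second session while the rebuild is still running.

```sql
USE AdventureWorks2012;
GO
-- Session 1: start a resumable online rebuild.
-- MAX_DURATION = 60 MINUTES is an illustrative value; the operation pauses when it elapses.
ALTER INDEX AK_Employee_NationalIDNumber ON HumanResources.Employee
REBUILD WITH (ONLINE = ON, RESUMABLE = ON, MAX_DURATION = 60 MINUTES);
GO
-- Session 2, while the rebuild is running: pause the operation.
ALTER INDEX AK_Employee_NationalIDNumber ON HumanResources.Employee PAUSE;
GO
-- Any session: check the state and progress of resumable operations.
SELECT name, state_desc, percent_complete, last_pause_time
FROM sys.index_resumable_operations;
GO
-- Any session: resume the paused rebuild from where it stopped.
ALTER INDEX AK_Employee_NationalIDNumber ON HumanResources.Employee RESUME;
GO
```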

Related Content
How Online Index Operations Work
Perform Index Operations Online
ALTER INDEX (Transact-SQL)
CREATE INDEX (Transact-SQL)
Configure Parallel Index Operations
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic defines max degree of parallelism and explains how to modify this setting in SQL Server 2016 by using
SQL Server Management Studio or Transact-SQL. On multiprocessor computers that are running SQL Server
Enterprise or higher, index statements may use multiple processors to perform the scan, sort, and index operations
associated with the index statement just like other queries do. The number of processors used to run a single index
statement is determined by the max degree of parallelism configuration option, the current workload, and the
index statistics. The max degree of parallelism option determines the maximum number of processors to use in
parallel plan execution. If the SQL Server Database Engine detects that the system is busy, the degree of parallelism
of the index operation is automatically reduced before statement execution starts. The Database Engine can also
reduce the degree of parallelism if the leading key column of a non-partitioned index has a limited number of
distinct values or the frequency of each distinct value varies significantly.
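
The server-wide setting and the number of processors visible to the instance can be checked before choosing a MAXDOP value for an index operation. The queries below are a minimal, read-only sketch; they assume permission to view server state and do not change any configuration.

```sql
-- Current server-wide max degree of parallelism (0 lets the server decide).
SELECT name, value, value_in_use
FROM sys.configurations
WHERE name = N'max degree of parallelism';
GO
-- Number of logical processors available to the instance.
SELECT cpu_count
FROM sys.dm_os_sys_info;
GO
```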

NOTE
Parallel index operations are not available in every SQL Server edition. For more information, see Features Supported by the
Editions of SQL Server 2016.

In This Topic
Before you begin:
Limitations and Restrictions
Security
To set the max degree of parallelism, using:
SQL Server Management Studio
Transact-SQL

Before You Begin


Limitations and Restrictions
The number of processors that are used by the query optimizer typically provides optimal performance.
However, operations such as creating, rebuilding, or dropping very large indexes are resource intensive and
can cause insufficient resources for other applications and database operations for the duration of the index
operation. When this problem occurs, you can manually configure the maximum number of processors that
are used to run the index statement by limiting the number of processors to use for the index operation.
The MAXDOP index option overrides the max degree of parallelism configuration option only for the query
specifying this option. The following table lists the valid integer values that can be specified with the max
degree of parallelism configuration option and the MAXDOP index option.

VALUE | DESCRIPTION

0 | Specifies that the server determines the number of CPUs that are used, depending on the current system workload. This is the default value and recommended setting.

1 | Suppresses parallel plan generation. The operation will be executed serially.

2-64 | Limits the number of processors to the specified value. Fewer processors may be used depending on the current workload. If a value larger than the number of available CPUs is specified, the actual number of available CPUs is used.

Parallel index execution and the MAXDOP index option apply to the following Transact-SQL statements:
CREATE INDEX
ALTER INDEX REBUILD
DROP INDEX (This applies to clustered indexes only.)
ALTER TABLE ADD (index) CONSTRAINT
ALTER TABLE DROP (clustered index) CONSTRAINT
The MAXDOP index option cannot be specified in the ALTER INDEX REORGANIZE statement.
Memory requirements for partitioned index operations that require sorting can be greater if the query
optimizer applies degrees of parallelism to the build operation. The higher the degrees of parallelism, the
greater the memory requirement is. For more information, see Partitioned Tables and Indexes.
Security
Permissions
Requires ALTER permission on the table or view.

Using SQL Server Management Studio


To set max degree of parallelism on an index
1. In Object Explorer, click the plus sign to expand the database that contains the table on which you want to
set max degree of parallelism for an index.
2. Expand the Tables folder.
3. Click the plus sign to expand the table on which you want to set max degree of parallelism for an index.
4. Expand the Indexes folder.
5. Right-click the index for which you want to set the max degree of parallelism and select Properties.
6. Under Select a page, select Options.
7. Select Maximum degree of parallelism, and then enter a value between 1 and 64.
8. Click OK.

Using Transact-SQL
To set max degree of parallelism on an existing index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
/*Alters the IX_ProductVendor_VendorID index on the Purchasing.ProductVendor table so that, if the
server has eight or more processors, the Database Engine will limit the execution of the index
operation to eight or fewer processors.
*/
ALTER INDEX IX_ProductVendor_VendorID ON Purchasing.ProductVendor
REBUILD WITH (MAXDOP=8);
GO

For more information, see ALTER INDEX (Transact-SQL).


To set max degree of parallelism on a new index
1. In Object Explorer, connect to an instance of Database Engine.
2. On the Standard bar, click New Query.
3. Copy and paste the following example into the query window and click Execute.

USE AdventureWorks2012;
GO
CREATE INDEX IX_ProductVendor_NewVendorID
ON Purchasing.ProductVendor (BusinessEntityID)
WITH (MAXDOP=8);
GO

For more information, see CREATE INDEX (Transact-SQL).


Index Properties F1 Help
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
The sections in this topic refer to various index properties that are available by using SQL Server Management
Studio dialogs.
In This Topic:
Index Properties General Page
Select (Index) Columns Dialog Box
Index Properties Storage Page
Index Properties Spatial Page
Index Properties Filter Page

Index Properties General Page


Use the General page to view or modify index properties for the selected table or view. The options for each page
may change based on the type of index selected.
Table name
Displays the name of the table or view that the index was created on. This field is read-only. To select a different
table, close the Index Properties page, select the correct table, and then open the Index Properties page again.
Spatial indexes cannot be specified on indexed views. Spatial indexes can be defined only for a table that has a
primary key. The maximum number of primary key columns on the table is 15. The combined per-row size of the
primary-key columns is limited to a maximum of 895 bytes.
Index name
Displays the name of the index. This field is read-only for an existing index. When creating a new index, type the
name of the index.
Index type
Indicates the type of index. For new indexes, indicates the type of index selected when opening the dialog box.
Indexes can be: Clustered, Nonclustered, Primary XML, Secondary XML, Spatial, Clustered columnstore, or
Nonclustered Columnstore.
Note Only one clustered index is allowed for each table. Only one xVelocity memory optimized columnstore index
is allowed for each table.
Unique
Selecting this check box makes the index unique. No two rows are permitted to have the same index value. By
default, this check box is cleared. When modifying an existing index, index creation will fail if two rows have the
same value. For columns where NULL is permitted, a unique index permits one NULL value.
If you select Spatial in the Index type field, the Unique check box is dimmed.
Index key columns
Add the desired columns to the Index key columns grid. When more than one column is added, the columns
must be listed in the order desired. The column order in an index can have a great impact on the index
performance.
No more than 16 columns can participate in a single composite index. To index more than 16 columns, see Included
columns later in this topic.
A spatial index can be defined only on a single column that contains a spatial data type (a spatial column).
Name
Displays the name of the column that participates in the index key.
Sort Order
Specifies the sort direction of the selected index column, either Ascending or Descending.

NOTE
If the index type is Primary XML or Spatial, this column does not appear in the table.

Data Type
Displays the data type information.

NOTE
If the table column is a computed column, Data type displays "computed column."

Size
Displays the maximum number of bytes required to store the column data type. Displays zero (0) for a spatial or
XML column.
Identity
Displays whether the column participating in the index key is an identity column.
Allow NULLs
Displays whether the column participating in the index key allows NULL values to be stored in the table or view
column.
Add
Adds a column to the index key. Select table columns from the Select Columns from <table name> dialog box
that appears when you click Add. For a spatial index, after you select one column, this button is dimmed.
Remove
Removes the selected column from participation in the index key.
Move Up
Moves the selected column up in the index key grid.
Move Down
Moves the selected column down in the index key grid.
Columnstore columns
Click Add to select columns for the columnstore index. For limitations on a columnstore index, see CREATE
COLUMNSTORE INDEX (Transact-SQL).
Included columns
Include nonkey columns in the nonclustered index. This option allows you to bypass the current index limits on the
total size of an index key and the maximum number of columns participating in an index key by adding columns as
nonkey columns in the leaf level of the nonclustered index. For more information, see Create Indexes with Included
Columns.
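
The information shown in the Index key columns grid and the Included columns list can also be read with Transact-SQL. The query below is a minimal sketch, assuming the AdventureWorks2012 sample database and the Purchasing.ProductVendor table; it lists each index column with its position in the key, its sort direction, and whether it is a nonkey (included) column.

```sql
USE AdventureWorks2012;
GO
-- Key and included columns for every index on Purchasing.ProductVendor.
SELECT i.name AS [Index Name],
       c.name AS [Column Name],
       ic.key_ordinal,          -- position in the index key; 0 for included columns
       ic.is_descending_key,    -- 1 = Descending, 0 = Ascending
       ic.is_included_column    -- 1 = nonkey (included) column
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.object_id = OBJECT_ID(N'Purchasing.ProductVendor', N'U')
ORDER BY i.name, ic.is_included_column, ic.key_ordinal;
GO
```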
Select (Index) Columns Dialog Box
Use this page to add columns to the Index Properties General page when creating or modifying an index.
Check box
Select to add columns.
Name
Name of the column.
Data Type
The data type of the column.
Bytes
The size of the column in bytes.
Identity
Displays Yes for identity columns, and No when the column is not an identity column.
Allow Nulls
Displays Yes when the table definition allows null values for the column. Displays No when the table definition
does not allow nulls for the column.

Storage Page Options


Use this page to view or modify filegroup or partition scheme properties for the selected index. The page shows only
the options that apply to the type of index.
Filegroup
Stores the index in the specified filegroup. The list only displays standard (row) filegroups. The default list selection
is the PRIMARY filegroup of the database. For more information, see Database Files and Filegroups.
Filestream filegroup
Specifies the filegroup for FILESTREAM data. This list displays only FILESTREAM filegroups. The default list selection
is the PRIMARY FILESTREAM filegroup. For more information, see FILESTREAM (SQL Server).
Partition scheme
Stores the index in a partition scheme. Clicking Partition Scheme enables the grid below. The default list selection
is the partition scheme that is used for storing the table data. When you select a different partition scheme in the
list, the information in the grid is updated. For more information, see Partitioned Tables and Indexes.
The partition scheme option is unavailable if there are no partition schemes in the database.
Filestream partition scheme
Specifies the partition scheme for FILESTREAM data. The partition scheme must be symmetric with the scheme that
is specified in the Partition scheme option.
If the table is not partitioned, the field is blank.
Partition Scheme Parameter
Displays the name of the column that participates in the partition scheme.
Table Column
Select the table or view to map to the partition scheme.
Column Data Type
Displays data type information about the column.
NOTE
If the table column is a computed column, Column Data Type displays "computed column."

Allow online processing of DML statements while moving the index


Allows users to access the underlying table or clustered index data and any associated nonclustered indexes during
the index operation. For more information, see Perform Index Operations Online.

NOTE
This option is not available for XML indexes, or if the index is a disabled clustered index.

Set maximum degree of parallelism


Limits the number of processors to use during parallel plan execution. The default value, 0, uses the actual number
of available CPUs. Setting the value to 1 suppresses parallel plan generation; setting the value to a number greater
than 1 restricts the maximum number of processors used by a single query execution. This option only becomes
available if the dialog box is in the Rebuild or Recreate state. For more information, see Set the Max Degree of
Parallelism Option for Optimal Performance.

NOTE
If a value greater than the number of available CPUs is specified, the actual number of available CPUs is used.
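As a rough Transact-SQL counterpart to the Storage page options, the following sketch moves an existing index to another filegroup while keeping the table available, and caps the degree of parallelism. It assumes a table dbo.SalesOrder with columns CustomerID and OrderDate, an existing index IX_SalesOrder_Customer, and a row filegroup named FG_Indexes; all of these names are hypothetical. Online index operations are not available in every edition of SQL Server.

```sql
-- Assumptions: dbo.SalesOrder, IX_SalesOrder_Customer, and FG_Indexes already exist
-- (hypothetical names).
CREATE NONCLUSTERED INDEX IX_SalesOrder_Customer
    ON dbo.SalesOrder (CustomerID, OrderDate)
    WITH (DROP_EXISTING = ON,   -- re-create the existing index in one step
          ONLINE = ON,          -- allow DML on the table while the index moves
          MAXDOP = 4)           -- use at most 4 processors for this operation
    ON FG_Indexes;              -- target filegroup
```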

Spatial Page Index Options


Use the Spatial page to view or specify the values of the spatial properties. For more information, see Spatial Data
(SQL Server).
Bounding Box
The bounding box is the perimeter of the top-level grid of a geometric plane. The bounding-box parameters exist
only in the geometry grid tessellation. These parameters are unavailable if the Tessellation Scheme is
Geography grid.
The panel displays the (X-min,Y-min) and (X-max,Y-max) coordinates of the bounding box. There are no default
coordinate values. Therefore, when you are creating a new spatial index on a geometry type column, you must
specify the coordinate values.
X-min
The X-coordinate of the lower-left corner of the bounding box.
Y-min
The Y-coordinate of the lower-left corner of the bounding box.
X-max
The X-coordinate of the upper-right corner of the bounding box.
Y-max
The Y-coordinate of the upper-right corner of the bounding box.
General
Tessellation Scheme
Indicates the tessellation scheme of the index. The supported tessellation schemes are as follows.
Geometry grid
Specifies the geometry grid tessellation scheme, which applies to a column of the geometry data type.
Geometry Auto grid
This option is enabled for SQL Server when database compatibility level is set to 110 or higher.
Geography grid
Specifies the geography grid tessellation scheme, which applies to a column of the geography data type.
Geography Auto grid
This option is enabled for SQL Server when database compatibility level is set to 110 or higher.
For information about how SQL Server implements tessellation, see Spatial Data (SQL Server).
Cells Per Object
Indicates the number of tessellation cells-per-object that can be used for a single spatial object in the index. This
number can be any integer between 1 and 8192, inclusive. The default is 16 for the Geometry grid and Geography grid
tessellation schemes. For the auto grid tessellation schemes, which require database compatibility level 110 or
higher, the default is 8 for Geometry Auto grid and 12 for Geography Auto grid.
At the top level, if an object covers more cells than the Cells Per Object value, the indexing uses as many cells as
necessary to provide a complete top-level tessellation. In such cases, an object might receive more than the specified
number of cells. In this case, the maximum number is the number of cells generated by the top-level grid, which
depends on the Level 1 density.
Grids
This panel shows the density of the grid at each level of the tessellation scheme. Density is specified as Low,
Medium, or High. The default is Medium. Low represents a 4x4 grid (16 cells), Medium represents an 8x8 grid
(64 cells), and High represents a 16x16 grid (256 cells). These options are not available when the Geometry Auto
grid or Geography Auto grid tessellation options are chosen.
Level 1
The density of the first-level (top) grid.
Level 2
The density of the second-level grid.
Level 3
The density of the third-level grid.
Level 4
The density of the fourth-level grid.
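The Spatial page settings correspond to options of CREATE SPATIAL INDEX. The following is a minimal sketch for a geometry column; the table, the index name, and the coordinate values are hypothetical.

```sql
-- Hypothetical table; a spatial index requires a clustered primary key on the table.
CREATE TABLE dbo.Parcel
(
    ParcelID int      NOT NULL PRIMARY KEY,
    Shape    geometry NOT NULL
);

CREATE SPATIAL INDEX SIX_Parcel_Shape
    ON dbo.Parcel (Shape)
    USING GEOMETRY_GRID                      -- Tessellation Scheme: Geometry grid
    WITH (
        -- Bounding Box: there are no defaults, so all four values are required.
        BOUNDING_BOX = (xmin = 0, ymin = 0, xmax = 500, ymax = 200),
        -- Grids: density at each of the four levels.
        GRIDS = (LEVEL_1 = MEDIUM, LEVEL_2 = MEDIUM, LEVEL_3 = HIGH, LEVEL_4 = LOW),
        -- Cells Per Object
        CELLS_PER_OBJECT = 16
    );
```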

Filter Page
Use this page to enter the filter predicate for a filtered index. For more information, see Create Filtered Indexes.
Filter Expression
Defines which data rows to include in the filtered index. For example,
StartDate > '20000101' AND EndDate IS NOT NULL.
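In Transact-SQL, the filter expression becomes the WHERE clause of CREATE INDEX. The sketch below uses a hypothetical table whose columns match the example expression.

```sql
-- Hypothetical table with the columns used by the filter expression.
CREATE TABLE dbo.BillOfMaterials
(
    BillOfMaterialsID int      NOT NULL PRIMARY KEY,
    ComponentID       int      NOT NULL,
    StartDate         datetime NOT NULL,
    EndDate           datetime NULL
);

-- Only rows that satisfy the predicate are stored in the index.
CREATE NONCLUSTERED INDEX FI_BillOfMaterials_WithEndDate
    ON dbo.BillOfMaterials (ComponentID, StartDate)
    WHERE StartDate > '20000101' AND EndDate IS NOT NULL;
```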

See Also
Set Index Options
INDEXPROPERTY (Transact-SQL)
sys.indexes (Transact-SQL)
Columnstore indexes - overview
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2012), Azure SQL Database, Azure SQL Data Warehouse,
Parallel Data Warehouse
The columnstore index is the standard for storing and querying large data warehousing fact tables. It uses column-
based data storage and query processing to achieve up to 10x query performance gains in your data warehouse
over traditional row-oriented storage, and up to 10x data compression over the uncompressed data size.
Beginning with SQL Server 2016, columnstore indexes enable operational analytics, the ability to run performant
real-time analytics on a transactional workload.
Jump to scenarios:
Columnstore Indexes for Data Warehousing
Get started with Columnstore for real time operational analytics

What is a columnstore index?


A columnstore index is a technology for storing, retrieving and managing data by using a columnar data format,
called a columnstore.
Key terms and concepts
These key terms and concepts are associated with columnstore indexes.
columnstore
A columnstore is data that is logically organized as a table with rows and columns, and physically stored in a
column-wise data format.
rowstore
A rowstore is data that is logically organized as a table with rows and columns, and then physically stored in a
row-wise data format. This has been the traditional way to store relational table data. In SQL Server, rowstore refers
to a table where the underlying data storage format is a heap, a clustered index, or a memory-optimized table.

NOTE
In discussions about columnstore indexes, we use the terms rowstore and columnstore to emphasize the format for the data
storage.

rowgroup
A rowgroup is a group of rows that are compressed into columnstore format at the same time. A rowgroup
usually contains the maximum number of rows per rowgroup, which is 1,048,576 rows.
For high performance and high compression rates, the columnstore index slices the table into groups of rows,
called rowgroups, and then compresses each rowgroup in a column-wise manner. The number of rows in the
rowgroup must be large enough to improve compression rates, and small enough to benefit from in-memory
operations.
column segment
A column segment is a column of data from within the rowgroup.
Each rowgroup contains one column segment for every column in the table.
Each column segment is compressed together and stored on physical media.

clustered columnstore index


A clustered columnstore index is the physical storage for the entire table.

To reduce fragmentation of the column segments and improve performance, the columnstore index might
store some data temporarily into a clustered index, which is called a deltastore, and a btree list of IDs for
deleted rows. The deltastore operations are handled behind the scenes. To return the correct query results,
the clustered columnstore index combines query results from both the columnstore and the deltastore.
deltastore
Used with clustered columnstore indexes only, a deltastore is a clustered index that improves columnstore
compression and performance by storing rows until the number of rows reaches a threshold, at which point
the rows are moved into the columnstore.
During a large bulk load, most of the rows go directly to the columnstore without passing through the
deltastore. Some rows at the end of the bulk load might be too few in number to meet the minimum size of
a rowgroup, which is 102,400 rows. When this happens, the final rows go to the deltastore instead of the
columnstore. For small bulk loads with fewer than 102,400 rows, all of the rows go directly to the deltastore.
When the deltastore reaches the maximum number of rows, it becomes closed. A tuple-mover process
checks for closed rowgroups. When it finds a closed rowgroup, it compresses it and stores it in the
columnstore.
nonclustered columnstore index
A nonclustered columnstore index and a clustered columnstore index function the same. The difference is that a
nonclustered columnstore index is a secondary index created on a rowstore table, whereas a clustered
columnstore index is the primary storage for the entire table.
The nonclustered index contains a copy of part or all of the rows and columns in the underlying table. The
index is defined as one or more columns of the table, and has an optional condition that filters the rows.
A nonclustered columnstore index enables real-time operational analytics in which the OLTP workload uses
the underlying clustered index, while analytics run concurrently on the columnstore index. For more
information, see Get started with Columnstore for real time operational analytics.
batch execution
Batch execution is a query processing method in which queries process multiple rows together. Queries on
columnstore indexes use batch mode execution, which typically improves query performance by 2-4x. Batch
execution is closely integrated with, and optimized around, the columnstore storage format. Batch-mode
execution is sometimes known as vector-based or vectorized execution.

Why should I use a columnstore index?


A columnstore index can provide a very high level of data compression, typically 10x, to reduce your data
warehouse storage cost significantly. For analytics, columnstore indexes also offer an order of magnitude better
performance than btree indexes. They are the preferred data storage format for data warehousing and analytics
workloads. Starting
with SQL Server 2016, you can use columnstore indexes for real-time analytics on your operational workload.
Reasons why columnstore indexes are so fast:
Columns store values from the same domain and commonly have similar values, which results in high
compression rates. This minimizes or eliminates I/O bottlenecks in your system while reducing the memory
footprint significantly.
High compression rates improve query performance by using a smaller in-memory footprint. In turn, query
performance can improve because SQL Server can perform more query and data operations in-memory.
Batch execution improves query performance, typically 2-4x, by processing multiple rows together.
Queries often select only a few columns from a table, which reduces total I/O from the physical media.

When should I use a columnstore index?


Recommended use cases:
Use a clustered columnstore index to store fact tables and large dimension tables for data warehousing
workloads. This improves query performance and data compression by up to 10x. See Columnstore Indexes
for Data Warehousing.
Use a nonclustered columnstore index to perform analysis in real-time on an OLTP workload. See Get
started with Columnstore for real time operational analytics.
How do I choose between a rowstore index and a columnstore index?
Rowstore indexes perform best on queries that seek into the data, searching for a particular value, or for queries
on a small range of values. Use rowstore indexes with transactional workloads since they tend to require mostly
table seeks instead of table scans.
Columnstore indexes give high performance gains for analytic queries that scan large amounts of data, especially
on large tables. Use columnstore indexes on data warehousing and analytics workloads, especially on fact tables,
since they tend to require full table scans rather than table seeks.
Can I combine rowstore and columnstore on the same table?
Yes. Beginning with SQL Server 2016, you can create an updatable nonclustered columnstore index on a rowstore
table. The columnstore index stores a copy of the chosen columns, so it needs extra space, but the copy is
compressed by 10x on average. By doing this, you can run analytics on the columnstore index and transactions on
the rowstore index at the same time. The columnstore is updated when data changes in the rowstore table, so
both indexes are working against the same data.
Beginning with SQL Server 2016, you can have one or more nonclustered rowstore indexes on a columnstore
index. By doing this, you can perform efficient table seeks on the underlying columnstore. Other options become
available too. For example, you can enforce a primary key constraint by using a UNIQUE constraint on the
rowstore index. Because a non-unique value fails to insert into the rowstore index, SQL Server cannot insert the
value into the columnstore.
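The first combination, a nonclustered columnstore index on a rowstore table, might look like the following sketch. The table, column, and index names are hypothetical.

```sql
-- Hypothetical OLTP table stored as a rowstore (clustered btree index).
CREATE TABLE dbo.OrderLine
(
    OrderLineID int   NOT NULL PRIMARY KEY CLUSTERED,
    ProductID   int   NOT NULL,
    Quantity    int   NOT NULL,
    UnitPrice   money NOT NULL,
    OrderDate   date  NOT NULL
);

-- Updatable nonclustered columnstore index over the columns that analytics queries read.
CREATE NONCLUSTERED COLUMNSTORE INDEX NCCI_OrderLine
    ON dbo.OrderLine (ProductID, Quantity, UnitPrice, OrderDate);

-- Transactions keep using the clustered index, while an aggregate such as this one
-- is a candidate for the columnstore index.
SELECT ProductID, SUM(Quantity * UnitPrice) AS Revenue
FROM dbo.OrderLine
GROUP BY ProductID;
```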

Metadata

All of the columns in a columnstore index are stored in the metadata as included columns. The columnstore index
does not have key columns.
sys.indexes (Transact-SQL)
sys.index_columns (Transact-SQL)
sys.partitions (Transact-SQL)
sys.internal_partitions (Transact-SQL)
sys.column_store_segments (Transact-SQL)
sys.column_store_dictionaries (Transact-SQL)
sys.column_store_row_groups (Transact-SQL)
sys.dm_db_column_store_row_group_operational_stats (Transact-SQL)
sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)
sys.dm_column_store_object_pool (Transact-SQL)
sys.dm_db_index_operational_stats (Transact-SQL)
sys.dm_db_index_physical_stats (Transact-SQL)
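As an illustration of how these catalog views fit together, the following query sketch lists every columnstore index in the current database together with its columns. It uses only sys.indexes, sys.index_columns, and sys.columns.

```sql
SELECT
    OBJECT_NAME(i.object_id) AS table_name,
    i.name                   AS index_name,
    i.type_desc,             -- CLUSTERED COLUMNSTORE or NONCLUSTERED COLUMNSTORE
    c.name                   AS column_name,
    ic.key_ordinal,          -- columnstore columns are not key columns
    ic.is_included_column    -- columnstore columns are reported as included columns
FROM sys.indexes AS i
JOIN sys.index_columns AS ic
    ON ic.object_id = i.object_id AND ic.index_id = i.index_id
JOIN sys.columns AS c
    ON c.object_id = ic.object_id AND c.column_id = ic.column_id
WHERE i.type IN (5, 6)       -- 5 = clustered columnstore, 6 = nonclustered columnstore
ORDER BY table_name, index_name, ic.index_column_id;
```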

Related Tasks
All relational tables, unless you specify them as a clustered columnstore index, use rowstore as the underlying data
format. CREATE TABLE creates a rowstore table unless you specify the WITH CLUSTERED COLUMNSTORE INDEX
option.
When you create a table with the CREATE TABLE statement you can create the table as a columnstore by specifying
the WITH CLUSTERED COLUMNSTORE INDEX option. If you already have a rowstore table and want to convert it
to a columnstore, you can use the CREATE COLUMNSTORE INDEX statement. For examples, see.

TASK | REFERENCE TOPICS | NOTES

Create a table as a columnstore. | CREATE TABLE (Transact-SQL) | Beginning with SQL Server 2016, you can create the table as a clustered columnstore index. You do not have to first create a rowstore table and then convert it to columnstore.

Create a memory table with a columnstore index. | CREATE TABLE (Transact-SQL) | Beginning with SQL Server 2016, you can create a memory-optimized table with a columnstore index. The columnstore index can also be added after the table is created, using the ALTER TABLE ADD INDEX syntax.

Convert a rowstore table to a columnstore. | CREATE COLUMNSTORE INDEX (Transact-SQL) | Convert an existing heap or binary tree to a columnstore. Examples show how to handle existing indexes and also the name of the index when performing this conversion.

Convert a columnstore table to a rowstore. | CREATE COLUMNSTORE INDEX (Transact-SQL) | Usually this is not necessary, but there can be times when you need to perform this conversion. Examples show how to convert a columnstore to a heap or clustered index.

Create a columnstore index on a rowstore table. | CREATE COLUMNSTORE INDEX (Transact-SQL) | A rowstore table can have one columnstore index. Beginning with SQL Server 2016, the columnstore index can have a filtered condition. Examples show the basic syntax.

Create performant indexes for operational analytics. | Get started with Columnstore for real time operational analytics | Describes how to create complementary columnstore and btree indexes so that OLTP queries use btree indexes and analytics queries use columnstore indexes.

Create performant columnstore indexes for data warehousing. | Columnstore Indexes for Data Warehousing | Describes how to use btree indexes on columnstore tables to create performant data warehousing queries.

Use a btree index to enforce a primary key constraint on a columnstore index. | Columnstore Indexes for Data Warehousing | Shows how to combine btree and columnstore indexes to enforce primary key constraints on the columnstore index.

Drop a columnstore index | DROP INDEX (Transact-SQL) | Dropping a columnstore index uses the standard DROP INDEX syntax that btree indexes use. Dropping a clustered columnstore index will convert the columnstore table to a heap.

Delete a row from a columnstore index | DELETE (Transact-SQL) | Use DELETE (Transact-SQL) to delete a row. columnstore row: SQL Server marks the row as logically deleted but does not reclaim the physical storage for the row until the index is rebuilt. deltastore row: SQL Server logically and physically deletes the row.

Update a row in the columnstore index | UPDATE (Transact-SQL) | Use UPDATE (Transact-SQL) to update a row. columnstore row: SQL Server marks the row as logically deleted, and then inserts the updated row into the deltastore. deltastore row: SQL Server updates the row in the deltastore.

Load data into a columnstore index | Columnstore Indexes Data Loading |

Force all rows in the deltastore to go into the columnstore. | ALTER INDEX (Transact-SQL) ... REBUILD; Columnstore Indexes Defragmentation | ALTER INDEX with the REBUILD option forces all rows to go into the columnstore.

Defragment a columnstore index | ALTER INDEX (Transact-SQL) | ALTER INDEX … REORGANIZE defragments columnstore indexes online.

Merge tables with columnstore indexes. | MERGE (Transact-SQL) |
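A few of the tasks in the table can be sketched in Transact-SQL as follows. The table and index names are hypothetical. The inline INDEX ... CLUSTERED COLUMNSTORE clause is the CREATE TABLE form used by SQL Server 2016; Azure SQL Data Warehouse and Parallel Data Warehouse use a WITH (CLUSTERED COLUMNSTORE INDEX) table option instead.

```sql
-- Create a table directly as a clustered columnstore index.
CREATE TABLE dbo.FactSales
(
    SaleID    bigint NOT NULL,
    ProductID int    NOT NULL,
    SaleDate  date   NOT NULL,
    Amount    money  NOT NULL,
    INDEX CCI_FactSales CLUSTERED COLUMNSTORE
);

-- Convert an existing rowstore heap to a columnstore.
CREATE TABLE dbo.FactReturns
(
    ReturnID  bigint NOT NULL,
    ProductID int    NOT NULL,
    Amount    money  NOT NULL
);
CREATE CLUSTERED COLUMNSTORE INDEX CCI_FactReturns ON dbo.FactReturns;

-- Drop the clustered columnstore index; the table becomes a heap again.
DROP INDEX CCI_FactReturns ON dbo.FactReturns;
```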

See Also
Columnstore Indexes Data Loading
Columnstore Indexes Versioned Feature Summary
Columnstore Indexes Query Performance
Get started with Columnstore for real time operational analytics
Columnstore Indexes for Data Warehousing
Columnstore Indexes Defragmentation
Columnstore indexes - architecture
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016), Azure SQL Database, Azure SQL Data Warehouse,
Parallel Data Warehouse
Learn how a columnstore index is architected. Knowing these basics will make it easier to understand other
columnstore articles that explain how to use them effectively.

Data storage uses columnstore and rowstore compression


In discussions about columnstore indexes, we use the terms rowstore and columnstore to emphasize the format for
the data storage. Columnstore indexes use both types of storage.

A columnstore is data that is logically organized as a table with rows and columns, and physically stored in a
column-wise data format.
A columnstore index physically stores most of the data in columnstore format. In columnstore format, the data is
compressed and uncompressed as columns. There is no need to uncompress other values in each row that are not
requested by the query. This makes it fast to scan an entire column of a large table.
A rowstore is data that is logically organized as a table with rows and columns, and then physically stored in a
row-wise data format. This has been the traditional way to store relational table data such as a heap or clustered
"btree" index.
A columnstore index also physically stores some rows in a rowstore format called a deltastore. The deltastore, also
called delta rowgroups, is a holding place for rows that are too few in number to qualify for compression into the
columnstore. Each delta rowgroup is implemented as a clustered btree index.
A deltastore is a holding place for rows that are too few in number to be compressed into the columnstore.
The deltastore is a rowstore.

Operations are performed on rowgroups and column segments


The columnstore index groups rows into manageable units. Each of these units is called a rowgroup. For best
performance, the number of rows in a rowgroup is large enough to improve compression rates and small enough
to benefit from in-memory operations.
A rowgroup is a group of rows on which the columnstore index performs management and compression
operations.
For example, the columnstore index performs these operations on rowgroups:
Compresses rowgroups into the columnstore. Compression is performed on each column segment within a
rowgroup.
Merges rowgroups during an ALTER INDEX REORGANIZE operation.
Creates new rowgroups during an ALTER INDEX REBUILD operation.
Reports on rowgroup health and fragmentation in the dynamic management views (DMVs).
The deltastore consists of one or more rowgroups called delta rowgroups. Each delta rowgroup is a clustered
btree index that stores rows when they are too few in number for compression into the columnstore.
A delta rowgroup is a clustered btree index that stores small bulk loads and inserts until the rowgroup
contains 1,048,576 rows or until the index is rebuilt. When a delta rowgroup contains 1,048,576 rows it is
marked as closed and waits for a process called the tuple-mover to compress it into the columnstore.
Each column has some of its values in each rowgroup. These values are called column segments. When the
columnstore index compresses a rowgroup, it compresses each column segment separately. To uncompress an
entire column, the columnstore index only needs to uncompress one column segment from each rowgroup.
A column segment is the portion of column values in a rowgroup. Each rowgroup contains one column
segment for every column in the table. Each column has one column segment in each rowgroup.

Small loads and inserts go to the deltastore


A columnstore index improves columnstore compression and performance by compressing at least 102,400 rows
at a time into the columnstore index. To compress rows in bulk, the columnstore index accumulates small loads
and inserts in the deltastore. The deltastore operations are handled behind the scenes. To return the correct query
results, the clustered columnstore index combines query results from both the columnstore and the deltastore.
Rows go to the deltastore when they are:
Inserted with the INSERT INTO VALUES statement.
At the end of a bulk load and they number fewer than 102,400.
Updated. Each update is implemented as a delete and an insert.
The deltastore also stores a list of IDs for deleted rows that have been marked as deleted but not yet physically
deleted from the columnstore.
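To see this behavior, you can insert a small number of rows and then inspect the rowgroups. The following is a minimal sketch; the table and index names are hypothetical.

```sql
-- Hypothetical table stored as a clustered columnstore index.
CREATE TABLE dbo.SensorReading
(
    ReadingID  bigint       NOT NULL,
    SensorID   int          NOT NULL,
    ReadingUtc datetime2(0) NOT NULL,
    Value      float        NOT NULL,
    INDEX CCI_SensorReading CLUSTERED COLUMNSTORE
);

-- A single-row insert is far below 102,400 rows, so it goes to a delta rowgroup.
INSERT INTO dbo.SensorReading (ReadingID, SensorID, ReadingUtc, Value)
VALUES (1, 42, SYSUTCDATETIME(), 21.5);

-- An OPEN state indicates a delta rowgroup that is still accepting rows.
SELECT row_group_id, state_desc, total_rows, deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.SensorReading');
```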

When delta rowgroups are full they get compressed into the
columnstore
Clustered columnstore indexes collect up to 1,048,576 rows in each delta rowgroup before compressing the
rowgroup into the columnstore. This improves the compression of the columnstore index. When a delta rowgroup
contains 1,048,576 rows, the columnstore index marks the rowgroup as closed. A background process, called the
tuple-mover, finds each closed rowgroup and compresses it into the columnstore.
You can force delta rowgroups into the columnstore by using ALTER INDEX to rebuild or reorganize the index. Note
that if there is memory pressure during compression, the columnstore index might reduce the number of rows in
the compressed rowgroup.
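Both ways of forcing delta rowgroups into the columnstore can be sketched as follows. The example assumes a table dbo.SensorReading with a clustered columnstore index named CCI_SensorReading; both names are hypothetical.

```sql
-- Reorganize: compress all open and closed delta rowgroups into the columnstore.
ALTER INDEX CCI_SensorReading ON dbo.SensorReading
    REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

-- Rebuild: re-create the whole index, which also moves all rows into the columnstore.
ALTER INDEX CCI_SensorReading ON dbo.SensorReading REBUILD;
```

Reorganize is usually the lighter choice because it works on the existing rowgroups, whereas a rebuild re-creates the entire index.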

Each table partition has its own rowgroups and delta rowgroups
The concept of partitioning is the same for a clustered index, a heap, and a columnstore index. Partitioning a
table divides the table into smaller groups of rows according to a range of column values. It is often used for
managing the data. For example, you could create a partition for each year of data, and then use partition switching
to archive data to less expensive storage. Partition switching works on columnstore indexes and makes it easy to
move a partition of data to another location.
Rowgroups are always defined within a table partition. When a columnstore index is partitioned, each partition has
its own compressed rowgroups and delta rowgroups.
Each partition can have multiple delta rowgroups
Each partition can have more than one delta rowgroup. When the columnstore index needs to add data to a delta
rowgroup and the delta rowgroup is locked, the columnstore index will try to obtain a lock on a different delta
rowgroup. If there are no delta rowgroups available, the columnstore index will create a new delta rowgroup. For
example, a table with 10 partitions could easily have 20 or more delta rowgroups.
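One way to see how many rowgroups of each kind a partition holds is to group the rowgroup DMV by partition and state. This is a sketch; dbo.SensorReading is a hypothetical table name.

```sql
SELECT partition_number,
       state_desc,                     -- for example OPEN, CLOSED, or COMPRESSED
       COUNT(*)        AS rowgroup_count,
       SUM(total_rows) AS total_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.SensorReading')   -- hypothetical table name
GROUP BY partition_number, state_desc
ORDER BY partition_number, state_desc;
```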

You can combine columnstore and rowstore indexes on the same table
A nonclustered index contains a copy of part or all of the rows and columns in the underlying table. The index is
defined as one or more columns of the table, and has an optional condition that filters the rows.
Beginning with SQL Server 2016, you can create an updatable nonclustered columnstore index on a rowstore table.
The columnstore index stores a copy of the data so you do need extra storage. However, the data in the
columnstore index will compress to a smaller size than the rowstore table requires. By doing this, you can run
analytics on the columnstore index and transactions on the rowstore index at the same time. The columnstore is
updated when data changes in the rowstore table, so both indexes are working against the same data.
Beginning with SQL Server 2016, you can have one or more nonclustered rowstore indexes on a columnstore
index. By doing this, you can perform efficient table seeks on the underlying columnstore. Other options become
available too. For example, you can enforce a primary key constraint by using a UNIQUE constraint on the rowstore
index. Because a non-unique value fails to insert into the rowstore index, SQL Server cannot insert the value into
the columnstore.
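The primary key case can be sketched as follows, with hypothetical table and constraint names. The constraint is backed by a nonclustered rowstore (btree) index on top of the clustered columnstore index.

```sql
-- Hypothetical fact table stored as a clustered columnstore index.
CREATE TABLE dbo.FactOrder
(
    OrderID   bigint NOT NULL,
    ProductID int    NOT NULL,
    Amount    money  NOT NULL,
    INDEX CCI_FactOrder CLUSTERED COLUMNSTORE
);

-- The primary key is enforced by a nonclustered rowstore index.
ALTER TABLE dbo.FactOrder
    ADD CONSTRAINT PK_FactOrder PRIMARY KEY NONCLUSTERED (OrderID);

INSERT INTO dbo.FactOrder (OrderID, ProductID, Amount) VALUES (1, 10, 25.00);

-- This insert repeats OrderID 1, so it is expected to fail with a primary key
-- violation and the row never reaches the columnstore.
INSERT INTO dbo.FactOrder (OrderID, ProductID, Amount) VALUES (1, 11, 30.00);
```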

Metadata

Use these metadata views to see attributes of columnstore indexes. More architectural information is embedded in
some of these views.
Note that all of the columns in a columnstore index are stored in the metadata as included columns. The columnstore
index does not have key columns.
sys.indexes (Transact-SQL)
sys.index_columns (Transact-SQL)
sys.partitions (Transact-SQL)
sys.internal_partitions (Transact-SQL)
sys.column_store_segments (Transact-SQL)
sys.column_store_dictionaries (Transact-SQL)
sys.column_store_row_groups (Transact-SQL)
sys.dm_db_column_store_row_group_operational_stats (Transact-SQL)
sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)
sys.dm_column_store_object_pool (Transact-SQL)
sys.dm_db_index_operational_stats (Transact-SQL)
sys.dm_db_index_physical_stats (Transact-SQL)

Next steps
For guidance on designing your columnstore indexes, see Columnstore indexes - design guidance.
Columnstore indexes - design guidance
3/24/2017

THIS TOPIC APPLIES TO: SQL Server (starting with 2016), Azure SQL Database, Azure SQL Data Warehouse,
Parallel Data Warehouse
This article gives high-level recommendations for designing columnstore indexes. A small number of good design decisions helps
you achieve the high data compression and query performance that columnstore indexes are designed to provide.

Prerequisites
This article assumes you are familiar with columnstore architecture and terminology. For more information, see
Columnstore indexes - overview and Columnstore indexes - architecture.
Know your data requirements
Before designing a columnstore index, understand as much as possible about your data requirements. For example,
think through the answers to these questions:
How large is my table?
Do my queries mostly perform analytics that scan large ranges of values? Columnstore indexes are designed to
work well for large range scans rather than looking up specific values.
Does my workload perform lots of updates and deletes? Columnstore indexes work well when the data is stable.
Queries should be updating and deleting less than 10% of the rows.
Do I have fact and dimension tables for a data warehouse?
Do I need to perform analytics on a transactional workload? If this is the case, see the columnstore design
guidance for real-time operational analytics.
You might not need a columnstore index. Rowstore tables with heaps or clustered indexes perform best on queries
that seek into the data, searching for a particular value, or for queries on a small range of values. Use rowstore
indexes with transactional workloads since they tend to require mostly table seeks instead of large range table
scans.

Choose the best columnstore index for your needs


A columnstore index is either clustered or nonclustered. A clustered columnstore index can have one or more
nonclustered btree indexes. Columnstore indexes are easy to try. If you create a table as a columnstore index, you
can easily convert the table back to a rowstore table by dropping the columnstore index.
Here is a summary of the options and recommendations.

COLUMNSTORE OPTION | RECOMMENDATIONS FOR WHEN TO USE | COMPRESSION

Clustered columnstore index | Use for: 1) Traditional data warehouse workload with a star or snowflake schema. 2) Internet of Things (IoT) workloads that insert large volumes of data with minimal updates and deletes. | Average of 10x

Nonclustered btree indexes on a clustered columnstore index | Use to: 1) Enforce primary key and foreign key constraints on a clustered columnstore index. 2) Speed up queries that search for specific values or small ranges of values. 3) Speed up updates and deletes of specific rows. | 10x on average plus some additional storage for the NCIs.

Nonclustered columnstore index on a disk-based heap or btree index | Use for: 1) An OLTP workload that has some analytics queries. You can drop btree indexes created for analytics and replace them with one nonclustered columnstore index. 2) Many traditional OLTP workloads that perform Extract Transform and Load (ETL) operations to move data to a separate data warehouse. You can eliminate ETL and a separate data warehouse by creating a nonclustered columnstore index on some of the OLTP tables. | NCCI is an additional index that requires 10% more storage on average.

Columnstore index on an in-memory table | Same recommendations as nonclustered columnstore index on a disk-based table, except the base table is an in-memory table. | Columnstore index is an additional index.

Use a clustered columnstore index for large data warehouse tables


The clustered columnstore index is more than an index; it is the primary table storage. It achieves high data
compression and a significant improvement in query performance on large data warehousing fact and dimension
tables. Clustered columnstore indexes are best suited for analytics queries rather than transactional queries, since
analytics queries tend to perform operations on large ranges of values rather than looking up specific values.
Consider using a clustered columnstore index when:
Each partition has at least a million rows. Columnstore indexes have rowgroups within each partition. If the
table is too small to fill a rowgroup within each partition, you won't get the benefits of columnstore
compression and query performance.
Queries primarily perform analytics on ranges of values. For example, to find the average value of a column, the
query needs to scan all the column values. It then aggregates the values by summing them to determine the
average.
Most of the inserts are on large volumes of data with minimal updates and deletes. Many workloads such as
Internet of Things (IOT) insert large volumes of data with minimal updates and deletes. These workloads can
benefit from the compression and query performance gains that come from using a clustered columnstore
index.
Don't use a clustered columnstore index when:
The table requires varchar(max), nvarchar(max), or varbinary(max) data types. Or, design the columnstore index
so that it doesn't include these columns.
The table data is not permanent. Consider using a heap or temporary table when you need to store and delete
the data quickly.
The table has fewer than one million rows per partition.
More than 10% of the operations on the table are updates and deletes. Large numbers of updates and deletes
cause fragmentation. The fragmentation affects compression rates and query performance until you run an
operation called reorganize that forces all data into the columnstore and removes fragmentation. For more
information, see Minimizing index fragmentation in columnstore index.
For more information, see Columnstore indexes - data warehousing.
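
As a minimal sketch of the guidance above, the following T-SQL creates a fact table directly as a clustered columnstore index and runs a range-based aggregate against it. The table name dbo.FactSales, its columns, and the index name cci_FactSales are hypothetical and chosen only for illustration. The inline INDEX ... CLUSTERED COLUMNSTORE clause assumes SQL Server 2016 or later.

```sql
-- Hypothetical fact table; names and columns are illustrative only.
CREATE TABLE dbo.FactSales
(
    SaleId    BIGINT        NOT NULL,
    ProductId INT           NOT NULL,
    SaleDate  DATE          NOT NULL,
    Amount    DECIMAL(18,2) NOT NULL,
    -- The clustered columnstore index becomes the primary storage for the table.
    INDEX cci_FactSales CLUSTERED COLUMNSTORE
);

-- Typical analytics query: scans a range of values and aggregates them.
SELECT ProductId, AVG(Amount) AS AvgAmount
FROM dbo.FactSales
WHERE SaleDate >= '20160101' AND SaleDate < '20170101'
GROUP BY ProductId;
```

Whether a table like this benefits in practice depends on the points listed above, in particular on having enough rows per partition to fill rowgroups.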

Add btree nonclustered indexes for efficient table seeks


Beginning with SQL Server 2016, you can create nonclustered btree indexes as secondary indexes on a clustered
columnstore index. The nonclustered btree index is updated as changes occur to the columnstore index. This is a
powerful feature that you can use to your advantage.
By using the secondary btree index, you can efficiently search for specific rows without scanning through all the
rows. Other options become available too. For example, you can enforce a primary or foreign key constraint by
using a UNIQUE constraint on the btree index. Because a non-unique value fails to insert into the btree index, SQL
Server does not insert the value into the columnstore either.
Consider using a btree index on a columnstore index to:
Run queries that search for particular values or small ranges of values.
Enforce a constraint such as a primary key or foreign key constraint.
Efficiently perform update and delete operations. The btree index is able to quickly locate the specific rows for
updates and deletes without scanning the full table or partition of a table.
You have additional storage available to store the btree index.
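
The list above could translate into statements like the following. This is only a sketch: it assumes a hypothetical table dbo.FactSales that already has a clustered columnstore index, and the constraint and index names are invented for the example. It requires SQL Server 2016 or later.

```sql
-- Assumes a hypothetical table dbo.FactSales with a clustered columnstore index.

-- Enforce uniqueness of SaleId with a btree-backed primary key constraint.
ALTER TABLE dbo.FactSales
    ADD CONSTRAINT PK_FactSales PRIMARY KEY NONCLUSTERED (SaleId);

-- Secondary btree index to support seeks on small ranges of dates.
CREATE NONCLUSTERED INDEX ix_FactSales_SaleDate
    ON dbo.FactSales (SaleDate);

-- A point lookup like this can seek on the btree index
-- instead of scanning every rowgroup.
SELECT SaleId, ProductId, Amount
FROM dbo.FactSales
WHERE SaleId = 42;
```

Each btree index adds storage and maintenance cost on top of the columnstore, so it is worth creating only the ones that the lookup, constraint, or update patterns actually need.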

Use a nonclustered columnstore index for real-time analytics


Beginning with SQL Server 2016, you can have a nonclustered columnstore index on a rowstore disk-based table
or an in-memory OLTP table. This makes it possible to run analytics in real time on a transactional table. While
transactions are occurring on the underlying table, you can run analytics on the columnstore index. Since one table
manages both indexes, changes are available in real time to both the rowstore and the columnstore indexes.
Since a columnstore index achieves 10x better data compression than a rowstore index, it only needs a small
amount of extra storage. For example, if the compressed rowstore table takes 20 GB, the columnstore index might
require an additional 2 GB. The additional space required also depends on the number of columns in the
nonclustered columnstore index.
Consider using a nonclustered columnstore index to:
Run analytics in real-time on a transactional rowstore table. You can replace existing btree indexes that are
designed for analytics with a nonclustered columnstore index.
Eliminate the need for a separate data warehouse. Traditionally, companies run transactions on a rowstore
table and then load the data into a separate data warehouse to run analytics. For many workloads, you can
eliminate the loading process and the separate data warehouse by creating a nonclustered columnstore
index on transactional tables.
SQL Server 2016 offers several strategies to make this scenario performant. It's very easy to try it since you
can enable a nonclustered columnstore index with no changes to your OLTP application.
To add additional processing resources, you can run the analytics on a readable secondary. Using a readable
secondary separates the processing of the transactional workload and the analytics workload.
For more information, see Get started with columnstore indexes for real-time operational analytics.
For more information on choosing the best columnstore index, see Sunil Agarwal's blog Which columnstore index
is right for my workload?.
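
To illustrate the real-time analytics scenario, here is a minimal sketch that adds a nonclustered columnstore index to a transactional table. The table dbo.Orders, its columns, and the index name are assumptions made for this example, not part of any real schema.

```sql
-- Hypothetical OLTP table dbo.Orders that already has its rowstore indexes.
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_Orders
    ON dbo.Orders (OrderId, CustomerId, OrderDate, Amount);

-- Analytics query that can be served by the columnstore index while
-- transactions continue to run against the rowstore table.
SELECT CustomerId,
       COUNT(*)    AS OrderCount,
       SUM(Amount) AS TotalAmount
FROM dbo.Orders
GROUP BY CustomerId;
```

Including only the columns that analytics queries need keeps the additional storage small, in line with the note above that the extra space depends on the number of columns in the index.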

Use table partitions for data management and query performance


Columnstore indexes support partitioning which is a good way to manage and archive data. Partitioning also
improves query performance by limiting operations to one or more partitions.
Use partitions to make the data easier to manage
For large tables, the only practical way to manage ranges of data is by using partitions. The advantages of
partitions for rowstore tables also apply to columnstore indexes.
For example, both rowstore and columnstore tables use partitions to:
Control the size of incremental backups. You can back up partitions to separate filegroups and then mark them
as read-only. By doing this, future backups will skip the read-only filegroups.
Save storage costs by moving an older partition to less expensive storage. For example, you could use partition
switching to move a partition to a less expensive storage location.
Perform operations efficiently by limiting the operations to a partition. For example, you can target only the
fragmented partitions for index maintenance.
Additionally, with a columnstore index, you use partitioning to:
Save an additional 30% in storage costs. You can compress older partitions with the COLUMNSTORE_ARCHIVE
compression option. Queries against that data will be slower, which is acceptable if the partition is
queried infrequently.
Use partitions to improve query performance
By using partitions, you can limit your queries to scan only specific partitions which limits the number of rows to
scan. For example, if the index is partitioned by year and the query is analyzing data from last year, it only needs to
scan the data in one partition.
Use fewer partitions for a columnstore index
Unless you have a large enough data size, a columnstore index performs best with fewer partitions than what
you might use for a rowstore index. If you don't have at least one million rows per partition, most of your rows
might go to the deltastore where they don't receive the performance benefit of columnstore compression. For
example, if you load one million rows into a table with 10 partitions and each partition receives 100,000 rows, all of
the rows will go to delta rowgroups.
Example:
Load 1,000,000 rows into one partition or a non-partitioned table. You get one compressed rowgroup with
1,000,000 rows. This is great for high data compression and fast query performance.
Load 1,000,000 rows evenly into 10 partitions. Each partition gets 100,000 rows, which is less than the
minimum threshold for columnstore compression. As a result, the columnstore index could have 10 delta
rowgroups with 100,000 rows in each. There are ways to force the delta rowgroups into the columnstore.
However, if these are the only rows in the columnstore index, the compressed rowgroups will be too small for
best compression and query performance.
For more information about partitioning, see Sunil Agarwal's blog post, Should I partition my columnstore index?.
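
One way to check whether a partitioning scheme leaves rows stranded in delta rowgroups is to summarize rowgroups per partition. The query below is a sketch that assumes SQL Server 2016 or later and a hypothetical table named dbo.FactSales.

```sql
-- Summarize rowgroups per partition for a hypothetical table dbo.FactSales.
-- Delta rowgroups report a state_desc of OPEN or CLOSED;
-- compressed rowgroups report COMPRESSED.
SELECT partition_number,
       state_desc,
       COUNT(*)        AS rowgroup_count,
       SUM(total_rows) AS total_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales')
GROUP BY partition_number, state_desc
ORDER BY partition_number, state_desc;
```

If most partitions show only OPEN or CLOSED rowgroups with well under a million rows, that suggests the table has more partitions than its data volume supports.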

Choose the appropriate data compression method


The columnstore index offers two choices for data compression: columnstore compression and archive
compression. You can choose the compression option when you create the index, or change it later with ALTER
INDEX ... REBUILD.
Use columnstore compression for best query performance
Columnstore compression typically achieves 10x better compression rates over rowstore indexes. It is the standard
compression method for columnstore indexes and enables fast query performance.
Use archive compression for best data compression
Archive compression is designed for maximum compression when query performance is not as important. It
achieves higher data compression rates than columnstore compression, but it comes with a price. It takes longer to
compress and decompress the data, so it is not well-suited for fast query performance.
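
As a sketch of switching between the two compression methods, the statements below rebuild a single partition with each option. They assume a partitioned table dbo.FactSales with a clustered columnstore index named cci_FactSales; both names and the partition number are hypothetical.

```sql
-- Rebuild one older partition with archive compression
-- (the partition number is illustrative).
ALTER INDEX cci_FactSales ON dbo.FactSales
    REBUILD PARTITION = 1
    WITH (DATA_COMPRESSION = COLUMNSTORE_ARCHIVE);

-- Switch the same partition back to standard columnstore compression.
ALTER INDEX cci_FactSales ON dbo.FactSales
    REBUILD PARTITION = 1
    WITH (DATA_COMPRESSION = COLUMNSTORE);
```

Applying archive compression per partition lets recent, frequently queried partitions keep the faster standard compression.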

Use optimizations when you convert a rowstore table to a columnstore index

If your data is already in a rowstore table, you can use CREATE COLUMNSTORE INDEX to convert the table to a
clustered columnstore index. There are a couple of optimizations that will improve query performance after the table is
converted.
Use MAXDOP to improve rowgroup quality
You can configure the maximum number of processors for converting a heap or clustered btree index to a
columnstore index. To configure the processors, use the maximum degree of parallelism option (MAXDOP).
If you have large amounts of data, MAXDOP 1 will likely be too slow. Increasing MAXDOP to 4 works fine. If this
results in a few rowgroups that do not have the optimal number of rows, you can run ALTER INDEX ... REORGANIZE to
merge them together in the background.
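
The MAXDOP guidance above might look like the following in practice. This is a sketch only: dbo.MyFactTable is assumed to be a heap, and the index name cci_MyFactTable is invented for the example.

```sql
-- Convert a hypothetical heap to a clustered columnstore index,
-- using up to 4 processors for the build.
CREATE CLUSTERED COLUMNSTORE INDEX cci_MyFactTable
    ON dbo.MyFactTable
    WITH (MAXDOP = 4);

-- If the parallel build leaves some undersized rowgroups,
-- merge them as an online operation (SQL Server 2016 or later).
ALTER INDEX cci_MyFactTable ON dbo.MyFactTable REORGANIZE;
```

The right MAXDOP value depends on the hardware and on how much rowgroup quality matters for the workload, so 4 here is just the value the text mentions.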
Keep the sorted order of a btree index
Since the btree index already stores rows in a sorted order, preserving that order when the rows get compressed
into the columnstore index can improve query performance.
The columnstore index does not sort the data, but it does use metadata to track the minimum and maximum values
of each column segment in each rowgroup. When scanning for a range of values, it can quickly compute when to
skip the rowgroup. When the data is ordered, more rowgroups can be skipped.
To preserve the sorted order during conversion:
Use CREATE COLUMNSTORE INDEX with the DROP_EXISTING clause. This also preserves the name of the
index. If you have scripts that already use the name of the rowstore index you won't need to update them.
This example converts a clustered rowstore index on a table named MyFactTable to a clustered columnstore
index. The index name, ClusteredIndex_d473567f7ea04d7aafcac5364c241e09 , stays the same.

CREATE CLUSTERED COLUMNSTORE INDEX ClusteredIndex_d473567f7ea04d7aafcac5364c241e09
ON MyFactTable
WITH (DROP_EXISTING = ON);

Related Tasks
These are tasks for creating and maintaining columnstore indexes.

Each task below lists its reference topics and notes.

Task: Create a table as a columnstore.
Reference topics: CREATE TABLE (Transact-SQL)
Notes: Beginning with SQL Server 2016, you can create the table as a clustered columnstore index. You do not have to
first create a rowstore table and then convert it to columnstore.

Task: Create a memory table with a columnstore index.
Reference topics: CREATE TABLE (Transact-SQL)
Notes: Beginning with SQL Server 2016, you can create a memory-optimized table with a columnstore index. The
columnstore index can also be added after the table is created, using the ALTER TABLE ADD INDEX syntax.

Task: Convert a rowstore table to a columnstore.
Reference topics: CREATE COLUMNSTORE INDEX (Transact-SQL)
Notes: Convert an existing heap or binary tree to a columnstore. Examples show how to handle existing indexes and
also the name of the index when performing this conversion.

Task: Convert a columnstore table to a rowstore.
Reference topics: CREATE COLUMNSTORE INDEX (Transact-SQL)
Notes: Usually this is not necessary, but there can be times when you need to perform this conversion. Examples show
how to convert a columnstore to a heap or clustered index.

Task: Create a columnstore index on a rowstore table.
Reference topics: CREATE COLUMNSTORE INDEX (Transact-SQL)
Notes: A rowstore table can have one columnstore index. Beginning with SQL Server 2016, the columnstore index can
have a filtered condition. Examples show the basic syntax.

Task: Create performant indexes for operational analytics.
Reference topics: Get started with Columnstore for real time operational analytics
Notes: Describes how to create complementary columnstore and btree indexes so that OLTP queries use btree indexes
and analytics queries use columnstore indexes.

Task: Create performant columnstore indexes for data warehousing.
Reference topics: Columnstore indexes - data warehousing
Notes: Describes how to use btree indexes on columnstore tables to create performant data warehousing queries.

Task: Use a btree index to enforce a primary key constraint on a columnstore index.
Reference topics: Columnstore indexes - data warehousing
Notes: Shows how to combine btree and columnstore indexes to enforce primary key constraints on the columnstore
index.

Task: Drop a columnstore index.
Reference topics: DROP INDEX (Transact-SQL)
Notes: Dropping a columnstore index uses the standard DROP INDEX syntax that btree indexes use. Dropping a clustered
columnstore index will convert the columnstore table to a heap.

Task: Delete a row from a columnstore index.
Reference topics: DELETE (Transact-SQL)
Notes: Use DELETE (Transact-SQL) to delete a row.
Columnstore row: SQL Server marks the row as logically deleted but does not reclaim the physical storage for the
row until the index is rebuilt.
Deltastore row: SQL Server logically and physically deletes the row.

Task: Update a row in the columnstore index.
Reference topics: UPDATE (Transact-SQL)
Notes: Use UPDATE (Transact-SQL) to update a row.
Columnstore row: SQL Server marks the row as logically deleted, and then inserts the updated row into the
deltastore.
Deltastore row: SQL Server updates the row in the deltastore.

Task: Force all rows in the deltastore to go into the columnstore.
Reference topics: ALTER INDEX (Transact-SQL) ... REBUILD; Columnstore indexes - defragmentation
Notes: ALTER INDEX with the REBUILD option forces all rows to go into the columnstore.

Task: Defragment a columnstore index.
Reference topics: ALTER INDEX (Transact-SQL)
Notes: ALTER INDEX … REORGANIZE defragments columnstore indexes online.

Task: Merge tables with columnstore indexes.
Reference topics: MERGE (Transact-SQL)

Next steps
To create an empty columnstore index for:
SQL Server, use CREATE TABLE (Transact-SQL)
SQL Database, use CREATE TABLE on Azure SQL Database
SQL Data Warehouse, use CREATE TABLE (Azure SQL Data Warehouse)
To convert an existing rowstore heap or btree index to a clustered columnstore index, or to create a nonclustered
columnstore index, use:
CREATE COLUMNSTORE INDEX (Transact-SQL)
Columnstore indexes - data loading guidance
4/6/2017 • 6 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2012) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
Options and recommendations for loading data into a columnstore index by using the standard SQL bulk loading
and trickle insert methods. Loading data into a columnstore index is an essential part of any data warehousing
process because it moves data into the index in preparation for analytics.
New to columnstore indexes? See Columnstore indexes - overview and Columnstore indexes - architecture.

What is bulk loading?


Bulk loading refers to the way large numbers of rows are added to a data store. It is the most performant way to
move data into a columnstore index because it operates on batches of rows. Bulk loading fills rowgroups to
maximum capacity and compresses them directly into the columnstore. Only rows at the end of a load that don't
meet the minimum of 102,400 rows per rowgroup go to the deltastore.
To perform a bulk load, you can use the bcp utility, Integration Services, or select rows from a staging table.

As the diagram suggests, a bulk load:


Does not pre-sort the data. Data is inserted into rowgroups in the order it is received.
If the batch size is at least 102,400 rows, the rows are loaded directly into compressed rowgroups. It is recommended
that you choose a batch size of at least 102,400 rows for efficient bulk import, because you avoid moving rows into
delta rowgroups before a background thread, the tuple mover (TM), eventually moves them into compressed rowgroups.
If the batch size is fewer than 102,400 rows, or if the remaining rows number fewer than 102,400, the rows are
loaded into delta rowgroups.

NOTE
On a rowstore table with a nonclustered columnstore index, SQL Server always inserts data into
the base table. The data is never inserted directly into the columnstore index.

Bulk loading has these built-in performance optimizations:


Parallel loads. You can have multiple concurrent bulk loads (bcp or bulk insert) that are each loading a
separate data file. Unlike rowstore bulk loads into SQL Server, you don't need to specify TABLOCK because
each bulk import thread loads data exclusively into separate rowgroups (compressed or delta
rowgroups) with an exclusive lock on them. Using TABLOCK forces an exclusive lock on the table, and you will
not be able to import data in parallel.
Minimal logging. A bulk load uses minimal logging on data that goes directly to compressed rowgroups.
Any data that goes to a delta rowgroup is fully logged. This includes any batch sizes that are less than
102,400 rows. However, with bulk loading the goal is for most of the data to bypass delta rowgroups.
Locking optimization. When loading into a compressed rowgroup, an X lock is acquired on the rowgroup.
However, when bulk loading into a delta rowgroup, an X lock is acquired on the rowgroup but SQL Server still
takes PAGE/EXTENT locks, because the X rowgroup lock is not part of the locking hierarchy.
If you have a nonclustered btree index on a columnstore index, there is no locking or logging optimization for the
index itself, but the optimizations on the clustered columnstore index described above still apply.
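
A bulk load that follows these recommendations could be sketched as below. The file path, table name, and terminators are hypothetical, and the batch size is simply one value that meets the 102,400-row minimum.

```sql
-- File path, table name, and terminators are hypothetical.
BULK INSERT dbo.FactSales
FROM 'C:\data\sales_2016_q1.csv'
WITH
(
    FIELDTERMINATOR = ',',
    ROWTERMINATOR   = '\n',
    BATCHSIZE       = 1048576   -- at least 102,400 so rows can bypass delta rowgroups
    -- TABLOCK is intentionally omitted so that other loads can run in parallel.
);
```

Running several such statements concurrently, each against a separate data file, corresponds to the parallel load pattern described above.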

Plan bulk load sizes to minimize delta rowgroups


Columnstore indexes perform best when most of the rows are compressed into the columnstore and not sitting in
delta rowgroups. It's best to size your loads so that rows go directly to the columnstore and bypass the deltastore
as much as possible.
These scenarios describe when loaded rows go directly to the columnstore or when they go to the deltastore. In
the example, each rowgroup can have 102,400 to 1,048,576 rows. In practice, the maximum size of a
rowgroup can be smaller than 1,048,576 rows when there is memory pressure.

ROWS TO BULK LOAD    ROWS ADDED TO THE COMPRESSED ROWGROUP                        ROWS ADDED TO THE DELTA ROWGROUP

102,000              0                                                            102,000

145,000              145,000 (rowgroup size: 145,000)                             0

1,048,577            1,048,576 (rowgroup size: 1,048,576)                         1

2,252,152            2,252,152 (rowgroup sizes: 1,048,576, 1,048,576, 155,000)    0

The following example shows the results of loading 1,048,577 rows into a table. The results show one
COMPRESSED rowgroup in the columnstore (as compressed column segments), and 1 row in the deltastore.

SELECT object_id, index_id, partition_number, row_group_id, delta_store_hobt_id, state_desc, total_rows,
deleted_rows, size_in_bytes
FROM sys.dm_db_column_store_row_group_physical_stats;

Use a staging table to improve performance


If you are loading data only to stage it before running more transformations, loading the data into a heap table will be
much faster than loading it into a clustered columnstore table. In addition, loading data into a temporary table
will also be much faster than loading it into a table in permanent storage.
A common pattern for data load is to load the data into a staging table, do some transformation, and then load it
into the target table using the following command:

INSERT INTO <columnstore index> SELECT <list of columns> FROM <Staging Table>

This command loads the data into the columnstore index in a similar way to bcp or BULK INSERT, but in a single batch.
If the staging table has fewer than 102,400 rows, the rows are loaded into a delta rowgroup; otherwise the
rows are loaded directly into compressed rowgroups. One key limitation was that this INSERT operation was
single-threaded. To load data in parallel, you could create multiple staging tables or issue INSERT/SELECT with
non-overlapping ranges of rows from the staging table. This limitation goes away with SQL Server 2016. The command
below loads the data from the staging table in parallel, but you need to specify TABLOCK:

INSERT INTO <columnstore index> WITH (TABLOCK) SELECT <list of columns> FROM <Staging Table>

The following optimizations are available when loading into a clustered columnstore index from a staging table:
Log optimization: Minimally logged when the data is loaded into a compressed rowgroup. No minimal
logging when data is loaded into a delta rowgroup.
Locking optimization: When loading into a compressed rowgroup, an X lock is acquired on the rowgroup.
However, with a delta rowgroup, an X lock is acquired on the rowgroup but SQL Server still takes
PAGE/EXTENT locks, because the X rowgroup lock is not part of the locking hierarchy.
If you have one or more nonclustered indexes, there is no locking or logging optimization for the index itself, but
the optimizations on the clustered columnstore index described above still apply.
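
The staging pattern described above can be sketched with concrete names. Everything here is hypothetical: dbo.FactSales_Staging is an invented heap, and dbo.FactSales is assumed to be an existing table with a clustered columnstore index and the same four columns.

```sql
-- Stage raw rows in a heap (hypothetical names and columns).
CREATE TABLE dbo.FactSales_Staging
(
    SaleId    BIGINT        NOT NULL,
    ProductId INT           NOT NULL,
    SaleDate  DATE          NOT NULL,
    Amount    DECIMAL(18,2) NOT NULL
);

-- ... load and transform rows in dbo.FactSales_Staging here ...

-- Move the staged rows into the columnstore table.
-- TABLOCK allows the insert to run in parallel in SQL Server 2016.
INSERT INTO dbo.FactSales WITH (TABLOCK) (SaleId, ProductId, SaleDate, Amount)
SELECT SaleId, ProductId, SaleDate, Amount
FROM dbo.FactSales_Staging;
```

Because the whole INSERT ... SELECT is one batch, staging at least 102,400 rows before running it lets the rows go directly into compressed rowgroups.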

What is trickle insert?


Trickle insert refers to the way individual rows move into the columnstore index. Trickle inserts use the INSERT
INTO statement. With trickle insert, all of the rows go to the deltastore. This is useful for small numbers of rows,
but not practical for large loads.

INSERT INTO <table-name> VALUES (<set of values>)

Note that concurrent threads using INSERT INTO to insert values into a clustered columnstore index can insert rows
into the same deltastore rowgroup.
Once the rowgroup contains 1,048,576 rows, the delta rowgroup is marked closed. It is still available for
queries and update/delete operations, but newly inserted rows go into an existing or newly created deltastore
rowgroup. A background thread, the tuple mover (TM), compresses the closed delta rowgroups
approximately every 5 minutes. You can explicitly invoke the following command to compress the closed delta
rowgroups:

ALTER INDEX <index-name> on <table-name> REORGANIZE

If you want to force a delta rowgroup to be closed and compressed, you can execute the following command. By
explicitly closing and compressing the delta rowgroup, you can save more storage and improve analytics query
performance. A best practice is to run this command when you are done loading rows and don't expect new rows to
be inserted.

ALTER INDEX <index-name> on <table-name> REORGANIZE with (COMPRESS_ALL_ROW_GROUPS = ON)
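
Putting the trickle insert steps together, a minimal sketch might look like this. The table dbo.FactSales, its columns, and the index name cci_FactSales are hypothetical, and the DMV query assumes SQL Server 2016 or later.

```sql
-- Trickle insert: a single row lands in an open delta rowgroup.
INSERT INTO dbo.FactSales (SaleId, ProductId, SaleDate, Amount)
VALUES (1001, 7, '20170406', 19.99);

-- Inspect rowgroups; delta rowgroups report a state_desc of OPEN or CLOSED.
SELECT row_group_id, state_desc, total_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales');

-- Loading is finished: close and compress the delta rowgroups.
ALTER INDEX cci_FactSales ON dbo.FactSales
    REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);
```

Running the DMV query again after the REORGANIZE is a simple way to confirm that the rows have moved into compressed rowgroups.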


How loading into a partitioned table works
For partitioned data, SQL Server first assigns each row to a partition, and then performs columnstore operations
on the data within the partition. Each partition has its own rowgroups and at least one delta rowgroup.

Next steps
For further discussion on loading, see this blog post.
Columnstore indexes - what's new
3/24/2017 • 6 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2012) Azure SQL Database Azure SQL Data Warehouse
Parallel Data Warehouse
Summary of columnstore features available for each version of SQL Server, and the latest releases of Azure SQL
Database Premium Edition, Azure SQL Data Warehouse, and Parallel Data Warehouse.

NOTE
For Azure SQL Database, columnstore indexes are only available in Premium Edition.

Feature Summary for Product Releases


This table summarizes key features for columnstore indexes and the products in which they are available.

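Products compared: SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data
Warehouse.

COLUMNSTORE INDEX FEATURE: Batch execution for multi-threaded queries
AVAILABLE IN: SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Batch execution for single-threaded queries
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Archival compression option
AVAILABLE IN: SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Snapshot isolation and read-committed snapshot isolation
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Specify columnstore index when creating a table
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: AlwaysOn supports columnstore indexes
AVAILABLE IN: SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: AlwaysOn readable secondary supports read-only nonclustered columnstore index
AVAILABLE IN: SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: AlwaysOn readable secondary supports updateable columnstore indexes
AVAILABLE IN: SQL Server 2016

COLUMNSTORE INDEX FEATURE: Read-only nonclustered columnstore index on heap or btree
AVAILABLE IN: SQL Server 2012, SQL Server 2014, SQL Server 2016*, SQL Database Premium Edition*, SQL Data Warehouse*

COLUMNSTORE INDEX FEATURE: Updateable nonclustered columnstore index on heap or btree
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Additional btree indexes allowed on a heap or btree that has a nonclustered columnstore
index
AVAILABLE IN: SQL Server 2012, SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Updateable clustered columnstore index
AVAILABLE IN: SQL Server 2014, SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Btree index on a clustered columnstore index
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Columnstore index on a memory-optimized table
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Nonclustered columnstore index definition supports using a filtered condition
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse

COLUMNSTORE INDEX FEATURE: Compression delay option for columnstore indexes in CREATE TABLE and ALTER TABLE
AVAILABLE IN: SQL Server 2016, SQL Database Premium Edition, SQL Data Warehouse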

*To create a read-only nonclustered columnstore index, store the index on a read-only filegroup.

SQL Server 2016


SQL Server 2016 adds key enhancements to improve the performance and flexibility of columnstore indexes. This
enhances data warehousing scenarios and enables real-time operational analytics.
Functional
A rowstore table can have one updateable nonclustered columnstore index. Previously, the nonclustered
columnstore index was read-only.
The nonclustered columnstore index definition supports using a filtered condition. Use this feature to create
a nonclustered columnstore index on only the cold data of an operational workload. By doing this, the
performance impact of having a columnstore index on an OLTP table will be minimal.
An in-memory table can have one columnstore index. You can create it when the table is created or add it
later with ALTER TABLE (Transact-SQL). Previously, only a disk-based table could have a columnstore index.
A clustered columnstore index can have one or more nonclustered rowstore indexes. Previously, the
columnstore index did not support nonclustered indexes. SQL Server automatically maintains the
nonclustered indexes for DML operations.
Support for primary keys and foreign keys by using a btree index to enforce these constraints on a clustered
columnstore index.
Columnstore indexes have a compression delay option that minimizes the impact the transactional
workload can have on real-time operational analytics. This option allows for frequently changing rows to
stabilize before compressing them into the columnstore. For details, see CREATE COLUMNSTORE INDEX
(Transact-SQL) and Get started with Columnstore for real time operational analytics.
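
Two of the features above, the filtered condition and the compression delay option, can be combined in one index definition. The following is a sketch under stated assumptions: dbo.Orders, its columns, and the status code 5 for cold (for example, shipped) orders are all hypothetical.

```sql
-- Hypothetical OLTP table dbo.Orders; OrderStatus = 5 is an assumed code
-- for orders that no longer change (cold data).
CREATE NONCLUSTERED COLUMNSTORE INDEX ncci_Orders_Cold
    ON dbo.Orders (OrderId, CustomerId, OrderDate, Amount, OrderStatus)
    WHERE OrderStatus = 5
    WITH (COMPRESSION_DELAY = 10);  -- minutes a closed delta rowgroup waits before compression
```

The delay of 10 minutes is only an illustrative value; a suitable setting depends on how long rows in the workload typically keep changing.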
Performance for database compatibility level 120 or 130
Columnstore indexes support read committed snapshot isolation level (RCSI) and snapshot isolation (SI).
This enables transactionally consistent analytics queries with no locks.
Columnstore supports index defragmentation by removing deleted rows without the need to explicitly
rebuild the index. The ALTER INDEX … REORGANIZE statement removes deleted rows from the columnstore, based on an
internally defined policy, as an online operation.
Columnstore indexes can be accessed on an AlwaysOn readable secondary replica. You can improve
performance for operational analytics by offloading analytics queries to an AlwaysOn secondary replica.
To improve performance, SQL Server computes the aggregate functions MIN, MAX, SUM, COUNT, AVG
during table scans when the data type uses no more than eight bytes, and is not of a string type. Aggregate
pushdown is supported with or without Group By clause for both clustered columnstore indexes and
nonclustered columnstore indexes.
Predicate pushdown speeds up queries that compare strings of type [v]char or n[v]char. This applies to the
common comparison operators and includes operators such as LIKE that use bitmap filters. This works with
all collations that SQL Server supports.
Performance for database compatibility level 130
New batch mode execution support for queries using any of these operations:
SORT
Aggregates with multiple distinct functions. Some examples: COUNT/COUNT, AVG/SUM,
CHECKSUM_AGG, STDEV/STDEVP.
Window aggregate functions: COUNT, COUNT_BIG, SUM, AVG, MIN, MAX, and CLR.
Window user-defined aggregates: CHECKSUM_AGG, STDEV, STDEVP, VAR, VARP, and GROUPING.
Window aggregate analytic functions: LAG, LEAD, FIRST_VALUE, LAST_VALUE, PERCENTILE_CONT,
PERCENTILE_DISC, CUME_DIST, and PERCENT_RANK.
Single-threaded queries running under MAXDOP 1 or with a serial query plan execute in batch mode.
Previously, only multi-threaded queries ran with batch execution.
Memory-optimized table queries can have parallel plans in SQL InterOp mode, both when accessing data in
rowstore and in a columnstore index.
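
To try the compatibility level 130 behavior described above, a query along these lines could be used. It is a sketch that assumes a hypothetical table dbo.FactSales with a columnstore index; whether the optimizer actually chooses batch mode for a given query can be checked in the actual execution plan.

```sql
-- The batch mode improvements in this list require compatibility level 130.
ALTER DATABASE CURRENT SET COMPATIBILITY_LEVEL = 130;

-- Window aggregate over a hypothetical columnstore table,
-- restricted to a serial plan with MAXDOP 1.
SELECT ProductId,
       SaleDate,
       SUM(Amount) OVER (PARTITION BY ProductId ORDER BY SaleDate) AS RunningAmount
FROM dbo.FactSales
OPTION (MAXDOP 1);
```

Changing the compatibility level affects the whole database, so it is safer to experiment in a test environment first.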
Supportability
These system views are new for columnstore:
sys.column_store_row_groups (Transact-SQL)
sys.dm_column_store_object_pool (Transact-SQL)
sys.dm_db_column_store_row_group_operational_stats (Transact-SQL)
sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)
sys.dm_db_index_operational_stats (Transact-SQL)
sys.dm_db_index_physical_stats (Transact-SQL)
sys.internal_partitions (Transact-SQL)
These in-memory OLTP-based DMVs contain updates for columnstore:
sys.dm_db_xtp_hash_index_stats (Transact-SQL)
sys.dm_db_xtp_index_stats (Transact-SQL)
sys.dm_db_xtp_memory_consumers (Transact-SQL)
sys.dm_db_xtp_nonclustered_index_stats (Transact-SQL)
sys.dm_db_xtp_object_stats (Transact-SQL)
sys.dm_db_xtp_table_memory_stats (Transact-SQL)
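
As an example of using one of the new views, the query below lists the internal rowsets, such as delta stores and the delete bitmap, that back a columnstore index. The table name dbo.FactSales is hypothetical.

```sql
-- List internal rowsets behind the columnstore index
-- on a hypothetical table dbo.FactSales.
SELECT partition_number,
       internal_object_type_desc,
       rows,
       data_compression_desc
FROM sys.internal_partitions
WHERE object_id = OBJECT_ID('dbo.FactSales')
ORDER BY partition_number, internal_object_type_desc;
```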
Limitations
MERGE is disabled when a btree index is defined on a clustered columnstore index.
For in-memory tables, a columnstore index must include all the columns; the columnstore index cannot
have a filtered condition.
For in-memory tables, queries on columnstore indexes run only in InterOP mode, and not in the in-memory
native mode. Parallel execution is supported.
SQL Server 2014
SQL Server 2014 introduced the clustered columnstore index as the primary storage format. This allowed regular
loads as well as update, delete, and insert operations.
The table can use a clustered columnstore index as the primary table storage. No other indexes are allowed
on the table, but the clustered columnstore index is updateable so you can perform regular loads and make
changes to individual rows.
The nonclustered columnstore index continues to have the same functionality as in SQL Server 2012 except
for additional operators that can now be executed in batch mode. It is still not updateable except by
rebuilding, and by using partition switching. The nonclustered columnstore index is supported on disk-based
tables only, and not on in-memory tables.
The clustered and nonclustered columnstore indexes have an archival compression option that further
compresses the data. The archival option is useful for reducing the data size both in memory and on disk,
but does slow query performance. It works well for data that is accessed infrequently.
The clustered columnstore index and the nonclustered columnstore index function in a very similar way;
they use the same columnar storage format, same query processing engine, and the same set of dynamic
management views. The difference is primary versus secondary index types, and the nonclustered
columnstore index is read-only.
These operators run in batch mode for multi-threaded queries: scan, filter, project, join, group by, and union
all.

SQL Server 2012


SQL Server 2012 introduced the nonclustered columnstore index as another index type on rowstore tables and
batch processing for queries on columnstore data.
A rowstore table can have one nonclustered columnstore index.
The columnstore index is read-only. After you create the columnstore index, you cannot update the table by
insert, delete, and update operations; to perform these operations you must drop the index, update the table,
and rebuild the columnstore index. You can load additional data into the table by using partition switching.
The advantage of partition switching is that you can load data without dropping and rebuilding the columnstore
index.
The columnstore index always requires extra storage, typically an additional 10% over rowstore, because it
stores a copy of the data.
Batch processing provides 2x or better query performance, but it is only available for parallel query
execution.
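The partition switching approach mentioned above could look roughly like the sketch below. All object names and the partition number are hypothetical. The sketch assumes the staging table already matches the target table (same columns, same indexes including a matching columnstore index, same filegroup as the target partition) and carries a CHECK constraint that limits its rows to the target partition's range.

```sql
-- Hypothetical objects: dbo.FactSales is a partitioned table with a nonclustered
-- columnstore index; dbo.FactSales_Staging holds the newly loaded rows and is
-- assumed to meet the structural requirements described in the text above.

-- Move the staged rows into the empty partition 5 of the fact table.
-- This is a metadata operation, so the columnstore index on dbo.FactSales
-- does not have to be dropped and rebuilt.
ALTER TABLE dbo.FactSales_Staging
SWITCH TO dbo.FactSales PARTITION 5;
GO
```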

See Also
Columnstore Indexes Guide
Columnstore Indexes Data Loading
Columnstore Indexes Query Performance
Get started with Columnstore for real time operational analytics
Columnstore Indexes for Data Warehousing
Columnstore Indexes Defragmentation
Columnstore indexes - query performance
3/24/2017 • 13 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2012) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
This topic gives recommendations for achieving the very fast query performance that columnstore indexes are
designed to provide.
Columnstore indexes can achieve up to 100x better performance on analytics and data warehousing workloads
and up to 10x better data compression than traditional rowstore indexes. Explanations of how columnstore
indexes achieve this performance are at the end.

Recommendations for improving query performance


Here are some recommendations for achieving the high performance columnstore indexes are designed to
provide.
1. Organize data to eliminate more rowgroups from a full table scan
Leverage insert order. In a traditional data warehouse, data is commonly inserted in time
order and analyzed along the time dimension, for example, analyzing sales by quarter. For this kind of
workload, rowgroup elimination happens automatically. In SQL Server 2016, you can find out the number of
rowgroups skipped as part of query processing.
Leverage the rowstore clustered index. If the common query predicate is on a column (for example, C1) that is
unrelated to the insert order of the rows, you can create a rowstore clustered index on column C1 and then
create a clustered columnstore index that drops and replaces the rowstore clustered index. If you create the
clustered columnstore index explicitly using DOP (degree of parallelism) = 1, the resultant clustered columnstore
index will be perfectly ordered on column C1. If you specify DOP = 8, then you will see overlap of values
across 8 rowgroups. This strategy is commonly used when you initially create a columnstore index on a large
set of data. Note that for a nonclustered columnstore index (NCCI), if the base rowstore table has a clustered
index, the rows are already ordered. In this case, the resultant nonclustered columnstore index will
automatically be ordered. One important point to note is that a columnstore index does not inherently
maintain the order of rows. As new rows are inserted or older rows are updated, you may need to repeat
the process because analytics query performance may deteriorate.
Leverage table partitioning. You can partition the columnstore index and then use partition elimination
to reduce the number of rowgroups to scan. For example, if a fact table stores purchases made by customers and
a common query pattern is to find quarterly purchases made by a specific customer, you can combine the
insert order with partitioning on the customer column. Each partition will contain rows in time order for a
specific customer.
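The rowstore clustered index technique described above can be sketched as follows. The table name dbo.FactSales and the index name cci_FactSales are hypothetical; C1 is the predicate column from the text. This is one possible way to express the two steps, not the only one.

```sql
-- Hypothetical table dbo.FactSales with a frequently filtered column C1.

-- Step 1: order the rows on C1 by building a rowstore clustered index.
CREATE CLUSTERED INDEX cci_FactSales ON dbo.FactSales (C1);
GO

-- Step 2: convert it to a clustered columnstore index in place.
-- DROP_EXISTING = ON replaces the rowstore index that has the same name.
-- MAXDOP = 1 builds with a single thread so that value ranges on C1
-- do not overlap across rowgroups.
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales
WITH (DROP_EXISTING = ON, MAXDOP = 1);
GO
```

The single-threaded build trades build time for ordering. Because the columnstore index does not maintain this order as rows change, the two steps may need to be repeated periodically.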
2. Plan for enough memory to create columnstore indexes in parallel
Creating a columnstore index is by default a parallel operation unless memory is constrained. Creating the index in
parallel requires more memory than creating the index serially. When there is ample memory, creating a
columnstore index takes on the order of 1.5 times as long as building a B-tree on the same columns.
The memory required for creating a columnstore index depends on the number of columns, the number of string
columns, the degree of parallelism (DOP), and the characteristics of the data. For example, if your table has fewer
than one million rows, SQL Server will use only one thread to create the columnstore index.
If your table has more than one million rows, but SQL Server cannot get a large enough memory grant to create
the index using MAXDOP, SQL Server will automatically decrease MAXDOP as needed to fit into the available
memory grant. In some cases, DOP must be decreased to one in order to build the index under constrained
memory.
Beginning with SQL Server 2016, the query will always operate in batch mode. In previous releases, batch
execution was used only when DOP was greater than one.
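As a rough sketch of working within a constrained memory grant, the statement below caps the degree of parallelism for the index build, and the follow-up query checks how the resulting rowgroups turned out. dbo.FactSales and cci_FactSales are hypothetical names, and the MAXDOP value of 2 is only an example.

```sql
-- Hypothetical table and index names; MAXDOP = 2 is an arbitrary example value.
CREATE CLUSTERED COLUMNSTORE INDEX cci_FactSales ON dbo.FactSales
WITH (MAXDOP = 2);
GO

-- After the build, look for rowgroups that closed with fewer rows than expected.
-- In SQL Server 2016, trim_reason_desc reports why a rowgroup was trimmed
-- (for example, MEMORY_LIMITATION).
SELECT row_group_id, state_desc, total_rows, trim_reason_desc
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales')
ORDER BY row_group_id;
GO
```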

Columnstore Performance Explained


Columnstore indexes achieve high query performance by combining high-speed in-memory batch mode
processing with techniques that greatly reduce IO requirements. Since analytics queries scan large numbers of
rows, they are typically IO-bound, and therefore reducing IO during query execution is critical to the design of
columnstore indexes. Once data has been read into memory, it is critical to reduce the number of in-memory
operations.
Columnstore indexes reduce IO and optimize in-memory operations through high data compression, column
elimination, rowgroup elimination, and batch processing.
Data compression
Columnstore indexes achieve up to 10x greater data compression than rowstore indexes. This greatly reduces the
IO required to execute analytics queries and therefore improves query performance.
Columnstore indexes read compressed data from disk, which means fewer bytes of data need to be read
into memory.
Columnstore indexes store data in compressed form in memory which reduces IO by reducing the number
of times the same data is read into memory. For example, with 10x compression, columnstore indexes can
keep 10x more data in memory compared to storing the data in uncompressed form. With more data in
memory, it is more likely that the columnstore index will find the data it needs in memory without incurring
additional reads from disk.
Columnstore indexes compress data by columns instead of by rows which achieves high compression rates
and reduces the size of the data stored on disk. Each column is compressed and stored independently. Data
within a column always has the same data type and tends to have similar values. Data compression
techniques are very good at achieving higher compression rates when values are similar.
For example, if a fact table stores customer addresses and has a column for country, the total number of
possible values is fewer than 200. Some of those values will be repeated many times. If the fact table has
100 million rows, the country column will compress easily and require very little storage. Row-by-row
compression is not able to capitalize on the similarity of column values in this way and will use more bytes
to compress the values in the country column.
Column elimination
Columnstore indexes skip reading in columns that are not required for the query result. This ability, called column
elimination, further reduces IO for query execution and therefore improves query performance.
Column elimination is possible because the data is organized and compressed column by column. In
contrast, when data is stored row-by-row, the column values in each row are physically stored together and
cannot be easily separated. The query processor needs to read in an entire row to retrieve specific column
values, which increases IO because extra data is unnecessarily read into memory.
For example, if a table has 50 columns and the query only uses 5 of those columns, the columnstore index
only fetches the 5 columns from disk. It skips reading in the other 45 columns. This reduces IO by another
90% assuming all columns are of similar size. If the same data are stored in a rowstore, the query processor
needs to read the additional 45 columns.
Rowgroup elimination
For full table scans, a large percentage of the data usually does not match the query predicate criteria. By using
metadata, the columnstore index is able to skip reading in the rowgroups that do not contain data required for
the query result, all without actual IO. This ability, called rowgroup elimination, reduces IO for full table scans and
therefore improves query performance.
When does a columnstore index need to perform a full table scan?
Starting with SQL Server 2016, you can create one or more regular nonclustered btree indexes on a clustered
columnstore index just like you can on a rowstore heap. The nonclustered btree indexes can speed up a query that
has an equality predicate or a predicate with a small range of values. For more complicated predicates, the query
optimizer might choose a full table scan. Without the ability to skip rowgroups, a full table scan would be very
time-consuming, especially for large tables.
When does an analytics query benefit from rowgroup elimination for a full-table scan?
For example, a retail business has modeled its sales data using a fact table with a clustered columnstore index.
Each new sale stores various attributes of the transaction, including the date it was sold. Interestingly, even though
columnstore indexes do not guarantee a sorted order, the rows in this table will be loaded in a date-sorted order.
Over time this table will grow. Although the retail business might keep sales data for the last 10 years, an analytics
query might only need to compute an aggregate for the last quarter. Columnstore indexes can eliminate accessing the
data for the previous 39 quarters by just looking at the metadata for the date column. This is an additional 97%
reduction in the amount of data that is read into memory and processed.
Which rowgroups are skipped in a full table scan?
To determine which rowgroups to eliminate, the columnstore index uses metadata to store the minimum and
maximum values of each column segment for each rowgroup. When none of the column segment ranges meet
the query predicate criteria, the entire rowgroup is skipped without doing any actual IO. This works because the
data is usually loaded in a sorted order and, although rows are not guaranteed to be sorted, similar data values are
often located within the same rowgroup or a neighboring rowgroup.
For more details about rowgroups, see Columnstore Indexes Guide.
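To make the metadata described above more concrete, the following sketch inspects the per-segment minimum and maximum entries for one column and then runs a date-restricted query with IO statistics turned on. The table dbo.FactSales and its columns OrderDate and SalesAmount are hypothetical, and the table is assumed to have a clustered columnstore index. Note that min_data_id and max_data_id may hold encoded identifiers rather than raw column values, depending on the segment encoding.

```sql
-- Hypothetical table dbo.FactSales (clustered columnstore index) with
-- columns OrderDate and SalesAmount.

-- Inspect the min/max metadata that each OrderDate segment carries.
SELECT s.segment_id, s.row_count, s.min_data_id, s.max_data_id
FROM sys.column_store_segments AS s
JOIN sys.partitions AS p
    ON s.hobt_id = p.hobt_id
JOIN sys.columns AS c
    ON c.object_id = p.object_id
   AND c.column_id = s.column_id
WHERE p.object_id = OBJECT_ID('dbo.FactSales')
  AND c.name = 'OrderDate'
ORDER BY s.segment_id;
GO

-- Run a query restricted to one quarter and report IO statistics.
SET STATISTICS IO ON;
SELECT SUM(SalesAmount)
FROM dbo.FactSales
WHERE OrderDate >= '20160101' AND OrderDate < '20160401';
SET STATISTICS IO OFF;
GO
```

In SQL Server 2016, the STATISTICS IO messages for a columnstore scan are expected to include counts of segments read and segments skipped, which gives a quick indication of how much rowgroup elimination took place.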
Batch Mode Execution
Batch mode execution refers to processing a set of rows, typically up to 900 rows, together for execution efficiency.
For example, the query SELECT SUM(Sales) FROM SalesData aggregates the total sales from the table SalesData. In
batch mode execution, the query execution engine computes the aggregate in groups of 900 values. This spreads
the metadata access costs and other types of overhead over all the rows in a batch, rather than paying the cost for
each row, thereby significantly reducing the code path. Batch mode processing operates on compressed data when
possible and eliminates some of the exchange operators used by row mode processing. This speeds up execution
of analytics queries by orders of magnitude.
Not all query execution operators can be executed in batch mode. For example, DML operations such as Insert,
Delete, or Update are executed one row at a time. Batch mode targets the operators that matter most for query
performance, such as Scan, Join, Aggregate, Sort, and so on. Since the columnstore index was introduced in SQL
Server 2012, there has been a sustained effort to expand the operators that can be executed in batch mode. The table
below shows the operators that run in batch mode according to the product version.

| BATCH MODE OPERATORS | WHEN IS THIS USED? | SQL SERVER 2012 | SQL SERVER 2014 | SQL SERVER 2016 AND SQL DATABASE¹ | COMMENTS |
| DML operations (insert, delete, update, merge) | | no | no | no | DML is not a batch mode operation because it is not parallel. Even when we enable serial mode batch processing, we don't see significant gains by allowing DML to be processed in batch mode. |
| columnstore index scan | SCAN | NA | yes | yes | For columnstore indexes, we can push the predicate to the SCAN node. |
| columnstore Index Scan (nonclustered) | SCAN | yes | yes | yes | yes |
| index seek | | NA | NA | no | We perform a seek operation through a nonclustered btree index in rowmode. |
| compute scalar | Expression that evaluates to a scalar value. | yes | yes | yes | There are some restrictions on data type. This is true for all batch mode operators. |
| concatenation | UNION and UNION ALL | no | yes | yes | |
| filter | Applying predicates | yes | yes | yes | |
| hash match | Hash-based aggregate functions, outer hash join, right hash join, left hash join, right inner join, left inner join | yes | yes | yes | Restrictions for aggregation: no min/max for strings. Aggregation functions available are sum/count/avg/min/max. Restrictions for join: no mismatched type joins on non-integer types. |
| merge join | | no | no | no | |
| multi-threaded queries | | yes | yes | yes | |
| nested loops | | no | no | no | |
| single-threaded queries running under MAXDOP 1 | | no | no | yes | |
| single-threaded queries with a serial query plan | | no | no | yes | |
| sort | Order by clause on SCAN with columnstore index. | no | no | yes | |
| top sort | | no | no | yes | |
| window aggregates | | NA | NA | yes | New operator in SQL Server 2016. |

¹Applies to SQL Server 2016, SQL Database V12 Premium Edition, and SQL Data Warehouse
Aggregate Pushdown
The normal execution path for aggregate computation is to fetch the qualifying rows from the SCAN node and
aggregate the values in batch mode. While this delivers good performance, with SQL Server 2016 the
aggregate operation can be pushed to the SCAN node to improve the performance of aggregate computation by
orders of magnitude on top of batch mode execution, provided the following conditions are met:
Supported aggregate operators are MIN, MAX, SUM, COUNT, and AVG.
Any data type of 64 bits or less is supported. For example, bigint is supported because its size is 8 bytes, but
decimal (38,6) is not because its size is 17 bytes. Also, no string types are supported.
The aggregate operator must be on top of a SCAN node or a SCAN node with GROUP BY.
Aggregate pushdown is further accelerated by efficient aggregation on compressed/encoded data in
cache-friendly execution and by leveraging SIMD.

For example, aggregate pushdown is done in both of the queries below:

SELECT productkey, SUM(TotalProductCost)
FROM FactResellerSalesXL_CCI
GROUP BY productkey;

SELECT SUM(TotalProductCost)
FROM FactResellerSalesXL_CCI;

String predicate pushdown


Motivation: When designing a data warehouse schema, the recommended schema modeling is to use a star schema
or snowflake schema consisting of one or more fact tables and many dimension tables. The fact table stores the
business measurements or transactions, and the dimension tables store the dimensions across which facts need to be
analyzed.
For example, a fact can be a record representing a sale of a particular product in a specific region, while the
dimensions represent a set of regions, products, and so on. The fact and dimension tables are connected through
a primary/foreign key relationship. The most commonly used analytics queries join one or more dimension tables
with the fact table.
Consider a dimension table products. A typical primary key will be productcode, which is commonly
represented as a string data type. For query performance, it is a best practice to create a surrogate key, typically
an integer column, to refer to the row in the dimension table from the fact table.
The columnstore index runs analytics queries with joins/predicates involving numeric or integer-based keys very
efficiently. However, in many customer workloads, we find string-based columns used to link fact/dimension
tables, with the result that query performance with the columnstore index was not as good. SQL Server 2016
improves the performance of analytics queries with string-based columns significantly by pushing down the
predicates with string columns to the SCAN node.
String predicate pushdown leverages the primary/secondary dictionary created for the column(s) to improve
query performance. For example, consider a string column segment within a rowgroup consisting of 100
distinct string values. This means each distinct string value is referenced 10,000 times on average, assuming 1
million rows.
With string predicate pushdown, query execution computes the predicate against the values in the dictionary,
and if a dictionary value qualifies, all rows referring to that value are automatically qualified. This improves
performance in two ways. First, only the qualified rows are returned, reducing the number of rows that need to flow
out of the SCAN node. Second, the number of string comparisons is significantly reduced. In this example, only 100
string comparisons are required as against 1 million comparisons. There are some limitations, as described below:
No string predicate pushdown for delta rowgroups. There is no dictionary for columns in delta rowgroups.
No string predicate pushdown if the dictionary exceeds 64k entries.
Expressions evaluating NULLs are not supported.
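As a hedged example of the kind of query that string predicate pushdown targets, the sketch below filters on a string column and then looks up the dictionary sizes for the table, which relates to the 64k-entry limitation listed above. The table dbo.FactSales, the column ProductCode, and the literal value are all hypothetical, and the table is assumed to have a clustered columnstore index.

```sql
-- Hypothetical table dbo.FactSales (clustered columnstore index) with a
-- string column ProductCode. The literal below is an arbitrary example value.

-- A string predicate that can be evaluated against dictionary entries
-- at the SCAN node rather than once per row.
SELECT COUNT(*)
FROM dbo.FactSales
WHERE ProductCode = N'BK-M68B-42';
GO

-- List the dictionaries for the table and their entry counts.
SELECT d.column_id, d.dictionary_id, d.type, d.entry_count
FROM sys.column_store_dictionaries AS d
JOIN sys.partitions AS p
    ON d.hobt_id = p.hobt_id
WHERE p.object_id = OBJECT_ID('dbo.FactSales')
ORDER BY d.column_id, d.dictionary_id;
GO
```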

See Also
Columnstore Indexes Guide
Columnstore Indexes Data Loading
Columnstore Indexes Versioned Feature Summary
Columnstore Indexes Query Performance
Get started with Columnstore for real time operational analytics
Columnstore Indexes for Data Warehousing
Columnstore Indexes Defragmentation
Get started with Columnstore for real time operational analytics
3/24/2017 • 10 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
SQL Server 2016 introduces real-time operational analytics, the ability to run both analytics and OLTP workloads
on the same database tables at the same time. Besides running analytics in real-time, you can also eliminate the
need for ETL and a data warehouse.

Real-Time Operational Analytics Explained


Traditionally, businesses have had separate systems for operational (i.e. OLTP) and analytics workloads. For such
systems, Extract, Transform, and Load (ETL) jobs regularly move the data from the operational store to an
analytics store. The analytics data is usually stored in a data warehouse or data mart dedicated to running
analytics queries. While this solution has been the standard, it has these three key challenges:
Complexity. Implementing ETL can require considerable coding especially to load only the modified
rows. It can be complex to identify which rows have been modified.
Cost. Implementing ETL requires the cost of purchasing additional hardware and software licenses.
Data Latency. Implementing ETL adds a time delay for running the analytics. For example, if the ETL job
runs at the end of each business day, the analytics queries will run on data that is at least a day old. For
many businesses this delay is unacceptable because the business depends on analyzing data in real-time.
For example, fraud-detection requires real-time analytics on operational data.

Real-time operational analytics offers a solution to these challenges.


There is no time delay when analytics and OLTP workloads run on the same underlying table. For
scenarios that can use real-time analytics, the costs and complexity are greatly reduced by eliminating the
need for ETL and the need to purchase and maintain a separate data warehouse.
NOTE
Real-time operational analytics targets the scenario of a single data source such as an enterprise resource planning (ERP)
application on which you can run both the operational and the analytics workload. This does not replace the need for a
separate data warehouse when you need to integrate data from multiple sources before running the analytics workload or
when you require extreme analytics performance using pre-aggregated data such as cubes.

Real-time analytics uses an updateable columnstore index on a rowstore table. The columnstore index maintains
a copy of the data, so the OLTP and analytics workloads run against separate copies of the data. This minimizes
the performance impact of both workloads running at the same time. SQL Server automatically maintains index
changes so OLTP changes are always up-to-date for analytics. With this design, it is possible and practical to run
analytics in real-time on up-to-date data. This works for both disk-based and memory-optimized tables.

Get Started Example


To get started with real-time analytics:
1. Identify the tables in your operational schema that contain data required for analytics.
2. For each table, drop all btree indexes that are primarily designed to speed up existing analytics on your
OLTP workload. Replace them with a single columnstore index. This can improve the overall performance
of your OLTP workload since there will be fewer indexes to maintain.

--This example creates a nonclustered columnstore index on an existing OLTP table.

--Create the table
CREATE TABLE t_account (
    accountkey int PRIMARY KEY,
    accountdescription nvarchar (50),
    accounttype nvarchar(50),
    unitsold int
);

--Create the nonclustered columnstore index
CREATE NONCLUSTERED COLUMNSTORE INDEX account_NCCI
ON t_account (accountkey, accountdescription, unitsold);

The columnstore index on an in-memory table allows operational analytics by integrating in-memory
OLTP and in-memory columnstore technologies to deliver high performance for both OLTP and analytics
workloads. The columnstore index on an in-memory table must include all the columns.

-- This example creates a memory-optimized table with a columnstore index.

CREATE TABLE t_account (
    accountkey int NOT NULL PRIMARY KEY NONCLUSTERED,
    accountdescription nvarchar (50),
    accounttype nvarchar(50),
    unitsold int,
    INDEX t_account_cci CLUSTERED COLUMNSTORE
)
WITH (MEMORY_OPTIMIZED = ON );
GO

3. This is all you need to do!


You are now ready to run real-time operational analytics without making any changes to your application.
Analytics queries will run against the columnstore index and OLTP operations will keep running against
your OLTP btree indexes. The OLTP workloads will continue to perform, but will incur some additional
overhead to maintain the columnstore index. See the performance optimizations in the next section.
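For instance, an analytics query of the following shape could run against the disk-based t_account table from step 2 while OLTP statements continue to use the primary key. This is only an illustrative query over columns that the account_NCCI index covers; whether the optimizer actually chooses the columnstore index depends on the data and the plan.

```sql
-- Illustrative analytics query over columns included in account_NCCI
-- (accountdescription and unitsold) on the t_account table created above.
SELECT accountdescription, SUM(unitsold) AS total_units_sold
FROM t_account
GROUP BY accountdescription;
```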

Blog Posts
Read Sunil Agarwal's blog posts to learn more about real-time operational analytics. It might be easier to
understand the performance tips sections if you look at the blog posts first.
Business case for real-time operational analytics
Using a nonclustered columnstore index for real-time operational analytics
A simple example using a nonclustered columnstore index
How SQL Server maintains a nonclustered columnstore index on a transactional workload
Minimizing the impact of nonclustered columnstore index maintenance by using a filtered index
Minimizing the impact of nonclustered columnstore index maintenance by using compression delay
Minimizing impact of a nonclustered columnstore index maintenance by using compression delay - performance numbers
Real time operational analytics with memory-optimized tables
Minimize index fragmentation in a columnstore index
Columnstore index and the merge policy for rowgroups

Performance tip #1: Use filtered indexes to improve query performance

Running real-time operational analytics can impact the performance of the OLTP workload. This impact should
be minimal. The example below shows how to use filtered indexes to minimize the impact of a nonclustered
columnstore index on the transactional workload while still delivering analytics in real-time.
To minimize the overhead of maintaining a nonclustered columnstore index on an operational workload, you
can use a filtered condition to create a nonclustered columnstore index only on the warm or slowly changing
data. For example, in an order management application, you can create a nonclustered columnstore index on the
orders that have already been shipped. Once an order has shipped, it rarely changes and therefore can be
considered warm data. With a filtered index, the data in the nonclustered columnstore index requires fewer updates,
thereby lowering the impact on the transactional workload.
Analytics queries transparently access both warm and hot data as needed to provide real-time analytics. If a
significant part of the operational workload is touching the 'hot' data, those operations will not require
additional maintenance of the columnstore index. A best practice is to have a rowstore clustered index on the
column(s) used in the filtered index definition. SQL Server uses the clustered index to quickly scan the rows that
did not meet the filtered condition. Without this clustered index, a full table scan of the rowstore table will be
required to find these rows, which can significantly degrade the performance of analytics queries. In the
absence of a clustered index, you could create a complementary filtered nonclustered btree index to identify such
rows, but this is not recommended because accessing a large range of rows through nonclustered btree indexes is
expensive.

NOTE
A filtered nonclustered columnstore index is only supported on disk-based tables. It is not supported on
memory-optimized tables.
Example A: Access hot data from btree index, warm data from columnstore index
This example uses a filtered condition (orderstatus = 5) to establish which rows will be in the columnstore index.
The goal is to design the filtered condition and subsequent queries to access frequently changing “hot” data
from the btree index, and to access the more stable “warm” data from the columnstore index.

NOTE
The query optimizer will consider, but not always choose, the columnstore index for the query plan. When the query
optimizer chooses the filtered columnstore index, it transparently combines the rows both from columnstore index as well
as the rows that do not meet the filtered condition to allow real-time analytics. This is different from a regular
nonclustered filtered index which can be used only in queries that restrict themselves to the rows present in the index.

--Use a filtered condition to separate hot data in a rowstore table
-- from “warm” data in a columnstore index.

-- create the table
CREATE TABLE orders (
    AccountKey int not null,
    Customername nvarchar (50),
    OrderNumber bigint,
    PurchasePrice decimal (9,2),
    OrderStatus smallint not null,
    OrderStatusDesc nvarchar (50));

-- OrderStatusDesc
-- 0 => 'Order Started'
-- 1 => 'Order Closed'
-- 2 => 'Order Paid'
-- 3 => 'Order Fulfillment Wait'
-- 4 => 'Order Shipped'
-- 5 => 'Order Received'

CREATE CLUSTERED INDEX orders_ci ON orders(OrderStatus);

--Create the columnstore index with a filtered condition
CREATE NONCLUSTERED COLUMNSTORE INDEX orders_ncci ON orders (accountkey, customername, purchaseprice, orderstatus)
WHERE orderstatus = 5;

-- The following query returns the total purchase done by customers for items > $100.00
-- This query will pick rows both from NCCI and from 'hot' rows that are not part of NCCI
SELECT top 5 customername, sum (PurchasePrice)
FROM orders
WHERE purchaseprice > 100.0
Group By customername;

The analytics query will execute with the following query plan. You can see that the rows not meeting the filtered
condition are accessed through clustered btree index.
Please refer to the blog for details on filtered nonclustered columnstore index.

Performance tip #2: Offload analytics to Always On readable secondary

Even though you can minimize the columnstore index maintenance by using a filtered columnstore index, the
analytics queries can still require significant computing resources (CPU, IO, memory) which impact the
operational workload performance. For most mission critical workloads, our recommendation is to use the
Always On configuration. In this configuration, you can eliminate the impact of running analytics by offloading it
to a readable secondary.
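A minimal sketch of enabling read access on a secondary replica is shown below. The availability group name AG_Sales and the server instance name SQLNODE2 are hypothetical, and the sketch assumes the availability group already exists and the statement is run on the primary replica.

```sql
-- Hypothetical availability group AG_Sales with a secondary replica hosted
-- on server instance SQLNODE2. Run on the primary replica.

-- Allow read-intent connections on the secondary when it is in the secondary role.
ALTER AVAILABILITY GROUP AG_Sales
MODIFY REPLICA ON N'SQLNODE2'
WITH (SECONDARY_ROLE (ALLOW_CONNECTIONS = READ_ONLY));
GO
```

With ALLOW_CONNECTIONS = READ_ONLY, analytics clients need to specify ApplicationIntent=ReadOnly in their connection strings to connect to the secondary, which keeps the analytics workload off the primary.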

Performance Tip #3: Reducing Index fragmentation by keeping hot data in delta rowgroups

Tables with a columnstore index may get significantly fragmented (i.e. contain many deleted rows) if the workload
updates/deletes rows that have been compressed. A fragmented columnstore index leads to inefficient
utilization of memory/storage. Besides inefficient use of resources, it also negatively impacts analytics query
performance because of extra IO and the need to filter the deleted rows from the result set.
The deleted rows are not physically removed until you run index defragmentation with the REORGANIZE command
or rebuild the columnstore index on the entire table or the affected partition(s). Both REORGANIZE and index
REBUILD are expensive operations that take resources away from the workload.
Additionally, if rows are compressed too early, they may need to be re-compressed multiple times due to updates,
leading to wasted compression overhead.
You can minimize index fragmentation using the COMPRESSION_DELAY option.

-- Create a sample table
create table t_colstor (
    accountkey int not null,
    accountdescription nvarchar (50) not null,
    accounttype nvarchar(50),
    accountCodeAlternatekey int);

-- Create a nonclustered columnstore index with COMPRESSION_DELAY. The columnstore index will keep the rows
-- in a closed delta rowgroup for 100 minutes after the rowgroup has been marked closed.
CREATE NONCLUSTERED COLUMNSTORE index t_colstor_cci on t_colstor (accountkey, accountdescription, accounttype)
WITH (DATA_COMPRESSION= COLUMNSTORE, COMPRESSION_DELAY = 100);

Please refer to the blog for details on compression delay.


Here are the recommended best practices:
Insert/Query workload: If your workload is primarily inserting data and querying it, the default
COMPRESSION_DELAY of 0 is the recommended option. The newly inserted rows will get compressed
once 1 million rows have been inserted into a single delta rowgroup.
Some examples of such workloads are (a) a traditional DW workload and (b) click-stream analysis, when you need
to analyze the click pattern in a web application.
OLTP workload: If the workload is DML heavy (i.e. a heavy mix of Update, Delete, and Insert), you may see
columnstore index fragmentation by examining the DMV
sys.dm_db_column_store_row_group_physical_stats. If you see that more than 10% of rows are marked deleted in
recently compressed rowgroups, you can use the COMPRESSION_DELAY option to add a time delay before
rows become eligible for compression. For example, if for your workload newly inserted rows stay ‘hot’
(i.e. get updated multiple times) for, say, 60 minutes, you should set COMPRESSION_DELAY to 60.
We expect most customers do not need to do anything. The default value of the COMPRESSION_DELAY option
should work for them.
For advanced users, we recommend running the query below and collecting the percentage of deleted rows over
the last 7 days.

SELECT row_group_id, cast(deleted_rows as float)/cast(total_rows as float)*100 as [% fragmented], created_time
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = object_id('FactOnlineSales2')
AND state_desc='COMPRESSED'
AND deleted_rows>0
AND created_time > DATEADD(day, -7, GETDATE())
ORDER BY created_time DESC;

If the percentage of deleted rows in compressed rowgroups is greater than 20% and plateaus in older rowgroups
with less than 5% variation (referred to as cold rowgroups), set COMPRESSION_DELAY =
(youngest_rowgroup_created_time – current_time). Note that this approach works best with a stable and
relatively homogeneous workload.
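Once a suitable delay has been chosen, it can be applied to an existing index without recreating it. The sketch below reuses the t_colstor table and t_colstor_cci index from the sample earlier in this section. The 60-minute value is only the example figure used in the text, not a recommendation for any particular workload.

```sql
-- Uses the t_colstor table and t_colstor_cci index from the sample above.

-- Keep closed delta rowgroups uncompressed for 60 minutes.
ALTER INDEX t_colstor_cci ON t_colstor
SET (COMPRESSION_DELAY = 60 MINUTES);
GO

-- Defragment the columnstore index online. In SQL Server 2016, REORGANIZE is
-- the operation the text names for physically removing deleted rows.
ALTER INDEX t_colstor_cci ON t_colstor REORGANIZE;
GO
```

Changing COMPRESSION_DELAY affects only rowgroups compressed afterward, so rerunning the fragmentation query over the following days is a reasonable way to judge whether the new setting helps.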

See Also
Columnstore Indexes Guide
Columnstore Indexes Data Loading
Columnstore Indexes Query Performance
Columnstore Indexes for Data Warehousing
Columnstore Indexes Defragmentation
Columnstore indexes - data warehouse
3/24/2017 • 4 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2016) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
Columnstore indexes, in conjunction with partitioning, are essential for building a SQL Server data warehouse.

What’s new
SQL Server 2016 introduces these features for columnstore performance enhancements:
Always On supports querying a columnstore index on a readable secondary replica.
Multiple Active Result Sets (MARS) supports columnstore indexes.
A new dynamic management view sys.dm_db_column_store_row_group_physical_stats (Transact-SQL)
provides performance troubleshooting information at the row group level.
Single-threaded queries on columnstore indexes can run in batch mode. Previously, only multi-threaded
queries could run in batch mode.
The SORT operator runs in batch mode.
Multiple DISTINCT operations run in batch mode.
Window aggregates now run in batch mode for database compatibility level 130.
Aggregate Pushdown for efficient processing of aggregates. This is supported on all database
compatibility levels.
String predicate pushdown for efficient processing of string predicates. This is supported on all database
compatibility levels.
Snapshot isolation for database compatibility level 130

Improve performance by combining nonclustered and columnstore indexes
Beginning with SQL Server 2016, you can define nonclustered indexes on a clustered columnstore index.
Example: Improve efficiency of table seeks with a nonclustered index
To improve efficiency of table seeks in a data warehouse, you can create a nonclustered index designed for
queries that perform best with table seeks. For example, queries that look for matching values or return a small
range of values perform better against a B-tree index than against a columnstore index. They don't require a
full table scan through the columnstore index and return the correct result faster by doing a binary search
through a B-tree index.
--BASIC EXAMPLE: Create a nonclustered index on a columnstore table.

--Create the table
CREATE TABLE t_account (
    AccountKey int NOT NULL,
    AccountDescription nvarchar (50),
    AccountType nvarchar(50),
    UnitSold int
);
GO

--Store the table as a columnstore.
CREATE CLUSTERED COLUMNSTORE INDEX taccount_cci ON t_account;
GO

--Add a nonclustered index.
CREATE UNIQUE INDEX taccount_nc1 ON t_account (AccountKey);
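
To illustrate the kinds of queries this index is meant for, here is a sketch of three access patterns against t_account. The literal key values are made up, and which index the optimizer actually chooses depends on data volume and statistics, so treat the comments as expectations to verify in the execution plan rather than guarantees.

```sql
-- Point lookup on the key: a candidate for a seek on taccount_nc1.
SELECT AccountKey, AccountDescription, AccountType, UnitSold
FROM t_account
WHERE AccountKey = 100;

-- Small range of keys: also a candidate for the nonclustered index.
SELECT AccountKey, UnitSold
FROM t_account
WHERE AccountKey BETWEEN 100 AND 120;

-- Aggregate over the whole table: a candidate for a columnstore scan.
SELECT AccountType, SUM(UnitSold) AS TotalUnitsSold
FROM t_account
GROUP BY AccountType;
```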

Example: Use a nonclustered index to enforce a primary key constraint on a columnstore table
By design, a columnstore table does not allow a primary key constraint. Now you can use a nonclustered index
on a columnstore table to enforce a primary key constraint. A primary key is equivalent to a UNIQUE constraint
on a non-NULL column, and SQL Server implements a UNIQUE constraint as a nonclustered index. Combining
these facts, the following example defines a UNIQUE constraint on the non-NULL column AccountKey. The result
is a nonclustered index that enforces a primary key constraint as a UNIQUE constraint on a non-NULL column.
Next, the table is converted to a clustered columnstore index. During the conversion, the nonclustered index
persists. The result is a clustered columnstore index with a nonclustered index that enforces a primary key
constraint. Because any update or insert on the columnstore table also affects the nonclustered index, any
operation that violates the unique constraint or the NOT NULL requirement causes the entire operation to fail.

--EXAMPLE: Enforce a primary key constraint on a columnstore table.

--Create a rowstore table with a unique constraint.
--The unique constraint is implemented as a nonclustered index.
CREATE TABLE t_account (
    AccountKey int NOT NULL,
    AccountDescription nvarchar (50),
    AccountType nvarchar(50),
    UnitSold int,
    CONSTRAINT uniq_account UNIQUE (AccountKey)
);
GO

--Store the table as a columnstore.
--The unique constraint is preserved as a nonclustered index on the columnstore table.
CREATE CLUSTERED COLUMNSTORE INDEX t_account_cci ON t_account;
GO

--By using the previous two steps, every row in the table meets the UNIQUE constraint
--on a non-NULL column.
--This has the same end-result as having a primary key constraint
--All updates and inserts must meet the unique constraint on the nonclustered index or they will fail.

--If desired, add a foreign key constraint on AccountKey.
ALTER TABLE [dbo].[t_account]
WITH CHECK ADD FOREIGN KEY([AccountKey]) REFERENCES my_dimension(Accountkey);
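
The enforcement described above can be checked with a short sketch like the following. It assumes t_account was created as in the example and does not yet contain a row with AccountKey = 1. The sample values are invented for illustration.

```sql
-- First insert: expected to succeed.
INSERT INTO t_account (AccountKey, AccountDescription, AccountType, UnitSold)
VALUES (1, N'First account', N'Asset', 10);

BEGIN TRY
    -- Same AccountKey again: the unique constraint uniq_account should reject it.
    INSERT INTO t_account (AccountKey, AccountDescription, AccountType, UnitSold)
    VALUES (1, N'Duplicate account', N'Asset', 20);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;

BEGIN TRY
    -- NULL key: the NOT NULL column definition should reject it.
    INSERT INTO t_account (AccountKey, AccountDescription, AccountType, UnitSold)
    VALUES (NULL, N'Missing key', N'Asset', 30);
END TRY
BEGIN CATCH
    SELECT ERROR_NUMBER() AS ErrorNumber, ERROR_MESSAGE() AS ErrorMessage;
END CATCH;
```

Wrapping each statement in TRY...CATCH keeps the script running so both failure cases can be observed in one pass.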
Improve performance by enabling row-level and row-group-level locking
To complement the nonclustered index on a columnstore index feature, SQL Server 2016 offers granular locking
capability for select, update, and delete operations. Queries can run with row-level locking on index seeks against
a nonclustered index and rowgroup-level locking on full table scans against the columnstore index. Use this to
achieve higher read/write concurrency by using row-level and rowgroup-level locking appropriately.

--Granular locking example

--Store table t_account as a columnstore table.
CREATE CLUSTERED COLUMNSTORE INDEX taccount_cci ON t_account;
GO

--Add a nonclustered index for use with this example
CREATE UNIQUE INDEX taccount_nc1 ON t_account (AccountKey);
GO

--Look at locking with access through the nonclustered index
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
GO

BEGIN TRAN
    -- The query plan chooses a seek operation on the nonclustered index
    -- and takes the row lock
    SELECT * FROM t_account WHERE AccountKey = 100;
COMMIT TRAN
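
One way to observe which locks a query holds is to read sys.dm_tran_locks for the current session while the transaction is still open. The sketch below assumes the t_account table and indexes from this example and that a row with AccountKey = 100 exists. The lock types you actually see depend on the plan the optimizer picks.

```sql
SET TRANSACTION ISOLATION LEVEL REPEATABLE READ;
GO

BEGIN TRAN;
    SELECT * FROM t_account WHERE AccountKey = 100;

    -- Summarize the locks this session currently holds or is waiting for.
    SELECT resource_type, request_mode, request_status, COUNT(*) AS lock_count
    FROM sys.dm_tran_locks
    WHERE request_session_id = @@SPID
    GROUP BY resource_type, request_mode, request_status
    ORDER BY resource_type, request_mode;
COMMIT TRAN;
```

Repeating the same check with a query that scans the columnstore index (for example, an aggregate over the whole table) lets you compare the two locking granularities side by side.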

Snapshot isolation and read committed snapshot isolation
Use snapshot isolation (SI) to guarantee transactional consistency, and read committed snapshot isolation
(RCSI) to guarantee statement-level consistency for queries on columnstore indexes. This allows the queries to
run without blocking data writers. This non-blocking behavior also significantly reduces the likelihood of
deadlocks for complex transactions. For more information, see Snapshot Isolation in SQL Server on MSDN.
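
A minimal sketch of turning on both options and running an analytics query under SI follows. The database name MyDataWarehouse is hypothetical, and the query reuses the t_account table from the earlier examples on the assumption that it lives in that database.

```sql
-- Hypothetical database name: substitute your own.
ALTER DATABASE MyDataWarehouse SET ALLOW_SNAPSHOT_ISOLATION ON;

-- RCSI changes the behavior of the default READ COMMITTED level.
-- This statement can wait while other connections are active in the database.
ALTER DATABASE MyDataWarehouse SET READ_COMMITTED_SNAPSHOT ON;
GO

USE MyDataWarehouse;
GO

-- SI: every statement in the transaction sees the same consistent snapshot.
SET TRANSACTION ISOLATION LEVEL SNAPSHOT;
BEGIN TRAN;
    SELECT AccountType, SUM(UnitSold) AS TotalUnitsSold
    FROM t_account
    GROUP BY AccountType;

    SELECT COUNT(*) AS AccountCount
    FROM t_account;
COMMIT TRAN;
```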

See Also
Columnstore Indexes Guide
Columnstore Indexes Data Loading
Columnstore Indexes Versioned Feature Summary
Columnstore Indexes Query Performance
Get started with Columnstore for real time operational analytics
Columnstore Indexes Defragmentation
Columnstore indexes - defragmentation
3/24/2017 • 5 min to read

THIS TOPIC APPLIES TO: SQL Server (starting with 2012) Azure SQL Database Azure SQL Data
Warehouse Parallel Data Warehouse
Tasks for defragmenting columnstore indexes.

Use ALTER INDEX REORGANIZE to defragment a columnstore index online
APPLIES TO: SQL Server (starting with 2016), Azure SQL Database
After performing loads of any type, you can have multiple small rowgroups in the deltastore. You can use ALTER
INDEX REORGANIZE to force all of the rowgroups into the columnstore, and then to combine the rowgroups into
fewer rowgroups with more rows. The reorganize operation will also remove rows that have been deleted from
the columnstore.
To learn more see Sunil Agarwal's blog posts on the SQL Database Engine Team Blog.
Minimizing index fragmentation in columnstore indexes
Columnstore indexes and the merge policy for rowgroups
Recommendations for reorganizing
Reorganize a columnstore index after one or more data loads to achieve query performance benefits as quickly as
possible. Reorganizing will initially require additional CPU resources to compress the data, which could slow
overall system performance. However, as soon as the data is compressed, query performance can improve.
Use the example in sys.dm_db_column_store_row_group_physical_stats (Transact-SQL) to compute the
fragmentation. This helps you to determine whether it is worthwhile to perform a REORGANIZE operation.
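
The referenced example is not reproduced here. As a sketch in the same spirit, the query below joins sys.indexes to the DMV and reports the percentage of deleted rows per rowgroup. The column alias FragmentationPercent and the choice of columns are illustrative.

```sql
SELECT OBJECT_NAME(i.object_id) AS TableName,
       i.name AS IndexName,
       rg.partition_number,
       rg.row_group_id,
       rg.state_desc,
       rg.total_rows,
       rg.deleted_rows,
       -- NULLIF avoids a divide-by-zero error for empty rowgroups.
       100 * ISNULL(rg.deleted_rows, 0) / NULLIF(rg.total_rows, 0) AS FragmentationPercent
FROM sys.indexes AS i
JOIN sys.dm_db_column_store_row_group_physical_stats AS rg
    ON i.object_id = rg.object_id
   AND i.index_id = rg.index_id
ORDER BY TableName, IndexName, rg.row_group_id;
```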
Example: How reorganizing works
This example shows how ALTER INDEX REORGANIZE can force all deltastore rowgroups into the columnstore and
then combine the rowgroups.
1. Run this Transact-SQL to create a staging table that contains 300,000 rows. We will use this to bulk load
rows into a columnstore index.
USE master;
GO

-- Compare against the plain database name; a bracketed literal would never match.
IF EXISTS (SELECT name FROM sys.databases
    WHERE name = N'columnstore')
    DROP DATABASE [columnstore];
GO

CREATE DATABASE [columnstore];
GO

-- Switch to the new database so the example tables are not created in master.
USE [columnstore];
GO

IF EXISTS (SELECT name FROM sys.tables
    WHERE name = N'staging'
    AND object_id = OBJECT_ID (N'staging'))
    DROP TABLE dbo.staging;
GO

CREATE TABLE [staging] (
    AccountKey int NOT NULL,
    AccountDescription nvarchar (50),
    AccountType nvarchar(50),
    AccountCodeAlternateKey int
);
GO

-- Load data
DECLARE @loop int;
DECLARE @AccountDescription varchar(50);
DECLARE @AccountKey int;
DECLARE @AccountType varchar(50);
DECLARE @AccountCode int;

SELECT @loop = 0;
BEGIN TRAN
WHILE (@loop < 300000)
BEGIN
    SELECT @AccountKey = CAST (RAND()*10000000 AS int);
    SELECT @AccountDescription = 'accountdesc ' + CONVERT(varchar(20), @AccountKey);
    SELECT @AccountType = 'AccountType ' + CONVERT(varchar(20), @AccountKey);
    SELECT @AccountCode = CAST (RAND()*10000000 AS int);

    INSERT INTO staging VALUES (
        @AccountKey,
        @AccountDescription,
        @AccountType,
        @AccountCode
    );

    SELECT @loop = @loop + 1;
END
COMMIT

2. Create a table stored as a columnstore index.


IF EXISTS (SELECT name FROM sys.tables
WHERE name = N'cci_target'
AND object_id = OBJECT_ID (N'cci_target'))
DROP TABLE dbo.cci_target;
GO

-- Create a table with a clustered columnstore index
-- and the same columns as the rowstore staging table.
CREATE TABLE cci_target (
    AccountKey int NOT NULL,
    AccountDescription nvarchar (50),
    AccountType nvarchar(50),
    AccountCodeAlternateKey int,
    INDEX idx_cci_target CLUSTERED COLUMNSTORE
);
GO

3. Bulk insert the staging table rows into the columnstore table. INSERT INTO ... SELECT performs a bulk
insert. The TABLOCK hint allows the insert to run in parallel.

-- Insert rows in parallel
INSERT INTO cci_target WITH (TABLOCK)
SELECT TOP (300000) * FROM staging;
GO

4. View the rowgroups by using the sys.dm_db_column_store_row_group_physical_stats dynamic
management view (DMV).

-- Run this dynamic management view (DMV) to see the OPEN rowgroups.
-- You will see multiple OPEN rowgroups; how many depends on the degree of parallelism,
-- because the insert operation can run in parallel in SQL Server 2016.

SELECT *
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = object_id('cci_target')
ORDER BY row_group_id;

In this example, the results show 8 OPEN rowgroups that each have 37,500 rows. The number of OPEN
rowgroups depends on the max degree of parallelism (MAXDOP) setting.

5. Use ALTER INDEX REORGANIZE with the COMPRESS_ALL_ROW_GROUPS option to force all rowgroups to
be compressed into the columnstore.
-- This command will force all CLOSED and OPEN rowgroups into the columnstore.
ALTER INDEX idx_cci_target ON cci_target
REORGANIZE WITH (COMPRESS_ALL_ROW_GROUPS = ON);

SELECT *
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = object_id('cci_target')
ORDER BY row_group_id;

The results show 8 COMPRESSED rowgroups and 8 TOMBSTONE rowgroups. Each rowgroup got
compressed into the columnstore regardless of its size. The TOMBSTONE rowgroups will be removed by
the system.

6. For query performance, it is much better to combine small rowgroups together. ALTER INDEX REORGANIZE
will combine COMPRESSED rowgroups together. Now that the delta rowgroups are compressed into the
columnstore, run ALTER INDEX REORGANIZE again to combine the small COMPRESSED rowgroups. This
time you don't need the COMPRESS_ALL_ROW_GROUPS option.

-- Run this again and you will see that the smaller rowgroups
-- are combined into one compressed rowgroup with 300,000 rows
ALTER INDEX idx_cci_target ON cci_target REORGANIZE;

SELECT *
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = object_id('cci_target')
ORDER BY row_group_id;

The results show the 8 COMPRESSED rowgroups are now combined into one COMPRESSED rowgroup.
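
To follow the state changes across steps 4 through 6 without reading every row of the DMV output, a compact summary per rowgroup state can help. This is a sketch, assuming the cci_target table from the steps above. Run it after each step and compare the counts with the results described there.

```sql
-- One row per rowgroup state (for example OPEN, COMPRESSED, TOMBSTONE).
SELECT state_desc,
       COUNT(*) AS rowgroup_count,
       SUM(total_rows) AS total_rows,
       SUM(ISNULL(deleted_rows, 0)) AS deleted_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('cci_target')
GROUP BY state_desc
ORDER BY state_desc;
```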

Use ALTER INDEX REBUILD to defragment the columnstore index offline
For SQL Server 2016 and later, rebuilding the columnstore index is usually not needed since REORGANIZE
performs the essentials of a rebuild in the background as an online operation.
Rebuilding a columnstore index removes fragmentation, and moves all rows into the columnstore. Use CREATE
COLUMNSTORE INDEX (Transact-SQL) or ALTER INDEX (Transact-SQL) to perform a full rebuild of an existing
clustered columnstore index. Additionally, you can use ALTER INDEX … REBUILD to rebuild a specific partition.
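
The three rebuild forms mentioned above might look like the following sketch. The first two statements reuse idx_cci_target and cci_target from the reorganize example. The partitioned table dbo.FactSales, its index cci_FactSales, and partition number 3 are hypothetical names chosen for illustration.

```sql
-- Full rebuild with ALTER INDEX.
ALTER INDEX idx_cci_target ON cci_target REBUILD;
GO

-- Full rebuild with CREATE CLUSTERED COLUMNSTORE INDEX, replacing the existing index.
CREATE CLUSTERED COLUMNSTORE INDEX idx_cci_target ON cci_target
WITH (DROP_EXISTING = ON);
GO

-- Rebuild a single partition (hypothetical partitioned table and index).
ALTER INDEX cci_FactSales ON dbo.FactSales
REBUILD PARTITION = 3;
GO
```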
Rebuild Process
To rebuild a columnstore index, SQL Server:
1. Acquires an exclusive lock on the table or partition while the rebuild occurs. The data is “offline” and
unavailable during the rebuild, even when using NOLOCK, RCSI, or SI.
2. Re-compresses all data into the columnstore. Two copies of the columnstore index exist while the rebuild
is taking place. When the rebuild is finished, SQL Server deletes the original columnstore index.
Recommendations for Rebuilding a Columnstore Index
Rebuilding a columnstore index is useful for removing fragmentation, and for moving all rows into the
columnstore. We have the following recommendations:
1. Rebuild a partition instead of the entire table.
Rebuilding the entire table takes a long time if the index is large, and it requires enough disk space
to store an additional copy of the index during the rebuild. Usually it is only necessary to rebuild the
most recently used partition.
For partitioned tables, you do not need to rebuild the entire columnstore index because
fragmentation is likely to occur in only the partitions that have been modified recently. Fact tables
and large dimension tables are usually partitioned in order to perform backup and management
operations on chunks of the table.
2. Rebuild a partition after heavy DML operations.
Rebuilding a partition will defragment the partition and reduce disk storage. Rebuilding will delete all
rows from the columnstore that are marked for deletion, and it will move all rowgroups from the
deltastore into the columnstore. Note that there can be multiple rowgroups in the deltastore that each have
fewer than one million rows.
3. Rebuild a partition after loading data.
This ensures all data is stored in the columnstore. When concurrent processes each load fewer than 100K
rows into the same partition at the same time, the partition can end up with multiple deltastores.
Rebuilding will move all deltastore rows into the columnstore.
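
To decide which partition is worth rebuilding, the recommendations above can be turned into a per-partition summary of deleted rows and delta rowgroups. The sketch below assumes a hypothetical partitioned table named dbo.FactSales. What counts as "too many" deleted rows or delta rowgroups is left to your own judgment.

```sql
SELECT partition_number,
       -- Rows marked for deletion in compressed rowgroups.
       SUM(CASE WHEN state_desc = 'COMPRESSED'
                THEN ISNULL(deleted_rows, 0) ELSE 0 END) AS deleted_rows_in_columnstore,
       -- Rowgroups still in the deltastore.
       SUM(CASE WHEN state_desc IN ('OPEN', 'CLOSED')
                THEN 1 ELSE 0 END) AS delta_rowgroups,
       -- Rows still in the deltastore.
       SUM(CASE WHEN state_desc IN ('OPEN', 'CLOSED')
                THEN total_rows ELSE 0 END) AS delta_rows
FROM sys.dm_db_column_store_row_group_physical_stats
WHERE object_id = OBJECT_ID('dbo.FactSales')
GROUP BY partition_number
ORDER BY partition_number;
```

Partitions with large values in any of the three computed columns are the natural candidates for ALTER INDEX ... REBUILD PARTITION.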

See Also
Columnstore indexes - what's new
Columnstore indexes - query performance
Get started with Columnstore for real time operational analytics
Columnstore indexes - data warehouse