Index with Included columns

Included column is a new feature in SQL Server 2005. When you add a column as an Included column, it gets stored at the leaf level of the index and it is not part of the index key. This only works for non-clustered indexes.

The following are the limitation of key column in an index:

  1. You cannot exceed the 900 byte limitation of the index key
  2. Maximum 16 columns could only be included in an index
  3. Columns with data types like nvarchar(max), text, and ntext cannot be used in indexes

The advantage of using included columns is as follows:

  1. Columns defined in the include statement, called non-key columns, are not counted in the number of columns
  2. Columns that previously could not be used in queries, like nvarchar(max), can be included as a non-key column. However you cannot use text, image and ntext data types in included columns. Computed columns can also be used. The column cannot be dropped while it is an Included Column
  3. A maximum of 1023 additional columns can be used as non-key columns
  4. The main advantage is it helps in eliminating key lookups and thus improving performance by covering all the fields involved in the query

The syntax for creating an index with included column is as follows:

CREATE INDEX Indexname on table
(Key column)
INCLUDE (column1, column2)

We can also create index with included columns using SSMS as shown below:

Expand your DB > Table name > Expand index node > right click on index go to properties. Now click on included column in Select a page and choose the column you wish to include.

included_columns_01

included_columns_02

included_columns_03

I tested the usage/advantage of included columns by creating a table called testmember which has 1.5 million records with the following fields:

Memberid Int (Clustered index)
Name varchar (30)
Emailaddress varchar (30)
Firstname varchar (30)
Createddate – datetime
Testindex char (500)

We already have a NC on EmailAddress, Firstname.

Index size before creating included columns:

Testmember CI size: 10590912 KB NC size: 45728 KB

I am running the below query,

SET STATISTICS IO ON
SELECT Memberid, Name, EmailAddress, Firstname, Createddate FROM testmember WHERE name='teach2'
SET STATISTICS IO OFF

Without any NC in name field in the table the following is the IO,

Table ‘testmember’. Scan count 5, logical reads 1334506, physical reads 540, read-ahead reads 1328194, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

I have created the below NC on testmember table

CREATE NONCLUSTERED INDEX IX_Name ON dbo.testmember(Name)

Index size after creating a NC without included columns

Testmember CI size: 10590912 KB NC size: 89984 KB

I ran the following query and verified the execution plan which is shown below:

SET STATISTICS IO ON
SELECT Memberid, Name, EmailAddress, Firstname, Createddate FROM testmember WHERE name='teach2'
SET STATISTICS IO OFF

included_columns_04

As you can see there is a index seek on the Non-clustered index (NC) on the predicate name however since the existing NC index on name field couldnt cover the query and hence the optimizer had to use the Clustered index Key to lookup for the rows which were seeked for the predicate name=teach2. The key lookup is quite an expensive operation, if many rows were returned by the NC seek, for each and every record returned the query optimizer had to lookup the clustered index key to fetch the record and this involves a lot of IO.

After dropping the above index and creating an index with included column the following is the IO,

CREATE NONCLUSTERED INDEX Ix_temp
ON dbo.testmember (Name ASC)
INCLUDE (EmailAddress,Firstname,Createddate)

The index size after creating included column is as follows:

Testmember CI size: 10590912 KB NC size: 234008 KB

The following is the IO involved for the below query and corresponding execution plan is shown below:

SET STATISTICS IO ON
SELECT Memberid, Name, EmailAddress, Firstname, Createddate FROM testmember WHERE name='teach2'
SET STATISTICS IO OFF

Table ‘testmember’. Scan count 1, logical reads 10, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0.

included_columns_05

So we can see that creating an index with included column helps in reducing the amount of IO involved by covering the query. In the above case the logical reads has reduced dramatically by 200 times of the original value.

Now I am dropping the existing index with included columns and creating a new one with one more field called testindex which is having char (500) as data type.

DROP INDEX IX_Temp ON dbo.testmember
CREATE NONCLUSTERED INDEX IX_Temp ON dbo.testmember( name )
INCLUDE (EmailAddress, Firstname, Createddate, testindex)

Basically the above index will cover the below query and will use an index seek:

SELECT Memberid, Name, EmailAddress, Firstname, Createddate FROM testmember WHERE name='teach2'

However if you see the index size of NC, it is as follows:

Testmember CI size: 10590912 KB NC size: 5769672 KB

We can see that the NC size is close to 5GB. Even though we get performance improvement we need to use the included columns judiciously.

How to find included columns:

We can use the catalog view sys.index_columns to retrieve this information:

SELECT OBJECT_NAME(OBJECT_ID), is_included_column FROM sys.index_columns WHERE is_included_column=1

Posted

in

by

Comments

Leave a Reply

Your email address will not be published. Required fields are marked *