I have used the Contact table with person schema which is residing in AdventureWorks database for testing the index usage. I am describing the scenarios and places where index seek or scan will be used.
CREATE TABLE [Person].[Contact]( [ContactID] [INT] IDENTITY(1,1) NOT FOR REPLICATION NOT NULL, [NameStyle] [dbo].[NameStyle] NOT NULL, [Title] [NVARCHAR](8) NULL, [FirstName] [dbo].[Name] NOT NULL, [MiddleName] [dbo].[Name] NULL, [LastName] [dbo].[Name] NOT NULL, [Suffix] [NVARCHAR](10) NULL, [EmailAddress] [NVARCHAR](50) NULL, [EmailPromotion] [INT] NOT NULL, [Phone] [dbo].[Phone] NULL, [PasswordHash] [VARCHAR](40) NOT NULL, [PasswordSalt] [VARCHAR](10) NOT NULL, [AdditionalContactInfo] [XML](CONTENT [Person].[AdditionalContactInfoSchemaCollection]) NULL, [rowguid] [UNIQUEIDENTIFIER] ROWGUIDCOLNOT NULL, [ModifiedDate] [DATETIME] NOT NULL, CONSTRAINT [PK_Contact_ContactID] PRIMARY KEY CLUSTERED ([ContactID] ASC)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON) ON [PRIMARY] ) ON [PRIMARY]
We already have the following indexes:
CI ContactID
NC – rowguid
NC EmailAddress
As we all know, since the CI contains the entire data it covers all the fields in the table. Whereas the Nonclustered index will cover the NC index key and the CI key alone.
Thus the CI contactId in the table Contact will cover the entire table. Consider the below query:
SELECT * FROM person.contact WHERE contactid=100
But the NC EmailAddress will cover only the following (EmailAddress, ContactId).
Hence the below query will perform an index seek on Emailaddress field and do a key lookup in CI to fetch the records for all other columns.
SELECT * FROM person.contact WHERE emailaddress='kim2@adventure-works.com'
However the below query is covered by the NC index itself and hence it uses an index seek alone:
SELECT ContactId, Emailaddress FROM person.contact WHERE emailaddress='kim2@adventure-works.com'
As we can see the above query selects ContactId and Emailaddress fields which are covered by the NC index on Emailaddress field and hence the optimizer uses an index seek.
In general when there is a single column index, SQL will try to use the index seek when the query is as follows:
SELECT * FROM person.contact WHERE contactid IN (1,3,5,7) SELECT * FROM person.contact WHERE Emailaddress LIKE 'peter%' SELECT * FROM person.contact WHERE contactid > 7000 SELECT * FROM person.contact WHERE contactid BETWEEN 1 AND 1000
Ive created a nonclustered index on Email promotion field:
CREATE INDEX Ix_temp ON person.contact ( EmailPromotion )
However the following set of queries wouldnt seek the indexes:
Even though we have indexes in where clause the index isnt used below and the optimizer prefers a scan instead. We should always avoid using the following in where clause:
SELECT * FROM person.contact WHERE EmailPromotion +1 =1
SELECT * FROM person.contact WHERE Emailaddress LIKE '%bella12@adventure-works.com'
SELECT * FROM person.contact WHERE ABS(contactid)=10001
SELECT * FROM person.contact WHERE LOWER(Emailaddress)= 'KATHERINE73@ADVENTURE-WORKS.COM'
Composite Index:
Now lets create a composite index on Firstname and Lastname fields:
CREATE INDEX IX_Fname_Lname ON person.contact ( Firstname, Lastname )
In case of composite indexes, the index will be used if the leading column in the index key is used as a predicate in the where clause (or) if the predicate used equality operator for both the predicates as shown below:
The following query will use an index seek and key lookup regardless of the order as they have equality operator for both the predicates:
SELECT * FROM person.contact WHERE Firstname='Margaret' AND Lastname='Chen' SELECT * FROM person.contact WHERE Lastname='Chen' AND Firstname='Margaret'
However the following one wouldnt use the index seek and do a index scan since the leading column in the index is not mentioned in the predicate.
SELECT * FROM person.contact WHERE Lastname='Chen'
Now Ive dropped all the constraints and the clustered index, Ive created the below composite index:
CREATE INDEX IX_Fname_Lname_ContactId ON person.contact ( Firstname,Lastname,ContactId )
The following query gives the distinct records in the table and few of fields where we had created index:
SELECT COUNT(*) AS total_records, COUNT(DISTINCT firstname) AS distinct_firstname, COUNT(DISTINCT lastname) AS distinct_lastname, COUNT(DISTINCT contactid) AS distinct_contactid FROM person.contact
total_records | distinct_firstname | distinct_lastname | distinct_contactid |
19972 | 1018 | 1206 | 19972 |
From the above result we can infer that the most selective field is the one which has more number of unique(distinct) records i.e. ContactId, LastName and FirstName.
In the below query, there will be an index scan on the index IX_Fname_Lname_ContactId instead of index seek because we dont have the leading column Firstname of the index key anywhere in the predicate and hence Index will not be used.
SELECT * FROM person.contact WHERE contactid=100
We can find the execution plan for the above query below:
For the below query, the composite index will still not be used as the leading column in the index key Firstname is not used in the where clause.
SELECT * FROM person.contact WHERE contactid=100 AND lastname='blackwell'
For the below query, the composite index will still be used as the leading column in the index key Firstname is used in the where clause.
SELECT * FROM person.contact WHERE contactid=100 AND lastname='blackwell' AND firstname='Jackie'
However the following wouldnt use an index seek even though we have the leading key column of the index in the where clause.
SELECT * FROM person.contact WHERE contactid + 0=937 AND Firstname LIKE '%Gail' AND Lastname='Westover' SELECT * FROM person.contact WHERE Contactid =937 AND Lastname='Westover' AND Firstname LIKE '%Gail'
Leave a Reply