SPARSE Column

Happy New Year My dear Friends Smile

It’s  been a long this feature has been released however I just got an opportunity to check it out. This feature is released in SQL Server 2008 version. Sparse columns are normal columns which will store NULL values in a optimized storage which means no storage space is taken when you store NULL values to that column. Happy !! there is some overhead for storing non NULL values to this columns, it will take some more space for non NULL values.

In earlier version of SQL Server NULL values takes the same space as any other value for fixed width and minimal space for variable width, for more information refer How does SQL Server really store NULL-s article. So to overcome this Microsoft has introduced this new feature which will not use any storage space to store NULL values. I’m not going to discuss more about it’s advantages and limitations all these are described in http://msdn.microsoft.com/en-us/library/cc280604.aspx KB article. As I described earlier there is some overhead in storing non null values, that is approximately additional 4 bytes is required to store any value, that means 4 bytes plus the data storage space. I’m going to discuss when and where can we use sparse column.

Let’s do an example to calculate the space used by SPARSE columns with more than 90% of null values and vice versa

Column has more than 90% NULL values

Use the script below, it will create a table in tempdb database and populates couple of sparse columns with 10% non null values and 90% of null values

USE tempdb
GO
CREATE TABLE NULLVALUES_SPARSE
(ID INT IDENTITY(1,1) , NAME VARCHAR(50) NOT NULL,
MOBILE BIGINT SPARSE NULL, MOBILE2 BIGINT SPARSE NULL)
--Populate values, execute each insert statement seperately
GO
INSERT INTO NULLVALUES_SPARSE VALUES('SAGAR',NULL,NULL)
GO 900
INSERT INTO NULLVALUES_SPARSE VALUES('SAGAR',9600012345,9840012345)
GO 100
EXEC sp_spaceused 'NULLVALUES_SPARSE'

sparse_1

If you can see from the image above 40KB is allocated for that table in which 90% is null values, that means no space is used as these columns are declared as SPARSE

Column has more than 90% NON NULL values

USE tempdb
GO
CREATE TABLE NULLVALUES
(ID INT IDENTITY(1,1) , NAME VARCHAR(50) NOT NULL,
MOBILE BIGINT SPARSE NULL, MOBILE2 BIGINT SPARSE NULL)
GO
--Populate values, execute each insert statement seperately
INSERT INTO NULLVALUES VALUES('SAGAR',NULL,NULL)
GO 100
INSERT INTO NULLVALUES VALUES('SAGAR',9600012345,9840012345)
GO 900
EXEC sp_spaceused 'NULLVALUES'

This script allows you to create a table which will populate 90% non null values and 10% null values that means more overhead while storing the values.  From the image below you can see that the table is allocated 64KB in which data is 56 KB

sparse_2

So now lets answer the question where to use SPARSE column, from the above tests it’s evident that there is more storage required to store non null values. So during the design you need to take a call if the column is going to have more NULL values, if that’s the case then you can declare it as SPARSE. In the above test for 1000 rows with 2 columns it saves you 24 KB , lets say if you have more null valued columns with billion of rows then you are going to save more space.

One thought on “SPARSE Column”

  1. Pingback: DotNetShoutout

Leave a Reply

Your email address will not be published. Required fields are marked *

You may use these HTML tags and attributes: <a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>