Happy New Year My dear Friends
It’s been a long this feature has been released however I just got an opportunity to check it out. This feature is released in SQL Server 2008 version. Sparse columns are normal columns which will store NULL values in a optimized storage which means no storage space is taken when you store NULL values to that column. Happy !! there is some overhead for storing non NULL values to this columns, it will take some more space for non NULL values.
In earlier version of SQL Server NULL values takes the same space as any other value for fixed width and minimal space for variable width, for more information refer How does SQL Server really store NULL-s article. So to overcome this Microsoft has introduced this new feature which will not use any storage space to store NULL values. I’m not going to discuss more about it’s advantages and limitations all these are described in http://msdn.microsoft.com/en-us/library/cc280604.aspx KB article. As I described earlier there is some overhead in storing non null values, that is approximately additional 4 bytes is required to store any value, that means 4 bytes plus the data storage space. I’m going to discuss when and where can we use sparse column.
Let’s do an example to calculate the space used by SPARSE columns with more than 90% of null values and vice versa
Column has more than 90% NULL values
Use the script below, it will create a table in tempdb database and populates couple of sparse columns with 10% non null values and 90% of null values
USE tempdb GO CREATE TABLE NULLVALUES_SPARSE (ID INT IDENTITY(1,1) , NAME VARCHAR(50) NOT NULL, MOBILE BIGINT SPARSE NULL, MOBILE2 BIGINT SPARSE NULL) --Populate values, execute each insert statement seperately GO INSERT INTO NULLVALUES_SPARSE VALUES('SAGAR',NULL,NULL) GO 900 INSERT INTO NULLVALUES_SPARSE VALUES('SAGAR',9600012345,9840012345) GO 100 EXEC sp_spaceused 'NULLVALUES_SPARSE'
If you can see from the image above 40KB is allocated for that table in which 90% is null values, that means no space is used as these columns are declared as SPARSE
Column has more than 90% NON NULL values
USE tempdb GO CREATE TABLE NULLVALUES (ID INT IDENTITY(1,1) , NAME VARCHAR(50) NOT NULL, MOBILE BIGINT SPARSE NULL, MOBILE2 BIGINT SPARSE NULL) GO --Populate values, execute each insert statement seperately INSERT INTO NULLVALUES VALUES('SAGAR',NULL,NULL) GO 100 INSERT INTO NULLVALUES VALUES('SAGAR',9600012345,9840012345) GO 900 EXEC sp_spaceused 'NULLVALUES'
This script allows you to create a table which will populate 90% non null values and 10% null values that means more overhead while storing the values. From the image below you can see that the table is allocated 64KB in which data is 56 KB
So now lets answer the question where to use SPARSE column, from the above tests it’s evident that there is more storage required to store non null values. So during the design you need to take a call if the column is going to have more NULL values, if that’s the case then you can declare it as SPARSE. In the above test for 1000 rows with 2 columns it saves you 24 KB , lets say if you have more null valued columns with billion of rows then you are going to save more space.
Leave a Reply