Converting an EAV design to sparse columns and populating

One of the uses for sparse columns will likely be replacing entity-attribute-value designs. In an EAV design, the "sparse attribute" values are represented by a separate table. The attribute value can either be declared as (n)varchar or sql_variant. An example I've always used is the hardware store that wants to keep track of its products in a "products" table, but each product has different attributes. If you use the minimum number of columns to simplify the example, the EAV design looks like this.

create table products — base table
(
id int primary key,
name varchar(max)
)
go
create table properties — sparse attribute (name-value pair) table
(
id int,
name varchar(50),
value varchar(max),
CONSTRAINT PK_properties PRIMARY KEY (id, name),
FOREIGN KEY (id) REFERENCES products (id)
)
go

Here's a straightforward way to convert the table to use SQL Server 2008's sparse columns. It uses dynamic SQL, but in this case there's no user input (SQL injection worries).

declare @tab nvarchar(max),
@sql nvarchar(max)

set @tab=N'create table products2 (id int primary key, name nvarchar(max) '
select @tab=@tab+','+ name + ' varchar(max) sparse' from properties
group by name
set @tab += ' ,col_values xml column_set for all_sparse_columns);'
— select @tab
exec(@tab)

Populating it is also straghtforward using the pivot operator, introduced in SQL Server 2005. In the case where each item has only one of each sparse property (the table constraint enforces this), and our properties table has only three columns, there's no real aggregation with pivot. The aggregate is just required by the pivot operator syntax. This populates the table:

declare @col nvarchar(max),
@sql nvarchar(max)

set @cols=N''
select @cols=@cols+','+ name from properties
group by name
select @cols
set @cols=substring(@cols,2,datalength(@cols)/2 – 1)
set @sql=N'insert into products2 (id, name,' + @cols + ') select p.id, p.name, '+@cols+
      N' from (
      select id, name, value
      from properties
      ) as q
      pivot
      (
      max(value)
      for name in ('+@cols+
      ')
      ) as PivotTable'
set @sql += ' join products p on PivotTable.id = p.id'
— select @sql
exec (@sql)

What remains to be done now is to choose better data types for the sparse columns, if the data isn't really a string. If you haven't enforced value type correctness in the application, this may require some data cleansing. Full example as an attachment. Remember that currently (in CTP6) you can only have 1024 total columns in a table; but the limit will be increased to 30000 sparse columns before RTM.

eav_to_sparse.zip (1.23 KB)

One thought on “Converting an EAV design to sparse columns and populating”

Gil says:

April 8, 2008 at 7:30 pm

You should probably add square brackets [] around the property (now column) names, since they might not follow the rules for identifiers.

Comments are closed.

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

Converting an EAV design to sparse columns and populating

One thought on “Converting an EAV design to sparse columns and populating”

Other articles

Over 1000 XEvents in SQL Server 2016 CTP2. Here are the new ones.

Taking the Azure SQL Database row-level security preview for a spin

Azure SQL Database V12 Preview – Spatial Fully Functional

Not a “me-too announcement” blog on Azure SQL Database V12 preview

AzureML: What components are used by the sample experiments?

Book Review: Chris Webb’s “Power Query for Power BI and Excel”

Imagine feeling confident enough to handle whatever your database throws at you.