Thursday, April 17, 2008

I'm heading off to SQLConnections in Orlando early tomorrow morning. I'll be starting with a preconference "Day of SQL Server 2008 for Developers", covering, well, every new feature that could interest a developer. I'll also be doing a series of talks about everything from spatial data support to Extended Events to PowerShell in SQL Server to XML for DBAs.

If you'll be in Orlando for the conference, stop by and say hi.

Thursday, April 17, 2008 6:42:51 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Tuesday, April 08, 2008

One of the uses for sparse columns will likely be replacing entity-attribute-value designs. In an EAV design, the "sparse attribute" values are represented by a separate table. The attribute value can either be declared as (n)varchar or sql_variant. An example I've always used is the hardware store that wants to keep track of its products in a "products" table, but each product has different attributes. If you use the minimum number of columns to simplify the example, the EAV design looks like this.

create table products -- base table
(
 id int primary key,
 name varchar(max)
)
go
create table properties -- sparse attribute (name-value pair) table
(
 id int,
 name varchar(50),
 value varchar(max),
 CONSTRAINT PK_properties PRIMARY KEY (id, name),
 FOREIGN KEY (id) REFERENCES products (id)
)
go
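
To make the EAV shape concrete, here is a small sketch that loads a few rows into the two tables above. The product and property names (claw hammer, weight_oz, and so on) are made up for illustration and are not from the attached example.

```sql
-- Hypothetical sample data: two products with different attributes
insert into products (id, name) values (1, 'claw hammer'), (2, 'hand saw');

insert into properties (id, name, value) values
 (1, 'weight_oz', '16'),
 (1, 'handle', 'fiberglass'),
 (2, 'blade_length_in', '20'),
 (2, 'teeth_per_inch', '8');
go

-- In the EAV design, the attributes of a product come back as rows, not columns
select p.name as product, pr.name as property, pr.value
from products p
join properties pr on pr.id = p.id
where p.id = 1;
```

Every value is stored as a string here, which is why choosing real data types becomes a separate step after the conversion.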

Here's a straightforward way to convert the table to use SQL Server 2008's sparse columns. It uses dynamic SQL, but in this case there's no user input, so SQL injection isn't a worry.

declare @tab nvarchar(max),
        @sql nvarchar(max)

set @tab=N'create table products2 (id int primary key, name nvarchar(max) '
select  @tab=@tab+','+ name + ' varchar(max) sparse' from properties
group by name
set @tab += ' ,col_values xml column_set for all_sparse_columns);'
-- select @tab
exec(@tab)

Populating it is also straightforward using the pivot operator, introduced in SQL Server 2005. Because each item has only one of each sparse property (the table constraint enforces this) and our properties table has only three columns, there's no real aggregation with pivot. The aggregate is just required by the pivot operator syntax. This populates the table:

declare @cols nvarchar(max),
        @sql nvarchar(max)

set @cols=N''
select  @cols=@cols+','+ name from properties
group by name
 select @cols
set @cols=substring(@cols,2,datalength(@cols)/2 - 1)
set @sql=N'insert into products2 (id, name,' + @cols + ') select p.id, p.name, '+@cols+
      N' from (
      select id, name, value
      from properties
      ) as q
      pivot
      (
      max(value)
      for name in ('+@cols+
      ')
      ) as PivotTable'
set @sql += ' join products p on PivotTable.id = p.id'
-- select @sql
exec (@sql)
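
Once products2 is populated, a few queries show how the column set behaves. This is a sketch that assumes the properties table contained a property named weight_oz (a hypothetical name); substitute one of your own property names.

```sql
-- With a column set defined, select * returns id, name and col_values
-- in place of the individual sparse columns
select * from products2;

-- Sparse columns can still be selected by name (weight_oz is hypothetical)
select id, name, weight_oz
from products2
where weight_oz is not null;

-- The column set shows only the non-null sparse values for a row, as XML
select id, col_values
from products2
where id = 1;
```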

What remains to be done now is to choose better data types for the sparse columns, if the data isn't really a string. If you haven't enforced value type correctness in the application, this may require some data cleansing. The full example is attached. Remember that currently (in CTP6) you can only have 1024 total columns in a table, but the limit will be increased to 30000 sparse columns before RTM.

eav_to_sparse.zip (1.23 KB)
Tuesday, April 08, 2008 9:50:47 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, April 07, 2008

It's easier than you'd think... When you use the SQL Server 2008 CREATE EVENT SESSION DDL statement with extended events, you specify:

Events to be captured (e.g. sqlserver.error_reported)
Actions to be fired to add more information (e.g. sqlserver.sql_text)
Predicates for conditional capture (e.g. where sqlserver.error_reported.error = 547)
Event target to collect the events (e.g. package0.ring_buffer)
Additional options (e.g. MAX_MEMORY)

In general, the way you look for events, actions, predicates, etc to use is to query against the metadata views:

-- @type can be 'event', 'action', etc
declare @type nvarchar(60) = N'event'

select p.name + '.' + o.name as [Full Name],
       o.description as [Description]
from sys.dm_xe_objects o
join sys.dm_xe_packages p on o.package_guid = p.guid
where o.object_type = @type
order by p.name, o.name

Two of the actions listed have to do with activity (causality): package0.attach_activity_id and package0.attach_activity_id_xfer. In addition, there is a session option, TRACK_CAUSALITY. I thought you'd have to set the TRACK_CAUSALITY option and also add the actions to collect activity information. But attempting to add either action produced an error.

It's easier than that. Simply turning on the TRACK_CAUSALITY option is enough. The actions can't be specified in CREATE EVENT SESSION (or ALTER EVENT SESSION) because they are for internal use. The activity_ids show up at the target without explicitly naming the actions.
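
Putting the pieces together, a session along these lines should collect activity ids. This is a minimal sketch, not a tested script: the session name fk_violations is made up, the option is spelled TRACK_CAUSALITY in the released CREATE EVENT SESSION syntax, and the predicate assumes the SQL Server 2008 field name error (later versions call it error_number).

```sql
-- Hypothetical session: capture foreign key violations (error 547)
create event session fk_violations on server
add event sqlserver.error_reported
(
    action (sqlserver.sql_text)       -- extra information to collect
    where error = 547                 -- predicate; field name assumed for 2008
)
add target package0.ring_buffer
with (max_memory = 4 MB, track_causality = on);
go

alter event session fk_violations on server state = start;
go

-- Read what the ring buffer has collected so far, as XML
select cast(t.target_data as xml) as ring_buffer_xml
from sys.dm_xe_sessions s
join sys.dm_xe_session_targets t on s.address = t.event_session_address
where s.name = 'fk_violations';
```

Note that no attach_activity_id action appears anywhere in the DDL; the activity information should arrive in the target data purely because of the option.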

Monday, April 07, 2008 10:54:14 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, April 05, 2008

It appears that there are some changes in .NET 3.5 System.Transactions (or System.Transactions.dll and System.Data.dll version 2.0.50727.1433 if you're looking at assembly versions). Florin Lazar blogs about a change to the syntax here. And Alazel Acheson blogs in the ADO.NET Team blog about changes to SqlClient to accommodate using lightweight transactions with less promotion to distributed transactions.

It is interesting to read how the latest version of SqlClient can combine automatic transactions with connection pooling so that promotion to a distributed transaction happens only when necessary. It is reminiscent of how COM+ used the connection pool; the difference is that COM+ always used distributed transactions. Some of the changes use a transaction-aware connection reset mode that's only supported in SQL Server 2008, so server changes were required as well.

At the end of the blog entry Alazel mentions that you can't use this facility with two open connections at the same time; it requires that there is a free connection with the appropriate transaction scope in the pool. I immediately thought of SQL Server's sp_getbindtoken and sp_bindsession, the manual way to allow two connections to share a single transaction context. But sp_getbindtoken and sp_bindsession are on the deprecation list (to be removed in a future version of SQL Server) in SQL Server 2008. The Books Online entry for these calls recommends "using MARS or distributed transactions instead". But why MARS?

MARS (multiple active resultsets) made its appearance in SQL Server 2005 and its data access stacks. It permits you to have multiple interleaved sessions while using a single SQL Server connection. These multiple sessions share the same transaction, modulo an interesting but esoteric behavior known as batch-scoped transactions. MARS also does not support named savepoints.

What MARS and sp_getbindtoken/sp_bindsession have in common is that they are both solutions to the same "problem". I prefer calling this a SQL Server "behavior" because it's not technically a problem, just how the underlying network stack works. SQL Server does not allow other activity on a connection (like an UPDATE or a second SELECT) while reading a rowset is in progress. Pre-MARS, if you wanted to have multiple commands in a single transaction scope, you would use two connections and "bind" the transaction scopes together with the transaction token. With the MARS capability this is no longer necessary.
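
For reference, the pre-MARS binding technique looks roughly like this. It is a sketch that has to be run from two separate connections; the token literal is a placeholder for whatever value the first connection returns, and how the token travels between the connections is left to the application.

```sql
-- Connection 1: start a transaction and obtain its bind token
begin transaction;
declare @token varchar(255);
exec sp_getbindtoken @token output;
select @token as bind_token;   -- hand this value to the second connection

-- Connection 2: join the transaction started on connection 1
-- (replace the placeholder with the token returned above)
exec sp_bindsession '<bind token from connection 1>';
-- work done here now runs in connection 1's transaction context

-- Connection 1: finish the shared transaction
commit transaction;
```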

So, the OpenConnection1/DoCommand1/Close then OpenConnection2/DoCommand2/Close pattern is now accommodated by System.Transactions and SQL Server 2008 without transaction promotion. To do two commands with the same transaction scope without closing the connection, use a single connection and multiple sessions with MARS. With either data access pattern, only a single physical connection to SQL Server is needed.

Saturday, April 05, 2008 2:14:44 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
