Thursday, December 20, 2007

Since its appearance in SQL Serve 2008 CTP5, folks have been wondering about the reason for SQL Server's new Change Tracking feature, and how it differs from Change Data Capture (CDC). The CTP5 BOL has a nice writeup about how it differs from CDC, so I'd encourage you to start there. But about "why it's there...".

The main reason for change tracking, as far as I can see, is to be used with ADO.NET Sync Services. ADO.NET Sync Services is part of the Microsoft Sync Framework and is a way to do (I'm trying to summarize here) "client directed programmable replication". For database folks ADO.NET Sync Services can be thought of as providing synchronization between SQL Server Compact Edition and any relational datastore, but the model is extensible. ADO.NET Sync Services 1.0 shipped with SQL Server Compact Edition 3.5 in Visual Studio 2008 (there may be other ship vehicles I'm unaware of) and provides 2-tier, 3-tier, or service-oriented synchronization. It provides hub-and-spoke synchronization. ADO.NET Sync Services version 2.0 will add (the CTP is out now) peer-to-peer synchronization. Microsoft Sync Framework (CTP 1 is out now) includes sync support for data stores that aren't necessarily databases. The best place to go to find some highly informative examples that illustrate the synchronization patterns supported by the model is "The Synchronizer's" (Rafik's) blog.

What's this all have to do with SQL Server 2008 change tracking? Although ADO.NET Sync Services 1.0 is a good start (and there's GUI-based designers in VS2008), it usually means adding timestamp or datetime columns and tombstone tables (tables that track the primary key of deleted rows) to existing database tables and triggers to populate the information sync services needs. That's fairly intrusive, especially with packed applications. SQL Server 2008 change tracking takes care of all of this for you.

You turn on change tracking on a database with ALTER DATABASE and on individual tables with ALTER TABLE. Change tracking does the rest. You can access the information you need for Sync Services applications using the CHANGETABLE table-valed function and a few related functions. You usually want to turn on ALLOW_SNAPSHOT_ISOLATION in the database as well, because change tracking works by tracking when a transaction is committed, rather than when its started.

Change tracking provides the information ADO.NET Sync Services needs to answer request such as "what rows have changed since my particular client (change originator is tracked by change tracking as well) last synchronized" and "have I synchronized with the main database so long ago (change tracking info has a DBA-specified retention) that I need to sync the entire table because the incremental info I need is no longer available".

Currently the VS2008 designers (they're accessed by Added a "Local Database Cache" item to a programming project) don't have an option to indicate "just use SQL Server 2008 and I'm using change tracking", so you have to code the synchronization procedures to use change tracking with Sync Services by hand. Perhaps such a feature is in the works for ADO.NET Sync Services 2.0.

And how about SQL Server Compact Edition (currently the only "ClientProvider" that ship with Sync Services 1.0). Well the SQL Server Compact Edition, Sync Services "just works". No special setup is required, although you may (I haven't determined this) need SQL Server Compact Edition 3.5. This version of SQLCE ships with VS2008. One last thing...ADO.NET Sync Services 1.0 isn't available for compact devices yet, so when you sync with SQL Server Compact Edition, it must be deployed on the desktop.

Thursday, December 20, 2007 12:08:30 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, December 08, 2007

As a continuation of the previous discussion, here's a DELETE of a title row without fetching it from the database:

pubsEntities model = new pubsEntities();
titles deletetitle = new titles();
deletetitle.title_id = "BU9994";
deletetitle.EntityKey = new EntityKey("pubsEntities.titles", "title_id", "BU9994");
model.Attach(deletetitle);
model.DeleteObject(deletetitle);
model.SaveChanges();

In this case, you need to populate both the property (title_id) that corresponds to the primary key and also populate the EntityKey itself. Attach the object to the store (note that all the original values need not be specified), delete the object from the store, and save the changes. This bit of trickery convinces the ObjectStateManager that the object was in the store all the time, that you'd fetched it from the database. Then you deleted it from the store, so they'll delete it from the database.

Of course, if you have associated titleauthor rows, you'll get an error trying to delete that title. As you would in the relational database. Of course, you can put an ON DELETE CASCADE in the database or delete all of the associated rows first. SQL Server 2005 also allows ON DELETE SET NULL and ON DELETE SET DEFAULT, but you relational data model has to allow this; be careful about using these. You can synthesize and delete all the associated rows and the parent in a single round trip if know what the rows keys are in their entirety, but how do you delete a set of rows in a single statement without fetching or knowing all of the primary key pieces (like the associated au_id in titleauthor table)? Better yet...

UPDATE without fetch is even going to be even more interesting, because you need to know the original row values. Or how about updating/deleting a set of rows based on a non-key column, like "UPDATE titles SET royalty = NULL WHERE ytd_sales IS NULL"? Alex James, Metadata PM of the EF team has posted the first in a series of blog entries on doing just these types of things with EF...it will be interesting to see how far you can take this.

Saturday, December 08, 2007 8:36:17 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

The ADO.NET Entity Framework Beta 3 was released this week. For details, see the ADO.NET team blog.

One of the things that bothered me originally about EDM was that Entity SQL in V1 doesn't contain INSERT, UPDATE, or DELETE statements. Only SELECT. So to update or delete a row, I'd have to fetch it first. Let's start with something simple, like an INSERT with a foreign key constaint?

All of the "add" examples I've seen always insert a new customer and new order and new order details at the same time. But what if your customer already exists?

As an example, the titles table in the pubs database requires a pub_id of an existing publisher. But EDM represents this constraint as a publishers class field in the titles class and an additional  publishersReference instance as a member also. My first attempt to create a new title with an existing publisher (pub_id 1389) went like this.

titles t = new titles();
t.title_id = "BU9994";
// rest of columns elided for clarity
// statement to set the pub_id to 1389
model.AddTotitles(t);
model.SaveChanges();

My first "statement to set the pub_id to 1389" was:

publishers pub = new publishers();
pub.pub_id = "1389";
t.publishers = pub;

This one tries to add the publisher row too and gets a duplicate key violation.

This worked
t.publishers = model.publishers.First(pub => pub.pub_id == "1389");

BUT it causes a SELECT to get the publisher from the database. I wondered at this point if I really needed to do a SELECT to INSERT a row with an existing foreign key row. With some help from Danny Simmons, a much nicer way is to use the publishersReference class and synthesize an EntityKey.

t.publishersReference = new System.Data.Objects.DataClasses.EntityReference<publishers>();
t.publishersReference.EntityKey = new EntityKey("PubsTestEntities.publishers", "pub_id", "1389");

This does the insert correctly with no gratuitous SELECT. Confirmed by SQL Profiler. Good.

Saturday, December 08, 2007 7:39:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, November 21, 2007

One last SQLCLR feature I'd forgotton about but was quite highly publicized. This is extension of SQLCLR UDT and UDAgg maximum size from 8000 bytes to 2gb. You just the MaxByteSize of -1 in the appropriate attribute, like this:

[SqlUserDefinedAggregate(Format.UserDefined, MaxByteSize=-1,IsNullIfEmpty=true)]

The serialization format is going to have to be UserDefined, the limt for Format.Native is still 8000 bytes.

Just catalog the UDT or UDAgg and use it as you would any other TYPE/AGGREGATE. I crufted up a proof-of-concept one pretty easily, and the new spatial data types (GEOGRAPHY and GEOMETRY) are also large (system) UDTs, so you know this works as of CTP5.

Now you can have your "infinite" UDT that holds an array or "infinite" string concatenation UDAgg. Enjoy.

Wednesday, November 21, 2007 7:39:46 AM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Tuesday, November 20, 2007

Another cool SQLCLR feature in SQL Server 2008 is the ability to define a table-valued function with an ORDER clause. If you have intimate knowledge that the function always returns rows in sorted order, you declare your SQLCLR UDF with an ORDER clause in the CREATE FUNCTION DDL statement. Let's experiment with this using the cheap and easy Fibonacci sequence function from Dan Sullivan and my SQL Server 2005 Developer's Guide.

create assembly orderedtvf from 'C:\temp\OrderedTVF.dll'
go

-- no order clause
create function FibonacciUnOrdered (@prevstart int, @start int, @rows int)
returns table (next int, prev int)
as external name orderedtvf.FData.Fibonacci
go

The Fibonacci sequences generated are always in ascending order because that's how the function is implemented. In fact, they are in order by both the "next" and "prev" column. Because there is no limit to the number of method signatures we can have over the same SQLCLR UDF, we use the same SQLCLR method, just changing the DDL statement and function name.

create function FibonacciByNext (@prevstart int, @start int, @rows int)
returns table (next int, prev int)
order (next asc) -- this is new
as external name orderedtvf.FData.Fibonacci
go

create function FibonacciByPrev (@prevstart int, @start int, @rows int)
returns table (next int, prev int)
order (prev asc) -- same concept, different ordering column
as external name orderedtvf.FData.Fibonacci
go

Now let's do some testing. The query plan iterators and plan cost are shown as comments.

-- cost: 0.0279081
-- TVF -> Sort -> Select
select * from dbo.FibonacciUnordered(3,4,5)
order by next

With an ordered TVF, there is no SORT iterator, but there are extra query plan steps to operate on the ordered set. AND... the query cost is over 10x lower.

-- cost: 0.0023802
-- TVF -> Segment -> Sequence Project -> Assert -> Select
select * from dbo.FibonacciByNext(3,4,5)
order by next

-- cost: 0.0023802
-- TVF -> Segment -> Sequence Project -> Assert -> Select
select * from dbo.FibonacciByPrev(3,4,5)
order by prev

Note that it IS best to have a different TVF name for each sort order, and "no order", if you plan to use different ORDER BY clauses. This one has a Sort AND its cost is greater than the function that's declared UnOrdered

-- Different order
-- cost: 0.0292881 (more than Unordered = 0.0279081)
-- TVF -> Segment -> Sequence Project -> Assert -> Sort -> Select
select * from dbo.FibonacciByNext(3,4,5)
order by prev

In addition, the query plan guarentees that you don't lie in your order clause. Here's proof.

create function FibonacciWrong (@prevstart int, @start int, @rows int)
returns table (next int, prev int)
order (prev desc) -- THEY ARE IN ASCENDING ORDER, NOT DESCENDING
as external name orderedtvf.FData.Fibonacci
go

-- Error:
-- The order of the data in the stream does not conform to the ORDER hint
-- specified for the CLR TVF 'dbo.FibonacciWrong'.
-- The order of the  data must match the order specified in the ORDER hint for a CLR TVF.
-- Update the ORDER hint to reflect the order in which the input data is ordered,
-- or update the CLR TVF to match the order specified by the ORDER hint.
select * from dbo.FibonacciWrong(3,4,5)
order by prev

Remember, you're not only saving a SORT iterator in the query plan, you're saving memory too. The SORT iterator requires a memory grant. And because there are no stats for SQL Server to use in these "opaque to SQL" functions, the memory grant for the SORT iterator in FibonacciUnOrdered is 1024K. And, we hope that the rather generous memory grant is enough, else memory is being allocated during query execution.

So, declaring ordered TVFs is worth it.

Tuesday, November 20, 2007 4:39:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

One thing that I didn't find in the BOL What's New page is some of the new SQLCLR functionality in SQL Server 2008. The first one that intrigued me is support of multi-input user-defined aggregates. Suppose I wanted to implement Oracle's COVAR_POP aggregate, an analytic function that returns the population covariance between two expressions. The signature is COVAR_POP(expr1, expr2) and I want the signature to stay the same in SQL Server.

All that I need to do this is to use the "template" for a .NET UDAgg struct/class, replacing the Accumulate method that take one parameter with a 2-parameter method, like this:

public void Accumulate([SqlFacet(Precision = 20, Scale = 10)] Nullable<decimal> Value1, 
                               [SqlFacet(Precision = 20, Scale = 10)] Nullable<decimal> Value2)
{
  // code here
}

My CREATE AGGREGATE DDL statement would change a bit:

CREATE ASSEMBLY multiparmagg FROM 'C:\temp\multiparmagg.dll'
go

CREATE AGGREGATE dbo.covar_pop(@expr1 decimal(20,10), @expr2 decimal(20,10))
RETURNS decimal(20,10)
EXTERNAL NAME multiparmagg.CovarPop;
go

To invoke:

create table dbo.test_covar (
  i1 decimal(20,10),
  i2 decimal(20,10)
);
go
-- fill with data, then...
select dbo.covar_pop(i1, i2) from dbo.test_covar;

Happy aggregating.

Tuesday, November 20, 2007 3:24:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

With any new product or CTP, I've always gotten fast results by starting with the readme.txt file that comes with the product/CTP. Although its now called ReadmeSQL2008.htm, its still worth reading. This usually gives you answers about install scenarios and last-minute changes.

For example, this readme file answers two of the (so-far) FAQs about CTP5.
  What happened to Surface Area Configuration Utility?
  Can I install SQL Server 2008 CTP on Windows Server 2008? (ie, is it officially supported)

You'll need to RTFR for answers to these.

Right after the readme file, the first thing to do (after you've installed it) is to take a gander at the SQL Server Books Online, in the "What's New" section. Here's a bookmark for those without the facility to look in the TOC. mshelp://MS.SQLCC.v10/MS.SQLSVR.v10.en/s10sq_GetStart/html/6a428023-e3cc-4626-a88a-4c13ccbd7db0.htm.

There's WAY TOO MANY new features in this CTP to describe in a few sentences. And the BOL writers have really had their hands full with this CTP, by the looks of the sheer volume of new material. Enough to keep anyone reading and experiementing for weeks. However...

Every once in a while something slips through the readme and What's new, or isn't finished yet, like the "What's new in SQL Server Installation" section. Or folks think of new ways to use features the BOL writers hadn't thought of. Or want to expound on the repercussions of feature XYZ. Stay tuned for these.

Tuesday, November 20, 2007 11:49:10 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

SQL Server 2008 adds the concept of priority for conversations. It's setup using special DDL statements, priority cannot be specified on the CREATE DIALOG CONVERSATION or SEND/RECEIVE DML statements. The DDL statements are CREATE/ALTER/DROP BROKER PRIORITY.

To specify a priority, you associate a BROKER PRIORITY object with combinations of the qualifiers LOCAL_SERIVCE_NAME/REMOTE_SERVICE_NAME/CONTRACT and the priority is associated with all messages and conversation endpoints that match that combination. The wildcard 'ALL' (or leaving the qualifier out entirely) is permitted for any or all of the qualifiers, and matching precidence is specifying in BOL under the CREATE BROKER PRIORITY syntax. The set of defined priorities is stored in sys.conversation_priorities metadata view.

In addition to this setup, the database has to be set to use priorities, with the "ALTER DATABASE...SET HONOR_BROKER_PRIORITY ON" DDL statement. The default behavior is not to honor priorities.

Once you define BROKER PRIORITY(s) and set the database to honor them, priority (default is 5) will be set on:
 sys.conversation_endpoints
 each message in the queue 
 sys.transmission_queue messages

Service Broker priorities are a much asked-for feature and its good to see it implemented. Because this is a new feature in CTP5, there are still some rough edges that need fixing. You can't set honor_broker_priority on CREATE or ATTACH database. And using the SMO scripter (Script As/CREATE on the database in Object Explorer) doesn't set honor_broker_priority either.

There are some really nice examples of the syntax and the concept in BOL. I've enclosed a really simple starter script that changes the default priority and illustrates the metadata. Enjoy.

broker_priority.sql (2.29 KB)
Tuesday, November 20, 2007 11:08:34 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

SQL Server 2008 adds support for the 'let' clause in FLWOR expressions. The for and let clauses have a similar purpose, to bind content (tuples) to variables.  Either one can begin a FLWOR expression:

declare @x xml = '';
select @x.query('
for $i in (1,2,3)
return $i
');
> returns 1 2 3

declare @x xml = '';
select @x.query('
let $i := (1,2,3)
return $i
');
> returns 1 2 3

The distinction is that let is an assignment clause, in the simple statement using 'let' above, $i refers to the entire sequence (1,2,3). The for clause sets up an iterator. The simple statement above using 'for' loops 3 times and each time through the loop $i refers to a single member of the sequence. So, if I add an 'order by' clause, the results are quite different.

declare @x xml = '';
select @x.query('
for $i in (1,2,3)
order by $i descending
return $i
');
> returns 3 2 1

declare @x xml = '';
select @x.query('
let $i := (1,2,3)
order by $i descending
return $i
');
> error:
> XQuery [query()]: 'order by' requires a singleton (or empty sequence), found operand of type 'xs:integer +'

One limitation on the XQuery let clause is that it does not support constructed elements. So this statement works fine:

declare @x xml = '';
select @x.query('
let $x := 1
return $x
');
> returns 1

but this statement does not:

declare @x xml = '';
select @x.query('
let $x := ( <foo>2</foo>, <bar>2</bar> )
return $x
');
> error:
XQuery [query()]: 'let' is not supported with constructed XML

So Let the use of the let clause begin...no longer do I have to explain what a "FWOR" expression is, hooray. However, nota bene. When 'let' is used inside a loop, it's evaluated each time around the loop:

declare @x xml = '';
select @x.query('
for $i in (1,2,3)
let $j := 42
return ($i, $j)
');
> returns 1 42 2 42 3 42
> $j is evaluated three times

 

Tuesday, November 20, 2007 9:02:54 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, November 15, 2007

The upcoming SQL Server 2008 provider for Powershell was a big hit and sparked quite a bit of discussion when I talked about and demonstrated it during TechEd/Developers last week. At the time I thought that this might be a good discussion topic for database administators, and, after asking around, a room was found, and I'll be re-presenting this session, entitled "Using Windows Powershell with the SQL Server 2008 Provider and SMO" here at TechEd/ITForum tomorrow (Friday) at 13:30-14:45 in room 131.

If you do administrative tasks with Powershell or are interested in SQL Server, drop by and bring your opinions. See you there.

Thursday, November 15, 2007 2:11:43 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, November 13, 2007

I'm up early this morning for a repeat of my T-SQL in SQL Server 2008 talk. Things really went well for the last two days, folks really seemed to like to see lots of actual working code, albeit my examples are always "minimalist". There was a lot of interest around the Spatial data talk and some discussion of upcoming and ongoing projects that will take advantage of this new functionality. The increase in interest could be been caused by announcements about spatial data at the conference and also the opening of the new Spatial Data section on the SQL Server website. My SQL query tuning segment generated a lot of interest too.

If you're around the conference this morning and interested in SQL (and who isn't?) stop by and say hi.

Tuesday, November 13, 2007 11:22:07 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, November 12, 2007

Today is day 1 of TechEd/ITForum in Barcelona. I have a bit less hectic of a workload at this one, a total of 4 sessions, 3 of them being "interactive sessions" (this year's word for chalktalk). Today's session is going to be on T-SQL enhancements in SQL Server 2008, at the late-in-the-day time of 5:45-7:00pm. C'mon out and I'll see you there, or, if you can't make the late session, I'm repeating it on Wednesday.

Tuesday, I'm doing my only breakout, the "day's worth of material in 75 minutes" on SQL query performance tips and tricks. And, in the afternoon, a session on spatial data (and indexes) in SQL Server 2008. See you there...

Monday, November 12, 2007 4:44:54 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, November 10, 2007

Actually yesterday, but today was the first time I'd had a chance to write about it.

Having done quite a bit with what's now being called "traditional web services", my first impression of REST were, I'll have to admit, the thought that it was web services without schema. I kinda like my metadata; it's always been irritating that stored procedures do not store ANY metadata on the number or shapes of the rowsets returned, only metadata on the parameters are stored. The closest that ANSI comes is allowing specification of the number of rowsets, a piece of the standard the SQL Server doesn't implement. So web services without schema seemed less useful than web services with.

But one of the up-and-coming data access technologies at Microsoft is Astoria, which is described as "a REST-ful set of interfaces to relational database (and other) data". It's causing a lot of excitement.

After the conference I mentioned this to an old friend, Jon Flanders, as he was wearing his "real programmers care about URIs" (or something close) teeshirt. Asked him to explain the zen of REST. To summarize, I got the impression it was all about the location specification and using HTTP verbs like GET/POST/DELETE to effect "state transfer" operations. And, although its not very common, metadata can be specified using WADL (Web Application Description Language). So specifying the location is perhaps like a "connection string" to the service? And to the data the application interacts with?

I'm still a bit skeptical of the (seeming) typelessness and contract-lessness of it all, the "HTTP is the only protocol"-ishness, and IIRC, I can find out the location of a traditional web service using WSDL's soap:address element's "location" attribute in the service portion. But at least I have an somewhat of an understanding of what all the buzz is about.

More on Astoria in future posts.

Saturday, November 10, 2007 2:32:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Whew! I had a full week at TechEd Barcelona, just getting around to finishing up writing about it. It was great to get out and see everyone, there were a lot of familiar faces. Thanks for showing up and saying hello. I fully enjoyed each and every presentation and interactive discussion.

A special thank you to Gunther Beersaerts for allowing me to attend and present. Got to meet a lot of "the old gang" as well as some people who I'd only known through their blogs, like Ward Pond and Johannes Kebeck. And I got to attend the Belgium/Luxembourg country drinks party, where I had some great technical discussions over beers.

I don't have exact figures, but because I had the same room for each breakout, I "counted" interest by counting the number of people who showed up. Transactions and Optimizing T-SQL still rule, although there was almost a full room for SQLCLR. And there was more than I expected for Service Broker and quite a few folks for SQLXML. Packed a double "interactive" room for the Powershell in SQL Server 2008 presentation/discussion, although the liveliest discussions were in the "OLE DB and ODBC" and "tell us where it hurts in SQL Server" sessions. The ODBC folks were happy they aren't being forgotten, it was enlightening to see the numbers using OLE DB and ODBC. Thanks to John de Longa of DataDirect for his input on ODBC/OLE DB and Elisa Flasko for helping me moderate the discussion. A number of people seem quite worried about the advent of LINQ to SQL/EDM and what they see as the upcoming deluge of dynamic SQL and its effect on the plan cache. I'll have more to write on that subject soon.

Next week is ITForum. I won't be quite as busy, but I will be doing a few cool talks on Spatial Data in SQL Server, T-SQL improvements in SQL Server 2008 and T-SQL query tuning. See you there.

Saturday, November 10, 2007 1:59:39 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, November 08, 2007

Today, I'm finishing up my (many) talks at TechEd/Developers' Barcelona with a breakout on best practices for transactions and isolation and leading an interactive discussion on OLE DB and ODBC. I love talking about transactions and, in addition to demonstrating all of the isolation levels SQL Server supports their behavior and repercussions, I'll even show the fairly esoteric "MARS batch-scoped transaction".

The "OLE DB and ODBC" discussion should be good. After seemingly moving away from ODBC since the late 1990s (actually they never moved away) there has been a movement toward embracing the ODBC lifestyle and reiterating support in the pre-SQL Server 2008 timeframe. I'll be co-hosting this one with Elisa Flasko of the Microsoft data access team.

See you there!

Thursday, November 08, 2007 12:26:22 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, November 06, 2007

Tomorrow (Wednesday) I'm changing to multiple best practices session in a single day. In the morning, I'll be speaking about T-SQL query plans and "why queries run slowly" in a session called "Best Practices for Procedural SQL Code". And in the afternoon, a different data model and query language (but the same optimizer) in "Best Practices for XML data type and XQuery". Be sure to drop in if you're interested in either of those topics.

Today's surprise was a preview of the upcoming SQL Server 2008 Powershell Provider. It seemed to be a big hit with the folks that came to the session and promises to be yet another step in consolidating administrative tasks in a common form by using Powershell.

See you at TechEd/Developers Barcelona!

Tuesday, November 06, 2007 8:39:10 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Today is day of SQLCLR (that is, use of .NET CLR code in SQL Server) for me at TechEd/Developers in Barcelona. I'm doing two talks on it: one before and one after lunch. The before lunch talk is a breakout-style talk, illustrating with code when its a good idea to use SQLCLR and when its not a good idea. I'll also touch on some SQL Server 2008 SQLCLR enhancements. Questions and comments at end of the talk.

The "after lunch" talk is more of an open discussion. Some folks think that SQLCLR is one of the fine features of SQL Server 2005. Other just want to know: "how do you turn that off, programmers will just abuse it". Please bring your opinions along (don't check them at the door), and contribute to the discussion. I think SQLCLR is one of the most misunderstood features of SQL Server.

After the SQLCLR discussion, I'll be doing an interactive session on SQL Server and Powershell. These two are integrated by means of a .NET library called SMO, though Powershell can also be used with SQLDMO. They'll be a little surprise in this one, but you'll need to be there to find out.

See you there!

Tuesday, November 06, 2007 12:17:58 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, November 05, 2007

I like free, standard, sample databases. One that I've been looking at for quite a while is the Mondial database currently available at Institute for Informatics Georg-August-Universität Göttingen. This set of data (from an old CIA World factbook and other sources), is available not only as a relational database, but as XML, RDF, and even F-logic.

The problem with using this with SQL Server has been the lack of a DATE datatype with the appropriate value range. The DATE datatype is used as "Date of Independence" and some countries have independence dates before 1753. So you'd have to represent it as a VARCHAR. Yuk.

With the new datatypes in SQL Server 2008, this is do-able without compromise, so here it is. In addition, since many of the locations (in two tables) come with latitude and longitude columns, I've added a GEOGRAPHY column, so when CTP5 comes out we'll have something to use GEOGRAPHY with.

Source scripts included. Enjoy!

mondial-sqlserver2008.zip (163.73 KB)
Monday, November 05, 2007 5:21:28 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

Ever since the hierarchyid data type was introduced in SQL Server 2008 CTP4, there's been a fair amount of discussion about renaming some of the methods of the data type to make them a bit more intuitive. In addition, there's been discussion about "missing" methods that don't exist in the hierarchyid data type. Well...

Because hierarchyid is a .NET-based system UDT, you can implement these "missing" methods yourself. There's a couple of ways to go about this:

1. Use your own assembly that uses the hierarchyid (that's Microsoft.SqlServer.Types.SqlHierarchyId to you) as parameters. Because its a "normal" SQL Server type, you can use you use it anywhere you can use a "native" SQL Server data type, like DATE.
2. Code your own UDT that inherits from SqlHierarchyId. Perhaps call it HIERARCHYID2, to further upset those folks who are offended by DATETIME2.

Wow. Did you say inheritence from a system UDT? Before we go down this path, bear in mind that this works with the CTP4 version of SQL Server 2008, but there's no guarentees about it working in the next CTP or release. I've heard nothing to that effect, but you never know. I've coded up a little stub and it seems to work, but...

I thought that T-SQL doesn't support UDT inheritence. Technically, it doesn't. That is, the SQL Server system catalogs (sys.types, etc) don't track UDT inheritence. This means that in order to allow T-SQL to "see" methods and properties in the base class, you need to write methods that do nothing but delegate to the base class. Because all the methods that you care about are public (by definition) you can do this. Simply override the methods that you don't want to pass through. I wrote about this a long time ago, see this blog post from 2004.

A few other considerations. Your UDT must be a class, not a struct, of course. Your UDT must implement IBinarySerialize and use UserDefined serialization, because the base class does. Microsoft.SqlServer.Types allows partially trusted callers, so your assembly should be able to work with permission_set safe. Finally, remember that SQL Server does not allow you to expose overloaded methods in assemblies, although you can use them in your internal implementation.

I'm working to expand the stub implementation, so if anyone has some neat ideas for derived methods and properties, I'd be interested in hearing from you. And, when CTP5 comes out...GEOGRAPHY2 anyone? If it doesn't do exactly what you want, change it.

By the way, I've wondered if deriving from a system data type isn't analogous to using undocumented system stored procedures, a practice which everyone I meet seems to disapprove of, but they do it anyway. I'm think at this point that it's more analogous to using a documented/supported system stored procedure, like sp_spaceused, in your own script. They can't change HIERARCHYID post-SQL Server 2008 without break all existing code. So you're safe, I think. Other opinions?

Monday, November 05, 2007 2:43:21 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

There's been some rumors going 'round about the immenent release of SQL Server 2008 CTP5. OK, maybe I've been asking around too. The main reason for the rumors is that there was a published CTP timetable at one point and its getting to be about that time. The other reason is the plethora of sessions, both here in Barcelona at TechEd/Developers and TechEd/ITForum and at DevConnections in Las Vegas (where some of my SQLskills collegues are this week) about features that won't appear until CTP5. Looking at the calendar of talks I can easily spot talks on Spatial Data and Filestream storage, two of the most compelling features in the next release, due in CTP5. I'll even be doing a spot of my own on Spatial data next week at TechEd/ITForum.

There have also been a number of postings by Microsoft employees about the upcoming CTP5 features, some of them including code. If you're at TechEd, don't miss Johannes Kebeck's talk on Virtual Earth, which may have some tidbits on usage of the SQL Server 2008 spatial data types. He's been working with the spatial data types since they were just a gleam in the SQL Server team's eye. Unfortunately, I'll miss it, because I've got a talk at the same time. Maybe Johannes can fill me in. And Michael Rys' talk on spatial data in SQL Server 2008.

But back to CTP5...the answer so far is just "soon, wait and see". But do attend the sessions and I believe you'll agree its something worth waiting for.

Monday, November 05, 2007 12:17:55 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, November 04, 2007

This week I'm at TechEd/Developers Barcelona. I arrived Saturday and I'm always delighted, being a Portland Oregon resident, to see the sun, even in winter. It's supposed to be sunny and mid-upper 60s for the next 10 days and that's good because I'll be staying on for TechEd/ITForum next week too. Oh boy, my sun quota for the rest of the year...

I'm going to be doing a breakout or two and some "interactive sessions" (that's what chalktalks evolved into) every day from Mon-Thurs. Talking about best practices and lessons learned for developers in SQL Server 2005. I've been working with developers since before SQL Server 2005 shipped and now that we've had a few years to practice I feel comfortable talking about it. In each session, there will be a "look forward" to SQL Server 2008 new features.

Today is day of Service Broker, one of the least understood features of SQL Server 2005. In the best practices session I'll be rewriting some of my older code to correspond with best practices and demonstrating how to use broker for performance. Performance, robustness, and co-location with the data are broker's best features when compared to other ways to implement a service-oriented architecture. 3500 transactional messages/second on commodity hardware is nothing to sneeze at.

After that, I'll be "hosting" and interactive session on Service Broker lessons learned. We'll discuss real world use cases, see what design issues folks have been running into and how they've been addressed. Bring your questions and comments about your own projects to contribute or just come along to listen in.

Later in the week, I'll be covering T-SQL queries and plans, SQLCLR, transactions and isolation, XML data and query, and have some discussion sessions on two of my favorite topics: native data access APIs (OLE DB and ODBC) and SQL Server/Powershell integration. There will even be some surprises: you'll have to show up to find out.

See you there!

Sunday, November 04, 2007 11:56:28 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, October 31, 2007

In the last few years, I've done a few talks at various conferences on the integration of SMO (SQL Server Management Objects) and Powershell. My friend and co-author of the SQL Server 2005 books, Dan Sullivan, got me into using Powershell and SMO and has written quite a number of excellent blog entries and articles about it. Because SMO is just another loadable .NET library, they're a perfect fit. For TechEd US, I even wrote a simple powershell provider that makes SQL Server look at a file system (NavigationCmdletProvider through the database objects), and showed the code.

I'll be doing another chalktalk at TechEd Developers, Europe, next week... with a twist. A few weeks ago, I'd been informed of plans to include a Powershell NavigationCmdletProvider provider that's shipped as part of SQL Server 2008! Really! It won't be in the very next CTP (due soon, according to the original official CTP schedule from long ago), but, barring untowed circumstances, it will be in the final product.

So if you have any interest in SQL Server and Powershell, I'll see you in Barcelona at the chalktalk. It will be a good time, promise.

Wednesday, October 31, 2007 1:12:24 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I've been talking with folks for (it seems like) a long time about modeling sparse attributes in a relational database. Seems like I run into a new design where there's a need for sparse attributes every few weeks. If you remember the "hardware store" example (where each new sales item may have unqiue properties), that's just the tip of the iceburg. Basically, your choices boil down to: Sparse Columns (new column for each new attribute), Sparse Tables (new table for each new set of related attributes, if they are related), Entity-Attribute-Value (the "traditional" design, often eshewed because of scalability concerns), and XML (attributes model that sparse attributes, elements model the common attributes).

Last week I ran into a person with a modeling decision like this. He also informed me that he'd tried the sparse table design and ran into SQL Server's hard limit of 256 tables in a join (if you want all the sparse attributes for all products?). Wow. I can't image what a 256 table join would look like, and how the query processor would have time to load all the statistics for this one. He'd also run into the 1024 column limit with sparse columns. I told him to wait on that one; SQL Server 2008 will have sparse columns.

The XML design is interesting too, because you can do a search to which rows have which sparse attributes and spit out the right data.  It's what the XML VALUE index was designed for. And hoist the common attributes to persisted computed columns for best query perf. Seems that sparse columns may have that covered too, if its implemented like what was shown at TechEd US. There would be an optional column defined as "XML COLUMN_SET FOR ALL_SPARSE_COLUMNS". A value index on this should do the trick for a fast search too.

I (and quite a few other folks, if my networking is correct) can hardly wait...

Wednesday, October 31, 2007 12:43:07 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, October 25, 2007

I'll have to admit it, when I first saw that SQL Server 2008 was adding spatial data support, I thought of it as a niche. The province of geographers, cartographers, and maybe a few others. Complex, involving a lot of higher mathematics, each province having their own geographic encoding, and so on... And that level exists, to me it's the production of spatial reference data. When I think of spatial reference data, I think of map data you'd buy from ESRI and data posted by government agencies. Or made available by utilities, so you don't hit a power cable while digging in your garden. As opposed to spatial line of business data.

Now, before you go searching your LOB application for latitude and longitude columns, how about looking for columns that contain "address". It's a short hop from address to lat/long by using a geocoder. The one I used is the MapPoint web service. Now you have line-of-business spatial data. I'll bet every app has a field that contains address. And how about looking for the nearest salesperson for a potential customer? Or the nearest warehouse? Mapping programs like Virtual Earth, Google Earth, and Yahoo Maps can give you general business information and maps but how about encoding information in your own business?

I'm quite excitied over this upcoming "niche" feature and think it could make its way into each and every application. That's spatial data "for the masses" (so I'm not the greatest at sloganizing, forgive me).

Thursday, October 25, 2007 11:54:52 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

A couple of people have been asking and, in preparing for my upcoming talk on spatial data on SQL Server 2008 at ITForum in Barcelona in a few weeks I got to ask about using the new system UDTs types on the client. In an upcoming CTP release there will be an MSI installer file specifically to install these types on clients. The appropriate files are installed and assembly (Microsoft.SqlServer.Types.dll) registered in the GAC.

Since these are .NET data types, these (HierarchyID, Geography, Geometry) are easily usable in SQLCLR functions and procedures too. No possible data type mismatches or nullability (the new types implement INullable and have a static property to return a NULL instance) concerns.

Thursday, October 25, 2007 11:18:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Friday, September 21, 2007

Just saw Aaron Bertrand's post from PASS on SQL Server 2008 Change Tracking vs Change Data Capture (change data capture (or CDC) is in the current CTP; change tracking is not). I'm not at PASS this week myself, but home while the house is being re-roofed. His post seems to confirm something I'd suspected all along.

Change tracking seems to go hand-in-hand with Sync Services for ADO.NET. I've been following Sync Services for a while; it's a set of libraries for controlling and implementing synchronization for disconnected database applications. Its current shipping vehicle is the Visual Studio Orcas Beta 2 release, along with SQL Server Compact Edition version 3.5. The fact that SQL Server 2008 Change Tracking provides a mechanism to keep track of which user (the sync OriginatorID) made a change (CDC doesn't) and also seems to provide automatic change table management for DELETEs (CHANGE_RETENTION) and a mechanism to "get the set of changes that have occured from a baseline" (the "sync_last_received_anchor" in sync services) makes Change Tracking line right up with what Sync Services requires.

Although Aaron mentions "offline stores like Outlook in cache mode", SQLCE is an exciting offline store because its currently deployed in places like Windows Media Player, Zune, Media Center PC, and more. SQLCE runs on mobile devices and desktops. Sync Services isn't available for mobile devices yet, but is said to be "in progress".

When you set up a Sync Service app, you currenly must make changes to the database (triggers, "tombstone tables for deletes", and such) referred to by the "ServerProvider", in order to track the information Sync needs. The Sync "ServiceProvider" architecture layers over/shares concepts with the ADO.NET provider model. But you don't have to make changes for the SQLCE 3.5-side (SQLCE is the only current "ClientProvider" that Sync Services supports) because, "support for sync is built in". Well...maybe it's built in to the server too, with SQL Server 2008. Bet we'll see (at least one) demo with Sync Services when the Change Tracking feature ships.

Friday, September 21, 2007 1:44:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Theme design by Jelle Druyts

Pick a theme: