Monday, August 20, 2007

Last blog entry on SQL Server Extended Events for a while. But...a couple of questions came up since I wrote my first blog entry on SQL Server Extended Events.
  What are the major advantages to SQL Server Extended Events?
  Is this really using the Crimson event system?

There are a couple of reasons that come to mind as an answer for the first one. First, and maybe foremost, this eventing system has an ETW target and therefore allows end-to-end tracing. ETW is a provider-based tracing system that is integrated throughout Windows. With the providers available ("logman -query providers" from the command line) you could trace from your ASP.NET app (by way of your IIS server), into SqlClient, across the network (provided that you can decode a TDS trace), into SQL Server, and back. That's a lot of power.

The main hassle I've always had with ETW is the dearth of post-processing tools available. You can use the tracerpt utility to process the ETW output into a comma-separated value file, but where you go from there depends on how well you post-process the CSV. There is a logreader utility that can do simple SQL-like queries against a variety of log file formats. At one point, I made up a simple SSIS job to load SqlClient ETW output into SQL Server to do T-SQL queries, but where you go from there depends on your ability to decode the variable "data" fields in each event. Although the .mof files allow you to decode the binary format into columns based on the data type, CSVs aren't usually self-describing either; you have to know what each bit means.

Next reason why I was intrigued was the granularity of the eventing. At first glance, you can:
1. Create arbitrary groupings/rankings (buckets) on the event data with the bucketizer
2. Pair alloc/dealloc of most any type of resource with the pair_matcher
3. Add extra data (actions) to events. They even added a mechanism to determine causality.
4. Use as many targets as you wish (targets are separate from events and actions)
5. Add events and targets to a running session
6. Specify how much resources (like memory, dispatch latency) your trace should take (see CREATE EVENT SESSION DDL)
7. Use synchronous or asynchronous event collection, and event buffer retention
8. Specify memory partitioning by CPU or NUMA node
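
Here's a rough sketch of how several of those knobs show up in the DDL. The session name is made up, the events and ETW target are the ones used elsewhere in these entries, and the WITH option names are the ones I know from the released SQL Server 2008 documentation, so treat them as an assumption if you're on an early CTP.

```sql
-- Hypothetical session name; option names follow the released SQL Server 2008 syntax
CREATE EVENT SESSION SessionOptionsSketch
ON SERVER
ADD EVENT sqlserver.lock_acquired
    (ACTION (sqlserver.sql_text))                      -- item 3: extra data on the event
ADD TARGET package0.etw_classic_sync_target
    (SET default_etw_session_logfile_path = N'C:\temp\sqletw.etl')
WITH (MAX_MEMORY = 4 MB,                               -- item 6: cap on buffer memory
      MAX_DISPATCH_LATENCY = 5 SECONDS,                -- item 6: dispatch latency
      EVENT_RETENTION_MODE = ALLOW_SINGLE_EVENT_LOSS,  -- item 7: buffer retention
      MEMORY_PARTITION_MODE = PER_NODE,                -- item 8: partition by NUMA node
      TRACK_CAUSALITY = ON);                           -- item 3: causality tracking
GO

ALTER EVENT SESSION SessionOptionsSketch ON SERVER STATE = START;
GO

-- item 5: add another event while the session is running
ALTER EVENT SESSION SessionOptionsSketch ON SERVER
ADD EVENT sqlserver.lock_released;
GO
```

The looser you make the retention mode and dispatch latency, the less the session intrudes on the workload, at the cost of possibly losing events.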

One of the good things about having this much control over a trace is that you can attempt to balance "intrusiveness" (which slows things down) with thoroughness (you ARE usually tracing because there's a problem, after all).

The other question concerns the Crimson eventing system. Crimson is a really old codename for Windows Unified Eventing (Windows Eventing 6.0). You can collect your events in XML format and it uses an XML config file for registration, hence my possible confusion with the term "XEvent", which I'd heard used for SQL Server's Extended Events. Matt Pietrek describes it as "an attempt to unify event log and ETW tracing". It's available on Vista and Longhorn OS's only. Here are a couple more references:

http://msdn.microsoft.com/msdnmag/issues/07/05/SecurityBriefs/
http://msdn.microsoft.com/msdnmag/issues/07/04/ETW/

Don't search for XEvent like I did; you'll get a lot of hits for the X Window System XEvent (remember XWindows?). Or Crimson; you'll get a lot of hits on the University of Alabama.

All of the articles refer to using the wevtutil utility to list event providers, like logman lists ETW providers. So I installed SQL Server 2008 on Longhorn Server (Windows Server 2008) beta3 and looked for "new" event providers and events. I didn't see any, so I'm not sure that SQL Server Extended Events will register anything more than ETW with Windows Unified Eventing. And the bucketizer/pairer targets write to dynamic management views, not to the event log. There is one more target, the asynch file system target, but that's not in this CTP. So stay tuned, perhaps this is an investment toward future "unified eventing".

BTW, I began to wonder: is unifying the event log with ETW data really a good idea? Event logs are the "normal" messages that are emitted; tracing seems to me to be "special occasion" messaging with possibly very high message volumes. I don't think the two are actually mixed in the same physical location in Vista/Longhorn, but...what do you think about this unification?

Monday, August 20, 2007 3:54:08 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

This won't be as long of an entry because I'm trying to finish describing the items that you can use in an XEvent session, that is, the items that exist in a package.
  Events
  Targets
  Actions
  Predicates
  Maps
  Types
 
Let's do types and maps. A type is simply a data type, a simple type like Int16 or complex type like 'SOS_context'. Almost all the types live in package0, there's only one each in sqlos and sqlserver packages. Maps are enumerated constants.

See types:
select * from sys.dm_xe_objects where object_type = 'type'

and map:
select * from sys.dm_xe_objects where object_type = 'map'

and legal enumeration (map_key) values:
select map_value, map_key from sys.dm_xe_map_values where name = 'keyword_map'

Predicates: for predicates (think filters in SQL Profiler), you need a predicate source and a predicate comparator.
select * from sys.dm_xe_objects where object_type like 'pred%' order by object_type

There's probably more to it than this, there are customizable event attributes that can be SET in ADD EVENT, and predicates can use event fields for filtering (but not actions). But this will get you started.

One last thing that bears mentioning is the pair_matching target. This target allows you to specify a pair of events (like lock_acquired, lock_released) and, after you've run the workload a while, it will show you those events that do not match. They appear in the relevant fields of the XML structure exposed by target_data (as in, SELECT CAST(target_data as xml) FROM sys.dm_xe_session_targets;). That is, you see the locks that have been acquired but not yet released. VERY cool.

Once again BOL shows an example that takes advantage of knowledge of the pair_matching target's XML data structure. The XML used for target_data appears to be schema-less, i.e. it doesn't go by a named XML schema, so you have to know what the structure items (elements, attributes, and values) mean. BTW, I keep referring to the BOL because I very much like the info in the BOL, as far as it goes, especially at this early stage. I come to (hopefully) elucidate and expound upon the BOL, not to complain about it. THANKS Buck, Alan, Steve, and all...

You can specify begin and end events, begin and end matching_columns and matching_actions. This is from:

SELECT * FROM sys.dm_xe_object_columns
WHERE column_type = 'customizable'
  AND object_name = 'pair_matching'
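
To make those customizable columns concrete, here's a hedged sketch of a session that feeds lock_acquired/lock_released into the pair_matching target. The session name is hypothetical, and the matching column names (resource_0 and friends) are the event fields I know from the released SQL Server 2008 documentation; check sys.dm_xe_object_columns for the fields your build actually exposes.

```sql
-- Hypothetical session name; matching column names are an assumption (see above)
CREATE EVENT SESSION UnmatchedLocksSketch
ON SERVER
ADD EVENT sqlserver.lock_acquired
    (WHERE sqlserver.database_id = 12),   -- 12 = pubs on the author's machine
ADD EVENT sqlserver.lock_released
    (WHERE sqlserver.database_id = 12)
ADD TARGET package0.pair_matching
    (SET begin_event = 'sqlserver.lock_acquired',
         begin_matching_columns = 'database_id, resource_0, resource_1, resource_2, transaction_id, mode',
         end_event = 'sqlserver.lock_released',
         end_matching_columns = 'database_id, resource_0, resource_1, resource_2, transaction_id, mode');
GO

ALTER EVENT SESSION UnmatchedLocksSketch ON SERVER STATE = START;
GO
```

A begin event is dropped from the target as soon as an end event with the same values in the matching columns arrives, so whatever remains in target_data is the set of locks still held.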

Hope this was useful. Happy event tracing.

Monday, August 20, 2007 2:07:42 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

It's another rainy day in Portland in summer, so I thought I'd stay inside and write more about SQL Server 2008 Extended Events.

I wanted to finish things off by talking about actions and predicates. Need to make a detour at targets, too. I noticed the BOL examples (my point is to try not to repeat things you can find in the BOL) don't contain an example of actions in DDL. So we'll start with them. An action is an additional piece of data that you can tack on to an event. Like a stack trace, or even a causality ID. Or sql_text.

The available actions can be seen with:
SELECT * FROM sys.dm_xe_objects WHERE object_type = 'action'

So let's try sql_text with our existing EVENT SESSION.

CREATE EVENT SESSION PubsLocksETW
ON SERVER
ADD EVENT sqlserver.lock_acquired
   (ACTION (sqlserver.sql_text)
    WHERE sqlserver.database_id = 12) -- (pubs database), add a predicate too
ADD TARGET package0.etw_classic_sync_target
   (SET default_etw_session_logfile_path = N'C:\temp\sqletw.etl' )
GO

But the sql_text doesn't appear in the ETW file. I did this a few times, because I thought I got it wrong. Maybe it will appear in the async file target, which isn't in this CTP. The BOL also says that not every action is valid for every event. Hmmm... the metadata (sys tables) seemed to be happy, but it ain't there. But I can use it with the bucketizer and pairer targets.

The bucketizer makes ...er, buckets (groups) of different "readings" (events) on a single data object. As in, group by lock type or group by cpu time. You can control how many buckets it makes. In fact, the bucketizer target needs syntax like the following (from BOL).

CREATE EVENT SESSION MostLocks
ON SERVER
ADD EVENT sqlserver.lock_acquired (where sqlserver.database_id = 12) -- (pubs)
-- this means "create buckets based on object_id (object being locked in this case)"
ADD TARGET package0.synchronous_bucketizer
    (SET filtering_event_name='sqlserver.lock_acquired', source_type=0, source='object_id')
GO

But how did they figure out what to put after "SET"? Where does 'filtering_event_name' come from?

SELECT * FROM sys.dm_xe_object_columns
WHERE column_type = 'customizable'
  AND object_name = 'synchronous_bucketizer'

There they are... and the description field suggests a use for 'action'.

CREATE EVENT SESSION PubsLockByText
ON SERVER
ADD EVENT sqlserver.lock_acquired
   (ACTION (sqlserver.sql_text)
    WHERE sqlserver.database_id = 12) -- (pubs database)

-- source_type= 1 means buckets by action, not by event
ADD TARGET package0.synchronous_bucketizer
   ( set filtering_event_name='sqlserver.lock_acquired', source_type=1, source='sqlserver.sql_text')
GO

Aha... now I have buckets created based on the text of the SQL statement that caused the lock, rather than by object_id.

This is getting to be too much for a single blog entry. But one last thing for now. You may not have noticed that I switched from using "package0.asynchronous_bucketizer", as the BOL does, to using "package0.synchronous_bucketizer". Why? Because I want to do a simple, controlled experiment and I may not want to wait for the buffer to be full and the async bucketizer to write out. BTW, for a simple controlled experiment, you can do:

USE pubs
GO

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
UPDATE authors SET au_fname = 'bob';
ROLLBACK  -- You don't really want everyone named 'bob', do you?

Leave the EVENT SESSION running to see the buckets. BOL has a cool query against the XML structure but to see the raw XML, if this is your only EVENT SESSION running...

SELECT CAST(target_data as xml) FROM sys.dm_xe_session_targets;
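
If more than one session is running, a sketch along these lines narrows things down to one session and then shreds the bucket XML into rows. The join columns are the ones documented for these DMVs in the released product; the element and attribute names in the XPath (BucketizerTarget, Slot, count, value) are my assumption about the schema-less structure, so look at the raw XML first and adjust.

```sql
-- Raw XML for just one session (session name from the example above)
SELECT s.name AS session_name,
       t.target_name,
       CAST(t.target_data AS xml) AS target_xml
FROM sys.dm_xe_sessions AS s
JOIN sys.dm_xe_session_targets AS t
  ON t.event_session_address = s.address
WHERE s.name = 'PubsLockByText';

-- One row per bucket; the XPath names are assumed, not taken from a schema
SELECT slot.value('(value)[1]', 'nvarchar(max)') AS bucket_value,
       slot.value('(@count)[1]', 'int')          AS events_in_bucket
FROM (
    SELECT CAST(t.target_data AS xml) AS target_xml
    FROM sys.dm_xe_sessions AS s
    JOIN sys.dm_xe_session_targets AS t
      ON t.event_session_address = s.address
    WHERE s.name = 'PubsLockByText'
      AND t.target_name = 'synchronous_bucketizer'
) AS d
CROSS APPLY d.target_xml.nodes('/BucketizerTarget/Slot') AS n(slot)
ORDER BY events_in_bucket DESC;
```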


 

Monday, August 20, 2007 1:04:40 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, August 19, 2007

It's summertime in Oregon and it's been nice and warm out (75-85F) lately. But today it's raining (or specifically, the skies are quite ominous right now), so it's time to write more about my latest favorite subject: SQL Server 2008 Extended Events.

The SQL Server 2008 extended events introduce quite a bit of new terminology, but in investigating the specifics you come across some familiar themes.

Extended events are all contained in packages. An event package is identified by a GUID and a name. Three packages are provided and you can see brief descriptions by querying sys.dm_xe_packages. A package is just a container for all of the other objects (like events and targets) you'll refer to in event sessions. The grouping has no effect on EVENT SESSIONs; you can specify any object from any package in a single event session. The package names are: sqlserver, sqlos, and package0.

Two of the object types that packages contain are events and targets. Events name the information you can collect; these are defined in event_columns. Targets define where the event information is captured and how it's processed before being collected. So what can you capture, already?

SELECT * FROM sys.dm_xe_objects WHERE object_type = 'event'

Only the sqlos and sqlserver packages contain events. The events in sqlos are 40 low-level operating system-interaction events, as you might guess. An example is async_io_requested. The sqlserver package contains over 80 events. These events seem to correspond to SQL Server counters you would see in performance monitor, rather than SQL Profiler trace events that EVENT NOTIFICATIONs use in SQL Server 2005, although there is some overlap. Many of these events only collect one event-specific column, a counter.

You can get a list of all the available events and the event-specific columns they collect by using:

SELECT convert(varchar(55),o.name) as [Object Name]
      ,convert(varchar(25),c.name) as [Column Name]
      ,c.column_id as [Column ID]
      ,convert(varchar(12),c.column_type) as [Column Type]
FROM sys.dm_xe_objects o
JOIN sys.dm_xe_object_columns c
ON o.name = c.object_name
WHERE o.object_type = 'event' AND c.column_type != 'readonly' -- readonly columns are common to most events
ORDER BY [Object Name]

So, to put this all together in an event session, let's use a variation of the BOL example:

CREATE EVENT SESSION test0
ON SERVER
ADD EVENT sqlserver.checkpoint_begin,
ADD EVENT sqlserver.checkpoint_end
ADD TARGET package0.etw_classic_sync_target
    (SET default_etw_session_logfile_path = N'C:\temp\sqletw.etl')
go

See the previous blog entry for information about getting ETW working. Note that, in a single event session, we're using items from two different packages, sqlserver and package0. Start the session using ALTER EVENT SESSION, take a few checkpoints (or produce whatever event you decide to collect), then use ALTER EVENT SESSION again to stop the session. Then you transform the (binary) ETL file to a .csv file by using tracerpt.exe.
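
The start/collect/stop sequence can be sketched like this, assuming the test0 session above was created successfully and the service account can write to the target directory:

```sql
-- Start collecting
ALTER EVENT SESSION test0 ON SERVER STATE = START;
GO

-- Produce a few of the events we asked for
CHECKPOINT;
GO
CHECKPOINT;
GO

-- Stop collecting; the session definition stays around for reuse
ALTER EVENT SESSION test0 ON SERVER STATE = STOP;
GO

-- Optional cleanup once you're done experimenting
DROP EVENT SESSION test0 ON SERVER;
GO
```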

Sunday, August 19, 2007 12:16:37 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, August 16, 2007

When starting out with XEvent support, I thought it would be good to start with the ETW target, although you can capture and categorize events in buckets with the async bucketizer target, and pair related events (like obtain lock/release lock) with the pair matching target. Both VERY cool. But I just wanted a raw, vanilla trace, to start out. And I wrote a paper on ADO.NET and ETW once. So easy one first, I thought...

It turns out that you need privileges to start an ETW session. The ETW session is started for you (rather than you using the logman utility and starting it yourself) when you issue an ALTER EVENT SESSION...STATE=START. But mine never started. It always produced:

Msg 25602, Level 17, State 17, Line 1
The target, "CE79811F-1A80-40E1-8F5D-7445A3F375E7.package0.etw_classic_sync_target", encountered a configuration error during initialization.  Object cannot be added to the event session.

BTW, the guid before the name of the package is the package module id. You don't have to use it in CREATE EVENT SESSION...ADD TARGET...

The reason for this (for those of you that aren't reading the Katmai forums) is that the SQL Server service account is used to activate the ETW session. In order for this to work, the service account must be a member of the groups "Performance Monitor Users" and "Performance Log Users". Mine wasn't a member of "Performance Log Users". Make it a member of the group and this target "works a treat", as my UK friends would say. I'm tracing events to ETW as I write this. THANKS to Jerome Halmans for this information; it's not yet in BOL that I could find.

I've always installed SQL Server (since 2005) by creating a simple account that's a member of only the USERS group in the Windows machine/domain. During install SQL Server gives this account all the privs (and only the privs) it needs. Mostly it does this by creating a group SQLServerMSSQLUser[machine][instance]. But it also makes the user you specify (I call it SQLService) a member of other groups (like "Performance Monitor Users") when it requires group membership. It's a good idea to pick a service account this way for the principle of least privilege, rather than running SQL Server as something else, like Admin or LocalSystem. See the security best practices whitepaper for details.

It's an interesting observation that not all the privs you need are tied to that single group, created at installation. That's (one of a few reasons) why it's always best to use SQL Server Configuration Manager to change the service account rather than the "Services" control panel applet.

I don't know if they're going to add "Performance Log Users" to the list of things that the installer does. If they don't add it automatically, don't forget to add it yourself for this feature (that is: ETW target in SQL Server XEvent). And don't forget to point the ETL file to a directory that the service account has permission to write to.

Thursday, August 16, 2007 11:20:33 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

When I'm doing problem solving, it's always good to have too much information rather than too little. With this in mind, I was quite interested in looking at SQL Server Extended Events (XEvent support) in SQL Server 2008.

You could always get diagnostic information in SQL Server through DBCC and SQL Trace/SQL Profiler. In SQL Server 2005 there are enhancements to SQL Profiler, dynamic management views (which enhanced and in some cases superseded DBCC information), DDL Triggers, and Event Notifications. There is also a WMI provider for events which uses event notifications internally. Event Notifications are sent to a SQL Server service broker queue and the events that are exposed are the same ones DDL triggers handle and most of the ones that SQL Profiler sees. In addition to all this info, there is an ETW (event tracing for Windows) provider for SQL Server. ETW support also appeared in System.Data.dll and the SNAC OLE DB provider/ODBC driver in SQL Server 2005/ADO.NET 2.0.

SQL Server 2008 adds support for XEvent (was codenamed Crimson), the new event system in Windows. The BOL provides info on this support, which works by creating and activating EVENT SESSIONs with DDL statements. Event sessions deal with items from event packages: events, targets, actions, types, predicates, and maps. You can mix and match the items from different packages in an EVENT SESSION.

So how do you get started? Create an event session (with CREATE EVENT SESSION DDL) and add items from the packages to your session (either in CREATE EVENT SESSION or in ALTER EVENT SESSION). You need at least one event and one target. You start/stop collecting by using ALTER EVENT SESSION...STATE=START/STOP.

- SQL Server 2008 ships with three packages: sqlserver, sqlos, and package0.
- There are lots of events. You can find them in sys.dm_xe_objects where object_type = 'event'.
- There are four targets, three of which work in the July CTP. The ones that work are:
    package0.asynchronous_bucketizer
    package0.pair_matching
    package0.etw_classic_sync_target

The first two targets write their info to sys.dm_xe_session_targets. You can join this to sys.dm_xe_sessions (after starting a session and collecting events) and look around. The fields in these DMVs are doc'd in BOL. Start with sys.dm_xe_session_targets.target_data.

The third target writes to an ETW session/file, providing compatibility with ETW. More on this one next.

Thursday, August 16, 2007 10:49:04 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Thursday, August 02, 2007

I've been looking at the mapping of the new Katmai date/time data types to .NET types, both from the point of view of SQL Server ADO.NET clients and of SQLCLR procedures/functions/triggers. There are a couple of things that are "interesting", if I'm not misunderstanding something obvious.

There are no new System.Data.SqlTypes to correspond to the new SQL Server data types. The beta Visual Studio docs state this as a fact. So, unless SQLCLR supports the generic Nullable types as parameters in Katmai, you'll pass these in and out as strings. Or only use them in UDFs with RETURNS NULL ON NULL INPUT. No word yet (that I'm aware of) on Nullable type support in SQL Server 2008, BUT the new HierarchyID type (which is .NET based) supports INullable, like SqlTypes do. So I can pass in a NULL HierarchyID to SQLCLR, but not a NULL TIME parameter.

SQL Server's TIME data type is mapped to TimeSpan, which is a time interval data type. This might confuse folks that know SQL Server doesn't support SQL-99's date/time interval data types. But the TimeSpan seems to have appropriate semantics as long as you stay away from the "Days" related properties and methods.

There is a new .NET data type for DATETIMEOFFSET, System.DateTimeOffset. That's good.

To distinguish between SQL Server 2000/2005/2008 usage/mappings, there is a (client-side) connection string parameter: "Type System Version". You have a choice of 2008, 2005, 2000, or "Latest". This was already used in SQL Server 2005 to distinguish between 2005 (XML data type exists) vs 2000 (XML data type is a long string). And other distinctions.

I wonder if LINQ for SQL and Entity Framework will support these... ;-)

Thursday, August 02, 2007 2:51:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

I've been trying out table-valued parameters along with ADO.NET support in Orcas and came across an interesting dilemma. It centers around INSERTs using TVP against a table with an identity column. ADO.NET can use DataTable, IDataReader or IList<SqlDataRecord> to represent a TVP parameter. Let's say that I want a TVP and a procedure for insert that looks like this:

CREATE TYPE dbo.JobsTableWithIdentity AS TABLE (
  job_id smallint identity primary key,
  job_desc varchar(50),
  min_lvl tinyint,
  max_lvl tinyint
);

The "insert proc" would look like this:

CREATE PROCEDURE dbo.InsertJobsID (@tvp1 dbo.JobsTableWithIdentity readonly)
as
INSERT INTO dbo.Jobs (job_desc, min_lvl, max_lvl)
  SELECT job_desc, min_lvl, max_lvl from @tvp1;

Using this in ADO.NET (with either DataTable or IDataReader as a parameter) produces the error: "INSERT into an identity column not allowed on table variables. The data for table-valued parameter "@tvp1" doesn't conform to the table type of the parameter." But I didn't DO an insert into an identity column in the proc. And this works just fine in T-SQL:

DECLARE @t dbo.JobsTableWithIdentity;
INSERT @t VALUES('hi', 10, 10);
INSERT @t VALUES('hi2', 10, 10);
EXEC InsertJobsID @t;

So it's ADO.NET "deciding" this is an error. The ADO.NET workaround (if I do want to start with a DataTable that contains the identity column, add rows to it, and call Update) is this:

CREATE TYPE dbo.JobsTableWithoutIdentity AS TABLE (
  -- job_id smallint identity primary key,
  job_desc varchar(50),
  min_lvl tinyint,
  max_lvl tinyint
);

-- sproc dbo.InsertJobsNoID changed accordingly

// and then, in ADO.NET code
// DataTable "t" contains a real jobs table, to which I've added rows

DataTable added = t.GetChanges(DataRowState.Added);
added.Columns.Remove("job_id");
da.InsertCommand.CommandText = "dbo.InsertJobsNoID";
da.InsertCommand.CommandType = CommandType.StoredProcedure;
da.InsertCommand.Parameters.AddWithValue("@tvp1", added);

But, should I have to do this? Or modify the T-SQL code, given I've not used the IDENTITY column? But, I will need this column to UPDATE (or MERGE) using the TVP. A dilemma...

Thursday, August 02, 2007 1:49:58 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Tuesday, July 31, 2007

Now that Kim and Paul have each posted about it and even published some pictures, I guess it's time for me to send out a heartfelt CONGRATULATIONS to them both on the occasion of their wedding last weekend. All the best in your upcoming life together!

In a separate/related announcement, Paul announced that he'll be joining SQLskills at summer's end. Welcome, Paul, it will be fantastic to have you onboard!

Tuesday, July 31, 2007 7:35:40 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I saw Dan Jones' posting that the SQL Server 2008 July CTP (aka CTP4) was available on the Connect website. This one has some good things in it (like the new date/time datatypes as well as the hierarchyid data type) that should keep me busy for a while. There's much more new stuff than that, and the Connect website also has the details.

One thing that is included that isn't intuitively obvious is ADO.NET client support. The Visual Studio Orcas Beta 2 release contained a new version (well, it's still called version 2.0.50727, hmmm...) of System.Data.dll with support for the new date/time data types and also for table-valued parameters. The only problem was that it didn't even *connect* to CTP3 (there was a "network protocol" error). But it connects to the July CTP just fine. One thing that's not working in Orcas Beta 2 is SQLCLR projects against a SQL Server 2008 database, but typing CREATE ASSEMBLY is a small price to pay.

The new OLE DB provider and ODBC driver have been in place in the last CTP, but this is the first I've seen of .NET client functionality.

There are two items (listed in the readme file) that will not be in the SQL Server 2008 release. One is SQL Server Notification Services, which made its first appearance as a web release add-in to SQL Server 2000. It's not shipping in SQL Server 2008, and (some/most of) its functionality will eventually appear in Reporting Services. Another (removed from the installer) is a less well-known add-in that also debuted as a web release, SQLXML 4.0 (NOT to be confused with the XML data type, which is alive and well and has new xsd:date etc support). This used to be known as the SQLXML Web Release (V1,V2,V3) for SQL Server 2000, and SQLXML 4.0 (mostly) shipped "in the box" in SQL Server 2005. Some of its functionality was superseded by native XML and Web Services support in SQL Server 2005. It will be removed from the installer and shipped as a separate component instead, like all versions previous to version 4.0 were.

Tuesday, July 31, 2007 7:18:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 17, 2007

When I'm not busy writing about SQL Server, I quite enjoy reading books about it that look at things from a different point of view. I've been catching up on my reading lately, but getting behind on my book reviews. So here's a few reviews I'd been meaning to write for some time. The authors asked that I review them on Amazon, but I don't like either giving out personal information just to do a book review (too intrusive) or the idea of anonymous reviews/fake personal information (too easy to "stack the deck"). So I'm reviewing them here.

SQL Server 2005 Practical Troubleshooting (multiple authors, edited by Ken Henderson)
  This is a superb book about internals, with each chapter written by a dev, a PSS support person, or a member of the Development Customer Support Advisory team. Who could ask for more of an insiders' view? Maybe I like internals a bit too much, but I keep coming back to this book again and again. My favorite chapters are ones detailing the query cache and the query processor. I was originally put off by the number of proofreading/editing mistakes, but it's worth getting past that and concentrating on the content in a hurry. Some excellent troubleshooting utilities are provided as well. A must read.

Expert SQL Server 2005 Development (Adam Machanic, Hugo Kornelis, and Lara Rubbelke)
  I really wrote my "review" for this one on its cover, you'll need to buy the book to read that part. This is not an overview book, nor does it try to be comprehensive in its coverage, but it covers, in depth, topics that you'll see nowhere else. In amazing depth. Absolutely worth every minute you'll spend reading and working the examples.

Pro SQL Server 2005 Service Broker (Klaus Aschenbrenner)
  I haven't received my copy of this one yet, but read a number of chapters as a reviewer before time constraints overwhelmed me. Based on the material that I did review, it's well worth reading. If Roger Wolter's Service Broker book was the seminal treatise on the subject, Klaus expands on the topics and provides quite a few examples and use cases.

Happy reading...

Tuesday, July 17, 2007 1:16:14 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

In just over a month, I'll be at TechEd 2007 Hong Kong. I'll be delivering a preconference talk on (what else) SQL Server 2005 and a number of breakout sessions. Check the SQLskills Upcoming Events for more information. The folks in Hong Kong always put on an excellent show, and I'm very much looking forward to it. In addition, some of my old friends like Jon Flanders and Ron Jacobs will also be there. It should be a good time, stop by and say hi.

Tuesday, July 17, 2007 12:45:43 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 10, 2007

SQL Server 2008 will contain an ANSI SQL standard MERGE statement with extensions. Listening to the webcast last Friday, there's a fairly straightforward way to describe how this works.

Let's go back to first principles. Relational databases support two ways to do UPDATE and DELETE: positioned updates and searched updates. In a positioned update, you open an updatable cursor over a set of rows, navigate to the row you want, and issue an "UPDATE...WHERE CURRENT OF" statement. The searched update (UPDATE foo SET x=y WHERE z=1) is a SELECT and UPDATE in one statement. You specify the rows you want and also update them. The way you specify/gather the rows to update is mostly up to you; SQL Server even supports using a JOIN to do this.

update t
set t.name = s.name, t.age  = s.age
from [target] t
join [source] s on t.id = s.id;
go
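
For contrast, here's a minimal sketch of the positioned form next to the plain searched form. The temp table, its columns, and the cursor name are made up for illustration.

```sql
-- Hypothetical scratch table for the comparison
CREATE TABLE #foo (id int PRIMARY KEY, x char(1), z int);
INSERT INTO #foo VALUES (1, 'a', 1);
INSERT INTO #foo VALUES (2, 'b', 2);

-- Positioned update: open an updatable cursor, navigate, update the current row
DECLARE foo_cursor CURSOR LOCAL
    FOR SELECT id, x FROM #foo WHERE z = 1
    FOR UPDATE OF x;
OPEN foo_cursor;
FETCH NEXT FROM foo_cursor;
UPDATE #foo SET x = 'y' WHERE CURRENT OF foo_cursor;
CLOSE foo_cursor;
DEALLOCATE foo_cursor;

-- Searched update: find the rows and change them in one statement
UPDATE #foo SET x = 'y' WHERE z = 2;

SELECT id, x, z FROM #foo ORDER BY id;
DROP TABLE #foo;
```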

MERGE in SQL Server 2008 is a searched operation that can operate on up to three different rowsets depending on how you specify the MERGE statement. These are actually the three rowsets you can get from a FULL OUTER JOIN. The statement:

select * from [target] t
inner join [source] s on t.id = s.id;

gets the rows in table T with a matching id value in table S. Let's call this rowset1. A left outer join will also include the rows in T that do not match S (rowset2); a right outer join will instead include the rows in S that don't match T (rowset3); and FULL OUTER JOIN contains all three rowsets.

In MERGE, you can get up to all three rowsets (INNER, LEFT, and RIGHT join rowsets)

merge [target] t
 using [source] s on t.id = s.id
 when matched then update set t.name = s.name, t.age = s.age -- use "rowset1" (rows in both)
 when not matched then insert values(s.id, s.name, s.age) -- use "rowset3" (rows only in source)
 when not matched by source then delete; -- use "rowset2" (rows only in target)

The query processor will do the appropriate type of join to gather the rowsets that you need, since not all clauses are required. This optimizes performance over multiple statements, as you only have to gather the rows to process once.
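
One way to see the three rowsets for yourself is to run the FULL OUTER JOIN and label each row with the action a MERGE would take on it. This is just a sketch; the temp tables and sample rows are made up.

```sql
-- Hypothetical sample data
CREATE TABLE #target (id int PRIMARY KEY, name varchar(20), age int);
CREATE TABLE #source (id int PRIMARY KEY, name varchar(20), age int);
INSERT INTO #target VALUES (1, 'Fred', 42);
INSERT INTO #target VALUES (2, 'Wilma', 40);
INSERT INTO #source VALUES (1, 'Fred', 43);
INSERT INTO #source VALUES (3, 'Barney', 41);

-- Classify every row of the full outer join
SELECT t.id AS target_id,
       s.id AS source_id,
       CASE
           WHEN t.id IS NOT NULL AND s.id IS NOT NULL THEN 'in both: update'
           WHEN s.id IS NULL THEN 'only in target: delete'
           ELSE 'only in source: insert'
       END AS merge_action
FROM #target t
FULL OUTER JOIN #source s ON t.id = s.id;

DROP TABLE #target;
DROP TABLE #source;
```

With this data, id 1 lands in the "update" group, id 2 in "delete", and id 3 in "insert".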

Even if you use only "when matched", MERGE is an improvement over our first "update using a join". If more than one row in the source matches one row in the target...

insert into t values(1, 'Fred', 42)
insert into s values(1, 'Buddy', 43)
insert into s values(1, 'Sam', 95)

The update using a join non-deterministically picks a row in the source to do the update; it could be Buddy or Sam. MERGE throws the following exception.

Msg 8672, Level 16, State 1, Line 1
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to  ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.
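
One way to follow the message's advice is to reduce the source to a single row per id before merging. The sketch below uses ROW_NUMBER rather than GROUP BY so the winning name and age come from the same source row; the "highest age wins" rule and the temp table names are my own assumptions for illustration, and the MERGE syntax is the released SQL Server 2008 form.

```sql
-- Same data as above, in hypothetical temp tables
CREATE TABLE #t (id int PRIMARY KEY, name varchar(20), age int);
CREATE TABLE #s (id int, name varchar(20), age int);
INSERT INTO #t VALUES (1, 'Fred', 42);
INSERT INTO #s VALUES (1, 'Buddy', 43);
INSERT INTO #s VALUES (1, 'Sam', 95);

-- Keep exactly one source row per id, chosen deterministically
MERGE #t AS t
USING (
    SELECT id, name, age
    FROM (SELECT id, name, age,
                 ROW_NUMBER() OVER (PARTITION BY id ORDER BY age DESC) AS rn
          FROM #s) AS ranked
    WHERE rn = 1
) AS s
ON t.id = s.id
WHEN MATCHED THEN
    UPDATE SET t.name = s.name, t.age = s.age;

SELECT id, name, age FROM #t;

DROP TABLE #t;
DROP TABLE #s;
```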

MERGE can actually do more than three operations using predicates in the "match/no match clauses", but that's it for now.

Tuesday, July 10, 2007 7:42:55 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, June 24, 2007

Many of you have already heard the "hardware store" story. What's the best way to model products in a hardware store, where new items arrive each day? Each item has a "short list" of similar properties (e.g. UPC, price) and a long list of dissimilar properties (e.g. paint has color, type, amount and curtain rods have width, metal, etc). How to model the dissimilar properties for each item in relational table(s)?

This isn't as unusual a problem as you might think; examples I've heard lately include:
 Items in a directory system (like AD)
 Readings for lab test results
 Attributes for Sharepoint items

I've always thought of the main contenders as:
1. Sparse tables - one per product
2. Sparse columns - 90% of the column values would be NULL
3. Model as XML - similar properties are subelements, sparse properties are attributes
4. Entity-attribute-value (EAV) - also known as open schema. A separate "properties" table with name-value pairs.

EAV is one of the most popular solutions, even supposedly endorsed by standard schemas in some industries. Many relational purists detest EAV because it's non-relational. Its main drawbacks are that the "name-value pair" table gets huge fast, with the corresponding lack of performance; the need for careful editing (color and colour would be two different attributes); and the fact that the "value" column of name-value must have a data type of nvarchar or sql_variant.

SQL Server 2005 added the PIVOT keyword. One use for PIVOT is to change the EAV tables into something that looks like sparse tables.
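
As a rough sketch of that use of PIVOT, assume a hypothetical name-value table called product_properties (the table, its columns, and the sample rows are invented for illustration):

```sql
-- Hypothetical EAV ("open schema") table of name-value pairs.
create table product_properties (
    product_num int           not null,
    prop_name   nvarchar(50)  not null,
    prop_value  nvarchar(100) null
);

insert into product_properties values (21, N'color', N'red');
insert into product_properties values (21, N'type',  N'latex');
insert into product_properties values (42, N'width', N'36');
insert into product_properties values (42, N'metal', N'brass');

-- Turn each property name into a column; properties a product does not
-- have come back as NULL, which is what a sparse table would look like.
select product_num, [color], [type], [width], [metal]
from (select product_num, prop_name, prop_value
      from product_properties) as eav
pivot (max(prop_value)
       for prop_name in ([color], [type], [width], [metal])) as p
order by product_num;
```

Note that the list of property names has to be spelled out in the IN clause, so a new attribute means a changed query (or dynamic SQL).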

I even had the opportunity to ask Joe Celko (no fan of EAV) which he prefers, trying to ease him towards the "model as XML" mechanism. He stood up for sparse tables or sparse columns.

SQL Server 2008 will include support for sparse columns. You can designate a column as SPARSE in the DDL, like this:

CREATE TABLE products (product_num int, item_num int, price decimal(7,2), ...,
                       color char(5) SPARSE, width float SPARSE...)

You can have a huge number of sparse columns per table, although the number of non-sparse columns remains at 1024. In addition, SQL Server 2008 will support sparse indexes (aka filtered indexes) defined like:

CREATE INDEX coloridx ON products(color) WHERE product_num IN (21,22,42...)

Finally, you can have an XML "COLUMN SET" column for each table; this exposes the sparse properties (or perhaps a subset of them?) for each item as a collection of XML elements, for those folks that like to model these as XML.

ALTER TABLE products ADD properties XML COLUMN_SET FOR ALL_SPARSE_COLUMNS
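
Putting the three pieces together, a small self-contained sketch might look like the following. It assumes the syntax as it shipped in SQL Server 2008; the table name products_demo and the sample row are invented for illustration, and the column set is declared when the table is created.

```sql
-- Sparse columns plus an XML column set over them.
CREATE TABLE products_demo (
    product_num int          NOT NULL PRIMARY KEY,
    price       decimal(7,2) NOT NULL,
    color       char(5) SPARSE NULL,
    width       float   SPARSE NULL,
    properties  xml COLUMN_SET FOR ALL_SPARSE_COLUMNS
);

-- Filtered index: only rows that actually have a color are indexed.
CREATE INDEX coloridx_demo ON products_demo(color) WHERE color IS NOT NULL;

INSERT INTO products_demo (product_num, price, color) VALUES (21, 9.99, 'red');

-- The column set returns the non-NULL sparse values as XML elements.
SELECT product_num, price, properties FROM products_demo;
```

Sparse columns can still be read and written by name; the column set is just an alternative view of the same values.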

It's an interesting idea; the proof will be in the perf as well as the usability.

Sunday, June 24, 2007 8:29:39 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

ORDPATH is a hierarchical labeling scheme used in the internal implementation of the XML data type in SQL Server 2005. It's meant to provide optimized representation of hierarchies, simplify insertion of nodes at arbitrary locations in a tree, and also provide document order. It's described in the academic whitepaper "ORDPATHs: Insert-Friendly XML Node Labels". In addition to being used internally when the XML data type is stored, it's also part of the key of the PRIMARY XML INDEX, used to speed up XQuery. It's an implementation/specialization of the path enumeration model of representing hierarchies in relational databases, mentioned in Joe Celko's book "Trees and Hierarchies in SQL".

In SQL Server 2008, there are additional uses of ORDPATH. There is a new system data type, HierarchyID, that will likely use ORDPATH in its implementation. This allows simple hierarchies to be represented as a relational column and provides methods that optimize common hierarchical operations (like parent, child, sibling, ancestors, descendants) without being concerned about the intricacies of elements and attributes.

In addition to representing and indexing XML and hierarchies, Michael Rys mentioned at his TechEd chalktalk on spatial data that the spatial data types may be indexed using a multi-level grid system and that these indexes would also use ORDPATH. Since neither HierarchyID nor the spatial types are in the current CTP of SQL Server 2008, we'll have to wait a bit to see if this is truly "ordpath everywhere".
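
For a feel of the hierarchical operations mentioned above, here is a minimal sketch using the hierarchyid type and methods as they later shipped in SQL Server 2008. The path values are invented for illustration.

```sql
-- Paths are written in a slash-separated form that reads much like
-- a path-enumeration label.
DECLARE @boss hierarchyid = hierarchyid::Parse('/1/');
DECLARE @emp  hierarchyid = hierarchyid::Parse('/1/3/');

SELECT @emp.GetAncestor(1).ToString() AS parent_path,  -- '/1/'
       @emp.GetLevel()                AS depth,        -- 2
       @emp.IsDescendantOf(@boss)     AS under_boss;   -- 1
```

Because the values sort in depth-first order, an ordinary index on a hierarchyid column keeps a node's descendants together.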

Sunday, June 24, 2007 7:51:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, June 23, 2007

SQL Server 2008 will contain, if my count is correct, 7 new data types. Note: none of these are in the current CTP.

DATE - ANSI-compliant date data type
TIME - ANSI-compliant time data type with variable precision
DATETIMEOFFSET - timezone aware/preserved datetime
DATETIME2 - like DATETIME, but with variable precision and a larger date range

GEOMETRY - "flat earth" spatial data type
GEOGRAPHY - "round earth" spatial data type
HIERARCHYID - represents hierarchies using path enumeration model

The first four (date/time series) are NOT implemented/exposed as .NET system UDTs, but the last three are exposed as .NET system UDTs. This means that, for the first time, .NET will be used as part of SQL Server. In SQL Server 2005, there were originally DATE and TIME data types implemented in .NET, but implementing temporal data is an intricate process. After much wailing and gnashing of teeth by some members of the user community, these were removed.  One of the complaints was the .NET implementation. Hmmm...

One of the nice side-effects of implementing spatial and hierarchyid as .NET types is that these will be shipped as a separate assembly, and that the types will be available for client-side and middle-tier use as well as in the database. So if you want to do some massive number crunching of spatial sequences on a computation server and the network traffic from database to computation server is acceptable, you can do so.

In general, SQLCLR makes the "logic in database or middle-tier" argument easier to deal with. Although there's no "run on database or run on server" switch in VS, with minimal code changes you can move your logic, or even duplicate the logic between tiers if need be. You can't do this with T-SQL; although it's faster and better for data access on the database, it doesn't run outside the database. Unless you want to use SQL Server Express Edition as an application server. But that's a discussion for another time.

Saturday, June 23, 2007 10:03:25 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Had to write about another thing that "caught my ear" at TechEd during a chalktalk by Rick Negrin about Service Broker usage patterns.

Service Broker supports "internal activation", that is, associating a stored procedure with a queue so that the procedure gets invoked when the queue has messages to process, as well as "external activation". When using external activation, an event notification occurs when a queue has messages to process, and this notification is picked up by an external application; the external application processes the queue messages, out-of-process to SQL Server.
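
The two styles can be sketched in T-SQL roughly as follows. All object names here (OrderQueue, ShipQueue, ProcessOrderQueue, and so on) are hypothetical, and the activation procedure is a stub that receives one message and does nothing with it.

```sql
-- Internal activation: a stored procedure is attached to the queue.
CREATE QUEUE dbo.OrderQueue;
GO
CREATE PROCEDURE dbo.ProcessOrderQueue
AS
BEGIN
    DECLARE @handle uniqueidentifier, @type sysname, @body varbinary(max);
    WAITFOR (
        RECEIVE TOP (1)
            @handle = conversation_handle,
            @type   = message_type_name,
            @body   = message_body
        FROM dbo.OrderQueue
    ), TIMEOUT 1000;
    -- real message processing would go here
END;
GO
ALTER QUEUE dbo.OrderQueue
    WITH ACTIVATION (
        STATUS = ON,
        PROCEDURE_NAME = dbo.ProcessOrderQueue,
        MAX_QUEUE_READERS = 1,
        EXECUTE AS OWNER
    );
GO

-- External activation: an event notification is posted to a second
-- service, where an out-of-process application can pick it up.
CREATE QUEUE dbo.ShipQueue;
CREATE QUEUE dbo.ActivationNotifyQueue;
CREATE SERVICE ActivationNotifyService
    ON QUEUE dbo.ActivationNotifyQueue
    ([http://schemas.microsoft.com/SQL/Notifications/PostEventNotification]);
CREATE EVENT NOTIFICATION ShipQueueActivated
    ON QUEUE dbo.ShipQueue
    FOR QUEUE_ACTIVATION
    TO SERVICE 'ActivationNotifyService', 'current database';
```

In the external case the application reads the notification from ActivationNotifyQueue and then connects back to receive from ShipQueue itself.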

There is an SSMS template (see template explorer) for an internal activator procedure and an engine sample implementation of a class library to support SQLCLR activator procs. In addition, Remus Rusanu has written some excellent blog entries on activation program message processing patterns.

There is a sample implementation of an external activator as well. Rick mentioned "productizing the external activator for SQL Server 2008". This would mean making the activator part of the SQL Server product, along with the extensive testing and support that go along with it.

That's great news, and a supported, standard, configurable, external activator would make a great addition for SQL Server 2008. Now, about improved broker diagnostic and configuration utilities...and that SQLCLR support library....and improved SSMS support. Those would help things out too (some people are never satisfied).

Saturday, June 23, 2007 9:39:47 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

The June CTP of SQL Server 2008 contains support for table-valued parameters. Here's a usage scenario for these that has been around for a while.

Imagine you are running an online store and deal with (among other data) orders and order detail lines. You'd like to have a stored procedure that can add an entire order in one server round trip, regardless of the number of items ordered (that is, 1 order header and 1->n order detail lines). You'd even settle for two round trips, one for the order header, one for the details. Before SQL Server 2008, there is no built-in mechanism that supports this. In the past, I've seen some pretty interesting workarounds, such as:

1. Compose an arbitrarily large SQL batch as a single "command text" by using string concatenation on the client/middle-tier. All SQL Server database APIs support one (and only one) batch per Command. ADO classic did something like this when you inserted/updated/deleted multiple rows in a disconnected Recordset and called for a "batch update".

2. Create a stored procedure with some "static" parameters and an arbitrarily large number of repeating parameters, most of which will always be NULL. The limit to the number of parameters in a stored procedure is 2100. Both this method and the previous one make for some pretty hideous-looking code.

3. Use multiple parameter sets. OLE DB does support multiple parameter sets and some databases can optimize inserts that use multiple parameter sets. The SQL Server providers, at least last time I looked with SQL Profiler, turn multiple parameter sets into multiple calls, that is, one round trip per parameter set. That's not what I wanted.
 
SQL Server 2008's solution to this age-old problem is table-valued parameters (TVPs, for short). You start using a TVP by creating a custom type, using the CREATE TYPE statement, like this:

CREATE TYPE lineitem_type AS TABLE (line_number INT, order_id INT, product INT, quantity INT);

Information about these table types appears in sys.types and also in a new metadata view, sys.table_types. Once you've created such a table type, you can use it in T-SQL like this:

CREATE PROC new_order (@order_id INT, @line_items lineitem_type READONLY)
AS
-- silly table names used for clarity
INSERT INTO orders_table VALUES (@order_id ... ) ;
INSERT INTO line_items_table
  SELECT * FROM @line_items;

One round trip. Compact, clean code. Nice.
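
To see the whole round trip from the T-SQL side, here is a self-contained sketch using the syntax as it shipped in SQL Server 2008. The _demo names are invented so the script does not collide with the objects above, and the order header insert is left out to keep it short.

```sql
CREATE TYPE lineitem_demo_type AS TABLE
    (line_number INT, order_id INT, product INT, quantity INT);
GO
CREATE TABLE line_items_demo
    (line_number INT, order_id INT, product INT, quantity INT);
GO
-- Table-valued parameters must be declared READONLY.
CREATE PROC new_order_demo
    (@order_id INT, @line_items lineitem_demo_type READONLY)
AS
INSERT INTO line_items_demo (line_number, order_id, product, quantity)
    SELECT line_number, order_id, product, quantity
    FROM @line_items
    WHERE order_id = @order_id;
GO

-- Caller: fill a variable of the table type and pass it as one parameter.
DECLARE @items lineitem_demo_type;
INSERT INTO @items (line_number, order_id, product, quantity)
VALUES (1, 1001, 42, 3),
       (2, 1001, 57, 1);

EXEC new_order_demo @order_id = 1001, @line_items = @items;

SELECT line_number, order_id, product, quantity FROM line_items_demo;  -- two rows
```

A client API passes the same parameter as a single structured value, so the number of detail lines does not change the number of round trips.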

Saturday, June 23, 2007 4:30:18 AM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 

SQL Server 2008 will include GROUPING SETS; a (very) short explanation is that these allow the equivalent of multiple GROUP BY clauses in a single SQL statement. The result is a UNION ALL of the resultsets. SQL Server 2008 also contains/allows standard syntax for ROLLUP and CUBE, which have been in SQL Server for a while.

One way to use (or to think of) grouping sets is that, while ROLLUP with N columns produces a UNION ALL of N+1 results and CUBE produces 2 to the Nth power results, grouping sets can produce an intermediate number of results when not all the dimensions produced by CUBE are needed. Grouping sets should allow better optimization of this type of dimensional query.
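
As a sketch of the "intermediate number of results" idea, assume a hypothetical sales_demo table with three dimension columns (the table and rows are invented for illustration). ROLLUP over the three columns would produce 4 groupings and CUBE would produce 8; the query below asks for exactly three.

```sql
CREATE TABLE sales_demo
    (region varchar(10), product varchar(10), yr int, amount money);

INSERT INTO sales_demo VALUES ('East', 'Paint', 2006, 100);
INSERT INTO sales_demo VALUES ('East', 'Rods',  2007, 50);
INSERT INTO sales_demo VALUES ('West', 'Paint', 2007, 75);

-- Three groupings in one statement: by (region, product), by (yr),
-- and the grand total (). Columns not part of a grouping come back NULL.
SELECT region, product, yr, SUM(amount) AS total
FROM sales_demo
GROUP BY GROUPING SETS ((region, product), (yr), ());
```

The result is the same as a UNION ALL of three separate GROUP BY queries, but the table only has to be named once.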

SQL Server 2008 Reporting Services will contain a new type of control, called the TABLIX. The SQL Server 2008 CTP BOL defines a TABLIX as: "A Reporting Services RDL data region that contains rows and columns resembling a table or matrix, possibly sharing characteristics of both." Grouping sets sound like a good fit with this component. When TABLIX is available, a short profiler session would confirm this.

Of course, in addition to performance benefits and TABLIX support, GROUPING SETS are part of the ISO-ANSI SQL-2006 spec. Another plus.

Saturday, June 23, 2007 4:09:40 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

There was a fairly well-known Powerpoint slide that attempted to summarize the new SQL Server 2005 features in bullet points of a single slide. By the release, there were so many new features, the feature list had to be rendered in a 5-point font to fit. At TechEd 2007, Microsoft presented the new features of SQL Server 2008 (was: SQL Server Katmai) in an analogous format. Although it's not yet down to a 5-point font, there are quite a few impressive new features on tap. Here's the list, modulo (my own) potential spelling errors. There is currently a CTP of SQL Server 2008 as well as beta1 of the next version of Visual Studio.NET (Orcas) and the ADO.NET synchronization framework. The features I can find in the current betas have asterisks. [Note: Asterisks updated for the July CTP]

Transparent Data Encryption
External Key Management
Data Auditing
Pluggable CPU
Transparent Failover for Database Mirroring
Declarative Management Framework*
Server Group Management*
Streamlined Installation*
Enterprise System Management*
Performance Data Collection*
System Analysis*
Data Compression
Query Optimization Modes
Resource Governor
Entity Data Model*
LINQ*
Visual Entity Designer*
Entity Aware Adapters

SQL Server Change Tracking*
Synchronized Programming Model*
Visual Studio Support*
SQL Server Conflict Detection
FILESTREAM data type
Integrated Full Text Search
Sparse Columns
Large User Defined Types*
Date/Time Data Type*
LOCATION data type
SPATIAL data type
Virtual Earth Integration
Partitioned Table Parallelism
Query Optimizations*
Persistent Lookups
Change Data Capture*
Backup Compression
MERGE SQL Statement*
Data Profiling
Star Join*

Enterprise Reporting Engine*
Internet Report Deployment
Block Computations
Scale out Analysis
BI Platform Management
Export to Word and Excel
Author reports in Word and Excel
Report Builder Enhancements
TABLIX
Rich Formatted Data
Personalized Perspectives
… and many more

It should be straightforward to eventually produce a 5-point slide because these features haven't made "the list" yet.

Grouping Sets*
Table-valued parameters*
Table-valued constructors*
"Delighters" (1-stmt variable declaration/assignment, increment operators)*
HierarchyID data type*
Large User Defined Aggregates
New versions of ODBC driver*, OLE DB provider*, and ADO.NET data provider
SQLCE Version 3.5*
Database Mirroring Enhancements*
Data Mining Algorithm Enhancements*
SSIS support for the new date/time data types*
SSIS VSTA support*

That's an impressive set of features. And I'm pretty certain, since this is a summary, that there are more to come.

Saturday, June 23, 2007 3:34:28 AM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 
Thursday, May 10, 2007

SQL Server "Katmai", which some folks are already calling "SQL Server 2008" because of its scheduled release date, was officially announced yesterday. I've been following things and noticed, a while back, that there are a number of Katmai sessions scheduled at TechEd in Orlando, in June. Now that there is an official announcement, I guess I can blog about these without possibly hurting anyone's feelings. Note that, as with the official announcement, the descriptions are pretty sparse. So you'll have to come and see for yourself...

DAT201 - The Next Release of Microsoft SQL Server: Overview
DAT202 - The Next Release of Microsoft SQL Server: Beyond Relational
DAT303 - The Next Release of Microsoft SQL Server: Data Warehousing Enhancements
DAT204 - The Next Release of Microsoft SQL Server: Manageability Overview

DAT17-TLC - Spatial Support in Microsoft SQL Server
DAT06-TLC - New T-SQL System Types in the Next Release of Microsoft SQL Server
DAT18-TLC - Overview of New T-SQL Programmability Features in the Next Release of Microsoft SQL Server

Thursday, May 10, 2007 7:39:40 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, May 06, 2007

I'll be doing a one-day workshop on SQL Server 2005 in Amsterdam in June, as a postconference event of Microsoft DevDays. Although I've done quite a few events on SQL Server 2005 before, this one will be quite a bit different from any that I've done before. It deals with best practices.

It's been about 4-5 years since I started teaching and working with early adopters of technologies like SQLCLR, Service Broker, and XML data types. During the "break in" period, people were most interested in how they worked, syntax and semantics. Now that we've had a while to practice, folks are wondering how it's working out. Did these revolutionary technologies change the database landscape?

I'll begin the workshop by discussing the impact that items like statement-level recompilation, SQL algebrizer and plan caching changes, plan guides, plan forcing and the like have had on the way people write procedural code, and the effect of snapshot isolation on how people approach locking issues. From there I'll go into the uses and guidelines for features, like Service Broker, that folks normally associate with "SQL Server 2005 change", with some real-life examples. It's a different outlook that will benefit you even if you're not on the bleeding edge. And because some 'next generation' features will be introduced at TechEd the preceding week, I'll answer questions on where I think we go from here.

Also not to be missed: Dino Esposito will be presenting an excellent workshop on Ajax next door.

See you there.

Sunday, May 06, 2007 4:45:34 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

I'll be at TechEd 2007 in Orlando next month doing a breakout session on SQL Server 2005 scaleout technologies, as well as a couple of chalktalks (on eventing and WMI and on Powershell and SMO) in the SQL Server area. Stop by and say hi.

And...if you're thinking of going to a pre-conference talk, Kimberly Tripp and I will be delivering an information-packed day on Leveraging SQL Server Always-On Technologies to Achieve High-Availability and Scalability. I'll be covering internals of some of the many scaleout solutions available with SQL Server 2005. We'll also have some of our much-requested SQL Server Always-On DVDs. Have a look at Kimberly's blog entry for even more details. See you there.

Sunday, May 06, 2007 4:39:52 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, April 21, 2007

The day after Orcas Beta1 appeared for public download, reports appeared on the ADO.NET Technology Preview forum that the EDM Wizard (that is, "Add New Item/ADO.NET Entity Data Model") wasn't working. And, sure enough, if you choose "Generate From Database" and carefully choose which tables to use, you get a model with no entities. If you choose an empty model, you get "Specified argument out of range of valid values", and get the three mapping files, but no language file. What to do?

One way around this is to use the files generated from the March CTP wizard. Because the XML schemas for the mapping files haven't changed since March CTP, these work fine. In fact, all of the EDM code that worked with March CTP continues to work unchanged. That's what I used in the SQLCE and EDM blog entry posting.

If you can't keep an old March CTP around just to generate EDM Data Models, you can use the EDMGen.exe command line utility in the C:\WINDOWS\Microsoft.NET\Framework\v3.5.20404 directory. This utility appears to work and generates useable code. It also generates a richer model than the wizard did (with AssociationSets and Associations) but it takes a little more work to put together a project. I'll post a really simple one if there's interest. And EDMGen doesn't generate the same output as the wizard used to (for example, columns of type NTEXT are ignored in the model) and uses different default naming conventions. So at least there are a few workarounds for now.

Saturday, April 21, 2007 7:07:03 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

SQLCE 3.5 comes with an ADO.NET data provider that supports entities, by supporting the ADO.NET 3.5 entity data model (EDM). I noticed this in Orcas March CTP with the inclusion of a new DLL, System.Data.SqlCe.Entities. Although the DLL was there, there was no support in Visual Studio for any provider but SqlClient, and trying to do this manually failed (DbProviderFactory.GetService returned null, IIRC).

In Orcas B1, there still seems to be no support in the Visual Studio "Add ADO.NET Entity Data Model" dialog, but you can do this manually. To set things up, I used a SQLCE table that was roughly the same as the jobs table in the pubs database, used an EDM generated in Mar CTP by pointing at SQL Server's pubs database, and tweaked the SSDL file a bit to be consistent with the SQLCE table. One thing I was surprised by was that SQLCE's EDM implementation didn't seem to mind SSDL's EntityContainer being named "dbo" and generated the correct query anyhow (in SQLCE "select * from dbo.jobs" fails, and I hoped this wouldn't be the query that was generated). It used the right query.

Here's a simple example of using EDM with SQLCE 3.5, database included. You need to put the pubs.sdf database in c:\temp or change the connection string in the app.config to make this work. The sample uses both EntityClient and Entity Services. It could just as easily use LINQ for Entities.

An interesting idea is that if the internals of Sync Services were tweaked slightly, they could use the EDM and EDM's generated SQL in addition to (or instead of) DataSets. Imagine replicating/synchronizing Employee entities with associated Job information instead of synchronizing Employee and Jobs tables. The logic could be hooked in from what AcceptChanges does.

CeEDM.zip (39.65 KB)
Saturday, April 21, 2007 4:06:25 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

One of the latest developments in the SQLCE space is the beta for ADO.NET Sync Services. Sync Services ships as a standalone download, runnable on Visual Studio 2005, and is also built in to Visual Studio Orcas. Both versions require a new version of SQLCE, version 3.5. This runs side-by-side with SQLCE 3.1; although both versions' DLLs have the same names, they are installed in different directories. The Orcas B1 version has a newer version of the DLLs and installs GUI support.

The Orcas GUI support consists of a new component, "Local Database Cache", that can be added to various types of C# and VB.NET projects. This brings up a Configure Data Synchronization dialog. In this dialog, you can configure a Server connection and a Client (SQLCE) connection, select "Cached data objects" (tables to be replicated) and replication specifics. You can also configure the feature of Sync Services that allows you to make synchronization service-based, that is, use a WCF service, Web Service, or other middle-tier component to connect to the database and perform the synchronization.

Besides service-based synchronization, Sync Services allows you to sync to data sources other than SQL Server. You can sync to most any database that supports ADO.NET, for example, Oracle. It works by instantiating a DataSet (although I'd thought it could/might use EDM in future, see next post) and performing synchronization through the DataSet. Sync's DbServerSyncProvider uses a SyncAdapter, making it even more ADO.NET-like. Although DbServerSyncProvider can use any compliant ADO.NET provider, SqlCeClientSyncProvider is SQLCE specific.

For some excellent examples of Sync Services in action, check out The Synchronizer's (aka Rafik Robeal) blog. I think Rafik is up to six samples now, one of which works for an Oracle backend database.

For some excellent videos showing Sync Services with Visual Studio Orcas, check out the links on Steve Lasker's blog. Although I haven't tried it out yet, I think his WCF service-based example is running from the Orcas Beta1 version of the designer.

Next: SQLCE and Entities

Saturday, April 21, 2007 3:53:45 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

So now that I've found SQLCE and know which version I have, what can I do with it besides use the ADO.NET and OLE DB APIs? Well, turns out I need some auxiliary pieces. If you use 3.0, the pieces "come with", but when you upgrade to 3.1, you need to upgrade these.

I must admit I found this somewhat confusing, probably because I "started from scratch" and also wanted to use 3.1. I had to install some pieces before it would "recognize" others. For example, unless you have the Windows Mobile dev tools installed, the 3.1 VS tools won't install. And after installing the Windows Mobile dev tools on VS SP1, I had to install VS SP1 again. And none of these installs made the "Configure Web Synchronization Wizard" app available until I installed the Server Tools.

This turns out to be more important than you might think. For example, the GUI DLLs are used to CREATE the database itself. I've not yet investigated if there is a file format difference in different SQLCE versions, as there is with different SQL Server versions. Or how/when SQLCE database files with data are upgraded. And using the 3.1-specific features, that is, the |Data Directory| connection string directive and Click-Once deployment support, requires the updated version of the dev tools. When you install SQL Server 2005 SP2, your bits are updated, but these are a subset of the bits you use for development in VS.

Here's the list of additional bits (most are 3.1 updates) I came up with, along with some terse notes:

-- Server Tools (SQLCE30setupen - localized)
   Updates IIS to support connectivity solutions (Merge Repl)
   Special considerations for IIS 7.0 (IIS on Vista)
   No need if SQL Server and IIS on same machine
   These are not updated for SQLCE 3.1
   Includes Configure Web Synchronization Wizard app

-- Tools for VS2005 SP1 (SSCE31VSTools)
   Requires installing VS2005 SP1 first
   Adds Click Once Support for SQLCE 3.1
   Updates design-time UI to work with SQLCE 3.1
   Adds 3.1 device cab files
   Requires VS WindowsMobile dev tools install
   If VS not upgraded to SP1, uses SQLCE 3.0 design time UI

-- Developer SDK (SSCE31SDK)
   Cab files for devices
   MSI for desktop and tablet PC editions
   Header files
   MSI for Compact Edition help files
   Northwind sample app

-- Books Online (SSCE31BOL)

-- SQL Server 2005 Compact Edition Access Database Synchronizer
   Supports Access 2000 SP3, 2002 SP3, 2003 SP2, 2007
   Not sure if this supports 3.0, 3.1 or both
   Runs as a service, Uses RDA access and HTTP transport.
   Requires Active Sync 4.0 or later

-- SQL Server 2005 Sync Services CTP
   Installs 3.5 side-by-side, but no new device support yet
   Installs new ADO.NET-based sync
   Included with Orcas B1, works with VS2005 also
   There is a separate documentation and demos download

I'll only be using 3.1 and above because (blush) I don't have a compact device except for the emulator in Visual Studio. So I'm working desktop and emulator only for now.

Next: Newest stuff

Saturday, April 21, 2007 11:37:58 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

As part of a project, I've had occasion to look at the SQL Server Compact Edition in earnest. Although I've taken a cursory glance before this, it's my first attempt to put all the pieces into place. So I thought I'd write it down as I went along.

SQL Server Compact Edition's original name was SQL Server Mobile, but as of version 3.1 it's supported not only on a variety of mobile devices but also on Tablet and Desktop PCs. For simplicity, I'll refer to all versions as SQLCE from now on.

SQLCE ships with SQL Server 2005 and will also ship with the next version of SQL Server (Katmai). I believe it installs if you install the SQL Server 2005 administrative tools, but doesn't show up in Add/Remove Programs. It also ships with Visual Studio, if you install the Compact Framework dev tools. You can also download it standalone.

It consists of 6 DLLs, no services, not even an exe. You can program it with the ADO.NET data provider or OLE DB provider. Each is included as a separate DLL. When installed with VS or standalone it lives in the directory C:\Program Files\Microsoft SQL Server Compact [or Mobile] Edition\v3.x. There are subdirectories that contain the redistributables for various compact devices and versions of Windows CE. When installed with SQL Server it lives in the C:\Program Files\Microsoft SQL Server\90\Tools\Binn\VSShell\Common7\IDE directory.

The versions that come with various product versions are:

SQL Server 2005 RTM and SP1: Version 3.0, build 3.0.5207.0
SQL Server 2005 SP2: Version 3.1, build 3.0.5300.0
Visual Studio 2005: Version 3.0, build 3.0.5207.0
Standalone download of SQL Server Compact Edition 3.1: Version 3.1, build 3.0.5300.0
Mar CTP standalone and Mar CTP of VS Orcas: Version 3.5, build 3.5.5305.0
VS Orcas Beta 1: Version 3.5, build 3.5.5334.0

Interestingly, the DLLs always end with "30" in all versions. You know which version by looking at the file properties, not the DLL names. Version 3.5 will supposedly also ship with SQL Server Katmai, but this is not available yet.

The reason SQLCE ships with SQL Server and Visual Studio is that these are the development and admin environments for it. You don't just start up the .exe and work from the command line; you use either SQL Server Management Studio or Visual Studio's Server Explorer. Each of these tools comes with dialog boxes that allow you to create a database and interact through the GUI (VS's Server Explorer/Data Connections or SSMS's Registered Servers/Object Explorer/Query Window). You can also create and manipulate a database through the programmatic APIs. You must have SSMS to make this work with SQL Server; there is no support for SQLCE in SQL Server Management Studio Express.

In addition to DDL and DML, both environments allow you to configure merge replication, as support for what's called "Occasionally Connected Systems" is one of the main points of SQLCE. Besides SQL Server Merge Replication, you can interact with the "main server database" through RDA (remote data access) or the new ADO.NET Sync Services, currently in beta.

Next: Associated pieces

Saturday, April 21, 2007 10:29:14 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, April 18, 2007

I always look through the SQL Server 2005 samples with each new incarnation, and always seem to find things in there that are new and interesting. In the SP2 version, the samples included a new sample database, "AdventureWorks Light", also known as AdventureWorksLT.

Because folks that teach SQL Server do have a bunch of canned queries for exposition and know exactly how those queries are supposed to behave, sample databases really never disappear. Pubs was the sample database inherited from Sybase, although Sybase itself moved on to "Pubs2" last time I looked. The problem with Pubs was that it really didn't have much data at all. I always point out that, of the 8 publishers in Pubs, only 3 have actually published books. Northwind was inherited from Microsoft Access, being the original Access sample database. It had more data, but not a lot.

In SQL Server 2005, no sample databases are installed with the product by default, which is a good thing. But the "AdventureWorks series" is the new sample database, including an OLTP, data warehouse, and analysis databases, as well as case-sensitive and case-insensitive collation versions. It has a reasonable amount of data and its design is said to reflect current thoughts on best practices. It is built around multiple departments in a putative organization, including Sales, HR, and Manufacturing. This allows it to include an employee/manager hierarchy and a bill-of-materials table.

But the Adventure Works database is a nice 3NF database with lots of tables (~70), and it's sometimes difficult to use for exposition because of the long multipart object names and lack of familiarity. AdventureWorks Light is a bit more approachable, weighing in at about 3MB, but still only available as an MSI (no simple create script). It contains:

10 tables and 3 views
500 or so rows in each table
All in single object schema SalesLT
No stored procs, but 1 scalar and 2 table-valued functions
A single XML schema collection

It even includes a version of Sara Tahir's uspPrinterror and uspLogerror procedures for error handling, but interestingly uspRethrowError is left out.

We'll see if this new sample database is simple enough for widespread pedagogical use. At least it may get folks more used to using (memorizing) the "AdventureWorks family" table and column names.

PS: The SQL Server Samples now have their own Codeplex project at http://codeplex.com/SQLServerSamples

Wednesday, April 18, 2007 1:05:27 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
