Thursday, October 25, 2007

I'll have to admit it, when I first saw that SQL Server 2008 was adding spatial data support, I thought of it as a niche. The province of geographers, cartographers, and maybe a few others. Complex, involving a lot of higher mathematics, each province having their own geographic encoding, and so on... And that level exists, to me it's the production of spatial reference data. When I think of spatial reference data, I think of map data you'd buy from ESRI and data posted by government agencies. Or made available by utilities, so you don't hit a power cable while digging in your garden. As opposed to spatial line of business data.

Now, before you go searching your LOB application for latitude and longitude columns, how about looking for columns that contain "address". It's a short hop from address to lat/long by using a geocoder. The one I used is the MapPoint web service. Now you have line-of-business spatial data. I'll bet every app has a field that contains address. And how about looking for the nearest salesperson for a potential customer? Or the nearest warehouse? Mapping programs like Virtual Earth, Google Earth, and Yahoo Maps can give you general business information and maps but how about encoding information in your own business?

I'm quite excitied over this upcoming "niche" feature and think it could make its way into each and every application. That's spatial data "for the masses" (so I'm not the greatest at sloganizing, forgive me).

Thursday, October 25, 2007 11:54:52 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

A couple of people have been asking and, in preparing for my upcoming talk on spatial data on SQL Server 2008 at ITForum in Barcelona in a few weeks I got to ask about using the new system UDTs types on the client. In an upcoming CTP release there will be an MSI installer file specifically to install these types on clients. The appropriate files are installed and assembly (Microsoft.SqlServer.Types.dll) registered in the GAC.

Since these are .NET data types, these (HierarchyID, Geography, Geometry) are easily usable in SQLCLR functions and procedures too. No possible data type mismatches or nullability (the new types implement INullable and have a static property to return a NULL instance) concerns.

Thursday, October 25, 2007 11:18:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Friday, September 21, 2007

Just saw Aaron Bertrand's post from PASS on SQL Server 2008 Change Tracking vs Change Data Capture (change data capture (or CDC) is in the current CTP; change tracking is not). I'm not at PASS this week myself, but home while the house is being re-roofed. His post seems to confirm something I'd suspected all along.

Change tracking seems to go hand-in-hand with Sync Services for ADO.NET. I've been following Sync Services for a while; it's a set of libraries for controlling and implementing synchronization for disconnected database applications. Its current shipping vehicle is the Visual Studio Orcas Beta 2 release, along with SQL Server Compact Edition version 3.5. The fact that SQL Server 2008 Change Tracking provides a mechanism to keep track of which user (the sync OriginatorID) made a change (CDC doesn't) and also seems to provide automatic change table management for DELETEs (CHANGE_RETENTION) and a mechanism to "get the set of changes that have occured from a baseline" (the "sync_last_received_anchor" in sync services) makes Change Tracking line right up with what Sync Services requires.

Although Aaron mentions "offline stores like Outlook in cache mode", SQLCE is an exciting offline store because its currently deployed in places like Windows Media Player, Zune, Media Center PC, and more. SQLCE runs on mobile devices and desktops. Sync Services isn't available for mobile devices yet, but is said to be "in progress".

When you set up a Sync Service app, you currenly must make changes to the database (triggers, "tombstone tables for deletes", and such) referred to by the "ServerProvider", in order to track the information Sync needs. The Sync "ServiceProvider" architecture layers over/shares concepts with the ADO.NET provider model. But you don't have to make changes for the SQLCE 3.5-side (SQLCE is the only current "ClientProvider" that Sync Services supports) because, "support for sync is built in". Well...maybe it's built in to the server too, with SQL Server 2008. Bet we'll see (at least one) demo with Sync Services when the Change Tracking feature ships.

Friday, September 21, 2007 1:44:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, September 20, 2007

I'll be at TechEd and ITForum in Barcelona in November, doing sessions on SQL Server 2005 and SQL Server 2008. One of the TechEd sessions is called "To ODBC or to OLE DB?" and is a discussion of using ODBC and OLE DB is applications these days, whether one or the other is "better", if you should convert existing applications, etc. On that note...

Back at TechEd US, I'd spoken with Chris Lee, who's in charge of SQL Native Client. That's the OLE DB provider and ODBC driver that ship with SQL Server. SQL Server 2008 has a new version of the provider and driver, and when I'd asked if support for the new features (like the new DATE/TIME data types and table-valued parameters) Chris had not only replied "yes", but given me a demo of using table-valued parameters from ODBC to show off and post if I wanted to.

Here's the code. To build and run it:

1. Use the Visual C++ compiler that comes with Visual Studio 2005 or Visual Studio 2008 beta. I used VS2008 Beta 2.
2. Convert the project if needed. I ignored the warnings about 1 source file not being converted.
3. Make sure that sqlncli.h and sqlncli10.lib are available to the compiler. They're in C:\Program Files\Microsoft SQL Server\100\SDK\Includes and Lib, respectively.
4. Install SQLNCLI from the SQL Server 2008 CTP distibution on the client machine. Just run the SQLNCLI.msi in Servers\Setup.
5. Setup an ODBC System DSN from Control Panel/ODBC Administrator named TVPDemo. It must use the SQL Server Native Client 10.0 ODBC driver. Use the database of your choice, the demo will run DDL to create the database objects you need.
6. Compile the demo, set breakpoints and walk through the code

Cheers! See you in Barcelona!

 

ODBC TVP Sample code1.zip (1008.79 KB)
Thursday, September 20, 2007 3:15:18 PM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 

I received mail today from the SQL Server Compact Edition folks at Microsoft on my blog post on using SQL Server CE 3.5 beta, Visual Studio Beta 2, LINQ, and EDM. The current plan is:

1. Visual Studio 2008 will not ship with LINQ to SQLCE designer support. SQLMetal works just fine with SQLCE, though, as I'd mentioned.
2. There is planned future support for using SQLCE with EDM and LINQ to Entities when EDM ships after Visual Studio 2008 (VS 2008 SP1?).
3. There is a fix in the works for the FK issue that I had with the SQLCE Northwind sample database and SQLMetal.

Along these lines, I asked about the integration of LINQ to SQLCE with the updateable, scrollable, cursor-like behavior of SqlResultSet. Because SQLCE is an embedded database ("the engine" loads into your application) using the DataSet with SQLCE programming adds a layer of buffering (read: memory allocation and data copying) between the data and you. SqlResultSet is a perf win over using DataSet, and using LINQ to SQLCE or EDM to SQLCE, although they don't use the DataSet, doesn't allow in-place updating like SqlResultSet does. And they do use memory allocations rather than read directly from the SQLCE database. The current LINQ to SQLCE doesn't support SqlResultSet-like behavior yet, but perhaps in future...

Thursday, September 20, 2007 1:54:40 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Wednesday, September 19, 2007

So, people always ask... now that .NET Framework 3.0 installed on my system and there's going to be a new version of .NET that includes LINQ, what version of the framework will SQL Server load now, in SQL Server 2008? Does 2.0 still load? Or does it load 3.0 or 3.5?

Theoretically, and I've written this in books and previous blog entries, SQL Server loads the most recent version of the .NET framework. .NET service packs therefore affect SQLCLR. Especially if they contain updates to say, System.Data.dll. Or System.Xml.dll. Or [your favorite "approved" library goes here].

At present, one .exe can load one and only load version of the .NET framework. ASP.NET, for example, supports multiple versions with multiple worker-processes, that is, multiple .exe-s. Perhaps in .NET 4.0, we'll be able to run multiple .NETs in a single process. Perhaps not. The "main" .NET assembly is mscorlib.dll; this contains quite a few of the "main" .NET namespaces/classes. Namespaces don't necessarily correlate to DLLs, in Visual Studio 2008 version of .NET "System.Data" will be spread across a few DLLs. To me, there is "a new version of .NET" when there is a new version of mscorlib.dll. That's not the way its represented normally.

.NET 3.0 did not replace mscorlib.dll. Or System.Data, System.Xml, ASP.NET, or others. In my C:\WINDOWS\Microsoft.NET\Framework\v3.0 directory there is no DLLs, only three subdirectories: Windows Communication Foundation, Windows Workflow Foundation, and WPF. So, if you have a machine that "comes with .NET 3.0" it also must come with (at least) .NET 2.0.

My machine (with SQL Server 2008 and Visual Studio 2008 Beta 2) has a .NET 3.5 directory in it. And a subdirectory named "Microsoft .NET Framework 3.5 (Pre-Release Version)". Neither one has an mscorlib.dll in it. It's mostly Visual Studio-related DLLs and utilities.

The main additonal assemblies for 3.0 and 3.5 (like LINQ, for example) actually live in C:\Program Files\Reference Assemblies\Microsoft\Framework. V3.0 and V3.5. That's were the main action is, except that this is all mostly hidden by the presence of and registration in the global assembly cache (GAC).

So, an instance of SQL Server 2008 will load mscorlib.dll, version 2.0.50727.42, as always. However, in SQL Server 2008 (and 2005 for that matter, if it ships as a .NET upgrade through Windows Update), you'll see a new version of System.Data.dll, version 2.0.50727.1378. Or, perhaps, we'll be able to hold the versions constant, at the possible expense of interop between SQL Server 2005 features that use .NET. Like SSIS, SSMS, SSRS, and so on.

Got it? ;-)

Wednesday, September 19, 2007 1:53:07 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

Let's talk about clients and SQL Server 2008. First, a little history...

Every new version of SQL Server seems to include a new feature that requires a change to the protocol that SQL Server uses to talk to clients, the TDS protocol. TDS stands for tabular data stream, and is a propriatary protocol used by both SQL Server and Sybase. Since version 4.21, SQL Server's version of TDS and Sybase's version have "grown apart". Because TDS includes version negotiation you can still use old libraries to talk to newer versions, albeit at reduced functionality.

Even if they don't update the TDS protocol, there will always be new features that require changes to the client stack at some level.

SQL Server comes with support for the following client stacks:
ODBC - Open Database Connectivity, related to the ANSI SQL standard, vendor neutral
OLE DB - a COM-based vendor neutral library
ADO.NET - a .NET-based vendor neutral library

ADO (classic) is an IDispatch-friendly library over OLE DB
LINQ to SQL and Entity Data Model use ADO.NET to talk to the database. Remember LINQ to SQL is actually LINQ to SQL Server at present.

In addition, Microsoft ships a JDBC (which does not stand for Java Database Connectivity, the docs say so ;-) driver separately. DBLIB also still works, but not shipped with SQL Server any more. It's functionality is limited to features that existed in SQL Server 6.5, IIRC. Other vendors (e.g. DataDirect) ship SQL Server clients too. DataDirect licenses the TDS stack.

Before SQL Server 2005, providers and drivers were part of MDAC. MDAC (Microsoft Data Access Components) was once a separate install, but now its part of the OS. TDS libraries were separate DLLs (e.g. DBMSSOCN.dll). This stopped in

SQL Server 2005, and there are two main distribution vehicles:
SQL Native Client (SQLNCLI): OLE DB, ODBC, and network libs
ADO.NET System.Data.dll: SqlClient provider and network libs

So how does this happen in SQL Server 2008? For OLE DB and ODBC, there is a new version of SQL Native Client, version 10. Since MDAC is now part of the operating system, if you install SQL Server 2008 over SQL Server 2005, you'll now see three ODBC drivers, on my CTP4 + VS 2008 beta 2 system:

SQL Server          version 2000.86.1830.00
SQL Native Client   version 2005.90.3042.00
SQL Server Native Client 10.0 version 2007.100.1049.14

For OLE DB there are three providers:
Microsoft OLE DB Provider for SQL Server
SQL Native Client
SQL Server Native Client 10.0

For ADO.NET, the situation is a bit more interesting. There's a revision to System.Data.dll to include the new functionality. The new version currently ships with Visual Studio 2008 Beta 2 and its version number currently is 2.0.50727.1378. It simply replaces the version that's installed on the system (and in the GAC) at the time. There's only one ".NET 2.0" version registered in the GAC, as "Version Number 2.0.0.0". Hmmm....seems very "MDAC-like with MDAC as part of the operating system" (that is, the operating system includes .NET 2.0).

If you want to use the new SQL Server 2008 functionality, like date, time, datetime2, dateoffset, table-valued parameters and large UDTs/UDAggs (and perhaps FILESTREAM support when it arrives) with OLE DB or ODBC, you need to use the new driver/provider. This means changing the connection string. AND RETESTING. This also applies to SSIS packages, Reporting Services reports and anywhere else you used OLE DB or ODBC.

If you only install the operating system and SQL Server 2008, you won't have the original (version 9.0) provider and driver. These will be provided as a separate download, for folks that have installed only 2008 but haven't restested.

BTW, once upon a time, there was talk of "multiple versions of SQL Native Client running side-by-side, using the fusion SxS loader". That didn't happen. The new providers are separate and are registered in the registry separately. No SxS magic needed.

Wednesday, September 19, 2007 1:06:39 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

So, Microsoft.SqlServer.Types lives in the resource database and runs in its own appdomain when its used by system functions, like SELECT * FROM sys.assemblies. Let's try an experiment with the following setup.

I have two user assemblies in a database named "test". One doesn't access any .NET types, its called datetest. The new DATE/TIME-related data type series are not .NET-based, but you can use them in SQLCLR procs, as you can use NVARCHAR data type. There are some restrictions on DATE/TIME series, more about that later. My second user assembly is named hiertest. It uses the HierarchyID data type in SQLCLR code. That's (of course) OK too. Both my user assemblies are owned by DBO. Both catalog as SAFE_ACCESS.  There are no user assemblies in pubs database. Turn on SQL Profiler.

In pubs: SELECT * FROM sys.assemblies;

In profiler:
Assembly Load event:
Assembly Load Failed for requested assembly system.enterpriseservices, version=2.0.0.0, culture=neutral, publickeytoken=b03f5f7f11d50a3a, processorarchitecture=x86 with Error code: 0x80070002
In SQL Server Log:
 Common language runtime (CLR) functionality initialized using...
 AppDomain 2 (32767.sys[runtime].1) created
   
Still in pubs: declare @h hierarchyid; select HierarchyID::GetRoot(); -- invoke static method of hierarchyid data type

In profiler:
 Assembly Load Succeeded for Microsoft.SqlTypes.Types
In SQL Server Log:
 Unsafe assembly 'microsoft.sqlserver.types..." loaded into AppDomain 2 (32767.sys[runtime].1)

USE test
GO

In test: SELECT * FROM sys.assemblies;
In SQL Server log:
Assembly Load Failed for requested assembly system.enterpriseservices, version=2.0.0.0, culture=neutral, publickeytoken=b03f5f7f11d50a3a, processorarchitecture=x86 with Error code: 0x80070002
  
In test: EXEC dbo.somedateproc -- my SQLCLR proc that uses date.
In SQL Server log:
 AppDomain 3 (test.dbo[runtime].2) created
In profiler:
 Assembly Load Succeeded for datetimetest
  
In test: SELECT * FROM dbo.AncestorAndSelf('/'); -- my SQLCLR UDF that uses HierarchyID
In SQL Server log:
 Unsafe assembly 'microsoft.sqlserver.types..." loaded into AppDomain 3 (test.dbo[runtime].2)
In profiler:
 Assembly Load Succeeded for Microsoft.SqlTypes.Types
 Assembly Load Succeeded for hiertest

So what happened??

There is only one resource database appdomain, it gets created when system functions (sys.assemblies) use system assemblies. The actual assembly that contains the system .NET data types loads only when it's required, into the resource database's appdomain. By the way, the declaration of a (NULL) hierarchyID variable isn't enough to load the assembly, you must actually use the variable. If you use only TSQL and the new data types, only this one appdomain is needed, regardless of how many different databases use them.

Additonal appdomains are created on a per-database, per-assembly owner basis as in SQL Server 2005. Each "user" appdomain will also load the system assembly Microsoft.SqlTypes.Types, if and only if it needs it. That is, if you use HierarchyID (or Geometry/Geography) in a SQLCLR procedure.

One last bit. Why are the appdomains referred to as: "AppDomain 2 (32767.sys[runtime].1)" and what happened to AppDomain 1? When .NET is loading into any executing process, there is a single appdomain created, the default appdomain. AppDomain 1. SQL Server doesn't load Microsoft.SqlServer.Types into this default appdomain, but starts another for the resource database. That's AppDomain 2 (into the process). The ".1" in "32767.sys[runtime].1" is first user appdomain.

Database administrators like to know about everything going on in "their" database, and with good reason...if the database fails (or even runs slowly) its (s)he who gets the first phone call for help. SQLCLR is still relatively new. Hope this was helpful in explaining what's going on. But remember, the exact appdomain implementation could be refined further in later releases. Cheers.

Wednesday, September 19, 2007 10:25:07 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I'm running with SQLCLR on, because I'd like to see the interaction between system SQLCLR code and appdomains. See the previous blog posting for an explanation of why this doesn't affect my system functions. First, I bring up SQL Server 2008 "fresh", open SQL log in SSMS, and start a profiler trace to catch Assembly Loading events. SQLCLR is nowhere to be seen.

USE TEMPDB
GO

SELECT * FROM sys.assemblies;
GO

The assembly list contains an entry for Microsoft.SqlServer.Types, that's the assembly that contains the system UDTs. It actually lives in the resource database, but shows up in system metadata lists for every database. In the SQL Server log, just listing the assemblies in a database produces:

Common language runtime (CLR) functionality initialized using...
AppDomain 2 (32767.sys[runtime].1) created

Database 32767 is the resource database, although its not for the most part directly visible in SQL Server 2005 metadata or in SSMS.

.NET appdomains are created on a per-database basis, currently one appdomain per assembly owner. So this functionality runs under an appdomain created for the owner "sys". The assembly is actually owned by principal_id 4, which is INFORMATION_SCHEMA according to sys.database_principals. But sys.schemas indicates that the sys schema is owned by principal_id 4 as well, INFORMATION_SCHEMA schema is owned by principal 3. They're likely referring to principal_id in the resource database, not the current database. Sys.assemblies also has this assembly marked as is_user_defined = 0 (false). And, the assembly has a safety level of UNSAFE. Hmmm...

Knowing how SQLCLR exception escalation works, I was concerned by the system assembly being UNSAFE. It is  running in its own appdomain. However, when I looked at the SQL Server log for the SQLCLR message, another seemingly unrelated message attracted my attention:

Using xpstar.dll version 2007.100.1049 to execute extended stored procedure xp_instance_regread...

SQL Server internals have always included extended stored procedures to perform system functions. User-written extended stored procedures can cause problems if poorly written, but this one (xp_instance_regread) was written by the SQL Server team, its part of SQL Server itself. Hmmm...so is Microsoft.SqlServer.Types part of SQL Server. And .NET code has more built in safeguards than unmanaged code. And BOL has indicated since SQL Server 2005 betas:

"This feature (i.e. extended stored procedures) will be removed in a future version of Microsoft SQL Server. Avoid using this feature in new development work, and plan to modify applications that currently use this feature. Use CLR Integration instead."

So I guess its OK, they're taking their own advise. And this assembly has been tested with SQL Server for hardening, it doesn't produce the error message that cataloging, say, System.Runtime.Remoting, does.

One final item. SQL Profiler does not load Microsoft.SqlServer.Types.dll just because I execute "select * from sys.assemblies", but profiler reports:

Assembly Load Failed for requested assembly system.enterpriseservices, version=2.0.0.0, culture=neutral, publickeytoken=b03f5f7f11d50a3a, processorarchitecture=x86 with Error code: 0x80070002

Perhaps there's a stray assembly dependency somewhere in CTP4. System.EnterpriseServices is not on the list of tested assemblies either, so it shouldn't load automatically.

Wednesday, September 19, 2007 10:22:04 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I was listening to a replay of the webcast recording on the HierarchyID by Michael Wang (thanks, Michael) and as he mentioned the considerations for the CLR-based type with respect to DDL, I thought it would be interesting to go back and see how this type showed up in the various facilities that we have for monitoring what SQLCLR is doing. These facilities are:

1. Ability to see assemblies registed in each database
2. Watch code-running appdomains (but not transient DDL-only appdomains) being created/torn down in SQL log
3. SQL Profiler event for Assembly.Load
4. Monitor memory and CLR-related processes using DMVs and perfmon

Before reporting the results obtained with SQL Server 2008 CTP4, we need a few clarifications. First, the way SQLCLR manages appdomains is an implementation detail and subject to change in future releases, service packs, or can even change before SQL Server 2008 ships. I'm just observing. Second, let's talk about what exactly the "sp_configure 'clr enabled', 0" does. You can also set this option using SQL Server Service Area Configuration utility. This option indicates whether or not its possible to run *user-written SQLCLR code*, that is, SQLCLR stored procedures, UDFs, trigger, UDTs, and UDAggs written by programmers.

The switch *does not*:

1. Keep SQLCLR from loading in SQL Server's process. This always loads the first time that you need it.
2. Prevent DBAs from using SQLCLR-related DDL, such as CREATE/ALTER/DROP ASSEMBLY, and define SQLCLR objects.
3. Prevent system functions that use SQLCLR from running. In SQL Server 2005, there were no SQLCLR system functions that I was aware of.

In SQL Server 2008, system functions that use SQLCLR that immediately come to mind include:
   a. The HierarchyID and upcoming spatial data types, Geometry and Geography
   b. Change Data Capture and the Dynamic Management Framework

So, its not that using system functions written in SQLCLR is "a trick" that bypasses an established control mechanism. The reality is that the mechanism was never defined to prevent SQLCLR loading, DDL, or system functions.

One things that will prevent SQLCLR from running is to enable lightweight pooling, or "fiber mode scheduling". This is also a configuration option, and its incompatible with SQLCLR. The CLR is not "fiber aware", although it may be implemented sometime in the future. A few other SQL Server system features are incompatible with fiber mode as well.

Wednesday, September 19, 2007 10:18:01 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, September 02, 2007

I've been trying out LINQ for SQL and the Entity Data Model (EDM) latest betas with some "different" data sources. This post covers using SQLCE 3.5. The latest version of SQLCE 3.5 comes with Visual Studio 2008 Beta 2.

Using SQLCE 3.5 data sources doesn't work with the built-in LINQ for SQL designer in Visual Studio 2008 beta 2 or the CTP EDM designer released shortly after beta 2. You can add a Data Connection for a SQLCE 3.5 database. I used the Northwind.sdf sample database that was supplied. But dragging a SQLCE table on to the LINQ to SQL designer produces "The selected object(s) use an unsupported provider". The EDM Wizard doesn't even list the SQLCE ADO.NET data provider as a choice. So there's no LINQ to SQL or EDM designer support for SQLCE 3.5.

SQLCE works quite well with SQLmetal, although my attempt to generate a .dbml for the Northwind.sdf sample database produces the error message:

error DBML1055: The DeleteOnNull attribute of the Association element 'Order Det
ails_FK01' can only be true for singleton association members mapped to non-null
able foreign key columns.

Removing the "Order Details" table provides a 'hit it with a hammer' solution; I haven't figured out a nicer way to solve this problem yet. Then, the SQLMetal-generated classes worked fine.

Using EDMgen against the Northwind.sdf file produced the following error:

error 7001: Could not load System.Data.SqlServerCe.Entity.dll. Reinstall SQL Server Compact.
Could not load file or assembly 'System.Data.SqlServerCe.Entity, Version
=3.5.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' or one of its depend
encies. The system cannot find the file specified.

Hmmm...the version of SQLCE that came with Visual Studio 2008 B1 did come with this DLL. And, I had it working with EDM then. But the version in VS2008 B2 doesn't. As a last resort, I installed the SQLCE 3.5 version from VS 2008 B1, it's out on the web as a separate download. This changed the error message to:

error 7001: Method 'CreateDbCommandDefinition' in type 'System.Data.SqlServerCe.
SqlCeProviderServices' from assembly 'System.Data.SqlServerCe.Entity, Version=3.
5.0.0, Culture=neutral, PublicKeyToken=89845dcd8080cc91' does not have an implementation.

Sure enough, the required implementation of the ProviderServices class required for EDM support has changed between VS Beta1 and VS Beta2. I confirmed this by checking the methods in the SqlProviderServices in System.Data.Entity.dll, they did change recently, adding CreateDbCommandDefinition. Unless I can find an updated System.Data.SqlServerCe.Entity.dll somewhere, it doesn't look like SQLCE is usable with EDM at this point in beta-time.

That's all for now.

Sunday, September 02, 2007 7:26:04 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

I've been trying out LINQ for SQL and the Entity Data Model (EDM) latest betas with some "non-traditional" data sources. Namely SQL Server 2008 (CTP4) and SQL Server Compact Edition 3.5. I'll cover using SQL Server 2008 CTP in this blog entry, and using SQLCE 3.5 in the next one.

Using SQL Server 2008 data sources doesn't work with the built-in designers in Visual Studio 2008 beta 2. Adding new "LINQ to SQL classes" produces an empty design surface onto which you can drag items from Server Explorer. Problem is, you can't make a Server Explorer Data Connection for a SQL Server 2008 database "only versions 2005 and before are currently supported. The EDM Wizard actually includes "Use this selection to connect to Microsoft SQL Server 2000 or 2005..." when you attempt to create a Data Connection. Not SQL Server 2008. So there's no LINQ to SQL or EDM designer support. Or Server Explorer support, for that matter.

But each of the mapping products comes with a command line tool as well. LINQ to SQL uses SQLMetal, while the EDM uses EDMgen.

Using the command line tools, I was able to generate the appropriate artifacts (and use them) for SQL Server 2008. I can even load the artifacts (.dbml or .edmx) files into the Visual Studio designer after they were generated from the command line and edit them. Very cool.

I then did an experiment to see if LINQ to SQL or EDM supports the new data types in SQL Server 2008, namely, the new date/time-related types and the hierarchyID, which is implemented as a SQLCLR UDT.

SQLmetal did not produce an error or warning on the date/time types, but it did create definitions for the new data types as "NVarChar" with appropriate lengths. Using these definitions in LINQ code prints database-correct values, albeit as strings. Haven't tried an insert/update/delete yet. I had less success with hierarchyID, this produces the message:

warning SQM1021: Unable to extract column 'c1' of Table 'dbo.h1' from SqlServer
because the column's DbType is a user-defined type (UDT).

SQLmetal then error'd out and didn't generate a dbml file at all.

Trying date/time-related data types with EDMgen produces errors indicating that the date, time, datetime2, datetimeoffset aren't supported. Neither are SQLCLR UDTs, hierarchyID is reported as date type <unknown>.

Well, these are still beta. That's all for now.

Sunday, September 02, 2007 7:18:42 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, August 28, 2007

Just found out about this one today. This SQLCLR function works right now in CTP4 of SQL Server 2008.

public static Nullable<int> AddTwo(Nullable<int> x, Nullable<int> y)
{ return x+y; }

create function dbo.addtwo (@x int, @y int)
returns int
as
external name asmname.[Mynamespace.Class1].AddTwo;

select dbo.addtwo(2, null);
-> null

Great! Turns out, I suggested this back in 2005...after SQL Server 2005 shipped. And they listened. Actually, going back over the list from 2005... SSMS does display NULL for a NULL UDT already, as of SP1. And large UDTs and UDAggs are also on the agenda for SQL Server 2008.

That may also explain why there's no System.Data.SqlTypes.TimeSpan/DateTimeOffset in Visual Studio Orcas Beta2. They're not needed.

Tuesday, August 28, 2007 1:33:33 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

11:00AM: I'm sitting in the lounge room at TechEd Hong Kong, getting ready to do my first talk on SQL Server. It's an interesting setup that we have this year, the wireless hub is located next to an area of booths. Lots of interaction (I think) between vendors, attendees, and speakers.

The preconference talk on SQL Server 2005 best practices on Saturday seemed to go really well, with T-SQL query tuning taking center stage. There seemed to be a lot more interest in XML functionality in SQL Server than I've seen previously. As I'm writing this, I'm getting ready to talk on SQL Server SP2 and SQL Server on Vista. The demos are available on the SQLskills website, on the Current Events page.

Yesterday (at least I think it was yesterday, with the timezone difference) the ADO.NET team released the Orcas Beta2 synchronized version of the entity data model (EDM), along with the associated (and much anticipated) designer. More catching up to do...

If you're at the conference, drop by the lounge room and say hi. I'll be there.

Tuesday, August 28, 2007 1:22:33 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, August 20, 2007

Last blog entry on SQL Server Extended Events for a while. But...a couple of questions came up since I wrote my first blog entry on SQL Server Extended Events.
  What are the major advantages to SQL Server Extended Events?
  Is this really using the Crimson event system?

There's a couple of reasons that come to mind as an answer for the first one. First, and maybe foremost, this eventing system has an ETW target and therefore allows end-to-end tracing. ETW is a provider-based tracing system that is integrated throughout Windows. With the providers available ("logman -query providers" from the command line) you could trace from your ASP.NET app (by way of your IIS server), into SqlClient, across the network (providing that you can decode a TDS trace), into SQL Server, and back. That's a lot of power.

The main hassle I've always had with ETW is the dearth of post-processing tools available. You can use the tracerpt utility to process the ETW output into a comma-separated value file, but where you go from there depends on how well how you post-process the CSV. There is a logreader utility that can do simple SQL-like queries against a variety of log file formats. At one point, I made up a simple SSIS job to load SqlClient ETW output into SQL Server to do T-SQL queries, but where you go from there depends on your ability to decode the variable "data" fields in each event. Although the .mof files allow you to decode the binary format into columns based on the data type, CSVs aren't usually self-describing either, you have to know what each bit means.

Next reason why I was intrigued was the granularity of the eventing. At first glance, you can:
1. Create arbitrary groupings/rankings (buckets) on the event data with the bucketizer
2. Pair alloc/dealloc of most any type of resource with the pair_matcher
3. Add extra data (actions) to events. They even added a mechanism to determine causality.
4. Use as many targets as you wish (targets are separate from events and actions)
5. Add events and targets to a running session
6. Specify how much resources (like memory, dispatch latency) your trace should take (see CREATE EVENT SESSION DDL)
7. Use synchronous or asynchronous event collection, and event buffer retention
8. Specify memory partitioning by CPU or NUMA node

One of the good things about a trace is to attempt to balance "intrusiveness" (which slows things down) with thoroughness (you ARE usually tracing because there's a problem, after all).

The other question concerns the Crimson eventing system. Crimson is a really old codename for Windows Unified Eventing (Windows Eventing 6.0). You can collect your events in XML format and it uses an XML config file for registration, hence my possible confusion with the tern "XEvent" which I'd heard used for SQL Server's Exgtended Events. Matt Pietrick describes it as "an attempt to unify event log and ETW tracing". It's available on Vista and Longhorn OS's only. Here's a couple more references:

http://msdn.microsoft.com/msdnmag/issues/07/05/SecurityBriefs/
http://msdn.microsoft.com/msdnmag/issues/07/04/ETW/

Don't search for XEvent like I did, you'll get a lot of hits for the XWindow system XEvent (remember XWindows?). Or Crimson, you'll get a lot of hits on University of Alabama.

All of the articles refer to using the wevtutil utility to list event providers, like logman lists ETW providers. So I installed SQL Server 2008 on Longhorn Server (Windows Server 2008) beta3 and looked for "new" event providers and events. I didn't see any, so I'm not sure that SQL Server Extended Events will register anything more than ETW with Windows Unified Eventing. And the bucketizer/pairer targets write to dynamic management views, not to the event log. There is one more target, the asynch file system target, but that's not in this CTP. So stay tuned, perhaps this is an investing towards future "unified eventing".

BTW, I began to wonder: is unifying the event log with ETW data is really a good idea. Event logs are the "normal" messages that are emitted, tracing seems to me to be a "special occasion" messaging with possible very high message volumes. I don't think the two are actually mixed in the same physical location in Vista/Longhorn, but...what do you think about this unification?

Monday, August 20, 2007 3:54:08 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

This won't be as long of an entry because I'm trying to finish describing the items that you can use in an XEvent session, that is, the items that exist in a package.
  Events
  Targets
  Actions
  Predicates
  Maps
  Types
 
Let's do types and maps. A type is simply a data type, a simple type like Int16 or complex type like 'SOS_context'. Almost all the types live in package0, there's only one each in sqlos and sqlserver packages. Maps are enumerated constants.

See types:
select * from sys.dm_xe_objects where object_type = 'type'

and map:
select * from sys.dm_xe_objects where object_type = 'map'

and legal enumeration (map_key) values:
select map_value, map_key from sys.dm_xe_map_values where name = 'keyword_map'

Predicates: for predicates (think filters in SQL Profiler), you need a predicate source and a predicate comparator.
select * from sys.dm_xe_objects where object_type like 'pred%' order by object_type

There's probably more to it than this, there are customizable event attributes that can be SET in ADD EVENT, and predicates can use event fields for filtering (but not actions). But this will get you started.

One last thing that bears mentioning is the pair_matching target. This target allows you to specify a pair of events (like lock_acquired, lock_released) and after you've run the workload a while, it will show you (the relevent fields in the XML structure exposed by target_data (as in, SELECT CAST(target_data as xml) FROM sys.dm_xe_session_targets;) those events that do not match. That is, the locks that have been acquired but not yet released. VERY cool.

Once again BOL shows an example that takes advantage of knowledge of the pair_matching target's XML data structure. The XML used for target_data appear to be schema-less, i.e. they don't go by a named XML schema, i.e. you have to know what the structure items (elements, attributes, and values) mean. BTW, I keep referring to the BOL because I very much like the info in the BOL, as far as it goes, especially at this early stage. I come to (hopefully) elucidate and expound upon the BOL, not to complain about it. THANKS Buck, Alan, Steve, and all...

You can specify begin and end events, begin and end matching_columns and matching_actions. This is from:

SELECT * FROM sys.dm_xe_object_columns
WHERE column_type = 'customizable'
WHERE object_name = 'pair_matching'

Hope this was useful. Happy event tracing.

Monday, August 20, 2007 2:07:42 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

It's another rainy day in Portland in summer, so I thought I'd stay inside and write more about SQL Server 2008 Extended Events.

I wanted to finish things off by talking about actions and predicates. Need to make a detour at targets, too. I noticed the BOL examples (my point is to try not to repeat things you can find in the BOL) don't contain an example of actions in DDL. So we'll start with them. An action is an additional piece of data that you can tack on to an event. Like a stack trace, or even a causality ID. Or sql_text.

The available actions can be seen with:
SELECT * FROM sys.dm_xe_events WHERE type = 'action'

So let's try sql_text with our existing EVENT SESSION.

CREATE EVENT SESSION PubsLocksETW
ON SERVER
ADD EVENT sqlserver.lock_acquired
   (ACTION (sqlserver.sql_text)
    WHERE sqlserver.database_id = 12) -- (pubs database), add a predicate too
ADD TARGET package0.etw_classic_sync_target
   (SET default_etw_session_logfile_path = N'C:\temp\sqletw.etl' )
GO

But the sql_text doesn't appear in the ETW file. I did this a few times, because I thought I got it wrong. Maybe it will appear in the async file target, which isn't in this CTP. The BOL also says that not every action is valid for every event. Hmmm... the metadata (sys tables) seemed to be happy, but it ain't there. But I can use it with the bucketizer and pairer targets.

The bucketizer makes ...er' buckets (groups) of different "readings" (events) on a single data object. As in, group by lock type or group by cpu time. You can control how many buckets it makes. In fact that bucketizer target needs syntax like the following (from BOL).

CREATE EVENT SESSION MostLocks
ON SERVER
ADD EVENT sqlserver.lock_acquired (where sqlserver.database_id = 12) -- (pubs)
-- this means "create buckets based on object_id (object being locked in this case)"
ADD TARGET package0.synchronous_bucketizer
    (SET filtering_event_name='sqlserver.lock_acquired', source_type=0, source='object_id')
GO

But how did they figure out what to put after "SET"? Where does 'filtering_event_name' come from?

SELECT * FROM sys.dm_xe_object_columns
WHERE column_type = 'customizable'
WHERE object_name = 'synchronous_bucketizer'

There they are... and the description field suggests a use for 'action'.

CREATE EVENT SESSION PubsLockByText
ON SERVER
ADD EVENT sqlserver.lock_acquired
   (ACTION (sqlserver.sql_text)
    WHERE sqlserver.database_id = 12) -- (pubs database)

-- source_type= 1 means buckets by action, not by event
ADD TARGET package0.synchronous_bucketizer
   ( set filtering_event_name='sqlserver.lock_acquired', source_type=1, source='sqlserver.sql_text')
GO

Aha... now I have buckets created based on the text of the SQL statement that caused the lock, rather than by object_id.

This is getting to be too much for a single blog entry. But one last thing for now. You may not have noticed that I switched from using the "package.asynchronous_bucketizer" as the BOL does to using "package0.synchronous_bucketizer". Why? Because I want to do a simple, controlled experiment and I may not want to wait for the buffer to be full and async bucketizer to write out. BTW, for a simple controlled experiment, you can do:

USE pubs
GO

SET TRANSACTION ISOLATION LEVEL SERIALIZABLE
BEGIN TRAN
UPDATE authors SET au_fname = 'bob';
ROLLBACK  -- You don't really want everyone named 'bob', do you?

Leave the EVENT SESSION running to see the buckets. BOL has a cool query against the XML structure but to see the raw XML, if this is your only EVENT SESSION running...

SELECT CAST(target_data as xml) FROM sys.dm_xe_session_targets;


 

Monday, August 20, 2007 1:04:40 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, August 19, 2007

It's summertime in Oregon and its been nice and warm out (75-85F) lately. But today its raining (or specifically, the skies are quite ominous right now), so it's time to write more about my latest favorite subject: SQL Server 2008 Extended Events.

The SQL Server 2008 extended events introduce quite a bit of new terminology, but in investigating the specifics you come across some familiar themes.

Extended events are all contained in packages. An event package is identified by a GUID and a name. Three packages are provided and you can see brief descriptions by querying sys.dm_xe_packages. A package is just a container for all of the other objects (like events and targets) you'll refer to in event sessions. The grouping has no effect of EVENT SESSIONs; you can specify any object from any package in a single event session. The package names are: sqlserver, sqlos, and package0.

Two of the object types that packages contain are events and targets. Events name the information you can collect, these are defined in event_columns. Targets define where the event information is captured and how its processed before being collected. So what can you capture, already?

SELECT * FROM sys.dm_xe_objects WHERE type = 'event'

Only sqlos and sqlserver packages contain events. The events in sqlos are 40 low-level operating system-interaction events, as you might guess. An example is async_io_requested. The sqlserver packages contains over 80 events. These events seems to correspond to SQL Server counters you would see in performance monitor, rather than SQL Profiler trace events that EVENT NOTIFICATIONs use in SQL Server 2005, although there is some overlap. Many of these events only collect one event-specific column, a counter.

You can get a list of all the available events and the event-specific columns they collect by using:

SELECT convert(varchar(55),o.name) as [Object Name]
      ,convert(varchar(25),c.name) as [Column Name]
      ,c.column_id as [Column ID]
      ,convert(varchar(12),c.column_type) as [Column Type]
FROM sys.dm_xe_objects o
JOIN sys.dm_xe_object_columns c
ON o.name = c.object_name
WHERE o.object_type = 'event' AND c.column_type != 'readonly' -- readonly columns are common to most events
ORDER BY [Object Name]

So, to put this all together in an event session, lets use a variation of the BOL example:

CREATE EVENT SESSION test0
ON SERVER
ADD EVENT sqlserver.checkpoint_begin
ADD EVENT sqlserver.checkpoint_end
ADD TARGET package0.etw_classic_sync_target
    (SET default_etw_session_logfile_path = N'C:\temp\sqletw.etl')
go

See the previous blog entry for information about getting ETW working. Note that, in a single event session, we're using items from two different packages, sqlserver and package0. Start the session, using ALTER SESSION, then take a few checkpoints (or produce whatever event you decide to collect), ALTER SESSION to stop the session. Then you transform the (binary) ETL file to a .csv file by using tracerpt.exe.

Sunday, August 19, 2007 12:16:37 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, August 16, 2007

When starting out with XEvent support, I thought it would be good to start with the ETW target, although you can capture and catagorize events in buckets with the async bucketizer target, and pair related events (like obtain lock/release lock) with the pair matching target. Both VERY cool. But I just wanted a raw, vanilla trace, to start out. And I wrote a paper on ADO.NET and ETW once. So easy one first, I thought...

It turns out that you need privileges to start an ETW session. The ETW session is started for you (rather than you using the logman utility and starting it yourself) when you issue an ALTER EVENT SESSION...STATE=START. But mine never started. It always produced:

Msg 25602, Level 17, State 17, Line 1
The target, "CE79811F-1A80-40E1-8F5D-7445A3F375E7.package0.etw_classic_sync_target", encountered a configuration error during initialization.  Object cannot be added to the event session.

BTW, the guid before the name of the package is the package module id. You don't have to use it in CREATE EVENT SESSION...ADD TARGET...

The reason for this (for those of you that aren't reading the Katmai forums) is that the SQL Server service account is used to activate the ETW session. In order for this to work, the service account must be a member of the groups "Performance Monitor Users" and "Performance Log Users". Mine wasn't a member of "Performance Log Users". Make it a member of the group and this target "works a treat", as my UK friends would say. I'm tracing events to ETW as I write this. THANKS to Jerome Halmans for this information, its not yet in BOL that I could find.

I've always installed SQL Server (since 2005) by creating a simple account that's a member of only the USERS group in Windows machine/domain. During install SQL Server gives this account all the privs (and only the privs) it needs. Mostly it does this by creating a group SQLServerMSSQLUser[machine][instance]. But it also makes the user you specify (I call it SQLService) members of groups (like "Performance Monitor Users") when it requires group membership. It's a good idea to pick a service account this way for principal of least privilege, rather than running SQL Server as something else, like Admin or LocalSystem. See the security best practices whitepaper for details.

It's an interesting observation that not all the privs you need are tied to that single group, created at installation. That's (one of a few reasons) why its always best to use SQL Server Configuration Manager to change the service account rather than the "Services" control panel applet.

I don't know if they're going to add "Performance Log Users" to the list of things that the installer does. If they don't add it automatically, don't forget to add it yourself for this feature (that is: ETW target in SQL Server XEvent). And don't forget to point the ETL file to a directory that the service account has permission to write to.

Thursday, August 16, 2007 11:20:33 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

When I'm doing problem solving, its always good to have too much information rather than too little. With this in mind, I was quite interested in looking at SQL Server Extended Events (XEvent support) in SQL Server 2008.

You could always get diagnostic information in SQL Server through DBCC and SQL Trace/SQL Profiler. In SQL Server 2005 there are enhancements to SQL Profiler, dynamic management views (which enhanced and in some cases superceded DBCC information), DDL Triggers, and Event Notifications. There is also a WMI provide for events which uses event notifications internally. Event Notifications are sent to a SQL Server service broker queue and the events that are exposed are the same ones DDL triggers handle and most of the ones that SQL Profiler sees. In addition to all this info, there is an ETW (event tracing for Windows) provider for SQL Server. ETW support also appeared in System.Data.dll and the SNAC OLE DB provider/ODBC driver in SQL Server 2005/ADO.NET 2.0.

SQL Server 2008 adds support for XEvent (was codenamed Crimson), the new event system in Windows. The BOL provides info on this support, which works by creating and activating EVENT SESSIONs with DDL statements. Event sessions deal with items from event packages: events, targets, actions, types, predicates, and maps. You can mix and match the items from different packages in an EVENT SESSION.

So how do you get started? Create an event session (with CREATE SESSION DDL) and add items from the packages to your session (either in CREATE SESSION or in ALTER SESSION). You need at least one event and one target. You start/stop collecting by using ALTER EVENT SESSION...STATE=START/STOP.

- SQL Server 2008 ships with three packages: sqlserver, sqlos, and package0.
- There are lots of events. You can find them in sys.dm_xe_objects where object_type = 'event'.
- There are four targets, three of which work in the July CTP. The ones that work are:
    package0.asynchronous_bucketizer
    package0.pair_matching
    package0.etw_classic_sync_target

The first two targets write their info to sys.dm_xe_session_targets. You can join this to sys.dm_xe_sessions (after starting a session and collecting events) and look around. The fields in these DMVs are doc'd in BOL. Start with sys.dm_xe_session_targets.target_data.

The third target writes to an ETW session/file, providing compatibility with ETW. More on this one next.

Thursday, August 16, 2007 10:49:04 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Thursday, August 02, 2007

I've been looking at the mapping of the new Katmai date/time data types to .NET types, both from the point of view of SQL Server ADO.NET clients and of SQLCLR procedures/functions/triggers. There are a couple of things that "interesting", if I'm not misunderstanding something obvious.

There are no new System.Data.SqlTypes to correspond to the new SQL Server data types. The beta Visual Studio docs state this as a fact. So, unless SQLCLR supports the generic Nullable types as parameters in Katmai, you'll pass these in and out as strings. Or only use them in UDFs with RETURNS NULL ON NULL INPUT. No word yet (that I'm aware of) on Nullable type support in SQL Server 2008, BUT the new HierarchyID type (which is .NET based) supports INullable, like SqlTypes do. So I can pass in a NULL HierarchyID to SQLCLR, but not a NULL TIME parameter.

SQL Server's TIME data type is mapped to TimeSpan, which is a time interval. data type. This might confuse folks that know SQL Server doesn't support SQL-99's date/time interval data types. But the TimeSpan seems to have appropriate semantics as long as you stay away from the "Days" related properties and methods.

There is a new .NET data type for DATETIMEOFFSET, System.DateTimeOffset. That's good.

To distinguish between SQL Server 2000/2005/2008 usage/mappings, there is a (client-side) connection string parameter: "Type System Version". You have a choice of 2008, 2005, 2000, or "Latest". This was already used in SQL Server 2005 to distinguish between 2005 (XML data type exists) vs 2000 (XML data type is a long string). And other distinctions.

I wonder if LINQ for SQL and Entity Framework will support these... ;-)

Thursday, August 02, 2007 2:51:11 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

I've been trying out table-valued parameters along with ADO.NET support in Orcas and came across an interesting dilemma. It centers around INSERTs using TVP against a table with an identity column. ADO.NET can use DataTable, IDataReader or IList<SqlDataRecord> to represent a TVP parameter. Let's say that I want a TVP and a procedure for insert that looks like this:

CREATE TYPE dbo.JobsTableWithIdentity AS TABLE (
  job_id smallint identity primary key,
  job_desc varchar(50),
  min_lvl tinyint,
  max_lvl tinyint
);

The "insert proc" would look like this:

CREATE PROCEDURE dbo.InsertJobsID (@tvp1 dbo.JobsTableWithIdentity readonly)
as
INSERT INTO dbo.Jobs (job_desc, min_lvl, max_lvl)
  SELECT job_desc, min_lvl, max_lvl from @tvp1;

Using this in ADO.NET (with either DataTable or IDataReader as a parameter) produces the error: "INSERT into an identity column not allowed on table variables. The data for table-valued parameter "@tvp1" doesn't conform to the table type of the parameter." But I didn't DO an insert into an identity column in the proc. And this works just fine in T-SQL:

DECLARE @t dbo.JobsTableWithIdentity;
INSERT @t VALUES('hi', 10, 10);
INSERT @t VALUES('hi2', 10, 10);
EXEC InsertJobsID @t;

So its ADO.NET "deciding" this is an error. The ADO.NET workaround (if I do want to start with a DataTable that contains the identity column, add rows to it, and call Update) is this:

CREATE TYPE dbo.JobsTableWithoutIdentity AS TABLE (
  -- job_id smallint identity primary key,
  job_desc varchar(50),
  min_lvl tinyint,
  max_lvl tinyint
);

-- sproc dbo.InsertJobsNoID changed accordingly

// and then, in ADO.NET code
// DataTable "t" contains a real jobs table, to which I've added rows

DataTable added = t.GetChanges(DataRowState.Added);
added.Columns.Remove("job_id");
da.InsertCommand.CommandText = "dbo.InsertJobsNoID";
da.InsertCommand.CommandType = CommandType.StoredProcedure;
da.InsertCommand.Parameters.AddWithValue("@tvp1", added);

But, should I have to do this? Or modify the T-SQL code, given I've not used the IDENTITY column? But, I will need this column to UPDATE (or MERGE) using the TVP. A dilemma...

Thursday, August 02, 2007 1:49:58 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Tuesday, July 31, 2007

Now that Kim and Paul have each posted about it and even published some pictures, I guess its time for me to send out a heartfelt CONGRATULATIONS to them both on the occasion of their recent wedding last weekend. All the best in your upcoming life together!

In a seperate/related announcement, Paul announced that he'll be joining SQLskills at summer's end. Welcome, Paul, it will be fantastic to have you onboard!

Tuesday, July 31, 2007 7:35:40 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I saw Dan Jones' posting that the SQL Server 2008 July CTP (aka CTP4) was available on the Connect website. This one has some good things in it (like the new date/time datatypes as well as the hierarchyid data type) that should keep me busy for a while. There's much more new stuff than that, but the connect website has also the detalis.

One thing that is included that isn't intuatively obvious is ADO.NET client support. The Visual Studio Orcas Beta 2 release contained a new version (well, its still called version 2.0.50727, hmmm...) of System.Data.dll with support for the new date/time data types and also for table-valued parameters. The only problem was that it didn't even *connect* to CTP3 (there was a "network protocol" error). But it connects to the July CTP just fine. One things that not working in Orcas Beta 2 is SQLCLR projects against a SQL Server 2008 database, but typing CREATE ASSEMBLY is a small price to pay.

The new OLE DB provider and ODBC driver have been in place in the last CTP, but this is the first I've seen of .NET client functionality.

There are two items (listed in the readme file) that will not be in the SQL Server 2008 release. One is SQL Server Notification Services, which made its first appearence as an web release add-in to SQL Server 2000. Its not shipping in SQL Server 2008, and (some/most of) its functionality will eventually appear in Reporting Services. Another (removed from the installer) is a less-well known add-in that also debuted as a web release, SQLXML 4.0 (NOT to be confused with the XML data type, which is alive and well and has new xsd:date etc support). This used to be known as the SQLXML Web Release (V1,V2,V3) for SQL Server 2000 and SQLXML 4.0 (mostly) shipped "in the box" in SQL Server 2005. Some of its functionality was superceded by native XML and Web Services support in SQL Server 2005. It will be removed from the installer and shipped as a separate component instead, like all versions previous to version 4.0 were.

Tuesday, July 31, 2007 7:18:05 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 17, 2007

When I'm not busy writing about SQL Server, I quite enjoy reading books about it that look at things from a different point of view. I've been catching up on my reading lately, but getting behind on my book reviews. So here's a few reviews I'd been meaning to write for some time. The authors asked that I review them on Amazon, but I don't like either giving out personal information just to do a book review (too intrusive) or the idea of anonymous reviews/fake personal information (too easy to "stack the deck"). So I'm reviewing them here.

SQL Server 2005 Practical Troubleshooting (multiple authors, edited by Ken Henderson)
  The is a superb book about internals, with each chapter written by a dev, a PSS support person, or a member of the Development Customer Support Advisory team. Who could ask for more of an insiders' view? Maybe I like internals a bit too much, but I keep coming back to this book again and again. My favorite chapters are ones detailing the query cache and the query processor. I was originally put off by the number of proofreading/editing mistakes, but its worth getting past that and concentrating on the content in a hurry. Some excellent troubleshooting utilities are provided as well. A must read.

Expert SQL Server 2005 Development (Adam Machanic, Hugi Kornelis, and Lara Rubbelke)
  I really wrote my "review" for this one on its cover, you'll need to buy the book to read that part. This is not an overview book, nor does it try to be comprehensive in its coverage, but it covers, in depth, topics that you'll see nowhere else. In amazing depth. Absolutely worth every minute you'll spend reading and working the examples.

Pro SQL Server 2005 Service Broker (Klaus Aschenbrenner)
  I haven't received my copy of this one yet, but read a number of chapters as a reviewer before time constraints overwhelmed me. Based on the material that I did review, it's well worth reading. If Roger Wolter's Service Broker book was the seminal treatise on the subject, Klaus expands on the topics and provides quite a few examples and use cases.

Happy reading...

Tuesday, July 17, 2007 1:16:14 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

In just over a month, I'll be at TechEd 2007 Hong Kong. I'll be delivering a preconference talk on (what else) SQL Server 2005 and a number of breakout sessions. Check the SQLskills Upcoming Events for more information. The folks in Hong Kong always put on an excellent show, and I'm very much looking forward to it. In addition, some of my old friends like Jon Flanders and Ron Jacobs will also be there. It should be a good time, stop by and say hi.

Tuesday, July 17, 2007 12:45:43 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 10, 2007

SQL Server 2008 will contain an ANSI SQL standard MERGE statement with extensions. Listening to the webcast last Friday, there's a fairly straightforward way to describe how this works.

Let's go back to first principals. Relational databases support two ways to do UPDATE and DELETE; positioned updates and searched updates. In a positioned update, you open an updatable cursor over a set of rows, navigate to the row you want, and issue an "UPDATE...WHERE CURRENT OF" statement. The searched update (UPDATE foo SET x=y WHERE z=1) is a SELECT and UPDATE in one statement. You specify the rows you want and also update them. The way you specify/gather the rows to update is mostly up to you, SQL Server even supports using a JOIN to do this.

update t
set t.name = s.name, t.age  = s.age
from [target] t
join [source] s on t.id = s.id;
go

MERGE in SQL Server 2008 is a searched operation that can operate on up to three different rowsets depending on how you specify the MERGE statement. These are actually the three rowsets you can get from a FULL OUTER JOIN. The statement:

select [target] t
inner join [source] s on t.id = s.id;

gets the rows in table T with a matching id value in table S. Let's call this rowset1. A left outer join will also include the rows in T that do not match S (rowset2); right outer join includes the rows in T that don't match S (rowset3), and FULL OUTER JOIN contains all three rowsets.

In MERGE, you can get up to all three rowsets (INNER, LEFT, and RIGHT join rowsets)

merge [target] t
 using [source] s on t.id = s.id
 when matched then update t.name = s.name, t.age = s.age -- use "rowset1"
 when not matched then insert values(id,name,age) -- use "rowset2"
 when source not matched then delete; -- use "rowset3"

The query processor will do the appropriate type of join to gather the rowsets that you need, as all clauses are not required. This optimizes performance over multiple statements, as you only have to gather the rows to process once.

Even, if you use only "when matched", MERGE is an improvement over our first "update using a join". If more than one row in the source matches one row in the target...

insert into t values(1, 'Fred', 42)
insert into s values(1, 'Buddy', 43)
insert into s values(1, 'Sam', '95)

The update using a join non-deterministically picks a row in the source to do the update, it could be Buddy or Sam. MERGE throws the following exception.

Msg 8672, Level 16, State 1, Line 1
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to  ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

MERGE can actually do more than three operations using predicates in the "match/no match clauses", but that's it for now.

Tuesday, July 10, 2007 7:42:55 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, June 24, 2007

Many of you have already heard the "hardware store" story. What's the best way to model products in a hardware store, where new items arrive at the hardware store each day. Each item has a "short list" of similar properties (e.g. UPC, price) and a long list of dissimilar properties (e.g. paint has color, type, amount and curtain rods have width, metal, etc). How to model the dissimilar properties for each item in relational table(s)?

This isn't as unusual of a problem as you might think, examples I've heard lately include:
 Items in a directory system (like AD)
 Readings for lab test results
 Attributes for Sharepoint items

I've always thought of the main contenders as:
1. Sparse tables - one per product
2. Sparse columns - 90% of the column values would be NULL
3. Model as XML - similar properties are subelements, sparse properties are attributes
4. Entity-attribute-value (EAV) - also known as open schema. A separate "properties" table with name-value pairs.

EAV is one of the most popular solutions, even supposedly endorsed by standard schemas in some industries. Many relational purists detest EAV because its non-relational. It's main drawbacks are that the "name-value pair" table gets huge fast, with the corresponding lack of performance, the need for careful editing (color and colour would be two different attributes), and the fact that the "value" column of name-value must have a data type of nvarchar or SQL-variant.

SQL Server 2005 added the PIVOT keyword. One use for PIVOT is the change the EAV tables into something that looks like sparse tables.

I even had the opportunity to ask Joe Celko (no fan of EAV) which he prefers, trying to ease him towards the "model as XML" mechanism. He stood up for sparse tables or sparse columns.

SQL Server 2008 will include support for sparse columns. You can designate a column as

SPARSE in the DDL, like this:

CREATE TABLE products (product_num int, item_num int, price decimal(7,2), ...,
                       color char(5) SPARSE, width float SPARSE...)

You can have a huge number of sparse columns per table, although the number of non-sparse columns remains at 1024. In addition, SQL Server 2008 will support sparse indexes (aka filtered indexes) defined like:

CREATE INDEX coloridx ON products(color) WHERE product_num IN (21,22,42...)

Finally, you can have an XML "COLUMN SET" column for each table; this exposes the sparse properties (or perhaps a subset of them?) for each item as a collection of XML elements, for those folks that like to model these as XML.

ALTER TABLE products ADD COLUMN properties XML COLUMN_SET FOR ALL_SPARSE_COLUMNS

It's an interesting idea; the proof will be in the perf as well as the usability.

Sunday, June 24, 2007 8:29:39 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

ORDPATH is a hierarchical labeling scheme used in the internal implementation of the XML data type in SQL Server 2005. It's meant to provide optimized representation of hierarchies, simplify insertion of nodes at arbitrary locations in a tree, and also provide document order. It's described in the academic whitepaper "ORDPATHs: Insert-Friendly XML Node Labels". In addition to being used internally when the XML data type is stored, its also part of the key of the PRIMARY XML INDEX, used to speed up XQuery. It's an implementation/specialization of the path enumeration model of representing hierarchies in relational databases, mentioned in Joe Celko's book "Trees and Hierarchies in SQL".

In SQL Server 2008, there are additional uses of ORDPATH. There is a new system data type HierarchyID, that will likely use ORDPATH in its implementation. This allows simply hierarchies to be represented as relational column and provides methods that optimize common hierarchical operations (like parent, child, sibling, ancestors, descendants) without being concerned about the intricacies of elements and attributes.

In addition to representing and indexing XML and hierarchies, Michael Rys mentioned at his TechEd chalktalk on spatial data that the spatial data types may be indexed using a multi-level grid system and that these indexes would also use ORDPATH. Since neither HierarchyID or spatial types are in the current CTP of SQL Server 2008, we'll have to wait a bit to see if this is truly "ordpath everywhere".

Sunday, June 24, 2007 7:51:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, June 23, 2007

SQL Server 2008 will contain, if my count is correct, 7 new data types. Note: none of these are in the current CTP.

DATE - ANSI-compliant date data type
TIME - ANSI-compliant time data type with variable precision
DATETIMEOFFSET - timezone aware/preserved datetime
DATETIME2 - like DATETIME, but with variable precision and large date range

GEOMETRY - "flat earth" spatial data type
GEOGRAPHY - "round earth" spatial data type
HIERARCHYID - represents hierarchies using path enumeration model

The first four (date/time series) are NOT implemented/exposed as .NET system UDTs, but the last three are exposed as .NET system UDTs. This means that, for the first time, .NET will be used as part of SQL Server. In SQL Server 2005, there were originally DATE and TIME data types implemented in .NET, but implementing temporal data is an intricate process. After much wailing and gnashing of teeth by some members of the user community, these were removed.  One of the complaints was the .NET implementation. Hmmm...

One of the nice side-effects of implementing spatial and hierarchyid as .NET types is that these will be shipped as a separate assembly, and that the types will be available for client-side and middle-tier use as well as in the database. So if you want to do some massive number crunching of spatial sequences on a computation server and the network traffic from database to computation server is acceptable, you can do so.

In general, SQLCLR makes the "logic in database or middle-tier" argument easier to deal with. Although there's no "run on database or run on server" switch in VS, with minimal code changes you can move your logic, or even duplicate the logic between tiers if need be. You can't do this with T-SQL; although its faster and better for data access on the database, it doesn't run outside the database. Unless you want to use SQL Server Express Edition as an application server. But that's a discussion for another time.

Saturday, June 23, 2007 10:03:25 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Had to write about another thing that "caught my ear" at TechEd during a chalktalk by Rick Negrin about Service Broker usage patterns.

Service Broker supports "internal activation", that is, associate a stored procedure that gets invoked when a queue has messages to process as well as "external activiation". When using external activation, an event notification occurs when a queue has messages to process, and this notification is picked up by an external application; the external application processes the queue messages, out-of-process to SQL Server.

There is an SSMS template (see template explorer) for an internal activator procedure and an engine sample implementation of a class library to support SQLCLR activator procs. In addition, Remus Rusanu has written some excellent blog entries on activation program