Saturday, July 19, 2008

When SQL Server creates a CLR appdomain to run code in (a runtime appdomain as noted in the previous blog entry), the appdomain normally stays in place for the lifetime of SQL Server. This is done to save appdomain create/teardown and assembly load time. Note that DDL appdomains, as opposed to runtime appdomains, are torn down immediately after they are used. A friend of mine recently wanted to shutdown an appdomain on purpose to troubleshoot a problem that he thought might have been SQLCLR-related. So how do you shutdown a runtime appdomain on purpose?

You could write a .NET proc to call AppDomain.Unload. But I shied away from this for a few reasons. You'd need to catalog the appdomain as unsafe for the proc to work, which means marking database as trustworthy or doing the 'signed assembly with key in master' dance. And I'd really prefer a way to have SQL Server gracefully shutdown the appdomain itself.

SQL Server will shutdown an appdomain for different reasons. It can shut them down under extremely low memory conditions or when there is a serious enough unhandled exceptional condition (e.g. unhandled exceptional condition that could leave .NET locks in place). We really don't want to cause either of these on purpose just to shut down an appdomain. Another reason the SQL Server will shut down an appdomain is when a loaded assembly is altered. You can use the ALTER ASSEMBLY DDL statement to replace code in place, subject to limitations. When you alter an assembly in place SQL Server recycles the appdomain to be able to use your new code. Currently executing code will continue to use the appdomain until the call completes; new requests are routed to the new appdomain (with the updated code). When all current requests against the old appdomain complete, the appdomain shuts down. Hmmm, probably not a good idea to muck with recompiling the producting code either. So...

Compile a simple do-nothing assembly with a simple do-nothing function (say, add two numbers together). We'll call the assembly 'fred' and the function 'addtwo'. The assembly must be owned by the same owner as the appdomain you want to recycle (remember runtime appdomains are on a per database and assembly owner basis). So if the appdomain we want to recycle is the 'pubs.dbo[runtime]' appdomain...

use pubs
go

create assembly fred authorization dbo ...
create function dbo.addtwo ...
use the function dbo.addtwo (this causes the assembly to be loaded)
recompile the assembly fred
alter assembly fred ... (this cause the eventual appdomain unload)

Note that you don't have to change the 'fred' assembly, only recompile it. SQL Server decides that an assembly is changed if it has a different MVID (.NET assembly module version identifier). New MVIDs are assigned each time an assembly is recompiled (note that an MVID is not the same as a four-part assembly version number). In Visual Studio, you'd recompile the assembly by using the Recompile menu entry, not the Build menu entry. If you try and run alter assembly without a rebuild you'll get the error message 'ALTER ASSEMBLY failed because the source assembly is, according to MVID, identical to an assembly that is already registered under the name "fred"'. You could conceivably use Visual Studio autodeploy for these steps as well, but Visual Studio autodeploy drops the functions and assembly and recreates them rather than using ALTER ASSEMBLY.

The next request against any of the .NET database objects will cause a new appdomain to be created.

Saturday, July 19, 2008 9:45:23 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Friday, July 18, 2008

.NET (and therefore SQLCLR) divides up running its code (even within the same process like the sqlserver.exe process) into appdomains. The appdomain is like a lightweight process used to enforce isolation between running .NET code within the same Windows process. SQLCLR (.NET code running in SQL Server) uses appdomains to isolate execution of .NET code on a per database and per assembly owner basis.

SQLCLR uses appdomains for two reasons: for running .NET user code like functions and procedures, and for running DDL to create and alter assemblies. You can see appdomains being created and destroyed in the SQL Server log, as well as query the current appdomains by using the sys.dm_clr_appdomains dynamic management view.

In SQL Server 2005, the SQL Server log only showed user code-running appdomains, the messages look like this:

AppDomain 4 (testdb.dbo[runtime].3) created.
AppDomain 4 (testdb.dbo[runtime].3) is marked for unload due to common language runtime (CLR) or security data definition language (DDL) operations.
AppDomain 4 (testdb.dbo[runtime].3) unloaded.

Is these messages there's a appdomain being created for running code in the testdb database for assemblies owned by dbo. It's a runtime appdomain. The last two messages show the appdomain (some time later) being unloaded.

In SQL Server 2005 it was difficult to "see" DDL appdomains, because their presence was not logged in the SQL Server log. You could deduce they existed by noting that the AppDomain numbers (e.g. AppDomain 4 above) seemed to contain skips in the number range. Or if you were extremely lucky, by querying sys.dm_clr_appdomains at *exactly* the right time.  In SQL Server 2008, it is. You don't see them being created, but I just executed some DDL to create a SQLCLR assembly and SQL Server log reports:

AppDomain 5 (testdb.dbo[ddl].4) unloaded.

This should make it a little easier to diagnose appdomain-related problems or see when assemblies have been created or altered.

Friday, July 18, 2008 3:42:38 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, July 17, 2008

I've been able to coordinate being home and the Portland (Oregon) SQL Server user group's meeting schedule, so next week I'll be speaking "at home" for a change. I'm giving a talk on one of my favorite features for developers in SQL Server 2008, Spatial Data support. Because it's relatively common these days to see GPS not only in cars but also in "carry along" devices, cell phones, cameras, and other hi-tech toys I don't personally possess, I called this talk "SQL Server spatial data for the masses". If you're at all interested in SQL Server or storing, processing, and visualizing spatial data and databases, I'll see you there.

WHEN:
Thursday, July 24th - 6:30 PM

WHERE:
SQLSoft+
1500 NW Bethany Blvd.
Suite 285
Beaverton, Oregon OR 97007

You may want to RVSP to Ken Starnes at kstarnes@kdsa.com so he can save you a seat.

Thursday, July 17, 2008 11:01:56 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 08, 2008

I've noticed that some folks have written PowerShell scripts that execute against a list of servers. In the scripts, they read the names of the servers from XML files. But the SQL Server PowerShell provider in RC0 has a useful "component" called SQLRegistration; its "path" is SQLSERVER:\SQLRegistration that they can use instead.

This path permits enumeration and manipulation of the (SQL) servers and server groups that are defined using SQL Server Management Studio. The SQLRegistration path is not specific to server or instance, but to the SSMS user that's signed on. For example, on the same physical machine, SQLRegistration for the Windows user bobb reflects bobb's settings in SSMS. SQLRegistration for user mary (on the same machine) would reflect mary's SSMS settings. There is no SQLSERVER:\SQLRegistration\{machinename}\{instancename} path, just SQLSERVER:\SQLRegistration.

SQLRegistration contains two "subdirectories", called "Central Management Server Group" and "Database Engine Server Group". These are equivalent to SSMS' "Central Management Servers" and "Local Server Groups" folders, respectively.

So rather than write custom XML files that hold information about groups of servers you can use the built-in Registration store. You can import and export registration information to keep team members' view of SQL Servers and groups consistant using SSMS.

Because I've been writing PowerShell scripts against the provider's Policy and DataCollection stores, I thought it would be fun to program SQLRegistration. I found the Microsoft.SqlServer.Management.Registration namespace in BOL (which is RC0 is really called Microsoft.SqlServer.Management.Smo.Registration, BOL says this will change before RTM) and was about to code against the RegisteredServer and ServerGroup classes, when I realized that Registration was a simple hierarchies of servers and groups. Just like directories and files. And the built-in groups each have a "mode" property that's value is "d". Registered Servers have a blank mode property. Hmmm...

So, at the PowerShell prompt
> cd 'SQLSERVER:\sqlregistration\Database Engine Server Group'
> new-item MyNewGroup

creates a new directory (ServerGroup) named MyNewGroup. Well then, a registered server should be a file.

> new-item MyNewServer -itemtype file

New-Item : SQL Server PowerShell provider error: This provider only supports the
creation of new Server Groups or Server Registrations. Please specify "directory"
to create a new Server Group or "registration" to create a new Server Registration.

So this should work?

> new-item MyNewServer -itemtype registration

New-Item : SQL Server PowerShell provider error: Please specify a connection string using the -Value parameter.

> new-item MyNewServer -itemtype registration -Value "server=mynewserver;integrated security=true"
>

Yep. This creates a new server registration using the connection parameters that you specify. How cool is that? No other part of the SQL SERVER PowerShell provider that I'm aware of supports the new-item cmdlet/operation. Because the SQLRegistration represent fairly simple items and hierarchies new-item is supported here. No custom programming needed (although you can use the SMO classes if you'd rather), just treat it as you would file system.

A couple of caveats. In RC0 SSMS and PowerShell SQLRegisgtration sync up when you open SSMS but if you're using both at once, the sync to SSMS isn't immediate. Sometimes refresh doesn't appear to show the changes you made in PowerShell. Eventually it will show them if you refresh multiple times or restart SSMS. And after deleting a Server Registration in SSMS, I still had to run "rm MyNewServer" in PowerShell to get rid of it there. Also, I use "integrated security=sspi" rather than "integrated security=true" in my ADO.NET connection strings. This seems to bother SSMS which reports an error in configuration. Use "true" rather than "sspi" when creating registrations through PowerShell.

Monday, July 07, 2008 11:23:58 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I received a question today about whether I'd converted my Policy-Based Management examples using SMO (see the multi-part "Programming Policy-Based Management with SMO" series, starting here) from C# to PowerShell yet. I did do this a while ago; they're available as a script download on the SQLskills website (look on the "Past Conferences" page under TechEd 2008). But...

Since then RC0 has changed PBM a bit. The multipart name policy can't use ExecutionMode.Enforce any more, so I changed it to ExecutionMode.None which equates to "On Demand" in the SSMS dialog. There was a change to one of the enumerated constant names too. So the updated scripts for RC0 are posted as part of this blog entry.

As long as I was working on parts of the PowerShell provider for SQL Server and SMO, I decided to put together an example of using the DataCollection APIs as well. The script creates a custom DataCollection Collection Set. That's pretty straightforward. Using the CollectionStore instance just use the correct location in the provider hierarchy and get a CollectionStore instance.

# set a collectionstore object for the default instance on local machine
$colpath = ('SQLSERVER:\DataCollection\' + (get-item env:\computername).Value + '\default')
$col = get-item $colpath

Then create a CollectionSet instance with (your CollectionStore $col is specifed as a parameter to new-object) and one or more CollectionItems in the CollectionSet. Then set the CollectionSet and CollectionItem proprties and call Create. This script enclosed as well, named CreateCustomCollectionSet.ps1.

As an aside, I've enclosed a script called start_smo_sql.ps1. This is a script that I call at PowerShell startup if I know the SQL Server provider is already installed. It sets convenience variables for long SMO namespace names, paths to the default Server, CollectionStore, etc. It's not invoked from my main PowerShell profile %UserProfile%\My Documents\WindowsPowerShell\profile.ps1 which is executed by all PowerShell shells/this user. Instead it's invoked from %UserProfile%\My Documents\WindowsPowerShell\Microsoft.SqlServer.Management.PowerShell.sqlps_profile.ps1. This profile is only executed when I'm using SQLPS.exe, the SQL Server-specific custom shell. It's quite cool that PowerShell accomodates separate profiles for custom shells.

The toughest part in creating a custom collection set is that the XML schema for collection set properties has been updated in one of the CTPs, but not updated in Books Online RC0. The updated XML schema can be obtained by executing:

use msdb
select * from syscollector_collector_types

and using the parameter_schema column to get the schema you need to figure out how to create a validatable parameter. My T-SQL custom collection set used a parameter with an XML namespace on the root element (only) that looked like this:

<ns:TSQLQueryCollector xmlns:ns="DataCollectorType">
        <Query>
          <Value>select * from sys.dm_exec_query_stats</Value>
          <OutputTable>dm_exec_query_stats</OutputTable>
        </Query>
</ns:TSQLQueryCollector>

ps_demo_scripts.zip (3.23 KB)
Monday, July 07, 2008 11:11:57 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, June 09, 2008

Last week at TechEd Developers, I gave a talk on PowerShell and SQL Server. I mentioned some upcoming changes in RC0, and have just had a chance to check them out.

The PowerShell provider for SQL Server has been expanded to handle not only a "SQL" subdirectory (which enumerates database objects) and "SQLPolicy" (which enumerates the policy-based management objects) but also two new "directories": SQLRegistration and DataCollection.

SQLRegistration covers the groups and members of "Registered Servers" and before you ask, yes this is the same Registered Servers that show up in SQL Server Management Studio. So if you're managing hundreds of servers, you can easier script against any server or group of "Registered Servers". You don't have to navigate to a different machine and instance by referencing the name, as I described in an earlier blog entry. Be aware that the registration service takes a while to refresh registrations. If you define a new registered server in SSMS, it won't show up in the PowerShell list immediately. Just wait a minute or so...

The DataCollection "subdirectory" allows you to enumerate the database objects and properties for the data collection feature that is used to populate and control the performance data warehouse in SQL Server 2008. You navigate through these collections/values the same way you'd navigate SQL and SQLPolicy. This one isn't doc'd as being part of the PowerShell provider in  BOL yet.

A couple of other miscellanous changes I noticed:
1. There are no longer separate custom drives for SQL and SQLPolicy as in previous versions. If you liked these custom drives its easy to add them:
New-PSDrive -Name SQL -Root SQLSERVER:\SQL
New-PSDrive -Name SQLPolicy -Root SQLSERVER:\SQLPolicy
..etc

2. Collection Names are no longer case-sensitive. So:

dir SQLServer:\SQL\{Computer}\{Instance}\databases

(where you replace {Computer} and {Instance} with real names) works now with "databases" being case insensitive.

Monday, June 09, 2008 1:59:08 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

SQL Server 2008 RC0 comes with a Readme file and a Release Notes file. It's always good to read both. In this RC, the readme file contains information on installation and upgrade, and the Release Notes file contains information about things that have changed from previous releases.

In RC0, they've changed the syntax/format of a few relevent new features. The changes affect T-SQL MERGE, the HierarchyID data type's methods, and the Geometry data type's Latitude/Longitude order. These won't be doc'd in BOL under What's New, because the What's New section lists new features, not features that have changed since the last CTP. Reading up on features that have changed will (hopefully) reduce the number of "why doesn't this code example that I downloaded from a reputable source work?" questions.

In addition to the two new performance features I just blogged about, RC0 also contains the new "Spatial Builder" API. Isaac has an excellent blog post about this API, with a (now-)working example. Check out his blog, I didn't find this feature mentioned in either "What's New" or "Release Notes".

Monday, June 09, 2008 1:20:39 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

There's a newer version of SQL Server 2008 (Release Candidate 0) available on MSDN and Technet.

Another performance-related feature (actually its additional information) allows you to have better visibility into performance-affecting queries. This information is available as columns query_hash and query_plan_hash in the sys.dm_exec_query_stats and sys.dm_exec_requests DMV. There's a really nice illustrative example in Books Online, so I won't repeat it here. You can use the queries in my last blog post to experiment with this as well.

The queries:

SELECT title_id, sum(qty) from sales
group by title_id
having sum(qty) = 30

and

SELECT title_id, sum(qty) from sales
group by title_id
having sum(qty) = 40

are similar enough that they could be parameterized (although they aren't autoparameterized). Parameterization would save query plan entry storage, rather than allocating a separate plan for each incantation of the similar query. You can also use this feature, along with the additional information in sys.dm_exec_query_stats to produce cumulative statistics (such as avg IOs, elasped time, etc. BOL has a nice query to accumulate similar plans, the relavent part is simply "GROUP BY query_hash".

But because you have similar queries, should you always have a parameterized query? What if "...having  sum(qty) = 30" has a completely different plan than "...having sum(qty) = 130"? You can find this information by looking at query_plan_hash. If two plans have similar structure but a different query plan, the query_hash value will be identical but query_plan_hash will be different. That's an indication that parameterization might not be the way to go; the first query to executed will cache it's plan and the other query will use the same plan. This is because of a SQL Server behavior known as parameter sniffing.

Because parameterizing queries is usually one of the first things programmers can do to affect performance, it helps to know when to parameterize. In addition, many folks like to start query tuning by tuning the queries (including similar queries with different parameter values) that are executed most frequently. Query_plan_hash and query_hash gives you visibility into this important information.

Monday, June 09, 2008 1:02:36 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

There's a newer version of SQL Server 2008 (Release Candidate 0) available on MSDN and Technet.

There's a couple of features that show up in RC0 that are performance related. One is a server setting 'optimize for ad hoc workloads'. This one tells SQL Server to save a "compiled plan stub" in the query cache for adhoc queries, rather than the query plan that's usually saved.

You can see these stubs as cacheobjtype 'Compiled Plan Stub' in sys.dm_exec_cached_plans. And you can see them because trying to resolve a plan_handle to a query plan (using sys.dm_exec_cached_plans or sys.dm_exec_query_stats) doesn't work for stubs, like it does for "real" plan handles. Saving a stub rather than the entire plan saves precious plan cache memory.

Here's a little script to try it out.

sp_configure 'show advanced options',1
reconfigure
go
sp_configure 'optimize for ad hoc workloads',1
reconfigure
go
-- don't use on a production system, clears whole query cache
DBCC FREEPROCCACHE
go
use pubs
go
-- background plans to produce this DMV, note there is a stub for one of these plans
select * from sys.dm_exec_cached_plans
go
-- not autoparameterized, stub produced
SELECT title_id, sum(qty) from sales
group by title_id
having sum(qty) = 30
go
-- you can see the stub
select * from sys.dm_exec_cached_plans
go
-- autoparameterized, whole query plan for parameterized version
SELECT *
FROM titles
WHERE price = 19.99
go

-- stub for the non-parameterized version, plan for the parameterized version
SELECT usecounts, cacheobjtype, objtype, [text]
FROM sys.dm_exec_cached_plans P
   CROSS APPLY sys.dm_exec_sql_text(plan_handle)
   WHERE [text] NOT LIKE '%dm_exec%'
ORDER BY p.usecounts DESC
-- query plan for stub query handle is not saved returns NULL
SELECT sql.text, p.query_plan
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(sql_handle) sql
CROSS APPLY sys.dm_exec_query_plan(plan_handle) p
WHERE text NOT LIKE '%sys_dm_exec%' AND text NOT LIKE '%msparam_0%'
ORDER BY qs.EXECUTION_COUNT DESC
go

-- set it back, same experiment produces plans not stubs
sp_configure 'show advanced options',1
reconfigure
go
sp_configure 'optimize for ad hoc workloads',0
reconfigure
go

Monday, June 09, 2008 12:40:47 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Friday, May 30, 2008

In a blog posting about a few weeks ago, I'd written about noticing a DataSet being created in an Entity Framework program by using the ADO.NET client trace facility. Entity Framework programs do (indirectly) cause a DataSet to be created, but its only for resolving the ProviderInvariantName of the underlying data provider. NOT for anything related to the functioning of the EntityClient or ObjectContext in any way. I've revised the original blog posting to remove the DataSet reference, but just in case you already read it....

The connection string for the EntityClient data provider contains a parameter that refers to the ProviderInvariantName of the underlying data provider (e.g. System.Data.SqlClient). Entity Framework validates the underlying data provider name by using the ADO.NET provider enumerator class DbProviderFactories. DbProviderFactories creates the DataSet to store the provider information obtained from the machine.config file.

I might have guessed this...as I'd written a whitepaper on the functionality of the ADO.NET provider model, "Generic Coding with the ADO.NET 2.0 Base Classes and Factories" not more than a few years ago. Using the DbProviderFactories class is the only way to enumerate the ADO.NET data providers installed on your machine.

Come to my "End-To-End Tracing with SQL Server 2008 and ADO.NET" talk at TechEd next week and I'll show you the trace.

Friday, May 30, 2008 3:48:41 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

I'm getting ready to head out to TechEd in Orlando. I'll be doing a full day of SQL Server 2008 for Developers on Monday, a talk on SQL Server client and server-side tracing (including SQL Server 2008 Extended Events), one on PowerShell and SQL Server, and finishing up with a talk on SQL Server Security for Developers and Architects. I'll also be hanging around the SQL Server Technical Learning Center (on the showfloor) at least part of the time I'm there.

If you're around, stop by and say hi...

Friday, May 30, 2008 2:38:13 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

It's a good idea when talking to a database to save on network roundtrips. The table-valued parameter in SQL Server 2008 is an example of a feature that can reduce them in the "1 order, 1-n detail items" use case.

So its always been mildly irritating that in order to insert a row with a filestream column, you'd need to make 2 database roundtrips. One roundtrip is to execute the INSERT statement, inserting an empty value in the filestream column. This causes the file to be created, a NULL value won't cause file creation, and you need the PathName to open a file handle or use the new SqlFileStream .NET class. The second roundtrip is used to get the FILESTREAM_TRANSACTION_CONTEXT and PathName. Then you have the info you need for the handle/SqlFileStream.

However, with the OUTPUT clause introduced in SQL Server 2005 you can do it all in one roundtrip. Here's the SQL statement, based on the BOL filestream example. You can run this code using an ADO.NET DataReader to retrieve the values, and construct a SqlFileStream instance to write to.

INSERT INTO dbo.student
OUTPUT Inserted.Resume.PathName(), GET_FILESTREAM_TRANSACTION_CONTEXT()
VALUES (newid (), 'Mary', CAST ('' as varbinary(max)))";

Of course, if you're writing a large value into a file anyway, the savings of one database roundtrip in the overall scheme of things is questionable, but using minimum roundtrips is a good habit to get into in any case.

Friday, May 30, 2008 10:29:50 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Friday, May 16, 2008

This item was mentioned in other blogs too, but just to mention it again (so my SQLCE and Sync Services friends won't think I overlooked it)...

Visual Studio 2008 SP1 Beta contains direct support for using SQL Server 2008 Change Tracking. When you use ADO.NET 1.0 Sync Services with a SQL Server 2008 database, the Sync Services designer (that's Add/New Item/Local Database Data Cache) adds a checkbox that allows you simply to "Use SQL Server Change Tracking". No extra triggers, no tombstone tables, change tracking does it all for you. Visual Studio generates some scripts to enable change tracking at a database level and at a table level for the tables you select.

There's some nice generated code that warns you if you need to resync because you haven't synchronized often enough. And you can tweak the SQL scripts for more control. You can specify a CHANGE_RETENTION (the default is 2 days) and whether or not AUTO_CLEANUP is ON (it's on by default) in your ALTER DATABASE statement. And if you want to write your own, more granular synchronization or conflict resolution code, you can specify WITH TRACK_COLUMNS_UPDATED = ON (default is OFF).

Something that puzzled me was, although the SQL Server Books Online states: "Using snapshot isolation is the easiest way for you to help ensure that all change tracking information is consistent. For this reason, we strongly recommend that snapshot isolation be set to ON for the database.", the ALTER DATABASE DDL generated by Visual Studio did not include the option to SET ALLOW_SNAPSHOT_ISOLATION ON. Enabling snapshot isolation in SQL Server does involve a non-trivial amount of overhead, so it looks like they set it to the least overhead option and let you decide if you need snapshot isolation after reading the arguments/scenarios where it's useful in Books Online. Probably a wise choice.

Synchronization Framework is an exciting technology, given that there's so many possible uses of synchronization with "Local Data Caches" and, with the support for Sync Services in mobile devices. The next release of the Synchronization Framework (2.0, there's a beta out already) even contains support for Peer-to-Peer synchronization through the model. Perhaps this will fit some additional "replication via services" in databases. We'll have to wait and see if the Peer-to-Peer sync provider even supports databases, I guess. Always something interesting going on...

Friday, May 16, 2008 3:12:43 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Last fall I'd asked the folks on the data access team about the possibility of including the new SQL Server 2008 data types in the new data access object mapping layers. At that time they said "no" but they also asked "why do you want it?" and "are people expecting it?". Someone must have had some compelling arguments.

I'm sure you've read this in other blogs, but in Visual Studio 2008 SP1 Beta, which arrived Monday, there was...drumroll please...support for all four date and time types in not only LINQ to SQL, but in ADO.NET Entity Framework. Thanks! I'm still a bit surprised about the inclusion of DateTimeOffset in EF (its a SQL Server-specific data type and EF is more platform-neutral than LINQ to SQL), but it was a cool surprise to have them there. 

Everyone's blog entry also points out there is some special support for Filestream storage (including a System.Data.Types.SqlFilestream class), but they identify Filestream as a data type. Filestream is a storage format rather than a data type, the actual data type is varbinary(max). Minor terminology nit. You can read and write FILESTREAM based types without resorting to PInvoke. Cool.

Of course, there were some SQL Server 2008 data types (the SQLCLR-based types) that aren't "in there". These are the spatial types (Geometry and Geography) and the HierarchyID data type. Perhaps for EF/LINQ to SQL version 2. These types are correctly identified in Server Explorer in a table but attempting to add a table containing them to an EF diagram fails (it actually leaves the type out of the diagram) and in LINQ to SQL throws an error "the data type is not supported".

That means if you have one of these types in your SQL Server 2008 database, you need to do something funky like treat it as a varbinary(max) and deserialize it (think IBinarySerialize) on the client yourself. Because the SQLCLR-based data types would have a one-to-one mapping to .NET classes (Geometry = Microsoft.SqlServer.Types.SqlGeometry, Geography = SqlGeography, HierarchyID = SqlHierarchyID) these would possibly be straightforward to implement in the next release. They WOULD be a potential model-breaker in that you would have "columns" (entity properties) that have their own properties and methods. Another whole level of indirection.

I've said before that IMHO, spatial support is the "killer developer feature" in SQL Server 2008. Let's hope for more API support in future. But THANKS for the inclusion of the date and time types.

Friday, May 16, 2008 2:43:32 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

After posting the last blog entry, I realized that I hadn't blogged in about a month. Hmmmm...

Well, since last month I posted the demos for my SQLConnections talks on the SQLskills website (under PastConferences). I really had a good time at SQLConnections, its been a while since I'd been invited to speak there. Thanks to those folks who showed up and said hi.

I've been updating my classes for SQL Server 2008 while waiting for the next CTP. Don't know when that's coming, but everyone's guess is "sometime in June". And I've found a few more neat things in SQL Server 2008 that I'll get around to writing about some day soon. And did some experimenting with the Visual Studio 2008 SP1 Beta (see previous post).

And I've done a lot of yardwork and gardening. I've been outside almost every day; might even lose that pastey-white "programmer who never sees the sun" look. But its 97 degrees in Portland today (yes, 97, really) and its a bit hot for gardening.

Next stop...TechEd Developer Week in June. I'm doing a precon there on SQL Server 2008 and a couple of other interesting talks. Wonder if it will be 97 degrees in Orlando... If you're in Orlando stop by and say hi.

Friday, May 16, 2008 2:20:37 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Every once in a while I'll get inquires on a paper I wrote in 2004 about the ETW trace providers for ADO.NET (named "Tracing Data Access"). I got an inquiry today, and since I'd recently installed Visual Studio 2008 SP1 Beta, I thought I'd try it out on a LINQ to SQL program and an Entity Framework program.

If you haven't read the paper, the original is still available on MSDN, but an embellished version (including SNAC tracing in addition to ADO.NET) was released in 2006. I'm also doing a talk at TechEd on "End-to-End tracing in ADO.NET and SQL Server 2008". If you're at all interested in client tracing or SQL Server 2008 Extended Events, it will be worth your while to attend.

Unless the LINQ to SQL program has its own ETW provider that I missed, I got no LINQ to SQL specific trace events. Since LINQ to SQL is a thin layer over System.Data.SqlClient, I do get the events for SqlClient however.

Entity Framework however, is another matter. I not only get EF-specific trace events, but there appears to be some new trace namespace abbreviations for EF as well. These are the ones I've seen while going through one trace of one simple EF program. I may have missed some.

dobj = System.Data.Objects
ec   = System.Data.EntityClient
cqt  = System.Data.Common.CommandTrees
pc   = System.Data.Query.PlanCompiler

There's quite a bit of trace information for some parts of EF, like the CommandTree portion. But after a whole half-hour and a few traces, it appears that not all items in EF are as thoroughly instrumented with trace events as, say, SqlClient. On the other hand, EF provider for SQL Server will show lots of SqlClient-specific and System.Data.Common events, as does LINQ to SQL.

Friday, May 16, 2008 1:37:31 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Thursday, April 17, 2008

I'm heading off to SQLConnections in Orlando early tomorrow morning. I'll see starting with a preconference "Day of SQL Server 2008 for Developers" and covering, well, every new feature that could interest a developer. I'll also being doing a series of talks about everything from Spatial data support to Extended Events to PowerShell in SQL Server to XML for DBAs.

If you'll be in Orlando for the conference, stop by and say hi.

Thursday, April 17, 2008 6:42:51 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Tuesday, April 08, 2008

One of the uses for sparse columns will likely be replacing entity-attribute-value designs. In an EAV design, the "sparse attribute" values are represented by a separate table. The attribute value can either be declared as (n)varchar or sql_variant. An example I've always used is the hardware store that wants to keep track of its products in a "products" table, but each product has different attributes. If you use the minimum number of columns to simplify the example, the EAV design looks like this.

create table products -- base table
(
 id int primary key,
 name varchar(max)
)
go
create table properties -- sparse attribute (name-value pair) table
(
 id int,
 name varchar(50),
 value varchar(max),
 CONSTRAINT PK_properties PRIMARY KEY (id, name),
 FOREIGN KEY (id) REFERENCES products (id)
)
go

Here's a straightforward way to convert the table to use SQL Server 2008's sparse columns. It uses dynamic SQL, but in this case there's no user input (SQL injection worries).

declare @tab nvarchar(max),
        @sql nvarchar(max)

set @tab=N'create table products2 (id int primary key, name nvarchar(max) '
select  @tab=@tab+','+ name + ' varchar(max) sparse' from properties
group by name
set @tab += ' ,col_values xml column_set for all_sparse_columns);'
-- select @tab
exec(@tab)

Populating it is also straghtforward using the pivot operator, introduced in SQL Server 2005. In the case where each item has only one of each sparse property (the table constraint enforces this), and our properties table has only three columns, there's no real aggregation with pivot. The aggregate is just required by the pivot operator syntax. This populates the table:

declare @col nvarchar(max),
        @sql nvarchar(max)

set @cols=N''
select  @cols=@cols+','+ name from properties
group by name
 select @cols
set @cols=substring(@cols,2,datalength(@cols)/2 - 1)
set @sql=N'insert into products2 (id, name,' + @cols + ') select p.id, p.name, '+@cols+
      N' from (
      select id, name, value
      from properties
      ) as q
      pivot
      (
      max(value)
      for name in ('+@cols+
      ')
      ) as PivotTable'
set @sql += ' join products p on PivotTable.id = p.id'
-- select @sql
exec (@sql)

What remains to be done now is to choose better data types for the sparse columns, if the data isn't really a string. If you haven't enforced value type correctness in the application, this may require some data cleansing. Full example as an attachment. Remember that currently (in CTP6) you can only have 1024 total columns in a table; but the limit will be increased to 30000 sparse columns before RTM.

eav_to_sparse.zip (1.23 KB)
Tuesday, April 08, 2008 9:50:47 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, April 07, 2008

It's easier than you'd think... When you use the SQL Server 2008 CREATE EVENT SESSION DDL statement with extended events, you specify:

Events to be captured (e.g. sqlserver.error_reported)
Actions to be fired to add more information (e.g. sqlserver.sql_text)
Predicates for conditional capture (e.g. where sqlserver.error_reported.error = 547)
Event target to collect the events (e.g. package0.ring_buffer)
Addtional options (e.g. MAX_MEMORY)

In general, the way you look for events, actions, predicates, etc to use is to query against the metadata views:

select p.name + '.' + o.name as [Full Name],
       o.description s [Description]
from sys.dm_xe_objects o
join sys.dm_xe_packages p on o.package_guid = p.guid
-- @type can be 'event', 'action', etc
where o.object_type = @type order by p.name, o.name

Two of the actions listed have to do with activity (causality): package0.attach_activity_id and package0.attach_activity_id_xfer. In addition there is an option TRACE_CAUSALITY. I thought you'd have to set the TRACE_CAUSALITY option and add the actions to collect activity information. But attempting to add either action produced an error.

It's easier than that. Simply turning on the TRACE_CAUSALITY option is enough. The actions can't be specified in CREATE EVENT SESSION (or ALTER EVENT SESSION) because they are for internal use. The activity_ids show up at the target without explicitly naming the actions.

Monday, April 07, 2008 10:54:14 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Saturday, April 05, 2008

It appears that there are some changes in .NET 3.5 System.Transactions (or System.Transactions.dll and System.Data.dll version 2.0.50727.1433 if you're looking at assembly versions). Florin Lazar blogs about a change to the syntax here. And Alazel Acheson blogs about changes to SqlClient changes to accomodate using lightweight transactions with less promotion to distributed transactions in the ADO.NET Team blog.

It is interesting to read how the latest version of SqlClient can combine automatic transactions with connection pooling to use promotion to a distributed only when necessary. It was reminiscent of how COM+ used the connection pool; the difference is that COM+ always used distibuted transactions. Some of the changes use a transaction-aware connection reset mode that's only supported in SQL Server 2008. So server changes were required as well.

At the end of the blog entry Alazel mentions that you can't use this facility with two open connections at the same time; it requires that there is a free connection with the appropriate transaction scope in the pool. I immediately thought of SQL Server's sp_getbindtoken and sp_bindsession, the manual way to allow two connections to share a single transaction context. But sp_getbindtoken and sp_bindsession are on the deprecation list (to be removed in a future version of SQL Server) in SQL Server 2008. The books online entry for these calls recommends "using MARS or distributed transactions instead". But why MARS?

MARS (multiple active resultsets) made its appearence in SQL Server 2005 and its data access stacks. It permits you to have multiple interleaved sessions while using a single SQL Server connection. These multiple sessions share the same transaction, modulo an interesting but escoteric behavior known as batch-scoped transactions. MARS also does not support named savepoints.

What MARS and sp_getbindtoken/sp_bindsession have in common is that they are both solutions to the same "problem". I prefer calling this a SQL Server "behavior" because its not technicall a problem, just how the underlying network stack works. SQL Server does not allow other activity on a connection (like an UPDATE or a second SELECT) while reading a rowset is in progress. Pre-MARS, if you wanted to have multiple commands in a single transaction scope, you would use two connections and "bind" the transaction scopes together with the transaction token. With the MARS capability this is no longer necessary.

So, the OpenConnction1/DoCommand1/Close then OpenConnection2/DoCommand2/Close pattern is now acommodated by System.Transactions and SQL Server 2008 without transaction promotion. To do two commands with the same transaction scope without closing the connection, use a single connection and multiple sessions with MARS. With either data access pattern, only a single physical connection to SQL Server is needed.

Saturday, April 05, 2008 2:14:44 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, March 24, 2008

Short post this evening...

Just in case anyone else flails around looking for this feature, its right under your nose. When you use the PowerShell SQL Server 2008 provider, you have visibility to a single, local machine (and all its SQL Server instances you can access with integrated security) by default. To get access to multiple machines using the provider, simply reference a SQL provider path that contains that machine name. If the (Windows) principal has access to the other machine's SQL Server instance, it will open a connection using Windows auth.

For example, say that I'm on a machine named zmv20. I have access to machine zmv21's SQL Server instance as well.

>cd SQLSERVER:\SQL
>dir

MachineName
-----------
zmv20

>dir zmv21  <---- makes a connection to zmv21

Instance Name  (on zmv21)
-------------
DEFAULT

>dir           <---- now you can "see" zmv21 too

MachineName
-----------
zmv20
zmv21

Using the test-path cmdlet also works.

>test-path SQLSERVER:\SQL\zmv22  <---- can I login to this machine too?

But bear in mind that this is subject to a connection timeout lag. The error message indicates that its first using WMI to obtain the machine connection.

Of course, all this is doc'd in SQL Server Books Online. Where I missed it a few times...I was looking for the equivalent of a "connect" command. You don't need one.

Monday, March 24, 2008 11:21:58 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Thursday, March 20, 2008

I'm still getting used to the new sparse column feature in SQL Server 2008.

I'd just read in the BOL definition of ALTER TABLE that you can add a column_set to an existing table. I was converting a sample app from an EAV (entity attribute value) design to sparse columns. I used the existing "attribute-value" table to create the sparse columns, then created the table. I then went back to add the column_set with ALTER TABLE after the fact.

alter table sparsetest3 add spcolset xml column_set for all_sparse_columns

Received this error message:
Msg 1734, Level 16, State 1, Line 1
Cannot create the sparse column set 'spcolset' in the table 'sparsetest3' because the table already contains a sparse column set. A table cannot have more than one sparse column set.

Huh? I figured that I should be able to add the column_set and filed the lack of this capability as a bug. It came back as "by design".  And doc'd in BOL that way. The fact that its by design makes perfect sence and helped to solidify in my mind how the column_set works.

In a "normal" table (even with sparse columns) without a column_set, "select * from table" returns all of the columns. If a table has a sparse column_set, "select * from..." behaves differently, and returns only the non-sparse columns and the column_set column. NOT the individual sparse columns. You can INSERT or UPDATE this table by using only the column set. When you update using a column_set, all of the sparse columns that you don't specify are set to NULL. You can even update to column_set to NULL itself, which NULLs out all of the sparse columns.

The reason that "A column set cannot be added to a table if that table already contains sparse columns" (BOL exact wording under "Guidelines for using sparse columns") is that it could break existing code that uses "select * from...". Imagine:

create table sparsetab (
 id int identity primary key,
 col1 int,
 spcol2 int sparse
);

select * from sparsetab -- returns spcol2
-- add column_set (this is disallowed, but imagine it DID work)
select * from sparsetab -- doesn't return spcol2, only the column_set, can break code that relies ont spcol2

And BOL is right in ALTER TABLE as well.

create table sparsetab2 (
 id int identity primary key,
 col1 int
);
go
-- add first sparse column and column_set at the same time, works fine.
alter table sparsetab2
 add spcol1 int sparse,
       spcolset xml column_set for all_sparse_columns
go

You can even, as BOL indicates, add a column_set to a table that does not yet have a sparse column.

create table sparsetab4 (
 id int identity primary key,
 spcolset xml column_set for all_sparse_columns
);
go
-- Now you can add sparse columns, they use the column_set

The ONLY issue I have is with the error message 1734 at the beginning of the post. It's misleading, because I DON'T already have a column_set. But I DO already have sparse columns.

In last sparse column correction from a long-ago post. I'd heard (early on) that you would be able to have over 4 million sparse columns (actually sizeof(int) of them). This turns out to be incorrect, it was announced lately that the limit will be 30,000 sparse columns. And, in the current CTP6 you can only have 1024 total columns (in as previous versions); the sparse column limit will be changed sometime before RTM.

Amazingly, when I quoted the 30,000 column limit to a class last week, there was a groan. One student told me his EAV table already had over 60,000 unique attributes. 30,000 wouldn't be enough...must be a HUGE EAV table. That's the motivation for sparse column. Also, his EAV table had the "value" column defined as SQL_VARIANT. That's the other motivation, sparse columns are strongly typed. Although inserting through the column_set always uses a string (nvarchar) as the value, and attempts to convert string to the definied data type for specific columns.

Thursday, March 20, 2008 10:03:07 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, March 17, 2008

This is the last part of a series on programming policy-based management. The series starts here.

In the previous installment, I created a policy that was constrained to a single database. To accomplish this, I used a Condition that called out the database by name, and tied it to the TargetSet using TargetSet's SetLevelCondition method.

An alternative consists of creating a policy as part of a PolicyCategory. Each Policy is a member of exactly one PolicyCategory. The default PolicyCategory is the only one that "ships with the system", but you can define your own. If you don't specify otherwise in the code (SSMS has a Category dropdown on the Description page of the new Category dialog), your policy is a member of the default category.

Each PolicyCategory has a property that indicates whether it's manditory that a database subscribe to that category. If this property is true, each database has an implicit subscription to the category. If not, a database must explicitly subscribe. The code to define a PolicyCategory is straightforward:

static void CreateCategory(PolicyStore ps)
{
    PolicyCategory cat = new PolicyCategory(ps, "MyNewCategory");
    cat.MandateDatabaseSubscriptions = false;
    cat.Create();
}

To create a Policy that's a member of the PolicyCategory, simply use the aptly-named PolicyCategory property. If you're using a named PolicyCategory you may not want to restrict that policy to a specific database. Here's the changes to the Policy definition.

// No Condition On This One, applies to all databases, but must be subscribed to
//ts1.SetLevelCondition(ts1.GetLevel("Server/Database"), "FinanceDB");

// Name the category
p2.PolicyCategory = "MyNewCategory";

To create a PolicyCategorySubscription (for those categories that are not Manditory), the only interesting part is that you need a SqlSmoObject. You can use a concrete subclass (like Database) or make up a SqlSmoObject by using a URN. Dan Sullivan's and my book "Developer's Guide to SQL Server 2005" covers both ways to make an SqlSmoObject. Here's the code for PolicyCategorySubscription.

static void CreatePolicyCategorySubscription(PolicyStore ps)
{
    Server svr = new Server(); // open a connection to default instance, local server
    Database db = new Database(svr, "pubs");
    PolicyCategorySubscription subs = new PolicyCategorySubscription(ps, db);
    subs.PolicyCategory = "MyNewCategory";
    subs.Create();
}

This means that only the pubs database now follows the policies in "MyNewCategory". BTW, there's currently the PowerShell provider in CTP6 throws an error when attempting list a PolicyCategorySubscription (its in the hierarchy at the same level as Policy). The PolicyCategorySubscription still works as advertised though.

This concludes the series on Programming Policy-Based Management with SMO. Hope it was useful.

Monday, March 17, 2008 6:58:47 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Theme design by Jelle Druyts

Pick a theme: