Friday, September 10, 2004

Back to XML and schema collections in SQL Server 2005...

After my experiences with Office 2003 schemas yesterday (thanks for getting me going, Dan), last night I went out onto the Internet, prowling around at 1-2am looking for XML schema repositories. The most likely candidates are mostly gone or inactive. Microsoft's http://www.biztalk.org now links to http://www.microsoft.com/biztalk, information about the Biztalk product. No schema repository here, that I could see. James Tauber's http://www.schema.net has a "returns soon" sign on its digital door. The XML.ORG Registry seems to have the most, although less than I'd expected. There were actually some with namespace URI's for in 1999 XMLSchema namespace, and some DTDs too. Hmm...

Why all this interest and schemas and schema repositories? I'm looking for some non-trivial XML schemas (preferably interlocked sets of schemas, ie. with imports) along with exemplar documents, to try out with SQL Server 2005's CREATE XML SCHEMA COLLECTION and static-typed XQuery. If you know of any, send me links. Should make for some enjoyable times on planes in the near future, like puzzle books.

And what does this have to do with Peaches en Regalia? Once upon a time, in a "previous life", I wrote a program called SMUS2T that converted Amiga SMUS music format to a format for a popular sequencer program, Dr T's KCS. SMUS2T is so obscure that a Google search returns no hits. I then set out to find use cases in test data to break it, naturally. Settled on Peaches en Regalia (by Frank Zappa) because, if you've ever heard it, it contains almost every musical sequence that my spec had. Quintuplets, septuplets, perfect test data. And I liked listening to it, at least I did for the first 50 iterations or so. Two days later, the program could convert it and all 4 tracks played synchronized from beginning to end. And it could convert Yusef Lateef tunes correctly as well.

So now I'm looking for test data again. XML schema collections this time. To test something new. I'm sitting here listening to the digitally remastered version of Peaches from the Hot Rats album, sounds different after he re-produced it for the CD medium. Although I think my favorite track is now "The Gumbo Variations".

Friday, September 10, 2004 2:29:28 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 

I just answered this question on the "public/private" sqlclr newsgroup, but now I'm wondering myself. What IS all that stuff doing in sys.assembly_files? If you've done autodeployment of a SQLCLR assembly lately, take a gander as the metadata view sys.assembly_files. There's not only the binary assembly code and the debug symbols (pdb file) but also source code. And finally, Visual Studio project file, .csproj or .vbproj.

You actually need:
  The real assembly code to run the assembly. Of course.
  The debug symbols if you want to enable in-process debugging

You might like the source code because then its stored in the database like T-SQL code is. There was a thought at one point that you'd deploy your source code and SQL Server would run a compile on it inside SQL Server. That's not gonna happen (at least this time around), and I sort of wondered why people wanted it, maybe because T-SQL stored procedures work that way, but more likely for manageability. That way the actual code is backed up with the database. Anyhow, in beta 2 the source lives in the server, but is not used by the execution engine. Interestingly, if you choose to use an external key file to codesign the assembly, maybe to deploy it to clients for deployment to their GAC, this isn't stored in the database. At least I haven't found it so far.

Still not sure why the VS project file is there though, maybe because it contains project settings you'd need if you wanted to rebuild. But you'd need the keyfile too.

BTW, if you do a "manual" deploy with CREATE ASSEMBLY only the actual binary is cataloged, of course. You can add the rest of the stuff manually with “ALTER ASSEMBLY ... ADD FILE FROM ...”. Or add any other file that strikes your fancy, for that matter.

Friday, September 10, 2004 1:34:51 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

After lots of interesting (but fruitful) experiences with the XQuery engine in SQL Server 2005, I decided that I'd rather have an XML schema collection for everything than put [1] (that's the numeric predicate meaning "item 1 using 1-based ordinals") everywhere in the XQuery that I needed a singleton. It's a good idea to use "typed XML" anyway. Dan Sullivan thought a good non-trivial use-case would be Office 2003 docs and went forth to create SQL Server 2005 XML schema collections from the publically available schema documents.

First thing he ran into is that specifying processContents="lax" now causes an error when cataloging the XML schema collection. It used to catalog without error and then use processContents="skip" behavior instead in beta1. The Office schemas use this construct, so he's created some code to change it to processContents="skip" so SQL Server 2005 would "like" it. We've run into a couple more cataloging problems, repeated attributes on the same element, etc... Wonder where you report Office schema problems.

If you're wondering what this is about technically, pulling out my favorite XML Quick Reference (by Aaron and Gudge), they define processContents (on xsd:any or xsd:anyAttribute) as "Specifies whether a schema processor should validate the elements/attributes appearing in place of the wildcard". The default is "strict", meaning validate them. Lax means the schema processor *should* (but is not required to) validate if schema information is available. Skip means a schema processor should not validate these.

BTW guys, when's Essential XML Quick Reference Volume 2 due out? There *lot's* more specs now. Maybe you could publish an Essential XML Reference Encyclopedia?

Friday, September 10, 2004 11:00:36 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, September 09, 2004

With the introduction of the XML data type (using an XML data model) comes there's a choice of pulling XML out of the database as a stream or column. Using an XML data type as a column type will get you a column of type XML. That's SqlXml to you in .NET. SELECT ... FOR XML gets you a stream. In SQL Server 2005 there a new keyword TYPE on SELECT ... FOR XML that makes the stream into an XML type. Some/most tools make the stream look like a column with up to a size of 4k per "row". SSMS in SQL Server 2005 does better with it than Query Analyzer.

There's different APIs for these too. In ADO.NET there is ExecuteXmlReader vs ExecuteReader/ExecuteScalar. In OLE DB you use a COM IStream implementation, there was a Stream class added to ADO classic. Although you can use the either API for stream or column, it appears to be a stretch to use “other one“ in both cases. As an example, using a 'SELECT xmlcol from xmltab' with ExecuteXmlReader only gets you the XML in the first row. As you'd expect, if you remember ExecuteXmlReader returns a singleton like ExecuteRow in ADO.NET 2.0 does. At this point, I'm recommending using the "right" API for the representation. And you can always translate to a string. And, using the client XML APIs, any of the supported XML data models (such as document or Infoset).

Thursday, September 09, 2004 8:46:32 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, September 06, 2004

SQL Server 2005's HTTP endpoint functionality uses http.sys, the HTTP stack in the kernel, to do it's processing. Until lately Windows Server 2003 was the only OS to implement http.sys and HTTP endpoints only worked when SQL Server 2005 ran on this OS.

Windows XP SP2 shipped with http.sys and HTTP endpoints are now enabled on this OS as well. There is a difference between the two OSes that makes this feature subtely different on each one. The difference is in the version of IIS that runs on each OS. If you are running IIS at the same time, XP SP2 runs IIS 5.1 and Windows Server 2003 runs IIS 6.0. The difference is that IIS 5.1 does not use http.sys even if it is available and therefore grabs all of the HTTP traffic on whatever ports it's listening on. In most cases this is port 80 and/or 443. This interferes with SQL Server HTTP endpoints trying to use these ports. Dan Sullivan confirmed this when running the class on his laptop running Windows XP SP2 last week.

There are a couple of ways to get around this. You can specify a port other than port 80/443 for SQL Server's HTTP endpoints and run both endpoints and IIS at the same time. Another workaround is to shutdown IIS and only run SQL Server's HTTP endpoints.

Monday, September 06, 2004 9:59:36 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Some folks have asked why I appear to use the XQuery data() function on occasion, but not always, when I'm doing demos. If you want the atomic value of an attribute, rather than the attribute node itself, data() is your friend. data() makes an atomic value from any XQuery item, but with attribute nodes it's particularly useful. SQL Server's XQuery functions don't allow returning bare attributes at the root level (bare text nodes are allowed), and the value function is always looking for a singleton atomic value. The data() function helps in these situations; here's an example:


declare @x xml
set @x = '<foo bar="baz"/>'
-- returns error: "Attribute may not appear outside of an element"
select @x.query('/foo/@bar')


declare @x xml
set @x = '<foo bar="baz"/>'
-- returns baz
select @x.query('data(/foo/@bar)')

Using the data() function is different from using the string() function. data() takes a sequence of items (nodes or atomic values) and atomizes them, ie. returns a sequence of atomic values. string() returns the string value of a single item.  Here's an example comparing data() to string():

declare @x xml
-- XML fragment
set @x = '<x>hello<y>world</y></x><x>again</x>'
select @x.query('data(/*)')

returns a sequence of two string values:
helloworld again


declare @x xml
-- XML fragment
set @x = '<x>hello<y>world</y></x><x>again</x>'
select @x.query('string(/*)')

returns a static typing error, because string() requires a singleton or empty sequence as input

declare @x xml
-- XML fragment
set @x = '<x>hello<y>world</y></x><x>again</x>'
select @x.query('string(/*[1])')

returns a single string value:
helloworld

I'll have more to say about static typing in SQL Server 2005 in the future. It can seem like it makes simple queries, where you “know” the data but don't use a schema harder to write, but I'm getting to like it.

Sunday, September 05, 2004 11:49:30 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 
Sunday, September 05, 2004

After getting a laptop with enough power this summer, I'd taken a week or so to familiarize myself with Virtual PC. Using it is a great way to run multiple versions of operating systems and databases for testing. It only took a day to realize that a mere 768meg of memory wasn't going to be enough, so now I'm upgraded to 1.5 gig; haven't gotten the second hard drive for perf yet. Only hiccup is an extreme speed problem with XPSP2 *guest* OS, but I'm running XPSP2 host OS with no problem. There is a VPC service pack due out to fix the XPSP2 guest problem, I've heard.

Last week I had occasion to install Virtual Server to run a preconfigured domain for an assignment. I was happy to find that you can install VPC and VServer side-by-side, that the .vhd files for the main drive are more or less compatible and the concepts of differencing and undo disks are the same. Got networking with host using the loopback adapter as usual and I'm running a whole mini domain on laptop. More secure that VPC, but I do miss Shared Folders (I have to use network shares to copy files to/from the host). Also seems a tad more sluggish than VPC, but I don't have multi-processor or any other cool hardware that VServer is better suited for.

Having been around when the bank I worked at started running IBM's VM operating system (back when computers where coal-powered), I was naturally leery of incompatibility problems for guest OSes at the DBMS or other application level. So far, haven't seen any. Got over my original unease.

Now back to SQL Server 2005...

Sunday, September 05, 2004 10:34:45 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, August 30, 2004

There are a number of different ways that SQL Server 2005 checks to see whether on not you're trying to catalog or execute dangerous .NET code. This checking is enforced by SQL Server at CREATE ASSEMBLY time and also at execution time.

During CREATE ASSEMBLY, SQL Server reflects on the assembly and check for certain coding practices than can "cause problems" at runtime. It's well-known that SQL Server does not allow you to catalog an asssembly as SAFE or EXTERNAL_ACCESS if the assembly contains finalizers or mutable statics.

It's also well-known that SQL Server will throw a SecurityException at runtime if you attempt to access classes or methods marked with a Host Protection Attribute. HostProtectionAttribute was added to the .NET 2.0 runtime to allow class-library authors to annotate their code in order to inform the host (in this case SQL Server) of behaviors such as "MayLeakOnAbort" or "SelfAffectingThreading". HPAs provide protection at execution time, not catalog time.

It's a little-known fact that SQL Server also checks for the presence of certain attributes on assembly code at CREATE ASSEMBLY time. If your code contains dangerous attributes CREATE ASSEMBLY does not succeed. This is different checking than what HPAs allow. This list of "dangerous attributes" is checked at catalog time. The attributes SQL Server considers dangerous include:
  System.STAThreadAttribute and System.MTAThreadAttribute
  System.ThreadStaticAttribute
  System.ContextStaticAttribute
  System.DllImportAttribute

Doubtless there are others. Attempting to catalog an assembly that contains one of these attributes produces an error message like this:
Msg 6237, Level 16, State 1, Line 1
CREATE ASSEMBLY failed because method 'ProcWithAttr' on type 'StoredProcedures' in assembly 'SomeAssm' has invalid custom attribute 'System.STAThreadAttribute'.

Interestingly, I was able to find one that slipped by; SQL Server will allow cataloging an assembly with the UnverifiableCodeAttribute as SAFE. Maybe it was due to the fact that my module did not actually *contain* unverifiable code, just an attribute that said it did.

I'm amazed at the thoroughness with which SQL Server checks for code that can cause it problems as a .NET runtime host. .NET reflection is truly its friend; try to implement this type of checking on an extended stored procedure!

Monday, August 30, 2004 9:26:41 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, August 29, 2004

Because of SQL Server 2005's support of XQuery, a lot of folks have been asking about good XQuery-specific books. I've found two excellent books, each with a different focus.

"XQuery From The Experts" is a compendium of essays by the folks in the W3C XQuery working group. It's an excellent treatise of how XQuery is meant to work, the problems its meant to solve, and the details of implementation. Since most of the writers are also XQuery processor implementers, you're getting the "inside story". I especially like Jonathan Robie's XQuery tutorial chapter and Michael Rys' chapter on implementing an XML data type. Michael's XML data type implementation theory is not identical to the SQL Server 2005 implementation (although he's the PM for this feature), but its close enough in concept to make it a great read if you want to understand this feature.

If you're more interested in XQuery from a user perspective, Michael Brundage's "XQuery: The XML Query Language" is right up your alley. Michael covers the syntax and usage in depth, with lots of examples. The XQuery specs lack all but the most cursory examples (it's not their charter) so Mr. Brundage's book is a must. Although Michael works at Microsoft, his book doesn't cover the System.Xml implementation in .NET 2.0, but the language as defined in the spec. So you'll be doing deltas with the SQL Server 2005 or System.Xml implementations (which are closely aligned) in any case.

Remember that both books are based on a standard-in-progress, although now it's close to done. Happy XQuery-ing.

Sunday, August 29, 2004 11:09:45 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, August 25, 2004

I'm at the Ascend program launch this week; it's being launched as part of a larger event. Yes, I finally got to meet Roger Doherty. I'm doing a day-long (actually a very long day-long) presentation of SQL Server 2005 new features for developers tomorrow. Today, I'm listening to Larry Chesnut speak about "DBA and scalability topics" as I write this. It's interesting to see what people consider "DBA topics" as opposed to "developer topics", because in the SQL Server culture, many small shop developers are also their own DBA. In addition, SQL Server runs some of the world's largest installtions.

It's also interesting that I've met up with quite a few folks here who took my class in beta 1 timeframe. They're not coming for the most part because they've "already seen it". If you look back at the blog entry for my "20 favorite beta 2 features" about a month ago, there's many new features even since beta 1. So I have at least 25 new things to talk about; the list of 20 were only favorites and I've "found" more since then.

I've also been asked here if our book "First Look at SQL Server for Developers" is based on beta2/.NET 2.0 beta 1 or the earlier Yukon beta1. The answer is Beta2 with qualifications. There's a lead time in publishing and we wanted the book to ship with the beta. So we took the information that we could get, did experiments intermediate versions and wrote what we thought would be in beta2 at the time. The server folks were pretty settled by that point, but there may be little nuances since that I've found through experimentation. The client-side folks made some big changes since we declared done, we caught as many of them as we could. That said, I'm keeping up to date with the latest on the book's website. It's difficult to write "in depth" book as the product's changing. I was inspired by Eric Gunnerson's excellent C# book, which shipped early but followed the product as it evovled via the book's website.

Wednesday, August 25, 2004 11:13:12 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, August 23, 2004

If you're developing procedural code, UDTs, or UDAggs in SQLCLR for SQL Server 2005, Visual Studio 2005 has these nice auto-deploy projects called SQL Server projects that you can use. Comes with templates for each SQL Server 2005 item you can develop, just choose a SQL Server and database at project creation and hit “Deploy” menu entry when you're ready. F5 debugging is even built in to them. VERY cool.

But...there are some folks that don't want any help. Or have an existing class library they want to try. Or want to start from first principals and show others how to install a .NET assembly “by hand”. In Visual Studio Alpha/SQL Server 2005 Beta1 you could do this with a default class library project. In VS Beta1, you can't because there are mutable statics in the class library project “by default”. You have to install it as UNSAFE. Or....

1. Create a new class library project.
2. Click "show all files".
3. Look under Properties (C# project) or MyProject (VB.NET project)
4. Delete Resources.resx and Settings.settings. If you are working in VB.NET project you do NOT have to delete MyApplication.myapp.
5. You're good to go. Just code it up and manually deploy as SAFE (the default).

Monday, August 23, 2004 3:01:23 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

To squeeze every ounce of performance out of the SqlServer data provider, the rule is to allocate the least number of ancillary object instances possible. In SQLCLR Optimizations 1, I mentioned a mechanism to keep from creating a SqlCommand and associated objects, just to initialize a SqlDefinition.

Another performance increase comes by not allocating/using extra buffers. When you execute a statement that produces a rowset, this means using SqlDataReader only when you need it. You need it only when you are  going to consume the results IN the .NET procedure and do some processing with them. You don't need a SqlDataReader just to pass a rowset back to the client. As an example, if I want to emulate this trivial T-SQL proc:

CREATE PROCEDURE getauthors
AS
SELECT * FROM authors


this .NET code looked pretty straightforward to me:

public static void GetAuthors1()
{
  // create and initialize command
  SqlCommand cmd = SqlContext.GetCommand();
  cmd.CommandText = "select * from authors";

  // get data reader
  SqlDataReader rdr = cmd.ExecuteReader();

  // pass results to client
  SqlContext.GetPipe().Send(rdr);
}

This actually allocates enough of a buffer in your procedure to hold a row and reads into the buffer, just to turn around and hand it to the SqlPipe. That buffer is pure overhead. A more performant way, that basically points the rowset at the SqlPipe, is to use SqlPipe.Execute. You can Execute a SqlCommand or use a SqlExecutionContext.

The SqlExecutionContext is the class that represents the volitile parts of SqlCommand, the parameters, and contains methods to execute SQL and return results. Here's the same procedure, using both SqlDefinition/SqlExecutionContext and SqlPipe.Execute.

// initialized once during class construction
readonly static SqlDefinition def =
  new SqlDefinition("select * from authors", null, null);

public static void GetAuthors2()
{
  // lookup static part of command
  // in this example, the command is entirely static
  SqlExecutionContext ec =
    SqlContext.GetConnection().CreateExecutionContext(def);

  // execute it. no SqlDataReader needed, results to client
  SqlContext.GetPipe().Execute(ec);
}

At first glance, you look at System.Data.SqlServer and see a few classes that look strange, but most everything looks the same as with the SqlClient provider. The big perf improvements, however, come from using the classes that look strange and are different. That's why they are there, to allow efficiencies that you can't get (like not allocating a buffer to receive a rowset) on the client. But you wouldn't send a rowset back to the "client" from the client, this only happens in a procedure inside SQL Server. On the server it's a whole different ballgame.

 

Sunday, August 22, 2004 11:14:56 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Friday, August 20, 2004

What's error 6522? It's the error that you always get from an unhandled exception in a SQLCLR stored procedure. If I write a SQLCLR error that throws an unhandled exception (let's write one on purpose):

[SqlProcedure]
public static void ErrorExecute()
{
    SqlCommand cmd = SqlContext.GetCommand();
    // everyone knows Bob can't type
    cmd.CommandText = "select * from authorss";
    SqlContext.GetPipe().Execute(cmd);
}

using it from T-SQL:

execute errorexecute
print @@error

Yields:

Msg 208, Level 16, State 1, Line 0
Invalid object name 'authorss'.
Msg 6522, Level 16, State 1, Procedure ErrorExecute, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate 'ErrorExecute':
System.Data.SqlServer.SqlException: Invalid object name 'authorss'.
   at System.Data.SqlServer.Internal.StandardEventSink.HandleErrors()
   at System.Data.SqlServer.Internal.RequestExecutor.HandleExecute(EventTranslator eventTranslator, SqlConnection conn, SqlTransaction tran, ClrLevelContext clrlvlctxtPipe, UrtExecutionType uetType, InternalResultSetOptions irsoOpts, CommandBehavior eBehavior, Object& objResult)
   at System.Data.SqlServer.Internal.RequestExecutor.ExecuteToPipe(SqlConnection conn, SqlT.
6522

Not that @@ERROR returns 6522, NOT 208. If you write an equivalent bad T-SQL proc:

create procedure errortsql
as
-- cant type any better in TSQL
select * from authorss
go

execute errortsql
print @@error

returns 208.

Why does this matter? Say that I have a T-SQL stored procedure (not this one, obviously) that I want to replace with a SQLCLR equivalent that say, runs faster. Say the procedure is used in 20 places in my application, each with semantics that check for specific values of @@ERROR. Won't work the same.

You can't catch the exception in SQLCLR and throw the "correct" one (using a SqlCommand with CommandText of RAISERROR(...)) either, because you can't throw a 208 error (or any other system error) with RAISERROR. So I'll have to change each one of my caller procs too when I switch to SQLCLR. The only current workaround is to change to the new T-SQL try-catch syntax

BEGIN TRY
  EXECUTE errorexecute
END TRY
BEGIN CATCH
  -- prints 208, whew...
  PRINT CONVERT(varchar(10), error_number())
END CATCH

Hopefully this will be changed or a workaround in SQLCLR will be available in the next beta. I want @@ERROR to somehow return 208, not 6522. Or this will be a great motivator for everyone to change to T-SQL TRY-CATCH.

Friday, August 20, 2004 3:14:09 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Good T-SQL programmers are always looking to optimize their procedural code. Little optimizations can end up as big savings if the code will be executed many times or if it's used in many places in the application. Good SQLCLR programmers will be doing this too. Recently I ran into a micro-optimization that has promise.

Everyone who's used SQLCLR in earnest knows by now that it's useful to separate SqlCommand into two pieces: SqlDefinition and SqlExecutionContext. SqlDefinition is the static portion of the command, SqlExecutionContext contains the execution methods like ExecuteNonQuery and friends. Last time I'd heard rumors, about a 20% performance improvement, although it's not usually useful (or not allowed in betas) to report performance numbers. It's beta software after all, possibly with extra unoptimized code that will be removed/optimized at release. Especially SQLCLR, a brand new feature. So I didn't confirm the rumored numbers; sounded like a reasonable assumption.

SqlDefinitions are best initialized in the class constructor and stashed in readonly static (Shared Readonly in VB.NET) variables. I've done it like this:

static readonly SqlDefinition def = null;

static MyClass() {
  SqlCommand cmd = SqlContext.GetCommand();
  cmd.CommandText =
      "select * from authors where au_id like @au_id";
  cmd.Parameters.Add("@au_id", SqlDbType.Varchar);
  cmd.Parameters[0].Size = 50;
  def = new SqlDefinition(cmd);
}

Lately, I've noticed there's an even better way. There's a constructor for SqlDefinition that takes a CommandText, CommandType, array of SqlMetaData, and array of ParameterDirection. I can even put the initialization of these in the variable declarations, like this:

readonly static SqlMetaData[] md =
    new SqlMetaData[1] { new SqlMetaData("@au_id", SqlDbType.VarChar, 50) };
readonly static ParameterDirection[] pd =
    new ParameterDirection[1] { ParameterDirection.Input };
readonly static SqlDefinition def = new SqlDefinition(
    "select * from authors where au_id like @au_id",
    CommandType.Text, md, pd);

The variables must be defined in this order, as the SqlDefinition depends on the other variables being initialized first. Or I could initialize them explicitly in the class constructor.

Doing it this way saves me from creating object instances, then throwing them away to the garbage collector. Namely:
1. 1 SqlCommand
2. 1 SqlParameterCollection
3. N SqlParameters where N is the number of parms in the query. SqlParameter actually contains SqlMetaData as a member, plus "other stuff"

Because I'm only executing this ONCE (the first time the class is instantiated) in an entire SQL Server run, is this a micro-optimization? Probably. But what if I have 100 or 1000 such SqlDefinitions? Maybe not as "micro" then... what do you think?

Friday, August 20, 2004 9:54:48 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, August 18, 2004

I've been trying out the new session support in HTTP endpoints. You manipulate this through SOAP headers, and looking through the list of SOAP headers supported allowed me to understand this feature from an entirely different perspective. It looks like the session support makes this into more of an alternate client stack than a mechanism to serve web services on the internet or (more likely) intranet.

The HTTP endpoints session support is meant to overcome the limitation that HTTP is a stateless protocol, while TDS/SNI/Netlibs is a stateful one. That is, you normally login to SQL Server perform multiple operations and logoff. With HTTP endpoint sessions, you can do this. There are SOAP headers for "connection string parameters" like initial catalog and application name, there are initiate (login) and terminate (logoff) attributes. There are headers exposing concepts like database transactions and prepared statements. There are even headers for new SQL Server 2005 features like database mirroring partner (for failover) and notification request (for query notifications).

I'll post some samples to show what's possible. They use the SoapHeader definition ("include file") starter (tweaked a bit) from the BOL. Bear in mind that if you define the endpoint on your own machine (you can't come over to my house and use my machine, sorry) you need to update the web reference. Doing so wipes out the SOAP Headers that you need in the proxy class (Reference.cs) to make this work. So don't forget to add 'em back. Save the original code in case you forget. For a more detailed explanation than that, refer to BOL or come to the class and I'll show you how. Or come to Windev and Dan Sullivan will, he's doing the talk on this feature. And have a think about why this is useful, maybe I'll discuss this next.

Wednesday, August 18, 2004 11:02:55 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, August 16, 2004

When you load XML into an ADO.NET DataSet using it's ReadXml() method, ADO.NET uses a fixed set of rules to determine how to infer a relational structure for the XML. These rules are doc'd in the .NET framework docs. One of the most frustrating rules was that you couldn't have the same attribute name or subelement name in more than one different element. Doing so cause the ReadXml() method to produce the well-known error “The same table (Table name) cannot be the child table in two nested relations.”

The algorithm is improved so this type of XML can be loaded in .NET 2.0. It can load my test semistructured (also known as “tags and text”) document, as well as more mundane XML documents that caused this error in previous versions. The document:

<book>
<chapter>
<title>Testing your <noun>typewriter</noun></title>
<p>The quick brown <noun>fox</noun> jumps over
the lazy <noun>dog</noun></p>
</chapter>
</book> 

Produces 4 tables, named book, chapter, p, and noun. It even serializes out using WriteXml() the same way it looked when it was loaded, adding only an extra unused namespace declaration for xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance".

This will be a help for developers using this method of XML to relational mapping.

Monday, August 16, 2004 9:32:25 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, August 12, 2004

I was talking with Dan Sullivan this week about UDTs in SQL Server 2005 and he's made an interesting discovery. About a week ago, I'd happened to mention to Dan that UDTs and UDAggs (user-defined aggregates) are supposed to have a pattern of "deserialize-do action-serialize". He thought I'd misunderstood. Then he went off to write an improved UDT/UDAgg for his talk at Windev. Dan has an engineering background and is excited about using UDTs for data types like Linear Dimension (he wrote the LDim example in our book).

What Dan discovered in attempting to implement some initialization inside the constructor is that, inside SQL Server operations, the constructor is never called! His UDT used Format.UserDefined, which means he implements the IBinarySerialize interface to Read and Write the UDT state. The engine just allocates some bare memory for an instance with the IL “newobj” instruction, then calls IBinarySerialize.Read to fill it out. What this means is that any internal fields that you would initialize inside a constructor must also be initialized when IBinarySerialize.Read is called as well. Dan found this out the hard way, reference types he'd expected to be initialized in the constructor were null when Read is called.

He's made a common implementation pattern in his UDTs: always have an Initialize or Init method and call it both in the constructor (for "ordinary" or client usage) and in IBinarySerialize.Read (for use inside SQL Server 2005). A good thing to keep in mind if you're writing a UDT.

When I asked if this wasn't against the "rules", his response was that C# always calls the constructor after newobj, but it's not a .NET requirement. Niels Berglund mentioned that this behavior is also true in Format.Native too, but all its fields are value types, so they are all initialized to zero by the runtime.

Niels also mentioned an interesting pattern he observed by tracing through a UDAgg he'd written with Format.UserDefined.UDAggs implement an Init() method instead of a constructor so that instances can be reused, for example in GROUP BY statements. The pattern Niels observed was:

1. Init
2. Write (but Accumulate hasn't been called yet)
3. Read-Accumulate-Write (for each row processed)
4. Read-Terminate

This pattern had puzzled him at the time, but alongside Dan's UDT observations this makes sense. I'd originally thought that the Read-Accumulate-Write pattern was for parallel processing, but it always happens even on a single processor. At this point we're all thinking it might a hedge based on how thread scheduling works in SQL Server.

So don't forget Dan's "Initialize in Read or constructor" pattern when you write Format.UserDefined UDTs. I think I have the good fortune to work with some pretty observant folks.

Thursday, August 12, 2004 10:18:03 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

As Mark Fussell will tell you, one of the coolest things in System.Xml 2.0 is the static creator methods on XmlReader and XmlWriter. It made his top 10 list, albeit at number 10.

I've been looking into how the client side handles SQL Server 2005's XML data type and was bemoaning the fact that System.Data.SqlTypes.SqlXml had a CreateReader method that probably didn't set what I wanted set - ConformanceLevel.Fragment. Sounded like just the ticket for SQL Server 2005. Checked with Reflector, and lo and behold, they *do* use XmlReader.Create with ConformanceLevel.Fragment. Very cool, and the information was helpful to Kent Tegels on the beta newsgroup as well, just this morning. My research timing karma seems pretty good lately.

Now, I want an overload that looks like SqlXml.CreateReader(XmlReaderSettings). And something that handles sequences of atomic types. Maybe in .NET 2.0 beta2. Hey Mark... maybe I'll hit him up for it at WinDev.

Thursday, August 12, 2004 2:12:17 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, August 10, 2004

Came across something recently that I thought was odd (or a bug) when dealing with UDTs in SQL Server Management Studio.

If I have a UDT, say ComplexNumber, I deploy it to SQL Server 2005. Use it as a column in a UDT table, complextab. Insert a row:

INSERT complextab VALUES('1:1i')

Then attempt to SELECT all of the values:

SELECT * FROM complextab

I get the following error message:

An error occurred while executing batch. Error message is: File or assembly name 'ComplexNumber, Version=1.0.0.0, Culture=neutral, PublicKeyToken=17177e16a4b86577', or one of its dependencies, was not found.

Scratched my head and thought about it for a few minutes. Reason for this is that SQL Server Management Studio (SSMS) is just another client of SQL Server that uses ADO.NET. In order to use user-defined types with ADO.NET, you must deploy the assembly to the client. So putting the ComplexNumber assembly in the GAC or in the directory that contains SSMS fixes this "problem". It's not a problem, it’s by design. The SSMS will call ToString() ON THE CLIENT SIDE and display the value just fine.

Note that:
1. If you change it to: "SELECT complexcol.ToString() FROM complextab" it works because ToString is being called on the server, not on the client.
2. If you run the same command from SQLCMD it display the binary value of the ComplexNumber column. That's because it uses OLE DB to talk to SQL Server, not ADO.NET. Interestingly, this is what SQL Workbench in Beta 1 did also. Guess the folks at SSMS decided to nicely format it for Beta 2.

Tuesday, August 10, 2004 1:55:02 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 

I’m a person who always likes to know where things live in the OS, database, or whatever product I’m dealing with. Being able to point to a specific DLL or configuration file or registry entry gives me something tangible to hang my hat on, rather than think that things happen “by magic”.

 

So I was intrigued by the apparent disappearance of the network libraries in SQL Server 2005. Start with these points:

1. SQL Server 2005 B2 does not ship a new version of MDAC
2. SQL Server 2005 B2 does ship with SQL Native Client (codename was SNAC) with new OLE DB provider and ODBC driver. It also appears to cover network connectivity.  In the new client config utility, all (four) supported protocols list netlibs as “SQLNCLI.dll”. That's SQL Native Client.
3. System.Data.SqlClient does not use MDAC any more
4. System.Data.SqlClient did use DBNETLIB.dll to talk to SQL Server in the past

 

So does .NET SqlClient use SQL Native Client? Dan, Niels, and I did some experiments and it doesn’t appear that SqlClient calls SNAC at all. We even installed .NET 2.0 on a machine without SNAC and did tests that include MARS and async (these require new protocol features). Worked fine, no SQLNCLI or DBNETLIB loaded or in sight. So how’d they do that?

 

Looking in System.Data.dll, it appears that there are some calls that start with the letters SNI. There are no calls to DBNETLIB (as there was in System.Data.dll in .NET 1.1). A quick search of Books Online yielded only references to SNI associated with Service Broker protocol (as in “the SNI connection string for Service Broker...”).

 

So it appears that both System.Data.dll and SQLNCLI.dll implement SNI calls directly rather than calling a specific library. And that the SNI model is used by more than just client-server protocols. It is possibly used for everything that is covered by CREATE ENDPOINT. System.Data.SqlClient doesn’t use MDAC or SNAC, but both System.Data.SqlClient and SNAC are compatible with previous versions of SQL Server, because I could use the OLE DB Native Client provider with SQL Server 2000 just fine. Hmmm.

 

Stay tuned...

Tuesday, August 10, 2004 9:29:59 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

Theme design by Jelle Druyts

Pick a theme: