There are a number of different ways that SQL Server 2005 checks to see whether on not you're trying to catalog or execute dangerous .NET code. This checking is enforced by SQL Server at CREATE ASSEMBLY time and also at execution time.

During CREATE ASSEMBLY, SQL Server reflects on the assembly and check for certain coding practices than can "cause problems" at runtime. It's well-known that SQL Server does not allow you to catalog an asssembly as SAFE or EXTERNAL_ACCESS if the assembly contains finalizers or mutable statics.

It's also well-known that SQL Server will throw a SecurityException at runtime if you attempt to access classes or methods marked with a Host Protection Attribute. HostProtectionAttribute was added to the .NET 2.0 runtime to allow class-library authors to annotate their code in order to inform the host (in this case SQL Server) of behaviors such as "MayLeakOnAbort" or "SelfAffectingThreading". HPAs provide protection at execution time, not catalog time.

It's a little-known fact that SQL Server also checks for the presence of certain attributes on assembly code at CREATE ASSEMBLY time. If your code contains dangerous attributes CREATE ASSEMBLY does not succeed. This is different checking than what HPAs allow. This list of "dangerous attributes" is checked at catalog time. The attributes SQL Server considers dangerous include:
  System.STAThreadAttribute and System.MTAThreadAttribute
  System.ThreadStaticAttribute
  System.ContextStaticAttribute
  System.DllImportAttribute

Doubtless there are others. Attempting to catalog an assembly that contains one of these attributes produces an error message like this:
Msg 6237, Level 16, State 1, Line 1
CREATE ASSEMBLY failed because method 'ProcWithAttr' on type 'StoredProcedures' in assembly 'SomeAssm' has invalid custom attribute 'System.STAThreadAttribute'.

Interestingly, I was able to find one that slipped by; SQL Server will allow cataloging an assembly with the UnverifiableCodeAttribute as SAFE. Maybe it was due to the fact that my module did not actually *contain* unverifiable code, just an attribute that said it did.

I'm amazed at the thoroughness with which SQL Server checks for code that can cause it problems as a .NET runtime host. .NET reflection is truly its friend; try to implement this type of checking on an extended stored procedure!

Categories:

Because of SQL Server 2005's support of XQuery, a lot of folks have been asking about good XQuery-specific books. I've found two excellent books, each with a different focus.

"XQuery From The Experts" is a compendium of essays by the folks in the W3C XQuery working group. It's an excellent treatise of how XQuery is meant to work, the problems its meant to solve, and the details of implementation. Since most of the writers are also XQuery processor implementers, you're getting the "inside story". I especially like Jonathan Robie's XQuery tutorial chapter and Michael Rys' chapter on implementing an XML data type. Michael's XML data type implementation theory is not identical to the SQL Server 2005 implementation (although he's the PM for this feature), but its close enough in concept to make it a great read if you want to understand this feature.

If you're more interested in XQuery from a user perspective, Michael Brundage's "XQuery: The XML Query Language" is right up your alley. Michael covers the syntax and usage in depth, with lots of examples. The XQuery specs lack all but the most cursory examples (it's not their charter) so Mr. Brundage's book is a must. Although Michael works at Microsoft, his book doesn't cover the System.Xml implementation in .NET 2.0, but the language as defined in the spec. So you'll be doing deltas with the SQL Server 2005 or System.Xml implementations (which are closely aligned) in any case.

Remember that both books are based on a standard-in-progress, although now it's close to done. Happy XQuery-ing.

Categories:

I'm at the Ascend program launch this week; it's being launched as part of a larger event. Yes, I finally got to meet Roger Doherty. I'm doing a day-long (actually a very long day-long) presentation of SQL Server 2005 new features for developers tomorrow. Today, I'm listening to Larry Chesnut speak about "DBA and scalability topics" as I write this. It's interesting to see what people consider "DBA topics" as opposed to "developer topics", because in the SQL Server culture, many small shop developers are also their own DBA. In addition, SQL Server runs some of the world's largest installtions.

It's also interesting that I've met up with quite a few folks here who took my class in beta 1 timeframe. They're not coming for the most part because they've "already seen it". If you look back at the blog entry for my "20 favorite beta 2 features" about a month ago, there's many new features even since beta 1. So I have at least 25 new things to talk about; the list of 20 were only favorites and I've "found" more since then.

I've also been asked here if our book "First Look at SQL Server for Developers" is based on beta2/.NET 2.0 beta 1 or the earlier Yukon beta1. The answer is Beta2 with qualifications. There's a lead time in publishing and we wanted the book to ship with the beta. So we took the information that we could get, did experiments intermediate versions and wrote what we thought would be in beta2 at the time. The server folks were pretty settled by that point, but there may be little nuances since that I've found through experimentation. The client-side folks made some big changes since we declared done, we caught as many of them as we could. That said, I'm keeping up to date with the latest on the book's website. It's difficult to write "in depth" book as the product's changing. I was inspired by Eric Gunnerson's excellent C# book, which shipped early but followed the product as it evovled via the book's website.

Categories:

If you're developing procedural code, UDTs, or UDAggs in SQLCLR for SQL Server 2005, Visual Studio 2005 has these nice auto-deploy projects called SQL Server projects that you can use. Comes with templates for each SQL Server 2005 item you can develop, just choose a SQL Server and database at project creation and hit “Deploy” menu entry when you're ready. F5 debugging is even built in to them. VERY cool.

But...there are some folks that don't want any help. Or have an existing class library they want to try. Or want to start from first principals and show others how to install a .NET assembly “by hand”. In Visual Studio Alpha/SQL Server 2005 Beta1 you could do this with a default class library project. In VS Beta1, you can't because there are mutable statics in the class library project “by default”. You have to install it as UNSAFE. Or....

1. Create a new class library project.
2. Click "show all files".
3. Look under Properties (C# project) or MyProject (VB.NET project)
4. Delete Resources.resx and Settings.settings. If you are working in VB.NET project you do NOT have to delete MyApplication.myapp.
5. You're good to go. Just code it up and manually deploy as SAFE (the default).

Categories:

To squeeze every ounce of performance out of the SqlServer data provider, the rule is to allocate the least number of ancillary object instances possible. In SQLCLR Optimizations 1, I mentioned a mechanism to keep from creating a SqlCommand and associated objects, just to initialize a SqlDefinition.

Another performance increase comes by not allocating/using extra buffers. When you execute a statement that produces a rowset, this means using SqlDataReader only when you need it. You need it only when you are  going to consume the results IN the .NET procedure and do some processing with them. You don't need a SqlDataReader just to pass a rowset back to the client. As an example, if I want to emulate this trivial T-SQL proc:

CREATE PROCEDURE getauthors
AS
SELECT * FROM authors


this .NET code looked pretty straightforward to me:

public static void GetAuthors1()
{
  // create and initialize command
  SqlCommand cmd = SqlContext.GetCommand();
  cmd.CommandText = "select * from authors";

  // get data reader
  SqlDataReader rdr = cmd.ExecuteReader();

  // pass results to client
  SqlContext.GetPipe().Send(rdr);
}

This actually allocates enough of a buffer in your procedure to hold a row and reads into the buffer, just to turn around and hand it to the SqlPipe. That buffer is pure overhead. A more performant way, that basically points the rowset at the SqlPipe, is to use SqlPipe.Execute. You can Execute a SqlCommand or use a SqlExecutionContext.

The SqlExecutionContext is the class that represents the volitile parts of SqlCommand, the parameters, and contains methods to execute SQL and return results. Here's the same procedure, using both SqlDefinition/SqlExecutionContext and SqlPipe.Execute.

// initialized once during class construction
readonly static SqlDefinition def =
  new SqlDefinition("select * from authors", null, null);

public static void GetAuthors2()
{
  // lookup static part of command
  // in this example, the command is entirely static
  SqlExecutionContext ec =
    SqlContext.GetConnection().CreateExecutionContext(def);

  // execute it. no SqlDataReader needed, results to client
  SqlContext.GetPipe().Execute(ec);
}

At first glance, you look at System.Data.SqlServer and see a few classes that look strange, but most everything looks the same as with the SqlClient provider. The big perf improvements, however, come from using the classes that look strange and are different. That's why they are there, to allow efficiencies that you can't get (like not allocating a buffer to receive a rowset) on the client. But you wouldn't send a rowset back to the "client" from the client, this only happens in a procedure inside SQL Server. On the server it's a whole different ballgame.

 

Categories:

What's error 6522? It's the error that you always get from an unhandled exception in a SQLCLR stored procedure. If I write a SQLCLR error that throws an unhandled exception (let's write one on purpose):

[SqlProcedure]
public static void ErrorExecute()
{
    SqlCommand cmd = SqlContext.GetCommand();
    // everyone knows Bob can't type
    cmd.CommandText = "select * from authorss";
    SqlContext.GetPipe().Execute(cmd);
}

using it from T-SQL:

execute errorexecute
print @@error

Yields:

Msg 208, Level 16, State 1, Line 0
Invalid object name 'authorss'.
Msg 6522, Level 16, State 1, Procedure ErrorExecute, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate 'ErrorExecute':
System.Data.SqlServer.SqlException: Invalid object name 'authorss'.
   at System.Data.SqlServer.Internal.StandardEventSink.HandleErrors()
   at System.Data.SqlServer.Internal.RequestExecutor.HandleExecute(EventTranslator eventTranslator, SqlConnection conn, SqlTransaction tran, ClrLevelContext clrlvlctxtPipe, UrtExecutionType uetType, InternalResultSetOptions irsoOpts, CommandBehavior eBehavior, Object& objResult)
   at System.Data.SqlServer.Internal.RequestExecutor.ExecuteToPipe(SqlConnection conn, SqlT.
6522

Not that @@ERROR returns 6522, NOT 208. If you write an equivalent bad T-SQL proc:

create procedure errortsql
as
-- cant type any better in TSQL
select * from authorss
go

execute errortsql
print @@error

returns 208.

Why does this matter? Say that I have a T-SQL stored procedure (not this one, obviously) that I want to replace with a SQLCLR equivalent that say, runs faster. Say the procedure is used in 20 places in my application, each with semantics that check for specific values of @@ERROR. Won't work the same.

You can't catch the exception in SQLCLR and throw the "correct" one (using a SqlCommand with CommandText of RAISERROR(...)) either, because you can't throw a 208 error (or any other system error) with RAISERROR. So I'll have to change each one of my caller procs too when I switch to SQLCLR. The only current workaround is to change to the new T-SQL try-catch syntax

BEGIN TRY
  EXECUTE errorexecute
END TRY
BEGIN CATCH
  -- prints 208, whew...
  PRINT CONVERT(varchar(10), error_number())
END CATCH

Hopefully this will be changed or a workaround in SQLCLR will be available in the next beta. I want @@ERROR to somehow return 208, not 6522. Or this will be a great motivator for everyone to change to T-SQL TRY-CATCH.

Categories:

Good T-SQL programmers are always looking to optimize their procedural code. Little optimizations can end up as big savings if the code will be executed many times or if it's used in many places in the application. Good SQLCLR programmers will be doing this too. Recently I ran into a micro-optimization that has promise.

Everyone who's used SQLCLR in earnest knows by now that it's useful to separate SqlCommand into two pieces: SqlDefinition and SqlExecutionContext. SqlDefinition is the static portion of the command, SqlExecutionContext contains the execution methods like ExecuteNonQuery and friends. Last time I'd heard rumors, about a 20% performance improvement, although it's not usually useful (or not allowed in betas) to report performance numbers. It's beta software after all, possibly with extra unoptimized code that will be removed/optimized at release. Especially SQLCLR, a brand new feature. So I didn't confirm the rumored numbers; sounded like a reasonable assumption.

SqlDefinitions are best initialized in the class constructor and stashed in readonly static (Shared Readonly in VB.NET) variables. I've done it like this:

static readonly SqlDefinition def = null;

static MyClass() {
  SqlCommand cmd = SqlContext.GetCommand();
  cmd.CommandText =
      "select * from authors where au_id like @au_id";
  cmd.Parameters.Add("@au_id", SqlDbType.Varchar);
  cmd.Parameters[0].Size = 50;
  def = new SqlDefinition(cmd);
}

Lately, I've noticed there's an even better way. There's a constructor for SqlDefinition that takes a CommandText, CommandType, array of SqlMetaData, and array of ParameterDirection. I can even put the initialization of these in the variable declarations, like this:

readonly static SqlMetaData[] md =
    new SqlMetaData[1] { new SqlMetaData("@au_id", SqlDbType.VarChar, 50) };
readonly static ParameterDirection[] pd =
    new ParameterDirection[1] { ParameterDirection.Input };
readonly static SqlDefinition def = new SqlDefinition(
    "select * from authors where au_id like @au_id",
    CommandType.Text, md, pd);

The variables must be defined in this order, as the SqlDefinition depends on the other variables being initialized first. Or I could initialize them explicitly in the class constructor.

Doing it this way saves me from creating object instances, then throwing them away to the garbage collector. Namely:
1. 1 SqlCommand
2. 1 SqlParameterCollection
3. N SqlParameters where N is the number of parms in the query. SqlParameter actually contains SqlMetaData as a member, plus "other stuff"

Because I'm only executing this ONCE (the first time the class is instantiated) in an entire SQL Server run, is this a micro-optimization? Probably. But what if I have 100 or 1000 such SqlDefinitions? Maybe not as "micro" then... what do you think?

Categories:

I've been trying out the new session support in HTTP endpoints. You manipulate this through SOAP headers, and looking through the list of SOAP headers supported allowed me to understand this feature from an entirely different perspective. It looks like the session support makes this into more of an alternate client stack than a mechanism to serve web services on the internet or (more likely) intranet.

The HTTP endpoints session support is meant to overcome the limitation that HTTP is a stateless protocol, while TDS/SNI/Netlibs is a stateful one. That is, you normally login to SQL Server perform multiple operations and logoff. With HTTP endpoint sessions, you can do this. There are SOAP headers for "connection string parameters" like initial catalog and application name, there are initiate (login) and terminate (logoff) attributes. There are headers exposing concepts like database transactions and prepared statements. There are even headers for new SQL Server 2005 features like database mirroring partner (for failover) and notification request (for query notifications).

I'll post some samples to show what's possible. They use the SoapHeader definition ("include file") starter (tweaked a bit) from the BOL. Bear in mind that if you define the endpoint on your own machine (you can't come over to my house and use my machine, sorry) you need to update the web reference. Doing so wipes out the SOAP Headers that you need in the proxy class (Reference.cs) to make this work. So don't forget to add 'em back. Save the original code in case you forget. For a more detailed explanation than that, refer to BOL or come to the class and I'll show you how. Or come to Windev and Dan Sullivan will, he's doing the talk on this feature. And have a think about why this is useful, maybe I'll discuss this next.

Categories:

When you load XML into an ADO.NET DataSet using it's ReadXml() method, ADO.NET uses a fixed set of rules to determine how to infer a relational structure for the XML. These rules are doc'd in the .NET framework docs. One of the most frustrating rules was that you couldn't have the same attribute name or subelement name in more than one different element. Doing so cause the ReadXml() method to produce the well-known error “The same table (Table name) cannot be the child table in two nested relations.”

The algorithm is improved so this type of XML can be loaded in .NET 2.0. It can load my test semistructured (also known as “tags and text”) document, as well as more mundane XML documents that caused this error in previous versions. The document:

<book>
<chapter>
<title>Testing your <noun>typewriter</noun></title>
<p>The quick brown <noun>fox</noun> jumps over
the lazy <noun>dog</noun></p>
</chapter>
</book> 

Produces 4 tables, named book, chapter, p, and noun. It even serializes out using WriteXml() the same way it looked when it was loaded, adding only an extra unused namespace declaration for xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance".

This will be a help for developers using this method of XML to relational mapping.

Categories:

I was talking with Dan Sullivan this week about UDTs in SQL Server 2005 and he's made an interesting discovery. About a week ago, I'd happened to mention to Dan that UDTs and UDAggs (user-defined aggregates) are supposed to have a pattern of "deserialize-do action-serialize". He thought I'd misunderstood. Then he went off to write an improved UDT/UDAgg for his talk at Windev. Dan has an engineering background and is excited about using UDTs for data types like Linear Dimension (he wrote the LDim example in our book).

What Dan discovered in attempting to implement some initialization inside the constructor is that, inside SQL Server operations, the constructor is never called! His UDT used Format.UserDefined, which means he implements the IBinarySerialize interface to Read and Write the UDT state. The engine just allocates some bare memory for an instance with the IL “newobj” instruction, then calls IBinarySerialize.Read to fill it out. What this means is that any internal fields that you would initialize inside a constructor must also be initialized when IBinarySerialize.Read is called as well. Dan found this out the hard way, reference types he'd expected to be initialized in the constructor were null when Read is called.

He's made a common implementation pattern in his UDTs: always have an Initialize or Init method and call it both in the constructor (for "ordinary" or client usage) and in IBinarySerialize.Read (for use inside SQL Server 2005). A good thing to keep in mind if you're writing a UDT.

When I asked if this wasn't against the "rules", his response was that C# always calls the constructor after newobj, but it's not a .NET requirement. Niels Berglund mentioned that this behavior is also true in Format.Native too, but all its fields are value types, so they are all initialized to zero by the runtime.

Niels also mentioned an interesting pattern he observed by tracing through a UDAgg he'd written with Format.UserDefined.UDAggs implement an Init() method instead of a constructor so that instances can be reused, for example in GROUP BY statements. The pattern Niels observed was:

1. Init
2. Write (but Accumulate hasn't been called yet)
3. Read-Accumulate-Write (for each row processed)
4. Read-Terminate

This pattern had puzzled him at the time, but alongside Dan's UDT observations this makes sense. I'd originally thought that the Read-Accumulate-Write pattern was for parallel processing, but it always happens even on a single processor. At this point we're all thinking it might a hedge based on how thread scheduling works in SQL Server.

So don't forget Dan's "Initialize in Read or constructor" pattern when you write Format.UserDefined UDTs. I think I have the good fortune to work with some pretty observant folks.

Categories:

Theme design by Nukeation based on Jelle Druyts