Sunday, August 29, 2004

Because of SQL Server 2005's support of XQuery, a lot of folks have been asking about good XQuery-specific books. I've found two excellent books, each with a different focus.

"XQuery From The Experts" is a compendium of essays by the folks in the W3C XQuery working group. It's an excellent treatise of how XQuery is meant to work, the problems its meant to solve, and the details of implementation. Since most of the writers are also XQuery processor implementers, you're getting the "inside story". I especially like Jonathan Robie's XQuery tutorial chapter and Michael Rys' chapter on implementing an XML data type. Michael's XML data type implementation theory is not identical to the SQL Server 2005 implementation (although he's the PM for this feature), but its close enough in concept to make it a great read if you want to understand this feature.

If you're more interested in XQuery from a user perspective, Michael Brundage's "XQuery: The XML Query Language" is right up your alley. Michael covers the syntax and usage in depth, with lots of examples. The XQuery specs lack all but the most cursory examples (it's not their charter) so Mr. Brundage's book is a must. Although Michael works at Microsoft, his book doesn't cover the System.Xml implementation in .NET 2.0, but the language as defined in the spec. So you'll be doing deltas with the SQL Server 2005 or System.Xml implementations (which are closely aligned) in any case.

Remember that both books are based on a standard-in-progress, although now it's close to done. Happy XQuery-ing.

Sunday, August 29, 2004 11:09:45 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, August 25, 2004

I'm at the Ascend program launch this week; it's being launched as part of a larger event. Yes, I finally got to meet Roger Doherty. I'm doing a day-long (actually a very long day-long) presentation of SQL Server 2005 new features for developers tomorrow. Today, I'm listening to Larry Chesnut speak about "DBA and scalability topics" as I write this. It's interesting to see what people consider "DBA topics" as opposed to "developer topics", because in the SQL Server culture, many small shop developers are also their own DBA. In addition, SQL Server runs some of the world's largest installtions.

It's also interesting that I've met up with quite a few folks here who took my class in beta 1 timeframe. They're not coming for the most part because they've "already seen it". If you look back at the blog entry for my "20 favorite beta 2 features" about a month ago, there's many new features even since beta 1. So I have at least 25 new things to talk about; the list of 20 were only favorites and I've "found" more since then.

I've also been asked here if our book "First Look at SQL Server for Developers" is based on beta2/.NET 2.0 beta 1 or the earlier Yukon beta1. The answer is Beta2 with qualifications. There's a lead time in publishing and we wanted the book to ship with the beta. So we took the information that we could get, did experiments intermediate versions and wrote what we thought would be in beta2 at the time. The server folks were pretty settled by that point, but there may be little nuances since that I've found through experimentation. The client-side folks made some big changes since we declared done, we caught as many of them as we could. That said, I'm keeping up to date with the latest on the book's website. It's difficult to write "in depth" book as the product's changing. I was inspired by Eric Gunnerson's excellent C# book, which shipped early but followed the product as it evovled via the book's website.

Wednesday, August 25, 2004 11:13:12 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, August 23, 2004

If you're developing procedural code, UDTs, or UDAggs in SQLCLR for SQL Server 2005, Visual Studio 2005 has these nice auto-deploy projects called SQL Server projects that you can use. Comes with templates for each SQL Server 2005 item you can develop, just choose a SQL Server and database at project creation and hit “Deploy” menu entry when you're ready. F5 debugging is even built in to them. VERY cool.

But...there are some folks that don't want any help. Or have an existing class library they want to try. Or want to start from first principals and show others how to install a .NET assembly “by hand”. In Visual Studio Alpha/SQL Server 2005 Beta1 you could do this with a default class library project. In VS Beta1, you can't because there are mutable statics in the class library project “by default”. You have to install it as UNSAFE. Or....

1. Create a new class library project.
2. Click "show all files".
3. Look under Properties (C# project) or MyProject (VB.NET project)
4. Delete Resources.resx and Settings.settings. If you are working in VB.NET project you do NOT have to delete MyApplication.myapp.
5. You're good to go. Just code it up and manually deploy as SAFE (the default).

Monday, August 23, 2004 3:01:23 PM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

To squeeze every ounce of performance out of the SqlServer data provider, the rule is to allocate the least number of ancillary object instances possible. In SQLCLR Optimizations 1, I mentioned a mechanism to keep from creating a SqlCommand and associated objects, just to initialize a SqlDefinition.

Another performance increase comes by not allocating/using extra buffers. When you execute a statement that produces a rowset, this means using SqlDataReader only when you need it. You need it only when you are  going to consume the results IN the .NET procedure and do some processing with them. You don't need a SqlDataReader just to pass a rowset back to the client. As an example, if I want to emulate this trivial T-SQL proc:

CREATE PROCEDURE getauthors
AS
SELECT * FROM authors


this .NET code looked pretty straightforward to me:

public static void GetAuthors1()
{
  // create and initialize command
  SqlCommand cmd = SqlContext.GetCommand();
  cmd.CommandText = "select * from authors";

  // get data reader
  SqlDataReader rdr = cmd.ExecuteReader();

  // pass results to client
  SqlContext.GetPipe().Send(rdr);
}

This actually allocates enough of a buffer in your procedure to hold a row and reads into the buffer, just to turn around and hand it to the SqlPipe. That buffer is pure overhead. A more performant way, that basically points the rowset at the SqlPipe, is to use SqlPipe.Execute. You can Execute a SqlCommand or use a SqlExecutionContext.

The SqlExecutionContext is the class that represents the volitile parts of SqlCommand, the parameters, and contains methods to execute SQL and return results. Here's the same procedure, using both SqlDefinition/SqlExecutionContext and SqlPipe.Execute.

// initialized once during class construction
readonly static SqlDefinition def =
  new SqlDefinition("select * from authors", null, null);

public static void GetAuthors2()
{
  // lookup static part of command
  // in this example, the command is entirely static
  SqlExecutionContext ec =
    SqlContext.GetConnection().CreateExecutionContext(def);

  // execute it. no SqlDataReader needed, results to client
  SqlContext.GetPipe().Execute(ec);
}

At first glance, you look at System.Data.SqlServer and see a few classes that look strange, but most everything looks the same as with the SqlClient provider. The big perf improvements, however, come from using the classes that look strange and are different. That's why they are there, to allow efficiencies that you can't get (like not allocating a buffer to receive a rowset) on the client. But you wouldn't send a rowset back to the "client" from the client, this only happens in a procedure inside SQL Server. On the server it's a whole different ballgame.

 

Sunday, August 22, 2004 11:14:56 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Friday, August 20, 2004

What's error 6522? It's the error that you always get from an unhandled exception in a SQLCLR stored procedure. If I write a SQLCLR error that throws an unhandled exception (let's write one on purpose):

[SqlProcedure]
public static void ErrorExecute()
{
    SqlCommand cmd = SqlContext.GetCommand();
    // everyone knows Bob can't type
    cmd.CommandText = "select * from authorss";
    SqlContext.GetPipe().Execute(cmd);
}

using it from T-SQL:

execute errorexecute
print @@error

Yields:

Msg 208, Level 16, State 1, Line 0
Invalid object name 'authorss'.
Msg 6522, Level 16, State 1, Procedure ErrorExecute, Line 1
A .NET Framework error occurred during execution of user defined routine or aggregate 'ErrorExecute':
System.Data.SqlServer.SqlException: Invalid object name 'authorss'.
   at System.Data.SqlServer.Internal.StandardEventSink.HandleErrors()
   at System.Data.SqlServer.Internal.RequestExecutor.HandleExecute(EventTranslator eventTranslator, SqlConnection conn, SqlTransaction tran, ClrLevelContext clrlvlctxtPipe, UrtExecutionType uetType, InternalResultSetOptions irsoOpts, CommandBehavior eBehavior, Object& objResult)
   at System.Data.SqlServer.Internal.RequestExecutor.ExecuteToPipe(SqlConnection conn, SqlT.
6522

Not that @@ERROR returns 6522, NOT 208. If you write an equivalent bad T-SQL proc:

create procedure errortsql
as
-- cant type any better in TSQL
select * from authorss
go

execute errortsql
print @@error

returns 208.

Why does this matter? Say that I have a T-SQL stored procedure (not this one, obviously) that I want to replace with a SQLCLR equivalent that say, runs faster. Say the procedure is used in 20 places in my application, each with semantics that check for specific values of @@ERROR. Won't work the same.

You can't catch the exception in SQLCLR and throw the "correct" one (using a SqlCommand with CommandText of RAISERROR(...)) either, because you can't throw a 208 error (or any other system error) with RAISERROR. So I'll have to change each one of my caller procs too when I switch to SQLCLR. The only current workaround is to change to the new T-SQL try-catch syntax

BEGIN TRY
  EXECUTE errorexecute
END TRY
BEGIN CATCH
  -- prints 208, whew...
  PRINT CONVERT(varchar(10), error_number())
END CATCH

Hopefully this will be changed or a workaround in SQLCLR will be available in the next beta. I want @@ERROR to somehow return 208, not 6522. Or this will be a great motivator for everyone to change to T-SQL TRY-CATCH.

Friday, August 20, 2004 3:14:09 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Good T-SQL programmers are always looking to optimize their procedural code. Little optimizations can end up as big savings if the code will be executed many times or if it's used in many places in the application. Good SQLCLR programmers will be doing this too. Recently I ran into a micro-optimization that has promise.

Everyone who's used SQLCLR in earnest knows by now that it's useful to separate SqlCommand into two pieces: SqlDefinition and SqlExecutionContext. SqlDefinition is the static portion of the command, SqlExecutionContext contains the execution methods like ExecuteNonQuery and friends. Last time I'd heard rumors, about a 20% performance improvement, although it's not usually useful (or not allowed in betas) to report performance numbers. It's beta software after all, possibly with extra unoptimized code that will be removed/optimized at release. Especially SQLCLR, a brand new feature. So I didn't confirm the rumored numbers; sounded like a reasonable assumption.

SqlDefinitions are best initialized in the class constructor and stashed in readonly static (Shared Readonly in VB.NET) variables. I've done it like this:

static readonly SqlDefinition def = null;

static MyClass() {
  SqlCommand cmd = SqlContext.GetCommand();
  cmd.CommandText =
      "select * from authors where au_id like @au_id";
  cmd.Parameters.Add("@au_id", SqlDbType.Varchar);
  cmd.Parameters[0].Size = 50;
  def = new SqlDefinition(cmd);
}

Lately, I've noticed there's an even better way. There's a constructor for SqlDefinition that takes a CommandText, CommandType, array of SqlMetaData, and array of ParameterDirection. I can even put the initialization of these in the variable declarations, like this:

readonly static SqlMetaData[] md =
    new SqlMetaData[1] { new SqlMetaData("@au_id", SqlDbType.VarChar, 50) };
readonly static ParameterDirection[] pd =
    new ParameterDirection[1] { ParameterDirection.Input };
readonly static SqlDefinition def = new SqlDefinition(
    "select * from authors where au_id like @au_id",
    CommandType.Text, md, pd);

The variables must be defined in this order, as the SqlDefinition depends on the other variables being initialized first. Or I could initialize them explicitly in the class constructor.

Doing it this way saves me from creating object instances, then throwing them away to the garbage collector. Namely:
1. 1 SqlCommand
2. 1 SqlParameterCollection
3. N SqlParameters where N is the number of parms in the query. SqlParameter actually contains SqlMetaData as a member, plus "other stuff"

Because I'm only executing this ONCE (the first time the class is instantiated) in an entire SQL Server run, is this a micro-optimization? Probably. But what if I have 100 or 1000 such SqlDefinitions? Maybe not as "micro" then... what do you think?

Friday, August 20, 2004 9:54:48 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, August 18, 2004

I've been trying out the new session support in HTTP endpoints. You manipulate this through SOAP headers, and looking through the list of SOAP headers supported allowed me to understand this feature from an entirely different perspective. It looks like the session support makes this into more of an alternate client stack than a mechanism to serve web services on the internet or (more likely) intranet.

The HTTP endpoints session support is meant to overcome the limitation that HTTP is a stateless protocol, while TDS/SNI/Netlibs is a stateful one. That is, you normally login to SQL Server perform multiple operations and logoff. With HTTP endpoint sessions, you can do this. There are SOAP headers for "connection string parameters" like initial catalog and application name, there are initiate (login) and terminate (logoff) attributes. There are headers exposing concepts like database transactions and prepared statements. There are even headers for new SQL Server 2005 features like database mirroring partner (for failover) and notification request (for query notifications).

I'll post some samples to show what's possible. They use the SoapHeader definition ("include file") starter (tweaked a bit) from the BOL. Bear in mind that if you define the endpoint on your own machine (you can't come over to my house and use my machine, sorry) you need to update the web reference. Doing so wipes out the SOAP Headers that you need in the proxy class (Reference.cs) to make this work. So don't forget to add 'em back. Save the original code in case you forget. For a more detailed explanation than that, refer to BOL or come to the class and I'll show you how. Or come to Windev and Dan Sullivan will, he's doing the talk on this feature. And have a think about why this is useful, maybe I'll discuss this next.

Wednesday, August 18, 2004 11:02:55 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, August 16, 2004

When you load XML into an ADO.NET DataSet using it's ReadXml() method, ADO.NET uses a fixed set of rules to determine how to infer a relational structure for the XML. These rules are doc'd in the .NET framework docs. One of the most frustrating rules was that you couldn't have the same attribute name or subelement name in more than one different element. Doing so cause the ReadXml() method to produce the well-known error “The same table (Table name) cannot be the child table in two nested relations.”

The algorithm is improved so this type of XML can be loaded in .NET 2.0. It can load my test semistructured (also known as “tags and text”) document, as well as more mundane XML documents that caused this error in previous versions. The document:

<book>
<chapter>
<title>Testing your <noun>typewriter</noun></title>
<p>The quick brown <noun>fox</noun> jumps over
the lazy <noun>dog</noun></p>
</chapter>
</book> 

Produces 4 tables, named book, chapter, p, and noun. It even serializes out using WriteXml() the same way it looked when it was loaded, adding only an extra unused namespace declaration for xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance".

This will be a help for developers using this method of XML to relational mapping.

Monday, August 16, 2004 9:32:25 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Thursday, August 12, 2004

I was talking with Dan Sullivan this week about UDTs in SQL Server 2005 and he's made an interesting discovery. About a week ago, I'd happened to mention to Dan that UDTs and UDAggs (user-defined aggregates) are supposed to have a pattern of "deserialize-do action-serialize". He thought I'd misunderstood. Then he went off to write an improved UDT/UDAgg for his talk at Windev. Dan has an engineering background and is excited about using UDTs for data types like Linear Dimension (he wrote the LDim example in our book).

What Dan discovered in attempting to implement some initialization inside the constructor is that, inside SQL Server operations, the constructor is never called! His UDT used Format.UserDefined, which means he implements the IBinarySerialize interface to Read and Write the UDT state. The engine just allocates some bare memory for an instance with the IL “newobj” instruction, then calls IBinarySerialize.Read to fill it out. What this means is that any internal fields that you would initialize inside a constructor must also be initialized when IBinarySerialize.Read is called as well. Dan found this out the hard way, reference types he'd expected to be initialized in the constructor were null when Read is called.

He's made a common implementation pattern in his UDTs: always have an Initialize or Init method and call it both in the constructor (for "ordinary" or client usage) and in IBinarySerialize.Read (for use inside SQL Server 2005). A good thing to keep in mind if you're writing a UDT.

When I asked if this wasn't against the "rules", his response was that C# always calls the constructor after newobj, but it's not a .NET requirement. Niels Berglund mentioned that this behavior is also true in Format.Native too, but all its fields are value types, so they are all initialized to zero by the runtime.

Niels also mentioned an interesting pattern he observed by tracing through a UDAgg he'd written with Format.UserDefined.UDAggs implement an Init() method instead of a constructor so that instances can be reused, for example in GROUP BY statements. The pattern Niels observed was:

1. Init
2. Write (but Accumulate hasn't been called yet)
3. Read-Accumulate-Write (for each row processed)
4. Read-Terminate

This pattern had puzzled him at the time, but alongside Dan's UDT observations this makes sense. I'd originally thought that the Read-Accumulate-Write pattern was for parallel processing, but it always happens even on a single processor. At this point we're all thinking it might a hedge based on how thread scheduling works in SQL Server.

So don't forget Dan's "Initialize in Read or constructor" pattern when you write Format.UserDefined UDTs. I think I have the good fortune to work with some pretty observant folks.

Thursday, August 12, 2004 10:18:03 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

As Mark Fussell will tell you, one of the coolest things in System.Xml 2.0 is the static creator methods on XmlReader and XmlWriter. It made his top 10 list, albeit at number 10.

I've been looking into how the client side handles SQL Server 2005's XML data type and was bemoaning the fact that System.Data.SqlTypes.SqlXml had a CreateReader method that probably didn't set what I wanted set - ConformanceLevel.Fragment. Sounded like just the ticket for SQL Server 2005. Checked with Reflector, and lo and behold, they *do* use XmlReader.Create with ConformanceLevel.Fragment. Very cool, and the information was helpful to Kent Tegels on the beta newsgroup as well, just this morning. My research timing karma seems pretty good lately.

Now, I want an overload that looks like SqlXml.CreateReader(XmlReaderSettings). And something that handles sequences of atomic types. Maybe in .NET 2.0 beta2. Hey Mark... maybe I'll hit him up for it at WinDev.

Thursday, August 12, 2004 2:12:17 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, August 10, 2004

Came across something recently that I thought was odd (or a bug) when dealing with UDTs in SQL Server Management Studio.

If I have a UDT, say ComplexNumber, I deploy it to SQL Server 2005. Use it as a column in a UDT table, complextab. Insert a row:

INSERT complextab VALUES('1:1i')

Then attempt to SELECT all of the values:

SELECT * FROM complextab

I get the following error message:

An error occurred while executing batch. Error message is: File or assembly name 'ComplexNumber, Version=1.0.0.0, Culture=neutral, PublicKeyToken=17177e16a4b86577', or one of its dependencies, was not found.

Scratched my head and thought about it for a few minutes. Reason for this is that SQL Server Management Studio (SSMS) is just another client of SQL Server that uses ADO.NET. In order to use user-defined types with ADO.NET, you must deploy the assembly to the client. So putting the ComplexNumber assembly in the GAC or in the directory that contains SSMS fixes this "problem". It's not a problem, it’s by design. The SSMS will call ToString() ON THE CLIENT SIDE and display the value just fine.

Note that:
1. If you change it to: "SELECT complexcol.ToString() FROM complextab" it works because ToString is being called on the server, not on the client.
2. If you run the same command from SQLCMD it display the binary value of the ComplexNumber column. That's because it uses OLE DB to talk to SQL Server, not ADO.NET. Interestingly, this is what SQL Workbench in Beta 1 did also. Guess the folks at SSMS decided to nicely format it for Beta 2.

Tuesday, August 10, 2004 1:55:02 PM (Pacific Standard Time, UTC-08:00)  #    Comments [3]  | 

I’m a person who always likes to know where things live in the OS, database, or whatever product I’m dealing with. Being able to point to a specific DLL or configuration file or registry entry gives me something tangible to hang my hat on, rather than think that things happen “by magic”.

 

So I was intrigued by the apparent disappearance of the network libraries in SQL Server 2005. Start with these points:

1. SQL Server 2005 B2 does not ship a new version of MDAC
2. SQL Server 2005 B2 does ship with SQL Native Client (codename was SNAC) with new OLE DB provider and ODBC driver. It also appears to cover network connectivity.  In the new client config utility, all (four) supported protocols list netlibs as “SQLNCLI.dll”. That's SQL Native Client.
3. System.Data.SqlClient does not use MDAC any more
4. System.Data.SqlClient did use DBNETLIB.dll to talk to SQL Server in the past

 

So does .NET SqlClient use SQL Native Client? Dan, Niels, and I did some experiments and it doesn’t appear that SqlClient calls SNAC at all. We even installed .NET 2.0 on a machine without SNAC and did tests that include MARS and async (these require new protocol features). Worked fine, no SQLNCLI or DBNETLIB loaded or in sight. So how’d they do that?

 

Looking in System.Data.dll, it appears that there are some calls that start with the letters SNI. There are no calls to DBNETLIB (as there was in System.Data.dll in .NET 1.1). A quick search of Books Online yielded only references to SNI associated with Service Broker protocol (as in “the SNI connection string for Service Broker...”).

 

So it appears that both System.Data.dll and SQLNCLI.dll implement SNI calls directly rather than calling a specific library. And that the SNI model is used by more than just client-server protocols. It is possibly used for everything that is covered by CREATE ENDPOINT. System.Data.SqlClient doesn’t use MDAC or SNAC, but both System.Data.SqlClient and SNAC are compatible with previous versions of SQL Server, because I could use the OLE DB Native Client provider with SQL Server 2000 just fine. Hmmm.

 

Stay tuned...

Tuesday, August 10, 2004 9:29:59 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Monday, August 09, 2004

I was writing slides and labs last week. I read Dare's discussion of “The Sex and Cash theory“ about doing what you think is cool vs doing what you do for cash (work) that may not be as “sexy“. Didn't feel so bad after that. My discussions with others revolved around the “official” words and terminolgy for various SQL Server 2005-related items.

SQL Server Management Studio - its too long to fit on a slide. The obvious acronym (SMS) is already “taken” and would be confusing. Then I saw SSMS as an acronym posted on a newsgroup. Cool acronym. Done.

Did you also notice the suffixes for SQL Server Management Studio projects? They absolutely win the aware for longest Windows file suffix ever. “MyProject.ssmssln is for the solution file“ and “MyProject.ssmssqlproj“ for the file that lists the subfiles in a project. Wow. Well the file associations work and I guess we've gone way past 8.3 names.

SQL Server's native XML data type - I refer to this everywhere as the “XML data type” . An XML aficionado pointed out that I am preempting  a term usually used to refer to XML types defined in an XML schema and that I should cease and desist. So I did a little research.

The SQL Server 2005 Books Online refers to it as “xml data type”, but with a lower-case “xml”. Michael Rys, the PM of the SQL Server XML group refers to it in his “XQuery From The Experts“ book chapter on an XML native data type in a SQL database as “XML datatype”. Upper case XML. Another database vendor refers to it as XMLType. Couldn't find it in the ANSI-SQL 2003 spec, part 14 at all (it's mainly about mapping). And the ANSI-SQL 2003 spec refers to types defined in an XML Schema as “XML Schema Data Types”, not XML data type at all.

“XML folks“ don't usually like to see the acronym XML in small letters and “SQL folks“ have a convention of capitalizing SQL keywords and data types, they might be offended if it is in lower case. XML is now a T-SQL keyword. Oh well...In the end, I decided to stick with I have. And now I use the term “XML Schema Data Types”. If you ever read anything on this subject that I've written, now you'll know what I mean.

Monday, August 09, 2004 10:11:34 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, August 01, 2004

I've been looking at the SMO (SQL Server Management Objects) model recently. Because the amount of data they retrieve is potentially large and potentially expensive to produce (say sp_spaceused on every database), they use most/all of the *data transfer optimization* techniques: both sparse and eager population, lazy instanciation, caching, deferred/chunky writeback, and locality of reference. Reasonable, documented defaults with progammable overrides.

What's this got to do with object-relational mapping? Well, nothing. And everything. These techniques are data optimization techniques. They are the same whether you talk about them in terms of SQL SELECT statements or objects. The fact that pieces of data in a database are mapped to properties of classes in an object model (data model mapping) is only peripherally related to data access optimization, at best.

Before I go forward, I'll state that I don't consider myself an O/R mapping (or XML-relational or XML-object) bigot on either side, either for it or against it. Really. But many folks, especially when they are bashing mapping, use the data transfer optimization techniques (along with another data access concept - locality of reference) to justify their arguments. That's like me blaming the inventor of the knife and fork for my current extra weight. Data optimization, badly implemented, will kill a data-centric object model. But you can accomplish the same thing with bare SQL statements and DataReaders/Rowsets.

For an easy to picture example, imagine paging through a large resultset in ASP.NET. You can:
1. Cache the entire resultset (eager population + caching)
2. Use TOP queries to SELECT on a per-page basis using a custom ASP.NET pager
3. Send the whole bunch to client for paging (eager population + locality of reference)
4. Fetch entire resultset each time and throw all but one-page away. (bad perf).

You can accomplish all this, including the bad design, with DataReaders only, using DataSets for your cache. Not an object-mapped-to-a-relation in sight. Another easy-to-picture example is the Windows file explorer, with plus signs next to each directory. Sparse population/lazy instanciation. The “show details“ setting: sparse property population choice.

Here's a few more quick examples, using ObjectSpaces (technology on-hold, but this was the latest implementation I'd studied) and SQL statements.

// baseline
ObjectReader<Customers> or = os.GetObjectReader<Customer>("");

SqlCommand cmd = new SqlCommand("select * from customers");
SqlDataReader rdr = cmd.ExecuteReader();

// eager population
ObjectReader<Customers> or = os.GetObjectReader<Customer>("", "Orders, Details");

SqlCommand cmd = new SqlCommand(
 "select a.*, b.*, c.* from customers a
  inner join orders b on a.custid = b.custid
  inner join details c on b.orderid = c.orderid");
SqlDataReader rdr = cmd.ExecuteReader();

// sparse population
// customers contains a customer photo
// sparse population in mapping file, can be overriden on query
ObjectReader<Customers> or = os.GetObjectReader<Customer>("");

// leave customer photo out of the SELECT
SqlCommand cmd = new SqlCommand("select customerid, customer_name... from customers");
SqlDataReader rdr = cmd.ExecuteReader();

SO...is mapping among data models a bad idea, or is not knowing/designing for/using the data optimization techniques a bad idea? Mapping objects to data *badly* (or having a model so narrow there are no choices) is just as bad as writing bad SELECTs. You have to know where your data lives, how expensive it is to produce it, and how you are going to use it in ANY case. O/R mapping DOESN'T ABSOLVE YOU from that responsibility. But bad designs are the result of uninformed designers, not a specific technology. Thinking that a specific technology is a panacea and now you don't have to consider design, is also dangerous.

Sunday, August 01, 2004 1:23:35 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

There have been a number of questions recently (well OK, three) on the beta newsgroups about SqlDependency problems. It's been suggested that SqlDependency doesn't work in Whidbey Beta 1.

The reason for this is that SqlDependency (and, of course, its lower-level cousin SqlNotificationRequest) requires a “valid” notify-able SQL query to work. What's a “valid”  notify-able query for a NotificationRequest? Subscriptions for query notification in SQL Server 2005 use the same underlying mechanism to be notified of resultset changes as indexed views do. Therefore the rules are the same as the rules for indexed views.

Since the example in our “First Look at SQL Server 2005 for Developers” book also uses an invalid query (it doesn't use a two-part table name), I posted the rules for indexed views (and therefore for SqlNotificationRequest/SqlDependency on the book website, under “changes since the book shipped/chapter 12”. It seemed that you could “get away with” not always following the rules until SQL Server 2005 Beta2. The list of rules is from the SQL Server 2005 Beta 2 Books Online.

By the way, if you submit an invalid SELECT statement with a SqlDependency, you'll get an immediate notification with the reason (in SqlNotificationEventArgs) “invalid”.

Sunday, August 01, 2004 8:24:38 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Wednesday, July 28, 2004

Just when I thought I'd found all the new cool features. An Ascend Phase 1 participant once asked me “can the DBA get rid of unwanted query notification subscriptions”? In Beta2 you can.

SELECT * FROM sys.dm_qn_subscriptions

-- pick the ID of the subscription that you want, then
-- say its ID = 42
KILL QUERY NOTIFICATION SUBSCRIPTION 42

Cool. Your subscriber app doesn't appear to get notified its been killed, though, (as it does when almost anything else affecting the subscription happens) either when using SqlDependency or SqlNotificationRequest . That's why this should only be used for “pesky” subscriptions.

Wednesday, July 28, 2004 1:14:59 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Tuesday, July 27, 2004

I started my career as a maintanance programmer, and spent a lot of time searching out and fixing bugs, in existing code and also introduced by required program maintanance. Sometimes at 3am, with a hard deadline for the fix. I believe it shows in the way I present technologies (as in, “these are the intricate details, here's the gotchas”). Sometimes this way of approaching things comes across as negative, its really just meant to save you a few hours in your middle of the night maintanance/research.

So this might sound a little out a bit out of character, but I must give you the brief list of my 20 favorite new beta 2 developer features so far. These are in no particular order, off the top of my head, and I'm sure I'm leaving some out. Doesn't even touch things like DTS, RMO, database admin, and data mining stuff. And some were in Beta1, but vastly improved for Beta2. Check em out.

1. Statement level snapshot syntax changes from trace flag to ALTER DATABASE...SET READ_COMMITTED_SNAPSHOT ON.
2. XML SCHEMA COLLECTIONS replace CREATE XML SCHEMA (and DOCUMENT/CONTENT keyword)
3. Secret/key storage (CREATE MASTER KEY, etc)
4. Support of transactions by System.Data.SqlServer
5. Integration of CLR attributes like SqlFunc/SqlFunction
5. Widening of the ENDPOINT concept (see CREATE ENDPOINT)
6. FOR XML PATH
7. XQuery update to Nov 2003 alignment
8. Improvements in TSQL TRY-CATCH
9. RANKING and PARTITIONING functions
10. The new metadata (sys.everything) - more improvements
11. FOR XML ... ROOT(). (Hooray)
12. Service Broker routing, protocol, and security changes
13. XML INDEXES improvements and granularity
14. Additions to HTTP ENDPOINT support, sessions, transactions, etc
15. The new "virtual tables" (dynamic metadata info)
16. TSQL Encryption Functions
17. "ValidationMethodName" attribute property in UDTs
18. EXECUTE AS USER= (or LOGIN=) statement as a replacement/extension for SETUSER
19. Dynamic WSDL generation hook in ENDPOINTs FOR HTTP
20. Promotable transactions

At the risk of sounding syncophantic...WOW.

Tuesday, July 27, 2004 10:59:28 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, July 26, 2004

sqlaccess.dll is the in-database .NET data provider, System.Data.SqlServer. If you do any data access in .NET stored procedures, UDFs, triggers, UDTs, and UDAggregates, you are using this provider.

There is a version in Visual Studio Beta 1 (in \Program Files\Microsoft Visual Studio 8\Common7\IDE\PublicAssemblies) that (I think) Visual Studio uses in its Database projects. There's a different version (the size is different by 1k) in \Microsoft SQL Server\MSSQL.1\MSSQL\binn.

I've renamed Visual Studio's version to sqlaccess_vs.dll and copied SQL Server's version to the VS PublicAssemblies directory. I'm not exactly sure what's different yet, but am suspicious because the PublicAssemblies directory also contains ClrCppModule.dll. In SQLServer 2005 Beta 2, this piece is no longer needed to run CLR items in SQL Server.

Monday, July 26, 2004 10:08:08 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

SQL Server 2005 beta 2 is now available for MSDN subscribers on the MSDN website. Although it's not yet listed in the “what's hot” section, it's available under Servers/SQL Server/SQL Server 2005. I've got a few initial observations upon downloading it and installing it.

1. Read the readme file (ReadmeSQL2005.htm). Although this usually goes without saying, there are some common installation considerations in there that can save you time or even a reinstall (e.g., Changing the Service Account is not yet supported).

2. Read the known issues file (sqlki.sql). It's worth giving it a quick read to start, and, if something you've heard was going to be in beta 2 doesn't work the way you think it should have, go back to this file as a “first resort”.

3. When installing SQL Server 2005 Beta 2 and Visual Studio Beta 1 together, realize they have slightly different versions of the .NET CLR. Beta 2 has 40607.42, VS Beta 1 has 40607.16. They're compatible, but you have to run the .42 version of the CLR. Check the readme for details.

4. The SQL Express version that shipped with Visual Studio Beta 1 is not Beta 2. I've already discovered one difference; SQL Express can use “FOR XML” with user-defined-type columns, in Beta 2 this produces an “unsupported” error message. There is a newer version of SQL Express released on MSDN subscriptions...or install the Developer's Edition of Beta 2.

Have fun...

Monday, July 26, 2004 9:57:41 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 
Wednesday, July 21, 2004

In recent posts I'd mentioned mapping a few times, it's time to get back and explore it in earnest. There appears to be three major data models in use by programmers today:

  • Relational - that's where the majority of the corporate data is stored. SQL and its product-specific dialects is the main programming language that follows this model.
  • Objects - really defined as state and behavior, the state being the data, the behavior being what you use to manipulate that data. Object-oriented databases peaked and waned in the 90s, but object-oriented programming languages are what's used by the majority of programmers today
  • XML - XML has accomodations for markup and data, a schema language and query languages. XML-specific databases are on the rise, albeit slowly. Instead, most/all mainstream relational database vendors are adding accomodations for XML data in their databases today (ANSI-SQL 2003), just as they added accomodations for objects in the 90s (ANSI-SQL 1999).

If you program using more than one of these models, it becomes necessary to map between these models. For example, object-oriented programmers are never satisfied with my answer when asking about representing data as objects: “the familiar connection-command-rowset paradigm IS an object-oriented implementation of data access“. They don't want rowsets and rowset metadata; they want business objects. They are just as unlikely to be happy with the answer that the XML DOM or streaming apis can be implemented using an object model. So mapping (or dealing with the reality that the "other side" of the protocol pipe may be using mapping) is a necessity if you can't mandate what all of the users of your data will do. When you speak of object-oriented programmers using web services (which uses XML as a marshaling format and for just about everything else) you're even talking mapping for data marshaling. Whether you stick with composition and simple types or go all out to make relational or XML represent complex types, it's just a different refinement of the same mapping. So we're talking two way:

Relational to object mapping
Relational to XML mapping
XML to object mapping

Currently, there are Microsoft implementations of mapping in the .NET libraries in System.Xml.Serialization and the remoting SoapSerializer. The SOAP serializer is based on mapping type systems to XML based on a pre-XML schema attempt in SOAP 1.0 section 6. With the completion of the XML Schema spec, fewer people each day remember that SOAP originally stood for Simple Object Access Protocol. It's acronym status has even been revoked. The schema-centric model (document-literal format in WSDL) is what most modern web services use.

The experimental COmega language made this mapping an implicit part of the programming language, consuming metadata with two utilities SQL2COmega and XML2COmega, and making SELECT a language keyword. This be an interesting approach with a future.

I was intrigued by the inclusion of a three-part mapping schema format that made a brief appearance in .NET 2.0 alpha 1, and seems to have faded off into the future. The format was based on the presumption that you had existed XML schemas, relational databases and object models and you didn't want them to change. Current technologies like SQLXML mapping schemas in SQL Server's SQLXML use annotated schemas, with annotations that mapping XML elements and attributes to table columns or special formats like UpdateGrams and DiffGrams. This meant you had to annotate the schemas. The mapping format left XSD schemas alone, but introduced XML representations of relational schemas (known as RSD format) and object schemas (known as OSD format). Mapping any type to any other was accomplished via a mapping schema format (known as MSD) that could represent constructs outside the boundary of the "original" data model. It was also flexible in that it did not mandate a one-to-one mapping, implemented declarative mapping (for offline code generation) and could expose multiple overlapping mappings over the same data.

This mapping format made its appearance in two guises: System.Data.SqlXml (XML/relational) and ObjectSpaces (object/relational). ObjectSpaces has been postponed until "the next release" of .NET, .NET 2.0 SqlXml mapping has also been pushed off into the future. I couldn't even find a current link to this work on MSDN. System.Xml.Serialization didn't choose to implement this mapping model, keeping .NET attributes as its model to acheive mapping. So it will be intresting to see what (if anything) becomes of this model.

Bottom line is, unless you develop in a cloistered environment where you can safely ignore other data models, you're gonna have to map some data. Embedded SQL-92 development is such an environment, so is using XML web services with the native XML apis, simple types, and query languages only. The rest of everyone will have to use toolkits or roll their own mapping for now. And stay away from complicated domain-specific constructs, because the other side may be using a different domain model.

Wednesday, July 21, 2004 3:59:43 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Theme design by Jelle Druyts

Pick a theme: