Thursday, January 27, 2005

Here's the answers to the question from Fun With static XQuery evaluation - 2

-- start with a schema collection

CREATE XML SCHEMA COLLECTION ages AS
'<xs:schema
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="urn:ages"
   xmlns:tns="urn:ages">
<xs:element name="age" type="xs:int"/>
</xs:schema>
'
GO

DECLARE @x xml(ages)
SET @x = '<age>12</age>'
-- fails ??!
SELECT @x.query('string(/age)')
GO

This fails because there can be more than one <age> element and fn:string requires a singleton or empty sequence.

--- These work ---

-- this query restricts it to the first age element
DECLARE @x xml(ages)
SET @x = '<age>12</age>'
SELECT @x.query('string(/age[1])')
GO

-- this restricts the variable to XML documents. Fragments disallowed.
-- This means there can be only ONE (or zero) age elements.
-- No subscript is needed on the query then.
DECLARE @x xml(document ages)
SET @x = '<age>12</age>'
SELECT @x.query('string(/age)')
GO

The second one was a bit harder if you haven't run across the (document schemacollection) construct. Remember that XML data type can contain documents or fragments. Putting "document " before the schema collection name in any typed XML declaration restricts instances to an XML document (ie, single root element). The default is "content" so:

declare @x xml(content ages)    -- use ages xml schema collection, allow fragments or documents
declare @x xml(document ages) -- disallow fragments; documents only
declare @x xml(ages)               -- equals using "content"

Note that you can only enforce "document only" using this keyword with TYPED XML. It's not supported on untyped XML instances. You can do the equivalent enforcement with an untyped XML column in a table by using an XML check constraint, like this:

create table foo (
  xmlcol xml constraint mycontr
         xmlcol.value('count(/*)', 'int') = 1 and xmlcol.exist('/text()')=0

Hope you've enjoy this foray into static typing and XQuery. Because this is a "implementation decision" you won't find much about this in the W3C spec. The best information about this is in the excellent XML Best Practices for Microsoft SQL Server 2005 document.

BTW, in case you collect W3C specs for your own offline reference (like I do), bear in mind that the final SQL Server 2005 implementation of XQuery will be aligned with the W3C July 2004 XQuery spec series. XQuery is still a W3C "work in progress". SQL Server 2005 implements a subset of the functions and operators, adds functions to access T-SQL variables and SQL columns, and also implements static typing. So it's not a 1-to-1 match with the spec, but if you like W3C specs, July 2004 is the one you want. For now.

Thursday, January 27, 2005 11:03:05 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, January 26, 2005

After the last two entries, you might be thinking "I guess I can never use text() as a node test with typed XML again". Not so. The error message reads: 'text()' is not supported on simple typed or 'http://www.w3.org/2001/XMLSchema#anyType' elements. So what's left? Mixed content, for one thing. Mixed content consists of a mixture of text and also embedded subelements.

If we change the schema to allow mixed content (this schema also allows a particular subelement):

CREATE XML SCHEMA COLLECTION mixedage AS
'<xs:schema
xmlns:xs="http://www.w3.org/2001/XMLSchema"
targetNamespace="urn:ages"
xmlns:tns="urn:ages">
  <xs:complexType name="age" mixed="true">
    <xs:complexContent mixed="true">
      <xs:restriction base="xs:anyType">
         <xs:sequence>
           <xs:element name="dogyears" type="xs:int"/>
         </xs:sequence>
      </xs:restriction>
    </xs:complexContent>
  </xs:complexType>

<xs:element name="age" type="tns:age"/>
</xs:schema>
'

Then the text() node test works with typed XML just fine:

DECLARE @x xml(mixedage)
SET @x = '
<ag:age xmlns:ag="urn:ages">This is the age in dog years<dogyears>3</dogyears></ag:age>'
-- now it works OK
SELECT @x.query('
declare default namespace "urn:ages";
/age/text()')
GO

Wednesday, January 26, 2005 11:30:56 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Reference back to the previous entry. Now that we know the rules, let's try them out:

-- snip --

Data(),text() and string() accessors

XQuery has a function fn:data() to extract scalar, typed values from nodes, a node test text() to return text nodes, and the function fn:string() that returns the string value of a node. Their usages are sometimes confusing. Guidelines for their proper use in SQL Server 2005 are as follows. Consider the XML instance <age>12</age>.

Untyped XML: The path expression /age/text() returns the text node "12". The function fn:data(/age) returns the string value "12" and so does fn:string(/age).

Typed XML: The expression /age/text() returns static error for any simple typed <age> element. On the other hand, fn:data(/age) returns integer 12, while fn:string(/age) yields the string "12".

-- snip --

Try this:

DECLARE @x xml
SET @x = '<age>12</age>'
-- works as expected
SELECT @x.query('data(/age)')
GO

DECLARE @x xml
SET @x = '<age>12</age>'
-- fails
-- Msg 2211, Level 16, State 1, Line 6
-- XQuery [query()]: Singleton (or empty sequence) required, found operand of type 'element(age,xdt:untypedAny) *'
SELECT @x.query('string(/age)')
GO

Oh. This message looks familiar. It turns out that XQuery functions are strongly typed also. Here's the definition of fn:string and fn:data:

fn:string($arg as item()?) as xs:string

fn:data($arg as item()*) as xdt:anyAtomicType*

The "item()*" means that data takes a sequence of 0-n items. "item()?" means that string only takes a sequence of 0-1 item. Let's fix it then.

SELECT @x.query('string(/age[1])')

Cool. Here's the test for comprehension. Let's try this with typed XML.

-- start with a schema collection

CREATE XML SCHEMA COLLECTION ages AS
'<xs:schema
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="urn:ages"
   xmlns:tns="urn:ages">
<xs:element name="age" type="xs:int"/>
</xs:schema>
'
GO

DECLARE @x xml(ages)
SET @x = '<age xmlns="urn:ages">12</age>'
-- fails as expected
SELECT @x.query('
declare default namespace "urn:ages";
/age/text()')
GO

DECLARE @x xml(ages)
SET @x = '<age xmlns="urn:ages">12</age>'
-- works as expected
SELECT @x.query('
declare default namespace "urn:ages";
data(/age)')
GO

DECLARE @x xml(ages)
SET @x = '<age>12</age>'
-- fails ??!
SELECT @x.query('string(/age)')
GO

Why does the last query (against strongly typed XML) fail, even though there is a schema? How can you fix it? There are two different "right answers".

Wednesday, January 26, 2005 4:36:58 PM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 

There's been lots of puzzled faces lately when I try to explain doing XQuery against strongly typed XML (XML typed by a SCHEMA COLLECTION) vs untyped XML. The largest FAQ is why when I have this document:

<person>
  <name>bob</name>
  <age>51</age>
</person>

using the value function (after assignment to @person) @person.value('/person/age', 'int') returns my favorite error:

Msg 2389, Level 16, State 1, Line 6
XQuery [value()]: Operator 'value()' requires a singleton (or empty sequence), found operand of type 'xdt:untypedAtomic *'

Huh? Although you know by looking at the document that there is only one age element, the XQuery parser uses static evaluation. It doesn't read your document (or read your mind) and assumes there can be more than one age element. After all, there's no schema to enforce the singleton age element, I could have 3 or 4 ages. It doesn't want to guess and be wrong at execution time. Using:

@person.value('/person[1]/age[1]', 'int')

works. I can see why age[1] is required, but why person[1]? Doesn't XML have a single root element? Actually, no. SQL Server 2005 supports fragments (well-formed, multiple root) as well as documents. Fragment support is part of the XQuery 1.0/XPath 2.0 data model.

Most people get by that. The real fun starts when you do examples using untyped XML and XPath expressions with the text() node test. text() works just fine when using untyped XML, but fails against typed XML with simple content. Here's an example (the result of a discussion with Dan Sullivan):

CREATE XML SCHEMA COLLECTION root AS
'<xs:schema
   xmlns:xs="http://www.w3.org/2001/XMLSchema"
   targetNamespace="urn:geo"
   xmlns:tns="urn:geo">
<xs:element name="Root" type="xs:string"/>
</xs:schema>
'
GO

-- UNTYPED
-- this works
DECLARE @x  xml
set @x = '<g:Root xmlns:g="urn:geo">asdf</g:Root>'
select @x.query('
 declare namespace g="urn:geo"
 /g:Root/text()')

-- TYPED
-- Msg 9312, Level 16, State 1, Line 4
-- XQuery [query()]: 'text()' is not supported on simple typed
-- or 'http://www.w3.org/2001/XMLSchema#anyType' elements,
-- found 'element(g{urn:geo}:Root,xs:string) *'.

DECLARE @x  xml(root)
-- same document
set @x = '<g:Root xmlns:g="urn:geo">asdf</g:Root>'
select @x.query('declare namespace g="urn:geo"
/g:Root[1]/text()')

But why? Isn't text() a node test that returns the value of a text() node. After casting about in XQuery specs, and SQL BOL, I finally came across this in the XML Best Practices paper.

-- snip --

Data(),text() and string() accessors

XQuery has a function fn:data() to extract scalar, typed values from nodes, a node test text() to return text nodes, and the function fn:string() that returns the string value of a node. Their usages are sometimes confusing. Guidelines for their proper use in SQL Server 2005 are as follows. Consider the XML instance <age>12</age>.

Untyped XML: The path expression /age/text() returns the text node "12". The function fn:data(/age) returns the string value "12" and so does fn:string(/age).

Typed XML: The expression /age/text() returns static error for any simple typed <age> element. On the other hand, fn:data(/age) returns integer 12, while fn:string(/age) yields the string "12".

-- snip --

Well, that was confusing. But now I think I get it. When does a element not have a text() node (or more preicsely, not allow the text() node test)?? When it's a strong-typed query using a simple type element...that's when. But why? Although I know the rules now, I'm still somewhat baffled.

This is getting pretty long, more on this topic in a bit...

Wednesday, January 26, 2005 3:22:09 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Wednesday, January 19, 2005

A new feature of SQL Server 2005 that has been fairly well publicized is the ability, on Windows 2003 operating systems, to enforce password stregth, expiration, and lockout policies on SQL Server logins, as the operating system enforces them on Windows logins. The way that this works is that SQL Server calls NetValidatePasswordPolicy, a Win32 function available on Windows 2003. So if I have a machine policy (either standalone or more likely inherited from a domain policy) that a password must be at least 8 characters long, the following DDL will fail:

CREATE LOGIN bob WITH PASSWORD = 'bob'

you need:

CREATE LOGIN bob WITH PASSWORD = 'bob000000'

However, did you realize that password on other secrets will follow policies as well? For example:

CREATE APPLICATION ROLE somerolename WITH PASSWORD = 'aaa'
CREATE MASTER KEY ENCRYPTION BY PASSWORD = 'aaa'
CREATE CERTIFICATE foo WITH SUBJECT = 'foo', ENCRYPTION_PASSWORD = 'aaa'
CREATE SYMMETRIC KEY skey WITH ALGORITHM = DES ENCRYPTION BY PASSWORD = 'aaa'

will all fail for the same policy reasons. The lone straggler, at least as of Dec CTP is ASYMMETRIC KEY. This works...

CREATE ASYMMETRIC KEY akey WITH ALGORITHM = RSA_512
 ENCRYPTION BY PASSWORD = 'a'

Wednesday, January 19, 2005 11:13:37 PM (Pacific Standard Time, UTC-08:00)  #    Comments [5]  | 
Monday, January 17, 2005

One of my students last week noticed that using a Service Broker object name (like a CONTRACT, SERVICE, and MESSAGE TYPE name) with the wrong case caused an error message. That's because Service Broker object names are case sensitive by deisgn. Because these identifiers can go over the wire, and you can't predict the collation of the database instance on the other side, they have to go by binary collation. Even when the objects are defined in a single database, you can't assume that's the only place they'll be used. Thanks to Roger Wolter for clarifying this...

Keep this in mind when you define a SERVICE (with the required associated contract) for query notifications or event notifications, as well as in your own broker apps.

Monday, January 17, 2005 6:02:06 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Sunday, January 16, 2005

Haven't been blogging as much recently, as I've been teaching and travelling quite a bit. Last week though, I received an interesting query on XQuery query plans (that's more uses of the word query in one sentence than I've seen before). We were looking at SQL query plans when XQuery is involved. I'd remembered that:

1. The primary XML index builds a node table over an XML column
2. If there is no XML index, the node table must be built at query time

This seems to quite a big effect on number of estimated rows. Take a 1-row table with an untyped XML column. The row contains a fairly simple document from the W3C XQuery use cases. Let's do a simple query (like /BookStore/Book) over the document.

Without the primary XML index, the execution plan contains 3 UDX expressions, two of them have a large number of estimated rows, 1000 and 10000 estimated rows. The plan step that estimates 1000 rows returns 3 actual rows; the 10000 estimate step returns 114 rows. On the other hand, once the primary index is added, these two UDX expressions are replaced by clustered index seeks, with a fairly close estimate of rows vs the actual rows.

Looks like, if you're going to be doing any non-trivial amount of queries against the XML data type a primary XML index is pretty close to being a requirement. Building it as an index bodes much better than building it at execution time.

Sunday, January 16, 2005 11:37:40 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, January 06, 2005

I had an interesting flashback today. Someone wrote with a reference to a mailing list posting I'd written in 2000, about IRowsetFastLoad. They wanted code, because nowadays attachments to public postings have become a thing of the past. Looked in my OLE DB code, from back when I was teaching "Essential OLE DB", and ... voila. (I know there's a code sample in SQL BOL, and told the letter writer about it, but finding the original code was way more fun. Written in ATL and all, too.

The poster wanted to know what I'd recommend instead, if this didn't work out (his inserts are too slow). Choices that came to mind were:

1. BULK INSERT SQL statement
2. BCP API in C - it's ODBC-based IIRC
3. SqlBulkCopy class from ADO.NET 2.0

There's something for choices #1 and #3 in the new SQL Server 2005/ADO.NET 2.0 stuff. Of course SqlBulkCopy (misnamed, it's it actually more like IRowsetFastLoad) is new in ADO.NET 2.0. And BULK INSERT has been enhanaced/rewritten as a "rowset provider", an OLE DB-like provider whose name is "BULK". The enhancements are more functional than speed-based. I haven't done speed tests (it's too early) but someone posted on newsgroups in Beta 1 that "BULK" was actually slower than the old BULK INSERT. It's supposed to be as fast by RTM. The new BULK supported XML inserts also, include multiple XML documents separated by "row separator characters". Think CSV files of XML data type. Whew...

This caused me to wonder if the new SQL Native Client OLE DB provider (SQLNCLI) supports IRowsetFastLoad too. Hmmm....says in BOL that there is not only IRowsetFastLoad but an IBCPSession interface on the new SQLNCLI provider. Maybe something for choice #2 also.

Thursday, January 06, 2005 2:54:09 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Wednesday, January 05, 2005

I've been lying pretty low lately, doing research on some topics in a slightly different space (but at least obliquely SQL Server 2005 related). Came across a list of interesting student comments today, and I was back trying out (picking at?) one of my favorite spare-time topics to answer a question about. SQL Server 2005 metadata. The question was whether or not the INFORMATION_SCHEMA views were security-sensitive, and...of course they are. There are now (at least) 4 different ways to list the user tables in a database (all subject to security) and they are:

select * from sysobjects where type = 'U'
select * from sys.objects where type = 'U'
select * from sys.tables -- I always wanted a systables in earlier versions
select * from INFORMATION_SCHEMA.tables where TABLE_TYPE = 'BASE_TABLE'

The Dec CTP closed the last loophole; using system procedures like sp_help is now permission sensitive.

But what if you WANT a user to be able to list all of the tables in the database? The “smallest“ permission you can give a user is VIEW DEFINITION privilege on a specific object. As an example, if I had a user named 'fred' that I wanted to 'see' the authors table, I would give:

grant view definition on object::authors to 'fred'

A specific resource is called a securable. Securables exist in a hierarchy. So although there's no straightforward way to give VIEW DEFINITION on all of the tables (that I can see) in a specific schema or database in a single statement, I could give a user access to all of the objects (Table, View, Function, Procedure, Queue, Type, Rule, Default, Synonym, Aggregate, XML Schema Collection) in a particular schema with this one statement.

grant view definition on schema::dbo to 'fred' -- all in dbo schema
grant view definition on database::dbo to 'fred' -- all in the database

Note the VIEW DEFINITION is the “smallest“ permission; it gives a user permission to see that an object exists without being able to see the data. For fred's case, “select * from authors“ would still fail. Giving SELECT also gives view definition privilege, so if you can SELECT against an object, it also shows up in your metadata list. Makes perfect sense. You can tell I'm liking the new metadata views...and the new “all permissions grantable“ concept. Wait until the first user calls up and asks “where did all of the tables in my dropdown list go?“.

Wednesday, January 05, 2005 12:08:38 AM (Pacific Standard Time, UTC-08:00)  #    Comments [2]  | 

Theme design by Jelle Druyts

Pick a theme: