MHO: LINQ to SQL and Entity Framework: Panacea or evil incarnate? Part 5


This post is part of a series about worries when implementing LINQ to SQL or ADO.NET Entity Framework from a SQL database-centric programmer's perspective. The last two worries are related. It's mostly about either level of abstraction being one level away from the "real SQL code" that's being executed.

First off…
LINQ to SQL and EF will discourage the use of stored procedures to return rowsets; returning rowsets in stored procedures is *thought to be* superior to returning them in dynamic SQL. In addition, the code generators will use a subset of T-SQL query constructs, that is, only the constructs that the LINQ or ESQL language supports, rather than the full power of the T-SQL query language.

A stored procedure is always thought of by stored procedure afficianados as representing a "contract" between consumer and provider. That is, the database metadata tells me exactly what I'm going to get. Although the database metadata does indicate number, type, etc of parameters, this is absolutely not true for rowsets returned by stored procedures. There is NO database metadata that records anything about the returned rowsets, or even how many rowsets a stored procedure will return. Actually the number of rowsets returned is part of the ANSI standard but SQL Server implement it. In addition, errors that might happen in the middle of a stored procedure might result of rowsets being missing. And than there's always the possibility of returning multiple and/or different rowsets by using a stored procedure with conditional code… Not much of a rowset contract at all.

One way to ameliorate this problem (in SQL Server) is to use multistatement table-valued functions to return one rowset with known metadata. The main hassle with this is performance; a multistatement table-valued function is the equivalent of filling a table variable in code and then returning it. There are I/O considerations (the I/O of reading the base tables + I/O of reading the table variable at the consumer) The are also performance considertations as SQL Server table variables have no statistics, if the table-valued function is used as a row source in a larger query (composable queries), there is no way to estimate the number of rows returned by the TVF.

SQL Server's strongly typed table-valued parameters in SQL Server 2008 would be an analogous concept, but currently these are limited in being "input only" in procedures. No strong typed results yet. Oracle is an exception to this "no contract for rowsets" concept. Because Oracle doesn't return rowsets from stored procedures, they introduced a special parameter type called refcursor. Refcursors can appear in database APIs as a parameter (of type Refcursor or more generically "table"). And you can have strongly typed Refcursors, providing the needed contract. We'll have to wait for Oracle's (or DataDirect Technologies') EF provider or LINQ abstraction product to see how they use this.

So now that we've determined that there is no more of a rowset contract for stored procedures than ad-hoc SQL (the difference really is in SQL encapsulation and support of ownership chains, but that's another story), what about extentions that ESQL doesn't support? There are database-specifc extensions like SQL Server's PIVOT operator, or ANSI SQL standards, like ranking and windowing.

LINQ folks are quick to talk about implementation thorugh "extension methods" but the long and short of this is that these are a LINQ-ism, unrelated to LINQ to SQL. That is, the LINQ construct to SQL dialect statement mapping is fixed and embedded in the LINQ product. Using extensions to the *SQL statement mapping* (ie changing what SQL statement is produced) require either going deep into the framework (if this can be done at all) or implementing equivalent concepts on the client side, leaving the generated database code alone.

EF may have a little better story with this because each provider-writer implements the ESQL to query mapping, conceivably you could write a custom provider to encapsulate the supplied provider with extensions. However, the ESQL language itself does not have to capability of ODBC-like "escape clauses", so there'd be no way to express this extended SQL-based functionality in ESQL.

So I'd classify the "subset of SQL" and "stored procedure rowset is an anonymous type" problem as something that might be worked out in future releases of databases and frameworks. Until LINQ to SQL or EF provides escape clauses in the framework, the easiest way out is the ultimate escape clause, using the stored procedure that returns (anonymous) rowsets. And the more stored procedures are used (not CUD procedures, which enhance the model, but rowset-returning procedures), the farther away from the model you get. Which interferes with the usefulness of the model in general.

MHO.

Other articles

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.