Wednesday, October 10, 2007

It's not like me to post on security - that's one of Kimberly's areas - but this is an interesting example of a real-life injection attack :-)

Wednesday, October 10, 2007 9:59:48 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Tuesday, October 09, 2007

This is another question that came up on the Disaster Recovery forum on MSDN. Paraphrasing - ' I have a backup file containing full backups for 45 databases. How can I restore them all using a script?'

The answer is pretty straightforward. Let's create the situation described, using 3 databases for clarity rather than 45:

BACKUP DATABASE tinylogtest TO DISK = 'c:\sqlskills\mixedbackups.bck' WITH NAME = 'tinylogtest 10/09/07', INIT;

BACKUP DATABASE pagesplittest TO DISK = 'c:\sqlskills\mixedbackups.bck' WITH NAME = 'pagesplittest 10/09/07';

BACKUP DATABASE ghostrecordtest TO DISK = 'c:\sqlskills\mixedbackups.bck' WITH NAME = 'ghostrecordtest 10/09/07';

GO

You now can't do a regular restore from that backup file for any database except the first one in the file:

RESTORE DATABASE pagesplittest FROM DISK = 'c:\sqlskills\mixedbackups.bck' WITH REPLACE;

GO

Msg 1834, Level 16, State 1, Line 1

The file 'c:\tinylogtest\tinylogtest.mdf' cannot be overwritten. It is being used by database 'tinylogtest'.

Msg 3156, Level 16, State 4, Line 1

File 'fgt_mdf' cannot be restored to 'c:\tinylogtest\tinylogtest.mdf'. Use WITH MOVE to identify a valid location for the file.

Msg 1834, Level 16, State 1, Line 1

The file 'c:\tinylogtest\tinylogtest.ldf' cannot be overwritten. It is being used by database 'tinylogtest'.

Msg 3156, Level 16, State 4, Line 1

File 'fgt_log' cannot be restored to 'c:\tinylogtest\tinylogtest.ldf'. Use WITH MOVE to identify a valid location for the file.

Msg 3119, Level 16, State 1, Line 1

Problems were identified while planning for the RESTORE statement. Previous messages provide details.

Msg 3013, Level 16, State 1, Line 1

RESTORE DATABASE is terminating abnormally.

So how to restore the other databases apart from the first one? The answer is that you need to specify which of the backups within the file you're after. You can find the positions using the RESTORE HEADERONLY command:

RESTORE HEADERONLY FROM DISK = 'c:\sqlskills\mixedbackups.bck';

GO

This returns lots of information:

and then a whole bunch more columns and then ending with:

For our purposes, the interesting things to note are the Position, DatabaseName, and BackupTypeDescription. In our simple example we have 3 full backups. To restore the pagesplittest database as we tried to do above, we have to specifiy the position within the file of the backup we want to restore - just the database name isn't enough.

RESTORE DATABASE pagesplittest FROM DISK = 'c:\sqlskills\mixedbackups.bck' WITH REPLACE, FILE = 2;

GO

And it works. So the question asked how to do this, plus how to do it in a script. Below is a script I've adapted from the example I wrote for the Books Online for DBCC SHOWCONTIG back in 1999 when I rewrote DBCC SHOWCONTIG for SQL Server 2000.

Enjoy!

-- Create a temporary table to hold the output from RESTORE HEADERONLY

CREATE TABLE master.dbo.restoreheaderonly (

BackupName NVARCHAR (128), BackupDescription NVARCHAR (255), BackupType SMALLINT, ExpirationDate DATETIME,

Compressed TINYINT, Position SMALLINT, DeviceType TINYINT, UserName NVARCHAR (128), ServerName NVARCHAR (128),

DatabaseName NVARCHAR (128), DatabaseVersion INT, DatabaseCreationDate DATETIME, BackupSize NUMERIC (20, 0),

FirstLSN NUMERIC (25, 0), LastLSN NUMERIC (25,0), CheckpointLSN NUMERIC (25,0), DatabaseBackupLSN NUMERIC (25, 0),

BackupStartDate DATETIME, BackupFinishDate DATETIME, SortOrder SMALLINT, CodePage SMALLINT, UnicodeLocaleId INT,

UnicodeComparisonStyle INT, CompatibilityLevel TINYINT, SoftwareVendorId INT, SoftwareVersionMajor INT,

SoftwareVersionMinor INT, SoftwareVersionBuild INT, MachineName NVARCHAR (128), Flags INT, BindingID UNIQUEIDENTIFIER,

RecoveryForkID UNIQUEIDENTIFIER, Collation NVARCHAR (128), FamilyGUID UNIQUEIDENTIFIER, HasBulkLoggedData BIT,

IsSnapshot BIT, IsReadOnly BIT, IsSingleUser BIT, HasBackupChecksums BIT, IsDamaged BIT, BeginsLogChain BIT,

HasIncompleteMetaData BIT, IsForceOffline BIT, IsCopyOnly BIT, FirstRecoveryForkID UNIQUEIDENTIFIER,

ForkPointLSN NUMERIC (25, 0) NULL, RecoveryModel NVARCHAR (60), DifferentialBaseLSN NUMERIC (25, 0) NULL,

DifferentialBaseGUID UNIQUEIDENTIFIER, BackupTypeDescription NVARCHAR (60), BackupSetGUID UNIQUEIDENTIFIER NULL);

GO

 

-- Populate the table

INSERT INTO master.dbo.restoreheaderonly EXEC ('RESTORE HEADERONLY FROM DISK = ''C:\sqlskills\mixedbackups.bck''') ;

GO

 

DECLARE @Position SMALLINT;

DECLARE @DatabaseName NVARCHAR (128);

DECLARE @ExecString NVARCHAR (255);

 

-- Declare a cursor to iterate over the results

DECLARE databases CURSOR FOR

SELECT Position, DatabaseName FROM master.dbo.restoreheaderonly WHERE BackupTypeDescription = 'Database';

 

-- Open the cursor.

OPEN databases;

 

-- Loop through the databases.

FETCH NEXT FROM databases INTO @Position, @DatabaseName;

WHILE @@FETCH_STATUS = 0

BEGIN

SELECT @ExecString = 'RESTORE DATABASE ' + RTRIM (@DatabaseName) +

' FROM DISK = ''C:\sqlskills\mixedbackups.bck''' +

' WITH REPLACE, FILE = ' + RTRIM (CONVERT (VARCHAR (10), @Position));

SELECT 'Restoring database ' + RTRIM (@DatabaseName);

EXEC (@ExecString);

FETCH NEXT FROM databases INTO @Position, @DatabaseName;

END;

 

-- Close and deallocate the cursor.

CLOSE databases;

DEALLOCATE databases;

 

-- Delete the temporary table.

DROP TABLE master.dbo.restoreheaderonly;

GO

Tuesday, October 09, 2007 5:35:32 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Monday, October 08, 2007

Lots of people have been asking for us to create some aggregate feeds of the various blogs on the SQLskills site - well, now I've done it.

The three new feeds are:

SQLskills BI Team Blog

SQLskills SQL Server 2008 Category Aggregate Feed

SQLskills All Blogs Aggregate Feed

http://pipes.yahoo.com/pipes/pipe.run?_id=vv9PBOp13BGAc67kLO2fWQ&_render=rss. There's no FeedBurner equivalent for this feed as its larger than the maximum feed size that FeedbBurner can handle (512k).

Enjoy!

Monday, October 08, 2007 5:35:39 PM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 

Over the years I was in the Storage Engine team I saw a lot of concern on the various forums about the ghost cleanup task. There have been a few bugs with it in previous versions  (see these KB articles - 932115 and 815594) and there's very little info available on it. For some reason I didn't get around to posting about it on my old blog but today I want to go into some depth on it.

So what is ghost cleanup? It's a background process that cleans up ghost records - usually referred to as the ghost cleanup task. What's a ghost record? As I described briefly in the Anatomy of a record post last week, a ghost record is one that's just been deleted in an index on a table (well, actually it gets more complicated if snapshot isolation of some form is enabled but for now, a record in an index is a good start). Such a delete operation never physically removes records from pages - it only marks them as having been deleted, or ghosted. This is a performance optimization that allows delete operations to complete more quickly. It also allows delete operations to rollback more quickly because all that needs to happen is to unmark the records as being deleted/ghosted, instead of having to reinsert the deleted records. The deleted record will be physically removed (well, its slot will be removed - the record data isn't actually overwritten) later by the background ghost cleanup task. The ghost cleanup task will leave a single record on the page to avoid having to deallocate empty data or index pages.

The ghost cleanup task can't physically delete the ghost records until after the delete transaction commits because the deleted records are locked and the locks aren't released until the transaction commits. As an aside, when ghost records exist on a page, even a NOLOCK or READ UNCOMMITTED scan won't return them because they are marked as ghost records.

When a record is deleted, apart from it being marked as a ghost record, the page that the record is on is also marked as having ghost records in one of the allocation maps - the PFS page (post coming soon!) - and in its page header. Marking a page as having ghost records in a PFS page also changes the database state to indicate that there are some ghost records to cleanup - somewhere. Nothing tells the ghost cleanup task to clean the specific page that the delete happened on - yet. That only happens when the next scan operation reads the page and notices that the page has ghost records.

The ghost cleanup task doesn't just start up when it's told to - it starts up in the background every 5 seconds and looks for ghost records to cleanup. Remember that it won't be told to go cleanup a specific page by a delete operation - it's a subsequent scan that does it, if a scan happens. When the ghost cleanup task starts up it checks to see if its been told to cleanup a page - if so it goes and does it. If not, it picks the next database that is marked as having some ghost records and looks through the PFS allocation map pages to see if there are any ghost records to cleanup. It will check through or cleanup a limited number of pages each time it wakes up - I think I remember the limit is 10 pages - to ensure it doesn't swamp the system. So - the ghost records will eventually be removed - either by the ghost cleanup task processing a database for ghost records or by it specifically being told to remove them from a page. If it processes a database and doesn't find any ghost records, it marks the database as not having any ghost records so it will be skipped next time.

How can you tell its running? On SQL Server 2005, you can use the following code to see the ghost cleanup task in sys.dm_exec_requests:

SELECT * INTO myexecrequests FROM sys.dm_exec_requests WHERE 1 = 0;

GO

SET NOCOUNT ON;

GO

DECLARE @a INT

SELECT @a = 0;

WHILE (@a < 1)

BEGIN

INSERT INTO myexecrequests SELECT * FROM sys.dm_exec_requests WHERE command LIKE '%ghost%'

SELECT @a = COUNT (*) FROM myexecrequests

END;

GO

SELECT * FROM myexecrequests;

GO

And on SQL Server 2000 you need to use sysprocesses (well, on SQL Server 2005 this works as well but its fake view derived from the DMVs):

SELECT * INTO mysysprocesses FROM master.dbo.sysprocesses WHERE 1 = 0;

GO

SET NOCOUNT ON;

GO

DECLARE @a INT

SELECT @a = 0;

WHILE (@a < 1)

BEGIN

INSERT INTO mysysprocesses SELECT * FROM master.dbo.sysprocesses WHERE cmd LIKE '%ghost%'

SELECT @a = COUNT (*) FROM mysysprocesses

END;

GO

SELECT * FROM mysysprocesses;

GO

The output from sys.dm_exec_requests is (with most unused and uninteresting columns stripped off):

session_id request_id  start_time              status       command
---------- ----------- ----------------------- ------------ ----------------
15         0           2007-10-05 16:34:49.653 background   GHOST CLEANUP

So how can you tell if a record is ghosted? Let's engineer some and look at it with DBCC PAGE - I've stripped out the uninteresting bits of the output and highlighted the interesting ghost parts:

CREATE TABLE t1 (c1 CHAR(10))

CREATE CLUSTERED INDEX t1c1 on t1 (c1)

GO

BEGIN TRAN

INSERT INTO t1 VALUES ('PAUL')

INSERT INTO t1 VALUES ('KIMBERLY')

DELETE FROM t1 WHERE c1='KIMBERLY';

GO

DBCC IND ('ghostrecordtest', 't1', 1);

GO

DBCC TRACEON (3604);

GO

DBCC PAGE ('ghostrecordtest', 1, 143, 3);

GO

<snip>

m_freeData = 130    m_reservedCnt = 0   m_lsn = (20:88:20)
m_xactReserved = 0  m_xdesId = (0:518)  m_ghostRecCnt = 1
m_tornBits = 0

<snip>

Slot 0 Offset 0x71 Length 17

Record Type = GHOST_DATA_RECORD      Record Attributes =  NULL_BITMAP
Memory Dump @0x6256C071

00000000:   1c000e00 4b494d42 45524c59 20200200 †....KIMBERLY  ..
00000010:   fc†††††††††††††††††††††††††††††††††††.
UNIQUIFIER = [NULL]

Slot 0 Column 1 Offset 0x4 Length 10

c1 = KIMBERLY

Slot 1 Offset 0x60 Length 17

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP
Memory Dump @0x6256C060

00000000:   10000e00 5041554c 20202020 20200200 †....PAUL      ..
00000010:   fc†††††††††††††††††††††††††††††††††††.
UNIQUIFIER = [NULL]

Slot 1 Column 1 Offset 0x4 Length 10

c1 = PAUL

Let's see what goes on the transaction log during this process (remember this is undocumented and unsupported - do it on a test database) - I've stripped off a bunch of the columns in the output:

DECLARE @a CHAR (20)

SELECT @a = [Transaction ID] FROM fn_dblog (null, null) WHERE [Transaction Name]='PaulsTran'

SELECT * FROM fn_dblog (null, null) WHERE [Transaction ID] = @a;

GO

Current LSN              Operation         Context             Transaction ID
------------------------ ----------------- ------------------- --------------
00000014:00000054:0011   LOP_BEGIN_XACT    LCX_NULL            0000:00000206
00000014:0000005a:0012   LOP_INSERT_ROWS   LCX_CLUSTERED       0000:00000206
00000014:0000005a:0013   LOP_INSERT_ROWS   LCX_CLUSTERED       0000:00000206
00000014:0000005a:0014   LOP_DELETE_ROWS   LCX_MARK_AS_GHOST   0000:00000206
00000014:0000005a:0016   LOP_DELETE_ROWS   LCX_MARK_AS_GHOST   0000:00000206

So there are the two inserts followed by the two deletes - with the rows being marked as ghost records. But where's the update to the PFS page? Well, changing the ghost bit in a PFS page is not done as part of a transaction. We'll need to look for it another way (apart from just dumping everything in the transaction log and searching manually):

SELECT Description, * FROM fn_dblog (null, null) WHERE Context like '%PFS%' AND AllocUnitName like '%t1%';

GO

Description               Current LSN              Operation        Context   Transaction ID
------------------------- ------------------------ ---------------- --------- ----------------
Allocated 0001:0000008f   00000014:00000054:0014   LOP_MODIFY_ROW   LCX_PFS   0000:00000208
                          00000014:0000005a:0015   LOP_SET_BITS     LCX_PFS   0000:00000000

The first one is just allocating a page but the second one is the one we're looking for - it's changed the bit for the page to say it has ghost records on. Let's commit the transaction and see what happens, filtering out all the previous transaction log:

SELECT MAX ([Current LSN]) FROM fn_dblog (null, null);

GO

-- 00000014:0000005e:0001

COMMIT TRAN

GO

SELECT [Page ID], * FROM fn_dblog (null, null) WHERE [Current LSN] > '00000014:0000005e:0001';

GO

Page ID         Current LSN              Operation          Context         Transaction ID
--------------- ------------------------ ------------------ --------------- --------------
NULL            00000014:0000005f:0001   LOP_COMMIT_XACT    LCX_NULL        0000:00000206
0001:0000008f   00000014:00000060:0001   LOP_EXPUNGE_ROWS   LCX_CLUSTERED   0000:00000000

We see that almost as soon as the transaction has commited, the ghost cleanup task goes in and process the page. Let's check a page dump to make sure the record is gone, and show that the contents of the record are still on the page (again, with non-relevant bits snipped out):

DBCC PAGE ('ghostrecordtest', 1, 143, 3);

GO

<snip>

m_freeData = 130         m_reservedCnt = 0        m_lsn = (20:94:1)
m_xactReserved = 0       m_xdesId = (0:518)       m_ghostRecCnt = 0
m_tornBits = 0

<snip>

Record Type = PRIMARY_RECORD         Record Attributes =  NULL_BITMAP
Memory Dump @0x6212C060

00000000:   10000e00 5041554c 20202020 20200200 †....PAUL      ..
00000010:   fc†††††††††††††††††††††††††††††††††††.
UNIQUIFIER = [NULL]

Slot 0 Column 1 Offset 0x4 Length 10

c1 = PAUL

DBCC PAGE ('ghostrecordtest', 1, 143, 2);

GO

<snip>

6212C040:   01000000 00000000 00000000 00000000 †................
6212C050:   00000000 00000000 00000000 00000000 †................
6212C060:   10000e00 5041554c 20202020 20200200 †....PAUL      ..
6212C070:   fc1c000e 004b494d 4245524c 59202002 †.....KIMBERLY  .
6212C080:   00fc0000 00000000 00000000 01000000 †................
6212C090:   00000000 13000000 01000000 00000000 †................

<snip>

So even though the record no longer exists, all that happened was that the slot was removed from the slot array at the end of the page - the record contents will remain on the page until the space is reused.

In the next post I'll go into details of the PFS and other allocation maps. Btw - please let me know if this stuff is interesting - I'd like to know where to spend blogging time. Thanks!

Monday, October 08, 2007 8:24:09 AM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 
Saturday, October 06, 2007

It's common knowledge that SQL Server copes with daylight savings time (DST) correctly so why should you care?

Well, it's not so common knowledge that at the end of DST when the clocks go back an hour (always at 02:00 in the U.S.), SQL Agent essentially pauses for an hour (in at least SS2000 onwards). This means that if you have a job that's doing something every 15 minutes, there will be a gap of 75 minutes between the job execution at 01:45 and the job execution at 02:00. This happens because at 02:00, the time is set back to 01:00 but the next run time of all the jobs remains the same - so your job cannot execute until it's next scheduled time of 02:00. So, in the northern hemisphere every Fall, and in the southern hemisphere every Spring, you lose an hour's worth of SQL Agent jobs. Still, why should you care?

Well, it depends what the jobs are that get delayed by an hour. If you have a job that takes a log backup every 15 mins then on the day DST ends, there's actually a gap of 75 minutes between log backups. If you have a Service Level Agreement (SLA) that limits the maximum amount of lost work to 15 minutes in the event of a disaster, then for those 75 minutes you're exposed to potentially not being able to meet that SLA!

That could be a pretty big deal, especially if something goes wrong during that hour (no more or less likely than something going wrong at any other time, but still possible). In that case, you need to come up with an alternative solution. A couple of ways to get around the problem I can think of:

  • Have someone stay up late during that hour and take manual log backups.
  • Switch over to database mirroring, which continually streams the log to the redundant server and so isn't affected the DST issue.

Both of these are viable solutions but I think the best one is to create a SQL Agent job that runs at 01:59 and creates extra backup jobs to run at 01:00, 01:15, 01:30, and 01:45. I don't see why this shouldn't be possible. At 10:36 this morning I created a simple agent job to print the date to a file and set it to execute at 09:40 - in the past. I then set my system time back one hour and the job executed perfectly. The only downside of this solution is that you need to create and schedule the extra jobs using the T-SQL Agent SPs embedded in job steps for your 01:59 job - tedious but not hard. Maybe someone could send me a script and I'll blog it as a follow-on?

So with DST coming to an end on November 4th this is definitely something for you to be aware of even if you don't want to go to the trouble of coping with the extra hour's exposure. As an aside - the dates when DST starts and ends changed this year. KB article 931975 discusses which parts of SQL Server aren't aware of the changed dates and what you can do about it.

Saturday, October 06, 2007 9:49:34 AM (Pacific Standard Time, UTC-08:00)  #    Comments [4]  | 
Friday, October 05, 2007

Here's a really interesting question that was in my search engine logs yesterday - if I have a transaction that runs and completes while a backup is running, will the complete transaction be in the backup? The answer is.... it depends!

In terms of what gets backed up, the way a full backup works is:

  1. Note the transaction log's LSN (Log Sequence Number)
  2. Read all allocated extents in the various data files
  3. Note the LSN again
  4. Read all the transaction log between the starting LSN and the ending LSN

Any transaction that commits before or on the LSN read in step 3 will be fully reflected when the database is restored. If not, the transaction will be undone. So you can't just go by the completion time of the backup and the completion time of the transaction. The transaction may well commit before the backup operation completes, but it may complete during step 4, and so it will get rolled back during a restore. In this case, it's necessary to take a log backup as well and restore that too to make the transaction be fully reflected after a restore.

As you know I always like to prove things  - so here's my proof of what I just said. I'm going to use the AdventureWorks database to do this. First thing is to set it to full recovery mode (and take the first full backup to start full recovery mode logging):

ALTER DATABASE AdventureWorks SET RECOVERY FULL;

BACKUP DATABASE AdventureWorks TO DISK='C:\SQLskills\AdventureWorks.bck' WITH INIT;

GO

Now I'm going to flush out the backup history tables in MSDB:

USE msdb;

GO

EXEC sp_delete_backuphistory '10/6/07';

GO

I've got a really contrived example that I'm going to use to show whether my transaction is wholely contained in the full backup. Using the HumanResources.Employee table, there's a column VacationHours which I'm going to set to 0 and then force all the changes pages to disk. This is my base state:

UPDATE AdventureWorks.HumanResources.Employee SET VacationHours = 0;

GO

CHECKPOINT;

GO

My contrived transaction is going to be in a tight loop updating all the rows in the table, which will generate lots of transaction log, and timed to complete just before the backup completes (i.e. in stage 4). In one connection I start the backup:

BACKUP DATABASE AdventureWorks TO DISK='C:\SQLskills\AdventureWorks.bck' WITH INIT;

SELECT GETDATE ();

GO

and in another I start my contrived transaction, after starting the backup:

BEGIN TRAN

DECLARE @a INT

DECLARE @b INT

SELECT @a = 1

WHILE (@a < 6)

BEGIN

SELECT @b = 1

WHILE (@b < 201)

BEGIN

UPDATE AdventureWorks.HumanResources.Employee SET VacationHours = @b

SELECT @b=@b+1

END

SELECT @a=@a+1

END

COMMIT TRAN;

SELECT GETDATE ();

GO

It's not pretty but it does the job. The backup finishes at 2007-10-05 17:42:38.983 and the transaction finishes at 2007-10-05 17:42:38.107 - before the backup finishes. Remember I set the VacationHours all to zero before running my transaction - let's check the transaction did actually change them:

SELECT MAX (VacationHours) FROM AdventureWorks.HumanResources.Employee;

GO

This returns 200 - which is what I'd expect. But is the transaction wholely contained in the backup? Let's look at the backup history to find out the last LSN that was captured in the full backup:

SELECT Backup_Start_Date, Backup_Finish_Date, First_LSN, Last_LSN

FROM msdb.dbo.backupset WHERE database_name = 'AdventureWorks';

GO

And the output we get is:

Backup_Start_Date       Backup_Finish_Date      First_LSN           Last_LSN
----------------------- ----------------------- ------------------- -------------------
2007-10-05 17:42:22.000 2007-10-05 17:42:38.000 86000000001600029   91000000625600001

The LSNs are in decimal, so we need to convert the three numbers to hex so we can compare them to what's in the log - giving: 5B:1870:1. This is the LSN of the last log record that was backed up in the full backup. Now let's take a look at the transaction log for AdventureWorks using the undocumented fn_dblog function. This is undocumented but very well known. It's a fully composable alternative to using the old DBCC LOG command.

USE AdventureWorks;

GO

SELECT [Current LSN], Operation, [Transaction ID], AllocUnitName FROM fn_dblog (NULL, NULL);

GO

The log record at that LSN is:

00000058:00001870:0001  LOP_MODIFY_ROW     0000:00001338  HumanResources.Employee.PK_Employee_EmployeeID

This is obviously in the middle of my contrived transaction - showing that it isn't all in the full backup. The end of the transaction isn't until way later in the log:

0000005e:00000628:01b1  LOP_MODIFY_ROW     0000:00001338  HumanResources.Employee.PK_Employee_EmployeeID
0000005e:00000628:01b2  LOP_COMMIT_XACT    0000:00001338  NULL

Before I do anything else, I want to take a log backup to preserve my transaction:

BACKUP LOG AdventureWorks TO DISK='C:\SQLskills\AdventureWorks_Log.bck' WITH INIT;

GO

Now I want to restore the full backup and really show that my transaction isn't all in there:

USE master;

GO

RESTORE DATABASE AdventureWorks FROM DISK='C:\SQLskills\AdventureWorks.bck' WITH REPLACE, RECOVERY;

GO

SELECT MAX (VacationHours) FROM AdventureWorks.HumanResources.Employee;

GO

This returns 0. Clearly my transaction isn't all in there, and the parts that are were rolled back during the restore. Now let's do the same thing but using NORECOVERY for the restore of the full backup and also applying the log backup I took:

RESTORE DATABASE AdventureWorks FROM DISK='C:\SQLskills\AdventureWorks.bck' WITH REPLACE, NORECOVERY;

GO

RESTORE LOG AdventureWorks FROM DISK='C:\SQLskills\AdventureWorks_Log.bck' WITH RECOVERY;

GO

SELECT MAX (VacationHours) FROM AdventureWorks.HumanResources.Employee;

GO

This time the SELECT returns 200. And now you can start playng around with fn_dblog if you didn't know about it before. I'll be posting more about Storage Engine internals that you can figure out from the transaction log in future.

Friday, October 05, 2007 5:00:37 PM (Pacific Standard Time, UTC-08:00)  #    Comments [1]  | 

Here's an interesting question I was sent by my friend Steve Jones over at SQL Server Central - will a single CPU with dual-cores perform better than two single-core CPUs? Both have two processing cores but the hardware architecture is different - which one will make SQL Server perform better? Well, there's no hard and fast answer - it depends! I had a discussion on this topic this morning with Jerome Halmans, part of my old team in the SQL Server Storage Engine and I'm basing this post on our discussion with his permission.

My hypothesis (which Jerome confirmed) was that the performance of the two architectures depends on the amount of cache line invalidations and how that is managed (see here for a description of CPU caches and cache lines).

  • On the single-core machine, cache line invalidations needs to go across the main bus between the two CPUs, involving bus arbitration delays.
  • On the dual-core machine, cache line invalidations don't go across the bus because the two cores are contained within the same CPU package. In fact, if the architecture is smart enough, it may be able to just remap the cache line from one processing core to the other, thus avoiding any data copying. I'm not sure if such an architecture exists though.

Here is a very interesting and accessible Intel article that discusses cache-sharing in multi-core Intel systems. In this paper at least, the L2 cache is shared but modifications made by different cores in their private L1 caches still need to bounce through the shared L2 cache before being loaded by the other core. This will still be WAY faster than having to go through main bus between single-core CPUs.

And here is a similar paper from AMD on their Barcelona multi-core architecture that describes each core having separate L1 and L2 caches, with an additional shared L3 cache. The seperate L2 caches are kind-of linked though, in that modifications to a cache line in one L2 cache are immediately mirrored in the other L2 caches (if needed).

But the amount of cache invalidations (of whatever kind) depends on the workload. The two types of workload to consider are:

  1. Where the workload has very independent characteristics, so the data being processed by a thread on one processing core is unrelated to that being processed by a thread on the other core. There should be very few cache line invalidations. In this case, the single-core CPUs will have all their local caches full of data relevant to just the thread running. The two cores on the dual-core CPU will need to share some level of on-chip cache and so their may well be more churn in the cache. In this case I'd expect the single-core CPUs to perform better.
  2. Where the workload is such that data is shared, and threads touch data being processed by others threads on other cores. In this case, the single-core CPUs will fall victim to massive amounts of cache line invalidations, whereas the dual-core CPUs will do on-chip cache line invalidation (of whatever type is supported by the architecture). In this case I'd expect the multi-core CPU to outperform the two single-core CPUs.

Saying that, the majority of workloads on SQL Server are of the second type above. Jerome mentioned that even synthetic workloads (such as the TPCC benchmark) are still going to result in multiple-threads accessing and changing the same data/index pages.

So - what's the conclusion? I expect that a multi-core CPU will outperform an equivalent number of single-core CPUs in most workloads. And as Jerome pointed out, even if that's not the case for your workload, you'll find it pretty hard to find a system that ships with single-core CPUs these days.

I'd love to hear any comments on this, especially any measurements you've done on workloads as I don't have any single-core machines available to run tests on - even the laptop I'm typing this on is a dual-core Centrino.

Friday, October 05, 2007 11:43:24 AM (Pacific Standard Time, UTC-08:00)  #    Comments [0]  | 
Thursday, October 04, 2007

(This is an updated repost from earlier this year on my old blog)

Whenever I’m discussing index maintenance, and specifically fragmentation, I always make a point of saying ‘Make sure the index is being used before doing anything about fragmentation’. If an index isn’t being used very much, but has very low page density (lots of free space in the index pages), then it will be occupying a lot more disk space than it could do and it may be worth compacting (with a rebuild or a defrag) to get that disk space back. However, usually there’s not much point spending resources to remove any kind of fragmentation when an index isn’t being used. This is especially true of those people who rebuild all indexes every night or every week.

 

You could even go so far as to say if a non-clustered index isn’t being used, why is it there at all? Extra non-clustered indexes drag down performance in a number of ways. Consider a non-clustered index called IX_MyNCIndex on the table MyTable:

  • Any time a record is inserted into MyTable, a matching record is inserted into IX_MyNCIndex. This is a bunch of extra IOs, extra log records, plus maybe even a page-split.
  • Any time a record is deleted from MyTable, the matching record in IX_MyNCIndex must be deleted. Extra IOs again and log records again.
  • Any time a record in MyTable is updated:
    • If MyTable has a clustered index, and the clustered index key value changes, then the matching record in IX_MyNCIndex must be updated. Extra IOs and log records again.
    • If any of the non-clustered index key values changes, or any of the INCLUDEd column values changes, then the matching record in IX_MyNCIndex must be updated. Extra IOs and log records again.
  • If a clustered index is created on MyTable, then IX_MyNCIndex has to be rebuilt to include the logical RIDs rather than the physical heap RIDs (see this post for an explanation). Lot of extra IOs and log records again.

That’s a significant amount of extra IOs and log records to maintain each extraneous non-clustered index.

 

So, how can you tell if an index is being used?

  • In SQL Server 2000 there is no way to do it
  • In SQL Server 2005 there are a few different ways in SQL Server 2005 – the one I want to discuss in this post is the sys.dm_db_index_usage_stats DMV.

This DMV exposes the information that is tracked about index usage (as the name suggests). It does not generate any information itself; it just returns info from a cache inside SQL Server. This cache is empty when the server instance starts, and is not persisted across instance restarts. All cache entries for indexes in a database are removed when that database is closed. So, the cache tracks usage information about which indexes have been used since the database they are part of was last opened (either manually or as part of instance start-up).

 

This continues to confuse people so I'll call it out: if the output from the DMV does not have an entry for the index you're interested in, it has not been used since the last database startup.

 

The cache tracks the following info for each index (for user queries and system queries):

  • The number of times it was used in a seek operation (either looking up a single row, or doing a range scan) along with the time of the last seek.
  • The number of times it was used in a scan operation (e.g. a select * operation) along with the time of the last scan
  • The number of times it was used in a lookup operation (this means a bookmark lookup – where a non-clustered index does not fully cover a query and additional columns must be retrieved from the base table row) along with the time of the last lookup.
  • The number of times it was used in an update operation (this counts inserts, updates, and deletes) along with the time of the last update.

Let’s have a look at its use.

SELECT * FROM sys.dm_db_index_usage_stats;

GO

The output is too wide for a single image so I've split it in two (I won't post any more output from the DMV - I'll just talk about it):

indexusage11.jpg

indexusage21.jpg

Unless you've just re-started your instance, you'll see a bunch of output from this, representing all index activity since the instance/databases started. If you're interested in whether an index is being used, you can filter the output. Let's focus in on a particular table  - AdventureWorks.Person.Address.

SELECT * FROM sys.dm_db_index_usage_stats

WHERE database_id = DB_ID('AdventureWorks')

AND object_id = OBJECT_ID('AdventureWorks.Person.Address');

GO

You'll probably see nothing in the output, unless you've been playing around with that table. Let's force the clustered index on that table to be used, and look at the DMV output again.

SELECT * FROM AdventureWorks.Person.Address;

GO

SELECT * FROM sys.dm_db_index_usage_stats

WHERE database_id = DB_ID('AdventureWorks')

AND object_id = OBJECT_ID('AdventureWorks.Person.Address');

GO

Now there's a single row, showing a scan on the clustered index. Let's do something else.

SELECT StateProvinceID FROM AdventureWorks.Person