How valuable are certifications?

(This essay appeared in our latest SQLskills Insider newsletter over the weekend and generated such a lot of discussion that I thought I’d share it with all of you who just read my blog. I’ve added a few points at the end too.)

There’s constant debate in the SQL Server community about whether the current certifications are worthwhile. Until recently, if you’d asked me, my opinion was that apart from the MCM certification, most SQL Server certifications were a waste of time and money.

This has understandably annoyed some people as many employers require certain certifications, and people like to get certifications to prove to themselves that they’re capable. I’m not saying your certification is worthless – but that its worth to anyone but you has been diluted through the actions of others. You know whether you have the real-deal skills or not, but how do others?

Why have I held this somewhat controversial view? It’s because over the last few years there’s been a sad proliferation of ways to allow someone to pass these certifications without the required knowledge and experience.

And don’t get me wrong – this isn’t a rant against Microsoft. It’s a rant against those whose actions have led to the detraction of the integrity of the existing certifications – those who want a short-cut to a certification and those who provide them.

For example, for most of the Microsoft SQL Server certifications, you’ve been able to:

  1. Attend a 3-5 day ‘boot camp’ that crams enough knowledge down your throat so you pass the exam at the end of the week
  2. Buy a book that gives you all the knowledge you need to *pass the exam*
  3. Buy a brain-dump from someone who’s memorized the exam
  4. (Allegedly, in some countries) Pay someone else to take the exam on your behalf

Note: I’m not implying that someone who does #1 or #2 is also going to do #3 or #4, just that this is the list of what someone can do to help pass the exam.

I cannot fathom how someone can live with themselves after doing #3 or #4 above – blatantly cheating to achieve a certification is despicable.

#1 and #2 aren’t that much better in my opinion. The point of a certification is that it certifies that you have a certain level of knowledge and experience in a subject. Reading a book or having knowledge drummed into you isn’t experience – it’s just book learning. Many prep guide books have given you the knowledge needed to pass the exam, not the complete set of knowledge implied by the certification – same for the boot camps. However, picking up a book to fill a few gaps in knowledge is acceptable, just not getting the book as the only source of knowledge for passing the exam.

This state of affairs has not been limited to the SQL Server certifications, and has not been limited to Microsoft at all – it’s been all pervasive through the certification industry, from what I can tell.

There’s also been the perception that in many of the non-MCM SQL Server certifications, some of the questions have incorrect answers and that some of the questions try to steer you towards picking the new features as the answer. I’ve heard this from many people who’ve taken certification exams.

This means there’s been a big problem: lack of confidence that someone with certification X actually has the commensurate knowledge and experience that certification X implies. I’ve even heard of companies that discard resumes when the candidate lists a bunch of certifications, on the premise that they must be falsified – which I think is taking the not-trusting-the-certifications to a misguided extreme.

The people who have been suffering are the vast majority of people who take the certifications the ‘proper’ way – building up experience over time and then taking the exams. For these people, hearing that anyone can take a short-cut and get the certification without having the real knowledge and experience really reduces the value of their having the certification. And, because it’s been common knowledge that people can cheat or short-cut the exams, it must be incredibly frustrating for those who do it the right way and spend money on the certifications in good faith.

The sad reality has been that one cannot assume that a certification equals the holder having the certified qualities.

This is one of the (many) reasons why the SQL MCM certification was created – a certification that cannot be passed without taking a practical exam and that you cannot pass with just book learning. (Ironically though, you need to have some of the lesser certifications before you can be awarded the MCM.)

There are various layers of security that were used in the re-launch of the SQL MCM program.  Right now, having the MCM really means a lot.  I won’t get in to the specifics of the security precautions here (for obvious reasons) but I will be incredibly upset on the day I hear of the first brain-dump for the MCM because then the MCM certification won’t be trustworthy.

I wish this overall situation were otherwise. I wish people didn’t cheat or look for short cuts. There’s even been talk amongst the SQL Server MVP community of creating a set of community certifications to try to work around these problems. Why hasn’t it happened? Because it’s very hard, time consuming, and expensive to do – and, ultimately, it would be vulnerable to some of these issues as well.

My opinion has been changed recently by steps Microsoft has taken to re-instill confidence in the integrity of their certifications. Last November Microsoft released an interview with Don Field, their Senior Director of Certification and Training (see Microsoft Ensures Integrity of Its Certification Program) which lists a bunch of things they’ve been doing, including suing a notorious brain-dumping site, banning exam candidates that can be proven to have used unfair means to pass, and aggressively shutting down web sites with leaked exam content.

The Microsoft Press exam and certification guides have also started emphasizing that reading through the book is not enough to pass the exam and the exams themselves are becoming a lot more in-depth and higher quality. Furthermore, testing centers are coming under high scrutiny to ensure they are not allowing cheating to take place.

I applaud these actions. It’s high time that the millions of people with Microsoft certifications can start to feel that their certification means something to other people. However, I think that Microsoft learning partners and testing centers have to play along here and get with the security program – for the benefit of everyone. And I think that Microsoft has to continue to be aggressive in its anti-cheating crusade – as the cheaters will never stop trying to make a buck from people looking for a short cut.

If you ask me my opinion today, it’s that things are slowly moving in the right direction and I hope the overall value is going to re-emerge for the non-MCM certifications. I really, really hope this happens – as otherwise, how can we gauge the knowledge and experience of the vast majority of SQL Server professionals out there? And then what value will the certifications be?

Call to action: Simple: if you’re going for a certification, go about it the right way. Don’t cheat or take short cuts – you’re only cheating yourself, and making a mockery of the certification you just ‘attained’.

[Additional that wasn’t in the newsletter: 

I’ve had feedback from two people on the mailing list that they’ve heard of folks trying to find a brain-dump of the MCM knowledge exam. That makes me really sad, and angry!

Call to action for Microsoft: I think each certification exam should have a practical (lab) component so that it validates people really *know* the stuff rather than having just read a book, for instance. Yes, this will cost more to implement, but that’s an investment in getting more people onto the Microsoft platform as the certifications will be really meaningful. Provision of certifications should not be under a revenue-generating model anyway – the cost should be non-negotiably borne by the various product groups IMHO – as a way of fostering more adoption of their products.

End additional.]

I’m really interested to know your thoughts on certifications – please comment!

Internals, Design, and Performance class in Chicago in May

A few weeks ago Kimberly did a blog post survey about which city in the US you'd like to see us come to with our next Immersion Event and there were two that stood out: Chicago, IL and Tampa, FL. We looked into the logistics of putting on an event at each location and we ended up selecting Chicago for a variety of reasons. One is certainly that it is centrally located but the real clincher is that we got a phenomenal nightly hotel rate for those of you that do have to travel. Our nightly rate is only $99 per night and it includes internet access. The hotel has free parking and it’s relatively easy to get to for all of the local folks.

Dates: Monday, 2nd May through Friday, 6th May 2011

Course: SQLskills Immersion Event on Internals and Performance (full class abstract)

Venue: Hilton Suites Chicago/Oak Brook

For full details and registration, click HERE.

Don't dip your toe – immerse yourself!

(Want some background info?

And sign-up to join our community and become a SQLskills Insider for our monthly newsletter with exclusive content, discounts, and more!)

Do changes to index keys really do in-place updates?

There was an interesting discussion on Twitter this morning (started by my good friend Erin Stellato (blog|twitter)) about in-place updates for index records where the key value changes and the record remains on the same page. Various sources including the SQL Server 2008 Internals book (pg 361 – I didn't write or review that bit :-) describe the process, but they describe it incorrectly unfortunately. They say that the record will remain in the exact same location on the same page and only the key storage bytes need to change.

This is not true. True in-place updates of index key values do not happen in 2005 onwards. Let's go down the rabbit hole…

Here's a simple example:

USE KeyUpdateTest;
CREATE TABLE test (c1 INT, c2 VARCHAR (2000));
CREATE CLUSTERED INDEX test_cl ON test (c1);
INSERT INTO test VALUES (1, REPLICATE ('Paul', 500));

Now we look at the page itself (and you can get the stored-proc I use below from my old blog post Inside The Storage Engine: sp_AllocationMetadata – putting undocumented system catalog views to work):

EXEC sp_allocationMetadata 'test';

Object Name Index ID Alloc Unit ID     Alloc Unit Type First Page Root Page First IAM Page
———– ——– —————– ————— ———- ——— ————–
test        1        72057594039762944 IN_ROW_DATA     (1:144)    (1:144)   (1:145)

DBCC PAGE (keyupdatetest, 1, 144, 2);

Memory Dump @0x0000000016A6A000

0000000016A6A000:   01010400 00c20001 00000000 00000800 †…..Â……….
0000000016A6A010:   00000000 00000100 1c000000 bd174108 †…………½.A.
0000000016A6A020:   90000000 01000000 1d000000 8a000000 †…………Š…
0000000016A6A030:   12000000 00000000 00000000 239dbd4a †…………#.½J
0000000016A6A040:   00000000 00000000 00000000 00000000 †…………….
0000000016A6A050:   00000000 00000000 00000000 00000000 †…………….
0000000016A6A060:   30000800 01000000 03000002 001100e1 †0…………..á
0000000016A6A070:   07506175 6c506175 6c506175 6c506175 †.PaulPaulPaulPau
0000000016A6A080:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000016A6A090:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000016A6A0A0:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000016A6A0B0:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000016A6A820:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000016A6A830:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000016A6A840:   6c000021 21212121 21212121 21212121 †l..!!!!!!!!!!!!!
0000000016A6A850:   21212121 21212121 21212121 21212121 †!!!!!!!!!!!!!!!!
0000000016A6BFE0:   21212121 21212121 21212121 21212121 †!!!!!!!!!!!!!!!!
0000000016A6BFF0:   21212121 21212121 21212121 21216000 †!!!!!!!!!!!!!!`.


Row – Offset                        
0 (0x0) – 96 (0x60)

Now I'll update the record:

UPDATE test SET c1 = 2 WHERE c1 =1;

And let's look at the log records generated (in the output below I've removed all the extraneous stuff from the log that occurs before my transaction):

SELECT [Current LSN], [Operation], [Context],
 [Log Record Length], [Page ID], [Slot ID] FROM fn_dblog (NULL, NULL);

Current LSN             Operation           Context               Log Record Length Page ID        Slot ID
———————– ——————- ——————— —————– ————– ———–
0000001d:00000096:0011  LOP_BEGIN_XACT      LCX_NULL              112               NULL           NULL
0000001d:00000096:0012  LOP_DELETE_ROWS     LCX_MARK_AS_GHOST     2116              0001:00000090  0
0000001d:00000096:0013  LOP_MODIFY_HEADER   LCX_PFS               76                0001:00000001  0
0000001d:00000096:0014  LOP_SET_BITS        LCX_PFS               56                0001:00000001  0
0000001d:00000096:0015  LOP_INSERT_ROWS     LCX_CLUSTERED         2120              0001:00000090  1
0000001d:00000096:0016  LOP_COMMIT_XACT     LCX_NULL              52                NULL           NULL
0000001d:000000a2:0001  LOP_EXPUNGE_ROWS    LCX_CLUSTERED         64                0001:00000090  0
0000001d:000000a2:0002  LOP_SET_BITS        LCX_PFS               56                0001:00000001  0
0000001d:000000a2:0003  LOP_MODIFY_HEADER   LCX_PFS               76                0001:00000001  0

The row is deleted and then inserted. Look at the log record lengths! And look at the slot numbers. The new row is in slot 1 (the LOP_INSERT_ROWS record) and slot 0 is ghosted (the LOP_DELETE_ROWS record), and then we see ghost cleanup running (the LOP_EXPUNGE_ROWS record), which removes the old slot 0, and the slot array shifts down so the new record is slot 0 again. See my blog post Inside the Storage Engine: Ghost cleanup in depth for info on ghost cleanup internals.

But is it in the same physical place on the page? No – a whole new record was created.

DBCC PAGE (keyupdatetest, 1, 144, 2);

Memory Dump @0x0000000013CAC000

0000000013CAC000:   01010400 00800001 00000000 00000800 †…………….
0000000013CAC010:   00000000 00000100 1c000000 bd172210 †…………½.".
0000000013CAC020:   90000000 01000000 1d000000 a2000000 †…………¢…
0000000013CAC030:   01000000 76020000 00000000 239dbd4a †….v…….#.½J
0000000013CAC040:   01000000 00000000 00000000 00000000 †…………….
0000000013CAC050:   00000000 00000000 00000000 00000000 †…………….
0000000013CAC060:   3c000800 01000000 03000002 001100e1 †<…………..á
0000000013CAC070:   07506175 6c506175 6c506175 6c506175 †.PaulPaulPaulPau
0000000013CAC080:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000013CAC090:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000013CAC820:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000013CAC830:   6c506175 6c506175 6c506175 6c506175 †lPaulPaulPaulPau
0000000013CAC840:   6c300008 00020000 00030000 02001100 †l0…………..
0000000013CAC850:   e1075061 756c5061 756c5061 756c5061 †á.PaulPaulPaulPa
0000000013CAC860:   756c5061 756c5061 756c5061 756c5061 †ulPaulPaulPaulPa
0000000013CADFE0:   21212121 21212121 21212121 21212121 †!!!!!!!!!!!!!!!!
0000000013CADFF0:   21212121 21212121 21212121 41084108 †!!!!!!!!!!!!A.A.


Row – Offset                        
0 (0x0) – 2113 (0x841)              

Notice that where the original record ended (offset 0x830 into the page). In the first page dump, everything after that is empty space. In the second page dump, the new record starts there. And look at the slot array (called the offset table by DBCC PAGE). In the first page dump the original record started at offset 0x60, and in the second dump it starts at offset 0x841. Even though ghost cleanup ran, the record did NOT migrate to it's original location.

It's very clear that an in-place update did not happen, and that the entire record is recreated in it's entirety.

In this case, the record ended up being physical slot 0 again, but that's a pathological case because there's only one slot on the page. If you repeat the experiment with two records (e.g. c1=1 and c1=2) and then update c1 of the first record to 3, you'll see the new record being physical slot 2, then after ghost cleanup becoming physical slot 1. But again, it's deleted and reinserted in it's entirety.

Now I'll show you a case where the row has to move to a new page – definitely not an in-place update!

USE KeyUpdateTest;
CREATE TABLE test (c1 INT, c2 VARCHAR (4000));
CREATE CLUSTERED INDEX test_cl ON test (c1);
INSERT INTO test VALUES (1, REPLICATE ('Paul', 1000));
INSERT INTO test VALUES (2, REPLICATE ('Erin', 1000));

I've filled up the page with two rows.

DBCC PAGE (keyupdatetest, 1, 144, 2);

Metadata: ObjectId = 2105058535      m_prevPage = (0:0)                   m_nextPage = (0:0)
pminlen = 8                          m_slotCnt = 2                        m_freeCnt = 58
m_freeData = 8130                    m_reservedCnt = 0                    m_lsn = (29:150:2)
0000000016A6BFA0:   696e4572 696e4572 696e4572 696e4572 †inErinErinErinEr
0000000016A6BFB0:   696e4572 696e4572 696e4572 696e4572 †inErinErinErinEr
0000000016A6BFC0:   696e0000 21212121 21212121 21212121 †in..!!!!!!!!!!!!
0000000016A6BFD0:   21212121 21212121 21212121 21212121 †!!!!!!!!!!!!!!!!
0000000016A6BFE0:   21212121 21212121 21212121 21212121 †!!!!!!!!!!!!!!!!
0000000016A6BFF0:   21212121 21212121 21212121 11106000 †!!!!!!!!!!!!..`.


Row – Offset                        
1 (0x1) – 4113 (0x1011)             
0 (0x0) – 96 (0x60)

Now I'll update the first row to have c1 = 3 so my row moves away from Erin's :-)

UPDATE test SET c1 = 3 WHERE c1 =1;

And look at the log:

SELECT [Current LSN], [Operation], [Context],
 [Log Record Length], [Page ID], [Slot ID] FROM fn_dblog (NULL, NULL);

Current LSN             Operation           Context                Log Record Length Page ID        Slot ID
———————– ——————- ———————- —————– ————– ———–
0000001d:000000a3:0015  LOP_BEGIN_XACT      LCX_NULL               136               NULL           NULL
0000001d:000000a3:0016  LOP_MODIFY_ROW      LCX_PFS                80                0001:00000001  0
0000001d:000000a3:0017  LOP_MODIFY_ROW      LCX_IAM                88                0001:00000091  0
0000001d:000000a3:0018  LOP_HOBT_DELTA      LCX_NULL               64                NULL           NULL
0000001d:000000a3:0019  LOP_FORMAT_PAGE     LCX_INDEX_INTERIOR     84                0001:00000092  -1
0000001d:000000a3:001a  LOP_INSERT_ROWS     LCX_INDEX_INTERIOR     84                0001:00000092  0
0000001d:000000a3:001b  LOP_ROOT_CHANGE     LCX_CLUSTERED          96                0001:00000082  52
0000001d:000000a3:001c  LOP_COMMIT_XACT     LCX_NULL               52                NULL           NULL
0000001d:000000a3:001d  LOP_BEGIN_XACT      LCX_NULL               120               NULL           NULL
0000001d:000000a3:001e  LOP_MODIFY_ROW      LCX_PFS                80                0001:00000001  0
0000001d:000000a3:001f  LOP_MODIFY_ROW      LCX_IAM                88                0001:00000091  0
0000001d:000000a3:0020  LOP_HOBT_DELTA      LCX_NULL               64                NULL           NULL
0000001d:000000a3:0021  LOP_FORMAT_PAGE     LCX_HEAP               84                0001:00000093  -1
0000001d:000000a3:0022  LOP_INSERT_ROWS     LCX_CLUSTERED          4092              0001:00000093  0
0000001d:000000a3:0023  LOP_DELETE_SPLIT    LCX_CLUSTERED          60                0001:00000090  1
0000001d:000000a3:0024  LOP_MODIFY_HEADER   LCX_HEAP               84                0001:00000090  0
0000001d:000000a3:0025  LOP_INSERT_ROWS     LCX_INDEX_INTERIOR     88                0001:00000092  1
0000001d:000000a3:0026  LOP_COMMIT_XACT     LCX_NULL               52                NULL           NULL
0000001d:000000a3:0027  LOP_INSERT_ROWS     LCX_CLUSTERED          4120              0001:00000093  1
0000001d:000000a3:0028  LOP_COMMIT_XACT     LCX_NULL               52                NULL           NULL

This is really interesting. Here's what happens (missing a few things for clarity):

  • 0000001d:000000a3:0016: page (1:146) is allocated to be the new root page for the clustered index
  • 0000001d:000000a3:0019: it gets minimally formatted (just the page header – although the page header is 96 bytes, there is 12 bytes of empty space at the end)
  • 0000001d:000000a3:001a: the index record pointing at (1:144) is inserted in the root page
  • 0000001d:000000a3:001b: the index root os changedd from (1:144) to (1:146) in metadata
  • 0000001d:000000a3:001e: page (1:147) is allocated to be the second data page at the leaf of the clustered index
  • 0000001d:000000a3:0022: a page split of page (1:144) occurs, moving the Erin row to slot 0 of page (1:147)
  • 0000001d:000000a3:0023: the Erin row is removed from page (1:144) – not ghosted
  • 0000001d:000000a3:0025: the index record pointing at (1:147) is inserted in the root page
  • 0000001d:000000a3:0027: the updated Paul row is inserted as slot 1 of page (1:147)

As you can see, this is as far as you can get from doing an in-place update. Page (1:144) is left with a single ghost-record on (the Paul record – the Erin record isn't ghosted because it was moved because of a split). The next access to the page will cause ghost cleanup to remove it and deallocate the page.

So no, in-place updates of key values do not happen, even though they appear to at first glance. You can experiment to show the same thing happening in nonclustered indexes too.

Another misconception busted!

PS Some of you may remember when such in-place updates really *did* happen. That was before 2005 when the Storage Engine (specifically my old Access Methods team) did the index maintenance (changing clustered and nonclustered index structures based on key value changes). A whole bunch of features in 2005 meant that we switched it over to the Query Processor to drive the index maintenance, and nifty short-cuts like we used to do aren't possible any more.