Extended Events Usage Survey

Last week at the PASS Summit I presented my Making the Leap from Profiler to Extended Events session, and one of the questions I always ask at the beginning is how many people have used Profiler (usually most of the attendees) and how many have used Extended Events (very few).  I’m giving this session again this week and next, and I thought it would be interesting to get feedback from the community as a whole about Profiler and XE use based on SQL Server version.  So in Paul Randal style, here’s my first poll.


I will summarize the results in a post next week.  Thanks for your help!

PASS Summit 2013: Final Recap

I haven’t traditionally written recap posts for the PASS Summit, but this year was just phenomenal, and I think that down the road I would regret it if I didn’t take a few minutes to summarize the highlights.

Perry and I arrive at the PASS Summit

Perry and I arrive at the PASS Summit

In no particular order…

#SQLRun

On Wednesday morning about 70-some people congregated in downtown Charlotte for the now-traditional #SQLRun.  Organized by Jes Borland (whom I really don’t have enough wonderful adjectives to describe), it was a 3.4 mile run with fellow SQL runners in the complete dark.   It was the perfect way to start Day 1 of the Summit.  A run, when it includes friends, is never bad.  I met a couple new people that I saw throughout the week, proving again that common interests outside SQL Server help facilitate those relationships within PASS.  Whatever your passion, I encourage you to find people with the same non-SQL interests at conferences and SQLSaturdays.  You just never know who you’ll meet.

My Sessions

My first session was Wednesday morning after the keynote, and it was Making the Leap from Profiler to Extended Events.

Perry checking out the crowd before my XE session

Perry checking out the crowd before my XE session

This was one of my favorite topics to cover this year, and based on feedback throughout the week, it hit home with many people.  Over 300 attendees made it to the session (the picture above was taken 15 minutes before I started), and I had great questions throughout and finished right on time.  In case you missed it, I’ll be giving a shorter version of the same session this Wednesday at noon EDT for the DBA Virtual Chapter (sign up here) and again next week at SQLIntersection.  Scripts can be downloaded from the PASS site, or from the SQLskills site under the Conference Demos heading.

My second session was Friday morning at 8 AM, How to Avoid Living at Work: Lessons from Working from Home.  Despite the early hour, we had a good number of attendees and a great discussion.  As I mentioned in a post back in August, I’m still adjusting, but it’s going well 🙂

The WIT Panel

I had the honor of sitting on the WIT Panel on Thursday, and even though I probably said less than the other panelists, I had the opportunity to address a couple great questions (including one from an audience member).

2013 WIT Panel

2013 WIT Panel (L to R: Gail Shaw, Kevin Kline, Cindy Gross, Rob Farley, me, and moderator Mickey Stuewe)

You can view the recording here, and since Thursday I’ve had a lot of time to reflect on what else I could have said, particularly when I answered the question from the audience member.  I want to include it here, for reference, and if you watch the video it starts at 59:11:

I had an interesting experience.  I was walking around the Expo yesterday and after having a short conversation with someone, someone said to me, well, you are a woman working in technology, you are a foreigner, you are a former collegiate athlete, and you are young.  You have all this working against you, how are going to make it in this industry?

My reply to her was that I would have said, “How am I NOT going to make it?”  Because here’s the thing: YOU decide what you can and cannot do, what you will and will not do.  You are in complete control of your destiny.  People will doubt you.  People will tell you that you aren’t good enough, don’t know enough, that you’re not “something enough”.  Don’t listen to them.  Know who you are…and if you don’t know, figure it out.  I firmly believe that once you fully accept the person that you are, and you like that person, that nothing will stop you.  Have confidence in yourself and then go forth and conquer.  And to the guy that said that?  There’s one part of me that wants to kick his @$$.  The other part of me feels sorry for him.  He has no idea what he’s up against.

The SQLskills Team

A conference bonus is that I get to see the SQLskills team.  It’s not often we’re all together because we’re scattered throughout the US.  I had time with every member of the team, including a couple dinners which really provide time to catch up in a relaxed setting.  I also moderated Kimberly’s session, Skewed Data, Poor Cardinality Estimates, and Plans Gone Bad, on Thursday, which was a lot of fun.  If you have any interest in statistics, go watch her session on PASS TV.

Me and Kimberly - notice how tall I am?!

Me and Kimberly – notice how tall I am?! (photo credit @AmazingScotch)

SQL Sentry

I cannot say enough good things about SQL Sentry.  They sponsored many events at the Summit including (and if I miss one please let me know):

  • Quizbowl at the Welcome Reception
  • Karaoke on Tuesday night
  • #SQLRun on Wednesday morning (they marked the path and provided t-shirts to those who registered early)
  • WIT Panel Luncheon (including a cool USB hub for swag)
  • The SQL Sentry shuttle on Tuesday, Wednesday, and Thursday nights that provided transportation for Summit attendees around Charlotte

In addition to being a community leader, SQL Sentry is simply a remarkable company.  I have met many members of their team, and it’s a close-knit group that values their customers, and just puts out great products.  I have been a fan of the company and its team since I joined the community, and they raised the bar even further this year.  Well done.

Dr. DeWitt’s Keynote

On Thursday morning Dr. DeWitt returned to the PASS Summit…I actually have no idea how many times he’s given a talk at the PASS Summit, but I know that for each of the past four years that I have been there, he’s been there.  This year his topic was Hekaton and of course it did not disappoint.

Perry listening to Dr. DeWitt talk about Hekaton while I type rapidly

Perry listening to Dr. DeWitt talk about Hekaton while I type rapidly

I live-blogged his session and was able to capture a fair bit of his content.  Dr. DeWitt explains complex database topics in a way that many understand – he’s not just a smart guy, he’s a great teacher.  Thank you Dr. DeWitt for your session, and thank you PASS for bringing him back again.  Can we do it again next year?

My Peeps

I cannot list everyone here.  You would all just end up looking for your name 🙂 But seriously, there are so many factors that contribute to a successful Summit for me, and one of them is most certainly seeing friends and meeting new people.  Whether we had a 5 minute chat, discussed a technical problem and how to solve it, or enjoyed a beer at some point: thank you for being part of the SQL community, and for being part of my world.  I feel so fortunate that I have a group of individuals, within my professional field, I call true friends.

Ok, ok…I have to give a special shout out to Johan Bijnens who brought me chocolate all the way from Belgium, and Aaron Bertrand who brought me Kinder Eggs from Canada.  Thank you both for feeding my addiction 🙂

I’m already thinking about next year’s Summit, but I hope to see you all before then.  Have a great week, and good luck catching up on email!

p.s. One of my favorite pictures from the week, courtesy of Jimmy May.  And if you’re wondering why the heck this Perry the Platypus stuffed animal shows up in all these pictures…well, I take him with me on trips and then take pictures to send back to my kids.  They think it’s hilarious.  Ok…I do too.

Perry and me before my XE session

Perry and me before my XE session

PASS Summit 2013: Day 2

And day 2 at this year’s PASS Summit starts with a sweet surprise from Aaron Bertrand ( b | t ), Kinder eggs.  It promises to be a good day.

Today is Dr. DeWitt‘s keynote (did I mention that he’s a University of Michigan alum? Go Blue!), and here we go…

8:15 AM

Douglas McDowell, outgoing Vice-President of Finance starts with information about the PASS budget.  Summit is the largest source of revenue for PASS, the Business Analytics Conference provided a nice contribution to the budget this year (over $100,000), and PASS has a one million dollars in reserve.

Last year PASS spent 7.6 million dollars on the SQL Server community, with the largest amount spent on the Summit.  The second largest cost was the BA Conference.  Per Douglas, Headquarters (HQ) is a critical investment for PASS.  Right now the IT department has 3 individuals maintaining 520 websites.  (And you thought you were a busy DBA!)  One initiative for PASS this year, and going forward, is an international expansion, which took about 30% of the budget this past year.  Overall, PASS is a very good financial place – and thanks to Douglas for all his work as a Board member.

8:31 AM

Bill Graziano takes the stage to thank Douglas for his time on the Board, and also Rob Farley who moves off the Board this year.  Bill asked Rushabh to come on stage…Rushabh has been on the Board of Directors for 8 years.  He’s held the positions of VP of Marketing, Executive VP of Finance, and President.

8:34 AM

Incoming PASS President Tom LaRock takes the stage, and starts with an omage to Justin Timberlake and Jimmy Fallon’s hashtag video.  Awesome.  Tom introduces the incoming PASS BoD:

  • Thomas LaRock (President)
  • Adam Jorgensen (Executive VP of Finance)
  • Denise McInerney (VP of Marketing)
  • Bill Graziano (Immediate Past President)
  • Jen Stirrup (EMEA seat)
  • Tim Ford (US seat)
  • Amy Lewis (open seat)

Tom has announced the PASS BA Conference – it will be May 7-9, 2014 in CA.  Next year’s Summit will be November 4-7, 2014 in Seattle, WA.

The WIT Lunch is today – and I’m on the panel so I hope to see you there!

8:41

Dr. DeWitt takes the stage, and the topic is Hekaton: Why, What, and How.

I was able to meet the co-author of this session, Rimma Nehme, before today’s keynote – she’s a Senior Researcher in his lab (which is apparently in an old Kroger grocery store building on the Madison campus).

DeWitt says that Hekaton is an OLTP rocket ship.  The marketing team has renamed Hekaton to In-Memory OLTP, and DeWitt wants people to vote on Twitter.  I am Team #Hekaton…it just sounds cooler (and it’s much easier to type).

Covering three things: What, the Why and How of Hekaton.

Hekaton is memory optimized, but durable.  It’s a very high performance OLTP engine, but can be used for more than that.  It’s fully integrated into SQL Server 2014, not a bolt-on.  Architected for modern CPUs.  (Slide deck will be posted later, I’ll post the link when I have it.)

Why Hekaton?  Many OLTP databases now fit in memory.  There are certain kinds of workloads that SQL Server can no longer meet.  Historically, OLTP performance has been improved by better software (driven by TPC benchmarks), CPU performance doubling every 2 years, existing DBMS software maturing.  DeWitt says we’ve done as much as we can with mainline products.  CPUs are not getting faster – that well is drive.

Hekaton was picked because the goal was 100X improvements.  Not quite there yet.  Customers have seen 15-20X.  If you’re burning 1 million instructions/sec and only yields 100 TPS.  If you want to get to 10,000 TPS, but reduce number of instructions/sec to a value that’s just not possible.

Getting to 100X with the flood of new products available (e.g. Oracle-TimesTen, IBM-SolidDB, Volt-VoltDB), including Hekaton.  Why a new engine?  Why not just pin all the tables in memory?  That won’t do the trick.  Performance would still be limited by the use of:

  • latches for shared data structures such as the buffer pool and lock table
  • locking as the concurrency control mechanism
  • interpretation of query plans

Implications of a shared buffer pool is a consideration.  Assume the pool is empty.  Query 1 comes along and needs page 7.  Is page 7 in the pool?  No, a frame is allocated, the query has to wait while the IO occurs.  The IO completes and Query 1 can continue.  Remember that the data structure is a shared structure.  If Query 2 checks for page 7, the buffer manager will report where it is, but Query 2 will be blocked by the latch on page 7 until Query 1 is finished.

(sidebar: a transaction or query only holds 2 latches at a time)

There can be significant contention for latches on “hot” pages in the buffer pool.  This can be a big performance hit.  All “shared” data must be protected with latches.

The need for concurrency control…  Query 1: A = A + 100.  Database actions: Read A, update the value, write A.  Query 2: A = A + 500.  Database actions: Read A, update the value, write A.  If A was originally 1000, after both queries, it will be 1600.  This represents a serial schedule.

Two phase locking developed by Dr. Jim Gray – which is the standard.  Two simple rules:

  1. Before access, query must acquire “appropriate” lock type from Lock Manager
  2. Once a query releases a lock, no further locks can be acquired

If these rules are followed, resulting schedule of action is equivalent to some serial (good) schedule.  Dr. Gray got Turin Award for proof of this, one of two given to database scientists.

(sidebar: can I get a fellowship in Dr. DeWitt’s lab?  Seriously…)

Still need a deadlock detection/resolution mechanism also needed (wanted to get rid of this for Hekaton…which is why it’s been a 5 year effort).

After a query is parsed and optimized, get an execution plan which is given to a query interpreter that walks the tree of operators and executes them in a particular order.  When the database is on disk, the cost of interpreting tree is insignificant.

All these three things (concurrency, query optimization and latches) are why you can’t get to 100X with current implementations.

Currently in SQL Server, shared data structures use latches.  Concurency control is done via locking, and query execution is via interpretation.

With Hekaton, shared data structures are lock-free.  For concurrency control, versions with timestamps plus optimistic concurrency control is used.  For query execution, compile into DLL that loads when queries are executed.  This is what will get us to 100X.

SQL Server has 3 query engines – relational, apollo (column store), and Hekaton.

To use Hekaton, create a memory-optimized table.  Two kinds of durability: schema-only and schema-and-data.  (Every Hekaton table must have a primary key index – can be hash or range.  Also have new b-tree in Hekaton, b-w tree, which gives high performance on range queries.) Some schema limitations for V1.  Once you’ve created the table, then populate the table.  Run a SELECT INTO statement, or do a BULK LOAD from a file.  Just need to make sure it’s going to fit into memory.  Then, use the table.  Via standard ad-hoc T-SQL query interface (termed “interop”), up to 3X performance boost.  Adapt, recompile and execute T-SQL SPs, get 5X-30X improvement.

Query optimization is the hardest part of relational databases, per Dr. DeWitt.  Lock-free data structures truly are rocket science – they make query optimization look simple.

Lock-free data structures invented by Maurice Herlihy at Brown University – got him elected to the National Academy of Engineering (which is a big deal).

When you think lock-free, think about latch free – it allows multiple processes with threads to access the same data structure without blocking.  Dr. DeWitt has a great slide showing performance differences with multiple threads for latch vs. lock-free.  He mentioned that it was a tough slide to animate (and if you see it, you’ll understand why…I was actually wondering how he did it).  With lock-free (aka latch-free?) – an update does not block reader threads – there is no performance hits.  Every shared data structure in Hekaton was built around this functionality.

In Hekaton, now have a different concurrency control.  It’s optimistic:

  • Conflicts are assumed to be rare
  • Transactions run to “completion” without blocking or setting locks
  • Conflicts detected during a Validation phase

Second component of concurrency control is multiversion – updating a row creates a NEW version of the row.  It works really well when you do this in memory.  The third component is timestamps – every row version has a timestamp:

  • Each row version has an associated time range
  • transactions use their begin timestamp to select correct version
  • timestamps also used to create a total order for transaction to obtain equivalent of serial order

This approach drastically reduces number of threads – dozens not hundreds.

Transaction phases in Hekaton:

  • Read committed versions of rows
  • Updates create new tentative versions
  • Track read set, write set, and scan set

When the transaction is done, goes through second phase which is validation, this is where the concurrency control mechanism decides whether transaction can commit.  Reaches commit point…

When transaction begins, current clock value is used as Begin_TS for transaction.  At the start of the validation phase, transaction is given unique End_TS.  It is used during validation to determine whether it is safe to commit the transaction.  Begin_TS are NOT unique, End_TS are ALWAYS unique.

Hekaton tables have either hash or range index on unique key.  Rows allocated space from SQL’s heap storage.  Additional indices (hash or range) on other attributes.

Hekaton row format – all rows tagged with Begin_TS and End_TS.  Latest version has infinity on the End_TS ( most recent version of the row).  The Begin_TS is the End_TS of the inserting transaction.  The End_TS is the logical time when the row was deleted and/or replaced with a new version.

Multiversioning Example – Have transaction to increase value by 10,000.  A new version of the row is created.  Pointers are used to link the rows together in memory.  Don’t think about them being contiguous in memory.  The transaction puts its signature (transaction ID) on each row (End_TS of initial row, Begin_TS of new row).  When the transaction is later validated and committed, for all rows it updated/created/deleted, it will re-access each row with that transaction ID and replace it with the End_TS.  NO LATCHES ARE USED!  NO LOCKS ARE SET!  NO BLOCKING OF ANY TRANSACTIONS!  (I’m not not yelling, neither is Dr. DeWitt.)  This is timestamps and versioning – used on rows AND transactions.  Always create new versions of rows when doing updates.  Per Dr. DeWitt, competitors are not going to have the same level of performance.

9:27 AM

Optimistic multi-version – this is the lock/latch-free mechanism in Hekaton (Dr. DeWitt says it so fast it’s hard to catch it 🙂

When is it safe to discard “old” versions of a row?  When the begin timestamp of the oldest query in the system is ahead of the last End_TS…older rows no longer needed.  Hekaton garbage collection is non-blocking, cooperative, incremental, parallel, and self-throttling. It has minimal impact on performance.  It happens completely under the covers.

Steps:

  1. Updates create new version of each updated row
  2. Transaction use  combination of time stamps and versions for concurrency control
  3. A transaction is allowed to read only versions of rows whose “valid” time overlaps the Begin_TS of the Xi.
  4. Transactoins essentially never block (WAIT, there’s a caveat here that Dr. DeWitt is glossing over…hm…)

Validation Phase

  1. Transaction obtains a unique End_TS
  2. Determine if transaction can be safely committed.

Validation steps depend on the isolation level of the transaction – “new” isolation levels for Hekaton.

Read Stability key idea: check that each version read is still “visible” at the end of the transaction using End_TS.

Phantom Avoidance requires a repeat each scan checking whether new versions have become visible since the transaction started.  And if any scan returns additional rows, validation fails.  This sounds expensive, but, keep in mind all rows are in memory.  It is only performed for transaction running at a serializable level, and it is still a LOT cheaper than acquiring and releasing locks.

There is a also a post-processing phase with 3 sub-phases (which I couldn’t type fast enough…oy).

Checkpoints & Recovery – the data is not lost, have a normal checkpoint process, use logs to generate checkpoints (holds data during shutdown).  Restart/recovery – starts by loading a known checkpoint and scans log to recover all work since then, fully integrated with HA (giving readable secondaries of memory optimized tables).

Standard method for query execution on a relational system is complicated…and slow-ish.  Regular T-SQL access is Interop.  Queries can access and update both Hekaton and disk-resident tables.  Interpreted execution limits performance.  When you put this all into a DLL, get much faster execution.

Native plan generation – query through parse tree, produces logical plan and then feeds it into optimizer which produces physical plan – but these are likely to be different for Hekaton (different algorithms and cost model).  Take the physical plan and then translate it into C code (the ugliest C code you’ve ever seen, no function calls, per DeWitt), goes into C compiler and produces DLL – which is very slim, only what’s needed to run the SP.  The DLL is then invoked and loaded – it’s stored in the catalog.

9:50

Dr. DeWitt just gave a bunch of quantitative data showing performance improvement in terms of transaction/sec, instructions/sec and CPU…I couldn’t copy it fast enough. 🙂

For more details, there is a session at 1:30 PM in Room 208 A-B there is a DBA Focused Session, tomorrow at 10:15 AM there is a dev-focused session.

Dr. DeWitt takes some time to thank his team.  This is something I really appreciate about him.  He brings Rimma Nehme on stage (who surprised him by showing up today) and gives nothing but praise…ending with a slide that has a picture of Rimma and a bubble: “Let the woman drive.”  Love it!

You can download DeWitt’s deck and supporting files here.

Thank you Dr. DeWitt!