In reading the whitepaper about “High-Performance Concurrency Control Mechanisms for Main-Memory Databases”, I was intrigued by the discussion of speculative reads and transaction dependencies. It’s not always good to use information from an academic whitepaper as details of an implementation, because the real implementation might be differ slightly from the description of the whitepaper’s implementation. In addition, bear in the that the observations here are based on the CTP1 version of SQL Server 2014, and the details may change by RTM.
The whitepaper describes two implementations of the multiversion storage engine, one using optimistic concurrency with no locking and one using locking. Because the description of the Hekaton feature mentioned “lock-free structures” as one of the pillars, I looked at the optimistic implementation as possibly close to the CTP1 implementation.
To paraphrase the whitepaper, there are at least two ways the speculative reads can take place.
1. Transaction A determines if it can read Version V. Version V’s begin timestamp is another transaction’s (TB) ID. TB is in the preparing state. Use transaction B’s end timestamp (obtained right before TB-prepare) as V’s begin time. Speculatively read V. Transaction A has a commit dependency of Transaction B.
2. Transaction A determines if it can read Version V. Version V’s end timestamp is another transaction’s (TE) ID. TE is in the preparing state. Use transaction E’s end timestamp (obtained right before TE-prepare) as V’s end time. Speculatively ignore V. Transaction A has a commit dependency of Transaction E.
These descriptions mention dependencies happening in the transaction A’s Active phase, but in addition, there is mention that “Transaction A may acquire additional commit dependencies during validation but only if it speculatively ignores a version.” For information about transaction phases, reference the previous blog entry or the whitepaper.
After noticing the extended events dependency_acquiredtx_event and waiting_for_dependenciestx_event, I set out to look for those dependencies. Because tx A can only acquire a dependency on tx B when B is in the preparing phase (state) and, in most cases, the preparing state is usually pretty short, the dependency sounded almost like a race condition. Making the preparing state as long as possible would give me a better chance.
In the implementation of pessimistic concurrency, the whitepaper mentions two “read sets” that are checked during the prepare phase. The ReadSet contains pointers to every version read and the ScanSet stores information needed to repeat every scan. The whitepaper also describes WriteSet but that’s outside my scope. During the prepare phase ReadSets are checked to ensure consistency in isolation level Repeatable Read or Serializable. In addition, ScanSets are checked to guard against phantoms in isolation level serializable. Serializable isolation with a large ScanSet seemed to be the best choice to lengthen the prepare phase.
I declared a two column memory-optimized table with a primary key/hash index on one column (I called it id) and no index at all on the other column (c1). Added 250000 rows. And figured that running the following batch inside a “traditional transaction” (to slow things down even more compared to a compiled stored procedure) from multiple clients should produce the behavior.
select top(150000) * from txtest with (serializable);
update txtest with (serializable) set c1 = 1
where c1 in (select top(10) c1 from txtest with (serializable) order by newid()); — update 10 random rows
Running betwen 5-10 of these clients simultaneously for 20 iterations of each client produced the “transaction dependency” behavior consistently. Tracing this with an extended events session that included:
Besides observing the behavior, I was able to make some interesting observations from the event session.
1. You can have “chains” of dependent transactions, e.g. tx A depends on tx B, tx B depends on tx C, etc.
2. You can have multiple dependent transactions on the same transaction, e.g. txs A, B, and C all depend on tx D.
3. You can have multiple dependency_acquiredtx_event for the same two transactions, e.g. two different occurrences of the event for the dependency tx A depends on tx B.
I also noticed one other interesting behavior. In my tests with 10 clients x 20 iterations (200 transactions total) between 60% and 80% of the transactions routinely abort. That’s not surprising, the test was set up to produce read conflicts. What was surprising is that, although transaction commit wasn’t the norm, every transaction that had dependency(ies) on it ended up committing. 22 of 200 transactions in one test. And the dependent ones also committed. That’s surprising, but perhaps the fact that transactions that have dependencies and those that they depend on all eventually commit is just an artifact of the test batch I chose.
So, from this test you can deduce that:
1. Transaction dependencies and speculative reads are real (can be observed) in the current implementation.
2. Large scans and transaction isolation levels higher than snapshot should only be used with memory-optimized tables when *absolutely* necessary. 3. That’s especially true for transaction isolation level serializable.