Migrating to In-Memory OLTP…what’s stopping you?

In the past few months I’ve had several clients reach out about migrating to In-Memory OLTP solutions.  When the second or third request rolled in I remember thinking, “Finally!”  As in, I’ve been wondering why businesses haven’t been looking to implement In-Memory sooner.  In fact, I added a section on In-Memory to our IEPTO2 course because, with the availability of the feature in SQL Server 2016 SP1 (yes, there are memory limitations, but it’s a good place to start), I figured we would see an uptick in interest.  But here we are, halfway through 2017 and more than six months since the release of SQL Server 2016 SP1, and I still see a lot of hesitation around it.

I wrote a post over on SQLPerformance last week, Testing DML Statements for In-Memory OLTP, and that generated some discussions on Twitter.  So I figured it was time for a post to find out what’s holding companies back.  This isn’t a true poll – it’s a fill-in-the-blank.  As in: post a comment.  If you have considered migrating to In-Memory and then didn’t, I want to understand why.  I recognize there are limitations – the product is still new and it’s evolving.  But perhaps if we understand the largest inhibitors to migration we can help move them up on Microsoft’s list via Connect and other pathways.  Understand I am asking for specifics here, for example: we can’t use In-Memory tables because they don’t support spatial data types.  Whatever the reason, share it via a comment.

I’m giving this a week or so, then I’ll write a follow up either discussing trends, debunking myths, showing some testing, or perhaps just talking about why I’m sad that no one uses In-Memory OLTP.  🙂  In all seriousness, I have no idea what the follow up will look like, it all depends on whether you and your friends share info!  Until then, thanks for reading and commenting!

(And if you’re in the US, please have a safe 4th of July…no one wants to go to the hospital after trying to set off fireworks at home…just ask my friend who’s a hand surgeon, it’s her least favorite time of year!)

Fireworks from the professionals

36 thoughts on “Migrating to In-Memory OLTP…what’s stopping you?”

  1. Hi Erin,
    I guess the reason is, I need to identify what tables or complete DBs would benefit from going to an In-Memory solution. Is there any way to identify this? Maybe the Tuning Advisor or a third-party tool? I work for local government and all the databases are vendor-driven, so anything I do would be wiped out by the next update, etc.
    Also all of our servers (prod & dev) are on VM’s (VMware). Has anybody seen advantages when using In-Memory in this virtual environment?

    Thanks for your posts, I learn a lot from you.
    Patrick

    1. Hi Patrick – thanks for your comment! SQL Server does have methods to help you determine what tables and SPs are candidates for migration to In-Memory. Check out: https://docs.microsoft.com/en-us/sql/relational-databases/in-memory-oltp/determining-if-a-table-or-stored-procedure-should-be-ported-to-in-memory-oltp – the Transaction Performance Analysis Reports should help you out.
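
      If you want a quick first pass before setting up those reports, a rough query like this can surface the tables with the heaviest latch and lock waits – they tend to be the same tables the report flags. It’s only a sketch (the TOP and ordering choices are arbitrary), not the report’s actual logic:

      -- Rough first pass at finding candidate tables: heavy page-latch and
      -- row-lock waits (illustrative only, not the official report logic)
      SELECT TOP (20)
          OBJECT_SCHEMA_NAME(ios.object_id) AS [schema_name],
          OBJECT_NAME(ios.object_id) AS [table_name],
          SUM(ios.page_latch_wait_count) AS page_latch_waits,
          SUM(ios.page_latch_wait_in_ms) AS page_latch_wait_ms,
          SUM(ios.row_lock_wait_count) AS row_lock_waits,
          SUM(ios.row_lock_wait_in_ms) AS row_lock_wait_ms
      FROM sys.dm_db_index_operational_stats(DB_ID(), NULL, NULL, NULL) AS ios
      JOIN sys.objects AS o ON o.object_id = ios.object_id
      WHERE o.is_ms_shipped = 0
      GROUP BY ios.object_id
      ORDER BY SUM(ios.page_latch_wait_in_ms) + SUM(ios.row_lock_wait_in_ms) DESC;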

      With regard to vendor databases – that is a little different in that migrating to in-memory requires support from the vendor. That may not be an option, but I’d definitely recommend that you ask them about it – you won’t know until you ask. Also, updates wouldn’t necessarily wipe out changes – the vendor would need to drop and recreate tables, and/or drop and recreate stored procedures (more likely). You may also run into limitations with in-memory (e.g. not all data types are supported), so I would ask the vendor, first, what they support, if anything, and whether they plan to support it in the future.

      With regard to using In-Memory OLTP in VMs, why wouldn’t you expect to see the same advantages as you would with a physical server? In-Memory OLTP is about having the data already in memory, or using natively compiled procedures as a performance benefit. Those are independent of running in a VM vs. on a physical machine.

      Thanks,

      Erin

      1. Erin,

        I’m thrilled that there’s even a discussion of In-Memory (thanks!).

        I’ve done a deep dive on the Transaction Performance Analysis Report, and there are definitely some issues with the way tables are evaluated for placement on the “magic quadrant”. Blog post here: http://bit.ly/2srl207

  2. Erin,

    My hat is off to you for taking the initiative to fill in where Microsoft hasn’t been up to the task. Perhaps the biggest concern is fear around security and data integrity. The view is based on the assumption that there is a trade off between reliability and security of data.

    1. Hi Yihalem-

      Thanks for your comment! Can you please explain the concern about security and data integrity? I’m unclear what you mean when you say “there is a trade off between reliability and security of data”. Can you elaborate please?

      Thanks!

      Erin

      1. Perhaps he’s referring to the inability to perform a consistency check on HK tables. As far as I know every other security feature (including Always Encrypted) is available. My one headache is the inability to remove the filegroup, if the experiment doesn’t work out.

    1. George-

      In some cases, having everything fit in the buffer pool absolutely provides the performance you need. But when it doesn’t, then in-mem tables and natively compiled SPs can provide a huge benefit.
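
      Just to make the natively compiled piece concrete, here’s a minimal sketch – the table and procedure names are invented, and it assumes a memory-optimized table dbo.OrdersInMem with a CustomerId column:

      -- Minimal natively compiled procedure sketch (names are illustrative)
      CREATE PROCEDURE dbo.usp_OrderCountByCustomer
          @CustomerId INT
      WITH NATIVE_COMPILATION, SCHEMABINDING
      AS
      BEGIN ATOMIC WITH (TRANSACTION ISOLATION LEVEL = SNAPSHOT, LANGUAGE = N'us_english')
          SELECT COUNT(*) AS OrderCount
          FROM dbo.OrdersInMem
          WHERE CustomerId = @CustomerId;
      END;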

      Erin

      1. This is the main thing holding me back from looking into in-memory OLTP. I have a 20GB database on a server with 48GB RAM. My whole database is in RAM with plenty to spare. My DB is also sitting on SSDs.

        I think in-mem OLTP is a very cool concept, but I’m not sure it really has anything to offer me in my current setup.

  3. We just finished migrating to SQL2016 from SQL2008 and SQL2012. Our systems are heavy inserts; it is an e-mail archive solution. It is on the horizon to implement and I am looking at the shock absorber scenario. Insert into Hekaton first, then bulk insert into main tables. Also going to be challenging for us to test; current databases are 9 and 30TB in size. Gotta get more memory 1st and also implement other things too like page compression on both tables and indexes. The 9TB is done, now working on 30TB one. Upper management needs to decide if they want to proceed with it too.

    1. Todd-

      Your scenario is a great one for in-mem OLTP. Loading the data into in-mem tables will be very fast, and then you can do an async offload to the disk-based tables. Compression will help for disk-based as well, and yes, you will need to plan for memory. If you can, right now, I’d start grabbing info about the volume of data loaded per day so you have actual numbers to make your case for a memory request. I always think that if you can quantify what you need, you have a better chance of actually getting it (don’t tell my kids that! 🙂). Good luck. I think it sounds like a fun project!
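
      A very rough sketch of that shock absorber pattern – all object names here are invented, and it assumes the disk-based dbo.EmailArchive table already exists:

      -- Land inserts in a memory-optimized staging table (illustrative schema)
      CREATE TABLE dbo.EmailStaging
      (
          EmailId BIGINT IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED,
          ReceivedUtc DATETIME2 NOT NULL,
          Payload VARBINARY(MAX) NOT NULL
      ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

      -- ...then flush to the disk-based table on a schedule (e.g. an Agent job)
      BEGIN TRAN;
          INSERT dbo.EmailArchive (EmailId, ReceivedUtc, Payload)
          SELECT EmailId, ReceivedUtc, Payload
          FROM dbo.EmailStaging WITH (SNAPSHOT);

          DELETE FROM dbo.EmailStaging WITH (SNAPSHOT);
      COMMIT;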

      Erin

  4. The reasons are:
    – Let’s be frank then. It’s too new, too young for serious business. So it’s risky. No one wants to risk his/her critical data.
    – When you add a Filegroup, there’s no way back as far as I remember. You may say “what harm does it do?” – but doesn’t it bother you when you create “MDW” and you can’t delete it afterwards?
    – I implemented Hekaton in one of the test environments and I observed that it caused lots of messages to be written to the error log – it’s chatty.
    – There were lots of limitations when I first tried it; I know that over time they have eliminated a lot of them.
    – I’m not sure how people calculate the bucket counts for the hash indexes. I know the documentation says “1.5 or 2 times the number of unique values”, but how is one supposed to know the future number of unique values? For instance, there could be 200 unique values in a column today and 500 next week.
    – In many environments and clients, they deploy changes daily or weekly. In Hekaton, many operations are/were not online. One needs to drop an object and then create a new one. When the business needs to be online and running, it’s not practical to drop and recreate objects in the middle of operations.

    Note: As the development goes on by Microsoft, some of my reasons might be invalid now. But those are my concerns and I believe they are shared by many out there.

    1. Ekrem-

      Thank you for your candid feedback. Answering each item individually…

      – It is new, but it has to start somewhere. I understand not wanting to risk critical data – I think temp table use is a good place to start.
      – True about the FG – but hopefully people are testing in a TEST/DEV environment first and not just doing this in production.
      – I didn’t know it was so chatty – I haven’t seen that same issue in my testing/work.
      – They continue to reduce the limitations; I recommend you check the documentation to stay current.
      – There’s an article on calculating bucket count (https://msdn.microsoft.com/en-us/library/dn494956(v=sql.120).aspx) and in 2016 you can change the count – it requires a rebuild (see the sketch below)
      – Valid argument
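
      For reference, the 2016 syntax for that rebuild looks roughly like this (table and index names are illustrative):

      -- Resize a hash index bucket count (SQL Server 2016); the table is
      -- offline for the duration of the rebuild
      ALTER TABLE dbo.MyInMemTable
          ALTER INDEX ix_hash_OrderId
          REBUILD WITH (BUCKET_COUNT = 2097152);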

      Again, thanks for the feedback!

      Erin

  5. We’ve spent a large amount of effort trying it at several sites, to little outcome.

    We had hoped to use memory-optimized table types to replace TVPs but are yet to achieve suitable outcomes.

    The real benefit comes from natively compiled code but the language support is still too primitive. Without that, the interop kills it, and joins or updates to disk-based tables invariably lead to spools of all the data to tempdb.

    We’ve also seen a bunch of stability issues, i.e. random “a severe error has occurred” type errors with connection terminations. They all disappear when we remove the memory-optimized declaration.

    Have you actually used it at serious production scale with good outcomes? We really want to and have really tried.

    I’ve been desperately trying to find someone with great outcomes to see if we can emulate that. I suspect that some limited shock absorber scenarios might work.

    1. Greg-

      Thanks for sharing your experience, even though it hasn’t been ideal. I agree that natively compiled stored procedures are a huge part of the perf gain, and I haven’t thoroughly tested joining in-mem to disk-based tables. The use cases I’ve worked with so far have been more limited and so I haven’t hit the same issues. That may change. Thanks for the feedback.

      E

    2. Greg-

      We currently use in-memory table types to replace TVPs and temp tables within procedures. Our specific use case is around reducing tempdb contention (on system base tables, because our temp table creation rate was so high). This is on boxes with 40K+ transactions per second. This is implemented on a few hundred instances.

      Overall it has been a win for us. Most of our TVPs are integer-based though, which might be the big difference. Honestly the tempdb contention was so bad before we made these changes, that almost anything would have been better. I wrote a blog post about the types of contention we were dealing with: https://m82labs.com/tempdb-contention.
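
      For anyone curious, the general shape is something like this (the type and column names are just illustrative, not our actual schema):

      -- Memory-optimized table type used in place of a disk-based TVP/temp table
      CREATE TYPE dbo.IntIdList AS TABLE
      (
          Id INT NOT NULL,
          INDEX ix_Id NONCLUSTERED HASH (Id) WITH (BUCKET_COUNT = 1024)
      ) WITH (MEMORY_OPTIMIZED = ON);

      -- Used like any other table variable
      DECLARE @Ids dbo.IntIdList;
      INSERT INTO @Ids (Id) VALUES (1), (2), (3);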

      Feel free to reach out via the contact details on my site if you have any questions. It could honestly just be that our workload fits In-Memory OLTP nicely.

      -Mark

  6. I asked someone to try it on SQL 2014 and tried it myself on a project in SQL 2016. Both failed. But admittedly it’s called IMOLTP and not IMETL or IMDW 🙂

    The SQL 2014 project was bulk loading lots of files in parallel 24×7, so IMOLTP would act as a no-logging staging area. Amazingly, it didn’t improve performance over using temp tables, and so it was scrapped. I didn’t get the details from the developer but they gave it a serious shot.

    I still wondered about it and tried myself on an early SQL 2016 project, using a partitioned view to have hot reporting data on IMOLTP and cold data in a disk table. This was about 30 million rows at the time (much more these days) and when I benchmarked between normal, IMOLTP, and columnstore variations – performance for IMOLTP was all over the place and generally far worse than the other two. In this case I had 100% control and could see that it was all in memory and the partitioning was working – IMOLTP was just ineffective for that workload for some reason.
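
    Roughly what I mean by that setup, heavily simplified and with the names changed:

    -- Hot recent rows in a memory-optimized table, cold history on disk,
    -- stitched together with a UNION ALL view (illustrative names)
    CREATE VIEW dbo.ReportData
    AS
    SELECT SaleId, SaleDate, Amount FROM dbo.ReportData_Hot   -- memory-optimized
    UNION ALL
    SELECT SaleId, SaleDate, Amount FROM dbo.ReportData_Cold; -- disk-based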

    In both cases I was using standard procedures and functions because natively compiled ones are too primitive. Remember that they couldn’t even have CASE statements until 2017, and the idea of repeating long, complex code for multiple UNIONs would be unmaintainable. I also tend to use lots of CTEs and window functions for the analytical reporting I do, so those were out as well.

    I think MS screwed up in a few ways.
    – They release a v1 missing so many features that it is almost impossible to use or to justify using. Look at graph support in 2017. There’s literally zero reason to use it over normal tables and joins. It adds nothing whatsoever.
    – They add so many exceptions for known use cases. For example, not supporting computed columns in IMOLTP or columnstore or temporal tables. This excludes so many use cases where you can’t just drop it in and try it; it’s going to require a big redesign.
    – When they introduce a v1 feature and it’s incomplete or doesn’t work it tends to turn people off from trying again in v2 or v3 because they don’t have time to waste on following the marketing again.

    1. When I thought about why my project failed, I concluded that IMOLTP may only provide a benefit where data is being accessed and modified concurrently, and the in-memory model improves performance due to a different method of locking.

      So for my ETL reporting it made no difference, as data was only ever written once, and reporting on it in memory could not be improved because it would always be in the buffer cache anyway.

      I don’t know if that was the reason. Others I spoke to at the time were skeptical of that and truly believed the initial presentations we all sat through, which said that just turning this on would boost performance 7x or so (and 40x or something with the compiled stuff).

      1. Cody-

        Thanks for the comments and for sharing your experience. I think with the ETL reporting you could see a perf improvement if you were using natively compiled SPs. If you had those and didn’t see an improvement I would be surprised. And I agree that 40x is not realistic currently. I think 10-20X is a better expectation to set.

        Erin

    2. “– When they introduce a v1 feature and it’s incomplete or doesn’t work it tends to turn people off from trying again in v2 or v3 because they don’t have time to waste on following the marketing again.”

      I absolutely agree with this.

  7. The ones that hit me the most:

    1. A lot of managers still see memory as relatively expensive. One of the metrics I track on client systems is the ratio of memory size to database size, and it’s usually in the 5-10% range. (So 1TB of data = 50-100GB of buffer pool, which means that’s typically hosted on a 64-128GB server.) When we have a discussion about putting 256GB or 512GB of memory in the server (because the server’s often undersized to begin with), that just ends the conversation. I understand – if you want high availability, that means having 2 servers with that memory in the primary data center, plus one in the DR data center.

    2. The touted 100x improvements usually require natively compiled T-SQL. As your testing at SQLperf showed, you can’t just use in-memory objects and call it a day – you also have to have your code in procs, and deal with the native compilation limitations. Queries in the app, like Entity Framework? Sorry, you’re going to be building procs and modifying your app.

    3. Lack of CHECKDB. I can’t go in front of a client with a straight face and say, “The way we’ll detect corruption is to regularly back up to NUL” (that workaround is sketched after this list). Without corruption detection or repair, any file corruption whatsoever on disk-persisted tables means you’re going to be restoring from full backups, which means downtime. That means In-Memory OLTP basically requires Availability Groups. Again, the price tag ratchets upward due to complexity.

    4. Shops are facing different problems. The typical wait types I see are more along the lines of PAGEIOLATCH, SOS_SCHEDULER_YIELD, CXPACKET, etc, and In-Memory OLTP rarely solves those problems. Look at the top waits across servers – take https://www.spotlightessentials.com/waitopedia for example, top right graph – and the wait stats shops are facing just aren’t fixed with In-Memory OLTP.
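
    For the record, the workaround I mean in #3 is just something like the following – it validates checksums as it reads the data, but leaves you with nothing you could actually restore:

    -- "Corruption detection" via a backup to NUL: read the data, validate
    -- checksums, throw the result away (database name is illustrative)
    BACKUP DATABASE [YourDatabase]
        TO DISK = N'NUL'
        WITH CHECKSUM, COPY_ONLY;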

    It’s not that it’s not right for anybody – I’m sure it’s right for somebody. It’s just never going to be a mainstream feature at this rate.

    1. Brent-

      Really good summary of your experiences – thank you. I agree about the memory and the expected improvements. Even with natively compiled SPs, I haven’t seen anything like 100x improvement. Noted about CHECKDB, and the point about waits is a good one. You have to know what your waits are going in, as In-Mem OLTP isn’t going to help with every situation. I’m hoping to find more where it can…

      E

  8. Just wanted to chime in and say that some of what’s been discussed here has been addressed, i.e. chatty error logs. Microsoft recommends NOT starting with HASH indexes, and I would assume that’s because they are not self-maintaining, and subject to different behavior than what we’re used to for on-disk tables. An example of that would be that you can’t use a predicate based on only the leading column of a multi-column key for a HASH index.

    One comment was about having to drop/recreate objects, and that’s no longer true as of SQL 2016. ALTER PROCEDURE and ALTER TABLE are now fully supported for memory-optimized procs and tables. Your table will of course be offline for the duration of the ALTER statement, but that’s to be expected.
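
    For example, adding an index in place is now just this (object names invented):

    -- SQL Server 2016: add an index to a memory-optimized table without
    -- dropping and recreating it (the table is offline while the ALTER runs)
    ALTER TABLE dbo.MyInMemTable
        ADD INDEX ix_LastModified NONCLUSTERED (LastModified);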

  9. We use In-Memory OLTP extensively, primarily to replace temporary tables and disk-based TVPs. It has been a very successful adoption for us. By far the biggest barrier we face to further use of Hekaton tables in our environment is the lack of cross-database transaction support.

  10. I’m going to echo a lot of what others have said. Brent hit the nail on the head in that organizations are facing problems that IM-OLTP probably cannot solve. In addition to this, I’ve tried a few times now (following BOL) to convert some of my stuff to IM-OLTP and the payback on performance just isn’t there. The tables I deal with on a daily basis are large FACT tables, and when using a clustered columnstore index I’ve seen monumental improvements, but when I tried using IM-OLTP, the performance improvement was not noticeable. On top of that, here are a few things still holding me back from IM-OLTP:
    1) Database triggers not allowed – very helpful when you need to audit your server and DDL activity
    2) The complexity around managing bucket sizes on IM-OLTP tables: I think this is where MS has a real opportunity to course correct. Instead of us having to specify a bucket count and managing the size of that bucket count over time, if MS could implement something behind the scenes similar to the tuple mover they implemented for delta-store and Clustered Columnstore indexes, I think more people will embrace it. I know today we have to manage things like statistics on tables and we could set up a maintenance task, but the more I read up on bucket sizes, there are still mixed opinions (too many buckets = too much extra memory withheld from use on server…too little, your query could under-perform)
    3) Like Brent said, where you see huge improvements is with native-compiled store procedures. But the language is still too primitive (echoing Greg Low). The biggest barrier I see with the NCSP is that all tables involved have to be in-memory (no disk based tables allowed).
    4) Multi-tenant shared environments: It’s hard for me to take away memory on a server when I’m just one group of about 10-15 on that server. It becomes a tough conversation with the other teams that think “wait a minute, you have a solution to put your data on disk, yet you want it in memory, which is a server performance resource we all need?”…it never goes well

    I think the concept of IM-OLTP is amazing, just by the fact that you don’t have to worry about table locking again. And I saw Bob Ward’s presentation at PASS 2016 about the internals of the Hekaton DLL (Absolutely Mind Blowing). But at the same time, putting your database on MVCC (Snapshot Isolation) will help alleviate this issue too.

  11. @Matt Martin – questions for you, sir —

    You wrote that columnstore enhanced performance greatly, but that memory-optimized tables did not deliver noticeable performance. Sounds like you didn’t query memory-optimized tables using a clustered columnstore index on the in-mem tables. Not sure if that’s a fair comparison.

    Not sure how to interpret what you wrote: “But at the same time, putting your database on MVCC (Snapshot Isolation) will help alleviate this issue too.” Sounds like you are equating/comparing a database-level setting of some form of snapshot isolation (RCSI or SI) – which only affects on-disk tables – with deploying memory-optimized tables, which use MVCC. Snapshot isolation for on-disk tables does nothing to prevent write/write blocking, which is handled only with MVCC for memory-optimized tables.

    1. Hi Ned,

      Great point on using a clustered columnstore index on a memory-optimized table. I forgot that SQL 2016 introduced that feature. I gave it a shot and compared it to a clustered columnstore index table copy I had on disk (solid state), and the solid state disk actually outperformed the IM-OLTP version, which makes no sense to me. Both tables had 30M rows in them. The only time I got close to the same performance was if I wrapped the memory-optimized table query inside a natively compiled stored proc and ran the SELECT statement that way. I’m not sure if I’m doing something wrong, but here is the pseudo code I used to create the IM-OLTP clustered columnstore table…Let me know your thoughts:

      CREATE TABLE dbo.test_1_mem(
      po_nbr INT, po_line_nbr INT, last_upd_ts SMALLDATETIME, sales_amt FLOAT,
      INDEX ccsi CLUSTERED COLUMNSTORE,
      PRIMARY KEY NONCLUSTERED (po_nbr, po_line_nbr)
      ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA)

      SELECT po_nbr, SUM(sales_amt)
      FROM dbo.test_1_mem

      Also, on the MVCC question, I followed this link from Paul White (https://sqlperformance.com/2014/06/sql-performance/the-snapshot-isolation-level) and after I enabled snapshot isolation as the default transaction level on my database, I didn’t have readers blocking writers…so again, not sure what I’m doing differently from you. I’d like to hear your feedback on that as well.

      Thanks,
      Matt

      1. Hi Matt,

        (apologies to Erin if Matt and I have detoured on this thread)

        A couple of things to note:

        1. Natively compiled modules cannot consider a clustered columnstore index on a memory-optimized table (only RANGE and HASH indexes). Also, all modules that are natively compiled run serial – you will never get parallelism from a native module, at least up to and including SQL 2016.

        2. There are vast differences between CCI for on-disk vs in-mem, blog post here: http://bit.ly/2oioODY

        3. My point about database level settings for snapshot isolation was they have no bearing on the memory-optimized workload, and only solve reader/writer blocking, but cannot solve writer/writer blocking. When you wrote “will help alleviate this issue too”, it’s not clear to me exactly what “issue” you’re referring to.

        4. While CCI might speed certain queries on memory-optimized tables relative to HASH/RANGE indexes, in my opinion, In-Memory shines brightest for workloads that are both highly concurrent and write intensive. I’m not sure how analytical queries will fare (haven’t tested analytical queries with in-mem, but you’ve inspired me to write a blog post about it).

        5. In the query you listed above, there is a SUM() but no group by, so I’m guessing that you GROUP BY po_nbr.

        Thanks,

        Ned

        1. Hi Ned,

          A few more questions:

          1) On point 1, if I cannot get parallelism on an NCSP for analytical workloads, would it still be suitable to use IM-OLTP for large fact tables in the hundreds of millions of rows?

          2) On your point about writers blocking writers, I understand what you are getting at. My point was on readers not blocking writers, but I see how, when you have multiple writes going on at the same time, that’s where Hekaton gives you the edge. I just don’t encounter scenarios like that at my workplace. All of our stuff is scheduled ETL and loads in batch, without multiple processes trying to write to the same table at once.

          I like your description on point 4, and would be curious if you ever had an opinion to offer on analytical workloads for IM-OLTP. That would be good to hear about and see how we could leverage Hekaton for that.

          – On the SELECT query I previously provided, I did leave out the GROUP BY po_nbr (sorry about that).

          Thanks for the feedback. Let me know if you ever do any research on IM-OLTP in regards to analytical workloads.

          And @Erin, sorry if Ned and I have gotten off course on this topic. Great discussion though!!

          Matt Martin
          matt@sqletl.com

          1. Matt and Ned – no worries! I’m glad to host a place for the discussion, even if it varied a bit 🙂 And, it was good stuff. Thank you both for taking the time to provide feedback, this is helpful!

          2. Hi Matt,

            “if I cannot get parallelism on an NCSP for analytical workloads, would it still be suitable to use IM-OLTP for large fact tables in the hundreds of millions of rows?”

            I don’t have a definitive answer on this. If the queries are large and/or complex, then native compilation can provide a real bump in performance. The greatly reduced clock cycles might offset the serial execution. Keep in mind that if using interop with memory-optimized tables, then just like for on-disk tables, your queries can spill to tempdb due to sorts or joins not being allocated enough memory upfront (SQL 2017 likely changes that with adaptive query magic).

            In-Memory might be a good fit for your workload, even if there is no distinct advantage for querying. That’s because ETL loading is a really great use case for In-Memory. So if the load consumes significant time/resources, In-Memory might improve that process a lot.

  12. Just a few more of the limitations on IM-OLTP
    1) Support for datetimeoffset
    – we deal with a lot of timestamps in varying time zones and this would be helpful. I know that, as a hack, I can have 2 columns (one for the timestamp in UTC and the other for the timezone offset) – see the sketch after this list.
    2) Allowing print/raiserror statements in NCSPs.
    – I know that the NCSP block is atomic so me asking for a raiserror doesn’t make sense, but I’d like to throw my own errors if certain conditions are not met.
    3) Declaring table variables in NCSPs. Again, I know you can pass in a TVP as a parameter, but it would be helpful if I could declare @table (loc_nbr int) while inside the stored proc. If that were allowed, then I might be able to get past the fact that temp tables are not allowed in NCSPs.
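
    The two-column workaround from 1) looks roughly like this (schema is invented for illustration):

    -- Two-column stand-in for datetimeoffset: the instant in UTC plus the
    -- originating offset in minutes (illustrative schema)
    CREATE TABLE dbo.EventLog
    (
        EventId BIGINT IDENTITY NOT NULL PRIMARY KEY NONCLUSTERED,
        EventTimeUtc DATETIME2(3) NOT NULL,
        UtcOffsetMinutes SMALLINT NOT NULL
    ) WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);

    -- Reassemble a datetimeoffset in interop T-SQL when needed
    SELECT TODATETIMEOFFSET(DATEADD(MINUTE, UtcOffsetMinutes, EventTimeUtc), UtcOffsetMinutes) AS EventTime
    FROM dbo.EventLog;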

    I hope MS continues to invest in this technology. It’s already so much better in 2016 and, for that matter, 2017, since they added support for CASE statements.

    Thanks,
    Matt

  13. Late to the party, but I’d love to put in my 2 cents.

    Erin, I would *love* to turn on In Memory OLTP.

    But we are very concerned about the safety of any changes. Once In Memory OLTP is turned on, it can’t be turned off.
    With every database change, we require rollback steps that don’t require full restores with data loss.

    We’re using 2014, and some early tests with In-Memory OLTP showed some problems when we tried turning on the feature on some small-ish test systems with hundreds and hundreds of databases. Checkpoints were slow, restores took too long. These were problems with just turning on the feature, not even using it.

    So we are very reluctant to enable In-Memory OLTP in production without a rollback strategy. We feel like we’re in a corner with tempdb contention and want memory-optimized TVPs to alleviate this.

    As a consequence, without being able to solve tempdb contention, we’re reluctant to turn on RCSI (which uses more tempdb).

    And without any more scalability room to use, we’re actively migrating large pieces of our databases to other NoSQL solutions. (We have a new initiative called All-In-AWS).

    It’s a shame too. I just want the memory-optimized TVPs. This use case (as far as I know) doesn’t strictly need disk at all.

  14. Although Microsoft has invested a lot of effort in In-Memory OLTP technology, currently, even with the latest updates in SQL Server 2017, the functionality is not satisfactory. We cannot migrate databases without losing referential integrity or doing a lot of manual work.
    IMO In-Memory OLTP is for brand new projects, and certainly not for migrating existing ones.
    Some of the difficulties you can expect when trying to migrate a disk-based database to In-Memory OLTP are described in ‘Converting a Database to In-Memory OLTP’:
    https://www.red-gate.com/simple-talk/sql/t-sql-programming/converting-database-memory-oltp/
