Hekaton data and code – where does that stuff actually live?

When investigating new features, I’m prone to use what I call the Rumpelstiltskin method, that is, I’m happier if I know where these features “live” in the file system (not exactly Rumpelstiltskin, where the reward came by guessing his name, but hopefully I’m not stretching the analogy too far).

With this in mind, I started off to find where *exactly* Hekaton stores it’s compiled table code, it’s compiled procedure code, and it’s”backing store” data for persistent memory-optimized tables. For the experiment I used the Hekaton-ized Northwind sample database code from the SQL Server 2014 Books Online. I executed the code in stages, and looked at the appropriate file system locations, once I’d found them.

Creating or altering a database to contain a filegroup for memory-optimized data with a corresponding file produces the well-known “filestream directory structure”, as used by filestream (2008+) and filetable (2012+). With a file for a memory-optimized data filegroup, it contains the nested GUID-named subdirectories as though you had one table containing filestreams and one filestream-based column. However, the lowest level of directory isn’t empty; it contains 1 GB-worth of files. 8 of those files are 128 mb in size, and 14 other files are empty. And this is *before* adding any data. Or even any tables.

BTW, the filegroup for memory-optimized files can have more than on container (ADD FILE.. in DDL, directory as far of the file system goes). Adding a second “file” produces the corresponding “filestream” directory, but no GUI-named subdirectories (yet). So that’s where the data will live, because there has to be a persistent backing-store to populate the in-memory tables at next SQL Server startup time.

Now, let’s add a memory-optimized table and see what happens. I’d been browsing around the file system in the “SQL Server instance directory” (C:\Program Files\Microsoft SQL Server\MSSQL11.MSSQLSERVER\MSSQL in my case) and noticed an subdirectory of binn named “xtp”. That’s where the C compiler and linker live (in the xtp\bin directory) and where the support libraries live (in the xtp\lib directory). Great.

I also noticed an empty subdirectory of the “data” directory named “xtp”. Creating a single memory-optimized table (Employees, in this script) produced an subdirectory under …\data\xtp with the name ‘5’. Where 5 is the DBID of the database in question (hkNorthwind in my case). There are six files for each memory-optimized table, with names start begin with “xtp_t_5_277576027”. With memory-optimized tables:

xtp_t – is for table code

5 – refers to the DBID

277576027 – refers to the object_id of the table

The six files I saw had suffixes .c, .obj, .dll, .pdb (symbols), .out (intermediate compiler file), and .mat.xml. Not sure what .mat.xml is just yet (it contains an XML representation of the table metadata), but the other files are recognizable. Adding the other tables, I can see one set of these files for each memory-optimized table. Similarly, adding an native-compiled stored procedure produces the same six file types that begin with “xtp_p” (for procedure).

Interestingly, if I stop the instance and start it up again, these files only appear once I’ve *executed* the procedure. The files for tables appear after startup. So it sounds like these are lazy-instantiated and compiled when they are first used. This could be the case for the “table code” as well, but these would have to be instantiated at SQL startup, so that the tables could be (re)created and (re)populated in memory.

Turns out that I didn’t have to “browse-around” for the modules’ location. It shows up in sys.dm_os_loaded_modules, under the “name” column when the modules are loaded.

Back to data. All I do for now is what the script contains, populate the tables using INSERT statements. And this does nothing to the tables in the memory-optimized filegroup files. I’m assuming it writes to the 128 mb pre-allocated files. However, executing CHECKPOINT afterwards writes to one of the previously empty pre-allocated files, changing the size to 64mb. Bringing SQL Server (and the OS) down and up a few times populates more of these files to 64mb. And, after a while, I check back and I have 24 files total in the “filestream subdirectory”. All this appears to be consistent with the Books Online description under the “Storage/Defining durability for memory-optimized tables“. I’m not seeing any populated “Delta files” yet, because I haven’t UPDATEd or DELETEd any rows. That should work as advertised too.

So. Nothing too earth-shattering, but at least now I know “where everything is”. And Hekaton tables and procedures are less of an unknown “Rumpelstiltskin” today than they were yesterday.

Cheers, Bob

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.

Hekaton data and code – where does that stuff actually live?

Other articles

Over 1000 XEvents in SQL Server 2016 CTP2. Here are the new ones.

Taking the Azure SQL Database row-level security preview for a spin

Azure SQL Database V12 Preview – Spatial Fully Functional

Not a “me-too announcement” blog on Azure SQL Database V12 preview

AzureML: What components are used by the sample experiments?

Book Review: Chris Webb’s “Power Query for Power BI and Excel”

Imagine feeling confident enough to handle whatever your database throws at you.