Tracking SQL Server Database Usage

One of the challenges of inheriting an existing set of SQL Servers and databases when you start a new job is determining which databases are actively being used, especially on development and testing systems where there may be multiple copies of the same database, used by different projects over time. The same challenge applies to multi-tenant software-as-a-service (SaaS) providers that create a new database for each client they serve. An easy way to track whether a database is being used is with Extended Events and the lock_acquired event, filtering for the shared transaction workspace (SharedXactWorkspace) lock that is acquired any time a user connects to a database.
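To see this lock before building the event session, you can query sys.dm_tran_locks on a live instance; every session with an open database context holds a DATABASE-level lock with the SHARED_TRANSACTION_WORKSPACE owner type. This is a quick illustrative check, not part of the tracking solution itself:

```sql
-- Current DATABASE-level locks held by connected sessions; every session with
-- a database context open holds one with the SHARED_TRANSACTION_WORKSPACE owner type
SELECT 
	DB_NAME(resource_database_id) AS [database],
	request_session_id,
	request_owner_type
FROM sys.dm_tran_locks
WHERE resource_type = N'DATABASE'
  AND request_owner_type = N'SHARED_TRANSACTION_WORKSPACE';
```

This shows who is connected right now; the event session below captures the same lock acquisitions over time so that usage can be aggregated per database.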

To start, we need to look up the columns returned by the lock_acquired event, as well as the map values associated with those columns, so that we know the correct values to use in our event session predicate.

-- Look up the lock_acquired event columns
SELECT 
	name,
	column_id,
	type_name
FROM sys.dm_xe_object_columns
WHERE object_name = N'lock_acquired' AND
	column_type = N'data';

-- Look up the values for the Lock Resource Type and the Lock Owner Type
SELECT 
	name,
	map_key,
	map_value
FROM sys.dm_xe_map_values
WHERE name IN (N'lock_resource_type',
N'lock_owner_type');

From this, we can see that the DATABASE lock_resource_type has map_key = 2 and the SharedXactWorkspace lock_owner_type has map_key = 4. With these values, we can define our event session to track how frequently this lock occurs by database_id, and leverage the bucketizer/histogram target to bucket the data automatically. Since the target name and its output format changed slightly in SQL Server 2012, two version-specific examples of the event session and the event-parsing code are presented below:

SQL Server 2008 Event Session

-- If the Event Session Exists, drop it first
IF EXISTS (SELECT 1 
			FROM sys.server_event_sessions 
			WHERE name = 'SQLskills_DatabaseUsage')
	DROP EVENT SESSION [SQLskills_DatabaseUsage] 
	ON SERVER;

-- Create the Event Session
CREATE EVENT SESSION [SQLskills_DatabaseUsage] 
ON SERVER 
ADD EVENT sqlserver.lock_acquired( 
	WHERE owner_type = 4 -- SharedXactWorkspace
	  AND resource_type = 2 -- Database level lock
	  AND database_id > 4 -- non system database
	  AND sqlserver.is_system = 0 -- must be a user process
) 
ADD TARGET package0.asynchronous_bucketizer
( SET slots = 32, -- Adjust based on number of databases in instance
	  filtering_event_name='sqlserver.lock_acquired', -- aggregate on the lock_acquired event
	  source_type=0, -- event data and not action data
	  source='database_id' -- aggregate by the database_id
)
WITH (MAX_DISPATCH_LATENCY = 1 SECONDS); -- dispatch immediately and don't wait for full buffers
GO

-- Start the Event Session
ALTER EVENT SESSION [SQLskills_DatabaseUsage] 
ON SERVER 
STATE = START;
GO
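Before querying the target, it is worth confirming that the session is actually running. A started session appears in the sys.dm_xe_sessions DMV (a session that exists but is stopped will only appear in sys.server_event_sessions):

```sql
-- A running session appears in sys.dm_xe_sessions; if this returns no rows,
-- the session was created but has not been started
SELECT name, create_time
FROM sys.dm_xe_sessions
WHERE name = N'SQLskills_DatabaseUsage';
```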

-- Parse the session data to determine the databases being used.
SELECT  slot.value('./@count', 'int') AS [Count] ,
        DB_NAME(slot.query('./value').value('.', 'int')) AS [Database]
FROM
(
	SELECT CAST(target_data AS XML) AS target_data
	FROM sys.dm_xe_session_targets AS t
    INNER JOIN sys.dm_xe_sessions AS s 
		ON t.event_session_address = s.address
	WHERE   s.name = 'SQLskills_DatabaseUsage'
	  AND t.target_name = 'asynchronous_bucketizer') AS tgt(target_data)
CROSS APPLY target_data.nodes('/BucketizerTarget/Slot') AS bucket(slot)
ORDER BY slot.value('./@count', 'int') DESC;
GO

SQL Server 2012 Event Session

-- If the Event Session Exists, drop it first
IF EXISTS (SELECT 1 
			FROM sys.server_event_sessions 
			WHERE name = 'SQLskills_DatabaseUsage')
	DROP EVENT SESSION [SQLskills_DatabaseUsage] 
	ON SERVER;

-- Create the Event Session
CREATE EVENT SESSION [SQLskills_DatabaseUsage] 
ON SERVER 
ADD EVENT sqlserver.lock_acquired( 
	WHERE owner_type = 4 -- SharedXactWorkspace
	  AND resource_type = 2 -- Database level lock
	  AND database_id > 4 -- non system database
	  AND sqlserver.is_system = 0 -- must be a user process
) 
ADD TARGET package0.histogram
( SET slots = 32, -- Adjust based on number of databases in instance
	  filtering_event_name='sqlserver.lock_acquired', -- aggregate on the lock_acquired event
	  source_type=0, -- event data and not action data
	  source='database_id' -- aggregate by the database_id
); -- dispatch immediately and don't wait for full buffers
GO

-- Start the Event Session
ALTER EVENT SESSION [SQLskills_DatabaseUsage] 
ON SERVER 
STATE = START;
GO

-- Parse the session data to determine the databases being used.
SELECT  slot.value('./@count', 'int') AS [Count] ,
        DB_NAME(slot.query('./value').value('.', 'int')) AS [Database]
FROM
(
	SELECT CAST(target_data AS XML) AS target_data
	FROM sys.dm_xe_session_targets AS t
    INNER JOIN sys.dm_xe_sessions AS s 
		ON t.event_session_address = s.address
	WHERE   s.name = 'SQLskills_DatabaseUsage'
	  AND t.target_name = 'histogram') AS tgt(target_data)
CROSS APPLY target_data.nodes('/HistogramTarget/Slot') AS bucket(slot)
ORDER BY slot.value('./@count', 'int') DESC;
GO

One thing to keep in mind with this event session is that while an end user might not actually use a database, other tasks such as maintenance jobs, backups, DBCC CHECKDB, or even IntelliSense in SQL Server Management Studio will. It is therefore expected that databases not being used by end users will still show up in the histogram target, but their counts will be significantly lower than those of databases actively being used by end users or applications.
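Once enough data has been collected, the session should be stopped and removed so it does not keep running indefinitely. Note that the histogram target's data is kept in memory only, so capture the parsed results before stopping the session. A minimal cleanup, using the session name from the examples above:

```sql
-- Stop the event session once enough data has been collected
-- (the in-memory histogram data is lost when the session stops)
ALTER EVENT SESSION [SQLskills_DatabaseUsage]
ON SERVER
STATE = STOP;
GO

-- Remove the session definition from the server
DROP EVENT SESSION [SQLskills_DatabaseUsage]
ON SERVER;
GO
```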

7 thoughts on “Tracking SQL Server Database Usage”

  1. Great article. I wish there were a combined ola.hallengren type solution for recording various aspects of the SQL instance to record historical information.

  2. Hi Jonathan,

    Thank you for the article.

    I followed through and created a session on a SQL 2012 instance and noticed that for some of the returned entries the dbname value is NULL.

    I have 26 databases (including system), so I set the number of slots to 26:

    ADD TARGET package0.histogram
    ( SET slots = 26, …

    I only get 24 records back – instead of 26 – and as mentioned some entries have NULL for db name.

    I looked at the XML returned and for the problematic entries the value field contains a number that does not correspond to the dbid of any databases (hence the NULL values returned):

    34

    38

    Any idea what could be causing this?

    Overall the results look good, but I wonder if I am missing something.

    Thanks again!
    Marios Philippopoulos

  3. Would I be correct in presuming the number of slots should be set equal to the number of databases (including system) on the server?
    So if “Select count(*) from sys.databases;” returns 57, set the slots = 57?

    Thanks,
    Jason

  4. Using SQL Server 2019, I am getting the same issue reported by Marios Philippopoulos, except that (7) of my (24) databases are marked NULL. When I right-click on View Target Data, for the package0.histogram, I see (7) numbers in the first column, marked value, that are higher in value than any of my database_id values revealed by this query: Select name, database_id from sys.databases;

    Any ideas?
    Thanks.

    Select @@Version:
    Microsoft SQL Server 2019 (RTM-CU9) (KB5000642) – 15.0.4102.2 (X64) Jan 25 2021 20:16:12 Copyright (C) 2019 Microsoft Corporation Developer Edition (64-bit) on Windows 10 Pro 10.0 (Build 19041:

    1. Hey Tom,

      My guess would be that during data collection you possibly had DBCC CHECKDB run, which creates a snapshot of each database with a different database_id from the source database. The event session still collects those snapshot database_ids into the histogram target while running, and the snapshot is removed when CHECKDB completes. I’ve verified this happens in a test environment in my lab.
