Capturing DBCC CHECKDB Output

If you haven’t experienced database corruption in your SQL Server career and had to pore over DBCC CHECKDB output, you’re a lucky soul.  If you encounter corruption, the most important thing you can have is the output, because this helps determine your next course of action.

Step 1

First, you need to be running CHECKDB on a regular basis.  If you’re going to ask what you mean by regular basis, I’ll direct you to a post of Paul’s that is still relevant: Importance of running regular consistency checks.  There are a few factors to consider, but in general I recommend at least once a week.  If you’re a 9 to 5 shop and have the maintenance window to run CHECKDB daily, feel free to do that.

Step 2

Second, running CHECKDB and capturing the output should be an automated process.  SQL Agent is your friend here, and you have a few options.

Option 1: Set up an Agent Job

Within SSMS, you can create an Agent job to run DBCC CHECKDB and write the output to a file.  I recommend the following syntax, which is good for most databases:

DBCC CHECKDB ('YourDBNameHere') WITH NO_INFOMSGS;
GO

Note: If you have a very large database (1TB or larger), or don’t have the maintenance window to run a complete CHECKDB, then you may want to look at another one of Paul’s posts for options, Consistency Checking Options for a VLDB.

Within the Agent job, it’s pretty straight-forward:

Running CHECKDB in an Agent job

Running CHECKDB in an Agent job

 

To make sure the CHECKDB output is saved, you want to write it out to a file, which you configure on the Advanced tab:

CHECKDB output file configuration

CHECKDB output file configuration

 

The configuration of the output file is important.  If you enter a path and file name like this:

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019\MSSQL\Log\checkdb_output.txt

then every time the job runs it overwrite the contents of this file.  You have an option to append the file, but then you have to make sure you manage the size of the file.  Instead, append the date and time to the file to make it unique:

C:\Program Files\Microsoft SQL Server\MSSQL15.SQL2019\MSSQL\Log\checkdb_output_$(ESCAPE_SQUOTE(DATE))_$(ESCAPE_SQUOTE(TIME)).txt

You also need to create a job to clean up output files regularly.  With this configuration, you can view the CHECKDB output at any time, as long as the file still exists in the folder specified.

Option 2: Use Ola’s Agent Job

The syntax above works for one database…but what if you have 10 user databases on your instance?  And what about system databases?  Well, you can add the commands for each database, or you can make life easy and use Ola Hallengren’s scripts for maintenance, which creates two jobs for integrity checks: one for user database, and one for system databases.

SQL Agent jobs created by Ola's maintenance script

SQL Agent jobs created by Ola’s maintenance script

 

Ola’s stored procedure, dbo. DatabaseIntegrityCheck, has multiple options for configuration.  By default it checks all tables and indexes, and outputs error messages, and the job writes an output file each time it executes (including date and time in the name) to the instance Log folder by default.  When multiple databases are checked, the output for all databases is included in one file.  It’s also worth noting that Ola’s scripts create an Output File Cleanup job to clean up those files.  By default NONE of these jobs are scheduled, so I recommend you review what the job does, confirm it is configured how you want, and then create a schedule.

Option 3: Use a Maintenance Plan

In every environment I manage directly, I use Ola’s scripts for maintenance tasks.  However, some folks may be more comfortable using Maintenance Plans.  If that’s true for you, then when you set up the plan for integrity checks, there are two things I recommend.  First, you typically do not want to run with the PHYSICAL_ONLY option unless the check takes an extremely long time to run and you’re trying to make the task fit into a maintenance window.  When you use the PHYSICAL_ONLY option the logical checks, inter-page checks, and things like DBCC CHECKCATALOG will be skipped.

Options in the CHECKDB maintenance task

Options in the CHECKDB maintenance task

 

Second, the maintenance task writes to an output file by default.  Within the UI, you can select the Reporting and Logging button to confirm the location.

Reporting and logging for CHECKDB maintenance plan

Reporting and logging for CHECKDB maintenance plan

 

By default, it writes to the Log folder for the SQL Server installation (e.g. C:\Program Files\Microsoft SQL Server\MSSQL15.MSSQLSERVER\Log for SQL Server 2019).  Each file is created with the date and time as part of the filename.  As with other options, you need to ensure that these output files are cleaned up on a regular basis.

Other Items to Consider

If you run DBCC CHECKDB in SSMS via a machine you have RDP’d to, be aware that depending on the group policy your session can disconnect after a certain amount of time.  I’ve had this happen in client environments, and if you’re trying to run CHECKDB, it will stop and you won’t get any output…hence the suggestion to use an Agent job.

With any Agent job you use for integrity checks, I also recommend (dare I say insist?) that you enable notifications for job failure.  This is configured on the Notifcations tab of the job, and requires an Operator and database mail.  Create an Operator that writes to a team alias (e.g. DBAs@yourcompany.com) rather than just you.

Agent Job Notifications

Agent Job Notifications

 

Summary

If you are responsible for any SQL Server instance, you want to make sure of the following:

  1. You are running DBCC CHECKDB on a regular basis to look for integrity issues.
    1. This process should be automated.
  2. Your method for checking should include an option to write the CHECKDB output to a file that can be referenced later.
    1. If you find corruption, make a couple copies of this file.  If it gets deleted, you will have to run CHECKDB again to determine what is corrupt.
  3. Set up Notifications for the Agent job so you are notified if corruption is found.

I hope that the information in this post was review for you, but if it wasn’t, or if you have encountered corruption, feel free to reach out to see if we can help.

 

Working From Home with Kids: 7 Weeks In

I’ve been meaning to write a follow up post to my original to discuss what life is REALLY like during quarantine.  I re-read my original post and then starting writing, but after about two pages realized it was more a journal entry than an informative post.  I decided to pivot, and just create a list of things I’ve noticed.

  • We’re still operating on a loose schedule. There are general bed times (e.g. 9 or 9:30 for the 13-year old, so she’s up by 9 or 9:30 for school) and general wake up times.  Everyone handles their own schedule.
  • I haven’t set an alarm clock since this started. I do not miss waking up anyone else.
  • I also do not miss making school lunches, and since my kids can feed themselves, it’s great. They still haven’t learned how to properly rinse dishes and load them into the dishwasher.  We’re working on it.
  • I’ve done better with giving them chores to do around the house. The 15-year old needs explicit instructions, the 13-year old loves mopping.  Happy to let her have that task.  No one likes laundry.
  • The kids are handling their school work really well. They spend most of their day working in their rooms (they each have a desk and a laptop), coming out for lunch or breaks.  I found this sign on my daughter’s door this morning:
School Work From Home Messages

School WFH Messages

  • They still have meltdowns about school. These are mostly related to school work and less related to social issues, which I’ll take.
  • After the first week of online school, my 13-year old informed me that she hates sitting at a computer all day. (I’ve heard similar from my friends who are teachers.)
  • I try to start each morning with time sitting outside, a work out, and then I shower and get dressed. That routine has always worked for me, and I wear jeans most days, as does the 15-year old.
  • After dinner we are still working on family time. With the nice weather we’ve been going to an area of the Metroparks where the road is closed to ride bikes, skateboard, and walk.
  • We’ve had some Zoom meetings with family, some meetings with friends, and the 13-year old is great at FaceTime and Google Meet with her friends. They have class meetings, too.  The 15-year old is not great at initiating friend time – sometimes I think we’re still arranging play dates for a toddler.  My husband had him play catch with a friend the other day, which did a world of good for his outlook.  Introverts need friend time too!
  • We’ve done a couple socially distanced hang outs with friends. Those will get easier as it gets warmer, and we have plenty of room to spread out on our porch and in our yard.
  • Work has not been too different. I get a few more interruptions than usual, but no more than I would when the kids are home during the summer.  I expect this might change in a few weeks, when they wrap up school.
  • Screen time is higher than I would like (for all of us), but it’s not awful.
  • Animal Crossing is a great game.  I am not the most frequent player in our house, which is good because the island would still have a tent for a museum and no bridges.  I think my house has cockroaches.
  • I’ve finished one puzzle, which was 500 pieces.  I am slow at puzzles.  I have a 1000 piece one I need to start.
  • Yard work has been great therapy, but I don’t like weeding.  I figure, whatever works right now is enough.
  • Everyone now realizes what a distraction/annoyance the dogs can be since they can be loud and want to go outside frequently. For this reason, Mittens was named Quarantine Employee of the Month for April (thank you Andy Mallon for the inspiration).
Stellato Home Office Quarantine Employee of the Month for April: Mittens

Stellato Home Office Quarantine Employee of the Month for April: Mittens

All in all, we’re doing well.  I enjoy not having to rush for anything after dinner or on the weekend.  I’ve lost count of the number of trips and events that have been cancelled, and there are only a couple “once in a lifetime” things that make me sad (the 15-year old was headed to the state theater competition, which won’t be rescheduled).  If there were milestones we were missing, like graduation, this would be much harder.  I feel for the seniors, and for anyone graduating at this time.  And I am tired from constantly living in a state of concern, and not knowing what will happen next.  It’s exhausting.  The best we can do it take it one day at a time, and focus on the little things that bring joy.  Keep at it friends, we’ll get through this.

Troubleshooting Performance in Azure SQL

At some point, whether you’re a DBA, developer, or application administrator, you’re going to find yourself troubleshooting performance in Azure.  Within the Azure Portal you have the capability to look at performance for an Azure SQL via the Performance overview and Query Performance insight pages.

Checking Performance in the Azure Portal

Checking Performance in the Azure Portal

If you haven’t worked with SQL Server previously, this is a great place to start as it provides a high level view of system performance, broken out by DTUs, CPU, and data and log I/O.  You can dig into specific queries that might be causing issues through Query Performance Insight, including query text and resource use over time.

But if you’re familiar with troubleshooting in SQL Server, this method feels different, and slower.  Not to worry, many of the tools you’re familiar with still work in Azure SQL, you just need a few tweaks.

Wait Statistics

Whenever I’m dropping into a server – whether it’s one I’m familiar with or not – to look at a performance issue, I want to understand wait statistics.  Specifically, I want to know what wait statistics while the problem is occurring.  For this reason, I don’t just query sys.dm_os_wait_stats, I snapshot it, using Paul’s script from his post Capturing wait statistics for a period of time.

You can use this as-is in Azure SQL, but, since it queries sys.dm_os_wait_ stats, you’re not looking at wait statistics specific to your database, which is really what you care about.  So you have to change sys.dm_os_wait_ stats to sys.dm_db_wait_stats in the script.  Once you make that switch, you’re good to go – although depending on the issue you’re dealing with you may change the delay between snapshots to something lower than 30 minutes, which is what Paul has in the script.  I sample for either 5 and 10 minutes, depending on what else I want to check.  This also assumes the problem is occurring WHILE I am capturing wait statistics.

WhoIsActive

I think I’ve bought Adam Machanic drinks a few times, but at this point I feel like I owe him several dinners at high end restaurants.  While I’m waiting for the wait statistics snapshot to complete, I like to run WhoIsActive.  You want to grab that from his blog, rather than GitHub.  If you read the post, there was an issue with the original version of the script, fixed here in v2.  Note that when you create the stored procedure you want to do that in your user database.  It will fail if you try to create it in master.  Once it’s created, you can use all your favorite parameters for execution:

EXEC sp_WhoIsActive;   /* default, what's running right now */
GO

EXEC sp_WhoIsActive @find_block_leaders = 1;   /* list output with lead blocking sessions at the top */
GO

EXEC sp_WhoIsActive @get_plans = 1;   /* pull execution plans */

If you haven’t worked with WhoIsActive much, I’ll refer to you the documentation.  Between wait statistics and WhoIsActive, I usually get an idea of where to look next.  And this is where the next script comes in handy…

SQL Server Diagnostic Queries

You can get the current version of Glenn’s Diagnostic Queries for each release on his Resources page.  Note that Azure SQL and Managed Instances have separate scripts.  With the Diagnostics Queries in hand, I can dig deeper into what I think might be the issue, or I can just start rolling through the queries in order to check configuration and performance data.  I may end up on some tangents, depending on what I find, but these three scripts together provide a great starting point when troubleshooting performance in Azure, whether I have a complaint that the system is slow overall, or I’m told that a particular set of queries is slow. Happy troubleshooting!