Recycle Fulltext Catalog Log Files

It’s not often that I have the need or opportunity to work with FullText Indexes, and it’s also not often that I come across a new thing to add to regularly scheduled maintenance tasks in SQL Server.  Today I happened to do both at the same time.  FullText Indexes in SQL Server are really not something that causes many people problems, until they do.  When a problem does happen with a FullText population or crawl, the log file for the catalog is the starting point for troubleshooting what is causing the problem.  The crawl logs are in the instance ERRORLOG path with the format of:

SQLFTXXXXXYYYYY.LOG[Z], where X is the database ID with leading zeros, Y is the catalog ID with leading zeros, and Z is the log file number in sequence (see the Books Online topic for details)
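To make the naming format concrete, here is a minimal Python sketch that pulls the database ID and catalog ID out of a crawl log file name. It assumes each ID is exactly five digits, as the five X and five Y placeholders suggest, and that the sequence number, when present, directly follows ".LOG". The function and type names are hypothetical and only for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed layout, taken from the format described above:
# "SQLFT" + 5-digit database ID + 5-digit catalog ID + ".LOG" + optional sequence number.
_CRAWL_LOG_NAME = re.compile(r"SQLFT(\d{5})(\d{5})\.LOG(\d*)", re.IGNORECASE)


class CrawlLogName(NamedTuple):
    database_id: int
    catalog_id: int
    sequence: Optional[int]  # None when the name carries no trailing number


def parse_crawl_log_name(file_name: str) -> Optional[CrawlLogName]:
    """Return the IDs encoded in a crawl log file name, or None if it does not match."""
    match = _CRAWL_LOG_NAME.fullmatch(file_name)
    if match is None:
        return None
    db_digits, catalog_digits, seq_digits = match.groups()
    # int() drops the leading zeros, so "00005" becomes 5.
    sequence = int(seq_digits) if seq_digits else None
    return CrawlLogName(int(db_digits), int(catalog_digits), sequence)
```

The database ID can then be matched against database_id in sys.databases, and the catalog ID against fulltext_catalog_id in sys.fulltext_catalogs in that database.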

The problem is that this file just grows and grows, and if your catalog has a lot of indexes that are automatically tracking changes, it can become so large that you can’t open it.  The one I had to deal with today was 4GB in size and made me wish that Windows had the good old Linux tail command.  I used PowerShell Get-Content -Path "LogFilePath" -Last 1000 and then slowly walked backwards in increments of 1000 until I found the point in the file that mattered to me, so all is not lost.

However, the LOG file in use had a start date of March 12, 2019, and in my opinion, if something has been broken for that long, we have other things to talk about.  So just like my recommendation for the SQL Server ErrorLog and calling sp_cycle_errorlog in a job to roll the file over, I made a recommendation to recycle the FullText crawl log.  To recycle the crawl log, you have to execute sp_fulltext_recycle_crawl_log for each of the Catalogs in each database that uses FullText:

EXEC sp_fulltext_recycle_crawl_log @ftcat = 'FullTextCatalogName'

I know, it’s not a straightforward task that works across everything, but most databases I deal with don’t have full text catalogs defined, so it wouldn’t be difficult to use sys.databases to get a list of databases with is_fulltext_enabled = 1 and then check sys.fulltext_catalogs in each one to see whether it actually has a catalog before enabling the job.  Or you can just look at the file names in the path and get the database ID and catalog ID directly.
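Once you have the list of databases and catalogs, the per-catalog calls for a job step can be generated rather than typed by hand. The following is a minimal Python sketch of that idea, not a finished tool: the function names and the sample database and catalog names are hypothetical, and it only builds the T-SQL text. Running it against SQL Server is left to whatever scheduler or client you already use.

```python
from typing import Iterable, List, Tuple


def quote_identifier(name: str) -> str:
    """Bracket-quote an identifier, doubling any closing bracket inside it."""
    return "[" + name.replace("]", "]]") + "]"


def quote_literal(value: str) -> str:
    """Wrap a value as an N'...' string literal, doubling embedded single quotes."""
    return "N'" + value.replace("'", "''") + "'"


def build_recycle_script(catalogs: Iterable[Tuple[str, str]]) -> str:
    """Build one USE + EXEC pair per (database name, catalog name) pair."""
    statements: List[str] = []
    for database_name, catalog_name in catalogs:
        # The procedure works on the current database, so switch context first.
        statements.append(f"USE {quote_identifier(database_name)};")
        statements.append(
            f"EXEC sp_fulltext_recycle_crawl_log @ftcat = {quote_literal(catalog_name)};"
        )
    return "\n".join(statements)


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    print(build_recycle_script([("Sales", "ProductCatalog"), ("Support", "TicketCatalog")]))
```

Quoting the names is a deliberate choice here: database and catalog names come from metadata, and a name containing a bracket or a single quote would otherwise produce a broken statement.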

Securing Data in SQL Server

Recently I have been having quite a few discussions around security of data inside of SQL Server and how to prevent the massive data breaches that we have been hearing about on the news.  Like most things, some people want “THE ANSWER” or “THE SOLUTION” to securing the data inside of SQL Server.  Unfortunately, there isn’t a single solution that solves all of the problems that are potentially out there.  Security of data requires defense in depth, starting with a secure configuration and installation of SQL Server.  Often, defense in depth also means changes to the database, the application, and how you do business.

Features for Securing Data

SQL Server offers multiple options and features that help with securing data and since SQL Server 2016 Service Pack 1, many of them are available in Standard Edition:

  • SSL/TLS Protocol Encryption

    • SQL Server 2005+
    • Uses an SSL certificate to encrypt the network connection between clients and SQL Server, securing data from being watched over the wire
    • Can be forced by SQL Server during the handshake when the certificate is properly installed in the certificate store on the SQL Server machine
    • Only requires an SSL certificate from a CA for Server Authentication to implement
    • Requires certificate rotation before expiration
    • Generally does not require application changes but may under specific conditions
  • Database Column Level Encryption

    • SQL Server 2005+
    • Uses a certificate or key to encrypt a column, securing data from being queried without knowing how to decrypt the values
    • Data remains encrypted at the column level even while the database is online
    • Encryption keys are maintained inside of the SQL Server encryption hierarchy and must be opened properly to decrypt data
    • May require application changes to support encryption
      • Not required if handled by stored procedure access to data but opens risks of decrypted data in process memory
    • Performance impacts when comparing column data in WHERE clause
      • May be mitigated by encrypting value to compare first and performing binary filtering of encrypted values
  • Transparent Data Encryption

    • SQL Server 2008+
    • Encrypts data at rest using database encryption key stored in the boot record of the database and a certificate stored in master
    • Prevents someone copying files or stealing a database backup from restoring the data without the certificate
    • Does not require application changes for securing data at rest
    • Does not protect data from being queried through SQL Server once the database is open
    • If someone has access to the master database or a backup of master, they can get the certificate that encrypts other backups (https://simonmcauliffe.com/technology/tde/)
    • Without the certificate that encrypts the database you cannot recover from a disaster
    • Certificate expiration is not enforced or checked and does not require certificate rotation once used for encryption
  • Backup Encryption

    • SQL Server 2014+
    • Encrypts the backup file using a certificate or asymmetric key, securing data backups from being restored
    • Prevents someone stealing a database backup from restoring the data without the certificate or key
    • Same potential risks as TDE since certificate or key is stored in the master database
  • Always Encrypted

    • SQL Server 2016+
    • Data remains encrypted at the column level even while the database is online
    • Column encryption keys are stored in the database to encrypt data
    • Column master keys are used to encrypt the column encryption keys in the database and are maintained outside of SQL Server and are not available to the DB, securing data from DBAs
      • Requires installing column master keys for SSMS to query and decrypt data – Windows Certificate Store, Azure Key Vault, or HSM
    • Requires application changes to support the encryption
    • Protects data in use from memory dumps and maintains encryption

How do we apply this for Securing Data?

Generally speaking, the first thing that we need to define is what specifically we are trying to protect against.  If we need to prevent someone from monitoring or intercepting network packets containing data in clear text, then we need to implement protocol encryption for connections.  If we are concerned about someone opening the database on another system or stealing a copy of the database or backups, using TDE might be a good solution, but only if we also maintain a defense in depth strategy that separates our backups of master and the certificates used by TDE from the database backup files.  If we don’t separate our backups, then TDE is very easy to hack around and is simply checking the box without actually securing our data.  If we want to protect the data at rest and from prying eyes while open, then we need to implement some form of column level encryption of the data, whether that is key based inside of SQL Server or using Always Encrypted.  The important thing is that there isn’t a single solution that is going to protect every situation.  You have to consider the risks and what a specific feature is designed to protect against, and then layer them together to meet the requirements.  Even then you might not be protected.

Take as an example the Microsoft Customer Support database data exposure that happened in December 2019, which Microsoft provided public information about after securing the data.  While we don’t know all the security measures that were in place specifically at the database level, none of the above features would have protected the data from being publicly viewable through an application level security misconfiguration.  As a part of any data security discussion, auditing and regular reviews of business practices, and of any changes to them, have to be implemented to ensure that the data remains secured.

Keyset Does Not Exist – Add-ClusterNode

While working on a video recording for Paul this week I ran into an interesting problem with one of my Windows Server 2016 clusters. While attempting to add a new node to the cluster I ran into an exception calling Add-ClusterNode:

The server ‘SQL2K16-AG03.SQLskillsDemos.com’ could not be added to the cluster.
An error occurred while adding node ‘SQL2K16-AG03.SQLskillsDemos.com’ to cluster ‘SQL2K16-WSFC’.

Keyset does not exist

The Windows account I was using was the domain administrator account, and I had just recently made modifications that involved the certificate store on this specific VM, so I decided to take a backup of the VMDK and then revert to a snapshot to try again, and this time it worked.  Needless to say, I was intrigued as to what I could have done that would cause this error to happen.  It turns out that while installing an SSL certificate for SQL Server to use, I had negatively impacted the permissions for the C:\ProgramData\Microsoft\Crypto\RSA folder and the keys that are protected inside of that folder, which happen to be the private keys for the certificates on the server.  The normal permissions for this folder can be seen in the screenshot from the working node in the cluster.

UPDATE: 1/24/2020 – A reader provided a link from Microsoft Support for the default permissions as well (https://support.microsoft.com/en-us/help/278381/)

 

On the broken copy of the VM, the Owner was not set, none of the other permissions were even close to matching, and inheritance had been propagated (oops!!!).  The really crazy thing is, I don’t even know how exactly I caused this to happen at this point, but none of the keys were accessible to anything on that node.  While I was able to go into the folder and manually set the Owner and all the permissions on each of the keys in the MachineKeys folder to allow the node to be joined to the cluster, I ultimately evicted the node and rebuilt it instead of relying on manually applied permissions to make it work.  Here is why:

The owner of the folders is SYSTEM, which is easy to go back and set, but to set the permissions on the Keys inside the Machine Keys folder, you have to Take Ownership of the key:

Even as an Administrator, you don’t have Read access, so when you click the Advanced button, it can’t show any information and offers to try Administrative Permissions instead.  That also fails when you click Continue to get the object’s permissions.  You can only take ownership to view the properties, which then means you don’t know who the correct owner actually should have been.

However, in a real production-down scenario where you made a mistake setting permissions and the cluster nodes can’t join the cluster, start the cluster service correctly, or take ownership of the resources, manually taking ownership and setting permissions on the keys can get you out of a pinch.  This is shown in the screenshot from another environment, where I intentionally destroyed the permissions in order to complete this blog post.