Changing Database Collation and dealing with TempDB Objects

OK, so… I don't know how many of you use different collations but if you do then you know that there are two truths:

1) They're very flexible
2) They can cause you a bit of grief (changing collations and tempdb)

Flexibility

As of SQL Server 2000 (or heck, maybe it was 7.0?), database collations could be changed at installation OR set/changed later. You can set the collation when a database is created (if not set, the database will use the server's default). You can set the collation when a table is created (if not set, the table will use the database's default). You can set the collation when a query is executed (which doesn't really make sense unless it's in a WHERE clause or ORDER by clause). And – you can set the collation in a view or stored procedure to do things like case sensitive searching – on the fly. However, neither of these will perform well over large results sets (at least not without indexes) so, I'd be careful of doing any adHoc changes to collations (even in views/sps – without appropriate indexes)!

Anyway, the key point is that they're very flexible. In many international databases/localized databases, column collation differs by table (in order to do efficient sorting, etc.) and different language data may be separated (either with a column that described which language/country code is used OR in different tables).

Grief in Changing Database Collations

Actually, changing database collation is *very* simple. Literally, it only takes an ALTER DATABASE to do. For example, the following code runns flawlessly:

USE master
go

DROP DATABASE TestCollation
go

CREATE DATABASE TestCollation
COLLATE SQL_Latin1_General_CP1_CI_AS
go

sp_helpdb TestCollation
go

ALTER DATABASE TestCollation
COLLATE Latin1_General_CS_AS_KS_WS
go

sp_helpdb TestCollation
go

BUT… if you go from case sensitive to case insensitive… be careful! It is important to realize that ALL of your tables AND data will need to be checked against the new collation. In fact, changing database collation will not be allowed if the objects/data would no longer adhere to your unique constraints, etc. Check out this more complete script (ChangingDatabaseCollation.sql (2.85 KB)), if you want to see what happens.

Grief with temporary objects

So.. the other area (and this seems to be the one where everyone has trouble), is with temporary objects. If you create a temp table and your database has a different collation other than TempDB (which has the same collation as the system – based on installation), then comparisons/lookups/joins – may have problems. A simple trick to get around this is to use database_default. Check out this sample and you'll see how it works:

CREATE DATABASE Test
COLLATE Icelandic_BIN
go

USE Test
go

CREATE TABLE #test1
(
   col1 varchar(12)
)
go

CREATE TABLE #test2
(
   col1 varchar(12) COLLATE database_default
)
go

USE Tempdb
go

CREATE TABLE #test3
(
   col1 varchar(12) COLLATE database_default
)
go

sp_help 'tempdb..#test1' — Will use TempDB's collation
exec sp_help 'tempdb..#test2' — Will use Test's collation (Icelandic BIN)
exec sp_help 'tempdb..#test3' — Will use TempDB's collation
go

So simple, so obvious… and, well – I just found out about that one?! I used to recommend that you explcitly set the collation for every column. Now, that still works – but, it doesn't offer you any flexbility. So, you could get around that with dynamic string execution but that can also get very complicated, very quickly. So… database_default is a VERY simple and clean way of doing this.

Have fun,
kt

7 thoughts on “Changing Database Collation and dealing with TempDB Objects

  1. Hey there Marc – Code-review checklist… now that’s AWESOME! Would you be willing to share your checklist (send me mail privately)? Maybe we can post it and see what other folks have on theirs… maybe even start to create a "community" [SQL] code-checklist?

    Hmmm… I like that idea!

    Thanks,
    kt

  2. Followup question on this…what about spills to tempdb? It makes sense that if you have a database with a non-default collation, if you create a temp object it will be by default in the server instance collation (and you show a good way to get around that). But what I want to verify is, if you have a non-tempdb related query that has an operator that spills to tempdb because of size or memory grant issues…will the tempdb operation (say a distinct sort) use the tempdb collation, or the database collation?

    This could produce really weird results, as you could have a case sensitive database on a case-insensitive server… say, SELECT DISTINCT COUNT(*) FROM table, with table having data like this: a, A, b, B, c, C etc. While it was in memory, it uses database collation and will have the full number, but if it spills to tempdb AND it doesn’t carry the database collation down there automatically…well, then we have inconsistent results!!

    I’m still trying to prove this out in the meantime, but can’t seem to force a spill to tempdb yet, drat my luck.

    Cheers!

    1. Hey there Nic – I doubt this is a problem as the object’s structure will have already been defined by the table that’s spilling. But, I have to admit – I haven’t tried this. So, it’s something to test for sure if you’ve had problems but at the same time, I haven’t heard of any so I’m not worried. Have you seen something that’s made you think otherwise? Or, just wondering?

      Cheers,
      k

Leave a Reply

Your email address will not be published. Required fields are marked *

Other articles

SQLskills SQL101: Partitioning

Continuing on our path to understanding the basics and core concepts, there’s a big topic that’s often greatly misunderstood and that’s partitioning. I know I’m

Explore

Imagine feeling confident enough to handle whatever your database throws at you.

With training and consulting from SQLskills, you’ll be able to solve big problems, elevate your team’s capacity, and take control of your data career.