Wednesday, 29 February 2012

ALTER COLLATION - deceptive simplicity

A database collation is something developers hardly ever think about. After all, most of the time, it is hidden deep below the surface and concerns itself with such trifling matters as character data types encoding and sorting.

As a developer you hardly ever have to concern yourself with it but it can occasionally bite with a 'Cannot resolve the collation conflict between X and Y in the equal to operation.' error. It may happen in comparison, look-ups or joins. That is because when two columns have different collations they cannot be compared directly and there is no way to do any sort of implicit conversion. Such problems can be easily solved by forcing the collation with COLLATE.

More surprising, perhaps, are problems with temporary objects as they use collation of the tempdb and not the database you are using. Solution is the same but here usually more cumbersome. For that reason administrators' rule of thumb is to keep all databases with the same collation as tempdb as long as it is possible, just to keep collations in their oblivion.

Recently wearing my administrator hat I faced a task of changing collations in dozens of databases on one server to match them with tempdb collation. Since we have
ALTER DATABASE databasename ALTER COLLATION new_collation
it looked like task simple enough but the problem is that (by design) the query above will change only the default collation of the database leaving all columns in their current collation.

I google around a bit but couldn't find anything ready made to convert all the columns' collation as well as that of the database and was about to write a script that would do just that. I though it cannot be that hard. All you need to do is find out all the columns in all the tables that store character data and then get all the properties of such columns to construct ALTER COLUMN statements. I really was about to do that but then I googled some more and found a very interesting post by Igor Kovalenko written over 5 years ago.

To that post Igor Kovalenko has attached his own change_collation.sql script which is merely 3391 lines long! Reading that script was a real eye opener. Looking at a database from that perspective it looks like collations are lurking everywhere and there are so many objects other than simple columns to consider: views, functions, procedures, columns used by indexes just to name a few. More, since the script was published many unsupported special cases have been reported in the comments.

Still, in my fairly simple case a much simpler and therefor shorter script would do just fine but I'm not going to code it now, am I? ;)

1 comment: