In this fourth and final part in my series about NULL, I’ll discuss some well-known and some less well-known functions and keywords that are specifically created to deal with NULL values. And I will, of course, explain why null if null is null null null is null. In case you have not yet read them, you can click these links to read the first. second. and third part.
IS NULL is not = NULL
I have already explained why tests for equality with NULL will always return Unknown instead of True or False. This holds true in all circumstances – even when both sides of the equation are NULL. That’s why the query below will not help you find the people for which no birthday is on file:
SELECT FirstName. LastName
FROM dbo. Persons
WHERE Birthday = NULL;
Even though no birthday is on file for Hugo Kornelis, my name will not be returned – because when evaluating “my” row, SQL Server will see a comparison between a missing value (my birthday) and a missing value (the NULL) – and when asked whether two unspecified values are equal, the only guaranteed correct answer is “I ain’t the faintest, dude” (for the British readers, that would be “I haven’t got the foggiest idea, old chap”).
Since there are numerous occasions where you want to find rows where values are missing, a special operator was introduced for this: IS NULL. Rewriting the query above as follows will return my name, because my Birthday is indeed NULL in this database.
SELECT FirstName. LastName
FROM dbo. Persons
WHERE Birthday IS NULL;
Note that the operator IS NULL will return True if the operand (Birthday in this case) is NULL, or False if it is any other value – an IS NULL test can therefore never result in Unknown. So to find values that are not missing, the expression can simply be negated to
WHERE NOT( Birthday IS NULL);
but there is also a shorthand form available that is much more common – in fact so much more that I don’t think I’ve ever seen the above version used!
WHERE Birthday IS NOT NULL;
Do not forget that the NULL keyword is an integral part of the IS [NOT] NULL operator. You can’t use IS or IS NOT as a replacement for = and <> and expect the same special treatment of NULL values that IS [NOT] NULL provides. That would require the use of the IS [NOT] DISTINCT FROM operator that is specified in SQL:1999 but not yet implemented in SQL Server ( vote here if you’d like to see this changed!)
Some of you may have seen existing code where a predicate such as Column = NULL is used, and does actually return rows. This is probably legacy code that has not been maintained in a long time (and if it isn’t, it has to be written by a legacy developer who has not maintained his knowledge in a long time).
Very early versions of SQL Server were released before agreement was reached in
the ANSI committee on all features. As a result, some features were implemented in a different way than what the ANSI committee turned out to describe, facing Microsoft developers with the challenge to upgrade to ANSI compliant behaviour without breaking existing code.
Behaviour of NULL comparisons in predicates is one such example where the original version of SQL Server “got it wrong”. So when SQL Server was changed to return Unknown on a comparison with NULL, the SET ANSI_NULL OFF option was introduced to force the “old” behaviour that would return False on NULL vs. non-NULL comparison, and True on NULL vs. NULL comparison.
The SET ANSI_NULL option has been marked as deprecated in SQL Server 2005, so if you find code that still relies on this setting, you’d better change it ASAP. This would of course be very easy if you only had to look for “= NULL” and “<> NULL”, and change them to “IS NULL” and “IS NOT NULL” – but unfortunately, expressions of the form “= expression ” where expression can be NULL are also affected by the ANSI_NULL setting; these are much harder to identify or fix!
ISNULL is not COALESCE
Despite the name similarity, and despite the fact that in Access, ISNULL(xxx) is equivalent to xxx IS NULL, the T-SQL ISNULL function is completely different from the IS NULL predicate. ISNULL is used to return the first non-NULL from its two inputs. However, ISNULL is also a leftover from the days before the ANSI standard was finalised – the ANSI standard function that should be used to replace ISNULL is called COALESCE. Despite the similarities, there are also a couple of differences that you should be aware of before replacing all your occurrences of ISNULL to COALESCE.
One reason to prefer COALESCE of ISNULL, in addition to its adherence to standards, is that COALESCE can take any number of arguments whereas ISNULL only takes two. Both will return the first non-NULL argument, or NULL if all arguments are NULL. The unlimited number of arguments makes COALESCE a far better option when tasked to find the first non-NULL value from more than two inputs – for example, to find the first non-NULL of Arg1, Arg2, Arg3, and Arg4, I’d rather use
COALESCE ( Arg1. Arg2. Arg3. Arg4 )
ISNULL ( Arg1. ISNULL ( Arg2. ISNULL ( Arg3. Arg4 )))
Another major problem with ISNULL is how it deals with implicit conversions. COALESCE will, like any other T-SQL operation, use the rules of data type precedence to find the data type of its result – but ISNULL will always return a value of the data type of its first argument, making this the only T-SQL keyword that does not respect the precedence rules.
Unfortunately, there is one situation where you can not simply rip out ISNULL and replace it with COALESCE – and that is in the case of computed columns. If you attempt to create the tables below, you will see that Test_ISNULL is created without problems, whereas the attempt to add the index to Test_COALESCE fails.
CREATE TABLE Test_ISNULL