I’ve just learned a really weird and unobvious thing (actually had to e-mail my question to SQL experts, suspecting that there is a bug in SQL Server), which will make me very cautious about using "WHERE … NOT IN".
Long story short, if you run:
SELECT i
FROM (SELECT 1 AS i UNION SELECT 2 UNION SELECT 3) AS X
WHERE i NOT IN (3, 4, 5)
You’ll get: 1, 2
But if you run:
SELECT i
FROM (SELECT 1 AS i UNION SELECT 2 UNION SELECT 3) AS X
WHERE i NOT IN (3, 4, 5, NULL)
You’ll get empty set.
The reason it happens is because
WHERE i NOT IN (3, 4, 5, NULL)
is being evaluated as
WHERE i <> 3 AND i <> 4 AND i <> 5 AND i <> NULL
Now since i <> NULL is false, the whole expression becomes false (even if i was NULL, NULL <> NULL would still evaluate to NULL, because you have to compare for nulls with IS NULL operator.
This difference (whether you have NULL or not in subquery) can be subtle and hard to notice in case if you query over a table.
The error-prone solution would be to use LEFT OUTER JOIN ON T and T.key IS NULL condition:
SELECT X.i
FROM (SELECT 1 AS i UNION SELECT 2 UNION SELECT 3) AS X
LEFT OUTER JOIN (SELECT 3 AS i UNION SELECT 4
UNION SELECT 5 UNION SELECT NULL) AS Y
ON X.i = Y.i
WHERE Y.i IS NULL
This will return: 1, 2