在编写SQL时,会建议将选择性高(过滤数据多)的条件放到WHERE条件的前面,这是为了让查询优化器优先考虑这些条件,减少生成最优(或相对最优)的执行计划的时间,但最终的执行计划生成过滤顺序还是决定这些条件的选择性与判断bool值的容易程度
测试代码:
GO SELECT * INTO #T1 FROM sys.all_columns GO SELECT * INTO #T2 FROM sys.all_columns GO SELECT * INTO #T3 FROM sys.all_columns GO SET STATISTICS IO ON SET STATISTICS TIME ON GO SELECT * FROM #T1 AS T1 WHERE T1.[object_id]=3 AND (SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1 SELECT * FROM #T1 AS T1 WHERE (SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1 AND T1.[object_id]=3
执行计划:
可以从查询计划看出,无论T1.[object_id]=3在何处,其计算bool值相对简单,而(SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1 需要访问其他表,因此执行优化器优先执行T1.[object_id]=3,在满足T1.[object_id]=3为ture时再坚持行是否满足(SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1。
但对于以下查询:
SELECT * FROM #T1 AS T1 WHERE (SELECT COUNT(1) FROM #T3 AS T3 WHERE T3.[object_id]>T1.[object_id])<1 AND (SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1 SELECT * FROM #T1 AS T1 WHERE (SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1 AND (SELECT COUNT(1) FROM #T3 AS T3 WHERE T3.[object_id]>T1.[object_id])<1
执行计划:
执行统计:
(25 row(s) affected) Table ‘Worktable‘. Scan count 29, logical reads 36813, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘#T3_________________________________________________________________________________________________________________00000000000C‘. Scan count 4, logical reads 80, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘#T1_________________________________________________________________________________________________________________000000000009‘. Scan count 5, logical reads 59, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘#T2_________________________________________________________________________________________________________________00000000000A‘. Scan count 1, logical reads 59, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. (1 row(s) affected) SQL Server Execution Times: CPU time = 111 ms, elapsed time = 331 ms. (25 row(s) affected) Table ‘#T1_________________________________________________________________________________________________________________000000000009‘. Scan count 5, logical reads 59, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘Worktable‘. Scan count 10731, logical reads 87653, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘#T3_________________________________________________________________________________________________________________00000000000C‘. Scan count 4, logical reads 236, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. Table ‘#T2_________________________________________________________________________________________________________________00000000000A‘. Scan count 1, logical reads 59, physical reads 0, read-ahead reads 0, lob logical reads 0, lob physical reads 0, lob read-ahead reads 0. (1 row(s) affected) SQL Server Execution Times: CPU time = 219 ms, elapsed time = 281 ms.
条件 (SELECT COUNT(1) FROM #T2 AS T3 WHERE T3.[object_id]>T1.[object_id])<1 能过滤掉大部分数据,而(SELECT COUNT(1) FROM #T3 AS T2 WHERE T2.column_id>T1.column_id)>1不能过滤任何数据,因此如果优先执行(SELECT COUNT(1) FROM #T3 AS T2 WHERE T2.column_id>T1.column_id)>1,则会大大减少判断(SELECT COUNT(1) FROM #T2 AS T2 WHERE T2.column_id>T1.column_id)>1的次数,从而提高查询速度,但SQL Server无法推断出该结论,因此只能顺序判断WHERE 条件。
总结:虽然在很多情况下SQL Sever引擎能帮助我们判定 WHERE 条件后的执行顺序,但我们仍应该将选择性高(过滤数据多)的条件放置在 WHERE 语句中的前面,尤其对于复杂的SQL 语句,应仔细分析测试。
你比SQL SERVER 更了解你的数据!!!