mysql - Why does a query using IN (subquery) take longer than a query using IN (discrete list)?

This has always puzzled me: why is this query

SELECT 
  * 
FROM
  `TABLE` 
WHERE `value` IN 
  (SELECT 
    val 
  FROM
    OTHER_TABLE 
  WHERE `date` < '2014-01-01')

orders of magnitude slower than running these two queries one after the other:

SELECT 
  `val` 
FROM
  OTHER_TABLE 
WHERE `date` < '2014-01-01' 

Result:
+-----+
| val |
+-----+
| v1  |
| v2  |
| v3  |
| v7  |
| v12 |
+-----+

and this query:

SELECT 
  * 
FROM
  `TABLE` 
WHERE `value` IN ('v1', 'v2', 'v3', 'v7', 'v12')

Answer:

From the documentation (emphasis mine):

Subquery optimization for IN is not as effective as for the = operator
or for the IN(value_list) operator.

A typical case for poor IN subquery performance is when the subquery
returns a small number of rows but the outer query returns a large
number of rows to be compared to the subquery result.

The problem is that, for a statement that uses an IN subquery, the
optimizer rewrites it as a correlated subquery. Consider the following
statement that uses an uncorrelated subquery:

SELECT ... FROM t1 WHERE t1.a IN (SELECT b FROM t2);

The optimizer rewrites the statement to a correlated subquery:

SELECT ... FROM t1 WHERE EXISTS (SELECT 1 FROM t2 WHERE t2.b = t1.a);

If the inner and outer queries return M and N rows, respectively, the
execution time becomes on the order of O(M×N), rather than O(M+N) as
it would be for an uncorrelated subquery.

An implication is that an IN subquery can be much slower than a query
written using an IN(value_list) operator that lists the same values
that the subquery would return.

http://dev.mysql.com/doc/refman/5.7/en/subquery-restrictions.html
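
If the behaviour described above applies to your MySQL version, one way to confirm it and work around it is sketched below. This is only a sketch under assumptions: it reuses the table and column names from the question, the alias names t and o are made up for illustration, and newer MySQL releases may optimize IN subqueries differently, so check the EXPLAIN output on your own server first.

```sql
-- Step 1: inspect the plan of the original query.
-- A select_type of DEPENDENT SUBQUERY on the inner SELECT suggests
-- the optimizer applied the correlated rewrite quoted above.
EXPLAIN
SELECT *
FROM `TABLE`
WHERE `value` IN
  (SELECT val
   FROM OTHER_TABLE
   WHERE `date` < '2014-01-01');

-- Step 2: a possible rewrite that evaluates the inner query once,
-- as a derived table, and joins the outer table against it.
-- DISTINCT prevents outer rows from being duplicated when the same
-- val appears more than once in OTHER_TABLE.
SELECT t.*
FROM `TABLE` AS t
JOIN (
  SELECT DISTINCT val
  FROM OTHER_TABLE
  WHERE `date` < '2014-01-01'
) AS o
  ON o.val = t.`value`;
```

In a WHERE clause, the join form should return the same rows as the IN (subquery) form, since neither matches rows where `value` is NULL. Whether it is actually faster depends on the available indexes and the server version, so compare both plans with EXPLAIN before switching.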

Hope this helps anyone else who was curious about this.
