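MySQL appears unable to optimize a SELECT that joins against a GROUP BY subquery, and ends up with a long execution time. There must be a known optimization for such a common scenario.
Suppose we are trying to return all orders from the database, along with a flag indicating whether each order is the customer's first order.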
-- `order` is a reserved word in MySQL, so it is quoted with backticks throughout
CREATE TABLE orders (`order` INT, customer INT, `date` DATE);
Retrieving the first order for each customer is very fast.
SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer;
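The question does not show any indexes on orders, so the following is only an assumption about why this query can be fast. With a composite index of this shape, MySQL can usually read each customer's minimum straight from the index. The index name idx_customer_order is hypothetical.

```sql
-- Assumed index, not part of the original schema: customer first, then `order`,
-- so that MIN(`order`) per customer can be read from the index.
CREATE INDEX idx_customer_order ON orders (customer, `order`);

-- If the index is used this way, the Extra column of EXPLAIN
-- typically shows "Using index for group-by".
EXPLAIN
SELECT customer, MIN(`order`) AS first_order
  FROM orders
 GROUP BY customer;
```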
However, as soon as we join that result back to the full set of orders using a subquery, it becomes very slow:
SELECT `order`, first_order FROM orders LEFT JOIN (
    SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer
) AS first_orders ON orders.`order` = first_orders.first_order;
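One way to see where the time goes is to prefix the slow query with EXPLAIN. This is a diagnostic sketch only: the plan depends on the server version and on the indexes present, neither of which is stated here.

```sql
-- Show the execution plan for the slow join without running it.
EXPLAIN
SELECT `order`, first_order
  FROM orders
  LEFT JOIN (
        SELECT customer, MIN(`order`) AS first_order
          FROM orders
         GROUP BY customer
       ) AS first_orders
    ON orders.`order` = first_orders.first_order;
```

The subquery shows up in the plan as a row whose table name looks like `<derived2>`. If that row has type ALL and no key, the materialized subquery is being scanned without an index for every row of orders, which would match the slowdown described.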
I hope there is a simple trick we are missing, because the join runs about 1000 times faster when we build an indexed temporary table ourselves:
CREATE TEMPORARY TABLE tmp_first_order AS
    SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer;
CREATE INDEX tmp_boost ON tmp_first_order (first_order);
SELECT `order`, first_order FROM orders LEFT JOIN tmp_first_order
    ON orders.`order` = tmp_first_order.first_order;
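Edit:
Inspired by option 3 proposed by @ruakh, there is indeed a somewhat uglier workaround using INNER JOIN and UNION that has acceptable performance and does not require a temporary table. However, it is specific to our case, and I would like to know whether a more general optimization exists.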
SELECT `order`, 'YES' AS first FROM orders INNER JOIN (
    SELECT MIN(`order`) AS first_order FROM orders GROUP BY customer
) AS first_orders_1 ON orders.`order` = first_orders_1.first_order
UNION
SELECT `order`, 'NO' AS first FROM orders INNER JOIN (
    SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer
) AS first_orders_2 ON first_orders_2.customer = orders.customer
    AND orders.`order` > first_orders_2.first_order;
Answer:
Here are a few things you could try:
1. Remove customer from the subquery's field list, since it isn't doing anything there anyway:
SELECT `order`,
       first_order
  FROM orders
  LEFT
  JOIN ( SELECT MIN(`order`) AS first_order
           FROM orders
          GROUP
             BY customer
       ) AS first_orders
    ON orders.`order` = first_orders.first_order
;
2. Conversely, add customer to the ON clause, so that it actually does something for you:
SELECT `order`,
       first_order
  FROM orders
  LEFT
  JOIN ( SELECT customer,
                MIN(`order`) AS first_order
           FROM orders
          GROUP
             BY customer
       ) AS first_orders
    ON orders.customer = first_orders.customer
   AND orders.`order` = first_orders.first_order
;
3. Same as the previous one, but use an INNER JOIN instead of a LEFT JOIN, and move the original ON condition into a CASE expression:
SELECT `order`,
       CASE WHEN first_order = `order` THEN first_order END AS first_order
  FROM orders
 INNER
  JOIN ( SELECT customer,
                MIN(`order`) AS first_order
           FROM orders
          GROUP
             BY customer
       ) AS first_orders
    ON orders.customer = first_orders.customer
;
4. Replace the whole JOIN approach with an uncorrelated IN subquery in a CASE expression:
SELECT `order`,
       CASE WHEN `order` IN
                  ( SELECT MIN(`order`)
                      FROM orders
                     GROUP
                        BY customer
                  )
            THEN `order`
        END AS first_order
  FROM orders
;
5. Replace the whole JOIN approach with a correlated EXISTS subquery in a CASE expression:
SELECT `order`,
       CASE WHEN NOT EXISTS
                  ( SELECT 1
                      FROM orders AS o2
                     WHERE o2.customer = o1.customer
                       AND o2.`order` < o1.`order`
                  )
            THEN `order`
        END AS first_order
  FROM orders AS o1
;
(It's quite possible that some of the above will actually perform worse, but I think they're all worth trying.)
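One more variant in the same spirit, assuming MySQL 8.0 or later (the question does not state a version, and older servers reject this syntax): a window function can compute each customer's minimum alongside every row, with no join or subquery. This is a sketch of a different technique from the ones listed above, and whether it is faster would need to be measured.

```sql
-- Requires window function support (MySQL 8.0+, assumed here).
-- MIN(...) OVER (PARTITION BY customer) yields the customer's first order on every row;
-- the CASE keeps it only on the row that is itself the first order, NULL otherwise.
SELECT `order`,
       CASE WHEN `order` = MIN(`order`) OVER (PARTITION BY customer)
            THEN `order`
        END AS first_order
  FROM orders
;
```

The output has the same shape as the options above: first_order is NULL for every order that is not the customer's first.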