MySQL subquery with GROUP BY in a LEFT JOIN: optimization

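MySQL seems unable to optimize a SELECT that joins against a GROUP BY subquery, and ends up with a long execution time. There must be a known optimization for such a common scenario.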

Let's assume we're trying to return all orders from the database, with a flag indicating whether each order is the customer's first order.

CREATE TABLE orders (order int, customer int, date date);
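A side note on the schema: ORDER is a reserved word in MySQL, so the statement above only runs if that column name is quoted with backticks. Here is a minimal runnable sketch of the same table, keeping the original names. The sample rows are made up for illustration and are not from the original question.

```sql
-- `order` must be backtick-quoted because ORDER is a reserved word in MySQL.
-- `date` is accepted unquoted as a column name; quoting it is optional.
CREATE TABLE orders (
  `order`  INT,
  customer INT,
  `date`   DATE
);

-- Hypothetical sample rows, for illustration only.
INSERT INTO orders (`order`, customer, `date`) VALUES
  (1, 10, '2012-01-01'),
  (2, 20, '2012-01-02'),
  (3, 10, '2012-01-03');
```

The queries in the rest of this post need the same quoting wherever order is used as a column name.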

Retrieving the first order per customer is very fast:

SELECT customer, min(order) as first_order FROM orders GROUP BY customer;

However, it becomes very slow once we join that result back to the full set of orders using a subquery:

SELECT order, first_order FROM orders LEFT JOIN ( 
  SELECT customer, min(order) as first_order FROM orders GROUP BY customer
) AS first_orders ON orders.order=first_orders.first_order;
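One way to see where the time goes is to ask MySQL for the execution plan. The sketch below is only a diagnostic suggestion and not part of the original question. It assumes the order column is backtick-quoted as a reserved word. The subquery normally shows up as a DERIVED row in the select_type column, and whether the join against it can use an index depends on the MySQL version (as far as I know, versions before 5.6 materialize derived tables without any index).

```sql
-- Show the plan for the slow query instead of running it.
EXPLAIN
SELECT orders.`order`, first_orders.first_order
  FROM orders
  LEFT JOIN (
    SELECT customer, MIN(`order`) AS first_order
      FROM orders
     GROUP BY customer
  ) AS first_orders
    ON orders.`order` = first_orders.first_order;
```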

I hope there is a simple trick we are missing, because doing the following instead runs about 1000 times faster:

CREATE TEMPORARY TABLE tmp_first_order AS 
  SELECT customer, min(order) as first_order FROM orders GROUP BY customer;
CREATE INDEX tmp_boost ON tmp_first_order (first_order);

SELECT order, first_order FROM orders LEFT JOIN tmp_first_order 
  ON orders.order=tmp_first_order.first_order;

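EDIT:
Inspired by option 3 proposed by @ruakh, there is indeed an uglier workaround using INNER JOIN and UNION that has acceptable performance and doesn't require a temporary table. However, it is specific to our case, and I would like to know whether a more generic optimization exists.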

SELECT order, "YES" as first FROM orders INNER JOIN ( 
    SELECT min(order) as first_order FROM orders GROUP BY customer
  ) AS first_orders_1 ON orders.order=first_orders_1.first_order
UNION
SELECT order, "NO" as first FROM orders INNER JOIN ( 
    SELECT customer, min(order) as first_order FROM orders GROUP BY customer
  ) AS first_orders_2 ON first_orders_2.customer = orders.customer 
    AND orders.order > first_orders_2.first_order;
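A possible refinement of this workaround: the two branches can never produce the same row (one is labelled YES, the other NO), so the duplicate elimination that UNION performs is probably unnecessary work. Below is a sketch of the same query with UNION ALL. It assumes order values are unique within orders, and it is a suggested variation that has not been measured against the original data.

```sql
-- First branch: rows whose order equals some customer's minimum order.
SELECT orders.`order`, 'YES' AS `first`
  FROM orders
 INNER JOIN (
         SELECT MIN(`order`) AS first_order FROM orders GROUP BY customer
       ) AS first_orders_1
    ON orders.`order` = first_orders_1.first_order
UNION ALL  -- no de-duplication pass, unlike plain UNION
-- Second branch: rows that come after the same customer's first order.
SELECT orders.`order`, 'NO' AS `first`
  FROM orders
 INNER JOIN (
         SELECT customer, MIN(`order`) AS first_order FROM orders GROUP BY customer
       ) AS first_orders_2
    ON first_orders_2.customer = orders.customer
   AND orders.`order` > first_orders_2.first_order;
```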

Answer:

Here are a few things you could try:

> Remove customer from the subquery's field list, since it isn't doing anything there anyway:

SELECT order,
       first_order
  FROM orders
  LEFT
  JOIN ( SELECT MIN(order) AS first_order
           FROM orders
          GROUP
             BY customer
       ) AS first_orders
    ON orders.order = first_orders.first_order
;

> Conversely, add customer to the ON clause, so that it actually does something for you:

SELECT order,
       first_order
  FROM orders
  LEFT
  JOIN ( SELECT customer,
                MIN(order) AS first_order
           FROM orders
          GROUP
             BY customer
       ) AS first_orders
    ON orders.customer = first_orders.customer
   AND orders.order = first_orders.first_order
;

> Same as the previous one, but use an INNER JOIN instead of a LEFT JOIN, and move the original ON condition into a CASE expression:

SELECT order,
       CASE WHEN first_order = order THEN first_order END AS first_order
  FROM orders
 INNER
  JOIN ( SELECT customer,
                MIN(order) AS first_order
           FROM orders
          GROUP
             BY customer
       ) AS first_orders
    ON orders.customer = first_orders.customer
;

> Replace the whole JOIN approach with an uncorrelated IN subquery inside a CASE expression:

SELECT order,
       CASE WHEN order IN
                  ( SELECT MIN(order)
                      FROM orders
                     GROUP
                        BY customer
                  )
            THEN order
        END AS first_order
  FROM orders
;

> Replace the whole JOIN approach with a correlated EXISTS subquery inside a CASE expression:

SELECT order,
       CASE WHEN NOT EXISTS
                  ( SELECT 1
                      FROM orders AS o2
                     WHERE o2.customer = o1.customer
                       AND o2.order < o1.order
                  )
            THEN order
        END AS first_order
  FROM orders AS o1
;

(It's quite likely that some of the above will actually perform worse, but I think they're all worth trying.)
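One more thing that may matter for the variants above, especially the correlated EXISTS one, which probes orders once per outer row: a composite index covering both columns. This is a sketch under the assumption that no such index exists yet. The index name idx_customer_order is made up, and the actual benefit would need to be checked with EXPLAIN on real data.

```sql
-- Hypothetical index name; (customer, `order`) lets MySQL look up
-- "earlier orders of the same customer" without scanning the table.
CREATE INDEX idx_customer_order ON orders (customer, `order`);
```

The same index may also let MySQL answer the MIN(order) ... GROUP BY customer subquery from the index alone.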
