MySQL查询分为内连接查询和外连接查询,他们的区别在于:内连接查询的两个表示对等关系,根据条件进行匹配;外连接是以某一个表为主,两一个表根据条件进行关联。外连接分为左外连接、右外连接和全外连接。本文重点介绍各外连接的思想,以及如何实现全外连接,并举例。
左外连接
左外连接以左边表为基础,根据条件,将右边表附属到左边表,语法:SELECT * FROM A LEFT JOIN B ON condition。几何图形关系如下图,即查询结果集除了A表所有数据外,还包含满足条件的B表数据:
右外连接
右外连接以右边表为基础,根据条件,将左边表附属到右边表,语法:SELECT * FROM A RIGHT JOIN B ON condition。几何图形关系如下图,即查询结果集除了B表所有数据外,还包含满足条件的A表数据:
全外连接
全外连接是除了能够根据条件匹配得到的数据,还包含左右两表中都不匹配的数据(默认应为null),应用全外连接的情况一般都有一个联系左右两表的主线。几何关系如下图所示,对应A和B的并集(去重):
但不幸的是MySQL不支持全外连接,那在需要全外连接查询的情况下,如何实现呢?最常见的是左连接与右连接合并。
实例
项目中存在这样的场景:某项任务task具有2种不同的状态todo和done,分别存储在todolist和donelist表中,任务存储在task表中,现在需要统计每个task的已处理和未处理情况。首先先到了全外连接,那么如何实现呢?
举例实现表结构如下:
实现四种方法:
1、左连接,右连接,合并;(需保持两个结果集结构一致)
- 首先是左连接:
SELECT
A.id AS Aid,
B.id AS Bid,
A.taskid tid
FROM
(
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A
LEFT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON A.taskid = B.taskid
查询结果:
- 其次是右连接(注意由于需要合并,故左右连接的结果集结构需一致):
SELECT
A.id AS Aid,
B.id AS Bid,
A.taskid tid
FROM
(
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A
RIGHT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON A.taskid = B.taskid
查询结果:
- 最后进行合并,并与task表进行内连接:
SELECT
SUM(IF(Aid IS NOT NULL, 1, 0)) todo,
SUM(IF(Bid IS NOT NULL, 1, 0)) done,
task.name
FROM
(
SELECT
A.id AS Aid,
B.id AS Bid,
A.taskid tid
FROM
(
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A
LEFT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON A.taskid = B.taskid
UNION
SELECT
A.id AS Aid,
B.id AS Bid,
B.taskid tid
FROM
(
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A
RIGHT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON A.taskid = B.taskid
) AS AB
INNER JOIN task ON task.id = AB.tid
GROUP BY
task.name
运行结果如下表,实现全外连接:
2、A+B左连接,B-A去除左连接到A的记录,然后合并两个结果集;(需保持两个结果集结构一致)
这是另一种实现全外连接的方式,即先查询A B的左连接,然后查询B中去除左连接到A的记录,最后合并(A代表todolist,B代表donelist):
- A+B左连接
SELECT
1 AS todo,
CASE
WHEN B.id IS NOT NULL THEN
1
ELSE
0
END AS done,
A.taskid tid
FROM
(
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A
LEFT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON A.taskid = B.taskid
查询结果:
- B-A去除左连接到A的记录
SELECT
0 AS todo,
1 AS done,
donelist.taskid tid
FROM
donelist
WHERE
donelist.user = '张三'
AND NOT EXISTS (
SELECT
*
FROM
todolist
WHERE
todolist.taskid = donelist.taskid
AND donelist.user = '张三'
AND odolist.user = donelist.user
)
查询结果:
- 合并
SELECT
SUM(AB.todo) todo,
SUM(AB.done) done,
task.name
FROM
(
SELECT
1 AS todo,
CASE
WHEN B.id IS NOT NULL THEN
1
ELSE
0
END AS done,
A.taskid tid
FROM
(
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A
LEFT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON A.taskid = B.taskid
UNION
SELECT
0 AS todo,
1 AS done,
donelist.taskid tid
FROM
donelist
WHERE
donelist.user = '张三'
AND NOT EXISTS (
SELECT
*
FROM
todolist
WHERE
todolist.taskid = donelist.taskid
AND donelist.user = '张三'
AND odolist.user = donelist.user
)
) AB
INNER JOIN task ON task.id = AB.tid
GROUP BY
task.name
结果同上
3、以task表为根本,将A和B表左连接,实现查询;
该方法的思想是,不管A和B表有什么关系,他们都跟作为主线的表task相关,只需要将A和B表与task表进行左连接,得到连接后的数据集,即为最后需要查询的结果集。SQL代码如下:
SELECT
SUM(AB.todo) AS todo,
SUM(AB.done) AS done,
task.name
FROM
(
SELECT
task.name,
CASE
WHEN A.id IS NULL THEN
0
ELSE
1
END AS todo,
CASE
WHEN B.id IS NULL THEN
0
ELSE
1
END AS done
FROM
task
LEFT JOIN (
SELECT
*
FROM
todolist
WHERE
todolist.user = '张三'
) A ON A.taskid = task.id
LEFT JOIN (
SELECT
*
FROM
donelist
WHERE
donelist.user = '张三'
) B ON B.taskid = task.id
WHERE
A.id IS NOT NULL
OR B.id IS NOT NULL
) AB
GROUP BY
task.name
查询结果同上,但这种方法存在一定的缺陷,即当主线表(task表)特别大的时候,性能会比较差。
4、A表查a状态,B表查b状态,然后合并;(需保持两个结果集结构一致)
该方法是不管A和B表的关系,现根据条件查询,然后在合并。SQL语句如下:
SELECT
SUM(A.todo) todo,
SUM(A.done) done,
task.name
FROM
(
SELECT
1 todo,
0 done,
todolist.taskid tid
FROM
todolist
WHERE
todolist.user = '张三'
UNION ALL
SELECT
0 todo,
1 done,
donelist.taskid tid
FROM
donelist
WHERE
donelist.user = '张三'
) A
INNER JOIN task ON task.id = A.tid
GROUP BY
task.name
查询结果同上。
四种方式只是实现功能,具体性能没有考虑,待后续学习。希望看客们多提意见,共同学习。
总结:
- 理解左连接、右连接和全外连接的思想
- 四种实现全外连接效果的方法
- 语法上注意:UNION可以去重,UNION ALL不去重