题目
按照dept_no进行汇总,属于同一个部门的emp_no按照逗号进行连接,结果给出dept_no以及连接出的结果employees
CREATE TABLE dept_emp
(emp_no
int(11) NOT NULL,dept_no
char(4) NOT NULL,from_date
date NOT NULL,to_date
date NOT NULL,
PRIMARY KEY (emp_no
,dept_no
));
输出格式:
数据
drop table if exists `dept_emp` ;
CREATE TABLE `dept_emp` (
`emp_no` int(11) NOT NULL,
`dept_no` char(4) NOT NULL,
`from_date` date NOT NULL,
`to_date` date NOT NULL,
PRIMARY KEY (`emp_no`,`dept_no`));
INSERT INTO dept_emp VALUES(10001,'d001','1986-06-26','9999-01-01');
INSERT INTO dept_emp VALUES(10002,'d001','1996-08-03','9999-01-01');
INSERT INTO dept_emp VALUES(10003,'d004','1995-12-03','9999-01-01');
INSERT INTO dept_emp VALUES(10004,'d004','1986-12-01','9999-01-01');
INSERT INTO dept_emp VALUES(10005,'d003','1989-09-12','9999-01-01');
INSERT INTO dept_emp VALUES(10006,'d002','1990-08-05','9999-01-01');
INSERT INTO dept_emp VALUES(10007,'d005','1989-02-10','9999-01-01');
INSERT INTO dept_emp VALUES(10008,'d005','1998-03-11','2000-07-31');
INSERT INTO dept_emp VALUES(10009,'d006','1985-02-18','9999-01-01');
INSERT INTO dept_emp VALUES(10010,'d005','1996-11-24','2000-06-26');
INSERT INTO dept_emp VALUES(10010,'d006','2000-06-26','9999-01-01');
输出:
d001|10001,10002
d002|10006
d003|10005
d004|10003,10004
d005|10007,10008,10010
d006|10009,10010
实现
mysql
select dept_no,group_concat(emp_no)
from dept_emp
group by dept_no;
结果:
函数语法:
group_concat( [DISTINCT] 要连接的字段 [Order BY 排序字段 ASC/DESC] [Separator ‘分隔符’] ),默认是逗号。
hive
SELECT dept_no,concat_ws(',', collect_set(emp_no))
from dept_emp
group by dept_no;
结果:
注意:在使用collect_set函数时,hive中的emp_no字段类型不能int,可以是string,不然的话会报如下错误:
提问:为什么最后一行有所差别呢?