像Excel一样使用SQL进行数据分析(上)

SQL不仅可以从数据库中读取数据,还能通过不同的SQL函数语句直接返回所需要的结果,从而大大提高了自己在客户端应用程序中计算的效率。


1  重复数据处理

查找重复记录
SELECT * FROM user 
Where (nick_name,password) in
(
SELECT nick_name,password 
FROM user 
group by nick_name,password 
having count(nick_name)>1
);


查找去重记录

查找id最大的记录

SELECT * FROM user 
WHERE id in
(SELECT max(id) FROM user
group by nick_name,password 
having count(nick_name)>1
);


删除重复记录

只保留id值最小的记录

DELETE  c1
FROM  customer c1,customer c2
WHERE c1.cust_email=c2.cust_email
AND c1.id>c2.id;
DELETE FROM user Where (nick_name,password) in
(SELECT nick_name,password FROM
    (SELECT nick_name,password FROM user 
    group by nick_name,password 
    having count(nick_name)>1) as tmp1
)
and id not in
(SELECT id FROM
    (SELECT min(id) id FROM user 
     group by nick_name,password 
     having count(nick_name)>1) as tmp2
);



2  缺失值处理

查找缺失值记录
SELECT * FROM customer
WHERE cust_email IS NULL;


更新列填充空值

UPDATE sale set city = "未知" 
WHERE city IS NULL;
UPDATE orderitems set 
price_new=IFNULL(price_new,5.74);


查询并填充空值列

SELECT AVG(price_new) FROM orderitems;
SELECT IFNULL(price_new,5.74) AS bus_ifnull
FROM orderitems;



3  计算列

更新表添加计算列
ALTER TABLE orderitems ADD price_new DECIMAL(8,2) NOT NULL;
UPDATE orderitems set price_new= item_price*count;


查询计算列

SELECT item_price*count as sales FROM orderitems;



4  排序

多列排序
SELECT * FROM orderitems
ORDER BY price_new DESC,quantity;


查询排名前几的记录

SELECT  * FROM orderitems
ORDER BY price_new DESC Limit 5;


查询第10大的值

SELECT DISTINCT price_new
FROM orderitems
ORDER BY price_new DESC LIMIT 9,1;


上一篇:SQL 中为什么经常要加NOLOCK?


下一篇:MySQL基础知识——ORDER BY