PostgreSQL GROUP BY error

 

Here is an example.

Table name: makerar

 cname  | wmname |          avg
--------+--------+------------------------
 canada | zoro   |     2.0000000000000000
 spain  | luffy  | 1.00000000000000000000
 spain  | usopp  |     5.0000000000000000

Statement executed:

    SELECT cname, wmname, MAX(avg) FROM makerar GROUP BY cname;

Error:

    ERROR:  column "makerar.wmname" must appear in the GROUP BY clause or be used in an aggregate function
    LINE 1: SELECT cname, wmname, MAX(avg)  FROM makerar GROUP BY cname;

First, the reason for this error is explained as follows:

Before SQL3 (1999), the selected fields must appear in the GROUP BY clause[*]. Interestingly enough, even though the spec sort of allows selecting non-grouped fields, major engines seem to not really like it. Oracle and SQL Server just don't allow this at all. MySQL used to allow it by default, but since 5.7 the ONLY_FULL_GROUP_BY SQL mode is enabled by default, so the administrator needs to remove it from sql_mode in the server configuration for this feature to be supported.
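To make the rule concrete, here is a minimal sketch that recreates the example table and shows which queries PostgreSQL accepts. The column types and the second table, makerar_pk, are assumptions made for illustration; they are not part of the original example. The last query relies on the fact that PostgreSQL 9.1 and later accept an ungrouped column when the table's primary key is in the GROUP BY, because the column is then functionally dependent on the grouped key.

```sql
-- Assumed definition of the example table (types are a guess).
CREATE TABLE makerar (
    cname  text,
    wmname text,
    avg    numeric
);

INSERT INTO makerar (cname, wmname, avg) VALUES
    ('canada', 'zoro',  2.0000000000000000),
    ('spain',  'luffy', 1.00000000000000000000),
    ('spain',  'usopp', 5.0000000000000000);

-- Rejected: wmname is neither grouped nor aggregated.
-- SELECT cname, wmname, MAX(avg) FROM makerar GROUP BY cname;

-- Accepted: every selected column is either grouped or aggregated.
SELECT cname, MAX(avg) AS mx
FROM makerar
GROUP BY cname;

-- Hypothetical variant of the table with a primary key on wmname.
CREATE TABLE makerar_pk (
    wmname text PRIMARY KEY,
    cname  text,
    avg    numeric
);

INSERT INTO makerar_pk (wmname, cname, avg)
SELECT wmname, cname, avg FROM makerar;

-- Accepted on PostgreSQL 9.1+: cname is not grouped, but it is
-- functionally dependent on the grouped primary key wmname.
SELECT wmname, cname, MAX(avg) AS mx
FROM makerar_pk
GROUP BY wmname;
```

Grouping by the primary key does not answer the original question (the maximum per cname); it only illustrates when a non-grouped column is unambiguous.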

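To summarize the quoted passage: before the SQL3 (1999) standard, every selected field also had to appear in the GROUP BY clause. The reason is that when a column outside the GROUP BY holds different values within the same group, the database engine cannot know which value to display. In the example above, the spain group contains both luffy and usopp. The major database engines do not allow selecting a field that is missing from the GROUP BY, and even MySQL, since version 5.7, accepts it only after a server option has been changed.

As for how this kind of query behaves on MySQL: it doesn't work "well" in MySQL. In fact, the docs actually warn you that if you do it, and the "hidden" columns (those not in the GROUP BY) aren't 1-to-1 with the GROUP BY columns, then the results are unpredictable. In every other database you just plain can't do it, so I wouldn't call what MySQL does "doing it well".

Solutions:

1. Run the aggregate (SUM, MAX, etc.) in a subquery first, then JOIN the result back to the table: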
    SELECT m.cname, m.wmname, t.mx
    FROM (
        SELECT cname, MAX(avg) AS mx
        FROM makerar
        GROUP BY cname
    ) t
    JOIN makerar m ON m.cname = t.cname AND t.mx = m.avg;

 

 

 cname  | wmname |          mx
--------+--------+------------------------
 canada | zoro   |     2.0000000000000000
 spain  | usopp  |     5.0000000000000000

2. Use window functions:

    SELECT cname, wmname, MAX(avg) OVER (PARTITION BY cname) AS mx FROM makerar;

 cname  | wmname |          mx
--------+--------+------------------------
 canada | zoro   |     2.0000000000000000
 spain  | luffy  |     5.0000000000000000
 spain  | usopp  |     5.0000000000000000

This keeps every row and repeats the group maximum on each of them. To get rid of the repeated mx values and keep only the row that holds the maximum:
    SELECT DISTINCT /* DISTINCT removes duplicate output rows if makerar repeats a (cname, wmname) pair */
        m.cname, m.wmname, t.avg AS mx
    FROM (
        SELECT cname, wmname, avg,
               -- number the rows of each cname from highest avg to lowest
               ROW_NUMBER() OVER (PARTITION BY cname ORDER BY avg DESC) AS rn
        FROM makerar
    ) t
    JOIN makerar m ON m.cname = t.cname AND m.wmname = t.wmname AND t.rn = 1;

 

 cname  | wmname |          mx
--------+--------+------------------------
 canada | zoro   |     2.0000000000000000
 spain  | usopp  |     5.0000000000000000

3. Use DISTINCT ON:
    SELECT DISTINCT ON (cname)
        cname, wmname, avg
    FROM makerar
    ORDER BY cname, avg DESC;
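One detail worth sketching is what happens when two rows in the same cname share the maximum avg. DISTINCT ON keeps exactly one row per cname, and which of the tied rows survives is not defined unless the ORDER BY breaks the tie. The queries below are an illustration only: the choice of wmname as tie-breaker and the alias names (mx, rk, ranked) are assumptions, not part of the original example.

```sql
-- Deterministic choice: if several rows tie on avg within a cname,
-- the one with the alphabetically first wmname is kept (assumed rule).
SELECT DISTINCT ON (cname)
    cname, wmname, avg AS mx
FROM makerar
ORDER BY cname, avg DESC, wmname;

-- Alternative that keeps every tied row: RANK() gives all rows that
-- share the highest avg within a cname the same rank, 1.
SELECT cname, wmname, avg AS mx
FROM (
    SELECT cname, wmname, avg,
           RANK() OVER (PARTITION BY cname ORDER BY avg DESC) AS rk
    FROM makerar
) ranked
WHERE rk = 1;
```

With the three sample rows there are no ties, so both queries return the same two rows as the solutions above. Note that DISTINCT ON is a PostgreSQL extension, so the RANK() form is the more portable of the two.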

 
