hive报错Unsupported SubQuery Expression Invalid subquery. Subquery in UDAF is not allowed.

 

Error while compiling statement: FAILED: SemanticException org.apache.hadoop.hive.ql.optimizer.calcite.CalciteSubquerySemanticException: Unsupported SubQuery Expression Invalid subquery. Subquery in UDAF is not allowed.

在hive中in、not in不支持子查询

例子

select DISTINCT userid FROM TABLE_A AS a WHERE a.dt >= '20200209' AND a.userid 
NOT IN  (SELECT DISTINCT userid FROM TABLE_B AS b WHERE b.dt >= '20200209');

  

改写方法1 :用left join连接

select DISTINCT a.userid FROM TABLE_A AS  a left JOIN  TABLE_B AS b
on a.userid=b.userid
WHERE b.userid is NULL;

使用left jion示例一下,左联以后,加上b表userid为空的条件,就可以实现例子的需求

改写方法2: 用exists 

select DISTINCT a.userid FROM TABLE_A AS a WHERE a.dt >= '20200209' 
AND NOT EXISTS  
(SELECT DISTINCT b.userid FROM TABLE_B AS b WHERE b.dt >= '20200209' AND a.userid = b.userid);

  where后面使用NOT EXISTS 时候,不需要跟着字段

 

来源:https://www.jianshu.com/p/4c1e0b426ed5

 

上一篇:为什么产生大数据技术&为什么需要学习大数据


下一篇:pyspark操作hive分区表以及.gz.parquet和part-00000文件压缩问题