LATERAL VIEW EXPLODE函数详解及应用

数据说明:

+-----+-----------+------------+------------+---------+---------+
|id   |device_type|business_gmv|order_source|pay_money|event_day|
+-----+-----------+------------+------------+---------+---------+
|10001|jingdong   |16          |1           |1000     |20211020 |
|10004|jingdong   |15          |1           |2000     |20211021 |
+-----+-----------+------------+------------+---------+---------+

使用炸裂函数进行如下操作:

select
   device_type_all,
   business_gmv_all,
   order_source,
   pay_money,
   event_day_all
from order_hbi
LATERAL VIEW OUTER EXPLODE(array('ALL', device_type)) col1 AS device_type_all
LATERAL VIEW OUTER EXPLODE(array('ALL', business_gmv)) col1 AS business_gmv_all
LATERAL VIEW OUTER EXPLODE(array('ALL', event_day)) col1 AS event_day_all
order by event_day_all asc

语句说明:
1.首先将device_type 和ALL的组合数组 进行炸裂,那么这里原来的两行数据在各自加上ALL 之后会得到一共四行数据:

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|ALL            |16              |1           |1000     |20211020     |
|jingdong       |16              |1           |1000     |20211020     |
|ALL            |15              |1           |2000     |20211021     |
|jingdong       |15              |1           |2000     |20211021     |
+---------------+----------------+------------+---------+-------------+

2.将business_gmv 和ALL 的组合数组进行炸裂,则由上述的数据再次翻倍,即每一行在基于business_gmv 进行和ALL 的炸裂则得到如下的八行数据

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|ALL            |ALL             |1           |1000     |20211020     |
|ALL            |16              |1           |1000     |20211020     |
|jingdong       |ALL             |1           |1000     |20211020     |
|jingdong       |16              |1           |1000     |20211020     |
|ALL            |ALL             |1           |2000     |20211021     |
|jingdong       |15              |1           |2000     |20211021     |
|ALL            |15              |1           |2000     |20211021     |
|jingdong       |ALL             |1           |2000     |20211021     |
+---------------+----------------+------------+---------+-------------+

再基于上述结果进行event_day 的炸裂,则最终可以得到如下的结果的16行数据:

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|ALL            |ALL             |1           |1000     |20211020     |
|ALL            |16              |1           |1000     |20211020     |
|jingdong       |ALL             |1           |1000     |20211020     |
|jingdong       |16              |1           |1000     |20211020     |
|jingdong       |15              |1           |2000     |20211021     |
|ALL            |ALL             |1           |2000     |20211021     |
|ALL            |15              |1           |2000     |20211021     |
|jingdong       |ALL             |1           |2000     |20211021     |
|ALL            |ALL             |1           |1000     |ALL          |
|ALL            |16              |1           |1000     |ALL          |
|jingdong       |16              |1           |1000     |ALL          |
|jingdong       |15              |1           |2000     |ALL          |
|jingdong       |ALL             |1           |1000     |ALL          |
|ALL            |ALL             |1           |2000     |ALL          |
|ALL            |15              |1           |2000     |ALL          |
|jingdong       |ALL             |1           |2000     |ALL          |
+---------------+----------------+------------+---------+-------------+

基于上述结果统计的目的是可以统计多纬度的指标的聚合结果

4.加上group by 进行统计结果的分析,聚合订单金额

select
   device_type_all,
   business_gmv_all,
   order_source,
   sum(pay_money) pay_money,
   event_day_all
from order_hbi
LATERAL VIEW OUTER EXPLODE(array('ALL', device_type)) col1 AS device_type_all
LATERAL VIEW OUTER EXPLODE(array('ALL', business_gmv)) col1 AS business_gmv_all
LATERAL VIEW OUTER EXPLODE(array('ALL', event_day)) col1 AS event_day_all
group by device_type_all,business_gmv_all,order_source,event_day_all
order by event_day_all asc

可以得到如下的结果:

+---------------+----------------+------------+---------+-------------+
|device_type_all|business_gmv_all|order_source|pay_money|event_day_all|
+---------------+----------------+------------+---------+-------------+
|jingdong       |ALL             |1           |1000.0   |20211020     |
|ALL            |ALL             |1           |1000.0   |20211020     |
|ALL            |16              |1           |1000.0   |20211020     |
|jingdong       |16              |1           |1000.0   |20211020     |
|jingdong       |15              |1           |2000.0   |20211021     |
|ALL            |15              |1           |2000.0   |20211021     |
|jingdong       |ALL             |1           |2000.0   |20211021     |
|ALL            |ALL             |1           |2000.0   |20211021     |
|ALL            |16              |1           |1000.0   |ALL          |
|jingdong       |16              |1           |1000.0   |ALL          |
|ALL            |ALL             |1           |3000.0   |ALL          |
|ALL            |15              |1           |2000.0   |ALL          |
|jingdong       |15              |1           |2000.0   |ALL          |
|jingdong       |ALL             |1           |3000.0   |ALL          |
+---------------+----------------+------------+---------+-------------+

比如我们可以直接从这个结果中选择所有设备类型(device_type_all)并且所有的bussiness_gmv 并且所有日期的订单总量则可以直接选取如下的数据,订单总金额为3000

|ALL            |ALL             |1           |3000.0   |ALL  

也可以宣组jingdong 所有business_gmv_all 所有日期的订单总额则为:3000

|jingdong       |ALL             |1           |3000.0   |ALL          |

假如不想要某个字段的聚合结果可以用!=ALL 先过滤掉,比如:business_gmv_all !=‘ALL’

SELECT device_type_all,sum(pay_money) gmv,event_day_all
FROM (
select
   device_type_all,business_gmv_all,order_source,
   sum(pay_money) pay_money,
   event_day_all
from order_hbi
LATERAL VIEW OUTER EXPLODE(array('ALL', device_type)) col1 AS device_type_all
LATERAL VIEW OUTER EXPLODE(array('ALL', business_gmv)) col1 AS business_gmv_all
LATERAL VIEW OUTER EXPLODE(array('ALL', event_day)) col1 AS event_day_all
group by device_type_all,business_gmv_all,order_source,event_day_all
order by event_day_all asc
) tmp
where business_gmv_all !='ALL'
group by device_type_all,event_day_all

得到如下结果:可以选取想要的数据避免重复计算,直接选取即可

+---------------+------+-------------+
|device_type_all|gmv   |event_day_all|
+---------------+------+-------------+
|ALL            |1000.0|20211020     |
|jingdong       |1000.0|20211020     |
|jingdong       |2000.0|20211021     |
|ALL            |2000.0|20211021     |
|ALL            |3000.0|ALL          |
|jingdong       |3000.0|ALL          |
+---------------+------+-------------+
上一篇:SAP Business One 表结构


下一篇:SAP Business ByDesign 和支付宝与钉钉集成的一个原型开发案例