I have a table with the following fields:
season, collection, product_key, units_sold
I want to add a
cumulative_sold column (aggregate of the previous rows' values).
The running total should follow the order season, collection, units_sold.
sample input
----------
ss,f1,1,3
ss,f1,2,4
ss,f1,3,4
ss,f2,1,1
ss,f2,5,1
expected output
--------------
ss,f1,1,3,3
ss,f1,2,4,7(3+4)
ss,f1,3,4,11(3+4+4)
ss,f2,1,1,1
ss,f2,5,1,2(1+1)
Solution:
You can compute a cumulative sum with a correlated subquery:
select season, collection, product_key, units_sold,
       (select sum(units_sold)
        from t t2
        where t2.season < t.season
           or (t2.season = t.season and t2.collection < t.collection)
           or (t2.season = t.season and t2.collection = t.collection
               and t2.units_sold <= t.units_sold)
       ) as cumsum
from t;
This is standard SQL. If you have a large amount of data, you will want an index on t(season, collection, units_sold).
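To see what this first query returns on the sample rows, here is a minimal sketch that runs it in an in-memory SQLite database from Python. The column types and the index name idx_t_cumsum are assumptions made for the sketch, and the outer order by is added only to make the printed order deterministic. Note that the total runs across collections (f2 continues from f1's total) and that rows tied on units_sold share one value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only lists the column names.
conn.execute(
    "create table t (season text, collection text, "
    "product_key integer, units_sold integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [("ss", "f1", 1, 3), ("ss", "f1", 2, 4), ("ss", "f1", 3, 4),
     ("ss", "f2", 1, 1), ("ss", "f2", 5, 1)],
)
# The index suggested above; the name idx_t_cumsum is made up for this sketch.
conn.execute("create index idx_t_cumsum on t (season, collection, units_sold)")

overall = conn.execute("""
    select season, collection, product_key, units_sold,
           (select sum(units_sold)
            from t t2
            where t2.season < t.season
               or (t2.season = t.season and t2.collection < t.collection)
               or (t2.season = t.season and t2.collection = t.collection
                   and t2.units_sold <= t.units_sold)
           ) as cumsum
    from t
    order by season, collection, units_sold, product_key
""").fetchall()

for row in overall:
    print(row)
# cumsum column: 3, 11, 11, 13, 13
```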
EDIT:
If you want the running total restarted for each collection within each season, rather than one overall cumulative sum:
select season, collection, product_key, units_sold,
       (select sum(units_sold)
        from t t2
        where t2.season = t.season and t2.collection = t.collection
          and t2.units_sold <= t.units_sold
       ) as cumsum
from t;
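A quick way to check the per-collection query is to run it on the sample rows. The sketch below does that with Python's sqlite3 module; the column types and the outer order by are assumptions added for the sketch. The total now restarts for f2, but rows with equal units_sold inside a collection still get the same value, because the `<=` comparison pulls in every tied row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only lists the column names.
conn.execute(
    "create table t (season text, collection text, "
    "product_key integer, units_sold integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [("ss", "f1", 1, 3), ("ss", "f1", 2, 4), ("ss", "f1", 3, 4),
     ("ss", "f2", 1, 1), ("ss", "f2", 5, 1)],
)

per_group = conn.execute("""
    select season, collection, product_key, units_sold,
           (select sum(units_sold)
            from t t2
            where t2.season = t.season and t2.collection = t.collection
              and t2.units_sold <= t.units_sold
           ) as cumsum
    from t
    order by season, collection, units_sold, product_key
""").fetchall()

for row in per_group:
    print(row)
# cumsum column: 3, 11, 11, 2, 2 (tied rows share a total)
```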
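EDIT II:
This is a pretty standard type of SQL. It would help if the question were stated correctly. To handle duplicates on units_sold, break ties with product_key: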
select season, collection, product_key, units_sold,
       (select sum(units_sold)
        from t t2
        where t2.season = t.season and t2.collection = t.collection
          and (t2.units_sold < t.units_sold
               or (t2.units_sold = t.units_sold and t2.product_key <= t.product_key))
       ) as cumsum
from t;
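The tie-breaking query can be checked against the expected output from the question. The sketch below runs it in an in-memory SQLite database; the column types and the outer order by are assumptions. It also runs a window-function version for comparison. That version is not part of the answer above; it is an alternative that assumes the database supports `sum(...) over (...)` (SQLite 3.25 or later, for example).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Column types are assumed; the question only lists the column names.
conn.execute(
    "create table t (season text, collection text, "
    "product_key integer, units_sold integer)"
)
conn.executemany(
    "insert into t values (?, ?, ?, ?)",
    [("ss", "f1", 1, 3), ("ss", "f1", 2, 4), ("ss", "f1", 3, 4),
     ("ss", "f2", 1, 1), ("ss", "f2", 5, 1)],
)

# Correlated subquery with product_key as the tie-breaker.
tie_broken = conn.execute("""
    select season, collection, product_key, units_sold,
           (select sum(units_sold)
            from t t2
            where t2.season = t.season and t2.collection = t.collection
              and (t2.units_sold < t.units_sold
                   or (t2.units_sold = t.units_sold
                       and t2.product_key <= t.product_key))
           ) as cumsum
    from t
    order by season, collection, units_sold, product_key
""").fetchall()

# Alternative (not from the answer): a window function with the same ordering.
windowed = conn.execute("""
    select season, collection, product_key, units_sold,
           sum(units_sold) over (
               partition by season, collection
               order by units_sold, product_key
               rows between unbounded preceding and current row
           ) as cumsum
    from t
    order by season, collection, units_sold, product_key
""").fetchall()

for row in tie_broken:
    print(row)
# cumsum column: 3, 7, 11, 1, 2, matching the expected output
```

On this sample both queries return the same rows. The window version reads the table once per partition instead of re-running a subquery for every row, which may matter on larger tables.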