背景
场景:
短视频、电商、社交等基于标签的动态推荐.
挑战:
标签多, 按标签权重比例每个标签取N条, 与数据库的交互次数多.
已读列表巨大, 过滤已读采用传统not in资源消耗大.
效率低下.
PG解决方案:
效率提升1000倍, 可能价值1个亿.
技术点: 数组、自定义type、partial index降低无效过滤、hash subplan优化not in、array agg, jsonb
例子
以短视频推荐为例.
PG解决方案
1、视频资源池
1.1、全国资源池
vid int8, -- 视频ID
tag int, -- 标签
score float4 -- 标签权重
);
写入数据: 1000万条记录: 100万个视频, 每个视频10个标签, 标签取值空间1-100.
1.2、省份资源池
pid int, -- 省份ID
vid int8, -- 视频ID
tag int, -- 标签
score float4 -- 标签权重
);
写入数据: 1000万条记录: 100万个视频, 每个视频10个标签, 标签取值空间1-100.
pid 取值空间1-36
1.3、城市资源池
cid int, -- 城市ID
vid int8, -- 视频ID
tag int, -- 标签
score float4 -- 标签权重
);
写入数据: 1000万条记录: 100万个视频, 每个视频10个标签, 标签取值空间1-100.
cid 取值空间1-10000
1.4、partial index(分区索引), 避免过滤列表巨大带来巨大的无效扫描, 之前已经讲过
《重新发现PostgreSQL之美- 23 彭祖的长寿秘诀》
declare
begin
for i in 0..49 loop
execute format('create index idx_pool_global_%s on pool_global (tag,score desc) include (vid) where abs(mod(hashint8(vid),50))=%s', i,i);
execute format('create index idx_pool_province_%s on pool_province (pid,tag,score desc) include (vid) where abs(mod(hashint8(vid),50))=%s', i,i);
execute format('create index idx_pool_city_%s on pool_city (cid,tag,score desc) include (vid) where abs(mod(hashint8(vid),50))=%s', i,i);
end loop;
end;
$$;
2、用户标签类型
tag int, -- 标签
score float4, -- 标签权重
limits int -- 用这个标签获取多少条VID
);
3、用户表
uid int8 primary key, -- 用户ID
pid int, -- 省份ID
cid int, -- 城市ID
tag_scores1 tag_score[], -- 标签、权重、对应标签获取多少条. 也可以使用jsonb存储
tag_scores2 tag_score[], -- 标签、权重、对应标签获取多少条 limit = 0的放这个字段. 业务更新tag_scores根据两个字段的结果来计算. 主要是减少PG计算量.
readlist jsonb -- 已读VID, 和分区索引的分区数匹配, 用jsonb数组表示. jsonb[0]表示abs(mod(hashint8(vid),50))=0的vid数组
);
写入数据: 1000万个用户, 每个用户20个标签(标签取值空间1-100), limit大于0的标签5个(和为100). vid已读列表5万条(1-300万取值空间).
pid 取值空间1-36
cid 取值空间1-10000
select generate_series(1,10000000), ceil(random()*36), ceil(random()*10000),
array(
select row(ceil(random()*100), random()*10, 40)::tag_score
union all
select row(ceil(random()*100), random()*10, 20)::tag_score
union all
select row(ceil(random()*100), random()*10, 15)::tag_score
union all
select row(ceil(random()*100), random()*10, 15)::tag_score
union all
select row(ceil(random()*100), random()*10, 10)::tag_score
),
array (
select row(ceil(random()*100), random()*10, 0)::tag_score from generate_series(1,15)
),
(
select jsonb_agg(x) as readlist from
(
select array (select x from
(select ceil(random()*3000000)::int8 x from generate_series(1,50000)) t
where mod(x,50)=i
) x
from generate_series(0,49) i
) t
) ;
4、如何更新用户的标签权重, 对应标签获取多少条?
原则: 可以在程序中计算并且不会增加程序和数据库交互的, 放在程序内计算.
取出UID, 在应用程序中计算的到tag_scores结果, 存入数据库users表.
5、获取推荐视频vids SQL:
(
select array_agg(vid) from
(
select vid from pool_global t1
where t1.tag=t.tag
and t1.vid not in (select jsonb_array_elements_text( readlist[0] )::int8 from users where uid=1)
and abs(mod(hashint8(vid),50)) = 0
order by t1.score desc
limit ceil(t.limits*0.5)
) x -- 全国池 50%
) as global_pool,
(
select array_agg(vid) from
(
select vid from pool_province t1
where t1.tag=t.tag
and t1.pid=(select pid from users where uid=1)
and t1.vid not in (select jsonb_array_elements_text( readlist[0] )::int8 from users where uid=1)
and abs(mod(hashint8(vid),50)) = 0
order by t1.score desc
limit ceil(t.limits*0.3)
) x -- 省份池 30%
) as province_pool,
(
select array_agg(vid) from
(
select vid from pool_city t1
where t1.tag=t.tag
and t1.cid=(select cid from users where uid=1)
and t1.vid not in (select jsonb_array_elements_text( readlist[0] )::int8 from users where uid=1)
and abs(mod(hashint8(vid),50)) = 0
order by t1.score desc
limit ceil(t.limits*0.2)
) x -- 城市池 20%
) as city_pool
from
(
select (unnest(tag_scores1)).tag as tag, (unnest(tag_scores1)).limits as limits from
users where uid=1
) t;
global_pool | {555234,30783,44877,893039,274638,811324,743142,233694,503619,977097,263781,350882,15000,961863,705252,823857,302978,919950,864090,633682}
province_pool | {1381822,1570117,1733796,1802258,1757745,1308796,1296608,1958019,1637076,1626698,1369964,1501167}
city_pool |
-[ RECORD 2 ]-+-------------------------------------------------------------------------------------------------------------------------------------------
global_pool | {41356,470712,453025,997172,997806,520315,512094,523652,714477,526433}
province_pool | {1246470,1582571,1154589,1213147,1144821,1498216}
city_pool |
-[ RECORD 3 ]-+-------------------------------------------------------------------------------------------------------------------------------------------
global_pool | {951192,518459,830710,34429,113691,362024,578173,574309}
province_pool | {1831028,1060594,1871276,1365273,1092971}
city_pool | {2100388}
-[ RECORD 4 ]-+-------------------------------------------------------------------------------------------------------------------------------------------
global_pool | {235601,950720,269682,452868,622511,590893,602110,104605}
province_pool | {1791516,1039665,1669258,1663575,1126127}
city_pool |
-[ RECORD 5 ]-+-------------------------------------------------------------------------------------------------------------------------------------------
global_pool | {738016,548948,600423,559686,483213}
province_pool | {1185880,1916542,1336231}
city_pool |
-------------------------------------------------------------------------------------------------------------------------------------------------------------------
Subquery Scan on t (cost=0.29..246.06 rows=10 width=96) (actual time=2.508..3.754 rows=5 loops=1)
-> Result (cost=0.29..2.71 rows=10 width=8) (actual time=0.020..0.023 rows=5 loops=1)
-> ProjectSet (cost=0.29..2.56 rows=10 width=32) (actual time=0.019..0.020 rows=5 loops=1)
-> Index Scan using users_pkey on users (cost=0.29..2.50 rows=1 width=224) (actual time=0.017..0.018 rows=1 loops=1)
Index Cond: (uid = 1)
SubPlan 2
-> Aggregate (cost=7.08..7.09 rows=1 width=32) (actual time=0.261..0.261 rows=1 loops=5)
-> Limit (cost=5.43..6.76 rows=25 width=12) (actual time=0.254..0.258 rows=10 loops=5)
-> Index Only Scan using idx_pool_global_0 on pool_global t1 (cost=5.43..18.63 rows=248 width=12) (actual time=0.254..0.257 rows=10 loops=5)
Index Cond: (tag = t.tag)
Filter: (NOT (hashed SubPlan 1))
Heap Fetches: 0
SubPlan 1
-> Result (cost=0.29..4.76 rows=100 width=8) (actual time=0.851..1.074 rows=1001 loops=1)
-> ProjectSet (cost=0.29..3.01 rows=100 width=32) (actual time=0.850..0.938 rows=1001 loops=1)
-> Index Scan using users_pkey on users users_1 (cost=0.29..2.50 rows=1 width=18) (actual time=0.006..0.006 rows=1 loops=1)
Index Cond: (uid = 1)
SubPlan 5
-> Aggregate (cost=8.15..8.16 rows=1 width=32) (actual time=0.247..0.247 rows=1 loops=5)
-> Limit (cost=7.93..8.13 rows=1 width=12) (actual time=0.242..0.245 rows=6 loops=5)
InitPlan 3 (returns $3)
-> Index Scan using users_pkey on users users_2 (cost=0.29..2.50 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=1)
Index Cond: (uid = 1)
-> Index Only Scan using idx_pool_province_0 on pool_province t1_1 (cost=5.43..6.85 rows=7 width=12) (actual time=0.241..0.243 rows=6 loops=5)
Index Cond: ((pid = $3) AND (tag = t.tag))
Filter: (NOT (hashed SubPlan 4))
Heap Fetches: 0
SubPlan 4
-> Result (cost=0.29..4.76 rows=100 width=8) (actual time=0.813..1.027 rows=1001 loops=1)
-> ProjectSet (cost=0.29..3.01 rows=100 width=32) (actual time=0.812..0.896 rows=1001 loops=1)
-> Index Scan using users_pkey on users users_3 (cost=0.29..2.50 rows=1 width=18) (actual time=0.005..0.005 rows=1 loops=1)
Index Cond: (uid = 1)
SubPlan 8
-> Aggregate (cost=9.07..9.08 rows=1 width=32) (actual time=0.236..0.237 rows=1 loops=5)
-> Limit (cost=7.93..9.05 rows=1 width=12) (actual time=0.235..0.235 rows=0 loops=5)
InitPlan 6 (returns $7)
-> Index Scan using users_pkey on users users_4 (cost=0.29..2.50 rows=1 width=4) (actual time=0.002..0.003 rows=1 loops=1)
Index Cond: (uid = 1)
-> Index Only Scan using idx_pool_city_0 on pool_city t1_2 (cost=5.43..6.55 rows=1 width=12) (actual time=0.234..0.234 rows=0 loops=5)
Index Cond: ((cid = $7) AND (tag = t.tag))
Filter: (NOT (hashed SubPlan 7))
Heap Fetches: 0
SubPlan 7
-> Result (cost=0.29..4.76 rows=100 width=8) (actual time=0.793..1.011 rows=1001 loops=1)
-> ProjectSet (cost=0.29..3.01 rows=100 width=32) (actual time=0.792..0.876 rows=1001 loops=1)
-> Index Scan using users_pkey on users users_5 (cost=0.29..2.50 rows=1 width=18) (actual time=0.006..0.006 rows=1 loops=1)
Index Cond: (uid = 1)
Planning Time: 0.999 ms
Execution Time: 3.825 ms
(49 rows)
6、压测
\set uid random(1,10000000)
\set mod random(0,49)
select
(
select array_agg(vid) from
(
select vid from pool_global t1
where t1.tag=t.tag
and t1.vid not in (select jsonb_array_elements_text( readlist[:mod] )::int8 from users where uid=:uid)
and abs(mod(hashint8(vid),50)) = :mod
order by t1.score desc
limit ceil(t.limits*0.5)
) x -- 全国池 50%
) as global_pool,
(
select array_agg(vid) from
(
select vid from pool_province t1
where t1.tag=t.tag
and t1.pid=(select pid from users where uid=:uid)
and t1.vid not in (select jsonb_array_elements_text( readlist[:mod] )::int8 from users where uid=:uid)
and abs(mod(hashint8(vid),50)) = :mod
order by t1.score desc
limit ceil(t.limits*0.3)
) x -- 省份池 30%
) as province_pool,
(
select array_agg(vid) from
(
select vid from pool_city t1
where t1.tag=t.tag
and t1.cid=(select cid from users where uid=:uid)
and t1.vid not in (select jsonb_array_elements_text( readlist[:mod] )::int8 from users where uid=:uid)
and abs(mod(hashint8(vid),50)) = :mod
order by t1.score desc
limit ceil(t.limits*0.2)
) x -- 城市池 20%
) as city_pool
from
(
select (unnest(tag_scores1)).tag as tag, (unnest(tag_scores1)).limits as limits from
users where uid=:uid
) t;
pgbench (PostgreSQL) 14.0
transaction type: ./test.sql
scaling factor: 1
query mode: prepared
number of clients: 12
number of threads: 12
duration: 120 s
number of transactions actually processed: 99824
latency average = 14.426 ms
latency stddev = 12.608 ms
initial connection time = 13.486 ms
tps = 831.235738 (without initial connection time)
2018款macbook pro i5系列, 每秒可以推荐多少个视频ID?
- 83100
降级方法:
4、重新发现PG之美- 4 随机漫步踏浪而来
在一些论坛、短视频业务中, 编辑精选和地域或大范围精选的内容会采用随机推荐的方式推送给客户.
随机查询就有了高并发、低延迟的需求, 然而通用的order by random()随机方法性能太烂, 无法满足需求.
PG 提供了tablesample method(para)方法, 能够以几千倍的性能满足高并发需求.
视频回放: https://www.bilibili.com/video/BV1cy4y137WU/
压缩方法:
roaringbitmap | hll
《重新发现PostgreSQL之美- 24 滑动窗口分析2000x》
《重新发现PostgreSQL之美- 23 彭祖的长寿秘诀》
传统方案
放弃治疗
参考
《PostgreSQL x分组, y排序, 每组各取(N动态)条- 子查询+子查询聚合使用》
《重新发现PostgreSQL之美- 23 彭祖的长寿秘诀》