看hash join 里的概念:
numbuckets与numbatches:
numbuckets是 在内存中,hash表里面,"桶"的个数。
numbatches是 如果hash表过大,在内存中放不下,则要分多次进行。
void final_cost_hashjoin(PlannerInfo *root, HashPath *path, JoinCostWorkspace *workspace, SpecialJoinInfo *sjinfo, SemiAntiJoinFactors *semifactors) { ... /* and compute the number of "virtual" buckets in the whole join */ virtualbuckets = (double) numbuckets *(double) numbatches; ... }
http://www.postgresql.org/docs/current/static/using-explain.html
EXPLAIN ANALYZE SELECT *
FROM tenk1 t1, tenk2 t2
WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2 ORDER BY t1.fivethous;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
Sort (cost=717.30..717.56 rows=101 width=488) (actual time=104.950..105.327 rows=100 loops=1)
Sort Key: t1.fivethous
Sort Method: quicksort Memory: 68kB
-> Hash Join (cost=230.43..713.94 rows=101 width=488) (actual time=3.680..102.396 rows=100 loops=1)
Hash Cond: (t2.unique2 = t1.unique2)
-> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) (actual time=0.046..46.219 rows=10000 loops=1)
-> Hash (cost=229.17..229.17 rows=101 width=244) (actual time=3.184..3.184 rows=100 loops=1)
Buckets: 1024 Batches: 1 Memory Usage: 27kB
-> Bitmap Heap Scan on tenk1 t1 (cost=5.03..229.17 rows=101 width=244) (actual time=0.612..1.959 rows=100 loops=1)
Recheck Cond: (unique1 < 100)
-> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.01 rows=101 width=0) (actual time=0.390..0.390 rows=100 loops=1)
Index Cond: (unique1 < 100)
Total runtime: 107.392 ms