PA152: Efficient Use of DB
8. Query Optimization
Vlastislav Dohnal
PA152, Vlastislav Dohnal, FI MUNI, 2023 2
Query Optimization
◼ Generating and comparing query
execution plans
Pick the best
Query
Execution Plans
Cost estimation
Generating
Filtering
Assigning costs
PA152, Vlastislav Dohnal, FI MUNI, 2023 3
Generating Execution Plans
◼ Consider using:
Rel. algebra transformation rules
Implementations of rel. alg. operations
Use of existing indexes
Building indexes and sorting on the fly
PA152, Vlastislav Dohnal, FI MUNI, 2023 4
Plan Cost Estimation
◼ Depends on costs of each operation
 i.e., its implementation
◼ Assumptions for operation costs:
 Input is read from and disk
 Output is kept in memory
 Costs on CPU
◼ Processing on CPU is faster than reading from disk
◼ Can be neglected but often simplified
 Network communication costs
◼ Issue in distributed databases
 Ignoring contents of mem buffers/caches between
queries
◼ Estimated costs of operation
 = number of read and write accesses to disk
Operation Cost Estimation
◼ Example: settings in PostgreSQL
https://www.postgresql.org/docs/15/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS
https://www.postgresql.org/docs/15/static/runtime-config-resource.html
 seq_page_cost (1.0)
 random_page_cost (4.0)
 cpu_tuple_cost (0.01)
 cpu_index_tuple_cost (0.005)
 cpu_operator_cost (0.0025)
 shared_buffers (32MB) – ¼ RAM
 effective_cache_size (4GB) – ½ RAM
 work_mem (8MB)
◼ Memory available to an operation
PA152, Vlastislav Dohnal, FI MUNI, 2023 5
PA152, Vlastislav Dohnal, FI MUNI, 2023 6
Operation Cost Estimation
◼ Parameters
B(R) – size of relation R in blocks
f(R) – max. record count to store in a block
M – max. RAM buffers available (in blocks)
HT(i) – depth of index i (in levels)
LB(i) – sum of all leaf nodes of index i
PA152, Vlastislav Dohnal, FI MUNI, 2023 7
Operation Implementation
◼ Based on concept of iterator
Open – initialization
◼ preparations before returning any record of result
GetNext – return next record of result
Close – finalization
◼ release temp buffers, …
◼ Advantages
Result may not be returned at once
◼ Does not occupy main memory; may not be
materialized on a disk
Pipelining can be used
PA152, Vlastislav Dohnal, FI MUNI, 2023 8
Accessing Relation: table scan
◼ Relation is not interlaced
Reading costs: B(R)
TwoPhase-MergeSort = 3B(R) reading/writing
◼ Final writing is ignored
◼ Relation is interlaced
Reading costs are up to T(R) blocks!
TwoPhase-MergeSort
◼ T(R) + 2B(R) reads and writes
R1 R2 R3 R4 R5 R6 R7 R8 …
R1 R2 S1 S2 R3 R4 S3 S4 …
PA152, Vlastislav Dohnal, FI MUNI, 2023 9
Accessing Relation: index scan
◼ Read relation using an index
 Scanning index → reading records
◼ Read index blocks (<< B(R))
◼ Read records of relation
 Applicable to any attribute
 Max. costs:
◼ (max. B(R) and T(R) reads) + (up to 𝑚 𝐻𝑇+1
− 1)
 Where m is an index arity (LB = 𝑚 𝐻𝑇
)
◼ Advantages
 Can limit to a subset of records (interval)
◼ Min. costs: 0 read blocks of relation + 1..HT blocks of
index
 For a covering index
Max. number of nodes
in an m-ary tree
PA152, Vlastislav Dohnal, FI MUNI, 2023 10
One-Pass Algorithms
◼ Implementation:
 Read relation → Processing → Output buffers
 Processing records one by one
◼ Operations
 Projection, Selection, Duplicate elimination
(DISTINCT)
◼ costs: B(R)
 Aggregation functions (GROUP BY)
◼ costs: B(R)
 Set operations, cross product
◼ costs: B(R) + B(S)
PA152, Vlastislav Dohnal, FI MUNI, 2023 11
Duplicate Elimination – distinct
◼ Procedure
 Test whether the record is in output
 If not, output the record
◼ Test for existence in output
 Store already seen records in memory
◼ Can use M-2 blocks
 No data structure: n2 complexity (comparisons)
 Use hashing
◼ Limitation: B(R) < M-1
◼ Can be implemented using iterators?
Distinct – example
◼ Relation company(company_key,company_name)
PA152, Vlastislav Dohnal, FI MUNI, 2023
12
# explain analyze SELECT DISTINCT company_name FROM provider.company;
HashAggregate (cost=438.68..554.67 rows=11600 width=20) (actual time=9.347..12.133 rows=11615 loops=1)
Group Key: company_name
-> Seq Scan on company (cost=0.00..407.94 rows=12294 width=20)
(actual time=0.019..5.007 rows=12295 loops=1)
Planning time: 0.063 ms
Execution time: 12.799 ms
# explain analyze SELECT DISTINCT company_key FROM provider.company;
Unique (cost=0.29..359.43 rows=12294 width=8) (actual time=0.041..8.857 rows=12295 loops=1)
-> Index Only Scan using company_pkey on company (cost=0.29..328.69 rows=12294 width=8)
(actual time=0.039..5.686 rows=12295 loops=1)
Heap Fetches: 4726
Planning time: 0.063 ms
Execution time: 9.645 ms
# explain analyze SELECT DISTINCT company_name FROM provider.company ORDER BY company_name;
Unique (cost=1243.05..1304.52 rows=11600 width=20) (actual time=53.468..59.072 rows=11615 loops=1)
-> Sort (cost=1243.05..1273.79 rows=12294 width=20) (actual time=53.467..55.482 rows=12295 loops=1)
Sort Key: company_name
Sort Method: quicksort Memory: 1214kB
-> Seq Scan on company (cost=0.00..407.94 rows=12294 width=20)
(actual time=0.018..5.338 rows=12295 loops=1)
PA152, Vlastislav Dohnal, FI MUNI, 2023 13
Aggregations / Grouping
◼ Procedure
 Create groups for group-by attributes
 Store accumulated values of aggregation functions
◼ Internal structure
 Organize values of grouping attributes, e.g., hashing
 Accumulated value of aggregations
◼ MIN, MAX, COUNT, SUM – one value (number)
◼ AVG – two numbers (SUM and COUNT)
 Accumulated values are small: M-1 blocks are enough
◼ Iterators:
◼ All prepared in Open
◼ Advantage of pipelining is inapplicable
Output block is not reserved.
PA152, Vlastislav Dohnal, FI MUNI, 2023 14
Set Operations
◼ Requirement: min(B(R), B(S)) ≤ M-2
Smaller relation read in memory
Larger relation is read gradually
Set union (possibly also Set difference):
◼ Memory requirements: B(R)+B(S) ≤ M-2
◼ Assumption
R is larger relation, i.e., S is in memory
◼ Implementation
Create a temp search structure
◼ E.g., hashing
PA152, Vlastislav Dohnal, FI MUNI, 2023
15
Set union
Notice: Not multiset union
i.e., without ALL in SQL
◼ Read S; construct search structure
Eliminate duplicates
Output unique records immediately
◼ Read R and check existence of the record
in S
If present, skip it.
If not seen, output it and add to structure
◼ Limitations
B(R)+B(S) ≤ M-2
PA152, Vlastislav Dohnal, FI MUNI, 2023
16
Set intersection
Notice: Not multiset intersection
i.e., without ALL in SQL
◼ Read S; construct search structure
Eliminate duplicates
◼ Read R and check existence of the record
in S
If present, output the record and delete it from
structure.
If not seen, skip it.
◼ Limitations
min(B(R), B(S)) ≤ M-2
PA152, Vlastislav Dohnal, FI MUNI, 2023 17
Set Difference
◼ R–S
 Read S; construct search structure
◼ Eliminate duplicates
 Read R and check existence of the record in S
◼ If not present, output it
 Also insert into internal structure
 B(S) + B(R) ≤ M-2 (worse case, but with pipelining)
◼ Or max(B(R),B(S)) ≤ M-2, when preprocessing R (no pipelining)
◼ S–R
 Read S; construct search structure
◼ Eliminate duplicates
 Read R and check existence of the record in S
◼ If present, delete it from internal structure
 Output all remaining recs. in S (no pipel.)
 B(S) ≤ M-1
PA152, Vlastislav Dohnal, FI MUNI, 2023 18
Multiset (Bag) Operations
◼ Bag union RBS
 Easy exercise…
◼ Bag intersection RBS
 Read S; construct search structure
◼ Eliminate duplicates by storing their count
 Read R and check existence of the record in S
 If record is present, output it
◼ and decrement record count!
◼ If counter is zero, delete it from internal structure
 If record is not found, skip it
 min(B(R), B(S)) ≤ M-2
PA152, Vlastislav Dohnal, FI MUNI, 2023 19
Multiset (Bag) Operations
◼ Bag difference S–BR
 Same idea
 If record of R is present in S, decrement its counter
 Output internal structure (recs. of S)
◼ with positive count
 B(S) ≤ M-1
◼ Bag difference R–BS
 By analogy…
 If record of R is not present in S → output
 If found,
◼ → if counter is zero, output it
◼ → decrement the counter and skip it
 B(S) ≤ M-2
PA152, Vlastislav Dohnal, FI MUNI, 2023 20
Join Operation – one pass version
◼ Cross product
 Easy exercise…
◼ Natural join
Assume relations R(X,Y), S(Y,Z)
◼ X – unique attributes is R, Z – unique attrs. in S
◼ Y – common attributes in R and S
Read S; construct search structure on Y
For each record of R, find all matching recs. of S
◼ Output concatenation of all combinations (eliminate
repeating attributes Y)
◼ Outer join ?
PA152, Vlastislav Dohnal, FI MUNI, 2023 21
One-Pass Algorithms
◼ Summary
Unary operation: op(R)
◼ B(R) ≤ M-1, 1 block for output; some need 1 for input
Binary operation: R op S
◼ B(S) ≤ M-2, 1 block for R, 1 block for output
 Some ops require: B(R)+B(S) ≤ M-2 or max(B(R),B(S))<M-1
Cost = B(R) + B(S)
◼ Based on size of memory buffers M
Known → ok
Not known → estimate it
◼ Wrong size → swapping, use two-pass
algo instead of one-pass algorithm
PA152, Vlastislav Dohnal, FI MUNI, 2023 22
Join Algorithms
◼ Relations do not fit in memory
So called “one and a half”-pass algorithms
◼ Basic variant: Nested-loop join
for each s in S do
◼ for each r in R do
 if r and s match in Y then output concatenation of r and s.
◼ Example
T(R) = 10 000 T(S) = 5 000 M=2
Costs = 5 000(1+10 000) = 50 005 000 IOs
Reading whole Rreading a record of S
PA152, Vlastislav Dohnal, FI MUNI, 2023 23
Join Algorithms
◼ Relations accessed by blocks
◼ Block-based nested-loop join
◼ R – inner relation, S – outer relation
◼ Example:
B(R) = 1000 B(S) = 500 M=3
Costs = 500(1+1000) = 500 500 IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 24
Join Algorithms
◼ Exploit all buffer blocks (M blocks)
Cached Block-based Nested-loop Join
Read M-2 blocks of relation S at once
◼ Read relation R block by block
 Join records
Costs: B(S)/(M-2)  (M-2 + B(R)) IOs
◼ Example RS:
M=102
Costs: 5  (100 + 1000) = 5 500 IOs
Swapping relations
◼ Costs: 10  (100 + 500) = 6 000 IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 25
Join Algorithms – Summary
◼ Nested-loops join
Use always blocked variant
Read the smaller relation into memory (if M>>3)
◼ Storage of relation
Important for final costs
◼ Interlaced → each record needs one I/O
◼ Non-interlaced → each record needs B(R)/T(R) I/Os
only
◼ Applicable to any join condition
theta joins
PA152, Vlastislav Dohnal, FI MUNI, 2023
26
Two-Pass Algorithms
◼ Procedure:
Preprocess input relation → store it
◼ Sorting (Multi-way MergeSort)
◼ Hashing
Processing
◼ Operations:
Joins
Duplicate elimination (DISTINCT)
Aggregations (GROUP BY)
Set operations
PA152, Vlastislav Dohnal, FI MUNI, 2023 27
memory
R
S
...
sorted runs
R
S
sorted relations
...
...
join
result
pass to merge
relations
Join Algorithms – MergeJoin
◼ R S R(X,Y), S(Y,Z)
disk
PA152, Vlastislav Dohnal, FI MUNI, 2023 28
Join Algorithms – MergeJoin
◼ R S R(X,Y), S(Y,Z)
◼ Algorithm:
Sort R and S
i = 1; j = 1;
while (i ≤ T(R))  (j ≤ T(S)) do
◼ if R[i].Y = S[j].Y then doJoin()
◼ else if R[i].Y > S[j].Y then j = j+1
◼ else if R[i].Y < S[j].Y then i = i+1
PA152, Vlastislav Dohnal, FI MUNI, 2023 29
Join Algorithms – MergeJoin
◼ Function doJoin():
Proceed nested-loop join for records of same Y
while (R[i].Y = S[j].Y)  (i ≤ T(R)) do
◼ j2 = j
◼ while (R[i].Y = S[j2].Y)  (j2 ≤ T(S)) do
 Output joined R[i] and S[j2]
 j2 = j2 + 1
◼ i = i + 1
j = j2
PA152, Vlastislav Dohnal, FI MUNI, 2023 30
Join Algorithms – MergeJoin
i R[i].Y S[j].Y j
1 10 5 1
2 20 20 2
3 20 20 3
4 30 30 4
5 40 30 5
50 6
52 7
PA152, Vlastislav Dohnal, FI MUNI, 2023 31
Join Algorithms – MergeJoin
◼ Costs
MergeSort of R and S → 4(B(R) + B(S))
MergeJoin → B(R) + B(S)
◼ Example (M=102)
MergeJoin
◼ Sorting: 4(1000 + 500) = 6000 read/write IOs
◼ Joining: 1000 + 500 = 1500 read IOs
◼ Total: 7500 read/write IOs
Original cached block-based nested-loop join
◼ 5500 read IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 32
Join Algorithms – MergeJoin
◼ Another example
B(R) = 10 000 B(S) = 5 000
M = 102 blocks
Cached Block-based Nested-loop Join
◼ (5 000/100)  (100 + 10 000) = 505 000 read IOs
MergeJoin
◼ 5(10 000 + 5 000) = 75 000 read/write IOs
10x larger relations!!!
PA152, Vlastislav Dohnal, FI MUNI, 2023 33
Join Algorithms – MergeJoin
◼ MergeJoin
Preprocessing is expensive
◼ If relations are sorted by Y, can be omitted.
◼ Analysis of IO costs
MergeJoin
◼ linear complexity
Cached Block-based Nested-loop Join
◼ quadratic complexity
→ from a certain size of relations,
MergeJoin is better
PA152, Vlastislav Dohnal, FI MUNI, 2023 34
Join Algorithms – MergeJoin
◼ Memory requirements
Limitation to max 𝐵 𝑅 , 𝐵 𝑆 < 𝑀2
◼ Optimal memory size
Using MergeSort on relation R
◼ Number of runs = Τ𝐵 𝑅 𝑀, Run length = 𝑀
◼ Limitation: number of runs ≤ 𝑀 − 1
◼ Τ𝐵 𝑅 𝑀 < 𝑀 → 𝐵 𝑅 < 𝑀2 → 𝑀 > 𝐵 𝑅
◼ Example
B(R) = 1000 → M>31.62
B(S) = 500 → M>22.36
PA152, Vlastislav Dohnal, FI MUNI, 2023 35
Join Algorithms – MergeJoin→SortJoin
◼ Improvement:
Not necessary to have the relations sorted
completely
R
S
Can join
directly?
sorted runs
(1st phase of MergeSort)
PA152, Vlastislav Dohnal, FI MUNI, 2023 36
Join Algorithms – SortJoin
◼ Improvement
Prepare sorted runs of R and S
Read 1st block of all runs (R and S)
Get min value in Y
◼ Find corresponding records in other runs
◼ Join them
◼ In case too many records with the same Y
Apply block-nested-loop join in the remaining
memory
PA152, Vlastislav Dohnal, FI MUNI, 2023 37
Join Algorithms – SortJoin
◼ Costs
Sorted runs: 2(B(R) + B(S))
Joining: B(R) + B(S)
◼ Limitations
Run length = M, number of runs < M
→ B(R) + B(S) < M(M-1)
◼ Example (M=102)
Sorting: 2(1000 + 500) Joining: 1000 + 500
Total: 4 500 read/write IOs
◼ → better than cached block-based
nested-loop join
PA152, Vlastislav Dohnal, FI MUNI, 2023 38
Join Algorithms – HashJoin
◼ R S R(X,Y), S(Y,Z)
...
...
M-1R
...
...
M-1S
memory buckets
join
PA152, Vlastislav Dohnal, FI MUNI, 2023 39
Join Algorithms – HashJoin
◼ R S R(X,Y), S(Y,Z)
Define a hash function for attributes Y
Create hashed index of R and S
◼ Address space is M-1 buckets
For each i  [0,M-2]
◼ Read bucket i of R and S
◼ Find matching records and join them
PA152, Vlastislav Dohnal, FI MUNI, 2023 40
Join Algorithms – HashJoin
◼ Joining buckets
Read whole bucket of S (≤ M-2)
◼ May create an internal structure to speed up
Read bucket of R block by block
BucketsofR
...
Si
memory
...
BucketsofS
PA152, Vlastislav Dohnal, FI MUNI, 2023 41
Join Algorithms – HashJoin
◼ Costs:
Create hashed index: 2(B(R)+B(S))
Bucket joining: B(R)+B(S)
◼ Limitations:
Size of each bucket of S ≤ M-2
◼ Estimate: 𝑚𝑖𝑛 𝐵 𝑅 , 𝐵 𝑆 < 𝑀 − 1 . (𝑀 − 2)
◼ Example:
Hashing: 2(1000+500)
Joining: 1000+500
Total: 4 500 read/write IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 42
Join Algorithms – HashJoin
◼ Minimum memory requirements
Hashing S; optimal bucket occupation
◼ Memory buffer: M blocks
◼ Bucket size = B(S) / (M-1)
 This must be smaller than M (due to joining)
 → Τ𝐵(𝑆) 𝑀 − 1 ≤ 𝑀 − 2
◼ ≈ 𝑀 − 1 > 𝐵 𝑆
PA152, Vlastislav Dohnal, FI MUNI, 2023 43
Join Algorithms – HashJoin
◼ Optimization
keep some buckets in memory
Hybrid HashJoin
◼ Bucketing of S – Optimal size
B(S)=500
 𝐵 𝑆  23
i.e., each bucket is of 22 blocks
M=102
◼ → keep 3 buckets in memory (66 blocks)
◼ → 36 blocks of memory to spare
PA152, Vlastislav Dohnal, FI MUNI, 2023 44
Join Algorithm – Hybrid HashJoin
◼ Preprocessing S
Contents of memory buffer
Memory usage (M=102):
G0-2 3*22 blocks
Other buckets 23-3 blocks
Reading S 1 block
output 1 block
Total 88 blocks
14 blocks are available!
memory
G0
G2
in
...
22 blocks
23-3=20 buckets
S
......
PA152, Vlastislav Dohnal, FI MUNI, 2023 45
Join Algorithm – Hybrid HashJoin
◼ Structure of memory to hash R
1000/23 = 44 blocks per bucket
Records hashed to bucket 0-2
◼ Join immediately with S0-2 buckets (in memory) →
output
memory
G0
G2
in
...
44 blocks
23-3=20 buckets
R
...
...
PA152, Vlastislav Dohnal, FI MUNI, 2023 46
Join Algorithm – Hybrid HashJoin
◼ Joining buckets
Do for buckets with id 3-22
Read one whole bucket in memory; read the
other bucket block by block
memory
Hi
output
...22
23-3=20
result
...
44
23-3=20
Buckets of S Buckets of R
one bucket
of S
one block of
bucket of R
PA152, Vlastislav Dohnal, FI MUNI, 2023 47
Join Algorithm – Hybrid HashJoin
◼ Costs:
Bucketize S: 500 + 2022 = 940 read/write IOs
Bucketize R: 1000 + 2044 = 1880 read/write IOs
◼ Only 20 buckets to write!
Joining: 2044 + 2022 = 1320 read IOs
◼ Three buckets are already done (during bucketizing R)
In total: 4140 read/write IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 48
Join Algorithms
◼ Hybrid HashJoin
How many buckets to keep in memory?
◼ Empirically: 1 bucket
◼ Hashing record pointers
Organize pointers to records instead of
records themselves
◼ Store pairs [key value, rec. pointer] in buckets
Joining
◼ If match, we must read the records
PA152, Vlastislav Dohnal, FI MUNI, 2023 49
Join Algorithm – Hashing Pointers
◼ Example
100 key-pointer pairs fit in one block
Estimate results size: 100 recs
Costs:
◼ Bucketize S in memory (500 IOs)
 5000 records → 5000/100 blocks = 50 blocks in memory
◼ Joining – read R gradually and join
 If match, read full records of S → 100 read IOs
◼ Total: 500 + 1000 + 100 = 1600 read IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 50
Join Algorithms – IndexJoin
◼ R S R(X,Y), S(Y,Z)
◼ Assume:
Index on attributes Y of R
◼ Procedure:
For each record s  S
Look up matches in index → records A
◼ For each record r  A
◼ Output concatenation of r and s
PA152, Vlastislav Dohnal, FI MUNI, 2023 51
Join Algorithms – IndexJoin
◼ Example
Assume
◼ Index on Y of R: HT=2, LB=200
◼ Scenario 1
Index fits in memory
Costs:
◼ Pass of S: 500 read IOs (B(S)=500, T(S)=5000)
◼ Searching in index: for free
 If match, read record of R → 1 read IO
PA152, Vlastislav Dohnal, FI MUNI, 2023 52
Join Algorithms – IndexJoin
◼ Costs
Depends on the number of matches
Variants:
◼ A) Y in R is primary key; Y in S is foreign key
→ 1 record
Costs: 500 + 500011 = 5500 read IOs
◼ B) V(R,Y) = 5000 T(R) = 10 000
uniform distribution → 2 records
Costs: 500 + 500021 = 10500 read IOs
◼ C) DOM(R,Y)=1 000 000 T(R) = 10 000
→ 10k/1m = 1/100 of record
Costs: 500 + 5000(1/100)1 = 550 read IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 53
Join Algorithms – IndexJoin
◼ Scenario 2
Index does not fit in memory
Index on Y of R is of 201 blocks
◼ Keep root-node block and 99 leaf-node blocks
in memory M=102
Costs for searching
◼ 0(99/200) + 1(101/200) = 0.505 read IOs per
search (query)
PA152, Vlastislav Dohnal, FI MUNI, 2023 54
Join Algorithms – IndexJoin
◼ Scenario 2
Costs
◼ B(S) + T(S)(searching index + reading records)
Variants:
◼ A) → 1 record
Costs: 500 + 5000(0.5+1) = 8000 read IOs
◼ B) → 2 records
Costs: 500 + 5000(0.5+2) = 13000 read IOs
◼ C) → 1/100 of record
Costs: 500 + 5000(0.5+1/100)
= 3050 read IOs
PA152, Vlastislav Dohnal, FI MUNI, 2023 55
Join Algorithms – Summary
R  S
B(R) = 1000
B(S) = 500
Algorithm Costs
Cached Block-based Nested-loop Join 5500
Merge Join (w/o sorting) 1500
Merge Join (with sorting) 7500
Sort Join 4500
Index Join (R.Y index) 8000 → 550
Hash Join 4500
Hybrid 4140
Pointers 1600
PA152, Vlastislav Dohnal, FI MUNI, 2023
Join Algorithms – Summary
R  S Assume B(S) < B(R), Y are common attributes
Algorithm Costs Limits
Block-based Nested-loop B(S)  (1+B(R)) M=3
Cached version B(S)/(M-2)  (M-2 + B(R)) M3
Merge Join (w/o sorting) B(R) + B(S) M=3
Merge Join (with sorting) 5  (B(R) + B(S)) 𝑀 = 𝐵 𝑅
Sort Join 3  (B(R) + B(S)) 𝑀 = 𝐵 𝑅 + 𝐵 𝑆 + 1
Index Join (R.Y index)
(max costs)
B(S) + T(S)  (HT + )
e.g.  = T(R)/V(R,Y)
min. M=4
Hash Join 3  (B(R) + B(S)) 𝑀 = 2 + 𝐵 𝑆
max. M-1 buckets
Hybrid 3 𝐵 𝑅 + 𝐵(𝑆) −
2 𝐵 𝑅 + 𝐵(𝑆)
𝐵 𝑅 𝑀 =
𝐵(𝑅)
𝐵 𝑅
+ 𝐵 𝑅 + 1
Pointers B(S)+B(R)+T(R)  
e.g.  = T(S)/V(S,Y)
M=B(hash index on S)+3
56
PA152, Vlastislav Dohnal, FI MUNI, 2023 57
Join Algorithms – Recommendation
◼ Cached Block-based Nested-loop Join
 Good for small relations (relative to memory size)
◼ HashJoin
 For equi-joins (equality on attributes only)
 Relations are not sorted or no indexes
◼ SortJoin
 Good for non-equi-joins
 E.g., R.Y > S.Y
◼ MergeJoin
 If relations are already sorted
◼ IndexJoin
 If index exists, it could be useful
 Depends on expected result size
PA152, Vlastislav Dohnal, FI MUNI, 2023 58
Two-Pass Algorithms
◼ Using sorting
Duplicate Elimination
Aggregations (GROUP BY)
Set operations
PA152, Vlastislav Dohnal, FI MUNI, 2023 59
Duplicate Elimination
◼ Procedure
Do 1st phase of MergeSort
◼ → sorted runs on disk
Read all runs block by block
◼ Find smallest record and output it
◼ Skip all duplicate records
◼ Properties
Costs: 3B(R)
Limitations: B(R) ≤ M*(M-1)
◼ Optimal M ≥ 𝐵 𝑅 + 1
PA152, Vlastislav Dohnal, FI MUNI, 2023 60
Aggregations
◼ Procedure (analogous to previous)
Sort runs of R (by group-by attributes)
Read all runs block by block
◼ Find smallest value → new group
 Compute all aggregates over all records of this group
 No more record in this group → output it
◼ Properties
Costs: 3B(R)
Limitations: B(R) ≤ M*(M-1)
◼ Optimal M ≥ 𝐵 𝑅 + 1
PA152, Vlastislav Dohnal, FI MUNI, 2023 61
Set union
◼ Notice: No two-pass algo for bag union
◼ Set union
Do 1st phase of MergeSort on R and S
◼ → sorted runs on disk
Read all runs (both R and S) gradually
◼ Find the first remaining record and output it
◼ Skip all duplicates of this record (in R and S)
◼ Properties
Costs: 3(B(R) + B(S))
Limitations: 𝐵 𝑅 + 𝐵 𝑆 ≤ 𝑀
◼ Need one block per all runs (of R and also S)
PA152, Vlastislav Dohnal, FI MUNI, 2023 62
Set/bag intersection and difference
◼ RS, R-S, RBS, R-BS
◼ Procedure
Do 1st phase of MergeSort on R and S
Read all runs (both R and S) gradually
◼ Find the first remaining record t
◼ Count t’s occurrences in R and S (separately)
 #R, #S
◼ Copy to output (respecting specific operation)
PA152, Vlastislav Dohnal, FI MUNI, 2023 63
Set/bag intersection and difference
◼ On copy to output:
RS: output t,
◼ if #R > 0  #S > 0
RBS: output t min(#R,#S)-times
R-S: output t,
◼ if #R > 0  #S = 0
R-BS: output t max(#R - #S,0)-times
◼ Properties
Costs: 3(B(R) + B(S))
Limitations: 𝐵 𝑅 + 𝐵 𝑆 ≤ 𝑀
◼ Need one block per all runs (of R and
also S)
PA152, Vlastislav Dohnal, FI MUNI, 2023 64
Two-Pass Algorithms
◼ Using hashing
Duplicate Elimination
Aggregations (GROUP BY)
Set operations
PA152, Vlastislav Dohnal, FI MUNI, 2023 65
Duplicate Elimination
◼ Procedure
Bucketize R into M-1 buckets
◼ → store buckets on disk
For each bucket
◼ Read it in memory and remove duplicates; output
remaining records
 bucket size is max. M-1 blocks
◼ Properties
Costs: 3B(R)
Limitations: B(R) ≤ (M-1)2
PA152, Vlastislav Dohnal, FI MUNI, 2023 66
Aggregations
◼ Procedure (analogous to previous)
 Bucketize R into M-1 buckets by group-by attrs.
◼ → store buckets on disk
 For each bucket
◼ Read block by block in memory and
◼ Create groups for new values and compute aggregates
 Limit on bucket size is not defined. But groups and partial
aggregates must fit in max. M-1 blocks.
◼ Output results
◼ Properties
 Costs: 3B(R)
 Limitations: B(R) ≤ (M-1)2
lze téměř zrušitcan be relaxed
PA152, Vlastislav Dohnal, FI MUNI, 2023 67
Set union, intersection, difference
◼ Procedure
Bucketize R and S (the same hash function)
◼ into M-1 buckets
Process the pair of buckets Ri and Si
◼ Read one in memory (depends on operation)
 bucket size: max. M-2
◼ Read the other gradually
◼ Properties
Costs: 3(B(R) + B(S))
Limitations on M depends on the operation
Set intersection, difference
◼ Intersection (smaller relation is S)
Load the bucket of S in mem
Restrictions: min(B(R), B(S)) ≤ (M-2)*(M-1)
◼ Difference R-S:
To eliminate duplicates in R, read bucket of R
into mem
Restrictions: B(R) ≤ (M-2)*(M-1)
◼ Difference S-R:
Load the bucket of S in mem
Restrictions: B(S) ≤ (M-2)*(M-1)
PA152, Vlastislav Dohnal, FI MUNI, 2023 68
Set Union
◼ Must eliminate duplicates in R and S
◼ for each i in hash addresses:
◼ read BktS
i , build in-mem hash table & eliminate dups
 also gradually output the records
◼ read BktR
i gradually:
 for each r in BktR
i :
▪ if r not in in-mem hash table
▪ output r and add to in-mem hash table
◼ Restrictions: 𝐵 𝑅 + 𝐵 𝑆 < 𝑀
Need to load both the buckets (at worst) into M
PA152, Vlastislav Dohnal, FI MUNI, 2023 69
Summary
◼ Operations
 distinct, group by, set operations, joins
◼ Algorithm type
 one-pass, one-and-a-half pass, two-pass
◼ Implementation
 Sorting
 Hashing
 Exploiting indexes
◼ Costs
 blocks to read/write
 memory footprint
PA152, Vlastislav Dohnal, FI MUNI, 2023 70
Lecture Takeaways
◼ Influence of algorithm implementation on
costs
◼ Estimated costs influence the choice of
implementation
◼ If more mem is needed (estimation was
wrong)
It is allocated and the operation is not
terminated.
◼ Also, tiny code changes count!
PA152, Vlastislav Dohnal, FI MUNI, 2023 71