PA152: Efficient Use of DB 8. Query Optimization Vlastislav Dohnal PA152, Vlastislav Dohnal, FI MUNI, 2023 2 Query Optimization ◼ Generating and comparing query execution plans Pick the best Query Execution Plans Cost estimation Generating Filtering Assigning costs PA152, Vlastislav Dohnal, FI MUNI, 2023 3 Generating Execution Plans ◼ Consider using: Rel. algebra transformation rules Implementations of rel. alg. operations Use of existing indexes Building indexes and sorting on the fly PA152, Vlastislav Dohnal, FI MUNI, 2023 4 Plan Cost Estimation ◼ Depends on costs of each operation  i.e., its implementation ◼ Assumptions for operation costs:  Input is read from and disk  Output is kept in memory  Costs on CPU ◼ Processing on CPU is faster than reading from disk ◼ Can be neglected but often simplified  Network communication costs ◼ Issue in distributed databases  Ignoring contents of mem buffers/caches between queries ◼ Estimated costs of operation  = number of read and write accesses to disk Operation Cost Estimation ◼ Example: settings in PostgreSQL https://www.postgresql.org/docs/15/runtime-config-query.html#RUNTIME-CONFIG-QUERY-CONSTANTS https://www.postgresql.org/docs/15/static/runtime-config-resource.html  seq_page_cost (1.0)  random_page_cost (4.0)  cpu_tuple_cost (0.01)  cpu_index_tuple_cost (0.005)  cpu_operator_cost (0.0025)  shared_buffers (32MB) – ¼ RAM  effective_cache_size (4GB) – ½ RAM  work_mem (8MB) ◼ Memory available to an operation PA152, Vlastislav Dohnal, FI MUNI, 2023 5 PA152, Vlastislav Dohnal, FI MUNI, 2023 6 Operation Cost Estimation ◼ Parameters B(R) – size of relation R in blocks f(R) – max. record count to store in a block M – max. RAM buffers available (in blocks) HT(i) – depth of index i (in levels) LB(i) – sum of all leaf nodes of index i PA152, Vlastislav Dohnal, FI MUNI, 2023 7 Operation Implementation ◼ Based on concept of iterator Open – initialization ◼ preparations before returning any record of result GetNext – return next record of result Close – finalization ◼ release temp buffers, … ◼ Advantages Result may not be returned at once ◼ Does not occupy main memory; may not be materialized on a disk Pipelining can be used PA152, Vlastislav Dohnal, FI MUNI, 2023 8 Accessing Relation: table scan ◼ Relation is not interlaced Reading costs: B(R) TwoPhase-MergeSort = 3B(R) reading/writing ◼ Final writing is ignored ◼ Relation is interlaced Reading costs are up to T(R) blocks! TwoPhase-MergeSort ◼ T(R) + 2B(R) reads and writes R1 R2 R3 R4 R5 R6 R7 R8 … R1 R2 S1 S2 R3 R4 S3 S4 … PA152, Vlastislav Dohnal, FI MUNI, 2023 9 Accessing Relation: index scan ◼ Read relation using an index  Scanning index → reading records ◼ Read index blocks (<< B(R)) ◼ Read records of relation  Applicable to any attribute  Max. costs: ◼ (max. B(R) and T(R) reads) + (up to 𝑚 𝐻𝑇+1 − 1)  Where m is an index arity (LB = 𝑚 𝐻𝑇 ) ◼ Advantages  Can limit to a subset of records (interval) ◼ Min. costs: 0 read blocks of relation + 1..HT blocks of index  For a covering index Max. number of nodes in an m-ary tree PA152, Vlastislav Dohnal, FI MUNI, 2023 10 One-Pass Algorithms ◼ Implementation:  Read relation → Processing → Output buffers  Processing records one by one ◼ Operations  Projection, Selection, Duplicate elimination (DISTINCT) ◼ costs: B(R)  Aggregation functions (GROUP BY) ◼ costs: B(R)  Set operations, cross product ◼ costs: B(R) + B(S) PA152, Vlastislav Dohnal, FI MUNI, 2023 11 Duplicate Elimination – distinct ◼ Procedure  Test whether the record is in output  If not, output the record ◼ Test for existence in output  Store already seen records in memory ◼ Can use M-2 blocks  No data structure: n2 complexity (comparisons)  Use hashing ◼ Limitation: B(R) < M-1 ◼ Can be implemented using iterators? Distinct – example ◼ Relation company(company_key,company_name) PA152, Vlastislav Dohnal, FI MUNI, 2023 12 # explain analyze SELECT DISTINCT company_name FROM provider.company; HashAggregate (cost=438.68..554.67 rows=11600 width=20) (actual time=9.347..12.133 rows=11615 loops=1) Group Key: company_name -> Seq Scan on company (cost=0.00..407.94 rows=12294 width=20) (actual time=0.019..5.007 rows=12295 loops=1) Planning time: 0.063 ms Execution time: 12.799 ms # explain analyze SELECT DISTINCT company_key FROM provider.company; Unique (cost=0.29..359.43 rows=12294 width=8) (actual time=0.041..8.857 rows=12295 loops=1) -> Index Only Scan using company_pkey on company (cost=0.29..328.69 rows=12294 width=8) (actual time=0.039..5.686 rows=12295 loops=1) Heap Fetches: 4726 Planning time: 0.063 ms Execution time: 9.645 ms # explain analyze SELECT DISTINCT company_name FROM provider.company ORDER BY company_name; Unique (cost=1243.05..1304.52 rows=11600 width=20) (actual time=53.468..59.072 rows=11615 loops=1) -> Sort (cost=1243.05..1273.79 rows=12294 width=20) (actual time=53.467..55.482 rows=12295 loops=1) Sort Key: company_name Sort Method: quicksort Memory: 1214kB -> Seq Scan on company (cost=0.00..407.94 rows=12294 width=20) (actual time=0.018..5.338 rows=12295 loops=1) PA152, Vlastislav Dohnal, FI MUNI, 2023 13 Aggregations / Grouping ◼ Procedure  Create groups for group-by attributes  Store accumulated values of aggregation functions ◼ Internal structure  Organize values of grouping attributes, e.g., hashing  Accumulated value of aggregations ◼ MIN, MAX, COUNT, SUM – one value (number) ◼ AVG – two numbers (SUM and COUNT)  Accumulated values are small: M-1 blocks are enough ◼ Iterators: ◼ All prepared in Open ◼ Advantage of pipelining is inapplicable Output block is not reserved. PA152, Vlastislav Dohnal, FI MUNI, 2023 14 Set Operations ◼ Requirement: min(B(R), B(S)) ≤ M-2 Smaller relation read in memory Larger relation is read gradually Set union (possibly also Set difference): ◼ Memory requirements: B(R)+B(S) ≤ M-2 ◼ Assumption R is larger relation, i.e., S is in memory ◼ Implementation Create a temp search structure ◼ E.g., hashing PA152, Vlastislav Dohnal, FI MUNI, 2023 15 Set union Notice: Not multiset union i.e., without ALL in SQL ◼ Read S; construct search structure Eliminate duplicates Output unique records immediately ◼ Read R and check existence of the record in S If present, skip it. If not seen, output it and add to structure ◼ Limitations B(R)+B(S) ≤ M-2 PA152, Vlastislav Dohnal, FI MUNI, 2023 16 Set intersection Notice: Not multiset intersection i.e., without ALL in SQL ◼ Read S; construct search structure Eliminate duplicates ◼ Read R and check existence of the record in S If present, output the record and delete it from structure. If not seen, skip it. ◼ Limitations min(B(R), B(S)) ≤ M-2 PA152, Vlastislav Dohnal, FI MUNI, 2023 17 Set Difference ◼ R–S  Read S; construct search structure ◼ Eliminate duplicates  Read R and check existence of the record in S ◼ If not present, output it  Also insert into internal structure  B(S) + B(R) ≤ M-2 (worse case, but with pipelining) ◼ Or max(B(R),B(S)) ≤ M-2, when preprocessing R (no pipelining) ◼ S–R  Read S; construct search structure ◼ Eliminate duplicates  Read R and check existence of the record in S ◼ If present, delete it from internal structure  Output all remaining recs. in S (no pipel.)  B(S) ≤ M-1 PA152, Vlastislav Dohnal, FI MUNI, 2023 18 Multiset (Bag) Operations ◼ Bag union RBS  Easy exercise… ◼ Bag intersection RBS  Read S; construct search structure ◼ Eliminate duplicates by storing their count  Read R and check existence of the record in S  If record is present, output it ◼ and decrement record count! ◼ If counter is zero, delete it from internal structure  If record is not found, skip it  min(B(R), B(S)) ≤ M-2 PA152, Vlastislav Dohnal, FI MUNI, 2023 19 Multiset (Bag) Operations ◼ Bag difference S–BR  Same idea  If record of R is present in S, decrement its counter  Output internal structure (recs. of S) ◼ with positive count  B(S) ≤ M-1 ◼ Bag difference R–BS  By analogy…  If record of R is not present in S → output  If found, ◼ → if counter is zero, output it ◼ → decrement the counter and skip it  B(S) ≤ M-2 PA152, Vlastislav Dohnal, FI MUNI, 2023 20 Join Operation – one pass version ◼ Cross product  Easy exercise… ◼ Natural join Assume relations R(X,Y), S(Y,Z) ◼ X – unique attributes is R, Z – unique attrs. in S ◼ Y – common attributes in R and S Read S; construct search structure on Y For each record of R, find all matching recs. of S ◼ Output concatenation of all combinations (eliminate repeating attributes Y) ◼ Outer join ? PA152, Vlastislav Dohnal, FI MUNI, 2023 21 One-Pass Algorithms ◼ Summary Unary operation: op(R) ◼ B(R) ≤ M-1, 1 block for output; some need 1 for input Binary operation: R op S ◼ B(S) ≤ M-2, 1 block for R, 1 block for output  Some ops require: B(R)+B(S) ≤ M-2 or max(B(R),B(S))>3) ◼ Storage of relation Important for final costs ◼ Interlaced → each record needs one I/O ◼ Non-interlaced → each record needs B(R)/T(R) I/Os only ◼ Applicable to any join condition theta joins PA152, Vlastislav Dohnal, FI MUNI, 2023 26 Two-Pass Algorithms ◼ Procedure: Preprocess input relation → store it ◼ Sorting (Multi-way MergeSort) ◼ Hashing Processing ◼ Operations: Joins Duplicate elimination (DISTINCT) Aggregations (GROUP BY) Set operations PA152, Vlastislav Dohnal, FI MUNI, 2023 27 memory R S ... sorted runs R S sorted relations ... ... join result pass to merge relations Join Algorithms – MergeJoin ◼ R S R(X,Y), S(Y,Z) disk PA152, Vlastislav Dohnal, FI MUNI, 2023 28 Join Algorithms – MergeJoin ◼ R S R(X,Y), S(Y,Z) ◼ Algorithm: Sort R and S i = 1; j = 1; while (i ≤ T(R))  (j ≤ T(S)) do ◼ if R[i].Y = S[j].Y then doJoin() ◼ else if R[i].Y > S[j].Y then j = j+1 ◼ else if R[i].Y < S[j].Y then i = i+1 PA152, Vlastislav Dohnal, FI MUNI, 2023 29 Join Algorithms – MergeJoin ◼ Function doJoin(): Proceed nested-loop join for records of same Y while (R[i].Y = S[j].Y)  (i ≤ T(R)) do ◼ j2 = j ◼ while (R[i].Y = S[j2].Y)  (j2 ≤ T(S)) do  Output joined R[i] and S[j2]  j2 = j2 + 1 ◼ i = i + 1 j = j2 PA152, Vlastislav Dohnal, FI MUNI, 2023 30 Join Algorithms – MergeJoin i R[i].Y S[j].Y j 1 10 5 1 2 20 20 2 3 20 20 3 4 30 30 4 5 40 30 5 50 6 52 7 PA152, Vlastislav Dohnal, FI MUNI, 2023 31 Join Algorithms – MergeJoin ◼ Costs MergeSort of R and S → 4(B(R) + B(S)) MergeJoin → B(R) + B(S) ◼ Example (M=102) MergeJoin ◼ Sorting: 4(1000 + 500) = 6000 read/write IOs ◼ Joining: 1000 + 500 = 1500 read IOs ◼ Total: 7500 read/write IOs Original cached block-based nested-loop join ◼ 5500 read IOs PA152, Vlastislav Dohnal, FI MUNI, 2023 32 Join Algorithms – MergeJoin ◼ Another example B(R) = 10 000 B(S) = 5 000 M = 102 blocks Cached Block-based Nested-loop Join ◼ (5 000/100)  (100 + 10 000) = 505 000 read IOs MergeJoin ◼ 5(10 000 + 5 000) = 75 000 read/write IOs 10x larger relations!!! PA152, Vlastislav Dohnal, FI MUNI, 2023 33 Join Algorithms – MergeJoin ◼ MergeJoin Preprocessing is expensive ◼ If relations are sorted by Y, can be omitted. ◼ Analysis of IO costs MergeJoin ◼ linear complexity Cached Block-based Nested-loop Join ◼ quadratic complexity → from a certain size of relations, MergeJoin is better PA152, Vlastislav Dohnal, FI MUNI, 2023 34 Join Algorithms – MergeJoin ◼ Memory requirements Limitation to max 𝐵 𝑅 , 𝐵 𝑆 < 𝑀2 ◼ Optimal memory size Using MergeSort on relation R ◼ Number of runs = Τ𝐵 𝑅 𝑀, Run length = 𝑀 ◼ Limitation: number of runs ≤ 𝑀 − 1 ◼ Τ𝐵 𝑅 𝑀 < 𝑀 → 𝐵 𝑅 < 𝑀2 → 𝑀 > 𝐵 𝑅 ◼ Example B(R) = 1000 → M>31.62 B(S) = 500 → M>22.36 PA152, Vlastislav Dohnal, FI MUNI, 2023 35 Join Algorithms – MergeJoin→SortJoin ◼ Improvement: Not necessary to have the relations sorted completely R S Can join directly? sorted runs (1st phase of MergeSort) PA152, Vlastislav Dohnal, FI MUNI, 2023 36 Join Algorithms – SortJoin ◼ Improvement Prepare sorted runs of R and S Read 1st block of all runs (R and S) Get min value in Y ◼ Find corresponding records in other runs ◼ Join them ◼ In case too many records with the same Y Apply block-nested-loop join in the remaining memory PA152, Vlastislav Dohnal, FI MUNI, 2023 37 Join Algorithms – SortJoin ◼ Costs Sorted runs: 2(B(R) + B(S)) Joining: B(R) + B(S) ◼ Limitations Run length = M, number of runs < M → B(R) + B(S) < M(M-1) ◼ Example (M=102) Sorting: 2(1000 + 500) Joining: 1000 + 500 Total: 4 500 read/write IOs ◼ → better than cached block-based nested-loop join PA152, Vlastislav Dohnal, FI MUNI, 2023 38 Join Algorithms – HashJoin ◼ R S R(X,Y), S(Y,Z) ... ... M-1R ... ... M-1S memory buckets join PA152, Vlastislav Dohnal, FI MUNI, 2023 39 Join Algorithms – HashJoin ◼ R S R(X,Y), S(Y,Z) Define a hash function for attributes Y Create hashed index of R and S ◼ Address space is M-1 buckets For each i  [0,M-2] ◼ Read bucket i of R and S ◼ Find matching records and join them PA152, Vlastislav Dohnal, FI MUNI, 2023 40 Join Algorithms – HashJoin ◼ Joining buckets Read whole bucket of S (≤ M-2) ◼ May create an internal structure to speed up Read bucket of R block by block BucketsofR ... Si memory ... BucketsofS PA152, Vlastislav Dohnal, FI MUNI, 2023 41 Join Algorithms – HashJoin ◼ Costs: Create hashed index: 2(B(R)+B(S)) Bucket joining: B(R)+B(S) ◼ Limitations: Size of each bucket of S ≤ M-2 ◼ Estimate: 𝑚𝑖𝑛 𝐵 𝑅 , 𝐵 𝑆 < 𝑀 − 1 . (𝑀 − 2) ◼ Example: Hashing: 2(1000+500) Joining: 1000+500 Total: 4 500 read/write IOs PA152, Vlastislav Dohnal, FI MUNI, 2023 42 Join Algorithms – HashJoin ◼ Minimum memory requirements Hashing S; optimal bucket occupation ◼ Memory buffer: M blocks ◼ Bucket size = B(S) / (M-1)  This must be smaller than M (due to joining)  → Τ𝐵(𝑆) 𝑀 − 1 ≤ 𝑀 − 2 ◼ ≈ 𝑀 − 1 > 𝐵 𝑆 PA152, Vlastislav Dohnal, FI MUNI, 2023 43 Join Algorithms – HashJoin ◼ Optimization keep some buckets in memory Hybrid HashJoin ◼ Bucketing of S – Optimal size B(S)=500  𝐵 𝑆  23 i.e., each bucket is of 22 blocks M=102 ◼ → keep 3 buckets in memory (66 blocks) ◼ → 36 blocks of memory to spare PA152, Vlastislav Dohnal, FI MUNI, 2023 44 Join Algorithm – Hybrid HashJoin ◼ Preprocessing S Contents of memory buffer Memory usage (M=102): G0-2 3*22 blocks Other buckets 23-3 blocks Reading S 1 block output 1 block Total 88 blocks 14 blocks are available! memory G0 G2 in ... 22 blocks 23-3=20 buckets S ...... PA152, Vlastislav Dohnal, FI MUNI, 2023 45 Join Algorithm – Hybrid HashJoin ◼ Structure of memory to hash R 1000/23 = 44 blocks per bucket Records hashed to bucket 0-2 ◼ Join immediately with S0-2 buckets (in memory) → output memory G0 G2 in ... 44 blocks 23-3=20 buckets R ... ... PA152, Vlastislav Dohnal, FI MUNI, 2023 46 Join Algorithm – Hybrid HashJoin ◼ Joining buckets Do for buckets with id 3-22 Read one whole bucket in memory; read the other bucket block by block memory Hi output ...22 23-3=20 result ... 44 23-3=20 Buckets of S Buckets of R one bucket of S one block of bucket of R PA152, Vlastislav Dohnal, FI MUNI, 2023 47 Join Algorithm – Hybrid HashJoin ◼ Costs: Bucketize S: 500 + 2022 = 940 read/write IOs Bucketize R: 1000 + 2044 = 1880 read/write IOs ◼ Only 20 buckets to write! Joining: 2044 + 2022 = 1320 read IOs ◼ Three buckets are already done (during bucketizing R) In total: 4140 read/write IOs PA152, Vlastislav Dohnal, FI MUNI, 2023 48 Join Algorithms ◼ Hybrid HashJoin How many buckets to keep in memory? ◼ Empirically: 1 bucket ◼ Hashing record pointers Organize pointers to records instead of records themselves ◼ Store pairs [key value, rec. pointer] in buckets Joining ◼ If match, we must read the records PA152, Vlastislav Dohnal, FI MUNI, 2023 49 Join Algorithm – Hashing Pointers ◼ Example 100 key-pointer pairs fit in one block Estimate results size: 100 recs Costs: ◼ Bucketize S in memory (500 IOs)  5000 records → 5000/100 blocks = 50 blocks in memory ◼ Joining – read R gradually and join  If match, read full records of S → 100 read IOs ◼ Total: 500 + 1000 + 100 = 1600 read IOs PA152, Vlastislav Dohnal, FI MUNI, 2023 50 Join Algorithms – IndexJoin ◼ R S R(X,Y), S(Y,Z) ◼ Assume: Index on attributes Y of R ◼ Procedure: For each record s  S Look up matches in index → records A ◼ For each record r  A ◼ Output concatenation of r and s PA152, Vlastislav Dohnal, FI MUNI, 2023 51 Join Algorithms – IndexJoin ◼ Example Assume ◼ Index on Y of R: HT=2, LB=200 ◼ Scenario 1 Index fits in memory Costs: ◼ Pass of S: 500 read IOs (B(S)=500, T(S)=5000) ◼ Searching in index: for free  If match, read record of R → 1 read IO PA152, Vlastislav Dohnal, FI MUNI, 2023 52 Join Algorithms – IndexJoin ◼ Costs Depends on the number of matches Variants: ◼ A) Y in R is primary key; Y in S is foreign key → 1 record Costs: 500 + 500011 = 5500 read IOs ◼ B) V(R,Y) = 5000 T(R) = 10 000 uniform distribution → 2 records Costs: 500 + 500021 = 10500 read IOs ◼ C) DOM(R,Y)=1 000 000 T(R) = 10 000 → 10k/1m = 1/100 of record Costs: 500 + 5000(1/100)1 = 550 read IOs PA152, Vlastislav Dohnal, FI MUNI, 2023 53 Join Algorithms – IndexJoin ◼ Scenario 2 Index does not fit in memory Index on Y of R is of 201 blocks ◼ Keep root-node block and 99 leaf-node blocks in memory M=102 Costs for searching ◼ 0(99/200) + 1(101/200) = 0.505 read IOs per search (query) PA152, Vlastislav Dohnal, FI MUNI, 2023 54 Join Algorithms – IndexJoin ◼ Scenario 2 Costs ◼ B(S) + T(S)(searching index + reading records) Variants: ◼ A) → 1 record Costs: 500 + 5000(0.5+1) = 8000 read IOs ◼ B) → 2 records Costs: 500 + 5000(0.5+2) = 13000 read IOs ◼ C) → 1/100 of record Costs: 500 + 5000(0.5+1/100) = 3050 read IOs PA152, Vlastislav Dohnal, FI MUNI, 2023 55 Join Algorithms – Summary R  S B(R) = 1000 B(S) = 500 Algorithm Costs Cached Block-based Nested-loop Join 5500 Merge Join (w/o sorting) 1500 Merge Join (with sorting) 7500 Sort Join 4500 Index Join (R.Y index) 8000 → 550 Hash Join 4500 Hybrid 4140 Pointers 1600 PA152, Vlastislav Dohnal, FI MUNI, 2023 Join Algorithms – Summary R  S Assume B(S) < B(R), Y are common attributes Algorithm Costs Limits Block-based Nested-loop B(S)  (1+B(R)) M=3 Cached version B(S)/(M-2)  (M-2 + B(R)) M3 Merge Join (w/o sorting) B(R) + B(S) M=3 Merge Join (with sorting) 5  (B(R) + B(S)) 𝑀 = 𝐵 𝑅 Sort Join 3  (B(R) + B(S)) 𝑀 = 𝐵 𝑅 + 𝐵 𝑆 + 1 Index Join (R.Y index) (max costs) B(S) + T(S)  (HT + ) e.g.  = T(R)/V(R,Y) min. M=4 Hash Join 3  (B(R) + B(S)) 𝑀 = 2 + 𝐵 𝑆 max. M-1 buckets Hybrid 3 𝐵 𝑅 + 𝐵(𝑆) − 2 𝐵 𝑅 + 𝐵(𝑆) 𝐵 𝑅 𝑀 = 𝐵(𝑅) 𝐵 𝑅 + 𝐵 𝑅 + 1 Pointers B(S)+B(R)+T(R)   e.g.  = T(S)/V(S,Y) M=B(hash index on S)+3 56 PA152, Vlastislav Dohnal, FI MUNI, 2023 57 Join Algorithms – Recommendation ◼ Cached Block-based Nested-loop Join  Good for small relations (relative to memory size) ◼ HashJoin  For equi-joins (equality on attributes only)  Relations are not sorted or no indexes ◼ SortJoin  Good for non-equi-joins  E.g., R.Y > S.Y ◼ MergeJoin  If relations are already sorted ◼ IndexJoin  If index exists, it could be useful  Depends on expected result size PA152, Vlastislav Dohnal, FI MUNI, 2023 58 Two-Pass Algorithms ◼ Using sorting Duplicate Elimination Aggregations (GROUP BY) Set operations PA152, Vlastislav Dohnal, FI MUNI, 2023 59 Duplicate Elimination ◼ Procedure Do 1st phase of MergeSort ◼ → sorted runs on disk Read all runs block by block ◼ Find smallest record and output it ◼ Skip all duplicate records ◼ Properties Costs: 3B(R) Limitations: B(R) ≤ M*(M-1) ◼ Optimal M ≥ 𝐵 𝑅 + 1 PA152, Vlastislav Dohnal, FI MUNI, 2023 60 Aggregations ◼ Procedure (analogous to previous) Sort runs of R (by group-by attributes) Read all runs block by block ◼ Find smallest value → new group  Compute all aggregates over all records of this group  No more record in this group → output it ◼ Properties Costs: 3B(R) Limitations: B(R) ≤ M*(M-1) ◼ Optimal M ≥ 𝐵 𝑅 + 1 PA152, Vlastislav Dohnal, FI MUNI, 2023 61 Set union ◼ Notice: No two-pass algo for bag union ◼ Set union Do 1st phase of MergeSort on R and S ◼ → sorted runs on disk Read all runs (both R and S) gradually ◼ Find the first remaining record and output it ◼ Skip all duplicates of this record (in R and S) ◼ Properties Costs: 3(B(R) + B(S)) Limitations: 𝐵 𝑅 + 𝐵 𝑆 ≤ 𝑀 ◼ Need one block per all runs (of R and also S) PA152, Vlastislav Dohnal, FI MUNI, 2023 62 Set/bag intersection and difference ◼ RS, R-S, RBS, R-BS ◼ Procedure Do 1st phase of MergeSort on R and S Read all runs (both R and S) gradually ◼ Find the first remaining record t ◼ Count t’s occurrences in R and S (separately)  #R, #S ◼ Copy to output (respecting specific operation) PA152, Vlastislav Dohnal, FI MUNI, 2023 63 Set/bag intersection and difference ◼ On copy to output: RS: output t, ◼ if #R > 0  #S > 0 RBS: output t min(#R,#S)-times R-S: output t, ◼ if #R > 0  #S = 0 R-BS: output t max(#R - #S,0)-times ◼ Properties Costs: 3(B(R) + B(S)) Limitations: 𝐵 𝑅 + 𝐵 𝑆 ≤ 𝑀 ◼ Need one block per all runs (of R and also S) PA152, Vlastislav Dohnal, FI MUNI, 2023 64 Two-Pass Algorithms ◼ Using hashing Duplicate Elimination Aggregations (GROUP BY) Set operations PA152, Vlastislav Dohnal, FI MUNI, 2023 65 Duplicate Elimination ◼ Procedure Bucketize R into M-1 buckets ◼ → store buckets on disk For each bucket ◼ Read it in memory and remove duplicates; output remaining records  bucket size is max. M-1 blocks ◼ Properties Costs: 3B(R) Limitations: B(R) ≤ (M-1)2 PA152, Vlastislav Dohnal, FI MUNI, 2023 66 Aggregations ◼ Procedure (analogous to previous)  Bucketize R into M-1 buckets by group-by attrs. ◼ → store buckets on disk  For each bucket ◼ Read block by block in memory and ◼ Create groups for new values and compute aggregates  Limit on bucket size is not defined. But groups and partial aggregates must fit in max. M-1 blocks. ◼ Output results ◼ Properties  Costs: 3B(R)  Limitations: B(R) ≤ (M-1)2 lze téměř zrušitcan be relaxed PA152, Vlastislav Dohnal, FI MUNI, 2023 67 Set union, intersection, difference ◼ Procedure Bucketize R and S (the same hash function) ◼ into M-1 buckets Process the pair of buckets Ri and Si ◼ Read one in memory (depends on operation)  bucket size: max. M-2 ◼ Read the other gradually ◼ Properties Costs: 3(B(R) + B(S)) Limitations on M depends on the operation Set intersection, difference ◼ Intersection (smaller relation is S) Load the bucket of S in mem Restrictions: min(B(R), B(S)) ≤ (M-2)*(M-1) ◼ Difference R-S: To eliminate duplicates in R, read bucket of R into mem Restrictions: B(R) ≤ (M-2)*(M-1) ◼ Difference S-R: Load the bucket of S in mem Restrictions: B(S) ≤ (M-2)*(M-1) PA152, Vlastislav Dohnal, FI MUNI, 2023 68 Set Union ◼ Must eliminate duplicates in R and S ◼ for each i in hash addresses: ◼ read BktS i , build in-mem hash table & eliminate dups  also gradually output the records ◼ read BktR i gradually:  for each r in BktR i : ▪ if r not in in-mem hash table ▪ output r and add to in-mem hash table ◼ Restrictions: 𝐵 𝑅 + 𝐵 𝑆 < 𝑀 Need to load both the buckets (at worst) into M PA152, Vlastislav Dohnal, FI MUNI, 2023 69 Summary ◼ Operations  distinct, group by, set operations, joins ◼ Algorithm type  one-pass, one-and-a-half pass, two-pass ◼ Implementation  Sorting  Hashing  Exploiting indexes ◼ Costs  blocks to read/write  memory footprint PA152, Vlastislav Dohnal, FI MUNI, 2023 70 Lecture Takeaways ◼ Influence of algorithm implementation on costs ◼ Estimated costs influence the choice of implementation ◼ If more mem is needed (estimation was wrong) It is allocated and the operation is not terminated. ◼ Also, tiny code changes count! PA152, Vlastislav Dohnal, FI MUNI, 2023 71