PA036: DB Project 1. PostgreSQL Vlastislav Dohnal Relational DBMS ◼ PostgreSQL open-source relational database management system ver. 16 – docs ◼ Accessing DB a command-line client: psql ◼ Benchmarking DB a command-line tool: pgbench PA036, Vlastislav Dohnal, FI MUNI, 2025 2 Query Plan ◼ command EXPLAIN shows how a database will execute a query PA036, Vlastislav Dohnal, FI MUNI, 2025 3 EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 7000; QUERY PLAN ------------------------------------------------------------ Seq Scan on tenk1 (cost=0.00..483.00 rows=7001 width=244) Filter: (unique1 < 7000) EXPLAIN SELECT * FROM tenk1 WHERE unique1 < 100; QUERY PLAN ----------------------------------------------------------------- Bitmap Heap Scan on tenk1 (cost=5.07..229.20 rows=101 width=244) Recheck Cond: (unique1 < 100) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) Index Cond: (unique1 < 100) Query Plan (cont.) PA036, Vlastislav Dohnal, FI MUNI, 2025 4 EXPLAIN SELECT * FROM tenk1 t1, tenk2 t2 WHERE t1.unique1 < 100 AND t1.unique2 = t2.unique2; QUERY PLAN ----------------------------------------------------------------------------- Hash Join (cost=230.47..713.98 rows=101 width=488) Hash Cond: (t2.unique2 = t1.unique2) -> Seq Scan on tenk2 t2 (cost=0.00..445.00 rows=10000 width=244) -> Hash (cost=229.20..229.20 rows=101 width=244) -> Bitmap Heap Scan on tenk1 t1 (cost=5.07..229.20 rows=101 width=244) Recheck Cond: (unique1 < 100) -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) Index Cond: (unique1 < 100) Query Plan (cont.) ◼ Add analyze to show real execution times PA036, Vlastislav Dohnal, FI MUNI, 2025 5 EXPLAIN ANALYZE SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000 LIMIT 2; QUERY PLAN ------------------------------------------------------------------------------ Limit (cost=0.29..14.71 rows=2 width=244) (actual time=0.177..0.249 rows=2 loops=1) -> Index Scan using tenk1_unique2 on tenk1 (cost=0.29..72.42 rows=10 width=244) (actual time=0.174..0.244 rows=2 loops=1) Index Cond: (unique2 > 9000) Filter: (unique1 < 100) Rows Removed by Filter: 287 Planning time: 0.096 ms Execution time: 0.336 ms It may signal inefficiency of filter. Planner may show discrepancy in the number of rows If planner is wrong in estimates, try to update statistics – vacuum command. Query Plan (cont.) ◼ Beware of data modifying queries under inspection -> use transactions PA036, Vlastislav Dohnal, FI MUNI, 2025 6 BEGIN; EXPLAIN ANALYZE UPDATE tenk1 SET hundred = hundred + 1 WHERE unique1 < 100; QUERY PLAN ------------------------------------------------------------------------------ Update on tenk1 (cost=5.08..230.08 rows=0 width=0) (actual time=3.791..3.792 rows=0 loops=1) -> Bitmap Heap Scan on tenk1 (cost=5.08..230.08 rows=102 width=10) (actual time=0.069..0.513 rows=100 loops=1) Recheck Cond: (unique1 < 100) Heap Blocks: exact=90 -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.05 rows=102 width=0) (actual time=0.036..0.037 rows=300 loops=1) Index Cond: (unique1 < 100) Planning Time: 0.113 ms Execution Time: 3.850 ms ROLLBACK; Query Plan (cont.) ◼ Add buffers – shows how much data was read PA036, Vlastislav Dohnal, FI MUNI, 2025 7 EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM tenk1 WHERE unique1 < 100 AND unique2 > 9000; QUERY PLAN ------------------------------------------------------------------------------ Bitmap Heap Scan on tenk1 (cost=25.08..60.21 rows=10 width=244) (actual time=0.323..0.342 rows=10 loops=1) Recheck Cond: ((unique1 < 100) AND (unique2 > 9000)) Buffers: shared hit=15 -> BitmapAnd (cost=25.08..25.08 rows=10 width=0) (actual time=0.309..0.309 rows=0 loops=1) Buffers: shared hit=7 -> Bitmap Index Scan on tenk1_unique1 (cost=0.00..5.04 rows=101 width=0) (actual time=0.043..0.043 rows=100 loops=1) Index Cond: (unique1 < 100) Buffers: shared hit=2 -> Bitmap Index Scan on tenk1_unique2 (cost=0.00..19.78 rows=999 width=0) (actual time=0.227..0.227 rows=999 loops=1) Index Cond: (unique2 > 9000) Buffers: shared hit=5 Planning time: 0.088 ms Execution time: 0.423 ms PA036, Vlastislav Dohnal, FI MUNI, 2025 8 Query Rewriting ◼ Modifying the query text to eliminate unnecessary operations  Keeping the result identical!!! ◼ Techniques  Use of indexes ◼ attribute selection, inclusion of attributes, clustering the table by an index  Elimination unnecessary ops (DISTINCT, GROUP BY, …)  (Correlated) subqueries ◼ changing subquery to a join, “with” clause  Temporary tables  Incorrect use of having  Materialized views and its maintenance PA036, Vlastislav Dohnal, FI MUNI, 2025 9 Statistics in PostgreSQL ◼ Relation hotel PA036, Vlastislav Dohnal, FI MUNI, 2025 10 Statistics in PostgreSQL ◼ Attribute hotel.id ◼ Attribute hotel.name PA036, Vlastislav Dohnal, FI MUNI, 2025 11 Statistics in PostgreSQL ◼ Attribute hotel.state ◼ Attribute hotel.distance_to_center Indexes in PostgreSQL ◼ Index types in docs  B-tree, Hash  GiST, SP-GiST ◼ for several two-dimensional geometric data types, ◼ supports “nearest neighbors” queries  GIN ◼ inverted file, for indexing arrays  BRIN (Block Range Index) ◼ stores summaries about the values stored in consecutive physical block ranges of a table, good for values well-correlated with the physical order of the table rows PA036, Vlastislav Dohnal, FI MUNI, 2025 12 Indexes in PostgreSQL (cont.) ◼ Multicolumn indexes Mind the order of attributes ◼ Expression based indexes Evaluate a function to be stored in the index ◼ Partial indexes Index only a subset of rows PA036, Vlastislav Dohnal, FI MUNI, 2025 13 Transactions in Pg ◼ Transactions (a sequence of work done all or none) ACID properties Control commands: ◼ BEGIN - Starts a new transaction. ◼ COMMIT - Saves all changes made in the transaction permanently. ◼ ROLLBACK - Undoes all changes made in the current transaction. PA036, Vlastislav Dohnal, FI MUNI, 2025 14 Transactions in Pg (cont.) ◼ Transactions Control commands within a transaction: ◼ SAVEPOINT - Creates a checkpoint inside a transaction to which you can ROLLBACK. ◼ ROLLBACK TO SAVEPOINT - Undoes changes made after a specific savepoint. ◼ Isolation level determines how transactions interact with each other PA036, Vlastislav Dohnal, FI MUNI, 2025 15 Isolation Levels in Pg ◼ Command: BEGIN; SET TRANSACTION ISOLATION LEVEL …; ◼ Levels: Read uncommitted Read committed (default) Repeatable read Serializable PA036, Vlastislav Dohnal, FI MUNI, 2025 16 Isolation Levels in Pg (cont.)  Read uncommitted (fastest, unsafe) ◼ Transactions can read uncommitted changes from other transactions  Read committed ◼ A transaction sees only committed changes from other transactions. ◼ Each query in the transaction gets a fresh snapshot of the database.  Repeatable read ◼ A transaction sees a consistent snapshot throughout its execution. ◼ Prevents non-repeatable reads but may still allow phantom reads.  Serializable (slowest, correct) ◼ Transactions are executed in a way that ensures they behave as if they were executed sequentially. ◼ Prevents dirty reads, non-repeatable reads, and phantom reads. PA036, Vlastislav Dohnal, FI MUNI, 2025 17 Isolation Levels in Pg (cont.) ◼ Any change to data (INSERT, UPDATE, DELETE) creates a “lock” Lock is release on COMMIT or ROLLBACK. ◼ View locks: SELECT * FROM pg_locks WHERE granted = true; ◼ Typically, we are happy to let handled by Pg automatically. ◼ Typically, we are happy to let handled by Pg automatically. PA036, Vlastislav Dohnal, FI MUNI, 2025 18 Isolation Levels in Pg (cont.) ◼ Row-level locks SELECT… FROM… FOR [UPDATE|SHARE] FOR UPDATE ◼ Locks selected rows, preventing other transactions from modifying them. FOR SHARE ◼ Allows concurrent reads but prevents updates/deletes. PA036, Vlastislav Dohnal, FI MUNI, 2025 19 Isolation Levels in Pg (cont.) ◼ Table-level locks – controls access to the whole table  ACCESS SHARE – Default lock for SELECT statements (allows concurrent reads).  ROW SHARE – Acquired by SELECT ... FOR UPDATE or FOR SHARE.  ROW EXCLUSIVE – Used by INSERT, UPDATE, and DELETE.  SHARE – Prevents writes but allows concurrent reads.  SHARE ROW EXCLUSIVE – Prevents concurrent writes and some reads.  EXCLUSIVE – Allows only the locking transaction to modify the table.  ACCESS EXCLUSIVE – Blocks all reads and writes (used by ALTER TABLE, DROP TABLE). PA036, Vlastislav Dohnal, FI MUNI, 2025 20 ◼ That’ all, folks. PA036, Vlastislav Dohnal, FI MUNI, 2025 21