Database System Concepts, 6th Ed. ©Silberschatz, Korth and Sudarshan See www.db-book.com for conditions on re-use Chapter 1: Introduction ©Silberschatz, Korth and Sudarshan1.2Database System Concepts - 6th Edition Database Management System (DBMS)  DBMS contains information about a particular enterprise  Collection of interrelated data  Set of programs to access the data  An environment that is both convenient and efficient to use  Database Applications:  Banking: transactions  Airlines: reservations, schedules  Universities: registration, grades  Sales: customers, products, purchases  Online retailers: order tracking, customized recommendations  Manufacturing: production, inventory, orders, supply chain  Human resources: employee records, salaries, tax deductions  Databases can be very large.  Databases touch all aspects of our lives ©Silberschatz, Korth and Sudarshan1.3Database System Concepts - 6th Edition University Database Example  Application program examples  Add new students, instructors, and courses  Register students for courses, and generate class rosters  Assign grades to students, compute grade point averages (GPA) and generate transcripts  In the early days, database applications were built directly on top of file systems ©Silberschatz, Korth and Sudarshan1.4Database System Concepts - 6th Edition Drawbacks of using file systems to store data  Data redundancy and inconsistency  Multiple file formats, duplication of information in different files  Difficulty in accessing data  Need to write a new program to carry out each new task  Data isolation — multiple files and formats  Integrity problems  Integrity constraints (e.g., account balance > 0) become “buried” in program code rather than being stated explicitly  Hard to add new constraints or change existing ones ©Silberschatz, Korth and Sudarshan1.5Database System Concepts - 6th Edition Drawbacks of using file systems to store data (Cont.)  Atomicity of updates  Failures may leave database in an inconsistent state with partial updates carried out  Example: Transfer of funds from one account to another should either complete or not happen at all  Concurrent access by multiple users  Concurrent access needed for performance  Uncontrolled concurrent accesses can lead to inconsistencies – Example: Two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the same time  Security problems  Hard to provide user access to some, but not all, data Database systems offer solutions to all the above problems ©Silberschatz, Korth and Sudarshan1.9Database System Concepts - 6th Edition Data Models  A collection of tools for describing  Data  Data relationships  Data semantics  Data constraints  Relational model  Entity-Relationship data model (mainly for database design)  Object-based data models (Object-oriented and Object-relational)  Semistructured data model (XML)  Other older models:  Network model  Hierarchical model ©Silberschatz, Korth and Sudarshan1.10Database System Concepts - 6th Edition Relational Model  Relational model (Chapter 2)  Example of tabular data in the relational model Columns Rows ©Silberschatz, Korth and Sudarshan1.11Database System Concepts - 6th Edition A Sample Relational Database ©Silberschatz, Korth and Sudarshan1.13Database System Concepts - 6th Edition Data Definition Language (DDL)  Specification notation for defining the database schema Example: create table instructor ( ID char(5), name varchar(20), dept_name varchar(20), salary numeric(8,2))  DDL compiler generates a set of table templates stored in a data dictionary  Data dictionary contains metadata (i.e., data about data)  Database schema  Integrity constraints  Primary key (ID uniquely identifies instructors)  Referential integrity (references constraint in SQL) – e.g. dept_name value in any instructor tuple must appear in department relation  Authorization ©Silberschatz, Korth and Sudarshan1.14Database System Concepts - 6th Edition SQL  SQL: widely used non-procedural language  Example: Find the name of the instructor with ID 22222 select name from instructor where instructor.ID = „22222‟  Example: Find the ID and building of instructors in the Physics dept. select instructor.ID, department.building from instructor, department where instructor.dept_name = department.dept_name and department.dept_name = „Physics‟  Application programs generally access databases through one of  Language extensions to allow embedded SQL  Application program interface (e.g., ODBC/JDBC) which allows SQL queries to be sent to a database  Chapters 3, 4 and 5 ©Silberschatz, Korth and Sudarshan1.16Database System Concepts - 6th Edition Database Design?  Is there any problem with this design? Table instructor ©Silberschatz, Korth and Sudarshan1.17Database System Concepts - 6th Edition Design Approaches  Normalization Theory (Chapter 8)  Formalize what designs are bad, and test for them  Entity Relationship Model (Chapter 7)  Models an enterprise as a collection of entities and relationships  Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects – Described by a set of attributes  Relationship: an association among several entities  Represented diagrammatically by an entity-relationship diagram: ©Silberschatz, Korth and Sudarshan1.18Database System Concepts - 6th Edition The Entity-Relationship Model  Models an enterprise as a collection of entities and relationships  Entity: a “thing” or “object” in the enterprise that is distinguishable from other objects  Described by a set of attributes  Relationship: an association among several entities  Represented diagrammatically by an entity-relationship diagram: What happened to dept_name of instructor? ©Silberschatz, Korth and Sudarshan1.21Database System Concepts - 6th Edition Storage Management  Storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.  The storage manager is responsible to the following tasks:  Interaction with the file manager  Efficient storing, retrieving and updating of data  Issues:  Storage access  File organization  Indexing and hashing ©Silberschatz, Korth and Sudarshan1.22Database System Concepts - 6th Edition Query Processing 1. Parsing and translation 2. Optimization 3. Evaluation ©Silberschatz, Korth and Sudarshan1.24Database System Concepts - 6th Edition Transaction Management  What if the system fails?  What if more than one user is concurrently updating the same data?  A transaction is a collection of operations that performs a single logical function in a database application  Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g., power failures and operating system crashes) and transaction failures.  Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database. ©Silberschatz, Korth and Sudarshan1.26Database System Concepts - 6th Edition Database System Internals ©Silberschatz, Korth and Sudarshan1.27Database System Concepts - 6th Edition Database Architecture The architecture of a database systems is greatly influenced by the underlying computer system on which the database is running:  Centralized  Client-server  Parallel (multi-processor)  Distributed ©Silberschatz, Korth and Sudarshan1.28Database System Concepts - 6th Edition History of Database Systems  1950s and early 1960s:  Data processing using magnetic tapes for storage  Tapes provided only sequential access  Punched cards for input  Late 1960s and 1970s:  Hard disks allowed direct access to data  Network and hierarchical data models in widespread use  Ted Codd defines the relational data model  Would win the ACM Turing Award for this work  IBM Research begins System R prototype  UC Berkeley begins Ingres prototype  High-performance (for the era) transaction processing ©Silberschatz, Korth and Sudarshan1.29Database System Concepts - 6th Edition History (cont.)  1980s:  Research relational prototypes evolve into commercial systems  SQL becomes industrial standard  Parallel and distributed database systems  Object-oriented database systems  1990s:  Large decision support and data-mining applications  Large multi-terabyte data warehouses  Emergence of Web commerce  Early 2000s:  XML and XQuery standards  Automated database administration  Later 2000s:  Giant data storage systems  Google BigTable, Yahoo PNuts, Amazon, .. ©Silberschatz, Korth and Sudarshan1.30Database System Concepts - 6th Edition End of Chapter 1