Structured Analysis Lecture 6 1© Jiří Sochor, Jaroslav Ráček Outline  Yourdon Modern Structured Analysis (YMSA)  Context diagram (CD)  Data flow diagram (DFD)  Data modelling  Entity relationship diagram (ERD)  Normalization and database design 2© Jiří Sochor, Jaroslav Ráček Yourdon Modern Structured Analysis (YMSA) Lecture 6/Part 1 3© Jiří Sochor, Jaroslav Ráček E. Yourdon: Modern structured analysis © Jiří Sochor, Jaroslav Ráček 4 environment model behavioral model top-down and bottom-up balancing Environment model  Context diagram is a special case of a data flow diagram, containing a single process representing the whole system. It emphasizes:  Terminators – people and systems communicatingwith the system  Data received from the environment that shall be processed  Data produced by the system and sent to the environment  Data stores shared by the system and its terminators  System boundary  Event list is a textual list of stimuli coming from the environment that must be responded by the system. © Jiří Sochor, Jaroslav Ráček 5 Context diagram example © Jiří Sochor, Jaroslav Ráček 6 Behavioral model  Behavioral model specifies the flow of data through the modeled information system, modeling its process aspects.  It shows what kinds of information will be input to and output from the system, where the data will come from and go to, and where the data will be stored.  It does not show information about the timing of processes, or information about whether processes will operate in sequence or in parallel.  Data flow diagram (DFD) is a graphical representation of the system as a network of processes that fulfill system functions and communicate through system data. © Jiří Sochor, Jaroslav Ráček 7 Data flow diagram (DFD)  DFD consists of four types of elements:  Processes  Data flows  Data stores  Terminators © Jiří Sochor, Jaroslav Ráček 8 Processes and Data flows  A Process models a part of the system that transforms specific inputs to outputs.  Name of a process is a single word, phrase or simple sentence, e.g. “User authentication”.  The process name sometimes contains the name of a person, group of people, department or device – specifying also the actor or tool of the process.  A Data flow models a way for data transfer from one part of the system to another.  Flows can also model the transfer of physical materials. © Jiří Sochor, Jaroslav Ráček 9 Data stores  Data store models a static collection of data that are shared by two or more processes operating in different time.  Name is a plural of the data name going to and coming from the data store. © Jiří Sochor, Jaroslav Ráček 10 Terminators  A Terminator represents an external entity communicating with the system.  Terminators are external to the modeled system.  The flows connecting terminators with the processes or data stores inside the system represent the interfaces between the system and its environment. © Jiří Sochor, Jaroslav Ráček 11 Top-down and bottom-up DFD balancing © Jiří Sochor, Jaroslav Ráček 12 primary DFD 98 processes first balancing 14 processes system DFD 2 processes Data modelling Lecture 6/Part 2 13© Clear View Training 2010 v2.6 Data modeling  Defines static data structure, relationships and attributes  Complementary to the behavior model in structured analysis; models information not covered by DFDs  More stable and essential information comparing to DFD  Entity-Relationship modeling  Identify system entities – both abstract (lecture) and concrete (student)  For each entity examine – the purpose of the entity, its constituents (attributes) and relationships among entities  Check model consistency and include data details © Jiří Sochor, Jaroslav Ráček 14 Entity Relationship Diagram (ERD)  Entities and their types  Relationships and their types  Attributes and their domains © Jiří Sochor, Jaroslav Ráček 15 Entities and Entity types  An Entity is anything about which we want to store data  Identifiable – entities can be distinguished by their identity  Needed – has significant role in the designed system  Described by attributes shared by all entities of the same type  An Entity set is a set of entities of the same Entity type. © Jiří Sochor, Jaroslav Ráček 16 Entity Entity type You Student Your neighbor Student Me Teacher This PB007 lecture Lecture Student Lecture Teacher Relationships and Relationship types  Entities take part in Relationships (among possibly more than two entities), that can often be identified from verbs or verb phrases.  You are attending this PB007 lecture.  I am giving this PB007 lecture.  A Relationship set is a set of relationships of the same Relationship type.  A student attends a lecture.  A teacher gives a lecture. © Jiří Sochor, Jaroslav Ráček 17 Student Lecture attends gives Teacher Attributes and Attribute domains  An Attribute is a fact, aspect, property, or detail about either an entity type or a relationship type.  E.g. a lecture might have attributes: time, date, length, place.  An Attribute type is a type domain of the attribute. If the domain is complex (domain of an attribute address), the attribute may be an entity type instead. © Jiří Sochor, Jaroslav Ráček 18 Lecture time date length place Attributes or entities?  To decide whether a concept be modeled as an attribute or an entity type:  Do we wish to store any information about this concept (other than an identifying name)?  Is it single-valued?  E.g. objectives of a course – are they more than one? If just one, how complex information do we want to store about it?  General guidelines:  Entities can have attributes but attributes have no smaller parts.  Entities can have relationships between them, but an attribute belongs to a single entity. © Jiří Sochor, Jaroslav Ráček 19 Relationship-type degree © Jiří Sochor, Jaroslav Ráček 20 Entita Entita Entita Entita Exactly one occurence None or one occurence One or more occurence None or more occurences Relationship-type degree © Jiří Sochor, Jaroslav Ráček 21 Every manager manages exactly one department. Every department is managed by exactly one manager. Every edition plan contains one or more titles. Every book title is part of exactly one edition plan. Every producer produces one or more products. Every product is produced by one or more producers. Relationship-type degree © Jiří Sochor, Jaroslav Ráček 22 Mandatory relationship Optional relationship Recursive relationship Cardinality ratio  Cardinality ratio of a relationship type describes the number of entities that can participate in the relationship.  One to one 1:1  Each lecturer has a unique office.  One to many 1:N  A lecturer may tutor many students, but each student has just one tutor.  Many to many M:N  Each student takes several modules, and each module is taken by several students. © Jiří Sochor, Jaroslav Ráček 23 More relationships between two entities  Relationship “nabízí” has attributes:  platební podmínky, termíny.  Relationship “dodal” has attributes:  údaje z dodacího listu. © Jiří Sochor, Jaroslav Ráček 24 Relationships among more than two entities © Jiří Sochor, Jaroslav Ráček 25 Association entity  The Purchase contract exists just as a result of the relationship between the Customer and Goods entity. © Jiří Sochor, Jaroslav Ráček 26 association entity Super-type and sub-type entities  Extended ERDs model also inheritance, i.e. the relationship of specialization–generalization © Jiří Sochor, Jaroslav Ráček 27 super-type entity sub-type entity Super-types and sub-types in ERD © Jiří Sochor, Jaroslav Ráček 28 ERD modeling in structured analysis  Iterative development in structured analysis  Entities identification -> initial ERD  Attributes identification -> detailed ERD  Identification of missing and redundant entities  ERD-DFD consistency checking  Modeled in parallel with DFD © Jiří Sochor, Jaroslav Ráček 29 ERD modeling guidelines 1. Initial ERD  Domain analysis and user interview  Entities identification  Analogical to UML class identification 2. Detailed ERD  Entities refinement  Attributes identification based on  Behavioral DFD models  Data dictionary provided by the customer © Jiří Sochor, Jaroslav Ráček 30 ERD modeling guidelines 3. Identification of missing and redundant entities  Entities constituting of only the identifier  Entity sets consisting of a single entity  Association entities  Derived entities and relationships 4. Consistency and completeness checking  Based on DFDs and DE (Data elements) © Jiří Sochor, Jaroslav Ráček 31 Removal of unneeded (redundant) entities © Jiří Sochor, Jaroslav Ráček 32 The Spouse entity is better suited as Employee’s attribute. Removal of unneeded (redundant) entities © Jiří Sochor, Jaroslav Ráček 33 Removal of unneeded relationships © Jiří Sochor, Jaroslav Ráček 34 The duty to renew the license can be derived from the entities Data dictionary  Used for documentation of complex ERD models  Symbols:  = consists of  + and  ( ) optional part (0 or 1)  [ | ] alternative choice  { } iteration (1 or more) a=1{b}15  * * comment  @ identifier (key) © Jiří Sochor, Jaroslav Ráček 35 Example – Order  Order no. 2012-007-24  Uni BookStore, 70 Austin St, 718-793-1395 New York  Issue date: 23.4.2012 Deliverydate: 30.4.2012  Customer: no. 007 Dr. John Smith  Goods: Number Name Pieces Price/piece P3876 Software engineering 6 65 H4681 UML2 and the UP 4 48 © Jiří Sochor, Jaroslav Ráček 36 ERD example – Transport © Jiří Sochor, Jaroslav Ráček 37 ERD example – Library © Jiří Sochor, Jaroslav Ráček 38 Relational database design based on ERDs  Entity-relationship modeling is a first step towards database design. Database design process: 1. Determine the purpose of the database. 2. Find and organize the information required - Create ERD model of the system. Each entity type becomes a table, attribute becomes a column, entity becomes a row in the table. Handle relationships with attributes, association entities and M:N relationships. 39© Jiří Sochor, Jaroslav Ráček Database design process (continued) 3. Specify primary keys - Choose each table’s primary key. The primary key is a column that is used to uniquely identify each row. An example might be Product ID or Order ID. 4. Apply the normalization rules - Apply the data normalization rules to see if tables are structured correctly. Make adjustments to the tables. 5. Refine the design - Analyze the design for errors. Create tables and add a few records of sample data. Check if results come from the tables as expected. Make adjustments to the design, as needed. 40© Jiří Sochor, Jaroslav Ráček Association entities © Jiří Sochor, Jaroslav Ráček 41 … or association entity. Order can be an entity on its own… Relationships to entities © Jiří Sochor, Jaroslav Ráček 42 M:N relationships © Jiří Sochor, Jaroslav Ráček 43 Entities and keys Unambiguous identification  Every entity is uniquely identified by its key.  Non-redundance  All items of the key are necessary to identify an entity, no item can be removed from the key.  Candidate keys  There are more combinations of entity attributes that can be used as an entity key.  Primary key  The selected candidate key, marked with # symbol. © Jiří Sochor, Jaroslav Ráček 44 Data normalization by E.F. Codd  Free the database of modification anomalies  Update anomaly – the same information expressed on multiple rows -> updates resulting in logical inconsistencies.  Insertion anomaly – certain facts cannot be recorded, because of their binding with another information into one record.  Deletion anomaly – deletion of data representing certain facts necessitating deletion of unrelated data.  Minimize redesign when extending the database structure  Make the data model more informative to users  Avoid bias towards any particular pattern of querying © Jiří Sochor, Jaroslav Ráček 45 © Jiří Sochor, Jaroslav Ráček 46 1. Normal form (1.NF) zam# jmeno sex odb# praxe osoba odbornost zam# jmeno sex 1. Normal form zam# +odb# praxe Def.1.NF: The records contain no repeating groups. © Jiří Sochor, Jaroslav Ráček 47 1. Normal form (1.NF) © Jiří Sochor, Jaroslav Ráček 48 1. Normal form (1.NF) Full functional dependency  Functional dependency  In a given table, an attribute Y is said to have a functional dependency on a set of attributes X if and only if each X value is associatedwith precisely one Y value.  Trivial functional dependency  A trivial functional dependency is a functional dependency of an attribute on a superset of itself.  Full functional dependency  An attribute is fully functionally dependent on a set of attributes X if it is: functionally dependent on X, and not functionally dependent on any proper subset of X. © Jiří Sochor, Jaroslav Ráček 49 © Jiří Sochor, Jaroslav Ráček 50 2. Normal form zam# jméno plat projekt# datum dokončení programátor# balík# jméno progr. jméno balíku hodiny Př.: Entita ZAMESTNANEC Př.: EntitaAKTIVITA_PROGRAMATORA Is in 2.NF. Not in 2.NF. 2.NF requires full functional dependency of all non-key attributes on the whole key. © Jiří Sochor, Jaroslav Ráček 51 2. Normal form – normalization example součást# dodavatel# název dodav. údaje dodav. cena dodavatel# název dodav. údaje dodav. součást# dodavatel# cena © Jiří Sochor, Jaroslav Ráček 52 Problems of „not being“ in 2.NF – examples Dokud nám dodavatel nedodá součást, nemůžeme zapsat jeho adresu a další údaje. Pokud přestane dodavatel dočasně zásobovat, pak zrušení záznamu o součásti zruší i jeho údaje. Jakákoliv změna v údajích o dodavatelích je komplikovaná (vyhledání a oprava více záznamů). © Jiří Sochor, Jaroslav Ráček 53 2. Normal form zam# jméno plat projekt# datum dokončení Př.: Entita ZAMESTNANEC functional dependence of the primary key functional dependence of an alternative key Def.: 2.NF: Record R is in 2.NF if it is in 1.NF and every non-key attribute of R is fully functionally dependent on every candidate key of R. © Jiří Sochor, Jaroslav Ráček 54 3. Normal form – transitive dependence A B C A B B C C is transitively dependent on A Def. 3NF (alternative 1): Record R is in 3.NF if it is in 2.NF and every attribute of R is functionally dependent on a key and nothing but a key. Def. 3.NF (alternative 2): Record R is in 3.NF if it is in 2.NF and every non-key attribute of R is non-transitively dependent on every candidate key of R. © Jiří Sochor, Jaroslav Ráček 55 Problems of „not being“ in 3.NF Př.: ZAMESTNANEC zam# jméno plat projekt# datum dokončení není 3.NF Problems of „not being“ in 3.NF - Dokud nepřidělíme pracovníky na projekt, nemůžeme zapsat datum ukončení. - Jestliže všichni opustí projekt, zrušíme veškerou informaci o datu ukončení. - Změnu data ukončení je nutné provést na mnoha místech. © Jiří Sochor, Jaroslav Ráček 56 4. Normal form – conditional dependence zákazník# jméno adresa stát daň zákazník# jméno adresa stát zákazník# daň ZÁKAZNÍCI ZÁKAZNÍCI DOMÁCÍ conditional dependence 4.NF removes conditional functional dependencies Daň strháváme těm, kteří sídlí ve stejném státě jako naše firma. © Jiří Sochor, Jaroslav Ráček 57 Example – 4.NF normalization repeating groups primary key © Jiří Sochor, Jaroslav Ráček 58 Example – 4.NF normalization transitively dependent attributes © Jiří Sochor, Jaroslav Ráček 59 Example – 4.NF normalization done © Jiří Sochor, Jaroslav Ráček 60 Example – 4.NF normalization attribute dependent on a part of the key part of the key © Jiří Sochor, Jaroslav Ráček 61 Example – 4.NF normalization Summary: All the entity sets were normalized into 4.NF and their names changed accordingly. © Jiří Sochor, Jaroslav Ráček 62 Example – 4.NF normalization and ERD generation ERD vs. UML Class Diagram 63© Jiří Sochor, Jaroslav Ráček  Class diagrams  model both structural and behavior features of a system (attribute and operations),  contain many different types of relationships (association, aggregation, composition, dependency, generalization), and  are more likely to map into real-world objects.  Entity relationship models  model only structural data view with a low variety of relationships (simple relations and rarely generalization), and  are more likely to map into database tables (repetitive records).  They allow us to design primary and foreign entity keys, and used to be normalized to simplify data manipulation. ERD vs. UML Class Diagram 64© Jiří Sochor, Jaroslav Ráček  Although there can be one to one mapping between ERD and Class diagram, it is very common that  one class is mapped to more than one entity, or  more classes are mapped to a single entity.  Furthermore, not all classes need to be persistent and hence reflected in the ERD model, which uses to be driven by the database design.  Summary:  ERD is data-oriented and persistence-specific  Class diagram targets also operations and is persistence independent