LAB OF SOFTWARE ARCHITECTURES AND INFORMATION SYSTEMS
FACULTY OF INFORMATICS, MASARYK UNIVERSITY, BRNO

PV260 – Software Quality
Lecture 2: Software Measurement & Metrics and their role in quality improvement
Bruno Rossi, brossi@mail.muni.cz

Outline

● Introduction
● The Measurement Process
● Motivational Examples
● Background on Software Measurement
● The Goal Question Metric (GQM) approach
● Measures and Software Quality Improvement → SQALE (Software Quality Assessment based on Lifecycle Expectations)
● Case Studies

Introduction

● The following bug (can you spot it?) in Apple's SSL code went undiscovered from September 2012 to February 2014 – how can that be?

M. Bland, "Finding more than one worm in the apple," Communications of the ACM, vol. 57, no. 7, pp. 58–64, Jul. 2014.
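The slide refers to the now-famous "goto fail" defect in the signature verification of Apple's sslKeyExchange.c, reproduced in Bland's article cited above. The excerpt below shows its structure: the second, duplicated goto fail; is unconditional, so the final hash update and the actual signature check are skipped while err still holds 0 (success), and invalid signatures are accepted.

```c
if ((err = SSLHashSHA1.update(&hashCtx, &signedParams)) != 0)
    goto fail;
    goto fail;  /* duplicated line: always executed, regardless of err */
if ((err = SSLHashSHA1.final(&hashCtx, &hashOut)) != 0)
    goto fail;  /* never reached: the signature is never actually checked */
/* ... */
fail:
    /* ... cleanup ... */
    return err; /* err == 0 when the duplicated goto is taken: "success" */
```

Bland's article argues that straightforward unit tests (and the code-coverage measurement that would have exposed the unreachable lines) could have caught this defect long before 2014.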
Introduction

● Modern systems are very large & complex in terms of structure & runtime behaviour
● The figure below represents Eclipse JDT 3.5.0 (350K LOC, 1,324 classes, 23,605 methods)

[Figure: polymetric view of Eclipse JDT 3.5.0 – classes in black, methods in red, attributes in blue; method containment, attribute containment, and class inheritance edges in gray; invocations in red; accesses in blue]

Introduction

● We need ways to understand attributes of software, represent them in a concise way, and use them to track software & development process improvement
● Software measurement and metrics are one of the aspects we can consider
● If we consider the following metrics, what can we say? Are they "good" metrics?

  LOC   354,780
  NOM    23,605
  NOC     1,324
  NOP        45

  (LOC = lines of code, NOM = nr. of methods, NOC = nr. of classes, NOP = nr. of packages)

Measurement

● Measurement is the process by which numbers or symbols are assigned to attributes of entities in the real world in such a way as to describe them according to clearly defined rules (N. Fenton and S. L. Pfleeger, 1997)
→ Measurement is thus the process through which a measure is defined

The Measurement Process

● The measurement process goes from the real world to the numerical representation
● Interpretation goes from the numerical representation back to the relevant empirical results

[Diagram: Real World → (measures, across the "intelligence barrier") → Numbers → Reduced Numbers → (interpretation) → Relevant Empirical Results]

Why Software Measurement

● To avoid anecdotal evidence without a clear study (through experiments or prototypes, for example)
● To increase the visibility and the understanding of the process
● To analyze the software development
● To make predictions through statistical models

Gilb's principle of fuzzy targets (1988): "Projects without clear goals will not achieve their goals clearly"

However...

● Although measurement may be integrated into development, very often the objectives of the measurement are not clear
● "I measure the process because there is an automated tool that collects the metrics, but I do not know how to read the data and what I can do with the data"

Tom De Marco (1982): "You cannot manage what you cannot measure"...
...but you need to know what to measure and how to measure it

Motivational Examples

about the pitfalls in linking real-world phenomena to numbering systems

A Motivational Example (1/3)

● You were asked to conduct a study to evaluate whether there is discrimination between men and women in a university's enrollment
● You set up a case study and looked at the final results:

           Applicants   % admitted
  Men         8,442        44%
  Women       4,321        35%

→ Is there discrimination in place?
→ What can you conclude from the numbers above?

A Motivational Example (2/3)

● Now look at the same study, but performed at the department level (top 6 departments):

  Department   Men: applicants, % admitted   Women: applicants, % admitted
  A            825, 62%                      108, 82%
  B            560, 63%                       25, 68%
  C            325, 37%                      593, 34%
  D            417, 33%                      375, 35%
  E            191, 28%                      393, 24%
  F            272,  6%                      341,  7%

● There does not seem to be any discrimination against women! The conclusion is that women tended to apply to more competitive departments than men
● The effect we just saw is called Simpson's paradox

Source of the example: http://en.wikipedia.org/wiki/Simpson%27s_paradox – considering the following works: J. Pearl (2000), Causality: Models, Reasoning, and Inference, Cambridge University Press; P. J. Bickel, E. A. Hammel and J. W. O'Connell (1975), "Sex Bias in Graduate Admissions: Data From Berkeley," Science 187 (4175): 398–404.

A Motivational Example (3/3)

● Simpson's paradox: how can it be?
● It can happen that:
  a/b < A/B
  c/d < C/D
  (a + c)/(b + d) > (A + C)/(B + D)
● e.g. 1/5 < 2/8 and 6/8 < 4/5, yet 7/13 > 6/13
● It is the result of not considering a hidden variable – in the example, the difficulty of entering a certain department

  Dept    Men: applicants, % admitted   Women: applicants, % admitted
  A        5, 20%                        8, 25%
  B        8, 75%                        5, 80%
  Total   13, 53%                       13, 46%
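To make the arithmetic above concrete, here is a minimal Python sketch (using the toy numbers from the last table) that reproduces the reversal: women are admitted at a higher rate in each department, yet at a lower rate overall.

```python
# Toy data from the slide: department -> (applicants, admitted).
men = {"A": (5, 1), "B": (8, 6)}
women = {"A": (8, 2), "B": (5, 4)}

def rate(applicants, admitted):
    return admitted / applicants

for dept in men:
    print(dept, f"men {rate(*men[dept]):.0%}", f"women {rate(*women[dept]):.0%}")
# A men 20% women 25%
# B men 75% women 80%

def totals(groups):
    return (sum(a for a, _ in groups.values()),
            sum(x for _, x in groups.values()))

print("Total", f"men {rate(*totals(men)):.0%}", f"women {rate(*totals(women)):.0%}")
# Total men 54% women 46% -> the ordering flips in the aggregate
```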
Background on Software Measurement

Software Measurement Methods

● Measurement artifacts / objects: product (architecture, implementation, documentation), process (management, lifecycle, CASE), resources (personnel, software, hardware)
● Measurement models: flow graphs, call graphs, structure trees, code schemas, ...
● Scale types and statistics: correlation, estimation, adjustment, calibration
● Measurement evaluation: analysis, visualization, exploration, prediction, ...
● Measurement goals: understanding, learning, improvement, management, controlling, ...
● Kinds of operations: artefact-based, quantification-based, value-based, and experience-based

Measurement Information Model (ISO/IEC 15939)

[Diagram, bottom to top: Entity → Attributes → Measurement Method → Base Measures → Measurement Function → Derived Measures → (Analysis) Model → Indicator → Interpretation → Information Product, which satisfies the Information Needs]

● Measurable concept: abstract relationship between attributes of entities and information needs

Measurement Information Model (ISO/IEC 15939) – bottom part

● Attribute: property relevant to the information needs
● Measurement method: operations mapping an attribute to a scale
● Base measure: variable assigned a value by applying the method to one attribute
● Measurement function: algorithm for combining two or more base measures
● Derived measure: variable assigned a value by applying the measurement function to two or more values of base measures

Measurement Information Model (ISO/IEC 15939) – top part

● (Analysis) model: algorithm for combining measures and decision criteria
● Indicator: variable assigned a value by applying the analysis model to base and/or derived measures
● Interpretation: explanation relating the quantitative information in the indicator to the information needs
● Information product: the outcome of the measurement process that satisfies the information needs

ISO/IEC 15939 Examples

● External quality measures – Functionality – Accuracy:
  – Entity: software; attributes: run-time accuracy, run-time usability
  – Base measures: B1 = nr. of inaccurate computations encountered by users; B2 = operation time
  – Measurement function: B1/B2; derived measure: computational accuracy
  – Indicator: comparison of the values obtained with generic thresholds and/or targets
● External quality measures – Reliability – Maturity:
  – Entity: software; attributes: run-time reliability, level of testing
  – Base measures: B1 = number of detected failures; B2 = number of performed test cases
  – Measurement function: B1/B2; derived measure: failure density against test cases
  – Indicator: comparison of the values obtained with generic thresholds and/or targets

Inspired by Abran, Alain, et al., "An information model for software quality measurement with ISO standards," Proceedings of the International Conference on Software Development (SWDC-REK), Reykjavik, Iceland, 2005.

Measure Definition

● A measure is a mapping between
  – the real world
  – the mathematical or formal world, with its objects and relations
● Different mappings give different views of the world depending on the context (height, weight, ...)
● The mapping relates attributes to mathematical objects; it does not relate entities to mathematical objects

● The validity of a measure depends on a definition of the attribute that is coherent with the specification of the real world
  – Is LOC a valid measure?
  – It depends on our measurement goals, e.g.:
    → Do we consider blanks and comments in the LOC?
    → How exactly are the lines computed (e.g. counting only statements terminated by ";")?
  – You might have two projects measured with two different definitions of LOC, so that P1 > P2 and P1 < P2 can both be true at the same time, as the sketch below illustrates
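A minimal sketch of how two defensible LOC definitions reverse a comparison; the two counting rules and the code snippets are illustrative assumptions, not standard definitions.

```python
# Two plausible LOC definitions for a C-like language (illustrative only).
def physical_loc(source: str) -> int:
    """Count non-blank lines, excluding '//' comment lines."""
    lines = (line.strip() for line in source.splitlines())
    return sum(1 for line in lines if line and not line.startswith("//"))

def logical_loc(source: str) -> int:
    """Count statements only, i.e. ';' terminators."""
    return source.count(";")

p1 = "foo(\n  a,\n  b\n);\n// helper call\nbar(\n  c\n);\n"  # spread-out style
p2 = "x = 1; y = 2; z = 3;\n"                                # dense style

print(physical_loc(p1), physical_loc(p2))  # 7 vs 1 -> P1 > P2
print(logical_loc(p1), logical_loc(p2))    # 2 vs 3 -> P1 < P2
```

Neither count is wrong; they are different measures, so values obtained under different definitions must not be compared with each other.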
SQALE – Rating

● Example: an artefact has an estimated development cost of 300 hours and an STI of 8.30 hours; using the reference rating table (i.e. looking up the density 8.30/300 ≈ 2.8%), the corresponding rating can be read off
  [Reference rating table from the slide not reproduced]
● The final representation can take the form of a Kiviat diagram in which the different density indexes are represented

SQALE – Rating

● This is the view you find in SonarQube: http://www.sonarqube.org/sonar-sqale-1-2-in-screenshot

SQALE

● Given our initial discussion of measurement pitfalls, scales, and the representation condition, the following sentence should now be clear:

"Because the non-remediation costs are not established on an ordinal scale but on a ratio scale, we have shown [..] that we can aggregate the measures by addition and comply with the measurement theory and the representation clause."

Letouzey, Jean-Louis, and Michel Ilkiewicz, "Managing technical debt with the SQALE method," IEEE Software 29.6 (2012): 44–51.

Case Studies

Case Study

● Suppose that we have some projects on which we computed the following set of metrics
● What can you say about the projects?

                Project01  Project02  Project03  Project04  Project05  Project06
  # LOC           4920       5817       4013       4515       3263       5735
  # packages        29         49         33         35         25         33
  # classes        126        199        159        181         75        198
  # methods        658        862        644        817        415        715
  # attributes     153        196        227        285         78        177
  # parameters     301        459        393        440        182        415
  # local vars     493        533        325        397        339        416
  # calls         2051       2830       1844       2297        917       2015
  Proj_status   complete   complete   incomplete complete   incomplete complete

Case Study

● What if we consider relative instead of absolute values?
● This would allow us to compare the values across projects (the highest and lowest value of each row were highlighted on the slide):

                Project01  Project02  Project03  Project04  Project05  Project06
  LOC/NOM          7.48       6.75       6.23       5.53       7.86       8.02
  NOC/NOP          4.34       4.06       4.82       5.17       3.00       6.00
  NOM/NOC          5.22       4.33       4.05       4.51       5.53       3.61
  att/NOC          1.21       0.98       1.43       1.57       1.04       0.89
  param/NOM        0.46       0.53       0.61       0.54       0.44       0.58
  locvars/NOM      0.75       0.62       0.50       0.49       0.82       0.58
  calls/NOM        3.12       3.28       2.86       2.81       2.21       2.82
  Proj_status   complete   complete   incomplete complete   incomplete complete

Case Study

● What if we make sense of the metrics by using the GQM approach?

G1. Analyze the software product (object of study) for the purpose of evaluation (purpose) with respect to the effectiveness of the code structure (quality focus) from the point of view of the development team (point of view) in the environment of our project named xyz (environment).
  Q1.1. What is the structure of the system?
    M1.1.1 calls/NOM
    M1.1.2 param/NOM
    M1.1.3 NOM/NOC
  Q1.2. What is the coupling within the system?
    M1.2.1 NOC/NOP
    M1.2.2 LOC/NOM

● Filling in the values for the two extreme projects: calls/NOM – P1: 3.12, P5: 2.21; param/NOM – P1: 0.46, P5: 0.44 (derived from the absolute table, as in the sketch below)
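A minimal sketch of how the relative values used by the GQM questions are derived from the absolute metrics table (only two projects shown; the rounding follows the slide):

```python
# Absolute metrics for two projects, from the case-study table.
projects = {
    "Project01": {"LOC": 4920, "NOP": 29, "NOC": 126, "NOM": 658, "CALLS": 2051},
    "Project05": {"LOC": 3263, "NOP": 25, "NOC": 75, "NOM": 415, "CALLS": 917},
}

for name, m in projects.items():
    print(name,
          f"LOC/NOM={m['LOC'] / m['NOM']:.2f}",      # 7.48 vs 7.86
          f"NOM/NOC={m['NOM'] / m['NOC']:.2f}",      # 5.22 vs 5.53
          f"NOC/NOP={m['NOC'] / m['NOP']:.2f}",      # 4.34 vs 3.00
          f"calls/NOM={m['CALLS'] / m['NOM']:.2f}")  # 3.12 vs 2.21
```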
Case Study

● What happens if we consider LOC instead of NOM?

G1. Analyze the software product (object of study) for the purpose of evaluation (purpose) with respect to the effectiveness of the code structure (quality focus) from the point of view of the development team (point of view) in the environment of our project named xyz (environment).
  Q1.1. What is the structure of the system?
    M1.1.1 calls/LOC (P1: 0.41, P5: 0.28)
    M1.1.2 param/LOC (P1: 0.14, P5: 0.05)
    M1.1.3 NOM/NOC
  Q1.2. What is the coupling within the system?
    M1.2.1 NOC/NOP
    M1.2.2 LOC/NOM

Case Study – The Overview Pyramid

● Another useful way to think in terms of relative values and thresholds is to use the Overview Pyramid
● The Overview Pyramid represents three different aspects of internal quality: inheritance, size & complexity, and coupling
● It provides both absolute and relative values that are compared against typical thresholds

NOP: Number of Packages; NOC: Number of Classes; NOM: Number of Methods; LOC: Lines of Code; CYCLO: Cyclomatic Complexity; ANDC: Average Number of Derived Classes; AHH: Average Hierarchy Height; CALLS: Number of Distinct Method Invocations; FANOUT: Number of Called Classes

Case Study – The Overview Pyramid

[Overview pyramids for Project 1 – Project 6, with each value marked as close to the high, close to the average, or close to the low threshold]

Case Study – The Overview Pyramid

● Back to our initial project, Eclipse JDT 3.5.0:

[Overview pyramid for Eclipse JDT 3.5.0, with each value marked as close to the high, close to the average, or close to the low threshold]
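Since the pyramid figures are not reproduced here, the following sketch shows the idea on the Eclipse JDT 3.5.0 numbers from the beginning of the lecture. The reference intervals are illustrative assumptions in the spirit of Lanza & Marinescu's statistically derived thresholds, not their exact published values.

```python
# Base metrics for Eclipse JDT 3.5.0, from the earlier slide.
jdt = {"LOC": 354_780, "NOM": 23_605, "NOC": 1_324, "NOP": 45}

# Assumed (low, high) reference intervals for Java systems (illustrative).
ref = {"LOC/NOM": (7.0, 13.0), "NOM/NOC": (4.0, 10.0), "NOC/NOP": (6.0, 26.0)}

ratios = {
    "LOC/NOM": jdt["LOC"] / jdt["NOM"],  # average method length
    "NOM/NOC": jdt["NOM"] / jdt["NOC"],  # average class size
    "NOC/NOP": jdt["NOC"] / jdt["NOP"],  # average package size
}
for name, value in ratios.items():
    low, high = ref[name]
    verdict = "low" if value < low else "high" if value > high else "average"
    print(f"{name} = {value:.2f} -> close to {verdict}")
# LOC/NOM = 15.03, NOM/NOC = 17.83, NOC/NOP = 29.42 -> all close to high
```

Reading the ratios against the intervals, rather than the raw counts in isolation, is what lets the pyramid say something about a system's proportions.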
Conclusions

● Measurement is important to track the progress of software projects and to focus on the relevant parts that need attention
● Even so, we always need to take measurements with a "grain of salt"
● Still, collecting irrelevant or invalid metrics might be even worse than not collecting any measure at all

Extra Slides

List of some Acronyms

● LOC: Lines of Code
● CC: McCabe's Cyclomatic Complexity
● Fan-in: number of local flows that terminate in a module
● Fan-out: number of local flows that emanate from a module
● Information flow complexity of a module: the length of the module times the square of the product of its fan-in and fan-out
● NOM: Number of Methods per class
● WMC: Weighted Methods per Class
● DIT: Depth of Inheritance Tree
● NOC: Number of Children
● CBO: Coupling Between Objects
● RFC: Response For a Class
● LCOM: Lack of Cohesion of Methods
● ANDC: Average Number of Derived Classes
● AHH: Average Hierarchy Height

Measurement Experience

● Measurement experience can take the form of:
  – Analogies
  – Axioms
  – Correlations
  – Criteria
  – Intuitions
  – Laws
  – Lemmas
  – Formulas
  – Methodologies
  – Principles
  – Relations
  – Rules of thumb
  – Theories

Software Engineering Laws (1/4)

● Example: laws in software engineering – how were these derived?

Software Engineering Laws (2/4)

● Information hiding in object-oriented programming
● "A human being can concentrate on 7±2 items at a time"
● "Productivity is improved by reducing accidents and controlling essence"
● "Testing can show the presence but not the absence of errors"
● Bayes' theorem: Pr(A|B) = Pr(B|A) · Pr(A) / Pr(B)

Software Engineering Laws (3/4)

● "Requirement deficiencies are the prime source of project failure"
● "The value of a model depends on the view taken, but none is best for all purposes"
● "The user will never know what they want until after the system is in production"
● "Good designs require deep application domain knowledge"
● "What applies to small systems does not apply to large ones"
● "Everything put together falls apart sooner or later"
● The 8 laws of software evolution

Software Engineering Laws (4/4)

● The number of transistors on an integrated circuit doubles in about 18 months
● The number of radio communications doubles every 30 months
● "The number of lines of code a programmer can write in a fixed period of time is the same regardless of the programming language"
● "If builders built buildings the way programmers wrote programs, the first woodpecker that came along would destroy civilization"
● Perspective-based inspections (along one dimension, for a specific stakeholder) are highly effective and efficient
● Software reuse reduces cycle time and increases productivity and quality

References

● N. Fenton and J. Bieman, Software Metrics: A Rigorous and Practical Approach, 3rd ed. Boca Raton: CRC Press, 2014.
● C. Ebert and R. Dumke, Software Measurement: Establish - Extract - Evaluate - Execute. Springer, 2007.
● M. Lanza and R. Marinescu, Object-Oriented Metrics in Practice: Using Software Metrics to Characterize, Evaluate, and Improve the Design of Object-Oriented Systems. Springer, 2007.
● Some code samples from R. C. Martin, Clean Code: A Handbook of Agile Software Craftsmanship. Pearson Education, 2008.
● Moose platform for software data analysis: http://moosetechnology.org
● The SQALE Method: http://www.sqale.org/wp-content/uploads/2010/08/SQALE-Method-EN-V1-0.pdf