Chapter 1: Computer Abstractions and Technology (4th edition)

The Computer Revolution
- Progress in computer technology
  - Underpinned by Moore’s Law
- Makes novel applications feasible
  - Computers in automobiles
  - Cell phones
  - Human genome project
  - World Wide Web
  - Search engines
- Computers are pervasive

Classes of Computers
- Desktop computers
  - General purpose, variety of software
  - Subject to cost/performance tradeoff
- Server computers
  - Network based
  - High capacity, performance, reliability
  - Range from small servers to building sized
- Embedded computers
  - Hidden as components of systems
  - Stringent power/performance/cost constraints

The Processor Market
[Figure 1.1; the y-axis is in millions]

What You Will Learn
- How programs are translated into the machine language
  - And how the hardware executes them
- The hardware/software interface
- What determines program performance
  - And how it can be improved
- How hardware designers improve performance
- What is parallel processing

Understanding Performance
- Algorithm
  - Determines number of operations executed
- Programming language, compiler, architecture
  - Determine number of machine instructions executed per operation
- Processor and memory system
  - Determine how fast instructions are executed
- I/O system (including OS)
  - Determines how fast I/O operations are executed

Below Your Program
- Application software
  - Written in high-level language
- System software
  - Compiler: translates HLL code to machine code
  - Operating System: service code
    - Handling input/output
    - Managing memory and storage
    - Scheduling tasks & sharing resources
- Hardware
  - Processor, memory, I/O controllers
[Figure 1.2]

Levels of Program Code
- High-level language
  - Level of abstraction closer to problem domain
  - Provides for productivity and portability
- Assembly language
  - Textual representation of instructions
- Hardware representation
  - Binary digits (bits)
  - Encoded instructions and data
[Figure 1.3]

Components of a Computer (The BIG Picture)
- Same components for all kinds of computer
  - Desktop, server, embedded
- Input/output includes
  - User-interface devices
    - Display, keyboard, mouse
  - Storage devices
    - Hard disk, CD/DVD, flash
  - Network adapters
    - For communicating with other computers
[Figure 1.4]

Anatomy of a Computer
[Figure 1.5, labeled: output device, input devices, network cable]

Anatomy of a Mouse
- Optical mouse
  - LED illuminates desktop
  - Small low-res camera
  - Basic image processor
    - Looks for x, y movement
  - Buttons & wheel
- Supersedes roller-ball mechanical mouse

Through the Looking Glass
- LCD screen: picture elements (pixels)
  - Mirrors content of frame buffer memory
[Figure 1.6]

Opening the Box
[Figures 1.7 and 1.8]

Inside the Processor (CPU)
- Datapath: performs operations on data
- Control: sequences datapath, memory, ...
- Cache memory
  - Small fast SRAM memory for immediate access to data

Inside the Processor
- AMD Barcelona: 4 processor cores
[Figure 1.9]

Abstractions (The BIG Picture)
- Abstraction helps us deal with complexity
  - Hide lower-level detail
- Instruction set architecture (ISA)
  - The hardware/software interface
- Application binary interface
  - The ISA plus system software interface
- Implementation
  - The details underlying an interface

A Safe Place for Data
- Volatile main memory
  - Loses instructions and data when power off
- Non-volatile secondary memory
  - Magnetic disk
  - Flash memory
  - Optical disk (CDROM, DVD)
- Floppy disks: capacity 100-200 KB
- Macintosh diskettes: 1.44 MB

Networks
- Communication and resource sharing
- Local area network (LAN): Ethernet
  - Within a building
- Wide area network (WAN): the Internet
- Wireless network: WiFi, Bluetooth

Technology Trends
- Electronics technology continues to evolve
  - Increased capacity and performance
  - Reduced cost

Year  Technology                  Relative performance/cost
1951  Vacuum tube                 1
1965  Transistor                  35
1975  Integrated circuit (IC)     900
1995  Very large scale IC (VLSI)  2,400,000
2005  Ultra large scale IC        6,200,000,000

[Figure 1.12: DRAM capacity]

Defining Performance
- Which airplane has the best performance?

Response Time and Throughput
- Response time
  - How long it takes to do a task
- Throughput
  - Total work done per unit time
    - e.g., tasks/transactions/… per hour
- How are response time and throughput affected by
  - Replacing the processor with a faster version?
  - Adding more processors?
- We’ll focus on response time for now…

Relative Performance
- Define Performance = 1/Execution Time
- “X is n times faster than Y”
  $\dfrac{\text{Performance}_X}{\text{Performance}_Y} = \dfrac{\text{Execution Time}_Y}{\text{Execution Time}_X} = n$
- Example: time taken to run a program
  - 10s on A, 15s on B
  - $\text{Execution Time}_B / \text{Execution Time}_A = 15\,\text{s} / 10\,\text{s} = 1.5$
  - So A is 1.5 times faster than B

Measuring Execution Time
- Elapsed time
  - Total response time, including all aspects
    - Processing, I/O, OS overhead, idle time
  - Determines system performance
- CPU time
  - Time spent processing a given job
    - Discounts I/O time, other jobs’ shares
  - Comprises user CPU time and system CPU time
  - Different programs are affected differently by CPU and system performance

CPU Clocking
- Operation of digital hardware governed by a constant-rate clock
[Timing diagram: clock (cycles), data transfer and computation, update state, clock period]
- Clock period: duration of a clock cycle
  - e.g., 250ps = 0.25ns = $250 \times 10^{-12}$ s
- Clock frequency (rate): cycles per second
  - e.g., 4.0GHz = 4000MHz = $4.0 \times 10^{9}$ Hz

CPU Time
  $\text{CPU Time} = \text{CPU Clock Cycles} \times \text{Clock Cycle Time} = \dfrac{\text{CPU Clock Cycles}}{\text{Clock Rate}}$
- Performance improved by
  - Reducing number of clock cycles
  - Increasing clock rate
  - Hardware designer must often trade off clock rate against cycle count

CPU Time Example
- Computer A: 2GHz clock, 10s CPU time
- Designing Computer B
  - Aim for 6s CPU time
  - Can do faster clock, but causes 1.2 × clock cycles
- How fast must Computer B clock be?
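
The question in the CPU Time Example can be worked through with the relation CPU time = clock cycles / clock rate. The sketch below is one way to check the arithmetic; the function and parameter names are illustrative and do not come from the slides.

```python
def required_clock_rate(rate_a_hz, time_a_s, target_time_b_s, cycle_factor):
    """Clock rate Computer B needs if it uses cycle_factor times A's clock cycles."""
    cycles_a = time_a_s * rate_a_hz        # clock cycles = CPU time x clock rate
    cycles_b = cycle_factor * cycles_a     # B needs 1.2x as many cycles as A
    return cycles_b / target_time_b_s      # clock rate = clock cycles / CPU time


rate_b = required_clock_rate(
    rate_a_hz=2e9,        # Computer A: 2 GHz
    time_a_s=10,          # Computer A: 10 s CPU time
    target_time_b_s=6,    # Computer B: aim for 6 s
    cycle_factor=1.2,     # Computer B: 1.2x clock cycles
)
print(f"Computer B clock rate: {rate_b / 1e9:.1f} GHz")
```

Under these inputs Computer A executes 20×10^9 cycles, Computer B needs 24×10^9 cycles in 6 s, and the required clock rate comes out at 4 GHz, twice that of Computer A.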

Instruction Count and CPI
  $\text{Clock Cycles} = \text{Instruction Count} \times \text{CPI}$
  $\text{CPU Time} = \text{Instruction Count} \times \text{CPI} \times \text{Clock Cycle Time} = \dfrac{\text{Instruction Count} \times \text{CPI}}{\text{Clock Rate}}$
- Instruction Count for a program
  - Determined by program, ISA and compiler
- Average cycles per instruction
  - Determined by CPU hardware
  - If different instructions have different CPI
    - Average CPI affected by instruction mix

CPI Example
- Computer A: Cycle Time = 250ps, CPI = 2.0
- Computer B: Cycle Time = 500ps, CPI = 1.2
- Same ISA
- Which is faster, and by how much?
  - $\text{CPU Time}_A = \text{IC} \times 2.0 \times 250\,\text{ps} = \text{IC} \times 500\,\text{ps}$
  - $\text{CPU Time}_B = \text{IC} \times 1.2 \times 500\,\text{ps} = \text{IC} \times 600\,\text{ps}$
  - A is faster…
  - …by this much: $\text{CPU Time}_B / \text{CPU Time}_A = 600 / 500 = 1.2$

CPI in More Detail
- If different instruction classes take different numbers of cycles
  $\text{Clock Cycles} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \text{Instruction Count}_i\right)$
- Weighted average CPI
  $\text{CPI} = \dfrac{\text{Clock Cycles}}{\text{Instruction Count}} = \sum_{i=1}^{n} \left(\text{CPI}_i \times \dfrac{\text{Instruction Count}_i}{\text{Instruction Count}}\right)$
  - $\text{Instruction Count}_i / \text{Instruction Count}$ is the relative frequency of class $i$

CPI Example
- Alternative compiled code sequences using instructions in classes A, B, C

Class             A  B  C
CPI for class     1  2  3
IC in sequence 1  2  1  2
IC in sequence 2  4  1  1

- Sequence 1: IC = 5
  - Clock Cycles = 2×1 + 1×2 + 2×3 = 10
  - Avg. CPI = 10/5 = 2.0
- Sequence 2: IC = 6
  - Clock Cycles = 4×1 + 1×2 + 1×3 = 9
  - Avg. CPI = 9/6 = 1.5

Performance Summary (The BIG Picture)
  $\text{CPU Time} = \dfrac{\text{Instructions}}{\text{Program}} \times \dfrac{\text{Clock cycles}}{\text{Instruction}} \times \dfrac{\text{Seconds}}{\text{Clock cycle}} = \text{IC} \times \text{CPI} \times T_c$
- Performance depends on
  - Algorithm: affects IC, possibly CPI
  - Programming language: affects IC, CPI
  - Compiler: affects IC, CPI
  - Instruction set architecture: affects IC, CPI, $T_c$

Power Trends
[Figure 1.15]
- In CMOS IC technology
  $\text{Power} = \text{Capacitive load} \times \text{Voltage}^2 \times \text{Frequency}$
  - Power: ×30
  - Voltage: 5V → 1V
  - Frequency: ×1000

Reducing Power
- Suppose a new CPU has
  - 85% of capacitive load of old CPU
  - 15% voltage and 15% frequency reduction
- The power wall
  - We can’t reduce voltage further
  - We can’t remove more heat
- How else can we improve performance?
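
The Reducing Power scenario can be checked numerically. The sketch below assumes the usual CMOS dynamic-power relation, power = capacitive load × voltage² × frequency, and compares the new CPU with the old one as a ratio; the function and parameter names are illustrative.

```python
def relative_power(cap_scale, voltage_scale, freq_scale):
    """Ratio P_new / P_old when dynamic power = capacitive load x voltage^2 x frequency."""
    return cap_scale * voltage_scale ** 2 * freq_scale


# 85% of the capacitive load; a 15% reduction leaves 85% of voltage and of frequency.
ratio = relative_power(cap_scale=0.85, voltage_scale=0.85, freq_scale=0.85)
print(f"P_new / P_old = {ratio:.2f}")
```

The ratio is 0.85^4, about 0.52, so the new CPU draws roughly half the power of the old one. Voltage enters squared, which is why the power wall bites once voltage can no longer be lowered.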

Uniprocessor Performance
[Figure 1.16]
- Constrained by power, instruction-level parallelism, memory latency

Multiprocessors
- Multicore microprocessors
  - More than one processor per chip
- Requires explicitly parallel programming
  - Compare with instruction level parallelism
    - Hardware executes multiple instructions at once
    - Hidden from the programmer
  - Hard to do
    - Programming for performance
    - Load balancing
    - Optimizing communication and synchronization

Manufacturing ICs
[Figure 1.18]
- Yield: proportion of working dies per wafer

AMD Opteron X2 Wafer
[Figure 1.19]
- X2: 300mm wafer, 117 chips, 90nm technology
- X4: 45nm technology

Integrated Circuit Cost
- Nonlinear relation to area and defect rate
  - Wafer cost and area are fixed
  - Defect rate determined by manufacturing process
  - Die area determined by architecture and circuit design

SPEC CPU Benchmark
- Programs used to measure performance
  - Supposedly typical of actual workload
- Standard Performance Evaluation Corp (SPEC)
  - Develops benchmarks for CPU, I/O, Web, …
- SPEC CPU2006
  - Elapsed time to execute a selection of programs
    - Negligible I/O, so focuses on CPU performance
  - Normalize relative to reference machine
  - Summarize as geometric mean of performance ratios
    - CINT2006 (integer) and CFP2006 (floating-point)

CINT2006 for Opteron X4 2356

Name        Description                    IC (×10^9)  CPI    Tc (ns)  Exec time (s)  Ref time (s)  SPECratio
perl        Interpreted string processing  2,118       0.75   0.40     637            9,777         15.3
bzip2       Block-sorting compression      2,389       0.85   0.40     817            9,650         11.8
gcc         GNU C Compiler                 1,050       1.72   0.40     724            8,050         11.1
mcf         Combinatorial optimization     336         10.00  0.40     1,345          9,120         6.8
go          Go game (AI)                   1,658       1.09   0.40     721            10,490        14.6
hmmer       Search gene sequence           2,783       0.80   0.40     890            9,330         10.5
sjeng       Chess game (AI)                2,176       0.96   0.40     837            12,100        14.5
libquantum  Quantum computer simulation    1,623       1.61   0.40     1,047          20,720        19.8
h264avc     Video compression              3,102       0.80   0.40     993            22,130        22.3
omnetpp     Discrete event simulation      587         2.94   0.40     690            6,250         9.1
astar       Games/path finding             1,082       1.79   0.40     773            7,020         9.1
xalancbmk   XML parsing                    1,058       2.70   0.40     1,143          6,900         6.0

Geometric mean of SPECratios: 11.7
Callout: high cache miss rates

SPEC Power Benchmark
- Power consumption of server at different workload levels
  - Performance: ssj_ops/sec
  - Power: Watts (Joules/sec)

SPECpower_ssj2008 for X4

Target Load %      Performance (ssj_ops/sec)  Average Power (Watts)
100%               231,867                    295
90%                211,282                    286
80%                185,803                    275
70%                163,427                    265
60%                140,160                    256
50%                118,324                    246
40%                92,035                     233
30%                70,500                     222
20%                47,126                     206
10%                23,066                     180
0%                 0                          141
Overall sum        1,283,590                  2,605
∑ssj_ops / ∑power                             493

Pitfall: Amdahl’s Law
- Improving an aspect of a computer and expecting a proportional improvement in overall performance
  $T_{\text{improved}} = \dfrac{T_{\text{affected}}}{\text{improvement factor}} + T_{\text{unaffected}}$
- Example: multiply accounts for 80s/100s
  - How much improvement in multiply performance to get 5× overall?
    $20 = \dfrac{80}{n} + 20$
  - Can’t be done!
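
The multiply question above can be checked with a short calculation. This is a minimal sketch of Amdahl's Law in the form improved time = affected time / improvement factor + unaffected time; the helper names and the extra 4× case are illustrative additions, not part of the slides.

```python
def overall_time(t_affected, t_unaffected, improvement_factor):
    """Amdahl's Law: only the affected part of the run time shrinks."""
    return t_affected / improvement_factor + t_unaffected


def factor_needed(t_affected, t_unaffected, target_time):
    """Improvement factor required to reach target_time, or None if unreachable."""
    remaining = target_time - t_unaffected   # time budget left for the affected part
    if remaining <= 0:
        return None                          # even an infinite speedup is not enough
    return t_affected / remaining


total = 100.0             # total run time in seconds
multiply = 80.0           # time spent in multiply
other = total - multiply  # 20 s that the improvement does not touch

print(factor_needed(multiply, other, total / 5))  # 5x overall, target 20 s -> None
print(factor_needed(multiply, other, total / 4))  # 4x overall, target 25 s -> 16.0
print(overall_time(multiply, other, 1e6))         # stays just above 20 s
```

A 5× overall speedup would require the multiply time to drop to zero, so no finite improvement factor achieves it. For comparison, a 4× overall speedup already needs multiply to run 16 times faster.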
- Corollary: make the common case fast

Fallacy: Low Power at Idle
- Look back at X4 power benchmark
  - At 100% load: 295W
  - At 50% load: 246W (83%)
  - At 10% load: 180W (61%)
- Google data center
  - Mostly operates at 10% to 50% load
  - At 100% load less than 1% of the time
- Consider designing processors to make power proportional to load

Pitfall: MIPS as a Performance Metric
- MIPS: Millions of Instructions Per Second
  - Doesn’t account for
    - Differences in ISAs between computers
    - Differences in complexity between instructions
- CPI varies between programs on a given CPU

Concluding Remarks
- Cost/performance is improving
  - Due to underlying technology development
- Hierarchical layers of abstraction
  - In both hardware and software
- Instruction set architecture
  - The hardware/software interface
- Execution time: the best performance measure
- Power is a limiting factor
  - Use parallelism to improve performance