Chapter 2
Instructions: Language of the Computer
4th edition

Instruction Set
- The repertoire of instructions of a computer
- Different computers have different instruction sets
  - But with many aspects in common
- Early computers had very simple instruction sets
  - Simplified implementation
- Many modern computers also have simple instruction sets
- MIPS: Microprocessor without Interlocked Pipeline Stages
- Instructions of equal size
- Very large instruction word (VLIW)
- RISC, CISC
- Variable-length instruction word

The MIPS Instruction Set
- Used as the example throughout the book
- Stanford MIPS commercialized by MIPS Technologies (www.mips.com)
- Large share of embedded core market
  - Applications in consumer electronics, network/storage equipment, cameras, printers, …
- Typical of many modern ISAs
  - See MIPS Reference Data tear-out card, and Appendixes B and E

Arithmetic Operations
- Add and subtract, three operands
  - Two sources and one destination
    add a, b, c    # a gets b + c
- All arithmetic operations have this form
- Design Principle 1: Simplicity favors regularity
  - Regularity makes implementation simpler
  - Simplicity enables higher performance at lower cost

Arithmetic Example
- C code:
    f = (g + h) - (i + j);
- Compiled MIPS code:
    add t0, g, h     # temp t0 = g + h
    add t1, i, j     # temp t1 = i + j
    sub f, t0, t1    # f = t0 - t1

Register Operands
- Arithmetic instructions use register operands
- MIPS has a 32 × 32-bit register file
  - Use for frequently accessed data
  - Numbered 0 to 31
  - 32-bit data called a “word”
- Assembler names
  - $t0, $t1, …, $t9 for temporary values
  - $s0, $s1, …, $s7 for saved variables
- Design Principle 2: Smaller is faster
  - c.f. main memory: millions of locations

Register Operand Example
- C code:
    f = (g + h) - (i + j);
  - f, …, j in $s0, …, $s4
- Compiled MIPS code:
    add $t0, $s1, $s2
    add $t1, $s3, $s4
    sub $s0, $t0, $t1

Memory Operands
- Main memory used for composite data
  - Arrays, structures, dynamic data
- To apply arithmetic operations
  - Load values from memory into registers
  - Store result from register to memory
- Memory is byte addressed
  - Each address identifies an 8-bit byte
- Words are aligned in memory
  - Address must be a multiple of 4
- MIPS is Big Endian
  - Most-significant byte at the lowest address of a word
  - c.f. Little Endian: least-significant byte at the lowest address

Memory Operand Example 1
- C code:
    g = h + A[8];
  - g in $s1, h in $s2, base address of A in $s3
- Compiled MIPS code:
  - Index 8 requires offset of 32
    - 4 bytes per word
    lw  $t0, 32($s3)    # load word (offset 32, base register $s3)
    add $s1, $s2, $t0

Memory Operand Example 2
- C code:
    A[12] = h + A[8];
  - h in $s2, base address of A in $s3
- Compiled MIPS code:
  - Index 8 requires offset of 32
    lw  $t0, 32($s3)    # load word
    add $t0, $s2, $t0
    sw  $t0, 48($s3)    # store word

Registers vs. Memory
- Registers are faster to access than memory
- Operating on memory data requires loads and stores
  - More instructions to be executed
- Compiler must use registers for variables as much as possible
  - Only spill to memory for less frequently used variables
  - Register optimization is important!
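
The memory-operand arithmetic above can be sketched in C, assuming main memory is modelled as a plain byte array. The names word_offset and load_word_be are hypothetical helpers introduced only for this illustration; they show why index 8 needs byte offset 32 and how a big-endian word is assembled from four consecutive bytes.

```c
#include <assert.h>
#include <stdint.h>

/* Byte offset of word number `index`: 4 bytes per word (hypothetical helper). */
static uint32_t word_offset(uint32_t index)
{
    return index * 4u;
}

/* Rough model of lw on a big-endian machine: the most-significant byte
   sits at the lowest address. `mem` stands in for main memory. */
static uint32_t load_word_be(const uint8_t *mem, uint32_t addr)
{
    assert(addr % 4u == 0);    /* words must be aligned */
    return ((uint32_t)mem[addr]     << 24) |
           ((uint32_t)mem[addr + 1] << 16) |
           ((uint32_t)mem[addr + 2] <<  8) |
            (uint32_t)mem[addr + 3];
}
```

Under these assumptions, lw $t0, 32($s3) corresponds roughly to load_word_be(mem, base + word_offset(8)), where base holds the address of A.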

Immediate Operands
- Constant data specified in an instruction
    addi $s3, $s3, 4
- No subtract immediate instruction
  - Just use a negative constant
    addi $s2, $s1, -1
- Design Principle 3: Make the common case fast
  - Small constants are common
  - Immediate operand avoids a load instruction

The Constant Zero
- MIPS register 0 ($zero) is the constant 0
  - Cannot be overwritten
- Useful for common operations
  - E.g., move between registers
    add $t2, $s1, $zero

Unsigned Binary Integers
- Given an n-bit number
- Range: 0 to +2^n – 1
- Example
    0000 0000 0000 0000 0000 0000 0000 1011₂
    = 0 + … + 1×2^3 + 0×2^2 + 1×2^1 + 1×2^0
    = 0 + … + 8 + 0 + 2 + 1 = 11₁₀
- Using 32 bits
  - 0 to +4,294,967,295

2s-Complement Signed Integers
- Given an n-bit number
- Range: –2^(n–1) to +2^(n–1) – 1
- Example
    1111 1111 1111 1111 1111 1111 1111 1100₂
    = –1×2^31 + 1×2^30 + … + 1×2^2 + 0×2^1 + 0×2^0
    = –2,147,483,648 + 2,147,483,644 = –4₁₀
- Using 32 bits
  - –2,147,483,648 to +2,147,483,647

2s-Complement Signed Integers
- Bit 31 is sign bit
  - 1 for negative numbers
  - 0 for non-negative numbers
- –(–2^(n–1)) can’t be represented
- Non-negative numbers have the same unsigned and 2s-complement representation
- Some specific numbers
  -  0: 0000 0000 … 0000
  - –1: 1111 1111 … 1111
  - Most-negative: 1000 0000 … 0000
  - Most-positive: 0111 1111 … 1111

Signed Negation
- Complement and add 1
  - Complement means 1 → 0, 0 → 1
- Example: negate +2
  - +2 = 0000 0000 … 0010₂
  - –2 = 1111 1111 … 1101₂ + 1 = 1111 1111 … 1110₂

Sign Extension
- Representing a number using more bits
  - Preserve the numeric value
- In MIPS instruction set
  - addi: extend immediate value
  - lb, lh: extend loaded byte/halfword
  - beq, bne: extend the displacement
- Replicate the sign bit to the left
  - c.f. unsigned values: extend with 0s
- Examples: 8-bit to 16-bit
  - +2: 0000 0010 => 0000 0000 0000 0010
  - –2: 1111 1110 => 1111 1111 1111 1110

Representing Instructions
- Instructions are encoded in binary
  - Called machine code
- MIPS instructions
  - Encoded as 32-bit instruction words
  - Small number of formats encoding operation code (opcode), register numbers, …
  - Regularity!
- Register numbers
  - $t0 – $t7 are reg’s 8 – 15
  - $t8 – $t9 are reg’s 24 – 25
  - $s0 – $s7 are reg’s 16 – 23

MIPS R-format Instructions
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- Instruction fields
  - op: operation code (opcode)
  - rs: first source register number
  - rt: second source register number
  - rd: destination register number
  - shamt: shift amount (00000 for now)
  - funct: function code (extends opcode)

R-format Example
    add $t0, $s1, $s2

    op       rs      rt      rd      shamt   funct
    6 bits   5 bits  5 bits  5 bits  5 bits  6 bits
    special  $s1     $s2     $t0     0       add
    0        17      18      8       0       32
    000000   10001   10010   01000   00000   100000

    00000010001100100100000000100000₂ = 02324020₁₆

Hexadecimal
- Base 16
  - Compact representation of bit strings
  - 4 bits per hex digit
    0 0000    4 0100    8 1000    c 1100
    1 0001    5 0101    9 1001    d 1101
    2 0010    6 0110    a 1010    e 1110
    3 0011    7 0111    b 1011    f 1111
- Example: eca8 6420
  - 1110 1100 1010 1000 0110 0100 0010 0000

MIPS I-format Instructions
    op      rs      rt      constant or address
    6 bits  5 bits  5 bits  16 bits
- Immediate arithmetic and load/store instructions
  - rt: destination or source register number
  - Constant: –2^15 to +2^15 – 1
  - Address: offset added to base address in rs
- Design Principle 4: Good design demands good compromises
  - Different formats complicate decoding, but allow 32-bit instructions uniformly
  - Keep formats as similar as possible

Stored Program Computers
- Instructions represented in binary, just like data
- Instructions and data stored in memory
- Programs can operate on programs
  - e.g., compilers, linkers, …
- Binary compatibility allows compiled programs to work on different computers
  - Standardized ISAs
[Figure: The BIG Picture]

Logical Operations
- Instructions for bitwise manipulation
    Operation      C     Java    MIPS
    Shift left     <<    <<      sll
    Shift right    >>    >>>     srl
    Bitwise AND    &     &       and, andi
    Bitwise OR     |     |       or, ori
    Bitwise NOT    ~     ~       nor
- Useful for extracting and inserting groups of bits in a word

Shift Operations
    op      rs      rt      rd      shamt   funct
    6 bits  5 bits  5 bits  5 bits  5 bits  6 bits
- shamt: how many positions to shift
- Shift left logical
  - Shift left and fill with 0 bits
  - sll by i bits multiplies by 2^i
- Shift right logical
  - Shift right and fill with 0 bits
  - srl by i bits divides by 2^i (unsigned only)

AND Operations
- Useful to mask bits in a word
  - Select some bits, clear others to 0
    and $t0, $t1, $t2

    $t2    0000 0000 0000 0000 0000 1101 1100 0000
    $t1    0000 0000 0000 0000 0011 1100 0000 0000
    $t0    0000 0000 0000 0000 0000 1100 0000 0000

OR Operations
- Useful to include bits in a word
  - Set some bits to 1, leave others unchanged
    or $t0, $t1, $t2

    $t2    0000 0000 0000 0000 0000 1101 1100 0000
    $t1    0000 0000 0000 0000 0011 1100 0000 0000
    $t0    0000 0000 0000 0000 0011 1101 1100 0000

NOT Operations
- Useful to invert bits in a word
  - Change 0 to 1, and 1 to 0
- MIPS has NOR 3-operand instruction
  - a NOR b == NOT ( a OR b )
    nor $t0, $t1, $zero    # register 0: always read as zero

    $t1    0000 0000 0000 0000 0011 1100 0000 0000
    $t0    1111 1111 1111 1111 1100 0011 1111 1111

Conditional Operations
- Branch to a labeled instruction if a condition is true
  - Otherwise, continue sequentially
- beq rs, rt, L1
  - if (rs == rt) branch to instruction labeled L1;
- bne rs, rt, L1
  - if (rs != rt) branch to instruction labeled L1;
- j L1
  - unconditional jump to instruction labeled L1

Compiling If Statements
- C code:
    if (i==j) f = g+h;
    else f = g-h;
  - f, g, … in $s0, $s1, …
- Compiled MIPS code:
          bne $s3, $s4, Else
          add $s0, $s1, $s2
          j   Exit
    Else: sub $s0, $s1, $s2
    Exit: …
  - Assembler calculates addresses
[Figure]

Compiling Loop Statements
- C code:
    while (save[i] == k) i += 1;
  - i in $s3, k in $s5, address of save in $s6
- Compiled MIPS code:
    Loop: sll  $t1, $s3, 2
          add  $t1, $t1, $s6
          lw   $t0, 0($t1)
          bne  $t0, $s5, Exit
          addi $s3, $s3, 1
          j    Loop
    Exit: …

Basic Blocks
- A basic block is a sequence of instructions with
  - No embedded branches (except at end)
  - No branch targets (except at beginning)
- A compiler identifies basic blocks for optimization
- An advanced processor can accelerate execution of basic blocks

More Conditional Operations
- Set result to 1 if a condition is true
  - Otherwise, set to 0
- slt rd, rs, rt
  - if (rs < rt) rd = 1; else rd = 0;
- slti rt, rs, constant
  - if (rs < constant) rt = 1; else rt = 0;
- Use in combination with beq, bne
    slt $t0, $s1, $s2    # if ($s1 < $s2)
    bne $t0, $zero, L    #   branch to L

Branch Instruction Design
- Why not blt, bge, etc?
  - Hardware for <, ≥, … slower than =, ≠
  - Combining with branch involves more work per instruction, requiring a slower clock
  - All instructions penalized!
- beq and bne are the common case
- This is a good design compromise

Signed vs. Unsigned
- Signed comparison: slt, slti
- Unsigned comparison: sltu, sltiu
- Example
  - $s0 = 1111 1111 1111 1111 1111 1111 1111 1111
  - $s1 = 0000 0000 0000 0000 0000 0000 0000 0001
  - slt $t0, $s0, $s1    # signed
    - –1 < +1 ⇒ $t0 = 1
  - sltu $t0, $s0, $s1    # unsigned
    - +4,294,967,295 > +1 ⇒ $t0 = 0

Procedure Calling
- Steps required
  1. Place parameters in registers
  2. Transfer control to procedure
  3. Acquire storage for procedure
  4. Perform procedure’s operations
  5. Place result in register for caller
  6. Return to place of call

Register Usage
- $a0 – $a3: arguments (reg’s 4 – 7)
- $v0, $v1: result values (reg’s 2 and 3)
- $t0 – $t9: temporaries
  - Can be overwritten by callee
- $s0 – $s7: saved
  - Must be saved/restored by callee
- $gp: global pointer for static data (reg 28)
- $sp: stack pointer (reg 29)
- $fp: frame pointer (reg 30)
- $ra: return address (reg 31)

Procedure Call Instructions
- Procedure call: jump and link
    jal ProcedureLabel
  - Address of following instruction put in $ra
  - Jumps to target address
- Procedure return: jump register
    jr $ra
  - Copies $ra to program counter
  - Can also be used for computed jumps
    - e.g., for case/switch statements

Leaf Procedure Example
- C code:
    int leaf_example (int g, int h, int i, int j)
    {
      int f;
      f = (g + h) - (i + j);
      return f;
    }
  - Arguments g, …, j in $a0, …, $a3
  - f in $s0 (hence, need to save $s0 on stack)
  - Result in $v0

Leaf Procedure Example
- MIPS code:
    leaf_example:
        addi $sp, $sp, -4       # save $s0 on stack
        sw   $s0, 0($sp)
        add  $t0, $a0, $a1      # procedure body
        add  $t1, $a2, $a3
        sub  $s0, $t0, $t1
        add  $v0, $s0, $zero    # result
        lw   $s0, 0($sp)        # restore $s0
        addi $sp, $sp, 4
        jr   $ra                # return

Non-Leaf Procedures
- Procedures that call other procedures
- For nested call, caller needs to save on the stack:
  - Its return address
  - Any arguments and temporaries needed after the call
- Restore from the stack after the call

Non-Leaf Procedure Example
- C code:
    int fact (int n)
    {
      if (n < 1) return 1;
      else return n * fact(n - 1);
    }
  - Argument n in $a0
  - Result in $v0

Non-Leaf Procedure Example
- MIPS code:
    fact:
        addi $sp, $sp, -8     # adjust stack for 2 items
        sw   $ra, 4($sp)      # save return address
        sw   $a0, 0($sp)      # save argument
        slti $t0, $a0, 1      # test for n < 1
        beq  $t0, $zero, L1
        addi $v0, $zero, 1    # if so, result is 1
        addi $sp, $sp, 8      #   pop 2 items from stack
        jr   $ra              #   and return
    L1: addi $a0, $a0, -1     # else decrement n
        jal  fact             # recursive call
        lw   $a0, 0($sp)      # restore original n
        lw   $ra, 4($sp)      #   and return address
        addi $sp, $sp, 8      # pop 2 items from stack
        mul  $v0, $a0, $v0    # multiply to get result
        jr   $ra              # and return

Local Data on the Stack
- Local data allocated by callee
  - e.g., C automatic variables
- Procedure frame (activation record)
  - Used by some compilers to manage stack storage
[Figure]

Memory Layout
- Text: program code
- Static data: global variables
  - e.g., static variables in C, constant arrays and strings
  - $gp initialized to address allowing ±offsets into this segment
- Dynamic data: heap
  - E.g., malloc in C, new in Java
- Stack: automatic storage
[Figure]
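
The stack discipline of the fact procedure above can be mimicked in C with an explicit array standing in for the stack. This is only a sketch under stated assumptions: stack, sp, STACK_WORDS and fact_explicit are hypothetical names, the saved return address is left out because portable C has no way to store one, and the base-case frame that the MIPS code pushes and immediately pops is skipped.

```c
#include <assert.h>

#define STACK_WORDS 64    /* assumed depth limit for this sketch */

/* Computes n! in the style of the MIPS code: save each argument on the
   way down, then restore it and multiply on the way back up. */
static int fact_explicit(int n)
{
    int stack[STACK_WORDS];
    int sp = STACK_WORDS;           /* stack grows toward lower indices */
    int result;

    assert(n < STACK_WORDS);        /* stay inside the modelled stack */

    while (n >= 1) {                /* not the base case: the L1 path */
        stack[--sp] = n;            /* like sw $a0, 0($sp) */
        n = n - 1;                  /* like addi $a0, $a0, -1 ; jal fact */
    }
    result = 1;                     /* base case: n < 1 gives 1 */
    while (sp < STACK_WORDS) {
        int saved_n = stack[sp++];  /* like lw $a0, 0($sp), then pop */
        result = saved_n * result;  /* like mul $v0, $a0, $v0 */
    }
    return result;
}
```

For example, fact_explicit(5) saves 5, 4, 3, 2, 1 and then multiplies them back in the reverse order. With a 32-bit int the product overflows for n greater than 12, so the sketch is only meaningful for small n.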

Character Data
- Byte-encoded character sets
  - ASCII: 128 characters
    - 95 graphic, 33 control
  - Latin-1: 256 characters
    - ASCII, +96 more graphic characters
- Unicode: 32-bit character set
  - Used in Java, C++ wide characters, …
  - Most of the world’s alphabets, plus symbols
  - UTF-8, UTF-16: variable-length encodings

Byte/Halfword Operations
- Could use bitwise operations
- MIPS byte/halfword load/store
  - String processing is a common case
    lb  rt, offset(rs)     lh  rt, offset(rs)
  - Sign extend to 32 bits in rt
    lbu rt, offset(rs)     lhu rt, offset(rs)
  - Zero extend to 32 bits in rt
    sb  rt, offset(rs)     sh  rt, offset(rs)
  - Store just rightmost byte/halfword

String Copy Example
- C code (naïve):
  - Null-terminated string
    void strcpy (char x[], char y[])
    {
      int i;
      i = 0;
      while ((x[i]=y[i])!='\0')
        i += 1;
    }
  - Addresses of x, y in $a0, $a1
  - i in $s0

String Copy Example
- MIPS code:
    strcpy:
        addi $sp, $sp, -4         # adjust stack for 1 item
        sw   $s0, 0($sp)          # save $s0
        add  $s0, $zero, $zero    # i = 0
    L1: add  $t1, $s0, $a1        # addr of y[i] in $t1
        lbu  $t2, 0($t1)          # $t2 = y[i]
        add  $t3, $s0, $a0        # addr of x[i] in $t3
        sb   $t2, 0($t3)          # x[i] = y[i]
        beq  $t2, $zero, L2       # exit loop if y[i] == 0
        addi $s0, $s0, 1          # i = i + 1
        j    L1                   # next iteration of loop
    L2: lw   $s0, 0($sp)          # restore saved $s0
        addi $sp, $sp, 4          # pop 1 item from stack
        jr   $ra                  # and return

32-bit Constants
- Most constants are small
  - 16-bit immediate is sufficient
- For the occasional 32-bit constant
    lui rt, constant
  - Copies 16-bit constant to left 16 bits of rt
  - Clears right 16 bits of rt to 0

    lui $s0, 61           # $s0 = 0000 0000 0011 1101 0000 0000 0000 0000
    ori $s0, $s0, 2304    # $s0 = 0000 0000 0011 1101 0000 1001 0000 0000

Branch Addressing
- Branch instructions specify
  - Opcode, two registers, target address
- Most branch targets are near branch
  - Forward or backward
    op      rs      rt      constant or address
    6 bits  5 bits  5 bits  16 bits
- PC-relative addressing
  - Target address = PC + offset × 4
  - PC already incremented by 4 by this time

Jump Addressing
- Jump (j and jal) targets could be anywhere in text segment
  - Encode full address in instruction
    op      address
    6 bits  26 bits
- (Pseudo)Direct jump addressing
  - Target address = PC[31…28] : (address × 4)

Target Addressing Example
- Loop code from earlier example
  - Assume Loop at location 80000

    Loop: sll  $t1, $s3, 2       80000    0    0    19   9    2    0
          add  $t1, $t1, $s6     80004    0    9    22   9    0    32
          lw   $t0, 0($t1)       80008    35   9    8    0
          bne  $t0, $s5, Exit    80012    5    8    21   2
          addi $s3, $s3, 1       80016    8    19   19   1
          j    Loop              80020    2    20000
    Exit: …                      80024

Branching Far Away
- If branch target is too far to encode with 16-bit offset, assembler rewrites the code
- Example
        beq $s0,$s1, L1
            ↓
        bne $s0,$s1, L2
        j   L1
    L2: …

Addressing Mode Summary
[Figure]

Synchronization
- Two processors sharing an area of memory
  - P1 writes, then P2 reads
  - Data race if P1 and P2 don’t synchronize
    - Result depends on order of accesses
- Hardware support required
  - Atomic read/write memory operation
  - No other access to the location allowed between the read and write
- Could be a single instruction
  - E.g., atomic swap of register ↔ memory
  - Or an atomic pair of instructions

Synchronization in MIPS
- Load linked: ll rt, offset(rs)
- Store conditional: sc rt, offset(rs)
  - Succeeds if location not changed since the ll
    - Returns 1 in rt
  - Fails if location is changed
    - Returns 0 in rt
- Example: atomic swap (to test/set lock variable)
    try: add $t0,$zero,$s4    # copy exchange value
         ll  $t1,0($s1)       # load linked
         sc  $t0,0($s1)       # store conditional
         beq $t0,$zero,try    # branch if store fails
         add $s4,$zero,$t1    # put load value in $s4

Translation and Startup
- Many compilers produce object modules directly
- Static linking
[Figure]

Assembler Pseudoinstructions
- Most assembler instructions represent machine instructions one-to-one
- Pseudoinstructions: figments of the assembler’s imagination
    move $t0, $t1       →  add $t0, $zero, $t1
    blt  $t0, $t1, L    →  slt $at, $t0, $t1
                           bne $at, $zero, L
  - $at (register 1): assembler temporary

Producing an Object Module
- Assembler (or compiler) translates program into machine instructions
- Provides information for building a complete program from the pieces
  - Header: describes contents of object module
  - Text segment: translated instructions
  - Static data segment: data allocated for the life of the program
  - Relocation info: for contents that depend on absolute location of loaded program
  - Symbol table: global definitions and external refs
  - Debug info: for associating with source code

Linking Object Modules
- Produces an executable image
  1. Merges segments
  2. Resolves labels (determines their addresses)
  3. Patches location-dependent and external refs
- Could leave location dependencies for fixing by a relocating loader
  - But with virtual memory, no need to do this
  - Program can be loaded into absolute location in virtual memory space

Loading a Program
- Load from image file on disk into memory
  1. Read header to determine segment sizes
  2. Create virtual address space
  3. Copy text and initialized data into memory
     - Or set page table entries so they can be faulted in
  4. Set up arguments on stack
  5. Initialize registers (including $sp, $fp, $gp)
  6. Jump to startup routine
     - Copies arguments to $a0, … and calls main
     - When main returns, do exit syscall

Dynamic Linking
- Only link/load library procedure when it is called
  - Requires procedure code to be relocatable
  - Avoids image bloat caused by static linking of all (transitively) referenced libraries
  - Automatically picks up new library versions

Lazy Linkage
- Indirection table
- Stub: loads routine ID, jumps to linker/loader
- Linker/loader code
- Dynamically mapped code
[Figure]

Starting Java Applications
- Simple portable instruction set for the JVM
- Interprets bytecodes
- Compiles bytecodes of “hot” methods into native code for host machine
[Figure]

C Sort Example
- Illustrates use of assembly instructions for a C bubble sort function
- Swap procedure (leaf)
    void swap(int v[], int k)
    {
      int temp;
      temp = v[k];
      v[k] = v[k+1];
      v[k+1] = temp;
    }
  - v in $a0, k in $a1, temp in $t0

The Procedure Swap
    swap: sll $t1, $a1, 2      # $t1 = k * 4
          add $t1, $a0, $t1    # $t1 = v+(k*4)
                               #   (address of v[k])
          lw  $t0, 0($t1)      # $t0 (temp) = v[k]
          lw  $t2, 4($t1)      # $t2 = v[k+1]
          sw  $t2, 0($t1)      # v[k] = $t2 (v[k+1])
          sw  $t0, 4($t1)      # v[k+1] = $t0 (temp)
          jr  $ra              # return to calling routine

The Sort Procedure in C
- Non-leaf (calls swap)
    void sort (int v[], int n)
    {
      int i, j;
      for (i = 0; i < n; i += 1) {
        for (j = i - 1;
             j >= 0 && v[j] > v[j + 1];
             j -= 1) {
          swap(v,j);
        }
      }
    }
  - v in $a0, n in $a1, i in $s0, j in $s1

The Procedure Body
              # Move params
              move $s2, $a0            # save $a0 into $s2
              move $s3, $a1            # save $a1 into $s3
              # Outer loop
              move $s0, $zero          # i = 0
    for1tst:  slt  $t0, $s0, $s3       # $t0 = 0 if $s0 ≥ $s3 (i ≥ n)
              beq  $t0, $zero, exit1   # go to exit1 if $s0 ≥ $s3 (i ≥ n)
              # Inner loop
              addi $s1, $s0, -1        # j = i – 1
    for2tst:  slti $t0, $s1, 0         # $t0 = 1 if $s1 < 0 (j < 0)
              bne  $t0, $zero, exit2   # go to exit2 if $s1 < 0 (j < 0)
              sll  $t1, $s1, 2         # $t1 = j * 4
              add  $t2, $s2, $t1       # $t2 = v + (j * 4)
              lw   $t3, 0($t2)         # $t3 = v[j]
              lw   $t4, 4($t2)         # $t4 = v[j + 1]
              slt  $t0, $t4, $t3       # $t0 = 0 if $t4 ≥ $t3
              beq  $t0, $zero, exit2   # go to exit2 if $t4 ≥ $t3
              # Pass params & call
              move $a0, $s2            # 1st param of swap is v (old $a0)
              move $a1, $s1            # 2nd param of swap is j
              jal  swap                # call swap procedure
              # Inner loop
              addi $s1, $s1, -1        # j –= 1
              j    for2tst             # jump to test of inner loop
              # Outer loop
    exit2:    addi $s0, $s0, 1         # i += 1
              j    for1tst             # jump to test of outer loop

The Full Procedure
    sort:     addi $sp, $sp, -20       # make room on stack for 5 registers
              sw   $ra, 16($sp)        # save $ra on stack
              sw   $s3, 12($sp)        # save $s3 on stack
              sw   $s2, 8($sp)         # save $s2 on stack
              sw   $s1, 4($sp)         # save $s1 on stack
              sw   $s0, 0($sp)         # save $s0 on stack
              …                        # procedure body
              …
    exit1:    lw   $s0, 0($sp)         # restore $s0 from stack
              lw   $s1, 4($sp)         # restore $s1 from stack
              lw   $s2, 8($sp)         # restore $s2 from stack
              lw   $s3, 12($sp)        # restore $s3 from stack
              lw   $ra, 16($sp)        # restore $ra from stack
              addi $sp, $sp, 20        # restore stack pointer
              jr   $ra                 # return to calling routine

Effect of Compiler Optimization
- Compiled with gcc for Pentium 4 under Linux
[Figure]

Effect of Language and Algorithm
[Figure]

Lessons Learnt
- Instruction count and CPI are not good performance indicators in isolation
- Compiler optimizations are sensitive to the algorithm
- Java/JIT compiled code is significantly faster than JVM interpreted
  - Comparable to optimized C in some cases
- Nothing can fix a dumb algorithm!

Arrays vs. Pointers
- Array indexing involves
  - Multiplying index by element size
  - Adding to array base address
- Pointers correspond directly to memory addresses
  - Can avoid indexing complexity

Example: Clearing an Array
- Array version:
    void clear1(int array[], int size) {
      int i;
      for (i = 0; i < size; i += 1)
        array[i] = 0;
    }

           move $t0,$zero          # i = 0
    loop1: sll  $t1,$t0,2          # $t1 = i * 4
           add  $t2,$a0,$t1        # $t2 = &array[i]
           sw   $zero, 0($t2)      # array[i] = 0
           addi $t0,$t0,1          # i = i + 1
           slt  $t3,$t0,$a1        # $t3 = (i < size)
           bne  $t3,$zero,loop1    # if (…) goto loop1

- Pointer version:
    void clear2(int *array, int size) {
      int *p;
      for (p = &array[0]; p < &array[size]; p = p + 1)
        *p = 0;
    }

           move $t0,$a0            # p = &array[0]
           sll  $t1,$a1,2          # $t1 = size * 4
           add  $t2,$a0,$t1        # $t2 = &array[size]
    loop2: sw   $zero,0($t0)       # Memory[p] = 0
           addi $t0,$t0,4          # p = p + 4
           slt  $t3,$t0,$t2        # $t3 = (p < &array[size])
           bne  $t3,$zero,loop2    # if (…) goto loop2

Comparison of Array vs. Ptr
- Multiply “strength reduced” to shift
- Array version requires shift to be inside loop
  - Part of index calculation for incremented i
  - c.f. incrementing pointer
- Compiler can achieve same effect as manual use of pointers
  - Induction variable elimination
  - Better to make program clearer and safer

ARM & MIPS Similarities
- ARM: the most popular embedded core
- Similar basic set of instructions to MIPS
                             ARM              MIPS
    Date announced           1985             1985
    Instruction size         32 bits          32 bits
    Address space            32-bit flat      32-bit flat
    Data alignment           Aligned          Aligned
    Data addressing modes    9                3
    Registers                15 × 32-bit      31 × 32-bit
    Input/output             Memory mapped    Memory mapped

Compare and Branch in ARM
- Uses condition codes for result of an arithmetic/logical instruction
  - Negative, zero, carry, overflow
  - Compare instructions to set condition codes without keeping the result
- Each instruction can be conditional
  - Top 4 bits of instruction word: condition value
  - Can avoid branches over single instructions

Instruction Encoding
[Figure]

The Intel x86 ISA
- Evolution with backward compatibility
  - 8080 (1974): 8-bit microprocessor
    - Accumulator, plus 3 index-register pairs
  - 8086 (1978): 16-bit extension to 8080
    - Complex instruction set (CISC)
  - 8087 (1980): floating-point coprocessor
    - Adds FP instructions and register stack
  - 80286 (1982): 24-bit addresses, MMU
    - Segmented memory mapping and protection
  - 80386 (1985): 32-bit extension (now IA-32)
    - Additional addressing modes and operations
    - Paged memory mapping as well as segments

The Intel x86 ISA
- Further evolution…
  - i486 (1989): pipelined, on-chip caches and FPU
    - Compatible competitors: AMD, Cyrix, …
  - Pentium (1993): superscalar, 64-bit datapath
    - Later versions added MMX (Multi-Media eXtension) instructions
    - The infamous FDIV bug
  - Pentium Pro (1995), Pentium II (1997)
    - New microarchitecture (see Colwell, The Pentium Chronicles)
  - Pentium III (1999)
    - Added SSE (Streaming SIMD Extensions) and associated registers
  - Pentium 4 (2001)
    - New microarchitecture
    - Added SSE2 instructions

The Intel x86 ISA
- And further…
  - AMD64 (2003): extended architecture to 64 bits
  - EM64T – Extended Memory 64 Technology (2004)
    - AMD64 adopted by Intel (with refinements)
    - Added SSE3 instructions
  - Intel Core (2006)
    - Added SSE4 instructions, virtual machine support
  - AMD64 (announced 2007): SSE5 instructions
    - Intel declined to follow, instead…
  - Advanced Vector Extension (announced 2008)
    - Longer SSE registers, more instructions
- If Intel didn’t extend with compatibility, its competitors would!
  - Technical elegance ≠ market success

Basic x86 Registers
[Figure]

Basic x86 Addressing Modes
- Two operands per instruction
    Source/dest operand    Second source operand
    Register               Register
    Register               Immediate
    Register               Memory
    Memory                 Register
    Memory                 Immediate
- Memory addressing modes
  - Address in register
  - Address = Rbase + displacement
  - Address = Rbase + 2^scale × Rindex (scale = 0, 1, 2, or 3)
  - Address = Rbase + 2^scale × Rindex + displacement

x86 Instruction Encoding
- Variable length encoding
  - Postfix bytes specify addressing mode
  - Prefix bytes modify operation
    - Operand length, repetition, locking, …
[Figure]

Implementing IA-32
- Complex instruction set makes implementation difficult
  - Hardware translates instructions to simpler microoperations
    - Simple instructions: 1–1
    - Complex instructions: 1–many
  - Microengine similar to RISC
  - Market share makes this economically viable
- Comparable performance to RISC
  - Compilers avoid complex instructions
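
The memory addressing modes listed under Basic x86 Addressing Modes all reduce to one address calculation, which can be sketched in C. The function name effective_address and its parameter names are hypothetical, chosen for this illustration only; 32-bit wrap-around arithmetic is assumed, and the simpler modes fall out by passing zero for the unused terms.

```c
#include <assert.h>
#include <stdint.h>

/* Most general mode from the list:
   Address = Rbase + 2^scale * Rindex + displacement, with scale in 0..3. */
static uint32_t effective_address(uint32_t rbase, uint32_t rindex,
                                  unsigned scale, int32_t displacement)
{
    assert(scale <= 3u);    /* scale = 0, 1, 2, or 3 */
    return rbase + (rindex << scale) + (uint32_t)displacement;
}
```

With scale = 2 the index is multiplied by 4, which matches stepping through an array of 32-bit words; "Address = Rbase + displacement" is the same call with rindex = 0.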

Fallacies
- Powerful instruction ⇒ higher performance
  - Fewer instructions required
  - But complex instructions are hard to implement
    - May slow down all instructions, including simple ones
  - Compilers are good at making fast code from simple instructions
- Use assembly code for high performance
  - But modern compilers are better at dealing with modern processors
  - More lines of code ⇒ more errors and less productivity

Fallacies
- Backward compatibility ⇒ instruction set doesn’t change
  - But they do accrete more instructions
[Figure: x86 instruction set]

Pitfalls
- Sequential words are not at sequential addresses
  - Increment by 4, not by 1!
- Keeping a pointer to an automatic variable after procedure returns
  - e.g., passing pointer back via an argument
  - Pointer becomes invalid when stack popped

Concluding Remarks
- Design principles
  1. Simplicity favors regularity
  2. Smaller is faster
  3. Make the common case fast
  4. Good design demands good compromises
- Layers of software/hardware
  - Compiler, assembler, hardware
- MIPS: typical of RISC ISAs
  - c.f. x86

Concluding Remarks
- Measure MIPS instruction executions in benchmark programs
  - Consider making the common case fast
  - Consider compromises
    Instruction class    MIPS examples                         SPEC2006 Int    SPEC2006 FP
    Arithmetic           add, sub, addi                        16%             48%
    Data transfer        lw, sw, lb, lbu, lh, lhu, sb, lui     35%             36%
    Logical              and, or, nor, andi, ori, sll, srl     12%             4%
    Cond. Branch         beq, bne, slt, slti, sltiu            34%             8%
    Jump                 j, jr, jal                            2%              0%
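
The first pitfall above (sequential words are 4 bytes apart, not 1) can be checked with a short C sketch. The helper names word_stride_bytes and word_address are hypothetical, and 32-bit words on a byte-addressed machine with 8-bit bytes are assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Distance in bytes between element 0 and element 1 of a word array. */
static size_t word_stride_bytes(void)
{
    int32_t words[2] = {0, 0};
    const char *first  = (const char *)&words[0];
    const char *second = (const char *)&words[1];
    return (size_t)(second - first);
}

/* Byte address of word number `index`, given the byte address of word 0. */
static uint32_t word_address(uint32_t base, uint32_t index)
{
    assert(base % 4u == 0);    /* words are aligned */
    return base + 4u * index;  /* increment by 4, not by 1 */
}
```

This is the same stepping seen in the Target Addressing Example, where consecutive instruction words sit at 80000, 80004, 80008, and so on.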