Machines, Machine Languages, and Digital Logic 37

2.2.4 BRANCH INSTRUCTIONS

In Chapter 1 we mentioned the program counter (PC) as pointing to the next instruction to be executed. Thus, the PC controls program flow. During normal program execution, the PC is incremented during the instruction fetch to point to the next instruction. A transfer of control, or a branch, to an instruction other than the next one in sequence requires computation of a target address, which is the address to which control is to be transferred. A target address is specified in a branch or jump instruction. The target address is loaded into the PC, replacing the address stored there.

There may be special branch target registers within the CPU. If a branch target register is preloaded with the target address prior to the execution of the branch instruction, the CPU will prefetch the instruction located at the branch target address, so that instruction will be ready for execution when the branch instruction finishes execution. Branch target registers are examples of registers that have "personality," as discussed at the beginning of Section 2.1.

A branch may be unconditional, as is the C goto statement, or it may be conditional, which means it depends on whether some condition within the processor state is true or false. The conditional branch is used to implement the condition part of C statements such as

    if (X < 0) X = -X;

There is no machine instruction that corresponds directly to the conditional statement above. The approach most machines take is to set various status flags within the CPU as a result of ALU operations. The instruction set contains a number of conditional branch instructions that test various of these flags and then branch or not according to their settings. The bit or bits that describe the condition are stored in a register variously called the processor status word (PSW), the condition code (CC) register, or the status register.
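The flag-then-branch approach described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a 32-bit word, the conventional flag names N, Z, V, and C, a carry flag that records a borrow, and a signed "greater than or equal" branch that is taken when N equals V. The function names are hypothetical and do not model any particular machine.

```python
MASK = 0xFFFFFFFF  # assumed 32-bit machine word


def compare(a, b):
    """Set hypothetical N, Z, V, C flags from the 32-bit subtraction a - b."""
    ua, ub = a & MASK, b & MASK
    diff = (ua - ub) & MASK
    sign_a, sign_b, sign_d = ua >> 31, ub >> 31, diff >> 31
    return {
        "N": sign_d,                                      # result is negative
        "Z": int(diff == 0),                              # result is zero
        "V": int(sign_a != sign_b and sign_d != sign_a),  # signed overflow
        "C": int(ua < ub),                                # borrow (assumed convention)
    }


def branch_if_ge(flags):
    """Signed 'greater than or equal' test: taken when N equals V."""
    return flags["N"] == flags["V"]


def absolute_value(x):
    """if (X < 0) X = -X; expressed as compare, conditional branch, negate."""
    flags = compare(x, 0)    # compare X with 0, setting the flags
    if branch_if_ge(flags):  # branch over the negation when X >= 0
        return x
    return -x                # negate X


if __name__ == "__main__":
    print(absolute_value(-5), absolute_value(7))
```

The point of the sketch is the separation of concerns: the compare step only records facts about the result, and the branch step only inspects those recorded facts.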
Some of the status bits record the results of arithmetic operations. The VAX11 code to implement the preceding statement is

    CMP  X, 0
    BGE  OVER
    MNEG X, X
    OVER:

The most common condition-code bits are zero (Z), overflow (V), carry (C), and negative (N), which are set to indicate that the last arithmetic operation resulted in a zero result, an arithmetic overflow, a carry-out of the most significant bit (msb), or a negative result, respectively.

An alternative approach is to compute the difference of two values and store the result in a register. A conditional branch instruction is then used to test the computed value to see whether the condition is met. This is the method used in the SRC machine to be described in Sections 2.3 and 2.4. We discuss conditional branching and the merits of these and other approaches in Chapter 6, "Computer Arithmetic and the Arithmetic Unit."

Table 2.3 shows several examples of branch instructions. Notice that the SOB (subtract one and branch) and JCXZ instructions are specifically designed to control loops; they test a register against zero and branch on the outcome (SOB branches while the result is nonzero, and JCXZ jumps when CX is zero). The SOB instruction even handles decrementing the register prior to the test. These instructions are used to implement higher-level language constructs such as for, while, and repeat.

38 Computer Systems Design and Architecture

Table 2.3 Example Branch Instructions

Instruction      Meaning                                                      Machine
BLBS A, Tgt      Branch to address Tgt if the least significant bit at        VAX11
                 location A is set.
bun r2           Branch to location in r2 if the previous floating point      PPC G4
                 comparison signaled that one or more of the values was
                 not a number.
beq $2, $1, 32   Branch to location PC + 4 + 32 if contents of $1 and $2      MIPS R3000
                 are equal.
SOB R4, Loop     Decrement R4 and branch to address Loop if result ≠ 0.       DEC PDP11
JCXZ Addr        Jump to Addr if contents of register CX = 0.                 Intel 8086

2.2.5 4-, 3-, 2-, 1-, AND 0-ADDRESS AND GENERAL REGISTER MACHINE CLASSES

Machine instructions must be encoded into a bit pattern that somehow specifies all of the four items discussed at the start of Section 2.2. Two examples of such encodings were shown in Table 1.2 for the MC68000 processor. The instruction set designer would like to minimize the number of bits devoted to the specification, while at the same time allowing maximum flexibility in how these items can be specified. All things being equal, the designer would like the entire encoding for the instruction to fit into one machine word, and in fact this has become a hallmark of the RISC approach.

For a two-operand arithmetic instruction, five items need to be specified:

1. Operation to be performed
2. Location of first operand
3. Location of second operand
4. Place to store the result
5. Location of next instruction to be executed

In this section we consider a number of abstract or hypothetical machines that vary in how many of these five items are explicitly specified, from five down to one. In each hypothetical machine, we study the encoding of an ALU instruction.

As we consider each machine, we quantify the number of bits required to encode one of its instructions. To make the encodings explicit, we assume that the hypothetical machine has 24-bit memory addresses and has 128 instructions. This means 3 bytes will be required to encode each address, and 7 bits to specify one of the 128 instructions. The 7 bits will be rounded up to 1 byte.

The 4-Address Machines and Operations. The 4-address machine specifies all of the last four items in the list above, so it will require an address for each operand, one for the result, and one for the next instruction. Figure 2.3 shows schematically the programmer's model of the machine operation and instruction format. Notice that there are no programmer-accessible registers in the CPU, as none are needed.
The operands are both fetched from memory, and the result is returned to memory. The address of the next instruction is also specified as an explicit value.

[Fig. 2.3 The 4-Address Machine and Instruction Format. Instruction: add, Res, Op1, Op2, NextI (Res ← Op1 + Op2). Format fields and widths in bits: add (8), ResAddr (24), Op1Addr (24), Op2Addr (24), NextIAddr (24), giving which operation, where to put the result, where to find the operands, and where to find the next instruction.]

[Fig. 2.3a Layout of a 4-Address Instruction in Memory. Five 24-bit words: the first holds the op code with the remaining bits wasted, followed by the operand 1 address, the operand 2 address, the result address, and the address of the next instruction.]

Each address requires 3 bytes, so it will require 4 × 3 + 1 = 13 bytes to encode a 4-address ALU instruction. If the normal path between memory and the CPU is 24-bit word "chunks," then the instruction will occupy five words in memory, and five words will need to be transferred to the CPU just to specify the instruction. A layout of the instruction in memory might appear as shown in Figure 2.3a.

Let us now count the number of memory accesses required when the instruction executes. Five words will be transferred to the CPU when the instruction itself is fetched, as shown in the preceding figure. Then the two words representing the operands themselves need to be fetched into the CPU, and after the addition has been performed, the result needs to be written back to memory. This means that 5 + 2 + 1 = 8 words must be transferred to add the two words and store the result.

Because of the large instruction word size and number of memory accesses, the 4-address machine and instruction format is not normally seen in machine design, although the 4-address structure is used internally in some implementations of computer control units. This kind of controller implementation is known as microcoded control.
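The byte and word counts above follow mechanically from the stated assumptions (24-bit addresses, 7-bit operation codes rounded up to a byte, 24-bit memory words). The sketch below tallies them; the function and constant names are hypothetical and chosen only for this illustration.

```python
ADDRESS_BITS = 24  # from the text: 24-bit memory addresses
OPCODE_BITS = 7    # from the text: 128 instructions need 7 bits
WORD_BYTES = 3     # from the text: memory is accessed in 24-bit words


def bytes_needed(bits):
    """Round a bit count up to whole bytes."""
    return -(-bits // 8)


def instruction_bytes(num_addresses):
    """One opcode byte plus three bytes per explicit memory address."""
    return bytes_needed(OPCODE_BITS) + num_addresses * bytes_needed(ADDRESS_BITS)


def instruction_words(num_addresses):
    """Number of 24-bit words the instruction occupies in memory."""
    return -(-instruction_bytes(num_addresses) // WORD_BYTES)


def words_transferred(num_addresses, operand_reads=2, result_writes=1):
    """Instruction fetch plus operand reads plus the result write."""
    return instruction_words(num_addresses) + operand_reads + result_writes


if __name__ == "__main__":
    print(instruction_bytes(4), instruction_words(4), words_transferred(4))
```

Calling the same functions with a smaller address count shows how each address removed from the format shortens the instruction by three bytes.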
In a microcoded design, the steps required to execute an instruction are themselves stored as a sequence of microcode instructions that are executed to effect instruction execution. We discuss the design of microcoded machines in Chapter 4.

Time and space costs make the 4-address format uncompetitive for other uses. In fact, much of the effort designers put into instruction set organization goes toward reducing the number of instruction bits needed to specify the preceding five items. In some hypothetical implementation domain where memory costs and access times were small compared to ALU execution time, however, the 4-address machine could be considered as a design alternative.

[Fig. 2.4 The 3-Address Machine and Instruction Format. Instruction: add, Res, Op1, Op2 (Res ← Op2 + Op1). The program counter in the CPU says where to find the next instruction. Format fields and widths in bits: add (8), ResAddr (24), Op1Addr (24), Op2Addr (24), giving which operation, where to put the result, and where to find the operands.]

The 3-Address Machines and Operations. The inclusion within the CPU of a program counter that always points to the next instruction eliminates the need to specify the address of the next instruction in all but the class of branch instructions. It is now the responsibility of the control unit to know the size of the currently executing instruction, so the control unit can advance the PC to point to the next address beyond. Machines with a program counter but no operand storage in the CPU are known as 3-address machines. Figure 2.4 shows the operation of a 3-address machine and the corresponding instruction format. Note the inclusion of a program counter that points to the next instruction. Figure 2.4 also shows that instructions and data may be stored in different parts of memory.
The instruction format now includes only four fields: one that specifies which operation is to be performed, two that specify operand memory addresses, and one that specifies the result address. There is neither the need nor the ability to specify the address of the next instruction. In our example above, the number of bytes would be reduced from 13 to 3 × 3 + 1 = 10 bytes, or four 24-bit words in a machine that accessed 24-bit words instead of bytes.

The 2-Address Machines and Operations. A reduction to two addresses can be obtained by storing the result into the memory address of one of the operands. Figure 2.5 shows the machine structure and instruction format for the 2-address machine. The change from a 3-address machine to a 2-address machine does not require any change in the register structure of the CPU, but only in the instruction meaning and format. Now our example instruction is reduced from 10 bytes to 2 × 3 + 1 = 7 bytes, or three 24-bit words in a word-oriented machine.

[Fig. 2.5 The 2-Address Machine and Instruction Format. Instruction: add Op2, Op1 (Op2 ← Op2 + Op1). The program counter says where to find the next instruction. Format fields and widths in bits: add (8), Op2Addr (24), Op1Addr (24), giving which operation, where to find the operands, and where to put the result.]

The 1-Address (Accumulator) Machines and Instructions. Let us now add a single accumulator, Acc, to the CPU and use it both for the source of one operand and as the result destination. Figure 2.6 shows the programmer's model and instruction format for such a machine. Now the result is kept in the accumulator in the CPU and may be used for further computations. The Acc serves both as the source of one operand and as the storage location for the result. Notice that because there is only a single accumulator, it need not be mentioned in the machine instruction.

[Fig. 2.6 The 1-Address Machine and Instruction Format. Instruction: add Op1 (Acc ← Acc + Op1). The accumulator is where to find operand 2 and where to put the result. Format fields and widths in bits: add (8), Op1Addr (24), giving which operation and where to find operand 1.]

The 1-address instruction requires additional operations to load and store the accumulator's contents, however. These instructions are also 1-address instructions. The instruction lda Addr loads the accumulator from address Addr, and sta Addr stores the accumulator's contents into address Addr:

    Acc = OpAddr;    →  lda OpAddr
    OpAddr = Acc;    →  sta OpAddr

Our example addition now requires only 1 × 3 + 1 = 4 bytes, or two 24-bit words for the addition, although if one operand were not in memory, or if the final result needed to be written back to memory, additional load or store instructions would be required.

The 1-address machines generally provide a minimum in the size of both program and CPU memory required, and the architecture was quite popular in very early mainframes and early microcomputers. The Intel 8080, Motorola 6800, and MOS Technology 6502 were examples of machines that contained accumulators.

The 0-Address (Stack) Computers and Address Formats. The inclusion of a push-down stack in the CPU allows ALU instructions with no addresses. Operands are pushed onto the stack from memory, and ALU operations implicitly operate on the top members of the stack. Figure 2.7 shows the add operation performed on the two operands in the top and second positions on the stack. The operation removes both operands from the stack and replaces them with the result. The push operation from memory to stack is also shown.
The code to add two memory operands is a bit more complex:

    Op3 = Op1 + Op2;  →  push Op1
                         push Op2
                         add
                         pop Op3

[Fig. 2.7 The 0-Address, or Stack, Machine and Instruction Formats. Instruction: push Op1 (TOS ← Op1), with format fields push (8 bits) and Op1Addr (24 bits). Instruction: add (TOS ← TOS + SOS), with a single 8-bit field that gives the operation; the operands are found on the stack, and the result is put on the stack. The program counter says where to find the next instruction.]

Notice that the push and pop operations still require a memory address, and the byte count for the code above is 3 + 1 = 4 bytes for each push and pop, plus 1 byte for the add, for a total of 3 × 4 + 1 = 13 bytes. The drawback of a 0-address computer is that operands must always be in the top two stack locations, and extra instructions may be required to get them there.

Stack machines, like stack calculators, have their adherents. General register machines have achieved more popularity in recent times, however, probably because they are more amenable to machine hardware speedup techniques, such as pipelining, that run instructions in parallel. We cover pipeline techniques in Chapter 5.

Example 2.1 Expression Evaluation with 3-, 2-, 1-, and 0-Address Machines

Evaluate the expression a = (b + c) * d - e in 3-, 2-, 1-, and 0-address machines. For these machines, minimal code to evaluate this expression is shown in the following table:

3-Address      2-Address     Accumulator    Stack
add a, b, c    load a, b     lda b          push b
mpy a, a, d    add a, c      add c          push c
sub a, a, e    mpy a, d      mpy d          add
               sub a, e      sub e          push d
                             sta a          mpy
                                            push e
                                            sub
                                            pop a

General Register Machines and 1-1/2 Address Instructions. Accumulator and stack machines offer the advantage that an intermediate result can be retained in the CPU. Complex operations make it worthwhile to store more than one temporary result in the CPU.
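To make the accumulator and stack columns of Example 2.1 concrete, here is a minimal sketch of two toy interpreters that run those instruction sequences. It rests on several assumptions that are not in the text: memory is modeled as a Python dictionary keyed by variable name rather than by 24-bit address, sub on the stack machine computes second-on-stack minus top-of-stack, and the function names are hypothetical.

```python
def run_accumulator(program, memory):
    """Execute 1-address code: every instruction names one memory operand."""
    acc = 0
    for op, addr in program:
        if op == "lda":
            acc = memory[addr]
        elif op == "sta":
            memory[addr] = acc
        elif op == "add":
            acc += memory[addr]
        elif op == "sub":
            acc -= memory[addr]
        elif op == "mpy":
            acc *= memory[addr]
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


def run_stack(program, memory):
    """Execute 0-address code: only push and pop name a memory operand."""
    stack = []
    for instr in program:
        op = instr[0]
        if op == "push":
            stack.append(memory[instr[1]])
        elif op == "pop":
            memory[instr[1]] = stack.pop()
        elif op in ("add", "sub", "mpy"):
            top = stack.pop()     # TOS
            second = stack.pop()  # SOS
            if op == "add":
                stack.append(second + top)
            elif op == "sub":
                stack.append(second - top)  # assumed operand order
            else:
                stack.append(second * top)
        else:
            raise ValueError(f"unknown instruction: {op}")
    return memory


# The accumulator and stack sequences as listed in Example 2.1.
ACCUMULATOR_CODE = [("lda", "b"), ("add", "c"), ("mpy", "d"), ("sub", "e"), ("sta", "a")]
STACK_CODE = [("push", "b"), ("push", "c"), ("add",), ("push", "d"),
              ("mpy",), ("push", "e"), ("sub",), ("pop", "a")]

if __name__ == "__main__":
    initial = {"a": 0, "b": 2, "c": 3, "d": 4, "e": 5}
    print(run_accumulator(ACCUMULATOR_CODE, dict(initial))["a"])
    print(run_stack(STACK_CODE, dict(initial))["a"])
```

Tracing the listed sequences shows that both store (b + c) * d - e in a: the accumulator version needs five instructions that each carry an address, while the stack version needs eight instructions, only five of which carry one.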
Supplying more than one register to the CPU requires the use of instruction bits to encode which register is to be used, but specifying one of n registers requires only log2(n) bits, and temporary registers can make instruction execution much more efficient.

A common type of ALU instruction in the general register machine is the analog of the 3-address instruction, but it uses CPU registers instead of memory addresses. These "small" addresses that specify registers instead of memory addresses are sometimes referred to as half addresses. An instruction that specifies one operand in memory and one operand in a register would be known as a 1-1/2 address instruction. Figure 2.8 shows the structure and formats of a general register machine. The figure shows both a load from memory to register R8 and an add of registers R6 and R4 with the result stored in R2.

[Fig. 2.8 General Register Machine and Instruction Formats. Instruction: load R8, Op1 (R8 ← Op1), with format fields load, R8, Op1Addr. Instruction: add R2, R4, R6 (R2 ← R4 + R6), with format fields add, R2, R4, R6. The program counter says where to find the next instruction.]

Our encoding example is now a bit more complex. Assume that there are 32 general-purpose registers. Each register reference thus requires 5 bits to specify 1 of the 32 registers, and our 3-register add instruction now requires 5 × 3 + 7 = 22 bits, which means we can encode the instruction in one 24-bit word. The load instruction will require 7 + 5 + 24 bits, or two 24-bit words.

General register machines provide the greatest flexibility to the programmer, and virtually every new architecture since 1980 has been of this class. You may wonder at the seeming progression from accumulator machines to general register machines. This is due in large part to the reduction in the cost of machine memory. When RAM is $15 per megabyte, having many general purpose registers is a fine idea.
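Returning to the encoding arithmetic for the 3-register add, the following sketch packs a 7-bit operation code and three 5-bit register numbers into a single integer and unpacks them again. The field order (opcode in the high bits, then destination, then the two sources), the opcode value chosen for add, and all function names are assumptions made for illustration; the text fixes only the field widths.

```python
OPCODE_BITS = 7    # from the text: 128 instructions
REGISTER_BITS = 5  # from the text: 32 general-purpose registers
WORD_BITS = 24     # from the text: 24-bit memory words
ADDRESS_BITS = 24  # from the text: 24-bit memory addresses

ADD_OPCODE = 12    # hypothetical opcode number, not from the text


def encode_rrr(opcode, dest, src1, src2):
    """Pack opcode and three register numbers into a 22-bit value."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 7 bits")
    for reg in (dest, src1, src2):
        if not 0 <= reg < (1 << REGISTER_BITS):
            raise ValueError("register number does not fit in 5 bits")
    return (opcode << 15) | (dest << 10) | (src1 << 5) | src2


def decode_rrr(word):
    """Recover (opcode, dest, src1, src2) from a packed instruction."""
    return ((word >> 15) & 0x7F, (word >> 10) & 0x1F,
            (word >> 5) & 0x1F, word & 0x1F)


def words_for_bits(bits):
    """Number of 24-bit words needed to hold the given number of bits."""
    return -(-bits // WORD_BITS)


if __name__ == "__main__":
    word = encode_rrr(ADD_OPCODE, 2, 4, 6)  # add R2, R4, R6
    print(f"{word:022b}", decode_rrr(word))
    print(words_for_bits(OPCODE_BITS + REGISTER_BITS + ADDRESS_BITS))
```

The register-only format leaves two bits of the 24-bit word unused, whereas the load, which still carries a full memory address, spills into a second word.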
It was not so fine in the heyday of the accumulator, during the early days of computing, when a single bit cost $25.

Classifying Machines by Operand and Result Location. Bear in mind that the preceding classes are hypothetical. First of all, there are no 4-address machines. Second, nearly all real machines provide some combination of the above instruction classes. The VAX11 is probably the champion in this category, as it includes instructions from all classes. Real machines are usually classed as being in the load-store, register-memory, or memory-memory classes.

Many modern computers, including RISCs, are of the load-store, sometimes called register-to-register, variety. These are 1-1/2-address machines in which memory access instructions are limited to two instructions: load and store. The load instruction moves data from memory to a processor register, and the store instruction moves data from the processor to memory. ALU and branch operations in load-store machines can accept only operands located in processor registers, and they must store the result in a processor register. Load-store machines are sometimes called register-to-register machines because ALU operations must have operands and results in registers. The philosophy is that moving data values back and forth between memory and the processor is an expensive operation, and that the instruction set design should discourage this operation by limiting its usage to just a few explicit load and store instructions.

Register-memory machines locate operands and result in a combination of memory and registers. They are classed as 1- or 1-1/2-address machines, in which one operand or the result must be an accumulator or general register.

Memory-to-memory machines allow both the operands and the result to reside in memory.
They are classed as either 2- or 3-address machines, depending on whether one of the operand locations also serves as a result location.

Key Concepts: Trade-Offs in Instruction Set and Processor Registers

• The range of choices of processor-state structure and instruction types trades off flexibility in the placement of operands and results against the amount of information that must be specified by an instruction.
• The 3-address machines have the shortest code sequences but require an unreasonably large number of bits per instruction.
• The 0-address machines have the longest code sequences and the shortest individual instructions.
• Even in 0-address machines there are 1-address instructions, push and pop.
• General register machines modify the addressing rules by specifying one of a small set of registers, such as 32, by a short address, such as 5 bits. A general register counterpart to a 3-address instruction could specify 2 registers and 1 memory address.
• Load-store machines only include full memory addresses in instructions that move data between memory and registers: load and store.
• Current technology makes register access much faster than memory access and places a premium on short instructions. Both favor the general register organization.

2.2.6 ACCESS PATHS TO OPERANDS: ADDRESSING MODES

The computation process involves the continuous shuffling of program and data into and out of the CPU. The many different kinds of data structures and program variable references possible in modern high-level languages have driven machine architects to develop many sophisticated ways of providing access paths to operands in memory and CPU registers: addressing modes. Below we provide an informal description of some of the more common ones. After we discuss the SRC computer and the RTN description language, we use RTN to formally describe these and other less common addressing modes.
Be aware that the terms used to describe addressing modes are not standardized in any way, and different machine and assembly language designers have their own terminology for addressing modes.

To access an operand in memory, the CPU must first generate an address, which it then issues to the memory subsystem. That address is referred to as an effective address. The address may be computed in various ways, and the results of that computation may be interpreted in various ways. The process may involve computing an address in memory where the operand address is stored. Some of the more common addressing modes are shown in Figure 2.9.

The immediate addressing mode, shown in Figure 2.9a, is used to access constants stored in the instruction. It supplies an operand without computing an address, so it is not used for result addressing. Immediate addressing provides one of the two means of introducing constants into a program. (The other way is by storing the constants as data items in memory and retrieving them by direct addressing.)

The previous code examples have used the direct addressing mode, shown in Figure 2.9b. The address of the operand is specified as a constant contained in the instruction.

[Fig. 2.9 Common Addressing Modes
(a) Immediate addressing: instruction contains the operand
(b) Direct addressing: instruction contains address of operand
(c) Indirect addressing: instruction contains address of address of operand
(d) Register direct addressing: register contains operand
(e) Register indirect addressing: register contains address of operand
(f) Displacement (based or indexed) addressing: address of operand = register + constant
(g) Relative addressing: address of operand = PC + constant
(h) Implied addressing: instruction implies the address of operand]

In indirect addressing, shown in Figure 2.9c, a constant in the instruction specifies not the address of the value, but the address of the address of the value. An example of the use of indirect addressing is in implementing pointers, where the pointer, which is an address, is stored in memory at the pointer address. Thus, two memory references are required to access the value. The CPU must first fetch the pointer, which is stored in memory; then, having that address, the CPU accesses the value stored at that address.

In the register direct mode, shown in Figure 2.9d, the operand is contained in the specified register.

When the address of the operand is in a register, the mode is referred to as the register indirect mode, shown in Figure 2.9e. This addressing mode is used to sequentially access the elements of an array stored in memory. The starting address of the array is stored in a register, an access made, and the register incremented to point to the next element.

To access arrays, or components of the C struct, or the Pascal record (which by definition are stored at a fixed offset from the start address of the structure), the indexed mode, sometimes called displacement or based addressing, is used, as shown in Figure 2.9f. The memory address is formed by adding a fixed constant, usually contained within the instruction, to the address value contained in a register. The term indexed is normally used when the constant value is the base of an array in memory, added to the "index" stored in a register.
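The modes described so far differ only in how many lookups stand between the instruction and the operand. The sketch below makes that explicit for modes (a) through (f). It is an informal illustration, not the formal RTN description promised in the text: the mode names as strings, the dictionary models of memory and registers, and the function name are all assumptions.

```python
def fetch_operand(mode, field, memory, registers):
    """Return the operand selected by an addressing mode.

    `field` is the constant or register number carried in the instruction;
    for displacement addressing it is a (register, constant) pair.
    """
    if mode == "immediate":          # (a) the instruction holds the operand
        return field
    if mode == "direct":             # (b) the instruction holds the address
        return memory[field]
    if mode == "indirect":           # (c) the instruction holds the address of the address
        return memory[memory[field]]
    if mode == "register_direct":    # (d) the register holds the operand
        return registers[field]
    if mode == "register_indirect":  # (e) the register holds the address
        return memory[registers[field]]
    if mode == "displacement":       # (f) address = register + constant
        reg, constant = field
        return memory[registers[reg] + constant]
    raise ValueError(f"unknown addressing mode: {mode}")


if __name__ == "__main__":
    memory = {100: 200, 200: 42, 204: 7}  # address -> contents
    registers = {1: 200, 2: 99}           # register number -> contents
    for mode, field in [("immediate", 5), ("direct", 200), ("indirect", 100),
                        ("register_direct", 2), ("register_indirect", 1),
                        ("displacement", (1, 4))]:
        print(mode, fetch_operand(mode, field, memory, registers))
```

Counting the memory subscripts in each branch gives the number of memory references the mode costs: none for immediate and register direct, one for direct, register indirect, and displacement, and two for indirect.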
The term displacement is used when the base of a struct is held in a register and added to the constant offset, or displacement, of the field in the struct.

The relative addressing mode, shown in Figure 2.9g, is similar to indexed, but the base address is held in the PC rather than in another register. This allows the storage of memory operands at a fixed offset from the current instruction.

In implied addressing mode, the address of the operand(s) is implied by the instruction and need not be mentioned explicitly. For example, the instruction CMC shown in Figure 2.9h stands for "complement the carry flag". That the operand is the carry flag is implied by the instruction. Similarly, consider a hypothetical instruction MUL, which always multiplies the values present in two fixed registers, say X and Y, and stores the product in the register P. This instruction uses implied addressing mode. The implied addressing mode has the disadvantage that it is the least flexible, but has the advantage that the instruction can be encoded in a compact manner.

The previous discussion is only to provide a flavor for the complexity of addressing modes. A formal description of addressing modes is given in Section 2.5.

2.3 Informal Description of the Simple RISC Computer, SRC

In this section we provide an informal description of SRC, and in the next section we provide a formal description. This example machine is sufficiently simple and lacking in the complications necessary in real machines that in Chapters 4 and 5 it will serve as an example of detailed machine hardware design.

2.3.1 REGISTER AND MEMORY STRUCTURE

Figure 2.10 shows the programmer's model of the SRC machine. It is a general register machine, with 32 general purpose, 32-bit registers, plus a program counter (PC) and an instruction register (IR). Although the main memory is organized as an array of bytes, only 32-bit words can be fetched from or stored into main memory.
Its memory operand access follows the load-store model described previously. A word at address A is defined as the 4 bytes at that address and the succeeding three addresses. The byte at the lowest address contains the most significant 8 bits, the byte at the next address contains the next most significant 8 bits, and so on.

[Fig. 2.10 Programmer's Model of the SRC. The figure shows the SRC CPU (32 32-bit general purpose registers, the PC, and the IR), the main memory, and the instruction formats with an example of each. R[7] means contents of register 7; M[32] means contents of memory location 32. Examples:
ld r3, A           (R[3] = M[A])
ld r3, 4(r5)       (R[3] = M[R[5] + 4])
addi r2, r4, 1     (R[2] = R[4] + 1)
ldr r5, 8          (R[5] = M[PC + 8])
lar r6, 45         (R[6] = PC + 45)
neg r7, r9         (R[7] = -R[9])
brzr r4, r0        (branch to R[4] if R[0] == 0)
brlnz r6, r4, r0   (R[6] = PC; branch to R[4] if R[0] ≠ 0)
add r0, r2, r4     (R[0] = R[2] + R[4])
shr r0, r1, 4      (R[0] = R[1] shifted right by 4 bits)
shl r2, r4, r6     (R[2] = R[4] shifted left by count in R[6])
stop]

2.3.2 INSTRUCTION FORMATS

Figure 2.10 shows 23 instructions in 8 different formats:

• Load and store instructions: There are four load instructions (ld, ldr, la, and lar) and two store instructions (st and str).
• Branch instructions: There are two branch instructions, br and brl, that allow unconditional and conditional branches to an address contained in a specified
