Design of a Pipelined PowerPC Processor Using Verilog

University of Southampton
Faculty of Engineering, Science and Mathematics

School of Electronics and Computer Science
Design of a Pipelined PowerPC Processor using Verilog
by
Chidhambaranathan Rajamanikkam (canr1g09)
24 September 2010
A dissertation submitted in partial fulfillment of the degree of
MSc Microelectronics Systems Design

by examination and dissertation
Project Supervisor: B Iain McNally

Second Examiner: Dr. Koushik Maharatna
The PowerPC processor was designed by IBM. It is widely used in embedded systems because of its
low power consumption. The PowerPC processor is based on the RISC (Reduced Instruction Set
Computing) instruction set architecture.
This project presents the implementation of a 32-bit pipelined processor capable of executing
PowerPC instructions. The supported instructions include PowerPC fixed point integer instructions,
branch instructions and integer load/store instructions. The processor is designed in the Verilog
hardware description language, and its modules have been successfully tested using the NC Verilog
simulator.
This report describes the instruction set, the instruction forms and the architecture of the
PowerPC processor. It also covers the pipelining approach adopted and the data and control hazards
associated with it. The designed processor overcomes both data and control hazards.
Canr1g09
1
I would like to thank my project supervisor, B Iain McNally, for his valuable guidance and
support throughout this project. I would also like to extend my gratitude to my second examiner,
Dr. Koushik Maharatna, for his support.
ABSTRACT
ACKNOWLEDGEMENT
CONTENTS
LIST OF FIGURES
LIST OF TABLES
CHAPTER 1: INTRODUCTION
CHAPTER 2: BACKGROUND
2.1 POWERPC:
2.2 POWERPC REGISTERS:
2.2.1 General Purpose Registers:
2.2.2 Exception Register:
2.2.3 Count Register:
2.2.4 Condition Register:
2.2.5 Link Register:
2.3 POWERPC DATA TYPES:
2.4 POWERPC BRANCH INSTRUCTIONS:
2.4.1 Addressing Modes:
2.5 POWERPC LOAD/STORE INSTRUCTIONS:
2.5.1 Addressing Modes:
2.5.2 Load Instructions:
2.5.3 Store Instructions:
2.6 POWERPC FIXED POINT INTEGER INSTRUCTIONS:
2.6.1 Arithmetic Instructions:
2.6.2 Logical Instructions:
2.6.3 Sign-Extension Instructions:
2.6.4 Rotate Instructions:
2.6.4.2 PowerPC Rotate Instructions:
2.6.5 Shift Instructions:
2.6.5.1 Logical Left Shift Instructions:
2.6.5.2 Logical Right Shift Instructions:
2.6.5.3 Algebraic Shift Instructions:
2.7 PIPELINING OVERVIEW:
2.7.1 Pipelining Hazards:
2.7.1.1 Structural Hazards:
2.7.1.2 Control Hazards:
2.7.1.3 Data Hazards:
CHAPTER 3: DESIGN
3.1 INITIAL DATAPATH:
3.2 INSTRUCTION SET DESIGN:
3.2.1 Fixed Point Integer Instructions:
3.2.1.1 Fixed Point Arithmetic Instructions:
3.2.1.2 Fixed Point Logical Instructions:
3.2.1.3 Fixed Point Shift Instructions:
3.2.1.4 Fixed Point Rotate Instructions:
3.2.1.5 Fixed Point Compare Instructions:
3.2.2 Load/Store Instructions:
3.2.2.1 Load Instructions:
3.2.2.2 Store Instructions:
3.2.3 Branch Instructions:
3.2.4 Data forwarding and Load-Use:
3.2.5 System Call Instruction (sc):
CHAPTER 4: TESTING
4.1 CREATING INSTRUCTIONS:
4.2 FIXED POINT INTEGER INSTRUCTIONS:
4.2.1 Fixed Point Arithmetic Instructions:
4.2.2 Fixed Point Logical Instructions:
4.2.3 Fixed Point Shift Instructions:
4.2.4 Fixed Point Rotate Instructions:
4.2.5 Fixed Point Compare Instructions:
4.3 LOAD/STORE INSTRUCTIONS:
4.3.1 Load Instructions:
4.3.2 Store Instructions:
4.4 BRANCH INSTRUCTIONS:
4.5 DATA FORWARDING AND LOAD-USE:
4.6 SIGN-EXTENSION INSTRUCTIONS:
CHAPTER 5: PROJECT WORK PLAN AND MILESTONES
CHAPTER 6: CONCLUSION
6.1 ACHIEVEMENTS:
6.2 LIMITATIONS:
CHAPTER 7: FUTURE WORK
FIGURE 1: INDIRECT ADDRESSING MODE SOURCED FROM [3]
FIGURE 2: INDIRECT INDEXED ADDRESSING SOURCED FROM [3]
FIGURE 3: MASK (MB < ME) SOURCED FROM [4]
FIGURE 4: MASK (MB > ME) SOURCED FROM [4]
FIGURE 5: 5 STAGE PIPELINE SOURCED FROM [1]
FIGURE 6: CONTROL HAZARDS
FIGURE 7: DATA FORWARDING (DATA HAZARDS)
FIGURE 8: LOAD-USE
FIGURE 9: INITIAL DATAPATH
FIGURE 10: FIXED POINT ARITHMETIC INSTRUCTION
FIGURE 11: FIXED POINT LOGICAL INSTRUCTION
FIGURE 12: FIXED POINT SHIFT INSTRUCTION
FIGURE 13: FIXED POINT ROTATE INSTRUCTION
FIGURE 14: FIXED POINT COMPARE INSTRUCTION
FIGURE 15: LOAD INSTRUCTION
FIGURE 16: STORE INSTRUCTION
FIGURE 17: BRANCH INSTRUCTIONS
FIGURE 18: DATA FORWARDING AND LOAD USE
FIGURE 19: DESIGN BROWSER WINDOW
FIGURE 20: ADDI INSTRUCTION
FIGURE 21: ORI INSTRUCTION
FIGURE 22: SHIFT INSTRUCTION (SRAW)
FIGURE 23: RLWIMI INSTRUCTION
FIGURE 24: COMPARE INSTRUCTION
FIGURE 25: LOAD (LWZ) INSTRUCTION
FIGURE 26: STORE WITH UPDATE INSTRUCTION
FIGURE 27: BRANCH INSTRUCTIONS
FIGURE 28: LOAD-USE AND DATA FORWARDING
FIGURE 29: SIGN-EXTENSION INSTRUCTIONS
TABLE 1: INITIAL GANTT CHART
TABLE 2: FINAL GANTT CHART
The main aim of the project is to design a 32-bit pipelined PowerPC processor. Verilog HDL
is used as the hardware description language for writing the modules. Both the instructions and the
registers are 32 bits long. The modules are simulated and the final results of the simulation are
analysed. The designed processor runs fixed point integer instructions, branch instructions, integer
load/store instructions and sign-extension instructions. The fixed point integer instructions include
arithmetic, logical, compare, shift and rotate instructions.
Chapter 2 covers the background study of the PowerPC processor, including its registers and
instruction formats. The pipelining approach and the hazards occurring in a pipelined processor are
also covered in this chapter.
Chapter 3 covers the design architecture for the different instructions. The datapaths for the
fixed point integer instructions, load/store instructions and branch instructions are designed and
explained in this chapter.
Chapter 4 covers the testing of the design in the NC Verilog simulator. The final results of the
datapath design are discussed together with their waveforms.
The implemented instructions are listed in Appendix [3], and the instructions that are not
implemented in the design are listed in Appendix [4]. The program used for testing is shown in
Appendix [5].
2.1 PowerPC:
The PowerPC processor is a 32-bit processor capable of executing floating point, fixed point,
control and memory management instructions. The fixed point instructions include arithmetic,
logical, compare, shift and rotate instructions. PowerPC has general purpose registers and various
special purpose registers such as the program counter (also called the Next Instruction Pointer (NIP)
or Instruction Address Pointer), the link register and the count register [5]. Some PowerPC
processors also have 32 floating point registers (32 or 64 bits wide). PowerPC is an example of the
RISC architecture. The RISC architecture of the PowerPC has the following characteristics [5]:
All instructions in the PowerPC processor have a fixed length of 32 bits [5].
In PowerPC, data is loaded from memory into registers, operated on there, and then written back
to memory. Few instructions other than the load and store instructions manipulate memory
directly [5].
2.2 PowerPC Registers:

The PowerPC processor has 32 general purpose registers, a count register, a link register, a Next
Instruction Pointer, an exception register and a condition register.
2.2.1 General Purpose Registers:

The general purpose registers are 32 bits long and are used by the fixed point integer
instructions. A general purpose register is selected by the 5-bit address in the corresponding
register field of the instruction [4]. Any of the general purpose registers can store the result of
the operation performed by an instruction. All data manipulation is done in these registers, which
are internal to the processor [7].
2.2.2 Exception Register:

The exception register (XER) is 32 bits long in the 32-bit processor implementation [4]. It is
updated by the results of arithmetic operations that produce an overflow or a carry. This register
also indicates the number of bytes to be transferred by the load/store string indexed instructions
[4], [7]. The bit representation of the exception register is shown in Appendix [2].
The CA bit, XER[2], can be modified by the add-carrying, subtract-from-carrying, add-extended
and subtract-from-extended instructions. The CA bit is set to 1 whenever the arithmetic operation
produces a carry. The carry bit is also used by the rotate and shift instructions. The mtspr and
mcrxr instructions can be used to clear the CA bit [4].
The OV bit, XER[1], is updated only when the OE bit in the instruction is set to 1. The add,
subtract and negate instructions set the OV bit if the carry out of the msb is not equal to the carry
out of the msb+1; otherwise the OV bit is cleared. For the multiply and divide instructions, the OV
bit is set to 1 if the result cannot be represented in 32 bits. The mtspr and mcrxr instructions can
be used to clear the OV bit [4].
The SO bit, XER[0], is set to 1 whenever an instruction sets the overflow bit. This bit can be
cleared by the mtspr instruction [4].
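The CA, OV and SO rules above can be illustrated with a small behavioural model. This is only a sketch in Python, not the Verilog used in the design: the function name add_carrying and the dictionary representation of the XER are assumptions made for illustration, and the model treats SO as remaining set once it has been raised.

```python
MASK32 = 0xFFFFFFFF

def add_carrying(a, b, xer, oe=True):
    """Model a 32-bit add that records CA and, when OE=1, OV and SO.

    xer is a hypothetical dict with keys "SO", "OV" and "CA".
    A new dict is returned; the input dict is left unchanged.
    """
    a &= MASK32
    b &= MASK32
    total = a + b
    result = total & MASK32
    # Carry out of the most significant bit (bit 0 in PowerPC numbering).
    carry_out_msb = total >> 32
    # Carry out of the next lower bit (msb+1), i.e. the carry into the msb.
    carry_out_msb1 = ((a & 0x7FFFFFFF) + (b & 0x7FFFFFFF)) >> 31
    new_xer = dict(xer)
    new_xer["CA"] = carry_out_msb
    if oe:
        ov = int(carry_out_msb != carry_out_msb1)
        new_xer["OV"] = ov
        new_xer["SO"] = xer["SO"] | ov  # SO stays set once raised
    return result, new_xer
```

For example, adding 1 to 0x7FFFFFFF produces no carry out of the msb but does overflow the signed range, so the model sets OV and SO while leaving CA at 0.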
2.2.3 Count Register:

The count register (CTR) is a 32-bit register that can be used by the branch
instructions. The contents of the count register are used as the branch target address. The bit
representation of the count register is shown in Appendix [2].
2.2.4 Condition Register:

The condition register (CR) is a 32-bit register that reflects the results of some
instructions and is also used for testing and conditional branching. The 32-bit condition register
is divided into eight 4-bit fields, CR0-CR7 [4], [7]. The field specification of the condition
register is shown in Appendix [2]. Each CR field contains the bits LT, GT, EQ and SO. These bits are
updated by the results of the compare instructions. The CR0 field is modified by the result of a
fixed point instruction whenever the Rc field in the instruction is enabled. Instructions such as
addic., andi. and andis. also modify the CR0 field. The bit definitions for the CR0 field are as
follows [4]:
LT bit is set to 1 when the result is negative; otherwise it is cleared.
GT bit is set to 1 when the result is positive; otherwise it is cleared.
EQ bit is set to 1 when the result is zero; otherwise it is cleared.
SO bit is a copy of the SO bit in the exception register.
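The CR0 bit definitions listed above can be written as a short behavioural model. The following Python sketch is illustrative only; the function name cr0_from_result and the dictionary used for the CR0 field are assumptions, not part of the design.

```python
def cr0_from_result(result, xer_so):
    """Derive the CR0 bits from a 32-bit result and the current XER[SO] bit."""
    result &= 0xFFFFFFFF
    negative = (result >> 31) & 1      # the msb is the sign bit
    zero = int(result == 0)
    return {
        "LT": negative,
        "GT": int(not negative and not zero),
        "EQ": zero,
        "SO": xer_so & 1,              # copied from the exception register
    }
```

Exactly one of LT, GT and EQ is set for any result, which is what allows a conditional branch to test a single bit of the field.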
The CR1 field is modified by the floating point instructions. The remaining fields of the
condition register are modified by the compare instructions. The bit definitions for the CRn
(CR2-CR7) fields are as follows [4], [7]:
LT is set to 1 when register rA is less than the immediate value or register rB. The immediate
value can be signed or unsigned.
GT is set to 1 when register rA is greater than the immediate value or register rB. The
immediate value can be signed or unsigned.
EQ is set to 1 when register rA is equal to the immediate value or register rB.
SO bit is a copy of the XER[SO] bit.
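The difference between the signed and unsigned forms of the comparison can be seen in a small model. The Python sketch below is an illustration under stated assumptions: the names to_signed32 and compare_field are hypothetical, and the operand is taken to be already extended to 32 bits.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def compare_field(ra, operand, signed=True, xer_so=0):
    """Return the LT/GT/EQ/SO bits produced by comparing rA with an operand."""
    if signed:
        a, b = to_signed32(ra), to_signed32(operand)
    else:
        a, b = ra & MASK32, operand & MASK32
    return {"LT": int(a < b), "GT": int(a > b), "EQ": int(a == b), "SO": xer_so & 1}
```

The same bit pattern 0xFFFFFFFF compares as less than 1 when interpreted as signed (it is -1) and as greater than 1 when interpreted as unsigned.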
2.2.5 Link Register:

The link register (LR) is a 32-bit register used by the branch instructions. The field
specification of the link register is shown in Appendix [2]. It is also used for subroutine
linkage. There are two ways in which the branch instructions use the link register [4], [7]. The
Branch Conditional to Link Register (bclrx) instructions read the branch target address from the
link register. If the link register update option (LK) bit is enabled in a branch instruction, the
effective address of the instruction following the branch instruction is loaded into the link
register.
2.3 PowerPC Data Types:

The load and store instructions in the PowerPC processor support 8-bit (byte), 16-bit (halfword),
32-bit (word) and 64-bit (doubleword) data. Data can be stored in either little-endian or big-endian
style [3]. The unsigned byte can be used for logical or integer arithmetic operations. Some of the
load/store instructions load an unsigned byte from memory into a general purpose register by
zero-extending it on the left to the 32-bit register size [3]. The signed halfword is used for
arithmetic operations. Some of the load/store instructions load a signed halfword from memory into
a 32-bit register by sign-extending it on the left to the 32-bit size [3]. The unsigned word is
32 bits long and can be used for logical operations and as an address pointer [3]. The signed
word is used to perform arithmetic operations [3]. The unsigned doubleword can be used as an
address pointer [3].
2.4 PowerPC Branch Instructions:

There are two types of branches: conditional and unconditional. Both conditional and
unconditional branches alter the program flow in the forward or backward direction using the AA
signal [4]. The function of the AA signal is explained in section 2.4.1. The branch target address
can also be taken from the link register or the count register. One of the features of the link
register is to store the return address of a branch instruction. A conditional branch instruction
tests a bit in the condition register. If the condition is true, the program counter is modified;
otherwise the program flow is not altered [4]. A branch instruction can also affect the contents of
the count register [4]: the count register value is decremented by 1, and the new value is then
tested by the branch instruction [4]. The branch instructions use three types of addressing:
absolute, indexed and relative. The branch instructions implemented in the design are listed in
Appendix [3] and the instruction format is shown in Appendix [1].
The unconditional branch instruction modifies the program counter without testing any bit
[4]. The LI field in the instruction is extended to 32 bits by appending two 0 bits on the right and
sign-extending the msb to the left. The extended LI value gives the branch target address.
2.4.1 Addressing Modes:
Branch instructions use three addressing modes for calculating the branch target address. The
three addressing modes are explained in this section. Both the conditional and unconditional branch
instructions use absolute addressing [3]. For the unconditional branch, the effective address of
the next instruction is calculated from the 24-bit immediate value within the instruction. This
immediate value is extended to 32 bits by appending two 0 bits on the right and sign-extending on
the left. For the conditional branch, the effective address of the next instruction is calculated
from the 14-bit immediate value within the instruction [3]. This 14-bit value is extended to 32 bits
by appending two 0 bits on the right and sign-extending on the left.
Like absolute addressing, relative addressing is used for both conditional and unconditional
branching. The offset is calculated in the same way as the effective address in absolute addressing,
and is then added to the current instruction address to produce the next instruction address [3].
Indexed addressing is used only by the conditional branch instructions. The effective
address of the next instruction is taken from either the link register or the count register [3]. In
this case, the count register holds the branch target address. The count register can also be used
to hold the count for looping.
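The absolute and relative address calculations described in this section can be sketched as follows. This Python model is illustrative only: the names sign_extend and branch_target are hypothetical, the width of the immediate field is passed in as a parameter rather than fixed, and the aa argument stands for the AA signal (1 selects absolute addressing, 0 selects relative addressing).

```python
def sign_extend(value, bits):
    """Sign-extend the low `bits` bits of value to a Python integer."""
    value &= (1 << bits) - 1
    sign = 1 << (bits - 1)
    return (value ^ sign) - sign

def branch_target(current_address, field, field_bits, aa):
    """Compute the 32-bit branch target from an immediate field.

    field      : raw immediate field taken from the instruction
    field_bits : width of that field before the two 0 bits are appended
    aa         : 1 for absolute addressing, 0 for relative addressing
    """
    # Append two 0 bits on the right, then sign-extend on the left.
    offset = sign_extend(field << 2, field_bits + 2)
    target = offset if aa else current_address + offset
    return target & 0xFFFFFFFF
```

With a 24-bit field of all ones and relative addressing, the offset is -4, so a branch at address 0x1000 would go back one instruction to 0x0FFC.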
2.5 PowerPC Load/Store Instructions:

The fixed point integer load and store instructions are used to move data from data memory to a
specified general purpose register and from a general purpose register to data memory. The
load/store instructions implemented in the design are listed in Appendix [3] and the instruction
format is shown in Appendix [1].
2.5.1 Addressing Modes:

The PowerPC has two addressing modes for the load/store instructions. In register indirect
addressing mode, the instruction includes a 16-bit displacement which is added to the contents of
the base register [3]. With the update option, the effective address is also written back to the
base register, replacing its current contents. The other addressing mode for the load/store
instructions is register indirect indexed addressing [3]. In this mode, the instruction specifies a
base register and an index register, both of which may be any of the general purpose registers. The
effective address is calculated by adding the contents of the base register and the index register.
If update is enabled, the effective address is loaded into the base register. Figure 1 shows the
indirect addressing mode for the load/store instructions [3]. Figure 2 shows the indirect indexed
addressing mode for the load and store instructions.

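[Figure: the base register (GPR) and the 16-bit signed displacement are added to form the logical address, which is passed to address translation; with update, the address is written back to the base register.]
Figure 1: Indirect Addressing Mode sourced from [3]

[Figure: the base register (GPR) and the index register (GPR) are added to form the logical address, which is passed to address translation; with update, the address is written back to the base register.]
Figure 2: Indirect Indexed Addressing sourced from [3]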
The register indirect addressing mode can be represented in RTL as:

Effective address ← [base register] + displacement
The register indirect indexed addressing mode can be represented in RTL as:
Effective address ← [base register] + [index register]
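The two effective address calculations, including the optional update of the base register, can be modelled as below. This is a minimal Python sketch rather than the design's Verilog: the function names and the use of a 32-entry list for the general purpose registers are assumptions, and architectural special cases (such as any special treatment of register 0 as a base) are deliberately not modelled.

```python
MASK32 = 0xFFFFFFFF

def sign_extend16(value):
    """Sign-extend a 16-bit displacement to a Python integer."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value

def ea_register_indirect(gpr, base, disp16, update=False):
    """Effective address = [base register] + signed 16-bit displacement."""
    ea = (gpr[base] + sign_extend16(disp16)) & MASK32
    if update:
        gpr[base] = ea          # with update: write the address back
    return ea

def ea_register_indirect_indexed(gpr, base, index, update=False):
    """Effective address = [base register] + [index register]."""
    ea = (gpr[base] + gpr[index]) & MASK32
    if update:
        gpr[base] = ea
    return ea
```

The update form is convenient for walking through an array, because each access leaves the base register pointing at the element just accessed.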
2.5.2 Load Instructions:

The fixed point integer load instructions read data from the data memory and store it in any
of the general purpose registers. The load and zero instructions read data from memory and clear
the remaining high-order bits of the register to zero [4]. The load algebraic instructions read data
from memory and fill the high-order bits of the register with copies of the sign bit of the loaded
data [4]. The load with update instructions load data from memory and, in addition, update the base
register with the memory address.
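The difference between the load and zero and the load algebraic forms can be shown with a short model. In this Python sketch the function name load is hypothetical, memory is represented as a bytes object, and big-endian byte order is assumed for the example.

```python
def load(mem, ea, size, algebraic=False):
    """Read `size` bytes at address ea and extend the value to 32 bits.

    algebraic=False models load and zero (high-order bits cleared).
    algebraic=True models load algebraic (high-order bits copy the sign bit).
    """
    raw = int.from_bytes(mem[ea:ea + size], "big")   # big-endian assumed
    if algebraic:
        sign = 1 << (8 * size - 1)
        raw = ((raw ^ sign) - sign) & 0xFFFFFFFF     # replicate the sign bit
    return raw
```

For the halfword 0x8001, the load and zero form gives 0x00008001, whereas the algebraic form gives 0xFFFF8001.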
2.5.3 Store Instructions:

The fixed point integer store instructions read data from a general purpose register and
store it in the data memory [4]. PowerPC supports several types of store instructions. The store
with update instructions write data to memory and, in addition, update the base register with the
memory address.
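A store with update can be modelled in the same style. The Python sketch below is an assumption-laden illustration: store_with_update is a hypothetical name, memory is a bytearray, the displacement is passed as an already sign-extended integer, and big-endian byte order is assumed.

```python
def store_with_update(mem, gpr, rs, ra, disp, size):
    """Store the low `size` bytes of gpr[rs] at gpr[ra] + disp, then update gpr[ra]."""
    ea = (gpr[ra] + disp) & 0xFFFFFFFF
    value = gpr[rs] & ((1 << (8 * size)) - 1)        # keep only the low-order bytes
    mem[ea:ea + size] = value.to_bytes(size, "big")  # big-endian assumed
    gpr[ra] = ea                                     # base register receives the address
    return ea
```

Only the low-order bytes of the source register are written for byte and halfword stores; the register itself is left unchanged.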
2.6 PowerPC Fixed Point Integer Instructions:

The fixed point integer instructions use the general purpose registers for their operands and
for storing the result. The source operands are obtained either from general purpose registers or
from an immediate value. These instructions do not access data memory. Both signed and unsigned
integers can be used as source operands. The condition register and the exception register may be
updated by these instructions [4].
The PowerPC architecture supports several types of integer instructions [4]:
Arithmetic Instructions
Logical Instructions
Rotate Instructions
Compare Instructions
Shift Instructions
2.6.1 Arithmetic Instructions:

The arithmetic instructions perform addition, subtraction, negation, multiplication and
division. These instructions use general purpose registers as their source and destination operands.
Some instructions use an immediate value as a source operand. Integer arithmetic instructions
support both signed and unsigned operations [4]. A carry out of the operation is stored in the carry
bit of the exception register. If the record bit (Rc) in the instruction is set to 1, the CR0 field
of the condition register is updated. If the result of the arithmetic operation is zero, the zero
bit is set to 1. For a signed operation, the negative bit is set to 1 when the MSB of the result is
1 [4]. The arithmetic instructions implemented in the design are listed in Appendix [3] and the
instruction format is shown in Appendix [1].
The negation instruction computes the 2's complement of the operand. The source operand is
2's complemented and the result is stored in the destination register [4]. If the record bit in the
instruction is enabled, the condition register is updated.
The multiply instructions multiply two 32-bit operands to produce a 64-bit product. The source
operands for the multiplication can be either register values or an immediate value. In the
Multiply Low-Word and Multiply Low-Word Immediate instructions, the destination register is loaded
with the low 32 bits of the product [4]. In the Multiply High-Word instructions, the destination
register is loaded with the high 32 bits of the product [4]. The exception and condition registers
are updated.
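The split of the 64-bit product into a low word and a high word can be illustrated as follows. This Python sketch models signed operands only; the function names follow the PowerPC mnemonics mullw and mulhw, while the helper to_signed32 is a hypothetical name introduced for the example. Flag updates are not modelled.

```python
MASK32 = 0xFFFFFFFF

def to_signed32(value):
    """Interpret a 32-bit pattern as a two's complement signed integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value

def mullw(a, b):
    """Low 32 bits of the 64-bit signed product."""
    return (to_signed32(a) * to_signed32(b)) & MASK32

def mulhw(a, b):
    """High 32 bits of the 64-bit signed product."""
    return ((to_signed32(a) * to_signed32(b)) >> 32) & MASK32
```

Multiplying 0x10000 by 0x10000 gives the 64-bit product 0x1_0000_0000, so the low word is 0 and the high word is 1.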
The divide instructions perform division. The source and destination operands for the
division must be general purpose registers. The quotient is loaded into the destination register.
In the Divide-Word instructions, the two 32-bit operands are divided and the low 32 bits of the
quotient are loaded into the destination register. In the Divide-Word Unsigned instructions, the
source operands are interpreted as unsigned integers and the destination register is loaded with the
low 32 bits of the quotient. The exception and condition registers are updated [4].
2.6.2 Logical Instructions:

The logical instructions perform logical operations such as OR, AND, NAND, NOR and XOR on
32-bit operands. If an operand is a 16-bit immediate value, it is extended to 32 bits either by adding
sixteen zeros on the right (immediate shifted) or by adding sixteen zeros on the left (unsigned
immediate) [4]. An enabled record bit is indicated by a '.' at the end of the instruction mnemonic. If
the record bit (Rc) in the instruction field is enabled, the result of the logical instruction updates
the condition register [4]. The logical instructions which are implemented in the design are listed in
Appendix [3] and the instruction format is shown in Appendix [1]. The exception register is not
updated by the logical instructions.
2.6.3 Sign-Extension Instructions:

There are two sign-extension instructions supported by the PowerPC: extsh and extsb. The extsh
instruction reads the lower halfword of the source register, extends its sign bit (bit 16, counting
from the MSB as bit 0) to form a 32-bit value, and writes the result to the destination register.
Similarly, extsb reads the lower byte of the source register and sign extends it from bit 24 to 32-bit
data [4]. The sign-extension instructions which are implemented in the design are listed in
Appendix [3] and the instruction format is shown in Appendix [1].
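As an illustration, the two sign-extension operations can be sketched in Python as follows. This is a behavioural model under the assumption that registers are held as unsigned 32-bit integers. It is not taken from the Verilog of the design.

```python
def extsb(rs):
    """Sign extend the low byte of rs to 32 bits."""
    low_byte = rs & 0xFF
    # The sign bit of the byte is replicated into the upper 24 bits.
    return (low_byte | 0xFFFFFF00) if (low_byte & 0x80) else low_byte


def extsh(rs):
    """Sign extend the low halfword of rs to 32 bits."""
    low_half = rs & 0xFFFF
    # The sign bit of the halfword is replicated into the upper 16 bits.
    return (low_half | 0xFFFF0000) if (low_half & 0x8000) else low_half
```

The upper bits of the source register are ignored in both cases. Only the low byte or halfword determines the result.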
2.6.4 Rotate Instructions:

Rotate instructions use the general purpose registers for source and destination. The data is
rotated left, from the LSB towards the MSB. The bits coming out of the MSB are rotated into the LSB of
the data. If the Rc field in the rotate instruction is enabled, the result of the rotate instruction
updates the condition register field CR0. For the rotate instructions, a mask must be generated [4].
All the rotate instructions are implemented in the design and listed in Appendix [3], and the
instruction format is shown in Appendix [1].
2.6.4.1 Mask Generation:
The mask is a 32-bit value. MB and ME are 5-bit fields used to generate the 32-bit mask. If
the value of MB is less than the value of ME, then the bits in the mask from MB to ME are set to 1
and the remaining bits are set to 0. If the value of ME is less than the value of MB, then the bits
in the mask between ME and MB are set to 0 and the remaining bits are set to 1. Figure 3 shows the
mask generation if MB < ME [4],
[Mask diagram, bits 0 to 31: 0s before MB, 1s from MB to ME, 0s after ME up to bit 31]
Figure 3: Mask (MB < ME) sourced from [4]
The figure 4 shows the mask generation if ME < MB [4],

[Mask diagram, bits 0 to 31: 1s from bit 0 to ME, 0s between ME and MB, 1s from MB to bit 31]
Figure 4: Mask (MB > ME) sourced from [4]
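The two cases shown in figures 3 and 4 can be captured in a small behavioural sketch. The Python function below is illustrative only and is not the Verilog used in the design. It assumes the PowerPC bit numbering, in which bit 0 is the MSB and bit 31 is the LSB, and it treats MB equal to ME as a mask with a single 1 bit. The name rotate_mask is chosen here for illustration.

```python
def rotate_mask(mb, me):
    """Return the 32-bit mask selected by the 5-bit fields MB and ME.

    Bit i (0 = MSB, 31 = LSB) has the weight 2**(31 - i).
    """
    if not (0 <= mb <= 31 and 0 <= me <= 31):
        raise ValueError("MB and ME are 5-bit fields")
    ones_from_mb = 0xFFFFFFFF >> mb                        # 1s in bits MB..31
    ones_up_to_me = (0xFFFFFFFF << (31 - me)) & 0xFFFFFFFF  # 1s in bits 0..ME
    if mb <= me:
        # Figure 3 case: 1s only from MB to ME.
        return ones_from_mb & ones_up_to_me
    # Figure 4 case: 0s strictly between ME and MB, 1s elsewhere.
    return ones_from_mb | ones_up_to_me
```

For example, rotate_mask(24, 31) selects the low byte (0x000000FF), while rotate_mask(28, 3) wraps around and gives 0xF000000F.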
2.6.4.2 PowerPC Rotate Instructions:

PowerPC supports three rotate instructions: rlwimi, rlwnm, and rlwinm. The instruction rlwimi
rotates the source register left by the number of bits specified in the 5-bit SH field [4]. The
rotated data is inserted into the destination register in the bit positions where the mask is 1. The
remaining bits of the destination register are unchanged. The instruction rlwnm rotates the source
register rS left by the number of bits specified in the source register field rB[27:31] [4]. The
rotated data is ANDed with the mask and the result is stored in the destination register. The
instruction rlwinm rotates the source register left by the number of bits specified by the 5-bit SH
field [4]. The rotated data is ANDed with the mask and the result is loaded into the destination
register.
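The behaviour of the three rotate instructions can be sketched as below. This is a hedged behavioural model in Python, not the Verilog of the design. The mask is passed in as an already generated 32-bit value, and the helper rotl32 and the argument names are assumptions made for this example.

```python
MASK32 = 0xFFFFFFFF


def rotl32(value, n):
    """Rotate a 32-bit value left by n bits (n taken modulo 32)."""
    n &= 31
    value &= MASK32
    return ((value << n) | (value >> (32 - n))) & MASK32


def rlwinm(rs, sh, mask):
    """Rotate left by the SH field, then AND with the mask."""
    return rotl32(rs, sh) & mask


def rlwnm(rs, rb, mask):
    """Rotate left by the low five bits of rB (rB[27:31]), then AND with the mask."""
    return rotl32(rs, rb & 0x1F) & mask


def rlwimi(rs, ra, sh, mask):
    """Insert the rotated data into ra where the mask is 1; keep ra elsewhere."""
    rotated = rotl32(rs, sh)
    return (rotated & mask) | (ra & ~mask & MASK32)
```

For instance, rlwinm(0x12345678, 8, 0x000000FF) rotates the top byte into the low byte position and returns 0x12.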
2.6.5 Shift Instructions:

Shift instructions are used to shift the contents of the source register either left or right.
They operate on 32-bit operands [4]. All the shift instructions are implemented in the design and
listed in Appendix [3], and the instruction format is shown in Appendix [1].
2.6.5.1 Logical Left Shift Instructions:
Three general purpose registers are used for the logical shift left instructions. The logical
left shift instructions shift the bits of the source register rS from the LSB towards the MSB by the
number of bits specified by the source register field rB[27:31] [4]. The bits shifted out of the MSB
are lost and the vacated bit positions on the right are filled with zeros. The condition register
field CR0 is updated when the Rc bit is set to 1.
2.6.5.2 Logical Right Shift Instructions:
Three general purpose registers are used for the logical shift right instructions. The logical
right shift instructions shift the bits of the source register rS from the MSB towards the LSB by the
number of bits specified by the source register field rB[27:31]. The bits shifted out of the LSB are
lost and the vacated bit positions on the left are filled with zeros [4]. The condition register
field CR0 is updated when the Rc bit is set to 1.
2.6.5.3 Algebraic Shift Instructions:
Two algebraic shift instructions, sraw and srawi, are used in the PowerPC. The instruction
sraw shifts the data in the source register rS right by the number of bits specified by the source
register field rB[27:31]. The MSB of the source register rS is replicated to fill the vacated bit
positions on the left. The bits shifted out of the LSB are lost. The result is stored in the
destination register. The instruction srawi behaves in the same way, except that the number of bits
to shift is specified by the 5-bit SH field [4].
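The difference between the logical and algebraic shifts is easiest to see side by side. The following Python sketch is a behavioural illustration only. It follows the description above in taking the shift amount from the low five bits of rB, and the function names are borrowed from the mnemonics as an assumption of this example.

```python
MASK32 = 0xFFFFFFFF


def slw(rs, rb):
    """Logical shift left: vacated low bits are filled with zeros."""
    return (rs << (rb & 0x1F)) & MASK32


def srw(rs, rb):
    """Logical shift right: vacated high bits are filled with zeros."""
    return (rs & MASK32) >> (rb & 0x1F)


def sraw(rs, rb):
    """Algebraic shift right: the sign bit (MSB) is replicated on the left."""
    rs &= MASK32
    signed = rs - (1 << 32) if rs & 0x80000000 else rs
    # Python's >> on a negative integer is an arithmetic shift.
    return (signed >> (rb & 0x1F)) & MASK32


def srawi(rs, sh):
    """Same as sraw, with the shift amount taken from the 5-bit SH field."""
    return sraw(rs, sh)
```

Shifting 0x80000000 right by 4 gives 0x08000000 with srw but 0xF8000000 with sraw, because the algebraic shift preserves the sign.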
2.7 Pipelining Overview:

Pipelining is an implementation technique in which the execution of more than one instruction
is overlapped [1]. This increases the instruction throughput of the processor. The pipeline described
here uses a Harvard architecture and has five stages: Instruction Fetch (IF), Instruction Decode (ID),
Instruction Execute (EX), Memory (MEM), and Write Back (WB) [1].
Instruction Fetch (IF) stage is used to fetch the instruction to be executed from the instruction
memory.
Instruction Decode (ID) stage is used to read values from the registers. In PowerPC, reading
the register values and decoding occur in the same stage.
Instruction Execute (EX) stage is used to calculate the data memory address if a load/store
instruction is executed. Otherwise, this stage is used to execute the instruction and calculate
the result.
Memory (MEM) stage is used by the load/store instructions, which read data from or write data to
the data memory.
Write Back (WB) stage is used to store the result into the register.
Figure 5 shows the five stages of the pipeline and the instructions that are executed in each clock
cycle.

In figure 5, there are 5 instructions to be executed in sequential order. Instruction 1 is
fetched by the IF stage and is sent to the ID stage. While instruction 1 is decoded, the second
instruction is fetched [1]. When instruction 1 is executed, instruction 2 is decoded and at the
same time instruction 3 is fetched, and so on.
[Pipeline timing diagram: Inst 1 to Inst 5 against the clock (CLK); each instruction passes through IF, ID, EX, MEM and WB and starts one clock cycle after the previous instruction]
Figure 5: 5 Stage Pipeline sourced from [1]
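The overlap shown in figure 5 can be reproduced with a few lines of Python. This is a sketch of an ideal pipeline with no stalls or flushes, under the assumption that one instruction enters the IF stage in every clock cycle. The function names are illustrative.

```python
STAGES = ["IF", "ID", "EX", "MEM", "WB"]


def pipeline_schedule(num_instructions):
    """Return one dict per instruction mapping clock cycle -> stage."""
    return [
        {i + 1 + offset: stage for offset, stage in enumerate(STAGES)}
        for i in range(num_instructions)
    ]


def stages_in_cycle(schedule, cycle):
    """List the stages occupied in a given clock cycle, oldest instruction first."""
    return [stages[cycle] for stages in schedule if cycle in stages]


def total_cycles(num_instructions):
    """Clock cycles needed to complete all instructions in the ideal case."""
    return num_instructions + len(STAGES) - 1 if num_instructions > 0 else 0
```

With five instructions, clock cycle 3 has instruction 1 in EX, instruction 2 in ID and instruction 3 in IF, and all five instructions complete after 9 clock cycles instead of the 25 needed without overlap.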
2.7.1 Pipelining Hazards:

Pipelining hazards occur in the pipeline when the next instruction cannot execute in the
following clock cycle. There are three kinds of hazards: structural hazards, control hazards, and
data hazards [1].
2.7.1.1 Structural Hazards:
The structural hazard is the first hazard. It occurs when the hardware cannot support the
combination of instructions that are to be executed in the same clock cycle [1]. If there are two
memories, one for instructions and another for data, the structural hazard can be avoided [9].

2.7.1.2 Control Hazards:
The control hazard is the second hazard. It arises from the need to make a decision based on
the result of one instruction while others are executing. When executing a branch instruction, the
branch address is calculated in either the second or the third stage [1]. Once the branch is taken,
the instructions in the IF and ID stages should not execute. These two instructions have to be
flushed from the pipeline [9], as shown in figure 6.
[Pipeline diagram: stages IF, ID, EX and MEM; when the branch is taken, the instructions that follow it in the pipeline are flushed]
Figure 6: Control Hazards
2.7.1.3 Data Hazards:

Data hazards occur when an instruction depends on the result of a previous instruction. This
is called data dependency [1]. If the data is not yet available, the wrong data is read and an
incorrect result is produced. There are two kinds of data hazard, which can be avoided using the
data forwarding unit and the load-use unit [9].
2.7.1.3.1 Data Forwarding Hazard:
This hazard can be avoided by the data forwarding unit. The results in each of the EX, MEM and
WB stages are forwarded to the data forwarding unit. For example, consider addi r2, r4, 1010h
followed by xoris r6, r2, 1100h. In the second instruction, the value of r2 depends on the result of
the previous instruction. This value is forwarded to the data forwarding unit [9], so that r2 is
available for the next instruction, as shown in figure 7.
[Pipeline diagram: stages IF, ID, EX, MEM and WB, with the result passed back through the Dataforwarding unit]
Figure 7: Data Forwarding (Data Hazards)
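The selection made by a data forwarding unit can be sketched as a simple comparison of register numbers. The Python function below is an illustrative model, not the Verilog of the design. It assumes that each later stage reports the register it will write (or None) together with its result, and that the youngest matching result takes priority.

```python
def forward_operand(src_reg, regfile_value, stage_results):
    """Choose the value of a source operand.

    stage_results is a list of (write_back_reg, value) pairs ordered from
    the youngest producer (EX) to the oldest (WB). write_back_reg is None
    when the stage does not write a register.
    """
    for write_back_reg, value in stage_results:
        if write_back_reg is not None and write_back_reg == src_reg:
            return value          # forward the in-flight result
    return regfile_value          # no dependency: use the register bank value
```

In the example above, xoris reads r2 while addi is still in the EX stage, so a call such as forward_operand(2, old_r2, [(2, addi_result), (None, 0), (None, 0)]) returns the addi result rather than the stale register value.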
2.7.1.3.2 Load-Use Hazard:
For example, consider lwz r15, 0010h followed by or r10, r15, r12.
The load-use hazard occurs when an instruction depends on the result of the previous load
instruction. To avoid this hazard, the ID stage is stalled for 1 clock cycle. The result of the load
instruction is available in the MEM stage and is forwarded to the data forwarding unit. In the
example shown above, the ID stage for the or instruction is stalled for 1 clock cycle. At the end of
the MEM stage, r15 is forwarded to the ID stage [9], as shown in figure 8.
[Pipeline diagram: stages IF, ID, EX, MEM and WB, with data forwarded from the end of the MEM stage to the beginning of the EX stage]
Figure 8: Load-use
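The condition under which a load-use stall is needed can be written as a single test. The following Python sketch is a hedged illustration. The argument names are assumptions for this example and do not correspond to signals of the design.

```python
def load_use_stall(prev_is_load, prev_write_back_reg, src_regs):
    """Return True when the instruction being decoded must stall for one cycle.

    prev_is_load        -- the immediately preceding instruction is a load
    prev_write_back_reg -- register number written by that load
    src_regs            -- source register numbers of the decoded instruction
    """
    return bool(prev_is_load and prev_write_back_reg in src_regs)
```

For the example above, the or instruction reads r15 and r12 directly after the load into r15, so load_use_stall(True, 15, (15, 12)) is True and one stall cycle is inserted.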
The 32-bit PowerPC processor is designed as a set of modules in the Verilog hardware
description language. An initial datapath is designed and then expanded to suit the PowerPC
instruction set. The main aim of the project is to design a pipelined processor which supports the
PowerPC instruction set. Within the given time limit, the designed processor supports the fixed
point integer instructions, load/store instructions, and branch instructions. The design started
with the implementation and testing of the basic arithmetic and logical instructions. It was then
expanded by adding the load/store instructions, data forwarding and load-use handling.
3.1 Initial Datapath:

The initial datapath is shown in figure 9. The datapath of the design consists of five
pipeline stages: IF, ID, EX, MEM, and WB. The pipeline approach and its stages are explained in
section 2.7. The dashed lines in figure 9 mark the five pipeline stages that are implemented. The
pipeline registers on the dashed lines store the values produced by a stage. The values in these
registers are used as inputs for the next stage. The processor performs its operations on the
positive edge of the clock, and the values in these registers are updated on each positive clock
edge. Two memories are used, one for data and another for instructions.
The multiplexers are used to select the register contents for the different instructions. The
reg0_add and reg1_add fields are used as the sources for the arithmetic and load/store instructions.
The regshift_add and reg1_add fields are used as the sources for the logical, shift and rotate
instructions. The write back register is selected between reg0_add and regshift_add, depending on
the type of instruction. The data coming out of the MEM_Phase is sent to the register bank to be
stored in the registers, which acts as the WB_Phase. The ALU module calculates the memory address
and the result of each instruction. For calculating the memory address, the ALU uses the register
contents as the source. The data from the MEM_Phase can be either the result of the ALU or the data
from the memory. The register bank sends the data to the data memory for store instructions. The
NIP in the datapath is the Next Instruction Pointer, which stores the address of the next
instruction to be executed. The IP in the datapath is the Instruction Pointer, which stores the
address of the instruction currently being executed.
[Block diagram: Instruction Memory, Register Bank, ALU and Data Memory across the IF_Phase, ID_Phase, EX_Phase, MEM_Phase and WB_Phase, separated by the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers, with the IP and NIP registers and the branch target address path]
Figure 9: Initial Datapath
3.2 Instruction Set Design:

The processor design supports fixed point integer instructions, load/store instructions and
branch instructions. Data hazards and control hazards were encountered in the earlier part of the
design and were eliminated by adding the data forwarding unit, load-use unit and branch prediction
unit.
3.2.1 Fixed Point Integer Instructions:

3.2.1.1 Fixed Point Arithmetic Instructions:
Figure 10 shows the design for the fixed point arithmetic instructions. The arithmetic
instructions are fetched from the instruction memory and sent to the control module. The control
module separates the opcode field, register fields, immediate value, record bit, and extended
opcode field of the instruction. The register fields are sent to the register bank module to read
the contents of the registers. The reg0_add and reg1_add fields select the two operands for the
arithmetic operation. The final result of the arithmetic operation is written back to the register
selected by the regshift_add field. The alu_operands module acts as a decoder which provides the
inputs to the ALU module. There are pipeline registers at the end of each phase to store values.
ID_op0 and ID_op1 are the two operands sent to the ALU module as inputs. The ALU module calculates
the arithmetic result and writes it to the out register. All the addition, subtraction,
multiplication, and division instructions are executed in this design.
The wb_reg signal identifies the write back register in which the final result is stored. At
the end of the EX_Phase, the result of the arithmetic operation is stored in the EX_out register.
The condition register, CR, is updated if the Rc bit in the instruction is set to 1. The value in
EX_out is passed to the MEM_Phase and stored in the MEM_wb_data register. At the end of the
MEM_Phase, the data in MEM_wb_data is written back to the register selected by MEM_wb_reg. The simm
field is the 16-bit immediate field sign extended to a 32-bit immediate value. The imm_is (immediate
shifted) field is the 16-bit immediate field extended to 32 bits by concatenating sixteen 0 bits to
the right of the 16-bit immediate data.
[Block diagram: Control, Reg_bank, Alu_operands, ALU and CR (CR0 field) blocks across the IF_Phase, ID_Phase, EX_Phase and MEM_Phase, separated by the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers; operands simm, Imm_is, Value_op0 and Value_op1; write back through Wb_reg and Wb_data]
Figure 10: Fixed Point Arithmetic Instruction

3.2.1.2 Fixed Point Logical Instructions:
Figure 11 shows the processor design which supports the fixed point logical instructions.
The IF_Phase module fetches the instruction from the instruction memory. The 32-bit instruction is
sent to the control module. The control module separates the opcode field, register fields,
immediate value, record bit, and extended opcode field. The register fields reg1_add and
regshift_add are passed to the register bank module. The data in the registers is read and sent to
the alu_operands module. The uimm signal is the 32-bit data formed by concatenating sixteen 0 bits
to the left of the immediate data in the instruction field. The imm_is signal is the immediate
shifted value, which is formed by concatenating sixteen 0 bits to the right of the 16-bit immediate
data. The immediate data and the data from the registers are passed to the alu_operands module. The
alu_operands module selects the operands based on the opcode field. This processor design executes
the logical OR, AND, NAND, XOR, and NEG instructions.
The reg0_add field selects the write back register which stores the result of the logical
operation. The wb_reg signal identifies the write back register and is the same as the register
field reg0_add. At the beginning of the EX_Phase, the input operands are sent to the ALU. The ALU
performs the logical operation between the two operands. The result is stored in the out register.
The condition register, CR, is updated when the Rc bit in the instruction is set. At the end of the
EX_Phase, the result of the logical operation is stored in the EX_out register. This data is stored
in the MEM_wb_data register at the end of the MEM_Phase. The data in the MEM_wb_data register of
the MEM_Phase is then written back and stored in the register.
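The three ways in which the 16-bit immediate field is extended to 32 bits (simm, imm_is and uimm) can be summarised in a short Python sketch. The function names reuse the signal names from the text for readability. The code is a behavioural illustration and not the Verilog of the control module.

```python
def simm(imm16):
    """Sign extend the 16-bit immediate to 32 bits."""
    imm16 &= 0xFFFF
    return (imm16 | 0xFFFF0000) if (imm16 & 0x8000) else imm16


def imm_is(imm16):
    """Immediate shifted: sixteen 0 bits concatenated on the right."""
    return (imm16 & 0xFFFF) << 16


def uimm(imm16):
    """Unsigned immediate: sixteen 0 bits concatenated on the left."""
    return imm16 & 0xFFFF
```

For the immediate 8000h, simm gives FFFF8000h, imm_is gives 80000000h and uimm gives 00008000h.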
[Block diagram: Control, Reg_bank, Alu_operands, ALU and CR (CR0 field) blocks across the IF_Phase, ID_Phase, EX_Phase and MEM_Phase, separated by the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers; operands uimm, Imm_is, Value_rs and Value_op1; write back through Wb_reg and Wb_data]
Figure 11: Fixed Point Logical Instruction

3.2.1.3 Fixed Point Shift Instructions:
Figure 12 shows the processor design which supports the fixed point shift instructions.
The instruction is fetched from the instruction memory in the IF_Phase module and stored in the
IF_Phase pipeline register. In the next clock cycle, the instruction is sent to the ID_Phase. The
control module separates the register fields, Sh field, extended opcode field, and record bit (Rc).
The register fields are sent to the register bank module to get the 32-bit data from the registers.
The immediate data field, Sh, is separated by the control module and passed to the alu_operands
module. The register data and Sh are decoded by the alu_operands module. The reg0_add field selects
the write back register. At the end of the ID_Phase, op0 and op1 are stored in the pipeline
registers. The ALU module shifts op0 by the number of bits specified by op1[27:31]. The shifted data
is stored in the out register. At the end of the EX_Phase, the result is stored in the EX_out
register. This data is passed to the MEM_Phase and stored in MEM_wb_data. In the next clock cycle,
the data is written back and stored in the register. The condition register, CR, is updated when the
Rc bit in the instruction is set. The immediate data in the Sh field represents the number of bits
by which the data is shifted.
[Block diagram: Control, Reg_bank, Alu_operands, ALU and CR (CR0 field) blocks across the IF_Phase, ID_Phase, EX_Phase and MEM_Phase, separated by the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers; inputs Sh, Value_rs and Value_op1; write back through Wb_reg and Wb_data]
Figure 12: Fixed Point Shift Instruction

3.2.1.4 Fixed Point Rotate Instructions:
Figure 13 shows the processor design that supports the fixed point rotate instructions. The
instruction is fetched from the instruction memory and sent to the control module in the ID_Phase.
The control module separates the register fields, 5-bit ME field, 5-bit MB field, extended opcode
field, Sh field, and record bit (Rc). The register fields reg1_add and regshift_add are sent to the
register bank to get the 32-bit data from the registers. The register field reg0_add selects the
write back register in which the final result of the rotate instruction is stored. The alu_operands
module is used as a multiplexer that selects either the Sh field or the register data, value_op1,
depending on the rotate instruction. The wb_reg signal identifies the write back register and is
the same as reg0_add. The values value_op0, value_op1, and value_rs are taken from the register
bank. At the end of the ID_Phase, op0, op1, and wb_reg are stored in the pipeline registers.
The mask is generated from the signals ID_ME and ID_MB at the beginning of the EX_Phase. If
ID_MB is less than ID_ME, the bits in the mask between MB and ME are filled with 1 bits and the
remaining bits in the mask are filled with 0 bits. Similarly, if ID_MB is greater than ID_ME, the
bits in the mask between MB and ME are filled with 0 bits and the remaining bits in the mask are
filled with 1 bits. In this way the 32-bit mask is generated for the rotate instructions.
The inputs to the rotate module are the two 32-bit input data and the Sh field. The Sh field
represents the number of bits to be rotated. ID_op0 is rotated by the number of bits specified by
either the Sh field or ID_op1[27:31]. Depending on the rotate instruction, either the rotated data
is ANDed with the mask, or the bits of the rotated data are inserted into the write back register
data where the mask bits are set to 1.
The rotated data is stored in the out register of the EX_Phase module. At the end of the
EX_Phase, the result of the rotate instruction is stored in the EX_out register. This data is moved
to the MEM_Phase and written to the MEM_wb_data register. In the WB_Phase, the data in the
MEM_wb_data register is written back to the register module and stored in the register. If the Rc
bit in the instruction is enabled, the condition register (CR) is updated with the result of the
rotate instruction.
[Block diagram: Control, Reg_bank, Alu_operands, Mask Generation, Rotate and CR (CR0 field) blocks across the IF_Phase, ID_Phase, EX_Phase and MEM_Phase, separated by the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers; inputs Sh, MB, ME, Value_op0, Value_op1 and Value_rs; write back through Wb_reg and Wb_data]
Figure 13: Fixed Point Rotate Instruction

3.2.1.5 Fixed Point Compare Instructions:
Figure 14 shows the processor design for the fixed point compare instructions. The
instruction is fetched from the instruction memory and stored in the pipeline register in the
IF_Phase. In the next clock cycle, the instruction is sent to the control module in the ID_Phase.
The control module separates the opcode field, register fields, extended opcode field, immediate
value, and crfD field. The crfD field in the instruction selects the condition register field (CR0
to CR7) that is written. The instruction format is shown in Appendix [1]. The 9th and 10th bits of
the instruction must be 0, otherwise the instruction is invalid. The register fields reg0_add and
reg1_add are sent to the register bank module. The values value_op0 and value_op1 are read from
these registers and passed to the alu_operands module. The 16-bit immediate field is sign extended
to 32 bits and is also sent to the alu_operands module. The alu_operands module selects the operands
based on the compare instruction.
In the EX_Phase, ID_op0 and ID_op1 are subtracted in the alu module. If the result of the
subtraction is zero, the EQ bit is set to 1, otherwise it is set to 0. If ID_op0 is less than
ID_op1, lt is set to 1; if it is greater, gt is set to 1. If an overflow occurs, the summary
overflow (SO) bit is set to 1. These bits are stored in the pipeline registers of the EX_Phase. In
the next clock cycle, the condition register is updated. The CRn field selected in the ID_Phase is
modified by EX_lt, EX_gt, EX_EQ, and EX_SO, and is written to the condition register.
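The way the compare result is packed into one 4-bit condition register field can be sketched in Python as follows. The sketch compares the operands as signed integers directly instead of subtracting them, and it assumes that each CR field holds the bits LT, GT, EQ and SO in that order, with CR0 in the most significant four bits of the 32-bit condition register. The function names are illustrative.

```python
MASK32 = 0xFFFFFFFF


def to_signed32(value):
    """Interpret a 32-bit pattern as a signed integer (illustrative helper)."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def compare_signed(op0, op1, so=0):
    """Return the 4-bit field LT, GT, EQ, SO for a signed comparison."""
    a, b = to_signed32(op0), to_signed32(op1)
    lt, gt, eq = int(a < b), int(a > b), int(a == b)
    return (lt << 3) | (gt << 2) | (eq << 1) | (so & 1)


def update_cr(cr, crfd, field):
    """Write a 4-bit field into CR field number crfd (0 = CR0 ... 7 = CR7)."""
    shift = 4 * (7 - crfd)
    return (cr & ~(0xF << shift) & MASK32) | ((field & 0xF) << shift)
```

Comparing FFFFFFFFh with 1 as signed values sets only LT, since the first operand is -1.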
[Block diagram: Control, Reg_bank, Alu_operands, ALU and CR (CRn field) blocks across the IF_Phase, ID_Phase and EX_Phase, separated by the IF/ID and ID/EX pipeline registers; inputs simm, CrfD, Value_op0 and Value_op1; outputs lt, gt, EQ and SO]
Figure 14: Fixed Point Compare Instruction
3.2.2 Load/Store Instructions:

The processor design supports PowerPC integer load and store instructions. The integer load
and store instructions support different data types such as byte, halfword, and word. Only load and
store instructions access the data memory.
3.2.2.1 Load Instructions:
Figure 15 shows the processor design which supports the load instructions. There are two
addressing modes for calculating the memory address. The load instruction uses three register
fields, of which two are used for calculating the memory address and the other is used as the write
back register. The instruction is fetched from the instruction memory and stored in the pipeline
register. In the ID_Phase, the instruction is passed to the control module, which separates the
opcode field, register fields, displacement, and extended opcode field. The regshift_add field
selects the write back register in which the data from the memory is stored. The register fields
reg0_add and reg1_add are sent to the register bank module, which reads the data in the registers.
The values value_op0 and value_op1 are the base address and the index address. Depending on the
instruction, either value_op1 or the displacement is selected. The two operands are stored in the
pipeline registers at the end of the ID_Phase.
In the EX_Phase, ID_op0 (base address) and ID_op1 (displacement or index address) are
added to get the effective address of the memory. If the update signal is enabled, then the base
register is updated with the effective address. For the load instructions, the ld signal is set to
1. At the end of the EX_Phase, the effective address is stored in the pipeline register. The
effective address is passed to the memory in the MEM_Phase. The data at that address is fetched and
stored in the MEM_wb_data register. In the WB_Phase, the data in the MEM_wb_data register is written
back and stored in the register in the register bank module. If the instruction data type is byte,
8 bits are fetched from the memory. If the instruction data type is halfword, 16 bits are fetched
from the memory. If the instruction data type is word, 32 bits are fetched from the memory.
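The address calculation and the size-dependent fetch can be illustrated with a small Python model. This sketch makes several assumptions that are not stated in the text: the data memory is modelled as a byte array, the byte order is big-endian, and the fetched byte or halfword is zero extended. The function names are chosen for this example.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base, offset):
    """Add the base address to the displacement or index, modulo 2**32."""
    return (base + offset) & MASK32


def load_data(memory, ea, size):
    """Fetch 1, 2 or 4 bytes at address ea and zero extend to 32 bits.

    memory is a bytes-like object (assumed model of the data memory).
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (byte), 2 (halfword) or 4 (word)")
    return int.from_bytes(memory[ea:ea + size], "big")
```

For a load with update, the value returned by effective_address would also be written back to the base register.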
[Block diagram: Control, Reg_bank, Alu_operands, ALU and Data Memory blocks across the IF_Phase, ID_Phase, EX_Phase and MEM_Phase, separated by the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers; signals Disp, ld, Update, Upd_add, EA, EX_EA and MEM_wb_data; write back through Wb_reg and Wb_data]
Figure 15: Load Instruction

3.2.2.2 Store Instructions:
Figure 16 shows the processor design which supports the PowerPC store instructions. The
memory address is calculated by adding either the base address and the displacement or the base
address and the index register. The instruction is fetched from the instruction memory in the
IF_Phase. In the ID_Phase, the instruction is sent to the control module. The control module
separates the opcode field, extended opcode field, register fields, and displacement. The three
register fields are passed to the register bank module to get the data value_op0, value_op1, and
mem_data. The mem_data signal is the 32-bit data to be written to the data memory. The values
value_op0 and value_op1 are 32-bit data used to calculate the memory address. Depending on the type
of the instruction, either the displacement or value_op1 is selected. The op0 and op1 values are
stored in the pipeline registers at the end of the ID_Phase.
If the update signal is set to 1, the base register is updated with the memory address for the
store instruction with update. The st signal shown in figure 16 is set to 1 if a store instruction
is being executed. The EX_EA register stores the address in the data memory. In the MEM_Phase,
mem_data is written to the data memory at the specified address. If the instruction data type is
byte, 8 bits are written to the memory. If the instruction data type is halfword, 16 bits are
written to the memory. If the instruction data type is word, 32 bits are written to the memory.
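The size-dependent write can be sketched in the same style. The Python function below assumes a bytearray as the model of the data memory, big-endian byte order, and that byte and halfword stores write the low-order bits of the register data. These are assumptions of this example rather than details taken from the design.

```python
def store_data(memory, ea, data, size):
    """Write the low 1, 2 or 4 bytes of data to memory at address ea.

    memory is a bytearray (assumed model of the data memory).
    """
    if size not in (1, 2, 4):
        raise ValueError("size must be 1 (byte), 2 (halfword) or 4 (word)")
    low_bits = data & ((1 << (8 * size)) - 1)
    memory[ea:ea + size] = low_bits.to_bytes(size, "big")
```

A byte store of AABBCCDDh therefore writes only DDh, and a halfword store writes CCh followed by DDh.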
[Block diagram: Control, Reg_bank, Alu_operands, ALU and Data Memory blocks across the IF_Phase, ID_Phase, EX_Phase and MEM_Phase, separated by the IF/ID, ID/EX and EX/MEM pipeline registers; signals Disp, st, Update, Upd_add, EA, EX_EA and Mem_data]
Figure 16: Store Instruction
3.2.3 Branch Instructions:

Figure 17 shows the processor design for the branch instructions. The instruction is
fetched from the instruction memory and stored in the pipeline registers in the IF_Phase module. At
the beginning of the ID_Phase, the control module determines whether the instruction is a branch or
not. There are two branch instructions, conditional branch (bc) and unconditional branch (b). If the
instruction is an unconditional branch, the control module separates the opcode field, AA bit, LK
bit, and LI field. The LI field is sign extended to 32 bits. If the instruction is a conditional
branch, the control module separates the opcode field, BO field, BI field, AA bit, LK bit, and BD
field. The BD field is sign extended to 32 bits. The LI and BD fields indicate the branch target
address. The 5-bit BO field indicates the condition to be tested by the branch conditional
instruction and the 5-bit BI field indicates the bit in the condition register to be tested.
The branch instruction tests the condition at the beginning of the EX_Phase. The branch
target address is also calculated in this phase. The branch_addr signal in figure 17 indicates the
branch target address. If the AA signal is 0, the branch address is formed by adding BD or LI to
the current IP. If the condition is true, branch_addr is moved to the IP in the register bank
module, so the program counter value is changed. However, the instructions in the IF_Phase and
EX_Phase need to be removed from the pipeline. Once the condition is satisfied, the branch_cond
signal is set to 1. This signal deletes the instructions in the IF_Phase and EX_Phase modules.
If the condition is not satisfied, the instructions in the pipeline are executed in sequential
order and the program counter is not modified. If the instruction is an unconditional branch, the
program counter is modified with branch_addr. If the LK signal is enabled, the branch_addr is
stored in the link register.
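The formation of the branch target address and the choice of the next instruction pointer can be sketched as follows. This Python model is illustrative. It assumes that the offset (LI or BD) has already been sign extended to 32 bits, that AA = 1 selects the offset itself as an absolute address, and that sequential instructions are 4 bytes apart.

```python
MASK32 = 0xFFFFFFFF


def branch_target(ip, offset, aa):
    """Return the branch target address.

    ip     -- address of the branch instruction (current IP)
    offset -- LI or BD, already sign extended to 32 bits
    aa     -- absolute address bit from the instruction
    """
    if aa:
        return offset & MASK32          # absolute: the offset is the address
    return (ip + offset) & MASK32       # relative: added to the current IP


def next_ip(ip, branch_taken, target):
    """Select the next instruction pointer value."""
    return target if branch_taken else (ip + 4) & MASK32
```

A negative offset such as FFFFFFF8h with AA = 0 gives a backward branch of 8 bytes, because the addition wraps modulo 2^32.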
[Block diagram: Instruction Memory, control, Branch_check, CR and Link register blocks with the IP and NIP registers, separated by the IF/ID, ID/EX and EX/MEM pipeline registers; signals PC_addr, ID_PC_addr, LI, BD, AA, LK, brtgt_addr, Branch_cond and Branch_addr]
Figure 17: Branch Instructions
3.2.4 Data forwarding and Load-Use:

The data forwarding and load-use modules are used to avoid data hazards. Figure 18 shows
the processor design that avoids data hazards. Both modules are implemented in the ID_Phase
module. The output of the alu module is the out signal, which is fed back as an input to the data
forwarding module. The outputs of the EX_Phase module and the MEM_Phase module are also fed
back as inputs to the data forwarding module. The data forwarding module compares each source
register field with the write back register field of these phases. If the register fields are the same,
the data from that phase is forwarded as the operand value. The instruction must be a fixed point
integer instruction. The data forwarding module forwards only when the register field is the same as
the write back register field, and this check is made for all the phases.
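The comparison performed by the data forwarding module can be illustrated with a short Python model. It is a sketch under stated assumptions, not the Verilog module itself: the function name is hypothetical, and giving priority to the most recently produced value is an assumption that the report does not state.

```python
def forward_operand(src_reg, file_value, stages):
    """Select the value for one source operand (hypothetical helper).

    src_reg    -- register number read by the instruction in ID_Phase
    file_value -- value read from the register bank
    stages     -- (wb_reg, data) pairs, newest result first, for example
                  [(ID_Wb_reg, out), (EX_Wb_reg, EX_out),
                   (MEM_Wb_reg, MEM_wb_data)]
    """
    for wb_reg, data in stages:
        # wb_reg is None when that phase holds no register-writing instruction.
        if wb_reg is not None and wb_reg == src_reg:
            return data          # forward the in-flight result
    return file_value            # no match: use the register bank value

# r8 is being produced in the EX_Phase, so the stale bank value is bypassed.
stages = [(None, 0), (8, 0x00000080), (3, 0x12345678)]
print(hex(forward_operand(8, 0x0, stages)))  # 0x80
print(hex(forward_operand(5, 0x7, stages)))  # 0x7
```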
If the instruction following a load instruction is a fixed point integer instruction whose
source register is the same as the write back register of the load, the data is not yet available for
that instruction. The data becomes available only at the end of the MEM_Phase, because the load
instruction fetches the data for its write back register from the data memory in the MEM_Phase. By
then the next instruction has reached the EX_Phase and would fetch the wrong register value instead
of the correct one. To get the correct value, the instruction has to wait until the data is available, so
the signals in the EX_Phase are held until then. The IF_Phase and ID_Phase in the pipeline
have to stall for 1 clock cycle, after which the data is available.
The stall_PC signal in figure 18 is used to stall the pipeline. If stall_PC is enabled, the
pipeline is stalled. The op0_sel, op1_sel, and rs_sel signals are the control signals that select the data
from the MEM_Phase. The load-use module stalls the pipeline by enabling the stall_PC signal. The
instruction pointer is also stalled for 1 clock cycle: while the pipeline is stalled, the current
instruction pointer value is held until stall_PC is cleared to 0. For example, if the data for op1 is not
available, op1 has to wait until the data is available, and the other signals in the EX_Phase have to
wait as well. Once the data is available, the op1_sel signal selects the data from the memory and the
EX_Phase executes the instruction.
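The load-use check and the held instruction pointer can be modelled in a few lines of Python. This is only a behavioural sketch: the function names and arguments are hypothetical and are not taken from the Verilog source.

```python
def load_use_stall(prev_is_load, prev_wb_reg, src_regs):
    """Return 1 (stall_PC asserted) when the instruction in ID_Phase reads the
    register that the load just ahead of it has not yet fetched from memory."""
    return int(prev_is_load and prev_wb_reg in src_regs)

def next_ip(ip, stall_pc):
    """Hold the instruction pointer while stalled, otherwise advance by 4."""
    return ip if stall_pc else (ip + 4) & 0xFFFFFFFF

# A load into r10 followed by an instruction that reads r10 and r4.
stall = load_use_stall(True, 10, (10, 4))
print(stall, hex(next_ip(0x20, stall)))  # 1 0x20
# One cycle later the hazard has cleared and the IP advances again.
print(hex(next_ip(0x20, 0)))             # 0x24
```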
[Block diagram: IF_Phase, ID_Phase, EX_Phase and MEM_Phase with the IF/ID, ID/EX, EX/MEM and MEM/WB pipeline registers, instruction memory, control module, register bank, Data Forward and Load-Use modules, alu and data memory. Signals shown: IP, NIP, Stall_PC, Op0_sel, Op1_sel, rs_sel, out, EX_out, Mem_data, MEM_wb_data, Wb_data, Wb_reg, ID_Wb_reg, EX_Wb_reg, MEM_Wb_reg, Alu_op0, Alu_op1, Value_rs.]
Figure 18: Data Forwarding and Load Use
3.2.5 System Call Instruction (sc):

Due to the time limit, the system call (sc) instruction is not implemented in the design. The
special purpose registers MSR, EVPR, SRR0, and SRR1 are likewise not implemented. This
instruction is used to raise the system call exception. When the system call exception occurs, the data
in the machine state register (MSR) is moved to SRR1 (save/restore register 1). SRR0
(save/restore register 0) is used to store the address of the instruction that follows the system call
instruction. The system call instruction also modifies bits in the MSR.
The exception vector address (EVA) is moved to the next instruction pointer (NIP), which
changes the program flow sequence. The EVA is formed with the high halfword of the exception
vector prefix register (EVPR) as its left (most significant) half. The MSR contents are modified when
the instructions are fetched from the NIP.
This chapter covers how the PowerPC processor instructions are created, how they are
simulated, and the final result for each part of the design. The NC Verilog simulator is used to
simulate the instructions and the output signal waveforms are verified. The design is not synthesized.
4.1 Creating Instructions:

The instructions are written in hex code format and stored in the instruction memory
module of the IF_Phase module. The hex coded instructions are fetched from the instruction memory
and passed to the control module. For example, the hex code for the instruction andis. r9,r12,F0F0h
is 7589F0F0h. Similarly, all the instructions are converted to hex code based on the instruction
formats shown in the Appendix [5]. Due to the time limit, floating point instructions, exceptions,
interrupts, memory management instructions, and processor control instructions are not implemented.
The instructions which are implemented are shown in the Appendix [3]. The instructions which are
not implemented in the design are shown in the Appendix [4].
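As an illustration of the conversion to hex code, the following Python sketch packs the fields of a D-form instruction into a 32-bit word. The helper name is hypothetical; the primary opcodes (29 for andis., 14 for addi, 24 for ori) come from the PowerPC architecture, and the three results agree with the hex codes quoted in this chapter.

```python
def encode_d_form(opcode, rt, ra, imm16):
    """Pack a D-form word: opcode(6) | RT or RS(5) | RA(5) | immediate(16)."""
    assert 0 <= opcode < 64 and 0 <= rt < 32 and 0 <= ra < 32
    return (opcode << 26) | (rt << 21) | (ra << 16) | (imm16 & 0xFFFF)

# Primary opcodes from the PowerPC architecture.
ANDIS_DOT, ADDI, ORI = 29, 14, 24

# For andis. and ori the assembler order is (rA, rS, UIMM), so the source
# register rS goes into the first register field and rA into the second.
print(f"{encode_d_form(ANDIS_DOT, 12, 9, 0xF0F0):08X}")  # 7589F0F0
print(f"{encode_d_form(ADDI, 8, 2, 0x0080):08X}")        # 39020080
print(f"{encode_d_form(ORI, 19, 11, 0x00FF):08X}")       # 626B00FF
```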
4.2 Fixed Point Integer Instructions:

In this section, the fixed point instructions are tested and the final waveforms are discussed.
The integer instructions cover arithmetic, logical, compare, rotate, and shift instructions. All the
instructions are run in the NC Verilog simulator. The command used for the simulation is,
ncv_gui PPC_processor_stim.v PPC_processor.v
The Design Browser window will open. Select the signals that are affected by the instruction and view
them in the waveform window.
The PPC_processor is the top level module. It connects the pipeline modules IF_Phase,
ID_Phase, EX_Phase, and MEM_Phase; the fifth phase, write back, is implemented in the
ID_Phase. Each phase takes 1 clock cycle for its execution because it is a sequential block.
Therefore, each instruction takes five clock cycles in total for its complete execution. Figure 19
shows the design browser window. The PPC_processor_stim is the test bench module. PowerPC is
the instance name of the top level module, and IF, ID, EX, and MEM are the sub modules of the top
level module.
Figure 19: Design Browser Window
4.2.1 Fixed Point Arithmetic Instructions:

The design of the fixed point instructions is simulated in the NC Verilog simulator. Arithmetic,
logical, shift, rotate, and compare instructions are simulated. Figure 20 shows the simulated
waveform of the addi instruction. The inst signal in figure 20 shows the 32-bit instruction
fetched from the instruction memory. ID_op0 and ID_op1 are the source operand registers, which
store the values of the two operands. The out register is used to store the result of the arithmetic
instruction. This output is stored in the EX_out register at the end of the EX_Phase module. Since the
arithmetic instructions do not access the memory, the data is simply held in the MEM_Phase for 1
clock cycle. The MEM_wb_data register stores the data coming from the EX_out register in the
MEM_Phase module. The data is then moved back to the register bank module to store the result in
the destination register.
For example, consider the addition instruction addi r8,r2,0080h. The hex code of this
instruction is 39020080h. In figure 20, TimeA shows the instruction being fetched. In the next clock
cycle, ID_op0 and ID_op1 receive the input operands of the instruction. The immediate value in the
instruction, 0080h, is separated by the control module. The value in the register r2 is 0 and is moved
to ID_op0. The immediate value is moved to ID_op1. This is shown in figure 20. In the
EX_Phase module, i.e. the 3rd clock cycle, ID_op0 and ID_op1 are added and the result is stored in
the EX_out register. In the 4th clock cycle, the result of the addition is moved to the MEM_wb_data
register and the write back register number is moved to the MEM_wb_reg register. Finally, the data
in MEM_wb_data is stored in the register r8. This instruction takes 5 clock cycles for its complete
execution. The CR and XER are the registers in which the carry and overflow bits are updated.
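The field separation done by the control module and the addition done in the EX_Phase can be checked with a small Python model of this example. The function names are hypothetical; reading rA = 0 as the literal value 0 follows the PowerPC definition of addi and is an assumption about the design, since the report does not discuss that case.

```python
def decode_d_form(inst):
    """Split a 32-bit D-form instruction word into its fields."""
    return {
        "opcode": (inst >> 26) & 0x3F,
        "rd": (inst >> 21) & 0x1F,
        "ra": (inst >> 16) & 0x1F,
        "imm": inst & 0xFFFF,
    }

def addi(regs, inst):
    """Execute addi on a list of 32 register values and return the result."""
    f = decode_d_form(inst)
    # Sign-extend the 16-bit immediate.
    imm = f["imm"] - 0x10000 if f["imm"] & 0x8000 else f["imm"]
    # Assumption from the architecture: rA = 0 means the literal value 0.
    op0 = regs[f["ra"]] if f["ra"] else 0
    regs[f["rd"]] = (op0 + imm) & 0xFFFFFFFF
    return regs[f["rd"]]

regs = [0] * 32
print(decode_d_form(0x39020080))    # {'opcode': 14, 'rd': 8, 'ra': 2, 'imm': 128}
print(hex(addi(regs, 0x39020080)))  # 0x80
```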
Figure 20: addi Instruction
4.2.2 Fixed Point Logical Instructions:

Figure 21 shows the final result of the simulated logical ori instruction. The inst signal is the
32-bit instruction held in the pipeline register of the IF_Phase module. ID_op0 and ID_op1 are
used to store the two operands for the alu module. The output of the logical instruction is stored in the
EX_out register at the end of the EX_Phase module. The data is moved to the next stage, the
MEM_Phase module, and stored in the MEM_wb_data register. Finally, the data is moved to the write
back register in the register bank module in the ID_Phase module.
Figure 21 shows the signal waveform of the logical ori instruction. The hex coded value
of the ori r11,r19,00FFh instruction is 626B00FFh. The timeA marker in figure 21 shows the
instruction being fetched from the memory. The unsigned immediate data is sent as the second
operand and stored in ID_op1. The data in the source register, r19, is sent to the ID_op0 register in the
ID_Phase module. In the EX_Phase module, the two operands in ID_op0 and ID_op1 are sent to the
alu module. The alu module performs the logical OR and stores the result in the EX_out register. This
is shown in figure 21. The result of the ori instruction is sent to the write back register, r11, through
the MEM_Phase module. The instruction takes 5 clock cycles for its complete execution.
Figure 21: ORI Instruction
4.2.3 Fixed Point Shift Instructions:

The fixed point shift instructions are tested by simulating the shift instructions in the
NC Verilog simulator. The fixed point shift instructions are fetched from the instruction memory in
the IF_Phase module. ID_op0 and ID_op1 are the two pipeline registers used to store the two
operand values. These data are passed to the alu module. The operand in ID_op0 is shifted right by
the number of bits specified by ID_sh. The shifted data is stored in the EX_out register at the end of
the EX_Phase module. This shifted result is moved to the next phase, the MEM_Phase, and stored in
the MEM_wb_data register. MEM_wb_reg identifies the write back register where the result of the
shift instruction is stored.
The shift right algebraic word immediate instruction, srawi r17,r8,7 (sh = 00111), is simulated in the
NC Verilog simulator and the necessary signals are shown in the waveform in figure 22. The hex
code for this instruction is 7D113E70h. The timeA marker in figure 22 shows the instruction being
fetched from the memory.
The data, 00000080h, in the register r8 is stored in the ID_op0 register. The value of sh, 07h,
is stored in the ID_sh register. ID_op0 is shifted by the number of bits specified by ID_sh.
The shifted data, 00000001h, is stored in the EX_out register at the end of the EX_Phase module. This
data is moved to the MEM_Phase, stored in the MEM_wb_data register, and written back to the
register r17.
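The arithmetic right shift used in this example can be reproduced with a short Python sketch. The function name is hypothetical and the carry bit that the shift right algebraic instructions also produce is not modelled here.

```python
def shift_right_algebraic(value, sh):
    """Arithmetic right shift of a 32-bit word by sh bits (0 <= sh <= 31)."""
    # Reinterpret the 32-bit pattern as a signed integer so that Python's >>
    # replicates the sign bit, then wrap the result back to 32 bits.
    signed = value - (1 << 32) if value & 0x80000000 else value
    return (signed >> sh) & 0xFFFFFFFF

print(f"{shift_right_algebraic(0x00000080, 7):08X}")  # 00000001
print(f"{shift_right_algebraic(0x80000000, 4):08X}")  # F8000000
```

The first line matches the value 00000001h observed in EX_out; the second shows the sign bit being replicated for a negative operand.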
Figure 22: Shift Instruction (sraw)
4.2.4 Fixed Point Rotate Instructions:

The operation of the rotate instructions is described in section 2.6.4. For the rotate
instructions, a mask has to be generated. The mask is generated at the beginning of the EX_Phase
module. The rotate module in the EX_Phase module is used for executing the rotate instructions. The
inputs to this module are the two operands from ID_op0 and ID_op1. The number of bits to be rotated
is specified either in ID_op1 or in ID_sh. Figure 23 shows the result of the rlwimi instruction.
The instruction is converted to hex code. The hex code of the rlwimi r3,r10,sh,mb,me
instruction is 51434195h. The timeA marker in figure 23 shows the instruction being fetched from the
memory. The value of r10 is 868EFF7Fh and the values of ID_MB, ID_ME, and ID_sh are 06h,
0Ah, and 08h. The value of the destination register, r3, is 00000080h and is moved to ID_op1. The
data in ID_op1 is rotated by the number of bits specified by ID_sh (08h). The generation of the
mask is explained in section 2.6.4.1. The generated value of the mask is 03C00000h. The rotate
module in the EX_Phase rotates ID_op1. ID_op0 is moved to the out register. The rotated
data is inserted into the out register wherever the corresponding bit in the mask register is 1. If a
bit in the mask is 0, the corresponding bit of the rotated data is not inserted. Figure 23 shows that
wherever the bits in the mask are 1, the roted_out data is inserted in the out register. The
data in EX_out is moved to the next phase, the MEM_Phase. Finally, the data in the MEM_wb_data
register is written back to the destination register, r3.
Figure 23: rlwimi Instruction
4.2.5 Fixed Point Compare Instructions:

There are four compare instructions that are supported by the PowerPC processor. They are the
cmp, cmpi, cmpl, and cmpli instructions. Figure 24 shows the simulation of cmpli crf5,r8,0080h.
The hex code of this instruction is 2A880080h. The timeA marker in figure 24 shows the instruction
being fetched from the memory. This instruction is fetched from the instruction memory in the
IF_Phase module. The control module separates the crfD field, the register field, r8, and the signed
immediate value, 0080h. The data in the register r8 is stored in the pipeline register ID_op0. The
16-bit immediate value is extended to 32 bits by sign extending to the left. This signed immediate is
stored in the pipeline register ID_op1.
The data in these registers are given as the inputs to the alu module. The two values are
compared and the result updates the lt, gt, and eq signals. In figure 24, the two values are equal, so
the eq signal is set to 1. The other signals, lt, so, and gt, are set to 0. At the end of the
EX_Phase, the signals are stored in the pipeline registers EX_lt, EX_gt, EX_so, and EX_eq.
These signals are moved to and stored in the 4-bit CR5 field. The condition register, CR, is
updated with the CR5 field. The condition register is updated at the end of the EX_Phase.
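The way the compare result ends up in one 4-bit field of the condition register can be sketched in Python as follows. The function names are hypothetical; the bit order (lt, gt, eq, so) and the numbering of CR field 0 as the leftmost nibble follow the PowerPC architecture, and the so bit is simply passed in as a parameter.

```python
def compare_logical(a, b, so=0):
    """Return the 4-bit field (lt, gt, eq, so) of an unsigned 32-bit compare."""
    lt, gt, eq = int(a < b), int(a > b), int(a == b)
    return (lt << 3) | (gt << 2) | (eq << 1) | so

def update_cr(cr, crf, field):
    """Write a 4-bit field into CR field crf (field 0 is the leftmost nibble)."""
    shift = 4 * (7 - crf)
    return (cr & ~(0xF << shift) & 0xFFFFFFFF) | (field << shift)

field = compare_logical(0x00000080, 0x00000080)
print(f"{field:04b}")                    # 0010  (only eq is set)
print(f"{update_cr(0x0, 5, field):08X}")  # 00000200
```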
Figure 24: Compare Instruction
4.3 Load/Store Instructions:

This section covers the testing of the load/store instructions of the PowerPC processor design.
These are the only instructions that access the data memory directly.
4.3.1 Load Instructions:

The hex code for the instructions is created using the instruction format. The instruction is
fetched from the instruction memory, stored in the instruction register at the end of the IF_Phase
module, and passed to the control module in the ID_Phase module. The control module separates the
register and immediate values from the instruction fields. If the instruction form is D-form, the
immediate field is the displacement. If the instruction form is X-form, the register field is the index
register. The effective address is calculated by adding the contents of the base register to the
displacement or to the index register.
Figure 25 shows the load word and zero instruction, lwz r3,0010h(r30). The hex code for this
instruction is 807E0010h. The timeA marker in figure 25 shows the instruction being fetched from the
memory. The disp signal in figure 25 indicates the displacement, and the register r30 holds the base
address. At the end of the ID_Phase, ID_op0 and ID_op1 store the base address and the displacement.
The EX_EA register stores the effective address for the data memory. Figure 25 shows that the
memory address is calculated by adding the base address, 00000000h, and the displacement,
00000010h. The effective address, 00000010h, is stored in the EX_EA register. In the MEM_Phase
module, the address register stores the memory address and passes it to the memory module. In the
memory module, the data at the specified address is fetched. The data in the memory location
00000010h is 00000080h. This data is stored in the MEM_wb_data register. The MEM_wb_reg
signal in figure 25 indicates the write back register into which the data in the MEM_wb_data register
is written in the ID_Phase module.
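The effective address calculation and the word fetch of this example can be modelled in Python as below. This is a sketch with hypothetical names; it assumes a byte-addressed data memory with big-endian byte order, as defined by the PowerPC architecture, and uses a Python bytearray in place of the memory module.

```python
def effective_address(base, disp16):
    """EA = base + sign-extended 16-bit displacement, wrapped to 32 bits."""
    disp = disp16 - 0x10000 if disp16 & 0x8000 else disp16
    return (base + disp) & 0xFFFFFFFF

def load_word(memory, ea):
    """Read a big-endian 32-bit word from a byte-addressed memory."""
    return int.from_bytes(memory[ea:ea + 4], "big")

# Stand-in for the data memory, preloaded with 00000080h at address 10h.
memory = bytearray(64)
memory[0x10:0x14] = (0x00000080).to_bytes(4, "big")

ea = effective_address(0x00000000, 0x0010)  # base r30 = 0, disp = 0010h
print(hex(ea), hex(load_word(memory, ea)))  # 0x10 0x80
```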
Figure 25: Load (lwz) Instruction
4.3.2 Store Instructions:

As described in section 2.5.3, the store instruction stores data in the data memory.
Figure 26 shows the simulation result of the store instruction with update. The instruction shown
in figure 26 is stwu r3,0070h(r1). This instruction stores the word in the register r3 in the
memory location. The hex code of this instruction is 94610070h. The timeA marker in figure 26
shows the instruction being fetched from the instruction memory. In the ID_Phase, the data in the
base register, r1, is stored in the ID_op0 register. The displacement is stored in the ID_op1 register.
The ID_memdata signal indicates the data to be stored in the memory, which is the content of r3. The
memory address is calculated in the EX_Phase module by adding the two operand values and is
stored in the EX_EA register. At the same time, the base register, r1, is updated with the effective
address value.
In the MEM_Phase module, EX_memdata is divided into 4 bytes and stored in
data_in0, data_in1, data_in2, and data_in3. These data are passed to the memory module and stored in
their specified locations in the memory. This is shown in figure 26. The register r1 is updated with
the memory address.
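The store with update sequence can be sketched in Python in the same style. The names are hypothetical, the register and memory contents are made-up test values, and the byte order (most significant byte first) is an assumption taken from the big-endian PowerPC architecture.

```python
def store_word_update(regs, memory, rs, ra, disp16):
    """Model stwu rS,disp(rA): store regs[rs] at EA, then write EA back to rA."""
    disp = disp16 - 0x10000 if disp16 & 0x8000 else disp16
    ea = (regs[ra] + disp) & 0xFFFFFFFF
    # Split the word into four bytes (data_in0..data_in3 in the report),
    # most significant byte first.
    data_in = regs[rs].to_bytes(4, "big")
    memory[ea:ea + 4] = data_in
    regs[ra] = ea  # the update part of the instruction
    return ea

regs = [0] * 32
regs[3] = 0x12345678  # illustrative data value
regs[1] = 0x00000010  # illustrative base address
memory = bytearray(256)
ea = store_word_update(regs, memory, rs=3, ra=1, disp16=0x0070)
print(hex(ea), memory[ea:ea + 4].hex(), hex(regs[1]))  # 0x80 12345678 0x80
```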
Figure 26: Store with update Instruction
4.4 Branch Instructions:

The branch instructions of the PowerPC are explained in section 2.4. Figure 27 shows
the simulation waveform of a conditional branch instruction. The timeA marker in figure 27 shows
the instruction being fetched from the memory. The BO and BI fields in the instruction are separated
by the control module in the ID_Phase module. The BD field, which gives the branch target address,
is separated and extended to 32 bits by appending two 0 bits on the right and sign extending it to the
left.
At the beginning of the ID_Phase, the branch signal indicates whether the instruction is a
branch instruction or not. The value of the ID_BO field is 01100b and the value in the BI field is
01110b. The branch instruction checks whether CR[14] is 1. If this condition is satisfied, the
branch_cond signal goes to 1, which indicates that the branch is to be taken. At the same time, the
branch target address is calculated. Since the ID_aa signal is 1, the current program counter value is
not added to the branch address. The branch_addr signal in the waveform indicates the branch target
address.
At the beginning of the EX_Phase module, branch_cond is set to 1 and branch_addr is
calculated. The branch_addr is passed to the register bank module in the ID_Phase. The pc_next
signal in the waveform changes to the new program counter value. The IP, the program counter, is
changed to the branch target address, so the program flow sequence is modified. Once the branch is
to be taken, the instructions in the IF_Phase and ID_Phase modules have to be flushed.
Using the branch and branch_cond signals, the signals in the IF_Phase and ID_Phase
modules are set to 0 to prevent those instructions from executing. For an unconditional branch,
branch_addr is passed to pc_next without checking any condition.
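The condition test made for BO = 01100 and BI = 01110 can be expressed as a short Python model. The function names are hypothetical; the meaning of the BO bits follows the PowerPC architecture, and the count register test that BO can also request is deliberately left out of this sketch.

```python
def cr_bit(cr, bi):
    """Return bit bi of the 32-bit CR, where bit 0 is the most significant bit."""
    return (cr >> (31 - bi)) & 1

def branch_cond(bo, bi, cr):
    """Evaluate the condition part of a conditional branch (CTR test not modelled)."""
    ignore_cond = (bo >> 4) & 1  # BO[0] = 1: branch regardless of the CR bit
    want = (bo >> 3) & 1         # BO[1]: value the selected CR bit must have
    return int(ignore_cond or cr_bit(cr, bi) == want)

cr = 1 << (31 - 14)  # only CR[14] set
print(branch_cond(0b01100, 0b01110, cr))          # 1  (branch taken)
print(branch_cond(0b01100, 0b01110, 0x00000000))  # 0  (branch not taken)
```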
Figure 27: Branch Instructions
4.5 Data forwarding and Load-use:

The load-use and data forwarding modules are used to avoid data hazards. Figure 28 shows the
simulation result of load-use and data forwarding. The load instruction is lbz r10,0010h(r30) and
the next instruction is adde. r12,r10,r4. The data for r10 will not be available for the next instruction.
This is a data hazard. The hex codes for the load and adde. instructions are 895E0010h and
7D8A2515h. The timeA marker in figure 28 shows the load instruction being fetched from the
memory. Following the load instruction is the adde. instruction. The instructions are fetched from the
instruction memory and passed to the control module. When the adde. instruction is executing in the
ID_Phase, the data for the register r10 is not available, because the lbz instruction fetches the data
from the memory at the end of the MEM_Phase. The adde. instruction has to wait until the end of the
MEM_Phase for the r10 data. At the end of the MEM_Phase, the lbz instruction fetches the data for
the register r10 from the data memory. So the IF_Phase and ID_Phase have to stall for 1 clock cycle.
The stall_PC signal is used for stalling these two phases.
If the stall_PC signal is 1, the IF_Phase and ID_Phase are stalled. When this signal goes to 0,
the instructions execute continuously again. The data for the register r10 is available for the adde.
instruction at the EX_Phase. It is selected by enabling the control signal op0_sel. Once the data for
the register r10 is available, it is forwarded to alu_op0 as shown in figure 28. The alu_op0 and
alu_op1 signals are the two operands for the alu module. The IP, the program counter, is stalled for 1
clock cycle. After the data for the register r10 is forwarded to alu_op0, the adde. instruction is
executed and the result is stored in the EX_out register.
Figure 28: Load-use and Data forwarding
4.6 Sign-Extension Instructions:

There are two instructions in the PowerPC processor for sign extension. Figure 29 shows the
simulation of the extsb instruction, extsb r7,r8. The hex code of this instruction is 7D070774h. The
inst signal in figure 29 shows the instruction being fetched from the instruction memory. The data for
the register r8 is read from the register bank module. This data, 00000080h, is stored in the ID_op0
register at the end of the ID_Phase module. In the alu module, the sign bit of the low-order byte of
ID_op0 (bit 24, which is 1 here) is extended up to the MSB of the output. The EX_out register, which
is shown in figure 29, holds the output of the extsb instruction. The output, FFFFFF80h, is stored in
the MEM_wb_data register in the MEM_Phase. The data is then sent back to the register bank module
to be stored in the destination register, r7, in the ID_Phase module.
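The sign extension can be reproduced with a few lines of Python. The function names mirror the two PowerPC mnemonics extsb and extsh but are otherwise hypothetical helpers, not part of the design.

```python
def extsb(value):
    """Sign-extend the low-order byte of a 32-bit word to 32 bits."""
    byte = value & 0xFF
    return (byte | 0xFFFFFF00) if byte & 0x80 else byte

def extsh(value):
    """Sign-extend the low-order halfword of a 32-bit word to 32 bits."""
    half = value & 0xFFFF
    return (half | 0xFFFF0000) if half & 0x8000 else half

print(f"{extsb(0x00000080):08X}")  # FFFFFF80
print(f"{extsb(0x0000007F):08X}")  # 0000007F
```

The first result matches the value FFFFFF80h seen in EX_out for the operand 00000080h.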
Figure 29: Sign-Extension Instructions
Two Gantt charts are shown in table 1 and table 2. Table 1 shows the Gantt chart drawn up at
the beginning of the project, when the work was planned and weekly milestones were set. The task
plans and weekly milestones are shown in the Gantt chart. During the later part of the project, the
Gantt chart was modified, and the final Gantt chart is shown in table 2. The project task plan and
milestones were completed successfully.
Table 1: Initial Gantt Chart
[Gantt chart: the rows listed below are plotted against the weekly columns; the bar and milestone (*) positions are not reproduced.]
Weeks (starting week beginning 14th June): 14/06 - 19/06, 21/06 - 25/06, 28/06 - 2/07, 5/07 - 9/07, 12/07 - 16/07, 19/07 - 23/07, 26/07 - 30/07, 2/08 - 6/08, 9/08 - 13/08, 16/08 - 20/08, 23/08 - 27/08, 30/08 - 3/09, 6/09 - 10/09, 13/09 - 17/09, 20/09 - 27/09
Supervisor Holiday
Topic Selection
ACTIVITIES
Background Research
PowerPC Instruction set
Initial Datapath
Pipelining
Tasks
Initial Datapath
Behavioural Level ALU
Behavioural Level Condition Codes
Behavioural Level Memory
Behavioural Level Processor
Final Design
Writing-up
MILESTONES
Initial Datapath
Behavioural Level - ALU
Behavioural Level - Condition Codes
Behavioural Level - Memory
Behavioural Level - Processor
End of Practical Work
Final Design
Milestone: demonstrate to sup/examiner
Milestone: dissertation draft complete
Final corrections
Milestone: Hand-in
Table 2: Final Gantt Chart
[Gantt chart: the rows listed below are plotted against the weekly columns; the bar and milestone (*) positions are not reproduced.]
Weeks (starting week beginning 14th June): 14/06 - 19/06, 21/06 - 25/06, 28/06 - 2/07, 5/07 - 9/07, 12/07 - 16/07, 19/07 - 23/07, 26/07 - 30/07, 2/08 - 6/08, 9/08 - 13/08, 16/08 - 20/08, 23/08 - 27/08, 30/08 - 3/09, 6/09 - 10/09, 13/09 - 17/09, 20/09 - 27/09
Supervisor Holiday
Topic Selection
ACTIVITIES
Background Research
PowerPC Instruction set
Initial Datapath
Pipelining
Tasks
Initial Datapath
Behavioural Modules
Arithmetic and Load and Store Instructions
Pipeline Implementation
Data Forwarding and load-use
Branch Instructions, conditional and unconditional
Rotate and Shift Instructions
Sign-Extension Instructions
Final Design
Writing-up
MILESTONES
Initial Datapath
Arithmetic and Load and Store Instructions
Pipelining Implementation
Pipelining with data forwarding and Load-use Implementation
Branch Instructions
Rotate and Shift Instructions
End of Practical Work
Final Design
Milestone: demonstrate to sup/examiner
Milestone: dissertation draft complete
Final corrections
Milestone: Hand-in
The aim of the project is to design a 32-bit pipelined processor. The pipelined processor has
been designed and it can execute PowerPC instructions. The pipelined processor can execute fixed
point integer instructions, load/store instructions, and branch instructions. The fixed point integer
instructions include arithmetic, logical, compare, rotate, and shift instructions. Each instruction takes
five clock cycles for its complete execution. Data hazards are encountered in the pipelined processor
and they are eliminated by implementing the data forwarding and load-use modules. The pipelined
processor also supports the PowerPC branch instructions. The control hazard is eliminated by flushing
the instructions in the IF and ID stages of the pipeline. The pipelined processor also supports the
PowerPC integer load/store instructions.
6.1 Achievements:
The pipeline is implemented in the processor design and is fully working.
All the instructions which are implemented execute successfully.
Hazards such as data hazards and control hazards were encountered in the earlier part of the
design and were eliminated in the later part of the design.
Within the time period given, the pipelined processor is able to execute fixed point
instructions, branch instructions, and load/store instructions.
6.2 Limitations:
Due to the time limit, the system call instructions are not implemented.
The design is not synthesized because of the limited time period.
The design will not execute floating point instructions, memory management instructions,
load/store multiple instructions, load/store string instructions, and processor control
instructions.
Within the given time period, the processor is able to run the fixed point integer
instructions, integer load/store instructions, and branch instructions. The system call instructions are
not implemented because of the limited time.
In future, the processor can be extended by implementing floating point instructions,
load/store multiple instructions, load/store byte reverse instructions, system linkage instructions, trap
instructions, condition register logical instructions, integer load/store string instructions, branch
conditional to count register, branch conditional to link register, TLB management instructions,
processor control instructions, cache management instructions, and synchronization instructions.
With one additional month, I would be able to implement the system call
instructions, condition register logical instructions, branch conditional to link register, and branch
conditional to count register in the design.