
Contents

1. Introduction
2. System Description
   1. Building a Datapath
      a) Major Components
      b) Components for Arithmetic and Logic Functions
      c) Load word (lw) and store word (sw) instructions
      d) Branch on equal instruction
      e) Jump instruction
   2. Simple Implementation Scheme
      a) Creating a Single Datapath
      b) ALU Control
      c) Main Control
   3. Multicycle Implementation
      a) Additions and Changes in the Scheme
      b) Execution of Instructions in Clock Cycles
   4. Module Specification (Functional Description)
      a) ALU
      b) Memory
      c) Control
      d) Datapath
         i) Instruction Fetch
         ii) Instruction Decode
         iii) Execution
         iv) Memory Writeback
      e) Processor and Memory
3. Uniqueness
4. Challenges Faced
5. Conclusion
6. VHDL Code (RTL Schematic included in folder)
7. References

1. Introduction
Project Description
The goal is to design a fully working reduced instruction set (RISC) for a processor, and to design and implement or simulate in VHDL a processor that runs the developed instruction set, using a Field Programmable Gate Array (FPGA) if possible.

Motivation
As engineering students, this was our first academic project. It inspired us to pursue more research at the undergraduate (UG) level. It offered a practical implementation of theoretical concepts and an opportunity to bring a fun-to-do element into the course. Performing a real hardware project was of utmost interest to us.

Our Approach
First, we revised VHDL concepts. Then we reviewed the features of RISC processors and their advantages over CISC. We continued our study of RISC processors and implemented some of the components of the MIPS processor. The next task was to design an instruction set for our own processor. Innovative thinking by some of the group members paved the way for a 16-bit instruction set. Last but not least, the only job left, perhaps the most difficult one, was to implement the processor in VHDL, and we moved on to finish it.

2. System Description
1. Building a Datapath
a) Major Components
First we look at the elements required to execute the NEO instructions and how they are connected. The first element needed is a place to store the program instructions. This Instruction Memory holds the instructions and supplies them given an address. The address must be kept in the Program Counter (PC), and in order to increment the PC to the address of the next instruction, we also need an Adder. All these elements are shown in the figure.

After fetching one instruction from the instruction memory, the program counter has to be incremented so that it points to the address of the next instruction, 2 bytes later. This is realised by the datapath shown in the figure.
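The PC increment described above can be sketched as a small register with an adder. This is a minimal illustration, not the project's own fetch block: the entity name pc_unit, its ports and the fixed 16-bit width are assumptions made for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical entity: name and ports are assumptions for illustration only.
entity pc_unit is
  port (
    clk, rst_n : in  std_ulogic;
    pc_en      : in  std_ulogic;                      -- advance the PC when '1'
    pc         : out std_ulogic_vector(15 downto 0)
  );
end entity pc_unit;

architecture rtl of pc_unit is
  signal pc_reg : unsigned(15 downto 0);
begin
  process (clk, rst_n)
  begin
    if rst_n = '0' then
      pc_reg <= (others => '0');                      -- asynchronous, active-low reset
    elsif rising_edge(clk) then
      if pc_en = '1' then
        pc_reg <= pc_reg + 2;                         -- one instruction = 16 bits = 2 bytes
      end if;
    end if;
  end process;

  pc <= std_ulogic_vector(pc_reg);
end architecture rtl;
```

In the real datapath the next PC value comes from a multiplexer (sequential, branch or jump address); the sketch covers only the sequential case.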

b) Components for Arithmetic and Logic Functions
The instructions considered here all read two registers, perform an ALU operation and write back the result. These arithmetic-logical instructions are also called R-type instructions. This instruction class includes add, sub, slt, and, and or. The 8 registers of the processor are stored in a Register File. To read two data words, two inputs and two outputs are needed. The inputs are 3 bits wide and specify the numbers of the registers to be read; the outputs are 16 bits wide and carry the values of those registers. To write the result back, two more inputs are needed: one to specify the register number and one to supply the data to be written. The Register File is shown in the figure.
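A register file with two read ports and one write port, as described above, might look like the following sketch. The port names follow the regfile component declared in the VHDL listing for data decode; the entity name regfile_sketch, the fixed widths (16-bit data, 3-bit addresses), the reset behaviour and the asynchronous read are assumptions, since the project's own regfile architecture is not shown here.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Illustrative only: internals are assumed, not taken from the project.
entity regfile_sketch is
  port (
    clk, rst_n : in  std_ulogic;
    wen        : in  std_ulogic;                      -- write control
    writeport  : in  std_ulogic_vector(15 downto 0);  -- data to write
    adrwport   : in  std_ulogic_vector(2 downto 0);   -- write address
    adrport0   : in  std_ulogic_vector(2 downto 0);   -- read address 0
    adrport1   : in  std_ulogic_vector(2 downto 0);   -- read address 1
    readport0  : out std_ulogic_vector(15 downto 0);
    readport1  : out std_ulogic_vector(15 downto 0)
  );
end entity regfile_sketch;

architecture rtl of regfile_sketch is
  type reg_array is array (0 to 7) of std_ulogic_vector(15 downto 0);
  signal regs : reg_array;
begin
  -- Synchronous write, asynchronous active-low reset of all 8 registers.
  write_proc : process (clk, rst_n)
  begin
    if rst_n = '0' then
      regs <= (others => (others => '0'));
    elsif rising_edge(clk) then
      if wen = '1' then
        regs(to_integer(unsigned(adrwport))) <= writeport;
      end if;
    end if;
  end process write_proc;

  -- Combinational read on both ports.
  readport0 <= regs(to_integer(unsigned(adrport0)));
  readport1 <= regs(to_integer(unsigned(adrport1)));
end architecture rtl;
```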

To process the data from the Register File, an ALU with two data inputs is used. The figure shows the combination of Register File and ALU used to execute R-type instructions.

c) Load word (lw) and store word (sw) instructions
Two more elements are needed to implement the sw- and lw-instructions: the Data Memory and the Sign Extension Unit.

The sw- and lw-instructions compute a memory address by adding a register value to the 5-bit signed offset field contained in the instruction. Because the ALU operates on 16-bit values, the instruction offset field must be sign extended from 5 to 16 bits by replicating the sign bit 11 times in front of the original value. The instruction format for a lw- or sw-instruction is:

op (5 bit) | rs (3 bit) | rt (3 bit) | constant (5 bit)
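The sign extension can be sketched as a small function. This is an illustration under the assumption that the offset is the 5-bit constant field of the format shown above; the package name sign_ext_pkg and the function name sign_extend_5_to_16 are made up for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package; names are assumptions, not project identifiers.
package sign_ext_pkg is
  function sign_extend_5_to_16 (imm : std_ulogic_vector(4 downto 0))
    return std_ulogic_vector;
end package sign_ext_pkg;

package body sign_ext_pkg is
  function sign_extend_5_to_16 (imm : std_ulogic_vector(4 downto 0))
    return std_ulogic_vector is
  begin
    -- resize on a SIGNED value replicates the sign bit (bit 4) into bits 15..5.
    return std_ulogic_vector(resize(signed(imm), 16));
  end function sign_extend_5_to_16;
end package body sign_ext_pkg;
```

For example, "01111" (+15) extends to x"000F" and "10000" (-16) extends to x"FFF0".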

d) Branch on equal instruction
The beq instruction has three operands: two registers that are compared for equality, and an 11-bit offset used to compute the branch target address relative to the branch instruction address.

The figure shows the datapath for a branch on equal instruction. The datapath must do two operations: compare the register contents and compute the branch target. For the latter, the address field of the branch instruction must be sign extended from 8 bits to 16 bits and shifted left by 2 bits so that it is a word offset. The branch target address is then computed by adding the address of the next instruction (PC + 2) to the previously computed offset.
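The branch target computation can be sketched as follows. The function takes an offset that has already been sign extended to 16 bits, so it does not depend on the width of the offset field; the shift by 2 follows the text above. The package name branch_pkg and the function name branch_target are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;

-- Hypothetical package; names are assumptions, not project identifiers.
package branch_pkg is
  function branch_target (next_pc, offset_se : std_ulogic_vector(15 downto 0))
    return std_ulogic_vector;
end package branch_pkg;

package body branch_pkg is
  function branch_target (next_pc, offset_se : std_ulogic_vector(15 downto 0))
    return std_ulogic_vector is
    variable shifted : unsigned(15 downto 0);
  begin
    -- Shift the sign-extended offset left by 2, as stated in the text.
    shifted := shift_left(unsigned(offset_se), 2);
    -- 16-bit wrap-around addition also handles negative (two's complement) offsets.
    return std_ulogic_vector(unsigned(next_pc) + shifted);
  end function branch_target;
end package body branch_pkg;
```

With next_pc = x"0012" and an offset of +3 (x"0003") the target is x"001E"; with an offset of -1 (x"FFFF") it is x"000E".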

e) Jump Instruction
The jump instruction is similar to the branch instruction, but it computes the target PC differently and is not conditional. The destination address for a jump is formed by concatenating the upper 3 bits of the current PC + 2 with the 11-bit address field of the jump instruction and appending 00 as the last two bits.
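The concatenation that forms the jump address can be written directly in VHDL. This is a minimal sketch; the package name jump_pkg and the function name jump_target are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; names are assumptions, not project identifiers.
package jump_pkg is
  function jump_target (next_pc : std_ulogic_vector(15 downto 0);
                        addr    : std_ulogic_vector(10 downto 0))
    return std_ulogic_vector;
end package jump_pkg;

package body jump_pkg is
  function jump_target (next_pc : std_ulogic_vector(15 downto 0);
                        addr    : std_ulogic_vector(10 downto 0))
    return std_ulogic_vector is
  begin
    -- 3 upper bits of PC + 2, then the 11-bit address field, then "00": 16 bits.
    return next_pc(15 downto 13) & addr & "00";
  end function jump_target;
end package body jump_pkg;
```

For next_pc = x"A002" (upper bits "101") and addr = "00000000011", the result is x"A00C".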

2. Simple Implementation Scheme


The simplest possible implementation of the MIPS-style processor contains the datapath segments explained above, extended by the required control lines.

a) Creating a Single Datapath
The simplest datapath attempts to execute all instructions in one clock cycle. This means that any element can be used only once per instruction, so elements that are needed more than once have to be duplicated. Where possible, datapath elements are shared by different instruction flows. This requires multiple connections to an element's input, which is commonly realised with a multiplexer.

The figure shows the combined datapath, including one memory for instructions and one for data, the ALU, the PC unit and the multiplexers mentioned above.

b) ALU Control
The NEO instruction field that contains the information about the instruction has the following structure:

op (5 bit) | rs (3 bit) | rt (3 bit) | rd (3 bit) | shamt (2 bit)

The meaning of the fields is:
op: basic operation
rs: first register source
rt: second register source
rd: register destination
shamt: shift amount

Opcodes for R-Type instructions

Mnemonic  Opcode  Description
ADD       00000   RD = RS + RT
MOVE      00001   RD = RS
SUB       00010   RD = RS - RT
SLL       00011   RD = RS << SHIFT
SRL       00100   RD = RS >> SHIFT
AND       00101   RD = RS & RT
OR        00110   RD = RS | RT
NOT       00111   RD = ~RS
XOR       01000   RD = RS XOR RT
SLT       01001   RD = (RS < RT) ? 1 : 0
JR        01010   PC = RS

Opcodes for I-Type instructions

Mnemonic  Opcode  Description
ADDI      01011   RD = RS + CONST
SUBI      01100   RD = RS - CONST
SLTI      01101   RD = (RS < CONST) ? 1 : 0
LW        01110   RD = MEM(RS + OFF)
SW        01111   RS = MEM(RT + OFF)

Opcodes for J-Type instructions

Mnemonic  Opcode  Description
BEQ       10000   IF RS = RT, PC += OFF
BNE       10001   IF RS != RT, PC += OFF
J         10010   Jump to address
JAL       10011   Jump and link

c) Main Control

The main control unit generates the control bits for the multiplexers, the data memory and the ALU control unit. The input of the main control unit is the 5-bit op-field of the NEO instruction field.
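Extracting the op-field and the other fields of an R-type instruction word can be sketched as below. The bit positions follow from the field widths given in the format (op in bits 15 to 11, rs 10 to 8, rt 7 to 5, rd 4 to 2, shamt 1 to 0); the package name neo_fields_pkg, the record type r_fields and the function decode_r are assumptions for this example.

```vhdl
library ieee;
use ieee.std_logic_1164.all;

-- Hypothetical package; names are assumptions, not project identifiers.
package neo_fields_pkg is
  type r_fields is record
    op    : std_ulogic_vector(4 downto 0);
    rs    : std_ulogic_vector(2 downto 0);
    rt    : std_ulogic_vector(2 downto 0);
    rd    : std_ulogic_vector(2 downto 0);
    shamt : std_ulogic_vector(1 downto 0);
  end record r_fields;

  function decode_r (instr : std_ulogic_vector(15 downto 0)) return r_fields;
end package neo_fields_pkg;

package body neo_fields_pkg is
  function decode_r (instr : std_ulogic_vector(15 downto 0)) return r_fields is
    variable f : r_fields;
  begin
    f.op    := instr(15 downto 11);  -- 5-bit opcode, input of the main control
    f.rs    := instr(10 downto 8);   -- first source register
    f.rt    := instr(7 downto 5);    -- second source register
    f.rd    := instr(4 downto 2);    -- destination register
    f.shamt := instr(1 downto 0);    -- shift amount
    return f;
  end function decode_r;
end package body neo_fields_pkg;
```

For example, the word x"114C" decodes to op = "00010" (SUB in the opcode table), rs = "001", rt = "010", rd = "011" and shamt = "00".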

3. Multicycle Implementation
To avoid the disadvantages of the single-cycle implementation described in the previous section, a multicycle implementation is used. This technique divides each instruction into steps, and each step is executed in one clock cycle. The multicycle implementation allows a functional unit to be used more than once per instruction, so that the number of functional units can be reduced. The major advantage of a multicycle design is this ability to share functional units within the execution of a single instruction.

a) Additions and Changes in the Scheme

Compared to the single-cycle datapath, the differences are that only one memory unit is used for instructions and data, that there is only one ALU instead of an ALU and two adders, and that several output registers are added to hold the output value of a unit until it is used in a later clock cycle. The instruction register (IR) and the memory data register (MDR) are added to save the output of the memory. The registers A and B hold the register operands read from the register file, and ALUOut holds the output of the ALU. With the exception of the IR, all these registers hold data only between a pair of adjacent clock cycles. Because the IR holds its value during the whole execution of an instruction, it requires a write control signal.

The reduction from the former three arithmetic units to one ALU also causes the following changes in the datapath:
- An additional multiplexer is added at the first ALU input to choose between the A register and the PC.
- The multiplexer at the second ALU input is changed from a two-way to a four-way multiplexer. The two new inputs are a constant 2 to increment the PC and the sign-extended and shifted offset field for the branch instruction.

To handle branches and jumps, more additions to the datapath are required. The three cases of R-type instructions, branch instruction and jump instruction cause three different values to be written into the PC:
- The output of the ALU, which is PC + 2, is stored directly in the PC.
- The register ALUOut, after the branch target address has been computed.
- The lower 11 bits of the IR shifted left by two and concatenated with the upper 3 bits of the incremented PC, when the instruction is a jump.

If the instruction is a branch, the write signal for the PC is conditional: the computed branch address is written to the PC only if the two compared registers are equal.

Therefore the PC needs two write signals, which are PCWrite if the write is unconditional (value is PC + 2 or jump instruction) and PCWriteCond if the write is conditional.

The write signal for the PC is formed from the ALU zero bit and the two write signals PCWrite and PCWriteCond using an AND gate and an OR gate: the PC is written if PCWrite is set, or if PCWriteCond is set and the zero bit is high.

b) Execution of Instructions in Clock Cycles
The execution of an instruction is broken into clock cycles; that is, each instruction is divided into a series of steps.

The execution of an instruction is divided into at most five steps. Different elements of the datapath can work in parallel during one clock cycle, whereas others can only be used in series. It must therefore be ensured that, at the end of each step, the computed values are stored either in the memory or in one of the registers.

The operation steps are:

1. Instruction fetch step
Fetch the instruction from the memory and compute the address of the sequential instruction:

IR = Memory[PC]
PC = PC + 2

Control signal setting:
MemRead = 1 IRWrite = 1 IorD = 0 ALUSrcA = 0 ALUSrcB = 01 ALUOp = 00 PCSource = 00 PCWrite = 1

2. Instruction decode and register fetch step
It is still unknown what the instruction is, so only actions that are applicable to all instructions or are not harmful can be performed. The registers indicated by the rs and rt fields of the instruction are read and stored in the A and B registers, and the potential branch target is computed and stored in the ALUOut register.

A = Reg[IR[10-8]]
B = Reg[IR[7-5]]
ALUOut = PC + (sign-extend(IR[4-0]) << 2)

Control signal setting:


ALUSrcA = 0 ALUSrcB = 11 ALUOp = 00

3. Execution, memory address computation or branch completion
In this step the instruction is known and the operation depends on what the instruction is. One of these four functions is executed:

a. Memory reference:
ALUOut = A + sign-extend(IR[4-0])

Control signal setting:


ALUSrcA = 1 ALUSrcB = 10 ALUOp = 00

b. Arithmetic-logical instruction:
ALUOut = A op B

Control signal setting:


ALUSrcA = 1 ALUSrcB = 00 ALUOp = 10

c. Branch:
if (A == B) PC = ALUOut

Control signal setting:


ALUSrcA = 1 ALUSrcB = 00 ALUOp = 01 PCWriteCond = 1 PCSource = 01

d. Jump:
PC = PC[15-13] & (IR[10-0] << 2)

Control signal setting:


PCWrite = 1

4. Memory access or R-type instruction completion step
In this step a load or store instruction accesses memory, or an arithmetic-logical instruction writes its result.

a. Memory reference:
MDR = Memory [ALUOut]

or
Memory [ALUOut] = B

Control signal setting:


MemRead = 1 or MemWrite = 1 IorD = 1

b. Arithmetic-logical instruction:
Reg[IR[4-2]] = ALUOut

Control signal setting:


RegDst = 1 RegWrite = 1 MemtoReg = 0

5. Memory read completion step
The load instruction is completed by writing back the value from the memory:

Reg[IR[7-5]] = MDR

Control signal setting:


MemtoReg = 1 RegWrite = 1 RegDst = 0

4. Module Specification
a) ALU
Functional Description
The arithmetic-logic unit (ALU) performs basic arithmetic and logic operations, which are selected by the opcode. The result of the operation is written to the output. An additional zero bit goes high if the result equals zero. At present, the arithmetic operations add and sub and the logic operations and, or and slt can be applied to the inputs. The inputs are 16 bits wide and of type unsigned. Detection of overflow or borrow is not supported at the moment.

b) Memory
Functional Description
Data is synchronously written to or read from the memory with a data bus width of 16 bits. The memory consists of four RAM blocks with a data width of 8 bits each. A control signal enables the memory to be written; otherwise data is only read. To store data in the memory, the data word is subdivided into four bytes, which are written separately to the RAM blocks. Conversely, the single bytes are concatenated to recover the data word. At the moment, it is only possible to read and write whole data words; addressing half-words or single bytes is not allowed. To write or read a data word, all RAM blocks have to be selected. Hence, the lowest two bits are not examined by the chip-select logic.

The NEO processor addresses data with an address width of 16 bits, while the address width of each RAM block is 8 bits. All RAM blocks are connected to the same address. Since the full address width is not used for addressing and chip selects, each data word can be reached through multiple addresses.

c) Control
Functional Description
The input to the state machine is the upper 5 bits of the instruction, that is, the opcode field. The outputs of the state machine are the control signals for the individual functional units of the processor, especially the multiplexers of the datapath. The operation code of the ALU is stored in a truth table, and the corresponding opcode is produced depending on the ALUOp signal of the state machine.

d) Data Path
Functional Description
The datapath is divided into four sections with respect to the pipelining structure of a processor. The four parts are Instruction Fetch, Instruction Decode, Execution and Memory Writeback. These sections are synthesized on their own and then combined into the Data Block.

i) Instruction Fetch
Functional Description
The Instruction Fetch Block contains the PC, the Instruction Register and the Memory Data Register. This part provides the data and instructions from the memory.

ii) Instruction Decode
Functional Description
The Instruction Decode Block passes the instruction held in the Instruction Register to the Register File and computes the second operand for a branch instruction or a sw- or lw-instruction.

iii) Execution
Functional Description
The Execution Block contains the ALU as its main element and computes the desired result of the instruction. It also computes the jump target address and provides it to the Memory Writeback Block. The operands loaded into the ALU are chosen by two multiplexers, which are controlled by the signals ALUSrcA and ALUSrcB.

iv) Memory Writeback
Functional Description
The Memory Writeback Block consists of the ALUOut register and a multiplexer with the select signal PCSource. This block leads the result of the computation back either to the memory or to the register file. The multiplexer leads back the next PC value depending on the PCSource signal.

e) Processor and Memory
Functional Description
The two parts, Datapath and Controlpath, are combined into the processing unit. Together with the Memory, they form the complete processor.

3. Uniqueness
What we were trying to do in this project was to learn and experiment with how a processor works and how we could modify its specification for our purpose. We did not need to implement a large number of instructions, so we decided to build our own ISA for our processor, NEO. Because the processor was designed for a small number of instructions, fewer bits were needed, so we devised a compact 16-bit ISA. This removed the burden of unused bits. The ISA has no separate function field; instead, we integrated it into the opcode field. Our ISA is as follows:

R-Type Format
op (5 bit) | rs (3 bit) | rt (3 bit) | rd (3 bit) | shamt (2 bit)

I-Type Format
op (5 bit) | rs (3 bit) | rt (3 bit) | constant (5 bit)

J-Type Format
op (5 bit) | offset (11 bit)

4. Challenges Faced
First of all, we would like to say that working on this project was great fun, and we really liked working on it as a team. We also realized how a task that looks impossible to an individual becomes much easier when people with different strengths unite and work on it together. But during this term we faced a lot of problems too.

First, time was limited because of the overload from a few less relevant courses, which left less time for more relevant and useful subjects. Then, while working on the project, programming the individual components was not that difficult, but implementing the synchronous operation of control and datapath was quite complex for us. Also, integrating the VHDL code of individual components, written by different team members with different naming conventions and variables, led to a lot of errors and chaos.

A few technical problems also left us downhearted. At first we were using ModelSim, which was the default simulator, but the code segments were not simulating, which led us to think that there were errors in our code. Later we realized that the problem lay not in the code but in our use of ModelSim. So we switched to ISim for simulation, which removed a lot of warnings and simulation errors. We would also recommend making use of the available standard libraries such as numeric_std for arithmetic operations on signed integers, since it makes the work much easier.

There is also a peculiarity in the RTL Schematic generated by Xilinx. When we generated our RTL Schematic, we knew that we were not using most of the bits from various components, but Xilinx removed those components from the RTL Schematic entirely. Later, on searching online, we found that this happens because no output from those blocks was defined in the main entity; i.e. if a component block does not drive any output, Xilinx simply removes it from the RTL Schematic. So whenever you find such a discrepancy, make sure that every block contributes to an output; otherwise it will not be included in the RTL Schematic.

5. Conclusion
o Our own experiences
Application of theoretical knowledge learned in VHDL and Computer Architecture
We learned VHDL coding in the course DIGITAL ELECTRONICS, but we never thought that we would have such a great opportunity to use that knowledge to develop the code for our own processor during our UG studies. When we started, we were all worried about how we would be able to read so many pages of books to understand this subject in depth, but by now we have all learned how to pace our reading and pick out only the material that is important.

Realization that working as an individual and working as a TEAM are totally different
Since entering this institute, we had all been working individually. The project under your guidance has changed our way of working. We all learned how to work in a group effectively and efficiently, in a world where people have different traits and natures. Working on this project taught us how to achieve results through a combined team effort and not purely on an individual level. So we thank you, sir, for giving us a platform that brought out the best in us.

Acquaintance with existing processors and their variety for different purposes
Now we know the different technologies used in different processors, and we can differentiate between their characteristics. We also learned how we could make our NEO more useful and better matched to present-day demands.

o Scope of Improvement
Including more operations
Our main intention in this project was to learn how a processor really works, so we did not focus on including a large number of instructions. Given more time, we would like to include more operations.

Including pipelining to improve the processor's performance
Pipelining enhances the performance of a processor, but due to the time limitation we could not add pipelining to our processor.

Realizing the hardware implementation of the processor on an FPGA Spartan 3 kit
We were close to realizing the hardware implementation of the processor on an FPGA Spartan 3 kit, but we could not arrange for the kit. We even studied the user constraint file required for hardware implementation on the FPGA. Because the kit was unavailable, we could not achieve what we intended.

6. VHDL Code ( RTL Schematic included in folder )


1. VHDL code for ALU
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY alu IS
  PORT (
    a, b   : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    opcode : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);
    result : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    zero   : OUT STD_ULOGIC);
END alu;

ARCHITECTURE behave OF alu IS
BEGIN
  PROCESS(a, b, opcode)
    -- declaration of variables
    VARIABLE a_uns : UNSIGNED(width-1 DOWNTO 0);
    VARIABLE b_uns : UNSIGNED(width-1 DOWNTO 0);
    VARIABLE r_uns : UNSIGNED(width-1 DOWNTO 0);
    VARIABLE z_uns : UNSIGNED(0 DOWNTO 0);
  BEGIN
    -- initialize values
    a_uns := UNSIGNED(a);
    b_uns := UNSIGNED(b);
    r_uns := (OTHERS => '0');
    z_uns(0) := '0';

    CASE opcode IS
      -- add
      WHEN "010" => r_uns := a_uns + b_uns;
      -- sub
      WHEN "110" => r_uns := a_uns - b_uns;
      -- and
      WHEN "000" => r_uns := a_uns AND b_uns;
      -- or
      WHEN "001" => r_uns := a_uns OR b_uns;
      -- slt: result is 1 if the difference a - b is negative, else 0
      WHEN "111" =>
        r_uns := a_uns - b_uns;
        IF SIGNED(r_uns) < 0 THEN
          r_uns := TO_UNSIGNED(1, r_uns'LENGTH);
        ELSE
          r_uns := (OTHERS => '0');
        END IF;
      -- others
      WHEN OTHERS => r_uns := (OTHERS => 'X');
    END CASE;

    -- set zero bit if result equals zero
    IF TO_INTEGER(r_uns) = 0 THEN
      z_uns(0) := '1';
    ELSE
      z_uns(0) := '0';
    END IF;

    -- assign variables to output signals
    result <= STD_ULOGIC_VECTOR(r_uns);
    zero   <= z_uns(0);
  END PROCESS;
END behave;

2. VHDL code for ALUControl


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

ENTITY ALUControl IS
  PORT (
    instr_4_0 : IN  std_ulogic_vector(4 downto 0);
    ALUOp     : IN  std_ulogic_vector(1 downto 0);
    ALUopcode : OUT std_ulogic_vector(2 downto 0)
  );
END ALUControl;

ARCHITECTURE behave OF ALUControl IS
BEGIN
  Alu_Control : PROCESS(instr_4_0, ALUOp)
    CONSTANT cADD : std_ulogic_vector(4 downto 0) := "00000";
    CONSTANT cSUB : std_ulogic_vector(4 downto 0) := "00010";
    CONSTANT cAND : std_ulogic_vector(4 downto 0) := "00100";
    CONSTANT cOR  : std_ulogic_vector(4 downto 0) := "00101";
    CONSTANT cSLT : std_ulogic_vector(4 downto 0) := "01010";
  BEGIN
    case ALUOp is
      when "00" => ALUopcode <= "010"; -- add
      when "01" => ALUopcode <= "110"; -- subtract
      when "10" => -- operation depends on function field
        case instr_4_0(4 downto 0) is
          when cADD   => ALUopcode <= "010"; -- add
          when cSUB   => ALUopcode <= "110"; -- subtract
          when cAND   => ALUopcode <= "000"; -- AND
          when cOR    => ALUopcode <= "001"; -- OR
          when cSLT   => ALUopcode <= "111"; -- slt
          when others => ALUopcode <= "000";
        end case;
      when others => ALUopcode <= "000";
    end case;
  END PROCESS;
END behave;

3. VHDL code for Control


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

ENTITY control IS
  PORT (
    clk, rst_n  : IN  std_ulogic;
    instr_15_11 : IN  std_ulogic_vector(4 downto 0);
    instr_4_0   : IN  std_ulogic_vector(4 downto 0);
    zero        : IN  std_ulogic;
    ALUopcode   : OUT std_ulogic_vector(2 downto 0);
    RegDst, RegWrite, ALUSrcA, MemRead, MemWrite, MemtoReg, IorD, IRWrite : OUT std_ulogic;
    ALUSrcB, PCSource : OUT std_ulogic_vector(1 downto 0);
    PC_en       : OUT std_ulogic
  );
END control;

ARCHITECTURE behave OF control IS

  COMPONENT ControlFSM
    PORT (
      clk, rst_n  : IN  std_ulogic;
      instr_15_11 : IN  std_ulogic_vector(4 downto 0);
      RegDst, RegWrite, ALUSrcA, MemRead, MemWrite, MemtoReg, IorD, IRWrite,
      PCWrite, PCWriteCond : OUT std_ulogic;
      ALUOp, ALUSrcB, PCSource : OUT std_ulogic_vector(1 downto 0)
    );
  END COMPONENT;

  COMPONENT ALUControl
    PORT (
      instr_4_0 : IN  std_ulogic_vector(4 downto 0);
      ALUOp     : IN  std_ulogic_vector(1 downto 0);
      ALUopcode : OUT std_ulogic_vector(2 downto 0)
    );
  END COMPONENT;

  SIGNAL ALUOp_intern       : std_ulogic_vector(1 downto 0);
  SIGNAL PCWrite_intern     : std_ulogic;
  SIGNAL PCWriteCond_intern : std_ulogic;

BEGIN

  inst_ControlFSM : ControlFSM
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      instr_15_11 => instr_15_11,
      RegDst      => RegDst,
      RegWrite    => RegWrite,
      ALUSrcA     => ALUSrcA,
      MemRead     => MemRead,
      MemWrite    => MemWrite,
      MemtoReg    => MemtoReg,
      IorD        => IorD,
      IRWrite     => IRWrite,
      PCWrite     => PCWrite_intern,
      PCWriteCond => PCWriteCond_intern,
      ALUOp       => ALUOp_intern,
      ALUSrcB     => ALUSrcB,
      PCSource    => PCSource
    );

  inst_ALUControl : ALUControl
    PORT MAP (
      instr_4_0 => instr_4_0,
      ALUOp     => ALUOp_intern,
      ALUopcode => ALUopcode
    );

  -- PC is written unconditionally (PCWrite) or on a taken branch (PCWriteCond and zero)
  PC_en <= PCWrite_intern OR (PCWriteCond_intern AND zero);

END behave;

4. VHDL code for ControlFSM


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

ENTITY ControlFSM IS
  PORT (
    clk, rst_n  : IN  std_ulogic;
    instr_15_11 : IN  std_ulogic_vector(4 downto 0);
    RegDst, RegWrite, ALUSrcA, MemRead, MemWrite, MemtoReg, IorD, IRWrite,
    PCWrite, PCWriteCond : OUT std_ulogic;
    ALUOp, ALUSrcB, PCSource : OUT std_ulogic_vector(1 downto 0)
  );
END ControlFSM;

ARCHITECTURE behave OF ControlFSM IS
  ------------------------------------------------------------------------------
  -- Definition of the state names
  TYPE state_type IS (InstDec, MemAddComp, MemAccL, MemReadCompl, MemAccS, Exec,
                      RCompl, BranchCompl, JumpCompl, ErrState, InstFetch);
  SIGNAL state, next_state : state_type;
BEGIN
  ------------------------------------------------------------------------------
  -- State process
  state_reg : PROCESS(clk, rst_n)
  BEGIN
    IF rst_n = '0' THEN
      state <= InstFetch;
    ELSIF RISING_EDGE(clk) THEN
      state <= next_state;
    END IF;
  END PROCESS;

  ------------------------------------------------------------------------------
  -- Logic Process
  logic_process : PROCESS(state, instr_15_11)
    -- RegDst RegWrite ALUSrcA MemRead MemWrite MemtoReg IorD IRWrite PCWrite PCWriteCond : 10 x 1 bit
    -- ALUOp ALUSrcB PCSource : 3 x 2 bit
    VARIABLE control_signals : std_ulogic_vector(15 downto 0);
    -- Definition of constants for the value of the Inst_Funct_Field
    Constant LOADWORD  : std_ulogic_vector(4 downto 0) := "00011";
    Constant STOREWORD : std_ulogic_vector(4 downto 0) := "01011";
    Constant RTYPE     : std_ulogic_vector(4 downto 0) := "00000";
    Constant BEQ       : std_ulogic_vector(4 downto 0) := "00100";
    Constant JMP       : std_ulogic_vector(4 downto 0) := "00010";
  BEGIN
    CASE state IS
      -- Instruction Fetch
      WHEN InstFetch =>
        control_signals := "0001000110000100";
        next_state <= InstDec;
      -- Instruction Decode and Register Fetch
      WHEN InstDec =>
        control_signals := "0000000000001100";
        IF instr_15_11 = LOADWORD OR instr_15_11 = STOREWORD THEN
          next_state <= MemAddComp;
        ELSIF instr_15_11 = RTYPE THEN
          next_state <= Exec;
        ELSIF instr_15_11 = BEQ THEN
          next_state <= BranchCompl;
        ELSIF instr_15_11 = JMP THEN
          next_state <= JumpCompl;
        ELSE
          next_state <= ErrState;
        END IF;
      -- Memory Address Computation
      WHEN MemAddComp =>
        control_signals := "0010000000001000";
        IF instr_15_11 = LOADWORD THEN
          next_state <= MemAccL;
        ELSIF instr_15_11 = STOREWORD THEN
          next_state <= MemAccS;
        ELSE
          next_state <= ErrState;
        END IF;
      -- NOTE: the WHEN branches for MemAccL, MemReadCompl, MemAccS and Exec are
      -- missing from the listing as received (only a stray "next_state <= RCompl;"
      -- survived). The four branches below are a reconstruction from the control
      -- signal settings given in section 3 b), not the original code.
      -- Memory Access Load (MemRead = 1, IorD = 1)
      WHEN MemAccL =>
        control_signals := "0001001000000000";
        next_state <= MemReadCompl;
      -- Memory Read Completion (MemtoReg = 1, RegWrite = 1, RegDst = 0)
      WHEN MemReadCompl =>
        control_signals := "0100010000000000";
        next_state <= InstFetch;
      -- Memory Access Store (MemWrite = 1, IorD = 1)
      WHEN MemAccS =>
        control_signals := "0000101000000000";
        next_state <= InstFetch;
      -- Execution (ALUSrcA = 1, ALUSrcB = 00, ALUOp = 10)
      WHEN Exec =>
        control_signals := "0010000000100000";
        next_state <= RCompl;
      -- R-type Completion
      WHEN RCompl =>
        control_signals := "1110000000100000";
        next_state <= InstFetch;
      -- Branch Completion
      WHEN BranchCompl =>
        control_signals := "0010000001010001";
        next_state <= InstFetch;
      -- Jump Completion
      WHEN JumpCompl =>
        control_signals := "0000000010001110";
        next_state <= InstFetch;
      WHEN OTHERS =>
        control_signals := (others => 'X');
        next_state <= ErrState;
    END case;

    RegDst      <= control_signals(15);
    RegWrite    <= control_signals(14);
    ALUSrcA     <= control_signals(13);
    MemRead     <= control_signals(12);
    MemWrite    <= control_signals(11);
    MemtoReg    <= control_signals(10);
    IorD        <= control_signals(9);
    IRWrite     <= control_signals(8);
    PCWrite     <= control_signals(7);
    PCWriteCond <= control_signals(6);
    ALUOp       <= control_signals(5 downto 4);
    ALUSrcB     <= control_signals(3 downto 2);
    PCSource    <= control_signals(1 downto 0);
  END process;
END behave;

5. VHDL code for data


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY data IS
  PORT (
    clk, rst_n : IN std_ulogic;
    PC_en, IorD, MemtoReg, IRWrite, ALUSrcA, RegWrite, RegDst : IN std_ulogic;
    PCSource, ALUSrcB  : IN  std_ulogic_vector(1 downto 0);
    ALUopcode          : IN  std_ulogic_vector(2 downto 0);
    mem_data           : IN  std_ulogic_vector(width-1 downto 0);
    reg_B, mem_address : OUT std_ulogic_vector(width-1 downto 0);
    instr_15_11        : OUT std_ulogic_vector(4 downto 0);
    instr_4_0          : OUT std_ulogic_vector(4 downto 0);
    zero               : OUT std_ulogic
  );
END data;

ARCHITECTURE behave OF data IS

  COMPONENT data_fetch
    PORT (
      clk         : IN  STD_ULOGIC;
      rst_n       : IN  STD_ULOGIC;
      pc_in       : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      alu_out     : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      mem_data    : IN  std_ulogic_vector(width-1 DOWNTO 0);
      PC_en       : IN  STD_ULOGIC;
      IorD        : IN  STD_ULOGIC;
      IRWrite     : IN  STD_ULOGIC;
      reg_memdata : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      instr_15_11 : OUT STD_ULOGIC_VECTOR(4 downto 0);
      instr_10_8  : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
      instr_7_5   : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
      instr_4_0   : OUT STD_ULOGIC_VECTOR(4 downto 0);
      mem_address : OUT std_ulogic_vector(width-1 DOWNTO 0);
      pc_out      : OUT std_ulogic_vector(width-1 DOWNTO 0));
  END COMPONENT;

  COMPONENT data_decode
    PORT (
      clk             : IN  STD_ULOGIC;
      rst_n           : IN  STD_ULOGIC;
      instr_10_8      : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);
      instr_7_5       : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);
      instr_4_0       : IN  STD_ULOGIC_VECTOR(4 downto 0);
      reg_memdata     : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      alu_out         : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      RegDst          : IN  STD_ULOGIC;
      RegWrite        : IN  STD_ULOGIC;
      MemtoReg        : IN  STD_ULOGIC;
      reg_A           : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      reg_B           : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      instr_4_0_se    : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      instr_4_0_se_sl : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0));
  END COMPONENT;

  COMPONENT data_execution
    PORT (
      instr_10_8      : IN  std_ulogic_vector(2 downto 0);
      instr_7_5       : IN  std_ulogic_vector(2 downto 0);
      instr_4_0       : IN  std_ulogic_vector(4 downto 0);
      ALUSrcA         : IN  std_ulogic;
      ALUSrcB         : IN  std_ulogic_vector(1 downto 0);
      ALUopcode       : IN  std_ulogic_vector(2 downto 0);
      reg_A, reg_B    : IN  std_ulogic_vector(width-1 downto 0);
      pc_out          : IN  std_ulogic_vector(width-1 downto 0);
      instr_4_0_se    : IN  std_ulogic_vector(width-1 downto 0);
      instr_4_0_se_sl : IN  std_ulogic_vector(width-1 downto 0);
      jump_addr       : OUT std_ulogic_vector(width-1 downto 0);
      alu_result      : OUT std_ulogic_vector(width-1 downto 0);
      zero            : OUT std_ulogic);
  END COMPONENT;

  COMPONENT data_memwriteback
    PORT (
      clk, rst_n : IN  std_ulogic;
      jump_addr  : IN  std_ulogic_vector(width-1 downto 0);
      alu_result : IN  std_ulogic_vector(width-1 downto 0);
      PCSource   : IN  std_ulogic_vector(1 downto 0);
      pc_in      : OUT std_ulogic_vector(width-1 downto 0);
      alu_out    : OUT std_ulogic_vector(width-1 downto 0));
  END COMPONENT;

  SIGNAL pc_in_intern           : std_ulogic_vector(width-1 downto 0);
  SIGNAL alu_out_intern         : std_ulogic_vector(width-1 downto 0);
  SIGNAL reg_memdata_intern     : std_ulogic_vector(width-1 downto 0);
  SIGNAL instr_10_8_intern      : std_ulogic_vector(2 downto 0);
  SIGNAL instr_7_5_intern       : std_ulogic_vector(2 downto 0);
  SIGNAL instr_4_0_intern       : std_ulogic_vector(4 downto 0);
  SIGNAL pc_out_intern          : std_ulogic_vector(width-1 downto 0);
  SIGNAL reg_A_intern           : std_ulogic_vector(width-1 downto 0);
  SIGNAL reg_B_intern           : std_ulogic_vector(width-1 downto 0);
  SIGNAL instr_4_0_se_intern    : std_ulogic_vector(width-1 downto 0);
  SIGNAL instr_4_0_se_sl_intern : std_ulogic_vector(width-1 downto 0);
  SIGNAL jump_addr_intern       : std_ulogic_vector(width-1 downto 0);
  SIGNAL alu_result_intern      : std_ulogic_vector(width-1 downto 0);

BEGIN

  inst_data_fetch : data_fetch
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      pc_in       => pc_in_intern,
      alu_out     => alu_out_intern,
      mem_data    => mem_data,
      PC_en       => PC_en,
      IorD        => IorD,
      IRWrite     => IRWrite,
      reg_memdata => reg_memdata_intern,
      instr_15_11 => instr_15_11,
      instr_10_8  => instr_10_8_intern,
      instr_7_5   => instr_7_5_intern,
      instr_4_0   => instr_4_0_intern,
      mem_address => mem_address,
      pc_out      => pc_out_intern);

  inst_data_decode : data_decode
    PORT MAP (
      clk             => clk,
      rst_n           => rst_n,
      instr_10_8      => instr_10_8_intern,
      instr_7_5       => instr_7_5_intern,
      instr_4_0       => instr_4_0_intern,
      reg_memdata     => reg_memdata_intern,
      alu_out         => alu_out_intern,
      RegDst          => RegDst,
      RegWrite        => RegWrite,
      MemtoReg        => MemtoReg,
      reg_A           => reg_A_intern,
      reg_B           => reg_B_intern,
      instr_4_0_se    => instr_4_0_se_intern,
      instr_4_0_se_sl => instr_4_0_se_sl_intern
    );

  inst_data_execution : data_execution
    PORT MAP (
      instr_10_8      => instr_10_8_intern,
      instr_7_5       => instr_7_5_intern,
      instr_4_0       => instr_4_0_intern,
      ALUSrcA         => ALUSrcA,
      ALUSrcB         => ALUSrcB,
      ALUopcode       => ALUopcode,
      reg_A           => reg_A_intern,
      reg_B           => reg_B_intern,
      pc_out          => pc_out_intern,
      instr_4_0_se    => instr_4_0_se_intern,
      instr_4_0_se_sl => instr_4_0_se_sl_intern,
      jump_addr       => jump_addr_intern,
      alu_result      => alu_result_intern,
      zero            => zero
    );

  inst_data_memwriteback : data_memwriteback
    PORT MAP (
      clk        => clk,
      rst_n      => rst_n,
      jump_addr  => jump_addr_intern,
      alu_result => alu_result_intern,
      PCSource   => PCSource,
      pc_in      => pc_in_intern,
      alu_out    => alu_out_intern
    );

  reg_B     <= reg_B_intern;
  instr_4_0 <= instr_4_0_intern;

END behave;

6. VHDL code for data decode


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY data_decode IS
  PORT (
    -- inputs
    clk         : IN STD_ULOGIC;
    rst_n       : IN STD_ULOGIC;
    instr_10_8  : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
    instr_7_5   : IN STD_ULOGIC_VECTOR(2 DOWNTO 0);
    instr_4_0   : IN STD_ULOGIC_VECTOR(4 DOWNTO 0);
    reg_memdata : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    alu_out     : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    -- control signals
    RegDst      : IN STD_ULOGIC;
    RegWrite    : IN STD_ULOGIC;
    MemtoReg    : IN STD_ULOGIC;
    -- outputs
    reg_A           : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    reg_B           : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    instr_4_0_se    : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    instr_4_0_se_sl : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
  );
END data_decode;

ARCHITECTURE behave OF data_decode IS

  COMPONENT regfile IS
    PORT (
      clk, rst_n : IN  std_ulogic;
      wen        : IN  std_ulogic;                                    -- write control
      writeport  : IN  std_ulogic_vector(width-1 DOWNTO 0);           -- register input
      adrwport   : IN  std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address write
      adrport0   : IN  std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address port 0
      adrport1   : IN  std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address port 1
      readport0  : OUT std_ulogic_vector(width-1 DOWNTO 0);           -- output port 0
      readport1  : OUT std_ulogic_vector(width-1 DOWNTO 0)            -- output port 1
    );
  END COMPONENT;

  COMPONENT tempreg IS
    PORT (
      clk     : IN  STD_ULOGIC;
      rst_n   : IN  STD_ULOGIC;
      reg_in  : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
    );
  END COMPONENT;

  -- internal signals
  SIGNAL write_reg  : STD_ULOGIC_VECTOR(regfile_adrsize-1 DOWNTO 0);
  SIGNAL write_data : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  SIGNAL data_1     : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  SIGNAL data_2     : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

BEGIN

  A : tempreg
    PORT MAP (
      clk     => clk,
      rst_n   => rst_n,
      reg_in  => data_1,
      reg_out => reg_A);

  B : tempreg
    PORT MAP (
      clk     => clk,
      rst_n   => rst_n,
      reg_in  => data_2,
      reg_out => reg_B);

  inst_regfile : regfile
    PORT MAP (
      clk       => clk,
      rst_n     => rst_n,
      wen       => RegWrite,
      writeport => write_data,
      adrwport  => write_reg,
      adrport0  => instr_10_8,
      adrport1  => instr_7_5,
      readport0 => data_1,
      readport1 => data_2);

  -- multiplexer for write register
  write_reg <= instr_7_5              WHEN RegDst = '0' ELSE
               instr_4_0(4 DOWNTO 2)  WHEN RegDst = '1' ELSE
               (OTHERS => 'X');

  -- multiplexer for write data
  write_data <= alu_out     WHEN MemtoReg = '0' ELSE
                reg_memdata WHEN MemtoReg = '1' ELSE
                (OTHERS => 'X');

  -- sign extension and shift
  proc_sign_ext : PROCESS(instr_4_0)
    -- variables needed for reading result of sign extension
    VARIABLE temp_instr_4_0_se    : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    VARIABLE temp_instr_4_0_se_sl : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  BEGIN
    -- sign extend instr_4_0 to width bits (16 in procmem_definitions)
    temp_instr_4_0_se := STD_ULOGIC_VECTOR(RESIZE(SIGNED(instr_4_0), instr_4_0_se'LENGTH));
    -- shift left 2
    temp_instr_4_0_se_sl := temp_instr_4_0_se(width-3 DOWNTO 0) & "00";
    instr_4_0_se    <= temp_instr_4_0_se;
    instr_4_0_se_sl <= temp_instr_4_0_se_sl;
  END PROCESS;

END behave;
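The sign extension and shift performed in proc_sign_ext can be checked in isolation. The following is a minimal simulation sketch, not part of the original project: the entity name sign_ext_sketch and the local constant width (set to 16 to mirror procmem_definitions) are assumptions made so the block compiles on its own.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;

-- hypothetical stand-alone check, not one of the project's modules
ENTITY sign_ext_sketch IS
END sign_ext_sketch;

ARCHITECTURE sim OF sign_ext_sketch IS
  -- assumed to match width in procmem_definitions
  CONSTANT width : NATURAL := 16;
BEGIN
  check : PROCESS
    VARIABLE imm : STD_ULOGIC_VECTOR(4 DOWNTO 0);
    VARIABLE se  : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    VARIABLE sl  : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  BEGIN
    -- negative immediate: "11110" is -2 in 5-bit two's complement
    imm := "11110";
    se  := STD_ULOGIC_VECTOR(RESIZE(SIGNED(imm), width));
    sl  := se(width-3 DOWNTO 0) & "00";
    ASSERT se = "1111111111111110"
      REPORT "sign extension of -2 failed" SEVERITY FAILURE;
    ASSERT sl = "1111111111111000"
      REPORT "shift left 2 of -2 failed" SEVERITY FAILURE;

    -- positive immediate: "00101" is +5
    imm := "00101";
    se  := STD_ULOGIC_VECTOR(RESIZE(SIGNED(imm), width));
    sl  := se(width-3 DOWNTO 0) & "00";
    ASSERT se = "0000000000000101"
      REPORT "sign extension of +5 failed" SEVERITY FAILURE;
    ASSERT sl = "0000000000010100"
      REPORT "shift left 2 of +5 failed" SEVERITY FAILURE;

    REPORT "sign extension sketch finished";
    WAIT;
  END PROCESS;
END sim;
```

The shifted value is the immediate multiplied by 4 (-2 becomes -8, +5 becomes 20), which is the form selected by ALUSrcB = "11" in data_execution.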

7. VHDL code for data execution


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY data_execution IS
  PORT (
    instr_10_8      : IN  std_ulogic_vector(2 downto 0);
    instr_7_5       : IN  std_ulogic_vector(2 downto 0);
    instr_4_0       : IN  std_ulogic_vector(4 downto 0);
    ALUSrcA         : IN  std_ulogic;
    ALUSrcB         : IN  std_ulogic_vector(1 downto 0);
    ALUopcode       : IN  std_ulogic_vector(2 downto 0);
    reg_A, reg_B    : IN  std_ulogic_vector(width-1 downto 0);
    pc_out          : IN  std_ulogic_vector(width-1 downto 0);
    instr_4_0_se    : IN  std_ulogic_vector(width-1 downto 0);
    instr_4_0_se_sl : IN  std_ulogic_vector(width-1 downto 0);
    jump_addr       : OUT std_ulogic_vector(width-1 downto 0);
    alu_result      : OUT std_ulogic_vector(width-1 downto 0);
    zero            : OUT std_ulogic
  );
END data_execution;

ARCHITECTURE behave OF data_execution IS

  COMPONENT alu
    PORT (
      a, b   : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      opcode : IN  STD_ULOGIC_VECTOR(2 DOWNTO 0);
      result : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      zero   : OUT STD_ULOGIC
    );
  END COMPONENT;

  SIGNAL mux_A_out : std_ulogic_vector(width-1 downto 0);
  SIGNAL mux_B_out : std_ulogic_vector(width-1 downto 0);

BEGIN

  alu_inst : alu
    PORT MAP (
      a      => mux_A_out,
      b      => mux_B_out,
      opcode => ALUopcode,
      result => alu_result,
      zero   => zero);

  -- Multiplexor for ALU input A:
  mux_A : PROCESS (ALUSrcA, PC_out, reg_A)
  BEGIN
    CASE ALUSrcA IS
      WHEN '0'    => mux_A_out <= PC_out;
      WHEN '1'    => mux_A_out <= reg_A;
      WHEN OTHERS => mux_A_out <= (OTHERS => 'X');
    END CASE;
  END PROCESS;

  -- Multiplexor for ALU input B:
  mux_B : PROCESS (ALUSrcB, reg_B, instr_4_0_se, instr_4_0_se_sl)
  BEGIN
    CASE ALUSrcB IS
      WHEN "00"   => mux_B_out <= reg_B;
      WHEN "01"   => mux_B_out <= STD_ULOGIC_VECTOR(TO_UNSIGNED(4, width)); -- constant 4
      WHEN "10"   => mux_B_out <= instr_4_0_se;
      WHEN "11"   => mux_B_out <= instr_4_0_se_sl;
      WHEN OTHERS => mux_B_out <= (OTHERS => 'X');
    END CASE;
  END PROCESS;

  -- Computation of Jump Address:
  -- upper 3 PC bits, then the 11 instruction bits 10..0, then "00"
  jump_addr <= PC_out(width-1 downto width-3) & instr_10_8 & instr_7_5 & instr_4_0 & "00";

END behave;
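The jump address concatenation in data_execution can be traced with concrete values. The sketch below is illustrative only: the entity name jump_addr_sketch, the local constant width and the chosen bit patterns are assumptions, while the concatenation itself is copied from the listing.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- hypothetical stand-alone check, not one of the project's modules
ENTITY jump_addr_sketch IS
END jump_addr_sketch;

ARCHITECTURE sim OF jump_addr_sketch IS
  -- assumed to match width in procmem_definitions
  CONSTANT width : NATURAL := 16;
BEGIN
  check : PROCESS
    VARIABLE pc_out     : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    VARIABLE instr_10_8 : STD_ULOGIC_VECTOR(2 DOWNTO 0);
    VARIABLE instr_7_5  : STD_ULOGIC_VECTOR(2 DOWNTO 0);
    VARIABLE instr_4_0  : STD_ULOGIC_VECTOR(4 DOWNTO 0);
    VARIABLE jump_addr  : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  BEGIN
    -- example values; only the top three PC bits ("101") are used
    pc_out     := "1010000000000100";
    instr_10_8 := "001";
    instr_7_5  := "010";
    instr_4_0  := "00011";

    -- 3 PC bits + 3 + 3 + 5 instruction bits + "00" = 16 bits
    jump_addr := pc_out(width-1 DOWNTO width-3)
                 & instr_10_8 & instr_7_5 & instr_4_0 & "00";

    ASSERT jump_addr = "1010010100001100"
      REPORT "unexpected jump address" SEVERITY FAILURE;

    REPORT "jump address sketch finished";
    WAIT;
  END PROCESS;
END sim;
```

Because the two lowest bits are forced to "00" and the top three bits come from the PC, a jump can only reach word-aligned targets inside the current eighth of the 16-bit address space.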

8. VHDL code for data fetch


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY data_fetch IS
  PORT (
    -- inputs
    clk      : IN STD_ULOGIC;
    rst_n    : IN STD_ULOGIC;
    pc_in    : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    alu_out  : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    mem_data : IN std_ulogic_vector(width-1 DOWNTO 0);
    -- control signals
    PC_en    : IN STD_ULOGIC;
    IorD     : IN STD_ULOGIC;
    IRWrite  : IN STD_ULOGIC;
    -- outputs
    reg_memdata : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    instr_15_11 : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);
    instr_10_8  : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
    instr_7_5   : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
    instr_4_0   : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);
    mem_address : OUT std_ulogic_vector(width-1 DOWNTO 0);
    pc_out      : OUT std_ulogic_vector(width-1 DOWNTO 0)
  );
END data_fetch;

ARCHITECTURE behave OF data_fetch IS

  COMPONENT instreg IS
    PORT (
      clk         : IN  STD_ULOGIC;
      rst_n       : IN  STD_ULOGIC;
      memdata     : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      IRWrite     : IN  STD_ULOGIC;
      instr_15_11 : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);
      instr_10_8  : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
      instr_7_5   : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
      instr_4_0   : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0)
    );
  END COMPONENT;

  COMPONENT tempreg IS
    PORT (
      clk     : IN  STD_ULOGIC;
      rst_n   : IN  STD_ULOGIC;
      reg_in  : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
    );
  END COMPONENT;

  COMPONENT pc IS
    PORT (
      clk    : IN  STD_ULOGIC;
      rst_n  : IN  STD_ULOGIC;
      pc_in  : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      PC_en  : IN  STD_ULOGIC;
      pc_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
    );
  END COMPONENT;

  -- signals for components
  SIGNAL pc_out_intern : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

BEGIN

  -- instances of components
  proc_cnt : pc
    PORT MAP (
      clk    => clk,
      rst_n  => rst_n,
      pc_in  => pc_in,
      PC_en  => PC_en,
      pc_out => pc_out_intern);

  instr_reg : instreg
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      memdata     => mem_data,
      IRWrite     => IRWrite,
      instr_15_11 => instr_15_11,
      instr_10_8  => instr_10_8,
      instr_7_5   => instr_7_5,
      instr_4_0   => instr_4_0);

  mem_data_reg : tempreg
    PORT MAP (
      clk     => clk,
      rst_n   => rst_n,
      reg_in  => mem_data,
      reg_out => reg_memdata);

  -- multiplexer
  addr_mux : PROCESS(IorD, pc_out_intern, alu_out)
    VARIABLE mem_address_temp : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  BEGIN
    IF IorD = '0' THEN
      mem_address_temp := pc_out_intern;
    ELSIF IorD = '1' THEN
      mem_address_temp := alu_out;
    ELSE
      mem_address_temp := (OTHERS => 'X');
    END IF;
    mem_address <= mem_address_temp;
  END PROCESS;

  pc_out <= pc_out_intern;

END behave;

9. VHDL code for data memwriteback


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY data_memwriteback IS
  PORT (
    clk, rst_n : IN  std_ulogic;
    jump_addr  : IN  std_ulogic_vector(width-1 downto 0);
    alu_result : IN  std_ulogic_vector(width-1 downto 0);
    PCSource   : IN  std_ulogic_vector(1 downto 0);
    pc_in      : OUT std_ulogic_vector(width-1 downto 0);
    alu_out    : OUT std_ulogic_vector(width-1 downto 0)
  );
END data_memwriteback;

ARCHITECTURE behave OF data_memwriteback IS

  COMPONENT tempreg
    PORT (
      clk     : IN  STD_ULOGIC;
      rst_n   : IN  STD_ULOGIC;
      reg_in  : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
    );
  END COMPONENT;

  SIGNAL alu_out_internal : std_ulogic_vector(width-1 downto 0);

BEGIN

  -- register holding the ALU result of the previous cycle
  tempreg_inst : tempreg
    PORT MAP (
      clk     => clk,
      rst_n   => rst_n,
      reg_in  => alu_result,
      reg_out => alu_out_internal);

  -- Multiplexor for the next PC value (selected by PCSource):
  mux : PROCESS (PCSource, ALU_result, ALU_out_internal, jump_addr)
  BEGIN
    CASE PCSource IS
      WHEN "00"   => pc_in <= alu_result;
      WHEN "01"   => pc_in <= alu_out_internal;
      WHEN "10"   => pc_in <= jump_addr;
      WHEN OTHERS => pc_in <= (OTHERS => 'X');
    END CASE;
  END PROCESS;

  alu_out <= alu_out_internal;

END behave;

10. VHDL code for instreg


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY instreg IS
  PORT (
    clk         : IN  STD_ULOGIC;
    rst_n       : IN  STD_ULOGIC;
    memdata     : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    IRWrite     : IN  STD_ULOGIC;
    instr_15_11 : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0);
    instr_10_8  : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
    instr_7_5   : OUT STD_ULOGIC_VECTOR(2 DOWNTO 0);
    instr_4_0   : OUT STD_ULOGIC_VECTOR(4 DOWNTO 0)
  );
END instreg;

ARCHITECTURE behave OF instreg IS
BEGIN

  proc_instreg : PROCESS(clk, rst_n)
  BEGIN
    IF rst_n = '0' THEN
      instr_15_11 <= (OTHERS => '0');
      instr_10_8  <= (OTHERS => '0');
      instr_7_5   <= (OTHERS => '0');
      instr_4_0   <= (OTHERS => '0');
    ELSIF RISING_EDGE(clk) THEN
      -- write the output of the memory into the instruction register
      IF (IRWrite = '1') THEN
        instr_15_11 <= memdata(15 DOWNTO 11);
        instr_10_8  <= memdata(10 DOWNTO 8);
        instr_7_5   <= memdata(7 DOWNTO 5);
        instr_4_0   <= memdata(4 DOWNTO 0);
      END IF;
    END IF;
  END PROCESS;

END behave;
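To make the field boundaries used by instreg concrete, the following sketch slices one example 16-bit word the same way. It is an illustration under assumptions: the entity name instr_fields_sketch and the example bit pattern are made up, and the last check only mirrors the instr_4_0(4 DOWNTO 2) selection that data_decode uses when RegDst = '1'.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- hypothetical stand-alone check, not one of the project's modules
ENTITY instr_fields_sketch IS
END instr_fields_sketch;

ARCHITECTURE sim OF instr_fields_sketch IS
BEGIN
  check : PROCESS
    VARIABLE memdata : STD_ULOGIC_VECTOR(15 DOWNTO 0);
  BEGIN
    -- example word, split as 00010 | 011 | 101 | 01100
    memdata := "0001001110101100";

    ASSERT memdata(15 DOWNTO 11) = "00010"
      REPORT "instr_15_11 mismatch" SEVERITY FAILURE;
    ASSERT memdata(10 DOWNTO 8) = "011"
      REPORT "instr_10_8 mismatch" SEVERITY FAILURE;
    ASSERT memdata(7 DOWNTO 5) = "101"
      REPORT "instr_7_5 mismatch" SEVERITY FAILURE;
    ASSERT memdata(4 DOWNTO 0) = "01100"
      REPORT "instr_4_0 mismatch" SEVERITY FAILURE;

    -- upper three bits of instr_4_0, used as write register when RegDst = '1'
    ASSERT memdata(4 DOWNTO 2) = "011"
      REPORT "instr_4_0(4 DOWNTO 2) mismatch" SEVERITY FAILURE;

    REPORT "instruction field sketch finished";
    WAIT;
  END PROCESS;
END sim;
```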

11. VHDL code for memory


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY memory IS
  PORT (
    clk         : IN  STD_ULOGIC;
    rst_n       : IN  STD_ULOGIC;
    MemRead     : IN  STD_ULOGIC;
    MemWrite    : IN  STD_ULOGIC;
    mem_address : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    data_in     : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    data_out    : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
  );
END memory;

ARCHITECTURE behave OF memory IS

  COMPONENT ram IS
    PORT (
      address : IN  std_logic_vector(7 DOWNTO 0);
      data    : IN  std_logic_vector(7 DOWNTO 0);
      inclock : IN  std_logic; -- used to write data in RAM cells
      wren_p  : IN  std_logic;
      q       : OUT std_logic_vector(7 DOWNTO 0));
  END COMPONENT;

  -- internal signals
  SIGNAL wren_p     : STD_LOGIC;
  SIGNAL data_in_0  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_in_1  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_in_2  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_in_3  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_out_0 : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_out_1 : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_out_2 : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL data_out_3 : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL address_0  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL address_1  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL address_2  : STD_LOGIC_VECTOR(7 DOWNTO 0);
  SIGNAL address_3  : STD_LOGIC_VECTOR(7 DOWNTO 0);

BEGIN

  -- instances of 4 ram blocks
  mem_block0 : ram
    PORT MAP (address => address_0, data => data_in_0, inclock => clk,
              wren_p => wren_p, q => data_out_0);

  mem_block1 : ram
    PORT MAP (address => address_1, data => data_in_1, inclock => clk,
              wren_p => wren_p, q => data_out_1);

  mem_block2 : ram
    PORT MAP (address => address_2, data => data_in_2, inclock => clk,
              wren_p => wren_p, q => data_out_2);

  mem_block3 : ram
    PORT MAP (address => address_3, data => data_in_3, inclock => clk,
              wren_p => wren_p, q => data_out_3);

  -- create a write_enable for instances
  wren_p <= '1' WHEN MemWrite = '1' AND MemRead = '0' ELSE
            '0' WHEN MemWrite = '0' AND MemRead = '1' ELSE
            '0' WHEN MemWrite = '0' AND MemRead = '0' ELSE
            'X';

  -- assert address to ram blocks (pure logic)
  addr_assert : PROCESS(mem_address)
    VARIABLE temp_ram_address : STD_ULOGIC_VECTOR(ram_adrwidth-1 DOWNTO 0);
    VARIABLE ram_address      : STD_LOGIC_VECTOR(address_0'RANGE);
  BEGIN
    -- read/write only words: A1 A0 --> not used for address
    -- note: ram blocks can be addressed with multiple addresses
    temp_ram_address := mem_address(ram_adrwidth-1+2 DOWNTO 2);
    -- the ram address port is 8 bits wide while ram_adrwidth is 4,
    -- so the word address is zero-extended to the port width
    ram_address := STD_LOGIC_VECTOR(RESIZE(UNSIGNED(temp_ram_address), ram_address'LENGTH));
    address_0 <= ram_address;
    address_1 <= ram_address;
    address_2 <= ram_address;
    address_3 <= ram_address;
  END PROCESS;

  -- assert data_in to ram blocks (pure logic)
  -- separate slices of ram_datwidth bits out of data_in and zero-extend
  -- each one to the 8-bit data port of its ram block
  data_in_3 <= STD_LOGIC_VECTOR(RESIZE(UNSIGNED(data_in(4*ram_datwidth-1 DOWNTO 3*ram_datwidth)), data_in_3'LENGTH));
  data_in_2 <= STD_LOGIC_VECTOR(RESIZE(UNSIGNED(data_in(3*ram_datwidth-1 DOWNTO 2*ram_datwidth)), data_in_2'LENGTH));
  data_in_1 <= STD_LOGIC_VECTOR(RESIZE(UNSIGNED(data_in(2*ram_datwidth-1 DOWNTO ram_datwidth)), data_in_1'LENGTH));
  data_in_0 <= STD_LOGIC_VECTOR(RESIZE(UNSIGNED(data_in(ram_datwidth-1 DOWNTO 0)), data_in_0'LENGTH));

  -- assert output of memory blocks to data_out (pure logic)
  -- only the low ram_datwidth bits of each block carry data
  data_out <= TO_STDULOGICVECTOR(data_out_3(ram_datwidth-1 DOWNTO 0) &
                                 data_out_2(ram_datwidth-1 DOWNTO 0) &
                                 data_out_1(ram_datwidth-1 DOWNTO 0) &
                                 data_out_0(ram_datwidth-1 DOWNTO 0));

END behave;
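The memory module drops address bits A1 and A0 and splits each word into four slices of ram_datwidth bits, one per RAM block. The sketch below traces both steps for one example value. It is a stand-alone illustration: the entity name mem_lanes_sketch, the lane variable names and the example values are assumptions, and the three constants are local copies of the values given in procmem_definitions.

```vhdl
LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;

-- hypothetical stand-alone check, not one of the project's modules
ENTITY mem_lanes_sketch IS
END mem_lanes_sketch;

ARCHITECTURE sim OF mem_lanes_sketch IS
  -- local copies of the values in procmem_definitions
  CONSTANT width        : NATURAL  := 16;
  CONSTANT ram_adrwidth : POSITIVE := 4;
  CONSTANT ram_datwidth : POSITIVE := 4;
BEGIN
  check : PROCESS
    VARIABLE mem_address : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    VARIABLE word        : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    VARIABLE lane3, lane2, lane1, lane0
                         : STD_ULOGIC_VECTOR(ram_datwidth-1 DOWNTO 0);
  BEGIN
    -- address 44 = 101100b; dropping A1 A0 leaves word index 1011b = 11
    mem_address := "0000000000101100";
    ASSERT mem_address(ram_adrwidth-1+2 DOWNTO 2) = "1011"
      REPORT "unexpected word index" SEVERITY FAILURE;

    -- split one word into four slices, most significant slice first
    word  := "1010010111000011";
    lane3 := word(4*ram_datwidth-1 DOWNTO 3*ram_datwidth);
    lane2 := word(3*ram_datwidth-1 DOWNTO 2*ram_datwidth);
    lane1 := word(2*ram_datwidth-1 DOWNTO ram_datwidth);
    lane0 := word(ram_datwidth-1 DOWNTO 0);
    ASSERT lane3 = "1010" REPORT "lane3 mismatch" SEVERITY FAILURE;
    ASSERT lane2 = "0101" REPORT "lane2 mismatch" SEVERITY FAILURE;
    ASSERT lane1 = "1100" REPORT "lane1 mismatch" SEVERITY FAILURE;
    ASSERT lane0 = "0011" REPORT "lane0 mismatch" SEVERITY FAILURE;

    -- concatenating the slices in the same order restores the word
    ASSERT (lane3 & lane2 & lane1 & lane0) = word
      REPORT "recombined word mismatch" SEVERITY FAILURE;

    REPORT "memory lane sketch finished";
    WAIT;
  END PROCESS;
END sim;
```

With these constants only address bits 5 down to 2 select a location, so addresses that differ only in higher bits map to the same word, as the note in the listing about multiple addresses indicates.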

12. VHDL code for neo


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY NEO IS
  PORT (
    clk, rst_n         : IN  std_ulogic;
    mem_data           : IN  std_ulogic_vector(width-1 downto 0);
    reg_B, mem_address : OUT std_ulogic_vector(width-1 downto 0);
    MemRead, MemWrite  : OUT std_ulogic
  );
END NEO;

ARCHITECTURE behave OF NEO IS

  COMPONENT control
    PORT (
      clk, rst_n  : IN  std_ulogic;
      instr_15_11 : IN  std_ulogic_vector(4 downto 0);
      instr_4_0   : IN  std_ulogic_vector(4 downto 0);
      zero        : IN  std_ulogic;
      ALUopcode   : OUT std_ulogic_vector(2 downto 0);
      RegDst, RegWrite, ALUSrcA, MemRead, MemWrite,
      MemtoReg, IorD, IRWrite : OUT std_ulogic;
      ALUSrcB, PCSource : OUT std_ulogic_vector(1 downto 0);
      PC_en       : OUT std_ulogic);
  END COMPONENT;

  COMPONENT data
    PORT (
      clk, rst_n : IN std_ulogic;
      PC_en, IorD, MemtoReg, IRWrite, ALUSrcA, RegWrite, RegDst : IN std_ulogic;
      PCSource, ALUSrcB  : IN  std_ulogic_vector(1 downto 0);
      ALUopcode          : IN  std_ulogic_vector(2 downto 0);
      mem_data           : IN  std_ulogic_vector(width-1 downto 0);
      reg_B, mem_address : OUT std_ulogic_vector(width-1 downto 0);
      instr_15_11        : OUT std_ulogic_vector(4 downto 0);
      instr_4_0          : OUT std_ulogic_vector(4 downto 0);
      zero               : OUT std_ulogic);
  END COMPONENT;

  -- internal signals for connection of components
  SIGNAL instr_15_11_intern : std_ulogic_vector(4 downto 0);
  SIGNAL instr_4_0_intern   : std_ulogic_vector(4 downto 0);
  SIGNAL zero_intern        : std_ulogic;
  SIGNAL ALUopcode_intern   : std_ulogic_vector(2 downto 0);
  SIGNAL RegDst_intern      : std_ulogic;
  SIGNAL RegWrite_intern    : std_ulogic;
  SIGNAL ALUSrcA_intern     : std_ulogic;
  SIGNAL MemtoReg_intern    : std_ulogic;
  SIGNAL IorD_intern        : std_ulogic;
  SIGNAL IRWrite_intern     : std_ulogic;
  SIGNAL ALUSrcB_intern     : std_ulogic_vector(1 downto 0);
  SIGNAL PCSource_intern    : std_ulogic_vector(1 downto 0);
  SIGNAL PC_en_intern       : std_ulogic;

BEGIN

  inst_control : control
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      instr_15_11 => instr_15_11_intern,
      instr_4_0   => instr_4_0_intern,
      zero        => zero_intern,
      ALUopcode   => ALUopcode_intern,
      RegDst      => RegDst_intern,
      RegWrite    => RegWrite_intern,
      ALUSrcA     => ALUSrcA_intern,
      MemRead     => MemRead,
      MemWrite    => MemWrite,
      MemtoReg    => MemtoReg_intern,
      IorD        => IorD_intern,
      IRWrite     => IRWrite_intern,
      ALUSrcB     => ALUSrcB_intern,
      PCSource    => PCSource_intern,
      PC_en       => PC_en_intern);

  inst_data : data
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      PC_en       => PC_en_intern,
      IorD        => IorD_intern,
      MemtoReg    => MemtoReg_intern,
      IRWrite     => IRWrite_intern,
      ALUSrcA     => ALUSrcA_intern,
      RegWrite    => RegWrite_intern,
      RegDst      => RegDst_intern,
      PCSource    => PCSource_intern,
      ALUSrcB     => ALUSrcB_intern,
      ALUopcode   => ALUopcode_intern,
      mem_data    => mem_data,
      reg_B       => reg_B,
      mem_address => mem_address,
      instr_15_11 => instr_15_11_intern,
      instr_4_0   => instr_4_0_intern,
      zero        => zero_intern);

END behave;

13. VHDL code for pc


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY pc IS
  PORT (
    clk    : IN  STD_ULOGIC;
    rst_n  : IN  STD_ULOGIC;
    pc_in  : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    PC_en  : IN  STD_ULOGIC;
    pc_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
  );
END pc;

ARCHITECTURE behave OF pc IS
BEGIN

  proc_pc : PROCESS(clk, rst_n)
    VARIABLE pc_temp : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  BEGIN
    IF rst_n = '0' THEN
      pc_temp := (OTHERS => '0');
    ELSIF RISING_EDGE(clk) THEN
      IF PC_en = '1' THEN
        pc_temp := pc_in;
      END IF;
    END IF;
    pc_out <= pc_temp;
  END PROCESS;

END behave;

14. VHDL code for processor


LIBRARY IEEE;
USE IEEE.STD_LOGIC_1164.ALL;
USE IEEE.NUMERIC_STD.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY processor IS
  PORT (
    clk, rst_n : IN std_ulogic;
    run        : in std_ulogic;
    --mem_data : IN std_ulogic_vector;
    data_in1, mem_address1 : in std_ulogic_vector(width-1 DOWNTO 0);
    MemRead1, MemWrite1    : in std_ulogic;
    --data_in : IN STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    data_out1  : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
  );
END processor;

ARCHITECTURE behave OF processor IS

  COMPONENT NEO
    PORT (
      clk, rst_n         : IN  std_ulogic;
      mem_data           : IN  std_ulogic_vector(width-1 downto 0);
      reg_B, mem_address : OUT std_ulogic_vector(width-1 downto 0);
      MemRead, MemWrite  : out std_ulogic
    );
  END COMPONENT;

  COMPONENT memory
    PORT (
      clk         : IN  STD_ULOGIC;
      rst_n       : IN  STD_ULOGIC;
      MemRead     : IN  STD_ULOGIC;
      MemWrite    : IN  STD_ULOGIC;
      mem_address : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      data_in     : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
      data_out    : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0));
  END COMPONENT;

  SIGNAL mem_data    : std_ulogic_vector(width-1 downto 0);
  signal reg_B       : std_ulogic_vector(width-1 downto 0);
  signal mem_address : std_ulogic_vector(width-1 downto 0);
  signal MemRead     : std_ulogic;
  signal MemWrite    : std_ulogic;
  signal WrEnable    : std_ulogic;
  signal RdEnable    : std_ulogic;
  signal addr        : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  signal data_write  : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
  signal data_read   : STD_ULOGIC_VECTOR(width-1 DOWNTO 0);

BEGIN

  -- run = '1': the NEO core drives the memory
  -- run = '0': the external ports drive the memory (e.g. to load a program)
  process (run, rst_n, MemRead, MemWrite, mem_address, reg_B, mem_data,
           MemRead1, MemWrite1, mem_address1, data_in1)
  begin
    if (run = '1') then
      WrEnable   <= MemWrite;
      RdEnable   <= MemRead;
      addr       <= mem_address;
      data_write <= reg_B;
      --data_read<=mem_data;
    elsif (run = '0') then
      WrEnable   <= MemWrite1;
      RdEnable   <= MemRead1;
      addr       <= mem_address1;
      data_write <= data_in1;
      --data_read<=mem_data1;
    end if;
  end process;

  inst_NEO : NEO
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      mem_data    => mem_data,
      reg_B       => reg_B,
      mem_address => mem_address,
      MemRead     => MemRead,
      MemWrite    => MemWrite);

  inst_memory : memory
    PORT MAP (
      clk         => clk,
      rst_n       => rst_n,
      MemRead     => RdEnable,
      MemWrite    => WrEnable,
      mem_address => addr,
      data_in     => data_write,
      data_out    => data_read);

  data_out1 <= data_read;
  mem_data  <= data_read;

END behave;

15. VHDL code for procmem_definitions


PACKAGE ProcMem_definitions IS
  -- globals
  CONSTANT width : NATURAL := 16;
  -- definitions for regfile
  CONSTANT regfile_depth   : positive := 8; -- register file depth = 2**adrsize
  CONSTANT regfile_adrsize : positive := 3; -- address vector size = log2(depth)
  -- definitions for memory
  CONSTANT ram_adrwidth : positive := 4; -- m x n - RAM Block
  CONSTANT ram_datwidth : positive := 4;
  -- initial RAM content in IntelHEX Format
  CONSTANT ramfile_std    : string := "./simulation/ram_256x8.hex";
  CONSTANT ramfile_block0 : string := "./simulation/ram0_256x8.hex";
  CONSTANT ramfile_block1 : string := "./simulation/ram1_256x8.hex";
  CONSTANT ramfile_block2 : string := "./simulation/ram2_256x8.hex";
  CONSTANT ramfile_block3 : string := "./simulation/ram3_256x8.hex";
END ProcMem_definitions;

16. VHDL code for ram


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use altera_mf library for RAM block
--LIBRARY altera_mf;
--USE altera_mf.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY ram IS
  PORT (
    address : IN  std_logic_vector(7 DOWNTO 0);
    data    : IN  std_logic_vector(7 DOWNTO 0);
    inclock : IN  std_logic; -- used to write data in RAM cells
    wren_p  : IN  std_logic;
    q       : OUT std_logic_vector(7 DOWNTO 0));
END ram;

ARCHITECTURE rtl OF ram IS
  TYPE MEM IS ARRAY(0 TO 255) OF std_logic_vector(7 DOWNTO 0);
  SIGNAL ram_block        : MEM;
  SIGNAL read_address_reg : std_logic_vector(7 DOWNTO 0);
BEGIN

  PROCESS (inclock)
  BEGIN
    IF rising_edge(inclock) THEN
      IF (wren_p = '1') THEN
        ram_block(to_integer(unsigned(address))) <= data;
      END IF;
      -- address is registered at rising edge
      -- not used, because asynchronous data output is needed for NEO design
      --read_address_reg <= address;
    END IF;
  END PROCESS;

  -- registered address is used for synchronous data output
  --q <= ram_block(to_integer(unsigned(read_address_reg)));

  -- asynchronous memory output (needed for NEO design according to [PaHe98])
  -- address is unregistered
  q <= ram_block(to_integer(unsigned(address)));

END rtl;

17. VHDL code for regfile


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY regfile IS
  PORT (
    clk, rst_n : IN  std_ulogic;
    wen        : IN  std_ulogic;                                    -- write control
    writeport  : IN  std_ulogic_vector(width-1 DOWNTO 0);           -- register input
    adrwport   : IN  std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address write
    adrport0   : IN  std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address port 0
    adrport1   : IN  std_ulogic_vector(regfile_adrsize-1 DOWNTO 0); -- address port 1
    readport0  : OUT std_ulogic_vector(width-1 DOWNTO 0);           -- output port 0
    readport1  : OUT std_ulogic_vector(width-1 DOWNTO 0)            -- output port 1
  );
END regfile;

ARCHITECTURE behave OF regfile IS
  SUBTYPE WordT IS std_ulogic_vector(width-1 DOWNTO 0);    -- reg word TYPE
  TYPE StorageT IS ARRAY(0 TO regfile_depth-1) OF WordT;   -- reg array TYPE
  SIGNAL registerfile : StorageT;                          -- reg file contents
BEGIN

  -- perform write operation
  PROCESS(rst_n, clk)
  BEGIN
    IF rst_n = '0' THEN
      FOR i IN 0 TO regfile_depth-1 LOOP
        registerfile(i) <= (OTHERS => '0');
      END LOOP;
    ELSIF rising_edge(clk) THEN
      IF wen = '1' THEN
        registerfile(to_integer(unsigned(adrwport))) <= writeport;
      END IF;
    END IF;
  END PROCESS;

  -- perform reading ports
  readport0 <= registerfile(to_integer(unsigned(adrport0)));
  readport1 <= registerfile(to_integer(unsigned(adrport1)));

END behave;

18. VHDL code for tempreg


LIBRARY IEEE;
USE IEEE.std_logic_1164.ALL;
USE IEEE.numeric_std.ALL;
-- use package
USE work.procmem_definitions.ALL;

ENTITY tempreg IS
  PORT (
    clk     : IN  STD_ULOGIC;
    rst_n   : IN  STD_ULOGIC;
    reg_in  : IN  STD_ULOGIC_VECTOR(width-1 DOWNTO 0);
    reg_out : OUT STD_ULOGIC_VECTOR(width-1 DOWNTO 0)
  );
END tempreg;

ARCHITECTURE behave OF tempreg IS
BEGIN

  temp_reg : PROCESS(clk, rst_n)
  BEGIN
    IF rst_n = '0' THEN
      reg_out <= (OTHERS => '0');
    ELSIF RISING_EDGE(clk) THEN
      -- write register input to output at rising edge
      reg_out <= reg_in;
    END IF;
  END PROCESS;

END behave;

7. References
David A. Patterson, John L. Hennessy: Computer Organization and Design - The Hardware/Software Interface, Third Edition
William Stallings: Computer Organization and Architecture: Designing for Performance, 8th Edition
M. Morris Mano: Computer System Architecture, 3rd Edition
Dr. Sanjeev Manhas: VHDL Slides
http://vhdlguru.blogspot.in/
