
 

 
 
 
 
 
Speed Lake 
CSSE 232 Processor Project 

 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
Team 2A 
Thaddeus Hughes, Evë Maquelin, Matthew Howlett, Ian Sheffert, David Li 
   
 
 

Table of Contents

Changelog

Executive Summary

Our Processor
Design
Instructions and Control
Multi-Cycle Design
Memory and the Stack
Procedure Calls
Implementation
Testing
Compiler and Assembler
Results

Conclusion

A: Software Specifications

Available Registers

General Purpose Registers
Restricted Purpose Registers

Procedure Call Convention

Machine Language Instruction Types

Inherent (N) Type Instructions
Immediate (I) Type Instructions
Branch (B) Type Instructions

Instruction Semantics

Arithmetic and Logical Instructions
Branch/Jump Instructions
I/O Manipulation Instructions
Memory Manipulation Instructions
Stack Instructions

Example Program

Assembly Program
Assembled Machine Code

Common Operations

Adding Numbers in Memory
Loop Through an Array in Memory
Loading an Address into the IA Register
Conditional Statements
Reading from/Writing to a Display Register

B: Hardware Specifications

Register Transfer Language

Arithmetic
Memory Instructions
Jumps/Branches
LCD / Buttons
Stack Instructions

Component Specification

Datapath Schematic

Control Signals

Control Unit

Fetch Stage
Opcode Breakdown
Inherent Opcodes
Immediate Opcodes
Branch/Jump Opcodes
Read Stage
Write Stage

C: Testing and Integration

RTL Testing

Procedure
RTL Markup

Unit Test Plan

Unit Testing Procedure:
Tables for Unit Testing
Muxes
Registers
ALUs
Other Components
Memory Unit
Instruction Memory

Integration Plan

Step 1: Small Subsystems
Step 2: Registers, the ALU, and Program Memory
Step 3: The Big Kahuna

D: Design Process Journal

Milestone 1
Meeting Monday, January 8
First Meeting Wednesday, January 10
Second Meeting Wednesday, January 10

Milestone 2
Impromptu Meeting Thursday, January 11
Meeting Friday, January 12
Meeting Wednesday, January 17

Milestone 3
Meeting Sunday, January 21
Meeting Tuesday, January 23

Milestone 4
Meeting Thirstday, January 25
Meeting Sunday, January 28

Milestone 5
Meeting Monday During Class 2/5/2018
Meeting Monday Evening 2/5/2018
Meeting Tuesday During Class 2/6/2018
Meeting Wednesday During/After Class 2/7/2018
Meeting Monday During Class 2/12/2018
Meeting Tuesday During Class 2/13/2018
Meeting Wednesday During and After Class 2/14/2018
Meeting Thursday After Scheduled Meeting 2/15/2018
Meeting Wednesday 2/21/2018
   

 
 
 

Changelog 
 
Version  Date  Description

1.0  January 10, 2018  Initial version of the document created for Milestone 1

1.1  January 17, 2018  Milestone 2 Updates:
● Procedure calling pattern was altered to accommodate use of memory for local data storage
● The instructions cya and rya were converted to pseudo-instructions, and the instructions pushia, popia, pushra, and popra were added to accomplish this
● The jump and branch instruction types were consolidated into one
● More branch instructions were added to branch off of the sign of the value in the DA register
● Example program was updated to reflect changes in the instruction set

Milestone 2 Additions:
● RTL for each instruction
● List of necessary components
● Testing procedure and results of RTL

1.2  January 24, 2018  Milestone 3 Updates:
● Updated and condensed RTL tables
● No more IA ALU. Everything goes through one ALU.
● Removed MIPS from the document
● Document rearranged into logical sections
● Updated components

Milestone 3 Additions:
● Unit Test Plan
● Integration Plan
● Control signal descriptions
● Datapath schematic

1.3  January 30, 2018  Milestone 4 Updates:
● No more passing B through ALU. IA is also passed through ALU to get to DA, rather than being directly wired up. Instead, ALU port B is wired to input mux of DA.
● Addition of instructions sllm, srlm, subm
● Changing opcodes
● Inherent types now have a 4-bit shamt used for shifting

1.3  February 7, 2018  Milestone 5 Updates:
● RTL updated to match control design
● Instruction type diagrams were updated to reflect opcode structure
● Control signal table was updated for readability
● Editing documentation

 
   

 
 
 

Executive Summary 
We have designed a simple accumulator-style processor that runs on an FPGA board. The
processor provides basic support for the LCD and buttons on the board, and we created a
compiler and an assembler to make programming easier.
 
In this document we will discuss the instruction set, implementation, testing, and final 
performance results of the processor. 

Our Processor 
We chose to implement a multi-cycle accumulator-style processor with a stack. We designed
ours with only one working register, the DA register, and an indirect addressing (IA) register to 
store memory addresses. 
 
Though some of us had prior experience working with accumulators, there was still a lot left to 
learn about how they worked. Also, the addition of a stack seemed like an easy improvement 
that would be quite valuable to the programmer. Later, our choice of style turned out to be quite 
convenient as the lack of arguments on most instructions left room for a lot of opcodes and 
therefore a wide range of instructions. This led to the creation of an instruction set that not only 
computed relative primes, but could handle general computation with ease. 

Design 
Over the course of the project we were forced to make a lot of interesting but difficult design
decisions. Though most of our decisions were made with processor efficiency and elegance as 
the priority, such as the design of our control unit and the ALU design, some were made with 
ease of implementation as the priority, such as choosing to keep our cycle times and cycles per 
instruction constant. 

Instructions and Control 


Our instructions consist of basic arithmetic, logical operations, stack operations, branches, 
function calls, memory operations, and I/O operations designed for buttons, switches, and the
FPGA board LCD screen. These are broken down into three types: inherent, immediate, and 
branch instructions.  
 
Of the 44 total instructions, 27 are truly inherent, meaning they take no arguments. This is where
we focused the majority of our control unit optimization. Among these are the arithmetic
operations, such as addition and subtraction. Though we have immediate instructions that do
these operations with constant values, we wanted an easy way to interface with values in the
accumulator and values stored in memory. By putting the memory address of a value into the IA
register, instructions like add/addm can use both the value in the DA register and the value
in memory and store the result accordingly: add stores the result into DA, and addm stores the
result back into memory at IA. This pattern is repeated across many of our instructions.
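
The add/addm pattern can be sketched in a few lines of Python. This is only an illustrative model, not our Verilog: memory is modeled as a dictionary, the 16-bit wraparound is an assumption based on our 16-bit word size, and the model assumes addm leaves DA untouched.

```python
MASK_16 = 0xFFFF  # assumption: results wrap around at 16 bits


def add(da, ia, mem):
    """add: DA <- DA + mem[IA]. Memory is left unchanged."""
    return (da + mem[ia]) & MASK_16


def addm(da, ia, mem):
    """addm: mem[IA] <- DA + mem[IA]. Assumed to leave DA unchanged."""
    mem[ia] = (da + mem[ia]) & MASK_16
    return da


if __name__ == "__main__":
    memory = {5: 10}             # hypothetical variable stored at address 5
    print(add(3, 5, memory))     # result goes to DA
    print(addm(3, 5, memory))    # DA is returned as-is
    print(memory[5])             # result went to memory instead
```

The same register-or-memory destination split applies to the other paired instructions, such as and/andm and sub/subm.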
 
Due to the nature and size of our instruction set, we thought the conventional or expected 
method of designing our control unit, a finite state machine that takes whole instructions as 
inputs, would be too large and inefficient. Instead of creating cases for each instruction or group 
of instructions, we created cases for each control signal and instruction type. Since most 
control signals have default states for each instruction cycle, this brought our cases per cycle 
down from a potential 44 to an average of 9. Though not efficient to design or even to 
implement, this was far and away the best decision for hardware efficiency. 

Multi-Cycle Design 
Of all the parts of our processor, cycle design is where we took the most shortcuts and made 
the most compromises. Each instruction has three equal-length cycles: Fetch, Memory Read, 
and Memory Write. Though it was easiest to implement, we missed out on a major optimization 
opportunity. Not all instructions make use of the Memory Read and Memory Write stages, so we 
could have modified our control unit to skip those stages for certain instructions. Doing this,
however, would have required significant changes to both our control unit and our datapath,
which was infeasible given our time constraints. Ultimately, we opted to leave it as is in the
hopes of implementing some sort of pipeline, but that never occurred.

Memory and the Stack 


Since the omission of temporary registers makes operations on local variables a bit more 
challenging, we recognized the need for a simple and robust memory system. The programmer 
has two options for working with local values, the stack and program memory.  
 
The stack follows the traditional format of pushing values to the top of the stack, and popping 
them off as necessary. We did, however, omit the typical peek instruction. When we were paring 
down our instruction set we realized that peeking was not necessary for any of the integral 
operations of our processor, such as making function calls or working with local data. In the 
event that someone really needs to keep the value they’ve looked at on the stack, they can 
immediately push it back at the cost of a single instruction. 
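
As a minimal sketch of that workaround, the stack is modeled here as a plain Python list (an assumption for illustration only), and the hypothetical helper peek reads the top value with a pop followed by a push:

```python
def peek(stack):
    """Emulate a peek: pop the top value, then immediately push it back."""
    value = stack.pop()      # corresponds to the pop instruction
    stack.append(value)      # corresponds to the push instruction
    return value


if __name__ == "__main__":
    stack = [1, 2, 3]
    print(peek(stack))   # top of the stack
    print(stack)         # stack contents are unchanged afterwards
```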
 
Program memory provides a perhaps more intuitive way to store local variables. In combination 
with our assembler, the programmer can load addresses by name into the IA register and either 
manipulate them in the DA register, or use instructions that store results to memory. This is 
where our instruction design choices shine most, as this provides huge potential program 
optimization. Due to the way our cycles are designed, instructions that involve reading and
writing to memory don’t take any longer than instructions that only involve registers, so this cuts
down the expected process of loading a value, manipulating it in DA, and storing back to
memory to a single instruction.
 
As they are housed in the same memory unit, there is the potential for the stack and program 
memory to collide in nasty ways. The stack builds up from the bottom of memory, whereas 
program memory is indexed from the top. If a program recurses too deeply, it’s
possible they will begin to overlap. This could be mitigated by increasing the size of the memory 
unit, but was not an issue we ran into in any of our testing. 

Procedure Calls 
Though most of the details surrounding procedure calls can be found in Appendix A, the design 
of our calling conventions was the subject of heated debate for several days. As such, we felt it 
deserved special mention. 
 
Unlike many other processor designs, our processor does not have dedicated registers for 
procedure arguments and return values. Instead these values are stored on the stack and the 
responsibility of preserving data is left to the caller. As these conventions are not 
hardware-enforced, it is imperative that the programmer follow them carefully. 

Implementation 
We began datapath design by whiteboarding out the individual components we knew we would 
need and beginning to connect them by going through the list of instructions and their 
associated RTL to make sure they were supported. This was a somewhat iterative process as
we found ways to shrink muxes, reduce stages of logic, and make components serve
multiple purposes (e.g. we originally had an ALU dedicated to the IA register, but
determined we could use the primary ALU for the same purpose).
 

 
 
 

 
 
Overall, the design focuses on reducing the hardware footprint as much as possible in order to 
speed up cycles. 
 
Our Xilinx model is almost entirely written in Verilog. This is because we all preferred to read
code rather than schematics, and some of us had significant prior experience with Verilog while
only one team member had experience with VHDL. We implemented the processor in modules
that matched the initial integration tests: decoder, registers, ALU, (program) memory, control,
and LCD driving. We believe this made debugging and splitting up work easier in the long term,
although it did make finding problems and incorrect links difficult at times.

Testing 
We began with simple unit tests for all individual components (e.g. general purpose registers,
adders, muxes). For each of these, we chose the component to test, determined the expected
inputs, control signals, and outputs, built the testbench, ran it, and checked the results against
tables in the testbench to determine validity. Most individual components worked without any
major changes.
 
After unit tests, we integrated some of these into small segments of the processor. 
 
The PC Subsystem consists of the program counter register, incrementer, return address 
register, the necessary muxes which hook these together, and the input control signals. Testing 
for this system proved rather straightforward with no major changes necessary. 
 
The ALU subsystem consists of the ALU, muxes into it, and necessary control signals. Testing 
of this system proved very straightforward with no changes necessary.  
 
 
 

 
The Stack/Program Memory Subsystem consists of the SP register, IA register, memory, and 
necessary muxes. Testing of this proved very straightforward with no changes necessary. 
 
The Control testbench tests, as expected, the control unit. This was the second-most daunting
part of the whole processor and its testing. We had to make countless small fixes to get things
working as expected, and we also had to refer back to the testbench when we ran into more
fundamental issues with our processor (such as the need for an ALU output latch).
 
The Processor core consists of the necessary components to execute a given instruction 
(everything but PC subsystem and instruction memory). We found that we would need a register 
to serve as a latch on the output of the ALU. 
 
At this point, we were ready to test the entire processor, which gave us some headaches. We 
found that our branch control was not working. This was because we were expecting the wrong 
outputs in the PC test (we needed to branch to PC | imm, not PC+1 | imm). We were using the 
assembler at this point, and found that there were some bugs within it that caused unexpected 
program behavior. After this was resolved, we battled glitches with our assembly before finally 
making Euclid’s algorithm work. 

Compiler and Assembler 


When we originally looked at the relative prime algorithm we were given, our first thought was 
that it was going to be abysmal writing an assembly version. As such, we decided to build a 
compiler to help. The completed compiler is a Python script that takes in basic C code and 
supports the following constructs: 
 
- Method declarations with any number of arguments 
- Local variable declarations 
- Local variable assignments 
- Addition and subtraction on local variables 
- While loops 
- Control flow such as if/else if/else blocks 
- Method calls 
 
As the goal of this class was not to make a fully-functional compiler, there are a few guidelines 
to follow when writing code for the compiler: 
 
- Keep assembly in mind. Arithmetic is only supported where a local variable is
manipulated by a constant or a variable, e.g. a = a + 4, or a = a + b.
- All logic must be inside of a method body. You can specify an assembly header that sets
up arguments and calls the method inside of the build file.
- Conditions must be as simple as possible, e.g. if (i == 5) or if (a < b) as opposed to
if (i - 4 < foo(j)). Local variables can be used to store values ahead of the condition.
- Comments must be single line comments and on their own lines.
 
At a high level, the compiler works by first cleaning the program, parsing the program into an 
abstract syntax tree (AST), and converting the tree to assembly. More details on each step
are given below:
 
- Cleaning: This step removes all unnecessary characters, including newlines, comments, 
and miscellaneous spaces. 
- Parsing: Perhaps the most challenging step in the compiling process, this step turns the 
cleaned program into abstract syntax. Using a stack, the parser works from the 
innermost code blocks outwards, turning individual lines of code into various 
components and composites of components. A full list can be found in syntax.py, but 
these range from arithmetic blocks to method calls to variable declarations to 
conditional statements. At the end of the parsing step, the entire program has been 
turned into a tree of syntactical components.
- Converting to Assembly: This step walks through the tree, compiling generalized 
assembly from each component into the final program. Syntax2.py contains a series of 
optimizations made to this step. Instead of blindly outputting repetitive, inefficient, and 
generalized code, the compiler tries to simulate the program and cut out instructions 
that would ultimately have no effect. The optimizations are unfortunately still under 
development, though syntax.py is fully functional. 
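
To make the first of these steps concrete, here is a rough Python sketch of what a cleaning pass could look like. It is not the code from our compiler: the function name clean and the use of // as the comment marker are assumptions, chosen to match the guideline that comments are single-line and sit on their own lines.

```python
def clean(source: str) -> str:
    """Drop blank lines and whole-line comments, then collapse whitespace.

    Hypothetical stand-in for the cleaning step; assumes C-style '//'
    comments that occupy a line by themselves.
    """
    kept = []
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped or stripped.startswith("//"):
            continue  # skip empty lines and single-line comments
        kept.append(" ".join(stripped.split()))  # collapse runs of spaces/tabs
    return " ".join(kept)  # newlines are no longer needed after this point


if __name__ == "__main__":
    program = "int a = 1;\n// bump a\n   a = a +   4;\n"
    print(clean(program))
```

The parser then only has to deal with a single normalized line of text, which keeps the stack-based block matching simpler.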
 
To help transfer our assembly to the processor, we also developed an assembler which has 
support for: 
- Labels 
- Defined constants 
- Tabs and spaces 
- Comments 
- Hex and decimal immediates 
- Pseudoinstructions (currently, just unpacking li and la for large immediates) 
 
Running the compiler invokes the assembler as well, so it outputs an assembly file, a text file 
with commented machine code, and .COE files for easy instruction memory generation. 

Results 
We were able to get the model implemented on the FPGA board. This included running a simple
program that takes 8-bit inputs separated by 2 seconds, merges them into a 16-bit
input, runs relPrime, and then displays the result on the FPGA screen.
 

 
 
 
We wrote a hand-optimized assembly program and compared it to compiler output. We found
the following performance specifications:

Metric                                                    Optimized Code  Unoptimized Code (Compiled from C)
# of 16-bit instructions to store Euclid’s and relPrime   60              97
  (including retrieving input)
# of 16-bit memory addresses                              4               7
# of 16-bit instructions executed with 0x13B0             92092           163496
# of cycles to execute with 0x13B0                        276276          490488

Average cycles per instruction                            3
Cycle time for design                                     6.366 ns
Clock speed                                               157 MHz
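
The figures above are related by simple arithmetic, which the following Python sketch reproduces. The wall-clock run times and the speedup ratio it prints are derived values that we did not measure directly, and the function name run_time_ms is only illustrative.

```python
CYCLES_PER_INSTRUCTION = 3   # every instruction takes three cycles
CYCLE_TIME_NS = 6.366        # cycle time for the design

OPTIMIZED_INSTRUCTIONS = 92092    # executed with input 0x13B0
COMPILED_INSTRUCTIONS = 163496    # executed with input 0x13B0


def cycles(instructions_executed):
    """Total cycles, given a constant number of cycles per instruction."""
    return instructions_executed * CYCLES_PER_INSTRUCTION


def run_time_ms(instructions_executed):
    """Derived run time in milliseconds at the reported cycle time."""
    return cycles(instructions_executed) * CYCLE_TIME_NS / 1e6


if __name__ == "__main__":
    print(cycles(OPTIMIZED_INSTRUCTIONS), cycles(COMPILED_INSTRUCTIONS))
    print(round(run_time_ms(OPTIMIZED_INSTRUCTIONS), 3), "ms (hand-optimized)")
    print(round(run_time_ms(COMPILED_INSTRUCTIONS), 3), "ms (compiled from C)")
    print(round(1000 / CYCLE_TIME_NS), "MHz clock")
    print(round(COMPILED_INSTRUCTIONS / OPTIMIZED_INSTRUCTIONS, 2), "x speedup")
```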


 
 

Conclusion 
Creating a processor is quite difficult! We learned that there are many compromises to be made 
in a design, and having gone through one iteration of a design helps you to make better 
compromises. After doing one pass through, there are many changes we would individually like 
to see happen, and some that we can all agree would be significant improvements, such as 
leveraging multi-cycle to its fullest and shortening the length of some instructions. 

   

 
 
 

A: Software Specifications 
Available Registers 

General Purpose Registers 


Default Accumulating (DA) Register 
This is the primary register available to the programmer. It serves as a way to accumulate 
values and is typically the first argument in any operation. When used properly, its contents are
preserved across procedure calls. 
 
Indirect Addressing (IA) Register 
This is a secondary register available to the programmer, which points to the location in
memory to read from or write to. It serves as a way to accumulate values and parameterize
instructions. When used properly, its contents are preserved across procedure calls. 

Restricted Purpose Registers 


Program Counter (PC) Register 
This register stores the memory address for the current instruction. It cannot be altered by the 
programmer, but branching and jumping instructions alter it as part of their behavior. It is also
incremented appropriately at the end of every completed instruction. 
 
Return Address (RA) Register 
This register stores the memory address of the instruction to return to upon executing the ret
instruction. It cannot be directly altered by the programmer, but RA is altered appropriately by
the instructions pushra, popra, and call. When used properly, its contents are preserved across
procedure calls.
 
Stack Pointer (SP) Register 
This register stores the memory address for the top item on the stack. It cannot directly be 
altered by the programmer, but instructions such as push and pop may alter it as part of their
behavior. 
 
 
 

 
 
 

Procedure Call Convention 


 
Calling a Procedure (Being the Caller) 
Before jumping or branching to a procedure, you must first back up any local memory you want 
to ensure persists across the procedure call. This is to be done on the stack. Next, you must 
back up the critical registers RA, IA, and DA (using pushra, pushia, and push).  
 
After backing up the registers you may then push whatever arguments the procedure requires to 
the stack. You are now ready to call the procedure. 
 
Upon returning from the procedure any return values will be present at the top of the stack.
From here, you can pop off these values. After this, be sure to restore the critical registers in
the reverse of the order in which they were pushed: DA, IA, then RA (using pop, popia, popra).
 
Defining a Procedure (Being the Callee) 
As a callee your responsibilities are much less stringent. The caller expects the return values to 
be at the top of the stack so your only responsibility is to clear any passed arguments and push 
the expected return values to the stack. You may then return using an instruction such as ret.
 
In the event that you push any local values to the stack, you must remove them before pushing 
your return values. 
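
The convention can be walked through with a small Python model. Everything here is a sketch under stated assumptions: the stack is a Python list, the registers are a dictionary, and call_with_convention and add_two are hypothetical names that stand in for a caller and a callee; none of this is generated by or taken from our tools.

```python
def call_with_convention(regs, stack, callee, args):
    """Caller side: back up registers, push arguments, call, then restore."""
    for name in ("RA", "IA", "DA"):        # pushra, pushia, push
        stack.append(regs[name])
    for arg in args:                       # push whatever the procedure needs
        stack.append(arg)
    callee(regs, stack)                    # call label
    result = stack.pop()                   # return value is on top of the stack
    for name in ("DA", "IA", "RA"):        # pop, popia, popra (reverse order)
        regs[name] = stack.pop()
    return result


def add_two(regs, stack):
    """Callee side: clear the passed arguments, then push the return value."""
    b = stack.pop()                        # last argument pushed comes off first
    a = stack.pop()
    regs["DA"] = regs["IA"] = 0            # a callee is free to clobber registers
    stack.append(a + b)                    # leave the return value on the stack


if __name__ == "__main__":
    registers = {"RA": 7, "IA": 3, "DA": 42}
    stack = []
    print(call_with_convention(registers, stack, add_two, [5, 6]))
    print(registers, stack)
```

Because nothing in hardware enforces this, a callee that leaves an extra value on the stack would cause the caller to restore the wrong values into DA, IA, and RA.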

   

 
 
 

Machine Language Instruction Types 

Inherent (N) Type Instructions 


These instructions consist mostly of opcode bits, with a small 4-bit immediate (shamt) for shift
operations.
 
0  0  grp (2)  aluop (3)  op (5)  shamt (4) 
 
prefix: Always 00
grp: Instruction subgroup
aluop: ALU opcode
op: Operation type
shamt: Shift amount

Immediate (I) Type Instructions 


These instructions contain an 8-bit immediate value. 
 
0  1  alu (2)  op (4)  imm (8) 
 
prefix: Always 01
alu: First two bits of ALU opcode (the last bit is assumed to be 0)
op: Operation type
imm: The immediate value

Branch (B) Type Instructions 


These instructions take a 12-bit label which points to an address in memory. 
 
1  op (3)  imm (12) 
 
prefix: Always 1
op: Type of branch
imm: The immediate to be used by the branch instruction. Varies by instruction.
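
As an illustration of how the three layouts pack into a 16-bit word, here is a short Python sketch. The helper names encode_n, encode_i, and encode_b are hypothetical and are not part of our assembler; the field widths come from the diagrams above, and the sample opcodes come from the Instruction Semantics section.

```python
def encode_n(grp, aluop, op, shamt=0):
    """Inherent (N) type: 00 | grp (2) | aluop (3) | op (5) | shamt (4)."""
    return ((grp & 0x3) << 12) | ((aluop & 0x7) << 9) | ((op & 0x1F) << 4) | (shamt & 0xF)


def encode_i(alu, op, imm):
    """Immediate (I) type: 01 | alu (2) | op (4) | imm (8)."""
    return (0b01 << 14) | ((alu & 0x3) << 12) | ((op & 0xF) << 8) | (imm & 0xFF)


def encode_b(op, imm):
    """Branch (B) type: 1 | op (3) | imm (12)."""
    return (1 << 15) | ((op & 0x7) << 12) | (imm & 0xFFF)


if __name__ == "__main__":
    # addi -1: alu = 00, op = 1100, immediate stored as 8-bit two's complement
    print(format(encode_i(0b00, 0b1100, -1), "016b"))
    # sub: grp = 00, aluop = 001, op = 01110, shamt unused
    print(format(encode_n(0b00, 0b001, 0b01110), "016b"))
    # j to address 0x02B: op = 000
    print(format(encode_b(0b000, 0x02B), "016b"))
```

Masking each field before shifting keeps a negative immediate such as -1 from spilling into the opcode bits.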

   
 
 
 

Instruction Semantics 

Arithmetic and Logical Instructions 


Addition to DA 
add N (00) 00 0000 1110 XXXX  
Performs addition on the value stored at memory as described by the IA register and the 
contents of DA register. The result is put into DA register. 
 
Addition to Memory 
addm N (00) 00 0000 1101 XXXX 
Performs addition on the value stored at memory as described by the IA register and the 
contents of DA register. The result is put into memory at the location described by the IA 
register. 
 
Addition with Immediate 
addi imm I (01) 00 1100 iiii iiii 
Performs signed addition on the immediate value and DA register. The result is put into 
DA register. 
 
AND to DA 
and N (00) 00 1100 1110 XXXX 
Performs a logical AND on the value stored at memory as described by the IA register 
and the contents of DA register. The result is put into DA register. 
 
AND to Memory 
andm N (00) 00 1100 1101 XXXX 
Performs a logical AND on the value stored at memory as described by the IA register 
and the contents of DA register. The result is put into memory at the location described 
by the IA register. 
 
AND with Immediate 
andi imm I (01) 11 1010 iiii iiii 
Put the result of a logical AND between DA register and the immediate value into DA 
register. 
 
OR to DA 
or N (00) 00 1000 1110 XXXX 
Performs a logical OR on the value stored at memory as described by the IA register and 
the contents of DA register. The result is put into DA register. 
 
 
 
 
 

OR to Memory 
orm N (00) 00 1000 1101 XXXX 
Performs a logical OR on the value stored at memory as described by the IA register and 
the contents of DA register. The result is put into memory at the location described by 
the IA register. 
 
OR with Immediate 
ori imm I  (01) 10 1010 iiii iiii 
Put the result of a logical OR between DA register and the immediate value into DA 
register.
 
Shift Left Logical 
sll shamt N (00) 00 0110 0010 shmt 
Shift the value in DA register left by shamt. 
 
Shift Right Logical 
srl shamt N (00) 00 1010 0010 shmt 
Shift the value in DA register right by shamt. 
 
Shift Left Logical to Memory 
sllm shamt N (00) 00 0110 0101 shmt 
Shift the value in memory at IA left by shamt. 
 
Shift Right Logical to Memory 
srlm shamt N (00) 00 1010 0101 shmt 
Shift the value in memory at IA right by shamt. 
 
Subtraction to DA 
sub N (00) 00 0010 1110 XXXX 
Subtracts the value stored at memory as described by the IA register from the contents 
of DA register. The result is put into DA register. 
 
Subtraction to Memory 
subm N (00) 00 0010 1101 XXXX
Subtracts the value stored at memory as described by the IA register from the contents 
of DA register. The result is put into memory at the location described by the IA register. 
 
Load Upper Immediate 
lui imm I (01) 00 0000 iiii iiii 
Load the immediate value specified into the upper half of DA register. 
 
Load Immediate 
li imm I (01) 00 0100 iiii iiii 
Load the immediate value specified into the lower half of DA register. Sign extended. 

 
 
 

 
Two's Complement 
two N (00) 00 0100 0010 XXXX 
Take the two’s complement of DA register. The result is also put in DA register. 
 

Branch/Jump Instructions 
Branch if Not Equal To 0 
bnez label B (1) 111 LLLL LLLL LLLL
Conditionally jump to the address specified by label if DA register does not contain the 
value 0. 
 
Branch if Equal To 0 
bez label B (1) 110 LLLL LLLL LLLL 
Conditionally jump to the address specified by label if DA register contains the value 0. 
 
Branch if No Carry 
bnc label B (1) 101 LLLL LLLL LLLL 
Conditionally jump to the address specified by label if the carry bit is set to 0 from the 
previous operation. 
 
Branch if Carry 
bc label B (1) 100 LLLL LLLL LLLL 
Conditionally jump to the address specified by label if the carry bit is set to 1 from the 
previous operation. 
 
Branch if Positive 
bp label B (1) 011 LLLL LLLL LLLL 
Conditionally jump to the address specified by label if DA register’s sign (most significant) bit is 0.
 
Branch if Negative 
bn label B (1) 010 LLLL LLLL LLLL 
Conditionally jump to the address specified by label if DA register’s sign (most significant) bit is 1.
 
Jump 
j label B (1) 000 LLLL LLLL LLLL 
Jumps to the address specified by label. 
 
Call 
call label B (1) 001 LLLL LLLL LLLL 
Jumps to the address specified by label after storing the contents of the PC register 
(incremented by 1) in the RA register. 
 
 
 
 

Return 
ret N (00) 10 1111 0001 XXXX 
Jumps to the address stored in the RA register. 
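
The branch conditions above can be summarized in a small Python sketch. This models only the decision of whether a branch is taken, not the control unit itself; branch_taken is a hypothetical name, DA is assumed to be 16 bits wide, and the bit tested by bp and bn is assumed to be the most significant (sign) bit.

```python
def branch_taken(mnemonic, da, carry):
    """Return True if the given branch or jump would change the PC."""
    da &= 0xFFFF                       # assumption: DA is a 16-bit register
    negative = bool(da & 0x8000)       # assumption: the sign bit is bit 15
    conditions = {
        "bnez": da != 0,
        "bez": da == 0,
        "bnc": not carry,
        "bc": bool(carry),
        "bp": not negative,
        "bn": negative,
        "j": True,                     # unconditional
        "call": True,                  # unconditional, also saves PC + 1 in RA
    }
    return conditions[mnemonic]


if __name__ == "__main__":
    print(branch_taken("bez", 0, carry=False))
    print(branch_taken("bn", 0xFFFF, carry=False))
    print(branch_taken("bp", 0, carry=False))
```

Note that under this reading bp is also taken when DA is zero, since the sign bit of zero is 0.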
 

I/O Manipulation Instructions 


LCD Write 
lcdw N (00) 11 0000 XXXX XXXX 
Writes the current value in the DA register to the DP register. 
 
LCD Move Cursor 
lcdmc imm I (00) 11 0100 iiii iiii 
Moves the LCD cursor to a specified location on the LCD display. 
 
LCD Move Cursor to DA 
lcdmcda (00) 11 0100 1111 1111 
Moves the LCD cursor to the location in DA. 
 
LCD Clear 
lcdclr N (00) 11 1000 XXXX XXXX 
Clears the LCD screen and resets the cursor position to the start. 
 
Read Buttons and Switches 
buttr N (00) 10 1110 0011 XXXX 
Reads the button and switch inputs from the BS register in the order 
S0,S1,S2,S3,B0,B1,B2,B3 to the DA register. 

Memory Manipulation Instructions 


Load Word 
lw N (00) 00 1111 0110 XXXX 
Load the contents of the memory pointed to by the IA register into DA register.
 
Store Word 
sw N (00) 10 1110 0101 XXXX 
Store the contents of DA register at the address pointed to by the IA register.
 
Load Indirect Addressing 
lia N (00) 10 1110 0010 1XXX 
Loads the value of the IA register into DA register. 
 
 

 
 
 
Store to Indirect Addressing
sia N (00) 10 1110 1001 1XXX
Stores the value of the DA register into the IA register. 
 
Load Address 
la imm I (01) 00 0011 iiii iiii 
Load the immediate value specified into the lower half of the IA register. Zero-extended. 
 
OR Upper Address 
oua imm I (01) 10 0001 iiii iiii 
Load the immediate value specified into the upper half of the IA register. Zero-extended.
 
Point to Adjacent Memory 
iap imm I (01) 00 0101 iiii iiii 
Increments (or decrements) the IA register by the specified value. 
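
Since la and oua each carry only an 8-bit immediate, a full 16-bit address takes two instructions. The Python sketch below shows one way the pair could combine. It rests on an assumption suggested by the name OR Upper Address: that oua ORs its immediate into the upper byte of IA and leaves the lower byte alone. The function names are illustrative only.

```python
def la(imm):
    """la: IA <- zero-extended 8-bit immediate (lower half)."""
    return imm & 0xFF


def oua(ia, imm):
    """oua: assumed to OR the 8-bit immediate into the upper half of IA."""
    return (ia | ((imm & 0xFF) << 8)) & 0xFFFF


def load_address(address):
    """Build a 16-bit address with la followed by oua."""
    ia = la(address & 0xFF)        # low byte first, which clears the upper half
    return oua(ia, address >> 8)   # then fill in the high byte


if __name__ == "__main__":
    print(hex(load_address(0x13B0)))
```

Under this assumption the order matters: la must come first, because it zero-extends and would wipe out an upper byte set earlier.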

Stack Instructions 
Push to Stack 
push N (00) 01 1110 0100 0XXX 
Pushes the contents of DA register to the stack. Decrements the SP register. 
 
Pop off Stack 
pop N (00) 01 1111 0010 0XXX 
Loads the word at the top of the stack into DA register. Increments the SP register. 
 
Push RA to Stack 
pushra N (00) 01 1110 1000 0XXX 
Pushes the contents of RA register to the stack. Decrements the SP register. 
 
Pop RA off Stack 
popra N (00) 01 1111 1000 1XXX 
Loads the word at the top of the stack into RA register. Increments the SP register. 
 
Push IA to Stack 
pushia N (00) 01 1110 0000 0XXX 
Pushes the contents of IA register to the stack. Decrements the SP register.
 
Pop IA off Stack 
popia N (00) 01 1111 0001 0XXX 
Loads the word at the top of the stack into IA register. Increments the SP register.

 
 
 

Example Program 
The following program is an example of programming in our processor’s assembly language. For
some input N, it finds the smallest number M, starting from 2, that is relatively prime to N.

Assembly Program 
relPrime: 
# Fetch argument n 
la 0 # Load address of variable N into IA 
pop # Pop off the last argument from the stack  
sw # Store DA into mem[IA] 
 
# Create variable M 
la 1 # Load address of variable M into IA 
li 2 # Load value 2 into DA 
sw # Store DA into mem[IA] 
 
relPrime_loop: 
pushra # Back up critical registers (RA, IA, DA)
pushia 
push 
 
# Set up N and M as arguments for gcd
la 0 # Load address of variable N into IA 
lw # Load mem[IA] into DA 
push # Push DA onto the stack (put N on as an argument) 
 
# Repeat for M 
la 1 
lw 
push 
 
# Call the GCD function 
call gcd 
 
# Get the return values 
pop # put return value into DA 
addi -1 # subtract 1 
# if the result is zero (return value == 1), we're done 
bez relPrime_done  
 
pop 
popia 
popra # Restore critical registers (DA, IA, RA)

 
 
 

 
li 1 # load the immediate 1 into DA 
addm # mem[IA] = DA + mem[IA] (recall! IA points to M after the rya)
 
relPrime_done: 
lw # NOTE: M is already in IA at this point 
push # Push DA onto stack 
ret # Go back to where we came from 
 
 
gcd: 
# fetch argument B 
la 1 
pop 
sw 
 
# fetch argument A 
la 0 
pop 
sw 
 
bnez gcd_nonzero # DA contains a, check if nonzero 
la 1 # IA = addr of B 
lw # DA = mem[IA] = B 
push # Push DA onto stack (put B on stack) 
ret # Go back to where we came from 
gcd_nonzero: 
 
gcd_loop: 
la 1 # IA = addr of B 
lw # DA = mem[IA] = B 
 
# if B == 0 then we're done 
bez gcd_done  
# a = a-b 
la 0 # IA = addr of A 
lw # DA = mem[IA] = A 
la 1 # IA = addr of B 
sub # DA = DA-mem[IA] = A-B 
la 0 # IA = addr of A 
sw # A = DA 
 
# skip over the next case 
j gcd_casedone  
 
gcd_case2: 
# b = b-a. Same as above 
 
 
 

la 1  
lw    
la 0 
sub   
la 1 
sw   
 
gcd_casedone: 
j gcd_loop # go to start of loop 
 
gcd_done: # It's over! 
la 0 # IA = addr of A 
lw # DA = mem[IA] = A 
push # push DA (A) onto stack 
ret # Go back to where we came from 
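
For reference, the algorithm this listing is meant to implement can be written in a few lines of Python. This is a sketch of the high-level logic only: the comparison that chooses between the a = a-b and b = b-a cases is an assumption, since the listing above does not show it, and the function names gcd and rel_prime simply mirror the assembly labels.

```python
def gcd(a, b):
    """Euclid's algorithm by repeated subtraction."""
    if a == 0:
        return b
    while b != 0:
        if a > b:        # assumed test selecting between the two cases
            a = a - b
        else:
            b = b - a
    return a


def rel_prime(n):
    """Smallest m >= 2 such that gcd(n, m) == 1."""
    m = 2
    while gcd(n, m) != 1:
        m = m + 1
    return m


if __name__ == "__main__":
    print(rel_prime(0x13B0))   # same input as our performance measurements
```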

Assembled Machine Code 


0100101100000000 # la 0 
0001111100100000 # pop 
0010111001010000 # sw 
0100101100000001 # la 1 
0101010000000010 # li 2 
0010111001010000 # sw 
0001111010001000 # pushra 
0001111000010000 # pushia 
0001111001100000 # push 
0100101100000000 # la 0 
0000111101100000 # lw 
0001111001100000 # push 
0100101100000001 # la 1 
0000111101100000 # lw 
0001111001100000 # push 
1001000000011110 # call gcd 
0001111100100000 # pop 
0100110011111111 # addi -1 
1111000000011010 # bez relPrime_done 
0001111100100000 # pop 
0001111100010000 # popia 
0001111110001000 # popra 
0101010000000001 # li 1 
0000000011010000 # addm 
0000111101100000 # lw 
0001111001100000 # push 
0010111100010000 # ret 
0100101100000001 # la 1 
0001111100100000 # pop 

 
 
 

0010111001010000 # sw 
0100101100000000 # la 0 
0001111100100000 # pop 
0010111001010000 # sw 
1110000000101010 # bnez gcd_nonzero 
0100101100000001 # la 1 
0000111101100000 # lw 
0001111001100000 # push 
0010111100010000 # ret 
0100101100000001 # la 1 
0000111101100000 # lw 
1111000000111111 # bez gcd_done 
0100101100000000 # la 0 
0000111101100000 # lw 
0100101100000001 # la 1 
0000001011100000 # sub 
0100101100000000 # la 0 
0010111001010000 # sw 
1000000000111101 # j gcd_casedone 
0100101100000001 # la 1 
0000111101100000 # lw 
0100101100000000 # la 0 
0000001011100000 # sub 
0100101100000001 # la 1 
0010111001010000 # sw 
1000000000101011 # j gcd_loop 
0100101100000000 # la 0 
0000111101100000 # lw 
0001111001100000 # push 
0010111100010000 # ret 
 
 
 

   

 
 
 

Common Operations 

Adding Numbers in Memory 


This assembly snippet adds 3 numbers that are stored in memory and places the result in the 
next spot in memory. For our purposes, we assume the address of the first number is already in 
the IA register. 
 
Assembly 
 
andi 0 # clear the value in DA 
add # add the first number to DA 
iap 16 # move IA by 1 word (16 bits) to the next number 
add # add the second number to DA 
iap 16 # move IA to the next number 
add # add the third number to DA 
iap 16 # move IA to the next word in memory 
sw # store DA 
 
Machine Code 
 
0100 01XX 0000 0000 
0000 000X XXXX XXXX 
0110 01XX 0001 0000 
0000 000X XXXX XXXX 
0110 01XX 0001 0000 
0000 000X XXXX XXXX 
0110 01XX 0001 0000 
0001 010X XXXX XXXX 
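
The effect of the snippet above can be checked with a small software model. This is only a sketch under stated assumptions: memory is modelled as a Python dict keyed by address, the function name add_three and the sample addresses are hypothetical, and the stride of 16 simply mirrors the iap 16 used in the snippet.

```python
MASK16 = 0xFFFF  # all registers are 16 bits wide

def add_three(mem, ia, stride=16):
    """Mirror the snippet: andi 0, then (add, iap) three times, then sw."""
    da = 0                              # andi 0: clear DA
    for _ in range(3):
        da = (da + mem[ia]) & MASK16    # add: DA = DA + mem[IA]
        ia = (ia + stride) & MASK16     # iap 16: move IA to the next number
    mem[ia] = da                        # sw: mem[IA] = DA
    return ia, da

# Hypothetical starting state: three numbers at 0x100, 0x110 and 0x120.
mem = {0x100: 5, 0x110: 7, 0x120: 9}
ia, da = add_three(mem, 0x100)
print(hex(ia), da, mem[ia])
```

The sum ends up in the slot immediately after the third number, which is the behaviour the prose describes.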

Loop Through an Array in Memory 


This assembly snippet loops through memory. For our purposes we will assume the starting 
address is already in the IA register and the length of the array is 10. 
 
Assembly 
 
li 10 # set DA to be 10 (array length) 
loop: 
push # push the value of DA (array index) to the stack 
lw # load the array element 
 
# do whatever you desire with the array element 

 
 
 

 
pop # restore the index to DA 
addi -1 # decrement the index 
iap 16 # increment the position of IA by one word 
bnez loop # check if we have traversed the whole array 
 
# continue the program here 
 
Machine Code 
Note: loop is found at address 0x111 
 
0101 10XX 0000 1010 
0010 010X XXXX XXXX 
0001 001X XXXX XXXX 
0010 011X XXXX XXXX 
0100 00XX 1111 1111 
0110 01XX 0001 0000 
1000 0001 0001 0001 

Loading an Address into the IA Register 


Assembly 
For small addresses (load 0x50): 
 
la 0x50 # load address 0x50 into the IA register 
 
For larger addresses (load 0x1234): 
 
la 0x34 # load address 0x34 into the IA register
oua 0x12 # OR the value 0x1200 into the IA register (result: 0x1234)
 
Machine Code 
For small addresses (load 0x50): 
 
0110 0000 0101 0000 
 
For larger addresses(load 0x1234): 
 
0110 0000 0011 0100 
0101 1100 0001 0010 
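
The two-instruction pattern above can be sketched in software. The helper names address_sequence and run are hypothetical; the model assumes only what the RTL states, namely that la loads ZE(imm) into IA and oua ORs ZEU(imm) into IA.

```python
def address_sequence(addr):
    """Return the (mnemonic, immediate) pairs that load a 16-bit address into IA."""
    low, high = addr & 0xFF, (addr >> 8) & 0xFF
    seq = [("la", low)]              # la always sets the low byte
    if high:
        seq.append(("oua", high))    # oua is only needed for a non-zero high byte
    return seq

def run(seq):
    """Apply the la/oua sequence to a modelled IA register."""
    ia = 0
    for op, imm in seq:
        if op == "la":
            ia = imm & 0xFF                  # IA = ZE(imm)
        elif op == "oua":
            ia |= (imm & 0xFF) << 8          # IA = IA OR ZEU(imm)
    return ia

for target in (0x50, 0x1234):
    seq = address_sequence(target)
    print(seq, hex(run(seq)))
```

Because la overwrites the whole register with a zero-extended value, it has to come first; oua then fills in the upper byte without disturbing the lower one.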

   

 
 
 

Conditional Statements 
Assembly 
Skip over code if memory at label A equals memory at label B 
 
la A 
lw 
la B 
sub 
bez equal 
# Put code to execute if A!=B 
 
equal: 
 
Skip over code if memory at label A is less than memory at label B 
 
la A 
lw 
la B 
sub # DA = A-B 
bnc gt 
# Put code to execute if A <= B 
gt: 
 
Machine Code 
Skip over code if memory at label A equals memory at label B 
 
# for reference
A address = 0000 0001
B address = 0000 0010
equal address = 1010 1010 1010
gt address = 0101 0101 0101
#############################
 
0110 00XX 0000 0001 
0001 001X XXXX XXXX 
0110 00XX 0000 0010 
0000 110X XXXX XXXX 
1001 1010 1010 1010 
# Put code to execute if A!=B 
​0000 1010 1010 1010: 
  
Skip over code if memory at label A is less than memory at label B 
  
0110 00XX 0000 0001 
0001 001X XXXX XXXX 

 
 
 
0110 00XX 0000 0010
0000 110X XXXX XXXX # DA = A-B
1010 0101 0101 0101
# Put code to execute if A <= B
0000 0101 0101 0101

  
 

Reading from/Writing to a Display Register 


These are examples of writing to and reading from the DP register.
Assembly 
LCDWriting: 
li 5 #loads 5 into the DA register 
lcdmc 0  #moves the LCD cursor to position 0 
lcdw  #writes the DA register (5) into the DP register 
 
LCDReading: 
lcdr #reads value to the DA register 
la A #loads address A 
sub #LCD value - A 
bnc gt #branches if the last operation did not carry
 
gt: 
 
Machine Code 
 
0101 1000 0000 0101 
0110 1000 0000 0000 
0001 1010 0000 0000 
 
0001 1100 0000 0000 
0110 0000 0000 1111 
0000 1100 0000 0000 
1010 0000 0000 1000 

 
   

 
 
 

B: Hardware Specifications 
Register Transfer Language 

Arithmetic 
          A/L to memory           A/L to DA               A/L with imm to DA
Fetch     inst = mem[PC]
(all)     newPC = PC+1
Stage 1   ALUOut = DA op Mem[IA]  ALUOut = DA op Mem[IA]  ALUOut = DA op SE/ZE/ZEu(inst[7:0])
Stage 2   Mem[IA] = ALUOut        DA = ALUOut             DA = ALUOut
          PC = newPC              PC = newPC              PC = newPC

Memory Instructions 
          lw              sw
Fetch     inst = mem[PC]
(all)     newPC = PC+1
          PC = newPC
Stage 1   DA = mem[IA]    mem[IA] = DA
          PC = newPC      PC = newPC

   

 
 
 
          iap                          la                      oua                            sia           lia
Fetch     inst = mem[PC]
(all)     newPC = PC+1
          PC = newPC
Stage 1   ALUOut = IA + SE(inst[7:0])  ALUOut = ZE(inst[7:0])  ALUOut = IA OR ZEU(inst[7:0])  ALUOut = DA   ALUOut = IA
Stage 2   IA = ALUOut                  IA = ALUOut             IA = ALUOut                    IA = ALUOut   DA = ALUOut
          PC = newPC                   PC = newPC              PC = newPC                     PC = newPC    PC = newPC

Jumps/Branches 
 
          ret        Branch                          Jump                          Call
Fetch     inst = mem[PC]
(all)     newPC = PC+1
Stage 1   PC = RA    if <flag>                       PC = PC[15:12] || inst[11:0]  RA = PC
                       PC = PC[15:12] || inst[11:0]                                PC = PC[15:12] || inst[11:0]
                     else
                       PC = newPC

   

 
 
 

LCD / Buttons 
          lcdwr       lcdclr      lcdmc           buttr
Fetch     inst = mem[PC]
(all)     newPC = PC+1
Stage 1   DP = DA     CLEAR = 1   Row = inst[5]   DA = BS
          PC = newPC  PC = newPC  Sa = inst[4:0]  PC = newPC
                                  Movecursor = 1
                                  PC = newPC

Stack Instructions 
          pop               popia             popra             push              pushia            pushra
Fetch     inst = mem[PC]
(all)     newPC = PC+1
Stage 1   MemOut = mem[SP]  MemOut = mem[SP]  MemOut = mem[SP]  ALUOut = DA       ALUOut = IA       ALUOut = RA
          newSP = SP-1      newSP = SP-1      newSP = SP-1      newSP = SP+1      newSP = SP+1      newSP = SP+1
Stage 2   DA = mem[SP]      IA = mem[SP]      RA = mem[SP]      mem[SP] = ALUOut  mem[SP] = ALUOut  mem[SP] = ALUOut
          SP = newSP        SP = newSP        SP = newSP        SP = newSP        SP = newSP        SP = newSP
          PC = newPC        PC = newPC        PC = newPC        PC = newPC        PC = newPC        PC = newPC

   

 
 
 

Component Specification 
1. General Purpose Register
a. Input Signal(s)​: 16-bit regWrite
b. Output Signal(s)​: 16-bit regRead
c. Control Signal(s)​: 1-bit writeEnable
d. Description: The input signal is ignored unless the writeEnable control signal is
set to 1. When it is 1, the contents of the register are overwritten with the contents of
the input signal. Regardless of the control signal, the output signal always
reflects the contents of the register.
2. Program Memory
a. Input Signal(s)​: 16-bit Address, 16-bit WriteData
b. Output Signal(s)​: 16-bit ReadData
c. Control Signal(s)​: 1-bit MemRead, 1-bit MemWrite
d. Description​: If MemWrite is 1, then the data on the WriteData input will be
written to the memory address on the Address input. If MemRead is 1, then the
data at the memory address on the Address input will be available on the
ReadData output.
3. 1:2 Mux
a. Input Signal(s)​: 16-bit in0, 16-bit in1
b. Output Signal(s)​: 16-bit out
c. Control Signal(s)​: 1-bit select
d. Description​: Select in0 or in1 to be fed to out.
4. 2:4 Mux
a. Input Signal(s)​: 16-bit in0, 16-bit in1, 16-bit in2, 16-bit in3
b. Output Signal(s)​: 16-bit out
c. Control Signal(s)​: 2-bit select
d. Description​: Select one of the inputs to be fed to out.
5. Sign Extension Unit
a. Input Signal(s):​ 8-bit signal
b. Output Signal(s)​: 16-bit signal
c. Control Signal(s)​: None
d. Description: ​Sign extends the 8-bit input signal to 16-bits
e. Implements SE
6. Zero Extension Unit
a. Input Signal(s):​ 8-bit signal
b. Output Signal(s)​: 16-bit signal
c. Control Signal(s)​: None
d. Description: ​Zero extends the 8-bit input signal to 16-bits (zeros on MSB side)
e. Implements ZE

 
 
 

7. Zero Extension Upper Unit


a. Input Signal(s):​ 8-bit signal
b. Output Signal(s)​: 16-bit signal
c. Control Signal(s)​: None
d. Description: ​Zero extends the 8-bit input signal to 16-bits (zeros on LSB side)
e. Implements ZEU
8. ALU
a. Input Signal(s)​: 16-bit A, 16-bit B
b. Output Signal(s)​: 16-bit ALUResult, 1-bit Carry
c. Control Signal(s)​: 3-bit ALUOp
d. Description: ​Performs an operation selected by ALUOp on the A and B inputs,
making the result available on ALUOut (e.g., A op B = ALUOut). If this results in a
carry, then the carry output wire is 1. Otherwise, the carry output wire is 0.
e. Implements op
9. Incrementer
a. Input Signal(s)​: 16-bit in
b. Output Signal(s)​: 16-bit out
c. Control Signal(s)​: 1-bit direction
d. Description: ​16-bit adder that either increments or decrements based on
direction.
e. Implements newSP = SP+1 or SP-1, and PC increment.

10. Instruction Memory


a. Input Signal(s)​: 16-bit PC address
b. Output Signal(s)​: 16-bit Instruction
c. Control Signal(s)​: none
d. Description​: Takes in the current address of the PC and outputs the instruction
there.

11. ALU Latch


a. Input Signal(s): ​16-bit in
b. Output Signal(s):​ 16-bit out
c. Control Signal(s):​ 1-bit write signal
d. Description:​ Holds the output of the ALU at the end of the READ stage so that it
can be used to write to the proper registers in the WRITE stage.
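
As an illustration of the extension units and the incrementer specified above, the following sketch models them as pure functions. The function names are hypothetical, and the direction encoding (1 = increment, 0 = decrement) is taken from the SPDIR control signal description; this is a behavioural model, not the hardware implementation.

```python
MASK16 = 0xFFFF

def sign_extend(imm8):
    """SE: copy bit 7 of the 8-bit input into bits 15..8."""
    imm8 &= 0xFF
    return (imm8 | 0xFF00) if (imm8 & 0x80) else imm8

def zero_extend(imm8):
    """ZE: zeros on the MSB side."""
    return imm8 & 0xFF

def zero_extend_upper(imm8):
    """ZEU: the input becomes the upper byte, zeros on the LSB side."""
    return (imm8 & 0xFF) << 8

def incrementer(value, direction):
    """16-bit adder: +1 when direction is 1, -1 when direction is 0."""
    step = 1 if direction else -1
    return (value + step) & MASK16

for name, fn in (("SE", sign_extend), ("ZE", zero_extend), ("ZEU", zero_extend_upper)):
    print(name, format(fn(0xF0), "016b"))
```

Masking to 16 bits in the incrementer makes the wraparound at 0xFFFF and 0x0000 explicit, which is the edge case a testbench would most want to cover.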

   

 
 
 

Datapath Schematic 

 
 
 
 

Control Signals 
Into Control Unit:
DA  The entire DA register is used as an input to the control unit (this provides the zero and negative ‘flags’).
CARRY  The carry bit from the ALU is fed into the control unit.
Out of Control Unit:
PCSRC1  Selects between newPC/branched PC and RA as write input for the PC register.
PCW  Whether or not to write to the PC register.
FEN  Which flag to use (for branch operations).
FINV  Whether or not to invert the flag (for branch operations).
IMEMR  Whether or not to read from instruction memory.
IASRC  Selects whether to feed ALUResult, MemOut, or DA as write input for the IA register.
IAW  Whether or not to write to the IA register.
DASRC  Selects whether to feed ALUResult, MemOut, IA, or Buttons & Switch as write input for the DA register.
DAW  Whether or not to write to the DA register.
SPDIR  Selects whether to increment (1) or decrement (0) the SP register.
SPW  Whether or not to write to the SP register.
RASRC  Selects whether to feed MemOut or newPC as write input for the RA register.
LCDROW  Selects the row on the LCD display for the cursor to move to.
LCDSTARTADDRESS  Selects the position in the row for the cursor to move to.
LCDMOVECURSOR  Indicates to the LCD driver that the cursor needs to move to another location.
LCDCLEAR  Indicates to the LCD driver that the LCD needs to be cleared.
LCDWRITE  Indicates to the LCD that the lcd_DP needs to be written to the display.
ALUASRC  Selects whether to feed IA, DA, or RA into ALU input A.
ALUBSRC  Selects whether to feed the zero-extended-upper, zero-extended, or sign-extended immediate, or MEM_OUT, into ALU input B.
ALU_LATCHW  Whether or not to write to the ALU Latch.
ADDRSRC  Selects whether to feed IA or SP as memory address.
MEMR  Whether or not to read from the memory unit.
MEMW  Whether or not to write to the memory unit.

 
 
 

Control Unit 
Each instruction executes in three cycles: the fetch, read, and write stages. With the exception of
the fetch stage, where all control signals are predetermined, control signals are derived from bits
in the instruction.
 
The instructions are split into sets of bits that help control determine not only what type of
instruction it is working with, but also the values of the relevant control signals. See the Machine
Language Instruction Types section for more information on these sections, as they will be
referred to by name.
 
For the Read and Write stages, if the instruction prefix is 1X, the signals fall to the Branch (B)
column. Otherwise, if the prefix is 00, they fall to the Inherent (N) column. In all other cases they
fall to the Immediate (I) column. Commas separate wires in a bus, and conditions evaluate to 1 for true
and 0 for false.
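
The column selection rule can be sketched as a small decoder. This assumes the "prefix" is the top two bits of the instruction word (bits 15:14), which is consistent with the assembled example program (la begins 01, pop begins 00, call begins 10); the function name instruction_type is hypothetical.

```python
def instruction_type(word):
    """Classify a 16-bit instruction word as Branch (B), Inherent (N) or Immediate (I)."""
    prefix = (word >> 14) & 0b11   # assumed: the prefix is bits 15:14
    if prefix & 0b10:              # 1X -> Branch column
        return "B"
    if prefix == 0b00:             # 00 -> Inherent column
        return "N"
    return "I"                     # 01 -> Immediate column

# Words taken from the assembled example program.
for text, word in (("la 0", 0b0100101100000000),
                   ("pop", 0b0001111100100000),
                   ("call gcd", 0b1001000000011110)):
    print(text, instruction_type(word))
```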

Fetch Stage 
FEN XX SPW 0
FINV X RASRC X
PCSRC0 0X RAW 0
PCW 0 ALUA XX
IMEMR 1 ALUB XX
IASRC XX ALUOP XXX
IAW 0 ADRSRC X
DASRC XX MEMR 0
DAW 0 MEMW 0
SPDIR X ALU_LATCHW  0 
   
       

   

 
 
 

Opcode Breakdown 
For the next two stages, opcodes play a very important role in the decoding of control signals.
Because there are many inherent type instructions and the immediate type instructions are simple,
we are able to use bits of the instruction directly to determine the appropriate control. These bits
are typically found in the op and grp sections of instructions, as defined in the Machine Language
Instruction Types section.

Inherent Opcodes 
Inherent opcodes, due to their variety, are further broken down into subgroups numbered 0 
through 3. The grp section indicates which group an instruction belongs to. The op breakdown 
for each subgroup is as follows: 
 
Group 00 - Arithmetic/Logic (A/L) 
op[4]  DASRC  The DASRC mux will be set to either 00 or 11; this bit selects which of the two.

op[3]  ALUB  The ALUB mux will be set to either 01 or 11; this bit selects which of the two.

op[2]  MEMR  This bit is the memory read control signal in Read Stage 

op[1]  DAW  This bit is the DA write control signal in Write Stage 

op[0]  MEMW  This bit is the memory write control signal in Write Stage 
 
Group 01 - Stack 
Note: These instructions “cheat” by dipping into the shamt space of the instruction to store 
extra data. 
op[4]  SPDIR/MEMR/MEMW/DASRC  This is a very busy bit. It controls:
  ● The direction of the SP register adder
  ● The control signal for memory read in Read Stage
  ● The inverse of memory write in Write Stage
  ● The bits for DASRC

op[3:2]  ALUA  These bits control the mux into ALU A

op[1]  DAW  This bit controls the write signal to the DA register.

op[0]  IAW  This bit controls the write signal to the IA register.

shamt[3]  RAW  This bit controls the write signal to the RA register.
 
 
 
 
 

Group 10 - The Snowflakes (Snow) 


 
op[4]  PCSRC1  This bit signifies whether PCSRC1 should be on or off 

op[3]  IAW  This bit is the IA write control signal in Write Stage 

op[2]  MEMW  This bit is the memory write control signal in Write Stage 

op[1]  DAW  This bit is the DA write control signal in Write Stage 

op[0]  DASRC  The DASRC mux will be set to either 00 or 01; this bit selects which of the two.
 
Group 11 - LCD 
 
LCD control signals are generated by AND-ing the whole instruction with values hardcoded into 
control, as the control structure for LCD is fairly straightforward. 

Immediate Opcodes 
 
op[3]  IASRC/DASRC/ALUA/IAW/DAW  This bit is a heavy lifter. It controls:
  ● Both bits of IASRC
  ● The inverse of the first bit of DASRC
  ● The second bit of ALUA
  ● The IA write control signal in Write Stage
  ● The inverse of the DA write control signal in Write Stage

op[2:1]  ALUB  These bits control the mux into ALU B

op[0]  IAW/DAW  This bit is the IA write control signal in Write Stage, and its inverse is the DA write control signal in Write Stage
 

Branch/Jump Opcodes 
 
op[2:1]  FEN/PCSRC1  These bits control the FEN signal. When 00 they also control PCSRC1

op[0]  FINV  This bit is the inverse of FINV


 

 
 
 

Read Stage 
Signal      N                                   I            B
FEN         00                                  00           FEN
FINV        0                                   0            !FINV
PCSRC1      Snow: PCSRC1; Others: 0             0            FEN == 00
PCW         DEFAULT: 0
IMEMR       DEFAULT: 0
IASRC       grp*                                IASRC,IASRC  XX
IAW         DEFAULT: 0
DASRC       Snow: 0,DASRC; Others: DASRC,DASRC  DASRC,0      XX
DAW         DEFAULT: 0
SPDIR       Stack: SPDIR; Others: X             XX           XX
SPW         pop**: 1; Others: 0                 DEFAULT: 0   DEFAULT: 0
RASRC       0                                   X            1
RAW         DEFAULT: 0
ALUA        Stack: ALUA; Others: 01             0,ALUA       XX
ALUB        A/L: ALUB,1; Others: XX             ALUB         XX
ALUOP       aluop                               alu,0        XXX
ADRSRC      grp[0]*                             0            XX
MEMR        A/L and Stack: MEMR; Others: 0      0            0
MEMW        DEFAULT: 0
ALU_LATCHW  DEFAULT: 1
 
* the group number doubles as control for the mux into the IA register, and bit 0 serves as the 
ADRSRC control when need be 
** SPW is 1 for pop, popra, and popia 

 
 
 

Write Stage 
If a control signal is not explicitly stated, it should be assumed that it did not change from Read 
Stage. 
Signal      N                            I           B
FEN         RS                           RS          RS
FINV        RS                           RS          RS
PCSRC1      RS                           RS          RS
PCW         DEFAULT: 1
IMEMR       DEFAULT: 0
IASRC       RS                           RS          RS
IAW         Stack, Snow: IAW; Others: 0  IAW         0
DASRC       RS                           RS          RS
DAW         DAW                          DAW         0
SPDIR       RS                           RS          RS
SPW         push**: 1; Others: 0         DEFAULT: 0  DEFAULT: 0
RASRC       RS                           RS          RS
RAW         Stack: RAW; Others: 0        0           call*
ALUA        RS                           RS          RS
ALUB        RS                           RS          RS
ALUOP       RS                           RS          RS
ADRSRC      RS                           RS          RS
MEMR        DEFAULT: 0
MEMW        MEMW                         0           0
ALU_LATCHW  DEFAULT: 0
 
Note: RS = Value from Read Stage 
* only the call instruction turns this bit on 
** SPW =1 for push, pushra, and pushia 

   
 
 
 

C: Testing and Integration 


RTL Testing 

Procedure 
1. Identify the block in the RTL you want to test 
2. Identify the initial conditions of the CPU.  
3. Identify the final conditions that should result from the execution of the instruction 
4. Step through the commands in the RTL chart and record all changes within the CPU 
5. Verify that the final state of the CPU matches the expected final state 
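
The procedure above can also be carried out in software. The sketch below steps through the Addi RTL for one instruction and compares the final state with the expected one. The state layout (a dict with pc, da, ia and mem), the starting values, and the name run_addi are assumptions made for illustration; the instruction word is the addi -1 encoding from the assembled example program.

```python
def sign_extend(imm8):
    """SE: extend an 8-bit immediate to 16 bits, preserving its sign."""
    imm8 &= 0xFF
    return (imm8 | 0xFF00) if (imm8 & 0x80) else imm8

def run_addi(state):
    """Step through the Addi RTL and return the resulting CPU state."""
    s = dict(state, mem=dict(state["mem"]))   # work on a copy of the state
    inst = s["mem"][s["pc"]]                  # Inst = Mem[PC]
    new_pc = (s["pc"] + 1) & 0xFFFF           # newPC = PC + 1
    a = s["da"]                               # A = DA
    b = sign_extend(inst & 0xFF)              # B = SE(inst[7:0])
    s["da"] = (a + b) & 0xFFFF                # DA = A + B
    s["pc"] = new_pc                          # PC = newPC
    return s

# Steps 2 and 3: initial conditions and expected final conditions.
initial = {"pc": 0x0010, "da": 0x0005, "ia": 0x0000,
           "mem": {0x0010: 0b0100110011111111}}   # addi -1
expected = dict(initial, pc=0x0011, da=0x0004)

# Steps 4 and 5: execute the RTL and compare against the expectation.
final = run_addi(initial)
print("match" if final == expected else "mismatch")
```

Comparing whole states rather than single registers also catches unintended side effects, such as an RTL step that accidentally changes IA.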

RTL Markup 
Arithmetic/Logical To DA 
 
Add, And, Or, Sub, Two 
Inst = Mem[PC] //gets the instruction 
newPC = PC +1  //increments the instruction counter 
Op = inst[15:8] //selects operation 
B=Mem[IA] //loads memory from address at IA 
A= DA 
DA = A+B // DA = A&&B // DA = A||B // DA = A-B // DA = two(B) 
PC = newPC 
 
Addm, Andm, Orm, Subm 
Inst = Mem[PC]  //gets the instruction 
newPC = PC+1  //increments the instruction counter 
Op = inst[15:8] //selects ALU operation 
B= Mem[IA] //loads memory from address at IA 
A=DA 
Mem[IA] = A + B // Mem[IA] = A&&B // Mem[IA] = A||B // Mem[IA] = A-B  
PC = newPC 
 
Addi 
Inst = Mem[PC]  //gets the instruction 
newPC = PC+1  //increments the instruction counter 
Op = inst[15:8] //selects the operation  
A=DA //puts DA in the A input of ALU 

 
 
 

B= SE[inst[7:0]] //puts the sign extended immediate in B port of ALU 


DA = A + B //adds the inputs A and B port of ALU and puts result in DA 
PC = newPC 
Andi 
Inst = Mem[PC]  //gets the instruction 
newPC = PC+1  //increments the instruction counter 
Op = inst[15:8] //selects the operation 
A = DA //puts DA in the A input of ALU 
B = SE[inst[7:0]] //puts the sign extended immediate in B port of ALU
DA = A AND B //Ands the inputs A and B port of ALU and puts result in DA 
PC = newPC 
Ori 
Inst = Mem[PC]  //gets the instruction 
newPC = PC+1  //increments the instruction counter 
Op = inst[15:8] //selects the operation 
A=DA //puts DA in the A input of ALU 
B= SE[inst[7:0]] //puts the sign extended immediate in B port of ALU
DA = A OR B //ors the inputs A and B port of ALU and puts result in DA
PC = newPC 
Sll,Srl 
Inst = Mem[PC] //gets the instruction  
newPC = PC+1  //increments the instruction counter 
Op = inst[15:8] //selects the operation 
A = DA //puts DA in the A input port of ALU 
B = inst[4:0] //puts the immediate in B port of ALU
DA = A << B // DA = A >> B //shifts the DA register left (Sll) or right (Srl)
PC = newPC 
Lui 
Inst = Mem[PC] //get the instruction 
newPC = PC+1 //increments the instruction counter 
Op = inst[15:8] 
DA = ZEu[inst[7:0]] //Zero extend upper adds zeros for bits [7:0] of immediate  
PC = newPC  
Li 
Inst = Mem [PC] //gets the instruction from memory 
newPC = PC+1 //increments the instruction counter 
Op = inst[15:8] //selects the load immediate operation 
B =SE[inst[7:0]] //sign extends the immediate 
DA = B //puts B into the DA register  
PC = newPC  
 
Load Word 
inst = Mem[PC] //gets the instruction 

 
 
 

newPC = PC+1  //increments the instruction counter 


op = inst[15:11] //selects the load word operation 
DA = Mem[IA] //loads the memory at the address in the IA register 
PC = newPC 
Store Word 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
op = inst[15:11] //selects the store word operation 
Mem[IA] = DA //stores DA in memory at the address in the IA register
PC = newPC 
 
Bnez, Bez, Bnc, bc 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
If <flag> //checks a flag and changes operation based on flag value 
PC = PC[15:12] concat inst[11:0] //sets PC equal to the top 4 bits of PC 
concatenated with the lower 12 bits of the instruction 
Else 
PC = newPC //otherwise PC is incremented by 1 
 
LCDwr 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
DP = DA //writes the value to the DP register 
PC = newPC 
Buttr 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
DA = BS //takes the button and switch inputs and puts them in DA 
PC = newPC 
 
LCDclr 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
CLEAR = 1 //tells the LCD to clear 
PC = newPC 
 
LCDmc 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
row = inst[5] //sets the row input to the lcd_control_master in the LCD_Driver 
Startaddress = inst[4:0] //sets the start address input wire to lower 5 bits of imm
MoveCursor = 1 //tells the lcd_control_master to update 

 
 
 

PC = newPC 
 
Push 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
op = inst[15:11] //selects the push instruction 
newSP = SP-1  //moves the SP register down 1 
SP = newSP 
Mem[SP] = DA //puts DA onto the stack 
PC = newPC 
Pop
inst = Mem[PC] //gets the instruction
newPC = PC+1  //increments the instruction counter
op = inst[15:11] //selects the pop operation
DA = Mem[SP] //loads the value at the top of the stack into DA
newSP = SP+1  //moves the stack pointer up 1
SP = newSP
PC = newPC
Iap 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
IA = IA + SE(inst[7:0]) //increments the IA register by a sign extended immediate value
PC = newPC 
 
La 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
IA = ZE(inst[7:0]) //loads a zero extended immediate into IA 
PC = newPC 
 
Oua  
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
IA = IA | ZEU(inst[7:0]) //ors the IA register with a zero extended upper immediate value
PC = newPC 
 

Jump
inst = Mem[PC] //gets the instruction
newPC = PC+1  //increments the instruction counter
PC = PC[15:12] concat inst[11:0] //concatenates the top 4 bits of PC to the lower 12 bits of an immediate
 
Call 
inst = Mem[PC] //gets the instruction 

 
 
 

newPC = PC+1  //increments the instruction counter 


RA = PC //stores PC in the RA register 
PC = PC[15:12] concat inst[11:0]  //concatenates the top 4 bits of PC to the lower 12 
bits of an immediate 
 
Ret 
inst = Mem[PC] //gets the instruction 
newPC = PC+1  //increments the instruction counter 
PC = RA //restores the PC from the RA 

Unit Test Plan 


To ensure accurate and efficient operation of our processor, every component must be tested.
Parts like muxes will be tested exhaustively because of their restricted inputs and outputs, while
components such as registers will be tested extensively but not exhaustively, to save testing
time while still covering edge cases.
 
In addition to testing the basic functionality of the components, the amount of time needed to run
each operation will be recorded to assist with setting the proper clock cycle lengths later in the
project.
 
Components to Test: 
PC Select Mux 
PC adder 
Instruction Memory 
IA Mux 
DA Mux 
SP Incrementer 
RA Mux 
ALU MuxA 
ALU MuxB 
Addr Mux 
IA Register
DA Register 
SP Register 
RA Register 
ALU 
Memory Unit 
LCD Driver 
Zero Extenders(tested during integration due to it being a wire operation) 
Sign Extenders(tested during integration due to it being a wire operation) 
Zero Upper Extenders(tested during integration due to it being a wire operation) 

 
 
 

(Registers between cycles) 

Unit Testing Procedure: 


1. Choose the basic component to test. 
2. Discern which testing subset it belongs to based on breakdown below 
3. Determine the expected inputs, control signals, and outputs 
4. Use the tables below to create a testbench 
a. For ALUs, determine the operations the ALU is capable of performing and test each one
b. The tables are not meant to be the only tests run on the testbench; they only
serve as examples of tests that the user can run to confirm proper operation
5. Run the testbench 
6. Check with the tables below or use common sense to check validity of output data 
 
Due to the similarity in operation between a number of the components, the components will be
subdivided as follows:
 
Muxes:
IA Mux (2:4)
DA Mux (2:4)
PC Select Mux (2:4)
ALU MuxA (2:4)
ALU MuxB (2:4)
RA Mux (1:2)
Addr Mux (1:2)
 
Registers:
IA Register
DA Register
SP Register
RA Register
PC Register
 
ALUs:
The ALU
PC Incrementer
Sign Extender
Zero Extender
Zero Upper Extender
SP Incrementer
 
Others:
LCD Driver
Memory Unit
Instruction Memory
 

Tables for Unit Testing 

Muxes 
4 Input, 2 Bit Control Mux 
 

 
 
 

Will take in 3 or 4 inputs of varying lengths and output 1 value based on selection by control 
bits. 
The tester will ensure that result of testing corresponds with table and diagram. 

 
Control Bit Signals  Mux Output 

0  0  A 

0  1  B 

1  0  C 

1  1  D 
 
 
2 Input 1 Bit Control Mux 
Will take in 2 inputs of varying lengths and output 1 value based on selection by control bits. 
The tester will ensure that result of testing corresponds with table and diagram 

 
 
 

 
Control Bit Signal  Mux Output 

0  A 

1  B 
 

Registers 
The purpose of registers in this design is to store data for future use. To function properly, a
register must be able to have an input written to it and to output the value stored in it. All of the
registers in our design need to support this functionality for data up to 16 bits wide.
 
16 Bit Registers 
A 16-bit register writes its input to its storage when the control signal is 1 and continually
outputs the value stored in it. As the register needs to be able to store any 16-bit value, an effective
testbench will test several 16-bit values. The intermediate registers are essentially multiple
single 16-bit registers combined. As a result, they will be tested in the same way as the single
16-bit registers.
The table below offers examples of 16-bit values to test.

 
 
 

 
 
16 bit data input  Register Control Signal  Output  

0000 0000 0000 0000  1  0000 0000 0000 0000 

1111 1111 1111 1111  0  Previous register value 

0101 0101 0101 0101  1  0101 0101 0101 0101 

1010 1010 1010 1010  0  Previous register value 

1001 0110 1001 0110  1  1001 0110 1001 0110 
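
The rows of the table above can be replayed against a simple behavioural model before the hardware testbench exists. The class Register16 and its clock method are hypothetical names for this sketch; the behaviour modelled is only what the register specification states (write when the control signal is 1, otherwise hold).

```python
class Register16:
    """Behavioural model of a 16-bit register with a write-enable input."""

    def __init__(self, value=0):
        self.value = value & 0xFFFF

    def clock(self, data_in, write_enable):
        """Apply one clock edge and return the register output."""
        if write_enable:
            self.value = data_in & 0xFFFF   # overwrite only when enabled
        return self.value                   # output always reflects the contents

# (data input, control signal) pairs in the same order as the table.
rows = [(0x0000, 1), (0xFFFF, 0), (0x5555, 1), (0xAAAA, 0), (0x9696, 1)]
reg = Register16()
outputs = [reg.clock(data, enable) for data, enable in rows]
print([format(o, "04X") for o in outputs])
```

Rows with the control signal at 0 should repeat the previous output, which is how the "Previous register value" entries in the table are checked.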

ALUs 
The tables below are used to test the ALUs present in our design. To make it easier to find
specific components for testing, single operation components are included in this section.
 
Single Operation ALU 1 Input
This includes the extenders and incrementers, each of which performs a single operation.
Each operation in the table below corresponds to one of the single operators used in
the processor. When testing single operation ALUs, use the table as a reference.
 
Single Operation ALU  Data Input           Operation          Control Signal Received  Data Output
Sign Extender         1111 0000            Sign Extend        N/A                      1111 1111 1111 0000
                      0111 0000            Sign Extend        N/A                      0000 0000 0111 0000
Zero Extender         1111 0000            Zero Extend        N/A                      0000 0000 1111 0000
                      0111 0000            Zero Extend        N/A                      0000 0000 0111 0000
Zero Extender Upper   1111 0000            Zero Extend Upper  N/A                      1111 0000 0000 0000
                      1111 0001            Zero Extend Upper  N/A                      1111 0001 0000 0000
PC Incrementer        1111 0001 0000 1111  Increment          N/A                      1111 0001 0001 0000
                      0000 0000 0000 0000  Increment          N/A                      0000 0000 0000 0001
SP Incrementer        1111 0000 0001 1111  Increment          1                        1111 0000 0010 0000
                      1111 0000 0001 0000  Decrement          0                        1111 0000 0000 1111
 
The ALU 
 
A input  B input  Operation  Control Signal Input  Output  Carry Flag Output
0x0001   0x0001   add        0b000                 0x0002  0x0
0xFFFF   0x0001   add        0b000                 0x0000  0x1
0x0001   0x0011   sub        0b001                 0xFFF0  0x1
0x1111   0x1110   sub        0b001                 0x0001  0x0
0x0FFF   0xFEF1   and        0b110                 0x0EF1  x
0x8787   0x7878   or         0b100                 0xFFFF  x
0x4044   0x0002   sll        0b011                 0x0110  x
0x2222   0x0002   sll        0b011                 0x8888  x
0x8888   0x0002   srl        0b101                 0x2222  x
0x4371   0x0002   srl        0b101                 0x10DC  x
0x4371   0xXXXX   pass       0b111                 0x4371  x
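
A reference model is a convenient way to generate expected values for the ALU testbench. The sketch below is a behavioural model only, with a hypothetical function name alu. It assumes the ALUOp encodings listed in the table, treats the carry output of sub as a borrow flag (which is what the table rows imply), and returns None for the carry wherever the table marks it as a don't-care.

```python
MASK16 = 0xFFFF

def alu(a, b, op):
    """Return (result, carry) for the 16-bit ALU; carry is None where it is a don't-care."""
    a &= MASK16
    b &= MASK16
    if op == 0b000:                          # add
        total = a + b
        return total & MASK16, total >> 16
    if op == 0b001:                          # sub: carry acts as a borrow flag
        return (a - b) & MASK16, int(a < b)
    if op == 0b110:                          # and
        return a & b, None
    if op == 0b100:                          # or
        return a | b, None
    if op == 0b011:                          # sll (shift amount limited to 5 bits)
        return (a << (b & 0x1F)) & MASK16, None
    if op == 0b101:                          # srl
        return a >> (b & 0x1F), None
    if op == 0b111:                          # pass input A through
        return a, None
    raise ValueError(f"unused ALUOp {op:#05b}")

# A few rows from the table: (A, B, ALUOp).
for a, b, op in ((0xFFFF, 0x0001, 0b000), (0x0001, 0x0011, 0b001), (0x4044, 0x0002, 0b011)):
    result, carry = alu(a, b, op)
    print(format(result, "04X"), carry)
```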


 

 
 
 

Other Components 
Consists of parts that are unique and that cannot be easily grouped together 
 

Memory Unit 
The memory unit is used to access and store all needed values from memory. For it to work 
properly it must receive a 16 bit data address, control signals, a 16 bit input, and output a 16 bit 
value. The below table outlines a basic test of the memory unit. 
 
 
Address Input  Data Input  MemW (control signal)  MemR (control signal)  Output
0xFEDC         0xXXXX      0                      1                      Mem[0xFEDC]*
0xF3A7         0x4782      1                      0                      No output (data input is stored at the address input in memory)
 
*Mem[address] represents the 16 bit value stored at the given address in memory.
 

Instruction Memory 
The instruction memory stores the instructions the control unit needs to perform its operations.
The memory is addressed by the 16 bit value of the PC. A control signal called IMemR
controls whether or not the instruction memory outputs data.
 
 
Address Input  IMemR (control signal)  Output
0x1234         1                       Control located at 0x1234
0xFDEC         0                       0x0000
0xFDEC         1                       Control located at 0xFDEC

 
 
 

Integration Plan 
In order to successfully integrate our components, we will follow the 3 step plan outlined below. 
The general testing procedure for each set of components is to iterate through the applicable 
permutations in the control signals, and compare the expected state after each permutation 
with the actual state. Permutations may be tested with multiple starting states if there are 
anticipated edge cases we would like to cover. 

Step 1: Small Subsystems 


The PC Subsystem 
This subsystem consists of the PC register, an adder, a mux for PC source, an OR unit, and two 
different zero extension units. For the purposes of the test, the instruction will be fed in on a 
wire. 

 
We are asserting that the value in the PC register is the expected value, and will be ignoring the 
cases where PC W is set to off, as the writing capabilities should have been tested at a previous 
layer. 
 
   

 
 
 

 
Control    Starting State    Result 

PC Src  PC W    Inst  RA  PC    Expected PC 

00  1    0x1234  0x4848  0x2222    0x2223 

01  1    0x1234  0x4848  0x2222    0x2234 

10  1    0x1234  0x4848  0x2222    0x4848 
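
The expected values in this table follow directly from the RTL, and a short model makes that explicit. The function name next_pc is hypothetical, and the PC Src encodings (00 = PC+1, 01 = PC[15:12] || inst[11:0], 10 = RA) are read off the table rows rather than taken from a separate specification.

```python
def next_pc(pc_src, pc, inst, ra):
    """Model the value written to the PC register for a given PC Src setting."""
    if pc_src == 0b00:
        return (pc + 1) & 0xFFFF                  # newPC = PC + 1
    if pc_src == 0b01:
        return (pc & 0xF000) | (inst & 0x0FFF)    # PC[15:12] || inst[11:0]
    if pc_src == 0b10:
        return ra & 0xFFFF                        # PC = RA
    raise ValueError("PC Src 11 is not used in the table")

# Starting state shared by every row of the table.
inst, ra, pc = 0x1234, 0x4848, 0x2222
for pc_src in (0b00, 0b01, 0b10):
    print(format(pc_src, "02b"), format(next_pc(pc_src, pc, inst, ra), "04X"))
```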

 
The ALU Subsystem 
This subsystem consists of an ALU, a mux for ALU input A, and a mux for ALU input B. For the 
purposes of this test, the values of the IA, RA, and DA register as well as the instruction will be 
fed in on wires. 
 

 
 
We are asserting that the value of ALUOut is the expected value. 
 
Note: Instead of explicitly stating the ALUCtrl value, we list the operations we expect to occur at 
each permutation of ALU A. This is both for the sake of brevity and maintainability. 
 
   

 
 
 

 
Control                                     Starting State                    Result
ALU A  ALU B  ALU OP                        DA      IA      RA      Inst      ALU Out
00     00     add, sub, shift, pass         0x1540  0x2471  0x0254  0x4488    0x1540 OP 0x8800
00     01     add, sub, shift, pass         0x1540  0x2471  0x0254  0x4488    0x1540 OP 0x0088
00     10     add, sub, shift, pass         0x1540  0x2471  0x0254  0x4488    0x1540 OP 0xFF88
01     00     add, sub, and, or, shift, pass  0x1540  0x2471  0x0254  0x4488  0x2471 OP 0x8800
01     01     add, sub, and, or, shift, pass  0x1540  0x2471  0x0254  0x4488  0x2471 OP 0x0088
01     10     add, sub, and, or, shift, pass  0x1540  0x2471  0x0254  0x4488  0x2471 OP 0xFF88
10     00     pass                          0x1540  0x2471  0x0254  0x4488    0x0254
10     01     pass                          0x1540  0x2471  0x0254  0x4488    0x0254
10     10     pass                          0x1540  0x2471  0x0254  0x4488    0x0254


 
The Stack/Program Memory Subsystem 
This subsystem consists of a program memory unit, the SP register, and an adder. For the 
purposes of this test, the memory unit will be pre-populated in such a way that the value stored 
at an address is the square of the address. This scheme should be sufficient for the scope of 
the tests. 
 

 
 
We are asserting that the value read out of memory is the expected value. Memory reading is 
always set to on, and memory writing is always set to off. 

 
 
 

 
Control         Starting State  Result
SP Inc  SP W    SP              Expected Mem Out
0       1       0x0020          0x0100
1       1       0x0020          0x0900
0       0       0x0020          0x0400
1       0       0x0020          0x0400

Step 2: Registers, the ALU, and Program Memory 


Processor Core 
The next step in the integration process is assembling the core of the processor. This consists 
of the IA, RA, and DA registers, the ALU subsystem, and the program memory unit. For the 
purposes of these tests we will be ignoring the SP unit, as its operation has been tested and it 
does not affect the processor core. Its functionality will be further tested in step 3. 
 

 
 
In previous steps we tested the connections from the IA, RA, and DA registers and inst into the 
ALU, so we will be ignoring those tests for this step. We will also be ignoring the memory unit, as 
its functionality should have already been tested. Instead we will be monitoring the writing back 
to the IA, RA, and DA registers. As such, the write control signals for all registers will be set to on 
by default. 
 
 
 

 
We will be asserting that the end state of the IA, RA, and DA registers is as expected. This will be 
checked by eye or with a very careful script, so the table below is merely for reference. 
 
Control                                          Starting State              Ending State
IASRC  DASRC  ALU OP                             IA      DA      RA          IA       DA       RA
00     00     add, sub, shift, pass              0x1540  0x2471  0x0254      ALU Out  ALU Out  Mem Out
00     01     add, sub, shift, pass              0x1540  0x2471  0x0254      ALU Out  0x2471   Mem Out
00     10     add, sub, shift, pass              0x1540  0x2471  0x0254      ALU Out  0x1540   Mem Out
00     11     add, sub, shift, pass              0x1540  0x2471  0x0254      ALU Out  Mem Out  Mem Out
01     00     add, sub, and, or, shift, pass     0x1540  0x2471  0x0254      Mem Out  ALU Out  Mem Out
01     01     add, sub, and, or, shift, pass     0x1540  0x2471  0x0254      Mem Out  0x2471   Mem Out
01     10     add, sub, and, or, shift, pass     0x1540  0x2471  0x0254      Mem Out  0x1540   Mem Out
01     11     add, sub, and, or, shift, pass     0x1540  0x2471  0x0254      Mem Out  Mem Out  Mem Out
10     00     pass                               0x1540  0x2471  0x0254      0x2471   ALU Out  Mem Out
10     01     pass                               0x1540  0x2471  0x0254      0x2471   0x2471   Mem Out
10     10     pass                               0x1540  0x2471  0x0254      0x2471   0x1540   Mem Out
10     11     pass                               0x1540  0x2471  0x0254      0x2471   Mem Out  Mem Out
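A "very careful script" for this table could start from a reference model of the write-back muxes. The sketch below is inferred purely from the values in the table (for example, that IASRC 10 selects the old DA value and DASRC 10 selects the old IA value); the function name `write_back` and the placeholder ALU and memory outputs are our own assumptions.

```python
def write_back(ia_src, da_src, ia, da, ra, alu_out, mem_out):
    """Return the (IA, DA, RA) end state with all register writes enabled.

    Mux source assignments are inferred from the reference table, not from
    the datapath schematic. The old RA value is unused because the table
    shows RA always loading Mem Out.
    """
    ia_sources = {0b00: alu_out, 0b01: mem_out, 0b10: da}
    da_sources = {0b00: alu_out, 0b01: da, 0b10: ia, 0b11: mem_out}
    return ia_sources[ia_src], da_sources[da_src], mem_out

# Placeholder values standing in for "ALU Out" and "Mem Out".
ALU_OUT, MEM_OUT = 0xAAAA, 0xBBBB
START = (0x1540, 0x2471, 0x0254)  # IA, DA, RA starting state from the table

for ia_src in (0b00, 0b01, 0b10):
    for da_src in (0b00, 0b01, 0b10, 0b11):
        new_ia, new_da, new_ra = write_back(ia_src, da_src, *START, ALU_OUT, MEM_OUT)
        print(f"IASRC {ia_src:02b} DASRC {da_src:02b}: "
              f"IA=0x{new_ia:04X} DA=0x{new_da:04X} RA=0x{new_ra:04X}")
```

Comparing this output against the registers read from simulation replaces the by-eye check with an equality test per row.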

Step 3: The Big Kahuna 


This is the final stage in integration, putting everything together. Writing out a table for this 
testing procedure would be extensive and unmaintainable, so the idea is as follows. 
 
For each instruction:
1. Convert the instruction to machine code.
2. Update the MIF file in instruction memory with the new instruction(s).
3. Run the processor until the instruction has completed.
4. Check the state of the processor to confirm that the operation was performed correctly.
5. If any errors are found, perform additional undocumented testing to isolate the root of
the error.
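Step 2 can be automated with a small generator for the memory initialization file. The sketch below emits the standard MIF layout for a list of already-assembled 16-bit words. The function name `to_mif`, the depth of 256 words, and the example words are assumptions for illustration; the real words come from our assembler.

```python
def to_mif(words, depth=256, width=16):
    """Render machine-code words as MIF text, zero-filling unused addresses."""
    if len(words) > depth:
        raise ValueError("program does not fit in instruction memory")
    digits = (width + 3) // 4                      # hex digits per word
    lines = [
        f"DEPTH = {depth};",
        f"WIDTH = {width};",
        "ADDRESS_RADIX = HEX;",
        "DATA_RADIX = HEX;",
        "CONTENT",
        "BEGIN",
    ]
    for addr, word in enumerate(words):
        if not 0 <= word < (1 << width):
            raise ValueError(f"word at address {addr} does not fit in {width} bits")
        lines.append(f"{addr:X} : {word:0{digits}X};")
    remaining = depth - len(words)
    zero = f"{0:0{digits}X}"
    if remaining == 1:
        lines.append(f"{depth - 1:X} : {zero};")
    elif remaining > 1:
        lines.append(f"[{len(words):X}..{depth - 1:X}] : {zero};")
    lines.append("END;")
    return "\n".join(lines) + "\n"

# Placeholder words only; these are not real encodings of our instructions.
mif_text = to_mif([0x1234, 0xABCD], depth=4)
print(mif_text)
```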
 
 

 
 
 

(All numbers are in decimal unless otherwise specified) 


Test Basic Immediate Instructions: 
 
Program         Result
addi 5          Should see 5 in the DA register

addi 5          Should see 13 in the DA register
ori 12

iap 1           Should see 1 in the IA register


   
Test Basic Arithmetic Instructions 
 
Program         Result
addi 5          Should see 13 in the DA register
ori 12          Should see 13 in address 0 of program memory
sw

addi 5          Should see 0 in the DA register
ori 12          Should see 13 in address 1 of program memory
iap 1           Should see 0 in the IA register
sw
iap -1
lw

addi 5          Should see 25 in the DA register
ori 12          Should see 38 in address 1 of program memory
iap 1           Should see 1 in the IA register
sw
iap -1
lw
iap 1
li 25
addm
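The expected results above can be cross-checked against a tiny software model of the accumulator. This is a sketch, not our processor: the instruction behavior is inferred from the expected results in these tables (in particular, that addm writes DA plus the addressed word back to memory and leaves DA unchanged), and registers and memory are assumed to start at zero.

```python
def run(program):
    """Execute a list of assembly lines; return (DA, IA, memory dict)."""
    da, ia, mem = 0, 0, {}
    for line in program:
        op, *args = line.split()
        imm = int(args[0]) if args else 0
        if op == "addi":
            da = (da + imm) & 0xFFFF            # DA += immediate
        elif op == "ori":
            da = (da | imm) & 0xFFFF            # DA |= immediate
        elif op == "li":
            da = imm & 0xFFFF                   # DA = immediate
        elif op == "iap":
            ia = (ia + imm) & 0xFFFF            # IA += immediate
        elif op == "sw":
            mem[ia] = da                        # memory[IA] = DA
        elif op == "lw":
            da = mem.get(ia, 0)                 # DA = memory[IA]
        elif op == "addm":
            mem[ia] = (mem.get(ia, 0) + da) & 0xFFFF  # memory[IA] += DA
        else:
            raise ValueError(f"instruction not modeled: {op}")
    return da, ia, mem

PROGRAM = ["addi 5", "ori 12", "iap 1", "sw", "iap -1", "lw",
           "iap 1", "li 25", "addm"]
da, ia, mem = run(PROGRAM)
print(f"DA={da} IA={ia} mem[1]={mem[1]}")
```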
 
Test Basic Stack Operation 
  
Program         Result
addi 5          Stack Pointer is at FFFE
ori 12          Stack has 13 at location FFFF
iap 1           DA should have 13 in it
sw
push
iap -1
lw
iap 1
li 25
addm
pop
push
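The stack results can be traced with a minimal model of push and pop. The sketch assumes what the expected results imply: SP starts at 0xFFFF, points at the next free slot, and the stack grows downward by one word per push. The class name `Stack` is ours.

```python
class Stack:
    """Downward-growing word stack; SP points at the next free slot (assumed)."""

    def __init__(self, top=0xFFFF):
        self.sp = top
        self.mem = {}

    def push(self, value):
        self.mem[self.sp] = value & 0xFFFF   # write at SP, then move SP down
        self.sp = (self.sp - 1) & 0xFFFF

    def pop(self):
        self.sp = (self.sp + 1) & 0xFFFF     # move SP up, then read
        return self.mem.get(self.sp, 0)

stack = Stack()
da = 13            # DA after addi 5 / ori 12
stack.push(da)     # push
da = 25            # DA after li 25
da = stack.pop()   # pop restores 13 into DA
stack.push(da)     # final push
print(f"SP={stack.sp:04X} stack[FFFF]={stack.mem[0xFFFF]} DA={da}")
```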
 

   

 
 
 

D: Design Process Journal 


Milestone 1 
Meeting Monday, January 8 
Members Present: Thad, Evë, Matthew, Ian 
 
At this meeting we decided to build an accumulator-style processor. This is a pattern that we 
are all mostly familiar with, and will fit well within the requirements of the project. As there is an 
implicit register in every instruction, the capabilities of the 16-bit data requirement can be 
maximized, leaving room for interesting optimizations. 
 
We proceeded to outline commands that would be necessary / useful for programming. These 
are: 
 
sll 
srl 
andi 
and 
or 
ori 
lui 
sw 
sm0 (store into memory address 0 - for indirect addressing) 
lw 
subi 
add 
addi 
sub 
bnez 

 
sw and lw work by using something (a special register or memory address 0) to provide the
index value. We still aren’t sure whether to implement jal. We need some way of keeping track
of the return address though.
 
No work items came as a result of this meeting. 

First Meeting Wednesday, January 10 


Members Present: All 
 
At this meeting we settled the procedure call convention for our processor. As the bulk of our
knowledge about accumulator processors comes from the PIC, this was uncharted territory and
caused quite a bit of kerfuffle. Eventually we decided on putting the majority of the procedure
responsibilities on the caller, and passing arguments and return values on the stack. This allows
us to maintain an accumulator-style low register count, while expanding on the capabilities of
processors like the PIC.
 
We also decided how we would handle storing local values. Our options were to store them on
the stack and create commands that let you access more than just the top word, or put them
into memory and force the programmer to back up what they wanted to persist before handing
over the reins to other procedures. We chose the latter, both because it preserved the integrity
and concept of the stack, and because the former would have required instructions that
performed operations on elements of the stack. As we already have instructions that perform
operations on registers or memory, adding a third variation of an instruction would both
increase our opcodes and add unnecessary complexity.
 
Sometime between now and our next meeting (in approximately 4 hours’ time):
● Ian will examine the instructions we’ve laid out and will try to condense them into 
opcodes/instruction types 
● Evë will update the design process journal 
● Thad will translate the example program into our assembly and determine if our new 
instruction set is feasible 
● David, Matthew, and Evë will write some code snippets for the ‘Common Operations’ 
section 

Second Meeting Wednesday, January 10 


Members Present: Thad, Evë, Matthew, Ian 
 
At this meeting we aimed to finish the remainder of Milestone 1. This involved nailing down 
instruction formats and opcodes, as well as addressing modes. 
 
The main question we had was whether we needed jump register, and whether we could turn
jump into an inherent instruction by using the address stored in the IA register. We ended up
deciding that jump would include a 12-bit label or immediate, which would be extended with
the top 4 bits of PC.
 
We also assembled the assembly into machine code. 
 
No work items came as a result of this meeting. 

 
 
 

Milestone 2 
Impromptu Meeting Thursday, January 11 
Members Present: Thad, Evë, Matthew, Ian
 
At this meeting we debriefed the design meeting we just had with Micah. 
 
We discussed adding a register to store carry, overflow, and perhaps negative flags. Though 
having a status register ensures that the data persists, we decided to instead very carefully set 
up the datapath to render the register irrelevant. This is because we do not want to support a 
scenario in which we let the programmer set any of the flags or view them for an extended 
period of time. In the event we decide to handle interrupts, we will revisit this decision. 
 
An interesting idea that came up in the meeting with Micah was to have a chunk of memory 
dedicated to local values. This solves a lot of our problems with programming recursion and 
preserves the integrity of our design, so we decided to move forward with that in mind. 
 
Though we discussed condensing opcodes, we would like to wait to do some analysis on which 
instructions get used at what frequency so that we can do proper encoding using a system like 
Huffman coding. 
 
We then briefly discussed the idea of replacing ret with jr (jump register), before Thad threw out 
the ridiculous idea of creating an instruction that pops the stack directly into PC. Though this 
could have been a productive discussion we quickly moved on. 
 
At this point we noticed that we’d effectively sucked ourselves into making a multi-cycle 
processor. Though not intended, the 4 of us are on board with the idea. 
 
Work completed during the meeting: 
● Evë: Updated process journal 
● Ian: Updated design document with opcodes and instruction types 
● Thad: Updated example program 
● Matthew: Updated design document with opcodes and instruction types 
 

Meeting Friday, January 12 


Members Present: Thad, Evë, Matthew, Ian, and David 
 
In this meeting we brought David up to speed with our design. 

 
 
 

We began looking at the milestone two requirements. 


We made a final verification that our processor design was what we wanted going forward.
Grouped our instructions into categories that share the same RTL implementation.
Decided we would finalize opcode assignment when beginning work on the control.
Created a chart to organize the RTL.
 
Work completed during the meeting: 
● Evë: left early 
● Ian: Updated process journal/ helped organize instructions into RTL categories 
● Matthew: helped organize instructions into RTL categories and identify signals required 
for RTL 
● Thad: Added bneg and bpos commands and rearranged branch/jump opcodes. 
● David: Made a summary chart for the RTL to better represent the 4 stages of a multi-cycle
processor
 
Work to complete before next meeting: 
The list of RTL commands was split up between the group to work on before the next meeting. 
Evë: Stack commands (including CYA/RYA), and load/store word 
Ian: Arithmetic Memory/DA commands 
Thad: IA commands, and branch/jump commands 
Matthew: LCD and BS commands 
David: Arithmetic/Logic Immediate commands 

Meeting Wednesday, January 17 


Members Present: Thad, Evë, Matthew, Ian, David 
 
At this meeting we aimed to complete the work remaining for Milestone 2. Which we did! 
 
We decided to break up CYA and RYA into more instructions to simplify control logic. pushRA,
pushIA, popRA, and popIA are the result (push and pop already work with DA). This also
means that if the programmer doesn’t want to back up RA or IA, they don’t need to.
 
Work completed during the meeting: 
● Evë: Converted written RTL into tables, added changelog, formatted document 
● Thad: Converted written RTL into tables, added components to list 
● Ian: Converted written RTL into tables, added components to list 
● Matthew: Converted written RTL into tables, tested RTL 
● David: Created list of components 

 
 
 

Milestone 3 
Meeting Sunday, January 21 
Members Present: Matthew, Thad, Evë, Ian  
 
At this meeting we recapped our Friday discussion with Micah, as well as delegated lab work. 
Since we will not be using interrupts for our games, we chose to do labs 6 and 7. David will work 
on lab 6, and Matthew will work on lab 7. 
 
One of the biggest concerns from the meeting was our RTL tables. We spent some time
condensing and refactoring our tables.
 
We also drew out our initial datapath. This was done on a whiteboard, so after this meeting 
someone will have to reconstitute it in software like Visio. We are currently in a race to see who 
can download it the fastest. As we are all in F217, the prospects do not look good for any of us. 
Ian may have to restart his computer. 
 
A key decision we made in the datapath was to have the RA, IA, and DA registers mux into an 
ALU input. We decided to go this route because there is already a mux on the B input, so adding 
one to the A input does not change cycle time. 
 
As a very important side note, Evë won the Visio installation battle. 
 
Work to complete before next meeting: 
● Matthew has the honor of copying the diagram into Visio 
● David will work on Lab 6 
● Thad will update the RTL and describe the control signals 
● Evë will start the integration plan 
● Ian will design the unit tests and update the component list 

Meeting Tuesday, January 23 


Members Present: Matthew, Ian, Evë, Thad, David 
 
At this meeting we finished the tasks we were meant to complete before the meeting. Actually, 
Matthew had finished copying the design into Visio, but we needed to add some components. 
 
We discussed some formatting standards, and had a brief discussion about initializing the 
processor. We decided to cross that bridge later.  
 

 
 
 

Milestone 4 
Meeting Thursday, January 25
Members Present: Matthew, Ian, Evë, Thad, David 
In addition to assignments to complete Labs 6&7…. 
 
Evë: Optimize opcodes, make memory unit testbench 
Thad: Fix the RTL again & purge document of intermediate registers, fix Micah comments, make 
Mux testbenches. 
Matt: Get rid of intermediate registers from the datapath diagram. Start putting together ALU 
control and ALU control codes (if those exist at all). 
Ian: Make register testbench 
David: Make ALU testbench 

Meeting Sunday, January 28 


Members Present: Matthew, Ian, Evë, Thad 
At this meeting we finished the tasks we were assigned last meeting, and redistributed other 
tasks. 
 
Evë: Designed opcodes for inherents 
Thad: Designed opcodes for branch and immediates 
Ian: Ensured the test benches that tested the registers, adders, and muxes worked.
Matthew: Updated schematic and implemented components 
 
Assignments: 
Evë: Finish designing inherent control + add opcode logic to design doc 
Thad: Fix the design doc (add/remove instructions + fix opcodes) 
Ian: Check in all test benches 
Matthew: Check in all components 
David: Make ALU 
 
 
 
 
 
 
 
 
 

 
 
 

Milestone 5 
 

Meeting Monday During Class 2/5/2018 


Members Present: Thad, David, Matthew, Ian 
Thad: Fixing RTL documentation 
David/Matthew/Ian: Write testbenches 
 

Meeting Monday Evening 2/5/2018 


Members Present: Matthew, Evë (Thad for 5 minutes) 
Wrote control using Verilog if/case statements and reorganized tables to be more sensible. Also
clarified with Thad what was going on with the PCSRC control signal(s).

Meeting Tuesday During Class 2/6/2018 


Members Present: Thad, David, Matthew, Ian, Evë 
Thad: Finished assembler now that we know opcodes 
David/Matthew/Ian: Write testbenches 

Meeting Wednesday During/After Class 2/7/2018 


Members Present: Thad, David, Matthew, Ian, Evë 
Thad: Wrote testbench for program memory. 
David/Matthew/Ian: Wrote testbenches. Wrote a basic control testbench (includes an example of
each instruction type) and did some debugging of the control unit to make it work.
Later, wrote a testbench for the entire processor. In it, though, the PC was not incrementing
properly. Still not sure why.
 

Meeting Monday During Class 2/12/2018 


Members Present: Thad, David, Matthew, Ian, Evë 
Ian and Matt explained changes they had made over the weekend to the control, datapath, and 
opcodes to allow for more of the instructions to work. Discussed as a group how to make the 
control more readable to people outside of the team.  
Ian: Ran tests on individual instructions and checked the outputs to make sure they were 
functioning correctly. 
Matthew and David: Helped with debugging the processor 
Thad and Evë: Worked on fixing up our control documentation 
 
 
 
 

Meeting Tuesday During Class 2/13/2018 


Members Present: Thad, David, Matthew, Ian, Evë 
As a group we: 
Worked on debugging the branch instructions. Corrected errors in the control documentation and
the control Verilog file.
Ian was given the task of testing each instruction before Wednesday.
 

Meeting Wednesday During and After Class 2/14/2018 


Members Present: Thad, Ian, David, Evë, and Matthew 
As a group: 
Discussed the results of the instructions Ian ran the previous night. Realized our arithmetic
instructions, our la instruction, and a few other instructions were not working properly. Added a
latch following the ALU output so as to clearly separate the Read and Write stages. This latch
ensures that the processor will not perform both a memory read and a memory write during the
same cycle when an instruction like addm is run. Eventually we got the processor working.
 
Outside of class: 
Thad: Completed the assembler and worked on running Euclid’s algorithm on the processor
Ian: Continued running instruction tests on the processor and debugging errors as they arose.
Updated design journal.
David: Helped write and run instruction tests
Matthew: Continued to work on preparing the processor to be run on the FPGA board.
Evë: Worked on the compiler and updated documentation
 

Meeting Thursday After Scheduled Meeting 2/15/2018 


Members Present: Thad, Ian, David, Evë, and Matthew 
As a group: 
We discussed the feedback we received from our meeting and planned out what we needed to 
work on for the coming week. 
Ian: Worked on updating documentation so that it matched our current processor 
Evë: Worked on completing the compiler 
Thad: Worked on updating the assembler and designing a game 
Matthew: Worked on getting the processor to run on the FPGA board. 
David: Helped update the documentation and worked on the presentation. 
 

 
 
 

Meeting Wednesday 2/21/2018 


Members Present: Thad, Ian, David, and Evë 
Matthew was unable to meet due to his finals schedule
As a group:  
Worked on finishing touches to design documentation and design process journal. Continued 
work on presentation. 
 
