12:14 AM
COMP273 Page 1
September-07-10
1:01 PM
September-09-10
1:08 PM
September-14-10
1:06 PM
September-16-10
1:08 PM
September-21-10
1:07 PM
September-23-10
1:07 PM
September-28-10
1:04 PM
The smallest thing you can ask for is a whole register value or a byte. Since the status register is an entire register, you
can't ask for a single bit.
If the status register looks like 10101111 and you are interested in one particular bit, how do you isolate it? AND it with a mask:
10101111 & 00000100
If the result is 0, you know the 3rd bit from the right was a 0; otherwise it
was a 1 and the result is != 0.
x = status & 00000100
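A minimal sketch of the masking idea above, using the example value 10101111 from the notes. The helper name `bit_is_set` and the convention that position 0 is the rightmost bit are assumptions made for this illustration only.

```python
def bit_is_set(status: int, position: int) -> bool:
    """Return True if the bit at `position` (0 = rightmost) is 1."""
    mask = 1 << position          # position 2 gives 0b00000100
    return (status & mask) != 0   # AND forces every other bit to 0


status = 0b10101111               # example register value from the notes
mask = 0b00000100                 # selects the 3rd bit from the right
x = status & mask

print(format(x, "08b"))           # 00000100: non-zero, so the bit was 1
print(bit_is_set(status, 2))      # True
print(bit_is_set(status, 4))      # False (that bit is 0 in 10101111)
```

The whole register is still read; the mask just throws away the bits you don't care about.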
• Answer to the question I had: Send one character, and then send the next character, then send the next
character.
Use the dots to show that you want to connect them. Another shorthand where we show:
Can send binary numbers to the machine - the input enters a decoder, which triggers particular wires
PAL has a programmable AND array but a fixed (not programmable) OR array
PLA has programmable AND and OR arrays -- i.e. both the left and right sides are programmable
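A rough software model of the AND array / OR array structure described above, not a description of any real device. The function names, the dictionary encoding of a product term, and the XOR example are all assumptions made for illustration. The only difference between the two parts is who gets to choose `or_array`: in a PLA you program it, in a PAL it is fixed by the chip.

```python
from itertools import product


def product_term(inputs, literals):
    """One AND-array row: `literals` maps input index -> required value (0 or 1)."""
    return int(all(inputs[i] == v for i, v in literals.items()))


def evaluate(inputs, and_array, or_array):
    """Evaluate a two-level AND/OR array.

    and_array: list of product terms (the programmable AND side).
    or_array:  for each output, the indices of the product terms it ORs together.
    """
    terms = [product_term(inputs, lits) for lits in and_array]
    return [int(any(terms[t] for t in rows)) for rows in or_array]


# Hypothetical example: XOR of inputs A (index 0) and B (index 1).
and_array = [{0: 1, 1: 0},   # A AND (NOT B)
             {0: 0, 1: 1}]   # (NOT A) AND B
or_array = [[0, 1]]          # output 0 = term 0 OR term 1

for a, b in product([0, 1], repeat=2):
    print(a, b, evaluate([a, b], and_array, or_array))
```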
More Questions:
Micro Architecture (Lecture 8)
Atomic: each instruction does one and only one activity. If it does add, it adds - won't subtract.
Processing unit = CPU; its job is to move each instruction from RAM to the control unit
- Control unit's job is to realize what the instruction is and execute it
This is pretty much identical to how computers operate today. No real alternatives exist (except, sort of,
quantum computing)
Cache: represented very similarly to the Von Neumann picture of memory - information in, information out
Pipeline: applies to memory but also to the CPU (important concept); all modern CPUs are pipelined CPUs
If you don't have a cache, and are loading everything from RAM, a 'beat' is required to move the instruction
through each element - it takes a lot of effort to transfer it.
Key Components:
• Program counter - acts like a pointer into RAM. Contains the address of the instruction; after the
instruction is executed, the address goes through a simple adder, 1 is added to it, and it points to the next
instruction.
September-30-10
1:07 PM
Address   Byte
.
.
101       BEQ R3, 200      (* if (R3 == 0) goto 200 *)
100       ADD R3, R2, R1   (* R3 = R2 + R1 *)
.
1
0
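The two instructions in the memory picture above can be walked through with a toy fetch/execute loop. This is only a sketch under assumptions: the tuple encoding of instructions, the `run` function, and the starting register values are invented for illustration, and the PC advances by 1 per instruction as in the earlier program counter note.

```python
def run(ram, regs, pc, max_steps=10):
    """Classical loop: fetch at PC, add 1 to PC, then execute the instruction."""
    trace = []
    for _ in range(max_steps):
        instr = ram.get(pc)
        if instr is None:            # nothing stored at this address: stop
            break
        trace.append(pc)
        pc += 1                      # the simple adder: point at the next instruction
        op = instr[0]
        if op == "ADD":              # ("ADD", dest, src1, src2)
            _, d, s1, s2 = instr
            regs[d] = regs[s1] + regs[s2]
        elif op == "BEQ":            # ("BEQ", reg, target): branch if reg == 0
            _, r, target = instr
            if regs[r] == 0:
                pc = target          # overwrite the PC instead of falling through
    return pc, trace


ram = {100: ("ADD", 3, 2, 1), 101: ("BEQ", 3, 200)}
regs = {1: 5, 2: -5, 3: 99}          # hypothetical starting values
final_pc, trace = run(ram, regs, pc=100)
print(regs[3], final_pc, trace)      # 0 200 [100, 101]
```

With R1 = 5 and R2 = -5 the sum is 0, so the BEQ is taken and the PC ends at 200; with any non-zero sum the PC would simply move on to 102.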
Micro Architecture Part 2
October-05-10
1:08 PM
- Assignment 2 Out
- Midterm exam postponed to October 19th tentative
If we do a later exam, it will still only include material NOT related to programming (even though we may have started programming in MIPS assembler by then).
- Pipeline, instead of being spread out, is laid out in a sort of straight line - like a car conveyor belt production line, you can work on more things at once
- Pipeline has 2 caches - one for instructions, one for data -- improves efficiency of the pipeline
- The pipeline's linear layout allows it to execute more than one instruction at once
- Program counter contains the address of the next instruction to execute
- Instruction memory holds the instructions themselves; the program counter supplies the address to read
- As soon as our code is loaded into the cache, we can take advantage of the pipeline
- Registers organized sort of like RAM: an integer number enters and selects which register you want to use (similar to A1, Q2 RAM circuit)
○ Can select up to 3 registers at once; if you need 4 registers, you have to find a different way of implementing it, since it is only wired up for 3 registers MAX
○ Can ask for 3 registers, but can only read 2 registers out of the box (the third is the one that gets written)
- Register values go to the ALU, or can skip the ALU if not needed and go directly to the data memory cache
- Otherwise, two wires from the registers go into the ALU and the answer is output to the data memory cache
Should be able to create the high-level version of the diagrams. PIPELINE, CLASSICAL, HOW TO CODE IN MIPS -- what this course
is about.
Pipeline Architecture
Organizing the pipeline into 4 different stages, and ensuring there are boundaries between them (no other wires interfere with the process)
- Imagine a bunch of AND gates that hold the instruction once fetch has completed and stop further activity until the clock ticks
- Similar for LOAD, ALU, STORE -- they're all segregated into these particular steps.
If truly separated, this means you could have 4 different instructions inside the CPU executing at the same time, at different stages.
EX.
ADD Y is loaded from instruction memory (at the address in the PC) and held by the AND gates - clock ticks, AND gates open, ADD Y passes through to LOAD
ADD Y, 5, 2 <<== 5, 2 are basically saying that at LOAD, "give me the contents of register 5 and register 2"
The contents of registers 5 and 2 come out of the registers, and the ADD instruction tells the ALU it has to add
The answer comes out of the ALU and is stored in Data Memory at address Y
While this is happening, other instructions are being loaded and held back by the AND gates separating each section.
ADD y, 10, 2
- One tick for load
- Get 10 to ALU
- Get 2 to ALU
- Add
- Y data saved
Therefore, with the classical architecture, it takes 4-5 ticks to get the result stored and nothing else runs in the meantime, whereas the pipelining method allows other
instructions to be executed simultaneously, so ~10 instructions can be in progress over a comparable stretch of time
Drawback of Pipeline: all instructions take the same number of ticks (everything has to go through every stage of the pipeline even if it doesn't need that stage)
I.e. a car has to sit at a station waiting for 'fancy decals' that it won't actually get before it can move to the next station.
- But the benefit far outweighs the drawback
Timing issues:
BREQ R0 ==> if R0 = 0…
An instruction can depend on the answer of a previous instruction that is still in the pipeline
- TIMING FAULT: all instructions in the pipeline are cancelled ---> BAD; all advantage is lost, and everything has to be re-loaded
PC --> RAM --> IR (instruction register) --> CU --- does all the work <<CLASSICAL MODEL>>
PC --> CACHE --> IR1 (not shown in slide 4 diagram...this is how we hold the instruction)--> Registers ((and IR1 --> another IR2)) <<PIPELINE MODEL>>
EX. ADD Y,5,2 --> IR1 takes the instruction first; it must then be passed on to IR2 so that it can be used at the ALU, and
on to IR3 so that the result of the addition can be stored in the data cache/memory
- Instead of having a control unit (CU) at the end as with the Classical CPU, there are typically CUs at every stage in the Pipeline CPU
Add OP1/S1/S2/D <<== S1 and S2 both name variables (registers), since we are adding a variable to a variable
Addi OP2/S1/CONSTANT/D <<== doesn't need S2, because the second operand is stored in the instruction as a constant (var + const)
ALU Portion of CPU
- Can address all of RAM with 32 bits; if you only have 16 bits, you can only jump so far in RAM
- To get to a particular address, the instruction stores how many instructions away it is (positively or negatively); that count multiplied by 4,
plus the PC, is the address -- lets you get 4 times farther than the 16 bits alone would let you 'travel'
- Shift left 2: every time you shift the bits left by one place, you multiply by 2 (if you shift twice, you multiply by four!)
○ 0001 - 1
○ 0010 - 2 (shifted left by 1)
○ 0100 - 4 (shifted left by 2)
Shifting left by 2 multiplies by 4!!
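The offset arithmetic above can be sketched as follows. One assumption beyond the notes: the offset is taken relative to the instruction after the branch (PC + 4), which is how MIPS defines it; the function names and the example addresses are made up.

```python
def sign_extend_16(value: int) -> int:
    """Interpret the low 16 bits of `value` as a two's-complement number."""
    value &= 0xFFFF
    return value - 0x10000 if value & 0x8000 else value


def branch_target(pc: int, offset_field: int) -> int:
    """PC-relative target: the offset counts instructions, so shift left 2 (x4)."""
    byte_offset = sign_extend_16(offset_field) << 2
    return (pc + 4 + byte_offset) & 0xFFFFFFFF


print(1 << 2)                                   # 4: shifting left twice multiplies by 4
print(hex(branch_target(0x00400000, 3)))        # 0x400010 (3 instructions forward)
print(hex(branch_target(0x00400000, 0xFFFF)))   # 0x400000 (offset -1, one instruction back)
```

Because the stored number counts instructions rather than bytes, a 16-bit field covers 4 times the byte range it otherwise would.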
Delayed Branching
- Branching is a delayed activity -- the work happens just in case it will be used; the machine doesn't know whether
BEQZ R0, … will be true at the branching stage; it must assume it is true "just in case"
Micro Architecture Part 2 cont'd
October-07-10
1:09 PM
If you're having trouble understanding anything here, read the textbook! The slides are actually scanned from the textbook, so they are very similar.
Can't have a simple PC that just loops - because there might be other things connected to it. Therefore, once the adder has added 4 to the PC, the value
moves on through the other elements
MIPS does OP codes in a special way to optimize how the pipeline works.
MIPS:
- OP codes are grouped into classes
- When OP code 0 is sent to the Control Unit (CU), it tells the CU that the instruction needs to use the ALU
- Information on how we want to use the ALU (the 'mini-op') is stored in the 'back' of the instruction
- Mini-Op goes directly to ALU -> CU
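A sketch of how an instruction word could be pulled apart along these lines, using the standard MIPS field widths (6-bit OP code, 5-bit register numbers, 6-bit function field at the low end, 16-bit constant). The `decode` function, the `"class"` labels, and the two-entry `ALU_FUNCT` table are assumptions for illustration; treating every non-zero OP code as an OP/S1/D/CONSTANT layout is a simplification, since jumps use a different layout.

```python
def decode(word: int) -> dict:
    """Split a 32-bit MIPS instruction word into fields."""
    opcode = (word >> 26) & 0x3F          # top 6 bits: the OP code
    rs = (word >> 21) & 0x1F              # first source register
    rt = (word >> 16) & 0x1F              # second source (or destination for addi)
    if opcode == 0:                       # class 0: ALU instruction
        return {"class": "ALU", "rs": rs, "rt": rt,
                "rd": (word >> 11) & 0x1F,    # destination register
                "funct": word & 0x3F}         # the 'mini-op' at the back
    return {"class": "other", "opcode": opcode, "rs": rs, "rt": rt,
            "constant": word & 0xFFFF}        # 16-bit constant at the back


ALU_FUNCT = {0x20: "add", 0x22: "sub"}    # two standard MIPS function codes

add_word = 0x012A4020                     # add $t0, $t1, $t2
addi_word = 0x21280005                    # addi $t0, $t1, 5
fields = decode(add_word)
print(fields, ALU_FUNCT[fields["funct"]])
print(decode(addi_word))
```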
fn ()
{
/*declare variables*/
return whatever;
}
Branch Instructions:
Jump Instruction:
- OP Code 2
- Can jump farther than other operations
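Why a jump reaches farther can be seen from the field sizes. In MIPS, a jump keeps 26 bits for the target instead of 16; the sketch below assumes the standard rule that those 26 bits are shifted left by 2 and combined with the top 4 bits of PC + 4. The function names and example addresses are illustrative only.

```python
def encode_jump(target_address: int) -> int:
    """Build a jump word: OP code 2 in the top 6 bits, target/4 in the low 26."""
    return (2 << 26) | ((target_address >> 2) & 0x03FFFFFF)


def jump_target(pc: int, word: int) -> int:
    """Recover the byte address: keep the top 4 bits of PC+4, shift the field left 2."""
    upper = (pc + 4) & 0xF0000000
    return upper | ((word & 0x03FFFFFF) << 2)


word = encode_jump(0x00400020)
print(hex(word))                             # 0x8100008
print(hex(jump_target(0x00400000, word)))    # 0x400020

branch_reach = (1 << 15) * 4                 # 16-bit signed offset, in bytes each way
jump_reach = (1 << 26) * 4                   # 26-bit field, in bytes
print(branch_reach, jump_reach)              # 131072 268435456
```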
Remember that the CPU can't operate on RAM directly - it can only use registers. So data must be moved into registers before it can be used by the
CPU.
1) FETCH: The address in the PC goes into Read Address, and also goes up to the adder and on to the ADD ALU and the MUX.
It only continues back around to the PC when the MUX produces a 1
2) LOAD: The 32 wires are split up into separate paths - wires go to different places; at all these places things stop, because they wait for signals to
come from other elements of the CPU; there is no control for reading information - as soon as the info goes in, the corresponding
register is read.
CISC:
- INTEL, IBM
- Complex --> mini optimize
- Advantage: arraysum dest, ptr, cellcount => would sum the entire array REALLY fast; can't do this in RISC without writing it as a program
Microprogramming: ADD has a set of wires that carries out that activity (that's the micro program)
Slide 5
- The 'register opcode field' is the opcode that's coming in from the opcode register/instruction register
Add dest,source1,source2
Really does:
Get s1& s2
Add s1 with s2
Dest. = answer
What happens:
- Instruction register contains OPCODE
□ Opcode comes in and extra wires are added to it within the control unit
For each instruction, can use AND gates on the opcode bits to determine which code was given.
In a way, it's like a huge case statement where each of the different codes is a different
'case'.
In a pipeline machine, all instructions have the same number of steps - so you would construct instructions in such a way
that they always have the same number of steps.
The only difference between a FLAT and a PIPELINE sequencer is that a FLAT one has everything in it - one big long
machine.
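The "huge case statement" picture can be sketched as a lookup table from opcode to steps. The ADD steps follow the earlier Add dest,source1,source2 breakdown; the JUMP entry, the step names, and the padding with "idle" steps are hypothetical, included only to show what "same number of steps" would mean.

```python
# ADD steps follow the notes; JUMP and its single step are hypothetical.
MICROPROGRAMS = {
    "ADD": ["get s1 & s2", "add s1 with s2", "dest = answer"],
    "JUMP": ["load new address into PC"],
}


def sequence(opcode: str) -> list:
    """Look up the 'case' for this opcode and return its steps."""
    if opcode not in MICROPROGRAMS:
        raise ValueError(f"unknown opcode: {opcode}")
    return list(MICROPROGRAMS[opcode])


def padded_sequence(opcode: str, n_steps: int = 3) -> list:
    """Pad with idle steps so every instruction takes the same number of steps."""
    steps = sequence(opcode)
    return steps + ["idle"] * (n_steps - len(steps))


print(sequence("JUMP"))
print(padded_sequence("JUMP"))
print(padded_sequence("ADD"))
```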
October-21-10
2:10 PM