CISC vs RISC
CISC (Complex Instruction Set Computer)
- Emphasis on hardware
- Multi-clock, complex instructions
- LOAD and STORE incorporated in instructions
- Small code sizes, high cycles per instruction
- Transistors used for storing complex instructions

RISC (Reduced Instruction Set Computer)
- Emphasis on software
- Single-clock, reduced instructions
- LOAD and STORE are independent instructions
- Large code sizes, low cycles per instruction
- Spends more transistors on memory registers
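The trade-off in this comparison can be read against the standard processor performance equation. This is a textbook identity, not something stated on the slides:

```latex
\frac{\text{time}}{\text{program}}
  = \frac{\text{instructions}}{\text{program}}
  \times \frac{\text{cycles}}{\text{instruction}}
  \times \frac{\text{time}}{\text{cycle}}
```

CISC tries to minimize instructions per program and accepts more cycles per instruction; RISC reduces cycles per instruction and accepts more instructions per program.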
Generic Computer
- Data resides in main memory
- Execution unit carries out computations
- Execution unit can only operate on data loaded into registers
CISC Approach
- Complex instructions built into hardware (e.g., MULT)
- Entire task in one line of assembly: MULT 2:3, 5:2
- Equivalent to the high-level statement A = A * B
- Compiler has little work to do to translate high-level language into assembly
- Smaller program size and fewer calls to memory -> savings on cost of memory and storage
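What a single MULT does internally can be sketched in a few lines of Python. This is a toy model, not real hardware behavior: the `mult` function, the dictionary used as memory, and the stored values 6 and 7 are all assumptions made for illustration.

```python
# Toy model of a CISC-style MULT: one instruction loads both operands,
# multiplies them, and writes the product back to memory.
# The memory contents below are illustrative assumptions.

def mult(memory, dest, src):
    """MULT dest, src: memory[dest] = memory[dest] * memory[src]."""
    a = memory[dest]        # implicit load of the first operand
    b = memory[src]         # implicit load of the second operand
    memory[dest] = a * b    # implicit store of the result


memory = {"2:3": 6, "5:2": 7}   # hypothetical values at locations 2:3 and 5:2
mult(memory, "2:3", "5:2")      # MULT 2:3, 5:2   (A = A * B)
print(memory["2:3"])
```

The load, multiply, and store steps are hidden inside one instruction, which is why the assembly program is a single line.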
RISC Approach
- Only simple instructions
- 4 lines of assembly:
  LOAD A, 2:3
  LOAD B, 5:2
  PROD A, B
  STORE 2:3, A
- Simple instructions need fewer transistors of hardware space
- All instructions execute in uniform time (one clock cycle), which makes pipelining possible
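The four-instruction sequence can be traced with a small interpreter. This is a minimal sketch: the `run` function, the tuple encoding of instructions, and the memory values are hypothetical, and the one-cycle-per-instruction count simply mirrors the slide's claim.

```python
# Toy interpreter for the four RISC-style instructions on the slide.
# Instruction encoding and memory contents are illustrative assumptions.

def run(program, memory):
    """Execute (op, x, y) tuples; return the register file and a cycle count."""
    registers = {}
    cycles = 0
    for op, x, y in program:
        if op == "LOAD":        # LOAD reg, addr: copy memory into a register
            registers[x] = memory[y]
        elif op == "PROD":      # PROD ra, rb: ra = ra * rb (registers only)
            registers[x] = registers[x] * registers[y]
        elif op == "STORE":     # STORE addr, reg: copy a register to memory
            memory[x] = registers[y]
        else:
            raise ValueError(f"unknown instruction: {op}")
        cycles += 1             # assumption: every instruction takes one cycle
    return registers, cycles


program = [
    ("LOAD", "A", "2:3"),
    ("LOAD", "B", "5:2"),
    ("PROD", "A", "B"),
    ("STORE", "2:3", "A"),
]
memory = {"2:3": 6, "5:2": 7}   # hypothetical values
registers, cycles = run(program, memory)
print(memory["2:3"], cycles)
```

After the sequence finishes, operand B is still in its register, so a later instruction could reuse it without another LOAD.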
What is Pipelining?
Before Pipelining
After Pipelining
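The before/after contrast can be illustrated with a cycle count. This is a minimal sketch under stated assumptions: a hypothetical five-stage pipeline, one cycle per stage, and no stalls. None of these figures come from the slides.

```python
# Cycle counts with and without pipelining, plus a text timeline.
# Assumes one cycle per stage and no stalls or hazards.

def cycles_unpipelined(n_instructions, n_stages):
    """Each instruction finishes every stage before the next one starts."""
    return n_instructions * n_stages


def cycles_pipelined(n_instructions, n_stages):
    """A new instruction enters the pipeline on every cycle."""
    if n_instructions == 0:
        return 0
    return n_stages + (n_instructions - 1)


def timeline(n_instructions, stages):
    """Return one row per instruction; each column is one clock cycle."""
    rows = []
    for i in range(n_instructions):
        # Instruction i enters the first stage in cycle i.
        cells = ["-"] * i + list(stages)
        rows.append(" ".join(cells))
    return rows


STAGES = ["F", "D", "E", "M", "W"]   # hypothetical five-stage pipeline

print(cycles_unpipelined(4, len(STAGES)))   # before pipelining
print(cycles_pipelined(4, len(STAGES)))     # after pipelining
for row in timeline(4, STAGES):
    print(row)
```

In the timeline, different instructions occupy different stages in the same cycle. This overlap only works cleanly when instructions take a uniform amount of time.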
EDGE Architecture
- EDGE (Explicit Data Graph Execution)
- Conventional architectures process one instruction at a time; EDGE processes blocks of instructions all at once, and more efficiently
- Current multicore technologies increase speed by adding more processors
- This shifts the burden to software programmers, who must rewrite their code
- EDGE technology: an alternative approach for when the race to multicore runs out of steam
TRIPS
- TRIPS: Tera-op Reliable Intelligently Adaptive Processing System
- First EDGE processor prototype
- Funded by the Defense Advanced Research Projects Agency (DARPA): $15.4 million
- Goal of one trillion instructions per second by 2012
- Loop is unrolled
- Reduces the overhead per loop iteration
- Reduces the number of conditional branches that must be executed
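The effect of unrolling can be shown on a simple summation loop. This is a hand-written sketch for illustration only: the function names and the unroll factor of 4 are assumptions, and in practice a compiler performs this transformation automatically.

```python
# Loop unrolling by hand: the unrolled version performs one loop test
# per four elements instead of one per element.

def sum_rolled(values):
    """Baseline: one loop test and one index update per element."""
    total = 0
    i = 0
    while i < len(values):
        total += values[i]
        i += 1
    return total


def sum_unrolled_by_4(values):
    """Unrolled by 4: one loop test and one index update per four elements."""
    total = 0
    n = len(values)
    limit = n - (n % 4)     # largest multiple of 4 not exceeding n
    i = 0
    while i < limit:
        total += values[i]
        total += values[i + 1]
        total += values[i + 2]
        total += values[i + 3]
        i += 4
    while i < n:            # handle the leftover 0 to 3 elements
        total += values[i]
        i += 1
    return total


data = list(range(10))
print(sum_rolled(data), sum_unrolled_by_4(data))
```

For 10 elements the unrolled main loop runs twice and the remainder loop twice, so far fewer conditional branches are evaluated than in the rolled version.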
- Compiler produces TRIPS Intermediate Language (TIL) files
- Syntax of (name, target, sources)
- Scheduler analyzes each block's dataflow graph
- Places instructions within the block
- Produces assembly language files
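The idea of analyzing a block's dataflow graph can be sketched with instructions written as (name, target, sources) tuples. This is not real TIL syntax and not the actual TRIPS scheduler: the opcode names, value names, and the `dataflow_steps` function are hypothetical. The sketch only computes the earliest step at which each instruction's operands are ready. It assumes each target is written once and that producers are listed before their consumers.

```python
# Earliest-step analysis of a block's dataflow graph.
# Instructions are hypothetical (name, target, sources) tuples.

def dataflow_steps(block):
    """Return, for each instruction, the earliest step at which it can fire."""
    ready_at = {}   # value name -> step at which the value becomes available
    steps = []
    for name, target, sources in block:
        # Values not produced inside the block are block inputs, ready at step 0.
        step = max((ready_at.get(s, 0) for s in sources), default=0)
        steps.append(step)
        ready_at[target] = step + 1   # result is available one step later
    return steps


def group_by_step(block, steps):
    """Group instruction targets by step; same-step instructions are independent."""
    groups = {}
    for (name, target, _), step in zip(block, steps):
        groups.setdefault(step, []).append(f"{name} {target}")
    return groups


block = [
    ("mul", "t1", ["a", "b"]),
    ("mul", "t2", ["c", "d"]),
    ("add", "t3", ["t1", "t2"]),
    ("sub", "t4", ["t3", "a"]),
]
steps = dataflow_steps(block)
print(group_by_step(block, steps))
```

The two multiplies share no operands, so they land in the same step and could execute concurrently. A real scheduler would also decide where in the processor each instruction is placed, which this sketch does not model.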
- TRIPS prototype chip: 130-nm ASIC process; 500 MHz
- Two processing cores; each can issue 16 operations per cycle, with up to 1,024 instructions in flight simultaneously
- Current high-performance processors: maximum execution rate of 4 operations per cycle
- 2 MB L2 cache, 32 banks
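The prototype's figures imply a peak issue rate that can be worked out directly. This is back-of-the-envelope arithmetic using only the numbers above plus the one-trillion-per-second goal stated for the project. It assumes every issue slot is filled on every cycle, which real code would not sustain.

```python
# Peak issue rate of the prototype from its published figures.
# Assumes all issue slots are used every cycle (an upper bound).

CLOCK_HZ = 500_000_000           # 500 MHz
CORES = 2
OPS_PER_CYCLE_PER_CORE = 16
CONVENTIONAL_OPS_PER_CYCLE = 4   # figure quoted for current processors
GOAL_OPS_PER_SECOND = 10**12     # one trillion per second

peak_ops_per_second = CLOCK_HZ * CORES * OPS_PER_CYCLE_PER_CORE
per_core_issue_ratio = OPS_PER_CYCLE_PER_CORE / CONVENTIONAL_OPS_PER_CYCLE
gap_to_goal = GOAL_OPS_PER_SECOND / peak_ops_per_second

print(peak_ops_per_second)    # peak operations per second for the chip
print(per_core_issue_ratio)   # issue width relative to a conventional core
print(gap_to_goal)            # factor still needed to reach the goal
```

The prototype's peak works out to 16 billion operations per second, so reaching the stated goal would take roughly 60 times more throughput from some combination of clock rate, core count, and issue width.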
- Execution node: fully functional ALU and 64 instruction buffers
- Dataflow techniques work well with the three kinds of concurrency found in software: instruction-level, thread-level, and data-level parallelism