
RISC vs CISC

You can read a reply to this text by going here.

In the early days of computing, you had a lump of silicon which performed a number of instructions. As time progressed, more and more facilities were required, so more and more instructions were added. However, according to the 80/20 rule, 20% of the available instructions are likely to be used 80% of the time, with some instructions only used very rarely. Some of these instructions are very complex, so creating them in silicon is a very arduous task. Instead, the processor designer uses microcode.

To illustrate this, we shall consider a modern CISC processor (such as a Pentium or 68000 series processor). The core, the base level, is a fast RISC processor. On top of that is an interpreter which 'sees' the CISC instructions, and breaks them down into simpler RISC instructions.

Already, we can see a pretty clear picture emerging. Why, if the processor is a simple RISC unit, don't we use that? Well, the answer lies more in politics than design. However, Acorn saw this and, not being constrained by the need to remain totally compatible with earlier technologies, decided to implement their own RISC processor.

Up until now, we've not really considered the real differences between RISC and CISC, so...

A Complex Instruction Set Computer (CISC) provides a large and powerful range of instructions, which makes the design less flexible and harder to implement. For example, the 8086 microprocessor family has these instructions:
    JA     Jump if Above
    JAE    Jump if Above or Equal
    JB     Jump if Below
    ...
    JPO    Jump if Parity Odd
    JS     Jump if Sign
    JZ     Jump if Zero

There are 32 jump instructions in the 8086, and the 80386 adds more. I've not read a spec sheet for the Pentium-class processors, but I suspect it (and MMX) would give me a heart attack!

By contrast, the Reduced Instruction Set Computer (RISC) concept is to identify the subcomponents and use those. As these are much simpler, they can be implemented directly in silicon, so will run at the maximum possible speed. Nothing is 'translated'. There are only two jump instructions in the ARM processor - Branch and Branch with Link. The "if equal, if carry set, if zero" type of selection is handled by condition options, so for example:
    BLNV   Branch with Link NeVer (useful!)
    BLEQ   Branch with Link if EQual

and so on. The BL part is the instruction, and the following part is the condition. This is made more powerful by the fact that conditional execution can be applied to most instructions! This has the benefit that you can test something, then only do the next few commands if the criteria of the test matched. No branching off, you simply add conditional flags to the instructions you require to be conditional:
    SWI     "OS_DoSomethingOrOther"   ; call the SWI
    MVNVS   R0, #0                    ; If failed, set R0 to -1
    MOVVC   R0, #0                    ; Else set R0 to 0

Or, for the 80486:

            INT     $...whatever...   ; call the interrupt
            CMP     AX, 0             ; did it return zero?
            JE      failed            ; if so, it failed, jump to fail code
            MOV     DX, 0             ; else set DX to 0
    return
            RET                       ; and return
    failed
            MOV     DX, 0FFFFH        ; failed - set DX to -1
            JMP     return

The odd flow in that example is designed to allow the fastest non-branching throughput in the 'did not fail' case. This is at the expense of two branches in the 'failed' case. I am not, however, an x86 coder, so that can possibly be optimised - mail me if you have any suggestions...
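To make the idea of condition suffixes a little more concrete, here is a minimal Python sketch of how each suffix is checked against the four ARM status flags (N, Z, C, V), followed by a model of the MVNVS/MOVVC pair from the ARM example above. The names Flags, condition_passes and set_r0_after_swi are illustrative assumptions, not part of any real toolchain, and the sketch only models the flag tests, not the processor itself.

```python
from typing import NamedTuple


class Flags(NamedTuple):
    n: bool  # Negative
    z: bool  # Zero
    c: bool  # Carry
    v: bool  # oVerflow


# Each condition suffix is simply a test on the flags.
CONDITIONS = {
    "EQ": lambda f: f.z,
    "NE": lambda f: not f.z,
    "CS": lambda f: f.c,
    "CC": lambda f: not f.c,
    "MI": lambda f: f.n,
    "PL": lambda f: not f.n,
    "VS": lambda f: f.v,
    "VC": lambda f: not f.v,
    "HI": lambda f: f.c and not f.z,
    "LS": lambda f: (not f.c) or f.z,
    "GE": lambda f: f.n == f.v,
    "LT": lambda f: f.n != f.v,
    "GT": lambda f: (not f.z) and f.n == f.v,
    "LE": lambda f: f.z or f.n != f.v,
    "AL": lambda f: True,   # always
    "NV": lambda f: False,  # never
}


def condition_passes(suffix: str, flags: Flags) -> bool:
    """Return True if an instruction with this suffix would execute."""
    return CONDITIONS[suffix](flags)


def set_r0_after_swi(flags: Flags, r0: int) -> int:
    """Model of 'MVNVS R0, #0' followed by 'MOVVC R0, #0' (hypothetical helper)."""
    if condition_passes("VS", flags):
        r0 = 0xFFFFFFFF  # MVN R0, #0 gives NOT 0, i.e. -1 as a 32-bit value
    if condition_passes("VC", flags):
        r0 = 0           # MOV R0, #0
    return r0
```

Both instructions are always fetched; exactly one of them has any effect, and no branch is needed.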

Most modern CISC processors, such as the Pentium, use a fast RISC core with an interpreter sitting between the core and the instruction. So when you are running Windows95 on a PC, it is not that much different to trying to get W95 running on the software PC emulator. Just imagine the power hidden inside the Pentium...

Another benefit of RISC is that it contains a large number of registers, most of which can be used as general purpose registers. This is not to say that CISC processors cannot have a large number of registers; some do. However, for its use, a typical RISC processor requires more registers to give it additional flexibility. Gone are the days when you had two general purpose registers and an 'accumulator'.

One thing RISC does offer, though, is register independence. As you have seen above, the ARM register set defines at minimum R15 as the program counter, and R14 as the link register (although, after saving the contents of R14, you can use this register as you wish). R0 to R13 can be used in any way you choose, although the Operating System defines R13 as the stack pointer. You can, if you don't require a stack, use R13 for your own purposes. APCS applies firmer rules and assigns more functions to registers (such as Stack Limit). However, none of these - with the exception of R15 and sometimes R14 - is a constraint applied by the processor.

You do not need to worry about saving your accumulator in long instructions; you simply make good use of the available registers.

The 8086 offers you fourteen registers, but with caveats:

The first four (A, B, C, and D) are Data registers (a.k.a. scratch-pad registers). They are 16-bit and can each be accessed as two 8-bit registers, thus register A is really AH (A, high-order byte) and AL (A, low-order byte). These can be used as general purpose registers, but they can also have dedicated functions - Accumulator, Base, Count, and Data.

The next four registers are Segment registers for Code, Data, Extra, and Stack.

Then come the five Offset registers: Instruction Pointer (IP, the program counter), SP and BP for the stack, then SI and DI for indexing data.

Finally, the flags register holds the processor state.

As you can see, most of the registers are tied up with the bizarre memory addressing scheme used by the 8086. So only four general purpose registers are available, and even they are not as flexible as ARM registers.

The ARM processor differs again in that it has a reduced number of instruction classes (Data Processing, Branching, Multiplying, Data Transfer, Software Interrupts).

A final example of minimal registers is the 6502 processor, which offers you:
    Accumulator   for results of arithmetic instructions
    X register    First general purpose register
    Y register    Second general purpose register
    PC            Program Counter
    SP            Stack Pointer, offset into page one (at &01xx).
    PSR           Processor Status Register - the flags.

While it might seem like utter madness to only have two general purpose registers, the 6502 was a very popular processor in the '80s. Many famous computers have been built around it. For the Europeans: consider the Acorn BBC Micro, Master, Electron... For the Americans: consider the Apple2 and the Commodore PET. The ORIC uses a 6502, and the C64 uses a variant of the 6502. (In case you were wondering, the Speccy uses the other popular processor - the ever bizarre and freaky Z80.)

So if entire systems could be created with a 6502, imagine the flexibility of the ARM processor. It has been said that the 6502 is the bridge between CISC design and RISC. Acorn chose the 6502 for their original machines such as the Atom and the System# units. They went from there to design their own processor - the ARM.

To summarise the above, the advantages of a RISC processor are:

Quicker time-to-market. A smaller processor will have fewer instructions, and the design will be less complicated, so it may be produced more rapidly.

Smaller 'die size' - the RISC processor requires fewer transistors than comparable CISC processors... This in turn leads to a smaller silicon size (I once asked Russell King of ARMLinux fame where the StrongARM processor was - and I was looking right at it, it is that small!) ...which, in turn again, leads to less heat dissipation. Most of the heat of my ARM710 is actually generated by the 80486 in the slot beside it (and that's when it is supposed to be in 'standby').

Related to all of the above, it is a much lower power chip. ARM design processors in static form so that the processor clock can be stopped completely, rather than simply slowed down. The Solo computer (designed for use in third world countries) is a system that will run from a 12V battery, charging from a solar panel.

Internally, a RISC processor has a number of hardwired instructions. This was also true of the early CISC processors, but these days a typical CISC processor has a heart which executes microcode instructions which correlate to the instructions passed into the processor. Ironically, this 'heart' tends to be RISC. :-)

As touched on by Matthias below, a RISC processor's simplicity does not necessarily refer to a simple instruction set. He quotes LDREQ R0,[R1,R2,LSR #16]!, though I would prefer to quote the 26 bit instruction LDMEQFD R13!, {R0,R2-R4,PC}^ which restores R0, R2, R3, R4, and R15 from the fully descending stack pointed to by R13. The stack is adjusted accordingly. The '^' loads the processor flags into R15 as well as the return address. And it is conditionally executed. This allows a tidy 'exit from routine' to be performed in a single instruction. Powerful, isn't it?

The RISC concept, however, does not state that all the instructions are simple. If that were true, the ARM would not have a MUL, as you can do the exact same thing with looping ADDing. No, the RISC concept means the silicon is simple. It is a simple processor to implement.

I'll leave it as an exercise for the reader to figure out the power of Matthias's example instruction. It is exactly on par with my example, if not slightly more so!
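As a rough illustration of what LDMEQFD R13!, {R0,R2-R4,PC}^ does to the registers and the stack pointer, here is a minimal Python sketch of a 'pop' from a full descending stack. The function name ldmfd, the dictionary standing in for memory, and the addresses in the usage note are illustrative assumptions; the conditional execution (EQ) and the flag restoring done by '^' are deliberately left out to keep the model small.

```python
def ldmfd(memory, regs, base_reg, reg_list, writeback=True):
    """Pop registers from a full descending stack (hypothetical model).

    memory   : dict mapping word-aligned addresses to 32-bit values
    regs     : list of 16 register values, R0..R15
    base_reg : number of the register holding the stack pointer
    reg_list : register numbers to load, in any order
    """
    address = regs[base_reg]
    # The lowest-numbered register is always loaded from the lowest address,
    # whatever order the registers were written in the source.
    for reg in sorted(reg_list):
        regs[reg] = memory[address]
        address += 4
    if writeback:
        # The '!' in the instruction: the stack pointer is moved past the
        # words that were just popped.
        regs[base_reg] = address
    return regs
```

With R13 = &8000 and five words stacked at &8000 to &8010, calling ldmfd(memory, regs, 13, [0, 2, 3, 4, 15]) loads R0, R2, R3, R4 and R15 and leaves R13 at &8014, all of which the real processor does in one instruction.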

For a completion of this summary, and some very good points regarding the ARM processor, keep reading...

In response to the original version of this text, Matthias Seifert replied with a more specific and detailed analysis. He has kindly allowed me to reproduce his message here...

RISC vs ARM
You shouldn't call it "RISC vs CISC" but "ARM vs CISC". For example conditional execution of (almost) any instruction isn't a typical feature of RISC processors but can only(?) be found on ARMs. Furthermore there are quite some people claiming that an ARM isn't really a RISC processor as it doesn't provide only a simple instruction set, i.e. you'll hardly find any CISC processor which provides a single instruction as powerful as a
LDREQ R0,[R1,R2,LSR #16]!
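To spell out how much that single instruction does, here is a minimal Python sketch of its effect: it is skipped unless the Z flag is set, shifts R2 right by 16 bits, adds the result to R1, loads a word from that address into R0, and writes the address back to R1. The function name, the dictionary used as memory and the default register numbers are illustrative assumptions.

```python
MASK32 = 0xFFFFFFFF


def ldreq_pre_indexed_lsr(memory, regs, z_flag, rd=0, rn=1, rm=2, shift=16):
    """Model of LDREQ Rd,[Rn,Rm,LSR #shift]! (hypothetical helper).

    memory : dict mapping addresses to 32-bit words
    regs   : list of register values, modified in place
    z_flag : state of the Zero flag; EQ means 'execute only if Z is set'
    """
    if not z_flag:
        return regs  # condition failed: the instruction does nothing
    offset = (regs[rm] & MASK32) >> shift    # logical shift right
    address = (regs[rn] + offset) & MASK32   # pre-indexed: add before the load
    regs[rd] = memory[address]
    regs[rn] = address                       # '!' writes the address back
    return regs
```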

Today it is wrong to claim that CISC processors execute the complex instructions more slowly; modern processors can execute most complex instructions in one cycle. They may need very long pipelines to do so (up to 25 stages or so with a Pentium III), but nonetheless they can. And complex instructions provide a big potential for optimisation, i.e. if you have an instruction which took 10 cycles with the old model and get the new model to execute it in 5 cycles, you end up with a speed increase of 100% (without a higher clock frequency). On the other hand, ARM processors executed most instructions in a single cycle right from the start and thus don't have this optimisation potential (except the MUL instruction).

The argument that RISC processors provide more registers than CISC processors isn't right. Just take a look at the (good old) 68000, it has about the same number of registers as the ARM has. And that 80x86 compatible processors don't provide more registers is just a matter of compatibility (I guess). But this argument isn't completely wrong: RISC processors are much simpler than CISC processors and thus take up much less space, thus leaving space for additional functionality like more registers. On the other hand, a RISC processor with only three or so registers would be a pain to program, i.e. RISC processors simply need more registers than CISC processors for the same job.

And the argument that RISC processors have pipelining whereas CISCs don't is plainly wrong. I.e. the ARM2 hadn't whereas the Pentium has...

The advantages of RISC against CISC are these today:

RISC processors are much simpler to build, which in turn results in the following advantages:

  o easier to build, i.e. you can use already existing production facilities
  o much less expensive, just compare the price of an XScale with that of a Pentium III at 1 GHz...
  o less power consumption, which again gives two advantages:
      o much longer use of battery driven devices
      o no need for cooling of the device, which again gives two advantages:
          o smaller design of the whole device
          o no noise

RISC processors are much simpler to program which doesn't only help the assembler programmer, but the compiler designer, too. You'll hardly find any compiler which uses all the functions of a Pentium III optimally...

And then there are the benefits of the ARM processors:

Conditional execution of most instructions, which is a very powerful thing especially with large pipelines as you have to fill the whole pipeline every time a branch is taken, that's why CISC processors make a huge effort for branch prediction

The shifting of registers while other instructions are executed, which means that shifts take up no time at all (the 68000 took one cycle per bit to shift)

The conditional setting of flags, i.e. ADD and ADDS, which becomes extremely powerful together with the conditional execution of instructions

The free use of offsets when accessing memory, i.e.

    LDR     R0,[R1,#16]
    LDR     R0,[R1,#16]!
    LDR     R0,[R1],#16
    LDR     R0,[R1,R2]
    LDR     R0,[R1,R2]!
    LDR     R0,[R1],R2
    ...

The 68000 could only increase the address register by the size of the data read (i.e. by 1, 2 or 4). Just imagine how much better an ARM processor can be programmed to draw (not only) a vertical line on the screen.

The (almost) free use of all registers with all instructions (which may well be an advantage of any RISC processor). It simply is great to be able to use
    ADD     PC,PC,R0,LSL #2
    MOV     R0,R0
    B       R0is0
    B       R0is1
    B       R0is2
    B       R0is3
    ...

or even

    ADD     PC,PC,R0,LSL #3
    MOV     R0,R0
    MOV     R1,#1
    B       Continue
    MOV     R2,#2
    B       Continue
    MOV     R2,#4
    B       Continue
    MOV     R2,#8
    B       Continue
    ...
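The reason for the MOV R0,R0 after each ADD PC,PC,... is that, on these ARM processors, reading PC as an operand gives the address of the current instruction plus 8, so the table effectively starts two instructions after the ADD. The following minimal Python sketch works out where each value of R0 lands; the function name and the example address &8000 are illustrative assumptions.

```python
def computed_branch_target(add_address: int, index: int, shift: int) -> int:
    """Address reached by 'ADD PC,PC,R0,LSL #shift' located at add_address.

    index is the value in R0; shift is 2 for one-instruction table entries
    and 3 for two-instruction entries.
    """
    pc_as_read = add_address + 8  # PC reads two instructions ahead
    return pc_as_read + (index << shift)


# One instruction per entry (LSL #2), with the ADD at &8000:
#   &8000 ADD PC,...   &8004 MOV R0,R0 (padding)   &8008 B R0is0   &800C B R0is1
# Two instructions per entry (LSL #3):
#   &8008 first entry   &8010 second entry   &8018 third entry
```

So R0 = 0 lands on the first entry directly after the padding instruction, and each further value of R0 skips one whole entry.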

I used this technique when programming my C64 emulator even more excessively to emulate the 6510. There the shift is 8 which gives 256 bytes for each instruction to emulate. Within those 256 bytes there is not only the code for the emulation of the instruction but also the code to react on interrupts, the fetching of the next instruction and the jump to the emulation code of that instruction, i.e. the code to emulate the CLC (clear C flag) looks like this:
    ADD     R10,R10,#1          ; increment PC of 6510 to point to next instruction
    BIC     R6,R6,#1            ; clear C flag of 6510 status register
    LDR     R0,[R12,#64]        ; read 6510 interrupt state
    CMP     R0,#0               ; interrupt occurred?
    BNE     &00018040           ; yes -> jump to interrupt handler
    LDRB    R1,[R4,#1]!         ; read next instruction
    ADD     PC,R5,R1,LSL #8     ; jump to emulation code
    MOV     R0,R0               ; lots of these to fill up the 256 bytes

This means that there is only one single jump for each instruction emulated. By this (and a bit more) the emulator is able to reach 76% of the speed of the original C64 with an A3000, 116% with an A4000, 300% with an A5000 and 3441% with my RiscPC (SA at 287 MHz). The code may look hard to handle, but the source of it looks much better:
    ;-----------;
    ; $18 - CLC ;
    ;-----------;
    ADD     R10,R10,#1          ; increment PC of 6510
    BIC     R6,R6,#%00000001    ; clear C flag of 6510 status register
    FNNextCommand               ; do next command
    FNFillFree                  ; fill remaining space
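The arithmetic behind this layout can be sketched in a few lines of Python: a shift of 8 gives every 6510 opcode its own 256-byte slot, and since each ARM instruction is 4 bytes, a slot holds 64 instructions. The function names and the base address &20000 in the usage note are illustrative assumptions; the real emulator keeps its base address in R5, and its actual value is not given here.

```python
SLOT_SHIFT = 8                      # the LSL #8 in 'ADD PC,R5,R1,LSL #8'
SLOT_BYTES = 1 << SLOT_SHIFT        # 256 bytes per emulated opcode
ARM_INSTRUCTION_BYTES = 4


def handler_address(base: int, opcode: int) -> int:
    """Start of the emulation code for a 6510 opcode (base is hypothetical)."""
    return base + (opcode << SLOT_SHIFT)


def padding_instructions(used_instructions: int) -> int:
    """How many 'MOV R0,R0' fillers are needed to complete one slot."""
    slot_capacity = SLOT_BYTES // ARM_INSTRUCTION_BYTES
    return slot_capacity - used_instructions
```

For example, with a base of &20000 the code for CLC (opcode $18) would start at &21800, and the seven instructions of the CLC listing would leave room for 57 fillers.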

My reply to his reply (!)


The RISC/CISC debate continues. Looking in a few books, it would seem to come down to whether or not microcode is used - thus RISC or CISC is determined more by the actual physical design of the processor than by what instructions or how many registers it offers. This would support the view that some maintain that the 6502 was an early RISC processor. But I'm not going there...
