http://www.eastaughs.fsnet.co.uk/cpu/index.htm

The microprocessor is sometimes referred to as the 'brain' of the personal computer,
and is responsible for the processing of the instructions which make up computer
software. It houses the central processing unit, commonly referred to as the CPU,
and as such is a crucially important part of the home PC. However, how many people
really understand how the chip itself works?

This tutorial aims to provide an introduction to the various parts of the
microprocessor, and to teach the basics of the architecture and workings of the CPU
across three specific sections:

CPU Structure
This section, using a simplified model of a central processing unit as an example,
takes you through the role of each of the major constituent parts of the CPU. It also
looks more closely at each part, and examines how they are constructed and how
they perform their role within the microprocessor.

Instruction Execution
Once you are familiar with the various elements of the processor, this section looks
at how they work together to process and execute a program. It looks at how the
various instructions that form the program are recognised, together with the
processes and actions that are carried out during the instruction execution cycle
itself.

Further Features
Now that the basics have been covered, this section explores the further
advancements in the field of microprocessor architecture that have occurred in recent
years. Explanations of such techniques as pipelining and hyperthreading are
provided, together with a look at cache memory and trends in CPU architecture.

Each section also concludes with a multiple choice quiz with which you can test your
knowledge, while some also contain interactive animations in order to improve your
learning experience. These animations are in Macromedia Flash format, and will
require Flash Player to be installed on your computer. If it is not, please visit the
Macromedia website in order to download and install the browser plug-in.

The first section of this tutorial relates to the structure of the central processing unit.

As there are a great many variations in architecture between the different kinds of CPU, we shall begin by looking at a simplified model of the structure. This model is a good basis on which to build your knowledge of the workings of a microprocessor.

Figure: The simplified model of the central processing unit.

The simplified model consists of five parts, which are:

Arithmetic & Logic Unit (ALU)
The part of the central processing unit that deals with operations such as addition, subtraction, and multiplication of integers and Boolean operations. It receives control signals from the control unit telling it to carry out these operations.

Control Unit (CU)
This controls the movement of instructions in and out of the processor, and also controls the operation of the ALU. It consists of a decoder, control logic circuits, and a clock to ensure everything happens at the correct time. It is also responsible for performing the instruction execution cycle.

Register Array
This is a small amount of internal memory that is used for the quick storage and retrieval of data and instructions. All processors include some common registers used for specific functions, namely the program counter, instruction register, accumulator, memory address register and stack pointer.

System Bus
This is comprised of the control bus, data bus and address bus. It is used for connections between the processor, memory and peripherals, and for the transferal of data between the various parts.

Memory
The memory is not an actual part of the CPU itself, and is instead housed elsewhere on the motherboard. However, it is here that the program being executed is stored, and as such it is a crucial part of the overall structure involved in program execution. For further information on the memory, please see the separate tutorial if available.

Each of these parts is covered in more detail in the pages that follow, beginning with the arithmetic and logic unit.

The ALU, or the arithmetic and logic unit, is the section of the processor that is involved with executing operations of an arithmetic or logical nature. It works in conjunction with the register array for many of these, in particular the accumulator and flag registers. The accumulator holds the results of operations, while the flag register contains a number of individual bits that are used to store information about the last operation carried out by the ALU. More on these registers can be found in the register array section.

Figure: The simplified model of the central processing unit, with the arithmetic & logic unit highlighted in red.

You can look at the ALU as comprising many subcomponents for each specific task that it is required to perform. Some of these tasks and their appropriate subcomponents are:

Addition and subtraction
These two tasks are performed by constructs of logic gates, such as half adders and full adders. While they may be termed 'adders', with the aid of inverters and 'two's complement' arithmetic they can also perform subtraction. The topic of logic gates is too expansive and detailed to be covered in full here. Many resources exist on the internet and elsewhere relating to this topic, however, so it is recommended that you read further into the areas outlined above to aid with your learning.

Multiplication and division
In most modern processors, multiplication and division are handled by dedicated hardware within the CPU, with a floating-point unit covering non-integer values. Earlier processors used either additional chips known as maths co-processors, or used a completely different method to perform the task.

Logical tests
Further logic gates are used within the ALU to perform a number of different logical tests, including seeing if an operation produces a result of zero. Most of these logical tests are used to then change the values stored in the flag register, so that they may be checked later by separate operations or instructions. Others produce a result which is then stored, and used later in further processing.

Comparison
Comparison operations compare values in order to determine such things as whether one number is greater than, less than or equal to another. These operations can be performed by subtraction of one of the numbers from the other, and as such can be handled by the aforementioned logic gates. However, it is not strictly necessary for the result of the calculation to be stored in this instance, as the amount by which the values differ is not required. Instead, the appropriate status flags in the flag register are set and checked to determine the result of the operation.
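To make the link between adders, two's complement subtraction and comparison more concrete, here is a minimal sketch in Python. It is an illustration only: the 8-bit word size, the function names and the two flag names ("zero" and "carry") are assumptions made for this example, not part of the tutorial's model CPU.

```python
WIDTH = 8                 # assumed word size in bits (not specified by the tutorial)
MASK = (1 << WIDTH) - 1   # 0xFF for an 8-bit word


def full_adder(a, b, carry_in):
    """Add three single bits and return (sum_bit, carry_out)."""
    sum_bit = a ^ b ^ carry_in
    carry_out = (a & b) | (carry_in & (a ^ b))
    return sum_bit, carry_out


def ripple_add(x, y, carry_in=0):
    """Add two WIDTH-bit words by chaining full adders, lowest bit first."""
    result = 0
    carry = carry_in
    for i in range(WIDTH):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result, carry


def subtract(x, y):
    """Compute x - y as x + NOT(y) + 1, i.e. two's complement subtraction."""
    return ripple_add(x, ~y & MASK, carry_in=1)


def compare(x, y):
    """Compare two unsigned words; only the flags are kept, not the difference."""
    difference, carry = subtract(x, y)
    # Hypothetical flag names: zero means x == y, carry (no borrow) means x >= y.
    return {"zero": difference == 0, "carry": carry == 1}


print(subtract(7, 5))   # (difference, carry)
print(compare(5, 5))
print(compare(3, 5))
```

In this sketch the comparison discards the difference and keeps only the flags, which mirrors the point made above: a later instruction only needs to inspect the flag bits to learn whether the values were equal or which was larger.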

Bit shifting
Shifting operations move bits left or right within a word, with different operations filling the gaps created in different ways. This is accomplished via the use of a shift register, which uses pulses from the clock within the control unit to trigger a chain reaction of movement across the bits that make up the word. Again, this is a quite complicated logical procedure, and further reading may aid your understanding.

The control unit is arguably the most complicated part of this model CPU, and is responsible for controlling much of the operation of the rest of the processor. It does this by issuing control signals to the other areas of the processor, instructing them on what should be performed next.

Figure: The simplified model of the central processing unit, with the control unit highlighted in red.

Similarly to the arithmetic and logic unit, the control unit can be broken down further for easier understanding. The three main elements of the control unit are as follows:

Decoder
This is used to decode the instructions that make up a program when they are being processed, and to determine what actions must be taken in order to process them. These decisions are normally taken by looking at the opcode of the instruction, together with the addressing mode used. This is covered in greater detail in the instruction execution section of this tutorial.

Timer or clock
The timer or clock ensures that all processes and instructions are carried out and completed at the right time. Pulses are sent to the other areas of the CPU at regular intervals (related to the processor clock speed), and actions only occur when a pulse is detected. This ensures that the actions themselves also occur at these same regular intervals, meaning that the operations of the CPU are synchronised.

Control logic circuits
The control logic circuits are used to create the control signals themselves, which are then sent around the processor. These signals inform the arithmetic and logic unit and the register array what actions and steps they should be performing, what data they should be using to perform said actions, and what should be done with the results.

Further detail is not required at this stage on the control unit, though it is clear that there is much detail at lower levels that has yet to be touched on. The next element of the processor to look at is the register array.

A register is a memory location within the CPU itself, designed to be quickly accessed for purposes of fast data retrieval. Processors normally contain a register array, which houses many such registers. These contain instructions, data and other values that may need to be quickly accessed during the execution of a program.

Figure: The simplified model of the central processing unit, with the register array highlighted in red.

Many different types of registers are common between most microprocessor designs. These are:

Program Counter (PC)
This register is used to hold the memory address of the next instruction that has to be executed in a program. This is to ensure that the CPU knows at all times where it has reached, that it is able to resume at the correct point following the execution of an instruction, and that the program is executed correctly.

Instruction Register (IR)
This is used to hold the current instruction in the processor while it is being decoded and executed, in order for the time taken by the whole execution process to be reduced. This is because the time needed to access the instruction register is much less than continual checking of the memory location itself.

Accumulator (A, or ACC)
The accumulator is used to hold the result of operations performed by the arithmetic and logic unit, as covered in the section on the ALU.

Memory Address Register (MAR)
Used for storage of memory addresses, usually the addresses involved in the instructions held in the instruction register. The control unit then checks this register when needing to know which memory address to check or obtain data from.

Memory Buffer Register (MBR)
When an instruction or data is obtained from the memory or elsewhere, it is first placed in the memory buffer register. The next action to take is then determined and carried out, and the data is moved on to the desired location.

Flag register / status flags
The flag register is specially designed to contain all the appropriate 1-bit status flags, which are changed as a result of operations involving the arithmetic and logic unit. Further information can be found in the section on the ALU.

Other general purpose registers
These registers have no specific purpose, but are generally used for the quick storage of pieces of data that are required later in the program execution. In the model used here these are assigned the names A and B, with suffixes of L and U indicating the lower and upper sections of the register respectively.

The final main area of the model microprocessor being used in this tutorial is the system bus.

Figure: The simplified model of the central processing unit, with the system bus highlighted in red.

The system bus is a set of wires which carries data communication between the major components of the computer, including the microprocessor. Not all of the communication that uses the bus involves the CPU, although naturally the examples used in this tutorial will centre on such instances.

The system bus consists of three different groups of wiring, called the data bus, control bus and address bus. These all have separate responsibilities and characteristics, which can be outlined as follows:

Control Bus
The control bus carries the signals relating to the control and co-ordination of the various activities across the computer, which can be sent from the control unit within the CPU. Different architectures result in differing numbers of lines of wire within the control bus, as each line is used to perform a specific task. For instance, different, specific lines are used for each of read, write and reset requests.

Data Bus
This is used for the exchange of data between the processor, memory and peripherals, and is bi-directional so that it allows data flow in both directions along the wires. Again, the number of wires used in the data bus (sometimes known as the 'width') can differ. Each wire is used for the transfer of signals corresponding to a single bit of binary data. As such, a greater width allows greater amounts of data to be transferred at the same time.

Address Bus
The address bus contains the connections between the microprocessor and memory that carry the signals relating to the addresses which the CPU is processing at that time, such as the locations that the CPU is reading from or writing to. The width of the address bus determines the maximum addressing capacity of the bus, that is, how many distinct memory addresses the bus can work with. The addresses are transferred in binary format, with each line of the address bus carrying a single binary digit. Therefore the maximum address capacity is equal to two to the power of the number of lines present (2^lines), and the largest usable address is one less than this.
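The 2^lines relationship for the address bus is easy to check with a few lines of Python. This is only a worked arithmetic example: the function names are made up for illustration, and the bus widths of 16, 20 and 32 lines are sample values rather than figures taken from the tutorial (although 16 lines matches the 16-bit address values used by the model CPU later on).

```python
def address_capacity(lines):
    """Number of distinct addresses that a bus with `lines` address lines can carry."""
    return 2 ** lines


def highest_address(lines):
    """Largest address on such a bus; addresses are counted from zero."""
    return address_capacity(lines) - 1


for lines in (16, 20, 32):
    print(lines, address_capacity(lines), hex(highest_address(lines)))
```

Each extra address line doubles the number of locations that can be addressed, which is why bus width has such a large effect on how much memory a processor can use directly.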

This concludes the look at the simplified model processor that will be used for the remainder of this tutorial. The next section will look at the instruction execution process, and how these different parts work together to execute programs.

Following on from looking at the structure and architecture of the central processing unit itself, we shall now look at how the CPU is used to execute programs and make the computer as a whole run smoothly and efficiently. To do this, we must take a step back from concentrating solely on the processor, and look at the complete computer unit.

Figure: A flow diagram illustrating the flow of data within the PC during program execution and the saving of data.

When software is installed onto a modern day personal computer (most commonly from a CD-ROM, though other media or downloading from the internet is also common), code comprising the program and any associated files is stored on the hard drive. This code consists of a series of instructions for performing designated tasks, and data associated with these instructions. The code remains there until the user chooses to execute the program in question, at which point sections of the code are loaded into the computer's memory.

The CPU then executes the program from memory, processing each instruction in turn. Of course, in order to execute the instructions, it is necessary for the CPU to understand what the instruction is telling it to do. Therefore, recognition for instructions that could be encountered needs to be programmed into the processor. The instructions that can be recognised by a processor are referred to as an 'instruction set', and are described in greater detail on the next page of the tutorial.

Once the instruction has been recognised, and the actions that should be carried out are decided upon, the actions are then performed before the CPU proceeds on to the next instruction in memory. This process is called the 'instruction execution cycle', and is also covered later on in this tutorial.

Results can then be stored back in the memory, and later saved to the hard drive and possibly backed up onto removable media or in separate locations. This is the same flow of information as when a program is executed, only in reverse, as illustrated in the diagram above.

As outlined in the introduction to this section, for a processor to be able to process an instruction, it needs to be able to determine what the instruction is asking to be carried out. For this to occur, the CPU needs to know what actions it may be asked to perform, and have pre-determined methods available to carry out these actions. It is this idea which is the reasoning behind the 'instruction set'.

When a processor is executing a program, the program is in a machine language. However, programmers almost never write their programs directly into this form. While it may not have been originally written in this way, it is translated to a machine language at some point before execution so that it is understandable by the CPU. Machine language can be directly interpreted by the hardware itself, and is able to be easily encoded as a string of binary bits and sent easily via electrical signals.

The instruction set is a collection of pre-defined machine codes, which the CPU is designed to expect and be able to act upon when detected. Different processors have different instruction sets, to allow for greater features, easier coding, and to cope with changes in the actual architecture of the processor itself. Each machine code of an instruction set consists of two separate fields:

Opcode | Operand(s)

The opcode is a short code which indicates what operation is expected to be performed. Each operation has a unique opcode. The operand, or operands, indicate where the data required for the operation can be found and how it can be accessed (the addressing mode, which is discussed in full later). The length of a machine code can vary - common lengths vary from one to twelve bytes in size.

The exact format of the machine codes is again CPU dependent. For the purpose of this tutorial, we will presume we are using a 24-bit CPU. This means that the minimum length of the machine codes used here should be 24 binary bits, which in this instance are split as shown in the table below:

Field        Size      Bits     Use
Opcode       6 bits    18-23    Allows for 64 unique opcodes (2^6)
Operand(s)   18 bits   0-17     16 bits (0-15) for address values;
                                2 bits (16-17) for specifying the addressing mode to be used

Opcodes are also given mnemonics (short names) so that they can be easily referred to in code listings and similar documentation. For example, an instruction to store the contents of the accumulator in a given memory address could be given the binary opcode 000001, which may then be referred to using the mnemonic STA (short for STore Accumulator). Such mnemonics will be used for the examples on upcoming pages.

Now we know what form the data is in when it is read by the CPU, it is necessary to learn about the cycle by which the instructions of a program are executed. This is covered next.
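The 24-bit layout described above (6-bit opcode in bits 18-23, 2-bit addressing mode in bits 16-17, 16-bit address value in bits 0-15) can be sketched with simple shifts and masks. The function names below are hypothetical, and the addressing mode value 0b01 used in the example is an arbitrary assumption, since the tutorial does not say which bit pattern stands for which mode. The opcode 000001 is the STA example from the text.

```python
OPCODE_BITS = 6     # bits 18-23
MODE_BITS = 2       # bits 16-17
ADDRESS_BITS = 16   # bits 0-15

MODE_SHIFT = ADDRESS_BITS                 # 16
OPCODE_SHIFT = ADDRESS_BITS + MODE_BITS   # 18


def encode(opcode, mode, address):
    """Pack the three fields into a single 24-bit machine code."""
    if not 0 <= opcode < (1 << OPCODE_BITS):
        raise ValueError("opcode does not fit in 6 bits")
    if not 0 <= mode < (1 << MODE_BITS):
        raise ValueError("addressing mode does not fit in 2 bits")
    if not 0 <= address < (1 << ADDRESS_BITS):
        raise ValueError("address does not fit in 16 bits")
    return (opcode << OPCODE_SHIFT) | (mode << MODE_SHIFT) | address


def decode(word):
    """Split a 24-bit machine code back into (opcode, mode, address)."""
    opcode = (word >> OPCODE_SHIFT) & ((1 << OPCODE_BITS) - 1)
    mode = (word >> MODE_SHIFT) & ((1 << MODE_BITS) - 1)
    address = word & ((1 << ADDRESS_BITS) - 1)
    return opcode, mode, address


# STA (opcode 000001) with an assumed mode value of 01 and address 0x1234.
word = encode(0b000001, 0b01, 0x1234)
print(format(word, "024b"))
print(decode(word))
```

Printing the word as a 24-character binary string makes the three fields easy to pick out by eye: the first six digits are the opcode, the next two the addressing mode, and the last sixteen the address value.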

Once a program is in memory it has to be executed. To do this, each instruction must be looked at, decoded and acted upon in turn until the program is completed. This is achieved by the use of what is termed the 'instruction execution cycle', which is the cycle by which each instruction in turn is processed. However, to ensure that the execution proceeds smoothly, it is also necessary to synchronise the activities of the processor.

To keep the events synchronised, the clock located within the CPU control unit is used. This produces regular pulses on the system bus at a specific frequency, so that each pulse is an equal time following the last. This clock pulse frequency is linked to the clock speed of the processor - the higher the clock speed, the shorter the time between pulses. Actions only occur when a pulse is detected, so that commands can be kept in time with each other across the whole computer unit.

Figure: Diagram showing the basics of the instruction execution cycle. Each instruction is fetched from memory, decoded, and then executed.

The instruction execution cycle can be clearly divided into three different parts, which will now be looked at in more detail.

Fetch Cycle
The fetch cycle takes the required instruction from memory, using the address held in the program counter, stores it in the instruction register, and moves the program counter on one so that it points to the next instruction.

Decode Cycle
Here, the control unit checks the instruction that is now stored within the instruction register. It determines which opcode and addressing mode have been used, and as such what actions need to be carried out in order to execute the instruction in question.

Execute Cycle
The actual actions which occur during the execute cycle of an instruction depend on both the instruction itself, and the addressing mode specified to be used to access the data that may be required. However, four main groups of actions do exist, which are discussed in full later on.

The first part of the instruction execution cycle is the fetch cycle. Once the instruction has been fetched and stored in the instruction register, it must then be decoded.
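The fetch and decode steps can be sketched using the registers introduced earlier in the tutorial (PC, MAR, MBR and IR). This is a simplified illustration rather than the tutorial's own implementation: the class name, the use of a Python list as memory, and the exact order of the register transfers are assumptions. The field layout in the decode step follows the 24-bit format given earlier (opcode in bits 18-23, addressing mode in bits 16-17, address in bits 0-15).

```python
class ModelCPU:
    """A toy model of the fetch and decode cycles (names are hypothetical)."""

    def __init__(self, memory):
        self.memory = memory  # list of 24-bit words, indexed by address
        self.pc = 0           # program counter: address of the next instruction
        self.mar = 0          # memory address register
        self.mbr = 0          # memory buffer register
        self.ir = 0           # instruction register

    def fetch(self):
        """Fetch cycle: bring the next instruction into the IR, then advance the PC."""
        self.mar = self.pc                 # address to read is placed in the MAR
        self.mbr = self.memory[self.mar]   # the word read from memory arrives in the MBR
        self.ir = self.mbr                 # the instruction is moved into the IR
        self.pc += 1                       # PC now points to the next instruction
        return self.ir

    def decode(self):
        """Decode cycle: split the IR contents into (opcode, mode, address)."""
        opcode = (self.ir >> 18) & 0x3F
        mode = (self.ir >> 16) & 0x3
        address = self.ir & 0xFFFF
        return opcode, mode, address


cpu = ModelCPU([0x051234, 0x000000])
cpu.fetch()
print(cpu.pc, cpu.decode())
```

After one fetch the program counter has already moved on, so the processor knows where to find the following instruction while the current one is still being decoded and executed.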

Once the instruction has been fetched and is stored, the next step is to decode the instruction in order to work out what actions should be performed to execute it. This involves examining the opcode to see which of the machine codes in the CPU's instruction set it corresponds to, and also checking which addressing mode needs to be used to obtain any required data. Therefore, using the CPU model from this tutorial, bits 16 to 23 should be examined.

Once the opcode is known, the execution cycle can occur. Different actions need to be carried out depending on the opcode, with no two opcodes requiring the same actions to occur. However, there are generally four groups of different actions that can occur:

• Transfer of data between the CPU and memory.
• Transfer of data between the CPU and an input or output device.
• Processing of data, possibly involving the use of the arithmetic and logic unit.
• A control operation, in order to change the sequence of subsequent operations. These can possibly be conditional, based on the values stored at that point within the flag register.

For greater simplicity, and as describing all the possible instructions is unnecessary, the following tutorial pages will only look at a few possible instructions. These are:

Mnemonic   Description
MOV        Moves a data value from one location to another
ADD        Adds two data values using the ALU, and returns the result to the accumulator
STO        Stores the contents of the accumulator in the specified location
END        Marks the end of the program in memory

Table: The four instructions used in the examples for the remainder of this section of the tutorial.

The following three pages of the tutorial will look at the first two of these instructions, and how they are executed in each of the three main addressing modes. These addressing modes are:

Immediate addressing
With immediate addressing, no lookup of data is actually required. The data is located within the operands of the instruction itself, not in a separate memory location. This is the quickest of the addressing modes to execute, but the least flexible. As such it is the least used of the three in practice.

Direct addressing
For direct addressing, the operands of the instruction contain the memory address where the data required for execution is stored. For the instruction to be processed the required data must be first fetched from that location.

Indirect addressing
When using indirect addressing, the operands give a location in memory similarly to direct addressing. However, rather than the data being at this location, there is instead another memory address given where the data actually is located. This is the most flexible of the modes, but also the slowest as two data lookups are required.

The first of the three addressing modes to be looked at is immediate addressing. With immediate addressing, the data required for execution of the instruction is located directly within the operands of the instruction itself. No lookup of data from memory is required. When writing out the code in mnemonic form, operands that require this mode are marked with a # symbol.

The second of the three addressing modes to be looked at is direct addressing. Direct addressing means that the operands of the instruction hold the address of the location in memory where the data required can be found. The data is then fetched from this location in order to allow the instruction to be executed. When writing out the code in mnemonic form, no symbol is required to mark operands which use this form.

The final of the three addressing modes to be looked at is indirect addressing. Indirect addressing means that the memory address given in the operands of the instruction is not the location of the actual data required. Instead, that address holds a further address, at which the data is stored. This can prove useful if decisions need to be made within the execution, as the memory address used in processing can be changed during execution. When writing out the code in mnemonic form, operands that require this mode are marked with a @ symbol.

Now that we have covered all the stages of the instruction execution process, and also the three main addressing modes that are used, we are able to examine the full execution of simple programs. The next page of the tutorial shows the full execution of one such simple program.
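The three addressing modes, and the # and @ markers used in mnemonic form, can be illustrated with a very small interpreter. This is a sketch under stated assumptions rather than the tutorial's actual program: memory is modelled as a Python dictionary, each instruction takes a single operand, and MOV is assumed to load its operand's value into the accumulator (the tutorial only says that MOV moves a data value from one location to another). The function names and the sample program are made up for illustration.

```python
def resolve(operand, memory):
    """Return the data value an operand refers to, based on its prefix."""
    if operand.startswith("#"):
        # Immediate: the value is part of the instruction, so no memory lookup.
        return int(operand[1:])
    if operand.startswith("@"):
        # Indirect: the first lookup yields an address, the second yields the data.
        pointer = memory[int(operand[1:])]
        return memory[pointer]
    # Direct: the operand is the address of the data, so one memory lookup.
    return memory[int(operand)]


def run(program, memory):
    """Execute a list of mnemonic lines and return the final accumulator value."""
    acc = 0
    for line in program:
        mnemonic, _, operand = line.partition(" ")
        if mnemonic == "MOV":      # assumed here to load the accumulator
            acc = resolve(operand, memory)
        elif mnemonic == "ADD":    # add the operand's value to the accumulator
            acc = acc + resolve(operand, memory)
        elif mnemonic == "STO":    # store the accumulator at a direct address
            memory[int(operand)] = acc
        elif mnemonic == "END":    # end of the program
            break
        else:
            raise ValueError("unknown mnemonic: " + mnemonic)
    return acc


memory = {10: 7, 11: 10, 12: 0}
program = ["MOV #5", "ADD 10", "ADD @11", "STO 12", "END"]
print(run(program, memory), memory[12])
```

In this sample program the immediate operand #5 needs no memory access, the direct operand 10 needs one, and the indirect operand @11 needs two (address 11 holds the address 10, which holds the data), which is the speed and flexibility trade-off described above.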

In the previous two sections the basics of the workings and architecture of the central processing unit have been explained. There has been a general look at a simple processor architecture, an explanation of the method by which instructions are executed, and how the various different addressing modes affect how the CPU processes instructions.

However, modern CPUs are very rarely as simple as the ones that have been discussed thus far. While the information covered up to this point is still applicable and relevant to the majority of microprocessors, many refinements to the workings and architecture have also been implemented. In this final section of the tutorial there will be a brief look at three main areas where these refinements have occurred:

Pipelining
This is a method by which the processor can be involved in the execution of more than a single instruction at one time. Understandably, this enables the execution of the program to be completed with greater speed, but is not without complications and problems. These have to be overcome by careful design.

CISC and RISC architectures
Over the course of the development of the modern day processor, two competing architectures have emerged. CISC and RISC have several major differences in features and ideas, but both were designed with the intention of improving CPU performance. Current processors tend not to be strictly adherent to either architecture, instead being a mix of the two ideals.

Modern architectures
Outside of pipelining, RISC and CISC, many other improvements to the general architecture of the microprocessor have been developed. These are in many differing areas such as cache memory and specialised instruction set extensions. New advancements are added with each new generation of processors.

The first of these areas to be covered is the topic of pipelining.

Up until this point in the tutorial we have assumed that the processor is only able to process one instruction at a time. All examples have shown an instruction having to be executed in full before the next one can be started on. However, this is not how modern CPUs work. Pipelining is the name given to the process by which the processor can be working on more than one instruction at once.

The simplest way to approach pipelining is to consider the three stage fetch, decode and execute instruction execution cycle outlined earlier. There are times during each of these subcycles of the main cycle where the main memory is not being accessed, and the CPU could be considered 'idle'. The idea, therefore, is to begin the fetch stage for a second instruction while the first instruction is being decoded. Then, when instruction one is being executed and instruction two is being decoded, a third instruction can be fetched.

Figure: Interactive animation demonstrating the benefits which this simple form of pipelining can produce.
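The benefit of this simple three stage pipeline can be estimated with a short calculation. The sketch below assumes an idealised pipeline in which every stage takes exactly one time cycle and nothing ever stalls, which real processors do not achieve; the function names are made up for illustration.

```python
STAGES = 3  # fetch, decode, execute


def completed_without_pipelining(cycles, stages=STAGES):
    """Each instruction occupies the processor for `stages` cycles in a row."""
    return cycles // stages


def completed_with_pipelining(cycles, stages=STAGES):
    """A new instruction enters every cycle; instruction n finishes at cycle n + stages - 1."""
    return max(0, cycles - stages + 1)


def in_progress_with_pipelining(cycles, stages=STAGES):
    """Instructions that have been started but have not yet finished."""
    return min(cycles, stages - 1)


cycles = 9
print(completed_without_pipelining(cycles))   # instructions finished, no pipelining
print(completed_with_pipelining(cycles))      # instructions finished, with pipelining
print(in_progress_with_pipelining(cycles))    # instructions still in the pipeline
```

Under these assumptions, nine time cycles are enough for three complete instructions without pipelining, against seven complete instructions plus two more in progress with it.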

Across the nine time cycles shown above, the non-pipelined method manages to completely execute three instructions. With pipelining, seven instructions are executed in full, and another two are started. However, pipelining is not without problems, and does not necessarily work as well as this.

While pipelining can greatly cut the time taken to execute a program, there are problems that cause it to not work as well as it perhaps should. The three stages of the instruction execution process do not necessarily take an equal amount of time, with the time taken for 'execute' being generally longer than 'fetch'. This makes it much harder to synchronise the various stages of the different instructions. Also, some instructions may be dependent on the results of other earlier instructions. This can arise when data produced earlier needs to be used, or when a conditional branch based on a previous outcome is used.

One of the simplest ways in which the effects of these problems can be reduced is by breaking the instruction execution cycle into stages that are more likely to be of an equal duration. For example, the cycle can be broken down into six stages rather than three:

[Diagram showing the differences between the common 3 stage model of the instruction execution cycle, and the 6 stage model used in more advanced pipelining.]

However, while this may solve some of the problems outlined above, it is not without creating further problems of its own. Firstly, it is not always the case that an instruction will use all six of these stages. Simple load instructions, for example, will not require the use of the final 'write operand' stage, which would possibly upset the synchronisation. There is also the matter of potential conflicts within the memory system, as three of the above stages (fetch instruction, fetch operands, write operand) require access to the memory. Many memory management systems would not allow three separate instructions to be accessing the memory at once, and hence the pipelining would not be as beneficial as it would first seem. On top of this, the problem of conditional branching and result-dependent instructions also occurs. This means that the processor needs to be designed well in order to cope with these potential interruptions to the flow of data.

As you can tell, there are many issues which need to be taken into consideration relating to the technique of pipelining. While it is a powerful technique for the purpose of increasing CPU performance, it does require careful design and consideration in order to achieve the best possible results.
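
The dependency problem can be made a little more concrete with a small sketch. The code below scans a short program and reports where an instruction reads a register that a recent earlier instruction writes, which is the situation in which a pipeline would have to wait. Everything here is hypothetical: the `Instr` type, the mnemonics (`LOAD`, `ADD`, `SUB`, `BRZ`), the register names and the `window` of two instructions are all assumptions made for illustration rather than features of any particular processor.

```python
from typing import NamedTuple


class Instr(NamedTuple):
    text: str       # human-readable form, used for printing only
    dest: str       # register written ('' if the instruction writes none)
    sources: tuple  # registers read


def find_dependencies(program, window=2):
    """List (earlier_index, later_index, register) read-after-write pairs.

    `window` is how many following instructions are assumed to be close
    enough in the pipeline to need a result that is not yet ready.
    The check is deliberately simple: it does not notice when another
    instruction in between overwrites the same register.
    """
    found = []
    for i, earlier in enumerate(program):
        if not earlier.dest:
            continue
        for j in range(i + 1, min(i + 1 + window, len(program))):
            if earlier.dest in program[j].sources:
                found.append((i, j, earlier.dest))
    return found


# Made-up program: the ADD needs the loaded value, and the conditional
# branch (BRZ) depends on the outcome of the ADD.
program = [
    Instr("LOAD R1, [100]", "R1", ()),
    Instr("ADD R2, R1, R3", "R2", ("R1", "R3")),
    Instr("SUB R4, R5, R6", "R4", ("R5", "R6")),
    Instr("BRZ R2, label", "", ("R2",)),
]

for earlier, later, reg in find_dependencies(program):
    print(f"{program[later].text!r} needs {reg} from {program[earlier].text!r}")
```

A real processor performs this kind of check in hardware, and may delay the later instruction or forward the result early. The sketch only shows where such conflicts would arise.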

Next, we shall move on to look at the competing architectures of CISC and RISC.

Years of development have been undertaken into improving the architecture of the central processing unit, with the main aim of improving performance. Two competing architectures were developed for this purpose, and different processors conformed to each one. Both had their strengths and weaknesses, and as such also had supporters and detractors.

CISC: Complex Instruction Set Computers
Earlier developments were based around the idea that making the CPU more complex and supporting a larger number of potential instructions would lead to increased performance. This idea is at the root of CISC processors, such as the Intel x86 range, which have very large instruction sets reaching up to and above three hundred separate instructions. They also have increased complexity in other areas, with many more specialised addressing modes and registers also being implemented, and variable length of the instruction codes themselves. Performance was improved here by allowing the simplification of program compilers, as the range of more advanced instructions available led to fewer refinements having to be made at the compilation process. However, the complexity of the processor hardware and architecture that resulted can cause such chips to be difficult to understand and program for, and also means they can be expensive to produce.

RISC: Reduced Instruction Set Computers
In opposition to CISC, the mid-1980s saw the beginnings of the RISC philosophy. The idea here was that the best way to improve performance would be to simplify the processor workings as much as possible. RISC processors, such as the IBM PowerPC processor, have a greatly simplified and reduced instruction set, numbering in the region of one hundred instructions or fewer. Addressing modes are simplified back to four or fewer, and the length of the codes is fixed in order to allow standardisation across the instruction set. Changing the architecture to this extent means that fewer transistors are used to produce the processors. This means that RISC chips are much cheaper to produce than their CISC counterparts. Also the reduced instruction set means that the processor can execute the instructions more quickly, potentially allowing for greater speeds. However, only allowing such simple instructions means a greater burden is placed upon the software itself. Fewer instructions in the instruction set means a greater emphasis on the efficient writing of software with the instructions that are available. Supporters of the CISC architecture will point out that their processors are of good enough performance and cost to make such efforts not worth the trouble.

CISC                                          RISC
Large (100 to 300)    Instruction Set         Small (100 or less)
Complex (8 to 20)     Addressing Modes        Simple (4 or less)
Specialised           Instruction Format      Simple
Variable              Code Lengths            Fixed
Variable              Execution Cycles        Standard for most
Higher                Cost / CPU Complexity   Lower
Compilation           Simplifies              Processor design
Processor design      Complicates             Software

Summary of the main differences between the two competing architectures

Looking at the most modern processors, it becomes evident that the whole rivalry between CISC and RISC is now not of great importance. This is because the two architectures are converging closer to each other, with CPUs from each side incorporating ideas from the other. CISC processors now use many of the same techniques as RISC ones, while the reduced instruction sets of RISC processors contain similar numbers of instructions to those found in certain CISC chips. However, it is still important that you understand the ideas behind these two differing architectures, and why each design path was chosen.

The final page of this section of the tutorial looks at other improvements to modern CPU architectures.

On top of the topics already covered in this section, there are many other ways in which companies who manufacture microprocessors have attempted to improve the performance of their CPUs. These are generally at too high a level to be discussed within this tutorial, but this page contains a brief introduction to three of the most common of these other techniques.

[Image: The Pentium 4 is the first of Intel's processors to make use of the new hyperthreading technology]

Cache memory
This is a small amount of high-speed memory used specifically as a fast and effective method of storage for commonly used instructions. Most programs end up accessing the same data and instructions over and over again at some point in their execution. Placing these in higher speed storage, such as a cache, provides a great improvement in the time taken for processing over continual accessing from the main memory at a slower speed. Home computer processors traditionally have implemented the cache directly into their architecture, in what is known as a 'Level 1' cache. The most modern CPUs also make use of external caches, which are referred to as 'Level 2' cache and are much larger in size than 'Level 1' caches. More recent processors have larger caches; for instance, the Intel 486 had a cache of only eight kilobytes, while the Pentium II used multiple stores totalling up to two megabytes of storage space.

Specialised instruction set extensions
The most commonly known extensions to the traditional CPU instruction set are Intel's MMX and AMD's 3DNow! technology. These both come into use when the processor is asked to perform operations involving graphics, audio and video, and consist of a number of specific instructions which are specialised to perform the short repetitive tasks that make up the large majority of multimedia processing.
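
Multimedia work tends to repeat the same small operation across a great many values, which is what makes it worthwhile for one instruction to handle several values at once. As a rough sketch of the saving, the code below brightens a row of 8-bit pixel values twice: once counting one add 'instruction' per pixel, and once counting one per group of eight pixels. This is ordinary Python, not real MMX or 3DNow! code. The function names and the group size of eight are assumptions chosen for illustration, and only the instruction count models the benefit.

```python
def saturating_add(a, b):
    """Add two 8-bit values, clamping at 255 instead of wrapping round."""
    return min(a + b, 255)


def brighten_scalar(pixels, amount):
    """Brighten every pixel, counting one add 'instruction' per pixel."""
    out = [saturating_add(p, amount) for p in pixels]
    return out, len(pixels)


def brighten_packed(pixels, amount, lanes=8):
    """Brighten every pixel, counting one add 'instruction' per group.

    The inner work is still an ordinary Python loop; only the
    instruction count models what a packed add would save.
    """
    out = []
    instructions = 0
    for start in range(0, len(pixels), lanes):
        group = pixels[start:start + lanes]
        out.extend(saturating_add(p, amount) for p in group)
        instructions += 1
    return out, instructions


pixels = [10, 200, 250, 0, 128, 64, 255, 32, 90, 91, 92, 93, 94, 95, 96, 97]
scalar_out, scalar_count = brighten_scalar(pixels, 20)
packed_out, packed_count = brighten_packed(pixels, 20)
print(scalar_out == packed_out)    # True
print(scalar_count, packed_count)  # 16 2
```

Both versions produce the same pixels, but the grouped version needs an eighth as many add operations for this input.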

Instruction set extensions of this kind use SIMD (Single Instruction, Multiple Data) instructions in order to greatly reduce the time taken, as such instructions perform their operations on multiple pieces of data at the same time. MMX makes use of fifty-seven SIMD instructions, while the Pentium 4 raises this number to one hundred and forty-four. This includes further extensions to improve operations relating to internet-related activity, such as the streaming of music and video files. The improved 3DNow! technology found in the AMD Athlon processor also contains SIMD instructions for this purpose. Such extensions ultimately enhance the performance of the processor in activities relating to gaming, multimedia applications, and use of the internet and other forms of communication.

Hyperthreading
Hyperthreading is a new technology, introduced by Intel with their most recent Pentium 4 processors. It works by using what is known as 'simultaneous multithreading' to make the single processor appear to the computer operating system as multiple logical processors. This enables the CPU, via use of shared hardware resources, to execute multiple separate parts of a program (or 'threads') at the same time. This technology does not provide the same performance increase as actual separate processors would do, but provides a considerable boost for less cost and power consumption than said multiple processors would require. Current processors such as the aforementioned Pentium 4 split the CPU into two logical processors. Intel are currently working on further advancements which will enable higher numbers of threads to be simultaneously executed.

This concludes the section on the further features of the more modern microprocessor. Following this page is a multiple-choice quiz with which you can test your knowledge from this section.

You have reached the conclusion of the microprocessor tutorial. Hopefully you should now have a greater understanding of the architecture of the microprocessor and how it works.