This document contains a midterm exam for an introduction to computer architecture course. The exam covers topics like computer architecture fundamentals, MIPS assembly language, performance and power evaluation, and MIPS instruction code implementation. It contains 4 sections with multiple choice and short answer questions testing understanding of these topics.
Copyright:
Attribution Non-Commercial (BY-NC)
Your Name: _______________________ Total Points: _________ Last 5 of PeopleSoft ID: _____________
There are 4 groups of questions, each worth 25 points, totaling 100 points. Show all your work alongside your answers for full or maximum partial credit. Note: The exam questions have been proofread thoroughly and should be self-explanatory. Therefore, unless absolutely necessary, please refrain from asking questions during the exam. This helps maintain a quiet environment for everyone during the limited time available for the exam.
Topics covered in this Mid-term Exam No. 1:
A. Computer architecture fundamentals
B. MIPS assembly language concepts
C. Performance and power evaluations
D. MIPS instruction code implementation
Q1. These questions relate to general concepts in computer architecture. Answer and justify your answers concisely and sufficiently.
Q1a. (10 points) Amdahl's Law assumes that there is a fraction of the running time, e.g. o, that can be improved by a factor p, to produce an improvement in overall performance. Let $T_{old}$ be the old running time without improvement, and $T_{new}$ be the running time after improvement. Express $T_{new}$ in terms of $T_{old}$, o and p. Then, write down $T_{old}/T_{new}$ in terms of o and p.
$T_{new} = \frac{o \, T_{old}}{p} + (1 - o) \, T_{old}$ (units of time)
$\frac{T_{old}}{T_{new}} = \frac{1}{o/p + (1 - o)}$ (unitless)
Q1b. (5 points) Based on your results from Q1a, comment on what happens if you could come up with a factor p that is infinitely large. In that case, how does o affect overall performance?
$T_{old}/T_{new}$ is limited by $1/(1 - o)$ as $p \to \infty$, theoretically. This implies that, even if p is very large, the performance improvement will be a factor of 2 or less if o is not larger than 50%. This can be seen by plotting $1/(1 - o)$ vs. o for o between 0 and 1.
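The bound in Q1b can be checked numerically. The following is a minimal Python sketch, not part of the original exam; the function names `amdahl_speedup` and `amdahl_limit` are hypothetical and simply encode the two expressions from Q1a and Q1b.

```python
def amdahl_speedup(o: float, p: float) -> float:
    """Return T_old / T_new when a fraction o of the run time is sped up by a factor p."""
    return 1.0 / (o / p + (1.0 - o))


def amdahl_limit(o: float) -> float:
    """Return the bound 1 / (1 - o) approached as p grows without bound (requires o < 1)."""
    return 1.0 / (1.0 - o)


if __name__ == "__main__":
    # With o = 0.5 the speedup creeps toward 2 but never exceeds it.
    for p in (2, 10, 1000, 1e9):
        print(f"o = 0.5, p = {p:g}: speedup = {amdahl_speedup(0.5, p):.6f}")
    print(f"limit for o = 0.5: {amdahl_limit(0.5):.1f}")
```

Trying other values of o shows how strongly the improvable fraction, rather than p, governs the achievable speedup.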
Q1c. (3 points) What is the clock rate (in Hz) of a CPU with a 200 picosecond cycle time?
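$\text{Clock rate} = \frac{1}{\text{Cycle time}} = \frac{1}{200 \times 10^{-12}\ \text{s}} = \frac{1}{0.2 \times 10^{-9}\ \text{s}} = 5 \times 10^{9}\ \text{Hz} = 5\ \text{GHz}$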
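The same unit conversion can be written as a short Python sketch. The helper name `clock_rate_hz` is an assumption introduced here for illustration only.

```python
def clock_rate_hz(cycle_time_ps: float) -> float:
    """Convert a clock cycle time in picoseconds to a clock rate in hertz."""
    picoseconds_per_second = 1e12
    return picoseconds_per_second / cycle_time_ps


if __name__ == "__main__":
    rate = clock_rate_hz(200)
    print(f"{rate / 1e9:g} GHz")
```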
Q1d. (7 points) Given that performance is measured by CPU time, explain the roles of IC (instruction count), CPI (cycles per instruction), and CCT (clock cycle time). State at least 2 challenges hardware designers face when trying to improve performance.
CPU time = IC × CPI × CCT. CCT is usually a function of the hardware. IC and CPI, on the other hand, are usually affected by software, such as the compiler, the algorithm, and the programming language in use. All three are affected by the ISA and are thus indirectly affected by the hardware.
An increase in clock rate, i.e. a smaller CCT, often drives the CPI up and increases power consumption (trade-off 1).
Reducing the total number of clock cycles, i.e. IC × CPI, often means accepting a lower clock rate as well (trade-off 2).
Q2. These questions relate to MIPS assembly language concepts and MIPS instructions.
Q2a. (7 points) Write the following sequence of code in MIPS assembly language:
x = x + y + z - q;
Assume that x, y, z, and q are stored in registers $s1, $s2, $s3 and $s4 respectively.
Q2b. (5 points) In MIPS, how do we flip the bits, i.e. perform a one's complement, on a 32-bit value stored in a register, using one instruction? Write down that instruction with the required register operands, and justify your answer.
Since (a NOR b) is equivalent to NOT (a OR b), and (a OR 0) = a, NOR-ing a register with $zero yields NOT a. We can therefore flip the bits in $t1 and store the result of the flip in $t0 with this MIPS instruction: nor $t0, $t1, $zero
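The NOR identity can be modeled in Python as a quick sanity check. This is only a sketch: `nor32` and `flip_bits` are hypothetical helper names, and the 32-bit mask is needed because Python integers are not fixed-width like MIPS registers.

```python
MASK32 = 0xFFFFFFFF  # keep results to 32 bits, like a MIPS register


def nor32(a: int, b: int) -> int:
    """Model a 32-bit NOR: NOT (a OR b), truncated to 32 bits."""
    return ~(a | b) & MASK32


def flip_bits(value: int) -> int:
    """Model `nor $t0, $t1, $zero`: NOR with zero inverts every bit."""
    return nor32(value, 0)


if __name__ == "__main__":
    print(f"{flip_bits(0x0000FFFF):#010x}")
```

Applying `flip_bits` twice returns the original value, as expected for a one's complement.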
Q2c. (5 points) Given this MIPS instruction beq $s0, $s1, L1, show what should be done if L1 is located at an address too large to be accommodated by the 16-bit branch address field.
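        bne $s0, $s1, L2    # inverted condition: skip the jump when $s0 != $s1
        j   L1              # taken only when $s0 == $s1; j reaches an L1 that the 16-bit branch field cannot
L2:                         # execution continues here when the branch to L1 is not taken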
Q2d. (8 points) Bitwise operations such as AND, OR, and SLL instructions are faster because these are directly supported by hardware, as compared to * (multiplication) and % (modulus) which are typically emulated by subroutines implemented with other MIPS instructions. Explain why the following equivalencies hold (&, << and | represent AND, SLL and OR, respectively):
1. Offset = L_Addr & 0x03ff IS EQUIVALENT TO Offset = L_Addr % 1024
2. Addr = (Frame << 10) | Offset IS EQUIVALENT TO Addr = Frame * 1024 + Offset
where L_Addr, Addr, and Frame represent 16-bit-wide registers, and Offset is 10 bits wide.
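Both equivalences in Q2d can be checked exhaustively over the stated bit widths. The Python sketch below is an illustration only; the function names are hypothetical, and Frame is restricted to 6 bits here on the assumption that Addr must still fit in 16 bits after the shift.

```python
def offset_by_mask(l_addr: int) -> int:
    """Keep the 10 least significant bits with a bitwise AND."""
    return l_addr & 0x03FF


def offset_by_mod(l_addr: int) -> int:
    """Take the remainder of dividing by 1024 = 2**10."""
    return l_addr % 1024


def addr_by_shift(frame: int, offset: int) -> int:
    """Shift Frame left by 10 bits, then OR in the 10-bit Offset."""
    return (frame << 10) | offset


def addr_by_mul(frame: int, offset: int) -> int:
    """Multiply Frame by 1024, then add the Offset."""
    return frame * 1024 + offset


def equivalences_hold() -> bool:
    # Equivalence 1: every 16-bit L_Addr value.
    mask_ok = all(offset_by_mask(a) == offset_by_mod(a) for a in range(1 << 16))
    # Equivalence 2: every 6-bit Frame (assumed width) and 10-bit Offset.
    shift_ok = all(
        addr_by_shift(f, o) == addr_by_mul(f, o)
        for f in range(1 << 6)
        for o in range(1 << 10)
    )
    return mask_ok and shift_ok


if __name__ == "__main__":
    print(equivalences_hold())
```

The second equivalence relies on Offset being at most 10 bits wide; a wider Offset would overlap the shifted Frame bits, and OR would no longer behave like addition.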
L_Addr & 0x03FF gives the 10 least significant bits of L_Addr, since 0x03FF is equal to 0000 0011 1111 1111 in binary (ten bits of value 1). Those 10 bits are exactly the remainder of dividing by $2^{10} = 1024$, so the result is the same as L_Addr % 1024.
Frame << 10 multiplies Frame by $2^{10} = 1024$. The | Offset bitwise OR then fills in the 10 least significant bits of Addr (all of value 0 after the Frame << 10 operation) with the bits from Offset. Because no bit positions overlap, no carries occur, so this is the same as adding Offset to Frame * 1024.
Q3. The following questions pertain to two CPU designs, X and Y, both based on CMOS IC technology and sharing the same instruction set architecture. X has a clock cycle time of 250 ps and a measured CPI of 1.0 for a Java program. Y, on the other hand, has a clock cycle time of 500 ps and a measured CPI of 1.5 for the same Java program. Furthermore, under normal conditions, Y operates at 15% less voltage and 20% less capacitive load compared to X.
Q3a. (10 points) Which design, X or Y, is faster, i.e. gives better performance than the other one, and by how much, using CPU time as the measure?
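Let I be the instruction count (X and Y share the same ISA, so I is the same for both). Then:
$\text{CPU Time}_X = I \times \text{CPI}_X \times \text{Cycle Time}_X = I \times 1.0 \times 250\ \text{ps} = 250\,I\ \text{ps}$
$\text{CPU Time}_Y = I \times \text{CPI}_Y \times \text{Cycle Time}_Y = I \times 1.5 \times 500\ \text{ps} = 750\,I\ \text{ps}$
X is faster than Y by a factor of $(750\,I) / (250\,I) = 3$.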
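The CPU time comparison can be reproduced with a few lines of Python. This is a sketch under stated assumptions: `cpu_time_ps` is a hypothetical helper, and the instruction count of 1,000,000 is an arbitrary placeholder, since it cancels in the ratio.

```python
def cpu_time_ps(instruction_count: float, cpi: float, cycle_time_ps: float) -> float:
    """CPU time = IC * CPI * CCT, with the clock cycle time given in picoseconds."""
    return instruction_count * cpi * cycle_time_ps


if __name__ == "__main__":
    ic = 1_000_000  # arbitrary placeholder; the same program runs on X and Y
    time_x = cpu_time_ps(ic, 1.0, 250)
    time_y = cpu_time_ps(ic, 1.5, 500)
    print(f"X is {time_y / time_x:g} times faster than Y")
```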
Q3b. (10 points) Which design, X or Y, consumes less dynamic power over the other one, and by how much?
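$\text{Dynamic Power} = C V^2 f$, where C and V below denote the capacitive load and voltage of X.
$\text{Power}_X = C V^2 \times \frac{1}{250\ \text{ps}} = 4 \times 10^{9} \, C V^2$
$\text{Power}_Y = (0.80\,C)(0.85\,V)^2 \times \frac{1}{500\ \text{ps}} = 1.156 \times 10^{9} \, C V^2$
Y consumes less power than X, by about $2.844 \times 10^{9} \, C V^2$. As a ratio, $\text{Power}_Y / \text{Power}_X = 1.156 / 4 = 0.289$, i.e. about 30%.
Alternatively, with f the clock frequency of X, $\text{Power}_Y / \text{Power}_X = \frac{(0.80\,C)(0.85\,V)^2 (0.5\,f)}{C V^2 f} = 0.289$.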
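The power ratio can also be verified numerically. The Python sketch below normalizes the capacitive load and voltage of X to 1.0, which is an assumption made only for convenience, since both cancel in the ratio; the function names are hypothetical.

```python
def dynamic_power(capacitive_load: float, voltage: float, frequency_hz: float) -> float:
    """Dynamic power as used in this exam: C * V^2 * f."""
    return capacitive_load * voltage ** 2 * frequency_hz


def power_ratio_y_over_x() -> float:
    c_x, v_x = 1.0, 1.0            # normalized values for X (assumption)
    f_x = 1.0 / 250e-12            # 250 ps cycle time
    f_y = 1.0 / 500e-12            # 500 ps cycle time
    power_x = dynamic_power(c_x, v_x, f_x)
    # Y: 20% less capacitive load and 15% less voltage than X.
    power_y = dynamic_power(0.80 * c_x, 0.85 * v_x, f_y)
    return power_y / power_x


if __name__ == "__main__":
    print(f"Power_Y / Power_X = {power_ratio_y_over_x():.3f}")
```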
Q3c. (5 points) If you are a mobile device design engineer, which architecture, X or Y, will you choose for your device? What if you are a server designer? (You will receive full or maximum partial credit for any answer that makes sense. Note that Y uses about 30% of the power of X, but X runs 3 times faster than Y.)
Q4. Suppose that a new MIPS instruction, called bcp, was designed to copy a block of words from one address to another. Assume that this instruction requires that the starting address of the source block be in register $t1 and that the destination address be in $t2. The instruction also requires that the number of words to copy be in $t3 (which is >0). Furthermore, assume that the values of these registers as well as register $t4 can be destroyed in executing this instruction (so that the registers can be used as temporaries to execute the instruction). You will most likely need the MIPS load, store, add, and branch instructions for Q4a.
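Q4a. (15 points) Use MIPS assembly code to implement the block copy operation WITHOUT bcp, i.e. emulate bcp using other existing MIPS instructions.
loop:   lw   $t4, 0($t1)        # load the next word from the source block
        sw   $t4, 0($t2)        # store it into the destination block
        addi $t1, $t1, 4        # advance the source address by one word
        addi $t2, $t2, 4        # advance the destination address by one word
        addi $t3, $t3, -1       # decrement the word count (core MIPS has no subi; use addi with -1)
        bne  $t3, $zero, loop   # repeat until all words have been copied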
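To trace the Q4a loop by hand, it can help to model it in a higher-level language. The Python sketch below is only an illustrative model, not part of the exam solution: memory is represented as a dictionary from byte address to word value, and the variables `t1` to `t4` mirror the registers $t1 to $t4.

```python
def block_copy(memory: dict, t1: int, t2: int, t3: int) -> dict:
    """Model the Q4a loop: copy t3 words from address t1 to address t2."""
    assert t3 > 0, "the loop body runs once before the count is tested"
    while True:
        t4 = memory[t1]      # lw   $t4, 0($t1)
        memory[t2] = t4      # sw   $t4, 0($t2)
        t1 += 4              # addi $t1, $t1, 4
        t2 += 4              # addi $t2, $t2, 4
        t3 -= 1              # decrement the word count
        if t3 == 0:          # bne  $t3, $zero, loop falls through at zero
            break
    return memory


if __name__ == "__main__":
    mem = {0x100: 11, 0x104: 22, 0x108: 33}
    block_copy(mem, 0x100, 0x200, 3)
    print([mem[a] for a in (0x200, 0x204, 0x208)])
```

Because the test comes at the bottom of the loop, the model copies at least one word, which is why the question requires the count in $t3 to be greater than 0.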
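Q4b. (10 points) Use MIPS assembly code to implement block copy WITH this bcp instruction. You can assume the same registers to be used as in Q4a and that bcp uses no register operands.
        li   $t1, src      # starting address of the source block
        li   $t2, dst      # starting address of the destination block
        li   $t3, count    # number of words to copy (must be > 0)
        bcp                # copy the block; $t1 through $t4 may be destroyed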