University Duisburg-Essen
Dr.-Ing. Basermann, Prof. Dr.-Ing. Hunger
SS 2004/2005
Computer Architecture 2 / Advanced Computer Architecture
Notes on the assignments and the solution sheet
This is a multiple-choice examination. That means:
* Solution approaches are not assessed.
* For each subpart of an assignment, one or more answers can be right.
  But: If you mark the box "None of them" in a subpart, the other marked answers of
  this subpart will be disregarded.
* It is not possible to get a negative score in a subpart of any assignment.
Note the following points:
* In addition to the assignment sheet there is a solution sheet.
* Mark the answers on the solution sheet as described!
  MARKED ANSWERS ON THE ASSIGNMENT SHEET WILL NOT BE CONSIDERED.
* You get the assignment sheet only once.
* In case of erroneous entries, ask the personnel for a new solution sheet.
* Only use the sheets enclosed in the envelope. Do not use any other paper. If you need
  more paper, ask the supervisors.
* Return everything, i.e. assignment sheet, solution sheet, and all other sheets, used and
  unused. Only exams that are returned completely will be assessed.
* FILL IN YOUR NAME AND MATRICULATION NUMBER ON THE ASSIGNMENT SHEET
  AND THE SOLUTION SHEET!
Name:                    Matriculation number:
Question 1 (14 Points)
Parallelism within a Processor
1.1 Which of the following statements about the von Neumann architecture is/are true?
A: Programs and data are resident in different memories.
B: The computer structure is independent of the problem to be processed.
C: Programs consist of a sequence of instructions which are executed in parallel.
D: The machine applies binary codes.
E: None of the answers above is correct.
1.2 Instruction pipelining: How long (in ns) is the gap (bubble) within the fourth task entering
the pipe below?
[Figure: pipeline stages with their stage times; legible values in order: 4 ns, 3 ns, (illegible), 8 ns, 3 ns]
F: 12 ns.
G: 16 ns.
H: 20 ns.
I: None of the answers above is correct.
1.3 Pipelining: What is the execution time per stage of a pipeline that has 5 equal stages
and a mean overhead of 8 cycles?
J: 2 cycles.
K: 3 cycles.
L: 4 cycles.
M: None of the answers above is correct.
1.4 Itanium processor, ILP (EPIC): A vector operation c = a + b with 154 elements per
vector shall be performed. How many cycles are required within the loop below for the
vector operation above (neglect the branch operation br.ctop) if the load (ld)
instructions take two cycles and the remaining operations take one cycle?
N: 158.
O: 159.
P: 162.
Q: None of the answers above is correct.
1.5 Which feature of Itanium processors aims to increase parallelism by changing
the instruction order?
R: Rotating registers.
S: Predication.
T: Speculation.
U: None of the answers above is correct.
Question 2 (12 Points)
Classification & Performance of Parallel Architectures
2.1 Which kind of architecture is represented by the following figure?
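[Figure: two control units (CU1, CU2), each with an I/O connection, each sending an instruction stream (IS) to its own processing unit (PU1, PU2); both processing units exchange data streams (DS) with a shared memory, which also supplies the instruction streams.]
A: SISD architecture.
B: SIMD architecture.
C: MIMD architecture.
D: MISD architecture.
E: None of the answers above is correct.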
2.2 Which statement(s) related to the system in the figure in 2.1 is/are true?
F: The system is very well scalable with respect to the number of processors.
G: The system represents a vector processor.
H: The processors can communicate with each other through shared variables.
I: None of the answers above is correct.
2.3 Parallel programs: What is the parallel execution time of a program with mean parallel
overhead 4 s and sequential execution time 600 s on 150 processors?
J: 4 s.
K: 8 s.
L: 12 s.
M: [illegible]
N: None of the answers above is correct.
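One common reading of question 2.3, given here as a sketch and not as the official solution, models the parallel execution time as the sequential time split evenly over the processors plus the mean parallel overhead. The function name `parallel_time` and the additive model are assumptions made for illustration.

```python
def parallel_time(t_seq: float, processors: int, overhead: float) -> float:
    """Parallel time under the assumed model T_p = T_seq / p + T_overhead."""
    return t_seq / processors + overhead


# Values from question 2.3: 600 s sequential, 150 processors, 4 s overhead.
t_par = parallel_time(600.0, 150, 4.0)
print(f"T_p = {t_par} s, speedup = {600.0 / t_par}")
```

Under this model the overhead is as large as the useful work per processor, so the speedup is only half the processor count.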
2.4 Parallel programs: What is the execution time of a program on 100 processors if 93%
of the program is ideally parallel, the remaining part is sequential, and the sequential
execution time is 10000 s?
O: 100 s.
P: 593 s.
Q: 793 s.
R: None of the answers above is correct.
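Question 2.4 describes the situation covered by Amdahl's law: the sequential part runs unchanged, and only the parallel part is divided by the processor count. The following is a minimal sketch; the function name `amdahl_time` is hypothetical, and no overhead beyond the sequential fraction is assumed.

```python
def amdahl_time(t_seq: float, parallel_fraction: float, processors: int) -> float:
    """Execution time when only `parallel_fraction` of the work scales ideally."""
    serial_part = (1.0 - parallel_fraction) * t_seq
    parallel_part = parallel_fraction * t_seq / processors
    return serial_part + parallel_part


# Values from question 2.4: 10000 s sequential, 93% parallel, 100 processors.
t = amdahl_time(10000.0, 0.93, 100)
print(f"T_p = {t:.1f} s, speedup = {10000.0 / t:.2f}")
```

The 7% sequential share alone costs 700 s, which bounds the speedup at roughly 14 no matter how many processors are added.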
2.5 Workload-driven evaluation of parallel systems, memory-constrained scaling: A matrix
factorization with complexity n^3 takes 20 hours for a square matrix which requires
128 × 10^[exponent illegible] bytes on one processor (8 bytes per element). How much time would it need on 100
processors (assuming 50% parallel efficiency)?
S: 200 hours.
T: 400 hours.
U: 600 hours.
V: None of the answers above is correct.
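Memory-constrained scaling can be sketched as follows, under assumptions that are not stated in full on the sheet: total memory grows in proportion to the processor count, the matrix data grows as n^2, and the factorization work grows as n^3. The function name `memory_constrained_time` is hypothetical.

```python
def memory_constrained_time(t1_hours: float, processors: int, efficiency: float) -> float:
    """Run time of the memory-scaled problem on `processors` processors.

    Assumptions: memory scales with the processor count, data ~ n^2, work ~ n^3.
    """
    n_scale = processors ** 0.5      # p times more memory allows n to grow by sqrt(p)
    work_scale = n_scale ** 3        # work grows with the cube of n
    return t1_hours * work_scale / (processors * efficiency)


# Values from question 2.5: 20 hours on one processor, 100 processors, 50% efficiency.
print(memory_constrained_time(20.0, 100, 0.5))
```

Because the work grows faster than the memory, the scaled problem runs longer than the original even though more processors are used.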
2.6 Workload-driven evaluation of parallel systems, time-constrained scaling: What should
be the number of rows for a matrix-matrix multiplication on 1 processor if it is 3000 on
30 processors (assuming 90% parallel efficiency)?
W: 1000.
X: 1500.
Y: 2000.
Z: None of the answers above is correct.
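For time-constrained scaling the run time is held constant, so the work that fits on p processors is p times the efficiency times the single-processor work. The sketch below assumes the standard n^3 work of dense matrix-matrix multiplication; the function name `time_constrained_rows` is hypothetical.

```python
def time_constrained_rows(n_parallel: float, processors: int, efficiency: float) -> float:
    """Rows on one processor that take the same time as n_parallel rows on p processors.

    Assumes work ~ n^3, so n_1^3 = n_p^3 / (p * efficiency).
    """
    return n_parallel / (processors * efficiency) ** (1.0 / 3.0)


# Values from question 2.6: 3000 rows on 30 processors at 90% efficiency.
print(round(time_constrained_rows(3000.0, 30, 0.9)))
```

With these numbers the effective processor count is 27, whose cube root is 3, so the problem size shrinks by a factor of three.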
Question 3 (12 Points)
Interconnection Networks
3.1 Topology: What is the difference between a 2-D torus and a hypercube with 16 nodes
regarding the topology parameters node degree, diameter, bisection width, and average
distance?
A: The hypercube has the higher bisection width.
B: The node degree is different.
C: The 2-D torus has the higher average distance.
D: No difference.
E: None of the answers above is correct.
3.2 E-cube routing: Which is the path taken from 010 to 101?
F: 010 -> 011 -> 001 -> 101.
G: 010 -> 110 -> 100 -> 101.
H: 010 -> 000 -> 001 -> 101.
I: 010 -> 110 -> 111 -> 101.
J: None of the answers above is correct.
3.3 Topology: What is the height of a binary tree with 128 nodes?
K, L, M: [option values illegible]
N: None of the answers K-M is correct.
3.4 Which routing strategies are deadlock-free?
O: E-cube routing on hypercubes.
P: XY routing on tori.
Q: XY routing on 2D meshes.
R: None of the answers above is correct.
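E-cube routing, as in question 3.2, corrects the differing address bits one dimension at a time in a fixed order. The sketch below makes that order a parameter, because the sheet does not say whether the lowest or the highest dimension is corrected first; the function name `ecube_route` and the default order are assumptions.

```python
def ecube_route(src: int, dst: int, dims: int, lowest_first: bool = True) -> list:
    """Dimension-ordered (e-cube) path between two hypercube nodes.

    Returns the visited node addresses as bit strings of width `dims`.
    """
    order = range(dims) if lowest_first else range(dims - 1, -1, -1)
    node = src
    path = [node]
    for d in order:
        # Flip bit d only if source and destination differ in that dimension.
        if ((node ^ dst) >> d) & 1:
            node ^= 1 << d
            path.append(node)
    return [format(n, f"0{dims}b") for n in path]


print(" -> ".join(ecube_route(0b010, 0b101, 3)))
print(" -> ".join(ecube_route(0b010, 0b101, 3, lowest_first=False)))
```

Since every message resolves the dimensions in the same order, no cyclic channel dependency can form, which is why this scheme appears among the candidates in the deadlock question.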
3.5 Topology: What is the average distance in a butterfly network with 256 nodes?
S: 16.
T: 4.
U: 8.
V: None of the answers S-U is correct.
3.6 Routing in a butterfly network: Which statement is true?
W: Each stage corresponds to a bit in the destination address.
X: The corresponding bit of the destination address selects the
output of each stage (0 or 1).
Y: The corresponding bit of the destination address selects the
input of each stage (0 or 1).
Z: None of the answers above is correct.
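Destination-based routing in a butterfly network, the subject of question 3.6, can be sketched as reading one address bit per stage. The sketch assumes the most significant bit is used at the first stage, which is a convention and not something the sheet specifies; the function name `destination_tag_ports` is hypothetical.

```python
def destination_tag_ports(dst: int, stages: int) -> list:
    """Switch port (0 or 1) taken at each stage, one destination bit per stage.

    Assumes the most significant address bit is consumed by the first stage.
    """
    return [(dst >> (stages - 1 - s)) & 1 for s in range(stages)]


# Three stages route to any of 8 destinations; destination 101 as an example.
print(destination_tag_ports(0b101, 3))
```

The source address plays no role in the port choice, so the same bit sequence reaches the destination from any input.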
Question 4 (9 Points)
Caches
4.1 Simple cache model, 1 level only: What is the cache access time if the access time
from the processor view is 5 ns, the hit rate is 99% and the cache access time is 1/400
of the memory access time?
A: 2 ns.
B: 1 ns.
C: 3 ns.
D: None of the answers above is correct.
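A sketch of the arithmetic behind question 4.1, assuming the usual one-level model in which the effective access time is the hit-rate-weighted mean of cache and memory access times, t_eff = h * t_cache + (1 - h) * t_mem. The model and the function name `cache_access_time` are assumptions; the lecture may use a slightly different formula.

```python
def cache_access_time(t_eff_ns: float, hit_rate: float, mem_to_cache_ratio: float) -> float:
    """Solve t_eff = h * t_c + (1 - h) * ratio * t_c for the cache access time t_c."""
    return t_eff_ns / (hit_rate + (1.0 - hit_rate) * mem_to_cache_ratio)


# Values from question 4.1: 5 ns effective, 99% hit rate, memory 400 times slower.
t_c = cache_access_time(5.0, 0.99, 400.0)
print(f"t_cache = {t_c:.3f} ns, t_mem = {400.0 * t_c:.1f} ns")
```

Even with a 99% hit rate, the rare misses contribute about four fifths of the effective access time because memory is 400 times slower.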
4.2 Cache coherence: For which shared (virtual) memory systems is the snooping protocol
not suited?
E: Systems with butterfly network.
F: Bus-based systems.
G: Systems with 3-D torus network.
H: None of the answers above is correct.
4.3 Snooping cache protocol: In which cases is the main memory up-to-date?
I: Write-back caches: Cache data marked as exclusive.
J: Write-back caches: Cache data marked as modified.
K: Write-through caches: After writing to shared data.
L: None of the answers above is correct.
4.4 Snooping cache protocol, write-back caches: What is not an immediate effect of writing
to shared data in the cache of one processor?
M: Updating copies in the caches of other processors.
N: Invalidating copies in the caches of other processors.
O: Updating main memory.
P: None of the answers above is correct.
4.5 Directory-based cache coherence protocols for distributed memory systems: Which
information is not necessary in the directory of each processor?
Q: Status information on data in memory of other processors.
R: Locations of copies of the processor's cache data.
S: Status information on the processor's cache data.
T: Status information on the processor's cache data + locations of copies.
U: None of the answers above is correct.