Ranade
Dept of CSE, IIT Bombay
¨ Availability of very powerful parallel computers, e.g.
CDAC PARAM Yuva
¨ Need to solve large problems
¨ Multicore desktop machines
¨ Inexpensive GPUs
¨ FPGA coprocessors
¨ Network of processors.
¡ Local computation: one operation/step/processor
¡ Communication with d neighbours: b words/L steps/processor
¨ Shared Memory
¡ Local computation: one operation/step/processor
¡ Access to shared memory: b words/L steps/processor
¨ “Fine grain”: small L, large b. Else “coarse”
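
As a rough illustration of the b words/L steps parameters above, the sketch below estimates how many steps one processor spends on a task. It is a minimal model under assumptions that are not in the slides: transfers happen in batches of at most b words, each batch costs L steps, and communication does not overlap with local computation. The function names comm_steps and total_steps are hypothetical.

```python
def comm_steps(n_words, b, L):
    """Steps for one processor to move n_words words.

    Assumption (not from the slides): words move in batches of at most
    b words, and every batch costs L steps.
    """
    if n_words < 0 or b <= 0 or L <= 0:
        raise ValueError("need n_words >= 0, b > 0, L > 0")
    batches = (n_words + b - 1) // b  # integer ceiling of n_words / b
    return batches * L


def total_steps(local_ops, n_words, b, L):
    """Local computation (one operation per step) plus communication.

    Assumption: computation and communication are not overlapped.
    """
    if local_ops < 0:
        raise ValueError("need local_ops >= 0")
    return local_ops + comm_steps(n_words, b, L)


if __name__ == "__main__":
    # Same workload on a fine-grained machine (small L, large b)
    # and on a coarse-grained one (large L, small b).
    print(total_steps(local_ops=100, n_words=64, b=16, L=2))
    print(total_steps(local_ops=100, n_words=64, b=2, L=50))
```

Under this model the same 64 words cost far more steps on the coarse-grained setting, which is why the grain of the machine influences how often an algorithm can afford to communicate.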
¨ Separate program per processor (possibly the same program on every processor)
¨ Synchronous vs. Asynchronous execution of
different processors
¨ Memory hierarchies. Interaction of cache policies.
Dramatic difference in memory access times for
hits vs. misses.
¨ Algorithm design is tricky!
Maximize Speedup = T1 / Tp
T1 = Time using best sequential algorithm
Tp = Time using parallel algorithm on p
processors.
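
The speedup definition above can be checked on measured running times. The following is a minimal sketch. The function name speedup and the timings in the usage section are hypothetical and serve only as illustration; they are not measurements from the slides.

```python
def speedup(t1, tp):
    """Speedup = T1 / Tp.

    t1: time of the best sequential algorithm.
    tp: time of the parallel algorithm on p processors.
    Both must be positive and in the same unit.
    """
    if t1 <= 0 or tp <= 0:
        raise ValueError("running times must be positive")
    return t1 / tp


if __name__ == "__main__":
    t1 = 120.0  # hypothetical sequential time, in seconds
    # hypothetical parallel times, keyed by processor count p
    parallel_times = {2: 65.0, 4: 36.0, 8: 20.0}
    for p, tp in parallel_times.items():
        print(f"p = {p}: speedup = {speedup(t1, tp):.2f}")
```

Note that T1 refers to the best sequential algorithm, not to the parallel algorithm run on a single processor. Using the latter as the baseline usually overstates the speedup.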