Lecture 10
26th October 2017
Dr Chris Johnson
chris.johnson@manchester.ac.uk
Optimisation
. Measuring performance
. Parallel code
$$\limsup_{n \to \infty} \frac{T_n}{f_n} < \infty$$
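As a worked illustration, assuming the condition above is the usual criterion for $T_n = O(f_n)$ (the surrounding slide text is not preserved, and the run-time function below is made up for the example):

```latex
% Hypothetical run time T_n = 3n^2 + 5n, compared against two candidate f_n.
\[
  f_n = n^2: \quad
  \limsup_{n \to \infty} \frac{3n^2 + 5n}{n^2}
  = \limsup_{n \to \infty} \left( 3 + \frac{5}{n} \right) = 3 < \infty ,
\]
\[
  f_n = n: \quad
  \limsup_{n \to \infty} \frac{3n^2 + 5n}{n}
  = \limsup_{n \to \infty} \left( 3n + 5 \right) = \infty .
\]
```

Under that assumption, this $T_n$ satisfies the condition with $f_n = n^2$ but not with $f_n = n$.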
The Mandelbrot set $M$ is the set of complex numbers $c \in \mathbb{C}$ such that
the iteration
$z_0 = 0$,
$z_{k+1} = z_k^2 + c$
remains bounded as $k \to \infty$.
The iteration is unbounded if $|c| > 2$ and/or $|z_k| > 2$ for some $k$.
We approximate
$z_k$ remains bounded as $k \to \infty$
by
$|z_k| < 2$ for $k \le K$
#include <complex>
#include <iostream>

int main()
{
    for (int i=-500; i<=250; i++) // loop over real part of c
    {
        for (int j=-375; j<=375; j++) // loop over imag part of c
        {
            std::complex<double> z(0,0), c(i/250.0, j/250.0);
            int k = 0; // loop up to 2500 times, or until |z|>=2
            for (; k < 2500 && std::abs(z)<2.0; k++)
                z = z*z + c; // apply iteration
            std::cout << k << " "; // output # of iterations
        }
        std::cout << std::endl; // one output line per value of the real part
    }
    return 0;
}
Mandelbrot set: Program output
Mandelbrot set: Run times (unoptimised)
Mandelbrot set run time (using a single core of a Core i7-4770K):
real 0m13.015s
user 0m12.998s
sys 0m0.004s
. Compiled with
g++ -O2 -ffast-math -march=native mandelbrot_2.cpp
  k   z_k
  0    0.000000 + 0.000000i
  1   -0.900000 + 0.100000i
  2   -0.100000 - 0.080000i
...   ...
 36   -0.093627 - 0.123035i
 37   -0.906372 + 0.123039i
 38   -0.093629 - 0.123038i
 39   -0.906372 + 0.123040i
...   ...
int main()
{
int nThreads = 8;
std::vector<std::thread> threads(nThreads);
[Figure: diagram labelled Thread 1, Thread 2, Thread 3, Thread 4]
int iSize=751, iStart=-500, jSize=751, jStart=-375; // loop lengths and start points
std::vector<int> iters(iSize*jSize); // storage for iteration counts
[Figure: time per access (ns) against buffer size (bytes), from 256 bytes to 256M, with the L1 cache and L3 cache regions marked]
Memory optimisation example: matrix-vector product
$b = Ax$, i.e. $b_i = \sum_j a_{ij} x_j$
N.B.: The opposite is true for $b = xA$. The data format must suit the algorithm.
[1] I found 1.42 ms (column-major) vs. 3.22 ms (row-major) for a $1024^2$ matrix
Summary
. Optimisation has costs: only optimise where necessary
. Further reading:
. Code Complete (S. McConnell) chapters 25 and 26
. What every programmer should know about memory
(U. Drepper) http://www.akkadia.org/drepper/cpumemory.pdf