Performance estimation and optimization
Theoretical preliminaries
Performance analysis
Application of queuing theory
I/O performance
Performance optimization
Results from compiler optimization
Analysis of memory requirements
Reducing memory utilization
Theoretical preliminaries
NP-completeness
Challenges in analyzing real-time systems
The Halting Problem
Amdahl's Law
Gustafson's Law
NP-definitions
The relation between the complexity classes P and NP is studied in computational complexity
theory, the part of the theory of computation dealing with the resources required during
computation to solve a given problem. The most common resources are
time (how many steps it takes to solve a problem)
space (how much memory it takes to solve a problem).
The complexity class P is the class of problems which can be solved by an algorithm which
runs in polynomial time on a deterministic machine.
Deterministic: given the computer's present state and any inputs, there is only one possible action that the computer takes
The complexity class NP is the class of all problems that can be solved in polynomial time
by a nondeterministic machine; equivalently, a candidate solution can be verified to be
correct by a polynomial-time algorithm on a deterministic machine.
Example: Consider the subset sum problem, an example of a problem that is easy to verify, but whose answer may be
difficult to compute. Given a set of integers, does some nonempty subset of them sum to 0? For instance, does a subset of
the set {-2, -3, 15, 14, 7, -10} add up to 0? The answer "yes, because {-2, -3, -10, 15} adds up to zero" can be
quickly verified with three additions. However, finding such a subset in the first place could take much longer; hence this
problem is in NP (quickly checkable) but not known to be in P (quickly solvable).
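The gap between verifying and finding can be sketched in a few lines of Python (the instance uses negative elements so a zero-sum subset exists; function names are my own):

```python
from itertools import combinations

def verify_zero_subset(subset):
    """Verification: one pass over the candidate subset (polynomial time)."""
    return len(subset) > 0 and sum(subset) == 0

def find_zero_subset(values):
    """Search: brute force over all 2^n - 1 nonempty subsets (exponential time)."""
    for r in range(1, len(values) + 1):
        for combo in combinations(values, r):
            if sum(combo) == 0:
                return list(combo)
    return None

# Verifying a claimed answer is cheap; finding one may require exhaustive search.
assert verify_zero_subset([-2, -3, -10, 15])
assert sum(find_zero_subset([-2, -3, 15, 14, 7, -10])) == 0
```

The verifier runs in time linear in the subset size, while the search examines exponentially many subsets in the worst case.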
Hence the classical open question: is P = NP?
This is one of the seven Millennium Prize Problems of the Clay Mathematics Institute, each carrying a one-million-dollar prize.
NP-Completeness
A decision problem is NP-complete if it is in the class NP and every other problem in NP is polynomial-time
transformable (reducible) to it.
The travelling salesman problem (in its decision form) is NP-complete.
NP-complete problems tend to be those relating to resource allocation -- exactly the situation that occurs in real-time scheduling.
This fact does not bode well for the solution of real-time scheduling problems.
A problem is NP-hard if all problems in NP are polynomial-time transformable to it; unlike NP-completeness,
the problem itself need not be (or cannot be shown to be) in the class NP.
[Diagram: P = quickly solvable; NP = quickly verifiable; NP-complete problems are those to which all NP problems are reducible (transformable).]
NP = nondeterministic polynomial time
Challenges in analyzing real-time systems
When there are mutual exclusion constraints, it is impossible to find a totally
on-line optimal runtime scheduler.
The problem of deciding whether it is possible to schedule a set of periodic
processes that use semaphores only to enforce mutual exclusion is NP-hard.
The multiprocessor scheduling problem with two processors, no resources,
independent tasks, and arbitrary computation times is NP-complete.
The multiprocessor scheduling problem with two processors, no resources,
independent tasks, arbitrary partial order and task computation times of
either 1 or 2 units of time is NP-complete.
The multiprocessor scheduling problem with two processors, one resource, a
forest partial order (partial order on each processor), and each computation
time of every task equal to 1 is NP-complete.
The multiprocessor scheduling problem with three or more processors, one
resource, all independent tasks and each computation time of every task
equal to 1 is NP-complete.
The Halting Problem
Can a computer program be written which takes an arbitrary program
and an arbitrary set of inputs and determines whether or not that
program will halt on those inputs?
In computability theory, the halting problem is a decision problem which can be stated as
follows: Given a description of a program, decide whether the program finishes running or
continues to run forever. This is equivalent to the problem of deciding,
given a program and an input, whether the program will eventually halt when run with that
input, or will run forever.
Alan Turing proved in 1936 that a general algorithm to solve the halting problem for all
possible program-input pairs cannot exist. We say that the halting problem is undecidable
over Turing machines.
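Turing's impossibility argument can be sketched in Python: assume a hypothetical decider halts() exists, then build a program that does the opposite of whatever it predicts (all names here are illustrative, not a real API):

```python
def make_contrarian(halts):
    """Build a program g that does the opposite of what the HYPOTHETICAL
    decider `halts` predicts about g itself (Turing's diagonal argument)."""
    def g():
        if halts(g):          # decider claims "g halts" ...
            while True:       # ... so g runs forever
                pass
        return "halted"       # decider claims "g runs forever" ... so g halts
    return g

# Any concrete decider is wrong on its own contrarian program.
claims_never_halts = lambda f: False   # a (bad) decider: "nothing halts"
g = make_contrarian(claims_never_halts)
result = g()                           # g halts, contradicting the decider
```

Since every candidate decider fails on the program constructed from it, no general halting decider can exist.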
The Halting Problem
A schedulability analyzer is a special case of the Halting Problem.
Amdahl's Law
Amdahl's Law is a statement regarding the level of
parallelization that can be achieved by a parallel computer.
Amdahl's Law: for a constant problem size, the incremental speedup
gained by adding processing elements approaches zero as their
number grows; the total speedup is bounded by the reciprocal of the
serial fraction.
Amdahl's Law is frequently cited as an argument against
parallel systems and massively parallel processors.
speedup = n / (1 + (n - 1)s)

where n is the number of processing elements and s is the fraction of the computation that is serial.
Amdahl's Law
Amdahl's law, also known as Amdahl's argument, is named
after computer architect Gene Amdahl, and is used to find the
maximum expected improvement to an overall system when only
part of the system is improved. It is often used in parallel computing
to predict the theoretical maximum speedup using multiple processors.
The speedup of a program using multiple processors in parallel
computing is limited by the sequential fraction of the program.
For example, if 95% of the program can be parallelized, the theoretical
maximum speedup using parallel computing would be 20, no matter
how many processors are used.
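The 95% example can be checked numerically with a small sketch of the formula (function and variable names are my own):

```python
def amdahl_speedup(n, s):
    """Amdahl's law: speedup = n / (1 + (n - 1) * s).
    n = number of processors, s = serial (non-parallelizable) fraction.
    Algebraically equivalent to the common form 1 / (s + (1 - s) / n)."""
    return n / (1 + (n - 1) * s)

# 95% parallelizable => s = 0.05; the speedup is capped at 1/s = 20,
# and even a million processors barely approach that ceiling.
one_cpu = amdahl_speedup(1, 0.05)            # 1.0 (no speedup)
many_cpus = amdahl_speedup(1_000_000, 0.05)  # just under 20
```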
Gustafson's Law
Gustafson found that the underlying assumption of a fixed problem size,
independent of the number of processors, is often inappropriate.
Gustafson's empirical results demonstrated that the parallel or vector
part of a program scales with the problem size.
Times for vector start-up, program loading, serial bottlenecks, and I/O
that make up the serial component of the run do not grow with the
problem size.
speedup = s + p × n

where s is the serial fraction, p = 1 - s is the parallel fraction, and n is the number of processors.
Gustafson's Law
[Figure: speedup versus number of processors (1 to 31) under Amdahl's and Gustafson's laws. Linear speedup of Gustafson compared to the diminishing-return speedup of Amdahl, with 50% of the code available for parallelization.]
Gustafson's Law
Gustafson's law (also known as Gustafson-Barsis's law) is a law
in computer science which says that problems with large, repetitive
data sets can be efficiently parallelized. Gustafson's law counters
Amdahl's law, which describes a limit on the speed-up that
parallelization can provide.
Gustafson's law proposes that programmers set the size of
problems to use the available equipment to solve problems within a
practical fixed time. Therefore, if faster (more parallel) equipment is
available, larger problems can be solved in the same time.
Amdahl's law is based on a fixed workload or fixed problem size. It
implies that the sequential part of a program does not change with
respect to machine size (i.e., the number of processors), while
the parallel part is evenly distributed across n processors.
The impact of Gustafson's law was to shift research goals
toward selecting or reformulating problems so that solving a larger problem in
the same amount of time would be possible. In particular, the law
redefines efficiency as a need to minimize the sequential part of a
program, even if this increases the total amount of computation.
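For contrast, both laws can be evaluated side by side; a minimal sketch with illustrative parameter names:

```python
def gustafson_speedup(n, s):
    """Gustafson's law: scaled speedup = s + (1 - s) * n,
    where s is the serial fraction and n the number of processors."""
    return s + (1 - s) * n

def amdahl_speedup(n, s):
    """Amdahl's law for the same parameters, for comparison."""
    return n / (1 + (n - 1) * s)

# With 50% of the code parallelizable (s = 0.5):
# Gustafson grows linearly with n, while Amdahl levels off below 1/s = 2.
g31 = gustafson_speedup(31, 0.5)   # 16.0
a31 = amdahl_speedup(31, 0.5)      # about 1.94
```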
Performance estimation and optimization
Theoretical preliminaries
Performance analysis
Application of queuing theory
I/O performance
Performance optimization
Results from compiler optimization
Analysis of memory requirements
Reducing memory utilization
Performance analysis
Code execution time estimation
Analysis of polled loops
Analysis of coroutines
Analysis of round-robin systems
Response time analysis for fixed period systems
Response time analysis -- RMA example
Analysis of sporadic and aperiodic interrupt systems
Deterministic performance
Code execution time estimation
Instruction counting
Instruction execution time simulators
Use the system clock
Analysis of polled loops
[Figure: analysis of polled-loop response time, (a) source code, (b) assembly-language equivalent.]
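Since the figure's source code is not reproduced here, the following is a hedged Python stand-in for a polled loop (the flag sequence models a device-set event flag; all names are illustrative):

```python
handled = []

def handle_event():
    """Service routine run once the polled flag is observed set."""
    handled.append("serviced")

def polled_loop(poll_results):
    """Poll a flag until it is set, then service the event.
    Returns the number of polls performed; response time is
    proportional to this count times the per-poll loop cost."""
    polls = 0
    for flag in poll_results:   # each iteration models one poll of the flag
        polls += 1
        if flag:                # event detected on this poll
            handle_event()
            break
    return polls
```

Worst-case response time is one full loop iteration (the event arrives just after a poll) plus the time to execute the handler.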
Analysis of coroutines
[Figure: tracing the execution path in a two-task coroutine system. A switch statement in
each task drives the phase-driven code (not shown). A central dispatcher calls
task1() and task2() and provides intertask communication via global variables or
parameter lists.]
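The structure described above can be sketched as follows (a Python stand-in for the C-style tasks; the phase variable plays the role of the switch statement, and all names are illustrative):

```python
shared = {"data": 0}   # global communication area between the tasks

def task1(state):
    """Phase-driven producer: runs one phase per call, then yields control."""
    if state["phase"] == 0:
        shared["data"] += 1          # phase 0: produce a value
        state["phase"] = 1
    else:
        state["phase"] = 0           # phase 1: housekeeping, reset

def task2(state):
    """Phase-driven consumer, mirroring task1's structure."""
    if state["phase"] == 0:
        state["last_seen"] = shared["data"]   # phase 0: consume
        state["phase"] = 1
    else:
        state["phase"] = 0

def dispatcher(cycles):
    """Central dispatcher: calls each task in turn for a fixed number of cycles."""
    s1, s2 = {"phase": 0}, {"phase": 0, "last_seen": None}
    for _ in range(cycles):
        task1(s1)
        task2(s2)
    return s2["last_seen"]
```

Because each call runs only one short phase before returning, worst-case response time is bounded by the longest single phase across all tasks.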
Analysis of round-robin systems
Assume n tasks, each with maximum execution time c and time quantum q.
Then the worst-case time from readiness to completion for any task
(also known as the turnaround time), denoted T, is the waiting time plus
the undisturbed time to complete.
Example: suppose there are 5 processes, each with a maximum execution
time of 500 ms, and the time quantum is 100 ms. Then the total completion
time is
T = (n - 1) ⌈c/q⌉ q + c

T = (5 - 1) ⌈500/100⌉ × 100 + 500 = 2500 ms
Analysis of round-robin systems
Now assume that there is a context
switching overhead o. Then
For example, consider previous case with
context switch overhead of 1 ms. Then time
to complete task set is
T = (n - 1) ⌈c/q⌉ q + n ⌈c/q⌉ o + c

T = (5 - 1) ⌈500/100⌉ × 100 + 5 ⌈500/100⌉ × 1 + 500 = 2525 ms
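Both turnaround formulas can be checked with a short script (the function name is my own):

```python
import math

def rr_turnaround(n, c, q, o=0):
    """Worst-case round-robin turnaround time:
    T = (n - 1) * ceil(c/q) * q + n * ceil(c/q) * o + c
    n = number of tasks, c = max execution time, q = time quantum,
    o = context-switch overhead per switch (0 if ignored)."""
    rounds = math.ceil(c / q)
    return (n - 1) * rounds * q + n * rounds * o + c

no_overhead = rr_turnaround(5, 500, 100)        # 2500 ms
with_overhead = rr_turnaround(5, 500, 100, 1)   # 2525 ms
```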
Response time analysis for fixed-period systems
For a general task, the response time R_i is given as

R_i = e_i + I_i

where e_i is the maximum execution time of task i and I_i is the maximum
amount of delay in its execution caused by higher-priority tasks.
Solving this as a recurrence relation, the response time at the (n+1)th
iteration is

R_i^(n+1) = e_i + Σ_{j ∈ hp(i)} ⌈R_i^(n) / p_j⌉ e_j

where hp(i) is the set of tasks with higher priority than task i, and e_j and
p_j are the execution time and period of task j. Iteration stops when
R_i^(n+1) = R_i^(n).
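The recurrence is typically solved by fixed-point iteration starting from R_i = e_i; a minimal sketch with an illustrative task set (the parameters are my own, not from the slides):

```python
import math

def response_time(e, p, i):
    """Solve R_i = e_i + sum_{j in hp(i)} ceil(R_i / p_j) * e_j by iteration.
    Tasks are indexed in priority order: index 0 is the highest priority,
    so hp(i) is simply the indices below i.
    e[j] = worst-case execution time, p[j] = period of task j.
    Returns None if the response time exceeds the task's period."""
    r = e[i]
    while True:
        r_next = e[i] + sum(math.ceil(r / p[j]) * e[j] for j in range(i))
        if r_next == r:
            return r          # fixed point reached: converged
        if r_next > p[i]:
            return None       # exceeds the period: not schedulable
        r = r_next

# Illustrative task set: (e, p) = (3, 7), (3, 12), (5, 20)
e, p = [3, 3, 5], [7, 12, 20]
r_lowest = response_time(e, p, 2)   # response time of the lowest-priority task
```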