21 Computer Science, Rutgers CS 519: Operating System Theory
UNIX Priority Calculation
Very long-running CPU-bound jobs will get "stuck" at the highest
priority value (that is, the least favorable scheduling priority).
A decay function is used to weight utilization toward recent CPU usage.
A process's utilization at time t is decayed every second:
$$u_{t+1} = \frac{2 \cdot \mathit{load}}{2 \cdot \mathit{load} + 1}\, u_t + \mathit{niceFactor}$$
The system-wide load is the average number of runnable jobs
during the last 1 second
UNIX Priority Decay
1 job on CPU. load will thus be 1. Assume niceFactor is 0.
Compute utilization at time N:
+1 second:
$$U_1 = \frac{2}{3} U_0$$
+2 seconds:
$$U_2 = \frac{2}{3}\left[U_1 + \frac{2}{3} U_0\right] = \frac{2}{3} U_1 + \left(\frac{2}{3}\right)^2 U_0$$
+N seconds:
$$U_n = \frac{2}{3} U_{n-1} + \left(\frac{2}{3}\right)^2 U_{n-2} + \dots$$
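The decay rule can be tried out numerically. The following is a minimal sketch, not the actual UNIX implementation: the function names `decay_step` and `decayed_after` are made up for illustration, and it models only the decay term (no new CPU usage accrues between steps). With load = 1 and niceFactor = 0, each step multiplies the utilization by 2/3.

```python
def decay_step(u, load, nice_factor=0.0):
    """One per-second decay: u_next = (2*load / (2*load + 1)) * u + nice_factor."""
    factor = (2.0 * load) / (2.0 * load + 1.0)
    return factor * u + nice_factor


def decayed_after(u0, seconds, load=1.0, nice_factor=0.0):
    """Apply the per-second decay `seconds` times, starting from u0.

    Assumption for illustration: the process accrues no new usage meanwhile.
    """
    u = u0
    for _ in range(seconds):
        u = decay_step(u, load, nice_factor)
    return u


if __name__ == "__main__":
    # With load = 1 the factor is 2/3, so old usage fades geometrically.
    for n in (1, 2, 5, 10):
        print(n, decayed_after(90.0, n))
```

A higher load pushes the factor 2·load/(2·load + 1) closer to 1, so past usage is forgotten more slowly on a busy system.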
Scheduling Algorithms
FIFO is simple but leads to poor average response times. Short
processes are delayed by long processes that arrive before them
RR eliminates this problem, but favors CPU-bound jobs, which have
longer CPU bursts than I/O-bound jobs
SJN, SRT, and HRRN alleviate the problem with FIFO, but require
information on the length of each process. This information is not
always available (although it can sometimes be approximated based
on past history or user input)
Feedback is a way of alleviating the problem with FIFO without
information on process length
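To make the FIFO problem concrete, here is a small sketch comparing average completion time under FIFO and SJN. It assumes all jobs arrive at time 0 and run to completion without preemption; the burst lengths and the helper name `average_completion_time` are hypothetical, chosen only for illustration.

```python
def average_completion_time(bursts):
    """Average completion time when jobs run back to back in list order.

    Assumes every job arrives at time 0 and is never preempted.
    """
    clock = 0
    total = 0
    for burst in bursts:
        clock += burst      # this job finishes at the current clock value
        total += clock
    return total / len(bursts)


if __name__ == "__main__":
    arrival_order = [10, 1, 1]           # one long job arrives ahead of two short ones
    fifo = average_completion_time(arrival_order)
    sjn = average_completion_time(sorted(arrival_order))  # shortest job next
    print("FIFO:", fifo)
    print("SJN: ", sjn)
```

The two short jobs wait behind the long one under FIFO, which is exactly the delay that SJN avoids when burst lengths are known.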
It's a Changing World
Assumption about bi-modal workload no longer holds
Interactive continuous media applications are sometimes
processor-bound but require good response times
New computing model requires more flexibility
How to match priorities of cooperative jobs, such as
client/server jobs?
How to balance execution between multiple threads of a single
process?
Lottery Scheduling
Randomized resource allocation mechanism
Resource rights are represented by lottery tickets
Have rounds of lottery
In each round, the winning ticket (and therefore the
winner) is chosen at random
Your chance of winning depends directly on the
number of tickets that you have
P[winning] = t/T, t = your number of tickets, T = total number of
tickets
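A single lottery round can be sketched as follows. This is an illustrative sketch rather than the published implementation: the names `winner_for` and `hold_lottery` and the dictionary layout are assumptions. A ticket number is drawn uniformly from 0 to T-1, and the client list is walked with a running ticket sum until the drawn number falls inside a client's range.

```python
import random


def winner_for(tickets, draw):
    """Return the client that holds ticket number `draw` (0 <= draw < total).

    `tickets` maps a client name to its ticket count; clients own
    consecutive ticket ranges in dictionary order.
    """
    running = 0
    for client, count in tickets.items():
        running += count
        if draw < running:
            return client
    raise ValueError("draw is outside the range of issued tickets")


def hold_lottery(tickets, rng=random):
    """Draw one winning ticket uniformly at random and return its owner."""
    total = sum(tickets.values())
    return winner_for(tickets, rng.randrange(total))


if __name__ == "__main__":
    tickets = {"A": 3, "B": 1}          # A should win about 3/4 of the rounds
    rng = random.Random(519)
    rounds = 10000
    wins_a = sum(hold_lottery(tickets, rng) == "A" for _ in range(rounds))
    print("fraction of rounds won by A:", wins_a / rounds)
```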
Lottery Scheduling
After n rounds, your expected number of wins is
E[win] = n P[winning]
The expected number of lotteries that a client must wait before
its first win is
E[wait] = 1/P[winning]
Lottery scheduling implements proportional-share resource
management
Ticket currencies allow isolation between users, processes, and
threads
OK, so how do we actually schedule the processor using lottery
scheduling?
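Before turning to implementation, the two expectations above can be checked with exact arithmetic. The numbers below (25 of 100 tickets, 20 rounds) are illustrative and not taken from the slides, and the function names are hypothetical.

```python
from fractions import Fraction


def win_probability(t, total):
    """P[winning] = t / T for a client holding t of T tickets."""
    return Fraction(t, total)


def expected_wins(n, t, total):
    """E[win] = n * P[winning] after n rounds."""
    return n * win_probability(t, total)


def expected_wait(t, total):
    """E[wait] = 1 / P[winning], in lotteries, until the first win."""
    return 1 / win_probability(t, total)


if __name__ == "__main__":
    print(win_probability(25, 100))    # chance of winning any one round
    print(expected_wins(20, 25, 100))  # expected wins over 20 rounds
    print(expected_wait(25, 100))      # expected lotteries until first win
```

Doubling a client's tickets doubles its expected wins and halves its expected wait, which is the proportional-share property.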
Implementation
Performance
Allocated and observed execution ratios between two tasks running the
Dhrystone benchmark. With the exception of the 10:1 allocation ratio,
all observed ratios are close to their allocations.
Short-term Allocation Ratio
Isolation
Five tasks running the Dhrystone benchmark. Let amount.currency
denote a ticket allocation of amount denominated in currency. Tasks
A1 and A2 have allocations 100.A and 200.A, respectively. Tasks B1 and
B2 have allocations 100.B and 200.B, respectively. Halfway through the
experiment, B3 is started with allocation 300.B. This inflates the
number of tickets in B from 300 to 600. There is no effect on tasks in
currency A or on the aggregate iteration ratio of A tasks to B tasks.
Tasks B1 and B2 slow to half their original rates, corresponding to
the factor of 2 inflation caused by B3.
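The isolation effect can be reproduced with a little arithmetic. The sketch below converts each allocation into base units by dividing a currency's funding among the tickets issued in that currency. The equal base funding of 1000 per currency is an assumption made for illustration (the slide does not state how A and B are funded), and `base_values` is a hypothetical helper.

```python
def base_values(allocations, funding):
    """Convert per-currency ticket allocations into base-unit values.

    allocations: {task: (amount, currency)}
    funding:     {currency: base tickets backing that currency} (assumed)
    """
    issued = {}
    for amount, currency in allocations.values():
        issued[currency] = issued.get(currency, 0) + amount
    return {
        task: funding[currency] * amount / issued[currency]
        for task, (amount, currency) in allocations.items()
    }


if __name__ == "__main__":
    funding = {"A": 1000, "B": 1000}   # hypothetical equal backing
    before = {"A1": (100, "A"), "A2": (200, "A"),
              "B1": (100, "B"), "B2": (200, "B")}
    after = dict(before, B3=(300, "B"))  # B3 inflates currency B to 600 tickets
    print(base_values(before, funding))
    print(base_values(after, funding))
```

Starting B3 changes only the denominator for currency B, so B1 and B2 drop to half their former value while A1 and A2 are untouched.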
Borrowed-Virtual-Time (BVT) Scheduling
Current scheduling in general purpose systems does not
support rapid dispatch of latency-sensitive applications
Examples include continuous media applications such as
teleconferencing, playing movies, voice-over-IP, etc.
What's the problem with the traditional Unix scheduler?
Beauty of BVT is its simplicity
Corollary: not that much to say
Tricky part is figuring out the appropriate parameters
Part of the paper is on this (which I'm going to skip)
BVT Scheduling: Basic Idea
Scheduling is done based on virtual time
Each thread has
EVT (effective virtual time)
AVT (actual virtual time)
W (warp factor)
warpBack (whether warp is on or not)
EVT of thread is computed as
$$E = A - (\mathit{warpBack}\ ?\ W : 0)$$
where E is the thread's EVT and A is its AVT
Threads accumulate virtual time as they run
Thread with earliest EVT is scheduled next
BVT Scheduling: Details
Can only switch every C time units to prevent thrashing
Threads can accumulate virtual time at different rates
Allow for weighted fair sharing of CPU
To make sure that latency-sensitive threads are
scheduled right away, give these threads high warp
values
Limit how much and for how long a thread can warp, to
prevent abuse
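The core of BVT dispatch can be sketched in a few lines. This is a simplified illustration, not the paper's implementation: the `Thread` class, the `weight` field, and the function names are assumptions, and the sketch leaves out the switch allowance C and the warp limits mentioned above.

```python
from dataclasses import dataclass


@dataclass
class Thread:
    name: str
    avt: float = 0.0         # actual virtual time (A)
    warp: float = 0.0        # warp factor (W)
    warp_back: bool = False  # whether warp is currently on
    weight: float = 1.0      # hypothetical share weight; larger runs more

    def evt(self):
        """Effective virtual time: E = A - (warpBack ? W : 0)."""
        return self.avt - (self.warp if self.warp_back else 0.0)


def pick_next(threads):
    """Schedule the thread with the earliest effective virtual time."""
    return min(threads, key=lambda t: t.evt())


def run_for(thread, real_time):
    """Charge virtual time; heavier threads accumulate it more slowly."""
    thread.avt += real_time / thread.weight


if __name__ == "__main__":
    batch = Thread("batch", avt=10.0)
    media = Thread("media", avt=15.0, warp=10.0, warp_back=True)
    chosen = pick_next([batch, media])
    print("dispatch:", chosen.name, "EVT =", chosen.evt())
    run_for(chosen, 4.0)
    print("media AVT after running:", media.avt)
```

Warping lets the latency-sensitive thread jump ahead in the queue without changing its long-run share, because its actual virtual time still advances while it runs.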
BVT Scheduling: Performance
BVT vs. Lottery
How do the two compare?
Parallel Processor Scheduling
Simulating Ocean Currents
Model as two-dimensional grids
Discretize in space and time
finer spatial and temporal
resolution => greater accuracy
Many different computations per
time step
set up and solve equations
Concurrency across and within
grid computations
Figure: (a) Cross sections; (b) Spatial discretization of a cross section
Case Study 2: Simulating Galaxy Evolution
Simulate interactions of many stars evolving over time
Computing forces is expensive
O(n^2) brute force approach
Hierarchical methods take advantage of force law: $G \frac{m_1 m_2}{r^2}$
Figure labels: star on which forces are being computed; star too close to
approximate; small group far enough away to approximate to center of mass;
large group far enough away to approximate
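The brute-force computation and the center-of-mass approximation that hierarchical methods rely on can be sketched as follows. This is a 2-D illustration under simplifying assumptions; the function names are hypothetical, and no tree or distance criterion is implemented here.

```python
import math

G = 6.674e-11  # gravitational constant, SI units


def pairwise_force(m1, p1, m2, p2):
    """Force vector on body 1 due to body 2, with magnitude G*m1*m2/r^2."""
    dx, dy = p2[0] - p1[0], p2[1] - p1[1]
    r = math.hypot(dx, dy)
    magnitude = G * m1 * m2 / (r * r)
    return (magnitude * dx / r, magnitude * dy / r)


def brute_force(masses, positions):
    """O(n^2): sum the force from every other body on each body."""
    forces = []
    for i in range(len(masses)):
        fx = fy = 0.0
        for j in range(len(masses)):
            if i != j:
                f = pairwise_force(masses[i], positions[i],
                                   masses[j], positions[j])
                fx += f[0]
                fy += f[1]
        forces.append((fx, fy))
    return forces


def center_of_mass(masses, positions):
    """Replace a group of bodies by one body: (total mass, position)."""
    total = sum(masses)
    x = sum(m * p[0] for m, p in zip(masses, positions)) / total
    y = sum(m * p[1] for m, p in zip(masses, positions)) / total
    return total, (x, y)
```

A hierarchical method would call `pairwise_force` once against the `center_of_mass` of a distant group instead of once per star in that group.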
Case Study 2: Barnes-Hut
Many time steps, plenty of concurrency across stars within each time step
Locality Goal
Particles close together in space should be on same processor
Difficulties: Nonuniform, dynamically changing
Figure: Spatial Domain; Quad-tree
Case Study 3: Rendering Scenes by Ray Tracing
Shoot rays into scene through
pixels in projection plane
Result is color for pixel
Rays shot through pixels in
projection plane are called primary
rays
Reflect and refract when they hit
objects
Recursive process generates ray
tree per primary ray
Tradeoffs between execution time
and image quality
Figure labels: viewpoint; projection plane; 3D scene; ray from viewpoint to
upper right corner pixel; dynamically generated ray
Partitioning
Need dynamic assignment
Use contiguous blocks to exploit spatial coherence among
neighboring rays, plus tiles for task stealing
Figure: a block is the unit of assignment; a tile is the unit of
decomposition and stealing
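One way to picture blocks, tiles, and stealing is the sketch below. It is a simplification under stated assumptions: tiles are numbered in row-major order, a "block" is just a contiguous run of tiles handed to one worker, and an idle worker steals from the tail of the longest queue. The names `build_queues` and `next_tile` are hypothetical.

```python
from collections import deque


def build_queues(width, height, tile, workers):
    """Cut the image into tile x tile tiles and give each worker one
    contiguous block of tiles (row-major order)."""
    tiles = [(x, y)
             for y in range(0, height, tile)
             for x in range(0, width, tile)]
    per_worker = -(-len(tiles) // workers)  # ceiling division
    return [deque(tiles[i * per_worker:(i + 1) * per_worker])
            for i in range(workers)]


def next_tile(queues, me):
    """Take the next tile from my own block, or steal one if I am idle."""
    if queues[me]:
        return queues[me].popleft()
    # Steal from the far end of the longest queue to stay away from the
    # tiles its owner will touch next.
    victim = max(range(len(queues)), key=lambda i: len(queues[i]))
    if queues[victim]:
        return queues[victim].pop()
    return None  # no work left anywhere


if __name__ == "__main__":
    queues = build_queues(width=4, height=4, tile=2, workers=2)
    for _ in range(3):
        print("worker 0 renders tile", next_tile(queues, 0))
```

Working through its own block first preserves coherence among neighboring rays; stealing only kicks in once a worker runs dry.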
Sample Speedups
Coscheduling (Gang)
Cooperating processes may interact frequently
What problem does this lead to?
Fine-grained parallel applications have a process working
set
Two things needed
Identify the process working set
Coschedule them
Assumption: the process working set is explicitly identified
Some good recent work has shown that it may be
possible to dynamically identify the process working set
Coscheduling
What is coscheduling?
Coordinating across nodes to make sure that processes
belonging to the same process working set are
scheduled simultaneously
How might we do this?
Impact of OS Scheduling Policies and Synchronization on Performance
Consider performance for a set of applications under:
Feedback priority scheduling
Spinning
Blocking
Spin-and-block
Block-and-hand-off
Block-and-affinity
Gang scheduling (time-sharing coscheduling)
Process control (space-sharing coscheduling)
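As a rough illustration of the spin-and-block idea in the list above, the sketch below tries a lock a bounded number of times without blocking and then falls back to a blocking acquire. It is only a sketch of the policy's shape: `spin_then_block` and the `max_spins` default are made up for illustration, and Python threads do not reproduce the scheduler interactions that the slides measure.

```python
import threading


def spin_then_block(lock, max_spins=1000):
    """Acquire `lock`, spinning first and blocking only if spinning fails.

    Returns "spin" if the lock was obtained while spinning, otherwise
    "block" after a blocking acquire.
    """
    for _ in range(max_spins):
        if lock.acquire(blocking=False):  # non-blocking attempt
            return "spin"
    lock.acquire()                        # give up the CPU and wait
    return "block"


if __name__ == "__main__":
    lock = threading.Lock()
    print(spin_then_block(lock))          # uncontended lock
    lock.release()
```

Spinning pays off when the holder releases the lock soon (for example, when it is running on another processor); blocking avoids wasting the time slice when the holder has been descheduled.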
Applications
Normal Scheduling with Spinning
Normal Scheduling with Blocking Locks
Gang Scheduling
Process Control (Space-Sharing)
Multiprocessor Scheduling
Load sharing: poor locality; poor synchronization
behavior; simple; good processor utilization. Affinity
or per processor queues can improve locality.
Gang scheduling: central control; fragmentation --
unnecessary processor idle times (e.g., two
applications with P/2+1 threads); good
synchronization behavior; if careful, good locality
Hardware partitions: poor utilization for I/O-intensive
applications; fragmentation -- unnecessary processor
idle times when partitions left are small; good locality
and synchronization behavior
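The gang-scheduling fragmentation example (two applications with P/2+1 threads) can be quantified with a small sketch. It assumes gangs are never split across time slices and are packed first-fit; both the packing rule and the function names are assumptions for illustration.

```python
def gang_slices(gang_sizes, processors):
    """Pack whole gangs into time slices, first-fit, never splitting a gang."""
    slices = []
    for size in gang_sizes:
        if size > processors:
            raise ValueError("gang does not fit on the machine")
        for members in slices:
            if sum(members) + size <= processors:
                members.append(size)
                break
        else:
            slices.append([size])  # no existing slice has room
    return slices


def utilization(gang_sizes, processors):
    """Fraction of processor-slots doing work over one rotation of slices."""
    slices = gang_slices(gang_sizes, processors)
    return sum(gang_sizes) / (processors * len(slices))


if __name__ == "__main__":
    P = 16
    print(utilization([P // 2 + 1, P // 2 + 1], P))  # two gangs of 9 threads
    print(utilization([P // 2, P // 2], P))          # two gangs of 8 threads
```

With P/2+1 threads each, the two gangs cannot share a slice, so P/2-1 processors sit idle in every slice; with P/2 threads each they fit side by side.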