
Multiprocessors, Threads and Microkernels

Fred Kuhns

Motivation for Multiprocessors


Enhanced Performance
  Concurrent execution of tasks for increased throughput (between processes)
  Exploit concurrency within tasks (parallelism within a process)

Fault Tolerance
  Graceful degradation in the face of failures

Basic MP Architectures
Single Instruction Single Data (SISD)
  conventional uniprocessor designs

Single Instruction Multiple Data (SIMD)
  vector and array processors

Multiple Instruction Single Data (MISD)
  not implemented

Multiple Instruction Multiple Data (MIMD)
  conventional MP designs

MIMD Classifications
Tightly Coupled System - all processors share the same global memory and have the same address space (typical SMP system).
  Main memory is used for IPC and synchronization.

Loosely Coupled System - memory is partitioned and attached to each processor. Examples: hypercubes, clusters (multicomputers).
  Message passing is used for IPC and synchronization.
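The two coupling styles can be contrasted in a minimal Python sketch. It uses threads inside one process as stand-ins for processors, which is an assumption made only for illustration (real loosely coupled nodes share no memory). The function names shared_memory_sum and message_passing_sum are hypothetical.

```python
import queue
import threading

def shared_memory_sum(chunks):
    """Tightly coupled style: workers update one shared variable under a lock."""
    total = 0
    lock = threading.Lock()

    def worker(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:                # synchronization through shared memory
            total += partial

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total

def message_passing_sum(chunks):
    """Loosely coupled style: workers send their results as messages."""
    mailbox = queue.Queue()

    def worker(chunk):
        mailbox.put(sum(chunk))   # IPC by message, no shared variable is updated

    threads = [threading.Thread(target=worker, args=(c,)) for c in chunks]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(mailbox.get() for _ in chunks)
```

Both functions return the same result; they differ only in whether the workers cooperate through a locked shared variable or through messages.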

MP Block Diagram
[Figure: four CPUs, each with its own cache and MMU, connected through an Interconnection Network to four memory modules (MM).]

Memory Access Schemes


Uniform Memory Access (UMA)
  memory is centrally located
  all processors are equidistant from it (equal access times)

Non-Uniform Memory Access (NUMA)
  memory is physically partitioned but accessible by all processors
  processors have the same address space

NO Remote Memory Access (NORMA)
  memory is physically partitioned and not accessible by all processors
  processors have their own address spaces

Other Details of MP
Interconnection technology
  bus
  crossbar switch
  multistage interconnection network

Caching - the cache coherence problem!
  write-update
  write-invalidate
  bus snooping
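The difference between write-invalidate and write-update can be sketched with a toy model of snooping caches. This is a simplified illustration, not a real protocol: it assumes write-through caches, ignores cache line states, and the class names Bus and Cache are hypothetical.

```python
class Bus:
    """Toy shared bus: main memory plus the caches that snoop on it."""
    def __init__(self, policy):
        if policy not in ("write-invalidate", "write-update"):
            raise ValueError("unknown policy")
        self.policy = policy
        self.memory = {}
        self.caches = []

class Cache:
    def __init__(self, bus):
        self.bus = bus
        self.lines = {}                      # address -> cached value
        bus.caches.append(self)

    def read(self, addr):
        if addr not in self.lines:           # miss: fetch from main memory
            self.lines[addr] = self.bus.memory.get(addr, 0)
        return self.lines[addr]

    def write(self, addr, value):
        self.lines[addr] = value
        self.bus.memory[addr] = value        # write-through, for simplicity
        for other in self.bus.caches:        # every other cache snoops the write
            if other is not self and addr in other.lines:
                if self.bus.policy == "write-invalidate":
                    del other.lines[addr]    # stale copy is dropped
                else:
                    other.lines[addr] = value  # stale copy is refreshed
```

Under write-invalidate, the next read by another cache misses and refetches the new value. Under write-update, the other copies are corrected in place at the cost of more bus traffic per write.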

MP OS Structure - 1
Separate Supervisor
  all processors have their own copy of the kernel
  some data is shared for interaction
  dedicated I/O devices and file systems
  good fault tolerance but bad for concurrency

Master/Slave Configuration
  Master: monitors status and assigns work
  Slaves: schedulable pool of resources
  master can be a bottleneck
  poor fault tolerance

MP OS Structure - 2
Symmetric Configuration - most flexible.
  all processors are autonomous and treated equally
  one copy of the kernel, executed concurrently across all processors
  synchronized access to shared data structures:
    lock the entire OS - floating master
    mitigated by dividing the OS into segments that normally have little interaction
    multithread the kernel and control access to resources (continuum)

MP Overview
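[Figure: classification tree. MultiProcessor divides into SIMD and MIMD. MIMD divides into Shared Memory (tightly coupled) and Distributed Memory (loosely coupled). Shared Memory divides into Master/Slave and Symmetric (SMP); Distributed Memory leads to Clusters.]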

SMP OS Design Issues


Threads - effectiveness of parallelism depends on the performance of the primitives used to express and control concurrency.
Process Synchronization - disabling interrupts is not sufficient.
Process Scheduling - efficient, policy-controlled task scheduling. Issues:
  global versus local (per-CPU) scheduling
  task affinity for a particular CPU
  resource accounting
  inter-thread dependencies

SMP OS design issues - cont.


Memory Management - complicated by shared main memory.
  cache coherence
  memory access synchronization
  balancing overhead with increased concurrency

Reliability and Fault Tolerance - degrade gracefully in the event of failures

Typical SMP System


[Figure: four 500MHz CPUs, each with cache and MMU, on a shared System/Memory Bus. Also attached: main memory (50ns), an interrupt controller (INT), system functions (timer, BIOS, reset) and an I/O bridge to the I/O subsystem (ether, scsi, video).]

Issues:
  memory contention
  limited bus bandwidth
  I/O contention
  cache coherence

Typical I/O Bus:
  33MHz/32bit (132MB/s)
  66MHz/64bit (528MB/s)
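The I/O bus figures follow from clock rate times bus width. A small sketch of the arithmetic, assuming one full-width transfer per clock cycle (peak rate, no protocol overhead); the function name is hypothetical.

```python
def peak_bandwidth_mb_per_s(clock_mhz, width_bits):
    """Peak transfer rate assuming one bus-width transfer per clock cycle."""
    bytes_per_transfer = width_bits // 8
    return clock_mhz * bytes_per_transfer    # MHz * bytes = MB/s

print(peak_bandwidth_mb_per_s(33, 32))   # 132
print(peak_bandwidth_mb_per_s(66, 64))   # 528
```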

Some Useful Definitions


Parallelism: the degree to which a multiprocessor application achieves parallel execution.
Concurrency: the maximum parallelism an application can achieve with unlimited processors.
System Concurrency: the kernel recognizes multiple threads of control in a program.
User Concurrency: user-space threads (coroutines) provide a natural programming model for concurrent applications.

Introduction to Threads
[Figure: Single-Threaded Process Model: one Process Control Block, User Address Space, User Stack and Kernel Stack. Multithreaded Process Model: one Process Control Block and User Address Space shared by several threads, each with its own Thread Control Block, User Stack and Kernel Stack.]

Process Concept Embodies


Unit of Resource Ownership - a process is allocated a virtual address space to hold the process image
Unit of Dispatching - a process is an execution path through one or more programs
  execution may be interleaved with other processes

These two characteristics are treated independently by the operating system

Threads
Effectiveness of parallel computing depends on the performance of the primitives used to express and control parallelism
Separate the notion of execution from the process abstraction
Useful for expressing the intrinsic concurrency of a program regardless of the resulting performance
We will discuss three examples of threading:
  user threads
  kernel threads
  scheduler activations

Threads cont.
Thread: a dynamic object representing an execution path and computational state.
One or more threads per process, each having:
  execution state (running, ready, etc.)
  saved thread context when not running
  execution stack
  per-thread static storage for local variables
  shared access to process resources
All threads of a process share a common address space.
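The split between per-thread storage and the shared address space can be shown with Python's standard threading module. This is only an analogy to the process model described above; the variable names are hypothetical.

```python
import threading

shared = []                        # process resource: visible to every thread
per_thread = threading.local()     # per-thread storage: one copy per thread
lock = threading.Lock()

def worker(name):
    per_thread.name = name         # private to the calling thread
    with lock:
        shared.append(per_thread.name)   # the list lives in the shared address space

threads = [threading.Thread(target=worker, args=(f"t{i}",)) for i in range(3)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(shared))                # ['t0', 't1', 't2']
print(hasattr(per_thread, "name"))   # False: the main thread never set its own copy
```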

Thread States
Primary states:
  Running, Ready and Blocked

Operations to change state:
  Spawn: the new thread is provided with a register context and stack pointer
  Block: on an event wait, save user registers, PC and stack pointer
  Unblock: the thread is moved to the ready state
  Finish: deallocate the register context and stacks
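The states and operations above can be written as a small state machine. This is a sketch under assumptions: the "dispatch" operation (Ready to Running) and the FINISHED state are not on the slide and are added only so the machine is complete; ToyThread is a hypothetical name.

```python
from enum import Enum

class State(Enum):
    READY = "ready"
    RUNNING = "running"
    BLOCKED = "blocked"
    FINISHED = "finished"        # assumed terminal state after Finish

# operation -> (required current state, resulting state)
TRANSITIONS = {
    "dispatch": (State.READY, State.RUNNING),   # assumed: scheduler picks the thread
    "block": (State.RUNNING, State.BLOCKED),
    "unblock": (State.BLOCKED, State.READY),
    "finish": (State.RUNNING, State.FINISHED),
}

class ToyThread:
    def __init__(self):
        self.state = State.READY     # Spawn: a new thread starts out ready

    def apply(self, op):
        required, result = TRANSITIONS[op]
        if self.state is not required:
            raise ValueError(f"cannot {op} a {self.state.value} thread")
        self.state = result
```

For example, a freshly spawned ToyThread accepts "dispatch", then "block", then "unblock", but rejects "finish" while it is merely ready.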

User Level Threads


User-level threads - supported by user-level thread libraries
Examples:
  POSIX Pthreads, Mach C-threads, Solaris threads

Benefits:
  no modifications required to the kernel
  flexible and low cost

Drawbacks:
  a thread cannot block without blocking the entire process
  no parallelism (threads are not recognized by the kernel)
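A user-level thread library can be approximated with Python generators: each generator is a "thread", and a round-robin scheduler runs entirely in user space on a single kernel thread. This is an illustrative sketch, not how any of the libraries named above is implemented; run and worker are hypothetical names.

```python
from collections import deque

def run(threads):
    """Round-robin scheduler living entirely in user space."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)            # run until the thread yields the CPU
            ready.append(thread)    # still runnable: back of the ready queue
        except StopIteration:
            pass                    # thread finished

trace = []

def worker(name, steps):
    for i in range(steps):
        trace.append(f"{name}{i}")
        yield                       # voluntary context switch

run([worker("a", 2), worker("b", 3)])
print(trace)   # ['a0', 'b0', 'a1', 'b1', 'b2']
```

Both drawbacks are visible here: a worker that made a blocking call would stall the whole scheduler, and only one worker ever runs at a time.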

Kernel Level Threads


Kernel-level threads - directly supported by the kernel; the thread is the basic scheduling entity
Examples:
  Windows 95/98/NT/2000, Solaris, Tru64 UNIX, BeOS, Linux

Benefits:
  coordination between scheduling and synchronization
  less overhead than a process
  suitable for parallel applications

Drawbacks:
  more expensive than user-level threads
  generality leads to greater overhead

Scheduler Activations
Attempt to combine the benefits of both user and kernel threading support
  a blocking system call should not block the whole process
  the user-space library should make scheduling decisions
  efficiency by avoiding unnecessary user/kernel mode switches
Kernel assigns a set of virtual processors to each process. The user library then schedules threads on these virtual processors.

Scheduler Activations

An activation:
  execution context for a running thread
  space for the kernel to save the processor context of the current user thread when it is stopped by the kernel
Kernel passes a new activation to the library when an upcall is performed.
Library schedules user threads on activations.
An upcall is performed when one of the following occurs:
  a user thread performs a blocking system call
  a blocked thread belonging to the process is unblocked; its library is then notified, allowing it to either schedule a new thread or resume the preempted thread

Pthreads
A POSIX standard (IEEE 1003.1c) API for thread creation and synchronization.
The API specifies the behavior of the thread library; the implementation is left to the library developers.
Common in UNIX operating systems.

UNIX Support for Threading


BSD
  process model only; enhancements in 4.4BSD

Solaris
  user threads, kernel threads, LWPs and, in 2.6, Scheduler Activations

Mach
  kernel threads and tasks. Thread libraries provide the semantics of user threads, LWPs and kernel threads.

Digital UNIX - extends Mach to provide the usual UNIX semantics.
  Pthreads library.

Solaris Threads
Supports:
  user threads (uthreads) via libthread and libpthread
  LWPs, an abstraction that acts as a virtual CPU for user threads
    each LWP is bound to a kthread
  kernel threads (kthreads): every LWP is associated with one kthread, but a kthread may not have an LWP
  interrupts as threads

Solaris kthreads
Fundamental scheduling/dispatching object
All kthreads share the same virtual address space (the kernel's) - cheap context switch
System threads - examples: STREAMS, callout
kthread_t, /usr/include/sys/thread.h
  scheduling info, pointers for scheduler or sleep queues, pointers to klwp_t and proc_t

Solaris LWP
Kernel-provided mechanism to allow both user and kernel thread implementations on one platform.
Bound to a kthread
LWP data (see /usr/include/sys/klwp.h)
  user-level registers, system call parameters, resource usage, pointers to kthread_t and proc_t

All LWPs in a process share:
  signal handlers

Each may have its own:
  signal mask
  alternate stack for signal handling

No global name space for LWPs

Solaris User Threads

Implemented in user libraries
  the library provides synchronization and scheduling facilities
  threads may be bound to LWPs
  unbound threads compete for available LWPs
Manage thread-specific info
  thread id, saved register state, user stack, signal mask, priority*, thread local storage

Solaris provides two libraries: libthread and libpthread.
Try man thread or man pthreads
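The idea of unbound threads competing for a limited set of LWPs can be loosely imitated with a fixed-size worker pool from Python's standard library. This is an analogy only: the pool workers play the role of LWPs and the submitted tasks play the role of user threads. The names user_thread and carriers are hypothetical.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

def user_thread(i):
    # Report which pool worker (the stand-in for an LWP) ran this task.
    return threading.current_thread().name

# Six "user threads" compete for two "LWPs".
with ThreadPoolExecutor(max_workers=2, thread_name_prefix="lwp") as pool:
    carriers = list(pool.map(user_thread, range(6)))

print(len(carriers))        # 6 tasks ran
print(len(set(carriers)))   # on at most 2 distinct workers
```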

Solaris Thread Data Structures


[Figure: proc_t (p_tlist) points to its kthread_t structures (t_procp, t_lwp, t_forw); each klwp_t (lwp_thread, lwp_procp) points to its kthread_t and proc_t.]
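The pointer relationships among these structures can be modeled in a few lines. The field names come from the slide; everything else is a simplifying assumption: the Python classes stand in for the C structs, the kthread list is a plain singly linked list, and add_lwp is a hypothetical helper.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(eq=False)
class Proc:                                  # stands in for proc_t
    p_tlist: Optional["KThread"] = None      # head of the process's kthread list

@dataclass(eq=False)
class KThread:                               # stands in for kthread_t
    t_procp: Proc                            # owning process
    t_lwp: Optional["KLwp"] = None           # None for a kthread without an LWP
    t_forw: Optional["KThread"] = None       # next kthread in the list

@dataclass(eq=False)
class KLwp:                                  # stands in for klwp_t
    lwp_thread: KThread                      # the kthread this LWP is bound to
    lwp_procp: Proc                          # owning process

def add_lwp(proc):
    """Create a kthread/LWP pair and link it at the head of proc's list."""
    kt = KThread(t_procp=proc, t_forw=proc.p_tlist)
    kt.t_lwp = KLwp(lwp_thread=kt, lwp_procp=proc)
    proc.p_tlist = kt
    return kt.t_lwp
```

Starting from a Proc, following p_tlist and then t_forw visits each kthread; from any kthread, t_lwp and t_procp lead back to its LWP and process.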

Solaris Threading Model (Combined)


[Figure: Process 1 and Process 2 at user level; kernel threads, including an interrupt kthread (Int kthr), at kernel level; processors (P) at hardware level.]

Solaris User Level Threads


[Figure: user-level thread state diagram. States: Runnable, Active, Sleeping, Stopped. Transitions: Dispatch, Preempt, Sleep, Wakeup, Stop, Continue.]

Solaris Lightweight Processes


[Figure: LWP state diagram. States: Runnable, Running, Blocked, Stopped. Transitions: Dispatch, Timeslice or Preempt, Blocking System Call, Wakeup, Stop, Continue.]

Solaris Interrupts
One system-wide clock kthread
Pool of 9 partially initialized kthreads per CPU for interrupts
An interrupt thread can block
The interrupted thread is pinned to the CPU

Solaris Signals and Fork


Signals are divided into traps (synchronous) and interrupts (asynchronous)
Each thread has its own signal mask; signal handlers are a global set
Each LWP can specify an alternate stack
fork replicates all LWPs
fork1 replicates only the invoking LWP/thread

Mach
Two abstractions:
  Task - static object: address space and system resources called port rights.
  Thread - fundamental execution unit; runs in the context of a task.

Zero or more threads per task:
  kernel schedulable
  kernel stack
  computational state

Processor sets - available processors are divided into non-intersecting sets.
  permits dedicating processor sets to tasks

Mach c-thread Implementations


Coroutine-based - multiplexes multiple user threads onto a single-threaded task
Thread-based - one-to-one mapping from c-threads to Mach threads. Default.
Task-based - one Mach task per c-thread.

Digital UNIX
Based on the Mach 2.5 kernel
Provides the complete UNIX programmer's interface
4.3BSD code and ULTRIX code ported to Mach
  u-area replaced by utask and uthread
  proc structure retained

Threads:
  signals divided into synchronous and asynchronous
  global signal mask
  each thread can define its own handlers for synchronous signals
  global handlers for asynchronous signals

Windows 2000 Threads


Implements the one-to-one mapping.
Each thread contains
- a thread id
- register set
- separate user and kernel stacks
- private data storage area

Linux Threads
Linux refers to them as tasks rather than threads.
Thread creation is done through the clone() system call.
clone() allows a child task to share the address space of the parent task (process)

4.4 BSD UNIX


Initial support for threads implemented but not enabled in the distribution
proc structure and u-area reorganized
All threads have a unique ID
How are the proc and u areas reorganized to support threads?

Microkernel
Transition to Microkernel discussion


Microkernel
Small operating system core
Contains only essential operating system functions
Many services traditionally included in the operating system are now external subsystems:
  device drivers
  file systems
  virtual memory manager
  windowing system and security services

Microkernel Benefits
Portability
  port-specific code is isolated in the microkernel

Reliability
  modular design, small microkernel, simpler validation

Uniform interface
  all services are provided by means of message passing

Extensibility
  allows the addition of new services
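The uniform interface and extensibility points can be sketched with a toy kernel whose only job is to route messages to registered servers. This is a conceptual illustration, not the design of Mach or any real microkernel; the Microkernel class, the file_server function and the message format are all hypothetical.

```python
class Microkernel:
    """Toy kernel whose only service here is routing messages to servers."""
    def __init__(self):
        self.servers = {}

    def register(self, name, handler):
        self.servers[name] = handler          # extensibility: add a new service

    def send(self, name, message):
        if name not in self.servers:
            raise LookupError(f"no such service: {name}")
        return self.servers[name](message)    # uniform interface: one call for all

def file_server(message):
    # Hypothetical user-level file system server with one in-memory file.
    files = {"/etc/motd": "hello"}
    if message["op"] == "read":
        return files[message["path"]]
    raise ValueError("unsupported operation")

kernel = Microkernel()
kernel.register("fs", file_server)
print(kernel.send("fs", {"op": "read", "path": "/etc/motd"}))   # hello
```

Clients use the same send call for every service, so adding a service never changes the kernel interface.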

Microkernel Benefits
Flexibility
  existing features can be subtracted

Distributed system support
  messages are sent without knowing what the target machine is or where it is located

Object-oriented operating system
  components are objects with clearly defined interfaces that can be interconnected to form software

Microkernel Design
Primitive memory management
  mapping each virtual page to a physical page frame: grant, map and flush

Inter-process communication
I/O and interrupt management
