
Unit 4
Multirate systems: priority-based scheduling, evaluating operating system performance.
Network and multiprocessor: network and multiprocessor categories, distributed embedded processors, MPSoCs and shared memory multiprocessors, design example.

Multirate Systems
Implementing code that satisfies timing requirements is even more complex when multiple rates of computation must be handled. Multirate embedded computing systems are very common, including automobile engines, printers, and cell phones. In all these systems, certain operations must be executed periodically, and each operation is executed at its own rate.

A process is a single execution of a program. If we run the same program two different times, we have created two different processes. Each process has its own state, which includes not only its registers but all of its memory.

The first job of the OS is to determine which process runs next. The work of choosing the order of running processes is known as scheduling. The OS considers a process to be in one of three basic scheduling states: waiting, ready, or executing. There is at most one process executing on the CPU at any time.

Definition: Scheduling is the method by which threads, processes, or data flows are given access to system resources (e.g., processor time, communications bandwidth). With a proper scheduling algorithm, multiple tasks can be performed in a given time.

The scheduler is concerned mainly with:
Throughput - the total number of processes that complete their execution per time unit.
Response time - the amount of time from when a request is submitted until the first response is produced.
Fairness - equal CPU time for each process (or, more generally, appropriate times according to each process's priority).
Waiting time - the time for which the process remains in the ready queue.

A scheduling algorithm is:
Preemptive: if the active process, task, or thread can be temporarily suspended to execute a more important process, task, or thread.
Non-preemptive: if the active process, task, or thread cannot be suspended, i.e., it always runs to completion.

Some commonly used RTOS scheduling algorithms are:

1. Cooperative Scheduling of ready tasks in a queue.
2. Cyclic and round robin (time slicing) Scheduling.
3. Preemptive Scheduling.
4. Rate-Monotonic Scheduling (RMS).
5. Scheduling using Earliest deadline first (EDF).

1. Cooperative Scheduling of ready tasks in a queue.

Cooperative means that each task cooperates to let the running task finish. No task blocks anywhere between its ready and finish states. Tasks are serviced in cyclic order.

2. Cyclic and round robin (time slicing) Scheduling.

Cyclic Scheduling

Round Robin (time slicing) Scheduling
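
As a rough illustration of time slicing, the sketch below gives each ready task one fixed quantum in turn and sends an unfinished task to the back of the queue. The function name `round_robin`, the task names, and the burst times are assumptions made up for this example; context-switch cost is ignored.

```python
from collections import deque


def round_robin(burst_times, quantum):
    """Return the list of (task, time_run) slices in execution order.

    burst_times: dict mapping task name -> total CPU time needed
    quantum:     length of one time slice
    """
    # Tasks start in the ready queue in the order they were given.
    queue = deque(burst_times.items())
    timeline = []
    while queue:
        name, remaining = queue.popleft()
        run = min(quantum, remaining)
        timeline.append((name, run))
        if remaining > run:
            # Not finished: the task goes to the back of the queue.
            queue.append((name, remaining - run))
    return timeline


timeline = round_robin({"A": 5, "B": 2, "C": 4}, quantum=2)
for name, run in timeline:
    print(f"{name} runs for {run}")
```

A shorter quantum improves response time for short tasks but increases the number of context switches, which this sketch does not charge for.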

Priority-based scheduling in RTOS

1. Static priority: A task is given a priority at the time it is created, and it keeps this priority during its whole lifetime. The scheduler is very simple, because it looks at all wait queues at each priority level and starts the task with the highest priority.
2. Dynamic priority: The scheduler becomes more complex because it has to calculate each task's priority online, based on dynamically changing parameters.

A typical RTOS is based on a fixed-priority preemptive scheduler:
Assign each process a priority.
At any time, the scheduler runs the highest-priority process that is ready to run.
A process runs to completion unless preempted.
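
The fixed-priority preemptive rules above can be sketched as a tick-by-tick simulation. This is only an illustration under simplifying assumptions: time advances in unit ticks, a larger number means a higher priority, and the names `run_fixed_priority`, `low`, and `high` are hypothetical.

```python
def run_fixed_priority(tasks, horizon):
    """Simulate a fixed-priority preemptive scheduler in unit time steps.

    tasks:   dict name -> (priority, release_time, exec_time);
             a larger priority number means higher priority (an assumption).
    horizon: number of ticks to simulate.
    Returns the name of the task run at each tick (None when the CPU is idle).
    """
    remaining = {name: spec[2] for name, spec in tasks.items()}
    trace = []
    for t in range(horizon):
        # A task is ready once released and while it still has work left.
        ready = [name for name, (_, release, _) in tasks.items()
                 if release <= t and remaining[name] > 0]
        if not ready:
            trace.append(None)
            continue
        # Re-evaluating every tick is what makes the scheduler preemptive.
        current = max(ready, key=lambda name: tasks[name][0])
        remaining[current] -= 1
        trace.append(current)
    return trace


trace = run_fixed_priority({"low": (1, 0, 3), "high": (2, 1, 2)}, horizon=6)
print(trace)
```

In this run the low-priority task starts first, is preempted when the high-priority task is released at tick 1, and resumes only after the high-priority task completes.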

Rate-Monotonic Scheduling

Rate-monotonic scheduling (RMS), introduced by Liu and Layland [Liu73], was one of the first scheduling policies developed for real-time systems and is still very widely used. RMS is a static scheduling policy because it assigns fixed priorities to processes. It turns out that these fixed priorities are sufficient to efficiently schedule the processes in many situations. The theory underlying RMS is known as rate-monotonic analysis (RMA). This theory, as summarized below, uses a relatively simple model of the system:
All processes run periodically on a single CPU.
Context switching time is ignored.
There are no data dependencies between processes.
The execution time for a process is constant.
All deadlines are at the ends of their periods.
The highest-priority ready process is always selected for execution.

Earliest-deadline-first (EDF)
A task with a closer deadline gets a higher scheduling priority: the process with the soonest deadline is given the highest priority. The scheduler needs to know not only the deadlines of all the tasks it has to schedule, but also their durations.
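
A minimal sketch of the EDF rule, assuming unit time ticks and one-shot jobs described by a release time, a duration, and an absolute deadline. The function name `edf_finish_times` and the job set are invented for illustration; the code simply runs, at every tick, the ready job whose deadline is nearest and records when each job finishes.

```python
def edf_finish_times(jobs, horizon):
    """Simulate earliest-deadline-first scheduling in unit time steps.

    jobs:    dict name -> (release_time, exec_time, absolute_deadline)
    horizon: number of ticks to simulate
    Returns a dict name -> completion time for every job that finished.
    """
    remaining = {name: spec[1] for name, spec in jobs.items()}
    finish = {}
    for t in range(horizon):
        ready = [name for name, (release, _, _) in jobs.items()
                 if release <= t and remaining[name] > 0]
        if not ready:
            continue  # CPU idle during this tick
        # Dynamic priority: the nearest absolute deadline wins.
        current = min(ready, key=lambda name: jobs[name][2])
        remaining[current] -= 1
        if remaining[current] == 0:
            finish[current] = t + 1
    return finish


jobs = {"J1": (0, 3, 10), "J2": (1, 2, 4), "J3": (2, 1, 8)}
finish = edf_finish_times(jobs, horizon=10)
for name, done in sorted(finish.items()):
    print(name, "finished at", done, "deadline", jobs[name][2])
```

Comparing each completion time with the job's deadline shows whether the job set met its deadlines over the simulated interval.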

What are Hard Real-Time Scheduling Considerations

Hard real-time scheduling presents its own set of unique problems. Rate monotonic scheduling is one approach to addressing them.
Hard real-time: tasks must be guaranteed to complete within a predefined amount of time.
Soft real-time: a statistical distribution of task response times is acceptable.
Rate monotonic refers to assigning priorities as a monotonic function of the rate (frequency of occurrence) of the processes.

Rate Monotonic Scheduling (RMS) can be accomplished based upon rate monotonic principles. Rate Monotonic Analysis (RMA) can be performed statically on any hard real-time system concept to decide if the system is schedulable.

There are three different types of algorithms:
Fixed Priority Assignment (Static)
Deadline Driven Priority Assignment (Dynamic)
Mixed Priority Assignment

Fixed Priority Scheduling Algorithm
Assign the priority of each task according to its period, so that the shorter the period, the higher the priority.
Tasks are defined based on:
Task names: t1, t2,...,tm
Request Period: T1, T2,...,Tm
Run-Time: C1, C2,...,Cm
For two tasks, there are two possible priority orderings:
Case 1: Priority(t1) > Priority(t2)
Case 2: Priority(t2) > Priority(t1)
Processor utilization is defined as the fraction of processor time spent in the execution of the task set.
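
The sketch below applies the fixed priority rule to a small task set and computes its processor utilization as the sum of Ci/Ti. It also evaluates the utilization bound m(2^(1/m) - 1) from the Liu and Layland analysis, which is not stated in the text above and is brought in here as an assumption. The task set and the function names are hypothetical.

```python
def rms_priorities(periods):
    """Rank tasks so that the shortest period gets rank 1 (highest priority)."""
    order = sorted(periods, key=periods.get)
    return {name: rank for rank, name in enumerate(order, start=1)}


def utilization(periods, run_times):
    """Fraction of processor time used by the task set: sum of Ci / Ti."""
    return sum(run_times[name] / periods[name] for name in periods)


def liu_layland_bound(m):
    """Sufficient (not necessary) RMS utilization bound for m tasks."""
    return m * (2 ** (1 / m) - 1)


periods = {"t1": 4, "t2": 6, "t3": 12}    # T1, T2, T3
run_times = {"t1": 1, "t2": 2, "t3": 3}   # C1, C2, C3

u = utilization(periods, run_times)
bound = liu_layland_bound(len(periods))
print("priorities:", rms_priorities(periods))
print(f"utilization = {u:.3f}, bound = {bound:.3f}")
if u <= bound:
    print("schedulable under RMS by the utilization test")
else:
    print("utilization test is inconclusive; a more detailed analysis is needed")
```

Because the bound is only a sufficient condition, a task set whose utilization exceeds it may still be schedulable; the test then simply gives no answer.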

The Deadline Driven Scheduling Algorithm
Priorities are assigned to tasks according to the deadlines of their current requests.
A task will be assigned the highest priority if the deadline of its current request is the nearest.
Conversely, a task will be assigned the lowest priority if the deadline of its current request is the furthest.
At any instant, the task with the highest priority and a yet unfulfilled request will be executed.

A Mixed Scheduling Algorithm
Tasks with small execution periods are scheduled using the fixed priority algorithm.
All other tasks are scheduled dynamically and can only run on the CPU when all fixed priority tasks have completed.
Originally motivated by interrupt hardware limitations:
Interrupt hardware acted as a fixed priority scheduler.
Did not appear to be compatible with a hardware dynamic scheduler.

Multiprocessor system
A multiprocessor system is a collection of standard processors put together in an innovative way to improve the performance and speed of computer hardware. The main feature of this architecture is that it provides high speed at low cost in comparison to a uniprocessor. In a distributed system, the high cost of a multiprocessor can be offset by employing it on computationally intensive tasks, that is, by making it a compute server. The multiprocessor system is generally characterised by increased system throughput and application speedup through parallel processing.

Throughput can be improved, in a time-sharing environment, by executing a number of unrelated user processes on different processors in parallel. As a result, a large number of different tasks can be completed per unit of time without explicit user direction. Application speedup, on the other hand, is possible by splitting an application into multiple processes scheduled to work on different processors. The scheduling can be done in two ways:
1) Automatically, by a parallelising compiler.
2) By an explicit-tasking approach, where each programme submitted for execution is treated by the operating system as an independent process.
Multiprocessor operating systems aim to support high performance through multiple CPUs.

Processor Coupling
Tightly-coupled multiprocessor systems contain multiple CPUs that are connected at the bus level. Loosely-coupled multiprocessor systems, often referred to as clusters, are based on multiple standalone single- or dual-processor commodity computers interconnected via a high-speed communication system.

Architectures for multiprocessor interconnection include:
Bus-oriented system
Crossbar-connected system
Hypercubes
Multistage switch-based system

Programming models (communication models) of multiprocessors
1. Shared memory
2. Message passing

With shared memory multiprocessors, processing elements (PEs) access a shared memory space via an interconnection network. With a message passing multiprocessor, each processor has its own local memory, and PEs communicate with one another by passing messages through the interconnection network. Most embedded processors use a combination of shared memory and message passing.

1. shared memory
In a shared memory model, multiple workers all operate on the same data. This opens up a lot of the concurrency issues that are common in parallel programming.

2. Message passing
Message passing systems make workers communicate through a messaging system. Messages keep everyone separated, so that workers cannot modify each other's data.
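
The contrast between the two models can be sketched with Python threads standing in for processing elements. This is only an analogy: real PEs would communicate over an interconnection network, and the function names here are hypothetical. In the first function all workers update one shared total and need a lock to avoid races; in the second each worker keeps private data and sends a single message with its result.

```python
import queue
import threading


def shared_memory_sum(values, workers=4):
    """All workers update one shared total, so access must be locked."""
    total = [0]                      # shared data visible to every worker
    lock = threading.Lock()

    def work(chunk):
        for v in chunk:
            with lock:               # without the lock, updates could be lost
                total[0] += v

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return total[0]


def message_passing_sum(values, workers=4):
    """Each worker computes on private data and sends one result message."""
    mailbox = queue.Queue()

    def work(chunk):
        mailbox.put(sum(chunk))      # the only interaction is the message

    threads = [threading.Thread(target=work, args=(values[i::workers],))
               for i in range(workers)]
    for th in threads:
        th.start()
    for th in threads:
        th.join()
    return sum(mailbox.get() for _ in range(workers))


data = list(range(1000))
print(shared_memory_sum(data), message_passing_sum(data))
```

The shared memory version needs explicit synchronization around every update, while the message passing version confines all interaction to the mailbox.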

MULTIPROCESSOR systems-on-chips (MPSoCs)

Multiprocessor systems-on-chips (MPSoCs) have emerged in the past decade as an important class of very large scale integration (VLSI) systems. An MPSoC is a system-on-chip (a VLSI system that incorporates most or all of the components necessary for an application) that uses multiple programmable processors as system components. MPSoCs are widely used in networking, communications, signal processing, and multimedia, among other applications. Embedded computing, in contrast, implies real-time performance. In real-time systems, if the computation is not done by a certain deadline, the system fails. If the computation is done early, the system may not benefit. High-performance systems must often operate within strict power and cost budgets. As a result, MPSoC .

A. Performance and Power Efficiency: An MPSoC can save energy in many ways and at all levels of abstraction. Irregular memory systems save energy because multiported RAMs burn more energy; eliminating ports in parts of the memory where they are not needed saves energy. Similarly, irregular interconnection networks save power by reducing the loads that must be driven in the network. Using different instruction sets for different processors can make each CPU more efficient for the tasks it is required to execute.

B. Real-Time Performance (MPSoCs and shared memory multiprocessors)

Another important motivation to design new MPSoC architectures is real-time performance. Heterogeneous architectures, although they are harder to program, can provide improved real-time behavior by reducing conflicts among processing elements and tasks. For example, consider a shared memory multiprocessor in which all CPUs have access to all parts of memory. The hardware cannot directly limit accesses to a particular memory location; therefore, noncritical accesses from one processor may conflict with critical accesses from another. Software methods can be used to find and eliminate such conflicts, but only at noticeable cost. Furthermore, access to any memory location in a block may be sufficient to disrupt real-time access to a few specialized locations in that block. However, if that memory block is addressable only by certain processors, then programmers can much more easily determine what tasks are accessing the locations to ensure proper real-time responses.


DISTRIBUTED EMBEDDED ARCHITECTURES

A distributed embedded system can be organized in many different ways, but its basic units are the PE and the network, as illustrated in Figure 8.1. A PE may be an instruction set processor such as a DSP, CPU, or microcontroller, as well as a nonprogrammable unit such as the ASICs used to implement PE 4. An I/O device such as PE 1 (which we call here a sensor or actuator, depending on whether it provides input or output) may also be a PE, so long as it can speak the network protocol to communicate with other PEs. The network in this case is a bus, but other network topologies are also possible. It is also possible that the system can use more than one network, such as when relatively independent functions require relatively little communication among them. We often refer to the connection between PEs provided by the network as a communication link. The system of PEs and networks forms the hardware platform on which the application runs.

NETWORKS FOR EMBEDDED SYSTEMS

Networks for embedded computing span a broad range of requirements; many of those requirements are very different from those for general-purpose networks. Some networks are used in safety-critical applications, such as automotive control. Some networks, such as those used in consumer electronics systems, must be very inexpensive. Other networks, such as industrial control networks, must be extremely rugged and reliable.

Several interconnect networks have been developed especially for distributed embedded computing:
The I2C bus is used in microcontroller-based systems.
The Controller Area Network (CAN) bus was developed for automotive electronics. It provides megabit rates and can handle large numbers of devices.
Ethernet and variations of standard Ethernet are used for a variety of control applications.

In addition, many networks designed for general-purpose computing have been put to use in embedded applications as well. In this section, we study some commonly used embedded networks, including the I2C bus and Ethernet; we will also briefly discuss networks for industrial applications.

8.2.1 The I2C Bus

The I2C bus [Phi92] is a well-known bus commonly used to link microcontrollers into systems. It has even been used for the command interface in an MPEG-2 video chip [van97]; while a separate bus was used for high-speed video data, setup information was transmitted to the on-chip controller through an I2C bus interface. I2C is designed to be low cost, easy to implement, and of moderate speed (up to 100 kbit/s for the standard bus and up to 400 kbit/s for the extended bus). As a result, it uses only two lines: the serial data line (SDL) for data and the serial clock line (SCL), which indicates when valid data are on the data line. Figure 8.7 shows the structure of a typical I2C bus system. Every node in the network is connected to both SCL and SDL. Some nodes may be able to act as bus masters, and the bus may have more than one master. Other nodes may act as slaves that only respond to requests from masters.

The basic electrical interface to the bus is shown in Figure 8.8. The bus does not define particular voltages to be used for high or low, so that either bipolar or MOS circuits can be connected to the bus. Both bus signals use open collector/open drain circuits. A pull-up resistor keeps the default state of the signal high, and transistors are used in each bus device to pull down the signal when a 0 is to be transmitted. Open collector/open drain signaling allows several devices to simultaneously write the bus without causing electrical damage. The open collector/open drain circuitry also allows a slave device to stretch a clock signal during a read from a slave. The master is responsible for generating the SCL clock, but the slave can stretch the low period of the clock (but not the high period) if necessary.

The I2C bus is designed as a multimaster bus: any one of several different devices may act as the master at various times. As a result, there is no global master to generate the clock signal on SCL. Instead, a master drives both SCL and SDL when it is sending data. When the bus is idle, both SCL and SDL remain high. When two devices try to drive either SCL or SDL to different values, the open collector/open drain circuitry prevents errors, but each master device must listen to the bus while transmitting to be sure that it is not interfering with another message: if the device receives a different value than it is trying to transmit, then it knows that it is interfering with another message.

Every I2C device has an address. The addresses of the devices are determined by the system designer, usually as part of the program for the I2C driver. The addresses must of course be chosen so that no two devices in the system have the same address. A device address is 7 bits in the standard I2C definition (the extended I2C allows 10-bit addresses). The address 0000000 is used to signal a general call or bus broadcast, which can be used to signal all devices simultaneously. The address 11110XX is reserved for the extended 10-bit addressing scheme; there are several other reserved addresses as well.

A bus transaction comprises a series of 1-byte transmissions: an address followed by one or more data bytes. I2C encourages a data-push programming style. When a master wants to write a slave, it transmits the slave's address followed by the data. Since a slave cannot initiate a transfer, the master must send a read request with the slave's address and let the slave transmit the data. Therefore, an address transmission includes the 7-bit address and 1 bit for data direction: 0 for writing from the master to the slave and 1 for reading from the slave to the master. (This explains the 7-bit addresses on the bus.) The format of an address transmission is shown in Figure 8.9.

A bus transaction is initiated by a start signal and completed with an end signal as follows:
A start is signaled by leaving the SCL high and sending a 1 to 0 transition on SDL.
A stop is signaled by setting the SCL high and sending a 0 to 1 transition on SDL.
However, starts and stops must be paired. A master can write and then read (or read and then write) by sending a start after the data transmission, followed by another address transmission and then more data. The basic state transition graph for the master's actions in a bus transaction is shown in Figure 8.10.
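
The address transmission described above (a 7-bit address plus one direction bit) can be sketched as a byte-packing helper. The placement of the address in the upper seven bits and the direction flag in the least significant bit follows the standard I2C byte layout, which the text does not spell out since Figure 8.9 is not reproduced here. The function names and the example address 0x50 are hypothetical.

```python
WRITE = 0  # master writes to the slave
READ = 1   # master reads from the slave


def address_byte(address, direction):
    """Pack a 7-bit I2C address and the direction bit into one byte."""
    if not 0 <= address <= 0x7F:
        raise ValueError("standard I2C addresses are 7 bits")
    if direction not in (WRITE, READ):
        raise ValueError("direction must be WRITE (0) or READ (1)")
    # Address occupies the upper 7 bits, direction the lowest bit.
    return (address << 1) | direction


def is_general_call(address):
    """Address 0000000 signals a general call (bus broadcast)."""
    return address == 0


def is_10bit_prefix(address):
    """Addresses of the form 11110XX are reserved for 10-bit addressing."""
    return (address >> 2) == 0b11110


print(hex(address_byte(0x50, WRITE)))   # address byte for a write to 0x50
print(hex(address_byte(0x50, READ)))    # address byte for a read from 0x50
print(is_general_call(0x00), is_10bit_prefix(0x78))
```

A driver would send this byte immediately after the start condition and then wait for the acknowledgment from the addressed slave.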
The formats of some typical complete bus transactions are shown in Figure 8.11. In the first example, the master writes 2 bytes to the addressed slave. In the second, the master requests a read from a slave. In the third, the master writes 1 byte to the slave, and then sends another start to initiate a read from the slave. Figure 8.12 shows how a data byte is transmitted on the bus, including start and stop events. The transmission starts when SDL is pulled low while SCL remains high. After this start condition, the clock line is pulled low to initiate the data transfer. At each bit, the clock line goes high while the data line assumes its proper value of 0 or 1. An acknowledgment is sent at the end of every 8-bit transmission, whether it is an address or data. For acknowledgment, the transmitter does not pull down the SDL, allowing the receiver to set the SDL to 0 if it properly received the byte. After acknowledgment, the SDL goes from low to high while the SCL is high, signaling the stop condition.

The bus uses this feature to arbitrate on each message. When sending, devices listen to the bus as well. If a device is trying to send a logic 1 but hears a logic 0, it immediately stops transmitting and gives the other sender priority. (The devices should be designed so that they can stop transmitting in time to allow a valid bit to be sent.) In many cases, arbitration will be completed during the address portion of a transmission, but arbitration may continue into the data portion. If two devices are trying to send identical data to the same address, then of course they never interfere and both succeed in sending their message.

The I2C interface on a microcontroller can be implemented with varying percentages of the functionality in software and hardware [Phi89]. As illustrated in Figure 8.13, a typical system has a 1-bit hardware interface with routines for byte-level functions. The I2C device takes care of generating the clock and data. The application code calls routines to send an address, send a data byte, and so on, which then generate the SCL and SDL, acknowledges, and so forth. One of the microcontroller's timers is typically used to control the length of bits on the bus. Interrupts may be used to recognize bits. However, when used in master mode, polled I/O may be acceptable if no other pending tasks can be performed, since masters initiate their own transfers.

Automotive Networks: The CAN bus

The CAN bus [Bos07] was designed for automotive electronics and was first used in production cars in 1991. CAN is very widely used in cars as well as in other applications. The CAN bus uses bit-serial transmission. CAN runs at rates of 1 Mbit/s over a twisted pair connection of 40 m. An optical link can also be used. The bus protocol supports multiple masters on the bus. Many of the details of the CAN and I2C buses are similar, but there are also significant differences. As shown in Figure 8.22, each node in the CAN bus has its own electrical drivers and receivers that connect the node to the bus in wired-AND fashion. In CAN terminology, a logical 1 on the bus is called recessive and a logical 0 is dominant. The driving circuits on the bus cause the bus to be pulled down to 0 if any node on the bus pulls the bus down (making 0 dominant over 1). When all nodes are transmitting 1s, the bus is said to be in the recessive state; when a node transmits a 0, the bus is in the dominant state. Data are sent on the network in packets known as data frames.

CAN is a synchronous bus: all transmitters must send at the same time for bus arbitration to work. Nodes synchronize themselves to the bus by listening to the bit transitions on the bus. The first bit of a data frame provides the first synchronization opportunity in a frame. The nodes must also continue to synchronize themselves against later transitions in each frame. The format of a CAN data frame is shown in Figure 8.23. A data frame starts with a single dominant bit (0) and ends with a string of seven recessive bits (1). (There are at least three bit fields between data frames.) The first field in the packet contains the packet's destination address and is known as the arbitration field. The destination identifier is 11 bits long. The trailing remote transmission request (RTR) bit is recessive (1) if the frame is used to request data from the device specified by the identifier. When RTR is dominant (0), the packet is used to write data to the destination identifier. The control field provides an identifier extension and a 4-bit length for the data field with a reserved bit in between. The data field is from 0 to 8 bytes (up to 64 bits), depending on the value given in the control field. A cyclic redundancy check (CRC) is sent after the data field for error detection. The acknowledge field is used to let receivers signal whether the frame was correctly received: the sender puts a recessive bit (1) in the ACK slot of the acknowledge field, and a receiver that received the frame correctly overwrites it with a dominant (0) value. If the sender still sees a recessive 1 on the bus in the ACK slot, it knows that it must retransmit. The ACK slot is followed by a single bit delimiter followed by the end-of-frame field.

Control of the CAN bus is arbitrated using a technique known as Carrier Sense Multiple Access with Arbitration on Message Priority (CSMA/AMP). (As seen in Section 8.2.2, Ethernet uses CSMA without AMP.) This method is similar to the I2C bus's arbitration method; like I2C, CAN encourages a data-push programming style. Network nodes transmit synchronously, so they all start sending their identifier fields at the same time. When a node hears a dominant bit in the identifier when it tries to send a recessive bit, it stops transmitting. By the end of the arbitration field, only one transmitter will be left. The identifier field acts as a priority identifier, with the all-0 identifier having the highest priority.

A remote frame is used to request data from another node. The requestor sets the RTR bit to the recessive value (1) to specify a remote frame; it also specifies zero data bits. The node specified in the identifier field will respond with a data frame that has the requested value. Note that there is no way to send parameters in a remote frame: for example, you cannot use an identifier to specify a device and provide a parameter to say which data value you want from that device. Instead, each possible data request must have its own identifier.

An error frame can be generated by any node that detects an error on the bus. Upon detecting an error, a node interrupts the current transmission with an error frame, which consists of an error flag field followed by an error delimiter field of 8 recessive bits. The error delimiter field allows the bus to return to the quiescent state so that data frame transmission can resume. The bus also supports an overload frame, which is a special error frame sent during the interframe quiescent period. An overload frame signals that a node is overloaded and will not be able to handle the next message. The node can delay the transmission of the next frame with up to two overload frames in a row, hopefully giving it enough time to recover from its overload. The CRC field can be used to check a message's data field for correctness. If a transmitting node does not receive an acknowledgment for a data frame, it should retransmit the data frame until the frame is acknowledged. This action corresponds to the data link layer in the OSI model.

Figure 8.24 shows the basic architecture of a typical CAN controller. The controller implements the physical and data link layers; since CAN is a bus, it does not need network layer services to establish end-to-end connections. The protocol control block is responsible for determining when to send messages, when a message must be resent due to arbitration losses, and when a message should be received.

The FlexRay network has been designed as the next generation of system buses for cars. FlexRay provides high data rates, up to 10 Mbit/s, with deterministic communication. It is also designed to be fault-tolerant.

The Local Interconnect Network (LIN) bus [Bos07] was created to connect components in a small area, such as a single door. The physical medium is a single wire that provides data rates of up to 20 kbit/s for up to 16 bus subscribers. All transactions are initiated by the master and responded to by a frame. The software for the network is often generated from a LIN description file that describes the network subscribers, the signals to be generated, and the frames.
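
The CAN arbitration scheme described earlier in this section can be sketched as a bit-by-bit contest on a wired-AND bus. The sketch assumes that all contending nodes start at the same time, that identifiers are distinct 11-bit values, and that bits are sent most significant bit first; the function name `arbitrate` and the example identifiers are hypothetical.

```python
def arbitrate(identifiers, width=11):
    """Return the identifier that survives bitwise wired-AND arbitration.

    identifiers: distinct message identifiers contending for the bus
    width:       number of identifier bits (11 for the frame described here)
    """
    contenders = set(identifiers)
    if not contenders:
        raise ValueError("at least one identifier is required")
    for bit in range(width - 1, -1, -1):          # most significant bit first
        sent = {ident: (ident >> bit) & 1 for ident in contenders}
        # Wired-AND: the bus is dominant (0) if any node sends a 0.
        bus = min(sent.values())
        # A node that sent recessive (1) but hears dominant (0) backs off.
        contenders = {ident for ident in contenders if sent[ident] == bus}
    # With distinct identifiers exactly one contender remains.
    return min(contenders)


winner = arbitrate([0x65A, 0x123, 0x124])
print(f"winner: {winner:#05x}")
```

Because a dominant 0 always overrides a recessive 1, the numerically lowest identifier wins, which matches the statement that the all-0 identifier has the highest priority.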

Several buses have come into use for passenger entertainment. Bluetooth is becoming the standard mechanism for cars to interact with consumer electronics devices such as audio players or phones. The Media Oriented Systems Transport (MOST) bus [Bos07] was designed for entertainment and multimedia information. The basic MOST bus runs at 24.8 Mbit/s and is known as MOST 25; 50 and 150 Mbit/s versions have also been developed. MOST can support up to 64 devices. The network is organized as a ring.

Data transmission is divided into channels. A control channel transfers control and system management data. Synchronous channels are used to transmit multimedia data; MOST 25 provides up to 15 audio channels. An asynchronous channel provides high data rates but without the quality-of-service guarantees of the synchronous channels.