Streaming Video Over Variable Bit-Rate Wireless Channels

IEEE TRANSACTIONS ON MULTIMEDIA, VOL. 6, NO. 2, APRIL 2004

Thomas Stockhammer, Member, IEEE, Hrvoje Jenkač, Student Member, IEEE, and Gabriel Kuhn
Abstract: We consider streaming of video sequences over both constant and variable bit-rate (VBR) channels. Our goal is to enable decoding of each video unit before its display deadline and, hence, to guarantee successful sequence presentation even if the media rate does not match the channel rate. In this work, we show that the separation between a delay jitter buffer and a decoder buffer is in general suboptimal for VBR video transmitted over VBR channels. We specify the minimum initial delay and the minimum required buffer for a given video stream and a deterministic VBR channel. In addition, we provide some probabilistic statements for the case that the channel bit rate behaves randomly. A specific example tailored to wireless video streaming is discussed in greater detail, and bounds are derived which allow guaranteeing a certain quality-of-service even for random VBR channels in a wireless environment. Simulation results validate the findings.
Index Terms: Receiver buffer, streaming video, variable bit-rate (VBR), wireless video.
I. INTRODUCTION
THE POPULARITY of IP-based video streaming over the Internet is continuously growing, with hundreds of new subscribers registered daily. In addition, existing and emerging wireless systems such as EGPRS, UMTS, CDMA-2000, and WLAN enable IP-based multimedia transmission and reception at any place and time at reasonable and sufficient data rates. Video transmission for mobile terminals is likely to be a major application in future mobile systems and may be a key factor in their success. The video-capable display on mobile devices paves the way for several new applications. However, due to the business models in emerging wireless systems, in which the end user's costs are proportional to the transmitted data volume, and also due to the limited resources of bandwidth and transmission power, compression efficiency is the main target for wireless video and multimedia applications. This limits the application of error-resilience or scalability features which, in general, suffer from significantly reduced efficiency. In addition, mobile devices are hand-held and still constrained in processing power and storage capacity, such that complex receiver algorithms are not applicable. Three major service categories are likely to be integrated in 2.5G and 3G wireless systems: conversational services, packet-switched streaming
Manuscript received January 16, 2003; revised June 15, 2003. The associate editor coordinating the review of this manuscript and approving it for publication was Dr. Wenjun Zeng. H. Jenkač and T. Stockhammer are with the Institute for Communications Engineering (LNT), Munich University of Technology (TUM), D-80290 Munich, Germany. G. Kuhn is with the Center for Mathematical Sciences, Munich University of Technology (TUM), D-80290 Munich, Germany. Digital Object Identifier 10.1109/TMM.2003.822795
services (PSS), and multimedia messaging services (MMS). Whereas conversational services seem to remain a niche application in near-future wireless systems, PSS and MMS are gaining increasing popularity. In this work, we focus on streaming of pre-encoded video to wireless clients, taking into account the aforementioned system constraints and limited resources. Generally, content and service providers are reluctant to provide multimedia data, such as pre-encoded video for streaming services, separately for wired and wireless clients. It is expected that a huge amount of content is exclusively stored on a server in the wired Internet and accessed by both fixed and wireless clients. Packet delays and losses are very common in today's IP-based packet networks. However, due to predictive video coding, lost IP packets result not only in decoding errors of the current frame, but also in quality degradation of subsequent frames included in the dependency chain. Whereas packet losses and delays in the fixed Internet mainly result from network congestion, wireless transmission packet losses and delays usually originate from link layer impairment. Channel coding and automatic repeat request strategies are used to combat bad channel conditions resulting from multipath propagation, scattering, and fading, and to guarantee an error-free reception of IP packets at the expense of delay jitter. A simple means to combat this jitter is the introduction of a decoder buffer in combination with an initial playback delay to smooth the bit rate variations caused by the transmission channel [1]. In this work, we concentrate on the transmission of variable bit rate (VBR) encoded video over VBR channels. After formulating the exact problem and discussing related work, we show that the separation between delay jitter buffer and decoder buffer is, in general, suboptimal for VBR video transmitted over VBR channels.
We define the minimum initial delay and the minimum required buffer for a given video sequence and a deterministic VBR channel. In addition, we provide some probabilistic statements for the case that the channel bit rate behaves randomly. A specific example tailored to wireless video streaming is discussed in greater detail, and bounds are derived which allow guaranteeing a certain quality-of-service (QoS), even for random VBR channels in a wireless environment. Simulations for a specific example are carried out. Concluding remarks and future work items are presented.
II. PROBLEM FORMULATION AND RELATED WORK
Assume the general simple media streaming system setup according to Fig. 1, consisting of a media streaming server, a transport channel, and a streaming client. The server stores several
Fig. 1. Transport of VBR encoded video streams to wired and wireless clients.
pre-encoded media streams, which are in general VBR encoded. VBR coded video usually has much better rate-distortion performance than video coded with constant bit rate (CBR) [2], [3]. Each VBR media stream is characterized by a certain duration $T$ and a sampling curve $p(t)$. The sampling curve is defined as the overall amount of data (e.g., measured in bits) produced by the video encoder up to time $t$. In general, this curve is monotonically increasing and has a staircase characteristic. In addition, we assume $p(t) = 0$ for $t < 0$ and $p(t) = S$ for $t \ge T$, with $S$ being the size of the media stream in bits. In the following, we assume that all transport overhead such as IP headers, system-specific information, or control messages is included in the sampling curve. The transport channel is commonly characterized by the transmission bit rate $R$ (in bits per second), at which bits enter the decoder buffer. In CBR scenarios, $R$ is usually the channel bit rate and is typically, but not necessarily, matched to the average bit rate of the video clip. In typical MMS applications, the media stream would be downloaded and stored at the receiver. The downloading time is given as $S/R$. The minimum time interval between the moment when the download starts and the moment the playback of the stream can be started is obviously given by the downloading time $S/R$. In addition, the receiver requires a storage capacity of at least $S$ bits. In contrast, in streaming applications, the goal is to start playback while downloading the media stream. This has two significant benefits when compared to MMS. First, the initial playback delay $\Delta$ can be reduced significantly, and second, the usually costly storage capacity at the receiver, referred to as receiver buffer size $B$, can be much smaller than the total size $S$ of the bit stream. The decoder buffer smooths the bit rate fluctuations in the VBR coded video. However, this strategy of starting the playback while downloading also involves the problem of possible decoder buffer underflows and overflows.
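The cost of the download-then-play (MMS-style) approach described above can be made concrete with a short sketch. The function name and the numbers are illustrative assumptions, not values taken from the paper; the only relations used are the ones stated in the text, namely a start-up delay of $S/R$ and a storage requirement of $S$ bits.

```python
def download_then_play(total_bits: int, rate_bps: float) -> tuple[float, int]:
    """Return (start-up delay in seconds, required storage in bits).

    Hypothetical helper: models a receiver that stores the whole stream
    before playback, so delay = S / R and storage = S.
    """
    if rate_bps <= 0:
        raise ValueError("rate_bps must be positive")
    # Playback can only start once all S bits have arrived at rate R.
    return total_bits / rate_bps, total_bits


# Assumed example: a 4 480 000-bit clip over a 64 000 bit/s CBR channel.
delay_s, storage_bits = download_then_play(4_480_000, 64_000)
print(delay_s, storage_bits)  # 70.0 4480000
```

Streaming aims to bring both numbers down: the start-up delay to a small $\Delta$ and the storage to a buffer $B$ much smaller than $S$.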
An underflow occurs when not enough data is present at the decoder at the time a certain video frame has to be decoded and displayed. This results in a display problem, and the continuous flow of the decoded video is interrupted. In contrast, an overflow occurs whenever too much data has arrived at the decoder. This results in a loss of parts of the video stream and, in general, in a significant performance degradation. To avoid these problems, the sampling curve $p(t)$, the transmission rate $R$
, the initial delay $\Delta$, and the buffer size $B$ have to be adjusted appropriately; see, e.g., [4]. Once the client has requested a certain stream, parts of the video bit stream are pre-loaded in the decoder before playing it back. The video decoder starts removing bits from the buffer when a specified initial decoder buffer fullness $F$ (also in bits) is achieved. $R$ and $F$ determine the initial or start-up delay as $\Delta = F/R$. The connection between $(R, B, F)$ and the sampling curve $p(t)$ can be formalized by the so-called leaky bucket model [5], [6]. It is said that a leaky bucket model with parameters $(R, B, F)$ contains a bit stream with sampling curve $p(t)$ if there is no underflow or overflow of the decoder buffer. The bits enter the decoder buffer at rate $R$ until the level of fullness is $F$, and then the bits for the first media unit are instantaneously removed. The bits keep entering the buffer at rate $R$, and the decoder removes the bits for the following frames at some given time instants, typically but not necessarily at the frame rate of the video stream. Note that if a complementary encoder buffer of the same size as the decoder buffer is used, decoder buffer overflows can be avoided. However, in this case the channel is not fully exploited, because if the decoder buffer is full, the encoder buffer is empty and no data is transmitted. As the exact decoder buffer size is not usually known and the channel should be fully exploited, we dispense with the complementary encoder buffer in the following. Two different types of problems are typically present in the design of video-streaming services, one related to the encoding of video and one related to the transmission of pre-encoded video (see, e.g., [2] and [4]). In the first, the leaky bucket model is completely specified and the video has to be encoded such that the sampling curve fulfills the requirement of no decoder buffer violation.
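The containment test of the leaky bucket model can be sketched in a few lines. This is a minimal illustration under stated assumptions, not a normative HRD: frames are removed instantaneously at a fixed frame interval, bits keep arriving at rate $R$ until the whole stream has been delivered (no complementary encoder buffer), and the buffer level is checked just before each removal. The function name, the tolerance `eps`, and the example numbers are hypothetical.

```python
def leaky_bucket_contains(frame_bits, frame_interval, rate, buffer_size, init_fullness):
    """Return True if the bucket (rate, buffer_size, init_fullness) contains the stream.

    frame_bits: frame sizes in bits, in decoding order (the staircase steps).
    frame_interval: time between decoding instants in seconds (assumed constant).
    """
    eps = 1e-6                                   # tolerance in bits for float rounding
    total = sum(frame_bits)
    start_delay = init_fullness / rate           # start-up delay F / R
    removed = 0
    for k, size in enumerate(frame_bits):
        t = start_delay + k * frame_interval     # decoding instant of frame k
        arrived = min(rate * t, total)           # the server stops after S bits
        fullness = arrived - removed             # level just before frame k is removed
        if fullness > buffer_size + eps:         # overflow: data would be lost
            return False
        if fullness < size - eps:                # underflow: frame k is not complete
            return False
        removed += size
    return True


# Assumed example: 10 frames/s, 40 000 bit/s channel.
frames = [4000, 1000, 1000, 10000]
print(leaky_bucket_contains(frames, 0.1, 40_000, buffer_size=10_000, init_fullness=4000))  # True
print(leaky_bucket_contains(frames, 0.1, 40_000, buffer_size=8_000, init_fullness=4000))   # False
print(leaky_bucket_contains(frames, 0.1, 40_000, buffer_size=10_000, init_fullness=2000))  # False
```

The second call fails by overflow (the level reaches 10 000 bits before the last frame is removed) and the third by underflow (only 2000 of the 4000 bits of the first frame have arrived when it is due).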
This application is very typical for the streaming or broadcasting of live-encoded video, as in this case the sampling curve is not determined in advance. Therefore, in video standardization communities these considerations have been used to specify a hypothetical reference decoder (HRD) [7], [8]. System specifications of media transmission standards usually discuss and specify the connections between a leaky bucket and the sampling curve. For the video buffer verifier (VBV) design for MPEG-2, as well as for the HRD design for H.263, the buffer size $B$ and the initial delay $\Delta$ are fixed. $B$ is either specified in the bit stream (MPEG-2) or by some constraint in the specification of the profile and level. $\Delta$ can be fixed by transmitting the value in the bit stream (MPEG-2 CBR mode), by filling up the buffer completely (MPEG-2 VBR mode), i.e., $F = B$, or by just using the size of the first frame, i.e., the first frame is immediately decoded (H.263). Then, with buffer and delay fixed, the video encoder has to design an appropriate sampling curve such that a buffer underflow is avoided. Recently, for the standardization of MPEG-4 AVC/H.264, a new approach to the HRD design has been proposed [9] which generalizes and extends the concepts of previous HRDs. An encoder can create a video bitstream that is contained by several leaky buckets. This HRD interpolates among the leaky bucket parameters and can operate at any desired transmission bit rate, buffer size, or initial delay. The second problem addresses transmission of pre-encoded video. In this case, the sampling curve $p(t)$ is a priori specified
and $R$, $B$, and $F$ have to be determined such that the encoded bitstream is contained by this leaky bucket. In [4], the selection of the initial delay and the minimum buffer size has been addressed assuming a CBR transmission channel. However, it has been recognized that the CBR channel assumption is not sufficient, especially for streaming services over the Internet. Delay jitter occurs due to packet delays and congestion in routers as well as end-to-end retransmissions in case of packet losses. Therefore, channel-adaptive streaming technologies have gained significant interest. According to [10], these techniques can be grouped into three different categories. Adaptive media playout [11] is a new technique that allows a streaming media client, without the involvement of the server, to control the rate at which data is consumed by the sampling process. Thereby, the probability of decoder buffer underflows and overflows can be reduced, but a noticeable artifact in the displayed video still occurs. In a second technology, the streaming media system makes decisions that govern how it allocates transmission resources among packets. Recent work [12] provides a flexible framework to allow rate-distortion optimized (RaDio) packet scheduling. In this case, the system allocates time and bandwidth resources by adapting to the varying channel conditions. Finally, it is shown that this RaDio transmission can be supported if media streams are pre-encoded with appropriate packet dependencies, possibly adapted to the channel (channel-adaptive packet dependency control) [13]. Note that in the case of adaptive packet scheduling, the sampling curve might be decided by the application server at streaming time and is not completely pre-determined by the encoder. Although these channel-adaptive streaming techniques show noticeable benefits in streaming video over the Internet, they require significant modifications in the streaming client, the streaming server, or both.
Therefore, in this work we focus on simple solutions to stream video over VBR channels, which require only insignificant modifications in the streaming server, but provide a certain guaranteed QoS. Hence, instead of dimensioning initial delays and buffers under the CBR channel assumption as, e.g., in [4], the goal in this work is to determine these parameters for deterministic and random VBR channels, with special focus on wireless video streaming. The inherent packet losses on mobile links due to fading and interference result in delay jitter when link layer retransmissions are applied. The resulting tradeoff between throughput and delay jitter on the wireless link is accepted in favor of reliability, as data losses result in a significant degradation of the displayed video. For video streaming over 3G mobile systems, a video buffering model has been proposed to support VBR video coding [1]. In [14], the mapping of UMTS bearer bit rates to media bit rates is presented. In addition, the specification and the conveyance of the initial delay and the buffer size for a given sampling curve using the Session Description Protocol (SDP) are discussed.
III. STREAMING MEDIA OVER VBR CHANNELS
A. Overview
In the following, we assume a media streaming system setup consisting of a media streaming server, a transport channel, and
a streaming client. The server stores several pre-encoded media streams, each of which is characterized by a sampling curve $p(t)$. The channel is assumed to be error free with limited transmission rate. However, in contrast to common considerations, we assume that the bit rate might vary over time. This VBR channel can be characterized by the receiver curve $r(t)$, which specifies the total amount of data received error-free up to time $t$ at the receiver. Obviously, $r(t)$ is monotonically increasing and we define $r(0) = 0$. This generalizes the CBR channel with linear receiver curve $r(t) = R \cdot t$. As we are interested in streaming applications, we want to minimize the initial delay $\Delta$ and the receiver buffer size $B$ for a VBR channel. To avoid a buffer underflow at the receiver buffer, the initial playback delay $\Delta$ has to be chosen such that for any time instant $t$, at least $p(t - \Delta)$ bits (footnote 1) are available at the decoder, i.e.,
$$r(t) \ge p(t - \Delta) \quad \forall t. \tag{1}$$
Furthermore, the receiver buffer should have the capacity to store all received and nondecoded data, as buffer overflow will result in lost data. The buffer size $B$ has to be chosen such that at any time instant $t$, all received nondecoded data can be stored in the receiver buffer, i.e.,
$$r(t) - p(t - \Delta) \le B \quad \forall t. \tag{2}$$
Obviously, if $r(t)$ is known before transmission or even prior to encoding and generation of $p(t)$, the sampling curve can be designed such that the video stream is contained in a leaky bucket with $r(t)$, or $\Delta$ and $B$ can be selected appropriately according to (1) and (2), respectively. We will provide a simple solution for this problem in Section III-B and discuss the benefits of a single receiver buffer. However, in general, the VBR channel behavior is not known in advance, neither at the encoder nor at the decoder. This problem will be discussed in detail in Section III-D.
B. Deterministic VBR Receiver Curves
Assume for now that the exact receiver curve $r(t)$ and the sampling curve $p(t)$ are known in advance at the media streaming server (footnote 2). Let us additionally define the pseudoinverse function $p^{-1}(\cdot)$ of the monotonically increasing sampling curve $p(t)$, which maps an amount of data to the time at which the sampling curve reaches it.
Then, the following proposition specifies the minimum initial delay and the decoder buffer size.
Proposition 1: Given a sampling curve $p(t)$ and a receiver curve $r(t)$, the minimum initial delay to avoid buffer underflow should be chosen as
$$\Delta_{\min} = \max_t \left\{ t - p^{-1}(r(t)) \right\} \tag{3}$$
and the corresponding minimum receiver buffer size to avoid a decoder buffer overflow as
$$B_{\min} = \max_t \left\{ r(t) - p(t - \Delta_{\min}) \right\}. \tag{4}$$
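When both curves are staircases, the two quantities in Proposition 1 reduce to a maximum horizontal gap and a maximum vertical gap over a finite set of instants. The sketch below is one possible way to evaluate them, not the authors' implementation. It assumes that frame $k$ is decoded at $k$ frame intervals after playout starts, that the receiver curve is given as packet arrival times with cumulative bit counts, that all times are integers (e.g., milliseconds) to avoid rounding issues, and that data arriving exactly at a decoding instant must be buffered before it is removed. All names are hypothetical.

```python
from bisect import bisect_left, bisect_right
from itertools import accumulate


def min_delay_and_buffer(frame_bits, frame_interval, arrival_times, arrival_cum_bits):
    """Return (minimum initial delay, minimum buffer size in bits).

    frame_bits: frame sizes in decoding order; frame k is due at delay + k * frame_interval.
    arrival_times / arrival_cum_bits: steps of the receiver curve (sorted, same length).
    """
    cum_frames = list(accumulate(frame_bits))
    if arrival_cum_bits[-1] < cum_frames[-1]:
        raise ValueError("receiver curve never delivers the whole stream")

    # Largest horizontal gap: frame k needs cum_frames[k] bits at its deadline.
    delay = 0
    for k, needed in enumerate(cum_frames):
        i = bisect_left(arrival_cum_bits, needed)      # first step reaching 'needed'
        delay = max(delay, arrival_times[i] - k * frame_interval)

    # Largest vertical gap: buffer level just before each frame is removed.
    buffer_bits = 0
    removed = 0
    for k, size in enumerate(frame_bits):
        t = delay + k * frame_interval
        j = bisect_right(arrival_times, t)             # arrivals with time <= t
        received = arrival_cum_bits[j - 1] if j else 0
        buffer_bits = max(buffer_bits, received - removed)
        removed += size
    return delay, buffer_bits


# Assumed example: three frames at 10 frames/s, 1000-bit packets, times in ms.
frames = [3000, 1000, 2000]
cum = [1000, 2000, 3000, 4000, 5000, 6000]
print(min_delay_and_buffer(frames, 100, [10, 20, 30, 40, 50, 60], cum))    # (30, 3000)
print(min_delay_and_buffer(frames, 100, [10, 20, 80, 90, 100, 110], cum))  # (80, 3000)
```

In the second call the third packet is held up by 50 ms, and the minimum initial delay grows by exactly that amount because the first frame is the bottleneck.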
Footnote 1: In [1], the delayed sampling curve $p(t - \Delta)$ is defined as the playout curve.
Footnote 2: Note that the CBR channel is included in this framework by specifying a linear receiver curve $r(t) = R \cdot t$.
Fig. 2. Separate dejitter and decoder buffer.
Outline of Proof: From (1), it is obvious that we seek the minimum initial delay, which is given as the maximum horizontal difference between the sampling curve $p(t)$ and the receiver curve $r(t)$. With the properties of $p(t)$ and the definition of the pseudoinverse, the condition in (3) is obvious. Similarly, the minimum buffer size is the maximum vertical difference for any $t$ between the receiver curve $r(t)$ and the playout curve $p(t - \Delta_{\min})$.
C. Single Receiver Buffer
In [1], the problem of streaming video over VBR channels is discussed. Initially, a concept introducing two buffers at the receiver, a delay jitter buffer and a decoder buffer, was proposed. However, a clarification has been added to [1] that this separation is only conceptual and that client implementations may not include a separate jitter buffer. In addition, it was noted that the rate at which the delay jitter buffer should be emptied is not obvious. We will provide a justification of the benefits of a single receiver buffer and address the issue of the emptying rate in the case of separate buffers. The delay jitter buffer compensates the delay jitter introduced by the channel to obtain a CBR channel with bit rate $R$ at the entrance of the decoder buffer. Therefore, traditional HRDs such as the MPEG-2 VBV or the H.263 HRD can be applied. The receiver curve $r(t)$ is delayed by an initial delay $\Delta_J$ and de-jittered such that at the output of the delay jitter buffer, a CBR channel with rate $R$ is visible to the decoder buffer (see Fig. 2). With Proposition 1, it is obvious that the minimum initial delay $\Delta_J$ and the minimum delay jitter buffer size $B_J$ are given as
$$\Delta_J = \max_t \left\{ t - r(t)/R \right\} \tag{5}$$
$$B_J = \max_t \left\{ r(t) - R \cdot (t - \Delta_J) \right\}. \tag{6}$$
The decoder buffer is specified such that the video stream with sampling curve $p(t)$ is contained in the leaky bucket with rate $R$, and with Proposition 1, we can again specify the minimum initial delay $\Delta_D$ and the minimum decoder buffer size $B_D$ as
$$\Delta_D = \max_t \left\{ t - p^{-1}(R \cdot t) \right\} \tag{7}$$
$$B_D = \max_t \left\{ R \cdot t - p(t - \Delta_D) \right\}. \tag{8}$$
We will now compare the single buffer approach with the separate buffer approach. The connections are graphically illustrated in Fig.
3 and serve as an intuitive proof for several of the following statements. For any valid sampling curve $p(t)$ and any valid receiver curve $r(t)$, the following statements can be shown.
1) For any fixed $R$, $\Delta_{\min}$ is the minimum delay for the single buffer case according to (1), $\Delta_J$ is the minimum delay for the delay jitter buffer according to (5), and $\Delta_D$ is the minimum delay for the separate decoder buffer according to (7). It is obvious that $\Delta_{\min} \le \Delta_J + \Delta_D$.
2) Similarly, for the respective minimum buffer sizes $B_{\min}$, $B_J$, and $B_D$, according to (4), (6), and (8), respectively, and for any $R$, it is obvious that $B_{\min} \le B_J + B_D$.
3) Only if there exists an $R$ such that $r(t) \ge R \cdot (t - \Delta_J)$ and $R \cdot (t - \Delta_J) \ge p(t - \Delta_{\min})$ for all $t$ can we design separate buffers such that the sum delay equals that of the single buffer case, i.e., $\Delta_J + \Delta_D = \Delta_{\min}$. This means that the receiver curve $r(t)$ and the playout curve $p(t - \Delta_{\min})$ can be separated by a straight line, as illustrated in Fig. 3.
4) Only if there exists an $R$ such that it fulfills the previous condition and, in addition, $r(t) \le R \cdot (t - \Delta_J) + B_J \le p(t - \Delta_{\min}) + B_{\min}$ for all $t$, can we design two separate buffers with $B_J + B_D = B_{\min}$. This means that the receiver curve $r(t)$ and the shifted playout curve $p(t - \Delta_{\min}) + B_{\min}$ can be separated by a straight line of slope $R$. This is also illustrated in Fig. 3.
From these arguments, it is obvious that two buffers are generally worse than one buffer in terms of minimum initial delay and minimum receiver buffer size. In addition, for separated delay jitter buffer and decoder buffer, it is worth optimizing the intermediate bit rate $R$, which corresponds to the aforementioned emptying rate, to minimize the delay. As a single receiver buffer generally performs at least as well as two separate buffers, we only consider single receiver buffers in the following.
D. Random Receiver Curves
The assumption that the receiver curve is a priori known at the transmitter or receiver is obviously not very realistic for most practical systems. However, to specify the initial delay and the minimum buffer size, exact knowledge of the receiver curve is essential. Especially in systems with QoS guarantees, events like buffer overflows or underflows should not occur too frequently. Therefore, it is reasonable to guarantee a certain QoS by specifying the probability that the transmission of a sequence is successful without any buffer overflow or underflow, under the constraint of minimum initial delay and minimum receiver buffer size.
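The notion of a success probability for a given pair of initial delay and buffer size can be illustrated by a small Monte-Carlo estimate. The sketch below is an assumption-laden illustration rather than the procedure used in the paper: time is divided into slots, a realization of the random receiver curve is given as the number of bits delivered per slot, frame $k$ is decoded right after slot `delay + k * frame_interval` has delivered its bits, and the server stops once the whole stream has been sent. The per-slot channel in the usage example (640 bits with probability 0.9) and all function names are hypothetical.

```python
import random


def playout_succeeds(frame_bits, frame_interval, delay, buffer_bits, slot_bits):
    """True if one receiver-curve realization allows playout without buffer violation."""
    total = sum(frame_bits)
    received = removed = 0
    next_frame = 0
    for slot, bits in enumerate(slot_bits):
        received = min(received + bits, total)       # server stops after the last bit
        if received - removed > buffer_bits:
            return False                             # receiver buffer overflow
        if slot == delay + next_frame * frame_interval:
            if received - removed < frame_bits[next_frame]:
                return False                         # deadline violation (underflow)
            removed += frame_bits[next_frame]
            next_frame += 1
            if next_frame == len(frame_bits):
                return True                          # every frame met its deadline
    return False                                     # realization too short to finish


def estimate_success_probability(trials, draw_slot_bits, **playout_args):
    """Fraction of independently drawn realizations with successful playout."""
    hits = sum(playout_succeeds(slot_bits=draw_slot_bits(), **playout_args)
               for _ in range(trials))
    return hits / trials


# Assumed example: 20 frames, one frame every 10 slots, effectively unlimited buffer.
rng = random.Random(1)
frames = [3200, 640, 1280, 640] * 5


def draw():
    return [640 if rng.random() < 0.9 else 0 for _ in range(400)]


for delay in (5, 10, 20):
    prob = estimate_success_probability(2000, draw, frame_bits=frames, frame_interval=10,
                                        delay=delay, buffer_bits=sum(frames))
    print(delay, prob)
```

Such an estimate only describes the specific channel model plugged into `draw`; the propositions that follow aim at guarantees that hold without simulating every realization.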
To formalize the concept, let us define the stationary and ergodic random process $\mathcal{R}(t)$ describing the random receiver curve behavior. Obviously, each realization of this random process has the same monotonic properties as the deterministic receiver curve $r(t)$ as defined previously. In addition, we assume that some upper limit $r_u(t)$ and some lower limit $r_l(t)$ for the random receiver curve exist, which determine the probability $1 - \epsilon$ that any realization of $\mathcal{R}(t)$ is entirely within these limits for all $t$. This also means that the receiver curve will be in between the limits with probability $1 - \epsilon$. However, we are not necessarily interested in the event that the receiver curve is entirely within the region, but in the event that successful sequence playout at the receiver, denoted as $\mathcal{S}$, is possible. A successful playout is defined such that no deadline violation and no receiver buffer overflow occurs while playing back the sequence. Hence, continuous decoding and error-free sequence presentation is assured. The following proposition provides
Fig. 3. Single receiver buffer versus separate delay jitter and decoder buffers.
guidelines on how to select the initial delay and the receiver buffer size to guarantee a certain probability that successful playout at the receiver is possible.
Proposition 2: For a given sampling curve $p(t)$ and given upper and lower limits $r_u(t)$ and $r_l(t)$ such that
$$\Pr\left\{ r_l(t) \le \mathcal{R}(t) \le r_u(t) \;\; \forall t \right\} \ge 1 - \epsilon, \tag{9}$$
if the initial delay is selected as
$$\Delta = \max_t \left\{ t - p^{-1}(r_l(t)) \right\} \tag{10}$$
and the receiver buffer size as
$$B = \max_t \left\{ r_u(t) - p(t - \Delta) \right\}, \tag{11}$$
it can be guaranteed that the probability of successful playout of the sequence at the receiver is at least $1 - \epsilon$, i.e., $\Pr\{\mathcal{S}\} \ge 1 - \epsilon$.
Outline of proof: According to (3), the condition in (10) guarantees that the playout curve $p(t - \Delta)$ is always below $r_l(t)$. Additionally, with (4), the condition in (11) guarantees that the shifted playout curve $p(t - \Delta) + B$ is always above $r_u(t)$. Therefore, selecting $\Delta$ according to (10) and $B$ according to (11) guarantees that the video stream with sampling curve $p(t)$ can be decoded for all receiver curves $r(t)$ with property $r_l(t) \le r(t) \le r_u(t)$. Note that additional receiver curves might exist which ensure successful decoding, but do not fulfill this property. However, all receiver curves fulfilling this property allow successful decoding and, therefore, the probability that a receiver curve with property $r_l(t) \le r(t) \le r_u(t)$ occurs is a lower bound on the probability of successful decoding.
If the streaming server has knowledge of the sampling curve $p(t)$ and the random process $\mathcal{R}(t)$ of the VBR channel, or at least of the bounds $r_u(t)$ and $r_l(t)$, then the design criteria according to Proposition 2 allow the transmitter to select the initial delay and the buffer such that a certain QoS in terms of successful decoding probability is guaranteed. This solution is very generic and can be applied to any random VBR channel as long as a statistical description is available. We will discuss the problem
of initial delay selection and buffer design for a typical wireless transmission scenario in the following.
IV. APPLICATION TO WIRELESS VIDEO STREAMING
A. Channel Model
In the following, we consider a simple but meaningful channel model to describe wireless video transmission. According to [1], for the UMTS Terrestrial Radio Access Network (UTRAN), a radio bearer using a dedicated channel and running in acknowledged mode could fulfill the requirements of recovering from lost radio link control (RLC) packets and having a fairly stable network throughput behavior. First, a dedicated channel can maintain a fixed transport channel rate on the physical layer. Second, when used in acknowledged mode, the probability of lost IP packets is virtually zero due to an efficient retransmission protocol on the RLC layer, which retransmits only the erroneous RLC packets of an IP packet. The cases discussed in [1] assume the streaming server to be located in the mobile operator's network or connected to the mobile network such that sufficient QoS is available, e.g., through the use of over-provisioning. Therefore, packet losses and delays in the access network can be neglected. It has also been recognized that, due to the fast power control in UMTS, the requested QoS in terms of link layer loss probability can be maintained, and RLC packets are lost statistically independently [15]. Finally, the delay of the retransmission can be neglected, as in general back-channel feedback and retransmissions on the link layer happen very rapidly. This assumption can be justified especially in scenarios where the channel propagation time of one packet is sufficiently smaller than the time interval between two consecutive video frames or IP packet transmissions. Moreover, in delayed feedback systems, packet labeling allows received packets to be reordered.
In the following, we consider that erroneous RLC segments are retransmitted immediately on the next available transmission slot and that we operate in persistent acknowledged mode, i.e., retransmissions of erroneous packets are performed until the current RLC packet is correctly received.
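The persistent acknowledged mode described above can be simulated directly. The following sketch assumes one transmission attempt per slot, statistically independent attempts, and immediate retransmission in the next slot; the function name, the slot numbering from zero, and the parameter values are illustrative assumptions.

```python
import random


def persistent_arq_slots(num_packets, success_prob, rng):
    """Return, for each link-layer packet, the slot in which it is finally received.

    Each slot carries one attempt; a lost packet is repeated in the next slot
    until it gets through (persistent acknowledged mode).
    """
    if not 0.0 < success_prob <= 1.0:
        raise ValueError("success_prob must be in (0, 1]")
    delivered_in_slot = []
    slot = 0
    for _ in range(num_packets):
        while rng.random() >= success_prob:   # attempt lost: retry in the next slot
            slot += 1
        delivered_in_slot.append(slot)
        slot += 1                             # the next packet uses the following slot
    return delivered_in_slot


# Assumed example: 1000 packets with a per-attempt success probability of 0.9.
slots = persistent_arq_slots(1000, 0.9, random.Random(7))
attempts = slots[-1] + 1
print(attempts, 1000 / attempts)   # total attempts and the fraction that succeeded
```

No packet is ever dropped in this model; losses only show up as extra slots, that is, as delay jitter in the receiver curve.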
With these preliminaries, we specify a general communication system, where application packets, e.g., video frames encapsulated in RTP/IP, are segmented into smaller link layer packets (RLC packets in the case of UMTS) of constant payload length $C$ (in bits). The link layer packets, indexed by $u$ with $u = 1, 2, \ldots$, are sent out at times $u \cdot \tau$, with $\tau$ the transmission time interval. Link layer packets are either correctly received or a loss of this packet is indicated. The probability for correct reception of link layer packets is defined as $p$. We summarize this wireless channel model as $W(C, \tau, p)$, with $C$ the payload size of link layer packets, $\tau$ the transmit time interval, and $p$ the success probability.
B. Delay and Buffer Design for Wireless Video Transmission
A transmission system $W(C, \tau, p)$ obviously results in a random receiver curve $\mathcal{R}(t)$. To specify $\mathcal{R}(t)$, let $X_u$ be a random variable which describes the successful transmission of a link layer packet at time index $u$, with $X_u \in \{0, 1\}$, and let $p$ be the probability of a successful packet reception $X_u = 1$ and $1 - p$ the probability for a lost packet $X_u = 0$. The lost packet will immediately be retransmitted at time index $u + 1$. Thus, we consider $X_u$ i.i.d. and define the random variable $N_u$ as the number of successfully received link layer packets after $u$ link-layer transmission attempts, i.e.,
$$N_u = \sum_{i=1}^{u} X_i. \tag{12}$$
With $N_u$, we define the random receiver curve for the investigated wireless channel as
$$\mathcal{R}(t) = C \cdot N_{\lfloor t/\tau \rfloor}. \tag{13}$$
Assuming that the channel has been used $u$ times, according to our retransmission policy, $N_u$ link layer packets have been received. Obviously, due to our assumption of statistically independent packets, $N_u$ is binomially distributed, i.e.,
$$\Pr\{N_u = n\} = \binom{u}{n} p^n (1 - p)^{u - n}. \tag{14}$$
We are now interested in finding appropriate initial delays and buffer design criteria when transmitting over such a channel. Therefore, we need to find an upper limit and a lower limit for this random process, such that we can apply Proposition 2. The following lemma is useful in finding these limits.
Lemma 1: For a set of binomial random processes $N_u$ with $u = 1, \ldots, U$, each with success probability $p$ according to (12), any positive constant $c$, an upper bound (15), and a lower bound (16), the probability that every $N_u$ lies between these bounds is lower bounded by (17).
The proof for this lemma is given in Appendix I. With this lemma, we can find criteria on the minimum initial delay and the buffer size when transmitting over the wireless channel specified as $W(C, \tau, p)$ with the following proposition.
Proposition 3: For a sampling curve $p(t)$, a wireless system $W(C, \tau, p)$, the definitions in Lemma 1 of the upper bound according to (15), the lower bound according to (16), and the probability bound according to (17) such that the latter is at least $1 - \epsilon$, any upper limit $r_u(t)$ such that (18), and any lower limit $r_l(t)$ such that (19): if the initial delay is selected as (20)
and the receiver buffer size as (21), it can be guaranteed that the probability of successful playout of the sequence at the receiver is at least $1 - \epsilon$, i.e., $\Pr\{\mathcal{S}\} \ge 1 - \epsilon$.
Outline of proof: The proposition is obvious with Lemma 1, Proposition 2, and
(22). With this selection, it is guaranteed that all receiver curves which ensure successful decoding have been taken into account.
C. Discussion and Results for a Specific Example
We will now show the usefulness of these propositions for a specific example and discuss the basic procedure to determine an appropriate initial delay $\Delta$ and the corresponding minimum buffer $B$ which guarantee successful decoding of a video stream with sampling curve $p(t)$ with probability $1 - \epsilon$. Given a certain sampling curve $p(t)$ and a wireless system $W(C, \tau, p)$, only the streaming server has to be modified to integrate this QoS policy, such that an appropriate pair of $\Delta$ and $B$ is determined according to Proposition 3 and transmitted to the decoder, e.g., via SDP. Note also that several pairs of $\Delta$ and $B$ for a set of outage probabilities $\epsilon$ could be transmitted to the decoder prior to transmission, and the decoder could trade off initial delay versus success probability. In both cases, only the
TABLE I AVERAGE PSNR AND AVERAGE BIT RATE FOR SPORTS NEWS SEQUENCE ENCODED WITH MPEG-4 ASP AND DIFFERENT QPS
Fig. 4. Resulting values for the outage probabilities for different $c$ and $U$ values given the packet success probability $p = 0.9$.
streaming server needs to be aware of the channel conditions and the sampling curve in advance. Let us assume a typical wireless transmission system [1] characterized by the RLC packet length $C = 640$ bits and the time interval between two consecutive packets as $\tau = 10$ ms. This results in a maximum transmission bit rate of 64 000 bits/s. The RLC packet error probability is assumed as 0.1, which is commonly targeted and achieved in practical systems in acknowledged mode by fast power control. This wireless transmission system can be summarized by $W(C = 640, \tau = 10\ \text{ms}, p = 0.9)$. With the packet success rate $p$, any number of total transmission packets $U$, an appropriate $c$, and any requested upper bound $\epsilon$ on the probability for nonsuccessful decoding, we can find an appropriate parameter by applying Lemma 1. This allows us to determine upper limits according to (15) and lower limits according to (16) for a given packet success probability $p$, as well as for selected $c$ and $U$, as shown in Fig. 4. It can be seen that the length of the sequence has little influence on the outage probability as long as $U$ is sufficiently large. Therefore, it can be concluded that outage events are more likely in the beginning of the sequence than later in the sequence, especially for low requested outage probabilities. To evaluate and verify the findings, simulations for typical radio bearer parameters according to [1] have been conducted. These parameters can be mapped to our wireless system as $W(C = 640, \tau = 10\ \text{ms}, p = 0.9)$. In addition, a set of sampling curves has been generated by encoding a 70-s sports-news sequence alternating between speakers and sport scenes in QCIF resolution. This sequence was VBR encoded with MPEG-4 Advanced Simple Profile (ASP) at a constant frame rate of 10 Hz by applying just a single quantization parameter (QP) for the entire sequence such that the quality stays almost constant. Table I shows the peak signal-to-noise ratio (PSNR) and the average bit rates for three different QPs. In Fig. 5, the receiver curve limits for the wireless transmission system $W(C = 640, \tau = 10\ \text{ms}, p = 0.9)$ and the selected outage bound $\epsilon$ are shown. The resulting playout curve of sequence A is also shown in Fig. 5, obtained by delaying the sampling curve $p(t)$ by $\Delta$ such that at any point the lower limit on the random receiver curve is above the playout curve according to condition (10). The minimum right shift determines the minimum delay. Also sketched is the determination of the minimum
curve with minimum receiver buffer size.
Fig. 5. Upper limit r (t) and lower limit r (t) for W (C = 640; = 10 ms; p = 0:9) and = 10 , playout curve p(t 0 1 , and shifted playout
required receiver buffer size. The playout curve is shifted by the minimum buffer size such that the upper limit on the random receiver curve is below this curve for any time . and For this system the different sampling curves according to Table I simulations to verify the applicability of the findings have been carried out. Pairs for the initial delay and the minimum buffer size for different parameters and corresponding outage probabilities have been determined according to Proposition 3. The transmission of the encoded sequences which their corresponding sampling curves over the wireless system have been evaluated using independent Monte-Carlo simulation with a minimum of experiments for each parameter set. Whenever just any buffer violation happens, i.e., the receiver buffer would underflow or overflow, the transmission for this specific receiver curve is declared as an outage. The simulated outage probability over the initial delay with corresponding minimum buffer size according to (11), as well as with infinite buffer size, are shown in Fig. 6 together with the upper bounds according to Proposition 3. For increasing initial delay, the probability for an outage obviously decreases. For both simulations and upper bounds, the decrease is significant over just a small range of extra delay. It can be observed that infinite buffer size roughly halves the outage probability compared to minimum buffer size, i.e., for the limited buffer size, about the same amount of underflows and overflows occur. The bound has the same decay over delay, but is right-shifted by about 0.20.6 s. This is obvious as the bound declares an outage whenever the receiver curve is not within the limits (compare Fig. 5). Another interesting aspect
Fig. 6. Simulated outage probability over the initial delay with corresponding minimum buffer size according to (11) and with infinite buffer size, as well as the upper bounds according to Proposition 3, for different sampling curves according to Table I and W(C = 640 bits, 10 ms, p = 0.9).
Fig. 7. Connection of the minimum initial delay and the minimum required receiver buffer size B to guarantee successful sequence playout with the requested probability for different sampling curves and W(C = 640 bits, 10 ms, p = 0.9).
becomes obvious from the diagram. For streaming video over wireless channels, the encoded video sequence does not necessarily have to match the average bit rate of the channel if the initial delay is appropriately adjusted. For example, comparing sequence A and sequence C, a gain of about 0.8 dB in PSNR can be obtained by introducing an additional initial delay of about 5 s, at the extra cost of the additional bit rate. This also requires an additional buffer in the receiver. The connection between the minimum initial delay and the minimum required receiver buffer size to guarantee successful sequence playout with the requested probability for different sampling curves is shown in Fig. 7. The required buffer sizes are in a reasonable range below 100 kB, which is in general provided by state-of-the-art mobile devices. This diagram shows the flexibility of the developed framework to determine initial delay and minimum buffer sizes under QoS policies in wireless systems with little computational effort. Of interest is that the required initial delay as well as the required buffer sizes for this typical wireless system and these typical sampling curves are rather small compared to the delays experienced on multimedia streaming services in the current Internet. Therefore, it seems that wireless video streaming with QoS policies is less critical than presumed if a dedicated channel and persistent acknowledged mode over a typical UTRAN radio bearer are applied.

V. CONCLUSIONS
In this work, we have considered streaming of video sequences over VBR channels. The main results of the paper, which have been shown by exact mathematical treatment, can be summarized as follows. If the sampling curve of the video stream and the receiver curve are known a priori to the server, then there exist minimum values for the initial playout delay and the decoder buffer size guaranteeing successful playout.
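The existence of such minimum values can be illustrated with a minimal sketch. It is not the procedure of this work but a simplified, slotted-time reading of it: `sampling[k]` and `receiver[k]` are assumed to hold cumulative bit counts up to slot k, a frame is assumed to leave the buffer in the slot in which it is due, and all function names are hypothetical. The random receiver curve is drawn from independent Bernoulli packet successes, which is only an assumed simplification of the system W(C = 640 bits, 10 ms, p = 0.9).

```python
import random


def at(curve, k):
    """Value of a cumulative curve at slot k: 0 before the start, flat after the end."""
    if k < 0:
        return 0
    return curve[min(k, len(curve) - 1)]


def min_initial_delay(sampling, receiver, max_delay):
    """Smallest delay d (in slots) with receiver(k + d) >= sampling(k) for all k.

    Returns None if no delay up to max_delay avoids buffer underflow.
    """
    for d in range(max_delay + 1):
        if all(at(receiver, k + d) >= s for k, s in enumerate(sampling)):
            return d
    return None


def min_buffer(sampling, receiver, delay):
    """Peak occupancy receiver(t) - sampling(t - delay), i.e. the smallest
    upward shift of the playout curve that keeps it above the receiver curve."""
    horizon = max(len(receiver), len(sampling) + delay)
    return max(at(receiver, t) - at(sampling, t - delay) for t in range(horizon))


def simulate_receiver_curve(total_bits, n_slots, packet_bits, p_success, rng):
    """One random receiver curve: one packet per slot, delivered with
    probability p_success (assumed i.i.d.), capped at the stream size."""
    received, curve = 0, []
    for _ in range(n_slots):
        if rng.random() < p_success:
            received = min(received + packet_bits, total_bits)
        curve.append(received)
    return curve


def outage_probability(sampling, delay, buffer_bits, runs,
                       packet_bits, p_success, seed=0):
    """Monte-Carlo estimate: a run is an outage if the buffer underflows
    or its occupancy exceeds buffer_bits at any slot."""
    rng = random.Random(seed)
    n_slots = len(sampling) + delay
    outages = 0
    for _ in range(runs):
        r = simulate_receiver_curve(sampling[-1], n_slots,
                                    packet_bits, p_success, rng)
        underflow = any(at(r, k + delay) < s for k, s in enumerate(sampling))
        overflow = any(at(r, t) - at(sampling, t - delay) > buffer_bits
                       for t in range(n_slots))
        outages += underflow or overflow
    return outages / runs
```

For a toy trace with cumulative sampling curve `[300, 400, 500, 1000]` and a constant-rate receiver curve `[200, 400, 600, 800, 1000]`, this sketch yields a minimum delay of one slot and a minimum buffer of 300 bits. Evaluating `outage_probability` over a range of delays gives a simulated curve of the kind plotted in Fig. 6, under the stated assumptions only.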
In addition, it has been shown that the separation between a delay jitter buffer and a decoder buffer is, in general, suboptimal for VBR video transmitted over a VBR channel. If the receiver curve is a priori unknown but some statistical description of the channel is available, a certain QoS can be guaranteed by selecting appropriate initial delay and buffer sizes such that the likelihood of buffer violations can be upper bounded. The application of the findings to wireless transmission has been discussed, and bounds have been derived which allow guaranteeing a certain QoS, even for random VBR channels in a wireless environment. Simulation results for a realistic system and specific video sequence validate the findings. Although the bounds are relatively loose in terms of outage probability, they are valid upper bounds and they allow estimating minimum delay and buffer size very well. Future work includes tighter bounds for the outage probability, as well as the consideration of different channel models, e.g., channels with correlated RLC packet losses or a heterogeneous system taking into account delays and losses in fixed and wireless IP transmission.

APPENDIX
A. Proof of Lemma 1
Proof: We consider Bin . Let i.i.d. Ber and define
(23) with , where denotes the Gauss delimiter. Since we want to bound the probability
the first sets are chosen maximal, i.e., , . Note that the expected value we assume some regularity conditions
and so
(24) denotes an increasing concave function such that and . Given , the only possi, i.e., , occurs if , bility to leave set where
, and . Hence
or
, and
by the theorem of the Iterated Logarithm (see theorem 31.1 in ; hence [16]). Let us define
Let ; thus (25) is satisfied. To keep (25) sufficiently small, we denote by a modified version of by
It follows that with a constant and likewise condition (24) is satisfied.

There is a and, hence, . Now we want to give an upper bound of the probability that there is a time satisfying ; note this is the complement of and , the event , hence , so that

ACKNOWLEDGMENT
The authors are grateful to the anonymous reviewers for helpful comments which improved the clarity and readability of this work.

REFERENCES
[1] V. Varsa and I. Curcio, "Transparent end-to-end packet switched streaming service (PSS); RTP usage model (Release 5)," 3GPP TR 26.937 V1.4.0, 2003.
[2] T. V. Lakshman, A. Ortega, and A. R. Reibman, "VBR video: Trade-offs and potentials," Proc. IEEE, vol. 86, pp. 952–973, May 1998.
[3] V. Varsa and M. Karczewicz, "Long window rate control for video streaming," in Proc. 11th Int. Packet Video Workshop, Kyongju, Korea, May 2001.
[4] I. M. Pao and M. T. Sun, "Encoding stored video for streaming applications," IEEE Trans. Circuits Syst. Video Technol., vol. 11, pp. 199–209, Feb. 2001.
[5] C.-Y. Hsu, A. Ortega, and A. R. Reibman, "Joint selection of source and channel rate for VBR video transmission under ATM policing constraints," IEEE J. Select. Areas Commun., vol. 15, pp. 1016–1028, Aug. 1997.
[6] A. R. Reibman and B. G. Haskell, "Constraints on variable bit-rate video for ATM networks," IEEE Trans. Circuits Syst. Video Technol., vol. 2, pp. 361–372, Sept. 1992.
[7] ITU-T, "Hypothetical reference decoder," Video Coding for Low Bit Rate Communication, Annex B, Sept. 1997.
[8] ISO/IEC 13818-2, Information Technology - Generic Coding of Moving Pictures and Associated Audio Information: Video (MPEG-2/H.262), "Video buffering verifier," Annex C, 2nd ed., 2000.
[9] J. Ribas-Corbera, P. Chou, and S. Regunathan, "A flexible decoder buffer model for JVT coding," in Proc. Int. Conf. Image Processing (ICIP-02), Rochester, NY, Sept. 2002.
[10] B. Girod, M. Kalman, Y. J. Liang, and R. Zhang, "Advances in video channel-adaptive streaming," in Proc. Int. Conf. Image Processing (ICIP-02), Rochester, NY, Sept. 2002.
[11] E. G. Steinbach, N. Färber, and B. Girod, "Adaptive play-out for low latency video streaming," in Proc. Int. Conf. Image Processing (ICIP-01), Thessaloniki, Greece, Oct. 2001.
[12] P. A. Chou and Z. Miao, "Rate-distortion optimized streaming of packetized media," Microsoft Research Tech. Rep. MSR-TR-2001-35, Feb. 2001. [Online]. Available: http://citeseer.nj.nec.com/chou01ratedistortion.html
[13] Y. J. Liang and B. Girod, "Rate-distortion optimized low-latency video streaming using channel-adaptive bitstream assembly," in Proc. IEEE Int. Conf. Multimedia and Expo (ICME-2002), Lausanne, Switzerland, Aug. 2002.
[14] "UMTS Video Streaming: Use Case Example for PSS Video Buffering Model," Tampere, Finland, 3GPP TSG-SA, 2002.
[15] M. Findeli, "MPEG-4 based video streaming over UMTS," Diploma thesis, Munich Univ. Technol., Munich, Germany, 2002.
[16] H. Bauer, Wahrscheinlichkeitstheorie. Berlin, Germany: De Gruyter, 1991.
Thus, the probability that there exists an is bounded by that
such
The question is now how to choose suitable and not too big or small such that (25) is sufficiently small. We propose to choose the sets to fulfill (25) and to grow as slowly as possible. Hence, (25) grows very small with respect to increasing . A good candidate is given
Thomas Stockhammer (M'98) received the Dipl.-Ing. degree in electrical engineering in 1996 from Munich University of Technology (TUM), Munich, Germany, where he is currently pursuing the Dr.-Ing. degree in the area of source and video transmission over mobile and packet-lossy channels. In 1996, he visited Rensselaer Polytechnic Institute (RPI), Troy, NY, to perform his diploma thesis in the area of combined source channel coding for video and coding theory. There he started his research in video transmission and combined source and channel coding. In 2000, he was a Visiting Researcher in the Information Coding Laboratory, University of California at San Diego (UCSD). Since then, he has published several conference and journal papers and holds several patents. He regularly participates in and contributes to different standardization activities, e.g., ITU-T H.324, H.264, ISO/IEC MPEG, JVT, IETF, and 3GPP. He acts as a member of several technical program committees, as a reviewer for different journals, and as an Evaluator for the European Commission. His research interests include joint source and channel coding, video transmission, multimedia networks, system design, rate-distortion optimization, and information theory, as well as mobile communications.
Hrvoje Jenkač (S'04) received the Dipl.-Ing. degree in electrical engineering in 2001 from Munich University of Technology (TUM), Munich, Germany, where he is currently pursuing the Ph.D. degree in the communications engineering field. Since then, he has been with the Institute for Communications Engineering, TUM, as a Research and Teaching Assistant. His research interests are in the area of reliable multimedia transmission over the wireless Internet. Mr. Jenkač received the Werner von Siemens Excellence Award in 2002, founded by the Siemens company.
Gabriel Kuhn received the Diploma degree in mathematics in 2002 from Munich University of Technology (TUM), Munich, Germany, where he is currently pursuing the Ph.D. degree in mathematics. He is currently with the Chair of Mathematical Statistics at TUM as a Research and Teaching Assistant. His research interests are in the field of multivariate extreme value theory. Mr. Kuhn received the Deutsche Mathematiker Vereinigung award in 2002 for his diploma thesis.
