A SEMINAR REPORT ON
Motion Vector Recovery Based Error Concealment For H.264 Video Communication
Submitted by:
BHADGAONKAR SATISH A.
T.E (ELECTRONICS) ROLL NO: 70 SEAT NO: W33383
Mr. R.G.MEVEKARI
WCE Sangli
CERTIFICATE
This is to certify that the seminar report entitled
Motion Vector Recovery Based Error Concealment For H.264 Video Communication
Submitted by:
BHADGAONKAR SATISH A.
It is a record of his own work carried out by him in partial fulfilment of T.Y.B.TECH.(ELECTRONICS)
W.C.E, SANGLI.
Under my guidance during the academic year 2010-2011
Mr. R. G. Mevekari
(Guide)
ACKNOWLEDGEMENT
I would like to thank Mr. R. G. Mevekari for his guidance and inspiration, which helped me complete this seminar. I am thankful to my college, WCE, for providing access to IEEE papers. I also thank the library personnel for offering all the help I needed for this work. Finally, I am thankful to all my colleagues who helped me directly or indirectly.
DECLARATION
I, hereby, declare that the seminar report entitled Motion Vector Recovery Based Error Concealment
INDEX
1. Introduction
2. Errors
3. Overview of Error Control and Error Concealment Techniques
4. Motion Vector Recovery
4.1 Temporal Replacement based MVR Scheme
4.2 Boundary Matching based MVR Algorithm
4.3 MVR based on Lagrange Interpolation
4.4 MVR based on Polynomial Model
5. MVR Techniques: A Performance Comparison
6. Conclusion and Open Issues
7. Bibliography
Figure 1: Subjective quality comparison for the "Stefan (QCIF)" sequence at 15% MBs loss in 9th frame at 1354 Kbps (a) Without Error; PSNR=39.02 dB (b) 15% MBs lost; PSNR=13.24 dB.
Several techniques have been proposed to combat the visual quality degradation caused by errors that occur during transmission of such compressed videos. They fall into three groups:
1) Error resilience based techniques, which improve the robustness of videos against transmission errors.
2) Techniques that initiate an automatic retransmission request (ARQ) on a decoding error.
3) Error concealment (EC) based techniques, which hide or recover the errors by using the other, non-erroneous video information received.
4. Motion Vector Recovery
MVR techniques are popular because they address two important problems of EC in video communication: the computational time requirement and the quality of the recovered video. They execute quickly without compromising the quality of the recovered video, which makes them well suited to real-time video streaming applications.
4.1 Temporal Replacement based MVR Scheme
As mentioned in the previous section, the most common temporal MVR method is temporal replacement (TR), which replaces the lost MVs with (0, 0). This signifies that no movement happened in the lost area of a given frame compared to the previous frame. Since every lost MV is simply replaced by (0, 0), TR is the fastest of the MVR techniques reported in the literature. Its main drawback is the poor quality of the recovered video compared to the other MVR techniques discussed below.
4.2 Boundary Matching based MVR Algorithm
The test model calculates the MVs for each lost MB by using a boundary matching algorithm (BMA). The H.264 standard offers flexibility in configuring the size of an MB. Each MB can take any one of the following forms: a single MB, which is a 16×16 pixel matrix; four sub-MBs, which are 8×8 pixel matrices; eight sub-MBs, which are either 4×8 or 8×4 pixel matrices each; and sixteen sub-MBs, which are 4×4 pixel matrices each. In each of these cases, every sub-MB is associated with an MV.
The BMA works only on MBs that are configured as 8×8 sub-MBs. If an MB is realized as a single 16×16 pixel matrix, it is divided into four 8×8 sub-MBs, and the MVs of the new sub-MBs are the same as that of the larger MB. If an MB is divided into other sizes, for example 4×8 or 8×4, the smaller sub-MBs are merged to form a larger 8×8 sub-MB. In this case, the MV of the newly formed sub-MB is the average of the MVs of the smaller sub-MBs that were merged to form it. The lost MV of such a sub-MB is then predicted by choosing one of the MVs of the correctly decoded or recovered adjacent sub-MBs.
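The two ingredients of the BMA can be illustrated with a minimal Python sketch. It is not the test model's implementation. It assumes that an MB's motion field is stored as a 4×4 grid with one (dx, dy) MV per 4×4 pixel block, that frames are plain 2D lists of luminance values, that candidate MVs are integer-pixel and keep the displaced block inside the reference frame, and that a candidate is scored by the average absolute luminance difference across the block boundary. The names `to_8x8_mvs`, `boundary_cost` and `choose_mv` are hypothetical.

```python
def to_8x8_mvs(mv_grid):
    """Collapse a 4x4 grid of per-4x4-block MVs into a 2x2 grid of 8x8 MVs.

    A 16x16 MB carries the same MV in all 16 cells, so each 8x8 sub-MB
    inherits it unchanged; smaller partitions are merged by averaging.
    """
    result = []
    for by in range(2):
        row = []
        for bx in range(2):
            cells = [mv_grid[2 * by + r][2 * bx + c]
                     for r in range(2) for c in range(2)]
            row.append((sum(dx for dx, _ in cells) / 4.0,
                        sum(dy for _, dy in cells) / 4.0))
        result.append(row)
    return result


def boundary_cost(cur, ref, top, left, mv, size=8):
    """Average absolute luminance difference across the block boundary.

    The block fetched from `ref` (displaced by `mv`) is compared with the
    pixels of `cur` just outside each side of the lost block. Sides that
    fall outside the frame are skipped (assumption: every neighbour inside
    the frame is correctly decoded, and at least one side exists).
    """
    dx, dy = mv
    height, width = len(cur), len(cur[0])
    diffs = []
    for k in range(size):
        if top > 0:  # top edge
            diffs.append(abs(ref[top + dy][left + k + dx]
                             - cur[top - 1][left + k]))
        if top + size < height:  # bottom edge
            diffs.append(abs(ref[top + size - 1 + dy][left + k + dx]
                             - cur[top + size][left + k]))
        if left > 0:  # left edge
            diffs.append(abs(ref[top + k + dy][left + dx]
                             - cur[top + k][left - 1]))
        if left + size < width:  # right edge
            diffs.append(abs(ref[top + k + dy][left + size - 1 + dx]
                             - cur[top + k][left + size]))
    return sum(diffs) / len(diffs)


def choose_mv(cur, ref, top, left, candidates, size=8):
    """Return the candidate MV with the smallest boundary cost."""
    return min(candidates,
               key=lambda mv: boundary_cost(cur, ref, top, left, mv, size))
```

In this sketch, `candidates` would hold the MVs of the adjacent sub-MBs, rounded to integer pixels if they came out of the averaging step. Every candidate requires a full pass over the block boundary, which is consistent with the BMA being slower than simple replacement.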
The decision of which neighbouring MV to use as the prediction for a lost sub-MB is made as follows. All the MVs of the correctly decoded or recovered sub-MBs adjacent to the lost sub-MB are considered. The MV of one such sub-MB is assigned to the lost sub-MB, the resulting sub-MB is inserted into its place in the frame, and the luminance change across its boundaries is computed. This step is repeated for each of the adjacent non-erroneous sub-MBs, and the MV that gives the smallest luminance change across the boundaries of the lost sub-MB is chosen as its predicted MV. The luminance change at the boundary of two sub-MBs is the average of the absolute differences of the pixels at the boundary. Though this technique ensures reasonable picture quality in the recovered video, it requires a very large computational time compared to its counterparts discussed later in this paper.
Unlike other video-coding standards, the MVs of H.264 cover a smaller area of the video frame being encoded. This leads to a strong correlation between neighbouring MVs, which makes the H.264 standard amenable to statistical analysis for recovering the lost MVs. The techniques discussed in the rest of this paper are based on such statistical analysis.
4.3 MVR based on Lagrange Interpolation
This subsection presents an MVR method that is based on the Lagrange interpolation (LAGI) formula. The Lagrange interpolation formula is one of the most widely used interpolation functions. Its computational cost is lower than that of most other interpolation functions reported in the literature. The remainder of this subsection describes how a third-order (n = 3) polynomial interpolation can be used for MVR.
As mentioned earlier, the H.264 standard divides every frame into several MBs. Each MB is associated with 1 to 16 MVs, ensuring backward compatibility with previous standards. [Figure 2] shows an H.264 frame segment with 9 MBs denoted by $F_{m,n}$, where m and n denote the spatial location of the MB within the frame. Each MB is associated with 16 MVs. In [Figure 2], let $F_{m,n}$ denote the lost MB. As in many EC algorithms for MVR, it is assumed that either the two vertically adjacent or the two horizontally adjacent MBs of the lost MB are correctly decoded.
In [Figure 2], it is assumed without loss of generality that both the horizontally adjacent MBs of $F_{m,n}$ are error-free. In this case, the lost MVs of $F_{m,n}$ are recovered row by row. Let $MV_{ij}$ ($0 \le i, j \le 3$) denote the correct MVs that belong to the rows of the horizontally adjacent MBs of $F_{m,n}$, as shown in [Figure 2]. Let $V_{ij}$ ($0 \le i, j \le 3$) represent the MVs of the rows of $F_{m,n}$ that need to be recovered.
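The procedure to recover one row of MVs of $F_{m,n}$ is as follows. Based on the LAGI formula, the correct neighbouring MVs $MV_{i0}, \ldots, MV_{i3}$ and the corresponding x coordinates $p_0, \ldots, p_3$ are used to constitute a Lagrange polynomial. The value of $V_{ij}$ can be computed as follows:
$$V_{ij} = \sum_{k=0}^{3} MV_{ik}\, L_{kj}$$
where,
$$L_{kj} = \prod_{\substack{m=0 \\ m \ne k}}^{3} \frac{x_j - p_m}{p_k - p_m}$$
and $x_j$ denotes the x coordinate of the lost MV $V_{ij}$.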
It follows from the above equations that the Lagrange parameters ($L_{0j}, \ldots, L_{3j}$) are constant across all the lost MBs, because they depend only on the coordinates and not on the MV values, so they can be computed once and reused. The recovery of the other rows follows the same procedure. A similar procedure is followed if the vertically adjacent MBs of $F_{m,n}$ are error-free. The main advantage of the LAGI technique is that it ensures high quality of the recovered video while consuming very little computational time.
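The row-wise recovery described in this subsection can be sketched in Python as follows. This is only an illustration under stated assumptions: the coordinate layout (known MVs at x = -2, -1, 4, 5 and lost MVs at x = 0, 1, 2, 3) is hypothetical and not taken from [Figure 2], and the function names `lagrange_weights` and `recover_row` are made up for this example.

```python
def lagrange_weights(known_x, x):
    """Lagrange basis values at position x for the given known positions.

    The weights depend only on the coordinates, so they can be computed
    once per lost position and reused for every lost MB.
    """
    weights = []
    for k, pk in enumerate(known_x):
        w = 1.0
        for m, pm in enumerate(known_x):
            if m != k:
                w *= (x - pm) / (pk - pm)
        weights.append(w)
    return weights


def recover_row(known_x, known_mvs, lost_x):
    """Interpolate one row of lost MVs from the known MVs of the same row.

    known_mvs is a list of (dx, dy) pairs; the two components are
    interpolated independently with the same weights.
    """
    recovered = []
    for x in lost_x:
        w = lagrange_weights(known_x, x)
        dx = sum(wk * mv[0] for wk, mv in zip(w, known_mvs))
        dy = sum(wk * mv[1] for wk, mv in zip(w, known_mvs))
        recovered.append((dx, dy))
    return recovered


# Hypothetical layout: two known MVs on each side of the lost MB.
known_positions = [-2, -1, 4, 5]
known_row = [(1.0, 0.0), (1.5, 0.0), (3.0, 0.5), (3.5, 0.5)]
lost_row = recover_row(known_positions, known_row, [0, 1, 2, 3])
```

With four known MVs per row, the interpolating polynomial is of third order, matching the n = 3 case discussed above.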
4.4 MVR based on Polynomial Model
This subsection presents a polynomial model (PIM) that describes the motion tendency of the MVs adjacent to any of the lost MVs. The model yields an approximating function that describes how the MVs change within a small area. Based on this property, an approximation of the lost MVs can be obtained from the neighbouring MVs, and the lost MB can be reconstructed. The correct neighbouring MVs $y_i$ ($MV_{i0}, \ldots, MV_{i3}$) and the corresponding x coordinates $x_i$ are used to constitute the model. A polynomial model, which describes the correlation of the MVs in the neighbouring MBs, can be constituted as follows:
$$W_l(x) = a_0 + a_1 x + \cdots + a_l x^l$$
where $a_0, a_1, \ldots, a_l$ are a set of unknown coefficients that can be calculated from the given points, and $l$ is the order of the polynomial. The objective is to compute the coefficients such that the sum of the squared differences between $W_l(x_i)$ and $y_i$ is minimized. This sum can be expressed as a function of the independent variables $a_0, a_1, \ldots, a_l$, as shown in the following equation:
$$F(a_0, a_1, \ldots, a_l) = \sum_{i} \left[ W_l(x_i) - y_i \right]^2$$
where the sum runs over the given points $(x_i, y_i)$.
To obtain the minimum of $F(a_0, a_1, \ldots, a_l)$, the coefficients $a_0, a_1, \ldots, a_l$ should satisfy Equation (4):
$$\frac{\partial F}{\partial a_k} = 0, \quad k = 0, 1, \ldots, l \qquad (4)$$
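From Equation (4), a set of equations can be obtained to calculate the coefficients, as presented in Equation (5):
$$\sum_{j=0}^{l} a_j \sum_{i} x_i^{\,j+k} = \sum_{i} y_i\, x_i^{\,k}, \quad k = 0, 1, \ldots, l \qquad (5)$$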
Since there are four MVs available in the neighbouring MB, a polynomial of up to the third order (i.e. l = 3) can be used to perform this interpolation. However, the first-order polynomial cannot accurately describe nonlinear movement. The third-order polynomial often results in an oscillatory curve and is suitable only for interpolating data that change quickly. The second-order polynomial represents a smooth curve and is therefore more suitable for this kind of application than the other two.
This technique produces recovered video of a quality comparable to that of the LAGI technique, but it is slightly more expensive than LAGI in terms of the number of computations. Different polynomials are needed to handle the varying amount of motion in received video frames in order to ensure higher PSNR values. In this context, the main advantage of this technique over LAGI is the flexibility to choose different polynomials based on the characteristics of the frame to be concealed, which makes the technique adaptive in nature.
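The least-squares fit behind the polynomial model can be sketched in Python as follows. This is a minimal illustration, not the implementation evaluated in this report: it builds the normal equations that follow from setting the partial derivatives of the squared-error sum to zero and solves them with Gaussian elimination. The function names, the coordinate layout and the sample values are assumptions made for this example; each MV component would be fitted separately.

```python
def fit_polynomial(xs, ys, order):
    """Least-squares coefficients a_0..a_order for the points (xs, ys).

    Normal equations: sum_j a_j * sum_i x_i^(j+k) = sum_i y_i * x_i^k
    for k = 0..order.
    """
    n = order + 1
    a = [[float(sum(x ** (j + k) for x in xs)) for j in range(n)]
         for k in range(n)]
    b = [float(sum(y * x ** k for x, y in zip(xs, ys))) for k in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n):
                a[r][c] -= factor * a[col][c]
            b[r] -= factor * b[col]

    # Back substitution.
    coeffs = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(a[r][c] * coeffs[c] for c in range(r + 1, n))
        coeffs[r] = (b[r] - tail) / a[r][r]
    return coeffs


def evaluate(coeffs, x):
    """Evaluate a_0 + a_1*x + ... + a_l*x^l."""
    return sum(ak * x ** k for k, ak in enumerate(coeffs))


# Hypothetical example: horizontal MV components at four known positions,
# fitted with a second-order polynomial and evaluated at the lost positions.
known_positions = [-2, -1, 4, 5]
known_dx = [1.0, 1.4, 2.9, 3.6]
model = fit_polynomial(known_positions, known_dx, order=2)
lost_dx = [evaluate(model, x) for x in [0, 1, 2, 3]]
```

Changing the `order` argument is how this sketch would switch between the first-, second- and third-order models compared above; with four points, order 3 reproduces the points exactly, while order 2 smooths them.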
5. MVR Techniques: A Performance Comparison
The experimental results presented in this section use a standard benchmark video sequence, namely Coastguard. The sequence has a total of 300 frames and is encoded and decoded by JM12.4, which is a standard codec program for H.264. In the simulations reported in this section, the fixed group of pictures (GOP) length (the IntraPeriod parameter) is set to 11 to achieve the best trade-off between compression and quality. The PSNR results for the different benchmark sequences across the different MVR algorithms are presented for both test scenarios. In each case, the MB loss rate is assumed to be 15% and QP is set to 20.
[Figure 5] presents a different test scenario, in which deterministic errors of 15% MB loss are introduced in a given P frame of the Coastguard sequence. To capture the worst case, the simulator introduces these errors in the frames that have the maximum motion. Interestingly, irrespective of the bitrate at which the sequence is coded, the 69th frame has the maximum motion. It is evident from this performance analysis that LAGI and PIM are comparable, BMA stands second, and TR is at the bottom in terms of the quality of the recovered video.
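The comparison above is expressed in PSNR. For readers who want to reproduce such a measurement, the standard definition for 8-bit luminance, PSNR = 10 log10(255² / MSE), can be sketched in Python as follows. The function name `psnr` and the representation of frames as 2D lists of luminance values are assumptions for this example; the JM software reports its own PSNR figures.

```python
import math


def psnr(original, recovered, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized frames.

    Frames are 2D lists of luminance values; `peak` is the maximum
    possible sample value (255 for 8-bit video).
    """
    total = 0
    count = 0
    for row_o, row_r in zip(original, recovered):
        for a, b in zip(row_o, row_r):
            total += (a - b) ** 2
            count += 1
    mse = total / count
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak * peak / mse)
```

A higher value means the concealed frame is closer to the error-free frame, which is how figures such as the 39.02 dB and 13.24 dB quoted for [Figure 1] should be read.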
Figure 5: Subjective quality comparison for the "Coastguard (QCIF)" sequence at 15% macro blocks loss in 69th frame at QP=24 (a) Original (b) 15% macro blocks lost (c) Concealed using TR (d) Concealed using BMA (e) Concealed using LAGI (f) Concealed using PIM.
Bibliography
1. Kavish Seth, V Kamakoti, S Srinivasan. Department of Electrical Engineering, Indian Institute of Technology Madras, Chennai 600036, India; Department of Computer Science & Engineering, Indian Institute of Technology Madras, Chennai 600036, India.
2. Jinghong Zheng, Student Member, IEEE, and Lap-Pui Chau, Senior Member, IEEE.
3. Donghyung Kim, Sanghyup Cho, and Jechang Jeong. Dept. of Electrical and Computer Engineering, Hanyang University, Haengdang, Seongdong, Seoul, South Korea.