
HEVC Inter prediction

2011-10-22 (SAT)
Contents
Overview of inter prediction

Inter prediction in HEVC
GOP coding structure
Adaptive motion vector prediction(AMVP)
Merge
Asymmetric motion partition(AMP)
Interpolation filter
OVERVIEW OF INTER PREDICTION
Overview of inter prediction
The encoder forms a model of the current frame based on the samples of a previously transmitted frame

The motion-compensated predicted frame is subtracted from the current frame to produce a residual error frame

The residual frame is transform coded
[Figure: motion estimation on the previous frame gives the motion-compensated frame, which is subtracted from the current frame to give the residual frame]
Overview of inter prediction
The goals of inter prediction
ME creates a model of the current frame based on available data in
one or more previously encoded frames to match the current frame
as closely as possible
[Figure: frame n-1 and frame n]
Overview of inter prediction
Transmitted data
Motion vector (PMV, MVD)
Reference index (LIST_0/LIST_1)
Prediction mode
Residual data (quantized coefficients)



How?
GOP CODING STRUCTURE
GOP coding structure
Temporal prediction structure
All Intra (No temporal prediction is allowed)

Low Delay (LD)
The first picture shall be coded as an IDR picture
GPB (Generalized P and B) picture (on/off)

Random access (RA)
A hierarchical B structure shall be used for coding
An IDR or CDR (clean decoding refresh) intra picture shall be inserted cyclically, about once per second, as a random access point
GOP coding structure - Low delay
[Figure: low-delay GOP, pictures 0 to 8 along the time axis. Picture 0 is an IDR or intra picture; the other pictures are GPB (Generalized P and B) pictures]
QP per level: QPB_L1 = QPI + 1, QPB_L2 = QPI + 2, QPB_L3 = QPI + 3
GOP coding structure - Random access
[Figure: random-access GOP, pictures 0 to 8 along the time axis, labelled with POC and with coding order. Legend: IDR or intra picture, GPB (Generalized P and B) picture, referenced B picture, non-referenced B picture]
QP per level: QPB_L1 = QPI + 1, QPB_L2 = QPI + 2, QPB_L3 = QPI + 3, QPB_L4 = QPI + 4
GOP coding structure - Random access
Variables:
m_iHrchDepth = log2(GOP_size) + 1;
iTimeOffset = 1 << (m_iHrchDepth - 1 - iDepth);
iStep = iTimeOffset << 1;
iNumPicRcvd = GOP_size;

for( iDepth = 0; iDepth < m_iHrchDepth; iDepth++ )
{
  iTimeOffset = 1 << (m_iHrchDepth - 1 - iDepth);
  iStep = iTimeOffset << 1;

  for( ; iTimeOffset <= iNumPicRcvd; )
  {
    compressSlice();

    iTimeOffset += iStep;
  }
}
[Figure: random-access GOP with POC and coding-order labels; each picture is marked with its hierarchy depth (Depth == 0, 1, 2 or 3). Legend: IDR or intra picture, GPB (Generalized P and B) picture, referenced B picture, non-referenced B picture]
* uiPOCCurr = iPOCLast - (iNumPicRcvd - iTimeOffset);
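The loop above can be traced with a small sketch. The function name codingOrder is hypothetical, and compressSlice() is replaced by recording the time offset, so this only illustrates the ordering, not the encoder itself.

```cpp
#include <cassert>
#include <vector>

// Returns the time offsets (1..gopSize) in the order the nested loop visits them.
// gopSize is assumed to be a power of two.
std::vector<int> codingOrder(int gopSize) {
    assert(gopSize > 0 && (gopSize & (gopSize - 1)) == 0);

    int hrchDepth = 1;  // log2(gopSize) + 1
    while ((1 << (hrchDepth - 1)) < gopSize) ++hrchDepth;

    std::vector<int> order;
    for (int depth = 0; depth < hrchDepth; ++depth) {
        int timeOffset = 1 << (hrchDepth - 1 - depth);
        const int step = timeOffset << 1;
        for (; timeOffset <= gopSize; timeOffset += step) {
            order.push_back(timeOffset);  // stands in for compressSlice()
        }
    }
    return order;
}
```

For a GOP size of 8, codingOrder(8) yields 8, 4, 2, 6, 1, 3, 5, 7: the depth-0 picture first, then each deeper level of the hierarchy.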
AMVP (ADAPTIVE MOTION VECTOR PREDICTION)
MV prediction of H.264/AVC
Median of each component of MV
No transmission overhead
Slice-based use of temporal MV predictor
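[Figure: current block with spatial neighboring blocks A, B and C]
$mvp_x = \mathrm{median}(mv_{A,x}, mv_{B,x}, mv_{C,x})$
$mvp_y = \mathrm{median}(mv_{A,y}, mv_{B,y}, mv_{C,y})$
Fig. Spatial neighboring block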
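A minimal sketch of the component-wise median predictor, assuming the three neighbor motion vectors are all available. The type Mv and the function names are hypothetical.

```cpp
#include <algorithm>
#include <cassert>

struct Mv { int x; int y; };

// Median of three integers.
int median3(int a, int b, int c) {
    return std::max(std::min(a, b), std::min(std::max(a, b), c));
}

// Component-wise median of the neighbor vectors A, B and C.
Mv medianPredictor(const Mv& a, const Mv& b, const Mv& c) {
    return { median3(a.x, b.x, c.x), median3(a.y, b.y, c.y) };
}
```

Because the decoder can repeat the same computation from already decoded neighbors, no predictor index has to be transmitted.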
MV prediction of HEVC
Explicit signaling of MV predictor index
Transmission overhead
PU-based use of temporal MV predictor
[Figure: current block with left neighbors A0, A1 and above neighbors B0, B1, B2]
Fig. Spatial AMVP candidates
[Figure: co-located PU with center and right-bottom positions]
Fig. Temporal AMVP candidates
AMVP
Decoder receives
ref_idx
mvd
mvp_idx
AMVP - Decoder side AMVP syntax
prediction_unit( x0, y0, log2CUSize ) {    Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]    ue(v) | ae(v)
  } else if( PredMode = = MODE_INTRA ) {
    ...
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]    u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]    ue(v) | ae(v)
    } else {
      if( slice_type = = B ) {
        if( !entropy_coding_mode_flag ) {
          combined_inter_pred_ref_idx    ue(v)
          if( combined_inter_pred_ref_idx = = MaxPredRef )
            inter_pred_flag[ x0 ][ y0 ]    ue(v)
        } else
          inter_pred_flag[ x0 ][ y0 ]    ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] = = Pred_LC ) {
        if( num_ref_idx_lc_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx = = MaxPredRef )
              ref_idx_lc_minus4[ x0 ][ y0 ]    ue(v)
          } else
            ref_idx_lc[ x0 ][ y0 ]    ae(v)
        }
        mvd_lc[ x0 ][ y0 ][ 0 ]    se(v) | ae(v)
        mvd_lc[ x0 ][ y0 ][ 1 ]    se(v) | ae(v)
        mvp_idx_lc[ x0 ][ y0 ]    ue(v) | ae(v)
      }
      else { /* Pred_L0 or Pred_BI */
        if( num_ref_idx_l0_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx = = MaxPredRef )
              ref_idx_l0_minusX[ x0 ][ y0 ]    ue(v)
          } else
            ref_idx_l0[ x0 ][ y0 ]    ue(v) | ae(v)
        }
        mvd_l0[ x0 ][ y0 ][ 0 ]    se(v) | ae(v)
        mvd_l0[ x0 ][ y0 ][ 1 ]    se(v) | ae(v)
        mvp_idx_l0[ x0 ][ y0 ]    ue(v) | ae(v)
      }
      if( inter_pred_flag[ x0 ][ y0 ] = = Pred_BI ) {
        if( num_ref_idx_l1_active_minus1 > 0 ) {
          if( !entropy_coding_mode_flag ) {
            if( combined_inter_pred_ref_idx = = MaxPredRef )
              ref_idx_l1_minusX[ x0 ][ y0 ]    ue(v)
          } else
            ref_idx_l1[ x0 ][ y0 ]    ue(v) | ae(v)
        }
        mvd_l1[ x0 ][ y0 ][ 0 ]    se(v) | ae(v)
        mvd_l1[ x0 ][ y0 ][ 1 ]    se(v) | ae(v)
        mvp_idx_l1[ x0 ][ y0 ]    ue(v) | ae(v)
      }
    }
  }
}
AMVP - Encoder side processing
1. Search for three candidates (spatial: 2, temporal: 1)

2. Remove redundant MVPs

3. Additional candidate list
Zero vector candidates are created by combining a zero vector with a refIdx

4. Decision of the best MVP before motion estimation (starting point for ME)
Distortion: SAD
Rate: truncated unary code (MVP index)
RDCost = Distortion + ((Bits * λ + 0.5) >> 16);

5. Decision of the best MVP candidate after motion estimation
Best MVP index: the candidate with the smallest mvd, where mvd = Best_MV - MV of mvp_idx[i]

mvp_idx   bin
0         0
1         10
2         110
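Step 5 can be sketched as follows. This is an illustration under assumptions, not the HM routine: the names Mv, mvdCost and bestMvpIdx are hypothetical, and the sum of absolute mvd components is used as a stand-in for the real bit cost of the mvd and the MVP index.

```cpp
#include <cassert>
#include <cstdlib>
#include <vector>

struct Mv { int x; int y; };

// Stand-in cost for coding mvd = mv - mvp (assumption: |dx| + |dy|).
int mvdCost(const Mv& mv, const Mv& mvp) {
    return std::abs(mv.x - mvp.x) + std::abs(mv.y - mvp.y);
}

// Returns the index of the candidate with the smallest mvd; ties keep the lower index.
int bestMvpIdx(const Mv& bestMv, const std::vector<Mv>& candidates) {
    assert(!candidates.empty());
    int best = 0;
    for (int i = 1; i < static_cast<int>(candidates.size()); ++i) {
        if (mvdCost(bestMv, candidates[i]) < mvdCost(bestMv, candidates[best])) {
            best = i;
        }
    }
    return best;
}
```

With Best_MV = (5, -3) and candidates (0, 0), (4, -2), (10, 10), the second candidate gives the smallest mvd, so mvp_idx = 1 would be signaled.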
AMVP - Spatial AMVP candidates
Spatial AMVP candidates

mvLxA: Left spatial candidates
Derivation order: A0, A1
First available MV
1st: scan without scaling (vec1, vec2)
2nd: scan with scaling (vec3, vec4)
mvLxB: Above spatial candidates
Derivation order: B0, B1, B2
First available MV
1st: scan without scaling (vec1, vec2)
2nd: scan with scaling, if scaling wasn't used before (vec3, vec4)
[Figure: current block with left neighbors A0, A1 and above neighbors B0, B1, B2]
Fig. Spatial AMVP candidates
AMVP Spatial AMVP candidates
Spatial AMVP candidates
Four candidates can be derived at each neighboring PU
vec1: same reference index, same list
vec2: same reference index, different list
vec3: different reference index, same list
vec4: different reference index, different list

[Figure: pictures i, j, k, l, m along the time axis; the current block and neighboring block b with their L0 and L1 motion vectors illustrate candidate types 1 to 4 (vec1 to vec4)]
AMVP - Temporal AMVP candidate
Temporal AMVP candidate
Derivation order:
1. Right-bottom position of co-located PU
2. Center position of co-located PU
[Figure: co-located PU with center and right-bottom positions]
Fig. Temporal AMVP candidates
[Figure: current picture, co-located picture and reference picture, with mvL0 and mvL1 of the current block and mvL1Col of the co-located partition]
AMVP - MV Scaling
Scaling of MV predictor has been modified (JCTVC-F142)

HM3 rounds half towards plus infinity
Proposed scheme rounds half towards zero

HM version   Modification
HM3          (x + 128) >> 8
HM4          (x + 127 + (x < 0)) >> 8
x: scaled motion vector component before the shift
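The difference between the two rounding rules shows up only at exact halves. A small sketch, assuming x is the product of scale factor and MV component in units of 1/256 and that right-shifting a negative int is arithmetic (true on common compilers, guaranteed from C++20); the function names are hypothetical.

```cpp
#include <cassert>

// HM3-style rounding: halves go towards plus infinity.
int roundHm3(int x) {
    return (x + 128) >> 8;
}

// HM4-style rounding: halves go towards zero.
int roundHm4(int x) {
    return (x + 127 + (x < 0 ? 1 : 0)) >> 8;
}
```

For x = 384 (that is, 1.5) the HM3 rule gives 2 and the HM4 rule gives 1, while for x = -384 both give -1, so only the HM4 rule treats positive and negative vectors symmetrically.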
MERGE
Merge
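Decoder receives
merge_flag
merge_idx
Not transmitted (taken from the selected merge candidate): ref_idx, mvd, mvp_idx
[Figure: current block with spatial neighbors A, B, C, D, E, and co-located PU with center and right-bottom positions]
Fig. Merge candidates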
Merge - Decoder side Merge skip syntax
coding_unit( x0, y0, log2CUSize ) {    Descriptor
  if( entropy_coding_mode_flag && slice_type != I )
    skip_flag[ x0 ][ y0 ]    u(1) | ae(v)
  if( skip_flag[ x0 ][ y0 ] )
    prediction_unit( x0, y0, log2CUSize, log2CUSize, 0, 0 )
  else {
    ...
  }
}
prediction_unit( x0, y0, log2CUSize ) {    Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]    ue(v) | ae(v)
  } else if( PredMode = = MODE_INTRA ) {
    ...
  } else { /* MODE_INTER */
    ...
  }
}
Merge skip
Merge - Decoder side Merge syntax
prediction_unit( x0, y0, log2CUSize ) {    Descriptor
  if( skip_flag[ x0 ][ y0 ] ) {
    merge_idx[ x0 ][ y0 ]    ue(v) | ae(v)
  } else if( PredMode = = MODE_INTRA ) {
    ...
  } else { /* MODE_INTER */
    if( entropy_coding_mode_flag || PartMode != PART_2Nx2N )
      merge_flag[ x0 ][ y0 ]    u(1) | ae(v)
    if( merge_flag[ x0 ][ y0 ] ) {
      merge_idx[ x0 ][ y0 ]    ue(v) | ae(v)
    } else {
      ...
    }
  }
}
General case - merge
Merge - Encoder side processing
1. Search for five candidates
S0, S1, S2, S3: Spatial candidates
Col: Temporal candidate
Output: Mv, RefIdx, Predflag for LIST_0/LIST_1

2. Remove redundant candidates

3. Additional candidate list (JCTVC-F470)
Combined bi-directional merge candidate (5 times)
Scaled bi-directional merge candidate (1 time)
Zero vector merge candidate

4. Decision of the best MRG candidate
[Figure: candidate list S0, S1, S2, S3, Col]

merge_idx   bin
0           0
1           10
2           110
3           1110
4           1111
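The bin strings in the merge_idx table follow a truncated unary code. A minimal sketch, with the hypothetical helper name truncatedUnary and the largest index passed as cMax (4 for the five-entry table above):

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Truncated unary binarization: idx ones followed by a terminating zero,
// except that the largest index cMax drops the terminating zero.
std::string truncatedUnary(int idx, int cMax) {
    assert(0 <= idx && idx <= cMax);
    std::string bins(static_cast<std::size_t>(idx), '1');
    if (idx < cMax) {
        bins += '0';
    }
    return bins;
}
```

Small indices get the shortest bin strings, which is why the candidate order matters for the rate.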
Merge Spatial merge candidates
Spatial merge candidates (4 candidates)
Derivation Order: A, B, C, D, E
Fig. Spatial merge candidates
[Figure: current block with spatial neighbors A, B, C, D, E]
Merge Temporal merge candidate
refIdx derivation for merge TMVP (JCTVC-E481)
Decide three refIdx
refIdxLeft: A
refIdxAbove: B
refIdxCorner: C or D or E

Decide one refIdx from them
If none of the three is available
refIdx = 0
Otherwise
refIdx = minimum of the available refIdx values

Derivation of temporal merge candidate
Same process as TMVP
[Figure: current block with spatial neighbors A, B, C, D, E]
Merge - Temporal merge candidate
Example) Decision of reference frame
[Figure: current block with neighbors A, B, C, D, E; current PU with neighbors A, B and E. ex) second 4x8 PU in 8x8 CU]

Neighbor   LIST_0 RefIdx   LIST_1 RefIdx
A          0               1
B          -1              1
C          NULL            NULL
D          NULL            NULL
E          -1              1

               LIST_0   LIST_1
refIdxLeft     0        1
refIdxAbove    -1       1
refIdxCorner   -1       1
Result         0        1
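The rule used in this example (minimum of the available reference indices, or 0 if none is available) can be sketched per list as below. The function name is hypothetical, and a negative value is assumed to mark an unavailable neighbor, as in the table.

```cpp
#include <cassert>
#include <initializer_list>

// Reference index for the temporal merge candidate of one list.
// Negative inputs mark unavailable neighbors.
int mergeTemporalRefIdx(int refIdxLeft, int refIdxAbove, int refIdxCorner) {
    int best = -1;
    for (int r : { refIdxLeft, refIdxAbove, refIdxCorner }) {
        if (r >= 0 && (best < 0 || r < best)) {
            best = r;
        }
    }
    return best < 0 ? 0 : best;
}
```

Applied to the example, LIST_0 gives mergeTemporalRefIdx(0, -1, -1) = 0 and LIST_1 gives mergeTemporalRefIdx(1, 1, 1) = 1.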
Merge - Additional cand. list
1. Combined bi-directional merge candidate (5 times)
[Figure: current picture between List 0 Ref 0 and List 1 Ref 0; the uni-directional vectors mvL0_A and mvL1_B are reused as the two vectors of one bi-directional candidate]

Before:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_B, ref 0
2
3
4

After:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_B, ref 0
2       mvL0_A, ref 0   mvL1_B, ref 0
3
4
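A rough sketch of how such a combined candidate could be formed: the L0 motion of one existing candidate is paired with the L1 motion of another. The struct MergeCand, its field names and the simple pair order are assumptions for illustration; the actual HM process uses a fixed priority order and additional checks that are not shown here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Mv { int x; int y; };

struct MergeCand {
    bool useL0 = false;
    bool useL1 = false;
    Mv mvL0{0, 0};
    Mv mvL1{0, 0};
    int refL0 = -1;
    int refL1 = -1;
};

// Appends combined candidates (L0 part of candidate i, L1 part of candidate j, i != j)
// until the list holds maxCands entries. Only the original entries are combined.
void addCombined(std::vector<MergeCand>& list, std::size_t maxCands) {
    const std::size_t n = list.size();
    for (std::size_t i = 0; i < n; ++i) {
        for (std::size_t j = 0; j < n; ++j) {
            if (list.size() >= maxCands) return;
            if (i == j || !list[i].useL0 || !list[j].useL1) continue;
            MergeCand c;
            c.useL0 = true;
            c.useL1 = true;
            c.mvL0 = list[i].mvL0;
            c.refL0 = list[i].refL0;
            c.mvL1 = list[j].mvL1;
            c.refL1 = list[j].refL1;
            list.push_back(c);
        }
    }
}
```

With the two uni-directional entries of the table (mvL0_A, ref 0 and mvL1_B, ref 0), one call adds a third entry that uses both.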
Merge - Additional cand. list
2. Scaled bi-directional merge candidate (1 time)
[Figure: current picture with List 0 Ref 0, List 0 Ref 1, List 1 Ref 0 and List 1 Ref 1; mvL0_A (ref 0) and mvL1_A (ref 1), with mvL0_A (ref 0) also drawn towards the other list]

Before:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2
3
4

After:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL0_A, ref 0
3
4
Merge - Additional cand. list
3. Zero vector merge
Zero vector merge candidates are created by combining a zero vector with a refIdx

Before:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL1_A, ref 1
3
4

After:
Merge   L0              L1
0       mvL0_A, ref 0   -
1       -               mvL1_A, ref 1
2       mvL0_A, ref 0   mvL1_A, ref 1
3       (0,0), ref 0    (0,0), ref 0
4
AMP (ASYMMETRIC MOTION PARTITION)
Asymmetric motion partition (AMP)
Rectangular PU splitting of a block for inter prediction
AMP is used for CU sizes from 64x64 down to 16x16
AMP improves the coding efficiency for irregular image patterns
Partition modes: 2NxnU, 2NxnD, nLx2N, nRx2N
Asymmetric motion partition (AMP)

              Random access HE        Random access LC
              Y      U      V         Y      U      V
Class A       -0.9   -1.2   -0.9      -0.7   -0.7   -0.5
Class B       -0.9   -1.0   -1.0      -0.7   -0.7   -0.6
Class C       -0.9   -1.0   -1.1      -0.7   -0.9   -0.9
Class D       -0.8   -1.0   -0.9      -0.5   -0.7   -0.6
Class E
Overall       -0.9   -1.0   -1.0      -0.7   -0.7   -0.7
Enc Time[%]   144%                    151%
Dec Time[%]   99%                     99%

              Low delay (B) HE        Low delay (B) LC
              Y      U      V         Y      U      V
Class A
Class B       -1.1   -1.5   -1.5      -0.9   -0.8   -0.6
Class C       -1.0   -1.2   -1.3      -0.7   -0.6   -0.7
Class D       -1.1   -1.3   -1.5      -0.6   -0.5   -0.9
Class E       -2.3   -2.2   -2.4      -1.7   -1.1   -1.3
Overall       -1.3   -1.5   -1.6      -0.9   -0.7   -0.8
Enc Time[%]   144%                    150%
Dec Time[%]   99%                     99%
Table. Experimental result of AMP without encoding speed-up

              Random access HE        Random access LC
              Y      U      V         Y      U      V
Class A       -0.5   -0.8   -0.5      -0.4   -0.6   -0.2
Class B       -0.5   -0.8   -0.7      -0.4   -0.5   -0.5
Class C       -0.6   -0.8   -0.8      -0.5   -0.6   -0.7
Class D       -0.5   -0.9   -0.8      -0.4   -0.5   -0.6
Class E
Overall       -0.5   -0.8   -0.7      -0.4   -0.6   -0.5
Enc Time[%]   112%                    112%
Dec Time[%]   99%                     98%

              Low delay (B) HE        Low delay (B) LC
              Y      U      V         Y      U      V
Class A
Class B       -0.7   -1.1   -1.2      -0.5   -0.4   -0.3
Class C       -0.7   -1.0   -0.9      -0.4   -0.4   -0.7
Class D       -0.7   -1.2   -0.8      -0.5   -0.7   -0.2
Class E       -1.5   -1.9   -1.7      -1.0   -1.0   -0.9
Overall       -0.8   -1.2   -1.1      -0.6   -0.6   -0.5
Enc Time[%]   111%                    111%
Dec Time[%]   100%                    99%
Table. Experimental result of AMP with encoding speed-up
INTERPOLATION FILTER
Interpolation filter of H.264/AVC
1/4-pel accuracy motion vector
Cascaded filtering: 6-tap half-pel + bi-linear for luma
Bi-linear for chroma (1/8-pel)
Position      Filter
Integer-pel   no interpolation
Half-pel      6-tap
Quarter-pel   6-tap + bi-linear
Interpolation filter of HEVC
1/4-pel accuracy motion vector
1-pass filter: 8-tap for both 1/2-pel and 1/4-pel
4-tap filter for chroma (1/8-pel)
Position      Filter
Integer-pel   no interpolation
Half-pel      8-tap
Quarter-pel   8-tap
Interpolation filter
Two modifications in HM4.0 and WD4.0
The motion compensation process is simplified by removing rounding operations
All data after each of the vertical and horizontal filtering passes is guaranteed to fit in 16-bit memory

Advantage
Software simpler
Text simpler
No difference in performance
Interpolation filter
Integer samples: upper-case letters
Fractional sample positions: lower-case letters
Sample positions for quarter sample luma interpolation:

A-1,-1  A0,-1  a0,-1  b0,-1  c0,-1  A1,-1  A2,-1
A-1,0   A0,0   a0,0   b0,0   c0,0   A1,0   A2,0
d-1,0   d0,0   e0,0   f0,0   g0,0   d1,0
h-1,0   h0,0   i0,0   j0,0   k0,0   h1,0
n-1,0   n0,0   p0,0   q0,0   r0,0   n1,0
A-1,1   A0,1   a0,1   b0,1   c0,1   A1,1   A2,1
A-1,2   A0,2                        A1,2   A2,2
Interpolation filter
Interpolation filter coefficients
Luma
Position   Filter coefficients
1/4        { -1, 4, -10, 57, 19, -7, 3, -1 }
1/2        { -1, 4, -11, 40, 40, -11, 4, -1 }
Chroma
Position   Filter coefficients
1/8        { -3, 60, 8, -1 }
1/4        { -4, 54, 16, -2 }
3/8        { -5, 46, 27, -4 }
1/2        { -4, 36, 36, -4 }
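As a quick sanity check on these tables, every filter sums to 64, so a flat area keeps its value after the 6-bit normalization. The array and function names below are hypothetical; the coefficients are the ones listed above.

```cpp
#include <cassert>
#include <numeric>

const int kLumaFilter[2][8] = {
    { -1, 4, -10, 57, 19, -7, 3, -1 },   // 1/4 position
    { -1, 4, -11, 40, 40, -11, 4, -1 },  // 1/2 position
};

const int kChromaFilter[4][4] = {
    { -3, 60,  8, -1 },  // 1/8 position
    { -4, 54, 16, -2 },  // 1/4 position
    { -5, 46, 27, -4 },  // 3/8 position
    { -4, 36, 36, -4 },  // 1/2 position
};

// Sum of the taps of one filter; 64 corresponds to unit gain after a shift by 6.
template <int N>
int tapSum(const int (&taps)[N]) {
    return std::accumulate(taps, taps + N, 0);
}
```

tapSum returns 64 for each row of both tables, which matches a filter precision of 6 bits.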
Interpolation filter
Luma interpolation process (1D interpolation filter)
For fractional positions a0,0, b0,0 and c0,0, a horizontal 1D filter is used.
For fractional positions d0,0, h0,0 and n0,0, a vertical 1D filter is used.
The input of the 1D interpolation function is the integer-position sample values.
The output is the interpolated value X at the fractional position.
[Figure: integer and fractional luma sample positions]
Interpolation filter
Example) 1/2 position b0,0
8-tap separable DCTIF coefficients of the 1/2 position:
{ -1, 4, -11, 40, 40, -11, 4, -1 }
[Figure: integer and fractional luma sample positions]
$b_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 11A_{-1,0} + 40A_{0,0} + 40A_{1,0} - 11A_{2,0} + 4A_{3,0} - A_{4,0} + 32) / 64$
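The half-pel formula can be tried out directly. A minimal sketch, assuming a row of integer samples with at least three samples to the left and four to the right of A0,0; the function name halfPelB is hypothetical, the division by 64 is written as a shift, and the final clipping to the sample range is left out.

```cpp
#include <cassert>

// Half-sample value b(0,0) between A(0,0) and A(1,0).
// row points at A(0,0); row[-3] .. row[4] must be valid.
// Right shift of a negative sum is assumed to be arithmetic.
int halfPelB(const int* row) {
    static const int coeff[8] = { -1, 4, -11, 40, 40, -11, 4, -1 };
    int sum = 0;
    for (int k = 0; k < 8; ++k) {
        sum += coeff[k] * row[k - 3];
    }
    return (sum + 32) >> 6;
}
```

On a flat row the result equals the input value, and on the ramp 0, 10, ..., 70 the position between the samples 30 and 40 comes out as 35.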
Interpolation filter
Luma interpolation process (2D separable interpolation filter)
For the remaining positions, the horizontal 1D filter is first applied to the extended block, and then the vertical 1D filter is applied.
[Figure: integer and fractional luma sample positions]
Interpolation filter
Example) 1/4 position e0,0
2D separable interpolation: 8 x horizontal 1D filter + 1 x vertical 1D filter
[Figure: integer and fractional luma sample positions]
Interpolation filter
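1D filtering (example: position a0,0)
$a_{0,0} = (-A_{-3,0} + 4A_{-2,0} - 10A_{-1,0} + 57A_{0,0} + 19A_{1,0} - 7A_{2,0} + 3A_{3,0} - A_{4,0} + \mathrm{offset}_1) \gg \mathrm{shift}_1$

2D filtering (example: position e0,0)
Intermediate value should be saved and processed
$a'_{0,j} = (-A_{-3,j} + 4A_{-2,j} - 10A_{-1,j} + 57A_{0,j} + 19A_{1,j} - 7A_{2,j} + 3A_{3,j} - A_{4,j} + \mathrm{offset}_1) \gg \mathrm{shift}_1, \quad j = -3, \dots, 4$
$e_{0,0} = (-a'_{0,-3} + 4a'_{0,-2} - 10a'_{0,-1} + 57a'_{0,0} + 19a'_{0,1} - 7a'_{0,2} + 3a'_{0,3} - a'_{0,4} + \mathrm{offset}_2) \gg \mathrm{shift}_2$
The offset and shift values depend on the pass (first or last) and on the bit depth.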
Interpolation filter - Example
template<int N, bool isVertical, bool isFirst, bool isLast>
Void TComInterpolationFilter::filter(Short const *src, Int srcStride, Short *dst, Int dstStride, Int width, Int height, Short const *coeff)
{
  Int row, col;
  Int cStride = ( isVertical ) ? srcStride : 1;  // step between taps: srcStride (vertical filtering), 1 (horizontal filtering)
  src -= ( N/2 - 1 ) * cStride;                  // move to the first tap; N: 8 (luma), 4 (chroma)
  Int offset;
  Short maxVal;
  Int headRoom = IF_INTERNAL_PREC - (g_uiBitDepth + g_uiBitIncrement);  // IF_INTERNAL_PREC: 14; g_uiBitDepth + g_uiBitIncrement: 10 (HE), 8 (LC)
  Int shift = IF_FILTER_PREC;                    // IF_FILTER_PREC: 6

  // isFirst: whether this is the first filtering pass; isLast: whether it is the last one
  // Values below assume IF_INTERNAL_OFFS = 1 << (IF_INTERNAL_PREC - 1) = 1 << 13, as defined in HM
  if ( isLast ) {  // last filtering pass
    shift += (isFirst) ? 0 : headRoom;                             // isFirst: shift = 6; otherwise: shift = 6+4 (HE), 6+6 (LC)
    offset = 1 << (shift - 1);                                     // isFirst: offset = 1<<5 = 32; otherwise: 1<<9 (HE), 1<<11 (LC)
    offset += (isFirst) ? 0 : IF_INTERNAL_OFFS << IF_FILTER_PREC;  // !isFirst: offset = (1<<9) + (1<<19) (HE), (1<<11) + (1<<19) (LC)
    maxVal = g_uiIBDI_MAX;                                         // maxVal: 1023 (HE), 255 (LC)
  } else {         // first pass of a 2D filtering, or output kept at internal precision
    shift -= (isFirst) ? headRoom : 0;                             // isFirst: shift = 2 (HE), 0 (LC)
    offset = (isFirst) ? -(IF_INTERNAL_OFFS << shift) : 0;         // isFirst: -(1<<15) (HE), -(1<<13) (LC)
    maxVal = 0;
  }

  for (row = 0; row < height; row++) {
    for (col = 0; col < width; col++) {
      Int sum = 0;
      for (Int k = 0; k < N; k++) {  // N taps, so the 4-tap chroma filter does not read 8 samples
        sum += src[ col + k * cStride ] * coeff[k];
      }
      Short val = ( sum + offset ) >> shift;
      if ( isLast ) {  // clipping in the last filtering pass
        val = ( val < 0 ) ? 0 : val;
        val = ( val > maxVal ) ? maxVal : val;
      }
      dst[col] = val;  // store the filtered output sample
    }
    src += srcStride;
    dst += dstStride;
  }
}
modified version for seminar
Interpolation filter Example. Half-pel
Filter coefficients: -1 4 -11 40 40 -11 4 -1
Example) 1/2 position b0,0
8-tap separable DCTIF coefficients of the 1/2 position
isFirst = true;
isLast = true; (uni-direction case)
shift = 6;
offset = 1<<(6-1) = 32;
maxVal = 1023 (HE), 255 (LC);
cStride = 1; (horizontal filtering)
A0,0  a0,0  b0,0  c0,0  A1,0
d0,0  e0,0  f0,0  g0,0  d1,0
h0,0  i0,0  j0,0  k0,0  h1,0
n0,0  p0,0  q0,0  r0,0  n1,0
A0,1  a0,1  b0,1  c0,1  A1,1
Interpolation filter Example. Quarter pel
Filter coefficients: -1 4 -10 57 19 -7 3 -1
Example) 1/4 position e0,0
2D separable interpolation
(1) Horizontal filtering
isFirst = true;
isLast = false;
shift = 2 (HE), 0 (LC);
offset = -(IF_INTERNAL_OFFS << shift) = -(1<<15) (HE), -(1<<13) (LC), with IF_INTERNAL_OFFS = 1<<(IF_INTERNAL_PREC-1) = 1<<13;
maxVal = 0;
cStride = 1; (horizontal filtering)
A0,0  a0,0  b0,0  c0,0  A1,0
d0,0  e0,0  f0,0  g0,0  d1,0
h0,0  i0,0  j0,0  k0,0  h1,0
n0,0  p0,0  q0,0  r0,0  n1,0
A0,1  a0,1  b0,1  c0,1  A1,1
Interpolation filter Example. Quarter pel
Filter coefficients: -1 4 -10 57 19 -7 3 -1
Example) 1/4 position e0,0
2D separable interpolation
(2) Vertical filtering
isFirst = false;
isLast = true;
shift = 10 (HE), 12 (LC);
offset = (1 << (shift-1)) + (IF_INTERNAL_OFFS << IF_FILTER_PREC) = (1<<9) + (1<<19) (HE), (1<<11) + (1<<19) (LC), with IF_INTERNAL_OFFS = 1<<(IF_INTERNAL_PREC-1) = 1<<13;
maxVal = 1023 (HE), 255 (LC);
cStride = srcStride; (vertical filtering)
A0,0  a0,0  b0,0  c0,0  A1,0
d0,0  e0,0  f0,0  g0,0  d1,0
h0,0  i0,0  j0,0  k0,0  h1,0
n0,0  p0,0  q0,0  r0,0  n1,0
A0,1  a0,1  b0,1  c0,1  A1,1
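The two passes for position e0,0 can be put together in a compact sketch. It follows the shift and offset rules of the filter() listing but is not the HM code: the function name quarterPelE, the fixed 8x8 input block and the constant names are hypothetical, and the internal offset is assumed to be 1 << (internal precision - 1). Right shifts of negative values are assumed to be arithmetic.

```cpp
#include <cassert>

const int kQuarter[8] = { -1, 4, -10, 57, 19, -7, 3, -1 };
const int kInternalPrec = 14;                        // internal precision in bits
const int kFilterPrec = 6;                           // log2 of the tap sum
const int kInternalOffs = 1 << (kInternalPrec - 1);  // assumed internal offset

// block holds integer samples A(-3..4, -3..4), row-major, so A(0,0) is block[3][3].
// bitDepth is 8 or 10. Returns the quarter-sample value e(0,0).
int quarterPelE(const int block[8][8], int bitDepth) {
    const int headRoom = kInternalPrec - bitDepth;

    // Pass 1 (first, not last): 8 horizontal filterings, no clipping, 16-bit results.
    const int shift1 = kFilterPrec - headRoom;
    const int offset1 = -(kInternalOffs << shift1);
    short tmp[8];
    for (int r = 0; r < 8; ++r) {
        int sum = 0;
        for (int k = 0; k < 8; ++k) sum += kQuarter[k] * block[r][k];
        tmp[r] = static_cast<short>((sum + offset1) >> shift1);
    }

    // Pass 2 (not first, last): 1 vertical filtering with rounding and clipping.
    const int shift2 = kFilterPrec + headRoom;
    const int offset2 = (1 << (shift2 - 1)) + (kInternalOffs << kFilterPrec);
    int sum = 0;
    for (int k = 0; k < 8; ++k) sum += kQuarter[k] * tmp[k];
    const int val = (sum + offset2) >> shift2;
    const int maxVal = (1 << bitDepth) - 1;
    return val < 0 ? 0 : (val > maxVal ? maxVal : val);
}
```

The internal offset subtracted in the first pass is added back, scaled by the tap sum of 64, in the second pass, so a flat block returns its own value for both 8-bit and 10-bit input.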
THANK YOU
