Algorithm and Architecture Design of the H.265/HEVC Intra Encoder

ABSTRACT:
Improved video coding techniques introduced in the H.265/HEVC standard allow
video encoders to achieve better compression efficiency. On the other hand, the
increased complexity requires a new design methodology able to meet the challenges
associated with ever higher spatio-temporal resolutions. The paper presents a
computationally scalable algorithm and its hardware architecture, which support
intra encoding at resolutions up to 2160p@30fps. The scalability allows a
tradeoff between the throughput and the compression efficiency. In particular, the
encoder is able to check a variable number of candidate modes. The rate estimation
based on bin counting and the distortion estimation in the transform domain
simplify the rate-distortion analysis and enable the evaluation of a great number of
candidate intra modes. The encoder preselects candidate modes by the processing
of 8×8 predictions computed from original samples. The preselection shares
hardware resources used for the processing of predictions generated from
reconstructed samples. To support intra 4×4 modes for the 2160p@30fps
resolution, the encoder incorporates a separate reconstruction loop. The processing
of blocks with different sizes is interleaved to compensate for the delay of
reconstruction loops. Implementation results show that the encoder utilizes 1086k
gates and 52 kB of on-chip memory in TSMC 90 nm technology. The main reconstruction loop
can operate at 400 MHz, whereas the remaining modules work at 200 MHz. For
2160p@30fps videos, the average BD-Rate is 5.46% compared to the HM
software.
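To put the quoted clock frequencies in perspective, the short sketch below estimates the cycle budget per sample for 2160p@30fps. It assumes a 3840×2160 luma frame and 4:2:0 chroma subsampling, neither of which is stated in the abstract; the function names are illustrative only.

```python
# Sketch: cycle budget per sample for 2160p@30fps.
# Assumptions (not stated in the text): 3840x2160 luma, 4:2:0 chroma.
WIDTH, HEIGHT, FPS = 3840, 2160, 30


def samples_per_second(width, height, fps, chroma_factor=1.5):
    """Luma plus chroma samples per second; 1.5 corresponds to 4:2:0."""
    return int(width * height * fps * chroma_factor)


def cycles_per_sample(clock_hz, width=WIDTH, height=HEIGHT, fps=FPS):
    """Average number of clock cycles available for each input sample."""
    return clock_hz / samples_per_second(width, height, fps)


if __name__ == "__main__":
    for clock_mhz in (400, 200):
        budget = cycles_per_sample(clock_mhz * 1_000_000)
        print(f"{clock_mhz} MHz: {budget:.2f} cycles per sample")
```

Under these assumptions the 400 MHz loop has roughly one cycle per sample, and the 200 MHz modules about half a cycle, so the datapaths must handle several samples per cycle when multiple candidate modes are evaluated.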
EXISTING SYSTEM:
The latest research and standardization efforts in video coding have led to the
specification of the H.265/HEVC (High Efficiency Video Coding) standard in 2013
[1-3]. It significantly improves the rate-distortion efficiency compared to its
predecessor, H.264/AVC [4]. On the other hand, the improvement is achieved at the
cost of increased computational complexity. The problem is of particular
importance given the ever higher demands for spatio-temporal video
resolutions. In many applications, support for real-time compression is
indispensable. To address this requirement, much research and development work
has been started.
H.265/HEVC extends the 16×16 macroblock to the 64×64 coding tree unit
(CTU), which can be recursively split into four coding units (CUs). The standard
specifies more sizes for prediction units (PUs) and transform units (TUs) included
in a CU. In the case of intra coding, 33 directional and two non-directional
modes allow a more accurate spatial prediction of successive blocks, whereas
H.264/AVC employs up to nine modes. The best efficiency is achieved when using
the expensive rate-distortion optimization (RDO). However, the search for the
optimal mode in a brute-force fashion involves a large number of computations.
Therefore, it is beneficial to preselect some modes based on a simplified cost
function. This method is applied in the HM reference software [3], which uses the Sum
of Absolute Transformed Differences (SATD) as the cost function. Another speed-up
technique applied in the software is the table-based rate estimation [5]. Instead
of performing Context Adaptive Binary Arithmetic Coding (CABAC), the
technique accumulates bin contributions pre-calculated for each possible
probability state. Simplifications applied in the software introduce slight quality
losses. Nevertheless, the complexity is still very high. Its reduction is indispensable to
obtain an algorithm suitable for a real-time implementation with a reasonable
amount of resources.
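The SATD-based preselection described above can be sketched as follows. This is a minimal illustration, not the HM implementation: it applies an unnormalized Hadamard transform to the prediction residual and keeps the modes with the lowest cost. The mode labels, the `preselect` helper and the sample blocks are hypothetical.

```python
def hadamard(n):
    """Sylvester construction of an n x n Hadamard matrix (n a power of two)."""
    h = [[1]]
    while len(h) < n:
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def satd(original, prediction):
    """Sum of absolute Hadamard-transformed differences of two square blocks."""
    h = hadamard(len(original))
    diff = [[o - p for o, p in zip(ro, rp)] for ro, rp in zip(original, prediction)]
    # The Sylvester matrix is symmetric, so H * D * H equals H * D * H^T.
    coeffs = matmul(matmul(h, diff), h)
    return sum(abs(c) for row in coeffs for c in row)


def preselect(original, predictions, keep):
    """Return the `keep` mode labels with the lowest SATD cost (hypothetical helper)."""
    ranked = sorted(predictions, key=lambda mode: satd(original, predictions[mode]))
    return ranked[:keep]


# Illustrative 4x4 block whose rows are identical, so a vertical prediction fits exactly.
original = [[10, 20, 30, 40] for _ in range(4)]
predictions = {
    "dc": [[25] * 4 for _ in range(4)],
    "horizontal": [[10] * 4, [20] * 4, [30] * 4, [40] * 4],
    "vertical": [[10, 20, 30, 40] for _ in range(4)],
}

if __name__ == "__main__":
    print(preselect(original, predictions, keep=2))
```

Only the preselected modes would then go through the full rate-distortion analysis, which is where most of the computation is saved.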
PROPOSED SYSTEM:
In this paper, the algorithm and the architecture design for a computationally
scalable H.265/HEVC intra encoder are proposed. The architecture supports
resolutions up to 2160p@30fps. The encoder allows a tradeoff between the
compression efficiency and the throughput. The design takes advantage of the
following new techniques:
1. The rate estimation based on bin counting and the distortion estimation in the
transform domain simplify rate-distortion analysis and enable the analysis of a
great number of candidate modes.
2. The encoder preselects candidate modes by the processing of 8×8 predictions
computed from original samples.
3. The preselection shares hardware resources used for the RDO processing of
predictions generated from reconstructed samples.
4. The encoder incorporates a separate reconstruction loop to support intra 4×4
modes for the 2160p@30fps resolution.
5. The processing of blocks with different sizes/types is interleaved to compensate
for the delay of reconstruction loops.
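The idea behind estimating distortion in the transform domain (technique 1) can be illustrated with a small sketch. For an orthonormal transform, the sum of squared differences between a residual and its reconstruction equals the sum of squared quantization errors of the coefficients, so no inverse transform is needed to obtain the distortion. The sketch uses a scaled 4×4 Hadamard transform as a stand-in; the HEVC core transform is only approximately orthogonal, so equality would hold approximately there. The quantization step and the sample residual are hypothetical.

```python
import math

H4 = [[1, 1, 1, 1], [1, -1, 1, -1], [1, 1, -1, -1], [1, -1, -1, 1]]


def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transform(block):
    """Orthonormal 2-D Hadamard transform: (H/2) * X * (H/2)^T == H * X * H / 4.
    H/2 is symmetric and orthogonal, so this function is also its own inverse."""
    return [[v / 4 for v in row] for row in matmul(matmul(H4, block), H4)]


def quantize(coeffs, step):
    """Uniform quantization followed by dequantization (step is hypothetical)."""
    return [[math.floor(c / step + 0.5) * step for c in row] for row in coeffs]


def ssd(a, b):
    """Sum of squared differences between two blocks."""
    return sum((x - y) ** 2 for ra, rb in zip(a, b) for x, y in zip(ra, rb))


residual = [[5, -3, 2, 0], [4, 1, -2, 7], [0, 0, 3, -1], [-6, 2, 1, 1]]
coeffs = transform(residual)
dequantized = quantize(coeffs, step=4)

d_transform = ssd(coeffs, dequantized)                # no inverse transform needed
d_spatial = ssd(residual, transform(dequantized))     # reference: reconstruct first

if __name__ == "__main__":
    print(d_transform, d_spatial)
```

Both values agree up to floating-point rounding, which is why the distortion can be taken directly from the quantization error.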
Fig. 1 shows probability density functions of the ratio of the number of bits produced
by CABAC to the number of input bins. The functions are estimated from
statistics collected for all CUs. As can be seen, the number of bits is highly correlated with
the number of bins. The correlation is stronger for smaller QP values. Taking advantage of
the correlation, it is possible to directly replace the output-bit counting by the
input-bin counting to estimate rates. In this approach, contributions of non-bypass
bins are computed with an error. As a consequence, RD costs determined for
particular mode combinations are inaccurate, leading to some losses in
compression efficiency. On the other hand, the bin counting significantly simplifies
computations since the reference to CABAC probability models is avoided.
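The following sketch contrasts the bin-counting estimate with an idealized arithmetic-coding cost. It assumes each bin is described only by the model probability of the value actually coded (0.5 for bypass bins) and charges `-log2(p)` bits in the ideal case. The probabilities, distortion and Lagrange multiplier are hypothetical values chosen for illustration, and the model ignores the adaptive state updates of real CABAC.

```python
import math


def ideal_bits(bins):
    """Idealized arithmetic-coding cost: -log2(p) per bin, where p is the
    model probability of the coded bin value (0.5 for bypass bins)."""
    return sum(-math.log2(p) for p in bins)


def bin_count(bins):
    """Bin-counting estimate: every bin is charged exactly one bit."""
    return len(bins)


def rd_cost(distortion, rate_bits, lam):
    """Lagrangian rate-distortion cost J = D + lambda * R."""
    return distortion + lam * rate_bits


if __name__ == "__main__":
    bypass_only = [0.5, 0.5, 0.5, 0.5]
    mixed = [0.5, 0.5, 0.8, 0.3, 0.6]   # last three are context-coded (non-bypass)
    for name, bins in (("bypass only", bypass_only), ("mixed", mixed)):
        exact, approx = ideal_bits(bins), bin_count(bins)
        print(f"{name}: ideal {exact:.2f} bits, bin count {approx} bits")
        print(f"  J(ideal) = {rd_cost(100.0, exact, 2.0):.2f}, "
              f"J(bin count) = {rd_cost(100.0, approx, 2.0):.2f}")
```

In this model bypass bins cost exactly one bit, so the estimate is exact for them; the error comes only from context-coded bins, which matches the observation above about non-bypass bins.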
SOFTWARE IMPLEMENTATION:
Modelsim 6.0
Xilinx 14.2
HARDWARE IMPLEMENTATION:
SPARTAN-III, SPARTAN-VI
