SELVARANI. K (951911106080)
SUJITHA. M (951911106093)
SWARNAMUKI. R (951911106096)
BACHELOR OF ENGINEERING
in
ELECTRONICS AND COMMUNICATION ENGINEERING
P.S.R.ENGINEERING COLLEGE, SIVAKASI-626140
APRIL 2015
SIGNATURE
C.K.RAMAR, M.E.,
Engineering
Sivakasi-626 140

SIGNATURE
Mrs.J.MEENA, M.E.,
SUPERVISOR
Assistant professor
Engineering
Sivakasi-626 140

INTERNAL EXAMINER
EXTERNAL EXAMINER
ACKNOWLEDGEMENT
First and foremost, we wish to express our deep gratitude to our
institution and our department for giving us the chance to fulfil our
long-cherished dream of becoming Electronics and Communication
Engineers.
We thank our beloved correspondent Mr.R.Solaisamy for his
support, and every staff member of our college for their contribution to
the growth of this project.
We wish to express our hearty thanks to the principal of our
college, Dr.B.G.Vishnuram, M.E.,Ph.D.,FIE., for his constant
motivation and continual encouragement regarding our project work.
We are greatly indebted to our Head of Department,
Mr.C.K.Ramar, M.E., for his sincere help and the encouragement he has
given towards the accomplishment of this project work.
We express our warm and sincere thanks to our guide,
Mrs.J.Meena, M.E., Assistant Professor/Electronics and Communication
Engineering, for her tireless and meticulous efforts in bringing this
project to its logical conclusion.
We offer our heartfelt thanks to all teaching and non-teaching staff
members, lab technicians and friends, and all the noble hearts who gave
us immense encouragement towards the completion of our project.
ABSTRACT
Compressing video signals efficiently, with very small delay and small
area, is a challenging task in VLSI design using Verilog HDL. This
project proposes an architecture for compressing video signals that
supports three video codec standards: MPEG-1/2/4 (8×8), H.264 (8×8, 4×4),
and VC-1 (8×8, 8×4, 4×8, 4×4).
TABLE OF CONTENTS
CHAPTER NO    TITLE
              ABSTRACT
              LIST OF FIGURES
              LIST OF TABLES
              LIST OF ABBREVIATIONS
1             INTRODUCTION
              1.7 Applications
2             VIDEO CODECS
              2.2.1.1 MPEG-1
              2.2.1.2 MPEG-2
              2.2.1.3 MPEG-4
              2.2.2 H.264
              2.2.3 VC-1
3             SYSTEM ANALYSIS
              3.2.1 Introduction
              3.2.2 CSDA
              3.3.1 Introduction
              3.5.1 Description
              3.6 Modules
              3.6.4 ECAT
              3.6.5 Permutation
4             SYSTEM IMPLEMENTATION
              4.1.2 Synthesis
              4.1.3 Implementation
              4.1.4 Verification
              4.6 VERILOG
5             RESULT ANALYSIS
6             CONCLUSION
              REFERENCES
LIST OF FIGURES
FIGURE NO    TITLE
3.5          Architecture of ECAT
3.6          Permutation Concept
3.8          TMEM
LIST OF TABLES
TABLE NO     TITLE
5.1          Measured Results
LIST OF ABBREVIATIONS
ABBREVIATIONS    ACRONYMS
ASICS
CAD
CMOS
CSDA             Common Sharing Distributed Arithmetic
DA               Distributed Arithmetic
ECAT
FS               Factor Sharing
IT               Integer Transform
MPEG             Moving Picture Experts Group
MST
NEDA
RTL
SOC              System On Chip
TMEM             Transposition Memory
VC               Video Codec
VHDL             Very High Speed Integrated Circuits Hardware Description Language
CHAPTER 1
INTRODUCTION
Compression of video and image signals is mainly done using transforms
such as the Discrete Cosine Transform and Integer Transforms, together
with techniques such as Distributed Arithmetic and Factor Sharing. These
are mainly used as matrix decomposition methods to reduce the hardware
cost as well as the implementation cost, but implementing such
transforms can be tedious, especially when building a single
architecture that is compatible with different standards. This project
implements a new technique that supports three video coding standards.
1.1 VIDEO COMPRESSION
Video compression uses modern coding techniques to reduce very high
data rate. Although lossless video
LOSSY COMPRESSION
In information technology, "lossy" compression is the class of data
they are all closely related to each other. Together, these developments
are going to make possible the visions of embedded systems and
ubiquitous computing.
1.4.1 Reconfigurable computing
Reconfigurable computing is a very interesting and fairly recent
development in microelectronics. It involves fabricating circuits that can
be reprogrammed on the fly. And no, we are not talking about
microcontrollers running with EEPROM inside. Reconfigurable
possibilities in microelectronics. Consider, for example, design specifications that
specifications? get translated into hardware
winding process, going through many stages, with special effort spent on
design verification at every stage. This means that the time from drawing
board to market is very long. This is rather undesirable in the case
of a large, expanding market, with many competitors trying to grab a share.
We need alternatives to cut down on this time so that new ideas reach the
market faster, where the first entrant normally gains a large
advantage.
B.
a need for a large number of chip designers, who can churn out chips
designed for specific applications. It is impractical to think of training so
many people in the intricacies of VLSI design.
C. Specialized training
A person who wishes to design ASICs will require extensive training
in the field of VLSI design. But we cannot expect to find a large
number of people who would wish to undergo such training. Also, the
process of training these people will itself entail large investments in time
and money. This means there has to be a system which can abstract out all
the details of VLSI, and which allows the user to think in simple
system-level terms.
There are quite a few tools available for using high-level languages
in circuit design, but this area has started bearing fruit only recently.
For example, there is a language called Handel-C that looks just like
good old C, but has some special extensions that make it usable for
defining circuits. A program written in Handel-C can be represented
block-by-block by hardware equivalents. In doing all this, the
compiler takes care of low-level issues like clock frequency, layout,
etc. The biggest selling point is that the user does not really have to learn
anything new, except for the few extensions made to C so that it may be
conveniently used for circuit design.
Another, quite different language that is still under development is
Lava. It is based on a branch of computer science called
"functional programming". FP itself is quite old, and is radically
different from the normal way we write programs, because it
assumes parallel execution as a part of its structure; it is not based on the
normal idea of a "sequence of instructions". This parallel nature is
very suitable for hardware, since logic circuits are
inherently parallel in nature. Preliminary studies have shown that Lava
can actually create better circuits than VHDL itself, since it affords a
high-level view of the system without losing sight of low-level features.
1.5 DESIGN METHODOLOGY
A good VLSI design system should provide for consistent in all
1.7 APPLICATIONS
Digital video codecs are found in DVD systems (players,
CHAPTER 2
VIDEO CODECS
2.1
using a zig-zag scan order, and the entropy coding typically combines a
number of consecutive zero-valued quantized coefficients with the value
of the next non-zero quantized coefficient into a single symbol, and also
has special ways of indicating when all of the remaining quantized
coefficient values are equal to zero. The entropy coding method typically
uses variable-length coding tables. Some encoders can compress the
video in a multiple step process called n-pass encoding (e.g. 2-pass),
which performs a slower but potentially better quality compression.
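The zig-zag scan mentioned above can be sketched as a small lookup table. This is a minimal illustration, assuming a 4×4 block to keep the table short; the module name `zigzag4x4` and its port names are hypothetical and are not taken from the project's design. The entropy-coding step that pairs zero runs with the next non-zero value is not modelled here.

```verilog
// Illustrative zig-zag address generator for a 4x4 block (assumed size).
// scan_idx counts coefficients in zig-zag order; raster_addr is the
// position (4*row + col) of that coefficient inside the block.
module zigzag4x4 (
    input  wire [3:0] scan_idx,
    output reg  [3:0] raster_addr
);
    always @* begin
        case (scan_idx)
            4'd0:    raster_addr = 4'd0;   // (row 0, col 0)
            4'd1:    raster_addr = 4'd1;   // (0, 1)
            4'd2:    raster_addr = 4'd4;   // (1, 0)
            4'd3:    raster_addr = 4'd8;   // (2, 0)
            4'd4:    raster_addr = 4'd5;   // (1, 1)
            4'd5:    raster_addr = 4'd2;   // (0, 2)
            4'd6:    raster_addr = 4'd3;   // (0, 3)
            4'd7:    raster_addr = 4'd6;   // (1, 2)
            4'd8:    raster_addr = 4'd9;   // (2, 1)
            4'd9:    raster_addr = 4'd12;  // (3, 0)
            4'd10:   raster_addr = 4'd13;  // (3, 1)
            4'd11:   raster_addr = 4'd10;  // (2, 2)
            4'd12:   raster_addr = 4'd7;   // (1, 3)
            4'd13:   raster_addr = 4'd11;  // (2, 3)
            4'd14:   raster_addr = 4'd14;  // (3, 2)
            default: raster_addr = 4'd15;  // (3, 3)
        endcase
    end
endmodule
```

Reading coefficients in this order places the low-frequency values first, so the zero-valued high-frequency coefficients tend to cluster at the end of the scan.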
The decoding process consists of performing, to the extent
possible, an inversion of each stage of the encoding process. The one
stage that cannot be exactly inverted is the quantization stage. There, a
best-effort approximation of inversion is performed. This part of the
process is often called "inverse quantization" or "dequantization",
although quantization is an inherently non-invertible process.
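To see why quantization cannot be inverted exactly, consider the following sketch. It assumes, purely for illustration, a step size that is a power of two so that quantization reduces to a shift; practical codecs use more general step sizes. The module name, port names and parameters are hypothetical.

```verilog
// Illustrative quantize / dequantize pair with a power-of-two step size.
// WIDTH and QBITS are assumptions made for this sketch.
module quant_dequant #(
    parameter WIDTH = 16,
    parameter QBITS = 3                       // step size of 2**QBITS
) (
    input  wire signed [WIDTH-1:0] coeff,     // transform coefficient
    output wire signed [WIDTH-1:0] level,     // quantized value
    output wire signed [WIDTH-1:0] recon      // dequantized approximation
);
    // Quantization discards the QBITS low-order bits of the coefficient.
    assign level = coeff >>> QBITS;

    // "Inverse quantization" scales back up; the discarded bits return
    // as zeros, so recon is only an approximation of coeff.
    assign recon = level <<< QBITS;
endmodule
```

With the default parameters, a coefficient of 29 gives a level of 3 and a reconstructed value of 24; the difference of 5 is the information lost in the quantization stage.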
This process involves representing the video image as a set of
macroblocks, a critical facet of video codec design.
Video codec designs are often standardized, or will be in the future, i.e., specified precisely in a published document. However, only the
decoding process needs to be standardized to enable interoperability. The
encoding process is typically not specified at all in a standard, and
implementers are free to design their encoder however they want, as long
as the video can be decoded in the specified manner. For this reason, the
quality of the video produced by decoding the results of different
encoders that use the same video codec standard can vary dramatically
from one encoder implementation to another.
2.2 DIFFERENT STANDARDS
In this project three different types of standards have to be
aspect of the whole specification. The standards also
specify Profiles and Levels. Profiles are intended to define a set of tools
that are available, and Levels define the range of appropriate values for
the properties associated with them. Some of the approved MPEG
standards were revised by later amendments and/or new editions. MPEG
has standardized the following compression formats and ancillary
standards:
2.2.1.1 MPEG-1
MPEG-1 includes the popular MPEG-1 Audio Layer III (MP3) audio compression
format.
2.2.1.2 MPEG-2
MPEG-2 is used as the compression scheme for over-the-air digital
television (ATSC, DVB and ISDB), digital satellite TV services like Dish
Network, digital cable television signals, SVCD and DVD Video. It is
also used on Blu-ray Discs, but these normally use MPEG-4 Part 10 or
SMPTE VC-1 for high-definition content.
2.2.1.3 MPEG-4
2.2.2 H.264
H.264 or MPEG-4 Part 10, Advanced Video Coding (MPEG-4
AVC) is a video compression format that is currently one of the most
commonly used formats for the recording, compression, and distribution
of video content. H.264/MPEG-4 AVC is a block-oriented,
motion-compensation-based video compression standard developed by the
ITU-T Video Coding Experts Group (VCEG) together with the ISO/IEC
JTC1 Moving Picture Experts Group (MPEG).
H.264 is perhaps best known as being one of the video encoding
standards for Blu-ray Discs; all Blu-ray Disc players must be able to
decode H.264. It is also widely used by streaming internet sources, such
as videos from Vimeo, YouTube, and the iTunes Store, and by web software
such as the Adobe Flash Player and Microsoft Silverlight. H.264 is
typically used for lossy compression in the strict mathematical sense,
although the amount of loss may sometimes be imperceptible. It is also
possible to create truly lossless encodings using it, e.g., to have localized
lossless-coded regions within lossy-coded pictures or to support rare use
cases for which the entire encoding is lossless.
The intent of the H.264/AVC project was to create a standard
capable of providing good video quality at substantially lower bit rates
than previous standards (i.e., half or less the bit rate of MPEG-2, H.263,
or MPEG-4 Part 2), without increasing the complexity of design so much
that it would be impractical or excessively expensive to implement. An
additional goal was to provide enough flexibility to allow the standard to
be applied to a wide variety of applications on a wide variety of networks
and systems, including low and high bit rates, low and high resolution
video, broadcast, DVD storage, RTP/IP packet networks, and ITU-T
multimedia telephony systems.
CHAPTER 3
SYSTEM ANALYSIS
3.1 PROJECT INTRODUCTION
Compression can mainly be done by using several transforms such as
Video Codecs    Dimensions              Groups
MPEG-1/2/4      8×8                     ISO
H.264           8×8, 4×4                ITU-T
VC-1            8×8, 8×4, 4×8, 4×4      Microsoft
processing, the function is any quantity or signal that varies over time,
such as the pressure of a sound wave, a radio signal, or
3.2 EXISTING SYSTEM
3.2.1 INTRODUCTION
Numerous researchers have worked on transform core designs,
including the discrete cosine transform (DCT) and integer transform, using
distributed arithmetic (DA), factor sharing (FS), and matrix
decomposition methods. However, existing transform cores for MPEG-1/2/4
and H.264 cannot support the VC-1 compression standard. The proposed
system is designed to overcome this limitation.
3.2.2 CSDA
CSDA stands for Common Sharing Distributed Arithmetic. It is a
technique that combines Factor Sharing and Distributed Arithmetic to
generate the CSDA coefficients. Factor Sharing means sharing the same
factors among the existing inputs, and Distributed Arithmetic means sharing
the same input coefficients. In the existing system, a pipeline register is
used as the storage element.
3.2.3 LIMITATIONS OF EXISTING SYSTEM
Low throughput
High cost
High delay
Large number of adders
3.3 PROPOSED SYSTEM
3.3.1 INTRODUCTION
The proposed CSDA combines the DA and FS methods. By expanding
the coefficient matrix at the bit level, the factor sharing method first shares
the same factor in each coefficient; the distributed arithmetic method is then
applied to share the same combination of inputs among each coefficient position.
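The sharing idea described above can be illustrated with a small shift-and-add multiplier. This is only a sketch under stated assumptions, not the project's CSDA core: the constants 6 and 12, the module name and the word length are chosen for illustration. At the bit level, 6 = 0110b and 12 = 1100b both contain the pattern 11b, so the term 3·x is built once and reused.

```verilog
// Illustrative sketch: two constant multiplications that share one factor.
// Module name, ports, constants and WIDTH are hypothetical.
module shared_factor_mult #(
    parameter WIDTH = 12                      // assumed input word length
) (
    input  wire signed [WIDTH-1:0] x,
    output wire signed [WIDTH+3:0] y6,        // 6  * x
    output wire signed [WIDTH+3:0] y12        // 12 * x
);
    // Sign-extend once so that the shifts below cannot overflow.
    wire signed [WIDTH+3:0] xe = x;

    // Shared factor: 3*x = x + 2*x, computed with a single adder.
    wire signed [WIDTH+3:0] x3 = xe + (xe <<< 1);

    assign y6  = x3 <<< 1;                    // 6*x  = (3*x) * 2
    assign y12 = x3 <<< 2;                    // 12*x = (3*x) * 4
endmodule
```

Computed separately, 6·x and 12·x would each need their own adder; sharing the common factor leaves one adder plus fixed shifts, which cost only wiring in hardware.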
3.3.2 BUFFER AS A MEMORY
In the proposed system, a buffer is used as the memory element instead
of a pipeline register. The buffer is active only when the clock input is
high. Using a buffer lets the bit stream pass through without halting in
the memory, so the delay is considerably reduced. Because nothing is
stored in a register, the retrieval time is very small.
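One way to picture the difference between the two storage styles is the sketch below. It assumes that the buffer behaves like a level-sensitive element that passes data while the clock is high; this interpretation, the module name and the signal names are assumptions and are not confirmed by the project's design files.

```verilog
// Hypothetical comparison of a pipeline register and a clock-enabled
// buffer. WIDTH and all names are assumptions for illustration.
module storage_styles #(
    parameter WIDTH = 16
) (
    input  wire             clk,
    input  wire [WIDTH-1:0] d,
    output reg  [WIDTH-1:0] q_reg,   // pipeline-register style
    output reg  [WIDTH-1:0] q_buf    // buffer style
);
    // Edge-triggered register: d is captured only on the rising edge,
    // so every value waits for the next clock edge.
    always @(posedge clk)
        q_reg <= d;

    // Level-sensitive buffer: transparent while clk is high and holding
    // its last value while clk is low (this infers a latch).
    always @(clk or d)
        if (clk)
            q_buf <= d;
endmodule
```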
3.3.3 ADVANTAGES OF PROPOSED SYSTEM
High throughput
Low cost
Supports three different types of video codecs
Reduction in number of adders
3.4
(3.1)
(3.2)
(3.3)
(3.4)
(3.5)
FLOW DIAGRAM
[Figure: flow diagram, starting from the input coefficient matrix]
MODULES
[Figures: module architectures with 1st stage and 2nd stage memory]
3.7
(3.6)
(3.7)
Because the eight-point coefficient structures in the MPEG-1/2/4,
H.264, and VC-1 standards are the same, the eight-point transform for
these standards can use the same mathematical derivation. According to the
symmetry property, the 1-D eight-point transform can be divided into
two four-point transforms, an even part Ze and an odd part Zo:
(3.8)
The even part of this operation is the same as that of the four-point
H.264 and VC-1 transformations. Moreover, the even part Ze can be
further decomposed into even and odd parts, Zee and Zeo:
(3.9)
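The symmetry-based split described above can be sketched in Verilog. This is a minimal illustration, assuming eight signed inputs x0 to x7 and a 12-bit word length; the module name, port names and widths are hypothetical and are not taken from the project's RTL. The sums feed the even part (Ze) and the differences feed the odd part (Zo); repeating the same step on the sums gives the inputs of Zee and Zeo.

```verilog
// Illustrative butterfly stages for an 8-point symmetric transform.
// Names and widths are assumptions made for this sketch.
module butterfly8 #(
    parameter WIDTH = 12
) (
    input  wire signed [WIDTH-1:0] x0, x1, x2, x3, x4, x5, x6, x7,
    output wire signed [WIDTH:0]   a0, a1, a2, a3,   // inputs of Ze
    output wire signed [WIDTH:0]   b0, b1, b2, b3,   // inputs of Zo
    output wire signed [WIDTH+1:0] e0, e1,           // inputs of Zee
    output wire signed [WIDTH+1:0] f0, f1            // inputs of Zeo
);
    // First stage: pair each sample with its mirror image.
    assign a0 = x0 + x7;
    assign a1 = x1 + x6;
    assign a2 = x2 + x5;
    assign a3 = x3 + x4;
    assign b0 = x0 - x7;
    assign b1 = x1 - x6;
    assign b2 = x2 - x5;
    assign b3 = x3 - x4;

    // Second stage: the same split applied to the even branch only.
    assign e0 = a0 + a3;
    assign e1 = a1 + a2;
    assign f0 = a0 - a3;
    assign f1 = a1 - a2;
endmodule
```

Each stage adds one bit of word length so that the sums and differences cannot overflow. The coefficient multiplications that follow these adders differ per standard and are not shown.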
3.7.2 TMEM
The TMEM is implemented using a 64-word, 12-bit dual-port buffer
and has a latency of 52 cycles. Based on the time scheduling strategy,
the 1st-D and 2nd-D transforms can be computed simultaneously. The
transposition memory is an 8×8 buffer array with a data width of 16 bits,
as shown in Fig. 3.8.
Fig.3.8 TMEM
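The role of the transposition memory can be sketched as a 64-word array that is written in row order and read with the row and column indices swapped. This is a simplified illustration under assumed names and a 16-bit word; it does not model the 52-cycle latency or the scheduling that lets the 1st-D and 2nd-D transforms overlap.

```verilog
// Illustrative 8x8 transposition memory with separate write and read
// ports. Module name, ports and WIDTH are assumptions for this sketch.
module tmem_sketch #(
    parameter WIDTH = 16
) (
    input  wire             clk,
    input  wire             we,                 // write enable
    input  wire [2:0]       wr_row, wr_col,     // position of 1st-D result
    input  wire [WIDTH-1:0] wr_data,
    input  wire [2:0]       rd_row, rd_col,     // position wanted by 2nd-D
    output reg  [WIDTH-1:0] rd_data
);
    reg [WIDTH-1:0] mem [0:63];                 // 64 words = 8 x 8

    always @(posedge clk) begin
        if (we)
            mem[{wr_row, wr_col}] <= wr_data;   // row-major write
        // Swapping the indices on the read side returns element
        // (rd_col, rd_row), i.e. the transposed block.
        rd_data <= mem[{rd_col, rd_row}];
    end
endmodule
```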
CHAPTER 4
SYSTEM IMPLEMENTATION
4.1
design software suite that allows you to take your design from design
entry through Xilinx device programming. The ISE Project Navigator
manages and processes your design through the following steps in the
ISE design flow.
4.1.1 Design Entry
Design entry is the first step in the ISE design flow. During design
entry, you create your source files based on your design objectives. You
can create your top-level design file using a Hardware Description
Language (HDL), such as VHDL, Verilog, or ABEL, or using a
schematic. You can use multiple formats for the lower-level source files
in your design.
4.1.2 Synthesis
After design entry and optional simulation, you run synthesis.
During this step, VHDL, Verilog, or mixed language designs become net
list files that are accepted as input to the implementation step.
4.1.3 Implementation
After synthesis, you run design implementation, which converts the
logical design into a physical file format that can be downloaded to the
selected target device. From Project Navigator, you can run the
implementation process in one step, or you can run each of the
implementation processes separately. Implementation processes vary
ModelSim Overview
ModelSim is a very powerful simulation environment, and as such
Project flow
A project is a collection mechanism for an HDL design under
Debugging tools
ModelSim offers numerous tools for debugging and analyzing your
Local libraries
References to global libraries
4.6 VERILOG
Verilog, standardized as IEEE 1364, is a hardware description
netlist) for the circuit. Some Verilog constructs are not synthesizable.
Also, the way the code is written will greatly affect the size and speed of
the synthesized circuit. Most readers will want to synthesize their circuits,
so non-synthesizable constructs should be used only for test benches.
These are program modules used to generate the I/O needed to simulate the
rest of the design. The words "not synthesizable" will be used for
examples and constructs, as needed, that do not synthesize.
There are two types of code in most HDLs:
Structural, which is a verbal wiring diagram without storage.
assign a = (b & c) | d;   /* "|" is OR, "&" is AND */
assign d = e & (~c);
Here the order of the statements does not matter. Changing e will change
a, because a depends on d and d depends on e.
Procedural, which is used for circuits with storage, or as a
convenient way to write conditional logic.
always @(posedge clk)     // Execute the next statement on every rising clock edge.
    count <= count + 1;
Procedural code is written like C code and assumes every
assignment is stored in memory until overwritten. For synthesis, with
flip-flop storage, this type of thinking generates too much storage.
However, people prefer procedural code because it is usually much easier
to write; for example, if and case statements are only allowed in
procedural code. As a result, synthesizers have been constructed
which can recognize certain styles of procedural code as actually
combinational.
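As an illustration of procedural code that a synthesizer treats as combinational, consider the sketch below. The module and signal names are hypothetical. Because the block is sensitive to all of its inputs and assigns the output on every path, no storage is implied.

```verilog
// Illustrative procedural block that synthesizes to combinational logic:
// it selects the larger of two unsigned values. Names are hypothetical.
module max2 (
    input  wire [7:0] a,
    input  wire [7:0] b,
    output reg  [7:0] max
);
    // "always @*" re-evaluates whenever a or b changes. Both branches
    // assign max, so no latch or flip-flop is inferred.
    always @* begin
        if (a > b)
            max = a;
        else
            max = b;
    end
endmodule
```

If the else branch were omitted, max would have to hold its previous value when a is not greater than b, and the same style of code would infer a latch instead.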
CHAPTER 5
RESULT ANALYSIS
30K
27K
88
44(L)
44(H)
88
84
48
44
Power Consumption
38.7m
3.4mW
N/A
N/A
N/A
46.3mW
26m
(mW)
88
H.264
Supporting
Standards
VC-1
Table 5.1
Measured Results
CSDA
Proposed
36.8K
CSDA
39.1K
MPEG 1/2/4
Lee et al.
36.6K
Gate counts(NAND2)
al.
Chang et
55.6K
Measured results
al.
39.8K
Huang et
Lee et al.
Existing
5.1
Video
codec
standards
MPEG
H.264
VC-1
4(H)
4(I)
These are the selection inputs which are given to the individual standards.
The desired standard can be obtained using the MUX selection.
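The MUX selection described above can be sketched as follows. The 2-bit encoding of the select input, the module name and the signal names are assumptions made for illustration; the project's actual selection codes are not reproduced here.

```verilog
// Hypothetical standard-select multiplexer. The encoding of sel and all
// names are assumptions, not the project's actual selection inputs.
module standard_mux #(
    parameter WIDTH = 16
) (
    input  wire [1:0]       sel,
    input  wire [WIDTH-1:0] mpeg_out,    // result for MPEG-1/2/4
    input  wire [WIDTH-1:0] h264_out,    // result for H.264
    input  wire [WIDTH-1:0] vc1_out,     // result for VC-1
    output reg  [WIDTH-1:0] core_out
);
    always @* begin
        case (sel)
            2'b00:   core_out = mpeg_out;
            2'b01:   core_out = h264_out;
            2'b10:   core_out = vc1_out;
            default: core_out = {WIDTH{1'b0}};   // unused code
        endcase
    end
endmodule
```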
5.3
(1111101) for the eight-point transform, we get the MPEG output simulation
as shown in Fig. 7.1.
5.4
(1001000) for the eight-point transform and (0000011) for the four-point
transform, we get the H.264 output simulation as shown in Fig. 7.2.
5.5
(1001000) for the eight-point transform and (0000011) for the four-point
transform, we get the VC-1 output simulation as shown in Fig. 7.3.
5.6
5.7
5.8
5.9
CHAPTER 6
CONCLUSION
The CSDA-MST core achieves high performance, with a high
throughput rate and a low-cost VLSI design, supporting MPEG-1/2/4,
H.264, and VC-1 MSTs. By using the proposed CSDA method, the
number of adders and MUXs in the MST core can be reduced efficiently.
Synthesis and simulation results show that the CSDA-MST core uses
27k logic gates with a power consumption of 26 mW. Measured results
show a throughput rate of 1.28 G-pixels/s, which can support the
(4928 × 2048 @ 24 Hz) digital cinema format. Because visual media
technology has advanced rapidly, this approach will help meet rising
high-resolution specifications and future needs as well.
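As a rough check of the throughput claim, the pixel rate required by the quoted format can be worked out directly. The calculation below assumes one sample per pixel; formats that carry additional chroma samples per pixel raise the requirement proportionally.

```latex
\[
4928 \times 2048 \times 24\ \text{Hz}
  = 242{,}221{,}056\ \text{pixels/s}
  \approx 0.24\ \text{G-pixels/s}
  < 1.28\ \text{G-pixels/s}
\]
```

Even with three samples per pixel, the requirement would be about 0.73 G-samples/s, which is still below the reported throughput rate.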
REFERENCES