
THE VISUAL DISPLAY SYSTEM

The Visual Display System consists of two important components – the Monitor and the Adapter Card & Cable.

THE MONITOR

The principle on which the monitor works is based upon the operation of a sealed glass tube called the Cathode Ray Tube
(CRT).

Monochrome CRT

The CRT is a vacuum-sealed glass tube having two electrical terminals inside: the negative electrode or cathode (K) and a
positive electrode or anode (A). Across these terminals a high potential of the order of 18 kV is maintained. This produces a
beam of electrons, known as cathode rays, from the cathode towards the anode. The front face of the CRT is coated with a
layer of a material called phosphor, arranged in the form of a rectangular grid of a large number of dots. Phosphor has the
property of emitting a glow of light when it is hit by charged particles such as electrons.
The beam of electrons is controlled by three other positive terminals. The control
grid (G1) helps to draw out the electrons in a uniform beam, the accelerating grid
(G2) provides acceleration to the electrons in the forward direction and the
focusing grid (G3) focuses the beam to a single point on the screen ahead, so that
the diameter of the beam is equal to the diameter of a single dot of phosphor. This
dot is called a pixel, which is short for Picture Element. As the beam hits the
phosphor dot, a single glowing pixel is created at the center of the screen. On the
neck of the CRT are two other electrical coils called deflection coils. When
current flows through these coils, the magnetic field produced interacts with the
electron beam thereby deflecting it from its original path. One of the coils called
the horizontal deflection coil moves the beam horizontally across the screen
and the other coil called the vertical deflection coil moves the beam vertically
along the height of the screen. When both these coils are energized the electron
beam can be moved in any direction thus generating a single spot of light at any
point on the CRT screen.

Raster Scanning

Drawing an image on the CRT screen involves the process of raster scanning, by which the electron beam
sequentially moves over all the pixels on the screen. The beam starts from the upper-left corner of the screen, moves over
the first row of pixels until it reaches the right hand margin of the screen. The beam is then switched off and retraces back
horizontally to the beginning of the second row of pixels. This is called horizontal retrace. It is then turned on again and
moves over the second row of pixels. This process continues until it reaches the bottom-right corner of the screen, after
which it retraces back to the starting point. This is called vertical retrace. The entire pattern is called a raster and each
scan line is called a raster line.

Frames and Refresh Rate

The electron beam is said to produce a complete frame of the picture when, starting from the top-left corner, it moves over
all the pixels and returns to the starting point. The human brain has the capability of holding on to the image of an object
before our eyes for a fraction of a second even after the object has been removed. This phenomenon is called persistence
of vision. As the beam moves over each pixel, the glow of the pixel dies down, although its image persists in our eyes for
some time after that. So if the beam can come back to the pixel before its glow has completely disappeared, to us it will
seem that the pixel is glowing continuously. It has been observed that we see a steady image on the screen only if 60
frames are generated per second, i.e. the electron beam should return to its starting point within 1/60th of a second. The
monitor is then said to have a refresh rate of 60 Hz. A monitor with a refresh rate of less than 50 Hz produces a perceptible
flicker on the screen and should be avoided.

Color CRT

The working principle of a color CRT is the same as that of a monochrome CRT, except that here each pixel consists of
three colored dots instead of one and is called a triad. These colors are red, green and blue (RGB) and are called primary
colors. Corresponding to the three dots there are also three electron beams from the electrode (also called the electron
gun), each of which falls on the corresponding dot. It has been experimentally observed that the three primary colored
lights can combine in various proportions to produce all other colors. As each of the three beams hits its corresponding dot
with varying intensity, they produce different proportions of the three primary colored lights, which together create the
sensation of a specific color in our eyes. Our eyes cannot distinguish the individual dots but see their net effect as a whole.
A perforated screen called a shadow mask prevents the beams from falling in the gaps between the dots. Secondary colors
are created by mixing equal quantities of primary colors, e.g. red and green create yellow, green and blue create cyan,
blue and red create magenta, while all three colors in equal proportion produce white.

Interlacing

Interlacing is a process by which monitors with lower refresh rates can produce images comparable in quality to those
produced by a monitor with a higher refresh rate. Each frame is split into two parts consisting of the odd and even lines of
the complete image, called the odd field and the even field. The first field is displayed for half the frame duration and then
the second field is displayed so that its lines fit between the lines of the first field. This succeeds in lowering the frame rate
without increasing the flicker correspondingly, although the picture quality is still not the same as that of a non-interlaced
monitor. One of the most popular applications of interlacing is TV broadcasting.

Monitor Specifications

(a) Refresh Rate : Number of frames displayed by a monitor in one second. Thus a monitor having a frame rate of 60 Hz
implies that an image on the screen of the monitor is refreshed 60 times per second.

(b) Horizontal Scan Rate : Number of horizontal lines displayed by the monitor in one second. For a monitor having a
refresh rate of 60 Hz and 600 horizontal lines on the screen, the horizontal scan rate is 36 kHz.

(c) Dot Pitch : Shortest distance between two neighbouring pixels or triads on the screen. Usually of the order of 0.4 mm
to 0.25 mm.

(d) Pixel Addressability : The total number of pixels that can be addressed on the screen. Measured by the product of
the horizontal number of pixels and the vertical number of pixels on the screen. Modern monitors usually have 640 X
480 pixels or 800 X 600 pixels on the screen.

(e) Aspect Ratio : Ratio of the width of the screen to its height. For computer monitors and TV screens it is 4:3, whereas
for movie theatres it is 16:9.

(f) Size : The longest diagonal length of the monitor. Standard computer monitors are usually between 15” and 20” in
size.

(g) Resolution : The total number of pixels per unit length of the monitor either in the horizontal or vertical directions.
Measured in dots per inch (dpi). Usually of the order of 75 dpi to 96 dpi for modern monitors.

(h) Color Depth : A measure of the total number of colors that can be displayed on a monitor. Depends on the number of
the varying intensities that the electron beams can be made to have. A monitor with a color depth of 8-bits can display
a total of 2^8 or 256 colors.

Problem-1

A 15” monitor with aspect ratio of 4:3 has a pixel addressability of 800 X 600. Calculate its resolution.

Let the width of the monitor be 4x and its height be 3x.

By the Pythagorean theorem,
(4x)^2 + (3x)^2 = 15^2
i.e. 16x^2 + 9x^2 = 225
i.e. 25x^2 = 225, so x = 3
The width of the monitor is therefore 12” and the height is 9”
So the resolution is (800/12) = (600/9) = 66.67 dpi
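
The same calculation can be scripted. The short Python sketch below is not part of the original text; the function name and
inputs are only illustrative.

```python
import math

def resolution_dpi(diagonal_in, aspect_w, aspect_h, pixels_w, pixels_h):
    """Resolution in dpi from diagonal size, aspect ratio and pixel addressability."""
    # For a 4:3 screen the diagonal is sqrt((4x)^2 + (3x)^2) = 5x, so x = diagonal / 5.
    x = diagonal_in / math.hypot(aspect_w, aspect_h)
    width_in, height_in = aspect_w * x, aspect_h * x
    return pixels_w / width_in, pixels_h / height_in

print(resolution_dpi(15, 4, 3, 800, 600))   # (66.67, 66.67) dpi, as above
```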

Problem-2

A monitor can display 4 shades of red, 8 shades of blue and 16 shades of green. Find out its color depth.

Each pixel can take up a total of (4 X 8 X 16) or 512 colors.
Since 2^9 = 512, the monitor has a color depth of 9 bits.
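
As a quick check, a couple of lines of Python (illustrative only, not from the original text) reproduce the result:

```python
import math

shades = 4 * 8 * 16                     # combinations of red, blue and green shades
color_depth = math.ceil(math.log2(shades))
print(shades, color_depth)              # 512 colors -> 9-bit color depth
```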

THE VIDEO ADAPTER CARD AND CABLE

The Video Adapter is an expansion card which usually sits on a slot on the motherboard. It acts as an interface between the
processor of the computer and the monitor. The digital data required for creating an image on the screen is generated by
the central processor of the computer and consists of RGB values for each pixel on the screen. These are called pixel
attributes. For an 8-bit image, each pixel is digitally represented by an 8-bit binary number. The adapter interprets these
attributes and translates them into one of 256 voltage levels (since 2^8 = 256) to drive the electron gun of the monitor.
These intensity signals along with two synchronization signals for positioning the electron beam at the location of the pixel,
are fed to the monitor from the adapter through the video cable.

The VGA

The Video Graphics Array (VGA) adapter was a standard introduced by IBM which was capable of displaying text and
graphics in 16 colors at 640 x 480 mode or 256 colors at 320 x 240 mode. A VGA card had no real processing power
meaning that the CPU had to do most of the image manipulation tasks. The VGA adapter was connected to a VGA
compatible monitor using a video cable with a 15-pin connector. The pins on the connector carried various signals from the
card to the monitor including the color intensity signals and the synchronization signals. The sync signals were generated by
the adapter to control the movement of the electron guns of the CRT monitor. Sync signals consisted of the horizontal sync
pulses which controlled the left to right movement of the electron beam as well as the horizontal retrace, and the vertical
sync pulses which controlled the up and down movement of the beam as well as the vertical retrace. Nowadays VGA has
become obsolete being replaced by the SVGA adapters.

The SVGA

The industry extended the VGA standard to include improved capabilities like 800 x 600 mode with 16-bit color and later on
1024 x 768 mode with 24-bit color. All of these standards were collectively called Super VGA or SVGA. The Video Electronics
Standards Association (VESA) defined a standard interface for the SVGA adapters and called it the VESA BIOS Extensions. Along
with these new improved standards came accelerated video cards which included a special graphics processor on the
adapter itself and relieved the main CPU from most of the tasks of image manipulation.

Components of an Adapter
The main components of the video adapter card include :

Display Memory

A bank of memory within the adapter card used for storing pixel attributes. Initially used for storing the image data from
the CPU and later used by the adapter to generate RGB signals for the monitor. The amount of memory should be sufficient
to hold the attributes of all the pixels on the screen and depends on the pixel addressability as well as the color depth. Thus
for an 8-bit image displayed in 640 x 480 mode, the minimum amount of display memory required is about 0.3 MB, usually rounded up to 1 MB.
Graphics Controller

A chip within the adapter card responsible for coordinating the activities of all the other components of the card. For the earlier
generation video cards, the controller simply passed on the data from the processor to the monitor after conversion. For
modern accelerated video cards, the controller also has the capability of manipulating the image data independently of the
central processor.

Digital-to-Analog Converter

The DAC actually converts the digital data stored in the display memory to analog voltage levels to drive the electron beams
of the CRT.

Problem-3

A monitor has pixel addressability of 800 X 600 and a color depth of 24-bits. Calculate the minimum amount of
display memory required in its adapter card to display an image on the screen.

A total of 24 bits is allocated to each pixel.

So for a total of 800 X 600 pixels, the total number of bits required is (800 X 600 X 24).
To store this many bits, the amount of display memory required is (800 X 600 X 24)/(8 X 1024 X 1024) MB, which when
rounded up to the next highest integer becomes 2 MB.
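
A small Python sketch (illustrative, not from the original text) generalizes this calculation; it also reproduces the figure
quoted earlier for an 8-bit image at 640 x 480.

```python
import math

def display_memory_mb(pixels_w, pixels_h, color_depth_bits):
    """Minimum display memory, in whole MB, needed to hold one full frame."""
    bits = pixels_w * pixels_h * color_depth_bits
    return math.ceil(bits / (8 * 1024 * 1024))   # bits -> bytes -> MB, rounded up

print(display_memory_mb(800, 600, 24))   # 2 MB (1.37 MB before rounding)
print(display_memory_mb(640, 480, 8))    # 1 MB (about 0.3 MB before rounding)
```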

Accelerated Graphics Port (AGP)

To combat the eventual saturation of the PCI bus with video information a new interface has been pioneered by Intel
(http://developer.intel.com/technology/agp), designed specifically for the video subsystem. AGP was developed in response
to the trend towards greater and greater performance requirements for video. As software evolves and computer use moves
continuously into previously unexplored areas such as 3D acceleration and full-motion video playback, both the processor
and the video adapter need to process more and more information. Another issue has been the increasing demand for
video memory. Much larger amounts of memory are required on video cards, not just for the screen image but also for 3D
calculations. This in turn makes the video card more expensive.
AGP gets around these problems in two ways. It provides a separate AGP slot on the motherboard connected to an
AGP bus providing 533 MB/sec of bandwidth. It also utilizes a portion of the main memory, known as the texture cache, for storing pixel
attributes thereby going beyond the limits of display memory on the adapter card. AGP is ideal for transferring the huge
amount of data required for displaying 3D graphics and animation. AGP is considered a port and not a bus as it involves
only two devices, the processor and the video card and is not expandable. AGP has helped remove bandwidth overheads
from the PCI bus.
The slot itself is physically similar to the PCI slot but is offset further from the edge of the motherboard.

The Liquid Crystal Display

Principle of Operation

Liquid crystals were first discovered in the late 19th century by the Austrian botanist Friedrich Reinitzer, and the term liquid
crystal was coined by the German physicist Otto Lehmann.
Liquid crystals are transparent organic substances consisting of long rod-like molecules which in their natural state arrange
themselves with their axes roughly parallel to each other. By flowing the liquid crystal over a finely grooved
surface it is possible to control the alignment of the molecules, as they follow the alignment of the grooves.

The first principle of an LCD consists of sandwiching a layer of liquid crystal between two finely grooved surfaces whose
grooves are perpendicular to each other. Thus the molecules at the two surfaces are aligned perpendicular to each other
and those at the intermediate layers are twisted by intermediate angles. Light, following the alignment of the molecules, is also twisted by
90 degrees as it passes through the liquid crystal.

The second principle of an LCD depends on polarizing filters. Natural light waves are oriented at random angles. A
polarizing filter acts as a net of fine parallel lines, blocking all light except the waves parallel to those lines. A second
polarizer perpendicular to the first would therefore block all of the already polarized light. An LCD consists of two polarizing
filters perpendicular to each other with a layer of twisted liquid crystal between them. Light, after passing through the first
polarizer, is twisted through 90 degrees by the liquid crystal and passes out completely through the second polarizer. This
gives us a lighted pixel. On applying an electric charge across the liquid crystal, its molecular alignment is disturbed. In this
case light is not twisted by 90 degrees and is therefore blocked by the second polarizer. This gives us a dark pixel. Images
are drawn on the screen using arrangements of these lighted and dark pixels.
VIDEO

INTRODUCTION

The recording and editing of sound has long been in the domain of the PC. Doing so with motion video
has only recently gained acceptance, because of the enormous file sizes required by video. For
example, one second of 24-bit, 640 X 480 video and its associated audio required about 30 MB of space,
so a 20 minute clip filled 36 GB of disk space. Moreover, it required processing at 30 MB/s.

The only solution was to compress the data, but compression hardware was very expensive in the early
days of video editing. As a result video was played in very small windows of 160 X 120 pixels,
which occupied only 1/16th of the total screen area. It was only after the advent of the Pentium-II processor,
coupled with cost reduction of video compression hardware, that full screen digital video finally became
a reality.
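
The sizes quoted above follow directly from the frame dimensions, the color depth and the frame rate. The Python sketch
below is illustrative only; it computes the image data alone, while the 30 MB/s figure in the text also includes audio.

```python
def raw_video_rate_mb(width, height, bits_per_pixel, fps):
    """Uncompressed video data rate in (decimal) MB per second, video only."""
    return width * height * (bits_per_pixel / 8) * fps / 1_000_000

rate = raw_video_rate_mb(640, 480, 24, 30)
print(f"{rate:.1f} MB/s of image data")                 # about 27.6 MB/s before audio
print(f"{rate * 20 * 60 / 1000:.1f} GB for a 20 minute clip")
```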

Moving Pictures

In motion video the illusion of moving images is created by displaying a sequence of still images rapidly
one after another. If they are displayed fast enough, our eye cannot distinguish the individual frames; rather,
because of persistence of vision, it merges the individual frames into each other, thereby creating an effect
of movement.
Each individual image in this case is called a frame and the speed with which the images are displayed
one after another is called frame rate. The frame rate should range between 20 and 30 for perceiving
smooth realistic motion. Audio is added and synchronized with the apparent movement of images.
A motion picture is recorded on film, whereas in motion video the output is an electrical signal. Film
playback is at 24 fps, while video playback ranges from 25 to 30 fps. Visual and audio data, when digitized and
combined into a file, give rise to digital video.

Video represents a sequence of real world images taken by a movie camera. So it depicts an event that
physically took place in reality. Animation also works on the same principle of displaying a sequence of
images at a specific speed to create the illusion of motion, but here the images are drawn by artists, by
hand or with software, and so do not depict any real sequence of events taking place in the physical
world.

ANALOG VIDEO

In analog video systems video is stored and processed in the form of analog electrical signals. The most
popular example is television broadcasting. In contrast, in digital video the video is represented by a
string of bits. All forms of video handled by a PC are digital video.

Video Camera

Analog video cameras are used to record a succession of still images and then convert the brightness and
color information of the images into electrical signals. These signals are transmitted from one place to
another using cables or by wireless means and in the television set at the receiving end these signals are
again converted to form the images. The tube type analog video camera is generally used in professional
studios and uses electron beams to scan in a raster pattern, while the CCD video camera, using a light-
sensitive electronic device called the CCD, is used for home/office purposes where portability is
important.

Tube type Camera

The visual image in front of the video camera is presented to the camera tube by an optical lens. This lens focuses the
scene on the photosensitive surface of the tube in the same way that the lens of a still camera focuses the image on the
film surface.
The photo-sensitive surface, called Target, is a form of semi-conductor. It is almost an insulator in the
absence of light. With absorption of energy caused by light striking the target, electrons acquire
sufficient energy to take part in current flow. The electrons migrate towards a positive potential applied
to the lens side of the target. This positive potential is applied to a thin layer of conductive but
transparent material. The vacant energy states left by the liberated electrons, called holes, migrate
towards the inner surface of the target. Thus a charge pattern appears on the inner surface of the target
that is most positive where the brightness or luminosity of the scene is the greatest.
The charge pattern is sampled point-by-point by a moving beam of electrons which originates in an
electron gun in the tube. The beam scans the charge pattern in the same way a raster is produced in a
monitor, but approaches the target at a very low velocity. The beam deposits just enough carriers to
neutralize the charge pattern formed by the holes. Excess electrons are turned back towards the source.
The exact number of electrons needed to neutralize the charge pattern constitutes a flow of current in a
series circuit. It is this current, flowing across a load resistance, that forms the output signal voltage of the
tube.

CCD Camera

Light passing through the lens of the camera is focused on a chip called a CCD. The surface of the CCD is covered with an
array of transistors that create electrical current in proportion to the intensity of the light striking them. The transistors
make up the pixels of the image. The transistors generate a continuous analog electrical signal that goes to an ADC which
translates the signal to a digital stream of data. The ADC sends the digital information to a digital signal processor (DSP)
that has been programmed specifically to manipulate photographic images. The DSP adjusts the contrast and brightness of
the image, and compresses the data before sending it to the camera’s storage medium. The image is temporarily stored on
a hard drive, RAM, floppy or tape built into the camera’s body before being transferred to the PC’s permanent storage.
Television Systems
Color Signals

Video cameras produce three output signals, which require three parallel cables for transmission. Because of the
complexities involved in transmitting 3 signals in exact synchronism, TV systems do not usually handle RGB signals.
Instead, the signals are encoded in composite format as per the Luma-Chroma principle, based on human color perception,
and are distributed using a single cable or channel.

Human Color Perception


All objects that we observe are focused sharply by the lens system of the eye onto the retina. The retina, which is located
at the back of the eye, has light-sensitive cells which register the visual sensations. The retina is connected to the optic
nerve, which conducts the light stimuli sensed by these cells to the optical centre of the brain. According to the theory
formulated by Helmholtz, the light-sensitive cells are of two types: rods and cones. The rods provide the brightness
sensation and thus perceive objects in various shades of grey from black to white. The cones, which are sensitive to color,
are broadly divided into three different groups. One set of cones detects the presence of blue, the second set perceives red
and the third is sensitive to green. The combined relative luminosity curve, which shows the relative sensation of
brightness produced by the individual spectral colors, indicates that the sensitivity of the human eye is greatest in the
green-yellow range, decreasing towards both the red and blue ends of the spectrum. Any color other than red, green and
blue excites different sets of cones to generate a cumulative sensation of that color. White is perceived by the additive
mixing of the sensations from all three sets of cones.
Based on the spectral response curve and extensive tests with a large number of observers, the relative
intensities of the primary colors for color transmission, e.g. for color television, have been standardized.
The reference white for color television transmission has been chosen to be a mixture of 30% red, 59%
green and 11% blue. These percentages are based on the light sensitivities of the eye to different colors.
Thus one lumen (lm) of white light = 0.3 lm of red + 0.59 lm of green + 0.11 lm of blue = 0.89 lm of
yellow + 0.11 lm of blue = 0.7 lm of cyan + 0.3 lm of red = 0.41 lm of magenta + 0.59 lm of green.

Luma-Chroma Principle

The principle states that any video signal can be broken into two components :
The luma component describes the variation of brightness in different portions of the image without regard to any
color information. It is denoted by Y and can be expressed as a linear combination of RGB:

Y = 0.3R + 0.59G + 0.11B


The chroma component describes the variation of color
information in different parts of the image without regard to any brightness information. It is denoted by C and can further
be subdivided into two components, U and V.

Thus the RGB output signals from a video camera are transformed into YC format using electronic circuitry before being
transmitted. At the receiving end, a B/W TV discards the C component and uses only the Y component to display a B/W
image. For a color TV, the YC components are converted back into RGB signals, which are used to drive the electron guns
of the CRT.

Color Television Camera


The figure below shows a block diagram of a color TV camera. It essentially consists of three camera
tubes in which each tube receives selectively filtered primary colors. Each camera tube develops a signal
voltage proportional to the respective color intensity received by it. Light from the scene is processed
by the objective lens system. The image formed by the lens is split into three images by glass prisms.
These prisms are designed as dichroic mirrors. A dichroic mirror passes one wavelength and rejects
other wavelengths. Thus red, green and blue images are formed. These pass through color filters which
provide highly precise primary color images which are converted into video signals by the camera tubes.
This generates the three color signals R, G and B.

To generate the monochrome or brightness signal that represents the luminance of the scene the three
camera outputs are added through a resistance matrix in the proportion of 0.3, 0.59 and 0.11 for R, G
and B respectively

Y = 0.3R + 0.59G + 0.11B

The Y signal is transmitted as in a monochrome television system. However instead of transmitting all
the three color signals separately the red and blue camera outputs are combined with the Y signal to
obtain what is known as the color difference signals. Color difference voltages are derived by
subtracting the luminance voltage from the color voltages. Only (R-Y) and (B-Y) are produced. It is
only necessary to transmit two of the three color difference signals since the third may be derived from
the other two.
The color difference signals equal zero when white or grey shades are being transmitted. This is
illustrated by the calculation below.

For any grey shade (including white) let R = G = B = v volts.


Then Y = 0.3v + 0.59v + 0.11v = v
Thus, (R-Y) = v – v = 0 volt, and (B-Y) = v – v = 0 volt.
When televising color scenes, even when the voltages R, G and B are not equal, the Y signal still represents the
monochrome equivalent of the color. This aspect can be illustrated by the example below. For simplicity
of calculation let us assume that the camera output corresponding to the maximum (100%) intensity of
white light be an arbitrary value of 1 volt.

Consider a color of unsaturated magenta and it is required to find out the voltage components of the
luminance and color difference signals.

Since the hue is magenta it implies a mixture of red and blue. The word unsaturated indicates that some
white light is also there. The white content will develop all the three i.e. R, G and B voltages, the
magnitudes of which will depend on the extent of unsaturation. Thus R and B voltages must dominate
and both must be of greater amplitude than G. Let R=0.7 volt, G=0.2 volt, B=0.6 volt represent the
unsaturated magenta color. The white content must be represented by equal quantities of the three
primaries and the actual amount must be indicated by the smallest voltage i.e. G=0.2 volt. Thus the
remaining i.e. R=(0.7-0.2)=0.5 volt and B=(0.6-0.2)=0.4 volt is responsible for the magenta hue.
The luminance signal Y = 0.3R+0.59G+0.11B = 0.3(0.7)+0.59(0.2)+0.11(0.6) = 0.394 volt.
The color difference signals are : (R-Y) = 0.7-0.394 = 0.306 volt
(B-Y) = 0.6-0.394 = 0.206 volt
The other component (G-Y) can be derived as shown below:
Y = 0.3R+0.59G+0.11B
Thus, (0.3+0.59+0.11)Y = 0.3R+0.59G+0.11B
Rearranging the terms, 0.59(G-Y) = -0.3(R-Y) - 0.11(B-Y)
i.e. (G-Y) = -0.51(R-Y) - 0.186(B-Y)

Since the value of the luminance is Y=0.394 volt and peak white corresponds to 1 volt, the magenta will
show up as a fairly dull grey in a monochrome television set.
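
The worked example above is easy to verify in a few lines of Python (a sketch using the weighting factors given in the
text; the function name is illustrative):

```python
def luma_and_color_difference(r, g, b):
    """Return Y, (R-Y) and (B-Y) for camera voltages r, g, b (peak white = 1 volt)."""
    y = 0.3 * r + 0.59 * g + 0.11 * b
    return y, r - y, b - y

y, r_y, b_y = luma_and_color_difference(0.7, 0.2, 0.6)   # unsaturated magenta
g_y = -0.51 * r_y - 0.186 * b_y                          # derived at the receiver
print(round(y, 3), round(r_y, 3), round(b_y, 3), round(g_y, 3))
# 0.394 0.306 0.206 -0.194  (and indeed G - Y = 0.2 - 0.394 = -0.194)
```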

Chroma Sub-sampling
Conversion of RGB signals into YC format also has another important advantage of utilizing less
bandwidth through the use of chroma subsampling. It has been observed through experimentation that the
human eye is more sensitive to brightness information than to color information. This limitation can be
exploited to transmit reduced color information as compared to brightness information, a process called
chroma subsampling, and so save on bandwidth requirements.
On account of this, we get code words like "4:2:2" and "4:1:1" to describe how the subsampling is done.
Roughly, the numbers refer to the ratios of the luma sampling frequency to the sampling frequencies of
the two chroma channels (typically Cb and Cr, in digital video); "roughly" because this formula doesn't
make any sense for things like "4:2:0".

4:4:4 --> No chroma subsampling; each pixel has its own Y, Cr and Cb values.
4:2:2 --> Chroma is sampled at half the horizontal frequency of luma, but at the same vertical frequency. The chroma
samples are horizontally aligned with luma samples.
4:1:1 --> Chroma is sampled at one-fourth the horizontal frequency of luma, but at full vertical frequency. The chroma
samples are horizontally aligned with luma samples.
4:2:0 --> Chroma is sampled at half the horizontal frequency of luma, and also at half the vertical frequency. Theoretically,
each chroma sample is positioned between the rows and columns of luma samples.
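
The sketch below (illustrative Python/NumPy, not from the original text) shows what 4:2:0 subsampling does to the three
planes of a frame: the luma plane is kept at full resolution, while each 2x2 block of chroma samples is averaged into one.

```python
import numpy as np

def subsample_420(y, cb, cr):
    """4:2:0 - keep full-resolution luma, halve the chroma resolution both
    horizontally and vertically by averaging each 2x2 block of chroma samples."""
    h, w = cb.shape
    cb_sub = cb.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    cr_sub = cr.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return y, cb_sub, cr_sub

y, cb, cr = (np.random.rand(480, 640) for _ in range(3))
_, cb_s, cr_s = subsample_420(y, cb, cr)
print(cb_s.shape)   # (240, 320): one chroma sample for every 2x2 block of pixels
```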

Bandwidth and Frequencies

Each TV channel is allocated 6 MHz of bandwidth. Of this, the 0 to 4 MHz part of the signal is devoted to the Y component,
the next 1.5 MHz is taken up by the C component, and the last 0.5 MHz is taken up by the audio signal.
Video Signal Formats

Component Video

Our color television system starts out with three channels of information: Red, Green and Blue (RGB). In the process of
translating these channels into a single composite video signal they are often first converted to Y, R-Y and B-Y. Both
three-channel systems, RGB and (Y, R-Y, B-Y), are component video signals. They are the components that eventually
make up the composite video signal. Much higher program production quality is possible if the elements are assembled in
the component domain.

Composite Video

A video signal format where both the luminance and chroma components are transmitted along a single
wire or channel. Usually used in normal video equipment like VCRs as well as TV transmissions.
NTSC, PAL, and SECAM are all examples of composite video systems.

S-Video

Short for Super-Video. A video signal format where the luminance and color components are transmitted separately using
multiple wires or channels. Picture quality is better than that of composite video, but the equipment is more expensive.
Usually used in high-end VCRs and capture cards.
Television Broadcasting Standards
NTSC
National Television Systems Committee. Broadcast standard used in USA and Japan. Uses 525 horizontal lines at 30 (29.97)
frames / sec. Uses composite video format where luma is denoted by Y and chroma components by I and Q. While Y utilizes
4 MHz of a television channel's bandwidth, I uses 1.5 MHz and Q only 0.5 MHz. I and Q can be expressed in terms of the
color difference signals as shown below:
I = 0.74(R-Y) – 0.27(B-Y)
Q = 0.48(R-Y) + 0.41(B-Y)

PAL

Phase Alternating Lines. Broadcast standard used in Europe, Australia, South Africa, India. Uses 625 horizontal lines at 25
frames / sec. Uses composite video format where luma is denoted by Y and the chroma components by U and V. While Y
utilizes 4 MHz of a television channel's bandwidth, U and V use 1.3 MHz each. U and V can be expressed in terms of the
color difference signals as shown below:
U = 0.493(B-Y)
V = 0.877(R-Y)

SECAM

Sequential Color with Memory. Used in France and Russia. The fundamental difference between the SECAM system on one
hand and the NTSC and PAL systems on the other is that the latter transmit and receive two color signals
simultaneously, while in the SECAM system only one of the two color difference signals is transmitted at a time. It also uses 625
horizontal lines at 25 frames/sec. Here the color difference signals are denoted by DR and DB and each occupies 1.5 MHz.
They are given by the relations:
DR = -1.9(R-Y)
DB = 1.5(B-Y)
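
Using the PAL weighting factors quoted above, the conversion from RGB to Y, U, V can be sketched as follows (illustrative
Python; inputs are assumed to be normalised camera voltages in the range 0 to 1):

```python
def pal_yuv(r, g, b):
    """Convert normalised RGB (0..1) to PAL Y, U, V."""
    y = 0.3 * r + 0.59 * g + 0.11 * b
    u = 0.493 * (b - y)
    v = 0.877 * (r - y)
    return y, u, v

print(pal_yuv(1.0, 1.0, 1.0))   # white or grey: U = V = 0, only luma remains
print(pal_yuv(1.0, 0.0, 0.0))   # saturated red: Y = 0.3, U and V carry the color
```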

Other Television Systems

Enhanced Definition Television Systems (EDTV)

These are conventional systems modified to offer improved vertical and horizontal resolution. One of the systems emerging
in the US and Europe is known as Improved Definition Television (IDTV). IDTV is an attempt to improve the NTSC image
by using digital memory to double the scanning lines from 525 to 1050. The pictures are only slightly more detailed than
NTSC images because the signal does not contain any new information. By separating the chrominance and luminance parts
of the video signal, IDTV prevents cross-interference between the two.

High Definition Television (HDTV)

The next generation of television is known as the High Definition TV (HDTV). The HDTV image has approximately twice as
many horizontal and vertical pixels as conventional systems. The increased luminance detail in the image is achieved by
employing a video bandwidth approximately five times that used in conventional systems. Additional bandwidth is used to
transmit the color values separately. The aspect ratio of the HDTV screen is 16:9. Digital coding is essential in the
design and implementation of HDTV. There are two possible types of digital coding: composite coding and component
coding. Composite coding of the whole video signal is in principle easier than digitization of the separate signal
components (luma and chroma) but there are also serious problems with this approach, like disturbing cross-talk between
the luma and chroma information, and requirement of more bandwidth due to the fact that chroma-subsampling would not
be possible. Hence component coding seems preferable. The luminance signal is sampled at 13.5 MHz as it is more crucial.
The chrominance signals (R-Y, B-Y) are sampled at 6.75 MHz (4:2:2). The digitized luminance and chrominance signals are
then quantized with 8 bits each. For the US, a total of 720,000 pixels is assumed per frame. If the quantization is 24
bits/pixel and the frame rate is approximately 60 frames/second, then the data rate for HDTV will be 1036.8
Mbits/second. Using a compression method, a data rate reduction to 24 Mbits/second is possible without noticeable
quality loss. In the case of European HDTV, the data rate is approximately 1152 Mbits/second.

DIGITAL VIDEO

Video Capture

Source and Capture Devices

Video capture involves two main components: the source device and the capture device. During capture the visual component
and the audio component are captured separately and automatically synchronized. Source devices must use
PAL or NTSC playback and must have composite video or S-video output ports.

The source and source device can be the following :


• Camcorder with pre-recorded video tape
• VCP with pre-recorded video cassette
• Video camera with live footage
• Video CD with Video CD player
Video Capture Card

A full motion video capture card is a circuit board in the computer that consists of the following components :
• Video INPUT port to accept the video input signals from NTSC/PAL/SECAM broadcast signals, video camera or VCR. The
input port may conform to the composite-video or S-video standards.
• Video compression-decompression hardware for video data.
• Audio compression-decompression hardware for audio data.
• A/D converter to convert the analog input video signals to digital form.
• Video OUTPUT port to feed output video signals to camera and VCR.
• D/A converter to convert the digital video data to analog signals for feeding to output analog devices.
• Audio INPUT/OUTPUT ports for audio input and output functions.
Rendering support for the various television signal formats e.g. NTSC, PAL, SECAM imposes a level of complexity in the
design of video capture boards.

Video Capture Software

The following capabilities might be provided by video capture software, often bundled with a capture card:

AVI Capture : This allows capture and digitization of the input analog video signals from external devices and conversion to
an AVI file on the disk of the computer. No compression is applied to the video data, hence this is suitable only for short clips that produce small files.
Playback of the video is done through the Windows Media Player. Before capturing, parameters like frame rate, brightness,
contrast, hue and saturation, as well as the audio sampling rate and audio bit size, may be specified.

AVI to MPEG Converter : This utility allows the user to convert a captured AVI file to MPEG format. Here the MPEG
compression algorithm is applied to an AVI file and a separate MPG file is created on the disk. Before compression
parameters like quality, amount of compression, frame dimensions, frame rate etc. may be specified by the user. Playback
of the MPEG file is done through the Windows Media Player.

MPEG Capture : Certain cards allow the user to capture video directly in the MPEG format. Here analog video data is
captured, digitized and compressed at the same time before being written to the disk. This is suitable for capturing large
volumes of video data. Parameters like brightness, contrast, saturation etc. may be specified by the user before starting
the capture.

DAT to MPEG Converter : This utility converts the DAT format of a Video-CD into MPEG. Conversion to MPEG is usually
done for editing purposes. DAT and MPG are similar formats so that the file size changes by very small amounts after
conversion. The user has to specify the source DAT file and the location of the target MPG file.

MPEG Editor : Some capture software provide the facility of editing an MPEG file. The MPG movie file is opened in a
timeline structure and functions are provided for splitting the file into small parts by specifying the start and end of each
portion. Multiple portions may also be joined together. Sometimes functions for adding effects like transitions or sub-titling
may also be present. The audio track may also be separately edited or manipulated.
Video Compression
Types of Compression

Video compression is a process whereby the size of the digital video on the disk is reduced using certain mathematical
algorithms. Compression is required only for storing the data. For playback of the video, the compressed data need again to
be decompressed. Software used for the compression/decompression process is called a CODEC. During the process of
compression the algorithm analyses the source video and tries to find redundant and irrelevant portions. The greater the
amount of such portions in the source data, the better the scope for compressing it.

Video compression process may be categorised using different criteria.


Lossless compression occurs when the original video data is not changed permanently in any way during the compression
process. This means that the original video data can be recovered after decompression. Though this preserves the video
quality, the amount of compression achieved is usually limited. This process is usually used where quality is of more
importance than the storage space issues e.g. medical image processing.
Lossy compression occurs where a part of the original data is discarded during the compression process in order to reduce
the file size. This data is lost forever and cannot be recovered after the decompression process. Thus here quality is
degraded due to compression. The amount of compression, and hence the degradation in quality, is usually selectable by the
user: the more the compression, the greater the degradation in quality, and vice versa. This process is usually used where
storage space is more important than quality e.g. corporate presentations.

Since video is essentially a sequence of still images, compression can be differentiated on what kind of redundancy is
exploited. Intraframe compression occurs where redundancies within each frame or still image (spatial redundancy) are exploited
to produce compression. This process is the same as an image compression process. A video CODEC can also implement
another type of compression when it exploits the redundancies between adjacent frames in a video sequence (temporal
redundancy). This is called interframe compression.

Compression can also be categorized based on the time taken to compress and decompress. Symmetrical compression
algorithms take almost the same time for both the compression and decompression processes. This is usually used in live video
transmissions. An asymmetrical compression algorithm usually takes a greater amount of time for the compression process
than for the decompression process. This is usually used for applications like CD-ROM presentations.

Since video is essentially a sequence of still images, the initial stage of video compression is the same as that for image
compression. This is the intraframe compression process and can be either lossless or lossy. The second stage, after each
frame is individually compressed, is the interframe compression process, where redundancies between adjacent frames are
exploited to achieve compression.

Lossy Coding Techniques

Lossy coding techniques are also known as Source Coding. The popular methods are discussed below :

Discrete Cosine Transform (DCT)

Which portion of the data is treated as relevant and which as irrelevant depends on the algorithm.
One method of separating relevant from irrelevant information is Transform Coding. This transforms the data into a different
mathematical model better suited for the purpose of separation. One of the best known transform codings is the Discrete Cosine
Transform (DCT). For all transform codings an inverse function must exist to enable reconstruction of the relevant
information by the decoder.
An image is subdivided into blocks of 8 X 8 pixels. Each of these blocks is represented as a combination of DCT basis functions.
The 64 resulting coefficients represent the contributions of different horizontal and vertical spatial frequencies to the block's pixel intensities.
The human eye is very sensitive at low frequencies, but its sensitivity decreases at high frequencies. Thus a reduction in the
number of high-frequency DCT components only weakly affects image quality. After the DCT transform, a process
called Quantization is used to extract the relevant information by driving most of the high-frequency components to zero.
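
The sketch below (Python with NumPy and SciPy; a single scalar step size stands in for the 64-entry quantization table
used by real coders) illustrates the DCT-plus-quantization idea on one 8 x 8 block:

```python
import numpy as np
from scipy.fftpack import dct

def dct2(block):
    """2-D DCT of an 8x8 block: apply the 1-D DCT along rows, then along columns."""
    return dct(dct(block, axis=0, norm='ortho'), axis=1, norm='ortho')

def quantize(coeffs, step):
    """Coarse quantization: dividing by the step and rounding drives most of the
    small high-frequency coefficients to zero, which is where compression comes from."""
    return np.round(coeffs / step)

block = np.random.randint(0, 256, (8, 8)).astype(float)   # one 8x8 block of pixel values
coeffs = dct2(block)
quantized = quantize(coeffs, step=32)
print(np.count_nonzero(quantized), "of 64 coefficients remain non-zero")
```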

Video Compression Techniques

After the image compression techniques, the video CODEC uses interframe algorithms to exploit
temporal redundancy, as discussed below :

Motion Compensation

By motion compensated prediction, temporal redundancies between two frames in a video sequence can
be exploited. Temporal redundancies can arise from movement of objects in front of a stationary
background. The basic concept is to look for a certain area (block) in a previous or subsequent frame
that matches very closely an area of the same size in the current frame. If successful, the differences
in the block intensity values are calculated. In addition, the motion vector, which represents the
translation of the corresponding blocks in both x- and y-directions is determined. Together the
difference signal and the motion vector represent the deviation between the reference block and
predicted block.
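
A minimal sketch of this block-matching idea follows (illustrative Python/NumPy; real encoders use much smarter search
strategies than this exhaustive one):

```python
import numpy as np

def best_match(prev, curr, top, left, block=16, search=8):
    """Find the block in the previous frame that best matches the block of the
    current frame at (top, left), using the sum of absolute differences (SAD).
    Returns the motion vector (dy, dx) and the residual (difference) block."""
    target = curr[top:top + block, left:left + block]
    best_vec, best_sad = (0, 0), np.inf
    for dy in range(-search, search + 1):
        for dx in range(-search, search + 1):
            y, x = top + dy, left + dx
            if y < 0 or x < 0 or y + block > prev.shape[0] or x + block > prev.shape[1]:
                continue                      # candidate block falls outside the frame
            sad = np.abs(target - prev[y:y + block, x:x + block]).sum()
            if sad < best_sad:
                best_sad, best_vec = sad, (dy, dx)
    dy, dx = best_vec
    residual = target - prev[top + dy:top + dy + block, left + dx:left + dx + block]
    return best_vec, residual

prev = np.random.rand(64, 64)
curr = np.roll(prev, shift=(2, 3), axis=(0, 1))    # simulate a small camera/object shift
vec, residual = best_match(prev, curr, top=16, left=16)
print(vec, float(np.abs(residual).sum()))          # (-2, -3) with a zero residual
```
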
Some Popular CODECs

JPEG

Stands for Joint Photographic Experts Group, a joint effort by ITU and ISO. Achieves compression by
first applying DCT, then quantization, and finally entropy coding the corresponding DCT coefficients.
Corresponding to the 64 DCT coefficients, a 64 element quantization table is used. Each DCT
coefficient is then divided by the corresponding quantization table entry, and the values are rounded off. For
entropy coding, the Huffman method is used.

MPEG-1

Stands for the Moving Picture Experts Group. MPEG-1 belongs to a family of ISO standards. Provides motion compensation
and utilizes both intraframe and interframe compression. Uses 3 different types of frames: I-frames, P-frames and B-frames.
I-frames (intracoded) : These are coded without any reference to other images. MPEG makes use of JPEG for I frames.
They can be used as a reference for other frames.
P-frames (predictive) : These require information from the previous I and/or P frame for encoding and decoding. By
exploiting temporal redundancies, the achievable compression ratio is higher than that of the I frames. P frames can be
accessed only after the referenced I or P frame has been decoded.
B-frames (bidirectional predictive) : Requires information from the previous and following I and/or P frame for encoding
and decoding. The highest compression ratio is attainable by using these frames. B frames are never used as reference for
other frames.

Reference frames must be transmitted first. Thus transmission order and display order may differ. The first I frame must be
transmitted first followed by the next P frame and then by the B frames. Thereafter the second I frame must be
transmitted. An important data structure is the Group of Pictures (GOP) . A GOP contains a fixed number of consecutive
frames and guarantees that the first picture is an I-frame. A GOP gives an MPEG encoder information as to which picture
should be encoded as an I, P or B frame and which frames should serve as references.

The first frame in a GOP is always an I-frame which is encoded like an intraframe image i.e. with DCT, quantization and
entropy coding. The motion estimation step is activated when B or P frames appear in the GOP. Entropy coding is done by
using Huffman coding technique.
Cinepak

Cinepak was originally developed to play small movies on '386 systems from a single-speed CD-ROM drive. Its greatest
strength is its extremely low CPU requirements. Cinepak's quality/data rate was amazing when it was first released, but does
not compare well with the newer CODECs available today. There are higher-quality (and lower-data-rate) solutions for almost
any application. However, if you need your movies to play back on the widest range of machines, you may not be able to
use many of the newer codecs, and Cinepak is still a solid choice.
After sitting idle for many years, Cinepak is finally being dusted off for an upgrade. Cinepak Pro from CTI
(www.cinepak.com) is now in pre-release, offering an incremental improvement in quality, as well as a number of bug fixes.
Supported by QuickTime and Video for Windows.

Sorenson
One of the major advances of QuickTime 3 is the new Sorenson Video CODEC which is included as a standard component of
the installation. It produces the highest quality low-data rate QuickTime movies.
The Sorenson Video CODEC produces excellent Web video suitable for playback on any Pentium or PowerMac. It also
delivers outstanding quality CD-ROM video at a fraction of traditional data rates, which plays well on 100MHz systems.
Compared with Cinepak, Sorenson Video generally achieves higher image quality at a fraction of the data rate. This allows
for higher quality, and either faster viewing (on the WWW), or more movies on a CD-ROM (often four times as much
material on a disc as Cinepak). It supports variable bitrate encoding [When movies are compressed, each frame of the video
must be encoded to a certain number of bytes. There are several techniques for allocating the bytes for each frame. Fixed
bitrate is used by certain codecs (like Cinepak), which attempt to allocate approximately the same number of bytes per
frame. Variable bitrate (VBR) is supported by other codecs (such as MPEG-2 and Sorenson), and attempts to give each
frame the optimum number of bytes, while still meeting set constraints (such as the overall data rate of the movie, and the
maximum peak data rate). ]. Supported by Quicktime. Manufacturer is Sorenson Vision Inc (www.sorensonvideo.com )

RealVideo

RealMedia currently has only two video CODECs: RealVideo (Standard) and RealVideo (Fractal).
RealVideo (Standard) is usually best for data rates below 3 Kbps. It works better with relatively static material than it does
with higher action content. It usually encodes faster. RealVideo (Standard) is significantly more CPU intensive than the
RealVideo (Fractal) CODEC. It usually requires a very fast PowerMac or Pentium for optimal playback. It is supported by the
RealMedia player. Manufacturer is Progressive Networks (www.real.com).

H.261
H.261 is a standard video-conferencing CODEC. As such, it is optimized for low data rates and relatively low motion. Not
generally as good quality as H.263. H.261 is CPU intensive, so data rates higher than 50 Kbps may slow down most
machines. It may not play well on lower-end machines. H.261 has a strong temporal compression component, and works
best on movies in which there is little change between frames. Supported by Netshow, Video for Windows.

H.263
H.263 is an advancement of the H.261 standard, which was mainly used as a starting point for the development of MPEG
(which is optimized for higher data rates). Supported by QuickTime, Netshow and
Video for Windows.

Indeo Video Interactive (IVI)

Indeo Video Interactive (IVI) is a very high-quality, wavelet-based CODEC. It provides excellent image quality, but requires
a high-end Pentium for playback. There are currently two main versions of IVI. Version 4 is included in QuickTime 3 for
Windows; Version 5 is for DirectShow only. Neither version currently runs on the Macintosh, so any files encoded with IVI
will not work cross-platform. Version 5 is very similar to 4, but uses an improved wavelet algorithm for better compression.
Architectures supported are QuickTime for Windows, Video for Windows and DirectShow. Manufacturer is Intel (www.intel.com).

VDOLive

VDOLive is an architecture for web video delivery, created by VDOnet Corporation (www.vdo.net). VDOLive is a server-
based, "true streaming" architecture that actually adjusts to viewers' connections as they watch movies. Thus, true
streaming movies play in real-time with no delays for downloading. For example, if you clicked on a 30 second movie, it
would start playing and 30 seconds later, it would be over, regardless of your connection, with no substantial delays.
VDOLive's true streaming approach differs from QuickTime's "progressive download" approach. Progressive download allows
you to watch (or hear) as much of the movie as has downloaded at any time, but movies may periodically pause if the
movie has a higher data rate than the user's connection, or if there are problems with the connection or server, such as
very high traffic. In contrast to progressive download, the VDOLive server talks to the VDOPlayer (the client) with each
frame to determine how much bandwidth a connection can support. The server then only sends that much information, so
movies always play in real time. In order to support this real-time adjustment of the data-stream, you must use special
server software to place VDOLive files on your site.
The real-time adjustment to the viewer's connection works like this: VDOLive files are encoded in a "pyramidal" fashion. The
top level of the pyramid contains the smallest amount of the most critical image data. If your user has a slow connection,
they are only sent this top portion. The file's next level has more data, and will be sent if the viewer's connection can handle
it, and so forth. Users with very fast connections (T1 or better) are sent the whole file. Thus, users are only sent what they
can receive in real-time, but the data has been pre-sorted so that the information they get is the best image for their
bandwidth.

MPEG-2

MPEG-2 is a standard for broadcast-quality digitally encoded video. It offers outstanding image quality and resolution.
MPEG-2 is the primary video standard for DVD-Video. Playback of MPEG-2 video currently requires special hardware, which
is built into all DVD-Video players, and most (but not all) DVD-ROM kits.
MPEG-2 was based on MPEG-1 but optimized for higher data rates. This allows for excellent quality at DVD data rates (300-1000
KBytes/sec), but tends to produce results inferior to MPEG-1 at lower rates. MPEG-2 is definitely not appropriate for use over
network connections (except in very special, ultra-high-performance cases).

MPEG-4

MPEG-4 is a standard currently under development for the delivery of interactive multimedia across networks. As such, it is
more than a single CODEC, and will include specifications for audio, video, and interactivity. The video component of MPEG-
4 is very similar to H.263. It is optimized for delivery of video at Internet data rates. One implementation of MPEG-4 video
is included in Microsoft's NetShow. The rest of the MPEG-4 standard is still being designed. It was recently announced that
QuickTime's file format will be used as a starting point.

Playback Architectures

QuickTime

QuickTime is Apple's multi-platform, industry-standard, multimedia software architecture. It is used by software developers,
hardware manufacturers, and content creators to author and publish synchronized graphics, sound, video, text, music, VR,
and 3D media. The latest free downloads, and more information, are available at Apple's QuickTime site.
(http://www.apple.com/quicktime). QuickTime offers support for a wide range of delivery media, from WWW to DVD-ROM.
It was recently announced that the MPEG-4 standard (now in design) will be based upon the QuickTime file format.
QuickTime is also widely used in digital video editing for output back to videotape. QuickTime is the dominant architecture
for CD-ROM video. It enjoys an impressive market share due to its cross-platform support, wide range of features, and free
licensing. QuickTime is used on the vast majority of CD-ROM titles for these reasons. QuickTime is a good choice for kiosks,
as it integrates well with Macromedia Director, MPEG, and a range of other technologies.

RealMedia

The RealMedia architecture was developed by Progressive Networks, makers of RealAudio. It was designed specifically to
support live and on-demand video and audio across the WWW. The first version of RealMedia is focused on video and audio,
and is referred to as RealVideo. Later releases of RealMedia will incorporate other formats including MIDI, text, images,
vector graphics, animations, and presentations. RealMedia content can be placed on your site either with or without special
server software. There are performance advantages with the server, but you don't have to buy one to get started. However,
high volume sites will definitely want a server to get substantially improved file delivery performance. Users can view
RealMedia sites with the RealPlayer, a free "client" application available from Progressive. A Netscape plug-in is also
available. The main downside to RealMedia is that it currently requires a PowerMac or Pentium computer to view. As such,
RealMedia movies aren't available to the full range of potential users. The latest free downloads, as well as more
information, are available at www.real.com.

NetShow

Microsoft's NetShow architecture is aimed at providing the best multimedia delivery over networks, from 14.4 kbps modems
to high-speed LANs. There is an impressive range of audio and video CODECs built into NetShow 3.0. Combined with a
powerful media server, this is a powerful solution for networked media. Technically, the term "NetShow" refers to the client
installation and the server software. Netshow clients are built on top of the DirectShow architecture. Because of this,
NetShow has access to its own CODECs, and also those for DirectShow, Video for Windows, and QuickTime. Netshow media
on WWW pages may be viewed via ActiveX components (for Internet Explorer), plug-ins (for Netscape Navigator), or stand-
alone viewers. NetShow servers support "true streaming" (in their case, called "intelligent streaming"): the ability to
guarantee continuous delivery of media even if the networks' performance degenerates. If this happens, NetShow will
automatically use less video data (thus reducing the quality). If the amount of available bandwidth decreases more,
NetShow will degrade video quality further, until only the audio is left. Microsoft says that their implementation provides the
most graceful handling of this situation. The latest free downloads, as well as more information, are available at Microsoft's
NetShow site (www.microsoft.com/netshow).
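
The adaptive behaviour described above can be pictured with a short sketch. The following Python fragment is only a conceptual illustration of bandwidth-adaptive stream selection, not Microsoft's actual implementation; the bit-rate figures and variant names are invented for the example.

# Conceptual sketch (not Microsoft's actual algorithm): pick the richest
# stream variant that fits the currently measured bandwidth, falling back
# to audio-only when even the lowest video bit rate cannot be sustained.

# Hypothetical stream variants, listed from highest to lowest quality.
# Each entry is (label, required_kbps).
VARIANTS = [
    ("video 300 kbps + audio", 300),
    ("video 100 kbps + audio", 100),
    ("video 50 kbps + audio", 50),
    ("audio only", 16),
]

def choose_variant(measured_kbps):
    """Return the best variant whose bit rate fits the measured bandwidth."""
    for label, required in VARIANTS:
        if measured_kbps >= required:
            return label
    return "audio only"   # last resort: keep at least the audio playing

if __name__ == "__main__":
    for bw in (350, 120, 40, 10):
        print(f"{bw:>4} kbps available -> {choose_variant(bw)}")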

DirectShow

DirectShow (formerly known as ActiveMovie) is the successor to Microsoft's Video for Windows architecture. It is built on top
of the DirectX architecture (including DirectDraw, DirectSound, and Direct3D), for optimum access to audio and video
hardware on Windows-based computers. Supported playback media includes WWW, CD-ROM, DVD-ROM, and DVD-Video
(with hardware). DV Camera support will be added in an upcoming release. DirectShow has its own player (the Microsoft
Media Player, implemented as an ActiveX control) which may be used independently or within Internet Explorer. There is
also a plug-in for use with Netscape Navigator, and playback may also be provided by other applications using the OCX
component. As DirectShow is the playback architecture for NetShow, these playback options support either delivery
approach. Supported media types are audio, video, closed captioning (SAMI), MIDI, MPEG, and animation (2D or 3D). The
latest free downloads, as well as more information, are available at Microsoft's DirectX site
(www.microsoft.com/directx/pavilion/dshow/default.asp).

Video for Windows

Video for Windows is similar to QuickTime. Its main advantage is that it is built into Windows 95. However, it is limited in
many ways. It runs on Windows only, doesn't handle audio/video synchronization as well as QuickTime, and doesn't support
variable-length frames. Video for Windows is no longer supported by Microsoft, and is being replaced by
DirectShow/ActiveMovie (one of the DirectX technologies). Video for Windows is often referred to as "AVI" after the .AVI
extension used by its file format.

[Some of the details discussed are available at: http://www.etsimo.uniovi.es/hypgraph/video/codecs/Default.htm]

Some Concepts of Video Editing

Time Base and Frame Rates

In the natural world we experience time as a continuous flow of events. Working with video, however,
requires precise synchronization, so time must be measured with precise numbers. Familiar increments
like hours, minutes, and seconds are not fine-grained enough, since each second may contain several
events. When editing video, several source clips may need to be imported to create the output clip. The
source frame rate of each clip determines how many frames are displayed per second within that clip.
Source frame rates differ for different types of clips:
Motion picture film – 24 fps
PAL and SECAM video – 25 fps
NTSC video – 29.97 fps
Web applications – 15 fps
CD-ROM applications – 30 fps

In a video editing project file there is a single, common timeline on which all the imported clips are
placed. A parameter called the timebase determines how time is measured and displayed within the
editing software. For example, a timebase of 30 means each second is divided into 30 units. The exact
time at which an edit occurs depends on the timebase specified for the particular project. Since there
must be a common timebase for the editor's timeline, source clips whose frame rates do not match the
specified timebase need adjustment. For example, if the frame rate of a source clip is 30 fps and the
timebase of the project is also 30, then all frames are displayed as expected (figure below, half second
shown).

However, if the source clip was recorded at 24 fps and is placed on a timeline with a timebase of 30,
then some of the original frames need to be repeated to preserve the proper playback speed. In the figure
below, frames 1, 5 and 9 are repeated over a half-second duration.

If the final edited video clip needs to be exported at 15 fps, then every alternate frame of the timeline
needs to be discarded.

On the other hand, if the timebase were set to 24 and the final video needs to be exported at 15 fps, then
selected frames would need to be discarded. In the figure below, frames 3, 6, 8 and 11 are discarded over
a half-second duration.
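
The frame repetition and discarding described above can be sketched in a few lines of code. The following Python fragment is only an illustration of the idea (it uses a simple floor mapping from timeline slots to source frames); actual editing software may place the repeated or discarded frames differently.

# Illustrative sketch (not any particular editor's algorithm): conforming a
# source clip to a project timebase by repeating frames, and discarding
# frames on export, using a floor mapping from timeline slots to source frames.

def conform(timeline_frames, source_fps, timebase):
    """For each timeline slot, pick the source frame shown there (floor mapping)."""
    return [(i * source_fps) // timebase for i in range(timeline_frames)]

# Half a second of a 24 fps clip on a 30-unit timebase: 15 slots, so some
# source frames must be shown twice to preserve the playback speed.
print(conform(15, source_fps=24, timebase=30))
# -> [0, 0, 1, 2, 3, 4, 4, 5, 6, 7, 8, 8, 9, 10, 11]
#    (frames 1, 5 and 9 repeat when counted from 1, as in the figure above)

# Exporting a 30-unit timeline at 15 fps: every alternate timeline frame is discarded.
timeline = list(range(15))          # half a second at timebase 30
print(timeline[::2])                # -> [0, 2, 4, 6, 8, 10, 12, 14]
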
SMPTE Timecode

Timecode defines how frames in a movie are counted and affects the way you view and edit a clip. For
example, you count frames differently when editing video for television than when editing for motion-
picture film. A standard way of representing timecode has been developed by a global body, the Society
of Motion Picture and Television Engineers (SMPTE), which represents timecode as a set of numbers. The
numbers stand for hours, minutes, seconds, and frames, and are added to video to enable precise editing,
e.g. 00:03:51:03.
When NTSC color systems were developed, the frame rate was changed by a tiny amount to eliminate
the possibility of crosstalk between the audio and color information; the actual frame rate used is
approximately 29.97 frames per second (precisely 30/1.001). This poses a problem, since this small difference causes SMPTE
time and real time (what your clock reads) to be different over long periods. Because of this, two
methods are used to generate SMPTE time code in the video world: Drop and Non-Drop.

In SMPTE Non-Drop, the time code is always incremented by one frame in exact synchronization with
the frames of your video. However, since the video actually plays at only 29.97 frames per second
(rather than 30 frames per second), SMPTE time will increment at a slower rate than real world time.
This will lead to a SMPTE time versus real time discrepancy. Thus, after a while, we could look at the
clock on the wall and notice it is farther ahead than the SMPTE time displayed in our application.

[Figure: one second of clock time compared with one second of SMPTE Non-Drop time, with the video playing at
29.97 frames per second. The difference of 0.03 frames per second translates to (0.03 × 60 × 60) or 108
frames per hour.]
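
The figure above can be checked with a quick calculation. The sketch below uses the exact NTSC rate of 30000/1001 frames per second; the result of roughly 107.9 frames is what is conventionally rounded to 108 frames per hour.

# Quick check of the drift figure quoted above, using the exact NTSC rate.
NTSC_FPS = 30000 / 1001              # the exact NTSC color frame rate, ~29.97 fps
frames_played = NTSC_FPS * 3600      # frames shown during one hour of clock time
labels_needed = 30 * 3600            # labels non-drop timecode needs to read 01:00:00:00
print(round(frames_played))                      # ~107892 frames actually played
print(round(labels_needed - frames_played, 2))   # ~107.89 frames behind, i.e. about 108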

SMPTE Drop time code (which also runs at 29.97 frames per second) attempts to compensate for the
discrepancy between real world time and SMPTE time by "dropping" frame numbers from the sequence of
SMPTE frames in order to catch up with real world time. What this means is that occasionally the
SMPTE time will jump forward by more than one frame. The count is adjusted forward by two frames at
every minute boundary, which would increase the numbering by 120 frames every hour. However, to
achieve a total compensation of 108 frames, the adjustment is skipped at minute values 00, 10, 20, 30, 40
and 50. Thus when SMPTE Drop time increments from 00:01:59:29, the next value will be 00:02:00:02
in SMPTE Drop rather than 00:02:00:00 in SMPTE Non-Drop. It must be remembered that in SMPTE
Drop certain codes no longer exist: for instance, there is no such time as 00:02:00:00; the time code is
actually 00:02:00:02. No frames are lost, because drop-frame timecode does not actually drop frames,
only frame numbers. To distinguish it from the non-drop type, the numbers are separated by semicolons
instead of colons, i.e. 00;02;00;02.

[Figure: one hour of clock time compared with one hour of SMPTE time, showing a difference of 108 frames.]
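
The numbering scheme can be expressed as a short routine. The following Python sketch uses a commonly published drop-frame labelling algorithm; it is illustrative and not taken from any particular editing product.

# A minimal sketch of drop-frame labelling: two frame NUMBERS are skipped at
# every minute boundary except minutes 00, 10, 20, 30, 40 and 50, which
# compensates by 108 frame numbers per hour as described above.

def drop_frame_timecode(frame_count):
    """Label a frame count (video running at 29.97 fps) as SMPTE drop-frame timecode."""
    frames_per_10min = 17982        # ten minutes of labels: 1 full minute + 9 short minutes
    frames_per_min = 1798           # a "short" minute: 1800 labels minus the 2 skipped
    m = frame_count % frames_per_10min
    f = frame_count + 18 * (frame_count // frames_per_10min)
    if m > 1:
        f += 2 * ((m - 2) // frames_per_min)
    hh = f // 108000                # 108000 = 30 x 3600 labels per nominal hour
    mm = (f // 1800) % 60
    ss = (f // 30) % 60
    ff = f % 30
    return f"{hh:02d};{mm:02d};{ss:02d};{ff:02d}"   # semicolons mark drop-frame

print(drop_frame_timecode(3597))     # 00;01;59;29
print(drop_frame_timecode(3598))     # 00;02;00;02  (00;02;00;00 does not exist)
print(drop_frame_timecode(107892))   # 01;00;00;00  (one hour: back in step with the clock)
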

Online Editing and Offline Editing

There are three phases of video production:


• Pre-production : Involves writing scripts, visualizing scenes, storyboarding etc.
• Production : Involves shooting the actual scenes
• Post-production : Involves editing the scenes and correcting / enhancing wherever necessary.
Editing begins with a draft or rough cut called the Offline Edit, which gives a general idea of the editing
possibilities. The offline edit is usually done on a low-end system using a low-resolution copy of the
original video. This keeps the process economical, because a low-resolution copy is sufficient for
deciding on the edit points. An edit decision list (EDL) is created which contains a list of the edit changes
to be carried out. The EDL can be refined through successive iterations until the edit points and changes
are finalized. Since this iterative process may take a long time (typically several days), tying up a high-end
system for it is neither desirable nor economical. Once the EDL is finalized, the final editing work is done
on the actual high-resolution copy of the video using a powerful system. This operation is called the
Online Edit. It requires much less time than the offline edit because the operations are performed only
once, based on the finalized EDL, so the higher cost of the high-end system needs to be borne only for a
short duration (typically a few hours).

Edit Decision List (EDL)


An EDL is used in offline editing for recording the edit points. It contains the names of the original
clips, the In and Out points, and other editing information. In Premiere, editing decisions in the Timeline
are recorded in text format and can then be exported in one of the EDL formats. A standard EDL contains the
following columns:
(a) Header – Contains the title and the type of timecode (drop-frame or non-drop-frame)
(b) Source Reel ID – Identifies the name or number of the videotape containing the source clips
(c) Edit Mode – Indicates whether edits take place on the video track, the audio track, or both
(d) Transition type – Describes the type of transition, e.g. wipe, cut, etc.
(e) Source In and Source Out – Lists the timecodes of the first and last frames of the clips
On a high-end system the EDL is accepted by an edit controller, which applies the editing changes to the
high-quality clips.
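
A much simplified sketch of how such a list might be written out is shown below. The layout and the event data are invented purely for illustration; real EDL formats (such as CMX 3600) prescribe stricter field layouts.

# Simplified, illustrative EDL writer showing the columns listed above.
# Edit-mode labels here are illustrative: V = video only, B = both video and audio.
events = [
    # (source reel, edit mode, transition, source in, source out)
    ("REEL01", "V", "C", "00:00:10:00", "00:00:14:00"),   # cut, video only
    ("REEL02", "B", "D", "00:01:02:10", "00:01:06:10"),   # dissolve, video + audio
]

def write_edl(title, drop_frame, events):
    """Build a simplified EDL as text: header lines followed by one line per event."""
    lines = [f"TITLE: {title}",
             f"FCM: {'DROP FRAME' if drop_frame else 'NON-DROP FRAME'}"]
    for num, (reel, mode, trans, src_in, src_out) in enumerate(events, start=1):
        lines.append(f"{num:03d}  {reel}  {mode}  {trans}  {src_in} {src_out}")
    return "\n".join(lines)

print(write_edl("DEMO SEQUENCE", drop_frame=False, events=events))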

FireWire (IEEE-1394)

Although digital video in an external device or camera is already in binary computer code, you still need
to capture it to a file on a hard disk. Capturing digital video is a simple file transfer to the computer,
provided the computer has an available FireWire (IEEE-1394) card and a suitable digital video CODEC. The
IEEE-1394 interface standard is also known as "FireWire" (Apple Computer, Inc.) and as "i.LINK"
(Sony Corp.). Developed by the Institute of Electrical and Electronics Engineers, it is a serial data bus
that allows high-speed data transfers. Three data rates are supported: 100, 200 and 400 Mbps. The bus
speed is governed by the slowest active node.
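
As a rough illustration of what these rates mean in practice, the sketch below estimates the transfer time for a hypothetical 1 GB capture file at each of the three raw bus rates; real-world throughput is lower because of protocol overhead and the speed of the slowest node on the bus.

# Back-of-the-envelope transfer times for a hypothetical 1 GB file at the
# three raw bus rates quoted above (ignoring protocol overhead).
FILE_SIZE_BITS = 1 * 1024**3 * 8           # 1 GB expressed in bits

for rate_mbps in (100, 200, 400):
    seconds = FILE_SIZE_BITS / (rate_mbps * 1_000_000)
    print(f"{rate_mbps} Mbps: about {seconds:.0f} s")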

The cable consists of two separately shielded pairs of wires for signaling, two power conductors and an
outer shield. Up to 63 devices can be connected in a daisy chain. The standard also supports hot plugging,
which means that devices can be connected or disconnected without switching off power to the cable.
IEEE 1394 is a non-proprietary standard and many organizations and companies have endorsed it. The
Digital VCR Conference selected IEEE 1394 as its standard digital interface; an EIA committee selected
IEEE 1394 as the point-to-point interface for digital TV. The Video Electronics Standards Association
(VESA) adopted IEEE 1394 for home networking. Microsoft first supported IEEE 1394 in the Windows 98
operating system and it is supported in newer operating systems.
