

Abstract---Three-dimensional TV is expected to be the next revolution in the history of television. A 3D TV prototype system has been implemented
with real-time acquisition, transmission, and 3D display of dynamic scenes. Its developers built a distributed, scalable architecture to manage the high
computation and bandwidth demands. The 3D display shows high-resolution stereoscopic color images for multiple viewpoints without special glasses.
This is the first real-time end-to-end 3D TV system with enough views and resolution to provide a truly immersive 3D experience. Japan plans to make
this futuristic television a commercial reality by 2020 as part of a broad national project that will bring together researchers from the government,
technology companies and academia. The targeted "virtual reality" television would allow people to view high-definition images in 3D from any angle,
in addition to being able to touch and smell the objects being projected upwards from a screen to the floor.

Keywords--- parallax, display, perception, holographic images



Why 3D TV

The evolution of visual media such as cinema and television is one of the major hallmarks of our modern civilization. In many ways, these
visual media now define our modern lifestyle. Many of us are curious: what is our lifestyle going to be in a few years? What kind of films and
television are we going to see? Although cinema and television both evolved over decades, there were stages which, in fact, were once seen as
revolutions:
1) at first, films were silent, then sound was added;
2) cinema and television were initially black-and-white, then color was introduced;
3) computer imaging and digital special effects have been the latest major novelty.


Humans gain three-dimensional information from a variety of cues. Two of the most important are binocular parallax and motion parallax.

A. Binocular Parallax

Binocular parallax means that, for any point you fixate, the images on the two eyes must be slightly different. Yet the two different images still
allow us to perceive a stable visual world. Binocular parallax refers to the ability of the eyes to see a solid object and a continuous surface
behind that object even though the eyes see two different views.

B. Motion Parallax

Motion parallax is the information at the retina caused by the relative movement of objects as the observer (or the observer's head) moves sideways.
Motion parallax varies with the distance of the observer from the objects. The observer's movement also causes occlusion (covering of one
object by another), and as the movement changes so too does the occlusion. This can give a powerful cue to the distance of objects from the observer.
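
The dependence of motion parallax on distance can be made concrete with a little geometry. The following is a minimal sketch, assuming an idealized observer who steps sideways while looking at a point that was initially straight ahead; the function name and the sample distances are illustrative assumptions, not values from the system described in this paper.

```python
import math


def parallax_shift_deg(sideways_move_m, distance_m):
    """Angular shift (in degrees) of a point that was straight ahead,
    after the observer moves sideways by `sideways_move_m` metres."""
    return math.degrees(math.atan2(sideways_move_m, distance_m))


# Same 10 cm head movement, objects at three illustrative distances.
shifts = {d: parallax_shift_deg(0.10, d) for d in (0.5, 2.0, 10.0)}
for distance, shift in shifts.items():
    print(f"object at {distance:4.1f} m shifts by {shift:.2f} degrees")
```

Nearby objects sweep through a larger angle than distant ones for the same head movement, which is the cue the visual system exploits.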

C. Depth perception

Depth perception is the visual ability to perceive the world in three dimensions. It is a trait common to many higher animals. Depth perception
allows the beholder to accurately gauge the distance to an object. The small distance between our eyes gives us stereoscopic depth perception [7].
The brain combines the two slightly different images into one 3D image. It works most effectively for distances up to 18 feet. For objects at a
greater distance, our brain uses relative size and motion. As shown in the figure, each eye captures its own view and the two separate images are
sent on to the brain for processing. When the two images arrive simultaneously in the back of the brain, they are united into one picture. The mind
combines the two images by matching up the similarities and adding in the small differences. The small differences between the two images add
up to a big difference in the final picture. The combined image is more than the sum of its parts: it is a three-dimensional stereo picture.

Fig.3.1 Depth Perception

D. Stereographic Images

A stereographic image consists of two pictures taken with a spatial or time separation that are then arranged to be viewed simultaneously [5].
When so viewed, they provide the sense of a three-dimensional scene using the innate capability of the human visual system to detect three
dimensions. A stereoscopic image is composed of a right perspective frame and a left perspective frame, one for each eye. When your right eye
views the right frame and your left eye views the left frame, your brain perceives a true 3D view.

Figure 2 shows the stereographic images.

E. Stereoscope

A stereoscope is an optical device for creating stereoscopic (or three-dimensional) effects from flat (two-dimensional) images; D. Brewster first
constructed the stereoscope in 1844. It is provided with lenses, under which two equal images are placed, so that one is viewed with the right eye
and the other with the left [5]. Observed at the same time, the two images merge into a single virtual image, which, as a consequence of our
binocular vision, appears to be three-dimensional.

F. Holographic Images

A holographic image is a luminous, three-dimensional, transparent, colored and nonmaterial image that appears out of a 2D medium called a
hologram. A holographic image cannot be viewed without the proper lighting.


Figure 5 shows the schematic representation of 3D TV system.

The whole system consists mainly of three blocks:

1. Acquisition

2. Transmission

3. Display Unit

A. Acquisition

The acquisition stage consists of an array of hardware-synchronized cameras. Small clusters of cameras are connected to the producer PCs. The
producers capture live, uncompressed video streams and encode them using standard MPEG coding. The compressed video is then broadcast on
separate channels over a transmission network, which could be digital cable, satellite TV or the Internet.
The prototype uses 16 Basler A101fc color cameras with 1300x1030, 8-bits-per-pixel CCD sensors.

1) CCD Image Sensors: Charge coupled devices are electronic devices that are capable of transforming a light pattern (image) into an
electric charge pattern (an electronic image).
Figure 6 shows CCD sensors.

Fig.5.2 CCD Image Sensor

2) MPEG-2 Encoding: MPEG-2 is an extension of the MPEG-1 international standard for digital compression of audio and video signals.
MPEG-2 is directed at broadcast formats at higher data rates; it provides extra algorithmic 'tools' for efficiently coding interlaced video, supports
a wide range of bit rates and provides for multichannel surround sound coding. MPEG-2 aims to be a generic video coding system supporting a
diverse range of applications. The developers have built a PCI card with a custom programmable logic device (CPLD) that generates the
synchronization signal for all the cameras. So, what is a PCI card?

3) PCI Card:

Fig.5.3 PCI Card

One element of every computer is the bus. Essentially, a bus is a channel or path between the components in a computer. The bus of interest here
is the Peripheral Component Interconnect (PCI), into which the synchronization card is plugged.
All 16 cameras are individually connected to the card, which is plugged into one of the producer PCs. Although it is possible to use software
synchronization, the developers consider precise hardware synchronization essential for dynamic scenes. Note that the price of the acquisition
cameras can be high, since they will be mostly used in TV studios. The 16 cameras are arranged in a regularly spaced linear array.

Fig.5.4 Arrays of 16 Cameras

B. Transmission
Transmitting 16 uncompressed video streams with 1300x1030 resolution and 24 bits per pixel at 30 frames per second requires 14.4 Gb/sec
bandwidth, which is well beyond current broadcast capabilities. For compression and transmission of dynamic multiview video data there are two
basic design choices. Either the data from multiple cameras is compressed using spatial or spatio-temporal encoding, or each video stream is
compressed individually using temporal encoding. The first option offers higher compression, since there is a lot of coherence between the
views. However, it requires that a centralized processor compress multiple video streams. This compression-hub architecture is not scalable,
since the addition of more views will eventually overwhelm the internal bandwidth of the encoder. So, the developers decided to use temporal
encoding of individual video streams on distributed processors. This strategy has other advantages. Existing broadband protocols and compression
standards do not need to be changed for immediate real-world 3D TV experiments. This system can plug into today's digital TV broadcast
infrastructure and co-exist in perfect harmony with 2D TV. Because they did not have access to digital broadcast equipment, they implemented the
modified architecture shown in figure 9.
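
The raw bandwidth figure quoted above can be checked with a few lines of arithmetic. This is a minimal sketch using only the numbers given in the text (16 streams, 1300x1030 pixels, 24 bits per pixel, 30 frames per second); the function and variable names are illustrative. Note that the quoted 14.4 figure appears to correspond to dividing by 2^30 rather than by 10^9, which is an inference from the arithmetic and not something the text states.

```python
def raw_bandwidth_bits_per_sec(streams, width, height, bits_per_pixel, fps):
    """Uncompressed bandwidth of several synchronized video streams."""
    return streams * width * height * bits_per_pixel * fps


total_bits = raw_bandwidth_bits_per_sec(16, 1300, 1030, 24, 30)

gbit_decimal = total_bits / 1e9      # 1 Gb = 10^9 bits
gbit_binary = total_bits / 2**30     # 1 Gb = 2^30 bits

print(f"total:   {total_bits} bits/sec")
print(f"decimal: {gbit_decimal:.1f} Gb/sec")
print(f"binary:  {gbit_binary:.1f} Gb/sec")
```

Under either convention the uncompressed data rate is more than an order of magnitude above what a single gigabit link can carry, which is why per-stream compression is needed.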
Eight producer PCs are connected by gigabit Ethernet to eight consumer PCs. Video streams at full camera resolution (1300x1030) are encoded
with MPEG-2 and immediately decoded on the producer PCs. This essentially corresponds to a broadband network with infinite bandwidth and
almost zero delay. The gigabit Ethernet provides all-to-all connectivity between decoders and consumers, which is important for the distributed
rendering and display implementation. So, what is gigabit Ethernet?

Fig.5.5 Modified System

1) Gigabit Ethernet: It is a transmission technology that enables Super Net to deliver enhanced network performance. Gigabit Ethernet is a
high-speed form of Ethernet (the most widely installed LAN technology) that can provide data transfer rates of about 1 gigabit per second (Gbps).
Gigabit Ethernet provides the capacity for server interconnection, campus backbone architecture and the next generation of super user workstations,
with a seamless upgrade path from existing Ethernet implementations.

2) Decoder & Consumer Processing: The receiver side is responsible for generating the appropriate images to be displayed. The system needs to
be able to provide all possible views to the end users at every instant. The decoder receives a compressed video stream, decodes it, and stores the
current uncompressed source frame in a buffer, as shown in figure 10. Each consumer has a virtual video buffer (VVB) with data from all current
source frames (i.e., all acquired views at a particular time instant).

Fig.5.6 Block Diagram of Decoder and Consumer processing

The consumer then generates a complete output image by processing image pixels from multiple frames in the VVB. Due to bandwidth and
processing limitations, it would be impossible for each consumer to receive the complete set of source frames from all the decoders. This would
also limit the scalability of the system. Here there is a one-to-one mapping between cameras and projectors.
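
The buffering scheme described above can be sketched as a small data structure. This is only an illustration under stated assumptions: the class name `VirtualVideoBuffer`, its methods, and the use of placeholder strings instead of real image data are all hypothetical and are not taken from the actual system. It shows the simplest case mentioned in the text, the one-to-one mapping in which projector i displays the frame from camera i.

```python
class VirtualVideoBuffer:
    """Hypothetical VVB: holds the most recent decoded frame of every view."""

    def __init__(self, num_views):
        self.frames = [None] * num_views

    def store(self, view_index, frame):
        # Called by a decoder once it has decompressed the current frame.
        self.frames[view_index] = frame

    def is_complete(self):
        # True once every view has delivered a frame for this time instant.
        return all(frame is not None for frame in self.frames)


def render_one_to_one(vvb, projector_index):
    """One-to-one mapping: projector i simply shows the frame of camera i."""
    return vvb.frames[projector_index]


# 16 views, as in the prototype; strings stand in for decoded images.
vvb = VirtualVideoBuffer(16)
for camera in range(16):
    vvb.store(camera, f"frame-from-camera-{camera}")

output_for_projector_3 = render_one_to_one(vvb, 3)
print(vvb.is_complete(), output_for_projector_3)
```

A more general consumer would combine pixels from several buffered frames, which is where the bandwidth limitation mentioned in the text becomes relevant.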


A. Holographic Displays

It is widely acknowledged that Dennis Gabor invented the hologram in 1948 while he was working on an electron microscope. He coined the word
and received the Nobel Prize in 1971 for inventing holography. The holographic image is truly three-dimensional: it can be viewed from different
angles without glasses.
Figure shows the holographic image.

Fig.6.1 Holographic Image

All current holo-video devices use single-color laser light. To reduce the amount of display data, they provide only horizontal parallax. The
display hardware is very large in relation to the size of the image. The acquisition of holograms, moreover, cannot be done in real time.

B. Holographic Movies

We have developed the world's first holographic equipment with the capability of projecting genuine 3-dimensional holographic films, as well as
holographic slides and real objects, for multiple viewers simultaneously. Our holographic technology was primarily designed for cinema.

C. Volumetric Displays

Volumetric displays use a medium to fill or scan a three-dimensional space and individually address and illuminate small voxels. However,
volumetric systems produce transparent images that do not provide a fully convincing three-dimensional experience. Furthermore, they cannot
correctly reproduce the light field of a natural scene because of their limited color reproduction and lack of occlusions. The design of
large-size volumetric displays also poses some difficult obstacles.

D. Parallax Displays

Parallax displays emit spatially varying directional light. Much of the early 3D display research focused on improvements to Wheatstone's
stereoscope. In 1903, F. Ives used a plate with vertical slits as a barrier over an image with alternating strips of left-eye/right-eye images. The
resulting device is called a parallax stereogram. To extend the limited viewing angle and restricted viewing position of stereograms, Kanolt and
H. Ives used narrower slits and a smaller pitch between the alternating image strips. These multiview images are called parallax panoramagrams.
Stereograms and panoramagrams provide only horizontal parallax. Lippmann proposed using an array of spherical lenses instead of slits. This is
frequently called a "fly's eye" lens sheet, and the resulting image is called an integral photograph. An integral photograph is a true planar light
field with directionally varying radiance per pixel. Integral photographs sacrifice significant spatial resolution in both dimensions to gain full
parallax. Researchers in the 1930s introduced the lenticular sheet, a linear array of narrow cylindrical lenses called lenticules. Lenticular images
found widespread use for advertising, CD covers, and postcards. To improve the native resolution of the display, H. Ives invented the
multi-projector lenticular display in 1931. He painted the back of a lenticular sheet with diffuse paint and used it as a projection surface for
39 slide projectors. Finally, the high output resolution, the large number of views, and the large physical dimensions of the display lead to a
very immersive 3D display.
Other research in parallax displays includes time-multiplexed and tracking-based systems. In time multiplexing, multiple views are projected at
different time instants using a sliding window or LCD shutter. This inherently reduces the frame rate of the display and may lead to noticeable
flickering. Head-tracking designs are mostly used to display stereo images, although they could also be used to introduce some vertical parallax in
multiview lenticular displays.
Today's commercial autostereoscopic displays use variations of parallax barriers or lenticular sheets placed on top of LCD or plasma screens.
Parallax barriers generally reduce some of the brightness and sharpness of the image. Here, the projector-based 3D display currently has a native
resolution of 12 million pixels.
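
The parallax stereogram described above, with its alternating strips of left-eye and right-eye images, can be illustrated by interleaving the columns of two small images. The following is a minimal sketch, assuming images are represented as lists of rows and that each strip is one pixel wide; the function name and the tiny test images are illustrative only.

```python
def interleave_columns(left, right):
    """Build a stereogram whose even columns come from `left`
    and whose odd columns come from `right`."""
    same_shape = len(left) == len(right) and all(
        len(l_row) == len(r_row) for l_row, r_row in zip(left, right)
    )
    if not same_shape:
        raise ValueError("left and right images must have the same shape")
    return [
        [l_row[x] if x % 2 == 0 else r_row[x] for x in range(len(l_row))]
        for l_row, r_row in zip(left, right)
    ]


# Two 2x6 placeholder images: every pixel is just a label.
left_image = [["L"] * 6 for _ in range(2)]
right_image = [["R"] * 6 for _ in range(2)]

stereogram = interleave_columns(left_image, right_image)
for row in stereogram:
    print("".join(row))
```

Each eye receives only every second column, which is one way to see why barrier-based displays trade horizontal resolution for the stereo effect.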

Fig.6.2 Images of a scene from the viewer side of the display (top row) and
as seen from some of the cameras (bottom row).

Multi Projector

Multi-projector displays offer very high resolution, flexibility, excellent cost performance, scalability, and large-format images. Graphics
rendering for multi-projector systems can be efficiently parallelized on clusters of PCs using, for example, the Chromium API. Projectors also
provide the necessary flexibility to adapt to non-planar display geometries. Precise manual alignment of the projector array is tedious and becomes
downright impossible for more than a handful of projectors or for non-planar screens. Some systems use cameras in the loop to automatically compute
relative projector poses for automatic alignment. Here, a static camera is used for automatic image alignment and brightness adjustment of the
projectors.


This is a brief explanation that we hope sorts out some of the confusion about the many 3D display options that are available today. We'll tell
you how they work, and what the relative tradeoffs of each technique are. Those of you who are just interested in comparing different Liquid
Crystal Shutter glasses techniques can skip to the section at the end. Of course, we are always happy to answer your questions personally, and
point you to other leading experts in the field [4]. Figure shows a diagram of the multi-projector 3D displays with lenticular sheets.

Fig.7.1 Projection-type lenticular 3D displays

The system uses 16 NEC LT-170 projectors with 1024x768 native output resolution. This is less than the resolution of the acquired and transmitted
video, which has 1300x1030 pixels. However, HDTV projectors are much more expensive than commodity projectors. Commodity projectors also have a
compact form factor. Out of the eight consumer PCs, one is dedicated as the controller. The consumers are identical to the producers except for a
dual-output graphics card that is connected to two projectors. The graphics card is used only as an output device. For the rear-projection system
shown in the figure, two lenticular sheets are mounted back-to-back with optical diffuser material in the center. The front-projection system uses
only one lenticular sheet, with a retro-reflective front-projection screen material made from flexible fabric mounted on the back. Photographs show
the rear and front projection.

Fig.7.2 Rear Projection and Front Projection
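
The projector figures given above can be tied back to the native display resolution of about 12 million pixels quoted in the section on parallax displays. This is a small arithmetic sketch using only numbers from the text (16 projectors at 1024x768, cameras at 1300x1030); the constant and variable names are illustrative.

```python
NUM_PROJECTORS = 16
PROJECTOR_RES = (1024, 768)   # native output resolution per projector
CAMERA_RES = (1300, 1030)     # resolution of acquired and transmitted video

pixels_per_projector = PROJECTOR_RES[0] * PROJECTOR_RES[1]
total_display_pixels = NUM_PROJECTORS * pixels_per_projector
pixels_per_camera = CAMERA_RES[0] * CAMERA_RES[1]

print(f"pixels per projector: {pixels_per_projector}")
print(f"total display pixels: {total_display_pixels}")
print(f"camera/projector pixel ratio: {pixels_per_camera / pixels_per_projector:.2f}")
```

The total comes to roughly 12.6 million pixels, and each camera frame carries about 1.7 times as many pixels as a projector can show, which is the resolution gap the text points out.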

The projection-side lenticular sheet of the rear-projection display acts as a light multiplexer, focusing the projected light as thin vertical stripes
onto the diffuser. A close-up of the lenticular sheet is shown in figure 6. Considering each lenticule to be an ideal pinhole camera, the stripes
capture the view-dependent radiance of a three-dimensional light field. The viewer-side lenticular sheet acts as a light de-multiplexer and projects
the view-dependent radiance back to the viewer. The single lenticular sheet of the front-projection screen both multiplexes and demultiplexes the
light. The two key parameters of lenticular sheets are the field of view (FOV) and the number of lenticules per inch (LPI). Here, 72" x 48"
lenticular sheets with 30 degrees FOV and 15 LPI are used. The optical design of the lenticules is optimized for multiview 3D display. The number
of viewing zones of a lenticular display is related to its FOV. For example, an FOV of 30 degrees leads to 180/30 = 6 viewing zones.
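
The two lenticular-sheet parameters can be turned into concrete numbers with simple arithmetic. The sketch below follows the rule of thumb stated in the text (180 degrees divided by the FOV) and additionally assumes that the lenticules run vertically across the 72-inch dimension of the sheet; that orientation is an assumption made for illustration, and the function names are hypothetical.

```python
def viewing_zones(fov_degrees):
    """Number of viewing zones, using the 180 / FOV rule from the text."""
    return 180 / fov_degrees


def lenticule_count(sheet_width_inches, lenticules_per_inch):
    """Lenticules across the sheet, assuming they span the given width."""
    return sheet_width_inches * lenticules_per_inch


zones = viewing_zones(30)
lenticules = lenticule_count(72, 15)

print(f"viewing zones: {zones:.0f}")
print(f"lenticules across a 72-inch sheet at 15 LPI: {lenticules}")
```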

Most of the key ideas for the 3D TV system presented in this paper have been known for decades, such as lenticular screens, multi-projector 3D
displays, and camera arrays for acquisition. This system is the first to provide enough viewpoints and enough pixels per viewpoint to produce
an immersive and convincing 3D experience. One area of future research is to improve the optical characteristics of the 3D display
computationally. This concept is called computational display. Another area of future research is precise color reproduction of natural scenes on
multiview displays.


[1] L. Onural (Bilkent Univ.), T. Sikora (Tech. Univ. of Berlin), J. Ostermann (Univ. of Hanover), A. Smolic (Fraunhofer Inst.-HHI),
M. R. Civanlar (Koc Univ.), and J. Watson (Univ. of Aberdeen), "An Assessment of 3DTV Technologies", NAB 2006, Las Vegas, 26 April 2006.
[2] T. Capin, K. Pulli, and T. Akenine-Moller, "The State of the Art in Mobile Graphics Research", IEEE Computer Graphics and Applications,
vol. 28, no. 4, pp. 74-84, 2008.
[3] K. Muller, P. Merkle, and T. Wiegand, "Compressing 3D Visual Content", IEEE Signal Processing Magazine, vol. 24, no. 6, pp. 58-65,
November 2007.
[4] T. Okoshi, "Three dimensional displays", Proceedings of the IEEE, vol. 68, pp. 548-564, 1980.
[5] I. Sexton and P. Surman, "Stereoscopic and auto stereoscopic display systems", IEEE Signal Processing Magazine, vol. 16, no. 3, pp. 85-99.
[6] C. Fehn, P. Kauff, M. Op De Beeck, F. Ernst, W. IJsselsteijn, M. Pollefeys, L. Van Gool, E. Ofek, and I. Sexton, "An Evolutionary and
Optimized Approach on 3D-TV", Proc. of International Broadcast Conference, 2002.
[7] C. Fehn, "A 3D-TV approach using depth image-based rendering (DIBR)", Proc. of VIIP 2003.