
Designing low-cost VoIP videophones

By Michael Ward
Director of Product Line Management
Trinity Convergence Inc.
E-mail: mward@trinityconvergence.com

Video telephony has faced challenges in quality and in the cost of its equipment and associated services. These factors, along with video frame rates ranging from a single frame to 10 frames per second, resulted in a videophone that consumers left sitting on the shelves.

One by one, the technological challenges facing video telephony are being overcome with increasingly sophisticated VoIP solutions. Broadband Internet access has become pervasive, creating sufficient bandwidth to the home. A combination of startup and established carriers is providing packet-based telephony services that can serve as the foundation for video telephony. Most important, Moore's Law has worked its magic to deliver processing devices capable of cost-efficient full-motion video. By using soft-VoIP device design techniques in conjunction with hardware-accelerated video processing, phones can now be created that meet the aggressive cost points needed for video telephony to finally reach its mass-market potential.

Price is right

For the widespread adoption of videophones, providers are targeting a consumer price point of $99 for the residential video telephony market. While this is challenging for the current generation of technology, many VoIP service providers are subsidizing videophones in exchange for a year or more of subscription. With current videophone models retailing for about $800 before subsidies, a heavy burden is placed on the service provider, which can reduce the drive to offer such services.

Current VoIP-enabled videophone designs are expensive because they require a high number of specialized components. Traditional video and VoIP (V2IP) designs often contain as many as three independent processors: a DSP, a video encode and decode processor, and an applications processor. These processors, combined with necessary electronic components such as the camera, LCD and audio codec, quickly push the overall electronics BOM (eBOM) to the $500 range before any costs for software, packaging or manufacturing are added.

Fortunately, emerging technology allows the eBOM for a videophone to be cut by at least 50 percent. By using integrated applications processors and soft-DSP techniques, such as VeriCall Edge™, the multiple processors previously required can be combined into a single device, yielding significant savings in cost, power and device size.

In traditional videophone devices, a DSP handles the packet voice processing (voice encode/decode, tone generation and detection, echo cancellation and noise reduction). A separate DSP or dedicated co-processor handles the video encode and decode, while an applications processor manages the VoIP call control protocol and user interface (Figure 1).

[Figure 1: DSP handles the packet voice processing, while an applications processor manages the VoIP call control protocol and user interface. Blocks shown: mic, speaker, audio codec, DSP (VoIP audio processing), DSP or co-processor (video encode and decode), display, camera, applications processor (VoIP call control, GUI, system management), keypad, Ethernet interface.]

In this multidevice architecture, tasks are relegated to different components of the system for processing. Hence, the task of coordinating and managing the overall system grows, increasing the size of the design and the overall complexity of the circuit board. Given the multiple devices and their varying power requirements, the power-supply design within the system may become very complex, requiring the use of multiple voltage converters to generate the various power rails needed.

A traditional architecture using separate processors for voice, video and system control also requires multiple programming models and development tool chains. Thus, larger development teams, increased training and additional costs are required.

VoIP codecs (G.711, G.729AB, G.723.1 and iLBC), audio processing (DTMF and call progress tone detection/generation), voice-quality enhancement (line and acoustic echo cancellation, and jitter buffers) and other similar functions can now be executed on the applications processor if carefully implemented with assembly-coded and hand-optimized software, while using some form of hardware acceleration for the video encode and decode (Figure 2).

[Figure 2: Software running on general-purpose applications processors sends and receives audio data and performs the necessary VoIP processing. Blocks shown: mic, speaker, audio codec, applications processor (VoIP audio processing, VoIP call control, GUI, system management), DSP or co-processor (video encode and decode), display, camera, keypad, Ethernet interface.]

In a typical videophone design, a hardware audio codec (e.g. an AC97 codec) provides the physical interface between the microphone/speaker and the general-purpose processor. Software running on the general-purpose applications processor sends and receives audio data and performs the necessary VoIP processing, such as taking incoming audio data from the microphone, processing it via a voice codec and sending it out as a packet stream, or perhaps playing a generated DTMF tone to the speaker (via the hardware audio codec).

Detailed knowledge of the processor architecture and of VoIP audio processing algorithms is needed to implement these processor-intensive algorithms effectively. While a significant effort is required to implement these voice-processing modules for the appropriate applications processor architecture, the result is well worth the investment.

Due to the real-time nature of VoIP traffic and the need to support a wide range of VoIP codecs to ensure interoperability across devices, a flexible framework is needed. This allows runtime selection and configuration of the appropriate VoIP codec and dynamic configuration of the media-processing elements to be used within a given media channel. The framework and its associated scheduler component must ensure that all algorithms required for a given channel definition are executed in the time period allowed.

While in a single-channel system the task of scheduling algorithms is little more than a series of consecutive calls to the appropriate algorithms in order, multichannel systems offer a more complex scenario in which different VoIP codecs may be required for each channel, and certain channels require echo cancellation while others do not. Videophones are typically "single-channel" systems, although they will often provide the capability for three-way audio calling with the audio mixed locally on the videophone, so the need for multichannel support arises.

Simplifying life for device designers, software platforms such as VeriCall Edge provide a highly optimized and integrated solution in a form that can be quickly integrated into the end-product design. Providing all the requisite media processing algorithms and SIP- or H.323-based VoIP call control within a flexible framework integrated for ARM9, ARM9E and MIPS32-based devices allows the videophone developer to focus on support for value-added services on the device.

By merging the packet voice processing onto a GPP, only one device is required to handle the VoIP-related tasks of the videophone. The VoIP call control, user interface and general telephony control application execute on this general-purpose applications processor. As these tasks do not exhibit the same time-critical attributes, they are implemented in C without the need for optimized assembly code. Merging the media processing and call control/system management onto a single applications processor enables a simplified design, which reduces the device count, cost and size. It also removes the need for separate development efforts and tool chains for the VoIP media processing and system control.

To further increase the level of system integration and gain the additional benefits of lower power, smaller footprint and overall lower system costs, an SoC that contains the GPP along with a dedicated video co-processor or DSP can be used (Figure 3). Devices such as the Freescale™ i.MX21 multimedia applications processor fall into this category. The i.MX21 device contains an ARM9E-based GPP along with a dedicated hardware acceleration engine for H.263 and MPEG-4 encode/decode for up to 30 fps video at CIF resolution. The VeriCall Edge solution takes advantage of this architecture by running all the VoIP media processing on the ARM9E GPP, while the VeriCall Edge software framework configures and manages the hardware-based video-acceleration block to control the H.263 or MPEG-4 video stream.

[Figure 3: To further increase system level integration, an SoC that contains the GPP along with a dedicated video co-processor or DSP can be used. Blocks shown: microphone, speaker, audio codec, applications processor (VoIP audio processing, VoIP call control, GUI, system management), DSP or co-processor (video encode and decode), display, camera, keypad, Ethernet interface.]

Getting the picture

The most processing-intensive task in a videophone is the processing of the video itself. The specific processing required may vary considerably, depending on the size of the transmitted and received image, as well as on the particular encoding scheme used. H.263, MPEG-4 and H.264 are among the most commonly used codecs for video telephony. H.264 has the benefit of requiring less bandwidth to transmit a video image of quality comparable to that of H.263, but requires significantly more processing power to achieve this higher level of compression. The specific video compression scheme, desired maximum frame rate and range of supported resolutions will drive the "right" solution for video processing in a device.

Options available to the system designer range from software implementations running on a GPP or a DSP to silicon devices dedicated to the encoding and decoding of specific video streams. Full software encoding and decoding on a GPP is really only viable for video at very low frame rates and low resolutions. The encoding of the video stream requires significantly more processing power than the decoding operation, so a reasonable alternative is to perform the video decode on a GPP while handling the encode via some form of hardware-based acceleration. With the additional processing power in emerging general-purpose applications processors (such as those based on ARM11 and MIPS24K cores), software-based decoding combined with hardware-assisted encoding is a suitable solution for many next-generation videophones.

To achieve full-motion (30 fps) video at CIF or greater resolutions, current-generation VoIP videophones will require some form of hardware-based acceleration. This could be in the form of an SoC with a GPP to handle voice, and either a dedicated video encode device or a more general-purpose DSP to handle video. Advantages of dedicated video encode devices include a simplified programming model and implementations that are often more efficient in terms of power and silicon gate count than a DSP when embedding the device in an SoC. However, a DSP provides more flexibility for the system designer, allowing DSP software updates to deliver new video-codec formats (assuming the DSP used has sufficient processing power for the new video codec). Ultimately, the system designer must determine the desired operating characteristics, taking into consideration the video formats to be supported, available processing capacity, power budget and footprint requirements for the device.

Bringing it all together

With the design points for the A/V systems identified, perhaps the most critical aspect of the design is the integration of these two disparate subsystems. In a VoIP-based videophone, A/V signals are transmitted and received as two independent packet streams that must be correlated and synchronized on the receiving side. Failure to properly synchronize the A/V streams results in a user experience akin to that of watching a late-night Godzilla movie, where the actors' words never match the video on the screen.

A jitter buffer designed with consideration for video telephony is needed, and careful attention must be paid to this critical system element to create a viable product. While the A/V packet streams contain timestamps that can be used to correlate data, the system must also contend with packet loss and jitter that can occur in the network. The design must buffer and synchronize the two separate streams in a manner that does not introduce unnecessary delay to the system.

Certain videophones provide basic synchronization by allowing the user to modify a delay or offset between the A/V streams. While this solution enables synchronization of the A/V streams, the setting must be adjusted each time the user places a call. If network conditions change during a call, it is possible for these streams to go out of sync. The VeriCall Edge V2IP solution incorporates an advanced technology that provides automatic dynamic synchronization of the A/V streams, removing the need for the user to manually synchronize the streams while also adapting to changing network conditions. By actively monitoring the incoming A/V streams and managing these streams to accommodate delayed or missing packets in one stream or the other, VeriCall Edge can ensure a high-quality experience to the user in a wide range of network environments.

Low-cost videophone

While video telephony has faced many false starts over the past 40 years, elements are now in place to allow cost-effective video telephony to reach the mass market using the fast-growing VoIP infrastructure. By coupling soft-DSP VoIP techniques with hardware-accelerated video encoding, videophones can be developed at savings of as much as 50 percent in the eBOM of the device. These savings in product cost will allow videophones to be marketed at price points that finally enable mass-market adoption. The use of integrated software frameworks like VeriCall Edge allows system developers to deliver innovative products with less risk and faster time-to-market.

As the demand for low-cost video telephony grows, it will inevitably find its way into other devices beyond the traditional desk phone. Wi-Fi-based personal video terminals are already in development that will allow friends to share experiences in real time, wherever Wi-Fi access exists. Remotely monitored V2IP door cameras can become commonplace, allowing a homeowner to answer a door and converse with the visitor from any place in the world that has a broadband Internet connection. These and many other innovative applications for video telephony will surely arise, all driven by the ability to deliver low-cost video terminals that are enabled by DSP-free VoIP techniques.
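
As an illustration of the timestamp-based synchronization discussed under "Bringing it all together", the following is a minimal sketch of a receive-side playout buffer. It is not the VeriCall Edge method, which the article does not describe. The names `Packet` and `AVSyncBuffer`, the fixed playout delay and the late-frame threshold are all hypothetical. The sketch also assumes that audio and video timestamps have already been mapped onto one shared millisecond clock.

```python
import heapq
from dataclasses import dataclass, field


@dataclass(order=True)
class Packet:
    # Capture time in ms on a clock shared by both streams (assumption).
    timestamp_ms: int
    payload: bytes = field(compare=False)


class AVSyncBuffer:
    """Hypothetical playout buffer: reorders packets and releases both
    streams against one playout clock so audio and video stay aligned."""

    def __init__(self, playout_delay_ms=100, max_video_lag_ms=80):
        self.playout_delay_ms = playout_delay_ms  # absorbs network jitter
        self.max_video_lag_ms = max_video_lag_ms  # older frames are dropped
        self._audio = []  # min-heaps ordered by timestamp
        self._video = []

    def push_audio(self, pkt):
        heapq.heappush(self._audio, pkt)

    def push_video(self, pkt):
        heapq.heappush(self._video, pkt)

    def pop_due(self, now_ms):
        """Return (audio, video, dropped) for everything due at now_ms."""
        playout_ts = now_ms - self.playout_delay_ms
        audio, video, dropped = [], [], 0
        while self._audio and self._audio[0].timestamp_ms <= playout_ts:
            audio.append(heapq.heappop(self._audio))
        while self._video and self._video[0].timestamp_ms <= playout_ts:
            frame = heapq.heappop(self._video)
            if playout_ts - frame.timestamp_ms > self.max_video_lag_ms:
                dropped += 1  # too late to show in sync with the audio
            else:
                video.append(frame)
        return audio, video, dropped
```

The fixed `playout_delay_ms` is the simplest possible choice. A design that adapts to changing network conditions, as the article calls for, would adjust this delay from measured jitter rather than hold it constant.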
