Don't let our sexy curves and cool colors fool you. The internet-age Pingtel xpressa phone, and its virtually limitless Java repertoire of revenue-enhancing possibilities, such as hosted IP voice services, is a very serious money maker indeed. To learn about the opportunities the world's most intelligent phone can bring you, go to www.pingtel.com/mintmoney. Or send an e-mail to us at hostedvoiceservices@pingtel.com and we'll get back to you
TECHNOLOGY GUIDE
Visit our Web site to read, download, and print all the Technology Guides in this series.
Table of Contents
Abstract
Introduction
Architecture Models
Technology Enablers for Next Generation Voice Services and Applications
Next Generation IP Voice Services and Applications
Summary
Glossary
Appendix A: Session Initiation Protocol (SIP) Concepts and Operation
techguide.com
Over 100 Technology Guides in the Following Categories:
Software Applications Network Management Enterprise Solutions Network Technology Telecommunications Convergence/CTI Internet Security
ATG's Technology Guides and White Papers are produced according to a structured methodology and proven process. Our editorial writing team has years of experience in IT and communications technologies, and is highly conversant in today's emerging technologies.
The Guide format and main text of this Guide are the property of The Applied Technologies Group, Inc. and is made available upon these terms and conditions. The Applied Technologies Group reserves all rights herein. Reproduction in whole or in part of the main text is only permitted with the written consent of The Applied Technologies Group. The main text shall be treated at all times as a proprietary document for internal use only. The main text may not be duplicated in any way, except in the form of brief excerpts or quotations for the purpose of review. In addition, the information contained herein may not be duplicated in other books, databases or any other medium. Making copies of this Guide, or any portion for any purpose other than your own, is a violation of United States Copyright Laws. The information contained in this Guide is believed to be reliable but cannot be guaranteed to be complete or correct. Any case studies or glossaries contained in this Guide or any Guide are excluded from this copyright. Copyright 2001 by The Applied Technologies Group, Inc. 209 West Central Street, Suite 301, Natick, MA 01760 Tel: (508) 651-1155, Fax: (508) 651-1171 E-mail: info@techguide.com, Web site: http://www.techguide.com
Abstract
This Technology Guide explains the unique benefits of using the Web architectural model with SIP and Java as the enabling technologies for next generation IP voice services and applications. Using the Web as a reference model for rapid innovation, the Guide contrasts the limitations of circuit-switched telephony and first generation VoIP architectures with the Web model. It summarizes limitations of centralized-processing models such as traditional telephony, MGCP, and Megaco as compared to peer-to-peer models such as SIP and H.323. This Technology Guide explains in more detail the unique benefits of using SIP for call control and Java for making phones intelligent. SIP is compared with H.323 in terms of innovation, scalability, simplicity, ease of deployment, and standardization. The Guide also includes an explanation of SIP concepts and operation. A description of Java features supporting new voice services and applications is also included. The Guide concludes with examples of new voice services and applications made possible exclusively by SIP and Java.
Introduction
Traditional telephony has hit a wall in terms of innovation, ease of use, and cost reduction. The core components of traditional telephony (the terminal or telephone, the PBX, the central office switch, and the switching network) are struggling and failing to keep up with the rate of innovation on the Internet. The archaic telephony framework, with PBXs and Custom Local Area Signaling Services (CLASS) switches providing Centrex and enhanced residential services (call waiting, call forwarding, caller ID, etc.), cannot provide the types of features that a contemporary business needs in the age of e-commerce. Traditional business telephony solutions are complicated for both service administrators and users. Because of the daunting complexity of PBX and CLASS/Centrex user interfaces, users typically know and use only a fraction of the total feature set.

Now imagine telephony services in the context of the current business need. Users would still like to use a phone for making and receiving calls and playing voice-mail messages. However, they would also like to have the phone appliance integrated with a browser-based PC for managing phone books and seamlessly interfacing with other applications, such as customer relationship management (CRM), sales force automation (SFA), supply chain management (SCM), time accounting, etc. In other words, they want to perform the tasks most suitable for the PC on the PC and those most suitable for the telephone on a phone appliance, with the two devices seamlessly integrated. Today's telephone just cannot deal with this new business imperative.

In contrast, the Internet and Web-based communications have revolutionized the business environment and users' personal lifestyles through inexpensive, standards-based innovations. We already have data, multimedia, video, and music applications on the Internet. The Internet is already serving as the underpinning of critical business and IT solutions. In the last few years alone, the Internet and the Web have generated more innovations than traditional telephony has produced in its entire history. The next frontier for the Web is to apply the same degree of innovation to telephony. Most market surveys have verified that IP telephony is already supplementing traditional telephony, and it is expected that the IP telephony architecture will ultimately replace the traditional telephony model.
This Technology Guide explains the architecture of the new IP telephony model using Session Initiation Protocol (SIP) and Java. The Guide also demonstrates the power of SIP and Java in terms of scalability, ease of use, and innovative services and applications.
[Figure labels: IP Centrex; softswitch "gatekeeper"]
Architecture Models
Circuit-Switched and First-Generation IP Telephony Architectures
The traditional telephony architecture is based on a centralized processing model. First generation IP telephony architecture uses the Media Gateway Control Protocol (MGCP), Megaco, or vendor proprietary protocols such as Cisco's Skinny Client Control Protocol (SCCP), which are also centralized architectures similar to traditional telephony.
Figure 1A: Traditional circuit-switched telephony architectures (labels: PBX, Centrex, CLASS 5 switch)
Both models have all of their intelligence in a centralized switch or server, which performs all of the telephony functions such as call setup, call forwarding, conference calling, etc. All requests, responses, and state changes must be processed by the central switch/server, with the end-station being a dumb terminal. The following are the salient characteristics of the traditional telephony environment:

Archaic, Host-to-Dumb Terminal Architecture: Voice service architecture has not changed for generations. Today, PBX and Centrex services are delivered using switches that contain all application intelligence, just as mainframes and minicomputers did for IBM 3270 or VT100 terminals in old computer systems.

Dumb Terminal, the Telephone: Voice service delivery assumes a dumb terminal, which in telephony parlance is the telephone. The end-user interface for these services on the dumb telephone requires non-intuitive flash sequences and star codes. No options exist for making telephony features easier to use and increasing user productivity.

Hardware-Specific Software: The voice features reside in software that is usually hardware-specific and/or proprietary. This environment requires highly specialized software engineers who are expensive and hard to find. Even simple software modifications require extensive regression testing of feature interaction.

Limited Next-Generation Platforms: Next-generation voice service platforms still fall short of business needs. Most first-generation IP telephony systems, for both service providers and enterprises, do exploit IP for transport, and some feature a Java or XML software environment. However, this open environment is not easily extended by anyone other than the vendor or possibly a service provider; certainly not by the enterprise or an independent software vendor with a great idea. These systems, consequently, still perpetuate the same 1960s host-terminal architecture with a dumb telephone as the endpoint: the IP PBX is a host computer with all the smarts driving dumb IP phones. VoIP gateways, softswitches, and their feature servers are merely physically distributed mainframes talking to dumb terminals.
Web Architecture
The Web represents the most successful application architecture in history. The Web features many intelligent servers located everywhere on the network and an intelligent, browser-based client device (a PC or a low-cost Internet appliance). It is the client device, not the server, that both initiates and controls all communications with the server. When a user simply clicks on an icon to access an application, the browser pulls content in the form of HTML and applications (Java, JavaScript, Flash, ActiveX, etc.) from the server and runs them on the PC.

There is a complete disaggregation of services in the Web model. Not only do the services come from different servers, they may also be provided by multiple different service providers. Some examples (shown in figure 2) include Yahoo for news; Amazon for shopping; MSN for instant messaging; ASP services (such as Corio) for customer relationship management (CRM), sales force automation (SFA), and enterprise resource planning (ERP); and MP3.com for music. An enterprise can outsource as few or as many services as suits its business model.

Key characteristics of the Web architecture include:
Intelligent end devices (clients)
Distributed, intelligent servers (no central switch or server for services)
An open architecture leading to innovation, rapid application development, and lower costs
Figure 2: Web application architecture. Intelligent servers (CRM/SFA, MSN Instant Messenger, amazon.com, MP3.com, doubleclick.com, yahoo.com, Virtualcart.com) serving intelligent clients.
voice world are solely defined and developed by PBX and CLASS switch manufacturers, just as mainframe applications were defined by the vendors. The PBX and CLASS switch vendors, their ideas, their bureaucratic practices, and their business motivations have held innovation in the voice world hostage. Voice features reside in software on the switch that is hardware-specific and vendor-specific. It is a proprietary environment that is not openly extensible. Even modest new functions require onerous regression testing of feature interaction. The centralized, closed-software environment offers no way for enterprises to add their own innovations or enhancements to telephony features, let alone individual users or software developers with really good ideas. Some features are impossible to implement because of the dumb telephone as the endpoint. Consequently, innovation is and will remain dead, especially when compared to the revolutions on the Web.
Web
Innovation on the Web occurs at the edges of the network, where anyone (businesses and individuals alike) can create Web sites that are immediately open for other users to interact with. On the Web, in contrast to traditional telephony, a new page or feature can be created in a few minutes. More importantly, the Web page can be conceived, created, delivered, and personalized by anyone: Yahoo, eBay, GE, a company, an individual, their kids, or their grandparents. Several million Web sites are in existence today, up from a few thousand in 1993. These sites satisfy everyone's personal and business needs for news, buying, entertainment, chat, sports, sex, etc., regardless of gender, race, religion, ethnic background, industry, and occupation. Amazon.com would not have happened if the world had needed to rely on data communications vendors such as Alcatel, Cisco, Lucent, or Nortel to invent the service and add the features to a router or a switch.

Ease of Use
Traditional and First-Generation IP Telephony
For most telephone users, cryptic, impossible-to-remember flash sequences and * codes are the interface to thousands of PBX and CLASS features. For the fortunate few with block character displays, even IBM 3270 and VT100 terminals appear attractive. Users don't know what voice features exist and, if they do, they do not know how to use them. While most voice service platforms such as PBX and CLASS switches offer hundreds or thousands of features (300-400 features in a typical PBX, 3000-4000 in a CLASS 5 switch), most users know no more than a few: transfer, hold, last number redial. In research conducted by WorldCom, 9 out of 10 executives could not even transfer a call without resorting to the help scream: "Do I dial flash first and then the number, or the other way around?" Trying to set up just a 3-party conference call over a PBX is an even bigger nightmare. It's no wonder that the assisted conference calling businesses of AT&T, Sprint, and WorldCom are so big and profitable. For many, the most difficult part of changing jobs is learning a new phone system: "What do I dial to get an outside line?" Consequently, for the vast majority, ignorance is bliss, yet very expensive in user productivity.
Web
On the Web, millions of sites with billions, perhaps trillions, of pages can be easily navigated by pointing and clicking at pictures or words displayed on an intelligent, browser-based PC. In contrast to telephony feature usage, anyone from kids to their great-grandparents can easily discover and use any site on the Web. The browser's graphical user interface means that users do not have to memorize features as in the world of telephony. The use of any Web site is an intuitive discovery process, performed simply by pointing and clicking at images and words.

Scalability and Capacity
Traditional and First-Generation IP Telephony
In the telephony world, big centralized boxes have all the smarts. Whenever the telephone (the "terminal" in the parlance of telephone equipment vendors) sends a flash sequence or * code, it's the PBX or CLASS switch that figures out what it means. The PBX or the switch also must actively manage each and every call. Consequently, it just does not scale. Support for just one more user may end up requiring a hugely expensive replacement or addition.

Web
A Web site, however, can support millions of users. Scalability is achieved not only through the connectionless nature of IP and by adding more and bigger servers to the Web site, but also by exploiting an intelligent endpoint: the browser-based PC. In fact, it's the browser software that interprets Web objects and puts a Web page together. For example, in accessing a typical e-commerce site, it's the browser, not a server, that:
Retrieves and displays the source HTML page and embedded product images individually
Retrieves and runs a Java applet, JavaScript, Flash, ActiveX, or other application components
Retrieves and displays a dynamic advertisement from DoubleClick.com
Retrieves shopping cart services from a ShoppingCart.com
Stores cookies to identify users and maintain state
Encrypts credit card numbers

Manageability
Traditional and First-Generation IP Telephony
An expert, the equivalent of the proverbial rocket scientist, must perform all maintenance and management tasks for the PBX or the switch. Tools for managing moves/adds/changes tend to be horrendous and, consequently, administrators learn only the basic coping skills. This makes it extremely costly to administer the switch. According to some estimates, it can cost as much as $300-$500 per PBX move/add/change. For a Centrex line, it can take weeks for a change to be implemented by the telephone company.
Web
Self-service by users is the normal operative model here for registration, buying things, personalizing information, etc. Every office device, including printers, copiers, and now intelligent IP phones, has a built-in Web server that enables remote configuration over the net via a browser interface. Every office device and home appliance is becoming more intelligent and capable of running automated diagnostics, reporting the findings, and ordering replacements before service is disrupted.

Exploiting the Web Architecture for Next Generation Voice Services and Applications
Figure 3 shows what telephony would look like if migrated to a Web-like architecture. In this model, services and applications are resources on the network and are accessed and controlled by the phone, not by a central switch or a gatekeeper. Nor does a central switch or gatekeeper control what the phone can do.

PCs and other phones are simply resources on the network that provide services to users. In this model, the PC may provide services for the phone, such as integration with desktop applications, or the phone may provide services for the PC, such as ringing the phone and automating conference calls from Microsoft Outlook.

An enterprise has the option of providing PBX services locally through a premises-based system, or these could be outsourced to a network-based service. The outsourced service not only eliminates capital costs but may actually provide richer services than those available from a PBX. The figure also shows some illustrative services such as unified messaging, presence messaging, instant messaging, and CRM integration, all of which can be provided by separate service providers offering best-of-breed solutions for an enterprise's or even an individual user's specific requirements.

Figure 3: Web architecture for next-generation voice services and applications. Intelligent servers (CRM/SFA, presence and IM, audio auctions, hosted PBX service, IP PBX, unified messaging, PSTN gateways, Java MP3, PC app integration) serving intelligent clients.
Phone Intelligence Technology
An ability to support small-footprint applications is the key to incorporating intelligence in phones. This requires a powerful yet easy-to-use programming language that is widely used for Web-enabling Internet appliances. In addition to rich functionality for traditional Web applications, features developed specifically for telephony and security are mandatory. Lastly, the language must already be used by hundreds of thousands of programmers worldwide in order for innovation to happen rapidly.

Extensible, Scalable Call Control Protocol
A call control protocol is used for call-related functions such as setting up, monitoring, and terminating calls. However, in the new IP telephony model, the call control protocol must differ from traditional telephony and the first generation IP telephony protocols. For maximum scalability, the new call control protocol must support peer-to-peer communications, whereby two or more phones can set up a call and communicate directly without requiring anything more than location services from a call control server. In addition, the protocol must allow the peer-to-peer exchange of applications and data in addition to voice communications. The call control protocol must support a wide range of environments, from the home office to the largest enterprise and from the smallest to the largest service provider. Thus, the protocol must be highly scalable as well as cost effective in a diverse range of configurations. Since it is not possible to predict all future applications of IP telephony, the protocol must also be extensible in order to accommodate unforeseen requirements.
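The requirement that phones need nothing more than location services from a call control server can be pictured with a small Java sketch. It is an illustration under stated assumptions, not an implementation of any protocol: the class `LocationServiceSketch`, its `register` and `locate` methods, the user `bob`, and the contact address `192.0.2.20:5060` are all hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;

/**
 * Illustrative only: a call control server reduced to a location service.
 * Class and method names are hypothetical and not taken from any SIP stack.
 */
final class LocationServiceSketch {

    // Maps a user's public address to the network address where the phone can be reached now.
    private final Map<String, String> contacts = new ConcurrentHashMap<>();

    /** A phone announces where it can currently be reached. */
    void register(String userAddress, String currentContact) {
        contacts.put(userAddress, currentContact);
    }

    /** A caller asks where the callee is; the server keeps no call state. */
    Optional<String> locate(String userAddress) {
        return Optional.ofNullable(contacts.get(userAddress));
    }

    public static void main(String[] args) {
        LocationServiceSketch server = new LocationServiceSketch();
        server.register("sip:bob@abcorp.com", "192.0.2.20:5060");

        // The caller resolves the callee once, then would talk to that address directly.
        String target = server.locate("sip:bob@abcorp.com").orElse("unknown");
        System.out.println("Caller sets up the call directly with " + target);
    }
}
```

The point of the sketch is what the server does not do: it answers a lookup and is then out of the picture, so adding users grows a lookup table and does not add per-call work in a central switch.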
SIP (Session Initiation Protocol): The Call Control Protocol
SIP introduces the benefits of the Web architecture to IP telephony. It provides a powerful, extensible, scalable, and easy-to-deploy protocol for call control and media exchange. Several standards are available for building IP telephony solutions. These include the Session Initiation Protocol (SIP) from the IETF; H.323, an ITU-T umbrella standard; Media Gateway Control Protocol (MGCP) from the IETF; Media Gateway Control (Megaco), a joint protocol by the IETF and ITU-T; and proprietary protocols such as Cisco's Skinny Client Control Protocol (SCCP). A high-level comparison of these protocols is included in table 1.
Table 1: High-level comparison of IP telephony protocols

                     SIP                         H.323                        MGCP/Megaco
Architectural model  Peer-to-peer                Peer-to-peer                 Master/slave
Media types          Voice, video, data          Voice, video, limited data   Voice, video
Network scope        Intra, Extra, and Internet  Intra, Extra, and Internet   Intranet only
Extensibility        High                        Low                          Medium
Scalability          High                        Medium                       Low
Ease of deployment   High                        Low                          Medium
Standardization      IETF                        ITU-T                        IETF and ITU-T
Why SIP
Of the protocols listed in table 1, only SIP and H.323 are peer-to-peer protocols. MGCP, Megaco, and Cisco's proprietary SCCP represent the old centralized model and suffer from this model's limitations discussed earlier. Thus, the real choice for a protocol with Web-like benefits comes down to one of the peer-to-peer protocols: H.323 or SIP.

H.323, the older of the two protocols, was originally designed for video conferencing over the LAN. Since then it has been adapted to support voice and video over the WAN as well. SIP, however, was designed from the beginning for multimedia sessions and conferences over the WAN. Because of these differences in their design objectives, SIP offers numerous compelling advantages over H.323 in the areas of extensibility, scalability, and ease of deployment.

Today there are more products available supporting H.323 than SIP. However, SIP is rapidly becoming the preferred protocol. A January 2001 survey of Voice over IP vendors in Network World found that while 75% of the vendors offered products based on one of the four H.323 versions, an approximately equal number of them were already planning to offer SIP-based products by June 2001. The more telling statistic, however, was that less than 25% of the vendors were planning to upgrade their products from H.323 Version 2 to Version 3, and even fewer to Version 4, the latest version of H.323. According to the same survey, most vendors expected H.323 to become a legacy protocol. In contrast, the list of vendors supporting or planning to support SIP is growing rapidly. Service providers embracing SIP as of March 2001 include WorldCom, Level 3, Net2Phone, Telia, Webley, Ibasis, LipStream, and TalkingNets, with many more anticipated.

The reasons for the rapid ascendancy of SIP become obvious when we compare it with H.323 in the areas of innovation, scalability, ease of deployment, manageability, and the standardization process. Appendix A provides additional details on SIP concepts, definitions, and operation.
Innovation
SIP enables new services and applications not possible with H.323 (or other IP telephony protocols) and easily empowers service providers, application developers, and enterprises to create unique, differentiated services and applications. For example, SIP uses a simple text-based encapsulation (based on the Internet standard MIME) that enables it to transmit data and application programs with the voice call, making it easy to send business cards, photos, and/or MP3-encoded information during a call. SIP also supports third-party call control through simple applications that modify SIP messages and enable functions such as sending office calls to a home phone after 5:00 PM or forwarding video calls to a PC. Lastly, SIP anticipates the need to accommodate extensions (new protocol headers, methods, bodies, and parameters) to implement new and innovative applications. By design, not all products are required to support these extensions, just the endpoints (servers or phones) that want to use them.

Scalability
Being peer-to-peer protocols, both SIP and H.323 eliminate the need for central servers to control everything. Peer-to-peer protocols reduce the cost of the network and server infrastructure equipment necessary to support a user population of a given size. Among peer-to-peer protocols, SIP is much more efficient and less complex than H.323, and therefore more scalable. H.323 is actually an umbrella specification that includes several protocols from other ITU-T standards. Tables 2 to 4 cover three categories of such protocols within H.323. These include Registration, Admission and Status (RAS), Q.931 for call control, and H.245 for transmission of non-telephony signals on the line. As shown in the tables, SIP has a total of 5 methods (commands) and 8 responses, and H.323 has 21 commands/messages across the three protocols.

[Table: SIP Methods and Response Codes]

SIP can be implemented as a stateless protocol and does not need to maintain any call state, which further increases the scalability of SIP. SIP is also substantially more efficient than H.323 during call set-up, using approximately 50% fewer messages. Figures 4 and 5 show call set-up messages for H.323 and SIP, respectively. While H.323 requires a total of 13 message exchanges, SIP requires only 7 exchanges.
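The after-hours forwarding mentioned under Innovation (office calls sent to a home phone after 5:00 PM) can be sketched in a few lines of Java. This is a hedged illustration, not how any particular SIP server does it: the class `AfterHoursForwardingSketch`, both method names, the 8:00 AM opening time, and the home address `sip:alice@home.example.net` are assumptions made for the example. Only the 5:00 PM cutoff comes from the text.

```java
import java.time.LocalTime;

/** Illustrative only: picks a call target by time of day and rewrites a text request line. */
final class AfterHoursForwardingSketch {

    // 5:00 PM comes from the example in the text; the 8:00 AM opening time is an assumption.
    static final LocalTime OFFICE_OPENS = LocalTime.of(8, 0);
    static final LocalTime OFFICE_CLOSES = LocalTime.of(17, 0);

    /** Returns the office address during office hours and the home address otherwise. */
    static String chooseTarget(String officeUri, String homeUri, LocalTime now) {
        boolean officeHours = !now.isBefore(OFFICE_OPENS) && now.isBefore(OFFICE_CLOSES);
        return officeHours ? officeUri : homeUri;
    }

    /** Replaces the address in a request line of the form "METHOD address VERSION". */
    static String retarget(String requestLine, String newUri) {
        String[] parts = requestLine.trim().split("\\s+");
        if (parts.length != 3) {
            throw new IllegalArgumentException("expected 'METHOD address VERSION': " + requestLine);
        }
        return parts[0] + " " + newUri + " " + parts[2];
    }

    public static void main(String[] args) {
        String office = "sip:alice@abcorp.com";
        String home = "sip:alice@home.example.net";
        String target = chooseTarget(office, home, LocalTime.of(18, 30));
        System.out.println(retarget("INVITE " + office + " SIP/2.0", target));
    }
}
```

Because the message is plain text, the forwarding rule amounts to a string rewrite plus a clock check, which is the kind of simple application the paragraph has in mind.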
[Table: H.323 Commands/Messages. Surviving entries: AdmissionRequest (ARQ), BandwidthRequest (BRQ), DisengageRequest (DRQ), Status, Status Inquiry]

Ease of Deployment
Deploying and supporting SIP is similar to deploying and supporting HTTP. SIP uses standard protocols and functions that already exist in current IP networks and are well understood by system administrators and technical support personnel. SIP has the following HTTP-like characteristics:
Standard Internet addressing: SIP uses a standard Internet addressing format for both names and phone numbers, e.g., sip:username@abcorp.com or sip:1.781.938.5306@abcorp.com
Clear text protocol: SIP uses clear text for its protocol encapsulation, unlike H.323, which uses binary encoding. This makes SIP easier to diagnose and troubleshoot.
Simple error messages: SIP uses familiar three-digit status codes grouped by class (1xx, 2xx, etc.), as in HTTP.
Leverages other Internet protocols: SIP uses other familiar Internet protocols such as MIME and the Session Description Protocol (SDP), again eliminating the need for new technical training or expertise.
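The characteristics listed above can be illustrated with a short Java sketch that needs nothing beyond string handling. It is a simplified illustration, not a SIP implementation: the class `SipTextSketch` and its three methods are hypothetical names, the caller `alice` and the Call-ID value are made up, and the INVITE it builds is deliberately incomplete (a real request carries further headers, such as Via, and usually an SDP body).

```java
/** Illustrative only: hypothetical helpers showing how readable SIP's text format is. */
final class SipTextSketch {

    /** Splits "sip:user@host" into {user, host}; rejects anything else. */
    static String[] parseSipUri(String uri) {
        if (uri == null || !uri.startsWith("sip:")) {
            throw new IllegalArgumentException("not a sip: address: " + uri);
        }
        String rest = uri.substring("sip:".length());
        int at = rest.indexOf('@');
        if (at <= 0 || at == rest.length() - 1) {
            throw new IllegalArgumentException("expected user@host in: " + uri);
        }
        return new String[] { rest.substring(0, at), rest.substring(at + 1) };
    }

    /** Maps a three-digit status code to its class, by first digit as in HTTP. */
    static String responseClass(int code) {
        switch (code / 100) {
            case 1: return "provisional";
            case 2: return "success";
            case 3: return "redirection";
            case 4: return "client error";
            case 5: return "server error";
            case 6: return "global failure";
            default: throw new IllegalArgumentException("unknown status code: " + code);
        }
    }

    /** Builds the start line and a few headers of an INVITE as plain text (incomplete on purpose). */
    static String buildInvite(String fromUri, String toUri, String callId) {
        return "INVITE " + toUri + " SIP/2.0\r\n"
             + "From: <" + fromUri + ">\r\n"
             + "To: <" + toUri + ">\r\n"
             + "Call-ID: " + callId + "\r\n"
             + "CSeq: 1 INVITE\r\n"
             + "\r\n";
    }

    public static void main(String[] args) {
        String[] parts = parseSipUri("sip:1.781.938.5306@abcorp.com");
        System.out.println("user=" + parts[0] + " host=" + parts[1]);
        System.out.println("180 is " + responseClass(180) + ", 200 is " + responseClass(200));
        System.out.print(buildInvite("sip:alice@abcorp.com", "sip:username@abcorp.com", "example-call-1"));
    }
}
```

Printing the request shows why troubleshooting is easier than with a binary encoding: the message can be read directly in a log or packet capture.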
Figure 4: SIP Operation in Proxy Mode (labels: Site 1, Site 2, Endpoint 1@Site 1, INVITE Endpoint 2@Site 2, Proxy, Location Server, Client 2@Site 2; steps numbered 1 to 13)

[Figure: H.323 call flow between Endpoint 1, Gatekeeper, and Endpoint 2, with RAS, Q.931, and H.245 messages: Admission Request, Admission Confirm, Alerting, Connecting, Terminal Capability Set, Master/Slave Determination, Open Logical Channel, Media (RTP), Close Logical Channel, End Session Command, Release Complete, Disengage Request, Disengage Confirm, and their acknowledgements]
Standardization
The ITU-T, organized under the auspices of the United Nations, defines traditional telephony and H.323 standards. It is a slow-moving body with a highly political process. Participation in ITU-T activities is limited to paid members. Most ITU-T documents are written in very dense language, which makes it virtually impossible for the uninitiated to fathom their intent. Most ITU-T standards tend to be very complex. For example, the H.323 specification with its co-requisite protocols runs some 700 pages, compared to about 150 pages for SIP. The ITU-T specifications are not freely available and have to be purchased. As of February 2001, you could not even buy the H.323 specifications from the ITU-T bookstore because the ITU-T still had not made them available for purchase.

In contrast, the Internet standardization process is geared toward rapid innovation. It is an open and democratic process that draws architects from industry, academia, and government, as well as individuals who are experts in specific technology areas. All Internet specifications are available for free to anyone and can simply be downloaded from the Internet. Lastly, Internet standardization is rooted in proof of concept, i.e., a prototype implementation must exist for a standard to achieve approved status. The standards documents often include model code to document the standard. Additionally, the actual code implementing a prototype is almost always available on the Internet for free download and use.
can run on minimalist appliances. Simple Java applets can be developed in anywhere from a few minutes to a few hours. Key features of Java include: Network Orientation Java applications, called applets, run on thinclients. Java applets are network-aware and can open and access objects across the Internet via URLs. The Remote Method Invocation (RMI) feature of Java allows the building of distributed applications. RMI-based applications can connect to other Java applications as well as legacy applications. Java Naming and Directory Interface (JNDI) provides a unied interface to multiple heterogeneous naming and directory services including LDAP directories. JNDI enables seamless connectivity to these services. Developers can build powerful and portable directory-enabled Java applications using this industry-standard interface. Java Database Connector (JDBC) is an application programming interface (API) that provides crossDBMS connectivity to a wide range of SQL databases. Using JDBC, an application can establish connectivity with nearly any enterprise or service provider database from a Java-enabled phone. Java also features specications and supports products which can automate the process of distributing new versions of applications over the network. This includes Java Management Extensions (JMX), the specication, and Java Dynamic Management Kit (JDMK), Suns product which implements this specication. Powerful APIs for Telephony and Speech Applications Java has two APIs specially designed for telephony and speech applications: Java Telephony API (JTAPI) denes interface to access the following functional areas: call control, telephone physical device control,
28
TECHNOLOGY GUIDE
media services, and telephony administrative services. JTAPI functions can be used with both wired and wireless phones and its core functions can be extended to build applications such as call logging and tracking, auto-dialing, screen-based telephone applications, call routing applications, automated attendants, interactive Voice Response (IVR) systems call management center, voicemail, etc. Java Sound API (JSAPI) allows developers to incorporate speech technology into user interface for their Java applets and applications. This API species a cross-platform interface to support command and control recognizers, dictation systems and speech synthesizers. Security Java has a built-in security framework or sandbox that can protect basic phone operation like making and receiving calls from rogue or misbehaving applets. Java enables the construction of virus-free, tamper-free appliances like phones. It also incorporates authentication techniques based on public-key encryption. Javas security features also allow enterprises to control access to resources via policy-based permissions. Support for a Wide Variety of Devices and User Interfaces Java applets can run on virtually any platform due to their platform independence. A Java applet can be written once and run on virtually any operating system including cell phone OS, HP UX, IBM AIX, Palm OS, Sun Solaris, VxWorks, Microsoft Windows, and various other varieties of Unix and Linux systems. To enable a Java application to execute anywhere on the network, the Java compiler generates an architecture-neutral object le and the compiled code is executable on any
processor that is running a Java runtime environment. Consequently, a Java applet written for an IP phone appliance can run without modification on a PC-based softphone supporting Java.

Ease of Development

Sun makes developing applications quick and easy with the tools in its Java Development Kit. In addition, Java is supported by numerous tools, components, and applications that are available from many vendors. In fact, many are available for free on the Internet. These tools include application and user interface (UI) components, authoring and workflow tools, and integrated development environments. A wide variety of Java training options, ranging from classroom to web-based, are also available. Lastly, due to Java's tremendous popularity, Java software engineers are readily available on a permanent or contract basis to assist in development.
Personal Productivity Applications
Electronic business cards: send an enriched electronic virtual business card (vCard), including photo and audio file, automatically with every call as caller ID information (or selectively in the middle of a call). This information can be added to any personal contact database such as Microsoft Outlook, a corporate CRM, or a Supply Chain Management (SCM) database with the push of a button.

Presence and instant messaging: use an instant messenger service to determine when geographically distributed colleagues are available for a quick conference call with a customer. Simply click or automatically camp on your buddy list to create the conference call.

Call filters: have every call from that very important customer ring at every phone (business phone, cell phone, home phone, vacation phone, etc.). The call is completed to the first device on which the user picks up.

Phone book: use multiple phone books (corporate, personal, Internet, etc.) on the phone and simply point to an entry to make the call. The phone books can be synchronized with the data on a PC or any server.

Personalized music on-hold: play personalized announcements or music from a favorite MP3 recording or Internet radio station while callers are on hold.

Voice tag elimination: deliver customized messages to people trying to contact busy contacts and eliminate phone tag.
Automated conference calling: create conference call appointments in Microsoft Outlook. The application would automatically set up the conference call at the specified time.

Distinctive rings: play unique rings from any sound file based on caller ID or personal directory information. Separate rings could be set up for a boss, spouse, kids, or anyone else.
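
As a rough illustration of the distinctive rings idea, the sketch below maps caller ID values to sound files and falls back to a default ring. The table contents, file names, and the function name select_ring are all hypothetical; the Guide does not specify how a phone would store this mapping.

```python
# Hypothetical ring table: caller ID -> sound file. All entries are made up.
RING_BY_CALLER = {
    "sip:boss@bigcorp.com": "fanfare.wav",
    "sip:spouse@home.example": "chimes.wav",
}
DEFAULT_RING = "standard.wav"


def select_ring(caller_id: str) -> str:
    """Return the sound file for a caller, falling back to the default ring."""
    # Caller IDs are compared case-insensitively in this sketch.
    return RING_BY_CALLER.get(caller_id.lower(), DEFAULT_RING)
```

A personal directory lookup could be layered on the same way, by resolving the caller ID to a directory entry first and keying the table on that entry.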
phone to manage the bidding process and to track who raised a hand to bid first, etc.

Virtual call center ASP: support the integrated voice and data requirements of call center agents working from their homes.

Airline reservations: use a Java applet to visually display interactive voice response (IVR) options rather than forcing users to wait through very long recorded instructions and go through multi-level menus requiring the use of a telephone keypad.
Summary
The Web has revolutionized the world of business. Traditional telephony, however, cannot fulfill the needs of the emergent e-business model. The traditional telephony model is constrained by an inflexible and inefficient architecture based on centralized processing and the dumb terminal. This environment inhibits innovation, is nearly impossible to use, and simply perpetuates old, cumbersome services with limited functionality. IP telephony needs to embrace the Web architectural model in order to achieve rapid and cost-effective innovation. Old definitions of enhanced services and features do not come anywhere near even the simplest applications made possible by technologies such as SIP and Java. SIP, coupled with Java, can bring the same revolutionary innovations and mindset to the world of IP telephony that the Web has brought to IT and the data world.
GLOSSARY
API: Application Programming Interface, a set of programming functions and calls supported by a language or a software product. APIs are used by software developers to develop programs in a specific language or to enhance or extend the capabilities of a product.

ASN.1: Abstract Syntax Notation 1, an object-oriented language used by various architectures such as OSI, ITU-T, and SNMP to define objects including data structures.

ASP: Application Services Provider, a service provider that provides applications over a network for a usage-based fee.

CLASS: Custom Local Area Signaling Services, services such as caller ID and ring back provided by a telephone company. Devices in the telephone central office that provide such services are called CLASS switches.

CPU: Central Processing Unit, the arithmetic and logic unit in a computer. Examples include the Intel Pentium family, the AMD Athlon, and the IBM RISC processors.

CRM: Customer Relationship Management software, used with applications such as ACT or Goldmine to keep track of customer contacts and sales information.

H.323: An ITU-T specification for multimedia conferences over IP for LAN-attached stations. It is a peer-to-peer protocol, as opposed to MGCP and Megaco, which require central control.

HTTP: Hyper Text Transfer Protocol, used for encoding and transferring Web objects from Web servers to Web browsers.

IVR: Interactive Voice Response, a system used for generating voice prompts and menus and for accepting and processing user responses.

JTAPI: Java Telephony API, an extension to Java that provides telephony functions such as call control.

JSAPI: Java Speech API, an extension to Java that provides functions for controlling dictation systems and speech synthesizers.

JNDI: Java Naming and Directory Interface, an extension to Java that provides a unified interface to multiple naming and directory services.

Megaco: Media Gateway Control, a VoIP protocol jointly developed by ITU-T and IETF. It uses softswitches and gatekeepers for central control of calls and conferences.

MGCP: Media Gateway Control Protocol, a VoIP protocol developed by the IETF. It uses softswitches and gatekeepers for central control of calls and conferences.

MIME: Multipurpose Internet Mail Extensions, an Internet standard used for encapsulating e-mail messages in clear text.

PBX: Private Branch Exchange, a customer premise-based telephone switch for intra-campus and outside telephone calls.

PSTN: Public Switched Telephone Network, a general reference to telephone networks using circuit switching and time division multiplexing.

Q.931: An ITU-T call control protocol for ISDN, also used in H.323. It defines procedures for setting up and clearing calls.

RAS: Registration, Admission, and Status, a component of H.323 that defines procedures whereby users can register themselves with a gatekeeper as a preliminary step to setting up a call.

RMI: Remote Method Invocation, a component part of Java that allows building of distributed applications that can connect to other Java applications as well as legacy applications.

RTCP: RTP Control Protocol, the control protocol for RTP that allows multimedia session partners to monitor the quality of their sessions.

RTP: Real-time Transport Protocol, an IP standard for encapsulating multimedia streams for transmission over IP networks. It includes information such as packet timestamps to help implement quality of service for a session.

SCCP: Skinny Client Control Protocol, a Cisco proprietary protocol for voice over IP that uses central control with gatekeeper-like functions.

SCM: Supply Chain Management, used in reference to application programs used for managing purchases and suppliers.

SDP: Session Description Protocol, an IETF standard to advertise multimedia conferences. SDP is intended for describing multimedia sessions for the purposes of session announcement, session invitation, and other forms of multimedia session initiation.

SFA: Sales Force Automation, used in reference to application programs used for managing sales activities such as capturing customer contact information, generating contracts, and generating order forms.

SIP: Session Initiation Protocol, an IETF standard for peer-to-peer multimedia sessions and IP telephony. An alternative to the ITU-T H.323 protocol.

VoIP: Voice over IP, a general reference to several technologies and protocols that allow voice telephony implementation over IP networks. Examples of components and technologies that enable VoIP include codecs, IP PBXs, softswitches, gateways, H.323, SIP, MGCP, and Megaco.
APPENDIX A
[Figure 6: protocol diagram. Recoverable labels: Gopher, Kerb, SMTP, Telnet, FTP, SIP, SNMP, RPC, UDP]

In TCP/IP terminology, as shown in figure 6, SIP is an application-level protocol that runs over UDP but may also use TCP. SIP is based on existing and well-understood Internet protocols and extends them to support IP telephony.

SIP Concepts

Session
A SIP session is a multimedia session consisting of a set of multimedia senders and receivers and the data streams flowing from senders to receivers. The session is the basic building block in SIP. All calls and conferences are established by setting up sessions among users.

Conference
A conference is a multimedia session identified by a common session description. A conference can have zero or more members and includes the cases of a multicast conference, a full-mesh conference, and a two-party phone call, as well as combinations of these. Any number of calls can be used to create a conference.

Call
A call consists of all participants in a conference invited by a common source. A SIP call is identified by a globally unique call-ID.

Figure 7: SIP clients and servers

SIP Components

User Agent Clients and Servers
A user agent is a program that runs on a SIP device (e.g., the phone). It contains a client function and a server function. The user agent client (UAC) is a program that initiates SIP requests, such as initiating a call. A UAC is also known as the calling user agent. A user agent server (UAS) is a program that receives SIP requests, such as an incoming call, and sends back responses to those requests. A UAS is also known as the called user agent.
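
The Call concept described earlier in this appendix depends on a globally unique call-ID. One common way to get such an identifier is to combine a random local part with the host name of the calling device. The sketch below assumes that "random-id@host" shape; the function name new_call_id and the use of a UUID for the local part are illustrative choices, not something the Guide prescribes.

```python
import socket
import uuid


def new_call_id(host: str = "") -> str:
    """Build a call-ID of the form <random-id>@<host>.

    The random part is a 32-character hex UUID, so two calls started on the
    same device are extremely unlikely to share an identifier.
    """
    host = host or socket.gethostname()  # fall back to this machine's name
    return f"{uuid.uuid4().hex}@{host}"
```

A user agent client would generate one such value per call and carry it on every request and response belonging to that call.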
SIP Servers

Location Server
A location server is used to obtain information about a callee's possible location. A location is the IP address of the domain where a user is located. To locate a user, the name of the user is sent to the location server, and the location server returns zero or more locations (IP addresses or domains) where the callee may be found. If the caller already knows the IP address of the destination server, the caller can directly contact the callee's UAS.

Proxy Server
A proxy server is an intermediary program that acts as both a server and a client for the purpose of making requests on behalf of other clients. Requests are serviced internally by a proxy server or forwarded, possibly after translation, to other servers. A proxy interprets and, if necessary, rewrites a request message before forwarding it.

Redirect Server
A redirect server is a server that accepts a SIP request, maps the address into zero or more new addresses, and returns these addresses to the client. Unlike a proxy server, it does not initiate its own SIP requests. Unlike a user agent server, it does not accept calls.

Registrar
A registrar is a server that accepts REGISTER requests. A client uses the REGISTER request to let a proxy or redirect server know the location where the client can be reached. It provides a means whereby users can register their locations with a SIP server dynamically. As users move to different locations, they can register their new locations with the local location server. To supplement information obtained through user registrations, a location server may also use one or more TCP/IP protocols, such as finger, rwhois, LDAP, multicast-based protocols, or operating-system dependent mechanisms, to actively determine the end system where a user might be reachable.

SIP Addressing
SIP uses traditional Internet names as addresses, which consist of a user name and a domain name. This is important because it means that the existing Internet naming, addressing, and routing services can process SIP addresses without modification. Examples of SIP addresses include:

SIP:user01@bigcorp.com
SIP:user@25.16.10.8
SIP:1-212-555-1212@business.com

These addresses are similar to HTTP URL addresses except that they start with SIP instead of HTTP. The first example shows a user being identified via a typical e-mail address. The second example shows an address where the IP address of the destination is known. The last example shows how a phone number-like address could be used under SIP. The major advantages of this addressing scheme are:

It invents no new directory structure and can be processed by existing IP servers.
Users can use familiar e-mail or URL addresses to make phone calls and have one less thing to remember: the phone number.

Domain Name Services (DNS)
DNS is a standard Internet service that converts names, e.g., the bigcorp.com domain in user01@bigcorp.com, into IP addresses, e.g., 172.30.10.20, that can be used for finding user locations and routing calls. Because SIP uses standard IP naming and addressing, existing, standard DNS services can be used for SIP without any modification.
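
To make the SIP addressing scheme concrete, the sketch below splits an address of the form SIP:user@host into its user and host parts and reports whether the host is already an IP address or still needs a DNS lookup. It handles only the simple shape used in the examples in this appendix; the names SipAddress, parse_sip_address, and is_ip_literal are hypothetical, and real SIP addresses can carry ports and parameters that this sketch ignores.

```python
import ipaddress
from typing import NamedTuple


class SipAddress(NamedTuple):
    user: str
    host: str


def parse_sip_address(address: str) -> SipAddress:
    """Split 'SIP:user@host' into its user and host parts."""
    scheme, sep, rest = address.partition(":")
    if not sep or scheme.lower() != "sip":
        raise ValueError(f"not a SIP address: {address!r}")
    user, at, host = rest.rpartition("@")
    if not at or not user or not host:
        raise ValueError(f"expected user@host in {address!r}")
    return SipAddress(user, host.lower())


def is_ip_literal(host: str) -> bool:
    """True when the host is a dotted IPv4 address, so no DNS lookup is needed."""
    try:
        ipaddress.IPv4Address(host)
        return True
    except ValueError:
        return False
```

For a host such as bigcorp.com, the caller would hand the name to a standard DNS resolver; for 25.16.10.8, the request can be sent to that address directly.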
SIP Messages
SIP messages include SIP methods and responses to the methods. These are listed in tables 5 and 6.

SIP Message Encapsulation

MIME
Multipurpose Internet Mail Extensions (MIME) is the Internet standard for describing different types of content on the Internet, including video and image types. It is already used by HTTP for composing Web pages and by e-mail systems for encoding e-mail messages. SIP uses this well-established standard for encoding information, eliminating the need to invent a new technique for encoding voice and multimedia over the Internet.

SIP Call Setup
SIP is inherently capable of carrying voice, video, and multimedia calls. In the examples below, the setup flows remain the same irrespective of the type of call. These scenarios illustrate a call setup where the caller knows the name but not the IP address of the callee, necessitating the use of a SIP server. If the caller knew the IP address of the callee, the caller would not need services from the SIP servers. With the callee's destination IP address known, the caller's user agent client only needs to select the protocol (UDP by default), port (5060 by default), and IP address of the SIP user agent server to which the INVITE request should be sent.

A successful SIP call setup consists of two requests, an INVITE followed by an ACK. The INVITE request asks the callee to join a particular conference or establish a two-party conversation. It also includes information about the media types and formats that are allowed for the session. If the callee wishes to accept the call, it responds to the invitation by returning a similar description listing the media and format it wishes to use.
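
As an illustration of an INVITE that carries a media description, the sketch below assembles a deliberately simplified request as text. It is not a complete SIP message: headers that real implementations require, such as Via, are omitted, and the function name build_invite, the example addresses, and the SDP values (one audio stream on port 49170) are assumptions chosen for the example.

```python
# A minimal session description: one audio stream, RTP payload type 0.
# All values here are illustrative.
EXAMPLE_SDP = (
    "v=0\r\n"
    "o=user01 0 0 IN IP4 172.30.10.20\r\n"
    "s=call\r\n"
    "c=IN IP4 172.30.10.20\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)


def build_invite(caller: str, callee: str, call_id: str, sdp: str) -> str:
    """Return a simplified INVITE: start line, a few headers, blank line, body."""
    headers = [
        f"INVITE sip:{callee} SIP/2.0",
        f"From: <sip:{caller}>",
        f"To: <sip:{callee}>",
        f"Call-ID: {call_id}",
        "CSeq: 1 INVITE",
        "Content-Type: application/sdp",  # MIME type describing the body
        f"Content-Length: {len(sdp.encode('utf-8'))}",
    ]
    # Header lines end with CRLF; an empty line separates headers from body.
    return "\r\n".join(headers) + "\r\n\r\n" + sdp
```

The Content-Type header is where the MIME encapsulation described above shows up: the body is labeled as a session description, and the callee's answer would carry a similar body listing the media it accepts.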
When the callee sends a response to the INVITE request agreeing to participate in the call, the caller sends an ACK to confirm the callee's response.

Call Setup Using a Proxy Server
To initiate a SIP call, a caller first locates the appropriate proxy server and then sends a SIP invitation request to the proxy server. The location of the proxy server is locally configured on the user station. The proxy server can also be discovered automatically by the caller using a variety of mechanisms such as DHCP options, DNS SRV, and others. Instead of directly sending the call to the intended callee, the proxy server may redirect the SIP request or trigger a chain of new SIP requests to other proxies or location servers. Figure 5 shows detailed flows for SIP call setup using a proxy server. They are described below:

1. Endpoint1@Site1 sends an INVITE request for Endpoint2@Site2 to the proxy server.
2. The proxy server contacts the location service for Endpoint2.
3. The proxy server receives a more precise location for Endpoint2, Client2@Site2, from the location server.
4. The proxy server issues an INVITE request to the address(es) returned by the location service. The INVITE request carries a Call-ID. (Upon receiving the INVITE request, the called user agent alerts the user by generating a phone ring.)
5. The called user agent returns a 100 Trying response indicating that it is processing the INVITE request.
6. The called user agent returns a 200 OK response to indicate successful processing of the INVITE request.
7. The calling user agent sends an ACK to complete the handshake. The call is now in place.

Call Setup Using a Redirect Server
Again we assume that the IP address of the callee is not known to the caller's user agent, thereby necessitating the services of the local SIP server, a redirect server in this case. The key difference compared to the proxy server is that the redirect server cannot initiate an INVITE request.
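
The proxy-mode steps 1 to 7 listed earlier can be traced with a small in-memory simulation. This is a sketch under stated assumptions, not a SIP implementation: no network traffic is sent, the location data in LOCATIONS is made up, responses are shown going straight back to the caller without the relay hop through the proxy, and the function names are hypothetical.

```python
# Hypothetical location service data: public name -> precise locations.
LOCATIONS = {"Endpoint2@Site2": ["Client2@Site2"]}


def called_user_agent(request: dict) -> list:
    """UAS side: answer an INVITE with a provisional, then a final response."""
    if request["method"] != "INVITE":
        raise ValueError("this sketch only handles INVITE")
    return ["100 Trying", "200 OK"]


def proxy_invite(caller: str, callee: str, call_id: str, trace: list) -> bool:
    """Walk through the proxy-mode flow, recording each message in trace."""
    trace.append(f"{caller} -> proxy: INVITE {callee}")            # step 1
    targets = LOCATIONS.get(callee, [])                            # steps 2-3
    trace.append(f"proxy -> location service: {callee} -> {targets}")
    for target in targets:
        request = {"method": "INVITE", "to": target, "call_id": call_id}
        trace.append(f"proxy -> {target}: INVITE")                 # step 4
        for response in called_user_agent(request):                # steps 5-6
            trace.append(f"{target} -> {caller}: {response}")
            if response.startswith("200"):
                trace.append(f"{caller} -> {target}: ACK")         # step 7
                return True
    return False  # no location found, so no call is set up
```

Calling proxy_invite("Endpoint1@Site1", "Endpoint2@Site2", "1@Site1", trace) with an empty list fills the list with one line per message, which can be compared against the numbered steps.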
Figure 8: SIP Operation in Redirect Mode
[Diagram. Recoverable labels: Site 1 with Endpoint 1@Site 1; Site 2 with the redirect server, location server, and Endpoint 2; Site 3 with Client 2@Site 3. Messages shown: INVITE Endpoint 2@Site 2; 302 Moved Temporarily, Contact: Client 2@Site 3; Ack]

The flow of requests and responses for figure 8 is as follows:

1. Endpoint1@Site1 sends an INVITE request to the redirect server for Endpoint2@Site2.
2. The redirect server contacts the location server for location information about Endpoint2.
3. The location server returns information that this client can be found at Site3.
4. The redirect server forwards precise location information to the calling user agent using a 302 Moved Temporarily message: Contact Client2@Site3.
5. The calling user agent acknowledges the information with an ACK.
6. The calling user agent sends an INVITE request directly to the called user agent.
7. The called user agent returns a 100 Trying response indicating that it is processing the request.
8. The called user agent returns a 200 OK response to indicate successful processing of the INVITE request.
9. The calling user agent sends an ACK to complete the handshake. The call is now in place.
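
The redirect-mode flow can be sketched in the same in-memory style. The point of the example is the difference from proxy mode: the redirect server only answers with new addresses, and the caller sends the second INVITE itself. The location data, the function names, and the 404 Not Found reply for an unknown user are assumptions made for this sketch.

```python
# Hypothetical location server data: public name -> current contact addresses.
LOCATIONS = {"Endpoint2@Site2": ["Client2@Site3"]}


def redirect_server(request_uri: str) -> dict:
    """Map an address to zero or more new addresses; never forwards the INVITE."""
    contacts = LOCATIONS.get(request_uri, [])                  # steps 2-3
    if contacts:
        return {"status": "302 Moved Temporarily", "contact": contacts}
    return {"status": "404 Not Found", "contact": []}          # assumed reply


def call_via_redirect(caller: str, callee: str) -> list:
    """Return the message trace for a call set up through a redirect server."""
    trace = [f"{caller} -> redirect server: INVITE {callee}"]  # step 1
    reply = redirect_server(callee)
    trace.append(f"redirect server -> {caller}: {reply['status']}")  # step 4
    trace.append(f"{caller} -> redirect server: ACK")          # step 5
    if not reply["contact"]:
        return trace  # nothing to call; the attempt ends here
    target = reply["contact"][0]
    trace.append(f"{caller} -> {target}: INVITE")              # step 6
    trace.append(f"{target} -> {caller}: 100 Trying")          # step 7
    trace.append(f"{target} -> {caller}: 200 OK")              # step 8
    trace.append(f"{caller} -> {target}: ACK")                 # step 9
    return trace
```

Compared with a proxy, this design keeps the server stateless with respect to the call: once it has returned the 302 response, it takes no further part in the exchange.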
This Technology Guide is one in an ongoing series of over 100 solutions-focused Guides. These Guides assist IT professionals in making informed business decisions about specific aspects of technology development and strategic deployment. The Technology Guide Series offers a broad array of titles, each presenting objective information and practical guidance in a non-biased, easy-to-understand style and tone. Our editorial writing team has many years of experience in IT and communications technologies, and is highly conversant in today's emerging technologies. The Technology Guide Series and techguide.com are supported by a consortium of leading technology providers. The Sponsor has lent its support to produce and publish this Guide. This Guide, as well as the entire Technology Guide Series, is made available to view and print at no charge by visiting techguide.com.