0471345970

PART
One
Internet Standards
Email as we know it is useful only because it is interoperable. I can read the email you send me, no matter what kind of system you used to send it and no matter what kind of system I use to read it. As long as we all use software that adheres to the open standards, we can all get along just fine. The first part of this book first describes the scope of current Internet standards for email, messaging, and workgroup applications, and then continues by building a foundation for understanding what Internet standards are and how they work. Chapter 1, Internet Email Standards, examines why email and related technologies require standards while introducing the technologies themselves. Chapter 2, Internet Standards and Internet Protocols, examines Internet standards and Internet protocols. Chapter 3, Internet Standards Bodies, explains the organizations involved in creating Internet protocols and setting Internet standards. Chapter 4, The Internet Standards Process, describes the processes involved in building an Internet standard. Chapter 5, Getting the RFCs, provides guidance for finding Internet standards as they are described in Request for Comments (RFC) documents, and Chapter 6, Reading the RFCs, explains how to read and use RFCs.
CHAPTER
1
Internet Email Standards
Software and hardware vendors alike have been building and selling electronic messaging products (email) for decades. Not only do old-timers still talk about IBM's PROFS mainframe messaging system, but the venerable product is still widely used and supported. However, PROFS support often means buying a gateway that will allow PROFS users to exchange email with the rest of the world. Building an email system for a multiuser computer is easy. Building email systems that can handle messages from any system anywhere in the world is more difficult. For years, incompatibility and proprietary standards blocked progress on interoperable email. IBM's PROFS worked on mainframes using EBCDIC characters; other vendors' email products were equally proprietary. Each vendor would use its own standards for building messages, for setting up addresses, for designing message headers, and for handling the transmission of messages from one user to another and from one system to another. By the 1980s, more people started using email through online services such as CompuServe, dial-up bulletin board systems (BBSs), or email services such as MCI Mail. If you had accounts on several different systems, you'd have separate email mailboxes on each of those systems. You could receive personal email from other CompuServe members on your CompuServe account, business
Essential Email Standards: RFCs and Protocols Made Practical
email from your coworkers on your corporate account, and business email from your business partners, customers, and suppliers through your MCI Mail account. The obvious problem was the proliferation of accounts. If a key client used a different email system, you'd have to subscribe to that system too. If you started working with colleagues at a research facility across the country, you'd have to get a mainframe account and an email account on their systems. As so often happened in the network and computer industries, competitors attempted to corner the market for email by presenting their customers with proprietary products. "If you want to do email," they said, "you've got to use our product." Email vendors limited their users' ability to exchange email with others. The equivalent would be long-distance telephone companies telling their customers: "Here's a telephone. You can use it only to call people who use our service." Unless you could convince everyone you call to use the same long-distance service as you do, you would find your desk littered with different vendors' telephones, or you wouldn't use the phone as often. The situation was not really quite as dire as that, though. Vendors realized that users want to communicate with people who chose the "wrong" service, so they built gateways: systems that could properly forward email addressed to external email systems and accept incoming messages from those systems. A gateway must incorporate an accurate and complete understanding of message and address formats as well as character formats and appropriate message handling. That is, gateways must know how to translate addresses from one system's formats to the other's, how to translate message information from one system's formats to the other's, and where to send the messages once they are translated. Gateways work reasonably well when they link a relatively small number of systems and when all the linked systems are well documented.
They don't work well when there are a lot of different systems, nor do they work well if the implementers don't have access to all the details of every system's proprietary standards. As the number of gatewayed systems increases, the number of translation modules required increases much faster. For 2 systems, you need only 2 translation modules: one to translate messages from System A to System B, and a second to translate messages from System B to System A. With 3 systems, you need 6 translators; with 4 systems, you need 12; with 5 you need 20. The fact that different email vendors compete with each other makes things even more difficult. Not all vendors are equally willing to share their email designs. Furthermore, any time a vendor makes a change in its email standard, every system that gateways to that system must modify its translation modules. The entry of Internet standard email changed everything. Internet email was starting to make itself felt in the market as increasing numbers of academics, students, researchers, and networking professionals began using the Internet to
communicate. Suddenly, incompatible computers could exchange messages. Using Internet-standard email, users on IBM mainframes (properly equipped with standards-compliant email software, of course) could exchange email with personal computer users (also properly equipped with standards-compliant email software). This book is about those standards, and how they work to make email the lowest common denominator for Internet connectivity. You may not be able to surf the Web, you may not be able to download files or access remote systems with terminal emulation programs, but if you can send and receive Internet email then you can be considered "connected." At times, the term messaging is used instead of email. Internet messaging includes email as well as network news, which incorporates many of the same or similar standards. Also included under the messaging umbrella are collaborative applications that either support or are similar to other messaging applications.
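The gateway arithmetic given earlier in this chapter (2, 6, 12, and 20 translation modules for 2 through 5 systems) is what you get when every system needs one translator for each other system in each direction, that is, n × (n − 1) modules for n systems. The short Python sketch below reproduces those figures; the function name is an illustrative assumption and is not drawn from any standard.

```python
def pairwise_translators(systems: int) -> int:
    """Count one-way translation modules when every system gateways
    directly to every other system: one module per ordered pair."""
    if systems < 0:
        raise ValueError("systems must be non-negative")
    return systems * (systems - 1)


# Print the module count for 2 through 5 gatewayed systems.
for n in range(2, 6):
    print(n, "systems need", pairwise_translators(n), "translators")
```

The growth is quadratic, which is why gateways stop being practical once more than a handful of proprietary systems must be linked.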
Basic Email Requirements

Let's backtrack for a bit. What is email, exactly? Having been made the subject of a hit Hollywood movie, You've Got Mail, email has clearly made the transition from technology culture to popular culture. We all know what email is, pretty much. Here's my definition: Email represents all the systems and mechanisms by which a message entered into a network-connected device finds its way to a destination device. The way we normally speak about email encompasses the messages themselves, the systems that handle the delivery of the messages, the software that allows users to send and receive the email, and the specifications that define how those messages are formatted, addressed, sent, transmitted, and received. You've mastered email if you can understand how those five things (formatting, addressing, sending, transmitting, and receiving) work. Those five things are what the standards are all about, and what this book is all about. Those things work in specific ways for Internet email, but they don't have to work that way for all email systems. As far as the user is concerned, email means a piece of client software that somehow sends and receives messages. Through the use of that software, it is possible to enter a message and the address of the destination to which the message should be delivered. After the sender sends the message, it appears in the destination user's mailbox. The mailbox refers to the part of the client software that displays email messages, and it is also the part of the client software that allows the user to access and read messages that have been received. This is the way most end users experience email, whether it is proprietary email like Lotus's cc:Mail or open-standard Internet email. There can be significant differences, however, in what happens to the email between the moment the sender clicks the send button and the moment the recipient opens the message.
Closed-System Email
Traditional proprietary email systems were based on single systems with many users, so they were relatively easy to build. You just had to set up a message storage system and an application that would notify recipients when they received a new message. Users log on and are given access to read messages they received (or sent to others). The messages are all stored in a central repository. There is no need for networking beyond what is necessary to connect users to that repository. Messages never leave the central system. Figure 1.1 shows how this works. This approach to email has many advantages. It is simple to build and deploy. There is no need for complicated networking tasks relating to email. A mechanism for users to send and receive messages is required, but this can even be provided through a simple application built on the email server itself and accessed through a terminal session. An administrator can handle message backup for all users. Messages can be delivered instantaneously. However, the central server model has its drawbacks as well. The entire system has a single point of failure and when the server goes down, users have no access to any of their messages. Messages to recipients not in the system must be handled through other means. Unless old messages are expired and removed, central email systems can quickly fill as much disk space as you throw at them. Message retrieval performance can degrade as the message store increases in size.
Figure 1.1 Many proprietary email systems are built on central server models, obviating the need for complicated message-handling routines. [Diagram labels: Central Message Store; Message Server; User System (four)]
This model grew out of the multiuser system environment, but has been extended to networked personal computers. Lotus's old cc:Mail email system used a central store to provide email services to networked personal computers, with gatewaying services across LANs. This architecture has the drawback of increased network traffic for email activity. Because messages are all stored centrally, every time a message is retrieved it must be retransmitted across the LAN. Client software for such a system can be set to poll the server periodically (that is, automatically query it for new messages at a set interval), which also adds to network noise. Although such PC email systems may appear to allow the exchange of messages from one PC to another, they in fact simply put a client front end on the PC while retaining the workings of the email system on the server. In effect, end users are connecting to the server for the purpose of sending and receiving email. The only significant difference between PC email systems and mainframe-based email systems is that the smarts of the client reside on the PC rather than on the mainframe.
Internet Email
A different approach to email depends on a more inclusive model, where messages are exchanged not just between individual users of the same system. In this model, messages are actually transferred from a source system to a mail-forwarding system, which in turn delivers the messages to their destinations. Figure 1.2 shows how this works. Individual users send and receive email through the agency of user agents (UAs). These programs provide front ends for reading messages, and they are able to send and receive messages through the agency of message transfer agents (MTAs). MTAs have links with other MTAs and are able to forward messages through the network cloud shown in Figure 1.2. One might think that the aggregation of all linked MTAs can be viewed as the functional equivalent of a centralized
Figure 1.2 Internet email depends on two types of agents that can handle email. [Diagram labels: Message Transfer Agent; User Agent (six)]
email repository, but this is not the case. The MTAs don't necessarily retain messages, but may relay the messages only when UAs request them. The MTAs make it possible for messages to be transmitted across systems, and open standards make it possible for those messages to be properly interpreted by recipient systems no matter what type of system was used to originate the messages. Finally, MTAs forward messages but do not affect their contents (if they're working right, anyway).
Standards for Internet Messaging

Proprietary messaging is bad, open messaging is good, right? Maybe, maybe not. Whether it is bad or not, many millions of users rely on proprietary messaging systems of one kind or another, and many more millions rely on open-standard Internet messaging. Everyone wants to be able to communicate seamlessly and interoperably with everyone else. Adopting open standards doesn't necessarily have to mean implementing them natively; it can mean implementing whatever you want and then building modules that translate inbound standard messages into your proprietary formats and translate outbound proprietary messages into open standard formats. Using standard formats in this way greatly simplifies the task of building interoperable messaging gateways. Each proprietary email system implementer needs to build only its own pair of translators: one for inbound messages and the other for outbound messages. No one needs to bother with anyone else's proprietary formats, and everyone should be able to correctly interpret inbound open standard messages. This system works only if everyone understands the standards. The five general categories in which open standards are applied to Internet messaging are discussed below.
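The "one pair of translators per system" idea can be sketched in a few lines of Python. Everything here is hypothetical: the two proprietary formats (A and B), the dictionary used as the shared standard form, and all function names are invented for illustration and do not correspond to any real product or RFC.

```python
from typing import Callable, Dict

# Hypothetical shared "standard" form: a dict with three fixed keys.
Standard = Dict[str, str]


def a_to_standard(raw: str) -> Standard:
    """Outbound translator for hypothetical System A ("recipient|sender|body")."""
    recipient, sender, body = raw.split("|", 2)
    return {"sender": sender, "recipient": recipient, "body": body}


def standard_to_a(msg: Standard) -> str:
    """Inbound translator for hypothetical System A."""
    return "|".join([msg["recipient"], msg["sender"], msg["body"]])


def b_to_standard(raw: str) -> Standard:
    """Outbound translator for hypothetical System B (sender, recipient, body on lines)."""
    sender, recipient, body = raw.split("\n", 2)
    return {"sender": sender, "recipient": recipient, "body": body}


def standard_to_b(msg: Standard) -> str:
    """Inbound translator for hypothetical System B."""
    return "\n".join([msg["sender"], msg["recipient"], msg["body"]])


def relay(raw: str,
          outbound: Callable[[str], Standard],
          inbound: Callable[[Standard], str]) -> str:
    """Move a message between systems by way of the standard form only."""
    return inbound(outbound(raw))


def translators_with_standard(systems: int) -> int:
    """Each system supplies one inbound and one outbound translator."""
    return 2 * systems
```

Neither system's code mentions the other's format; adding a third system means writing two more functions, not four more gateways.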
Formatting and Message Headers

Rules specify precisely how postal mail must be packaged and addressed. The return address must be in the upper left corner of the envelope, and the destination address must be positioned correctly in the center of the envelope. Envelope size ranges are specified, and proper packaging for parcels is defined. To be interoperable, email messages must be at least as rigorously specified. The format of the data must be specified so as to avoid problems stemming from different data representation standards. The format of the messages themselves must be specified so that all systems that handle messages know where the message begins and ends. The format of the message handling information (the message headers) must be specified so that all systems know where to look to find the destination and source addresses and any other relevant information.
Character Representation
Vendors of operating systems and computer hardware are often bound by their history or corporate goals to specific data representation schemes. These different data representation schemes can make the goal of interoperability between different systems that much more difficult, and interoperable open standard email is possible only if the contents of the message can be transmitted unchanged from source to destination. Sometimes this means the data must be converted, as when the message originates on an IBM mainframe in EBCDIC and is destined for an ASCII-based system. We address character representation issues in Chapter 7, Messaging Standards.
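To see why conversion is needed, compare the bytes that the two schemes use for the same text. The sketch below relies on Python's built-in cp037 codec, which implements one common EBCDIC code page; treating cp037 as representative of what a given mainframe uses is an assumption, since several EBCDIC variants exist.

```python
def ebcdic_to_ascii(data: bytes, codepage: str = "cp037") -> bytes:
    """Decode EBCDIC bytes to text, then re-encode the text as ASCII."""
    return data.decode(codepage).encode("ascii")


text = "Hello"
as_ebcdic = text.encode("cp037")   # bytes as an EBCDIC system would store them
as_ascii = text.encode("ascii")    # bytes as an ASCII system would store them

print(as_ebcdic.hex(), as_ascii.hex())
print(ebcdic_to_ascii(as_ebcdic) == as_ascii)
```

The same five characters have entirely different byte values in the two encodings, so a message copied byte for byte between such systems would be unreadable without translation.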
Message Body and Attachments

Interoperable messaging requires that the message be easily distinguishable from other related information that travels with the message. There are always headers, as we see in the next section, but message attachments are another important feature of Internet messaging. Message attachments have long challenged implementers as well as users. Sending attachments to messages through centralized proprietary email systems is not always easy, and sending attachments across email gateways can sometimes be (or at least seem) hopeless. Determining where the message begins and ends is important, as is figuring out mechanisms for attaching non-character-based files. Although character-based files can be relatively easy to translate across system boundaries, binaries are more problematic. Some systems want to treat all data as character-based data and, as a result, can truncate bytes and change their meaning. Email implementers have attempted to solve this problem in many different ways over the years. These approaches are detailed in Chapter 9, Multipurpose Internet Mail Extensions (MIME).
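The damage that a character-oriented system can do to binary data is easy to demonstrate. The sketch below simulates a hypothetical relay that keeps only the low seven bits of each byte, then shows one widely used remedy, base64 encoding (the encoding MIME adopts), which represents arbitrary bytes using only ASCII characters. The sample payload and function name are invented for illustration.

```python
import base64


def strip_high_bit(data: bytes) -> bytes:
    """Simulate a 7-bit transport that discards the top bit of every byte."""
    return bytes(b & 0x7F for b in data)


payload = bytes([0x89, 0x50, 0x4E, 0x47, 0xFF, 0x00])  # arbitrary binary sample

damaged = strip_high_bit(payload)       # raw binary does not survive
encoded = base64.b64encode(payload)     # ASCII-only representation
survived = strip_high_bit(encoded)      # unchanged by the 7-bit transport
restored = base64.b64decode(survived)   # original bytes recovered

print(damaged != payload, restored == payload)
```

The cost of this safety is size: base64 output is roughly a third larger than the data it encodes.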
Message Headers
How to format message information necessary for delivery is another important issue related to message formatting. Somehow, all the systems involved in handling email must understand what they are supposed to do with the message. The most important piece of information is the message destination, but other bits of information are relevant to the delivery, handling, and response to the message. A minimal set of basic email functions is defined by what information is required and permitted in the headers. Closed-system incompatibilities often stem from differences in the way functions are supported in the headers. For example, return receipt deliveries (in which a message is returned to the sender when the recipient of a message receives and opens it) have long been a part of proprietary email systems, but they have long been missing from Internet standard email.
Open-standard messaging headers must include provisions for every piece of data necessary. They must also not include anything that all participating systems can't handle. All systems should be able to interpret, add to, modify, and respond to all headers as needed to deliver messages. Chapter 9, Multipurpose Internet Mail Extensions (MIME), examines the Internet standards for message headers.
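As a rough illustration of what a predictable header format buys you, the sketch below uses Python's standard email package to separate the headers from the body of a small message. The addresses and message text are made up, and the helper name parse_message is an assumption introduced here.

```python
from email import message_from_string
from typing import Dict, Tuple


def parse_message(raw: str) -> Tuple[Dict[str, str], str]:
    """Split a raw message into a header dictionary and a body string."""
    msg = message_from_string(raw)
    return dict(msg.items()), msg.get_payload()


raw_message = (
    "From: alice@example.org\n"
    "To: bob@example.net\n"
    "Subject: Lunch\n"
    "\n"                          # a blank line separates headers from body
    "Are you free at noon?\n"
)

headers, body = parse_message(raw_message)
print(headers["To"], "|", body.strip())
```

Because every system agrees on where the headers end and which field names carry the addresses, any of them can find the destination without knowing anything else about the sender's software.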
Email Addressing
Addressing conventions must be uniform or easily parsed by different systems, otherwise there is no way to interoperate. The ISO X.400 addressing standard is one attempt at a universal standard for electronic message addressing, and one that still has significance for the Internet. However, the familiar name@domainname.dom format was not always the only way to express email addresses. The standard, globally unique Internet email addressing evolved over time, embracing and eventually replacing competing and alternative addressing schemas. Interoperable address representations are not enough for global messaging; directory services are also an important part of any discussion of Internet email standards.
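A uniform address format is what makes mechanical parsing possible. The sketch below uses parseaddr from Python's email.utils module to pull the address out of a typical header value, then splits it at the last "@" into its mailbox and domain parts. The sample name, the sample address, and the helper split_address are illustrative assumptions.

```python
from email.utils import parseaddr
from typing import Tuple


def split_address(header_value: str) -> Tuple[str, str, str]:
    """Return (display name, local part, domain) for a single address."""
    display, addr = parseaddr(header_value)
    local, sep, domain = addr.rpartition("@")
    if not sep or not local or not domain:
        raise ValueError("not a name@domain address: %r" % header_value)
    return display, local, domain


print(split_address("Alice Example <alice@example.org>"))
```

The domain part tells the mail system where to send the message; the local part is meaningful only to the destination system.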
Email Transport
There are rules for formatting an email message, for creating and interpreting message headers, and for using email addresses. Once the message is correctly formatted, enclosed within its headers, and given an appropriate destination, it must still be sent from its originator and forwarded on to that destination. This is where things get more complicated, as a protocol defining how different systems are to deal with the task of getting a message from one place to another represents a higher level of complexity than merely defining what the message should look like. The transit of a message across the Internet from its source to its destination can be viewed as a single journey with three legs. First, the message must get from the UA to an MTA. From there, it must travel from one MTA to another until it arrives at an MTA that can deliver the message to its destination UA. Finally, the message actually arrives at a destination UA from the last MTA. Looked at in this way, it's possible to segment the journey into more manageable tasks. The protocols defining the rules for moving messages from source to destination are discussed in Chapter 10, Simple Mail Transfer Protocol, Chapter 12, Internet Message Access Protocol (IMAP), and Chapter 13, SMTP Message Address Resolution.
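The three-leg journey can be modeled without any networking at all. The sketch below simply lists the hops a message would take given a sending UA, a chain of MTAs, and a receiving UA. The leg labels ("submission", "relay", "delivery") and the host names are assumptions chosen for readability, not terms defined by the protocols.

```python
from typing import List, Tuple

Leg = Tuple[str, str, str]  # (kind of leg, from, to)


def plan_journey(sender_ua: str, mtas: List[str], recipient_ua: str) -> List[Leg]:
    """List each hop: UA to first MTA, MTA to MTA, last MTA to UA."""
    if not mtas:
        raise ValueError("at least one MTA is required")
    legs: List[Leg] = [("submission", sender_ua, mtas[0])]
    for current, following in zip(mtas, mtas[1:]):
        legs.append(("relay", current, following))
    legs.append(("delivery", mtas[-1], recipient_ua))
    return legs


for leg in plan_journey("ua-alice", ["mta-1", "mta-2", "mta-3"], "ua-bob"):
    print(leg)
```

Splitting the trip this way is what lets each leg be handled by a protocol suited to it, rather than by one protocol that must cover every case.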
Internet Messaging and Collaboration

Email is just a part of the Internet messaging scene. Network news is almost as ancient an application as email, and an entirely new category of application is growing up as Internet workgroup and collaboration protocols are being standardized and implemented. Network news uses the same standards for message headers and similar mechanisms for delivery as Internet email. Other collaborative tools, including calendaring and scheduling applications as well as identity exchange tools (virtual business cards), are also covered in this book as they are important adjuncts to the mainstay messaging applications. Chapter 14, Network News Transfer Protocol (NNTP), Chapter 15, vCard, and Chapter 16, Calendaring and Scheduling Standards, describe the current standards for these other types of Internet messaging.
Security Considerations
Although security was not always the top priority for early Internet engineers who were interested in solving networking problems, it certainly is a priority now. Hackers have routinely used email and related applications to wreak havoc around the world. Chapter 17, Internet Messaging Security, examines both the security flaws that attackers exploit and the specifications for Internet standard tools that can be used to protect against those attacks.
CHAPTER
2
Internet Standards and Internet Protocols
Many people consider Internet standards and Internet protocols almost magical. Although other standards may be more widely implemented, few are implemented in such a public way. Telecommunications protocols may affect more people, but few standards are so interoperably implemented by so many different implementers. So what exactly makes a protocol an Internet standard? And what exactly is an Internet protocol? As with so much else in life, these questions have two sets of answers. One set is simple, straightforward, and of limited practical usefulness. The other set, though more useful, is also far more involved. If you want the easy answers, you can find them in the next paragraph. If you want the useful answers, you'll have to read all the chapters in Part One of this book. An Internet protocol is a set of rules that specifies interaction between networked entities over the Internet or other TCP/IP networks. A protocol becomes an Internet standard if it is listed as such in the Internet standards document known as STD-1. RFC 2500 defined current Internet standards as of its publication date: June 1999. STD-1 is published approximately once every 100 RFCs and lists the status of all current RFCs. The complicated but useful answers require asking even more questions: What is an RFC? An STD? How are Internet protocols documented? What
other kinds of documents are relevant to Internet protocols? How does a protocol differ from an application? What are the steps that must be taken to create an Internet standard? What, exactly, is a protocol? Do all RFCs describe Internet standards? Do all RFCs describe protocols? Is there a simple list of current Internet standards? All these questions are answered in this chapter. Of course, the answers raise even more questions, which are answered in the coming chapters. Chapter 3, Internet Standards Bodies, shows where Internet standards come from. Chapter 4, The Internet Standards Process, examines how a protocol makes its way from being an idea to being an Internet standard. Chapter 5, Getting the RFCs, identifies where to find documentation of current and future Internet standards. Chapter 6, Reading the RFCs, tells you how to read and use RFCs and other related documents.
Internet Documents
The Request for Comments (RFC) represents the most important form Internet standards take and is the most often cited type of document when people speak of Internet standards. However, it is far from the only type of Internet standards-related document. RFCs represent an archive of all the wisdom of the Internet (as well as much else), from its very start in 1969. Not all RFCs are readily available. Many early RFCs never made it into electronic format and have been lost over time. However, all the current RFCs with any relevance to the modern Internet are available online. Several different types of RFCs exist, including several special RFC series. In this section, we define the different categories of Internet documents.
RFCs
Any definition of the RFCs should start with that offered in RFC 2026, The Internet Standards Process -- Revision 3 (BCP 9):
Each distinct version of an Internet standards-related specification is published as part of the "Request for Comments" (RFC) document series. This archival series is the official publication channel for Internet standards documents and other publications of the IESG, IAB, and Internet community. RFCs can be obtained from a number of Internet hosts using anonymous FTP, gopher, World Wide Web, and other Internet document-retrieval systems.
An RFC is simply a report, originally called a Request for Comments because researchers reported their own results, theories, and activities and solicited responses from other researchers through this mechanism. All Internet standards are published as RFCs, but not all RFCs document Internet standards. Publication of a document as an RFC may mean that it should be considered a standard, or it could simply mean that the RFC editor deemed it to be of interest or value to the Internet community. Once published, an RFC is frozen in time. It can never be edited, updated, revised, or changed in any way. There is never any question of which is the most recent version of a particular RFC. RFC 2500, cited above, will never change, though the official protocol standards of the Internet are likely to change. Any changes will be documented in an RFC also titled Internet Official Protocol Standards (or something very much like that), but with a higher RFC number (probably 2600). RFCs may be written by anyone: students, professors, researchers, employees of networking companies, employees of companies that use networking products, anyone. As long as the document has relevance for computer communications, is formatted appropriately, and is submitted according to the rules (to be discussed in Chapter 4), it stands a chance of being published as an RFC. RFCs may be reviewed prior to publication by the RFC editor, by Internet task forces, by one or more individual experts, or by anyone else the RFC editor deems appropriate, but RFCs are not refereed technical publications. When the author intends the document to specify an Internet standard, very specific steps must be taken to gain approval. These steps are detailed in Chapter 4.
STDs
The body of RFCs includes a few subsets of document series. Most important are the STDs (standards) documents. These are RFCs that document protocols that are considered to be Internet Standards with a capital S. The STD series clearly identifies the RFCs that document current Internet standards. An Internet standard protocol may have undergone several updates, revisions, or changes since it first was published as an RFC. The Internet STD series links specific protocols with static STD numbers. For example, the Simple Mail Transfer Protocol (SMTP) is an Internet standard and is described in STD-10. The most recent list of Internet standards identifies the STD-10 document as being RFC 821. Should an upgrade to SMTP be accepted as an Internet standard, STD-10 would no longer point to RFC 821, but rather to the new RFC that documents SMTP version 2 (be it called SMTP next generation or Complicated Mail Transfer Protocol, or whatever). STDs point at the current standards and provide a point of reference for anyone looking for the most current version of Internet standards. STDs document standards rather than single protocols. A standard that comprises more than one protocol may have an STD that comprises more than one RFC. For example, STD-5 describes the standard for the Internet Protocol (IP) and it points to six different RFCs: RFC 791, RFC 950, RFC 919, RFC 922, RFC 792, and RFC 1112. These RFCs describe not only the Internet Protocol but also IP subnetting, IP
broadcasting, IP broadcasting with subnets, the Internet Control Message Protocol (ICMP), and the Internet Group Management Protocol (IGMP), respectively. When a specification reaches full standard status, it is assigned an STD number. When a full standard becomes obsolete, its STD number is not reused but is no longer included in the pantheon of Internet standards. For example, STD-4, Gateway Requirements, was most recently documented in RFC 1009, Requirements for Internet Gateways, and was phased out as a standard in RFC 1800 in 1995. In that version of the Internet Standards document, the protocol referenced by STD-4 became historic and STD-4 was retired. We come back to STD documents later in this chapter.
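One way to picture the STD series is as a table of stable labels, each pointing at whichever RFCs currently document the standard. The Python sketch below models that behavior for the two examples discussed above, STD-10 and STD-4. The class StdIndex and its methods are invented for illustration, and the replacement RFC number 9999 is a placeholder, not a real assignment.

```python
from typing import Dict, List, Optional, Set


class StdIndex:
    """Toy model of STD numbers as stable pointers to current RFCs."""

    def __init__(self) -> None:
        self._current: Dict[int, List[int]] = {}
        self._retired: Set[int] = set()

    def assign(self, std: int, rfcs: List[int]) -> None:
        """Point an STD number at the RFCs that now document it."""
        if std in self._retired:
            raise ValueError("retired STD numbers are not reused")
        self._current[std] = list(rfcs)

    def retire(self, std: int) -> None:
        """Drop an obsolete standard while keeping its number off limits."""
        self._current.pop(std, None)
        self._retired.add(std)

    def lookup(self, std: int) -> Optional[List[int]]:
        """Return the current RFC list, or None if there is none."""
        return self._current.get(std)


index = StdIndex()
index.assign(10, [821])     # STD-10 (SMTP) points to RFC 821
index.assign(4, [1009])     # STD-4 (Gateway Requirements) pointed to RFC 1009
index.retire(4)             # STD-4 was retired
index.assign(10, [9999])    # hypothetical: a successor RFC replaces RFC 821
print(index.lookup(10), index.lookup(4))
```

The RFCs themselves never change; only the pointer moves, which is why a reader who wants the current SMTP standard can always start from STD-10.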
FYIs
In 1990, RFC 1150, F.Y.I. on F.Y.I.: Introduction to the F.Y.I. Notes, was published. The FYI documents described in RFC 1150 were intended to be a subset of the RFC document series:
The FYI series of notes is designed to provide Internet users with a central repository of information about any topics which relate to the Internet. FYIs topics may range from historical memos on "Why it was done this way" to answers to commonly asked operational questions.
The FYI document, which is something like a cross between a primer and a FAQ, was intended to answer questions rather than to describe a specific protocol. All FYIs are RFCs, though not all RFCs are FYIs. FYIs refer to specific topics and point at RFCs, but when one RFC becomes obsolete or is replaced by another newer document, the FYI number may remain the same while it points to the newer document. FYI 1 points to RFC 1150. FYI 2 points to RFC 1470, FYI on a Network Management Tool Catalog: Tools for Monitoring and Debugging TCP/IP Internets and Interconnected Devices. FYI 5 points to RFC 1178, Choosing a Name for Your Computer.
BCPs
Members of another series of RFCs are called Best Current Practice (BCP) documents. RFC 1818, Best Current Practices, describes the series as containing those documents that best describe current practices for the Internet community. The rationale behind creating a new series of documents was that, at the time (August 1995), there were only two types of RFCs: standards track RFCs and all other RFCs. The standards track RFCs are intended to document Internet standards, and documents are accepted into the standards track based on a very specific and rigorous process. The remaining RFCs consist of far less formal documents. These
RFCs have no formal review or quality control process, which means that publication as a nonstandards track RFC affords relatively little standing for a document's content. The Best Current Practices series provides the IETF with a mechanism to disseminate officially sanctioned technical information outside of protocol specifications. The sequence of review necessary for an RFC to be promoted to BCP status is similar to that required for an RFC to be promoted to an Internet standard, as we see in Chapter 4. While STDs describe protocols, BCPs describe other technical information that has been endorsed by the IETF. BCPs can refer to meta-issues relating to the Internet, such as BCP 9: RFC 2026, The Internet Standards Process -- Revision 3. This document describes the process by which a protocol becomes a standard. BCPs may also refer to deployment or implementation issues, such as BCP 5: RFC 1918, Address Allocation for Private Internets. This document provides guidelines for the efficient allocation of network addresses to avoid connectivity problems while at the same time conserving globally unique IP addresses, a depleted resource.
RTRs
RARE is the acronym for the Réseaux Associés pour la Recherche Européenne (Association of European Research Networks). Its purpose is to create a high-quality computer communications infrastructure for Europe, using Open Systems Interconnection (OSI) protocols as well as TCP/IP and related protocols. RARE Technical Reports (RTRs) are described in RFC 2151, A Primer on Internet and TCP/IP Tools and Utilities, as being published as RFCs in order to promote cooperation between RARE and the Internet effort. For example, RTR 6 refers to RFC 1506, A Tutorial on Gatewaying between X.400 and Internet Mail. RTRs often document issues related to interoperability between OSI and IP-related protocols.
Internet-Drafts
The documents that describe Internet standards as embodied in RFCs evolve over time and through many revisions before becoming RFCs, let alone Internet standards. Well before a standards-related specification is accepted as an RFC, it must start out as an Internet-Draft (I-D). As explained in RFC 2026, The Internet Standards Process -- Revision 3:
During the development of a specification, draft versions of the document are made available for informal review and comment by placing them in the IETF's "Internet-Drafts" directory, which is replicated on a number of Internet hosts. This makes an evolving working document readily available to a wide audience, facilitating the process of review and revision.
Unlike RFCs, which are intended to survive over time, unchanged and unchanging, I-Ds are meant to be temporary. They are working documents that are meant to be replaced once updated and forgotten when no longer useful. For example, all drafts must include an expiration date, and any published I-D that is not revised or accepted as an RFC after six months is simply removed from the Internet-Drafts directory. While RFCs are meant to be used as references, readers are warned not to use I-Ds as references. They have no formal status with the IETF. They are not archived, so references to specific versions of I-Ds cannot be used. Readers are warned not to refer to I-Ds in other published materials other than as being works in progress, and they are especially cautioned not to claim compliance with specific I-Ds for their products. We discuss I-Ds in more detail, particularly as they relate to the standards process, in Chapter 4.
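The six-month lifetime described above can be sketched as a short date calculation. This is a minimal illustration, not the procedure the Internet-Drafts directory actually runs: the function names are hypothetical, "six months" is read as six calendar months, and treating the expiration day itself as expired is an assumption.

```python
import calendar
from datetime import date


def add_months(start, months):
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month, so
    August 31 plus six months lands on the last day of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def draft_expiration(posted):
    """Expiration date of a draft posted on `posted` (assumed: six calendar months)."""
    return add_months(posted, 6)


def draft_is_expired(posted, today):
    """True if a draft that was never revised or published as an RFC has expired.

    Assumption: the draft counts as expired on the expiration date itself.
    """
    return today >= draft_expiration(posted)
```

A revised draft would simply be treated as a new posting with its own posting date, which restarts the six-month clock in this sketch.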
Internet Standards
One might easily believe that an RFC either documents or does not document an Internet standard, but it isn't quite that simple. First, a handful of fundamental standards such as STD-1 actually describe the rest of the Internet standards. Other standards in this category include the Assigned Numbers document, which lists all values that have special meaning to Internet standards, and the host and router requirements specifications.
Standards themselves have two special characteristics: state and status. A standard's state refers to its maturity level: It might be a proposed standard, a draft standard, or an actual standard. The standard's status refers to its requirements level: Is the protocol required, recommended, or elective?
The term Internet standard refers specifically to a protocol that is either already accepted as a full Internet standard or that is on the Internet standard track. To discover what protocols and what RFCs are standards or on the standards track, you consult STD-1. The most recent version of STD-1 (RFC 2500) lists not only all the current standards, but also the RFCs documenting draft standard and proposed standard protocols as well as informational and historic protocols. STD-1 contains lists of current STDs along with the RFCs linked to each STD. STD-1 also lists all Internet protocols by their maturity level, as described below. This document is the key to all the Internet standards: If you want to know which protocols are standards and where those standards are documented, you simply locate the current document referenced by STD-1. All other STDs are listed there.
STD-2 is the Assigned Numbers document, most recently published as RFC 1700. STD-2 includes the most important numbers to the Internet. For example, this document lists the values of well-known ports, reserved multicast addresses, and virtually any other values related to TCP/IP protocols. However, RFC 1700 was published in 1994 and is seriously out of date. The Internet Assigned Numbers Authority (IANA) has been publishing these values online, at www.iana.org/numbers.html. This will probably change as the IANA is replaced by the Internet Corporation for Assigned Names and Numbers (ICANN). Both IANA and ICANN, and the transition from one to the other, are discussed in Chapter 3.
Standards can be deprecated, meaning they are no longer considered standards. For example, between publication of RFC 2400 (September 1998) and publication of RFC 2500 (June 1999), STD-3, consisting of RFC 1122 and RFC 1123, was removed from the list of standards. These documents describe precisely what is expected from TCP/IP host implementations, and are now listed as Current Applicability Statements, meaning they describe the way Internet entities should behave.
As mentioned earlier, STD-4 for gateway requirements is no longer listed. The term gateway is no longer considered appropriate, and the new standard refers to IP version 4 routers. RFC 1812 replaces the chain of obsolete specifications for IPv4 routers (starting with RFC 1009, Standards Requirements for Internet Gateways), but the related standard, STD-4, has long been absent from the list of current standards. RFC 2500 lists RFC 1812 as a proposed standard and does not show an STD document for IPv4 router requirements. That specification may eventually be promoted to full standard status, at which point it will receive a higher STD number, or (more likely) it will be designated a Current Applicability Statement.
States: Standards Maturity Levels

STD-1 defines a series of levels describing a standard's maturity. There are six levels defined, along with suggestions for where and when they should actually be implemented:
Standard Protocol. This is a protocol that has been established as an official standard protocol for the Internet by the IESG. Standard protocols define how things should be done. In other words, if you are going to do Internet routing, you must use the Internet standard routing protocols; if you are doing Internet email, you must use the Internet standards for email. There should be no problems with interoperability if the protocol is implemented.
Draft Standard Protocol. A protocol that is under active consideration by the IESG to become a Standard Protocol is considered a draft standard. Draft standard protocols are likely to eventually be made standard. Wide implementation is desirable from the point of view of the standards bodies, as this provides a broader base for evaluating the protocol. Draft standards may be modified before being accepted as standards, and implementers must be prepared to accept and incorporate those changes.
Proposed Standard Protocol. A protocol being proposed for consideration as a standard sometime in the future by the IESG is called a Proposed Standard Protocol. These protocols need to be implemented and deployed in order to test them, but they are rarely accepted as standards without revisions.
Experimental Protocol. Protocols that are being used for experimentation or that are not related to operational services are considered experimental. If you are not in the experiment, you should not implement the experimental protocol, though the experiment will probably depend on all participants implementing the protocol. Experimental protocols can later be admitted to the standards track, at which time their maturity level would be changed.
Informational Protocol. Protocols that have been developed outside the Internet development community (for example, those developed as proprietary protocols or those developed by other standards bodies) may be documented as informational protocols. These specifications can be published as RFCs for the convenience of the Internet community. Examples already cited include the NFS protocol developed by Sun and the CyberCash payment protocol.
Historical Protocol. Historical protocols are no longer relevant, either because they have been superseded by newer versions or by newer alternative protocols or because there was not sufficient interest to advance them through the standards process. These protocols are unlikely to ever become standards.
Standards maturity levels depend on context. A group of network-specific standard protocols have been defined for link layer protocols. Obviously, STD-41, Internet Protocol on Ethernet Networks, will not be implemented on ATM networks. Likewise, there are relatively few full-fledged standard Internet protocols (see the section What's Standard, What's Not); however, quite a few draft and proposed standard protocols are widely implemented in popular commercial products.
For example, the very popular Dynamic Host Configuration Protocol (DHCP) is a draft standard, as is the Multipurpose Internet Mail Extensions (MIME) protocol. Furthermore, the Internet Message Access Protocol (IMAP) and the Hypertext Transfer Protocol (HTTP) are both still proposed standards.
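One way to keep the six maturity levels straight is to model them as an enumeration. The sketch below is illustrative only: the class and function names are hypothetical, the protocol examples are the four named in the text plus SMTP (listed as STD 10 in Table 2.1), and is_internet_standard follows the chapter's definition of the term (a full standard or anything on the standards track).

```python
from enum import Enum


class Maturity(Enum):
    """The six maturity levels that STD-1 defines, as described above."""
    STANDARD = "Standard Protocol"
    DRAFT = "Draft Standard Protocol"
    PROPOSED = "Proposed Standard Protocol"
    EXPERIMENTAL = "Experimental Protocol"
    INFORMATIONAL = "Informational Protocol"
    HISTORICAL = "Historical Protocol"


# Levels that make up the standards track.
STANDARDS_TRACK = {Maturity.PROPOSED, Maturity.DRAFT, Maturity.STANDARD}

# Examples as of RFC 2500, taken from the text and Table 2.1.
EXAMPLES = {
    "SMTP": Maturity.STANDARD,
    "DHCP": Maturity.DRAFT,
    "MIME": Maturity.DRAFT,
    "IMAP": Maturity.PROPOSED,
    "HTTP": Maturity.PROPOSED,
}


def is_internet_standard(level):
    """True for a full standard or any protocol on the standards track."""
    return level in STANDARDS_TRACK


def on_track_but_not_full(examples):
    """Names of protocols that are on the standards track but not yet full standards."""
    return sorted(
        name for name, level in examples.items()
        if is_internet_standard(level) and level is not Maturity.STANDARD
    )
```

Calling on_track_but_not_full(EXAMPLES) picks out the widely deployed protocols that had not reached full standard status when RFC 2500 was published.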
Status: Standards Requirements Levels

Up until RFC 2400, STD-1 defined a protocol's status as its requirements level. These levels provided guidance as to whether the protocol should be implemented and included the following:
Required Protocol. Systems must implement required protocols.
Recommended Protocol. Systems should implement recommended protocols.
Elective Protocol. Systems may choose whether to implement elective protocols. However, if a system will be implementing a protocol of this type, it must implement exactly this protocol. Multiple elective protocols are often offered for general areas, such as routing or email.
Limited Use Protocol. Protocols may be limited due to the fact that they are experimental, provide limited functionality, or lack current relevance.
Not Recommended Protocol. Some protocols are considered not recommended for general use. They may have limited functionality, lack current relevance, be designed for special purposes, or be experimental.
To put the requirements levels into perspective, a system that implemented only the required protocols would probably be able to do little more than be visible on an IP network. Upper layer protocols such as the Transmission Control Protocol (TCP) and the User Datagram Protocol (UDP) were recommended but not required. Such a minimal host would be able to do little more than respond to most network requests with error messages. Implementing all the recommended protocols would improve the situation to the point that such a host would be usable for most simple and typical network services. However, these distinctions have been removed as RFC 2500 defines RFCs simply by maturity level.
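The historical requirements levels can be sketched the same way, as an enumeration plus a filter that shows why a "required only" host is so limited. Everything here is illustrative: the names are hypothetical, TCP and UDP are marked recommended as the text states, and the entry marking IP as required is an assumption added for the example rather than something stated above.

```python
from enum import Enum


class Requirement(Enum):
    """Requirements levels used by STD-1 up until RFC 2400."""
    REQUIRED = "must implement"
    RECOMMENDED = "should implement"
    ELECTIVE = "may implement"
    LIMITED_USE = "limited use"
    NOT_RECOMMENDED = "not recommended"


# Hypothetical catalog. TCP and UDP follow the text; IP is an assumption.
CATALOG = {
    "IP": Requirement.REQUIRED,
    "TCP": Requirement.RECOMMENDED,
    "UDP": Requirement.RECOMMENDED,
}


def protocols_for_host(catalog, include_recommended=False):
    """List the protocols a host implements under a given policy."""
    wanted = {Requirement.REQUIRED}
    if include_recommended:
        wanted.add(Requirement.RECOMMENDED)
    return sorted(name for name, level in catalog.items() if level in wanted)
```

With include_recommended left at False the host in this sketch has no transport protocol at all, which matches the chapter's point that such a system could do little more than be visible on the network.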
Internet Nonstandards
Although roughly 2,500 different RFCs have been published, most are not currently relevant to Internet standards. Some RFCs document protocols that are now obsolete, such as the Simple File Transfer Protocol (SFTP) documented in RFC 913. These protocols may once have been considered useful, but are no longer. These protocols are considered historical protocols because they are of interest only for historical purposes and are not intended to be implemented on current systems. Some RFCs describe protocols that are proprietary and are considered to be informational protocols. These include documents such as RFC 1898, CyberCash Credit Card Protocol Version 0.8, or RFC 1813, NFS Version 3 Protocol Specification, which documents Sun Microsystems Inc.'s Network File System. These protocols are documented for different reasons, though usually to provide information to the community about the work being done by the owner of the protocol. For example, Sun's NFS protocol, while not an Internet standard, is certainly an important protocol and is documented so that others can write applications that are compatible with NFS. Some RFCs are purely informational and do not document actual protocols. They may summarize meetings or describe approaches to specific networking
problems taken by the author(s). Most informational RFCs are intended to provide important information or to raise important questions. One subset of informational RFCs includes April Fools documents, published on April 1 of each year and conforming strictly to the RFC format. One of the best-known examples is RFC 1149, A Standard for the Transmission of IP Datagrams on Avian Carriers, published April 1, 1990. The earliest example I found is RFC 748, TELNET RANDOMLY-LOSE Option, published in 1978.
What's Standard, What's Not

The reader is directed to STD-1 for a complete survey of Internet standards, draft standards, proposed standards, and other protocols. Tables 2.1 and 2.2 list the current Internet standards and current network-specific standards, as they appear in RFC 2500.
Table 2.1 Internet Standards as Defined by RFC 2500 (STD-1)
PROTOCOL     NAME                                     RFC        STD
-            Internet Official Protocol Standards     2500       1
-            Assigned Numbers                         1700       2
IP           Internet Protocol, as amended by:        791        5
-            IP Subnet Extension                      950        5
-            IP Broadcast Datagrams                   919        5
-            IP Broadcast Datagrams with Subnets      922        5
ICMP         Internet Control Message Protocol        792        5
IGMP         Internet Group Multicast Protocol        1112       5
UDP          User Datagram Protocol                   768        6
TCP          Transmission Control Protocol            793        7
TELNET       Telnet Protocol                          854,855    8
FTP          File Transfer Protocol                   959        9
SMTP         Simple Mail Transfer Protocol            821        10
SMTP-SIZE    SMTP Service Ext for Message Size        1870       10
SMTP-EXT     SMTP Service Extensions                  1869       10
MAIL         Format of Electronic Mail Messages       822        11
NTPV2        Network Time Protocol (Version 2)        1119       12
DOMAIN       Domain Name System                       1034,1035  13
DNS-MX       Mail Routing and the Domain System       974        14
SNMP         Simple Network Management Protocol       1157       15
SMI          Structure of Management Information      1155       16
Concise-MIB  Concise MIB Definitions                  1212       16
MIB-II       Management Information Base-II           1213       17
NETBIOS      NetBIOS Service Protocols                1001,1002  19
ECHO         Echo Protocol                            862        20
DISCARD      Discard Protocol                         863        21
CHARGEN      Character Generator Protocol             864        22
QUOTE        Quote of the Day Protocol                865        23
USERS        Active Users Protocol                    866        24
DAYTIME      Daytime Protocol                         867        25
TIME         Time Server Protocol                     868        26
TOPT-BIN     Binary Transmission                      856        27
TOPT-ECHO    Echo                                     857        28
TOPT-SUPP    Suppress Go Ahead                        858        29
TOPT-STAT    Status                                   859        30
TOPT-TIM     Timing Mark                              860        31
TOPT-EXTOP   Extended-Options-List                    861        32
TFTP         Trivial File Transfer Protocol           1350       33
TP-TCP       ISO Transport Service on top of the TCP  1006       35
ETHER-MIB    Ethernet MIB                             1643       50
PPP          Point-to-Point Protocol (PPP)            1661       51
PPP-HDLC     PPP in HDLC Framing                      1662       51
IP-SMDS      IP Datagrams over the SMDS Service       1209       52
POP3         Post Office Protocol, Version 3          1939       53
OSPF2        Open Shortest Path First Routing V2      2328       54
IP-FR        Multiprotocol over Frame Relay           2427       55
RIP2         RIP Version 2-Carrying Additional Info.  2453       56
RIP2-APP     RIP Version 2 Protocol App. Statement    1722       57
SMIv2        Structure of Management Information v2   2578       58
CONV-MIB     Textual Conventions for SNMPv2           2579       58
CONF-MIB     Conformance Statements for SNMPv2        2580       58
Table 2.2 Network-specific Draft, Proposed, and Standard Protocols, as Defined by RFC 2500 (STD-1)
PROTOCOL     NAME                                      STATUS  RFC      STD
IP-ATM       Classical IP and ARP over ATM             Prop    2225
ATM-ENCAP    Multiprotocol Encapsulation over ATM      Prop    1483
IP-TR-MC     IP Multicast over Token-Ring LANs         Prop    1469
IP-FDDI      Transmission of IP and ARP over FDDI Net  Std     1390     36
IP-X.25      X.25 and ISDN in the Packet Mode          Draft   1356
ARP          Address Resolution Protocol               Std     826      37
RARP         A Reverse Address Resolution Protocol     Std     903      38
IP-ARPA      Internet Protocol on ARPANET              Std     BBN1822  39
IP-WB        Internet Protocol on Wideband Network     Std     907      40
IP-E         Internet Protocol on Ethernet Networks    Std     894      41
IP-EE        Internet Protocol on Exp. Ethernet Nets   Std     895      42
IP-IEEE      Internet Protocol on IEEE 802             Std     1042     43
IP-DC        Internet Protocol on DC Networks          Std     891      44
IP-HC        Internet Protocol on Hyperchannel         Std     1044     45
IP-ARC       Transmitting IP Traffic over ARCNET Nets  Std     1201     46
IP-SLIP      Transmission of IP over Serial Lines      Std     1055     47
IP-NETBIOS   Transmission of IP over NETBIOS           Std     1088     48
IP-IPX       Transmission of 802.2 over IPX Networks   Std     1132     49
IP-HIPPI     IP over HIPPI                             Draft   2067
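Because one STD number can cover several RFCs, looking up a standard is really a grouping operation. The following sketch groups a handful of rows copied from Table 2.1 by STD number. The names ROWS and rfcs_by_std are hypothetical, and the rows are only a mail-related sample, not the full table.

```python
from collections import defaultdict

# (protocol, RFC numbers, STD number): a sample of rows from Table 2.1.
ROWS = [
    ("SMTP", [821], 10),
    ("SMTP-SIZE", [1870], 10),
    ("SMTP-EXT", [1869], 10),
    ("MAIL", [822], 11),
    ("DOMAIN", [1034, 1035], 13),
    ("DNS-MX", [974], 14),
    ("POP3", [1939], 53),
]


def rfcs_by_std(rows):
    """Group RFC numbers under the STD number that each row carries."""
    index = defaultdict(list)
    for _protocol, rfcs, std in rows:
        index[std].extend(rfcs)
    return dict(index)
```

In this sample, STD 10 collects three RFCs (the base SMTP specification and two service extensions), which is why citing the STD number alone does not pin down a single document.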
Reading List
Table 2.3 contains some RFCs that elaborate on the information presented in this chapter. For the most current assigned numbers, check out the Current Assigned Numbers Web site at www.iana.org/numbers.html. Another good resource is the Internet Mail Consortium's (IMC) IETF Novice's Guide, at www.imc.org/novice-ietf.html.
Table 2.3 Relevant RFCs
RFC        TITLE and DESCRIPTION
RFC 2500   Internet Official Protocol Standards. This is the current incarnation of Internet STD-1 and includes complete information about Internet standards current when the RFC was published.
RFC 1700   Assigned Numbers. This is the most recent publication of the assigned numbers document. It documents assigned numbers that were current when the RFC was published.
RFC 1150   F.Y.I. on F.Y.I.: Introduction to the F.Y.I. Notes. This RFC explains what the F.Y.I. series of documents is all about.
RFC 1818   Best Current Practices. This RFC explains the best current practices series.
RFC 2026   The Internet Standards Process -- Revision 3. This RFC explains how specifications become Internet standards. We return to cover the material in this RFC in depth in Chapter 4.
CHAPTER
3
Internet Standards Bodies
A regular alphabet-soup of standards bodies guide, cajole, steer, and engineer standards into existence. Learning what each group does, how each group relates to the other groups, and how the groups are involved in the standards development process will help you to understand how Internet standards work. With this understanding you will be better equipped to track the standards process and make appropriate decisions about how to use those standards in your organization and products. Some Internet standards bodies have been documented in RFCs; others make their charters available on the Internet through their Web sites. Still other standards bodies are not, strictly speaking, part of the Internet standards process, but their work affects Internet standards in some way or other. This chapter introduces the most important players in the standards process, starting with Internet groups and followed by introductions to other important standards groups. The end of the chapter has references to relevant RFCs as well as URLs pointing to organizational Web sites. The organizations that are involved in the Internet standards process are highly interrelated and interdependent. It is almost impossible to talk about one of them without making reference to one or more of the others. Figure 3.1 shows a simplified organizational chart that displays the relationships among the bodies that are important to the creation of Internet standards. Each of these bodies is explained in this chapter.
[Organizational chart; boxes: Internet Society (ISOC), Internet Architecture Board, IRSG, IRTF, IANA, ICANN, RFC Editor, IESG, IETF]
Figure 3.1 A simple organizational chart showing the links among the primary bodies involved in the development of Internet standards.
The IAB
The Internet Architecture Board (IAB), which was originally called the Internet Activities Board when it was first set up in 1983, did not begin publishing its activities until 1990, so much of its origins are misted by time and memory. IAB chair Brian Carpenter wrote an overview of the IAB in 1996, called What Does the IAB Do, Anyway? (available online at www.iab.org/connexions.html). RFC 1160, published in 1990, provides an early history and description of the IAB. The IAB charter is documented in RFC 1601. These documents form the basis of this section, which details the IAB and what it does.
IAB History
According to RFC 1160, Internet research during the 1970s slowly grew to the point where it became necessary to form a committee that could guide development of the protocol suite. This committee was called the Internet Configuration Control Board (ICCB). In January 1983, the Defense Communications Agency declared the TCP/IP protocol suite to be the standard for the Advanced Research Projects Agency network, also known as the ARPANET. The Defense Communications Agency was the organization within the U.S. government responsible for operation of the ARPANET, which later evolved
into the Internet. Later in 1983, DARPA reorganized the ICCB and renamed it the Internet Activities Board. As of 1990, the IAB had only two important task forces, the Internet Engineering Task Force (IETF) and the Internet Research Task Force (IRTF), both of which were established in 1986. Each task force is led by a chairman and guided by a steering group: the Internet Engineering Steering Group (IESG) for the IETF, and the Internet Research Steering Group (IRSG) for the IRTF. Most of the work of the task forces is carried out by working groups (WGs) set up for specific programs or topics. In 1992, the IAB was reconstituted as a component of the Internet Society (ISOC), and its name was changed from the Internet Activities Board to the Internet Architecture Board. We discuss the Internet Society and the other organizational components mentioned in the charter, including the IETF, the IESG, the IRTF, and the IRSG, at greater length later in this chapter.
IAB Charter
The charter, published as RFC 1601, is a good place to start to understand what the IAB is and what function it fulfills. We begin by outlining the IAB's functions. According to RFC 1601, the IAB's responsibilities are:
1. Selection of Internet Engineering Steering Group (IESG) members. The charter calls for a fair degree of unanimity, requiring at least eight votes in favor of a successful nominee and no more than one vote against the nominee.
2. Provide architectural oversight for Internet protocols and procedures. An important function of the IAB is long-range planning. The charter calls for the IAB to track the important long-term issues relevant to the Internet and to make sure that the groups that should address the issues are made aware of those issues. The IAB is responsible for organizing the Internet Research Task Force (IRTF) as part of its architectural oversight function.
3. Provide oversight to the Internet standards process as well as provide an appeals board for complaints about that process. The IAB, with the participation of the IESG, defines how that process is to unfold and also documents that process.
4. Manage and publish the RFC document series and administer the Internet assigned numbers. It is up to the IAB to select an RFC editor (Jonathan B. Postel, Ph.D., was RFC editor until his untimely passing in October 1998). The RFC editor is responsible for the editorial management and publication of the RFC series. According to its charter, the IAB is also responsible for designating an Internet Assigned Numbers Authority (IANA) to administer the assignment of Internet protocol numbers. Jon Postel also served as the IANA, and this function will pass to the ICANN.
5. Act on behalf of the Internet Society as liaison with other organizations that are concerned with global Internet standards, technologies, and organizational issues. Some of the entities the IAB liaises with include the U.S. Federal Networking Council (FNC); various organs of the European Commission (EC); the Coordinating Committee for Intercontinental Research Networking (CCIRN); standards bodies such as the International Organization for Standardization (ISO), the International Electrotechnical Commission (IEC), and the International Telecommunication Union (ITU); and other professional societies such as the Institute of Electrical and Electronics Engineers (IEEE) and the Association for Computing Machinery (ACM).
6. Provide advice to the Internet Society, guiding the trustees and offices of the Internet Society on technologies, architecture, procedures, and, where appropriate, policy matters that relate to the Internet and related technologies. Where necessary, the IAB can call together expert panels, hold hearings, or use other methods to investigate questions or issues raised by the Internet Society.
The IAB is made up of 13 voting members: the IETF chair and 12 full members. The IETF chair is also the IESG chair and gets a vote on all IAB actions with one exception: the approval of IESG members. Full IAB members serve for two years and are permitted to serve for any number of terms. Although IAB members may have day jobs, they must act as individuals on the board and not as representatives of any employer. The IETF, through a nominating committee, nominates IAB members. The Internet Society Board of Trustees votes on the nominees for IAB membership. The IAB chair is voted on by the 12 sitting IAB members. The charter states that normally six new full members are nominated each year. The charter also specifies who is eligible for the nominating committee.
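The voting rule for IESG nominees described above (at least eight votes in favor, no more than one against) is simple enough to state as a predicate. This is a minimal sketch with hypothetical names. The cap of 12 ballots follows from the text, since the IETF chair does not vote on IESG approvals, but treating abstentions as simply uncounted is an assumption.

```python
MIN_VOTES_FOR = 8       # "at least eight votes in favor"
MAX_VOTES_AGAINST = 1   # "no more than one vote against"
ELIGIBLE_VOTERS = 12    # full IAB members; the IETF chair does not vote here


def iesg_nominee_approved(votes_for, votes_against):
    """Apply the approval rule described above to a tally of IAB votes.

    Abstentions are not counted (an assumption of this sketch).
    """
    if votes_for < 0 or votes_against < 0:
        raise ValueError("vote counts cannot be negative")
    if votes_for + votes_against > ELIGIBLE_VOTERS:
        raise ValueError("more votes cast than eligible voters")
    return votes_for >= MIN_VOTES_FOR and votes_against <= MAX_VOTES_AGAINST
```

Under this rule a tally of 10 in favor and 2 against fails even though it is a large majority, which is what the charter's "fair degree of unanimity" amounts to.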
How the IAB Works

The IAB usually meets about once a month through a telephone conference, according to Brian Carpenter. These meetings usually run about two hours each and are scheduled to allow members from all parts of the world to participate, though not always without some inconvenient scheduling. Physical meetings occur three times a year at IETF meetings, at which the IAB also holds an open meeting that allows any IETF member to raise issues directly with the IAB. As we see later when we discuss the actual standards process, the IAB itself does not drive the technical work so much as oversee and guide it. This means
that during IAB meetings, action is not necessarily taken on specific standards or protocols. More often, apart from the usual administrivia of reviewing action lists, the IAB attempts to strategize in depth on one or two important issues and come up with some result that can be passed along to the relevant entities: the IESG, IETF working groups, or the public, through an RFC. Carpenter gives some examples of issues that were raised during IAB meetings held during the second half of 1995, including:
- The future of Internet addressing
- Architectural principles of the Internet
- Future goals and directions for the IETF
- Management of top-level domains in the Domain Name System
- Registration of MIME types
- International character sets
- Charging for addresses
- Tools needed for renumbering
Rather than attempting to come up with solutions to the issues that are raised, the IAB's aim is either to get the IESG to take action or to stimulate the IETF community to address the issues. When the IAB publishes RFCs or Internet-Drafts, they are in the form of statements or viewpoints rather than actual proposals for new or modified standards. The IAB can also initiate workshops or panels that operate outside the standards process but that are intended to incubate ideas in specific areas. Carpenter cites workshops held on security, which is documented in RFC 1636, and on information infrastructure, which is documented in RFC 1862. The IAB may also initiate the formation of research groups under the aegis of the IRTF. The research groups are not intended to generate standards-track proposals; unlike the workshops or panels, however, research groups are intended to persist over time. According to Carpenter, in between meetings, IAB members keep track of relevant IETF and IESG activities through email lists and by commenting on draft charters of new working groups, reviewing documents that are in the last stages of getting approval, and generally helping out when difficulties arise with working groups. Carpenter also makes clear what the IAB is not: The IETF, not the IAB, is the standards body; the IAB is drawn from the IETF. The IAB is mostly an advisory board and has minimal input to policy issues for the Internet. The IAB might decide it is important that work be done on some kind of standard, but it cannot specify where and whether that standard must be applied. In practice, though, the boundaries between the IETF, the IESG, and the IAB are blurred, and those borders are not strictly patrolled but rather used as guidelines for action.
The Internet Society

The Internet Society, also known as ISOC, was announced in 1991 and born as an organization in January 1992. It is "the international organization for global cooperation and coordination for the Internet and its internetworking technologies and applications," according to the FAQ page on the ISOC Web site. It is a not-for-profit organization with tax-deductible status based in Reston, Virginia. Though it boasts a membership of individuals and organizations representing all segments of the global Internet community, as of early 1999 it claimed only about 7,000 members worldwide. ISOC's mission statement is "To assure the beneficial, open evolution of the global Internet and its related internetworking technologies through leadership in standards, issues, and education." The Internet Society mission continues:
Since 1992, the Internet Society has served as the international organization for global coordination and cooperation on the Internet, promoting and maintaining a broad spectrum of activities focused on the Internet's development, availability, and associated technologies.
The Internet Society acts not only as a global clearinghouse for Internet information and education but also as a facilitator and coordinator of Internet-related initiatives around the world. Through its annual International Networking (INET) conference and other sponsored events, developing-country training workshops, tutorials, statistical and market research, publications, public policy and trade activities, regional and local chapters, standardization activities, committees, and an international secretariat, the Internet Society serves the needs of the growing global Internet community. From commerce to education to social issues, its goal is to enhance the availability and utility of the Internet on the widest possible scale. In terms of the number of individuals and organizations affected, the Internet Society's most important activities are those related to Internet standards. The Internet Society was founded, in part, to provide an ongoing source of organizational and financial support for the IETF and other related bodies. By the early 1990s, it was apparent that the involvement of the U.S. government as the primary supporter of Internet activities could not be sustained. To grow, the Internet had to move from being a research and academic tool to being a medium for commercial development, and it was clear that the U.S. government would eventually stop funding the Internet. In addition to funding the IAB, IETF, and other related groups, the Internet Society's board of trustees, consisting of 15 Internet deities, is responsible for approving IAB members who have been nominated by the IETF nominating committee. Chapter 4, The Internet Standards Process, outlines how the Internet Society participates in the Internet standards process.
The IETF and IESG

It may seem that the IETF would be a formal organization with membership lists, formal structure, and activities. However, this is not the case. As is explained in RFC 1718, The Tao of the IETF, the IETF is open to anyone who shows up. According to RFC 1718, the Internet Engineering Task Force is a loosely self-organized group of people who make technical and other contributions to the engineering and evolution of the Internet and its technologies. You can participate in person at any of the three meetings held each year, or you can participate through IETF working groups and their mailing lists. The individuals who participate in the IETF include network designers, operators, vendors, researchers, and anyone else with an interest in the development of the Internet and its protocols and architecture.
Within the IETF, most of the work is accomplished in working groups, which are categorized into different areas. We return to how working groups actually work in Chapter 4, but these are the IETF areas:
Applications Area includes working groups that address applications (in other words, anything that provides some benefit to end users) and excludes anything related to security, networks, transport protocols, or administration and management. Examples of working groups in this area include the Hypertext Transfer Protocol (HTTP), calendaring and scheduling, Internet fax, and others.
General Area currently includes only two working groups, the Policy Framework working group and the Process for Organization of Internet Standards working group. These groups address general areas of interest to the IETF.
Internet Area includes groups working on issues related directly to the Internet Protocol (IP), including groups working on implementing IP over different data link layer protocols as well as IPng (IP, next generation, now known as IPv6) and others.
Operations and Management Area working groups address issues related to the way things work on the Internet. Working groups in this area include a benchmarking group, a group working on year 2000 issues, groups working on network management protocols, and others.
Routing Area working groups focus on issues related to routing in the Internet. Working groups address multicast routing issues, quality of service routing issues, and others.
Security Area working groups focus on providing security to the protocols that other IETF groups are working on. Important working groups in this area include those addressing the IP security architecture (IPsec), groups working on various aspects of authentication, groups working on encryption issues, groups working on development of secure applications, and others.
Transport Area working groups focus on issues related to transport protocols as well as related protocols. For example, working groups include differentiated services, multicast address allocation, TCP implementation, and others.
User Services Area working groups focus on issues related to improving the quality of information available to Internet users and to developing programs that may be helpful to users. The three current working groups in this area are the Responsible Use of the Network group, the Site Security Handbook group, and the User Services group.
Most of these areas have a dozen or so working groups, and altogether there are well over 100 IETF working groups. Each IETF area has one or two area directors, who oversee and coordinate the activities of the working groups in their areas. Each working group has one or two chairs, as well as an area advisor (usually one of the area directors).
Although the IETF can be a diffuse and somewhat nebulous organization, the Internet Engineering Steering Group (IESG) is more explicitly and narrowly defined. The IETF area directors plus the IETF chair make up the IESG. Although all Internet protocol development work is done at the working group level, once the working groups are finished, it is the IESG that must approve the standard protocol specifications (or other documents) for publication as RFCs.
The Internet Research Task Force and Internet Research Steering Group
The Internet Research Task Force (IRTF) and the Internet Research Steering Group (IRSG) are not nearly as well known as the IETF and IESG. This is, in part, because the results of the IRTF research groups tend to be used as the basis for engineering work done by the IETF. Thus, while the results of the work done by IETF working groups may be enshrined as Internet standards, the results of the work done by IRTF research groups more often are used as one of many sources for new work by the IETF working groups. The IRTF mission, stated on the IRTF Web page (www.irtf.org), is "To promote research of importance to the evolution of the future Internet by creating focused, long-term and small research groups working on topics related to Internet protocols, applications, architecture and technology." The activities of the IRTF research groups are thus more forward-looking than those of the IETF working groups: Their results may be published in peer-reviewed academic journals as well as in informational RFCs. An important difference between the IRTF research groups and IETF working groups is that membership in research groups is not necessarily open to all interested parties. IRTF research groups currently include the following:
The End-to-End research group is concerned with issues related to end-to-end services and protocols, with particular attention to performance, traffic control, scheduling, protocol framing, efficient protocol implementations, high-performance host interfaces, and others.
The Information Infrastructure Architecture research group is concerned with developing an interoperable framework for the Internet's information architecture. Membership in this group is by invitation only.
The Internet Resource Discovery research group's mission is to develop a model by which resources can be described on the Internet. This includes the design of entities that can act on behalf of electronic resources for the purposes of indexing, querying, and retrieving information; building mechanisms that can create, maintain, and use data for those entities; and setting requirements for systems that use these entities. Membership in this group is by invitation only.
The Routing research group works on routing issues that have relevance to the Internet but that are not yet mature enough to be incorporated into work being done by IETF routing working groups. Some of the topics set forth in this group's charter include work on quality of service (QoS) routing, scalable multicast routing, routing protocol stability, and extremely dynamic routing. According to the charter, this group has a limited core membership but occasionally holds open meetings to solicit input from the rest of the community.
The Services Management research group works on issues related to the concept of service management.
Basing its work on the assumption that network management and system management are converging toward a single function, called service management, this group is investigating how best to go about creating new architectures and protocols that would allow a system/network manager to manage all different types of connected devices, from PDAs to mainframes, with the same conceptual framework and the same tool or tools. Membership in this group is by invitation only.
The Reliable Multicast research group, presumably, will be concerned with issues related to building a framework for doing multicasting reliably. However, the group's charter has not yet been published.
The Internet Research Steering Group (IRSG) membership is, like the IESG, limited to the chairs of all the research groups as well as the IRTF chair. Other
prominent members of the community may be invited to serve as members of the IRSG. Although some of these research groups maintain mailing lists or Web sites, some appear to be moribund. The address given for subscribing to the Internet Resource Discovery group mailing list is no longer valid, and other groups' mailing lists are sparsely attended. In fact, the Privacy and Security group is included on the IRTF Web site, but the group was disbanded in early 1998 because much of the group's work was done. The charter describes work that eventually resulted in the IP Security Architecture, a set of standards that have already been published in two versions as RFCs.
Internet Assigned Numbers Authority and Internet Corporation for Assigned Names and Numbers
As far as this book is concerned, the most important function of the Internet Assigned Numbers Authority (IANA) is to administer and publish numbers that are related to Internet standards. For example, if you want to know what different values in the IP header's protocol field represent, you would consult the IANA. Any arbitrary values related to Internet protocols and parameters must be assigned through the mediation of the IANA. You may not simply choose some value and then publish it as a standard. This goes for protocol parameters as well as well-known port numbers for transport layer protocols and any other number related to a protocol or an Internet standard. However, as mentioned in Chapter 2, Internet Standards and Internet Protocols, the IANA is in the process of being replaced by the Internet Corporation for Assigned Names and Numbers (ICANN). The need for a transition was apparent by 1996, when discussions and proposals began over how best to convert the U.S. government-funded IANA into an organization that could satisfactorily serve a global commercial Internet. Not only is the IANA responsible for protocol parameters, but it is also tasked with administering the assignment of globally unique Internet network addresses and domain names. Internet addresses and domain names have a commercial component, as they are viewed as limited resources. There are only seven three-letter top-level domains (.gov, .mil, .edu, .int, .net, .org, and .com). Only three of these are generally available to businesses and organizations (.net, .org, and .com). There are issues relating to the way protected corporate trade names are allowed to be registered, as well as concern that additional top-level domains should be added. As for Internet network addresses, experts have been predicting since the late 1980s that the current version of IP (IPv4) does not provide a sufficiently large address space to support the continued growth of the Internet for
many more years. These addresses are allocated through regional registries and are becoming more and more scarce. After considerable debate and much revising, the ICANN proposal was accepted in late 1998, just a month and a half after Postel's death. The U.S. government acknowledged in a memorandum of understanding, dated November 25, 1998, that ICANN would be set up as a private, nonprofit corporation to administer policy for the Internet Name and Address System. The most visible and politically sensitive issues were the way addresses and domains are assigned, but the administration of protocol parameters will also be transferred to the ICANN because it was also part of the IANA's original charter. Exactly how that function will be performed is yet to be seen. ICANN may simply continue to publish the assigned numbers online in the same way the IANA has been. In fact, by summer of 1999, ICANN's future, scope, form, and function were still unclear. ICANN funding was far from certain, and its precise duties were still undefined, as were the ways in which it would interact with the Internet Society and the IETF. For more details and updates, consult the IANA and ICANN Web sites, or subscribe to the ICANN-announce mailing list by sending a message to:
majordomo@icann.org
The message should have no subject line and the following command as the message body:
subscribe icann-announce
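The subscription request described above is simply an email message with no subject and a one-line body. As a rough illustration, the following Python sketch builds such a message with the standard library's email package. The function name build_subscribe_request and the sender address are hypothetical, chosen only for this example; the sketch constructs the message but does not send it, and the list address may no longer be in service.

```python
from email.message import EmailMessage


def build_subscribe_request(sender: str) -> EmailMessage:
    """Build (but do not send) the subscription request described in the text.

    The function name and the sender argument are illustrative assumptions.
    """
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = "majordomo@icann.org"
    # Deliberately no Subject header: the instructions call for no subject line.
    msg.set_content("subscribe icann-announce")
    return msg


if __name__ == "__main__":
    # reader@example.com is a placeholder address.
    print(build_subscribe_request("reader@example.com"))
```

Handing the resulting message to a mail transfer agent (for example, through smtplib) is left out, because the mail server to use depends entirely on the reader's own setup.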
Other Relevant Bodies

Many more standards relate to networking and the Internet than those specified by the bodies described so far. Four of the most important other standards bodies are the World Wide Web Consortium (W3C), the International Telecommunication Union (ITU), the Institute of Electrical and Electronics Engineers (IEEE), and the National Institute of Standards and Technology. These bodies are profiled briefly below.
W3C
The World Wide Web Consortium (W3C) is the newest of the related standards bodies, founded in 1994 to promote the World Wide Web and help it achieve its full potential through the development of common and interoperable protocols. However, to the extent that work on important Internet protocols like Hypertext Transfer Protocol (HTTP) and the Uniform Resource Identifier
(URI) is done in partnership with the IETF, the W3C is most closely related to Internet standards. Unlike the IETF, which is a wide-open organization, the W3C is an industry consortium. Individuals may join, but they must pay the same $5,000 annual fee that is charged to affiliate organizational members (full members pay $50,000 each year). Unlike the IETF, when members suggest programs within the W3C, they must also back up the program proposal with funding for the work. Operating out of the Laboratory for Computer Science at MIT, the W3C's members often are current or former contributors to Internet standards through the IETF. The two organizations share the goal of building interoperable protocols that foster connectivity without regard to nationality, corporate affiliation, or any other restrictive notions. The W3C is organized into four different domains: User Interface, Technology & Society, Architecture, and the Web Accessibility Initiative. Each domain is responsible for different activity areas, resulting in an organization similar to that of the IETF areas and working groups.
The User Interface domain activity areas address issues that include data representations through the Hypertext Markup Language (HTML), stylesheets, fonts, internationalization, and others.
The Technology & Society domain activity areas address issues that include legal and social implications of the Web, in particular electronic commerce, privacy concerns, digital signatures, and others.
The Architecture domain activity areas concern themselves with issues relating to the way the Web operates. Activity areas are devoted to issues like HTTP, structured document interchange using the Extensible Markup Language (XML), the Synchronized Multimedia Integration Language (SMIL), and others.
The Web Accessibility Initiative domain is chartered to pursue a high degree of usability for people with disabilities, through improved technology, guidelines, tools, education, and research.
Because the W3C is an industry consortium whose members are almost exclusively organizations, its standards process is less open than that of the IETF, though interested readers will find the process document at www.w3.org/Consortium/Process/. W3C standards start out as Working Drafts and proceed to the status of Proposed Recommendations and finally Recommendations after passing through all review stages described in the process document. There are two other types of W3C documents, called Notes and Submissions. A Note is a document that the W3C publishes because it may be of interest to the community. Publication as a Note does not imply that the W3C endorses the document. W3C Submissions permit members to publish ideas or technologies for the consortium's consideration. Although the Notes are chosen by the W3C for publication,
Submissions that are submitted with all support materials in order will be published. However, Submissions have no official status as W3C standards. Because the IETF and the W3C share some of the same concerns, a high degree of cross-pollination goes on between the two organizations. Anyone interested in protocols related to the World Wide Web will find standards and protocols through both organizations. Where overlap occurs, the two organizations cooperate in the interests of interoperability.
IEEE
The Institute of Electrical and Electronics Engineers (IEEE) is an international professional organization for engineers. The IEEE traces its origins to 1884, and its standards groups work on specifications for all types of engineering pursuits including networking. In particular, IEEE standards are used to define the way data is transmitted across network media like Ethernet. Important standards include the IEEE 802 LAN/MAN standards relating to Ethernet transmissions, the IEEE P1394.1 high-performance serial bus bridge standards, and the IEEE P1363 standards for public-key cryptography.
ITU
With roots going back to the 1865 founding of the International Telegraph Union, the International Telecommunication Union (ITU) is one of the oldest standards bodies around. It became an agency of the United Nations in 1947 and is based in Geneva, Switzerland. Initially, it was set up to foster international standards for telegraphy: technical standards as well as standards for operations, tariffs, and telecommunications accounting practices. The ITU has evolved over the years to accommodate changes in the telecommunications industries it serves. Its activities include work on all sorts of data transmission media, including satellite, radio, and more traditional cabled transmission. As telecommunications organizations increasingly rely on IP networks to carry voice as well as data, the ITU will expand its Internet-related activities. RFC 2436 addresses issues of interaction between the ITU and the IETF. The ITU currently has several study groups working on IP-related issues, including multimedia services and systems, telecommunication management networks and network maintenance, and signaling requirements and protocols for network media like ISDN. Other standards are developed through the ITU as well, in particular the X.400 standards relating to message handling and the X.500 standards relating to directory services.
NIST
The National Institute of Standards and Technology (NIST) is an agency of the U.S. Department of Commerce's Technology Administration whose mission is
to promote U.S. economic growth by working with industry to develop and apply technology, measurements, and standards. NIST is active in a number of important areas relating to the Internet, including standards for encryption such as the Data Encryption Standard (DES) and selection of a replacement for DES, known as the Advanced Encryption Standard (AES). NIST is also active in working on new protocols for broadband data transmission across high-speed networks including ATM, as well as research on technologies to support the next-generation Internet.
Reading List
RFC 2028, The Organizations Involved in the IETF Standards Process, is a good place to start if you're interested in reading more. Table 3.1 includes Web sites for the organizations described in RFC 2028 as well as many others of relevance to the Internet standards process.
Table 3.1 Organizations Involved in the Internet Standards Process
ORGANIZATION                                                     URL
The Internet Society (ISOC)                                      www.isoc.org
The Internet Corporation for Assigned Names and Numbers (ICANN)  www.icann.org
The Internet Assigned Numbers Authority (IANA)                   www.iana.org/index2.html
The IANA Protocol Numbers and Assignment Services page           www.iana.org/numbers.html
The Internet Engineering Task Force (IETF)                       www.ietf.org
The Internet Research Task Force (IRTF)                          www.irtf.org
The Internet Engineering Steering Group (IESG)                   www.ietf.org/iesg.html
The Internet Architecture Board (IAB)                            www.iab.org/iab/
The World Wide Web Consortium (W3C)                              www.w3c.org/
The International Telecommunication Union (ITU)                  www.itu.int/
The Institute of Electrical and Electronics Engineers (IEEE)     www.ieee.org/
The IEEE Standards site                                          http://standards.ieee.org/
CHAPTER
4
The Internet Standards Process
We've discussed what an Internet standard is in Chapter 2, Internet Standards and Internet Protocols, and what organizations participate in the creation of Internet standards in Chapter 3, Internet Standards Bodies. In this chapter, we look at the process by which a protocol becomes an Internet standard protocol. Working from two RFCs that describe the standards process and provide guidelines for IETF working groups, we introduce the activities necessary to create an Internet standard. In the last part of this chapter, we examine the instructions to RFC authors to better understand how those documents are structured and what information those documents contain.
The Standards Process

The abstract of RFC 2026, The Internet Standards Process -- Revision 3, reads:
This memo documents the process used by the Internet community for the standardization of protocols and procedures. It defines the stages in the standardization process, the requirements for moving a document between stages and the types of documents used during this process. It also addresses the intellectual property rights and copyright issues associated with the standards process.
This RFC is currently defined as BCP-9, documenting the best current practices for defining Internet standards. The actual procedures required to turn a protocol into a standard are defined here. The document notes that specifications developed through the actions of the IAB and IETF are usually revised before becoming standards. Specifications that have been defined by outside bodies may go through the same approval process that home-grown standards do, but the outside standards are not revised. In these cases, the Internet standards process is used to affirm the outside specification as a standard and to determine how it should be applied to the Internet, rather than to modify it. RFC 2026 defines the Internet standard, pointing out that the specification must be stable, well understood, and technically competent. It should also have been implemented by more than one independent group, and all those implementations should be interoperable. There should be substantial operational experience with the standard, and the standard should enjoy significant public support. Furthermore, it should be recognizably useful in some or all parts of the Internet. In a perfect world, the Internet standards process would be straightforward: Someone proposes a new protocol or process, people work on it over time, and the Internet community provides feedback as the specification is gradually improved, until the community determines that it is stable, competent, interoperable, supported, and recognizably useful. However, in practice, the difference between theory and practice is far greater than the difference between theory and practice, in theory. Defining Internet standards can be a messy process.
Standards Actions
As RFC 2026 makes clear, all Internet standards actions must be approved by the IESG. A standards action is anything that changes the state of a specification as it relates to the standards process: Actions occur when a specification enters the standards track, when it changes its maturity level within the standards track, or when it is removed from the standards track. None of those things can happen unless the IESG approves it. The IESG follows documented criteria devised to identify specifications that are ripe for a standards action, but these criteria are guidelines rather than hard-and-fast rules. These guidelines will be discussed later. The IESG, as a group, uses its own judgment when deciding on standards actions. It has the power to deny an action to a specification that otherwise might appear to fulfill all requirements or to approve an action for a specification that might appear to fall short in one or more areas. If any parties believe that a standards action was granted or denied in error, they can resort to the dispute resolution procedures discussed later in this section.
The first step in the standards process is for the entity sponsoring the specification to publish it as an Internet-Draft (I-D). Normally, this entity is the IETF working group, but it may also be an individual or some other organization. I-Ds produced by individuals or groups not directly connected to an IETF working group can be published as standards-track RFCs and are frequently published as informational RFCs as well. I-Ds are subject to modification based on community review, are transient documents, and are not intended to be referenced in the same way that RFCs are. I-Ds expire if they have not been modified for six months, though the timer starts again when a new version is published. An I-D published on January 1, 2001 would expire after June 30, 2001 if it was not revised; if a revision is published on June 1, 2001, then it is due to expire after November 30, 2001. If a revision is published January 15, 2001, then that I-D expires after July 14, 2001. However, the whole point of publishing an I-D is to have it accepted to the standards track rather than to have it persist as an I-D. This is the first standards action that must occur in the standards process for any specification. No action can occur until the I-D has been available online for at least two weeks. This time is to be used for community review, allowing members of the IETF and the rest of the world to read the draft and make comments on it. Although the IESG can't take any action until at least two weeks after the I-D is published, nothing happens unless the IETF working group makes a recommendation to its area director. It can take quite some time and several revisions before the working group makes that recommendation. Normally, one or several members of a working group write a preliminary draft of the specification and publish it as an I-D. That draft stimulates discussion within the working group, which may result in modifications to the draft.
A second I-D is published, stimulating further discussion, which in turn results in further modifications. For successful specifications, this process continues until the group is able to agree that the current version of the draft is ready to be published as an RFC. Not all I-Ds become RFCs, however. Some may languish due to lack of interest. Others may be dropped when some other specification appears to solve the problem better. Some never achieve a stable form. When the IESG receives a recommendation for a standards action, it may consult with experts to review the recommendation. When the IESG is reviewing a document, it issues a last call notification to the IETF through the IETF-announce mailing list. Anyone may subscribe to this mailing list, and anyone may submit comments on the specification being reviewed. Once the specification is received from the working group, the last call period must be at least two weeks, but the IESG has the option of extending the last call period if it deems it necessary. Although the IETF working group's recommendation carries weight with the IESG, it is far from binding. The IESG can even decide to consider a standards action different from that requested by the working group. Once the last
call period is over, the IESG makes its decision and announces it through the IETF-announce mailing list. If approved, the IESG then notifies the RFC Editor that the I-D should be withdrawn and republished as an RFC.
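The date arithmetic behind I-D expiration and the two-week review window can be sketched in a few lines of Python. This is only an illustration of the rules as described in this section, not an official tool: the function names add_months, last_valid_day, and earliest_iesg_action are invented for the example, and the six-month timer is assumed to run from the publication date of the most recent version.

```python
import calendar
from datetime import date, timedelta


def add_months(d: date, months: int) -> date:
    """Return the date `months` calendar months after d, clamping the day
    to the end of the target month (for example, August 31 -> February 28)."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    day = min(d.day, calendar.monthrange(year, month)[1])
    return date(year, month, day)


def last_valid_day(published: date) -> date:
    """Last day on which an I-D is still current, assuming it expires six
    months after its most recent version was published."""
    return add_months(published, 6) - timedelta(days=1)


def earliest_iesg_action(published: date) -> date:
    """Earliest day the IESG could act: the I-D must have been available
    for at least two weeks."""
    return published + timedelta(weeks=2)


if __name__ == "__main__":
    for pub in (date(2001, 1, 1), date(2001, 6, 1)):
        print(pub, "-> current through", last_valid_day(pub),
              "| earliest IESG action", earliest_iesg_action(pub))
```

Run against the January 1 and June 1 examples given earlier, the sketch reports the drafts as current through June 30 and November 30, 2001, respectively.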
The Standards Track

Each time a specification is promoted to one of the three maturity levels of the Internet standards track (proposed standard, draft standard, and standard), it must go through the IESG approval process noted previously. This section examines the stated criteria for promotion to each level as described in RFC 2026. Specifications must remain at the proposed and draft standard maturity levels for minimum periods of time, but these minimums are precisely that: absolute minimums. Advancement along the standards track can be quite slow. The IESG and IETF working groups would rather take the time to get a standard right than advance a specification quickly and risk enshrining a flawed standard. It is not uncommon for a proposed or draft standard to fail to advance on the standards track but to remain important for the Internet. For example, the Bootstrap Protocol (BOOTP), documented in RFC 951 in 1985, is still a draft standard in 1999. Likewise, the IP Security Architecture, documented in RFC 2401 in November 1998, is still a proposed standard even though it replaces an earlier standards-track version documented by RFC 1825, published in 1995. When a specification has remained at the same point in the standards track for two years, the IESG reviews it, and continues to review it every year after that. The IESG may subsequently decide to terminate the effort or else decide that development of the specification should continue. At the same time, the IESG may determine that the specification itself is no longer relevant and should be reclassified as a historic RFC rather than a standards-track specification. As specifications advance, they are usually modified. These modifications usually require the publication of new RFCs to document the new versions of the specification. Though it may not be necessary to republish a specification when it changes maturity level (that is, when the specification is unchanged), in most cases when a specification advances, a new RFC is published to reflect changes.
If the modifications made during the revision process are sufficiently extensive, the IESG can decide the specification should go back and restart the process.
Proposed Standard
According to RFC 2026, a proposed standard specification is generally stable, has resolved known design choices, is believed to be well-understood, has received significant community review, and appears to enjoy enough community interest to be considered valuable. However, more experience with the specification might prove otherwise (the specification might not be valuable, have support, be well-understood, or even be stable), in which case the specification could lose its status as a proposed standard.
No operational experience or even an implementation is necessary for a specification to achieve the proposed standard level, though both of those are helpful to a specification's cause. If the specification is likely to have a significant impact on the Internet as it exists now, the IESG will very likely require that the specification be implemented and deployed. Proposed standards are to be considered immature, but RFC 2026 encourages implementers to use the specification to build up a body of experience that can be drawn upon to judge the protocol's value. A specification must spend at least six months as a proposed standard before it can advance along the standards track.
Draft Standard
A specification from which at least two independent and interoperable implementations from different code bases have been developed, and for which sufficient successful operational experience has been obtained, may be elevated to the Draft Standard level. That's how RFC 2026 puts it. Interoperable means that the implementations are functionally equivalent or interchangeable. To qualify, the implementations have to implement the entire specification. If some functions or options are left out, the implementation doesn't count, unless the things that were left out of the implementations are also taken out of the specification. Where a proposed standard should be generally stable, draft standard specifications must be well-understood and known to be quite stable. It is up to the working group chair to document the specification's implementations, as well as to document interoperability test results and function/option support test results as a part of the chair's recommendation for moving the proposed standard to draft standard status. Once a specification achieves draft standard status, it stays there for at least four months. This period must include an IETF meeting, so this period may be extended if the next IETF meeting occurs more than four months from the date the specification achieves draft standard. Once a specification attains the draft standard maturity level, it is considered a final specification, one that implementers are encouraged to deploy in production systems. Although the draft standard specification may be subject to changes before attaining full standard status, those changes are most likely to be limited to fixes for specific problems arising from continued experience with the specification.
Internet Standard
According to RFC 2026, Internet standard status is reserved for specifications with significant implementation and successful operational experience. Standards are differentiated from other maturity levels by a high degree of
technical maturity and by a generally held belief that the specified protocol or service provides significant benefit to the Internet community. Once a specification is approved as a standard, it is assigned an STD number (see Chapter 3). Most specifications have yet to reach standard level; as of early 1999, only 56 STD numbers have ever been assigned out of almost 2,500 RFCs published.
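The minimum waiting periods on the standards track lend themselves to a small worked example. The Python sketch below assumes the RFC 2026 minimums discussed in this section: six months at the proposed standard level, and four months at the draft standard level with at least one IETF meeting falling inside that period. The names Maturity, MIN_MONTHS, and earliest_advancement are invented for the illustration, and the meeting dates in the usage lines are made up rather than taken from any real IETF schedule. Keep in mind that these are floors only; the IESG decides when, and whether, a specification actually advances.

```python
import calendar
from datetime import date
from enum import Enum


class Maturity(Enum):
    PROPOSED = "proposed standard"
    DRAFT = "draft standard"
    STANDARD = "Internet standard"


# Minimum months at each level before advancement can be considered.
MIN_MONTHS = {Maturity.PROPOSED: 6, Maturity.DRAFT: 4}


def add_months(d: date, months: int) -> date:
    """Return the date `months` calendar months after d, clamping the day."""
    index = d.month - 1 + months
    year = d.year + index // 12
    month = index % 12 + 1
    return date(year, month, min(d.day, calendar.monthrange(year, month)[1]))


def earliest_advancement(level, entered, ietf_meetings=()):
    """Earliest date a specification could leave `level`, given the date it
    entered that level and (for draft standards) known IETF meeting dates."""
    if level not in MIN_MONTHS:
        raise ValueError(f"{level.value} is the top of the standards track")
    floor = add_months(entered, MIN_MONTHS[level])
    if level is Maturity.DRAFT:
        # The waiting period must include at least one IETF meeting.
        later = sorted(m for m in ietf_meetings if m > entered)
        if not later:
            raise ValueError("need the date of the next IETF meeting")
        return max(floor, later[0])
    return floor


if __name__ == "__main__":
    entered = date(1999, 1, 10)            # hypothetical date
    meetings = [date(1999, 7, 12)]         # hypothetical meeting date
    print(earliest_advancement(Maturity.PROPOSED, entered))
    print(earliest_advancement(Maturity.DRAFT, entered, meetings))
```

A real tracker would also need to record the IESG reviews of stalled specifications, which this sketch leaves out.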
Revising or Retiring Existing Standards

What happens when an existing standard must be updated? The process is the same for a revision to an existing standard as for a new standard. Consider the case of IPv6, the revision to the current version of the Internet Protocol, IPv4 (see also IPv6 Clearly Explained, Morgan Kaufmann 1999). Work on the revision began in the IPng working group in the early 1990s, with the first series of IPv6 standards-track RFCs published in 1996. Continued work resulted in new versions of the IPv6 specifications, published in RFCs by late 1998 and early 1999. At the same time, the IPv4 standards are still standards and are likely to remain standards as long as IPv4 is widely implemented and deployed. When two versions of the same protocol coexist, it is necessary to document how the two versions are related. What happens when a revised protocol replaces the older version? The revised protocol must go through the same process, and the older version may be retired unless a sufficiently large installed base uses the older version. Consider the Post Office Protocol (POP): POP version 2, documented in RFC 937, was designated historic after POP version 3 became STD-53 (RFC 1939). Sometimes a standard becomes obsolete because a new protocol does the job much better. The Exterior Gateway Protocol (EGP), documented in RFC 904, was once STD-18. However, other routing protocols have come to replace EGP as a core protocol for the Internet, and it has since been relegated to the status of historic protocol.
Resolving Problems
One of the stated goals of the Internet standards process is to be fair, and that requires mechanisms for resolving disputes over how the process is conducted. RFC 2026 sets out guidelines for resolving problems that occur within working groups as well as problems relating to the entire standards process. These are largely common sense, at least in an organizational framework. Although two types of disagreements are considered for working group disputes, only a single set of guidelines is provided. The types of disagreements are divided between those where an individual believes that his or her views were not given adequate consideration by the working group and those where the individual believes that the working group made an incorrect choice that
could result in harm to the group's results. The resolution process relies on discussing the problem first with the working group chair or chairs, who may involve others in the group as necessary. If the problem cannot be resolved at that level, it can be escalated to the area director responsible for that working group; further escalation progresses to the full IESG, and finally to the court of last resort, the IAB. If an individual disagrees with an action taken by the IESG, the process is similar but starts with the IESG chair. From there, the problem may be escalated to the entire IESG and then to the IAB. The IAB cannot change the IESG's decision, but it can suggest alternatives or direct that the IESG's decision be annulled and consideration of the matter started over. In the event that the disagreement pertains to whether the procedure itself, as described in RFC 2026, is sufficient and fair, an individual can petition the board of the Internet Society.
Documenting the Process

All the groups involved in doing standards work are expected to make public their activities. This means that IETF and working group meetings must be announced on the IETF-announce mailing list. It also means that the IETF, the IESG, the IAB, all IETF working groups, and the Internet Society board must all make public their charters, minutes of their standards-related meetings, archives of working group mailing lists, and anything contributed in writing from participants in the standards process. Even expired I-Ds are archived by the IETF secretariat so as to maintain an historical record of standards activities.
IETF Workgroups
IETF working groups are designed to foster cooperation among individuals who work in widely disparate environments, from academic researchers to for-profit product developers. Working groups are also likely to include individuals who work for organizations with conflicting goals, incorporating people who work for competing software, hardware, and service vendors. Further complicating matters, working group members may live and work almost anywhere in the world. Despite these difficulties, the bulk of the work of the IETF is accomplished by its working groups. RFC 2418 is appropriately titled IETF Working Group Guidelines and Procedures. Describing how the working groups fit into the standards process while also outlining how successful working groups achieve their goals, this RFC should be required reading not only for anyone interested in the Internet standards process but also for anyone interested in organizational dynamics.
Defining the Working Group

An IETF working group is usually formed for the purpose of solving some specific problem or to create some specific result or results. For example, the Calendaring and Scheduling working group is chartered to "create standards that make calendaring and scheduling software significantly more useful and to enable a new class of solutions to be built that are only viable if open standards exist" (from the Calendaring and Scheduling working group charter, at www.ietf.org/html.charters/calsch-charter.html). The charter goes on to define three specific sets of problems relating to Internet calendaring and scheduling applications. Working group deliverables are usually in the form of specifications, guidelines, or other reports published as RFCs. Once all tasks are completed, the working group may be disbanded or its operations may be suspended, with periodic review of standards as they progress through the standards track. In keeping with the IETF's openness, IETF working groups are open to participation by anyone who wishes to contribute. Although much of a working group's work is accomplished by a small central core of group members, other members can contribute through participation in working group mailing lists or by attending meetings in person. Again, inclusiveness reigns: Any activity that occurs at a physical meeting is reported to the mailing list, and rough consensus of the entire group is always a requirement. The working group chair can restrict contributions from members deemed to be acting counter to the interest of the group. If someone holds up meetings by discussing matters that are not appropriate or raising issues that are counter to the rough consensus, that person may be restricted from speaking, but not from attending the meeting. There must be at least one working group chair, but usually no more than two. The chair's concern is "to make forward progress through a fair and open process" (from RFC 2418). 
It is up to the chair to ensure that the working group is accomplishing the tasks it is chartered to complete and nothing more or less. Other working group chair tasks include moderating the working group email list, planning working group sessions, communicating the results of the sessions, managing the work by motivating participants to do what needs to be done, developing and publishing supporting documents, and keeping track of implementations based on the working group's activities. Of course, this is a lot of work, and the chair may delegate some or all of these tasks. Other working group staff include the secretary, who is responsible for taking minutes and recording working group decisions, and the document editor, who is responsible for ensuring that the documents the group generates truly reflect the decisions that have been made by the group. A working group facilitator, responsible for making sure that the group processes are working, may also be part of the group. The facilitator works on the style of interaction among the group members, rather than the content, to keep the group moving toward
its goals. Finally, in certain cases, the IETF Area Director may assign a working group consultant to a working group. The consultant's role is to provide the benefit of his or her experience and technical expertise to the working group. Working group members are likely to be called upon to serve on a design team. When a problem needs solving, the group may determine that a subset of the group should form a design team to solve it. Design teams can be completely informal, consisting of whoever happens to be standing around during a hallway chat, or they may be formally designated groups appointed by the working group chair to address some controversial issue, or something in between. Working group guidelines are truly guidelines, and the working group chair is accorded considerable latitude in terms of how the working group's goals are to be achieved. As long as the process is fair and open and meets the basic requirements set forth in RFC 2418, the working group controls its own process. A working group can be created only when certain conditions are met, and those conditions help define what working groups actually are able to do. The next section explains this process.
Creating a Working Group

Working groups are created at the behest of an IETF Area Director or by some other individual or group. The Area Director has to get behind the idea for the new group, although the IESG (with advice from the IAB) has the final say over whether the group is formed. The Area Director considers the following criteria before making any decision about pushing forward with the chartering process. These criteria help define what a working group should be, inasmuch as any existing working group should meet most if not all of them:

Clarity and relevance to the Internet community. Is there a clear vision of what the working group should be working on, and will the working group be working on something that is of value to the Internet community? Without clear goals and relevance, a proposed working group is unlikely to be chartered.

Specific and achievable goals. The working group should have specific goals that can be attained within a reasonable period of time. Working groups are meant to have finite lifetimes, and they are meant to actually perform complete tasks.

Risks and urgency. What happens if the working group is not formed? What risks are incurred if no action is taken, and what risks might be incurred if action is taken? Working groups that target problems that hinder Internet scalability and continued growth may get priority treatment.

Overlap with existing working groups. Will the proposed working group's activities duplicate efforts being made by any existing working groups? Will the proposed working group be working on the same or similar problems being addressed by any existing working groups? Overlap may not be bad if the new working group approaches the problem from a different technical direction. However, if only a limited number of qualified people are working on the problem, multiple working groups could cause those people's efforts to be spread a bit thin.

Interest level. Enough people must be interested in doing the work of the working group, as well as in participating as working group staff (that is, working group chair, secretary, and so on). According to RFC 2418, a viable working group requires that at least four or five people be interested in the management of the group and at least one to two dozen others must be willing to participate to the extent of attending meetings and contributing to the mailing list. The RFC also notes that the group membership must be broadly based. It is not sufficient for membership to represent a single organization, which would be viewed as an attempt by that organization to create its own Internet standard.

Expertise level. Are there enough people within the IETF who are sufficiently knowledgeable about the working group work to make worthwhile contributions, and are enough of those people interested in participating? Again, the purpose of the working group is to accomplish specific objectives. If the working group members aren't experienced in the technologies they are working with, it's unlikely they'll be able to achieve those goals.

End-user interest level. Is there a consumer base for the output of the working group? Are end users interested in seeing the goals proposed by the working group charter accomplished? The IETF is an engineering organization, whose production is intended for use by end users. Pure-research projects are better accomplished by the IRTF; the IETF must concern itself with products that have practical applications.

Practicality of IETF involvement. All the criteria listed here might be met, but some specifications are better produced by other bodies. There may be interest, expertise, relevance, and all the rest, but the IETF is unlikely to get involved with developing standards for LAN media or object models. Other bodies are better qualified to produce specifications in these areas.

Intellectual property rights issues. Increasingly, intellectual property rights (software patents, copyrights, and more) are relevant to work being done by working groups. These issues must be understood before the working group is chartered.

Open technology. Many organizations would like to have their proprietary standards recognized as Internet standards. Such recognition would accord the organization a significant advantage over competitors. When evaluating applications for new working groups, the IESG must attempt to determine whether the work planned by the group is an attempt to favor some existing, closed technology, or whether the plan is devised to solicit IETF participation to genuinely develop an open specification.

Understanding of the technologies and issues. Are the issues and technologies proposed for the working group's activities well understood? Technologies should be reasonably mature before they are brought into an Internet standards effort. The IESG would prefer to avoid the kind of debacle that could result from rushing into unproven technology.

Overlap with other standards bodies. Do the working group's goals intersect with the goals of any other standards bodies? This may not be cause for concern if the working group approaches the issues in a way unique for the Internet, but the IESG would have to evaluate the degree to which liaison with the other group exists or is required.

Once the Area Director is satisfied that a working group proposal is in good shape, the chartering process starts. The Area Director and the person who is to become the working group chair work out the charter together and then submit it to the IESG for approval and to the IAB for review. The charter includes a description of the working group and its objectives and goals, scheduled milestones necessary to achieve those goals, and a list of administrative details like names and contact information for working group chair(s).
Working Group Operations

Working groups have a certain amount of latitude in how they operate, as long as the procedures that result are open and fair. Most of the action usually takes place on the mailing list, with members of the group suggesting options, debating the value of different approaches, and discussing problems arising from implementation and deployment of the solutions being considered by the working group. The standard for moving working group tasks forward is rough consensus, meaning that most of the group is mostly agreed about the solution in question. Determining where the rough consensus actually is, is the job of the working group chair. This can be difficult when all work is carried on over the mailing list, but it is certainly possible. Consensus can also be determined at meetings, where the group can vote in some way. In either case, when the chair feels that a consensus has been reached, the chair may solicit comments from the list or call for a vote. No hard and fast rule determines where consensus actually lies in terms of how many are in favor and how many opposed: The only guideline provided in RFC 2418 is that agreement by 51 percent of the
group is not enough to form a rough consensus, and when the group is 99 percent in agreement, a more than rough consensus definitely does exist.
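The two guideposts quoted from RFC 2418 can be written down as a small sketch. This is only an illustration: the function name and the returned labels are invented here, and everything between the two guideposts is left to the chair, because the text gives no numeric rule for that range.

```python
def describe_agreement(in_favor: int, total: int) -> str:
    """Classify a show of agreement against the two RFC 2418 guideposts.

    Hypothetical helper: only the 51 percent and 99 percent guideposts come
    from the text; the labels and the middle band are assumptions.
    """
    if total <= 0 or not 0 <= in_favor <= total:
        raise ValueError("need 0 <= in_favor <= total and total > 0")
    # Integer arithmetic avoids floating-point surprises at the boundaries.
    if in_favor * 100 <= 51 * total:
        return "not rough consensus"
    if in_favor * 100 >= 99 * total:
        return "more than rough consensus"
    # No hard and fast rule exists here: the chair has to judge.
    return "chair's judgment"


print(describe_agreement(51, 100))
print(describe_agreement(80, 100))
print(describe_agreement(99, 100))
```

The middle band is deliberately not subdivided; doing so would invent a voting rule that the working group guidelines avoid.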
Working Group Documentation

First and foremost, the raw activity of the working group is available in the archives for the working group mailing list. Here you will find all the comments, arguments, proposals, and questions raised in the group. You will also find agendas for physical meetings, meeting minutes, and notifications about the publication of other working group documents, particularly Internet-Drafts and RFCs. Anyone interested in the output of a particular IETF working group should subscribe to the mailing list right away. For a more formal look at the results of the work of a working group, look at the Internet-Drafts it generates. Although these are definitely working documents, they do reflect the best and most recent version of the working group's product. An I-D may be revised many times before it is finally approved and published as an RFC, but only one version of the I-D is ever publicly available at any given time. To trace the development of a specification across time, you must either follow the mailing list or download and store copies of each new revision of the I-D. However, most I-D revisions include a section detailing the changes made since the previous version. The ultimate documentation of a working group's activity is the finished RFCs it generates. An I-D is just a draft, and six months after it is published, it expires unless it can be moved forward. RFCs, on the other hand, live forever and contain information that is at least of some interest to the Internet community and that may actually describe a specification on the Internet standards track.
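If you keep your own copies of I-D revisions, the six-month lifetime is easy to track. The sketch below assumes that "six months" means six calendar months from the publication date, with the day clamped to the end of shorter months; the function names are invented for illustration, and the secretariat's exact counting rule may differ.

```python
import calendar
from datetime import date


def add_months(start: date, months: int) -> date:
    """Return the date `months` calendar months after `start`.

    The day is clamped to the last day of the target month, so that
    August 31 plus six months lands on the last day of February.
    """
    month_index = start.month - 1 + months
    year = start.year + month_index // 12
    month = month_index % 12 + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(start.day, last_day))


def draft_expiry(published: date) -> date:
    """Assumed expiry date of an Internet-Draft: six calendar months on."""
    return add_months(published, 6)


def is_expired(published: date, today: date) -> bool:
    """True once `today` has reached the assumed expiry date."""
    return today >= draft_expiry(published)


print(draft_expiry(date(1999, 1, 15)))
print(is_expired(date(1998, 8, 31), date(1999, 3, 1)))
```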
Reading List
Table 4.1 lists some RFCs that elaborate on the information presented in this chapter.
Table 4.1 Relevant RFCs

RFC | TITLE | DESCRIPTION
RFC 2026 | The Internet Standards Process -- Revision 3 | This document serves as the basis of much of this chapter and explains the exact process by which specifications become standards.
RFC 2418 | IETF Working Group Guidelines and Procedures | This document explains how working groups work, how to start one, how to run one, and how to terminate one.
RFC 2028 | Organizations Involved in the Standards Process | This RFC explains what IETF organizational entities are involved in the process of setting standards, as well as what roles each plays.
RFC 1796 | Not All RFCs Are Standards | This short RFC simply punctuates the distinction between acceptance of a specification as a standard and acceptance of a specification for publication as an RFC.
RFC 2223 | Instructions to RFC Authors | This RFC is useful for anyone wishing to write an RFC or RFC-like document as well as for those interested in how these documents are styled and structured.
CHAPTER
5
Getting the RFCs
You can find RFCs in lots of places, though some are more complete, accurate, and up-to-date than others. In this chapter, we examine where to find RFCs and Internet-Drafts, and where to get the latest information about RFCs and Internet-Drafts. RFCs can be found almost everywhere, it seems. Computer book authors have been known to include complete copies of RFCs in their books, and some authors incorporate searchable databases of RFCs on CD-ROMs included with their books. Yahoo! may have as many as a dozen or so RFC-related sites, most of which are archives containing all (or almost all) RFCs published to date. I've included a handful of the Web archive sites I find most useful, and you can find more pointers on the companion Web site for this book. However, anyone interested in getting the latest should do his or her own search for RFC-related Web sites: Old ones go away, new ones come online all the time, and the ones that stay on often undergo changes, sometimes for the better and sometimes for the worse. Having all the RFCs does not necessarily give you everything you need to work with RFCs, however. For one thing, there are somewhere in the neighborhood of 2,500 different RFCs. Trying to find what you need in that thicket of documents is sort of like trying to find what you need in an encyclopedia whose articles are arranged in the order they were written. To make things worse, revisions of existing articles are simply treated as newer articles, and the older, outdated articles are never removed. And, of course, there is no index.
To make sense of RFCs, you need something to act as an index. In most cases, that something is a search tool associated with the Web site or CD-ROM where the RFCs themselves are published. RFC archives may be totally spartan, like the directory published by the IETF, which is nothing more than an FTP directory containing the RFC files. More elaborate archives provide tools for searching and displaying the RFCs, with varying degrees of success. So far, no single site provides everything you need to work with RFCs, but some combination of two or three should be sufficient to meet most needs.
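One job an index has to do, given that outdated RFCs are never removed, is to lead you from an old document to the one that replaced it. The following is a minimal sketch of that lookup. The mapping is a small hand-entered sample for the POP3 lineage that ends at RFC 1939 and should be checked against a real RFC index before you rely on it; the names OBSOLETED_BY and current_rfc are invented for illustration.

```python
# Sample data only: each key is an RFC number, each value the RFC that
# obsoleted it. Verify against a real RFC index before relying on it.
OBSOLETED_BY = {
    1081: 1225,
    1225: 1460,
    1460: 1725,
    1725: 1939,
}


def current_rfc(number: int, obsoleted_by: dict = OBSOLETED_BY) -> int:
    """Follow 'obsoleted by' links until reaching an RFC with no successor."""
    seen = set()
    while number in obsoleted_by:
        if number in seen:
            # Guard against bad data that would otherwise loop forever.
            raise ValueError(f"cycle in obsoletes data at RFC {number}")
        seen.add(number)
        number = obsoleted_by[number]
    return number


print(current_rfc(1081))  # follows the chain to the latest POP3 document
print(current_rfc(904))   # not in the sample table, so returned unchanged
```

An RFC missing from the table is returned unchanged, which means only that this sample knows of no replacement, not that none exists.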
Staying on Top of RFCs

There are several different types of RFC consumers. The more casual consumers are usually more interested in looking up some specific standard or document on a one-time or infrequent basis. A network manager may consult an RFC to check on header fields or some other aspect of a protocol while troubleshooting a network problem. Computer science students may consult the RFC archives to document some protocol. Students of history may consult the RFC archives to track down some Internet-related event. The casual reader may have been given an RFC number by a text reference (like this one), a vendor, a professor, or some other source, and thus may have no need for any type of search engine. Casual RFC readers often find out about new Internet standards-track specifications from their vendors or from trade press reports about new products that support them. People involved with deploying Internet-based or related systems may have a higher level of interest in RFCs. Intranet/extranet managers need to understand what their systems are doing and how they do it. This includes understanding the protocols as well as how vendors implement those protocols. These users need to be able to search for RFCs based on keywords. They need to be able to jump from one RFC to another related RFC to see how they affect each other. They need to know which RFC is a current standard (or nonstandard) and which is obsolete or experimental. These readers may even need to know when a new specification has been added to the standards track or when an existing specification advances along the standards track. The third class of RFC readers are those who not only need access to current RFCs, but who must know what future RFCs will look like. These are the implementers, the network software and hardware engineers who must translate the specifications from document form into products that actually do something. 
Not only must these implementers know when new specifications are published as RFCs or advanced along the standards track, but they must have a pretty good idea of where the specification is going well before it is published as an RFC. Vendors implement specifications described in Internet-Drafts for experimentation and testing so they can roll out RFC-compliant products quickly once the RFC is published.
This book was written for people in the two latter categories, and in the next section we look at some of the more important mailing lists to which you should subscribe if you need timely information about RFCs.
IETF Mailing Lists

Several mailing lists are worth knowing about if you are interested in what the IETF is doing:

IETF-discuss. The IETF discussions list is an open forum for IETF members to discuss issues related to the Internet, the IETF, the IESG, and their activities. If you are considering subscribing to this list, check out the archives at the IETF Web site.

IETF-announce. The IETF announcement list is used to distribute information about the logistics of IETF meetings, agendas for IETF meetings, actions taken on working group activities, announcements of Internet-Drafts, IESG last calls, Internet standard actions, and announcements of publication of new RFCs. This is a read-only list, and it is used to communicate official activities of the IETF and IESG, rather than to stimulate discussion.

Internet Monthly Report. Subscribers to this mailing list receive copies of a monthly report detailing all the activities of participating organizations during the month preceding. In this report, you can find a summary of all standard actions, new RFCs and Internet-Drafts published, activities of the RFC editor and the IANA, information about meetings that were held during the month, and notices of relevant meetings to come.

RFC-dist. This is the RFC distribution list. Subscribers receive notification every time a new RFC is published, along with a URL pointing to the newly published document.

To avoid duplication of messages, most people would choose to subscribe to only one of the IETF-announce, RFC-dist, or Internet Monthly Report lists. For example, if you want notification every time a new RFC is published but are not interested in Internet-Drafts or any other Internet actions, you would subscribe to the RFC-dist list. If you don't want to be bombarded with messages but still want to stay on top of Internet standards activities, you would subscribe to the Internet Monthly Report. If you need to know everything that happens, as it happens, you would subscribe to the IETF-announce list. All RFC-dist list messages are copied to the IETF-announce list, as is the Internet Monthly Report, so subscribing to the IETF-announce list is the most comprehensive option. Table 5.1 includes subscription information for these lists as well as URLs for list archives. Readers are urged to visit the archive sites listed in Table 5.1 and read all instructions about a mailing list before subscribing.
Table 5.1 Addresses for Subscribing to IETF-related Mailing Lists

MAILING LIST | EMAIL ADDRESS | ARCHIVE SITE | NOTES
RFC-dist | majordomo@zephyr.isi.edu | (included in IETF announce list archive) | Message body should read "subscribe rfc-dist".
IETF-announce | ietf-announce-request@ietf.org | www.ietf.org/mailarchive/ietf-announce/maillist.html | Use "subscribe" as both the subject line and the message body.
Internet Monthly Report | majordomo@isi.edu | ftp://ftp.isi.edu/in-notes/imr/ | Message body should read "subscribe imr".
IETF Discussion List | ietf-request@ietf.org | www.ietf.org/mailarchive/ietf/maillist.html | Use "subscribe" as both the subject line and the message body.
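The subscription conventions in Table 5.1 can be captured in a short sketch that builds, but does not send, the request message for each list. It assumes the addresses exactly as printed in the table (with the IETF-announce address read as ietf-announce-request@ietf.org); whether those addresses still accept requests is not something this sketch can tell you. The names SUBSCRIPTIONS and build_subscription are hypothetical, and so is the sender address.

```python
from email.message import EmailMessage

# (request address, subject line, message body), transcribed from Table 5.1.
# An empty subject means the table specifies only the message body.
SUBSCRIPTIONS = {
    "RFC-dist": ("majordomo@zephyr.isi.edu", "", "subscribe rfc-dist"),
    "IETF-announce": ("ietf-announce-request@ietf.org", "subscribe", "subscribe"),
    "Internet Monthly Report": ("majordomo@isi.edu", "", "subscribe imr"),
    "IETF Discussion List": ("ietf-request@ietf.org", "subscribe", "subscribe"),
}


def build_subscription(list_name: str, sender: str) -> EmailMessage:
    """Build (not send) a subscription request for one of the listed lists."""
    to_addr, subject, body = SUBSCRIPTIONS[list_name]
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = to_addr
    if subject:
        msg["Subject"] = subject
    msg.set_content(body)
    return msg


# reader@example.org is a placeholder sender address.
print(build_subscription("RFC-dist", "reader@example.org"))
```

Handing the resulting message to a mail server is left out on purpose; that step depends entirely on your own mail setup.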
NOTE The IETF discussion list can be very noisy at times: it is an open forum from which no one may be ejected and without any type of censorship. Participants sometimes veer off onto topics not relevant to the IETF, post repetitively on the same topic, or return to topics that are no longer relevant or that have already been discussed into the ground. An alternative exists for people who are busy and want to know what's being discussed, without the crosspostings, postings from known troublemakers, and repeated requests for help in unsubscribing. The ietf+censored list filters out much of the noise and can be subscribed to by sending a message to ietf+censored-request@alvestrand.no with the body "subscribe". For those interested in seeing only the rejected messages (for amusement purposes only), send a message body of "subscribe" to the address ietf+censoredrejects-request@alvestrand.no. For more information about these lists, see www.alvestrand.no/ietf+censored.html.
RFC Archives
Dozens of RFC archives scattered over the globe exist on the Internet. Some are more useful than others, and some are better than others. The RFC archive on the IETF site contains the raw RFCs as text files and, in some cases, as PostScript files. However, this is simply a file transfer site: There are no search tools here. If you need help with RFCs, you need to find another resource.
Rather than list all the RFC archives currently available, this section discusses how to locate archives and what kinds of features are available in RFC archives. Links to some of the better RFC archives are available on the companion Web site to this book; readers are urged to make their own search for a source that is appropriate for them.
Finding RFC Archives

Locating an RFC archive on the Internet is relatively simple. Try the RFC editor Web site for a list of some RFC archives:
www.rfc-editor.org/rfc.html
This is a good place to start because it describes some of the features and capabilities of the listed sites. Portal sites like Yahoo! also maintain categories related to Internet standards. Yahoo! even has a category just for RFCs (http://dir.yahoo.com/Computers_and_Internet/Standards/RFCs/), which is a fertile hunting ground for RFC archive sites. Portals may offer more selection, including off-beat archive sites. If you don't find what you want at a portal site, you can try one of the Web search sites like AltaVista, HotBot, or others. A search on the word "RFC" will undoubtedly produce thousands of matches, but you can narrow it down by adding qualifying words such as "archive," "search," or "standards." You may also be able to narrow the search down to geographic areas or languages: HotBot offers criteria based on domain, continent, and language as well as the more common Boolean searches on words.
RFC Archive Features

RFC archives usually offer a mix of features, and some mixtures are more useful than others. Some archives are simply that: repositories for raw RFC files. If you know the RFC number, you can use these archives; if not, you may be out of luck. Some of these file dumps actually list the RFC names, authors, and date of publication in addition to the number, so you can use your browser's search function to find relevant documents as long as you know what to look for. At a minimum, the archive should provide a search function. Searching should be done on the body of the RFC text, rather than just on RFC titles. Some archives' searches are too restrictive and produce way too few hits; other archives' searches are too loose and produce way too many hits. The "just right" number of hits varies from person to person, but the search results should include all the relevant RFCs without including too many irrelevant ones. Consider too the search features. Some archives permit only simple searches on one or more terms; others permit Boolean text searches. Some archives allow you to fine-tune your searches, limiting the number of hits; others restrict you to
a maximum number of hits and urge you to add terms if you exceed that maximum. Some allow complex searches with criteria relating to the title and body of the RFCs, as well as options regarding the output of the results. The more control you have over the search, the more likely you are to find just the documents you want. Some archive sites include only RFCs, while others provide access to Internet-Drafts as well. Likewise, some archive sites allow you to search or browse through document subsets, such as the STD, BCP, and FYI series of RFCs. Finally, some sites even include hyperlinking: RFCs (and possibly other documents) cited in the body of the RFC you are reading are activated as Web links. Open up one RFC, and you can immediately jump to any RFC cited in the text by clicking on it. This is a great idea, but the implementations tend to fall short. The RFC being displayed usually links to itself through the RFC number listed in the page headers. One version actually seems to link any number with three or more digits, including zip codes and binary values of fields included in protocol descriptions.
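A more careful linker is not hard to sketch. The version below turns only explicit citations of the form "RFC 821" into links, leaves bare numbers such as zip codes and bit patterns alone, and skips the number of the document being displayed. It is an illustration under stated assumptions, not how any particular archive works: the function name, the base URL, and the rfcNNN.txt file naming are all placeholders.

```python
import re

# Match "RFC 821", "RFC-821", or "RFC821"; never a bare number.
RFC_CITATION = re.compile(r"\bRFC[ -]?(\d{1,4})\b")


def link_rfc_citations(text, own_number=None,
                       base_url="https://example.org/rfc/"):
    """Wrap explicit RFC citations in HTML links.

    `own_number` is the RFC being displayed; citations of it are left
    unlinked so the page does not link to itself. `base_url` and the
    file-name pattern are placeholders, not a real archive layout.
    """
    def replace(match):
        number = int(match.group(1))
        if number == own_number:
            return match.group(0)
        return f'<a href="{base_url}rfc{number}.txt">{match.group(0)}</a>'

    return RFC_CITATION.sub(replace, text)


sample = "See RFC 821; mail to zip 90292; flag bits 0101."
print(link_rfc_citations(sample))
print(link_rfc_citations(sample, own_number=821))
```

The digit limit of four in the pattern suits the roughly 2,500 RFCs in existence as this is written and would need widening as the series grows.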
RFCs by Email
For those with email-only access to the Internet, RFCs are available by email from the RFC-INFO service. Send email to rfc-info@isi.edu and format your message body like this:
Retrieve: RFC
Doc-ID: RFC####
Replace #### with the number of the RFC you want, padding the value with zeros for RFC numbers that are lower than 1000. For example, to retrieve RFC 821, you would use RFC0821. For additional features or for help with retrieving RFCs by email, you can send a message to rfc-info@isi.edu with the message body help: help.
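The zero-padding rule is the part most easily gotten wrong by hand, so here is a minimal sketch that generates the message body. It assumes that the Retrieve and Doc-ID fields go on separate lines; the function name is invented for illustration.

```python
def rfc_info_request_body(number: int) -> str:
    """Build an RFC-INFO request body for one RFC.

    Numbers below 1000 are padded with leading zeros to four digits,
    so RFC 821 is requested as RFC0821. The two-line layout is an
    assumption about how the service expects the fields.
    """
    if number < 1:
        raise ValueError("RFC numbers start at 1")
    return f"Retrieve: RFC\nDoc-ID: RFC{number:04d}\n"


print(rfc_info_request_body(821))
print(rfc_info_request_body(2026))
```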
Getting Internet-Drafts
Subscribing to the IETF-announce mailing list will get you, among other things, announcements of publication of all new I-Ds. These announcements include URLs you can use to retrieve a copy of the I-D. You may also want to search for I-Ds that relate to a particular technology or issue. You can see all I-Ds generated by a particular IETF working group at the active working group Web page: www.ietf.org/html.charters/wg-dir.html
Choose the working group of interest from this list, and you'll see all its RFCs and I-Ds. Many related organizations also maintain archives of I-Ds as well as RFCs; for example, the Internet Mail Consortium maintains RFCs and I-Ds related to Internet mail at its Web site:
www.imc.org/mail-standards.html
The IETF maintains the most up-to-date and comprehensive list of I-Ds. The main repository for Internet-Drafts is at:
www.ietf.org/ID.html
From this page, you can do a keyword search, browse through the I-D directory, or view guidelines for I-D authors. Many of the other good RFC repositories also include facilities for searching for I-Ds.
Reading List
Rather than suggest any specific references for additional reading in this chapter, you should go to your favorite Web search or portal site to search for RFC archives. If you can't find at least five, try another search engine. Now, try each of the archive sites you've located and see which one best suits you. Regardless of whether you do your own search, be sure to visit Lynn Wheeler's IETF RFC Index site (www.garlic.com/~lynn/rfcietf.html). It is one of the most comprehensive and useful archives around. Wheeler provides the ability to view specifications at different stages of the standards track as well as to view only specific document series. Also included here are links to specifications that have been made obsolete as well as the specifications that have replaced them. Links to related sites are also useful.
CHAPTER
6
Reading the RFCs
As mentioned in Chapter 5, Getting the RFCs, RFC consumers tend to fall into three categories: casual readers, deployers, and developers. Just as each type of consumer has slightly different requirements for obtaining and tracking RFCs and Internet-Drafts, so too does each type of consumer use these documents in a different way. This chapter takes a look at how people use RFCs. Though it may not be apparent from reading some of the earlier RFCs, new Internet documents must conform to a very specific set of stylistic requirements. RFC 2223, Instructions to RFC Authors, is a must-read for anyone who plans to write an Internet-Draft or RFC. It is also useful for understanding just what is and is not included in an RFC. All RFCs are published as ASCII text files because ASCII is a universal format, accessible to anyone with email or better Internet access. Occasionally an RFC may also be published in PostScript to provide additional detail to graphics included in the document, but most ASCII RFCs include text-based graphics. All modern RFCs adhere to a strict page format, with headers that contain the RFC number, title, and month and year of publication and footers that contain the author(s), RFC category (informational, standards-track, best current practices, or experimental), and page number. The first page displays the RFC number, the authors' names, their organizational affiliation, and a line indicating which previous RFCs the current one updates or makes obsolete. If the document has any other numbers, for example an STD, FYI, or BCP number, these are listed at the top of the first page as well.
RFCs must have a status section, identifying the RFC as documenting a standards track specification, a best current practice (BCP), an experimental specification, or an informational document. The status section consists of one of four boilerplate paragraphs, each one indicating a different type of document. A brief boilerplate copyright notice, reserving copyright for the Internet Society, is also required on the first page, with a longer piece of boilerplate added at the end of the document. The introduction section briefly describes the document itself. This section is often the most useful when the reader is searching for a particular specification. The introduction summarizes the RFC, usually in a few paragraphs or fewer. The introduction section is usually derived from the abstract section of its precursor Internet-Draft. Other required sections for RFCs include a references section citing all previously published documents to which the RFC refers and a security considerations section discussing potential security issues raised by the RFC. The authors' address section is also required, as it permits readers to send questions or comments directly to the author. Of course, these sections are the shell within which the meat of the RFC is nestled. After the introduction, the specification is described in detail. The first section after the introduction often describes pertinent terminology and may be followed by a section or sections describing the requirements or circumstances that caused the specification to be written. The protocol headers and fields are then described, followed by discussion of specific protocol features and how they work. The last sections may discuss how the protocol interacts with other protocols, how it should be implemented or deployed, or any other issues that need to be addressed in order to implement the protocol. Appendices are often used where appropriate. RFCs usually describe behaviors and attributes of protocols.
They tell you how a system using the protocol should work. From there, you can build your own implementation of the protocol. RFCs don't usually explain how to build the implementations; they just tell you how an implementation would work if it were built. Some RFCs describe protocol APIs, but these still describe how the implementations must behave rather than how to actually program the implementations.
Understanding Protocols
Perhaps the most common use of RFCs is to understand what the protocol being specified actually does and how it works. Casual readers as well as developers must first look to the RFC for a basic understanding before anything else can happen. Casual readers may be able to stop there, although both deployers and developers need to look beyond a basic understanding of the specification to meet their needs. Getting a basic understanding of a protocol may be as simple as reading the introduction section of the RFC; this is often all that is necessary. However,
things are not always that simple. Sometimes it is necessary to read through the entire RFC, and even then the answer may not be apparent. In those cases, it may be worthwhile to check on the citations in the RFC reference section as well as any related or dependent specifications. When all else fails, the casual reader may need to take a short course in TCP/IP internetworking.
Reading the RFCs

Always start with the introduction. This usually is the most precise and concise summary of the specification available anywhere. The introduction usually explains what the protocol does and how it does it. You can often rule out an RFC as being irrelevant by looking at the introduction. You may also determine that a protocol does what you want it to do, the way you want to do it, by reading the introduction. More often, you will need to delve further into the RFC to find what you need. Sometimes you will have to pay careful attention to the definitions section, especially if the specification refers to systems or processes with which you are unfamiliar. Sometimes the definitions will be mostly formalized descriptions of terms that are well understood. Read any sections that discuss the background of the problem that the specification solves or attempts to solve. This section may describe not only the approach used by the specification described in the RFC but also other competitive or precursor solutions. Most casual readers will have found their answer by now. Most of the basics of the protocol and its special features are outlined in the first few sections of the RFC; reading beyond that into the protocol nitty-gritty of headers and detailed specifications may not provide answers to simple questions. At this point, it may be useful to look at references and protocol dependencies rather than attempting to divine further meaning from the RFC in question.
Checking References and Dependencies

If the reader simply wants to understand the broad outlines of the protocol, sometimes more background is needed rather than more detail about the specification being examined. Internet protocol specifications often expand as they are updated, especially as they progress along the standards track. A specification that is sufficient for a proposed standard usually expands over time as people uncover potential problems or issues related to the deployment of the protocol on the Internet. RFCs of updated protocols tend to expand in order to deal with, or at least acknowledge, these issues and problems. To understand the basic concepts of a revised specification, it is sometimes useful to go back to the original specification. Likewise, documents cited in the references section often include discussions of the issues and approaches and concepts used as background for development of the specification described in the RFC. Often, the documents are other RFCs or Internet-Drafts and are easily accessible online. An RFC can specify a new version, update, or replacement for an existing specification. For example, when a proposed standard has been revised and moved to
draft standard status, the new version is given a new RFC number. However, just looking at the original, proposed standard RFC will not give you any indication that the RFC has been deprecated. Relations between RFCs are indicated on Lynn Wheeler's RFC Web site (www.garlic.com/~lynn/rfcietf.html).
Getting More Help

When reading the RFC doesn't help and the references are similarly unenlightening, the casual reader may need more internetworking background. Books like TCP/IP Clearly Explained, 3rd edition (Pete Loshin, Morgan Kaufmann 1999) or Illustrated TCP/IP (Matthew Naugle, Wiley 1998) provide general readers with enough background to begin to read RFCs with greater understanding of the basic concepts of internetworking. Alternatively, casual readers can often find answers to technical questions, or at least pointers to good sources for such answers, on mailing lists and newsgroups devoted to the protocol specified in the RFC. Public newsgroups are available for most Internet protocols, as are FAQs for those groups. Mailing lists maintained by the relevant IETF working group are often helpful, though the reader is still urged to check the list archives or read a few days' worth of newsgroup postings to see what kinds of questions are encouraged and to see whether the question has recently been answered.
RFC Troubleshooting
From "what does this protocol do?" to "why doesn't this protocol work?" is a big step. Deployers of protocols need to be able to read the specifications with a more critical eye than most casual readers. They need to understand not only how to identify which protocol is at fault and how protocols behave, but also how to look at network traffic and determine what is actually going on. Casual readers often know exactly what protocol and even what RFC they should be looking for: They may have been told that a certain specification solves their problems or is incorporated into a new networking product they are considering for purchase. Troubleshooters have no such assurances. More often, the people involved with troubleshooting products that have been deployed know only that some system is not working as they believe it should be working. They must first analyze the problem and rule out the more commonplace causes before they need to examine the protocol specifications. Protocol analysis tools capture and decode network traffic. Network managers can examine protocol behaviors by looking at the traffic being sent and received by local systems. Initial descriptions of network problems usually are phrased in terms of lack of connectivity between systems or failure of system functions despite apparent connectivity. The solution usually lies in something external to the protocol, or at least in some issue relating to the installation or configuration of the protocol implementation. Network support staff are usually able to solve problems through the process of elimination: Misconfigured systems, disabled servers, or overly strict
firewalls account for a large portion of these problems. Network engineers using protocol analyzers can eliminate the peskier problems that relate to protocol implementations by studying the actual network transmissions and determining whether the implementations are behaving as they are supposed to. Problems with implementations can also be tracked down by replacing the problem system or systems with other systems known to work correctly. Scanning network transmissions, the engineers can determine whether the functionally equivalent systems are actually behaving in the same ways. Even though Internet standards are well defined, as are the requirements for implementing them, not all implementations are created equal. An implementation may be incomplete or incorrect either by design or through error, but to detect a problem you must understand what the implementation is supposed to do, as defined in the relevant RFC. Deployment professionals who are reading RFCs for troubleshooting purposes must be able to go beyond the basics of the specification and understand the specific functions defined by the protocol. Thus, it is not enough to understand that TCP, for example, uses four different timing functions to guarantee service across a virtual circuit. You must also understand exactly how those timers are supposed to work and why performance can be disrupted if one or more of them is improperly implemented. Newsgroups and mailing lists are far more important tools for the deployment professional than for the casual RFC reader, as they are sources of information about specific implementations as well as good places to ask technical questions. Whereas basic questions about a protocol are less welcome in such forums, issues about how protocol implementations actually work are central to the operation of the IETF working groups as well as to participants on newsgroups and mailing lists.
Building Protocol Implementations

No one has a greater need to be on top of the Internet standards process than the people responsible for building the applications that implement standard protocols. Any implementer who waits for a specification to achieve full Internet standard status loses all hope of ever attaining significant market share without huge cost. This is what happened to Microsoft when it first started building Internet software. Microsoft started building its Web browser long after Netscape and Spyglass dominated the market. Ultimately, Microsoft garnered significant share only by giving away its browser and by bundling it into as many products and packages as it could manage. In effect, Microsoft had to start from scratch in order to catch up; in fact, Microsoft licensed browser code from Spyglass before building its own browser. Having learned from experience, Microsoft now participates in many IETF working groups. Not only does Microsoft gain access to valuable information about what the new and revised standards will look like, but it also guides those efforts through the working groups.
If you want to implement a protocol for a commercial product or service, you are not alone. If you want your product to succeed, it must be timely, and that means, at the least, tracking protocol development from the Internet-Draft phase. Ideally, protocol implementers actively participate in the standards development process through working group mailing lists and by attending working group meetings.
Understanding the Standards

The relevant protocol is not always immediately apparent, nor is there only a single relevant protocol for a particular application. A developer working on collaborative workgroup software may be affected by specifications relating to Internet messaging, calendaring and scheduling, IP multicast, multimedia data transmission, and quality of service. Switch and router developers must stay abreast of all developments relating to IP routing as well as to data link layer transports like Ethernet, ATM, Frame Relay, FDDI, and others. Understanding the standards means not just reading the relevant RFCs and Internet-Drafts, but also looking at existing implementations. The IETF maintains reports on implementations of Internet standards (at www.ietf.org/IESG/implementation.html). Prospective implementers are advised to start here by looking at existing implementations and reports about those implementations (there is some talk of enhancing the IETF's role in making reference implementations of standard protocols available). It's important to remember that by working within the system, implementers have access to far more information and assistance than is incorporated into the RFCs.
Reading List
The best way to get comfortable with reading RFCs is by simply reading an RFC that covers a topic that is of interest. In addition to the RFCs listed in previous chapters, you can use your preferred mechanism to search for an RFC on some aspect of Internet messaging, and then simply try to make sense of it. As you read, try to answer some questions:
What kind of RFC is it? (e.g., informational, standards track)
Does the RFC belong to some other document series? (e.g., BCP, STD, FYI)
Is there any way you can use the RFC to confirm that a behavior or characteristic of some familiar application or system complies with the RFC?
Can you locate any of the RFC's references, to gain greater insight into the RFC you just read?
All these exercises should help you master the art of reading RFCs.
PART
Two
Internet Messaging Standards
There are really only a relative handful of full Internet standards for messaging, but quite a few specifications are on the standards track and in general use. The rest of this book examines specific messaging protocols. Chapter 7, Messaging Standards, provides a 30,000-foot overview of Internet messaging specifications, briefly describing the different types of protocols being used, how they are used, and which RFCs document them. Within the rest of the book, the chapters are broken down by function, starting with message formatting, followed by message transport protocols, Internet messaging applications, and ending with discussion of message security specifications and the future of Internet messaging. Chapters 8 and 9 address message formatting. In Chapter 8, Internet Message Format Standard, we examine the fundamental message format required for all Internet messages, while Chapter 9, Multipurpose Internet Mail Extensions (MIME), introduces the Internet standard for formatting message attachments. Chapter 10, Simple Mail Transfer Protocol (SMTP), examines the fundamental message transfer protocol, and Chapters 11, Post Office Protocol (POP), and 12, Internet Message Access Protocol (IMAP), describe the two most important mechanisms available for distributing messages from servers to end users. Chapter 13, SMTP Message Address Resolution, discusses how Internet message applications are able to deliver messages based on email addresses.
Chapter 14, Network News Transfer Protocol (NNTP), discusses the specifications for network news. Chapter 15, vCard, describes specifications related to the exchange of directory information, and Chapter 16, Calendaring and Scheduling Standards, examines Internet calendaring and scheduling specifications. Chapter 17, Internet Messaging Security, discusses the current state of Internet messaging security specifications, with particular attention to the shortcomings of what is available as well as brief introductions to other Internet security mechanisms. Finally, Chapter 18, The Future of Internet Messaging, looks forward to the future of Internet messaging, introducing some of the topics currently being considered by IETF working groups.
CHAPTER
7
Messaging Standards
Fundamental specifications for the most important Internet messaging applications, email and news, have been standards for many years. The defining protocols for email were first published as RFCs in 1982. The mission of the Detailed Revision/Update of Message Standards (DRUMS) IETF working group is to develop and review revised versions of these basic standards. The DRUMS objectives direct the working group to include corrections, clarifications, and amplifications to reflect existing practice or to address problems which have been identified through experience with Internet mail protocols. For the most part, however, these protocols have proven remarkably robust and scalable; as the DRUMS working group description explicitly states, "New functionality is expressly inappropriate." In this chapter, we examine the various forms Internet messaging standards take. We start by identifying the different aspects of any Internet messaging application, including the Internet standards for message formats, the functions carried out by message transfer agents (MTAs) and user agents (UAs), the differentiation between messaging servers and messaging clients, and the use of the Domain Name System (DNS) and the mail exchange (MX) record type. The rest of the chapter provides an overview of the standards specified for the three most important categories of Internet messaging application: email, network news, and Internet workgroup/collaboration.
Internet Messaging Components

Internet messaging services must define certain requirements and standards if they are to be successful. These requirements include:
A message format. All messages should be easily identifiable as messages. All messages should include all information necessary to enable the messaging system to deliver the message. All message content should be identifiable and portable across systems with different data representation formats. Relevant specifications define how messages are to be formatted.
A framework for moving messages from one system to another, across the network. In Internet messaging terms, message transfer agents (MTAs) fulfill this function, collecting messages to be delivered from UAs and passing them along to other MTAs that can deliver the message to the UA designated for the recipient of the message. Relevant specifications define how systems forward messages from the source MTA to the destination MTA.
A framework for delivering messages to users. In Internet messaging terms, the user agent (UA) acts on behalf of the user for these purposes. Relevant specifications define the protocols to be used by client software to accept incoming messages and store them locally for the recipient to read.
A mechanism for resolving message addresses, usually expressed in human-friendly formats, into machine-friendly addresses to facilitate delivery. The Domain Name System (DNS) mail exchange (MX) records fulfill this function for Internet messaging applications. Relevant specifications define how the mail exchange record is formatted and how the DNS responds to requests by messaging systems.
Although this architecture doesn't specify servers and clients, in effect the UAs usually act as clients, and MTAs usually act as servers when interacting with UAs. However, MTAs can behave as both clients and servers when transferring messages to and from other MTAs.
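To make the address-resolution requirement more concrete, the sketch below shows roughly what a sending system does with an address once it has MX data in hand. It performs no DNS lookup: the SAMPLE_MX table, the host names, and the helper names split_address and order_mx_hosts are all assumptions made up for illustration. The only protocol fact relied on is that MX records carry a preference value and that lower values are tried first.

```python
from typing import Dict, List, Tuple

# Hypothetical MX data standing in for real DNS answers:
# domain -> list of (preference, mail exchanger host name).
SAMPLE_MX: Dict[str, List[Tuple[int, str]]] = {
    "example.com": [(20, "backup.example.com"), (10, "mail.example.com")],
}


def split_address(address: str) -> Tuple[str, str]:
    """Split a human-friendly address into (local part, domain)."""
    local_part, _, domain = address.rpartition("@")
    if not local_part or not domain:
        raise ValueError(f"not a usable address: {address!r}")
    # Domain names are case-insensitive, so normalize for the lookup.
    return local_part, domain.lower()


def order_mx_hosts(mx_records: List[Tuple[int, str]]) -> List[str]:
    """Return mail exchanger hosts, most preferred (lowest value) first."""
    return [host for _, host in sorted(mx_records)]


if __name__ == "__main__":
    _, domain = split_address("user@Example.COM")
    print(order_mx_hosts(SAMPLE_MX[domain]))
```

A real MTA would obtain the records from the DNS rather than from a table, and would fall back to the next host in the list when the preferred one cannot be reached.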
Message Formatting
We address three levels of message formatting issues here: data representation, message and header formatting, and enclosure formatting.
Character Representation
The most basic question about formatting is how to represent the data in the message and message header. It's easy enough to say all messages must consist of letters, numbers, and punctuation, but at a very basic level everyone who wants to communicate must agree to use the same set of bits to represent
those characters. Anything that is sent out onto the Internet as a message must use the same kinds of bits to represent the same kinds of information. We've already got a standard for Internet data representation. It's called 7-bit ASCII or US-ASCII, and it uses the low-order seven bits of each byte to represent one of 128 values; the eighth bit is always set to 0. Internet messages must use 7-bit ASCII for their headers, at least, and Internet messaging systems must treat non-ASCII data in a way that is consistent with other systems expecting to process only 7-bit ASCII data. Messaging system implementers could just as easily dispense with 7-bit ASCII and deploy their systems using some other character encoding scheme. As long as all parties using that system used the same character encoding scheme, they could all happily interoperate with each other.
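The 7-bit rule can be checked mechanically: a byte is valid US-ASCII exactly when its eighth (high-order) bit is 0, that is, when its value is below 128. The short sketch below illustrates the test; the function name is_seven_bit and the sample strings are assumptions made for this example.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True if every byte has its eighth (high-order) bit set to 0."""
    return all(byte < 0x80 for byte in data)


if __name__ == "__main__":
    header = "Subject: Quarterly report".encode("ascii")
    # The accented character encodes to bytes above 127 in UTF-8.
    body = "Caf\u00e9 menu attached".encode("utf-8")
    print(is_seven_bit(header))
    print(is_seven_bit(body))
```

Data that fails this test is the kind of content that cannot simply be dropped into a message and must instead be handled by an encoding mechanism such as MIME.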
Message Syntax
Messages may consist of ASCII data, but that data must be organized in a way that makes it recognizable as a message as opposed to a simple stream of characters. RFC 822, Standard for the Format of ARPA Internet Text Messages, specifies how Internet messages must be formatted for them to be treated as messages. In other words, RFC 822 specifies a syntax for messages. According to the RFC, each message consists of an envelope and contents. The envelope contains whatever information is necessary to deliver the contents. The structure and format of the envelope are not specified by RFC 822 or its successors. However, the envelope may be created by a messaging application by extracting certain information from the contents of the message. The Internet Message Format Standard (MFS) was not published as an RFC as of late summer 1999, but it clarifies the standard defined in RFC 822 and otherwise leaves it functionally intact. MFS specifies that messages consist of a set of header fields, which are lumped together into the message header, and an optional message body. You can send a message with no body, but you can't send a message with no header. Some header fields are required; others are optional. All must adhere to the proper syntax. Messaging applications may peek into the message contents and use the data in message header fields to create the message envelope. RFC 822/MFS defines required and optional message headers, as well as specifications for new and custom message headers. Chapter 8 discusses the RFC 822/MFS header specifications in detail. RFC 822/MFS defines the syntax for the message contents, not the message envelope. The format of the message envelope is determined by the application handling message delivery, which may be the Simple Mail Transfer Protocol (SMTP), the Network News Transfer Protocol (NNTP), or some other message transfer protocol.
MFS also limits itself to defining messages entirely consisting of US-ASCII characters and leaves specification of binary data enclosures to
other standards, notably MIME. Internet message formatting standards are examined in detail in Chapter 8, Internet Message Format Standard.
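As a rough illustration of the header-plus-optional-body structure described in this section, the sketch below parses a small message with Python's standard email package. The addresses and text are made up for the example. On the wire, lines end with CRLF; plain newlines are used here only to keep the sample short.

```python
from email import message_from_string

# A made-up message: header fields, a blank line, then the body.
RAW_MESSAGE = (
    "From: alice@example.com\n"
    "To: bob@example.org\n"
    "Subject: Lunch\n"
    "Date: Mon, 2 Aug 1999 09:00:00 -0400\n"
    "\n"
    "Are you free at noon?\n"
)

message = message_from_string(RAW_MESSAGE)

if __name__ == "__main__":
    # Everything before the first blank line is the message header.
    for name, value in message.items():
        print(f"{name}: {value}")
    # Everything after it is the (optional) message body.
    print(repr(message.get_payload()))
```

Note that nothing in the parsed result is an envelope: the parser sees only the message contents, which is exactly the part that RFC 822 specifies.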
Enclosure Formatting
For many years, different email systems used different approaches to file attachments. Some of these approaches persist, but this book focuses on the Internet standards for enclosing nontext data into Internet messages. The Multipurpose Internet Mail Extensions (MIME) standards define mechanisms that allow senders to enclose, and recipients to detach, non-ASCII objects that have been sent with standard Internet messages. MIME is a highly extensible specification that defines a mechanism for defining special types of data. MIME can be used for binary and multimedia data, including programs, graphics, audio, video, and any other non-ASCII data. MIME can also be used for ASCII-based data that carries its own formatting. For example, a rich text format (RTF) word processing file consists of ASCII text and formatting tags. You could encapsulate an RTF document in a MIME enclosure that would allow a MIME-enabled client capable of rendering RTF text to display the enclosure formatted as an RTF document rather than simply as a text file with RTF tags. MIME uses two sets of headers to do most of its work. The content-type header identifies what kind of material is being enclosed; the content-transfer-encoding header identifies how the content is encoded. MIME content-transfer-encoding headers identify how the data within the MIME enclosure is formatted. Possible options range from 7-bit ASCII to binary. There are only a handful of MIME content transfer encoding methods. MIME content types consist of two parts: a type and a subtype. The MIME type provides a general description of the kind of data carried in the MIME enclosure. The subtype offers a more specific description of the type of data enclosed. For example, the MIME body type TEXT (which refers to ASCII text enclosures) could be combined with the subtype PLAIN. This content-type refers to a plain ASCII enclosure, which is functionally identical to an RFC 822-compliant message body. 
Although there are only a few MIME content types, each type may have many subtypes. RFC 2045 through RFC 2049 specify the fundamentals of MIME. In addition to these RFCs specifying how MIME works, numerous other RFCs specify new MIME content types. Chapter 9, Multipurpose Internet Mail Extensions (MIME), examines the MIME specifications in detail and lists RFCs that define special MIME content types. Secure MIME (S/MIME) is a proposed Internet standard that defines a mechanism for consistently sending and receiving MIME data securely. S/MIME uses digital signatures to provide authentication, data integrity, and nonrepudiation of origin on MIME enclosures. It also uses encryption to provide data security for MIME enclosures. RFC 2633, S/MIME Version 3 Message Specification,
defines how S/MIME works. S/MIME is discussed in Chapter 17, Internet Messaging Security. News and mail are two separate Internet applications, and each has its own message transfer protocol. The scheduling and calendaring protocols covered in this book do not define their own message transfer protocols, but rather define MIME types in which application data is carried. Thus, these collaborative messaging protocols finesse the question of how their data is distributed. It may be distributed using email or news transfer protocols, over the Web, or by some other method, but in any case, that distribution is beyond the scope of the collaborative protocol itself.
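The content-type and content-transfer-encoding headers discussed earlier in this section are easiest to appreciate by looking at a message that carries an enclosure. The sketch below uses Python's standard email package to build one and then lists the two headers for each part. The addresses, the file name, and the eight stand-in bytes are assumptions for illustration, and the encodings shown are whatever that library chooses by default, not something mandated by the MIME RFCs.

```python
from email.message import EmailMessage

message = EmailMessage()
message["From"] = "alice@example.com"
message["To"] = "bob@example.org"
message["Subject"] = "Logo draft"

# A plain ASCII body: content type text/plain.
message.set_content("The logo is attached.")

# A few non-ASCII bytes standing in for a real image file.
fake_png = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])
message.add_attachment(
    fake_png, maintype="image", subtype="png", filename="logo.png"
)

if __name__ == "__main__":
    # Walk the container and each enclosed part, showing type/subtype
    # and how the part's data is encoded for transfer.
    for part in message.walk():
        print(part.get_content_type(), part.get("Content-Transfer-Encoding"))
```

Each content type printed has the type/subtype form described above; the binary part is the one that needs a transfer encoding capable of carrying non-ASCII data through 7-bit systems.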
Message Transfer
Standard message formatting is fundamental to Internet messaging, but once a standard is defined for formatting messages, you must then define standards for forwarding those standard-format messages. The goal of creating a message format standard is to allow any kind of system to be able to identify a message and then process it appropriately. The message format standards apply to the contents of the message; message transfer standards apply to the envelope of the message. Just as package delivery services don't need to look at what is inside a package to deliver it, the message transfer protocols don't care what is contained within the message. The message transfer protocols are concerned with only the addressing and delivery information stored on the message envelope. Message transfer protocols define how messages are forwarded from one system to another across the Internet or other networks. Message transfer protocols purposely avoid the question of how to deliver messages to end users. Consider the Simple Mail Transfer Protocol (SMTP), defined in RFC 821 and exhaustively updated in an RFC expected to be published later in 1999. This protocol defines SMTP functions, including SMTP client and server as well as SMTP relays and gateways. An SMTP server may be fully capable, meaning that it can perform all specified tasks of an SMTP server, or it may be less capable, meaning that it performs only certain SMTP tasks. The flow of messages within the SMTP-defined transport is from a message store (a file system) on one server, across the network, to another message store on the destination server. How those messages get in or out of the message stores (that is, how a user sends a message to an SMTP server and how a user receives a message from an SMTP server) is beyond the scope of the SMTP specification, but not beyond the scope of other specifications we discuss in this book.
A fully capable SMTP server is able to act as an SMTP client and initiate requests to transfer messages to another SMTP server, as well as act as an SMTP server and respond to requests from SMTP clients. Some SMTP servers may act only as relays, accepting messages that are placed in their message stores and then
forwarding them to another SMTP server, regardless of the ultimate delivery destination. Email client software usually behaves as a less capable SMTP server, submitting messages from the user to a more capable SMTP server for forwarding. An organization might also use a relay SMTP server to collect outbound messages from users and then forward them outside a firewall to another SMTP server that has Internet connectivity. A gateway SMTP server acts as a protocol gateway, meaning that the gateway accepts messages using SMTP, but then converts those messages into some other message delivery protocol for local distribution. An organization using a proprietary messaging product for internal messaging, such as Lotus cc:Mail, would use an SMTP gateway system that could accept inbound SMTP messages and convert them into cc:Mail format for delivery on the local network. Inasmuch as most SMTP clients also act as servers, the SMTP specifications tend to speak of SMTP senders to refer to systems forwarding messages to another SMTP server. SMTP receivers refer to systems receiving messages from an SMTP server that acts as a client. SMTP defines how SMTP senders and receivers communicate, how messages are passed from SMTP senders to receivers, and how errors are reported. SMTP is examined in detail in Chapter 10, Simple Mail Transfer Protocol. The Network News Transfer Protocol (NNTP), defined in RFC 977, solves a problem similar to that solved by SMTP: the delivery and distribution of messages among servers spread across the Internet (or other networks). NNTP functions in some ways similarly to SMTP, with some important distinctions. NNTP is examined in detail in Chapter 14, Network News Transfer Protocol (NNTP).
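The split between message contents and message envelope can be sketched in a few lines. The example below shows one way a sending application might derive envelope addresses from header fields before handing the message to a transfer protocol. The helper name build_envelope and the sample addresses are assumptions for illustration, and real mail software applies more rules than this (for example, it would normally strip any Bcc field from the contents before sending).

```python
from email import message_from_string
from email.utils import getaddresses, parseaddr
from typing import List, Tuple

RAW_MESSAGE = (
    "From: Alice <alice@example.com>\n"
    "To: Bob <bob@example.org>, carol@example.net\n"
    "Subject: Status\n"
    "\n"
    "All systems are running.\n"
)


def build_envelope(raw_message: str) -> Tuple[str, List[str]]:
    """Derive (envelope sender, envelope recipients) from header fields.

    build_envelope is a hypothetical helper, not part of any standard.
    """
    message = message_from_string(raw_message)
    _, sender = parseaddr(message.get("From", ""))
    recipient_fields = (
        message.get_all("To", [])
        + message.get_all("Cc", [])
        + message.get_all("Bcc", [])
    )
    recipients = [addr for _, addr in getaddresses(recipient_fields) if addr]
    return sender, recipients


if __name__ == "__main__":
    sender, recipients = build_envelope(RAW_MESSAGE)
    # A transfer step would take these envelope values separately from the
    # message text; Python's smtplib.SMTP.sendmail(from_addr, to_addrs, msg)
    # has this shape. No network connection is made in this sketch.
    print(sender, recipients)
```

Once the envelope exists, the transfer protocol works from it alone, which is why a relay can forward a message without interpreting its contents.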
Message Delivery
So far, we've discussed the protocols used to specify message formatting and the protocols used to move a message from one server to another server. Missing, so far, are the mechanisms for moving a message from an end user to a server or from a server to an end user. The message forwarding protocols describe how messages can be moved from one server's message store to another server's message store, but leave out the problem of how those messages arrive in the originating server's message store and how they are retrieved from the receiving server's message store. Three common mechanisms place messages in front of users and submit messages from users. The simplest is to implement an SMTP server that has a user interface for receiving messages and allow the user to read those messages with some utility such as a text processor or email client. The problem with this approach is that it requires that the user's system be available at all times to receive incoming messages. It cannot be turned off at night, nor should it be brought down for any significant length of time for things like system management, upgrades, or even just a system crash. This is not to say that
this approach is never used; it just isn't a suitable solution for most end users. SMTP is examined in more detail in Chapter 10.
A second mechanism defines a protocol that allows a server to act as a post office for a client, fulfilling a function similar to the way a brick-and-mortar post office offers post office box services. In the real world, the post office collects mail through its own delivery network. A user physically goes to the post office every so often to check his or her mailbox for mail; if there is any, the user empties the mailbox and goes back home to read the mail. Servers implementing the email Post Office Protocols (POP) collect messages by acting as a receiving SMTP system and putting the messages in a data store. The email client checks the user's electronic mailbox every so often to see if any messages are there. If there are, the client downloads the messages and puts them on a local file store for the user to read. The post office server is always up and always ready to accept messages on behalf of the user. Users can check their mailboxes every minute or every week, and the server will deliver all current messages in either case (depending, of course, on whether the server has enough storage space and is properly maintained and functioning at all times).
The Post Office Protocol version 3 is an Internet standard and is specified in RFC 1939. POP provides a simple service: Clients can get very basic information about messages in the user's mailbox, download those messages, and delete them from the server's message store. POPv3 is examined in detail in Chapter 11, Post Office Protocol (POP).
As mentioned in Chapter 1, Internet email differs from most proprietary email systems in the amount of function provided through the server and server message stores. Internet email server message stores are not intended to be used for persistent storage, but rather as transient stores.
Proprietary email systems have traditionally worked better than Internet email systems if you want to save your messages, organize them into folders, and access stored messages from various clients and locations at different times. The Internet Message Access Protocol (IMAP) version 4 revision 1 is a proposed Internet standard defined in RFC 2060. Over the past few years, software vendors have been increasingly incorporating IMAP support into their client and server offerings. Internet service providers have been dragging their heels in offering IMAP services to their customers, as IMAP requires dedication of significant file storage resources to message stores. Organizations, on the other hand, have been more willing to support IMAP for their users, especially as they can replace proprietary messaging products with IMAP. IMAPv4r1 is examined in detail in Chapter 12, Internet Message Access Protocol (IMAP).
Address Resolution
Email addresses generally take the form somebody@example.com, where the domain (in this case, example.com) is accessible through the Domain Name
System (DNS). The goal of systems forwarding messages is to deliver those messages to the domain in question. To do so, they must resolve the domain name into a network address. This is usually accomplished using the Internet standard for DNS as defined in RFC 1034, Domain Names - Concepts and Facilities, and RFC 1035, Domain Names - Implementation and Specification. Messaging servers must be able to resolve more than just the domain of an email address. Once the message is delivered to the domain, some mechanism is necessary to associate the email address with an account on a particular system. The email address somebody@example.com may represent an email account held by someone named somebody on some system hosted within the domain example.com. The entity may be accepting email at a host actually called smtp.ne.example.com, so a special mail exchange (MX) record has been defined for resolving email addresses. RFC 974, Mail Routing and the Domain System, as well as the expected update to the SMTP specification, define how this address resolution takes place. Mail address resolution standards are examined in Chapter 13, SMTP Message Address Resolution.
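To make the role of the MX record a little more concrete, here is a minimal sketch in Python of how a sending system might order the mail exchangers for a domain. The records are hard-coded and hypothetical (backup.example.com and the preference values are invented for illustration); a real system would obtain them from a DNS query. The one fact relied on from RFC 974 is that exchangers with lower preference values are tried first.

```python
# Hypothetical MX records for example.com as (preference, host) pairs.
# A real mailer would get these from a DNS lookup, not a hard-coded list.
mx_records = [
    (20, "backup.example.com"),
    (10, "smtp.ne.example.com"),
]


def delivery_order(records):
    """Return mail exchanger host names, lowest preference value first."""
    return [host for _, host in sorted(records)]


print(delivery_order(mx_records))
```

A sender would try each host in the returned order until one accepts the message.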
Internet Messaging Applications

So far, we've spoken of email and news as two Internet messaging applications, and workgroup scheduling and calendaring represent another set of applications. However, strictly speaking, Internet messaging applications can perform virtually any function that uses the asynchronous exchange of formatted information. You can define one or more MIME types to contain particular kinds of information using a specific format and use those MIME types to format and exchange information between two or more entities. To illustrate, consider a literary property auction protocol that uses Internet messaging protocols for transferring information among participants. All data is encapsulated within a special MIME type, and all the instances of the MIME type are encapsulated within RFC 822/MFS standard messages. The application protocol itself might identify a series of steps related to literary auctions, starting with an author submitting a proposal to a literary agent. The author might specify publishers to be included or excluded in the auction, or the agent might be given free rein in choosing auction participants. The protocol could further specify what information is necessary and what information is optional for an auction invitation (for example, book topic, category, author's name, advance requested, and so forth). The protocol could also specify what information is necessary for bid messages, including terms of the offer, publisher identification, and so on. The messages themselves are formatted as Internet standard messages and transferred using Internet message transfer protocols; the application operates on the information contained within the messages.
Why build such a protocol on top of the Internet messaging protocol infrastructure, rather than create a new application with its own application transport protocol? The answer is that the Internet messaging infrastructure provides the least common denominator for applications: Virtually everyone with Internet connectivity is able to receive and interpret RFC 822/MFS formatted messages. Message forwarding systems are widely deployed and very well understood. Automation tools and programming interfaces are easily accessible for customizing and automating message handling and message parsing. Messaging applications discussed in this book include Internet calendaring, based on the iCalendar specification, and the vCard standard for electronic exchange of business card information. iCalendar, a proposed Internet standard, is defined as a MIME type in RFC 2445, Internet Calendaring and Scheduling Core Object Specification (iCalendar). By itself, this specification helps define the type of information that can be used in Internet calendaring applications, but more is necessary. These data containers are called iCalendar objects, and these objects can be used by calendaring and scheduling applications. Another proposed Internet standard, RFC 2446, iCalendar Transport-Independent Interoperability Protocol (iTIP): Scheduling Events, BusyTime, To-dos and Journal Entries, specifies how different calendaring applications can use the information in iCalendar objects. So far, so good, but these two specifications define only what information can be contained within the iCalendar objects, how it is to be formatted, and how applications can use this information to achieve application objectives. What is missing is the network transport protocol: How do the objects get from one entity to another? RFC 2447, iCalendar Message-Based Interoperability Protocol (iMIP), is a third proposed Internet standard.
This specification defines how iCalendar objects are encapsulated within Internet standard message formats, specifically within RFC 822/MFS and MIME standard formats. Once in these formats, the iCalendar objects can be transported through any appropriate network transport protocol and viewed through any standard Internet mail client that supports MIME. Chapter 15, vCard, discusses the iCalendar and related protocols. A related set of specifications defines the vCard, or virtual business card. For various reasons, it is desirable to have a standard format for exchanging contact information: name, telephone numbers, business name, title, email address, and more. A proposed Internet standard, RFC 2425, A MIME Content-Type for Directory Information, defines a MIME content type that can hold this kind of information. Client software can pull the data in and add it to a directory. Whether the directory is a personal contact directory maintained on a user's personal or palm computer, a full-blown X.500 standard directory, or an LDAP (Lightweight Directory Access Protocol) directory, the contact information is available to any software supporting the MIME content type. Chapter 16,
Calendaring and Scheduling Standards, discusses the vCard MIME content type and the related directory services protocols that can make use of vCard objects.
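The idea of carrying application data inside a standard message, as described earlier in this section for iCalendar objects, can be sketched with Python's standard email package. This is only an illustration under stated assumptions: the addresses, the UID, the product identifier, and the event details are all invented, and the calendar object is far smaller than anything a real scheduling application would send.

```python
from email.message import EmailMessage

# A deliberately tiny iCalendar object; every value here is made up.
ICAL = "\r\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//Example//Sketch//EN",
    "BEGIN:VEVENT",
    "UID:12345@example.com",
    "DTSTAMP:20000101T120000Z",
    "DTSTART:20000102T150000Z",
    "SUMMARY:Project meeting",
    "END:VEVENT",
    "END:VCALENDAR",
    "",
])

msg = EmailMessage()
msg["From"] = "organizer@example.com"
msg["To"] = "somebody@example.com"
msg["Subject"] = "Meeting invitation"
# Carry the calendar object as a text/calendar MIME body.
msg.set_content(ICAL, subtype="calendar")

print(msg.get_content_type())
```

The message can then be handed to any message transfer system; only a calendar-aware client needs to understand the body.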
Messaging Security
There are no explicit Internet standards regarding messaging security. The closest thing is the Privacy Enhanced Mail (PEM) specification, which has been languishing as a proposed standard since it was last revised in RFC 1421, RFC 1422, and RFC 1423 in 1993. Although PEM is not officially a historic specification, most implementers consider it effectively obsolete, as it is rarely implemented or used. The Secure MIME (S/MIME) specification is published in standards track RFC 2633, S/MIME Version 3 Message Specification. Although S/MIME does provide a mechanism for securing content through encryption and providing authentication and data integrity through digital signature, it does nothing to secure message header information. A more promising approach is offered by specifications built on the Pretty Good Privacy (PGP) cryptographic model. Two current proposed standards are based on PGP: RFC 2015, MIME Security with Pretty Good Privacy (PGP), and RFC 2440, OpenPGP Message Format. RFC 2015 describes how PGP can be used to encrypt or digitally sign MIME content; RFC 2440 describes how to develop interoperable applications that use the OpenPGP format to encrypt and digitally sign messages and data files. Chapter 17, Internet Messaging Security, discusses all these security specifications, with special attention to weaknesses exposed when attempting to apply them to Internet messaging. We also introduce other Internet security concepts, in particular mechanisms for providing security at the IP and transport layers, and discuss how they relate to secure Internet messaging.
Reading List
This chapter serves as an overview to the standards and specifications discussed throughout the book. Rather than listing all the RFCs mentioned here, the best reading suggestion is to continue with the rest of the book. You may also wish to skip to whichever chapter is of greatest interest.
CHAPTER
8
Internet Message Format Standard
There is nothing more basic to Internet messaging than the format to which messages must adhere. If a chunk of data is formatted properly, it will be treated as an Internet message; if not, it is just a chunk of data. Substantially unchanged since 1982, the RFC 822 format for Internet messages has undergone revision by the Detailed Revision/Update of Message Standards (DRUMS) IETF working group. Its goal was not to add function but rather to clarify, correct, and amplify on existing messaging standards. This chapter examines the Internet Message Format Standard, as originally defined in RFC 822 and as it is expected to be updated based on current work in progress. We start by looking at the scope in which this standard is to be applied and then continue with a brief introduction to the Augmented Backus-Naur Form (ABNF) used to specify message syntax. Next is an overview of the different parts of the message, followed by a detailed summary of message syntax. Then comes a description of header fields and their ABNF syntax and a description of newly defined obsolete headers. We conclude the chapter with examples of messages with different types of header fields.
Internet Message Format Standard Scope

The Internet Message Format Standard (MFS) specifies a syntax for text electronic mail messages. This standard does not address any type of message other than those that can be considered electronic mail, nor does it address messages that contain anything other than text. Furthermore, this standard addresses only the format of the message contents, not the format or structure of the message envelope, even though the contents of a message may be affected by activity of external protocols like SMTP, as we see later in this chapter and in the next. To further limit the scope of MFS, the specification does not define how messages are to be stored on any system, how information within the message is to be formatted for internal systems, nor even the character encoding to be used for transport or storage of the message. Also not specified in MFS are what kinds of features are to be provided on messaging applications that use the format or the appearance or function of user interfaces to such applications. What do all these caveats mean? It is true that all systems receiving Internet messages should expect that text messages will be delivered as US-ASCII data. However, by limiting the scope of the standard in this way, we are able to include a far wider population of systems that can be enabled for interoperation. This approach opens up Internet messages to many more systems through the use of gateway systems. By not defining how messages are stored, we allow systems to use Internet messaging standards even if they store the messages locally using some other character encoding scheme (non-US-ASCII flavors, or EBCDIC, IBM's mainframe scheme, for example). Text characters can be converted from another encoding scheme to ASCII and formatted into a standards-compliant message by a gateway module before they are forwarded onto the Internet from these systems.
By not defining how information within the messages is formatted on internal systems, we allow proprietary messaging systems to interface with Internet standard messaging systems through gateways. The gateway module can gather relevant information from the proprietary formats and rewrite it in an Internet-standards-compliant format. By not specifying character encoding for message storage or transport, we make it possible for standard messages to be carried through foreign protocol tunnels. At the same time, those messages can be stored on and forwarded from systems that don't use ASCII encoding, without affecting how they are received at their final destinations. Finally, MFS specifically avoids making any statements about the applications that will be using these messages. To do so would invariably limit the function of messaging systems rather than enhance interoperability. MFS specifies only how information is to be uniformly carried within a message, not how that information should be used.
Technical Specification Format Syntax

The Backus-Naur Form (BNF) is a format syntax long used in the field of computer science. A modified version of BNF, called Augmented BNF (ABNF), defines a way of expressing specifications for lexical entities that is both simple and compact, yet powerful enough for most Internet protocols. A standards-track specification for ABNF is available in RFC 2234, ABNF for Syntax Specifications. This RFC is reproduced in Appendix A with annotations that make it more comprehensible to readers lacking in formal academic computer science training. RFC 822 included its own brief definition of ABNF, but MFS refers to RFC 2234 (an updated version of this specification is being developed by the DRUMS working group). ABNF formats consist of a series of rules that define the constituent parts of each lexical entity. Rules consist of a rule name, followed by the = character and a series of one or more elements. For example, the following is a valid ABNF rule:
example-name = element1 element2 element3
Each of the elements may be defined with its own rule, or it may be a basic (as in fundamental) rule such as ALPHA, which consists of any single alphabetic character (A-Z or a-z) or BIT, which consists of either the numeral 0 or 1 to represent the value of a single bit. An element may also be a single character. Thus, continuing with the example begun in the previous paragraph, consider the following rules:
element1 = "A"
element2 = "B"
element3 = "C"
Given these three additional rules, the rule defined above as example-name matches ABC. This demonstrates the simplest ABNF operator: concatenation. The first rule represents a simple concatenation of the next three rules. Note that if you want to add white space (that is, spacing between elements), you must explicitly add a white space character in the rule. There are several other ABNF operators. The alternative operator is specified when two or more elements are set off by a forward slash (/), as in this example:
my-name = foo / bar / baz
In this case, the rule my-name can be matched by foo or bar or baz. There are two other alternative operators. One is called the value range alternative operator. Rather than explicitly stating every possible value in a range
and setting them off by forward slashes, you may specify a range of character elements. The lowest permitted value is set off from the highest permitted value by a dash. Another alternative operator is known as the incremental alternative. Rather than specifying a rule in a single line, you may use the alternative operator to specify the rule in a series of fragments. See Appendix A for more details about this operator. Enclosing elements and operators within a pair of parentheses causes those items to be treated as a sequence group. That group is treated as a single element. Thus, the rule:
silly-name = foo (bar / baz) (baz / foo)
matches foobarfoo, foobarbaz, foobazfoo, and foobazbaz. The repetition operators define mechanisms for repeating elements within a rule. The variable repetition operator is specified by an asterisk, and two optional values can be used to specify the minimum and maximum number of repetitions of the element. This can be represented as:
<a>*<b>element
If the variables are not defined, the defaults are 0 for minimum number of repetitions (in other words, the element does not have to appear at all) and infinity for the maximum number of repetitions.
1*10ALPHA
This represents a string of no less than 1 and no more than 10 alphabetical characters. The following examples represent, first, a sequence of no more and no less than 5 bits; next, a string of at least 24 alphabetical characters; and finally a sequence of anywhere from 0 to 16 bits.
5*5BIT
24*ALPHA
*16BIT
When the element is to be repeated a specific number of times, no more and no less, the asterisk can be dispensed with and the operation represented like this (specifying an eight-character string):
8ALPHA
One way to represent an optional sequence is to use the repetition operator with a minimum value of 0 and a maximum value of 1 (in other words, repeat the sequence no less than zero times and no more than once). Another way to express this operator is to set off the optional sequence by square brackets, like this:

[element1 element2 element3]
Finally, comments may be inserted in a rule after the semicolon (;) character. For example:
silly-name = foo (bar / baz) (baz / foo) ; Hi, Joycie!
For a far more complete discussion of ABNF, see Appendix A; this summary should be sufficient to allow the casual reader to decipher the ABNF representations of the Internet message formats.
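Readers who are more comfortable with regular expressions than with formal grammars may find it useful to see the repetition and grouping examples above restated that way. The following Python sketch is only an approximation offered for illustration: the dictionary name ABNF_AS_REGEX and the helper matches are invented here, foo, bar, and baz are treated as literal strings as in the text, and the sketch ignores the fact that ABNF literal strings are case-insensitive.

```python
import re

# Illustrative regular-expression equivalents of the ABNF examples above.
# This table is a teaching aid, not part of any standard.
ABNF_AS_REGEX = {
    "1*10ALPHA": r"[A-Za-z]{1,10}",   # one to ten alphabetic characters
    "5*5BIT": r"[01]{5}",             # exactly five bits
    "24*ALPHA": r"[A-Za-z]{24,}",     # at least 24 alphabetic characters
    "*16BIT": r"[01]{0,16}",          # zero to sixteen bits
    "8ALPHA": r"[A-Za-z]{8}",         # exactly eight alphabetic characters
    "silly-name": r"foo(bar|baz)(baz|foo)",  # sequence groups with alternatives
}


def matches(rule, text):
    """Return True when the whole of text satisfies the named example rule."""
    return re.fullmatch(ABNF_AS_REGEX[rule], text) is not None


print(matches("1*10ALPHA", "hello"), matches("silly-name", "foobazfoo"))
```

An optional sequence written with square brackets corresponds to a group followed by a question mark in most regular-expression dialects.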
Parts of the Internet Message

The fundamental particle of any Internet message is the ASCII character. All Internet MFS messages consist entirely of ASCII characters and of nothing but ASCII characters (though, of course, the MIME standards make it possible for an MFS message to incorporate non-ASCII characters). If a message has a non-ASCII character in it, then by definition, it is not an MFS-compliant message. ASCII characters are combined within messages to form lines. Every line is identifiable because it ends with the carriage return/line feed character combination (ASCII 13 and ASCII 10, respectively, and more conveniently abbreviated as CRLF). Message lines should be no more than 78 characters long (not including the CRLF), because lines that are any longer can be difficult to display properly with some hardware and software. Lines may have to be split before reaching the CRLF, the true end of the line. Message lines may in no case exceed 998 characters (not including the CRLF), because many Internet messaging implementations are simply incapable of dealing with lines any longer. These ASCII characters and lines are the building blocks of the Internet message. Messages consist of header fields (or just headers), whose syntax must conform to ABNF rules (to be discussed later in the chapter), and an optional body. The headers come first, but in no particular order. The body, if present, follows an empty line, that is, a line with no characters other than CRLF following the last header field. The body itself, if present, consists of one or more lines with no particular structure.
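The structure just described (ASCII characters, CRLF-terminated lines, headers, an empty line, then an optional body) is simple enough to sketch in a few lines of Python. The function names split_message and check_line_lengths are invented for this illustration, and the sample message is hypothetical; the sketch assumes the message is already held as a Python string with CRLF line endings.

```python
def split_message(raw):
    """Split raw message text into (header_section, body).

    The body is None when no empty line follows the header fields.
    """
    head, separator, body = raw.partition("\r\n\r\n")
    if not separator:
        return raw, None
    return head + "\r\n", body


def check_line_lengths(raw):
    """Return (line_number, note) pairs for lines over 78 or 998 characters."""
    notes = []
    for number, line in enumerate(raw.split("\r\n"), start=1):
        if len(line) > 998:
            notes.append((number, "exceeds the 998-character limit"))
        elif len(line) > 78:
            notes.append((number, "longer than the recommended 78 characters"))
    return notes


sample = (
    "From: somebody@example.com\r\n"
    "Date: Fri, 21 Nov 1997 09:55:06 -0600\r\n"
    "\r\n"
    "Hello.\r\n"
)
headers, body = split_message(sample)
print(sample.isascii(), repr(body), check_line_lengths(sample))
```

Lengths are measured without the CRLF, matching the way the limits are stated above.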
Message Body
As mentioned above, the message body consists of lines of ASCII characters. All ASCII characters, both printing and nonprinting, are permitted. Only two other restrictions are placed on the contents of a message body. First, carriage return (CR) and line feed (LF) characters may not appear independently within the message body. In other words, if there is a CR, an LF must immediately follow it; when these two characters appear together it indicates the end of a line. The
second limitation on the contents of the message body is that lines must contain no more than 998 characters, exclusive of the line-terminating CRLF.
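The restrictions on the message body lend themselves to a short check. The Python sketch below is one possible way to express them; the name body_problems and the wording of the messages it returns are assumptions made for illustration only.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
BARE_CR_OR_LF = re.compile(r"\r(?!\n)|(?<!\r)\n")


def body_problems(body):
    """Return a list of ways in which body breaks the rules described above."""
    problems = []
    if not body.isascii():
        problems.append("non-ASCII character")
    if BARE_CR_OR_LF.search(body):
        problems.append("CR or LF outside a CRLF pair")
    if any(len(line) > 998 for line in body.split("\r\n")):
        problems.append("line longer than 998 characters")
    return problems


print(body_problems("one\r\ntwo\r\n"))
print(body_problems("one\ntwo\r\n"))
```

An empty list means the body passed all three checks.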
Header Fields
Header fields are necessary for any standards-compliant message. Header fields contain information such as where the message came from, where it is going, when it was sent, and more. However, only two header fields are nonoptional for standards-compliant messages: the From: header, indicating the originator of the message, and the Date: header, indicating the origination date of the message. We discuss the specifics of message header fields, their structure, and syntax later in this chapter. All header fields consist of the following parts, in this order:
Field name. Field names must consist entirely of printable US-ASCII characters except for the colon character.
Colon (:). A colon sets off the field name from the field body.
Field body. Field bodies can consist of any US-ASCII character or characters (including the colon), but may not contain a carriage return or line feed (though the CRLF combination may be present if the field body is broken into more than one line for display purposes; see Folding Header Fields below).
CRLF. A CRLF indicates the end of the header field.
Two different types of header field bodies are defined: structured and unstructured. The unstructured header field bodies offer no internal structure to their contents: Whatever is in the field body is treated simply as a string of data terminated by a CRLF. Any values are valid in these field bodies. Syntactical structures have been defined for structured header field bodies. In other words, the contents of the field body are structured data of some sort that conforms to the specified syntax for the field. The data in structured field bodies can be extracted by applications and used for processing. For example, the Subject: header field body is unstructured. The contents of the Subject: header are just characters, subject only to the basic limitations imposed on all field bodies.
On the other hand, the Date: header field body is structured: It contains specific types of data, arranged in a particular order, and that information can be processed by application software once the message is received.
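The difference between structured and unstructured field bodies shows up clearly in code. In the Python sketch below, parse_header_field is a hypothetical helper that splits any header field at its first colon; the Subject: body is then simply kept as a string, while the Date: body is handed to parsedate_to_datetime from Python's standard email.utils module, which understands the date syntax. The sample header lines are invented for illustration.

```python
from email.utils import parsedate_to_datetime


def parse_header_field(line):
    """Split one unfolded header field into (field_name, field_body)."""
    if line.endswith("\r\n"):
        line = line[:-2]
    name, colon, body = line.partition(":")
    if not colon or not name:
        raise ValueError("a header field needs a name followed by a colon")
    # Field names are printable US-ASCII (codes 33 to 126), colon excluded.
    if not all(33 <= ord(ch) <= 126 for ch in name):
        raise ValueError("invalid character in field name")
    return name, body.strip()


# Unstructured: the body is just text, colons and all.
print(parse_header_field("Subject: Hello: world\r\n"))

# Structured: the body can be turned into a value an application can use.
name, body = parse_header_field("Date: Fri, 21 Nov 1997 09:55:06 -0600\r\n")
sent = parsedate_to_datetime(body)
print(name, sent.year, sent.utcoffset())
```

Only the first colon separates the name from the body, which is why a colon inside the Subject: text causes no trouble.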
Folding Header Fields

Header fields are terminated by CRLF, except for when they aren't terminated by CRLF. Each header field is considered a logical record, taking up a single line. As long as all headers are no longer than 998 characters (preferably no longer than 78 characters), there is no functional requirement to break a field up before its end. However, because it is possible for a header field to exceed these limits, a
mechanism for splitting headers into more than one line is useful. In addition, breaking headers for formatting can help make the headers more comprehensible to humans. Two ASCII characters are collectively referred to as white space (WSP): the space character (ASCII 32) and the horizontal tab (HTAB) character (ASCII 9). Hit the spacebar, and you get an ASCII space; press the tab key, and you get an ASCII HTAB. The rule for creating a folded header is that it may be split in any place in the header that consists of white space. If there is a space, a CRLF can be inserted before the space, and the header will still be a logical unit even though it is now taking up two lines rather than one. For example, a Subject: header with words and spaces can be split so that:
Subject: Now is the time for all good men and women...
can be turned into this:

Subject: Now is the time for all
 good men and women...
The ability to fold headers is useful largely to make headers more comprehensible; otherwise, it adds minimal functionality to the specification.
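Folding and unfolding can be sketched in Python as follows. The function names fold and unfold and the greedy strategy of breaking at the last white space within the limit are choices made for this illustration; the standard says only where a fold may occur, not how an implementation should pick among the permitted places.

```python
import re


def unfold(field):
    """Remove each CRLF that is immediately followed by white space."""
    return re.sub(r"\r\n(?=[ \t])", "", field)


def fold(field, limit=78):
    """Insert CRLF before white space so lines stay within limit if possible."""
    lines = []
    while len(field) > limit:
        # Last space or tab within the limit, never the first character.
        cut = max(field.rfind(" ", 1, limit + 1), field.rfind("\t", 1, limit + 1))
        if cut <= 0:
            break  # no white space to fold at; leave the rest as one line
        lines.append(field[:cut])
        field = field[cut:]
    lines.append(field)
    return "\r\n".join(lines)


subject = "Subject: Now is the time for all good men and women..."
folded = fold(subject, limit=32)
print(repr(folded))
print(unfold(folded) == subject)
```

Because the white space itself is kept at the start of the continuation line, unfolding restores the original field exactly.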
Message Header Fields

As mentioned above, only two headers are strictly required to create a standards-compliant message: the from and the orig-date headers. However, approximately two dozen different header fields are specified in the standard. In this section, we list all the header fields defined in the standard, along with brief descriptions of what each field contains. Headers can be categorized as either optional or required, as structured or unstructured. More specific field categories include originator and destination fields, identification and informational fields, resent, trace, and optional fields. Fields defined in MFS are introduced here, organized into these more specific categories. Each header is described with a brief discussion of contents and use. The complete ABNF construction of the message headers, including the rules for each header field and the rules for creation of all elements, is provided later in this chapter. In this section, we identify the field contents in very broad terms, saving the precise ABNF definitions for later. Table 8.1 lists the message header fields, showing for each the minimum and maximum number of times it may appear in a message (and therefore whether it is required and whether it must be unique), with notes on when a field MUST or SHOULD be present.
Essential Email Standards: RFCs and Protocols Made Practical
Table 8.1 Internet Message Format Standard Header Fields
FIELD NAME      MIN NUMBER  MAX NUMBER  NOTES
trace           0           unlimited   Block prepended - see 3.6.7
resent-date     0*          unlimited*  One per block, required if other resent fields present - see 3.6.6
resent-from     0           unlimited*  One per block - see 3.6.6
resent-sender   0*          unlimited*  One per block, MUST occur with multi-address resent-from - see 3.6.6
resent-to       0           unlimited*  One per block - see 3.6.6
resent-cc       0           unlimited*  One per block - see 3.6.6
resent-bcc      0           unlimited*  One per block - see 3.6.6
resent-id       0           unlimited*  One per block - see 3.6.6
orig-date       1           1
from            1           1           See sender and 3.6.2
sender          0*          1           MUST occur with multi-address from - see 3.6.2
reply-to        0           1
to              0           1
cc              0           1
bcc             0           1
message-id      0*          1           SHOULD be present - see 3.6.4
in-reply-to     0*          1           SHOULD occur in some replies - see 3.6.4
references      0*          1           SHOULD occur in some replies - see 3.6.4
subject         0           1
comments        0           unlimited
keywords        0           unlimited
optional-field  0           unlimited
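The occurrence limits in Table 8.1 can be turned into a simple check. The Python sketch below covers only the non-resent fields and uses the header names as they appear in messages (Date: for orig-date, and so on); the names LIMITS and count_problems are invented for illustration, and None stands for "unlimited." It does not attempt the conditional requirements given in the table's notes.

```python
from collections import Counter

# (minimum, maximum) occurrences per message; None means unlimited.
LIMITS = {
    "date": (1, 1), "from": (1, 1), "sender": (0, 1), "reply-to": (0, 1),
    "to": (0, 1), "cc": (0, 1), "bcc": (0, 1), "message-id": (0, 1),
    "in-reply-to": (0, 1), "references": (0, 1), "subject": (0, 1),
    "comments": (0, None), "keywords": (0, None),
}


def count_problems(field_names):
    """Compare how often each header appears against the limits above."""
    counts = Counter(name.lower() for name in field_names)
    problems = []
    for name, (low, high) in LIMITS.items():
        seen = counts[name]
        if seen < low:
            problems.append(f"{name}: required but missing")
        elif high is not None and seen > high:
            problems.append(f"{name}: appears {seen} times, maximum is {high}")
    return problems


print(count_problems(["From", "Date", "Subject"]))
print(count_problems(["From", "Subject", "Subject"]))
```

Field names are compared without regard to case, so From: and from: count as the same header.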
Origination Date and Origination Addresses

These four headers contain information about the origination of the message. The origination address headers can be used to generate destination address headers when a reply to the message is being generated.
orig-date. This required field indicates the date and time when the creator of the message considered it to be complete and ready to submit to the mail delivery system. This header begins with the text Date: followed by the day, month, and year; and the time (to the minute). This field may optionally include the day of the week and the time to the second. The date and time indicated are meant to reference when the message was submitted by the originator for delivery, rather than when it was created. For example, if a user created and submitted a message for delivery, but the system was not currently connected to any network, the time at which the message was submitted and queued for delivery is used here, rather than the time that the message is actually transmitted across a network.
from. This is the only other required field and is one of three originator fields indicating where the message came from. There can be only one From: header in any message. The field itself contains the text From: followed by elements that indicate one or more mailboxes. A mailbox indicates a conceptual entity that receives mail on behalf of some other entity (usually a person). A mailbox does not have to exist in file storage: It indicates whatever is used to output messages from the message delivery system. A mailbox could output messages onto a printer, a monitor, or even paper tape. When used in originator fields, the mailbox (or mailboxes) references the entity (or entities) receiving mail on behalf of the entity (or entities) that is doing the message origination indicated by the field. The From: field is differentiated from other originator fields in that it specifies the person or entity that created the message.
sender. This optional field consists of the text Sender: followed by elements indicating a single entity's mailbox. There can be no more than one Sender: header in any message. The Sender: field is used when the entity placing the message into the mail delivery system is different from the entity that created the message. For example, a message sent by an automatic mailing list might indicate the author of a message in the From: field, while indicating the automatic mailing list agent in the Sender: field. The Sender: header is required when more than one mailbox is indicated in the From: field, meaning more than one entity created the message (for example, messages from members of a committee), but only one messaging entity actually submitted the message into the delivery system.
reply-to This consists of the text Reply-To: followed by one or more mailboxes. This optional field contains mailbox addresses or groups (for lists of addresses), indicating where replies to the message should be directed. There can be no more than one Reply-To: header in any message. This field is used when the originators of the message wish to direct replies to a different mailbox (or mailboxes) than those used to originate the message. For example, messages sent from a mailing list might indicate that replies should be directed to an address different from that in the Sender: or From: headers.
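As a rough illustration of how the four origination headers fit together, the following Python sketch builds a committee message with two authors. It uses the standard library email package; the addresses, the date, and the function name build_committee_message are invented for the example.

```python
from datetime import datetime, timezone
from email.message import EmailMessage
from email.utils import format_datetime


def build_committee_message():
    """Build a message with two authors, one submitter, and a reply address."""
    msg = EmailMessage()
    # orig-date: when the message was considered complete (hypothetical date).
    msg["Date"] = format_datetime(datetime(2000, 1, 3, 9, 30, tzinfo=timezone.utc))
    # from: two mailboxes, so a Sender: header is required.
    msg["From"] = "ann@example.com, bob@example.com"
    # sender: the single mailbox that actually submitted the message.
    msg["Sender"] = "secretary@example.com"
    # reply-to: where the originators want replies to go.
    msg["Reply-To"] = "committee@example.com"
    msg["Subject"] = "Minutes"
    msg.set_content("Draft minutes follow.\n")
    return msg
```

With the package's default policy, assigning a second From: header to the same EmailMessage object raises ValueError, which mirrors the one-per-message rule described above.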
Message Destinations
These headers indicate the destinations to which the message is to be delivered. Each header takes the general form of the text To: (or Cc: or Bcc:), followed by addresses (optional for the blind copy header). The destination headers correspond directly to standard business memo addressing conventions. All of these headers are optional, but none may appear more than once in any given message. Each may contain multiple addresses. When a reply is generated, application software may create the destination headers by taking addresses listed in the origination headers of the original message.
to This is the To: header. It contains the address or addresses of the primary recipients of the message.
cc This is the Cc: (carbon copy) header, meaning that it contains the address or addresses of recipients who are to get copies of the message as a courtesy, for backup, for background, or for some other reason. This usage refers to the practice of using carbon paper to create additional copies of a typewritten letter or memo: The primary recipient gets the original, and other recipients get the (lower quality) carbon copies. Recipients listed in the Cc: header receive the same message as that sent to the primary recipient.
bcc This is the Bcc: (blind carbon copy) header. This is the cover-your-ass (CYA) header, which is used for electronic messaging in the same way it is used for paper memos. Unlike the To: and Cc: headers, the addresses in the Bcc: header are optional. This means the Bcc: header may appear in the message, indicating that blind copies were sent, but without any indication of to whom they were sent. MFS identifies several different ways this header can be used. First, all addressees, including those listed as blind recipients, can receive a copy of the message with the Bcc: header not present (in other words, that header is stripped off by some system before the messages are transmitted).
Another option is for the blind copy recipients to get a copy of the message that includes their own address (and no others) listed in the Bcc: header, but for the other recipients to receive the message with no Bcc: header. If there are multiple blind copy recipients, each gets a message that lists only their own address in the Bcc: header. The last option is for the blind copy recipients to see their own address in the Bcc: header, and for other recipients to receive the message with the Bcc: header present, but without any recipients listed (of course).
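The second option can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: the function name copies_for_delivery and its arguments are hypothetical, the caller already knows which recipients are visible and which are blind, and real mail software would do this work at submission time.

```python
import copy
from email.message import EmailMessage


def copies_for_delivery(msg, visible, blind):
    """Return {recipient: copy of msg} for the second Bcc: handling option."""
    base = copy.deepcopy(msg)
    del base["Bcc"]  # deleting a header that is absent is not an error
    copies = {}
    for recipient in visible:
        # To: and Cc: recipients see no Bcc: header at all.
        copies[recipient] = base
    for recipient in blind:
        # Each blind recipient sees only their own address in Bcc:.
        personal = copy.deepcopy(base)
        personal["Bcc"] = recipient
        copies[recipient] = personal
    return copies
```

The original message object is left untouched, so the submitting application can still keep a record of the full blind-copy list.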
Message Identification
Message identification headers provide some context for the message. Though not required, these headers are recommended. This means that standards-compliant implementations should use these headers where appropriate. None of these headers may appear more than once in any message. All messages should be given a unique message identifier by the host generating the message; messages that are replies should include a header indicating which message they are replying to and a header that identifies other messages that may also be related.
message-id This produces the Message-ID: header. Each message should contain one (and only one) instance of this header. The header itself contains a unique identifier issued by (and referencing) the host on which the message was created.
in-reply-to This produces the In-Reply-To: header. It contains the message ID of the message to which the current message is replying. This header should be present when the message is a reply to another message (but only if the message is a reply). There may be no more than one In-Reply-To: header in any message. When the reply message is created, the contents of the original message's Message-ID: header (if present) are copied into the reply's In-Reply-To: header.
references This produces the References: header, which should be present when the message is a reply to another message (but only if the message is a reply). This header contains the message IDs of the earlier messages in the same thread (replies to replies to replies). There may be no more than one References: header in any message. Building the contents of the References: header can seem complicated, so it is discussed next in Message Threading.
MESSAGE THREADING
An example helps to understand how message threads are created. In this example, we look strictly at a series of messages (identified as M-n, where n is an integer). The first message (M-0) in a thread cannot have a References: header or In-Reply-To: header, though it should have a Message-ID: header. Let's say that the message ID for M-0 is <0000@example.com>.
The next message in the thread, which is a reply to the original message, should have all three message identification fields. The reply message puts its own message ID, <0001@example.org>, into the Message-ID: header of the new message (M-1). This second message creates an In-Reply-To: header as well, filling it with the message ID of the first message, <0000@example.com>. A References: header is also created, containing the same message ID. When this second message is received, a reply can be generated (M-2). This third message creates its own message ID, <0002@example.com>, to put in the Message-ID: header. It takes the message ID of the message to which it is replying, M-1, and puts that into its In-Reply-To: header. Then, since the message to which it is replying also has a References: header, the contents of that header are copied into the new message's References: header, and the message ID of the message being replied to (M-1) is appended at the end of that header. Table 8.2 shows the data that would be included in the Message-ID:, In-Reply-To:, and References: headers for a four-message thread being exchanged between two correspondents.
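The threading procedure can be sketched as a small Python function. This is a minimal illustration, assuming each message is represented as a plain dictionary of header values; the function name reply_ids is invented for the example.

```python
def reply_ids(parent, new_id):
    """Return the three identification headers for a reply to `parent`.

    `parent` is a dict holding the parent's "Message-ID" and, if the parent
    was itself a reply, its "References" value.
    """
    # Start from the parent's References: (empty for the first message) ...
    references = parent.get("References", "").split()
    # ... and append the ID of the message being replied to.
    references.append(parent["Message-ID"])
    return {
        "Message-ID": new_id,
        "In-Reply-To": parent["Message-ID"],
        "References": " ".join(references),
    }
```

Calling reply_ids three times in succession, starting from a first message whose ID is <0000@example.com>, reproduces the rows of Table 8.2.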
Message Information
Three headers have been defined for message information; all three are optional. These headers are intended to contain information to be processed by the reader of the message, unlike the other headers discussed so far. While destination mailboxes, origination date and time, and originating mailboxes may include information that is interesting to readers, those headers are intended to provide formatted content that is used by applications to process and deliver the message. subject This field contains the header text Subject: followed by unstructured text and white space. Usually the Subject: header contains a brief description of the contents of the message body. Application software usually uses the string Re: with the subject of the original message when replying to a message; see Cascading Replies for more about this usage.
Table 8.2 Message Threading
MESSAGE  MESSAGE-ID:         IN-REPLY-TO:        REFERENCES:
M-0      <0000@example.com>  N/A                 N/A
M-1      <0001@example.org>  <0000@example.com>  <0000@example.com>
M-2      <0002@example.com>  <0001@example.org>  <0000@example.com> <0001@example.org>
M-3      <0003@example.org>  <0002@example.com>  <0000@example.com> <0001@example.org> <0002@example.com>
CASCADING REPLIES
It is standard business practice to use the term Re: in the heading of most memos to indicate the topic of the memo. According to the MFS specification, this term is derived from the Latin term res, meaning "in the matter of." (In some European countries, sv: is used instead of re:.) When replying to a message, a long-standing electronic messaging practice is to append the original message's subject to the term Re: and to put the whole thing in the reply message's Subject: header. However, this approach fails to take into consideration that the original message may actually have been a reply to another message. When correspondents exchange more than one or two messages on the same topic (replies to replies to replies), the result can be Subject: headers that consist of sequences like this:
Subject: Re: Re: Re: Re: Re: Re: Re: Lunch tomorrow?
The MFS specification recommends that applications not repeat the string Re: more than once in any Subject: header. Some applications create a Subject: header that includes Re: and a number to indicate how many replies deep the thread is. Further, MFS suggests that implementers not use any other string to indicate a reply; as phrased in the document, "...use of other strings or more than one instance can lead to undesirable consequences." Presumably, those consequences include creation of Subject: headers that exceed appropriate limits.
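One way an application might avoid cascading prefixes is sketched below in Python. The function name reply_subject and the regular expression are illustrative assumptions, not part of the MFS specification; the sketch simply strips any run of existing Re: prefixes before adding a single one.

```python
import re

# One or more leading "Re:" prefixes, in any letter case, with optional spaces.
_RE_PREFIXES = re.compile(r"^(?:re:\s*)+", re.IGNORECASE)


def reply_subject(original_subject):
    """Return a reply subject that contains the string Re: exactly once."""
    topic = _RE_PREFIXES.sub("", original_subject.strip())
    return "Re: " + topic
```

Applied to the cascading example above, this yields "Re: Lunch tomorrow?" no matter how many prefixes had accumulated.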
The Subject: header is the most common of the informational headers.
comments This field contains the header text Comments: followed by unstructured text and white space. This header may contain additional information about the body of the message. Any number of Comments: headers may be included with a particular message.
keywords This field contains the header text Keywords: followed by one or more words or phrases set off by commas. These keywords could be used by the recipient to classify the message or for searching through message stores. Any number of Keywords: headers may be included with a particular message.
Resent Fields
The resent fields are useful for resending messages. This occurs when a message recipient reintroduces the message into the message transport system, directing it to another mailbox. This is different from message forwarding, as explained in Forwarding and Resending. When a message is resent, it appears to the recipient as if it was sent by the original originator, not the resending originator. When a message is resent, applications should add appropriate headers to indicate who is doing the resending.
The resent fields parallel the origination and destination fields described above and contain the same types of information as those fields, except that the resent fields contain information about the resent message rather than the original message. When a message is resent, the fields present in the original message remain intact, and the resent fields are added above the existing fields. A message can be resent more than once, in which case a second set of resent fields is added at the start of the message headers the second time it is resent. These fields can each occur only once per block of resent fields, but each may recur if there is more than one set of resent fields. That is, if the message was resent more than once, there may be as many instances of each of the resent fields as the number of times the message was resent. Resent fields include the following:
resent-date This field includes the text Resent-Date: and the date and time that the message was resubmitted to the message transport. This field is required if resent fields are being used.
resent-from This field includes the text Resent-From: and the mailbox(es) of the sender(s) who are resending the message. This field is required if resent fields are being used.
resent-sender This field includes the text Resent-Sender: and the mailbox of the entity responsible for resubmitting the message. It is required if more than one mailbox is indicated in the resent-from field, and it should be present if an entity other than that indicated in the resent-from field did the actual submission. It is not permitted otherwise (that is, when a single mailbox is indicated in the resent-from field and that entity submitted the message for resending).
resent-to This field includes the text Resent-To: and the mailbox(es) and/or groups of the primary recipients of the resent message. Otherwise, this field behaves just like the field represented by the To: header.
resent-cc This field includes the text Resent-Cc: and the mailbox(es) and/or groups of the recipients of copies of the resent message. Otherwise, this field behaves just like the field represented by the Cc: header.
resent-bcc This field includes the text Resent-Bcc: and the mailbox(es) and/or groups of the recipients of blind copies of the resent message. Otherwise, this field behaves just like the field represented by the Bcc: header.
resent-msg-id This field includes the text Resent-Message-ID: and a unique identifier for the resent message. This is an additional message ID, not related to the original message ID, which persists along with all the other headers of the original message.
When recipients reply to messages that have been resent to them, the reply goes to the mailbox of the entity that originated the original message, not to the entity that resent the message. In fact, the specification explicitly identifies the resent fields as strictly informational and forbids their use for processing of replies.
FORWARDING AND RESENDING
MFS identifies two different meanings of the term forwarding as it is applied to Internet messaging. When a message recipient forwards that message to another entity, the original message is being copied and encapsulated in an entirely new message being originated by the forwarding party. A message transport application may forward a message as part of its core function, for example, when a mail server passes a message along to its final destination, as discussed in Chapter 10, Simple Mail Transfer Protocol (SMTP). However, resending a message is an entirely different procedure from either of the mechanisms described by the term forwarding. Unless message recipients use software that explicitly displays the values of any resent fields present in a message, the values in those fields are functionally ignored.
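The following Python sketch illustrates how resent fields might be added and why they play no part in replies. It assumes, purely for illustration, that a message's headers are held as an ordered list of (name, value) pairs; the function names add_resent_block and reply_target are hypothetical.

```python
def add_resent_block(headers, resent_from, resent_to, resent_date, resent_msg_id):
    """Return a new header list with one block of resent fields prepended."""
    block = [
        ("Resent-Date", resent_date),
        ("Resent-From", resent_from),
        ("Resent-To", resent_to),
        ("Resent-Message-ID", resent_msg_id),
    ]
    # The original headers stay intact; the new block goes above them.
    return block + list(headers)


def reply_target(headers):
    """Pick the reply address from Reply-To: or From:, ignoring resent fields."""
    reply_to = None
    author = None
    for name, value in headers:
        lowered = name.lower()
        if lowered == "reply-to" and reply_to is None:
            reply_to = value
        elif lowered == "from" and author is None:
            author = value
    return reply_to if reply_to is not None else author
```

Resending the result a second time simply stacks another block on top, so each resent field then appears twice, newest first.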
Trace Fields
Two trace fields are defined in the Internet message format standard, but this standard treats them as purely informational. The reason is that these fields are used by external protocols such as the Simple Mail Transfer Protocol (SMTP). We come back to how these fields are used by SMTP in Chapter 10, but very simply, the return-path field is used to indicate where SMTP error messages should be sent. The received field contains information prepended to the message by every SMTP server as it passes the message along its path to its final destination. The trace fields provide a mechanism for tracing messages from their sources and offer some additional information for SMTP purposes, as we see in Chapter 10. These two fields are described here.
return This field includes the text Return-Path: and a mailbox address, indicating where error messages should be returned.
trace This field contains multiple header lines that include the text Received: and additional information, including a mailbox, a domain, a message ID, and a date and time. These headers are prepended to the message headers by SMTP servers along the message's route and indicate what happened to the message, when, and where.
NOTE The Received: header can use a syntax that indicates the address on whose behalf the message was received (...for <address>..., see examples at the end of this chapter). However, many mail systems do not use this format, reducing the value of these headers for tracking mail problems.
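To show what a tracing tool might pull out of a single Received: header, here is a small Python sketch. It assumes the header value ends with a semicolon followed by a date and time, and that the optional "for <address>" clause appears as described in the note; the function name parse_received and the sample header in the usage note are invented.

```python
import re
from email.utils import parsedate_to_datetime


def parse_received(value):
    """Extract the 'for <address>' clause and the timestamp from a Received: value."""
    # Everything after the last semicolon is the date-time.
    info, _, stamp = value.rpartition(";")
    match = re.search(r"\bfor\s+<([^>]+)>", info)
    return {
        "for": match.group(1) if match else None,
        "when": parsedate_to_datetime(stamp.strip()),
    }
```

For a value such as "from mail.example.org by mx.example.com with SMTP id abc123 for <joe@example.com>; Mon, 3 Jan 2000 09:30:05 -0500", the sketch reports joe@example.com and a timestamp five hours behind Universal Time. Headers that omit the for clause yield None for that entry.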
Optional Fields
If you've ever examined the full headers of Internet email messages, you've probably seen many types of headers other than those described so far. Anyone can create a new, optional message field as long as it complies with the specification. The optional field must contain a field-name, followed by the colon, followed by unstructured data, and terminated with a CRLF. The field-name may contain any printable character except for the space or the colon. Informational RFC 2076, Common Internet Message Headers, references standard headers, such as those described so far in this chapter, as well as those that use optional fields. This RFC has been updated in an Internet-Draft with additional headers defined, but an updated RFC has not yet been published. Three examples of optional fields include the following:
X-Mailer This includes information about the messaging client software being used by the originator of the message.
X-UIDL This indicates an ID that is unique for the message to a particular local mailbox store.
Content-type This indicates the MIME content type/subtype. For standards-compliant Internet messages, the value of this field should be text/plain; charset=US-ASCII or something similar. This header is described in greater detail in Chapter 9, Multipurpose Internet Mail Extensions (MIME), as it is defined by the MIME standards.
RFC 2076 includes headers defined in a variety of other RFCs, which in turn define various other messaging and related applications. Table 8.3 lists some RFCs that contain other header definitions. Not all optional header definitions have been defined in RFCs: some may have been defined externally to the IETF process, and others defined in Internet-Drafts. Some of these headers are considered obsolete, unusable, undesirable, or nonstandard. Rather than attempt to list them all, you are urged to seek more information about any particular header in the RFCs listed in Table 8.3 as well as by searching the RFC and I-D archives.
Table 8.3 Some RFCs That Define Optional Message Headers
RFC       TITLE
RFC 822   Standard for the Format of ARPA Internet Text Messages
RFC 1036  Standard for Interchange of USENET Messages
RFC 1123  Requirements for Internet Hosts: Application and Support
RFC 1327  Mapping between X.400(1988)/ISO 10021 and RFC 822
RFC 1496  Rules for Downgrading Messages from X.400/88 to X.400/84 When MIME Content-Types Are Present in the Messages
RFC 1766  Tags for the Identification of Languages
RFC 1806  Communicating Presentation Information in Internet Messages: The Content-Disposition Header
RFC 1864  The Content-MD5 Header Field
RFC 1911  Voice Profile for Internet Mail
RFC 2045  Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies
In particular, Usenet news messages include a variety of optional message headers, which we discuss in Chapter 14, Network News Transfer Protocol (NNTP).
Internet Message Syntax

In this section, we build up the Internet message format standard headers from scratch, using ABNF. Taking the rules directly from the MFS specification, we start with the fundamental building blocks and build up a series of lexical tokens, which can be used to form message header field bodies. The syntax rules presented here often include elements that specify obsolete constructions. These elements are identified by the first four characters of obs- to indicate the element represents a structure that should be recognized and processed for backwards compatibility, but that should not be propagated by current or new applications. Obsolete syntax is discussed at more length later in this chapter.
Building Blocks
Building blocks define elements that are used to build up more meaningful lexical entities. The rules are taken directly from the MFS specification. Where clarification is necessary or helpful, comments have been added. In this section, all monospaced font text is taken directly from the MFS specification.
NON-ASCII CHARACTER SETS
Proposed standard RFC 2231, MIME Parameter Value and Encoded Word Extensions: Character Sets, Languages, and Continuations, describes a mechanism for allowing non-ASCII characters to be used in RFC 822 headers. Because MIME (Chapter 9) allows non-ASCII characters in headers as well as bodies, it becomes possible that these characters could appear in RFC 822 content-type and content-disposition header fields. RFC 2231 describes how these characters can be encapsulated within the RFC 822 headers.
Primitives
These primitive tokens are defined to indicate a single character to be used within an element. Each rule indicates that an instance of the rule can be represented by any character listed as an option for the rule. For example, the text rule specifies that the token can represent any ASCII character except for the carriage return or line feed. When the token text appears in another rule, that means that any of those characters can appear in that position. Each of these tokens represents a single character.
NO-WS-CTL  =  %d1-8 /          ; US-ASCII control characters
              %d11 /           ; that do not include the
              %d12 /           ; carriage return, line feed,
              %d14-31 /        ; and white space characters
              %d127
text       =  %d1-9 /          ; Characters excluding CR and LF
              %d11-12 /
              %d14-127 /
              obs-text
specials   =  "(" / ")" /      ; Special characters used in
              "<" / ">" /      ; other parts of the syntax
              "[" / "]" /
              ":" / ";" /
              "@" / "\" /
              "," / "." /
              DQUOTE
Quoted Characters
The quoted-pair rule allows characters that have some special meaning in the message header to be included in informational headers and interpreted as plain text. Sometimes a reserved character, like an angle bracket (< or >), needs to appear in a header field where it would otherwise be misinterpreted (for example, in the wrong part of an address header). In those cases, the reserved character should be quoted by preceding it with a backslash (\). The rule is as follows:
quoted-pair = ("\" text) / obs-qp
Folding White Space and Comments

As mentioned earlier, the specification allows some message headers to be folded where a logical break, in the form of a space or tab, appears in a header.
The constituent elements of folding white space (FWS) and comments are defined here.
FWS       =  ([*WSP CRLF] 1*WSP) /   ; Folding white space
             obs-FWS
ctext     =  NO-WS-CTL /             ; Non white space controls
             %d33-39 /               ; The rest of the US-ASCII
             %d42-91 /               ; characters not including "(",
             %d93-127                ; ")", or "\"
ccontent  =  ctext / quoted-pair / comment
comment   =  "(" *([FWS] ccontent) [FWS] ")"
CFWS      =  *([FWS] comment) (([FWS] comment) / FWS)
Note that comments start with an open parenthesis and end with the close parenthesis. Folding white space is permitted within a comment.
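Because comment refers to itself through ccontent, comments can nest, and a parser has to count parentheses rather than simply look for the next close parenthesis. The Python sketch below illustrates the idea under simplifying assumptions: the header value has already been unfolded, and the function name strip_comments is invented for the example.

```python
def strip_comments(value):
    """Remove (possibly nested) comments from an unfolded header value."""
    kept = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string, parentheses are literal
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value):
            # quoted-pair: the next character has no special meaning.
            if depth == 0:
                kept.append(value[i:i + 2])
            i += 2
            continue
        if depth == 0 and ch == '"':
            in_quotes = not in_quotes
            kept.append(ch)
        elif not in_quotes and ch == "(":
            depth += 1
        elif not in_quotes and ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            kept.append(ch)
        i += 1
    return "".join(kept)
```

Parentheses inside a quoted string, or preceded by a backslash, are left alone, which matches the way quoted-pair and quoted strings are described elsewhere in this chapter.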
Atoms
To represent simple strings, the atom elements are defined. Some simple strings include any printable character except for the specials, defined above; other strings include the dot or period (.). These rules define these atomic string entities.
atext          =  ALPHA / DIGIT /   ; Any character except controls,
                  "!" / "#" /       ; SP, and specials.
                  "$" / "%" /       ; Used for atoms
                  "&" / "'" /
                  "*" / "+" /
                  "-" / "/" /
                  "=" / "?" /
                  "^" / "_" /
                  "`" / "{" /
                  "|" / "}" /
                  "~"
atom           =  [CFWS] 1*atext [CFWS]
dot-atom       =  [CFWS] dot-atom-text [CFWS]
dot-atom-text  =  1*atext *("." 1*atext)
The rule for atoms is interpreted thus: An atom may optionally begin with a comment or folding white space, must consist of at least one character (atext,
defined above), and may optionally end with a comment or folding white space. Functionally, the comments and folding white space are not considered part of the atom for processing purposes.
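The atext and dot-atom-text rules translate fairly directly into regular expressions. The following Python sketch is an illustration only: it checks the bare atom and dot-atom-text forms and does not handle the optional surrounding CFWS; the names ATEXT, is_atom, and is_dot_atom_text are invented.

```python
import re

# Character class for atext: letters, digits, and the listed punctuation.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"

_ATOM = re.compile(rf"{ATEXT}+")
_DOT_ATOM_TEXT = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_atom(candidate):
    """True if the string is 1*atext (no surrounding CFWS considered)."""
    return _ATOM.fullmatch(candidate) is not None


def is_dot_atom_text(candidate):
    """True if the string is 1*atext *("." 1*atext)."""
    return _DOT_ATOM_TEXT.fullmatch(candidate) is not None
```

Note that a dot may only appear between runs of atext, so strings that start with a dot, end with a dot, or contain two dots in a row are rejected.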
Quoted Strings
You can include otherwise forbidden characters in a field when those characters are surrounded by the double quote (DQUOTE) character. For example, you can include "this is a quoted string $%^&" because it is surrounded by the double quote character. Quoted strings are treated semantically in the same way as atoms. The backslash and double quote characters are not defined as part of the qtext characters, but since the quoted-pair element is specified, the double quote and backslash can be included in quoted strings if they are expressed as quoted pairs.
qtext          =  NO-WS-CTL /   ; Non white space controls
                  %d33 /        ; The rest of the US-ASCII
                  %d35-91 /     ; characters not including "\"
                  %d93-127      ; or the quote character
qcontent       =  qtext / quoted-pair
quoted-string  =  [CFWS] DQUOTE *([FWS] qcontent) [FWS] DQUOTE [CFWS]
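As a small illustration of how quoted pairs work inside quoted strings, the Python sketch below wraps arbitrary text in double quotes and reverses the process. The function names are invented, and the sketch assumes the text contains no carriage returns or line feeds, which the text rule excludes.

```python
def quote_string(text):
    """Wrap text in DQUOTE, expressing backslash and DQUOTE as quoted pairs."""
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    return '"' + escaped + '"'


def unquote_string(quoted):
    """Reverse quote_string: drop the outer quotes and undo the quoted pairs."""
    body = quoted[1:-1]
    result = []
    i = 0
    while i < len(body):
        if body[i] == "\\" and i + 1 < len(body):
            result.append(body[i + 1])  # keep only the quoted character
            i += 2
        else:
            result.append(body[i])
            i += 1
    return "".join(result)
```

Backslashes are escaped before double quotes so that the backslashes introduced for the quotes are not themselves doubled.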
Miscellaneous Tokens
For the purposes of message headers, tokens are defined for things called words, phrases, and unstructured. A word is any atom or quoted string; a phrase is any sequence of one or more words; and unstructured refers to any sequence of text (see Primitives, earlier in this chapter) and folding white space.
word          =  atom / quoted-string
phrase        =  1*word / obs-phrase
unstructured  =  *([FWS] text)
Date and Time Syntax

All messages have at least one field containing a date and time: the orig-date field; date and time appear in other field headers as well. This section includes the rules that specify the elements that make up the date and time used in message headers. Implementations that express years in two digits are obsolete as they are incapable of properly expressing dates in the year 2000 and later; other obsolete elements are described later in this book. Text from the MFS specification is included after these rules to clarify issues related to date and time.
date-time    =  [ day-of-week "," ] date FWS time [CFWS]
day-of-week  =  ([FWS] day-name [FWS]) / obs-day-of-week
day-name     =  "Mon" / "Tue" / "Wed" / "Thu" /
                "Fri" / "Sat" / "Sun"
date         =  day month year
year         =  ([FWS] 4*DIGIT [FWS]) / obs-year
month        =  (FWS month-name FWS) / obs-month
month-name   =  "Jan" / "Feb" / "Mar" / "Apr" /
                "May" / "Jun" / "Jul" / "Aug" /
                "Sep" / "Oct" / "Nov" / "Dec"
day          =  ([FWS] 1*2DIGIT [FWS]) / obs-day
time         =  time-of-day FWS zone
time-of-day  =  hour ":" minute [ ":" second ]
hour         =  2DIGIT / obs-hour
minute       =  2DIGIT / obs-minute
second       =  2DIGIT / obs-second
zone         =  (( "+" / "-" ) 4DIGIT) / obs-zone
The day is the numeric day of the month. The year is any numeric year 1900 or later. The time-of-day specifies the number of hours, minutes, and optionally seconds since midnight of the date indicated. The date and time-of-day SHOULD express local time. The zone specifies the offset from Coordinated Universal Time (UTC, formerly referred to as "Greenwich Mean Time") that the date and time-of-day represent. The "+" or "-" indicates whether the time-of-day is ahead of or behind Universal Time. The first two digits indicate the number of hours difference from Universal Time, and the last two digits indicate the number of minutes difference from Universal Time. (Hence, +hhmm means +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm) minutes). The form "+0000" SHOULD be used to indicate a time zone at Universal Time. Though "-0000" also indicates Universal Time, it is used to indicate that the time was generated on a system that may be in a local time zone other than Universal Time. A date-time specification MUST be semantically valid. That is, the day-of-the-week (if included) MUST be the day implied by the date, the numeric day-of-month MUST be between 1 and the number of days allowed for the specified month (in the specified year), the time-of-day MUST be in the range 00:00:00 through 23:59:60 (the number of seconds allowing for a leap second; see [STD-12]), and the zone MUST be within the range -9959 through +9959.
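The zone arithmetic and the day-of-week check quoted above can be worked through in a short Python sketch. It relies on parsedate_to_datetime from the standard library email.utils module; the function names zone_to_minutes and day_name_matches are invented, and the second function assumes the header begins with a day name.

```python
from email.utils import parsedate_to_datetime

DAY_NAMES = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


def zone_to_minutes(zone):
    """Convert a zone such as '+0530' or '-0800' to signed minutes from UTC."""
    sign = -1 if zone[0] == "-" else 1
    hours, minutes = int(zone[1:3]), int(zone[3:5])
    return sign * (hours * 60 + minutes)


def day_name_matches(date_value):
    """Check that the leading day name is the day implied by the date."""
    parsed = parsedate_to_datetime(date_value)
    # datetime.weekday() counts Monday as 0, matching DAY_NAMES.
    return DAY_NAMES[parsed.weekday()] == date_value.strip()[:3]
```

For example, a zone of +0530 works out to 5 * 60 + 30 = 330 minutes ahead of Universal Time, and -0800 to 480 minutes behind it.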
Address Specification
Internet messages cannot exist without some syntax for addresses: They represent destinations and sources of messages. As will be apparent from the syntax below, in the Internet message format, an address may represent either a mailbox or a group. A mailbox can consist of a name-addr, an addr-spec, or an obs-mailbox (an obsolete form of mailbox). Moving through the rules, the name-addr element is defined as an optional display-name (a phrase, which is a sequence of one or more words) followed by optional comments or folding white space ([CFWS]), followed by a required open angle bracket (<), an addr-spec element, and a close angle bracket (>), followed by optional CFWS. In other words, a valid mailbox expression may look like either of these examples:
Joe Example <joe@example.com>
<joe@example.com>
Older implementations of messaging software often use a different format, where the addr-spec appears first, followed by a display-name surrounded by parentheses. In this construction, the name is treated as a comment (because it appears within parentheses; see the Folding White Space and Comments section). These legacy systems often actually process data within the parentheses, so the specification recommends against using any comments in address fields to avoid confusing such implementations. The group element defines a mechanism by which a single phrase can be used to act as an alias for multiple mailboxes. Groups are constructed of a phrase representing the group, followed by a colon, any number of mailboxes separated by commas, and a closing semicolon. The group element can include all the member mailboxes in a group, though it can also be displayed in the message header field with no mailboxes. When the group displays zero mailboxes, it indicates to recipients that the message was sent to a (possibly large) group without incorporating a (possibly long) list of mailboxes in each message. An address-list contains one or more addresses, set off by commas. These may include either mailboxes or groups. A mailbox-list contains one or more mailboxes, set off by commas. The addr-spec element is central to any email address. Simply put, the addr-spec is a string followed by the at-sign character (@) followed by a fully qualified Internet domain identifier. The domain is used by the message delivery system to determine where the message should be routed across the Internet, while the string is used locally within that domain.
address         =  mailbox / group
mailbox         =  name-addr / addr-spec / obs-mailbox
name-addr       =  [display-name] [CFWS] "<" addr-spec ">" [CFWS]
group           =  display-name ":" [mailbox-list / CFWS] ";" [CFWS]
display-name    =  phrase
mailbox-list    =  (mailbox *("," mailbox)) / obs-mbox-list
address-list    =  address *("," address) / obs-addr-list
addr-spec       =  local-part "@" domain
local-part      =  dot-atom / quoted-string / obs-local-part
domain          =  dot-atom / domain-literal / obs-domain
domain-literal  =  [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
dcontent        =  dtext / quoted-pair
dtext           =  NO-WS-CTL /   ; Non white space controls
                   %d33-90 /     ; The rest of the US-ASCII
                   %d94-127      ; characters not including "[",
                                 ; "]", or "\"
Overall Message Syntax

The ABNF syntax for an Internet standard message is quite simple. The message consists of the fields element (which contains all possible optional and required header fields) and an optional section starting with a CRLF followed by a message body. The body consists of any number of lines of up to 998 text characters, each terminated by a CRLF pair.
message  =  (fields / obs-fields) [CRLF body]
body     =  *(*998text CRLF) *998text
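The overall message rule can be illustrated with a short Python sketch that separates the header section from the body at the first empty line and reports body lines that exceed the 998-character limit. The function names are invented, and the sketch assumes the raw message uses CRLF line endings and leaves folded headers as separate lines.

```python
def split_message(raw):
    """Split a raw message into (list of header lines, body string)."""
    # The header section ends with CRLF; a further CRLF introduces the body.
    head, separator, body = raw.partition("\r\n\r\n")
    if not separator and head.endswith("\r\n"):
        head = head[:-2]  # message with no body: drop the final CRLF
    return head.split("\r\n"), body


def overlong_lines(body, limit=998):
    """Return the 1-based numbers of body lines longer than `limit` characters."""
    lines = body.split("\r\n")
    return [number for number, line in enumerate(lines, start=1) if len(line) > limit]
```

A fuller parser would go on to unfold continuation lines and split each header line at its first colon.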
Header Field Syntax

The fields element consists of any permitted combination of header fields. These fields are defined by their own rules, using the elements defined above. Field names are specified on the left. Note that they sometimes differ from the header names that appear in the messages. Where additional elements are defined for specific header fields, they are included below the field definitions.
Origination Date and Originator Fields

orig-date  =  "Date:" date-time CRLF
from       =  "From:" mailbox-list CRLF
sender     =  "Sender:" mailbox CRLF
reply-to   =  "Reply-To:" address-list CRLF
Destination Fields
to   =  "To:" address-list CRLF
cc   =  "Cc:" address-list CRLF
bcc  =  "Bcc:" (address-list / [CFWS]) CRLF
Message Identification
message-id       =  "Message-ID:" msg-id CRLF
in-reply-to      =  "In-Reply-To:" 1*msg-id CRLF
references       =  "References:" 1*msg-id CRLF
msg-id           =  [CFWS] "<" id-left "@" id-right ">" [CFWS]
id-left          =  dot-atom-text / no-fold-quote / obs-id-left
id-right         =  dot-atom-text / no-fold-literal / obs-id-right
no-fold-quote    =  DQUOTE *(qtext / quoted-pair) DQUOTE
no-fold-literal  =  "[" *(dtext / quoted-pair) "]"
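To see what a generated msg-id looks like in practice, the Python sketch below uses make_msgid from the standard library email.utils module, which returns an identifier of the form <left@right> including the angle brackets. The wrapper name new_message_id and the regular expression are assumptions for illustration; the expression is a loose shape check, not a full implementation of the msg-id rule.

```python
import re
from email.utils import make_msgid

# Loose shape check: "<", something without spaces or brackets, "@", same, ">".
MSG_ID_SHAPE = re.compile(r"^<[^<>@\s]+@[^<>@\s]+>$")


def new_message_id(domain):
    """Generate a unique message identifier whose right side is `domain`."""
    return make_msgid(domain=domain)


def looks_like_msg_id(value):
    """True if the value has the general <id-left@id-right> shape."""
    return MSG_ID_SHAPE.match(value) is not None
```

The right side identifies the generating host or domain, while the left side only has to be unique on that host.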
Message Information
subject   =  "Subject:" unstructured CRLF
comments  =  "Comments:" unstructured CRLF
keywords  =  "Keywords:" phrase *("," phrase) CRLF
Resent Fields
resent-date   = "Resent-Date:" date-time CRLF
resent-from   = "Resent-From:" mailbox-list CRLF
resent-sender = "Resent-Sender:" mailbox CRLF
resent-to     = "Resent-To:" address-list CRLF
resent-cc     = "Resent-Cc:" address-list CRLF
resent-bcc    = "Resent-Bcc:" (address-list / [CFWS]) CRLF
resent-msg-id = "Resent-Message-ID:" msg-id CRLF
Trace Fields
trace         = [return] 1*received
return        = "Return-Path:" path CRLF
path          = ([CFWS] "<" ([CFWS] / addr-spec) ">" [CFWS]) / obs-path
received      = "Received:" name-val-list ";" date-time CRLF
name-val-list = [CFWS] [name-val-pair *(CFWS name-val-pair)]
name-val-pair = item-name CFWS item-value
item-name     = ALPHA *(["-"] (ALPHA / DIGIT))
item-value    = addr-spec / atom / domain / msg-id
Optional Fields
optional-field = field-name ":" unstructured CRLF
field-name     = 1*ftext
ftext          = %d33-57 /          ; Any character except
                 %d59-126           ; controls, SP, and
                                    ; ":".
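The optional-field rule is simple enough to check by hand. The following Python sketch, with hypothetical function names, tests whether a field name consists only of ftext characters and splits a header line at its first colon. It assumes the line has already been unfolded and stripped of its CRLF.

```python
def is_ftext(ch: str) -> bool:
    """True if ch is in %d33-57 or %d59-126 (printable US-ASCII except ':')."""
    code = ord(ch)
    return 33 <= code <= 57 or 59 <= code <= 126


def is_field_name(name: str) -> bool:
    """field-name = 1*ftext, so at least one character is required."""
    return len(name) > 0 and all(is_ftext(ch) for ch in name)


def split_optional_field(line: str):
    """Split 'name: value' at the first colon; return None if the name is invalid."""
    name, colon, value = line.partition(":")
    if not colon or not is_field_name(name):
        return None
    return name, value.strip()


print(split_optional_field("X-Mailer: AOL 4.0 for Windows 95 sub 4"))
print(split_optional_field("Bad Name: value"))
```

The second call returns None because the space in the field name falls outside the ftext range.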
Obsolete Message Syntax

Throughout this chapter, the ABNF rules have included elements indicating that some headers may contain obsolete formations. For example, date and time headers may contain data that fails to conform to the modern standard but that conformed to the more liberal, obsolete standards. As mentioned earlier, the standard specifies the obsolete syntax not to indicate that it is now acceptable to implement the specification with obsolete syntax, but rather to enable current implementations to interpret headers generated by obsolete implementations.
The MFS specification reiterates that modern implementations should not discard information or crash because data they receive does not conform to the current specification. The rule of thumb, be liberal in what you accept and conservative in what you send, applies here. For example, messaging applications should not silently discard characters if a received line exceeds the limit of 998 characters per line, but should attempt to recover data that may be malformed. Likewise, applications should be robust enough to withstand receipt of unexpected or malformed data, so they do not crash if they receive an improperly formatted destination header.
The latest specification tightens up the rules for permitted characters, for example, forbidding the use of the carriage return or line feed characters except as a pair to terminate a line. Similarly, MFS is more conservative about where comments and folding white space are permitted, to improve the ease with which header fields can be parsed and interpreted.
Two other categories of obsolete formations are the date and time elements and the addressing elements. The rules for addresses have been made more restrictive, though obsolete addressing elements can include the more liberal interpretations, including the use of parentheses to set off a display name rather than angle brackets to set off the mailbox address. Obsolete date and time elements use character strings to identify time zones, whereas the more restrictive standard now prefers identifying a time zone through a numerical offset. Significant too is the new specification's use of four-digit years rather than two-digit years. MFS specifies how the obsolete two-digit years are to be interpreted. Obsolete date and time element rules are also more liberal in permitting insertion of comments and folding white space.
In addition, there is an obsolete syntax for folding white space that used to allow the possibility of folding a line (inserting a CRLF in front of some white space character) that contains nothing but white space. In other words, this syntax permitted a line that has nothing but white space in it, something not permitted in the more recent specification. ABNF rules for obsolete syntax, taken from the MFS specification, are included below. See the specification for additional discussion of obsolete syntax.
Obsolete Miscellaneous Tokens

obs-qp     = "\" (%d0-127)
obs-text   = *LF *CR *(obs-char *LF *CR)
obs-char   = %d0-9 / %d11 /         ; %d0-127 except CR and
             %d12 / %d14-127        ; LF
obs-phrase = word *(word / "." / CFWS)
Obsolete Folding White Space

obs-FWS = 1*WSP *(CRLF 1*WSP)
Obsolete Date and Time

obs-day-of-week = [CFWS] day-name [CFWS]
obs-year        = [CFWS] 2*DIGIT [CFWS]
obs-month       = CFWS month-name CFWS
obs-day         = [CFWS] 1*2DIGIT [CFWS]
obs-hour        = [CFWS] 2DIGIT [CFWS]
obs-minute      = [CFWS] 2DIGIT [CFWS]
obs-second      = [CFWS] 2DIGIT [CFWS]
obs-zone        = "UT" / "GMT" /    ; Universal Time
                                    ; North American UT
                                    ; offsets
                  "EST" / "EDT" /   ; Eastern: - 5/ - 4
                  "CST" / "CDT" /   ; Central: - 6/ - 5
                  "MST" / "MDT" /   ; Mountain: - 7/ - 6
                  "PST" / "PDT" /   ; Pacific: - 8/ - 7
                  %d65-73 /         ; Military zones - "A"
                  %d75-90 /         ; through "I" and "K"
                  %d97-105 /        ; through "Z", both
                  %d107-122         ; upper and lower case
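An application that accepts obsolete time zones will usually want to convert them to numeric offsets before doing anything else. The sketch below maps the named zones from the obs-zone rule to the offsets given in its comments. The function name normalize_zone is hypothetical, and falling back to -0000 for military and unrecognized zones is an assumption of this sketch; consult the specification for its own recommendation.

```python
# Offsets taken from the comments in the obs-zone rule.
OBS_ZONE_OFFSETS = {
    "UT": "+0000", "GMT": "+0000",
    "EST": "-0500", "EDT": "-0400",
    "CST": "-0600", "CDT": "-0500",
    "MST": "-0700", "MDT": "-0600",
    "PST": "-0800", "PDT": "-0700",
}


def normalize_zone(zone: str) -> str:
    """Return a numeric offset for a modern or obsolete zone string."""
    zone = zone.strip()
    if len(zone) == 5 and zone[0] in "+-" and zone[1:].isdigit():
        return zone  # already a numeric offset
    # Assumption: anything not in the table is treated as "offset unknown".
    return OBS_ZONE_OFFSETS.get(zone.upper(), "-0000")


for sample in ("EST", "pdt", "-0500", "Z"):
    print(sample, normalize_zone(sample))
```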
Obsolete Addressing
obs-mailbox     = addr-spec / [display-name] obs-route-addr
obs-route-addr  = [CFWS] "<" [obs-route] addr-spec ">" [CFWS]
obs-route       = [CFWS] obs-domain-list ":" [CFWS]
obs-domain-list = "@" domain *(*(CFWS / "," ) [CFWS] "@" domain)
obs-local-part  = atom *("." atom)
obs-domain      = atom *("." atom)
obs-mbox-list   = *([mailbox] [CFWS] "," [CFWS])
obs-addr-list   = *([address] [CFWS] "," [CFWS])
Obsolete Origination Date Field

obs-orig-date = "Date" *WSP ":" date-time CRLF
Obsolete Originator Fields

obs-from     = "From" *WSP ":" mailbox-list CRLF
obs-sender   = "Sender" *WSP ":" mailbox CRLF
obs-reply-to = "Reply-To" *WSP ":" mailbox-list CRLF
Obsolete Destination Address Fields

obs-to  = "To" *WSP ":" address-list CRLF
obs-cc  = "Cc" *WSP ":" address-list CRLF
obs-bcc = "Bcc" *WSP ":" (address-list / [CFWS]) CRLF
Obsolete Identification Fields

obs-message-id  = "Message-ID" *WSP ":" msg-id CRLF
obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF
obs-references  = "References" *WSP ":" *(phrase / msg-id) CRLF
obs-id-left     = local-part
obs-id-right    = domain
Obsolete Informational Fields

obs-subject  = "Subject" *WSP ":" unstructured CRLF
obs-comments = "Comments" *WSP ":" unstructured CRLF
obs-keywords = "Keywords" *WSP ":" *([phrase] ",") CRLF
Obsolete Resent Fields

obs-resent-from = "Resent-From" *WSP ":" mailbox-list CRLF
obs-resent-send = "Resent-Sender" *WSP ":" mailbox CRLF
obs-resent-date = "Resent-Date" *WSP ":" date-time CRLF
obs-resent-to   = "Resent-To" *WSP ":" address-list CRLF
obs-resent-cc   = "Resent-Cc" *WSP ":" address-list CRLF
obs-resent-bcc  = "Resent-Bcc" *WSP ":" (address-list / [CFWS]) CRLF
obs-resent-mid  = "Resent-Message-ID" *WSP ":" msg-id CRLF
obs-resent-rply = "Resent-Reply-To" *WSP ":" address-list CRLF
Obsolete Trace Fields

obs-return   = "Return-Path" *WSP ":" path CRLF
obs-received = "Received" *WSP ":" name-val-list CRLF
obs-path     = obs-route-addr
Obsolete Optional Fields

obs-optional = field-name *WSP ":" unstructured CRLF
Examples
These examples are taken from actual messages. The first example shows a set of headers taken from a simple email message, the second shows a set of headers taken from a message received through a mailing list discussion, and the third shows a set of headers taken from a Usenet news group posting. Other headers may be defined in other chapters. In particular, Chapter 14 highlights special headers defined for news, and Chapter 10 highlights headers defined and used by SMTP.
Simple Email Message

The headers shown below were taken from an actual message received by the author. Note that the headers are not required to appear in any particular order, so the From: header is inserted between two Received: headers. We've formatted the text for readability. Note the different ways date and time are rendered in different headers. In some cases, the obsolete character time zones are used, and in others, the more conservative time zone offset is used.
Note the use of folding white space and comments in the Received: headers. The comments appear within parentheses and indicate information about the SMTP servers that added those headers. Note also the optional header fields that have been incorporated here, including the X- headers, a MIME version header, and a content-type and content-transfer-encoding header. These headers relate to MIME and are discussed in Chapter 9.
Return-Path: <X99999y@aol.com>
Received: from chmls06.example.net ([24.128.1.71])
    by chmls01.example.net (Netscape Messaging Server 3.01)
    with ESMTP id AAA16025 for <loshin@example.net>;
    Sat, 6 Mar 1999 00:01:42 -0500
Received: from chmls04.example.net (chmls04 [24.128.1.114])
    by chmls06.example.net (8.8.7/8.8.7) with ESMTP id AAA09071
    for <loshin@example.net>; Sat, 6 Mar 1999 00:01:40 -0500 (EST)
From: X99999y@aol.com
Received: from maildeliver0.xxxx.net (maildeliver0.xxxx.net [199.0.65.19])
    by chmls04.example.net (8.8.7/8.8.7) with ESMTP id AAA07064
    for <loshin@example.net>; Sat, 6 Mar 1999 00:01:39 -0500 (EST)
Received: from mx1.xxxx.net (mx1.xxxx.net [199.0.65.251])
    by maildeliver0.xxxx.net (8.8.8/8.8) with ESMTP id AAA06683
    for <pete@loshin.com>; Sat, 6 Mar 1999 00:01:40 -0500 (EST)
Received: from imo14.mx.aol.com (imo14.mx.aol.com [198.81.17.4])
    by mx1.xxxx.net (8.8.8/8.6.9) with ESMTP id AAA02267
    for <pete@loshin.com>; Sat, 6 Mar 1999 00:01:40 -0500 (EST)
Received: from X99999y@aol.com
    by imo14.mx.aol.com (IMOv19.3) id lOBWa23192
    for <pete@loshin.com>; Sat, 6 Mar 1999 00:01:03 -0500 (EST)
Message-ID: <8907fa44.36e0b68f@aol.com>
Date: Sat, 6 Mar 1999 00:01:03 EST
To: pete@loshin.com
Mime-Version: 1.0
Subject: Re: Lunch on Tuesday
Content-type: text/plain; charset=US-ASCII
Content-transfer-encoding: 7bit
X-Mailer: AOL 4.0 for Windows 95 sub 4
X-Mozilla-Status: 8011
X-Mozilla-Status2: 00000000
X-UIDL: 19990306050143.AAA.50a9b710
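To see how software reads headers like these, the following sketch feeds a shortened copy of the example to Python's standard email package. Only a few of the headers are reproduced, and the placement of the folds in the raw string is an assumption made for readability. The point is that repeated Received: headers are returned as a list, folded lines are joined into one value, and the obsolete EST zone in the Date: header is still understood.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

raw = (
    "Return-Path: <X99999y@aol.com>\n"
    "Received: from imo14.mx.aol.com (imo14.mx.aol.com [198.81.17.4])\n"
    "\tby mx1.xxxx.net (8.8.8/8.6.9) with ESMTP id AAA02267\n"
    "\tfor <pete@loshin.com>; Sat, 6 Mar 1999 00:01:40 -0500 (EST)\n"
    "From: X99999y@aol.com\n"
    "Received: from X99999y@aol.com by imo14.mx.aol.com (IMOv19.3)\n"
    "\tid lOBWa23192 for <pete@loshin.com>; Sat, 6 Mar 1999 00:01:03 -0500 (EST)\n"
    "Date: Sat, 6 Mar 1999 00:01:03 EST\n"
    "Subject: Re: Lunch on Tuesday\n"
    "\n"
)

msg = message_from_string(raw)

# Header order is preserved; repeated fields come back as a list.
received = msg.get_all("Received")

# Collapse the folding white space of the first Received: header.
first_hop = " ".join(received[0].split())

# The obsolete "EST" zone is converted to a numeric offset.
sent = parsedate_to_datetime(msg["Date"])

print(len(received))
print(first_hop)
print(sent.isoformat())
```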
Mailing List Message

The following message headers are taken from an actual mailing list message. Rather than reproduce all the Received: headers (which are similar to those shown in the previous example), we've edited them out to highlight the other header fields used in this message. Note the presence of custom-specified optional header fields used to indicate mailing list information. These include the address to which unsubscribe requests should be sent (List-Unsubscribe:), the name and version of the list software (List-Software:), the address to which subscription requests should be sent (List-Subscribe:), and the address to which requests to the entity that owns the list should be sent (List-Owner:).
Return-Path: <bounce-cbp-46885@ls.datareturn.com>
Date: Sat, 6 Mar 1999 10:17:13
Subject: RE: why can't we all just get along
To: "Friendship List" <fl@ls.example.com>
From: "Kathy Smith" <kathy@smith.com>
List-Unsubscribe: <mailto:leave-fl-46885R@ls.friendship.com>
List-Software: Lyris Server version 3.0
List-Subscribe: <mailto:subscribe-fl@ls.friendship.com>
List-Owner: <mailto:owner-fl@ls.friendship.com>
X-URL: <http://www.friendship.com>
X-List-Host: friendship site <http://www.friendship.com>
Reply-To: "Friendship List" <cbp@ls.example.com>
Sender: bounce-cbp-46885@ls.example.com
Message-ID: <LYR46885-6445-1999.03.06-10.17.15 pete#loshin.com@ls.example.com>
Precedence: bulk
X-Mozilla-Status: 0000
X-Mozilla-Status2: 00000000
X-UIDL: 19990306162033.AAA.50a9b710
Usenet News Posting

The following message headers are taken from an actual posting made to a Usenet newsgroup. The Path: header has been edited for clarity: many intermediate nodes have been removed and replaced with the string [...] because this line does not fold. Note the presence of additional header fields, including the Newsgroups:, NNTP-Posting-Host:, Organization:, and Lines: headers. All these headers are specific to NNTP messages and are discussed in Chapter 14.
Path: lwnws01.example.net![...]!the-fly.zip.com.au!not-for-mail
From: gmkelly@bushrock.zipzip.com.au ()
Newsgroups: comp.os.linux
Subject: Re: Kernel 2.2.2
Date: 26 Feb 1999 03:49:37 GMT
Organization: Example Corporation
Lines: 21
Message-ID: <7b55kh$m8b$1@the-fly.zip.com.au>
References: <7b50fj$snv$1@hermes.louisville.edu>
NNTP-Posting-Host: 61.8.18.131
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
X-Newsreader: knews 1.0b.0
Xref: chnws05.example.net comp.os.linux:13699
Reading List
The following RFCs are most relevant to this chapter:
RFC 822, Standard for the Format of ARPA Internet Text Messages, provides a good introduction with a historical perspective, even though it is in the process of being updated.
The Internet Message Format Standard is the specification submitted by the DRUMS working group as an update of RFC 822. This is where to find all the up-to-date information about Internet message formatting. This Internet-Draft had been submitted for a last call early in 1999, but no protocol action had been taken by the end of summer 1999.
RFC 2076, Common Internet Message Headers, is an informational RFC. It is a collection of Internet message headers that are commonly used. It also contains information about how those headers are used and where to find out more about them. This document is also in the process of being updated.
RFC 2234, Augmented BNF for Syntax Specifications: ABNF, and its successors are crucial for anyone interested in working with Internet protocols. It specifies the language (Augmented Backus-Naur Form) used to describe Internet protocol syntax.
CHAPTER
9
Multipurpose Internet Mail Extensions (MIME)
As we saw in Chapter 8, Internet Message Format Standard, the messaging infrastructure was designed to accommodate 7-bit ASCII characters only. After all, that was what messages would consist of. Perhaps your client software could handle binary or 8-bit data; perhaps even your organizational messaging systems are all built to handle 8-bit binary data. However, if you plan on sending messages outside your organization, chances are good to excellent that at some point on their journey your beautiful binary data will be munged by some intermediate message server or gateway. You have no control over what happens to your messages once they leave your domain for forwarding across the Internet. If you want to send binary data, you must use some mechanism that allows 8-bit data to be encoded in a way that is not affected by systems that recognize only 7-bit data and that may strip off the eighth bit if it is not set to zero. The Multipurpose Internet Mail Extensions (MIME) standards provide that mechanism. It is not enough to come up with a way to translate 8-bit data into 7-bit data and then translate it back. You must be able to do that translation consistently and interoperably, across any type of network or system. There are many clever ways to encode data, but unless the person or system at the receiving end uses the exact same mechanism as you, they will not be able to reliably interpret your nontext enclosures.
RFC 822 and the Internet Message Format Standard (MFS) discussed in Chapter 8 specify how message headers are to be formatted, but they say almost nothing about how message bodies are to be formatted. Other than limiting the bodies to US-ASCII characters laid out in lines no longer than 998 characters and terminated by a carriage return/line feed pair, RFC 822 and MFS leave the message body alone. The MIME specifications address how to format message bodies and how to interoperably represent the contents of any particular message body to any application software across any network transport. In this chapter, we examine the MIME specifications. We start by introducing the specifications and explaining how they work, and then introduce the format that Internet message bodies may take and review the originally specified MIME content types and subtypes. We continue with a discussion of extensions to the RFC 822/MFS headers required for MIME headers and a discussion of MIME conformance procedures and issues. We finish the chapter with a discussion of the steps necessary to register new MIME content types, followed by an overview of some of the MIME content types that have already been specified in RFCs. The Secure MIME (S/MIME) specification is discussed in Chapter 17, Internet Messaging Security.
MIME Specification Overview

There are five RFCs central to the MIME standard. Rather than attempt to address all MIME issues in a single document, these five documents provide guidance for each facet of the specification. The RFCs include:
RFC 2045, Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. This document specifies the headers that describe the structure of MIME messages.
RFC 2046, Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. This document specifies how the MIME media typing system works and specifies a basic set of media types.
RFC 2047, MIME (Multipurpose Internet Mail Extensions) Part Three: Message Header Extensions for Non-ASCII Text. This document describes mechanisms that allow the inclusion of non-US-ASCII text in RFC 822/MFS-compliant message headers.
RFC 2048, Multipurpose Internet Mail Extensions (MIME) Part Four: Registration Procedures. This document is a Best Current Practices document describing the procedures to be followed by individuals or groups wishing to register new MIME facilities.
RFC 2049, Multipurpose Internet Mail Extensions (MIME) Part Five: Conformance Criteria and Examples. This document describes a baseline of interoperability for MIME implementations. In other words, in order to conform to the MIME standards, implementations must behave in a predictable and uniform manner when confronted with MIME data.
Except for RFC 2048, which is a BCP document, these RFCs are all standards-track documents, and all are widely implemented in commercial and production applications. This chapter mirrors the organization implied by the division of the MIME specification into five RFCs. The next section is drawn from RFC 2045 and discusses the format of the Internet message body as exemplified by the MIME standard. As outlined in these RFCs, the purpose of MIME is to create a new definition for message format that permits the inclusion of the following:
Message bodies that contain textual data in non-US-ASCII character sets. As defined in RFC 822/MFS, Internet messages can only represent US-ASCII data, restricting the transmission of messages not only in all the world languages that don't use the Latin alphabet, but also in most European languages that use a slightly different character set. Most European languages use characters with accents or other diacritical marks that cannot be properly rendered in US-ASCII.
An extensible mechanism by which different formats can be used to include nontextual message bodies. It's one thing to send files as enclosures, but quite another to make it possible for systems to identify what kind of file is being enclosed so the system can represent it appropriately. For example, MIME enclosures make it possible for email client software to sense when an image file is being attached to the message and, rather than simply indicate its presence, actually display the image.
Multipart message bodies. It is not enough to attach a file or other enclosure to a message; there are times when it is desirable to include more than one entity in a single message, where an entity is the header and contents of either a MIME message or one of the parts of a multipart message.
Permitting a message to consist of more than one part makes many new applications possible. Multipart messages will be discussed at greater length later in this chapter.
Textual header information in character sets other than the US-ASCII set. As with message bodies, there are times when text is incorporated into message headers that cannot be properly rendered in the US-ASCII character set. In particular, names and other proper nouns often contain characters with accents or other diacritical marks that cannot be properly rendered in US-ASCII.
As we see later in this book, MIME makes it possible to define new types of attachments that can be used by new kinds of applications. For example, calendaring and scheduling applications rely on existing message forwarding infrastructure and protocols for creating, sending, and receiving messages, but they create new MIME types that allow the interoperable exchange of specific types of information.
Format of Internet Message Bodies

RFC 2045 defines the format of the Internet message body in terms of its headers. MIME headers appear along with other RFC 822/MFS headers (as we saw in the examples provided in Chapter 8), but MIME headers can also appear after the RFC 822/MFS text body. MIME headers are RFC 822/MFS-compliant headers: They must adhere to the same guidelines and are defined by ABNF rules. Five header fields are defined in RFC 2045. They are the following:
MIME-Version. This field is the most straightforward, as it is used simply to signal to applications receiving the message that it is MIME-conformant. So far, the only valid version number is 1.0. If this header has some other value, the recipient knows the message may not be MIME-conformant.
Content-Type. This header specifies the media type and subtype of the data in the body of the message. This is where the MIME header identifies what is inside the MIME body. These values indicate whether the MIME body contains application data or an image or a program or some other kind of content. Parameters may be used with this header. We discuss media types and subtypes, as well as parameters for this header field, in the next section.
Content-Transfer-Encoding. While the Content-Type header field contains information pertaining to the content of the MIME body, this header field indicates how that content is encoded for transport. Two pieces of information are encapsulated in this single header. First, this header indicates whether the body contains data that was encoded differently from its original form. Second, it indicates what kind of data (7bit, 8bit, or binary) the body contains. We discuss this in greater detail in the section titled Data Transfer Encoding.
Content-ID. This header can be used in the same way that a Message-ID header field is used in RFC 822/MFS messages, that is, to label a MIME body uniquely so that one body can refer to another. We discuss this in greater detail in the section titled Other MIME Header Fields.
Content-Description. This field can contain descriptive information about the MIME body contents. It is optional and is intended merely for informational purposes, for example, identifying the body contents as containing a picture of a puppy or an audio sound clip of a song.
Custom or future MIME header fields are permitted as long as they conform to the proper formats and limitations, and as long as they start with the string Content- to identify them as MIME header fields.
Content-Type Header Field

The Content-Type header field defines what the content of the MIME entity is. If this header is present, it must contain at least two pieces of information: the content type and the content subtype. The content type is a general descriptor, and RFC 2046 defines a set of seven content types, five discrete (meaning they apply to single MIME entities) and two composite (meaning they define MIME entities that contain more than one entity). We examine each of the different types in the section MIME Content Types/Subtypes below. For now, it is enough to understand that discrete MIME types include the categories text, image, audio, video, and application, and that the composite MIME types are multipart and message. The content subtype provides further information about the type of content enclosed in the MIME entity and is not optional. If you specify a type, you must also specify a subtype. Within the type text, there is a subtype of plain. This header field may also contain one or more parameters, set off by semicolons. For example, within the text/plain MIME type/subtype, one option is to specify a character set (charset) to indicate what kind of text is being used. The default is US-ASCII, and if the Content-Type header field is not present in a MIME message, this is the default type/subtype and parameter. The default Content-Type header field would appear like this, with the type and subtype following the text Content-type: and set off by a forward slash; the parameter follows:
Content-type: text/plain; charset=US-ASCII
The ABNF syntax for the Content-Type header field, taken directly from RFC 2045, follows:
content := "Content-Type" ":" type "/" subtype
           *(";" parameter)
           ; Matching of media type and subtype
           ; is ALWAYS case-insensitive.
type := discrete-type / composite-type
discrete-type := "text" / "image" / "audio" / "video" /
                 "application" / extension-token
composite-type := "message" / "multipart" / extension-token
extension-token := ietf-token / x-token
ietf-token := <An extension token defined by a
               standards-track RFC and registered
               with IANA.>
x-token := <The two characters "X-" or "x-" followed, with
            no intervening white space, by any token>
subtype := extension-token / iana-token
iana-token := <A publicly-defined extension token. Tokens
               of this form must be registered with IANA
               as specified in RFC 2048.>
parameter := attribute "=" value
attribute := token
             ; Matching of attributes
             ; is ALWAYS case-insensitive.
value := token / quoted-string
token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
            or tspecials>
tspecials := "(" / ")" / "<" / ">" / "@" /
             "," / ";" / ":" / "\" / <">
             "/" / "[" / "]" / "?" / "="
             ; Must be in quoted-string,
             ; to use within parameter values
Working down this ABNF syntax from the first line, it is worth noting that the content type/subtype representation is not case sensitive. Note also that the five discrete and two composite types are defined here. The element extension-token provides room for expansion, offering two options for increasing the number of permitted MIME types. One option is for the IETF to approve additional types and subtypes. The other is to permit anyone else to create his or her own MIME types that begin with the string X- or x- to indicate that the type/subtype has not been officially sanctioned. Note also that while the IETF has the authority to create extension tokens for both types and subtypes, the IANA has the authority to create extension tokens for subtypes only. As the transition away from the IANA toward the ICANN (or other organizational entities) continues, this may change.
The authors of RFC 2045 intended that the five discrete MIME types defined there would be comprehensive. New MIME types should all be classifiable as subtypes of one of the five fundamental types. Though new types must be approved through IETF standards action, relatively few restrictions are placed on the definition of new subtypes. Classifying new subtypes is considered a positive development, as long as those new subtypes are truly new. As we see later, MIME subtypes can be used to adapt the Internet messaging infrastructure for use as an application transport protocol. The application itself uses a new MIME type to carry the application data, and the MIME entity is transported from host to host using the familiar messaging transports.
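The behavior described in this section can be observed with the Message class from Python's standard email package. The sketch below is only an illustration of how one existing implementation treats the Content-Type header field; the mixed-case header value is an invented example. It shows case-insensitive matching of the type, subtype, and attribute name, removal of the quotes from a quoted-string value, and the default that applies when the header is absent.

```python
from email.message import Message

msg = Message()
msg["Content-Type"] = 'Text/Plain; CharSet="US-ASCII"'

# Type and subtype are matched without regard to case,
# so the library reports them in lowercase.
full_type = msg.get_content_type()
main_type = msg.get_content_maintype()
sub_type = msg.get_content_subtype()

# Attribute names are also case-insensitive; the quoted-string
# value is returned without its surrounding quotes.
charset = msg.get_param("charset")

# With no Content-Type header at all, the default type/subtype applies.
default_type = Message().get_content_type()

print(full_type, main_type, sub_type, charset, default_type)
```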
Data Transfer Encoding

We mentioned above that the Content-Transfer-Encoding header field contains two pieces of information. However, the syntax for this header field allows only a single value. There are five explicit values defined for this element (though it is extensible either through the IETF or privately), and there is one more, implicit, value. These values can represent domains, which describe the kind of data inside the attachment, or they can represent transformations, which describe what has to be done to the data to return it to its original form. The three data domains are as follows:
7bit. This refers to the seven-bit US-ASCII character set. When data is described as being 7bit, it means that only US-ASCII characters are in use and that all data is in lines of no more than 998 characters terminated by the carriage return/line feed combination. No octets with a decimal value greater than 127 are permitted (seven-bit values only). The carriage return and line feed characters may appear only in combination to terminate lines, and the NUL character (decimal value of 0) is not permitted.
8bit. This refers to data that includes characters taken from character sets that use all eight bits of each byte. When data is described as being 8bit, it means that each octet represents a character and that all data is in lines of no more than 998 characters terminated by the carriage return/line feed combination. The carriage return and line feed characters may appear only in combination to terminate lines, and the NUL character (decimal value of 0) is not permitted.
binary. This refers to data that has no restrictions on the values its octets may take and that is not necessarily organized into lines.
All data falls into one of these domains. The most restrictive is 7bit, as only 127 characters are permitted and restrictions are placed on how they are organized (carriage return and line feed must always terminate a line together and may never appear separately).
Binary data has the fewest restrictions. Although the data itself may have internal organization, that organization may arrange the bits of the data in any possible combination without restriction. For Internet messaging, most MIME data is in the 7bit domain because SMTP requires that all messages be transmitted using seven-bit US-ASCII characters in lines consisting of no more than 998 characters. That means all MIME data transmitted via SMTP is in the 7bit domain, even if the original material in the entity was not US-ASCII text. However, MIME is extensible and can be used for transports other than SMTP, so there is no reason why a MIME attachment could not be represented as 8bit or binary data as long as that kind of data is permitted on the message transport infrastructure.
However, the restriction on nontext data over SMTP presents a problem if you're trying to attach nontext content in a MIME body. Some kind of transformation must be performed on the data to give it a representation that consists only of US-ASCII text and make it possible to transform it back into its original form. The following are the three transformations defined for MIME data:
[identity] This is not, strictly speaking, a transformation. The identity transformation simply means that nothing was done to the data to turn it into valid US-ASCII data and that nothing has to be done to it to turn it back into its original form.
quoted-printable This transformation is used for data that contains all or mostly US-ASCII text or that contains characters of which all or most correspond to US-ASCII text. Thus, the quoted-printable transformation can be used to convert data that uses Latin character sets into US-ASCII character sets. It can also be used to maintain the exact formatting of US-ASCII text and prevent reformatting of lines as the entity passes through intermediate servers. Each non-US-ASCII text character is converted into one or more US-ASCII text characters in a way that the changed characters are not affected by any processing done by intermediate systems, and in a way that allows the original data to be recovered by reversing the transformation. The quoted-printable transformation is useful for data that can be represented as US-ASCII text with relatively little change and offers the advantage that the content is still largely human readable (it retains much or all of its original form). Of course, the quoted-printable transformation is of value only for Latin-based character sets; for other character sets it offers little benefit.
base64 This transformation is used for data that is not text and line based; in other words, binary or nontextual data that may or may not be organized into lines separated by CRLF. If people don't need to be able to read the raw data (as with compiled programs or image files), the data can be transformed using the base64 encoding. This transformation uses a 64-character alphabet (the characters A-Z, a-z, 0-9, +, and /) to represent binary data; a 65th character, =, is used as padding to indicate that the end of the data has been reached. The transformation process works by taking 24 bits of binary data and converting those bits into a sequence of four encoded characters. Each three octets of binary data are grouped together and then divided into four sets of six bits each; each of those six-bit units corresponds to one of the 64 characters of the base64 alphabet.
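The two real transformations can be tried out with a short Python sketch. It is illustrative only: the helper name encode_group and the sample strings are assumptions made for this example, while base64 and quopri are modules from the Python standard library.

```python
import base64
import quopri
import string

# The 64-character base64 alphabet: A-Z, a-z, 0-9, +, and /.
ALPHABET = string.ascii_uppercase + string.ascii_lowercase + string.digits + "+/"


def encode_group(three_octets: bytes) -> str:
    """Encode exactly three octets (24 bits) as four base64 characters.

    Hypothetical helper for illustration; it does not handle padding.
    """
    if len(three_octets) != 3:
        raise ValueError("this sketch handles only complete 3-octet groups")
    bits = int.from_bytes(three_octets, "big")
    # Split the 24 bits into four 6-bit units, most significant first.
    return "".join(ALPHABET[(bits >> shift) & 0x3F] for shift in (18, 12, 6, 0))


manual = encode_group(b"Man")
library = base64.b64encode(b"Man").decode("ascii")

# Quoted-printable leaves the US-ASCII characters alone and rewrites
# only the octet that falls outside US-ASCII (e-acute in ISO-8859-1).
latin1 = "Caf\u00e9 au lait".encode("iso-8859-1")
qp = quopri.encodestring(latin1)
restored = quopri.decodestring(qp)

print(manual, library)
print(qp, restored == latin1)
```

The hand-rolled grouping and the library call should agree for any complete three-octet group; the library additionally handles the = padding needed when the input length is not a multiple of three.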
The idea behind the Content-Transfer-Encoding header is to use a single value to indicate two pieces of information. If the content has been transformed, a transformation mechanism is indicated; if the content has not been transformed, the content domain is indicated. Thus, if the content was originally 7bit, no transformation is done (the identity transformation) and the domain 7bit can be used for this header field. Full ABNF syntax defining the quoted-printable and base64 transformations is available in RFC 2045. Any type of content can be encoded with the base64 encoding method; there is nothing preventing an application from using the base64 method to encode US-ASCII content. However, for the most part, only 8bit or non-US-ASCII seven-bit content is suited to quoted-printable encoding (as well as US-ASCII content, of course). The Content-Transfer-Encoding header field is optional and is not required when the content uses the default encoding, 7bit. The default value looks like this:
Content-Transfer-Encoding: 7bit
The Content-Transfer-Encoding may have some relation to the Content-Type, but this is not always the case and should not be inferred. For example, the same application subtype may sometimes be encoded as 7bit and other times as binary. This is the ABNF syntax for the Content-Transfer-Encoding field:
encoding := "Content-Transfer-Encoding" ":" mechanism
mechanism := "7bit" / "8bit" / "binary" / "quoted-printable" / "base64" / ietf-token / x-token
Note that this field can be extended either through IETF action or through the use of new, privately specified encoding mechanisms, which are not recommended. While there is benefit to having many explicitly specified MIME subtypes, there is less benefit to defining more than a very few encoding transformations. Each new standard transformation would burden implementers by requiring them to incorporate new code in MIME-compliant software that would be able to perform that new transformation.
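As a rough illustration of the syntax above, the following Python sketch sorts a Content-Transfer-Encoding value into one of three groups. The function name classify_mechanism and the labels it returns are assumptions made for this example; the only outside fact relied on is that RFC 2045 treats mechanism names as case-insensitive.

```python
# The five mechanisms named explicitly in the ABNF.
STANDARD_MECHANISMS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def classify_mechanism(header_line: str) -> tuple:
    """Return (mechanism, kind) for a Content-Transfer-Encoding header line.

    Hypothetical helper: kind is "standard", "x-token", or "unrecognized".
    """
    name, sep, value = header_line.partition(":")
    if not sep or name.strip().lower() != "content-transfer-encoding":
        raise ValueError("not a Content-Transfer-Encoding header")
    mechanism = value.strip().lower()  # mechanism names are case-insensitive
    if mechanism in STANDARD_MECHANISMS:
        kind = "standard"
    elif mechanism.startswith("x-"):
        kind = "x-token"  # privately specified, not recommended
    else:
        kind = "unrecognized"  # could only be valid as a registered ietf-token
    return mechanism, kind


print(classify_mechanism("Content-Transfer-Encoding: Base64"))
print(classify_mechanism("Content-Transfer-Encoding: x-my-scheme"))
```

A receiver that meets an x-token or unrecognized value has no way to reverse the transformation, which is one practical reason the set of mechanisms is kept small.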
Other MIME Header Fields

Any header starting with the string Content- indicates the header is MIME-related. The only MIME header field that is absolutely necessary is the MIME-Version header. Without it, recipients of the message entity won't know that the message is MIME-compliant. Though not required, the Content-Type header field is optional only so long as the type/subtype of the message contents is text/plain. For anything more interesting, Content-Type is, in effect, a required header field. Likewise, the Content-Transfer-Encoding header field may not be required, but it is only optional as long as the content of the entity is untransformed 7bit data. Two other MIME headers are defined in RFC 2045, both optional except in certain cases. The RFC also defines the syntax for creation of MIME headers. These header fields are described next.
Content-ID Header Field

The Content-ID header field parallels almost exactly the Message-ID header field described for RFC 822/MFS standard messages. The ABNF syntax for this header field is simply this:
id := "Content-ID" ":" msg-id
The msg-id element must be globally unique to allow a MIME message to reference some other MIME entity. Though normally optional, the Content-ID header field is mandatory for message/external-body MIME entities. These entities must refer to an entity referenced by the Content-ID message ID. We return to this header field when discussing MIME types and subtypes, as it has special meaning for multipart/alternative MIME entities.
Content-Description Header Field

As mentioned above, the Content-Description header field is intended to serve as a holder for an optional note or caption, describing the MIME entity content. The ABNF format for this header is as follows:
description := "Content-Description" ":" *text
In other words, this header consists of the string Content-Description: followed by any amount of text. The header itself may be entirely empty, or it may consist of any US-ASCII characters (or, like any other header, it may consist of non-US-ASCII characters using the mechanism described in RFC 2047, discussed later in this chapter). This header field is always optional.
Special Headers (Content-)

In keeping with the intention to make the MIME specification fully extensible, RFC 2045 defines a mechanism for adding new MIME headers to be specified in the future. There are only two restrictions on these header fields: They must be RFC 822/MFS standard headers, and they must begin with the string Content- to distinguish them from other types of headers. The ABNF syntax for these special headers is this:
MIME-extension-field := <Any RFC 822 header field which begins with the string "Content-">
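The header fields described in this section can be seen together in a short Python sketch built on the standard library's email package. The message text, the description, and the example.com domain are placeholders chosen for this example.

```python
from email.message import EmailMessage
from email.utils import make_msgid

msg = EmailMessage()
# set_content() supplies Content-Type, Content-Transfer-Encoding,
# and MIME-Version for a plain-text body.
msg.set_content("Quarterly figures attached as plain text.")

# Content-ID uses the same msg-id form as Message-ID, so the standard
# library's message-ID generator can supply a globally unique value.
msg["Content-ID"] = make_msgid(domain="example.com")
msg["Content-Description"] = "Summary of quarterly figures"

# Every header beginning with "Content-" is MIME-related;
# MIME-Version is the one MIME header that does not use the prefix.
mime_fields = [name for name in msg.keys() if name.lower().startswith("content-")]

print(mime_fields)
print(msg["MIME-Version"])
```

Filtering on the Content- prefix is the same test a MIME implementation could use to pick out extension fields it does not otherwise recognize.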
MIME Content Types/Subtypes

MIME content is characterized by MIME types and subtypes; subtypes may also be modified by parameters depending on the specification of the subtype. As mentioned previously, there are five different discrete types and two composite types. These top-level media type categories are used to make gross decisions about how to deal with the MIME content as well as to classify different kinds of data using different subtypes and parameters. For example, a MIME-enabled client can determine how to display MIME content based on the type. If the type is audio data of some unknown subtype (that the client is not able to turn into audio output), the client would not display the bits of the content because they would have no relevance. However, if the content is of some unknown subtype but of the text type, the client might determine that it should display the MIME content because it is text and it could be human readable even without any special software available to interpret the data. Top-level media types are defined by a set of five pieces of information, as discussed in RFC 2046. These criteria include the following:
A type name and description. The description would also include its own set of criteria that must be met by content in order to be classified within the type. For example, to qualify as content of type image, content must contain bits that, when interpreted by some program, are displayed as an image.
Specifications for parameters used by all subtypes of the type. This includes the names of the parameters, valid values, and indication of whether the parameters are required or optional. For example, the text type requires a parameter indicating the character set (charset) in all cases that the content does not use the US-ASCII character set. When the parameter is not explicitly stated, its value defaults to US-ASCII.
Instructions for user agents and gateway software concerning how the content should be treated if an unknown subtype must be processed. In other words, if a MIME entity containing an unknown subtype of video data is received by a MIME email client, the data should be treated as a sequence of octets that feed an application (application/octet-stream).
General guidelines for what to do when this type of entity appears at a gateway, if the gateway does not recognize the subtype.
Discussion of restrictions on content-transfer-encodings for the top-level media type.
The discrete media types are defined for individual entities. If a MIME entity consists of more than one entity, it should be one of the two composite top-level media types. These media types make possible some of the more interesting MIME applications and are discussed in more detail. First, we provide an overview of the six basic MIME top-level media types, then discuss the composite media types. Rather than attempt to classify all subtypes, we simply mention some subtypes as examples. New subtypes are being defined all the time, and older ones may or may not have achieved any degree of pervasiveness. The most authoritative and comprehensive resource I've found so far for identifying current media types is available through the IANA Web site. The URL is:
www.iana.org/numbers.html
This page provides pointers to assignment services and protocol numbers. This resource may change, or a new resource may be made available through ICANN in the near future. As of mid-1999, the link to MIME media types pointed to this URL:
http://www.isi.edu/in-notes/iana/assignments/media-types/media-types
MIME subtypes can take the form of dot-delimited names; in particular, the prefix vnd. may be prepended to a MIME subtype name to indicate that it represents a subtype being defined by a vendor. This allows vendors to create subtypes that accommodate their own proprietary file formats. However, in practice, some vendor-specific subtypes are defined without this prefix. In particular, a MIME subtype known as application/msword identifies its body as a Microsoft Word formatted word processing file. Similarly, Adobe has defined a subtype (application/pdf) for its proprietary Portable Document Format (PDF) used for displaying and printing documents.
Discrete Top-Level Media Types

A composite MIME type entity can contain more than one discrete MIME type entity. The discrete media types are described next.
Text
The text top-level media type contains (what else?) text. More specifically, any MIME entity of this type should consist of textual data that can easily be interpreted as text without having to resort to any sort of application to decode the contents. For example, an all-ASCII text file can be included in a MIME text/plain entity. A word processing file containing all ASCII text plus minimal formatting codes, also expressed in ASCII characters, may also be appropriate for this type. However, some content may consist entirely of US-ASCII characters but may not be appropriate for the MIME type of text. A PostScript file can contain all-ASCII data but is usually not profitably interpreted as a pure-text piece of content; the control codes often obscure any content.
All text subtypes can specify the parameter charset to indicate a particular character set used in the content. If the charset parameter is not specified, the default is US-ASCII, which is considered the lowest common denominator for plain text. Other valid values for this parameter include the character sets defined by the International Organization for Standardization (ISO) that can be specified as ISO-8859-X, where X is a value from 1 to 10. These values specify different eight-bit character representations for different languages and alphabet types.
The most basic text subtype is the text/plain subtype. Other subtypes include the text/richtext subtype (defined in RFC 1341) and text/enriched subtype (defined in RFC 1896). The text/richtext type, defined in an earlier version of the MIME specification, simply outlines a series of tags incorporated into a text document (set off by angle brackets) to indicate formatting of the tagged text. The text/richtext subtype is now considered obsolete, but the text/enriched subtype defined in RFC 1896 provides a more fully featured framework in which to share formatted text. Even so, the text/enriched subtype was originally envisioned as a mere stopgap format rather than a long-term solution.
MIME software that encounters an unknown text subtype should simply treat the content as if it were text/plain content. This makes the content available to the end user in a way that should be accessible no matter what kind of client software is being used. However, if the unknown text subtype contains data intended for some particular kind of application, it will be accessible to that application only if the application is resident on the end user's system and set up to accept MIME data of that type.
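The charset default can be checked with a small Python sketch that uses the standard library's email parser. The function name text_charset and the two sample entities are assumptions made for this example.

```python
from email import message_from_string


def text_charset(raw_entity: str) -> str:
    """Return the charset of a text entity, falling back to US-ASCII.

    Hypothetical helper: the fallback value is supplied by this sketch
    to mirror the default described in the text.
    """
    entity = message_from_string(raw_entity)
    # get_content_charset() returns the lowercased charset parameter,
    # or the failobj value when no charset parameter is present.
    return entity.get_content_charset(failobj="us-ascii")


no_param = text_charset("Content-Type: text/plain\n\nhello")
latin1 = text_charset('Content-Type: text/plain; charset="ISO-8859-1"\n\nhello')

print(no_param, latin1)
```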
Image
The image media type is used for any kind of MIME content that can contain an image. Initially, two subtypes were defined for JPEG and GIF format images, but any image data format can be used as the basis of a MIME image subtype. Other image subtypes include TIFF (Tag Image File Format), CGM (Computer Graphics Metafile), and others, both vendor-proprietary and open-standard formats. Applications recognize any MIME entities of the image type as images and may also be programmed to display image subtypes that they recognize; that is, an application can be configured to recognize TIFF, GIF, and JPEG image subtypes and open the appropriate graphics viewer for each of those kinds of MIME bodies. The application may also be configured to recognize that a MIME image body is an image, even if it does not recognize the specific image subtype, and pass the MIME body to a general graphics viewer.
Minimally, MIME implementations should treat unrecognized image subtypes as application/octet-stream types and provide the user with options pertaining to such a type (we'll talk about that more later).
Audio
The audio media type can be used for any type of data that contains sounds. At present, there are no strong contenders for interoperable and open sound file formats (though there are a handful of strong proprietary solutions contending for market share). The generic-sounding audio/basic is defined in RFC 2046 in this way: "The content of the audio/basic subtype is single channel audio encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz." At the moment, there are only a handful of defined audio subtypes, though an audio/mpeg subtype, for MPEG audio recording, is a work in progress as of the first half of 1999. As with unknown image subtypes, the MIME implementation may elect to pass unknown audio subtypes to a general-purpose audio playback application or, at the least, to treat the unknown subtype as application/octet-stream.
Video
The video media type can be used to indicate that the body contains any time-varying picture image, possibly with color and coordinated sound. In other words, anything that moves could be considered video, including animation if it is properly encoded. Sound is not required but may be a part of the content. MPEG and QuickTime video subtypes have been defined (according to the IANA listing). As with audio and image content, the MIME implementation may feed unknown video subtypes to a general-purpose video viewer, but should at least treat such content as application/octet-stream. Although it discourages the mixing of multiple media in a single body, RFC 2046 recognizes that video content often includes a representation of audio that is synchronized with the visual component of the video and makes an exception for this media type.
Application
In a way, the image, video, and audio top-level media types are exceptions or special applications rather than primary examples of MIME content. After all, these are types of data that are used by applications (by graphics viewers, video viewers, and audio players). The application top-level media type is used to define all other types of data that can be imagined. Word processing documents, spreadsheet files, file transfer, encryption and commercial transactions, presentation files, compressed data, and many other types of application data are represented by the application MIME type. The application type is for any kind of data that isn't text, image, video, or audio, but that can be interpreted by a program and cannot easily be interpreted by a person. That application data may even be in the form of an active language, with content bodies that consist of programs to be run on the recipient's systems.
RFC 2046 defines two subtypes, the application/octet-stream and the application/postscript subtypes. The octet-stream subtype is most important, as MIME implementations must, at a minimum, treat unknown audio, video, and image subtypes as if they were application/octet-stream bodies, and unknown application subtypes are also treated in this way. The application/postscript subtype is discussed briefly below. Other current application subtypes, which include popular word processing, spreadsheet, desktop publishing, personal information manager, electronic data interchange (EDI), and many others, are listed at the media type Web document cited previously.
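The fallback rules for unknown subtypes can be collected in one place. The Python sketch below is an illustration only: the function name fallback_type and the KNOWN set are assumptions, and the multipart rule it includes is the one stated later in this chapter under multipart/mixed.

```python
def fallback_type(content_type: str, known: set) -> str:
    """Return the type/subtype a client should act on.

    Hypothetical helper: `known` is the set of type/subtype strings the
    client can handle natively.
    """
    content_type = content_type.strip().lower()
    if content_type in known:
        return content_type
    maintype = content_type.partition("/")[0]
    if maintype == "text":
        return "text/plain"  # unknown text is shown as plain text
    if maintype == "multipart":
        return "multipart/mixed"  # unknown multiparts are treated as mixed
    # Unknown image, audio, video, and application subtypes.
    return "application/octet-stream"


KNOWN = {"text/plain", "image/jpeg", "multipart/mixed"}

print(fallback_type("text/x-unknown", KNOWN))
print(fallback_type("video/x-unknown", KNOWN))
print(fallback_type("image/jpeg", KNOWN))
```

A client may of course do better than the minimum, for example by handing an unknown image subtype to a general graphics viewer; the sketch shows only the floor.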
Application/Octet-Stream
This application subtype may contain binary data of indeterminate format, organization, and content. Unlike most of the application subtypes, which are more or less specific about the application in which the body content was created, the octet-stream subtype is deliberately vague. When a MIME entity is identified as the application/octet-stream subtype, it says nothing about the contents other than that they may be binary data. As defined in RFC 2046, there are two optional parameters for this subtype, TYPE and PADDING:
TYPE. The type parameter identifies a general type or category of binary data. This parameter is intended to be informational for the end user, rather than to identify the application to the MIME implementation or to be used for any kind of automatic processing of the MIME body.
PADDING. The padding parameter indicates how many (if any) padding bits were added at the end of the bit stream to create a body that contains a multiple of 8 bits. (By definition, the octet-stream subtype consists of a discrete number of 8-bit octets, so if the bit stream that is contained in the MIME body contains a number of bits that is not evenly divisible by eight, anywhere from one to seven bits may have to be appended to the end of the stream.)
The recommended action for MIME implementations encountering an application/octet-stream body is to offer the user the option to save the body as a file or, possibly, to allow the user to specify an application to use to open the body. For security reasons, the RFC recommends against implementations attempting to automatically open the body, even if an interpreter parameter is included with the Content-Type header. Potentially damaging programs (in the form both of standalone code as well as in automatically invoked macros incorporated into application files) could be run as a result.
Application/PostScript
Adobe's PostScript document description language is used to describe sometimes complex documents. Rather than simply adding tags to specify text and document formatting, the PostScript language includes some very powerful programming features that allow, among other things, the ability to display a PostScript document interactively on a monitor as well as the ability to print out a copy of a document. The printing functions are not considered harmful (or potentially harmful), but some of the operations that may be used for interactive display of PostScript documents are. Interactive PostScript provides a mechanism by which inadvertently or purposefully harmful code could be executed on a message recipient's system. In other words, automatically displaying PostScript entities could result in a buggy PostScript program or a PostScript virus being automatically executed. For more details about how PostScript MIME bodies pose potential problems, see RFC 2046.
Model
The model MIME type was defined in RFC 2077, "The Model Primary Content Type for Multipurpose Internet Mail Extensions," which is a proposed standard. This type was added to the five defined in RFC 2046 in January 1997, to accommodate modeling data. The RFC defines a model as "an electronically exchangeable behavioral or physical representation within a given domain." Examples include CAD models, Virtual Reality Modeling Language (VRML) data types, and simulation models.
Composite Top-Level Media Types

MIME starts to get interesting with composite top-level media types. With composite media types, you can start to put together more than one MIME entity in a single package. There are two composite media types: the multipart media type and the message media type. The multipart or message media type identifies a MIME entity that contains more than one MIME entity. The way it works is that the entity is identified in its own header as a multipart or message. The body of that entity contains one or more body parts, which are each set off by a boundary delimiter. In other words, the message header section includes a Content-Type header that identifies it as a composite. In this, it looks like any other RFC 822/MFS message. The Content-Type header includes a parameter that identifies the body delimiter. Anything that occurs after the message's header section appears to be an RFC 822 message body. In fact, there may be non-MIME content that precedes the first body delimiter. This is called a preamble and can be treated as if it were an RFC 822/MFS message body.
Every boundary delimiter line begins with two hyphens just before the boundary string; the last boundary also ends with two hyphens after the boundary string. Another set of MIME headers, which identify the MIME entity that follows the boundary, appears after each boundary delimiter. After the last MIME entity that makes up the composite body, another instance of the boundary is appended, and that is the end of the MIME content. (An epilogue of text can appear after the last boundary, but it is ignored by MIME implementations.)
Multipart subtypes are used any time one or more MIME entities must be incorporated into a single entity, for example, when a user wants to send a basic text message along with a photograph or two. Message subtypes are defined when an entire message is to be encapsulated within another message. This is a different operation from attaching other kinds of MIME entities, and there are some subtle issues, which are discussed next.
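To make the layout concrete, the following Python sketch writes out a small multipart entity by hand and parses it with the standard library's email package. The boundary string frontier, the message text, and the truncated base64 data are placeholders invented for this example.

```python
from email import message_from_string

# A hand-written multipart entity. Note the "--" before every boundary
# line and the extra "--" after the final one.
RAW = "\r\n".join([
    "MIME-Version: 1.0",
    'Content-Type: multipart/mixed; boundary="frontier"',
    "",
    "This is the preamble; MIME-aware readers ignore it.",
    "--frontier",
    "Content-Type: text/plain",
    "",
    "Here is the photo.",
    "--frontier",
    "Content-Type: image/jpeg",
    "Content-Transfer-Encoding: base64",
    "",
    "/9j/4AAQ",  # placeholder data, not a complete image
    "--frontier--",
    "This is the epilogue; it is also ignored.",
    "",
])

msg = message_from_string(RAW)
part_types = [part.get_content_type() for part in msg.get_payload()]

print(msg.get_boundary())
print(part_types)
print(msg.preamble)
```

The parser keeps the preamble and epilogue available as attributes, but neither is counted among the body parts.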
Multipart Composite Media Types

Any time you want to incorporate one or more MIME entities within another MIME entity, you need to use the multipart media type. As mentioned, the Content-Type: header indicates that the MIME entity is a multipart, along with a subtype. This header requires a parameter indicating the boundary delimiter. The boundary delimiter must consist of as few as 1 and as many as 70 characters, with two limitations on the selection of characters. First, the boundary may not terminate on a white space character. If a boundary seems to be ending in white space, the presumption is that the white space was added in error by a gateway and should be ignored. The second limitation is that the characters of the boundary must be selected from a subset of ASCII characters that are known to be able to survive unmodified even when transported across different messaging gateways. Uppercase and lowercase letters, digits, and some punctuation and symbols are permitted, but no other characters. The ABNF syntax for the boundary parameter is listed below, along with an overall syntax for multipart entities (the syntax is adapted from RFC 2046, but it has been reorganized for clarity).
ABNF Syntax for Boundary Parameter:
boundary := 0*69<bchars> bcharsnospace
bchars := bcharsnospace / " "
bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," / "-" / "." / "/" / ":" / "=" / "?"
ABNF Syntax for Multipart Entity:
multipart-body := [preamble CRLF]
                  dash-boundary transport-padding CRLF
                  body-part *encapsulation
                  close-delimiter transport-padding
                  [CRLF epilogue]
dash-boundary := "--" boundary
                  ; boundary taken from the value of
                  ; boundary parameter of the
                  ; Content-Type field.
transport-padding := *LWSP-char
                  ; Composers MUST NOT generate
                  ; non-zero length transport
                  ; padding, but receivers MUST
                  ; be able to handle padding
                  ; added by message transports.
body-part := MIME-part-headers [CRLF *OCTET]
                  ; a body-part is a
                  ; message as defined in RFC 822.
                  ; Lines in a body-part must not start
                  ; with the specified dash-boundary and
                  ; the delimiter must not appear anywhere
                  ; in the body part. Note that the
                  ; semantics of a body-part differ from
                  ; the semantics of a message, as
                  ; described in the text [RFC 2046].
encapsulation := delimiter transport-padding CRLF body-part
close-delimiter := delimiter "--"
delimiter := CRLF dash-boundary
preamble := discard-text
epilogue := discard-text
discard-text := *(*text CRLF) *text
                  ; May be ignored or discarded.
OCTET := <any 0-255 octet value>
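The boundary rules translate directly into a regular expression. The Python sketch below is one way to check a candidate boundary against the boundary, bchars, and bcharsnospace productions; the names BOUNDARY_RE and is_valid_boundary are assumptions made for this example.

```python
import re

# bcharsnospace: digits, letters, and the listed punctuation.
# bchars: the same set plus the space character.
# boundary: up to 69 bchars followed by one bcharsnospace (1 to 70 in all).
BOUNDARY_RE = re.compile(
    r"[0-9A-Za-z'()+_,\-./:=? ]{0,69}[0-9A-Za-z'()+_,\-./:=?]"
)


def is_valid_boundary(candidate: str) -> bool:
    """Return True if the candidate matches the boundary syntax."""
    return BOUNDARY_RE.fullmatch(candidate) is not None


print(is_valid_boundary("frontier"))
print(is_valid_boundary("ends with space "))
print(is_valid_boundary("x" * 71))
```

The check covers syntax only. A composer must still pick a boundary that does not occur anywhere in the body parts it separates.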
While examining the multipart syntax, note that each body part is introduced by two hyphens (--) followed by the boundary value that was defined in the Content-Type: header's boundary parameter. The multipart body itself may begin with an optional preamble, which is ignored by the MIME implementation (but which may be useful for the end user receiving the message), followed by the first boundary line, any transport padding (white space added to the message as it passed through the messaging transport medium), and at least one body part (MIME body), followed by any number of additional encapsulated body parts. The additional encapsulated body parts are indicated by the encapsulation element, each of which consists of a CRLF, two hyphens, another instance of the boundary, and another body part. The multipart body is terminated by a carriage return/line feed followed by two hyphens, the boundary string, and two more hyphens. More transport padding may be present at the end, as well as an optional CRLF and epilogue. The epilogue is ignored by the MIME implementation, but it may contain information for the recipient. Finally, note that the multipart does not necessarily have to have more than one part. Four multipart subtypes, defined in RFC 2046, are described next.
Multipart/Mixed
The multipart/mixed subtype is the basic subtype. It consists of one or more bodies, and those bodies may be of any type or subtype. The bodies are independent of one another, with no implicit relation between or among them beyond the order in which they are bundled. MIME implementations are supposed to treat unknown multipart subtypes as multipart/mixed. Bodies of a multipart/mixed subtype are not required to have a Content-Type: header (or any other header), but they are assumed to be text/plain unless the Content-Type: header is present.
Multipart/Alternative
The multipart/alternative subtype is syntactically identical to the multipart/mixed subtype, but semantically different. You can't tell one from the other except by examining the contents of the bodies. This subtype is used to present two or more alternative representations of the same content. The content bodies are ordered from least faithful to the original content to most faithful to the original content. There is a presumed inverse relationship between faithfulness to original content and interoperability of the content. In other words, the least faithful rendering of the content will likely be the rendering most accessible to most users on most systems. Put another way, the more faithful the rendering, the more restrictive the representation. In practice, the least restrictive rendering of content is often 7bit ASCII text, the most restrictive an application document. With the multipart/alternative MIME entity, bodies might include a text file, a rich text format (.rtf) file, a Microsoft Word file, and a FrameMaker publishing file, in that order.
MIME implementations scan through the multipart/alternative bodies. Each body is described by a discrete type and subtype; the system doing the reading then compares the subtypes with subtypes it supports. When the system hits a subtype that it does not support, it backs off one step and uses the last supported subtype to open the body for the recipient. The multipart/alternative subtype may be used by messaging clients that generate both HTML and text-only versions of messages. Recipients get both versions, but their MIME-aware clients display only the HTML version (assuming they are capable of rendering that content). Non-MIME-aware clients, on the other hand, simply display the whole mess(age), with all headers and bodies rendered in 7bit text.
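The scan described above can be sketched in a few lines of Python. The function name choose_alternative and the sample type lists are assumptions made for this example, which follows the description literally by stopping at the first unsupported subtype.

```python
def choose_alternative(part_types, supported):
    """Pick the body to display from a multipart/alternative entity.

    part_types is ordered from least faithful to most faithful.
    Hypothetical helper: returns None if even the first part is unsupported.
    """
    chosen = None
    for content_type in part_types:
        if content_type not in supported:
            break  # back off to the last supported part
        chosen = content_type
    return chosen


parts = ["text/plain", "text/html", "application/msword"]

print(choose_alternative(parts, {"text/plain", "text/html"}))
print(choose_alternative(parts, {"text/plain"}))
```

A common variant keeps scanning past unsupported entries and takes the last part the client supports anywhere in the list; the two approaches differ only when a supported type follows an unsupported one.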
Multipart/Digest
The rarely used multipart/digest subtype is syntactically identical to the multipart/mixed subtype but semantically different. You can't tell one from the other except by examining the contents of the bodies. This subtype is used to present a sequence of RFC 822/MFS messages in a particular order. Unlike the multipart/mixed, the content type of multipart/digest entity bodies is assumed to be message/rfc822 (see below for more about message types). It may be possible to include bodies in a multipart/digest entity that are not messages themselves, but it is not desirable. This subtype is specifically defined to be used for messages only. Digests can be used to encapsulate entire sequences of messages, such as those generated by mailing lists. This is done to allow mailing list recipients the option of receiving all messages from a particular period all at once instead of receiving each message individually. The digest format allows readers to read all messages at once and delete them all at once.
Multipart/Parallel
The rarely used multipart/parallel subtype is syntactically identical to the multipart/mixed subtype but semantically different. You can't tell one from the other except by examining the contents of the bodies. This subtype is used to present all the encapsulated bodies simultaneously (if the recipient's system is capable of doing so).
Multipart Encoding and Composite Nesting

Multipart MIME entities must not be encoded other than as 7bit, 8bit, or binary. In practice, this means that, for Internet messaging, multipart entities must use 7bit encoding. This is to ensure that all boundaries between the component entities are recognizable and to prevent the possibility of a component entity being encoded more than once. The parts that make up a multipart entity may be encoded (that is, may be quoted-printable or base64) as long as they are discrete subtypes. If the composite multipart MIME entity itself could be encoded
133
MIM E AN D MAI LI NG WEB PAGES

RFC 2557, MIME Encapsulation of Aggregate Documents, such as HTML (MHTML), describes how MIME can be used to encapsulate and transmit an entire Web page (HTML document) by email. HTML pages are aggregate documents. They often consist of a root document and numerous related files that are loaded by a Web browser. Web pages can be difficult to send as attachments. One approach is to simply send a URL, directing the recipient to the original Web page. The problem with this method is that the recipient will be able to see only the current state of the Web page. If the sender wants to send a snapshot of the page as it appears at a precise moment, sending the URL is not enough. Another method is to identify the HTML file, download it, and encapsulate in a MIME body and send it. The problem with this approach is that it does not capture the subsidiary files associated with the page. Files containing graphics are not a part of the Web pages root document, but are part of the Web pagethese will not be included with the root document file when it is incorporated into a MIME body. RFC 2557 tackles the problem of mailing Web pages in two ways. First, it defines the use of a MIME multipart/related structure to aggregate a text/html root resource and the subsidiary resources it references. And second, it specifies a MIME content-header (Content-Location) that allow URIs in a multipart/related text/html root body part to reference subsidiary resources in other body parts of the same multipart/related structure.
then boundaries between the components might be lost, defeating the purpose of the multipart type. The encoding issue raises another issue, that of nesting components. Nothing restricts the type of MIME body permitted within a multipart MIME entity, so you could, at least theoretically, have a multipart entity as one part of another multipart entity. No specific multipart subtypes explicitly do this, but the message composite media type may produce this type of nesting more often. A nested multipart entity must use a different boundary string than that used by the parent entity. In this way, MIME-aware implementations can identify all the different parts of the entity as well as which parts belong to the parent entity and which to the nested entity. Because the multipart entity cannot be encoded, it can be nested to (theoretically, anyway) any depth. By allowing nesting and restricting encoding, it is possible to distinguish between MIME entities that may be largely the same but that have subtle differences in meaning. For example, a multipart entity might consist of a text/plain body and two images. This entity consists of three MIME bodies, each separated by the boundary string and each containing its own Content-Type header (one text/plain and the other two image/jpeg, for example). That text/plain body might actually have been an RFC 822/MFS message being forwarded along with its own enclosed image and a new image attached by the person who is resending the message. The original message itself is a composite MIME entity that contained one of the images as an enclosure. The new multipart entity contains the text message as a multipart message type of its own, one that contains its own image enclosure. Properly setting boundaries and keeping it all unencoded means that the two different multipart entities can be easily differentiated.
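The nesting rule can be illustrated with a short sketch using Python's standard email package. The boundary strings and the message text are made up for the example; the only point is that the nested entity carries a boundary string different from its parent's, so a parser can tell which parts belong to which entity.

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_nested():
    """Return a multipart entity that contains another multipart entity."""
    # Inner (nested) entity with its own, hypothetical boundary string.
    inner = MIMEMultipart("mixed", boundary="inner-boundary-001")
    inner.attach(MIMEText("Original note.", "plain", "us-ascii"))

    # Outer (parent) entity; its boundary must differ from the inner one.
    outer = MIMEMultipart("mixed", boundary="outer-boundary-001")
    outer.attach(MIMEText("Forwarding this along.", "plain", "us-ascii"))
    outer.attach(inner)
    return outer


outer = build_nested()
inner = outer.get_payload()[1]          # second part of the parent entity
print(outer.get_boundary(), inner.get_boundary())
```

Printing outer.as_string() shows the inner boundary lines appearing entirely between two outer boundary lines.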
Message Composite Media Types

The message media types are defined to encapsulate messages. This is necessary because entities that have RFC 822/MFS headers may appear as if they are independent entities, rather than entities encapsulated in other MIME bodies. Message handling systems often insert, modify, or remove headers from normal RFC 822/MFS messages, but it is expressly forbidden to modify any message media types at all. RFC 2046 defines three message media types, which are described next.
Message/rfc822
Sometimes it is useful to be able to encapsulate one message within another message. For example, when forwarding a message to a new recipient, it often makes sense to encapsulate it within a new MIME entity. Likewise, when a message is rejected by a messaging gateway, the rejected message is often returned to its sender along with rejection information. It is beneficial to incorporate this type of message in another message. Rejected or forwarded messages are considered to be best handled in MIME multipart/mixed or some other subtype of multipart. Each such message is incorporated in a single MIME entity, which is bundled into the multipart/mixed entity. The message headers are thus kept inviolate from the depredations of the messaging transport infrastructure. The message/rfc822 body may not be encoded except as 7bit, 8bit, or binary (and for transport over the Internet, this turns out in practice to limit the body to US-ASCII). However, this restriction on encoding is placed only on the message/rfc822 body itself. The message body that is enclosed within the message/rfc822 could be encoded, but only if the message itself is a discrete entity. If it is also a composite MIME entity, it may not be encoded either, but its constituent parts might be. Unlike regular messages that use RFC 822/MFS headers, message/rfc822 bodies have slightly less restriction on which headers must be present. The only requirement is that at least one of the From:, Subject:, or Date: headers must be present.
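As a rough illustration of forwarding, the sketch below wraps an existing message in a message/rfc822 part and bundles it into a multipart/mixed entity, using Python's standard email package. The addresses, subjects, and text are invented for the example.

```python
from email.mime.message import MIMEMessage
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_forward():
    """Bundle an original message, untouched, inside a new message."""
    # The message being forwarded (hypothetical content).
    original = MIMEText("Lunch at noon?", "plain", "us-ascii")
    original["From"] = "alice@example.com"
    original["To"] = "bob@example.com"
    original["Subject"] = "Lunch"

    # The new message: a cover note plus the original as message/rfc822.
    forward = MIMEMultipart("mixed")
    forward["From"] = "bob@example.com"
    forward["To"] = "carol@example.com"
    forward["Subject"] = "Fwd: Lunch"
    forward.attach(MIMEText("FYI, see the enclosed message.", "plain", "us-ascii"))
    forward.attach(MIMEMessage(original))   # subtype defaults to rfc822
    return forward


forward = build_forward()
enclosed = forward.get_payload()[1]
print(enclosed.get_content_type())
```

The enclosed part keeps its own From:, To:, and Subject: headers, separate from those of the message that carries it.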
Message/Partial
The message/partial subtype allows large entities to be broken up into fragments and reassembled at their destination. Presence of this subtype indicates that the body contains part of a larger entity. This subtype is useful for environments in which message size is limited, for example, when some part of the message transport infrastructure forbids messages in excess of a particular size. To avoid problems with gateways between environments that support unencoded 8bit or binary message bodies and environments (like the Internet) where messages must be encoded as 7bit data, the content-transfer-encoding of any message/partial body must always be 7bit. Likewise, the payload body (the message being fragmented) must also not be encoded as 8bit or binary, though if the original message is binary, it could be encoded as base64 so it can be reliably transmitted across any message transport medium. Message/partial MIME entities use three parameters:
ID. This is a unique value identifying the message that is being fragmented. Its format is effectively the same as that used for the RFC 822/MFS header field message ID (see Chapter 8). This parameter is required for all message/partial bodies.
Number. This value indicates the order in which the fragments are to be reassembled. Each fragment is assigned a number, starting with 1 and ending with the total number of fragments created. This parameter is required for all message/partial bodies.
Total. This represents the total number of fragments created from the original entity. It is required only for the last fragment, though it may be included with all of the fragments.
Each fragment must be identified by its number, but only the last fragment must indicate how many fragments there are altogether. The fourth fragment in a series of seven could have the parameters number=4 and total=7, or it could have just the number parameter alone. The last of those fragments must have both the number=7 and total=7 parameters. When the fragments are reassembled, they always contain a complete MIME entity that can have its own Content-Type. Fragments may also be refragmented, if there is some reason to do so.
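The bookkeeping behind the ID, Number, and Total parameters can be sketched in a few lines of Python. This is a simplified illustration, not a conforming implementation: it splits on a fixed character count rather than on line boundaries, it puts the optional total parameter on every fragment, and the function names are invented for the example.

```python
def fragment(message_text, message_id, chunk_size):
    """Split message_text into numbered fragments of at most chunk_size."""
    if not message_text or chunk_size < 1:
        raise ValueError("need a non-empty message and a positive chunk size")
    chunks = [message_text[i:i + chunk_size]
              for i in range(0, len(message_text), chunk_size)]
    total = len(chunks)
    # Fragment numbers start at 1 and end at the total fragment count.
    return [{"id": message_id, "number": n, "total": total, "body": chunk}
            for n, chunk in enumerate(chunks, start=1)]


def content_type_header(frag):
    """Format the Content-Type header field for one fragment."""
    return ('Content-Type: message/partial; id="%s"; number=%d; total=%d'
            % (frag["id"], frag["number"], frag["total"]))


def reassemble(fragments):
    """Rebuild the original text from fragments received in any order."""
    ordered = sorted(fragments, key=lambda f: f["number"])
    if len({f["id"] for f in ordered}) != 1:
        raise ValueError("fragments belong to different messages")
    total = ordered[-1]["total"]      # the last fragment always carries total
    if [f["number"] for f in ordered] != list(range(1, total + 1)):
        raise ValueError("missing or duplicate fragments")
    return "".join(f["body"] for f in ordered)


frags = fragment("A" * 10 + "B" * 10 + "C" * 5, "abc@example.com", 10)
print(content_type_header(frags[-1]))
```

Sorting by number before joining means the fragments can arrive in any order, which is why the number parameter is mandatory on every fragment.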
Message/External-Body
The message/external-body subtype is used when the body of the MIME entity is not included with the headers. This type of MIME entity contains headers, but no body. The body is located somewhere else and only referenced by the MIME entity itself. Every message/external-body MIME entity can take one or more from a set of four parameters. One of these, the ACCESS-TYPE parameter, is absolutely required for all message/external-body entities. The access type indicates how the external body can be retrieved. Parameters that can be used for any message/external-body entity include the following:
ACCESS-TYPE. This parameter can be set to a single word that identifies how the external body is to be retrieved. Options defined in RFC 2046 include FTP (for file transfer protocol), TFTP (for trivial file transfer protocol), ANON-FTP (for anonymous FTP), LOCAL-FILE (for files stored on the local system), and MAIL-SERVER (for files that are available on a mail server).
EXPIRATION. This parameter is optional and is set to a valid date that serves as the expiration date. After this date, recipients cannot expect to find the external data.
SIZE. This optional parameter indicates the size, in bytes, of the external data.
PERMISSION. This optional parameter can be set to either READ or READ-WRITE and indicates whether the recipient is expected to write over the external data.
Each access-type value can have additional parameters associated with it. For example, with file transfer access types (FTP, ANON-FTP, TFTP), there are two mandatory parameters: NAME and SITE. NAME indicates the name of the file containing the external data, and SITE indicates the fully qualified domain name of the file server. Optional parameters for these access types include DIRECTORY, indicating the directory in which the external data file resides, and MODE, indicating the file transfer mode (see RFC 2046 and the standards for file transfer protocols for more information). See RFC 2046 for other parameters associated with access types.
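The parameter rules for the file transfer access types can be sketched as a small Python helper that refuses to build a header when NAME or SITE is missing. The function name, the host, and the file names are hypothetical, and the sketch checks only the file transfer access types described above; the other access types have their own parameters in RFC 2046.

```python
# Access types that, per the text above, require both NAME and SITE.
FILE_TRANSFER_TYPES = ("ftp", "anon-ftp", "tftp")


def external_body_header(access_type, **params):
    """Format a Content-Type header for a message/external-body entity."""
    access_type = access_type.lower()
    if access_type in FILE_TRANSFER_TYPES:
        missing = [p for p in ("name", "site") if p not in params]
        if missing:
            raise ValueError("missing mandatory parameter(s): "
                             + ", ".join(missing))
    parts = ["message/external-body", "access-type=%s" % access_type]
    # Emit the remaining parameters in a stable (alphabetical) order.
    parts += ['%s="%s"' % (key, value) for key, value in sorted(params.items())]
    return "Content-Type: " + "; ".join(parts)


header = external_body_header("ANON-FTP", name="report.ps",
                              site="ftp.example.com", directory="pub")
print(header)
```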
Additional MIME Standard Information

So far, we've focused on the MIME standards defined in RFC 2045 and RFC 2046, as these documents contain the blueprints for the MIME standard. However, three other RFCs address other MIME-related issues. RFC 2047 provides a mechanism for incorporating non-US-ASCII text in message headers. RFC 2048 explains the processes by which new standards for MIME subtypes and other MIME-related variables are registered. Finally, RFC 2049 proposes a standard for how a MIME implementation must behave if it is to be considered to conform to the MIME standard.
MIME Header Extensions

As we saw in Chapter 8, the RFC 822/MFS specification for message headers limits message header fields to the US-ASCII character set. This is fine as long as you are using US-ASCII, but it tends to shut out users of other, equally important, character sets. In particular, people using 8-bit character sets to represent characters with diacritical marks (accents, cedillas, umlauts, and so forth) often want to have their names spelled properly even though the letters are passed through systems that turn everything into US-ASCII characters. RFC 2047 defines a set of simple rules for representing non-US-ASCII text characters. However, while this proposed standard addresses the encoding of non-ASCII 8-bit characters in headers, it does not define a translation between 8-bit headers and pure ASCII headers.
Basically, the mechanism defines an element called an encoded-word, consisting of some boundary characters, a charset element to identify which character set is being used to encode the word, a type of encoding to be used, the text to be encoded, and a boundary terminator. The ABNF syntax, adapted from RFC 2047, is as follows:
encoded-word = "=?" charset "?" encoding "?" encoded-text "?="

charset      = token   ; any character set name that is allowed within the
                       ; MIME "charset" parameter of a "text/plain" body
                       ; part or any character set name registered with
                       ; IANA for use with the MIME text/plain content-type

encoding     = token   ; initial legal values are either "Q" or "B"
                       ; corresponding to quoted-printable and base64
                       ; encoding, as defined for Content-Transfer-Encoding

token        = 1*<Any CHAR except SPACE, CTLs, and especials>

especials    = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / <"> /
               "/" / "[" / "]" / "?" / "." / "="

encoded-text = 1*<Any printable ASCII character other than "?" or SPACE>
                       ; (but see "Use of encoded-words in message
                       ; headers", section 5)
To encode a non-US-ASCII word in an RFC 822/MFS header, you indicate the start of an encoded word with the characters =? followed by the name of a text character set (as defined earlier in this chapter), followed by another ? and an encoding of either Q or B, followed by another ? and the encoded-text, and terminated by the ?= characters. The encoding types parallel the encoding options discussed earlier in the context of the Content-Transfer-Encoding header field. The B encoding type indicates that the encoded text contains base64-encoded data, exactly as discussed earlier in this chapter. The Q encoding is similar to quoted-printable encoding. It provides a mechanism for representing 8-bit characters by preceding the hexadecimal value of those characters with the = character. It also allows 8-bit characters, where possible, to be represented as printable ASCII characters. There are restrictions. Though encoded words can be used in any header field or element defined as text, in comments, or as words within ABNF-defined phrases (with some restrictions on characters that appear in particular contexts), encoded words may not be used in the following places:
- Within any portion of an addr-spec element
- Inside a quoted-string element
- Within a Received: header field
- In a MIME Content-Type field or in any structured field body except within a phrase or comment
There are other limitations and restrictions as well. An encoded-word, including its delimiters, charset, encoding, and encoded text, may be no longer than 75 characters. To encode more text than that, implementers can use multiple encoded-words, each no longer than 75 characters, separated by a carriage return/line feed and a space. Thus, the maximum size of a line containing an encoded-word is 76 characters. Because RFC 822/MFS parsers are designed to recognize header atoms set off by white space, any white space in an encoded-word (for example, spaces separating text words) must itself be encoded. In the Q encoding, the ASCII space is written as =20, because 20 is its hexadecimal value. Thus, this form of the example taken from RFC 2047:
=?iso-8859-1?q?this is some text?=
would be incorrect. The spaces must be represented like this:

=?iso-8859-1?q?this=20is=20some=20text?=
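The encoded-word layout can be reproduced with a short Python sketch. The helper names are invented for the example, and the Q encoder is deliberately conservative: it passes through only a small set of characters and writes everything else, including spaces, in =XX form. It does not split long text into multiple encoded-words, so callers would still need to respect the 75-character limit themselves.

```python
import base64

# Bytes this sketch passes through unchanged; everything else
# (including SPACE, "?", "=", and "_") is written as "=XX".
SAFE = set(b"abcdefghijklmnopqrstuvwxyz"
           b"ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789!*+-/")


def q_encode_word(text, charset="iso-8859-1"):
    """Build an encoded-word using the Q encoding."""
    raw = text.encode(charset)
    body = "".join(chr(b) if b in SAFE else "=%02X" % b for b in raw)
    return "=?%s?q?%s?=" % (charset, body)


def b_encode_word(text, charset="iso-8859-1"):
    """Build an encoded-word using the B (base64) encoding."""
    raw = text.encode(charset)
    return "=?%s?b?%s?=" % (charset, base64.b64encode(raw).decode("ascii"))


def decode_word(word):
    """Decode a single encoded-word back into text."""
    if not (word.startswith("=?") and word.endswith("?=")):
        raise ValueError("not an encoded-word")
    charset, encoding, payload = word[2:-2].split("?", 2)
    if encoding.lower() == "b":
        raw = base64.b64decode(payload)
    elif encoding.lower() == "q":
        out = bytearray()
        i = 0
        while i < len(payload):
            if payload[i] == "=":            # "=XX" is one byte in hex
                out.append(int(payload[i + 1:i + 3], 16))
                i += 3
            else:                            # "_" also stands for a space
                out.append(0x20 if payload[i] == "_" else ord(payload[i]))
                i += 1
        raw = bytes(out)
    else:
        raise ValueError("unknown encoding: " + encoding)
    return raw.decode(charset)


print(q_encode_word("this is some text"))
# prints =?iso-8859-1?q?this=20is=20some=20text?=
```

Python's standard email.header module provides production-quality versions of these operations; the hand-written version here is only meant to make the structure of an encoded-word visible.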
MIME Facilities Registration

RFC 2048 describes the procedures for registering MIME facilities with the IANA. Although the precise steps are likely to change as the functions of the IANA are transferred to other bodies, this document is useful for identifying what kinds of MIME facilities may be registered and how easy or difficult the process is. Three facilities are identified as requiring registrations:
- Media types
- External body access types
- Content-transfer-encodings
Long ago, it was believed that all three of these facilities should be tightly controlled so that interoperability between messaging applications could be maintained: the more media types there were, the less likely any particular implementation was to support all of them. In practice, however, the number of media types (actually, subtypes) is growing, and this is a good thing. As more applications are found for MIME, the ability to encapsulate more and more different types of data within MIME bodies is a positive development. Interoperability may be reduced locally because not all MIME implementations will be able to properly process any particular MIME subtype, but global interoperability actually is increased because any MIME implementation will treat a MIME body appropriately and give the user the opportunity to add whatever is needed to interpret the body (or just save it for use elsewhere).
Media Type Registration

MIME media subtypes are organized in trees, similar to the way the Domain Name System (DNS) organizes domain names. Each root name identifies a tree of subtypes. There are four subtype trees defined in RFC 2048, though others may be added in the future if deemed necessary. The first four are:
IETF Tree. This tree includes subtypes that are considered to be of interest to the Internet community. These subtypes are owned by the IETF, subject to IETF development and IESG approval, and should be described by RFCs. Subtypes within the IETF tree do not have a named root and so usually are represented as single words rather than dot-delimited hierarchical names.
Vendor Tree. This tree is intended to contain all vendor-specific or proprietary MIME subtypes. For example, one subtree off the vendor tree contains MIME subtypes for bodies that contain Microsoft application files, and another subtree contains MIME subtypes for bodies that contain Lotus application files. This tree adds the vnd. prefix to all subtype names. The prefix may be followed by a dot-delimited vendor name (for example, application/vnd.ibm.modcap), or the vendor name may simply be part of the subtype name (for example, application/vnd.lotus-1-2-3). Some MIME types that should have used this tree were defined before the tree structure was defined, and thus flout the rule, as with application/wordperfect5.1 or application/msword.
Personal (Vanity) Tree. This tree is intended for noncommercial, experimental, or otherwise private MIME subtypes. There is no review process required to register these subtypes; simply notify the IANA (or other registry authority), and they are posted. However, registrants are urged to use the IETF forum to provide public exposure and review to the subtype.
X (Experimental) Tree. As with headers that begin with the characters X-, this tree is considered experimental and is not formally supported as part of standard MIME implementations except as any other nonstandard header is supported. This tree was included more for symmetry than for any useful purpose: almost all subtypes that might go in this tree would be more appropriately apportioned to either the personal or vendor trees.
You can register a new subtype by submitting a proposal containing specific information about the new subtype. The proposal must define a particular function beyond simply providing a transfer encoding, a new character set, or a collection of entities already defined as another type. In other words, your
proposed subtype must reference a MIME body that is not defined elsewhere and that feeds into a program that does more than convert the contents into a different character set or data encoding type. The proposal must also include a MIME type/subtype name that is unique and appropriate. A new MIME type describing a body that would contain an image should be an image type; a new MIME type describing audio content should be an audio type. In the event that a new subtype being proposed does not fit into any of the seven existing MIME types, the Internet standards process can be used to create a new type. Parameter requirements must be specified for IETF-tree types and should be specified for vendor and vanity types. All registered types must use one and only one data format; a specification of that format is required for IETF-tree types and should be included or at least referenced in the proposal. Vendor-tree proposals for proprietary data may simply reference the application software to be used to generate the MIME content. The proposal must also address issues of interoperability and security requirements, as well as indicate whether the type is intended for unlimited use or if there are any limitations on its use. Any other important aspects of the type should also be noted, for example, default file extensions or any other identifying attributes of the body content. IETF-tree types must be published as RFCs, meaning they must first be submitted as Internet-Drafts. Vendor and vanity types do not have to be submitted to the IETF, but publication as informational documents is encouraged to notify the community of the new type. Registration starts by filling out the template included in RFC 2048 and sending it to the ietf-types@iana.org mailing list. After the proposal is posted to the list, a two-week review period takes place, during which the community can comment on the appropriateness of the name or any other aspect of the submission.
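The tree naming conventions lend themselves to a simple prefix check, sketched below in Python. The function name is invented. The vnd. prefix comes from the discussion above; the prs. prefix for the personal tree and the x. and x- prefixes for the experimental tree are taken from RFC 2048 and are assumptions as far as this chapter's text is concerned.

```python
def registration_tree(media_type):
    """Guess the RFC 2048 registration tree from a type/subtype name."""
    _, _, subtype = media_type.partition("/")
    subtype = subtype.lower()
    if subtype.startswith("vnd."):
        return "vendor"
    if subtype.startswith("prs."):          # personal (vanity) tree
        return "personal"
    if subtype.startswith(("x-", "x.")):    # experimental tree
        return "experimental"
    return "ietf"                           # no named root


for name in ("application/vnd.ibm.modcap", "text/plain",
             "application/x-example", "application/msword"):
    print(name, "->", registration_tree(name))
```

A name-based check like this reports application/msword as an IETF-tree type, which is exactly the historical exception described above: the name predates the tree structure, so the prefix alone cannot reveal that it is a vendor type.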
External Body Access Type Registration

New external body access types are not expected to arise very frequently at all, but when and if they do, they must go through a more stringent process before they may be registered. First, the new access type must be given a unique and appropriate name. More important, all of the protocols, transports, and procedures of the access type must be described publicly and in sufficient detail to allow anyone to build an implementation of the access type. The access type must be published as an RFC, even if only an informational RFC. However, it is recommended that the access type go through the IETF standards process. Access type registration starts in the same way as subtype registration, but after the two-week review period, an access type reviewer, appointed by the IETF Applications Area director, reviews the access type. This person may forward the proposal to the IANA or reject it if significant objections have been raised. The URL external body access type is described in RFC 2017, but it has not yet been registered as a new type with the IANA (as of spring 1999). It is a standards-track specification, though, so it should eventually be granted full support.
Content-Transfer-Encoding Type Registration

Like access types, new content-transfer-encoding types are not registered lightly. It is conjectured that compression algorithms might be used to encode MIME bodies or that transport encodings could be defined to extend existing formats to MIME. However, even more than new access types, any new content-transfer-encoding scheme would present an added barrier to interoperability. The process of registering a new encoding type is defined merely to allow the possibility that a case could be made for a new one. The process requires that the new type be defined as a standards-track RFC, so any new transfer-encoding type would be subject to considerable scrutiny and review. No new content-transfer-encoding types have been approved since RFC 2048 was published in November 1996.
MIME Conformance
RFC 2049 defines the criteria for a MIME-conformant mail user agent, as well as guidelines for how messaging data should be sent over the Internet and how and when content should be encoded and when it should be left in its canonical form. Very briefly, a MIME-conformant mail user agent must, among other things, recognize content-transfer-encoding header fields and be able to decode any data received in the quoted-printable and base64 encodings. It must recognize and properly handle some basic MIME types. If it receives anything with an unrecognized transfer-encoding, type, or subtype, the conformant mail user agent must treat the content in an appropriate manner (that is, it shouldn't display binary data as if it were text, and it should offer the user the option of saving an unrecognized application body). Even if the agent is designed to support receipt of nonstandard MIME content, it must not send out any nonstandard content. Essentially, this document says, if you want to be MIME-conformant, you must play by the minimal set of rules laid down in the MIME specifications.
Other MIME Content Types/Subtypes

Table 9.1 lists some of the RFCs documenting MIME subtypes.
Reading List
You are urged to review the specification for the Augmented Backus-Naur Form (ABNF) documented in RFC 2234, as well as the Internet Message Format Standard specification (MFS) discussed in Chapter 8. The RFCs listed in Table 9.1 should be consulted if you need specific information about a MIME type. Likewise, the IANA MIME facilities registration documents are indispensable and located at:
http://www.iana.org/numbers.html
Finally, the reader is again pointed to the five seminal MIME specifications, RFC 2045 through RFC 2049, as mentioned throughout the chapter.
Table 9.1 Some MIME Content Type RFCs
RFC        TITLE
RFC 1641   Using Unicode with MIME
RFC 1740   MIME Encapsulation of Macintosh files - MacMIME
RFC 1741   MIME Content Type for BinHex Encoded Files
RFC 1767   MIME Encapsulation of EDI Objects
RFC 1896   The Text/Enriched MIME Content-type
RFC 1927   Suggested Additional MIME Types for Associating Documents
RFC 2015   MIME Security with Pretty Good Privacy (PGP)
RFC 2159   A MIME Body Part for FAX
RFC 2161   A MIME Body Part for ODA
RFC 2302   Tag Image File Format (TIFF) - Image/tiff MIME Sub-type Registration
RFC 2311   S/MIME Version 2 Message Specification
RFC 2387   The MIME Multipart/Related Content-type
RFC 2422   Toll Quality Voice - 32 kbit/s ADPCM MIME Sub-type Registration
RFC 2423   VPIM Voice Message MIME Sub-type Registration
RFC 2424   Content Duration MIME Header Definition
RFC 2425   A MIME Content-Type for Directory Information
RFC 2426   vCard MIME Directory Profile
RFC 2503   MIME Types for Use with the ISO ILL Protocol
RFC 2557   MIME Encapsulation of Aggregate Documents, such as HTML (MHTML)
CHAPTER
10
Simple Mail Transfer Protocol (SMTP)
Back in Chapter 1, Internet Email Standards, Figure 1.2 showed a very orderly message transfer universe with Mail User Agents (MUAs or UAs) linking up with Mail or Message Transfer Agents (MTAs). Each MTA, in turn, connects very nicely with other MTAs. Incoming and outgoing messages flow between the MTAs and the UAs, and messages are passed between the MTAs for forwarding. This, at least, is the theory. The practice of Internet message forwarding and routing, however, is different. A truer picture, which we paint in this chapter, shows a far more complex reality. Different SMTP MTAs behave in different ways, with some MTAs performing all SMTP functions and others performing only some. User agents aren't always distinguishable, functionally, from mail transfer agents. This chapter opens with a discussion of the scope and applicability of the Simple Mail Transfer Protocol (SMTP), including the kind of transport architecture SMTP uses, how mail user agents and mail transfer agents are supposed to behave, the functions SMTP clients and servers perform, and the roles SMTP clients and servers fulfill. This discussion concludes with an overview of SMTP functions and features. Next, we introduce the SMTP protocol itself. We begin with commands and replies. All commands are four-letter protocol verbs submitted by a requesting
system, and all replies are three-digit decimal response codes returned by the system accepting the requests. The SMTP vocabulary is terse but effective. In the next section, specific SMTP procedures are introduced with examples, from session initiation between SMTP client and server to the actual transfer of mail to verification of mailbox names and expansion of mailing lists to the close of the session. This section concludes with a discussion of other important protocol considerations, including troubleshooting and security issues. The chapter concludes with a discussion of the specialized use of SMTP for message submission. Although a user's email client may behave strictly as a mail user agent, that same piece of software often functions both as a mail user agent for accepting and displaying messages and as a mail transfer agent when the user wishes to submit a message for delivery. In this section, we discuss how such Message Submission Agents (MSAs) should behave, as defined by RFC 2476, Message Submission. One important aspect of SMTP that we do not discuss in this chapter is the matter of address resolution. Originally, RFC 821 left this issue alone; Internet messaging address resolution was long addressed by RFC 974, Mail Routing and the Domain System. RFC 821bis is the successor document to RFC 821, currently a work in progress. It incorporates a section on address resolution and name handling, but we leave address resolution out of this chapter for clarity and treat it in Chapter 13, SMTP Message Address Resolution.
Scope and Applicability

The objective of the Simple Mail Transfer Protocol, as defined in RFC 821 and in RFC 821bis, is to transfer mail in a manner that is reliable and efficient. Although SMTP mail is invariably electronic mail, the term used in most Internet standard documentation is mail without the modifier. Making the assumption that all SMTP mail is electronic is probably accurate but cannot be considered to always be safe: SMTP could as easily be applied to messages carried across any medium, ranging from optical (as in fiber-optic transmissions) to dead-trees media. Some of these transport media are more likely and more effective than others, but for now we continue to refer to SMTP messages as mail or Internet messages for clarity's sake. An important aspect of SMTP is its design to allow it to carry messages across what are known as transport service environments. A transport service environment is any medium through which processes can communicate; in other words, a network. Both an isolated TCP/IP intranet and the connected global Internet can be considered transport service environments, at least for any two processes running on hosts within those environments. Any system within the intranet can function as SMTP client or server, connecting to
another SMTP system within that intranet. If the intranet is truly isolated, with no links at all to the Internet, then those systems are inaccessible to any systems on the Internet. But if a system links both the intranet and the Internet, then mail can be exchanged across the network borders. If the linked networks all use the same protocols, the mail is relayed; if the system links a network that uses different protocols, the mail is gatewayed. All these functions are accounted for in the SMTP model.
The SMTP Model

Figure 10.1 shows a very simple rendering of the SMTP design. This figure is based on a figure from RFC 821bis, and it shows that SMTP is concerned only with what happens between the SMTP client and the SMTP server. How the mail gets to the SMTP client and what happens to it after it arrives at the SMTP server are of no concern to the SMTP implementations: they are interested only in how to set up the transfer of mail between two SMTP systems. SMTP defines how the client and server initiate a connection, how they transfer mail, and how they close the connection. How mail gets from a file system or directly from a user to an SMTP client is of no concern to SMTP. This is the job of the mail client software, or of some other mechanism, that is capable of placing the message into the SMTP transport environment. In practice, as we see later, the mail client software may use SMTP to submit the message to a fully functional SMTP system, but how the message gets to an SMTP system is irrelevant to the protocol itself. When an SMTP system receives a message, the SMTP entity can forward the message. Two sets of terms are generally used for SMTP systems sending or
[Figure 10.1 The SMTP model from RFC 821bis. A user and a file system feed the Sender-SMTP (the SMTP client), which exchanges SMTP commands/replies and mail with the Receiver-SMTP (the SMTP server) and its file system.]
receiving a message. The first, SMTP-sender and SMTP-receiver, were used in RFC 821 and are self-explanatory. The SMTP-sender is the system sending the message; the SMTP-receiver is the system receiving the message. Since RFC 821 was published in 1982, this usage has lost some of its currency and been replaced by the terms SMTP client and SMTP server. The SMTP client is the system sending the message, and the SMTP server is the system receiving the message. As RFC 821bis explains, the client/server usage may be more current but can be quite confusing inasmuch as an SMTP system often takes on both roles for the same message. An SMTP client opens the connection with the SMTP server and, if everything is in order, requests that a mail transaction be completed. Each mail transaction is actually a series of commands that specify the originator and destination of the mail, followed by the transmission of the message itself. When the transmission is complete, the client may request another mail transaction or may terminate the session. The client may also request other types of interaction, including verification of email addresses or retrieval of mailing list subscriber addresses. SMTP is a stateful protocol, meaning that each participant in a mail transaction keeps track of what's been happening. In contrast, the Hypertext Transfer Protocol (HTTP) is a stateless protocol, meaning that it provides no mechanism for a server to remember what just happened between the server and the client. By maintaining state, SMTP allows servers to keep a buffer and state table that can be reset by the client. The client can send different pieces of information, like a source, a destination, and the message itself, and be able to refer back to any of those parts later in the transaction because the server keeps track of those pieces.
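The idea of server-side state can be made concrete with a toy model in Python. The class name is invented, the command parsing is minimal, and no network is involved; the sketch only tracks the buffers a server keeps during a mail transaction and returns the kind of three-digit reply code a real server would send (250 for success, 354 to invite message data, 503 for a command out of sequence, 221 on closing, 500 for an unrecognized command).

```python
class ToySmtpSession:
    """Toy model of the state one SMTP server session remembers."""

    def __init__(self):
        self.reset()

    def reset(self):
        # Clear the originator and destination buffers.
        self.sender = None
        self.recipients = []

    def handle(self, line):
        """Process one command line and return a reply code."""
        verb, _, arg = line.partition(" ")
        verb = verb.upper()
        if verb == "MAIL":
            self.reset()                 # a new transaction starts clean
            self.sender = arg
            return 250
        if verb == "RCPT":
            if self.sender is None:      # no originator given yet
                return 503
            self.recipients.append(arg)
            return 250
        if verb == "DATA":
            if self.sender is None or not self.recipients:
                return 503               # need originator and destination
            return 354
        if verb == "RSET":
            self.reset()                 # client asks server to forget
            return 250
        if verb == "QUIT":
            return 221
        return 500


session = ToySmtpSession()
for command in ("MAIL FROM:<alice@example.com>",
                "RCPT TO:<bob@example.org>",
                "DATA"):
    print(command, "->", session.handle(command))
```

Because the server remembers the originator and recipients, the DATA command needs no arguments; after RSET, the same DATA command is rejected because that remembered state is gone.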
SMTP Server Roles

There are four roles an SMTP system can serve:
- An SMTP originator system submits mail to the message transport service environment. In other words, this type of system is the source, within the network (read the Internet), of the original message. This is the system that accepts messages for delivery from MUAs and sends them out onto the Internet.
- An SMTP delivery system accepts mail from the message transport service environment and hands it off to an MUA or stores it in a message store (this is usually a file system, though it could be a printer or even a monitor display) for access by the MUA or user.
- An SMTP relay system, generally called a relay, accepts messages from an SMTP client and passes them along to another SMTP server. The relay modifies the message only by adding its own trace information; otherwise, the message is left unchanged. A relay passes the message to another SMTP server, which may in turn be either another relay or a delivery system.
- An SMTP gateway system, generally called a gateway, accepts messages from an SMTP client system in one network and retransmits the message in another network, possibly using another messaging protocol.
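The one change a relay is allowed to make, adding trace information, can be sketched as a Python function that prepends a Received: header field and leaves the rest of the message untouched. The function name and host names are hypothetical, and the header shows only the general "from ... by ...; date" shape rather than every clause a production MTA would record.

```python
from email.utils import formatdate


def add_trace(raw_message, from_host, by_host, when=None):
    """Return raw_message with one Received: trace line prepended."""
    stamp = formatdate(when)   # RFC 822-style date; current time if None
    trace = "Received: from %s by %s; %s\r\n" % (from_host, by_host, stamp)
    return trace + raw_message


original = ("From: alice@example.com\r\n"
            "Subject: Hello\r\n"
            "\r\n"
            "Just saying hello.\r\n")
relayed = add_trace(original, "mail.example.com", "relay.example.net")
print(relayed)
```

Each relay along the path adds its own line at the top, so reading the Received: fields from bottom to top retraces the route the message took.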
These different functions help us paint a more complex picture than that portrayed in Figure 1.2 in Chapter 1. Figure 10.2 shows how the different SMTP roles can be fulfilled. Starting at the lower left corner, the originating user agent passes a message along to the SMTP originator system. This system does nothing more than act as a conduit from the user agent to the Internet (the transport service environment), though there is no reason that this particular SMTP system could not take on some other role at another time. The SMTP originator system forwards the message along to an SMTP relay system, shown on the inside of the Internet cloud in the figure. This is a relay
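[Figure 10.2 SMTP originator, delivery, relay, and gateway system roles. A user agent connects to an SMTP originator system, which passes mail to an SMTP relay system inside the Internet; from there one path leads through a second SMTP relay system to an SMTP delivery system and its user agent, and another leads to an SMTP gateway system serving a user agent in a closed network.]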
system because it receives messages as an SMTP server but it then immediately takes on the role of SMTP client to forward (or relay) the message along to another SMTP system acting as a server.
NOTE In theory, SMTP servers can relay mail from one server to another across the Internet. However, in practice, organizations that host SMTP services generally restrict SMTP sessions, accepting messages only from known hosts (for example, subscribers to an ISP's Internet service) and sending messages across the Internet only to the message's designated SMTP server. The reason for the restrictions is to cut down on the use of SMTP servers as relays for unsolicited email, also known as spam.
In Figure 10.2, the first relay system is shown linked to two other SMTP systems. Traveling upward and then to the right, the first SMTP relay system passes the message along to another SMTP relay system. Messages can be forwarded from one relay to another more than once; this is necessary for some network architectures. Following the link from this second SMTP relay, we see an SMTP delivery system. This system is linked to a destination user agent, and its role is to accept the message from the transport service environment (the Internet) and pass it along to its destination. This completes the transit for the message going through the two SMTP relay systems shown in the figure.
Look again at Figure 10.2. Going directly upward from the first SMTP relay system, you see a link to an SMTP gateway system. This system links a closed network to the Internet. As a gateway, this system can accept SMTP-based messages, transform them (if necessary), and resend them along the alternate transport service environment (the closed network).
The SMTP specification addresses how messages, as well as control and error information, are passed between any of the systems labeled SMTP in Figure 10.2. SMTP also specifies what changes can be made in the mail message content (headers) as it is moved along between these systems. SMTP doesn't really have anything to say about what happens between the SMTP gateway and the systems or protocols to which it provides a gateway, nor does it say very much about what happens between the SMTP delivery system and the user agent. Once the message makes its way to its destination mail store, other protocols can be used to manage what happens next; we discuss them in Chapter 11, "Post Office Protocol (POP)," and Chapter 12, "Internet Message Access Protocol (IMAP)."
SMTP Extensions
In the beginning, SMTP was a very simple protocol; with a first name like "Simple," you could hardly expect anything else. This meant that the handful of commands and responses would have to be made to serve in all possible situations. Despite the value of such an approach, at times implementers and consumers of messaging software determined that there were features that would be nice to have but that were not possible in the very simple SMTP. The answer was to develop a mechanism by which an SMTP server and client setting up a session could, optionally, indicate to each other that they supported extensions to SMTP. This concept of SMTP extensions was first published as a standards-track protocol in RFC 1425 in 1993 and updated most recently in RFC 1869 as part of STD-10, the Internet standard for SMTP.
Very briefly, the SMTP extensions specification created a new way to open the SMTP session, EHLO (see below), to replace the original RFC 821 HELO (hello) command. When a client gives a server the EHLO command, the server must respond with a list of SMTP extension commands that it is able to accept in addition to standard SMTP commands. If the server does not support SMTP extensions, the client can start over by sending a HELO command. RFC 1869 and RFC 821bis both give guidelines as to how servers and clients should handle the two session-opening commands. In general, modern SMTP servers should be able to accept the HELO command (from older SMTP implementations) as well as the EHLO command. Likewise, modern SMTP clients should always open the session with EHLO; only if the server responds with an error indicating it does not recognize that command should the client try again with HELO. We discuss some of the current SMTP extension commands later in this chapter and also provide an example of how SMTP extensions are negotiated during the session initiation.
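The EHLO-then-HELO fallback rule can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not code from any RFC: send_command is a hypothetical callable that sends one command line and returns the reply code together with the reply text lines, and the set of codes treated as "EHLO not recognized" is an assumption made for the example.

```python
# Reply codes this sketch treats as "EHLO not recognized" (an assumption).
EHLO_UNSUPPORTED = {500, 502, 504}


def open_session(send_command, client_name):
    """Open a session with EHLO, falling back to HELO for older servers.

    send_command is a hypothetical callable: it takes one command line
    (without CRLF) and returns (reply_code, list_of_reply_text_lines).
    Returns the command that succeeded and the extension keywords offered.
    """
    code, lines = send_command(f"EHLO {client_name}")
    if code == 250:
        # The first line is the greeting; each later line names an extension.
        extensions = [line.split()[0].upper() for line in lines[1:] if line.strip()]
        return "EHLO", extensions
    if code in EHLO_UNSUPPORTED:
        # The server does not know EHLO, so start over with plain HELO.
        code, lines = send_command(f"HELO {client_name}")
        if code == 250:
            return "HELO", []
    raise RuntimeError(f"server refused to open a session (reply {code})")
```

A client built this way never learns about extensions from a HELO-only server, so it would have to restrict itself to the basic commands for the rest of that session.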
The Simple Mail Transfer Protocol

SMTP interactions are defined by commands and replies. The four-letter SMTP commands are all terminated by a CRLF. If parameters are present, the command is followed by a space, then the parameters, and then the CRLF. For example, the HELO (HELLO) command is used to open a session and identify the client to the server. In effect, the client says, "Hi there. My name is Pete." Instead of saying "hi there," the SMTP client says HELO (or, as is more proper, EHLO), and instead of saying "My name is Pete," the client includes, as an argument to the command, its fully qualified domain name.
A reply is a three-digit value that represents some response by the server to the command. Response codes are commonly used for Internet applications, including FTP, HTTP, and others. As with jokes told by the members of the comedians' union to each other, SMTP servers use the numerical responses to encapsulate what may be more complex messages. For example, the value 250 represents the message "Requested mail action okay, completed," which basically just means "OK" and is often accompanied by additional information. Other replies may indicate that more information is needed or that, for some reason, the command failed due to some permanent or temporary condition.
A mail transaction can begin once the server and client have established their transmission channel. The mail transaction itself consists of the commands that are necessary to transmit the message. This means commands that specify who is to get the message, where the message is coming from, and the message content itself. Also part of the transaction are the responses from the server: If the server fails to reply appropriately to all the commands, the message transmission cannot be completed. SMTP commands and replies are discussed next.
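The wire format of commands and replies is simple enough to show directly. The following Python sketch builds a command line and splits a reply line into its code and text; the function names format_command and split_reply are hypothetical, chosen only for this illustration.

```python
CRLF = "\r\n"


def format_command(verb, *params):
    """Build one command line: the verb, a space and any parameters, then CRLF."""
    return " ".join((verb.upper(),) + params) + CRLF


def split_reply(line):
    """Split one reply line into its three-digit code and the text after it."""
    line = line.rstrip("\r\n")
    code, text = line[:3], line[4:]
    if len(code) != 3 or not code.isdigit():
        raise ValueError(f"not an SMTP reply line: {line!r}")
    return int(code), text
```

For instance, format_command("ehlo", "bar.com") yields the line a client named bar.com would send to open a session, and split_reply applied to "250 OK" separates the machine-readable code from the human-readable text.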
Commands
The basic SMTP commands specified in RFC 821 and 821bis are listed next. Rather than give complete protocol specifications here, we leave discussion of format and syntax issues for the "SMTP Procedures by Example" section below, where we use examples to illustrate proper use of the different commands.
EHLO (HELO). Originally, the first command of an SMTP session was HELO, which stands for "Hello" and identifies the SMTP client to the SMTP server. Extended Hello, or EHLO, is more proper for modern SMTP implementations. The EHLO command uses an argument field containing the fully qualified domain name of the SMTP client (if there is no fully qualified domain name, then some other identifier is desirable, especially if it can be used to identify the client). The EHLO command solicits a response from the server that includes information about which SMTP extensions the server supports. The HELO command simply solicits a response greeting along with the appropriate response code from the server.
MAIL. The MAIL command begins the mail transaction between client and server. The client sends the MAIL command to indicate that a message is to be sent. The MAIL command indicates a reverse-path for the message and may include parameters, depending on whether service extensions are in use. The reverse-path indicates the source mailbox of the message. The reverse-path may be null if the message is actually reporting a nondelivery situation (that is, if the message indicates that mail sent to a particular mailbox is bouncing). The MAIL command usually takes the form of the following:
MAIL FROM:<pete@loshin.com> mail-parameter1 parameter-2
The mail-parameters are optional and are separated from other elements of the command by a space.
RCPT. The RCPT command identifies a recipient and may also identify a route for that message, though RFC 821bis indicates that servers need not honor such routes. The RCPT command looks like this:
RCPT TO:<pete@loshin.com>
Each RCPT command identifies a single recipient for a message. If the same message is to be sent to more than one recipient, the RCPT command is issued more than once, one time for each separate recipient.
DATA. The DATA command indicates to the receiving SMTP system that the lines to follow are all part of the message data. As per RFC 822 and MFS (the Internet Message Format Standard), message data must be 7bit US-ASCII characters. All characters sent from the client to the server are treated as message data until the sequence of a CRLF, followed by the period (.), followed by another CRLF (in other words, any line that contains only the character "." and is terminated with a CRLF). That combination signals the end of the message data.
RSET. RSET is the reset command, which aborts the current transaction. RSET also causes the server to clear its buffers and state tables. The client may send a RSET command at any time; it takes no arguments.
VRFY. The VERIFY or VRFY command is used by the client to get the receiving system to confirm information about a user or mailbox. VRFY takes as an argument the user name or mailbox name. If the VRFY command has a user name as its argument, the response from the server should be the user name and the user's mailbox (assuming there is an unambiguous mailbox associated with that name). If the VRFY command has a mailbox as its argument, the response from the server should either affirm that the mailbox is valid or indicate that the mailbox is not known at that server. VRFY doesn't affect anything else that may be going on with a mail transaction and doesn't affect the contents of any buffers.
EXPN. The EXPAND or EXPN command is used to get a list of members of a mailing list. The sender sends this command, with an argument that is a list name. The recipient then responds with the members of the list or with an error reply if the list name is not valid. EXPN doesn't affect anything else that may be going on with a mail transaction and doesn't affect the contents of any buffers.
HELP. The client can send a HELP command to a server, which should respond with some helpful information. The HELP command may include another command name as an argument, in which case the server responds with some more information about that command. Exactly what kind of information the server provides is not specified. HELP doesn't affect anything else that may be going on with a mail transaction and doesn't affect the contents of any buffers.
NOOP. The NOOP (no operation) command doesn't do anything and doesn't affect any buffers. All it does is get the server to respond with an OK. This is the digital version of the conversational "Hey, are you there?" The NOOP does not take any arguments, and if any are present, the recipient is supposed to ignore them.
QUIT. The QUIT command is used by the sender to terminate a session. When a QUIT command is received, the recipient is required to send an OK and then close the channel.
We see how these commands work later, in the "SMTP Procedures by Example" section, where we take a look at actual example mail transaction exchanges. It should be noted that VRFY, EXPN, and HELP are optional for SMTP implementations, but RFC 821bis specifies that they should be implemented.
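Of the commands above, DATA is the only one whose end is marked by content rather than by a single CRLF. As a rough sketch of the receiving side, the hypothetical function below collects lines until it sees one that contains only a period. It deliberately omits the transparency (dot-stuffing) procedure that RFC 821 defines for body lines that begin with a period, so it illustrates the termination rule only.

```python
def read_message_data(lines):
    """Collect message lines until a line containing only "." is seen.

    lines is any iterable of text lines as received after the 354 reply.
    Returns the message lines without their CRLF and without the terminator.
    """
    body = []
    for line in lines:
        line = line.rstrip("\r\n")
        if line == ".":
            # The lone period marks the end of the message data.
            return body
        body.append(line)
    raise ValueError("data ended without the terminating '.' line")
```

Anything that arrives after the terminating line, such as a QUIT command, is left unread for the command parser.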
Replies
In general, the SMTP responses are three-digit numbers that may be accompanied by some text related to the response code. This approach to protocol reply codes has long been used and makes it possible for SMTP implementations to use the reply value as a machine-readable value to be used for some other action, as well as to provide some information in human-readable form when a reply code is received. The machine readability of the replies allows the creation and use of meaningful state tables: Upon receiving any particular reply, the system that sent the original command (the client) can put itself into the appropriate state to mirror the state of the other system (the server). Each digit of a reply code has some significance. The first digit indicates, very roughly, what happened in response to the original command. There are five levels of response, to be described later; a 2, for example, indicates that the command was received and successfully executed. The second digit of the reply code indicates a response within a specific category. For example, the reply code may make reference to the session connection or may contain information. The third digit provides the most specific set of meanings, referring to some particular situation or information related to the rest of the reply. Table 10.1 lists all the reply codes that are defined in RFC 821bis. First and second digit significance are discussed next.
Completion Status
The first digit of the SMTP reply code indicates the completion status of the command being replied to. There are five valid values, of which only four are used by standard SMTP commands. The completion status replies are as follows:
Positive Preliminary Reply (1xy). When the first digit of the reply code is 1, it means that the command has been accepted, but the completion is being withheld for confirmation from the requesting system. In other words, the action will be taken only if the requester says to go ahead. This completion status is not defined for any SMTP-standard commands; any extension services that use it must also define an abort or continue command to use with this reply.
Positive Completion Reply (2xy). This reply indicates that the requested action has been completed and that any new command can be submitted.
Positive Intermediate Reply (3xy). This reply indicates that the command was accepted but that the action cannot be completed until additional information is supplied by the requesting system. Unlike the positive preliminary reply, this reply indicates that the action cannot take place until the additional information has been provided.
Transient Negative Completion Reply (4xy). This reply indicates that the command was not accepted and the requested action was not done. However, this reply indicates that the command was not accepted due to some temporary condition. The precise meaning of transience in terms of distinguishing a transient condition from a permanent condition is not provided in RFC 821bis, though the specification does indicate that if the precise same command can be resubmitted and accepted successfully, then the condition is probably temporary. This response might be provided as the result of the unavailability of resources needed to complete the mail transaction.
Permanent Negative Completion Reply (5xy). As with the transient negative reply, the permanent negative reply indicates that the command was not accepted and the requested action not done. However, the implication for this reply is that resubmitting the same command will not produce a different result. This does not mean that the negative completion cannot be corrected, only that the same command will produce the same result, which means that the information used to generate the mail transaction may need to be corrected (for example, a mailbox may have been misspelled).
The second digit of the reply code provides more information about the reply, placing the reply within specific categories. RFC 821bis defines categories for 0 through 5, but 3 and 4 are undefined. The other categories include the following:
Syntax (x0y). Syntax-related replies refer to errors or conditions related to the syntax of the command. For example, a command whose syntax is improper would get a reply with this code. Likewise, a superfluous or unimplemented command would receive a similar reply.
Information (x1y). Information-related replies contain information and are generally in response to commands that make requests for information, for example, HELP or EXPN.
Connections (x2y). Replies relating to the actual transmission channel used between SMTP server and client use this code.
Mail System (x5y). Replies relating to the status of the receiver mail system, as that system's status relates to the action being requested, use this code.
The third digit of the reply code tends to simply identify different types of replies. The values listed in Table 10.1 illustrate this. We see how the reply codes are used later in the chapter when we look at examples of mail transactions.
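Because each digit carries its own meaning, a client can classify any reply mechanically. The Python sketch below decodes a reply code into its completion status, its category, and its third digit; the names describe_reply and is_transient_failure are hypothetical, and the label "unspecified" for the undefined second digits 3 and 4 is a choice made for this example.

```python
COMPLETION_STATUS = {
    1: "positive preliminary",
    2: "positive completion",
    3: "positive intermediate",
    4: "transient negative completion",
    5: "permanent negative completion",
}

# Second-digit categories; 3 and 4 are left undefined by the specification.
CATEGORY = {0: "syntax", 1: "information", 2: "connections", 5: "mail system"}


def describe_reply(code):
    """Return (completion status, category, third digit) for a reply code."""
    if not 100 <= code <= 599:
        raise ValueError(f"not a valid SMTP reply code: {code}")
    first, rest = divmod(code, 100)
    second, third = divmod(rest, 10)
    return COMPLETION_STATUS[first], CATEGORY.get(second, "unspecified"), third


def is_transient_failure(code):
    """True for 4xy replies, where resubmitting the same command may succeed."""
    return 400 <= code <= 499
```

A sending system might use is_transient_failure to decide whether to queue a message for a later attempt or to report the failure to the originating user at once.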
Extensions
RFC 821 defined several SMTP commands that were optional for standards-compliant SMTP implementations. Since 1982, four of these commands have either been deprecated or made obsolete, while one has become required, as per RFC 821bis. This does not mean there are no SMTP service extensions; there are, and we cover them shortly. First, it is worth mentioning the original six optional commands and then moving on.
Three original SMTP commands have become obsolete. They are the SEND, SOML (SEND or MAIL), and SAML (SEND and MAIL) commands. SEND was defined as an alternate to MAIL, and it delivered the message directly to a user's session on a terminal (remember, this was 1982, well before the ubiquitous personal computer); otherwise, SEND operated like the MAIL command. The SOML command would cause the message to be delivered as a SEND to the user at a terminal session (if such a session was running) or else deliver the message with the MAIL command if no such terminal session was in use. The SAML command caused both a SEND and a MAIL command to be used to deliver the message to both the terminal session and the mailbox.
The TURN command has been deprecated since 1982 because it presents a potential security problem. When a client sends the TURN command to a server, the server is supposed to take on the role of client while the original client becomes a server. The purpose of TURN, originally, was to streamline message transactions between SMTP systems across channels that were costly or time-consuming to set up. In practice, letting a client control a server in this way can allow an attacker to divert mail from its proper destination. RFC 821bis mandates that TURN should not be implemented unless it is accompanied by strong authentication of the host requesting that the client and server switch roles.
The HELP command is still considered optional. The status of VRFY is less clear.
Though RFC 821bis indicates that both VRFY and EXPN SHOULD be supported, RFC 1869 seems to imply that EXPN is optional but that VRFY is required. An implementation that lacks HELP, VRFY, and EXPN could still perform all functions necessary to sending messages but would not be able to respond positively to those commands.
A server announces the service extensions or optional commands it supports when it starts a session with a client. In a successful response to the EHLO command, the server responds with a reply code of 250 (meaning "Requested mail action okay, completed") along with information about what optional services it supports. Thus, if the server supports the EXPN and HELP commands, it would reply with a separate 250 reply line for each of those commands. We see how this works in the examples section. Other SMTP service extensions have been defined. Table 10.2 lists RFCs that describe some SMTP service extensions.

Table 10.1 SMTP Reply Codes (from RFC 821bis)
REPLY CODE  VALUE (NOTES)
211         System status, or system help reply
214         Help message (Information on how to use the receiver or the meaning of a particular nonstandard command; this reply is useful only to the human user)
220         <domain> Service ready
221         <domain> Service closing transmission channel
250         Requested mail action okay, completed
251         User not local, will forward to <forward-path>
252         Cannot VRFY user, but will accept message and attempt delivery
354         Start mail input; end with <CRLF>.<CRLF>
421         <domain> Service not available, closing transmission channel (This may be a reply to any command if the service knows it must shut down)
450         Requested mail action not taken: mailbox unavailable (e.g., mailbox busy)
451         Requested action aborted: local error in processing
452         Requested action not taken: insufficient system storage
500         Syntax error, command unrecognized (This may include errors such as command line too long)
501         Syntax error in parameters or arguments
502         Command not implemented
503         Bad sequence of commands
504         Command parameter not implemented
550         Requested action not taken: mailbox unavailable (e.g., mailbox not found, no access, or command rejected for policy reasons)
551         User not local; please try <forward-path>
552         Requested mail action aborted: exceeded storage allocation
553         Requested action not taken: mailbox name not allowed (e.g., mailbox syntax incorrect)
554         Transaction failed (Or, in the case of a connection-opening response, "No SMTP service here")
Table 10.2 RFCs Documenting SMTP Service Extensions
RFC NUMBER  STATUS        RFC TITLE
1652        Draft STD     SMTP Service Extension for 8bit-MIMEtransport
1830        Experimental  SMTP Service Extensions for Transmission of Large and Binary MIME Messages
1870        STD 10        SMTP Service Extension for Message Size Declaration
1891        Proposed STD  SMTP Service Extension for Delivery Status Notifications
1985        Proposed STD  SMTP Service Extension for Remote Message Queue Starting
2034        Proposed STD  SMTP Service Extension for Returning Enhanced Error Codes
2487        Proposed STD  SMTP Service Extension for Secure SMTP over TLS

SMTP Procedures by Example

In this section, we walk through SMTP interactions using example commands and replies collected from RFCs and other documents. Where it helps understanding, we use more than one example to illustrate commands and responses.
SMTP servers maintain state tables relating to the commands and replies received. If a certain type of command is received, consulting the state table gives guidance to the server as to possible replies to a client's commands. Depending on what reply is returned, a client can determine what its next command should be. In Table 10.3, we include the table of permitted command/reply sequences that is used to build the state tables.
In addition to their state tables, SMTP servers maintain three buffers: one for the reverse-path (the originating mailbox), one for the forward-path (the destination mailbox), and one for the mail data. Depending on what command is issued by the client, information may be added to a buffer, or one or more of the buffers may be cleared on the server. We see how this happens as we discuss the different commands and their responses.
The exchange of an SMTP message is called a transaction. This term has a very specific meaning and refers to processes that are built to be robust. A transaction either happens or it doesn't happen. Financial transactions implemented in software are designed to use very robust protocols that do not permit a transaction to be finalized until both sides have acknowledged the terms of the transaction to each other and acknowledged receipt of each other's acknowledgments. Something similar happens with email transactions: The client sends all the message information to the server, and the server stores it all in its buffers until the transmission is complete. At that point, the server acknowledges that the message is received and processable. Only after that can the server do anything to process the message by writing it to a file store from its buffers.
SMTP Responses and Replies

Table 10.3 shows a list of permitted replies that servers may make to client commands. This version is taken from RFC 821bis. There is a similar table in RFC 821, but significant changes have been made since then. In the original specification, there was a distinction drawn between an error and a failure, which is no longer deemed necessary.

Table 10.3 SMTP Commands and Permitted Replies
COMMAND                   SUCCESS CODES  ERROR CODES
CONNECTION ESTABLISHMENT  220            554
EHLO                      250            504, 550
HELO                      250            504, 550
MAIL                      250            552, 451, 452, 550, 553, 503
RCPT                      250, 251*      550, 551*, 552, 553, 450, 451, 452, 503, 550
DATA (start**)            354            451, 554, 503
DATA (send**)             250            552, 554, 451, 452
RSET                      250
VRFY                      250, 251, 252  550, 551, 553, 502, 504
EXPN                      250, 252       550, 500, 502, 504
HELP                      211, 214       502, 504
NOOP                      250
QUIT                      221
* The 251 reply code is deprecated in RFC 821bis.
** The initial response to a DATA command is either 354 ("start mail input") or an error; after the data is sent, the server can either send a 250 (OK) or an error, as noted.
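One way to see how Table 10.3 feeds a state table is to encode it as a lookup. The Python sketch below is an illustration only: the key names CONNECT and DATA-END are labels invented here for the connection-establishment and end-of-data rows, and the special case for 421 reflects the note in Table 10.1 that this reply may answer any command.

```python
# Table 10.3 as a mapping from command to the set of permitted reply codes.
PERMITTED_REPLIES = {
    "CONNECT": {220, 554},
    "EHLO": {250, 504, 550},
    "HELO": {250, 504, 550},
    "MAIL": {250, 552, 451, 452, 550, 553, 503},
    "RCPT": {250, 251, 550, 551, 552, 553, 450, 451, 452, 503},
    "DATA": {354, 451, 554, 503},            # reply to the DATA command itself
    "DATA-END": {250, 552, 554, 451, 452},   # reply after the terminating "." line
    "RSET": {250},
    "VRFY": {250, 251, 252, 550, 551, 553, 502, 504},
    "EXPN": {250, 252, 550, 500, 502, 504},
    "HELP": {211, 214, 502, 504},
    "NOOP": {250},
    "QUIT": {221},
}


def is_permitted(command, code):
    """Check a reply code against the table; 421 may answer any command."""
    return code == 421 or code in PERMITTED_REPLIES[command.upper()]
```

A client could use such a check to detect a misbehaving server, for example one that answers QUIT with anything other than 221.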
Session Initiation
SMTP servers listen on port 25 for client requests. When a request arrives, in the form of a request to open a TCP session on that port, the server responds with its first SMTP reply, the 220 "Service ready" reply. At this point, the client can send an EHLO command to the server, and the server can respond with a 250 "Requested mail action okay, completed" message and with a list of any supported extension services. This type of interaction generally looks something like what is shown next. The convention for client/server interactions in most Internet messaging specifications is to identify what the client sends by the string "C:" and to identify the server's responses by the string "S:". The session initiation looks like this fragment, taken from RFC 821bis:

S: 220 foo.com Simple Mail Transfer Service Ready
C: EHLO bar.com
S: 250-foo.com greets bar.com
S: 250-8BITMIME
S: 250-SIZE
S: 250-DSN
S: 250 HELP
Stepping through each line, the server responds to a client setting up a TCP circuit by sending the 220 reply. This reply includes the domain name of the server (foo.com) and a string indicating that service is ready. Once the client receives this affirmation of a connection from the server, it sends the EHLO command, along with its own domain. It should be noted that in this exchange, the domains used have only two parts, but in fact the domains used in these commands and replies are intended to represent hosts, not just domains.
The server's response to the client's EHLO command is a 250 reply, indicating that the server has received the command and processed it. Part of the response to an EHLO command is a list of the SMTP service extensions the server supports; the server provides it in a series of 250 reply lines. On every line but the last, the reply code is followed by a hyphen, indicating that there is at least one more line to follow. The last line of the reply dispenses with the hyphen. Until the server sends a 250 line with no hyphen separating the code from the reply text, the client waits for more responses from the server.
In this example, the server responds by first greeting the client, listing its own domain name and the client's domain name. The next four lines indicate that the server supports the 8BITMIME option (which allows the exchange of 8bit MIME data, rather than just 7bit data), SIZE (message size declaration), DSN (delivery status notifications, also known as return-receipts), and the HELP command. At this point, the client can continue by sending another valid command.
In the event that the server could not respond to the client's request to open a TCP circuit on server port 25, the server would respond with the 554 reply code, meaning "transaction failed" or, more specifically, "no SMTP service here." This would kill the session before it even got started, though the server must still wait for the client to submit a QUIT command. The server must not terminate the session on its own, but must wait for the client to terminate the session.
Even after the client and server have set up the TCP circuit on server port 25 and the client has submitted an EHLO command, the server can still respond with either a 504 or 550 negative reply code. In the first case, the 504 reply means that the client's command is not implemented on the server. This would happen with an older server that implements RFC 821, but not RFC 821bis, and thus does not recognize the EHLO command. In this case, the client could retry the connection using the HELO command, and the session would continue as usual (though without the server providing a list of options supported). The 550 reply occurs if there is a policy reason to reject the request, for example, if the client's domain is on a list of domains with which the server is not permitted to interact.
Assuming that the positive response was received, the client can continue the session by initiating a mail transaction. If the positive response was received, the server will have cleared out its buffers in anticipation of doing a mail transaction.
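The hyphen convention for multiline replies is easy to handle in code. The Python sketch below reads the lines of an EHLO response, stops at the first line whose code is not followed by a hyphen, and returns the greeting and the extension keywords. The function name parse_ehlo_reply is hypothetical, and the sketch assumes every line of a successful response carries the 250 code.

```python
def parse_ehlo_reply(lines):
    """Return (greeting, extension keywords) from the lines of a 250 reply."""
    texts = []
    for line in lines:
        line = line.rstrip("\r\n")
        code, separator, text = line[:3], line[3:4], line[4:]
        if code != "250":
            raise ValueError(f"unexpected reply line: {line!r}")
        texts.append(text)
        if separator != "-":
            # No hyphen after the code: this is the last line of the reply.
            break
    if not texts:
        raise ValueError("empty reply")
    # The first line is the greeting; each later line starts with a keyword.
    return texts[0], [t.split()[0].upper() for t in texts[1:] if t.strip()]
```

Applied to the five server lines of the fragment from RFC 821bis shown in this section, the function returns the greeting "foo.com greets bar.com" and the keywords 8BITMIME, SIZE, DSN, and HELP.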
Mail Transactions
The mail transaction itself consists of three types of command: the MAIL command, the RCPT command, and the DATA command. The sequence of these commands does not vary. First comes a single MAIL command, indicating a reverse-path for the message, which is usually the mailbox from which the message originates, though sometimes it contains the mailbox to which error messages should be directed instead. This is followed by one or more RCPT commands, indicating message recipients. Last is a single DATA command, which is followed by the content of the message. Continuing the exchange started above, lifted from RFC 821bis, we see the following:
C: MAIL FROM:<Smith@bar.com>
S: 250 OK
C: RCPT TO:<Jones@foo.com>
S: 250 OK
C: RCPT TO:<Green@foo.com>
S: 550 No such user here
C: RCPT TO:<Brown@foo.com>
S: 250 OK
Once it is the clients turn, the client sends a MAIL command. As we see, the MAIL command indicates an originating mailbox for the message. There is only one originating mailbox, so once the server replies with a 250 OK response, the client continues with the RCPT command or commands. The origination mailbox value is placed in the servers reverse-path buffer at this point. The RCPT commands are processed next, with the values of the destination mailboxes going into the forward-path buffer. In this particular sequence, there are three recipients, though only two are accepted. The third is rejected with a 550 error, meaning that there is no such user known at the destination mail server. It should be noted that there is only one positive completion code permitted for the MAIL command, but several failure codes. If the client submits the MAIL command out of sequence (for example, after a RCPT or DATA command), the server responds with a 503 reply code (bad sequence of commands). A 550 response might mean that the transaction is being rejected for a policy reason (for example, the originating mailbox is banned from using the server). If the
161
mailbox syntax is wrong, the server can respond with a 553 reply. If the server is out of storage space, it can respond with a 552 reply. Note that these replies start with the digit 5, meaning they indicate a permanent failure. However, this does not necessarily mean that the message can never be transmitted so much as that the command, as submitted, will never succeed in the context and form it was sent. In other words, if the command includes bad syntax, it is not going to succeed even if it is resent. Likewise, if a command is submitted out of order, sending it again will not remedy the problem. The 451 and 452 replies are also permitted in response to the MAIL command. These responses indicate, respectively, that a local processing error occurred or the system had insufficient storage to process the command submitted. The first digit 4 indicates a transient error condition. Although the command failed this time, resubmitting the command in the same format could very well succeed. The RCPT commands include a single mailbox each. The server has the option of responding to the RCPT command in many ways. If the command can successfully be completed, the server returns a 250 OK reply. A 251 reply, indicating that the mailbox is not local but that the server will forward the message anyway, was defined in RFC 821, but is now deprecated and should not be implemented in SMTP servers because it reveals more information than some organizations feel is necessary or safe. A RCPT command can be rejected in many ways. If, for example, the server knows the address in a RCPT command refers to a domain that can not be reached, it may return a 550 reply. If the RCPT command was submitted out of order, a 503 reply is in order. Other permanent failure conditions are represented by replies 550 through 553, or there may be a local problem (a 450, 451, or 452 reply). 
The 550 reply, "no such user here," can be used to indicate that the server recognizes the mailbox address as undeliverable. The 551 reply code, indicating "user not local; please try <forward-path>," is technically permitted, but is included mostly for backward compatibility with legacy systems. Though it was defined in RFC 821, RFC 821bis deprecates it (along with reply 251) because it can provide too much information about an organization's network architecture.
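The first-digit rule described above lends itself to a small helper. The following Python sketch is only an illustration: the function names `classify_reply` and `should_retry` are made up for this example, and the category labels are informal descriptions rather than wording taken from the RFCs.

```python
def classify_reply(code: int) -> str:
    """Classify an SMTP reply code by its first digit (illustrative labels)."""
    if not 200 <= code <= 599:
        raise ValueError(f"unexpected SMTP reply code: {code}")
    categories = {
        2: "positive completion",    # for example, 250 OK
        3: "positive intermediate",  # for example, 354 after DATA
        4: "transient failure",      # for example, 451 or 452
        5: "permanent failure",      # for example, 503, 550, 552, 553
    }
    return categories[code // 100]


def should_retry(code: int) -> bool:
    """Resending the same command unchanged is only worthwhile after a 4xx reply."""
    return classify_reply(code) == "transient failure"


if __name__ == "__main__":
    for code in (250, 354, 452, 503, 553):
        print(code, classify_reply(code), "retry" if should_retry(code) else "no retry")
```

A client built this way would queue the message and try again later after a 452, but would report a failure to the sender after a 553, because repeating the same malformed command cannot succeed.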
The Message Itself

Once the MAIL and RCPT commands have been properly processed, all that remains is to transmit the message to the server. This part of the transaction starts when the client sends a DATA command to the server, as shown in the sequence below (taken from RFC 821bis). The server's reply of 354 indicates that the server is ready to accept mail input, and the text message reminds the sender to terminate the message transmission with a line that has nothing but a period on it (a CRLF followed by a period followed by another CRLF). Once that reply is received, the client can go ahead and send the message as a sequence of lines. When the termination sequence is received, the server sends the 250 OK reply, and the client can terminate the session by sending a QUIT command (we return to session termination later).
C: DATA
S: 354 Start mail input; end with <CRLF>.<CRLF>
C: Blah blah blah...
C: ...etc. etc. etc.
C: .
S: 250 OK
C: QUIT
S: 221 foo.com Service closing transmission channel
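To make the termination sequence concrete, here is a minimal Python sketch of how a client might frame the message lines it sends after the 354 reply. The function name `frame_data` is hypothetical. The sketch also applies the transparency rule from RFC 821, which the text above does not cover: a body line that begins with a period is sent with one extra leading period so that it cannot be mistaken for the terminator.

```python
def frame_data(body_lines):
    """Return the bytes a client would send after the server's 354 reply.

    body_lines is a list of message lines without line endings.
    """
    out = []
    for line in body_lines:
        # Transparency (RFC 821): protect lines that start with a period.
        if line.startswith("."):
            line = "." + line
        out.append(line + "\r\n")
    # A line containing only a period ends the message.
    out.append(".\r\n")
    return "".join(out).encode("ascii")


if __name__ == "__main__":
    print(frame_data(["Blah blah blah...", "...etc. etc. etc."]))
```

Note that the second body line in the exchange above happens to start with a period, so a client following the transparency rule would actually put four periods on the wire for that line, and the receiving server would strip the extra one.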
Forwarding Mail
An SMTP client attempts to open a session with a particular server based on a handful of criteria. In the case of the client that simply originates email (see "SMTP and Message Submission," later in this chapter), the SMTP client may simply send all messages to some server that has been designated as a relay server. For SMTP clients that are fully SMTP capable, the target server is determined by using DNS to look up a mail exchange (MX) record (see Chapter 13). DNS yields the fully qualified domain name and address of a server that acts on behalf of the email address specified as the destination. The SMTP client takes this information and uses it to attempt to connect to the designated SMTP server for that destination address. DNS may be used both on the Internet and inside organizational intranets that have their own DNS servers.

Thus, the RCPT command can elicit different responses from the server depending on what function the server is fulfilling. In the case of the server that accepts all mail for forwarding, the destinations cited in the RCPT commands will be accepted as long as they are valid addresses. On the other hand, an SMTP server accepting only mail for its own domain can reject RCPT commands that contain destinations recognized as not being valid addresses for the domain the server is serving.

Due to the prevalence of forged mail, or merely to prevent mail coming into an organization (or being forwarded by an organization's SMTP servers), it is possible to restrict the domains from which messages can originate. For example, if an organization decides that messages coming from fubar.com tend to be mostly spam (unsolicited email), then its SMTP servers can be configured to reject messages that originate from that domain.
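The difference between a relay and a server that accepts mail only for its own domain can be sketched as a pair of policy functions. Everything here is a hypothetical illustration: the function names, the parameters, and the reply texts are assumptions for this example, and a real server would draw on its own configuration and user database.

```python
def mail_reply(sender, blocked_sender_domains):
    """Decide how to answer MAIL FROM, given a set of banned sender domains."""
    domain = sender.rpartition("@")[2].lower()
    if domain in blocked_sender_domains:
        return 550, "sender rejected by site policy"
    return 250, "OK"


def rcpt_reply(recipient, *, relay, local_domain, local_users):
    """Decide how to answer one RCPT TO, as either a relay or a final server."""
    local_part, _, domain = recipient.rpartition("@")
    if not local_part or not domain:
        return 553, "mailbox syntax incorrect"
    if domain.lower() == local_domain:
        # The server can check its own users directly.
        if local_part.lower() in local_users:
            return 250, "OK"
        return 550, "no such user here"
    # A relay accepts nonlocal destinations; a non-relay server refuses them.
    if relay:
        return 250, "OK"
    return 550, "mailbox not served by this host"


if __name__ == "__main__":
    print(mail_reply("someone@fubar.com", {"fubar.com"}))
    print(rcpt_reply("Admin.MRC@foo.com", relay=False,
                     local_domain="foo.com", local_users={"admin.mrc"}))
```

The relay branch accepts addresses it cannot verify, which is exactly why a relay may later have to generate a delivery failure notification for a message it has already accepted.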
SMTP Trace Header Fields

Remember the trace header fields described in Chapter 8? We said that SMTP puts those into the message. The way this works is that RFC 821bis (and RFC 821 before that) mandates that every time an SMTP server receives a message from an SMTP client for delivery (or for any type of further processing) the server must insert a trace header field. The trace header acts as a time stamp on the message; the receiving server adds the trace time-stamp-line (Received:) header at the very beginning of the message when the server accepts the message for processing from the client. The ABNF syntax for the SMTP trace header fields is listed here (adapted from RFC 821bis):
Return-path-line = "Return-Path:" FWS Reverse-path <CRLF>
Time-stamp-line = "Received:" FWS Stamp <CRLF>
Stamp = From-domain By-domain Opt-info ";" FWS Daytime
From-domain = "FROM" FWS Extended-Domain CFWS
By-domain = "BY" FWS Extended-Domain CFWS
Extended-Domain = Domain /
    ( Domain FWS "(" TCP-info ")" ) /
    ( Address-literal FWS "(" TCP-info ")" )
TCP-info = Address-literal / ( Domain FWS Address-literal )
    ; Information derived by server from TCP connection, not client EHLO.
Opt-info = [Via] [With] [ID] [For]
Via = "VIA" FWS Link CFWS
With = "WITH" FWS Protocol CFWS
ID = "ID" FWS String / msg-id CFWS
For = "FOR" FWS 1*( Path / Mailbox ) CFWS
Link = "TCP" / Addtl-Link
Addtl-Link = Atom
    ; Additional standard names for links are registered with the
    ; Internet Assigned Numbers Authority (IANA). "Via" is primarily
    ; of value with non-Internet transports. SMTP servers SHOULD NOT
    ; use unregistered names.
Protocol = "ESMTP" / "SMTP" / Attdl-Protocol
Attdl-Protocol = Atom
    ; Additional standard names for protocols are registered with the
    ; Internet Assigned Numbers Authority (IANA). SMTP servers SHOULD
    ; NOT use unregistered names.
Daytime = FWS [ day-of-week "," FWS ] Date FWS Time
Date = DD FWS Mon FWS YYYY
    ; Note that the earlier form, which permits two-digit years, has
    ; been deprecated. SMTP systems MUST use four-digit years.
[additional elements defining the date element have been deleted for brevity; they are available in Chapter 8]
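As a rough illustration of the grammar above, the following Python sketch assembles one time-stamp-line and places it at the top of a message. The function names and parameters are hypothetical, the FROM clause uses the "Domain (Address-literal)" form, and the keywords are written in lowercase on the assumption that ABNF string literals are matched without regard to case. The date is produced by the standard library's `email.utils.format_datetime`, which emits a four-digit year.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def received_line(ehlo_name, peer_ip, by_domain, when,
                  protocol="ESMTP", queue_id=None, for_mailbox=None):
    """Build one Received: header line (without the trailing CRLF)."""
    parts = [
        f"from {ehlo_name} ([{peer_ip}])",  # EHLO name plus TCP-derived address
        f"by {by_domain}",                  # the server adding this stamp
        f"with {protocol}",
    ]
    if queue_id:
        parts.append(f"id {queue_id}")
    if for_mailbox:
        parts.append(f"for <{for_mailbox}>")
    return "Received: " + " ".join(parts) + "; " + format_datetime(when)


def prepend_trace(message_text, trace_line):
    """New trace lines go at the very beginning; existing ones are left untouched."""
    return trace_line + "\r\n" + message_text


if __name__ == "__main__":
    stamp = received_line("bar.com", "192.0.2.1", "foo.com",
                          datetime(2001, 1, 1, 12, 0, tzinfo=timezone.utc),
                          queue_id="ABC123", for_mailbox="Admin.MRC@foo.com")
    print(prepend_trace("Subject: test\r\n\r\nBlah blah blah...\r\n", stamp))
```

The host names, the address 192.0.2.1, and the queue identifier are placeholder values chosen for the example.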
The time-stamp-line header includes the name of the source host (from the EHLO command) and the IP address of the source (taken from information used to set up the TCP circuit between the server and client). The trace must carry this information in the FROM clause of the Received: header. A single instance of the return-path-line is required for each message, put there by the last delivering SMTP server. Every SMTP server that handles the message must add a time-stamp-line, which contains the domain name of the host that sent the message (the from-domain), the domain name of the host that received it (the by-domain), and the time at which the message was received.

A mail system is not permitted to modify Received: headers added by other systems. These headers are useful for tracing the route a message takes across the Internet. Likewise, the trace headers may not be rearranged by any system. Looking at the Received: headers should give an accurate picture of where the message came from and what systems passed the message along en route to its destination.

One option that used to be open to SMTP implementers was to specify a route that a message should take: The route would be added as an argument to the MAIL command. RFC 821bis specifically indicates that while such routes will not cause a mail transaction to fail, the route will be ignored. The value of using a source-specified route is dubious, and it poses the potential for security and other problems (more on this later in the chapter).

The return-path line is inserted at the beginning of the mail data by the SMTP system that acts as the final deliverer of the message. The value of the return-path is taken from the reverse-path specified in the MAIL command. This may be different from the mailbox of the original sender, particularly if the return-path is designated as a separate mailbox that handles error messages (this is common usage for mailing lists, less so for individual mailboxes).
How does the SMTP server know that it is the final deliverer of the message? In general, if the message leaves the SMTP delivery environment, then it is assumed that it has been delivered. This occurs when the SMTP system hands the message to a file system (that is, there is no more forwarding) or passes it along to some other (non-SMTP) system. The message may traverse more systems if the last SMTP system was a gateway system, but it has effectively left the SMTP environment. Unlike trace headers, however, return-path lines may be removed by an SMTP system if they exist in the message but the message has not yet arrived at its destination.
Closing the Session

Closing the SMTP session is very simple: The client sends the QUIT command, and the server replies with a 221 message, "<domain> Service closing transmission channel," where <domain> is the domain name of the SMTP server. This is when everything works as it is supposed to.

Of course, there are wrinkles. First, the QUIT command is necessary even when the service channel has not actually been opened. When a TCP circuit on port 25 is opened, if the server replies with a 554 reply (meaning that the service is not being offered), the client still has to send a QUIT in order for the session to be properly terminated.

Next, there may be times when the server must be brought down for some reason; it may have run out of disk space, for example. In these cases, the server can terminate the session by sending a 421 code, indicating that the service is not available and that the server must close the channel. When the TCP circuit itself is broken, the server must cancel the transaction that was in progress, but must leave intact any other transactions that were completed earlier in the session. The client must behave as if it had received a transient error message and can attempt to reconnect with the server and retry the transaction.

In general, however, the server must never terminate the session until it receives a QUIT command and replies to the client with a positive response; the client must never terminate the session until it sends the QUIT command and receives the positive response from the server.
Verifying and Expanding

In practice, both VRFY and EXPN tend to be problematic. They are often not implemented, and even when implemented, they can return inaccurate information. We discuss them here for the sake of completeness rather than for their inherent utility within SMTP.

The SMTP VRFY and EXPN commands are not necessary for transmitting messages, though they can be useful for debugging problems with message deliveries. Their usefulness extends beyond legitimate purposes at times, and these commands have been known to offer information to attackers as well as to legitimate entities. More is said about the security aspects of these commands later in the chapter; in this section, we discuss only how they work.

Again taking an example adapted from RFC 821bis, here is an exchange between SMTP systems with the client using the VRFY command. In this example, the client wants to verify a mailbox for the user name Crispin. Using VRFY is like asking the server if it can deliver a message addressed to that user name. Usually, the name being verified is a valid mailbox name, but it does not have to be (as in this example). The VRFY command follows the EHLO command/reply sequence and precedes any actual mail transaction.

Valid positive responses include 250, 251, and 252. A 250 reply indicates that the server will accept mail for that name. A 251 reply indicates that the user is not local but that the server will forward the message; it also indicates a path for forwarding. The 252 reply means that the server cannot verify the user locally, but will accept the message anyway and attempt to deliver it. The sample verification sequence goes like this:
S: 220 foo.com Simple Mail Transfer Service Ready
C: EHLO bar.com
S: 250-foo.com greets bar.com
S: 250-8BITMIME
S: 250-SIZE
S: 250-DSN
S: 250 HELP
C: VRFY Crispin
S: 250 Mark Crispin <Admin.MRC@foo.com>
C: MAIL FROM:<EAK@bar.com>
S: 250 OK
C: RCPT TO:<Admin.MRC@foo.com>
S: 250 OK
C: DATA
S: 354 Start mail input; end with <CRLF>.<CRLF>
C: Blah blah blah...
C: ...etc. etc. etc.
C: .
S: 250 OK
C: QUIT
S: 221 foo.com Service closing transmission channel
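Two details of the exchange above are easy to show in code: the multiline EHLO reply, in which every line but the last has a hyphen after the reply code, and the VRFY reply, which carries a user name and mailbox. The Python sketch below uses hypothetical function names and relies on the standard library's `email.utils.parseaddr` to split the name from the mailbox; it assumes the reply text follows the "Name <mailbox>" form shown in the example.

```python
from email.utils import parseaddr


def parse_reply(lines):
    """Collapse one SMTP reply (possibly multiline) into (code, [text, ...])."""
    code = None
    texts = []
    for line in lines:
        this_code, separator, text = int(line[:3]), line[3:4], line[4:]
        if code is None:
            code = this_code
        elif this_code != code:
            raise ValueError("reply code changed in the middle of a reply")
        texts.append(text)
        if separator != "-":  # a space (or nothing) marks the last line
            break
    return code, texts


def verified_mailbox(reply_line):
    """Return (name, mailbox) from a positive VRFY reply, or None otherwise."""
    code, texts = parse_reply([reply_line])
    if code not in (250, 251, 252):
        return None
    return parseaddr(texts[0])


if __name__ == "__main__":
    ehlo = ["250-foo.com greets bar.com", "250-8BITMIME",
            "250-SIZE", "250-DSN", "250 HELP"]
    print(parse_reply(ehlo))
    print(verified_mailbox("250 Mark Crispin <Admin.MRC@foo.com>"))
```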
In this example, the VRFY command got a successful response. Note that any successful response must include the user name and mailbox. Nothing happens to the buffers when a VRFY command is submitted and replied to. It is up to the client to take the address the server returned and use it as the argument to a RCPT command, as the sample sequence does.

When the server cannot reply positively to the VRFY, for example when the user name is ambiguous (meaning that it may represent more than one user that the server accepts mail for), the server may reply with a simple 553 message indicating that the mailbox name is ambiguous. On the other hand, the server may also reply with a message indicating that the name is ambiguous and a list of the mailboxes that might correspond to the user name.

The EXPN command is used by the client to ask the server to expand a mailing list name and reply with a list of the member names. If the VRFY command is a way for the client to ask the server, "Do you know this mailbox?" then the EXPN command is a way for the client to ask the server, "Who gets a copy when I send mail to this mailing list name?" EXPN is a bit more controversial than VRFY in terms of security exposure: By using EXPN, an unauthorized user can gain access to the names and addresses of mailing list subscribers, which are often used for sending unsolicited email.
Additional Protocol Considerations

SMTP provides an important service to the Internet community: a very visible and sometimes vulnerable service. sendmail, a widely used SMTP implementation for Unix, has often been a target of attackers over the years. SMTP in any flavor tends to be vulnerable to attack because any organization that deploys SMTP must offer a server that is exposed to the Internet. There is no hiding. Security is a real concern, and in this section we discuss some of the issues directly related to SMTP, though we discuss messaging security at greater length in Chapter 17, Internet Messaging Security. Problem detection and problem handling are also critical functions for any email system. SMTP provides some mechanisms for detecting and dealing with problems, to be discussed after security.
Security Issues
For SMTP, there are two major security issues: email spoofing and protecting information. Information can leak from any number of sources. As we've mentioned earlier, a server that supports the VRFY and EXPN commands may expose some potentially sensitive information: the names of actual verified email addresses. Knowing an email address means that an attacker can take the next step and attempt to subvert a specific account, and possibly accounts on systems owned by the same person who owns the email account. Another less than desirable use of the EXPN command is by spammers in search of email addresses of people who subscribe to mailing lists. However, SMTP servers must either support VRFY or not: They may not pretend to support it by always responding either negatively or positively to any VRFY command. As for the EXPN command, sites have the option of enabling or disabling it; they may soon have the option of enabling it only for authenticated and authorized users.

Other potential sources of information leakage exist in the trace fields. When a message originates within an organizational intranet, it may accumulate some trace fields that reference internal systems before the message arrives at an SMTP relay server that injects the message into the global Internet. Likewise, implementers should beware of using the optional For clause in the trace field when sending blind copies: it can often tip recipients off to the identity or identities of other recipients if not used carefully. Even the announcement replies that contain system information are considered potentially harmful, as they allow attackers to probe for specific types of systems that may have particular security holes.
A more difficult problem is that of mail spoofing. By its nature, because SMTP allows just about any SMTP sender to connect to just about any SMTP receiver, SMTP itself is insecure. There is nothing stopping you, me, or anyone else from telnetting into port 25 of an SMTP server and hand-typing the commands explained in this chapter. And nothing is stopping malicious or simply mischievous attackers from building in headers that make a message appear to be from anyone in the world.

The spoofing issue is a problem largely because it is fairly intractable at the SMTP level. Because you cannot predict where a message will be forwarded once it is injected into the delivery environment, you cannot really specify any special handling for authentication of SMTP clients and servers. There are times when perfectly legitimate applications and individuals send mail on behalf of some other person, so it is not useful to figure out ways to make sure that the MAIL FROM command uses the same address as the From: header. Ultimately, the solution lies in the direction of end-to-end, application-level mechanisms that support data integrity. These might include some form of digital signature on the body of the message that can be authenticated by the recipient. We discuss these approaches at greater length in Chapter 17.
Problem Detection and Problem Handling

As the authors of RFC 821bis point out, SMTP service cannot be taken lightly. Any time an SMTP server accepts responsibility for a piece of mail by accepting the message, it must not fail to deliver unless it has a very good reason. For example, if the server were to be dynamited the moment after sending the 250 OK reply, that might be considered a good reason. However, the server crashing after sending the 250 OK reply and causing the message to be lost is not a good reason.

If an SMTP server does fail to deliver a message after accepting it, the server must create and send a notification message. The message must be addressed to the mailbox in the Return-Path: header, with the notification message's own reverse path set to null (<>). (Of course, if the original Return-Path: header is itself null, the notification message must not be sent out.) Other good reasons for failure to deliver can occur when an SMTP server accepts messages as a relay, so it does not know in advance whether an address is valid.

One common problem is the mail routing loop. Trivial loops, such as those where a mailing list contains an alias referring to another mailing list, which in turn references the first mailing list, should be relatively straightforward to detect. Other loops can occur that are less easy to detect by any mechanism other than counting the number of trace headers. RFC 821bis specifies that a message should not be considered to be looped unless it has at least 100 Received: headers.
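Counting trace headers is simple to express in code. The Python sketch below uses the standard library's `email` package to count Received: headers and compares the count with the threshold of 100 mentioned above; the function name `looks_looped` and the constant name are made up for this illustration.

```python
from email import message_from_string

# Threshold taken from the discussion above: at least 100 Received: headers.
MAX_RECEIVED = 100


def looks_looped(raw_message, threshold=MAX_RECEIVED):
    """Return True if the message carries enough trace headers to suggest a loop."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("Received", [])) >= threshold


if __name__ == "__main__":
    stamp = "Received: from a.example by b.example; Mon, 01 Jan 2001 12:00:00 +0000\n"
    rest = "Subject: test\n\nBlah blah blah...\n"
    print(looks_looped(stamp * 3 + rest))    # a short, ordinary path
    print(looks_looped(stamp * 100 + rest))  # enough hops to be treated as a loop
```

A server that decides a message is looping would then reject it or send a notification to the return path, as described above, rather than forward it again.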
Following the tradition of being conservative about what you send and liberal about what you accept, RFC 821bis also discusses what happens with messages that may not be entirely well formed. SMTP servers are permitted to try to fix bad messages that they originate by adding a message-id field if the message lacks one, by adding a date, time, or time zone when the message lacks any of these, and by fixing addresses so they conform to the fully qualified domain name format when they are otherwise malformed. However, this fixing up of broken messages is purely optional and the more broken the message is, the less likely the server will be able to figure out how to fix it correctly. And when a server does fix something, that fix should be documented in a trace or header field somewhere.
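A minimal Python sketch of this kind of fix-up, limited to the missing message-id and missing date cases, might look like the following. The function name `fix_up` is hypothetical, and the sketch returns a list of the fixes it made instead of writing them into a header, since how a fix is recorded in a trace or header field is left to the implementation. `make_msgid` and `formatdate` are standard library helpers in `email.utils`.

```python
from email import message_from_string
from email.utils import formatdate, make_msgid


def fix_up(raw_message, domain):
    """Add a Message-ID and Date when they are missing; report what was changed."""
    msg = message_from_string(raw_message)
    fixes = []
    if "Message-ID" not in msg:
        msg["Message-ID"] = make_msgid(domain=domain)
        fixes.append("added Message-ID")
    if "Date" not in msg:
        # localtime=True includes the local time zone offset in the date.
        msg["Date"] = formatdate(localtime=True)
        fixes.append("added Date")
    # The caller is expected to document these fixes somewhere in the message.
    return msg, fixes


if __name__ == "__main__":
    fixed, changes = fix_up("From: EAK@bar.com\n\nBlah blah blah...\n", "bar.com")
    print(changes)
    print(fixed.as_string())
```

Repairing malformed addresses is deliberately left out of the sketch, because guessing the intended domain is the kind of fix that becomes less reliable the more broken the message is.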
SMTP and Message Submission

SMTP is for transferring and delivering messages. However, the messages have to somehow be introduced into the SMTP delivery environment. There is no getting around it, and there is no getting around the fact that it is not practical or feasible for people to run full-blown SMTP 24 hours a day and seven days a week on their personal computers. However, people are using partial SMTP implementations to submit messages to SMTP relay systems, which in turn pass the messages into the delivery environment.

RFC 2476, Message Submission, discusses how SMTP can be more properly used for message submission. In fact, it further complicates that nice orderly picture of MTAs and MUAs with the concept of a Message Submission Agent (MSA). The MSA accepts submissions from MUAs on port 587 and passes them on to an MTA. By setting aside a different port for submissions, servers can be configured to not accept any submissions on port 25. This can be useful if an SMTP server is being used to accept messages only from internal hosts and then relay them to another SMTP server for relay or delivery. The server can also be set up to reject all nonlocal submissions, discriminating on the basis of the IP address of the source or by using some form of authentication. RFC 2476 adds four new status codes, listed in Table 10.4.

Unlike MTAs, MSAs must keep messages cleaned up and cannot forward messages that are not complete (for example, messages that do not have valid domain names or time zones specified) or that are otherwise not proper. MSAs should help enforce address syntax by rejecting messages that do not conform to the standards. They should also keep logs of error messages. MSA options include the enforcement of submission rights by issuing error replies when an unauthorized user attempts to submit messages. Also optional are session authentication and checks for invalid message content.
None of these things are done by regular SMTP servers and clients, but regular SMTP systems do not normally have the same degree of connection with a body of users as the MSAs do. RFC 2476 includes additional information about how MSAs should support various SMTP extensions as well as specification of permitted ways in which messages may be modified by the MSA.

Table 10.4 Status Codes for Message Submission (from RFC 2476)

REPLY CODE   VALUE
560          Bad content.
562          Bad domain or address.
570          Site policy (the message appears to violate some site policy).
571          Not allowed (the address does not appear to have sufficient submission rights, is invalid, is not authorized, or is rejected because the submitting user does not have proper permissions).
Reading List
Table 10.5 lists the RFCs relevant to this chapter.

Table 10.5 Some SMTP RFCs

RFC                TITLE
RFC 821            Simple Mail Transfer Protocol
RFC 822            Standard for the Format of ARPA Internet Text Messages
RFC 1869           SMTP Service Extensions
work in progress   Simple Mail Transfer Protocol (Internet-Draft)
work in progress   Internet Message Format Standard (Internet-Draft)
RFC 2476           Message Submission
CHAPTER
11
Post Office Protocol (POP)
What happens to your telephone calls when you're not available to answer them? Some just ring off into oblivion, or at least they used to. Now, you have the option of using an answering machine or a voice mail system to answer your telephone for you. Internet messaging has a similar need for an entity to act on behalf of a recipient that isn't available to receive email at all hours of the day or night.

As we saw in Chapter 10, Simple Mail Transfer Protocol, SMTP defines the rules for exchanging messages between SMTP clients and SMTP servers. An SMTP server must always be up and running and listening to port 25, ever ready to start a session with a client so that all messages destined for that server can be received. When that server is not listening to port 25, whether because the server program is down or the entire system is turned off, mail cannot be delivered to that server. The SMTP client that tries to deliver mail will fail to do so and will send off a bounce message to the sender using the return-path discussed in Chapter 10.

SMTP is almost all the mail protocol you need in a world of big computer systems that are always up and running and have a phalanx of operators who make sure nothing goes wrong. This was pretty much the world of yesterday, the world we knew back in 1976 or so. Even then, however, you still needed a messaging protocol capable of delivering messages from the SMTP system to the end user.
Fortunately for us all, the personal computer caught on and we now have a situation in which hundreds of millions of computers can send and receive email over the Internet. More than ever, SMTP is no longer all the messaging protocol we need. Very few personal computer users feel comfortable leaving their systems up and running and connected to the Internet (or the network of their choice) 24 hours a day, 7 days a week. Even if Microsoft Windows were robust enough to run 24/7, most users would not want to keep their systems up and connected all the time.

A better solution would be to have groups of users share a server that acts on their behalf, collecting their messages and making them available when the user is ready to download them. Users might be grouped by their place of business, by their ISP, or by some other bond. This proposed server keeps track of its users' mailboxes, receives messages from the SMTP delivery infrastructure, allows downloading of messages addressed to its users, and holds on to the messages until users tell it to delete them.

This solution exists and is defined by the Post Office Protocol (POP). POP version 3 (POP3) is the current standard. It is defined in RFC 1939, Post Office Protocol - Version 3, and standardized as STD 53. A mechanism for extending POP3 is defined in RFC 2449, POP3 Extension Mechanism. RFC 1734, POP3 AUTHentication command, defines an optional command for doing authentication with POP3. RFC 2195, IMAP/POP AUTHorize Extension for Simple Challenge/Response, offers an enhanced mechanism for POP3 security. These RFCs form the basis for the material in this chapter.
POP3
POP3 is a nice, straightforward protocol designed to be simple to implement on clients and to not take up too many resources on the server. POP, like SMTP, is a stateful protocol. Unlike SMTP, the POP3 server replies are limited to two basic status indicators: one for positive response, the other for negative. The responses can also include more information. Similar to SMTP, POP3 servers respond to clients, go through a sequence of commands and responses before mail is transferred, and terminate sessions once completed.

Beyond that, and the fact that both are Internet protocols relating to messages, the similarities end. SMTP is for sending messages; POP3 is for delivering messages to their destination. POP3 is a matter between a client and a server. There is no relaying and no forwarding of messages involved. A POP3 client connects to a POP3 server, and messages are passed from the server to the client. With SMTP, the message may be passed from any number of SMTP clients to any number of SMTP servers before it finally reaches its destination.

It should be noted that although POP3 servers are separate from the SMTP delivery infrastructure, the POP3 server function may be fulfilled by the same system that acts as an SMTP server. This arrangement simplifies the transfer of messages from the SMTP environment into the local delivery (POP3) environment. Just as the client's message user agent may act as an SMTP client when it submits messages for delivery, so too does the POP3 server behave as an SMTP server when it receives messages on behalf of POP3 clients.
POP3 Commands and Responses

POP3 commands can be categorized by the POP state in which they are valid. For example, the USER command, which is used to pass the server a mailbox name, is valid only during the AUTHORIZATION state. Likewise, the mandatory RETR command, which is used to retrieve a message, is valid only in the TRANSACTION state. Table 11.1 lists all the POP3 commands defined in RFC 1939. As we learn later in the chapter, other commands are possible.

Table 11.1 POP3 Commands

COMMAND   ARGUMENTS     DESCRIPTION
USER      name          Identifies mailbox name to server.
PASS      string        Provides mailbox password to server.
QUIT                    Terminates session.
STAT                    Requests information about mailbox statistics.
LIST      [msg]         Requests information about messages in mailbox.
RETR      msg           Requests retrieval of a specific message.
DELE      msg           Requests marking for deletion of a specific message.
NOOP                    An "are you there?" command.
RSET                    Requests that messages marked for deletion be unmarked.
APOP      name digest   Used for authentication purposes.
TOP       msg n         Requests the headers and first n lines of the message indicated.
UIDL      [msg]         Requests a unique ID for the message indicated or for all messages in mailbox.

The server has two options when responding to a command: either perform the command or not. If the server completes the command, it responds positively, returning the string +OK (the status indicator) and a keyword and sometimes additional information. The negative response -ERR can also contain a keyword and additional information. Though the commands themselves are not case sensitive, the status indicators are and must appear in uppercase characters as shown here. The next section discusses how these commands are used in the context of sample interactions between a server and a client.

Not all of these commands are required. POP3 servers must support some authentication/authorization command, but no particular command is defined as required. This means that, technically, a POP3 server could use the USER/PASS combination, the APOP command, or even something else. However, in practice at least the USER/PASS commands should be implemented. Other optional commands include TOP and UIDL. Again, in theory, one could implement a POP3 server without them, but in practice they are useful because most POP3 systems also implement them and because building alternative commands to perform the same functions is counterproductive for interoperability.
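The relationship between commands, states, and status indicators can be summarized in a few lines of Python. This is only a sketch with made-up names (`VALID_IN`, `is_valid`, `parse_status`); the grouping of commands by state follows RFC 1939, with QUIT valid in both states.

```python
AUTHORIZATION = "AUTHORIZATION"
TRANSACTION = "TRANSACTION"

# Commands from Table 11.1, grouped by the state in which they are valid.
VALID_IN = {
    AUTHORIZATION: {"USER", "PASS", "APOP", "QUIT"},
    TRANSACTION: {"STAT", "LIST", "RETR", "DELE", "NOOP",
                  "RSET", "TOP", "UIDL", "QUIT"},
}


def is_valid(command, state):
    """Command keywords are case insensitive, so compare in uppercase."""
    return command.upper() in VALID_IN[state]


def parse_status(line):
    """Split a response into (succeeded, rest); indicators must be uppercase."""
    indicator, _, rest = line.partition(" ")
    if indicator not in ("+OK", "-ERR"):
        raise ValueError(f"not a POP3 status indicator: {indicator!r}")
    return indicator == "+OK", rest


if __name__ == "__main__":
    print(is_valid("user", AUTHORIZATION), is_valid("RETR", AUTHORIZATION))
    print(parse_status("+OK POP3 server ready"))
```

Because only the status indicator is standardized, a client should base its decisions on the first element of the result and treat the rest of the line as text for humans.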
Using POP3
POP3 servers listen to TCP port 110 once the service is up and running. The session begins when a client attempts to open a TCP circuit on the server's port 110. POP3 commands are case-insensitive keywords and may be followed by one or more arguments. The keywords are separated from the arguments, if they exist, by a single space, and the commands themselves are terminated by the CRLF pair. POP3 commands must consist entirely of printable ASCII characters; the command keywords are all either three or four characters long, and each argument may be as many as 40 characters long. Each POP3 session progresses through various states from the time it opens to the time it is terminated. The POP3 client drives the session, submitting commands to the server. The server responds to commands, but only to commands.
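The syntax rules in the paragraph above translate directly into a small parser. The Python sketch below is an illustration under those stated rules only; the function name `parse_command` is hypothetical, and it does not check whether a keyword is a real POP3 command.

```python
def parse_command(raw: bytes):
    """Parse one POP3 command line into (KEYWORD, [arguments])."""
    if not raw.endswith(b"\r\n"):
        raise ValueError("command must be terminated by CRLF")
    # Non-ASCII bytes raise UnicodeDecodeError, a subclass of ValueError.
    text = raw[:-2].decode("ascii")
    if not all(32 <= ord(ch) < 127 for ch in text):
        raise ValueError("command must be printable ASCII")
    keyword, *args = text.split(" ")  # single spaces separate the parts
    if len(keyword) not in (3, 4):
        raise ValueError("keyword must be three or four characters long")
    if any(len(arg) == 0 or len(arg) > 40 for arg in args):
        raise ValueError("each argument must be 1 to 40 characters long")
    return keyword.upper(), args


if __name__ == "__main__":
    print(parse_command(b"user pete\r\n"))
    print(parse_command(b"TOP 1 10\r\n"))
```

Normalizing the keyword to uppercase reflects the rule that commands are case insensitive, while the arguments are returned untouched because mailbox names and passwords may well be case sensitive.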
POP3 Session Initiation

The POP3 session is initiated when the client starts the TCP circuit to the server's port 110. At this point, the server opens the session by sending a positive response (assuming that the server can open the session). This response can include a one-line greeting and may look like this one, which is taken from RFC 1939:
S: <wait for connection on TCP port 110>
C: <open connection>
S: +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>
As before, the server's output or action is identified by the string S: in this interchange and the client's by the string C:. Once the server has responded in this way, the POP3 session enters the AUTHORIZATION state.
POP3 Authorization State

In the AUTHORIZATION state, the POP3 server waits for the client to send authorization information. RFC 1939 defines two mechanisms for user authorization; we discuss two others later in this chapter. The first mechanism is the USER/PASS pair of commands. The sequence looks something like this:
C: USER pete
S: +OK pete likes spinach
C: PASS swordfish
S: +OK pete's maildrop has 12 messages (21320 octets)
The USER command includes as an argument a string that identifies the mailbox. The USER command is valid only when it follows immediately after the session has been opened or after a failed PASS command. This allows the client to retry with a different mailbox name (or with the same one). The PASS command can only be submitted when the session is in the AUTHORIZATION state and only when it is immediately preceded by the USER command. The PASS command includes as its argument a string representing a password specific to the server and to the mailbox identified in the USER command.

In this exchange, everything went according to plan. The server acknowledged knowing the user specified, pete (any kind of string can be used in the positive response to the USER command). The client followed the successful USER command with the PASS command. The server replied with a positive response and a string indicating that the maildrop (the messages stored by the server for the mailbox) in question contains 12 messages, amounting to 21,320 bytes in total. As with the response to the USER command, the extra information provided with the positive response is arbitrary and optional.

When the USER/PASS authorization is completed, the server locks the mailbox. Nothing that is in there can be modified, and nothing can be added or removed. Because the mailbox is locked, the client can download everything that was in there at the moment the session was started, but nothing more or less. If the server started to receive a message for that mailbox after it was locked, the message would be held in a cache or buffer somewhere until the client session is complete, and then it would be added to the mailbox. This means that the client must initiate another session to get that just-sent message, but it also means that the server's file system does not have to contend with a client reading and a server writing to the same database at the same time.
What happens when the USER mailbox name is not known at the server or the password indicated in the PASS command is invalid? The server is not obligated to tell a client whether the USER mailbox name is valid: the server may respond positively even if the mailbox name does not exist, and then fail the authorization attempt after the password is submitted. This is a more secure approach, as it prevents attackers from trying different mailbox names at random to find an actual mailbox and then attempting to break into it. However, the server may respond to a bad mailbox in the USER command with a negative response such as:
-ERR never heard of mailbox name
After the PASS command, the server may either respond positively (as noted above) or negatively in some way. The error may indicate that the password is invalid, which may hide the fact that the mailbox does not exist, or it may indicate that the mailbox in question is already locked.
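The USER/PASS rules described above can be sketched as a small server-side handler. This is only an illustrative model, not code from RFC 1939: the class name, the password table, the reveal_unknown switch, and the response texts are all assumptions made for the example.

```python
class AuthorizationState:
    """Toy model of USER/PASS handling in the AUTHORIZATION state (not a real server)."""

    def __init__(self, passwords, reveal_unknown=False):
        self.passwords = passwords      # hypothetical {mailbox: password} table
        self.reveal_unknown = reveal_unknown
        self.locked = set()             # maildrops already locked by other sessions
        self.pending_user = None        # set only by the most recent accepted USER
        self.authorized = False

    def handle(self, line):
        command, _, argument = line.partition(" ")
        if command.upper() == "USER":
            return self._user(argument)
        if command.upper() == "PASS":
            return self._pass(argument)
        self.pending_user = None
        return "-ERR command not handled in this sketch"

    def _user(self, name):
        self.pending_user = None
        # A server may admit that it does not know the mailbox...
        if self.reveal_unknown and name not in self.passwords:
            return "-ERR never heard of mailbox name"
        # ...or accept any name and let the PASS command fail later.
        self.pending_user = name
        return "+OK send PASS"

    def _pass(self, password):
        # PASS is honored only immediately after an accepted USER.
        name, self.pending_user = self.pending_user, None
        if name is None:
            return "-ERR PASS must follow a successful USER"
        if self.passwords.get(name) != password:
            return "-ERR invalid password"
        if name in self.locked:
            return "-ERR maildrop already locked"
        self.locked.add(name)           # lock the maildrop for this session
        self.authorized = True
        return "+OK maildrop locked and ready"
```

With the default setting, an unknown mailbox and a bad password produce the same failure after PASS, which is the more secure behavior described above.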
APOP
APOP is an optional POP command defined in RFC 1939 for POP3 authentication. We discuss security issues, including the APOP command, as well as protection of passphrases and cryptographic algorithms, at greater length in Chapter 17, Internet Messaging Security.
QUIT
Once the client has been authenticated and the mailbox locked, the session moves into the TRANSACTION state. If the client cannot be authenticated or if the mailbox requested is locked, the server may close the session on its own. The client may also close the session in this state by submitting the QUIT command. QUIT is valid both in the AUTHORIZATION state and in the TRANSACTION state. When the command is issued during authorization, the server simply closes the session. QUIT takes no arguments; we discuss QUIT again later.
POP3 TRANSACTION State

The client has provided enough authorization to get the server to recognize and accept it. The mailbox itself has been opened for the client and locked from any changes. The POP3 session is now in the TRANSACTION state. Now, the client can submit TRANSACTION state commands to retrieve information about the maildrop, retrieve messages, or delete messages. Using a sample POP3 session, we discuss the different commands and possible responses to them. The example listed below was adapted from RFC 1939 and can be considered to continue from the point above where the server responded with the message +OK pete's maildrop has 12 messages (21320 octets).
POP3 AUTHORIZATION AND SECURITY
You may have noticed that the USER/PASS commands are not terribly secure. In fact, they are terribly insecure. Anyone with a sniffer on the wire can capture mailbox names and passwords and use them with impunity. On the other hand, USER/PASS is easy to implement and, in certain environments, could be a perfectly serviceable solution. The APOP command defined in RFC 1939 offers a more secure option, as do some other commands that we discuss later in this chapter.

C: STAT
S: +OK 12 21320
C: LIST
S: +OK 12 messages (21320 octets)
S: 1 1220
S: 2 2300
S: 3 240
S: 4 5580
[and so on until the end]
S: 11 2040
S: 12 1830
S: .
C: RETR 1
S: +OK 120 octets
S: <the POP3 server sends message 1>
S: .
C: DELE 1
S: +OK message 1 deleted
C: RETR 2
S: +OK 200 octets
S: <the POP3 server sends message 2>
S: .
C: DELE 2
S: +OK message 2 deleted
[and so on until all 12 messages have been retrieved]
STAT
The first command in this sequence is STAT. STAT is a request for maildrop statistics. The only defined response to the STAT command is a positive indicator followed by two numbers. The first number indicates the number of messages in the maildrop; the second indicates the size in bytes (octets) of the entire maildrop. In this example, the maildrop contains 12 messages, and the total size of those messages is 21,320 bytes.
LIST
The next command, LIST, causes the server to provide a scan listing for the messages in the maildrop. The scan listing only has to contain two pieces of information: the message number and the size of the message in bytes. The message number in this context is not a unique identifier but simply an ordering of the messages currently in the maildrop, starting from 1. As shown in the sequence above, the server responds to a LIST command with no argument by first summarizing the maildrop, indicating that there are 12 messages totaling 21,320 bytes. Then the server sends a separate line for each message in the maildrop; the server indicates it is finished by sending a single period (.) and CRLF after the last scan listing.
The LIST command can also be used with an argument to indicate a specific message for which a scan listing is desired. In that case, a positive response carries the scan listing for that one message. When LIST is issued without an argument and there are no messages in the maildrop, the positive response is simply followed by a period and CRLF. The server responds negatively if the LIST argument references a message that does not exist. This would happen, for example, if the command LIST 4 was submitted and the server had only three messages in the maildrop.
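A client has to turn the STAT and LIST responses into numbers before it can decide what to retrieve. The following is a minimal sketch of that parsing, assuming the response lines have already been read from the connection and stripped of their CRLF; the function names are hypothetical.

```python
def parse_stat(line):
    """Return (message_count, maildrop_octets) from a STAT response line."""
    if not line.startswith("+OK"):
        raise ValueError(f"STAT failed: {line!r}")
    _, count, size = line.split()[:3]
    return int(count), int(size)


def parse_scan_listings(lines):
    """Map message number to size from the lines that follow LIST's +OK line."""
    listings = {}
    for line in lines:
        if line == ".":                 # a lone period ends the listing
            break
        number, size = line.split()[:2]
        listings[int(number)] = int(size)
    return listings
```

For the sample session, parse_stat("+OK 12 21320") yields 12 messages and 21,320 octets, and the scan listings become a dictionary keyed by message number.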
RETR
Once the client knows how many messages there are and what their numbers are, it can begin retrieving them with the RETR command. This command is issued with a required message number argument. If there is no such message in the maildrop, the server response is a negative one (such as -ERR no such message). If the message exists, the server responds with a positive response followed by the message, one line at a time. The first line of the response may indicate that the message follows or may include additional information such as the message number and size. The end of the message is signaled by a single period on a line by itself. This is how POP3 ends every multiline response, as in the scan listings returned by LIST.
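Reading a retrieved message therefore means collecting lines until the lone period arrives. The sketch below shows one way to do that; the function name is hypothetical. It also undoes "byte-stuffing," a detail that the text above does not cover but that RFC 1939 specifies: a message line that itself begins with a period is sent with an extra leading period so that it cannot be mistaken for the terminator.

```python
def read_multiline(lines):
    """Collect the body of a multiline POP3 response, stopping at the lone period."""
    body = []
    for line in lines:
        if line == ".":                 # terminator: end of the response
            return body
        if line.startswith("."):
            line = line[1:]             # undo byte-stuffing (RFC 1939, section 3)
        body.append(line)
    raise ValueError("response ended before the terminating period")
```

The function takes any iterable of CRLF-stripped lines, so the same code can serve RETR, LIST, and other multiline responses.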
DELE
The server does not normally delete messages unilaterally. In general, even if a client downloads a message, the message is retained in the maildrop on the server. Exceptions may occur with servers that have policies pertaining to how long messages are retained. A server may automatically remove messages that linger too long, but the client usually explicitly deletes the message. The DELE command may be issued immediately after the message has been downloaded, but this is not always the case. One function of modern email clients allows the user to determine whether the messages are to be removed from the POP3 server after retrieval. Leaving them on the server allows users to retrieve messages from more than one location and from more than one computer. For example, a mobile user might download some or all of her messages to a laptop while on the road, but not delete them from the server. Then, when she returns to her home base, she can download the messages onto her desktop system and delete the messages from the server only at that time. The DELE command must be issued with a message number as its argument. When the server receives this command, it marks the message as deleted, but the message is not actually deleted until the POP3 session is in the UPDATE state. If the DELE command uses a message number that indicates a message that has already been deleted, the server responds negatively; otherwise, the server sends a positive response indicating that the message will be deleted.
NOOP
The NOOP, or "no operation," command does nothing but elicit a positive response from the server. As with the similar command in SMTP, the POP3 NOOP command is a sort of "are you still there?" query.
RSET
The RSET command causes the server to unmark any messages that have been marked for deletion. The server does not have the option of responding negatively to this command, but must always honor it.
QUIT
As mentioned earlier, QUIT is a valid command both during the AUTHORIZATION state and the TRANSACTION state. However, during the TRANSACTION state, QUIT does not terminate the session entirely but rather advances the session to the UPDATE state.
POP3 UPDATE State

The session enters the UPDATE state after one or more mail transactions have been concluded and the client sends the server a QUIT command. For example:
C: QUIT
S: +OK dewey POP3 server signing off (maildrop empty)
C: <close connection>
S: <wait for next connection>
Upon receiving the QUIT command, the server removes any messages from the maildrop that have been marked for deletion and responds to the command. If all messages were removed without any problem, the server responds with a positive reply; if for some reason not all the messages were removed, the server responds with a negative reply indicating that not all messages were removed. At that point, the server releases the lock on the maildrop so that other authorized users (if any) can access it and new messages can be added to it. Then, the server terminates the TCP circuit. These steps happen whether or not the messages were removed from the maildrop. This is essentially what happens when the QUIT command is issued at any time in the POP3 session; however, if a client sends QUIT during the AUTHORIZATION state, the maildrop has not yet been opened and there are no messages marked for deletion.
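The interplay of DELE, RSET, and QUIT can be sketched with a small in-memory model. This is an illustration under simplifying assumptions (messages held in a dictionary, invented response texts, no locking), not a server implementation; the class name is hypothetical.

```python
class Maildrop:
    """Toy maildrop showing mark-on-DELE, unmark-on-RSET, remove-on-QUIT."""

    def __init__(self, messages):
        # Message numbers start at 1 and stay fixed for the whole session.
        self.messages = dict(enumerate(messages, start=1))
        self.marked = set()

    def dele(self, number):
        if number not in self.messages or number in self.marked:
            return "-ERR no such message"
        self.marked.add(number)         # only marked; nothing is removed yet
        return f"+OK message {number} deleted"

    def rset(self):
        self.marked.clear()             # unmark everything
        return "+OK"

    def quit(self):
        # UPDATE state: the marked messages are actually removed now.
        for number in self.marked:
            del self.messages[number]
        self.marked.clear()
        return "+OK signing off"
```

Nothing disappears from the maildrop until quit() runs, so a session that ends any other way leaves every message in place.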
POP3 Authentication and Authorization

As we noted earlier, POP3 authentication and authorization mechanisms defined in RFC 1939 are not entirely secure. Although the RFC authors noted
that sending plain text user names and passwords sporadically might not pose security problems, any such transmission is vulnerable to sniffer attacks. RFC 1734 defined a POP3 AUTH, or authentication, command. This command allows the client to specify a mechanism to be used for authentication. In other words, the client can specify some stronger authentication mechanism than those defined in RFC 1939 (USER/PASS or APOP). The AUTH command takes a single argument, identifying the authentication mechanism to be used. This specification was updated in RFC 2195, IMAP/POP AUTHorize Extension for Simple Challenge/Response. The same authorization mechanism can be used for either POP or IMAP, as we see in Chapter 12, Internet Message Access Protocol. RFC 2195 points out that IMAP and POP lack strong authentication mechanisms that are not flawed. The mechanisms either pass name/password pairs in the clear over the network or else require extensive security infrastructure support to be built into the servers. RFC 2195 defines a challenge/response mechanism that is similar to APOP. It is intended for use with IMAP but can, in principle, be used for POP as well. With POP/AUTH, authentication mechanisms stronger than APOP and USER/PASS can be specified and used, as long as both the client and server support them. We revisit the IMAP/POP AUTH issue in Chapter 12.
POP3 Extensions
Although RFC 2449, POP3 Extension Mechanism, defines a mechanism for defining extensions to POP3, it also warns that such a mechanism is not intended as a license to extend POP3. POP3 is a simple protocol and is intended to stay simple: POP3 is for downloading and deleting messages from a maildrop, and the extension mechanism should not be used to extend the protocol beyond that objective. However, as we've seen, POP3 has some options (specifically relating to authentication), and POP3 as defined in RFC 1939 provided no mechanism for a client to find out what options a server supports. The only way to find out whether a server supports TOP, for example, is to send a TOP command.
The one exception to the rule is APOP: Clients can identify servers that support APOP by looking at the response sent by the server when the session begins. If that response contains the required msg-id, with timestamp, the client can assume the server supports APOP. If some other value is returned with the session initiation response, then the client can assume that the server does not support APOP.
RFC 2449 defines a new command, CAPA, for clients to check out server capabilities. It also lists the valid and current optional capabilities that can be returned by the CAPA command. Furthermore, it defines extensions to the POP3 error messages to allow the messages to carry data that can be easily parsed by client software.
The CAPA command takes no arguments and can be used in any state. If the server doesn't support CAPA, it returns a negative response. If the server does support the command, it responds with a positive indicator followed by a list of capabilities, one to a line, followed by a single period to indicate completion. A sample CAPA command and response interaction, adapted from RFC 2449, is shown here:
C: CAPA
S: +OK Capability list follows
S: TOP
S: USER
S: SASL CRAM-MD5 KERBEROS_V4
S: RESP-CODES
S: LOGIN-DELAY 900
S: PIPELINING
S: EXPIRE 60
S: UIDL
S: IMPLEMENTATION Generic-POP-v201
S: .
In this example, all the initial POP3 capabilities are supported. The next section describes these capabilities.
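A client can turn a capability list like the one above into a lookup table before deciding which optional commands to use. This is a minimal sketch, assuming the lines after the +OK indicator have already been read and stripped of CRLF; the function name is hypothetical.

```python
def parse_capa(lines):
    """Map each CAPA tag to its list of arguments (empty if it has none)."""
    capabilities = {}
    for line in lines:
        if line == ".":                 # a lone period ends the capability list
            break
        if not line.strip():
            continue                    # ignore blank lines defensively
        tag, *arguments = line.split()
        capabilities[tag.upper()] = arguments
    return capabilities
```

With the sample response, the client could check "PIPELINING" in capabilities before sending commands in batches, or read capabilities["SASL"] to see which mechanisms are offered.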
Initial POP3 Capabilities

The capabilities returned in response to a CAPA command represent optional POP3 commands or POP3 server behaviors. Some values may include arguments. When these capability values are part of a response to a CAPA command, they are called CAPA tags and refer to specific functions supported or server behaviors as described below.
TOP. This tag indicates that the server supports the TOP command.
USER. This tag indicates that the server supports the USER and PASS commands.
SASL. This tag indicates that the server supports the Simple Authentication and Security Layer (SASL) documented in RFC 2222 (we return to this specification in Chapter 17). SASL provides mechanisms for authentication. This tag may take arguments that indicate which SASL mechanisms for authentication are being used.
RESP-CODES. This tag indicates that server response text following an open square bracket ([) should be considered an extended response code (see the next section on extended response codes).
LOGIN-DELAY. This tag indicates the minimum number of seconds allowed between logins. This tag takes as its argument the number of seconds. Some servers limit the frequency of POP3 sessions because opening a session can be resource intensive. When returned in response to a CAPA command issued during the AUTHORIZATION state, this tag may be followed by the text USER to indicate that the delay may differ depending on who the user is (although the argument in this case will be the largest value that any user could get with that server). The presence of the text USER means that the CAPA command must be reissued after the AUTHORIZATION state is complete.
PIPELINING. This tag indicates that the server can accept more than one command at one time. This option is similar to the pipelining option available with ESMTP (see Chapter 10), allowing the client to submit two or more commands without having to wait for the server to respond to the first command. Clients that support pipelining must be able to keep track of which commands they have issued and which have been responded to by the server.
EXPIRE. This tag lets the server notify the client of its message retention policies. The EXPIRE tag takes as its argument either a number indicating the minimum number of days that the server will retain messages or the word NEVER. As with LOGIN-DELAY, the EXPIRE tag may appear with the word USER during the AUTHORIZATION state to indicate that some users have different retention limits than others. The value displayed in that case will be the minimum retention period for any user of that server. An EXPIRE 0 tag indicates that the client cannot leave any messages on the server, while an EXPIRE NEVER means that the server claims it will never delete messages.
UIDL. This tag indicates that the UIDL command is supported.
IMPLEMENTATION. This tag allows the server to identify, as an argument to the tag, the server implementation.
In the example provided above, the implementation is identified as Generic-POP-v201. The implementation token should contain no spaces, since those are normally used as token delimiters. The implementation may be announced in the welcome banner as the session starts, but also as an argument to this tag.
This extension mechanism is optional, and clients must be able to interact with servers that support only the basic required POP3 commands. Any additional extensions are likely to be allowed only if they are shown to be highly useful while at the same time not extending the scope of POP3 beyond the basic function of downloading and deleting messages. This means that POP3 extensions will not be added to support mailbox directories, persistent mailbox stores, message uploading, or any other new function. IMAP, to be discussed in Chapter 12, is designed to provide greater functionality, while POP3 is intended to provide a very basic set of functions.
Extended POP3 Response Codes

As we saw earlier, the server responds to commands with a simple yes (+OK) or no (-ERR). As we also saw, the server may include additional text after the response indicator, but this verbiage was not documented or specified as part of the standard in RFC 1939. As a result, some POP3 clients attempt to parse and decipher the strings in these responses to remedy error situations. For example, an authorization may fail because the mailbox is currently locked or because the login was bad (incorrect password or mailbox name). If the client software can determine which situation is in effect, it can ask the user to close any other POP3 sessions or try later, or it can try using a different password and/or mailbox name.
RFC 2449 defines a mechanism for adding standard, machine-parsable data to server responses, as well as two POP3 response codes to be used when a login attempt fails. Response codes follow the response indicator and are enclosed in square brackets. The two defined response codes are:
LOGIN-DELAY. This code is returned with the -ERR response indicator after a failed AUTH, USER, PASS, or APOP command. It means that the login-delay period has not expired yet, so the user will not be permitted to log in again until that time has passed. If this code is returned for a USER command, it reveals that that user name exists on the server without authenticating the entity requesting the session. This may be a potential security risk if user names are not otherwise publicly available.
IN-USE. This code is returned with an -ERR response indicator after a failed AUTH, APOP, or PASS command and indicates that the maildrop is currently in use, probably by another POP3 client.
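With response codes, a client no longer has to guess at free-form text. The sketch below splits a response line into its indicator, optional bracketed code, and text, and then picks a recovery step along the lines described above. The function names, the regular expression, and the wording of the suggested actions are assumptions made for illustration.

```python
import re

# Response indicator, optional [CODE], optional human-readable text.
RESPONSE = re.compile(r"^(\+OK|-ERR)(?: \[([^\]]+)\])?(?: (.*))?$")


def parse_response(line):
    """Return (succeeded, response_code_or_None, text) for one response line."""
    match = RESPONSE.match(line)
    if match is None:
        raise ValueError(f"not a POP3 response: {line!r}")
    indicator, code, text = match.groups()
    return indicator == "+OK", code, text or ""


def suggest_action(line):
    """Pick a recovery step for a failed login based on the response code."""
    succeeded, code, _ = parse_response(line)
    if succeeded:
        return "proceed"
    if code == "IN-USE":
        return "close other POP3 sessions or try later"
    if code == "LOGIN-DELAY":
        return "wait for the login delay to pass"
    return "ask for a different password or mailbox name"
```

A server that does not advertise RESP-CODES may still put brackets in its text, so a cautious client would apply this parsing only after seeing that capability.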
Reading List
Table 11.2 contains some of the RFCs of interest in this chapter.
Essential Email Standards: RFCs and Protocols Made Practical
Table 11.2 Some POP3 RFCs
RFC                TITLE                                                       DESCRIPTION
RFC 1734           POP3 AUTHentication command
RFC 1939 (STD 53)  Post Office Protocol Version 3                              Defines the POP3 protocol, but ABNF syntax for POP3 is provided in RFC 2449 in addition to its definition of an extension mechanism for POP.
RFC 2195           IMAP/POP AUTHorize Extension for Simple Challenge/Response
RFC 2449           POP3 Extension Mechanism
RFC 2222           Simple Authentication and Security Layer (SASL)
CHAPTER
12
Internet Message Access Protocol (IMAP)
POP3 is a simple and straightforward protocol for downloading and deleting messages, but the Internet Message Access Protocol, version 4, revision 1 (IMAP4rev1 or just plain IMAP) is a much more complex protocol that supports more complex messaging functions. IMAP allows clients to access and manipulate messages on a server in the same way that users would normally read, store, copy, and delete messages on a local mailbox. IMAP also offers the capability of synchronizing an offline client and a server, along with many other helpful functions.
IMAP gives users many of the same features previously found only in proprietary network email systems: Users can access their mailboxes from any client system. IMAP covers all the bases, with support for creating, deleting, and renaming mailboxes as well as checking for new messages, searching the mailbox, selectively fetching messages, and more. However, IMAP is for getting and playing with mail, not for submitting. As we've seen, SMTP is the protocol of choice for sending messages, and POP and IMAP are the protocols of choice for receiving messages.
IMAP is considerably more complex than POP, and this is worth repeating. While RFC 1939 documents POP3 in 23 pages, IMAP takes 82 pages in RFC 2060, Internet Message Access Protocol - Version 4rev1. Though extensions and supporting mechanisms are defined for POP3 in a handful of other RFCs (see Chapter 11, Post Office Protocol (POP)), IMAP has more than a dozen related RFCs. Rather than attempt to summarize in this chapter all of the features, functions, commands, and responses that make up IMAP as we've been doing for other protocols, we instead provide an overview of the protocol and summaries of the relevant specifications.
IMAP4rev1 Protocol Overview

IMAP is defined in RFC 2060 and at least 10 more RFCs. In this section, we first look at the email models that IMAP4 can support: online, offline, and disconnected use. This discussion is based on RFC 1733, Distributed Electronic Mail Models in IMAP4. The next sections discuss, in turn, IMAP functionality and IMAP4 commands and replies. These sections introduce the basic elements of the protocol without going into too many details. The last part of this chapter discusses the rest of IMAP, that is, the pieces of the protocol that are defined in separate RFCs.
IMAP, according to RFC 2060, "includes operations for creating, deleting, and renaming mailboxes; checking for new messages; permanently removing messages; setting and clearing flags; [RFC-822] and [MIME-IMB] parsing; searching; and selective fetching of message attributes, texts, and portions thereof." These operations complicate matters considerably when compared with the functions defined for POP; consider also that IMAP supports these functions not just for individual clients but for clients interacting with remote servers and for servers that may be interacting with multiple clients using the same mailboxes.
IMAP Electronic Mail Models

RFC 1733 is an informational document that discusses three different models (offline, online, and disconnected use) that can be used for client/server email. It also discusses how these models relate to IMAP.
The offline model is what POP uses: The client makes a connection to a server every once in a while to download messages, but otherwise, message management takes place on the client, away from the server. Once the message is downloaded, the server is no longer concerned with it.
The online email model is more common with proprietary messaging systems: All messages are stored on the server. The client may access them through the network, but the client is used only for display purposes. Messages are never stored on the client, only on the server, and if the connection between the client and server is broken, the client can no longer be used to view or manage mail.
The disconnected-use model takes on characteristics of both the online and offline models. Disconnected-use email systems allow the client to download some portion of the message store from the server. Those messages can be modified locally by the client and then, at a later time, the client can upload the changes to the server to synchronize the server's message store.
Each approach has advantages and disadvantages. Offline messaging means that you can access messages only from a single client (though, as we saw in Chapter 11, POP provides a very primitive workaround mechanism that allows clients to download messages without deleting them). Offline messaging can eventually place a storage burden on the client, particularly as the number of messages that a person receives and stores increases. On the other hand, offline messaging works well for organizations that can't or don't want to support very large data stores for clients and can't or don't want to support long messaging sessions. Offline message servers can be relatively lean since they needn't store terabytes of messages on behalf of clients. Also, connect time with clients is usually strictly limited to the amount of time necessary to download the latest mail drop.
Online messaging means that you can read mail only when you have a connection, which is a drawback, but online messaging also means that you can read the same email regardless of where you are logging in. Other drawbacks to online messaging include the need for the client/server connection to persist as long as the user wants to read or manage email and a possibly higher demand for bandwidth, especially if the user reads messages more than once. Since the messages are stored on the server, every time the user requests a particular message, it must be transmitted across the network. In offline messaging systems, on the other hand, the message is downloaded once and can be viewed any number of times on the local client system without increasing demand for network bandwidth.
Depending on how the online messaging model is implemented, bandwidth may also be saved by allowing selective downloading of messages. This is how IMAP was designed to work, but IMAP can also support the other messaging models described here.
The disconnected-use model functions like the offline model, downloading information from the server and allowing a client to operate on email independent of any server connection. However, its goal is to support the same functions supported by online systems. In this model, the client downloads some or all of the message store, modifies it, and then notifies the server (which acts as the canonical message store) of any changes made to the message store for synchronization.
As we saw in Chapter 11, the offline model is clearly the least complicated approach. The online approach is slightly more involved, requiring implementers to build an application that operates at a distance: The client's commands must be properly interpreted and executed by the remote server. Using the disconnected-use model to build an email system is the most difficult approach to take. Not only must client commands be accepted, but the server and client must be able to accurately and reliably synchronize their stores.
Table 12.1, adapted from RFC 1733, highlights the differences between the offline, online, and disconnected-use email models. Because IMAP was designed to accommodate any of these models, IMAP clients can mix and match. A user can use the same IMAP client to access messages from different types of email systems, seamlessly. By design, an IMAP client can be configured to download all, part, or none of the messages waiting for delivery. Thus, an IMAP client connecting to a server over a low-bandwidth link might download message headers only and allow the user to choose which messages to download, or the client might download all messages but no attachments. IMAP provides much latitude in the way the email account is configured.
Table 12.1 Email Messaging Model Features
FEATURE                                OFFLINE  ONLINE  DISCONNECTED USE
Uses multiple clients                  No       Yes     Yes
Minimum use of server connect time     Yes      No      Yes
Minimum use of server resources        Yes      No      No
Minimum use of client disk resources   No       Yes     No
Multiple remote mailboxes              No       Yes     Yes
Fast startup                           No       Yes     No
Mail processing when not online        Yes      No      Yes
IMAP Functionality
With IMAP, the client need not download the entire message store every time it connects to the server. Instead, the client can pick and choose what data is to be transmitted and when that data is to be transmitted. Since IMAP supports server-based message parsing and processing, the client can specify in advance what kind of messages to download and what parts of the message to download.
As defined in RFC 2060, IMAP expects a reliable transport layer circuit, such as that provided by TCP. IMAP servers listen to port 143 for client requests. IMAP connections start when the client and server establish a connection and go through a series of four different states (to be discussed later). Any client can connect to a server; there is no labeling of clients to link a particular client with a particular mailbox. Client commands and server responses take the form of CRLF-terminated lines, just like POP and SMTP.
IMAP clients can submit more than one command at a time, without waiting for the server to respond to the first command. Clients tag each command with a unique identifier, and when responding to a command, the server uses the same tag to indicate the command to which the reply refers.
IMAP messages are also identified between the server and the client, both by a unique identifier (unique within the mailbox) and by a sequential identifier. The unique identifier is assigned within the mailbox, and although this value is not strictly sequential (its value is not incremented by one for each new message), it does have an ordering. Each successive message gets a higher message identifier value than the previous one. The message sequence number is assigned in a strict sequential ordering. For example, three messages, in order, might be given unique identifiers of 3, 17, and 44. Those same three messages would, respectively, be given the sequence numbers of 1, 2, and 3. Add a fourth message, call it number 57, and that message receives the sequence number 4. Remove message number 17, and message 57 receives a new sequence number value (3). As we see later, the actual message identifiers and sequence numbers take slightly different forms.
The client submits commands, and the server answers each command with one of three completion responses, which carries the tag of the command it answers. Servers may also send unsolicited responses that are not replying to any particular command; in these cases, the responses do not bear tags. The server response may be a status response, server data, or a command continuation request. Server completion responses, which are server communications that contain tags (that is, they are responding to a specific command), fall into three categories: OK, NO, and BAD. The OK response indicates that the command completed successfully. The NO response indicates an error condition related to the command execution (for example, the command may not be permitted because a disk is full).
The BAD response indicates an error condition related to the protocol (for example, the command was issued without a required argument or with an invalid argument). Command tags and message identifiers add significant complexity, as do the three different possible server completion responses, but the complexity is necessary if IMAP is to be able to support the online and disconnected-use email models. The rest of this section goes into more detail about how IMAP works. The next two sections highlight the actual IMAP protocol commands and completion responses.
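Command tagging and the three completion responses can be sketched in a few lines. The tag format (a letter followed by a counter) and the function names are assumptions made for this example; IMAP only requires that each outstanding command carry a distinct tag. The sketch also relies on two details from RFC 2060 that are not spelled out above: untagged server lines begin with "*", and command continuation requests begin with "+".

```python
import itertools


def tag_generator(prefix="A"):
    """Yield a fresh tag for each command: A0001, A0002, ..."""
    for number in itertools.count(1):
        yield f"{prefix}{number:04d}"


def parse_response(line):
    """Return (tag, status, text); tag and status are None for untagged lines."""
    first, _, rest = line.partition(" ")
    if first in ("*", "+"):             # untagged data or continuation request
        return None, None, rest
    status, _, text = rest.partition(" ")
    if status.upper() not in ("OK", "NO", "BAD"):
        raise ValueError(f"not a completion response: {line!r}")
    return first, status.upper(), text
```

A client that has several commands outstanding can keep a dictionary keyed by tag and use the tag returned by parse_response to find the command that a completion response answers.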
IMAP States
IMAP defines four different states in which a server can exist. Figure 12.1, from RFC 2060, shows a state diagram for an IMAP server. IMAP commands are generally valid only in the appropriate states. When a client submits a command that does not match the current server state, the server responds with a failed completion response (either BAD or NO). Technically, submitting a command in an inappropriate state is a protocol error, but some servers return BAD while others return NO. The four IMAP server states are as follows:
Nonauthenticated State. This state occurs after the connection has been initiated but before the client has been authenticated. In the nonauthenticated state, the client must supply some form of authentication credentials before any other commands can be accepted. A connection may be pre-authenticated, in which case this state is skipped; otherwise, all IMAP sessions start out in the nonauthenticated state.
Authenticated State. The session is in the authenticated state any time that a client has been authenticated, but when a mailbox is not selected. This state occurs when a pre-authenticated connection is initiated, after a client has submitted acceptable authentication credentials, and when there has been an error in selecting a mailbox.
Selected State. Once a mailbox has been selected for access by the client, the session is in selected state.
Logout State. The logout state is reached before the connection is terminated. The client can request the session be closed, at which point the logout state is entered, or the server can unilaterally decide to terminate the session and enter the logout state.
As shown in Figure 12.1, three options can occur at the start of the IMAP session: The connection can proceed but authentication must occur (indicated by (1) in the diagram); the connection may be pre-authenticated (indicated by (2) in the diagram); or the connection can be rejected (indicated by (3) in the diagram). We discuss IMAP commands in greater detail later in this chapter, at which point you may wish to refer back to this diagram. For now, just remember that IMAP states define which commands may be submitted by the client.
As might be expected, in the nonauthenticated state only commands pertaining to authenticating the session are permitted (apart from the few commands that are valid in any state). In the authenticated state, mailbox selection and manipulation commands are permitted. In the selected state, commands pertaining to message manipulation are permitted, as are the commands pertaining to mailbox management. We discuss how IMAP commands and responses are exchanged, and then we take a look at the commands and responses themselves. But first, a look at the attributes that IMAP can assign to messages.
IMAP Message Attributes

In addition to identifying commands with tags, IMAP identifies messages with two different types of numbers. IMAP also uses other message attributes in addition to the actual content of the message. The client can retrieve message attributes by themselves or along with the rest of the message. Again, the complexity of the functions supported by IMAP requires additional complexity when dealing with messages as well as commands. There are two message identifiers: the message unique identifier (UID) and the message sequence number. The IMAP flags attribute is used to indicate a status for the message, as is discussed later. Other attributes can be used to indicate what the message looks like, when it was received, and how it is structured.

+--------------------------------------+
|initial connection and server greeting|
+--------------------------------------+
          || (1)       || (2)        || (3)
          VV           ||            ||
+-----------------+    ||            ||
|non-authenticated|    ||            ||
+-----------------+    ||            ||
 || (7)   || (4)       ||            ||
 ||       VV           VV            ||
 ||     +----------------+           ||
 ||     | authenticated  |<=++       ||
 ||     +----------------+  ||       ||
 ||       || (7)   || (5)   || (6)   ||
 ||       ||       VV       ||       ||
 ||       ||    +--------+  ||       ||
 ||       ||    |selected|==++       ||
 ||       ||    +--------+           ||
 ||       ||       || (7)            ||
 VV       VV       VV                VV
+--------------------------------------+
|     logout and close connection      |
+--------------------------------------+

(1) connection without pre-authentication (OK greeting)
(2) pre-authenticated connection (PREAUTH greeting)
(3) rejected connection (BYE greeting)
(4) successful LOGIN or AUTHENTICATE command
(5) successful SELECT or EXAMINE command
(6) CLOSE command, or failed SELECT or EXAMINE command
(7) LOGOUT command, server shutdown, or connection closed

Figure 12.1 IMAP state diagram.
IMAP Message UID

Each IMAP message is assigned a 32-bit value, which is combined with a 32-bit unique identifier validity value to create a 64-bit unique identifier (UID) for the message. This value refers to only the single message it is generated for and will never refer to any other message in the IMAP mailbox. UIDs are assigned in an ascending (but not necessarily sequential) order, which means that every new message is given a UID with a higher value than the message preceding it. Each mailbox has a UID validity value, which is reported to the client by the server when the mailbox is selected.
Assigning UIDs that persist from one session to another and in ascending order guarantees that the client can resynchronize its state in the event that a session is terminated prematurely or the client is used to process mail offline. UIDs may be regenerated if a non-IMAP mail user agent is used between IMAP sessions. If such an agent reorders the messages, new UIDs must be generated to maintain the ascending UID values.
RFC 2060 strongly recommends that the UID validity value be a 32-bit representation of the date and time that the mailbox was created. This helps reduce ambiguity that might arise when a mailbox is deleted and a new one is created with the same name. A client looking for the old mailbox might mistakenly open the new mailbox unless it checks that both the validity value and the mailbox name match.
Message UIDs are intended to be persistent. RFC 2060 states that a message UID must not change during a session, but it uses the less stringent "should not" when discussing message UID changes between sessions, because such changes are sometimes unavoidable. In those cases, the new UIDs must be assigned in ascending order relative to the old ones.
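The way the two 32-bit values combine can be sketched as follows. This is a minimal sketch under stated assumptions: placing the validity value in the high 32 bits is a choice made here for illustration (the text says only that the two values are combined), and the function names are hypothetical.

```python
def full_uid(uidvalidity: int, uid: int) -> int:
    """Combine the 32-bit validity value and the 32-bit message UID.

    Assumption for this sketch: the validity value occupies the high
    32 bits and the message UID the low 32 bits.
    """
    for value in (uidvalidity, uid):
        if not 0 <= value < 2 ** 32:
            raise ValueError("value does not fit in 32 bits")
    return (uidvalidity << 32) | uid


def cache_is_usable(cached_validity: int, server_validity: int) -> bool:
    """UIDs cached by a client are meaningful only while the mailbox
    reports the same validity value as when they were cached."""
    return cached_validity == server_validity
```

A client that stored UIDs under one validity value and later sees a different one on SELECT would discard its cached UIDs and resynchronize from scratch.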
IMAP Message Sequence Number

Where the IMAP message UID is a permanent message identifier, the IMAP message sequence number merely indicates the relative position of the message in the mailbox. Thus, the sequence number is a value ranging from 1 to the number of messages in the mailbox; the highest value allowed is always equal to the number of messages in the mailbox. When a message is removed from the mailbox, the sequence numbers of all messages that follow it are decremented by one.
Message sequence numbers are calculated based on the IMAP UID values of the messages. The UID values are assigned in strict ascending order, so the sequence numbers depend on the UID values. If the messages are reordered in the mailbox (by a non-IMAP agent, for example), the sequence numbers must be regenerated based on the new ordering and the new UID values.
Clients can use the message sequence numbers to calculate information about the contents of the mailbox. The server may at times notify the client about the existence of a number of messages. For example, if the server indicates at one point that a message with a sequence number value of 12 exists and then later indicates that a message with a sequence number value of 18 exists, then the client knows that six new messages arrived (and were assigned the sequence number values of 13, 14, 15, 16, 17, and 18).
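The arithmetic described above can be illustrated with a short sketch that models a mailbox as a list of UIDs in ascending order, so that a message's sequence number is simply its position in the list plus one. The list model and the function names are assumptions made for this illustration only.

```python
from typing import List


def sequence_number(mailbox: List[int], uid: int) -> int:
    """Sequence number (1-based position) of the message with this UID."""
    return mailbox.index(uid) + 1


def remove_message(mailbox: List[int], seq: int) -> List[int]:
    """Remove the message at sequence number `seq`.

    Every later message implicitly moves down by one position, which is
    the decrement described in the text.
    """
    if not 1 <= seq <= len(mailbox):
        raise IndexError("no such sequence number")
    return mailbox[:seq - 1] + mailbox[seq:]


def new_sequence_numbers(old_count: int, new_count: int) -> List[int]:
    """Sequence numbers assigned to messages that arrived in between."""
    return list(range(old_count + 1, new_count + 1))
```

With a mailbox holding UIDs 101, 105, 110, and 120, removing sequence number 2 leaves UID 110 at sequence number 2 instead of 3, while its UID is unchanged.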
IMAP Message Flags

IMAP defines a set of tokens that can be associated with each message. These tokens are called flags and may be either permanent or session-only. System flags are predefined and are set off by the backslash character (\). Those defined in RFC 2060 include the following:
\Seen indicates that the message has been read.
\Answered indicates that the message has been answered.
\Flagged indicates that the message is marked (flagged) for some kind of special (or urgent) attention.
\Deleted indicates that the message is marked for deletion and will be removed later using the EXPUNGE command.
\Draft indicates that the message has not been completed and should be marked as a draft.
\Recent indicates that the message arrived recently (a purposefully vague term). This flag appears only during the first session that is aware of the existence of the message; this flag is removed in subsequent sessions. The user cannot alter this flag.
Servers (and clients also, in some cases) are allowed to define additional keywords, though these do not begin with the backslash (\). Permanent flags can be permanently added or removed by the client, while session flags cannot. Session flag changes are valid only during the session in which the change takes place. The one exception is the \Recent flag, which cannot be modified by the user but which appears during only one session and then disappears.
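The distinction between permanent, session-only, and read-only flags can be sketched as a small classifier. This is only an illustration: the function name and the three labels are invented here, and the set of permanent flags is assumed to be whatever the server reported for the selected mailbox.

```python
from typing import Set

# System flags listed above; every system flag begins with a backslash.
SYSTEM_FLAGS = {"\\Seen", "\\Answered", "\\Flagged",
                "\\Deleted", "\\Draft", "\\Recent"}


def classify_flag(flag: str, permanent_flags: Set[str]) -> str:
    """Describe how a client may treat `flag` in the selected mailbox.

    `permanent_flags` is assumed to be the set of flags the server has
    said the client may change permanently.
    """
    if flag == "\\Recent":
        return "read-only"      # maintained by the server, never by the user
    if flag in permanent_flags:
        return "permanent"      # changes persist across sessions
    return "session-only"       # changes last only for this session
```

A keyword such as $Label (a made-up example) is not a system flag because it has no leading backslash, and it is classified the same way as any other flag.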
Other IMAP Message Attributes

IMAP includes other message attributes, which help it support more involved mail-handling functions. These attributes, defined in RFC 2060, include the following:
Internal date message attribute. This attribute contains the date and time when the message was received by the server. For SMTP-delivered messages, this should be (but is not necessarily) the final delivery date and time (as indicated by the RFC 822bis header and defined by RFC 821bis). Some IMAP messages will be delivered using IMAP commands (such as COPY or APPEND), in which case the internal date message attribute will be different.
RFC 822 size message attribute. This attribute indicates the size of the message, in bytes.
Envelope structure message attribute. This attribute contains a parsed representation of the RFC 822 and MFS envelope information.
Body structure message attribute. This attribute contains a parsed representation of the MIME body structure of the message.
Using IMAP Message Attributes

Other basic email protocols do not use message attributes, largely because there is no need. Messages are entities to be passed from one system to another, and the only state required indicates whether the message has been successfully transmitted. With IMAP, the server maintains much more state relating to messages and to the mailboxes in which they are stored. Clients submit commands that solicit information about mailboxes as well as about messages; the servers respond with information derived from message attributes. For example, the server uses the \Seen flag to determine how many messages in a mailbox have already been read by a client. The server provides this information to the client so that the client can display the user's mailbox in the same way no matter where the mailbox is being accessed. POP-based clients display this same information, but the information stays with the client: the mailbox and its current state can be viewed only through a single system (say, a desktop PC). IMAP enables any client to access this information as long as it has access to the server. However, this means that IMAP implementations must allow users to keep track of much more information about mailboxes and must also accommodate (reliably) mailboxes to which multiple users have access.
IMAP Protocol Components

The IMAP session begins when the client has successfully opened a TCP connection to the server's port 143. The server signals that the session can begin by sending a greeting through the TCP connection. At this point, the session is in the nonauthenticated state (unless the connection is pre-authenticated), and the client can start sending commands. Each command initiates an operation, and each client command starts with a command identifier prefix. This prefix is called a tag and is usually a short alphanumeric string. Every client command has a different tag.
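One simple way for a client to give every command a different tag is to number them. The sketch below assumes a tag format of one letter followed by a zero-padded counter, which is a common convention and not a requirement stated in the text; the function names are hypothetical, and no quoting of arguments is attempted.

```python
import itertools
from typing import Iterator


def tag_generator(prefix: str = "A") -> Iterator[str]:
    """Yield a fresh tag for each command: A0001, A0002, ..."""
    for n in itertools.count(1):
        yield f"{prefix}{n:04d}"


def build_command(tag: str, name: str, *args: str) -> str:
    """Assemble one command line: tag, command name, arguments, CRLF.

    Arguments are joined as-is; quoting and literals are out of scope
    for this sketch.
    """
    return " ".join([tag, name, *args]) + "\r\n"
```

Calling next() on the generator before each command keeps the tags unique for the life of the session.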
IMAP Commands and Responses

Client commands are usually complete on a single line, with two exceptions. The client may issue a command with a command argument indicating an octet count. In this case, the server counts the number of octets indicated rather than waiting for the CRLF combination to signal the end of the command. The other exception occurs when a command requires server feedback. For example, when the AUTHENTICATE command (see below) is used, the client must wait for feedback from the server before the command can be completed.
When the server receives such commands, it signals to the client when it is ready for the rest of the command. Such server messages are prefixed by the plus sign (+).
Sometimes the server sends data to the client that is not a command completion response. In this case, the data does not get a tag but rather is identified by a prefixed asterisk (*). These are called untagged responses. Server data that may be sent in this way may also be sent by the server in response to some particular command. In either case, the only syntactic difference is the prefix: untagged responses begin with the asterisk, while tagged responses begin with the tag of the command being responded to.
We discuss specific IMAP commands and responses next, but it is worth noting that IMAP specifies a more complex set of interactions between client and server than any of the protocols we have discussed so far and more complex than most of the protocols that are discussed in later chapters.
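The three prefixes described above are enough for a client to sort incoming lines into categories. The following is a minimal sketch; the function name and the returned tuple layout are assumptions made for illustration.

```python
from typing import Optional, Tuple


def classify_line(line: str) -> Tuple[str, Optional[str], str]:
    """Classify one server line as continuation, untagged, or tagged.

    Returns (kind, tag, rest), where tag is None unless the line is a
    tagged response.
    """
    line = line.rstrip("\r\n")
    if line.startswith("+"):
        # Command continuation request: the server is ready for more.
        return ("continuation", None, line[1:].lstrip())
    if line.startswith("* "):
        # Untagged response: not tied to a specific command tag.
        return ("untagged", None, line[2:])
    tag, _, rest = line.partition(" ")
    return ("tagged", tag, rest)
```

A client would typically match the tag of a tagged response against the tags of its outstanding commands to decide which command has completed.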
Commands and State

Table 12.2 contains the commands defined for IMAP in RFC 2060, their descriptions, and the states in which they are valid. Looking at the list, note that as you proceed through the session states, more commands become valid. There are three universal commands; the client may issue the CAPABILITY, NOOP, and LOGOUT commands at any time and in any state. The AUTHENTICATE and LOGIN commands are valid only when the session is in the nonauthenticated state. The commands that manipulate or otherwise interact with mailboxes are valid when the session is in the authenticated state or in the selected state. Mailboxes can be manipulated before a specific mailbox has been selected, but messages can be manipulated only after a mailbox has been selected. Table 12.2 simply lists the IMAP commands defined in RFC 2060 and summarizes their use. One of the most important commands, FETCH, is described next. Implementers and others who need the complete details of the IMAP commands should read RFC 2060 and the most recent Internet-Drafts describing IMAP before attempting to build an IMAP implementation.
Table 12.2 IMAP Client Commands

COMMAND       VALID STATE(S)             DESCRIPTION
CAPABILITY    Any                        Requests a list of server capabilities.
NOOP          Any                        No operation. For polling a connection or to keep the session alive.
LOGOUT        Any                        Requests the server to close the session.
AUTHENTICATE  Nonauthenticated           Specifies an authentication mechanism to be used to open a session. If the server supports the specified mechanism, the authentication process can continue.
LOGIN         Nonauthenticated           Requests authentication for a specified login name and password (sent in plaintext).
SELECT        Authenticated or selected  Specifies a mailbox by name to be selected. Selecting a mailbox is required before the client can access mailbox contents. The server responds with a list of flags that have been defined for the mailbox, the number of messages in the mailbox, the number of those messages with the \Recent flag set, and a UID validity value for the mailbox.
EXAMINE       Authenticated or selected  Just like SELECT, but allows read-only access to the mailbox. The client cannot make any changes to anything in the mailbox, and the mailbox state is retained unchanged.
CREATE        Authenticated or selected  Submits a name for the creation of a new mailbox.
DELETE        Authenticated or selected  Requests that the specified mailbox be deleted.
RENAME        Authenticated or selected  Requests that the specified mailbox be renamed.
SUBSCRIBE     Authenticated or selected  Requests that the specified mailbox be added to the list of active mailboxes for the user. See also LSUB and UNSUBSCRIBE commands.
UNSUBSCRIBE   Authenticated or selected  Requests that the specified mailbox be removed from the list of active mailboxes for the user. See also LSUB and SUBSCRIBE commands.
LIST          Authenticated or selected  Requests that the server list some of the mailboxes available to the user. LIST can take a wildcard string pattern as an argument.
LSUB          Authenticated or selected  Requests a listing of all mailboxes that are active or subscribed by the user. See also SUBSCRIBE and UNSUBSCRIBE.
STATUS        Authenticated or selected  Returns status information for a specified mailbox. Status information items include number of messages, number of \Recent messages, UID validity value, the UID value in line to be assigned to the next message, and the number of messages that have not been read (\Seen flag is not set).
APPEND        Authenticated or selected  Causes a message to be added at the end of the specified mailbox.
CHECK         Selected                   Requests the server do an implementation-dependent housekeeping task not ordinarily executed with other commands (for example, synchronizing server disk and RAM). If no such task is implemented, CHECK is roughly equivalent to NOOP.
CLOSE         Selected                   Requests that the server return the session to authenticated state (exits selected state) and causes any messages marked for deletion to be permanently removed from the mailbox.
EXPUNGE       Selected                   Requests that the server permanently delete messages marked for deletion. The server responds separately for each message that is expunged (EXPUNGE).
SEARCH        Selected                   Requests the server respond with the sequence numbers of messages that match the searching criteria that the client passes as an argument to the command.
FETCH         Selected                   Client uses FETCH to request part or all of a specified message (by sequence number). See the Fetch Options section for more information.
STORE         Selected                   Requests that the specified flag value(s) be stored in the mailbox on the server.
COPY          Selected                   Copies one or more messages to the end of the mailbox specified.
UID           Selected                   The client uses the UID command to FETCH, COPY, or STORE messages based on message UID rather than message sequence number. UID can also be used with SEARCH to request message UIDs in response from the server instead of message sequence numbers.
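The valid-state column of Table 12.2 amounts to a lookup that a server (or a cautious client) could consult before processing a command. The sketch below is one way to express it; the set names, state strings, and function name are assumptions for illustration, and a real server would also decide whether to answer BAD or NO when the check fails.

```python
# Command groups taken from the valid-state column of Table 12.2.
ANY_STATE = {"CAPABILITY", "NOOP", "LOGOUT"}
NONAUTHENTICATED_ONLY = {"AUTHENTICATE", "LOGIN"}
AUTHENTICATED_OR_SELECTED = {
    "SELECT", "EXAMINE", "CREATE", "DELETE", "RENAME", "SUBSCRIBE",
    "UNSUBSCRIBE", "LIST", "LSUB", "STATUS", "APPEND",
}
SELECTED_ONLY = {
    "CHECK", "CLOSE", "EXPUNGE", "SEARCH", "FETCH", "STORE", "COPY", "UID",
}


def command_allowed(command: str, state: str) -> bool:
    """Return True if `command` is valid in the given session state."""
    name = command.upper()
    if name in ANY_STATE:
        return True
    if state == "nonauthenticated":
        return name in NONAUTHENTICATED_ONLY
    if state == "authenticated":
        return name in AUTHENTICATED_OR_SELECTED
    if state == "selected":
        return name in AUTHENTICATED_OR_SELECTED or name in SELECTED_ONLY
    return False
```

For instance, FETCH is rejected in the authenticated state because no mailbox has been selected yet, while SELECT is accepted in both the authenticated and selected states.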
IMAP FETCH
FETCH is important because it provides so many different options for retrieving message data. The client can retrieve an entire message with all associated information, such as message flags and header information. The client can also retrieve flag, header, and date information for all the messages in the mailbox. This makes it relatively easy for a user to pick and choose which messages to download over a low-bandwidth link. The client can also download only RFC 822 and MFS message bodies, leaving MIME content on the server.
Excerpt from RFC 2060, defining the IMAP FETCH command:
6.4.5. FETCH Command

Arguments:  message set
            message data item names
Responses:  untagged responses: FETCH
Result:     OK - fetch completed
            NO - fetch error: can't fetch that data
            BAD - command unknown or arguments invalid

The FETCH command retrieves data associated with a message in the mailbox. The data items to be fetched can be either a single atom or a parenthesized list. The currently defined data items that can be fetched are:

ALL             Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE)
BODY            Non-extensible form of BODYSTRUCTURE.
BODY[<section>]<<partial>>
The text of a particular body section. The section specification is a set of zero or more part specifiers delimited by periods. A part specifier is either a part number or one of the following: HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, MIME, and TEXT. An empty section specification refers to the entire message, including the header.
Every message has at least one part number. Non-[MIME-IMB] messages, and non-multipart [MIME-IMB] messages with no encapsulated message, only have a part 1.
Multipart messages are assigned consecutive part numbers, as they occur in the message. If a particular part is of type message or multipart, its parts MUST be indicated by a period followed by the part number within that nested multipart part.
A part of type MESSAGE/RFC822 also has nested part numbers, referring to parts of the MESSAGE part's body.
The HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, and TEXT part specifiers can be the sole part specifier or can be prefixed by one or more numeric part specifiers, provided that the numeric part specifier refers to a part of type MESSAGE/RFC822. The MIME part specifier MUST be prefixed by one or more numeric part specifiers.
The HEADER, HEADER.FIELDS, and HEADER.FIELDS.NOT part specifiers refer to the [RFC-822] header of the message or of an encapsulated [MIME-IMT] MESSAGE/RFC822 message. HEADER.FIELDS and HEADER.FIELDS.NOT are followed by a list of field-name (as defined in [RFC-822]) names, and return a subset of the header. The subset returned by HEADER.FIELDS contains only those header fields with a field-name that matches one of the names in the list; similarly, the subset returned by HEADER.FIELDS.NOT contains only the header fields with a non-matching field-name. The field-matching is case-insensitive but otherwise exact. In all cases, the delimiting blank line between the header and the body is always included.
The MIME part specifier refers to the [MIME-IMB] header for this part.
The TEXT part specifier refers to the text body of the message, omitting the [RFC-822] header.
Here is an example of a complex message with some of its part specifiers:

HEADER      ([RFC-822] header of the message)
TEXT        MULTIPART/MIXED
1           TEXT/PLAIN
2           APPLICATION/OCTET-STREAM
3           MESSAGE/RFC822
3.HEADER    ([RFC-822] header of the message)
3.TEXT      ([RFC-822] text body of the message)
3.1         TEXT/PLAIN
3.2         APPLICATION/OCTET-STREAM
4           MULTIPART/MIXED
4.1         IMAGE/GIF
4.1.MIME    ([MIME-IMB] header for the IMAGE/GIF)
4.2         MESSAGE/RFC822
4.2.HEADER  ([RFC-822] header of the message)
4.2.TEXT    ([RFC-822] text body of the message)
4.2.1       TEXT/PLAIN
4.2.2       MULTIPART/ALTERNATIVE
4.2.2.1     TEXT/PLAIN
4.2.2.2     TEXT/RICHTEXT
It is possible to fetch a substring of the designated text. This is done by appending an open angle bracket (<), the octet position of the first desired octet, a period, the maximum number of octets desired, and a close angle bracket (>) to the part specifier. If the starting octet is beyond the end of the text, an empty string is returned. Any partial fetch that attempts to read beyond the end of the text is truncated as appropriate. A partial fetch that starts at octet 0 is returned as a partial fetch, even if this truncation happened. Note: this means that BODY[]<0.2048> of a 1500-octet message will return BODY[]<0> with a literal of size 1500, not BODY[]. Note: a substring fetch of a HEADER.FIELDS or HEADER.FIELDS.NOT part specifier is calculated after subsetting the header.
The \Seen flag is implicitly set; if this causes the flags to change they SHOULD be included as part of the FETCH responses.

BODY.PEEK[<section>]<<partial>>
                An alternate form of BODY[<section>] that does not implicitly set the \Seen flag.
BODYSTRUCTURE   The [MIME-IMB] body structure of the message. This is computed by the server by parsing the [MIME-IMB] header fields in the [RFC-822] header and [MIME-IMB] headers.
ENVELOPE        The envelope structure of the message. This is computed by the server by parsing the [RFC-822] header into the component parts, defaulting various fields as necessary.
FAST            Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE)
FLAGS           The flags that are set for this message.
FULL            Macro equivalent to: (FLAGS INTERNALDATE RFC822.SIZE ENVELOPE BODY)
INTERNALDATE    The internal date of the message.
RFC822          Functionally equivalent to BODY[], differing in the syntax of the resulting untagged FETCH data (RFC822 is returned).
RFC822.HEADER   Functionally equivalent to BODY.PEEK[HEADER], differing in the syntax of the resulting untagged FETCH data (RFC822.HEADER is returned).
RFC822.SIZE     The [RFC-822] size of the message.
RFC822.TEXT     Functionally equivalent to BODY[TEXT], differing in the syntax of the resulting untagged FETCH data (RFC822.TEXT is returned).
UID             The unique identifier for the message.

Example:    C: A654 FETCH 2:4 (FLAGS BODY[HEADER.FIELDS (DATE FROM)])
            S: * 2 FETCH ....
            S: * 3 FETCH ....
            S: * 4 FETCH ....
            S: A654 OK FETCH completed
The last part of this excerpt includes an example of the FETCH command. This is a typical IMAP interaction, with the client data prefixed with C: and the server data prefixed with S:. Next, we look at how server responses work in IMAP as well as a more complete example of an IMAP session.
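The BODY data item syntax and the partial-fetch rule from the excerpt can be sketched in a few lines. The helper names below are hypothetical, and the second function only models the truncation behavior the excerpt describes (an empty result when the start is past the end, otherwise at most the requested number of octets); it is not a server implementation.

```python
from typing import Optional, Tuple


def body_item(section: str = "", peek: bool = False,
              partial: Optional[Tuple[int, int]] = None) -> str:
    """Build a BODY[...] or BODY.PEEK[...] data item name.

    `partial` is (first octet, maximum octets) for a substring fetch.
    """
    name = "BODY.PEEK" if peek else "BODY"
    item = f"{name}[{section}]"
    if partial is not None:
        start, count = partial
        item += f"<{start}.{count}>"
    return item


def partial_data(text: bytes, start: int, max_octets: int) -> bytes:
    """Octets a partial fetch would return from `text`."""
    if start >= len(text):
        return b""                       # start is beyond the end of the text
    return text[start:start + max_octets]  # truncated at the end if needed
```

Using BODY.PEEK rather than BODY lets a client preview part of a message without the server implicitly setting the \Seen flag.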
IMAP Server Responses

Another way in which IMAP differs from the protocols covered elsewhere in this book (and, indeed, from most Internet protocols) is that server responses do not consist of coded numeric responses but rather consist of a token indicating the status of the command and additional information. Server transmissions that are responding to specific client requests also include a tag to identify which request they match, and server responses may also include human-readable text.
Three kinds of server responses are possible in IMAP: status responses, server data, and command continuation requests. All server data responses are untagged, and some status responses may be untagged. When a server response is untagged, it is prefixed with the asterisk (*) instead of a tag. Command continuation responses use the plus sign (+) instead of a tag to indicate that the initial part of the command has been accepted but that the command is incomplete in some way.
Server greetings and server status messages that don't relate to completion of a specific command are untagged responses. Other types of unilateral untagged server data may be sent during the selected state of a session. IMAP checks the selected mailbox for new messages as part of any selected state command execution. This means that even a NOOP command, which requests nothing but an acknowledgment from the server, actually behaves as a message polling command when issued during a session's selected state. When new messages are found, the server sends untagged responses that indicate the number of messages in the mailbox (EXISTS response) and how many of those messages are new (RECENT response), along with the tagged OK that completes the command.
In this section, we look at five different categories of server responses:
Status responses
Server and mailbox status
Mailbox size
Message status
Command continuation request
Status Responses
Permitted status responses include:
OK means that the command has been accepted and will be processed. OK responses are tagged if they are in response to a specific command, but they may be untagged to indicate an information-only message from the server. The untagged OK response is also used when the session is first opened and, in this case, indicates that a LOGIN (or AUTHENTICATE) command is required to continue.
NO means that the command has not been accepted and will not be processed. NO means that the command was recognized and apparently syntactically correct but not permitted. NO responses are tagged when they are sent in response to a specific command, but they can be used without a tag as a warning. When used as a warning, the untagged NO response includes human-readable text describing some condition that needs attention, while the command itself still completes with a tagged OK response.
BAD means that the command failed due to an apparent protocol or syntax problem. For example, the command may not have been recognized as valid, or it may not have been formatted correctly. BAD responses are tagged when in response to a specific command. When the server cannot associate a protocol-level error with a specific command, the BAD response can be untagged.
PREAUTH is the response sent when a pre-authenticated connection is opened, and it indicates that no LOGIN is required. PREAUTH responses are always untagged.
BYE is the response sent when a connection is about to be closed (or rejected without being opened). BYE responses are always untagged.
Status responses may include an optional response code, which consists of data inside square brackets, followed by optional arguments. The response code adds more complete information related to the response itself, especially where additional information can help the client take a specific action to remedy a problem with the command.
RFC 2060 defines the following response codes:
ALERT indicates that there is human-readable text included with the response that must be made available to the user in a way that calls the user's attention to the alert.
NEWNAME indicates a mailbox name and a new mailbox name. This response code indicates that the mailbox name (specified by a client SELECT or EXAMINE command) has been changed to the new mailbox name.
PARSE indicates an error in RFC 822, MFS, or MIME headers and includes the text that is in error.
PERMANENTFLAGS is followed by a list of flags and indicates that the listed flags may be changed permanently by the client.
READ-ONLY indicates that the client can access the selected mailbox in read-only mode (or that the access status has been changed from read-write to read-only).
READ-WRITE indicates that the client can access the selected mailbox in read-write mode (or that the access status has been changed from read-only to read-write).
TRYCREATE is included in responses to APPEND or COPY commands when the mailbox to which the client is trying to add messages does not exist.
UIDVALIDITY is followed by a decimal value of a 32-bit number that is the validity value for the selected mailbox.
UNSEEN is followed by a decimal number that refers to the sequence number of the first message that does not have the \Seen flag set.
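A status response with an optional bracketed response code can be picked apart with a regular expression. The pattern below is a simplified sketch written for this illustration, not the formal grammar from RFC 2060; the function name and the sample lines used with it are assumptions.

```python
import re
from typing import Dict, Optional

# tag (or "*"), status token, optional [CODE args], optional text.
STATUS_RE = re.compile(
    r"^(?P<tag>\S+) (?P<status>OK|NO|BAD|PREAUTH|BYE)"
    r"(?: \[(?P<code>[A-Z-]+)(?: (?P<args>[^\]]*))?\])?"
    r"(?: (?P<text>.*))?$"
)


def parse_status(line: str) -> Optional[Dict[str, Optional[str]]]:
    """Split a status response into tag, status, code, args, and text.

    Returns None for lines that are not status responses (for example,
    an untagged EXISTS line).
    """
    match = STATUS_RE.match(line.rstrip("\r\n"))
    return match.groupdict() if match else None
```

A client could branch on the code field, for example creating the mailbox and retrying when a NO response carries TRYCREATE.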
Server and Mailbox Status Responses

Servers use server and mailbox status responses to transmit information to the client. Though typically these responses are generated as a result of a similarly named client command, they are always untagged responses. These responses, as defined in RFC 2060, include the following:
CAPABILITY is sent after the server gets a CAPABILITY command from a client. It consists of the word CAPABILITY followed by a list of server capabilities separated by spaces. This list must include IMAP4rev1 and may include AUTH=AUTHTYPE to indicate that the server supports the authentication type AUTHTYPE. Server capabilities must either be registered standard IMAP capabilities or be experimental (prefixed with an X).
LIST is sent after the server receives a LIST command. Each mailbox is reported in a separate LIST line. There are four attributes defined for LIST that can be used to indicate whether it is possible to create a subfolder of the mailbox, whether the mailbox can even be selected, and whether any new messages have been received since the last mailbox access.
LSUB is sent after the server receives an LSUB command. Its data is the same as that for the LIST response, but it returns a subscribed mailbox in each line of response.
STATUS is sent after the server receives a STATUS command. The STATUS response contains the name and status (number of messages and next UID to be assigned) of the mailbox in question.
SEARCH is sent after the server receives a SEARCH command. The server returns the message sequence numbers (or UIDs for UID SEARCH commands) of the messages that match the search criteria. Numbers are separated by spaces.
FLAGS is sent after the server receives a SELECT or EXAMINE command from the client. This response contains a list of flags that are applicable to the mailbox in question. All FLAGS responses contain at least the system-defined flags and may contain flags that have been specially defined for the server implementation.
Mailbox Size Responses

Mailbox size responses provide a mechanism for the server to notify the client of changes in the size of the mailbox. In other words, they tell the client that new mail has arrived. All mailbox size responses are untagged.
EXISTS reports the number of messages in the mailbox as a decimal number. This response may be sent if the number of messages changes or in response to a SELECT or EXAMINE command from the client.
RECENT reports the number of messages that have the \Recent flag set, and it occurs in response to a SELECT or EXAMINE command or any time the mailbox changes in size.
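Because both mailbox size responses share the shape "asterisk, number, keyword", a client can recognize them with a few string checks. This is a minimal sketch with a hypothetical function name; it deliberately ignores every other kind of untagged response.

```python
from typing import Optional, Tuple


def mailbox_size_update(line: str) -> Optional[Tuple[str, int]]:
    """Return (keyword, count) for an untagged EXISTS or RECENT line.

    Any other line, including EXPUNGE and FETCH responses, yields None.
    """
    parts = line.strip().split()
    if (len(parts) == 3 and parts[0] == "*" and parts[1].isdigit()
            and parts[2] in ("EXISTS", "RECENT")):
        return parts[2], int(parts[1])
    return None
```

A client that remembers the previous EXISTS count can compare it with the new one to learn how many messages have arrived.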
Message Status Responses

These responses are generated as a result of an EXPUNGE or FETCH command from the client. Message status responses are always untagged and always include a number following the asterisk prefix that indicates the message sequence number.
EXPUNGE indicates that the specified message sequence number has been removed from the mailbox (not just marked for removal). The message specified is removed, and all messages with higher message sequence numbers are immediately decremented by 1. Decrementing sequence numbers has several implications. Depending on the server, when expunging more than one message at a time, the server might remove the highest sequence numbered messages first. In this case, the EXPUNGE server responses would cycle downward from the highest number. Another implication is that the server can expunge the lowest sequence numbered message first, in which case all succeeding expunged messages would be renumbered and thus the deleted messages all appear to have the same sequence numbers. Thus, EXPUNGE responses must not be sent while the server is also responding to FETCH, STORE, or SEARCH commands because of the potential for confusion.
FETCH contains the message data requested by the FETCH command from the client. Just as the FETCH command is the most involved IMAP command (and also the most important in terms of protocol functionality), so too is the FETCH server response the most involved and most important server response. The complete text of the RFC 2060 discussion of the FETCH response follows.
Excerpted from RFC 2060: 7.4.2. FETCH Response
Contents: message data
The FETCH response returns data about a message to the client. The data are pairs of data item names and their values in parentheses. This response occurs as the result of a FETCH or STORE command, as well as by unilateral server decision (e.g. flag updates). The current data items are:
BODY
  A form of BODYSTRUCTURE without extension data.
BODY[<section>]<<origin_octet>>
  A string expressing the body contents of the specified section. The string SHOULD be interpreted by the client according to the content transfer encoding, body type, and subtype. If the origin octet is specified, this string is a substring of the entire body contents, starting at that origin octet. This means that BODY[]<0> MAY be truncated, but BODY[] is NEVER truncated. 8-bit textual data is permitted if a [CHARSET] identifier is part of the body parameter parenthesized list for this section. Note that headers (part specifiers HEADER or MIME, or the header portion of a MESSAGE/RFC822 part), MUST be 7-bit; 8-bit characters are not permitted in headers. Note also that the blank line at the end of the header is always included in header data. Non-textual data such as binary data MUST be transfer encoded into a textual form such as BASE64 prior to being sent to the client. To derive the original binary data, the client MUST decode the transfer encoded string.
BODYSTRUCTURE
  A parenthesized list that describes the [MIME-IMB] body structure of a message. This is computed by the server by parsing the [MIME-IMB] header fields, defaulting various fields as necessary. For example, a simple text message of 48 lines and 2279 octets can have a body structure of:
  ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 2279 48)
  Multiple parts are indicated by parenthesis nesting. Instead of a body type as the first element of the parenthesized list there is a nested body. The second element of the parenthesized list is the multipart subtype (mixed, digest, parallel, alternative, etc.). For example, a two part message consisting of a text and a BASE64-encoded text attachment can have a body structure of:
  (("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 1152 23)("TEXT" "PLAIN" ("CHARSET" "US-ASCII" "NAME" "cc.diff") "<960723163407.20117h@cac.washington.edu>" "Compiler diff" "BASE64" 4554 73) "MIXED"))
  Extension data follows the multipart subtype. Extension data is never returned with the BODY fetch, but can be returned with a BODYSTRUCTURE fetch. Extension data, if present, MUST be in the defined order. The extension data of a multipart body part are in the following order:
  body parameter parenthesized list
    A parenthesized list of attribute/value pairs [e.g. ("foo" "bar" "baz" "rag") where "bar" is the value of "foo" and "rag" is the value of "baz"] as defined in [MIME-IMB].
  body disposition
    A parenthesized list, consisting of a disposition type string followed by a parenthesized list of disposition attribute/value pairs. The disposition type and attribute names will be defined in a future standards-track revision to [DISPOSITION].
  body language
    A string or parenthesized list giving the body language value as defined in [LANGUAGE-TAGS].
  Any following extension data are not yet defined in this version of the protocol. Such extension data can consist of zero or more NILs, strings, numbers, or potentially nested parenthesized lists of such data. Client implementations that do a BODYSTRUCTURE fetch MUST be prepared to accept such extension data. Server implementations MUST NOT send such extension data until it has been defined by a revision of this protocol.
  The basic fields of a non-multipart body part are in the following order:
  body type
    A string giving the content media type name as defined in [MIME-IMB].
  body subtype
    A string giving the content subtype name as defined in [MIME-IMB].
  body parameter parenthesized list
    A parenthesized list of attribute/value pairs [e.g. ("foo" "bar" "baz" "rag") where "bar" is the value of "foo" and "rag" is the value of "baz"] as defined in [MIME-IMB].
  body id
    A string giving the content id as defined in [MIME-IMB].
  body description
    A string giving the content description as defined in [MIME-IMB].
  body encoding
    A string giving the content transfer encoding as defined in [MIME-IMB].
  body size
    A number giving the size of the body in octets. Note that this size is the size in its transfer encoding and not the resulting size after any decoding.
  A body type of type MESSAGE and subtype RFC822 contains, immediately after the basic fields, the envelope structure, body structure, and size in text lines of the encapsulated message.
  A body type of type TEXT contains, immediately after the basic fields, the size of the body in text lines. Note that this size is the size in its content transfer encoding and not the resulting size after any decoding.
  Extension data follows the basic fields and the type-specific fields listed above. Extension data is never returned with the BODY fetch, but can be returned with a BODYSTRUCTURE fetch. Extension data, if present, MUST be in the defined order. The extension data of a non-multipart body part are in the following order:
  body MD5
    A string giving the body MD5 value as defined in [MD5].
  body disposition
    A parenthesized list with the same content and function as the body disposition for a multipart body part.
  body language
    A string or parenthesized list giving the body language value as defined in [LANGUAGE-TAGS].
  Any following extension data are not yet defined in this version of the protocol, and would be as described above under multipart extension data.
ENVELOPE
  A parenthesized list that describes the envelope structure of a message. This is computed by the server by parsing the [RFC-822] header into the component parts, defaulting various fields as necessary. The fields of the envelope structure are in the following order: date, subject, from, sender, reply-to, to, cc, bcc, in-reply-to, and message-id. The date, subject, in-reply-to, and message-id fields are strings. The from, sender, reply-to, to, cc, and bcc fields are parenthesized lists of address structures. An address structure is a parenthesized list that describes an electronic mail address. The fields of an address structure are in the following order: personal name, [SMTP] at-domain-list (source route), mailbox name, and host name. [RFC-822] group syntax is indicated by a special form of address structure in which the host name field is NIL. If the mailbox name field is also NIL, this is an end of group marker (semi-colon in RFC 822 syntax). If the mailbox name field is non-NIL, this is a start of group marker, and the mailbox name field holds the group name phrase. Any field of an envelope or address structure that is not applicable is presented as NIL. Note that the server MUST default the reply-to and sender fields from the from field; a client is not expected to know to do this.
FLAGS
  A parenthesized list of flags that are set for this message.
INTERNALDATE
  A string representing the internal date of the message.
RFC822
  Equivalent to BODY[].
RFC822.HEADER
  Equivalent to BODY.PEEK[HEADER].
RFC822.SIZE
  A number expressing the [RFC-822] size of the message.
RFC822.TEXT
  Equivalent to BODY[TEXT].
Example:
S: * 23 FETCH (FLAGS (\Seen) RFC822.SIZE 44827)
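The parenthesized-list syntax used throughout the FETCH response can be illustrated with a small parser. The following Python is a minimal sketch and not a complete IMAP parser: the names parse_list and parse_fetch are hypothetical, and the sketch assumes a single-line response with no literals and no bracketed section specifiers such as BODY[HEADER]. It converts NIL to None, numbers to integers, and nested lists to Python lists.

```python
import re

# Tokens: parentheses, quoted strings, or bare atoms (flags, numbers, NIL).
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()]+')

def parse_list(tokens, pos):
    """Parse tokens starting just after a '(' and return (items, next_pos)."""
    items = []
    while pos < len(tokens):
        tok = tokens[pos]
        if tok == "(":
            sub, pos = parse_list(tokens, pos + 1)
            items.append(sub)
            continue
        if tok == ")":
            return items, pos + 1
        if tok.startswith('"'):
            items.append(tok[1:-1])      # quoted string, quotes removed
        elif tok == "NIL":
            items.append(None)
        elif tok.isdigit():
            items.append(int(tok))
        else:
            items.append(tok)            # atom such as FLAGS or \Seen
        pos += 1
    raise ValueError("unbalanced parentheses")

def parse_fetch(line):
    """Split '* <seq> FETCH (<name> <value> ...)' into (seq, dict)."""
    match = re.match(r"\* (\d+) FETCH (\(.*\))$", line)
    if match is None:
        raise ValueError("not an untagged FETCH response")
    tokens = TOKEN.findall(match.group(2))
    items, _ = parse_list(tokens, 1)     # skip the opening '('
    # Data items come in name/value pairs.
    return int(match.group(1)), dict(zip(items[0::2], items[1::2]))

seq, data = parse_fetch(r"* 23 FETCH (FLAGS (\Seen) RFC822.SIZE 44827)")
print(seq, data)
```

The same parse_list helper handles the simple BODYSTRUCTURE example from the excerpt, returning the body type, subtype, parameter list, and sizes as a nested Python list.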
Command Continuation Requests

The command continuation request is signified by a prefix plus sign (+) instead of an asterisk or tag. RFC 2060 uses it with the AUTHENTICATE command (which the RFC text itself calls the AUTHORIZATION command) and whenever a command argument is sent as a literal. In other words, command continuation requests apply wherever a client issues a command that requires multiple steps, such as the exchange required by the authentication process. The response itself merely indicates that the server is ready for the next part of the client command; it may also include human-readable text.
Sample IMAP Session

This sample session is taken directly from RFC 2060 and demonstrates the use of command tags and server response messages. The session starts with the client logging in and then selecting the mailbox inbox. The client then fetches the full data associated with message number 12 in the mailbox; this is followed by a fetch of the header of the same message, and then by marking that message as deleted. The long line in response to the initial fetch is broken up for clarity's sake.
S: * OK IMAP4rev1 Service Ready
C: a001 login mrc secret
S: a001 OK LOGIN completed
C: a002 select inbox
S: * 18 EXISTS
S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
S: * 2 RECENT
S: * OK [UNSEEN 17] Message 17 is the first unseen message
S: * OK [UIDVALIDITY 3857529045] UIDs valid
S: a002 OK [READ-WRITE] SELECT completed
C: a003 fetch 12 full
S: * 12 FETCH (FLAGS (\Seen) INTERNALDATE "17-Jul-1996 02:44:25 -0700"
   RFC822.SIZE 4286 ENVELOPE ("Wed, 17 Jul 1996 02:23:25 -0700 (PDT)"
   "IMAP4rev1 WG mtg summary and minutes"
   (("Terry Gray" NIL "gray" "cac.washington.edu"))
   (("Terry Gray" NIL "gray" "cac.washington.edu"))
   (("Terry Gray" NIL "gray" "cac.washington.edu"))
   ((NIL NIL "imap" "cac.washington.edu"))
   ((NIL NIL "minutes" "CNRI.Reston.VA.US")
   ("John Klensin" NIL "KLENSIN" "INFOODS.MIT.EDU")) NIL NIL
   "<B27397-0100000@cac.washington.edu>")
   BODY ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 3028 92))
S: a003 OK FETCH completed
C: a004 fetch 12 body[header]
S: * 12 FETCH (BODY[HEADER] {350}
S: Date: Wed, 17 Jul 1996 02:23:25 -0700 (PDT)
S: From: Terry Gray <gray@cac.washington.edu>
S: Subject: IMAP4rev1 WG mtg summary and minutes
S: To: imap@cac.washington.edu
S: cc: minutes@CNRI.Reston.VA.US, John Klensin <KLENSIN@INFOODS.MIT.EDU>
S: Message-Id: <B27397-0100000@cac.washington.edu>
S: MIME-Version: 1.0
S: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
S:
S: )
S: a004 OK FETCH completed
C: a005 store 12 +flags \deleted
S: * 12 FETCH (FLAGS (\Seen \Deleted))
S: a005 OK +FLAGS completed
C: a006 logout
S: * BYE IMAP4rev1 server terminating connection
S: a006 OK LOGOUT completed
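The three kinds of server response lines seen in a session like this one (untagged, tagged, and command continuation) can be told apart by their first token. The Python below is a minimal sketch of that distinction; the function name classify is hypothetical, and the sketch assumes it is given complete response lines only, not the raw contents of a literal such as the header lines that follow {350}.

```python
def classify(line):
    """Return (kind, tag, rest) for one server response line."""
    if line.startswith("+"):
        # Command continuation request: the server is ready for more input.
        return ("continuation", None, line[1:].lstrip())
    if line.startswith("* "):
        # Untagged response: server data or status not tied to one command.
        return ("untagged", None, line[2:])
    # Otherwise the first word is the tag of the command being completed.
    tag, _, rest = line.partition(" ")
    return ("tagged", tag, rest)

for text in ("* 18 EXISTS",
             "a002 OK [READ-WRITE] SELECT completed",
             "+ Ready for additional command text"):
    print(classify(text))
```

A client would typically keep reading untagged lines until it sees the tagged line whose tag matches the command it sent.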
IMAP-Related RFCs
As of mid-1999, many additional RFCs define aspects of IMAP as a mail-handling protocol, and others discuss related topics. Table 12.3 lists these RFCs, along with their status as standards-track or informational documents. The rest of this section summarizes the contents of the RFCs that document proposed standard specifications for IMAP, except for RFC 2060, which has been the foundation for this chapter.
In general, IMAP extensions define new server capabilities that are reported by the server in response to the CAPABILITY command. IMAP standards-track specifications include the following RFCs.
RFC 1731, IMAP4 Authentication Mechanisms

RFC 1731, IMAP4 Authentication Mechanisms, describes additional authentication mechanisms for use with the IMAP AUTHENTICATE command. These include Kerberos version 4, the Generic Security Service Application Program Interface (GSSAPI, documented in RFC 1508), and S/Key, a one-time password system developed at Bellcore.
Table 12.3 IMAP RFCs
RFC        STATUS             TITLE
RFC 1731   Proposed Standard  IMAP4 Authentication Mechanisms
RFC 1732   Informational      IMAP4 Compatibility with IMAP2 and IMAP2bis
RFC 1733   Informational      Distributed Electronic Mail Models in IMAP4
RFC 2060   Proposed Standard  INTERNET MESSAGE ACCESS PROTOCOL VERSION 4rev1
RFC 2086   Proposed Standard  IMAP4 ACL extension
RFC 2087   Proposed Standard  IMAP4 QUOTA extension
RFC 2088   Proposed Standard  IMAP4 non-synchronizing literals
RFC 2177   Proposed Standard  IMAP4 IDLE command
RFC 2180   Informational      IMAP4 Multi-Accessed Mailbox Practice
RFC 2192   Proposed Standard  IMAP URL Scheme
RFC 2193   Proposed Standard  IMAP4 Mailbox Referrals
RFC 2195   Proposed Standard  IMAP/POP AUTHorize Extension for Simple Challenge/Response
RFC 2221   Proposed Standard  IMAP4 Login Referral
RFC 2342   Proposed Standard  IMAP4 Namespace
RFC 2359   Proposed Standard  IMAP4 UIDPLUS extension
RFC 2086, IMAP4 ACL Extension

RFC 2086, IMAP4 ACL Extension, defines an extension to IMAP that permits access control lists to be manipulated through IMAP. Access control lists contain names referring to IMAP users paired with a list of access control rights. This specification defines commands that allow authorized users to view, change, and remove access permission for IMAP mailboxes.
RFC 2087, IMAP4 QUOTA Extension

RFC 2087, IMAP4 QUOTA extension, defines an extension to IMAP (QUOTA) that permits administrators to limit resource (disk) usage. These quotas can be manipulated through IMAP quota commands that permit authorized users to set limits on storage and other system resources for specific mailboxes.
RFC 2088, IMAP4 Non-Synchronizing Literals

IMAP defines a mechanism for sending literals that requires the client to announce the number of bytes to be sent and then wait for a response from the server before actually sending them. RFC 2088 defines a mechanism that allows this process to be streamlined. When non-synchronizing literals are supported, clients simply send the literal after the byte count without waiting for a command continuation response from the server.
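The difference between the two kinds of literal shows up in the byte-count prefix the client sends. The Python below is a small sketch of how that prefix might be built; the function name literal_prefix is hypothetical. A synchronizing literal announces the count as {n} and the client then waits for a + response, while the non-synchronizing form defined in RFC 2088 adds a plus sign inside the braces and the data follows immediately.

```python
def literal_prefix(data: bytes, non_sync: bool = False) -> bytes:
    """Build the '{count}' line that announces a literal of len(data) octets."""
    marker = b"+" if non_sync else b""
    return b"{" + str(len(data)).encode("ascii") + marker + b"}\r\n"

password = b"secret"
# Synchronizing: send the prefix, then wait for '+' before sending the data.
print(literal_prefix(password))
# Non-synchronizing: the data can follow the prefix without waiting.
print(literal_prefix(password, non_sync=True) + password)
```

A client should use the second form only when the server has advertised support for it in its CAPABILITY response.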
RFC 2177, IMAP4 IDLE Command

As defined in RFC 2060, IMAP message polling is the responsibility of the client. Although some servers may send notification of new messages without being prompted, clients cannot expect this behavior in all cases. The IDLE command allows the client to ask the server to notify it immediately of any new messages, thus eliminating unnecessary polling activity.
RFC 2192, IMAP URL Scheme

This specification does not define any new messaging aspect of IMAP, but rather defines a URL scheme that can be used to reference objects that reside on an IMAP server.
RFC 2193, IMAP4 Mailbox Referrals

This specification defines a mechanism that allows clients to seamlessly access mailboxes that are distributed across more than one IMAP server. For example, a personal mailbox might reside on a local server and shared mailboxes on remote servers. Rather than acting as a proxy on behalf of the client, the server refers the client to connect directly with any referred servers.
RFC 2195, IMAP/POP AUTHorize Extension for Simple Challenge/Response

This specification defines a simple challenge/response authentication protocol for IMAP. AUTH was mentioned in Chapter 11, and it is discussed again in Chapter 17, Internet Messaging Security.
RFC 2221, IMAP4 Login Referrals

Just as sometimes a client needs to access mailboxes that are spread across more than one server, so too it sometimes happens that a mailbox is moved from one server to another. This might occur when a server fails and access is provided by a backup server, or when email support for an organizational unit is changed from one server to another. RFC 2221 defines a mechanism that allows a server to refer a client login to another server, rather than acting as a proxy for the client in dealing with the new server.
RFC 2342, IMAP4 Namespace

RFC 2060 leaves the matter of mailbox namespace to the IMAP implementer. Though it has much to say about mailbox namespace conventions (not discussed in this chapter), RFC 2060 defines no default approach for naming mailboxes. RFC 2342 notes that there are two broad approaches to namespaces: building a namespace consisting only of a user's personal mailboxes (without access to any shared mailboxes) and building a hierarchical namespace in which personal and shared mailboxes can both be accessed through the hierarchy. RFC 2342 defines a command (NAMESPACE) that asks the server to notify the client of mailbox prefixes. With this knowledge, the client can access personal and shared mailboxes without any manual configuration of mailbox name prefixes.
RFC 2359, IMAP4 UIDPLUS Extension

This specification defines a mechanism for streamlining message manipulation (APPEND, COPY, and EXPUNGE) for disconnected-use clients. Despite the fact that no standards-track specifications have been accepted for disconnected use of IMAP, the IESG approved this RFC for proposed standard status because it appeared to be free of any technical flaws. However, it does make assumptions about how disconnected-use IMAP would work, an issue that is still open.
Reading List
Table 12.3 includes the most important RFCs relating to IMAP. There are other documents related to IMAP, but these are mostly obsolete or superseded by newer documents.
CHAPTER
13
SMTP Message Address Resolution
Internet email addresses are usually expressed in the form mailboxname@host.domain.name, where host.domain.name is a fully qualified domain name and mailboxname is a name associated with a mailbox at the host identified on the other side of the @ character. It is tempting to say that email systems use the Domain Name System (DNS) to resolve Internet addresses just as any other Internet application does, but that would not be the whole story. In this chapter, we take a brief look at how DNS works and then discuss how SMTP uses DNS to determine where to send mail. DNS is specified as an Internet standard (STD-13) in RFC 1034 and RFC 1035, while Internet email address resolution using DNS is defined as an Internet standard (STD-14) in RFC 974. RFC 821bis consolidates and updates these specifications. The first part of this chapter introduces DNS and discusses how it is used with email addressing. The second part examines how email addresses are used within SMTP to deliver messages to their destinations. Also discussed in more detail here are how SMTP gateways work and how aliases and mailing lists work.
The Domain Name System

IP defines two mechanisms for identifying nodes: the IP address and the fully qualified domain name. The IP address is a globally unique 32-bit value that indicates where the node is attached to the Internet (or intranet). The fully qualified domain name (FQDN) is a globally unique host name and domain that also indicates where the node is attached. An important difference between these two identifiers is that the IP address is necessary, while the name is not. Network devices like routers require an IP address to deliver packets, while the host name is most useful for humans as well as for giving a network resource or service a name that is independent of network topology (the hosting server may change, but the name can remain the same).
When the first IP networks were built, host names could be associated with IP addresses by maintaining a file on every system that contained a list of all connected IP hosts with their IP addresses. On relatively small and static networks, the hosts file need be updated only when a host is added, removed, or moved around in the network. On slightly larger networks, an authoritative hosts file can be maintained on some central server and downloaded periodically by all network hosts. However, this approach rapidly fails to scale as network size increases. First, as the network size increases, so does the size of the hosts file, causing the periodic downloads to take longer and longer for each individual node and requiring an ever more powerful central hosts file server. Second, as the network increases in size, downloading the hosts file takes up more and more bandwidth. Not only is the file larger, but it must also go to a greater number of nodes, across more and more networks. Finally, the growth of the network also means that changes occur more frequently, causing more problems reaching nodes that have recently been added or moved, as well as failed attempts to reach nodes that have been removed from the network.
DNS Structure
The Domain Name System is effectively a distributed database application in which DNS servers provide information about their own zones. Responsibility for maintaining host information is thus pushed out to local ISPs and organizations. DNS uses a hierarchical data schema, in which domains consist of alphanumeric text words separated by periods (.). Under the highest level (signified by an empty domain) are seven three-character top-level domains that were originally specified for DNS (.com, .net, .org, .edu, .gov, .mil, and .int) and two-character country codes (.us, .uk, .cz, and so forth). Under each of the top-level domains are site domains: for example, ibm.com, whitehouse.gov, mit.edu, and so on. It is possible to create further internal organization within a site domain, creating domains with more elements such as altavista.digital.com (the domain name used by Digital Equipment Corporation for its AltaVista Internet search engine before DEC was acquired by Compaq).
At each domain level exists a server that can refer queries to an appropriate resource. In other words, at the highest level, a service can refer queries about top-level domains, and more information is gathered by walking down the domain name. For example, a node might be requesting an IP address for the host with an FQDN of loshin.ne.mediaone.net. The first step would be to query the highest-level domain, asking it where more information can be found about the *.net domain. That service would point the requester to the *.net DNS service, which would point the requester to a DNS server for *.mediaone.net. That DNS server could point the requester to the DNS server for *.ne.mediaone.net, which in turn would return to the requester the IP address for the host loshin.ne.mediaone.net.
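The walk down the domain name can be sketched in a few lines of Python. The function name delegation_chain is hypothetical, and the sketch assumes, as the example above does, that each label of the name is served by its own DNS server; in practice, several labels may be handled by one server, and resolvers cache earlier answers rather than starting at the top each time.

```python
def delegation_chain(fqdn):
    """List the domains whose servers are consulted, from the root downward."""
    labels = fqdn.rstrip(".").split(".")
    # Start with the root (the empty domain, written here as "."), then add
    # one more label from the right at each step. The host label itself is
    # what the last server in the chain is finally asked about.
    chain = ["."]
    for i in range(len(labels) - 1, 0, -1):
        chain.append(".".join(labels[i:]))
    return chain

print(delegation_chain("loshin.ne.mediaone.net"))
```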
DNS Name Resolution

DNS works at the application layer, using UDP to transmit requests and responses across the Internet. Internet applications such as HTTP clients usually use DNS resolvers (implemented as part of the network stack or as subroutines called by network applications) to convert FQDNs into IP addresses. When an application is presented with an FQDN as a destination, it uses the resolver to convert the host name to an IP address and then passes that IP address down the protocol stack, where it is used to create the IP packets that carry the application data to its destination. As we will see, DNS provides more than just IP address/host name bindings. This is important because the domain names specified in email addresses don't always identify the host and domain of the system that will be handling the recipient's email.
DNS Resource Records

A DNS resource record (RR) is a construct within which DNS servers store data about domain names. RRs contain a particular bit of information about a domain name, and each RR is matched to a domain name. A particular domain name can have more than one RR, but each RR is associated with only one domain name. The particular RR returned by a DNS server depends on what kind of request it receives. The simplest type of RR is a type A RR: This resource record holds a 32-bit IP address. When a node requests resolution of an FQDN, the server returns this value.
Things can get more complicated, however. A domain name may be an alias for some other node. For example, the FQDN www.Internet-Standard.com might actually be an alias. The CNAME RR (for canonical name) provides the FQDN of the domain name to which the alias refers. Aliases can be used when a service is associated with a particular host name in a domain, but the service is actually provided by some other host. For example, most Web servers are hosted on FQDNs that look like this: www.mydomain.com. The owners of that domain might want to host their Web content on a host with a different name, perhaps at a hosting service. The CNAME RR would indicate the FQDN of the host on which the service is actually being hosted. Likewise, the owners of mydomain.com might also own mydomain.net and mydomain.org; these domains might also be aliased to the actual Web server domain.
Several other RRs are defined for various purposes; most of these are not immediately relevant to email. One RR type that is relevant, however, is the MX (mail exchange) resource record. An MX record is quite different from an A record, mostly because the MX record includes a preference field. When more than one MX record is applicable to a particular domain, the preference field values of the records are used to determine the order in which the sender uses the different records. Each domain can have one or more MX RRs. Each MX RR contains two pieces of information: a preference (a 16-bit value) and a domain name (this should be a canonical domain name). The MX record indicates a host that handles email sent to the original domain. Thus, while you can address email to me at pete@loshin.com, the loshin.com domain has at least one MX RR directing that email addressed to the loshin.com domain actually be sent to some other mail server maintained by the provider that manages my domain.
A domain can have more than one MX RR, and that's where the preference value comes into play. When more than one MX RR is returned, the requesting node (an SMTP server) sorts the MX RRs in ascending order based on preference. An SMTP server must attempt to deliver a message to a host identified in an MX RR with a low-valued preference before attempting delivery to a node with a higher-valued preference. For more about other types of DNS resource records, check out STD 13.
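As a small illustration of the ordering rule, the Python below sorts a set of MX records by ascending preference. The MX tuple type, the function name by_preference, and the example.net host names are all hypothetical; the records are written out by hand rather than fetched from a DNS server.

```python
from typing import NamedTuple

class MX(NamedTuple):
    preference: int   # 16-bit value; lower numbers are tried first
    exchange: str     # canonical domain name of the mail exchanger

def by_preference(records):
    """Return MX records in the order a sender should attempt them."""
    return sorted(records, key=lambda record: record.preference)

records = [
    MX(20, "backup.example.net"),
    MX(10, "mail.example.net"),
    MX(30, "lastresort.example.net"),
]
for record in by_preference(records):
    print(record.preference, record.exchange)
```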
Internet Email Addressing

As discussed earlier in this chapter and in Chapter 8, Internet Message Format Standard, Internet email addresses consist of two portions, the mailbox identifier on the left of the @ character and the FQDN to the right of the @ character. In Chapter 8, we discussed how the mailbox portion of the address is used in SMTP to direct messages once a connection is made between SMTP client and server. The mailbox portion can also be used to refer to a mailing list, to be exploded by the SMTP server hosting the mailing list. However, in this chapter we discuss what happens with the domain name part of the email address and how SMTP clients determine what SMTP server they need to reach in order to deliver the message.
SMTP Domain Name Resolution

An Internet email address must use a valid FQDN in all cases, with only two exceptions. First, the rule: SMTP domains must be fully qualified and resolvable. This means that the domain must be resolvable to either an MX record or an A record. If the domain has a CNAME record, its target must in turn be resolvable to an MX or A record. Local nicknames (host aliases) may not be used, and neither may unqualified host names. In both cases, the name may have significance locally but not outside the DNS zone. Such names can be ambiguous and are not permitted.
There are two exceptions to the rule. First, when opening a session with the EHLO command, the domain name given must be a primary host name, that is, a host name that resolves to an A resource record. Alternatively, the EHLO command can be used with an IP address rather than a host name when the client's host does not have a host name. The other exception occurs when a RCPT TO command is being issued to the reserved address postmaster. In these cases, the message must be accepted and forwarded to the entity acting as postmaster for the local SMTP server.
SMTP clients attempting to deliver a message start by examining the destination email address. The address is parsed, meaning that the FQDN portion of the address is isolated for name resolution. The SMTP client first looks for one or more MX RRs associated with the domain. If no MX RRs are found, but a CNAME RR is found, then the canonical name is used in place of the domain name indicated in the email address. If only an A RR is found for the email address domain name, then the client can treat the domain as if it were associated with an MX RR. The next action taken by the SMTP client depends on what the DNS resolution reveals about the domain name in the email address. Assuming that only one MX RR is found, the SMTP client can use the canonical name associated with the MX record as the host name to which it connects to attempt to open an SMTP session. However, at times there will be more than one MX record, or even other alternative delivery addresses, as discussed in the next section.
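The lookup order just described (MX first, then the canonical name if the domain is an alias, then a bare A record treated as an implied MX) can be sketched as follows. This is an illustration under stated assumptions, not a real resolver: the function name delivery_targets is hypothetical, and DNS is stood in for by a plain Python dictionary that maps a domain name to its records, with made-up example.org names.

```python
def delivery_targets(address, dns):
    """Return [(preference, host), ...] to try for one email address."""
    # Isolate the domain: everything to the right of the last @ character.
    _, _, domain = address.rpartition("@")
    domain = domain.lower()
    records = dns.get(domain, {})
    # No MX but a CNAME: use the canonical name in place of the domain.
    if not records.get("MX") and "CNAME" in records:
        domain = records["CNAME"]
        records = dns.get(domain, {})
    if records.get("MX"):
        return sorted(records["MX"])      # ascending preference
    if records.get("A"):
        return [(0, domain)]              # treat the host as its own MX
    return []                             # nothing resolvable

# Hypothetical zone data for illustration only.
dns = {
    "example.org": {"MX": [(20, "backup.example.net"), (10, "mail.example.net")]},
    "alias.example.org": {"CNAME": "example.org"},
    "host.example.com": {"A": ["192.0.2.1"]},
}
print(delivery_targets("pete@example.org", dns))
print(delivery_targets("pete@alias.example.org", dns))
print(delivery_targets("pete@host.example.com", dns))
```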
Alternative Delivery Addresses

A DNS lookup may fail to turn up any records matching the domain name, or it may succeed and return a single resource record for the domain name. In some cases, the lookup succeeds too well, producing one or more alternate addresses as well as multiple MX records. Multiple MX records may exist to provide redundancy in email delivery. An MX record may indicate an actual destination SMTP server, or it may refer to an SMTP relay server. Such a server might forward the message as is to its actual destination, forward the message after rewriting the address (as in gateway servers), or even pass the message along to a non-SMTP delivery environment.
Alternate IP addresses for the destination may also be the result of using a multihomed host for the destination. A multihomed host is connected to the Internet (or intranet) on two or more different link layer network interfaces and has a separate IP address for each network interface. This approach is also used for reliability, providing two or more routes to reach the destination server so that even if one route is down, another route can be used to complete delivery.
In the event that a connection cannot be opened with the first host, an SMTP client should try at least one more server before failing the entire transaction. Preferably, the client should try every server for which a DNS record is found (and try again) until a connection is made or until there are no more SMTP servers to attempt to connect to. The specification requires that SMTP clients be able to try and retry all the alternative destinations, but it allows implementers to provide an option to put a configurable limit on the number of alternative destinations that a client attempts. Although the client must be able to exhaust the possible destinations, the specification allows that this may not always be appropriate and recommends that at a minimum two different destinations be attempted.
Multihomed hosts and multiple MX records are used to make email more reliable. The more alternative SMTP servers available to handle email, the more likely that at least one of them is up and running and has connectivity to the global Internet. However, trying and retrying all possible destinations for a piece of email may tie up SMTP client resources, hence the latitude allowed implementers and deployers in limiting the number of attempts.
When there is more than one MX record, the SMTP client must process the domains listed in those records. The first thing it does is to sort the MX records in ascending order of the preference field. The record with the preference field that has the lowest numerical value is listed first; the highest numerical value goes last. Next, the SMTP client checks whether any of the MX records identify the SMTP client's own local domain. This step is necessary to avoid mail loops. If the SMTP client's domain is listed, the client might have received the message from some other SMTP client for forwarding. All MX records whose preference value is numerically equal to or greater than that of the local SMTP client's own record are ignored. Starting with the MX record with the highest preference (the lowest numerical value), the SMTP client attempts to open an SMTP connection with the host specified by the canonical name indicated in the MX record.
When more than one MX record contains the same preference, the SMTP client must choose which destination to attempt first. The sender may be able to make the decision based on some other criterion; for example, the sending SMTP client may recognize one of the addresses as being local or closer than the other addresses. If this is not possible, the sender must select the destination randomly. This helps spread the load across an organization's mail exchangers when everything else is equal. If the destination host is multihomed, the domain name resolver is responsible for ordering the alternate IP addresses in order of decreasing preference (based on some criterion related to the network distance or cost of transmission between the source and destination hosts). The domain resolver is required to put the alternative IP addresses in order. It is up to the SMTP sender to try these alternative destinations in that order.
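The loop-avoidance and tie-breaking rules can be combined in one short routine. The Python below is a sketch only: the function name candidate_hosts and the example.net names are hypothetical, MX records are given as (preference, host) pairs, and ties are broken purely at random, ignoring the "closer host" criterion that a real sender might apply first.

```python
import random

def candidate_hosts(mx_records, local_host, rng=random):
    """Return the hosts to try, in order, for a client named local_host."""
    # Loop avoidance: if this client appears in the list, ignore every record
    # whose preference value is equal to or greater than its own.
    own = [pref for pref, host in mx_records if host.lower() == local_host.lower()]
    if own:
        cutoff = min(own)
        mx_records = [(pref, host) for pref, host in mx_records if pref < cutoff]
    # Group by preference so that equal-preference hosts can be shuffled.
    groups = {}
    for pref, host in mx_records:
        groups.setdefault(pref, []).append(host)
    ordered = []
    for pref in sorted(groups):           # lowest value is tried first
        hosts = groups[pref]
        rng.shuffle(hosts)                # random choice among equals
        ordered.extend(hosts)
    return ordered

records = [(10, "a.example.net"), (10, "b.example.net"),
           (20, "relay.example.net"), (30, "far.example.net")]
print(candidate_hosts(records, "relay.example.net"))
print(candidate_hosts(records, "client.example.com"))
```

In the first call the client is itself the preference-20 relay, so only the two preference-10 hosts remain as candidates; in the second call all four hosts are candidates.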
SMTP Source Routing

Source routing, a mechanism defined in RFC 821 that allowed message originators to specify one or more intermediate SMTP servers, is deprecated by RFC 821bis. This soon-to-be official deprecation reflects the real-world rejection of SMTP source routing. Source routing has almost completely disappeared from working email systems. The original specification allowed the message source to list servers that the message should transit on its way to its ultimate destination. This practice allowed, among other things, message originators to leave the destination address incomplete in some way with the expectation that the intermediate servers would correct it before delivering the message. In addition, source routing can prevent messages from being delivered when one of the intermediate servers is down or unreachable, even if the destination server is reachable by some other route. Now, SMTP clients should not generate source routes unless there are extenuating circumstances. Even then, SMTP servers may decline to act as relays and may also decline to accept messages with source routes specified. In fact, RFC 821bis specifies that SMTP servers should simply ignore the source route hosts specified and just deliver the message to the destination host specified in the address. Instead of source routing, SMTP now relies on the network to get messages to their destinations. DNS MX records contain preference values so that some SMTP servers can act as relays without requiring the message source to specify them or even to know anything about them. An organization can create an MX record for its primary SMTP server and give it a preference value of 0. The same organization might create one or more other MX records pointing to other SMTP servers that act as backups to the primary server. A last-resort SMTP server might have a preference value of 65535, while the first-choice backup server might have a preference value of 1. 
The last-resort server might be on the other side of the Internet, connected to a network that is connected through an expensive private network to the organizational headquarters. The first-choice backup server might have its own separate Internet connection and be connected by Ethernet to the primary server. However, both those servers as well as any other servers with intermediate preference can be used to route around failures in the primary server. Each can act as a relay server by forwarding messages to their ultimate destination.
SMTP Gatewaying
An SMTP server specified in an MX record may be the destination server, it may be a relay server, or it may even be a gateway server. A gateway translates messages that conform to the SMTP specification into a format that conforms to the internal gatewayed protocol specification. What kinds of changes must be made depend upon what protocols are being used in the internal message delivery
environment. If the internal delivery system uses a format compatible with RFC 822 or RFC 822bis, then the gateway must translate only the message transfer protocol. If the internal system uses a different message format, then the message headers and perhaps even the message itself must also be translated.
Given the technical complexity and wide range of possibilities for the internal message delivery environment, the Internet standards for messaging cannot offer much more than some general guidelines for gateways between SMTP and other delivery environments. However, some generalizations can be made about SMTP gatewaying, and they are discussed in RFC 821bis.
First, RFC 821bis specifically states that it is permissible to rewrite message header fields when necessary as messages move across mail environment boundaries. This can be done by examining the message body and its headers as well as interpreting and translating that data into an appropriate format for the internal mail protocols. Many email specifications either use a subset of RFC 822 headers or provide functions similar to those defined by RFC 822 but in different formats, so translation of those headers is usually relatively straightforward. However, not all email specifications have structures equivalent to the SMTP envelope (the information contained in the MAIL FROM and RCPT TO commands is considered the SMTP envelope). In some cases, therefore, it may be necessary for the gateway to somehow incorporate this information into the translated message's headers.
One header field that should not be modified in the transition across an email delivery environment is the Received: line. The gateway must prepend its own Received: line without making any changes in Received: lines that are already a part of the message. Even though different email protocols might use different specifications to create their own Received: header fields, the purpose of those fields is to aid in debugging problems in mail transport.
Any changes made to existing Received: header fields, even well-meaning changes meant to fix the headers to conform to one or another email protocol, would complicate this process. By the same token, SMTP systems must also leave these trace fields as is and not reject messages whose trace fields do not conform to RFC 822/RFC 822bis.
Another recommended practice to assist in message delivery debugging is for the email gateway to indicate the environment and messaging protocols being used. This information can be included in a via clause in the Received: header field. Otherwise, the specification reiterates that a gateway, when interacting with SMTP systems, should behave as an SMTP system in terms of accepting valid address formats, creating well-formed messages according to RFC 822/RFC 822bis standards, and properly preparing return-path fields, just like any SMTP system.
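As a rough sketch of the trace-field rule, the following Python function prepends a new Received: line and leaves everything already in the message untouched. The function name prepend_received, its parameters, the host names, and the via value "ExampleMail" are hypothetical and chosen only for illustration; the clause order (from, by, via, with) follows the RFC 822 syntax for the Received field.

```python
from email.utils import formatdate


def prepend_received(raw_message, from_host, by_host, via=None,
                     with_protocol="SMTP", date_str=None):
    """Return raw_message with a new Received: line added at the very top.

    Existing header lines, including earlier Received: lines, are copied
    through unchanged, even if they do not conform to RFC 822.
    """
    if date_str is None:
        # Current time in RFC 822 date format.
        date_str = formatdate(localtime=False)

    clauses = ["from " + from_host, "by " + by_host]
    if via:
        # Optional via clause naming the environment on the other side.
        clauses.append("via " + via)
    clauses.append("with " + with_protocol)

    trace = "Received: " + " ".join(clauses) + "; " + date_str + "\r\n"
    return trace + raw_message


original = (
    "Received: from a.example by b.example; Mon, 1 Mar 1999 10:00:00 -0000\r\n"
    "From: sender@a.example\r\n"
    "To: user@internal.example\r\n"
    "\r\n"
    "Hello\r\n"
)
gatewayed = prepend_received(original, "b.example", "gw.example.com",
                             via="ExampleMail")
print(gatewayed)
```

Because the function only concatenates, the earlier trace line survives byte for byte, which is the property that matters for debugging.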
Mailing Lists and Aliases

Though not related to the way SMTP interacts with DNS, we discuss the use of mailing lists and mailing aliases here because it does relate to the way messages are delivered based on the destination address. The specification indicates that hosts supporting SMTP should support both aliases and mailing lists for the purpose of using a single email address target to be expanded for delivery to multiple actual destinations. It is important to note that the MAIL FROM command (the SMTP envelope) of a delivered message forwarded from an expanded list form (either alias or mailing list) contains the address of the entity responsible for maintaining the list, but the From: header field in the message itself must not change.
Email aliases and mailing lists share certain characteristics. Both allow multidestination delivery, in which a single message sent to the alias or list is transformed into a list of email addresses to which the message is sent. Both use an address that looks just like a regular mailbox, but messages sent to that address are actually sent to all the addresses participating in the mailing list or included in the alias.
Aliases and mailing lists differ in the way the recipient mailer behaves when it receives a message and forwards it to member addresses. The recipient mailer affects only the SMTP envelope when an alias is expanded. The mailer issues the RCPT TO command with each address associated with the alias replacing the alias. Otherwise, the rest of the SMTP envelope (the MAIL FROM command, for example) is unchanged, as are the body and headers of the message being delivered. In effect, the mailer simply forwards the message to all the recipients. The mailer forwards the message to each recipient separately, creating a separate envelope (SMTP transaction) for each recipient.
Mailing lists are treated differently. They start out by replacing the mailing list address in the SMTP RCPT TO command with all the addresses that have been expanded from the mailing list address.
Rather than sending out the same number of messages as there are subscribers or members of an alias, with a mailing list there is a single message that is sent to multiple recipients. Generally, the recipients list is not included in the delivered message headers. Also unlike aliases, mailing list mailers change the return address (MAIL FROM) in the SMTP envelope to point to the mailing list administrator. This practice ensures that any error messages, such as those generated by a bad email address, are sent to the list administrator rather than to the sender of the original message. Suppressing transmission of the members' addresses as well as directing error messages to the list administrator makes mailing lists more practical than aliases for lists with large memberships.
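The difference between the two expansion styles, as described above, comes down to what happens to the SMTP envelope. The sketch below models an envelope as a plain dictionary; the function names, the dictionary keys, and the addresses are hypothetical, and no actual SMTP traffic is generated.

```python
def expand_alias(envelope_sender, members):
    """Alias expansion as described in the text: MAIL FROM is unchanged
    and each member gets a separate envelope (SMTP transaction)."""
    return [
        {"mail_from": envelope_sender, "rcpt_to": [member]}
        for member in members
    ]


def expand_list(list_admin, members):
    """Mailing list expansion as described in the text: one envelope for
    all members, with MAIL FROM pointing to the list administrator so
    that error messages go to the administrator."""
    return [{"mail_from": list_admin, "rcpt_to": list(members)}]


members = ["ann@example.com", "bob@example.org", "cy@example.net"]

for env in expand_alias("author@example.com", members):
    print("alias:", env)

for env in expand_list("list-owner@example.com", members):
    print("list: ", env)
```

In both cases the From: header field inside the message is left alone; only the envelope differs.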
Reading List
Table 13.1 includes the most important RFCs relating to DNS and SMTP. Though originally considered a separate issue, domain name resolution is now an integral part of the SMTP specification.
Table 13.1 Relevant RFCs
RFC          TITLE
RFC 821      Simple Mail Transfer Protocol
RFC 822      Standard for the Format of ARPA Internet Text Messages
RFC 974      Mail Routing and the Domain System
RFC 1034     Domain Names - Concepts and Facilities
RFC 1035     Domain Names - Implementation and Specification
RFC 821bis   Simple Mail Transfer Protocol
RFC 822bis   Internet Message Format Standard
CHAPTER
15
vCard
Why include a chapter on electronic business cards in a book on Internet messaging? For one thing, the vCard specification is on the Internet standards track as a proposed standard. For another, the vCard specification defines a profile used with the MIME text/directory content type, which means that vCards can be exchanged using Internet messaging protocols, among others. Finally, although they are by no means essential, vCard enclosures certainly make using the Internet to exchange messages much simpler because they make the exchange of contact information, such as email addresses, much simpler.
In this chapter, we first briefly introduce vCard, explaining exactly what it is and what it does. Then, we examine the MIME text/directory content-type specified in RFC 2425, "A MIME Content-Type for Directory Information." This content-type can serve as the basis for the exchange of data from almost any directory application, and it does serve as the basis for the vCard application, defined in RFC 2426, "vCard MIME Directory Profile," which is discussed in the second half of this chapter.
vCard is a specification for a virtual business card. It is a data object that contains much of the same information normally available in a business card, as well as some information not normally found there, such as a public key for digital signatures and encryption,
a photographic image of the person described in the business card, and even a digitized voice message from the card owner. We need not dwell in depth on the vCard specification, although it does demonstrate the use of MIME to define application data transports. The application in question is relatively simple: vCard data can be extracted from the MIME format into virtually any name and address database application. Chapter 16 provides a similar, though far more complicated, example of the use of MIME and Internet messaging protocols to define an open transport for workgroup calendaring and scheduling applications.
What Is vCard?
The v-standards (vCard and vCalendar) were originally the work of the Versit Consortium, an industry group consisting of leading companies including Apple, AT&T, IBM, and Siemens. The group specified the vCard and vCalendar standards to make the exchange of contact and calendar information between individuals much simpler, no matter what hardware or software they use. At the end of 1996, the Versit Consortium disbanded and transferred responsibility for its Personal Data Interchange (PDI) specifications to another, new, industry group: the Internet Mail Consortium (IMC). The IMC cooperates with the IETF in its vCard and vCalendar standards development effort to keep the specifications open and available for use over the Internet and other TCP/IP networks. vCard supports PDI between and among many different platforms, ranging from palmtop computers to desktop computers and including exchange through the Web, email, personal information manager (PIM) programs, even telephones, and videoconferencing applications. According to IMC materials, vCards can be used in the following ways:
- To exchange basic directory and contact information such as name, address, telephone numbers, email addresses, and Web URLs. The vCard can indicate multiple instances of contact data, specifying a preferred telephone number, as well as different kinds of contact data, indicating which telephone number is a fax, pager, cellular, office, or home number, for example.
- To exchange multimedia directory and contact information such as company logos, photographs, and even audio clips for welcome messages or to clarify how to pronounce a name.
- To exchange information about preferred contact times and locations.
vCards are highly interoperable, providing support for multiple languages and using an open specification that can be easily implemented on virtually any transport or operating system. IMC touts the use of vCards for exchange of information through the following transports:
- vCards can be incorporated into email messages and quickly incorporated into address book applications on the receiver's system.
- vCards can be dragged and dropped into Internet registration or order forms to fill out the forms automatically (if the form is properly designed).
- vCards can be exchanged through infrared (IR) links between palmtop and other devices with IR ports.
- Computer telephony and videoconferencing applications can transmit vCards simultaneously with voice or video for a variety of purposes.
Other applications are possible, limited only by the devices and systems that can support TCP/IP and Internet applications. vCard is currently supported by leading vendors of addressing and calendaring products including 3Com (in the Palm Pilot product), Novell, Nokia, Microsoft, Netscape, Lotus, and many others.
NOTE Not all leading software publishers support vCard, unfortunately. As of mid-1999, some email clients still did not support vCard. For example, the Eudora email client from Qualcomm still did not support vCard attachments. They simply appear as unknown attachments.
MIME Text/Directory Content-Type

A directory is a database that contains information about entities. Each directory record must include an entity identifier and can contain other information about the entity. A telephone book is an example of a directory: It contains telephone numbers, names, and addresses. The unique identifier is the telephone number, with the collateral information being the names and addresses; not all listings include an address, and not all valid telephone numbers are listed. The telephone company generates a telephone book from its own canonical directory that contains every telephone number for which it is responsible; the phone book is a subset. The telephone company's canonical directory not only contains entries for every telephone number, but it also contains considerably more information about each phone number: name, address, billing information, subscription information, and more.
RFC 2425 addresses the problem of representing directory information in a standard, portable format by specifying a MIME content-type for directory information. This generalized MIME format can be used for just about any kind of directory information and can be adapted for any specific directory by defining a profile for that directory. The vCard specification, in RFC 2426, defines a profile for vCard data based on the RFC 2425 MIME content-type for directory information.
The text/directory content type can be used in two different ways. First, it can be used as a simple container for textual directory information or URIs (Uniform Resource Identifiers, usually URLs) relating to the directory entry. Second, the text/directory content type may also be used as the root body part of a multipart/related MIME content type. This allows much more complicated directory entries that can include photographs, voice recordings, video clips, or any other multimedia data relating to the directory entry.
The text/directory content type uses a very simple structure to define how directory data is expressed. Each piece of directory data is categorized as a certain type and formatted in the MIME body as:
type:value
where type identifies the kind of data. RFC 2425 defines type as a property or attribute with which the value is associated. Types may refer to specific kinds of data, such as a name or address, or they may refer to metadata, such as character encodings or alternate languages. RFC 2425 also specifies how to define and register a profile: A format that is intended to contain data specific to some application. For us, the most important profile is that defined for the vCard standard, as we see later in this chapter. In this section, we introduce the text/directory MIME content type, defining it by its parameters, defining its types and what kind of values are permitted in those types, and discussing what a profile is and how a new profile can be defined.
Parameters
The MIME content type text/directory has only two parameters, both of which RFC 2425 registers as optional:
charset is defined in RFC 2046 and discussed in Chapter 9, Multipurpose Internet Mail Extensions (MIME).
profile identifies the kind of entity the directory information in the MIME body refers to. The profile indicates, to a great extent, what kind of data may be in the body. Applications may use the profile to help interpret the contents of the MIME body.
Data Types
The body content consists of directory data, formatted as data types. The specific kinds of data contained in a text/directory MIME body are not defined in RFC 2425, but are defined in profile-specific or application-specific documents such as the vCard specification. As long as the data is expressed in a valid data type format, it is acceptable. However, to be of any use, the data types must be recognizable to the recipient. In this section, we discuss how that data must be
formatted, leaving discussion of specific kinds of data to the second half of the chapter that covers the vCard specification. Data in text/directory MIME bodies is expressed as data types. The simplest expression of a data type with data includes a data type name followed by a colon (:), followed by the value of that data type. For example, inclusion of an email address might look like this:
email:pete@loshin.com
However, content lines can be much more complex. The gist of the ABNF for a line of text/directory data from RFC 2425 is as follows:
contentline = [group "."] name *(";" param) ":" value CRLF
param       = param-name "=" param-value *("," param-value)
Although most of the detailed specification has been left out, these two lines summarize what the content lines look like. A group may be used to organize directory information about some particular entity. For example, data can be organized into home.* and office.* groups, resulting in directory content like the following:
home.tel:+1 212 555 1212
work.email:pete@loshin.com
A piece of data may also take one or more parameters, delimited by semicolons. The parameter itself consists of the parameter name followed by the equal sign (=) followed by the parameter values. In this example, the data is a telephone number and the parameter values (voice, pref, msg) indicate that the type of telephone number being given is a voice line (as opposed to a modem or fax line), is the preferred number for reaching the entity specified in the directory listing, and includes voicemail messaging on the line:
TEL;TYPE=voice,pref,msg:+1 212 555 1212
The TYPE parameters (the TYPE parameter is actually defined in RFC 2426) may also be expressed individually, separated by semicolons (;):
TEL;TYPE=voice;TYPE=pref;TYPE=msg:+1 212 555 1212
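To make the content line structure concrete, here is a minimal Python sketch that splits a line into its group, name, parameters, and value. It follows only the two summarized ABNF rules shown above and is an assumption-laden simplification: the function name parse_content_line and the returned dictionary layout are invented for this example, and quoted parameter values, line folding, and value escaping from the full RFC 2425 grammar are not handled.

```python
def parse_content_line(line):
    """Split one text/directory content line into its parts.

    Simplification: assumes no colon or semicolon appears inside a
    parameter value, so the first colon ends the name/parameter section.
    """
    line = line.rstrip("\r\n")
    head, colon, value = line.partition(":")
    if not colon:
        raise ValueError("content line has no ':' separator")

    # The first semicolon-delimited piece is [group "."] name.
    name_part, *param_parts = head.split(";")
    group, _, name = name_part.rpartition(".")

    # Each parameter is param-name "=" param-value *("," param-value).
    params = {}
    for part in param_parts:
        pname, equals, pvalues = part.partition("=")
        if not equals:
            raise ValueError("parameter without '=': " + part)
        params.setdefault(pname.upper(), []).extend(pvalues.split(","))

    return {
        "group": group or None,
        "name": name.upper(),
        "params": params,
        "value": value,
    }


for sample in (
    "email:pete@loshin.com",
    "home.tel:+1 212 555 1212",
    "TEL;TYPE=voice,pref,msg:+1 212 555 1212",
    "TEL;TYPE=voice;TYPE=pref;TYPE=msg:+1 212 555 1212",
):
    print(parse_content_line(sample))
```

The last two samples parse to the same result, which reflects the point made in the text: a comma-separated list of TYPE values and repeated TYPE parameters are two spellings of the same information.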
Most data types for directory data are defined elsewhere (specifically, in RFC 2426 and in other standards for data format). In general, they are specific to the profile being used in the text/directory MIME body. However, RFC 2425 does define several data types that can be used in any text/directory MIME body. These are generalized data types that are useful for any kind of directory body and are permitted in any text/directory body except where they are explicitly prohibited by the profile.
SOURCE Type Definition. A text/directory MIME body usually comes from somewhere; the SOURCE type indicates where that information originated. The SOURCE type must be in the form of a URI. The contents of the SOURCE data type can be used by an application to retrieve additional or more current information about the entity described in the text/directory body, from the directory service that originally provided the MIME body information. More than one source URI may be specified in a single SOURCE data type; alternatively, more than one SOURCE data type may be present in a single text/directory body. The application is permitted to determine which source it prefers for verification or updating the body content.
NAME Type Definition. A text/directory MIME body usually contains directory information about some particular entity. The NAME type identifies, in a displayable format, who that entity is.
PROFILE Type Definition. A text/directory MIME body usually conforms to a particular profile, meaning that the directory entity (the thing in the MIME body, not the entity the MIME body refers to) fits into some specific scheme for expressing directory information. Thus, the PROFILE might refer to the vCard scheme for expressing business contact information. Other types of profiles might specify network resource directory information or Web resource information.
BEGIN Type Definition. This type, always used with the END type definition, behaves like a boundary for directory entities. It indicates the start of an entity within the text/directory body.
END Type Definition. This type, always used with the BEGIN type definition, behaves like a boundary for directory entities. It indicates the end of an entity within the text/directory body. The BEGIN/END types can be used to set off the properties relating to a specific profile within a text/directory body, especially when more than one set of properties is included in the body.
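The boundary role of BEGIN and END can be illustrated with a short Python sketch that cuts a text/directory body into its entities. It assumes that the value of BEGIN and END is the profile name, as in the familiar BEGIN:VCARD and END:VCARD lines; the function name split_entities, the returned structure, and the sample names are made up for the example, and nested BEGIN/END pairs are deliberately rejected to keep the sketch small.

```python
def split_entities(body):
    """Group the content lines of a text/directory body by BEGIN/END pair.

    Returns a list of dicts with the profile name taken from the BEGIN
    line and the raw content lines found between BEGIN and END.
    Lines outside any BEGIN/END pair are skipped in this sketch.
    """
    entities = []
    current = None
    for raw in body.splitlines():
        if not raw.strip():
            continue
        name, _, value = raw.partition(":")
        key = name.strip().upper()
        if key == "BEGIN":
            if current is not None:
                raise ValueError("nested BEGIN is not handled in this sketch")
            current = {"profile": value.strip().upper(), "lines": []}
        elif key == "END":
            if current is None or value.strip().upper() != current["profile"]:
                raise ValueError("END without matching BEGIN")
            entities.append(current)
            current = None
        elif current is not None:
            current["lines"].append(raw)
    if current is not None:
        raise ValueError("BEGIN without matching END")
    return entities


body = (
    "BEGIN:VCARD\r\n"
    "FN:Pete Loshin\r\n"
    "END:VCARD\r\n"
    "BEGIN:VCARD\r\n"
    "FN:Jane Doe\r\n"
    "END:VCARD\r\n"
)
for entity in split_entities(body):
    print(entity["profile"], entity["lines"])
```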
Type Values
RFC 2425 also specifies the different kinds of information that can be contained within a data type and its parameters. Table 15.1 includes the predefined parameters that may be used with a data type. All these parameters are expressed as a parameter name, followed by an equal sign (=), followed by the parameter value. The valuetype parameter is of particular interest as it can be used to identify the type of information contained in the data type. Table 15.2 summarizes the valuetype values defined in RFC 2425.
Table 15.1 Predefined Parameters for Text/Directory Data Types (from RFC 2425)
PARAMETER       NAME        DESCRIPTION
encodingparm    encoding=   The encoding parameter; this can be b (for BASE64, as discussed in RFC 2047 and RFC 2045), or another valid encoding registered with the IANA, but not q (for quoted-printable, also discussed in RFC 2045). Not to be confused with the Content-Transfer-Encoding, which specifies encoding of the entire MIME body; this parameter indicates encoding only for a single value.
valuetypeparm   value=      Indicates what kind of data is in the data type. See Table 15.2.
languageparm    language=   Indicates the language of the data within the data type. Languages are identified by the language tags defined in RFC 1766.
contextparm     context=    Indicates a protocol that should be used to interpret the contents of the data type. For example, if the data type contents need to be interpreted using the Lightweight Directory Access Protocol (LDAP), the data type would use a context parameter of context=LDAP.
Table 15.2 Defined Data Value Types (from RFC 2425)
VALUE TYPE   DESCRIPTION
uri          A URL, as defined in RFC 1738.
text         Text. The CRLF character indicates the end of a line, and thus the end of a content line, so new lines in a text value are indicated with the \n or \N characters.
date         The full year, month, and day of month.
time         The hour, minute, second. Optionally may include fraction of seconds as well as a time zone.
date-time    The combination of the date and the time.
integer      An integer, numerical value. May be signed (positive or negative).
boolean      TRUE or FALSE are the only valid values.
float        A floating point numerical value.
x-name       Any name beginning with x- or X- is experimental; it may also be used for privately agreed-upon data types.
iana-token   A value type registered with the IANA but not specified in the current standard.
Profiles
A profile defines a special type of data to be contained within a text/directory MIME body. Anyone can specify a new profile; the simplest way is to use the experimental prefix X- to create one that need not be subjected to any standards review. An experimental profile can be defined for use by two or more entities exchanging the defined directory information. There is a set of procedures for creating a sanctioned, reviewed profile. Similar to the process for registering a new MIME type/subtype, an application is submitted to the IANA giving the profile's name, the profile's purpose (the reason the profile is being created), a list of data types defined for the profile, and intended usage of the profile (either COMMON, LIMITED USE, or OBSOLETE). Special notes about the profile, providing more information about how the profile is to be used or how data is to be ordered within it, are optional.
The vCard Specification

The vCard specification is a proposed Internet standard, documented in RFC 2426, "vCard MIME Directory Profile." In this RFC, vCard is defined as a profile of the MIME text/directory content type described earlier in this chapter. As stated in RFC 2426, the specification "defines the profile of the MIME Content-Type ... for directory information for a white-pages person object, based on a vCard electronic business card." Directory information in the RFC 2426 profile description is based on the ITU X.520 and X.521 directory services recommendations. The RFC documents how that information can be represented as a directory entry using vCard. The rest of this section describes the components of the vCard profile and provides some examples.
vCard Profile Information

The vCard profile's purpose is to contain white-pages type directory data about a person, largely similar to the type of information normally recorded on a business card, specifically that recorded on a digital business card. Of the data value types defined in RFC 2425 (see Table 15.2), the vCard profile uses the uri, date, date-time, and float value types. It adds new data value types as well: binary, phone-number, utc-offset, and vcard. The vCard profile adds a couple of dozen of its own data types, to be discussed later. It also adds a new parameter to be used with data types called TYPE (in addition to the parameters described in Table 15.1).
The TYPE parameter is used throughout the vCard specification to modify and elaborate on most of the new data types defined in RFC 2426. We've already seen it used earlier in this chapter with the tel data type to indicate that a telephone number refers to a fax or pager number, to indicate that the telephone number is a preferred number for contact, or that it is a home or work telephone number. The TYPE parameter can also be used for other forms of elaboration, as we see when discussing the new data types.
vCard Value Extensions

In addition to the values defined in RFC 2425 (Table 15.2), the vCard specification adds the values listed in Table 15.3. The descriptions identify the data types where the values may be used. These data types are described in the next section and in Table 15.4.
vCard Data Types

Only a few basic data types were defined for the text/directory MIME content type in RFC 2425, just the bare necessities: source, name, profile, begin, and end. The idea is to keep the specification open so that particular types of data to be used in each profile can be specified along with the profile. vCard identifies almost 30 types of data that can be included in a vCard. Table 15.4 lists them along with the category to which each belongs and a brief description of the data type.
Table 15.3 vCard Value Extensions
VALUE          DESCRIPTION
binary         Indicates that the type value is encoded binary data within the data type value. Can be used within the PHOTO, LOGO, SOUND, and KEY data types.
phone-number   Indicates that the value is a telephone number, as defined by ITU specifications for telephone number expression. Used with the TEL data type.
utc-offset     Indicates, in hours and minutes, the difference between local time (for the entity described in the vCard) and Coordinated Universal Time (UTC). The value may be either positive or negative, and the value is expressed as four digits, the first two representing the number of hours of offset and the second two digits representing the number of minutes of offset. Used with the TZ data type.
vcard          Indicates that the data type value contains another vCard. Used with the AGENT type, to indicate the contact information for someone who can act on behalf of the entity owning the main vCard.
Table 15.4 assigns each data type to a category. These categories are useful mostly for breaking the 29 different types into more manageable groups. Identification types identify the owner of the vCard (the entity represented by the vCard). Delivery types provide information related to physical addresses used for delivery of packages or postal mail or to provide a physical location where the entity might be located. The telecommunications category includes telephone and email contact information, while the geographic category includes information about time zone and latitude/longitude. Organizational data types include information about the vCard entity's corporate identity, while explanatory data types include information that can help to explain the contents of the vCard. Security data types specify security-related information, such as access to the vCard and the public key or certificate information associated with the vCard entity.
All vCards must contain three data types: FN, N, and VERSION. All other data types are optional. Types may also be grouped, as discussed previously and described in RFC 2425. This means that work or home (or some other grouping criterion) can be prepended to a data type identifier to specify related vCard data. The vCard standard also supports extensions through the X- prefix for nonstandard data types.
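The requirement that every vCard carry FN, N, and VERSION lends itself to a quick check. The Python sketch below is only an illustration under stated assumptions: the function name missing_required_types is invented, the input is assumed to be a list of already unfolded content lines, and the sample card (including the version value 3.0 used by RFC 2426 and the grouped work.EMAIL line) is made-up example data.

```python
REQUIRED_TYPES = {"FN", "N", "VERSION"}


def missing_required_types(content_lines):
    """Return the required vCard data types absent from content_lines."""
    present = set()
    for line in content_lines:
        # Everything before the first colon is [group "."] name *(";" param).
        head = line.partition(":")[0]
        name_part = head.split(";")[0]
        # Drop an optional group prefix such as "work." or "home.".
        name = name_part.rpartition(".")[2].strip().upper()
        present.add(name)
    return sorted(REQUIRED_TYPES - present)


card = [
    "VERSION:3.0",
    "FN:Pete Loshin",
    # N components: family name; given name; additional names;
    # honorific prefixes; honorific suffixes.
    "N:Loshin;Pete;;;",
    "work.EMAIL;TYPE=internet:pete@loshin.com",
]
print(missing_required_types(card))
print(missing_required_types(card[:2]))
```

A fuller validator would also check the values themselves (for example, that N has its five semicolon-separated components), but presence is the only rule stated here.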
Table 15.4 vCard Data Types

| VALUE | CATEGORY | DESCRIPTION |
|---|---|---|
| FN | identification | The formatted text corresponding to the name of the object the vCard represents. This data must be present in any vCard object. |
| N | identification | Specifies the components of the name of the object the vCard represents. The components of the name are listed, separated by semicolons (;), in this order: Family Name, Given Name, Additional Names, Honorific Prefixes, and Honorific Suffixes. |
| NICKNAME | identification | A descriptive name for the entity represented by the vCard; the NICKNAME may also be the familiar form of a proper name (Pete for someone named Peter, for example). |
| PHOTO | identification | Contains either a URI pointing to a photograph somehow related to the vCard entity or inline binary data containing such a photograph. Uses the TYPE parameter to indicate the image format and the encoding of the data. |
| BDAY | identification | Specifies the birth date of the entity referred to by the vCard. The default value is a single date, but this may be changed to a date-time value. |
| ADR | delivery | Specifies the components of the delivery address for the vCard object. This type lists the components of the address, separated by semicolons. The component values must be specified in this order: the post office box; the extended address; the street address; the locality (for example, city); the region (for example, state or province); the postal code; the country name. Even if one or more of these components is missing, its position must still be marked with a semicolon to indicate its absence. The TYPE parameter may be used to indicate the delivery address type: dom for domestic delivery address, intl for international delivery address, postal to indicate a postal delivery address, parcel to indicate an address where parcels can be delivered, home or work to indicate a home or work address, and pref to indicate a preferred address when more than one address is included. |
| LABEL | delivery | Specifies a formatted text version of the vCard entity's delivery address. Unlike the ADR data type, the LABEL type contains the delivery address in a format that is acceptable for creating a shipping or mailing label. The TYPE parameter can be used with the same options as listed for ADR. |
| TEL | telecom | Specifies telephone number(s) that can be used to communicate with the entity represented by the vCard. Contains a single telephone number. The TYPE parameter can be used to specify home for a residence number, msg to indicate voice messaging support, work for a work number, and pref to indicate a preferred number when more than one number is specified; voice, fax, cell, video, pager, bbs, modem, car, isdn, and pcs indicate different types of numbers. pcs stands for a personal communications services number. |
| EMAIL | telecom | Specifies an email address to be used to contact the vCard entity. The TYPE parameter can be used to specify a preference or format of the email address. |
| MAILER | telecom | Specifies the email software used by the vCard entity. |
| TZ | geographic | Specifies the vCard entity's time zone, either as a simple offset from UTC or as a single text value that contains the time zone information. |
| GEO | geographic | Specifies a latitude and longitude for the entity represented by the vCard. Uses the FLOAT value type. |
| TITLE | organization | Specifies the vCard entity's job title, functional position, or function. |
| ROLE | organization | Specifies the role, occupation, or business category of the object the vCard represents. |
| LOGO | organization | Specifies a logo image, either through a URI or inline. The TYPE parameter can be used to specify the image format. |
| AGENT | organization | Specifies information about another person who will act on behalf of the individual or resource associated with the vCard. The value is, by default, another vCard. It can also be a URI pointing at agent directory information available elsewhere. |
| ORG | organization | Specifies the organization and unit or units within the organization associated with the vCard entity. Consists of the organization name, followed by division, directorate, or other unit information. All components are separated by semicolons. |
| CATEGORIES | explanatory | Specifies one or more values linking the vCard to application categories. For example, CATEGORIES:TRAVEL AGENT or CATEGORIES:INTERNET,IETF,INDUSTRY,INFORMATION TECHNOLOGY (from RFC 2426). |
| NOTE | explanatory | A comment or some other supplemental information that doesn't fit in any other data type. |
| PRODID | explanatory | Indicates the software or product that generated the vCard, based on a product identifier. |
| REV | explanatory | A single date-time value that indicates the last time the vCard was updated. |
| SORT-STRING | explanatory | Specifies a string taken from the FN or N data types to be used for sorting, especially where language-specific considerations apply; for example, for surnames with prefixes that are not used for sorting or where the family name is not the last component of the full name. |
| SOUND | explanatory | Specifies a digital sound incorporated into the vCard, especially where the sound provides the correct pronunciation of the entity represented by the vCard. May be a URI pointing to a binary data file or inline data. The TYPE parameter can also be used to specify the audio format type. |
| URL | explanatory | Specifies a URL associated with the entity represented by the vCard. |
| UID | explanatory | Specifies a globally unique value that can be associated with the entity represented by the vCard. |
| VERSION | explanatory | The version of vCard used to format this vCard. This data type is mandatory, and the current version is 3.0. |
| CLASS | security | Specifies an access classification for a vCard, such as PUBLIC, PRIVATE, or CONFIDENTIAL. |
| KEY | security | Specifies a public key or authentication certificate associated with the object that the vCard represents. The TYPE parameter may be used to specify a particular format for the public key or certificate, such as PGP. |
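The N and ADR entries in Table 15.4 are structured values: components appear in a fixed order, separated by semicolons, with empty positions retained. A minimal sketch of splitting such a value into named fields follows. The field names and helper are illustrative choices of mine, and the sketch assumes no backslash-escaped semicolons appear inside a component.

```python
# Component order as described for ADR and N in Table 15.4.
ADR_FIELDS = ("po_box", "extended", "street", "locality",
              "region", "postal_code", "country")
N_FIELDS = ("family", "given", "additional", "prefixes", "suffixes")


def split_structured(value, fields):
    """Map a semicolon-separated structured value onto named fields."""
    parts = value.split(";")
    # Pad with empty strings if trailing components were left off.
    parts += [""] * (len(fields) - len(parts))
    return dict(zip(fields, parts))


# Hypothetical address with no post office box, extended address, or country.
address = split_structured(";;123 Main Street;Any Town;CA;91921-1234;", ADR_FIELDS)
name = split_structured("Example;Pat;;Dr.;", N_FIELDS)
print(address["street"], "/", name["family"])
```

Because absent components are still marked by their semicolons, position alone is enough to tell a street address from a locality.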
Reading List
If interested, you may wish to check out the Internet Mail Consortium (IMC) Web site at www.imc.org/pdi for more about the vCard (and related) protocols. In addition, Table 15.5 lists some of the relevant RFCs concerning vCard and Internet white pages service.
Table 15.5 Relevant RFCs

| RFC | TITLE |
|---|---|
| RFC 2425 | A MIME Content-Type for Directory Information |
| RFC 2426 | vCard MIME Directory Profile |
| RFC 2218 | A Common Schema for the Internet White Pages Service |
CHAPTER
16
Calendaring and Scheduling Standards
Once corporate networks became common, but before widespread use of IP on those networks, organizations deployed proprietary workgroup computing tools to help people work together more efficiently. In addition to the proprietary email, news, and bulletin board applications, these products allowed people to coordinate their schedules over the network. By the end of the 1990s, an increasing number of users had embraced various scheduling tools, whether in the form of desktop personal information management (PIM) software that handled calendaring, scheduling, and contact lists or in the form of personal digital assistants (PDAs) that perform the same functions in the office and on the road. With open-standard IP networking rapidly being accepted as the protocol of choice for corporate networking, the logical extension of the workgroup application is to open it up and make it interoperable. The goal of such an effort is to allow people to use their tool of choice to manage their time. Palm Pilot users can coordinate their schedules with Windows CE palmtops; Lotus Notes users can coordinate directly with users of legacy mainframe scheduling programs. The Versit Consortium, the same organization behind the vCard standard, created the vCalendar standard, also for supporting personal data interchange (PDI). With vCalendar, people can update their own calendars to incorporate
an event announced on the Internet or through email, with a single click. vCalendar also simplifies the process of coordinating a meeting between two or more people, all of whom may use different calendar/schedule products. Some of the other areas where vCalendar adds value, according to the IMC, include the following:
- Exchange of meeting coordination interactions through email. A vCalendar object can be incorporated into a message and then dropped directly into whatever scheduling applications the recipients use.
- Distribution and exchange of project management information.
- Encapsulation of sequences of event planning components.
- Online publication of event calendars for theatrical venues, educational institutions, or any other sponsoring organization.
The Calendaring and Scheduling (CALSCH) workgroup of the IETF has been developing the network protocols that can make the IMC vCalendar standard work over the Internet and other IP networks and turn it into the iCalendar standard. No longer are users limited to interaction with other users of the same system or with other users within the same organization.

The CALSCH group has so far produced three RFCs defining the protocols for Internet calendaring and scheduling applications. The first, RFC 2445, "Internet Calendaring and Scheduling Core Object Specification (iCalendar)," defines a core MIME object specification for calendaring and scheduling data. RFC 2446, "iCalendar Transport-Independent Interoperability Protocol (iTIP): Scheduling Events, BusyTime, To-dos and Journal Entries," describes a protocol that uses the iCalendar core objects in an application. Finally, RFC 2447, "iCalendar Message-based Interoperability Protocol (iMIP)," describes how iTIP can be bound to existing Internet transport protocols including, but not limited to, email (SMTP) and the Web (HTTP). These three specifications are proposed standards. The iCalendar Real-Time Interoperability Protocol (iRIP), which is still a work in progress of the CALSCH workgroup, defines a mechanism for binding iTIP messages to a real-time protocol.

The vCalendar/iCalendar specifications describe formats for calendaring components and events. By itself, iCalendar does little more than provide a uniform way to express calendar data. iTIP is the protocol specification that defines how calendaring and scheduling messages are to be exchanged between calendar entities, whereas iMIP and iRIP define how transport protocols can be used to exchange those messages. iMIP uses a messaging-based transport, while iRIP defines a real-time, interactive protocol for carrying iTIP exchanges.
To say that Internet scheduling is complex is an understatement: The specifications must distill the essence of a wide range of scheduling and calendaring applications and produce a framework in which the vast majority of proprietary applications can interoperate with each other. The average email application provides a relative handful of functions (send, receive, forward, reply to a message) and is thus relatively straightforward to implement. The message format is based on a well-understood and long-used standard. Netnews also poses few major hurdles to implement: It too is relatively simple, with few functions and familiar standards.

Calendaring and scheduling applications, however, tend to be more complicated. Transactions can involve individuals updating their own private to-do lists, two individuals setting up a lunch appointment, or groups of many individuals (some of whom don't know each other) scanning each other's schedules to set up large-scale meetings. The calendar itself adds complexity as well: The protocol must address not only the usual calendar and time issues, such as Y2K compliance and accurate leap year observance, but also issues such as where and when daylight saving time has been implemented.

The result of all this complexity is two voluminous RFCs: RFC 2445 is almost 150 pages long, most of it committed to detailing all the various iCalendar components and their properties, attributes, and parameters. RFC 2446 tips the scales at more than 100 pages; it is full of descriptions of protocol functions as well as long tables listing which iCalendar components and properties must be present with different protocol methods. Given the complexity of these protocols, in this chapter we simply describe the basics and urge the interested reader to review the actual RFCs for more details. iCalendar and its related protocols demonstrate how the relatively simple Internet email architecture can provide a transport for a highly complex application.
The iCalendar Architecture

The iCalendar standard specifies an exchange format between systems and/or applications. It can hold virtually any information related to calendar/scheduling functions. The iCalendar format can also be used as the vehicle through which another protocol defines requests for scheduling activity and responses to such requests. iTIP is such a protocol. iTIP defines how a request (such as for free time to set up a meeting during a particular time period) can be made using the iCalendar format, and how a response (such as a listing of free times during the time period) can be formatted and returned.

The iCalendar specification defines a MIME content type: the text/calendar MIME content type. This is the basic container for calendaring and scheduling data for Internet calendaring and scheduling applications. iCalendar objects are little more than repositories for information about events. They contain data to be interpreted, processed, forwarded, or stored by a calendaring or scheduling application. An iCalendar object might contain date, time, location, and duration information about a particular event, or it might contain availability times for a particular individual or entity.

The iCalendar Transport-Independent Interoperability Protocol, or iTIP, specifies how iCalendar objects are used for interoperation between different calendaring and scheduling applications. It provides an application framework for exchanging iCalendar objects, along with a language of requests and responses that adds semantics to the iCalendar specification. An iCalendar object containing date, time, location, and duration information about a particular event can be incorporated into an iTIP REQUEST. An iCalendar object containing availability times for an individual or entity might be incorporated into an iTIP REPLY and sent out in response to a request for that entity's availability. iTIP defines how iCalendar information is exchanged and incorporated into transactions between entities wishing to coordinate schedules or notify each other of events or availability.

iTIP operates independently of the transport protocol, that is, the set of rules specifying how iTIP requests and responses are exchanged. The iTIP messages can be incorporated into Internet messages, using standard Internet mail protocols, or they can be exchanged through a real-time application transport.

The iCalendar Message-based Interoperability Protocol, or iMIP, defines how iTIP messages are incorporated into mail for the purposes of being exchanged from one calendar entity to another. iMIP supports a store-and-forward mechanism for exchanging calendaring data between calendar entities, piggybacking iCalendar data on top of a conventional Internet messaging protocol such as SMTP or HTTP.

iCalendar data can also be exchanged between calendaring application entities using real-time protocols. These protocols have not yet achieved Internet proposed standard status; they are still going through the Internet-Draft process. The iCalendar Real-Time Interoperability Protocol, or iRIP, is a stateful application protocol similar to other application protocols we've described here, such as SMTP, IMAP, and NNTP.
Finally, the Calendar Access Protocol (CAP) provides an application layer protocol for calendar user agents (CUAs) to interact with a calendar store (CS). As of mid 1999, CAP existed mostly as an Internet-Draft describing its requirements; a protocol specification was not available even as an Internet-Draft. For our purposes, it is enough to summarize the iCalendar object specification and related protocols and look at how they fit together.
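The layering described above can be sketched in a few lines: an iCalendar object holds the data, an iTIP method names the transaction, and a mail message carries the result. This is only an illustration under stated assumptions. The helper names, addresses, product identifier, and event details are all hypothetical, and the sketch uses Python's standard email package rather than anything defined by the RFCs.

```python
from email.message import EmailMessage


def build_calendar_object(method, component_lines):
    """Wrap component lines in a VCALENDAR that names an iTIP method."""
    lines = [
        "BEGIN:VCALENDAR",
        "PRODID:-//Example//Sketch//EN",   # hypothetical product identifier
        "VERSION:2.0",
        "METHOD:" + method,
    ]
    lines += component_lines
    lines.append("END:VCALENDAR")
    return "\r\n".join(lines) + "\r\n"


def wrap_in_mail(ical_text, method, sender, recipient, subject):
    """Carry the calendar object as a text/calendar body part."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The method is repeated as a Content-Type parameter on the body.
    msg.set_content(ical_text, subtype="calendar", params={"method": method})
    return msg


# Hypothetical meeting request.
event = [
    "BEGIN:VEVENT",
    "UID:sketch-0001@example.com",
    "DTSTAMP:19990601T120000Z",
    "DTSTART:19990615T170000Z",
    "DTEND:19990615T180000Z",
    "ORGANIZER:mailto:organizer@example.com",
    "ATTENDEE;RSVP=TRUE:mailto:attendee@example.com",
    "SUMMARY:Planning meeting",
    "END:VEVENT",
]
ical = build_calendar_object("REQUEST", event)
msg = wrap_in_mail(ical, "REQUEST", "organizer@example.com",
                   "attendee@example.com", "Planning meeting")
print(msg["Content-Type"])
```

The same calendar object could in principle be handed to a different transport; only the wrapping function would change, which is the point of keeping iTIP transport-independent.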
The iCalendar Core Object

RFC 2445 defines different properties that can be a part of the iCalendar content type. A text/calendar property is roughly equivalent to the data type defined for the text/directory content type. iCalendar is the IETF version of the vCalendar specification, originally from Versit and currently the responsibility of the Internet Mail Consortium (IMC).
The iCalendar specification, a MIME content type standard, defines the following:

Properties. Properties are the units within which iCalendar object data are contained. As defined in RFC 2445, a property is "the definition of an individual attribute describing a calendar or a calendar component." A property may contain an event start time or location, or the name of an event's organizer.

Property parameters. Parameters add meta-information to a property. The property parameter can clarify or add information to properties, for example, specifying the data type of the property value or indicating that the property data is expressed in a particular language.

Property value data types. The data within iCalendar properties is strongly typed, and data values use one of the defined data types. For example, data can be date-time or Boolean.

Components. According to RFC 2445, calendar components are "collections of properties that express a particular calendar semantic." For example, the calendar component can specify an event, a to-do, a journal entry, time zone information, free/busy time information, or an alarm.

The rest of this section discusses these different aspects of the iCalendar specification. Rather than go into detail (RFC 2445 provides all the details in 148 pages), we simply summarize the pieces of the iCalendar standard in tabular format. These tables only hint at the complexity of the iCalendar object. A component usually consists of a very particular set of properties, and each property consists of data and a very specific set of parameters. Values permitted for different properties and the precise configuration of components are all defined in great detail in RFC 2445. In this chapter, there is room only for a summary of these different pieces of the iCalendar object.
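To make the relationship among a property, its parameters, and its value concrete, here is a minimal sketch that splits one iCalendar content line into those three pieces. The function name is hypothetical, the line is assumed to be already unfolded, and multi-valued parameters (comma-separated lists) are left as raw strings rather than split further.

```python
def parse_content_line(line):
    """Split 'NAME;PARAM=VAL;...:VALUE' into (name, params, value)."""
    # Locate the first colon that is not inside a double-quoted parameter value.
    in_quotes = False
    colon = -1
    for i, ch in enumerate(line):
        if ch == '"':
            in_quotes = not in_quotes
        elif ch == ":" and not in_quotes:
            colon = i
            break
    if colon < 0:
        raise ValueError("not a content line: %r" % line)
    head, value = line[:colon], line[colon + 1:]

    # Split the head on semicolons that are outside quotes.
    pieces, current, in_quotes = [], "", False
    for ch in head:
        if ch == '"':
            in_quotes = not in_quotes
            current += ch
        elif ch == ";" and not in_quotes:
            pieces.append(current)
            current = ""
        else:
            current += ch
    pieces.append(current)

    name = pieces[0].upper()
    params = {}
    for piece in pieces[1:]:
        key, _, raw = piece.partition("=")
        params[key.upper()] = raw.strip('"')
    return name, params, value


# Hypothetical attendee line with two parameters.
print(parse_content_line(
    'ATTENDEE;PARTSTAT=ACCEPTED;CN="Pat Example":mailto:pat@example.com'))
```

The quote tracking matters because a parameter value such as a quoted mailto: URI contains a colon that must not be mistaken for the separator before the property value.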
Properties
iCalendar properties hold all the calendar data. Table 16.1 lists all the iCalendar properties defined in RFC 2445, along with their categories and descriptions. Each property fits into one category, which defines how it is used. Broadly speaking, properties are either calendar properties, in which case they occur only once in each calendar object and are not a part of any component, or component properties, which can occur within calendar object components, per the component specification. Property classifications include:

Calendar. From RFC 2445: "The Calendar Properties are attributes that apply to the iCalendar object, as a whole. These properties do not appear within a calendar component. They SHOULD be specified after the BEGIN:VCALENDAR property and prior to any calendar component."

Descriptive. Provides descriptive information about the calendar component.

Date and time. Provides date and time information about the calendar component.

Time zone. Related to component time zone information.

Relationship. Indicates how the component relates to the calendar object.

Recurrence. Specifies how and whether the component repeats over time.

Alarm. Defines how, when, and how often an alarm action should occur.

Change. Specifies change management information for calendar components.

Miscellaneous. Includes the nonstandard/experimental X- properties as well as the REQUEST-STATUS property that provides information about a scheduling request.

In several cases, the property descriptions in Table 16.1 refer to iCalendar components such as todo, event, journal, or freebusy. Components are defined later in this chapter, but component names are descriptive: A todo component represents something that would go on a to-do list, and a freebusy component contains information about unallocated time (free) or allocated time (busy) for an entity.
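The placement rule quoted above for calendar properties (after BEGIN:VCALENDAR and before any component) lends itself to a simple check. The following sketch is illustrative only: the function name is hypothetical, the four property names are the ones Table 16.1 places in the calendar category, and the input is assumed to be a list of unfolded content lines.

```python
CALENDAR_PROPERTIES = {"CALSCALE", "METHOD", "PRODID", "VERSION"}


def misplaced_calendar_properties(lines):
    """Return calendar properties that appear inside or after a component."""
    misplaced = []
    depth = 0               # nesting depth of BEGIN/END blocks
    seen_component = False  # becomes True once any component has started
    for line in lines:
        name = line.split(":", 1)[0].split(";", 1)[0].upper()
        if name == "BEGIN":
            depth += 1
            if depth > 1:   # depth 1 is the VCALENDAR wrapper itself
                seen_component = True
        elif name == "END":
            depth -= 1
        elif name in CALENDAR_PROPERTIES and seen_component:
            misplaced.append(name)
    return misplaced


well_formed = ["BEGIN:VCALENDAR", "VERSION:2.0", "PRODID:-//Example//Sketch//EN",
               "BEGIN:VEVENT", "SUMMARY:x", "END:VEVENT", "END:VCALENDAR"]
out_of_place = ["BEGIN:VCALENDAR", "VERSION:2.0",
                "BEGIN:VEVENT", "SUMMARY:x", "END:VEVENT",
                "METHOD:PUBLISH", "END:VCALENDAR"]
print(misplaced_calendar_properties(out_of_place))
```

Since the RFC wording is SHOULD rather than MUST, a tolerant application might log such a finding rather than reject the object.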
Table 16.1 iCalendar Properties Defined in RFC 2445

| PROPERTY NAME | CATEGORY | DESCRIPTION |
|---|---|---|
| CALSCALE | calendar | The calendar scale (for example, Gregorian) used by the iCalendar object. |
| METHOD | calendar | Indicates a protocol action or transaction, as defined in iTIP (or any other valid application protocol). |
| PRODID | calendar | Indicates (by ID value) the product that generated the iCalendar object. |
| VERSION | calendar | Indicates the version number(s) of iCalendar required to interpret (or capable of interpreting) the iCalendar object. |
| ATTACH | descriptive | Used to associate a document (by default, identified by a URI) with an event. |
| CATEGORIES | descriptive | Used to specify categories or subtypes assigned to an event, todo, or journal calendar component. |
| CLASS | descriptive | Specifies an access classification for an event, todo, or journal component, indicating whether the component is public, private, or something else. |
| COMMENT | descriptive | Contains informational data for a user, rather than data to be processed by the calendar application. |
| DESCRIPTION | descriptive | Used with event and todo components to provide a complete textual description of the component. |
| GEO | descriptive | Contains the latitude/longitude of a location related to the event or todo component. |
| LOCATION | descriptive | Identifies the venue (room or other location) for the event either in text or by pointing to a URI that references the location. |
| PERCENT-COMPLETE | descriptive | An integer representing the percent completion of a todo component. Used by the assignee or delegatee of a todo and submitted to the organizer of the todo component. |
| PRIORITY | descriptive | Defines a relative priority for an event or todo component. |
| RESOURCES | descriptive | Describes the equipment or resources expected to be needed for a todo or event component. |
| STATUS | descriptive | Indicates an overall status for a todo, confirmation status for an event, or draft level for a journal item. Example values include TENTATIVE, IN-PROCESS, FINAL. |
| SUMMARY | descriptive | Summarizes the content of a component or contains a subject title. |
| COMPLETED | date and time | Contains the date and time a todo component was completed. |
| DTEND | date and time | Contains the date and time an event or freebusy component ends. |
| DUE | date and time | Contains a date and time a todo is expected to be completed. |
| DTSTART | date and time | Indicates a starting date and time for an event, todo, freebusy, or timezone component (in the last instance, indicating the time when that timezone specification is in effect, especially for daylight savings). |
| DURATION | date and time | Specifies a length of time for event, todo, freebusy, or alarm components. |
| FREEBUSY | date and time | Indicates a start and stop time and an availability status (free or busy); used in the freebusy component. |
| TRANSP | date and time | Time transparency. Indicates whether an event should be transparent (marked as committed) to busy time searches. |
| TZID | time zone | Specifies a unique time zone value. Required in the timezone component. |
| TZNAME | time zone | Specifies a name associated with a time zone. Optional for timezone components. |
| TZOFFSETFROM | time zone | Indicates the hours and minutes offset from UTC prior to the timezone indicated in the timezone component. |
| TZOFFSETTO | time zone | Indicates the hours and minutes offset from UTC in the current time zone indicated in the timezone component. |
| TZURL | time zone | Points to a URI that contains up-to-date information about the timezone component. |
| ATTENDEE | relationship | Identifies a person attending or participating in a calendar component. Takes parameters for more complete description of attendee. |
| CONTACT | relationship | Contains contact information related to the component or a URI pointing to such information, especially when referring to an individual acting as a contact resource. |
| ORGANIZER | relationship | Specifies the person organizing iCalendar objects specifying a group scheduled calendar entity; this property is required for such components. |
| RECURRENCE-ID | relationship | Used with the UID and SEQUENCE property to identify a particular instance of a recurring event, todo, or journal. |
| RELATED-TO | relationship | Contains a UID of another component, to indicate a relation with that component from the component in which this property occurs. |
| URL | relationship | Identifies a URL associated with the component. |
| UID | relationship | A globally unique identifier for the component. |
| EXDATE | recurrence | Indicates exception date/times for recurring components. Used with other recurrence properties. |
| EXRULE | recurrence | Defines a rule or pattern for determining exceptions to a rule set for scheduling a recurring component. |
| RDATE | recurrence | Lists recurrence dates for a recurring component. |
| RRULE | recurrence | Specifies a rule or pattern for scheduling a recurring component. |
| ACTION | alarm | Specifies an action to occur when an alarm is set off. |
| REPEAT | alarm | Indicates the number of times an alarm is repeated after the initial trigger. |
| TRIGGER | alarm | Defines when an alarm is set off. |
| CREATED | change | The date and time the information in the component was created by the calendar user agent. |
| DTSTAMP | change | The date and time that the instance of the component was created. Used to indicate when the component was placed into a protocol message, not to indicate the creation of the information. |
| LAST-MODIFIED | change | The date and time the information in the component was last updated in the calendar store. |
| SEQUENCE | change | Indicates the number of previous versions of the component prior to this. |
| X-* | misc | Experimental or other nonstandard parameters are allowed. |
| REQUEST-STATUS | misc | Contains a code (similar to other protocols' reply codes) indicating success or failure status. |
Property Parameters
Modifiers can be used to make the text/calendar properties more specific. For example, the property ATTENDEE (containing data that identifies an attendee of an event) can take quite a few different parameters, including the PARTSTAT (participation status) parameter, which indicates whether the attendee has accepted, declined, or chosen one of several other responses to an event. Table 16.2 lists iCalendar property parameters defined in RFC 2445.

Property parameters contain meta-information: either some information about the property or information about the property value. For example, the parameter FBTYPE, taken by the FREEBUSY property, indicates a free/busy time type, which can be FREE, BUSY, BUSY-UNAVAILABLE, or BUSY-TENTATIVE. The FREEBUSY property by itself just indicates a stretch of time; the FBTYPE parameter specifies whether that time is available. Other parameters provide meta-information about the property value, such as ALTREP, which indicates an alternate representation for the contents of the property.
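The FREEBUSY and FBTYPE pairing can be illustrated with a short sketch that reads one FREEBUSY line and returns the free/busy type together with the time periods it covers. The function names are hypothetical. The sketch handles only periods written as explicit UTC start/end pairs, and it treats a missing FBTYPE as BUSY, which is my reading of the default in RFC 2445.

```python
from datetime import datetime, timezone


def parse_utc(stamp):
    """Parse a UTC date-time such as 19990615T170000Z."""
    return datetime.strptime(stamp, "%Y%m%dT%H%M%SZ").replace(tzinfo=timezone.utc)


def parse_freebusy(line):
    """Return (fbtype, [(start, end), ...]) for one FREEBUSY content line."""
    head, _, value = line.partition(":")
    name, *param_parts = head.split(";")
    if name.upper() != "FREEBUSY":
        raise ValueError("not a FREEBUSY property: %r" % line)
    params = {}
    for part in param_parts:
        key, _, val = part.partition("=")
        params[key.upper()] = val
    fbtype = params.get("FBTYPE", "BUSY")  # assumed default when absent
    periods = []
    for period in value.split(","):
        start, end = period.split("/")
        periods.append((parse_utc(start), parse_utc(end)))
    return fbtype, periods


# Hypothetical line marking two tentatively busy hours.
fbtype, periods = parse_freebusy(
    "FREEBUSY;FBTYPE=BUSY-TENTATIVE:"
    "19990615T170000Z/19990615T180000Z,19990616T130000Z/19990616T140000Z")
print(fbtype, len(periods))
```

The property value alone yields only the stretches of time; it is the parameter that tells a scheduling application how to treat them.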
Table 16.2 iCalendar Property Parameter Values (from RFC 2445)

| PARAMETER | DESCRIPTION |
|---|---|
| ALTREP | Alternate text representation |
| CN | Common name |
| CUTYPE | Calendar user type |
| DELEGATED-FROM | Delegator |
| DELEGATED-TO | Delegatee |
| DIR | Directory entry |
| ENCODING | Inline encoding |
| FMTTYPE | Format type |
| FBTYPE | Free/busy time type |
| LANGUAGE | Language for text |
| MEMBER | Group or list membership |
| PARTSTAT | Participation status |
| RANGE | Recurrence identifier range |
| RELATED | Alarm trigger relationship |
| RELTYPE | Relationship type |
| ROLE | Participation role |
| RSVP | RSVP expectation |
| SENT-BY | Sent by |
| TZID | Reference to time zone object |
| VALUE | Property value data type |
Property Value Data Types

iCalendar properties are strongly typed, and RFC 2445 defines several property value data types that are permitted for property values. These are listed in Table 16.3.
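Strong typing means an application converts each property value according to its declared value type. The sketch below shows one way to do this for a handful of the types; it is an illustration, not a complete implementation. The function and table names are hypothetical, and only BOOLEAN, INTEGER, FLOAT, DATE, DURATION, and UTC-OFFSET are covered.

```python
import re
from datetime import datetime, timedelta

# A duration is either whole weeks (P7W) or days and/or a time part (P15DT5H0M20S).
_DURATION = re.compile(
    r"^(?P<sign>[+-])?P(?:(?P<weeks>\d+)W|"
    r"(?:(?P<days>\d+)D)?"
    r"(?:T(?:(?P<hours>\d+)H)?(?:(?P<minutes>\d+)M)?(?:(?P<seconds>\d+)S)?)?)$"
)


def parse_duration(text):
    """Convert a DURATION value into a timedelta."""
    m = _DURATION.match(text)
    parts = {}
    if m:
        parts = {k: int(v) for k, v in m.groupdict().items()
                 if k != "sign" and v is not None}
    if not parts:
        raise ValueError("bad DURATION: %r" % text)
    delta = timedelta(**parts)
    return -delta if m.group("sign") == "-" else delta


def parse_utc_offset(text):
    """Convert a UTC-OFFSET value such as -0500 into a timedelta."""
    m = re.fullmatch(r"([+-])(\d{2})(\d{2})(\d{2})?", text)
    if not m:
        raise ValueError("bad UTC-OFFSET: %r" % text)
    sign, hh, mm, ss = m.groups()
    delta = timedelta(hours=int(hh), minutes=int(mm), seconds=int(ss or 0))
    return -delta if sign == "-" else delta


# Converters keyed by value type name; a real application would cover all types.
PARSERS = {
    "BOOLEAN": lambda t: {"TRUE": True, "FALSE": False}[t.upper()],
    "INTEGER": int,
    "FLOAT": float,
    "DATE": lambda t: datetime.strptime(t, "%Y%m%d").date(),
    "DURATION": parse_duration,
    "UTC-OFFSET": parse_utc_offset,
}


def convert(value_type, text):
    """Convert raw property text according to its value data type."""
    return PARSERS[value_type](text)


print(convert("DURATION", "P15DT5H0M20S"), convert("UTC-OFFSET", "-0500"))
```

A dispatch table keyed by type name keeps the per-type rules separate, so support for the remaining types could be added without touching the callers.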
iCalendar Components
If iCalendar properties and property parameters are the building blocks with which an iCalendar component is built, then iCalendar components are the building blocks of the iCalendar object itself. Table 16.4 summarizes the six different iCalendar components defined in RFC 2445. iCalendar components all start with the letter V to differentiate them from other entities and terms. Part of the reason for the length of RFC 2445 is that a great degree of dependency exists between components and the properties that make them up. This requires a very detailed specification of which properties can constitute any given component and even of which parameters those properties may take or must take when they constitute any given component. This degree of detail is omitted here; the reader interested in the details of the specification is directed to RFC 2445.
Table 16.3 iCalendar Property Value Data Types

| TYPE | DESCRIPTION |
|---|---|
| BINARY | Indicates inline binary data. |
| BOOLEAN | TRUE or FALSE. |
| CAL-ADDRESS | A URI, usually a mailto:, with a calendar user's address. |
| DATE | A date value. |
| DATE-TIME | A date-time value. |
| DURATION | A time duration expressed in some combination of weeks, days, hours, minutes, and seconds. |
| FLOAT | A floating point, real number value. |
| INTEGER | A numerical, integer value. |
| PERIOD | One or more precisely delimited time periods. |
| RECUR | A recurrence rule. |
| TEXT | Human-readable text data. |
| TIME | A time of day value. |
| URI | A Uniform Resource Identifier, such as a URL. |
| UTC-OFFSET | Number of minutes and hours offset from UTC time. |
Table 16.4 iCalendar Components

| COMPONENT | DESCRIPTION |
|---|---|
| VEVENT | An event. A group of properties that describe an event. An event represents a scheduled amount of time on a calendar and may include a VALARM component. A VEVENT may also identify an anniversary or daily reminder. |
| VTODO | A to-do task. A group of calendar properties that describe an action item or some other assignment: something that is on somebody's to-do list. |
| VJOURNAL | A journal entry. A group of calendar properties that describe a journal entry. RFC 2445 defines a journal entry as one or more descriptive text notes associated with a particular calendar date. VJOURNAL components can also associate a particular document with a calendar date. |
| VFREEBUSY | A set of properties relating to free/busy time information. It can be a request for free/busy information, a response to such a request, or a component that carries public free/busy information. |
| VTIMEZONE | A set of properties that define a time zone. |
| VALARM | A set of properties that define an alarm or reminder for an event or a todo. |
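Components nest: a calendar object contains components such as VEVENT, and a VEVENT may in turn contain a VALARM. A minimal sketch of recovering that structure from BEGIN/END lines follows. The function name, the dictionary layout, and the sample data are hypothetical, and the input is assumed to be already unfolded.

```python
def parse_components(text):
    """Turn BEGIN/END-delimited content lines into nested dictionaries."""
    root = {"name": "ROOT", "properties": [], "children": []}
    stack = [root]
    for line in text.splitlines():
        if not line.strip():
            continue
        upper = line.upper()
        if upper.startswith("BEGIN:"):
            node = {"name": upper[6:].strip(), "properties": [], "children": []}
            stack[-1]["children"].append(node)
            stack.append(node)
        elif upper.startswith("END:"):
            name = upper[4:].strip()
            if len(stack) == 1 or stack[-1]["name"] != name:
                raise ValueError("unbalanced END:" + name)
            stack.pop()
        else:
            # Any other line is a property of the innermost open component.
            stack[-1]["properties"].append(line)
    if len(stack) != 1:
        raise ValueError("missing END for " + stack[-1]["name"])
    return root["children"]


# Hypothetical event with a display alarm 15 minutes beforehand.
SAMPLE = "\r\n".join([
    "BEGIN:VCALENDAR",
    "VERSION:2.0",
    "PRODID:-//Example//Sketch//EN",
    "BEGIN:VEVENT",
    "UID:sketch-0002@example.com",
    "DTSTART:19990615T170000Z",
    "SUMMARY:Planning meeting",
    "BEGIN:VALARM",
    "ACTION:DISPLAY",
    "TRIGGER:-PT15M",
    "DESCRIPTION:Meeting in 15 minutes",
    "END:VALARM",
    "END:VEVENT",
    "END:VCALENDAR",
])

calendar = parse_components(SAMPLE)[0]
event = calendar["children"][0]
print(calendar["name"], event["name"], event["children"][0]["name"])
```

This sketch checks only that BEGIN and END lines balance; the detailed rules about which properties each component may or must contain are the part that RFC 2445 spells out at length.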
iTIP
It is all well and good to have a specification for an iCalendar object, but RFC 2445 defines only how the data for calendaring and scheduling applications is packaged into a MIME object. RFC 2446, "iCalendar Transport-Independent Interoperability Protocol (iTIP): Scheduling Events, BusyTime, To-dos and Journal Entries," defines a set of rules for using iCalendar objects to interoperate between and among calendaring systems. RFC 2446 outlines the rules by which iCalendar objects are exchanged for the purpose of publishing and transmitting calendaring information as well as for exchanging requests, responses, and negotiation of changes or cancellations of calendar events.
iTIP Roles and Methods

The iTIP specification uses methods to exchange iCalendar objects between calendar users. Table 16.5, taken from RFC 2446, shows the kinds of interactions that are possible among calendar users. A calendar user may act either as an organizer or as an attendee. The organizer organizes a scheduled event by initiating an iTIP exchange. The attendee receives iTIP messages and can respond appropriately to messages sent out by the organizer. An iTIP method is either a request or a reply, though some methods, such as PUBLISH, do not require any responses. Of the methods listed in Table 16.5, the organizer may use only PUBLISH, REQUEST, ADD, CANCEL, and DECLINE-COUNTER. The attendee can use only the REPLY, REFRESH, and COUNTER methods, and the REQUEST method only when the attendee is delegating an event to another calendar user.
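The role restrictions just described amount to a small lookup. The following sketch encodes them as stated in the paragraph above; the function name and the delegating flag are hypothetical conveniences, not anything defined by RFC 2446.

```python
ORGANIZER_METHODS = {"PUBLISH", "REQUEST", "ADD", "CANCEL", "DECLINE-COUNTER"}
ATTENDEE_METHODS = {"REPLY", "REFRESH", "COUNTER"}


def may_send(role, method, delegating=False):
    """Report whether a calendar user in the given role may use the method."""
    method = method.upper()
    if role == "organizer":
        return method in ORGANIZER_METHODS
    if role == "attendee":
        if method == "REQUEST":
            # Attendees may send REQUEST only to delegate to another user.
            return delegating
        return method in ATTENDEE_METHODS
    raise ValueError("unknown role: %r" % role)


print(may_send("organizer", "CANCEL"),
      may_send("attendee", "REQUEST"),
      may_send("attendee", "REQUEST", delegating=True))
```

An application might run such a check before sending a message, or on receipt, to discard messages whose method does not fit the sender's role.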
iTIP Protocol Elements

All iTIP protocol messages are actually text/calendar MIME entities (defined in RFC 2445) that contain some kind of calendaring or scheduling information. An iTIP message is called a method type and includes a METHOD property (see Table 16.1). The METHOD property value can be any of the methods listed in Table 16.5. Not all methods are permitted for all iCalendar components. You may use the PUBLISH method for an event, a todo, a journal, or a freebusy component, but you cannot use a CANCEL method for a freebusy component. Table 16.6, taken from RFC 2446, indicates which methods are compatible with which components.

Table 16.5 iTIP Methods (from RFC 2446)

| METHOD | DESCRIPTION |
|---|---|
| PUBLISH | Used to publish a calendar entry to one or more Calendar Users. There is no interactivity between the publisher and any other calendar user. An example might include a baseball team publishing its schedule to the public. |
| REQUEST | Used to schedule a calendar entry with other Calendar Users. Requests are interactive in that they require the receiver to respond using the Reply methods. Meeting requests, busy time requests, and the assignment of VTODOs to other Calendar Users are all examples. Requests are also used by the Organizer to update the status of a calendar entry. |
| REPLY | A Reply is used in response to a Request to convey Attendee status to the Organizer. Replies are commonly used to respond to meeting and task requests. |
| ADD | Add one or more instances to an existing VEVENT, VTODO, or VJOURNAL. |
| CANCEL | Cancel one or more instances of an existing VEVENT, VTODO, or VJOURNAL. |
| REFRESH | The Refresh method is used by an Attendee to request the latest version of a calendar entry. |
| COUNTER | The Counter method is used by an Attendee to negotiate a change in the calendar entry. Examples include the request to change a proposed Event time or change the due date for a VTODO. |
| DECLINE-COUNTER | Used by the Organizer to decline the proposed counterproposal. |
iMIP
The iTIP protocol defines how iCalendar objects can be used to create calendaring and scheduling transactions, but it does not address the issue of how to get those transactions from one calendar user to another. The iCalendar Message-based Interoperability Protocol (iMIP) largely specifies how to stick a text/calendar MIME content type body into a mail message. Very simply, the binding is done by appending the text/calendar MIME content type to a message and making sure that the MIME Content-Type header field includes the type parameter METHOD. The following example shows how this parameter appears in a typical Content-Type header field:
Content-Type:text/calendar; method=REQUEST; charset=US-ASCII
The value passed with the parameter must be the same as the value of the METHOD property of the iCalendar object. Thus, the example here would be valid only if the iCalendar object included the METHOD property and the value of that property was REQUEST.
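The binding just described can be sketched with Python's standard email package. The helper names icalendar_method and build_imip_message are hypothetical, and the sketch covers only the point made here, namely that the method parameter on Content-Type is copied from the METHOD property of the iCalendar object; it makes no attempt to cover the rest of iMIP.

```python
from email.message import EmailMessage


def icalendar_method(ics):
    """Return the value of the METHOD property in an iCalendar object."""
    for line in ics.splitlines():
        name, _, value = line.partition(":")
        if name.upper() == "METHOD":
            return value.strip().upper()
    raise ValueError("iCalendar object has no METHOD property")


def build_imip_message(sender, recipient, subject, ics):
    """Wrap an iCalendar object in a mail message as text/calendar."""
    method = icalendar_method(ics)
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Taking the parameter from the object itself keeps the two in step.
    msg.set_content(ics, subtype="calendar", charset="us-ascii",
                    params={"method": method})
    return msg
```

Deriving the parameter from the object, rather than passing it in separately, is one simple way to guarantee the two values never disagree.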
Table 16.6 iTIP Methods and iCalendar Components (from RFC 2446)
METHOD             VEVENT   VTODO   VJOURNAL   VFREEBUSY
PUBLISH            Yes      Yes     Yes        Yes
REQUEST            Yes      Yes     No         Yes
REFRESH            Yes      Yes     No         No
CANCEL             Yes      Yes     Yes        No
ADD                Yes      Yes     Yes        No
REPLY              Yes      Yes     No         Yes
COUNTER            Yes      Yes     No         No
DECLINE-COUNTER    Yes      Yes     No         No
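Table 16.6 translates naturally into a lookup structure. The sketch below is one way an application might encode it; the names COMPONENT_METHODS and method_allowed are made up for the example, and the code treats DECLINE-COUNTER and DECLINECOUNTER as the same method by ignoring the hyphen.

```python
ALL_METHODS = frozenset({
    "PUBLISH", "REQUEST", "REFRESH", "CANCEL",
    "ADD", "REPLY", "COUNTER", "DECLINECOUNTER",
})

# One entry per column of Table 16.6.
COMPONENT_METHODS = {
    "VEVENT": ALL_METHODS,
    "VTODO": ALL_METHODS,
    "VJOURNAL": frozenset({"PUBLISH", "CANCEL", "ADD"}),
    "VFREEBUSY": frozenset({"PUBLISH", "REQUEST", "REPLY"}),
}


def method_allowed(component, method):
    """Return True if Table 16.6 permits `method` for `component`."""
    normalized = method.upper().replace("-", "")
    return normalized in COMPONENT_METHODS.get(component.upper(), frozenset())
```

With this table, method_allowed("VFREEBUSY", "CANCEL") is false and method_allowed("VJOURNAL", "ADD") is true.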
iCalendar Works in Progress

The CALSCH workgroup of the IETF is responsible for specifying a complete architecture for interoperable calendaring and scheduling applications over the Internet. A large part of that task involved the specification of the iCalendar core object in RFC 2445 and the iTIP and iMIP protocols in RFC 2446 and RFC 2447. Now, there is enough protocol infrastructure to enable any two calendaring systems to interoperate through the mechanism of MIME objects transmitted over the existing Internet messaging transport. However, iMIP provides only a store-and-forward protocol for calendaring and scheduling. Two other protocols have yet to be fully specified and published as RFCs. The Calendar Access Protocol (CAP) and the iCalendar Real-time Interoperability Protocol (iRIP) flesh out the Internet calendaring architecture.
Calendar Access Protocol (CAP)

CAP specifies a mechanism for accessing and managing iCalendar-based calendars. At present, CAP exists only as a draft of a requirements specification. CAP defines how a calendar user agent (CUA) interacts with a calendar store (CS), chooses a calendar to interact with, and views, updates, adds, or deletes events and other components contained in the chosen calendar. CAP is not meant to provide a mechanism for interaction between calendar users, but rather to mediate how a calendar user interacts with a calendar store or calendar service. Details of CAP, in current Internet-Drafts, can be found through the CALSCH Web site.
iRIP
Though further along the process of publication as an RFC, iRIP is still an Internet-Draft. Where iMIP provides a mechanism for binding iTIP calendaring interactions into Internet messaging transports, iRIP provides a binding of the iTIP interactions into a real-time protocol for instant interaction. iRIP defines a stateful protocol used by calendar services (CS) to exchange iTIP messages on behalf of client calendar user agents (CUAs). Details of iRIP, in current Internet-Drafts, can be found through the CALSCH Web site.
Reading List
In addition to all the MIME specifications, the RFCs specified in Table 16.7 can provide more information about the specifications discussed in this
chapter. The interested reader is also referred to the Internet Mail Consortium Web site (www.imc.org/pdi) and the CALSCH workgroup Web site (www.ietf.org/html.charters/calsch-charter.html) for the latest information about these standards.
Table 16.7 Relevant RFCs
RFC         TITLE
RFC 2425    A MIME Content-Type for Directory Information
RFC 2446    iCalendar Transport-Independent Interoperability Protocol (iTIP) Scheduling Events, BusyTime, To-Dos and Journal Entries
RFC 2447    iCalendar Message-Based Interoperability Protocol (iMIP)
CHAPTER
17
Internet Messaging Security
Internet security protocols can operate at the network layer (IPsec), at the transport layer (TLS/SSL), and at the application layer. This chapter provides a very brief overview of IPsec and TLS with discussion about how they relate to messaging security, along with pointers to find out more about transport and network layer security protocols. Though all these protocols provide security in one way or another, they are not specific to email or other Internet messaging protocols. We also examine some current specifications for message and MIME security and explore the question of why email and Internet messaging security has not received more complete treatment by the standards bodies.
Security is a vital component of Internet messaging applications, considering how important those applications are to so many organizations and individuals. The term security covers a lot of ground, but for the purposes of most Internet application protocols, security generally consists of the following:
Keeping private information private. This means doing something (almost always using encryption) to make sure that only the author of a message or other data and the intended, authorized recipient of that data are able to interpret it. All that others should see is random-seeming bits.
Data integrity. This means using some mechanism to prevent any third party from intercepting and modifying a message or other data. Maintaining data integrity means that the data arrives at its destination unchanged from the way it was sent from its source. Data integrity is usually achieved through the use of a cryptographic hash function or a digital signature; the recipient can verify whether the data has been modified en route and take appropriate action if the data was tampered with.
Authorization and access control. Internet messaging applications require that servers offer their services rather freely to unfamiliar hosts. Mechanisms to prevent unauthorized use of services are necessary. Likewise, Internet messaging protocols, such as IMAP and SMTP, must employ mechanisms that keep unauthorized users from accessing possibly sensitive messages. A variety of mechanisms for authorization and access control have been incorporated into messaging protocols.
Denial of service attacks. Vandals who are unable to access data on remote hosts without authorization often turn to denial of service attacks. Attempting to bring down Internet messaging servers and other related systems using Internet messaging protocols is another security problem, as is the unauthorized use of messaging systems by spammers and others.
Some of these goals are surprisingly simple to accomplish with Internet messaging applications, while others are considerably more difficult. After discussing some of the protocols designed for Internet security, we see why messaging security is so much more difficult and why some of it is relatively easy.
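The data integrity goal listed above can be illustrated with Python's standard library. A bare hash offers no protection if an attacker can replace both the message and its hash, so this sketch substitutes a keyed hash (HMAC with SHA-256) and assumes the sender and recipient already share a secret key; the function names make_tag and is_intact are invented for the example. The protocols discussed later in this chapter rely on digital signatures rather than shared keys.

```python
import hashlib
import hmac


def make_tag(key, message):
    """Compute an integrity tag over `message` using a shared secret key."""
    return hmac.new(key, message, hashlib.sha256).digest()


def is_intact(key, message, received_tag):
    """Return True if `message` still matches the tag that arrived with it."""
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(make_tag(key, message), received_tag)
```

The recipient recomputes the tag over what actually arrived; any change to the message in transit makes the comparison fail.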
Lower-Layer Security Protocols

Two important protocols have been defined to provide security for Internet applications: the IP Security Architecture (IPsec) and the Transport Layer Security (TLS) protocol. IPsec, defined in RFC 2401, Security Architecture for the Internet Protocol, as well as in several other related documents, provides a framework for secure transmission of packets across an open network. IPsec assumes that the network transport is at best neutral, but most likely a threatening environment full of attackers who would like to intercept, read, and perhaps forge your data.
TLS is a revised version of the Secure Sockets Layer (SSL), which provides security at the transport layer. Specified in RFC 2246, The TLS Protocol Version 1.0, TLS provides a protocol that processes can use to secure their communications. We see next how the distinction between communicating hosts and communicating processes affects security implementations. TLS is an important adjunct to Internet messaging security. RFC 2595, Using TLS with IMAP, POP3 and ACAP, defines how it can be used to add an additional layer of security to those messaging protocols.
Network security is an end-to-end function. This means that security must be negotiated from the source to the destination, but that means from the protocol's source entity to the protocol's destination entity. For example, using the IP security architecture (IPsec) means that a system using one IP address encrypts and/or digitally signs IP packets to be sent, and the system connected to the destination IP address decrypts and/or certifies the secured IP packets. The information inside the packets is secure only as long as it traverses the Internet, intranet, or other network. It is not secure before it is encrypted and/or digitally signed, and it is not secure after it is received at its destination and decrypted and/or certified.
End-to-end security also means that security at any given protocol layer must operate on protocol data units. With IPsec, the entire IP packet is secured, whereas with TLS, only the transport layer protocol data unit (with TCP, a segment) is encrypted and/or digitally signed. TLS by itself protects the data being sent down from the application as well as any application protocol commands, but it does not protect the transport layer or network layer headers. Attackers may not be able to see the data being passed, but they can certainly see what nodes are communicating as well as possibly infer a great deal about what kind of application protocol information is being transmitted, particularly if they are able to identify the source and destination ports of the traffic. TLS thus leaves open the possibility that attackers can do traffic analysis on the encrypted streams, making spoofing attacks possible.
One of the greatest benefits of using TLS is that it provides greater assurance than IPsec of end-to-end security. Encryption or digital signature happens at the source system as data is being handed off by the process that creates it. The source system creates the data and processes it through the application layer to be formatted for transmission to its destination. It is then encrypted as it is packaged for transmission to the destination process running on the destination system.
Security occurs immediately below the application layer. With IPsec, security tunnels may be created between security gateways for protected data communication across an open network, but packets traveling from the source to the security gateway won't necessarily be protected. Likewise, packets traveling from a destination security gateway to the actual destination also won't necessarily be protected. Figure 17.1 shows that although the data may be protected for most of its route, it is not necessarily protected for the entire route. Of course, this is just one architectural approach to deploying IPsec; the secure tunnel may also be extended all the way from the source to the destination systems. However, when security gateways are used, traffic analysis yields relatively little information about the tunneled data: only that encrypted datagrams are being passed from the source security gateway to the destination security gateway. Tunneling between the actual source and destination would reveal more useful information to a traffic analyzer. TLS secured channels usually encrypt or digitally sign data streams running over TCP virtual circuits. Although TLS offers a greater degree of end-to-end security, it leaves more information about the source and destination vulnerable to traffic analysis.
Figure 17.1 IPsec protecting packets sent over the Internet. [Diagram labels: Source, Intranet, Security Gateway, IPsec Encrypted Tunnel, Internet, Security Gateway, Intranet, Destination]
Internet Messaging Security Issues

The problem with TLS and IPsec as security protocols for protecting Internet messages is that Internet messaging protocols don't have a one-to-one mapping with the systems that carry Internet messages. Consider Internet email: One system is used by the originator of an email message to compose and submit the message to an SMTP server. That SMTP server may only be a relay, accepting messages from users and passing them along to another SMTP server that actually routes the messages to an SMTP server acting for the destination mailbox. From there, the message is stored on a POP server, which is accessed by the destination MUA. In all, there are five different application protocol interactions in a chain linking five different servers, using two different application protocols, as shown in Figure 17.2. Figure 17.2 clearly demonstrates that Internet messaging security must be applied at a higher, rather than lower, protocol layer. Securing the link between the source MUA and its SMTP relay achieves next to nothing if the SMTP server connecting to the remote SMTP server across the open Internet is not secure.
Figure 17.2 Lower-layer security protocols cannot be relied upon to protect Internet messages. [Diagram labels: Source MUA, Intranet, SMTP Relay, SMTP Server, Internet, SMTP Server, POP Server, Intranet, Destination MUA]
Applying IPsec or TLS in this case also offers relatively little benefit. Although the entity sending the message can use IPsec to securely tunnel packets to the SMTP relay and, within that tunnel, use TLS to encrypt a virtual circuit, the source entity has no control over how other hosts beyond the SMTP relay handle their connections. We've already seen how SMTP, POP, and IMAP can use authentication extensions, but these precautions are useful only for protecting one interaction at a time. So even applying security at the individual application protocol layer, an approach that has been successful for other Internet applications, does not solve the problem for Internet messaging. The problem is that Internet messaging is a store-and-forward application. Building security between the entities that store and forward messages doesn't address the fundamental problems of keeping messages private and preventing message forgery or alteration. In this section, we discuss what exactly messaging security entails in more detail.
Message Privacy
Encryption is invariably the solution when discussing Internet security and how to keep information private. Encryption is used in IPsec and TLS, as well as many
application layer security protocols. However, encryption of the protocol data units (whether they are IP datagrams or TCP segments) protects the data encapsulated within those protocol data units only when they are in transit. Once the packet or segment arrives at its destination, the data within is decrypted. Even if encryption were applied at the application protocol layer, with SMTP, IMAP, or POP doing the encryption and decryption as the data containing the message is transferred from one host to another, the data would still be decrypted by each server before being re-encrypted and passed along to the next server. The solution to the problem of how to protect Internet message privacy is to bypass the protocols and encrypt the message itself. However, lower-layer protocols are still valuable even when the message is encrypted. Although the message (actually, the message body) is encrypted, the message headers must still be in plain text. Thus, traffic analysis may be brought to bear on the message if it is transported openly across an open network. Attackers can determine who is sending the message and who is receiving the message, and they can read any headers that are included with the message. Later in this chapter we discuss several approaches to Internet message encryption.
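The point that headers stay readable even when the body is encrypted can be seen by building such a message with Python's email package. No encryption is performed in this sketch: the ciphertext argument stands in for the output of whatever encryption tool is in use, and the function name wrap_ciphertext and the sample addresses are assumptions made for illustration.

```python
from email.message import EmailMessage


def wrap_ciphertext(sender, recipient, subject, ciphertext):
    """Return the wire form of a message whose body is already encrypted."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # The body is opaque bytes; the headers above are still plain text.
    msg.set_content(ciphertext, maintype="application",
                    subtype="octet-stream")
    return msg.as_bytes()


if __name__ == "__main__":
    placeholder = bytes(range(32))  # stand-in for real ciphertext
    wire = wrap_ciphertext("alice@example.com", "bob@example.com",
                           "Quarterly numbers", placeholder)
    print(wire.decode("ascii"))
```

Anyone watching the message in transit can still read the sender, the recipient, and the subject line, which is why lower-layer protection remains worthwhile.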
Message Integrity
Even if you don't care whether anyone sees what's in your messages, you probably don't want anyone to be able to modify what you have written and you certainly don't want anyone to be able to forge messages and make them look as if they came from you. As it happens, forging email (that is, sending a message that appears to be from someone else) has historically been relatively easy and has long been a source of amusement for adolescents (see sidebar). It is also more or less simple to detect a spoofed message, depending on the skills of the spoofer. Another type of message integrity threat exists. Since Internet messages can pass through open networks, an attacker who has control over an intermediary system could, in theory, grab your message and change part of its content. Assuming that the attacker has taken control of a system that normally processes mail, such an attack could be virtually undetectable. Both types of attack can be thwarted by using digital signatures on messages. These provide message integrity in two ways: If the digital signature is properly verified (and if the public key has been reliably distributed to the recipient), then the recipient can be confident in the identity of the author of the message as well as the integrity of the contents of the message. On the other hand, if the digital signature verification fails, it indicates only that there is something questionable about the message: Its contents may have been deliberately modified or corrupted in some way during transit, some piece of digital signature software (either the signing program or the verification program) may be malfunctioning, or the message may have been forged. In any case, a red flag should be raised if the digital signature is not confirmed. The message should not be trusted.
FORGING EMAIL
As may be obvious from the chapter on SMTP, forging email headers requires only that the attacker be able to act as an SMTP system and be able to connect to another SMTP system that forwards messages from anyone. Increasingly, SMTP system administrators are tightening up their systems to keep unauthorized users out. Spammers (people who use third-party SMTP systems to flood the Internet with unsolicited email) seek out SMTP servers that accept and relay messages from anyone, to anyone (not just to mailboxes for which the server provides service). Because so many ISPs and other organizational entities have rules against the use of their systems for spam, spammers use whatever means are at their disposal to locate friendly servers. To forge email, one needs only to create a Telnet connection to such an SMTP server through the SMTP well-known port (port 25). At that point, the protocol commands can be exchanged between forger and SMTP server in just the same way that two SMTP servers would communicate. Send a MAIL FROM: command with the forged email address you want to appear as the source of the email (something like abraham.lincoln@whitehouse.gov or billg@microsoft.com) and the RCPT TO: command with the destination address of the person you want to send the forged message to. Then enter the DATA command, and you will be prompted to enter the mail message. Enter all headers, followed by a blank line, followed by the forged message, followed by the message termination string of a period on a line by itself followed by a CRLF. Assuming that the SMTP server you've connected to accepts the message, the forged message will be delivered. The spoofed SMTP server indicates the source host for the message; clever forgers will be able to forge their own IP address and host name to keep trackers off their trail. Other, less clever, forgers can still avoid detection if they are able to connect to the spoofed SMTP server from some other host to which they cannot be linked.
Authorization, Access Control, and Other Attacks

Message privacy and integrity are far from the only security issues related to Internet messaging. In fact, it might be argued that these two issues are far less important than issues related to authorization, access control, and other messaging system attacks. As was mentioned in the sidebar about forging email, spammers seek out SMTP servers that will freely relay messages destined for mailboxes other than those served directly by the SMTP server. SMTP system administrators have two concerns. Most believe that spam is a bad thing and want to minimize the amount that is sent out, particularly the amount that is
sent out from their systems. More generally, when a spammer uses a third-party SMTP server, he or she is actually stealing services from the person or organization that owns the SMTP server. This is a type of denial-of-service attack and can cause serious problems if the attacker so overwhelms the system that it becomes unavailable to authorized users for any length of time.
The AUTH extended command for POP and IMAP provides one approach to limiting access to messaging services, specifically to keep unauthorized users out of message stores. In SMTP, the EHLO command offers an avenue for adding security by using SMTP security extensions. For example, SMTP servers can negotiate use of the TLS protocol to secure their exchanges (see RFC 2487, SMTP Service Extension for Secure SMTP over TLS). Likewise, TLS can be applied to IMAP, POP, and ACAP, as described in RFC 2595.
Messaging servers are no more immune from attack than any other type of Internet server, and the same strategies used to protect any Internet server are likely to help protect messaging servers. However, in this chapter, we concentrate on security issues specifically related to Internet messaging. More general security issues are treated in the literature covering network and system security and intrusion detection.
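The negotiation of TLS through EHLO, mentioned above, looks roughly like this when driven from Python's standard smtplib module. The function names make_tls_context and send_over_tls are hypothetical, as are any host names you would pass in; the sketch protects only the single hop between this client and the server it connects to, which is exactly the limitation discussed earlier in the chapter.

```python
import smtplib
import ssl


def make_tls_context():
    """Return a TLS context that verifies the server's certificate."""
    return ssl.create_default_context()


def send_over_tls(host, port, sender, recipient, message_bytes):
    """Submit one message, refusing to proceed without STARTTLS."""
    context = make_tls_context()
    with smtplib.SMTP(host, port, timeout=30) as smtp:
        smtp.ehlo()  # learn which extensions the server advertises
        if not smtp.has_extn("starttls"):
            raise RuntimeError("server did not advertise STARTTLS")
        smtp.starttls(context=context)
        smtp.ehlo()  # extensions must be queried again after TLS starts
        smtp.sendmail(sender, [recipient], message_bytes)
```

Refusing to send when STARTTLS is not offered is a design choice made for this sketch; many deployments fall back to an unprotected session instead.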
Protection from Attack

Messaging server administrators must take all the proper precautions to thwart attacks against their systems, but those precautions do not protect the messages themselves, for all the reasons previously cited. Privacy and message integrity must be addressed at the level of the creation and reading of the message. This means building some mechanisms for encrypting and digitally signing messages. Despite the seeming relative simplicity of this approach, the standards groups have long been unable to agree on standards for Internet messaging security that implementers have been willing to develop and end users have been willing to adopt. We discuss Internet email security standards and why they have been so difficult to agree on in the next section.
Internet Email Security Standards

As with so many other Internet protocols and applications, security for Internet messaging was not a high priority early on. When the standards were being specified (and for many years after), the Internet and its predecessor networks were largely used by academics and researchers who could largely be trusted to not do anything that would be wrong or that might harm others. In any case, security was frequently viewed as a secondary goal for networks and applications being developed simply to see if they could be developed.
Since then, however, there have been several significant efforts to build secure messaging mechanisms, to specify them as proposed standards, and to build support for those standards. So far, the results have been inconclusive. In this section, we outline some of the messaging security protocols that have been proposed and discuss how they fit (or don't fit) into the security architecture of the Internet. We glance at some of the more or less marginal message security protocols (PEM, MOSS, S/MIME) in this section and in the following section highlight the current contenders (MIME multipart security and PGP-related protocols) for message security.
All of these protocols depend on three important technologies: symmetric encryption, asymmetric (public key) encryption, and public key digital signatures. Symmetric encryption is the kind where the sender uses the same key to encrypt the message that the recipient uses to decrypt the message. With public key encryption, there are two keys: One can be made public, and the other is kept private. The sender uses the recipient's public key to encrypt the message, and the recipient uses her private key to decrypt the message. Digital signatures use the same public/private key pairs, but in reverse. The sender takes a secure hash of the message and encrypts that hash using her own private key. Anyone can decrypt that hash (and verify it by running the same hash against the original message) simply by decrypting with the sender's public key. Since only the owner of the private key associated with that public key could possibly have encrypted it, the message can be verified; thus, the digital signature.
Now, the paragraph immediately preceding this is a gross oversimplification of an extremely complex field. Not only does it gloss over the technology of cryptography, but it ignores very important issues of key management and key distribution.
However, a complete discussion of cryptography is not only beyond the scope of this book, but beyond the scope of most books. The interested reader who is completely unfamiliar with cryptographic mechanisms can find a general introduction in my own Personal Encryption Clearly Explained (AP Professional, 1998) or a more comprehensive technical introduction in Bruce Schneier's Applied Cryptography, Second Edition (John Wiley & Sons, 1996). However, each of the protocols discussed in this chapter defines mechanisms for taking a message body or some other data entity (in the case of the MIME-based protocols) and either encrypting it, digitally signing it, or both. The details of which cryptographic algorithms are used and how they are applied can be found in the RFCs.
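The hash-then-sign flow described in this section can be walked through with a toy example in Python. Everything about the key here is an assumption chosen for readability: the modulus is the product of two well-known Mersenne primes, there is no padding, and the result is far too weak for real use. It is meant only to show a hash being transformed with the private exponent and checked with the public one; the names sign and verify are made up for the sketch.

```python
import hashlib

# Toy key material: NOT secure, for illustration only.
P = 2**61 - 1                        # a Mersenne prime
Q = 2**89 - 1                        # another Mersenne prime
N = P * Q                            # public modulus
E = 65537                            # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))    # private exponent (Python 3.8+)


def digest_int(message):
    """Hash the message and reduce the result into the range of N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big") % N


def sign(message):
    """Transform the message hash with the signer's private exponent."""
    return pow(digest_int(message), D, N)


def verify(message, signature):
    """Undo the transform with the public exponent and compare hashes."""
    return pow(signature, E, N) == digest_int(message)
```

A signature produced by sign verifies only against the exact bytes that were signed, so verify(b"hello", sign(b"hello")) succeeds while a changed message fails.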
Privacy Enhanced Mail (PEM)

Privacy Enhanced Mail (PEM) services were published as Internet proposed standards in 1993 and have since then received minimal market support. In
fact, in 1997, the report of the IAB Security Architecture Workshop (RFC 2316) identified PEM as "not useful" as a security protocol because of its lack of acceptance over time. PEM's failure to catch on may have been due to a number of factors, including a lack of generally available software implementations that could do the encryption and digital signature processing it requires as well as lack of a generalized infrastructure for creating and distributing keys. The PEM specifications (RFC 1421 through RFC 1424; see the Reading List section at the end of the chapter for titles) define not just a protocol for encrypting messages (RFC 1421), but also a supporting key management architecture and infrastructure based on public-key certificate techniques (RFC 1422) and key management services (RFC 1424). Because it had to define its own infrastructure, one that was not accepted by the vast majority of the market, PEM fell by the wayside. Despite its current status as a proposed Internet standard, PEM is a dead letter, and you are unlikely to encounter it except as a historical footnote. According to Paul Hoffman, PEM had good intentions but serious design flaws; no one uses it. Rather than going into detail about its implementation, we simply display an example of a PEM encapsulation in Figure 17.3. This encapsulation would be the body of the message, set off by the begin and end boundaries shown here. PEM defined several headers to be used within the encapsulation to indicate what kind of processing was done to the message, what kind of content was inside the encapsulation, and information about the identity and keys of the sender and recipient, followed by the encrypted content itself.
-----BEGIN PRIVACY-ENHANCED MESSAGE-----
Proc-Type: 4,ENCRYPTED
Content-Domain: RFC822
DEK-Info: DES-CBC,F8143EDE5960C597
Originator-ID-Symmetric: linn@zendia.enet.dec.com,,
Recipient-ID-Symmetric: linn@zendia.enet.dec.com,ptf-kmc,3
Key-Info: DES-ECB,RSA-MD2,9FD3AAD2F2691B9A,
          B70665BB9BF7CBCDA60195DB94F727D3
Recipient-ID-Symmetric: pem-dev@tis.com,ptf-kmc,4
Key-Info: DES-ECB,RSA-MD2,161A3F75DC82EF26,
          E2EF532C65CBCFF79F83A2658132DB47

LLrHB0eJzyhP+/fSStdW8okeEnv47jxe7SJ/iN72ohNcUk2jHEUSoH1nvNSIWL9M
8tEjmF/zxB+bATMtPjCUWbz8Lr9wloXIkjHUlBLpvXR0UrUzYbkNpk0agV2IzUpk
J6UiRRGcDSvzrsoK+oNvqu6z7Xs5Xfz5rDqUcMlK1Z6720dcBWGGsDLpTpSCnpot
dXd/H5LMDWnonNvPCwQUHt==
-----END PRIVACY-ENHANCED MESSAGE-----
Figure 17.3
A PEM message encrypted and encapsulated (from RFC 1421).
MIME Object Security Services (MOSS)

Another dead end for message security was the MIME Object Security Services (MOSS) specification. Defined in 1995, in RFC 1848, MIME Object Security Services, MOSS is based on PEM and shares both its lack of success and its identification in RFC 2316 as "not useful" as a security protocol. However, MOSS did make some significant advances over PEM. Most important was its use of the MIME object to contain the content to be encrypted. PEM specified various mechanisms for ensuring that the content was properly identified and properly encoded so as to avoid problems with data translations across platforms. With MOSS, all objects subject to encryption must first be turned into MIME objects, so that issues of encoding and content types are delegated to MIME and can thus be ignored by the encryption function.
Standards Track Messaging Security Protocols

As may be apparent from the sequence of message security protocols discussed in the previous section, one logical solution to message security is to use MIME: MIME provides an organized format in which to put message bodies and provides a formal mechanism for defining a holder for message data. The MIME specifications themselves do not define a security mechanism, but they do provide a framework within which a security mechanism (or more than one security mechanism) can be built. Although the S/MIME specification is published as an informational RFC, two other specifications for message security have been published as proposed Internet standards based on MIME: MIME security multiparts and PGP. The first, published in 1995 in RFC 1847, Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted, defines two new MIME multipart types: multipart/signed and multipart/encrypted. As per RFC 2316, multipart MIME security is recommended as the preferred way to add secured sections to MIME-encapsulated email. Although RFC 2316 is an informational document, it does reflect the results of a meeting of the IAB, which can be expected to guide security protocol development. PGP MIME is another MIME-based proposed standard, specified in RFC 2015, MIME Security with Pretty Good Privacy (PGP). It is actually based on RFC 1847 and describes a mechanism for using the Pretty Good Privacy (PGP) framework for doing encryption and digital signature in a multipart/signed or multipart/encrypted MIME content type. PGP was created by Philip Zimmermann and first released in 1991 as an open alternative for those seeking to use strong public key and symmetric key encryption.
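The shape of a multipart/signed entity can be seen by assembling one with Python's email package. No signing takes place in this sketch: the signature argument is a placeholder for the output of a signing tool, the function name build_signed_container is invented, and the protocol and micalg parameter values reflect our reading of RFC 1847 and RFC 2015, so check them against the RFCs before relying on them.

```python
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_signed_container(text, signature):
    """Pair a text body with a detached signature in multipart/signed."""
    outer = MIMEMultipart(
        "signed",
        protocol="application/pgp-signature",  # assumed per RFC 2015
        micalg="pgp-md5",                      # assumed per RFC 2015
    )
    # First part: the content that was signed.
    outer.attach(MIMEText(text, "plain", "us-ascii"))
    # Second part: the signature, typed to match the protocol parameter.
    outer.attach(MIMEApplication(signature, "pgp-signature"))
    return outer
```

The container carries exactly two parts, the signed content followed by the signature, so a recipient without signature software can still read the first part.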
The third proposed standard, OpenPGP, is published in RFC 2440, OpenPGP Message Format, and is based on a version of the Pretty Good Privacy (PGP) application that specifies only openly available cryptographic algorithms. Unlike the other two standards discussed in this section, OpenPGP is a general specification of an open version of PGP that can be used to encrypt or digitally sign any kind of data file or message, whether a standard RFC 822/822bis format mail message, a MIME body, or a word processing file. As with many other message security protocols and specifications, the OpenPGP proposed standard was not placed onto the standards track without considerable controversy, and it may ultimately never achieve great acceptance. The OpenPGP specification was accepted by the IETF in part because all the components it specified were freely available and unconstrained by copyright, patent, or trade secret status. The alternative to OpenPGP was a protocol sponsored by RSA that was not similarly unconstrained and that was also subject to some limitations on the strength of the cryptographic algorithms specified. The contention between the RSA-backed proposal and the PGP-backed proposal, and the ultimate victory of PGP, raises a serious concern about the viability of choosing the one proposal over the other. The IETF ultimately granted proposed standard status to OpenPGP, indicating the general agreement that that solution was technically superior. However, implementers, in the form of software vendors selling (actually, giving away) software, supported the RSA solution. Between Microsoft and Netscape, tens of millions of users were already using browser/email clients that were compatible with the RSA solution, even if most of those users were unaware of their Internet clients' cryptographic capabilities.
On the other hand, while the OpenPGP solution boasted a far smaller number of actual users, supporters argued that those users were more likely to be actively using the software than the many millions of Web surfers who were generally clueless about crypto. As those familiar with the application of cryptographic algorithms to networking protocols are aware, further controversy hovers over any such algorithm when it is incorporated into protocols destined for global use. As already mentioned, the IETF strives never to develop standards that are constrained by intellectual property rights; it also attempts never to develop standards whose deployment may be affected by government regulation. The United States government, among others, maintains restrictions over the strength of encryption software that may be exported by U.S. vendors. These regulations loom large in any discussion of Internet standards where strong encryption tools are required to ensure security of data as it traverses open networks. In the selection of the OpenPGP proposal over the RSA proposal, the specification of a 40-bit encryption key in the RSA offering (considered by most experts to be insufficient for all but the most trivial security) tended to prejudice the IETF away from that proposal and tended to drive them toward the OpenPGP solution, which offered much stronger encryption. Despite softening of the U.S. regulations since then, key length continues to be an issue that will divide the Internet community.
TLS and POP3, IMAP, and ACAP

RFC 2595, Using TLS with IMAP, POP3 and ACAP, is a proposed standard published in June 1999. It specifies how TLS can be applied to messaging protocols. In addition to discussing how TLS integrates with IMAP, POP, and ACAP, RFC 2595 presents mechanisms by which those protocols can integrate TLS support. This is an important addition to the Internet messaging toolbox, improving the security with which message clients and servers communicate.
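As a rough illustration of what upgrading a session to TLS looks like from the client side, the sketch below uses Python's standard poplib and imaplib modules, which expose the POP3 STLS and IMAP STARTTLS commands. The choice of Python, the helper function names, and the default ports shown are assumptions made for this example and are not taken from RFC 2595; ACAP is omitted because the Python standard library has no ACAP client.

```python
import imaplib
import poplib
import ssl


def open_pop3_with_tls(host: str, port: int = 110) -> poplib.POP3:
    """Connect in the clear, then upgrade the POP3 session with STLS."""
    context = ssl.create_default_context()  # verifies the server certificate
    conn = poplib.POP3(host, port, timeout=30)
    conn.stls(context=context)  # commands sent after this are encrypted
    return conn


def open_imap_with_tls(host: str, port: int = 143) -> imaplib.IMAP4:
    """Connect in the clear, then upgrade the IMAP session with STARTTLS."""
    context = ssl.create_default_context()
    conn = imaplib.IMAP4(host, port)
    conn.starttls(ssl_context=context)  # log in only after the upgrade
    return conn
```

A caller would pass the name of its own mail server (for example, a hypothetical mail.example.com) and authenticate only after the upgrade has succeeded, so that credentials never cross the network in the clear.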
POP3 Authentication
The standard for POP3 (STD 53, RFC 1939, Post Office Protocol - Version 3), discussed in Chapter 11, Post Office Protocol (POP), includes the optional command APOP. APOP provides a more secure solution for POP3 authentication. Unlike the USER/PASS pair of commands, which send mailbox user name and password in plain text across the network, APOP does not send passwords in plain text over any network. APOP takes two required arguments: name, indicating the mailbox, and digest, which is an MD5 digest. MD5 is a cryptographic algorithm for calculating a hash, or digest, of an input text. In the case of POP3's APOP command, the input is created by combining the time stamp (as indicated in the POP3 server's greeting when it opens the POP3 connection) with a shared secret (a password) known to the server and to the client. All POP3 servers that support the APOP command must also return a time stamp, formatted as a msg-id (as defined in the Internet Message Format Standard, MFS; see Chapter 8, Internet Message Format Standard). The MD5 function creates a message digest on the time stamp value combined with the shared secret. The server can calculate this, as can the client. But even though an attacker may see the time stamp, she cannot see the password, which means she cannot calculate a valid digest value.
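The digest calculation can be sketched in a few lines of Python. This is a minimal illustration, assuming the digest is the MD5 hash of the server's time stamp (angle brackets included) immediately followed by the shared secret, expressed in lowercase hexadecimal; the function name, mailbox name, time stamp, and secret below are sample values chosen for the example.

```python
import hashlib


def apop_digest(timestamp: str, shared_secret: str) -> str:
    """Return the hex MD5 digest of the greeting time stamp plus the secret."""
    data = (timestamp + shared_secret).encode("ascii")
    return hashlib.md5(data).hexdigest()


# Sample values: the time stamp is whatever msg-id the server's greeting carried.
banner_timestamp = "<1896.697170952@dbc.mtview.ca.us>"
digest = apop_digest(banner_timestamp, "tanstaaf")

# The client sends the mailbox name and the digest, never the secret itself.
command = f"APOP mrose {digest}"
```

Because the time stamp changes with every connection, a digest captured by an eavesdropper is of no use for a later session.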
Secure MIME (S/MIME)

Why was S/MIME not considered a core security protocol by the authors of RFC 2316? They don't explain, but the S/MIME specifications in RFC 2311, S/MIME Version 2 Message Specification, and RFC 2312, S/MIME Version 2 Certificate Handling, are informational, not proposed standards. A good part of the reason is that they specify a vendor-specific algorithm from RSA Data Security Inc., RC2, that was protected as a trade secret. This is a no-no for Internet standards, which must be open and available to all who wish to implement them. So far, no Internet standard protocol requires implementers to pay for any license on intellectual property. However, RSA has freed the algorithms for use with S/MIME, and version 3 was published as a proposed standard in June 1999, as RFC 2633, S/MIME Version 3 Message Specification. S/MIME builds on the multipart security specification (RFC 1847, Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted) and improves it slightly. As we see in the next section, the security multiparts specification defines multipart MIME types for security. However, intermediaries such as gateways that do not support a multipart type treat the unknown multipart as a multipart/mixed, with potential changes made to the contents. This is not a good thing, because a digital signature fails if carriage returns and line feeds are added in the wrong places in either the signed body or the signature itself. The S/MIME specification adds an application type, application/pkcs7-mime, which can be used for digital signatures. Gateways that are not MIME-savvy treat unknown application types as bit streams and won't (shouldn't) add to or reformat them in any way.
MIME Security Multipart Content Types

Noting in its introduction that MIME by itself provides no mechanism for securing the contents of any MIME body, RFC 1847 goes on to define two MIME content types that can be used to digitally sign or encrypt MIME parts. By using the MIME multipart content type, relevant encryption or digital signature information is carried in one part of the MIME multipart, and the information being signed or the encrypted ciphertext is carried in another part. In fact, the specification declares that both the multipart/signed and the multipart/encrypted MIME content types must contain exactly two parts. Any type of MIME data can be protected, presumably including an encapsulated multipart (although these are not explicitly identified in the specification). The MIME part that is protected in a security multipart is protected (either signed or encrypted) in its entirety, which includes the body as well as any MIME headers within. This means that attackers cannot modify the headers of a digitally signed body, and it also means that attackers cannot determine from its headers what kind of MIME body is being encrypted (as we see next). RFC 1847 does not go into specifics about what kinds of digital signature or encryption algorithms are to be used, only how they are to be applied to the MIME security multiparts. To oversimplify, this specification states that MIME security multiparts consist of two parts: control information and protected information. The control information provides the recipient with enough data to either certify or decrypt the protected information. The defined MIME headers (parameters) for the security multipart add little more than the identification of the parts. The meatier information about encryption or digital signature is carried in the control parts. We summarize here the salient details of the MIME security multiparts and how they are intended to be used. It should be noted that although two separate content types are defined (one for digital signature and the other for encryption), it is possible to encapsulate one multipart security body inside another to allow a MIME body to be both signed and encrypted. When signing and encrypting a MIME body, the digital signature should be done first and the multipart/signed body encapsulated within the multipart/encrypted body. This ensures that the encryption obscures all the details that an attacker might extract from the digital signature and related control information.
MIME Multipart/Signed Content Type

The MIME multipart/signed content type consists of two different parts: The first part is the data being signed, and the second part is the control data relating to, and required for the verification of, the digital signature. It takes only three required parameters and has no optional parameters. The required parameters are as follows:
boundary. As discussed in Chapter 9, Multipurpose Internet Mail Extensions (MIME), this parameter identifies a string that is used to demarcate a border between the part and other parts.
protocol. This parameter identifies the type/subtype of the second part (the one containing control information). The value depends on the mechanism used to generate the digital signature.
micalg. The Message Integrity Check (MIC) is the hashing algorithm used in support of generating the digital signature. More than one value can be used for any given multipart, in which case the values are separated by a comma.
The MIC algorithm type is specified in the multipart/signed header to enable signature verification in a single pass rather than requiring one pass to read the message and another to verify. This algorithm is used to generate a hash value, which the signer of the message then signs by encrypting it with the private key of a public key pair. To verify the signature, the recipient system must generate the same hash value on the received message, so knowing which algorithm to use before the data is streamed into the system means that the recipient host can calculate the hash on the fly, stopping processing when it reaches the end of the signed part (and gets to the second part containing the signature control information). The specifics of what kind of information goes into the control part of the multipart/signed content type depend on the implementation, as we see when discussing PGP MIME.
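The two-part layout and the three parameters can be sketched with Python's standard email package. This is only a structural illustration under stated assumptions: the function name is hypothetical, the protocol and micalg values are examples of what a PGP-based implementation might declare, and the signature is a placeholder string, since no cryptography is performed here.

```python
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_signed(body_text: str, signature: bytes) -> MIMEMultipart:
    """Assemble a multipart/signed container: signed data first, control data second."""
    # protocol and micalg are passed as Content-Type parameters; the boundary
    # parameter is generated by the library when the message is serialized.
    msg = MIMEMultipart(
        "signed",
        protocol="application/pgp-signature",  # type/subtype of the second part
        micalg="pgp-md5",                      # example hash algorithm label
    )
    msg.attach(MIMEText(body_text, "plain", "us-ascii"))  # part 1: data being signed
    msg.attach(
        MIMEApplication(signature, "pgp-signature",
                        _encoder=encoders.encode_7or8bit)  # part 2: control data
    )
    return msg


# Placeholder signature bytes; a real signer would produce these from the
# hash of the first part.
signed = build_signed("Hello, world.\n", b"(placeholder signature)\n")
wire_form = signed.as_string()
```

A receiving system can read the micalg parameter from the outer header before it starts consuming the first part, which is what makes single-pass verification possible.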
MIME Multipart/Encrypted Content Type

The first part of a multipart/encrypted content type is the control part; the second part is the ciphertext. Just as having the control part come last in the multipart/signed enhances performance, so too does having the control part come first with multipart/encrypted: It allows the receiving system to determine what is necessary to decrypt the encrypted part before that part is read and to decrypt it on the fly as it is input to the system. Only two parameters are defined for the multipart/encrypted, both of them required: the boundary parameter that we've encountered elsewhere, and the protocol parameter that identifies a type/subtype of the first (control information) part. While the type/subtype of the control part will vary depending on what kind of encryption is being used, the second part (the encrypted part) is always labeled as an application/octet-stream part. The PGP MIME specification, discussed next, demonstrates how RFC 1847-style MIME security is applied with an actual cryptographic application.
MIME Security with PGP

Until RFC 1847, other specifications attempted to create mechanisms by which MIME could carry secured data. However, those attempts generally failed because they focused on creating application subtypes. This forced the applications to attempt to map control information onto the MIME content type headers and into parameters, often with problematic results. RFC 2015, MIME Security with Pretty Good Privacy (PGP), uses the RFC 1847 framework to define a mechanism that supports PGP encryption and digital signature. However, it should be noted that while RFC 2015 is a standards track document, PGP itself is documented by the IETF only in informational RFCs. (However, OpenPGP, as we see later in the chapter, is on the standards track.) Using the multipart security content types, RFC 2015 further defines three MIME type/subtypes:
application/pgp-encrypted
application/pgp-signature
application/pgp-keys
Very simply, the application/pgp-encrypted type is specified as the protocol parameter of the multipart/encrypted type, and the control part of that body is labeled as such. Likewise, the application/pgp-signature type is specified in the protocol parameter of the multipart/signed type, and the part containing control information is labeled as such. The third MIME type, application/pgp-keys, defines a MIME type that can be used to exchange PGP public keys between PGP users. For PGP encrypted data, the control part contains nothing more than the Content-Type header (application/pgp-encrypted) and a version number. Nothing else is necessary, as nothing more is required for the authorized recipient to decrypt than the PGP encrypted data itself. For PGP signed data, the control part contains little more than the Content-Type header (application/pgp-signature) and the PGP signature itself. Again, nothing else is necessary for the recipient to be able to verify the digital signature beyond the digital signature and the type of hash used (specified as a parameter to the multipart/signed MIME body, as described earlier).
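The layout of a PGP-encrypted message can be sketched the same way with Python's standard email package. This is a structural illustration only: the function name is hypothetical, the ciphertext is a placeholder rather than real PGP output, and the control part is assumed to carry the single line "Version: 1" as its version number.

```python
from email import encoders
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart


def build_pgp_encrypted(ciphertext: bytes) -> MIMEMultipart:
    """Assemble a multipart/encrypted container: control part first, ciphertext second."""
    msg = MIMEMultipart("encrypted", protocol="application/pgp-encrypted")
    # Part 1: control information, here nothing but a version line.
    control = MIMEApplication(b"Version: 1\n", "pgp-encrypted",
                              _encoder=encoders.encode_7or8bit)
    # Part 2: the encrypted data, always labeled application/octet-stream.
    data = MIMEApplication(ciphertext, "octet-stream",
                           _encoder=encoders.encode_7or8bit)
    msg.attach(control)
    msg.attach(data)
    return msg


# Placeholder standing in for ASCII-armored PGP output.
PLACEHOLDER = (b"-----BEGIN PGP MESSAGE-----\n"
               b"(placeholder)\n"
               b"-----END PGP MESSAGE-----\n")
encrypted = build_pgp_encrypted(PLACEHOLDER)
```

To both sign and encrypt, a multipart/signed body would first be built and serialized, then encrypted, and the resulting ciphertext placed in the second part of this container.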
OpenPGP
As has been mentioned, OpenPGP is not specific to Internet messages, but rather can be used to encrypt or digitally sign any piece of data, whether it is a data file or an email message. RFC 2440, OpenPGP Message Format, is an important specification because it documents the message formats used in OpenPGP and the methods required to read, check, generate, and write conforming packets crossing any network. However, it provides a general introduction to a cryptographic application and not a specific protocol for securing email. Although OpenPGP has many supporters, it has considerably less support in the marketplace. Although most of those who have a copy of PGP actually use it, only Network Associates markets PGP products. S/MIME is more widely supported by software vendors, and it is incorporated into email clients from both Microsoft and Netscape. Although few users of those products are likely to be actively using S/MIME, there are many millions of installed S/MIME-capable clients.
Reading List
Security is an important aspect of Internet messaging, even if hard and fast rules have not been specified as standards and protocols have not been selected as full Internet standards. Not only are there many relevant RFCs, but several IETF workgroups are developing proposals and other documents concerned with the security of Internet messaging. The interested reader should not only review the RFCs, listed in Table 17.2, but also check on the Web sites devoted to the workgroups listed in Table 17.1 to read the latest versions of relevant Internet drafts.
Table 17.1 IETF Workgroups Developing Security Standards Related to Messaging

WORKGROUP | ABBREVIATION | WORKGROUP MISSION
S/MIME Mail Security | smime | To define MIME encapsulation of digitally signed and encrypted objects whose format is based on PKCS #7. (PKCS is a series of documents published by RSA Data Security Inc. and also submitted to the IETF for publication as informational RFCs.)
An Open Specification for Pretty Good Privacy | openpgp | To provide IETF standards for the algorithms and formats of PGP processed objects as well as providing the MIME framework for exchanging them via email or other transport protocols.
Simple Public Key Infrastructure | spki | To develop Internet standards for IETF sponsored public key certificate format, associated signature and other formats, and key acquisition protocols that are simple, easy to understand, and easy to implement.
Public-Key Infrastructure (X.509) | pkix | To develop Internet standards needed to support an X.509-based PKI [Public Key Infrastructure] to be used by any Internet application (such as messaging) that requires a public key infrastructure.
Table 17.2 Internet Messaging Security RFCs

RFC | STATUS | TITLE
RFC 2633 | Proposed Standard | S/MIME Version 3 Message Specification
RFC 2595 | Proposed Standard | Using TLS with IMAP, POP3 and ACAP
RFC 2440 | Proposed Standard | OpenPGP Message Format
RFC 2316 | Informational | Report of the IAB Security Architecture Workshop
RFC 2312 | Informational | S/MIME Version 2 Certificate Handling
RFC 2311 | Informational | S/MIME Version 2 Message Specification
RFC 2015 | Proposed Standard | MIME Security with Pretty Good Privacy (PGP)
RFC 1991 | Informational | PGP Message Exchange Formats
RFC 1848 | Proposed Standard | MIME Object Security Services
RFC 1847 | Proposed Standard | Security Multiparts for MIME: Multipart/Signed and Multipart/Encrypted
RFC 1424 | Proposed Standard | Privacy Enhancement for Internet Electronic Mail: Part IV: Key Certification and Related Services
RFC 1423 | Proposed Standard | Privacy Enhancement for Internet Electronic Mail: Part III: Algorithms, Modes, and Identifiers
RFC 1422 | Proposed Standard | Privacy Enhancement for Internet Electronic Mail: Part II: Certificate-Based Key Management
RFC 1421 | Proposed Standard | Privacy Enhancement for Internet Electronic Mail: Part I: Message Encryption and Authentication Procedures
CHAPTER
18
The Future of Internet Messaging
This is the Buck Rogers chapter where we take a peek at what the future holds for Internet messaging far down the road. It is also the short-term road map chapter where we look at what the IETF working groups are working on for the first years of the new millennium. Rather than attempt to list the protocols of the future and describe them in detail, in this chapter, we merely highlight some areas of activity and speculate on what kinds of new messaging applications can be expected. We also point the reader to Web resources where up-to-date protocol information can be found. This chapter has three main sections. The first summarizes some of the IETF workgroup activity that has already been published and is on the verge of being deployed in actual products. The next section describes some workgroups that have published Internet-Drafts and that seem to be on the right track for creating usable application protocols. Finally, the last section discusses some of the directions in which Internet messaging is currently moving for the future.
What's Coming Right Up

Much of what we've covered in the rest of this book reflects not just what is currently an Internet standard or on the Internet standards track, but also what is about to get on that track or be advanced along it. IETF workgroups are constantly identifying areas where new protocols and specifications are needed or where the existing ones require more or less significant overhauls. Where relevant, these working groups have been cited throughout the book. Here we look at a few of the workgroups whose work will have an important impact in the first year or so of the twenty-first century.
Detailed Revision/Update of Message Standards (drums)

The job of the drums workgroup is to consolidate, correct, and where necessary revise SMTP and related specifications into a single specification documenting the protocol. Although some changes to reflect accepted practice are inevitable, the drums group will not be adding any functionality but rather clarifying, correcting, and aggregating information about existing SMTP function. To illustrate the scope of the work performed by the drums group, titles of Internet-Drafts generated by the workgroup include:
Simple Mail Transfer Protocol
Mail and Netnews Header Registration Procedure
Internet Message Format Standard
Use of Reply-To in Internet Email messages
The Message-ID header
As part of their work, the drums group has already published as RFC 2234 the document Augmented BNF for Syntax Specifications: ABNF, which specifies the formal syntax now used for most Internet protocols. The rest of the drums workgroup work product is well advanced, with a tenth draft of the SMTP specification available as of mid-1999 and submitted to the IESG with a request for a last call. The impact of the work done by the drums group will not be revolutionary by any stretch of the imagination, but rather reflect the honing of a set of specifications that have worked quite well for close to 20 years.
NNTP Extensions (nntpext)

As discussed in Chapter 14, Network News Transfer Protocol (NNTP), the NNTP protocol has been one of the more popular Internet applications since well before it hit the Internet standards track as a proposed standard in 1986. Yet since then, it has not moved along the standards track nor has it been updated. The nntpext workgroup is responsible for changing that by giving NNTP a thorough updating while at the same time extending the protocol, adding whatever new functions the workgroup deems fit. However, the workgroup is unlikely to add much to the standard that has not become a de facto standard over the years through popular implementations. The most radical change will likely be the addition of a mechanism to allow extensions to the protocol (hence the workgroup's name). Although its work has not yet been moved forward as far as the drums group, the nntpext workgroup has produced several Internet-Drafts, including:
Network News Transport Protocol
Common NNTP Extensions
NNTP Full-text Search Extension
An NNTP Extension for Dynamic Feed Adjustment
Again, though the work they do is significant, it is unlikely to cause any serious changes in the way people use netnews largely because much of it has already been incorporated into existing implementations.
Usenet Article Standard Update (usefor)

As with the NNTP Extensions workgroup, the Usenet article standard update working group aims to revise, correct, and update a relatively old standard. It also intends to build specifications for extending the Usenet article format as well as specify how to generate unique message IDs, how to properly cancel Usenet articles, and how to identify messages that are delivered both by email and by news.
Identification of messages delivered via both mail and news
News Article Format
Recommendations for generating Message IDs
Cancel-Locks in Usenet articles
Guidelines for the Generation of Message IDs and Similar Unique Identifiers
Again, the usefor workgroup results are unlikely to surprise anyone or to cause any real changes in the way people use netnews, but they will modernize a widely used protocol and help power future developments.
Expanded Standards and Availability of Calendaring Tools (calsch)

As we saw in Chapter 16, Calendaring and Scheduling Standards, specifications for proposed Internet standards for calendaring and scheduling protocols are well advanced, with three RFCs already published as proposed standards. The calsch workgroup continues to advance other specifications through the Internet draft stage; recent drafts include:
Calendar Attributes for vCard and LDAP
Internet Calendar Model Specification
CAP Requirements
Calendar Real-Time Interoperability Protocol (iRIP)
iCalendar v2.0 Formal Public Identifier
The existence of these protocols is not enough; they must be applied to existing workgroup, calendaring, and scheduling application products that are already widely used throughout organizations. The participation of IETF members from important vendors in this area, including Lotus, Netscape, Microsoft, Sun Microsystems, Qualcomm and others, seems to indicate that products will be forthcoming very soon. Some of these products are likely to be gateways from workgroup products like Lotus Notes and Microsoft's Outlook, while others will integrate interoperable calendaring functions into browsers, email clients, desktops, palmtops, and many other devices (perhaps even pagers and wireless telephones). As more of the world's population becomes connected, and as more connected products interoperate over the Internet, the importance of the protocols developed by this workgroup can only grow.
Internet Fax (fax)

Why would anyone want to use facsimile (fax) transmission for documents over the Internet? Actually, there are quite a few reasons: Fax is widely available (even more so than Internet connectivity), it is fast, it works much better for hard copy documents than the Internet can, and lots of people are just plain more comfortable with it. You can do some interesting things once you've interfaced a fax network with the Internet. You can send a fax to an email address, where it will be received as a scanned image; you can send email to a fax number, where it will be printed out as a text or image message; you can use the Internet to transport fax data, so that international faxes don't require international telephone calls. The Internet fax (fax) workgroup has taken on the tasks of specifying protocols for encapsulating messages inside faxes, sending faxes inside email messages, formatting content for fax/email, Internet/dialup and dialup/Internet gatewaying, and interfacing between the Internet and the public switched telephone network (PSTN). They have been going at it for some time, with quite a few RFCs published already:
File Format for Internet Fax (RFC 2301)
Tag Image File Format (TIFF) - Image/Tiff MIME Sub-Type Registration (RFC 2302)
Minimal PSTN Address Format in Internet Mail (RFC 2303)
Minimal FAX Address Format in Internet Mail (RFC 2304)
A Simple Mode of Facsimile Using Internet Mail (RFC 2305)
Tag Image File Format (TIFF) - F Profile for Facsimile (RFC 2306)
Terminology and Goals for Internet Fax (RFC 2542)
Indicating Supported Media Features Using Extensions to DSN and MDN (RFC 2530)
Content Feature Schema for Internet Fax (RFC 2531)
Extended Facsimile Using Internet Mail (RFC 2532)
You may already have used the Internet to send or receive a fax. The impact of this group will continue to be felt as long as implementers feel the need to support fax. However, over the long term, fax will likely be eclipsed by digital messaging techniques and be used only where digital versions of a document are unavailable. Current Internet-Drafts published by the fax workgroup include:
SMTP Service Extension for Immediate Delivery
GSTN Address Element Extensions in Email Services
Using Message Disposition Notifications to Indicate Supported Features
Extended MDN for Internet Fax Full Mode
Facsimile Applications for Internet Fax Full Mode
Extensions to Message Disposition Notifications for Reporting on Multipart/Alternative Messages
Expressing Fax Capabilities in Internet Protocols
MIME Content-Type for Internet Fax Full Mode
Internet Fax T.30 Feature Mapping
Indicating the Presence of a Coverpage in the Fax-over-SMTP Environment
A Simple Mode of Facsimile Using Internet Mail
File Format for Internet Fax (Revised)
Minimal GSTN Address Format in Internet Mail
Minimal FAX Address Format in Internet Mail
What's Next
What are the messaging protocols on the horizon that should start to flower over the next few years as more standards are specified and more applications are deployed? I can't guarantee that the specifications coming out of the workgroups listed here will be the next big thing over the next few years, nor that some other workgroups won't be the source for something great in Internet messaging. However, it's a good guess that within two to five years from the time this book is published, we'll be hearing much more about instant messaging and message tracking.
Instant Messaging and Presence Protocol (impp)

Back in the olden days of computing when everyone was connected to the same mainframe, everyone had access to a couple of nifty features. One allowed you to check to see who else was logged onto the system. The other let you send a brief text message directly to the screen of anyone who was logged in. These functions have not been the subject of any great degree of research or development, at least not in terms of open standards. Proprietary systems have popped up over the last few years and have been incorporated into mainstream products, as the ICQ instant-messaging product was purchased and incorporated into the offerings from AOL and Netscape. The instant messaging and presence protocol (impp) workgroup was chartered to define protocols and data formats necessary to build an Internet-scale end-user presence awareness, notification, and instant messaging system. However, its initial task is to determine specific design goals and requirements for such a service. So far, the workgroup has published only two Internet-Drafts, laying the groundwork for future protocol specifications:
A Model for Presence and Instant Messaging
Instant Messaging/Presence Protocol Requirements
The currently available, proprietary instant messaging and presence tools have proven to be extremely popular on the desktops of those who use them. As the number and variety of IP devices continues to increase, the impact of these protocols will also increase. Even now, businesses are employing similar tools that enable brokers to notify their customers of important opportunities by sending messages to a pager or wireless telephone. The possibilities are endless.
Message Tracking Protocol (msgtrk)

A relatively new IETF workgroup, the message tracking protocol (msgtrk) group, is chartered to design a diagnostic protocol for a message originator to request information about the submission, transport, and delivery of a message regardless of its delivery status. Such a mechanism will be invaluable to messaging system administrators seeking to maintain a smooth flow of data to their customers, as well as for simplifying the task of troubleshooting messaging problems or even assigning blame for denial of service or spam attacks. The objective of the message tracking model will be to track the submission, transport, and delivery of RFC 822bis messages from the time they enter the messaging transport network until they reach their final destination, whether it be a user's mailbox, an IMAP server, or some other, proprietary mail system. The impact of such a mechanism may not be felt by end users except in the form of continued or improved messaging performance, but it could help prevent bottlenecks as the Internet continues to scale up to ever larger populations of users. So far, the group has produced no drafts.
Crystal Ball Gazing

Internet messaging may seem to comprise a fairly mature set of technologies, but those technologies are mature only in the context of their deployment across IP networks. Whether you call it convergence, smart devices, or ubiquitous networking, something is happening, which means that interoperable, open standard messaging protocols will increasingly affect how we communicate. Within the next five years, perhaps sooner, integrated messaging tools will become reality. It will be possible to send and receive all personal communication, whether voice telephony, email, faxes, paging, wireless communication, or something else, mediated through a single client device or mechanism. In other words, you'll be able to accept a fax through a paging device, send a page from email, receive a telephone call through a Web site, or even send an email from a pay phone. Ubiquitous networking and integrated messaging will be accelerated as companies adopt the next-generation Internet Protocol, IPv6. Fielding all those ubiquitous networked devices requires a networking protocol capable of easily supporting billions, trillions, or even more interoperating nodes. Internet messaging depends on having a universal format for carrying data, capable of being carried intact across any type of system. The revised Internet message format standard, RFC 822bis, along with the MIME format, will likely continue to be central to all Internet messaging protocols. Implementers will continue to define new MIME content types as they find new uses for the MIME container. As long as MIME functions well in its role as container, it will not require updating. However, chances are good that the MIME specifications will require updating, at least in the years to come.
Summing Up
There is no question that Internet messaging is no longer a novelty but rather a prerequisite for doing business in today's organizational environment. It may not entirely replace hard-copy communication, fax, face-to-face meetings, telephone calling, postal mail, or any other messaging mechanism, but it will definitely realign the messaging market; it already has. Considerable improvements and exciting new functions will be delivered by the IETF and its members in the years to come. However, Internet messaging professionals will likely find that the bulk of their work revolves around the same basic principles that have defined Internet messaging since the early 1980s. The details may change, but the concepts will remain.
Reading List
This chapter references mostly Internet drafts and IETF working groups. For more information about specifications and protocols that are in the process of becoming Internet standards, the interested reader should check the IETF's Web page listing active working groups, at:
http://www.ietf.org/html.charters/wg-dir.html
