Sie sind auf Seite 1von 563

Networks and Protocols 2006

Jurgen Schonwalder
j.schoenwaelder@iu-bremen.de

International University Bremen Campus Ring 1 28725 Bremen, Germany

http://www.faculty.iu-bremen.de/jschoenwae/np-2006/

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 1

0. Preface

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 2

Networks & Distributed Systems


General CS I Nat. Sci. Lab Module (C) General CS II Nat. Sci. Lab Module (C++) Algorithms and Data Structures Computer Arch. and Operating Systems Networks and Protocols Distributed Systems Advanced Networking Advanced Distributed Systems

1st Semester 1st Semester 2nd Semester 2nd Semester 3rd Semester 4th Semester 5th Semester 6th Semester Fall Semester Fall Semester
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 3

Course Content
1. Introduction 2. Fundamental Concepts 3. Local Area Networks (IEEE 802) 4. Internet Network Layer (IPv4, IPv6) 5. Internet Routing (RIP, OSPF, BGP) 6. Internet Transport Layer (UDP, TCP) 7. Firewalls and Network Address Translators 8. Abstract Syntax Notation 1 (ASN.1) 9. External Data Representation (XDR) 10. Augmented Backus Naur Form (ABNF) 11. Domain Name System (DNS) 12. Electronic Mail (SMTP, IMAP) 13. Document Access and Transfer (HTTP, FTP) 14. Operations and Management (SNMP) 15. Remote Procedure Calls (ONC RPC)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 4

Course Objectives

Introduce fundamental data networking concepts Focus on widely deployed Internet protocols Prepare students for further studies in networking Combine theory with practical experiences Raise awareness of weaknesses of the Internet

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 5

Reading Material

A.S. Tanenbaum, "Computer Networks", 4th Edition, Prentice Hall, 2002 W. Stallings, "Data and Computer Communications", 6th Edition, Prentice Hall, 2000 C. Huitema, "Routing in the Internet", 2nd Edition, Prentice Hall, 1999 D. Comer, "Internetworking with TCP/IP Volume 1: Principles Protocols, and Architecture", 4th Edition, Prentice Hall, 2000 J.F. Kurose, K.W. Ross, "Computer Networking: A Top-Down Approach Featuring the Internet", 3rd Edition, Addison-Wesley 2004.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 6

Grading Scheme

Final examination Covers the whole lecture Open book (and closed computers / networks) Midterm examination Covers the rst part of the lecture Open book (and closed computers / networks) Quizzes Control your continued learning success Mini projects / homeworks / labs Gain some practical experience

(30%)

(30%)

(20%) (20%)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 7

Reasons for not taking this course


You do not have the time required for this course You do not have the required background You just want to know how to congure your system(s) You nd the topics covered by this course boring You are not ready for some hard work You are unable to do some programming in C/Unix You are not ready to take initiative such as reading book chapters and specications, or programming exercises

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 8

1. Introduction

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 9

The Internet

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 10

Autonomous Systems

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 11

Growth of the Internet

How would you count hosts on the Internet?


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 12

Growth of the World Wide Web

This may be a bit easier to count? Do not take these gures too serious!

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 13

Growth of Problems

CERT Coordination Center (CERT/CC) is a center of Internet security expertise Located at the Software Engineering Institute operated by Carnegie Mellon University (CMU)
Year 1988 1989 1990 1991 1992 1993 1994 1995 Incidents 6 132 252 406 773 1334 2340 2412 Advisories 1 7 12 23 21 19 15 18 Year 1996 1997 1998 1999 2000 2001 2002 2003 Incidents 2573 2134 3734 9859 21756 52658 82094 137529 Advisories 27 28 13 17 22 37 37 28

Statistics taken from http://www.cert.org/stats/

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 14

Networking Challenges

Hot topics: Routing and Convergence Security, Trust, and Key Management Network Management and Operations Ad-hoc networks and self-organizing networks Scalable Inter-Domain Quality of Service (QoS) Measurement and Modeling Killer Applications Disappearing (invisible) Networks Interplanetary Internet

= See RFC 3869 for further information on Internet Research.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 15

Internet Model
Application Process Application Process

Endsystem Application

End System Application

Transport

Router Network

Transport

Internet

Internet

Subnetwork

Subnetwork

Subnetwork

Media

Media

Subnetworks only have to provide very basic services Even the Internet layer can be used as a subnetwork
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 16

Design Principles

Connectivity is its own reward All functions which require knowledge of the state of end-to-end communication should be realized at the endpoints (end-to-end argument) There is no central instance which controls the Internet and which is able to turn it off Addresses should uniquely identify endpoints Intermediate systems should be stateless wherever possible Implementations should be liberal in what they accept and stringent in what they generate Keep it simple (when in doubt during design, choose the simplest solution)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 17

Internet Addresses

Four byte IPv4 adresses are typically written as four decimal numbers separated by dots where every decimal number represents one byte (dotted quad notation). A typical example is the IPv4 address 212.201.48.1 Sixteen byte IPv6 addresses are typically written as a sequence of hexadecimal numbers seperated by colons (:) where every hexadecimal number represents two bytes Leading nulls in IPv6 addresses can be omitted and two consecutive colons can represent a sequence of nulls The IPv6 address 1080:0:0:0:8:800:200C:417A can be written as 1080::8:800:200C:417A

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 18

Internet Domain Names


virtual root

nl

de

edu

org

net

com

biz

toplevel

iubremen

2nd level

faculty

3rd level

www

4th level

The Domain Name System (DNS) provides a distributed hierarchical name space which supports the delegation of name assignments DNS name resolution translates DNS names into one or more Internet addresses
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 19

Autonomous Systems

The global Internet consists of a set of inter-connected autonomous systems An autonomous system (AS) is a set of routers and networks under the same administration Autonomous systems are identied by 16-bit numbers, called the AS numbers IP packets are forwarded between autonomous systems over paths that are established by an Exterior Gateway Protocol (EGP) Within an autonomous system, IP packets are forwarded over paths that are established by an Interior Gateway Protocol (IGP)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 20

Internet Layer

The Internet Protocol version 4 (IPv4) provides for transmitting datagrams from sources to destinations using 4 byte IPv4 addresses The Internet Control Message Protocol version 4 (ICMPv4) is used for IPv4 error reporting and testing The Address Resolution Protocol (ARP) maps IPv4 addresses to IEEE 802 addresses The Internet Protocol version 6 (IPv6) provides for transmitting datagrams from sources to destinations using 16 byte IPv6 addresses The Internet Control Message Protocol version 6 (ICMPv6) is used for IPv6 error reporting, testing, auto-conguration and address resolution
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 21

Internet Routing Protocols

The Routing Information Protocol (RIP) is a simple distance vector IGP routing protocol based on the Bellman-Ford shortest paths algorithm The Open Shortest Path First (OSPF) IGP routing protocol oods link state information so that every node can compute shortest paths using Dijkstras shortest path algorithm The Intermediate System to Intermediate System (IS-IS) routing protocol is another link state routing protocol adopted from the ISO/OSI standards The Border Gateway Protocol (BGP) EGP routing protocol propagates reachability information between ASs, which is used to make policy-based routing decisions

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 22

Internet Transport Layer


The User Datagram Protocol (UDP) provides a simple unreliable best-effort datagram service The Transmission Control Protocol (TCP) provides a bidirectional, connection-oriented and reliable data stream The Stream Control Transmission Protocol (SCTP) provides a reliable transport service supporting sequenced delivery of messages within multiple streams, maintaining application protocol message boundaries (application protocol framing) The Datagram Congestion Control Protocol (DCCP) provides a congestion controlled, unreliable ow of datagrams suitable for use by applications such as streaming media

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 23

Application Layer

The TELNET protocol provides a bi-directional, eight-bit byte oriented network virtual terminal The File Transfer Protocol (FTP) transports les between computers, shielding differences in storage systems The Simple Mail Transfer Protocol (SMTP) is used to transfer email reliably and efciently The Internet Message Access Protocol (IMAP) allows a client to access and manipulate electronic mail messages on a server The Hypertext Transfer Protocol (HTTP) is an application-level protocol for distributed, collaborative, hypermedia information systems

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 24

Application Layer

The Secure Shell (SSH) protocol provides a secure remote login and other secure network services over an insecure network The Network News Transfer Protocol (NNTP) is used for the distribution, inquiry, retrieval, and posting of Netnews articles The Real-Time Transport Protocol (RTP) provides a transport service for real-time multi-media applications where different data streams have to be synchronized, usually implemented on top of UDP The Open Network Computing Remote Procedure Call (ONC-RPC) protocol allows programs to invoke procedures on remote computers

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 25

Infrastructure

The Domain Name System (DNS) protocol is primarily used to resolve DNS names to Internet addresses The Network Time Protocol (NTP) provides the mechanisms to synchronize time and coordinate time distribution in a large, diverse internet The Simple Network Management Protocol (SNMP) can be used to monitor and control network elements The Lightweight Directory Access Protocol (LDAP) provides access to directories (yellow pages services) The Transport Layer Security (TLS) protocol allows client/server applications to communicate in a way that is designed to prevent eavesdropping, tampering, or message forgery
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 26

Sockets

Sockets are abstract communication endpoints with a rather small number of associated function calls The socket API consists of address formats for various network protocol families functions to create, name, connect, destroy sockets functions to send and receive data functions to convert human readable names to addresses and vice versa functions to multiplex I/O on several sockets Sockets are the de-facto standard communication API provided by operating systems

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 27

Socket Types

Stream sockets (SOCK_STREAM) represent bidirectional reliable communcation endpoints Datagram sockets (SOCK_DGRAM) represent bidirectional unreliable communcation endpoints Raw sockets (SOCK_RAW) represent endpoints which can send/receive interface layer datagrams Reliable delivered message sockets (SOCK_RDM) are datagram sockets with reliable datagram delivery Sequenced packet scokets (SOCK_SEQPACKET) are stream sockets which retain data block boundaries

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 28

Generic Socket Addresses


#include <sys/socket.h> struct sockaddr uint8_t sa_family_t char }; { sa_len sa_family; sa_data[...];

/* address length (BSD) */ /* address family */ /* data of some size */

struct sockaddr_storage { uint8_t ss_len; sa_family_t ss_family; char padding[...]; };

/* address length (BSD) */ /* address family */ /* padding of some size */

A struct sockaddr represents an abstract address, typically casted to a struct for a concrete address format A struct sockaddr_storage provides storage space
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 29

IPv4 Socket Addresses


#include <sys/socket.h> #include <netinet/in.h> typedef ... sa_family_t; typedef ... in_port_t; struct in_addr { uint8_t s_addr[4]; }; struct sockaddr_in { uint8_t sin_len; sa_family_t sin_family; in_port_t sin_port; struct in_addr sin_addr; };

/* IPv4 address */

/* /* /* /*

address length (BSD) */ address family */ transport layer port */ IPv4 address */

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 30

IPv6 Socket Addresses


#include <sys/socket.h> #include <netinet/in.h> typedef ... sa_family_t; typedef ... in_port_t; struct in6_addr { uint8_t s6_addr[16]; }; struct sockaddr_in6 { uint8_t sin6_len; sa_family_t sin6_family; in_port_t sin6_port; uint32_t sin6_flowinfo; struct in6_addr sin6_addr; uint32_t sin6_scope_id; };

/* IPv6 address */

/* /* /* /* /* /*

address length (BSD) */ address family */ transport layer port */ flow information */ IPv6 address */ scope identifier */

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 31

Connection-Less Communication
socket()

bind() socket()

bind() recvfrom() data sendto()

sendto()

data

recvfrom()

close()

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 32

Connection-Oriented Communication
socket()

bind()

listen() socket() accept() connection setup

connect()

read()

data

write()

write()

data

read()

close()

connection release

close()

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 33

Socket API Summary


#include <sys/types.h> #include <sys/socket.h> #include <unistd.h> #define #define #define #define #define SOCK_STREAM SOCK_DGRAM SCOK_RAW SOCK_RDM SOCK_SEQPACKET ... ... ... ... ...

#define AF_LOCAL ... #define AF_INET ... #define AF_INET6 ... #define PF_LOCAL ... #define PF_INET ... #define PF_INET6 ...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 34

Socket API Summary


int socket(int domain, int type, int protocol); int bind(int socket, struct sockaddr *addr, socklen_t addrlen); int connect(int socket, struct sockaddr *addr, socklen_t addrlen); int listen(int socket, int backlog); int accept(int socket, struct sockaddr *addr, socklen_t *addrlen); ssize_t write(int socket, void *buf, size_t count); int send(int socket, void *msg, size_t len, int flags); int sendto(int socket, void *msg, size_t len, int flags, struct sockaddr *addr, socklen_t addrlen); ssize_t read(int socket, void *buf, size_t count); int recv(int socket, void *buf, size_t len, int flags); int recvfrom(int socket, void *buf, size_t len, int flags, struct sockaddr *addr, socklen_t *addrlen);

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 35

Socket API Summary


int shutdown(int socket, int how); int close(int socket); int getsockopt(int socket, int level, int optname, void *optval, socklen_t *optlen); int setsockopt(int socket, int level, int optname, void *optval, socklen_t optlen); int getsockname(int socket, struct sockaddr *addr, socklen_t *addrlen); int getpeername(int socket, struct sockaddr *addr, socklen_t *addrlen);

All API functions operate on abstract socket addresses Not all functions make equally sense for all socket types

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 36

Mapping Names to Addresses


#include <sys/types.h> #include <sys/socket.h> #include <netdb.h> #define AI_PASSIVE ... #define AI_CANONNAME ... #define AI_NUMERICHOST ... struct addrinfo { int int int int size_t struct sockaddr char struct addrinfo };

ai_flags; ai_family; ai_socktype; ai_protocol; ai_addrlen; *ai_addr; *ai_canonname; *ai_next;

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 37

Mapping Names to Addresses


int getaddrinfo(const char *node, const char *service, const struct addrinfo *hints, struct addrinfo **res); void freeaddrinfo(struct addrinfo *res); const char *gai_strerror(int errcode);

Many books still document the old name and address mapping functions gethostbyname() gethostbyaddr() getservbyname() getservbyaddr() which are IPv4 specic and should not be used anymore

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 38

Mapping Addresses to Names


#include <sys/types.h> #include <sys/socket.h> #include <netdb.h> #define #define #define #define #define #define NI_NOFQDN NI_NUMERICHOST NI_NAMEREQD NI_NUMERICSERV NI_NUMERICSCOPE NI_DGRAM ... ... ... ... ... ...

int getnameinfo(const struct sockaddr *sa, socklen_t salen, char *host, size_t hostlen, char *serv, size_t servlen, int flags); const char *gai_strerror(int errcode);

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 39

Multiplexing
#include <sys/select.h> typedef ... fd_set; FD_ZERO(fd_set *set); FD_SET(int fd, fd_set *set); FD_CLR(int fd, fd_set *set); FD_ISSET(int fd, fd_set *set); int select(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timeval *timeout); int pselect(int n, fd_set *readfds, fd_set *writefds, fd_set *exceptfds, struct timespec *timeout, sigset_t sigmask);

select() works with arbitrary le descriptors select() frequently used to implement the main loop of event-driven programs
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 40

References
[1] B. Carpenter. Architectural Principles of the Internet. RFC 1958, IAB, June 1996. [2] B. Carpenter. Internet Transparency. RFC 2775, IBM, February 2000. [3] R. Gilligan, S. Thomson, J. Bound, J. McCann, and W. Stevens. Basic Socket Interface Extensions for IPv6. RFC 3493, Intransa Inc., Cisco, Hewlett-Packard, February 2003.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 41

2. Fundamental Concepts

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 42

Topologies

Star

Ring

Meshed Network

Bus

Line

The topology of a network describes the way in which nodes attached to the network are interconnected
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 43

Topologies

Star: Nodes are directly connected to a central node Simple routing Good availability if central node is highly reliable Ring: Nodes are connected to form a ring Simple routing Limited reliability, failures require reconguration Meshed Network: Nodes are directly connected to all other nodes Very good reliability

Requires

n(n1) 2

links for n nodes


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 44

Topologies

Bus: Nodes are attached to a shared medium Simple routing Good availability if the bus is highly reliable Line: Nodes are connected to form a line Average reliability Network position inuences transmission delays Line topologies are especially important in the case n = 2 (point to point links)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 45

Structured Cabling

Networks in ofce buildings are typically hierarchically structured: Every oor has a (potentially complex) network segment (usually using star topologies) The oor network segments are connected by a backbone network Multiple buildings are interconnected by connecting the backbone networks of the buildings Cabling infrastructure in the buildings should be useable for different purposes (voice network, data network)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 46

Lifetimes of Network Components

Typical lifetimes: Network rooms and cable ways (20-40 years) Fibre wires (about 15 years) Copper wires (about 8 years) Cabling should survive 3 generations of active network components For more details on structured cabling, see the international standard ISO/IEC 11801

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 47

Communication Channel Model


source input signal x(t) sink ouput signal y(t)

encoder

decoder

signal x(t)

signal y(t)

physical transmission medium distortion z(t)

Signals are in general modied during transmission


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 48

Transmission Impairments

Attenuation: The strength of a signal falls off with distance over any transmission medium For guided media, attenuation is generally an exponential function of the distance For unguided media, attenuation is a more complex function of distance and the makeup of the atmosphere Delay distortion: Delay distortion occurs because the velocity of propagation of a signal through a guided medium varies with frequency Various frequency components of a signal will arrive at the receiver at different times
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 49

Transmission Impairments

Noise Thermal noise (white noise) is due to thermal agitation of electrons and is a function of temperature Intermodulation noise can occur if signals at different frequencies share the same transmission medium Crosstalk is an unwanted coupling between signal paths Impulse noise consists of irregular pulses or noise spikes of short duration and of relatively high amplitude

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 50

Channel Characteristics

The data rate (bit rate) describes the data volume that can be transmitted per time interval (e.g., 100 Mbit/s) The bit width is the time needed to transmit a single bit (e.g., 1 microsecond for 1 Mbit/s)

1111111111111 0000000000000 1111111111111 0000000000000 1111111111111 0000000000000 1111111111111 0000000000000


transmission delay

1. bit

last bit

propagation delay

The delay is the time needed to transmit a message from the source to the sink The delay consists of the propagation delay and the transmission delay
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 51

Transmission Media

Copper wires: Simple wires Twisted pair Coaxial cables Optical wires: Fibre (multimode and single-mode) Air: Radio waves Micro waves Infrared waves Light waves

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 52

Simple Electrical Wires


Simple two-wire open lines are the simplest transmission medium Adequate for connecting equipment up to 50 m apart using moderate bit rates The signal is typically a voltage or current level relative to some ground level Simple wires can easily experience crosstalk caused by capacitive coupling The open structure makes wires suspectible to pick-up noise signals from other electrical signal sources

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 53

Twisted Pairs

A twisted pair consists of two insulated copper wires Twisting the wires in a helical form cancels out waves Unshielded twisted pair (UTP) of category 3 was the standard cabling up to 1988 UTP category 5 and above are now widely used for wiring (less crosstalk, better signals over longer distances) Shielded twisted pair (STP) cables have an additional shield further reducing noise

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 54

Coaxial Cable
Insulation

Core

Outer conductor Protective plastic

Coax cables are shielded (less noise) and suffer less from attenuation Data rates of 500 mbps over several kilometers with a error probability of 107 achievable Widely used for cable television broadcast networks (which in some countries are heavily used for data communication today)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 55

Fibre
Cladding

Core

Jacket

Sheath

The glass core is surrounded by a glass cladding with a lower index of refraction to keep the light in the core Multimode ber have a thick core (20-50 micrometer) and propagate light using continued refraction Single-mode ber have a thin core (2-10 micrometer) which guides the light through the ber High data rates, low error propability, thin, lightweight, immue to electromagnetic interference
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 56

Electromagnetic Spectrum
f(Hz) 10 0 10 2 10 4 10 6 radio waves 10 8 10 10 10 12 10 14 UV 10 16 10 18 10 20 10 22 10 24 micro waves infrared waves xray waves gamma waves

light waves

f(Hz)

10 4 twisted pair

10 6

10 8

10 10 satellites

10 12

10 14 LWL

10 16

coax cables AM radio FM radio TV micro waves

Usage of most frequencies is controlled by legislation The Industrial/Scientic/Medical (ISM) band (2400-2484 MHz) can be used without special licenses

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 57

Parallel vs. Serial Data Transmission

Parallel: A data word (e.g., a byte) is transmitted in a single step (all bits transmitted concurrently) Requires multiple channels/connections between sender and receiver Serial: A data word (e.g., a byte) is transmitted as an ordered sequence of bits Only one channel necessary between sender and receiver

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 58

Digital Signal Encoding


Bits must be encoded as high/low frequencies or voltages Important characteristics: Clocking Must be able to detect the start and the end of a bit Usually no common clock between sender and receiver DC component Reduce the DC component of the transmission Resynchronization Allow receivers to synchronize at the beginning of a transmission

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 59

Return-To-Zero Encoding
0 1 0 0 0 1 1

high

low

1s are represented by high signals and 0s by low signals The width of a 1 rectangle is only half of the bit interval Signal rate (baud rate) higher than the data rate (bit rate) No clock regeneration, potentially high DC component

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 60

Non-Return-To-Zero Encoding
0 1 0 0 0 1 1

high

low

1s are represented by high signals and 0s by low signals Changes of the signal level only occur at bit boundaries No clock regeneration, potentially high DC component

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 61

Non-Return-To-Zero-Mark Encoding
0 1 0 0 0 1 1

high

low

NRZ-M changes the signal level when a 1 is transmitted A 0 bit does not change the signal level Differential encoding technique (signal changes are easier to detect under noise)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 62

Non-Return-To-Zero-Inverse Encoding
0 1 0 0 0 1 1

high

low

NRZ-I changes the signal level when a 0 is transmitted A 1 bit does not change the signal level Differential encoding technique (signal changes are easier to detect under noise)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 63

Manchester Encoding
0 1 0 0 0 1 1

high

low

A 1 bit is represented by a signal level change from high to low in the middle of the bit interval A 0 bit is represented by a signal level change from low to high in the middle of the bit interval At least on signal level change per bit interval (clocking) Every bit interval has a high and low signal level (no DC component)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 64

Alternate Mark Insertion Encoding


0 1 0 0 0 1 1

high

zero

low

The representation of a 1 bit alternates between high (positive) and low (negative) signal level. A 0 bit is represented by the signal level zero.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 65

Media Access
media access

time division multiplexing

frequency division multiplexing

fixed raster

dynamic

decentralized

centralized

concurrent

ordered

Shared transmission media require coordinated access to the medium (media access control)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 66

Frequency Devision Multiplexing


frequency channel 1 channel 2 channel 3 channel 4 channel 5

11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000 11111111111111111111111 00000000000000000000000
time

Signals are carried simultaneously on the same medium by allocating to each signal a different frequency band

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 67

Wavelength Division Multiplexing (WDM)


Optical bers carry multiple wavelength at the same time WDM allows very large data rates over a single optical ber Dense WDM (DWDM) is a variation where the wavelengths are spaced close together, which results in a larger number of channels. Theoretically, there is room for 1250 10 Gbps channels (= 12.5 Tbps) on a single ber.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 68

Time Division Multiplexing


frequency

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0

channel ...

channel ...

channel n

channel n time

channel 1

channel 2

channel 3

channel 1

Signals from a given sources are assigned to specic time slots Time slot assignment might be xed (synchronous TDM) or dynamic (statistical TDM)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 69

channel 2

channel 3

Pure Aloha
Station A: Station B: Station C:

111 000 111 000 111 111 000 000 111 000 111 111 000 000 111 000 111 000 111 000 111 000 111 000
time

Developed in the 1970s at the University of Hawai Sender sends data as soon as data becomes available Collisions are detected by listening to the signal Retransmit after a random pause after a collision Not very efcient ( 18 % of the channel capacity)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 70

Slotted Aloha
Station A: Station B: Station C:

111 000 111 000 111 000 111 000 111 111 000 000 111 111 000 000
time

Senders do not send immediately but wait for the beginning of a time slot Time slots may be advertised by short control signals Collisions only happen at the start of a transmission Avoids sequences of partially overlaying data blocks Slightly more efcient ( 37 % of the channel capacity)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 71

Carrier Sence Multiple Access (CSMA)


Station A: Station B: Station C:

111 000 111 000 111 000 111 000 111 111 000 000 111 111 000 000
time

Sense the media whether it is unused before starting a transmission Collisions are still possible (but less likely) 1-persistent CSMA: sender sends with probability 1 p-persistent CSMA: sender sends with probability p non-persistent CSMA: sender waits for a random time period before it retries if the media is busy
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 72

CSMA with Collision Detection (CSMA-CD)


Station A: Station B: Station C:

1 0 1 0 1 0 1 0

1 0 1 0 1 0 1 0
time

Terminate the transmission as soon as a collision has been detected (and retry after some random delay) Let be the propagation delay between two stations with maximum distance Senders can be sure that they successfully acquired the medium after 2 time units Used by the classic Ethernet developed at Xerox Parc
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 73

Multiple Access with Collision Avoidance (MACA)


RTS CTS

Station A: Station B: Station C: time

A station which is ready to send rst sends a short RTS (ready to send) message to the receiver The receiver respondes with a short CTS (clear to send) message Stations who receive RTS or CTS must stay quiet Solves the hidden station and exposed station problem
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 74

Token
Station A: Station B: Station C: time

A token is a special bit pattern circulating between stations only the station holding the token is allowed to send data Token mechanisms naturally match physical ring topologies logical rings may be created on other physical topologies Care must be taken to handle lost or duplicate token

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 75

Transmission Error Detection


Data transmission often leads to transmission errors that affect one or more bits Simple parity bits can be added to code words to detect bit errors Parity bit schemes are not very strong in detecting errors which affect multiple bits Computation of error check codes must be efcient (in hardware and/or software)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 76

Cyclic Redundancy Check (CRC)


A bit sequence (bit block) bn bn1 . . . b1 b0 is represented as a polynomial B(x) = bn xn + bn1 xn1 + . . . + b1 x + b0 Arithmetic operations:
0+0=1+1=0 1+0=0+1=1 11=1 00=01=10=0

A generator polynomial G(x) = gr xr + . . . g1 x + g0 with gr = 1 and g0 = 1 is agreed upon between the sender and the receiver The sender transmits U (x) = xr B(x) + t(x) with
t(x) = (xr B(x)) mod G(x)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 77

Cyclic Redundancy Check (CRC)

The receiver tests whether the polynomial corresponding to the received bit sequence can be divided by G(x) without a remainder Efcient hardware implementation possible using XOR gates and shift registers Only errors divisible by G(x) will go undetected Example: Generator polynomial G(x) = x3 + x2 + 1 (corresponds to the bit sequence 1101) Message M = 1001 1010 (corresponds to the polynomial B(x) = x7 + x4 + x3 + x)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 78

CRC Computation
1001 1010 000 : 1101 1101 ---100 1 110 1 ----10 00 11 01 ----1 011 1 101 ----1100 1101 ---1 000 1 101 ----101 =>

transmitted bit sequence 1001 1010 101


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 79

CRC Verication
1001 1010 101 : 1101 1101 ---100 1 110 1 ----10 00 11 01 ----1 011 1 101 ----1100 1101 ---1 101 1 101 ----0 =>

remainder 0, assume no transmission error


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 80

Choosing Generator Polynomials


G(x)

detects all single-bit errors if G(x) has more than one non-zero term

G(x) G(x)

detects all double-bit errors, as long as G(x) has a factor with three terms

detects any odd number of errors, as long as G(x) contains the factor (x + 1) detects any burst errors for which the length of the burst is less than or equal to r detects a fraction of error bursts of length r + 1; the fraction equals to 1 2(r1)

G(x)

G(x) G(x)

detects a fraction of error bursts of length greater than r + 1; the fraction equals to 1 2r
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 81

Well-known Generator Polynomials


The HEC polynomial G(x) = x8 + x2 + x + 1 is used by the ATM cell header The CRC-16 polynomial G(x) = x16 + x15 + x2 + 1 detects all single and double bit errors, all errors with an odd number of bits, all burst errors with 16 or less bits and more than 99% of all burst errors with 17 or more bits The CRC-CCITT polynomial G(x) = x16 + x15 + x5 + 1 is used by the HDLC protocol The CRC-32 polynomial G(x) = x32 +x26 +x23 +x22 +x16 +x12 +x11 +x10 +x8 +x7 +x5 +x4 +x2 +x+1 is used by the IEEE 802 standards

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 82

Internet Checksum
uint16_t checksum(uint16_t *buf, int count) { uint32_t sum = 0; while (count--) { sum += *buf++; if (sum & 0xffff0000) { sum &= 0xffff; sum++; } } return (sum & 0xffff); }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 83

Internet Checksum Properties


Summation is commutative and associative Computation independent of the byte order Computation can be parallelized on processors with word sizes larger than 16 bit Individual data elds can be modied without having to recompute the whole checksum Can be integrated into copy loop Often implemented in assembler or special hardware For details, see RFC 1071, RFC 1141, and RFC 1624

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 84

Further Error Situations

Despite bit errors, the following transmission errors can occur Loss of complete data frames Duplication of complete data frames Receipt of data frames that we never send Reordering of data frames during transmission In addition, the sender must adapt its speed to the speed of the receiver (end-to-end ow control) Finally, the sender must react to congestion situations in the network (congestion control)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 85

Sequence Numbers

The sender assigns growing sequence numbers to all data frames Receiver can detect reordered or duplicated frames Loss of a frame can be determined if a missing frame cannot travel in the network anymore Sequence numbers can grow quickly on fast networks

sender 1

receiver

1 2

3 3 4 4 5

6 6 7 5 7 8 4 8

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 86

Acknowledgements

Retransmit to handle errors A positive acknowledgement (ACK) is sent to inform the sender that the transmission of a frame was successful A negative acknowledgement (NACK) is sent to inform the sender that the transmission of a frame was unsuccessful Stop-and-wait protocol: a frame is only transmitted if the previous frame was been acknowledged
n

sender

receiver

n ACK n ACK n

n+1

n+1 (error) NACK n+1 NACK n+1 n+1 n+1 ACK n+1 ACK n+1

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 87

Timers

Timer can be used to detect the loss of frames or acknowledgments Sender can use a timer to retransmit a frame if no acknowledgment has been received in time Receiver can use a timer to retransmit acknowledgments Problem: Timers must adapt to the current delay in the network

sender n

receiver

n ACK n

n n ACK n ACK n n+1

n+1 n+1 ACK n+1 ACK n+1

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 88

Flow Control

Allow the sender to send multiple frames before waiting for acknowledgments Improves efciency and overall delay Sender must not overow the receiver The stream of frames should be smooth and not bursty Speed of the receiver can vary over time

sender n+0 n+1 n+2 ACK n+0 n+3 ACK n+1 n+4 ACK n+2 n+5 ACK n+3 n+6 n+4 ACK n+5 n+7 n+6 ACK n+4 n+8 ACK n+7 n+9 ACK n+6 ACK n+8 ACK n+9

receiver

n+0 ACK n+0 n+1 ACK n+1 n+2 ACK n+2 n+3 ACK n+3

n+5 ACK n+5 n+6 ACK n+6 n+4 ACK n+4 n+7 ACK n+7 n+6 ACK n+6 n+8 ACK n+8 n+9 ACK n+9

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 89

Sliding Window Flow Control


Sender and receiver agree on a window of the sequence number space The sender may only transmit frames whose sequence number falls into the senders window Upon receipt of an acknowledgement, the senders window is moved The receiver only accepts frames who sequence numbers fall into the receivers window Frames with increasing sequence number are delivered and the receiver window is moved. The size of the window controls the speed of the sender and must match the puffer capacity of the receiver
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 90

Sliding Window Implementation

Implementation on the sender side: SWS (send window size) LAR (last ack received) LFS (last frame send) Invariant: LFS - LAR + 1 SWS Implementation on the receiver side: RWS (receiver window size) LFA (last frame acceptable) NFE (next frame expected) Invariant: LFA - NFE + 1 RWS

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 91

Congestion Control

Flow control is used to control adapt the speed of the sender to the speed of the receiver Congestion control is used to adapt the speed of the sender to the speed of the network Principles: Sender and receiver reserve bandwidth and puffer capacity in the network Intermediate systems drop frames under congestion and signal the event to the senders involved Intermediate systems send control messages (choke packets) when congestion builds up to slow down senders

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 92

OSI Reference Model


Application Process Application Process

End System Application System Transport System Application

End System Application Aplication System Transport System

Presentation

Presentation

Session

Session

Transport

Transit System Network

Transport

Network

Network

Data Link

Data Link

Data Link

Physical

Physical

Physical

Media

Media

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 93

Physical and Data Link Layer

Physical Layer: Transmission of an unstructured bit stream Standards for cables, connectors and sockets Encoding of binary values (voltages, frequencies) Synchronization between sender and receiver Data Link Layer: Transmission of bit sequences in so called frames Data transfer between directly connected systems Detection and correction of transmission errors Flow control between senders and receivers Realization usually in hardware

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 94

Network and Transport Layer

Network Layer: Determination of paths through a complex network Multiplexing of end-to-end connections over an intermediate systems Error detection / correction between network nodes Flow and congestion control between end systems Transmission of datagrams or packets in packet switched networks Transport Layer: Reliable end-to-end communication channels Connection-oriented and connection-less services End-to-end error detection and correction End-to-end ow and congestion control
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 95

Session and Presentation Layer

Session Layer: Synchronization and coordination of communicating processes Interaction control (check points) Presentation Layer: Harmonization of different data representations Serialization of complex data structures Data compression Application Layer: Fundamental application oriented services Terminal emulationen, name and directory services, data base access, network management, electronic messaging systems, process and machine control, . . .
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 96

3. Local Area Networks (IEEE 802)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 97

IEEE 802 Overview


802.2 Logical Link Control 802 Overview and Architecture

802.1 Management

802.1 Bridging

802.3 Medium Access


Ethernet

802.4 Medium Access


Token Bus

802.5 Medium Access


Token Ring

802.6 Medium Access


DQDB

802.9 Medium Access

802.11 Medium Access


WaveLan

802.12 Medium Access

802.3 Physical

802.4 Physical

802.5 Physical

802.6 Physical

802.9 Physical

802.11 Physical

802.12 Physical

IEEE 802 standards are developed since the mid 1980s Dominating technology in local area networks (LANs)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 98

IEEE 802 Layers


Application Process

EndSystem Application System Transport System Application

Representation

Session

Transport

Network Logical Link Control (LLC) Data Link Media Access Control (MAC) Physical Physical (PHY) IEEE 802

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 99

IEEE 802 Layers


The Logical Link Control (LLC) layer provides a service interface which is the same for all IEEE 802 protocols The Medium Access Control (MAC) layer denes the method used to access the media being used The Physical (PHY) layer denes the physical properties for the various transmission media that can be used with a certain IEEE 802.x protocol

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 100

IEEE 802 Addresses


IEEE 802 addresses, sometimes also called MAC addresses, are 6 bytes (48 bit) long The common notation is a sequence of hexadecimal numbers with the bytes separated from each other using colons or hyphens ( 00:D0:59:5C:03:8A or 00-D0-59-5C-03-8A) The highest bit indicates whether it is a unicast address (0) or a multicast address (1). The second highest bit indicates whether it is a local (1) or a global (0) address The broadcast address, which represents all stations within a broadcast domain, is FF-FF-FF-FF-FF-FF Globally unique addresses are created by vendors who apply for a number space delegation by the IEEE
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 101

IEEE 802.3 (Ethernet)

Evolution of Ethernet Technology: 1976 Original Ethernet paper published 1990 10 Mbps Ethernet over twisted pair (10BaseT) 1995 100 Mbps Ethernet 1998 1 Gbps Ethernet 2002 10 Gbps Ethernet 2010* 100 Gbps Ethernet (predicted) Classic Ethernet used CSMA/CD and a shared bus Todays Ethernet is using a star topology with full duplex links Link aggregation allows to bundle links

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 102

IEEE 802.3 Frame Format

preamble

7 Byte

startofframe delimiter (SFD)

1 Byte

destination MAC address

6 Byte

source MAC address

6 Byte

length / type field

2 Byte

data (network layer packet) 461500 Byte

padding (if required)

frame check sequence (FCS)

4 Byte

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 103

641518 Byte

Transmitting IEEE 802.3 Frames

wait for frame to transmit

format frame for transmission

carrier sense signal on? Y N wait interframe gap time

start transmission

collision detected? Y N complete transmission and set status transmission done transmit jam sequence and increment # attempts

set status attempt limit exceeded Y

attempt limit reached? N compute and wait backoff time

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 104

Receiving IEEE 802.3 Frames


incoming signal detected? N Y set carrier sense signal on

obtain bit sync and wait for SFD

receive frame

FCS and frame size OK? N Y destination address matches own or group address? Y pass data to higher-layer protocol entity for processing discard frame N

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 105

Ethernet Media Types


Name 10Base2 10Base5 10BaseT 10BaseF 100BaseT4 100BaseTX 100BaseFX 1000BaseLX 1000BaseSX 1000BaseCX 1000BaseT Medium coax, =0.25 in coax, =0.5 in twisted pair ber optic twisted pair twisted pair ber optic ber optic ber optic coax twisted pair Max. Length 200 m 500 m 100 m 2000 m 100 m 100 m 412 m 500 / 550 / 5000 m 220-275 / 550 m 25 m 100 m

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 106

Bridged IEEE 802 Networks

802.11

B3 10Base5 B1 B2

10Base2

100BaseT

10Base2

802.5

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 107

IEEE 802 Bridges

Network Bridge

Network

IEEE 802.2 LLC

IEEE 802.2 LLC

IEEE 802.2 LLC

IEEE 802.3 MAC

IEEE 802.3 MAC

IEEE 802.11 MAC

IEEE 802.11 MAC

IEEE 802.3 PHY

IEEE 802.3 PHY

IEEE 802.11 PHY

IEEE 802.11 PHY

Source Routing Bridges: Sender has to route the frame through the bridged network Transparent Bridges: Bridges are transparent to senders and receivers
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 108

Transparent Bridges (IEEE 802.1D)


Forwarding database Station Port address number

Port management software

Bridge protocol entity

MAC chipset

Memory buffers

MAC chipset

Port 1

Port 2

Lookup an entry with a matching destination address in the forwarding database and forward the frame to the associated port. If no matching entry exists, forward the frame to all outgoing ports except the port from which the frame was received (ooding).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 109

Backward Learning

Algorithm: The forwarding database is initially empty. Whenever a frame is received, add an entry to the forwarding database (if it does not yet exist) using the frames source address and the incoming port number. Reinitialize the timer attached to the forwarding base entry for the received frame. Aging of unsused entries reduces forwarding table size and allows bridges to adapt to topology changes. Backward learning requires an cyclic-free topology.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 110

Spanning Tree
1. The root of the spanning tree is selected (root bridge). The root bridge is the bridge with the highest priority and the smallest bridge address. 2. The costs for all possible paths from the root bridge to the various ports on the bridges is computed (root path cost). Every bridge determines which root port is used to reach the root bridge at the lowest costs. 3. The designated bridge is determined for each segment. The designated bridge of a segment is the bridge which connects the segment to the root bridge with the lowest costs on its root port. The ports used to reach designated bridges are called designated ports. 4. All ports are blocked which are not designated ports. The resulting active topology is a spanning tree.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 111

Port States

Blocking: A port in the blocking state does not participate in frame forwarding. Listening: A port in the transitional listening state has been selected by the spanning tree protocol to participate in fame forwarding. Learning: A port in the transitional learning state is preparing to participate in frame forwarding. Forwarding: A port in the forwarding state forwards frames. Disabled: A port in the disabled state does not participate in frame forwarding or the operation of spanning tree protocol.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 112

Port State Transitions

Blocking

Listening

Disabled

Learning

Forwarding

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 113

Broadcast Domains

A bridged LAN denes a single broadcast domain: All frames send to the broadcast address are forwarded on all links in the bridged networks. Broadcast trafc can take a signicant portion of the available bandwidth. Devices running not well-behaving applications can cause broadcast storms. Bridges may ood frames if the MAC address cannot be found in the forwarding table. It is desirable to reduce the size of broadcast domains in order to separate trafc in a large bridged LAN.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 114

IEEE 802.1Q Virtual LANs


VLANs allow to separate the trafc on an IEEE 802 network A station connected to a certain VLAN only sees frames belonging to that VLAN VLANs can reduce the network load; frames that are targeted to all stations (broadcasts) will only be delivered to the stations connected to the VLAN. Stations (especially server) can be a member of multiple VLANs simultaneously Separation of logical LAN topologies from physical LAN topologies VLANs are identied by a VLAN identier (1..4094)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 115

IEEE 802.1Q Tagged Frames

preamble

7 Byte

startofframe delimiter (SFD) 1 Byte destination MAC address 6 Byte

The tag protocol identier indicates a tagged frame. Tagged 802.3 frames are 4 bytes longer than original 802.3 frames. Tagged frames should only appear on links that are VLAN aware.

source MAC address 6 Byte tag protocol identifier priority CFI vlan identifier length / type field 2 Byte 2 Byte 641522 Byte

data (network layer packet) 461500 Byte

padding (if required)

frame check sequence (FCS)

4 Byte

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 116

VLAN Membership

Bridge ports can be assigned to VLANs in different ways: Ports are administratively assigned to VLANs (port-based VLANs) MAC addresses are administratively assigned to VLANs (MAC address-based VLANs) Frames are assigned to VLANs based on the payload contained in the frames (protocol-based VLANs) Members of a certain multicast group are assigned to VLAN (multicast group VLANs) The Generic Attribute Registration Protocol (GARP) can (among other things) propagate VLAN membership information.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 117

IEEE 802.1x Port Access Control


Port based access control grants access to the a switch port based on the identity of the connected machine. The components involved in 802.1x: The supplicant runs on a machine connecting to a bridge and provides authentication information. The authenticator runs on a bridge and enforces authentication decisions. The authentication server is a (logically) centralized component which provides authentication decisions (usually via RADIUS). The authentication exchange uses the Extensible Authentication Protocol (EAP). IEEE 802.1x is becoming increasingly popular as a roaming solution for IEEE 802.11 wireless networks.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 118

IEEE 802.11

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 119

References

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 120

4. Internet Network Layer

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 121

Internet Model
Application Process Application Process

Endsystem Application

End System Application

Transport

Router Network

Transport

Internet

Internet

Subnetwork

Subnetwork

Subnetwork

Media

Media

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 122

Terminology

A node is a device which implements an Internet Protocol (such as IPv4 or IPv6). A router is a node that forwards IP packets not addressed to itself. A host is any node which is not a router. A link is a communication channel below the IP layer which allows nodes to communicate with each other (e.g., an Ethernet). The neighbors is the set of all nodes attached to the same link.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 123

Terminology (cont.)

An interface is a nodes attachement to a link. An IP address identies an interface or a set of interfaces. An IP packet is a bit sequence consisting of an IP header and the payload. The link MTU is the maximum transmission unit, i.e., maximum packet size in octets, that can be conveyed over a link. The path MTU is the the minimum link MTU of all the links in a path between a source node and a destination node.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 124

Internet Network Layer Protocols

The Internet Protocol version 4 (IPv4) provides for transmitting datagrams from sources to destinations using 4 byte IPv4 addresses The Internet Control Message Protocol version 4 (ICMPv4) is used for IPv4 error reporting and testing The Address Resolution Protocol (ARP) maps IPv4 addresses to IEEE 802 addresses The Internet Protocol version 6 (IPv6) provides for transmitting datagrams from sources to destinations using 16 byte IPv6 addresses The Internet Control Message Protocol version 6 (ICMPv6) is used for IPv6 error reporting, testing, auto-conguration and address resolution
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 125

Reserved IPv4 Address Blocks


0.0.0.0/8 10.0.0.0/8 14.0.0.0/8 24.0.0.0/8 39.0.0.0/8 127.0.0.0/8 128.0.0.0/16 169.254.0.0/16 172.16.0.0/12 191.255.0.0/16 192.0.0.0/24 192.0.2.0/24 192.88.99.0/24 192.168.0.0/16 198.18.0.0/15 223.255.255.0/24 224.0.0.0/4 240.0.0.0/4 "This" Network Private-Use Networks Public-Data Networks Cable Television Networks Class A Subnet Experiment Loopback Reserved by IANA Link Local Private-Use Networks Reserved by IANA Reserved by IANA Test-Net / Documentation 6to4 Relay Anycast Private-Use Networks Network Interconnect / Device Benchmark Testing Reserved by IANA Multicast Reserved for Future Use [RFC1700] [RFC1918] [RFC1700] [RFC3330] [RFC1797] [RFC1700] [RFC3330] [RFC3330] [RFC1918] [RFC3330] [RFC3330] [RFC3330] [RFC3068] [RFC1918] [RFC2544] [RFC3330] [RFC3171]
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 126

[RFC1700]

IPv4 Packet Format


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| IHL |Type of Service| Total Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identification |Flags| Fragment Offset | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Time to Live | Protocol | Header Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 127

IPv4 Forwarding

IPv4 addresses can be devided into a part which identies a network (netid) and a part which identies an interface of a node within that network (hostid). The number of bits of an IPv4 address which identies the network is called the address prex. The address prex is commonly written as a decimal number, appended to the usual IPv4 address notation by using a slash (/) as a separator (e.g., 192.0.2.0/24). The forwarding table realizes a mapping of the network prex to the next node (next hop) and the local interface used to reach the next node. For every IP packet, the entry in the forwarding table has to be found with longest matching network address prex (longest prex match).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 128

IPv4 Forwarding Example


Prefix 0.0.0.0/0 134.169.34.0/24 134.169.0.0/16 134.169.9.10 Next Hop 134.169.2.1 134.169.246.34 134.169.9.10 Interface eth0 eth0 eth0

R1 134.169.2.1

H1

134.169.0.0/16 134.169.246.34 Prefix 0.0.0.0/0 134.169.0.0/16 134.169.34.0/24 Next Hop 134.169.2.1 134.169.246.34 134.169.34.12 Interface eth0 eth0 eth1 R2 134.169.34.12 H2 Prefix 0.0.0.0/0 134.169.34.0/24 134.169.34.1 Next Hop 134.169.34.12 134.169.34.1 Interface eth0 eth0

134.169.34.0/24

Default forwarding table entries use the IPv4 prex 0.0.0.0/0. This example identies directly reachable hosts by using the IP address of a local interface as the next hop. Other representations use an explicit direct/indirect ag.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 129

Longest Prex Match: Binary Trie


H
0 1

C
0 0 1

D
0

C
1

A
0 0 1 0 1 0 1 0

D
0 0 0 1

J
0 1

B
0

F
1

D
1

B
0

D
0

A binary trie is the representation of the binary prexes in a tree.


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 130

Longest Prex Match: Path Compressed Trie


1

H
0 2 1 2

0 3

1 3

1 3

C
0 4 0 1 4

D
0 4

C
1 5

G
0

A
1 5 0 1 5 0 1 0

F
1

D
0 1

J
0 1

A path compressed trie is obtained by collapsing all one-way branch nodes.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 131

Longest Prex Match: Two-bit stride multibit Trie


H

00

01

10

11

00

00

01

10

11

00

01

10

00

01

00

01

01

10

11

00

01

10

00

01

10

A two-bit multibit trie reduces the number of memory accesses.


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 132

IPv4 Error Handling (ICMPv4)

The Internet Control Message Protocol (ICMP) is used to inform nodes about problems encountered while forwarding IP packets. Echo Request/Reply messages are used to test connectivity. Unreachable Destination messages are used to report why a destination is not reachable. Redirect messages are used to inform the sender of a better (shorter) path. ICMPv4 is an integral part of IPv4 (even though it is a different protocol).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 133

ICMPv4 Echo Request/Reply


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Code | Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Identifier | Sequence Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Data ... +-+-+-+-+-

The ICMP echo request message (type = 8, code = 0) asks the destination node to return an echo reply message (type = 0, code = 0). The Identifier and Sequence Number elds are used to correlate incoming replies with outstanding requests. The data eld may contain any additional data.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 134

ICMPv4 Unreachable Destinations


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Code | Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | unused | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Internet Header + 64 bits of Original Data Datagram | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Type eld has the value 3 for all unreachable destination messages. The Code eld indicates why a certain destination is not reachable. The data eld contains the beginning of the packet which caused the ICMP unreachable destination message.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 135

Unreachable Destination Codes


0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 Net Unreachable Host Unreachable Protocol Unreachable Port Unreachable Fragmentation Needed and Dont Fragment was Set Source Route Failed Destination Network Unknown Destination Host Unknown Source Host Isolated Communication with Destination Network is Administratively Prohibited Communication with Destination Host is Administratively Prohibited Destination Network Unreachable for Type of Service Destination Host Unreachable for Type of Service Communication Administratively Prohibited Host Precedence Violation Precedence cutoff in effect
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 136

ICMPv4 Redirect
0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Type | Code | Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Router Internet Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Internet Header + 64 bits of Original Data Datagram | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Type eld has the value 5 for redirect messages. The Code eld indicates which type of packets should be redirected. The Router Internet Address eld identies the IP router to which packets should be redirected. The data eld contains the beginning of the packet which caused the ICMP redirect message.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 137

Redirect Codes
0 1 2 3 Redirect datagrams for the Network. Redirect datagrams for the Host. Redirect datagrams for the Type of Service and Network. Redirect datagrams for the Type of Service and Host.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 138

IPv4 Fragmentation

IPv4 packets that do not t the outgoing link MTU will get fragmented into smaller packets that t the link MTU. The Identification eld contains the same value for all fragments of an IPv4 packet. The Fragment Offset eld contains the relative position of a fragment of an IPv4 packet (counted in 64-bit words). The ag More Fragments (MF) is set if more fragments follow. The Dont Fragment (DF) ag can be set to indicate that a packet should not be fragmented. IPv4 allows fragments to be further fragmented without intermediate reassembly.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 139

Fragmentation Considered Harmful

The receiver must buffer fragments until all fragments have been received. However, it is not useful to keep fragments in a buffer indenitely. Hence, the TTL eld of all buffered packets will be decremented once per second and fragments are dropped when the TTL eld becomes zero. The loss of a fragment causes in most cases the sender to resend the original IP packet which in most cases gets fragmented as well. Hence, the probability of transmitting a large IP packet successfully goes quickly down if the loss rate of the network goes up. Since the Identification eld identies fragments that belong together and the number space is limited, one cannot fragment an arbitrary large number of packets.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 140

MTU Path Discovery (RFC 1191)


The sender sends IPv4 packets with the DF ag set. A router which has to fragment a packet with the DF ag turned on drops the packet and sends an ICMP message back to the sender which also includes the local maximum link MTU. Upon receiving the ICMP message, the sender adapts his estimate of the path MTU and retries. Since the path MTU can change dynamically (since the path can change), a once learned path MTU should be veried and adjusted periodically. Not all routers send necessarily the local link MTU. In this cases, the sender tries typical MTU values, which is usually faster than doing a binary search.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 141

IPv4 over IEEE 802.3 (RFC 894)

IPv4 packets are sent in the payload of IEEE 802.3 frames according to the specication in RFC 894: IPv4 packets are identied by the value 0x800 in the IEEE 802.3 type eld. According to the maximum length of IEEE 802.3 frames, the maximum link MTU is 1500 byte. The mapping of IPv4 addresses to IEEE 802.3 addresses is table driven. Entries in so called mapping tables (sometimes also called address translation tables) can either be statically congured or dynamically learned.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 142

IPv4 Address Translation (RFC 826)


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Hardware Type | Protocol Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | HLEN | PLEN | Operation | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sender Hardware Address (SHA) = +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ = Sender Hardware Address (SHA) | Sender IP Address (SIP) = +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ = Sender IP Address (SIP) | Target Hardware Address (THA) = +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ = Target Hardware Address (THA) | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Target IP Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Address Resolution Protocol (ARP) resolved IPv4 addresses to link-layer addresses of neighboring nodes.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 143

ARP and RARP

The Hardware Type eld identies the address type used on the link-layer (the value 1 is used for IEEE 802.3 MAC addresses). The Protocol Type eld identies the network layer address type (the value 0x800 is used for IPv4). ARP/RARP packets use the 802.3 type value 0x806. The Operation eld contains the message type: ARP Request (1), ARP Response (2), RARP Request (3), RARP Response (4). The sender lls, depending on the request type, either the Target Hardware Address (RARP) eld or the Target IP Address (ARP) eld. The responding node swaps the Sender/Target els and lls the empty elds with the requested information.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 144

DHCP Version 4 (DHCPv4)

The Dynamic Host Conguration Protocol (DHCP) allows nodes to retrieve conguration parameters from a central conguration server. A binding is a collection of conguration parameters, including at least an IP address, associated with or bound to a DHCP client. Bindings are managed by DHCP servers. Bindings are typically valid only for a limited lifetime. See RFC 2131 for the details and the message formats. See RFC 3118 for security aspects due to lack of authentication.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 145

DHCPv4 Message Types


The DHCPDISCOVER message is a broadcast message which is sent by DHCP clients to locate DHCP servers. The DHCPOFFER message is sent from a DHCP server to offer a client a set of conguration parameters. The DHCPREQUEST is sent from the client to a DHCP server as a response to a previous DHCPOFFER message, to verify a previously allocated binding or to extend the lease of a binding. The DHCPACK message is sent by a DHCP server with some additional parameters to the client as a positive acknowledgement to a DHCPREQUEST. The DHCPNAK message is sent be a DHCP server to indicate that the clients notion of a conguration binding is incorrect.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 146

DHCPv4 Message Types (cont.)


The DHCPDECLINE message is sent by a DHCP client to indicate that parameters are already in use. The DHCPRELEASE message is sent by a DHCP client to inform the DHCP server that conguration parameters are no longer used. The DHCPINFORM message is sent from the DHCP client to inform the DHCP server that only local conguration parameters are needed.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 147

IPv6 Interface Identier


Interface identiers in IPv6 unicast addresses are used to uniquely indentify interfaces on a link. For all unicast addresses, except those that start with binary 000, interface identiers are required to be 64 bits long and to be constructed in modied EUI-64 format. Combination of the interface identier with a network prex leads to an IPv6 address. Link local unicast addresses have the prex FE80::/10. Ongoing work on the construction of globally unique local addresses using cryptographic hash functions.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 148

Modied EUI-64 Format


|0 1|1 3|3 4| |0 5|6 1|2 7| +----------------+----------------+----------------+ |cccccc0gcccccccc|ccccccccmmmmmmmm|mmmmmmmmmmmmmmmm| +----------------+----------------+----------------+ |0 1|1 3|3 4|4 6| |0 5|6 1|2 7|8 3| +----------------+----------------+----------------+----------------+ |cccccc1gcccccccc|cccccccc11111111|11111110mmmmmmmm|mmmmmmmmmmmmmmmm| +----------------+----------------+----------------+----------------+

Modied EUI-64 format can be obtained from IEEE 802 MAC addresses. However, these constructed IPv6 addresses can be used to track mobile nodes used in different networks.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 149

IPv6 Packet Format (RFC 2460)


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ |Version| Traffic Class | Flow Label | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Payload Length | Next Header | Hop Limit | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + Source Address + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + Destination Address + | | slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 150 + +

IPv6 Extension Header

All IPv6 header extensions and options are carried in a header daisy chain. Routing Extension Header (RH) Fragment Extension Header (FH) Authentication Extension Header (AH) Encapsulating Security Payload Extension Header (ESP) Hop-by-Hop Options Extension Header (HO) Destination Options Extension Header (DO) LinkMTUs must at least be 1280 bytes and only the sender is allowed to fragment packets.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 151

IPv6 Forwarding

IPv6 packets are forwarded using the longest prex match algorithm which is used in the IPv4 network. IPv6 addresses have much longer prexes which allows to do better address aggregation in order to reduce the number of forwarding table entries. Due to the length of the prexes, it is even more crucial to use an algorithm whose complexity does not dependent on the number of entries in the forwarding table or the average prex length.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 152

IPv6 Error Handling (ICMPv6) (RFC 2463)


The Internet Control Message Protocol Version 6 (ICMPv6) is used like ICMPv4. ICMPv6 is used to report error situations. ICMPv6 can be used to run diagnostic tests. ICMPv6 supports the auto-conguration of IPv6 nodes. ICMPv6 supports the resolution of IPv6 addresses to link-layer addresses.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 153

IPv6 over IEEE 802.3 (RFC 2464)

IPv6 packets can be encapsulated into IEEE 802.3 frames and sent over IEEE 802.3 packets as dened in RFC 2464: Frames containing IPv6 packets are identied by the value 0x86dd in the IEEE 802.3 type eld. The link MTU is 1500 bytes which corresponds to the IEEE 802.3 maximum frame size of 1500 byte. The mapping of IPv6 addresses to IEEE 802.3 addresses is table driven. Entries in so called mapping tables (sometimes also called address translation tables) can either be statically congured or dynamically learned using neighbor discovery.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 154

IPv6 Neighbor Discovery (RFC 2461)


Discovery of the routers attached to a link. Discovery of the prexes used on a link. Discovery of parameters such as the link MTU or the hop limit for outgoing packets. Automatic conguration of IPv6 addresses. Resolution of IPv6 addresses to link-layer addresses. Determination of next-hop addresses for IPv6 destination addresses. Detection of unreachable nodes which are attached to the same link. Detection of conicts during address generation. Discovery of better alternatives to forward packets.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 155

References
[1] C. Kent and J. Mogul. Fragmentation Considered Harmful. In Proc. SIGCOMM 87 Workshop on Frontiers in Computer Communications Technology, August 1987. [2] D. C. Plummer. An Ethernet Address Resolution Protocol. RFC 826, MIT, November 1982. [3] C. Hornig. A Standard for the Transmission of IP Datagrams over Ethernet Networks. RFC 894, Symbolics Cambridge Research Center, April 1984. [4] J. Mogul and S. Deering. Path MTU Discovery. RFC 1191, DECWRL, Stanford University, November 1990. [5] R. Droms. Dynamic Host Conguration Protocol. RFC 2131, Bucknell University, March 1997. [6] R. Droms and W. Arbaugh. Authentication for DHCP Messages. RFC 3118, Cisco Systems, University of Maryland, June 2001. [7] S. Deering and R. Hinden. Internet Protocol, Version 6 (IPv6) Specication. RFC 2460, Cisco, Nokia, December 1998. [8] T. Narten, E. Nordmark, and W. Simpson. Neighbor Discovery for IP Version 6 (IPv6). RFC 2461, IBM, Sun Microsystems, Daydreamer, December 1998. [9] A. Conta and S. Deering. Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6) Specication. RFC 2463, Lucent, Cisco Systems, December 1998. [10] M. Crawford. Transmission of IPv6 Packets over Ethernet Networks. RFC 2464, Fermilab, December 1998.
slides.tex Networks and Protocols 2006 Jurgen [11] IANA. Special-Use IPv4 Addresses. RFC 3330, Internet Assigned NumbersSchonwalder 9/10/2006 8:18 p. 156 Authority,

5. Internet Routing

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 157

Internet Routing Protocols

The Routing Information Protocol (RIP) is a simple distance vector IGP routing protocol based on the distributed Bellman-Ford shortest path algorithm The Open Shortest Path First (OSPF) IGP routing protocol oods link state information so that every node can compute shortest paths using Dijkstras shortest path algorithm The Intermediate System to Intermediate System (IS-IS) routing protocol is another link state routing protocol adopted from the ISO/OSI standards The Border Gateway Protocol (BGP) EGP routing protocol propagates reachability information between ASs, which is used to make policy-based routing decisions

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 158

Bellman-Ford

Let G = (V, E) be a graph with the vertices V and the edges E with n = |V | and m = |E|. Let D be an n n distance matrix in which D(i, j) denotes the distance from node i V to the node j V . Let H be an n n matrix in which H(i, j) E denotes the edge on which node i V forwards a message to node j V. Let M be a vector with the link metrics, S a vector with the start node of the links and D a vector with the end nodes of the links.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 159

Bellman-Ford (cont.)
1. Set D(i, j) = for i = j and D(i, j) = 0 for i = j . 2. For all edges l E and for all nodes k V : Set i = S[l] and j = D[l] and d = M [l] + D(j, k). 3. If d < D(i, k), set D(i, k) = d and H(i, k) = l. 4. Repeat from step 2 if at least one D(i, k) has changed. Otherwise, stop.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 160

Bellman-Ford Example (Round 0)


A B C

Dest. A

Link local Link local Link local

Cost 0 Cost 0 Cost 0

Dest. B

Link local Link local

Cost 0 Cost 0

Dest. C

Dest. D

Dest. E

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 161

Bellman-Ford Example (Round 1)


A Dest. A B D C Dest. B C E E Dest. B C D E Link local A-B A-D Link B-C local C-E Link B-E C-E D-E local Cost 0 1 1 Cost 1 0 1 Cost 1 1 1 0 D B Dest. A B C E Dest. A D E Link A-B local B-C B-E Link A-D local D-E Cost 1 0 1 1 Cost 1 0 1

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 162

Bellman-Ford Example (Round 2)


A Dest. A B C D E C Dest. A B C D E E Dest. A B C D Link local A-B A-B A-D A-B Link B-C B-C local C-E C-E Link B-E B-E C-E D-E Cost 0 1 2 1 2 Cost 2 1 0 2 1 Cost 2 1 1 1
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 163

Dest. A B C D E

Link A-B local B-C A-B B-E Link A-D A-D D-E local D-E

Cost 1 0 1 2 1 Cost 1 2 2 0 1

Dest. A B C D E

Count to Innity

Consider the following topology:


A --- B --- C

After some distance vector exchanges, C has learned that he can reach A by sending packets via B. When the link between A and B breaks, B will learn from C that he can still reach A at a higher cost (count of hops) by sending a packet to C. This information will now be propagated to C, C will update the hop count and subsequently announce a more expensive not existing route to B. This counting continues until the costs reach ininity.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 164

Split Horizon

Idea: Nodes never announce the reachability of a network to neighbors from which they have learned that a network is reachable. Does not solve the count-to-innity problem in all cases:
A --- B \ / C | D

If the link between C and D breaks, B will not announce to C that it can reach D via C and A will not announce to C that it can reach D via C (split horizon). But after the next round of distance vector exchanges, A will announce to C that it can reach D via B and B will announce to C that it can reach D via A.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 165

Split Horizon with Poisoned Reverse

Idea: Nodes announce the unreachability of a network to neighbors from which they have learned that a network is reachable. Does not solve the count-to-innity problem in all cases:
A --- B \ / C | D

C now actively announces innity for the destination D to A and B.

However, since the exchange of the distance vectors is not synchronized to a global clock, the following exchange can happen:
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 166

Split Horizon with Poisoned Reverse


1. C rst announces innity for the destination D to A and B. 2. A and B now update their local state with the metric innity for the destination D directly via C. The other stale information to reach D via the other directly connected node is not updated. 3. A and B now send their distance vectors. A and B now learn that they can not reach D via the directly connected nodes. However, C learns that it can reach D via either A or B. 4. C now sends its distance vector which contains false information to A and B and the count-to-innity process starts.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 167

Routing Information Protocol (RIP)


The Routing Information Protocol version 2 (RIP-2) dened in RFC 2453 is based on the Bellman-Ford algorithm. RIP denes innity to be 16 hops. Hence, RIP can only be used in networks where the longest paths (the network diameter) is smaller than 16 hops. RIP-2 runs over the User Datagram Protocol (UDP) and uses the well-known port number 520.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 168

RIP Message Format


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Command | Version | must be zero | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | RIP Entries | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Command eld indicates, whether the message is a request or a response. The Version eld contains the protocol version number. The RIP Entries eld contains a list of so called xed size RIP Entries.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 169

RIP Message Format (cont.)


0 1 2 3 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Address Family Identifier | Route Tag | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | IP Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Subnet Mask | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Next Hop | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Metric | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Address Family Identifier eld identies an address family. The Route Tag eld marks entries which contain external routes (established by an EGP).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 170

RIP Message Format (cont.)


0 1 2 3 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | 0xFFFF | Authentication Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + Authentication + | | + + | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Special RIP entry to support authentication. Originially only trivial cleartext password authentication. RFC 2082 denes authentication based on MD5 using a special RIP message trailer.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 171

Dijkstras Shortest Path Algorithm


1. All nodes are initially tentatively labeled with innite costs indicating that the costs are not yet known. 2. The costs of the root node are set to 0 and the root node is marked as the current node. 3. The current nodes cost label is marked permanent. 4. For each node A adjacent to the current node C , the costs for reaching A are calculated as the costs of C plus the costs for the link from C to A. If the sum is less than As cost label, the label is updated with the new cost and the name of the current node.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 172

Dijkstras Shortest Path Algorithm (cont.)


5. If there are still nodes with tentative cost labels, a node with the smallest costs is selected as the new current node. Goto step 3 if a new current node was selected. 6. The shortest paths to a destination node is given by following the labels from the destination node towards the root.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 173

Dijkstra Example
3 3 B 2 2 4 A 4 2 1 G 2 F E 2 C D

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 174

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 175

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 176

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 177

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 178

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 179

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 180

Dijkstra Example (cont.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 181

Open Shortest Path First (OSPF)


The Open Shortest Path First (OSPF) protocol dened in RFC 2328 is a link state routing protocol. Every node independently computes the shortest paths to all the other nodes by using Dijkstras shortest path algorithm. The link state information is distributed by ooding. OSPF introduces the concept of areas in order to control the ooding and computational processes. An OSPF area is a group of a set of networks within an autonomous system. The internal topology of an OSPF area is invisible for other OSPF areas. The routing within an area (intra-area routing) is constrainted to that area.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 182

OSPF Areas

The OSPF areas are inter-connected via the OSPF backbone area (OSPF area 0). A path from a source node within one area to a destination node in another area has three segments (inter-area routing): 1. An intra-area path from the source to a so called area border router. 2. A path in the backbone area from the area border of the source area to the area border router of the destination area. 3. An intra-area path from the area border router of the destination area to the destination node.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 183

OSPF Router Classication

OSPF routers are classied according to their location in the OSPF topology: 1. Internal Router: A router where all interfaces belong to the same OSPF area. 2. Area Border Router: A router which connects multiple OSPF areas. An area border router has to be able to run the basic OSPF algorithm for all areas it is connected to. 3. Backbone Router: A router that has an interface to the backbone area. Every area border router is automatically a backbone router. 4. AS Boundary Router: A router that exchanges routing information with routers belonging to other autonomous systems.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 184

OSPF Stub Areas


Stub Areas are OSPF areas with a single area border router. The routing in stub areas can be simplied by using default forwarding table entries which signicantly reduces the overhead.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 185

OSPF Message Header


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Version # | Type | Packet Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Router ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Area ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Checksum | AuType | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Authentication | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Authentication | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 186

OSPF Hello Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 : : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Network Mask | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | HelloInterval | Options | Rtr Pri | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | RouterDeadInterval | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Designated Router | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Backup Designated Router | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Neighbor | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... |

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 187

OSPF Database Description Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 : : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Interface MTU | Options |0|0|0|0|0|I|M|MS +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | DD sequence number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+ | | +An LSA Header -+ | | +-+ | | +-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... |
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 188

OSPF Link State Request Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 : : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | LS type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Link State ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Advertising Router | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... |

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 189

OSPF Link State Update Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 : : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | # LSAs | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | ++-+ | LSAs | ++-+ | ... |

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 190

OSPF Link State Ack. Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 : : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | +-+ | | +An LSA Header -+ | | +-+ | | +-+ | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | ... |

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 191

OSPF Link State Advertisement Header


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 : : +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | LS age | Options | LS type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Link State ID | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Advertising Router | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | LS sequence number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | LS checksum | length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 192

Border Gateway Protocol (RFC 1771)


The Border Gateway Protocol version 4 (BGP-4) exchanges reachability information between autonomous systems. BGP-4 peers construct AS connectivity graphs to detect and prune routing loops and enforce policy decisions. BGP peers generally advertise only routes that should be seen from the outside (advertising policy). The nal decision which set of announced paths is actually used remains a local policy decision. BGP-4 runs over a reliable transport (TCP) and uses the well-known port 179.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 193

AS Categories (RFC 1772)

Stub AS: An AS that has only a single connection to one other AS. A stub AS only carries local trafc. Multihomed AS: An AS that has connections to more than one other AS, but refuses to carry transit trafc. Transit AS: An AS that has connections to more than one other AS, and is designed (under certain policy restrictions) to carry both transit and local trafc.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 194

Routing Policies

Policies are provided to BGP in the form of conguration information and determined by the AS administration. Examples: 1. A multihomed AS can refuse to act as a transit AS for other ASs. (It does so by only advertising routes to destinations internal to the AS.) 2. A multihomed AS can become a transit AS for a subset of adjacent ASs, i.e., some, but not all, ASs can use the multihomed AS as a transit AS. (It does so by advertising its routing information to this set of ASs.) 3. An AS can favor or disfavor the use of certain ASs for carrying transit trafc from itself. Routing Policy Specication Language (RFC 2622)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 195

BGP Routing Table Statistics

http://bgp.potaroo.net/ See also: G. Huston, "The BGP Routing Table", Internet Protocol Journal, March 2001

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 196

BGP-4 Phases and Messages

Once the transport connection has been established, BGP-4 basically goes through three phases: 1. The BGP4 peers exchange OPEN messages to open and conrm connection parameters 2. The BGP4 peers exchange initially the entire BGP routing table. Incremental updates are sent as the routing tables change. Uses BGP UPDATE messages. 3. The BGP4 peers exchange so called KEEPALIVE messages periodically to ensure that the connection and the BGP-4 peers are alive. Errors lead to a NOTIFICATION message and subsequent close of the transport connection.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 197

BGP-4 Message Header


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + + | Marker | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Length | Type | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

The Marker is used for authentication and synchronization. The Type eld indicates the message type and the Length eld its length.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 198

BGP-4 Open Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+ | Version | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Autonomous System Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Hold Time | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | BGP Identifier | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Opt Parm Len | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | | Optional Parameters | | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 199

BGP-4 Open Message


The Version eld contains the protocol version number. The Autonomous System Number eld contains the 16-bit AS number of the sender. The Hold Time eld species the maximum time that the receiver should wait for a response from the sender. The BGP Identifier eld contains a 32-bit value which uniquely identies the sender. The Opt Parm Len eld contains the total length of the Optional Parameters eld or zero if no optional parameters are present. The Optional Parameters eld contains a list of parameters. Each parameter is encoded using a tag-length-value (TLV) triple.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 200

BGP-4 Update Message


+-----------------------------------------------------+ | Unfeasible Routes Length (2 octets) | +-----------------------------------------------------+ | Withdrawn Routes (variable) | +-----------------------------------------------------+ | Total Path Attribute Length (2 octets) | +-----------------------------------------------------+ | Path Attributes (variable) | +-----------------------------------------------------+ | Network Layer Reachability Information (variable) | +-----------------------------------------------------+

The UPDATE message consists of two parts: 1. The list of unfeasible routes that are being withdrawn. 2. The feasible route to advertise. The Unfeasible Routes Length eld indicates the total length of the Withdrawn Routes eld in bytes.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 201

BGP-4 Update Message


The Withdrawn Routes eld contains a list of IPv4 address prexes that are being withdrawn from service. The Total Path Attribute Length eld indicates the total length of the Path Attributes eld in bytes. The Path Attributes eld contains a list of path attributes conveying information such as the origin of the path information, the sequence of AS path segments, the IPv4 address of the next hop border router, or the local preference assigned by a BGP4 speaker. The Network Layer Reachability Information eld contains a list of IPv4 prexes that are reachable via the path described in the path attributes elds.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 202

BGP-4 Notication Message


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Error code | Error subcode | Data | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

NOTIFICATION messages are used to report errors.

The transport connection is closed immediately after sending a NOTIFICATION. Six error codes plus 20 sub-codes.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 203

BGP-4 Keep Alive Message


BGP-4 peers periodically exchange KEEPALIVE messages. A KEEPALIVE message consists of the standard BGP-4 header with no additional data.
KEEPALIVE messages are needed to verify that shared state information is still present.

If a BGP-4 peer does not receive a message within the Hold Time, then the peer will assume that there is a communication problem and tear down the connection.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 204

6. Internet Transport Layer

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 205

Internet Transport Layer


SMTP HTTP FTP

Transport Layer

IP Address + Port Number

IP Layer

IP Address

Network layer addresses identify interfaces on nodes (node-to-node signicance). Transport layer addresses identify communicating application processes (end-to-end signicance). 16 bit port numbers enable the multiplexing and demultiplexing of packets at the transport layer.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 206

Internet Transport Layer Protocols


The User Datagram Protocol (UDP) provides a simple unreliable best-effort datagram service. The Transmission Control Protocol (TCP) provides a bidirectional, connection-oriented and reliable data stream. The Stream Control Transmission Protocol (SCTP) provides a reliable transport service supporting sequenced delivery of messages within multiple streams, maintaining application protocol message boundaries (application protocol framing). The Datagram Congestion Control Protocol (DCCP) provides a congestion controlled, unreliable ow of datagrams suitable for use by applications such as streaming media.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 207

IPv4 Pseudo Header


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Destination Address | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | unused (0) | Protocol | Length | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Pseudo headers are used during checksum computation. Exclude header elds that are modied by routers.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 208

IPv6 Pseudo Header


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + Source Address + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | | + + | | + Destination Address + | | + + | | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Upper-Layer Packet Length | slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 209 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

User Datagram Protocol (UDP)


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Port | Destination Port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Length | Checksum | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

UDP (RFC 768) provides an unreliable datagram transport service.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 210

User Datagram Protocol (UDP)


The Source Port eld contains the port number used by the sending application layer process. The Destination Port eld contains the port number used by the receiving application layer process. The Length eld contains the length of the UDP datagram including the UDP header counted in bytes. The Checksum eld contains the Internet checksum computed over the pseudo header, the UDP header and the payload contained in the UDP packet.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 211

Transmission Control Protocol (TCP)


0 1 2 3 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Source Port | Destination Port | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Sequence Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Acknowledgment Number | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Offset| Reserved | Flags | Window | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Checksum | Urgent Pointer | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+ | Options | Padding | +-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

TCP (RFC 793) provides a bidirectional connection-oriented and reliable data stream over an unreliable connection-less network protocol.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 212

Transmission Control Protocol (TCP)


The Source Port eld contains the port number used by the sending application layer process. The Destination Port eld contains the port number used by the receiving application layer process. The Sequence Number eld contains the sequence number of the rst data byte in the segment. During connection establishment, this eld is used to establish the initial sequence number. The Acknowledgment Number eld contains the next sequence number which the sender of the acknowledgement expects. The Offset eld contains the length of the TCP header including any options, counted in 32-bit words.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 213

Transmission Control Protocol (TCP)

The Flags eld contains a set of binary ags: URG: Indicates that the Urgent Pointer eld is signicant. ACK: Indicates that the Acknowledgment Number eld is signicant. PSH: Data should be pushed to the application as quickly as possible. RST: Reset of the connection. SYN: Synchronization of sequence numbers. FIN: No more data from the sender.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 214

Transmission Control Protocol (TCP)


The Window eld indicates the number of data bytes which the sender of the segment is willing to receive. The Checksum eld contains the Internet checksum computed over the pseudo header, the TCP header and the data contained in the TCP segment. The Urgent Pointer eld points, relative to the actual segment number, to important data if the URG ag is set. The Options eld can contain additional options.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 215

TCP Connection Establishment


Active Open SYN x SYN x ACK x+1, SYN y ACK x+1, SYN y ACK y+1 ACK y+1 Passive Open

Handshake protocol establishes TCP connection parameters and announces options. Guarantees correct connection establishment, even if TCP packets are lost or duplicated.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 216

TCP Connection Tear-down


Active Open FIN x FIN x ACK x+1 ACK x+1 ACK x+1, FIN y ACK x+1, FIN y ACK y+1 ACK y+1 Passive Open

TCP provides initially a bidirectional data stream. A TCP connection is terminated when both unidirectional connections have been closed. It is possible to close only one half of a connection.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 217

TCP State Machine


. . . see lecture notes . . .

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 218

TCP Flow Control


Sender write(2K) 2K | SEQ = 0 0 ACK = 2048 Win = 2048 Receiver 0 4K 4K

11 00 11 00

write(2K)

2K | SEQ = 2048 0 ACK = 4096 Win = 0

1111 0000 1111 0000 11 00 11 00

4K

0 ACK = 4096 Win = 2048

read(2K) 4K

Both TCP engines advertise their buffer sizes during connection establishment. The available space left in the receiving buffer is advertised as part of the acknowledgements.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 219

Optimizations

Nagles Algorithm When data comes into the sender one byte at a time, just send the rst byte and buffer all the rest until the byte in ight has been acknowledgement. This algorithm provides noticeable improvements especially for interactive trafc where a quickly typing user is connected over a rather slow network. Clarks Algorithm The receiver should not send a window update until it can handle the maximum segment size it advertised when the connection was established or until its buffer is half empty. Prevents the receiver from sending a very small window updates (such as a single byte).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 220

TCP Congestion Control

TCPs congestion control introduces the concept of a congestion window (cwnd) which denes how much data can be in transit. The congestion window is maintained by a TCP sender in addition to the ow control receiver window (rwnd) which is advertised by the receiver. The sender uses these two windows to limit the data that is sent to the network and not yet received (ight size) to the minimum of the receiver and the congestion window:
f lightsize min(cwin, rwin)

The key problem to be solved is the dynamic estimation of the congestion window.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 221

TCP Congestion Control (cont.)

The initial window (IW) is usually initialized using the following formula:
IW = min(4 SM SS, max(2 SM SS, 4380bytes)) SM SS is the sender maximum segment size, the size of the largest sement that the sender can transmit (excluding TCP/IP headers and options).

During slow start, the congestion window cwnd increases by at most SM SS bytes for every received acknowledgement that acknowledges data. Slow start ends when cwnd exceeds ssthresh or when congestion is observed. Note that this algorithm leads to an exponential increase if there are multiple segments acknowledged in the cwnd.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 222

TCP Congestion Control (cont.)

During congestion avoidance, cwnd is incremented by one full-sized segment per round-trip time (RTT). Congestion avoidance continues until congestion is detected. One formula commonly used to update cwnd during congestion avoidance is given by the following equation:
cwnd = cwnd + (SM SS SM SS/cwnd)

This adjustment is executed on every incoming non-duplicate ACK.

When congestion is noticed (the retransmission timer expires), then cwnd is reset to 1 full-sized segment and the slow start threshold ssthresh is updated as follows:
ssthresh = max(f lightsize/2, 2 SM SS)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 223

TCP Congestion Control (cont.)


44 40 36 congestion window (cwin) 32 28 ssthresh 24 20 16 12 8 4 timeout

10

12

14

16

18

20

22

24

26

28

transmission burst number

Congestion control with an initial window size of 2K.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 224

Retransmission Timer

The retransmission timer controls when a segment is resend if no acknowledgement has been received. The retransmission timer RT T needs to adapt to round-trip time changes. General idea: Measure the current round-trip time Measure the variation of the round-trip time Use the estimated round-trip time plus the measured variation to calculate the retransmit timeout Do not update the estimators if a segment needs to be retransmitted (Karns algorithm).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 225

Retransmission Timer

If an acknowledgement is received for a segment before the associated retransmission timer expires:
RT T = RT T + (1 )M M is the measured round-trip time; is typicall 7 . 8

The standard deviation is estimated using:


D = D + (1 )|RT T M | is a smoothing factor.

The retransmission timeout RT O is determined as follows:


RT O = RT T + 4 D

The factor 4 has been choosen empirically.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 226

Fast Retransmit / Fast Recovery


TCP receivers should send an immediate duplicate acknowledgement when an out-of-order segment arrives. The arrival of four identical acknowledgements without the arrival of any other intervening packets is an indication that a segment has been lost. The sender performs a fast retransmission of what appears to be the missing segment, without waiting for the retransmission timer to expire. Upon a fast retransmission, the sender does not exercise the normal congestion reaction with a full slow start since acknowledgements are still owing. See RFC 2581 section 3.1 for details.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 227

Karns Algorithm

The dynamic estimation of the RT T has a problem if a timeout occurs and the segment is retransmitted. A subsequent acknowledgement might acknowledge the receipt of the rst packet which contained that segment or any of the retransmissions. Karn suggested that the RT T estimation is not updated for any segments which were retransmitted and that the RT O is doubled on each failure until the segment gets through. The doubling of the RT O leads to an exponential back-off for each consecutive attempt.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 228

Early Congestion Notication


Idea: Routers signal congestion by setting some special bits in the IP header. The ECN bits are located in the Type-of-Service eld of an IPv4 packet or the Trafc-Class eld of an IPv6 packet. TCP sets the ECN-Echo ag in the header TCP to indicate that a TCP endpoint has received an ECN marked packet. TCP sets the Congestion-Window-Reduced (CWR) ag in the TCP header to acknowledge the receipt of and reaction to the ECN-Echo ag.

= ECN uses the ECT and CE ags in the IP header for signaling between routers and connection endpoints, and uses the ECN-Echo and CWR ags in the TCP header for TCP-endpoint to TCP-endpoint signaling.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 229

TCP Performance

Goal: Simple analytic model for steady state TCP behavior. We only consider congestion avoidance (no slow start). denotes the congestion window size at time t. In steady state, W (t) increases to a maximum value W where it experiences congestion. As a reaction, the sender 1 sets the congestion window to 2 W .
1 The time interval needed to go from 2 W to W is T and we can send a window size of packets every RT T .

W (t)

Hence, the number N of packets is:


1 T N= 2 RT T W +W 2
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 230

TCP Performance (cont.)


The time T between two packet losses equals T = RT T W/2 since the window increases linearly. By equating the total number of packets transferred during this period, we get
W 4 W +W 2 1 = W = p 8 3p

where p is the packet loss probability (thus 1 packets are p transmitted between each packet loss). The average sending rate X(p), that is the number of packets transmitted during each period, then becomes:
X(p) = 1 1/p = RT T W/2 RT T 3 2p
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 231

TCP Performance (cont.)

Example: RT T = 200ms, p = 0.05 X(p) 27.4pps With 1500 byte segments, the data rate becomes 32Kbps Given this formula, it becomes clear that high data rates (say 10Gbps or 1 Tbps) require an extremely and unrealistic small packet error rate. TCP extensions to address this problem can be found in RFC 3649 [6].

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 232

References
[1] J. Postel. User Datagram Protocol. RFC 768, ISI, August 1980. [2] J. Postel. Transmission Control Protocol. RFC 793, ISI, September 1981. [3] M. Allman, V. Paxson, and W. Stevens. TCP Congestion Control. RFC 2581, NASA Glenn/Sterling Software, ACIRI/ICSI, April 1999. [4] K. Ramakrishnan, S. Floyd, and D. Black. The Addition of Explicit Congestion Notication (ECN) to IP. RFC 3168, TeraOptic Networks, ACIRI, EMC, September 2001. [5] M. Hassan and R. Jain. High Performance TCP/IP Networking. Prentice Hall, 2004. [6] S. Floyd. HighSpeed TCP for Large Congestion Windows. RFC 3649, ICSI, December 2003.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 233

7. Firewalls and Network Address Translators

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 234

Middleboxes

A middlebox is dened in [1] as any intermediary device performing functions other than the normal, standard functions of an IP router on the datagram path between a source host and destination host. A middlebox is not necessarily a physical box it is typically just a function implemented in some other box. Middleboxes challenge the End-to-End principle and the hourglass model of the Internet architecture.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 235

Concerns about Middleboxes

Protocols designed without consideration of middleboxes may fail, predictably or unpredictably, in the presence of middleboxes. Middleboxes introduce new failure modes; rerouting of IP packets around crashed routers is no longer the only case to consider. Conguration is no longer limited to the two ends of a session; middleboxes may also require conguration and management. Diagnosis of failures and miscongurations is more complex.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 236

Types of Middleboxes

Network Address Translators (NAT): A function that dynamically assigns a globally unique address to a host that doesnt have one, without that hosts knowledge. NAT with Protocol Translator (NAT-PT): A that performs NAT between an IPv6 host and an IPv4 network, additionally translating the entire IP header between IPv6 and IPv4 formats. IP Tunnel Endpoints: Tunnel endpoints, including virtual private network endpoints, use basic IP services to set up tunnels with their peer tunnel endpoints which might be anywhere in the Internet.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 237

Types of Middleboxes (cont.)

Packet classiers, markers and schedulers: Packet classiers classify packets owing through them according to policy and either select them for special treatment or mark them, in particular for differentiated services. Transport Relays: A middlebox which translates between two transport layer instances. TCP performance enhancing proxies: TCP spoofer are middleboxes that modify the timing or action of the TCP protocol in ight for the purposes of enhancing performance. Load balancers that divert/munge packets: Techniques that divert packets from their intended IP destination, or make that destination ambiguous.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 238

Types of Middleboxes (cont.)


IP Firewalls: A function that screens and rejects packets based purely on elds in the IP and Transport headers. Application Firewalls: Application-level rewalls act as a protocol end point and relay Application-level gateways (ALGs): ALGs translate IP addresses in application layer protocols and typically complement IP rewalls. SOCKS: A stateful mechanism for authenticated rewall traversal, in which the client host must communicate rst with the SOCKS server in the rewall before it is able to traverse the rewall.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 239

Types of Middleboxes (cont.)


Transcoders: Functions performing some type of on-the-y conversion of application level data. Proxies: An intermediary program which acts as both a server and a client for the purpose of making requests on behalf of other clients. Caches: Caches are functions typically intended to optimise response times. Anonymisers: Functions that hide the IP address of the data sender or receiver. Although the implementation may be distinct, this is in practice very similar to a NAT plus ALG. ...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 240

Firewalls

A rewall is a system (or a set of systems) that enforce access control policies between two (or more) networks. Conservative rewalls allow known desired trafc and reject everything else. Optimistic rewalls reject known unwanted trafc and allow the rest. Firewalls typically consist of packet lters, transport gateways and application level gateways. Firewalls not only protect the inside from the outside, but also the outside from the inside.

There are many ways to circumvent rewalls if internal and external hosts cooperate.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 241

Firewall Architectures
External Network (Internet) Packet Filter Internal Network (Intranet)

The simplest architecture is a packet lter which is typically implemented within a router that connects the internal network with the external network. Sometimes called a screening router.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 242

Firewall Architectures
External Network (Internet) Bastion Host Internal Network (Intranet)

A bastion host is a multihomed host connected to the internal and external network which does not forward IP datagrams but instead provides suitable gateways. Effectively prevents any direct communication between hosts on the internal network with hosts on the external network.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 243

Firewall Architectures

Server (HTTP, SMTP, DNS, ...) Bastion Host Internal Network (Intranet) Packet Filter Packet Filter Demilitarized Zone (DMZ)

External Network (Internet)

The most common architecture consists of two packet lters which create a demilitarized zone (DMZ). Externally visible servers and gateways are located in the DMZ.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 244

Packetlter Example: Linux ipchains


local process

forwarding

forward chains

input chains

output chains

checksum & sanity checks

Input chains are lists of lter rules applied to all incoming IP packets. Output chains are lists of lter rules applied to all outgoing IP packet. Forward chains are lists of ter rules applied to all forwarded IP packets.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 245

Network Address Translators


Basic Network Address Translation (NAT): Translates private IP addresses into public IP addresses. Network Address Port Translation (NAPT): Translates transport endpoint identiers. NAPT allows to share a single public address among many private addresses (masquerading). Bi-directional NAT (Two-Way NAT): Translates outbound and inbound and uses DNS-ALGs to facilitate bi-directional name to address mappings. Twice NAT : A variation of a NAT which modies both the source and destination addresses of a datagram. Used to join overlapping address domains.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 246

Full Cone NAT

A full cone NAT is one where all requests from the same internal IP address and port are mapped to the same external IP address and port. Any external host can send a packet to the internal host, by sending a packet to the mapped external address.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 247

Restricted Cone NAT

A restricted cone NAT is one where all requests from the same internal IP address and port are mapped to the same external IP address and port. Unlike a full cone NAT, an external host (with IP address X) can send a packet to the internal host only if the internal host had previously sent a packet to IP address X.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 248

Port Restricted Cone NAT


A port restricted cone NAT is like a restricted cone NAT, but the restriction includes port numbers. Specically, an external host can send a packet, with source IP address X and source port P, to the internal host only if the internal host had previously sent a packet to IP address X and port P.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 249

Symmetric NAT

A symmetric NAT is one where all requests from the same internal IP address and port, to a specic destination IP address and port, are mapped to the same external IP address and port. If the same host sends a packet with the same source address and port, but to a different destination, a different mapping is used. Only the external host that receives a packet can send a UDP packet back to the internal host.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 250

References
[1] B. Carpenter and S. Brim. Middleboxes: Taxonomy and Issues. RFC 3234, IBM Zurich Research Laboratory, February 2002. [2] P. Srisuresh and M. Holdrege. IP Network Address Translator (NAT) Terminology and Considerations. RFC 2663, Lucent Technologies, August 1999. [3] B. Carpenter. Internet Transparency. RFC 2775, IBM, February 2000. [4] N. Freed. Behavior of and Requirements for Internet Firewalls. RFC 2979, Sun, October 2000. [5] J. Rosenberg, J. Weinberger, C. Huitema, and R. Mahy. STUN - Simple Traversal of User Datagram Protocol (UDP) Through Network Address Translators (NATs). RFC 3489, dynamicsoft, Microsoft, Cisco, March 2003. [6] E. D. Zwicky, S. Cooper, and D. B. Chapman. Building Internet Firewalls. OReilly, 2 edition, 2000.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 251

8. Abstract Syntax Notation One (ASN.1)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 252

Motivation and Origin

The Abstract Syntax Notation One (ASN.1) is a language for the denition of data structures and message formats which was developed in the 1980s. Original ASN.1 requirements: Exchange of information between machines with different hardware architectures. Independence of existing programming languages (neutrality). Sender and receiver should be able to choose one out of multiple data encoding formats that suits their needs. ASN.1 has been standardized by the ITU. Note that current versions of ASN.1 differ signicantly from earlier versions of ASN.1.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 253

Abstract vs. Local vs. Transfer Syntax


Abstract Syntax (ASN.1)

Local Syntax

Local Syntax

De/Encoder

Data Exchange

De/Encoder

System A

System B

Transfer Syntax (ASN.1 Encoding Rules)

Separation of the data representation during transmission from the data representation within applications.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 254

Abstract vs. Local vs. Transfer Syntax


The abstract syntax denes data structures in an application implementation neutral format. The abstract syntax is mapped to a local syntax which is used in a concrete implementation. ASN.1 compilers can be used to generate the local syntax from the abstract syntax. The transfer syntax denes how data structures are serialized for transmission over a network. A concrete transfer syntax is commonly dened as a set of encoding rules which dene how values of the various ASN.1 types are encoded. The implementation of the encoding and decoding functions can be automated by using ASN.1 compilers.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 255

Basic Concepts

ASN.1 denitions basically consist of type and value denitions that are organized into ASN.1 modules. Names of ASN.1 data types always begin with an uppercase character. Names of ASN.1 values (constants) always begin with a lowercase character. ASN.1 keywords and macro names only contain uppercase characters. Comments begin with two hyphens (--) and they end either at the end of the line or at the next occurance of two hyphens (--).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 256

ISO/OSI Registration Tree

itut(0) / ccitt(0)

iso(1)

jointisoitut(2) / jointisoccitt(2)

standard(0)

registrationauthority(1)

memberbody(2)

identifiedorganization(3)

dod(6)

internet(1)

directory(1)

mgmt(2)

experimental(3)

private(4)

security(5)

snmpV2(6)

mib2(1)

snmpDomains(1)

snmpProxy(2)

snmpModules(3)

system(1)

interfaces(2)

ip(4)

icmp(5)

tcp(6)

udp(7)

transmission(10)

snmp(11) ...

snmpMIB(1)

snmpFrameworkMIB(10)

...

x25(5)

dot3(7) dot5(9) fddi(15) lapb(16)

...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 257

Registration Tree

The registration tree is used to uniquely identify artefacts such as specic protocol denitions or protocol parameters. The registration tree provides a hierarchical name space that can be used for this purpose. The hierarchical structure makes it is possible to delegate authority. For example, the US Department of Defense (dod) has delegated authority over the internet subtree to the Internet Assigned Number Authority (IANA). Note that nodes are uniquely identied by the assigned numbers and not necessarily by their associated descriptors.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 258

Primitive ASN.1 Data Types


The data type BOOLEAN represents the two logical values TRUE and FALSE. The data type INTEGER represents integral numbers. Note that there is no restriction on the precision. The INTEGER type can also be used to represent named numbers. The data type BIT STRING represents a sequence of dened bits. The length of a BIT STRING does not have to be a multiple of 8. The data type OCTET STRING represents a sequence of octets (bytes). The OCTET STRING is a base type for character strings using different character sets or some other formatted strings such as GeneralizedTime or UTCTime.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 259

Primitive ASN.1 Data Types

The data type ENUMERATED represents an enumeration of values. The set of possible values must be dened when deriving a type from the ENUMERATED type. The data type OBJECT IDENTIFIER identies a node in the ISO registration tree. An OBJECT IDENTIFIER value determines the path from the root of the registration tree to a target node. The data type ObjectDescriptor contains a character string identifying a node in the ISO registration tree. Note that the value of an ObjectDescriptor is not guaranteed to uniquely identify a node in the ISO registration tree.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 260

Primitive ASN.1 Data Types


The data type ANY represents any valid ASN.1 data type. It is basically the union of all ASN.1 data types. The data type EXTERNAL represents data structures which which have not been dened with ASN.1. This type is useful to incorporate non-ASN.1 elements in ASN.1 messages. The data type NULL is a place holder, typically used to indicate that a value is missing in a constructed type.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 261

Constructed ASN.1 Data Types

The data type SEQUENCE corresponds to structs in C or records in Modula. The order of the elements of a SEQUENCE is well dened. The data type SET is similar to a SEQUENCE. However, the elements of a SET are not ordered. The data type SEQUENCE OF represents an ordered set (list) of values (which have the same type). The data type SET OF represents an unordered set (list) of values (which have the same type). The data type CHOICE is a selection type and corresponds to unions in C or variant records in Modula. The data type REAL is a constructed type representing oating point numbers. The mantissa and the exponent are INTEGER values.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 262

Restrictions
Unsigned32 Integer32 Unsigned64 MacAddress InetAddress

::= ::= ::= ::= ::=

INTEGER (0..4294967295) INTEGER (-2147483648..2147483647) INTEGER (0..18446744073709551615) OCTET STRING (SIZE (6)) OCTET STRING (SIZE (4|16))

ASN.1 data types can be restricted to reduce the number of possible values. Typical restrictions are size restrictions and value range restrictions. The precise syntax for the restrictions depends on the data type. It is usually a good idea to introduce restricted INTEGER types that match the precision supported by typical processor hardware.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 263

Tags

Data types have associated tags which can be used to identify the type of a value during transmission. A tag consists of a tag number and a tag class. There are four different tag classes: UNIVERSAL: Globally unique identication (tag assignments restricted to ASN.1 standards). APPLICATION: Unique identication within an application. PRIVATE: Private and not universally used identication. CONTEXT-SPECIFIC: Unique identication within a certain context (for example within a CHOICE).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 264

Universal Tag Numbers


Data type
BOOLEAN INTEGER BIT STRING OCTET STRING NULL OBJECT IDENTIFIER ObjectDescriptor EXTERNAL REAL ENUMERATED ...

Tag (decimal) 1 2 3 4 5 6 7 8 9 10 ...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 265

Universal Tag Numbers


SEQUENCE, SEQUENCE OF 16 SET, SET OF 17 NumericString 18 PrintableString 19 TeletexString 20 VideotextString 21 IA5String 22 UTCTime 23 GeneralizedType 24 GraphicsString 25 VisibleString 26 GeneralString 27 CharacterString 28
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 266

ASN.1 Example
SNMP-VERSION-1 DEFINITIONS ::= BEGIN -- object names and types ObjectName ::= OBJECT IDENTIFIER ObjectSyntax ::= CHOICE { simple SimpleSyntax, application-wide ApplicationSyntax } SimpleSyntax ::= CHOICE { integer-value INTEGER (-2147483648..2147483647), string-value OCTET STRING (SIZE (0..65535)), oid-value OBJECT IDENTIFIER, empty NULL }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 267

ASN.1 Example
ApplicationSyntax ::= CHOICE { address-value NetworkAddress, counter-value Counter32, gauge-value Gauge32, timeticks-value TimeTicks, arbitrary-value Opaque } NetworkAddress ::= CHOICE { internet IpAddress } IpAddress Counter32 Gauge32 TimeTicks Opaque ::= ::= ::= ::= ::= [APPLICATION [APPLICATION [APPLICATION [APPLICATION [APPLICATION 0] 1] 2] 3] 4] IMPLICIT IMPLICIT IMPLICIT IMPLICIT IMPLICIT OCTET STRING (SIZE (4)) INTEGER (0..4294967295) INTEGER (0..4294967295) INTEGER (0..4294967295) OCTET STRING

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 268

ASN.1 Example
Message ::= SEQUENCE { version INTEGER { version-1(0) }, community OCTET STRING, data ANY -- PDUs if trivial authentication } PDUs ::= CHOICE { get-request get-next-request get-response set-request trap } GetRequest-PDU GetNextRequest-PDU GetResponse-PDU SetRequest-PDU Trap-PDU

GetRequest-PDU, GetNextRequest-PDU, GetResponse-PDU, SetRequest-PDU, Trap-PDU

::= ::= ::= ::= ::=

[0] [1] [2] [3] [4]

IMPLICIT IMPLICIT IMPLICIT IMPLICIT IMPLICIT

PDU PDU PDU PDU TrapPDU


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 269

ASN.1 Example
max-bindings INTEGER ::= 2147483647 RequestID ErrorIndex ::= INTEGER (-214783648..214783647) ::= INTEGER (0..max-bindings)

ErrorStatus ::= INTEGER { noError(0), tooBig(1), noSuchName(2), badValue(3), readOnly(4), genErr(5) } VarBind ::= SEQUENCE { name ObjectName, value ObjectSyntax } VarBindList ::= SEQUENCE (SIZE (0..max-bindings)) OF VarBind
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 270

ASN.1 Example
PDU ::= SEQUENCE { request-id error-status error-index variable-bindings } RequestID, ErrorStatus, ErrorIndex, VarBindList

TrapPDU ::= SEQUENCE { enterprise OBJECT IDENTIFIER, agent-addr NetworkAddress, generic-trap INTEGER { coldStart(0), warmStart(1), linkDown(2), linkUp(3), authenticationFailure(4), egpNeighborLoss(5), enterpriseSpecific(6) }, specific-trap INTEGER (0..214783647), time-stamp TimeTicks, variable-bindings VarBindList } END
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 271

Basic Encoding Rules (BER)


The Basic Encoding Rules (BER) use a tag/length/value (TLV) encoding. Every data element is identied by a tag value, the length of the data element in bytes and the data element itself. The TLV encoding enables a receiver to identify the type of every data element which can then be veried against the type the receiver expects. The price for this is a somewhat increased length of the encoded messages. There are other ASN.1 encodings like the Packed Encoding Rules (PER) which do not generally encode type information.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 272

BER Encoding of Tags


8 7 6 5 4 3 2 1

tag number (< 31) tag type { primitive(0), constructed(1) } tag class { universal (00), application(01), context(10), private(11) }

8 7 6 5 4 3 2 1 1 1 1 1 1

8 7 6 5 4 3 2 1 1 ...

8 7 6 5 4 3 2 1 1

8 7 6 5 4 3 2 1 0

tag number (> 31) tag type { primitive(0), constructed(1) } tag class { universal (00), application(01), context(10), private(11) }

If the tag number is less than 31, the tag number can be encoded in a single byte. Otherwise, the highest bit indicates whether more tag bytes follow. The primitive / constructed allows a decoder to determine whether the value itself is a BER encoding.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 273

BER Encoding of Length


8 7 6 5 4 3 2 1 0

length (< 128) 8 7 6 5 4 3 2 1 1 8 7 6 5 4 3 2 1 ... 8 7 6 5 4 3 2 1

length (> 127) number of bytes encoding length

A length less than 128 bytes can be encoded in a single length byte. Larger length values must be encoded in multiple bytes. The maximum length is 2(1278) 1 = 21016 1 bytes. Besides the denite form length encoding, there is an indenite form where the end of a value is identied by a certain marker.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 274

BER Encoding of BOOLEAN and INTEGER

BOOLEAN value is encoded by a single byte. The value TRUE is represented by the byte 0xff and the value FALSE is represented by the byte 0x00.

An INTEGER value is encoded as a binary number. Negative numbers are encoded in twos complement notation Note that it might be necessary to encode a leading 0x00 byte for unsigned values. For example, given the following ASN.1 type denition
Unsigned8 ::= INTEGER (0..255)

the decimal value 255 will be encoded using two bytes as 0x00ff.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 275

BER Encoding of BIT STRING


8 7 6 5 4 3 2 1 8 7 6 5 4 3 2 1
b0 b1 b2 b3 b4 b5 b6 b7

8 7 6 5 4 3 2 1 ...

bits unused bits in last byte

A BIT STRING value is encoded in a sequence of bytes that contain the named bits. The byte sequence is prexed with a byte which indicates the number of unused bits in the last byte of the byte sequence.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 276

BER Encoding of OCTET STRING and NULL


An OCTET STRING value is encoded as a sequence of bytes. Specialized OCTET STRING values are identied by their specic tag value, but otherwise encoded just as OCTET STRING values. A NULL value is encoded using zero bytes. A NULL is thus encoded by its tag and the length zero.
NULL values with special tags may be used to indicate certain ags or parameters.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 277

BER Encoding of OBJECT IDENTIFIER


8 7 6 5 4 3 2 1 0

subidentifier (< 128) 8 7 6 5 4 3 2 1 1 ... 8 7 6 5 4 3 2 1 1 8 7 6 5 4 3 2 1 0

subidentifier (> 127)

An OBJECT IDENTIFIER value is encoded as a sequence of sub-identiers. The rst two sub-identier X and Y of an OBJECT IDENTIFIER value are encoded together in a single sub-identier using the formula (X 40) + Y . (This works because X may only hold one of the three values 0, 1 or 2.)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 278

BER Encoding of Constructed Types

SEQUENCE, SEQUENCE OF, SET, and SET OF values are encoded by encoding the tag of the constructed type, the length of the constructed type, and the elements contained in these constructed types. CHOICE values are encoded by encoding the value currently present in the CHOICE.

This requires that the choice value present can be identier by its tag value.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 279

BER Example
Bytes Tag Length Value 27 1 6 14 4 1 1 0

30:1b SEQUENCE { 02:01:00 INTEGER 04:06:70:75:62:6C:69:63 OCTET STRING a1:0e GetNextRequest-PDU { 02:04:36:a2:8f:07 INTEGER 02:01:00 INTEGER 02:01:00 INTEGER 30:00 SEQUENCE OF {} } }

0 "public" 916623111 0 0

The example shows the BER encoding of an SNMP message which is consistent with the ASN.1 denition shown above.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 280

References
[1] ITU. Information technology - Abstract Syntax Notation One (ASN.1): Specication of basic notation. Recommendation ITU-T X.680, International Telecommunication Union, December 1997. [2] ITU. Information technology - ASN.1 encoding rules: Specication of Basic Encoding Rules (BER), Canonical Encoding Rules (CER) and Distinguished Encoding Rules (DER). Recommendation ITU-T X.690, International Telecommunication Union, December 1997. [3] D. Steedman. Abstract Syntax Notation One (ASN.1): The Tutorial and Reference. Technology Appraisals, 1990. [4] G. Neufeld and S. Vuong. An overview of ASN.1. Computer Networks and ISDN Systems, 23:393415, 1992.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 281

9. External Data Representation (XDR)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 282

External Data Representation (XDR)


Data structures are dened using the XDR data denition language, specied in RFC 1832 The actual data encoding on the wire uses the XDR encoding format as specied in RFC 1832 XDR supports the following base data types: integer, unsigned integer, enum, bool hyper integer, unsigned hyper integer float, double, quadrupel xed-length opaque, variable length opaque variable-length string xed-length arrays, variable-length arrays struct, discriminated union
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 283

XDR Language

The XDR language uses a C-style syntax. Below is the syntax summary taken from RFC 1832:
declaration: type-specifier identifier | type-specifier identifier "[" value "]" | type-specifier identifier "<" [ value ] ">" | "opaque" identifier "[" value "]" | "opaque" identifier "<" [ value ] ">" | "string" identifier "<" [ value ] ">" | type-specifier "*" identifier | "void" value: constant | identifier

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 284

XDR Language
type-specifier: [ "unsigned" ] "int" | [ "unsigned" ] "hyper" | "float" | "double" | "quadruple" | "bool" | enum-type-spec | struct-type-spec | union-type-spec | identifier enum-type-spec: "enum" enum-body enum-body: "{" ( identifier "=" value ) ( "," identifier "=" value )* "}"
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 285

XDR Language
struct-type-spec: "struct" struct-body struct-body: "{" ( declaration ";" ) ( declaration ";" )* "}" union-type-spec: "union" union-body union-body: "switch" "(" declaration ")" "{" ( "case" value ":" declaration ";" ) ( "case" value ":" declaration ";" )* [ "default" ":" declaration ";" ] "}"

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 286

XDR Language
constant-def: "const" identifier "=" constant ";" type-def: "typedef" declaration ";" | "enum" identifier enum-body ";" | "struct" identifier struct-body ";" | "union" identifier union-body ";" definition: type-def | constant-def specification: definition *

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 287

XDR Encoding

All data items are represented by a multiple of 32 bits Padding is used in cases where data items do not t 32 bit boundaries naturally No type information in the encoding Implicit length wherever possible Requires big-endian byte order

= Very efcient to encode/decode!

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 288

XDR Integers, Bools and Enums

Syntax
[unsigned] int identifier; /* [un]signed integer, 32 bit */ [unsigned] hyper identifier; /* [un]signed integer, 64 bit */ enum { name-identifier = constant, ... } identifier; bool identifier;

Encoding
(MSB) (LSB) +-------+-------+-------+-------+ |byte 0 |byte 1 |byte 2 |byte 3 | SIGNED / UNSIGNED INTEGER +-------+-------+-------+-------+ <------------32 bits------------> (MSB) (LSB) +-------+-------+-------+-------+-------+-------+-------+-------+ |byte 0 |byte 1 |byte 2 |byte 3 |byte 4 |byte 5 |byte 6 |byte 7 | +-------+-------+-------+-------+-------+-------+-------+-------+ <----------------------------64 bits----------------------------> HYPER INTEGER / UNSIGNED HYPER INTEGER
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 289

XDR Floating-Point

Syntax
float identifier;

Encoding
+-------+-------+-------+-------+ |byte 0 |byte 1 |byte 2 |byte 3 | S| E | F | +-------+-------+-------+-------+ 1|<- 8 ->|<-------23 bits------>| <------------32 bits------------> SINGLE-PRECISION FLOATING-POINT NUMBER

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 290

XDR Double Floating-Point


Syntax
double identifier;

Encoding
+------+------+------+------+------+------+------+------+ |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5|byte 6|byte 7| S| E | F | +------+------+------+------+------+------+------+------+ 1|<--11-->|<-----------------52 bits------------------->| <-----------------------64 bits-------------------------> DOUBLE-PRECISION FLOATING-POINT

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 291

XDR Quadruple Floating-Point


Syntax
quadruple identifier;

Encoding
+------+------+------+------+------+------+-...--+------+ |byte 0|byte 1|byte 2|byte 3|byte 4|byte 5| ... |byte15| S| E | F | +------+------+------+------+------+------+-...--+------+ 1|<----15---->|<-------------112 bits------------------>| <-----------------------128 bits------------------------> QUADRUPLE-PRECISION FLOATING-POINT

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 292

XDR Fixed-Length Opaque


Syntax
opaque identifier[n]; /* opaque with fixed size n */

Encoding
0 1 ... +--------+--------+...+--------+--------+...+--------+ | byte 0 | byte 1 |...|byte n-1| 0 |...| 0 | +--------+--------+...+--------+--------+...+--------+ |<-----------n bytes---------->|<------r bytes------>| |<-----------n+r (where (n+r) mod 4 = 0)------------>| FIXED-LENGTH OPAQUE

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 293

XDR Variable-Length Opaque

Syntax
opaque identifier<m>; opaque identifier<>; /* opaque with max. size m */ /* opaque with max. size (2**32) */

Encoding
0 1 2 3 4 5 ... +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+ | length n |byte0|byte1|...| n-1 | 0 |...| 0 | +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+ |<-------4 bytes------->|<------n bytes------>|<---r bytes--->| |<----n+r (where (n+r) mod 4 = 0)---->| VARIABLE-LENGTH OPAQUE

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 294

XDR String

Syntax
string identifier<m>; string identifier<>; /* string with max. size m */ /* string with max. size (2**32) */

Encoding
0 1 2 3 4 5 ... +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+ | length n |byte0|byte1|...| n-1 | 0 |...| 0 | +-----+-----+-----+-----+-----+-----+...+-----+-----+...+-----+ |<-------4 bytes------->|<------n bytes------>|<---r bytes--->| |<----n+r (where (n+r) mod 4 = 0)---->| STRING

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 295

XDR Fixed-Length Array


Syntax
type-name identifier[n]; /* array with n element */

Encoding
+---+---+---+---+---+---+---+---+...+---+---+---+---+ | element 0 | element 1 |...| element n-1 | +---+---+---+---+---+---+---+---+...+---+---+---+---+ |<--------------------n elements------------------->| FIXED-LENGTH ARRAY

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 296

XDR Variable-Length Array

Syntax
type-name identifier<m>; type-name identifier<>; /* array with max. m elements */ /* array with max. (2**32) elements */

Encoding
0 1 2 3 +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+ | n | element 0 | element 1 |...|element n-1| +--+--+--+--+--+--+--+--+--+--+--+--+...+--+--+--+--+ |<-4 bytes->|<--------------n elements------------->| VARIABLE-LENGTH ARRAY

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 297

XDR Structures

Syntax
struct { component-declaration-A; component-declaration-B; ... } identifier;

Encoding
+-------------+-------------+... | component A | component B |... +-------------+-------------+... STRUCTURE

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 298

XDR Discriminated Unions

Syntax
union switch (discriminant-declaration) { case discriminant-value-A: arm-declaration-A; case discriminant-value-B: arm-declaration-B; ... default: default-declaration; } identifier;

Encoding
0 1 2 3 +---+---+---+---+---+---+---+---+ | discriminant | implied arm | +---+---+---+---+---+---+---+---+ |<---4 bytes--->| DISCRIMINATED UNION

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 299

XDR Implementation

Sun provided an API which offers conversion functions for other C data types in addition to the XDR data types An XDR handle represents the XDR encoding/decoding buffer and determines the operation to carry out

bool_t bool_t bool_t bool_t bool_t /* ... bool_t bool_t bool_t bool_t

xdr_void(void); xdr_short(XDR *xdrs, short *sp); xdr_u_short(XDR *xdrs, u_short *sp); xdr_int(XDR *xdrs, int *sp); xdr_u_int(XDR *xdrs, u_int *sp); */ xdr_bytes(XDR *xdrs, char **cpp, u_int *sizep, u_int maxsize); xdr_opaque(XDR *xdrs, caddr_t cp, u_int cnt); xdr_string(XDR *xdrs, char **cpp, u_int maxsize); xdr_union(XDR *__xdrs, enum_t *dscmp, char *unp, const struct xdr_discrim *choices, xdrproc_t dfault); /* ... */
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 300

References
[1] R. Srinivasan. XDR: External Data Representation Standard. RFC 1832, Sun Microsystems, August 1995.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 301

10. Augmented Backus Naur Form (ABNF)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 302

ABNF Basics

The Augmented Backus Naur Form (ABNF) dened in RFC 4234 can be used to formally specify textual protocol messages. An ABNF denition consists of a set of rules (sometimes also called productions). Every rule has a name followed by an assignment operator followed by an expression consisting of terminal symbols and operators. The end of a rule is marked by the end of the line or by a comment. Comments start with the comment symbol ; (semicolon) and continue to the end of the line. ABNF does not dene a module concept or an import/export mechanism.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 303

Rule Names and Terminal Symbols

The name of a rule must start with an alphabetic character followed by a combination of alphabetics, digits and hyphens. The case of a rule name is not signicant. Terminal symbols are non-negative numbers. The basis of these numbers can be binary (b), decimal (d) or hexadecimal (x). Multiple values can be concatenated by using the dot . as a value concatenation operator. It is also possible to dene ranges of consecutive values by using the hyphen as a value range operator. Terminal symbols can also be dened by using literal text strings containing US ASCII characters enclosed in double quotes. Note that these literal text strings are case-insensitive.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 304

Simple ABNF Examples


CR CRLF = %d13 = %d13.10 ; ASCII carriage return code in decimal ; ASCII carriage return and linefeed code sequence ; ASCCI digits (0 - 9) ; ASCII string "aba" or "ABA" or "Aba" or ... ; ASCII string "abba"

DIGIT = %x30-39 ABA abba = "aba" = %x61.62.62.61

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 305

ABFN Operators

Concatenation Concatenation operator symbol is the empty word Example: abba = %x61 %x62 %x62 %x61 Alternatives Alternatives operator symbol is the forward slash / Example: aorb = %x61 / %x62 Incremental alternatives assignment operator =/ can be used for long lists of alternatives Grouping Expressions can be grouped using parenthesis A grouped expression is treated as a single element Example: abba = %x61 (%x62 %x62) %x61
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 306

ABFN Operators

Repetitions The repetitions operator has the format n*m where n and m are optional decimal values The value of n indicates the minimum number of repetitions (defaults to 0 if not present) The value m indicates the maximum number of repetitions (defaults to innity if not present) The format * indicates 0 or more repetitions Example: abba = %x61 2 %x62 %x61 Optional Square brackets enclose an optional element Example: [ab] ; equivalent to *1(ab)
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 307

ABNF Core Denitions


ALPHA BIT CHAR CR CRLF CTL DIGIT DQUOTE HEXDIG HTAB LF LWSP OCTET SP VCHAR WSP = = = = = = = = = = = = = = = = %x41-5A / %x61-7A ; A-Z / a-z "0" / "1" %x01-7F ; any 7-bit US-ASCII character, excluding %x0D ; carriage return CR LF ; Internet standard newline %x00-1F / %x7F ; controls %x30-39 ; 0-9 %x22 ; " (Double Quote) DIGIT / "A" / "B" / "C" / "D" / "E" / "F" %x09 ; horizontal tab %x0A ; linefeed ; linear white space (past newline) *(WSP / CRLF WSP) %x00-FF ; 8 bits of data %x20 ; space %x21-7E ; visible (printing) characters SP / HTAB ; White space

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 308

ABNF in ABNF
rulelist rule = = 1*( rule / (*c-wsp c-nl) ) rulename defined-as elements c-nl ; continues if next line starts with white space ALPHA *(ALPHA / DIGIT / "-") *c-wsp ("=" / "=/") *c-wsp ; basic rules definition and incremental alternatives alternation *c-wsp WSP / (c-nl WSP) comment / CRLF ; comment or newline ";" *(WSP / VCHAR) CRLF

rulename defined-as

= =

elements c-wsp c-nl

= = =

comment

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 309

ABNF in ABNF
alternation = concatenation *(*c-wsp "/" *c-wsp concatenation) repetition *(1*c-wsp repetition) [repeat] element 1*DIGIT / (*DIGIT "*" *DIGIT) rulename / group / option / char-val / num-val / prose-val "(" *c-wsp alternation *c-wsp ")" "[" *c-wsp alternation *c-wsp "]"

concatenation repetition repeat element

= = = =

group option

= =

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 310

ABNF in ABNF
char-val = DQUOTE *(%x20-21 / %x23-7E) DQUOTE ; quoted string of SP and VCHAR without DQUOTE "%" (bin-val / dec-val / hex-val) "b" 1*BIT [ 1*("." 1*BIT) / ("-" 1*BIT) ] ; series of concatenated bit values ; or single ONEOF range "d" 1*DIGIT [ 1*("." 1*DIGIT) / ("-" 1*DIGIT) ] "x" 1*HEXDIG [ 1*("." 1*HEXDIG) / ("-" 1*HEXDIG) ] "<" *(%x20-3D / %x3F-7E) ">" ; bracketed string of SP and VCHAR without angles ; prose description, to be used as last resort

num-val bin-val

= =

dec-val hex-val prose-val

= = =

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 311

References
[1] D. Crocker and P. Overell. Augmented BNF for Syntax Specications: ABNF. RFC 4234, Brandenburg InternetWorking, THUS plc., October 2005.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 312

11. Domain Name System (DNS)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 313

Domain Name System (DNS)


country toplevel domains generic toplevel domains virtual root

ru

nl

de

edu

org

net

com

arpa

toplevel

iubremen

2nd level

faculty

3rd level

www

4th level

The Domain Name System (DNS) specied in RFC 1034 and RFC 1035 provides a global infrastructure to map human friendly domain names into IP addresses and vice versa. Critical resource since the Internet largely depends on name resolution services provided by the DNS.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 314

Resolver and Name Resolution


Cache

API

Encompassing Name Server DNS Primary Name Server

RR DB

Application

Resolver

DNS

RR DB

Cache

Cache

Client Machine

DNS Server Machines

The resolver is typically tightly integrated into the operating system (or more precisely standard libraries).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 315

DNS Characteristics

Hierarchical name space with a virtual root. Administration of the name space can be delegated along the path starting from the virtual root. A DNS server knows a part (a zone) of the global name space and its position within the global name space. Name resolution queries can in principle be sent to arbitrary DNS servers. However, it is good practice to use a local DNS server as the primary DNS server. Recursive queries cause the queried DNS server to contact other DNS servers as needed in order to obtain a response to the query. The original DNS protocol does not provide sufcient security. In particular, there is no reason to trust DNS responses.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 316

DNS Labels and Names

The names (labels) on a certain level of the tree must be unique and may not exceed 63 byte in length. The character set for the labels is 7-bit ASCII. Comparisons are done in a case-insensitive manner. Labels must begin with a letter and end with a letter or decimal digit. The characters between the rst and last character must be letters, digits or hyphens. Labels can be concatenated with dots to form paths within the name space. Absolute paths, ending at the virtual root node, end with a trailing dot. All other paths which do not end with a trailing dot are relative paths. The overall length of a domain name is limited to 255 bytes.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 317

DNS Internationalization

Recent efforts did result in proposals for Internationalized Domain Names in Applications (IDNA) (RFC 3490, RFC 3491, RFC 3492). The basic idea is to support internationalized character sets within applications. For backward compatibility reasons, internationalized character sets are encoded into 7-bit ASCII representations (ASCII Compatible Encoding, ACE). ACE labels are recognized by a so called ACE prex. The ACE prex for IDNA is xn-. A label which contains an encoded internationalized name might for example be the value xn-de-jg4avhby1noc0d.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 318

Resource Records

The DNS name space is realized by Resource Records (RRs) which hold typed information for a given name. Resource records have the following components: The owner is the domain name which identies a resource record. The type indicates the kind of information that is stored in a resource record. The class indicates the protocol specic name space, normally IN for the Internet. The time to life (TTL) denes how many seconds information from a resource record can be stored in a local cache. The data format (RDATA) of a resource records depends on the type of the resource record.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 319

Resource Record Types


Type
A AAAA CNAME HINFO MX NS PTR SOA KEY SIG

Description IPv4 address IPv6 address Alias for another name (canonical name) Identication of the CPU and the operating system (host info) List of mail server (mail exchanger) Identication of an authoritative server for a domain Pointer to another part of the name space Start and parameters of a zone (start of zone of authority) Public key associated with a name Signature over the RRs associated with a name
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 320

DNS Message Formats

A DNS message starts with a protocol header. It indicates which of the following four parts is presents and whether the message is a query or a response. The header is followed by a list of questions. The list of questions is followed by a list of answers (resource records). The list of answers is followed by a list of pointers to authorities (also in the form of resource records). The list of pointers to authorities is followed by a list of additional information (also in the form of resource records). This list may contain for example A resource records for names in a response to an MX query.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 321

DNS Message Header


0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | ID | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ |QR| Opcode |AA|TC|RD|RA| Z|AD|CD| RCODE | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | QDCOUNT | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | ANCOUNT | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | NSCOUNT | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | ARCOUNT | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

Simple DNS queries usually use UDP as a transport. For larger data transfers (e.g. zone transfers), DNS may utilize TCP.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 322

DNS Query Format


0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | | / QNAME / / / +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | QTYPE | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | QCLASS | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 323

DNS Response Format


0 1 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ : NAME : +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | TYPE | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | CLASS | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | TTL | | | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+ | RDLENGTH | +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--| : RDATA : +--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 324

Resource Record Formats


An A resource record contains an IPv4 address encoded in 4 bytes in network byte order. An AAAA resource record contains an IPv6 address encoded in 16 bytes in network byte order. A CNAME resource record contains a character string preceded by the length of the string encoded in the rst byte. A HINFO resource record contains two character strings, each prexed with a length byte. The rst character string describes the CPU and the second string the operating system. A MX resource record contains a 16-bit preference number followed by a character string prexed with a length bytes which contains the DNS name of a mail exchanger.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 325

Resource Record Formats

A NS resource record contains a character string prexed by a length byte which contains the name of an authoritative DNS server. A PTR resource record contains a character string prexed with a length byte which contains the name of another DNS server. PTR records are used to map IP addresses to names (so called reverse lookups). For an IPv4 address of the form d1 .d2 .d3 .d4 , a PTR resource record is created for the pseudo domain name d4 .d3 .d2 .d1 .in addr.arpa. For an IPv6 address of the form h1 h2 h3 h4 : . . . : h13 h14 h15 h16 , a PTR resource record is created for the pseudo domain name h16 .h15 .h14 .h13 . . . . .h4 .h3 .h2 .h1 .ip6.arpa

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 326

DNS Reverse Trees


virtual root

nl

de

edu

org

net

com inaddr

arpa ipv6

toplevel

iubremen

2nd level

faculty

212

3rd level

www

201

4th level 5th level

48

6th level

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 327

Resource Record Formats

A SOA resource record contains two character strings, each prexed by a length byte, and four 32-bit numbers: Name of the DNS server responsible for a zone. Email address of the administrator responsible for the management of the zone. Serial number (SERIAL) (must be incremented whenever the zone database changes). Time which may elapse before cached zone information must be updated (REFRESH). Time interval after which zone information is consider not current anymore (EXPIRE). Minimum lifetime for resource records (MINIMUM).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 328

DNS Security

DNS security (DNSSEC) provides data integrity and authentication to security aware resolvers and applications through the use of cryptographic digital signatures. The KEY and SIG resource records have a rather complex format which is described in detail in RFC 2535. The DNSSEC specications are currently being revised to take operational experience into account.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 329

Dynamic DNS Updates


RFC 2136 / RFC 3007 dene a mechanism which allows to dynamically update RRs on name server. This is especially useful in environments which use dynamic IP address assignments. The payload of a DNS update message contains the zone section, the prerequisite section (supporting conditional updates), the update section, and an additional data section. The nsupdate command line utility can be used to make manual updates. Some DHCP servers perform automatic updates when they hand out an IP address.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 330

References
[1] P. Mockapetris. Domain Names - Concepts and Facilities. RFC 1034, ISI, November 1987. [2] P. Mockapetris. Domain Names - Implementation and Specication. RFC 1035, ISI, November 1987. [3] P. Faltstrom, P. Hoffman, and A. Costello. Internationalizing Domain Names in Applications (IDNA). RFC 3490, Cisco, IMC & VPNC, UC Berkeley, March 2003. [4] P. Hoffman and M. Blanchet. Nameprep: A Stringprep Prole for Internationalized Domain Names (IDN). RFC 3491, IMC & VPNC, Viagenie, March 2003. [5] A. Costello. Punycode: A Bootstring encoding of Unicode for Internationalized Domain Names in Applications (IDNA). RFC 3492, UC Berkeley, March 2003. [6] D. Eastlake. Domain Name System Security Extensions. RFC 2535, IBM, March 1999. [7] P. Vixie, S. Thomson, Y. Rekhter, and J. Bound. Dynamic Updates in the Domain Name System (DNS UPDATE). RFC 2136, ISC, Bellcore, Cisco, DEC, April 1997. [8] B. Wellington. Secure Domain Name System (DNS) Dynamic Update. RFC 3007, Nominum, November 2000.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 331

12. Electronic Mail

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 332

Components Involved in Electronic Mail


sender host system user agent mail queue local MTA sender host system user agent

SMTP mail queue organization A SMTP organization B mail queue relay MTA LMTP local MDA relay MTA

SMTP user agent receiver host system user mailbox local MTA

IMAP user agent receiver host system

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 333

Terminology

Mail User Agent (MUA) Mail Transfer Agent (MTA) Mail Delivery Agent (MDA) Store-and-Forward Principle Envelop vs. Header vs. Body

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 334

Simple Mail Transfer Protocol (SMTP)


Dened in RFC 2821 (originally RFC 821) Textual client/server protocol running over TCP Small set of commands to be executed by an SMTP server Supports multiple mail transactions over a single transport layer connection Server responds with structured response codes Fully specied in ABNF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 335

SMTP Commands
HELO EHLO MAIL RCPT DATA RSET VRFY EXPN HELP NOOP QUIT

Indentify clients to a SMTP server (HELLO) Extended identication (EXTENDED HELLO) Inititate a mail transaction (MAIL) Identity an individual recipient (RECIPIENT) Transfer of mail message (DATA) Aborting current mail transaction (RESET) Verify an email address (VERIFY) Expand a mailing list address (EXPAND) Provide help about SMTP commands (HELP) No operation, has no effect (NOOP) Ask server to close connection (QUIT)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 336

SMTP in ABNF (excerpt)


helo ehlo mail rcpt data rset vrfy expn help noop quit = = = = = = = = = = = "HELO" SP Domain CRLF "EHLO" SP Domain CRLF "MAIL FROM:" ("<>" / Reverse-Path) [SP Mail-Parameters] CRLF "RCPT TO:" ("<Postmaster@" domain ">" / "<Postmaster>" / Forward-Path) [SP Rcpt-Parameters CRLF "DATA" CRLF "RSET" CRLF "VRFY" SP String CRLF "EXPN" SP String CRLF "HELP" [ SP String ] CRLF "NOOP" [ SP String ] CRLF "QUIT" CRLF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 337

Theory of 3 Digit Reply Codes

The rst digit denotes whether the response is good, bad or incomplete. 1yz Positive Preliminary reply 2yz Positive Completion reply 3yz Positive Intermediate reply 4yz Transient Negative Completion reply 5yz Permanent Negative Completion reply The second digit encodes responses in specic categories. The third digit gives a ner gradation of meaning in each category specied by the second digit.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 338

Internet Message Format


The format of Internet messages is dened in RFC 2822. Most important ABNF productions and messages elds:
fields resend-field resend-field resend-field regular-field regular-field regular-field regular-field = *(trace *resent-field) *regular-field

= resent-date / resent-from / resent-sender =/ resent-to / resent-cc / resent-bcc =/ resent-msg-id = =/ =/ =/ orig-date / from / sender reply-to / to / cc / bcc message-id / in-reply-to / references subject / comments / keywords

Note that elds such as to or cc may be different from the actual addresses used by SMTP.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 339

Originator Fields

The From: eld species the author(s) of the message.


from = "From:" mailbox-list CRLF

The Sender: eld species the mailbox of the sender in cases where the actual sender is not the author (e.g., a secretary).
sender = "Sender:" mailbox CRLF

The Reply-To: eld indicates the mailbox(es) to which the author of the message suggests that replies be sent.
reply-to = "Reply-To:" address-list CRLF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 340

Destination Address Fields

The To: eld contains the address(es) of the primary recipient(s) of the message.
to = "To:" address-list CRLF

The Cc: eld (Carbon Copy) contains the addresses of others who are to receive the message, though the content of the message may not be directed at them.
cc = "Cc:" address-list CRLF

The Bcc: eld (Blind Carbon Copy) contains addresses of recipients of the message whose addresses are not to be revealed to other recipients of the message.
bcc = "Bcc:" (address-list / [CFWS]) CRLF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 341

Identication and Origination Date Fields

The Message-ID: eld provides a unique message identier that refers to a particular version of a particular message.
message-id = "Message-ID:" msg-id CRLF

The In-Reply-To: eld will contain the contents of the Message-ID: eld of the message to which this one is a reply.
in-reply-to = "In-Reply-To:" 1*msg-id CRLF

The References: eld will contain the contents of the parents References: eld (if any) followed by the contents of the parents Message-ID: eld (if any).
references = "References:" 1*msg-id CRLF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 342

Informational Fields

The Subject: eld contains a short string identifying the topic of the message.
subject = "Subject:" unstructured CRLF

The Comments: eld contains any additional comments on the text of the body of the message.
comments = "Comments:" unstructured CRLF

The Keywords: eld contains a comma-separated list of important words and phrases that might be useful for the recipient.
keywords = "Keywords:" phrase *("," phrase) CRLF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 343

Trace Fields

The Received: eld contains a (possibly empty) list of name/value pairs followed by a semicolon and a date-time specication. The rst item of the name/value pair is dened by item-name, and the second item is either an addr-spec, an atom, a domain, or a msg-id.
received = "Received:" name-val-list ";" date-time CRLF

The Return-Path: eld contains an email address to which messages indicating non-delivery or other mail system failures are to be sent.
return = "Return-Path:" path CRLF

A message may have multiple received elds and the return eld is optional
trace = [return] 1*received
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 344

Resend Fields

Resent elds are used to identify a message as having been reintroduced into the transport system by a user. Resent elds make the message appear to the nal recipient as if it were sent directly by the original sender, with all of the original elds remaining the same. Each set of resent elds correspond to a particular resending event.
resent-date resent-from = "Resent-Date:" date-time CRLF = "Resent-From:" mailbox-list CRLF

resent-sender = "Resent-Sender:" mailbox CRLF resent-to resent-cc resent-bcc = "Resent-To:" address-list CRLF = "Resent-Cc:" address-list CRLF = "Resent-Bcc:" (address-list / [CFWS]) CRLF
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 345

resent-msg-id = "Resent-Message-ID:" msg-id CRLF

Internet Message Example #1


Date: Tue, 1 Apr 1997 09:06:31 -0800 (PST) From: coyote@desert.example.org To: roadrunner@acme.example.com Subject: I have a present for you Look, Im sorry about the whole anvil thing, and I really didnt mean to try and drop it on you from the top of the cliff. I want to try to make it up to you. Ive got some great birdseed over here at my place--top of the line stuff--and if you come by, Ill have it all wrapped up for you. Im really sorry for all the problems Ive caused for you over the years, but I know we can work this out. -Wile E. Coyote "Super Genius" coyote@desert.example.org

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 346

Multipurpose Internet Mail Extensions

The Multipurpose Internet Mail Extensions (MIME) denes conventions to support multiple different character sets; support different media types; support messages containing multiple parts; encode content compatible with RFC 2821 and RFC 2822. The set of media types and identied characters sets is extensible. MIME is widely implemented and not only used for Internet mail messages.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 347

MIME Header Fields

The MIME-Version: eld declares the version of the Internet message body format standard in use.
version = "MIME-Version:" 1*DIGIT "." 1*DIGIT CRLF

The Content-Type: eld species the media type and subtype of data in the body.
content = "Content-Type:" type "/" subtype *(";" parameter)

The Content-Transfer-Encoding: eld species the encoding transformation that was applied to the body and the domain of the result.
encoding = "Content-Transfer-Encoding:" mechanism

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 348

MIME Header Fields

The optional Content-ID: eld allows one body to make reference to another.
id = "Content-ID:" msg-id

The optional Content-Description: eld associates some descriptive information with a given body.
description = "Content-Description" *text

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 349

MIME Media Types

Five discrete top-level media types (RFC 2046): text image audio video application Two composite top-level media types (RFC 2046): multipart message Some media types have additional parameters (e.g., the character set). The Internet Assigned Numbers Authority (IANA) maintains a list of registered media types and subtypes.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 350

MIME Boundaries

Multipart documents consists of several entities which are separated by a boundary delimiter line. After its boundary delimiter line, each body part then consists of a header area, a blank line, and a body area. The boundary is established through a parameter of the Content-Type: header eld of the message.
Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"

The boundary delimiter consists of two dashes followed by the established boundary followed by two dashes followed by the end of the line.
--gc0pJq0M:08jU534c0p--

The boundary delimiter must chosen so as to guarantee that there is no clash with the content.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 351

MIME Example
From: Nathaniel Borenstein <nsb@bellcore.com> To: Ned Freed <ned@innosoft.com> Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST) Subject: Sample message MIME-Version: 1.0 Content-type: multipart/mixed; boundary="simple boundary" This is the preamble. --simple boundary This is implicitly typed plain US-ASCII text. It does NOT end with a linebreak. --simple boundary Content-type: text/plain; charset=us-ascii This is explicitly typed plain US-ASCII text ending with a linebreak. --simple boundary-This is the epilogue. It is also to be ignored. and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 352 slides.tex Networks It is to be ignored.

Base64 Encoding

Idea: Represent three input bytes (24 bit) using four characters taken from a 6-bit alphabet. The resulting character sequence is broken into lines such that no line is longer than 76 characters. If the input text has a length which is not a multiple of three, then the special character = is appended to indicate the number of ll bytes. Base64 encoded data is difcult to read by humans without tools (which however are trivial to write).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 353

Base64 Character Set


Value 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 Encoding A B C D E F G H I J K L M N O P Q Value 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 Encoding R S T U V W X Y Z a b c d e f g h Value 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 Encoding i j k l m n o p q r s t u v w x y Value 51 52 53 54 55 56 57 58 59 60 61 62 63 Encoding z 0 1 2 3 4 5 6 7 8 9 + /

(pad) =

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 354

Base64 Example
CkRhdGU6IFR1ZSwgMSBBcHIgMTk5NyAwOTowNjozMSAtMDgwMCAoUFNUKQpGcm9t OiBjb3lvdGVAZGVzZXJ0LmV4YW1wbGUub3JnClRvOiByb2FkcnVubmVyQGFjbWUu ZXhhbXBsZS5jb20KU3ViamVjdDogSSBoYXZlIGEgcHJlc2VudCBmb3IgeW91CgpM b29rLCBJJ20gc29ycnkgYWJvdXQgdGhlIHdob2xlIGFudmlsIHRoaW5nLCBhbmQg SSByZWFsbHkKZGlkbid0IG1lYW4gdG8gdHJ5IGFuZCBkcm9wIGl0IG9uIHlvdSBm cm9tIHRoZSB0b3Agb2YgdGhlCmNsaWZmLiAgSSB3YW50IHRvIHRyeSB0byBtYWtl IGl0IHVwIHRvIHlvdS4gIEkndmUgZ290IHNvbWUKZ3JlYXQgYmlyZHNlZWQgb3Zl ciBoZXJlIGF0IG15IHBsYWNlLS10b3Agb2YgdGhlIGxpbmUKc3R1ZmYtLWFuZCBp ZiB5b3UgY29tZSBieSwgSSdsbCBoYXZlIGl0IGFsbCB3cmFwcGVkIHVwCmZvciB5 b3UuICBJJ20gcmVhbGx5IHNvcnJ5IGZvciBhbGwgdGhlIHByb2JsZW1zIEkndmUg Y2F1c2VkCmZvciB5b3Ugb3ZlciB0aGUgeWVhcnMsIGJ1dCBJIGtub3cgd2UgY2Fu IHdvcmsgdGhpcyBvdXQuCi0tCldpbGUgRS4gQ295b3RlICAgIlN1cGVyIEdlbml1 cyIgICBjb3lvdGVAZGVzZXJ0LmV4YW1wbGUub3JnCg==

What does this text mean? (Note that this example uses a non-standard line length to t on a slide.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 355

Quoted-Printable Encoding

Idea: Escape all characters which are not printable characters in the US-ASCII character set. The escape character is = followed by a two digit hexadecimal representation of the octets value (in uppercase). Relatively complex rules ensure that line breaks in the original input are preserved (so called hard line breaks); line breaks are added to ensure that encoded lines be no more than 76 characters long (so called soft line breaks). The encoding is intended for data that largely consists of printable characters in the US-ASCII character set.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 356

Internet Message Access Protocol (IMAP)

The Internet Message Access Protocol (IMAP) dened in RFC 3501 allows a client to access and manipulate electronic mail messages stored on a server. Messages are stored in mailboxes (message folders) which are identied by a mailbox name. IMAP runs over TCP with the well-known port number 143. IMAP users are typically authenticated using passwords (transmitted in cleartext over TCP). It is strongly suggested to use the Transport Layer Security (TLS) option to encrypt authentication exchanges and message transfers.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 357

IMAP Message Identication

Messages in a mailbox are identied by numbers: The unique identier identies a message in a mailbox independent of its position. Unique identiers should remain persistent across IMAP sessions. The message sequence number is the relative position of a message in a mailbox. The rst position in a mailbox is 1. Persistency of the unique identier can only be achieved to a certain extend and implementation therefore must cope with non-persistent unique identiers.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 358

IMAP States
connection established and server greeting

nonauthenticated

authenticated

selected

logout and connection release

The state determines the set of applicable commands.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 359

IMAP Commands

Commands applicable in all states:


CAPABILITY NOOP LOGOUT

Reports the servers capabilities Empty command (may be used to trigger status updates) Terminate the IMAP session

Commands applicable in the non-authenticated state:


AUTHENTICATE LOGIN STARTTLS

Indicates an authentication mechanism to the server Trivial authentication with a cleartext password Start TLS negotation to protect the channel

Note that the completion of the STARTTLS command can change the capabilities announced by the server.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 360

IMAP Commands

Commands applicable in the authenticated state:


SELECT EXAMINE CREATE DELETE RENAME SUBSCRIBE UNSUBSCRIBE LIST LSUB STATUS APPEND

Select an existing mailbox (read-write) Select an existing mailbox (read-only) Create a new mailbox Delete an existing mailbox Rename an existing mailbox Add a mailbox to the servers list of active mailboxes Remove a mailbox from the servers list of active mailboxes List all existing mailboxes List all active mailboxes Retrieve the status of a mailbox Append a message to a mailbox

The SELECT and EXAMINE commands cause a transition into the selected state.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 361

IMAP Commands

Commands applicable in the selected state:


CHECK CLOSE EXPUNGE SEARCH FETCH STORE COPY UID

Create a copy of the current mailbox (checkpoint) Close the current mailbox and leave state Expunge all messages marked as deleted Search for messages matching given criteria Fetch data of a message in the current mailbox Stote data as a message in the current mailbox Copy a message to the en of the specied mailbox Execute a command using unique identiers

The CLOSE command causes a transition back into the authenticated state.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 362

IMAP Tagging

IMAP supports asynchronous, concurrent operations. A client can send multiple commands which the server will execute asynchronously. A client tags commands such that responses returned by the server can be related to previously sent commands. Server responses that do not indicate command completion are prexed with the token *; that request additional information to complete a command are prexed with the token +; that communicate the successful for unsuccessful completion of a comment are prexed with the tag.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 363

IMAP Commands in ABNF


tag command = 1*<any ATOM_CHAR except "+"> = tag SPACE (command_any / command_auth / command_nonauth / command_select) CRLF ;; Modal based on state = "CAPABILITY" / "LOGOUT" / "NOOP" / x_command ;; Valid in all states = append / create / delete / examine / list / lsub / rename / select / status / subscribe / unsubscribe ;; Valid only in Authenticated or Selected state

command_any

command_auth

command_nonauth = login / authenticate ;; Valid only when in Non-Authenticated state command_select = "CHECK" / "CLOSE" / "EXPUNGE" / copy / fetch / store / uid / search ;; Valid only when in Selected state
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 364

IMAP Responses in ABNF


response continue_req response_data = *(continue_req / response_data) response_done = "+" SPACE (resp_text / base64) = "*" SPACE (resp_cond_state / resp_cond_bye / mailbox_data / message_data / capability_data) CRLF = response_tagged / response_fatal = "*" SPACE resp_cond_bye CRLF ;; Server closes connection immediately

response_done response_fatal

response_tagged = tag SPACE resp_cond_state CRLF

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 365

SIEVE Filtering Language


The SIEVE language dened in RFC 3028 can be used to lter messages at time of nal delivery. The language can be implemented either on the mail client of the mail server. SIEVE is extensible, simple, and independent of the access protocol, mail architecture, and operating system. The SIEVE language can be safely executed on servers (such as IMAP servers) as it has no variables, loops, or ability to shell out to external programs.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 366

SIEVE Scripts

SIEVE scripts are sequences of commands. An action command is an identier followed by zero or more arguments, terminated by a semicolon. Action commands do not take tests or blocks as arguments. A control command is similar, but it takes a test as an argument, and ends with a block instead of a semicolon. A test command is used as part of a control command. It is used to specify whether or not the block of code given to the control command is executed. The empty SIEVE script keeps the message (implicit keep).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 367

SIEVE in ABNF
argument = string-list / number / tag arguments = *argument [test / test-list] block = "{" commands "}" command = identifier arguments ( ";" / block ) commands = *command start = commands string = quoted-string / multi-line string-list = "[" string *("," string) "]" / string ;; if there is only a single string, the brackets are optional test = identifier arguments test-list = "(" test *("," test) ")"

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 368

SIEVE Example
# Declare any optional features or extension used by the script require ["fileinto", "reject"]; # Reject any large messages (note that the four leading dots get # "stuffed" to three) if size :over 1M { reject text: Please do not send me large attachments. Put your file on a server and send me the URL. Thank you. .... Fred . ; stop; }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 369

SIEVE Example (cont.)


# Move messages from IETF filter discussion list to filter folder if header :is "Sender" "owner-ietf-mta-filters@imc.org" { fileinto "filter"; # move to "filter" folder stop; } # # Keep all messages to or from people in my company # elsif address :domain :is ["From", "To"] "example.com" { keep; # keep in "In" folder stop; }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 370

SIEVE Example (cont.)


# Try and catch unsolicited email. If a message is not to me, # or it contains a subject known to be spam, file it away. # if anyof (not address :all :contains ["To", "Cc", "Bcc"] "me@example.com", header :matches "subject" ["*make*money*fast*", "*university*dipl*mas*"]) { # If message header does not contain my address, # its from a list. fileinto "spam"; # move to "spam" folder } else { # Move all other (non-company) mail to "personal" folder. fileinto "personal"; }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 371

References
[1] J. Klensin. Simple Mail Transfer Protocol. RFC 2821, AT&T Laboratories, April 2001. [2] P. Resnick. Internet Message Format. RFC 2822, QUALCOMM Incorporated, April 2001. [3] N. Freed and N. Borenstein. Multipurpose Internet Mail Extensions (MIME) Part One: Format of Internet Message Bodies. RFC 2045, Innosoft, First Virtual, November 1996. [4] N. Freed and N. Borenstein. Multipurpose Internet Mail Extensions (MIME) Part Two: Media Types. RFC 2046, Innosoft, First Virtual, November 1996. [5] M. Crispin. Internet Message Access Protocol -Version 4rev1. RFC 3501, University of Washington, March 2003. [6] T. Showalter. Sieve: A Mail Filtering Language. RFC 3028, Mirapoint, Inc., January 2001. [7] W. Segmuller. Sieve Extension: Relational Tests. RFC 3431, IBM T.J. Watson Research Center, December 2002. [8] C. Daboo. SIEVE Email Filtering: Spamtest and VirusTest Extensions. RFC 3685, Cyrusoft International, February 2004. [9] J. Degener. Sieve Extension: Copying Without Side Effects. RFC 3894, Sendmail, October 2004.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 372

13. File Transfer Protocol

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 373

File Transfer Protocol (FTP)


user interface

control process

commands replies

control process

file system

data transfer process

data

data transfer process

file system

FTP Server

FTP Client

The FTP protocol dened in RFC 959 uses a separate TCP connection for each data transfer. This approach sidesteps any problems about marking the end of les.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 374

File Transfer Protocol (FTP)


The control connection uses a text-based line-oriented protocol which is similar to SMTP. The client sends commands which are processed by the server. The server sends responses using three digit response codes. If the data transfer connection is initiated by the server, then the clients port number must be conveyed rst to the server. If the data transfer connection is initiated by the client, then the well-known port number 20 is used. The well-known port number for the control connection is 21.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 375

File Transfer Protocol (FTP)


FTP allows to resume a data transmission that did not complete using a special restart mechanism. FTP can be used to initiate a data transfer between two remote systems. However, this feature of FTP can result in some interesting security problems (see RFC 2577 for more details). RFC 2228 denes security extensions for FTP (authentication, integrity and condentiality). RFC 4217 denes how to run FTP over TLS.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 376

FTP Commands
USER PASS ACCT CWD CDUP SMNT QUIT REIN PORT PASV TYPE STRU MODE RETR <SP> <username> <CRLF> <SP> <password> <CRLF> <SP> <account-information> <CRLF> <SP> <pathname> <CRLF> <CRLF> <SP> <pathname> <CRLF> <CRLF> <CRLF> <SP> <host-port> <CRLF> <CRLF> <SP> <type-code> <CRLF> <SP> <structure-code> <CRLF> <SP> <mode-code> <CRLF> <SP> <pathname> <CRLF>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 377

FTP Commands
STOR STOU APPE ALLO REST RNFR RNTO ABOR DELE RMD MKD PWD <SP> <pathname> <CRLF> <CRLF> <SP> <pathname> <CRLF> <SP> <decimal-integer> [<SP> R <SP> <decimal-integer>] <CRLF> <SP> <marker> <CRLF> <SP> <pathname> <CRLF> <SP> <pathname> <CRLF> <CRLF> <SP> <pathname> <CRLF> <SP> <pathname> <CRLF> <SP> <pathname> <CRLF> <CRLF>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 378

FTP Commands
LIST NLST SITE SYST STAT HELP NOOP [<SP> <pathname>] <CRLF> [<SP> <pathname>] <CRLF> <SP> <string> <CRLF> <CRLF> [<SP> <pathname>] <CRLF> [<SP> <string>] <CRLF> <CRLF>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 379

FTP Command Parameters


<username> ::= <string> <password> ::= <string> <account-information> ::= <string> <string> ::= <char> | <char><string> <char> ::= any of the 128 ASCII characters except <CR> and <LF> <marker> ::= <pr-string> <pr-string> ::= <pr-char> | <pr-char><pr-string> <pr-char> ::= printable characters, any ASCII code 33 through 126 <byte-size> ::= <number> <host-port> ::= <host-number>,<port-number>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 380

FTP Command Parameters


<host-number> ::= <number>,<number>,<number>,<number> <port-number> ::= <number>,<number> <number> ::= any decimal integer 1 through 255 <form-code> ::= N | T | C <type-code> ::= A [<sp> <form-code>] | E [<sp> <form-code>] | I | L <sp> <byte-size> <structure-code> ::= F | R | P <mode-code> ::= S | B | C <pathname> ::= <string> <decimal-integer> ::= any decimal integer

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 381

References
[1] J. Postel and J. Reynolds. File Transfer Protocol (FTP). RFC 959, ISI, October 1985. [2] M. Horowitz and S. Lunt. FTP Security Extensions. RFC 2228, Cygnus Solutions, Bellcore, October 1997. [3] M. Allman, S. Ostermann, and C. Metz. FTP Extensions for IPv6 and NATs. RFC 2428, NASA Lewis/Sterling Software, Ohio University, The Inner Net, September 1998. [4] P. Ford-Hutchinson. Securing FTP with TLS. RFC 4217, IBM UK Ltd, October 2005.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 382

14. Hypertext Transfer Protocol

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 383

Hypertext Transfer Protocol (HTTP)

The Hypertext Transfer Protocol (HTTP) version 1.1 is dened in RFC 2616 is one of the core building blocks of the World Wide Web. HTTP is primarily used to exchange documents between clients (browsers) and servers. The HTTP protocol runs on top of TCP and it uses the well known port number 80. HTTP utilizes MIME conventions in order to distinguish different media types. RFC 2817 describes how to use TLS with HTTP 1.1. The Common Gateway Interface described in RFC 3875 allows to implement simple dynamic web pages.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 384

URL, URI, URN, IRI, . . .

Uniform Resource Identier (URI) A URI is a sequence of characters from the US-ASCII for identifying an abstract or physical resource. Uniform Resource Locator (URL) The term URL refers to the subset of URIs that provide a means of locating the resource by describing its primary access mechanism (e.g., its network "location"). Uniform Resource Name (URN) The term URN refer to the subset of URIs that identify resources by means of a globally unique and persistent name. Internationalized Resource Identier (IRI) An IRI is a sequence of characters from the Universal Character Set for identifying an abstract or physical resource.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 385

URI Syntax in ABNF


URI hier-part = scheme ":" hier-part [ "?" query ] [ "#" fragment ] = / / / "//" authority path-abempty path-absolute path-rootless path-empty

URI-reference = URI / relative-ref absolute-URI relative-ref = scheme ":" hier-part [ "?" query ] = relative-part [ "?" query ] [ "#" fragment ] "//" authority path-abempty path-absolute path-noscheme path-empty

relative-part = / / / scheme

= ALPHA *( ALPHA / DIGIT / "+" / "-" / "." )


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 386

URI Syntax in ABNF


authority userinfo host port IP-literal IPvFuture IPv6address = = = = [ userinfo "@" ] host [ ":" port ] *( unreserved / pct-encoded / sub-delims / ":" ) IP-literal / IPv4address / reg-name *DIGIT ) "]"

= "[" ( IPv6address / IPvFuture

= "v" 1*HEXDIG "." 1*( unreserved / sub-delims / ":" ) = / / / / / / / / 6( 5( 4( 3( 2( h16 h16 h16 h16 h16 h16 ":" ":" ":" ":" ":" ":" ) ) ) ) ) ls32 ls32 ls32 ls32 ls32 ls32 ls32 h16

[ [ [ [ [ [ [

*1( *2( *3( *4( *5( *6(

h16 h16 h16 h16 h16 h16

":" ":" ":" ":" ":" ":"

) ) ) ) ) )

h16 h16 h16 h16 h16 h16 h16

] ] ] ] ] ] ]

"::" "::" "::" "::" "::" "::" "::" "::"

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 387

URI Syntax in ABNF


h16 ls32 IPv4address dec-octet = 1*4HEXDIG = ( h16 ":" h16 ) / IPv4address = dec-octet "." dec-octet "." dec-octet "." dec-octet = / / / / DIGIT %x31-39 DIGIT "1" 2DIGIT "2" %x30-34 DIGIT "25" %x30-35 ; ; ; ; ; 0-9 10-99 100-199 200-249 250-255

reg-name path

= *( unreserved / pct-encoded / sub-delims ) = / / / / path-abempty path-absolute path-noscheme path-rootless path-empty ; ; ; ; ; begins with "/" or is empty begins with "/" but not "//" begins with a non-colon segment begins with a segment zero characters

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 388

URI Syntax in ABNF


path-abempty path-absolute path-noscheme path-rootless path-empty segment segment-nz segment-nz-nc pchar query fragment pct-encoded unreserved reserved gen-delims sub-delims = = = = = = = = = *( "/" segment ) "/" [ segment-nz *( "/" segment ) ] segment-nz-nc *( "/" segment ) segment-nz *( "/" segment ) 0<pchar> *pchar 1*pchar 1*( unreserved / pct-encoded / sub-delims / "@" ) unreserved / pct-encoded / sub-delims / ":" / "@"

= *( pchar / "/" / "?" ) = *( pchar / "/" / "?" ) = = = = = / "%" HEXDIG HEXDIG ALPHA / DIGIT / "-" / "." / "_" / "" gen-delims / sub-delims ":" / "/" / "?" / "#" / "[" / "]" / "@" "!" / "$" / "&" / "" / "(" / ")" "*" / "+" / "," / ";" / "="
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 389

HTTP 1.1 Methods

OPTIONS Request information about the communication options available GET Retrieve whatever information is identied by a URI HEAD Retrieve only the meta-information which is identied by a URI POST Annotate an existing resource or pass data to a data-handling process

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 390

HTTP 1.1 Methods (cont.)

PUT Store information under the supplied URI DELETE Delete the resource identied by the URI TRACE Application-layer loopback of request messages for testing purposes CONNECT Initiate a tunnel such as a TLS or SSL tunnel

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 391

HTTP 1.1 ABNF


Message Request Response = = = Request / Response Request-Line *(message-header CRLF) CRLF [ message-body ] Status-Line *(message-header CRLF) CRLF [ message-body ] Method SP Request-URI SP HTTP-Version CRLF HTTP-Version SP Status-Code SP Reason-Phrase CRLF

Request-Line = Status-Line Method Method Method =

= "OPTIONS" / "GET" / "HEAD" / "POST" =/ "PUT" / "DELETE" / "TRACE" / "CONNECT" =/ Extension-Method

Extension-Method = token

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 392

HTTP 1.1 ABNF (cont.)


Request-URI HTTP-Version Status-Code Reason-Phrase message-header field-name field-value field-content = = = = = = = = "*" | absoluteURI | abs_path | authority "HTTP" "/" 1*DIGIT "." 1*DIGIT 3DIGIT *<TEXT, excluding CR, LF> field-name ":" [ field-value ] token *( field-content / LWS ) <the OCTETs making up the field-value and consisting of either *TEXT or combinations of token, separators, and quoted-string>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 393

Persistent Connections and Pipelining


A client can establish a persistent connection to a server and use it to send multiple Request messages. HTTP relies on the MIME Content-Length header eld to detect the end of a message body (document). Earlier version of HTTP allowed only a single Request/Response exchange over a single connection, which is of course rather expensive. HTTP 1.1 also allows clients to make multiple requests without waiting for each response (pipelining), which can signicantly reduce latency.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 394

Caching and Proxies


Probably the most interesting and also most complex part of HTTP is its support for proxies and caching. Proxies are entities that exist between the client and the server and which basically relays requests and responses. Some proxies and clients maintain caches where copies of documents are stored in local storage space to speedup future accesses to these cached documents. The HTTP protocol allows a client to interrogate the server to determine whether the document has changed or not. Not all problems related to HTTP proxies and caches have been solved. A good list of issues can be found in RFC 3143.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 395

Negotiation

Negotation can be used to select different document formats, different transfer encodings, different languages or different character sets. Server-driven negotiation begins with a request from a client (a browser). The client indicates a list of its preferences. The server then decides how to best respond to the request. Client-driven negotiation requires two requests. The client rst asks the server what is available and then decides which concrete request to send to the server. In most cases, server-driven negotiation is used since this is much more efcient.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 396

Negotiation Example

The following is an example of typical negotiation header lines:


Accept: text/xml, application/xml, text/html;q=0.9, text/plain;q=0.8 Accept-Language: de, en;q=0.5 Accept-Encoding: gzip, deflate, compress;q=0.9 Accept-Charset: ISO-8859-1, utf8;q=0.66, *;q=0.33

The rst line indicates that the client accepts text/xml, application/xml, and text/plain and that it prefers text/html over text/plain. The last line indicates that the client prefers ISO-8859-1 encoding (with preference 1), UTF8 encoding with preference 0.66 and any other encoding with preference 0.33.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 397

Conditional Requests

Clients can make conditional requests by including headers that qualify the conditions under which the request should be honored. Conditional requests can be used to avoid unnecessary requests (e.g., to validate cached data). Example:
If-Modified-Since: Wed, 26 Nov 2003 23:21:08 +0100

The server checks whether the document was changed after the date indicated by the header line and only process the request if this is the case.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 398

Other Features and Extensions

Delta Encoding (RFC 3229) A mechanism to request only the document changes relative to a specic version. Web Distributed Authoring (RFC 2518, RFC 3253) An extension to the HTTP/1.1 protocol that allows clients to perform remote web content authoring operations. HTTP as a Substrate (RFC 3205) HTTP is sometimes used as a substrate for other application protocols (e.g., IPP or SOAP). There are some pitfalls with this approach as documented in RFC 3205.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 399

References
[1] R. Fielding, J. Gettys, J. Mogul, H. Frystyk, L. Masinter, P. Leach, and T. Berners-Lee. Hypertext Transfer Protocol HTTP/1.1. RFC 2616, UC Irvine, Compaq/W3C, Compaq, W3C/MIT, Xerox, Microsoft, W3C/MIT, June 1999. [2] R. Khare and S. Lawrence. Upgrading to TLS Within HTTP/1.1. RFC 2817, 4K Associates / UC Irvine, Agranat Systems, May 2001. [3] E. Rescorla. HTTP Over TLS. RFC 2818, RTFM, May 2000. [4] D. Robinson and K. Coar. The Common Gateway Interface (CGI) Version 1.1. RFC 3875, Apache Software Foundation, October 2004. [5] J. Mogul, B. Krishnamurthy, F. Douglis, A. Feldmann, Y. Goland, A. van Hoff, and D. Hellerstein. Delta encoding in HTTP. RFC 3229, Compaq WRL, AT&T, Univ. of Saarbruecken, Marimba, ERS/USDA, January 2002. [6] Y. Goland, E. Whitehead, A. Faizi, S. Carter, and D. Jensen. HTTP Extensions for Distributed Authoring WEBDAV. RFC 2518, Microsoft, UC Irvine, Netscape, Novell, February 1999. [7] G. Clemm, J. Amsden, T. Ellison, C. Kaler, and J. Whitehead. Versioning Extensions to WebDAV (Web Distributed Authoring and Versioning). RFC 3253, Rational Software, IBM, Microsoft, U.C. Santa Cruz, March 2002. [8] K. Moore. On the use of HTTP as a Substrate. RFC 3205, University of Tennessee, February 2002.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 400

15. Remote Procedure Calls

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 401

RPC Model
Client Client Stub Server Stub Server

Access Send Access

Result Send Result

Introduced by Birrel and Nelson (1984) to provide communication transparency and overcome heterogeneity
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 402

Stub Procedures
client stub ipc ipc stub server

invoke

pack

send

recv

unpack

invoke work

return

unpack

recv

send

pack

return

interface

interface

Client stubs provide a local interface which can be called like any other local procedure Server stubs provide the server interface which calls the servers implementation of the procedure provided by a programmer and returns any results back to the client Stubs hide all communication details
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 403

Marshalling

Marshalling is the technical term for transferring data structures used in remote procedure calls from one address space to another Serialization of data structures for transport in messages Conversion of data structures from the data representation of the calling process to that of the called process Pointers can be handled to some extend by introducing call-back handles which can be used to make an RPC call back from the server to the client in order to retrieve the data pointed to

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 404

RPC Denition Languages


procedure definition RPC definition language

implementation language RPC compiler

implementation language

client implementation

client stub source

header

server stub source

procedure implementation

compiler

compiler

compiler

compiler

client

server

Formal language to dene the type signature of remote procedures RPC compiler generates client / server stubs from the formal remote procedure denition
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 405

RPC Binding

A client needs to locate and bind to a server in order to use RPCs This usually requires to lookup the transport endpoint for a suitable server in some sort of name server: 1. The name server uses a well know transport address 2. A server registers with the name server when it starts up 3. A client rst queries the name server to retrieve the transport address of the server 4. Once the transport address is known, the client can send RPC messages to the correct transport endpoint

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 406

RPC Semantics

May-be:

Client does not retry failed requests

At-least-once: Client retries failed requests, server re-executes the procedure At-most-once: Client may retry failed requests, server detects retransmitted requests and responds with cached reply from the execution of the procedure Exactly-once: Client must retry failed requests, server detects retransmitted requests and responds with cached reply from the execution of the procedure
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 407

Local vs. Remote Procedure Calls


Client, server and the communication channel can fail independently and hence an RPC may fail Extra code must be present on the client side to handle RPC failures gracefully Global variables and pointers can not be used directly with RPCs Passing of functions as arguments is close to impossible The time needed to call remote procedures is orders of magnitude higher than the time needed for calling local procedures

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 408

Open Network Computing RPC


Developed by Sun Microsystems (Sun RPC), originally published in 1987/1988 Since 1995 controlled by the IETF (RFC 1790) ONC RPC encompasses: ONC RPC Language (RFC 1831, RFC 1832) ONC XDR Encoding (RFC 1832) ONC RPC Protocol (RFC 1831) ONC RPC Binding (RFC 1833) Foundation of the Network File System (NFS) and widely implemented on Unix systems

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 409

ONC RPC Language


program-def: "program" identifier "{" version-def version-def * "}" "=" constant ";" version-def: "version" identifier "{" procedure-def procedure-def * "}" "=" constant ";" procedure-def: type-specifier identifier "(" type-specifier ("," type-specifier )* ")" "=" constant ";"

The ONC RPC language is an extension of the XDR data denition language
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 410

ONC RPC Protocol


The ONC RPC protocol is dened in RFC 1831 Procedures are identied by a triple (P,V,F) with the following semantics: The program number P is used to identify a sets of related procedures (called a program in ONC terminology) A specic collection of related procedures within program P and identied by the version number V A specic procedure in the collection of procedures in program P with version V is identied by the procedure number F The textual name of the procedure is irrelevant from the protocol perspective

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 411

ONC RPC Protocol


RPC message formats are dened in the XDR language RPC protocol supports an opaque and thus extensible authentication data format Early versions of the protocol supported a DES based authentication scheme which used a key size which was very easy to break In many deployments, no authentication or clear-text authentication is still being used...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 412

ONC RPC Protocol


struct rpc_msg { unsigned int xid; union switch (msg_type mtype) { case CALL: call_body cbody; case REPLY: reply_body rbody; } body; }; struct call_body { unsigned int rpcvers; /* must be equal to two (2) */ unsigned int prog; unsigned int vers; unsigned int proc; opaque_auth cred; opaque_auth verf; /* procedure specific parameters start here */ };
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 413

ONC RPC Protocol


union reply_body switch (reply_stat stat) { case MSG_ACCEPTED: accepted_reply areply; case MSG_DENIED: rejected_reply rreply; } reply; union rejected_reply switch (reject_stat stat) { case RPC_MISMATCH: struct { unsigned int low; unsigned int high; } mismatch_info; case AUTH_ERROR: auth_stat stat; };

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 414

ONC RPC Protocol


struct accepted_reply { opaque_auth verf; union switch (accept_stat stat) { case SUCCESS: opaque results[0]; /* procedure-specific results start here */ case PROG_MISMATCH: struct { unsigned int low; unsigned int high; } mismatch_info; default: /* Void. Cases include PROG_UNAVAIL, PROC_UNAVAIL, * GARBAGE_ARGS, and SYSTEM_ERR. */ void; } reply_data; };

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 415

ONC RPC Binding

2. lookup

portmapper 1. register

client 3. procedure call

server

RPC servers use dynamic port numbers. The portmapper is used by the ONC RPC to map program and version numbers (P,V) to transport endpoints on the machine where the portmapper is running The portmapper is itself an RPC service Firewall generally dislike dynamic port numbers...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 416

ISO/OSI RPC (ROSE)

The Remote Operations Service Element (ROSE) is the service interface for RPCs in the ISO/OSI networking standards ROSE uses the ASN.1 language to dene the data structures that are exchanged Some special ASN.1 macros are used to dene the type signature of the remote operations The encoding on the wire is one of the ASN.1 encodings (e.g., BER, DER) Integration into frequently used implementation languages and environments complicated

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 417

XML based RPCs (XML-RPC)


XML based RPC uses XML to encode RPC messages Type information included in the encoding Very simple (perhaps too simple?) message protocol (no transaction identier, relies on HTTP security and access control) Uses HTTP as the underlying (surrogate) protocol No data denition / RPC language and thus no automatic stub generation For more information, see http://www.xmlrpc.com/

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 418

XML-RPC Request
POST /RPC2 HTTP/1.0 User-Agent: Frontier/5.1.2 (WinNT) Host: betty.userland.com Content-Type: text/xml Content-length: 196 <?xml version="1.0"?> <methodCall> <methodName>examples.getStateName</methodName> <params> <param> <value> <i4>41</i4> </value> </param> </params> </methodCall>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 419

XML-RPC Response
HTTP/1.1 200 OK Connection: close Content-Length: 172 Content-Type: text/xml Date: Fri, 17 Jul 1998 19:55:08 GMT Server: UserLand Frontier/5.1.2-WinNT <?xml version="1.0"?> <methodResponse> <params> <param> <value> <string>South Dakota</string> </value> </param> </params> </methodResponse>

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 420

DCE RPC

RPC developed as part of the Distributed Computing Environment (DCE) Network Interface Denition Language (NIDL) Network Data Representation (NDR) Integrates with the Kerberos authentication service Basis of the RPC mechanism used in Microsofts DCOM

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 421

References
[1] A. Birrell and P. Nelson. Implementing Remote Procedure Calls. ACM Transactions on Computer Systems, 2(1):3959, 1984. [2] R. Srinivasan. RPC: Remote Procedure Call Protocol Specication Version 2. RFC 1831, Sun Microsystems, August 1995. [3] R. Srinivasan. XDR: External Data Representation Standard. RFC 1832, Sun Microsystems, August 1995. [4] R. Srinivasan. Binding Protocols for ONC RPC Version 2. RFC 1833, Sun Microsystems, August 1995. [5] ISO. Information processing systems Open System Interconnection Remote Operations Part 1: Model, Notation and Service Denition. International Standard ISO 9072-1, ISO, 1989.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 422

16. Internet Management

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 423

Management Standards

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 424

Why Network Management?

Networks of non-trivial size need management:


Fault detection and isolation Conguration generation and installation Accounting data gathering Performance monitoring and tuning Security management (keys, access control)

FCAPS functional areas (very broad but widely accepted functional categorization)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 425

Why is Network Management Hard?


Scalability is a key concern (millions of devices/users) Short technology life times (what happened to ATM?) Heterogenity requires standards-based solutions Lack of skilled persons

= But network management is not really fundamentally different from other complex control systems (e.g., systems that control robots in a vehicle fabric). = However, network management terminology is often very different and sometimes somewhat confusing (especially for people with computer science background).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 426

Abstraction of Managed Objects (MOs)


Attributes Operations Behaviour Events

Warning: Coffee machine switched on but no coffee in the machine.

Management Application

Managed Object

Resource

A managed object is the abstracted view of a resource that presents its properties as seen by (and for the purpose of) management (ISO 7498-4). The boundary of a managed object denes the level of details which are accessible for management systems.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 427

Management Information Base (MIB)

MO MO Layer 7 Layer 6 Layer 5 MO Layer 4 MO Layer 3 Layer 2 Layer 1 Protocols MO MO

MO

MO MO

MO

MO

Management Information Base

Resources

The set of managed objects within a system, together with their attributes, constitutes that systems management information base (ISO 7498-4).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 428

Management Protocols

Algorithm for solving a management problem

MIBModel of a resource

Management Protocol

Management Protocol

Manager (Client)

Agent (Server)

Management protocols realize the access to MOs contained in a MIB.


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 429

Data-centric Approach

The device is represented as a collection of data objects representing all the properties and capabilities of a device. The management protocol manipulates the data objects representing a device and its state. Manipulation of data objects might cause side effects.

Example: Internet management (SNMP)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 430

Command-centric Approach

The device is considered to be a stateful black box. A sequence of commands can be send to the device to a) change the state of the device or to b) retrieve data about the current state of the device (or portions thereof). Determining the right sequence of commands to bring a device into a certain state might not be trivial.

Example: Command line interfaces of routers or switches

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 431

Object-centric Approach

The device is represented as a collection of data objects with associated methods. This can be seen as a combination of the data- and the command-centric approaches. Usually leads to object-oriented modeling and thus object-oriented approaches. A critical design decision is the granularity of the objects and the level of interdependencies between objects

Example: OSI management (CMIP), DMTF information models

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 432

Document-centric Approach

The conguration and state of a device is represented as a structured document. Management operations are realized by manipulating the structured document. Allows to use general document processors for management purposes. Closely related to data-centric approaches.

Example: Most XML-based management approaches follow this model.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 433

Essential Management Protocol Primitives

From a very abstract viewpoint, the following set of management protocol primitives is essential for data-centric or object-centric management protocols: GET, SET CREATE, DELETE SEARCH (or at the very least ITERATE) LOCK, UNLOCK, COMMIT, ROLLBACK NOTIFY EXECUTE an operation or INVOKE a method Command-centric protocols usually have a very rich set of primitives (which are often hierarchically structured).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 434

Information Models (RFC 3444)

Information Models (IMs) are used to model managed objects at a conceptual level, independent of any specic protocols used to transport the data. The degree of specicity (or detail) of the abstractions dened in the IM depends on the modeling needs of its designers. In order to make the overall design as clear as possible, IMs should hide all protocol and implementation details. IMs focus on relationships between managed objects. IMs are often represented in Unied Modeling Language (UML) diagrams, but there are also informal IMs written in plain English language.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 435

Data Models (RFC 3444)


Data Models (DMs) are dened at a lower level of abstraction and include many details (compared to IMs). They are intended for implementors and include implementation- and protocol-specic constructs. DMs are often represented in formal data denition languages that are specic to the management protocol being used.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 436

Information Models vs. Data Models


IM conceptual/abstract model for designers and operators

DM

DM

DM

concrete/detailed model for implementors

Since conceptual models can be implemented in different ways, multiple DMs can be derived from a single IM. Although IMs and DMs serve different purposes, it is not always possible to decide which detail belongs to an IM and which detail belongs to a DM. Similarily, it is sometimes difcult to determine whether an abstraction belongs to an IM or a DM.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 437

IMs and DMs in the Real World


Information Model Information Model

MIB Module SMIv2

PIB Module SPPI

Schema XSD

Interface Definitions OMG IDL

Data Model

MIB Variables

Provisioning Instances

XML Documents

Instance Data

BER

BER

XML

The Architecture for Differentiated Services (RFC 2475) is an example for an informal denition of the DiffServ information model. The DiffServ MIB module (RFC 3289) and the DiffServ PIP module (RFC 3317) are examples of data models conforming to the DiffServ information model.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 438

Network Management Standards


1.0 CORBA (OMG) M.30 TMN (ITU) M.3010 M.3100 M.3400 2.0 2.2 2.3 2.4 2.5 2.6 3.0

OSI RM.4 CMIP CMIS GDMO CMIP (ISO) SNMPv1 [S] SMIv1 [S] SNMPv2p [P] SMIv2 [P] SNMPv2c [D/E] SMIv2 [D] SNMPv3 SNMPv3 [P] [D] SMIv2 [S] SPPIv1 [P] SNMPv3 [S]

SNMP (IETF)

SMI (IETF)

SPPI (IETF)

COPSPR (IETF)

COPSPRv1 [P]

NETCONF (IETF) Legend: [P] Proposed Standard [D] Draft Standard [S] Standard [E] Experimental LDAP [P] DMI (DMTF) 1.0 CIM (DMTF) LDAPv2 [D] LDAPv3 [P] 2.0 2.2 2.3 2.4 2.5 2.6 2.7 2.8 1.0 2.0 2.0s

LDAP (IETF)

1980

1982

1984

1986

1988

1990

1992

1994

1996

1998

2000

2002

2004

2006

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 439

Management Technology Fragmentation


CIM / WBEM XML / HTTP GDMO / CMIP DEN / CIM / LDAP SMI / SNMP TMN / IDL / IIOP NETCONF SPPI / COPS Mobile Agents? Programmable Networks ? Active ? Networks

Open Standards TL1 Unicenter

CLI JunoScript Tivoli Research

Autonomic ? Networks

Proprietary

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 440

Implications

Standards Organizations: Duplication of efforts binds scare human resources. Network Device Vendors: Customers force device vendors to support multiple management technologies, which makes devices unnecessarily complex and expensive. Management Application Vendors: Integrated solutions are complex and thus expensive due to the many different interfaces. Network Operators: Creating heterogeneous networks with common management interfaces is hard/expensive. Increased costs and time for deploying new services.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 441

Theory of Standardization
Activity Time for Standardization Research Investment

Time

The success of a standard must be measured in terms of wide-spread deployment. Standards must allow vendors to differentiate their products. Successful standards can create new open markets. The timeliness of standards is a key factor for success.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 442

IETF Standards Process (RFC 2026)


Historic Historic

Working Group Document (Internet Draft)

Proposed Standard (RFC)

Draft Standard (RFC)

Internet Standard (RFC)

Internet Drafts are working documents and can be changed or removed anytime. All Internet standards are published in a series called Request For Comments (RFC), but not all RFCs dene standards (informational or experimental RFCs). The step from Proposed to Draft standard requires two independent and interoperable implementations from different code bases for all protocol features. The step from Draft to Standard requires signicant implementation and operational experience.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 443

IETF Management Standards


The requirements for Internet management technologies have changed during the last decade. Some fundamental design decisions taken in the late 1980s must be revisited to better reect todays realities. Working group members are dominated by network device vendors (solutions tend to be too device specic or way too detailed for real networks). Work on SNMP security took many many years to nally result in a stable SNMPv3 specication while other urgently needed SNMP improvements were kept on hold. Need to move to more mainstream technologies since network management remains a niche market.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 444

IETF Standardization Principles


KISS: Complex standards requiring people with special skills will not survive. Timeliness: Standards need to address real-world problems in a timely manner. Interoperability is more important than strict correctness. (Implementations should be liberal in what they accept and stringent in what they generate.) Protocols sometimes show effects when used on a larger scale that can not be observed on small scales. Concentration processes have given a few big players strong inuence on the success of standards.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 445

MIB Standardization Experience

Standardizing MIBs in order to establish an open market between device vendors and management software vendors does not always work very well: 1. Standardization takes too long. 2. Consensus often on the lowest common denominator. 3. Operationally important information often contained in proprietary MIB extensions. 4. Implementation and resource costs hinder fast and wide-spread deployment.

The sheer number of standardization efforts and proposals sometimes seem to distract those who do the actual work from doing the actual work.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 446

References
[1] H. G. Hegering, S. Abeck, and B. Neumair. Integrated Management of Networked Systems. Morgan Kaufmann, 1999. [2] ISO. Information processing systems Open System Interconnection Basic Reference Model Part 4: Management Framework. International Standard ISO/IEC 7498-4, ISO, 1989. [3] S. Bradner. The Internet Standards Process Revision 3. RFC 2026, Harvard University, October 1996. [4] B. Carpenter. Architectural Principles of the Internet. RFC 1958, IAB, June 1996. [5] A. Pras and J. Schnwlder. On the Difference between Information Models and Data Models. RFC 3444, University of Twente, University of Osnabrueck, January 2003. [6] A. Pras. Network Management Architectures. PhD thesis, Centre for Telematics and Information Technology, Twente University, Enschede, 1995. [7] J. Schnwlder, A. Pras, and J. P. Martin-Flatin. On the Future of Internet Management Technologies. IEEE Communications Magazine, 41(10):9097, October 2003.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 447

Structure of Management Information Version 2 (SMIv2)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 448

SMI Version 2 (SMIv2)


Fundamental SMIv2 Concepts SMI Naming (Object Type and Instance Identier) SMI Modules and Object Identities Object Types and Notication Types Textual Conventions Conformance Statements MIB Module Design, Implementation and Testing

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 449

Fundamental SMIv2 Concepts

The second version of the Structure of Management Information (SMIv2) [RFC 2578, RFC 2579, RFC 2580] is a data denition language which is used to dene the syntax and semantics of management data models. SMIv2 is based on an adapted subset of ASN.1 (1988). All management information is held in a collection of typed variables. All SMIv2 data types resolve to one of the primitive ASN.1 types INTEGER, OCTET STRING, and OBJECT IDENTIFIER.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 450

Fundamental SMIv2 Concepts


There are no compound data types (like classes, structures or unions). Variables either appear as scalars (exactly one instance) or as columns in a conceptual table (zero or more instances). Variables can be read and written using SNMP. The protocol allows to manipulate arbitrary lists of variables in one transaction. The protocol does not allow to manipulate conceptual tables or conceptual table rows.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 451

SMIv2 Base Types


SMIv2 Type INTEGER OCTET STRING OBJECT IDENTIFIER Integer32 Unsigned32 Gauge32 Counter32 Counter64 TimeTicks IpAddress Opaque BITS Description enumerations (signed integer) sequence of arbitrary bytes unique identier signed integer (-2147483648..2147483647) unsigned integer (0..4294967295) gauge (0..4294967295) counter (0..4294967295) counter (0..18446744073709551615) elapsed time in hundredths of a second four byte IP version 4 address wrapper for arbitrary ASN.1 types named bit sets

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 452

SMI Naming of Variables


name Manager Manager (HP/OV)
Router

uptime Agent MIB address (1) info (2) (1)

address name (1) uptime (2)

Every variable type is registered in the ISO OID registration tree. The leaves of the OID tree represent so called object types. Intermediate nodes may be used to group related object type denitions together.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 453

Object Type Names versus Instance Names


The registration of an object type assigns a globally unique name (OID). Concrete instances of the object type are identied by an instance name. The instance name is appended to the OID (name) of the object type. Scalars can only have one instance. The instance name is always constructed by appending 0. The instance name of columnar variables is derived from the key(s) of the underlying conceptual table.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 454

Scalar Object Types and Instance Names


(1)

address (1)

info (2)

name (1)

uptime (2)

Object Identier 1.1 1.2.1 1.2.2

Instance Identier 0 0 0

Type IpAddress OCTET STRING TimeTicks

Value 10.1.2.1 "ACME Router" 54321

Descriptors are only used by humans SNMP only operates on OID values. Descriptors must be unique within a module. Module names must be globally unique.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 455

Columnar Object Types and Instance Names


(1)

address (1)

info (2)

fwdTable (3)

name (1)

uptime (2)

fwdEntry (1)

fwdIndex (1) 1 2 1 3 5 8 7 9 2 3 4 5 6

fwdDest (2) 2 3 5 7 8 9

fwdNext (3) 2 3 2 2 3 3

The column fwdIndex uniquely identies a row in the conceptual fwdTable.


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 456

Columnar Instance Names

Conceptual tables are constructed using two auxiliary nodes: The rst node represents the table (syntactically an ASN.1 SEQUENCE OF). The descriptor of the rst node always ends with Table. The second node represents a table row (syntactically an ASN.1 SEQUENCE). The descriptor of the second node always ends with Entry. The primary key of the table is called the index and is used to derive the instance identier for a table row.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 457

Columnar Instance Names

The concatenation of the object type name and the instance identier yields the unique object identier to address a columnar variable. Examples (see fwdTable from previous slides):

1.3.1.1.1 1 1.3.1.3.1 2 1.3.1.2.4 7 1.3.1.2.7 does not exist

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 458

Complex Columnar Instance Names


(1)

address (1)

info (2)

fwdTable (3)

name (1)

uptime (2)

fwdEntry (1)

fwdDest (1) 0.0.0.0 127.0.0.1 134.169.34.0 134.169.35.1 134.169.35.2

fwdMask (2) 255.255.255.0 255.0.0.0 255.255.255.0 255.255.255.0 255.255.255.0

fwdNext (3) 134.169.34.1 127.0.0.1 134.169.34.15 134.169.34.18 134.169.34.18

The index into a more realistic IPv4 forwarding table is the combination of a destination address and an address mask. 1.3.1.3.134.169.34.0.255.255.255.0 134.169.34.15
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 459

Instance Name for Conceptual Tables

Rules for forming the instance names: 1. integer-valued: a single sub-identier taking the integer value 2. string-valued, xed-length: each octet of the string is encoded in a separate sub-identier 3. string-valued, variable-length: the rst sub-identier is the length followed by each octet of the string encoded in a separate sub-identier 4. object identier-valued: the rst sub-identier is the length followed by each sub-identier The IMPLIED keyword can be used to avoid the length sub-identier in some cases (not recommended). Instance identier can not be arbitrarily complex since OIDs are limited to 128 sub-identiers.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 460

MIB Modules and Object Identities


Related SMI denitions are grouped into MIB modules. Every MIB module must have a unique name. A MIB module corresponds to an ASN.1 module. Denitions can be imported from other MIB modules using an ASN.1 IMPORT. All SMIv2 macros and derived data types used in a MIB module must be imported.

ACME-ROUTER-MIB DEFINITIONS ::= BEGIN IMPORT MODULE-IDENTITY, OBJECT-TYPE, Integer32, enterprises FROM SNMPv2-SMI; -- ... END
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 461

SMIv2 Module Identity Macro (RFC 2578)


<Descriptor> MODULE-IDENTITY LAST-UPDATED <ExtUTCTime> ORGANIZATION <Text> CONTACT-INFO <Text> DESCRIPTION <Text> [REVISION <ExtUTCTime> DESCRIPTION <Text>]* ::= <ObjectIdentifier>

The MODULE-IDENTITY macro provides contact information and the revision history. The REVISION and DESCRIPTION clauses can repeat multiple times. The OID of the MODULE-IDENTITY macro is often used as the top-level node for all denitions in a module.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 462

Module Identity Macro Example (RFC 2863)


ifMIB MODULE-IDENTITY LAST-UPDATED "200006140000Z" ORGANIZATION "IETF Interfaces MIB Working Group" CONTACT-INFO "Keith McCloghrie Cisco Systems, Inc. 408-526-5260 170 West Tasman Drive kzm@cisco.com San Jose, CA 95134-1706, US" DESCRIPTION "The MIB module to describe generic objects for network interface sub-layers. This MIB is an updated version of MIB-IIs ifTable, and incorporates the extensions defined in RFC 1229." REVISION "200006140000Z" DESCRIPTION "Clarifications agreed upon by the Interfaces MIB WG, and published as RFC 2863" REVISION "199602282155Z" DESCRIPTION "Revisions made by the Interfaces MIB WG, and published in RFC 2233." REVISION "199311082155Z" DESCRIPTION "Initial revision, published as part of RFC 1573." ::= { mib-2 31 }
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 463

SMIv2 Object Identity Macro (RFC 2578)


<Descriptor> OBJECT-IDENTITY STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] ::= <ObjectIdentifier>

The macro is used to dene and register OBJECT IDENTIFIER values. The STATUS clause denes whether the denition is current, deprecated or obsolete. The optional REFERENCE clause can point to related denitions. Used to dene constants (extensible enumerations) or to structure the OID tree.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 464

Object Identity Macro Examples (RFC 2578, 3417)


zeroDotZero OBJECT-IDENTITY STATUS current DESCRIPTION "A value used for null identifiers." ::= { 0 0 } snmpUDPDomain OBJECT-IDENTITY STATUS current DESCRIPTION "The SNMPv2 over UDP transport domain. The corresponding transport address is of type SnmpUDPAddress." ::= { snmpDomains 1 } snmpIPXDomain OBJECT-IDENTITY STATUS current DESCRIPTION "The SNMPv2 over IPX transport domain. The corresponding transport address is of type SnmpIPXAddress." ::= { snmpDomains 5 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 465

Object Types and Notication Types


The heart of a MIB are the object types and conceptual tables. The object type macro is used to dene object types and conceptual tables. Additional ASN.1 type denitions are needed for conceptual tables. Notication Types dene events that can be reported to managers. Notications have been used sparingly in the past since there were no standardized mechanisms to enable/disable them. Not carefully designed notication types can lead to notication storms.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 466

SMIv2 Object Type Macro (RFC 2578)


<Descriptor> OBJECT-TYPE SYNTAX <Syntax> [UNITS <Text>] MAX-ACCESS <Access> STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] [INDEX <Index> | AUGMENTS <Index>] [DEFVAL <Value>] ::= <ObjectIdentifier>

OBJECT-TYPE macros are used to dene object types as well as conceptual tables. The INDEX or AUGMENTS clause is only allowed for conceptual row denitions, where one of them is mandatory.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 467

Object Type Macro Examples (RFC 3418)


sysDescr OBJECT-TYPE SYNTAX DisplayString (SIZE (0..255)) MAX-ACCESS read-only STATUS current DESCRIPTION "A textual description of the entity. This value should include the full name and version identification of the systems hardware type, software operating-system, and networking software." ::= { system 1 } sysUpTime OBJECT-TYPE SYNTAX TimeTicks MAX-ACCESS read-only STATUS current DESCRIPTION "The time (in hundredths of a second) since the network management portion of the system was last re-initialized." ::= { system 3 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 468

Object Type Macro Examples (RFC 3418)


sysName OBJECT-TYPE SYNTAX DisplayString (SIZE (0..255)) MAX-ACCESS read-write STATUS current DESCRIPTION "An administratively-assigned name for this managed node. By convention, this is the nodes fully-qualified domain name. If the name is unknown, the value is the zero-length string." ::= { system 5 } sysLocation OBJECT-TYPE SYNTAX DisplayString (SIZE (0..255)) MAX-ACCESS read-write STATUS current DESCRIPTION "The physical location of this node (e.g., telephone closet, 3rd floor). If the location is unknown, the value is the zero-length string." ::= { system 6 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 469

Object Type Macro Examples (RFC 3418)


sysORTable OBJECT-TYPE SYNTAX SEQUENCE OF SysOREntry MAX-ACCESS not-accessible STATUS current DESCRIPTION "The (conceptual) table listing the capabilities of the local SNMP application acting as a command responder with respect to various MIB modules. SNMP entities having dynamically-configurable support of MIB modules will have a dynamically-varying number of conceptual rows." ::= { system 9 } sysOREntry OBJECT-TYPE SYNTAX SysOREntry MAX-ACCESS not-accessible STATUS current DESCRIPTION "An entry (conceptual row) in the sysORTable." INDEX { sysORIndex } ::= { sysORTable 1 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 470

Object Type Macro Examples (RFC 3418)


SysOREntry ::= SEQUENCE { sysORIndex INTEGER, sysORID OBJECT IDENTIFIER, sysORDescr DisplayString, sysORUpTime TimeStamp } sysORIndex OBJECT-TYPE SYNTAX INTEGER (1..2147483647) MAX-ACCESS not-accessible STATUS current DESCRIPTION "The auxiliary variable used for identifying instances of the columnar objects in the sysORTable." ::= { sysOREntry 1 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 471

Object Type Macro Examples (RFC 3418)


sysORID OBJECT-TYPE SYNTAX OBJECT IDENTIFIER MAX-ACCESS read-only STATUS current DESCRIPTION "An authoritative identification of a capabilities statement with respect to various MIB modules supported by the local SNMP application acting as a command responder." ::= { sysOREntry 2 } sysORDescr OBJECT-TYPE SYNTAX DisplayString MAX-ACCESS read-only STATUS current DESCRIPTION "A textual description of the capabilities identified by the corresponding instance of sysORID." ::= { sysOREntry 3 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 472

Object Type Macro Examples (RFC 3418)


sysORUpTime OBJECT-TYPE SYNTAX TimeStamp MAX-ACCESS read-only STATUS current DESCRIPTION "The value of sysUpTime at the time this conceptual row was last instantiated." ::= { sysOREntry 4 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 473

SMIv2 Notication Type Macro (RFC 2578)


<Descriptor> NOTATION-TYPE [OBJECTS <Objects>] STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] ::= <ObjectIdentifier>

Registers a notication type with an object identier. The next to last sub-identier should be 0 for backwards compatibility with older SNMP versions. The OBJECTS clause denes the ordered sequence of MIB object types which are contained in a notication. The DESCRIPTION clause must describe which instances are to be included in a notication.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 474

Notication Type Macro Examples (RFC 2863)


linkDown NOTIFICATION-TYPE OBJECTS { ifIndex, ifAdminStatus, ifOperStatus } STATUS current DESCRIPTION "A linkDown trap signifies that the SNMP entity, acting in an agent role, has detected that the ifOperStatus object for one of its communication links is about to enter the down state from some other state (but not from the notPresent state). This other state is indicated by the included value of ifOperStatus." ::= { snmpTraps 3 } linkUp NOTIFICATION-TYPE OBJECTS { ifIndex, ifAdminStatus, ifOperStatus } STATUS current DESCRIPTION "A linkDown trap signifies that the SNMP entity, acting in an agent role, has detected that the ifOperStatus object for one of its communication links left the down state and transitioned into some other state (but not into the notPresent state). This other state is indicated by the included value of ifOperStatus." ::= { snmpTraps 4 } slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 475

Textual Conventions (RFC 2579)


Textual conventions are used to derive new types from SMIv2 base types. New types may not be derived from other TEXTUAL-CONVENTIONs. The DISPLAY-HINT clause denes a mapping from the internal representation into a human readable format and back. The DISPLAY-HINT clause can only be used with an INTEGER or an OCTET STRING base type. A textual convention supports formal restrictions (e.g., ranges). A textual convention does not allow to dene compound types (e.g., structures or unions).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 476

SMIv2 Textual Conventions Macro (RFC 2579)


<TypeDescriptor> ::= TEXTUAL-CONVENTION [DISPLAY-HINT <Text>] STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] SYNTAX <Syntax>

The DISPLAY-HINT clause denes a mapping from the internal representation into a human readable format and back. The DISPLAY-HINT clause can only be used with an INTEGER or an OCTET STRING base type. The SYNTAX clause denes the (restricted) SMIv2 base type. Additional semantics are dened in the DESCRIPTION clause.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 477

DISPLAY-HINTS (INTEGER)
Format
d d-<number> b o x

Description decimal value decimal value with a decimal point binary value octal value hexadecimal value

Examples: d for value 143 renders 143 d-2 for value 143 renders 1.43 b for value 143 renders 100001111 o for value 143 renders 217 x for value 143 renders 8F
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 478

DISPLAY-HINTS (OCTET STRING)


[<repeat>]<number><format>[<separator>][<terminator>]

Field
<repeat>

Description number of times to apply the remainder of the specication ( indicates the current octet, default 1) number of octets converted according to the next eld display format (a ascii, d decimal, x hexadecimal, o octal, t UTF-8) separator character terminating character if <repeat> and <separator> parts are present

<number> <format> <separator> <terminator>

Examples: 255a for value aBc renders aBc 1x: for value aBc renders 61:42:63 0aH0ae0al0al0ao 1a for value World renders Hello World
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 479

Textual Convention Macro Examples (RFC 3411)


SnmpSecurityLevel ::= TEXTUAL-CONVENTION STATUS current DESCRIPTION "A Level of Security at which SNMP messages can be sent or with which operations are being processed; in particular, one of: noAuthNoPriv - without authentication and without privacy, authNoPriv - with authentication but without privacy, authPriv - with authentication and with privacy. These three values are ordered such that noAuthNoPriv is less than authNoPriv and authNoPriv is less than authPriv." SYNTAX INTEGER { noAuthNoPriv(1), authNoPriv(2), authPriv(3) }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 480

Textual Convention Macro Examples (RFC 2579)


MacAddress ::= TEXTUAL-CONVENTION DISPLAY-HINT "1x:" STATUS current DESCRIPTION "Represents an 802 MAC address represented in the canonical order defined by IEEE 802.1a, i.e., as if it were transmitted least significant bit first, even though 802.5 (in contrast to other 802.x protocols) requires MAC addresses to be transmitted most significant bit first." SYNTAX OCTET STRING (SIZE (6)) TruthValue ::= TEXTUAL-CONVENTION STATUS current DESCRIPTION "Represents a boolean value." SYNTAX INTEGER { true(1), false(2) }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 481

Textual Convention Macro Examples (RFC 2579)


DateAndTime ::= TEXTUAL-CONVENTION DISPLAY-HINT "2d-1d-1d,1d:1d:1d.1d,1a1d:1d" STATUS current DESCRIPTION "A date-time specification. field ----1 2 3 4 5 6 contents -------year* month day hour minutes seconds (use 60 for leap-second) 7 8 deci-seconds 8 9 direction from UTC 9 10 hours from UTC* 10 11 minutes from UTC OCTET STRING (SIZE (8 | 11)) octets -----1-2 3 4 5 6 7 range ----0..65536 1..12 1..31 0..23 0..59 0..60 0..9 + / - 0..13 0..59"

SYNTAX

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 482

Conformance Statements

Conformance statements dene implementation requirements. Object type and notication type denitions may be grouped together. Groups can be mandatory, optional or conditionally optional. The maximum access of an object type denition may be revised. The type associated with an object type may be rened. It is possible to have different type renements for read and write operations.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 483

SMIv2 Object Group Macro (RFC 2580)


<Descriptor> OBJECT-GROUP OBJECTS <Objects> STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] ::= <ObjectIdentifier>

Denes a collection of related managed object types. Object groups can be used to dene conformance levels. The status of the object group denition must conform to the status of the objects types listed in the OBJECTS clause.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 484

SMIv2 Notication Group Macro (RFC 2580)


<Descriptor> NOTIFICATION-GROUP NOTIFICATIONS <Notifications> STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] ::= <ObjectIdentifier>

Denes a collection of related notication types. Notication groups can be used to dene conformance levels. The status of the notication group denition must conform to the status of the notications listed in the NOTIFICATIONS clause.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 485

Object and Notication Group Examples (RFC 3418)


systemGroup OBJECT-GROUP OBJECTS { sysDescr, sysObjectID, sysUpTime, sysContact, sysName, sysLocation, sysServices, sysORLastChange, sysORID, sysORUpTime, sysORDescr } STATUS current DESCRIPTION "The system group defines objects which are common to all managed systems." ::= { snmpMIBGroups 6 } snmpBasicNotificationsGroup NOTIFICATION-GROUP NOTIFICATIONS { coldStart, authenticationFailure } STATUS current DESCRIPTION "The basic notifications implemented by an SNMP entity supporting command responder applications." ::= { snmpMIBGroups 7 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 486

SMIv2 Module Compliance Macro (RFC 2580)


<Descriptor> MODULE-COMPLIANCE STATUS <Status> DESCRIPTION <Text> [REFERENCE <Text>] [MODULE <Name> [MANDATORY-GROUPS <Groups>] [GROUP <ObjectIdentifier> DESCRIPTION <Text>]* [OBJECT <ObjectIdentifier> DESCRIPTION <Text>]*]* ::= <ObjectIdentifier>

This is a simplied version! Species implementation requirements for a compliant implementation.


slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 487

Module Compliance Macro Examples (RFC 3418)


snmpBasicComplianceRev2 MODULE-COMPLIANCE STATUS current DESCRIPTION "The compliance statement for SNMP entities which implement this MIB module." MODULE -- this module MANDATORY-GROUPS { snmpGroup, snmpSetGroup, systemGroup, snmpBasicNotificationsGroup } GROUP snmpCommunityGroup DESCRIPTION "This group is mandatory for SNMP entities which support community-based authentication." GROUP snmpWarmStartNotificationGroup DESCRIPTION "This group is mandatory for an SNMP entity which supports command responder applications, and is able to reinitialize itself such that its configuration is unaltered." ::= { snmpMIBCompliances 3 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 488

Module Compliance Macro Examples (RFC 3414)


usmMIBCompliance MODULE-COMPLIANCE STATUS current DESCRIPTION "The compliance statement for SNMP engines which implement the SNMP-USER-BASED-SM-MIB." MODULE -- this module MANDATORY-GROUPS { usmMIBBasicGroup } OBJECT usmUserAuthProtocol MIN-ACCESS read-only DESCRIPTION "Write access is not required." OBJECT usmUserPrivProtocol MIN-ACCESS read-only DESCRIPTION "Write access is not required." ::= { usmMIBCompliances 1 }

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 489

MIB Engineering Process

Iterations are possible and useful before MIB denitions are released. MIB denitions that have been released can not be modied anymore.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 490

MIB Analysis and Design

Components Collections of logical or physical devices or services. Attributes Stable attributes that describe the management aspects of a component. Relationships Relationships between components (containment, logical order, . . . ) Relationships between attributes of components. Cardinality of components and attributes involved in relationships.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 491

MIB Analysis and Design

State Attributes Describe the desired (administrative) and/or current (operational) state of a component. Use state diagrams to document possible state transitions. Statistic Attributes Statistic attributes document what a component has done in the past. Statistics are bound to a lifetime which should be well-dened. Statistics should be basic; managers can typically easily compute derived statistics

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 492

MIB Conventions

Related object type denitions are grouped into a subtree. Object type names start with a common prex to identify the logical group (e.g. sysUpTime, sysServices). Counter object type names use the plural spelling (e.g., ifInOctets). Conceptual table names end with the sufx Table and names of conceptual rows end with the sufx Entry (e.g., ifTable, ifEntry). All (conceptual table) denitions share a common prex (e.g., ifType, ifDescr).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 493

MIB Compiler

Front-end compilers perform syntax checks and to some extend semantic checks. Back-end compilers can generate documentation, code to automate steps of the implementation process, test suites to test agent or manager implementation, input for management applications which access MIB denitions at run time.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 494

Testing MIB Implementations

Automated regression tests help to ensure the quality of the MIB implementation. Many test cases can be generated automatically from the MIB denition. Additional test cases should be written by an independent engineer.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 495

Typical MIB Implementation Problems

Caching: Access to management information might be expensive. Solution: Try to cache some information in the access method to reduce the number of costly interactions. Problem: How to dene the lifetime of the cache? Atomic Sets: It is sometimes difcult to guarantee as if simultaneous semantics because: Variable names and values can come in any order. A single set may include multiple values for a single instance. It is difcult to maintain state during a sequence of set operations.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 496

References
[1] K. McCloghrie, D. Perkins, and J. Schnwlder. Structure of Management Information Version 2 (SMIv2). RFC 2578, Cisco Systems, SNMPinfo, TU Braunschweig, April 1999. [2] K. McCloghrie, D. Perkins, and J. Schnwlder. Textual Conventions for SMIv2. RFC 2579, Cisco Systems, SNMPinfo, TU Braunschweig, April 1999. [3] K. McCloghrie, D. Perkins, and J. Schnwlder. Conformance Statements for SMIv2. RFC 2580, Cisco Systems, SNMPinfo, TU Braunschweig, April 1999. [4] D. Perkins and E. McGinnis. Understanding SNMP MIBs. Prentice Hall, 1997. [5] C. M. Heard. Guidelines for Authors and Reviewers of MIB Documents. RFC 4181, Consultant, September 2005.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 497

Core IETF MIB Models

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 498

Core IETF MIB Models


OID Registration Tree Physical and Logical Network Interfaces Physical and Logical Entities Internet Protocol (IPv4)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 499

OID Registration Tree

itut(0) / ccitt(0)

iso(1)

jointisoitut(2) / jointisoccitt(2)

standard(0)

registrationauthority(1)

memberbody(2)

identifiedorganization(3)

dod(6)

internet(1)

directory(1)

mgmt(2)

experimental(3)

private(4)

security(5)

snmpV2(6)

mib2(1)

snmpDomains(1)

snmpProxy(2)

snmpModules(3)

system(1)

interfaces(2)

ip(4)

icmp(5)

tcp(6)

udp(7)

transmission(10)

snmp(11) ...

snmpMIB(1)

snmpFrameworkMIB(1

x25(5)

dot3(7) dot5(9) fddi(15) lapb(16)

...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 500

ifIndex(1) ifDescr(2) ifType(3) ifMtu(4) ifSpeed(5) ifPhysAddress(6) ifAdminStatus(7) ifOperStatus(8) ifLastChange(9) ifInOctets(10) ifInUcastPkts(11) ifInNUcastPkts(12) ifInDiscards(13) ifInErrors(14) ifInUnknownProtos(15) ifOutOctets(16) ifOutUcastPkts(17) ifOutNUcastPkts(18) ifOutDiscards(19) ifOutErrors(20) ifOutQLen(21) ifSpecific(22) ifName(1) ifInMulticastPkts(2) ifInBroadcastPkts(3) ifOutMulticastPkts(4) ifOutBroadcastPkts(5) ifHCInOctets(6) ifHCInUcastPkts(7) ifHCInMulticastPkts(8) ifHCInBroadcastPkts(9) ifHCOutOctets(10) ifHCOutUcastPkts(11) ifHCOutMulticastPkts(12) ifHCOutBroadcastPkts(13) ifLinkUpDownTrapEnable(14) ifHighSpeed(15) ifPromiscuousMode(16) ifConnectorPresent(17) ifAlias(18) ifCounterDiscontinuityTime(19)

Physical and Logical Network Interfaces

ifNumber(1)

interfaces(2)

ifEntry(1)

ifTable(2)

mib2(1.3.6.1.2.1)

ifXEntry(1)

ifXTable(1)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 501

ifMIBObjects(1)

ifMIB(31)

ifStackEntry(1)

ifStackTable(2)

ifStackHigherLayer(1) ifStackLowerLayer(2) ifStackStatus(3)

ifRcvAddressEntry(1)

ifRcvAddressTable(4)

ifRcvAddressAddress(1) ifRcvAddressStatus(2) ifRcvAddressType(3)

ifTableLastChange(5) ifStackLastChange(6)

OID Tree Structure of the IF-MIB (RFC 2863)

ifIndex(1) ifDescr(2) ifType(3) ifMtu(4) ifSpeed(5) ifPhysAddress(6) ifAdminStatus(7) ifOperStatus(8) ifLastChange(9) ifInOctets(10) ifInUcastPkts(11) ifInNUcastPkts(12) ifInDiscards(13) ifInErrors(14) ifInUnknownProtos(15) ifOutOctets(16) ifOutUcastPkts(17) ifOutNUcastPkts(18) ifOutDiscards(19) ifOutErrors(20) ifOutQLen(21) ifSpecific(22) ifName(1) ifInMulticastPkts(2) ifInBroadcastPkts(3) ifOutMulticastPkts(4) ifOutBroadcastPkts(5) ifHCInOctets(6) ifHCInUcastPkts(7) ifHCInMulticastPkts(8) ifHCInBroadcastPkts(9) ifHCOutOctets(10) ifHCOutUcastPkts(11) ifHCOutMulticastPkts(12) ifHCOutBroadcastPkts(13) ifLinkUpDownTrapEnable(14) ifHighSpeed(15) ifPromiscuousMode(16) ifConnectorPresent(17) ifAlias(18) ifCounterDiscontinuityTime(19)

ifNumber(1)

interfaces(2)

ifEntry(1)

ifTable(2)

mib2(1.3.6.1.2.1)

ifXEntry(1)

ifXTable(1)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 502

ifMIBObjects(1)

ifMIB(31)

ifStackEntry(1)

ifStackTable(2)

ifStackHigherLayer(1) ifStackLowerLayer(2) ifStackStatus(3)

ifRcvAddressEntry(1)

ifRcvAddressTable(4)

ifRcvAddressAddress(1) ifRcvAddressStatus(2) ifRcvAddressType(3)

ifTableLastChange(5) ifStackLastChange(6)

Case Diagram of the ifTable of the IF-MIB


ifInUcastPkts + ifInNUcastPkts ifOutUcastPkts + ifOutNUcastPkts

ifInDiscards ifInUnknownProtos ifInErrors ifOutErrors ifOutDiscards

Case diagrams are used to explain relationships between counters

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 503

Case Diagrams

The total number of packets delivered to the next higher layer is: ifInUcastPkts + ifInNUcastPkts The total number of received packets from the interface is: (ifInUcastPkts + ifInNUcastPkts) + ifInDiscards + ifInUnknownProtos + ifInErrors

The total number of send packets over the interface is: (ifOutUcastPkts + ifOutNUcastPkts) - ifOutErrors ifOutDiscards

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 504

Physical and Logical Entities


UML Diagram for the ENTITY-MIB (RFC 2737).

<<smi mib class>>

entLogicalEntry
-entLogicalIndex: Integer32 {index} +entLogicalDescr: SnmpAdminString +entLogicalType: AutonomousType +entLogicalCommunity: OctetString +entLogicalTAddress: TAddress +entLogicalTDomain: TDomain +entLogicalContextEngineID: SnmpEngineIdOrNone +entLogicalContextName: SnmpAdminString 0..n

<<smi mib class>>

entLPMappingEntry
-entLogicalIndex: Integer32 {index} +entLPPhysicalIndex: PhysicalIndex {index}

accessible via

<<smi mib 0..n class>>

extends

<<smi mib class>>

entPhysicalEntry
-entPhysicalIndex: PhysicalIndex {index} +entPhysicalDescr: SnmpAdminString +entPhysicalVendorType: AutonomousType +entPhysicalContainedIn: Integer32 +entPhysicalClass: PhysicalClass +entPhysicalParentRelPos: Integer32 +entPhysicalName: SnmpAdminString +entPhysicalHardwareRev: SnmpAdminString +entPhysicalFirmwareRev: SnmpAdminString +entPhysicalSoftwareRev: SnmpAdminString +entPhysicalSerialNum: SnmpAdminString +entPhysicalMfgName: SnmpAdminString +entPhysicalModelName: SnmpAdminString +entPhysicalAlias: SnmpAdminString +entPhysicalAssetID: SnmpAdminString +entPhysicalIsFRU: TruthValue container 1

entAliasMappingEntry
-entPhysicalIndex: PhysicalIndex {index} -entAliasLogicalIndexOrZero: Integer32 {index} +entAliasMappingIdentifier: RowPointer

<<smi mib class>>

entityGeneral
+entLastChangeTime: TimeStamp element 0..n

<<smi mib class>>

entPhysicalContainsEntry
-entPhysicalIndex: PhysicalIndex {index} +entPhysicalChildIndex: PhysicalIndex {index}

is contained in

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 505

IP/ICMP Version 4 MIB Module (RFC 2011)

ipAdEntAddr(1) ipAdEntIfIndex(2) ipAdEntNetMask(3) ipAdEntBcastAddr(4) ipAdEntReasmMaxSize(5)

ipForwarding(1) ipDefaultTTL(2) ipInReceives(3) ipInHdrErrors(4) ipInAddrErrors(5) ipForwDatagrams(6) ipInUnknownProtos(7) ipInDiscards(8) ipInDelivers(9) ipOutRequests(10) ipOutDiscards(11) ipOutNoRoutes(12) ipReasmTimeout(13) ipReasmReqds(14) ipReasmOKs(15) ipReasmFails(16) ipFragOKs(17) ipFragFails(18) ipFragCreates(19) ipAddrTahle(20) ipNetToMediaTable(22) ipRoutingDiscards(23)

ipAddrEntry(1) ipNetToMediaEntry(1)

ip(4) mib-2(1.3.6.1.2.1)

ipNetToMediaIfIndex(1) ipNetToMediaPhysAddress(2) ipNetToMediaNetAddress(3) ipNetToMediaType(4)

icmpInMsgs(1) icmpInErrors(2) icmpInDestUnreachs(3) icmpInTimeExcds(4) icmpInParmProbs(5) icmpInSrcQuenchs(6) icmpInRedirects(7) icmpInEchos(8) icmpInEchoReps(9) icmpInTimestamps(10) icmpInTimestampReps(11) icmpInAddrMasks(12) icmpInAddrMaskReps(13) icmpOutMsgs(14) icmpOutErrors(15) icmpOutDestUnreachs(16) icmpOutTimeExcds(17) icmpOutParmProbs(18) icmpOutSrcQuenchs(19) icmpOutRedirects(20) icmpOutEchos(21) icmpOutEchoReps(22) icmpOutTimestamps(23) icmpOutTimestampReps(24) icmpOutAddrMasks(25) icmpOutAddrMaskReps(26)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 506

icmp(5)

Case Diagram of the ip Group


ipInDelivers ipInUnknownProtos ipInDiscards ipReasmOKs ipReasmFails ipReasmReqds ipForwDatagrams ipInAddrErrors ipInHdrErrors ipInReceives ipOutNoRoutes ipOutDiscards ipFragOKs ipFragFails ipFragCreates ipOutRequests

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 507

= The IP layer MIBs will soon be replaced by MIBs that support IPv4 and IPv6.

IP Forwarding MIB Module (RFC 2096)

ipCidrRouteNumber(3)

mib-2(1.3.6.1.2.1)

ipForward(24)

ipCidrRouteDest(1) ipCidrRouteMask(2) ipCidrRouteTos(3) ipCidrRouteNextHop(4) ipCidrRouteIfIndex(5) ipCidrRouteType(6) ipCidrRouteProto(7) ipCidrRouteAge(8) ipCidrRouteInfo(9) ipCidrRouteNextHopAS(10) ipCidrRouteMetric1(11) ipCidrRouteMetric2(12) ipCidrRouteMetric3(13) ipCidrRouteMetric4(14) ipCidrRouteMetric5(15) ipCidrRouteStatus(16)

ip(4)

ipCidrRouteEntry(1)

ipCidrRouteTable(4)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 508

References
[1] K. McCloghrie and F. Kastenholz. The Interfaces Group MIB. RFC 2863, Cisco Systems, Argon Networks, June 2000. [2] K. McCloghrie and G. Hanson. The Inverted Stack Table Extension to the Interfaces Group MIB. RFC 2864, Cisco Systems, ADC Telecommunications, June 2000. [3] J. D. Case and C. Patridge. Case diagrams: A rst step to diagrammed management information bases. Computer Communication Review, 19(1), January 1989. [4] K. McCloghrie and A. Bierman. Entity MIB (Version 2). RFC 2737, Cisco Systems, December 1999. [5] K. McCloghrie. SNMPv2 Management Information Base for the Internet Protocol using SMIv2. RFC 2011, Cisco Systems, November 1996. [6] F. Baker. IP Forwarding Table MIB. RFC 2096, Cisco Systems, January 1997. [7] J. Schnwlder and A. Mller. Reverse Engineering Internet MIBs. In Proc. 7th IFIP/IEEE International Symposium on Integrated Network Management, Seattle, May 2001.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 509

SNMP Version 3

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 510

SNMP Version 3
Architectural Concepts Protocol Operations Message Format Authentication and Privacy Authorization and Access Control Remote Conguration Status and Limitations

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 511

Architectural Concepts (RFC 3411)


Command Generator Command Responder Notification Originator Notification Receiver Proxy Forwarder

SNMP Applications

Dispatcher

Message Processing Subsystem

Security Subsystem

Access Control Subsystem

SNMP Engine (identified by snmpEngineID) SNMP Entity

Fine grained SNMP applications instead of coarse grained agents and managers. Exactly on engine per SNMP entity and exactly one dispatcher per SNMP engine. Every abstract subsystem may have of one or more concrete models. Modularization enables incremental enhancements.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 512

SNMP Contexts

An context is a collection of management information accessible by an SNMP entity. SNMP entities may have access to multiple contexts. Identical management information may exist in more than one context. Within a management domain, a managed object is uniquely identied by: 1. the identication of the engine within the SNMP entity (e.g., 800007e580e16e1566696f7440) 2. the context name within the SNMP entity (e.g., board1) 3. the managed object type (e.g., IF-MIB.ifDescr) 4. the instance identier (e.g., 1)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 513

Manager and Agent in the SNMP Architecture


Traditional Agent

MIB Instrumentation

Traditional Manager
Access Control Subsystem Command Generator Notification Receiver Notification Originator Command Responder View-based Access Control Notification Originator Proxy Forwarder

Message Processing PDU Dispatcher v1MP Subsystem

Security Subsystem PDU Dispatcher Community Security Model

Message Processing Subsystem v1MP

Security Subsystem

Community Security Model

v2cMP Message Dispatcher v3MP Other Transport Mappings other MP Security Model Transport Mappings User-based Security Model Message Dispatcher

v2cMP User-based Security Model v3MP Other other MP Security Model

UDP

IPX

UDP

IPX

Communication Network

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 514

SNMPv3/USM Textual Conventions

SnmpEngineID Unique identication of an SNMP engine within a management domain. SnmpSecurityModel Identication of a specic security model. SnmpMessageProcessingModel Identication of a specic message processing model. The message processing model is encoded in the msgVersion.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 515

SNMPv3/USM Textual Conventions

SnmpSecurityLevel The security level of a given message (noAuthNoPriv, authNoPriv, authPriv). The security level is encoded in the msgFlags. KeyChange Denes a cryptographic algorithm to change authentication or encryption keys. Does not require encryption. An attacker can drill forward once a key is broken.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 516

Protocol Operations (RFC 3416)


Command Generator Get Response Command Responder Command Generator GetNext Trap Response Command Responder Notification Originator Notification Receiver

Command Generator Set

Command Responder

Command Generator GetBulk

Command Responder

Notification Originator Inform

Notification Receiver

Response

Response

Response

An additional Report protocol operation is used internally for error notications, engine discovery and clock synchronization.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 517

Lexicographic Ordering

Given are two vectors of natural numbers x = (x1 , . . . , xn ) and y = (y1 , . . . , ym ) with n m. We say that x is lexicographically less than y if and only if one of the following conditions is true: (a) xj = yj for 1 j k and xk < yk with k n and k m (b) xj = yj for 1 j n and n < m All OIDs identifying instances can be lexicographically ordered. The SNMP protocol operates only on the lexicographically ordered list of MIB instances and not on the OID registration tree or on conceptual tables.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 518

Simple Forwarding Table Example


(1)

address (1)

info (2)

fwdTable (3)

name (1)

uptime (2)

fwdEntry (1)

fwdIndex (1) 1 2 1 3 5 8 7 9 2 3 4 5 6

fwdDest (2) 2 3 5 7 8 9

fwdNext (3) 2 3 2 2 3 3

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 519

Lexicographic Ordering Example

Lexicographic ordered list of MIB instances:


OID 1.1.0 1.2.1.0 1.2.2.0 1.3.1.1.1 1.3.1.1.2 1.3.1.1.3 1.3.1.1.4 1.3.1.1.5 1.3.1.1.6 1.3.1.2.1 1.3.1.2.2 name address.0 name.0 uptime.0 fwdIndex.1 fwdIndex.2 fwdIndex.3 fwdIndex.4 fwdIndex.5 fwdIndex.6 fwdDest.1 fwdDest.2 value 10.1.2.1 "ACME Router" 54321 1 2 3 4 5 6 2 3 OID 1.3.1.2.3 1.3.1.2.4 1.3.1.2.5 1.3.1.2.6 1.3.1.3.1 1.3.1.3.2 1.3.1.3.3 1.3.1.3.4 1.3.1.3.5 1.3.1.3.6 name fwdDest.3 fwdDest.4 fwdDest.5 fwdDest.6 fwdNext.1 fwdNext.2 fwdNext.3 fwdNext.4 fwdNext.5 fwdNext.6 value 5 7 8 9 2 3 2 2 3 3

Conceptual table instances are ordered column by column not row by row.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 520

PDU Processing Errors


An error response signals the complete failure of the corresponding request. An error response contains an error status (numeric error code) and an error index (position in the variable list where the error occured). Error responses contain no useful management information. There is only a single error status and error index even if there are multiple errors. An error in general implies that none of the actions has taken place during a write operation (as if simultaneous writes).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 521

PDU Error Codes (RFC 3416)


SNMPv3 Error Code noError(0) tooBig(1) noSuchName(2) badValue(3) readOnly(4) genErr(5) X X X Get/GetNext/GetBulk X X Set X X Trap/Inform X X SNMPv1 Error Code noError(0) tooBig(1) noSuchName(2) badValue(3) readOnly(4) genErr(5)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 522

PDU Error Codes (RFC 3416)


SNMPv3 Error Code noAccess(6) wrongType(7) wrongLength(8) wrongEncoding(9) wrongValue(10) noCreation(11) inconsistentValue(12) resourceUnavailable(13) commitFailed(14) undoFailed(15) authorizationError(16) notWritable(17) inconsistentName(18) X Get/GetNext/GetBulk Set X X X X X X X X X X X X X X Trap/Inform SNMPv1 Error Code noSuchName(2) badValue(3) badValue(3) badValue(3) badValue(3) noSuchName(2) badValue(3) genErr(5) genErr(5) genErr(5) noSuchName(2) noSuchName(2) noSuchName(2)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 523

PDU Processing Exceptions


A response can contain per variable binding exceptions. One or more exceptions in a response are not considered to be an error condition of the corresponding request. A response with exceptions still contains useful management information. Applications receiving response messages must check the error code, must detect exceptions, and they must deal with them gracefully. Not all applications get this right...

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 524

PDU Exceptions (RFC 3416)


SNMPv3 Exception noSuchObject noSuchInstance endOfMibView Get X X X GetNext/GetBulk SNMPv1 Error Status noSuchName(2) noSuchName(2) noSuchName(2)

The noSuchInstance exceptions indicates that a particular instances does not exist, but that other instances of the object type can exist. The noSuchObject exception indicates that a certain object type is not available. This distinctions allows smart applications to adapt to the capabilities of a particular command responder implementation.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 525

Get Operation (RFC 3416)


Command Generator Get Response Command Responder

The Get operation is used to read one or more MIB variables. Possible error codes: tooBig, genErr Possible exceptions: noSuchObject, noSuchInstance

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 526

Example Get Operations


1. Get(1.1.0) Response(noError@0, 1.1.0=10.1.2.1) Get(1.2.0) Response(noError@0, 1.2.0=noSuchObject) Get(1.1.1) Response(noError@0, 1.1.1=noSuchInstance) Get(1.1.0, 1.2.2.0) Response(noError@0, 1.1.0=10.1.2.1, 1.2.2.0=54321) Get(1.3.1.1.4, 1.3.1.3.4) Response(noError@0, 1.3.1.1.4=4, 1.3.1.3.4=2) Get(1.1.0, 1.1.1) Response(noError@0, 1.1.0=10.1.2.1, 1.1.1=noSuchInstance)

2.

3.

4.

5.

6.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 527

GetNext Operation (RFC 3416)


Command Generator GetNext Response Command Responder

The GetNext operation allows to retrieve the value of the next existing MIB instances in lexicographic order. Successive GetNext operations can be used to walk the MIB instances without prior knowledge about the MIB structure. Possible error codes: tooBig, genErr Possible exceptions: endOfMibView
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 528

Example GetNext Operations


1. GetNext(1.1.0) Response(noError@0, 1.2.1.0="ACME Router") GetNext(1.2.1.0) Response(noError@0, 1.2.2.0=54321) GetNext(1.1) Response(noError@0, 1.1.0=10.1.2.1) GetNext(1.3.1.1.1) Response(noError@0, 1.3.1.1.2=2) GetNext(1.3.1.1.6) Response(noError@0, 1.3.1.2.1=2) GetNext(1.3.1.1.1, 1.3.1.2.1, 1.3.1.3.1) Response(noError@0, 1.3.1.1.2=2, 1.3.1.2.2=3, 1.3.1.3.2=3)

2.

3.

4.

5.

6.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 529

GetBulk Operation (RFC 3416)


Command Generator GetBulk Response Command Responder

The GetBulk operation is a generalization of the GetNext operation where the agent performs a series of GetNext operations internally. The GetBulk operation like all the other protocol operations operates only on the lexicographically ordered list of MIB instances and does therefore not respect conceptual table boundaries.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 530

GetBulk Operation (RFC 3416)

GetBulk processing details: The rst N elements (non-repeaters) of the varbind list will be processed similar to the GetNext operation. The remaining R elements of the varbind list are repeatedly processed similar to the GetNext operation. The parameter M (max-repetitions) denes the upper bound of repetitions. The manager usually does not know how to choose a value for max-repetitions. If max-repetitions is too small, the potential gain will be small. If it is too large, there might be a costly overshoot.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 531

Example GetBulk Operations


1. GetBulk(non-repeaters=0, max-repetitions=4, 1.1) Response(noError@0, 1.1.0=10.1.2.1, 1.2.1.0="ACME Router", 1.2.2.0=54321, 1.3.1.1.1=1) GetBulk(non-repeaters=1, max-repetitions=2, 1.2.2, 1.3.1.1, 1.3.1.2, 1.3.1.3) Response(noError@0, 1.2.2.0=54321, 1.3.1.1.1=1, 1.3.1.2.1=2, 1.3.1.3.1=2, 1.3.1.1.2=2, 1.3.1.2.2=3, 1.3.1.3.2=3)

2.

The non-repeaters are typically used to retrieve a discontinuity indicating scalars, such as sysUpTime.0. Any ideas for a better GetBulk operation?

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 532

Set Operation (RFC 3416)


Command Generator Set Response Command Responder

The Set operation allows to modify a set of MIB instances. The operation is atomic (either all instances are modied or none). Possible error codes: wrongValue, wrongEncoding, wrongType, wrongLength, inconsistentValue, noAccess, notWritable, noCreation, inconsistentName, resourceUnavailable, commitFailed, undoFailed
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 533

Example Set Operations


1. Set(1.2.1.0="Moo Router") Response(noError@0, 1.2.1.0="Moo Router") Set(1.1.0="foo.bar.com") Response(badValue@1, 1.1.0="foo.bar.com") Set(1.1.1=10.2.3.4) Response(noSuchName@1, 1.1.1=10.2.3.4) Set(1.2.1.0="Moo Router", 1.1.0="foo.bar.com") Response(badValue@2, 1.2.1.0="Moo Router", 1.1.0="foo.bar.com") Set(1.3.1.1.7=7, 1.3.1.2.7=2, 1.3.1.3.7=3) Response(noError@0, 1.3.1.1.7=7, 1.3.1.2.7=2, 1.3.1.3.7=3)

2.

3.

4.

5.

The error codes authorizationError and readOnly are not used. No support for object type specic error codes.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 534

Trap Operation (RFC 3416)


Notification Originator Trap Notification Receiver

The Trap operation is used to notify a manager of the occurance of an event. The Trap operation is unconrmed: The sending agent does not know whether the trap was received and processed by a manager. All trap specic information in encoded in the varbind list (sysUpTime, snmpTrapOID, snmpTrapEnterprise).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 535

Inform Operation (RFC 3416)


Notification Originator Inform Response Notification Receiver

The Inform operation is a conrmed trap. The contents of the varbind list of an Inform operation is similar to that of a Trap operation. The reception of an Inform operation is conrmed by a response message from the notication receiver. Conrmation indicates that the notication was delivered, not that the notication was understood.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 536

Message Format (RFC 3412, RFC 3414)


msgVersion msgID msgMaxSize msgFlags msgSecurityModel msgAuthoritativeEngineID msgAuthoritativeEngineBoots scope of authentication msgAuthoritativeEngineTime msgUserName msgAuthenticationParameters msgPrivacyParameters contextEngineID contextName scope of encryption request-id error-status / non-repeaters error-index / max-repetitions protocol operation (PDU) context selector (scope) security parameters (USM) message header (SNMPv3)

msgVersion identies the message processing model. msgSecurityModel identies the security model. contextEngineID and contextName determine the context.

variable-bindings

protocol operation type (and version) is determined by the tag of the PDU

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 537

Classes of Protocol Operations (RFC 3411)

The processing of a message depends on the class of the embedded protocol operation:
Class Read Write Response Notication Internal Conrmed Unconrmed Description PDUs that retrieve management information. PDUs which attempt to modify management information. PDUs which are sent in response to a request. PDUs which transmit event notications. PDUs exchanged internally between SNMP engines. PDUs which cause the receiver to send a response. PDUs which are not acknowledged.

PDU classes support the introduction of new protocol operations without changes the core specications. However, no indication of the set of protocol operations supported by an SNMP engine implementation.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 538

Encoding of SNMPv3/USM Messages


0x30 - sequence

tag len

SNMPv3Message

0x02 - integer

0x30 - sequence

0x04 - octet string

0x30 or 0x04 - sequence or octet string

tag len msgVersion

tag len

msgGlobalData

tag len

msgSecurityParameters

tag len

msgData

0x02 - integer

0x02 - integer

0x04 - octet string

0x02 - integer

0x30 - sequence

tag len

msgID

tag len

msgMaxSize

tag len

msgFlags

tag len msgSecurityModel

tag len

UsmSecurityParameters

0x04 - octet string

0x02 - integer

0x02 - integer

0x04 - octet string

0x04 - octet string

0x04 - octet string

tag len msgAuthEngineID

tag len msgAuthEngBoots

tag len msgAuthEngTime

tag len msgUserName

tag len msgAuthParam

tag len msgPrivParam

0x04 - octet string

0x04 - octet string

depends on PDU type

tag len contextEngineID

tag len

contextName

tag len

PDU

0x02 - integer

0x02 - integer

0x02 - integer

0x30 - sequence

tag len

request-id

tag len

error-status / non-repeaters

tag len error-index / max-repetitions

tag len variable-bindings

0x30 - sequence

0x30 - sequence

tag len

VarBind

tag len

VarBind

0x08 - object identifier

depends on type of value

0x08 - object identifier

depends on type of value

tag len

name

tag len value / exception

tag len

name

tag len value / exception

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 539

Security Issues

The following questions must be answered in order to decide whether an operation should be performed or not: 1. Is the message specifying an operation authentic? 2. Who requested the operation to be performed? 3. What objects are accessed in the operation? 4. What are the rights of the requester with regard to the objects of the operation? 1 and 2 are answered by message security mechanisms (authentication and privacy). 3 and 4 are answered by authorization mechanisms (access control).

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 540

Authentication and Privacy (RFC 3414)

Protection against the following threads: 1. Modication of Information (Unauthorized modication of in-transit SNMP messages.) 2. Masquerade (Unauthorized users attempting to use the identity of authorized users.) 3. Disclosure (Protection against eavesdropping on the exchanges between SNMP entities.) 4. Message Stream Modication (Re-ordered, delayed or replayed messages to effect unauthorized operations.)

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 541

USM Security Services (RFC 3414)

Data Integrity Data has not been altered or destroyed in an unauthorized manner. Data sequences have not been altered to an extent greater than can occur non-maliciously. Data Origin Authentication The claimed identity of the user on whose behalf received data was originated is corroborated. Data Condentiality Information is not made available or disclosed to unauthorized individuals, entities, or processes.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 542

USM Security Services (RFC 3414)

Message Timeliness and Limited Replay Protection A message whose generation time is outside of a time window is not accepted. Message reordering is not dealt with and can occur in normal conditions too. No protection against Denial of Service attacks Too hard of a problem to solve. No protection against Trafc Analysis attacks Many management trafc patterns are predictable. Hiding periodic management trafc would be extremly costly.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 543

Data Integrity and Data Origin Authentication


Sender Receiver Key Data Key Data

Hash-Function

Hash-Function

MAC

MAC =?

User

MAC

Data

User

MAC

Data

Cryptographic strong oneway hash functions generate message authentication codes (MACs). The MAC ensures integrity, the symmetric key provides for authentication. USM currently uses HMAC-MD5-96 or HMAC-SHA-96. Other hash functions may be added in the future.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 544

Data Condentiality
Sender Receiver Key Data Key Data

DES (CBC)

DES (CBC)

User

Encrypted Data

User

Encrypted Data

Optional encryption of the ScopedPDU using symmetric but localized keys. USM currently uses CBC-DES. Other encryption functions may be added in the future. Encryption is CPU expensive use only when needed.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 545

Message Timeliness and Replay Protection


Authoritative Engine Nonauthoritative Engine Boots Time Boots latestRecvTime

Time

Lifetime

Authoritative Clock Time Window

valid ?

EngineID

Timestamp

Data

EngineID

Timestamp

Data

A non-authoritative engine maintains a notion of the time at the authoritative engine. A non-authoritative engine keeps track when the last authentic message was received from a given engine. A message is accepted and considered fresh if it falls within a time window.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 546

Generating Keys from Passwords

Algorithmic transformation of a human readable password into a cryptographic key: Produce a string S of length 220 = 1048576 bytes by repeating the password as many times as necessary. Compute the users key KU using either KU = M D5(S) or KU = SHA(S). Slows down naive brute force password attacks. No serious barrier for an attacker with a transformed dictionary.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 547

Localized Keys

Algorithmic transformation of the users key KU and an engine identication E into a localized key: For a given engine E , compute either KU E = M D5(KU , E, KU ) or KU E = SHA(KU , E, KU ). Advantage: A compromised key does not give access to other SNMP engines. Very important in environments where devices can easily be stolen or accessed physically by attackers.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 548

Key Changes

Key change procedure (initiator): 1. Generate a random value r from a random number generator. 2. Compute d = M D5(Kold, r) or d = SHA(Kold , r). 3. Compute = d Knew and send (, r). Key change procedure (receiver): 1. Compute d = M D5(Kold, r) or d = SHA(Kold , r). 2. Compute Knew = d . The receiver computes the correct new key since d = d (d Knew ) = Knew .

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 549

Key Change Properties


Key changes must be possible without encryption since encryption is optional. An attacker who is able to catch all key updates can calculate the current keys once an old key has been broken. Attackers thus get an unlimited amount of time to break keys if they can catch all key change requests.

Use encryption for key changes if at all possible!

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 550

Authoritative Engine

Either the sender or the receiver of a message is designated the authoritative engine. The receiver is authoritative if the message contains a conrmed class PDU. The sender is authoritative if the message contains an unconrmed class PDU. The determination whether a message is recent is made relative to the authoritative engine.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 551

Timeliness Checks (Authoritative Receiver)

A message is outside the time window if any of the following holds true: 1. snmpEngineBoots = 231 1 2. msgAuthoritativeEngineBoots = snmpEngineBoots 3. abs(msgAuthoritativeEngineTime snmpEngineTime) > 150 seconds

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 552

Timeliness Checks (Non-authoritative Receiver)

A message is outside the time window if any of the following is true: 1. snmpEngineBoots = 231 1 2. msgAuthoritativeEngineBoots < snmpEngineBoots 3. msgAuthoritativeEngineBoots = snmpEngineBoots and msgAuthoritativeEngineTime < snmpEngineTime 150

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 553

Clock Synchronization

For each remote authoritative SNMP engine, an SNMP engine maintains: snmpEngineBoots, snmpEngineTime and latestReceivedEngineTime Time synchronization only occurs if the message is authentic. An update occurs, if at least one of the following conditions is true: 1. msgAuthoritativeEngineBoots > snmpEngineBoots 2. msgAuthoritativeEngineBoots = snmpEngineBoots and msgAuthoritativeEngineTime > latestReceivedEngineTime
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 554

Discovery and Initial Synchronization


The engine identication is needed to compute localized keys and to keep clock information for authoritative engines. An SNMP engine can learn the engine identication by sending a noAuthNoPriv request with a zero-length msgAuthoritativeEngineID. The receiver returns a Report PDU with the real msgAuthoritativeEngineID. Similarly, (initial) clock synchronization happens by sending an authentic request and receiving a Report PDU with the authoritative time.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 555

USM MIB (RFC 3414)


The usmUserTable maps USM user names to securityNames. New entries may be created by cloning existing entries (together with their keys). The usmUserAuthKeyChange and usmUserPrivKeyChange objects may be used by the security administrator to change the users keys. The usmUserOwnAuthKeyChange and usmUserOwnPrivKeyChange objects may be used by the user to change his keys.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 556

Authorization and Access Control (RFC 3415)


securityModel who securityName where contextName securityModel how securityLevel why what which viewType object type variableName object instance yes/no viewName groupName

Three different securityLevels: noAuthNoPriv, authNoPriv, authPriv A securityName is a security model independent name for a principal.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 557

View-based Access Control (RFC 3415)


1.1.2.1.*.1

11 00 11 00
1 1

11 00 11 00
1

1 0 1 0
1

1.2.1.2.*

11 00 1 0 11 11 1 1 1 00 00 0 0 0 11 00
2 3 1 2 1 2 1

1 0 1 0 1 0 1 0

2 1 1

1 11 0 00 1 1 1 11 11 0 0 0 00 00 11 00
2 3 1 2 1 2 1

1 0

1 0 1 0 1 0 1 0

2 1

A view subtree is a set of managed object instances with a common OID prex. A view tree family is the combination of an OID prex with a bit mask (wildcarding of OID prex components). A view is an ordered set of view tree families. Access control rights are dened by a read view, write view or notify view.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 558

View-based Access Control MIB (RFC 3415)


smi mib class

vacmSecurityToGroupEntry
-vacmSecurityModel: SnmpSecurityModel {index} -vacmSecurityName: SnmpAdminString {index} +vacmGroupName: SnmpAdminString +vacmSecurityToGroupStorageType: StorageType +vacmSecurityToGroupStatus: RowStatus 0..* smi mib class

vacmContextEntry
+vacmContextName: SnmpAdminString {index}

groupMemberRights readView 1 smi mib class writeView notifyView 0..* 0..* smi mib class 0..*

vacmAccessEntry
+vacmGroupName: SnmpAdminString {index} -vacmAccessContextPrefix: SnmpAdminString {index} -vacmAccessSecurityModel: SnmpSecurityModel {index} -vacmAccessSecurityLevel: SnmpSecurityLevel {index} +vacmAccessContextMatch: Enumeration +vacmAccessReadViewName: SnmpAdminString +vacmAccessWriteViewName: SnmpAdminString +vacmAccessNotifyViewName: SnmpAdminString +vacmAccessStorageType: StorageType +vacmAccessStatus: RowStatus

vacmViewTreeFamilyEntry
0..* 0..* 0..* +vacmViewSpinLock: TestAndIncr -vacmViewTreeFamilyViewName: SnmpAdminString {index} -vacmViewTreeFamilySubtree: ObjectIdentifier {index} +vacmViewTreeFamilyMask: OctetString +vacmViewTreeFamilyType: Enumeration +vacmViewTreeFamilyStorageType: StorageType +vacmViewTreeFamilyStatus: RowStatus

A security name (with a given security level) can not be a member of multiple groups. The vacmViewTreeFamilyType can be used to include or exclude a view tree family. The context table is kind of degenerated.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 559

Remote Conguration (RFC 3413)


smi mib class smi mib class smi mib class

snmpEngine
+snmpEngineID: SnmpEngineID +snmpEngineBoots: Integer32 +snmpEngineTime: Integer32 +snmpEngineMaxMessageSize: Integer32

snmpMPDStats
+snmpUnknownSecurityModels: Counter32 +snmpInvalidMsgs: Counter32 +snmpUnknownPDUHandlers: Counter32

snmpTargetObjects
+snmpUnavailableContexts: Counter32 +snmpUnknownContexts: Counter32

usesParameters 1 smi mib class

snmpTargetAddrEntry
+snmpTargetSpinLock: TestAndIncr -snmpTargetAddrName: SnmpAdminString {index} +snmpTargetAddrTDomain: TDomain +snmpTargetAddrTAddress: TAddress +snmpTargetAddrTimeout: TimeInterval +snmpTargetAddrRetryCount: Integer32 +snmpTargetAddrTagList: SnmpTagList +snmpTargetAddrParams: SnmpAdminString +snmpTargetAddrStorageType: StorageType +snmpTargetAddrRowStatus: RowStatus 0..* selectsTargets 0..* smi mib class 0..1 smi mib class

0..* smi mib class

snmpTargetParamsEntry
-snmpTargetParamsName: SnmpAdminString {index} +snmpTargetParamsMPModel: SnmpMessageProcessingModel +snmpTargetParamsSecurityModel: SnmpSecurityModel +snmpTargetParamsSecurityName: SnmpAdminString +snmpTargetParamsSecurityLevel: SnmpSecurityLevel +snmpTargetParamsStorageType: StorageType +snmpTargetParamsRowStatus: RowStatus 1 associatedProfile

smi mib class

snmpNotifyFilterEntry
+snmpNotifyFilterProfileName: SnmpAdminString {index} -snmpNotifyFilterSubtree: ObjectIdentifier {index} +snmpNotifyFilterMask: OctetString +snmpNotifyFilterType: Enumeration +snmpNotifyFilterStorageType: StorageType +snmpNotifyFilterRowStatus: RowStatus 0..* usesFilters

snmpNotifyEntry
-snmpNotifyName: SnmpAdminString {index} +snmpNotifyTag: SnmpTagValue +snmpNotifyType: Enumeration +snmpNotifyStorageType: StorageType +snmpNotifyRowStatus: RowStatus

snmpNotifyFilterProfileEntry
-snmpTargetParamsName: SnmpAdminString {index} +snmpNotifyFilterProfileName: SnmpAdminString +snmpNotifyFilterProfileStorType: StorageType +snmpNotifyFilterProfileRowStatus: RowStatus 1

SNMPv3 denes several MIB modules for remote conguration of SNMP entities.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 560

SNMPv3 Status and Limitations


Many implementations and products are available. Visit the SNMPv3 Web page for up-to-date information.
<http://www.ibr.cs.tu-bs.de/ietf/snmpv3/>

Some technology domains (e.g., cable modem industry in the US) require SNMPv3 support. However, general deployment happens much slower than originally expected. Manual conguration is an error prone and time consuming. Lack of integration in deployed AAA systems. Remote conguration and key management requires nontrivial applications.
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 561

SNMPv3 Status and Limitations


Missing extensibility for new base data types (e.g., Unsigned64). Missing extensibility for new protocol operations (e.g., GetRange). Limited exibility in VACM grouping rules. Asymmetries between notication ltering and VACM ltering. Strength of USM security (DES versus AES, key change procedure). Initial key assignment problematic (no standardized Dife-Helman exchange, no integration with other key management systems).
slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 562

References
[1] J. Case, R. Mundy, D. Partain, and B. Stewart. Introduction and Applicability Statements for Internet Standard Management Framework. RFC 3410, SNMP Research, Network Associates Laboratories, Ericsson, December 2002. [2] W. Stallings. SNMP, SNMPv2, SNMPv3, and RMON 1 and 2. Addison-Wesley, 3 edition, 1999. [3] D. Zeltserman. A Practical Guide to SNMPv3 and Network Management. Prentice Hall, 1999. [4] U. Blumenthal and B. Wijnen. Security Features of SNMPv3. Simple Times, 5(1), December 1997. [5] D. Black, K. McCloghrie, and J. Schnwlder. Uniform Resource Identier (URI) Scheme for the Simple Network Management Protocol (SNMP). RFC 4088, EMC Corporation, Cisco Systems, International University Bremen, June 2005.

slides.tex Networks and Protocols 2006 Jurgen Schonwalder 9/10/2006 8:18 p. 563

Das könnte Ihnen auch gefallen