Part 2:

Deep Packet Inspection Tutorial

Hendrik Schulze
ipoque
hendrik.schulze@ipoque.com

Wednesday, December 19, 12

Tutorial Scope

What is DPI?
Definition
Applications
Technical Motivation

How does DPI work?


Basic operations
Open source examples

Skype and DPI

Application
Network behavior
Implications for DPI and analysis


What is DPI?

Deep Packet Inspection (DPI) is a form of computer network
packet filtering that examines the data part (and possibly also the
header) of a packet as it passes an inspection point, searching for
protocol non-compliance, viruses, spam, intrusions or predefined
criteria to decide if the packet can pass or if it needs to be routed
to a different destination, or for the purpose of collecting
statistical information.
[Wikipedia 02/2012]

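The first half of this definition (examining a packet as it passes an inspection point) is what a DPI engine does, but the purposes listed in the second half are applications!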


What is DPI?
Modified Definition:
Deep packet inspection (DPI) analyses all data of a packet
(headers and payload) as it passes an inspection point in order to
determine the protocol and/or application transported.
DPI provides meta-data of network traffic
Meta-data is the foundation of DPI applications

DPI is different from decoding
it does not retrieve the data transferred
but: the term DPI is often also used for decoding


Three Use-cases of DPI

Protocol Decoding (Full Payload area Analysis, FPA)
Meta-data Extraction (Predetermined Payload area Analysis, PPA)
Protocol/Application Classification


What are DPI applications I?

Applications that use Protocol Classification


Network Operators
Billing
Tiered Services
Bandwidth Optimization
Policy Enforcement

Security
NG Firewalls: Allow/Block Applications and Protocols
Virus scan only in sensitive traffic

Network Probing
Statistics
Traffic Interception
Test and Measurement

What are DPI applications II?

Applications that use Meta-data extraction


Network Operators
Billing
Tiered Services
Policy Enforcement

Security
NG Firewalls: prevent security relevant actions

Network Probing
Statistics
Traffic Interception
Quality Measurement


What are DPI applications III?

Applications that use Protocol Decoding


Network Probing
Statistics
Traffic Interception/Investigation
Test and Measurement


Protocol/Application Classification

What problems does DPI solve?

Network convergence creates a technical challenge
Different applications have different requirements
The Internet is practically a single service network

The Internet does not have a reliable means of application identification
Differentiated application/protocol handling requires reliable
identification


A single service network?


[Figure: protocol encapsulation across the stack (L7, TCP/UDP, IP, Link, PHY). The application data gains an L7 header (L7 message), then a TCP/UDP header (TCP/UDP message), then an IP header (IP datagram), then an L2 header (L2 frame).]

Application/Protocol Classification?

By IANA-assigned port numbers (well known ports)
http://www.iana.org/assignments/port-numbers
e.g. 80 HTTP, 20 FTP data, 21 FTP control, 22 SSH, 25 SMTP
1214 Kazaa, 4662 eMule, 6881-6999 BitTorrent, 9001 Tor, pretty much any port: Skype

Used in most firewall systems
+ Easy and fast look-up in real-time
- Many protocols and applications no longer adhere to the standard
- some protocols, such as P2P, deliberately try to avoid fixed ports
- there is no guarantee that the well known ports are used
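
The port-based look-up described above can be sketched in a few lines of C. This is only an illustration: the table holds the example ports from the slide, and the names `enum proto`, `port_range` and `classify_by_port` are hypothetical. The payload is never inspected, which is why the result is only a guess.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Protocols named on the slide; the enum itself is hypothetical. */
enum proto {
    PROTO_UNKNOWN, PROTO_FTP_DATA, PROTO_FTP_CONTROL, PROTO_SSH, PROTO_SMTP,
    PROTO_HTTP, PROTO_KAZAA, PROTO_EMULE, PROTO_BITTORRENT, PROTO_TOR
};

struct port_range {
    uint16_t first;   /* first port of the range (inclusive) */
    uint16_t last;    /* last port of the range (inclusive)  */
    enum proto proto; /* protocol assumed for this range     */
};

static const struct port_range port_table[] = {
    {20, 20, PROTO_FTP_DATA}, {21, 21, PROTO_FTP_CONTROL},
    {22, 22, PROTO_SSH},      {25, 25, PROTO_SMTP},
    {80, 80, PROTO_HTTP},     {1214, 1214, PROTO_KAZAA},
    {4662, 4662, PROTO_EMULE}, {6881, 6999, PROTO_BITTORRENT},
    {9001, 9001, PROTO_TOR},
};

/* Guess the protocol from the destination port alone. */
enum proto classify_by_port(uint16_t dst_port)
{
    size_t n = sizeof(port_table) / sizeof(port_table[0]);
    for (size_t i = 0; i < n; i++) {
        assert(port_table[i].first <= port_table[i].last); /* table sanity */
        if (dst_port >= port_table[i].first && dst_port <= port_table[i].last)
            return port_table[i].proto;
    }
    return PROTO_UNKNOWN;
}
```

A Skype flow that happens to use port 80 would be labelled `PROTO_HTTP` by this function, which is exactly the weakness listed above.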


The Failure of Port-Based Classification


The only way to understand L7 is to look at L7!

[Figure: protocol encapsulation diagram: the L7 message (L7 header and application data) sits inside the TCP/UDP message, which sits inside the IP datagram, which sits inside the L2 frame.]

How does DPI work?

What's behind the buzzword?
It is very simple: use all the information the packet holds to
classify it.
it is not really that deep ;-) usually <1500 bytes
look for protocol-specific patterns in the packet payload
track state across several packets of a flow


Flow Tracking

First basic DPI operation
Common concept in network security equipment
e.g. Firewalls

Determines which packets belong to a communication between two computers (flow)
Based on the 5-tuple flow identifier (SRC IP, DST IP, SRC port, DST port, L4 protocol)
required to determine a flow's protocol/application
+ speed-up for system performance
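
A minimal sketch of a 5-tuple flow identifier in C follows. It assumes IPv4 addresses held as 32-bit integers; all names are hypothetical and not taken from any product. The endpoints are stored in a fixed order so that both directions of a connection produce the same key, and the FNV-1a hash stands in for whatever hash a real flow table would use.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct flow_key {
    uint32_t ip_a, ip_b;     /* the two endpoint addresses, ordered */
    uint16_t port_a, port_b; /* port_a belongs to ip_a, port_b to ip_b */
    uint8_t  l4_proto;       /* e.g. 6 = TCP, 17 = UDP */
};

/* Build a direction-independent key from a packet's 5-tuple. */
struct flow_key flow_key_make(uint32_t src_ip, uint32_t dst_ip,
                              uint16_t src_port, uint16_t dst_port,
                              uint8_t l4_proto)
{
    struct flow_key k;
    int swap = src_ip > dst_ip || (src_ip == dst_ip && src_port > dst_port);
    k.ip_a = swap ? dst_ip : src_ip;
    k.ip_b = swap ? src_ip : dst_ip;
    k.port_a = swap ? dst_port : src_port;
    k.port_b = swap ? src_port : dst_port;
    k.l4_proto = l4_proto;
    return k;
}

/* Mix the four bytes of v into an FNV-1a hash state. */
static uint32_t fnv1a_u32(uint32_t h, uint32_t v)
{
    for (int i = 0; i < 4; i++) {
        h ^= (v >> (8 * i)) & 0xffu;
        h *= 16777619u;
    }
    return h;
}

/* Hash used to pick a bucket in a flow table. */
uint32_t flow_key_hash(const struct flow_key *k)
{
    assert(k != NULL);
    uint32_t h = 2166136261u;
    h = fnv1a_u32(h, k->ip_a);
    h = fnv1a_u32(h, k->ip_b);
    h = fnv1a_u32(h, ((uint32_t)k->port_a << 16) | k->port_b);
    h = fnv1a_u32(h, k->l4_proto);
    return h;
}

int flow_key_equal(const struct flow_key *x, const struct flow_key *y)
{
    return x->ip_a == y->ip_a && x->ip_b == y->ip_b &&
           x->port_a == y->port_a && x->port_b == y->port_b &&
           x->l4_proto == y->l4_proto;
}
```

Once a flow has been classified, the verdict can be stored under this key, so later packets of the same flow need only a table look-up. That is the speed-up mentioned above.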


Pattern matching

Simple pattern matching
[Figure: a single packet containing the pattern XXX]
Basic flow tracking needed

Pattern matching over multiple packets
[Figure: three consecutive packets of one flow containing the patterns XXX, YYY and ZZZ]
Flow tracking mandatory
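
Matching over multiple packets needs per-flow state, which is why flow tracking becomes mandatory. The sketch below assumes a made-up three-step signature in which consecutive packets of a flow start with "XXX", "YYY" and "ZZZ", as in the figure. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical signature: packet 1 starts with "XXX", packet 2 with
 * "YYY", packet 3 with "ZZZ". */
static const char *const steps[] = {"XXX", "YYY", "ZZZ"};
#define NUM_STEPS (sizeof(steps) / sizeof(steps[0]))

enum verdict { UNDECIDED, MATCHED, EXCLUDED };

/* State kept per flow between packets. */
struct flow_state {
    size_t next_step;     /* index of the pattern expected next */
    enum verdict verdict; /* final once it leaves UNDECIDED     */
};

/* Feed one packet payload of a flow into the matcher. */
enum verdict inspect_packet(struct flow_state *fs,
                            const unsigned char *payload, size_t len)
{
    assert(fs != NULL);
    if (fs->verdict != UNDECIDED)
        return fs->verdict; /* flow already decided, nothing to do */

    size_t plen = strlen(steps[fs->next_step]);
    if (len < plen || memcmp(payload, steps[fs->next_step], plen) != 0)
        fs->verdict = EXCLUDED;       /* sequence broken */
    else if (++fs->next_step == NUM_STEPS)
        fs->verdict = MATCHED;        /* all steps seen in order */
    return fs->verdict;
}
```

A flow state starts as `{0, UNDECIDED}` and would normally live in the flow table entry of the flow it belongs to.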

Behavioral Analysis

Pattern matching over multiple packets
[Figure: a flow drawn as a sequence of short and long packets; the pattern called out is "three short packets".]

Pattern Matching

Second basic DPI operation
Search for strings, numbers at certain positions
usually several patterns for each protocol

Can be done with specialized hardware
e.g. Cavium, RMI
+ speed-up for pattern and regular expression matching
- bus connection of accelerator cards is the bottleneck


Example I: RegEx Patterns


L7-filter
http://l7-filter.sourceforge.net
Based on Netfilter
http://netfilter.org
Linux packet filtering framework (i.e. firewall)
configured with iptables

Uses regular expressions to match patterns
+ user-extensible
- inefficient, slow

Example: HTTP & BitTorrent
http/(0\.9|1\.0|1\.1) [1-5][0-9][0-9] [\x09-\x0d -~]*(connection:|content-type:|content-length:|date:)|post [\x09-\x0d -~]* http/[01]\.[019]
^(\x13bittorrent protocol|azver\x01$|get /scrape\?info_hash=)|d1:ad2:id20:|\x08'7P\)[RP]

How L7-Filter Works


int match(packet, protocol)
{
    if(regular expression for the protocol is not compiled yet)
        compile it and put it in a list of compiled regexps;
    else
        fetch the compiled pattern from the list;

    if(already classified this connection)
        if(classification matches one we're looking for)
            return true;
        else
            return false;

    if(seen too many packets with no match)
        return false;

    Append application layer data to data buffer;

    if(data buffer matches regexp for the protocol we're looking for)
        Mark the connection as identified;
        return true;
    else
        return false;
}
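
The pseudocode above can be approximated in user space with the POSIX regex API. This is a sketch under several assumptions and is not L7-filter's actual code: the pattern is compiled by the caller rather than lazily, only one protocol is tested, and `MAX_PACKETS`, `BUF_SIZE`, `struct conn` and `l7_match` are hypothetical. Because the buffer is handed to `regexec` as a C string, an embedded NUL byte would end the match early, which a real engine has to handle.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>
#include <string.h>

#define MAX_PACKETS 10 /* give up after this many packets (assumed value) */
#define BUF_SIZE 2048  /* per-connection buffer size (assumed value)      */

/* State kept per connection. */
struct conn {
    char buf[BUF_SIZE]; /* accumulated application layer data */
    size_t len;         /* bytes currently in buf             */
    unsigned packets;   /* packets inspected so far           */
    int identified;     /* set once the pattern has matched   */
};

/* Returns 1 if the connection matches the compiled pattern, else 0. */
int l7_match(struct conn *c, const regex_t *re, const char *data, size_t n)
{
    assert(c != NULL && re != NULL && data != NULL);
    if (c->identified)
        return 1; /* already classified this connection */
    if (c->packets >= MAX_PACKETS)
        return 0; /* seen too many packets with no match */
    c->packets++;

    /* Append the new data, keeping room for the terminating NUL. */
    size_t room = BUF_SIZE - 1 - c->len;
    if (n > room)
        n = room;
    memcpy(c->buf + c->len, data, n);
    c->len += n;
    c->buf[c->len] = '\0';

    if (regexec(re, c->buf, 0, NULL, 0) == 0) {
        c->identified = 1; /* mark the connection as identified */
        return 1;
    }
    return 0;
}
```

A caller would compile the pattern once with `regcomp(&re, pattern, REG_EXTENDED | REG_ICASE | REG_NOSUB)`, start each connection from a zeroed `struct conn`, and release the pattern with `regfree(&re)` when done.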


Example II: Hard-Coded Patterns


Example from ipoque's PACE (Protocol and Application Classification Engine)
Uses hard-coded patterns instead of regular expressions
faster, but less flexible
Example: BitTorrent

/* fragment: assumes the caller has checked that payload is long enough */

/* test for match 0x13+"BitTorrent protocol" */
if (payload[0] == 0x13)
{
    if (memcmp(payload+1, "BitTorrent protocol", 19) == 0) return (IPP2P_BIT * 100);
}

/* get tracker commands, all start with GET /
 * then it can follow: scrape | announce
 * and then ?info_hash=
 */
if (memcmp(payload,"GET /",5) == 0)
{
    /* message scrape */
    if ( memcmp(payload+5,"scrape?info_hash=",17)==0 ) return (IPP2P_BIT * 100 + 1);
    /* message announce */
    if ( memcmp(payload+5,"announce?info_hash=",19)==0 ) return (IPP2P_BIT * 100 + 2);
}


Behavioral Analysis
Third basic DPI operation
Pattern matching impossible for encrypted traffic
Instead, look at unencrypted patterns:
Packet sizes
Packet size sequences
Data rates
Packet rates
Number of concurrent flows
Flow arrival rate
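
The per-flow features listed above can be collected without reading any payload bytes. The sketch below keeps packet and byte counters, derives an average packet rate, and watches for one made-up size-sequence signature: a long packet followed by three short ones. The 100-byte threshold and all names are assumptions for illustration, not values from the slides.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SHORT_MAX 100 /* payloads up to this size count as "short" (assumed) */

struct flow_stats {
    uint64_t packets;   /* packets seen in this flow                 */
    uint64_t bytes;     /* payload bytes seen in this flow           */
    double first_ts;    /* timestamp of the first packet, in seconds */
    double last_ts;     /* timestamp of the latest packet            */
    unsigned short_run; /* consecutive short packets after a long one */
    int saw_long;       /* a long packet has been seen               */
    int signature_hit;  /* long packet followed by three short ones  */
};

/* Update the statistics with one packet; only sizes and times are used. */
void flow_stats_update(struct flow_stats *s, size_t payload_len, double ts)
{
    assert(s != NULL);
    if (s->packets == 0)
        s->first_ts = ts;
    s->last_ts = ts;
    s->packets++;
    s->bytes += payload_len;

    if (payload_len > SHORT_MAX) {
        s->saw_long = 1;
        s->short_run = 0; /* a long packet restarts the run */
    } else if (s->saw_long && ++s->short_run == 3) {
        s->signature_hit = 1;
    }
}

/* Average packets per second over the observed lifetime of the flow. */
double flow_packet_rate(const struct flow_stats *s)
{
    assert(s != NULL);
    double dt = s->last_ts - s->first_ts;
    return dt > 0.0 ? (double)s->packets / dt : 0.0;
}
```

Counting concurrent flows and flow arrival rates per host would need a second structure keyed by IP address, which is left out here.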


Combining weak patterns

If A matches --> 80% hit
If B matches --> 90% hit
Question: what is the probability of a hit when both A and B match?
(assuming the two patterns err independently of each other)
= 100% - (100% - 80%) x (100% - 90%)
= 100% - 20% x 10%
= 98%
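
The calculation generalizes to any number of weak patterns. The helper below is a small sketch with a hypothetical name. It assumes, as the slide's arithmetic implicitly does, that the patterns fail independently, and computes one minus the product of the individual miss probabilities.

```c
#include <assert.h>
#include <stddef.h>

/* Combined hit probability when all n weak patterns matched, assuming
 * independent errors: 1 - (1 - p[0]) * (1 - p[1]) * ... */
double combined_hit_probability(const double *p, size_t n)
{
    assert(p != NULL || n == 0);
    double miss = 1.0;
    for (size_t i = 0; i < n; i++)
        miss *= 1.0 - p[i];
    return 1.0 - miss;
}
```

With the slide's values, `{0.8, 0.9}`, the result is 0.98 up to floating-point rounding. If the patterns are correlated, the real figure will be lower than this estimate.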


False positives vs. false negatives

[Figure: probability of misclassification plotted against signature strictness, ranging from "too strict" to "too loose", with one curve for false negatives and one for false positives.]

Example: Skype

Skype is optimized to work under many network conditions
Difficult to detect
Requires behavioral analysis

Very difficult to block

Literature:
An Experimental Study of the Skype Peer-to-Peer VoIP System, Saikat Guha (Cornell University), Neil Daswani, Ravi Jain (Google), 2/2006
Silver Needle in the Skype, Philippe Biondi, Fabrice Desclaux, EADS, Black Hat Europe 2006, 3/2006
Vanilla Skype (part 1+2), Fabrice Desclaux, Kostya Kortchinsky, EADS, RECON2006, 6/2006
An Analysis of the Skype Peer-to-Peer Internet Telephony Protocol, Salman A. Baset and Henning Schulzrinne, Columbia University, 1/2006


Popularity
>500 million registered users by end of 2011
~30 million users simultaneously online
First half of 2010
88.4 billion Skype-to-Skype call minutes
6.4 billion minutes of calls to landlines and mobiles
40% video calls

High diurnal usage variations

40-50% more users during working hours


25% more users on weekdays

Longer call duration compared to PSTN


PSTN: 3 minutes
Skype: 27 minutes
most likely because Skype calls are free


Technical Basics

Peer-to-peer (P2P) network architecture


supernode architecture similar to the KaZaa FastTrack protocol

Very easy to use


Works in almost any network environment on almost any
operating system
Advanced obfuscation techniques
both in the code and the network traffic

Generates traffic even when idle


Skype in the Network

P2P architecture
Uses UDP and TCP, both for signaling and communication
No fixed ports
a UDP port is randomly selected at installation time and used for all
UDP data
HTTP and HTTPS ports (80 & 443) can be used

Works behind firewalls and NAT gateways


Skype & Firewalls

Penetrates most firewall systems


there is nothing firewalls can evaluate (such as port numbers, payload
patterns)

Works behind NAT gateways

uses NAT hole punching techniques similar to STUN and TURN


only requires a single connection to a supernode initiated by the client
to be fully operational


Supernodes

Supernodes (SN)
implement the Global Index, the Skype user directory
essential for the proper operation

Relay nodes (RN)

call forwarding for clients behind NAT gateways

Differentiation between SN and RN not clear
invisible network infrastructure

Every client with a public IP and sufficient resources can become a SN or RN
this can only be disabled for the latest Windows client by tweaking the
Registry
easier to become a RN


DPI Technical Challenges

No well-known ports or server IP addresses


Bit patterns insufficient for detection
nearly everything is encrypted or encoded
known bit patterns only cover some flows, not all
for instance, if Skype uses port 443, it mimics a valid HTTPS connection setup
patterns change from version to version

If one flow is blocked, Skype tries something else (e.g. a different port, a different transport protocol)
successful blocking requires more than just initial detection


Detection

Behavioral analysis requires tracking of as many TCP and UDP flows of a single node as possible

Different flow patterns for TCP and UDP
Flow patterns, or signatures
absolute and relative packet sizes
flow count and arrival rate

Different connection mechanisms require different signatures
Regular signature updates necessary
signatures change often
major changes in Skype 3.0
Skype closely monitors and counters detection efforts


Skype Versions Behave Differently


Skype v2.0.0.43
Skype v2.0.0.63
Skype v2.0.0.69
Skype v2.0.0.73
Skype v2.0.0.76
Skype v2.0.0.79
Skype v2.0.0.81
Skype v2.0.0.90
Skype v2.0.0.97
Skype v2.0.0.103
Skype v2.0.0.105
Skype v2.0.0.107
Skype v2.5.0.72
Skype v2.5.0.82
Skype v2.5.0.91
Skype v2.5.0.113
Skype v2.5.0.122
Skype v2.5.0.126
Skype v2.5.0.130
Skype v2.5.0.137
Skype v2.5.0.141
Skype v2.5.0.146

Skype v2.5.0.151
Skype v2.5.0.154
Skype v2.6.0.67
Skype v2.6.0.74
Skype v2.6.0.81
Skype v2.6.0.97
Skype v2.6.0.103
Skype v2.6.0.105
Skype v3.0.0.106
Skype v3.0.0.123
Skype v3.0.0.137
Skype v3.0.0.154
Skype v3.0.0.190
Skype v3.0.0.198
Skype v3.0.0.205
Skype v3.0.0.209
Skype v3.0.0.214
Skype v3.0.0.216
Skype v3.0.0.217
Skype v3.0.0.218
Skype v3.1.0.112
Skype v3.1.0.144

Skype v3.1.0.150
Skype v3.1.0.152
Skype v3.2.0.53
Skype v3.2.0.63
Skype v3.2.0.82
Skype v3.2.0.115
Skype v3.2.0.145
Skype v3.2.0.148
Skype v3.2.0.152
Skype v3.2.0.158
Skype v3.2.0.163
Skype v3.2.0.175
Skype v3.5.0.107
Skype v3.5.0.158
Skype v3.5.0.178
Skype v3.5.0.202
Skype v3.5.0.214
Skype v3.5.0.229
Skype v3.5.0.234
Skype v3.5.0.239
Skype v3.6.0.127
Skype v3.6.0.159


Skype Detection: Limitations of Behavioral Analysis

The connection setup is detected
existing connections cannot be detected;
a call may use an existing connection and succeed

It takes some packets to detect a new connection
a call may appear to succeed, but no conversation will be possible
the contact list may become available periodically

High percentage of flows/packets of the Skype node need to be visible


Possible Infrastructure Obstacles

The detection engine must see all inbound and outbound packets of a Skype client
no asymmetric routing
no other device blocking Skype packets
location of interception point in network topology

Unusual network conditions
many IPs behind a single NAT
unusual NAT behavior


What is the challenge in application classification?

The Internet is constantly changing
Finding patterns is (normally) easy
even for encrypted traffic

Maintaining patterns generates the pain
check for side effects (false positives)
check for updates (false negatives)

Meta-data Extraction

Protocol specific meta-data

[Figure: a packet payload with the application signature (XXX) followed by application meta-data fields (123, abc)]

Application meta-data is normally located at predetermined payload areas
The location is defined by the network protocol

[Figure: client applications send a request to a web server and receive a response]

Request:
GET /en/home/index.html HTTP/1.1
Host: www.ipoque.com
User-Agent: Mozilla/5.0 ...
[...]

Response:
HTTP/1.1 200 OK
Date: Sun, 12 Feb 2012 10:37:02 GMT
Server: Apache/2.2.14 (Ubuntu)
Last-Modified: Sun, 12 Feb 2012 10:37:02 +0000
Content-Language: en
Content-Type: text/html; charset=utf-8
[...]
<html>
[ html web site description ]
</html>
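
For HTTP, the predetermined payload areas are the header fields of the request and response. As a sketch of meta-data extraction, the function below pulls the value of the `Host:` header out of a request payload. The function name is hypothetical, and the sketch is simplified: it matches the header name case-sensitively and does not stop at the end of the header block, both of which a real extractor would need to handle.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Copy the value of the first "Host:" header into out (NUL-terminated).
 * Returns 1 on success, 0 if no Host header is found or out is too small. */
int http_extract_host(const char *payload, size_t len,
                      char *out, size_t out_size)
{
    static const char name[] = "\r\nHost:";
    const size_t name_len = sizeof(name) - 1;
    assert(payload != NULL && out != NULL);

    for (size_t i = 0; i + name_len <= len; i++) {
        if (memcmp(payload + i, name, name_len) != 0)
            continue;
        /* Skip optional whitespace after the colon. */
        size_t start = i + name_len;
        while (start < len && (payload[start] == ' ' || payload[start] == '\t'))
            start++;
        /* The value runs up to the end of the line. */
        size_t end = start;
        while (end < len && payload[end] != '\r' && payload[end] != '\n')
            end++;
        if (end - start + 1 > out_size)
            return 0; /* value plus NUL does not fit */
        memcpy(out, payload + start, end - start);
        out[end - start] = '\0';
        return 1;
    }
    return 0;
}
```

Applied to the request shown above, this yields `www.ipoque.com`. The same approach works for `User-Agent`, `Server` or `Content-Type`.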

Flow Correlation

Control and Data Channel

Control Channel

Data Channel


VoIP via SIP/RTP

[Figure: two endpoints connected by a SIP control channel and an RTP data channel]

SIP Invite
INVITE sip:bob@biloxi.example.com SIP/2.0
[..]
From: Alice <sip:alice@atlanta.example.com>;tag=9fxced76sl
To: Bob <sip:bob@biloxi.example.com>
Call-ID: 3848276298220188511@atlanta.example.com
CSeq: 1 INVITE
Contact: <sip:alice@client.atlanta.example.com;transport=tcp>
Content-Type: application/sdp
Content-Length: 151

v=0
o=alice 2890844526 2890844526 IN IP4 client.atlanta.example.com
s=-
c=IN IP4 192.0.2.101
t=0 0
m=audio 49172 RTP/AVP 0
a=rtpmap:0 PCMU/8000

DPI: Flow Correlation
[Figure: the DPI engine links the RTP data flow to the SIP control flow]
Thank you!
Hendrik Schulze
hendrik.schulze@ipoque.com

ISS World IP Tutorial

