Survivors Final Project Report

DETECTION AND PREVENTION OF DISTRIBUTED
DENIAL OF SERVICE (DDoS) ATTACKS

TEAM: SURVIVORS
EE 533 PROJECT REPORT

USC - VITERBI SCHOOL OF ENGINEERING | USC
ANKITA PATIDAR
HARI KRISHNA VETSA
JASMEET KAUR
JIACHANG GE
KAMALPREET KAUR
KUSHAL REDDY
MADHURI N MURTHY
CONTENTS
ABSTRACT
1. PROBLEM STATEMENT & SOLUTION
1.1 Problem Statement
1.2 Proposed solution
1.2.1 Volumetric DDoS
1.2.2 IP Spoofing
2. BACKGROUND AND RELATED WORK
3. HARDWARE SYSTEM OVERVIEW
3.1 DETERLAB ENVIRONMENT SYSTEM
3.2 Network topology
4. Implementation details
4.1 Function performed by the hardware
4.2 Block diagram
4.3 Decryption
4.4 Encryption
4.5 Check sum Calculation
4.6 Hardware Accelerator
Bloom Filter
4.6.1 Initialization
4.6.2 Query
4.7 Scripts to assist the hardware
4.7.1 C compiler
4.7.2 Instruction Set Architecture
4.8 RSA
REFERENCES
ABSTRACT
Malicious hackers launch thousands of distributed denial of service (DDoS) and web application
attacks each day. Application attacks can steal valuable data, and the damage can be
irreparable.
Among the network security threats that the Internet faces today, denial of service attacks
are among the most severe. It is essential to have a secure channel to communicate data between
nodes. Since there is no clear solution to this problem, it has troubled security architects for
over a decade. We propose a solution that mitigates DoS attacks by bombarding the bad node
itself with large packets (ping of death).
We have implemented a dual-core, quad-threaded processor with DDoS detection and
prevention. Our processor is based on a RISC architecture with a 5-stage pipeline. Nodes are
classified into authorized and unauthorized nodes. During the initialization phase, the IP and
MAC table is populated with the IP addresses of the nodes and the corresponding MAC addresses.
When packets arrive, the hardware accelerator (which holds the IP and MAC table) is checked to
see whether the packets come from an authorized client or an IP-spoofed client. If packets are
received from authorized nodes, they are passed on to the output queues; otherwise the packet
is re-routed to a dump node. If packets are received from unauthorized nodes, i.e. if the IP is
not part of the IP-MAC table (hardware accelerator), they are dropped. In addition, even if
the IP address matches, the MAC address is also checked; if the IP and MAC addresses
do not match, the packet is dropped, since it is considered to be IP spoofed.
1. PROBLEM STATEMENT & SOLUTION

The section below describes the problem statement and the proposed solution.
1.1 Problem Statement

A Distributed Denial of Service (DDoS) attack is one of the major concerns in the networking
industry. Such an attack makes a regular service unavailable by flooding the server. DDoS attacks
target a wide variety of resources such as banks, news websites, and gaming servers, and hence make
it difficult for people to access and publish important information.
DDoS attacks come in various forms: Smurf, Teardrop, Ping of Death, TCP connection attacks,
volumetric attacks, and fragmentation attacks, to name a few.
An open-source web service, Digital Attack Map, provides a live view of the DDoS
attacks happening around the world on a daily basis.
As shown below, the DDoS activity for just a single day (May 6, 2016) is enormous. The orange
dots indicate volumetric DDoS attacks, which form the majority, followed by application attacks
(blue dots) and fragmentation attacks (green dots).
Figure 1: The various DDoS attacks across the globe on March 26th, 2016. (Source: Digital-attack-map)
Because volumetric DDoS attacks are so prevalent, we concentrate on creating a
defense mechanism that identifies these attacks and protects the server from being overloaded
and eventually crashing.
1.1.1 Volumetric DDoS

Packets received from a trusted source (authorized client) are checked for malicious patterns
using our intrusion detection system. The malicious packets are dropped while the other packets
are forwarded to the destination.
When packets are received from an untrusted source (unauthorized client), in addition to
checking them for a malicious pattern, we also update a counter (exclusive to each node). If more
than a pre-defined number of packets is received (the counter reaches a threshold), that node
is permanently denied service. Any further packets from the node are left unprocessed and sent
back to the node, followed by a set of voluminous packets intended to crash the node (Ping of
Death). This prevents any further attacks from it on the control node and thereby increases the
throughput of the control node (NetFPGA).
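The per-node counter described above can be sketched in software as follows. This is only a minimal illustration in Python, not the NetFPGA logic itself; the class name `NodeCounter`, the method `admit`, and the threshold value are hypothetical, since the report does not state the actual threshold.

```python
from collections import defaultdict


class NodeCounter:
    """Per-source packet counter with a permanent block once a threshold is exceeded."""

    def __init__(self, threshold=100):  # threshold value is an assumption
        self.threshold = threshold
        self.counts = defaultdict(int)  # one counter per source node
        self.blocked = set()            # nodes permanently denied service

    def admit(self, src_ip):
        """Return True if a packet from src_ip may still be processed."""
        if src_ip in self.blocked:
            return False
        self.counts[src_ip] += 1
        if self.counts[src_ip] > self.threshold:
            # More than the pre-defined number of packets: deny service from now on.
            self.blocked.add(src_ip)
            return False
        return True
```

With `threshold=3`, the first three packets from a node are admitted and every later packet from that node is refused, while other nodes are unaffected.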
1.1.2 IP Spoofing
Another problem that our hardware addresses is IP spoofing. This happens when
a bad client spoofs the IP address of another client (possibly an authorized client's IP
address) and sends bad packets to congest the processor. To identify this, we implement
a defense mechanism. For any packet received from either an authorized or an unauthorized
node, the MAC address and IP address are extracted and compared to the predefined data in
the LUT (look-up table) in the processor. If the values do not match, IP
spoofing is detected and the node is denied service.
1.1.3 Data Exchange

Any data exchange between one node and another requires a secure path. To
provide one, we exchange the symmetric keys using the RSA algorithm. A node first sends its
public key, along with its digital signature, to our control node (NetFPGA), which checks
the validity of the signature. Only then does the control node encrypt its own public key with
the public key received and send it back to the node. The two then exchange symmetric keys,
which are stored and used for all future communication between them.
1.2 Proposed solution

We implemented a hardware-based network security system that first classifies the
clients (nodes) as trusted or untrusted based on their IP address.
1.2.1 Flow diagram

The following is the flow diagram of our system implementation.
Figure 2: Flow Diagram
As shown above, the incoming packets are stored in a convertible FIFO, and the IP and MAC
addresses are extracted from the packet and compared with the MAC and IP LUT. If the IP
address of the source is not found in the IP-MAC table, the packet is dropped. If the IP and
MAC addresses do not match, the IP of the source is spoofed and the packet is a potential attack;
we therefore re-route such a packet to a dump node.
The whole implementation of the system is described in the subsequent sections.
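The decision flow described in this section can be summarized in a few lines of Python. This is a software sketch under stated assumptions, not the hardware design: the names `Verdict` and `classify`, the dictionary used as the IP-MAC LUT, and the example addresses are all hypothetical.

```python
from enum import Enum


class Verdict(Enum):
    FORWARD = "forward to output queue"
    DROP = "drop"
    REROUTE = "re-route to dump node"


def classify(src_ip, src_mac, ip_mac_lut):
    """Decide what to do with a packet given its source IP and MAC address."""
    expected_mac = ip_mac_lut.get(src_ip)
    if expected_mac is None:
        # Source IP is not in the IP-MAC table: unknown node.
        return Verdict.DROP
    if expected_mac.lower() != src_mac.lower():
        # Known IP but wrong MAC: treated as IP spoofing.
        return Verdict.REROUTE
    return Verdict.FORWARD


# Example table with made-up addresses, filled during initialization.
lut = {"10.1.1.2": "00:4e:46:32:43:00"}
```

For example, `classify("10.1.1.2", "00:4e:46:32:43:00", lut)` forwards the packet, the same IP with a different MAC is re-routed, and an IP missing from the table is dropped.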
2. BACKGROUND AND RELATED WORK

The figure below shows statistics related to DDoS attacks, which highlight why they are an
important concern in network security.
Figure 3: DDoS statistics: Source: stateoftheinternet.com
On a commercial scale, there are several ways to defend against DoS attacks:

1. Software approach: A simple approach to defend against DoS attacks is to use software scripts.
But these days, the attacks are far too complex to be stopped using software.
2. Internet Service Provider (ISP): An ISP has more bandwidth with which to provide DDoS mitigation.
However, ISP DDoS mitigation solutions only protect the ISP's own network links, not the other
links you might have. Cloud protection is also not provided.
3. Cloud Mitigation Provider: This provides DDoS mitigation from the cloud. The cloud can use
several machines to ensure that only clean data reaches us.
We use a hardware approach based on our 5-stage pipelined processors to prevent DDoS attacks.
This approach is faster than the first two approaches and cheaper than the third.
3. HARDWARE SYSTEM OVERVIEW
In the following section, we describe the hardware system in a top-down fashion, starting with
what happens at the top level and then explaining each module in detail.
3.1 DETERLAB ENVIRONMENT SYSTEM

To detect and mitigate DDoS attacks, the hardware system has been implemented on the
NetFPGA platform in the DeterLab environment. The NetFPGA reference router module was
extended to work as per our custom specifications, i.e. to filter out the packets from IP addresses
which are not part of the trusted sources. The router is modified to form a dual-core
quad-threaded pipelined datapath on the NetFPGA. The data memory (MEM stage) of the 5-stage
pipeline is implemented as a FIFO; the incoming packet is stored in this FIFO and is then
modified (decryption and checksum calculation) in the 5-stage pipeline. The dual-core
processor is accompanied by a hardware accelerator, an IP-MAC lookup table, which detects
IP-spoofing attacks.
a. NetFPGA
NetFPGA (Network Field Programmable Gate Array) is the major component of our hardware
system. It is a line-rate, flexible, open networking platform to develop open source hardware and
software for rapid prototyping of computer network devices. This allows users to develop designs
that are able to process packets at line-rate, a capability generally not afforded by software based
approaches.
The NetFPGA 1G specifications have been tabulated below:
Table 1: NetFPGA specifications
b. Top-level integration in DeterLab

A NetFPGA-1G implementation has eight 1 Gbps links (four MAC ports and four corresponding
host ports) and one data path pipeline for data from all eight ports. Figure 4 shows a detailed
block diagram of the OpenFlow-based switch implementation on the NetFPGA board. Our
hardware system is placed after the output port lookup module in the system.
Figure 4: Hardware System integrated in the DeterLab system
Our hardware system uses a dual-core quad-threaded processor to process the packet, to detect
any attacks, and, when an attack is detected, to re-route the packets that were initially
targeted at the server node.
Figure 5: NetFPGA hardware
3.2 Network topology

The following diagram shows the topology of the implemented system. To mimic a possible
DDoS attack, we launch an attack from the attacker node, targeted to disrupt the service
of the server. As shown below, there is a client which wishes to communicate with the server.
However, the attacker, which sends malicious packets continuously (TCP SYN attack), disrupts the
normal working of the server. The attacker also employs a bot (or slave) which helps it
generate SYN packets.
Figure 6: Topology of the implemented system
In normal communication, the server is able to serve the packets from a client and the CPU
utilization of the server is low. However, when an attacker, along with an army of bots, sends
attack messages to the server, the server is unable to service the genuine messages from the
client and crashes under the load of the attack messages. The CPU utilization serves as evidence
of this, as it grows very high when many SYN packets are sent. Unlike a normal three-way
TCP handshake, which uses SYN, SYN-ACK and ACK, the attacker and the bot
do not wait for the SYN acknowledgement; they simply keep sending SYN packets.
Eventually, the server is overloaded with a large number of unserved packets, its input
buffer fills up, and genuine packets are dropped.
4. IMPLEMENTATION DETAILS
We program our NetFPGA as a dual-core quad-threaded processor. Each core is a 5-stage
pipeline consisting of the IF-ID-EX-MEM-WB stages. The five-stage pipeline is explained in
section 4.2.3. The hardware router identifies malicious packets by checking whether the
source IP of the packet matches one of the IPs in the IP lookup table. Malicious packets are
dropped at the router itself, so they do not bombard our server. This
keeps the server's CPU utilization low, so that it is able to serve the genuine packets.
In the absence of the hardware solution, the server is bombarded with malicious packets, which
reduces the server throughput.
4.1 Function performed by the hardware

The functions performed by the hardware are described below. Whenever a packet
arrives, it is stored in the convertible FIFO, which then acts as a data memory while the packet
is checked against several attack parameters that decide whether the packet must be
dropped. Once the processor has analyzed the packet, it decides whether the packet is fit to be
routed to the server. If so, the packet is encrypted using a symmetric key (exchanged using the
RSA algorithm) and sent to the server, where the message is decrypted. If not, the packet is
dropped at the router.
4.1.1 Detecting malicious nodes

When a node's IP address is not trusted (not part of the topology), we classify that node as
a malicious node. The packets from a malicious node are dropped by our router, as they are not
packets intended for the server. To define the trusted nodes, we have an initialization
phase in which every node present in our topology sends packets. This way, we can
store the IP and MAC addresses of our trusted nodes.
4.1.2 IP Spoofing
When other nodes spoof the source IP so that a packet appears to arrive from a trusted IP,
we can still detect that it is a malicious packet based on the MAC address of the packet:
attackers generally spoof only the IP of the packet, and the MAC address remains that of the
malicious client itself. Such a packet is also dropped. Because our system is placed before the
destination node, all of this reduces the load on the server.
4.2 Block diagram

The following figure shows the block diagram of our hardware system. It extends the
NetFPGA router with a DropFIFO, a dual-core processor, and a hardware accelerator (IP
LUT). As soon as a packet arrives from a small FIFO, the source IP of the packet is compared with
the stored IPs in the IP-MAC LUT (hardware accelerator).
The various components used in the block diagram of the hardware are described below:
Figure 7: Hardware overview of the system
4.2.1 DropFIFO
DropFIFO is a convertible dual-port data memory/FIFO. It has 512 locations, each 64 bits wide.
When a packet arrives, it is first stored in the DropFIFO. Once the packet contents have been
written into the FIFO, the FIFO sends a done signal to the processor, after which the processor
starts modifying the packet.
After the packet is decoded by the pipeline and verified by the hardware accelerator, a match
signal is sent to the processor, signaling the DropFIFO whether to drop the packet or send it to
the output queue. The way the match signal is generated is described in the hardware
accelerator section.
Figure 8: Convertible DropFIFO
Sequence of operations in FIFO:

1. Store packet into DropFIFO (used as FIFO)
2. Feed the packet to the pipeline (used as data memory)
3. Send the packet to output queue (used as FIFO)
4.2.2 Processor
A dual-core quad-threaded processor is implemented for the hardware system. The following figure
shows the block diagram of the processor. Each core serves one packet at a time, and the De-Mux
distributes incoming packets between the two cores when it receives the signal from the cores.
Figure 9: Dual core quad-thread processor
When packets are sent to the output queue, the Mux selects the packet from one of the cores based
on the signals the cores raise when processing is completed.
4.2.3 5-stage pipeline

The stages of the 5-stage pipeline are described below.
Instruction Fetch (IF) STAGE

As the name suggests, the IF stage is the stage in which the instructions to be executed are
fetched. The instructions are stored in the instruction memory in this stage. At every increment
of the PC, a new instruction is dispatched into the pipeline. As this is a quad-threaded
processor, the instructions from the four threads are scheduled in a round-robin fashion.
To reduce dependencies, we use 4 threads for encryption. Each thread encrypts
a different memory location of the data memory, so the threads are independent of each other.
Each instruction is 32 bits wide, so the instruction memory is 256 locations
deep and 32 bits wide. The figure below shows the IF stage and the IF/ID stage register.
Figure 10: IF stage of the 5-stage pipeline
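The round-robin scheduling of the four threads can be illustrated with a short Python sketch. It only models the fetch order, assuming one fetch slot per cycle and a separate program counter per thread; the function name `fetch_order` and the toy instruction labels are hypothetical.

```python
def fetch_order(programs, cycles):
    """Return the (thread, instruction) pairs fetched over a number of cycles.

    programs: one instruction list per thread.
    Each cycle belongs to one thread, chosen in round-robin order.
    """
    pcs = [0] * len(programs)  # one program counter per thread
    issued = []
    for cycle in range(cycles):
        thread = cycle % len(programs)  # round-robin thread selection
        if pcs[thread] < len(programs[thread]):
            issued.append((thread, programs[thread][pcs[thread]]))
            pcs[thread] += 1
    return issued


# Four threads, two toy instructions each.
programs = [["a0", "a1"], ["b0", "b1"], ["c0", "c1"], ["d0", "d1"]]
order = fetch_order(programs, 8)
```

Because consecutive instructions in the pipeline then belong to different threads, back-to-back instructions of the same thread are separated by three other instructions, which is one way interleaving reduces dependencies between adjacent pipeline stages.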
Instruction Decode (ID) STAGE

The instructions from the IF stage enter the Instruction Decode stage, where, based on the
opcode field of the dispatched instruction, the corresponding control signals are generated and
the instruction is broken into several bit fields so that it can be executed correctly.
The register file is 32 locations deep and 64 bits wide. It is a dual-port register
bank with two read ports and one write port.
The opcode is taken as the input to the control unit, and based on the instruction to be executed
a set of control signals is generated by the control unit. These control signals are consumed by
other modules as and when necessary for the successful execution of the instruction.
Figure 11: ID stage of the 5-stage pipeline
Execute (EX) STAGE

The arithmetic and logic operations related to the instruction are done in this stage. This is
handled by the Arithmetic and Logic Unit (ALU) present in this stage. The processed data is then
forwarded to the next stage. The ALU also does the address calculation for the load and store
instructions.
Depending on the value of the aluctrl signal, the ALU operates on two 64-bit inputs and
produces the appropriate result. The ALU performs operations such as addition, subtraction,
bitwise OR, bitwise AND, Set Less Than (SLT), Shift Left Logical (SLL) and Set Equal.
Figure 12: EX stage of the 5-stage pipeline
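A behavioral model of such an ALU might look like the following Python sketch. The operation names used as `op` values, the signed interpretation of SLT, and the use of the low 6 bits of the second operand as the shift amount are assumptions made for illustration; the report does not give the actual aluctrl encoding.

```python
MASK64 = (1 << 64) - 1


def to_signed(value):
    """Interpret a 64-bit pattern as a two's-complement signed integer."""
    return value - (1 << 64) if value & (1 << 63) else value


def alu(op, a, b):
    """Apply one ALU operation to two 64-bit operands and return a 64-bit result."""
    a &= MASK64
    b &= MASK64
    if op == "ADD":
        return (a + b) & MASK64
    if op == "SUB":
        return (a - b) & MASK64
    if op == "OR":
        return a | b
    if op == "AND":
        return a & b
    if op == "SLT":  # set less than (signed comparison assumed)
        return 1 if to_signed(a) < to_signed(b) else 0
    if op == "SLL":  # shift left logical (shift amount = low 6 bits of b, assumed)
        return (a << (b & 63)) & MASK64
    if op == "SEQ":  # set equal
        return 1 if a == b else 0
    raise ValueError(f"unknown ALU operation: {op}")
```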
Memory (MEM) STAGE

The MEM stage consists of the data memory. The data memory is convertible: it can be used
as a FIFO for the network packets as well as data memory for the pipeline. The data memory
also has a drop mechanism: if a packet is IP spoofed, it is dropped
simply by moving the write pointer back to the start of the data memory. The data memory is
64 bits wide and 512 locations deep.
Figure 13: MEM stage of the 5-stage pipeline
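The convertible memory and its drop mechanism can be modeled in Python as below. This is a simplified sketch that assumes a single packet is buffered at a time and that the pointers do not wrap; the class and method names (`DropFIFO`, `push`, `load`, `store`, `drop`, `drain`) are hypothetical and do not come from the Verilog design.

```python
WORD_MASK = (1 << 64) - 1


class DropFIFO:
    """512 x 64-bit memory usable as a packet FIFO and as pipeline data memory."""

    def __init__(self, depth=512):
        self.depth = depth
        self.mem = [0] * depth
        self.wr_ptr = 0  # next free location in FIFO mode

    def push(self, word):
        """FIFO mode: store one incoming 64-bit packet word."""
        if self.wr_ptr == self.depth:
            raise OverflowError("DropFIFO is full")
        self.mem[self.wr_ptr] = word & WORD_MASK
        self.wr_ptr += 1

    def load(self, addr):
        """Data-memory mode: read a word (as a load instruction would)."""
        return self.mem[addr]

    def store(self, addr, word):
        """Data-memory mode: overwrite a word (as a store instruction would)."""
        self.mem[addr] = word & WORD_MASK

    def drop(self):
        """Drop the buffered packet by moving the write pointer back to the start."""
        self.wr_ptr = 0

    def drain(self):
        """FIFO mode: hand the buffered packet to the output queue."""
        words = self.mem[:self.wr_ptr]
        self.wr_ptr = 0
        return words
```

Note that `drop` does not clear the memory contents; the old words are simply overwritten by the next packet, which is why moving a single pointer is enough.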
Write-back (WB) STAGE

During the write-back stage, instructions write their results into the register file. A mux
selects between the memory output data and the ALU output data.
Figure 14: WB stage of the 5-stage pipeline
4.3 Decryption
The packet received at the NetFPGA is decrypted using the symmetric key for the
corresponding node. The symmetric key is obtained earlier through the RSA key exchange, which
is explained in section 4.8.
The packet is first decrypted using the XOR logic employed in the hardware (symmetric key
decryption). Likewise, once the packet is received at the destination node, it is decrypted
using the exchanged symmetric key.
4.4 Encryption
We load the first part of the instruction memory with the instructions needed to
encrypt the packet that is going to be sent to the server node. Once the checksum is
calculated and the packet is encrypted by XNORing it with the symmetric key, we update the
encrypted packet's header with the new checksum.
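The XNOR-based symmetric encryption can be illustrated on 64-bit words with the Python sketch below. The key and payload values are made up, and the function names are hypothetical. The sketch shows a general property of XNOR rather than the exact hardware datapath: applying XNOR with the same key twice returns the original word, so the same operation can serve for both directions.

```python
MASK64 = (1 << 64) - 1


def xnor64(word, key):
    """Bitwise XNOR of two 64-bit values."""
    return ~(word ^ key) & MASK64


def crypt_words(words, key):
    """Encrypt or decrypt a list of 64-bit words with a 64-bit symmetric key."""
    return [xnor64(w, key) for w in words]


key = 0x0123456789ABCDEF            # example key, not from the report
payload = [0xDEADBEEF00000001, 0x0]  # example packet words
cipher = crypt_words(payload, key)
recovered = crypt_words(cipher, key)
```

A plain XOR of the ciphertext with the key would instead yield the bitwise complement of the original word, so an XOR-based decryptor would need one extra inversion to recover the plaintext.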
4.5 Check sum Calculation

We load the second instruction memory with the instructions needed to calculate the
checksum of the encrypted packet forwarded to the server and of the packet that is rerouted due
to spoofing.
Once this is done, we send the packet to the output queue.
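The report does not state which checksum is computed. Assuming it is the standard Internet checksum (the 16-bit one's complement sum used in the IPv4 header), the calculation over 64-bit memory words can be sketched in Python as follows; the function names are hypothetical, and the header bytes are a commonly used textbook example, not data from this project.

```python
def split16(word64):
    """Split a 64-bit word into four 16-bit fields, most significant first."""
    return [(word64 >> shift) & 0xFFFF for shift in (48, 32, 16, 0)]


def internet_checksum(words64):
    """One's complement of the one's complement sum of all 16-bit fields."""
    total = 0
    for word in words64:
        for field in split16(word):
            total += field
    while total >> 16:  # fold carries back into the low 16 bits
        total = (total & 0xFFFF) + (total >> 16)
    return ~total & 0xFFFF


# 20-byte IPv4 header with the checksum field zeroed, padded with zeros to
# three 64-bit words (zero padding does not change the sum).
header = [0x4500007300004000, 0x40110000C0A80001, 0xC0A800C700000000]
checksum = internet_checksum(header)  # 0xB861
```

Splitting each 64-bit word into 16-bit pieces is the step that, in the hardware, is performed by dedicated instructions of the ISA rather than by shifts and masks.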
4.6 Hardware Accelerator

We also employ an additional hardware accelerator, which helps the core check whether a given
IP is part of the IP look-up table (LUT). If it is not part of the IP LUT, the packet is
considered malicious and is dropped. The hardware accelerator, which is a Bloom
filter, is described next.
Bloom Filter
A Bloom filter is a space-efficient probabilistic data structure that is used to test whether
an element is a member of a set. False positive matches are possible, but false negatives are not,
thus a Bloom filter has a 100% recall rate. In other words, a query returns either "possibly in set"
or "definitely not in set". Elements can be added to the set, but not removed. The more elements
that are added to the set, the larger the probability of false positives.
An empty Bloom filter is a bit array of m bits, all set to 0. There must also be k different hash
functions defined, each of which maps or hashes some set element to one of the m array
positions with a uniform random distribution. Typically, k is a constant, much smaller than m,
which is proportional to the number of elements to be added; the precise choice of k and the
constant of proportionality of m are determined by the intended false positive rate of the filter.
Here we chose k = 3 and m = 128.
To add an element, feed it to each of the 3 hash functions to get 3 array positions. Set the bits at
all these positions to 1, as shown below.
Figure 15: Bloom filter with three hash functions
To query for an element (test whether it is in the set), feed it to each of the 3 hash functions to
get 3 array positions. If any of the bits at these positions is 0, the element is definitely not in the
set; if it were, then all the bits would have been set to 1 when it was inserted. If all are 1, then
either the element is in the set, or the bits have by chance been set to 1 during the insertion of
other elements, resulting in a false positive.
Here we have a false positive rate of 3%, which we consider acceptable.
There are two stages in the Bloom filter:
1. Initialization
2. Query
4.6.1 Initialization
In the initialization phase we ping all the nodes in the network and store the IP and
MAC addresses of the corresponding nodes in the Bloom filter.
We concatenate the IP and MAC addresses of each node and apply all 3 hash
functions to the concatenated value. This gives 3 array positions, and the corresponding array
elements are set to 1.
4.6.2 Query
When a new packet comes in, we extract the IP and MAC addresses from the received packet
and send them to our Bloom filter.
The Bloom filter concatenates the received IP and MAC addresses and applies all three hash
functions to the concatenated value. This again gives 3 positions, which we use to access
the array elements and check whether they are set to 1.
If all the accessed array elements are 1, we treat the packet as coming from a genuine
node and send it to the encryption module, from which it is sent on to the destination node.
If any of the accessed array elements is not 1, the received packet is from an attacker, so we
drop the packet.
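The initialization and query stages can be sketched in Python as follows, using m = 128 and k = 3 as stated above. The report does not specify the three hash functions, so this sketch substitutes salted SHA-256 digests purely as a stand-in; the class name, the helper names, and the example addresses are hypothetical. The last function evaluates the usual approximation for the false positive probability, (1 - e^(-kn/m))^k, for n stored elements.

```python
import hashlib
import math

M_BITS = 128   # size of the bit array (m)
K_HASHES = 3   # number of hash functions (k)


def positions(ip, mac):
    """Map the concatenated IP and MAC address to K_HASHES array positions."""
    key = (ip + "|" + mac.lower()).encode()
    result = []
    for salt in range(K_HASHES):
        digest = hashlib.sha256(bytes([salt]) + key).digest()  # stand-in hash
        result.append(int.from_bytes(digest[:4], "big") % M_BITS)
    return result


class BloomFilter:
    def __init__(self):
        self.bits = 0  # M_BITS-bit array held in one integer

    def add(self, ip, mac):
        """Initialization stage: set the bits for one trusted node."""
        for pos in positions(ip, mac):
            self.bits |= 1 << pos

    def query(self, ip, mac):
        """Query stage: True means 'possibly trusted', False means 'definitely not'."""
        return all((self.bits >> pos) & 1 for pos in positions(ip, mac))


def false_positive_rate(n, m=M_BITS, k=K_HASHES):
    """Approximate false positive probability after inserting n elements."""
    return (1 - math.exp(-k * n / m)) ** k
```

Under this approximation, a rate of about 3% with m = 128 and k = 3 corresponds to roughly 16 stored address pairs; the report does not state how many nodes were actually stored.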
4.7 Scripts to assist the hardware

Several software scripts are required to assist the hardware with the required functionality. For
example, we write C code for the different functions to be executed from the instruction
memory. This C code needs to be converted to machine code to be stored in the
instruction memory. We created our own compiler to convert the C code to machine code.
In addition, we need several Perl scripts to read the hardware/software registers, read the
data memory of the processor, and implement the RSA encryption key exchange in software.
The description of the software modules is as follows:
4.7.1 C compiler
The C compiler is required to convert the C code we have written into machine
code. The compiler is implemented in two stages:
C to assembly conversion: We use the GNU C cross compiler to convert C code to MIPS assembly
code. This outputs a .s file, which can be converted to our instruction set's assembly code (which
is slightly different from the MIPS code). We have written a Perl script to convert the cross
compiler's assembly output to our ISA's assembly code. This creates another .s assembly file.
Assembly to instruction code conversion: After the assembly code is created, we convert the
assembly code to binary format (using a Perl script) so that the binary code can be fed into the
instruction memory. The assembly code conversion is based on the opcode and the registers to
be written and read.
4.7.2 Instruction Set Architecture

The Instruction Set Architecture (ISA) determines which instructions are used to perform the
relevant functions in the hardware. Some of the instructions are similar to those of the MIPS
architecture, but our ISA supports only a limited number of instructions, those that are
essential and sufficient for a network processor. The 32-bit instruction is divided into fields
as follows:
Field   Opcode    Rs        Rt        Rd        Imm field
Bits    [31:28]   [27:23]   [22:18]   [17:13]   [12:0]
Since 15 instructions are supported by the ISA, a 4-bit opcode field is used to specify the
instruction. There are 32 registers, so the Rs, Rt and Rd fields are 5 bits wide each.
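The field layout above can be exercised with a small Python encoder and decoder. This is an illustrative sketch only; the function names `encode` and `decode` are hypothetical and are not the project's Perl assembler.

```python
# (field name, lowest bit position, width in bits), following the layout above
FIELDS = (
    ("opcode", 28, 4),
    ("rs", 23, 5),
    ("rt", 18, 5),
    ("rd", 13, 5),
    ("imm", 0, 13),
)


def encode(opcode, rs=0, rt=0, rd=0, imm=0):
    """Pack the five fields into one 32-bit instruction word."""
    values = {"opcode": opcode, "rs": rs, "rt": rt, "rd": rd, "imm": imm}
    word = 0
    for name, shift, width in FIELDS:
        value = values[name]
        if not 0 <= value < (1 << width):
            raise ValueError(f"{name}={value} does not fit in {width} bits")
        word |= value << shift
    return word


def decode(word):
    """Unpack a 32-bit instruction word into its fields."""
    return {name: (word >> shift) & ((1 << width) - 1)
            for name, shift, width in FIELDS}


# Opcode 0000 with Rs=R1, Rt=R2, Rd=R3 and a zero immediate field.
word = encode(0b0000, rs=1, rt=2, rd=3)  # 0x00886000
```

The field widths add up to 4 + 5 + 5 + 5 + 13 = 32 bits, so every instruction fits exactly in one instruction memory location.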
The following are the 15 instructions supported by our ISA. All the instructions are described
below:
1. Load Word
LW R1,0001 // Load contents of memory location 0001 into register 1
Encoding
0000 sssss 00000 00000 Addr_bits[12:0]
2. SW R1,0002 // Store contents of register 1 into memory location 2
Encoding
0000 sssss 00000 00000 Addr_bits[12:0]
3. ADD R1, R2, R3 // R3=R1+R2
Description
Adds two source registers and stores the result in the destination register
Operation
Rd <- Rs+Rt
Syntax
ADD Rs,Rt,Rd
Encoding
0000 sssss ttttt ddddd 0000000000000
4. SUB R1, R2, R3 // R3=R1-R2
5. ADDI R1 R2 0x0001 // R2=R1+0x0001
6. LEFT_SHIFT R1 R2 R1 // R1=R1<<R2
7. RIGHT_SHIFT R1 R2 R1 // R1=R1>>R2
8. XNOR R1 R2 R3 // R3=~(R1^R2)
9. MAKE_UP1 R1 R2 R3 // R3={R2[15:0], R1[47:0]}
10. MAKE_UP2 R1 R2 R3 // R3={R1[63:48], R2[47:32], R1[15:0]}
11. MAKE_UP3 R1 R2 R3 // R3={R1[63:32], R2[31:16], R1[15:0]}
12. MAKE_UP4 R1 R2 R3 // R3={R1[63:0], R2[15:0]}
13. JMP 0x0010 // Jump to 0x0010
14. BEQ R1 R2 0x0010 // if R1=R2, jump to 0x0010
15. BNE R1 R2 0x0010 // if R1!=R2, jump to 0x0010
The MAKE_UP instructions above are used in the checksum calculations; they are designed to
extract the 16-bit fields from a 64-bit data word. MAKE_UP1 extracts the first
16 bits, MAKE_UP2 extracts the next 16 bits [31:16], and so on. The other
instructions behave like their counterparts in a normal MIPS architecture, such as load, store,
add, subtract, xnor, xor, etc.
4.8 RSA
RSA is used to exchange the symmetric keys, which are used to encrypt all packets for secure
communication between any two nodes. The RSA key setup is explained next.
Step 1: Generating public and private key at each node
Each user generates a public/private key pair by selecting two large primes $p$ and $q$
and computing the system modulus $N = p \cdot q$. Next we compute $\phi(N) = (p-1)(q-1)$.
Now we select a random encryption key $e$ where $1 < e < \phi(N)$ and $\gcd(e, \phi(N)) = 1$.
Solve the following equation to find the decryption key $d$:
$e \cdot d \equiv 1 \pmod{\phi(N)}$ and $0 \le d \le N$
Publish the public encryption key: $KU = \{e, N\}$
Keep secret the private decryption key: $KR = \{d, p, q\}$
Step 2: RSA to encrypt a message
To encrypt a message $M$ the sender:

- obtains the public key of the recipient $KU = \{e, N\}$
- computes the encrypted message: $C = M^e \bmod N$, where $0 \le M < N$
To decrypt the ciphertext $C$ the receiver:

- uses their private key $KR = \{d, p, q\}$
- computes the decrypted message: $M = C^d \bmod N$
In this way, the symmetric key is exchanged between any two nodes such that no
attacker can obtain the key.
RSA example: At the node n0, the RSA key is generated and the message (symmetric key) is
encrypted using the generated key.
1. Select primes: $p = 17$ and $q = 11$
2. Compute $n = pq = 17 \times 11 = 187$
3. Compute $\phi(n) = (p-1)(q-1) = 16 \times 10 = 160$
4. Select $e$ such that $\gcd(e, 160) = 1$; choose $e = 7$
5. Determine $d$ such that $d \cdot e \equiv 1 \pmod{160}$ and $d < 160$. The value is $d = 23$, since $23 \times 7 = 161 = 1 \times 160 + 1$
6. Publish public key $KU = \{7, 187\}$
7. Keep secret private key $KR = \{23, 17, 11\}$
Sample RSA encryption/decryption: given message $M = 88$ (note that $88 < 187$)
Encryption: $C = 88^7 \bmod 187 = 11$
Decryption: $M = 11^{23} \bmod 187 = 88$
RSA key exchange makes the communication more secure.
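The key setup and the worked example can be checked with a few lines of Python. This is a
minimal sketch of textbook RSA using the toy primes from the example; the function names are
hypothetical, and the code is not the implementation that runs on the nodes.

```python
from math import gcd


def rsa_keygen(p: int, q: int, e: int):
    """Build a textbook RSA key pair from primes p, q and public exponent e."""
    n = p * q
    phi = (p - 1) * (q - 1)
    if not 1 < e < phi or gcd(e, phi) != 1:
        raise ValueError("e must satisfy 1 < e < phi(n) and gcd(e, phi(n)) = 1")
    d = pow(e, -1, phi)  # modular inverse of e (Python 3.8+)
    return (e, n), (d, p, q)  # KU = {e, n}, KR = {d, p, q}


def rsa_encrypt(message: int, public_key) -> int:
    """Compute C = M^e mod n for 0 <= M < n."""
    e, n = public_key
    if not 0 <= message < n:
        raise ValueError("message must satisfy 0 <= M < n")
    return pow(message, e, n)


def rsa_decrypt(cipher: int, private_key) -> int:
    """Compute M = C^d mod n."""
    d, p, q = private_key
    return pow(cipher, d, p * q)


public_key, private_key = rsa_keygen(p=17, q=11, e=7)
cipher = rsa_encrypt(88, public_key)
plain = rsa_decrypt(cipher, private_key)
```

With these inputs the key pair is KU={7,187} and KR={23,17,11}, the ciphertext is 11, and
decryption returns the original message 88, matching the example.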
5. Benchmarking
To compare the implemented hardware against the available software solutions, we use four
parameters: CPU utilization, data handling (packets dropped), latency and throughput.
We used an open source DDoS attack generator and detection tool developed by UCLA students
and compared it against our hardware. The source code link can be found here:
https://github.com/kenzshi/DDoSProject
In this source code, the topology consists of four main node types: server (receiving traffic),
client (sending good traffic), master (sending malicious packets) and bots (slaves that work for
the master and help send malicious packets). When the attack is generated, the server is flooded
with malicious packets to the point that it cannot serve the good packets. The DDoS attack is
generated in the DeterLab topology described in section 3.2 and also shown below. The attack is
launched from the master node, and the bot node helps it generate the DDoS attack traffic.
After the software mitigation is implemented, tcpdump is monitored at the server node, and the
following parameters are extracted and compared with the parameters obtained with the hardware.
The software also analyzes the tcpdump output to check whether any packet is part of a possible
DDoS attack.
Advantages of the hardware solution over the software solution:

- The hardware solution employs a NetFPGA router between the server and the client, so that
only the packets intended for the server reach it, and the DDoS attack packets are dropped.
- The hardware solution is also faster than the software, giving over a 2x improvement in
latency when the two are compared using round-trip time (RTT).
- Data handling is also better with the hardware. The hardware router receives each packet from
the input buffer and sends it to the server immediately. When software routes the packets from
the client to the server, it cannot forward them to the correct destination fast enough, so the
input buffer fills up rapidly and packets are eventually dropped.
5.1 CPU utilization

CPU utilization is an important parameter for judging the performance of the server. During an
attack, the CPU utilization of the server rises to around 22%, which degrades the server's
performance so that good packets cannot be served. With hardware mitigation the utilization is
around 3.3%, while with the software solution it is around 17%.
CPU utilization of the server:
In case of attack: 22%
Attack mitigated by software: 17.14%
Attack mitigated by hardware: 3.3%
[Figure: CPU Utilization (%) for H/w mitigation, DDoS attack and S/w mitigation]
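The report does not state how the CPU utilization figures were collected. One common way to
obtain such a number on a Linux server is to sample the aggregate "cpu" line of /proc/stat twice
and compare the idle time against the total time. The sketch below assumes that approach; the
function names and the two sample lines are made up for illustration and are not measurements
from this project.

```python
def parse_cpu_line(line: str) -> tuple[int, int]:
    """Return (idle_ticks, total_ticks) from the aggregate 'cpu' line of /proc/stat."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    # Fields: user nice system idle iowait irq softirq steal (guest fields ignored)
    values = [int(v) for v in parts[1:9]]
    if len(values) < 5:
        raise ValueError("too few fields on the 'cpu' line")
    idle = values[3] + values[4]  # idle + iowait
    return idle, sum(values)


def cpu_utilization(before: str, after: str) -> float:
    """Percentage of non-idle CPU time between two samples."""
    idle0, total0 = parse_cpu_line(before)
    idle1, total1 = parse_cpu_line(after)
    total_delta = total1 - total0
    if total_delta <= 0:
        raise ValueError("samples must be taken in order, some time apart")
    return 100.0 * (1.0 - (idle1 - idle0) / total_delta)


# Made-up sample lines, standing in for two reads of /proc/stat.
sample_before = "cpu  1000 0 500 8000 500 0 0 0 0 0"
sample_after = "cpu  1160 0 590 8680 570 0 0 0 0 0"
utilization = cpu_utilization(sample_before, sample_after)
```

On a live server, the two lines would come from reading the first line of /proc/stat at the
start and at the end of the measurement interval.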
5.2 Data Handling

The data handling capacity of the server, which is measured using the number of packets that
are dropped, is reduced whenever there is a DDoS attack. If we mitigate the attack using a
software solution, the data handling improves, but not as much as it improves with the hardware.
According to [7], the number of request dropouts is the number of requests that are dropped due
to the attack. We measured the impact of the attack as the number of client requests dropped
due to congestion on the bottleneck link. This value is the number of packets dropped by the
kernel.
Data handling (Packets dropped by kernel):
Software mitigation: 4322 packets dropped
Hardware mitigation: 168 packets dropped
[Figure: Data handling: number of packets dropped by kernel, S/W mitigation vs. H/w mitigation]
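When tcpdump exits, it prints a short summary that includes a "packets dropped by kernel" line.
The sketch below shows one way that counter could be pulled out of the summary text, and then
computes the ratio of the two drop counts reported above. The function name and the sample
summary are hypothetical; only the figures 4322 and 168 come from this report.

```python
import re

DROPPED_LINE = re.compile(r"^(\d+) packets dropped by kernel", re.MULTILINE)


def packets_dropped_by_kernel(summary: str) -> int:
    """Extract the kernel drop counter from tcpdump's exit summary."""
    match = DROPPED_LINE.search(summary)
    if match is None:
        raise ValueError("no 'packets dropped by kernel' line found")
    return int(match.group(1))


# Made-up summary text in the shape tcpdump prints on exit.
sample_summary = """\
1200 packets captured
1250 packets received by filter
50 packets dropped by kernel
"""
dropped = packets_dropped_by_kernel(sample_summary)

# Drop counts reported in this section.
software_drops, hardware_drops = 4322, 168
drop_ratio = software_drops / hardware_drops
```

For the reported counts, `drop_ratio` is about 25.7, meaning that the software mitigation
dropped roughly 25.7 times as many packets as the hardware mitigation.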
5.3 Latency
Latency is measured as RTT (round-trip time): the elapsed time between the end of a request to
a computer system and the beginning of the response. It is calculated from the ping RTT of the
server.
The following snapshot shows the ping times for the software mitigation on the left and the
hardware mitigation on the right.
[Figure: Latency (RTT), S/W mitigation vs. H/w mitigation]
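For repeated measurements, the RTT values can be read from the ping summary programmatically
instead of from a snapshot. The sketch below assumes the summary line format printed by the
Linux iputils ping ("rtt min/avg/max/mdev = ... ms"); the function name and the sample output
are made up for illustration and are not the values measured in this project.

```python
import re

RTT_SUMMARY = re.compile(
    r"rtt min/avg/max/mdev = ([\d.]+)/([\d.]+)/([\d.]+)/([\d.]+) ms"
)


def parse_ping_rtt(output: str) -> dict[str, float]:
    """Return min/avg/max/mdev RTT in milliseconds from ping's summary line."""
    match = RTT_SUMMARY.search(output)
    if match is None:
        raise ValueError("no rtt summary line found")
    names = ("min", "avg", "max", "mdev")
    return {name: float(value) for name, value in zip(names, match.groups())}


# Made-up ping output in the assumed iputils format.
sample_output = """\
--- server ping statistics ---
5 packets transmitted, 5 received, 0% packet loss, time 4005ms
rtt min/avg/max/mdev = 0.311/0.402/0.512/0.071 ms
"""
rtt = parse_ping_rtt(sample_output)
average_rtt_ms = rtt["avg"]
```

Running the same parser on the ping output collected with each mitigation would give two
average RTT values that can be compared directly.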
The three parameters above (CPU utilization, packets dropped and latency) all show that the
hardware solution outperforms the software solution.
5.4 Throughput
Throughput is the rate at which a network sends or receives data. It is a good measure of the
channel capacity of a communication link and of connections to the internet. When an attack is
launched, throughput falls below its maximum and fewer clients complete the three-way
handshake, because the server receives both legitimate and attack traffic. We observed the
throughput at different attack rates, measured at a window size of 256 Kbps.
Attacking traffic (packets per second)   Hardware throughput (Mbps)   Software throughput (Mbps)
2000                                     867.6                        805
6000                                     433.12                       367.3
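To make the comparison explicit, the relative throughput advantage of the hardware can be
computed from the table values. The sketch below is only arithmetic on the reported numbers;
the variable and function names are chosen for illustration.

```python
# Throughput in Mbps, keyed by attack rate in packets per second (values from the table).
hardware_mbps = {2000: 867.6, 6000: 433.12}
software_mbps = {2000: 805.0, 6000: 367.3}


def relative_gain_percent(hardware: float, software: float) -> float:
    """Hardware throughput advantage over software, as a percentage of software."""
    return 100.0 * (hardware - software) / software


gains = {
    rate: round(relative_gain_percent(hardware_mbps[rate], software_mbps[rate]), 1)
    for rate in hardware_mbps
}
```

The hardware advantage comes out at about 7.8% at 2000 packets per second and about 17.9% at
6000 packets per second, so for these two data points the gap widens as the attack rate grows.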
Team Members:
Ankita Patidar: apatidar@usc.edu
Hari Krishna Vetsa: vetsa@usc.edu
Jasmeet Kaur: jasmeetk@usc.edu
Jiachang Ge: jiachang@usc.edu
Kamalpreet Kaur: kamalprk@usc.edu
Kushal Reddy: chennare@usc.edu
Madhuri Murthy: mnmurthy@usc.edu
REFERENCES
1. Hao Chen, Yu Chen, Douglas H. Summerville, "A Survey on the Application of FPGAs for Network
Infrastructure Security," IEEE Communications Surveys & Tutorials, Vol. 13, 2011.
2. Taner Tuncer, Yetkin Tatar, "Detection DoS Attack on FPGA Using Fuzzy Association Rules,"
International Joint Conference of IEEE TrustCom, 2011.
3. Albert Kwon, et al., "RotoRouter: Router Support for Endpoint-Authorized Decentralized
Traffic Filtering to Prevent DoS Attacks," IEEE International Conference, 2014.
4. Shu Zhang, Partha Dasgupta, "Denying Denial-of-Service Attacks: A Router Based Solution."
5. Carnegie Mellon University, CERT Coordination Center, "Denial of Service Attacks," 1997.
6. Gary C. Kessler, "Defenses against Distributed Denial-of-Service Attack," SANS Reading Room,
Threats & Vulnerabilities, November 29th, 2000.
7. Monika Sachdeva, et al., "Measuring Impact of DDOS Attacks on Web Service."