Table of Contents
1 Introduction
  1.1 Abstract
  1.2 Motivation
  1.3 Overview
    1.3.1 Quality Factor
    1.3.2 Operation
  1.4 Benefits
  1.5 Product Specifications
2 Design Features
3 Design Implementation
4 References
5 Appendix
  A. ns File
  B. Instruction Set Architecture
  C. Important Hardware/Software Registers

List of Figures
1. Block Diagram of the NetFPGA 1G board
2. User Data Path of the Reference Router on the NetFPGA 1G
3. User Data Path of the HAL 9000 Custom Router
4. Screen shot for RSA Key Exchange
5. Screen shot for Data Transmission
Chapter 1
Introduction
1.1 Abstract
In a conventional router or switch, all incoming packets are treated alike.
In real-time network scenarios, however, there are different kinds of
packets, some of which cannot tolerate delay as much as others; these
packets need to be routed with priority. Packets coming into a router are
generally routed out on a FIFO basis. A priority-based router checks the
priority of each arriving packet. Regular packets line up in the queue and
leave in the order of their arrival, but if a priority is detected on a
packet, it cuts into the existing queue and is routed out first to a high
priority node before the queue resumes its operation. This is the basic
idea of our priority-based routing.
1.2 Motivation
Not all packets entering a network are of the same type. Hence,
preferential treatment should be given to important packets over normal
ones. This simple idea is of utmost importance in real-time network
scenarios.
In this project, we have designed a custom router which routes packets
depending on their priority. The priority is determined from a cumulative
Quality Factor, which is calculated using user-defined content.
1.3 Overview
The hardware of the custom router is a dual-core processor built on a
NetFPGA reference router. It processes the content of the packets entering
the network and determines the priority for custom routing. The software
manages the flow of packets in and out of the router.
The processor has an inter-convertible FIFO. Additionally, two more
queues are implemented. A High Priority Queue (HPQ) buffers all the
incoming high priority packets and a Low Priority Queue (LPQ) buffers
the other packets. The scheduling of packets out of these queues
depends on a Quality Factor.
1.3.2 Operation
The core of the router has two queues before the main output queue. For
every incoming packet, a cumulative Quality Factor (QF) is calculated. The
cumulative QF depends on the user-defined content, and it determines which
of the two implemented queues the incoming packet enters.
A threshold value for the QF is fixed. When the QF of a packet exceeds the
threshold, the packet is marked as a priority packet and is buffered in
the High Priority Queue. If the QF of the packet is less than the
threshold, the packet is marked as low priority and is buffered in the
Low Priority Queue.
An output arbiter is used to schedule the high priority packets before the
low ones. It makes sure that the packets from the low priority queue are
not routed out until the high priority queue is empty.
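The classification step described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Verilog implementation: the threshold value, the content patterns and the scoring rule (one point per matching pattern) are assumptions made for the example, and a QF equal to the threshold is treated as low priority because the text does not cover that case.

```python
from collections import deque

# Hypothetical values: the report does not state the real threshold or patterns.
QF_THRESHOLD = 1
USER_CONTENT = [b"alarm", b"voice", b"video"]


def cumulative_quality_factor(payload, patterns=USER_CONTENT):
    """Assumed scoring rule: one point for each user-defined pattern in the payload."""
    return sum(1 for pattern in patterns if pattern in payload)


def enqueue(payload, hpq, lpq, threshold=QF_THRESHOLD):
    """Buffer the packet in the HPQ if its QF exceeds the threshold, else in the LPQ."""
    qf = cumulative_quality_factor(payload)
    if qf > threshold:
        hpq.append(payload)
        return "high"
    # A QF equal to the threshold is treated as low priority here (an assumption).
    lpq.append(payload)
    return "low"


hpq, lpq = deque(), deque()
print(enqueue(b"voice alarm frame", hpq, lpq))   # two patterns match
print(enqueue(b"bulk file transfer", hpq, lpq))  # no pattern matches
```

In the actual design this decision is taken in hardware as the packet passes through the user data path; the sketch only mirrors the threshold comparison.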
1.4 Benefits
The benefits of this implementation are:
1. Lower Resource Allocation
The priority is calculated in real time as the packet arrives, using
deep packet inspection of the content. This saves the resource
space that an additional priority field would occupy.
2. Secure Routing with RSA Encryption
RSA encryption and decryption are implemented in software, which
secures routing and protects against intrusions.
3. Use of Hardware Accelerators
Priority determination, scheduling and routing are faster and more
efficient because they are implemented in hardware.
4. Better Throughput
The router routes all the priority packets on a high cost line, which
increases the throughput of the low cost line and in turn increases
the average throughput of the network.
5. Importance of Application
In this design, the switch makes sure that important packets are
given priority in terms of scheduling and bandwidth, which is
important in real-time networks.
1.5 Product Specifications
The router behaves as a typical layer 2 switch with added features:
1. It processes packets at a line rate of 1 Gbps.
2. The module is clocked at 125 MHz.
Chapter 2
Design Features
2.1 Design Outline
We used a NetFPGA board as hardware and developed a multithreaded
processor to support our custom ISA. The reference router configuration that is
already available is used as the base upon which modifications are made to
implement the desired application.
This NetFPGA reference router module has been extended to make
the custom router design. In the user data path of the reference router, our
module is implemented after the input arbiter logic and before the output
queues. The module is implemented in hardware in Verilog HDL.
For the priority based routing, a priority decider module inspects the header
and decides if the packet deserves a priority. Then once priority is established,
the header is modified for routing out on the high cost line. Next, two queues
are created in hardware. One is used to buffer the low priority type of packets
and the other is used to buffer the high priority type of packets. The flow of
the packets into these different queues respectively is controlled by the input
arbiter and the flow of packets out of these queues is controlled by the output
arbiter which again is implemented in hardware.
To interface with the processor inside the router, Perl, shell and Python scripts
have been written to perform various software functions and to access the
values stored in the hardware registers (refer to Appendix C).
Security for the system is implemented in software using RSA encryption. First,
the symmetric key is encrypted at the node with the node's own RSA private key
and sent to the control node of the NetFPGA, which decrypts it with the public
key and stores the result (the original symmetric key) in a register. Now both
the node and the router have the same symmetric key. When the node sends data,
it encrypts the data using a simple XOR operation with the symmetric key. At the
router, the received data is decrypted with the same XOR operation and then
processed. Once done, the router encrypts the data in the same way and it is
decrypted at the receiving node.
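The key exchange and the XOR data path described above can be illustrated with a minimal Python sketch. The RSA parameters below are textbook-sized toy values chosen for the example and are not secure; the symmetric key value, the key length and the function names are likewise assumptions, not the values used in the project scripts.

```python
from itertools import cycle

# Toy RSA parameters (textbook-sized, insecure) chosen only for illustration.
N, E, D = 3233, 17, 2753  # modulus, public exponent, private exponent


def rsa_apply(value, exponent, modulus=N):
    """Modular exponentiation, used for both the private-key and public-key step."""
    return pow(value, exponent, modulus)


def xor_with_key(data, key):
    """XOR every data byte with the repeating key; applying it twice restores the data."""
    return bytes(b ^ k for b, k in zip(data, cycle(key)))


# Key exchange: the node encrypts with its private key,
# the control node decrypts with the public key.
symmetric_key = 1234                     # hypothetical key, must be smaller than N
sent = rsa_apply(symmetric_key, D)
recovered = rsa_apply(sent, E)

# Data transfer: both sides apply the same XOR operation.
key_bytes = recovered.to_bytes(2, "big")
ciphertext = xor_with_key(b"sensor reading", key_bytes)
plaintext = xor_with_key(ciphertext, key_bytes)
```

Because XOR is its own inverse, the same routine serves for encryption at the node and decryption at the router, which is what keeps the per-packet cost low.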
Each component of the design is explained in detail in the following section.
2.2 NetFPGA
The NetFPGA is a low-cost open platform for research and experimentation.
It has primarily been designed as a tool for teaching networking hardware.

NetFPGA 1G
Network Interface           4 x 1 Gbps Ethernet ports
Host Interface              PCI
FPGA                        Virtex II-Pro 50
Logic Cells                 53,136
Block RAMs                  4176 kbits
External Memories (SRAM)
External Memories (DRAM)
2.4.1 Hardware
ids
This is the top-level module for our design and implements the custom switch
design. It has the following sub-modules:
fallthrough_small_fifo:
This module buffers all
the content of each incoming packet with all of them to assign the
Quality Factor and in turn decide the priority.
Once the priority is decided, a high priority packet should be buffered into
the high priority queue. The priority decider module not only decides the
priority but also works on the packet once it is determined to have a high
priority. It performs header modification such that the high priority
packet is routed out on the high cost line after being buffered in the
high priority queue.
input_arbiter
This module is once again an extension of the dpu. It receives the
packets from the output port of the priority decider; hence the incoming
packets to this module have a priority that has been determined. The
input arbiter then routes the packet either into the high priority queue
or into the low priority queue depending on the priority Quality Factor
they hold.
high_priority_queue
It is a FIFO module which buffers the high priority packets.
low_priority_queue
It is a FIFO module which buffers the low priority packets.
output_arbiter
The last module of the design is the output arbiter; this arbiter
implements the scheduling logic. It makes sure that the high priority
packets are routed out first and only when the high priority queue is
empty, then the low priority packets are routed out into the output
queues of the user data path.
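The scheduling rule of the output arbiter is a strict priority scheme. A minimal Python sketch of that rule is given below; the function name and the packet labels are made up for the example, and the hardware arbiter operates on packet words rather than on Python objects.

```python
from collections import deque


def output_arbiter(hpq, lpq):
    """Yield packets one at a time, serving the LPQ only while the HPQ is empty."""
    while hpq or lpq:
        if hpq:
            yield hpq.popleft()
        else:
            yield lpq.popleft()


hpq = deque(["H1", "H2"])
lpq = deque(["L1", "L2"])
arbiter = output_arbiter(hpq, lpq)

order = [next(arbiter) for _ in range(3)]  # both high priority packets, then one low
hpq.append("H3")                           # a high priority packet arrives late
order += list(arbiter)                     # H3 is served before the remaining low packet
print(order)
```

The late arrival shows the property that matters: the check on the high priority queue is repeated before every packet, so a low priority packet is never sent while a high priority packet is waiting.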
2.4.2 Software
The following scripts are implemented to create a software platform to
interface with the hardware of the network.
Initialize.sh:
This script is run on the control node at the very beginning of the test. In
this script, several commands are given to the control node to perform
various functions such as:
Assign MAC ID and IPs to each node on the network and the
control node
Start.sh:
This script is run on all the other nodes once the initialization script is run
on the control node. In this script, the following commands are given to
all the nodes to perform functions such as:
Send/Receive Data
(For sending and receiving data, one of the nodes of the network is
programmed to send the data to the control node once key
exchange has occurred and the other nodes are programmed to
receive the data from the control node after the packets are routed
out.)
2.4 Scheduling Logic
A High cost line is reserved only for the priority packets and cannot be
accessed by any other packet. The router uses deep packet inspection to
check the content of the packet and determine priority for each incoming
packet. Then, using this assigned priority information, all the Priority
packets are buffered into the High priority queue and the Low priority
packets are buffered into the Low priority queue. The packets from the
HPQ are routed out of the router to the High Cost line and all the other
packets in the LPQ are not allowed to access this High cost line.
This ensures that the throughput of the High cost line remains high and is
a privilege only for the high priority packets. The low priority packets in
the LPQ are routed out to the low cost line.
Chapter 3
Design Implementation
3.2 Working
The hardware logic explained in Chapter 2 was implemented in Verilog HDL,
simulated with appropriate test benches and synthesized to completion. The
bitfile is generated and is used to emulate the design onto the Virtex FPGA
on the NetFPGA board.
To initiate the process, we create an ns file (Appendix A) and swap the
experiment in on DETERLab. Once swapped in, we ssh into the control node of
the network and run the initialization script, which assigns MAC and IP
addresses to all the nodes of the network. Then, on running the setup script,
the hal9000 bitfile is downloaded onto the NetFPGA. Next, we run rkd
(the router kit daemon), which helps build the routing table. To check that
the network is set up and running, we ping and test it.
Next, we run the start script on each of the nodes. This encrypts the
symmetric key with the node's private key and sends it to the control node
of the router, where the initialization script decrypts the encrypted key
using the public key and stores the results in the corresponding registers.
Once the RSA symmetric key is exchanged, the data can be sent in the
encrypted form to the router. At the router, decryption happens and then the
packet enters the module written for priority assignment and routing out on
the high cost link. Finally the data is routed out in an encrypted format, to be
decrypted at the respective node.
Ping is used to test the connection, Iperf is used for bandwidth
measurements and tcpdump is used to analyze the data transmitted and
received.
The table below gives the average round-trip time (RTT) measured by
pinging each node of the network from every other node. Two cases were
measured: one with the reference router bitfile loaded onto the NetFPGA, and
the other with the HAL9000 bitfile loaded onto the NetFPGA, in which priority
is assigned to a node (in this case n1) and high priority packets are routed
through high cost links in the network.
The average RTT was calculated from every ping, and then the cumulative
average RTT was calculated. It can be seen that when node n1 is on the
route, the use of the high cost link lowers the delay, which in turn
decreases the average RTT and increases the average throughput of the
network.
From    To      Avg RTT (ms) for HAL9000
n0      n1      33.030
n0      n2      34.614
n0      n3      31.004
n1      n0      34.114
n1      n2      31.341
n1      n3      31.325
n2      n0      31.295
n2      n1      27.197
n2      n3      31.097
n3      n0      31.258
n3      n1      27.018
n3      n2      31.214
Cumulative Average RTT    30.959
Data Transmission:
Inferences:
In the conventional router, we observe that the throughput is lower because
all packets are routed through the same lines, as there is no concept of
priority. In the HAL9000 router, we observe that the throughput is better on
the low cost line and the average throughput of the network is also
comparatively better. This is because the high priority packets are
routed to the high cost line.
Chapter 4
References
http://yuba.stanford.edu/~jnaous/papers/ancs-openflow-08.pdf
https://github.com/NetFPGA/netfpga/wiki/OpenFlowNetFPGA100
Appendix
A. ns File
# NS file to create a network with 4 nodes and a NetFPGA host
source tb_compat.tcl
set ns [new Simulator]

set nfrouter [$ns node]
tb-set-hardware $nfrouter netfpga2
set control [$ns node]
tb-bind-parent $nfrouter $control

# Create end nodes
set n0 [$ns node]
set n1 [$ns node]
set n2 [$ns node]
set n3 [$ns node]

# Put all the nodes in a lan
set lan0 [$ns make-lan "$nfrouter $n0" 1000Mb 0ms]
tb-set-ip-lan $n0 $lan0 10.1.0.3
set lan1 [$ns make-lan "$nfrouter $n1" 1000Mb 0ms]
tb-set-ip-lan $n1 $lan1 10.1.1.3
set lan2 [$ns make-lan "$nfrouter $n2" 1000Mb 0ms]
tb-set-ip-lan $n2 $lan2 10.1.2.3
set lan3 [$ns make-lan "$nfrouter $n3" 1000Mb 0ms]
tb-set-ip-lan $n3 $lan3 10.1.3.3

$ns rtproto Static
$ns run
B. Instruction Set Architecture
There are 24 instructions in the Instruction Set Architecture (ISA) of the
processor. These 24 instructions are a subset of the MIPS32 ISA. Each
instruction is a 32-bit machine-language encoding, and the data operand words
are 64-bit values. Rs, Rt and Rd are fields which point to the 32 General
Purpose Registers (GPR) of the Register File.
OPCODE  Operation
        Brief Description

000000  NOP
        No Operation.

010101  LW Rt, offset(Rs)
        Load Word. The 16-bit signed offset is added to the contents of Rs to
        form the effective address. The 64-bit value at the memory location
        specified by the aligned effective address is fetched into Rt.

010100  SW Rt, offset(Rs)
        Store Word. The 16-bit signed offset is added to the contents of Rs to
        form the effective address. The 64-bit value in Rt is stored at the
        location specified by the aligned effective address.

100001  BEQ Rs, Rt, offset
        Branch on Equal. The 64-bit value in Rs is compared with the 64-bit
        value in Rt. If [Rs] = [Rt], PC <- PC + offset

111111  JUMP offset
        Jump to the location given as offset. PC gets PC + offset.

010101  ADDI Rd, Rs, immediate
        Add Immediate Word. The 16-bit signed immediate value is added to the
        64-bit value in Rs to produce a 64-bit result in Rd.
        [Rd] <- [Rs] + immediate

010001  SUBI Rd, Rs, immediate
        Subtract Immediate Word. The 16-bit signed immediate value is
        subtracted from the 64-bit value in Rs to produce a 64-bit result in Rd.
        [Rd] <- [Rs] - immediate

000101  ADD Rd, Rs, Rt
        Add Word. The 64-bit value in Rs is added to the 64-bit value in Rt to
        produce a 64-bit result in Rd. [Rd] <- [Rs] + [Rt]

000001  SUB Rd, Rs, Rt
        Subtract Word. The 64-bit value in Rt is subtracted from the 64-bit
        value in Rs to produce a 64-bit result in Rd. [Rd] <- [Rs] - [Rt]

100010  AND Rd, Rs, Rt
        Bitwise And Word. The 64-bit value in Rs is bitwise ANDed with the
        64-bit value in Rt to produce a 64-bit result in Rd. [Rd] <- [Rs] & [Rt]

001001  OR Rd, Rs, Rt
        Bitwise Or Word. The 64-bit value in Rs is bitwise ORed with the
        64-bit value in Rt to produce a 64-bit result in Rd. [Rd] <- [Rs] | [Rt]

001010  XOR Rd, Rs, Rt
        Bitwise Xor Word. The 64-bit value in Rs is bitwise XORed with the
        64-bit value in Rt to produce a 64-bit result in Rd. [Rd] <- [Rs] ^ [Rt]

001100  LOGICAL NOR
        Bitwise Nor Word. The 64-bit value in Rs is bitwise NORed with the
        64-bit value in Rt to produce a 64-bit result in Rd.
        [Rd] <- ~([Rs] | [Rt])

001101  LOGICAL NAND
        Bitwise Nand Word. The 64-bit value in Rs is bitwise NANDed with the
        64-bit value in Rt to produce a 64-bit result in Rd.
        [Rd] <- ~([Rs] & [Rt])

000111  SLT Rd, Rs, Rt
        Set on Less Than. The 64-bit value in Rs is compared with the 64-bit
        value in Rt to produce a 64-bit result in Rd. Rd is set if Rs is less
        than Rt, else it is reset. [Rd] <- [Rs] < [Rt]

000110  SLL Rt, Rs
        One byte left shift. [Rt] <- [Rs] << 1

000011  SLR Rt, Rs
        One byte right shift. [Rt] <- [Rs] >> 1

000010  SGT Rd, Rs, Rt
        Set on Greater Than. The 64-bit value in Rs is compared with the 64-bit
        value in Rt to produce a 64-bit result in Rd. Rd is set if Rs is greater
        than Rt, else it is reset. [Rd] <- [Rs] > [Rt]

000100  INC Rs, Rt
        Increment. [Rt] <- [Rs] + 1

001011  MULT Rs, Rt, Rd
        Multiplication. [Rd] <- [Rs] * [Rt]

100001  BNE Rs, Rt, offset
        Branch on Not Equal. The 64-bit value in Rs is compared with the 64-bit
        value in Rt. If [Rs] != [Rt], PC <- PC + offset

110100  LWI Rd, immediate
        Load Immediate Word. The 16-bit immediate data taken in is placed in
        the Rd register.

001111  STOREI Rd, immediate
        Store Immediate Word. The 16-bit immediate data is ...

110101  DEC Rs, Rt
        Decrement. [Rt] <- [Rs] - 1
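To make the encoding concrete, the sketch below splits a 32-bit instruction word into its fields. The report gives the opcode width (6 bits), the register count (32, hence 5-bit register fields) and the 16-bit immediate, but not the bit positions; the layout used here follows the usual MIPS32 convention and is an assumption, as are the function and field names.

```python
def decode(word):
    """Split a 32-bit instruction word into fields, assuming a MIPS32-style layout."""
    opcode = (word >> 26) & 0x3F   # bits 31..26 (assumed position)
    rs = (word >> 21) & 0x1F       # bits 25..21 (assumed position)
    rt = (word >> 16) & 0x1F       # bits 20..16 (assumed position)
    rd = (word >> 11) & 0x1F       # bits 15..11 (assumed position)
    imm = word & 0xFFFF            # bits 15..0, overlaps rd in immediate formats
    if imm & 0x8000:               # sign-extend the 16-bit immediate
        imm -= 0x10000
    return {"opcode": opcode, "rs": rs, "rt": rt, "rd": rd, "imm": imm}


# Example word built by hand: opcode 0b000101, Rs = 1, Rt = 2, Rd = 3.
word = (0b000101 << 26) | (1 << 21) | (2 << 16) | (3 << 11)
print(decode(word))
```

Whether Rd or the immediate is meaningful depends on the instruction format, so a real decoder would select between them based on the opcode.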
C. Important Hardware/Software Registers
Register                      Address
IDS_CONTROL_REG               0x2000300
IDS_INSTRUCTION_IN_REG        0x2000304
IDS_DATA_IN_HIGH_REG          0x2000308
IDS_DATA_IN_LOW_REG           0x200030c
IDS_DATA_ADDR_REG             0x2000310
IDS_INSTRUCTION_ADDR_REG      0x2000314
IDS_PORT_DEST_REG             0x2000318
IDS_CONTENT_REG_REG           0x200031c
IDS_CONTENT0_LOW_REG          0x2000320
IDS_CONTENT0_HIGH_REG         0x2000324
IDS_CONTENT1_LOW_REG          0x2000328
IDS_CONTENT1_HIGH_REG         0x200032c
IDS_CONTENT2_LOW_REG          0x2000330
IDS_CONTENT2_HIGH_REG         0x2000334
IDS_CONTENT3_LOW_REG          0x2000338
IDS_CONTENT3_HIGH_REG         0x200033c
IDS_DES_KEY_LOW_REG           0x2000340
IDS_DES_KEY_HIGH_REG          0x2000344
IDS_DATA_OUT_HIGH_REG         0x2000348
IDS_DATA_OUT_LOW_REG          0x200034c
IDS_INSTRUCTION_OUT_REG       0x2000350
IDS_HIGH_COUNT_REG            0x2000354
IDS_LOW_COUNT_REG             0x2000358
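Since the data operands are 64 bits wide and the registers above are spaced 4 bytes apart, each operand is spread over a HIGH and a LOW register. The Python sketch below shows how a host script might split and rejoin such a value. It assumes that each register is 32 bits wide and that the HIGH register holds the upper half; the helper names are hypothetical and no actual register access is performed.

```python
# Addresses copied from the register table; each register is assumed to be 32 bits wide.
IDS_DATA_IN_HIGH_REG = 0x2000308
IDS_DATA_IN_LOW_REG = 0x200030C
MASK32 = 0xFFFFFFFF


def split_word(value):
    """Split a 64-bit value into (high, low) 32-bit halves."""
    return (value >> 32) & MASK32, value & MASK32


def join_word(high, low):
    """Rebuild a 64-bit value from its 32-bit halves."""
    return ((high & MASK32) << 32) | (low & MASK32)


def writes_for_data_in(value):
    """Return the (address, value) pairs a host script would write for one operand."""
    high, low = split_word(value)
    return [(IDS_DATA_IN_HIGH_REG, high), (IDS_DATA_IN_LOW_REG, low)]


for address, half in writes_for_data_in(0x0123456789ABCDEF):
    print(hex(address), hex(half))
```

The same split applies to the content, key and data-out register pairs listed in the table.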