
100G Intrusion Detection

August 2015
v1.0

Vincent Stoffer
Aashish Sharma
Jay Krous


Table of Contents
Background
Approach
Solution Overview
Alternative Solutions
Distribution Device
Requirements
Selection
Bro Cluster
Build Guide
Overview
Arista
Myricom
Bro Hardware
Performance
Traffic Distribution to the Cluster
Bro Cluster CPU Utilization and Performance
Performance Measure of Capture Loss
Shunting
Components of Shunting
Acknowledgements
References
Appendices
Appendix A: Arista Config
Arista 7504 configuration
Arista 7150 configuration
Appendix B: Cluster Configuration (FreeBSD)
Appendix C: Procurement Details
Arista Procurement
Bro hardware Procurement
Myricom Drivers Procurement
Appendix D: Photo of Production Solution


Background
Berkeley Lab is a DOE National Laboratory operated by the University of California that conducts
large-scale unclassified research across a wide range of scientific disciplines. Berkeley Lab was a pioneer
in the formation and use of the modern Internet and continues to make incredible demands on high
performance computing and high speed networks to fulfill its scientific mission.
Primarily driven by the data output of modern scientific instruments and the data transfer needs of
scientists, network traffic on the Energy Sciences Network (ESnet) has roughly doubled every 18 months
for the past 10 years (see Figure 1). ESnet is hosted at and is the primary network provider for Berkeley
Lab. ESnet completed an upgrade of its backbone network to 100Gbps in 2012, and now nine of the DOE
national laboratories, major universities, and other network providers also peer with ESnet at 100Gbps.

Figure 1: ESnet traffic growth

While 100Gbps links have become more prevalent, security operations' ability to maintain network
monitoring at this traffic volume has not kept pace. As of this writing, no commercially available 100G
monitoring solutions provide functional parity with Berkeley Lab's network monitoring approach.

Comprehensive monitoring is a significant challenge and can become a barrier to implementation of 100G
networks, or worse, security monitoring requirements can be weakened to expedite an implementation. The
Berkeley Lab scientific mission and fundamental approach to cybersecurity required that we overcome
these challenges, and in June 2014 we began a project to design and implement a system capable of
monitoring a 100G network.


Approach
Our approach was to build a system based upon the 100G prototyping work performed by Campbell and
Lee in Intrusion Detection at 100G [1] and Prototyping a 100G Monitoring System [2].

The basic methodology of our solution is to break down the 100G network connection into smaller pieces
of traffic, while preserving the affinity of individual network sessions to a single analysis process. We then
distribute that analysis across many dozens or hundreds of worker processes, allowing the system to
scale up to speeds of 100G. We also implement a unique traffic reduction mechanism called shunting to
further shed load from the analysis pipeline.
Several components comprise the network monitoring system:

- Traffic distribution: a device capable of aggregating, filtering, and distributing the 100G traffic to
  multiple systems.
- Host distribution: a mechanism to further divide the traffic at the host level into smaller pieces.
- Network intrusion detection system (IDS): performs distributed analysis on the traffic received at
  the host.
- Operating system (OS): the physical hardware and operating system used to run the IDS nodes.

The traffic distribution device performs several critical functions. It aggregates traffic from multiple taps
(including 100G), performs filtering, and distributes the traffic evenly to analysis hosts at 10G speeds. In
the past, the functions of aggregation, distribution, and filtering have sometimes necessitated separate
hardware devices or software. The current generation of traffic distribution devices collects these
functions into a single piece of hardware. We defined requirements that such a device needs to meet for
our operation and talked to many of the major device vendors to evaluate available options. We were
particularly interested in commodity hardware products for reduced costs and for solutions that would
support next-generation, open-standards traffic distribution, such as OpenFlow. Further details about our
evaluation process are covered in the Distribution Device section.
The traffic from the distribution device arrives at the host's network card at 10G, which is too much for a
single IDS process to handle. The host's network card further divides the 10G stream into smaller pieces,
each of which can be handled by a single IDS process. This replicates the previous distribution step but at
the host level. We evaluated several options for host distribution, our primary requirement being that it
would support our preferred OS of FreeBSD.
By passing through both the traffic and host distribution, the traffic finally reaches a volume that can be
effectively analyzed by a single IDS worker process. A distributed network IDS is now needed to perform
analysis across many dozens or hundreds of workers, all acting on a small fraction of the overall traffic
volume. Although many network IDS products are available, there are only a few options to choose from
when evaluating cluster-capable systems that can handle 100G traffic volumes.
As we note in the Alternative Solutions section of this paper, there are several viable options for each of
these components. The particular combination of components we selected was informed by our local
expertise and the experience of our team, institution, and colleagues.


Solution Overview
We chose the following technological components to create our 100G monitoring system:

- Traffic distribution: Arista switches (7150 and 7504)
- Host distribution: Myricom network interface cards (NIC) and Sniffer10G software
- Network IDS: Bro Network Security Monitor
- OS: FreeBSD

Our monitoring system runs completely out of band, meaning it takes traffic from optical taps and operates
only on the duplicated traffic; it does not sit inline (see Figure 2). Alerts and actions generated by the IDS
can affect network traffic (by placing blocks, for example), but these processes happen in parallel to normal
network operation. This separation allows the monitoring system to scale to greater network speeds and
also removes the monitoring system as a critical dependency for normal network operation.

Figure 2: Simplified flow diagram of the network and monitoring system


Figure 3 is a diagram of the 100G monitoring solution. The Arista 7504 is at the top of the diagram. This
device performs two functions: it aggregates the inputs of the optical taps from Berkeley Lab's Internet
connections and creates a 10G Link Aggregation Group (LAG) of that aggregated traffic to pass to the 7150
device.

Figure 3: Block diagram of the 100G cluster setup showing the 100G feed going to the Arista 7504, which uses an
Arista 7150 for symmetric hashing/load balancing. Traffic is then fed to the Bro cluster using Myricom
cards.

Below the 7504 is the Arista 7150. This device performs the critical function of ensuring each TCP session
is distributed on a single link. This function is provided by the Arista DANZ technology through symmetric
hashing. As of this writing, symmetric hashing is not available in the Arista 7504; once the code
that supports symmetric hashing is available on the 7504, the 7150 can be eliminated if desired.

The bottom of the diagram shows the Bro cluster. Each Bro node is built on commodity hardware described
in more detail in the Build Guide section of this paper. A Myricom 10G NIC is installed in each Bro node.
The NIC (and associated software) further divides the traffic to multiple Bro processes running on the node.
A single Bro node is designated the Bro manager, which aggregates events from the other Bro nodes,
creates logs, and controls shunting.


On the right side of the diagram the shunting process is represented by the dashed red line. In real time,
Bro detects specific large data flows based on predetermined characteristics and communicates with the
Arista 7150 via an API to stop sending those flows to Bro for analysis. The shunt rules apply an ACL to the
Arista which allows control packets to continue but ignores the remaining heavy tail of the data volume.
This allows Bro to maintain awareness and logs of these shunted large flows but dramatically reduces the
analysis load necessary to process traffic. Shunting is discussed in further detail in the Shunting section.

Alternative Solutions
The following table summarizes the technology stack implemented at Berkeley Lab.

Table 1: Berkeley Lab technology stack

Traffic distribution: Arista (7504 + 7150)
Host distribution: Myricom 10G-PCIE2-8C2-2S+ NICs and Myricom 10G Sniffer drivers
IDS: Bro
OS: FreeBSD

Our specific choices for these components were informed by evaluation, local expertise, and experience
but are not the only choices. Table 2 provides alternative tools and technologies for the various building
blocks of a 100G monitoring solution.

Table 2: Alternative technology stacks

Distribution device: Arista, Brocade, Endace, Gigamon, OpenFlow/SDN
Traffic split/node: PF_RING, PacketBricks + netmap, Endace DAG
IDS: Snort, Suricata
OS: Linux


Distribution Device
Requirements
Berkeley Lab has been operating 10G network monitoring infrastructure since 2007. We have used a
variety of distribution devices, including Apcon and most recently cPacket cVu devices. The cPackets
served as a reference as we developed our requirements for a 100G distribution device. We needed to
maintain the ability to aggregate, distribute, and filter like the cPacket, but also needed to support multiple
100G and 10G links.
Our requirements were to support the following:

Input ports
- Two 100G input ports to monitor both transmit and receive on a 100G link
- This would handle our primary internet connection to ESnet
- Needed to support 100G LR4 optics using optical taps
- Scaling beyond two 100G ports was not a requirement but was considered a desirable future capability
- Eight 10G interfaces to allow backup links to be aggregated
- Transmit and receive for all backup links (secondary connection to ESnet, UC Berkeley, CENIC, offsite links, Science DMZ, etc.)

Output ports
- Fourteen 10G interfaces (minimum) to feed a cluster-capable IDS and other standalone monitoring systems
- Five to ten 10G links for the proof-of-concept cluster system
- Enough ports to handle scaling up towards 100G
- 1G and 10G ports to standalone monitoring systems desirable

Aggregation and load balancing
- Ability to aggregate input ports and load balance across the output ports
- Five-tuple symmetric load balancing (src IP, dst IP, src port, dst port, protocol)
- Every port can be assigned to input or output
- Separate input and output port groups
- Port speed agnostic (1/10/100G)

Filtering
- TCP flag and IP header filtering to enable granular control and filtering of the traffic stream
- Filtering to exclude data and allow only control packets (see the Shunting section)
- Arbitrary IP headers and TCP flags for testing and development
- IPv6 filtering
- Filters on input or output ports to enable flexibility and prevent oversubscription
- Filter on input or output ports or groups
- API or CLI interface to automate filtering


Selection
Over the course of a year, we researched available packet distribution devices and evaluated three devices
on site (see Table 3).

Table 3: Device evaluations

Arista (7150, 7504)
  Met requirements: Yes
  Pros: API, GUI, SDN capability
  Cons: Two devices initially needed for 100G; no IPv6 filtering at testing time

Brocade MLXe
  Met requirements: Yes
  Pros: Lowest cost, SDN capability
  Cons: No GUI, no API

Endace Access
  Met requirements: No (no 10G input ports)
  Pros: Form factor, first to market
  Cons: Two devices needed for bidirectional, no filtering, cost

We constructed a testbed of a 100G link and duplicated traffic over 10G for input. We used both raw
captures to multiple interfaces (tcpdump) and Bro to test for consistency of packet delivery and
effectiveness of hashing. We compared ease of use of the devices, including CLI, GUI, and API features.
We tested creation of ports and port groups and directing traffic to and from them. We also compared the
filtering features, including using our own set of TCP flags.
Part of our testing was to ensure that symmetric hashing of all flows was happening correctly across the
distribution device. When a single flow is aggregated and/or load balanced across a number of ports, it is
critical for IDS analysis that the flow passes exclusively and reliably to a single output for further distribution
and analysis. Symmetric hashing characterizes each flow by a set of defined variables, usually some
combination of source IP, source port, destination IP, destination port, and protocol. Most of the devices we
tested allow those characteristics to be adjusted and then applied such that each bidirectional matched
flow will egress out the same link in its entirety. This is a critically important feature that enables even
distribution of traffic and allows for the horizontal scaling we describe in our monitoring system.
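To make the idea concrete, the following Python sketch (illustrative only, not the switch's FM6000 implementation) shows how ordering the two endpoints before hashing makes the hash symmetric, so both directions of a session select the same output link:

# Minimal sketch of a symmetric five-tuple hash: sorting the endpoints makes
# (A -> B) and (B -> A) produce the same key, so both directions of a flow
# land on the same output link (or, at the host level, the same worker).
import hashlib

def symmetric_hash(src_ip, src_port, dst_ip, dst_port, proto, n_outputs):
    a = (src_ip, src_port)
    b = (dst_ip, dst_port)
    lo, hi = (a, b) if a <= b else (b, a)
    key = "|".join([lo[0], str(lo[1]), hi[0], str(hi[1]), str(proto)])
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n_outputs   # index of the egress link / worker

# Both directions of the same TCP session pick the same output:
assert symmetric_hash("10.0.0.1", 50000, "8.8.8.8", 443, 6, 5) == \
       symmetric_hash("8.8.8.8", 443, "10.0.0.1", 50000, 6, 5)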
After evaluation we chose Arista for the distribution devices, utilizing both a 7504 chassis and a smaller 7150
switch. Brocade came in a close second and was most competitive on price alone. Arista was the only
vendor that exceeded all of our requirements. Perhaps the most compelling feature was Arista's JSON
API, which allows us to have external programs dynamically modify the device's packet filter settings (in our
case for Bro's shunting capability). The density of the Arista was also compelling; in our current
configuration we are able to tap three full-duplex 100G links while still having more than 144 10G
connections available for input and output ports.


Bro Cluster
The Bro Network Security Monitor was created at Berkeley Lab in the 1990s by Vern Paxson. It is a
powerful, flexible, and open-source platform for network monitoring and intrusion detection. Berkeley
Lab's long history and deep experience with Bro made it a clear choice when settling on the intrusion
detection component of the monitoring system. Further, the team's experience with Bro clustering
technology at 10G speeds led us to believe that Bro could scale towards 100G. Bro's ability to scale
horizontally and maintain visibility at high speeds makes it uniquely suited for intrusion detection on high
speed networks. Simply put, we believe Bro to be the only IDS able to handle both the speed and analytic
complexity necessary for comprehensive network monitoring at 100G.
The traditional approach at Berkeley Lab was a packet broker device that handled tap aggregation, filtering,
and distribution. This aggregated output was fed via a single 10G link to a device capable of further
distributing the traffic to 1G worker machines that comprised a Bro cluster. A more modern approach for
Bro clustering is a single multicore server, directly feeding a 10G NIC and handling the distribution of traffic
with specialized network drivers and individual CPU cores as workers. After experimenting with this
approach, we felt we could scale well beyond 10G of traffic by combining the two approaches (many
multicore servers, each operating on 10G worth of traffic).
Dividing 100G traffic into smaller pieces presents challenges for attack detection and monitoring. For
example, packets from a site-wide network scan might be separated across multiple IDS servers, which
may not see enough traffic individually to trigger detection thresholds. The IDS must provide a way to
correlate the traffic across multiple nodes to identify attacks that span more than one traffic stream. In a
cluster configuration, Bro has the capability to exchange low-level analysis state in real time, giving it a
global picture of network activity to provide accurate intrusion detection.
Correctly scaling a Bro cluster is highly dependent on the site's traffic patterns, the hardware capabilities of
the IDS, and the running policies. Local experimentation is needed to determine not only the correct number
of workers, but also the correct balance of workers to proxies. We suggest starting with roughly 1Gbps of
traffic per worker and scaling up or down based on traffic loss or cluster instability. We also suggest
starting with one proxy for every 10 workers; this ratio should be adjusted based on performance and
packet loss until stability is maintained.
For our 100G Bro cluster setup, we run one manager, five proxies, and 50 workers. This architecture
first splits the 100Gb link traffic into five streams, each sent over a 10Gb link to one Bro host in the Bro
cluster. Each Bro host splits its 10Gb stream into ten 1Gb streams via a Myricom network card. Each of the
10 Bro processes running on a Bro host analyzes one of the 1Gb streams.
Our choice of architecture splits the traffic in such a manner that each of the workers sees about
1/50th of the total traffic volume. Current peak network traffic is roughly 20Gbps, so each Bro process
analyzes about 400Mbps of the traffic.
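The sizing arithmetic behind these numbers is simple; the sketch below just restates the figures from this section:

# Back-of-the-envelope cluster sizing, using the figures from this section.
streams_to_hosts  = 5                               # 10G streams from the Arista
workers_per_host  = 10                              # Myricom rings / Bro workers per host
total_workers     = streams_to_hosts * workers_per_host      # 50
peak_traffic_gbps = 20                              # current peak on the link
per_worker_gbps   = peak_traffic_gbps / total_workers         # ~0.4 Gbps (400 Mbps)
proxies           = total_workers // 10             # one proxy per 10 workers -> 5
print(total_workers, per_worker_gbps, proxies)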
The following section of this paper, 'Build Guide,' provides the details of our solution.


Build Guide
Overview
Three hardware components are needed for our build. The first is the set of Arista devices, the second is
the Myricom NIC and driver installed on each of the Bro hosts, and the third is the cluster of commodity
hardware that runs Bro. In Appendix C we provide the line items for the hardware used for our build. We
are not providing line item pricing details, but our overall build cost was under US$400,000, which included
maintenance and support for one year for most of the hardware components. The following three sections
describe each of the components of the build in more detail.

Arista
Certain Arista models support a feature called DANZ (Data Analyzer) which can place the device into
tap-only mode. This disables all switching and routing functions and allows for configurations unique to tap
aggregation, filtering, and distribution. With this feature set, we are able to create port groups specific to
input and output, apply filters to them, and distribute the traffic groups as needed for monitoring.
The Arista hardware configuration necessary to handle 100G includes two devices: one 7504 chassis,
which holds the 100G line card, and one 7150 to which it is connected. At the time of purchase, the 7500
series did not have software support for symmetric hashing, so this critical feature runs on the 7150 device.
Support for symmetric hashing is coming to the 7500 series in a future code release, which will eliminate
the need for the 7150.
All of our external links are fed into the 7504 chassis, including both 100G and 10G connections. Since we
are using optical taps, two ports must be reserved for each tapped link to account for the receive and
transmit sides of the fiber-optic connection. Input port groups are created to aggregate the links together. A
series of connections must be made between the 7504 device and the 7150. We accomplished this using
10G twinax copper cables, which have a lower cost than optical fiber connections. We initially made five
10G twinax connections between the 7504 and the 7150.
A LAG combines all of these physical connection ports into a trunked connection between the two devices.
A tap aggregation group is then created, which connects the input port group to the output port group,
essentially passing all of the collected traffic through the 7504 to the 7150 device. Once at the
7150, traffic moves through a similar group configuration. An input group is created for the connection ports
between the two devices, and an output port group is created for the 10G ports to the Bro cluster hosts.
Though the input ports are collected together and hashed on the 7504, recall that symmetric hashing
is not possible on the 7504 at the time of this writing, so the traffic is symmetrically hashed as it ingresses
the 7150. This ensures even balancing of all incoming network flows, allowing each flow to be directed in its
entirety to a distinct Bro worker machine. The 7150 is also where we apply static filters and dynamic
shunting filters to limit the amount of traffic being sent to the Bro cluster.
One of the primary reasons that we chose the Arista was the ability to do dynamic shunting. Through use of
Arista's JSON API, we are able to push simple access list changes to the device from our Bro cluster. This
is described in further detail in the Shunting section of this paper.
After the initial setup, there is little configuration necessary for either of the Arista devices. Configuration
changes are needed only when a new port or tap is added on the input or a new cluster node or monitoring
device is added on the output. See Appendix A for line-by-line configuration of each Arista device.

Myricom


A necessary component of the cluster-in-a-box approach is a NIC and drivers that allow the system to
distribute the traffic to different IDS analysis threads for distributed processing. The majority of the
production monitoring systems at Berkeley Lab run FreeBSD, which somewhat limited our options for the
NICs and software drivers. The most commonly used approach on Linux for high-volume packet capture is
PF_RING, which doesn't run on FreeBSD. We have used Myricom NICs for several years and decided to
use Myricom's enhanced drivers to facilitate traffic distribution within the cluster. Myricom NICs are a
commodity product and are relatively inexpensive. Several other specialized NIC options exist at greater
cost, including Endace and Napatech.
Myricom's driver is called Sniffer10G, and it is a paid, licensed feature that runs exclusively on Myricom's
10Gb network cards. For FreeBSD the paid Sniffer10G driver is mutually exclusive with the standard
Myricom FreeBSD driver. There are two versions of the software: v2 and v3. We experimented with both
versions; the critical difference is that v2 allows only a single application to connect to the traffic streams
(Bro, for example), and no other operations can be performed on the traffic (e.g., a tcpdump). The v3 code
allows multiple applications to connect to the traffic simultaneously, which is valuable for testing and for
analysis machines running multiple monitoring tools.
Once the card is installed and the license applied to the NIC, a kernel module must be loaded on the host.
Binary packages are available from Myricom for Linux, FreeBSD, and Windows, but no source is available.
A set of tools is installed with the Sniffer package that allows for configuration and diagnostics. Once the
system configuration is complete and the kernel driver installed, a new network interface is created, which
can be read by Bro or other monitoring tools.
The Bro configuration sets up the connection to the Myricom streams and allows them to maintain affinity to
distinct CPU cores for processing. It is important that the correct CPU pinning settings are used for the OS,
the number of physical cores, and the number of desired workers (see the discussion in the next section, Bro
Hardware). Configuration for the Myricom drivers is well documented and allows for control of hashing
values among other advanced options.
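For reference, a single worker stanza from our node.cfg (the full configuration appears in Appendix B) ties these pieces together: lb_method=myricom selects the Sniffer10G load balancing, lb_procs sets the number of rings and worker processes, the SNF_* environment variables configure the Sniffer10G rings, and pin_cpus supplies the CPU affinity discussed in the next section.

[worker1]
type=worker
host=100G01.lbl.gov
interface=myri0
lb_method=myricom
lb_procs=10
pin_cpus=3,5,7,9,11,13,15,17,19,21
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH,
SNF_DATARING_SIZE=0x100000000,SNF_NUM_RINGS=10,SNF_FLAGS=0x1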

Bro Hardware
The Bro cluster utilizes a manager node that oversees the operation of the entire cluster and acts as log
aggregator for the workers. The manager is assisted by proxies that facilitate communication between the
manager and the Bro worker nodes and within the cluster itself. Worker nodes process and analyze the
network traffic and pass events to the manager to generate logs and alerts. After some experimentation, five
proxies were chosen as the optimal number, allowing one proxy for each of the Bro hosts in the cluster. Our
experiments showed that more than one proxy per node could cause instability and fewer than one proxy
per node could cause congestion in the traffic analysis.
In our current setup, the 100G Bro cluster has five hardware nodes, which are commodity multicore
systems. Each node has a Myricom NIC installed and runs a single Bro proxy and 10 Bro workers. The
functionality of manager is also carried out by one of these nodes. This means that the manager box runs
10 workers, a proxy, and the manager processes. Our current run shows that although the manager box is
under more load, it handles the dual duties of worker and manager successfully.

Processor and CPU pinning: We chose Intel 3.5GHz Ivy Bridge dual hex-core (12 cores total) CPUs for
each of the cluster hosts. The most important consideration when choosing processors for Bro analysis is
the overall speed; we recommend purchasing the fastest processors possible. Hyperthreading is a
complication worth mentioning. We decided to enable hyperthreading because the Bro nodes are doing
other system and kernel processing in addition to Bro. Leaving a free core or two generally helps the system
perform better, and the system can take advantage of the hyperthreaded cores. When pinning CPU cores to
Bro worker processes, however, one must be cautious to use only physical cores. In Linux and FreeBSD the
numbering of physical cores is different, so the correct CPU identifiers must be set for only the physical
cores in the system. If the physical and hyperthreaded core are both selected for the same CPU, Bro will
overwhelm that core and analysis will be degraded. In the current setup we have chosen to run 10 workers
and one proxy per Bro cluster host, leaving two physical cores free for other system processes.
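As a concrete illustration, and only a sketch, the pin_cpus list in Appendix B can be derived as below. The assumption (which matches our FreeBSD hosts but should be verified on any other system) is that the 12 physical cores appear at the odd CPU IDs and that the first and last of them are left free for other processes.

# Sketch: derive a pin_cpus list that avoids hyperthread siblings.
# Assumption (matching Appendix B on our FreeBSD hosts): the 12 physical
# cores appear at odd CPU IDs 1,3,...,23; leave the first and last free.
def pin_list(num_workers=10):
    physical_ids = list(range(1, 24, 2))   # [1, 3, 5, ..., 23]
    usable = physical_ids[1:-1]            # reserve 1 and 23 for other processes
    return usable[:num_workers]

print(pin_list())  # -> [3, 5, 7, 9, 11, 13, 15, 17, 19, 21], as in Appendix B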

RAM: Each of the cluster hosts is equipped with 128GB DDR3 1600 MHz ECC/REG RAM. In our
environment, we observe that an average Bro worker consumes about 6GB of RAM if the cluster is
performing deep packet analysis with extended policies. Since each of our hosts is running 10 workers, at
6GB/worker, we need 60GB of RAM at minimum. 128GB is installed for capacity planning, to scale up the
number of workers if necessary or to enable more resource-intensive policies.

Disk: The current hardware setup comprises a mirror of two Intel 6Gb/s 2.5" 120GB SSD drives for the OS
and a RAID6 of six WD1000CHTZ 10K RPM 6Gb/s 1TB SATA drives for the /data partition. Logs are
stored temporarily in /data before being moved to a separate archive machine. Using SSDs for the OS
partition provides us with a fast /tmp along with additional speed for swap space.

Please see Appendix C for Bro hardware procurement details.

Performance
As we have outlined, our current setup splits inbound traffic into five 10Gb streams (using the Arista
hardware), which are further split at the Bro hosts into roughly 1Gbps flows to feed to each Bro worker
process (using the Myricom Sniffer drivers). This gives us a total theoretical monitoring capacity of 50Gbps.
This is sufficient for our production network, which currently operates with sustained traffic of about
2-3Gbps and peaks going as high as 24Gbps. We plan to scale beyond 50Gbps by adding additional Bro
nodes in the future; the other distribution details and configuration remain the same.
In our capacity planning we designed our system to handle what we expected to be the overall volume of
our network connection (approaching 50Gbps). By using the shunting capabilities and reducing the analysis
of large data flows, our system scales well beyond what we initially expected.
In the following sections we discuss some specific performance measurements: traffic distribution to the
cluster, CPU utilization of the Bro cluster manager and workers, packet drops or traffic capture losses, and
the efficacy of the shunting mechanism.


Traffic Distribution to the Cluster

Figure 4: Graphs of the total traffic to the cluster and separate graphs for each Bro node

As shown by the Node 1 through Node 5 graphs, the average volume seen by each Bro cluster node is
generally under 1Gbps.


Bro Cluster CPU Utilization and Performance

Figure 5 shows the CPU performance of the Bro cluster's manager and worker nodes over a span of four
days through a series of graphs. These graphs reveal that the average CPU utilization of an individual
worker is about 20%, while the Bro manager sees an average CPU utilization of about 10% with frequent
spikes of up to 80%. Assuming that traffic is being processed correctly, low CPU utilization while performing
in-depth protocol analysis is a good indicator of nearly zero packet drops.

Figure 5: CPU utilization of the manager and worker Bro processes across the five physical nodes of the
cluster. Each of the worker panels (Node 1 through Node 5, each running 10 workers) shows CPU
utilization of about 20%; the manager panel also shows about 20% with frequent peaks of 80%.


Performance Measure of Capture Loss

This section covers the amount of packet drops seen across the 50 workers. These packet drops are
calculated using the capture_loss.bro policy, which keeps track of the TCP state of connections; any
major gaps in the content are inferred to be packet drops. The bars in Figure 6 represent the percentage of
packets dropped out of the total number of packets over 15 minutes. The figure shows the average for
packet drops is around 0.05% sustained, with peaks of about 3-4%.

Figure 6: Percentage of packet drops seen across all 50 workers of the cluster
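For readers unfamiliar with the metric, the idea behind capture_loss.bro can be sketched as follows (the real policy is written in Bro's scripting language and runs per worker; this Python fragment only illustrates the ratio it reports):

# Rough sketch of the capture-loss idea: per measurement interval, a worker
# counts TCP acknowledgements seen and "gaps" (acks for data it never saw);
# the ratio approximates the fraction of packets that were not captured.
def capture_loss_percent(gaps: int, acks: int) -> float:
    if acks == 0:
        return 0.0
    return 100.0 * gaps / acks

# Example: 50 gaps against 100,000 acks in a 15-minute interval
print(capture_loss_percent(50, 100_000))   # -> 0.05 (%)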

Shunting

Shunting takes advantage of the heavy-tail effect to dynamically reduce the amount of traffic processed by
the IDS. The heavy-tail effect refers to the observation that a small number of network flows will dominate
the overall volume of data transferred in a given time. These data transfers are also called bulk transfers or
elephant flows. By identifying bulk transfers, which are well known and understood from a security
perspective, we can use shunting to eliminate these flows from processing by the IDS, further reducing the
processing needs and cost of the system. By using the shunting system we have outlined, we have
achieved massive savings in IDS processing, in some circumstances reducing the amount of traffic
processed by a factor of 10.
Bro's reaction framework provides the capability to identify and classify these large, uninteresting traffic
connections and communicate with the traffic distribution device (in our case, the Arista 7150) to stop
sending the data component of the network flow for further analysis. These connections can be matched on
specific source and/or destination addresses or any combination of traffic characteristics that can be
defined in Bro's policy language. In our case we are shunting well-known bulk data transfers which use the
GridFTP protocol, as well as large FTP and HTTP connections over a specific size threshold.
On the Arista the shunting is done with a simple four-tuple ACL (src host, src port, dst host, dst port), which
can be dynamically controlled by Bro through use of a script called Dumbno. The ACL is designed to pass
all control packets, which means Bro maintains accurate information about the connection, including total
size and duration. This provides critical metadata about the ongoing data transfer and also allows Bro to
remove the corresponding ACL once a shunted connection has ended.
Dumbno uses Arista's JSON API to communicate directly between Bro and the Arista device to create and
manage the ACLs. Dumbno was written by Justin Azoff (currently at NCSA) and is available from
https://github.com/JustinAzoff/dumbno.
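To illustrate the kind of call Dumbno makes, the sketch below uses Arista's JSON-RPC eAPI (the management api http-commands interface enabled in Appendix A) to append one deny rule to the bulk_1 ACL. The hostname, credentials, and sequence number are placeholders, and the real Dumbno daemon adds sequence management, rule tracking, and cleanup on top of this.

# Minimal sketch of pushing a shunt ACL entry to the Arista via its JSON-RPC
# eAPI. Hostname, credentials, ACL name, and sequence number are placeholders;
# Dumbno manages sequence numbers and rule expiry for us in production.
from jsonrpclib import Server

switch = Server("https://admin:password@arista7150.example.org/command-api")

def shunt_flow(src_ip, src_port, dst_ip, dst_port, seq=33475, acl="bulk_1"):
    # Stop copying the data packets of one flow to the cluster; the earlier
    # permit rules for SYN/FIN/RST still pass control packets, so Bro keeps
    # seeing the connection's state changes.
    rule = "%d deny tcp host %s eq %d host %s eq %d" % (
        seq, src_ip, src_port, dst_ip, dst_port)
    switch.runCmds(1, [
        "enable",
        "configure",
        "ip access-list %s" % acl,
        rule,
    ])

shunt_flow("54.183.14.226", 80, "131.243.191.181", 47000)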

Components of Shunting
There are three components to shunting.

1. Bro's reaction framework: This module of Bro is primarily responsible for identifying the bulk flows
and triggering a block. Once Bro identifies a connection as bulk (based on various protocol-level
heuristics), Bro triggers a Bulk::connection_detected event. The following code snippet highlights
the event definition as well as the React::shunt() function, which is called to trigger an action to
shunt a connection:

event Bulk::connection_detected(c: connection)
    {
    local action = (c$orig$size > c$resp$size) ? React::SHUNT_ORIG : React::SHUNT_RESP;
    React::shunt(c, "bulk", action);
    }

The React::shunt() function calls an external script to connect with the Dumbno daemon (see
below), which generates and implements ACLs on the Arista in real time.

2. Dumbno daemon/script: The Dumbno daemon is a Python script that facilitates communication
between Bro and the Arista device. The box below illustrates an ACL rule triggered by the Dumbno
script after Bro determines that a connection is a bulk transfer:

2015-01-30 04:07:46,857 INFO op=ADD seq=33475 rule=u'tcp host 54.183.14.226
eq 80 host 131.243.191.181 eq 47000'

2015-01-30 04:08:44,983 INFO op=REMOVE acl=bulk_1 seq=33475 rule="deny tcp
host 54.183.14.226 eq www host 131.243.191.181 eq 47000" matches=0

The op=ADD operation is performed when a bulk connection is identified, and Bro's React::shunt() function
supplies the connection specifics to Dumbno. The Arista continues to send control packets to Bro while
filtering the data packets. When the connection completes (based on the TCP state or Bro's internal
timers), Bro triggers another call to Dumbno, which processes the op=REMOVE operation, removing the
ACL from the Arista. By dynamically removing the ACL after completion, the number of ACLs is prevented
from growing until resources are exhausted.

The box below shows a specific shunted HTTP connection from the Bro connection log. This connection
lasted for ~280 seconds and was shunted when the connection reached 150 MB in size. All data before
150 MB was analyzed by Bro, as well as the control packets, which closed the connection.

Jan3004:07:11CAlIv61BX3YxDFSdod131.243.191.18147000
54.183.14.22680tcphttp280.754874129154300309
SFT2154880ShADadfFr426232216689108240158909881
(empty)worker35

Shunting Effectiveness
Figure 7 illustrates the effectiveness of shunting. Bro has identified connections (as illustrated by the yellow
series) and instructed the Arista to stop sending the remaining data of those connections to the cluster for
analysis. The figure shows that, on average, shunting reduces the traffic from around 10Gbps in the original
stream to about 1Gbps sent to the cluster. The "To IDS" series highlights the total traffic seen by the Bro
cluster after shunting. The spikes show several large flows of 8-10.5Gbps being removed from analysis
through the shunting mechanism. These large spikes generally occur when applications like GridFTP or
SSH are doing long-running, large data transfers.

Figure 7: Shunting in action: bytes filtered by active shunting


Figure 8 shows the number of ACL operations per day where the Bro cluster identified and shunted
connections which were characterized as uninteresting and presenting no security risk. For the current
100G cluster setup, we identify GridFTP and any connections >2GB (the vast majority of such connections
are SSH, HTTP, or FTP data transfers) as potential candidates for shunting.

Figure 8: ACL transactions showing the number of shunting operations executed on the Arista


Acknowledgements
This work was supported in part by Wayne Jones, the Acting Associate Administrator for Information
Management and Chief Information Officer within the Office of the Chief Information Officer at the National
Nuclear Security Administration within the U.S. Department of Energy.

Strategic guidance and project support was provided by Rosio Alvarez, Ph.D., Chief Information Officer at
Berkeley Lab.

We would also like to thank the following people for their technical support of this project: Robin Sommer,
Scott Campbell, Seth Hall, Justin Azoff, James Welcher, Craig Leres, Partha Banerjee, Miguel Salazar, and
Vern Paxson.

Earlier versions of this document were improved thanks to editorial reviews by Michael Jennings, Adam
Slagell, Scott Campbell, Robin Sommer, and Rune Stromsness. We also thank Jessica Scully for technical
editing.

The following organizations also provided technical guidance or hardware to support our evaluation
process: ICSI, Broala, Arista, Brocade, and Endace.

Please direct questions or comments about this document to security@lbl.gov.


References
This section provides links to relevant background reading or reference material for the technology used in
our 100G IDS implementation.

1. Campbell, Scott, and Jason Lee, "Intrusion Detection at 100G," the International Conference for
High Performance Computing, Networking, Storage, and Analysis, November 14, 2011.

2. Campbell, Scott, and Jason Lee, "Prototyping a 100G Monitoring System," 20th Euromicro
International Conference on Parallel, Distributed, and Network-Based Processing (PDP 2012),
February 12, 2012, http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6169563.

3. Paxson, Vern, "Bro: A System for Detecting Network Intruders in Real-Time," in Proceedings of the
7th USENIX Security Symposium, San Antonio, TX, 1998.

4. Leland, W., M. Taqqu, W. Willinger, and D. Wilson, "On the Self-Similar Nature of Ethernet Traffic,"
Proceedings, SIGCOMM '93, September 1993.

5. Vallentin, M., R. Sommer, J. Lee, C. Leres, V. Paxson, and B. Tierney, "The NIDS Cluster:
Scalable, Stateful Network Intrusion Detection on Commodity Hardware," Proceedings RAID 2007,
http://www.icir.org/robin/papers/raid07.pdf.

6. Weaver, N., V. Paxson, and J. Gonzalez, "The Shunt: An FPGA-Based Accelerator for Network
Intrusion Prevention," Proceedings FPGA '07, February 2007,
http://www.icir.org/vern/papers/shunt-fpga-2007.pdf.

7. Schneider, F., J. Wallerich, and A. Feldmann, "Packet Capture in 10-Gigabit Ethernet
Environments Using Contemporary Commodity Hardware," PAM 2007, Louvain-la-Neuve, Belgium.

8. PF_RING: High-speed packet capture, filtering and analysis (n.d.), retrieved February 20, 2015,
from http://www.ntop.org/products/pf_ring/.

9. Myricom Sniffer10G: Sniffer10G Documentation and FAQ (n.d.), retrieved February 20, 2015, from
https://www.myricom.com/software/sniffer10g.html.

10. Endace DAG Data Capture Cards (n.d.), retrieved February 20, 2015, from
http://www.emulex.com/products/network-visibility-products-and-services/endace-dag-data-capture-cards/features/.

11. Napatech Products (n.d.), retrieved February 20, 2015, from http://www.napatech.com/products.


Appendices
Appendix A: Arista Config
Arista 7504 configuration
! device: arista7504 (DCS-7504, EOS-4.14.4F)
!
! boot system flash:/EOS-4.14.4F.swi
!
transceiver qsfp default-mode 4x10G
!
hostname arista7504
ip name-server vrf default 131.243.5.1
ip domain-name lbl.gov
!
ntp server tic.lbl.gov
ntp server toc.lbl.gov
!
ptp hardware-sync interval 1000
!
spanning-tree mode mstp
!
no aaa root
!
username admin role network-admin secret
!
tap aggregation
   mode exclusive
!
interface Port-Channel1
   description arista7150
   switchport mode tool
   switchport tool group set CENIC-er2 ESnet-er1-100G UCB-er1 UCB-er2
!
interface Ethernet3/1/1
   description "ESnet 100G RX"
   switchport mode tap
   switchport tap default group ESnet-er1-100G
!
interface Ethernet3/1/2
!
...
!
interface Ethernet3/2/1
   description "ESnet 100G TX"
   switchport mode tap
   switchport tap default group ESnet-er1-100G
!
interface Ethernet3/2/2
!
...
!
interface Ethernet4/5
   description "in from er1 UCB tap rx"
   switchport mode tap
   switchport tap default group UCB-er1
!
interface Ethernet4/6
   description "in from er1 UCB tap tx"
   switchport mode tap
   switchport tap default group UCB-er1
!
interface Ethernet4/7
   description "in from er2 UCB tap rx"
   switchport mode tap
   switchport tap default group UCB-er2
!
interface Ethernet4/8
   description "in from er2 UCB tap tx"
   switchport mode tap
   switchport tap default group UCB-er2
!
interface Ethernet4/9
   description "in from CENIC er2 tap rx"
   switchport mode tap
   switchport tap default group CENIC-er2
!
interface Ethernet4/10
   description "in from CENIC er2 tap tx"
   switchport mode tap
   switchport tap default group CENIC-er2
!
interface Ethernet4/11
!
...
!
interface Ethernet4/17
   description "LAG to arista7150"
   channel-group 1 mode on
   switchport mode tool
!
interface Ethernet4/18
   description "LAG to arista7150"
   channel-group 1 mode on
   switchport mode tool
!
interface Ethernet4/19
   description "LAG to arista7150"
   channel-group 1 mode on
   switchport mode tool
!
interface Ethernet4/20
   description "LAG to arista7150"
   channel-group 1 mode on
   switchport mode tool
!
interface Ethernet4/21
   description "LAG to arista7150"
   channel-group 1 mode on
   switchport mode tool
!
...
!
interface Management1/1
   ip address
!
no ip routing
!
management api http-commands
   no shutdown
!
!
end


Arista 7150 configuration

! device: arista7150 (DCS-7150S-52-CL, EOS-4.13.9M)
!
! boot system flash:/EOS-4.13.9M.swi
!
transceiver qsfp default-mode 4x10G
!
load-balance policies
   load-balance fm6000 profile symmetric
      no fields mac
      fields ip protocol dst-ip dst-port src-ip src-port
      distribution symmetric-hash mac-ip
!
!
hostname arista7150
ip name-server vrf default 131.243.5.1
ip domain-name lbl.gov
!
ntp server tic.lbl.gov
ntp server toc.lbl.gov
!
spanning-tree mode mstp
!
no aaa root
!
username admin role network-admin secret
!
tap aggregation
   mode exclusive
!
interface Port-Channel1
   description arista7504 in
   ingress load-balance profile symmetric
   ip access-group bulk_1 in
   switchport mode tap
   switchport tap default group 100G_test
!
interface Port-Channel2
   description 100G out
   switchport mode tool
   switchport tool allowed vlan 1,517,1204,1206,1411,1611
   switchport tool group set 100G_test
!
interface Ethernet12
   ingress load-balance profile symmetric
   ip access-group bulk_1 in
   switchport mode tap
!
...
!
interface Ethernet17
   description Link to arista7504 #1
   channel-group 1 mode on
   switchport mode tap
!
interface Ethernet18
   description Link to arista7504 #2
   channel-group 1 mode on
   switchport mode tap
!
interface Ethernet19
   description Link to arista7504 #3
   channel-group 1 mode on
   switchport mode tap
!
interface Ethernet20
   description Link to arista7504 #4
   channel-group 1 mode on
   switchport mode tap
!
interface Ethernet21
   description Link to arista7504 #5
   channel-group 1 mode on
   switchport mode tap
!
...
!
interface Ethernet36
   description 100Gmgr
   channel-group 2 mode on
   switchport mode tool
!
interface Ethernet37
   description 100G01
   channel-group 2 mode on
!
interface Ethernet38
   description 100G02
   channel-group 2 mode on
!
interface Ethernet39
   description 100G03
   channel-group 2 mode on
!
interface Ethernet40
   description 100G04
   channel-group 2 mode on
!
...
!
interface Management1
   ip address
!
ip access-list bulk_1
   statistics per-entry
   10 permit tcp any any fin
   20 permit tcp any any syn
   30 permit tcp any any rst
   100001 permit ip any any
!
ip route
!
no ip routing
!
management api http-commands
   no shutdown
!
!
end


Appendix B: Cluster Configuration (FreeBSD)

[bro@100Gmgr /usr/local/bro/etc]$ cat node.cfg

## Below is an example clustered configuration.

[manager]
type=manager
host=100Gmgr.lbl.gov

[proxy1]
type=proxy
host=100G01.lbl.gov
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH

[worker1]
type=worker
host=100G01.lbl.gov
interface=myri0
lb_method=myricom
lb_procs=10
pin_cpus=3,5,7,9,11,13,15,17,19,21
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH,
SNF_DATARING_SIZE=0x100000000,SNF_NUM_RINGS=10,SNF_FLAGS=0x1

[proxy2]
type=proxy
host=100G02.lbl.gov
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH

[worker2]
type=worker
host=100G02.lbl.gov
interface=myri0
lb_method=myricom
lb_procs=10
pin_cpus=3,5,7,9,11,13,15,17,19,21
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH,
SNF_DATARING_SIZE=0x100000000,SNF_NUM_RINGS=10,SNF_FLAGS=0x1

[proxy3]
type=proxy
host=100G03.lbl.gov
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH

[worker3]
type=worker
host=100G03.lbl.gov
interface=myri0
lb_method=myricom
lb_procs=10
pin_cpus=3,5,7,9,11,13,15,17,19,21
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH,
SNF_DATARING_SIZE=0x100000000,SNF_NUM_RINGS=10,SNF_FLAGS=0x1

[proxy4]
type=proxy
host=100G04.lbl.gov
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH

[worker4]
type=worker
host=100G04.lbl.gov
interface=myri0
lb_method=myricom
lb_procs=10
pin_cpus=3,5,7,9,11,13,15,17,19,21
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH,
SNF_DATARING_SIZE=0x100000000,SNF_NUM_RINGS=10,SNF_FLAGS=0x1

[proxy5]
type=proxy
host=100Gmgr.lbl.gov
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH

[worker5]
type=worker
host=100Gmgr.lbl.gov
interface=myri0
lb_method=myricom
lb_procs=10
pin_cpus=3,5,7,9,11,13,15,17,19,21
env_vars=LD_LIBRARY_PATH=/usr/local/opt/snf/lib:$PATH,
SNF_DATARING_SIZE=0x100000000,SNF_NUM_RINGS=10,SNF_FLAGS=0x1


Appendix C: Procurement Details

Arista Procurement
1. Arista Hardware: GSA-DCS-7504E-BND
   Arista 7504E chassis bundle. Includes 7504 chassis, 4x 2900PS, 6x Fabric-E
   modules, 1x Supervisor-E

2. Arista Hardware: GSA-DCS-7500E-SUP#
   Supervisor module for 7500E series chassis (ships in chassis)

3. Arista Hardware: GSA-DCS-7500E-6C2-LC#
   6-port 100GbE CFP2 wire-speed line card for 7500E Series (ships in chassis)

4. Arista Hardware: GSA-DCS-7500E-48S-LC#
   48-port 10GbE SFP+ wire-speed line card for 7500E Series (ships in chassis)

5. Arista Hardware: GSA-CFP2-100G-LR4
   100G LR transceiver CFP2, 10KM

6. Arista Hardware: GSA-DCS-7150S-52-CL-F
   Arista 7150S, 52x 10GbE (SFP+) switch with clock, front-to-rear air, 2x AC,
   2x C13-C14 cords

7. Arista Hardware: GSA-LIC-FIX-2-Z
   Monitoring & provisioning license for Arista fixed switches, 40-128 port 10G
   (ZTP, LANZ, TapAgg, API, Time-stamping, OpenFlow)

8. Arista Hardware: GSA-SFP-10G-SR
   10GBASE-SR SFP+ (Short Reach)

9. Arista Hardware: GSA-SFP-10G-LR
   10GBASE-LR SFP+ (Long Reach)

10. Arista Hardware: GSA-SFP-1G-T
    1000BASE-T SFP (RJ-45 Copper)

11. Arista Hardware: GSA-CAB-SFP-SFP-0.5M
    10GBASE-CR twinax copper cable with SFP+ connectors on both ends (0.5 m)

12. Arista Hardware: GSA-CAB-SFP-SFP-3M
    10GBASE-CR twinax copper cable with SFP+ connectors on both ends (3 m)

Bro hardware Procurement

We purchased five of the following pieces of hardware through a local small hardware vendor. Note that the
Myricom network cards are included.

1. FT-E5-2643V2/2U, Intel Dual Xeon (Ivy Bridge) E5-2643 V2 3.5GHz 2U
2. Motherboard SM, X9DRi-F
3. Intel E5-2643 V2 3.5GHz Ivy Bridge (2 x 6 = 12 cores)
4. Copper Base CP0219 CPU cooler, active
5. 128GB DDR III 1600MHz ECC/REG (8 x 16GB modules installed)
6. On-board 10/100/1000
7. On-board VGA
8. On-board IPMI 2.0 via 3rd LAN
9. Intel 120GB SSD, 6Gb/s, 2.5"
10. WD1000CHTZ 1TB 10K RPM 6Gb/s SATA
11. 10G-PCIE2-8C2-2S+ Myricom 10G "Gen2" (5 GT/s) PCI Express NIC with two SFP+
12. Myricom 10G SR modules
13. LSI 9271-8i 8-port RAID
14. LSI CacheVault LSI00297
15. LSI mounting board LSI00291
16. SMCi chassis 213LT-QR720LPB (black)
17. 720W high-efficiency (94%+) redundant power supplies

Myricom Drivers Procurement

In addition to purchasing the Myricom hardware, additional drivers must be purchased to use the advanced
traffic distribution features of the Myricom cards. Myricom requires the serial number of the card to link it to
the driver license.

1. 10G-SNF3-LICENSE Version 3 license


Appendix D: Photo of Production Solution

