
CHAPTER 2

STUDY AND ANALYSIS OF WINDOWS BASED BOTNETS

The goal of botnet study and analysis is to understand how a specific
piece of botnet malware functions, what attacks it can launch, and how it
changes the behavior of network and system parameters, so that defenses can
be built against botnet infection. Better analysis and understanding of
botnets will help researchers develop new technologies to detect and defeat
one of the biggest cyber security threats. Knowledge of existing detection
techniques and their limitations will likewise lead to the development of
better detection techniques. The main focus of this chapter is to provide a
concise overview of botnets and to analyze some of the existing techniques
for botnet detection.

The major reason for the remarkable success and growth of botnets is their
well-organized and planned formation, generation and propagation. Botnets
employ different techniques, topologies and communication protocols at
different stages of their lifecycle. Hence, understanding the botnet
lifecycle helps to identify botnets at an early stage.

2.1 BOTNET LIFE CYCLE

According to Feily et al. (2009), a typical botnet is created and
maintained in five phases, as shown in Figure 2.1.

Figure 2.1 Typical botnet lifecycle (Source: Feily et al 2009)

2.1.1 Initial Infection

During this phase, a system can be infected in different ways, such as
accidental execution of malicious code through a drive-by download,
exploitation of system vulnerabilities, access through engineered backdoors,
and social engineering. Once the system is infected with bot malware, it
turns into a zombie.

Drive-by download

A drive-by download happens when a user visits a website, views an e-mail
message or clicks a deceptive pop-up window. The download, often a computer
virus, spyware, malware or crimeware, may happen without the user’s
knowledge. The botmaster uses this technique by planting bot binaries on web
servers and enticing people to visit the site. When the user loads a certain
page, the bot binaries are installed automatically without user interaction,
usually by exploiting browser bugs, misconfigurations or unsecured ActiveX
controls.

The TDL-4 botnet (Securelist website) is a highly advanced,
fourth-generation botnet found worldwide. The rootkit associated with this
botnet is named Tidserv; it is bundled with rogue security software and
infects low-level system printer and file system drivers. It also blocks
system updates and disables some anti-malware programs. It then modifies the
Master Boot Record (MBR) on the hard disk so that it is loaded and executed
before the operating system every time the system reboots.

The Torpig botnet, which is distributed with the Mebroot rootkit, infects
systems by injecting malicious Hyper Text Markup Language (HTML) and
JavaScript that exploit vulnerabilities in web browser plug-ins. If any such
exploit succeeds, a copy of the Mebroot rootkit is downloaded and executed on
the victim’s machine. Like Tidserv, Mebroot modifies the MBR to evade
detection by anti-malware programs.

Early variants of the Zeus botnet also adopted drive-by download for
initial infection, redirecting victims to a webpage that contains a malicious
Portable Document Format (PDF) file exploiting known vulnerabilities in the
Adobe Reader software. The Gumblar botnet exploits similar vulnerabilities in
Adobe Reader. The Asprox botnet launches Structured Query Language (SQL)
injection attacks against vulnerable pages based on Microsoft Active Server
Pages (MSASP) to inject malicious scripts that propagate malware.

Software vulnerabilities

The botmaster exploits a vulnerability in a running service to gain access
automatically and install bot malware without user interaction. Some variants
of the Conficker botnet use specially crafted Remote Procedure Call (RPC)
requests to infect new victims in the network. The requests trigger buffer
overflows, which allow the existing bots to send and install malware on the
victim machine without the user’s knowledge.

Backdoor

A backdoor is a means of accessing a computer program that bypasses
security mechanisms, in particular normal authentication. A programmer may
sometimes install a backdoor so that the program can be accessed for
troubleshooting or other purposes. However, attackers often use backdoors
that they detect or install themselves as part of an exploit. The backdoor
may subvert the system through a rootkit. To take advantage of existing
backdoors, malicious software routinely scans a list of ports for backdoors
left by others, including port 2745 (backdoor of the Bagle worm) and port
3410 (backdoor of the Optix Pro remote access Trojan). Most bot binaries use
this infection vector to propagate to other hosts in the network.

Social Engineering

Social engineering is a non-technical method used by hackers which relies
heavily on human interaction and often involves tricking people into breaking
normal security procedures. It is a powerful method of bot propagation
because it encompasses all methods that entice the user to willingly download
the bot binaries. The botmaster uses tactics such as baiting, phishing,
pretexting, spamming, spear phishing and tailgating to spread the bot
binaries. Some botnets exploit the culture of trust prevalent in social
networks by posting catchy messages through users’ hijacked accounts. For
example, Koobface tricked users into clicking a link pointing to a fake
YouTube website. The user was then asked to download a specific executable
file to watch the video; the file was actually malware that turned the system
into a zombie. Another popular medium for social engineering is e-mail with
interesting subjects and content that entices users to download attachments.
For example, the Srizbi/Reactor botnet was behind the Ron Paul spam campaign.
Storm sent spam e-mails with catchy subjects that contained malicious links
to install the bot binary on victim machines. Zeus used Facebook phishing and
fake billing e-mails from Verizon Wireless to initiate drive-by downloads.

2.1.2 Secondary Injection

After a successful initial infection, the infected system downloads and
executes a script known as shell-code in order to create a bot that is under
the control of the botmaster. The shell-code fetches the actual bot binaries
from a specific location using Trivial File Transfer Protocol (TFTP), File
Transfer Protocol (FTP), HTTP or peer-to-peer (P2P) networks. Once the bot
malware is installed, the victim machine turns into a zombie and runs the
malicious code. The bot malware starts automatically whenever the zombie
reboots.

2.1.3 Connection

During this phase, the zombie machine establishes a connection with the
command and control (C&C) server to receive instructions or updates. This
connection phase is scheduled every time the host is restarted, to assure the
botmaster that the bot is taking part in the botnet and is able to receive
commands to perform malicious activities. Therefore, the connection phase is
likely to occur several times during the bot lifecycle. This connection
procedure is also known as rallying: the process used by bots to discover
their C&C servers. Some commonly used rallying mechanisms are described
below:

Hardcoded IP address

In this method, the Internet Protocol (IP) address of a C&C server is
provided either as part of the bot binary, known as binary hardcoding, or
separately, known as seeding.

In binary hardcoding, the IP address of the C&C server is hardcoded into
the bot binary. Botnets gravitate towards binary hardcoding because it
eliminates the use of the Domain Name System (DNS), making their activities
stealthier. However, this is a rather primitive method of rallying. An
obvious pitfall is that reverse engineering of the bot binary may reveal the
C&C server, potentially leading to C&C server hijack. Another limitation is
that network administrators can easily blacklist C&C IPs at a network gateway
using an Access Control List (ACL), thereby severing the call-back channels
of all the bots.
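
The blacklisting step described above can be illustrated with a minimal
Python sketch. It is not a real gateway ACL: the blocklist entries are
hypothetical addresses taken from reserved documentation ranges, and the
function name is_blocked is an assumption made for illustration only.

```python
import ipaddress

# Hypothetical blocklist of C&C addresses, e.g. recovered by reverse
# engineering bot binaries. These are reserved documentation ranges,
# not real C&C servers.
BLOCKLIST = [
    ipaddress.ip_network("203.0.113.7/32"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(dst_ip: str) -> bool:
    """Return True if the destination lies inside any blocklisted network."""
    addr = ipaddress.ip_address(dst_ip)
    return any(addr in net for net in BLOCKLIST)
```

Because every bot carries the same hardcoded address, one matching entry is
enough to cut off the call-back traffic of all bots behind the gateway.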

Seeding is primarily used by P2P botnets. At the time of infection, the
bot is provided with an initial list of peers, a group of active botnet peers
that is regularly updated. The peer list is stored separately from the bot
binary and can be hidden anywhere on the infected machine under an
inconspicuous name. For example, Kelihos/Hlux, a P2P botnet, stored its peer
list in the compromised host’s Windows registry together with other
configuration details. Reverse engineering of the bot binary therefore does
not necessarily expose the peer list.

Some botnets use a combination of seeding and binary hardcoding, in which
initial seeding is done either by pre-seeding the victim machine’s Windows
registry with a peer list before actually running the malware, or by
obtaining the list from a small set of default hosts hardcoded into the bot
binary. The former case falls under seeding, while the latter is more typical
of binary hardcoding. Nugache is a good example of such a botnet.

Hardcoded Domain names

Like IP addresses, domain names belonging to C&C servers can be hardcoded
into the bot binary. From the botmaster’s point of view, this is a better
approach than IP address hardcoding. If the IP address associated with the
domain name is taken down, the botmaster can still carry out malicious
activities by mapping the domain name to a new IP address, with no update
required on the bot end. The Gumblar botnet initially connected to a fixed
domain called gumblar.cn, and the removal of this domain in May 2009
effectively shut Gumblar down. Gumblar later reappeared with multiple domain
names for rallying, making it harder to detect and stop.

Modern botnets primarily use one or more hardcoded domain names that DNS
servers resolve to different IP addresses over a short span of time. This
technique is known as a Fast Flux Service Network (FFSN). The term “fast
flux” is commonly associated with spam and phishing attacks, and “IP flux”
describes the outcome of rapidly and repeatedly changing the location to
which the domain name of an Internet host (A record) or its Name Server (NS
record) resolves. These changes take effect quickly because the DNS records
are given low time-to-live (TTL) cache values. Taking down malicious DNS
records is often more difficult than removing compromised IP addresses,
because many DNS records can be established for the same IP address. Storm
and Warezov were among the first botnets to adopt the fast-flux technique;
Wibimo is a more recent botnet that adopted it.
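
The two observable properties named above, low TTL values and many different
IP addresses for one domain, suggest a simple screening heuristic. The Python
sketch below is only an illustration under stated assumptions: the function
name flag_fast_flux, the input format and both thresholds (max_ttl,
min_distinct_ips) are hypothetical and are not taken from any cited work.

```python
from collections import defaultdict


def flag_fast_flux(observations, max_ttl=300, min_distinct_ips=5):
    """Flag domains that look like fast-flux candidates.

    observations: iterable of (domain, ip, ttl_seconds) tuples, one per
    observed DNS A-record answer. A domain is flagged when every observed
    TTL is at most max_ttl and at least min_distinct_ips different
    addresses were seen. Both thresholds are illustrative assumptions.
    """
    ips = defaultdict(set)
    highest_ttl = defaultdict(int)
    for domain, ip, ttl in observations:
        ips[domain].add(ip)
        highest_ttl[domain] = max(highest_ttl[domain], ttl)
    return sorted(
        domain
        for domain in ips
        if len(ips[domain]) >= min_distinct_ips
        and highest_ttl[domain] <= max_ttl
    )
```

Legitimate content delivery networks also rotate addresses with short TTLs,
so a heuristic of this kind can only shortlist domains for closer inspection.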

Dynamic domain names

Botnets can dynamically generate domain names using a Domain Generation
Algorithm (DGA) that is known to both the bot and the botmaster. This makes
it more difficult for static reputation systems to maintain an accurate list
of all possible C&C domains, and for the security community to hijack the
domains. Taking down a domain is a complicated process involving several
formalities. By the time an older domain is removed, the botnet has typically
already moved to a new domain. This is called bot-herding. Botnets that use
DGAs as their primary rallying mechanism include Conficker, Murofet, Torpig
and recent variants of Zeus.
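
The idea that the bot and the botmaster derive the same domain list
independently can be shown with a toy example. The Python sketch below is not
the algorithm of any botnet named above: the seed string, the use of SHA-256,
the label length and the reserved .example suffix are all assumptions chosen
for illustration. It is written from the defender's side, where an analyst
who has recovered the algorithm precomputes the candidate domains for a given
day.

```python
import hashlib
from datetime import date


def candidate_domains(day: date, count: int = 5,
                      seed: str = "example-seed", tld: str = ".example"):
    """Toy date-seeded generator: same inputs always give the same list."""
    domains = []
    for index in range(count):
        material = f"{seed}|{day.isoformat()}|{index}".encode()
        digest = hashlib.sha256(material).hexdigest()
        # Use the first 12 hex characters as the domain label.
        domains.append(digest[:12] + tld)
    return domains
```

Since the list changes with the date, a static blocklist goes stale within a
day, which is the difficulty for reputation systems noted above.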

2.1.4 Malicious Command and Control

After the connection phase, the actual botnet C&C activities start. The
botmaster uses the C&C channel to distribute commands to the bot army, and
the bots receive and execute those commands through the same channel. The
C&C channel enables the botmaster to remotely control a large number of bots
to conduct various illicit activities. It also reflects the organization of
the botnet: how it functions and receives commands, how it updates its
features for performing various tasks, and how it transmits data. The first
generation of botnets used the Internet Relay Chat (IRC) protocol for their
C&C structure. Because of its central point of failure, botmasters moved to
more robust and resilient C&C structures such as P2P and HTTP.

The C&C mechanism is the most important part of the botnet design. It is
used to instruct bots to carry out tasks such as spamming, phishing and
denial of service, and it directly determines the communication topology of
the botnet. Understanding the C&C mechanism is therefore of great importance
for detecting and defending against botnets. Some commonly used C&C
structures are described below:

Centralized C&C Model: The centralized model is characterized by a central
point that forwards messages between clients, as shown in Figure 2.2. In this
model, the botmaster selects a host to be the contact point for all bots;
that is, all bots are connected to a centralized server. When a victim is
infected, it connects to the centralized C&C server and waits for commands
from the botmaster. The botmaster can control thousands of infected bots
using the centralized model, which is easy to construct and efficient for
distributing commands. However, this model can be easily detected and
disabled. It uses three different topologies to connect the zombies in the
network: star, multi-server and hierarchical.

The star topology relies upon a single, centralized C&C channel to
communicate with all bot-infected machines, as seen in Figure 2.3. All the
bot-infected machines connect to this server to receive commands. This
preconfigured behavior is also known as “phoning home”. The star topology
gives the botnet efficient command and control transfer. However, if the
server fails, the botmaster can no longer communicate with the bots. The
topology is therefore very vulnerable to shutdown.

The multi-server topology is an extended version of the star topology. It
has multiple servers, each responsible for maintaining a subset of the
botnet, as shown in Figure 2.4. These servers communicate with each other as
they manage the botnet. This topology is more robust than the star topology
since it is more tolerant of failures: even if a number of servers fail, the
remaining servers keep the botnet operational. Typically, botmasters
distribute the C&C servers across different countries so that the bots in
those locations can communicate with a server efficiently.

Figure 2.2 Centralized C&C structure

Figure 2.3 Star topology structure

Figure 2.4 Multi server topology structure

Figure 2.5 Hierarchical topology structure

In the hierarchical topology, there is a hierarchy among the bots. At the
infection phase, some bots are turned into proxies. The proxy bots are
responsible for forwarding the commands they receive to the rest of the
botnet. Ordinary bots directly contact the proxy bots and not the botmaster.
The real C&C servers are hidden behind the proxies, as shown in Figure 2.5.
Therefore, the bots are not aware of the rest of the botnet or of where the
C&C server is located.

Botnets that employ a hierarchical C&C topology are very difficult to
remove, and it is difficult to estimate their size and structure, because
even if one of the bots is captured, it reveals no more than the IP address
or domain name of the responsible proxy bot. Hierarchical C&C topologies also
allow the botmaster to split the botnet into sub-botnets, so that the
botmaster can rent or sell services to other botmasters. Clearly, this type
of topology brings several advantages to botmasters. However, commands may
reach the bots with noticeable latency because they traverse multiple points.
This delay makes some malicious activities difficult to carry out.

Decentralized C&C Model: Due to the drawbacks of the centralized model,
botmasters shifted their structures to the decentralized (P2P) model shown in
Figure 2.6. This model is resilient to dynamic churn: communication is not
disrupted even after a number of bots are lost. There is no central server;
bots are connected to each other and act as both client and C&C server. This
model follows a random topology. A more recent development in botnet C&C,
known as fast-flux, has led some researchers to argue that it may no longer
be practically possible to disable a botnet by taking out the command server.
Fast-flux uses rapid DNS record updates to change the address of the botnet
controller very often among a large and redundant set, and is considerably
more resilient to interference than previous command approaches.

Figure 2.6 Decentralized C&C structure

Figure 2.7 Random topology structure

Botnets with a random topology have no centralized C&C infrastructure, as
shown in Figure 2.7. Instead, commands are injected into the botnet via any
zombie machine. These commands are often “signed” as authoritative, which
tells each bot to automatically propagate them to all other bots. Random
botnets are highly resilient to shutdown and hijacking because they lack a
centralized C&C and employ multiple communication paths between bots.
However, it is often easy to identify members of the botnet by monitoring a
single infected host and observing the external hosts with which it
communicates. Command latency is also a problem for random topology botnets.
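
The trade-off between the star and random topologies, fast delivery against
resilience to takedown, can be made concrete with a small graph model. The
Python sketch below is a simplified illustration, not a model of any real
botnet: the node names, the two toy graphs (STAR and RING, the latter
standing in for a sparse P2P overlay) and the function hop_counts are all
assumptions.

```python
from collections import deque


def hop_counts(adjacency, start, removed=frozenset()):
    """Breadth-first search: hops from start to each node still reachable.

    Nodes listed in removed are treated as taken down.
    """
    if start in removed:
        return {}
    hops = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency.get(node, ()):
            if neighbor not in removed and neighbor not in hops:
                hops[neighbor] = hops[node] + 1
                queue.append(neighbor)
    return hops


# Star: every bot talks only to the central server S.
STAR = {
    "S": ["b1", "b2", "b3", "b4"],
    "b1": ["S"], "b2": ["S"], "b3": ["S"], "b4": ["S"],
}

# Ring of peers: a deliberately sparse stand-in for a P2P overlay.
RING = {
    "b0": ["b1", "b4"], "b1": ["b0", "b2"], "b2": ["b1", "b3"],
    "b3": ["b2", "b4"], "b4": ["b3", "b0"],
}
```

In these toy graphs a command from S reaches every bot in one hop, but
removing S isolates each bot. The ring stays connected after losing one peer,
at the cost of longer paths, which corresponds to the command latency noted
for random topologies.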

2.1.5 Maintenance and Update

The last phase of the botnet life cycle is the maintenance of bots and the
update of the bot malware. Maintenance is a necessary step that keeps the
botmaster’s army of bots up to date for further coordinated attacks. There
are many reasons for updating the bot binaries, such as evading detection
techniques and adding functionality to the botnet. Server migration, which
moves the bots to a different C&C server, is also done when updating the bot
binary. This phase is usually considered a vulnerable one: because the
botmaster intends to broadcast updates as soon as possible, behavioral
patterns of the zombie machines belonging to the network may emerge and make
the botnet detectable.

2.2 BOTNET COMMUNICATION MECHANISMS

The communication mechanism is a part of the botnet design. The
communication between a bot and its C&C server, or between any two bots, can
be classified into two types: push-based and pull-based (Wang et al. 2009).

The push-based communication mechanism is also known as command
forwarding: the botmaster issues a command to some bots, and these bots
actively forward the command to others, as shown in Figure 2.8. In this way,
bots avoid periodically requesting or checking for a new command, which
reduces their risk of being detected. However, the inherent disadvantage of
this method is the amount of traffic that can be observed leaving the server,
or the tight timing correlation between monitored nodes that are part of the
same botnet, either of which makes infected hosts easier to detect. The
push-based method is mainly used in IRC botnets, whose bots connect to
selected channels and remain connected, waiting for commands and sending
responses. Popular IRC botnets that use push-based communication include
AgoBot, SDBot, Phatbot, SpyBot and GTbot.

The pull-based communication mechanism, also known as command
publishing/subscribing, is shown in Figure 2.9. The bots actively retrieve
commands from a place where the botmaster publishes them. The pull-based
method is mainly used in HTTP botnets, whose bots periodically visit certain
web servers to get updates or new commands. Popular HTTP botnets that use
pull-based communication include BlackEnergy, Festi, Grum, Zeus, SpyEye,
Citadel and TDL-4.
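
The periodic polling that characterizes pull-based bots leaves a timing
pattern that can be checked from request timestamps. The Python sketch below
is one possible heuristic, not a technique taken from the cited literature:
the function name looks_periodic, the use of the coefficient of variation and
both thresholds (max_cv, min_requests) are assumptions for illustration.

```python
from statistics import mean, pstdev


def looks_periodic(timestamps, max_cv=0.1, min_requests=5):
    """Return True if requests arrive at nearly constant intervals.

    timestamps: request times in seconds from one host to one server.
    The coefficient of variation (standard deviation divided by mean) of
    the gaps between requests is compared with max_cv.
    """
    if len(timestamps) < min_requests:
        return False
    ordered = sorted(timestamps)
    gaps = [later - earlier for earlier, later in zip(ordered, ordered[1:])]
    average = mean(gaps)
    if average == 0:
        return False
    return pstdev(gaps) / average <= max_cv
```

Bots that add random jitter to their polling interval would evade a fixed
threshold like this one, so the check is a starting point rather than a
detector.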

Figure 2.8 Push-based communication structure

Figure 2.9 Pull-based communication structure

2.3 ATTACKS BY BOTNETS

The main motivation of botmasters is to generate financial profit from
their attack activities. Figure 2.10 shows the percentage of various botnet
attacks according to the Arbor Networks report (2014).
[Bar chart: attack share in % (axis from 0% to 40%) by botnet attack type:
Spam, DDoS, Click fraud, ID theft, Others]

Figure 2.10 Botnet attacks (Source: Arbor Network)

Spamming

One of the most popular uses of botnets is spamming. Spam e-mails carry
content such as health, medicine, financial, stock, adult-service and watch
advertisements, and are delivered to a large number of recipients whether
they want them or not (Yeh et al. 2011). Bots can send a large number of spam
e-mails in a few seconds; Figure 2.11 shows various botnets and the number of
spam mails sent by them. About 70% to 90% of the world’s spam is now caused
by botnets, which concerns even the most experienced members of the Internet
security industry (Xie et al. 2008, Gao et al. 2010).

Distributed Denial of Service (DDoS) attacks

A DDoS attack is a major Internet threat because it can create a huge
volume of unwanted traffic. This kind of attack can prevent access to a
particular resource such as a website or server. Botnets are perfectly suited
for launching DDoS attacks: they consist of large numbers of remote zombie
machines whose cumulative bandwidth can reach multiple gigabytes of upstream
traffic per second. DDoS is a large-scale, coordinated attack on the
availability of the services of a victim system or network resource, launched
indirectly through many zombies by a botmaster. These attacks happen
regularly, and the profit obtained from these botnets is very high
(Al-Duwairi et al. 2013).

[Bar chart: spam per day in billions (axis from 0 to 50) for the spam
botnets Grum, Bobax, Rustock, Bagle, Mega-D, Maazben, Cutwail, Xarvester]

Figure 2.11 Distribution of spam botnets (Source: Secure Works)

One of the most popular DDoS botnets is BlackEnergy. Figure 2.12 shows the
number of BlackEnergy botnet attacks on its targets. On 6th February 2007,
DDoS attacks were carried out repeatedly by botnets, targeting several root
servers hosting the domain name service, including one maintained by the
United States Department of Defense (Holt 2013). On 3rd January 2011, eight
Tunisian Government websites were affected by DDoS attacks, including those
of the president, prime minister, ministry of industry, ministry of foreign
affairs and the stock exchange. The cybercriminal group carried out the
largest attack on 19th January 2011 against well-known websites such as those
of the Department of Justice (DOJ), the Federal Bureau of Investigation
(FBI), the White House, BMI.com and copyright.com, using a botnet-based DDoS
attack tool named LOIC. All affected sites were down for 10 minutes (Alomari
et al. 2012).

The DDoS attack that disrupted the website operations of Bank of America
and at least five other major banks in October 2012 used compromised websites
and flooded the banks’ routers, servers and server applications (layers 3, 4
and 7 of the networking stack) with junk traffic (Arstechnica website). The
Shadowserver Foundation recently identified a new botnet type called
Darkness. Darkness is generally comparable to the BlackEnergy bot, which also
specializes in DDoS attacks, but is claimed to be more effective. It has
already been actively used in some DDoS attacks and is advertised as being
able to take down even large websites with just a few thousand bots
(Shadowserver Foundation website).

In March 2013, The Spamhaus Project, an international anti-spam
organization based in London and Geneva, was hit by heavy DDoS attack traffic
peaking at up to 300 Gbit/s. Spamhaus maintains a huge blacklist of likely
spammers, which is used by colleges, research institutions, Internet service
providers, the military and businesses. CyberBunker, a hosting company in the
Netherlands, was allegedly behind the DDoS attacks on Spamhaus, in
retaliation for its inclusion on the blacklist.

[Bar chart: number of attacks (axis from 0 to 120) per BlackEnergy target]

Figure 2.12 BlackEnergy botnet attacks

Other examples of botnets used for DDoS are Spybot and Agobot (Barford &
Yegneswaran 2007). A DDoS botnet from 2014 named ‘Wopbot’, which runs on
Linux servers, uses the Bash Shellshock bug to automatically infect other
servers. This botnet is active and has been scanning the Internet for
vulnerable systems, including those of the United States Department of
Defense, as reported by the chief executive of the Italian security
consultancy Tiger Security. It has launched a DDoS attack against servers
hosted by the content delivery network Akamai and is also aiming at other
targets. Two more botnets from the same year, Warbot and Spikebot, are also
used to launch DDoS attacks.

Click Fraud

Click fraud exploits Pay per Click (PPC) advertising. Click data is
corrupted by the generation of illegitimate clicks, so that the advertiser
pays for clicks that offer no sales prospects. The distributed processing
offered by a botnet allows the botmaster to assign the task of running
automated scripts and binaries to the zombie machines. These programs
generate clicks and therefore illicit income.

Bitcoin Mining

Bitcoin is a digital asset and a payment system. The system is
peer-to-peer: transactions take place between users directly, without an
intermediary. These transactions are verified by network nodes and recorded
in a public distributed ledger called the block chain, which uses bitcoin as
its unit of account. Through pooled Bitcoin mining, a botmaster can covertly
mine bitcoins using the computational power of victims’ computers.

2.4 BOTNET ANALYSIS

To understand the behavioral characteristics of botnets, a botnet setup
with 15 systems was created in our research lab, as shown in Figure 2.13. The
physical topology of the setup is a star, and the Ethernet is 100BASE-TX. The
executable binaries of the botnet malware were obtained from online sources.
We installed the bot binaries on the systems by exploiting system
vulnerabilities; for example, bot binaries were installed on the victim
machine by exploiting an unpatched flaw in Adobe’s PDF document format. Once
installed, the bot binary disables antivirus and security software on the
zombie machine to avoid detection, and injects itself into the address space
of Windows Explorer. Using this setup, we analyzed the behavior of HTTP and
IRC botnets, including their communication style, C&C structure, topology and
attacks. The P2P bot binaries were taken from open sources and analyzed using
the reverse engineering tool IDA Pro (Hex-Rays 2013). The outcomes of the
analysis are tabulated in Table 2.1.

[Network diagram: botmaster, C&C server and zombie machines Zombie 1 to
Zombie 15]

Figure 2.13 Botnet experimental setup


Table 2.1 Outcome of the analysis of botnets

Year  Botnet       Protocol    Communication  C&C               Topology      Attacks

2002  Slapper      P2P         Push           Unstructured P2P  Random        Spam, DDoS
2003  Rbot         IRC         Push           Centralized       Multi-server  DDoS, data theft
      Spybot       P2P, IRC    Push           Decentralized     Random        DDoS, data theft
      Sinit        P2P         Push           Structured P2P    Random        DDoS
2004  Phatbot      P2P         Push           Unstructured P2P  Random        Spam
      Bobax        HTTP        Pull           Centralized       Multi-server  Spam
      Bagle        HTTP        Pull           SMTP              Multi-server  Spam
2006  SpamThru     P2P         Push           Custom P2P        Random        Spam
      Rustock      IRC         Push           Centralized       Multi-server  Spam
2007  Zeus         HTTP        Pull           Centralized       Multi-server  Steal banking details
      Cutwail      SMTP        Pull           Centralized       Star          Spam
      Srizbi       IRC         Push           Centralized       Multi-server  Spam, Banking
      Storm        P2P         Push           Decentralized     Random        DDoS
2008  BlackEnergy  HTTP        Pull           Centralized       Multi-server  DDoS
      Conficker    HTTP / P2P  Pull           Decentralized     Random        DDoS
      Mariposa     IRC / HTTP  Pull           Centralized       Multi-server  Spam, identity theft
      Sality       P2P         Push           Decentralized     Random        Spam,
      Asprox       HTTP        Pull           Centralized       Multi-server  Spam
      Gumblar      HTTP        Pull           Centralized       Multi-server  Spam
      Waledac      SMTP / P2P  Push           Decentralized     Random        Spam
      Mega-D       HTTP        Pull           Centralized       Multi-server  Spam
      Lethic       IRC         Push           Centralized       Star          Spam
      Kraken       IRC         Push           Centralized       Star          Spam
2009  Festi        HTTP        Pull           Centralized       Multi-server  Spam, DDoS
      Zeus         HTTP        Pull           Centralized       Multi-server  Steal banking details
      Wopla        HTTP        Pull           Centralized       Multi-server  DDoS
2010  Kelihos      P2P         Push           Unstructured P2P  Random        Bitcoin, Spam
      SpyEye       HTTP        Pull           Centralized       Multi-server  Steal banking details
      TDL-4        IRC         Pull           Centralized       Multi-server  DDoS, Spam
2011  ZeroAccess   P2P         Push           Unstructured P2P  Random        Bitcoin, Click fraud
      Flashback    P2P         Push           Decentralized     Random        DDoS
      Andromeda    HTTP        Pull           Centralized       Multi-server  DDoS
2012  Chameleon    HTTP        Pull           Centralized       Multi-server  Click fraud
2013  Athena       HTTP        Pull           Centralized       Multi-server  DDoS

2.5 LITERATURE SURVEY OF BOTNET DETECTION TECHNIQUES

This section presents a brief survey of the botnet detection techniques
available in the literature and investigates the advantages and limitations
of the existing techniques.

Researchers have developed many architectures and proposed different
methods to detect these malicious attacks (Feily et al. 2009, Zeidanloo et
al. 2010). Figure 2.14 gives an overview of botnet detection techniques. They
are classified into two broad categories, Intrusion Detection Systems (IDSs)
and honeynets (Provos 2004, Stinson & Mitchell 2007). IDSs are further
divided into anomaly-based and signature-based IDSs (Stalmans & Irwin 2011).
These botnet detection techniques are described in detail in the following
sections.

Honeypots and honeynets

A honeypot can be defined as an “environment where vulnerabilities have
been deliberately introduced to observe attacks and intrusions” (Baecher et
al. 2006). Honeypots have a strong ability to detect security threats, to
collect malware signatures and to understand the motivation and technique of
the perpetrator behind a threat. A honeypot generally consists of a computer,
data or a network site that appears to be part of a network and seems to
contain information or a resource of value to attackers, but is actually
isolated and monitored. Two or more honeypots on a network form a honeynet,
which is used for monitoring a larger or more diverse network in which one
honeypot may not be sufficient. Honeynets are usually deployed on Linux
operating systems because of the capability and richness of their toolbox.

[Taxonomy diagram. Node labels from top to bottom: Botnet Detection
Techniques; Honeynet, Intrusion Detection System; Signature based, Anomaly
based; Host based, Network based; Active monitoring, Passive monitoring;
Mining based, Protocol based, Flow based; Classification, Clustering,
Association-rule, Statistical, Graph based, Symptom based]

Figure 2.14 Overview of botnet detection techniques

Honeypots and honeynets are effective detection techniques at a reasonable
cost and with virtually no false positives, and hence there has been much
research in this area. A honeynet is used to collect information from bots
for further analysis of botnet characteristics and the intensity of the
attack. The information collected from bots is also used to discover the C&C
channel, unknown vulnerabilities, the techniques and tools used by the
attacker, and the attacker’s motivation. A typical honeynet architecture is
shown in Figure 2.15. The key component is the honeywall, which separates the
honeypots from the rest of the network. The honeywall is a Layer 2 / Layer 3
device that acts as a gateway through which network traffic passes.

Figure 2.15 Honeynet architecture (Source: Abu Rajab et al 2006)

Many researchers have used honeypots in their work. For example, Nepenthes
(Baecher et al. 2006) is a low-interaction honeypot that emulates some
vulnerabilities and provides features for collecting malware binaries. Abu
Rajab et al. (2006) used honeypots to study botnet activities. They
constructed a multifaceted infrastructure to capture and concurrently track
multiple botnets in a given network, and obtained comprehensive
measurements and analysis that reflect several important structural and
behavioral aspects of botnets. Dagon et al. (2006) studied the global
diurnal behavior of botnets using DNS redirection and sinkholing techniques
with the help of honeynet systems. The redirection technique counts
infected bots by manipulating the DNS entry associated with a botnet’s IRC
server and redirecting connections to a local sinkhole. The sinkhole
completed the three-way TCP handshake with bots attempting to connect to
the redirected IRC server and recorded the existence of large botnets with
populations of up to 350,000 bots.
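
The counting side of such a sinkhole can be sketched in a few lines of
Python. This is only an illustration under stated assumptions, not the setup
used by Dagon et al.: the names SinkholeTally and serve are hypothetical,
the listener is assumed to be an IPv4 TCP socket to which the botnet's DNS
name has already been redirected, and counting distinct source addresses is
only an estimate of the bot population (address translation and dynamic
addressing can distort it).

```python
import socket


class SinkholeTally:
    """Counts distinct source addresses seen by a sinkhole listener."""

    def __init__(self):
        self.sources = set()
        self.connections = 0

    def record(self, ip):
        # Every completed connection is counted, but the population
        # estimate only grows when a new source address appears.
        self.connections += 1
        self.sources.add(ip)

    def population(self):
        return len(self.sources)


def serve(listener, tally, max_conns):
    """Accept max_conns connections on an already listening IPv4 TCP socket."""
    for _ in range(max_conns):
        # accept() returns only after the operating system has completed
        # the three-way handshake with the connecting host.
        conn, (ip, _port) = listener.accept()
        tally.record(ip)
        conn.close()  # no IRC service is offered, the bot is only counted
```

A caller would create the listener with socket.socket(), bind() and
listen() and pass it to serve(); the tally can also be fed from a
connection log instead of a live socket.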

Barford & Yegneswaran (2007) investigated the internals of bot instances
using a honeynet. They examined the source code of four widely used IRC
botnets, namely Agobot, SDBot, SpyBot and GT Bot, covering the botnet
control mechanisms, host control mechanisms, propagation mechanisms,
exploits, delivery mechanisms, and obfuscation and deception mechanisms.
Rieck et al. (2010) introduced Botzilla, a system for detecting bot malware
communication, which proceeds by repeatedly recording the network traffic
of malware in a controlled environment using honeynets. When the malicious
software contacts its maintainer, a process called “phoning home”,
signatures are automatically generated from the monitored malware traffic,
without human intervention or full network payload examination. A signature
can be generated even if the malware is observed on a single infected host.
The analysis is limited to the first bytes of network flows and still
attains sufficient detection accuracy.
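
The idea of deriving a signature from the first bytes of repeatedly recorded
flows can be illustrated with a deliberately simplified sketch. It is not
Botzilla's actual algorithm: the function names, the 64-byte prefix, the
4-byte token length and the rule "tokens present in every malware recording
and in no benign flow" are all assumptions made for illustration.

```python
def ngrams(data, n):
    """Return the set of all n-byte substrings of data."""
    return {data[i:i + n] for i in range(len(data) - n + 1)}


def build_signature(malware_flows, benign_flows, prefix_len=64, n=4):
    """Tokens found in the first bytes of every malware flow
    and in none of the benign flows."""
    token_sets = [ngrams(flow[:prefix_len], n) for flow in malware_flows]
    common = set.intersection(*token_sets) if token_sets else set()
    benign = set()
    for flow in benign_flows:
        benign |= ngrams(flow[:prefix_len], n)
    return common - benign


def matches(signature, flow, prefix_len=64, n=4):
    """A flow matches when its first bytes contain every signature token."""
    return bool(signature) and signature <= ngrams(flow[:prefix_len], n)


# Made-up recordings of the same sample phoning home twice.
recorded = [b"PHONE id=17 v=2", b"PHONE id=93 v=2"]
normal = [b"GET /index.html"]
signature = build_signature(recorded, normal)
```

With these made-up payloads the signature keeps the tokens that stay
constant between recordings (such as b"PHON") and drops the varying
identifier, so a third recording with a different id would still match.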

Honeynets are essential to understand botnet characteristics and
technology, but they have some limitations. They cannot capture bots that
use propagation methods other than scanning, spam and web-driven downloads.
They can only report on the machines that were deliberately placed on the
network as traps, and can track exploitation activity only on a limited
scale. As honeynets have become increasingly popular for tracking and
monitoring botnet activities, intruders have started to develop novel
methods to defeat honeynet traps.

Intrusion detection system

An IDS uses the signatures or behavior of existing botnets as a reference
to detect potential botnets. IDS botnet detection is classified as either a
signature based or an anomaly based technique.

Signature based botnet detection

Signature based botnet detection uses the signatures of known botnets for
detection. For instance, Snort (Roesch 1999) can monitor network traffic to
find the signatures of existing bots. The basic idea is to extract feature
information from the packets of monitored traffic, mark such patterns and
register them in a knowledge database of existing bots. This method has
several advantages, such as immediate detection and a very low false
positive rate for known signatures. However, the signature based approach
can detect only well known botnets. Consequently, this solution is not
effective against unknown bots. More importantly, very similar bots with a
slightly different signature may be missed. One of the well known signature
based botnet detection approaches is Rishi (Goebel & Holz 2007). Rishi is
primarily based on passive traffic monitoring for suspicious IRC nicknames,
IRC servers, and uncommon server ports. The authors used n-gram analysis
and a scoring system to detect bots that use uncommon communication
channels, which are commonly not detected by classical intrusion detection
systems. The disadvantages of this method are that it cannot detect
encrypted communication or non-IRC botnets. Moreover, it is unable to
detect bots that do not use known nickname patterns. Another disadvantage
of signature based detection techniques is that the knowledge database must
be continually updated with new signatures, which increases the management
cost and reduces the overall performance. New bots may launch attacks
before the knowledge database is patched.
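
A nickname scoring scheme in the spirit of Rishi can be sketched as follows.
The rules, weights and threshold below are hypothetical and chosen only for
illustration; they are not the rule set published by Goebel & Holz.

```python
import re

# Hypothetical scoring rules as (pattern, points); not Rishi's real rules.
RULES = [
    (re.compile(r"\d{4,}$"), 2),                 # long numeric suffix
    (re.compile(r"^\[?[A-Z]{2,3}[\]|_-]"), 2),   # country-code style prefix
    (re.compile(r"[\[\]|{}]"), 1),               # unusual special characters
]
THRESHOLD = 3  # assumed alert threshold


def score_nickname(nick):
    """Sum the points of every rule whose pattern occurs in the nickname."""
    return sum(points for pattern, points in RULES if pattern.search(nick))


def is_suspicious(nick):
    return score_nickname(nick) >= THRESHOLD
```

Under these assumed rules a nickname such as "DEU|84529104" scores 5 and is
flagged, while "bob_1999" scores 2 and "alice" scores 0. The sketch also
shows the limitation noted above: a bot that picks a human-looking nickname
matches no rule and is missed.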

Anomaly based botnet detection

Anomaly based detection is a prominent research domain in botnet detection.
The basic idea is to analyze network traffic irregularities, including
traffic passing through unusual ports, high network latency, increased
traffic volume, and system behavior that indicates the presence of
malicious bots in the network. Anomaly based techniques are further divided
into host based and network based approaches. In a host based approach, the
individual machine (host) is monitored for suspicious behavior, including
its processing overhead and access to suspicious files. Despite the
importance of this approach, it is usually not scalable because all
machines in the network must have the monitoring tool installed for it to
be effective. Conversely, network based techniques analyze network traffic
either passively or actively.
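
As a minimal illustration of one such irregularity, increased traffic
volume, the following sketch flags samples that lie far above a baseline.
The z-score test and the threshold of 3 are assumptions made for this
example, not a method taken from the works surveyed here.

```python
from statistics import mean, stdev


def volume_anomalies(baseline, observed, threshold=3.0):
    """Return the indices of observed samples whose z-score against the
    baseline exceeds the threshold (only increases are flagged)."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # needs at least two baseline samples
    if sigma == 0:
        # Flat baseline: any increase counts as an anomaly.
        return [i for i, v in enumerate(observed) if v > mu]
    return [i for i, v in enumerate(observed) if (v - mu) / sigma > threshold]


# Made-up packets-per-minute counts for one host.
baseline = [100, 110, 90, 105, 95]
flagged = volume_anomalies(baseline, [102, 400, 98])
```

Here only the second observed sample (index 1) is flagged. A real detector
would combine several features, since volume alone cannot separate a bot
from a legitimate burst of traffic.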

Active monitoring

Active monitoring techniques measure service quality by injecting test
packets into the network, servers or applications. The goal of active
monitoring is to measure network parameters such as the timing of packets,
packet type/size, monitoring of functions/path and statistical quality.
This technique is usually not the preferred strategy because of the
additional load it places on network traffic. BotProb (Tokhtabayev &
Skormin 2007) is considered an active monitoring strategy, which injects
packets into the network traffic to determine whether suspicious activity
is caused by humans or bots. Because bots, unlike humans, usually respond
to commands in a predetermined pattern, there is a cause and effect
correlation between the C&C commands and the bot responses. Such a command
and response approach can readily reveal the existence of bots because the
response follows from the predetermined command behavior. A drawback of
active monitoring is that it generates additional packets and increases
network traffic.

Passive monitoring

Passive monitoring techniques observe data traffic in the network and look
for suspicious communications that may be produced by bots or C&C servers.
Data traffic is analyzed using pre-recorded signatures or anomaly detection
techniques. Bots in the same botnet tend to present the same communication
patterns in both centralized and decentralized architectures. This occurs
because bots are pre-programmed to perform the same routine communication
with the C&C server. Because botmasters must communicate with bots to
perform an attack, there are common traffic patterns in the network linked
to each stage of the bot life cycle. Moreover, the same network protocols
are used for communication and for performing malicious activities. Passive
monitoring based detection employs many different techniques and methods,
including classification, flow based and graph based analysis, clustering,
correlation, stochastic models and entropy.

Binkley & Singh (2006) proposed an anomaly based algorithm for detecting
IRC-based botnet meshes. The algorithm can also reveal IRC bot servers. It
combines an IRC mesh detection component with a TCP scan detector. However,
simply using a trivial cipher to encode the IRC commands could easily
defeat this approach. Constantinou & Mavrommatis (2006) proposed a novel
approach for P2P bot traffic identification that relies on the fundamental
characteristics of P2P protocols instead of application specific details.
These characteristics include large network diameters and many entities
acting as both clients and servers. It uses only the transport layer header
of every packet, and can identify unknown P2P protocols. However, this
technique is time-consuming.
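
One of these characteristics, hosts acting as both clients and servers, can
be checked from connection records alone. The sketch below is a simplified
illustration, not the authors' method: the function name and the min_peers
threshold are assumptions, and each record is assumed to name the initiator
and responder of a connection (for TCP, who sent the first SYN).

```python
from collections import defaultdict


def dual_role_hosts(flows, min_peers=2):
    """flows: iterable of (initiator_ip, responder_ip) pairs taken from
    transport-layer headers. Returns hosts that both opened connections to
    and accepted connections from at least min_peers distinct peers."""
    served = defaultdict(set)     # host -> peers that connected to it
    contacted = defaultdict(set)  # host -> peers it connected to
    for initiator, responder in flows:
        contacted[initiator].add(responder)
        served[responder].add(initiator)
    return sorted(
        host for host in contacted
        if len(contacted[host]) >= min_peers and len(served[host]) >= min_peers
    )


flows = [("A", "B"), ("A", "C"), ("B", "A"), ("C", "A"),
         ("D", "W"), ("E", "W")]
peers = dual_role_hosts(flows)
```

In this made-up trace only host A is reported: W behaves like a pure server
and D and E like pure clients. No payload is inspected, which is why the
idea carries over to unknown P2P protocols.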

Karasaridis et al. (2007) developed a network flow-level IRC botnet
controller detection system for backbone networks. The system combines
heuristics based on the flow characteristics of IRC communication, scanning
behavior, and known botnet communication models. All flow records for
suspected bots are fetched and pruned, keeping only the flows whose server
port is one of the standard IRC ports or which involve a hub server. The
flow records for server IPs are aggregated, and a correlation algorithm is
used to identify suspicious bots according to a defined bot infection
dialog model. BotHunter (Gu et al. 2007) is a passive bot detection system
that uses IDS dialog correlation to associate IDS events with bot infection
models. As BotHunter aims to detect bot behavior at the network level,
stealthy bots can avoid detection by evading event timing correlation or by
conducting local attacks (e.g., deleting files) without any networking
activity.
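
The pruning and aggregation step described for Karasaridis et al. can be
sketched as follows. The record layout, the function name, the min_bots
threshold and the port set are assumptions for illustration; in particular,
which ports count as "standard" IRC ports is a choice made for this example.

```python
from collections import defaultdict

# Assumed set of "standard" IRC ports for this sketch.
IRC_PORTS = set(range(6660, 6670)) | {7000}


def candidate_controllers(flows, suspected_bots, hub_servers=frozenset(),
                          min_bots=2):
    """flows: iterable of (client_ip, server_ip, server_port).
    Keeps flows of suspected bots to IRC ports or known hub servers and
    returns {server_ip: number of distinct suspected bots}."""
    bots_per_server = defaultdict(set)
    for client, server, port in flows:
        if client not in suspected_bots:
            continue  # only flow records of suspected bots are fetched
        if port in IRC_PORTS or server in hub_servers:
            bots_per_server[server].add(client)
    return {s: len(b) for s, b in bots_per_server.items() if len(b) >= min_bots}


flows = [("b1", "s1", 6667), ("b2", "s1", 6667), ("b1", "web", 80),
         ("x", "s1", 6667), ("b3", "s2", 6697)]
candidates = candidate_controllers(flows, {"b1", "b2", "b3"})
```

Servers contacted by several suspected bots become candidate controllers
for the later correlation stage; in the made-up data only s1 qualifies.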

Botsniffer (Gu et al. 2008) uses network based anomaly detection techniques
designed especially for detecting IRC and HTTP botnets in a local area
network. Botsniffer is based on the observation that bots within the same
botnet are likely to show strong similarities in their responses and
activities, such as scanning and sending spam emails, and thus share common
communication content. This detection method uses spatial-temporal
correlation and assumes that bots, unlike humans, tend to communicate in a
highly synchronized fashion. Botsniffer performs string matching to detect
similar responses from bots. Nevertheless, botnets may encrypt their
communication traffic or inject random noise packets to evade detection.
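
The response-similarity idea can be illustrated with a small sketch that
measures how many hosts talking to the same server within one time window
sent near-identical messages. The Dice coefficient on character bigrams is
used here as one possible similarity measure; the function names and the
0.8 threshold are assumptions, and the published system contains further
statistical tests that are not reproduced.

```python
from itertools import combinations


def bigrams(text):
    """Set of all two-character substrings of text."""
    return {text[i:i + 2] for i in range(len(text) - 1)}


def dice(a, b):
    """Dice coefficient on character bigrams, in [0, 1]."""
    ga, gb = bigrams(a), bigrams(b)
    if not ga and not gb:
        return 1.0 if a == b else 0.0
    return 2 * len(ga & gb) / (len(ga) + len(gb))


def similar_pair_fraction(responses, min_similarity=0.8):
    """responses: {host: message} observed in one time window for one
    server. Returns the fraction of host pairs with similar messages."""
    pairs = list(combinations(responses.values(), 2))
    if not pairs:
        return 0.0
    similar = sum(1 for a, b in pairs if dice(a, b) >= min_similarity)
    return similar / len(pairs)


bots = {"h1": "scan started 10.0.0.0/24",
        "h2": "scan started 10.0.1.0/24",
        "h3": "scan started 10.0.2.0/24"}
humans = {"u1": "hello everyone",
          "u2": "anyone seen the game?",
          "u3": "brb lunch"}
```

For the made-up bot messages every pair is similar, whereas none of the
human chat lines are. Encrypting or padding the messages would destroy the
shared bigrams, which matches the evasion noted above.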

Botminer (Gu et al. 2008) is an approach that applies data mining
techniques to botnet C&C traffic detection. Botminer improves on the
earlier Botsniffer approach. It clusters similar communication traffic and
similar malicious traffic, and then performs cross-cluster correlation to
identify hosts that share both similar communication patterns and similar
malicious activity patterns. It is an advanced botnet detection tool that
is independent of botnet protocol and structure. It can detect IRC-based,
HTTP-based and P2P botnets with a low false positive rate. The system has
many desirable features, but it needs a long monitoring time and unforged
large scale data to detect malicious activities; however, real botnets
communicate stealthily using a large number of small packets and may forge
their information. Strayer et al. (2008) proposed a network based approach
to detect botnet traffic using machine learning techniques. The detection
process has two main steps: first, traffic that is unlikely to be part of a
botnet is eliminated; the remaining traffic is then classified into groups
and correlated to find common communication patterns that would suggest
botnet activity. This approach is specific to IRC botnets, and it cannot
detect encrypted C&C traffic.
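
The cross-cluster correlation step described for Botminer can be reduced to
a very small sketch: given clusters of hosts with similar communication and
clusters of hosts with similar malicious activity, report the groups of
hosts that appear together in both. This is a simplification under stated
assumptions (the clusters are taken as given, and a plain intersection with
a min_size threshold replaces the scoring used by the published system).

```python
def cross_cluster_groups(comm_clusters, activity_clusters, min_size=2):
    """Return host groups that share both a communication cluster and an
    activity cluster. Each cluster is an iterable of host identifiers."""
    groups = []
    for comm in comm_clusters:
        for activity in activity_clusters:
            shared = set(comm) & set(activity)
            if len(shared) >= min_size:
                groups.append(shared)
    return groups


# Made-up clustering results.
comm = [{"h1", "h2", "h3"}, {"h4", "h5"}]
activity = [{"h1", "h2", "h9"}, {"h5"}]
suspects = cross_cluster_groups(comm, activity)
```

In this example h1 and h2 are reported because they both talk alike and act
alike; h5 is not, since no other host shares both of its clusters.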

Zhuang et al. (2008) developed a technique to map botnet membership using
email spam traces. To group bots into botnets, they looked for multiple
bots participating in the same email spam campaigns. The authors applied
the proposed technique to an email spam trace from Hotmail services.
Through this analysis, they made indirect observations about the sizes of
different spam botnets, their behavioral characteristics (e.g., the amount
of spam sent per bot) and their geographical distribution. The authors
assumed that each spam campaign is carried out by a single botnet; the
technique does not separate the activities of individual botnets or provide
information on the spammers’ latest techniques.
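
Grouping bots by shared spam campaigns can be sketched as a merge of
overlapping sets. The rule used below, merging campaigns as soon as they
share a single sender address, is an assumption chosen for brevity; the
function name and data layout are likewise hypothetical.

```python
def group_campaigns(campaigns):
    """campaigns: {campaign_id: set of sender IPs}.
    Merges campaigns that share at least one sender and returns a list of
    (campaign_ids, sender_ips) pairs, one per inferred botnet."""
    groups = []
    for cid, ips in campaigns.items():
        merged_ids, merged_ips = {cid}, set(ips)
        remaining = []
        for ids, members in groups:
            if members & set(ips):
                # Existing group shares a sender with the new campaign.
                merged_ids |= ids
                merged_ips |= members
            else:
                remaining.append((ids, members))
        groups = remaining + [(merged_ids, merged_ips)]
    return groups


campaigns = {
    "c1": {"1.1.1.1", "2.2.2.2"},
    "c2": {"2.2.2.2", "3.3.3.3"},
    "c3": {"9.9.9.9"},
}
botnets = group_campaigns(campaigns)
```

Campaigns c1 and c2 end up in one group because they share a sender, while
c3 forms its own group. Because existing groups are always disjoint, a
single pass over them per campaign is enough.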

Choi et al. (2009) suggested an anomaly detection mechanism that monitors
group activities in DNS traffic. Based on a group activity model and
metric, they developed a botnet detection mechanism called BotGAD (Botnet
Group Activity Detector). They defined some special features of DNS traffic
to differentiate legitimate DNS queries from botnet DNS queries. BotGAD
enables the detection of unknown botnets in large-scale networks in real
time. The authors also developed a mechanism to detect C&C server
migration. The scheme may also detect botnets with encrypted channels, as
it uses information from IP headers. The main drawback of the approach is
the high processing time required to monitor the huge volume of network
traffic.
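
The notion of group activity in DNS traffic can be illustrated by comparing
the sets of hosts that query the same domain in consecutive time windows.
This sketch is not BotGAD's published metric: the Jaccard index and the
function names are assumptions used only to show the idea that a stable
group of hosts repeatedly querying one domain is suspicious.

```python
def jaccard(a, b):
    """Jaccard index of two sets, defined as 0.0 when both are empty."""
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def group_activity_score(windows):
    """windows: list of sets, each holding the hosts that queried one
    domain during one time window. Returns the mean similarity between
    consecutive windows (0.0 with fewer than two windows)."""
    if len(windows) < 2:
        return 0.0
    sims = [jaccard(windows[i], windows[i + 1]) for i in range(len(windows) - 1)]
    return sum(sims) / len(sims)


# Made-up observations for two domains.
cc_domain = [{"a", "b", "c"}, {"a", "b", "c"}, {"a", "b", "c"}]
news_site = [{"a", "b"}, {"c", "d"}, {"e", "f"}]
```

The same three hosts querying the first domain in every window give a score
of 1.0, while the changing visitors of the second domain give 0.0. Only the
source addresses of the queries are needed, which is why encryption of the
C&C channel itself does not hide this behavior.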

Wurzinger et al. (2009) proposed an automatic method to generate a network
level botnet signature (model) of a given bot binary based on the botnet
command-response pattern. They aimed to generate detection models by
observing the behavior of bots captured in the wild: a bot is launched in a
controlled environment and its network activity is recorded. The method
then identifies points in the network trace that are likely to correlate
with command-response activities. The limitation of this approach is that
it can detect only known instances of botnets.

Nagaraja et al. (2010) developed BotGrep, a tool to detect P2P botnets
based on network graph analysis, that is, on the information about which
pairs of nodes communicate with one another in the communication graph.
Their approach relies on the fast-mixing nature of the structured P2P
botnet C&C graph. The BotGrep algorithm iteratively partitions the
communication graph into faster-mixing and slower-mixing pieces, eventually
narrowing it down to the fast-mixing component. The network graph analysis
assumes that hosts belonging to P2P botnets tend to be more connected than
other hosts.
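
What "fast-mixing" means can be made concrete with a short sketch that
measures how quickly a random walk on a graph approaches its stationary
distribution. This is not the BotGrep partitioning algorithm; it only
computes the underlying quantity. The lazy walk (staying in place with
probability 1/2) and the function names are choices made for this example.

```python
def lazy_walk_distribution(adj, start, steps):
    """adj: {node: list of neighbours} for an undirected graph without
    isolated nodes. Returns the node distribution after `steps` steps of a
    lazy random walk started at `start`."""
    dist = {node: 0.0 for node in adj}
    dist[start] = 1.0
    for _ in range(steps):
        nxt = {node: 0.5 * p for node, p in dist.items()}  # stay put
        for node, p in dist.items():
            share = 0.5 * p / len(adj[node])
            for neighbour in adj[node]:
                nxt[neighbour] += share  # move to a uniform neighbour
        dist = nxt
    return dist


def distance_from_stationary(adj, start, steps):
    """Total variation distance between the walk after `steps` steps and
    the stationary distribution (proportional to node degree)."""
    dist = lazy_walk_distribution(adj, start, steps)
    total_degree = sum(len(nbs) for nbs in adj.values())
    return 0.5 * sum(abs(dist[n] - len(adj[n]) / total_degree) for n in adj)


# A densely connected graph versus a ring with the same number of nodes.
dense = {i: [j for j in range(8) if j != i] for i in range(8)}
ring = {i: [(i - 1) % 8, (i + 1) % 8] for i in range(8)}
```

After four steps the walk on the dense graph is already close to stationary
while the walk on the ring is still concentrated near its start, so the
dense graph mixes faster. A detector built on this idea looks for the
subgraph whose walks converge unusually quickly.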

A more robust correlation architecture was proposed by Zhang et al. (2011)
to perform botnet detection in high speed and high volume networks. It
includes a botnet-aware adaptive packet sampling algorithm along with a
scalable spatial-temporal flow correlation mechanism. However, evasion is
easy for a botmaster who knows the proposed algorithm.

Chen et al. (2011) proposed an algorithm based on an incremental
least-squares support vector machine (LS-SVM) learning scheme, and
evaluated its performance on two real world datasets. It addresses the
detection problem with an online learning scheme that can accommodate both
growing training sets and evolving features. Finding malicious bots depends
on how often a client machine visits a server machine, which is determined
by looking at the IP addresses of the server machines. Moreover, this
approach can detect encrypted botnet communication. However, the scheme
targets only online-learning systems and algorithms. Lu et al. (2011)
employed n-gram, decision tree and clustering algorithms to classify
network traffic into different application communities. The proposed system
analyzes the temporal-frequent characteristics of the 256 possible byte
values (the extended ASCII range) of the payload over a predefined time
interval to distinguish malicious bot traffic from normal traffic. Their
approach is payload aware and hard to deploy on a large scale network.
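
The payload feature described for Lu et al. can be sketched as a 256-bin
byte frequency vector computed over all payloads seen in one time interval.
The function name and input layout are assumptions, and the classification
and clustering stages that would consume this vector are not shown.

```python
def byte_frequencies(payloads):
    """Relative frequency of each of the 256 byte values over all payloads
    (bytes objects) captured in one time interval."""
    counts = [0] * 256
    total = 0
    for payload in payloads:
        for byte in payload:  # iterating over bytes yields ints 0..255
            counts[byte] += 1
        total += len(payload)
    if total == 0:
        return [0.0] * 256
    return [count / total for count in counts]


# Made-up payloads from one interval.
features = byte_frequencies([b"AAB", b"B"])
```

The vector has one entry per byte value; in the example the entries for
"A" (65) and "B" (66) are both 0.5. Because the feature is computed from
payload bytes, every packet must be inspected, which is the scaling concern
noted above.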

Xu et al. (2012) presented a P2P passive botnet detection technique which
can effectively identify P2P malware by turning the botmaster’s own control
capability against it. The two-phase detection framework combines host
level dynamic binary code analysis with network level probing, based on the
assumption that a P2P bot usually has a remotely controllable built-in
architecture, which can be exploited to observe the malicious behavior of
the nodes. Despite the effectiveness of this approach, bots that use
advanced encryption and certificate based authentication may evade
detection. Moreover, variations in the port binding delay may make
detection difficult for this scheme.

2.6 SUMMARY

This chapter provides a detailed analysis of Windows based botnets to
understand their behavior and lifecycle mechanisms, which will be helpful
for future studies on thwarting botnet communications. The botnets Zeus,
BlackEnergy, SpyEye, Rbot, Phatbot, SpamThru, Festi, Kelihos, ZeroAccess,
Chameleon, Aldi, TDL, etc. were analyzed in a controlled environment to
identify their possible attacks, topology, C&C structure and communication
mechanisms. A brief survey of some existing botnet detection techniques,
with their advantages and limitations, is presented. The outcome of the
analysis is a basis for designing better detection mechanisms.
