
Peer-to-Peer Botnets: Overview and Case Study

By: Julian Grizzard (JHU), Vikram Sharma (UNC-Charlotte), Chris Nunnery (UNC-Charlotte),
Brent ByungHoon Kang (UNC-Charlotte), David Dagon (GATech)

In: HotBots, April 2007


---
Botnets historically have been controlled via a centralized C&C structure.

- such C&C networks are easier to dismantle since defenders only
need to bring down a single central point

This paper is about peer-to-peer C&C topologies and includes an
example of a current bot using this approach.

====================
Brief summary (mine)
====================
A weakness of traditional centralized botnet C&C is the single point
of failure: an entire botnet can be dismantled by bringing down a
single node (the C&C server). While using P2P for C&C provides some
benefits in the form of resiliency to loss of single nodes, it also
presents some (surmountable/manageable) challenges such as latency
and lack of reliability in message transmission. They note the
simultaneous though largely independent evolution of malicious bots
and peer-to-peer protocols. It is not expected that moving a botnet
from a centralized to decentralized C&C topology will fundamentally
alter how that botnet is used; it only affects the way in which
messages -- commanding the bots to do X, Y, or Z -- are delivered
to those bots.

Early P2P protocols (such as Napster) still employed centralized
indexing services whereas more recent P2P protocols (such as Gnutella,
Chord, CAN, Kademlia...) are completely decentralized. This
decentralization is increasingly performed using distributed hash
tables where each node in the P2P network contains some portion of
the name-->value mapping that identifies where certain items can be
found on the network.

They looked at a single example of a peer-to-peer bot: Trojan.Peacomm,
which uses the Overnet peer-to-peer protocol for C&C. The Overnet
protocol implements a distributed hash table. The life cycle of a
Peacomm bot is as follows: (1) infection; (2) installation of initial
bot on infected host; (3) download of "secondary injection" onto the
host. The "initial bot" is able to connect to the P2P network and via
this connection locates then downloads the "secondary injection". This
secondary injection is akin to a bot command. That is, a secondary
injection -- when executed -- may: perform DDoS, spam, harvest email
addresses, install a rootkit, or spread. So there is no explicit
delivery of a command, instead the P2P network is used to deliver new
code to the compromised host where that code implements some capability.

Instructing the botnet to do something else then entails putting up a
different version of the secondary injection for the participating
bots to download.

A possible countermeasure to this decentralized C&C is index poisoning,
whereby a good guy would change the hash table values (that enable
participating bots to locate the secondary injection) to be bogus
junk values so bots are unable to download the 2ndary injection and
thus effectively unable to execute botmaster commands. It occurs to me
that something like index poisoning may be able to be used by the bots
to infect the P2P network -- that is to spread to P2P users who are not
already infected with bots. Also, taking part in a P2P network usually
means the nodes have IP addresses of various members of that network...
which may provide another spreading vector for the malware.

--------
I. Intro
--------
In peer-to-peer architecture, theoretically no single point of failure.

- no centralized coordination point

- also, for centralized C&C, if defenders gain access to the central
location then they can learn who all the members of the botnet are

--> with P2P archs, may not be able to as easily determine
the botnet's membership (roster)

--------------------
II. Bkgd and History
--------------------
peer-to-peer network: one in which any node can act as both a client
and a server

They note the simultaneous (mostly independent) evolution of
malicious bots and P2P protocols.

History of P2P:
---------------
1) P2P gained momentum with Napster

- file indexing for Napster still performed via a centralized server

-- when a peer would connect, it would send to the central server
its list of locally held files. Then each user interested in
obtaining some file would query that central server to identify
locations from which that file could be downloaded

-- this was efficient but provides a single point of failure and
also a single locus at which responsibility (for the content)
rests --> (copyright infringement) lawsuits!

- transferring files (once located via index) performed directly
between peers

2) Then Gnutella developed -- completely decentralized P2P

- used a 'flooding query model' where each user would broadcast
the file he was interested in obtaining to every other
machine in the network

- efficiency issues; also sopped up lots of bandwidth?

3) Freenet developed

- every file was associated with a key; files with similar keys
were clustered on a similar set of nodes(?)

- so if looking for a file with key K then that query was
routed to nodes that contained files with keys similar to K

- no guarantee that file would be found though


4) Recent P2P protocols use distributed hash tables to enable
peers to efficiently find info in P2P network

- DHT: each peer contains some part of a hash table

- index into hash table using a name; a 'value' is returned,
which in this case is a location of where that 'name'
can be found (within this P2P network)

- CAN, Chord, Pastry, and Tapestry: DHTs introduced around 2001

- used in BitTorrent and the Coral Content Distribution Network

- characteristics: decentralized, fault tolerant, scalable

- have some keyspace, e.g. 160-bit strings; each peer takes
ownership of some portion of that keyspace; then have some
overlay network that allows nodes to find the owner of any
particular key

- e.g. take: k = SHA-1( filename ) to store a file named
'filename'

- then send message: put( k, data ) to any node in the P2P
network ... which will be forwarded along until it reaches
the node responsible for the portion of the keyspace
containing k; 'data' is the data stored in 'filename'

--> this node then stores <k, data>

--> isn't there any redundancy? or is just one node responsible
for any particular key?

- obtaining 'filename': any node sends a message to any other
node in the DHT: get(k); that message gets routed to the
node responsible for the portion of the keyspace containing k,
who will then in response return the 'data' associated with k

- overlay network: each node maintains a routing table of sorts...
identifying its neighbors and how to reach them.
for any key k, every node either: (a) owns k or (b) has a link
to a neighbor whose distance to k is shorter than this node's

so routing a message to the proper peer is done via a greedy
algorithm where each node forwards the get(k) message to the
node in its 'routing table' that is the closest to k

--> referred to as key-based routing

--> underlying this is the ability to determine proximity to k,
which gets at the keyspace partitioning methodology

- additional goals: (a) ensure that the 'routing table' stored
at any one node isn't too big and (b) that the # of hops a
message needs to travel before arriving at the node containing
k (for example) isn't too high - bounded latency
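The put/get mechanics above can be sketched as a toy DHT. This is a minimal sketch, not any real protocol: node IDs and keys share a 160-bit SHA-1 keyspace, each node owns the keys closest to its ID on the ring, and lookups use the greedy key-based routing described above. Fully connected neighbor lists stand in for a real routing table.

```python
import hashlib

M = 2 ** 160  # 160-bit keyspace, matching SHA-1 output size

def key_for(name):
    """Map a name into the keyspace: k = SHA-1(name)."""
    return int.from_bytes(hashlib.sha1(name.encode()).digest(), "big")

class Node:
    """One peer holding a slice of the global hash table."""
    def __init__(self, node_id):
        self.node_id = node_id
        self.store = {}       # this peer's portion of the name->value mapping
        self.neighbors = []   # 'routing table': links to other known peers

    def dist(self, k):
        # clockwise distance from key k to this node's ID on the ring;
        # the owner of k is the node with the smallest such distance
        return (self.node_id - k) % M

    def route(self, k):
        # greedy key-based routing: keep forwarding to the neighbor
        # closest to k until no neighbor is closer than the current node
        node = self
        while True:
            nearest = min(node.neighbors, key=lambda n: n.dist(k), default=node)
            if nearest.dist(k) >= node.dist(k):
                return node   # current node owns k
            node = nearest

    def put(self, name, data):
        k = key_for(name)
        self.route(k).store[k] = data   # stored at the key's owner

    def get(self, name):
        k = key_for(name)
        return self.route(k).store.get(k)
```

A real DHT (Chord, Kademlia) keeps only O(log N) routing entries per node and replicates each pair across several owners; here the fully connected neighbor lists make the greedy walk trivially find the single owner from any entry point.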

----------------------
III. Goals and Metrics
----------------------
P2P botnets are likely to have higher latency for message transmission
than centralized C&C would. Also, P2P protocols offer varying levels of
reliability of message transfer; such reliability can clearly be layered
on top of the particular P2P protocol, whereas the extra latency
introduced by P2P can be minimized (at the cost of additional state)
but is otherwise inherent in that architecture.

Changing a botnet from using centralized C&C to decentralized C&C
should otherwise be transparent w.r.t. the bot's capabilities.
They still can and will be used to perform the same functionalities;
it's just that the particular way that messages -- instructing the
bots to do X, Y, or Z -- are delivered to the bots will differ.

------------------------------
IV. Case Study: Trojan.Peacomm
------------------------------
Uses Overnet P2P protocol for C&C. The Overnet protocol implements a DHT.

Life cycle of Peacomm:
----------------------
1) initial infection of a host; e.g. the exploit used to gain control
of the victim machine

--> typically an email containing a malicious attachment which,
when executed, installs the 'initial bot' on the machine
and executes it

--> This 'initial bot' can connect to the P2P network and
consists of a kernel driver (wincom32.sys)

--> The initial bot sends packets as part of bootstrapping
itself onto the Overnet network. Bootstrapping onto this
network requires a set of nodes (a peer list) with which the
node connects

--> This peer list is hard-coded in the bot's install binary

--> May be the case that every infected node then is
operating with the same set of peers, which makes
this effectively a single point of failure (where
the peers on that list constitute the single point)

2) initial bot downloads 'secondary injection' to this host from the
P2P network

--> This 'secondary injection' contains the guts of the bot's
functionality.

--> The identifier (key) of the 'secondary injection' is also
hard-coded in the bot binary.

--> The 'value' stored at that key is an encrypted URL

--> The decryption key for this URL is hard-coded in the bot binary

--> Then the bot downloads the decrypted URL (from the web)

--> Then the bot executes the retrieved contents
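The retrieval chain in step (2) can be sketched in a defanged form. The DHT is modeled as a plain dict, the cipher is a stand-in XOR, and the key names below are hypothetical; the notes only establish that the search key and decryption key are hard-coded in the binary and that the stored value is an encrypted URL.

```python
HARDCODED_SEARCH_KEY = "peacomm-injection-key"   # hypothetical name
HARDCODED_DECRYPT_KEY = b"\x42"                  # hypothetical key

def xor_decrypt(data, key):
    """Placeholder cipher: XOR is its own inverse, so it also encrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

def fetch_secondary_injection(dht):
    """Walk the step-(2) chain: hard-coded key -> encrypted value ->
    decrypted URL. A real bot would then download and execute the URL's
    contents; this sketch stops at the recovered URL."""
    encrypted_url = dht.get(HARDCODED_SEARCH_KEY)
    if encrypted_url is None:
        return None          # value not (yet) published on the network
    return xor_decrypt(encrypted_url, HARDCODED_DECRYPT_KEY).decode()
```

The extra hop through a URL (rather than storing the payload in the DHT itself) keeps the DHT value small and lets the botmaster swap the payload without republishing keys.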

Step (2) suggests a (possibly inefficient though immediate) way to
issue commands to the bots:

- have each bot continually poll the network for some static key
- the 'value' associated with that key may be NULL when there is
no command to be executed

- when there is a command to be executed, the key will be populated
with that command in either plaintext or encrypted form

- it may also use an additional layer of indirection as in (2)
whereby the 'value' is some (possibly encrypted) URL and the
command is written to the contents stored at that URL

--> this may be more efficient since it doesn't entail re-looking
things up on the P2P network and doesn't entail the
propagation time inherent in nodes realizing that the
target 'key' now has a value associated with it... then
removing that value and replacing it with another...
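The hypothetical polling scheme sketched above (an alternative the notes propose, not Peacomm's observed behavior) reduces to a small loop body. The key name is assumed, and the command here is plaintext.

```python
COMMAND_KEY = "static-command-key"   # hypothetical fixed key all bots poll

def poll_once(dht, execute):
    """One polling round: look up the static key; a missing/NULL value
    means no pending command, any other value is treated as the command
    and handed to `execute`."""
    value = dht.get(COMMAND_KEY)
    if value is None:
        return False         # nothing published; poll again later
    execute(value)           # in practice the value may also be encrypted
    return True
```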

Various secondary injections used:
----------------------------------
1) downloader and rootkit
2) SMTP spammer
3) email addy harvester
4) email propagation component
5) DDoS tool

So the botmaster *implicitly* issues a command by:
--------------------------------------------------
(a) create a new 'secondary injection' which contains functionality
that would otherwise be induced via issuing a command

(b) each bot downloads the new secondary injection (bot update)
which is analogous to executing a received command

Can also configure bot for periodic updates -- that is, for the bot
to periodically search for the 'key' associated with some 2ndary
injection -- where the 'value' of that key is the encrypted URL.

-------------------------------
Countermeasure: index poisoning
-------------------------------
- identify some set of target keys
- then inject bogus values into the P2P network for those keys

==> Could be especially effective against Peacomm since the
secondary injection is located via a hard-coded key (set of keys)

http://cis.poly.edu/~ross/papers/poison.pdf

- so when P2P users search for that 'key', they will get
bogus values for peers that were supposed to have been
storing the 'value' associated with that key: thus
impeding the users' ability to download the 2ndary injection
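The two bullets above can be illustrated with a toy poisoning routine against a dict-backed index mapping each key to its list of advertised sources. This is only a sketch; a real attack (per the linked poisoning paper) advertises bogus records through the network's own publish messages from many fake peers.

```python
def poison_index(index, target_keys, n_bogus=1000):
    """Flood each targeted key with junk source records so a peer
    resolving the key almost always receives a bogus location instead
    of the real one. `index` maps key -> list of advertised sources."""
    for k in target_keys:
        junk = ["bogus-peer-%d:0" % i for i in range(n_bogus)]
        index.setdefault(k, []).extend(junk)
```

After poisoning, the real record still exists but is drowned out: a bot sampling a few sources for the hard-coded key will almost certainly try junk locations and fail to retrieve the secondary injection.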
