
2010 Second International Conference on Network Applications, Protocols and Services

Malware Behavior Analysis: Learning and Understanding Current Malware Threats


Mohamad Fadli Zolkipli
School of Computer Science, Universiti Sains Malaysia, 11800 USM, Penang Malaysia e-mail: fadli@ump.edu.my
Abstract-Malware is one of the major security threats in computer and network environments. However, the commonly used signature-based approach does not provide enough opportunity to learn about and understand malware threats, knowledge that is needed to implement security prevention mechanisms. To learn about and understand malware, a behavior-based technique that applies a dynamic approach is a possible solution for identifying, classifying and clustering malware. In this paper, we present a new approach for conducting behavior-based analysis of malicious programs. An experiment was conducted on a campus network to generate an analysis of current malware behaviors. The results show that the most likely malware threats in the campus network are worms and Trojans.
Keywords-malware; behavior analysis

Aman Jantan
School of Computer Science, Universiti Sains Malaysia, 11800 USM, Penang Malaysia e-mail: aman@cs.usm.my

I. INTRODUCTION

Malware such as viruses, worms, Trojans and other types of malicious software is a common threat to computer systems and networks. Each malware type has specific characteristics and can be classified based on its behaviors and the manner in which it spreads. Although every type of malware has its own specific objective, the main purpose is to create threats to data networks and computer operation.

A malware detector is a system that attempts to identify malware [1]. This security tool is designed to protect a computer or host against malware threats. However, according to Christodorescu et al. (2007), malware detection tools that use a signature-based approach have performed poorly in identifying unknown malicious programs [2]. Such an approach cannot provide the opportunity to learn about and understand the threat, because the detection process is based only on string matching, without knowledge of the goals and behaviors of the malware.

Understanding malware goals and behaviors is very important because that information is useful in designing and implementing prevention mechanisms for computer systems as well as data networks. To maintain computer operation, learning about and understanding malware behaviors is the best practice for minimizing threats from malware writers, since those threats differ with time, type of organization and method of attack.

Malware behavior analysis is the process of understanding the types and characteristics of malicious software. Understanding malware behaviors is an important and critical task because it supports the identification, classification and clustering of malware. It differs from the signature-based method, in which detection is performed without knowing the behavior of the malware. Malware behavior analysis can be done using a dynamic approach. According to Apel et al. (2009), dynamic analysis is a promising alternative for malware analysis, in which behavior is extracted by executing a program and closely observing its activities [3].

In this paper we propose a method to conduct malware behavior analysis using a dynamic approach. An experiment was conducted to learn about and understand the current malware threat to an organization. The method is divided into four major steps: malware collection, behavior identification, custom behavior analysis and statistical report. To complete the experiment, a case study was conducted at a higher learning institution that has more than four thousand network users.

The rest of the paper is structured as follows. Section II presents the state of the art, covering background and related work. Section III explains the method used to conduct malware behavior analysis. The experimental result is described in Section IV and the conclusion is provided in Section V.

II. STATE OF THE ART

Computer malware has been a major threat to computer systems and networks since the 1990s [4]. Apel et al. (2009) defined malicious software, or malware, as software that, when executed, performs actions intended by an attacker without the consent of the owner [3]. Preda et al. (2008) also stated that it is designed to damage the computer on which it executes or the network over which it communicates [5]. Malware can be classified into three major types: virus, worm and Trojan horse.

A computer virus is code that tries to replicate itself into other executable programs [6]. The infected program, referred to as the virus's host, can in turn infect new code when it runs. A worm is also self-replicating, like a virus. Worms use network connectivity to find and attack other vulnerable systems from host to host [7]. The goal of a worm is to infect as many computer systems connected to the network as possible [6].

A Trojan horse is a seemingly benign program that attacks computer systems from within by replacing programs to perform unauthorized actions [7]. It is a program designed to embed a secret malicious task into another application or system.

In general, malware types are classified based on their behaviors and characteristics. Three characteristics are associated with these malware types: self-replication, population growth and parasitism [6]. Self-replicating malware attempts to propagate, actively or passively, by creating new copies of itself. The population growth of malware describes the overall change in the number of malware instances due to self-replication. Parasitic malware requires some other executable program code in order to be executed.
978-0-7695-4177-8/10 $26.00 2010 IEEE DOI 10.1109/NETAPPS.2010.46

Wagener et al. (2008) proposed a flexible and automated approach to extract malware behavior by observing all the system function calls performed in a virtualized execution environment [8]. Similarities and distances between malware behaviors are computed, which allows malware behaviors to be classified. The main feature of this approach is that it couples a sequence alignment method, used to compute similarities, with the Hellinger distance, used to compute the associated distances. The classification process proposed in that work uses a phylogenetic tree. However, the technique still has a limitation in that a few malware behaviors are wrongly classified.

Martignoni et al. (2009) implemented a framework for improving behavior-based analysis of malware [9]. The framework is not itself a malware detector; it only enhances the capabilities of existing dynamic behavior-based detectors such as TTAnalyze, Panorama and CWSandbox. The framework was implemented in a cloud computing environment, analyzing a piece of malware on behalf of multiple end-users simultaneously.

To address the limitations of automated behavior-based malware analysis, Bayer et al. (2009) used a scalable clustering approach to identify and group malware samples that exhibit similar behavior [10]. This approach also performs dynamic analysis to obtain the execution traces of malware programs using automated tools. The execution traces are generalized into behavioral profiles, which characterize the activity of a program in more abstract terms. The profiles then serve as input to an efficient clustering algorithm that can handle larger sample sets than previous approaches. Bayer et al. also stated that, while automating the analysis of a single malware sample is a first step, it is not sufficient, because the analyst then faces thousands of reports every day that need to be examined.

Apel et al. (2009) studied different distance measures in detail and discussed desirable properties of a distance measure for this particular purpose [3]. The study was motivated by the fact that malware authors have started to deploy morphing or obfuscation techniques in order to hinder detection of such polymorphic malware by antimalware products. Polymorphic malware can generate numerous variants of a piece of malware, each with a different syntactic representation, while providing almost the same functionality and showing similar behavior. The study focused on behavioral features of malware, and compared and experimentally evaluated different distance measures for malware behavior. Based on the results, Apel et al. identified the most appropriate distance measure for grouping malware samples by similar behavior.

III. BEHAVIOR ANALYSIS METHOD

In this study we conduct an experiment using our proposed method. The method combines available tools with human expertise in identifying malware behaviors. Its purpose is to customize the malware behavior results generated by malware analyzer tools. The method consists of four major steps: malware collection, behavior identification, custom behavior analysis and statistical report. The process flow of this method is shown in Fig. 1. To minimize technical, time and cost constraints, our method uses a number of available computer tools, each with a specific function, to complete the experiment.

Figure 1. Process flow for malware behavior analysis.

A. Malware Collection
The first step of the method collects samples of current malware that spreads through the network. To obtain as many malware samples as possible, this task can be done using a number of security tools such as malware collectors, honeypots and intrusion detection systems. Two malware collector tools were selected for this experiment: HoneyClients [11] and Amun [12]. Both tools were used in order to avoid failure and to maximize the variety of malware collected. The hardware setup for this step is shown in Fig. 2. The tools ran continuously on the network for thirty days. As a result, about a thousand malware samples were collected.
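When two collectors run side by side, the same binary can be captured more than once, and the statistical report later counts unique samples. The paper does not say how uniqueness was determined; the following is only a minimal sketch that assumes each collector writes captured binaries into its own directory and that two files are the same sample when their SHA-256 digests match. The function names and directory layout are hypothetical.

```python
import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def unique_samples(*collector_dirs: Path) -> dict[str, Path]:
    """Map each distinct digest to the first file seen with that content.

    Assumption (not from the paper): every collector stores its captured
    binaries as regular files somewhere below its own directory.
    """
    unique: dict[str, Path] = {}
    for directory in collector_dirs:
        for path in sorted(directory.rglob("*")):
            if path.is_file():
                # setdefault keeps the first path and ignores later duplicates.
                unique.setdefault(sha256_of(path), path)
    return unique
```

Calling `unique_samples(Path("honeyclient_out"), Path("amun_out"))` with two such (hypothetical) directories would return one entry per distinct binary, so the length of the result is the number of unique samples.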

Figure 2. Hardware setup for collecting malware samples.


B. Behavior Identification
The second step of the method identifies malware behavior based on malware type. Available tools must be used for this task in order to identify the behavior of each sample. In this experiment, the behaviors of each malware sample were identified using two malware analyzer tools, CWSandbox [13] and Anubis [14]. For security reasons, both tools were run on a virtual machine platform. Both tools were chosen because tool-based behavior identification is very useful in identifying malware behavior. Each sample was analyzed by both tools to capture a variety of malware behaviors.

C. Custom Behavior Analysis
Human-based behavior analysis is then used to customize the results generated by the two malware analyzer tools. This process uses human expertise to analyze the malware samples. It is very important because some samples receive different results from the two tools. As a result, we grouped the malware into two major malware families, worm and Trojan, as shown in Table I. The behavior analysis is described in detail in Section IV.
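The hand-off between tool output and human expertise can be illustrated with a small sketch. It is not the authors' procedure: it simply assumes that each analyzer yields one type label per sample, accepts the label when both tools agree, and flags the sample for manual analysis otherwise. The `Verdict` type and both functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Verdict:
    sample_id: str
    label: Optional[str]  # agreed label, or None when the tools do not agree
    needs_review: bool    # True when a human analyst should decide


def normalize(label: Optional[str]) -> Optional[str]:
    """Lower-case and collapse whitespace so trivially different labels match."""
    if label is None:
        return None
    cleaned = " ".join(label.split()).lower()
    return cleaned or None


def reconcile(sample_id: str,
              label_a: Optional[str],
              label_b: Optional[str]) -> Verdict:
    """Accept a label only when both analyzers report the same one."""
    a, b = normalize(label_a), normalize(label_b)
    if a is not None and a == b:
        return Verdict(sample_id, a, needs_review=False)
    # Missing or conflicting labels are left for the human analyst.
    return Verdict(sample_id, None, needs_review=True)
```

Under these assumptions, `reconcile("s1", "Dropper Trojan", "dropper  trojan")` is accepted automatically, while `reconcile("s2", "Worm", "Trojan")` is routed to manual review.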
TABLE I. MAJOR TYPES OF MALWARE FAMILIES

Malware   Types
Worm      P2P Worm, Network Worm, Worm
Trojan    Packed Trojan, Dropper Trojan, Downloader Trojan, Clicker Trojan, Gamethief Trojan, Backdoors, Trojan

Figure 3. Percentage of malware samples.

D. Statistical Report
The malware collection used for this experiment comprises 1073 unique samples obtained using the malware collector tools mentioned above. By analyzing the samples, we classified them into two malware types, as shown in Table I: 519 samples are grouped as worms and 554 samples as Trojans. The percentage of malware samples in each class is shown in Fig. 3. Three types of worm were identified: P2P worm, network worm and worm. The number of samples for each type of worm is shown in Fig. 4. Trojans were classified into seven types: packed Trojan, dropper Trojan, downloader Trojan, clicker Trojan, gamethief Trojan, backdoor and Trojan. The number of samples for each type of Trojan is shown in Fig. 5. By sample count, the highest malware threat for this campus network is Trojan (554 samples), followed closely by worm (519 samples).
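The class percentages plotted in Fig. 3 follow directly from the two counts reported above. As a small worked example (the function name is illustrative, and rounding to one decimal place is an assumption, since the figure itself is not reproduced here):

```python
# Sample counts per class, as reported in the statistical report.
counts = {"worm": 519, "trojan": 554}


def class_shares(counts: dict[str, int]) -> dict[str, float]:
    """Return each class's share of all samples, in percent (one decimal)."""
    total = sum(counts.values())
    return {name: round(100 * n / total, 1) for name, n in counts.items()}


print(sum(counts.values()))   # 1073
print(class_shares(counts))   # {'worm': 48.4, 'trojan': 51.6}
```

The two counts add up to the 1073 unique samples, and the split between the two classes is close to even.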

Figure 4. Samples of Worm based on types.

Figure 5. Samples of Trojan based on types.


IV. ANALYSIS RESULT
Behavior analysis provides detailed information about malware that is suitable for learning about and understanding malware samples. The samples were run in a Windows virtual machine environment and their behavior was identified during program execution. Malware analyzer tools such as CWSandbox and Anubis, which rely on API calls, created an automatic summary report for our reference. The summary reports generated by the tools were then customized using human expertise in order to obtain an accurate behavior analysis for our needs. In the experiment, only two types of malware were detected: worm and Trojan.

A worm is self-replicating and uses a computer network to spread copies of itself to other computers on the network without any user intervention. Worms such as P2P worms, network worms and generic worms have specific behaviors that can harm both the network and the computer. Normally, a worm infects a Windows system by dropping files in the Windows\System32 location. A P2P worm may cause data loss and can also degrade network and computer performance. The distribution channels used by P2P worms are e-mail, peer-to-peer applications and Internet Relay Chat (IRC). A network worm, which exploits flaws in Windows applications, normally comes in combination with other malware such as viruses and Trojans. A worm that uses the AutoRun method spreads by copying itself into the root directories of hard drives and other writable media such as USB drives. An AutoIt worm is a compiled script that has a folder icon in order to entice the computer user to execute it.

A Trojan is a malicious program used by hackers to gain nearly complete control of a computer. The program is executed on the targeted computer and performs specific actions. A packed Trojan is usually a user-initiated program that uses a packer program to compress malicious files. It normally implements a polymorphic technique and can change system settings, compromise control of the computer and redirect the web browser. A dropper Trojan is a program that drops different types of malware, such as worms, viruses and backdoors, onto a computer system. When executed, it simultaneously runs all the malicious files it contains. A downloader Trojan is a malicious program that downloads and installs harmful malware on the infected computer. It can also open illicit network connections, mutate itself, disable security tools and transfer the user's personal information without permission. A clicker Trojan is normally a JavaScript program that arrives through an Internet application in order to change the start page of the web browser. A gamethief Trojan may enter a computer system through a web application and attempt to steal confidential data such as passwords, usernames or login details. It can also block the hard disk, attack the registry and stop the computer's operation. A backdoor is a program that creates a secret entry point into a computer system for future attacks.
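Some of the behaviors described above are concrete enough to be checked mechanically before a human analyst looks at a report. The sketch below is an illustration only: the event format (`action` and `target` keys) is hypothetical and is not the report format of CWSandbox or Anubis, and the specific file name `autorun.inf` and the Internet Explorer registry value are assumptions used to stand in for the AutoRun and start-page behaviors.

```python
import ntpath

# Hypothetical event format: {"action": "file_write", "target": "<path or key>"}.


def drops_into_system32(event: dict) -> bool:
    """A file is written somewhere below Windows\\System32."""
    return (event["action"] == "file_write"
            and "\\windows\\system32\\" in event["target"].lower())


def writes_autorun_at_drive_root(event: dict) -> bool:
    """An autorun.inf file is written directly in the root of a drive (assumed marker)."""
    drive, rest = ntpath.splitdrive(event["target"])
    return (event["action"] == "file_write"
            and bool(drive)
            and rest.lower() == "\\autorun.inf")


def changes_browser_start_page(event: dict) -> bool:
    """The Internet Explorer start page registry value is set (assumed marker)."""
    return (event["action"] == "registry_set"
            and event["target"].lower().endswith(
                "\\internet explorer\\main\\start page"))


INDICATORS = {
    "file dropped in System32": drops_into_system32,
    "autorun file at drive root": writes_autorun_at_drive_root,
    "browser start page changed": changes_browser_start_page,
}


def tag_behaviors(events: list[dict]) -> list[str]:
    """Return the sorted names of all indicators matched by at least one event."""
    return sorted(name for name, check in INDICATORS.items()
                  if any(check(event) for event in events))
```

Tags of this kind do not assign a malware type on their own; they only summarize observed behavior so that the analyst can compare it with the descriptions in this section.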

V. CONCLUSION
In this paper we have presented a behavior analysis experiment conducted in order to learn about and understand current malware behavior. Such understanding is very important for an organization in planning and implementing suitable security mechanisms to avoid malware attacks. According to the statistical analysis report, the most likely malware threats for the campus network are worms and Trojans. We conclude that current malware is difficult to classify using analysis tools because it exhibits a variety of behaviors that are difficult to define. In some cases, different tools classify the same malware as different types. Because of that limitation, our method also uses human expertise to customize the analysis results generated by the analysis tools. We plan to extend our future research with a better behavior analysis approach, improved malware classification techniques and optimized malware detection.

ACKNOWLEDGMENT
This work was supported by Short-term Grant No. 304/PKOMP/639021, School of Computer Science, Universiti Sains Malaysia, Penang, Malaysia.

REFERENCES
[1] M. Christodorescu and S. Jha, "Testing Malware Detectors," ISSTA'04 in ACM, pp. 34-44, 2004.
[2] M. Christodorescu, S. Jha and C. Kruegel, "Mining Specifications of Malware Behavior," ESEC/FSE'07 in ACM, pp. 5-14, 2007.
[3] M. Apel, C. Bockermann and M. Meier, "Measuring Similarity of Malware Behavior," SICK 2009 in IEEE Xplore, pp. 891-898, October 2009.
[4] S. Noreen, S. Murtaza, M. Zubair and M. Farooq, "Evolvable Malware," GECCO'09 in ACM, pp. 1569-1576, 2009.
[5] M. D. Preda, M. Christodorescu, S. Jha and S. Debray, "A Semantics-Based Approach to Malware Detection," ACM Transactions on Programming Languages and Systems, Vol. 30, No. 5, Article 25, 2008.
[6] J. Aycock, Computer Viruses and Malware. United States, Springer, 2006.
[7] White Paper: A Brief History of Malware. McAfee, 2005.
[8] G. Wagener, R. State and A. Dulaunoy, "Malware Behavior Analysis," Journal in Computer Virology, Vol. 4, pp. 279-287, 2008.
[9] L. Martignoni, R. Paleari and D. Bruschi, "A Framework for Behavior-Based Malware Analysis in the Cloud," ICISS 2009 in Springer, pp. 178-192, 2009.
[10] U. Bayer, P. M. Comparetti, C. Hlauschek, C. Kruegel and E. Kirda, "Scalable, Behavior-Based Malware Clustering," 16th Annual Network and Distributed System Security Symposium (NDSS 2009), San Diego, February 2009.
[11] Capture-hpc client honeypot / honeyclient, https://projects.honeynet.org/
[12] Amun: Python honeypot, http://amunhoney.sourceforge.net/
[13] CWSandbox Honeyblog, http://honeyblog.org/categories/5-CWSandbox
[14] Anubis: Analyzing unknown binaries, http://anubis.iseclab.org/
