
International Journal of

Computer Science Issues

Security Systems and Technologies

Volume 2, August 2009


ISSN (Online): 1694-0784
ISSN (Printed): 1694-0814

© IJCSI PUBLICATION
www.IJCSI.org
EDITORIAL

Several journals serve the areas of computer science, each with different
policies. IJCSI is among the few that believe giving free access to
scientific results will help advance computer science research and help
fellow scientists.

IJCSI takes particular care in ensuring wide dissemination of its authors’
works. Apart from being indexed in other databases (Google Scholar,
DOAJ, CiteSeerX, etc.), IJCSI makes articles available for free download
to increase their chance of being cited. Furthermore, unlike most
journals, IJCSI sends a printed copy of its issue to the concerned authors
free of charge, irrespective of geographic location.

The IJCSI Editorial Board is pleased to present IJCSI Volume Two (IJCSI Vol.
2, 2009). This edition is the result of a special call for papers on Security
Systems and Technologies. The paper acceptance rate for this issue is
33.3%, determined after all submitted papers had received important
comments and recommendations from our reviewers.

We sincerely hope you will find important ideas, concepts, techniques,
or results in this special issue.

As a final word: PUBLISH, GET CITED and MAKE AN IMPACT.

IJCSI Editorial Board


August 2009
www.ijcsi.org
IJCSI EDITORIAL BOARD

Dr Tristan Vanrullen
Chief Editor
LPL, Laboratoire Parole et Langage - CNRS - Aix en Provence, France
LABRI, Laboratoire Bordelais de Recherche en Informatique - INRIA - Bordeaux,
France
LEEE, Laboratoire d'Esthétique et Expérimentations de l'Espace - Université d'Auvergne,
France

Dr Mokhtar Beldjehem
Professor
Sainte-Anne University
Halifax, NS, Canada

Dr Pascal Chatonnay
Assistant Professor
Maître de Conférences
Université de Franche-Comté (University of Franche-Comté)
Laboratoire d'informatique de l'université de Franche-Comté (Computer Science
Laboratory of the University of Franche-Comté)

Prof N. Jaisankar
School of Computing Sciences, VIT University
Vellore, Tamilnadu, India
IJCSI REVIEWERS COMMITTEE

• Mr. Markus Schatten, University of Zagreb, Faculty of Organization
and Informatics, Croatia
• Mr. Forrest Sheng Bao, Texas Tech University, USA
• Mr. Vassilis Papataxiarhis, Department of Informatics and
Telecommunications, National and Kapodistrian University of Athens,
Panepistimiopolis, Ilissia, GR-15784, Athens, Greece
• Dr Modestos Stavrakis, University of the Aegean, Greece
• Prof Dr. Mohamed Abdelall Ibrahim, Faculty of Engineering -
Alexandria University, Egypt
• Dr Fadi KHALIL, LAAS -- CNRS Laboratory, France
• Dr Dimitar Trajanov, Faculty of Electrical Engineering and Information
Technologies, Ss. Cyril and Methodius University - Skopje, Macedonia
• Dr Jinping Yuan, College of Information System and
Management, National Univ. of Defense Tech., China
• Dr Alexios Lazanas, Ministry of Education, Greece
• Dr Stavroula Mougiakakou, University of Bern, ARTORG Center for
Biomedical Engineering Research, Switzerland
• Dr DE RUNZ, CReSTIC-SIC, IUT de Reims, University of Reims,
France
• Mr. Pramodkumar P. Gupta, Dept of Bioinformatics, Dr D Y Patil
University, India
• Dr Alireza Fereidunian, School of ECE, University of Tehran, Iran
• Mr. Fred Viezens, Otto-Von-Guericke-University Magdeburg, Germany
• Mr. J. Caleb Goodwin, University of Texas at Houston: Health Science
Center, USA
• Dr. Richard G. Bush, Lawrence Technological University, United States
• Dr. Ola Osunkoya, Information Security Architect, USA
• Mr. Kotsokostas N.Antonios, TEI Piraeus, Hellas
• Prof Steven Totosy de Zepetnek, U of Halle-Wittenberg & Purdue U &
National Sun Yat-sen U, Germany, USA, Taiwan
• Mr. M Arif Siddiqui, Najran University, Saudi Arabia
• Ms. Ilknur Icke, The Graduate Center, City University of New York,
USA
• Prof Miroslav Baca, Associated Professor/Faculty of Organization and
Informatics/University of Zagreb, Croatia
• Dr. Elvia Ruiz Beltrán, Instituto Tecnológico de Aguascalientes,
Mexico
• Mr. Moustafa Banbouk, Engineer du Telecom, UAE
• Mr. Kevin P. Monaghan, Wayne State University, Detroit, Michigan,
USA
• Ms. Moira Stephens, University of Sydney, Australia
• Ms. Maryam Feily, National Advanced IPv6 Centre of Excellence
(NAV6) , Universiti Sains Malaysia (USM), Malaysia
• Dr. Constantine YIALOURIS, Informatics Laboratory Agricultural
University of Athens, Greece
• Dr. Sherif Edris Ahmed, Ain Shams University, Fac. of agriculture,
Dept. of Genetics, Egypt
• Mr. Barrington Stewart, Center for Regional & Tourism Research,
Denmark
• Mrs. Angeles Abella, U. de Montreal, Canada
• Dr. Patrizio Arrigo, CNR ISMAC, Italy
• Mr. Anirban Mukhopadhyay, B.P.Poddar Institute of Management &
Technology, India
• Mr. Dinesh Kumar, DAV Institute of Engineering & Technology, India
• Mr. Jorge L. Hernandez-Ardieta, INDRA SISTEMAS / University
Carlos III of Madrid, Spain
• Mr. AliReza Shahrestani, University of Malaya (UM), National
Advanced IPv6 Centre of Excellence (NAv6), Malaysia
• Mr. Blagoj Ristevski, Faculty of Administration and Information
Systems Management - Bitola, Republic of Macedonia
• Mr. Mauricio Egidio Cantão, Department of Computer Science /
University of São Paulo, Brazil
• Mr. Thaddeus M. Carvajal, Trinity University of Asia - St Luke's
College of Nursing, Philippines
• Mr. Jules Ruis, Fractal Consultancy, The Netherlands
• Mr. Mohammad Iftekhar Husain, University at Buffalo, USA
• Dr. Deepak Laxmi Narasimha, VIT University, INDIA
• Dr. Paola Di Maio, DMEM University of Strathclyde, UK
• Dr. Bhanu Pratap Singh, Institute of Instrumentation Engineering,
Kurukshetra University Kurukshetra, India
• Mr. Sana Ullah, Inha University, South Korea
• Mr. Cornelis Pieter Pieters, Condast, The Netherlands
• Dr. Amogh Kavimandan, The MathWorks Inc., USA
• Dr. Zhinan Zhou, Samsung Telecommunications America, USA
• Mr. Alberto de Santos Sierra, Universidad Politécnica de Madrid, Spain
• Dr. Md. Atiqur Rahman Ahad, Department of Applied Physics,
Electronics & Communication Engineering (APECE), University of Dhaka,
Bangladesh
• Dr. Charalampos Bratsas, Lab of Medical Informatics, Medical Faculty,
Aristotle University, Thessaloniki, Greece
• Ms. Alexia Dini Kounoudes, Cyprus University of Technology, Cyprus
• Mr. Anthony Gesase, University of Dar es salaam Computing Centre,
Tanzania
• Dr. Jorge A. Ruiz-Vanoye, Universidad Juárez Autónoma de Tabasco,
Mexico
• Dr. Alejandro Fuentes Penna, Universidad Popular Autónoma del
Estado de Puebla, México
• Dr. Ocotlán Díaz-Parra, Universidad Juárez Autónoma de Tabasco,
México
• Mrs. Nantia Iakovidou, Aristotle University of Thessaloniki, Greece
• Mr. Vinay Chopra, DAV Institute of Engineering & Technology,
Jalandhar
• Ms. Carmen Lastres, Universidad Politécnica de Madrid - Centre for
Smart Environments, Spain
• Dr. Sanja Lazarova-Molnar, United Arab Emirates University, UAE
• Mr. Srikrishna Nudurumati, Imaging & Printing Group R&D Hub,
Hewlett-Packard, India
• Dr. Olivier Nocent, CReSTIC/SIC, University of Reims, France
• Mr. Burak Cizmeci, Isik University, Turkey
• Dr. Carlos Jaime Barrios Hernandez, LIG (Laboratory Of Informatics of
Grenoble), France
• Mr. Md. Rabiul Islam, Rajshahi University of Engineering &
Technology (RUET), Bangladesh
• Dr. LAKHOUA Mohamed Najeh, ISSAT - Laboratory of Analysis and
Control of Systems, Tunisia
• Dr. Alessandro Lavacchi, Department of Chemistry - University of
Firenze, Italy
• Mr. Mungwe, University of Oldenburg, Germany
• Mr. Somnath Tagore, Dr D Y Patil University, India
• Mr. Nehinbe Joshua, University of Essex, Colchester, Essex, UK
• Ms. Xueqin Wang, ATCS, USA
• Dr. Borislav D Dimitrov, Department of General Practice, Royal
College of Surgeons in Ireland, Dublin, Ireland
• Dr. Fondjo Fotou Franklin, Langston University, USA
• Mr. Haytham Mohtasseb, Department of Computing - University of
Lincoln, United Kingdom
• Dr. Vishal Goyal, Department of Computer Science, Punjabi
University, Patiala, India
• Mr. Thomas J. Clancy, ACM, United States
• Dr. Ahmed Nabih Zaki Rashed, Dr. in Electronic Engineering, Faculty
of Electronic Engineering, Menouf 32951, Electronics and Electrical
Communication Engineering Department, Menoufia University, Egypt
• Dr. Rushed Kanawati, LIPN, France
• Mr. Koteshwar Rao, K G REDDY COLLEGE OF
ENGG.&TECH,CHILKUR, RR DIST.,AP, INDIA
• Mr. M. Nagesh Kumar, Department of Electronics and Communication,
J.S.S. research foundation, Mysore University, Mysore-6, India
• Dr. Babu A Manjasetty, Research & Industry Incubation Center,
Dayananda Sagar Institutions, India
• Mr. Saqib Saeed, University of Siegen, Germany
• Dr. Ibrahim Noha, Grenoble Informatics Laboratory, France
• Mr. Muhammad Yasir Qadri, University of Essex, UK
TABLE OF CONTENTS

1. Towards a General Definition of Biometric Systems


Markus Schatten, Miroslav Baca and Mirko Cubrilo, Faculty of Organization and Informatics,
University of Zagreb, Pavlinska 2, 42000 Varaždin, Croatia

2. Philosophical Survey of Passwords


M Atif Qureshi, Arjumand Younus and Arslan Ahmed Khan, UNHP Research Karachi, Sind,
Pakistan

3. Global Heuristic Search on Encrypted Data (GHSED)


Maisa Halloush, Department of Computer Science, Al-Balqa Applied University, Amman,
Jordan
Mai Sharif, Barwa Technologies, Doha, Qatar

4. Comprehensive Security Framework for Global Threads Analysis


Jacques Saraydaryan, Exaprotect R&D, Villeurbanne, 69100, France
Fatiha Benali and Stéphane Ubeda, INSA Lyon, Villeurbanne, 69100, France

5. Self-Partial and Dynamic Reconfiguration Implementation for AES using FPGA


Zine El Abidine Alaoui Ismaili and Ahmed Moussa, Innovative Technologies Laboratory,
National School of Applied Sciences, Tangier, PBox 1818, Morocco

6. Web Single Sign-On Authentication using SAML


Kelly D. Lewis, Information Security, Brown-Forman Corporation, Louisville, KY 40210, USA
James E. Lewis, Engineering Fundamentals, Speed School of Engineering, University of
Louisville, Louisville, KY 40292, USA

7. An Efficient Secure Multimodal Biometric Fusion Using Palmprint and Face Image
Nageshkumar.M, Mahesh.PK and M.N. Shanmukha Swamy, Department of Electronics and
Communication, J.S.S. research foundation, Mysore University, Mysore-6

8. DPRAODV: A Dynamic Learning System Against Blackhole Attack In AODV Based MANET

Payal N. Raj, Computer Engineering Department, SVMIT, Bharuch, Gujarat, India
Prashant B. Swadas, Computer Engineering Department, B.V.M., Anand, Gujarat, India
IJCSI International Journal of Computer Science Issues, Vol. 2, 2009 1
ISSN (Online): 1694-0784
ISSN (Printed): 1694-0814

Towards a General Definition of Biometric Systems

Markus SCHATTEN1 , Miroslav BAČA1 and Mirko ČUBRILO1


1
Faculty of Organization and Informatics, University of Zagreb
Pavlinska 2, 42000 Varaždin, Croatia
{markus.schatten, miroslav.baca, mirko.cubrilo}@foi.hr

Abstract

A foundation for closing the gap between biometrics in the narrower and the broader perspective is presented through a conceptualization of biometric systems in both perspectives. A clear distinction between verification, identification and classification systems is made, and it is shown that there are additional classes of biometric systems. Finally, a Unified Modeling Language model is developed showing the connections between the two perspectives.

Key words: biometrics, biometric system, set mappings, conceptualization, classification.

1. Introduction

The term biometrics, coming from the ancient Greek words βίος (bios) for life and μέτρον (metron) for measure, is often used in different contexts to denote different meanings. At the same time there are very similar and often synonymous terms in use, like biometry, biological statistics, biostatistics, behaviometrics etc. The main aim of this paper is to show the connection between these various views of biometrics as well as to continue our research on the essence of biometric systems.

In [2] we showed how to apply a system theory approach to the general biometric identification system developed by [8] in order to extend it to be applicable to unimodal as well as multimodal biometric identification, verification and classification systems in the narrower (security) perspective of biometrics. The developed system model is partially presented in figure 1.

Fig. 1 Pseudo system diagram of the developed model.

In [3] we argued that there is a need for an open biometrics ontology, which was afterwards partially built in [1] and [7]. During the development of this ontology, crucial concepts like biometric system, model, method, sample, characteristic, feature, extracted structure as well as others were defined. We also developed a full taxonomy of biometric methods in the narrower perspective in [6] that contributed to a unique framework for communication.

All this previous research showed that there is confusion when talking about different types or classes of biometric systems. Most contemporary literature only distinguishes between verification and identification systems, but some of our research showed that there are more classes, like simple classification systems, that seem to be a generalization of verification as well as identification systems [7]. As we shall show in the following reasoning, by taking into consideration the input and output sets of the different processes in biometric systems that define biometric methods, as well as the mappings between them, a concise conceptualization emerges that seems to be applicable to any biometric system.

2. Basic Definitions in Biometrics

In order to reason about biometrics we need to introduce some basic definitions of concepts used in this paper. These definitions were crucial to the development of an ontology of selected biometrics segments as well as a taxonomy of biometric methods.

First of all, we can approach biometrics in a broader and in a narrower perspective, as indicated before. In the broader perspective biometrics is the statistical research on biological phenomena; it is the use of mathematics and statistics in understanding living beings [4]. In the narrower perspective we can define biometrics as the research of possibilities to recognize persons on the basis


of their physical and/or behavioral (psychological) characteristics. We shall approach biometrics in the broader perspective in this paper.

A biometric characteristic is a biological phenomenon's physical or behavioral characteristic that can be used in order to recognize the phenomenon. In the narrower perspective of biometrics, physical characteristics are genetically implied (possibly environmentally influenced) characteristics (like a person's face, iris, retina, finger, vascular structure etc.). Behavioral or psychological characteristics are characteristics that one acquires or learns during her life (like a handwritten signature, a person's gait, her typing dynamics or voice characteristics). These definitions translate almost directly into the broader perspective of biometrics. Depending on the number of characteristics used for recognition, biometric systems can be unimodal (when only one biometric characteristic is used) or multimodal (if more than one characteristic is used).

A biometric structure is a special feature of some biometric characteristic that can be used for recognition (for example, a biometric structure for the human biometric characteristic finger is the structure of papillary lines and minutiae; for the human biometric characteristic gait it is the structure of body movements during a human's walk etc.).

The word method comes from the ancient Greek μέθοδος (methodos), which literally means “way or path of transit” and implies an orderly logical arrangement (usually in steps) to achieve an intended goal [9; pp. 29]. Thus a biometric method is a series of steps or activities conducted to process biometric samples of some biometric characteristic, usually to find the biometric characteristic's holder (in the narrower perspective) or a special feature of the biometric sample (in the broader perspective).

A model is a (not necessarily exact) image of some system. Its main purpose is to facilitate the acquiring of information about the original system [5; pp. 249]. A biometric model is thus a sample of a biometric system that facilitates the acquiring of information about the system itself as well as information about biometric characteristics. In [2] and [7] we showed that biometric models consist of biometric methods for preprocessing and feature extraction, quality control as well as recognition.

A sample is a measured quantity or set of quantities of some phenomenon in time and/or space. Thus a biometric sample represents a measured quantity or set of quantities of a biological phenomenon [7].

A biometric template or extracted structure is a quantity or set of quantities acquired by a conscious application of a biometric feature extraction or preprocessing method on a biometric sample. These templates are usually stored in a biometric database and used for reference during recognition, training or enrollment of a biometric system.

3. Conceptualizing Input and Output Mappings

Having the basic concepts defined, we can formalize the domain using the following seven sets: (1) the set of all biometric samples, (2) the set of all preprocessed samples, (3) the set of all biometric templates or extracted structures, (4) a subset of the set of extracted structures, representing all extracted structures that are suitable for recognition after quality control, (5) the set of all biological phenomena (in the broader perspective) or all persons (in the narrower one) represented by biometric structures on behalf of which recognition is made possible, (6) the subset of all biological phenomena that are enrolled, and (7) the set of all recognition classes.

Using these sets we can formalize the classes of biometric methods shown in figure 1. The sampling process, the preprocessing, the feature extraction process, the quality control process as well as the recognition process are described using the mappings shown in the following set of equations respectively:

Figure 2 shows the mapping of the sampling process. One can observe that every value from the set of samples has its argument in the set of biological phenomena. Arguments from the set of biological phenomena can have 0 or more values in the set of samples.
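
The five processes and the mappings between their input and output sets can be illustrated on a toy example. The following is a minimal Python sketch, not the authors' formalization: every identifier (`sampling`, `preprocess`, `extract`, `quality_control`, `recognize`, `run`) and all toy data are hypothetical, chosen only to show which steps assign a value to every argument and which may leave an argument without one.

```python
from typing import List, Optional

# Hypothetical toy data; none of these names come from the paper.
# Sampling is a relation: a phenomenon may yield 0 or more samples.
sampling = {"alice": ["a1", "a2"], "bob": ["b1"], "carol": []}


def preprocess(sample: str) -> str:
    """Every sample has a preprocessed form (here: upper-casing)."""
    return sample.upper()


def extract(preprocessed: str) -> str:
    """Every preprocessed sample yields a structure (here: first letter)."""
    return preprocessed[0]


def quality_control(structure: str) -> Optional[str]:
    """Structures that fail the (made-up) quality test are abandoned."""
    return structure if structure in {"A", "B"} else None


def recognize(structure: str) -> Optional[str]:
    """Only some approved structures map to a recognition class."""
    return {"A": "alice"}.get(structure)


def run(phenomenon: str) -> List[Optional[str]]:
    """Chain the processes for every sample taken of one phenomenon."""
    results = []
    for sample in sampling[phenomenon]:
        approved = quality_control(extract(preprocess(sample)))
        results.append(recognize(approved) if approved is not None else None)
    return results
```

In this toy setup `run("carol")` returns an empty list because that phenomenon was never sampled, and `run("bob")` returns `[None]` because its structure passes quality control but is not recognized.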


This can be explained easily because there is a high probability that not every biological phenomenon will be sampled by one biometric system, but every biometric sample is a sample of a real biological phenomenon (presuming that there are no fake biometric samples in the set of samples). There is also a considerable probability that a group of biological phenomena will yield the same sample, depending on the quality of the sampling technology.

Fig. 2 Mapping of the sampling process.

In multimodal systems one sample can be made on behalf of more than one feature, which would yield a different figure than the one above. But since every sample is made on behalf of a fixed number of features (the number of characteristics used in the multimodal system), we can consider the tuple of these partial features that are being sampled to be only one feature. The mapping would thus have that many arguments and the figure would not change; likewise, the domain set would consist of such tuples.

Fig. 3 Mapping of the preprocessing process.

Figure 3 shows the mapping of the preprocessing process, which happens to be a function (since every element from the domain, the set of samples, is associated with some element in the co-domain, the set of preprocessed samples). The function is surjective but not necessarily injective, since some samples can yield the same preprocessed sample even if they are distinct.

Figure 4 shows the mapping of the feature extraction process, which also happens to be a function (since every element from the domain, the set of preprocessed samples, is associated with some element in the co-domain, the set of extracted structures). The function is likewise surjective and likewise not necessarily injective, since some preprocessed samples can yield the same extracted structure even if they are distinct.

Fig. 4 Mapping of the feature extraction process.

There is another possibility in multimodal systems, when samples aren't multimodal but structures are extracted from multiple samples. As in the case of the sampling process mentioned before, we can consider the elements of the set of preprocessed samples to be tuples of samples (with as many components as there are characteristics used in the multimodal system) that are used to extract a single structure.

Fig. 5 Mapping of the quality control process.

The mapping of the quality control process is shown in figure 5. Every value from the set of structures suitable for recognition has its argument in the set of extracted structures, but the opposite does not necessarily hold true, since some extracted structures do not pass the quality test and are abandoned.
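
The properties claimed for these mappings can be checked mechanically on small finite examples. The following is a minimal sketch, assuming each mapping is represented as a Python dict from arguments to values; the helper functions and the sample names (`s1`, `p1`, `x1` and so on) are hypothetical and serve only as illustration.

```python
def is_total(mapping, domain):
    """True if every element of the domain has a value."""
    return all(x in mapping for x in domain)


def is_surjective(mapping, codomain):
    """True if the values cover the whole co-domain."""
    return set(mapping.values()) == set(codomain)


def is_injective(mapping):
    """True if no two arguments share the same value."""
    values = list(mapping.values())
    return len(values) == len(set(values))


# Preprocessing: two distinct samples yield the same preprocessed sample.
samples = {"s1", "s2", "s3"}
preprocessed = {"p1", "p2"}
preprocessing = {"s1": "p1", "s2": "p1", "s3": "p2"}

# Quality control: x2 fails the test and is abandoned (no value).
extracted = {"x1", "x2", "x3"}
quality_control = {"x1": "x1", "x3": "x3"}
```

Here `preprocessing` is total and surjective but not injective, while `quality_control` is not total over `extracted` yet assigns unique values, matching the behavior described in the text.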


Thus, every argument from the set of extracted structures has 0 or 1 values in the set of structures suitable for recognition, and the values are unique.

Likewise, figure 6 shows the mapping of the recognition process, which is similar to the previous one. Again, every value from the set of recognition classes has its argument in the set of structures suitable for recognition, but the opposite is not necessarily true, since some structures that passed the quality test cannot be recognized and classified into one of the classes for recognition. We could define an extended set of classes containing an additional class for all unrecognized structures, but we left this part out for the sake of concept consistency and simplicity. Thus, every argument in the set of structures suitable for recognition has 0 or 1 image in the set of recognition classes, whereby the images are not necessarily unique.

Fig. 6 Mapping of the recognition process.

Special cases of the recognition process mapping include the case when and the case when is a mapping of two variables. In the former case we have the mapping that represents an actively functioning biometric identification system. In the latter case we have the mapping (whereby elements of are tuples where and ) that represents an actively functioning biometric verification system.

4. Conceptualizing Mapping Cardinalities of the Recognition Process

If we consider the mapping of the recognition process and presume that the biometric system is active (thereby eliminating passive periods) we can observe the following situations:

• If one tuple of one person information and one extracted structure is mapped to exactly one class and then the system is a biometric verification system in normal (active) functioning. We denote this mapping with .
• If one extracted structure is being mapped to one of classes where is the cardinality of set then the system is a biometric classification system in normal (active) functioning. We denote this mapping with .
• If one extracted structure is being mapped to one of classes where is the cardinality of set and then the system is a biometric identification system in normal (active) functioning. We denote this mapping with .
• If tuples of person information and extracted structures (where is the cardinality of set ) are being mapped individually into exactly one class and when then the system is a biometric verification system during training. We denote this mapping with .
• If extracted structures are being mapped individually into one of classes then the system is a biometric classification system during training. We denote this mapping with .
• If extracted structures are being mapped individually into one of classes and then the system is a biometric identification system during training. We denote this mapping with .
• If groups of tuples consisting of person information and extracted structure are being mapped into exactly one of classes (whereby , and is the number of extracted structures per person) the system is a biometric verification system during enrollment. We denote this mapping with . (Usually a standard number of samples is used for enrollment, but the number can vary due to the lack of such a standard or due to samples eliminated during other processes of the biometric system.)
• If groups of extracted structures are being mapped into one of classes (whereby , and is the number of


extracted structures per class) the system is a biometric classification system during enrollment. We denote this mapping with .
• If groups of extracted structures are being mapped into one of classes (whereby , and is the number of extracted structures per person) the system is a biometric identification system during enrollment. We denote this mapping with .

From this reasoning we can conclude that biometric verification and identification systems are only special cases of biometric classification systems, in which the classes into which extracted structures are mapped correspond to the set of biological phenomena (or persons in the narrower sense) that are enrolled. Further, we can observe three distinct situations in the recognition process cardinalities of biometric systems defined in equation where is the number of extracted structures (or tuples in the case of verification systems) on the input to the recognition process, and the number of classes (outputs) into which the inputs are being mapped.

1. In the case when and the biometric system is in normal (everyday) use.
2. In the case when and the biometric system is in the training phase.
3. In the case when and the biometric system is in the enrollment phase (whereby is a positive integer possibly inside an interval, ).

5. Conceptualizing Relations Between the Defined Sets

Figure 7 shows the UML (Unified Modeling Language) class diagram of the defined concepts, which gives us even deeper insight into the domain being conceptualized. Every class corresponds to one of the previously defined sets. The class Phenomenon corresponds to the set of all biological phenomena. As the diagram shows, there is a special subset defined by the class Person. Every Person instance is an instance of Phenomenon. There are also two other special subsets, corresponding to the set of all enrolled phenomena, or enrolled persons in the narrower sense of biometrics. Thus every instance of Enrolled phenomenon is an instance of Phenomenon, every instance of Enrolled person is an instance of Person, and every Enrolled person is an instance of Enrolled phenomenon.

Fig. 7 UML Class Diagram of the Defined Concepts.

As shown in the diagram, any Phenomenon can consist of zero or more biometric Structure instances, while a biometric Structure is part of exactly one biological Phenomenon. Following the process flow from figure 1 we can observe that every Sample instance is made on behalf of one or more Structure instances (presuming again that there are only real biometric samples in the set of samples), and thus the set of samples is represented by the class Sample. The case in which a sample is made on behalf of several biometric structures applies only to multimodal systems where multiple biometric structures are sampled into exactly one sample.

Every Preprocessed sample instance is derived from exactly one Sample instance, where the class Preprocessed sample corresponds to the set of preprocessed samples. Further on, every Extracted structure instance is extracted from one or more Preprocessed sample instances. The class Extracted structure represents the two sets concerning biometric templates or extracted structures, depending on the value of an instance's status attribute. If the value is untested or failed, the instance belongs to the set of all extracted structures (since the instance hasn't been tested for quality or it hasn't passed the quality test). In the opposite case, when the value is passed, the instance belongs to the set of structures suitable for recognition, since it has been tested for quality and passed the test.
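
Part of the class structure described here can be written down directly in code. The following is a minimal Python sketch under stated assumptions, not a rendering of the authors' diagram: the attribute names `name` and `label`, the `Status` enumeration and the helper `suitable_for_recognition` are hypothetical, and only the Phenomenon/Person generalization and the status-dependent membership of Extracted structure instances are modeled.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    """Assumed values of the status attribute of an extracted structure."""
    UNTESTED = "untested"
    FAILED = "failed"
    PASSED = "passed"


@dataclass(frozen=True)
class Phenomenon:
    name: str


@dataclass(frozen=True)
class Person(Phenomenon):
    """Every Person instance is also an instance of Phenomenon."""


@dataclass(frozen=True)
class ExtractedStructure:
    label: str
    status: Status = Status.UNTESTED


def suitable_for_recognition(structures):
    """Keep only the structures that passed the quality test."""
    return [s for s in structures if s.status is Status.PASSED]


structures = [
    ExtractedStructure("x1", Status.PASSED),
    ExtractedStructure("x2", Status.FAILED),
    ExtractedStructure("x3"),  # never tested
]
```

With these toy instances, only `x1` is returned by `suitable_for_recognition(structures)`; the other two remain members of the larger set of all extracted structures.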


The enumeration holding the values of the status attribute has been left out of the diagram for the sake of simplicity. The case when an extracted structure is extracted from several biometric samples applies only to multimodal biometric systems that extract features on behalf of several biometric samples, whilst the case when one extracted structure is extracted from only one sample applies to unimodal biometric systems.

Every Extracted structure can be classified into zero or more instances of Class, whilst every Class instance applies to zero or more instances of Extracted structure. The Class class represents the set of recognition classes, as is obvious from our previous reasoning. There is a correspondence between the Class class and the Enrolled phenomenon class depending on the purpose of the system, as argued before.

From this reasoning we can conclude that the classes Structure, Sample, Preprocessed sample, Extracted structure and Class apply to biometrics in both the narrower and the broader perspective. If the connected classes are Phenomenon and Enrolled phenomenon, we are talking about the broader perspective of biometrics. In the other case, when the connected classes are Person and Enrolled person, the narrower perspective comes into play. Since Person is a special case of Phenomenon and Enrolled person is a special case of Enrolled phenomenon, the narrower perspective of biometrics is only a special case of the broader one.

6. Conclusion

In this paper we showed a simple conceptualization of biometric systems. If one considers a general biometric system consisting of a series of processes, she can observe the input and output sets of any given process. By mapping these sets in a sequence of events one can observe their features. The recognition process is of special interest since the special cases of the possible mappings define the three types of biometric systems (classification, verification, identification) as well as the three possible processing conditions (everyday use, training, enrollment).

As we showed, biometric verification and identification systems are only special cases of biometric classification systems where the number of classes into which samples are classified is equivalent to the number of enrolled biological phenomena (in the broader sense of biometrics) or the number of enrolled persons (in the narrower perspective).

We argued that biometrics in the narrower and in the broader perspective have a lot in common, especially when talking about data and data manipulation techniques. Biometrics in the narrower perspective is and remains a special case of biometrics in the broader perspective. Thus this conceptualization presents a clear framework for communication on any biometric system topic.

The only thing that seems to be different is the semantic context in which the same methods are used. So we ask ourselves, why make a difference? The developed UML model merges the two perspectives by stating that biometrics in information sciences and information system security is a specialization of biometrics in mathematics, statistics and biology. The narrower perspective heavily depends on theories from the broader one, but insights from information system security biometrics are of course useful in the biology, mathematics and statistics perspective, especially when talking about system planning and implementation.

If we add this conceptualization to our previously developed open ontology of chosen parts of biometrics, as well as to the developed systematization and taxonomy of biometric methods, characteristics, features, models and systems, we get an even clearer framework for communicating about biometrics that puts our research into a broader perspective. Future research shall yield an open ontology of biometrics in the broader perspective.

Acknowledgments

Results presented in this paper came from the scientific project “Methodology of biometrics characteristics evaluation” (No. 016-0161199-1721) supported by the Ministry of Science, Education and Sports of the Republic of Croatia.

References
[1] M. Bača, M. Schatten, and B. Golenja, “Modeling Biometrics Systems in UML”, in IIS2007 International Conference on Intelligent and Information Systems Proceedings, 2007, Vol. 18, pp. 23–27.
[2] M. Bača, M. Schatten, and K. Rabuzin, “A Framework for

IJCSI
IJCSI International Journal of Computer Science Issues, Vol. 2, 2009 7

Systematization and Categorization of Biometrics


Methods”. in IIS2006 International Conference on M. Schatten received his bachelors degree in Information systems
Intelligent and Information Systems Proceedings. 2006, (2005), and his masters degree in Information Sciences (2008)
both on the faculty of Organization and Informatics, University of
Vol. 17, pp. 271–278.
Zagreb where he is currently a teaching and research assistent. He
[3] M. Bača, M. Schatten, and K. Rabuzin, “Towards an Open is a member of the Central European Conference on Intelligent and
Biometrics Ontology”, Journal of Information and Information Systems organizing comitee. He is a researcher in the
Organizational Sciences, Vol. 31, No. 1, 2007, pp. 1–11. Biometrics center in Varaždin, Croatia.
[4] R. H. Jr. Giles, “Lasting Forests Glossary”. Available at
http://fwie.fw.vt.edu/rhgiles/appendices/glossb.htm, M. Bača received his bachelor degree form Faculty of Electrical
Accessed: 28th February 2005. Engineering in Osijek (1992), second bachelors degree form High
Police School in Zagreb (1996), MSc degree form Faculty of
[5] D. Radošević, Osnove teorije sustava, Zagreb: Nakladni Organization and Informatics, Varaždin (1999), PhD degree from
zavod Matice hrvatske, 2001. Faculty of Organization and Informatics, Varaždin (2003). He was
[6] M. Schatten, M. Bača, and K. Rabuzin, “A Taxonomy of an Assistant professor, University of Zagreb, Faculty of
Biometric Methods”, in ITI2008 International Organization and Informatics (2004-2007), and is currently an
Conference on Information Technology Interfaces Associated professor, University of Zagreb, Faculty of Organization
and Informatics. He is a member of various professional societies
Proceedings, Cavtat/Dubrovnik: SRCE University and head of the Central European Conference on Intelligent and
Computing Centre 2008; 389–393. Information Systems organizing comitee. He is also the head of the
[7] M. Schatten, “Zasnivanje otvorene ontologije odabranih Biometrics center in Varaždin, Croatia. He lead 2 scientific projects
segmenata biometrijske znanosti” M.S. Thesis, Faculty of granted by the Ministry of Science, Education and Sports of
Organization and Informatics, University of Zagreb, Croatia.
Varaždin, Croatia, 2008.
M. Čubrilo received his bachelors (1979) and masters (1984)
[8] J. L. Wayman, “Generalized Biometric Identification degree from the Faculty of Natural Sciences and Mathematics,
System Model”, in National Biometric Test Center University of Zagreb. He received his PhD degree from the Faculty
Collected Works 1997. - 2000. San Jose: San Jose State of Electrotechnics, University of Zagreb (1992). He is currently a
University. 2000, pp. 25–31. full professor at the Faculty of Organization and Informatics,
University of Zagreb. He was main editor of the Journal of
[9] M. Žugaj, K. Dumičić and V. Dušak, Temelji
Information and Organization Sciences. He was leader of 2
znanstvenoistraživačkog rada. Metodologija i metodika, scientific projects granted by the Ministry of Science, Education
Varaždin: TIVA & Faculty of Organization and and Sports of Croatia.
Informatics, 2006.


Philosophical Survey of Passwords


M Atif Qureshi (1), Arjumand Younus (2) and Arslan Ahmed Khan (3)

(1) UNHP Research, Karachi, Sind, Pakistan
atif.qureshi@unhp.com.pk

(2) UNHP Research, Karachi, Sind, Pakistan
arjumand.younus@unhp.com.pk

(3) UNHP Research, Karachi, Sind, Pakistan
arslan.khan@unhp.com.pk

Abstract
Over the years, security experts in the field of Information Technology
have had a tough time making passwords secure. This paper takes a careful
look at this issue from the angle of philosophy and cognitive science. We
have studied the password process to rank its strengths and weaknesses in
order to establish a quality metric for passwords. Finally, we relate the
process to the human senses, which enables us to propose a constitutional
scheme for the password process. The basic proposition is to exploit the
relationship between human senses and passwords to improve authentication
while keeping it an enjoyable activity.
Key words: Context of password, password semantics, password cognition,
constitution of password, knowledge-based authentication

1. Introduction

Information is without doubt a valuable asset in this digital age. Because
of the critical nature of information, be it personal information on
someone’s personal computer or the information systems of large
organizations, security is a major concern. There are three aspects of
computer security: authentication, authorization and encryption. The first
and most important of these layers is authentication, and it is at this
layer that passwords play a significant role. The most common
authentication mechanisms use an alphanumeric word that only the user to
be authenticated knows, commonly referred to as a password [1]. The SANS
Institute indicates that weak or nonexistent passwords are among the top
10 most critical computer vulnerabilities in homes and businesses [2].
Philosophical analysis of passwords can lead to the refinement of the
authentication process. This approach has rarely been adopted in the
exploration and design of computer security. Passwords too are entities
having an existence of their own, and this leads us to study them in a
philosophical context.

Passwords: this word is essentially composed of two words, pass and word,
so you pass if you have the right word. Even before the advent of
computers, watchwords existed in the form of secret codes; agents of a
certain command used watchwords for their respective authorization or
administration, e.g. for identifying other agents [3], and the underlying
concept is essentially the same today. Next we move on to word: in this
context a word is not necessarily something that makes dictionary-based
sense (we do keep passwords that carry no meaning, e.g. passwords like
adegj or a2b5et). Hence passwords are keys that control access. They let
you in and keep others out. They provide information control (passwords on
documents), access control (passwords to web pages) and authentication
(proving that you are who you say you are) [4]. In this paper we take a
deep look into both the theory and philosophy of passwords; in short, we
address a fundamental question: can password semantics enable passwords to
mimic Nature’s way of keeping secrets and providing security?

1.1 Why philosophical perspective of passwords

Ontology is a philosophical term used to describe a particular theory
about the nature of being or the kinds of things that have existence [5].
In the context of passwords it implies a careful and thorough dive into
the existence and nature of passwords and their relationship to users and
computers. A password has a relationship with the user’s mind and
therefore it should be linked with the specific user’s mindset by creating
a sensible bridge between the two. In short, a password must be backed by
a certain philosophy


which establishes a link between the concerned rational entities, i.e. the
user and the system of recognition.

1.2 Outline

The organization of this paper is as follows: in section 2 we take a
careful look into the problems of the existing password schemes and
analyze the existing solutions. In section 3 we propose some suggestions
in light of our philosophical approach, at the same time evaluating and
presenting a critique of the existing mechanisms. Finally, section 4
concludes the discussion.

2. The Password Problem

In the area of computer security there is a heavy reliance on passwords.
The main drawback of passwords is what is termed the “password problem”
[6] for text-based passwords. We will refer to this problem as the
“classical password problem.” This problem basically arises from the
following two facts:

1) Human memory is limited and therefore users cannot remember secure
   passwords, as a result of which they tend to pick passwords that are
   too short or easy to remember [7]. Hence passwords should be easy to
   remember.
2) Passwords should be secure, i.e., they should look random and should
   be hard to guess; they should be changed frequently, and should be
   different on different accounts of the same user. They should not be
   written down or stored in plain text. Unfortunately, users do not tend
   to follow these practices [8].

Tradeoffs have to be made between convenience and security due to the
shortcomings of text-based passwords. We now explore some techniques that
have been adopted to minimize the tradeoffs and increase computer
security.

2.1 Attempts to Address the Problem

Current authentication techniques fall into three main areas: token-based
authentication, biometric-based authentication and knowledge-based
authentication.

Token-based authentication techniques [9] use a mark or a symbol for
identification which is known only to the authenticating mechanism and is
in the possession of the user, just like a coin which has no meaning other
than that known to the mechanism. Examples are key cards and smart cards.
Many token-based authentication systems also use knowledge-based
techniques to enhance security. For example, ATM cards are generally used
together with a PIN [1].

Biometric systems are heavily used [10]. Biometric authentication refers
to technologies that measure and analyze human physical and behavioral
characteristics for authentication purposes. Examples of physical
characteristics include fingerprints, eye retinas and irises, facial
patterns and hand measurements, while examples of mostly behavioral
characteristics include signature, gait and typing patterns. Voice is
considered a mix of both physical and behavioral characteristics. However,
it can be argued that all biometric traits share physical and behavioral
aspects.

Knowledge-based techniques are the most common and will be the main focus
of our discussion; both text-based and picture-based passwords are
subcategories of them.

2.2 An Extension of Knowledge-Based Passwords

A new phenomenon that computer security researchers have recently explored
in the domain of knowledge-based passwords is that of graphical passwords,
i.e. passwords that are based on pictures. These studies are motivated by
psychological studies revealing that humans remember pictures better than
text [11]. Picture-based passwords are subdivided into recognition-based
and recall-based approaches.

Using recognition-based techniques, a user is presented with a set of
images and passes the authentication by recognizing and identifying the
images he or she selected during the registration stage. Using
recall-based techniques, a user is asked to reproduce something that he or
she created or selected earlier during the registration stage.

3. Passwords from a Philosophical Viewpoint

As previously mentioned, we focus on an ontological study of passwords in
the light of philosophy. However, ontology also has a definition in
Computer Science (more specifically in Artificial Intelligence [5]). In
fact, at the start of this century a whole new field emerged, namely
cognitive science [12], which brought scholars of philosophy and computer
science close to each other; in this field computer scientists closely
study the working of the human mind to make computational tasks efficient.
It is this approach that we also propose, and it is one main reason why we
say that passwords should be studied from a philosophical perspective.

It has never been clearly settled whether a password is a single unit of
work or a process. If the password follows


a cognitive paradigm then password recognition is a complete process, just
as the human mind follows a certain process in recognizing and
authenticating known people; similarly, computers should treat passwords
as a process in the light of philosophy. In fact we believe that many of
the drawbacks in previous approaches are due to treating the password as a
unit of work and not carefully viewing the details of the entire process
in close context with the human mind. The password recognition process is
a detailed DFD (data flow diagram) rather than a context DFD.

Once we are clear that password recognition is a process, we must look at
ways to make this process friendly for humans while ensuring security to
the maximum level. A common point raised when addressing the classical
password problem defined in the previous section is that human factors are
the weakest link in a computer security system [13]. But here we raise an
important question: is the human really the weak link, or is it the
weakness of the password evaluation procedure, which underutilizes human
intelligence and senses, that makes the human look like a naive link in
the whole process of the text-based password environment? In fact, human
intelligence and senses, if properly utilized, can result in the best
possible security mechanism.

3.1 Some Problems in Earlier Attempts

In section 2.1 we explored some attempts to solve the classical password
problem. However, each of the techniques that have been proposed has some
drawbacks, which can be summarized as follows:

• Token-based passwords, though secure, require a token (permit pass)
  which could be misplaced, stolen, forgotten or duplicated, and the
  biggest drawback is that the technique can only be applied in limited
  domains that are not within the reach of the common user.

• Biometric passwords are efficient in that they are tied to the person
  and do not require memorization; however, they are expensive solutions
  and cannot be used in every scenario.

• Knowledge-based passwords require memorization and are sometimes
  breakable or guessable.

3.2 Proposed Directions to Prevent Possible Attacks

The following directions can be adopted in order to improve the security
of passwords while making their use an enjoyable/sensible activity to
ensure user satisfaction:

1. Appropriate utilization of human senses in passwords.
2. Increase in the domain set of passwords by introducing a greater deal
   of variety.
3. Empowering the user to make a selection from the domain set of variety
   to ensure his mental and physical satisfaction.
4. Introducing a facility of randomization into the password.
5. Ensuring the establishment of a link between the system and the
   specific human mind from the domain set.

A discussion of possible attacks and tips for prevention (in light of
philosophy and cognitive science) follows:

• Brute force search: is basically a global attack on passwords that
  searches all possible combinations of alphanumerals (in the case of
  text-based passwords) and graphical images (in the case of graphical
  passwords). In short, brute force launches an attack of words that can
  be text-based, activity-based and mixed courses of action. The brute
  force attack can be prevented by applying points 2, 4 and 5 mentioned
  above, as a result of which the attack becomes computationally
  infeasible. This philosophy should be kept in mind, and the engine
  should be such that point 3 also follows as a logical consequence.

• Dictionary attacks: are regional attacks that run through a possible
  series of dictionary words, activities and mixed courses of action until
  one works. Even some graphical passwords are vulnerable to these types
  of attacks. However, these can be prevented in an effective manner by
  applying the techniques mentioned in points 4 and 5. This allows maximum
  sense exploitation, so dictionary attacks would fail most often.

• Shoulder surfing: is when an attacker directly watches a user during
  login, or when a security camera films a user, or when an
  electromagnetic pulse scanner monitors the keyboard or the mouse, or
  when Trojan login screens capture


  passwords etc. [6]. This attack can easily be prevented with the simple
  approach proposed in point 4: in the pass-word, the pass should stay the
  same but the word should not be treated as static, thereby making it a
  pass-sense.

• Guessing: is a very common problem associated with text-based passwords
  and even graphical passwords. Guesswork is possible when the domain is
  limited and choices are few; in other words, when there is a lesser
  utilization of senses. This threat can therefore be prevented by
  practicing points 1, 3 and 4.

• Spyware: is a type of malware that collects information about users'
  computational behavior as well as their personal information. This
  attack can easily fail in the light of points 4 and 5 above, which imply
  that the password makes sense to both the human and the computer but not
  to the spyware.

All these suggestions were for knowledge-based passwords, but this
philosophy can also be applied to the other two categories mentioned in
section 3.1. Biometric and token-based authentication mechanisms cannot be
deployed everywhere because of the amount of investment and ease of use.
But these authentication mechanisms can be treated as choices in the
domain set mentioned in point 2, leaving the choice to the user as
discussed in point 3.

3.3 Redefinition of the Password Problem

In the classical scenario the domain of the problem was simply limited to
text-based passwords, but the three solutions proposed, namely token-based
passwords, biometric passwords and knowledge-based passwords (under which
come both text-based and graphical passwords), widen the scope of the
problem. Furthermore, the directions that we have proposed in section 3.2
can lead to other issues in the password arena. The treatment of password
recognition as a process and the exploitation of human senses in that
process seem to be an appealing idea, but they naturally lead to a
redefinition of the password problem. Hence we must first redefine the
password problem in order to extend its domain and increase the size of
the universe of discourse.

We can redefine the problem as follows:

1. Introducing variety into the domain set of passwords is a task that
   must be given due consideration, and any attempt to implement the
   philosophical concepts explored in this paper must address the
   question: how and in what ways can variety be introduced into passwords
   so that the N^K formulation is sustained more by N than by K, where N
   is the number of possible single inputs or actions and K is the length
   of the input?
2. We have stated that password recognition is a process in itself, but
   the details and phases of that process have to be identified. To
   accommodate philosophical ideas one must carefully model the process of
   evaluation (i.e. input and validation).
3. Exploiting senses to ensure variety does not mean exhausting the user
   physically and mentally; it means enhancing the level of comfort and
   the freedom to choose from a variety that leads to securing the system
   sensibly.
4. Randomization in passwords should follow common sense rather than heavy
   mental exercise, in such a way that the senses tell the computer system
   “Yes, I am the right person. Please let me pass!”
5. In security-critical zones, heavy investment is made to ensure
   protection at the level of authentication, but there is no means of
   deciding the level of quality achieved. The discussion in section 3.2
   gives transparency for proper budgeting, the level of comfort and the
   level of security achieved in an authentication mechanism.

In short, a sensible link between the human mind and the computer system
for verification is a complex problem and a great challenge for
researchers in the field of computer security.

4. Conclusions

This paper has thrown light onto the philosophy of passwords and their
study in connection with the human mind. Although the points mentioned in
this paper have been noted by different researchers at different times,
there is no single place where the entire “password philosophy” has been
defined. Thus we have laid out the constitutional terms for any study of
intelligent and smart passwords. The three main points that we have
identified in this “Constitution of Passwords” are as follows:

1. A password is not just a unit of work; rather it is a complete process.
2. A password should incorporate the common sense of humans.
3. There must be quality assurance at the level of the authentication
   mechanism.

This philosophy can play a vital role for immediate practitioners, if they
keep the tradeoffs in mind before producing a secure solution, as well as
for researchers who wish to dive into the challenging problems that have
been left open for them.
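The N^K formulation from section 3.3 can be made concrete with a little arithmetic. The following is a minimal Python sketch, not part of the original paper: the function names and the example values of N (10 digits, 26 letters, 94 printable characters, a hypothetical set of 1000 images or actions) are assumptions chosen only for illustration. It shows how a larger N lets a shorter input of length K reach the same search space.

```python
def search_space(n_symbols: int, length: int) -> int:
    """Size of the password space N^K for N single inputs and length K."""
    return n_symbols ** length


def min_length(n_symbols: int, target: int) -> int:
    """Smallest K such that N^K >= target, using exact integer arithmetic."""
    if n_symbols < 2:
        raise ValueError("need at least two distinct inputs")
    length, space = 0, 1
    while space < target:
        space *= n_symbols
        length += 1
    return length


if __name__ == "__main__":
    target = 2 ** 40  # hypothetical size the search space should reach
    # Example values of N are illustrative assumptions, not from the paper.
    for n in (10, 26, 94, 1000):
        print(f"N={n:5d}  shortest K reaching 2^40: {min_length(n, target)}")
```

The sketch only counts combinations; it says nothing about how memorable or comfortable a given domain set is for the user, which is the other half of the question raised in the text.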

References
[1] X. Suo, Y. Zhu, G. S. Owen, "Graphical Passwords: A Survey," 21st
Annual Computer Security Applications Conference (ACSAC'05), 2005,
pp. 463-472.
[2] R. L. Wakefield, "Network Security and Password Policies", CPA Online
Journal, 2004.
http://www.nysscpa.org/cpajournal/2004/704/perspectives/p6.htm
[3] From The Histories of Polybius, published in Book VI, Vol. III of the
Loeb Classical Library edition, public domain translation: The Roman
Military System. Cited in
http://penelope.uchicago.edu/Thayer/E/Roman/Texts/Polybius/6*.html
[4] The Hacker Highschool Project, ISECOM 2004.
http://www.hackerhighschool.org/lessons/HHS_en11_Passwords.pdf
[5] S. Russell and P. Norvig, Artificial Intelligence – A Modern Approach,
2nd Edition, Pearson Education Series in Artificial Intelligence.
[6] S. Wiedenbeck, J. Waters, J.C. Birget, A. Brodskiy, N. Memon,
"Authentication using graphical passwords: Basic results",
Human-Computer Interaction International (HCII 2005), Las Vegas,
July 25-27, 2005.
[7] A. Adams and M. A. Sasse, "Users are not the enemy: why users
compromise computer security mechanisms and how to take remedial
measures," Communications of the ACM, vol. 42, pp. 41-46, 1999.
[8] M. Kotadia, "Microsoft: Write down your passwords", in ZDNet
Australia, May 23, 2005.
[9] R. Molva, G. Tsudik, "Authentication Method with Impersonal Token
Cards," IEEE Symposium on Security and Privacy, 1993, p. 56.
[10] A. Jain, L. Hong and S. Pankanti, "Biometric Identification,"
Communications of the ACM, vol. 33, pp. 167-176, 2000.
[11] R. N. Shepard, "Recognition memory for words, sentences, and
pictures," Journal of Verbal Learning and Verbal Behavior, vol. 6,
pp. 156-163, 1967.
[12] Cognitive Science Definition by Berkeley.
http://ls.berkeley.edu/ugis/cogsci/major/about.php
[13] A. S. Patrick, A. C. Long, and S. Flinn, "HCI and Security Systems,"
presented at CHI, Extended Abstracts (Workshops), Ft. Lauderdale,
Florida, USA, 2003.


Global Heuristic Search on Encrypted Data (GHSED)


Maisa Halloush (1), Mai Sharif (2)

(1) Department of Computer Science, Al-Balqa Applied University,
Amman, Jordan
Mhalloush@yahoo.com

(2) Barwa Technologies,
Doha, Qatar
Mai_shareef@hotmail.com

Abstract
Important documents are increasingly kept encrypted on remote servers. To
retrieve such encrypted data, efficient search methods are needed that
allow documents to be retrieved without revealing their content. This
paper describes a technique called global heuristic search on encrypted
data (GHSED) for searching files that are encrypted with public key
encryption and stored on an untrusted server, and for retrieving the files
that satisfy a certain search pattern without revealing any information
about the original files. The GHSED technique is intended to satisfy the
following: (1) Provable security: the untrusted server cannot learn
anything about the plaintext given only the ciphertext. (2) Controlled
searching: the untrusted server cannot search for a word without the
user's authorization. (3) Hidden queries: the user may ask the untrusted
server to search for a secret word without revealing the word to the
server. (4) Query isolation: the untrusted server learns nothing more
about the plaintext than the search result.

Key words: Heuristic Table, Controlled Search, Query Isolation, hidden
queries, false positive, hash chaining

1. Introduction

With more and more files stored on external servers that are not
necessarily trusted, concerns grow about these files falling into the
wrong hands (e.g. a server administrator could read them). Users therefore
often store their data encrypted to ensure confidentiality on remote
servers, which they use for more space, lower cost and convenience.

But what happens if the client wants to retrieve particular files (the
files that satisfy a certain search pattern or keyword)? A method is
needed to search the files for a particular keyword (search pattern) and
retrieve only the files that contain that keyword. For example, consider a
server that stores various files encrypted for "Alice" by others. The
server wants to test whether the files contain the keyword "urgent" so
that it can forward the files accordingly. "Alice", on the other hand,
does not wish to give the server the ability to decrypt all her files or
even to learn anything about the search keyword.

A technique called global heuristic search on encrypted data (GHSED),
which enables the server to search for a specific pattern in encrypted
files without revealing any information to the untrusted server and
without any loss of data confidentiality, will be defined and constructed,
and its security will be proved. It will also be shown to have a minimal
collision rate and a stable construction time, and to be applicable to
database records, emails or audit logs.

The GHSED technique is an enhancement of the HSED (Heuristic Search on
Encrypted Data) technique [1], whose authors present "a new technique
capable of handling large data keyword search in an encrypted document
using public key encryption stored in untrusted server without revealing
the content of the search and the document. The prototype provides a local
search, minimizing communication overhead and computations on both the
server and the client." [1]
According to [1], the HSED technique enables the server to search
efficiently for a keyword without communication overhead, since the
message is encrypted and the heuristic table is constructed on the client
side. It also implies no additional computation overhead on the email
server, because no decryption is performed on the server, since the model
uses a public key cryptographic system. So, unlike other techniques that
use symmetric key cryptography, the HSED technique reduces computation and
communication overhead on the server and requires no additional
computation except for calculating a hash function that serves as the
address of an entry in the heuristic table [1].

However, one disadvantage of the HSED technique is that it deals with each
document separately. When the server searches for a document that has a
specific keyword, it searches every document's heuristic table; this would
be


easy if the server has a small number of documents, but what if it has a
large number of documents? The search would then be too hard and would
take a lot of time. Therefore, the GHSED technique handles this problem by
building a GHT (Global Heuristic Table) together with the HT (Heuristic
Table) used in the HSED technique.

Another disadvantage is that HSED ignores the possibility of a repeated
word in a document; this is also addressed in the GHSED technique, since
it is a very common situation and has to be handled.

2. Global HEURISTIC SEARCH ON ENCRYPTED DATA (GHSED)

As shown in the previous section, the HSED technique uses a heuristic
table to make the search secure. Although the time needed to scan a single
heuristic table is very small, it takes O(M*e) to search all documents on
the server, because the server must go through one heuristic table per
document.

This paper uses the same idea as the HSED technique, that is, public key
cryptography and a search over all keywords in a document, but with one
heuristic table for all documents on the server. This heuristic table is
named the Global Heuristic Table (GHT), and it contains information about
each word that exists in the documents stored on the server.

Suppose "Alice" wants to store her encrypted documents in a form that is
searchable. To do this, a Global Heuristic Table (GHT) is used to contain
every keyword in the documents stored on the server; each keyword points
to all documents in which the keyword is located. The pointers are
illusory pointers; that is, the keyword points to a binary array which
contains every document number in which the keyword exists. It follows
that every document is given a number before it is stored on the server.

When "Alice" wants to retrieve all documents which contain a specific
keyword, she sends a trapdoor to the server. The server uses this trapdoor
to search the Global Heuristic Table (GHT) and find the keyword, then
retrieves the numbers of the documents which contain the keyword, and
sends those documents to "Alice".

In the next subsections we illustrate how the GHSED technique works and
show the role of every party involved in the search, that is, the document
generator, the searcher, and the server.

2.1 The Document Generator Side

When "Alice" wants to store an encrypted document on the server, she
constructs a heuristic table HT, which the server will embed in the GHT.
Then "Alice" encrypts the document with "Alice’s" public key Apub. The
document and the HT are sent to the server. These steps are illustrated in
figure 2.1.

Fig 2.1: Steps of Constructing the Files.

The heuristic table HT is constructed as follows: for each keyword in the
document a record is added to the heuristic table HT, as shown in
Table 2.1.

Table 2.1: Heuristic Table Header.

Where:

Index_i = H(W_i): a calculated hash function used as index to both HT and
GHT entries.

$KI(W_i) = \sum_{j=1}^{n} \left( chl_j(W_i) \cdot chw_j(W_i) \right)$: the
sum, over the n characters of the word, of the position of each character
multiplied by the character weight.

EW_i: the keyword W_i encrypted using Apub.

$Sum(EW_i) = \sum_{j=1}^{n} d_j$: the sum of the digits d_j of the
encrypted keyword EW_i.

Ver-key(W_i) = KI(W_i) || Sum(EW_i)
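The construction of one HT record per keyword, as described for the document generator side, can be sketched as follows. This is a minimal Python sketch under stated assumptions, not the authors' implementation: `TABLE_SIZE`, `toy_encrypt` and all function names are hypothetical, the alphabet is restricted to a-z, chl_j is taken to be the position of the j-th character in the alphabet and chw_j its 1-based position in the word (following the paper's description of the trapdoor), and the `|` separator inside Ver-key is added only to keep the concatenation unambiguous. In particular, `toy_encrypt` is a deterministic stand-in and is not public key encryption.

```python
import hashlib

TABLE_SIZE = 1024  # hypothetical number of index slots; not from the paper


def index_of(word: str) -> int:
    """Index_i = H(W_i): hash of the keyword reduced to a table slot."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % TABLE_SIZE


def ki(word: str) -> int:
    """KI(W_i): sum over characters of alphabet position * position in word."""
    total = 0
    for pos, ch in enumerate(word.lower(), start=1):
        if not ("a" <= ch <= "z"):
            raise ValueError("this sketch only handles the letters a-z")
        total += (ord(ch) - ord("a") + 1) * pos
    return total


def toy_encrypt(word: str, public_key: bytes) -> int:
    """Placeholder for encryption under Apub. NOT real public key encryption."""
    digest = hashlib.sha256(public_key + word.encode("utf-8")).digest()
    return int.from_bytes(digest, "big")


def digit_sum(n: int) -> int:
    """Sum(EW_i): sum of the decimal digits of the encrypted keyword."""
    return sum(int(d) for d in str(n))


def ht_record(word: str, public_key: bytes) -> dict:
    """One heuristic table record: index, KI and Ver-key = KI || Sum(EW_i)."""
    k = ki(word)
    s = digit_sum(toy_encrypt(word, public_key))
    return {"index": index_of(word), "ki": k, "ver_key": f"{k}|{s}"}


def build_ht(keywords, public_key: bytes) -> list:
    """Build the HT for a document; repeated keywords yield one record
    (an assumption of this sketch)."""
    return [ht_record(w, public_key) for w in dict.fromkeys(keywords)]
```

For the keyword "urgent", for example, `ki` evaluates 21·1 + 18·2 + 7·3 + 5·4 + 14·5 + 20·6 = 288.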


2.2 The Server Side

Two operations are done on the server side: the first is storing the encrypted documents in a way that makes them searchable; the second is returning the documents which contain a specific keyword when needed.

After the HT related to the encrypted document is built, it is sent to the server to be stored there. The server gives the document a number which is used to identify it. Then the server embeds the HT into the GHT. The document and its HT are stored on the server, as shown in figure 2.2.

Fig 2.2: Steps done on the server side to store the documents.

The GHT is a binary array that contains information about each keyword that exists in one or more documents stored on the server. If two words have the same index, chaining is used. Each keyword in the GHT has a pointer to a binary array, which contains the numbers of all documents in which the keyword exists. Using a binary array facilitates the operations done on the array.

Each entry in the HT should be added to the GHT such that the index in the HT and in the GHT is the same. If no index in the GHT matches the one in the HT, then a new entry is added to the GHT. This entry contains both KI and Ver-key and a pointer to a binary array which contains the document number.

If the index in the HT exists in the GHT, then the entries of the table are checked to see whether the same word is already in the GHT by comparing both KI and Ver-key. If it is a new keyword, then a new chain entry is created with a pointer to a binary array which contains the document number. But if it is an existing keyword, then only the document number is added to the binary array which holds that keyword's document numbers.

2.3 Return Documents Which Contain a Specific Keyword

When "Alice" wants to retrieve the files that contain a specific keyword, she sends a trapdoor T (Tew, TKI) to the server:
- Tew: the keyword encrypted with "Alice's" public key.
- $TKI = \sum_{j=1}^{n} \left( chl_j(W_i) \cdot chw_j(W_i) \right)$: the position of the jth character of Wi in the language multiplied by its position in the word, summed over the characters of Wi.

This trapdoor is used by the server to calculate an index in the global heuristic table GHT at which the word may be located. Note that "Alice" then signs this trapdoor using her private key. The digital signature allows the server to verify that the trapdoor was sent by the recipient:

Enc (Trapdoor(Tew, TKI), Apriv)

This leads to an entry in the global heuristic table. The server can then retrieve the first and second columns' entries from the heuristic table, which are < KI > and < Ver-key >, and calculate:

$\mathrm{Sum}(Tew) = \sum_{j=1}^{n} d_j$: the sum of the digits of the first part of the trapdoor, Tew.

Ver-key'(Tew) = KI || Sum(Tew)

The calculated Ver-key'(Tew) is compared to Ver-key(W), which is the second table entry indexed by H(Tew). If there is a match, then the word exists in one of the files stored on the server. If not, the index entry is checked for collided entries; if there are any, the chain is checked until a match is found. If no match is found, then the word does not exist.
If the word is found in the GHT, then the server checks the serial numbers of the documents that contain the specified word and sends those documents to "Alice". These steps are shown in figure 2.3.

Fig 2.3.: Steps done on the server side for searching for the documents.
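The embedding and search steps of Section 2.2 and Section 2.3 can be sketched as a chained table. This is only an illustration under stated assumptions, not the authors' implementation: the index function `KI mod size` is a hypothetical stand-in for H, the alphabet is assumed to be lowercase English, a Python set replaces the binary array of document numbers, and all class and function names are invented for the example.

```python
from dataclasses import dataclass, field


def keyword_index(word: str, alphabet: str = "abcdefghijklmnopqrstuvwxyz") -> int:
    """KI: sum of (position of the character in the alphabet) * (position in the word)."""
    return sum((alphabet.index(ch) + 1) * pos for pos, ch in enumerate(word, start=1))


@dataclass
class Entry:
    ki: int
    ver_key: str
    doc_numbers: set = field(default_factory=set)  # stands in for the binary array


class GlobalHeuristicTable:
    def __init__(self, size: int = 64):
        self.size = size
        self.slots = {}  # index -> chain (list of Entry objects)

    def _index(self, ki: int) -> int:
        # Hypothetical index function; the paper's H is not reproduced here.
        return ki % self.size

    def embed(self, heuristic_table, doc_number: int) -> None:
        """Merge one document's HT, given as (KI, Ver-key) pairs, into the GHT."""
        for ki, ver_key in heuristic_table:
            chain = self.slots.setdefault(self._index(ki), [])
            for entry in chain:
                if entry.ki == ki and entry.ver_key == ver_key:
                    entry.doc_numbers.add(doc_number)  # existing keyword
                    break
            else:
                chain.append(Entry(ki, ver_key, {doc_number}))  # new chain entry

    def search(self, ki: int, ver_key: str) -> set:
        """Walk the chain at the index; the cost grows with the chain length e."""
        for entry in self.slots.get(self._index(ki), []):
            if entry.ki == ki and entry.ver_key == ver_key:
                return set(entry.doc_numbers)
        return set()
```

With `size=4`, the keyword indexes 11 and 15 fall into the same slot, so embedding both exercises the chaining path, and a search compares KI and Ver-key along that chain.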
3. Results

The (GHSED) technique algorithm has two main parts: embedding the heuristic table into the Global heuristic table (embedding operation), and the searching part (search operation).
The embedding operation takes O(M*e) per HT, where M = maximum index number in the HT and e = number of collided entries in the chain, while the search operation takes O(e) per search, where e = number of collided entries in the chain.
It can be clearly noticed that there is overhead while embedding the HT into the Global one. This overhead increases with the number of storing operations on the server; that is, if encrypted documents need to be stored on the server often, then there is an overhead on the server. But if encrypted documents rarely need to be stored on the server, then the overhead may be ignored. On the other hand, the search time needed is very small in all cases, as can be noticed.
The previous two parts of the (GHSED) technique algorithm were tested on a number of files that range in size from 10 KB to 5000 KB. These files represent the heuristic table size in the first part of the algorithm, and represent the size of the global heuristic table which is searched in the second part of the algorithm.
The main interest is the time needed to embed the heuristic table into the global heuristic table, and the search time needed to find in which documents a specific keyword exists.
Figure 3.1 indicates that as the heuristic table size gets bigger, the time needed to embed it in the global heuristic table is higher. Note that this process is done on the server.

[Figure 3.1: plot of Time (min) against File Size (K.B.)]
Fig.3.1: Time Needed to Embed Heuristic Table into the Global Heuristic Table.

Figure 3.2 indicates that no matter what the global heuristic table size is, the search time will be constant.

[Figure 3.2: plot of Time (min) against File Size (K.B.)]
Fig.3.2: Time Needed To Search in the Global Heuristic Table.

4. Conclusion

The (GHSED) technique enables search in the entire document for any keyword, not just predefined keywords. It is efficient, fast and easy to implement. It minimizes communication and computation overhead. It can be applied to documents, emails, audit logs, and to database records. Any changes to the document can be detected because of the heuristic table. It can use hash chaining as it tightly links all entries in the array. It has no false positives; if the keyword appears to be in the document then it is in the document. It supports hidden queries and query isolation. Finally, no one can detect the content of the document from the heuristic table, so it is provably secure and it provides controlled searching.

But the (GHSED) technique cannot be applied when the email server stores the emails compressed. Efficiency is dependent on the hash function used to search for entries, so if the hash function is weak, collisions will occur more frequently and so the search will take longer.

Dealing with queries containing Boolean operations on multiple keywords remains the most significant and challenging open problem. Allowing general pattern matching, instead of keyword matching, also remains open.

References:

[1] J. Qaryouti, G. Sammour, M. Shareef, K. Kaabneh, Heuristic Search on Encrypted Data (HSED), EBEL 2005.
[2] R. C. Jammalamadaka, R. Gamboni, S. Mehrotra, K.
Seamons, N. Venkatasubramanian, iDataGuard: Middleware
Providing a Secure Network Drive Interface to Untrusted
Internet Data Storage, EDBT, 2008.
[3] R.C. Jammalamadaka, R. Gamboni, S. Mehrotra, K. Seamons
and N. Venkatasubramanian. GVault, A Gmail Based
Cryptographic Network File System. Proceedings of 21st Annual
IFIP WG 11.3 Working Conference on Data and
Applications, 2007.


[4] Bijit Hore, Sharad Mehrotra, Hakan Hacigumus. Handbook of Database Security: Applications and Trends. pages 163-190, 2007.
[5] Hakan Hacigumus, Bijit Hore, Bala Iyer, Sharad Mehrotra. Secure Data Management in Decentralized Systems. pages 383 – 425, 2007.
[6] Richard Brinkman, Jeroen Doumen, and Willem Jonker. Secure Data Management, pages 18-27, 2004.
[7] E. Goh. Secure Indexes, In the Cryptology ePrint Archive, Report 2003/216, 2004. http://eprint.iacr.org/2003/216/.
[8] Richard Brinkman, Ling Feng, Jeroen Doumen, Pieter H. Hartel, and Willem Jonker. Efficient Tree Search in Encrypted Data. WOSIS, pages 126-135, 2004.
[9] Y. Chang and M. Mitzenmacher, Privacy Preserving Keyword searches on Remote Encrypted Data, Cryptology ePrint Archive, Report 2004/051, 2004.
[10] K. Bennett, C. Grothoff, T. Horozov and I. Patrascu. Efficient sharing of encrypted data. In proceedings of ACISP 2002.
[11] B. Chor, O. Goldreich, E. Kushilevitz and M. Sudan, Private Information Retrieval, Journal of ACM, Vol. 45, No. 6, 1998.
[12] S. Jarecki, P. Lincoln and V. Shmatikov. Negotiated privacy. In the International Symposium on Software Security, 2002.
[13] W. Ogata, and K. Kurosawa. Oblivious Keyword Search, Special issue on coding and cryptography, Journal of Complexity, Vol. 20, 2004.
[14] Richard Brinkman, Berry Schoenmakers, Jeroen Doumen, and Willem Jonker. Experiments with Queries over Encrypted Data Using Secret Sharing. Information Systems Security Journal, 13(3):14–21, July 2004. http://eprints.eemcs.utwente.nl/7410/01/fulltext.pdf.
[15] B. Waters, D. Balfanz, G. Durfee and D. Smetters. Building an Encrypted and Searchable Audit Log. Proceedings of the Network and Distributed System Security Symposium, NDSS 2004, ISBN 1-891562-18-5, 2004.
[16] Searching in encrypted data, Jeroen Doumen, http://www.exp-math.uniessen.de/zahlentheorie/gkkrypto/kolloquien/abstract_20040722_1.pdf, WOSIS 2004.
[17] D. Song, D. Wagner, and A. Perrig. Practical techniques for searches on encrypted data. In IEEE Symposium on Security and Privacy, 2000.
[18] Efficient Tree Search in Encrypted Data, R. Brinkman, L. Feng, J. Doumen, P.H. Hartel and W. Jonker, (2004), http://www.ub.utwente.nl/webdocs/http://www.ub.utwente.nl/webdocs/ctit/1/000000f3.pdf.
[19] D. Boneh, G. Di Crescenzo, R. Ostrovsky, and G. Persiano. Public key encryption with keyword search. In proceedings of Eurocrypt 2004, LNCS 3027, 2004.
[20] A Survey of Public-Key Cryptosystems, Neal Koblitz, Alfred J. Menezes, http://www.math.uwaterloo.ca/~ajmeneze/publications/publickey.pdf, 2004.
[21] P. Golle, B. Waters, J. Staddon, Secure conjunctive keyword search over encrypted data. In proceedings of the Second International Conference on Applied Cryptography and Network Security (ACNS-2004), June 8-11, 2004.
[22] Hash functions: Theory, attacks, and applications, Ilya Mironov, research.microsoft.com/users/mironov/papers/hash_survey.pdf, 2005.
[23] Collisions for Hash Functions, MD4, MD5, HAVAL-128 and RIPEMD, Xiaoyun Wang, Dengguo Feng, Xuejia Lai, Hongbo Yu, http://eprint.iacr.org/2004/199.pdf, 2004.
[24] William Stallings, Cryptography and Network Security: Principles and Practice, 3/E, Publisher: Prentice Hall, Copyright: 2000.

Maisa Halloush received the B.Sc. and M.Sc. degrees in computer science in 2002 and 2007, respectively. Her master's thesis was about information security, and she continues to do research on the same topic. She currently works as an IT instructor at Al Quds College. She is also involved in writing books related to E-Business, Software Engineering, and Operating Systems.

Mai Sharif received the B.Sc. and M.Sc. degrees in computer science in 1997 and 2005, respectively. She has more than 10 years of experience in business analysis and software development. She has two published papers. Currently her areas of interest include information security, ripple effect in software modules, and e-learning.


Comprehensive Security Framework for Global Threats Analysis
Jacques SARAYDARYAN, Fatiha BENALI and Stéphane UBEDA

1
Exaprotect R&D
Villeurbanne, 69100, France
jsaraydaryan@exaprotect.com
2
INSA Lyon
Villeurbanne, 69100, France
Fatiha.benali@insa-lyon.fr
3
INSA Lyon
Villeurbanne, 69100, France
Stephane.ubeda@insa-lyon.fr

Abstract
Cyber criminality activities are changing and becoming more and more professional. With the growth of financial flows through the Internet and the Information System (IS), new kinds of threats arise involving complex scenarios spread within multiple IS components. IS information modeling and Behavioral Analysis are becoming new solutions to normalize the IS information and counter these new threats. This paper presents a framework which details the principal and necessary steps for monitoring an IS. We present the architecture of the framework, i.e. an ontology of activities carried out within an IS to model security information, and User Behavioral Analysis. The results of the experiments performed on real data show that the modeling is effective, reducing the amount of events by 91%. The User Behavioral Analysis on uniformly modeled data is also effective, detecting more than 80% of legitimate actions of attack scenarios.
Key words: Security Information, Heterogeneity, Intrusion Detection, Behavioral Analysis, Ontology.

1. Introduction

Today, information technology and networking resources are dispersed across an organization. Threats are similarly distributed across many organization resources. Therefore, the security of information systems (IS) is becoming an important part of business processes. Companies must deal with open systems on the one hand and ensure a high level of protection on the other hand. As a common task, an administrator starts with the identification of threats related to business assets, and applies a security product on each asset to protect an IS. Then, administrators tend to combine and multiply security products and protection techniques such as firewalls, antivirus, Virtual Private Network (VPN), Intrusion Detection System (IDS) and security audits.

But are the actions carried out on an IS only associated with attackers? Although the real figures are difficult to know, most experts agree that the greatest threat to security comes not only from outside, but also from inside the company. Now, administrators are facing new requirements consisting in tracing the legitimate users. Do we need to trace other users of the IS even if they are legitimate? Monitoring attackers and legitimate users aims at detecting and identifying a malicious use of the IS, stopping attacks in progress and isolating the attacks that may occur, minimizing risks, and preventing future attacks by taking counter measures. To trace legitimate users, some administrators perform audits on applications, operating systems and administrator products. Events triggered by these mechanisms are thus relevant for actions performed by legitimate users on these particular resources.

Monitoring organization resources produces a great amount of security-relevant information. Devices such as firewalls, VPN, IDS, operating systems and switches may generate tens of thousands of events per second. Security administrators are facing the task of analyzing an increasing number of alerts and events. The approaches implemented in security products are different; the analysis performed by security products may not be exact, and it may produce false positives (normal events considered as attacks) and false negatives (malicious events considered as normal). Alerts and events can be of different natures and levels of granularity: logs, Syslog, SNMP traps, security alerts and other reporting mechanisms. This
information is extremely valuable, and the operations that must be carried out on security require a constant analysis of these data to guarantee knowledge of threats in real time. An appropriate treatment for these issues is not trivial and needs a large range of knowledge. Until recently, the combined security status of an organization could not be decided. To compensate for this failure, attention must be given to integrating disparate local security observations into a single view of the composite security state of an organization.

To address this problem, both vendors and researchers have proposed various approaches. Vendors' approaches are referred to as Security Information Management (SIM) or Security Event Management (SEM). They address a company's need to manage alerts, logs and events, and any other elementary security information coming from company resources such as networking devices of all sorts, diverse security products (such as firewalls, IDS and antivirus), operating systems, applications and databases. The purpose is to create a good position for observation from which an enterprise can manage threats, exposure, risk, and vulnerabilities. The industry's approaches focus on information technology events in addition to security events. They can trace an IS user, whether the user is an attacker or a legitimate user. The intrusion detection research community has developed a number of different approaches to make security products interact. They focus on the correlation aspect in the analysis step of data, but they do not provide insights into the properties of the data being analyzed.

The question asked in this article is what is missing in today's distributed intrusion detection. It is not clear how the different parts that compose a vendor product should be designed. Vendors' approaches do not give information on how data are modeled and analyzed. Moreover, vendors claim that they can detect attacks, but how can they do so if the information is heterogeneous? How can they rebuild IS misuse scenarios? Likewise, research works lack details on the different components which make the correlation process effective. They were developed in particular environments. They rarely address the nature of the data to be analyzed, and they do not give a global vision of the security state of an IS because some steps are missing to build the IS scenarios of use. Neither approach indicates how it should be implemented and evaluated. Therefore, a coherent architecture and explanation of a framework which manages a company's security effectively is needed.

The framework must collect and normalize data across a company structure, then cleverly analyze data in order to give administrators a global view of the security status within the company. It can send alerts to administrators so that actions can be taken, or it can automate responses so that risks can be addressed and remediated quickly, by taking actions such as shutting down the account of a legitimate user who misuses the IS, or ports on firewalls.

The distributed architecture concept, DIDS (Distributive Intrusion Detection System), first appeared in 1989 (Haystack Lab). This first analysis of distributed information did not present a particular architecture but collected the information of several audit files on IS hosts. The recent global IS monitoring brings new challenges in the collection and analysis of distributed data. Recent distributed architectures are mostly based on Agents. These types of architectures are mainly used in research projects and commercial solutions (Arcsight, Netforensic, Intellitactics, LogLogic). An agent is an autonomous application with predefined goals [31]. These goals are various: monitor an environment, deploy counter-measures, pre-analyze information, etc. The autonomy and goal of an agent depend on the architecture used. Two types of architecture can be highlighted: distributive centralized architecture and distributive collaborative architecture. Zheng Zhang et al. [1] provided a hierarchical centralized architecture for network attack detection. The authors recommend a three-layer architecture which collects and analyzes information from IS components and from other layers. This architecture provides multiple levels of analysis for network attack detection: a local attack detection provided by the first layer and a global attack detection provided by upper layers. A similar architecture was provided by [39] for the network activity graph construction revealing local and global causal structures of the network activity. K. Boudaoud [4] provides a hierarchical collaborative architecture. Two main layers are used. The first one is composed of agents which analyze local components to discover intrusions based on their analysis of their own knowledge but also with the knowledge of other agents. The upper layer collects information from the first layer and tries to detect global attacks. In order to detect intrusions, each agent holds attack signatures (simple patterns for the first layer, attack graphs for the second layer). Helmer et al. [13] provide a different point of view by using mobile agents. A lightweight agent has the ability to "travel" on different data sources. Each mobile agent uses a specific schema of analysis (Login Failed, System Call, TCP connection) and can communicate with other agents to refine their analyses.

Despite many discussions, scalability, analysis availability and collaborative architecture are difficult to apply in today's infrastructure, and are also time and effort consuming.


Fig. 1 Global Anomaly Intrusion Detection Architecture
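The architecture of Fig. 1, in which distributed agents each apply a collector function and a homogenization function before forwarding events to a central Analysis Server, can be sketched as follows. This is a minimal illustration only: the `Event` fields, the two log formats, and every class and function name are hypothetical and are not taken from the framework itself.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Event:
    """Common event format shared by all agents (fields are illustrative)."""
    source: str
    action: str
    user: str


def parse_host_line(source: str, raw: str) -> Optional[Event]:
    # Hypothetical host log format: "<user> <action>", e.g. "alice login_failed".
    parts = raw.split()
    if len(parts) != 2:
        return None
    user, action = parts
    return Event(source=source, action=action.lower(), user=user)


def parse_firewall_line(source: str, raw: str) -> Optional[Event]:
    # Hypothetical firewall log format: "action=<ACTION>;user=<user>".
    fields = dict(item.split("=", 1) for item in raw.split(";") if "=" in item)
    if "action" not in fields or "user" not in fields:
        return None
    return Event(source=source, action=fields["action"].lower(), user=fields["user"])


class Agent:
    """An agent pairs a collector function with a homogenization function."""

    def __init__(self, source: str,
                 collect: Callable[[], Iterable[str]],
                 homogenize: Callable[[str, str], Optional[Event]]):
        self.source = source
        self.collect = collect
        self.homogenize = homogenize

    def run(self) -> List[Event]:
        events = []
        for raw in self.collect():
            event = self.homogenize(self.source, raw)
            if event is not None:  # filtering: drop lines that cannot be standardized
                events.append(event)
        return events


class AnalysisServer:
    """Central point that receives homogenized events from every agent."""

    def __init__(self) -> None:
        self.events: List[Event] = []

    def receive(self, events: Iterable[Event]) -> None:
        self.events.extend(events)

    def actions_by_user(self) -> Dict[str, List[str]]:
        view: Dict[str, List[str]] = {}
        for event in self.events:
            view.setdefault(event.user, []).append(event.action)
        return view
```

Because both agents emit the same `Event` type, the server can group actions per user without knowing which product produced each line, which is the point of the homogenization function.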

Thus, despite known drawbacks, distributive centralized architectures will be used in our approach for the analysis of distributive knowledge in the IS.

All IS and User behaviors' actions are distributed inside IS components. In order to collect and analyze this knowledge, we propose an architecture composed of distributed agents allowing distributive data operations. Distributive agents aim at collecting data by making pre-operations and forwarding this information to an Analysis Server. The Analysis Server holds the necessary information to correlate and detect abnormal IS behaviors. This architecture is a hierarchical central architecture. Distributive agents share two main functionalities:
• a collector function aiming at collecting information on monitored components,
• a homogenization function aiming at standardizing and filtering collected information.
As shown in figure 1, three types of agents are used. The constructor-based agent aims at collecting information from a specific IS component (Windows Host, Juniper firewall).
The multi-collector based agent aims at collecting information from several IS components redirecting their flow of logs (syslog). Then, the multi-service based agent aims at collecting several different kinds of information (system log, Web server application log) from a single IS component.

This paper presents a comprehensive framework to manage information security intelligently so that the processes implemented in the analysis module are effective. We focus our study on the information modeling function, the information volume reduction and the Abnormal User Behavior detection. A large amount of data triggered in a business context is then analyzed by the framework. The results show that the effectiveness of the analysis process is highly dependent on the data modeling, and that unknown attack scenarios could be efficiently detected without hard pre-descriptive information. Our decision module also allows reducing false positives.

The remainder of this paper is structured as follows. In the next section, related work on security event modeling and behavioral analysis is covered. In the third section, the proposed modeling for event security in the context of IS global vision is presented. Section 4 details the anomaly detection module. The validation of the homogenization function and the anomaly detection module is performed on real data and presented in Section 5. Finally, the conclusions and perspectives of our work are mentioned in the last section.

2. Related Work

As mentioned in the introduction, security monitoring of an IS is strongly related to the information generated in products' log files and to the analysis carried out on this information. In this section, we address the state of the art of both event modeling and Behavioral Analysis.

2.1 Event Modeling

All the research works performed on information security modeling direct our attention to describing attacks. There is a lack of description of information security in the context of a global vision of the IS security introduced in the previous section. As events are generated in our framework by different products, events can be represented in different formats with a different vocabulary. Information modeling aims to represent each product event in a common format. The common format requires a common specification of the semantics and the
syntax of the events. There is a high number of alert classifications proposed for use in intrusion detection research. Four approaches were used to describe attacks: lists of terms, taxonomies, ontologies and attack languages. The easiest classification proposes a list of single terms [7, 18], covering various aspects of attacks. The number of terms differs from one author to another. Other authors have created categories regrouping many terms under a common definition. Cheswick and Bellovin classify attacks into seven categories [5]. Stallings' classification [38] is based on the action. The model focuses on transiting data and defines four categories of attacks: interruption, interception, modification and fabrication. Cohen [6] groups attacks into categories that describe the result of an attack. Other authors developed categories based on empirical data. Each author uses an events corpus generated in a specific environment. Neumann and Parker's works [25] were based on a corpus of 3000 incidents collected over 20 years; they created nine classes according to attacking techniques. Terms tend not to be mutually exclusive; this type of classification cannot provide a classification scheme that avoids ambiguity.

To avoid these drawbacks, a lot of taxonomies were developed to describe attacks. Neumann [24] extended the classification in [25] by adding the exploited vulnerabilities and the impact of the attack. Lindqvist and Jonson [21] presented a classification based on the Neumann classification [25]. They proposed intrusion results and intrusion techniques as dimensions for classification. John Howard [16] presented a taxonomy of computer and network attacks. The taxonomy consists of five dimensions: attackers, tools, access, results and objectives. The author worked on the incidents of the Computer Emergency Response Team (CERT); the taxonomy is process-driven. Howard extends his work by refining some of the dimensions [15]. Representing attacks by taxonomies is an improvement compared with the list of terms: individual attacks are described with enriched semantics, but taxonomies fail to meet mutual exclusion requirements, as some of the categories may overlap. The ambiguity problem therefore still exists with the refined taxonomy.

Undercoffer et al. [3] describe attacks by an ontology. The authors have proposed a new way of sharing the knowledge about intrusions in a distributed IDS environment. Initially, they developed a taxonomy defined by the target, means, consequences of an attack and the attacker. The taxonomy was extended to an ontology by defining the various classes, their attributes and their relations based on an examination of 4000 alerts. The authors have built correlation decisions based on the knowledge that exists in the modeling. The developed ontology represents the data model for the information triggered by IDSs.

Attack languages are proposed by several authors to detect intrusions. These languages are used to describe the presence of attacks in a suitable format. These languages are classified in six distinct categories presented in [12]: exploit languages, event languages, detection languages, correlation languages, reporting languages and response languages. The correlation languages are currently the interest of several researchers in the intrusion detection community. They specify relations between attacks to identify numerous attacks against the system. These languages have different characteristics but are suitable for intrusion detection in particular environments. Language models are based on the models that are used for describing alert or event semantics. They do not model the semantics of events but they implicitly use taxonomies of attacks in their modeling.

All the research quoted above only gives a partial vision of the monitored system; it was focused on the conceptualization of attacks or incidents, which is due to the consideration of a single type of monitoring product, the IDS. It is important to mention the efforts made to realize a data model for information security. The first attempts were undertaken by the American agency Defense Advanced Research Projects Agency (DARPA), which created the Common Intrusion Detection Framework (CIDF) [32]. The objective of the CIDF is to develop protocols and applications so that intrusion detection research projects can share information. Work on CIDF was stopped in 1999 and this format was not implemented by any product. Some ideas introduced in the CIDF have encouraged the creation of a work group called Intrusion Detection Working Group (IDWG) at the Internet Engineering Task Force (IETF). The IETF has proposed the Intrusion Detection Message Exchange Format (IDMEF) [8] as a way to set a standard representation for intrusion alerts. IDMEF became a standard format with RFC 4765 (http://www.rfc-editor.org/rfc/rfc4765.txt). The effort of the IDMEF is centered on alert syntax representation. In the implementations of IDSs, each IDS chooses the name of the attack, so different IDSs can give different names to the same attack. As a result, similar information can be tagged differently and handled as two different alerts.

Modeling information security is a necessary and important task. Information security is the input data for all the analysis processes, e.g. the correlation process. All
the analysis processes require automatic processing of order to discover causes of IS disaster (Forensic Analysis).
information. Considering the number of alerts or events The main purpose of this approach is to build casual
generated in a monitored system, the process, which relationships between IS components to discover the
manages this information, must be able to think on these origin of an observed effect. The lack of anomaly
data. We need an information security modeling based on detection System can be explained by the fact that
abstraction of deployed products and mechanisms, which working on the Global vision introduces three main
helps the classification process, avoids ambiguity to limitations. First of all, the volume of computed data can
classify an event, and reflects the reality. Authors in reach thousands of events per second. Secondly, collected
[16,21] agree that the proposed classification for intrusion information is heterogeneous due to the fact that each IS
detection must have the following characteristics: component holds its own events description. Finally, the
accepted, unambiguous, understandable, determinist, complexity of attacks scenarios and IS dependencies
mutually exclusive, exhaustive. To ensure the presence of increases very quickly with the volume of data.
all these characteristics, it is necessary to use an ontology
to describe the semantics of security information.
2.2 Behavioral Analysis

Even if Host Intrusion Detection System (HIDS) and Network Intrusion Detection System (NIDS) tools are known to be efficient for local vision by detecting or blocking unusual and forbidden activities, they cannot detect new attack scenarios involving several network components. Focusing on this issue, industrial and research communities show a great interest in Global Information System Monitoring.

Recent literature in the intrusion detection field [30] aims at discovering and modeling global attack scenarios and Information System dependencies (relationships between IS components). In fact, recent approaches deal with Global Information System Monitoring, like [22], which describes a hierarchical attack scenario representation. The authors provide an evaluation of the most credible attacker’s step inside a multistage attack scenario. [28] also computes attack scenario graphs through the association of vulnerabilities on IS components and determines a "distance" between correlated events and these attack graphs. In the same way, [26] used a semi-explicit correlation method to automatically build attack scenarios. With a pre-processing stage, the authors model pre-conditions and post-conditions for each event. The association of the pre- and post-conditions of each event leads to the construction of graphs representing attack scenarios. Other approaches automatically discover an attack scenario with model checking methods, which involve a full description of IS component interactions and configuration [36].

However, classical intrusion detection schemes are composed of two types of detection: signature-based and anomaly-based detection. Anomaly detection is not well developed with regard to Global IS Monitoring. Few approaches intend to model normal system behavior. Authors in [11] model IS components’ interactions in

3. Event Modeling

As we previously stated, managing information security has to deal with the many differences existing among the monitoring products. To achieve this goal, it is necessary to transform raw messages into a uniform representation. Indeed, all the events and alerts must be based on the same semantic description and be transformed into the same data model. To have a uniform representation of semantics, we focus on the concepts handled by the products and use them to describe the semantics of messages. In this way, we are able to set aside product types, functions, and product languages. The abstraction concept was already evoked in the intrusion detection field by Ning et al. [27]. The authors consider that abstraction is important for two primary reasons. First, the systems to be protected, as well as the IDSs, are heterogeneous. In particular, a distributed system is often composed of various types of heterogeneous components. Abstraction thus becomes a necessary means to hide the differences between these component systems and to allow the detection of intrusions in distributed systems. Secondly, abstraction is often used to remove all the non-relevant details, so that an IDS can avoid useless complexity and concentrate on the essential information.

The description of the information generated by a deployed solution is strongly related to the action perceived by the system. This action can be observed at any time of its life cycle: its launching, its interruption or its end. An event can inform that an action has just started, is in progress, has failed or has finished. To simplify, we model information semantics via the concept of observed action. We thus obtain a model that fits any type of observation and meets the abstraction criteria.

3.1 Action Theory

In order to model the observed action, we refer to the work that has already been done on Action Theory in the field of philosophy. According to the traditional model of action explained by the authors in [9,19], an action is an Intention directed to an Object and uses a Movement. It

IJCSI
IJCSI International Journal of Computer Science Issues, Vol. 2, 2009 23

is generally conceded that intentions have a motivational role for the action. The authors underline that this role is not limited to starting the action but extends to supporting it until its completion. Our actions use movements, and the explanation of an action remains incomplete if we do not take the movements into account. Movement is the means to achieve the action. The object is the target towards which the action is directed. In summary, human actions conform to a certain logic. To say that an agent carries out an action A is to say that the agent had an Intention I and made a Movement M to produce an effect on a Target T of action A. The basic model of action is thus composed of the following concepts: intention, movement and target.

3.2 Event Semantics

We observe the action performed in the monitored system and see that this action is dissociated from the human mind. We add another concept, i.e. the Effect, to the basic model (Intention, Movement and Target) of Action Theory. This modeling is general: it can be adapted to any monitoring context, such as the monitoring of IS intruders or the monitoring of physical intruders in a bank. All we have to do is instantiate the meta-model with the vocabulary of the intrusion detection context.

We have outlined an adaptation of this meta-model to our context of monitoring the IS against threats. The concepts are redefined as follows:
• Intention: the objective for which the user carries out his action,
• Movement: the means used to carry out the objective of the user,
• Target: the resource in the IS to which the action is directed,
• Gain: the effect produced by the action on the system, i.e. whether or not the user succeeds in his attempt to carry out an action.
Security information is an Intention directed towards a Target which uses a Movement to reach the target, and produces a Gain. Figure 2 illustrates the ontology concepts, the semantic relations between the concepts, and the general semantics of a security event.

Fig. 2 Security message semantics

In order to identify the intentions of an IS user, we have studied the attacker strategy [17,26]. In fact, once the attacker is in the IS, he can perform both attacker and legitimate user actions. Analyzing the attacker strategy provides an opportunity to reconstruct the various steps of an attack scenario or an IS utilization scenario, and to perform pro-active actions to stop IS misuse. According to the attacker strategy, we have identified four intentions:
• Recon: intention of collecting information on a target,
• Authentication: intention to access the IS via an authentication system,
• Authorization: intention to access a resource of an IS,
• System: intention of modifying the availability of IS resources.
Intentions are carried out through movements. We have categorized movements into seven modes:
• Activity: all the movements related to activities which do not change the configuration of the IS,
• Config: all the movements which change configuration,
• Attack: all the movements related to attacks,
• Malware: all the movements related to malware. Malware are malicious software programs designed specifically to damage or disrupt a system,
• Suspicious: all the movements related to the suspicious activities detected by the products. In some cases, a product generates an event to inform that there was a suspicious activity on a resource; this information is reported by our model as it is,
• Vulnerability: all the movements related to vulnerabilities,
• Information: the probes can produce messages which do not reflect the presence of an action; they reflect a state observed on the IS.
Under each one of these main modes, we have identified the natures of movements. A movement is defined by a mode of movement (such as Activity, Config, Information, Attack, Suspicious, Vulnerability or Malware) and a nature of movement (such as Login, Read, Execute, etc.): the mode and the nature of the action define the movement. As the model must be adapted to the context of a global vision of IS security monitoring, we have defined movements of presumed normal actions as well as movements of presumed dangerous actions. An intention can have several movements: for example, an access to the IS can be performed by the opening of a user’s session, by an account’s configuration or by a bruteforce attack.

Each IS resource can be a target of an IS user activity. Targets are defined according to intentions.
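
As an illustration, the vocabulary listed so far can be written down as plain data types. The sketch below is only an assumption about one possible encoding, not the authors' implementation: the class names Intention, MovementMode, Movement and ObservedAction are hypothetical, as are the nature names AccountSetup and Bruteforce. Only the four intentions, the seven modes and the Login nature come from the text.

```python
from dataclasses import dataclass
from enum import Enum


class Intention(Enum):
    """The four intentions identified from the attacker strategy."""
    RECON = "Recon"
    AUTHENTICATION = "Authentication"
    AUTHORIZATION = "Authorization"
    SYSTEM = "System"


class MovementMode(Enum):
    """The seven modes of movement."""
    ACTIVITY = "Activity"
    CONFIG = "Config"
    ATTACK = "Attack"
    MALWARE = "Malware"
    SUSPICIOUS = "Suspicious"
    VULNERABILITY = "Vulnerability"
    INFORMATION = "Information"


@dataclass(frozen=True)
class Movement:
    """A movement is defined by its mode and its nature."""
    mode: MovementMode
    nature: str  # e.g. "Login", "Read", "Execute"


@dataclass(frozen=True)
class ObservedAction:
    """Hypothetical container pairing an intention with the movement used."""
    intention: Intention
    movement: Movement


# One intention (access to the IS) carried out through three different movements.
# "AccountSetup" and "Bruteforce" are hypothetical nature names.
accesses = [
    ObservedAction(Intention.AUTHENTICATION, Movement(MovementMode.ACTIVITY, "Login")),
    ObservedAction(Intention.AUTHENTICATION, Movement(MovementMode.CONFIG, "AccountSetup")),
    ObservedAction(Intention.AUTHENTICATION, Movement(MovementMode.ATTACK, "Bruteforce")),
]

modes_for_authentication = {a.movement.mode.value for a in accesses}
```

The three entries share the Authentication intention but differ in mode, which mirrors the remark that a single intention can be reached through presumed normal as well as presumed dangerous movements.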


• In the case of the Recon intention, an activity of information collection is carried out on a host. The target represents the host on which the activity was detected.
• In the case of the Authentication intention, an activity of access to an IS is always carried out under the control of an authentication service. The target is represented by a pair (target1, target2), where target1 and target2 refer respectively to the system that performs the authentication process and the object, or an attribute of the object, that authenticates on the authentication service.
• In the case of the Authorization intention, an access to an IS resource is always carried out under the control of a service which allows access to a resource based on access criteria. The target is represented by a pair (target1, target2), where target1 and target2 refer respectively to the resource which filters the accesses (which manages the rights of the users or groups) and the resource on which rights were used to reach it.
• In the case of the System intention, an activity depends on the availability of the system. The target is represented by a pair (target1, target2), where target1 and target2 refer respectively to a resource and a property of the resource, for example (Host, CPU).

These constraints on the targets enable us to fix the level of detail to be respected in the modeling. We have defined the Gain in the IS according to the Movement mode.

• In the case of the Activity and Config movement modes, Gain takes the values Success, Failed, Denied or Error.
• In the case of the Malware, Vulnerability, Suspicious and Attack movement modes, Gain takes the value Detected.
• In the case of the Information movement mode, the event focuses on information about the system state. Gain is either related to information on control and takes the values Valid, Invalid or Notify, or related to information on thresholds and takes the values Expired, Exceeded, Low or Normal.
The result is an ontology described by four concepts (Intention, Movement, Target, Gain), three semantic relations between the concepts (Produce, Directed to, Use), a vocabulary adapted to the context of IS monitoring against security violations, and rules explaining how to avoid ambiguity. For example, for an administrator who succeeded in opening a session on a firewall, the ontology describes this action by the 4-tuple: Authentication (refers to the intention of the user), Activity Login (refers to the movement carried out by the user), Firewall Admin (refers to the target of the action carried out) and Success (refers to the result of the action). The category of the message to which this action belongs is: Authentication_Activity.Login_Firewall.Admin_Successes.

We have identified 4 Intentions, 7 modes of Movement, 52 natures of Movement, 70 Targets and 13 Gains.

3.3 Event Data Model

It seems reasonable that the data model for security information be based on standards. We mentioned in Section 2.1 that the IDMEF format is becoming a standard. We use this data model as the data model for events generated in a product interoperability context.

This format is composed of two subclasses, Alert and Heartbeat. When the analyzer of an IDS detects an event for which it has been configured, it sends a message to inform its manager. The Alert class is composed of nine classes: Analyzer, CreateTime, DetectTime, AnalyzerTime, Source, Target, Classification, Assessment, and AdditionalData.

The IDMEF Classification class is considered as a way to describe the alert semantics. The ontology developed in this framework describes all the categories of activities that can be undertaken in an IS. We define the Classification class of the IDMEF data model by a category of the ontology that reflects the semantics of the triggered raw event. Figure 3 illustrates the IDMEF format with the modified Classification class. Finally, with the proposed ontology and the IDMEF data model adapted to information security in the context of a global IS view, information is homogeneous. Indeed, all processes that can be implemented in the analysis server can be undertaken, including the behavioral analysis.

4. Behavioral Analysis

An Anomaly Detection System differs from a signature-based Intrusion Detection System by modeling a normal reference instead of detecting well-known patterns. Two periods are distinguished in Anomaly Detection: a first period, called the training period, builds and calibrates the normal reference. The detection of deviant events is performed during a second period called exploitation. We propose an Anomaly Detection System composed of four main blocks, as shown in Figure 4. The Event Collection and Modeling block aims at collecting normalized information from the different agents. The Event Selection block filters only relevant information (see section 4.3). The Anomaly Detection block models users’ behaviors through an activity graph and a Bayesian Network (see section 4.1.1) during a training period and detects anomalies (see section 4.2) in the exploitation period. Then all behavioral anomalies are evaluated by the Anomaly Evaluation block


which distinguishes normal reference updates from dangerous behavioral anomalies (see section 4.4). Each block will be explained in the following sections.

4.1 Model Method Selection

Modeling user behavior is a well-known topic in NIDS or HIDS regarding local component monitoring. The proliferation of security and management equipment units over the IS brings a new class of anomaly detection. Following user behaviors over the IS implies new goals that anomaly detection needs to cover.

First of all, we need to identify the owner of each event occurring on the IS. Then our model should be able to hold attributes identifying this owner. The fact that all users travel inside the IS implies that the user activity model should model sequences of events representing each user’s actions on all IS components. Moreover, user behavior can be assimilated to a dynamic system. Modeling user activities should emphasize periodical phenomena and isolate sporadic ones. Then, user behaviors hold major information on the usage of the system and can highlight the users’ behaviors compliance

Several modeling methods are used for normal reference modeling in Anomaly Detection (e.g. classification, Neural Network, States Automaton, etc). Nevertheless, three of them deal with the new goals: Hidden Markov Model, stochastic Petri Network and Bayesian Network. In the Hidden Markov Model and stochastic Petri Network methods, each node of a sequence identifies one unique system or event state. Modeling the events of each user on each IS component would lead to the construction of a huge event graph. All these techniques can model probabilistic sequences of events, but only the Bayesian Network provides a human-understandable model.

Bayesian Networks (BN) are well adapted to modeling users’ activities with regard to the Global Monitoring goals and provide a suitable model support. A BN is a probabilistic graphical model that represents a set of variables and their conditional probabilities. BNs are built around an oriented acyclic graph which represents the structure of the network. This graph describes causal relationships between the variables. By instantiating a variable, each conditional probability is computed using the inference mechanism, and the BN gives us the probabilities of all variables regarding this setting. By associating each node
to a variable and each state of a node to a specific value, the BN graph condenses knowledge into a human-readable graph. Furthermore, BNs are useful for learning probabilities from a pre-computed data set and are well suited to deviance detection. BN inference allows injecting variable values into the BN graph and determining all conditional probabilities related to the injected evidence.
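
To make the inference step concrete, here is a minimal sketch of exact inference by enumeration on a tiny Bayesian Network. Everything in it is an assumption made for illustration: the three binary event variables (Login, Read, Execute), the chain structure and all probability values are invented, and enumeration is used only because it is the simplest correct method, not because the engine described here relies on it.

```python
from itertools import product

# Hypothetical structure: Login -> Read -> Execute (all variables are binary).
parents = {"Login": [], "Read": ["Login"], "Execute": ["Read"]}

# cpt[var][parent_values] = P(var = True | parents). The numbers are made up.
cpt = {
    "Login": {(): 0.6},
    "Read": {(True,): 0.7, (False,): 0.1},
    "Execute": {(True,): 0.5, (False,): 0.05},
}
order = ["Login", "Read", "Execute"]


def joint(assignment):
    """Probability of one full assignment, as a product of local CPT entries."""
    p = 1.0
    for var in order:
        key = tuple(assignment[par] for par in parents[var])
        p_true = cpt[var][key]
        p *= p_true if assignment[var] else 1.0 - p_true
    return p


def posterior(query, evidence):
    """P(query = True | evidence), obtained by summing the joint distribution."""
    num = den = 0.0
    for values in product([True, False], repeat=len(order)):
        assignment = dict(zip(order, values))
        if any(assignment[k] != v for k, v in evidence.items()):
            continue  # skip assignments that contradict the injected evidence
        p = joint(assignment)
        den += p
        if assignment[query]:
            num += p
    return num / den


p_read = posterior("Read", {"Login": True})
p_login = posterior("Login", {"Execute": True})
```

Here p_read is the probability of a Read event once a Login is instantiated, and p_login is the probability that a Login occurred given that an Execute was observed.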

To achieve a user activity Bayesian Network model, we need to create a Bayesian Network structure. This structure refers to a graph of events where each node represents an event and each arc a causal relationship between two events. Some approaches use learning methods (K2 algorithm [11]) to reveal a Bayesian structure in a dataset. In the context of global event monitoring, many parameters are involved, and without some a priori knowledge, self-learning methods would extract inconsistent relationships between events.
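
Since a Bayesian Network structure must be an oriented acyclic graph, a candidate event graph can be checked before any probability is learned. The following sketch assumes Python 3.9 or later (for the standard graphlib module); the function name is_valid_bn_structure and the event names are hypothetical and serve only as an illustration.

```python
from graphlib import CycleError, TopologicalSorter


def is_valid_bn_structure(arcs):
    """Return True if the directed graph {node: [successors]} has no cycle."""
    # TopologicalSorter expects a mapping from each node to its predecessors.
    predecessors = {node: set() for node in arcs}
    for node, successors in arcs.items():
        for succ in successors:
            predecessors.setdefault(succ, set()).add(node)
    try:
        list(TopologicalSorter(predecessors).static_order())
    except CycleError:
        return False
    return True


# Hypothetical event graphs: each arc is a presumed causal relationship.
session = {"Login": ["Read"], "Read": ["Execute"], "Execute": ["Logout"], "Logout": []}
recurring = {"Login": ["Read"], "Read": ["Logout"], "Logout": ["Login"]}

session_ok = is_valid_bn_structure(session)
recurring_ok = is_valid_bn_structure(recurring)
```

The second graph fails the check because the sequence loops back to Login, so it could not be used directly as a Bayesian Network structure.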

Fig. 3 The IDMEF data model with the class Classification represented by the category of the ontology that describes the

with the security policy. This property should offer a security analyst the ability to make the model fit the IS policies or plan future system evolutions. To achieve that, user activity modeling should be human-readable.

Following our previous work [34] on user behavior analysis, we specify three event attributes involved in the identification of causal relationships: user login, source IP address and destination IP address. Here, we emphasize the fact that legitimate users perform two types of actions: local actions and remote actions. First, we focus our attention on the event attributes identifying local user actions. The pair of attributes ‘Source IP address’ and ‘user name’ is usually used to identify users. These two attributes allow tracking user activities in a same location (e.g. workstation, application server). To monitor remote user


actions only, the ‘Destination IP address’ and ‘source IP address’ attributes can be used. Then, to monitor a physical user location move, only the ‘user login name’ can be used to follow the user.
These three attributes, i.e. user login, source IP address and destination IP address, are combined to determine correlation rules between two collected events as defined in our work [33]. Two events are connected together if:
• the events share the same source IP address and/or user name,
• the target IP address of the first event is equal to the source IP address of the second.

The event correlation process is carried out during a training period. The resulting correlated events build an oriented graph, called the user activity graph, where each node represents an event and each arc a causal relationship according to the correlation definition. Nevertheless, the user activity graph is too large and concentrated to be efficient for a Bayesian structure. We propose a merging function that gathers events according to their normalization. Based on the edge contraction definition [10], we compute a new graph, called the merged graph, where each node represents a cluster of events sharing the same meaning (same semantics) and holding a list of event attributes which identifies the owner of each event in this cluster. The resulting graph shows causal relationships between each user’s event classes as described in section 5.1. This merged graph is used as the basis of our Bayesian structure.

Classical Bayesian Networks are built on acyclic oriented graphs, which is not the case here because of the periodical sequences of events in user activities. Although some methods exist to allow Bayesian Networks to work on cyclic graphs [23], most Bayesian libraries do not support cycles. To be compliant with this technical constraint, we follow the recommendation defined in our work [33] by creating a "loop node" expressing recurrent event points in a sequence.

After the Bayesian Network structure creation, the training data set is used to compute conditional probabilities inside the Bayesian Network. We use the simple and fast counting-learning algorithm [37] to compute conditional probabilities, considering each event as an experience.

The time duration between collected events can also be modeled by adding information in the Bayesian Network. The relation between clusters of events can be characterized by a temporal node holding the time duration probabilities between events. This extension is defined in detail in [35].

Fig. 4 Global Anomaly Intrusion Detection Architecture

4.2 Anomaly Detection

The anomaly detection is performed during the exploitation period. This period intends to highlight abnormal behaviors between the received data set and the trained user activity model.

First of all, we compute a small user activity graph as defined in section 4.1.1 for a certain period of time represented by a temporal sliding window on incoming events. This graph reflects all the user activity interactions for the sliding window time period. This detection graph is then compared with our normal user model (BN), which gives rise to two types of deviance detection: graph structure deviance detection and probabilistic deviance detection. To check the structure compliance between both graphs, we first control the


correctness of each graph’s features, and then we check whether the detected event classes belong to our reference, whether the relationships between event classes are valid and whether the detected event attributes are valid. Finally, each step of each sequence of events inside the detection graph is injected into the Bayesian Network. For each step, our model evaluates the probability of receiving a specific event knowing its preceding events; if this probability is below a threshold, the event is considered deviant. When a deviance is detected, an alert is sent to the security analyst.

4.3 Event Selection

To prevent graph overloading, i.e. expensive memory and CPU consumption, we simplify our model to provide only the relevant information about user behaviors [34].

Indeed, with a huge volume of distinct events coming from the monitored IS, the complexity of user activities increases greatly. On a large IS, many interactions can appear between user actions. In the worst case, the resulting graph can be a heavy graph. So, to specify the relevance of collected events, we studied the interactions between legitimate users’ behaviors and the attacker’s strategy [34].

We define a new way to reduce the complexity of the model representing user or system activities. We introduce the notion of necessary transit actions for an attacker to achieve these objectives: these actions are called Checkpoints. Checkpoints are based on different classes of attacker scenarios: User to Root, Remote to Local, Denial of Service, Monitoring/Probe and System Access/Alter Data. We enrich this attack classification with classes of malicious actions (Denial of Service, Virus, Trojan, Overflow, etc). For each scenario, we provide a list of Checkpoints which determine all the necessary legitimate activities needed by an attacker to reach such effects.
For instance, to achieve a User to Root effect, an attacker chooses between six different variants of scenarios (Gain, Injection, Overflow, Bypass, Trojan, Virus). A checkpoint related to an Injection¹ is, for example, a command launch. We analyzed all the checkpoints of all the possible actions leading to one of these five effects.
We propose a selection of thirteen checkpoints representing different types of events involved in at least one of the five effects. These checkpoints reflect the basis of the information needed to detect an attacker’s activities. We also provide a description of the context to determine if all checkpoints need to be monitored with regard to the nature and the location of a component.

¹ An injection consists in launching an operation through a started session or service.

We extract the core information needed to detect misuses or attack scenarios. Thus, we do not focus our work on all the data involved in misuse or variants of attack scenarios but only on one piece of data reflecting the actions shared by the user’s and the attacker’s behavior. We also study the selection of a couple of sequences of actions following an identical consideration.

Both checkpoint and sequence selections provide a significant reduction in model complexity. Indeed, we achieve a reduction of 24% of nodes and 60% of links. This selection slightly reduces the detection rate of unusual system use and reduces false positives by 10%.

4.4 Anomaly Evaluation

The weakness of classical anomaly detection systems is mainly due to a high false positive rate and a poor description of deviant events. The majority of these false positives come from the evolution of normal system behavior. Without a dynamic learning process, anomaly models become outdated and produce false alerts. The update mechanism has been emphasized by some previous works [14,29], which point out two main difficulties.
The first difficulty is the choice of the time interval between two update procedures [14]. On one hand, if the interval is too large, some Information System evolutions may not be caught by the behavior detection engine. On the other hand, if it is too small, the system learns rapidly but loses gradual behavior changes. We do not focus our work especially on this issue, but we assume that, when modeling users’ activity behaviors, a daily model update is a good compromise between the global evolution of user behavior and the time consumed by such updates.
The second difficulty is the selection of appropriate events to update the reference in terms of event natures. Differentiating normal behavior evolution from suspicious deviation is impossible without additional information and context definition. To take efficient decisions, we need to characterize each event through specific criteria. These criteria should identify the objective of the end user behind the deviating events. We focus our work on this second issue and follow the approach in [40] that analyzes end users’ security behavior.

Our evaluation process evaluates each deviating event along three dimensions: the intention behind the event, the technical


expertise needed to achieve the event and the criticality of the components targeted by the event. Each dimension characterizes a property of the deviating event.

All received events are associated with one of the three types of movements introduced in section 5.1: the intention of configuration, which defines beneficial moves trying to improve the system; the intention of activity, which represents usual or neutral activity on the IS; and the intention of attack, which refers to all the malicious activities in the system. The degree of deviation of an event informs us how far the event is from the normal use of the system. We assume that the farther an event is from normal behavior, the more this deviating event holds malicious intention. Finally, other past events linked by a causal relationship with the deviating one also carry the malicious intention. The expertise dimension defines the technical expertise needed by a user to realize an event. This expertise is computed from the type of action realized by the event (action of configuration or action of activity), the type of the targeted component (a router needs more technical expertise than a workstation) and the owner of the event (classical user or administrator).

Finally, the event’s impact on the IS will depend on the targeted component. Thus, we also evaluate a deviating event by the criticality of the targeted component. This criticality is evaluated by combining the vulnerabilities held by the targeted component, the location of the targeted component (e.g. LAN, public demilitarized zone, etc) and its business importance (e.g. critical authentication servers are more important than workstations regarding the business of the company).

According to all dimension definitions, each deviating point is located in this three-dimension graph. The three-dimension representation increases the analyst’s visibility of deviating events. Nevertheless, some automatic actions could considerably help the analyst to decide if a deviant event is related to a normal system evolution or to intrusive activity. We propose to build a semi-automatic decision module to define which deviating events fit normal system evolutions and which ones reflect attackers’ activities. Our semi-automatic decision module is a supervised machine learning method. We used a training data set composed of deviating events located in our three-dimension graph. Each point of the training data set is identified as a normal system evolution or an attacker’s activity by security analysts. To learn this expert knowledge, we use a Support Vector Machine (SVM) to define frontiers between identified normal points and attack points. The SVM technique is a well-known classification algorithm [20] which tries to find a separator between communities in upper dimensions. SVM also maximizes the generalization process (the ability not to fall into over-training). After the construction of frontiers, each deviating event belonging to one community will be qualified as a normal evolution or an attacker’s activity and will receive a degree of belonging to each community. After that, two thresholds will be computed. They define three areas: the normal system evolution area, the suspicious events area and the attack or intrusive activities area. Each deviating event belonging to the normal evolution area will update our normal Bayesian model. Each deviating event belonging to the intrusive activities area or the suspicious area will create an alarm, which is sent to an analyst.

5. Experimentation

In this section, we put into practice the two proposed modules, event modeling and User Behavioral Analysis, using a large corpus of real data. The event modeling experiment normalizes raw events into the ontology categories that describe their semantics. The user behavioral analysis experiment uses the normalized events generated by the event modeling module to detect abnormal behaviors.

5.1 Event Modeling

To study the effectiveness of the modeling proposed in Section 5.1, we focused our analysis on the exhaustiveness of the ontology (each event is covered by a category) and on the reduction of the number of events to be presented to the security analyst. We performed an experiment on a corpus of 20182 messages collected from 123 different products.

Fig. 5 Type of used product

The main characteristic of this corpus is that the events are collected from heterogeneous products, where the products manipulate different concepts (such as attack detection, virus detection, flaws filtering, etc.). The sources used are security equipment logs, audit system logs, audit application logs and network component logs. Figure 5 illustrates the various probe types used and, in brackets, the number of probes per type. The classification process was performed manually by the experts. The expert reads the message and assigns it to the category which describes its semantics. The expert must extract the intention from the action which generated the message, the movement used to achieve the intention, the


target toward which the intention was directed and the gain related to this intention.
We have obtained, with the manual classification of raw events, categories of various sizes. The distribution of the messages over the categories is represented in Figure 6. Some categories are large: the largest one contains 6627 events, which represents 32.83% of the corpus. This is due to the monitoring of the same activity by many products or to the presence of these signatures in many products. The representation of the events under the same semantics reinforces the process of managing security in a cooperative context and facilitates the task of the analyst (more detail in [2]). In addition, we had singleton categories: 732 raw events each forming their own category, which represent 42.21% of all categories but only 3.63% of the corpus. Event modeling has reduced the number of events by 91.40% (from 20182 to 1734). The presence of singleton categories can be explained by the following points: only one product among the deployed products produces this type of event; a signature is recognized by one product and not recognized by another; errors made by experts generated the creation of new categories which theoretically should not exist; and the presence of rarely monitored targets increases the number of singleton categories, because the movement exists several times, but only once for these rare targets.
We observe that the Suspicious movement category introduced into our ontology is quite necessary to preserve the semantics of a raw event which reflects a suspicion. These types of events will be processed by the User Behavioral Analysis. The ontology does not make it possible to analyze events; its goal is to preserve the semantics of raw events. The proportions of the various categories depend on the deployed products and the activities to be supervised by these products. The conclusion that we can draw from this study is that a good line of defense must supervise all the activities aimed at an IS, and that cooperative detection should not be focused on the number of deployed products but on the activities to be supervised in the IS. This result can bring into question the choice of the defense line for the IS.

Fig. 6 Events Distribution on the Ontology's Categories

5.2 Behavioral Analysis

Current Intrusion Detection Systems operating on Global Information System Monitoring lack large test datasets for checking their efficiency and scalability. In this section, we provide the results of our Anomaly Intrusion Detection System using a real normalized data set. We deployed our architecture on a real network and collected events coming from hundreds of IS components.

DataSet Analysis: Our dataset comes from a large company composed of hundreds of users using multiple services. The dataset has been divided into two datasets: a training data set composed of events collected for 23 days, and a test data set composed of events collected for 2 days after the training period. The training data set aims at training our engine and creating a normal user behavioral model. The test data set has been enriched with attack scenarios in order to test our detection engine. First of all, the test data set is used to test the false alarm rate (false positive rate) of our engine. Then attack scenarios are used to determine our detection rate and, moreover, the false negative rate (rate of undetected attacks) of our engine.
The major part of the collected events consists of web server information and authentication information. We can notice that during the monitored period, some types of events are periodic (like Authentication_Activity.Login.

IJCSI
IJCSI International Journal of Computer Science Issues, Vol. 2, 2009 30

SysAuth.Account.Success) and other ones are sporadic. different times. Nodes (referring to event’s classes)
Moreover, our dataset is composed of more than 70 types become stationary around the step 330 whereas links
of events ranging from Authentication actions to System (relationships between event’s classes) continue to evolve
usage (like service start). until step 360. Only the status (user or process identifier)
Our training dataset, representing the activity of the seems to never reach a stationary point. To understand this
system for 23 days, is composed of 7 500 000 events. The phenomenon, we analyze in depth the evolution of the
test data is composed of 85 000 normal events (two days status of each different nodes. We notice that the status of
of the System’s activity) and 410 events representing three one particular node, Authentication_Activity
different attack scenarios. These scenarios reflected three .Login.SysAuth.Account.Success, blow up. We
types of attack effects on the system as introduced in the investigate and discover that the considered company
DARPA attacks classification (Remote to local, User to owns an e commerce Web server on which each new
Root,...). Some scenario variants are developed for each consumer receives a new login account. That is why when
class. For example, concerning the Remote to Local attack other nodes reach their stationary point around the 390th
scenario, we provide two kinds of variant of scenario as step, Authentication_Activity. Login. SysAuth.Account.
follow: Success node continues to grow. To avoid a complexity
The remote to local variant one is composed of four explosion inside our Bayesian model, we add a constraint
different classes of events: defining a time of unused events indicator. We define a
-Authentication_Activity.Login.SysAuth.Account.Success, threshold to determine which state of node will be kept
-uthentication_Config.Add.SysAuth.Account.Success, and which one will be dropped.
-Authentication_Activity.Login.SSH.Admin.Success,

Fig. 7 Detection Sums

-System_Activity.Stop.Audit.N.Success. The test data set is then processed by our Anomaly


The second one holds the classes below: detection System and our detection’s results are in figure
-Authentication_Activity.Login.SysAuth.Account.Success, 7. This table distinguishes each scenario’s events and the
-Authentication_Config.Modify.SysAuth.Account.Success, detection rate for different probability threshold. These
-Authentication_Activity.Login.SysAuth.Admin.Success, thresholds could be chosen regarding to the organisms or
-System_Activity.Execute.Command.Admin.Success.
company goals. In case of a very sensitive IS, the attack
detection rate needs to be as high as possible. A
Each variant of each scenario is reproduced ten times with
probability threshold of 0.002 achieving a detection rate of
different attributes (login user, IP address Source and
90% with false positive rate around 14% would be
Destination) belonging to the data set. We can notice that
suitable. In case of a more transversal use of our approach,
all events involved in attack scenarios refer to legitimate
companies deal with false positives and detection rate. A
actions. All these events define a set of event among
threshold of 0.0001 provides an attack detection rate of
shared actions between legitimate user behaviors and
79% with a false positive rate below 0.5%. Additional
attacker strategy.
observation can be made regarding our attack detection
Results: The test data set is used to build our user
rate. Most of the time, attack detection rate of detection
activities model. To compute efficiently this model, we
tools reaches 95% but in our context, all our scenarios are
split the training data set into 440 steps.
composed of events which belong to normal behavior. All
Each Bayesian structure feature (nodes, links, states)
these events do not necessary deviate from the normal
evolves differently and reaches its stationary point at


behavior, which is why our detection rate is slightly below the classical detection rate. We estimate that a little less than 10% of the test attack events belong to normal behavior (legitimate events and attributes). Despite this constraint, we still reach a detection rate of 80% to 90%.

6. Conclusion and Perspectives

Our main goal throughout this paper was to describe a framework that addresses a concern of companies: managing the company's security information to protect the IS from threats. We proposed an architecture which provides a global view of what occurs in the IS. It is composed of different agent types collecting and modeling information coming from various security products (such as firewalls, IDS and antivirus), operating systems, applications, databases and other elementary information relevant for the security analysis. An Analysis Server gathers all information sent by agents and provides a behavioral analysis of the user activities.

A new modeling for describing security information semantics is defined to address the heterogeneity problem. The modeling is an ontology that describes all activities that can be undertaken in an IS. By using real data triggered from products deployed to protect the assets of an organization, we have shown that the modeling reduced the amount of events and allowed automatic treatment of security information by the analysis algorithms. The model is extensible: we can extend the vocabulary as needed, for example by adding a new movement to be supervised in the IS. The model can be applied to other monitoring contexts, such as the monitoring of physical intruders in a museum; all we have to do is define the adequate vocabulary for the new context.

We demonstrated that unknown attack scenarios could be efficiently detected without hard pre-described information through our User Behavioral Analysis. By using only relevant information, users' behaviors are modeled through a Bayesian network. The Bayesian network modeling allows great detection effectiveness by injecting incoming events into the model and computing all associated conditional probabilities. Our anomaly evaluation module allows dynamically updating a user's model, reducing false positives and enriching behavioral anomalies. The experimentation on a real data set highlights our high detection rate on legitimate actions involved in attack scenarios. As data are modeled in the same way, User Behavioral Analysis results show that the effectiveness of the analysis processes is highly dependent on the data modeling.

The proposed framework can be useful to other processes. Indeed, the ontology is necessary to carry out the countermeasure process: since the results of the User Behavioral Analysis allow the administrator to detect legitimate users that deviate from their behavior, a reaction process can then be set up to answer malicious behaviors.

References
[1] Z. Zhang, J. Li, C. Manikopoulos, J. Jorgenson, and J. Ucles. Hide: A hierarchical network intrusion detection system using statistical preprocessing and neural network classification. In the 2001 IEEE Workshop on Information Assurance and Security, 2001.
[2] F. Benali, V. Legrand, and S. Ubéda. An ontology for the management of heterogeneous alerts of information system. In The 2007 International Conference on Security and Management (SAM'07), Las Vegas, USA, June 2007.
[3] J. L. Undercoffer, A. Joshi, and J. Pinkston. Modeling computer attacks an ontology for intrusion detections. The Sixth International Symposium on Recent Advances in Intrusion Detection. September 2003.
[4] K. Boudaoud. Un système multi-agents pour la détection d'intrusions. In JDIR'2000, Paris, France, 2000.
[5] W. R. Cheswick and S. M. Bellovin. Firewalls and Internet Security Repelling the Wily Hacker. Addison-Wesley, 1994.
[6] F. B. Cohen. Protection and security on the information superhighway. John Wiley & Sons, Inc., New York, NY, USA, 1995.
[7] F. B. Cohen. Information system attacks: A preliminary classification scheme. Computers and Security, 16, No. 1:29-46, 1997.
[8] D. Curry and H. Debar. Intrusion detection message exchange format. http://www.rfc-editor.org/rfc/rfc4765.txt.
[9] Davidson. Actions, reasons, and causes. Journal of Philosophy, pages 685-700 (Reprinted in Davidson 1980, pp. 3-19.), 1963.
[10] T. Wolle and H. L. Bodlaender. A note on edge contraction. Technical report, institute of information and computing sciences, Utrecht university, 2004.
[11] T. Duval, B. Jouga, and L. Roger. Xmeta a bayesian approach for computer forensics. In Annual Computer Security Applications Conference (ACSAC), 2004.
[12] S. T. Eckmann, G. Vigna, and R. A. Kemmerer. STATL: an attack language for state-based intrusion detection. In Journal of Computer Security, 10:71-103, 2002.
[13] G. Helmer, J. S. K. Wong, V. G. Honavar, L. Miller, and Y. Wang. Lightweight agents for intrusion detection. Journal of Systems and Software, 67:109-122, 2003.
[14] M. Hossain and S. M. Bridges. A framework for an adaptive intrusion detection system with data mining. In 13th Annual Canadian Information Technology Security Symposium, 2001.
[15] J. Howard and T. Longstaff. A common language for computer security incidents. Sand98-8667, Sandia National Laboratories, 1998.
[16] J. D. Howard. An Analysis of Security Incidents on the Internet. PhD thesis, Carnegie Mellon University, Pittsburgh,


Pennsylvania 15213 USA, April 1997.
[17] M.-Y. Huang, R. J. Jasper, and T. M. Wicks. A large scale distributed intrusion detection framework based on attack strategy analysis. Computer Networks (Amsterdam, Netherlands: 1999), 31(23-24):2465-2475, 1999.
[18] D. Icove, K. Seger, and W. VonStorch. Computer Crime: A Crimefighter's Handbook. Inc., Sebastopol, CA, 1995.
[19] I. D. J., P. J. R., and T. S. Executions, motivations, and accomplishments. Philosophical Review, 102(4), Oct 1993.
[20] A. Keleme, Y. Liang, and S. Franklin. A comparative study of different machine learning approaches for decision making. In Recent Advances in Simulation, Computational Methods and Soft Computing, 2002.
[21] U. Lindqvist and E. Jonsson. How to systematically classify computer security intrusions. In Proceedings of the IEEE Symposium on Security and Privacy, pages 154-163, 1997.
[22] S. Mathew, C. Shah, and S. Upadhyaya. An alert fusion framework for situation awareness of coordinated multistage attacks. In Proceedings of the Third IEEE International Workshop on Information Assurance, 2005.
[23] T. P. Minka. Expectation propagation for approximate bayesian inference. In the 17th Conference in Uncertainty in Artificial Intelligence, 2001.
[24] P. G. Neumann. Computer-Related Risks. Addison-Wesley, October 1994.
[25] P. G. Neumann and D. B. Parker. A summary of computer misuse techniques. In Proceedings of the 12th National Computer Security Conference, pages 396-407, Baltimore, Maryland, October 1989.
[26] A. J. Stewart. Distributed metastasis: A computer network penetration methodology. Phrack Magazine, 55(9), 1999.
[27] P. Ning, S. Jajodia, and X. S. Wang. Abstraction-based intrusion detection in distributed environments. ACM Transaction Information and System Security, 4(4):407-452, 2001.
[28] S. Noel, E. Robertson, and S. Jajodia. Correlating intrusion events and building attack scenarios through attack graph distances. In Proceedings of the 20th Annual Computer Security Applications Conference, 2004.
[29] R. Puttini, Z. Marrakchi, and L. Me. Bayesian classification model for real-time intrusion detection. In 22nd International Workshop on Bayesian Inference and Maximum Entropy Methods in Science and Engineering, 2002.
[30] X. Qin and W. Lee. Attack plan recognition and prediction using causal networks. In Proceedings of the 20th Annual Computer Security Applications Conference, 2004.
[31] R. A. Wasniowski. Multi-sensor agent-based intrusion detection system. In the 2nd annual conference on Information security curriculum development, 2005.
[32] S.-C. S, Tung.B, and Schnackenberg.D. The common intrusion detection framework (CIDF). In The Information Survivability Workshop, Orlando, FL, October 1998. CERT Coordination Center, Software Engineering Institute.
[33] J. Saraydaryan, V. Legrand, and S. Ubeda. Behavioral anomaly detection using bayesian modelization based on a global vision of the system. In NOTERE'07, 2007.
[34] J. Saraydaryan, V. Legrand, and S. Ubeda. Behavioral intrusion detection indicators. In 23rd International Information Security Conference (SEC 2008), 2008.
[35] J. Saraydaryan, V. Legrand, and S. Ubeda. Modeling of information system correlated events time dependencies. In NOTERE'08, 2008.
[36] O. Sheyner. Scenario Graphs and Attack Graphs. PhD thesis, Carnegie Mellon University, 2004.
[37] D. J. Spiegelhalter and S. L. Lauritzen. Sequential updating of conditional probabilities on directed graphical structures. Networks ISSN 0028-3045, 20:579-605, 1990.
[38] W. Stallings. Network and Internetwork Security: Principles and Practice. Prentice-Hall, Inc, Upper Saddle River, NJ, USA, 1995.
[39] S. Staniford-Chen, S. Cheung, R. Crawford, M. Dilger, J. Frank, J. Hoagland, K. Levitt, C. Wee, R. Yip, and D. Zerkle. Grids - a graph-based intrusion detection system for large networks. In Proceedings of the 19th National Information Systems Security Conference, 1996.
[40] J. M. Stanton, K. R. Stam, P. Mastrangelo, and J. Jolton. Analysis of end user security behaviors. Computers & Security, Volume 24: Pages 124-133, 2005.

Jacques Saraydaryan holds a Master's Degree in Telecoms and Networks from the National Institute of Applied Sciences (INSA), Lyon, France in 2005, and a Ph.D. in computer sciences from INSA, Lyon, France in 2009. He is a Research Engineer at the Exaprotect company, France. His research focus is on IS security, especially on anomaly intrusion detection systems. His research work has been published in international conferences such as Securware'08 and Securware'07. He has one patent with Exaprotect Company.

Fatiha Benali holds a Master's Degree in Fundamental Computer Sciences from Ecole Normale Supérieure (ENS), Lyon, France, and a Ph.D. in computer sciences from INSA, Lyon, France in 2009. She is a Lecturer in the Department of Telecommunications Services & Usages and a researcher in the Center for Innovations in Telecommunication and Services integration (CITI Lab) at INSA, Lyon, France. Her research focuses on IS security, notably on information security modeling. Her research work has been published in international conferences such as Security and Management (SAM'07) and Securware'08. She has 2 papers awarded and one patent with Exaprotect Company.

Stéphane Ubéda holds a PhD in computer sciences from ENS Lyon, France in 1993. He became an associate professor at the Jean-Monnet University, France in 1995, obtained a Habilitation to conduct research in 1997 and became a full professor at INSA of Lyon in 2000. He is a full professor at INSA of Lyon in the Telecommunications department. He is the director of the CITI Lab and also the head of the French National Institute for Research in Computer Science and Control (INRIA) project named AMAZONES, for AMbient Architectures: Service-Oriented, Networked, Efficient, Secure.

IJCSI International Journal of Computer Science Issues, Vol. 2, 2009 33
ISSN (Online): 1694-0784
ISSN (Printed): 1694-0814

Self-Partial and Dynamic Reconfiguration Implementation for AES using FPGA
Zine El Abidine ALAOUI ISMAILI and Ahmed MOUSSA

Innovative Technologies Laboratory,


National School of Applied Sciences,
Tangier, PBox 1818, Morocco
alaoui_zineabidine@yahoo.fr
amoussa@ensat.ac.ma

Abstract

This paper addresses efficient hardware/software implementation approaches for the AES (Advanced Encryption Standard) algorithm and describes the design and performance testing of the algorithm for embedded systems.
Also, with the spread of reconfigurable hardware such as FPGAs (Field Programmable Gate Arrays), embedded cryptographic hardware became cost-effective. Nevertheless, it is worth noting that nowadays even hardwired cryptographic algorithms are not so safe.
On the other hand, a self-reconfiguring platform is reported that enables an FPGA to dynamically reconfigure itself under the control of an embedded microprocessor. Hardware acceleration significantly increases the performance of embedded systems built on programmable logic. Allowing an FPGA-based MicroBlaze processor to self-select the coprocessors it uses can help reduce area requirements and increase a system's versatility.
The architecture proposed in this paper is an optimal hardware implementation of the algorithm that takes advantage of the dynamic partial reconfiguration of the FPGA. This implementation is a good solution to preserve confidentiality and accessibility of the information in digital communication.
Key words: Cryptography; Embedded systems; Reconfigurable computing; Self-reconfiguration

1. Introduction

Today, ultra deep submicron technologies offer a high density of integration for communication systems. This growth in integration has been accompanied by a dramatic increase in the complexity and transaction speed of these systems. As a consequence, security becomes a challenge and a critical issue, especially for real-time applications where hardware and software resources are very precious and necessary to provide a minimum quality of service.

Indeed, today's speed and computing power impose the recourse to sophisticated and more complicated cryptography algorithms for high-level security. A full software implementation is very heavy and considerably slows down the speed of the information exchange. On the other hand, a full hardware implementation is very expensive in terms of area and power, and can also deteriorate the speed of information transitions. This can be done dynamically at run-time and without user interaction, while the static part of the chip is not interrupted. The idea we put into practice is a coarse-grained partially dynamically reconfigurable implementation of a cryptosystem.

Our prototype implementation consists of an FPGA which is partially reconfigured at run-time to provide countermeasures against physical attacks. The static part is only configured upon system reset. Some advantages of dynamic reconfiguration for cryptosystems have been explored before [1, 2, 3]. In such systems, the main goal of dynamic reconfigurability is to use the available hardware resources in an optimal way. This is the first work that considers using a coarse-grained partially dynamically reconfigurable architecture in cryptosystems to prevent physical attacks by introducing temporal and/or spatial jitter [4, 5].

This paper presents an optimal implementation of the AES (Advanced Encryption Standard) cryptography algorithm by the use of a dynamic partially reconfigurable FPGA [6]. The reconfigurable aspect adapts the allowed basic block size to both the loop number and the size of the provided information, and makes all the AES blocks reconfigurable.

The paper is organized as follows: section 2 describes the AES algorithm. The reconfigurable FPGA and the self-reconfiguration methodology are presented in sections 3, 4 and 5. The proposed methodology of algorithm implementation is given in section 6. Finally, results are presented and illustrated in section 7.


2. AES Encryption Algorithm

The National Institute of Standards and Technology (NIST) initiated a process to develop a Federal Information Processing Standard (FIPS) for the AES, specifying an Advanced Encryption Algorithm to replace the Data Encryption Standard (DES), which expired in 1998 [6,7]. NIST solicited candidate algorithms for inclusion in AES, resulting in fifteen official candidate algorithms, of which five were selected as finalists. Unlike DES, which was designed specifically for hardware implementations, one of the design criteria for AES candidate algorithms is that they can be efficiently implemented in both hardware and software. Thus, NIST announced that both hardware and software performance measurements would be included in their efficiency testing. However, prior to the third AES conference in April 2000, virtually all performance comparisons had been restricted to software implementations on various platforms [5]. In October 2000, NIST chose Rijndael as the Advanced Encryption Algorithm.

The AES uses the Rijndael encryption algorithm with cryptographic keys of 128, 192 or 256 bits. As in most symmetric encryption algorithms, the AES algorithm manipulates the 128 bits of the input data, arranged in a 4 by 4 byte matrix, with byte substitution, bit permutation and arithmetic operations in finite fields, more specifically, addition and multiplication in the Galois field GF(2^8). Each set of operations is called a round. The round computation is repeated 10, 12 or 14 times depending on the size of the key (128, 192 or 256 bits respectively). The coding process includes the manipulation of a 128-bit data block through a series of logical and arithmetic operations. In the computation of both the encryption and decryption, a well defined order exists for the several operations that have to be performed over the data block.

The following describes in detail the operations performed by the AES encryption in each round. The State variable contains the 128-bit data block to be encrypted. In the encryption part, the data block to be encrypted is first split into an array of bytes called the state matrix. This algorithm is based on a round function, and different combinations of the algorithm are structured by repeating the round function a different number of times. Each round function contains four uniform and parallel steps: the SubBytes, ShiftRows, MixColumn and AddRoundKey transformations, and each step has its own particular functionality. This is represented by the flow diagram. Here the round key is derived from the initial key and repeatedly applied to transform the block of plain text into cipher text blocks. In Rijndael, the block and the key lengths can be independently specified to any multiple of 32 bits, with a minimum of 128 and a maximum of 256 bits. The number of applications of the round transformation to the state depends on the block length and the key length. The values for the various block and key lengths are given in Table 1.

The number of rounds of the AES algorithm to be performed during the execution of the algorithm depends on the key size. The number of rounds, key length and block size in the AES standard are summarized in Table 1 [8].

Table 1: Key-Block-Round Combinations for AES
           Key length (Nk words)   Key length (bits)   Number of rounds (Nr)
AES-128    4                       128                 10
AES-192    6                       192                 12
AES-256    8                       256                 14

As mentioned before, the coding process consists of the manipulation of the 128-bit data block through a series of logical and arithmetic operations, repeated a fixed number of times. This number of rounds depends directly on the size of the cipher key. In the computation of both the encryption and decryption, a well defined order exists for the several operations that have to be performed over the data block. The encryption/decryption process runs as shown in Figure 1.

Fig. 1: AES algorithm
(a) Encryption Structure (b) Decryption Structure

The next subsections describe in detail the operation performed by each of the functions used above, for the particular case of the encryption.
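The round structure described in this section can be illustrated in software. The following is a minimal Python sketch of AES-128 encryption of a single block, written after FIPS-197 rather than taken from the authors' hardware design. It assumes a 128-bit key (Nk = 4, Nr = 10) and stores the state as a flat list of 16 bytes in column-major order. It also assumes, per FIPS-197, that the last round has no MixColumns step. All function names are chosen for this example only, and the code is neither constant-time nor optimized.

```python
# Illustrative AES-128 single-block encryption (after FIPS-197).
# Not constant-time, not optimized; all function names are this sketch's own.

def xtime(a):
    """Multiply a byte by {02} in GF(2^8), reducing by x^8 + x^4 + x^3 + x + 1."""
    a <<= 1
    return (a ^ 0x11B) if a & 0x100 else a


def gf_mul(a, b):
    """Multiply two bytes in GF(2^8) by shift-and-add."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a = xtime(a)
        b >>= 1
    return result


def build_sbox():
    """S-box: multiplicative inverse in GF(2^8), then the affine transformation."""
    sbox = []
    for value in range(256):
        inv = 0                              # {00} is mapped to itself
        if value:
            inv = 1
            for _ in range(254):             # value^254 is the inverse of value
                inv = gf_mul(inv, value)
        out = 0
        for i in range(8):
            bit = ((inv >> i) ^ (inv >> ((i + 4) % 8)) ^ (inv >> ((i + 5) % 8))
                   ^ (inv >> ((i + 6) % 8)) ^ (inv >> ((i + 7) % 8)) ^ (0x63 >> i))
            out |= (bit & 1) << i
        sbox.append(out)
    return sbox


def shift_rows(state):
    """Row r of the column-major state is rotated left by r positions."""
    return [state[r + 4 * ((c + r) % 4)] for c in range(4) for r in range(4)]


def mix_columns(state):
    """Multiply each column by the fixed polynomial a(x) modulo x^4 + 1."""
    out = []
    for c in range(4):
        col = state[4 * c:4 * c + 4]
        for r in range(4):
            out.append(gf_mul(2, col[r]) ^ gf_mul(3, col[(r + 1) % 4])
                       ^ col[(r + 2) % 4] ^ col[(r + 3) % 4])
    return out


def expand_key(key, sbox):
    """AES-128 key schedule: 16-byte key -> 11 round keys of 16 bytes each."""
    words = [list(key[4 * i:4 * i + 4]) for i in range(4)]
    rcon = 1
    for i in range(4, 44):
        temp = list(words[i - 1])
        if i % 4 == 0:
            temp = [sbox[b] for b in temp[1:] + temp[:1]]   # RotWord, then SubWord
            temp[0] ^= rcon
            rcon = xtime(rcon)
        words.append([a ^ b for a, b in zip(words[i - 4], temp)])
    return [sum(words[4 * r:4 * r + 4], []) for r in range(11)]


def encrypt_block(plaintext, key):
    """Encrypt one 16-byte block with a 16-byte key (Nr = 10 rounds)."""
    sbox = build_sbox()
    round_keys = expand_key(key, sbox)
    state = [p ^ k for p, k in zip(plaintext, round_keys[0])]   # initial key addition
    for rnd in range(1, 11):
        state = [sbox[b] for b in state]                          # SubBytes
        state = shift_rows(state)                                 # ShiftRows
        if rnd < 10:                                              # no MixColumns in the final round
            state = mix_columns(state)
        state = [s ^ k for s, k in zip(state, round_keys[rnd])]   # AddRoundKey
    return bytes(state)
```

With the example of FIPS-197 Appendix C.1 (key 00 01 02 ... 0f, plaintext 00 11 22 ... ff), the sketch should reproduce the published ciphertext 69c4e0d8 6a7b0430 d8cdb780 70b4c55a. Supporting AES-192 and AES-256 would require generalizing the key schedule and the round count according to Table 1.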


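2.1 The SubBytes Transformation

The SubBytes transformation is a non-linear byte substitution that acts on every byte of the state in isolation to produce a new byte value using an S-box substitution table. The action of this transformation is illustrated in Figure 2 for a block size of 4.
This substitution, which is invertible, is constructed by composing two transformations:
- First, the multiplicative inverse in the finite field described earlier, with the {00} element mapped to itself.
- Second, the affine transformation over GF(2) defined by:

\[ b'_i = b_i \oplus b_{(i+4) \bmod 8} \oplus b_{(i+5) \bmod 8} \oplus b_{(i+6) \bmod 8} \oplus b_{(i+7) \bmod 8} \oplus c_i \qquad (1) \]

for 0 ≤ i < 8, where b_i is the i-th bit of the byte, and c_i is the i-th bit of a byte c with the value {63} or {01100011}. Here and elsewhere, a prime on a variable b' indicates that its value is to be updated with the value on the right.

Fig. 2. SubBytes acts on every byte in the state

In matrix form the affine transformation element of this S-box can be expressed as:

\[
\begin{bmatrix} b'_0 \\ b'_1 \\ b'_2 \\ b'_3 \\ b'_4 \\ b'_5 \\ b'_6 \\ b'_7 \end{bmatrix}
=
\begin{bmatrix}
1 & 0 & 0 & 0 & 1 & 1 & 1 & 1 \\
1 & 1 & 0 & 0 & 0 & 1 & 1 & 1 \\
1 & 1 & 1 & 0 & 0 & 0 & 1 & 1 \\
1 & 1 & 1 & 1 & 0 & 0 & 0 & 1 \\
1 & 1 & 1 & 1 & 1 & 0 & 0 & 0 \\
0 & 1 & 1 & 1 & 1 & 1 & 0 & 0 \\
0 & 0 & 1 & 1 & 1 & 1 & 1 & 0 \\
0 & 0 & 0 & 1 & 1 & 1 & 1 & 1
\end{bmatrix}
\begin{bmatrix} b_0 \\ b_1 \\ b_2 \\ b_3 \\ b_4 \\ b_5 \\ b_6 \\ b_7 \end{bmatrix}
\oplus
\begin{bmatrix} 1 \\ 1 \\ 0 \\ 0 \\ 0 \\ 1 \\ 1 \\ 0 \end{bmatrix}
\]

2.2 The ShiftRows Transformation

The ShiftRows transformation operates individually on each of the last three rows of the state by cyclically shifting the bytes in the row such that:

\[ s'_{r,c} = s_{r,\,(c + \mathrm{shift}(r, Nb)) \bmod Nb} \quad \text{for } 0 < r < 4 \text{ and } 0 \le c < Nb \qquad (2) \]

where, for Nb = 4, the shift value shift(r, Nb) equals the row number r. This has the effect of moving bytes to lower positions in the row, except that the lowest bytes wrap around into the top of the row (note that a prime on a variable indicates an updated value). The action of this transformation is illustrated in Figure 3.

Fig. 3: ShiftRows() cyclically shifts the last three rows in the State.

2.3 The MixColumns Transformation

The MixColumns transformation acts independently on every column of the state and treats each column as a four-term polynomial. The columns are considered as polynomials over GF(2^8) and multiplied modulo x^4 + 1 with a fixed polynomial a(x), given by

\[ a(x) = \{03\}x^3 + \{01\}x^2 + \{01\}x + \{02\} \qquad (3) \]

This equation can be written as a matrix multiplication. Let:

\[ s'(x) = a(x) \otimes s(x) \qquad (4) \]

In matrix form the transformation used is given in (5), where all the values are finite field elements as discussed in Section 2.

\[
\begin{bmatrix} s'_{0,c} \\ s'_{1,c} \\ s'_{2,c} \\ s'_{3,c} \end{bmatrix}
=
\begin{bmatrix}
02 & 03 & 01 & 01 \\
01 & 02 & 03 & 01 \\
01 & 01 & 02 & 03 \\
03 & 01 & 01 & 02
\end{bmatrix}
\begin{bmatrix} s_{0,c} \\ s_{1,c} \\ s_{2,c} \\ s_{3,c} \end{bmatrix}
\quad \text{for } 0 \le c < Nb \qquad (5)
\]

The action of this transformation is illustrated in Figure 4.

Fig. 4: MixColumns() operates on the State column-by-column.

2.4 The AddRoundKey Transformation

In the AddRoundKey transformation, Nb words from the key schedule, described later, are each added (XOR) into the columns of the state so that: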


\[ [s'_{0,c},\, s'_{1,c},\, s'_{2,c},\, s'_{3,c}] = [s_{0,c},\, s_{1,c},\, s_{2,c},\, s_{3,c}] \oplus [w_{round \cdot Nb + c}] \quad \text{for } 0 \le c < Nb \qquad (6) \]

where the key schedule words w [7] will be described later and round is the round number in the range 1 ≤ round ≤ Nr. The round number starts at 1 because there is an initial key addition prior to the round function. The action of this transformation is illustrated in Figure 5.

Fig. 5: AddRoundKey() XORs each column of the State with a word from the key schedule.

3. Reconfigurable Hardware Technology

A Field Programmable Gate Array (FPGA) is an integrated circuit that can be bought off the shelf and reconfigured by designers themselves. With each reconfiguration, which takes only a fraction of a second, the integrated circuit can perform a completely different function. An FPGA consists of thousands of universal building blocks, known as Configurable Logic Blocks (CLBs), connected using programmable interconnects. Reconfiguration can change the function of each CLB and the connections among them, leading to a functionally new digital circuit.

In recent years, FPGAs have been used for reconfigurable computing, where the main goal is to obtain high performance at a reasonable cost for hardware-implemented algorithms. The main advantage of FPGAs is their reconfigurability, i.e. they can be used for different purposes at different stages of computation.

Besides cryptography, applications of FPGAs can be found in the domains of evolvable and biologically-inspired hardware, network processors, real-time systems, rapid ASIC prototyping, digital signal processing, interactive multimedia, machine vision, computer graphics, robotics, embedded applications, and so forth. In general, FPGAs tend to be an excellent choice when dealing with algorithms that can benefit from the high parallelism offered by the FPGA fine-grained architecture.

Significant technical advances have led to architectures that combine FPGA logic blocks and interconnect matrices with one or more microprocessors and memory blocks integrated on a single chip [9, 10]. This hybrid technology is called Configurable System on Chip (CSoC). Examples of the CSoC technology are the Xilinx Virtex-II Pro, Virtex-4 and Virtex-5 FPGA families, which include one or more hard-core PowerPC processors embedded along with the FPGA's logic fabric.

Alternatively, soft processor cores that are implemented using part of the FPGA's logic fabric are also available. This approach is more flexible and less costly than the CSoC technology [11]. Many soft processor cores are now available in commercial products. Some of the most notable examples are the Xilinx 32-bit MicroBlaze and the PicoBlaze, and the Altera Nios and 32-bit Nios II processors. These soft processor cores are configurable in the sense that the designer can introduce new custom instructions or data paths. Furthermore, unlike the hard-core processors included in the Configurable System-on-Chip (CSoC) technology, designers can add as many soft processor cores as they may need. (Some designs could include 64 such processors or even more.)

4. Dynamic Partial Reconfiguration

The incredible growth of FPGA capabilities in recent years and the new features included in them have opened many new fields of investigation. One of the more interesting ones concerns partial reconfiguration and its possibilities [12, 9]. This feature allows the device to be partially reconfigured while the rest of the device continues its normal operation. Partial reconfiguration is the ability to reconfigure preselected areas of an FPGA anytime after its initial configuration while the design is operational. By taking advantage of partial reconfiguration, hardware can be shared between various applications and upgraded remotely without rebooting, and thus resource utilization can be increased [12].

Fig. 6: Reconfigurable FPGA structure

FPGA devices are partially reconfigured by loading only a subset of configuration frames into the FPGA internal configuration memory. The Xilinx Virtex-II Pro FPGAs allow partial reconfiguration in two forms: static and dynamic.

Static (or shutdown) partial reconfiguration takes place when the rest of the device is inactive and in shutdown mode. The non-reconfigurable area of the FPGA is held in reset and the FPGA enters the start-up sequence after partial reconfiguration is completed. In contrast, in dynamic (or active) partial reconfiguration new data can


be loaded to dynamically reconfigure a particular area of the FPGA while the rest of it is still operational. The user design is not suspended and no reset or start-up sequence is necessary.

Fig. 7: Static and dynamic part for system reconfigurable

5. Self Partial Dynamic Reconfiguration

The Dynamic Partial Self-Reconfiguration (DPSR) concept is the ability of an FPGA device to change the configuration of part of itself while other processes continue in the rest of the device. A self-reconfiguring platform has been reported that enables an FPGA to dynamically reconfigure itself under the control of an embedded microprocessor [10].
A partially reconfigurable design consists of a set of full designs and partial modules. The full and partial bitstreams are generated for different configurations of a design. The idea of implementing a self-reconfiguring platform for the Xilinx Virtex family was first reported in [10]. The platform enabled an FPGA to dynamically reconfigure itself under the control of an embedded microprocessor.
The hardware component of the Self Reconfiguring Platform (SRP) is composed of the internal configuration access port (ICAP), control logic, a small configuration cache, and an embedded processor. The embedded processor can be the Xilinx MicroBlaze, which is a 32-bit RISC soft processor core [13]. The hard-core PowerPC on the Virtex-II Pro can also be used as the embedded processor. The embedded processor provides intelligent control of device reconfiguration at run time. The provided hardware architecture established the framework for the implementation of self-reconfiguring platforms.
The internal configuration access port application program interface (ICAP API) and the Xilinx partial reconfiguration toolkit (XPART) provide methods for reading and modifying select FPGA resources and support for relocatable partial bitstreams.

Taking advantage of the FPGA capabilities presented above, we try to develop a flexible architecture for the AES implementation. Its complexity arises from the algorithm architecture associated with the loop number and information size [13, 4, 5, 14, 15, 3, 16].
The main idea of this work is to adapt the basic block size to the loop number and the size of the available information. The global architecture of the proposed system using a dynamically reconfigurable FPGA is illustrated below (Cf. Fig. 8).

Fig. 8: Global architecture for self-reconfigurable system

6. A Self-Reconfigurable Implementation of AES

Our principal contribution in this article is to design an optimal system allowing the implementation of the AES using the self-reconfigurable dynamic method.

6.1 Methodology implementation

To increase the performance of the implemented circuit, especially cost, power and inaccessibility, all of the AES blocks may be reconfigurable [17]. The parameters used for reconfiguration are embedded inside the reconfiguration manager module, and it is possible to switch quickly from one safe configuration to another by updating a hard system protection.
The reconfiguration control and management module allows choosing a correct memory program (PR) and generating a reconfiguration signal (Cf. Fig. 9). The actual dynamic reconfiguration procedure of the AES is preceded by two controllers. The first one, implemented on the MicroBlaze processor, computes the reconfiguration parameters using the available signal and the key size. This is the current state of the system. The second one computes the best parameters under the input constraints and writes these parameters to the configuration register for managing the reconfiguration process.

Fig. 9: Global architecture for implementation the AES (blocks: Initialisation Key, Attacks Detection, Manager Controller, MicroBlaze - ICAP, Reconfigurable AES Core)
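As a rough illustration of the role that Section 6.1 gives to the first controller, the following Python sketch maps a requested key size to a set of reconfiguration parameters. The round counts (10, 12 and 14) are those fixed by the AES standard. The bitstream naming scheme, the `ReconfigRequest` type and the `plan_reconfiguration` function are hypothetical names introduced only for this example; they are not the software running on the MicroBlaze in the proposed system.

```python
from dataclasses import dataclass
from typing import Optional

# Number of AES rounds for each key length, as fixed by the AES standard.
AES_ROUNDS = {128: 10, 192: 12, 256: 14}


@dataclass(frozen=True)
class ReconfigRequest:
    """Parameters a manager could write to a configuration register."""
    key_bits: int
    rounds: int
    bitstream: str  # hypothetical partial-bitstream name


def plan_reconfiguration(current_bits: int,
                         requested_bits: int) -> Optional[ReconfigRequest]:
    """Return parameters for a new AES configuration, or None when the
    requested key size is already loaded and nothing has to change."""
    if requested_bits not in AES_ROUNDS:
        raise ValueError(f"unsupported AES key size: {requested_bits}")
    if requested_bits == current_bits:
        return None
    return ReconfigRequest(
        key_bits=requested_bits,
        rounds=AES_ROUNDS[requested_bits],
        bitstream=f"aes{requested_bits}_partial.bit",  # assumed naming scheme
    )
```

Returning `None` when the key size is unchanged reflects the idea that a partial reconfiguration is only triggered by a change of key length; the real decision logic may take further inputs, such as the attack detection signal shown in Fig. 9.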


Figure 10 presents the modular design and reconfiguration cryptosystem.

Fig. 10: Modular Design and Reconfiguration cryptosystem. (Three configurations, each pairing the MicroBlaze system with a reconfigurable area holding AES128, AES192 or AES256; reconfiguration switches between them.)

6.2 Configuration controller finite state machine

As described previously, the configuration controller is developed as a finite state machine. With knowledge of the memory mapping, the configuration management finite state machine is relatively simple.
The configuration controller is used only for normal FPGA configuration when power is switched on.
Figure 11 shows the four global states used by the configuration controller.
The first state of this four-state FSM (Finite State Machine) is a start state. To change state, the configuration controller waits for detection of the key length signal. This signal is the begin signal of the normal configuration process.

Fig. 11: Finite state machine configuration controller (states: Start, Crypto AES 128, Crypto AES 192, Crypto AES 256; transitions are triggered by the key length, 128, 192 or 256 bits, and by a change of length)

7 Implementation Result

To test the proposed method, we first implemented the AES algorithm on the Spartan II (XC2S200E) and Virtex II (XC2V500) devices from Xilinx. The results are summarized in Table 2.

Table 2: comparison of the different implementations of the AES (resources used/total)

Cipher    Resource           Spartan II (XC2S200E)   Virtex II (XC2V500)
AES-128   Slices             196/2353                192/3072
          Slice Flip-Flops   92/4704                 78/6144
          4-input LUTs       352/4704                342/6144
          BRAMs              6/14                    6/32
AES-192   Slices             265/2353                241/3072
          Slice Flip-Flops   102/4704                76/6144
          4-input LUTs       467/4704                341/6144
          BRAMs              6/14                    6/32
AES-256   Slices             252/2353                207/3072
          Slice Flip-Flops   99/4704                 81/6144
          4-input LUTs       469/4704                381/6144
          BRAMs              6/14                    6/32

The performance of the AES cryptographic implementations is presented in Table 3.

Table 3: Performance implementation for AES

Crypto    Parameter            Device (XC2S200E)   Device (XC2V500)
AES-128   Minimum Period (ns)  35.520              13.674
          Maximum Frequency    28.742              78.59
          Clock Cycle Used     250                 250
          Throughput (Mbps)    16.362              40.57
          TPS (kbps/slice)     83                  232
AES-192   Minimum Period (ns)  41.387              13.863
          Maximum Frequency    25.825              71.78
          Clock Cycle Used     300                 300
          Throughput (Mbps)    11.361              31.72
          TPS (kbps/slice)     41                  135
AES-256   Minimum Period (ns)  37.648              15.043
          Maximum Frequency    27.067              70.975
          Clock Cycle Used     350                 350
          Throughput (Mbps)    9.739               26.734

After checking the different hardware implementations of the AES algorithms, we moved on to testing the complete self-reconfiguration system based on the MicroBlaze processor. The results of this implementation on the Virtex-II Pro are shown in Table 4.
We notice that one can easily pass from one configuration to another using the software program implemented in the MicroBlaze processor.
As described previously, the configuration controller is developed with the finite state machine in Figure 11. With knowledge of the memory mapping, the configuration management finite state machine is relatively simple.

Table 4: Implementation of Microblaze and cryptosystem

                            FPGA Slices   LUTs   FF/Latches   BRAM
MicroBlaze System           4083          3383   3228         25
AES coprocessor   AES-128   3565          3086   3042         4
                  AES-192   3764          3259   3149         4
                  AES-256   3632          3127   3205         4

8 Conclusion

In this paper we presented an AES coprocessor implementation using self partial dynamic reconfiguration of an FPGA. The main advantage of this work lies in the capacity of the proposed architecture to modify and/or change the key size without stopping the normal operation of the system. As a consequence, the proposed system is able to increase the security and safety of the AES algorithm.
Moreover, the implementation of the AES crypto-processor with this new configuration illustrates the ability of this architecture to optimize the processor occupation and the reconfiguration time.
In future work, our short-term goal is to implement this algorithm in a real communication system in order to explore the encoding method with self partial dynamic reconfiguration.

References

[1] F.-X. Standaert, G. Rouvroy, J.-J. Quisquater and J.-D. Legat, “Efficient implementation of Rijndael encryption in reconfigurable hardware: Improvements and design tradeoffs”, in the proceedings of CHES 2003, Lecture Notes in Computer Science, Cologne, Germany, September 2003, pp. 334–350.
[2] Ming-Haw Jing, Zih-Heng Chen, Jian-Hong Chen, and Yan-Haw Chen, “Reconfigurable system for high speed and diversified AES using FPGA”, Microprocessors and Microsystems, vol. 31, issue 2, March 2007, pp. 94-102.
[3] A.J. Elbirt, W. Yip, B. Chetwynd, C. Paar, “An FPGA-based performance evaluation of the AES block cipher candidate algorithm”, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 9, issue 4, 2001, pp. 545–557.
[4] M. McLoone and J.V. McCanny, “High Performance Single-Chip FPGA Rijndael Algorithm Implementations”, Cryptographic Hardware and Embedded Systems (CHES 2001), Paris, France, 2001.
[5] National Institute of Standards and Technology (NIST), Second Advanced Encryption Standard (AES) Conference, Rome, Italy, March 1999.
[6] B. Schneier, “Applied Cryptography”, John Wiley & Sons Inc., New York, USA, 2nd ed., 1996.
[7] M. Kandemir, W. Zhang, and M. Karakoy, “Runtime code parallelization for onchip multiprocessors”, in Proceedings of the 6th Design Automation and Test in Europe Conference, Munich, Germany, March 2003.
[8] J. Daemen, V. Rijmen, “AES Proposal: Rijndael, The Rijndael Block Cipher”, AES Proposal, 1999, pp. 1–45.
[9] M. Huebner, C. Schuck, M. Kuhnle, and J. Becker, “New 2-Dimensional Partial Dynamic Reconfiguration Techniques for Real-time Adaptive Microelectronic Circuits”, Proc. of Emerging VLSI Technologies and Architectures, Karlsruhe, Germany, March 2006.
[10] Xilinx web site. http://www.xilinx.com/ipcenter/processorcentral/microblaze (2003).
[11] P. Lysaght, B. Brodget, J. Mason, J. Young, and B. Bridgford, “Enhanced Architectures, Design Methodologies and CAD Tools for Dynamic Reconfiguration of Xilinx FPGAs”, International Conference on Field Programmable Logic and Applications, Madrid, Spain, 2006.
[12] M. Ullmann, M. Huebner, B. Grimm, and J. Becker, “An FPGA Run-Time System for Dynamical On-Demand Reconfiguration”, Proc. of the 18th International Parallel and Distributed Processing Symposium, Karlsruhe, Germany, April 26-30, 2004.
[13] H. Qin, T. Sasao and Y., “An FPGA Design of AES Encryption Circuit with 128-bit Keys”, Proceedings of the 15th ACM Great Lakes symposium on VLSI, Chicago, Illinois, USA, April 17–19, 2005.
[14] O. Perez, Y. Berviller, C. Tanougast, and S. Weber, “The Use of Runtime Reconfiguration on FPGA Circuits to Increase the Performance of the AES Algorithm Implementation”, Journal of Universal Computer Science, vol. 13, no. 3, 2007, pp. 349-362.
[15] N. Saqib, F. Rodriguez-Henriquez, and A. Diaz-Pérez, “Two approaches for a single-Chip FPGA Implementation of an Encryptor/Decryptor AES Core”, International Conference on Field-Programmable Logic and Applications, Lisbon, Portugal, September 2003.
[16] M. Mogollon, “Cryptography and Security Services: Mechanisms and Applications”, Cybertech Publishing, 2007.
[17] Z. A. Alaoui, A. Moussa, A. Elmourabit and K. Amechnoue, “Flexible Hardware Architecture for AES Cryptography Algorithm”, IEEE Conference on Multimedia Computing and Systems, Ouarzazate, Morocco, April 2009.

Z. Alaoui-Ismaili received the DEA in electronics in 1997 and the Ph.D. degree in Electronics and Industrial Computer Engineering in 2002, both from Ibn Tofail University of Kenitra, Morocco. He has been a teacher-researcher in the Telecoms & Electronics department of the National School of Applied Sciences, Tangier, Morocco, since June 2003.
His main research interests are FPGA-based reconfigurable computing applications, with a special focus on dynamic partial reconfiguration and embedded systems.
Dr. Alaoui-Ismaili has authored or coauthored more than 10 journal and conference papers.
He is president of the Moroccan Society of Microelectronics association.

A. Moussa was born in 1970 in Oujda, Morocco. He received the Licence in Electronics from the University of Oujda, Morocco, in 1994, and the PhD in Automatic Control and Information Theory from the University of Kenitra, Morocco, in 2001. He worked for two years as a post-graduate researcher at the University of Sciences and Technology of Lille, France. In 2003 he joined the Sanofi-Aventis research laboratory in Montpellier, France, where he supervised microarray analysis activities. He is now a professor at the National School of Applied Sciences in Tangier, Morocco, and his current research interests are in the application of Markov theory and multidimensional data analysis to image processing, and in embedded systems.


Web Single Sign-On Authentication using SAML

Kelly D. LEWIS, James E. LEWIS, Ph.D.

Information Security, Brown-Forman Corporation


Louisville, KY 40210, USA
kellydlewis@gmail.com

Engineering Fundamentals, Speed School of Engineering, University of Louisville


Louisville, KY 40292, USA
jel@louisville.edu

Abstract

Companies have increasingly turned to application service providers (ASPs) or Software as a Service (SaaS) vendors to offer specialized web-based services that will cut costs and provide specific and focused applications to users. The complexity of designing, installing, configuring, deploying, and supporting the system with internal resources can be eliminated with this type of methodology, providing great benefit to organizations. However, these models can present an authentication problem for corporations with a large number of external service providers. This paper describes the implementation of Security Assertion Markup Language (SAML) and its capabilities to provide secure single sign-on (SSO) solutions for externally hosted applications.
Keywords: Security, SAML, Single Sign-On, Web, Authentication

1. Introduction

Organizations for the most part have recently started using a central authentication source for internal applications and web-based portals. This single source of authentication, when configured properly, provides strong security in the sense that users no longer keep passwords for different systems on sticky notes on monitors or under their keyboards. In addition, management and auditing of users becomes simplified with this central store.

As more web services are being hosted by external service providers, the sticky note problem has recurred for these outside applications. Users are now forced to remember passwords for HR benefits, travel agencies, expense processing, etc., or programmers must develop custom SSO code for each site. Management of users becomes a complex problem for the help desk, and custom-built code for each external service provider can become difficult to administer and maintain.

In addition, there are problems for the external service provider as well. Every user in an organization will need to be set up for the service provider’s application, causing a duplicate set of data. Instead, if the organization can control this user data, it would save the service provider time by not needing to set up and terminate user access on a daily basis. Furthermore, one central source would allow the data to be more accurate and up-to-date.

Given this set of problems for organizations and their service providers, it is apparent that a solution is needed that provides a standard for authentication information to be exchanged over the Internet. Security Assertion Markup Language (SAML) provides a secure, XML-based solution for exchanging user security information between an identity provider (our organization) and a service provider (ASPs or SaaSs). The SAML standard defines rules and syntax for the data exchange, yet is flexible and can allow for custom data to be transmitted to the external service provider.

2. Background

The consortium for defining SAML standards and security is OASIS (Organization for the Advancement of Structured Information Standards). It is a non-profit international organization that promotes the development and adoption of open standards for security and web services. OASIS was founded in 1993 under the name SGML (Standard Generalized Markup Language) Open, which it kept until its name change in 1998. Headquarters for OASIS are located in North America, but there is active member participation internationally in 100 countries on five continents [1].


SAML 1.0 became an OASIS standard toward the end of 2002, with its early formations beginning in 2001. The goal behind SAML 1.0 was to form an XML framework to allow for authentication and authorization from a single sign-on perspective. At the time of this milestone, other companies and consortiums started extending SAML 1.0. While these extensions were being formed, the SAML 1.1 specification was ratified as an OASIS standard in the fall of 2003.

The next major revision of SAML is 2.0, and it became an official OASIS Standard in 2005. SAML 2.0 involves major changes to the SAML specifications. This is the first revision of the standard that is not backwards compatible, and it provides significant additional functionality [2]. SAML 2.0 now supports W3C XML encryption to satisfy privacy requirements [3]. Another advantage of SAML 2.0 is its support for service provider initiated web single sign-on exchanges. This allows the service provider to query the identity provider for authentication. Additionally, SAML 2.0 adds “Single Logout” functionality. The remainder of this text discusses the implementation of a SAML 2.0 environment.

There are three roles involved in a SAML transaction – an asserting party, a relying party, and a subject. The asserting party (identity provider) is the system in authority that provides the user information. The relying party (service provider) is the system that trusts the asserting party’s information, and uses the data to provide an application to the user. The user and their identity that is involved in the transaction are known as the subject.

The components that make up the SAML standard are assertions, protocols, bindings and profiles. Each layer of the standard can be customized, allowing specific business cases to be addressed per company. Since each company’s scenarios could be unique, the implementation of these business cases should be able to be personalized per service and per identity provider.

The transaction from the asserting party to the relying party is called a SAML assertion. The relying party assumes that all data contained in the assertion from the asserting party is valid. The structure of the SAML assertion is defined by the XML schema and contains header information, the subject and statements about the subject in the form of attributes and conditions. The assertion can also contain authorization statements defining what the user is permitted to do inside the web application.

The SAML standard defines request and response protocols used to communicate the assertions between the service provider (relying party) and the identity provider (asserting party). Some example protocols are [4]:

• Authentication Request Protocol – defines how the service provider can request an assertion that contains authentication or attribute statements
• Single Logout Protocol – defines the mechanism to allow for logout of all service providers
• Artifact Resolution Protocol – defines how the initial artifact value and then the request/response values are passed between the identity provider and the service provider
• Name Identifier Management Protocol – defines how to add, change or delete the value of the name identifier for the service provider

SAML bindings map the SAML protocols onto standard lower level network communication protocols used to transport the SAML assertions between the identity provider and service provider. Some example bindings used are [4]:

• HTTP Redirect Binding – uses HTTP redirect messages
• HTTP POST Binding – defines how assertions can be transported using base64-encoded content
• HTTP Artifact Binding – defines how an artifact is transported to the receiver using HTTP
• SOAP HTTP Binding – uses SOAP 1.1 messages and SOAP over HTTP

The highest SAML component level is profiles, or the business use cases between the service provider and the identity provider that dictate how the assertion, protocol and bindings will work together to provide SSO. Some example profiles are [4]:


• Web Browser SSO Profile – uses the Authentication Request Protocol, and any of the following bindings: HTTP Redirect, HTTP POST and HTTP Artifact
• Single Logout Profile – uses the Single Logout Protocol, which can log the user out of all service providers using a single logout function
• Artifact Resolution Profile – uses the Artifact Resolution Protocol over a SOAP HTTP binding
• Name Identifier Management Profile – uses the Name Identifier Management Protocol and can be used with HTTP Redirect, HTTP POST, HTTP Artifact or SOAP

Two profiles will be briefly discussed in more detail: the artifact resolution profile and the web browser SSO profile. The artifact resolution profile can be used if the business case requires highly sensitive data to pass between the identity provider and service provider, or if the two partners want to utilize an existing secure connection between the two companies.

This profile allows a small value, called an artifact, to be passed between the browser and the service provider by one of the HTTP bindings. After the service provider receives the artifact, it transmits the artifact and the request/response messages out of band from the browser back to the identity provider. Most likely the messages are transmitted over an SSL VPN connection between the two companies. This provides security for the message, and eliminates the need for the assertions to be signed or encrypted, which could potentially reduce overhead. When the identity provider receives the artifact, it looks up the value in its database and processes the request. After all out of band messages are transmitted between the identity provider and service provider, the service provider presents the information directly to the browser.

The web browser SSO profile may be initiated by the identity provider or the service provider. If initiated by the identity provider, the assertion is either signed, encrypted, or both. In the web browser SSO profile, all of the assertion information is sent at once to the service provider using any of the HTTP bindings and protocols. The service provider decrypts if necessary and checks for message integrity against the signature. Next, it parses the SAML XML statements and gathers any attributes that were passed, and then performs SSO using the Assertion Consumer Service. The diagram in Figure 1 shows the identity provider initiated SAML assertion.

Figure 1: Identity Provider Initiated SAML Assertion Flowchart

If the user accesses the external webpage without passing through the internal federated identity manager first, the service provider will need to issue the SAML request back to the identity provider on behalf of the user. This process of SSO is called service provider initiated. In this case, the user arrives at a webpage specific for the company, but without a SAML assertion. The service provider redirects the user back to the identity provider’s federation webpage with a SAML request, and optionally with a RelayState query string variable that can be used to determine what SAML entity to utilize when sending the assertion back to the service provider.

After receiving the request from the service provider, the identity provider processes the SAML request as if it came internally. This use case is important since it allows users to bookmark external sites directly, but still provides SAML SSO capabilities with browser redirects. Figure 2 demonstrates this service provider initiated use case.

The most popular business use case for SAML federation is the web browser SSO profile, used in conjunction with the HTTP POST binding and authentication request protocol. The implementation and framework section will discuss this specific use case and the security needed to protect data integrity.
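To make the HTTP POST binding more concrete, the following minimal Python sketch shows how a service provider could recover the XML carried in a base64-encoded SAMLResponse form field. The sample message, the user name and the helper function names are assumptions made for this illustration. Signature validation and decryption, which the profile requires before any data is trusted, are deliberately left out, and the standard-library parser is used only for brevity.

```python
import base64
import xml.etree.ElementTree as ET

SAMLP_NS = "urn:oasis:names:tc:SAML:2.0:protocol"
SAML_NS = "urn:oasis:names:tc:SAML:2.0:assertion"


def encode_saml_response(xml_text: str) -> str:
    """Base64-encode a SAML message as it would appear in the hidden form field."""
    return base64.b64encode(xml_text.encode("utf-8")).decode("ascii")


def decode_saml_response(form_value: str) -> ET.Element:
    """Decode the SAMLResponse form field and parse it into an XML tree.

    No signature check is done here; a real service provider must verify
    the signature before using anything found in the message.
    """
    xml_bytes = base64.b64decode(form_value)
    return ET.fromstring(xml_bytes)


# Hypothetical, heavily trimmed response used only for this example.
SAMPLE = (
    f'<samlp:Response xmlns:samlp="{SAMLP_NS}" xmlns:saml="{SAML_NS}">'
    "<saml:Assertion><saml:Subject>"
    "<saml:NameID>user@example.com</saml:NameID>"
    "</saml:Subject></saml:Assertion></samlp:Response>"
)

form_value = encode_saml_response(SAMPLE)
root = decode_saml_response(form_value)
name_id = root.find(f".//{{{SAML_NS}}}NameID")
print(name_id.text)
```

The namespace-qualified lookup mirrors how the Assertion Consumer Service would locate the subject and attributes once the message has been validated.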


Figure 2: Service Provider Initiated SAML Assertion Flowchart

3. Implementation/Framework

There are numerous identity and federation manager products on the market that support federation via SAML versions 1.1 and 2.0, as well as several open source products. OpenSAML, an open source toolkit, is available to support developers working with SAML. Shibboleth is an example of an open source project that uses the OpenSAML toolkit. Sun Microsystems has a product called OpenSSO that is an open source version of their commercial product, OpenSSO Enterprise. Computer Associates provides an access manager called SiteMinder and RSA has a product called Federated Identity Manager, to name a few. Regardless of which product is selected, as long as it conforms to the standards of SAML, all products can be used interchangeably with no compatibility issues.

The process of setting up federation involves configuring a local entity, a partner entity, and an association between the two that forms the federation. The local entity must be manually configured inside the federation software; however, for SAML 2.0 the process of setting up the partner entity has been made easier with the introduction of metadata. Since the SAML standard is flexible and can allow a number of custom configurations, certain agreements and configuration information must be initially set up between two partners. Exchanging metadata containing this specific information determines the specifications that will be used in a particular business case.

Once the metadata file has been received from the partner entity, this XML file can be uploaded into the federation software without any additional configuration needed for the partner entity. This process saves time and reduces the possibility for error. The file contains elements and attributes, and includes an EntityDescriptor and entityID that specify to which entity the configuration refers.

There are many optional elements and attributes for metadata files; some that may apply are Binding, WantAuthnRequestsSigned, WantAssertionsSigned, SingleLogoutService, etc. To review the entire list of elements available for the metadata file, see the OASIS metadata standard [5].

When manually configuring a local entity, first determine the parameters to be passed in the assertion that will be the unique username for each user. Normally this value is an email address or employee number, since these are guaranteed to be unique to each individual. In some federation products, values from a data source can be automatically utilized with the SAML assertion. These values can be extracted from different data sources such as LDAP, or another source that could be tied into an HR system. While setting up the local entity there are other considerations, such as how the parameters will be passed (in attributes or nameID), a certificate keystore for the association, and the type of signing policies required.

The sample metadata shown in Figure 3 is an example that would be sent from the local entity (identity provider in this case) to the partner entity (service provider) to load into the federation software. The descriptor is titled “IDPSSODescriptor”, which shows that this is metadata from an identity provider.

Some elements are mandatory, such as entityID, yet others are optional, such as ID and OrganizationName. The elements to note are the Single Sign-On Service binding, location, protocol support section, and key descriptor and key info areas. In this example, the binding must be performed by an HTTP-POST, and the supported protocol is SAML 2.0.
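As a hedged illustration of what federation software does when it loads a partner's metadata, the Python sketch below extracts the entityID, the supported protocol, the Single Sign-On Service binding and location, and the key descriptor uses from an identity provider metadata document. The embedded sample is a trimmed stand-in that reuses the example values from this section (certificates and organization details are omitted), and `summarize_idp_metadata` is a hypothetical helper, not part of any federation product.

```python
import xml.etree.ElementTree as ET

MD = "urn:oasis:names:tc:SAML:2.0:metadata"

# Trimmed identity provider metadata; key material is omitted for brevity.
SAMPLE_METADATA = """
<md:EntityDescriptor entityID="mycompany:saml2.0"
    xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata">
  <md:IDPSSODescriptor WantAuthnRequestsSigned="false"
      protocolSupportEnumeration="urn:oasis:names:tc:SAML:2.0:protocol">
    <md:KeyDescriptor use="signing"/>
    <md:KeyDescriptor use="encryption"/>
    <md:SingleSignOnService
        Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
        Location="http://mycompany.com/sso/SSO"/>
  </md:IDPSSODescriptor>
</md:EntityDescriptor>
"""


def summarize_idp_metadata(xml_text: str) -> dict:
    """Collect the fields a partner would check before configuring federation."""
    root = ET.fromstring(xml_text.strip())
    idp = root.find(f"{{{MD}}}IDPSSODescriptor")
    if idp is None:
        raise ValueError("no IDPSSODescriptor: not identity provider metadata")
    sso = idp.find(f"{{{MD}}}SingleSignOnService")
    return {
        "entity_id": root.get("entityID"),
        "protocols": idp.get("protocolSupportEnumeration", "").split(),
        "sso_binding": sso.get("Binding") if sso is not None else None,
        "sso_location": sso.get("Location") if sso is not None else None,
        "key_uses": [k.get("use") for k in idp.findall(f"{{{MD}}}KeyDescriptor")],
    }


summary = summarize_idp_metadata(SAMPLE_METADATA)
for field, value in summary.items():
    print(field, "=", value)
```

Reading these values from the file rather than typing them by hand is what saves time and reduces configuration errors when a partner entity is set up.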


Also, there are two KeyDescriptors, one for signing and


one for encrypting. This indicates the service provider
<md:EntityDescriptor ID="MyCompany"
entityID="mycompany:saml2.0"
requires both for the assertion. There are two methods of
xmlns:ds="http://www.w3.org/2000/09/xmldsig#" binding listed for the assertion consumer service: the
xmlns:md="urn:oasis:names:tc:SAML:2.0:metadata" HTTP Post and the HTTP Artifact. These two metadata
xmlns:query="urn:oasis:names:tc:SAML:metadata:ext:query"
xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
samples show how custom each company can be with
xmlns:xenc="http://www.w3.org/2001/04/xmlenc#"> unique SAML requirements.
<md:IDPSSODescriptor WantAuthnRequestsSigned="false"
protocolSupportEnumeration= <EntityDescriptor
"urn:oasis:names:tc:SAML:2.0:protocol"> entityID="mypartner:saml2.0"
<md:KeyDescriptor use="encryption"> xmlns="urn:oasis:names:tc:SAML:2.0:metadata">
<ds:KeyInfo <SPSSODescriptor
xmlns:ds="http://www.w3.org/2000/09/xmldsig#"> AuthnRequestsSigned="true"
<ds:X509Data> WantAssertionsSigned="true"
<ds:X509Certificate> protocolSupportEnumeration=
CERTIFICATE "urn:oasis:names:tc:SAML:2.0:protocol">
</ds:X509Certificate> <KeyDescriptor use="signing">
</ds:X509Data> <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
</ds:KeyInfo> <ds:X509Data>
<md:EncryptionMethod <ds:X509Certificate>CERTIFICATE</ds:X509Certificate>
Algorithm="http://www.w3.org/2001/04/xmlenc#aes128-cbc"> </ds:X509Data>
</md:EncryptionMethod> </ds:KeyInfo>
</md:KeyDescriptor> </KeyDescriptor>
<md:KeyDescriptor use="signing"> <KeyDescriptor use="encryption">
<ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#"> <ds:KeyInfo xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
<ds:X509Data> <ds:X509Data>
<ds:X509Certificate> <ds:X509Certificate>CERTIFICATE</ds:X509Certificate>
CERTIFICATE </ds:X509Data>
</ds:X509Certificate> </ds:KeyInfo>
</ds:X509Data> <EncryptionMethod
</ds:KeyInfo> Algorithm="http://www.w3.org/2001/04/xmlenc#aes128-cbc">
</md:KeyDescriptor> <xenc:KeySize
<md:SingleSignOnService xmlns:xenc="http://www.w3.org/2001/04/xmlenc#">128
Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST" </xenc:KeySize>
Location="http://mycompany.com/sso/SSO"> </EncryptionMethod>
</md:SingleSignOnService> </KeyDescriptor>
</md:IDPSSODescriptor> <NameIDFormat>
<md:Organization> urn:oasis:names:tc:SAML:2.0:nameid-format:transient
<md:OrganizationName xml:lang="en-us"> </NameIDFormat>
My Company Org <AssertionConsumerService
</md:OrganizationName> index="0"
<md:OrganizationDisplayName xml:lang="en-us"> isDefault="true"
My Company Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-POST"
</md:OrganizationDisplayName> Location="https://mypartner.com/federation/metaAlias/sp"/>
<md:OrganizationURL xml:lang="en-s"> <AssertionConsumerService
http://www.mycompany.com index="1"
</md:OrganizationURL> Binding="urn:oasis:names:tc:SAML:2.0:bindings:HTTP-Artifact"
</md:Organization> Location="https://mypartner.com/federation/metaAlias/sp"/>
</md:EntityDescriptor> </SPSSODescriptor>
Figure 3: Sample Identity Provider Metadata XML </EntityDescriptor>
Figure 4: Sample Service Provider Metadata XML
Figure 4 demonstrates an example metadata XML file that would be sent from a
service provider to an identity provider for loading into the federation
software. Note that the descriptor is “SPSSODescriptor”, indicating service
provider single sign-on descriptor.

In this case, “WantAuthnRequestsSigned” is equal to true, as opposed to the
previous example in Figure 3.

After the metadata is exchanged and all entities are set up, the assertion can
be tested and verified using browser tools and decoders. For this example, the
service provider implementation of the HTTP POST method will be described
briefly.

The identity provider must first determine what URL the federation software
requires, and what attributes need to

IJCSI
IJCSI International Journal of Computer Science Issues, Vol. 2, 2009 46

be passed with the POST data, such as entityID or RelayState. The browser
HTTP-POST action contains hidden SAMLResponse and RelayState fields enclosed in
an HTML form. After the browser POST is received by the service provider, the
Assertion Consumer Service validates the signature and processes the assertion,
gathering attributes and other conditions that could optionally be required. The
service provider also obtains the optional RelayState variable in the HTML form,
determines the application URL, and redirects the browser to it, providing
single sign-on to the web application [4].

To validate the sent attributes in the assertion with this HTTP POST example, a
browser add-on program can be used to watch exactly what is sent between the
browser and the partner. A few browser add-ons are “HttpFox” [6], which can be
used with Mozilla Firefox, and “HttpWatch” [7], which can be used with Mozilla
Firefox or Internet Explorer. After capturing HTTP data, the browser POST action
can be verified to ensure the proper attributes are passed to the partner. The
POST action shows the hidden SAMLResponse and RelayState fields in the HTML
form, and can be used to validate the data sent to the service provider.

The SAMLResponse field is URL encoded and, under the HTTP POST binding, Base64
encoded, and must be decoded before reading the assertion. Depending on the
requirements, the assertion must be signed, or signed and encrypted. For testing
purposes, first only sign the assertion so it can be decoded into a
non-encrypted readable version. Figure 5 shows an example of a decoded
SAMLResponse and has been shortened for readability, designated by capital
words.

<samlp:Response xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    Consent="urn:oasis:names:tc:SAML:2.0:consent:unspecified"
    Destination="https://mypartner.com/metaAlias/sp"
    ID="ad58514ea9365e51c382218fea"
    IssueInstant="2009-04-22T12:33:36Z"
    Version="2.0">
  <saml:Issuer>http://login.mycompany.com/mypartner</saml:Issuer>
  <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
    SIGNATURE VALUE, ALGORITHM, ETC.
  </ds:Signature>
  <samlp:Status>
    <samlp:StatusCode
        Value="urn:oasis:names:tc:SAML:2.0:status:Success">
    </samlp:StatusCode>
  </samlp:Status>
  <saml:Assertion ID="1234" IssueInstant="2009-04-22T12:33:36Z"
      Version="2.0">
    <saml:Issuer>http://login.mycompany.com/mypartner</saml:Issuer>
    <ds:Signature xmlns:ds="http://www.w3.org/2000/09/xmldsig#">
      SIGNATURE VALUE, ALGORITHM, ETC.
    </ds:Signature>
    <saml:Subject>
      <saml:NameID>NAMEID FORMAT, INFO, ETC</saml:NameID>
      <saml:SubjectConfirmation
          Method="urn:oasis:names:tc:SAML:2.0:cm:bearer">
        <saml:SubjectConfirmationData
            NotOnOrAfter="2009-04-22T12:43:36Z"
            Recipient="https://mypartner.com/metaAlias/sp">
        </saml:SubjectConfirmationData>
      </saml:SubjectConfirmation>
    </saml:Subject>
    <saml:Conditions
        NotBefore="2009-04-22T12:28:36Z"
        NotOnOrAfter="2009-04-22T12:33:36Z">
      <saml:AudienceRestriction>
        <saml:Audience>mypartner.com:saml2.0</saml:Audience>
      </saml:AudienceRestriction>
    </saml:Conditions>
    <saml:AuthnStatement AuthnInstant="2009-04-22T12:33:20Z"
        SessionIndex="ccda16bc322adf4f74d556bd">
      <saml:SubjectLocality Address="192.168.0.189"
          DNSName="myserver.mycompany.com">
      </saml:SubjectLocality>
    </saml:AuthnStatement>
    <saml:AttributeStatement xmlns:xs=SCHEMA INFO>
      <saml:Attribute FriendlyName="clientId" Name="clientId"
          NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
        <saml:AttributeValue>1234</saml:AttributeValue>
      </saml:Attribute>
      <saml:Attribute FriendlyName="uid" Name="uid"
          NameFormat="urn:oasis:names:tc:SAML:2.0:attrname-format:basic">
        <saml:AttributeValue>the.user@mycompany.com
        </saml:AttributeValue>
      </saml:Attribute>
    </saml:AttributeStatement>
  </saml:Assertion>
</samlp:Response>
Figure 5: Sample SAML Assertion

For testing purposes with this sample assertion, the attributes toward the end
of the XML should be verified. In this example, two attributes are being passed:
clientID and uid. The clientID is a unique value that has been assigned by the
service provider indicating which company is sending the assertion. The uid in
this case is the email address of the user requesting the web resource. After
receiving and validating these values, the service provider application performs
SSO for the user. Once these values have been tested and accepted as accurate,
the SAML assertion can be encrypted if required, and the service provider
application can be fully tested.

There are important security aspects to be considered, given that the relying
party fully trusts the data in the


SAML assertion. The integrity of the message must be preserved from
man-in-the-middle attacks and other spoofs. To deal with this scenario, a SAML
assertion can be unsigned, signed, or signed and encrypted, depending on the
type of data and the sensitivity required per application. The SAML standard
allows for message integrity by supporting X.509 digital signatures in the
request/response transmissions. SAML also supports and recommends HTTP over SSL
3.0 and TLS 1.0 for situations where data confidentiality is required [8].

As analyzed by Hansen, Skriver, and Nielson, there are some major issues in the
SAML 1.1 browser/artifact profile using TLS security [9]. In SAML 2.0, this
profile was improved to repair a majority of these security issues; however, one
problem remains in the specification, as examined by Groß and Pfitzmann [10].
Groß and Pfitzmann devised a solution to this exploit by creating a new profile
that produces two artifacts, with the token being valid only when it consists of
both values, thus eliminating successful replay of a single token. Additional
work has also been performed on recently proposed attack scenarios. Gajek, Liao,
and Schwenk recommend two new stronger bindings for SAML artifacts to the TLS
security layer [11].

An additional scenario that could compromise data integrity is a replay attack
that intercepts the valid assertion and saves the data for impersonation at a
later time. Both the identity provider and the service provider should utilize
the SAML attributes NotBefore and NotOnOrAfter shown in Figure 5. These values
should define a time frame that is as short as possible, usually around 5
minutes. In addition, the identity provider can insert locality information into
the assertion, which the service provider can verify is valid against the IP
address of the requesting user. For additional security considerations, see the
OASIS security and privacy considerations standard [8].

4. Conclusions/Best Practices

In conclusion, the benefits of SAML are abundant. Organizations can easily yet
securely share identity information, and security is improved by eliminating the
possibility of shared accounts. User experience is enhanced by eliminating
additional usernames and passwords, which also results in fewer helpdesk calls
and lower administrative costs.

Companies should have documentation available to exchange when setting up SAML
associations, since each SAML use case can be customized per individual business
need. Service providers can use different security protocols, such as signed
only, versus signed and encrypted. In addition, some service providers may only
use the nameID section of the assertion, while others might use custom
attributes only. This upfront documentation can save troubleshooting time during
the implementation and testing phases of the project.

Furthermore, during testing phases it is helpful to use a sample test site for
the service provider and also to test with SAML assertions that are signed only.
The sample test site makes it possible to isolate a test of only the SAML
connection between the two partners, before testing of the application occurs.
Testing with signed-only assertions makes it possible to decode the hidden HTML
input field and validate the data being passed to the service provider. This
ensures the correct data in the assertion is sent and can be tested prior to the
service provider site being fully prepared for testing.

Additionally, using SAML metadata is very helpful since it eliminates typos and
errors when setting up the partner entity. These metadata files can help the
identity provider understand exactly what the service provider needs in the SAML
assertion. Both the identity provider and service provider should utilize
metadata files, not only to speed up manual work when entering data into the
federation software, but also to reduce human error.

The OASIS Security Services Technical Committee continues to improve upon the
current SAML 2.0 standard by developing new profiles to possibly be used in
later releases. For example, one area OASIS has already improved upon was a
supplement to the metadata specifications that added new elements and descriptor
types. Both identity providers and service providers should be aware of any
changes to SAML standards that are ratified by OASIS. Staying current and not
deviating from the standards helps to ensure compatibility, resulting in fewer
customized configurations between organizations.
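
The testing and replay-protection advice in this paper (decode the captured
SAMLResponse field, read the attributes, and keep the NotBefore/NotOnOrAfter
window short) can be sketched in Python as follows. This is a minimal sketch
under stated assumptions: the function names, the sample response, and the
5-minute limit passed as max_window are illustrative, and signature validation,
which a real service provider must perform before trusting any value, is
deliberately left out.

```python
import base64
import xml.etree.ElementTree as ET
from datetime import datetime, timedelta, timezone
from urllib.parse import quote, unquote

NS = {"saml": "urn:oasis:names:tc:SAML:2.0:assertion"}


def decode_saml_response(form_value):
    """URL-decode, then Base64-decode, a captured SAMLResponse form field."""
    return ET.fromstring(base64.b64decode(unquote(form_value)))


def parse_instant(text):
    """Parse a UTC timestamp such as 2009-04-22T12:33:36Z."""
    parsed = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc)


def within_validity_window(response, now, max_window=timedelta(minutes=5)):
    """Accept only if the window is short and `now` falls inside it."""
    cond = response.find("saml:Assertion/saml:Conditions", NS)
    if cond is None:
        return False
    not_before = parse_instant(cond.get("NotBefore"))
    not_on_or_after = parse_instant(cond.get("NotOnOrAfter"))
    if not_on_or_after - not_before > max_window:
        return False  # window longer than the assumed policy allows
    return not_before <= now < not_on_or_after


def read_attributes(response):
    """Collect attribute name/value pairs from the AttributeStatement."""
    path = "saml:Assertion/saml:AttributeStatement/saml:Attribute"
    attributes = {}
    for attr in response.findall(path, NS):
        value = attr.find("saml:AttributeValue", NS)
        attributes[attr.get("Name")] = (value.text or "").strip()
    return attributes


# Hypothetical signed-only response, reduced to the parts used above.
SAMPLE = """<samlp:Response
    xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"
    xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"
    ID="r1" Version="2.0" IssueInstant="2009-04-22T12:33:36Z">
  <saml:Assertion ID="a1" Version="2.0" IssueInstant="2009-04-22T12:33:36Z">
    <saml:Conditions NotBefore="2009-04-22T12:28:36Z"
                     NotOnOrAfter="2009-04-22T12:33:36Z"/>
    <saml:AttributeStatement>
      <saml:Attribute Name="clientId">
        <saml:AttributeValue>1234</saml:AttributeValue>
      </saml:Attribute>
      <saml:Attribute Name="uid">
        <saml:AttributeValue>the.user@mycompany.com</saml:AttributeValue>
      </saml:Attribute>
    </saml:AttributeStatement>
  </saml:Assertion>
</samlp:Response>"""

# Simulate the value a browser add-on would show for the hidden form field.
captured = quote(base64.b64encode(SAMPLE.encode("utf-8")).decode("ascii"))

response = decode_saml_response(captured)
print(read_attributes(response))
print(within_validity_window(
    response, datetime(2009, 4, 22, 12, 30, 0, tzinfo=timezone.utc)))
```

The window check uses a half-open interval, so an assertion presented exactly at
NotOnOrAfter is rejected, which matches the meaning of the attribute name.
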


References

[1] OASIS Frequently Asked Questions. “http://www.oasis-open.org/who/faqs.php”,
    2009.
[2] P. Madsen. SAML 2: The Building Blocks of Federated Identity.
    “http://www.xml.com/pub/a/2005/01/12/saml2.html”, 2005.
[3] Differences Between SAML V2.0 and SAML V1.1.
    “https://spaces.internet2.edu/display/SHIB/SAMLDiffs”, Feb. 2007.
[4] N. Ragouzis et al. Security Assertion Markup Language (SAML) V2.0 Technical
    Overview.
    “http://www.oasis-open.org/committees/download.php/22553/sstc-saml-tech-overview.pdf”,
    Feb. 2007.
[5] S. Cantor et al. Metadata for the OASIS Security Assertion Markup Language
    (SAML) V2.0.
    “http://docs.oasis-open.org/security/saml/v2.0/saml-metadata-2.0-os.pdf”,
    March 2005.
[6] M. Theimer. HttpFox 0.8.4.
    “https://addons.mozilla.org/en-US/firefox/addon/6647”, 2009.
[7] HttpWatch. “http://www.httpwatch.com/”, 2009.
[8] F. Hirsch et al. Security and Privacy Considerations for the OASIS Security
    Assertion Markup Language (SAML) V2.0.
    “http://docs.oasis-open.org/security/saml/v2.0/saml-sec-consider-2.0-os.pdf”,
    March 2005.
[9] S. M. Hansen, J. Skriver, and H. R. Nielson. “Using static analysis to
    validate the SAML single sign-on protocol”, in Proceedings of the 2005
    workshop on Issues in the theory of security (WITS ’05), 2005, pages 27–40.
[10] T. Groß and B. Pfitzmann, “SAML Artifact Information Flow Revisited”.
    Research Report RZ 3643 (99653), IBM Research,
    “http://www.zurich.ibm.com/security/publications/2006/GrPf06.SAML-Artifacts.rz3643.pdf”,
    2006.
[11] S. Gajek, L. Liao, and J. Schwenk. “Stronger TLS bindings for SAML
    assertions and SAML artifacts”, in Proceedings of the 2008 ACM Workshop on
    Secure Web Services (SWS ’08), 2008, pages 11-20.

Kelly D. Lewis graduated with a B.S. of Computer Engineering and Computer
Science at the University of Louisville in 2001. She received her M. Eng. at the
University of Louisville in the same discipline in 2005, publishing a thesis
titled “Student Performance Evaluation Package using a Web Interface and a
Database”. She started her Information Technology career in 1999 with the United
States Army Research Institute. She has been employed by Brown-Forman
Corporation for the last 8 years, has worked as a Systems Administrator and
Network Engineer, and presently holds a Security Analyst position in Information
Security. Her focus is on network security, automation, and single sign-on
technologies.

James E. Lewis graduated with a B.A. in Computer Science from Hanover College in
1994, and earned an M.S. in Computer Science from the University of Louisville
in 1996, with a thesis focusing on expert systems and networking. He received a
Ph.D. in Computer Science and Engineering from the University of Louisville in
2003, publishing a dissertation with an emphasis in distributed genetic
algorithms. He started teaching in 1995, and is currently an Assistant Professor
in the Department of Engineering Fundamentals at the University of Louisville’s
Speed School of Engineering, where he received this appointment in 2004. He has
fourteen publications on various topics including distributed algorithms,
intelligent system design, and engineering education that were published in
national and international conference proceedings. He has also been invited to
present on critical thinking in engineering education at two conferences. He has
been awarded two research grants for his critical thinking and case study
initiatives. He is a member of the ACM and ASEE organizations. His research
interests include parallel and distributed computer systems, cryptography,
security design, engineering education, and technology used in the classroom.


An Efficient Secure Multimodal Biometric Fusion Using Palmprint and Face Image
Nageshkumar.M, Mahesh.PK and M.N. Shanmukha Swamy
1
Department of Electronics and Communication,
J.S.S. research foundation, Mysore University, Mysore-6
nageshkumar79m@gmail.com
2
Department of Electronics and Communication,
J.S.S. research foundation, Mysore University, Mysore-6
mahesh24pk@gmail.com
3
Department of Electronics and Communication,
J.S.S. research foundation, Mysore University, Mysore-6
mnsjce@gmail.com

Abstract
Biometrics-based personal identification is regarded as an effective method for
automatically recognizing, with high confidence, a person’s identity. Multimodal
biometric systems consolidate the evidence presented by multiple biometric
sources and typically provide better recognition performance compared to systems
based on a single biometric modality. This paper proposes an authentication
method for a multimodal biometric system using two traits, i.e. face and
palmprint. The proposed system is designed for applications where the training
data contains a face and a palmprint. Integrating the palmprint and face
features increases the robustness of person authentication. The final decision
is made by fusion at the matching score level, in which feature vectors are
created independently for query measures and are then compared to the enrolment
templates, which are stored during database preparation. The multimodal
biometric system is developed through fusion of face and palmprint recognition.

Keywords: Biometrics, multimodal, face, palmprint, fusion module, matching
module, decision module.

1 INTRODUCTION

Multimodal biometric authentication, which identifies an individual using
physiological and/or behavioral characteristics such as face, fingerprints, hand
geometry, iris, retina, vein and speech, is one of the most attractive and
effective methods. These methods are more reliable and capable than
knowledge-based (e.g. password) or token-based (e.g. key) techniques, since
biometric features are hardly stolen or forgotten.

However, a single biometric feature sometimes fails to be exact enough for
verifying the identity of a person. By combining multiple modalities, enhanced
performance and reliability can be achieved. Due to its promising applications
as well as the theoretical challenges, multimodal biometrics has drawn more and
more attention in recent years [1]. Face and palmprint multimodal biometrics are
advantageous due to the use of non-invasive and low-cost image acquisition. In
this method we can easily acquire face and palmprint images using two touchless
sensors simultaneously. Existing studies in this approach [2, 3] employ holistic
features for face representation, and their results were reported on small data
sets.

A multimodal system also provides anti-spoofing measures by making it difficult
for an intruder to spoof multiple biometric traits simultaneously. However, an
integration scheme is required to fuse the information presented by the
individual modalities.

This paper presents a novel fusion strategy for personal identification using
face and palmprint biometrics [8] at the feature level fusion scheme. The
proposed paper shows that integration of face and palmprint biometrics can
achieve higher performance that may not be possible using a single biometric
indicator alone. This paper presents a new method called canonical form based on
PCA, which gives better performance and better accuracy for both traits (face &
palmprint).

The rest of this paper is organized as follows. Section 2 presents the system
structure, which is used to increase the performance of individual biometric
trait; multiple


classifiers are combined using matching scores. Section 3 presents feature
extraction using canonical form based on PCA. In Section 4, the individual
traits are fused at the matching score level using the sum of score technique.
Finally, the experimental results are given in Section 5. Conclusions are given
in the last section.

2 SYSTEM STRUCTURE

The multimodal biometrics system is developed using two traits (face &
palmprint) as shown in Figure 1. For both face & palmprint recognition the paper
proposes a new approach called canonical form based on PCA method for feature
extraction. The matching score for each trait is calculated by using Euclidean
distance. The modules based on individual traits return an integer value after
matching the templates and query feature vectors. The final score is generated
by using the sum of score technique at fusion level, which is then passed to the
decision module. The final decision is made by comparing the final score with a
threshold value at the decision module.

Figure1. Block diagram of face and palmprint multimodal biometric system

3 FEATURE EXTRACTION USING CANONICAL FORM BASED ON PCA APPROACH

The “Eigenface” or “Eigenpalm” method proposed by Turk and Pentland [5] [6] is
based on the Karhunen-Loeve expansion and is motivated by the earlier work of
Sirovitch and Kirby [7][8] for efficiently representing images. The Eigen method
presented by Turk and Pentland finds the principal components (Karhunen-Loeve
expansion) of the image distribution, i.e. the eigenvectors of the covariance
matrix of the set of images. These eigenvectors can be thought of as a set of
features which together characterize the variation between images.

Let an image I(x, y) be a two dimensional array of intensity values, or
equivalently a vector of dimension n. Let the training set of images be
I_1, I_2, I_3, ..., I_N. The average image of the set is defined by

$$ \Psi = \frac{1}{N} \sum_{i=1}^{N} I_i \tag{1} $$

Each image differs from the average by the vector \phi_i = I_i - \Psi. This set
of very large vectors is subjected to principal component analysis, which seeks
a set of K orthonormal vectors V_k, k = 1, ..., K, and their associated
eigenvalues \lambda_k which best describe the distribution of the data. The
vectors V_k and scalars \lambda_k are the eigenvectors and eigenvalues of the
covariance matrix:

$$ C = \frac{1}{N} \sum_{i=1}^{N} \phi_i \phi_i^T = A A^T \tag{2} $$

where the matrix A = [\phi_1, \phi_2, ..., \phi_N]. Finding the eigenvectors of
the n x n matrix C is computationally intensive. However, the eigenvectors of C
can be determined by first finding the eigenvectors of a much smaller matrix of
size N x N and taking a linear combination of the resulting vectors [6].

The canonical method proposed in this paper is based on eigenvalues and
eigenvectors. These eigenvalues can be thought of as a set of features which
together characterize the variation between images.

Let Q be a quadratic form given by

$$ Q = C^T I C = \sum_{i=1}^{n} \sum_{j=1}^{n} a_{ij} c_i c_j \tag{3} $$
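
The PCA step of this section (average image, difference vectors, and the
smaller N x N matrix used in place of the n x n covariance matrix) can be
sketched in plain Python as follows. This is only an illustrative sketch, not
the authors' implementation: the function names, the tiny 4-pixel "images", and
the use of power iteration to obtain a single leading eigenvector are all
assumptions made to keep the example self-contained.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def mean_image(images):
    """Average image Psi of equation (1)."""
    count = len(images)
    return [sum(img[k] for img in images) / count
            for k in range(len(images[0]))]


def top_eigenpair(matrix, iterations=500):
    """Power iteration: dominant eigenpair of a symmetric PSD matrix."""
    # Arbitrary non-constant start vector (an assumption of this sketch).
    v = [1.0 + 0.1 * k for k in range(len(matrix))]
    for _ in range(iterations):
        w = [dot(row, v) for row in matrix]
        norm = math.sqrt(dot(w, w))
        v = [x / norm for x in w]
    eigenvalue = dot(v, [dot(row, v) for row in matrix])
    return eigenvalue, v


def leading_eigen_image(images):
    """Leading eigenvector of C, computed via the small N x N matrix."""
    psi = mean_image(images)
    # Difference vectors phi_i = I_i - Psi.
    phi = [[p - m for p, m in zip(img, psi)] for img in images]
    count = len(phi)
    # Small matrix A^T A / N instead of the n x n matrix C = A A^T / N.
    small = [[dot(phi[i], phi[j]) / count for j in range(count)]
             for i in range(count)]
    eigenvalue, v = top_eigenpair(small)
    # Map back: u = A v is an eigenvector of C with the same eigenvalue.
    u = [sum(v[i] * phi[i][k] for i in range(count))
         for k in range(len(psi))]
    norm = math.sqrt(dot(u, u))
    return eigenvalue, [x / norm for x in u], phi


# Three hypothetical 4-pixel images, each flattened to a vector.
images = [[1.0, 2.0, 3.0, 4.0],
          [2.0, 4.0, 6.0, 9.0],
          [0.0, 1.0, 0.0, 1.0]]
eigenvalue, eigen_image, phi = leading_eigen_image(images)
print(eigenvalue, eigen_image)
```

The mapping works because if (A^T A / N) v = \lambda v, then multiplying both
sides by A gives C (A v) = \lambda (A v), so A v is an eigenvector of C. Here the
1/N factor of equation (2) is kept explicitly in both matrices.
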


Therefore, there are n eigenvectors corresponding to the n eigenvalues.

Let \hat{P} be the normalized modal matrix of I. The diagonal matrix is given by

$$ \hat{P}^{-1} I \hat{P} = D, \quad \text{where } I = \hat{P} D \hat{P}^{-1} \tag{4} $$

Then

$$ Q = C^T I C = C^T (\hat{P} D \hat{P}^{-1}) C = (C^T \hat{P}) \, D \, (\hat{P}^{-1} C) \tag{5} $$

The above equation is known as a canonical form, sum of squares form, or
principal axes form.

The following steps are considered for the feature extraction:

(1) Select the test image for the input
(2) Pre-process the image (only for palm image)
(3) Determine the eigenvalues and eigenvectors of the image
(4) Use the canonical form for the feature extraction.

3.1 EUCLIDEAN DISTANCE: Let an arbitrary instance x be described by the feature
vector

$$ x = [a_1(x), a_2(x), \ldots, a_n(x)] $$

where a_r(x) denotes the value of the r-th attribute of instance x. Then the
distance between two instances x_i and x_j is defined to be d(x_i, x_j):

$$ d(x_i, x_j) = \sqrt{\sum_{r=1}^{n} \left( a_r(x_i) - a_r(x_j) \right)^2} \tag{6} $$

4 FUSION

The different biometric systems can be integrated to improve the performance of
the verification system. The following steps are performed for fusion:

(1) Given the query image as input, features are extracted by the individual
    recognizers.
(2) The weights α and β are calculated.
(3) Finally the sum of score technique is applied for combining the matching
    scores of the two traits, i.e. face and palmprint. Thus the final score
    MS_FINAL is given by

$$ MS_{FINAL} = \frac{1}{2} \left( \alpha \cdot MS_{Face} + \beta \cdot MS_{Palm} \right) \tag{7} $$

where α and β are the weights assigned to both the traits. The final matching
score (MS_FINAL) is compared against a certain threshold value to recognize the
person as genuine or an impostor.

5 EXPERIMENTAL RESULTS

We evaluate the proposed multimodal system on a data set including 720 pairs of
images from 120 subjects. The training database contains face & palmprint images
for each subject.
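
The matching and fusion steps of Sections 3.1 and 4 (Euclidean distance per
trait, weighted sum of scores, threshold decision) can be sketched as below.
This is a minimal sketch under stated assumptions: the feature vectors, weights
and threshold are made-up values, the function names are hypothetical, and the
sketch assumes that the matching scores are distances, so a smaller fused score
means a closer match. The paper does not state which direction its threshold
comparison takes.

```python
import math


def euclidean_distance(a, b):
    """Equation (6): Euclidean distance between two feature vectors."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def fuse_scores(ms_face, ms_palm, alpha, beta):
    """Equation (7): weighted sum of the two matching scores, halved."""
    return 0.5 * (alpha * ms_face + beta * ms_palm)


def decide(final_score, threshold):
    """Assumption: scores are distances, so smaller means a better match."""
    return "genuine" if final_score <= threshold else "impostor"


# Hypothetical enrolled templates and query features for one subject.
face_template, face_query = [0.0, 0.0], [3.0, 4.0]
palm_template, palm_query = [1.0, 1.0, 1.0], [1.0, 1.0, 2.0]

ms_face = euclidean_distance(face_template, face_query)
ms_palm = euclidean_distance(palm_template, palm_query)
final_score = fuse_scores(ms_face, ms_palm, alpha=1.0, beta=1.0)
print(final_score, decide(final_score, threshold=4.0))
```

With equal weights the fused score is simply the mean of the two distances;
unequal weights let the more reliable trait dominate the decision.
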


Fig.2. Canonical form based Palm images.
(a) Original image (b) Grey image (c) Resized image (d) Normalized modal image
(e) Diagonalization image

Fig.3. Canonical form based Face images.
(a) Original image (b) Grey image (c) Resized image (d) Normalized modal image
(e) Diagonalization image

The multimodal system has been designed at multi-classifier & multimodal level.
At multi-classifier level, multiple algorithms are combined for better results.
In the first experiment the individual systems were developed and tested for
FAR, FRR & accuracy. Table 1 shows the FAR, FRR & accuracy of the systems.

Table1: The Accuracy, FAR, FRR of face & palmprint

In the last experiment both the traits are combined at matching score level
using the sum of score technique. The results are found to be very encouraging
and promising for research in this field. The overall accuracy of the system is
more than 97%, with FAR & FRR of 2.4% & 0.8% respectively.

6 CONCLUSION

Biometric systems are widely used to overcome the limitations of traditional
methods of authentication. But a unimodal biometric system can fail when the
biometric data for a particular trait is inadequate. Thus the individual scores
of two traits (face & palmprint) are combined at classifier level and trait
level to develop a multimodal biometric system. The performance table shows that
the multimodal system performs better as compared to unimodal biometrics, with
accuracy of more than 98%.

REFERENCES

[1] Ross.A.A, Nandakumar.K, Jain.A.K. Handbook of Multibiometrics.
    Springer-Verlag, 2006.
[2] Kumar.A, Zhang.D. Integrating palmprint with face for user authentication.
    In Proc. Multi Modal User Authentication Workshop, pages 107–112, 2003.
[3] Feng.G, Dong.K, Hu.D, Zhang.D. When Faces Are Combined with Palmprints: A
    Novel Biometric Fusion Strategy. In Proceedings of ICBA, pages 701–707,
    2004.
[4] G. Feng, K. Dong, D. Hu & D. Zhang, When Faces are Combined with Palmprints:
    A Novel Biometric Fusion Strategy, ICBA, pp. 701-707, 2004.
[5] M. Turk and A. Pentland, “Face Recognition using Eigenfaces”, in Proceeding
    of International Conference on Pattern Recognition, pp. 591-1991.
[6] M. Turk and A. Pentland, “Face Recognition using Eigenfaces”, Journals of
    Cognitive Neuroscience, March 1991.
[7] L. Sirovitch and M. Kirby, “Low-dimensional Procedure for the
    Characterization of Human Faces”, Journals of the Optical Society of
    America, vol.4, pp. 519-524, March 1987.
[8] Kirby.M, Sirovitch.L. “Application of the Karhunen-Loeve Procedure for the
    Characterization of Human Faces”, IEEE Transaction on Pattern Analysis and
    Machine Intelligence, vol. 12, pp. 103-108, January 1990.
[9] Daugman.J.G, “High Confidence Visual Recognition of Persons by a Test of
    Statistical Independence”, IEEE Trans. Pattern Analysis and Machine
    Intelligence, vol. 15, no. 11, pp. 1148-1161, Nov. 1993.

Nageshkumar M., graduated in Electronics and

Communication from Mysore University in 2003, received the M-Tech degree in
Computer Science & Engineering from V.T.U., Belgaum, and is presently pursuing a
Ph.D under Mysore University. He was a lecturer in J.V.I.T.

Mahesh. P.K., graduated in Electronics and Communication from Bangalore
University in 2000, received the M-Tech degree in VLSI design & Embedded Systems
from VTU Belgaum, and is presently pursuing a Ph.D under Mysore University. He
was a lecturer in J.S.S.A.T.E., Noida and later Asst. Professor in J.V.I.T.

Dr. M.N.Shanmukha Swamy, graduated in Electronics and Communication from Mysore
University in 1978, received the M-Tech degree in Industrial Electronics from
Mysore University and then received a PhD from Indian Institute of Science,
Bangalore. Presently he is working as a Professor in S.J.C.E., Mysore. So far he
has more than 10 research papers published in journals, articles, books and
conference proceedings.


DPRAODV: A DYNAMIC LEARNING SYSTEM AGAINST BLACKHOLE ATTACK IN AODV BASED MANET
Payal N. Raj, Prashant B. Swadas
1
Computer Engineering Department, SVMIT
Bharuch, Gujarat, India
payalnraj@gmail.com
2
Computer Engineering Department, B.V.M.
Anand, Gujarat, India
prashantswadas@yahoo.com

Abstract

Security is an essential requirement in mobile ad hoc networks (MANETs) to
provide protected communication between mobile nodes. The unique characteristics
of MANETs create a number of consequential challenges to their security design.
To overcome these challenges, there is a need to build a multifence security
solution that achieves both broad protection and desirable network performance.
MANETs are vulnerable to various attacks; blackhole is one of the possible
attacks. Black hole is a type of routing attack where a malicious node
advertises itself as having the shortest path to all nodes in the environment by
sending a fake route reply. By doing this, the malicious node can capture the
traffic from the source node. It can be used as a denial-of-service attack where
it can drop the packets later. In this paper, we propose DPRAODV (Detection,
Prevention and Reactive AODV) to prevent security threats of blackhole by
notifying other nodes in the network of the incident. The simulation results in
ns2 (ver-2.33) demonstrate that our protocol not only prevents blackhole attack
but consequently improves the overall performance of (normal) AODV in presence
of black hole attack.

Keywords: MANETs, AODV, Routing protocol, blackhole attack.

1. Introduction

Mobile ad hoc network (MANET) is one of the recent active fields and has
received considerable attention because of its self-configuration and
self-maintenance. Early research assumed a friendly and cooperative wireless
network environment. As a result, it focused on problems such as wireless
channel access and multihop routing. But security has become a primary concern
in order to provide protected communication between mobile nodes in a hostile
environment. Although mobile ad hoc networks have several advantages over wired
networks, on the other side they pose a number of non-trivial challenges to the
security design as they are more vulnerable than wired networks [1]. These
challenges include open network architecture, shared wireless medium, demanding
resource constraints, and highly dynamic network topology. In this paper, we
consider a fundamental security problem in MANETs: protecting the basic
functionality of delivering data bits from one node to another. Nodes help each
other in conveying information to and fro, thereby creating a virtual set of
connections between each other. Routing protocols play an imperative role in the
creation and maintenance of these connections. In contrast to wired networks,
each node in an ad hoc network acts like a router and forwards packets to other
peer nodes. The wireless channel is accessible to both legitimate network users
and malicious attackers. As a result, there is a blurry boundary separating the
inside network from the outside world.

Many different types of routing protocols have been developed for ad hoc
networks and have been classified into two main categories by Royer and Toh
(1999) as proactive (periodic) protocols and reactive (on-demand) protocols. In
a proactive routing protocol, nodes periodically exchange routing information
with other nodes in an attempt to have each node always know a current route to
all destinations [2]. In a reactive protocol, on the other hand, nodes exchange
routing information only when needed, with a node attempting to discover a route
to some destination only when it has a packet to send to that destination [3].
In addition, some ad hoc network routing protocols are hybrids of periodic and
on-demand mechanisms.

Wireless ad hoc networks are vulnerable to various attacks. These include
passive eavesdropping, active interfering, impersonation, and denial-of-service.
A single solution cannot resolve all the different types of attacks in ad hoc
networks. In this paper, we have designed a novel method to detect blackhole
attack: DPRAODV, which isolates the malicious node from the network. We have
complemented the reactive system on every node on the


network. This agent stores the destination sequence number of incoming route
reply packets (RREPs) in the routing table and calculates the threshold value to
evaluate the dynamic training data in every time interval as in [4]. Our
solution makes the participating nodes realize that one of their neighbors is
malicious; the node thereafter is not allowed to participate in packet
forwarding operation.

In Section 2 of this paper, we summarize the basic operation of the AODV (Ad hoc
On-Demand Distance Vector Routing) protocol on which we base our work. In
Section 3, we discuss related work. In Section 4, we describe the effect of
blackhole attack in AODV. Section 5 presents the design of our protocol,
DPRAODV, that protects against blackhole attack. Section 6 discusses the
performance evaluation based on simulation experiments. Finally, Section 7
presents conclusion and future work.

2. Theoretical background of AODV

AODV is a reactive routing protocol; nodes that do not lie on active paths
neither maintain any routing information nor participate in any periodic routing
table exchanges. Further, the nodes do not have to discover and maintain a route
to another node until the two need to communicate, unless the former node is
offering its services as an intermediate forwarding station to maintain
connectivity between other nodes [3]. AODV has borrowed the concept of
destination sequence number from DSDV [5], to maintain the most recent routing
information between nodes.

Whenever a source node needs to communicate with another node for which it has
no routing information, the Route Discovery process is initiated by broadcasting
a Route Request (RREQ) packet to its neighbors. Each

packet. If a node receives more than one RREP, it updates its routing
information and propagates the RREP only if the RREP contains either a greater
destination sequence number than the previous RREP, or the same destination
sequence number with a smaller hop count. It suppresses all other RREPs it
receives. The source node starts the data transmission as soon as it receives
the first RREP, and then later updates its routing information if a better route
to the destination node is found. Each route table entry contains the following
information:

• Destination node
• Next hop
• Number of hops
• Destination sequence number
• Active neighbors for the route
• Expiration timer for the route table entry

The route discovery process is reinitiated to establish a new route to the
destination node if the source node moves in an active session. When a link is
broken, the node receives a notification, and a Route Error (RERR) control
packet is sent to all the nodes that use this broken link for further
communication. The source node then restarts the discovery process.

As the routing protocols typically assume that all nodes are cooperative in the
coordination process, malicious attackers can easily disrupt network operations
by violating the protocol specification. This paper discusses the blackhole
attack and provides routing security in AODV by purging the threat of blackhole
attacks.
3. Related works in securing AODV
neighboring node either responds the RREQ by sending a There are basically two approaches to secure MANET:
Route Reply (RREP) back to the source node or (1) Securing Ad hoc Routing and (2) Intrusion Detection
rebroadcasts the RREQ to its own neighbors after [7].
increasing the hop_count field. If a node cannot respond
by RREP, it keeps track of the routing information in order 3.1 Secure Routing
to implement the reverse path setup or forward path setup
[6]. The Secure Efficient Ad hoc Distance vector routing
The destination sequence number specifies the protocol (SEAD) [8] employs the use of hash chains to
freshness of a route to the destination before it can be authenticate hop counts and sequence numbers in DSDV.
accepted by the source node. Eventually, a RREQ will Another secure routing protocol, Ariadne[9] assumes the
arrive to node that possesses a fresh route to the existence of a shared secret key between two nodes based
destination. If the intermediate node has a route entry for on DSR (reactive) routing protocol. The Authenticated
the desired destination, it determines whether the route is Routing for Ad hoc networks (ARAN) is a standalone
fresh by comparing the destination sequence number in its protocol that uses cryptographic public-key certificates in
route table entry with the destination sequence number in order to achieve the security goals [10]. Security-Aware
the RREQ received. The intermediate node can use its Ad hoc Routing (SAR) uses security attributes such as
recorded route to respond to the RREQ by a RREP packet, trust values and relationships [11].
only if, the RREQ’s sequence number for the destination is The computation overhead involved in the above
greater than the recorded by the intermediate node. mentioned protocols is awful and often suffer from
Instead, the intermediate node rebroadcasts the RREQ scalability problems. As a preventive measure, the packets

are carefully signed, but an attacker can simply drop the packets passing through it; therefore,
secure routing cannot resist such internal attacks. Our solution therefore provides a reactive
scheme that triggers an action to protect the network from future attacks launched by this
malicious node.

3.2 Intrusion Detection System

Zhang and Lee [12] present an intrusion detection technique for wireless ad hoc networks that uses
cooperative statistical anomaly detection techniques. The use of anomaly-based detection
techniques results in too many false positives. Stamouli proposes an architecture for Real-Time
Intrusion Detection for Ad hoc Networks (RIDAN) [7]. The detection process relies on a state-based
misuse detection system; therefore, each node requires extra processing power and sensing
capabilities.

In [13], the method requires the intermediate node to send a Route Confirmation Request (CREQ) to
the next hop towards the destination. This operation can increase the routing overhead, resulting
in performance degradation. In [14], the source node verifies the authenticity of the node that
initiates the RREP by finding more than one route to the destination, so that it can recognize the
safe route to the destination. This method can cause routing delay, since a node has to wait for
RREP packets to arrive from more than two nodes. In [4], the feature used is dest_seq_no, which
reflects the trend of updating the threshold and hence the adaptive change in the network
environment.

Therefore, a method that can prevent the attack without increasing routing overhead and delay is
required. All the above-mentioned approaches except [4] use a static value for the threshold. To
resolve the problem, the threshold value should reflect the current network environment by
updating its value. In addition, our solution ensures that a node, once detected as malicious,
cannot participate in forwarding and sending data packets in the network.

4. Description of Blackhole attack

MANETs are vulnerable to various attacks. General attack types are the threats against the
physical, MAC, and network layers, which are the most important layers for the routing mechanism
of the ad hoc network. Attacks in the network layer generally have two purposes: not forwarding
packets, or adding and changing some parameters of routing messages, such as sequence number and
hop count. A basic attack that an adversary can execute is to stop forwarding data packets. As a
result, when the adversary is selected as a route, it prevents the communication from taking
place. In a blackhole attack, the malicious node waits for its neighbors to initiate a RREQ
packet. When the node receives the RREQ packet, it immediately sends a false RREP packet with a
modified, higher sequence number, so that the source node assumes that the node has a fresh route
towards the destination. The source node ignores the RREP packets received from other nodes and
begins to send the data packets over the malicious node. A malicious node draws all the routes
towards itself and does not forward any packet anywhere. This attack is called a blackhole because
it swallows all objects, here data packets [15].

[Figure 1: nodes S, A, B, C, D and M; legend: RREQ, RREP, Data]

Fig. 1 Blackhole attacks in MANETs

In figure 1, source node S wants to send data packets to a destination node D in the network. Node
M is a malicious node which acts as a blackhole. The attacker replies with a false RREP carrying a
modified, higher sequence number. So, data communication is initiated from S towards M instead of
D.

5. DPRAODV: Solution against blackhole attack

In normal AODV, the node that receives the RREP packet first checks the value of the sequence
number in its routing table. The RREP packet is accepted if its RREP_seq_no is higher than the one
in the routing table. Our solution does an additional check to find whether the RREP_seq_no is
higher than the threshold value. The threshold value is dynamically updated, as in [4], in every
time interval. If the value of RREP_seq_no is found to be higher than the threshold value, the
sending node is suspected to be malicious and the receiving node adds it to the black list. Having
detected an anomaly, the node sends a new control packet, ALARM, to its neighbors. The ALARM
packet has the blacklisted node as a parameter, so that the neighboring nodes know that RREP
packets from that node are to be discarded. Further, when any node receives a RREP packet, it
looks over the list; if the reply is from a blacklisted node, no processing is done for it. The
node simply ignores the blacklisted node and does not accept replies from it again. In this way,
the malicious node is isolated from the network by the ALARM packet. The continuous replies from
the malicious node are blocked, which results in less routing overhead. Moreover, unlike AODV, if
the node is found to be malicious, the routing table for that node is not updated, nor is the
packet forwarded to another node.
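
The RREP handling described in Section 5 can be sketched in a few lines of Python. This is only an
illustrative model under stated assumptions, not the ns-2 routing agent used in the paper: the
class and field names (Rrep, Node, route_seq, receive_alarm, receive_rrep) are hypothetical, the
threshold is treated as a value supplied from outside (its update follows [4] and is not modeled
here), and the comparison of RREP_seq_no against the threshold is taken literally from the text.

```python
from dataclasses import dataclass, field


@dataclass
class Rrep:
    """Illustrative route reply; only the fields the check needs."""
    sender: int   # node that generated the reply
    dest: int     # destination the advertised route leads to
    seq_no: int   # RREP_seq_no claimed by the sender


@dataclass
class Node:
    """Hypothetical node model (an assumption, not the paper's ns-2 agent)."""
    node_id: int
    threshold: float                                # supplied externally, updated as in [4]
    route_seq: dict = field(default_factory=dict)   # dest -> sequence number in routing table
    blacklist: set = field(default_factory=set)     # nodes suspected to be malicious
    neighbors: list = field(default_factory=list)   # neighboring Node objects

    def receive_alarm(self, suspect: int) -> None:
        # The ALARM packet carries the blacklisted node as its parameter.
        self.blacklist.add(suspect)

    def receive_rrep(self, rrep: Rrep) -> str:
        # Replies from blacklisted nodes are dropped without any processing.
        if rrep.sender in self.blacklist:
            return "ignored"
        # Normal AODV check: accept only a sequence number higher than the stored one.
        if rrep.seq_no <= self.route_seq.get(rrep.dest, -1):
            return "stale"
        # Additional DPRAODV check: a sequence number above the threshold is suspicious.
        if rrep.seq_no > self.threshold:
            self.blacklist.add(rrep.sender)
            for neighbor in self.neighbors:
                neighbor.receive_alarm(rrep.sender)
            return "blacklisted"   # routing table is not updated, packet not forwarded
        self.route_seq[rrep.dest] = rrep.seq_no
        return "accepted"
```

For example, with two neighboring nodes created with a threshold of 50, a reply with sequence
number 12 is accepted, while a reply with sequence number 180 puts its sender on the black list of
both nodes, and any later reply from that sender is ignored.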

The threshold value is dynamically updated using the data collected in the time interval. If only
the initial training data were used, the system could not adapt to the changing environment. The
threshold value is the average, in each time slot, of the difference in dest_seq_no between the
sequence number in the routing table and that in the RREP packet. The threshold value is updated
as soon as a newer node receives a RREP packet. When a new node receives a RREP for the first
time, it gets the updated value of the threshold. So our design not only detects the blackhole
attack but also tries to prevent it further, by updating the threshold so that it reflects the
real, changing environment. Other nodes are also informed about the malicious act by an ALARM
packet, and they react to it by isolating the malicious node from the network.

6. Evaluation of DPRAODV

6.1 Simulation Environment

For simulation, we have used the ns2 (v-2.33) network simulator [16]. Mobility scenarios are
generated using a Random waypoint model, varying from 10 to 70 nodes moving in a terrain area of
800m x 800m. Each node independently repeats this behavior, and mobility is varied by making each
node stationary for a period of pause time. The simulation parameters are summarized in Table 1.

Table 1: Simulation Parameters

Parameter                Value
Simulator                Ns-2 (ver. 2.33)
Simulation time          1000 s
Number of nodes          70
Routing Protocol         AODV
Traffic Model            CBR
Pause time               2 s
Maximum mobility         60 m/s
No. of sources           5
Terrain area             800m x 800m
Transmission Range       250m
No. of malicious node    1

A new Routing Agent is added in ns-2 to include the blackhole attack. In order to implement the
blackhole attack, the malicious node generates a random number between 15 and 200, adds the number
to the sequence number in the RREQ, and uses the result as the sequence number in its RREP. In our
simulation, communication is started between the source node and the destination node in the
presence of the malicious node. The node numbers of the source node, destination node and
malicious node are 2, 7 and 0 respectively.

6.2 Simulation Evaluation Methodology

The simulation is done to analyze the performance of the network under various parameters. The
metrics used to evaluate the performance are given below:
• Packet Delivery Ratio: The ratio of the data delivered to the destination to the data sent out
  by the source.
• Average End-to-end delay: The average difference between the time a packet is sent and the time
  it reaches the destination. It includes all the delays, in the source and each intermediate
  host, caused by routing discovery, queuing at the interface queue etc.
• Normalized routing overhead: This is the ratio of routing-related transmissions (RREQ, RREP,
  RERR etc.) to data transmissions in a simulation. A transmission is one node either sending or
  forwarding a packet. Either way, it is the routing load per unit of data successfully delivered
  to the destination.

6.3 Simulation Analysis and Results

Various network contexts are considered to measure the performance of a protocol. These contexts
are created by varying the following parameters in the simulation.
• Network size: variation in the number of mobile nodes.
• Traffic load: variation in the number of sources.
• Mobility: variation in the maximum speed.

[Figure 2: panels (a) and (b)]

Fig. 2 Impact of Mobility on the performance
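
As a rough illustration of the three metrics defined in Section 6.2, the following Python sketch
computes them from per-packet records. It is a minimal sketch under assumptions: the DataPacket
record and the function names are hypothetical, and the counts of routing and data transmissions
are assumed to have been extracted from a simulation trace beforehand; nothing here reproduces the
paper's actual ns-2 trace processing.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DataPacket:
    """Hypothetical record for one data packet sent by a source."""
    sent_at: float                 # time the source sent the packet (s)
    received_at: Optional[float]   # arrival time at the destination, None if dropped


def packet_delivery_ratio(packets: List[DataPacket]) -> float:
    """Data delivered to the destination divided by data sent by the source."""
    if not packets:
        return 0.0
    delivered = sum(1 for p in packets if p.received_at is not None)
    return delivered / len(packets)


def average_end_to_end_delay(packets: List[DataPacket]) -> float:
    """Mean of (arrival time - send time) over the packets that were delivered."""
    delays = [p.received_at - p.sent_at for p in packets if p.received_at is not None]
    return sum(delays) / len(delays) if delays else 0.0


def normalized_routing_overhead(routing_tx: int, data_tx: int) -> float:
    """Routing-related transmissions (RREQ, RREP, RERR, ...) per data transmission."""
    return routing_tx / data_tx if data_tx else float("inf")
```

Dropped packets, such as those swallowed by a blackhole node, lower the delivery ratio but do not
enter the delay average, which is one reason the text compares delay against normal AODV rather
than against AODV under attack.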

Figure 2a and 2b summarize the simulation of the effect of mobility on DPRAODV compared to normal
AODV. The PDR stays within acceptable limits, almost 4-5% lower than it should normally be, with
minimum overhead.

[Figure 3: panels (a), (b) and (c)]

Fig. 3 Impact of Network Size on the performance

All the above three contexts are simulated and tested to see the effect of network size on Packet
Delivery Ratio (PDR), Average End-to-end delay and Normalized Routing Overhead.

From figure 3a and b, we observe that, under blackhole attack, the PDR of DPRAODV is improved by
80-85% over AODV under attack, with Average End-to-end delay almost the same as normal AODV.

In Figure 3c, it is observed that there is a slight increase in Normalized Routing Overhead, which
is quite negligible. In AODV under attack, the delay will be less and routing overhead will be
quite high compared to normal AODV, so our comparison is between normal AODV and DPRAODV.

[Figure 4: panels (a), (b) and (c)]

Fig. 4 Impact of Traffic Load on the performance

From figure 4, it is clear that as the traffic load increases, the PDR of DPRAODV is approximately
60% higher than that of AODV under attack. As our solution generates ALARM packets, there is a
slight increase in Normalized Routing Overhead, with almost the same delay as normal AODV.

7. Conclusions

In DPRAODV, we have used a very simple and effective way of providing security in AODV against the
blackhole attack. From the graphs illustrated in the results, we can easily infer that the
performance of normal AODV drops in the presence of a blackhole attack. Our prevention scheme
detects the malicious node, isolates it from active data forwarding and routing, and reacts by
sending an ALARM packet to its neighbors. Our solution, DPRAODV, increases PDR with a minimum
increase in Average End-to-end Delay and Normalized Routing Overhead.

Acknowledgments

This work is sponsored by the Institute of Science and Technology for Advanced Studies and
Research (ISTAR), Vallabh Vidyanagar, Gujarat, India.

References
[1] Hao Yang, Haiyun Luo, Fan Ye, Songwu Lu, and Lixia Zhang. “Security in mobile ad hoc networks:
    Challenges and solutions”. IEEE Wireless Communications, February 2004.
[2] Shree Murthy and J. J. Garcia-Luna-Aceves. “An Efficient Routing Protocol for Wireless
    Networks”. Mobile Networks and Applications, 1(2):183–197, 1996.
[3] Charles E. Perkins and Elizabeth M. Royer. “Ad-Hoc On-Demand Distance Vector Routing”. In
    Proceedings of the Second IEEE Workshop on Mobile Computing Systems and Applications
    (WMCSA’99), pages 90–100, February 1999.
[4] Satoshi Kurosawa, Hidehisa Nakayama, Nei Kato, Abbas Jamalipour, and Yoshiaki Nemoto,
    “Detecting Blackhole Attack on AODV-based Mobile Ad Hoc Networks by Dynamic Learning Method”,
    International Journal of Network Security, Vol. 5, No. 3, pp. 338-346, Nov. 2007.
[5] C. Perkins and P. Bhagwat. “Routing over multihop wireless network for mobile computers”.
    SIGCOMM ’94: Computer Communications Review, pp. 234-244, Oct. 1994.
[6] C. E. Perkins, S. R. Das, and E. Royer, “Ad-hoc on Demand Distance Vector (AODV)”. March 2000,
    http://www.ietf.org/internal-drafts/draft-ietf-manet-aodv-05.txt
[7] Ioanna Stamouli, “Real-time Intrusion Detection for Ad hoc Networks”, Master’s thesis,
    University of Dublin, September 2003.
[8] Y.-C. Hu, D. B. Johnson, and A. Perrig, “SEAD: Secure Efficient Distance Vector Routing for
    Mobile Wireless Ad hoc Networks,” Proc. 4th IEEE Workshop on Mobile Computing Systems and
    Applications, Callicoon, NY, June 2002, pp. 3-13.
[9] Y.-C. Hu, A. Perrig, and D. B. Johnson, “Ariadne: A Secure On-Demand Routing Protocol for Ad
    hoc Networks,” Proc. 8th ACM Int’l. Conf. Mobile Computing and Networking (Mobicom’02),
    Atlanta, Georgia, September 2002, pp. 12-23.
[10] Kimaya Sanzgiri, Bridget Dahill, Brian Neil Levine, Clay Shields, and Elizabeth M.
    Belding-Royer, “A Secure Routing Protocol for Ad hoc Networks”, in Proceedings of the 10th
    IEEE International Conference on Network Protocols (ICNP’02), 2002.
[11] S. Yi, P. Naldurg, and R. Kravets, “Security-Aware Ad hoc Routing for Wireless Networks,”
    Proc. 2nd ACM Symp. Mobile Ad hoc Networking and Computing (Mobihoc’01), Long Beach, CA,
    October 2001, pp. 299-302.
[12] Y. Zhang and W. Lee, “Intrusion detection in wireless ad-hoc networks,” 6th Annual
    International Mobile Computing and Networking Conference Proceedings, 2000.
[13] S. Lee, B. Han, and M. Shin, “Robust routing in wireless ad hoc networks,” in ICPP Workshops,
    pp. 73, 2002.
[14] M. A. Shurman, S. M. Yoo, and S. Park, “Black hole attack in wireless ad hoc networks,” in
    ACM 42nd Southeast Conference (ACMSE’04), pp. 96-97, Apr. 2004.
[15] Semih Dokurer, “Simulation of Black hole attack in wireless Ad-hoc networks”, Master's
    thesis, Atılım University, September 2006.
[16] Kevin Fall and Kannan Varadhan (Eds.), “The ns Manual”, 2006, available from
    http://www-mash.cs.berkeley.edu/ns/

IJCSI CALL FOR PAPERS JANUARY 2010 ISSUE

The topics suggested by this issue can be discussed in terms of concepts, surveys, state of the
art, research, standards, implementations, running experiments, applications, and industrial
case studies. Authors are invited to submit complete unpublished papers, which are not under
review in any other conference or journal, in the following, but not limited to, topic areas.
See the authors guide for manuscript preparation and submission guidelines.

Accepted papers will be published online and indexed by Google Scholar, CiteSeerX, Directory
for Open Access Journal (DOAJ), Bielefeld Academic Search Engine (BASE), SCIRUS and more, and
authors will be provided with printed copies.

Deadline: 15th December 2009
Notification: 15th January 2010
Online Publication: 31st January 2010

• Evolutionary computation
• Industrial systems
• Autonomic and autonomous systems
• Bio-technologies
• Knowledge data systems
• Mobile and distance education
• Intelligent techniques, logics, and systems
• Knowledge processing
• Information technologies
• Internet and web technologies
• Digital information processing
• Cognitive science and knowledge agent-based systems
• Mobility and multimedia systems
• Systems performance
• Networking and telecommunications
• Software development and deployment
• Knowledge virtualization
• Systems and networks on the chip
• Context-aware systems
• Networking technologies
• Security in network, systems, and applications
• Knowledge for global defense
• Information Systems [IS]
• IPv6 Today - Technology and deployment
• Modeling
• Optimization
• Complexity
• Natural Language Processing
• Speech Synthesis
• Data Mining

All submitted papers will be judged based on their quality by the technical committee and
reviewers. Papers that describe research and experimentation are encouraged.
All paper submissions will be handled electronically and detailed instructions on submission
procedure are available on IJCSI website (http://www.ijcsi.org).

For other information, please contact IJCSI Managing Editor, (editor@ijcsi.org)


Website: http://www.ijcsi.org
The International Journal of Computer Science Issues (IJCSI) is a refereed journal for
scientific papers dealing with any area of computer science research. The journal was
established to assist the development of science, to provide fast publication and storage of
scientific materials and research results, and to represent the scientific conception of the
society.

It also provides a venue for researchers, students and professionals to submit ongoing
research and developments in these areas. Authors are encouraged to contribute to the journal
by submitting articles that illustrate new research results, projects, surveying works and
industrial experiences that describe significant advances in the field of computer science.

Indexing of IJCSI:

1. Google Scholar
2. Directory for Open Access Journals (DOAJ)
3. Bielefeld Academic Search Engine (BASE)
4. CiteSeerX
5. SCIRUS

Frequency of Publication: Monthly
