
PROCEEDINGS OF

THE THIRD INTERNATIONAL CONFERENCE ON
CONTEMPORARY ISSUES IN COMPUTER AND INFORMATION SCIENCES

THE INSTITUTE FOR ADVANCED STUDIES IN BASIC SCIENCES (IASBS)

Foreword

We are glad and proud to have held the International Conference on Contemporary Issues in Computer and Information Sciences for the third successive year, graced by your presence. Besides paying special attention to scientific progress in the field, the conference aims to foster better interaction between the different areas of computer science and everyday life, and regards this connection as essential for the progress of society. With this approach, the CICIS conference pays direct attention to several applied aspects of computer and information science. The third conference concentrates on graph and geometric algorithms, intelligent systems, bioinformatics, and IT and society, in addition to all areas of computer science.

What makes us prouder still is the coincidence of this conference with the 20th anniversary of the founding of the Institute for Advanced Studies in Basic Sciences, where outstanding scientific achievements are made in a friendly environment; this would never have happened without God's assistance and the notable efforts of its directors, teachers, researchers and students.

The 277 papers received indicate kind feedback and make us more determined. Of these, 45 papers (16.24%) were accepted for oral presentation, 77 papers (27.79%) for poster presentation, and 155 papers were rejected.

Besides IASBS, the Computer Society of Iran, the Iranian branch of the IEEE and the University of Zanjan have collaborated in and supported this conference, and we hope this improves the scientific results. Last but not least, we would like to thank our sponsors for their help and financial support: the Information Technology and Digital Media Development Center, the Statistics and Informatics Department of the Sanjesh Organization, Arameh Innovative Researchers, and Brown Walker Publisher.

Bahram Sadeghi Bigham
General Chair

Contents

Reducing Packet Overhead by Improved Tunneling-based Route Optimization Mechanism
Hooshiar Zolfagharnasab

Neural Network Learning based on Football Optimization Algorithm
Payam Hatamzadeh and Mohammad Reza Khayyambashi

Evaluating XML Retrieval Systems Using Methods of Averaging Precision and Recall at Rank Cut-offs
Marzieh Javadi and Hassan Naderi (p. 15)

Performability Improvement in Grid Computing with Artificial Bee Colony Optimization Algorithm
Neda Azadi and Mohammad Kalantari (p. 19)

Security Enforcement with Language-Based Security
Ali Ahmadian Ramaki, Shahin Shirmohammadzadeh Sahraeii and Reza Ebrahimi Atani (p. 26)

Application of the PSO-ANFIS Model for Time Series Prediction of Interior Daylight Illuminance
Hossein Babaee and Alireza Khosravi (p. 30)

Evaluating the impact of using several criteria for buffer management in VDTNs
Zhaleh Sadreddini, Mohammad Ali Jabraeil Jamali and Ali Asghar Pourhaji Kazem (p. 36)

Improvement of VDTNs Performance with Effective Scheduling Policy
Masumeh Marzaei Afshord, Mohammad Ali Jabraeil Jamali and Ali Asghar Pourhaji Kazem (p. 40)

Classification of Gene Expression Data using Multiple Ranker Evaluators and Neural Network
Zahra Roozbahani and Ali Katanforoush (p. 44)

Data mining with learning decision tree and Bayesian network for data replication in Data Grid
Farzaneh Veghari Baheri, Farnaz Davardoost and Vahid Ahmadzadeh (p. 49)

Design and Implementation of a three-node Wireless Network
Roya Derakhshanfar, Maisam M. Bassiri and S. Kamaledin Setarehdan (p. 54)

CEA Framework: A Comprehensive Enterprise Architecture Framework for middle-sized company
Elahe Najafi and Ahmad Baraani (p. 58)

Thick non-crossing paths in a polygon with one hole
Maryam Tahmasbi and Narges Mirehi (p. 64)

A Note on the 3-Sum Problem
Keivan Borna and Zahra Jalalian (p. 69)

Voronoi Diagrams and Inversion Geometry
Zahra Nilforoushan, Abolghasem Laleh and Ali Mohades (p. 74)

Selection of Effective Factors in Estimating Customers' Response to Mobile Advertising by Using AHP
Mehdi Seyyed Hamzeh, Bahram Sadeghi Bigham and Reza Askari Moghadam (p. 80)

An Obstacle Avoiding Approach for Solving Steiner Tree Problem on Urban Transportation Network
Ali Nourollah and Fatemeh Ghadimi (p. 84)

Black Hole Attack in Mobile Ad Hoc Networks
Kamal Bazargan (p. 89)

Improvement of the Modeling Airport Assignment Gate System Using Self-Adaptive Methodology
Masoud Arabfard, Mohamad Mehdi Morovati and Masoud Karimian Ravandi (p. 95)

A new model for solving capacitated facility location problem with overall cost of losing any facility and comparison of Particle Swarm Optimization, Simulated Annealing and Genetic Algorithm
Samirasadat Jamali Dinan, Fatemeh Taheri and Farhad Maleki (p. 100)

A hybrid method for collusion attack detection in OLSR based MANETs
Hojjat Gohargazi and Saeed Jalili (p. 104)

A Statistical Test Suite for Windows for Cryptography Purposes
R. Ebrahimi Atani, N. Karimpour Darav and S. Arabani Mostaghim (p. 109)

An Empirical Evaluation of Hybrid Neural Networks for Customer Churn Prediction
Razieh Qiasi, Zahra Roozbahani and Behrooz Minaei-Bidgoli (p. 114)

A Clustering Based Model for Class Responsibility Assignment Problem
Hamid Masoud, Saeed Jalili and S. M. Hossein Hasheminejad (p. 118)

A Power-Aware Multi-Constrained Routing Protocol for Wireless Multimedia Sensor Networks
Babak Namazi and Karim Faez (p. 123)

Mobile Learning: Features, Approaches and Opportunities
Faranak Fotouhi-Ghazvini and Ali Moeini (p. 127)

Predicting Crude Oil Price Using Particle Swarm Optimization (PSO) Based Method
Zahra Salahshoor Mottaghi, Ahmad Bagheri and Mehrgan Mahdavi (p. 131)

Image Steganalysis Based On Color Channels Correlation In Homogeneous Areas In Color Images
SeyyedMohammadAli Javadi and Maryam Hasanzadeh (p. 134)

Online Prediction of Deadlocks in Concurrent Processes
Elmira Hasanzade and Seyed Morteza Babamir (p. 138)

Fisher Based Eigenvector Selection in Spectral Clustering Using Google's Page Rank Procedure
Amin Allahyar, Hadi Sadoghi Yazdi and Soheila Ashkezari Toussi (p. 146)

Imperialist Competitive Algorithm for Neighbor Selection in Peer-to-Peer Networks
Shabnam Ebadi and Abolfazl Toroghi Haghighat (p. 151)

Different Approaches For Multi Step Ahead Traffic Prediction Based on Modified ANFIS
Shiva Rahimipour, Mahnaz Agha-Mohaqeq and Seyyed Mehdi Tashakkori Hashemi (p. 156)

E-service Quality Management in B2B e-Commerce Environment
Parvaneh Hajinazari and Abbass Asosheh (p. 161)

Calibration of METANET Model for Real-Time Coordinated and Integrated Highway Traffic Control using Genetic Algorithm: Tehran Case Study
Mahnaz Aghamohaqeqi, Shiva Rahimipour, Masoud Sa_lian and S. Mehdi Tashakori Hashemi (p. 165)

Designing An Expert System To Diagnose And Propose About Therapy Of Leukemia
Zohreh Mohammad Alizadeh Bakhshmandi and Armin Ghasem Azar (p. 171)

A Basic Proof Method For The Verification, Validation And Evaluation Of Expert Systems
Armin Ghasem Azar and Zohreh Mohammad Alizadeh Bakhshmandi (p. 175)

Point set embedding of some graphs with small number of bends
Maryam Tahmasbi and Zahra Abdi Reyhan (p. 180)

On The Pairwise Sums
Keivan Borna and Zahra Jalalian (p. 184)

Hyperbolic Voronoi Diagram: A Fast Method
Zahra Nilforoushan, Ali Mohades, Amin Gheibi and Sina Khakabi (p. 187)

Solving Systems of Nonlinear Equations Using The Cuckoo Optimization Algorithm
Mahdi Abdollahi, Shahriar Lotfi and Davoud Abdollahi (p. 191)

A Novel Model-Based Slicing Approach For Adaptive Softwares
Sanaz Sheikhi and Seyed Morteza Babamir (p. 195)

A novel approach to multiple resource discoveries in grid environment
Leyli Mohammad Khanli, Saeed Kargar and Hossein Kargar (p. 200)

HTML5 Security: Offline Web Application
Abdolmajid Shahgholi, HamidReza Barzegar and G. Praveen Babu (p. 205)

Earthquake Prediction by Study on Vital Signs of Animals in Wireless Sensor Network by using Multi Agent System
Media Aminian, Amin Moradi and Hamid Reza Naji (p. 209)

Availability analysis and improvement with Software Rejuvenation
Zahra Rahmani Ghobadi and Baharak Shakeri Aski (p. 213)

A fuzzy neuro-chaotic network for storing and retrieving patterns
Nasrin Shourie and Amir Homayoun Jafari (p. 219)

GSM Technology and security impact
Ahmad Sharifi and Mohsen Khosravi (p. 224)

MicTSP: An Efficient Microaggregation Algorithm Based On TSP
Reza Mortazavi and Saeed Jalili (p. 228)

Proposing a new method for selecting a model to evaluate effective factors on job production capabilities of central province industrial cooperatives using Data mining and BSC techniques
Peyman Gholami and Davood Noshirvani Baboli (p. 233)

A Complex Scheme For Target Tracking And Recovery Of Lost Targets In Cluster-Based Wireless Sensor Networks
Behrouz Mahmoudzadeh and Karim Faez (p. 237)

A Measure of Quality for Evaluation of Image Segmentation
Hakimeh Vojodi and Amir Masoud Eftekhary Moghadam (p. 241)

An Unsupervised Evaluation Method for Image Segmentation Algorithms
Hakimeh Vojodi and Amir Masoud Eftekhary Moghadam (p. 246)

Evaluate and improve the SPEA using fuzzy c-mean clustering algorithm
Pezhman Gholamnezhad and Mohammad Mehdi Ebadzadeh (p. 251)

Hypercube Data Grid: a new method for data replication and replica consistency in data grid
Tayebeh Khalvandi, Amir Masoud Rahmani and Seyyed Mohsen Hashemi (p. 255)

Exploiting Parameters of SLA to Allocate Resources for Bag of Task Applications in Cloud Environment
Masoud Salehpour and Asadollah Shahbahrami (p. 262)

Bus Arrival Time Prediction Using Bayesian Learning for Neural Networks
Farshad Bakhshandegan Moghaddam, Alireza Khanteimoory and Fatemeh Forutan Eghlidi (p. 267)

SRank: Shortest Path-Based Ranking in Semantic Network
Hadi Khosravi-Farsani, Mohammadali Nematbakhsh and George Lausen (p. 271)

RL Rank: A Connectivity-based Ranking Algorithm Using Reinforcement Learning
Elahe Khodadadian, Mohammad Ghasemzadeh and Vali Derhami (p. 276)

YABAC4.5: Yet Another Boosting Approach for C4.5 Algorithm
B. Shabani and H. Sajedi (p. 281)

A New Method for Automatic Language Identification In Trilingual Documents of Arabic, English, and Chinese with Different Fonts
Einolah Hatami and Karim Faez (p. 286)

Clustering in backtracking for solution of N-queen Problem
Samaneh Ahmadi, Vishal Kesri and Vaibhav Kesri (p. 290)

An Improved Phone Lattice Search Method for Triphone Based Keyword Spotting in Online Persian Telephony Speech
Maria Rajabzadeh, Shima Tabibian, Ahmad Akbari and Babak Nasersharif (p. 294)

Adaptive Gaussian Estimation of Distribution Algorithm
Shahram Shahraki and Mohammad-R. Akbarzadeh-T (p. 300)

A New Feature Transformation Method Based On Genetic Algorithm
Hannane Mahdavinataj and Babak Nasersharif (p. 304)

Evaluating the performance of energy aware tag anti collision protocols in RFID systems
Milad Haj Mirzaei and Masoud Ghiasbeigi (p. 310)

GPS GDOP Classification via Advanced Neural Network Training
H. Azami, S. Sanei and H. Alizadeh (p. 315)

Improving Performance of Software Fault Tolerance Techniques Using Multi-Core Architecture
Hoda Banki, Seyed Morteza Babamir, Azam Farokh and Mohamad Mehdi Morovati (p. 320)

An Introduction to an Architecture for a Digital-Traditional Museum
Reza Asad Nejhad, Mina Serajian, Mohsen Vahed and Seyyed Peyman Emadi (p. 326)

A Comparison of Transform-Domain Digital Image Watermarking Algorithms
Asadollah Shahbahrami, Mitra Abbasfard and Reza Hassanpour (p. 329)

Polygon partitioning for minimizing the maximum of geodesic diameters
Zahra Mirzaei Rad and Ali Mohades (p. 336)

Automatic Path-oriented Test Case Generation by considering Infeasible Paths
Shahram Moadab, Hasan Rashidi and Eslam Nazemi (p. 340)

Control Topology based on delay and traffic in wireless sensor networks
Bahareh Gholamiyan Yosef Abad and Masuod Sabaei (p. 345)

Two-stage Layout of workstations in an organization based clustering and using an evolutionary approach
Rana ChaieAsl, Shahriar Lotfi and Reza Askari Moghadam (p. 350)

CAB: Channel Available Bandwidth Routing Metric for Wireless Mesh Networks
Majid Akbari and Abolfazl Toroghi Haghighat (p. 355)

A PSO Inspired Harmony Search Algorithm
Farhad Maleki, Ali Mohades, F. Zare-Mirakabad, M. E. Shiri and Afsane Bijari (p. 360)

Repairing Broken RDF Links in the Web of Data by Superiors and Inferiors Sets
Mohammad Pourzaferani and Mohammad Ali Nematbakhsh (p. 365)

Palmprint Authentication Based on HOG and Kullback-Leibler
Ma. Yazdani, F. Moayyedi and Mi. Yazdani (p. 370)

A Simple and Efficient Fusion Model based on the Majority Criteria for Human Skin Segmentation
S. Mostafa Sheikholslam, Asadollah Shahbahrami, Reza PR Hasanzadeh and Nima Karimpour Darav (p. 374)

A New Memetic Fuzzy C-Means Algorithm For Fuzzy Clustering
Fatemeh Golichenari and Mohammad Saniee Abadeh (p. 380)

Cross-Layer Architecture Design for long-range Quantum Nanonetworks
Aso Shojaie, Mehdi Dehghan Takhtfooladi, Mohsen Safaeinezhad and Ebrahim SaeediNia (p. 385)

Generation And Configuration Of PKI Based Digital Certificate Based On Robust OpenCA Web Interface
Parisa Taherian and Mohammad Hossein Karimi (p. 391)

Network Intrusion Detection Using Tree Augmented Naive-Bayes
R. Najafi and Mohsen Afsharchi (p. 396)

Dynamic Fixed-Point Arithmetic: Algorithm and VLSI Implementation
Mohammad Haji Seyed Javadi, Hamid Reza Mahdiani and Esmaeil Zeinali Kh. (p. 403)

Cost of Time-shared Policy in Cloud Environment
Gh. Dastghibyfard and Abbas Horri (p. 408)

Using Fuzzy Classification System for Diagnosis of Breast Cancer
Maryam Sadat Mahmoodi, Bahram Sadeghi Bigham and Adel Najafi-Aghblagh Rostam Khan (p. 412)

Government above the Clouds: Cloud Computing Based Approach to Implement E-Government
Toofan Samapour and Mohsen Solhnia (p. 417)

Human Tracking-by-Detection using Adaptive Particle Filter based on HOG and Color Histogram
Fatemeh Rezaei and Babak H. Khalaj (p. 422)

Use of multi-agent system approach for concurrency control of transactions in distributed databases
Seyed Mehrzad Almasi, Hamid Reza Naji and Reza Ebrahimi Atani (p. 426)

Multi-scale Local Average Binary Pattern based Genetic algorithm (MLABPG) for face recognition
A. Hazrati Bishak and K. Faez (p. 430)

A Novel Method for Function Approximation in Reinforcement Learning
Bahar Haghighat, Saeed Bagheri Shouraki and Mohsen Firouzi (p. 435)

An Intelligent Hybrid Data Mining Method for Car-Parking Management
Sevila Sojudi, Susan Fatemieparsa, Reza Mahini, Parisa YosefZadehfard and Somayeh Ahmadzadeh (p. 443)

Iris Recognition with Parallel Algorithms Using GPUs
Meisam Askari, Reyhane Azimi and Hossein Ebrahimpour Komle (p. 448)

Improving Performance of Mandelbrot Set Using Windows HPC Cluster and MPI.NET
Azam Farokh, Hoda Banki, Mohamad Mehdi Morovati and Hossein Ebrahimpour Komle (p. 453)

The study of indices and spheres for implementation and development of trade single window in Iran
Elham Esmaeilpour and Noor Mohammad Yaghobi (p. 458)

Web Anomaly Detection Using Artificial Immune System and Web Usage Mining Approach
Masoumeh Raji, Vali Derhami and Reza Azmi (p. 462)

A Fast and Robust Face Recognition Approach Using Weighted Haar And Weighted LBP Histogram
Mohsen Biglari, F. Mirzaei and H. Ebrahimpour-Komleh (p. 467)

An Unsupervised Method for Change Detection in Breast MRI Images based on SOFM
Marzieh Salehi, Reza Azmi and Narges Norozi (p. 473)

A new image steganography method based on LSB replacement using Genetic Algorithm and chaos theory
Amirreza Falahi and Maryam Hasanzadeh (p. 478)

Providing a CACP Model for Web Services Composition
Parinaz Mobedi and Mehregan Mahdavi (p. 482)

Using Collaborative Filtering for Rate Prediction
Sonia Ghiasifard and Amin Nikanjam (p. 487)

A New Backbone Formation Algorithm For Wireless Ad-Hoc Networks Based On Cellular Learning Automata
Maryam Gholami and Mohammad Reza Meybodi (p. 492)

Solving Dominating Set Problem In Unit Disk Graphs By Genetic Algorithms
Azadeh Gholami, Mahmoud Shirazi and Bahram Sadeghi Bigham (p. 498)

Conflict Detection and Resolution in Air Traffic Management based on Graph Coloring Problem using Prioritization Method
Hojjat Emami and Farnaz Derakhshan (p. 504)

A Review of M-Health Approach for Chronic Disease Management
Marva Mirabolghasemi, N. A. Iahadi, Maziar Mirabolghasemi and Vida Zakerifardi (p. 509)

A New IIR Modeling by means of Genetic Algorithm
Tayebeh Mostajabi and Javad Poshtan (p. 514)

A New Similarity Measure for Improving Recommender Systems Based on Fuzzy Clustering and Genetic Algorithm
Fereshteh Kiasat and Parham Moradi (p. 518)

The lattice structure of Signed chip firing games and related models
A. Dolati, S. Taromi and B. Bakhshayesh (p. 525)

Tiling Finite Planes
Jalal Khairabadi, Rebvar Hosseini, Zohreh Mohammad Alizadeh and Bahram Sadeghi Bigham (p. 528)

J2ME And Mobile Database Design
Seyed Rebvar Hosseini, Lida Ahmadi, Bahram Sadeghi Bigham and Jalal Khairabadi (p. 532)

IIR Modeling via Skimpy Data and Genetic Algorithm
Tayebeh Mostajabi and Javad Poshtan (p. 536)

Concurrent overlap partitioning: A new Parallel Framework for Haplotype Inference with Maximum Parsimony
Mohsen Taheri, Alireza Meshkin and Mehdi Sadeghi (p. 540)

A Bayesian Neural Network for Price Prediction in Stock Markets
Sara Amini, Farzaneh Yahyanejad and Alireza Khanteymoori (p. 548)

Maintaining the Envelope of an Arrangement Fixed
Marzieh Eskandari and Marjan Abedin (p. 553)

Investigating and Recognizing the Barriers of Exerting E-Insurance in Iran Insurance Company According to the Model of Mirzai Ahar Najai (Case Study: Iran Insurance Company in Orumieh City)
Parisa Jafari, Hamed Hagtalab, Morteza Shokrzadeh and Hasan Danaie (p. 557)

Identifying and Prioritizing Effective Factors in Electronic Readiness of the Organizations for Accepting and Using Teleworking by Fuzzy AHP Technique (Case Study: Governmental and Semi-Governmental Organizations in Tabriz City)
Morteza Shokrzadeh, Naser Norouzi, Jabrael Marzi Alamdari and Alireza Rasouli (p. 561)

Hybrid Harmony Search for the Hop Constrained Connected Facility Location Problem
Bahareh Khazaei, Farzane Yahyanejad, Angeh Aslanian and S. Mehdi Hashemi (p. 566)

Gene Selection using Tabu Search in Prostate Cancer Microarray Data
Farzane Yahyanejad, Mehdi Vasighi, Angeh Aslanian and Bahareh Khazaei (p. 571)

BI Capabilities and Decision Environment in BI Success
Zahra Jafari, Mahmoud Shirazi and Mohammad Hosseion Hayati (p. 575)

Computation in Logic and Logic in Computation
Saeed Salehi (p. 580)

Rating System for Software based on International Standard Set ISO/IEC 25000
Hassan Alizadeh, Hossein Afsari and Bahram Sadeghi Bigham (p. 584)

TOMSAGA: TOolbox for Multiple Sequence Alignment using Genetic Algorithm
Farshad Bakhshandegan Moghaddam and Mahdi Vasighi (p. 589)

To enrich the life book of IT specialists through shaping living schema Strategy based on Balance-oriented Model
Mostafa Jafari (p. 595)

Reducing Packet Overhead by Improved Tunneling-based Route Optimization Mechanism

Hooshiar Zolfagharnasab
Department of Computer Engineering, University of Isfahan
Department of IT, Soroush Educational Complex
hoppico@eng.ui.ac.ir

Abstract: The common Mobile IPv6 mechanisms, bidirectional tunneling and route optimization, suffer from inefficient packet overhead when both nodes are mobile. Researchers have proposed methods to reduce per-packet overhead while remaining compatible with the standard mechanisms. In this paper, three Mobile IPv6 mechanisms are discussed with respect to their efficiency and performance. Following this discussion, a new mechanism called improved tunneling-based route optimization is proposed, and a performance analysis of packet overhead shows that the proposed mechanism has less overhead than the others. Analytical results indicate that improved tunneling-based route optimization transmits more payload because it sends packets with less overhead.

Keywords: Mobile IP; Route Optimization; Bidirectional Tunneling; Packet Overhead.

1 Introduction

Mobile IP is a technique that enables nodes to maintain a permanent IP address while they move through networks [1]. With the Mobile IP protocol, a communication can be established between a Mobile Node (MN) and a Corresponding Node (CN) regardless of their locations. The Mobile IP protocol supports transparency above the network layer, including the transport layer, which covers the maintenance of active TCP connections and UDP port bindings, and the application layer. Mobile IP is most often found in wireless WAN environments where users need to carry their mobile devices across multiple LANs with different IP addresses [2], [4]. Mobile IP is implemented in IPv6 via two mechanisms called bidirectional tunneling and route optimization [1], [8].

To enable mobility over IP protocols, the network layer of a mobile device must send messages to inform other devices about the location and the network through which it is wandering. Original packets from the upper network layers are embedded in packets containing mobile routing headers. Reducing mobility overhead allows more data to be sent with each packet, so several mechanisms are used to reduce it. In this paper, a new mechanism is proposed that reduces mobility overhead by reusing the address fields of the IP header twice.

2 Related Works

Some attempts have been made to improve security and performance in Mobile IP. C. Perkins proposed a security mechanism for binding updates between CN and MN in [5]. C. Vogt et al. [6] proposed proactive address testing in route optimization. In another respect, D. Le and J. Chang suggested reducing bandwidth usage by using a tunnel header instead of route optimization headers when both MN and CN are mobile nodes [7].

Corresponding Author: IT Manager at Soroush Educational Complex, Tehran, Iran, Tel: (+98) 912 539-4829

It should be noted that few papers have focused on bandwidth reduction in Mobile IP, while many proposals address security and delay issues. In this paper, we present a new technique to reduce bandwidth usage by diminishing the overhead of packets when both MN and CN are mobile nodes.

3 Mobile IPv6

Discussing bidirectional tunneling and route optimization, we consider their advantages and disadvantages. Then a method presented in [7] is explained, which covers some disadvantages of the standard mechanisms.

3.1 Bidirectional Tunneling

In bidirectional tunneling, MN and HA are connected to each other via a tunnel, so signaling is required to construct the tunnel. Packets sent from CN to MN pass through HA before delivery to MN. HA intercepts all packets destined to MN, detecting them by Proxy Neighbor Discovery [9]. Since MN is not present in the home network, and assuming the aforementioned tunnel is constructed, HA encapsulates each intercepted packet in a new packet addressed to MN's new care-of address (CoA) and sends it through the tunnel [10]. At the end of the tunnel, the tunneled packet is de-capsulated by MN's network layer before being handed to MN's upper layers.

Similar encapsulation is performed when MN sends packets. Encapsulated packets are tunneled to HA, which is called reverse tunneling, by adding a 40-byte tunnel header addressed from MN's CoA to HA. After de-capsulation by HA, the tunneling header is removed and the modified packet is sent to CN through the Internet.

3.2 Route Optimization (RO)

In the route optimization mechanism, packets are transmitted between MN and CN directly [3]. Binding Update (BU) messages are sent not only to HA but also to all connected CNs to bind MN's current address to its HoA. Each CN has a table called the binding cache to keep track of all corresponding MNs' CoAs and their HoAs. A similar table is kept in MN to determine whether a CN uses bidirectional tunneling or route optimization. It is also important to update a CN's binding cache by sending BU messages frequently.

The route optimization mechanism uses the Home Address Option header extension to carry MN's HoA when a packet is sent from MN to CN. Conversely, when a packet is sent from CN to MN, another header extension called the Type 2 Routing header is used.

In a scenario where both MN and CN are mobile nodes, route optimization can be implemented too [1]. Since both MN and CN have a HoA and a CoA, packet routing requires both extension headers to carry enough information for the pair's network layers. Therefore, to transmit a packet from MN to CN, not only the Home Address Option header but also the Type 2 Routing header must be filled with appropriate addresses. Since each extension header is 24 bytes, the total overhead to transmit a packet between two mobile nodes is 48 bytes.

3.3 Tunneling-based Route Optimization (TRO)

As discussed before, in a scenario where both MN and CN are mobile nodes, the total overhead to carry a packet between the nodes is 48 bytes in route optimization. To reduce this overhead, D. Le and J. Chang [7] proposed a mechanism called tunneling-based route optimization. Like standard route optimization, TRO constructs a tunnel to transfer packets directly between MN and CN, but in their method a tunnel manager controls the packets. The tunnel manager is not only in touch with the binding cache but also manipulates packets entering and leaving the network layer.

When MN's transport layer creates a packet from MN's HoA destined to CN's HoA, the packet is handed to MN's tunnel manager before it is sent. Since the tunnel manager is aware of CN's mobility, it encapsulates the packet in a new packet addressed from MN's CoA to CN's CoA. The packet is then sent through the tunnel to CN. At the other side of the tunnel, CN's tunnel manager de-capsulates the packet, extracting the original packet addressed from MN's HoA to CN's HoA. The packet is then handed to the transport layer, which remains unaware of mobility.

To maintain compatibility with previous mechanisms, BU messages are changed. Using a flag called ROT, the tunnel manager decides whether to use tunneling-based route optimization or standard route optimization [7].

The TRO mechanism benefits from using a 40-byte tunnel header instead of the 48 bytes of extension headers used by standard route optimization. Results presented in [7] show that TRO increases performance in Mobile IP compared to the standard mechanisms.

[Figure 1: Protocol model for route optimization and packets passing between layers]

4 Improved Tunneling-based Route Optimization (ITRO)

Further reduction can be achieved in the header overhead spent on communication between MN and CN when both are mobile nodes. Each node constructs a binding cache to keep the address of the other, so there is no need to send the HoA of the other pair via a header extension: it can be obtained from the binding cache with the help of the CoA included in the packet. In other words, header overhead is reduced by using the IPv6 address fields twice, both for Internet addressing and for mobile addressing. Instead, a tunnel manager should be embedded, not only to control the binding cache but also to change the packet header. The tunnel manager controls whether the IPv6 address header is used for Internet addressing or for mobile addressing. Later in this section, we discuss the improved tunneling-based route optimization method.

4.1 Protocol Model in End-Points

The Mobile IPv6 protocol should change a little to support overhead reduction. Both nodes should be equipped with a tunnel manager which controls and changes all packets exchanged between MN and CN. The tunnel manager should also be allowed to access the binding cache in order to find the corresponding HoA of a node. Fig. 1 depicts the protocol model in the sender and receiver.

4.2 Improved Tunneling-based Route Optimization Routing

Below, we discuss two scenarios to explain our proposed method. It should be mentioned that a tunnel between MN and CN must be initiated first, and that BU messages have already been sent to construct the binding caches with both CoA and HoA.

When MN wants to send a packet to CN, since mobility is transparent to the upper layers, MN's network layer sets the source of the packet to MN's HoA and the destination to CN's HoA. In the next step, when the tunnel manager gets the packet, it updates the packet by changing both its source and its destination. Since MN is in a foreign network, it changes the source field from its HoA to its CoA. Then, searching the binding cache (with the help of CN's HoA), it finds CN's corresponding CoA and writes it in the destination address field. The altered packet is sent directly to CN through the tunnel.

Upon reception of the packet at the other side of the tunnel, CN's tunnel manager manipulates the packet to make it ready for the upper layers. The first manipulation changes the packet's destination from CN's CoA to CN's HoA. The next step searches the binding cache with MN's CoA to find the corresponding HoA; CN's tunnel manager then changes the packet's source from MN's CoA to what has just been found, MN's HoA. Once the changes are finished, the updated packet is handed to the upper layers. Following Fig. 1, packets sent from MN to CN are addressed as shown in Fig. 2.

The same procedure is performed when a packet is sent from CN to MN. Since CN's upper network layers are unaware of mobility, a packet is constructed addressed from CN's HoA to MN's HoA. As the packet is passed to CN's tunnel manager, using the binding cache, the destination of the packet is changed from MN's HoA to MN's CoA. Since CN knows its own CoA, the tunnel manager updates the packet's source from its HoA to its CoA. Then the packet is tunneled to MN. Similarly, MN's tunnel manager changes the packet's destination from MN's CoA to MN's HoA and, searching the binding cache, the packet's source from CN's CoA to CN's HoA.

[Figure 2: Improved tunneling-based route optimization packets due to Fig. 1]
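To make the address rewriting of Section 4.2 concrete, the following is a minimal sketch written for this text (the names Packet, TunnelManager and BindingCache are hypothetical, not an implementation from the paper) of what each node's tunnel manager does on the send and receive paths:

    from dataclasses import dataclass

    @dataclass
    class Packet:
        src: str       # IPv6 source address field
        dst: str       # IPv6 destination address field
        payload: bytes

    class BindingCache:
        def __init__(self):
            self._by_hoa, self._by_coa = {}, {}
        def bind(self, hoa, coa):        # updated on each Binding Update
            self._by_hoa[hoa], self._by_coa[coa] = coa, hoa
        def coa_of(self, hoa): return self._by_hoa[hoa]
        def hoa_of(self, coa): return self._by_coa[coa]

    class TunnelManager:
        """Rewrites the IPv6 address fields so they serve both Internet
        addressing (CoA) on the wire and mobile addressing (HoA) above."""
        def __init__(self, my_hoa, my_coa, cache):
            self.my_hoa, self.my_coa, self.cache = my_hoa, my_coa, cache

        def outbound(self, pkt):
            # Upper layers produced HoA -> HoA; rewrite to CoA -> CoA.
            pkt.src = self.my_coa
            pkt.dst = self.cache.coa_of(pkt.dst)   # peer CoA looked up by HoA
            return pkt

        def inbound(self, pkt):
            # The wire carried CoA -> CoA; restore HoA -> HoA for upper layers.
            pkt.dst = self.my_hoa
            pkt.src = self.cache.hoa_of(pkt.src)   # peer HoA looked up by CoA
            return pkt

Because both directions recover the HoAs from the binding cache, no Home Address Option or Type 2 Routing header travels in the packet, which is exactly where the 48 bytes are saved.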

4.3 Changing BU messages

To maintain compatibility with the other MIPv6 mechanisms, binding messages should change. We propose to use two flags in order to distinguish three different mechanisms. Called ROT0 and ROT1, these flags indicate whether route optimization, tunneling-based route optimization, or improved tunneling-based route optimization is used. The routing mechanisms according to ROT0 and ROT1 are listed in Table 1.

Table 1: Routing mechanism due to ROT flags

Mechanism                                              | ROT1 | ROT0
Route Optimization                                     |  0   |  0
Tunneling-based Route Optimization                     |  0   |  1
Improved Tunneling-based Route Optimization (proposed) |  1   |  0 or 1
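As an illustration only (the bit-level layout is assumed from Table 1 and not specified in the paper), the ROT1/ROT0 pair could be decoded like this:

    def routing_mechanism(rot1, rot0):
        """Map the two BU flags of Table 1 to a routing mechanism."""
        if rot1 == 1:
            return "Improved Tunneling-based Route Optimization"  # ROT0 ignored
        return ("Tunneling-based Route Optimization" if rot0 == 1
                else "Route Optimization")

    assert routing_mechanism(0, 0) == "Route Optimization"
    assert routing_mechanism(0, 1) == "Tunneling-based Route Optimization"
    assert routing_mechanism(1, 0).startswith("Improved")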

5 Evaluation

We have evaluated our proposed mechanism by comparing it to the three other mechanisms. Since the improved tunneling-based route optimization mechanism aims to reduce header overhead, the main comparison metric is the number of bytes consumed to establish mobile communication. We used relation (1), proposed in [7], to calculate mobility overhead. Note that mobility overhead is the number of bytes used to route packets from one mobile node to another, and is distinct from the overhead used to route packets through the network layer.

    Mobility Overhead Ratio = Mobility Addition Size / Original Packet Size    (1)

In the comparison with the bidirectional tunneling mechanism, communication time is also considered, defined as the total time for a packet to be delivered from source to destination. Moreover, packets are assumed to be 1500 bytes, the maximum transmission unit size in Ethernet, containing the IPv6 packet, extension headers if needed, and tunneling overhead.

5.1 Comparing to Bidirectional Tunneling

As mentioned before, in bidirectional tunneling, packets from CN are tunneled from HA to MN, and replies travel in the same tunnel from MN to HA, called reverse tunneling. Each time a packet is tunneled, 40 additional bytes are used to route it to the other side of the tunnel. As a packet is tunneled twice to reach its destination, 80 bytes are consumed in the two communications. The total bandwidth used to carry a packet from source to destination is calculated as follows:

    Mobility Overhead Ratio = Tunnel Header Size(HA-MN) / Original Packet Size
                            + Tunnel Header Size(MN-HA) / Original Packet Size
                            = 40/(1500 - 40) + 40/(1500 - 40) = 5.48%    (2)
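As a quick cross-check of Equations (2), (4), (6) and (7), the per-mechanism mobility overhead ratios can be recomputed with a few lines of Python. This is a minimal sketch written for this text, not code from the paper; it simply encodes the header sizes given above (40-byte tunnel header, two 24-byte extension headers) and the 1500-byte MTU assumption:

    # Mobility overhead ratio = mobility addition size / original packet size
    # (relation (1) in the text), with a 1500-byte Ethernet MTU.
    MTU = 1500
    TUNNEL_HEADER = 40      # IPv6 tunnel (encapsulation) header, bytes
    EXT_HEADERS = 24 + 24   # Home Address Option + Type 2 Routing, bytes

    def overhead_ratio(addition_bytes, times=1):
        """Ratio of mobility bytes to payload-carrying bytes per packet."""
        return times * addition_bytes / (MTU - addition_bytes)

    mechanisms = {
        "Bidirectional Tunneling":      overhead_ratio(TUNNEL_HEADER, times=2),  # eq. (2)
        "Route Optimization":           overhead_ratio(EXT_HEADERS),             # eq. (6)
        "Tunneling-based RO":           overhead_ratio(TUNNEL_HEADER),           # eq. (7)
        "Improved Tunneling-based RO":  overhead_ratio(0),                       # eq. (4)
    }
    for name, ratio in mechanisms.items():
        print(f"{name}: {ratio:.2%}")
    # Prints roughly 5.48%, 3.31%, 2.74% and 0.00%.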

In bidirectional tunneling, each routing step also elapses one Internet routing time [11], because each node can be anywhere in the Internet. Following Fig. 3, the total delay consists of three Internet routing times and is computed from:

    Total time = T(MN-HA_MN) + T(HA_MN-HA_CN) + T(HA_CN-CN) = 3 T(Internet)    (3)

[Figure 3: Comparing delay time for the bidirectional tunneling mechanism and the route optimization based mechanisms]

In improved tunneling-based route optimization, since the nodes are connected to each other through a tunnel, there is no need to tunnel packets twice via HA. Also, the address fields of the packet are used both for the tunnel and for the IPv6 header. Therefore, reductions in both overhead and delay are obtained. The mobility overhead ratio is calculated as follows:

    Mobility Overhead Ratio = 0 B(IPv6 tunnel header) / (1500 - 0) = 0/1500 = 0%    (4)

The delay in the proposed mechanism is computed from:

    Total time = T(MN-CN) = T(Internet)    (5)

This means the improved tunneling-based route optimization mechanism is more efficient in both overhead and delay.

5.2 Comparing to Route Optimization

Although both route optimization and the proposed mechanism construct a tunnel to reduce the delay time and the overhead needed for two mobile nodes to communicate, different overheads are used to route a packet in the constructed tunnel. In the situation where both nodes are mobile, route optimization uses the Home Address Option and Type 2 Routing extension headers, as depicted in Fig. 4. Since each extension header is 24 bytes in size, the total mobility header added to an IPv6 packet is 48 bytes. So the mobility overhead ratio is calculated as follows:

    Mobility Overhead Ratio = (24 B(Type 2) + 24 B(HoA Option)) / (1500 - 48) = 48/1452 = 3.3%    (6)

[Figure 4: Route optimization packets due to Fig. 1]

Since packets are tunneled directly to each node, one Internet time is required (Equation (5)). Because improved tunneling-based route optimization uses the address fields of the packet both for tunneling and for IPv6 routing, as calculated before, it consumes 0% of the total packet size. Using the same tunnel for transmitting packets, the total delay time is the same for route optimization and the proposed method.

5.3 Comparing to Tunneling-based Route Optimization

Tunneling-based route optimization was proposed not only to decrease communication delay but also to reduce overhead. It benefits both from the tunneling idea used in bidirectional tunneling and from the direct connection used in route optimization. A 40-byte tunneling header is added to the IPv6 packet in place of the 48 bytes of extension headers. Fig. 5 shows packets A and B of Fig. 1 when the tunneling-based route optimization mechanism is used. The mobility overhead ratio is calculated as follows:

    Mobility Overhead Ratio = 40 B(IPv6 tunnel header) / (1500 - 40) = 40/1460 = 2.74%    (7)

[Figure 5: Tunneling-based route optimization packets due to Fig. 1]

The total delay is equal to one Internet time, because packets pass through a tunnel identical to the one used in route optimization. Compared with tunneling-based route optimization, the proposed method has no header overhead for mobile communication, and its total delay is the same as that of the route optimization mechanism.

The Mobile IPv6 mechanisms are compared with each other in Table 2. All in all, it is clear that the proposed method reduces both the delay and the bandwidth used in communication between mobile nodes.

Table 2: Comparison between Mobile IPv6 mechanisms

Mechanism                                              | Packet Overhead (%) | Delay (Internet Time)
Bidirectional Tunneling                                | 6.6                 | 3
Route Optimization                                     | 3.3                 | 1
Tunneling-based Route Optimization                     | 2.74                | 1
Improved Tunneling-based Route Optimization (proposed) | 0                   | 1

6 Conclusion

In this paper, the performance of both the standard Mobile IPv6 routing mechanisms and tunneling-based route optimization is analyzed. To reduce packet overhead, we proposed the improved tunneling-based route optimization mechanism. To maintain compatibility with the standard mechanisms, not only the tunnel manager but also the Binding Update messages must be altered. Comparison with bidirectional tunneling, route optimization and tunneling-based route optimization shows that the packet overhead of the proposed mechanism is reduced significantly compared to the previous mechanisms. Therefore, owing to the lower per-packet overhead, more data can be transmitted through the network via a Mobile IP communication.

Acknowledgement

I would like to thank Soroush Educational Complex and especially Mr. Abdullah Shirazi for financial support and assistance. I should also thank Mr. Seyed Morteza Hosseini for preparing the final version of the PDF using LaTeX.

References
[1] D. Johnson, C. Perkins, and J. Arkko, Mobility Support in IPv6, Internet Draft (work in progress), IETF (2009). [Online] Available: http://tools.ietf.org/id/draft-ietf-mext-rfc3775bis-05.txt.
[2] R. Koodli, Mobile IPv6 Fast Handovers, RFC 5568, IETF (2009). [Online] Available: http://www.ietf.org/rfc/rfc5568.txt.
[3] A. Muhanna, M. Khalil, S. Gundavelli, K. Chowdhury, and P. Yegani, Binding Revocation for IPv6 Mobility, Internet Draft (work in progress), IETF (2009). [Online] Available: http://www.ietf.org/id/draft-ietf-mext-binding-revocation-14.txt.
[4] M. Liebsch, A. Muhanna, and M. Blume, Transient Binding for Proxy Mobile IPv6, Internet Draft (work in progress), IETF (2009). [Online] Available: http://www.ietf.org/id/draft-ietf-mipshop-transient-bcepmipv6-04.txt.
[5] C. Perkins, Securing Mobile IPv6 Route Optimization Using a Static Shared Key, RFC 4449, IETF (2006). [Online] Available: http://www.ietf.org/rfc/rfc4449.txt.
[6] C. Vogt, R. Bless, M. Doll, and T. Kuefner, Early Binding Updates for Mobile IPv6, in Proceedings of the IEEE Wireless Communications and Networking Conference (WCNC'05) 3 (2005), 1440-1445.
[7] D. Le and J. Chang, Tunneling-based route optimization for mobile IPv6, in Proceedings of IEEE Wireless Communications, Networking and Information Security (WCNIS) (2010), 509-513.
[8] C. Perkins, IP Mobility Support for IPv4, RFC 3344, IETF (2002). [Online] Available: http://www.ietf.org/rfc/rfc3344.txt.
[9] T. Narten, E. Nordmark, and W. Simpson, Neighbor Discovery for IP Version 6 (IPv6), RFC 4861, IETF (2007). [Online] Available: http://www.ietf.org/rfc/rfc4861.txt.
[10] A. Conta and S. Deering, Generic Packet Tunneling in IPv6 Specification, RFC 2473, IETF (1998). [Online] Available: http://www.ietf.org/rfc/rfc2473.txt.
[11] M. Kalman and B. Girod, Modeling the delays of successively-transmitted Internet packets, in Proceedings of the IEEE International Conference on Multimedia and Expo (ICME'04), Taipei, Taiwan (2004), 2015-2018.

Neural Network Learning based on Football Optimization Algorithm

Payam Hatamzadeh, Faculty of Engineering, Department of Computer Engineering, Payam@eng.ui.ac.ir
Mohammad Reza Khayyambashi, Faculty of Engineering, Department of Computer Engineering, M.R.Khayyambashi@eng.ui.ac.ir

Abstract: The Football Optimization Algorithm (FOA) is a novel optimization algorithm inspired by the game of football. Like other evolutionary algorithms, the proposed algorithm starts with an initial population, called a team. The individuals of the population, called players, are of two types: main players and substitute players. Teamwork among these players forms the basis of the proposed evolutionary algorithm. In this paper, the application of the FOA to tuning the parameters of artificial neural networks (ANNs) is presented as a new evolutionary method of ANN training. A neural network is trained with FOA, the Imperialist Competitive Algorithm (ICA), Particle Swarm Optimization (PSO) and the Genetic Algorithm (GA), and the experimental results obtained from these four methods are compared. The results show that the training and test errors of the network trained by the FOA are reduced in comparison with the other three methods. Hence, FOA can tune the weight values, and it is believed that FOA will become a promising candidate for training ANNs.

Keywords: Football Optimization Algorithm; Artificial Neural Network; Evolutionary Algorithms.

1 Introduction

Artificial neural networks have been developed as generalizations of mathematical models of biological nervous systems. A first wave of interest in neural networks emerged after the introduction of simplified neurons by McCulloch and Pitts (1943). Neural networks have the ability to perform tasks such as pattern recognition, classification, regression and the solution of differential equations, as demonstrated in [1, 2]. The basic processing elements of neural networks are called artificial neurons, or simply neurons or nodes. In a simplified mathematical model of the neuron, the effects of the synapses are represented by connection weights that modulate the effect of the associated input signals, and the nonlinear characteristic exhibited by neurons is represented by a transfer function. The neuron impulse is then computed as the weighted sum of the input signals, transformed by the transfer function.

The learning capability of an artificial neuron is achieved by adjusting the weights in accordance with the chosen learning algorithm, and the optimization method used to determine the weight adjustments has a large influence on the performance of neural networks. While gradient descent is a very popular optimization method, it is plagued by slow convergence and susceptibility to local minima, as demonstrated in [3]. Therefore, other approaches to improve neural network training have been introduced, as demonstrated in [4]. These methods include global optimization algorithms such as the Seeker Optimization Algorithm [5], Genetic Algorithms [6-8], Particle Swarm Optimization algorithms [9, 10], the Imperialist Competitive Algorithm [11] and the Harmony Search Algorithm [12].

In this paper, a new evolutionary algorithm inspired by the game of football, called the Football Optimization Algorithm, is proposed. The proposed algorithm starts with an initial population called a team. A team is composed of good passers and mobile players. All the players are divided into two types: main players and substitute players. Teamwork among the main players is the main part of the proposed algorithm and is expected to cause the ball to converge to the goal.

Corresponding Author, P. O. Box 83139-64841, T: (+98) 913 913-9948
Teamwork is achieved when individuals make personal sacrifices to work together for the success of the group. Here, the neural network is trained by the FOA, ICA, PSO and GA algorithms and their results are compared with each other. The results indicate that the training and test errors of the network trained by the FOA algorithm are reduced in comparison with the other methods. The rest of this paper is organized as follows. Section 2 introduces the proposed algorithm and studies its different parts in detail. The proposed algorithm for training the neural network is presented in Section 3. The proposed algorithm is tested on benchmark problems in Section 4, and Section 5 concludes the paper.

2 Football Optimization Algorithm

Figure 1 shows the flowchart of the proposed algorithm. FOA encodes potential solutions to a specific problem on players and applies teamwork operators to these players. The algorithm is viewed as a function optimizer, although the range of problems to which it can be applied is quite broad.

[Figure 1: Flowchart of the FO algorithm. Steps: Start; initialize parameters; collect random players and create a team; divide players into main players and substitute players; give the ball to a suitable player; pass the ball to the best player; attack players into free space; random moving by spectators effect; substitute players; if the stop conditions are not satisfied, repeat; otherwise done.]

2.1 Initialize parameters

Table 1 shows the adjustable parameters of the football optimization algorithm.

Table 1: Adjustable parameters of the FO algorithm

Parameter | Description                                 | Value
n         | Maximum number of players                   | [11, ∞)
          | Divide coefficient of players               | (0, 1]
          | Number of replacements in entire iterations | [0, n]
          | Number of replacements per iteration        | [0, n]
          | Pass coefficient                            | [0, 1]
          | Velocity coefficient of players             | best value [0.5, 2]
          | Spectators effect on players                | [0, 1]
          | Spectators effect on parameters             | [0, 1]

2.2 Creating a team

The first step in the implementation of any optimization algorithm is to generate an initial population [13]. In the FO algorithm, a population of players called a team, which encodes candidate solutions to an optimization problem, evolves towards better solutions. In other words, each player is represented by an array. The population size n depends on the nature of the problem, but the team typically contains several hundreds or thousands of possible solutions. The algorithm usually starts from a population of randomly generated individuals that covers the entire range of possible solutions (the search space).

    Team = [player_1; player_2; ...; player_n]    (1)

in which:

    player_i = [parameter_1, ..., parameter_k]

2.3 Dividing players

Well-organized and well-prepared teams are often seen beating other teams. Before a game, coaches always train the team, because training can play an important part in a match. After practice, the most powerful players are selected to form the main team. In this algorithm, the power of a player is found by evaluating the fitness function. After evaluating the fitness function of the players, the m most powerful players are selected to form the main players; the remaining s players of the population remain as substitute players. We then have two types of players: main players and substitute players.

    mainPlayers = [player_1; ...; player_m]            (an m x k matrix)
    substitutePlayers = [player_{m+1}; ...; player_n]  (an s x k matrix)    (2)

in which m = round(n x divide coefficient) and s = n - m.
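A minimal sketch of Sections 2.2 and 2.3 follows. It is written for this text, not taken from the paper; the fitness callable, the parameter name alpha (for the divide coefficient) and the bounds are assumptions, since the paper's original symbols were not preserved:

    import numpy as np

    def create_team(n, k, bounds=(-1.0, 1.0), rng=np.random.default_rng(0)):
        """Team = n players, each an array of k parameters (Eq. 1)."""
        low, high = bounds
        return rng.uniform(low, high, size=(n, k))

    def divide_players(team, fitness, alpha=0.5):
        """Split the team into main and substitute players (Eq. 2).

        fitness: callable mapping a player (1-D array) to a cost to minimize.
        alpha:   divide coefficient in (0, 1]; m = round(n * alpha)."""
        n = len(team)
        m = int(round(n * alpha))
        order = np.argsort([fitness(p) for p in team])  # best (lowest cost) first
        return team[order[:m]], team[order[m:]]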

2.4 Giving the ball to a suitable player

From the beginning of each playing period until its end, there is one ball in a football game. Hence, one player among the existing main players is selected to take possession of the ball. Individual solutions are selected through a fitness-based process, where fitter solutions (as measured by a fitness function) are typically more likely to be selected; some methods rate only a random sample of the population, as rating everyone may be very time-consuming. Equation (3) describes the selection method, which rates the fitness of each solution, adds a random value with uniform distribution U to it, and preferentially selects the best-ranked solution:

    OwnerBall = BestIndex{Rank_1, ..., Rank_m}    (3)

in which:

    Rank_i = Fitness(player_i) + U(-d, +d)

2.5 Passing the ball to the best player

Passing the ball is a key part of football. The purpose of passing is to keep possession of the ball by maneuvering it on the ground between different players and to advance it up the playing field. Aside from its conspicuous advantages, passing is a skill that demands good technical ability not only from the distributor but from the receiver as well. In this algorithm, passing is a tool with great creative potential and always has to be directed at a teammate's feet; the pass is considered an offensive action. Thus, as in Figure 2, in each iteration the rank of every player in the population is evaluated, the best-ranked player is selected from the current main players (based on their fitness), and parameters are exchanged between the passer and that player.

[Figure 2: Exchange of k parameters between two players in the pass operation]

2.6 Attacking players into free space

Once a player has passed the ball, the other players do not remain stationary but move into positions where they can receive the ball and give more options to the player in possession. Moving into free space is one of the most critical skills that footballers must develop: players must move off the ball into space to give an advance the maximum chance of success. Passes to space are feasible when there is intelligent movement of players to receive the ball and do something constructive with it. In this algorithm, players move in the search space. The proposed algorithm models this fact by moving all the players toward the best player; to search different points around the best player, a random amount of deviation is added to the direction of movement. Figure 3 shows an overview of this movement; the transfer occurs in a space that is shown as a triangle. The movement is expressed in Equation (4), in which the players move towards the best player by x units:

    x ∈ [0, U(a·d) + b]    (4)

where U is the uniform (or any other) distribution, b is a number greater than zero that causes the players to approach the goal from any side, d is the distance between the best player and the other players, and a is a parameter that adjusts the deviation from the original direction.

[Figure 3: Moving players toward the best player with a randomly deviated direction]
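The attacking move of Section 2.6 can be sketched as below. This is an interpretation written for this text: the scale symbols of Equation (4), lost in the source, are replaced by explicit step and deviation arguments, and the pass operator is omitted:

    import numpy as np

    def attack_free_space(players, best, step=0.9, deviation=0.1,
                          rng=np.random.default_rng(0)):
        """Move every player a random fraction of the way toward the best
        player, plus a small random sideways deviation (Eq. 4)."""
        moved = players.copy()
        for i, p in enumerate(moved):
            direction = best - p                          # vector of length d
            x = rng.uniform(0.0, step)                    # random step in [0, step)
            sideways = rng.normal(0.0, deviation, p.shape)
            moved[i] = p + x * direction + sideways
        return moved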


2.7 Moving by spectators effect

The impact of spectators upon sport is substantial and varied, and it is one of the reasons for the success of football teams. Spectators at the stadium and at team practices increase the morale and sense of responsibility of the football players, and this feeling is transferred among all players and even coaches and managers. The algorithm models the spectators effect by randomly changing players' parameters. In Equation (5), m is the number of main players and k is the number of parameters of each player:

    EffectPlayers = k x (spectators effect on players)
    EffectParameters = m x (spectators effect on parameters)    (5)

2.8 Substitutes

A number of players may be replaced by substitutes during the course of a football game. Common reasons for a substitution include injury, tiredness, ineffectiveness, a tactical switch, or time wasting at the end of a finely poised game. The most tired players are generally substituted, but coaches often replace ineffective players in order to freshen up the attacking posture and increase their chances of scoring. In this algorithm, as in football matches, substituting players is required to improve the conditions; this can vary during the game and has a significant impact on the success of the team. The number of substitutions must be determined before the algorithm begins and may be anywhere between zero and n.

Thus, as shown in Figure 4, in each iteration the fitness of every player in the team is evaluated and the weakest main player is compared with the best substitute player. If the substitute player is stronger than the main player, a switch takes place. During execution, the algorithm can use the number of replacements to adjust its parameters; for example, if it is very high, the spectators effect on players must be decreased.

2.9 Convergence

This process is repeated until a termination condition is reached. Common terminating conditions are:

- A solution is found that satisfies minimum criteria (the goal).
- A fixed number of iterations is reached.
- The allocated budget (computation time/money) is reached.
- The highest solution's fitness has reached a plateau, such that successive iterations no longer produce better results.
- Manual inspection.
- Combinations of the above.

[Figure 4: Player replacement by the substitute]
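Section 2.8's replacement rule, together with the overall loop of Figure 1, could be sketched as follows. Again this is a schematic written for this text, reusing the helper functions from the sketches above; the pass and spectators operators are omitted for brevity:

    def substitute(main, subs, fitness):
        """Swap the weakest main player with the best substitute if the
        substitute is stronger (lower cost), as in Figure 4."""
        worst_main = max(range(len(main)), key=lambda i: fitness(main[i]))
        best_sub = min(range(len(subs)), key=lambda i: fitness(subs[i]))
        if fitness(subs[best_sub]) < fitness(main[worst_main]):
            main[worst_main], subs[best_sub] = (subs[best_sub].copy(),
                                                main[worst_main].copy())
        return main, subs

    def foa(fitness, n=100, k=10, iterations=1000, alpha=0.5):
        """Skeleton of the FO algorithm's main loop (Figure 1)."""
        team = create_team(n, k)
        main, subs = divide_players(team, fitness, alpha)
        for _ in range(iterations):
            best = min(main, key=fitness)                 # ball goes to the best player
            main = attack_free_space(main, best)          # Section 2.6
            main, subs = substitute(main, subs, fitness)  # Section 2.8
        return min(main, key=fitness)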

3 ANN Learning based on FOA

Optimal connection weights can be found by formulating training as a global search problem in which the architecture of the neural network is pre-defined and fixed during the evolution. A three-layered Perceptron neural network is applied, comprising an input layer, a hidden layer and an output layer, as formulated in Equation (6):

    o_p = sum_{i=1}^{H} w_ip * f( sum_{j=1}^{n} w_jp * x_j )    (6)

where p denotes the number of the epoch, H denotes the number of neurons in the hidden layer, w denotes the weights of the network, and f denotes the activation function of each neuron, which can be chosen as sigmoid or tanh. The number of input nodes is set to the number of attributes, the number of hidden nodes to 10, and there is one node in the output layer. We consider the weights of the network in the training phase as the variables of an optimization problem. The Mean Square Error (MSE) is used as the cost function, and the goal of the proposed algorithm is to minimize it. Figure 5 shows the training and classification processes. The evolutionary search of connection weights can be formulated as follows: (1) generate an initial team of N weight vectors and evaluate the MSE of each individual depending on the problem; (2) depending on the fitness, and using suitable selection methods, divide the players into main players and substitute players; (3) apply the football operators to the players; (4) if the specified number of generations has not been reached, go to step 3; (5) end.

[Figure 5: Training and classification processes]
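A sketch of the cost function of Equation (6), evaluating one flattened weight vector so that a population-based trainer such as the FOA skeleton above can minimize it, follows. It is written for this text (the function name and weight layout are assumptions); the 10-hidden-node, single-output topology follows the description above:

    import numpy as np

    def mlp_mse(weights, X, y, hidden=10):
        """MSE of a 3-layer perceptron whose weights come as one flat vector,
        so a player's parameter array can be scored directly (Eq. 6)."""
        n_in = X.shape[1]
        w1 = weights[: n_in * hidden].reshape(n_in, hidden)    # input -> hidden
        w2 = weights[n_in * hidden : n_in * hidden + hidden]   # hidden -> output
        h = np.tanh(X @ w1)        # tanh activation, one of the options named above
        out = h @ w2               # single output node
        return float(np.mean((out - y) ** 2))

    # Usage with the FOA sketch: fitness = lambda p: mlp_mse(p, X_train, y_train),
    # with k = n_in * 10 + 10 parameters per player.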

4 Experimental Results

In this paper, the performance of the proposed method is evaluated in comparison with the ICA, PSO and GA algorithms for training a three-layer Perceptron neural network. In the FOA algorithm, the five control parameters are set to 0.5, 1, 2.5, 0.1 and 0.1, respectively, and the number of players is 100. In the ICA algorithm, the three control parameters (the last two denoted a and b) are set to 2, 0.5 and 0.2, respectively; the numbers of imperialists and colonies are 10 and 100. In the PSO algorithm, the parameters c1 and c2 are both fixed at 1.5 and the number of particles is 100; by giving c1 and c2 the same value, the social and cognitive components get an equal chance to take part in the search process. In GA, the population size is 100, and the mutation and crossover rates are set to 0.03 and 0.5, respectively. The number of iterations is 1000 for all methods.

Figure 5: Training and classification processes

The datasets used for evaluating the proposed approach are well-known datasets that are available for download from the UCI repository and refer to classification problems. Five datasets were selected as follows:

- Glass: includes glass component analysis for glass pieces that belong to 7 classes. It contains 214 samples with 9 attributes.
- Statlog (Heart): includes heart disease data that belong to 2 classes. It contains 270 samples with 13 attributes.
- Wine: includes data from wine chemical analysis that belong to 3 classes. It contains 178 samples with 13 attributes.
- Vertebral Column: contains values for six biomechanical features used to classify orthopedic patients into 3 classes (normal, disk hernia or spondylolisthesis). It contains 310 samples with 6 attributes.
- Teaching Assistant Evaluation: includes evaluations of teaching performance; scores are low, medium or high. It contains 151 samples with 5 attributes.

50% of the instance data was used for training the neural network and the remaining 50% for testing. The neural network was trained by the FOA, ICA, PSO and GA algorithms and the results were compared with each other. An accurate comparison of the four methods is presented that uses 10-fold experiment replication. For each classification problem the same topology was selected, and the minimum cost function value and the mean cost value versus epochs are presented. The results of these experiments are presented in Tables 2 and 3. Figure 6 shows the mean test error and the mean train error (false classification percent) for each of the four compared optimization methods on the five classification problems. From the experimental results, it can be seen that in all cases the FOA performed better. Figures 7-10 compare the mean square error (MSE) of the neural network trained by the FOA, ICA, GA and PSO algorithms on the Teaching dataset, and indicate that the proposed algorithm trains considerably better than the other algorithms.

Table 2: Train result for each of the four methods (MSE and precision, %)

Dataset   | FOA MSE | FOA Prec. | ICA MSE | ICA Prec. | PSO MSE | PSO Prec. | GA MSE | GA Prec.
Wine      | 0.0509  | 0.9348    | 0.0945  | 0.8587    | 0.0552  | 0.8478    | 0.1783 | 0.4022
Glass     | 0.4881  | 0.742     | 0.5762  | 0.5158    | 0.5505  | 0.5965    | 0.6948 | 0.4649
Heart     | 0.0887  | 0.8603    | 0.1778  | 0.7647    | 0.1051  | 0.6765    | 0.1999 | 0.7059
Vertebral | 0.2836  | 0.7134    | 0.4148  | 0.6115    | 0.3154  | 0.7061    | 0.4319 | 0.5796
Teaching  | 0.2878  | 0.6538    | 0.5264  | 0.4487    | 0.3103  | 0.6026    | 0.5077 | 0.4872

Table 3: Test result for each of the four methods (MSE and precision, %)

Dataset   | FOA MSE | FOA Prec. | ICA MSE | ICA Prec. | PSO MSE | PSO Prec. | GA MSE | GA Prec.
Wine      | 0.0224  | 0.9651    | 0.0723  | 0.9186    | 0.3316  | 0.7326    | 0.2350 | 0.3372
Glass     | 1.261   | 0.651     | 1.5927  | 0.5300    | 1.4525  | 0.4800    | 2.8823 | 0.4000
Heart     | 0.2008  | 0.7463    | 0.2099  | 0.7015    | 0.1874  | 0.5970    | 0.2185 | 0.6642
Vertebral | 0.3649  | 0.6948    | 0.4298  | 0.5294    | 0.4293  | 0.6869    | 0.5301 | 0.5556
Teaching  | 1.621   | 0.3562    | 2.600   | 0.2899    | 2.077   | 0.3014    | 3.588  | 0.2110
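Since all four optimizers treat the network as a black box, the experimental setup above can be sketched compactly. The following is a minimal illustration (not the authors' code), assuming a tanh three-layer perceptron whose weights are flattened into a single vector that the population-based optimizer evolves; the fitness is the mean squared error on the 50% training split, and the same function scores the test split.

import numpy as np

def unpack(theta, n_in, n_hid, n_out):
    # Split a flat parameter vector into the weight matrices and
    # bias vectors of a three-layer perceptron.
    i = 0
    W1 = theta[i:i + n_in * n_hid].reshape(n_in, n_hid); i += n_in * n_hid
    b1 = theta[i:i + n_hid];                             i += n_hid
    W2 = theta[i:i + n_hid * n_out].reshape(n_hid, n_out); i += n_hid * n_out
    b2 = theta[i:i + n_out]
    return W1, b1, W2, b2

def mse_cost(theta, X, Y, n_hid):
    # Fitness handed to FOA/ICA/PSO/GA: forward pass + mean squared
    # error (Y is assumed one-hot for classification problems).
    W1, b1, W2, b2 = unpack(theta, X.shape[1], n_hid, Y.shape[1])
    H = np.tanh(X @ W1 + b1)      # hidden-layer activations
    P = np.tanh(H @ W2 + b2)      # network outputs
    return np.mean((P - Y) ** 2)

# Usage: an optimizer keeps a population of theta vectors (e.g. 100
# "players") and repeatedly evaluates mse_cost on the training split;
# the held-out split yields the test MSE reported in Table 3.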

Figure 6: Mean train (test) error for all methods

Figure 7: Mean square error for FOA per iteration

Figure 8: Mean square error for ICA per iteration

Figure 9: Mean square error for GA per iteration

Figure 10: Mean square error for PSO per iteration

5 Discussion and Future Works

In this paper, an optimization algorithm based on modeling a football match was proposed. Each individual of the population is called a player. The team is divided into two groups: main players and substitute players. A team is composed of good passers and mobile players. Teamwork among the main players forms the core of this algorithm and results in the convergence of the ball to the goal, as expected. In this cooperation, the ball is moved gradually toward the goal and finally the best player takes a shot at the goal. The Football Optimization Algorithm then acts as an evolutionary algorithm to optimize the weights of a neural network. The FOA method was evaluated on five known classification problems and compared against the state-of-the-art methods ICA, PSO and GA. Consideration of the results showed that the training and test errors of the network trained by the FOA algorithm are reduced in comparison with the other three methods. Future work will consist in modifying some parts of the algorithm to improve its execution speed.

Evaluating XML Retrieval Systems Using Methods of Averaging Precision and Recall at Rank Cut-offs

Marzieh Javadi
Department of Computer Engineering, Zanjan Branch, Islamic Azad University, Zanjan, Iran
MarziehJavadi@ymail.com

Hassan Naderi
Faculty of Iran University of Science and Technology (IUST)
naderi@iust.ac.ir

Abstract: Today, with the growth of XML documents on the web, efforts to develop XML retrieval systems are also growing. As more XML retrieval systems are offered, their performance evaluation becomes more important. In this context, there are several metrics used to rank retrieval systems, most of which extend the definitions of precision and recall.
In this paper, the rankings of XML retrieval systems for the INEX 2010 runs, according to three methods of averaging precision and recall values at specific rank cut-offs, are compared with the results of the MAiP metric, which is used for evaluation by INEX.

Keywords: XML Retrieval; Precision; Recall; Evaluation Metrics.

1 Introduction

Compared with traditional information retrieval, which considers the whole document as the retrievable unit, the structure of XML documents provides more evaluation challenges. XML retrieval systems operate on collections of articles marked up with XML, and retrieve a list of XML elements as the best answers to a user query [1]. With the increasing number of XML retrieval systems, evaluating their performance becomes more important.
Most of these evaluation methods use test collections made for this purpose. These test collections usually include a set of documents, user requests and relevance assessments that specify the set of correct answers for each user request [2].
Since 2002, several metrics for performance evaluation and ranking of XML retrieval systems have been presented, each of which has some disadvantages. In this paper, the influence of precision and recall, which are the main concepts in evaluating retrieval system effectiveness, is studied at specific rank cut-offs.
In section 2, the part of INEX used in this paper is described. In section 3, the evaluation metrics are presented. In section 4, the rankings of XML retrieval systems based on the measures stated in section 3 are compared with each other. In section 5, results are presented.

2 INEX

The INEX project is a large-scale project in the field of XML retrieval, and includes a set of test collections and evaluation methods [3].

2.1 INEX 2010 Data Centric Collection

The INEX collection in this track used the IMDB collection, built from www.imdb.com. Text files from this site were converted to XML documents. Overall, this collection included 4,418,102 XML files [4].
Corresponding Author, T: (+98) 5827115

2.2 Topics

In total, 28 topics were selected for this track, reflecting users' information needs. A sample topic from this track is shown in Fig. 1.

Figure 1: INEX 2010 Data Centric Track Topic Sample

2.3 Assessments and Evaluation in the INEX 2010 Data Centric Track

In INEX 2010, runs of XML retrieval systems were evaluated with precision, recall, MAP, P@5, P@10, and similar measures.

3 Evaluation Metrics

In INEX 2010, for a topic, systems should retrieve a ranked list of XML elements that have been detected as relevant to the request. So XML retrieval systems, in addition to ordering XML elements according to their estimated relevance scores, should retrieve elements that do not overlap with previously retrieved elements [5].

3.1 Precision and Recall

The amount of retrieved relevant information is measured by the length of the relevant text. So, instead of counting the number of retrieved relevant documents, the amount of relevant text that is retrieved is measured [6].
As shown in equation 1, rsize(e_i) is the amount of relevant text in the element e_i that was retrieved at rank i. For measuring the amount of relevant text retrieved from e_i, the relevance value function is defined as:

    rval(e_i) = rsize(e_i)                                   if overlap(i) = 0
    rval(e_i) = rsize(e_i) - \sum_{e_j \in R_i} rval(e_j)    otherwise        (1)

In equation 1, if there is overlap between e_i and an element e_j in R_i (the list of elements retrieved before element i), the relevance value of e_j is deducted from the relevance value of e_i.
Precision at rank r is measured as the fraction of retrieved information at rank r that is relevant, as shown in equation 2:

    P@r = \frac{\sum_{i=1}^{r} rval(e_i)}{\sum_{i=1}^{r} size(e_i)}    (2)

In the above equation, size(e_i) is the size of the element retrieved at rank i, and |L| is the length of the retrieved element list [7].
Recall at rank r is measured as the fraction of relevant information retrieved at rank r, as shown in equation 3:

    R@r = \frac{1}{T_{rel}} \sum_{i=1}^{r} rval(e_i)    (3)

In the above equation, T_{rel} is the total amount of highlighted relevant text for a topic [7].

3.2 MAiP

For each topic, AiP is calculated by averaging the iP scores at 101 standard recall levels:

    AiP = \frac{1}{101} \sum_{x = 0.00, 0.01, \ldots, 1.00} iP[x]    (4)

Here, iP at standard recall level x is the highest precision obtained at that recall level or at any following recall level. Mean average interpolated precision (MAiP) is calculated by averaging the AiP values over all topics [5]. MAiP for n topics is shown in equation 5:

    MAiP = \frac{1}{n} \sum AiP    (5)

3.3 F Measure

Using the F measure, precision and recall values can be converted to a single score. By comparing the scores obtained from the F measure, it is possible to tell which system is capable of retrieving more relevant information without retrieving a significant amount of irrelevant information [7]. The F measure is calculated as in equation 6:

    F measure = \frac{2 \cdot P@r \cdot R@r}{P@r + R@r}    (6)
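For concreteness, equations 1-5 can be sketched as follows. This is a minimal illustration under the definitions above, not the official INEX evaluation software; it assumes each retrieved element is given as its rsize, its size, and the ranks of previously retrieved elements it overlaps, and that the interpolated precision values iP[x] have already been computed per topic.

def rval_list(run):
    # Equation 1: relevance value per rank, discounting overlap with
    # previously retrieved elements. Each item of `run` is a tuple
    # (rsize, size, overlaps), where `overlaps` lists earlier ranks.
    rvals = []
    for rsize, _size, overlaps in run:
        discount = sum(rvals[j] for j in overlaps)
        rvals.append(rsize if not overlaps else rsize - discount)
    return rvals

def precision_at(run, r):
    # Equation 2: retrieved relevant text over all retrieved text.
    rv = rval_list(run)[:r]
    total = sum(size for _rsize, size, _ov in run[:r])
    return sum(rv) / total

def recall_at(run, r, t_rel):
    # Equation 3: retrieved relevant text over all relevant text.
    return sum(rval_list(run)[:r]) / t_rel

def aip(ip):
    # Equation 4: average of iP[x] over the 101 standard recall
    # levels 0.00, 0.01, ..., 1.00 (ip is a list of 101 values).
    return sum(ip) / 101.0

def maip(aip_per_topic):
    # Equation 5: mean AiP over all n topics.
    return sum(aip_per_topic) / len(aip_per_topic)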

3.4 Arithmetic and Geometric Means

The arithmetic and geometric means of the precision and recall values are calculated as in equations 7 and 8:

    Arithmetic mean = \frac{P@r + R@r}{2}    (7)

    Geometric mean = \sqrt{P@r \cdot R@r}    (8)

4 System Rankings with Evaluation Measures

In this section, the rankings of XML retrieval systems for the INEX 2010 runs on the data centric track are computed according to the three methods of averaging precision and recall, and compared with the system ranking obtained from MAiP.
First, we calculated precision, recall and MAiP for the XML retrieval systems of INEX 2010 IMDB. Afterwards, the systems were evaluated by calculating the F measure and the arithmetic and geometric means, and ranked based on these measures.
INEX selected 1, 2, 5, 10, 25 and 50 as rank cut-offs, so we use these cut-off points too.
Tables 1, 2 and 3 show the Spearman correlation coefficients calculated from the run orderings of the 27 submitted runs, respectively for the arithmetic mean, the geometric mean and the F measure at the rank cut-offs, for INEX IMDB 2010. The best results for each method are shown in bold. We observe that the results of each of the three averaging methods are strongly correlated with MAiP.

Table 1. Spearman correlation coefficients calculated from the run orderings obtained from arithmetic mean at rank cut-offs and MAiP

     | a@1  | a@2  | a@5  | a@10 | a@25 | a@50
MAiP | 0.87 | 0.85 | 0.92 | 0.93 | 0.92 | 0.94

Table 2. Spearman correlation coefficients calculated from the run orderings obtained from geometric mean at rank cut-offs and MAiP

     | G@1  | G@2  | G@5  | G@10 | G@25 | G@50
MAiP | 0.84 | 0.85 | 0.93 | 0.93 | 0.92 | 0.93

Table 3. Spearman correlation coefficients calculated from the run orderings obtained from F measure at rank cut-offs and MAiP

     | F@1  | F@2  | F@5  | F@10 | F@25 | F@50
MAiP | 0.80 | 0.86 | 0.90 | 0.90 | 0.89 | 0.91

The graph of Figure 2 provides a detailed overview of the observed correlation between the run orderings generated by MAiP and the arithmetic mean of precision and recall at rank 50, using the 27 runs of INEX IMDB 2010. The Spearman correlation coefficient is 0.94.

Figure 2: Correlation between run orderings by MAiP and arithmetic mean of precision and recall at rank 50

The graph of Figure 3 provides a detailed overview of the observed correlation between the run orderings generated by MAiP and the geometric mean of precision and recall at rank 10, using the 27 runs of INEX IMDB 2010. The Spearman correlation coefficient is 0.93.

Figure 3: Correlation between run orderings by MAiP and geometric mean of precision and recall at rank 10

The graph of Figure 4 provides a detailed overview of the observed correlation between the run orderings generated by MAiP and the F measure at rank 50, using the 27 runs of INEX IMDB 2010. The Spearman correlation coefficient is 0.91.

Figure 4: Correlation between run orderings generated by MAiP and F measure at rank 50
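The correlations in Tables 1-3 can be reproduced along the following lines. This is a minimal sketch; scipy is an assumption here (the authors' toolchain is not stated), and the per-run MAiP and averaged precision/recall scores are assumed to have been computed already, one value per run in the same run order.

from scipy.stats import spearmanr

def ordering_correlation(maip_scores, averaged_scores):
    # spearmanr ranks both score lists internally, so this compares
    # the run ordering induced by MAiP with the ordering induced by
    # an averaged precision/recall measure at some rank cut-off.
    rho, _pvalue = spearmanr(maip_scores, averaged_scores)
    return rho

# e.g. the Table 1 entry a@50 corresponds to
# ordering_correlation(maip, [(p + r) / 2 for p, r in pr_at_50])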

5 Conclusions and Future Works

In this paper, the rankings of XML retrieval systems based on three metrics have been compared. For measuring the amount of correlation between the results obtained from the metrics, Spearman correlation coefficients were used. The results of this study show that the averaged values of precision and recall at some rank cut-offs are strongly correlated with the MAiP measure. Despite the importance of research on overcoming the weaknesses of existing metrics and of efforts to create new metrics, the results of the simplest definitions are very close to those of the best existing metrics. According to the results shown in the tables in section 4, the arithmetic mean of precision and recall at rank cut-off 50 produced the best results. Hence it can be an appropriate baseline for comparing the results of metrics that are created in the future. In the future, we want to expand this research with the Wikipedia collection and more cut-off points.

References

[1] J. Pehcevski and B. Piwowarski, Evaluation Metrics for Semi-Structured Text Retrieval (2009).
[2] M. Lalmas and A. Tombros, INEX 2002-2006: Understanding XML Retrieval Evaluation: DELOS'07 Proceedings of the 1st International Conference on Digital Libraries: Research and Development, Springer, Berlin/Heidelberg (2007), 187-196.
[3] N. Fuhr, N. Govert, G. Kazai, and M. Lalmas, INEX: Initiative for the Evaluation of XML Retrieval: Proceedings of the SIGIR 2002 Workshop on XML and Information Retrieval (2002).
[4] A. Trotman and Q. Wang, Overview of the INEX 2010 Data Centric Track: Lecture Notes in Computer Science, Springer, Berlin/Heidelberg 6932/2011 (2011), 171-181.
[5] J. Kamps, J. Pehcevski, G. Kazai, M. Lalmas, and S. Robertson, INEX 2007 evaluation measures, Springer, Heidelberg 4862 (2008), 24-33.
[6] J. Pehcevski and J.A. Thom, HiXEval: Highlighting XML Retrieval Evaluation. In Advances in XML Information Retrieval and Evaluation: Fourth Workshop of the Initiative for the Evaluation of XML Retrieval, INEX 2005, Springer, Berlin/Heidelberg 3977/2006 (2006), 43-57.
[7] J. Pehcevski, Evaluation of Effective XML Information Retrieval, PhD thesis, Chapter 5, pages 149-184, 2006.

Performability Improvement in Grid Computing with Artificial Bee Colony Optimization Algorithm

Neda Azadi
Islamic Azad University of Qazvin
Department of Electrical, IT & Computer Science
Qazvin, Iran
Neda.Azadi@qiau.ac.ir

Mohammad Kalantari
Islamic Azad University of Qazvin
Department of Electrical, IT & Computer Science
Qazvin, Iran
md.kalantari@aut.ac.ir

Abstract: Modeling and evaluating a grid computing environment is very difficult because of its complexity and distributed nature. The present paper studies the evaluation of the performability of grid computing. Here, a tree structure is assumed for the grid, with the RMS at its root. Users give their tasks as well as their requirements to the RMS and finally take back the results from it. The RMS divides the task into smaller parallel subtasks in order to get better performance. Transferring each parallel subtask to several resources also increases its reliability. Analysis of the system by means of reliability and performance measures together is called performability. The performability improvement is directly related to the resource allocation among subtasks. In this paper, we present an algorithm for resource allocation based on the artificial bee colony optimization algorithm. The most important step in optimizing algorithms is to define the objective function that should be solved with the optimizing algorithm; in this paper, the objective function is the performability improvement. Since a tree structure is used in the resource allocation problem, Bayesian logic and graph theory are also used.

Keywords: RMS; performability; Bayesian model; graph theory; optimization; artificial bee colony; swarm intelligence.

1 Introduction

Grid computing [2] has emerged as an important new field, distinguished from conventional distributed computing by its focus on large-scale resource sharing, innovative applications, and, in some cases, high-performance orientation. The Open Grid Services Architecture (OGSA [1]) enables the integration of services and resources across distributed, heterogeneous, dynamic virtual organizations and service-provider relationships. This feature of OGSA gives the grid the opportunity to satisfy its users' requirements with the best QoS and performance, so providing for the grid users' requirements is of great importance. The grid users' requirements come at different levels and combine high performance and reliability, which together are called performability.
The users give their desired levels of performance and reliability requirements to the RMS [3]; the RMS divides a user's task into parallel subtasks and then distributes the subtasks among the available resources according to the resource conditions and the level of the user's requirements. This is resource allocation. After performing the subtasks, the resources give the results back to the RMS, and finally they are delivered to the users.
Performance, which is mostly interpreted as the execution time, is affected by factors such as the number of available resources and the reliability of the resources and communication channels [5]. It is evident that performance and reliability affect each other; however, they were evaluated separately in the past [4]. In this paper these two measures are considered simultaneously in a grid with a tree structure.

Corresponding Author, P. O. Box 7153744414, T: (+98) 917 110 0291

If a task breaks into n parallel subtasks, the execution time will decrease. But in a real situation, which is not devoid of failure, any failure in a subtask makes the whole task execution problematic. In order to solve this problem, increase reliability alongside performance, and bring these two measures into harmony, we assign each subtask to several resources. In this way, if a failure occurs, a subtask can be performed by other resources and the probability of the flawless accomplishment of the main task increases [5].
In paper [10], the evaluation of performability is studied in a grid with a star structure. The models used in the evaluation of system performance and reliability are queuing networks [12], stochastic Petri nets [14], Bayesian models [16] and Markov models [13]. Each of the above models can be evaluated by analysis or simulation methods [15]. Like paper [5], we use the Bayesian method for evaluation.
As mentioned, one way to increase the grid's performability is to optimize the resource allocation among subtasks. In paper [6] this is done with a genetic algorithm [25]. Nevertheless, in the present paper we have made use of the artificial bee colony, since it is simpler and more flexible than the genetic algorithm.
The rest of the paper is organized as follows. Section 2 presents a model for the evaluation of reliability and performance. The artificial bee colony algorithm is explained in the third part. The results of the optimization are presented in part 4, and in the final part a comparison between the genetic optimization algorithm and the artificial bee colony optimization algorithm is presented.

2 Performance and Reliability Models

Few studies have been done on grid performability, since the complexity of grids challenges their modeling and evaluation [8]. In this part, the grid is evaluated from a performability point of view. In order to utilize this model, the following hypotheses are needed [5, 6, 9]:

- The requirements are taken into account immediately; therefore, no time is wasted.
- The RMS divides each task into several subtasks.
- The resources are automatically registered in the RMS.
- Each resource is assigned to exactly one subtask.
- Each resource has a constant processing speed and failure rate. Each communication channel has a constant failure rate and bandwidth.
- The failure rates of processing resources and communication channels do not change during the activity period.
- Failures in resources and communication channels follow a Poisson process. The probability of failure depends on the time of information transfer and input processing. In fact, the exponential distribution is a general distribution in the reliability analysis of software and hardware components that is both theoretically and practically acceptable [9].
- The failures of resources and communication channels are independent.
- If a failure happens in a resource or communication channel before the transfer of outputs from the resource to the RMS, the whole task encounters failure.
- The resources start processing the subtasks immediately after receiving them. This way, there is no waiting queue at the resource; consequently there is neither waste of time nor waiting time.
- The whole task is not completely performed unless the results of all subtasks are delivered to the RMS.
- If information transfers via several communication channels, the transfer speed is limited by the link with the least bandwidth.
- The RMS is thoroughly reliable and its failure probability is assumed to be zero.
- The time of input transfer depends on the amount of inputs that should be transferred. The time of subtask processing depends on the complexity of the calculation.

According to the assumptions above, when subtask j is assigned to a resource i, the processing time is a random variable that can be calculated from the relation

    T_{ij} = C_j / x_i

where x_i is the processing speed of resource i and C_j is the computational complexity of subtask j. Suppose data transmission between the RMS and resource i is accomplished through the links belonging to a set \gamma_i. If s_i is the bandwidth of the slowest link in \gamma_i, and a_j denotes the amount of data that should be transmitted for subtask j, then the random time of communication between the RMS and the resource i that executes subtask j can be calculated from the relation

    \tau_{ij} = a_j / s_i

For a constant failure rate, the probability that resource i does not fail until the completion of subtask j can be obtained as

    p_{ij} = e^{-\lambda_i T_{ij}}

where \lambda_i is the failure rate of resource i. Given a constant failure rate, the probability that the communication channel between the RMS and resource i does not fail until the completion of subtask j is

    q_{ij} = e^{-\mu_i \tau_{ij}}

where \mu_i is the failure rate of the communication channel between the RMS and resource i.
The random total completion time for subtask j assigned to resource i is equal to T_{ij} + \tau_{ij}, and the probability that this completion time is realized without failure is p_{ij} \cdot q_{ij}.
During the division of a task into parallel subtasks, several combinations for performing the task come into existence. Each combination is called a realization [5]. Each realization performs the task in a deterministic time with a specific probability. The reliability of a task is defined as the probability that the correct output is produced according to the user's requirements. Depending on whether or not the user has specified an execution time requirement, there are two ways to calculate the reliability:
1. If the user has requested a time limit \theta^* for the execution, the reliability of the task can be calculated through the relation

    R(\theta^*) = \sum_{i=1}^{I} Q_i \cdot 1(\theta_i < \theta^*)    (1)

2. If the user has not determined the execution time, the reliability can be calculated as

    R(\infty) = \sum_{i=1}^{I} Q_i    (2)

In these relations, I is the number of realizations of the task, \theta_i is the execution time of the task under realization i, and Q_i is the probability that the task is performed by realization i in time \theta_i.
The conditional expected service time W is considered a measure of performance; it determines the expected service time, given that the service does not fail:

    W = \frac{1}{R(\infty)} \sum_{i=1}^{I} \theta_i Q_i    (3)

R(\infty) is defined as the probability of producing the correct outputs without respect to the service time.
A tree is composed of the combination of resources and communication channels taking part in the execution of a task. Each tree contains several minimal spanning trees (MSTs) that guarantee the complete execution of the task by the subtasks. If any composing part of a tree encounters a failure, the whole task is jeopardized. As any task is divided into parallel subtasks, different realizations (MSTs) are formed. The execution time of any MST is determined by the features of the grid, such as the bandwidth of the communication channels, the processing speed of the performing resources, and so on.
The probability of performing the task by each tree is measured after arranging the MSTs by execution time in increasing order. The execution of a task can be transferred to the next MST in the list if the previous MST encounters a failure.
According to the model of [5] that was briefly described above, and on the basis of conditional probability, the probability of performing the task by each MST (Q_i) used in relations 1 and 3 can be deduced through the following relation:

    Q_i = \Pr(E_i, \bar{E}_{i-1}, \bar{E}_{i-2}, \ldots, \bar{E}_1)    (4)

where E_i is the event that MST_i is available and \bar{E}_i is the event that MST_i is not available. A binary search tree can be used to calculate relation 4.
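A minimal sketch of this model follows (illustrative, not the authors' code). It assumes the realizations have already been reduced to (theta_i, Q_i) pairs, and uses the exponential failure model above for the per-subtask survival probability; all parameter names mirror the relations of this section.

import math

def subtask_survival(C_j, x_i, a_j, s_i, lam_i, mu_i):
    # Completion time T_ij + tau_ij, and the probability p_ij * q_ij
    # that neither the resource nor the channel fails before the
    # subtask finishes.
    T = C_j / x_i             # processing time, T_ij = C_j / x_i
    tau = a_j / s_i           # communication time, tau_ij = a_j / s_i
    p = math.exp(-lam_i * T)  # resource survives the processing
    q = math.exp(-mu_i * tau) # channel survives the transfer
    return T + tau, p * q

def reliability(realizations, deadline=None):
    # Relations 1 and 2: sum of Q_i, optionally restricted to the
    # realizations that finish before the user's deadline theta*.
    return sum(Q for theta, Q in realizations
               if deadline is None or theta < deadline)

def expected_service_time(realizations):
    # Relation 3: conditional expected service time W, given success.
    R_inf = reliability(realizations)
    return sum(theta * Q for theta, Q in realizations) / R_inf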

3 Optimizing Technique

Resource allocation in grids is a complicated problem [6]. To optimize it, we should make use of metaheuristic methods [17]. Metaheuristic algorithms are placed in the group of approximate algorithms and have the ability to escape local optima. The application of Swarm Intelligence (SI) [18] in metaheuristic algorithms is a method vastly used for complicated problems. SI is a kind of artificial intelligence that is formed according to collective behavior in distributed and self-organized environments [21]. This intelligence is inspired by natural behaviors. Examples of such intelligence are the ant colony optimization algorithm [19], the particle swarm optimization algorithm [24], bee colony optimization algorithms [20] and the cuckoo optimization algorithm.
The artificial bee colony (ABC) algorithm is one of the newest and most applied optimization algorithms because of its simplicity and few control variables; researchers have paid special attention to it since 2005. In the ABC algorithm, each cycle of the search consists of three steps: moving the employed and onlooker bees onto the food sources and calculating their nectar amounts, and determining the scout bees and directing them onto possible food sources. A food source position represents a possible solution to the problem to be optimized. The amount of nectar of a food source corresponds to the quality of the solution represented by that food source (the fitness function). Onlookers are placed on the food sources using a probability-based selection process: as the nectar amount of a food source increases, the probability with which the food source is preferred by onlookers increases, too [21].
The main steps of the algorithm are given below [20]:

1: Initialize population
2: repeat
3: Place the employed bees on their food sources
4: Place the onlooker bees on the food sources depending on their nectar amounts
5: Send the scouts to the search area for discovering new food sources
6: Memorize the best food source found so far
7: until requirements are met

The selection is controlled by a control parameter called limit. If a solution representing a food source is not improved within a predetermined number of trials, that food source is abandoned by its employed bee and the employed bee is converted to a scout.

3.1 Description of the problem and its solution

Here we want to use the ABC algorithm to optimize the problem of resource allocation in a grid with the aim of improving its performability. To do this, we set the control variables maxcycle and limit of the algorithm to 100 and 20, respectively. The three main stages of this algorithm are performed over 100 cycles in order to reach the optimized solution. The optimized solution is the best distribution of resources among the subtasks, gaining the highest degree of reliability.
In each cycle, 10 new solutions are produced. These solutions are, in fact, the food sources around the hive among which the bees search for the best one. The produced solutions are saved in arrays to be evaluated. The subtask number allocated to each resource is recorded in the solution array. For example, if we have 3 subtasks and 9 execution resources such that the first subtask is allocated to resources 1, 4, 6 and 7, the second subtask to resources 2 and 9, and the third subtask to resources 3, 5 and 8, then the solution array would be as below:

[1 2 3 1 3 1 1 3 2]

Ten random solutions are produced in the initialization phase. The first solution is presupposed to be the most optimized one. Then, over the 100 cycles, the employed bees, scout bees and onlooker bees produce the next solutions in order to reach the most optimized one. A new solution is made from an old solution by changing a randomly chosen position of the array, which holds the allocated subtask number, to another subtask number. When making new solutions, we should pay attention to the fact that some solutions are not valid; for instance, there should be just one subtask number in each position of the solution. An invalid solution should be replaced by a new valid solution. In each stage of the movement of the employed bees and onlooker bees, the fitness of each solution is evaluated by relation 2, which was introduced in the second part of the paper, and the best solution found is memorized in the global optimum variable. The limit control variable is used to control inappropriate solutions: if a solution does not improve the problem-solving procedure, the limit counter is incremented by one unit; if the limit variable goes beyond a predetermined number, it means that the current solution is no longer appropriate and the scout bees should replace it with a new solution.

4 Evaluating the Work of the Optimizing Algorithm

In order to evaluate the performance of the ABC optimizing algorithm on the resource allocation problem, we need to define a context for running the algorithm. Since our goal is to compare the results of this optimizing algorithm with the genetic optimizing algorithm, we use the context of paper [6].
As in [6], the task is broken into 3 subtasks by the RMS. The amount of complexity and the amount of transferred data of each subtask are shown in Tables 1 and 2.

Table 1: Amount of Complexity of Each Subtask

SB1 | 38.94%
SB2 | 25.44%
SB3 | 35.62%

Table 2: Amount of Transferred Data of Each Subtask

SB1 | 250 MB
SB2 | 350 MB
SB3 | 400 MB

There are 9 processing resources connected together in a tree structure. Figure 1 shows the grid environment. In this figure, the rates of resource failures and communication channel failures, the information transfer speeds and the resource processing speeds are also exhibited.

Figure 1: Structure of evaluating grid [6]

A program for the ABC algorithm was written in Java and executed on a Pentium IV 1.5G processor. It takes about 1.50 minutes, and this convergence time is better than that of the genetic optimization algorithm. The results for the near-optimal subtask distribution among the nine resources are given in Table 3. The resulting reliability and performance for the near-optimal solutions found are respectively 0.9799 and 41.10. As exhibited in the diagram in Figure 2, the reliability resulting from ABC is better than that of the genetic algorithm. This result shows that the ABC is a more appropriate solution for resource allocation among the subtasks.

Table 3: Distribution of Optimal Solution

Subtask | Amount of complexity | Distribution
SB1     | 38.94                | R1, R3, R7
SB2     | 25.44                | R2, R5
SB3     | 35.62                | R4, R6, R8, R9

Figure 2: Diagram of comparison of the two optimizing algorithms

4.1 The effect of bandwidth on evaluation measures

We have two scenarios for limiting the communication channel when calculating the quality of the resources in the ABC algorithm:
1: the bandwidth of a communication path is taken to be the minimum of the existing channels;
2: the bandwidth of a communication path is taken to be the average of the existing channels.
If we use the latter scenario, both the reliability and the performance appear to increase. In Figure 3, the effect of using the average communication channel bandwidth is shown.

Figure 3: Diagram of the comparison of the bandwidths' influence

As shown in the above diagram, the reliability of performing the task is increased by using the average bandwidth instead of limiting the channels to the minimum.
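A minimal sketch of the two scenarios (illustrative only): the effective path bandwidth s_i that enters tau_ij = a_j / s_i is taken either as the slowest link on the path (scenario 1) or as the average link (scenario 2).

def effective_bandwidth(link_bandwidths, scenario="min"):
    # Scenario 1: the path is as fast as its slowest link.
    if scenario == "min":
        return min(link_bandwidths)
    # Scenario 2: use the average bandwidth of the path's links.
    return sum(link_bandwidths) / len(link_bandwidths)

# tau_ij = a_j / effective_bandwidth(path_links, "min" or "avg")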

4.2 The effect of users' requirements on the optimizing algorithm

As previously mentioned, users' requirements for execution time are different. Some tasks should be performed within a time limit, while others are supposed to be done correctly without any time constraint. Therefore, the user will be more pleased with the result if the requirements are analyzed in addition to selecting and allocating the resources; this way, the resources and their power will be used in an appropriate manner [7]. So, in this part, we consider the proposed algorithm with respect to users' requirements.
Imagine that, in the previous context, a user delivers the task along with a constraint on execution time (a deadline) to the RMS. Such a limitation of time directly affects the task's reliability since, in such a situation, the subtasks are only assigned to those realizations that can perform them in a shorter time than the user's deadline. In the diagram shown in Figure 4, the three following conditions are compared for a particular distribution. If each execution time limit is between 40 and 120 seconds, the different conditions can be defined as follows:
First condition: the user's deadline is 70 seconds.
Second condition: the user's deadline is 100 seconds.
Third condition: no deadline is defined.

Figure 4: Diagram of the influence of users' requirements on reliability

As the figure shows, more time limits in the user's requirements, i.e. less time allowed for task execution, lead to less reliability.

5 Conclusion and Future Work

The problem of resource allocation is highly complicated because the complexity and distribution of computational grids exceed those of other distributed environments. It is not possible to optimize such difficult problems with common algorithms; rather, metaheuristic optimization algorithms are more useful. The most important step in optimizing algorithms is to define the objective function that should be solved with the optimizing algorithm. In this paper, the objective function is the simultaneous increase of the two measures of performance and reliability or, in other words, performability. The grid user delivers the intended task as well as requirements (optionally) to the RMS. After dividing the task into parallel subtasks, the RMS allocates the best resources to the subtasks using an optimizing algorithm, and in the meanwhile it considers the processing resources and communication channels and their attributes, such as failure rate, bandwidth, processing speed and so on. Therefore the task can be performed with the highest performability.
As a future direction for optimizing the problem of resource allocation in grids, we can use other metaheuristic algorithms that are inspired by nature, like the artificial immune system, ant colony, particle swarm, etc., and then compare the results. The exponential distribution is a general distribution in the reliability analysis of hardware and software components, but it has a constant rate, while in a real environment the failure rate is a time-varying parameter. Therefore, the use of other appropriate distributions for failures can be studied in the future.

References

[1] I. Foster, C. Kesselman, and S. Tuecke, The anatomy of the grid: Enabling scalable virtual organizations, International Journal of High Performance Computing Applications 15 (2001), 200-222.
[2] I. Foster, D. Becker, and C. Kesselman, The grid 2: Blueprint for a new computing infrastructure, San Francisco, CA: Morgan-Kaufmann, 2003.
[3] K. Krauter, R. Buyya, and M. Maheswaran, A taxonomy and survey of grid resource management systems for distributed computing, Software-Practice and Experience 32(2) (2002), 135-164.
[4] I. Eusgeld, J. Happe, and P. Limburg, Performability. In: Dependability Metrics, Springer, Berlin/Heidelberg (2008), p. 254.
[5] Y.S. Dai and G. Levitin, Reliability and performance of tree-structured grid services, IEEE Transactions on Reliability 55(2) (2006), 337-349.
[6] Y.S. Dai and G. Levitin, Optimal Resource Allocation for Maximizing Performance and Reliability in Tree-Structured Grid Services, IEEE Transactions on Reliability 56(3) (2007), 444-453.
[7] L. Ramakrishnan and D.A. Reed, Performability Modeling for Scheduling and Fault Tolerance Strategies for Scientific Workflows, HPDC '08: Proceedings of the 17th International Symposium on High Performance Distributed Computing.
[8] S. Jarvis, N. Thomas, and A.V. Moorsel, Open issues in grid performability, I.J. of Simulation 5(5) (2005), 3-12.
[9] Y.S. Dai, M. Xie, and K.L. Poh, Reliability Analysis of Grid Computing Systems, Proc. Ninth IEEE Pacific Rim Int'l Symp. on Dependable Computing (PRDC'02) (2002), 97-104.
[10] G. Levitin and Y.S. Dai, Performance and reliability of a star topology grid service with data dependency and two types of failure, IIE Transactions 39(8) (2007), 783-794.
[11] A. Heddaya and A. Helal, Reliability, Availability, Dependability and Performability: A User-centred View, Technical Report BU-CS-97-011, Boston University (1996).
[12] L. Kleinrock, Queueing Systems, Volume 1: Theory, Wiley (1975).
[13] M. Bernardo and M. Bravetti, Performance Measurement Sensitive Congruences for Markovian Process Algebras, Theoretical Computer Science 290 (2003), 117-160.
[14] M. Ajmone Marsan, G. Balbo, and G. Conte, A Class of Generalized Stochastic Petri Nets for the Performance Evaluation of Multiprocessor Systems, ACM Transactions on Computer Systems 2 (1984), 93-122.
[15] J. Banks, J. Carson II, B. Nelson, and D. Nicol, Discrete-Event System Simulation, Prentice-Hall, 1999.
[16] J.G.T. Toledano and L.E. Sucar, Bayesian Networks for Reliability Analysis of Complex Systems, Springer-Verlag, Berlin/Heidelberg (1998).
[17] E.G. Talbi, Metaheuristics: from design to implementation, Wiley, 2009.
[18] G. Beni and J. Wang, Swarm Intelligence in Cellular Robotic Systems, Proceedings, NATO Advanced Workshop on Robots and Biological Systems, Italy (1989).
[19] M. Dorigo and T. Stutzle, Ant Colony Optimization, MIT Press, 2004.
[20] D. Karaboga, Artificial bee colony algorithm, Scholarpedia 5(3) (2010), 6915.
[21] D. Karaboga, An Idea Based on Honey Bee Swarm for Numerical Optimization, Technical Report TR06, Erciyes University, Engineering Faculty, Computer Engineering Department, Turkey (2005).
[22] D. Karaboga and B. Akay, A comparative study of Artificial Bee Colony algorithm, Applied Mathematics and Computation 214(1) (2009), 108-132.
[23] D. Karaboga and B. Akay, A survey: algorithms simulating bee swarm intelligence, Artificial Intelligence Review (2009).
[24] M. Clerc, Particle Swarm Optimization, ISTE, 2006.
[25] D.E. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning, Kluwer Academic Publishers, Boston, MA, 1989.

Security Enforcement with Language-Based Security

Ali Ahmadian Ramaki
University of Guilan, Rasht, Iran
Department of Computer Engineering
ahmadianalir@msc.guilan.ac.ir

Shahin Shirmohammadzadeh Sahraeii
University of Guilan, Rasht, Iran
Department of Computer Engineering
sahraei.shahin@gmail.com

Reza Ebrahimi Atani
University of Guilan, Rasht, Iran
Department of Computer Engineering
rebrahimi@guilan.ac.ir

Abstract: Language-based security is a mechanism for analyzing and rewriting applications so as to guarantee security policies. With such a mechanism, issues like access control that employ a trusted computing base would run correctly. Previously, most security problems in software applications were handled by this component, owing to the small size and limited complexity of the operating system kernel. These days, by virtue of the increasing size of OS applications and their natural complexity, this task is fulfilled by newly proposed mechanisms, one of which is security enforcement using programming language techniques to apply security policies to a specific application. Language-based security includes subdivisions such as the in-lined reference monitor, the certifying compiler and improvements to type systems, which are described individually later.

Keywords: security; security policy; programming languages; language-based security.

1 Introduction

With the growing use of the Internet, the security of mobile code is one of the important challenges in today's computational research. Our increasing dependency on large global networks such as the Internet, receiving their services for personal routines, spreading information over these networks, and even downloading from this perilous area is potentially susceptible to destructive attacks from attackers and may be followed by irrecoverable effects. We have not forgotten pernicious attacks such as Melissa and Happy99, how exhaustive their outcomes were, and that careful attention is needed while downloading plug-ins in Internet packages [7]. Recent research shows these types of security issues are on the rise. Today, with respect to the expansion of computational environments, the safety of mobile code is indispensable. For instance, having downloaded an application from the Internet from an unknown source, how could we warrant that it does not carry an unwanted payload which may put system safety at risk? One way of handling this situation is the use of language-based security. In this method, security information about an application programmed in a high-level language is extracted during compilation of the application, producing a compiled object. The extra security information includes formal proofs, notes about types, or other verifiable documents. The compiled object is created alongside the destination code, and before the main code runs it is automatically examined to warn of type errors or unauthorized acts. The Java bytecode verifier is an example of this. The chief challenge is how to create such mechanisms such that, in the first place, they have the desirable performance and, in the second place, they reveal as little as possible to others [1].
In the remainder of the paper, the literature is reviewed in section 2. Traditional approaches to enforcing security in computer systems are investigated in section 3. In section 4 the language-based security framework is described, in section 5 the corresponding techniques are explained, and finally in section 6 the conclusion is drawn.

Corresponding Author, P. O. Box 41635-3756, F: (+98) 131 6690 271, T: (+98) 131 6690 274-8 (Ex 3017)

2 Two Principles in Computer Security

To understand language-based security more accurately, we need to introduce two principles of computer security systems and describe them in detail [6]:
I. Principle of Least Privilege (PoLP): while enforcing policies, each principal should be given the least possible access required;
II. Minimal Trusted Computing Base (MTCB): the components which must operate properly to guarantee the execution system's properties, such as the operating system kernel and hardware, should be kept small; that is, the mechanism in use fulfills big tasks while staying small. Smaller and simpler systems have fewer errors and improper interactions, which is quite appropriate for establishing safety.

3 Traditional Approaches to Apply Security

Traditional approaches to the safety issue in computer systems include: I) utilizing the OS kernel as a reference monitor; II) cryptography; III) code instrumentation; IV) trusted compilation. These mechanisms offer a constant set of preliminary security policies with low flexibility. Below we scrutinize them in detail.
I. Utilizing the operating system kernel as a reference monitor: this method is the oldest but the most thorough mechanism in use to guarantee security policies in software systems; it mediates every action on the data and critical components of the system through the operating system kernel. The kernel is an indispensable component of the operating system code, accessing vital components and data directly. The remaining programs are constrained in accessing these data and components, such that the kernel plays the role of a proxy interchanging messages for communication;
II. Cryptography: this method makes it possible to install safety at the data transmission level in an unreliable network, using the receiver as a verifier. The power of cryptographic methods rests on complexity hypotheses; the Digital Encryption Standard (DES) is susceptible to violation given a sufficient amount of attacking effort. Cryptography thus cannot guarantee that code downloaded from a network is safe. It is only able to provide a safe transmission channel for this code through the Internet, avoiding intrusions and suspicious interference;
III. Code instrumentation: another approach, practiced by the operating system in some systems, inspects the safety of a program from various aspects such as writes, reads and program jumps. Code instrumentation is a process through which the machine code of an executed program is changed so that the main action can be overseen during execution. Such changes to the sequence of the program's machine code work for two reasons: first, the behaviors of the changed code and the initial code are equal whenever the initial code does not violate the safety policy; and second, if a violation by the initial code occurs, the changed code is immediately able to handle the situation in one of two ways: either it recognizes the violation, takes control from the system and terminates the destructive process, or it prevents the fatal effects which would otherwise soon affect the system. For instance, suppose a program needs to be run on a machine with certain hardware specifications, and assume the program is loaded within a continuous space of memory addresses [c*2^k, c*2^k + 2^k - 1], where c and k are integer numbers. The program is then linked to run and, after obtaining the destination code, indirect memory accesses and jumps are altered so that they remain within this address space; the code in question is then ready to run [4];
IV. Trusted compiler: this method is fulfilled by a component known as a trusted compiler. By limiting the code's access, the compiler attempts to generate code which can be trusted. There are two alternatives for the operating system kernel to warrant the reliability of the compiler.
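The sandboxing idea behind code instrumentation can be sketched in a few lines. This is a minimal illustration only (the cited systems rewrite machine code inline, not Python): every indirect address is forced into the segment [c*2^k, c*2^k + 2^k - 1] by overwriting its upper bits with the segment identifier c.

def sandbox(addr, c, k):
    # Force addr into the segment by replacing its upper bits with c;
    # the low k bits of addr are kept, so the result always lies in
    # [c * 2**k, c * 2**k + 2**k - 1].
    return (c << k) | (addr & ((1 << k) - 1))

# e.g. with c = 5 and k = 16 every address lands in [0x50000, 0x5FFFF];
# an instrumented program applies this (or an equivalent check) before
# each indirect read, write or jump, so a violation can never leave
# the predefined area.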

4 Language-Based Security

In computer systems, the compiler usually translates a program written in a high-level language, and the assembler of the destination machine then issues the hex code of the program to the hardware to let it start. The compiler obtains information about a program while compiling it. This information includes variable values, types or other specified information, and it may be analyzed and modified in order to optimize the destination code produced by the compiler. After successful compilation, extra information is often gathered which can provide evidence about the security of the compiled destination code; for example, in case the program is written in a safe language, the type-checking filter must complete successfully before compilation finishes. So this security information should also be generated alongside the destination code during the compilation process. The information, as a certificate, is created before program execution, and it is checked before the produced destination code executes, to ensure that the security policies of the specific convention are met. This process is shown in Fig. 1. The term language-based security refers to such extra information, extracted from a program written in a high-level language while compiling it; this extra information package is also called a certificate. When downloading an application from the Internet or any other unsafe medium, this package of extra information is downloaded as well. The code consumer is able to invoke a verifier program before running the application to confirm the certificate and the code, and then run it.

Figure 1: Overview of Language-Based Security

Code providers take advantage of various techniques to produce such a certificate. Some of the most important ones are:
I. Proof Carrying Code (PCC): the certificate produced by the code provider is a first-order logic proof in which a set of safe conditions for running the code is supplied, and the user checks their correctness on the downloaded application while running the code;
II. Typed Assembly Language (TAL): the certificate is a type annotation, such that the verification process on the user's side inspects the code structure in terms of types;
III. Efficient Code Certification (ECC): in this approach the certificate contains extra information about the destination code, checking concept structures and code objectives according to type theory information.
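The consumer-side flow of Figure 1 can be sketched as follows. This is a deliberately toy illustration: the "certificate" here is just the set of operations the code claims to use, standing in for a real proof or type annotation, and ALLOWED stands in for the consumer's safety policy.

ALLOWED = {"LOAD", "ADD", "STORE"}    # the consumer's safety policy

def verify(code, certificate):
    # The certificate claims the operations the code uses; the
    # verifier re-derives them from the code and checks the policy.
    return set(code) == set(certificate) and set(code) <= ALLOWED

def run_if_verified(code, certificate):
    # Only verified code is handed to the execution environment.
    if not verify(code, certificate):
        raise RuntimeError("certificate rejected; code not executed")
    for op in code:
        print("executing", op)        # stand-in for real dispatch

run_if_verified(["LOAD", "ADD", "STORE"], ["LOAD", "ADD", "STORE"])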

5 Language-Based Security Techniques

A reference monitor observes a program's execution and stops the program if it violates the safety policies. Typical examples of reference monitors are operating systems (hardware monitors), interpreters (software monitors) and firewalls. Most safety mechanisms today employ a reference monitor.
I. In-lined Reference Monitor (IRM): in traditional approaches, the mechanism fulfilled by the operating system to supervise a program's flawless execution and its conformance to the target safety policies places the reference monitor and the target system in distinct address spaces. An alternative approach is the in-lined reference monitor; a similar task is performed by SFI, which fulfills the safety policy for the target system by stopping reads, writes and jumps outside a predefined area of memory [3]. One of the methods, thus, is to merge the reference monitor into the target application. An in-lined reference monitor is specified by the definitions below:
A. Security events: the actions to be supervised by the reference monitor.
B. Security state: the information stored when a security event occurs, according to which permission to progress is issued.
C. Security updates: the sections of the program that run in response to security events and update the security state.
SASI is the first generation of IRM, shown by research to be an approach that guarantees the policies in question. The first generation is programmed for 80x86 assembly and the second generation for Java [2]. SASI x86, which is compatible with 80x86 assembly, works on the assembly output of the gcc compiler. The destination code generated meets the two conditions below:
A. The program behavior never changes by adding NOPs.
B. The variables and addresses of branch targets, marked with tags by the gcc compiler, are matched during compilation.
So the first version is comprehensively employed in order to protect the program's memory data. In the second version of IRM, JVML SASI, the program is preserved in terms of type safety. JVML instructions provide information about the program's classes, instances, methods, threads and types. Such information can be utilized by JVML SASI to supply safety policies in applications [5]. The rewriting component in the IRM mechanism generates verifying code alongside the related destination code from this extra information [10].
II. Type System: the main objective is to prevent errors from occurring during execution. Such errors are identified by a type checker. This matters because a high-level program certainly has many variables: if the variables of a programming language always take values within a specific domain, we technically say the language is type safe. Assume a variable x in Java is defined as a Boolean; whenever it is set to False, the result of !x (not x) is True. If variables can end up with values in an undefined domain, we say the language is not type safe. In such languages we do not have distinct types but one global type including all possible types; an action applied to arguments may produce an arbitrary constant, an error, an exception or an unspecified effect [8]. A type system is the component of a type-safe language that holds the types of all variables, and the types of all expressions are computed during execution. Type systems are employed to decide whether a program is well-formed. Type-safe languages are known as explicitly typed if types are part of the syntax, and implicitly typed otherwise.
III. Certifying Compiler: a certifying compiler is a compiler that, for input satisfying a safety policy, generates a certificate as well as the destination code; the certificate is machine-checkable, i.e. the policies in question can be verified [9].
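The event/state/update vocabulary above can be illustrated with a small sketch. This is not SASI itself; the "at most three sends" policy, and all names in it, are hypothetical stand-ins for a real safety policy merged into the target application by a rewriter.

class PolicyViolation(Exception):
    pass

def make_irm(limit=3):
    state = {"sends": 0}                  # security state

    def guard(op):                        # security update, inlined
        def wrapped(*args, **kwargs):
            state["sends"] += 1           # update state on the event
            if state["sends"] > limit:    # check before it commits
                raise PolicyViolation("send quota exceeded")
            return op(*args, **kwargs)
        return wrapped
    return guard

guard = make_irm(limit=3)

@guard                                    # "rewriting": monitor merged
def send(packet):                         # with the application code
    print("sent", packet)

# send("a"); send("b"); send("c") succeed; a fourth call raises
# PolicyViolation before the operation executes.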

6 Conclusion

Security in computer systems holds an important standing. In traditional approaches, computer system safety is founded on the two principles of minimal access privilege and a minimal trusted computing base. In such approaches the safety is warranted by operating systems and kernels, where the kernel acts as a proxy for the other processes running on the system. Because of technological advances, the growing complexity of operating system tasks, and the increase in kernel code for supporting features such as graphics cards and distributed file systems, new approaches for installing safety have appeared which are proved to be high performance, such as security enforcement using programming language techniques. Such techniques fall under three main categories: in-lined reference monitors, type systems and certifying compilers, which have been described separately.
References

[1] J.O. Blech and A. Poetzsch-Heffter, A Certifying Code Generation Phase, Proceedings of the Workshop on Compiler Optimization meets Compiler Verification (2007), 65-82.
[2] U. Erlingsson and F.B. Schneider, IRM Enforcement of Java Stack Inspection, IEEE Symposium on Security and Privacy, Oakland, California (2000), 246-255.
[3] R. Wahbe, S. Lucco, T. Anderson, and S. Graham, Efficient Software-Based Fault Isolation, Proc. 14th ACM Symp. on Operating System Principles (SOSP) (1993), 203-216.
[4] K. Crary, D. Walker, and G. Morrisett, Typed Memory Management in a Calculus of Capabilities, Proc. 26th Symp. Principles of Programming Languages (1999), 262-275.
[5] U. Erlingsson and F.B. Schneider, SASI Enforcement of Security Policies: A Retrospective, Proc. 26th Symp. Principles of Programming Languages (1999), 262-275.
[6] F.B. Schneider, G. Morrisett, and R. Harper, A Language-Based Approach to Security, Lecture Notes in Computer Science (2001), 86-101.
[7] D. Kozen, G. Morrisett, and R. Harper, Language-Based Security, Mathematical Foundations of Computer Science (1999), 284-298.
[8] R. Hahnle, J. Pant, P. Rummer, and D. Walter, Integration of a Security Type System into a Program Logic, Theoretical Computer Science (2008), 172-189.
[9] C. Yiyun, L. Ge, H. Baojian, L. Zhaopeng, and C. Liu, Design of a Certifying Compiler Supporting Proof of Program Safety, Theoretical Aspects of Software Engineering, IEEE (2007), 127-138.
[10] M. Jones and K.W. Hamlen, Enforcing IRM Security Policies: Two Case Studies, Intelligence and Security Informatics, IEEE (2009), 214-216.


Application of the PSO-ANFIS Model for Time Series Prediction of Interior Daylight Illuminance

Hossein Babaee* and Alireza Khosravi
Faculty of Electrical and Computer Engineering, Noushirvani University of Technology, Babol, Iran
hbabaee@stu.nit.ac.ir, akhosravi@nit.ac.ir

*Corresponding Author, P. O. Box 47135-484, T: (+98) 21 8890-7940

Abstract: The increasing need for more energy-sensitive and adaptive systems for building light control has encouraged the use of more precise and delicate computational models. This paper presents a time series prediction model for daylight interior illuminance obtained using an optimized Adaptive Neuro-Fuzzy Inference System (ANFIS). Here, the training data is collected by simulation, using the widely accepted lighting software Desktop Radiance. The developed model is suitable for adaptive predictive control of integrated daylight/artificial-light schemes incorporating dimming and window shading control. In the ANFIS training process, if the data is clustered first and then fed to ANFIS, the performance of ANFIS is improved. In the clustering process, the radius of the clusters strongly affects the performance of the system, so in order to achieve the best performance we need to determine the optimum value of the cluster radius. In this study, particle swarm optimization is used to determine the optimum value of the radius. Simulation results show that the proposed system achieves high performance.

Keywords: Particle swarm optimization, Adaptive Neuro-Fuzzy inference system, Radius, Optimization

Introduction

Developing automatic control strategies, as well as evaluating the visual and energy performance provided by daylight, requires an accurate prediction of the daylight entering a building [1]. Daylight Factor (DF) [2], Daylight Coefficient (DC) [3], Useful Daylight Illuminance (UDI), computer simulations, average daylight factor, etc. [4] are the various methods adopted for the estimation of interior daylight illuminance. The DF approach has been in practice for the last 50 years and has gained favour because of its simplicity, but it is not flexible enough to predict the dynamic variations in daylight illuminance as the sun position and sky condition change. The DC concept, developed by Tregenza [5], considers the changes in the luminance of the sky elements and offers a more effective way of computing indoor daylight illuminance. As the sky is treated as an array of point sources, the daylight coefficient approach can be used to calculate the reflected sunlight, and it is particularly appropriate for innovative daylight systems with complex optical properties. In the UDI approach, daylight illuminance metrics are based on absolute values of time-varying daylight illuminance over a full year. Recently, Kittler et al. [6] have proposed a new range of 15 standard sky luminance distributions, including five clear, five partly cloudy and five overcast sky types. D.H.W. Li et al. [4] have proposed an average daylight factor concept suitable for all of the above 15 standard skies. This proposition may be a useful paradigm for the planning and design of daylight systems, but its effectiveness for automated control strategies is uncertain, since we cannot predict the type of sky ahead. Time-varying illuminance predictions, as used for meteorological data sets, offer a more realistic account of true daylight conditions than the previously mentioned DF, DC and UDI. ANFIS shows very good learning and prediction capabilities, which makes it an efficient tool to deal with


uncertainties encountered in this venture. A variety of computer design tools are available for collecting the data required for training the Adaptive Neuro-Fuzzy Inference System. Here, the software Desktop Radiance is used to collect one full year of data with different sky conditions. The interior illuminance level is calculated for a given environment at any time of the year. Instead of using measured illuminance levels, we use simulated data from a model created with the appropriate design tool. The illuminance levels obtained in this way are used as training data for ANFIS to predict the six-step-ahead values for the model under consideration. Hence, these predicted values indicate how the system is going to behave ahead of a particular time. This paper highlights how ANFIS can be employed to predict future values of daylight availability. In the ANFIS training process, if the data is clustered first and then fed to ANFIS, the performance of ANFIS is improved. In the clustering process, the radius of the clusters strongly affects the performance of the system; in order to achieve the best performance, we determine the optimum value of the cluster radius using the PSO algorithm. The rest of this paper is organized as follows: Section 2 introduces the Adaptive Neuro-Fuzzy Inference System. The PSO algorithm is presented in Section 3. In Section 4 we describe the experimental settings and results. The conclusions are in Section 5.

Adaptive Neuro-Fuzzy Inference System (ANFIS)

The adaptive-network-based fuzzy inference system (ANFIS) was proposed by Jang [7]. The fuzzy inference system is implemented in the framework of adaptive networks using a hybrid learning procedure, whose membership function parameters are tuned using a backpropagation algorithm combined with a least-squares method. ANFIS is capable of dealing with the uncertainty and imprecision of human knowledge; it has a self-organizing ability and an inductive inference function for learning from data. ANFIS is a multilayer feed-forward network [7] in which each node performs a particular function on its incoming signals, using a set of parameters pertaining to that node. To present the ANFIS architecture, consider two fuzzy rules based on a first-order Sugeno model [8], shown in Figure 1:

Rule 1: IF $x_1$ is $A_1$ and $x_2$ is $B_1$, then $f_1 = p_1 x_1 + q_1 x_2 + r_1$   (1)

Rule 2: IF $x_1$ is $A_2$ and $x_2$ is $B_2$, then $f_2 = p_2 x_1 + q_2 x_2 + r_2$   (2)

The system has two inputs $x_1$ and $x_2$ and one output $F$. A square node (adaptive node) has parameters that change during training, while a circle node (fixed node) has none. Two membership functions are associated with each input, and the rules are fuzzy if-then rules of Takagi-Sugeno type. The key features of the five layers are described below; $O_{L,i}$ denotes the output of node $i$ in layer $L$ [9, 10].

Figure 1: Structure of ANFIS [9]

Layer 1: The nodes in this input layer are adaptive; they define the membership functions of the inputs:

$O_{1,i} = A_i(x_1), \quad i = 1, 2$   (3)
$O_{1,i} = B_{i-2}(x_2), \quad i = 3, 4$   (4)

where $A_i$ and $B_i$ can be any appropriate fuzzy sets in parametric form. The membership functions can be bell-shaped or Gaussian; the parameters in this layer are referred to as premise parameters.

Layer 2: The nodes in this rule layer are fixed. Each node multiplies all incoming signals and sends the product out; the output of each node represents the firing strength of a rule:

$O_{2,i} = w_i = A_i(x_1)\,B_i(x_2), \quad i = 1, 2$   (5)

Layer 3: The nodes in this normalization layer are fixed. They normalize the firing strengths obtained in Layer 2:

$O_{3,i} = \bar{w}_i = \frac{w_i}{w_1 + w_2}, \quad i = 1, 2$   (6)


Layer 4: The nodes in this inference layer are adaptive. The outputs of this layer are the outputs from Layer 3 multiplied by a linear function; the parameters in this layer are referred to as consequent parameters:

$O_{4,i} = \bar{w}_i f_i = \bar{w}_i\,(p_i x_1 + q_i x_2 + r_i), \quad i = 1, 2$   (7)

where $p_i$, $q_i$ and $r_i$ are design parameters (consequent parameters, since they deal with the then-part of the fuzzy rule).

Layer 5: The node in this output layer is fixed. It computes the overall output as the summation of the weighted outputs from Layer 4:

$O_{5,i} = F = \sum_i \bar{w}_i f_i = \frac{\sum_i w_i f_i}{\sum_i w_i}, \quad i = 1, 2$   (8)
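To make the five-layer computation concrete, the following minimal sketch (illustrative Python with assumed Gaussian membership parameters, not the authors' MATLAB model) evaluates Eqs. (3)-(8) once for a two-rule system:

import math

def gauss(x, c, sigma):
    # Gaussian membership function; (c, sigma) are premise parameters.
    return math.exp(-((x - c) ** 2) / (2 * sigma ** 2))

def anfis_forward(x1, x2, premise, consequent):
    a = [gauss(x1, *premise["A"][i]) for i in range(2)]   # Layer 1, Eq. (3)
    b = [gauss(x2, *premise["B"][i]) for i in range(2)]   # Layer 1, Eq. (4)
    w = [a[i] * b[i] for i in range(2)]                   # Layer 2, Eq. (5)
    wbar = [wi / sum(w) for wi in w]                      # Layer 3, Eq. (6)
    f = [wbar[i] * (p * x1 + q * x2 + r)                  # Layer 4, Eq. (7)
         for i, (p, q, r) in enumerate(consequent)]
    return sum(f)                                         # Layer 5, Eq. (8)

# Hypothetical parameter values, only to exercise the network once.
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 1.0, 0.0), (2.0, -1.0, 0.5)]
print(anfis_forward(0.3, 0.7, premise, consequent))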

The ANFIS architecture is not unique: some layers can be combined and still produce the same output. There are two sets of parameters in the above fuzzy inference system; the overall output is linear in the consequent parameters of Layer 4 but nonlinear in the premise parameters of Layer 1. The hybrid learning algorithm detailed in [9] consists of a forward and a backward pass. In the forward pass, the linear parameters are updated using a least-squares estimator (LSE). In the backward pass, error derivatives are calculated for each node, starting from the output end and propagating towards the input end of the network; the nonlinear parameters are then updated by the steepest descent algorithm [9].

Training a neuro-fuzzy system has several steps. In the first step of training, the initial fuzzy sets should be determined; they define the number of fuzzy sets for each input variable and their shapes. During training, the whole training dataset is presented to the network, which tries to minimize the error by learning the spatial relationships between the data. A lower error does not always guarantee better network performance, since it may be caused by overtraining.

If the input-output clusters of the training data are found, the cluster information can be used to generate a fuzzy inference system: the rules partition themselves according to the fuzzy qualities associated with each of the data clusters. An important advantage of using a clustering method to find rules is that the resultant rules are more tailored to the input data than in a FIS generated without clustering. This reduces the problem of an excessive proliferation of rules when the input data has high dimension.

The cluster radius indicates the range of influence of a cluster when the data space is considered as a unit hypercube. Specifying a small cluster radius usually yields many small clusters in the data and results in many rules; specifying a large cluster radius usually yields a few large clusters and results in fewer rules.

In this study, in order to further increase the accuracy of the proposed system, we find the optimum value of the cluster radius using PSO; the PSO algorithm is explained in the next section.
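The effect of the radius can be demonstrated with a simplified one-dimensional sketch of subtractive clustering (illustrative only: the reject threshold and the squash factor 1.5 are assumptions, and the real method works on normalized multidimensional data):

import math

def subtractive_clustering(points, radius, reject_ratio=0.15):
    # Simplified 1-D subtractive clustering: `radius` is the range of
    # influence of a cluster when the data space is a unit interval.
    alpha = 4.0 / radius ** 2
    beta = 4.0 / (1.5 * radius) ** 2           # squash factor 1.5 (assumed)
    potential = [sum(math.exp(-alpha * (p - q) ** 2) for q in points)
                 for p in points]
    first_max = max(potential)
    centres = []
    while True:
        k = max(range(len(points)), key=lambda i: potential[i])
        if potential[k] < reject_ratio * first_max:
            break                              # remaining potentials too low
        centres.append(points[k])
        # Subtract the new centre's influence so nearby points lose potential.
        potential = [potential[i] - potential[k]
                     * math.exp(-beta * (points[i] - points[k]) ** 2)
                     for i in range(len(points))]
    return centres

data = [i / 20 for i in range(21)]
print(len(subtractive_clustering(data, radius=0.8)))   # large radius: few clusters
print(len(subtractive_clustering(data, radius=0.1)))   # small radius: many clusters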


PSO Algorithm

The basic operational principle of the particle swarm is reminiscent of the behaviour of a group, for example a flock of birds, a school of fish, or the social behaviour of a group of people. Instead of using evolutionary operators to manipulate the individuals as in other evolutionary computation algorithms, each individual flies in the search space with a velocity which is dynamically adjusted according to its own flying experience and that of its companions. Each individual is considered a volume-less particle (a point) in the N-dimensional search space. At time step $t$, the $i$th particle is represented as $X_i(t) = (x_{i1}(t), \dots, x_{iN}(t))$, and the set of positions of the $m$ particles is $\{X_1(t), \dots, X_m(t)\}$. The best previous position (the position giving the best fitness value) of the $i$th particle is recorded and represented as $P_i = (p_{i1}, \dots, p_{iN})$. The index of the best particle among all particles in the population (global model) is denoted by $g$; the index of the best particle among the particles in a defined topological neighbourhood (local model) is denoted by $l$. The velocity of particle $i$ at time step $t$ is $V_i(t) = (v_{i1}(t), \dots, v_{iN}(t))$. The particle variables are manipulated according to the following equations (global model [11]):

$v_{in}(t) = w\,v_{in}(t-1) + c_1\,\mathrm{rand}_1(\cdot)\,(p_{in} - x_{in}(t-1)) + c_2\,\mathrm{rand}_2(\cdot)\,(p_{gn} - x_{in}(t-1))$
$x_{in}(t) = x_{in}(t-1) + v_{in}(t)$   (9)

where $n = 1, \dots, N$ is the dimension, $c_1$ and $c_2$ are positive constants, $\mathrm{rand}_1(\cdot)$ and $\mathrm{rand}_2(\cdot)$ are two random functions in the range $[0, 1]$, and $w$ is the inertia weight. For the neighbourhood (lbest) model, the only change is to substitute $p_{ln}$ for $p_{gn}$ in the velocity equation.


In the global model, this equation calculates a particle's new velocity according to its previous velocity, the distance of its current position from its own best experience, and the distance from the group's best experience. The local model calculation is identical, except that the neighbourhood's best experience is used instead of the group's. Particle swarm optimization has been used both for generic approaches applicable across a wide range of applications and for specific applications focused on a particular requirement. Its attractiveness over many other optimization algorithms lies in its relative simplicity, because only a few parameters need to be adjusted [12, 13].
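A minimal global-best PSO following Eq. (9) can be sketched as follows (illustrative Python, not the authors' implementation; in the paper's setting the fitness would be the ANFIS prediction error as a function of the cluster radius):

import random

def pso(fitness, dim, n_particles=10, iters=100,
        w=0.7, c1=2.1, c2=2.1, vmax=8.0, bounds=(0.0, 1.0)):
    # Minimal global-best PSO; smaller fitness is better.
    lo, hi = bounds
    x = [[random.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [xi[:] for xi in x]                       # personal best positions
    pbest_val = [fitness(xi) for xi in x]
    g = min(range(n_particles), key=lambda i: pbest_val[i])  # global best index

    for _ in range(iters):
        for i in range(n_particles):
            for n in range(dim):
                r1, r2 = random.random(), random.random()
                v[i][n] = (w * v[i][n]
                           + c1 * r1 * (pbest[i][n] - x[i][n])
                           + c2 * r2 * (pbest[g][n] - x[i][n]))
                v[i][n] = max(-vmax, min(vmax, v[i][n]))  # velocity clamping
                x[i][n] += v[i][n]
            val = fitness(x[i])
            if val < pbest_val[i]:                    # update personal best
                pbest_val[i], pbest[i] = val, x[i][:]
                if val < pbest_val[g]:                # update global best
                    g = i
    return pbest[g], pbest_val[g]

# Example: minimize a simple quadratic in place of the real radius fitness.
best, err = pso(lambda p: (p[0] - 0.4) ** 2, dim=1)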

Figure 3: prediction errors without optimization

Simulation Results

In order to build an ANFIS that can predict x(t+6) from the past values of daylight levels, the training data format is [x(t-18), x(t-12), x(t-6), x(t); x(t+6)]. Training and checking data are shown in Figure 2.

Figure 2: Training and checking data used for ANFIS prediction

As the performance criterion in this study, the area enclosed between the original signal and the signal predicted by ANFIS is used: the more accurate the prediction, the closer the original and predicted signals are to each other, and hence the smaller the enclosed area.
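The sliding-window construction of the training data defined above can be sketched as follows (illustrative; the series is assumed to be indexed in the same time units as t, so one prediction step spans 6 indices):

def make_training_pairs(series, step=6):
    # Build [x(t-18), x(t-12), x(t-6), x(t); x(t+6)] samples from a series.
    pairs = []
    for t in range(18, len(series) - step):
        inputs = [series[t - 18], series[t - 12], series[t - 6], series[t]]
        target = series[t + step]          # the six-step-ahead value
        pairs.append((inputs, target))
    return pairs

# Example with a synthetic series; the real data would be the simulated
# illuminance levels exported from Desktop Radiance.
series = [i % 24 for i in range(200)]
print(len(make_training_pairs(series)))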

4.1 Performance without optimization

First, we evaluated the performance of the predictor without optimization. Figure 3 shows the prediction error. As Figure 3 suggests, the difference between the original signal and the predicted signal at various times is rather large, approximately 0.05.

4.2 Performance with optimization

Next, we apply PSO to find the optimum value of the radius. Table 1 shows the coefficient values used in the PSO algorithm, and Figure 4 shows the resulting prediction error. As a comparison of Figures 3 and 4 shows, optimization significantly reduces the error: the maximum error is about 1 lux, which does not cause any change in the control signals (very small variations are ignored, since acting on them would produce too much fluctuation of the light). We therefore stopped at this level of performance instead of going for more extensive training. Figure 5 shows the non-linear surface of the Sugeno fuzzy model for the time series prediction problem. We used the Fuzzy Logic Toolbox of MATLAB to develop the ANFIS model with 4 inputs and a single output.

In Table 2, the area enclosed between the original signal and the predicted signal is shown with and without optimization; as the table suggests, the performance of the optimized system is much better. Figure 6 shows the original signal and the signal predicted by the optimized ANFIS; the two signals are very close together.

Number of particles: 10
Error limit: 1e-10
Acceleration constant: 3
Maximum velocity: 8
Maximum number of iterations: 100
Size of the local neighborhood: 2
Constants c1 = c2: 2.1

Table 1: Coefficient values in the PSO algorithm


Status | Value
Optimized ANFIS | 8.7649e-004
ANFIS without optimization | 0.3214

Table 2: The area enclosed between the original signal and the predicted signal

Figure 4: Prediction errors with optimization

Figure 5: Input-output surface view of the ANFIS scheme

Figure 6: Original signal and the predicted signal

Matlab Functions Used for Time Series Prediction

GENFIS2
genfis2 generates a Sugeno-type FIS structure using subtractive clustering and requires separate sets of input and output data as input arguments. When there is only one output, genfis2 may be used to generate an initial FIS for ANFIS training. genfis2 accomplishes this by extracting a set of rules that models the data behaviour. The rule extraction method first uses subtractive clustering to determine the number of rules and the antecedent membership functions, and then uses linear least-squares estimation to determine each rule's consequent equation. The function returns a FIS structure that contains a set of fuzzy rules covering the feature space.

ANFIS
ANFIS uses a hybrid learning algorithm to identify the membership function parameters of a single-output, Sugeno-type FIS. A combination of least-squares and backpropagation gradient descent methods is used to train the FIS membership function parameters to model a given set of input/output data.

EVALFIS
EVALFIS performs the fuzzy inference calculations. Y = EVALFIS(U, FIS) simulates the FIS for the input data U and returns the output data Y. For a system with N input variables and L output variables, U is an M-by-N matrix, each row being a particular input vector, and Y is an M-by-L matrix, each row being the corresponding output vector.

Conclusion

The most important advantage of the proposed model is its ability to predict the behaviour of a natural system at a future time, which can be used for lighting control. The implementation of the ANFIS model is less complicated than sophisticated identification and optimization procedures. Compared to fuzzy logic systems, ANFIS has an automated identification algorithm and an easier design; in comparison with neural networks, it has fewer parameters and faster adaptation. In order to increase the accuracy of the proposed system, a PSO algorithm is used to determine the optimum value of the cluster radius (which is used in ANFIS training). The non-linear characteristics of daylight systems can be tolerably handled by the proposed system. The prediction could be utilized as an input for artificial-light and shading controls, with the possibility of reducing the number of sensors and connections and improving the performance of the control strategy. The PSO-ANFIS based time series prediction model for daylight interior illuminance is unique and novel, as it is simple, reliable and easily applicable to different room conditions.

References
[1] A. Nabil and J. Mardaljevic, Useful daylight illuminance: a new paradigm for assessing daylight in buildings, Lighting Research & Technology 37 (2005), no. 1, 41–59.
[2] D.H.W. Li, C.C.S. Lau, and J.C. Lam, Predicting daylight illuminance by computer simulation techniques, Lighting Research & Technology 36 (2003), no. 2, 113–119.
[3] P.J. Littlefair, Daylight coefficients for practical computation of internal illuminances, Lighting Research & Technology 24 (1992), no. 3, 127–135.
[4] D.H.W. Li and G.H.W. Cheung, Average daylight factor for the 15 CIE standard skies, Lighting Research & Technology 38 (2006), no. 1, 137–152.
[5] P.R. Tregenza and I.M. Waters, Daylight coefficients, Lighting Research & Technology 15 (1983), 65–71.
[6] R. Kittler, S. Darula, and R. Perez, A set of standard skies characterizing daylight conditions for computer and energy conscious design, Bratislava, Slovakia, 1998.
[7] J.-S. R. Jang, ANFIS: Adaptive-Network-Based Fuzzy Inference System, IEEE Transactions on Systems, Man, and Cybernetics 23 (1993), 665–685.
[8] K. Erenturk, ANFIS-Based Compensation Algorithm for Current-Transformer Saturation Effects, IEEE Transactions on Power Delivery 24 (2009), no. 1.
[9] J.-S. R. Jang and E. Mizutani, Neuro-Fuzzy and Soft Computing, Prentice Hall, NJ, 1997.
[10] S.R. Jalluri and B.V.S. Ram, A Neuro-Fuzzy Controller for Induction Machine Drives, Journal of Theoretical and Applied Information Technology 19 (2010), no. 2.
[11] R.C. Eberhart and J. Kennedy, A New Optimizer Using Particle Swarm Theory, Proceedings of the Sixth International Symposium on Micro Machine and Human Science, Nagoya, Japan (1995), 39–43.
[12] H. Babaee and A. Khosravi, presented at IEEE Conference, China (2011).
[13] Hongchao Yin and Wenzhi Dai, Optimal Operational Planning of Steam Power Systems Using an IPSO-SA Algorithm, Journal of Computer and Systems Sciences International 49 (2010), no. 5, 750–756.


Evaluating the impact of using several criteria for buffer management in VDTNs

Zhaleh Sadreddini*, Department of Computer Sciences, zh.sadreddini@iaushab.ac.ir
Mohammad Ali Jabraeil Jamali, Department of Computer Sciences, mjamali@itrc.ac.ir
Ali Asghar Pourhaji Kazem, Department of Computer Engineering, apourhajikazem@iaut.ac.ir

*Corresponding Author, T: (+98) 914 411-5567

Abstract: In Vehicular Delay Tolerant Networks (VDTNs), the optimal use of buffer management policies can improve the overall network throughput. Because several message criteria may need to be considered simultaneously for optimal buffer management, conventional policies fail to support different applications. In this research, we present a buffer management strategy called Multi Criteria Buffer Management (MCBM). This technique applies several message criteria according to the requirements of different applications. We examine the performance of the proposed buffer management policy by comparing it with the existing FIFO and Random policies. For the scenario proposed in this paper, simulation results show that the MCBM policy performs well compared with the existing ones in terms of overall network performance.

Keywords: Buffer management policies, Epidemic routing, Vehicular Delay Tolerant Networks.

Introduction

Vehicular Delay Tolerant Networks (VDTNs) are an application of Delay-Tolerant Networks (DTNs) in which the mobility of vehicles is used for connectivity and data communications [1].

Because of the mobility and high speed of vehicles, an end-to-end path is not available all the time. Connections in such networks are therefore intermittent and, as a result, message delivery suffers delays [2],[3]. In order to overcome the intermittent connectivity, increase the delivery rate of messages and reduce the average latency, the store-carry-and-forward pattern is used: messages are stored and exchanged among network nodes until they reach their final destination. Consequently, given the limited buffer space at nodes, messages face buffer overflow and are dropped. To overcome this problem, optimal buffer management policies aimed at increasing network performance have been presented to improve the efficiency of the network.

In VDTNs we can point out different application scenarios: traffic condition monitoring, collision avoidance, emergency message dissemination, free parking spot information, advertisements, etc. [4],[5],[6]


According to the requirements of different applications, it is possible that multiple major message criteria must be considered simultaneously for optimal buffer management. In addition, different criteria may have different levels of importance and may conflict with each other. However, the existing policies consider only one or two message criteria; as a result, they serve a single purpose and do not support different applications.
The MCBM technique formulates the buffer management problem as a multi-criteria decision problem. Therefore, different criteria can be applied to manage the buffer according to the requirements of different applications via the MCBM technique [7].


In this article, the Emergency Warning scenario is considered to demonstrate the improvement of network performance and the ability to respond to the requirements of different applications via the MCBM technique. In this scenario, the delivery of emergency messages is of special importance. In the simulations performed, the goals considered are increasing the message delivery rate and accelerating the delivery of emergency messages (such as collision alarm messages).

2 Existing buffer management policies

2.1 First-In First-Out (FIFO)

FIFO is a straightforward policy, which simply orders messages to be forwarded at a contact opportunity based on their receiving time (first-come, first-served). When the FIFO dropping policy is enforced, the messages dropped on buffer congestion are the ones at the head of the queue (drop head) [2],[8].

2.2 Random

In the Random policy, messages are scheduled for transmission in a random order. Likewise, the selection of messages to be dropped is random [9],[10].

2.3 Approach

According to the requirements of different applications, the Multi Criteria Buffer Management (MCBM) technique applies different and even contradictory criteria for optimal buffer management [7]. For this purpose, a decision matrix for buffer management is created using the Multi Criteria Decision Making (MCDM) method, as in Table 1. In this decision matrix, the buffer holds n messages, each message has m different criteria, and the importance degree of the j-th criterion is $w_j$.

Table 1: Decision matrix

In the decision matrix, $r_{ij}$ denotes the value of the j-th criterion for the i-th message. According to the requirements of applications, the values of different criteria have different units and may contradict each other; normalization is therefore performed to make the units comparable, eliminate conflicts between criteria, and equalize the ranges of values.

One of the most common MCDM methods is the Weighted Sum Model (WSM). In this method, the $r_{ij}$ values are first normalized; after normalization, the message $A^{*}$ with the highest WSM score is selected to be sent (or dropped):

$A^{*}_{\mathrm{WSM}} = \max_{i} \sum_{j=1}^{m} w_j \, n_{ij}, \quad i = 1, \dots, n$

where $n_{ij}$ is the normalized value of $r_{ij}$. The time complexity of this method is O(mn), where m is the number of criteria and n is the number of messages. Since the number of criteria is much smaller than the number of messages currently in the buffer, the complexity is linear in the number of messages.

In order to demonstrate the improvement of network performance achieved by the MCBM technique, the Emergency Warning scenario is presented. The simulation results in the next section compare the performance of the existing FIFO and Random policies with the MCBM technique in this scenario.
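To illustrate the WSM scoring step, here is a minimal sketch (illustrative Python with assumed criteria and weights, not the authors' implementation): each criterion is normalized to [0, 1], criteria where smaller values are better are inverted, and the index of the message with the highest weighted sum is returned.

def wsm_select(messages, weights, maximize):
    # messages: list of dicts of raw criterion values r_ij;
    # weights: importance w_j of each criterion;
    # maximize: whether a larger raw value is better for that criterion.
    norm = {}
    for j in weights:
        values = [m[j] for m in messages]
        lo, hi = min(values), max(values)
        span = (hi - lo) or 1.0
        # Map every criterion into [0, 1]; invert "smaller is better" ones.
        norm[j] = [((v - lo) / span) if maximize[j] else ((hi - v) / span)
                   for v in values]
    scores = [sum(weights[j] * norm[j][i] for j in weights)
              for i in range(len(messages))]
    return max(range(len(messages)), key=lambda i: scores[i])

# Hypothetical criteria: priority and remaining TTL (higher is better),
# size (smaller is better); the weights reflect an emergency-oriented use.
msgs = [{"prio": 2, "ttl": 40, "size": 900},
        {"prio": 0, "ttl": 90, "size": 300},
        {"prio": 1, "ttl": 10, "size": 500}]
best = wsm_select(msgs,
                  weights={"prio": 0.6, "ttl": 0.3, "size": 0.1},
                  maximize={"prio": True, "ttl": True, "size": False})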


3 Performance Evaluation

In the Emergency Warning scenario, message priority and the accelerated delivery of emergency messages are the key issues. We adopt the priority pattern presented in the DTN architecture; therefore, this work considers three traffic priority classes: Bulk, Normal and Expedited (emergency).

The MCBM technique can be compared with different types of buffer management policies; here, we compare it with the conventional FIFO and Random policies. For this purpose, a simulation study using the Opportunistic Network Environment (ONE) simulator has been executed [11]. We created a set of extensions for the ONE simulator to support traffic priorities and scheduling and dropping policies for traffic differentiation. The performance metric considered is the delivery rate of messages, per priority class. The next subsections describe the two simulation scenarios and the corresponding performance analysis.

3.1 Simulation Setup

Table 2: Simulation setup

In this study, it is assumed that the delivery of emergency messages is very important and that these messages generate larger volumes of traffic. Thus, messages are generated with sizes uniformly distributed in the ranges [250 KB, 750 KB] for bulk messages, [500 KB, 1 MB] for normal messages, and [750 KB, 1.5 MB] for emergency messages. In all policies, the creation probability of the emergency priority class is set to 20%. The performance of the policies is assessed with the Epidemic routing protocol [12]. Epidemic is a flooding-based routing protocol in which nodes exchange the messages they do not already have.

3.2 Performance analysis of the Epidemic routing protocol for the scenario with 20 vehicles

The performance analysis starts with the scenario where only 20 vehicles move across the map roads. Figure 1 shows the delivery probability with the existing FIFO and Random policies and with MCBM. In the Emergency Warning scenario, the MCBM technique improves the emergency message delivery rate even under the worst conditions (low traffic, small buffer size and short TTL): the delivery rate of the emergency messages in MCBM is about 6%.

Figure 1: MCBM, FIFO and Random delivery probability with 20 vehicles

3.3 Performance analysis of the Epidemic routing protocol for the scenario with 100 vehicles

When the number of vehicles in the VDTN is increased, the number of contact opportunities increases too, which has a significant effect on the delivery rates. According to Figure 2, the delivery rate of emergency messages for this scenario in MCBM is about 20%.

Figure 2: MCBM, FIFO and Random delivery probability with 100 vehicles
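The pairwise exchange of the Epidemic protocol used in both scenarios can be illustrated with a short sketch (ours; the experiments themselves use the ONE simulator's implementation):

def epidemic_exchange(node_a, node_b):
    # node_a, node_b: dicts mapping message id -> message payload.
    # On contact, each node replicates to the peer the messages the
    # peer does not already have (anti-entropy over message ids).
    missing_in_b = {mid: m for mid, m in node_a.items() if mid not in node_b}
    missing_in_a = {mid: m for mid, m in node_b.items() if mid not in node_a}
    node_b.update(missing_in_b)   # replicate toward B
    node_a.update(missing_in_a)   # replicate toward A

a = {"m1": "...", "m2": "..."}
b = {"m2": "...", "m3": "..."}
epidemic_exchange(a, b)           # both now hold m1, m2 and m3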


4 Discussion and Future Works

In order to support different types of applications in VDTNs, buffer management policies should be designed so that several criteria can be imposed on them; we therefore propose the MCBM technique. In this article, the Emergency Warning scenario has been considered to compare the efficiency of the proposed technique with the conventional FIFO and Random policies. In this comparison we observe that single-purpose buffer management policies are not able to respond to different types of scenarios. The simulation results obtained for the proposed scenario show an improvement of network efficiency, in terms of message delivery rate, compared with the other policies. In future work, the efficiency of the proposed technique will be studied with different types of application scenarios via several routing protocols. According to the requirements of applications, we can also apply network criteria in buffer management.

References
[1] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, A layered architecture for vehicular delay-tolerant networks, ISCC'09, Sousse, Tunisia (2009).
[2] V.N.G.J. Soares, J.J.P.C. Rodrigues, and P.S. Ferreira, Improvement of messages delivery time on vehicular delay-tolerant networks, ICPP, Vienna, Austria (2009).
[3] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, Evaluating the impact of storage capacity constraints on vehicular delay-tolerant networks, CTRQ, Colmar, France (2009).
[4] D. Niyato and P. Wang, Optimization of the Mobile Router and Traffic Sources in Vehicular Delay Tolerant Network, IEEE (2011).
[5] R. Tatchikou, S. Biswas, and F. Dion, Cooperative vehicle collision avoidance using inter-vehicle packet forwarding, IEEE, MO, USA (2005).
[6] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, Scheduling and drop policies for traffic differentiation on vehicular delay-tolerant networks, SoftCOM, Croatia (2009).
[7] T. Gal, T.J. Stewart, and T. Hanne, Multicriteria decision making: advances in MCDM models, algorithms, theory, and applications, Springer (1999).
[8] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, Traffic differentiation support in vehicular delay-tolerant networks, Springer Science (2010).
[9] Q. Ayub, S. Rashid, and M. Soperi Mohd Zahid, Buffer Scheduling Policy for Opportunistic Networks (2011).
[10] S. Rashid, Q. Ayub, M. Soperi Mohd Zahid, and A. Hanan Abdullah, Impact of Mobility Models on DLA (Drop Largest) Optimized DTN Epidemic Routing Protocol (2011).
[11] A. Keränen, J. Ott, and T. Kärkkäinen, The ONE simulator for DTN protocol evaluation, SIMUTools, Rome (2009).
[12] A. Vahdat and D. Becker, Epidemic routing for partially connected ad hoc networks, Duke University (2000).


Improvement of VDTNs Performance with Effective Scheduling Policy

Masumeh Marzaei Afshord*, Islamic Azad University, Shabestar Branch, Department of Computer Science, Shabestar, Iran, m.marzaei@gmail.com
Mohammad Ali Jabraeil Jamali, Islamic Azad University, Shabestar Branch, Department of Computer Science, Shabestar, Iran, m_jamali@itrc.ac.ir
Ali Asghar Pourhaji Kazem, Islamic Azad University, Tabriz Branch, Department of Computer Engineering, Tabriz, Iran, a_pourhajikazem@iaut.ac.ir

*Corresponding Author, T: (+98) 914 302 3661

Abstract: In Vehicular Delay Tolerant Networks (VDTNs), buffer management policies affect the performance of the network. Most conventional buffer management policies make decisions based only on message criteria and do not consider features of the environment where the nodes are located. In this paper we propose the Knowledge Based Scheduling (KBS) policy, which makes decisions using two kinds of knowledge: the amount of free space in the receiver node's buffer and the traffic level of the segment where the sender node is located. Using simulation, we evaluate the performance of the proposed policy and compare it with the Random and Lifetime desc policies. Simulation results show that our buffer management policy increases the delivery rate and decreases the number of drops significantly.

Keywords: Epidemic Router, Scheduling Policy, Vehicular Delay Tolerant Networks.

Introduction

Delay Tolerant Networks (DTNs) have been introduced for situations where the connectivity between nodes is sparse. As a result, unlike in a traditional mobile ad hoc network (MANET), an end-to-end path between a source and a destination is only available for brief and unpredictable periods of time [7].

Vehicular Delay Tolerant Networks (VDTNs) are an application of DTNs in which vehicles are responsible for communication. In VDTNs, the movement and high velocity of vehicles lead to short contact durations, intermittent connectivity and a highly dynamic network topology. To overcome these problems, the store-carry-and-forward strategy is used in VDTNs: vehicles store messages in their buffers while connectivity is not available and carry them until a new contact opportunity arises. This process continues until messages reach their destination.

In order to increase the delivery rate and decrease the average latency in VDTNs, message replication is performed by many routing protocols. The combination of storing messages for long periods of time and replicating them imposes a high storage overhead on node buffers and reduces the overall performance of the network. Therefore, efficient buffer management policies are required to improve the overall network performance. Most conventional buffer management policies make decisions based only on message criteria (such as message size, time-to-live (TTL), or the number of times a message has been forwarded).

In this paper, we present an effective buffer management policy, called Knowledge Based Scheduling (KBS), which in addition to considering message criteria forwards a message based on knowledge of the amount of free space in the receiver node's buffer and knowledge of the traffic level of the segment where the sender node is located. Using simulation, we show that the KBS policy improves the performance of the network.


2 Existing Scheduling Policies

A scheduling policy determines the order in which messages are forwarded at a contact opportunity.

2.1 FIFO (First in-First out)

The FIFO scheduling policy orders messages to be forwarded at a contact opportunity based on their entry time into the node's buffer.

2.2 Random

The Random scheduling policy forwards messages in a random order.

2.3 Lifetime descending order

The Lifetime descending order (Lifetime desc) policy sorts messages based on their TTL in descending order and, at a contact opportunity, forwards the message with the highest TTL.

3 Proposed policy

The Knowledge Based Scheduling (KBS) policy considers, in addition to message criteria, the neighbouring environment of the node: it makes decisions using two kinds of knowledge, the amount of free space in the receiver node's buffer and the traffic level of the segment where the sender node is located. At a contact opportunity, the KBS policy considers the free space of the receiver node's buffer and forwards a message of equal or smaller size, thereby reducing the number of drops. The knowledge of the free space of the receiver node's buffer is obtained with a HELLO-RESPONSE technique: the sender node sends a HELLO message in order to establish communication, and if the receiver node hears the HELLO message, it sends back a RESPONSE message that also contains information about the free space of its buffer [4].

Assume the free space of the receiver node's buffer is 250K. As can be seen in Table 1, the buffer of the sender node holds multiple messages of size equal to or smaller than the free space of the receiver node's buffer. In this case, the KBS policy makes its decision based on the amount of traffic in the segment where the sender node is located: depending on the segment traffic, it selects either the message with the least TTL or the message with the highest TTL among those that fit into the receiver's free buffer space. The KBS policy thus gives an opportunity both to messages with low TTL and to messages with high TTL.

Msgid  Msgsize  MsgTTL
M1     180K     50
M2     450K     120
M3     200K     90
M4     150K     100
M5     300K     70
M6     550K     35

Table 1: Buffer space of the sender node

If the segment traffic of the sender node is low or medium (an interval has been defined for low and medium traffic), then among the messages of size equal to or smaller than the free space of the receiver node's buffer (M1, M3, M4), the message with the highest TTL (M4) is selected for forwarding. The reason for selecting the message with the highest TTL is that when segment traffic is low, contact opportunities in the segment are also scarce and waiting times in buffers are long, so messages with a high TTL are more likely to traverse the current segment than messages with a low TTL. If, however, the segment traffic is high (an interval has been defined for high traffic), the message with the least TTL (M1) is selected for forwarding: in this case contact opportunities in the segment are plentiful and waiting times in buffers are short, so messages with a low TTL may still traverse the current segment before expiration, and an opportunity is thus given to them. Knowledge of the segment traffic is obtained using a traffic oracle [3]: based on the Cartesian coordinates of each node, this oracle determines the corresponding segment and its traffic level, namely the number of nodes currently present in that segment.

If the sizes of all messages in the sender node's buffer are larger than the free space of the receiver node's buffer, the KBS policy decides based only on the segment traffic of the sender node: when the traffic is low or medium, for the reasons mentioned above, the message with the highest TTL is selected for forwarding, and when the traffic is high, the message with the least TTL is selected.
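The decision procedure described above can be summarized in a short sketch (illustrative Python; the traffic threshold is the one assumed in our scenario definitions, and this is not the simulator code):

def kbs_select(buffer_msgs, receiver_free_space, segment_traffic,
               high_traffic_threshold=8):
    # Each message is a (msg_id, size, ttl) tuple; segment_traffic is the
    # number of vehicles in the sender's segment.
    # Prefer messages that fit into the receiver's free buffer space.
    fitting = [m for m in buffer_msgs if m[1] <= receiver_free_space]
    candidates = fitting or buffer_msgs   # if none fit, decide on traffic alone
    if segment_traffic > high_traffic_threshold:
        # High traffic: many contacts and short waits, so low-TTL messages
        # may still cross the segment; give them the opportunity.
        return min(candidates, key=lambda m: m[2])
    # Low/medium traffic: few contacts and long waits favour high TTL.
    return max(candidates, key=lambda m: m[2])

# The Table 1 example, with 250K free at the receiver and low segment traffic:
buffer_msgs = [("M1", 180, 50), ("M2", 450, 120), ("M3", 200, 90),
               ("M4", 150, 100), ("M5", 300, 70), ("M6", 550, 35)]
print(kbs_select(buffer_msgs, receiver_free_space=250, segment_traffic=4))  # M4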

Simulation setup

In this section, we evaluate our KBS policy and compare it with the Random, FIFO and Lifetime desc scheduling policies. The dropping policy in all policies is drop head [17]. The evaluation is done by simulation using the Opportunistic Network Environment (ONE) simulator [14]. The performance metrics considered are the message delivery probability (measured as the ratio of delivered messages to sent messages) and the number of drops.

For the evaluation, the Epidemic routing protocol is used [1]. Epidemic is a flooding-based protocol: when two nodes connect, they send each other the messages which they do not have.

In order to examine the performance of the KBS policy, we use an urban scenario. The simulation area is 6000 m x 6000 m, and we simulate 100 vehicles. The buffer capacity of each vehicle is 20 MByte. Vehicles move at random speeds between 30 and 50 km/h along the shortest available paths, and their random wait times are between 5 and 15 minutes.

Network nodes communicate with each other over a wireless link with a data transmission rate of 6 Mbps and a transmission range of 30 meters. Messages are generated with an inter-message creation interval uniformly distributed in the range [5, 20] seconds, and message sizes are uniformly distributed in the range [500K, 1M]. The TTL of messages is 120 minutes throughout the simulations, and the simulation time is 12 hours.

In all scenarios we have defined fewer than 6 vehicles in one segment as low traffic, 6 to 8 vehicles as medium traffic, and more than 8 vehicles as high traffic.

Simulation results

Figure 1 shows the comparison of the buffer management policies with respect to delivery ratio. By considering the free space of the receiver's buffer and forwarding messages accordingly, the KBS policy reduces the number of dropped messages and consequently increases the number of delivered messages. Moreover, by making decisions based on the TTL of messages, it can also increase the delivery probability.

Figure 1: KBS, Lifetime desc, Random and FIFO delivery probability

Figure 2 presents the comparison of the buffer management policies with respect to the number of drops. The KBS policy reduces the number of drops significantly, because it forwards each message according to the free space of the receiver node's buffer, so the receiver only gets messages it can actually store.

Figure 2: KBS, Lifetime desc, Random and FIFO number of drops


Conclusion and future works

In this paper, the KBS buffer management policy was presented, which in addition to message criteria uses knowledge of the neighbouring environment of nodes. This policy selects a message to forward based on the free space of the receiver node's buffer and the traffic of the segment where the sender node is located.

Using simulation, the performance of the KBS policy was compared with the FIFO, Random and Lifetime desc scheduling policies. The results showed that KBS increases the delivery ratio and decreases the number of drops significantly. In future work, a dropping policy that considers the neighbouring environment of nodes can be presented; moreover, the proposed method can be compared with other buffer management policies.

References
[1] A. Vahdat and D. Becker, Epidemic routing for partially connected ad hoc networks, Duke University, Tech. Rep. CS-200006, 2000.
[2] K. Fall, A delay-tolerant network architecture for challenged internets, Proc. SIGCOMM (2003).
[3] S. Jain, K. Fall, and R. Patra, Routing in a delay tolerant network, Proc. SIGCOMM (2004).
[4] J. LeBrun, C.-N. Chuah, D. Ghosal, and M. Zhang, Knowledge-based opportunistic forwarding in vehicular wireless ad hoc networks, IEEE Conference on Vehicular Technology 4 (2005), 2289–2293.
[5] A. Lindgren and K.S. Phanse, Evaluation of queuing policies and forwarding strategies for routing in intermittently connected networks, IEEE International Conference on Communication System Software and Middleware (2006), 1–10.
[6] A. Jindal and K. Psounis, Performance analysis of epidemic routing under contention, Proc. IWCMC.
[7] D. Niyato, P. Wang, and J.C.M. Teo, Performance Analysis of the Vehicular Delay Tolerant Network (2007).
[8] G. Fathima and R.S.D. Wahidabanu, Effective buffer management and scheduling of bundles in delay tolerant networks with finite buffers, International Conference on Control, Automation, Communication and Energy Conservation (INCACEC 2009) (2009), 1–4.
[9] N. Dusit, P. Wang, and J.C.M. Teo, Performance Analysis of the Vehicular Delay Tolerant Network, Proc. NSERC (2009).
[10] V.N.G.J. Soares, J.J.P.C. Rodrigues, P.S. Ferreira, and A.M.D. Nogueira, Improvement of message delivery time on vehicular delay-tolerant networks, International Conference on Parallel Processing Workshops (2009), 344–349.
[11] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, Evaluating the impact of storage capacity constraints on vehicular delay-tolerant networks, Second International Conference on Communication Theory, Reliability and Quality of Service (CTRQ 2009) (2009).
[12] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, A layered architecture for vehicular delay-tolerant networks, IEEE Symposium on Computers and Communications (ISCC'09) (2009), 122–127.
[13] S. Kaveevivitchai and H. Esaki, Independent DTNs message deletion mechanism for multi-copy routing scheme, Sixth Asian Internet Engineering Conference (AINTEC) (2009).
[14] A. Keränen, J. Ott, and T. Kärkkäinen, The ONE Simulator for DTN Protocol Evaluation, SIMUTools: 2nd International Conference on Simulation Tools and Techniques (2009).
[15] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, Traffic differentiation support in vehicular delay-tolerant networks, Springer (2010).
[16] A. Krifa, C. Barakat, and T. Spyropoulos, Message drop and scheduling in DTNs: Theory and practice, IEEE Transactions on Mobile Computing (2010).
[17] V.N.G.J. Soares, F. Farahmand, and J.J.P.C. Rodrigues, Performance analysis of scheduling and dropping policies in vehicular delay-tolerant networks, International Journal on Advances in Internet Technology 3 (2010), 137–145.


Classification of Gene Expression Data using Multiple Ranker Evaluators and Neural Network

Zahra Roozbahani*, Department of Computer Science, Faculty of Math. Sci., Shahid Beheshti University, Tehran, IRAN, z.roozbahani@mail.sbu.ac.ir
Ali Katanforoush, Department of Computer Science, Faculty of Math. Sci., Shahid Beheshti University, Tehran, IRAN, a_katanforosh@sbu.ac.ir

*Corresponding Author, P. O. Box 3719166943, T: (+98) 9192510195

Abstract: Samples assayed by high-throughput microarray technologies challenge conventional Machine Learning techniques. The major issue is that the number of attributes (genes) is much greater than the number of samples. In feature selection, we attempt to reduce the number of attributes to obtain the most effective genes. In the prediction scheme introduced in this paper, several feature selection methods are combined with an Artificial Neural Network (ANN) classifier. Initially, we exploit various evaluators to measure the association between gene expression rates and susceptibility categories. Then we rank genes based on each of the measures and select a fixed number of top-ranked genes. To assess the performance of this method, we use a Multi-Layer Perceptron (MLP) whose input layer is associated with the genes commonly selected by all evaluators. We consider gene expression samples for Leukemia, Lymphoma and DLBCL to evaluate our method using leave-one-out cross validation. Results show that our approach outperforms other methods in predictive accuracy.

Keywords: Feature Selection, Artificial Neural Network, Gene Expression, Cancer Classification.

Introduction

Microarray technology can profile the expression levels of thousands of genes simultaneously. The resulting profile simply reveals which genes are up- or down-regulated. It plays an important role in the study of specific cancers and the activation of oncogenic pathways, and in discovering novel biomarkers for clinical diagnosis [1]. In practice, classification algorithms are widely adopted to analyze gene expression data.

Artificial Neural Networks (ANNs) are widely used in microarray data analysis [2,3]. The great number of genes (relative to the number of samples) makes conventional Machine Learning techniques like ANNs impractical. A common approach to resolving this issue is reduction to the most associated genes. This is an important problem, which is referred to as feature selection. Feature selection is one of the most important issues in data mining, machine learning, pattern classification, and so on. Only relevant features are useful for classification to produce better performance and reduce computation cost. It is necessary to take measures to decrease the feature dimension without decreasing the recognition performance; this is called the problem of optimal feature selection [4]. It is also an effective dimensionality reduction technique and an essential preprocessing step for removing noisy features [5]. The basic idea of feature selection methods is to search through the possible combinations of features in the data to find which subset of features works best for pattern recognition. There are at least two advantages to reducing the feature dimension: the time and space complexity of the model are reduced, and redundant correlations are discarded. A successful selection method should produce simple, moderate, less redundant and unambiguous features [6,7]. Generally, feature selection methods are divided into two categories: 1) filter methods and 2) wrapper methods [1]. In filter methods, genes are selected based on their relevance to certain classes. A wrapper method embeds a


gene selection method within a classification algorithm. The wrapper methods are not as efficient as the filter methods, because the classification algorithm runs on the original high-dimensional microarray dataset [8].

So far, several soft computing methods such as fuzzy sets [9], rough set theory [10] and neural networks [11,12] have been proposed for gene expression based association studies. All these methods consider only one evaluator, while the prediction accuracy of a classifier is quite sensitive to the selected genes [3]. Therefore, we propose a gene selection method in which different evaluators are satisfied simultaneously.

The paper is organized as follows. In Section 2, we review some feature evaluators. Section 3 presents our proposed method, where an ANN classifier is modeled. Section 4 focuses on experimental results and conclusion.

2 Feature Selection

We study five attribute evaluators with the Ranker search method to find the best set of features. In the feature selection step, two objects should be considered: a feature evaluator and a search method. The evaluator assigns a predictive value to each subset of features; details of the evaluators and search algorithms are discussed in [13].

Evaluators

A brief description of the evaluators used in this paper is as follows:

GainRatioAttributeEval: measures the gain ratio with respect to the class.

InfoGainAttributeEval: measures the information gain with respect to the class.

OneRAttributeEval: evaluates the worth of an attribute using the OneR classifier.

ReliefFAttributeEval: repeatedly samples an instance and considers the value of the given attribute for the nearest instances of the same and of a different class.

SymmetricalUncertAttributeEval: measures the symmetrical uncertainty with respect to the class.

CfsSubsetEval: prefers subsets of features that are highly correlated with the class while having low intercorrelation.

All the mentioned evaluators are implemented in the Weka package [14]. The only admissible search method for the above evaluators is the Ranker method, which ranks features by their individual evaluations.

3 The Neural Network Classifier

In this section, we discuss the details of the ANN that we use for the association study with gene expression profiles. The ANN has three types of layers, namely the input layer, the output layer and the hidden layer, which is intermediate between the input and output layers. Fig. 1 shows a multilayer feed-forward ANN structure. The neurons in two adjacent layers are fully connected, while the neurons within the same layer are not connected.

Figure 1: Multilayer feed-forward ANN structure.

In this paper, each neuron in the input layer is associated with a gene selected in the previous step (feature selection), the number of hidden layers is 1 or 2, and the output layer has just a single neuron. We use four different training algorithms in the framework of the backpropagation (BP) scheme: Resilient Backpropagation (RP), Levenberg-Marquardt (LM), One-Step Secant backpropagation (OSS) and Broyden-Fletcher-Goldfarb-Shanno (BFGS). We set the initial weights to random values. The learning procedure iterates until the error (estimated on a validation set) falls under a pre-specified threshold.

In our method, the selection algorithm is implemented in two steps: 1) first, the relevant candidate genes are selected from the initial set of features by each criterion evaluator, and 2) the genes that commonly pass all evaluators' thresholds are selected.

Datasets

To explore the performance of the new gene selection method, three well-known gene expression datasets are considered: the leukemia, the lymphoma and the Diffuse Large B-cell Lymphoma (DLBCL) datasets. These data have received great interest in gene selection and cancer classification research [10,15]. They are publicly available from www.upo.es/eps/aguilar/datasets.html and datam.i2r.a-star.edu.sg/datasets/krbd . To assess the performance of classification, we evaluate the ANN once with LOOCV (leave-one-out cross validation) and once again on a particular test dataset without cross validation.
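Leave-one-out cross validation, used throughout the evaluation, can be summarized by the following generic sketch (illustrative; train_fn and predict_fn stand in for the ANN training and prediction routines):

def loocv_accuracy(X, y, train_fn, predict_fn):
    # Train on all samples but one, predict the held-out sample,
    # and average the outcome over every sample in the dataset.
    correct = 0
    for i in range(len(X)):
        X_train = X[:i] + X[i + 1:]
        y_train = y[:i] + y[i + 1:]
        model = train_fn(X_train, y_train)
        correct += int(predict_fn(model, X[i]) == y[i])
    return correct / len(X)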
Table 1: Rank Thresholds of Feature Selection

Bagging and AdaBoost), while we achieved to


98.61 % (LOOCV) just using 11 gene. Also we
can achieve a higher performance compared to
[18] that obtained the accuracy of 98% (LOOCV)
using 132 genes.
Lymphoma
This data set contains 45 samples, 22 of them
are germinal center B-like group (GCL) and 23
are activated Blike group (ACL) and the number
of genes is 4026.
In the lymphoma data set, RP is the most accurate classifier (accuracy=97.77%) using leaveone-out cross validation (Table 4). The ANN,
regardless of the training algorithm, exactly classified all test samples of lymphoma without cross
validation (Table 3). Our results on this dataset
and those obtained by SVM and Bayesian net
[19] are tightly close to each other; 97.77% vs.
97.87%. It is remarkable that the same result
has been obtained by the hyper-box enclosure
method [15].
DLBCL

Experimental
Discussion

Results

and

The feature selection evaluators and their Rank thresholds with respect to each datasets are shown in Table
1. In the first step of selection algorithm, the number of selected genes is set to a moderate number, e.g.
between 30 to 90. Then, we find minimum number
of genes that is shared in all evaluators criteria. Informative genes found in datasets are listed in Table
2.
ALL-AML Leukemia The leukemia data consists
of 72 samples among which are 25 samples of
AML and 47 samples of ALL. The number of
genes in each sample in this dataset is 7129. The
training data consists of 38 samples (27 ALL and
11 AML), and the rest is considered as test data
[16]. Using the test data without cross validation,
a perfectly accurate classification is observed (Table 3). This also achieves the best leave-one-out
(LOOCV) result (98.61) with RP training algorithm (Table 4). LM and OSS methods are the
second highest accurate classifier with the accuracy of 95.83% and 94.44% respectively.
Our result compares with the result reported in
[17] where 1038 genes predict for 91.18% (10-CV,

46

The third dataset contains 58 samples from DLBCL patients and 19 samples from follicular lymphoma (FL) on 7029 genes. Here, RP and BFGS
obtained the most accurate results; respectively
100% (estimated by the test data, Table 3) and
96.10% (estimated by LOOCV, Table 4).
It should be noted that result reported in [20]
are more accurate than our result (97.50% vs.
96.10%), but they have not identified any group
of genes responsible for DLBCL. Our results compare with results of the kNN based method (reported accuracy=92.71%) [21] where eight genes
have been identified to be associated with DLBCL. The hyper-box enclosure method [15] obtains the same accuracy as our multiple Ranker
methods with ANN.
We are also interested in the effect of the feature reduction on the classification accuracy. We
gradually reduce the number of initial genes selected by each evaluator and re-organize the ANN
classifier. Fig. 2 shows the trend of accuracy
with respect to the number of initial genes. The
numbers of commonly selected genes are shown
by bullets on each curve. As shown in Fig.
2, over 90 percent of Lymphoma samples can
be perfectly identified by using only one gene
(GENE3330X). The same accuracy can be also
achieved by four genes for Leukemia (M84526 at,
X95735 at, U46499 at, L09209 s at). DLBCL is
rather complicated; a reliable classification requires at least seven genes, even more (see Ta-

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

ble 2). It should be noted that no subset of six


genes or less can result in a classification with the
accuracy above 90 percent.

Precision of classification, that is, the ratio of truly predicted samples in each class, is illustrated in Fig. 3. In this step, we consider only the results of the LM algorithm, known as the most efficient training algorithm in our experiments. We have also studied classifiers other than ANN, such as SMO, KStar and logistic regression, but no better results have been obtained.

Table 2: Informative Genes Found in Datasets

Table 3: Accuracy of Classification with Test Set Using ANN

Table 4: Accuracy of Classification with LOOCV Using ANN

Figure 2: Accuracy vs. number of common genes (CG).

Figure 3: Class precision of ANN classifier with LM algorithm.
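The leave-one-out protocol behind Table 4 can be sketched as follows. This is a generic illustration only: a simple nearest-centroid rule stands in for the ANN classifiers (RP, LM, BFGS, OSS) actually trained in the paper:

def nearest_centroid_predict(train_X, train_y, x):
    """Predict the label of x by the closest class centroid (squared distance)."""
    centroids = {}
    for label in set(train_y):
        rows = [v for v, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda lbl: sqdist(x, centroids[lbl]))

def loocv_accuracy(X, y):
    """Hold out each sample once, train on the rest, count correct predictions."""
    correct = 0
    for i in range(len(X)):
        train_X = X[:i] + X[i + 1:]
        train_y = y[:i] + y[i + 1:]
        if nearest_centroid_predict(train_X, train_y, X[i]) == y[i]:
            correct += 1
    return correct / len(X)

# Toy usage: two well-separated classes in a two-gene expression space.
X = [[0.1, 0.2], [0.2, 0.1], [0.9, 1.0], [1.0, 0.9]]
y = ["ALL", "ALL", "AML", "AML"]
print(loocv_accuracy(X, y))  # 1.0 on this toy data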

Conclusion

In this paper, a successful gene selection method based on a combination of multiple feature selection methods has been introduced. The selected genes have been used to establish an ANN by which the sample types of gene expression data are classified. Three public gene expression datasets have been used to test the performance. Our comprehensive assessment using leave-one-out cross validation has shown the highest prediction accuracy for the proposed approach among gene expression classification algorithms. This suggests that our method can select informative genes for cancer classification.

References

[1] R. Kohavi and G.H. John, Wrappers for feature subset selection, Artif. Intell. 97, no. 1-2 (1997), 273-324.
[2] Z. Zainuddin and P. Ong, Reliable multiclass cancer classification of microarray gene expression profiles using an improved wavelet neural network, Expert Systems with Applications 38 (2011), 13711-13722.
[3] L. Nanni and A. Lumini, Wavelet selection for disease classification by DNA microarray data, Expert Systems with Applications 38 (2011), 990-995.
[4] Y. Yang and J.O. Pedersen, A comparative study of feature selection in text categorization, Proceedings of the Fourteenth International Conference on Machine Learning (ICML '97) (1997), 412-420.
[5] B. Krishnapuram, A.J. Hartemink, L. Carin, and M.A.T. Figueiredo, A Bayesian approach to joint feature selection and classifier design, IEEE Transactions on Pattern Analysis and Machine Intelligence 26, no. 9 (2004), 1105-1111.
[6] S.B. Dong and Y.M. Yang, Hierarchical web image classification by multi-level features, Proceedings of the First International Conference on Machine Learning and Cybernetics, Beijing (2002), 663-668.
[7] R. Setiono and H. Liu, Feature selection via discretization, IEEE Transactions on Knowledge and Data Engineering 9 (1997), 642-645.
[8] H. Hu, J. Li, H. Wang, and G. Daggard, Combined gene selection methods for microarray data analysis, Proceedings of the 10th International Conference on Knowledge-Based Intelligent Information and Engineering Systems, Bournemouth, UK (2006), 911.
[9] S.A. Vinterbo, E.Y. Kim, and L. Ohno-Machado, Small, fuzzy and interpretable gene expression based classifiers, Bioinformatics 21, no. 9 (2005), 1964-1970.
[10] L. Sun, D. Miao, and H. Zhang, Gene selection with rough sets for cancer classification, Fourth International Conference on Fuzzy Systems and Knowledge Discovery, Haikou (2007), 167-172.
[11] J. Khan, J.S. Wei, M. Ringner, L.H. Ladanyi, F. Westermann, F. Berthold, et al., Classification and diagnostic prediction of cancers using gene expression profiling and artificial neural networks, Nature Med. 7 (2001), 673-679.
[12] M. Muselli, M. Costacurta, and F. Ruffino, Evaluating switching neural networks through artificial and real gene expression data, Artificial Intelligence in Medicine 45 (2009), 163-171.
[13] Y. Wang and I.V. Tetko, Gene selection from microarray data for cancer classification: a machine learning approach, Comp. Biol. Chem. (2005), 37-46.
[14] M. Hall, G. Holmes, B. Pfahringer, P. Reutemann, et al., The WEKA data mining software: an update, SIGKDD Explorations 11, no. 1 (2009).
[15] O. Dagliyan, F. Uney-Yuksektepe, I.H. Kavakli, and M. Turkay, Optimization based tumor classification from microarray gene expression data, PLoS ONE 6, no. 2, e14579 (2011).
[16] T.R. Golub, D.K. Slonim, P. Tamayo, C. Huard, M. Gaasenbeek, et al., Molecular classification of cancer: class discovery and class prediction by gene expression monitoring, Science 286 (1999), 531-537.
[17] A.C. Tan and D. Gilbert, Ensemble machine learning on gene expression data for cancer classification, Appl. Bioinformatics 2 (2003), S75-S83.
[18] M. Okuya, H. Kurosawa, J. Kikuchi, Y. Furukawa, H. Matsui, et al., Upregulation of survivin by the E2A-HLF chimera is indispensable for the survival of t(17;19)-positive leukemia cells, J. Biol. Chem. 285, no. 18 (2010), 5060.
[19] R. Hewett and P. Kijsanayothin, Tumor classification ranking from microarray data, BMC Genomics 9, Suppl. 2 (2008), S21.
[20] A. Statnikov, C.F. Aliferis, I. Tsamardinos, D. Hardin, et al., A comprehensive evaluation of multicategory classification methods for microarray gene expression cancer diagnosis, Bioinformatics 21 (2005), 631-643.
[21] J.G. Zhang and H.W. Deng, Gene selection for classification of microarray data based on the Bayes error, BMC Bioinformatics 8 (2007), 370.

Data mining with learning decision tree and Bayesian network for
data replication in Data Grid
Farzaneh Veghari Baheri

Farnaz Davardoost

Department of Computer, Khodaafarin Branch,

Department of Computer, Khosroshahr Branch,

Islamic Azad University, Khodaafarin-Iran.

Islamic Azad University, Khosroshahr-Iran.

Farzaneh Veghari@Yahoo.com

Farnaz Davardoost@Yahoo.com

Vahid Ahmadzadeh

Department of Computer,
Payame Noor University,
PO BOX 19395-3697 Tehran, Iran.
Ahmadzadeh.Vahid@Gmail.com

Abstract: Data management is a major problem in Grid environments. A data Grid is composed of thousands of geographically distributed storage resources, usually located under different administrative domains. The size of the data managed by data Grids is continuously growing and has already reached Petabytes. Large data files are replicated across the data Grid to improve system performance. In this paper, we improve data access time and reduce access latency. A hybrid model is developed by combining a Bayesian network and a learning decision tree. We assume a hierarchical architecture consisting of several clusters. This approach detects which data should be replicated. Initially, the algorithm calculates the Entropy of the dataset and then the Gain of every attribute. Finally, the probability of the outcome is calculated with the Bayesian expression, and a replication rule is produced. We simulate this approach to evaluate the performance of the proposed hybrid method. The simulation results show that the data access time is reduced.

Keywords: Bayesian Network; Data Replication; Entropy; Gain; Grid; Learning Decision Tree.

Introduction

In recent years, applications such as bioinformatics, climate transition, and high energy physics have produced large datasets from simulations or experiments. Managing this huge amount of data in a centralized way is ineffective due to extensive access latency and the load on the central server. In order to solve these kinds of problems, Grid technologies have been proposed. Data Grids aggregate a collection of distributed resources placed in different parts of the world to enable users to share data and resources (Chervenak et al., 2000; Allcock et al., 2001; Foster, 2002; Worldwide LHC Computing Grid, 2011). Data replication has been used in database systems and data Grid systems. Data replication is an important technique for managing large data in a distributed manner. The general idea of replication is to place replicas of data at various locations. Learning decision trees and Bayesian networks are widely used in many areas, such as data mining, classification systems, decision support systems and so on.

A decision tree is a model of inductive learning from observation. Decision trees are created from training data in a top-down direction. A learning decision tree is a hierarchical tree structure which is divided based on a single attribute at each internal node. The first stage of a learning decision tree is the root node, which is allocated all the examples from the training set. If all examples belong to the same class, then no further decisions are needed to partition the examples, and the solution is complete. If the examples at a node belong to two or more classes, then a test is made at the node that results in a split. The process is repeated for each of the new nodes until a discriminating tree is complete.

Bayesian networks are popular within the artificial intelligence community due to their ability to support probabilistic reasoning from data with uncertainty. A Bayesian Network (BN) is a directed acyclic graph that represents relationships of a probabilistic nature among variables of interest. With a network at hand, probabilistic inference can be conducted to predict the values of some variables based on the observed values of other variables and to find a pattern in the training data [1, 2, 3, 4, 5, 6, 7].

In this paper, we present a hybrid model composed of learning decision trees and Bayesian networks derived from a running database. We assume a hierarchical architecture of the data Grid system. The proposed architecture comprises several clusters, each of which comprises several sites. At first, a decision tree based on the ID3 learning algorithm is created; then, a set of decision rules is generated for data replication in the Grid environment. We simulate our method to evaluate the performance of this training method. Providing the replication rule for data increases the performance of the system, and a more optimal solution than with the other methods is obtained. In summary, the data access time is reduced with the proposed hybrid method. Section 2 introduces some previous work on data replication. Section 3 explains our proposed method in detail, and in Section 4 we evaluate the proposed method. Finally, conclusions are presented in Section 5.

Related work

Some recent studies have discussed the problem of replication in data Grids. Some of these works are surveyed in this section. In [8] six distinct strategies are presented for the multi-tier data Grid. These strategies are as follows:

1. No Replication: only the root node holds the replicas.
2. Best Client: a replica is created for the client which accesses the file most frequently.
3. Cascading: a replica is created on the path to the best client.
4. Plain Caching: a local copy is stored on initial request.
5. Caching plus Cascading: combines the plain caching and cascading strategies.
6. Fast Spread: file copies are stored at each node on the path to the best client.

In [9] the authors discussed a new dynamic replication method in a multi-tier data Grid called predictive hierarchical fast spread (PHFS), which is an extended version of fast spread. Considering spatial locality, PHFS tries to increase locality in accesses by predicting users' subsequent file demands and pre-replicating them beforehand in a hierarchical manner. In PHFS, in order to eliminate the delay of replication on request, data must be replicated in advance using the concept of predicting future requests, whereas we use a learning decision tree and a Bayesian network to determine which data should be replicated.

In [10], a hybrid model is advanced by integrating a case-based data clustering procedure and a fuzzy decision tree for medical data classification. A large amount of research has been conducted to study the behavior of groups of medical symptoms. However, researchers are more interested in discovering potential disease factors. Therefore, the authors take a different approach by proposing a case-based fuzzy decision tree to diagnose potential illness symptoms. In [10] the authors used decision trees for medical data classification, while in this paper we use the ID3 learning algorithm in decision trees for data replication in the data Grid.

The Proposed Architecture

The performance of replication strategies is highly dependent on the architecture of the data Grid. One of the basic models is the hierarchical data model, which is also known as multi-tier. In this paper, we assume a hierarchical architecture with 2 tiers; furthermore, our architecture is organized into clusters. This hierarchical architecture is shown in Fig. 1.

Figure 1: Hierarchical architecture for data management

Tier 0 is the broker, which is responsible for replicating data; at tier 1, there are the cluster heads. A cluster head is a central node which manages all the nodes of a cluster and monitors the status of the nodes. At tier 2, there are the users, from which requests are inserted into the system.

3.1 Learning Decision Tree

A decision tree is a hierarchical model for supervised learning whereby the local region is identified in a sequence of recursive splits in a smaller number of steps. A decision tree is composed of internal decision nodes and terminal leaves. We want to determine which data must be replicated according to the historical data access. The broker and the cluster heads hold tables like Table 1, which include the following fields [11].

Table 1: Database in Broker and Cluster Heads

Field name    | Description                                  | Values
ID            | Data identification                          | Number
Access number | Number of accesses to the data               | Low, Mid, High
Priority      | Importance of the data                       | Low, High
Service time  | Length of time for allocating requested data | Low, High
Size of data  | Size of the data                             | Low, High

In this paper, we present the basic algorithm for decision tree learning, corresponding approximately to ID3, where the Examples are according to Table 1, the target attribute is data replication, and the attributes are the fields of the table (Access Number, Priority, Service Time, Size of Data). The summary of the ID3 algorithm is as follows:

ID3(Examples, Target_attribute, Attributes)
  Create a Root node for the tree
  If all Examples are positive, return the single-node tree Root, with label = +
  If all Examples are negative, return the single-node tree Root, with label = -
  If Attributes is empty, return the single-node tree Root, with label = most common value of Target_attribute in Examples
  Otherwise begin
    A <- the attribute with the highest Gain among all attributes
    The decision attribute for Root <- A
    For each possible value vi of A:
      Add a new tree branch below Root, corresponding to the test A = vi
      Let Examples_vi be the subset of Examples that have value vi for A
      If Examples_vi is empty:
        Below this new branch add a leaf node with label = most common value of Target_attribute in Examples
      Else below this new branch add the subtree ID3(Examples_vi, Target_attribute, Attributes - {A})
  End
  Return Root

The central choice in the ID3 [16] algorithm is selecting which attribute to test at each node in the tree. What is a good quantitative measure of the worth of an attribute? We define a statistical property called information Gain. In order to define information Gain precisely, we begin by defining a measure commonly used in information theory, called Entropy. Equation (1) shows the formula for calculating the Entropy:

Entropy(S) = - Σ_{i=1}^{c} p_i log2(p_i)    (1)

The information gain, Gain(S, A), of an attribute A relative to a collection of examples S is defined as Equation (2):

Gain(S, A) = Entropy(S) - Σ_{v ∈ Values(A)} (|S_v| / |S|) Entropy(S_v)    (2)

where Values(A) is the set of all possible values for attribute A, and S_v is the subset of S for which attribute A has value v (i.e., S_v = {s ∈ S | A(s) = v}). Note that the first term in Equation (2) is just the entropy of the original collection S, and the second term is the expected value of the entropy after S is partitioned using attribute A. The expected entropy described by this second term is simply the sum of the entropies of each subset S_v, weighted by the fraction |S_v|/|S| of examples that belong to S_v. Gain(S, A) is therefore the expected reduction in entropy caused by knowing the value of attribute A.
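As a concrete illustration of Equations (1) and (2), the following Python sketch (our own illustration, with record field names assumed from Table 1) computes the entropy of a labelled sample and the information gain of an attribute, which is exactly the quantity ID3 maximizes at each node:

from math import log2
from collections import Counter

def entropy(examples, target="replicate"):
    """Entropy(S) = -sum_i p_i * log2(p_i) over the target-value proportions."""
    counts = Counter(e[target] for e in examples)
    total = len(examples)
    return -sum((c / total) * log2(c / total) for c in counts.values())

def information_gain(examples, attribute, target="replicate"):
    """Gain(S, A) = Entropy(S) - sum_v |S_v|/|S| * Entropy(S_v)."""
    total = len(examples)
    remainder = 0.0
    for value in {e[attribute] for e in examples}:
        subset = [e for e in examples if e[attribute] == value]
        remainder += len(subset) / total * entropy(subset, target)
    return entropy(examples, target) - remainder

# Toy usage: ID3's central choice is the attribute with the highest Gain.
S = [
    {"access_number": "High", "priority": "High", "replicate": "Yes"},
    {"access_number": "High", "priority": "Low",  "replicate": "Yes"},
    {"access_number": "Low",  "priority": "High", "replicate": "No"},
    {"access_number": "Low",  "priority": "Low",  "replicate": "No"},
]
for attr in ("access_number", "priority"):
    print(attr, information_gain(S, attr))  # access_number: 1.0, priority: 0.0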

3.2 Bayesian Networks

A Bayesian network can represent the probabilistic relationships between data and data replication. Given the observed attribute values, the Bayesian network can be used to compute the probabilities of the use of data in the future. To develop a Bayesian network, we first derive a DAG such as the decision tree; then we specify the conditional probability distributions of each variable. We then calculate the probability of each attribute. We find the influence factor for all the attribute values. The influence factor gives the dependability of the attribute value on the class label. The formula for the influence factor for a particular class Ci is given in Equation (3):

I(Aj = xi, Ci) = N(Aj = xi ∧ Ci) / N(Ci)    (3)

where Aj is the attribute currently considered for calculation, j varies from 1..n, where n refers to the maximum number of predictive attributes, and k is the maximum number of attribute values for the attribute Aj [12, 13, 14, 15].
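A minimal sketch of Equation (3), assuming records shaped like the Table 1 rows, is the following; it computes the fraction of class-Ci examples in which attribute Aj takes value xi from simple counts:

def influence_factor(examples, attribute, value, target_class,
                     target="replicate"):
    """I(Aj = xi, Ci) = N(Aj = xi and Ci) / N(Ci)."""
    in_class = [e for e in examples if e[target] == target_class]
    if not in_class:
        return 0.0
    matching = sum(1 for e in in_class if e[attribute] == value)
    return matching / len(in_class)

# Toy usage: how strongly "access_number = High" indicates replication.
S = [
    {"access_number": "High", "replicate": "Yes"},
    {"access_number": "High", "replicate": "Yes"},
    {"access_number": "Mid",  "replicate": "Yes"},
    {"access_number": "Low",  "replicate": "No"},
]
print(influence_factor(S, "access_number", "High", "Yes"))  # 2/3 = 0.666...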
3.3 Extracting rules from the combination of learning decision tree and Bayesian network

We assume that the learning decision tree for the target value data replication is as shown in Figure 2. Each path from the root to a leaf can be written down as a set of IF-THEN rules.

Figure 2: Example of Decision Tree

The rule base allows knowledge extraction. The rules reflect the main characteristics of the dataset. The decision tree of Figure 2 can be written down as the following set of rules:

No Replication Rule:
IF
{
(Access Number = Low) OR
(Access Number = Mid AND Priority = Low) OR
(Access Number = Mid AND Priority = High AND Service Time = Low) OR
(Access Number = Mid AND Priority = High AND Service Time = High AND Size of Data = High)
}
THEN Replicate = No

Replication Rule:
IF
{
(Access Number = High) OR
(Access Number = Mid AND Priority = High AND Service Time = High AND Size of Data = Low) OR
(Access Number = Mid AND Priority = High AND Service Time = High AND Size of Data = Mid)
}
THEN Replicate = Yes
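Written as executable logic, the two rules above collapse into a single predicate. The following Python sketch is a direct transcription; the Low/Mid/High value strings follow Table 1:

def should_replicate(access_number, priority, service_time, size_of_data):
    """Return True when the Replication Rule above fires."""
    if access_number == "High":
        return True
    if (access_number == "Mid" and priority == "High"
            and service_time == "High" and size_of_data in ("Low", "Mid")):
        return True
    # Every remaining combination falls under the No Replication Rule.
    return False

# Toy usage: a frequently accessed file is replicated; a rarely
# accessed one is not.
print(should_replicate("High", "Low", "Low", "High"))   # True
print(should_replicate("Low", "High", "High", "Low"))   # False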


Simulations

We evaluate and compare the performance of our approach with the no-replication algorithm under conditions in which the number of clusters varies. Figure 3 illustrates the comparison of access time for 4, 8 and 12 clusters.

Figure 3: Access time for various cluster numbers

As shown in Figure 3, access time decreases as the number of clusters increases. Comparing the access time of our approach with no replication shows a 22% decrease with 4 clusters, a 27% decrease with 8 clusters and, finally, a 34% decrease with 12 clusters.

Conclusion

In this paper, we presented a hybrid model composed of a learning decision tree and a Bayesian network for a replication strategy in the data Grid. We assume a hierarchical architecture of the data Grid system. The proposed architecture comprises several clusters, each of which comprises several sites. Finally, a set of decision rules is generated for data replication in the Grid environment. We simulated our method to evaluate the performance of this training method; it obtains a more optimal solution than the other methods. Providing the replication rule for data increases the performance of the system. In summary, the data access time is reduced with the proposed hybrid method.

References

[1] J. Zhang, B. Lee, X. Tang, and C. Yeo, A model to predict the optimal performance of the hierarchical data Grid, Future Generation Computer Systems 26 (2010).
[2] J. Perez, F. Garcia-Carballeira, J. Carretero, A. Calderon, and J. Fernandez, Branch replication scheme: a new model for data replication in large scale data Grids, Future Generation Computer Systems 26 (2010), 12-20.
[3] R. Chang, C. Lin, and S. Hsi, Accessing data from many servers simultaneously and adaptively in data Grids, Future Generation Computer Systems 26 (2010), 63-71.
[4] N. Mansouri and G. Dastghaibyfard, A dynamic replica management strategy in data Grid, Journal of Network and Computer Applications.
[5] B. Chandra and P.P. Varghese, Fuzzifying Gini Index based decision trees, Expert Systems with Applications 36 (2009), 8549-8559.
[6] T.D. Schneider, Information Theory Primer, 1995. Available from: ftp://ftp.ncifcrf.gov/delila/primer.ps.
[7] K. Sashi and A. Selvadoss Thanamani, Dynamic replication in a data grid using a Modified BHR Region Based Algorithm, Future Generation Computer Systems 27 (2011), 202-210.
[8] K. Ranganathan and I. Foster, Design and evaluation of dynamic replication strategies for a high performance data Grid, International Conference on Computing in High Energy and Nuclear Physics (2001).
[9] L. Mohammad Khanli, Ayaz Isazadeh, and Tahmuras N. Shishavan, PHFS: a dynamic replication method to decrease access latency in the multi-tier data grid, Future Generation Computer Systems 27 (2011), 233-244.
[10] C.Y. Fan, P.C. Chang, J.J. Lin, and J.C. Hsieh, A hybrid model combining case-based reasoning and fuzzy decision tree for medical data classification, Applied Soft Computing (2010).
[11] L. Mohammad Khanli and F. Veghari Baheri, A hybrid model combining decision tree and Bayesian network for data replication in Grid environment, Journal of Telecommunications 3, no. 1 (2010).
[12] S.A.a. Balamurugan and R. Rajaram, Effective solution for unhandled exception in decision tree induction algorithms, Expert Systems with Applications 36 (2009), 12113-12119.
[13] T. Amjad, M. Sher, and A. Daud, A survey of dynamic replication strategies for improving data availability in data grids, Future Generation Computer Systems 28 (2012), 337-349.
[14] N. Xiong, Learning fuzzy rules for similarity assessment in case-based reasoning, Expert Systems with Applications 38 (2011), 10780-10786.
[15] L. Mohammad Khanli, F. Mahan, and A. Isazadeh, Active rule learning using decision tree for resource management in Grid computing, Future Generation Computer Systems 27 (2011), 703-710.
[16] Tom M. Mitchell, Machine Learning, McGraw-Hill Science/Engineering/Math (March 1, 1997).

Design and Implementation of a three-node Wireless Network


For Transferring Patients' Medical Information without Data Collision

Roya Derakhshanfar

Islamic Azad University


Department of Biomedical Engineering, Science and Research Branch
Tehran, Iran
r.derakhshanfar@srbiau.ac.ir

Maisam M.Bassiri
Iran University of Science and Technology
Department of Electrical Engineering
Tehran, Iran
basiri@iust.ac.ir

S.Kamaledin Setarehdan
University of Tehran
Control and Intelligent Processing Center of Excellence, School of ECE, College of Engineering
Tehran, Iran
ksetareh@ut.ac.ir

Abstract: The purpose of this paper is to introduce a method for transmitting patients' data using a wireless network. In this network, the patients' data is first gathered at a central station and from there it is sent to a computer. On the computer, the patients' profiles are created so that their medical information can be monitored at every moment. The protocol between the master and the slaves provides synchronous data transfer without collision. Another protocol is provided between the computer and the master in order to collect, save and process the data.

Keywords: Medical devices; Telemedicine; Codevision Software; Wireless networks.

Introduction

Telemedicine was pioneered at the beginning of the 20th century, for example in the field of maritime medicine, by using telecommunication and Morse code [1]. Telemedicine means that patients can receive health services such as prognosis of disease, advance diagnosis, treatment services and therapy procedures at any time and in any geographical area. This is significant for providing better facilities for people living in outlying regions and for people located in deprived regions. The evolution of telemedicine has given birth to a widespread nomenclature: e-health, telediagnosis, teletreatment, telemonitoring and telehealth. Telemedicine has been employed to surmount distance, and in this respect wireless communication devices are reliably helpful. The improvements in telemedicine and communication technologies such as wireless networks generate supporting systems for the management of chronic illnesses such as heart diseases and hypertension, and aid physicians in the careful examination of disease in any situation. Nowadays, computer-based systems are used for clinical applications. In the territory of telemedicine, monitoring applications using wireless networks have been developed. A telemedicine service requires a computer-based system which can control a patient's health condition using modern monitors and can transmit the patient's profile for assessment as soon as possible. Telemedicine has advanced in the industrialized countries. It can be convenient for reducing costs and the disinclination of elderly people to return to hospitals or clinical centers. The purpose of telemedicine is to provide immediate medical treatment through modern monitors, wireless networks and telecommunication procedures such as satellites, mobile e-health applications and specific medical devices based on sensors and microcontrollers, to help patients. Several articles in the domain of telemedicine have been published. Andreas Lymberis and Silas Olsson have described the current status of multidisciplinary research and development of IBC (intelligent biomedical clothing), based on bibliographic research [1]. Anthoula P. Anagnostaki, Sotiris
Pavlopoulos, Efthivoulos Kyriakou and Dimitris Koutsouris, have discussed a novel codification scheme based
on two healthcare informatics standards, the VITAL
and DICOM sup.30, in addressing the robust interchange of waveform and medical data for a home care
application [2]. Edward Mutafungwa, Zhong Zheng,
Jyri Hamalainen, Mika Husso and Timo Korhonen,
have proposed a complementary solution based on the
emerging femtocellular approach for indoor emergency
telemedicine scenarios [3]. A.Yadollahi, Z.Moussavi
and P.Yahampath, have described an adaptive method
for compression of respiratory and swallowing sounds
[4]. Claudio De Capua, Antonella Meduri and Rosario
Morello, have presented an original ECG measurement
system based on web-service-oriented architecture to
monitor the heart health of cardiac patients [5]. Alfonso Prieto-Guerrero, Corinne Mailhes and Francis
Castanie, have described a method based on left-sided
and right-sided autoregressive modeling to interpolate
missing samples in an ECG signal [6]. Tae-Soo Lee,
Joo-Hyun Hong and Myeong-Chan Cho, have implemented a portable and wearable biomedical digital assistant using a personal digital assistant (PDA) and
an ordinary cellular phone [7]. Daniel Lucani, Giancarlos Cataldo, Julio Cruz, Guillermo Villegas and
Sara Wong, have developed a prototype of a portable
ECG-monitoring device for clinical and non-clinical environments as part of a telemedicine system to provide remote and continuous surveillance of patients
[8]. Marco J.Suarez Baron, Juan J.Velasquez, Carlos A.Cifuentes and Luis E.Rodriguez, have introduced
an algorithm to telemedicine intelligent through web
mining and instrumentation wearable [9]. J.Escayola,
I.Martinez, J.Trigo, J.Garcia, M.Martinez-Espronceda,
S.Led and L.Serrano, have described an extended review of the most recent innovative advances in biomedical engineering applied to the standard-based design

for ubiquitous and personal healthcare environments


[10]. A.Bramanti, L.Bonanno, A.Celona, S.Bertuccio,
A.Calisto, P.Lanzafame and P.Bramanti, have identified regional spots as potential territorial stations for
the telemedicine interventions delivery through the use
of GIS (geographical information system), a technology
that is recently considered an important and new component for many epidemiological and health projects
[11]. Cristian Ravariu and Florin Babarada, have offered a web protocol for medical diagnosis, under educational projects, as application to learning, bioinformatics and telemedicine [12]. Yibao Wang, Yang
Liu, Xudong Lu, Jiye An and Huilong Duan, have proposed a simple and complete representation of biosignal
data based on MFER and CDA [13]. Stefano Bonacina
and Marco Masseroli, have designed a web application
that enables different healthcare actors to insert and
browse healthcare data, bio-signals and biomedical images of patients enrolled in a program of cardiovascular
risk prevention [14]. Mohd Fadlee A.Rasid and Bryan
Woodward, have described the design of a processor,
which samples signals from sensors on the patient. It
then transmits digital data over a Bluetooth link to a mobile telephone that uses the general packet radio service (GPRS) [15]. This study consists of two slaves and one master. Each slave is used as an interface for receiving a patient's information, and the master is in fact a central station. The patients' information is gathered in the master, through which it is sent to a PC. In this way, the patients' information profiles are created on the PC. This enables doctors to evaluate each patient's medical information at any moment. Here, the drug and temperature information enters the central station (master) through each of the slaves, with its characteristic code. Master/slave is a model of communication where one device or process has unidirectional control over one or more other devices. In some systems a master is elected from a group of eligible devices, with the other devices acting in the role of slaves [16].

2 Material and Methods

2.1 Hardware

In this study, HM-T and HM-R have been used as the


transmitter and receiver respectively. HM-R is a receiver module that is easily usable through serial connection to a microcontroller or a PC. This module is
based on FSK modulation which provides longer working distance and less interference in comparison to ASK
technology. For carrying out data transmission, a receiver pair is needed. For this purpose, an HM-T transmission module with three pins, two for the module


supply and one for data, has been used. It is enough to
deliver data to this module through serial interface so
that it can send the data via FSK modulation. These
modules can be found in different working frequencies of 315, 433, 868 and 915 MHz. According to the
datasheet, these modules have three options for baud
rate. Considering the necessity of this study to have
noise reduction and achieve maximum efficiency, 4800
bps was chosen as the most suitable baud rate for the
application. In this baud rate, the modules work better and data is received more accurately and precisely.
Baud rate set-up was done through CodeWizardAVR. An important note which has to be considered when using these modules is that they go into a sleep mode if there is no data transfer for 70 ms. At first the module is off. When data transfer is started, the module is in its sleep mode. In order to wake it, we have to send some dummy data, and then it is ready to transfer the main data. The modules work at different frequencies
in master and slave. In master, the HM-T transmission
module and the HM-R receiver module are connected
to the microcontroller with frequencies of 433 MHz and
915 MHz respectively. In slave, the HM-T transmitter
and the HM-R receiver module are connected to the
microcontroller with frequencies of 915 MHz and 433
MHz respectively. The reason for putting the modules
to work with different frequencies in master and slave
is to prevent interference between sending and receiving. The SMT160 is a smart temperature sensor with a PWM digital output; therefore, it can be connected to the microcontroller without using an ADC converter. The thermal range of this sensor is from -45 °C to 150 °C, and its output is a rectangular waveform whose duty cycle depends on temperature. This dependency is given by the following linear equations:

DC = T1 / (T1 + T2)    (1)

t = (DC - 0.32) / 0.0047    (2)

where DC is the duty cycle, T1 and T2 are the durations of the high and low portions of the waveform, and t is the temperature in degrees Celsius. For measuring temperature with the SMT160 sensor, it is enough to measure the DC; the temperature is then obtained using equation (2). The easiest way to measure the DC is using a microcontroller. In an AVR microcontroller, the DC can be measured with different methods. In the method used in this study, it is enough to read and poll a pin in the program, using two counters that accumulate the time the pin is high (T1) and low (T2). Using these two counters, the duty cycle is obtained and the rest is calculated as mentioned before. In the slave, an ATmega16 AVR microcontroller has been used. A keyboard is connected to port B and an LCD to port A of the microcontroller. The temperature sensor is connected to pin PD4, the HM-T transmission module to the TX pin (pin No. 15) and the HM-R receiver module to the RX pin (pin No. 14). In the master, an ATmega16 has been used as well. The module connections to the micro in the master are the same as the ones in the slave. Besides communication with the receiver and transmitter modules, the master should be capable of communicating with the PC through a serial interface. The serial interface is provided via a MAX232 voltage converter IC and a proper serial cable. The hardware setup is shown in Figure 1.
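As an illustration of equations (1) and (2), the following sketch (ours, not the CodeVision firmware; t1_counts and t2_counts are assumed to be the high-time and low-time counter values described above) converts the measured counts into a temperature:

def smt160_temperature(t1_counts, t2_counts):
    """Apply DC = T1/(T1+T2) and t = (DC - 0.32)/0.0047."""
    duty_cycle = t1_counts / (t1_counts + t2_counts)
    return (duty_cycle - 0.32) / 0.0047

# Toy usage: a 43.75% duty cycle corresponds to 25 degrees Celsius.
print(round(smt160_temperature(4375, 5625), 1))  # 25.0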

2.2 Software

The serial interface can be explained in two parts: one between the master and the slaves, and the other between the master and the PC.

Figure 1: Hardware setup

In the communication protocol between master and slave, data is gathered in the slaves by the microcontroller and then sent to the master. The request issued by the master is as follows: first it sends a code consisting of the star character, the channel number and the square character, in sequence, as the transmission code. Then the slave responds to the master with a code consisting of the star character, the channel number, the drug, the temperature, and finally the checksum (CS) and the square character, in sequence. The checksum is calculated as the sum of the ASCII codes of the star character, the channel number, the drug and the temperature. The CS is calculated at both the master and the slave sides; an error is detected if there is any difference between the two. The explained protocol is written in CodeVision software for both the master and slave parts.

In the communication protocol between the master and the PC, the temperature and drug of each channel are defined by three characters. Via this protocol the PC can read the sent data from the corresponding port and finally display or save it if necessary. This protocol has been written in Visual Basic.
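A rough sketch of the slave response frame and the master-side check described above is given below. The framing ('*' ... '#', with a checksum over the ASCII codes) follows the text; the exact byte layout and the modulo-256 reduction of the checksum are our assumptions for illustration:

def build_response(channel, drug, temperature):
    """Slave response: '*', channel, drug, temperature, checksum, '#'."""
    payload = "*" + channel + drug + temperature
    checksum = sum(ord(ch) for ch in payload) % 256  # modulo keeps one byte
    return payload + format(checksum, "03d") + "#"

def check_response(frame):
    """Master side: recompute the checksum and compare with the received one."""
    payload, received = frame[:-4], int(frame[-4:-1])
    return sum(ord(ch) for ch in payload) % 256 == received

# Toy usage: channel 0 reports drug D05 at 37 degrees.
frame = build_response("0", "D05", "037")
print(frame, check_response(frame))  # a valid frame checks out as True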


Results

The built setup works as follows considering both of the


explained protocols: the temperature corresponding to
each of the slaves is displayed against the word temp
on the LCD. The temperatures of the two slaves are
not necessarily equal. The numbers from 1 to 8 have
been assigned to each of the 8 special drugs, one by
one. For selecting a special drug, it is simply enough
to push the assigned number on the keyboard and then
press F1 key to confirm it. The drug information is sent
according to the number and is displayed against the
word drug. For example, if the channel 0 slave needs
the fifth drug, after pressing No.5 on the keyboard and
confirming it by pressing F1, drug number is saved and
displayed in the form of D5 in drug section of the table
on the PC when the default program delay is passed.
This means drug No.5 is requested. This program is
also capable of sending several drugs in sequence. Another feature of this program is that if, for any reason, any of the slaves is disconnected and data has not been delivered to the master, the master can recognize this by the channel number and announce the disconnection. For example, if the channel 1 (CH1) slave is disconnected, the message "CH1 disconnected" is shown at the bottom of the table in the exe file.

Conclusion

Today, developments in medical technology have greatly affected clinical examination and consultation methods in hospitals and health care centers. The emergence of wireless networks for making connections with people in different places and obtaining desirable results has led many hospitals to use different methods to computerize their affairs. In this study, transferring the drug and temperature information of two patients without collision was investigated using two slaves. Adding other patients is possible by providing the related slave circuits and extending the program and hardware accordingly. We hope that in the near future, patients' electronic profiles will be accessible to doctors and nurses via wireless network communication, so that better control and care in health and treatment can be provided.

References

[1] Andreas Lymberis and Silas Olsson, Intelligent biomedical clothing for personal health and disease management: state of the art and future vision, Telemedicine Journal and e-health 9 (2003), no. 4.
[2] Anthoula P. Anagnostaki, Sotiris Pavlopoulos, Efthivoulos Kyriakou, and Dimitris Koutsouris, A novel codification scheme based on the VITAL and DICOM standards for telemedicine applications, IEEE Trans. Biomed. Eng. 49 (2002), no. 12.
[3] Edward Mutafungwa, Zhong Zheng, Jyri Hamalainen, Mika Husso, and Timo Korhonen, Exploiting femtocellular networks for emergency telemedicine applications in indoor environments, IEEE (2010).
[4] A. Yadollahi, Z. Moussavi, and P. Yahampath, Adaptive compression of respiratory and swallowing sounds, IEEE, 28th Annual Int. Conf., New York (2006).
[5] Claudio De Capua, Antonella Meduri, and Rosario Morello, A smart ECG measurement system based on web-service-oriented architecture for telemedicine applications, IEEE Trans. Instrumentation and Measurement 59 (2010), no. 10.
[6] Alfonso Prieto-Guerrero, Corinne Mailhes, and Francis Castanie, Lost sample recovering of ECG signals in e-health applications, IEEE, 29th Annual Int. Conf., Lyon, France (2007).
[7] Tae-Soo Lee, Joo-Hyun Hong, and Myeong-Chan Cho, Biomedical digital assistant for ubiquitous healthcare, IEEE, 29th Annual Int. Conf., Lyon, France (2007).
[8] Daniel Lucani, Giancarlos Cataldo, Julio Cruz, Guillermo Villegas, and Sara Wong, A portable ECG monitoring device with Bluetooth and Holter capabilities for telemedicine applications, IEEE, 28th Annual Int. Conf., New York (2006).
[9] Marco J. Suarez Baron, Juan J. Velasquez, Carlos A. Cifuentes, and Luis E. Rodriguez, An approach to telemedicine intelligent, through web mining and instrumentation wearable, IEEE (2011).
[10] J. Escayola, I. Martinez, J. Trigo, J. Garcia, M. Martinez-Espronceda, S. Led, and L. Serrano, Recent innovative advances in biomedical engineering: standard-based design for ubiquitous p-health, IEEE, 4th Int. Multi-Conference on Computing in the Global Information Technology (2009).
[11] A. Bramanti, L. Bonanno, A. Celona, S. Bertuccio, A. Calisto, P. Lanzafame, and P. Bramanti, GIS and spatial analysis for costs and services optimization in neurological telemedicine, IEEE, 32nd Annual Int. Conf., Buenos Aires, Argentina (2010).
[12] Cristian Ravariu and Florin Babarada, The e-healthcare point of diagnosis implementation as a first instance, IEEE, 1st Int. Conf. on Data Compression, Communications and Processing (2011).
[13] Yibao Wang, Yang Liu, Xudong Lu, Jiye An, and Huilong Duan, A general-purpose representation of biosignal data based on MFER and CDA, Third Int. Conf. on Biomedical Engineering and Informatics (2010).
[14] Stefano Bonacina and Marco Masseroli, A web application for managing data of cardiovascular risk patients, IEEE, 28th Annual Int. Conf., New York (2006).
[15] Mohd Fadlee A. Rasid and Bryan Woodward, Bluetooth telemedicine process for multichannel biomedical signal transmission via mobile cellular networks, IEEE Trans. Inform. Tech. Biomed. 9 (2005), no. 1.
[16] http://en.wikipedia.org/wiki/master/slave-(technology).
[17] http://www.atmel.com.
[18] http://www.hy-line.de.
[19] http://www.hoperf.com.
[20] http://www.maxim-ic.com.

CEA Framework: A Comprehensive Enterprise Architecture


Framework for middle-sized company

Elahe Najafi

Ahmad Baraani

MSc of Information Technology

Department of Computer Engineering

Research institute for ICT,Tehran,IRAN

Isfahan University,Isfahan,IRAN

enajafi@aut.ac.ir

ahmadb@eng.ui.ac.ir

Abstract: Designing an architecture for organizations is a complex and confusing process. It is not obvious at which point you should start and how you can continue to achieve a holistic architectural model of an organization. Using the CEA framework (CEAF), a semantic enterprise architecture framework (EAF), brings a new opportunity for enterprise experts to build their enterprise ontology by focusing on one variable at a time without losing the sense of the enterprise as a whole. A number of semantic frameworks like CEAF have been presented by well-known Enterprise Architecture (EA) researchers and experts. A significant goal of all of them is to design a transparent enterprise which is as LEAN as possible, to adapt and adopt external demands and environment changes. To achieve this goal, CEAF is based on a primitive object named Service. This is a substantial characteristic of CEAF which distinguishes it from the other frameworks presented so far.

Keywords: Enterprise Architecture (EA); Enterprise Architecture Framework (EAF); Service Oriented; Service Oriented Framework; Service Oriented Enterprise Architecture (SOEA).

Introduction

Enterprise architecture (EA) is an approach that organizations should practice to integrate their business with Information and Communication Technology (ICT). It presents a comprehensive and rigorous solution describing the current and future structure and behaviour of an organization by employing a logical structure. This structure, comprising a comprehensive collection of different views and aspects of the enterprise, is called an EAF. An EAF is a total picture of an organization showing how all organization elements work together to achieve defined business objectives.
Several distinctive EAFs have been proposed so far, but many organizations are struggling with using these frameworks; the main challenge which current EA frameworks face is that using them is a tedious and complex activity.
In this paper we present a new service oriented semantic framework to reduce this challenge. In the remainder of this paper we discuss the related work in Section 2; the CEA framework is elaborated in detail in Section 3. Finally, directions for future research are discussed in Section 4 to conclude the paper.

1.1 Related Work

Several distinctive EA frameworks (EAFs) have been proposed from 1980 until now, but many organizations are struggling with using these frameworks. The main challenges of the existing frameworks are:

1. Inflexibility in delivering services in a sense-and-respond manner.
2. Lack of well-defined alignment between the services delivered at every level of the organization.
3. Lack of integration between enterprise goals, actions and resources.


4. No support for cross-organizational interaction.
5. Heterogeneous models for each cell.

To eliminate these challenges, a number of researchers have tried to use the SO paradigm with EAFs for generating EA artifacts [1-6]. These researchers believe that this paradigm re-engineers the enterprise into one that senses the environment rapidly and adapts itself to business challenges and opportunities quickly. Although the scope and coverage of these frameworks differ extensively, they do not completely clarify how the combination with an EAF can take advantage of services, nor what a well-defined classification schema to support this combination is. To eliminate the deficiencies of the current SOEAFs, we suggest a new SO semantic framework named CEA in the next section.

2 CEA Framework

The CEA Framework is a two-dimensional normalized scheme which is the intersection of two classifications: the aspects of the organization and the organization audience perspectives. These views and aspects are elaborated in detail as the CEAF rows and columns in the two next sections.

2.1 CEAF Rows

The CEAF rows show the enterprise from various viewpoints. For describing each row we defined a template comprising the items below, and described each row by this template.
Observers: a list of audiences and viewers.
Description: a brief depiction.
Goals: a list of goals targeted by each row.
Critical Questions: a list of questions to be answered by the end of each row.
Organization: a list of the roles and the responsibility of each role, corresponding to the COBIT RACI chart. In this field A, R and I stand for Accountable, Responsible and Informed.
Candidate Patterns: a list of patterns and suitable references.
Prerequisite: a list of prerequisite inputs.
Deliverables: a list of the deliverables of each row.
Based on: a list of theoretical concepts on which each row is based.

2.1.1 First Row: Service Strategy

Observer: Strategist
Description: Service Strategy provides a foundation for enterprise management. It drives all enterprise activities.
Critical Questions:
- What are our business objectives and expectations?
- In which domains and to whom do we offer our services (our stakeholders)?
- What value do we create for our stakeholders?
- What services do we offer to our stakeholders now or plan to offer in the future?
- What is the quality and warranty of our services, to differentiate our services from rivals'?
- Who are our service provider partners?
- What is the pattern of our activities in our value chain or value network to create value for our stakeholders?
- How do we allocate resources efficiently to resolve conflicting demands for shared resources?
Goals:
- Thinking about why and what we want done before thinking about how to do it.
- Define our scope.
- Ensure that the organization is in a position to handle the costs and risks associated with its services.
- Set up a foundation for operational effectiveness and distinctive performance.
Organization: I: Strategists, CEO, CFO, FIO, SOA ESC, SOA BC, AMC, ARB; A: CIO; R: Chief EA Architect
Candidate Patterns: Strategic pattern, Business pattern
Prerequisite: Business strategy
Deliverables: Context Diagram, Service Portfolio, Service Design Requirements
Based on: ITIL Service Strategy
2.1.2 Second Row: Process Service Design

Observer: Customer, Business Owner
Description: Process service design covers the design principles and methods for converting strategic goals and desires into real orchestration services.
Critical Questions:
- Which of our designed services, realizing our business processes, are meaningful for our external stakeholders?
- What are the quality, management and operational requirements of each designed service which must be addressed as a fundamental part of design?
- How can we interact with external service providers and use the services they provide to achieve our IT service targets and business expectations?
- What is the pattern of each process service?
Goals:
- Design of new or changed services for introduction into the live environment.
- Provide a holistic view of all aspects of process design, ensuring that all process services are considered when any individual one changes or is amended.
- Provide service alignment.
Organization: I: Strategists, CFO; A: Chief EA Architect; R: Business analyst, Service architect, Business executive, Business owner
Candidate Patterns: Process pattern
Prerequisite: Service Portfolio, Strategic goals
Deliverables: Service catalogue (high-level process), Process Service Level Agreement, Operational Level Agreement
Based on: ITIL Service Design; IBM Service Model and SOMA technique; Thomas Erl's service oriented approach

2.1.3 Third Row: Business Service Design

Observer: Owner and User
Description: In this level we design the business services as components of the process services.
Critical Questions:
- Which of our designed services, realizing business activities, are meaningful for internal stakeholders?
- What are the quality requirements and constraints of each business service?
- What is the pattern of each business service?
Goals:
- Design of new or changed business services aligned with the process services.
- Provide a holistic view of business design.
- Provide business services that handle business needs.
Organization: I: Business executive, Business owner; A: Chief EA Architect; R: Business analyst, Service analyst, Service designer, Service architect
Candidate Patterns: Business pattern
Prerequisite: Services catalogue (business part), Process SLA, OLA
Deliverables: Detailed service catalogue (business part), Business Service Level Agreement
Based on: ITIL Service Design; IBM Service Model and SOMA technique; Thomas Erl's service oriented approach

2.1.4 Fourth Row: IT Service Design

Observer: IT specialist
Description: In this process we design the basic or complex IT services of each business service. These are components which are loosely coupled, self-contained and stateless.
Critical Questions:
- What are our IT services to realize the business processes?
- What are the non-functional requirements of each IT service?
- What is the design pattern of each IT service?
Goals:
- Design of new or changed IT services aligned with the business services.
- Provide a holistic view of all aspects of system design.
- Provide IT services that realize business needs.
Organization: I: Business analyst; A: Chief EA Architect; R: System analyst, Service analyst, Service designer, IT architect, Application designer, DB designer
Candidate Patterns: Design and architectural patterns
Prerequisite: Service catalogue (business part)
Deliverables: Service catalogue (IT part), Service Solution Architecture, Quality of Services
Based on: ITIL Service Design; IBM Service Model and SOMA technique; Thomas Erl's service oriented approach

2.2 CEA Columns

The aspects of the enterprise are specified by seven categories defining the CEAF columns: Purpose, Policy, Service, Pattern or Practice, Stakeholder, People and Resource. These aspects are described by the template below.
Description: a brief depiction.
Levels: the different elements of each column, shown at different levels.
Deliverables: a list of the deliverables of each column.

2.2.1 First Column: Purpose

Description: In this column, the goals which we want to achieve by means of the remaining columns are enumerated. These goals are defined at various levels, ranging from the most abstract depiction of the business to a more detailed and measurable set of objectives.
Levels:
Strategic headlines: These are the business philosophy, the manner in which services are provided, the governing set of beliefs and values, and a sense of purpose shared by the entire organization, expressed as vision, mission and strategic headlines.
Business goals: These are realistic translations of the abstract strategic headlines. These goals are like mountain peaks that we want to reach in the long term. Achieving these goals ensures that we accomplish the strategic headlines.
Business objectives: These are the quality format of the business goals, providing measurable views of them.
IT targets: IT targets are quantitative goals declared for each quality objective defined at the previous level.
Deliverable: The hierarchical tree between the different types of goals defined in this cell, based on the Balanced Scorecard.

2.2.2 Second Column: Policy

Description: This column is about the policies of the organization. A policy is a management expectation, intention and condition used to ensure that consistent and appropriate decisions, designs and developments of goals, responsibilities, resources and processes are created in the end. Policies are about the constraints and quality of the different types of services exposed in the different rows.
Levels:
Strategy policies: Strategic policies describe the governance rules driving strategic decisions. They should be considered in order to accomplish the strategic mission through well-understood steps, by an agreed date and budget. These policies consider any risks, constraints and limitations affecting the business strategy and the quality of the delivered services.
Orchestration policies: Orchestration policies address any constraints that exist for the composition and integration of business services together.
Business policies: These policies specify constraints, standards and business rules regarding the operation of services.
IT policies: An IT policy is about the quality of IT services. It covers all types of non-functional requirements, like performance, efficiency, security, availability and reliability, which should be addressed by a service-oriented architecture.
Deliverable: Policy relationship map

2.2.3 Third Column: Service

Description: A service is a loosely coupled, self-contained and stateless component that interacts with other services to accomplish business goals and deliver value to customers. In this column we design services at three levels.
Levels:
Process services: Process services provide the control capabilities required to manage the flow and interactions of multiple services in ways that implement business processes. These services represent long-term workflows or macro flows of business processes, implemented by an orchestration of basic and complex business services.
Business services: Each business service may participate in different process services. These services contain business micro-logic and are meaningful to the business-internal viewer of the systems.
IT services: IT services are services which handle the technical view of the system. These types of services include the technology solutions and IT constraints for designing services. An IT service may be composite or basic.

2.2.4 Fourth Column: Pattern or Practice

Description: This column specifies how we can achieve the goals defined in the first column. In this column an organization specifies a solution pattern comprising a set of activities and practices that solve common problems in a given context of business success.
Levels:
Strategic patterns: These patterns leverage best practices along with a collection of proven architectures used in the different domains in which the organization offers its services. Best practices include methodologies, techniques, guidelines and strategies.
Process patterns: This pattern defines the model of orchestration between different services. A process service must carry out its responsibility stand-alone. Each orchestration between services helps us to accomplish some part of our strategic mission.
Business patterns: Business patterns include the patterns matching business service scenarios. Each business service scenario is minimally mapped to the activities it supports, the rules it abides by, the messages it transfers, the data warehouses it retrieves data from, and the information it captures, processes, stores and accesses.
IT patterns: At the IT level we focus on IT solutions to realize business services. At this level we use technical and design patterns to cover business scenarios.
Deliverables: Pattern language

2.2.5 Fifth Column: Stakeholder

Description: In this column we focus on stakeholder management to identify the organization's stakeholders, categorize them, and understand their needs, expectations, responsibilities, authorities and decision rights. These stakeholders are all of the organizations and external people that affect our business and our organization's activities.
Deliverables: Stakeholder model

2.2.6 Sixth Column: People

Description: In this column we define all of the organization's workers and committees which participate in defining the EA. By focusing on people we can clarify: (1) the changes needed in the organization structure, chains of responsibility, authority and communication; (2) the training and skills enhancement needed for personnel and communication management; (3) the new roles and responsibilities that should be defined; (4) the governance structure that must be established.
Deliverables: Organization structure, Chain of authority and responsibility

2.2.7 Seventh Column: Resource

Description: Through the earlier columns we defined our goals and perspectives, the services (and their levels) that we can offer to our stakeholders, the patterns and ways we must follow to achieve the goals, and the human resources needed to accomplish them. The only thing which remains is the other resources. Raw materials, environmental resources, technical resources and financial resources are some examples of the resources which must be considered in this column.
Deliverable: List of resources

Conclusion and Future Work

In this paper we introduced a comprehensive framework using services as the primitive elements. This classification structure shows a holistic view of any organization through seven aspects (Purpose, Policy, Service, Pattern or Best Practice, Stakeholder, People and Resource) from different perspectives. Our future work agenda encompasses using this framework in more case studies and presenting a schema which is suitable for companies of every size.

References

[1] D. Harrison and L. Varveris, TOGAF: establishing itself as the definitive method for building enterprise architectures in the commercial world (2004).
[2] D. Minoli, Enterprise Architecture A to Z: Frameworks, Business Process Modelling, SOA, and Infrastructure Technology, Auerbach Publications, 2008.
[3] J. Schekkerman, How to Survive in the Jungle of Enterprise Architecture Frameworks: Creating or Choosing an Enterprise Architecture Framework, Trafford Publishing, 2006.
[4] A. Ayed, M. Rosemann, E. Fielt, and A. Korthaus, Enterprise architecture and the integration of service-oriented architecture, PACIS 2011 Proceedings, Brisbane, Australia (2011).
[5] A. Nabiollahi, R.A. Alias, and S. Sahibuddin, A service based framework for integration of ITIL V3 and enterprise architecture, Design (2010), 15.
[6] J. Schekkerman, EA and Services Oriented Enterprise (SOE) / Service Oriented Architecture (SOA) and Service Oriented Computing (SOC) (2008).
[7] M. Ibrahim, Service-Oriented Architecture and Enterprise Architecture (2007).

Thick non-crossing paths in a polygon with one hole

Maryam Tahmasbi
Department of Computer Science
Shahid Beheshti University, G.C., Tehran, Iran¹
m_tahmasi@sbu.ac.ir

Narges Mirehi
Department of Computer Science
Shahid Beheshti University, G.C., Tehran, Iran
n.mirehi@mail.sbu.ac.ir

¹ Corresponding Author: T: (+98) 21 29903004

Abstract: We consider the problem of finding a large number of disjoint paths for unit disks moving amidst static obstacles. The problem is motivated by the problem of finding shortest non-crossing paths for aircraft in air traffic management, in which one must determine the shortest paths along which the aircraft can safely move through a domain while avoiding each other and avoiding no-fly zones and predicted weather hazards. We compute K shortest paths for aircraft in a domain with one hole, where K − 1 pairs of terminals lie on the boundary of the domain and one pair of terminals lies on the boundary of the domain and the boundary of the hole. We present an algorithm for solving the problem in polynomial time.

Keywords: K thick paths; Minkowski sum; non-crossing paths; simple polygon with one hole; minsum

1 Introduction

One of the most studied subjects in computational geometry is the shortest path problem [1],[2]. One extension of the geometric shortest path problem is the following [3]: given a set of obstacles and a pair of points (s, t), find a shortest s–t path avoiding the obstacles. The non-crossing paths problem is an extension of the shortest path problem: given a set of obstacles and K pairs of points (sk, tk), find a collection of K non-crossing sk–tk paths such that the paths are optimal according to some criterion. The objective may be either to minimize the sum of the lengths of the paths (minsum version) or to minimize the length of the longest path (minmax version). A thick path is the Minkowski sum of a curve and the unit disk. Two thick paths are called non-crossing when they are non-intersecting; the thick paths are allowed to share parts of the boundary with each other, but the interiors of the paths are disjoint [4]. The problem of finding multiple thick paths (the Thick Non-Crossing Paths Problem), which we consider in this paper, is an extension of both the shortest non-crossing paths [5] and the shortest thick path [6] problems.

Thick path planning in geometric domains is an important computational geometry subject with applications in robotics, VLSI routing, air traffic management (ATM), sensor networks, etc. [3].

The input to the problem is a simple polygonal domain, K pairs of terminals (sk, tk) that are the sources and sinks of the paths, and one hole/obstacle. K − 1 pairs of terminals lie on the boundary of the domain, and one of the points of the last pair lies on the boundary of the hole. The goal is to find K thick non-crossing paths in the domain, not intersecting the hole, such that the total length of the paths is minimum.

2 Motivation

We are motivated by an application in ATM; similar problems may arise in other coordinated motion planning problems in transportation engineering, e.g., shipping vessels, robotic material handling machines, etc. The polygon P models an airspace through which the aircraft intend to fly.


We assume that the aircraft remain at constant altitude (as is often the case during en route flight), so that we can consider the problem to be in a two-dimensional domain. There is an obstacle within P that corresponds to a no-fly zone arising from special use of airspace (military airspace, noise abatement zone, security zone over a city, etc.).
We are interested in determining paths for the aircraft from sources to sinks that can safely be routed through P with optimal total path length, while maintaining safe separation from each other and from the obstacle.

3 Related work

This problem can be viewed as a variation of the Fat-Edge Graph Drawing Problem (FEDP) [7],[8], which, in turn, is an extension of the continuous homotopic routing problem (CHRP), a classical problem in VLSI design [9],[10],[11],[12]. A related problem is that of finding shortest paths homotopic to a given collection of paths [13],[14],[15]. The novelty of our work lies in considering the problem in simple polygons and polygonal domains; the previous research concentrated on point obstacles for the paths. Although only point obstacles are considered in CHRP/FEDP, the existing results on FEDP [7],[8] are more general than our result in some other aspects: the general FEDP receives as input an embedding of an arbitrary planar graph and finds a drawing with edges of maximum thickness; we do not answer the question of finding the maximum separation between the paths. Some heuristics for finding thick non-crossing paths in polygonal domains are suggested in the VLSI literature [16], but neither complexity analysis nor performance guarantees are given there. A very restricted version is considered in [17]. In a rectilinear environment, fast algorithms are known for some special cases of the minsum version [18],[19]. A related problem is considered in [6], where all pairs (sk, tk) lie on the boundary of the polygon and the sources/sinks are not allowed to lie on the boundary of the holes. We extend the work in [6] to the case where one of the sinks/sources lies on the boundary of the hole, and compute the K shortest paths in linear time.

4 Preliminaries

We begin with a formal statement of our problem and a review of some relevant notions and results from previous works [20],[6].

The input is a polygonal domain specified by an outer (simple) polygon P and a hole in it. Let n denote the number of vertices on the boundary of the domain, and let h denote the number of holes, which here equals one. We denote the boundary of P by bdP and the boundary of the hole by bdQ.

A (thin) path is a simple (non-self-intersecting) source–sink curve in the domain. For w ≥ 0 and S ⊆ R², let ⟨S⟩(w) denote the Minkowski sum of S and the disk of radius w centered at the origin. A w-thick path Π is the Minkowski sum of a path π and the disk of radius w: Π = ⟨π⟩(w). The path π is called the reference path for the thick path Π. A 1-thick path is called just a thick path. In this paper we will also use "polygon" to refer to a set whose boundary consists of straight line segments and circular arcs; the complexity of such a polygon is the number of its boundary segments and arcs.
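The Minkowski-sum operation ⟨·⟩(w) used above can be experimented with directly. The following is a small illustrative sketch of ours (not the authors' code); it assumes the Shapely library, whose buffer() method computes exactly the Minkowski sum of a polyline with a disk, and it tests the non-crossing condition by checking that the overlap of two thick paths has zero area.

from shapely.geometry import LineString

def thick_path(reference_pts, w=1.0):
    # <pi>(w): Minkowski sum of the reference polyline with a disk of radius w.
    return LineString(reference_pts).buffer(w)

def non_crossing(P1, P2, eps=1e-9):
    # Non-crossing = disjoint interiors; sharing boundary is allowed,
    # so we only require the common region to have (essentially) zero area.
    return P1.intersection(P2).area < eps

A = thick_path([(0, 0), (10, 0)], w=1.0)
B = thick_path([(0, 2), (10, 2)], w=1.0)   # touches A along the line y = 1
C = thick_path([(0, 1), (10, 1)], w=1.0)   # overlaps both A and B
print(non_crossing(A, B))  # True: only boundaries are shared
print(non_crossing(A, C))  # False: the interiors intersect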

Figure 1: The simple polygon with one hole and pairs (sk, tk).

5 When there is one sink that lies on the hole boundary

We start by recollecting and extending known results on finding multiple thin non-crossing shortest paths [5]. We consider the case of a simple polygon with one hole where there is a pair (si, ti) with si ∈ bdP and ti ∈ bdQ, and the other terminals lie on the boundary of P, as shown in Figure 1. In this setting there exists a linear-time algorithm for finding K shortest thin paths [5]; we want to extend this result and find K thick non-crossing optimal paths using the results in [6].

Let P⁻¹ = P \ ⟨bdP⟩(1) be the 1-unit offset of P inside it. We assume that P⁻¹ is still a simple polygon. Let ST = {(sk, tk), k = 1...K} be the set of K pairs of points on the boundary of P⁻¹ and the boundary of Q¹ (one point lies on the boundary of Q¹).


Let πk be an sk–tk path within P⁻¹; we call sk the start and tk the destination of the kth path. Let Πk be the thick path within P with πk as the reference path, i.e., Πk = ⟨πk⟩(1). Thick paths Π1, . . . , ΠK are called non-crossing if int(Πi) ∩ int(Πj) = ∅ for i ≠ j ∈ {1 . . . K}. Note that we allow the thick paths to share parts of the boundary with each other; we only require that the interiors of the paths are disjoint. We require that for k = 1 . . . K the sk–tk path in the collection is as short as possible given the existence of the other paths. We also assume that the problem instance is feasible, meaning that the polygon is wide enough to accommodate the thick paths. The approach of [5] to the problem was as follows.

First, the boundary of the polygon is mapped to the unit circle bdC and the terminals are identified with their images. Then, a chord sktk is drawn between the terminals in every pair (sk, tk), k = 1 . . . K − 1 (ignoring the pair whose ti lies on the hole). If two of the chords cross, then the problem instance is infeasible. Otherwise, the tree of slices Tsl on C = sl(t1, s1) ∪ ⋃_{k=1}^{K−1} sl(sk, tk) is built, in which the root is the whole circle C, the root's immediate children are sl(s1, t1) and sl(t1, s1), and the parent–child relation is defined by containment of the slices (see Fig. 2, ignoring the shaded disk for now; see also [5] for details).

Figure 2: From left to right: an instance of the problem; the mapping of P and the terminals to the unit circle; the tree of slices Tsl.
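The feasibility test described above only needs to detect crossing chords: two chords of a circle cross exactly when their endpoints interleave in the circular order of the terminals. A minimal sketch of this check (ours, not the code of [5]), with chords given by the positions of their endpoints in the clockwise order around bdC:

def chords_cross(c1, c2):
    # Chords cross iff exactly one endpoint of c2 lies strictly
    # between the endpoints of c1 in the circular order.
    a, b = sorted(c1)
    return (a < min(c2) < b) != (a < max(c2) < b)

def feasible(chords):
    # chords: list of (position of s_k, position of t_k) along the circle.
    return not any(chords_cross(chords[i], chords[j])
                   for i in range(len(chords))
                   for j in range(i + 1, len(chords)))

print(feasible([(0, 3), (1, 2), (4, 5)]))  # True: nested/disjoint chords
print(feasible([(0, 2), (1, 3)]))          # False: the two chords cross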

Let STord = {v1, . . . , v2K−1} be the set {s1, . . . , sK, t1, . . . , tK | ti ≠ tl} ordered clockwise around bdP⁻¹. Similar to [4], we define dk(u, v). Let v, u be two consecutive points in STord, and let γ be a path within C from a point on bdC(v, u) to a point on the chord sktk. The kth depth of bdP⁻¹(v, u), denoted dk(v, u), is defined as the minimum, over all such paths γ, of the number of (other) chords that γ crosses. Let Ok be the set of obstacles obtained by inflating each part of bdP⁻¹ by 2 times its kth depth (arithmetic modulo 2K − 1 is assumed in the indices):

Ok = ⋃_{j=1}^{2K−1} ⟨bdP⁻¹(vj, vj+1)⟩(2·dk(vj, vj+1))

Ok can be found in O(n + K) time [6] by adapting the algorithm for computing the medial axis of a simple polygon in linear time [21].

Let us assume that there is a pair (sl, tl) with sl ∈ bdP and tl ∈ bdQ. We can have two paths from sl to tl, one passing above the hole and the other below it (Fig. 3). When a path between sl and tl is routed, there is only one way to route the rest of the pairs in a non-crossing fashion. Furthermore, this path can be routed in two ways, above or below the hole Q; we denote these paths by πa(sl, tl) and πb(sl, tl), respectively. Thus we can compute πa(sl, tl) and πb(sl, tl), solve the problem separately for each case, and choose the solution with minimum total length. In the following, we concentrate on determining non-crossing paths assuming that πa(sl, tl) has been routed; the other case can be treated in the same way. We first consider πa(sl, tl) for sl–tl, and then we solve the problem for πb(sl, tl) separately.

Figure 3: The paths πb(sl, tl) (left) and πa(sl, tl) (right).


5.1 Algorithm

Data: The simple polygon P with one hole, and K pairs (si, ti), where there is a pair (si, ti) with si ∈ bdP and ti ∈ bdQ and the other terminals lie on the boundary of P.
Result: K thick non-crossing optimal paths.
begin
1. First build Ol, the free space for πl; it is a splinegon, and thus routing πa(sl, tl) inside Ol can be done in linear time [22].
2. Inflate πa(sl, tl) by 2 units.
3. Define P′ = P \ πa(sl, tl), where πa(sl, tl) is the inflated path; P′ is a simple polygon or several connected simple polygons, whose boundary is included in the polygon (Fig. 4).
4. For all i = 1, . . . , K, i ≠ l, we can find the K − 1 thick paths in a simple polygon as in [6] in O(n + K) time. To find each path πk, we must first build Ok and then route πk inside the free space Ok.
end

Similarly, we solve the problem again starting with πb(sl, tl); then we compare the total lengths of the two results and choose the smaller one.

Figure 4 shows a polygon with one hole and a path from sl to tl that passes above the hole. On the right, the path is inflated and a simple polygonal free space is created.

Figure 4: Shortest thick path from sl to tl and the resulting simple polygonal free space.

5.2 Running time

The time complexity of finding the K thick paths is K(n + K). Since we must compute the K thick paths twice, the exact running time is 2K(n + K).

6 Conclusions and open problems

Our result can be extended to the case where there is more than one sink lying on the hole boundary. We leave open the problem of finding K thick non-crossing paths in a polygonal domain in which more than one of the sources and sinks can lie on one of the hole boundaries. We conjecture that our approach can be extended to higher dimensions and to other shapes of the moving objects, as long as the motion is purely translational.

References
[1] J. Erickson and A. Nayyeri, Shortest non-crossing walks in the plane, Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA) (2011), 125–128.
[2] E. M. Arkin, J. S. B. Mitchell, and V. Polishchuk, Maximum thick paths in static and dynamic environments, Comput. Geom. 43(3) (2010), 279–294.
[3] J. S. B. Mitchell, Geometric shortest paths and network optimization, Handbook of Computational Geometry, J. Sack and G. Urrutia, editors, Elsevier Science B.V. North-Holland, Amsterdam, pages 633–701, 2000.
[4] V. Polishchuk, Non-crossing paths and minimum-cost continuous flows in geometric domains, Ph.D. thesis, Stony Brook University, available at http://cs.helsinki.fi/valentin.polishchuk/pages/thesis.pdf, August 2007.
[5] E. Papadopoulou, k-pairs non-crossing shortest paths in a simple polygon, Int. J. Comp. Geom. Appl. 9(6) (1999), 533–552.
[6] J. S. B. Mitchell and V. Polishchuk, Thick non-crossing paths and minimum-cost flows in polygonal domains, 23rd ACM Symposium on Computational Geometry (2007), 56–65.
[7] C. A. Duncan, A. Efrat, S. G. Kobourov, and C. Wenk, Drawing with fat edges, In GD'01: Revised Papers from the 9th International Symposium on Graph Drawing, London, UK, Springer-Verlag (2002), 162–177.
[8] A. Efrat, S. Kobourov, M. Stepp, and C. Wenk, Growing fat graphs, In SCG'02: Proceedings of the Eighteenth Annual Symposium on Computational Geometry, New York, NY, USA, ACM Press (2002), 277–278.
[9] R. Cole and A. Siegel, River routing every which way, but loose, Proc. 25th Annu. IEEE Sympos. Found. Comput. Sci. (1984), 65–73.
[10] S. Gao, M. Jerrum, M. Kaufman, K. Mehlhorn, and W. Rülling, On continuous homotopic one layer routing, SCG'88: Proc. of the Fourth Annual Symposium on Computational Geometry, New York, NY, USA, ACM Press (1988), 392–402.
[11] C. E. Leiserson and F. M. Maley, Algorithms for routing and testing routability of planar VLSI layouts, Proc. 17th Annu. ACM Sympos. Theory Comput. (1985), 69–78.
[12] F. M. Maley, Single-Layer Wire Routing and Compaction, MIT Press, Cambridge, MA (1990).
[13] S. Bespamyatnikh, Computing homotopic shortest paths in the plane, J. Algorithms 49(2) (2003), 284–303.
[14] A. Efrat, S. G. Kobourov, and A. Lubiw, Computing homotopic shortest paths efficiently, In Proceedings of the 10th Annual European Symposium on Algorithms, London, UK, Springer-Verlag (2002), 411–423.
[15] T. Dayan, Rubber-Band Based Topological Router, Ph.D. thesis, UC Santa Cruz, 1997.
[16] C. P. Hsu, General river routing algorithm, Proc. of the Twentieth Design Automation Conference (1983), 578–583.
[17] A. Aggarwal, M. M. Klawe, S. Moran, P. W. Shor, and R. Wilber, Geometric applications of a matrix searching algorithm, In Proc. 2nd Annu. ACM Sympos. Comput. Geom. (1986), 285–292.
[18] Y. Kusakari, H. Suzuki, and T. Nishizeki, Finding a shortest pair of paths on the plane with obstacles and crossing areas, In J. S. et al., editor, Algorithms and Computation (1995), 42–51.
[19] J. Takahashi, H. Suzuki, and T. Nishizeki, Finding shortest non-crossing rectilinear paths in plane regions, ISAAC (1993), 98–107.
[20] J. S. B. Mitchell, Maximum flows in polyhedral domains, J. Comput. Syst. Sci. 40 (1990), 88–123.
[21] F. Chin, J. Snoeyink, and C. A. Wang, Finding the medial axis of a simple polygon in linear time, Discrete Comput. Geom. 21(3) (1999), 405–420.
[22] E. A. Souvaine and D. L. Delage, Convex Hull and Voronoi Diagram of Additively Weighted Points, Proc. 6th Annu. ACM Sympos. Comput. Geom. (1990), 350–359.

A Note on the 3-Sum Problem

Keivan Borna
Faculty of Mathematical Sciences and Computer
Kharazmi University
borna@tmu.ac.ir

Zahra Jalalian
Faculty of Engineering
Kharazmi University
jalalian@tmu.ac.ir

Corresponding Author: P. O. Box 45195-1159, F: (+98) 26 3455-0899, T: (+98) 26 3457-9600

Abstract: The 3-Sum problem for a given set S of integers is to find all three-tuples (a, b, c) for which a + b + c = 0. In computational geometry many other problems, like motion planning, are related to this problem. The complexity of existing algorithms for solving 3-Sum is O(n²) or a quotient of it. The aim of this paper is to provide a linear hash function and present a fast algorithm that finds all suitable three-tuples in one iteration over S. We also improve the performance of our algorithm by using index tables and dividing S into negative and non-negative parts.

Keywords: 3-Sum; Computational Complexity; Linear Hash Function; Motion Planning.

1 Introduction

The 3-Sum problem for a given set S of n integers asks whether there exists a three-tuple of elements from S that sums to zero. A problem P is 3-Sum-hard if every instance of 3-Sum of size n can be solved using a constant number of instances of P, each of O(n) size, and o(n²) additional time. One can think of 3-Sum-hard problems in many interesting situations, including incidence problems, separator problems, covering problems and motion planning. Obviously, by testing all three-tuples this problem can be solved in O(n³) time. Furthermore, if the elements of S are sorted, then we can use Algorithm 1, with O(n²) complexity. It is interesting to mention that Algorithm 1 is essentially the best algorithm known for 3-Sum, and it is believed that the problem cannot be solved in sub-quadratic time; but so far this has been proven only in some very restricted models of computation, such as the linear decision tree model. In fact, Erickson [4, 5] proved an Ω(n²) lower bound in the restricted linear decision tree model. This model is based on the transpose of the transformation presented in [3] that maps each point (a, b) to the line y = ax + b and vice versa. However, the problem remains unsolved in general for other computational models.

Data: A sorted set S of integers
Result: A three-tuple (a, b, c) for which a + b + c = 0.
for i = 1, . . . , n − 2 do
    j = i, k = n − 1;
    while k > j do
        if s_i + s_j + s_k = 0 then
            print s_i, s_j, s_k;
            j = j + 1;
            k = n − 1;
        end
        if s_i + s_j + s_k > 0 then
            k = k − 1;
        else
            j = j + 1;
        end
    end
end
Algorithm 1: An O(n²) algorithm for finding all solutions for the 3-Sum problem
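For concreteness, here is a compact runnable rendering of the quadratic scheme behind Algorithm 1 (ours, not the authors' code); it uses the standard two-pointer variant, advancing j and retracting k over the sorted array, rather than the exact control flow printed above.

def three_sum_all(S):
    # Classic sorted two-pointer scan: O(n^2) time, O(1) extra space.
    S = sorted(S)
    n = len(S)
    out = []
    for i in range(n - 2):
        j, k = i + 1, n - 1          # b is taken to the right of a
        while j < k:
            s = S[i] + S[j] + S[k]
            if s == 0:
                out.append((S[i], S[j], S[k]))
                j += 1
                k -= 1
            elif s > 0:
                k -= 1               # sum too large: shrink c
            else:
                j += 1               # sum too small: grow b
    return out

print(three_sum_all([-25, -10, -7, -3, 2, 4, 8, 10]))
# [(-10, 2, 8), (-7, -3, 10)]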
In [1] the authors presented a subquadratic algorithm for 3-Sum. More precisely, on a standard word RAM with w-bit words, they obtained a running time of O(n²/max{lg²n/(lg lg n)², w/lg²w} + sort(n)). This method is based on using an almost linear map h that was already introduced in [2]. In fact, for a random odd w-bit integer a, the hash function h maps each x to the first s bits of a·x.

In the second section of [1], for a given γ, the authors find suitable a, b for which γ = a + b. In fact, they prove that if γ = a + b then h(γ) − h(a) − h(b) ∈ {0, 1}, and if γ ≠ a + b then h(a) + h(b) − h(γ) ∈ {0, 1, 2} holds only with small probability; i.e., if h(a) + h(b) − h(γ) ∉ {0, 1, 2} then there is no (a, b) for which γ = a + b. Notice that the additions and subtractions here are modulo 2^s, and we have h(a) + h(b) + h(c) − h(a + b + c) ∈ {0, 1, 2}. Furthermore, although multiplication by a is linear, keeping only the first s bits of the product makes h non-linear; the reason for the slack set {0, 1, 2} is to make this map effectively linear. We refer the interested reader to [2, 6] for more information about this hash function, which is known as universal hashing.
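The near-linearity described above is easy to observe empirically. The sketch below (ours, not from [1] or [2]) uses multiplicative hashing, h(x) = the top s bits of a·x mod 2^w for a random odd a, and checks that h(x + y) − h(x) − h(y) always falls in {0, 1} modulo 2^s.

import random

w, s = 32, 8
a = random.randrange(1, 1 << w, 2)      # random odd w-bit multiplier

def h(x):
    # Top s bits of (a * x) mod 2^w.
    return ((a * x) & ((1 << w) - 1)) >> (w - s)

for _ in range(10_000):
    x = random.randrange(1 << (w - 1))
    y = random.randrange(1 << (w - 1))  # x + y still fits in w bits
    diff = (h(x + y) - h(x) - h(y)) % (1 << s)
    assert diff in (0, 1), diff         # the only slack is the carry bit
print("h(x+y) - h(x) - h(y) mod 2^s was always in {0, 1}")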

In this paper we apply a linear hash function h that uses only one subtraction operation. More precisely, for a sorted array S of length n we define h via h(i) = S[i] − S[1]. We then construct a new array R of length S[n − 2] − S[1] + 1 for which the relation R[h(i)] = S[i] between the indices and values of its elements is established. In fact, R represents a set created by h: knowing the value h(i), one can obtain S[i] as S[i] = h(i) + S[1]. Applying our algorithm, with only one iteration over S one can find all three-tuples (a, b, c) for which a + b + c = 0.

The organization of this paper is as follows. In Section 2 our proposed algorithm and its complexity analysis are presented. In Section 3 the performance of our algorithm is improved by using index tables instead of arrays and by dividing S into two specific parts. Finally, Section 4 is devoted to some conclusions and future works.

2 Our Algorithm

In this section we present our proposed algorithm. For the ease of the reader, more details about this algorithm will be given through an example. Let S be a sorted array of integers of length n. We first define a hash function h with h(i) = S[i] − S[1]. Then we construct an array R of length sizeR = S[n − 2] − S[1] + 1, initialized with S[0] − 1. We then allocate the members of S in R via h and the formula R[h(i)] = S[i]. Now let indexa = 0, indexc = n − 1, a = S[indexa], c = S[indexc].

Repeat the following commands while indexa < n − 3:

1. Let indexb = −(a + c) − S[1]. In fact, indexb represents the index of b in array R for which a + b + c = 0.

2. If indexb ≥ 0, indexb < sizeR and R[indexb] ≠ S[0] − 1, then let b = R[indexb]. Now if a + b + c = 0 and b > a and b < c, then we have found a suitable three-tuple. In order to find the other solutions, let indexc = indexc − 1 and do the next repetition of the loop.

3. If −(a + c) < S[0] or −(a + c) > S[n − 1], or a ≥ c, then there is no suitable value of b for which a + b + c = 0, and so a must be changed to larger values in S. Thus, in order to find a solution, let indexa = indexa + 1, a = S[indexa], indexc = n − 1 and c = S[indexc], and do the next repetition of the loop.

4. Else if a < c, then c is too large, and so c must be changed to smaller values in S. That is, let indexc = indexc − 1 and c = S[indexc].

In the following, Algorithm 2 computes all suitable three-tuples for the 3-Sum problem:

Data: A sorted set S of n integers
Result: All (a, b, c) for which a + b + c = 0.
indexa = 0, indexc = n − 1;
a = S[indexa], c = S[indexc];
sizeR = S[n − 2] − S[1] + 1;
while indexa < n − 3 do
    indexb = −(a + c) − S[1];
    if 0 ≤ indexb < sizeR and R[indexb] ≠ S[0] − 1 then
        b = R[indexb];
        if a + b + c = 0 and c > b > a then
            print a, b, c;
        end
        indexc = indexc − 1;
        c = S[indexc];
    end
    if −(a + c) < S[0] or −(a + c) > S[n − 1] or a ≥ c then
        indexa = indexa + 1;
        a = S[indexa];
        indexc = n − 1;
        c = S[indexc];
    end
    else if a < c then
        indexc = indexc − 1;
        c = S[indexc];
    end
end
Algorithm 2: Our algorithm for finding all solutions for the 3-Sum problem
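The following runnable sketch (ours) mirrors the two main ingredients of Algorithm 2: the sentinel-initialized table R with R[h(i)] = S[i], and the O(1) lookup indexb = −(a + c) − S[1]. For brevity it scans the (a, c) candidates with two nested loops instead of the paper's single-pass pointer updates, so it illustrates the table idea rather than the exact traversal.

def three_sum_hashed(S):
    S = sorted(S)
    n = len(S)
    empty = S[0] - 1                  # sentinel: "no element stored here"
    size_r = S[n - 2] - S[1] + 1      # b is always drawn from S[1..n-2]
    R = [empty] * size_r
    for i in range(1, n - 1):
        R[S[i] - S[1]] = S[i]         # R[h(i)] = S[i]
    out = []
    for ia in range(n - 2):           # choose a
        a = S[ia]
        for ic in range(n - 1, ia + 1, -1):   # choose c from the right
            c = S[ic]
            ib = -(a + c) - S[1]      # index where b would live
            if 0 <= ib < size_r and R[ib] != empty:
                b = R[ib]
                if a < b < c:
                    out.append((a, b, c))
    return out

print(three_sum_hashed([-25, -10, -7, -3, 2, 4, 8, 10]))
# [(-10, 2, 8), (-7, -3, 10)]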

As an example, let S be an array with 8 elements: −25, −10, −7, −3, 2, 4, 8, 10. Then R has 19 elements, and the index of each element is computed via S[i] − S[1]. Thus R[0] = −10, R[3] = −7, . . . , R[18] = 8, and the other cells are filled with S[0] − 1 = −26. Hence R = −10, −26, −26, −7, −26, . . . , −26, 8.

Now let a (and c) be the first (and last) element of S; that is, a = S[0] = −25, c = S[n − 1] = 10. Let j = −(a + c) − S[1] = −(−25 + 10) − (−10) = 25, and since j is not in the range of indices of R, for this a we cannot find b, c. But since a + c = −15 < 0, we set i = i + 1 = 1 and a = S[i] = −10. Then the new value of j is j = −(a + c) − S[1] = −(−10 + 10) − (−10) = 10. Since R[10] = −26, no value of b for which a + b + c = 0 is found. On the other hand, since a + c = 0, we should seek a smaller value of c; thus let l = l − 1 = 6, c = S[l] = 8. Since j = −(a + c) − S[1] = −(−10 + 8) − (−10) = 12, b = R[12] = 2 and a + b + c = −10 + 2 + 8 = 0, we have found a suitable three-tuple. Our algorithm reports the other solution as −7 + (−3) + 10 = 0.

One can see that this algorithm finds all the three-tuples (a, b, c) for which a + b + c = 0 in one iteration over the given array. Furthermore, in this example one can note that if in each loop three conditional statements are executed (the middle case), then we obtain 24 statements in total, whereas using Algorithm 1 this amount would be 35. One of the advantages of our algorithm is that collisions in our hash function are impossible, because the elements of the set S, and thus of R, do not repeat.

3 Improving the performance of our Algorithm

Two possible limitations of our algorithm are the use of extra memory and the number of comparisons. In this section we provide solutions to overcome these two limitations.

3.1 Using a data file plus an index table instead of an array

In this subsection we improve the performance of our algorithm when the number of elements of S is very large, using a bitmap. In fact, when the size of the array S is large and it would not fit in the main memory, we can use a file located in the auxiliary memory and an index table which is kept in the main memory. The data in the file are put in blocks, and the index table shows the largest value of the data in each block. In this way we can quickly access the address of the data and check whether there is any b for a + c such that a + b + c = 0.

The bit array R is a word in RAM consisting of m := S[n − 2] − S[1] + 1 bits. We initialize all bits of R with zero. In order to map the elements of S into R, we consider R as a word with at least m bits in RAM. Each bit then indicates a number in S: if the bit is one (zero), we deduce that the number exists (does not exist) in S. For the members of S we use the following formulas:

δ = −S[1], h(S[i]) = S[i] + δ, R[h(S[i])] = 1.

As an example, let S = {−25, −10, −7, −3, 2, 4, 8, 10}. If a = −10, c = 8, then b should satisfy b = −(−10 + 8) = 2. Therefore, using the bitmap, we refer to bit number b + δ = 2 + 10 = 12, and if this bit is one we conclude that b exists in S; hence we have found a suitable three-tuple in S.

3.2 Dividing S into two parts

When the number of elements of R is very large and we cannot store S in the main memory, another useful approach can be applied. As a matter of fact, we can put the array R into a file and use two index tables for choosing the values of a and c. Note that in order to have a + b + c = 0, at least one of a, b, c should be negative. Thus we can divide S into two subsets S1 and S2 for choosing negative and non-negative values, and store them in the main memory. The following algorithm computes all suitable three-tuples for the 3-Sum problem very quickly. Let midS denote the index of the first non-negative element of S. Let a ∈ S1 and c ∈ S2; then, using the relation indexb = −(a + c) − S[1], we can decide whether b exists in the file or not. If −(a + c) < S[1], then we have to move a to the next element of S1, and if −(a + c) > S[n − 1], then there is no suitable b for the current values of a, c and we have to take other elements from S1 and S2. As an example, if S = {−25, −10, −7, −3, 2, 4, 8, 10}, then the two index tables are S1 = {−25, −10, −7, −3} and S2 = {2, 4, 8, 10}. Our algorithm proceeds and reports the following two solutions: −10 + 2 + 8 = 0 = −7 + (−3) + 10. In the following we present the improved version of our algorithm for finding all solutions for the 3-Sum problem. Note that since the sets from which the values of a and c are chosen become smaller, the number of comparisons, and thus the running time of our algorithm, decreases.

Data: A sorted set S of n integers
Result: All (a, b, c) for which a + b + c = 0.
indexa = 0, indexc = n − 1;
a = S[indexa], c = S[indexc];
sizeR = S[n − 2] − S[1] + 1;
while indexa < midS do
    indexb = −(a + c) − S[1];
    if 0 ≤ indexb < sizeR and R[indexb] ≠ S[0] − 1 then
        b = R[indexb];
        if a + b + c = 0 and c > b > a then
            print a, b, c;
        end
        indexc = indexc − 1;
        if indexc < midS then
            indexa = indexa + 1;
            a = S[indexa];
            indexc = n − 1;
        end
        c = S[indexc];
        continue;
    end
    if −(a + c) < S[0] or −(a + c) > S[n − 1] or a ≥ c then
        indexa = indexa + 1;
        a = S[indexa];
        indexc = n − 1;
        c = S[indexc];
    end
    else if a < c then
        indexc = indexc − 1;
        if indexc < midS then
            indexa = indexa + 1;
            a = S[indexa];
            indexc = n − 1;
        end
        c = S[indexc];
    end
end
Algorithm 3: Our improved algorithm for finding all solutions for the 3-Sum problem
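A short sketch of ours combining the two refinements of this section: the bitmap R of Section 3.1 (represented here as a Python big integer, with offset δ = −S[1]) and the split of S into negatives S1 and non-negatives S2 of Section 3.2. The loop structure is simplified relative to Algorithm 3.

def three_sum_split(S):
    S = sorted(S)
    delta = -S[1]
    m = S[-2] - S[1] + 1                  # number of bits in R
    R = 0
    for x in S[1:-1]:
        R |= 1 << (x + delta)             # mark membership of x
    S1 = [x for x in S if x < 0]          # candidates for a
    S2 = [x for x in S if x >= 0]         # candidates for c
    out = []
    for a in S1:
        for c in S2:
            b = -(a + c)                  # the value that would complete 0
            pos = b + delta
            if 0 <= pos < m and (R >> pos) & 1 and a < b < c:
                out.append((a, b, c))
    return out

print(three_sum_split([-25, -10, -7, -3, 2, 4, 8, 10]))
# [(-10, 2, 8), (-7, -3, 10)]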
We conclude our discussions in Sections 2 and 3 in the following theorem.
Theorem 1: Algorithm 3 computes all solutions for the 3-Sum problem faster than Algorithms 1 and 2.
Finally, we compare our Algorithms 2 and 3 with Algorithm 1. We generated 100 sets, all of size 20, of random integers in the range [−50, 50] and counted the number of operations that each algorithm performs. In Figures 1 and 2 (the outputs of a Java program) the leftmost, middle and rightmost histograms count the number of operations performed by Algorithms 1, 2 and 3, respectively.

Figure 1: The histogram of the number of operations that Algorithms 1, 2 and 3 perform.

In Figure 2 we draw the histograms of the average number of operations that each algorithm performs.

Figure 2: The histogram of the average number of operations that Algorithms 1, 2 and 3 perform over 100 tests.

4 Discussion and Future Works

In this paper a fast and optimized algorithm for finding all solutions of the 3-Sum problem is presented. Further work on generalizing this algorithm to rational and complex numbers is in progress.

References
[1] I. Baran, E. D. Demaine, and M. Patrascu, Subquadratic algorithms for 3-SUM, Proc. 9th Worksh. Algorithms & Data Structures, Springer, Berlin/Heidelberg 3668/2005 (2005), 409–421.
[2] M. Dietzfelbinger, Universal hashing and k-wise independent random variables via integer arithmetic without primes, Proc. 13th Symposium on Theoretical Aspects of Computer Science (1996), 569–580.
[3] H. Edelsbrunner, J. O'Rourke, and R. Seidel, Constructing arrangements of lines and hyperplanes with applications, SIAM J. Comput. 15 (1986), 341–363.
[4] J. Erickson, Lower bounds for fundamental geometric problems, PhD thesis, University of California at Berkeley, 1996.
[5] J. Erickson, Lower Bounds for Linear Satisfiability Problems, Chicago Journal of Theoretical Computer Science 8 (1999).
[6] M. N. Wegman and J. L. Carter, New classes and applications of hash functions, Proc. 20th IEEE FOCS (1979), 175–182.

Voronoi Diagrams and Inversion Geometry

Zahra Nilforoushan
Department of Computer Engineering
Kharazmi University, Tehran, Iran
shadi.nilforoushan@gmail.com

Abolghasem Laleh
Department of Mathematics, Faculty of Science
Alzahra University, Tehran, Iran
aglaleh@alzahra.ac.ir

Ali Mohades
Faculty of Mathematics and Computer Science
Amirkabir University of Technology, Tehran, Iran
mohades@aut.ac.ir

Corresponding Author: Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran, T: (+98) 26 34550002
Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran.

Abstract: Voronoi diagrams have proven to be useful structures in various fields and are one of the most fundamental concepts in computational geometry. Although Voronoi diagrams in the plane have been studied extensively, using different notions of sites and metrics, little is known for other geometric spaces. In this paper we are interested in the Voronoi diagram of a set of sites in a given inversion circle. We study various cases which exhibit differences between Voronoi diagrams in Euclidean and inversion geometry. Finally, a special partition of the inversion circle is given, which is proven to be the Voronoi diagram of the inverted points in the inversion circle.

Keywords: Inversion circle; Stereographic Projection; Voronoi diagrams.

1 Introduction

Given a set of sites and a distance function from a point to a site, a Voronoi diagram can be roughly described as the partition of the space into cells that are the locus of points closer to a given site than to any other site. Voronoi diagrams belong to the computational geometers' favorite structures. They arise in nature and have applications in many fields of science [4]. Excellent surveys on the background, construction and applications of Voronoi diagrams can be found in Aurenhammer's survey [2] or the book by Okabe, Boots, Sugihara and Chiu [15]. Naturally, the first type of Voronoi diagram considered was the one for point sites and the Euclidean metric. Subsequent studies considered extended sites such as segments, lines, curved objects, convex objects and semi-algebraic sets, and various distances like L1, L∞, the hyperbolic metric, or any distance defined by a convex polytope as unit ball; see [1, 3, 5, 7–13] for more details.

Consider a circle with center O and radius r. If a point P is not at O, the inverse of P with respect to the circle C is the point P′ lying on ray OP such that (OP)(OP′) = r². The circle is called the circle of inversion, and the point O is the center of inversion (see Figure 1).

Figure 1: The inversion circle.

An inversion effectively turns the circle inside out.


Every point inside goes outside, every point outside goes inside, and all the points on the circle itself stay put. The only thing unaccounted for is the center of the circle; let us say that it goes to a point at infinity [6].
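In coordinates, the defining relation (OP)(OP′) = r² gives P′ = O + r²(P − O)/|P − O|²; the short sketch below (ours, not from the paper) implements and checks this.

def invert(P, O=(0.0, 0.0), r=1.0):
    # Inverse of P with respect to the circle of center O and radius r.
    dx, dy = P[0] - O[0], P[1] - O[1]
    d2 = dx * dx + dy * dy              # |OP|^2; undefined for P = O
    s = r * r / d2
    return (O[0] + s * dx, O[1] + s * dy)

P = (3.0, 4.0)                          # |OP| = 5
Pp = invert(P, r=2.0)                   # expect |OP'| = 4/5
print(Pp)                               # (0.48, 0.64)
print((P[0]**2 + P[1]**2) ** 0.5 * (Pp[0]**2 + Pp[1]**2) ** 0.5)  # 4.0 = r^2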
The key properties of inversions are that circles map to either circles or lines, and that inversion preserves the size of angles. Thus, inversions prove to be most useful whenever we are dealing with circles, and there are many interesting applications of inversion. In particular, there is a surprising connection to the Circle of Apollonius. There are also interesting connections to mechanical linkages, which are devices that convert circular motion to linear motion. Finally, as suggested by the properties of inversion, there is a connection between inversion and isometries of the Poincaré disk. In particular, inversion gives us a way to construct hyperbolic reflections in h-lines.

Voronoi diagrams have nice properties, which motivated us to study whether they are preserved in other spaces. In this paper, we study the case of inversion geometry, especially in a given inversion circle.

2 Some properties of inversion geometry

There are some facts about inversion:
1. The inverse of a circle (not through the center of inversion) is a circle.
2. The inverse of a circle through the center of inversion is a line.
3. The inverse of a line (not through the center of inversion) is a circle through the center of inversion.
4. A circle orthogonal to the circle of inversion is its own inverse.
5. A line through the center of inversion is its own inverse.
6. Angles are preserved under inversion.

Lemma 1: An inversion relative to the unit circle is defined by t : ℂ \ {(0, 0)} → ℂ with t(z) = 1/z̄.
Proof: See [16]. □

Lemma 2: Let A and B be two given regions and let t be the inversion map obtained from Lemma 1; then t(A ∪ B) = t(A) ∪ t(B).
Proof: Note that t is a 1-1 map, and this finishes the proof. □

Definition 1: With respect to the inversion relation (OP)(OP′) = r² of Figure 1, for a given line l : ax + by + c = 0 with c ≠ 0, we define the left and the right side of l as follows. The right side of l is the side whose points have greater distance from the center of inversion than the points of l; we denote this region by lR. The other side of l is the left side of l, and we denote it by lL.

Theorem 1: If the given line l : ax + by + c = 0 with c ≠ 0 does not have any contact with the inversion circle C, then the image of the right side of l under the inversion map t relative to C lies inside the circle t(l).
Proof: Without loss of generality, let C be the unit circle. Thus by Lemma 1, there is some t : ℂ \ {0} → ℂ such that t(z) = 1/z̄. If c < 0, then lR = {(x, y) : ax + by + c > 0} (see Figure 2 for an example).

Figure 2: The figure of Theorem 1.

Note that t(lR) = {(x, y) : x² + y² < −(ax + by)/c}, which is obviously the inside of the circle t(l). If c > 0, then lR = {(x, y) | ax + by + c < 0}, and hence again t(lR) = {(x, y) | x² + y² < −(ax + by)/c}. This completes the proof. □

The proof of Theorem 1 enables us to deduce that lL is mapped to the outside of the circle t(l) in the inversion circle.

Theorem 2: Let the given line l : ax + by + c = 0 with c ≠ 0 intersect the inversion circle C. Then, under the inversion relative to the circle C (if the situation is as in Figure 3), we have the following:
1. The light gray region of lR will be mapped into the region with the same color inside the circle t(l), and vice versa.
2. The dark gray region of lR, which is outside the circle C, will be mapped into the region with the same color inside the circle C, and vice versa.
3. The white region of lL, which is inside the circle C, will be mapped into the region with the same color outside the circle C, and vice versa.
Proof: By using Lemma 2 and Theorem 1 and applying the fact that an inversion turns the inside of a circle out and vice versa, after some computation the proof is obtained. □

Figure 3: The figure of Theorem 2.

3 Mapping Voronoi diagrams into the inversion circle

In this section we study the image of the Voronoi diagram in the given inversion circle under the inversion map. We discuss it more precisely, step by step. As an immediate result of Section 2, one can see that the image of the given Voronoi diagram under the inversion map relative to the inversion circle C is as follows. Let N denote the number of sites.

N = 2. In this case the Voronoi diagram consists of a line l which is the perpendicular bisector of the sites. For mapping this into the given inversion circle, we have to study two cases: the line l passes through the center of inversion or it does not. The case where l is not through the center of inversion was studied in Theorems 1 and 2. Further, if l is through the center of inversion, then, since a line through the center of inversion is its own inverse, and since an inversion turns the circle inside out, we are done.

N = 3. When the number of sites is more than two, we have a Voronoi vertex. Thus we can distinguish the cases where it is inside the inversion circle or outside of it.

a. The Voronoi vertex is outside of the inversion circle and none of the Voronoi edges intersect the inversion circle.
In this case, for any given Voronoi diagram with the above property and a given inversion circle, there is a point in the inversion circle that corresponds to the Voronoi vertex. Thus, mapping the sites, edges and regions of the Voronoi diagram into the inversion circle, we will find a region in the inversion circle that contains the two other corresponding regions. It is interesting to know that the mentioned region is the image of the Voronoi region which contains the inversion circle (see Figure 4).

Figure 4: Voronoi vertex is outside of C and none of the Voronoi edges intersect C.

b. The Voronoi vertex is outside of the inversion circle and there is a Voronoi edge that intersects the inversion circle.
In this case, since Voronoi edges intersect the inversion circle, there is a point in the inversion circle corresponding to the Voronoi vertex, and the image of the Voronoi diagram will divide the inversion circle into three distinct regions (see Figure 5).

Figure 5: Voronoi vertex is outside of C and there is a Voronoi edge intersecting C.

c. The Voronoi vertex is inside of the inversion circle.
It is clear that in this case the point corresponding to the Voronoi vertex is not inside the inversion circle. That is, the image of the Voronoi diagram under the inversion relative to the inversion circle will be three curves inside the inversion circle which meet at infinity (the center of inversion). See Figure 6.

Figure 6: Voronoi vertex is inside C.

The number of sites is more than three. For this case we can generalize the results mentioned above. Each Voronoi vertex which is inside the inversion circle has no corresponding point in the inversion circle. Therefore, the image of the Voronoi diagram inside the inversion circle will miss the images of the Voronoi vertices which are inside the inversion circle.

4 Special partition

In this section we focus on the image of the given 2-dimensional Voronoi diagram inside the given inversion circle. We first briefly explain stereographic projection.

4.1 Stereographic Projection

Let Σ be the sphere centered at the origin of ℂ and with unit radius; that is, its equator coincides with the unit circle. We now seek to set up a correspondence between points on Σ and points in ℂ (see Figure 7).

Figure 7: Stereographic Projection.

From the north pole N of the sphere Σ, draw the line through the point p in ℂ; the stereographic image of p on Σ is the point p̂ where this line intersects Σ. Since this gives us a one-to-one correspondence between points in ℂ and points on Σ, let us also say that p is the stereographic image of p̂. No confusion should arise from this, the context making it clear whether we are mapping ℂ to Σ, or vice versa.

4.2 Stereographic Formulae

In this subsection we recall explicit formulae connecting the coordinates of a point z in ℂ and its stereographic projection ẑ on Σ. These formulae are useful in investigating non-Euclidean geometry.

To begin with, let z = x + iy and let (X, Y, Z) be the Cartesian coordinates of ẑ on Σ. Here the X and Y axes are chosen to coincide with the x and y axes of ℂ, so that the positive Z-axis passes through N. To make yourself comfortable with these coordinates: the equation of Σ is X² + Y² + Z² = 1, the coordinates of N are (0, 0, 1), and similarly S = (0, 0, −1), 1 = (1, 0, 0), i = (0, 1, 0), etc.

Theorem 3: Let π : Σ → ℂ be the stereographic map (with the above-mentioned properties); then for any point (X, Y, Z) on Σ:

π(X, Y, Z) = X/(1 − Z) + i·Y/(1 − Z),

and the inverse formula for a given point z = x + iy in ℂ is:

π⁻¹(x + iy) = (2x/(x² + y² + 1), 2y/(x² + y² + 1), (x² + y² − 1)/(x² + y² + 1)).

Proof: See [14]. □

Now change the direction in Figure 7, and let S be the north pole of the Riemann sphere; then we deduce the following:

Let u : Σ → ℂ be the stereographic map under the assumption that S is the north pole; then for a given point (X, Y, Z) on Σ:

u(X, Y, Z) = X/(1 + Z) + i·Y/(1 + Z)

and

u⁻¹(x + iy) = (2x/(x² + y² + 1), 2y/(x² + y² + 1), (1 − x² − y²)/(x² + y² + 1)).

Hence in this case we have the following:
(i) The interior of the unit circle is mapped to the southern hemisphere of Σ. In particular, 0 is mapped to N (the south pole in this reversed orientation).
(ii) Each point on the unit circle is mapped to itself.
(iii) The exterior of the unit circle is mapped to the northern hemisphere of Σ, except that S is the stereographic image of ∞.

4.3 Main results

By combining Theorem 3 and Corollary ??, the following interesting theorem is derived:
Theorem 4: Let P be a given point in ℂ and denote P′ = u(π⁻¹(P)); then P′ is the inverse of P with respect to the unit circle.
Proof: Let P(x, y) be a given point in ℂ. By using Theorem 3,

π⁻¹(x + iy) = (2x/(x² + y² + 1), 2y/(x² + y² + 1), (x² + y² − 1)/(x² + y² + 1)),

and according to Corollary ??,

u(2x/(x² + y² + 1), 2y/(x² + y² + 1), (x² + y² − 1)/(x² + y² + 1)) = (x/(x² + y²), y/(x² + y²)).

Thus P′ = (x/(x² + y²), y/(x² + y²)). Therefore it follows that (OP)(OP′) = 1, and the proof is done. □
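Theorem 4 can be checked numerically straight from the two stereographic formulas; the sketch below (illustrative only) composes π⁻¹ with u and confirms that the product |OP|·|OP′| equals 1.

def pi_inv(x, y):
    # North-pole stereographic lift of x + iy onto the unit sphere.
    d = x * x + y * y + 1.0
    return (2 * x / d, 2 * y / d, (x * x + y * y - 1.0) / d)

def u(X, Y, Z):
    # South-pole stereographic projection back to the plane.
    return (X / (1.0 + Z), Y / (1.0 + Z))

x, y = 3.0, 4.0                          # |OP| = 5
xp, yp = u(*pi_inv(x, y))
print(xp, yp)                            # (0.12, 0.16), i.e. P / |P|^2
print((x**2 + y**2) ** 0.5 * (xp**2 + yp**2) ** 0.5)  # 1.0 = |OP| * |OP'|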

For a given set of sites in ℂ with the Euclidean distance function from a point to a site, a Voronoi diagram can be described as the partition of ℂ into cells that are the locus of points closer to a given site than to any other site. The boundaries of these cells are called Voronoi edges, and each Voronoi edge corresponds to two sites. On the other hand, each pair of sites is symmetric with respect to the corresponding Voronoi edge. As mentioned earlier, the Riemann sphere is a model for the extended complex plane, so it is obvious that the stereographic projection in any case preserves symmetry. Therefore, for a given Voronoi diagram in ℂ and an inversion circle C in ℂ, it is sufficient to consider a Riemann sphere whose equator coincides with the inversion circle C. Now the stereographic projection and Theorem 4 give the following theorem.

Theorem 5: For a given set of points as sites in ℂ, the image of the Voronoi diagram of the mentioned sites inside the given inversion circle C is a partition of C which preserves symmetry. That is, the images of each pair of sites inside C are symmetric with respect to the image of the corresponding Voronoi edge.

Therefore, according to Theorem 5 and Lemma 4.1 of [2], we obtain the main result of this paper as follows.

Theorem 6: For a given set of points as sites in ℂ, the image of the Voronoi diagram of the mentioned sites inside the given inversion circle C is the Voronoi diagram of the inverted point sites in C. That is, the inversion of any Voronoi diagram in ℂ relative to the given circle C gives a Voronoi diagram in C.

References
[1] H. Alt and O. Schwarzkopf, The Voronoi diagram of curved objects, Proc. 11th Annu. ACM Sympos. Comput. Geom. (1995), 89–97.

[2] F. Aurenhammer and R. Klein, Voronoi Diagrams, Handbook of Computational Geometry, J. Sack and G. Urrutia, editors, Elsevier Science Publishers, B.V. North-Holland, Chapter 5, pages 201–290, 2000.
[3] L. P. Chew and L. Drysdale, Voronoi diagrams based on convex distance functions, Proc. 1st Ann. Symp. Comp. Geom. (1985), 235–244.
[4] S. Drysdale, Voronoi Diagrams: Applications from Archaeology to Zoology, Regional Geometry Institute, Smith College, July 19 (1993).
[5] A. Francois, Voronoi diagrams of semi-algebraic sets, Ph.D. thesis, Department of Computer Science, The University of British Columbia, January 2004.
[6] M. J. Greenberg, Euclidean and Non-Euclidean Geometries, 2nd ed., W. H. Freeman & Co., 1988.
[7] M. Karavelas, 2D Segment Voronoi Diagrams, CGAL User and Reference Manual: All parts, Chapter 43, 20 December 2004.
[8] M. I. Karavelas and M. Yvinec, The Voronoi Diagram of Planar Convex Objects, 11th European Symposium on Algorithms (ESA 2003), LNCS 2832 (2003), 337–348.
[9] D.-S. Kim, D. Kim, and K. Sugihara, Voronoi diagram of a circle set from Voronoi diagram of a point set: II. Geometry, Computer Aided Geometric Design 18 (2001), 563–585.
[10] V. Koltun and M. Sharir, Polyhedral Voronoi diagrams of polyhedra in three dimensions, In Proc. 18th Annu. ACM Sympos. Comput. Geom. (2002), 227–236.
[11] V. Koltun and M. Sharir, Three-dimensional Euclidean Voronoi diagrams of lines with a fixed number of orientations, In Proc. 18th Annu. ACM Sympos. Comput. Geom. (2002), 217–226.
[12] D. T. Lee, Two-dimensional Voronoi diagrams in the Lp metric, J. ACM 27(4) (1980), 604–618.
[13] Z. Nilforoushan and A. Mohades, Hyperbolic Voronoi Diagram, ICCSA 2006, LNCS 3984 (2006), 735–742.
[14] T. Needham, Visual Complex Analysis, Oxford University Press Inc., New York, 1998.
[15] A. Okabe, B. Boots, K. Sugihara, and N. Chiu, Spatial tessellations: concepts and applications of Voronoi diagrams, 2nd edition, John Wiley & Sons Ltd., Chichester, 2000.
[16] The Open University, Mathematics: Unit 25, Geometry VI, the Kleinian view, The Open University Press, prepared by the Course Team, 1984.

Selection of Effective Factors in Estimating Customers' Response to Mobile Advertising by Using AHP

Mehdi Seyyed Hamzeh
Pnu University
Department of Computer Engineering and Information Technology
mehdi_seidhamze@yahoo.com

Bahram Sadeghi Bigham
Institute for Advanced Studies in Basic Sciences
Department of Computer Science and Information Technology
b_sadeghi_b@iasbs.ac.ir

Reza Askari Moghadam
Pnu University
Department of Computer Engineering and Information Technology
askari@pnu.ac.ir

Corresponding Author, T: (+98) 911 325-8525

Abstract: This paper presents an application of the analytic hierarchy process (AHP) to the selection of effective factors in estimating customers' response to mobile advertising, and then investigates the most successful factors for one form of mobile communication: short message services (SMS). This method adopts a multi-criteria approach that can be used for the analysis and comparison of mobile advertising. Four criteria were used for evaluating mobile advertising: information services, entertainment, coupons, and location-based services. For each, a matrix of pairwise comparisons between influencing factors was evaluated. Finally, the aim of this investigation is to gain a better understanding of how companies use mobile advertising in doing business.

Keywords: Mobile Advertising; E-Advertising; Personalization; Analytic Hierarchy Process; Short Message Services (SMS); Successful Factors
1 Introduction

Recently, great attention has been directed by scholars and experts toward the efficacy of mobile advertisement on cellular phones. This is due to the peculiarity of the cellular phone, which makes it different from other media; for instance, we can mention the personalization of advertisement.

Online advertising (ad) is a form of promotion that uses the Internet and World Wide Web for the express purpose of delivering marketing messages to attract customers [3].

With the growth and progress of electronic business, especially on cellular phones, mobile advertisement seems to be successful when the elements which affect customers' attitudes in electronic and wireless situations are well understood and the necessary actions are taken. Several elements affect customers' attitudes; we can mention personal values and inner beliefs, customers' characteristics, technological and media elements, and even the strategies which companies take up, and finally put them into the analysis.

2 Literature review

2.1 Mobile advertising

Short message services (SMS) have become a new technological buzzword in transmitting business-to-customer messages to such wireless devices as cellular telephones, pagers, and personal data assistants. Many brands and media companies include text message numbers in their advertisements to enable interested consumers to obtain more information [4]. Mobile marketing uses interactive wireless media to deliver personalized time- and location-sensitive information promoting goods, services, and ideas, thereby generating value for all stakeholders [2]. Studying interactive mobile services such as SMS and MMS suggests drawing upon theories in marketing, consumer behavior, psychology and adoption to investigate their organizational and personal use [4].

Mobile advertising is predicted to be an important source of revenue for mobile operators in the future [9] and has been identified as one of the most promising potential business areas. For instance, in comparison with much advertising in traditional media, mobile advertisements can be customized to better suit a consumer's needs and improve client relationships [1]. Examples of mobile advertising methods include mobile banners, alerts, and proximity-triggered advertisements [6].

2.2 The AHP

The AHP is one of the most extensively used multi-criteria decision making (MCDM) methods. The AHP has been applied to a wide variety of decisions, including car purchasing, IS project selection [8], and IS success [5]. The AHP is aimed at integrating different measures into a single overall score for ranking decision alternatives. Its main characteristic is that it is based on pairwise comparison judgments.

In this paper, we discuss the relationship between the factors that make marketing companies successful and that influence the way the consumer reacts to mobile advertising.

3 Alternatives

3.1 Personalization

Marketers can personalize text messages based on the consumer's local time, location, and preferences; for example, directions to the nearest vegetarian restaurant open at the time of request. A person may use a mobile device to receive information but also for personal entertainment; that is, if something (mobile SMS/advertising) disturbs his personal entertainment, he will never want to disclose his personal information [10].

3.2 Credibility

The credibility of mobile applications is what makes their use frequent. If the user experiences a problem during a transaction or mobile advertising, it is certain that he will not use the mobile application again [7].

3.3 Consumer permission

Corporate advertising often serves as the primary point of contact, asking consumers for permission to receive SMS, and according to all the experts, advertisers should have permission and convince consumers to opt in before sending advertisements [1].

3.4 Consumer control

There is a trade-off between personalization and consumer control. Gathering the data required for tailoring messages raises privacy concerns. Corporate policies must consider legalities such as electronic signatures, electronic contracts, and conditions for sending SMS messages [1].

Figure 1: Hierarchical model for selection of effective factors

4 A case study using AHP

4.1 Applying the AHP method

4.1.1 Breaking down the problem

The first step was to develop a hierarchical structure of the problem. This classifies the goal and all decision criteria and variables into three major levels, as depicted in Figure 1. The highest level of the hierarchy is the overall goal: to select the best influence factors for mobile advertisement. Level 2 represents the criteria: the services that companies offer to their customers. Level 3 contains the decision alternatives: in mobile ads we consider the types of proposed solutions which are important to users.

4.1.2 Comparative judgments to establish priorities

After calculating the weights of the effective elements in mobile ads in relation to the designated criteria, we should determine the weights of the criteria; in other words, the share of each criterion in determining the best effective element must be identified. To do this we need to compare the criteria in pairs. For example, in order to determine the relative importance of the four major criteria, a 4 × 4 matrix was formed. Expert Choice provided ratings to facilitate comparison; these then needed to be incorporated into the decision making process. After inputting the criteria and their importance into Expert Choice, the priorities from each set of judgments were found.

4.2 Discussion of results

In the previous section the weights of the criteria with regard to the purpose, and also the weights of the alternatives with regard to the criteria, were determined. Now, the way the relative weights are combined to evaluate the final weights, in order to choose and prioritize the best elements, is explained in Fig. 2. Since the rate of compatibility is less than 0.1, it can be concluded that the group decision has an acceptable compatibility. Therefore, the obtained results place personalizing messages in the first rank with a final weight of 0.484, and message credit, with a weight of 0.310, in the second rank; sending messages at an appointed time, with a weight of 0.118, is in the third rank, and the right of message reception on the part of the user, with a weight of 0.089, is in the final rank.

Figure 2: Synthesis for selecting the best influence factors
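For readers who want to reproduce this kind of synthesis without Expert Choice, the sketch below shows the generic AHP computation: priority weights from the principal eigenvector of a pairwise-comparison matrix, plus the consistency check. It assumes NumPy, and the judgment matrix shown is hypothetical, not the paper's data.

import numpy as np

A = np.array([[1.0, 3.0, 5.0, 7.0],      # hypothetical 4x4 judgments
              [1/3, 1.0, 3.0, 5.0],
              [1/5, 1/3, 1.0, 3.0],
              [1/7, 1/5, 1/3, 1.0]])

eigvals, eigvecs = np.linalg.eig(A)
k = np.argmax(eigvals.real)
w = np.abs(eigvecs[:, k].real)
w /= w.sum()                             # normalized priority vector

n = A.shape[0]
lam = eigvals.real[k]                    # principal eigenvalue
CI = (lam - n) / (n - 1)                 # consistency index
RI = 0.9                                 # Saaty's random index for n = 4
print("weights:", np.round(w, 3))
print("CR:", round(CI / RI, 3))          # acceptable when CR < 0.1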

4.2 Discussion of results

In the previous section, the weights of the criteria with regard to the goal, and the weights of the alternatives with regard to the criteria, were determined. Now the way the relative weights are combined to evaluate the final weights, in order to choose and prioritize the best elements, is shown in Figure 2. Since the consistency ratio is less than 0.1, it can be concluded that the group decision has an acceptable consistency. The obtained results place personalized messages in the first rank with a final weight of 0.484, message credibility in the second rank with a weight of 0.310, sending messages at an appointed time in the third rank with a weight of 0.118, and the user's right to control message reception in the final rank with a weight of 0.089.


5 Conclusion

Online advertising is a new service in the marketing industry. An AHP-based methodology was designed and applied, and it has proven its potential for helping decision-makers specify and prioritize the elements that affect mobile business processes for cellular phone users. Attention was paid to finding the relation between user attitude and the effective elements of mobile advertisement on cellular phones. Finally, we showed the operational process of hierarchical analysis applied to mobile advertisement.


Figure 2: Synthesis for selecting the best influence factors

References

[1] A. Scharl, A. Dickinger, and J. Murphy, Diffusion and success factors of mobile marketing, Electronic Commerce Research and Applications 4 (2005), 159-217.
[2] A. P. Dickinger, A. Haghirian, A. Scharl, and J. Murphy, A conceptual model and investigation of SMS marketing, Thirty-Seventh Hawaii International Conference on System Sciences (HICSS-37), Hawaii, U.S.A. (2004).
[3] C. Kim, K. Kwon, and W. Chang, How to measure the effectiveness of online advertising in online marketplaces, Expert Systems with Applications 38 (2011), 4234-4243.
[4] D. J. Xu, S. S. Liao, and Q. Li, Combining empirical experimentation and modeling techniques: A design research approach for personalized mobile advertising applications, Decision Support Systems 44 (2008), 710-724.
[5] E. W. T. Ngai, Selection of web sites for online advertising using the AHP, Information & Management 40 (2003), 233-242.
[6] G. M. Giaglis, P. Kourouthanassis, and A. Tsamakos, Towards a classification framework for mobile location services, in: B. E. Mennecke and T. J. Strader (Eds.), Mobile Commerce: Technology, Theory, and Applications, Idea Group Publishing (2003).
[7] G. Büyüközkan, Determining the mobile commerce user requirements using an analytic approach, Computer Standards & Interfaces 31 (2009), 144-152.
[8] M. J. Schniederjans and R. L. Wilson, Using the analytic hierarchy process and goal programming for information system project selection, Information & Management 20 (1991), 333-342.
[9] DeZoysa and E. Mizutani, Mobile advertising needs to get personal, Telecommunications International 36 (2002), no. 2.
[10] J. Tähtinen and B. V. S. Ram, Mobile advertising or mobile marketing: a need for a new concept?, Conference Proceedings of eBRF, 152-164.


An Obstacle Avoiding Approach for Solving Steiner Tree Problem on Urban Transportation Network

Ali Nourollah
Department of Electrical and Computer Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
Department of Electrical and Computer Engineering, Shahid Rajaee Teacher Training University, Tehran, Iran

Fatemeh Ghadimi
Department of Electrical and Computer Engineering, Qazvin Branch, Islamic Azad University, Qazvin, Iran
F Ghadimi@qiau.ac.ir

Abstract: The Steiner Tree Problem in a graph, one of the most well-known optimization problems, asks for a minimum tree connecting a given set of terminal nodes. This problem has various applications, one of which is routing in the urban transportation network. In these networks there are obstacles that the Steiner tree must avoid. Moreover, as the problem is NP-Complete, the time complexity of solving it is very important for making it usable in large networks. In this article, an obstacle-avoiding approach is proposed that can find a near-optimum answer to the Steiner tree problem in polynomial time. This approach achieves good rates in comparison with other methods, and it can find a near-optimum feasible tree even when there are obstacles in the network.

Keywords: Steiner Tree on the Graph; Urban Transportation Network; Free-Form Obstacles; Heuristic Algorithms.

Introduction

The Steiner Tree Problem (STP) has several definitions, but in this article it is considered on a graph. The STP on a graph has many practical applications, such as global routing and wire-length estimation in VLSI design, civil engineering and routing on urban networks, and multicasting in computer networks.

This article focuses on routing in the urban transportation network, so in computing the Steiner tree the suggested approach should avoid the obstacles that may exist in this network.

The urban transportation network is assumed to be an undirected, weighted graph. The nodes of this graph are intersections, the edges are roads, and the weights are traffic volumes. In this graph there can be some polygons that act as obstacles, like the Tehran restricted traffic area. The Steiner Tree Problem with obstacle avoidance is defined as follows: find the shortest subgraph interconnecting all given terminal nodes that does not cross obstacles. It has been proven that this subgraph is necessarily a tree if the given graph has no negative weights [1]. The only difference between the STP and the Minimum Spanning Tree (MST) problem is the ability to use some extra nodes, called Steiner nodes, in order to reduce the path cost. This difference has made the STP an NP-Complete problem.

In 1972, the STP, even on a graph, was proven to be NP-Complete [2], so there is no polynomial-time solution for it that can find the optimum answer. Thus, there is a need for heuristic and approximate approaches instead of exact algorithms. Some of these approaches are as follows: MST-based algorithms, like the algorithms of Takahashi et al. [3] and Wong et al. [4], which build the Steiner tree by adding one edge at a time until all terminals are connected; node-based local search algorithms, like that of Dolagh et al. [5], which find the Steiner tree using local search and the identification of proper neighbors; and Greedy Randomized Adaptive Search algorithms [6], which have three phases: a construction phase, a local search phase and, if necessary, an updating phase. None of these approaches finds the optimum answer of the STP, but they find near-optimum answers in polynomial time.

In this article, we present a new heuristic approach that can find near-optimum Steiner trees in polynomial time while avoiding obstacles. This article is organized as follows: in the next section, some definitions and notations are reviewed; in Section 3, our new algorithm is explained; in Section 4, the experimental results are presented; and finally, Section 5 contains the conclusions.

Definitions and Notations

Problem Definition: The STP asks for the minimum-cost subgraph of G spanning T, which may use Steiner nodes. This tree must avoid any obstacles.

Graph Definition: The undirected, weighted graph G = (V, E, W) consists of a set of vertices V, each of which has coordinates, and a set of edges E, where each edge is undirected and connects two vertices with a nonnegative weight W. The terminal nodes T are a subset of the graph vertices (T ⊆ V) and all the other vertices are Steiner nodes (S = V \ T). The number of vertices in V is n, the number of edges in E is m, and the number of terminal nodes in T is r.

Obstacle Definition: On the graph G there can be free-form polygons as obstacles O = (OV, OE), each of which has a set of nodes OV and a set of edges OE. The Steiner tree must avoid these obstacles and use only the nodes and edges that are not in these restricted areas. There is no limitation on the shape of the polygons or on their number, but no terminal node may lie in these areas, and no terminal may become isolated; that is, from each terminal there must be some edges to the other terminals.

Dijkstra's Algorithm Definition: This algorithm is used for finding the tree of shortest paths from one node to the other nodes of a graph. By using a Fibonacci heap to implement Dijkstra's algorithm, it has O(m + n log n) time complexity [7].

Our Proposed Approach

The algorithm suggested in this article is called Obstacle Avoiding Steiner Tree on Urban transportation Network (OASTUN). It finds a Steiner tree on an undirected, weighted graph while avoiding obstacles.

The inputs of this algorithm are the graph G and the set T, under the assumption that there is no isolated terminal in G, even when the obstacles are considered. The algorithm consists of four phases: the obstacle avoiding phase, the preprocessing phase, the first phase and the second phase. The outputs of the algorithm are the Steiner tree and its cost. The time complexity of the OASTUN algorithm in the worst case is O(n(m + n log n)), and it finds near-optimum answers while avoiding obstacles.

3.1 Obstacle Avoiding Phase

In this phase of the algorithm, the given graph must be cleared of the nodes and edges that lie inside the obstacle polygons. The inputs of this phase are the graph G and the obstacles O, and its output is the refined graph G′. The related pseudo code is Algorithm 1.

Algorithm OASTUN // Obstacle Avoiding Phase

Input. O = (OV, OE), G = (V, E, W)
Output. G′ = (V′, E′, W)
1  for each v ∈ V do
2    if Inside(v, O) is true then
3      Remove v from V and its edges from E;
4    end if
5  end for
6  for each e ∈ E do
7    for each oe ∈ OE do
8      if Intersect(e, oe) is true then
9        Remove e from E;
10       break this inner loop;
11     end if
12   end for
13 end for

Algorithm 1: Pseudo code of the Obstacle Avoiding Phase
Definition 1: Inside(x, A) is a procedure whose output is true if the node x is inside the polygon A, and false otherwise. The algorithm of this procedure first draws a line from a point outside the polygon A to the node x. If this line intersects the edges of the polygon A an odd number of times, the node x is inside the polygon.

Definition 2: Intersect(A, B) is a procedure whose output is true if the edge A intersects the edge B, and false otherwise. The algorithm of this procedure is explained in [8].
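Definition 1 describes the classic ray-casting (odd-even) point-in-polygon test. The following Python sketch is one minimal way to implement it, under the assumption that the polygon is given as a list of vertex coordinates; it is an illustration, not the authors' implementation.

    def inside(x, polygon):
        # Ray casting: point x = (px, py), polygon as a list of (x, y)
        # vertices. Cast a horizontal ray to the right and count edge
        # crossings; an odd count means the point lies inside.
        px, py = x
        crossings = 0
        n = len(polygon)
        for i in range(n):
            (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
            # Does this edge straddle the horizontal line through py?
            if (y1 > py) != (y2 > py):
                # x-coordinate where the edge crosses that horizontal line
                x_cross = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
                if x_cross > px:
                    crossings += 1
        return crossings % 2 == 1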
3.2 Preprocessing Phase

In this phase, in order to reduce the number of nodes and edges in the graph G′, those that are not necessary for the Steiner tree computation must be omitted. Therefore, the Steiner nodes that have fewer than two edges (Deg < 2), together with their connected edges, are omitted. The resulting graph of this phase is called G″, and the related pseudo code is Algorithm 2.
Algorithm OASTUN // Preprocessing Phase

Input. T, G′ = (V′, E′, W)    // S = V′ \ T
Output. G″ = (V″, E″, W)
1  for each s ∈ S do
2    if Deg(s) < 2 then
3      Remove s from V′ and its edges from E′;
4    end if
5  end for

Algorithm 2: Pseudo code of the Preprocessing Phase

3.3 First Phase

In this phase of the algorithm, the computation of the Steiner tree is started. The shortest path between each terminal node and the nearest other terminal is computed. The inputs of this phase are the graph G″ and the set T, and the outputs are the set of edges P and the set of nodes N of the obtained paths. Algorithm 3 is the pseudo code of this phase.

Definition: ShrtTree(x, A) is a procedure whose output is a set of shortest paths from the node x to each node of the set A. These paths are obtained by computing the shortest-path tree that is rooted at x and whose leaves are the nodes of the set A. This procedure uses Dijkstra's algorithm to build the tree, so if there is more than one path with the same weight between two nodes, one of them is chosen.
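Since ShrtTree is built on Dijkstra's algorithm, a minimal sketch may be useful. The Python code below uses a binary heap (heapq), which gives O(m log n) rather than the O(m + n log n) of the Fibonacci-heap variant cited above; the graph representation and function name are assumptions for illustration, not the authors' code.

    import heapq

    def shrt_tree(graph, x, targets):
        # Sketch of ShrtTree(x, A): Dijkstra from x over
        # graph = {node: [(neighbor, weight), ...]}. Returns, for each
        # reachable node in `targets`, its path cost and the path itself.
        dist, parent = {x: 0}, {x: None}
        heap = [(0, x)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist.get(u, float("inf")):
                continue                      # stale heap entry
            for v, w in graph.get(u, []):
                if d + w < dist.get(v, float("inf")):
                    dist[v], parent[v] = d + w, u
                    heapq.heappush(heap, (d + w, v))
        paths = {}
        for t in targets:
            if t in dist:
                node, path = t, []
                while node is not None:       # walk back to the root x
                    path.append(node)
                    node = parent[node]
                paths[t] = (dist[t], path[::-1])
        return paths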

Algorithm OASTUN // First Phase

Input. T, G″ = (V″, E″, W)
Output. P, N
Initialization: P = ∅, N = ∅, D = ∅, J = 3.
// P is a set of edges; N is a set of nodes; D is an array of size r; J is a counter.
1  for each ti ∈ T do
2    D[i] ← Min{ShrtTree(ti, T)};
3    P ← P + D[i].edges;
4    N ← N + D[i].nodes;
5  end for
6  repeat
7    flag ← true;
8    J ← J − 1;
9    for each ti ∈ T do
10     Temp ← Min{ShrtTree(ti, N)};
11     if Temp < D[i] and Temp shares no edge with D[i] then
12       P ← P \ D[i].edges;
13       N ← N \ D[i].nodes;
14       D[i] ← Temp;
15       P ← P + D[i].edges;
16       N ← N + D[i].nodes;
17       flag ← false;
18     end if
19   end for
20 until flag = true or J = 0.
21 Remove all repeated edges in P;
22 Remove all repeated nodes in N;

Algorithm 3: Pseudo code of the First Phase

The first loop (lines 1-5) of this pseudo code is executed for each terminal ti; it obtains a shortest-path tree from ti to the other terminals by using Dijkstra's algorithm. Afterward, among all the paths of this tree, the shortest path from ti to another terminal is selected and added to the ith cell of the array D, and its edges and nodes are added to P and N.

The second loop (lines 6-20) is repeated J times or until no change occurs in P. This loop is exactly like the previous one, but it obtains a shortest-path tree from ti to the other nodes in N. If the weight of this shortest path is less than the weight of the previous path for ti in D, and it shares no edge with the previous path, its edges and nodes are exchanged with the previous ones in P and N. The value of J has been set to three according to the experimental results, and this is sufficient. At the end of this phase, repeated nodes in N and repeated edges in P are removed (lines 21 and 22).

3.4 Second Phase

In this phase, after examination of the connectivity status of the terminals, any isolated trees that exist should be connected. For this purpose, all the edges in P that are connected together are put into the same groups. Afterward, if the number of groups is greater than one, three loops are executed. Algorithm 4 is the pseudo code of this phase.

Algorithm OASTUN // Second Phase

Input. T, G″ = (V″, E″, W), P, N
Output. Steiner tree path, Steiner tree cost
Initialization: C = ∅, H = ∅.
// C is a set of found paths; H is a set of selected Steiner nodes.
1  Put all edges of P which are connected together in the same groups;
2  if groups number > 1 then
3    for each n ∈ N do
4      A ← all nodes in N with different groups from n;
5      C ← C + ShrtTree(n, A);
6    end for
7    while groups number > 1 do
8      Temp ← Min{C} which is not added yet;
9      if Temp connects two groups then
10       P ← P + Temp.edges;
11       N ← N + Temp.nodes;
12       H ← H + Temp.Steiner nodes;
13       Update groups number;
14     end if
15   end while
16   for each h ∈ H do
17     if h has a shorter path to any n ∈ N then
18       if this shorter path has the conditions then
19         Replace it with the previous one and update P and N;
20       end if
21     end if
22   end for
23   Delete all repeated edges in P;
24   Delete all repeated nodes in N;
25 end if
26 for each s ∈ N do
27   if Deg(s) < 2 then
28     Remove s from N and its edges from P;
29   end if
30 end for
31 Compute the summation of the costs of all edges in P.

Algorithm 4: Pseudo code of the Second Phase

In the first loop (lines 3-6), the shortest paths from each node in N to the other nodes of N that are not in the same group are computed, and the resulting paths are added to C.

In the second loop (lines 7-15), until all the separated trees are joined together, the path with the lowest cost that connects two trees is selected from C. The edges and nodes of the selected path are added to P and N respectively, and if there is any Steiner node in this path it is added to the set H. The connectivity status of the groups and the number of isolated groups are then updated.

In the third loop (lines 16-22), if there are any Steiner nodes in the set H, the shortest path is computed for each of them. If this path has a lower cost than the previous one and also satisfies the connection conditions, it replaces the previous path and the related edges and nodes in P and N are exchanged. A path that satisfies the connection conditions does not create a cycle and does not cause terminals to become isolated.

At the end of this phase (lines 23-30), repeated edges of P and repeated nodes of N are omitted. Moreover, Steiner nodes with degree less than 2 are omitted from N, and their edges from P. Finally, the edges in the set P are the edges of the Steiner tree, and the summation of their weights is the cost of the Steiner tree.
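Line 1 of Algorithm 4 (grouping the connected edges of P) can be realized with a disjoint-set (union-find) structure. The following Python sketch illustrates that step under the assumption that P is a list of (u, v) edges; it is an illustration of the idea, not the authors' code.

    def group_edges(P):
        # Put the edges of P that are connected together into the same
        # groups, using union-find on the edges' endpoints. Returns a
        # dict mapping a group representative to its edges.
        parent = {}

        def find(a):
            parent.setdefault(a, a)
            while parent[a] != a:
                parent[a] = parent[parent[a]]   # path halving
                a = parent[a]
            return a

        def union(a, b):
            parent[find(a)] = find(b)

        for u, v in P:
            union(u, v)

        groups = {}
        for u, v in P:
            groups.setdefault(find(u), []).append((u, v))
        return groups   # len(groups) > 1 means isolated subtrees remain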

Experimental Results

We implemented our algorithm in the C# programming language, and all the experiments were performed on a computer with a 2.50 GHz Intel processor and 3 GB of RAM. The algorithm has been executed on several data sets, such as Beasley's data sets [9] and the SteinLib data sets [10]. Here the results of running the OASTUN algorithm on set B of Beasley's data sets are shown.

The costs of the Steiner trees resulting from executing the OASTUN algorithm on set B, without running the Obstacle Avoiding phase, are given in Table 1. The rate of the algorithm is computed as the ratio of the OASTUN cost to the optimum cost.

Table 1: The results of the OASTUN algorithm, without running the Obstacle Avoiding phase, on the set B

Graph   Nodes   Edges   Terminals   Optimum cost   OASTUN result   Rate (OASTUN/Opt)   Time (h:m:s:ms)
B1      50      63      9           82             82              1                   0:0:0:10
B2      50      63      13          83             83              1                   0:0:0:16
B3      50      63      25          138            138             1                   0:0:0:33
B4      50      100     9           59             59              1                   0:0:0:15
B5      50      100     13          61             61              1                   0:0:0:20
B6      50      100     25          122            122             1                   0:0:0:50
B7      75      94      13          111            111             1                   0:0:0:28
B8      75      94      19          104            104             1                   0:0:0:37
B9      75      94      38          220            220             1                   0:0:0:101
B10     75      150     13          86             86              1                   0:0:0:54
B11     75      150     19          88             92              1.045               0:0:0:66
B12     75      150     38          174            174             1                   0:0:0:131
B13     100     125     17          165            170             1.03                0:0:0:69
B14     100     125     25          235            235             1                   0:0:0:123
B15     100     125     50          318            321             1.009               0:0:0:220
B16     100     200     17          127            132             1.039               0:0:0:125
B17     100     200     25          131            131             1                   0:0:0:168
B18     100     200     50          218            218             1                   0:0:0:350


When obstacles are drawn on the main graphs, the underlying graph changes according to the shape and position of the obstacles. Therefore, the Steiner tree and its cost become different from those of the original graph without obstacles.

In the urban transportation network, where intersections are the vertices of the graph and roads are the edges, the obstacles are the places that drivers cannot pass through. In Figure 1, there are some samples of graph 16 from data set B with some obstacles; according to these obstacles, the Steiner tree and its cost change. In this figure, all the terminal nodes are shown with filled circles. In Fig. 1(a), the Steiner tree obtained for graph 16 without considering obstacles is shown; the cost of this tree is 132, which is a near-optimum cost. In Fig. 1(b) and (c), two free-form obstacles have been drawn in different positions; the costs of the obtained Steiner trees in these conditions are 200 and 193 respectively. These are the shortest possible trees that do not pass through the obstacles. In Fig. 1(d), some boundaries are drawn for the graph, and the cost of the resulting Steiner tree is 163.

Figure 1: The obtained Steiner trees of graph 16 from data set B. The color of the given graph is white smoke; the color of the obstacles is blue, and the color of the obtained Steiner tree is black.

It is obvious that if there were no obstacles and all the edges of the given graph could be used, the result would be closer to the optimum. However, this approach finds near-minimum feasible answers even when it must avoid obstacles.

Conclusions

The Steiner tree problem is an important issue in many fields. In this article, a new heuristic approach was proposed that can find Steiner trees on graphs even when there are obstacles. This algorithm can be used on huge graphs, such as transportation networks, in appropriate running time. As there are free-form restricted areas in the urban transportation network that drivers cannot pass through, the OASTUN approach considers these free-form obstacles and finds the Steiner tree while avoiding them. The algorithm has polynomial time complexity, and it finds near-optimum answers at good rates in comparison with the optimum answer and with other works.

References

[1] S. E. Dreyfus and R. A. Wagner, The Steiner Problem in Graphs, Networks 1 (1972), 195-207.
[2] R. M. Karp, Reducibility among Combinatorial Problems, Complexity of Computer Computations, Plenum Press, New York (1972), 85-103.
[3] H. Takahashi and A. Matsuyama, An approximate solution for the Steiner problem in graphs, Math. Jpn. 24 (1980), 573-577.
[4] Y. F. Wu, P. Widmayer, and C. K. Wong, A faster approximation algorithm for the Steiner problem in graphs, Acta Informatica 23 (1986), 223-229.
[5] S. V. Dolagh and D. Moazzami, New Approximation Algorithm for Minimum Steiner Tree Problem, International Mathematical Forum 6/53 (2011), 2625-2636.
[6] S. L. Martins, P. M. Pardalos, M. G. C. Resende, and C. C. Ribeiro, Greedy Randomized Adaptive Search Procedures for the Steiner Problem in Graphs, AT&T Labs Research, Technical Report (1998).
[7] S. Dasgupta, C. H. Papadimitriou, and U. V. Vazirani, Algorithms, Chapter 4, Section 4, 2006.
[8] M. de Berg, O. Cheong, M. van Kreveld, and M. Overmars, Computational Geometry, Third Edition, Springer-Verlag, Berlin/Heidelberg, Chapter 2, Section 1, 2008.
[9] J. E. Beasley, OR-Library: Distributing Test Problems by Electronic Mail, Journal of the Operational Research Society 41/11 (1990), 1069-1072.
[10] T. Koch, A. Martin, and S. Voss, SteinLib: An Updated Library on Steiner Tree Problems in Graphs, ZIB-Report 00-37, Germany (2000).

Black Hole Attack in Mobile Ad Hoc Networks

Kamal Bazargan
University of Guilan
Department of IT Engineering, Trends in Computer Networks
Ka.Bazargan@yahoo.com

Abstract: An ad hoc wireless network consists of a set of distributed nodes that are connected with each other wirelessly. Nodes can be host computers or routers. The nodes communicate directly with each other without any access point, have no fixed organization, and therefore form an arbitrary topology. Each node is equipped with a sender and a receiver. An important feature of these networks is their dynamic and changing topology, which is a result of node mobility: the nodes continually change their position, which calls for a routing protocol that is able to adapt to these changes.

Keywords: Specialized mobile networks; network security; massive black hole attack; routing protocol; Black Hole; AODV

Introduction

Routing and security in ad hoc networks are problems of today's networks. Ad hoc wireless networks are of two types: smart sensor networks and mobile ad hoc networks. In ad hoc sensor network routing, the sensor hardware imposes restrictions on the network that the routing methods should take into account, including the limited power supply of the nodes: in practice it is not possible to replace or recharge them, so the routing method proposed for such a network should make the best use of the available energy and must be aware of the resources, so that nodes with insufficient resources do not forward packets toward the destination; the nodes must be autonomous and capable of adaptation. Nowadays the tendency to use wireless networks grows day by day, because anyone, anywhere and at any time, can use them. Mobile ad hoc networks are sets of wireless nodes that can be formed dynamically at any place and at any time without using any form of network infrastructure. Most of these nodes play a role both as a router and as a host at the same time. This feature has made it possible to establish networks whose structure is not fixed and predefined, in important situations such as military operations, earthquakes, floods and the like, where such a network is needed.

Communication between the nodes of an ad hoc network is via radio waves: if another node is within a node's radio range it is considered its neighboring node, and direct communication between two nodes that are not within radio range is not possible, so other nodes are used for the communication. Communication between nodes is therefore built on cooperation and mutual trust. Selfish nodes, wireless communication, the lack of defensive lines, the lack of centralized management to review the behavior of the nodes in the network, the dynamic change of the network structure, and the power constraints of the nodes provide a good platform for various attacks against wireless networks. The fact that nodes in these networks work together and exchange information (in fact, they cooperate based on trust) provides a good opportunity for an attacker to penetrate the network, disrupt network routing and eliminate the exchange of information in the network.

One of the most popular protocols used in these networks is the AODV protocol, and in many studies the effects of attacks on the AODV protocol have been examined.



Nodes that create problems during routing and cause data loss in the network are called malicious nodes or black holes. In this paper a solution to the black hole attack is presented, in which the behavior of the nodes in the network is used to decide whether a target node is malicious or not.
AODV Algorithm

The AODV (Ad hoc On-demand Distance Vector) algorithm does not carry the data path in the packet header; it uses a query cycle to build a route on request. When a source node that currently has no route to the destination in its table wants to reach a destination node, it broadcasts a route request (RREQ) packet across the network. A node that receives this packet announces the destination and finishes the routing if it has news of the destination; otherwise it adds its own node number and forwards the message, until the request reaches a neighbor of the destination node and a route reply is returned to the source node. Each node checks an incoming RREQ against its table: if the final node is already in its table, it issues an RREP; otherwise the RREQ message is broadcast further, and the RREP is later sent back to the RREQ sender. So that an intermediate node knows whether a route request is newer, a sequence number is used in RREQ messages: only if the RREQ sequence number is smaller than the known sequence number is an RREP message issued by an intermediate node.

Black Hole Attacks

3.1 Introducing Black Hole Attack

The most dangerous attacks are black hole attacks. In a black hole attack, the attacker, with false news of the shortest route, attracts network traffic to itself and then discards it. The black hole attack is a severe attack that can easily be mounted against routing in mobile ad hoc networks. A black hole is a malicious node that falsely replies to every route request, without having an active route to the destination, and in this way receives all the packets. If malicious nodes work together as a group, very serious damage is created in the network; this type of attack is called a cooperative black hole attack.

3.2 Targets of Attacks

Black holes have two characteristics: first, a black hole introduces its own path as the shortest (most reliable) route, although this is a false path, with the intention of stopping packets; secondly, a black hole wastes the traffic passing from the origin node by consuming it. In ad hoc network routing, AODV is one of the most popular protocols in use, and the black hole nodes that do the most damage to this protocol sit inside the routing process and disturb the routing protocol.

3.3 Division of Black Hole Nodes

Black hole nodes can be divided into several categories:

1. Nodes that create problems individually.
2. A group of nodes (several nodes) working together.

In another division, malicious nodes can also be divided into two functional types:

1. Malicious nodes that consume the received data and remove it from the path.
2. Malicious nodes that consume the received data and then broadcast it in a perverse way (this creates network traffic and causes the buffers of nodes to be used up).

Types of Black Hole Attacks

4.1 Methods of Individual Nodes

When routing is initiated by the source node, the malicious node changes the information of the source node's routing table (it reduces the HOP count, the number of nodes to traverse to the destination, and reduces the destination routing time). The source node therefore chooses that node as the path for sending data, and the node consumes and destroys the data. If the black hole introduces itself as the proper path to all nodes, it causes the loss of all network packets and eventually a denial of service.



4.2 Making Resistant Techniques against the Black Hole Attack by Individual Nodes

Some solutions have been proposed for the single black hole. In one of them, when an intermediate node responds to the RREQ, information about the next hop toward the destination is added to the RREP packet; the source node then sends a further request (FREQ) to that next hop and asks it about the responding node and the path to the destination. Using this method, the reliability of the responding node can be determined, but only if the next hop is itself reliable. This solution cannot prevent a massive black hole attack in MANETs: for example, if the next-hop node cooperates with the responding node, it simply answers every FREQ, the source node trusts the next hop and sends data to the responding node, and that node is a black hole. The next proposed way to prevent attacks by individual black holes requires the intermediate node to send a route confirmation request (CREQ) to the next hop toward the destination node.

After that, the next hop receives the CREQ and searches its memory for a route to the destination. If it has one, it sends a route confirmation reply (CREP) to the source node with the route information, and the source node, by comparing the information in the CREP, detects whether the RREP is valid or not. Because operations are added to the routing protocol, the overhead of this method is high. In another method, the source node validates the RREP by finding more than one path to the destination: it tries to get RREP packets from more than two nodes. In ad hoc networks, largely identical routes share a number of nodes and hops, so when the source node receives the RREPs, if the paths to the destination share hops, it can identify a safe route to the destination; however, routing delay is caused because the node must wait to receive RREPs from more than two nodes. This approach is used to keep the routing overhead and the routing delay from increasing further. In the next method, when an RREP is issued, voting takes place around the responding node; then, based on the opinions issued by the neighbors about the RREP node, a decision is made as to whether the node is engaged in bad business.

4.3 Making Resistant Techniques against the Massive Black Hole Attack

In this solution, with a small change in the AODV protocol, a data routing information (DRI) table is introduced; by checking this table, black hole attacks can be largely prevented. Cooperating black hole nodes are identified by adding two bits of information to the data routing information (DRI) table, which fixes the problem somewhat.

Node #   From   Through
3        1      0
6        1      1
B2       0      0

Table 1: An example DRI table maintained by node 4.

In the table above, the value 1 means true and the value 0 means false. "From" means data that the node has sent to the node in question, and "Through" means data received through the node in question. The table is an example for node 4: the entry for node 3 (from = 1, through = 0) means that data has been sent from node 4 to node 3, but no data packets have been found to pass through that node. The entry for node 6 means that data has been sent from node 4 to node 6 and a data path through that node has been found (i.e., correct data has passed through this node). For node B2, the inserted values mean that no data has been sent to this node and no data has been received from it. In this way, a method for detecting malicious and new nodes is presented with two bits of data.

Another solution that can prevent massive black hole attacks is a development of AODV. It discovers a safe route that avoids the massive black hole, supposing that confirming nodes participate in the exchange. In this method, to prevent black hole attacks, a truth table is used in which a degree of accuracy is maintained for each node. If a node's accuracy degree is zero, the node should be discarded, so it is called a black hole.
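The DRI bookkeeping above can be sketched briefly. In the following illustrative Python fragment (the field names and the suspicion rule are assumptions drawn from the description above, not the paper's code), a node that we have sent data to but never successfully routed data through is flagged as a candidate black hole.

    # Toy sketch of the two-bit DRI table for node 4.
    dri_table = {
        3:    {"from": 1, "through": 0},
        6:    {"from": 1, "through": 1},
        "B2": {"from": 0, "through": 0},
    }

    def is_suspect(node_id):
        # Sent to it, but nothing ever observed passing through it.
        entry = dri_table.get(node_id, {"from": 0, "through": 0})
        return entry["from"] == 1 and entry["through"] == 0

    print([n for n in dri_table if is_suspect(n)])   # -> [3]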

91

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

The Proposed Method

The proposed method tries to decide whether a node is malicious based on the node's behavior. The principles of the proposed method are as follows:

1. Recording information about each node, namely: the total data sent to neighboring nodes; the total data received from a neighbor node; and the number of replies received from a neighbor node.
2. Sending packets requesting the opinions of neighbors about a neighboring node that has sent reply packets.
3. Receiving the information recorded about the sender of the reply packet by its neighboring nodes.
4. Reviewing the received information and opinions about the suspected node.
5. Sending a quarantine warning packet about a malicious node.
6. Removing the quarantined node from the routing process.

In the proposed method, each node in the network keeps the following data structures:

1. Each node has a table related to its own behavior and that of its neighbors. Each entry in this table specifies, for the neighbor node with the given ID, how many data packets were sent to it, how many reply packets it sent, and how many data packets the node in question delivered to a neighbor node.
2. Each node contains a list of the nodes that are in quarantine and removed from the routing process.

Malicious nodes are nodes that respond to RREQ packets by sending RREP packets: a large number of data packets are delivered to them, but they send the minimum of data on to neighboring nodes. When a node receives an RREP packet from its neighbor, if the node receiving the RREP that responds to an RREQ is an intermediate node rather than the destination node, it checks whether the responding node is among the nodes in quarantine. If it is a malicious node, the RREP packet is discarded. Otherwise, a voting process is performed around the responding node so as to obtain all of that node's activity; then, based on the received information, the accuracy of the node in question is checked, and if it is a malicious node an alarm message is broadcast in the network so that the target node is placed in quarantine. The proposed algorithm has been implemented on the AODV protocol and uses several packet types for its operations:

1. Information request packet about a node: this packet contains the ID of the node in question, the ID of the request sender, and the packet lifetime.
2. Data packet from neighboring nodes about the node in question: this packet includes the number of data packets received from the target node, the number of data packets sent to the target node, and the number of RREP packets received from the target node.
3. Warning packet: this packet names nodes that are known to be malicious and should be put in the quarantine list. The alarm packet is distributed across the whole network.

The benefits of the proposed method are, first, that a node begins the polling process only when it receives an RREP packet from an unreliable node; that is, if a node has already proven its integrity (by sending data packets), the polling for it takes place less often than for others, which reduces the overhead of the proposed algorithm. Secondly, when the information is requested, the neighboring nodes are updated as well, further reducing the algorithm's overhead.
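A compact sketch of the behavior table and the quarantine decision may clarify the method. The threshold, field names and aggregation rule below are illustrative assumptions, not values specified by the paper.

    # Per-neighbor behavior table and quarantine decision (illustrative).
    behaviour = {}    # neighbor id -> {"sent": .., "received": .., "rreps": ..}
    quarantine = set()

    def record(neighbor, sent=0, received=0, rreps=0):
        entry = behaviour.setdefault(neighbor,
                                     {"sent": 0, "received": 0, "rreps": 0})
        entry["sent"] += sent
        entry["received"] += received
        entry["rreps"] += rreps

    def looks_malicious(votes, min_forward_ratio=0.1):
        # Aggregate the neighbors' observations about a node: many RREPs
        # answered but almost no data forwarded marks it as a black hole
        # candidate. The 0.1 threshold is an assumption.
        sent = sum(v["sent"] for v in votes)          # data sent to the node
        forwarded = sum(v["received"] for v in votes) # data it passed back on
        rreps = sum(v["rreps"] for v in votes)
        return rreps > 0 and sent > 0 and forwarded / sent < min_forward_ratio

    def on_rrep(responder, votes):
        if responder in quarantine:
            return "DISCARD_RREP"
        if looks_malicious(votes):
            quarantine.add(responder)     # broadcast the alarm packet here
            return "ALARM"
        return "ACCEPT_ROUTE"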

Simulations of Black Hole Attacks

In this simulation, using the OPNET simulation software with a number of healthy nodes and a number of malicious nodes, we show the simulation results.

6.1 Parameters Used in Simulation

Parameter                     Value
Simulation software           OPNET
Simulation time               600 sec
Number of nodes               50
Routing protocol              AODV
Traffic model                 CBR
Stop time                     2 sec
Implementation area           600*600 m
Transfer range                250 m
Number of malicious nodes     2


6.2 The Simulated Performance

Average end-to-end delay: specifies when a packet sent by the source reaches the destination.
Packet delivery ratio (PDR): the rate of data packets delivered from origin to destination.
Routing rate: the maximum rate of routing-control transmissions in the routers (message types: RREQ, RREP, RERR).

Figure 1: OPNET simulation environment

6.3 The Simulation

Simulation time (s)   Average end-to-end delay   Packet delivery rate   Routing rate
100                   0.003323244                2557                   2171
130                   0.003323344                2551                   2133
160                   0.003323371                4001                   2151
190                   0.003323419                4001                   2198
210                   0.003323444                4001                   2165
240                   0.003323454                4001                   2171
270                   0.003323474                4001                   2199
300                   0.003323348                4001                   2111

Table 2: Results of simulation: AODV under attack


Figure 2: Packet delivery ratio vs. Simulation Time
Simulation time (s)   Average end-to-end delay   Packet delivery rate   Routing rate
100                   1.104323644                2271                   4950
130                   1.104323644                2333                   4950
160                   1.104323644                2451                   4950
190                   1.104323654                2698                   4950
210                   1.104323654                2765                   4950
240                   1.104323664                2871                   4950
270                   1.104323664                2899                   4950
300                   1.104323664                2911                   4950

Table 3: Results of simulation: After removal of the black hole

In the following, we display the simulation graphs produced by the simulation software with the malicious nodes present:
Figure 1: the OPNET simulation environment.
Figure 2: the increase in delivered packets after the removal of the malicious nodes.
Figure 3: the delay created by eliminating the attack.
Figure 4: the routing overhead caused by eliminating the attack.


Figure 3: Packet delivery ratio vs. Simulation Time

Conclusion

In this paper we described a method to deal with the black hole. The method is applicable to the AODV protocol, and with it malicious nodes can easily be detected and neutralized. The main advantage of this method is that it detects malicious nodes with minimal overhead and puts them in quarantine; because of its simplicity and ease of implementation, it can be used in most ad hoc networks.


Figure 4: routing overhead vs. Simulation Time



Improvement of the Modeling of the Airport Assignment Gate System Using Self-Adaptive Methodology

Masoud Arabfard
Kashan University of Medical Sciences, Kashan, Iran
arabfard-ma@kaums.ac.ir

Mohamad Mehdi Morovati
University of Kashan, Kashan, Iran
Department of Computer Engineering
mm.morovati@grad.kashanu.ac.ir

Masoud Karimian Ravandi
Science and Research Branch, Islamic Azad University, Yazd, Iran
Department of Computer
karimianravandi@gmail.com

Abstract: Nowadays, the influence of software on most fields, such as industry, science and the economy, is significant. The success of software systems depends on the coverage of their requirements. Requirements engineering explains what work the system can do and in what circumstances, and successful requirements engineering depends on exact knowledge of the requirements of users, customers and beneficiaries. The Airport Assignment Gate System is system software which manages the assignment of gates to aircraft automatically. This system is used in order to reduce delays in the airline system, as well as reducing the delay time for planes which are waiting to land or take off. In this paper, the self-adaptive methodology has been used for modeling this system, with regard to the fact that this system should show different behavior in different conditions. A self-adaptive system is a system which is able to change itself while responding to changing needs, system and environment. Using this methodology, this paper attempts to support uncertainty and accountability to needs created at runtime better than before.

Keywords: Self-adaptive Software; Run-time Requirements Engineering; KAoS; Uncertainty Management; Goal Oriented; Airport Assignment Gate System.

Introduction

Nowadays, software has significantly influenced most fields, such as industry, science and the economy. The success of software systems depends on the coverage of their requirements. Requirements engineering explains what work the system can do and in what circumstances, and successful requirements engineering depends on exact knowledge of the requirements of users, customers and beneficiaries. Understanding the concept of the system can be used anywhere in software development, such as in the modeling, analysis, negotiation and documentation of the beneficiaries' requirements, in evaluating the requirements documents provided, and in the management of requirements evolution [1]. In this paper, a kind of requirements engineering named goal-based requirements engineering has been used. In goal-based requirements engineering, the main focus is on the goals of the system; in fact, the goal is used in this technology for extracting the requirements and for evaluation, structuring, documentation, analysis and system evolution. According to this viewpoint, a goal is a prescriptive statement of the system concept, which the system should achieve through the participation of its agents. An agent is an active component of the system with a particular role in it; in other words, the agent is responsible for satisfying the needs, and the agents of a system delimit its scope. Therefore, the goals of the system should be shared as phenomena among the agents so that the agents satisfy them by doing their own tasks [2]. We require a requirements engineering modeling language in order to carry out requirements engineering; in this paper, we have used the KAoS language for modeling the requirements. The reason for using KAoS is the superiority of this technology compared to other languages, which stems from its definition of the system concept as a hierarchy of objectives and its use of the Agent concept [3]. Moreover, this technology was introduced in [4] with fully object-oriented agents, and this has caused it to be used more widely than before.

Self-Adaptive System

A self-adaptive system is a system which is able to change itself at runtime in response to changing needs, system and environment. Such systems depend on a variety of aspects, like user needs, features of the system, features of the environment, etc. Their main feature is that they partly reduce the dependence on human management. In fact, self-adaptive software assesses its own behavior and changes it if the assessment makes clear that the system has not completely done the task assigned to it and has not achieved the desired objective, or that the work can be done with greater efficiency and effectiveness [5]. Before the creation of self-adaptive software, the reimplementation and reconfiguration of a system, a time-consuming and costly act, was done by a human or under his direct management in order to respond to the occurring changes. Therefore, research on software which can adapt itself to changes occurring at runtime automatically and without human interference became important. Self-adaptive software was developed as a system with a feedback loop in order to adapt itself to changes occurring at runtime (Figure 1). These changes may arise from the system itself (internal factors) or from the context of the system (external factors). Thus, these kinds of systems are required to scan themselves, detect the changes, decide how to react to the change, and finally implement the decided action [6].

Figure 1: Self-Adaptive System Feedback Loop

Airport Assignment Gate System

Gates are the final ports for passengers' entry and exit at the airport. Airport gate assignment is the process of selecting and assigning aircraft to gates, which is used for exact and scheduled assignment and is considered one of the important tasks at an airport. This assignment is connected with a set of arriving and departing flights, the gates that are ready to be assigned, and a set of constraints imposed by the airlines and the airport. Thus, the assignment process may differ under various circumstances. In order to create an efficient assignment, the assignment process must be able to cope with sudden changes in the operating environment and provide a timely solution satisfying the proactive needs. Therefore, the gate assignment should be quite clear and explicit and have the ability to cope with changes [7]. As the number of passengers and flights increases, the complexity of this process increases significantly and the optimal use of gates becomes very important. Furthermore, as mentioned, due to the sudden changes which may occur, the system should apply an optimal and efficient assignment according to the new conditions and the requirements they create.

Goal and Agent in the Assignment Gate System

A system goal is the final aim which the system should achieve. A goal can be connected with the life of the system or with its scenario, and it can be displayed as several quantities, each of which is connected with a different feature. In addition, a goal can be divided into several sub-goals, each of which is associated with a feature. A behavioral goal defines a maximal set of permissible system behaviors. This kind of goal is divided into two groups: the Achieve goal and the Maintain goal. An Achieve goal is an objective which indicates the ultimate destination of the system and demonstrates the behavior which should finally exist in the system. The norm goal prioritizes among the alternative behaviors of the system and determines which alternative has more benefits and advantages; in other words, it is not necessary for all of them to be implemented continuously in the system. This type of goal is usually considered as a criterion for selecting among options in the system. However, we cannot show whether the target goal is satisfied under these conditions or dissatisfied under other conditions; in fact, unlike the behavioral goal, there is no clear-cut satisfaction criterion for this type of goal [8]. An agent is an active member of a system which plays a role in satisfying the goal. What is considered in the agent model is not the special features of the agent, but the role of the agent in satisfying the goal [9]. From a functional standpoint, an agent is a processor which performs a specific function under transparent and clear conditions for satisfying the desired goals. These conditions depend on the permissions and tasks which are defined for each operator in the operator model. In order to analyze the responsibilities and map them to permissions and tasks, we need to divide the multi-agent responsibilities into single-agent ones for the low-level goals, so that each goal card is assigned to a software agent for the requirements and/or an environmental agent for the expectations. In order to assign a goal to an agent, the abilities of that agent should be considered. These abilities are defined in the object model as features of the classes corresponding to the agent.
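As an illustration of these concepts (not of the KAoS notation itself), the following Python sketch models a small goal hierarchy with Achieve/Maintain kinds and per-leaf agent responsibility; the class layout and the sample goals are assumptions for illustration.

    from dataclasses import dataclass, field
    from typing import List, Optional

    @dataclass
    class Goal:
        name: str
        kind: str                      # "achieve" or "maintain"
        subgoals: List["Goal"] = field(default_factory=list)
        agent: Optional[str] = None    # responsible agent for leaf goals

    root = Goal("Assign best gate", "achieve", subgoals=[
        Goal("Get flight information", "achieve", agent="Pilot"),
        Goal("Check emergency status", "achieve", agent="Allocator"),
        Goal("Keep database consistent", "maintain", agent="Database"),
    ])

    def unassigned_leaves(g):
        # Leaf goals that still lack a responsible agent.
        if not g.subgoals:
            return [] if g.agent else [g.name]
        return [n for s in g.subgoals for n in unassigned_leaves(s)]

    print(unassigned_leaves(root))   # -> [] when every leaf has an agent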

Figure 3: Goal Diagram of System

System Model Simulation

Figure 2 represents the use case chart of the airport system. The KAoS charts are drawn using this chart and based on it: the actors of this chart are in fact the agents of the goal and responsibility model, and its use cases help us in determining the methods of the object model and the operators of the operator model. The goal chart of the Airport Assignment Gate System is presented in Figure 3; this chart is designed based on the goal-based methodology. The numbers shown in Figures 3 and 4 are described as follows.
1. Gate Is Requested
2. Achieve[Getting information If Pilot was requested]
3. Achieve[Checking, Assigning emergency flight If Information was given]
4. Achieve[Checking capacity, airline, area and making a decision If emergency was not true]
5. Achieve[Update1 database]
6. Achieve[Inform To pilot]
7. Achieve[Inform pilot to allocator for leaving gate]
8. Achieve[Assigning gate if flight was emergency]
9. Achieve[Assigning gate that is appropriate to other constraints if flight was not emergency]
10. Achieve[Update2 DataBase]

Figure 2: Use Case Diagram of the Assignment Gate System

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

After extracting the system goal chart, the agent chart is extracted from it; Figure 4 shows the system agent chart. Then the object model is extracted using the use case chart; Figure 5 represents this model. Finally, the behavior of the system is displayed by developing the operator chart, which is obtained by connecting the charts extracted in the previous stages; Figure 6 shows the operator chart. The operators shown in Figure 6 are described as follows, and a small illustrative sketch of this flow is given after the list.

A. RequestGate
B. AddInfo
C. GetInfo
D. CheckEmergent
E. AssignGateToEmergencyFlight
F. CheckOtherConstraint
G. AssignGateToOtherFlight
H. AddQueue
I. Update1Databese
J. InformedPilot
K. LeaveGate
L. FindEmergencyWaitingFlight
M. AssignToEmergencyWaitingFlight
N. FindAppropriateFlight
O. AssignAppropriateFlight
P. Update2Database

Figure 5: Object Diagram of System
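As a rough illustration of the operator flow A-P (a simplification with assumed data structures and rules, not the paper's model), the following Python sketch walks through the request-check-assign path.

    def assign_gate(flight, gates, queue):
        # RequestGate -> CheckEmergent -> assignment or queueing.
        if flight["emergency"]:
            free = [g for g in gates if g["free"]]            # E
        else:
            free = [g for g in gates                          # F, G
                    if g["free"]
                    and g["capacity"] >= flight["size"]
                    and g["airline"] == flight["airline"]]
        if not free:
            queue.append(flight)                              # H: AddQueue
            return None
        gate = free[0]
        gate["free"] = False                                  # I: Update1Database
        return gate["id"]                                     # J: InformedPilot

    gates = [{"id": "G1", "free": True, "capacity": 2, "airline": "A"}]
    queue = []
    print(assign_gate({"emergency": False, "size": 1, "airline": "A"},
                      gates, queue))   # -> G1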

Related Works

Runtime requirements engineering is considered a subset of self-adaptive software engineering that has only been studied seriously in recent years. [10] is one of the works that addresses the airport gate assignment problem: the method presented in [10] uses knowledge and experience from past manual operation to solve the problem, with algorithms that have more analyzing and computing power than before; the major drawback of this method is the manual part of the operation. Furthermore, in order to optimize gate assignment, [11] focuses on minimizing the distance passed by passengers between the terminal and the gate assigned to the aircraft; although this is a second-rate concern among gate assignment problems, solving it affects the optimality of the overall assignment. Moreover, to solve the assignment problem, [12] focuses on probable flight delays, using a probabilistic gate assignment model and proactive assignment rules. Since this system should change according to the various conditions that may occur in the operational environment and adapt itself to new circumstances, none of the existing systems has focused on strongly supporting uncertainty in the design of this system. This aim has been achieved in the method presented in this paper using the self-adaptive methodology and the KAoS requirements modeling language.
Figure 4: Agent Diagram



Figure 6: Operator Diagram of System

Conclusions and Future Work

This paper attempted to support uncertainty in designing and implementing the Airport Assignment Gate System by using the self-adaptive methodology, and a system which is largely independent and has optimized performance was designed. According to the charts presented for designing this system, it became obvious that the system adapts itself to different conditions and performs the best possible behavior in any circumstances. Recently, a new technology named Techne has been developed for modeling the requirements of self-adaptive software systems, whose main focus is on better management of uncertainty in self-adaptive software [13]. Due to shortcomings in this technology, such as the lack of a development environment for its charts, it cannot be considered a formally established technology yet. However, according to the research being conducted in this field, this technology can be considered a turning point in requirements engineering and in designing self-adaptive software systems.

References

[1] M. Jackson, The meaning of requirements, Annals of Software Engineering 3 (2010), no. 1, 5-21.
[2] A. Van Lamsweerde, Requirements engineering: from system goals to UML models to software specifications, Vol. 3, Wiley, 2009.
[3] A. Uszok, J. M. Bradshaw, and R. Jeffers, KAoS: A policy and domain services framework for grid computing and semantic web services, Trust Management (2004), 16-26.
[4] J. M. Bradshaw, S. Dutfield, P. Benoit, and J. D. Woolley, KAoS: Toward an industrial-strength open agent architecture, Software Agents (1997), 375-418.
[5] M. Salehie and L. Tahvildari, Self-adaptive software: Landscape and research challenges, ACM Transactions on Autonomous and Adaptive Systems (TAAS) 4 (2009), no. 2, 14.
[6] B. Cheng, R. de Lemos, H. Giese, P. Inverardi, J. Magee, J. Andersson, B. Becker, N. Bencomo, Y. Brun, and B. Cukic, Software engineering for self-adaptive systems: A research roadmap, Software Engineering for Self-Adaptive Systems, LNCS (2009), 1-26.
[7] H. Ding, A. Lim, B. Rodrigues, and Y. Zhu, The over-constrained airport gate assignment problem, Computers and Operations Research 32 (2005), no. 7, 1867-1880.
[8] A. Dardenne, A. Van Lamsweerde, and S. Fickas, Goal-directed requirements acquisition, Science of Computer Programming 20 (1993), no. 1-2, 3-50.
[9] P. Donzelli, A goal-driven and agent-based requirements engineering framework, Requirements Engineering 9 (2004), no. 1, 16-39.
[10] Y. Cheng, A knowledge-based airport gate assignment system integrated with mathematical programming, Computers and Industrial Engineering 32 (1997), no. 4, 837-852.
[11] A. Haghani and M. C. Chen, Optimizing gate assignments at airport terminals, Transportation Research Part A: Policy and Practice 32 (1998), no. 6, 437-454.
[12] S. Yan and C. H. Tang, A heuristic approach for airport gate assignments for stochastic flight delays, European Journal of Operational Research 180 (2007), no. 2, 547-567.
[13] I. J. Jureta, A. Borgida, N. A. Ernst, and J. Mylopoulos, Techne: Towards a new generation of requirements modeling languages with goals, preferences, and inconsistency handling, IEEE (2010), 115-124.

A new model for solving the capacitated facility location problem with an overall cost of losing any facility, and a comparison of Particle Swarm Optimization, Simulated Annealing and Genetic Algorithm

Samirasadat Jamali Dinan
Amirkabir University of Technology
Department of Mathematics and Computer Science
smrjamali@yahoo.com

Fatemeh Taheri
Amirkabir University of Technology
Department of Mathematics and Computer Science
Ft.taheri@gmail.com

Farhad Maleki
Amirkabir University of Technology
Department of Mathematics and Computer Science
maleki.farhad@gmail.com

M. E. Shiri
Amirkabir University of Technology
Department of Mathematics and Computer Science
shiri@aut.ac.ir

Abstract: Facility location problems arise in a wide variety of practical settings. In this paper we propose a new formulation for the capacitated facility location problem, which develops the general framework by including the amount of risk for each facility if other resources cannot serve its customers. The new formulation is evaluated with three metaheuristic algorithms, namely a Genetic Algorithm, Particle Swarm Optimization and Simulated Annealing, and finally some numerical examples are provided to show the performance of these algorithms in solving the new problem formulation.

Keywords: Capacitated Facility Location Problem; Genetic Algorithm; Particle Swarm Optimization; Simulated Annealing

Introduction

The facility location problem is a classic combinatorial optimization problem: it determines which of N capacity-constrained facilities should be used, and at which locations, to satisfy the demand of M customers at the lowest sum of fixed and variable costs. The problem is formulated as in Khumawala (1974). Structural properties of the location problems treated here have been studied by, e.g., Leung and Magnanti (1989), Cornuejols, Sridharan and Thizy (1991), Aardal (1992), Aardal, Pochet and Wolsey (1995), and many others (Harkness and ReVelle, 2003; Drezner et al., 2002; Canel and Das, 2002; Nozick, 2001; Canel et al., 1996, 2001; Melkote and Daskin, 2001; Giddings et al., 2001; Canel and Khumawala, 1996; Hinojosa et al., 2000; Tragantalerngsak et al., 2000; Avella et al., 1998; Owen and Daskin, 1998; Volgenant, 1996). Consequently, there is now a variety of approaches for solving these problems. The most well-known general heuristic methods are Particle Swarm Optimization (PSO), Simulated Annealing (SA), and Genetic Algorithms (GA). The popularity of these heuristics has flourished in recent years, and several published studies can be found in the literature where they outperform tailored counterparts. However, only a few studies have compared these three heuristics in depth. In this paper, we compare the relative performance of PSO, SA and GA on the capacitated facility location problem (CFLP). The choice of CFLP is made due to its strategic importance in the design of supply chain networks. Our motivation is to contribute further to the understanding of which of these three heuristics may


be more effective under different circumstances. In the remainder of the paper, we briefly provide references to the pertinent FLP literature. We then present the new formulation of the capacitated facility location problem (CFLP) and the benefit of using it, discuss solving this problem with metaheuristic methods together with the details of the implementation of PSO, SA and GA, and finally describe an empirical comparison.
Problem Statement

The capacitated facility location problem is a type of FLP with capacity restrictions; the natural extension of the problem is to allow one type of facility with a capacity restriction. Consider locating a number of facilities of the same type at several sites (locations). If a site is selected, a fixed setup cost occurs, which is independent of the facilities being installed at it; however, in real life the purchasing price often depends on the purchasing size. It is important to note that the representation selection (matrix vs. vector) as well as the parameter values for PSO, SA and GA described below are determined based on the results of extensive pilot experiments and testing for the problem. In these experiments, the effect of various parameter settings on solution quality and computation time is assessed for each of the metaheuristics, and the parameter values are set accordingly. The representation selection and parameter values used are shown in Table 1. In this section, we describe the solution procedure of CFLP with the three heuristics.

Parameter values for the CFLP problem:

Genetic Algorithm:            Mutation Rate 0.05; Crossover 0.85; Iteration 100; POP 40
Simulated Annealing:          Iteration 1000; Accept Rate 0.09
Particle Swarm Optimization:  Iteration 100

Table 1: Representation selection and parameter values.

We start by giving the mathematical formulation of a general model for capacitated facility location problems. The general model, formulated as in Khumawala (1974), is:

$$Z = \min \sum_{k \in K} \sum_{j \in J} c_{kj} x_{kj} + \sum_{j \in J} f_j y_j \qquad (1)$$

subject to

$$\sum_{j \in J} x_{kj} = 1, \quad \forall k \in K \qquad (D)$$
$$\sum_{k \in K} d_k x_{kj} \le s_j y_j, \quad \forall j \in J \qquad (C)$$
$$\sum_{j \in J} s_j y_j \ge \sum_{k \in K} d_k \qquad (T)$$
$$x_{kj} - y_j \le 0, \quad \forall j \in J,\; k \in K \qquad (B)$$
$$0 \le x_{kj} \le 1, \quad y_j \in \{0, 1\}$$

where K is the set of customers and J the set of potential plant locations; $c_{kj}$ is the cost of supplying customer k's demand $d_k$ from location j; $f_j$ is the fixed cost of operating facility j and $s_j$ its capacity if it is open; the binary variable $y_j$ equals 1 if facility j is open and 0 otherwise; finally, $x_{kj}$ denotes the fraction of customer k's demand met from facility j. The constraints (D) are the demand constraints and the constraints (C) are the capacity constraints. The aggregate capacity constraint (T) and the implied bounds (B) are superfluous; they are, however, usually added in order to sharpen the bound if Lagrangean relaxation of constraints (C) and/or (D) is applied. Without loss of generality it is assumed that $c_{kj} \ge 0$ for all k, j; $f_j \ge 0$ and $s_j > 0$ for all j; $d_k \ge 0$ for all k; and $\sum_{j \in J} s_j \ge \sum_{k \in K} d_k$. Lagrangean relaxation approaches for the CFLP relax at least one of the constraint sets (D) or (C).
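To make the formulation concrete, the following Python sketch (our illustration, not code from the paper) evaluates objective (1) and checks constraints (D), (C) and (B) for a candidate solution; the array names mirror the symbols above.

```python
import numpy as np

def cflp_objective(x, y, c, f):
    """Objective (1): Z = sum_k sum_j c_kj x_kj + sum_j f_j y_j."""
    return float(np.sum(c * x) + np.dot(f, y))

def cflp_feasible(x, y, d, s, tol=1e-9):
    """Check the demand (D), capacity (C) and linking (B) constraints."""
    served = np.allclose(x.sum(axis=1), 1.0, atol=tol)   # (D): each customer fully served
    loads_ok = np.all(d @ x <= s * y + tol)              # (C): load of facility j <= s_j * y_j
    linked = np.all(x <= y[np.newaxis, :] + tol)         # (B): x_kj <= y_j
    return served and loads_ok and linked
```

Metaheuristics such as the PSO, SA and GA used here typically penalize violations reported by such a feasibility check rather than rejecting candidate solutions outright.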
2.2 New formulation of CFLP with cost risk

In the new formulation of CFLP we add the overall cost of risk, calculated for each facility:

$$Z = \min \sum_{k \in K} \sum_{j \in J} c_{kj} x_{kj} + \sum_{j \in J} f_j y_j + \sum_{j \in J} R_j \qquad (2)$$

From $x_{kj}$, $y_j$ and $d_k$ we can calculate the total amount of demand met by each facility; this vector is $q_j$. We also know the capacity of every facility, so from $s_j$ and $q_j$ we calculate the remaining capacity of each facility after its demands are met. If we calculate, in the absence of a given facility, how much of its demand cannot be satisfied by the remaining capacity of the other facilities, we obtain the risk of losing that facility. In this problem we assume a facility without capacity restriction that answers the remaining demands at an expensive transport cost of $2\max(c_{kj})$. How to calculate $R_j$ is explained with a sample.


If we assume $x_{kj}$ is (rows: customers, columns: facilities):

$$x = \begin{pmatrix} 0.5 & 0 & 0.1 & 0 & 0.3 & 0.1 \\ 0.2 & 0 & 0.1 & 0 & 0.3 & 0.4 \\ 0.3 & 0 & 0.1 & 0 & 0.2 & 0.4 \\ 0.5 & 0 & 0.1 & 0 & 0 & 0.4 \\ 0.7 & 0 & 0.1 & 0 & 0.1 & 0.4 \end{pmatrix}$$

and the vector $y$ indicates which facilities are open and which are closed.




The demand vector $d_k$ is $(5, 20, 10, 4, 5)$, with total demand 45. The demand satisfied by facility 1 is

$$q_1 = 19.8 = 5 \cdot 0.7 + 20 \cdot 0.5 + 10 \cdot 0.3 + 4 \cdot 0.2 + 5 \cdot 0.5,$$

and $q_j$ is computed in the same way for every facility. The capacity of the facilities is $s_j = (20, 26, 10, 10, 30)$, and the remaining capacity of each facility, $E_j$, is $(0.2, 6.1, 4.8, 15, 20)$.

Figure 1: CFLP with risk (best: 366029.8831, mean: 368287.3626; fitness value vs. generation)

The total remaining capacity is 26.7. If we lose facility 1, the other facilities are unable to satisfy its demands: $26.7 - 0.2 - 19.2 = 7.3$. If this value were negative, those demands would have to be satisfied by the unrestricted facility with the expensive transport cost. In this way we can decide about the benefit or harm of each facility being open or closed.
Fig. 1 illustrates the performance of CFLP with risk and Fig. 2 the performance of CFLP without risk when the number of facilities is 15 and the number of customers is 20.
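The risk term can be reproduced in a few lines; the sketch below is our reading of the example above (the helper name and the sign convention are ours): the demand of a lost facility that the spare capacity of the surviving facilities cannot absorb is charged at the penalty transport cost $2\max(c_{kj})$.

```python
import numpy as np

def risk_costs(x, y, d, s, c):
    """R_j: penalized demand left unserved if facility j is lost."""
    q = d @ x                    # q_j: demand currently met by facility j
    E = s * y - q                # E_j: remaining capacity of each open facility
    penalty = 2.0 * c.max()      # expensive transport cost from the text
    R = np.zeros_like(q, dtype=float)
    for j in range(len(q)):
        spare_elsewhere = E.sum() - E[j]    # spare capacity of the other facilities
        shortfall = q[j] - spare_elsewhere  # demand of j that cannot be reassigned
        if shortfall > 0:
            R[j] = penalty * shortfall
    return R
```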

Figure 2: CFLP without risk (best: 241945.5792, mean: 241945.63; fitness value vs. generation)

Empirical comparison

3.1 Time-limited results

For the time-limited evaluation, all three heuristics were allowed a maximum time of 200 s and the best solution found by each heuristic was noted. This approach evaluates the efficiency with which the three heuristics reach quality solutions over time. For CFLP, PSO gives the best results in terms of rapidly reaching low-cost solutions, followed by SA and GA, respectively.

3.2 Unrestricted results

When all the heuristics were allowed to finish their runs according to their parameters, PSO gives the best results in terms of rapidly reaching low-cost solutions, followed by GA and SA, respectively. Figures 3-5 illustrate the performance of CFLP with PSO, GA and SA when the number of facilities is 15 and the number of customers is 10.

Figure 3: CFLP with risk by PSO (function value vs. iteration)

Figure 4: CFLP with risk by GA (best: 210400.9005, mean: 211615.2837; fitness value vs. generation)

Figure 5: CFLP with risk by SA (best function value: 311372.8065; function value vs. iteration)

References

[1] B. Fleischmann and A. Klose, Advanced solutions to practical problems, Wiley Publishing, Chapter 1, pages 1-10, 2005.
[2] Z. Drezner and H.W. Hamacher, Facility Location: Applications and Theory, Wiley Publishing, 2005.
[3] K. Aardal, Reformulation of capacitated facility location problems: How redundant information can help, Annals of Operations Research (1998), 289-308.
[4] M.A. Arostegui Jr., S.N. Kadipasaoglu, and B.M. Khumawala, An empirical comparison of tabu search, simulated annealing and genetic algorithms for facilities location problems, Elsevier Science Publishers, Houston, USA (2006).
[5] Z. Lu and N. Bostel, A facility location model for logistics systems including reverse flows: The case of remanufacturing activities, Computers & Operations Research 34 (2007), 299-323.
[6] J. Harkness and C. ReVelle, Facility location with increasing production costs, European Journal of Operational Research (2003), 1-13.

A hybrid method for collusion attack detection in OLSR based MANETs

Hojjat Gohargazi, Tarbiat Modares University (TMU), Faculty of Electrical and Computer Engineering, h.gohargazi@modares.ac.ir
Saeed Jalili, Tarbiat Modares University (TMU), Faculty of Electrical and Computer Engineering, sjalili@modares.ac.ir

Abstract: Due to the lack of infrastructure and routers, Mobile Ad hoc NETworks (MANETs) are, in addition to external attacks, vulnerable to internal attacks that can come from authorized nodes. The collusion attack is a prevalent attack against the Optimized Link State Routing (OLSR) protocol. In this attack, two colluding malicious nodes prevent routes to a target node from being established. In this paper we propose a hybrid (One Class Classification (OCC) and Centroid) method for detecting the collusion attack. For this purpose we adapt OCC methods using a simple distance-based method called Centroid. Results show that this model increases the accuracy of discerning this attack.

Keywords: Anomaly detection; Collusion attack; OLSR; One class classification; MOG.

Introduction

MANETs are wireless networks with mobile nodes and without infrastructure. In these networks there is no special device acting as a router, so all nodes have to participate in the routing process. Due to these factors, routing protocols become the base of MANETs. However, such cooperation makes the network vulnerable to attacks launched by authorized malicious nodes.
The collusion attack [1] is one of the particular and severe attacks against MANETs based on the OLSR [2] protocol, in which a pair of attacker nodes collude and cooperate to prevent routes to a specific node from being established, so that node becomes unreachable from the other nodes. Our method to detect this attack is based on a machine learning approach: we use an OCC method that is able to distinguish normal and abnormal behaviours. Also, considering the importance of the collusion attack, the classifier is adapted with a simple distance-based method, called Centroid, to better discern this attack. Another advantage of this method over the other works is its ability to detect attacks similar to the collusion attack.
The rest of this paper is organized as follows: Section 2 discusses the related work. Section 3 gives an overview of the OLSR protocol and the collusion attack. The proposed method is presented in Section 4. Section 5 shows the experimental results, and at last Section 6 describes future work and concludes the paper.

2 Related work

In [1] the authors proposed a method to detect the collusion attack by including the 2-hop neighbourhood information of each node in its HELLO messages. The method tries to discern the attack based on contradictions in the topology information table. This causes each node to obtain information about its 3-hop neighbourhood without the need for TC messages. Although this method can detect the attack, it is difficult to distinguish between topology changes and attacks; the result is an increase in false positives. The method proposed in [3] is the incorporation of an information-theoretic trust framework in OLSR, in which nodes collaborate to calculate trust values of each other.

After a certain threshold, a node with a weak trust value is placed on the blacklist. This method needs extra storage to keep trust values for each node, and furthermore the correctness of the cooperation of neighbour nodes affects the accuracy of the method. The authors of [4] proposed a simple attack-resistant method, based on the fact that the collusion attack forces the Target to choose only one Multi Point Relay (MPR) node. According to this method, there should be more than one MPR in the MPR set of each node whenever the node has more than one 1-hop neighbour. Due to choosing a non-optimal set of MPRs, this method affects the performance of the network by increasing traffic overhead.
In [5] the Node Isolation attack, which is similar to the collusion attack, has been investigated. This attack involves a single attacker instead of a pair of attackers. In the detection phase of the method proposed in [5], the Target node observes its MPR to check whether it is generating TC messages that include the Target's link information. This method is not suitable for the collusion attack because the second attacker, which drops packets, may be outside the Target's range.
All of the methods mentioned above are based on changes to the protocol. Also, they are each proposed against only one attack and are not even suitable for similar ones. In the literature, only [6] uses machine learning algorithms to detect attacks against OLSR, but the collusion attack is not studied in it. That model is based on ensemble methods and uses a two-class classification algorithm (C4.5).

3 Background

3.1 Optimized Link State Routing

OLSR is one of the four standard routing protocols provided for MANETs. This protocol is proactive: the routes to all nodes are calculated periodically and maintained in the routing table of each node. OLSR is based on two types of messages, HELLO and TC. Every node broadcasts HELLO messages only to its 1-hop neighbourhood at 2-second intervals, including its link, neighbourhood and MPR information. Using the information collected from HELLO messages, each node selects a subset of its 1-hop neighbours called the MPR set. MPRs ensure delivery of the packets received from their selectors to all of their 2-hop neighbours.
After selecting MPRs and informing them of their selectors, every MPR generates and broadcasts TC messages every 5 seconds to propagate topology information across the network. Unlike HELLO messages, TC messages are forwarded and spread, but only by MPRs. Using the topology information obtained from these messages, every node calculates its routing table with a shortest-path computation algorithm.

3.2 Collusion Attack

In OLSR, routes are calculated from the information collected from TC messages. So in the collusion attack, the necessary condition to prevent routes to a node from being calculated is that the attacker node be the only MPR of the Target. Attacker1, which is one of the 1-hop neighbours of the Target, advertises all of the Target's 2-hop neighbours as its own neighbours in its HELLO messages. According to the MPR selection phase of the protocol, Attacker1 will become the Target's MPR. Thereafter, Attacker1 selects Attacker2 as its only MPR. TC messages generated by the Target will be forwarded by Attacker1, and TC messages generated or forwarded by Attacker1 will be forwarded only by Attacker2. However, Attacker2 drops these messages instead of forwarding them. Since the TC messages of the Target and Attacker1 do not reach the other nodes, those nodes will not be able to create any route to the Target.

4 Proposed method

In this section we describe our method to detect attacks (especially the collusion attack) against OLSR. First, a set of features is needed for collecting data samples. For this purpose we use 20 different features; 16 of them are taken from the features defined in [6] and the others are new. The features and their descriptions are listed in Figure 1.

Figure 1: features (* features from [6])


Figure 2: Proposed method

As shown in Figure 2, the proposed method consists of two phases, Training and Testing. These phases and their parts are discussed in the following.

4.1 Data Scaling

Many OCC methods are sensitive to data scaling, so it is important how the data are scaled. Assuming $X = \{x_1, x_2, \ldots, x_n\}$ as the data samples, the scaling method we used is

$$x_i^s = (x_i - \mu_T)/\sigma_T,$$

in which $\mu_T$ and $\sigma_T$ are the mean and standard deviation of the training data, respectively.

4.2 Centroid method

This method is proposed to adapt the output of the OCC method to detect the collusion attack. Assume that $\mu_T$ and $\mu_A$ are the mean of the normal data (i.e., data collected from the network in the absence of any attack) and the mean of the attack data (i.e., data collected during occurrence of the collusion attack) used for training. The relative distance of a data sample $x_i$ is then calculated as

$$RD_i = \|x_i - \mu_T\| / \|x_i - \mu_A\|. \qquad (1)$$

$RD$ shows how close the sample $x_i$ is to the attack status versus the normal status: the higher the value of $RD$, the higher the probability that the collusion attack has occurred. As shown in Figure 2, a part of this method is performed in Training and a part in Testing.

4.3 Testing and Voting

Combining is performed in the Testing phase. After learning a model and calculating $\mu_T$ and $\mu_A$ in the Training phase, when a sample $x_i$ arrives, to test whether it is normal or attack, first its distance to the model (learned by OCC) is computed ($D_i$). Then, according to Equation 1, $RD_i$ is calculated to determine the probability of the collusion attack. At last these two values are combined with a voting mechanism. We use two simple voting functions, mean and maximum, defined as

$$y = \mathrm{mean}(D_i, RD_i), \qquad y = \max(D_i, RD_i).$$

The experimental results in the next section show that the mean function works better than the maximum.
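A minimal Python sketch of the scaling, Centroid and voting steps above (our own illustrative code; it assumes the OCC model already yields a distance score $D_i$):

```python
import numpy as np

def scale(X, mu_T, sigma_T):
    """Section 4.1: z-score scaling with the training-set mean and std."""
    return (X - mu_T) / sigma_T

def relative_distance(x, mu_T, mu_A):
    """Eq. (1): RD_i = ||x - mu_T|| / ||x - mu_A||; large when x resembles attack data."""
    return np.linalg.norm(x - mu_T) / np.linalg.norm(x - mu_A)

def vote(D_i, RD_i, mode="mean"):
    """Section 4.3: combine the OCC distance with RD_i by mean or maximum."""
    return (D_i + RD_i) / 2.0 if mode == "mean" else max(D_i, RD_i)
```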

5 Experiment and Results

To validate our model we simulated a MANET in Network Simulator 2 (NS2) to collect normal and attack datasets. The simulation parameters are as follows:

Number of nodes: 50
Simulation time: 3000 s
Area: 1000 m x 1000 m
Mobility model: RWP
Traffic type: CBR

The simulation has been run with six different movements for both normal and attack situations. The data gathered from the simulations formed six datasets for each situation; two of them were used for training and the others were used to test the method.
To show the effect of the proposed model, two measures are used. The Receiver Operating Characteristic (ROC) curve shows the contrast between the Detection Rate (DR) and the False Alarm Rate (FAR); DR is the ratio of detected attack data to all attack data, and FAR is the ratio of normal data detected as attack to all normal data. Another measure is the Area Under the Curve (AUC), which shows the overall superiority of one ROC versus another.
We tested our method with an OCC method, Mixture of Gaussians (MOG). This method is not naturally an OCC method, but in [7] it is defined and used in the same way as OCC methods. The results of applying this method purely and in combination are shown in Figure 3. Since a FAR higher than 20% is unacceptable, to better represent the effect of the method, in this paper the ROC is

drawn just until FAR equals 20%. As can be seen, combining MOG with Centroid by the mean function is better than combining by the maximum function or using MOG purely. Figure 4 compares the AUC of the ROCs shown in Figure 3.
ROC is a threshold-based curve. In this study the threshold is used to decide whether the output value y indicates the collusion attack or not. For this reason, selecting a threshold is a trade-off between DR and FAR: the closer the threshold is to one, the higher DR becomes, but FAR goes up too, while decreasing the threshold towards zero decreases both DR and FAR. As shown in Figure 5, selecting 0.60221 as the threshold results in a DR of 78% with a FAR of 10% for MOG-Centroid combined with the mean function. Finally, to reveal the advantage of the model in identifying attacks similar to the collusion attack, Figure 6 presents the comparison of the ROC curves of the methods applied to Node Isolation attack data.

Figure 3: ROC curves for detecting the Collusion attack by three methods: MOG, MOG-Centroid (max function) and MOG-Centroid (mean function); DR (%) vs. FAR (%)

Figure 4: Comparing the AUC of the ROCs in Figure 3

Figure 5: Selecting the threshold: a higher threshold causes higher DR and FAR and vice versa (threshold = 0.60221 gives DR = 78%, FAR = 10%)

Figure 6: ROC curves for the Node Isolation attack by the three methods

6 Conclusions and future work

In this paper we proposed a model to adapt OCC methods to detect the collusion attack against OLSR. For this purpose we used MOG as the OCC method and defined the Centroid method to adapt it. In addition to detecting the collusion attack, the other advantage of this model is its ability to diagnose attacks similar to the collusion attack. The results show that the proposed model performs well when the mean function is used for combining. In future work we will focus on defining more features to represent the behaviour of OLSR and on detecting more attacks against this protocol. Defining a stronger method than Centroid can also be future work.

Acknowledgement

This research has been supported in part by the Iran Telecommunication Research Center (ITRC).

References

[1] B. Kannhavong et al., A collusion attack against OLSR-based mobile ad hoc networks, Global Telecommunications Conference (GLOBECOM '06), IEEE, 2006, pp. 1-5.
[2] T. Clausen and P. Jacquet, RFC 3626 - Optimized Link State Routing Protocol (OLSR), IETF RFC 3626 (2003), 1-75.
[3] M.N.K. Babu et al., On the prevention of collusion attack in OLSR-based mobile ad hoc networks, 16th IEEE International Conference on Networks (ICON 2008), 2008, pp. 1-6.
[4] P.L. Suresh et al., Collusion attack resistance through forced MPR switching in OLSR, Wireless Days (WD), 2010 IFIP, 2010, pp. 1-6.
[5] B. Kannhavong et al., A study of a routing attack in OLSR-based mobile ad hoc networks, International Journal of Communication Systems 20 (2007), no. 11, 1245-1261.
[6] J.B.D. Cabrera et al., Ensemble methods for anomaly detection and distributed intrusion detection in Mobile Ad-Hoc Networks, Information Fusion 9 (2008), no. 1, 96-119.
[7] D.M.J. Tax, One-class classification: Concept-learning in the absence of counter-examples, Ph.D. thesis, Delft University of Technology, 2001.

A Statistical Test Suite for Windows to Cryptography Purposes

R. Ebrahimi Atani, Department of Computer Engineering, Faculty of Engineering, Guilan University, rebrahimi@guilan.ac.ir
N. Karimpour Darav, Department of Computer Engineering, Faculty of Engineering, Lahijan branch, Islamic Azad University, karimpour@liau.ac.ir
S. Arabani Mostaghim, Department of Computer Engineering, Faculty of Engineering, Guilan University, saideh arabani@yahoo.com

Abstract: Encryption has been considered a precious technique to protect information against unauthorized access, alongside the development of analytical methods to evaluate cryptographic algorithms. Analysis via statistical tests is one of the methods used by the National Institute of Standards and Technology (NIST). This article introduces a software tool, implemented using the C and Java programming languages, for cryptographic purposes.

Keywords: Cryptography; Statistical Tests; Pseudo Random Number Generator (PRNG); JNI

Introduction

Encryption plays a significantly important role in protecting information against unauthorized access, and the use of random numbers in cryptographic applications is increasing notably [1]. For example, the needed keys are generated by utilizing random number generators in order to prevent attackers from guessing them. Hence, generating random numbers is a serious problem, which is addressed by applying random number generators. However, evaluating their quality is far from straightforward and needs some analytical manipulation. For this purpose, NIST [2] has provided a set of statistical tests applied to the output of implemented Random Number Generators (RNGs); consequently, their results are taken into account as a benchmark to select the generator for a desired application [2]. Nonetheless, there are many other applications in which statistical analysis can be used [1], [2]. Our tool utilizes the C and Java [9] programming languages and can be run under the Windows operating system. By exploiting the JNI [7] technique, the C and Java programming languages are intertwined to take advantage of the features of both languages.

Random Number Generators

Naturally generating random numbers is possible only through random physical phenomena. Mouse motions or the timing of keyboard key presses can be random phenomena, and such phenomena can be used as RNGs. The use of mathematical functions is another method for generating random numbers; generators of this kind, used in computer systems, are named pseudo-random number generators (PRNGs). The functions generating random numbers that are used in our tool include:

Linear Congruential Generator (LCG): An LCG produces a pseudo-random sequence of numbers $x_1, x_2, x_3, \ldots, x_n$ based on the equation [5]:


$$x_{i+1} = a x_i + b \bmod m, \quad \text{for } i \ge 0 \qquad (1)$$

The primitives of the equation are [2]: $a$ is dependent on the current state, $x_0 = 2^{31}$, and the constants $b$ and $m$ are $b = 0$ and $m = 2^{31} - 1$.

Blum-Blum-Shub Generator: The output of this generator is based on three independent parameters. Two parameters $p$ and $q$ are prime numbers, whereas the third parameter $s$ is randomly selected in the interval $[1, pq-1]$ such that $\gcd(s, pq) = 1$. Since the parameters $p$, $q$ and $s$ are time dependent, users will not be able to guess or reproduce the same sequences [1].

Micali-Schnorr Generator: The input parameters of this generator are also time dependent; in other words, by changing the state, the input parameters change [1]. Two prime numbers $p$ and $q$ are chosen, and a parameter $e$ is selected such that $1 < e < \phi$, where $\phi = (p-1)(q-1)$, $\gcd(e, \phi) = 1$, $n = pq$, $80e \le N = \lfloor \lg n \rfloor + 1$ and $k = \lfloor N(1 - 2/e) \rfloor$. The algorithm of this generator is described in [1].

Quadratic Congruential Generator (QCG): A QCG produces a pseudo-random sequence of numbers $x_1, x_2, x_3, \ldots, x_n$ from the equation [5]:

$$x_{i+1} = a x_i^2 + b x_i + c \bmod m \qquad (2)$$

where $i \ge 0$. According to the constants, the QCG is divided into two sections [2].

For section 1: $a = 1$, $b = c = 0$ and $m$ is a 512-bit prime:
$$x_{i+1} = x_i^2 \bmod m, \quad \text{for } i \ge 0 \qquad (3)$$

For section 2: $a = 2$, $b = 3$, $c = 1$ and $m = 2^{512}$:
$$x_{i+1} = 2x_i^2 + 3x_i + 1 \bmod m, \quad \text{for } i \ge 0 \qquad (4)$$

Cubic Congruential Generator (CCG): This generator produces random numbers by use of a cubic equation [17]:
$$x_{i+1} = a x_i^3 + b x_i^2 + c x_i + d \bmod m \qquad (5)$$
With the primitives $a = 1$, $b = c = d = 0$ and $m = 2^{512}$, the recurrence applied by the CCG to generate random numbers is [2]:
$$x_{i+1} = x_i^3 \bmod 2^{512}, \quad \text{for } i \ge 0$$

Exclusive OR Generator: This generator produces random numbers through the recurrence equation [1]:
$$x_i = x_{i-1} \oplus x_{i-127}, \quad \text{for } i \ge 128 \qquad (6)$$
It should be considered that the input parameter (the seed) is a 127-bit variable.

Modular Exponentiation Generator: To produce a 512-bit sequence, this generator uses the equation [6] $x_{i+1} = a^{y_i} \bmod m$, for $i \ge 1$, where $a$ and $m$ are constant numbers and $y_i$ is a variable [2].

Secure Hash Generator: A Secure Hash Generator (SHA) [18] produces a sequence $x_i$ with $b$-bit length, where $160 \le b \le 512$.
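For illustration, here is a short Python sketch of two of the generators above. This is our own code, not NIST's; the LCG defaults follow the constants quoted in the text, while the multiplier a and the QCG modulus are left as parameters, since the tool derives them from its state.

```python
def lcg(a, n, x0=2**31, b=0, m=2**31 - 1):
    """Linear Congruential Generator: x_{i+1} = (a*x_i + b) mod m  (Eq. 1)."""
    x = x0
    for _ in range(n):
        x = (a * x + b) % m
        yield x

def qcg_section1(m, n, x0):
    """Quadratic Congruential Generator, section 1: x_{i+1} = x_i^2 mod m  (Eq. 3)."""
    x = x0
    for _ in range(n):
        x = (x * x) % m
        yield x

# Example: the first three LCG outputs for an arbitrarily chosen multiplier.
print(list(lcg(a=16807, n=3)))
```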

Set of Statistical Tests

A package of statistical tests is proposed by NIST [2] as criteria to evaluate the quality of an RNG (or PRNG) and its appropriateness for one or more applications. If a sequence successfully passes all 15 tests, it does not mean that the results of these tests are exactly correct, but with high probability the sequence can be accepted. The package consists of the following 15 tests:

Frequency Test: In this test the number of 0s and 1s in a sequence is computed and then compared with the expected result. Furthermore, in this test $\chi^2$ is computed from the equation [12], [1] (a code sketch of this statistic appears after the list):
$$\chi^2 = \frac{(n_0 - n_1)^2}{n} \qquad (7)$$

Frequency Test within a Block: In this test the entire sequence is divided into blocks of $K$-bit length; the number of blocks is $N = \lfloor n/K \rfloor$, so $N \cdot K \le n$ and the remaining $|n - N \cdot K|$ bits are neglected. Afterward, the number of 1s ($n_1$) and 0s ($n_0$) in each block is computed. Finally, if $n_1$ (or $n_0$) is as expected ($K/2$), we can say that the entire sequence is approximately random. It should be considered that if $K = 1$, this test reduces to the frequency test [13].

Runs Test: Let $s$ be an entire sequence containing 0s and 1s. An uninterrupted repetition of 0s (or 1s) is named a run of the sequence. If the number of runs in $s$ is the same as expected in a random sequence, the test result is that the entire sequence is random [14].

Longest Run of Ones: In this test the length of a run of 1s is tested. In other words, the maximum length of a run of 1s is computed and then compared with the expected value [14].

Binary Matrix Rank Test: In this test, the rank of disjoint sub-matrices formed from the sequence is considered [11].

Discrete Fourier Transform Test: This test searches a sequence and discovers periodic patterns of bit occurrences. In other words, the purpose of this test is to compute the number of repetitions of patterns in a sequence; the number of repetitions determines the peak heights [3], [4].

Non-overlapping Template Matching Test: There are some templates that have non-periodic repetition in their sequences. In this test the number of occurrences of these templates in the entire sequence is computed. This number shows whether the entire sequence is random or not [2].

Approximate Entropy Test: For this test, the number of overlapping $k$-bit templates in the entire sequence is computed. In fact, this test compares the frequencies of $k$- and $(k+1)$-bit subsequences with those of a truly random sequence [15].

Cumulative Sums (Cusum) Test: In this test, the sequence is first converted to the digits -1 and 1, and then the maximum of the partial sums $S_k = \sum_{i=0}^{k} (2x_i - 1)$, for $k = 0, \ldots, n$, is computed. If the maximum of the sums is too large, the sequence cannot be random [2].

Random Excursions Test: For this test, eight states (-4, -3, -2, -1, 1, 2, 3, 4) are considered. A cycle in the sequence $s$ begins at 0 and ends at 0. If the number of visits to each state within the cycles is the same as would be expected, the entire sequence can be considered random [16].

Random Excursions Variant Test: In this test, eighteen states are considered and, as in the previous test, the cumulative sums are computed; however, in this test the total number of visits to a state over all cycles is what matters [16].

Overlapping Template Matching Test: This test is the same as the former test; however, in this test the subsequences may overlap each other [2].

Maurer's Test: For this test, the number of bits between matching patterns in subsequences of the entire sequence is considered and compared with what would be expected for a random sequence [2].

Linear Complexity Test: This test is based on Linear Feedback Shift Registers (LFSRs). The length of the LFSR is considered; for instance, if the length of the LFSR is too short, it demonstrates that the entire sequence is non-random [3].

Serial Test: Overlapping subsequences of $m$-bit length occurring in the sequence $s$ are counted. From these counts, $\chi^2$ can be computed [10]:

$$\chi^2 = \frac{2^m}{n} \sum_{i \in Z_m} \left(n_i - \frac{n}{2^m}\right)^2 - \frac{2^{m-1}}{n} \sum_{i \in Z_{m-1}} \left(n_i - \frac{n}{2^{m-1}}\right)^2 \qquad (8)$$
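As a concrete illustration of the simpler statistics above, the following hedged Python sketch computes the frequency (monobit) chi-square of Eq. (7) and the raw cumulative-sums statistic. This is our simplified code, far shorter than the NIST reference implementation, which additionally converts each statistic to a p-value.

```python
def frequency_chi2(bits):
    """Frequency (monobit) statistic, Eq. (7): chi^2 = (n0 - n1)^2 / n."""
    n = len(bits)
    n1 = sum(bits)          # number of ones
    n0 = n - n1             # number of zeros
    return (n0 - n1) ** 2 / n

def max_cusum(bits):
    """Cumulative sums test core: map bits to +/-1 and return the largest |S_k|."""
    s, peak = 0, 0
    for b in bits:
        s += 2 * b - 1
        peak = max(peak, abs(s))
    return peak

# Example on a short bit sequence.
sample = [1, 0, 1, 1, 0, 1, 0, 0, 1, 1]
print(frequency_chi2(sample), max_cusum(sample))
```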

Implementation of the Tool by JNI

In this program, by use of the C and Java programming languages, an application program has been designed for cryptographic purposes. Its Graphical User Interfaces (GUIs) were designed so that a user can easily interact with it; in other words, it is user friendly, and it was designed for Windows operating systems. Since Java is a powerful programming language and has a rich GUI library [8] and tools [9], we decided to reap its benefits.

Java Native Interface: JNI is one of the best features of the Java programming language. It is a powerful technique in Java. Application programs that use this technique can combine native code written in other programming languages such as C and C++ and utilize it as part of a Java program [7].

In our work, we use the JNI technique. At first we create a Dynamic Link Library (DLL) [7] file extracted from code written in the C programming language. This code is open source and published by NIST [2], and

could be converted to a DLL file by means of applying some changes and compiling it with the Visual Studio compiler. The user interface has been designed in Java, using the JNI technique to connect the GUIs to the main core code. As shown in Figure 1, in the generator window the kind of generating algorithm must be selected, and then in the next step one or more tests should be selected. When the program ends successfully, a window like the one shown in Figure 3 appears.

Figure 1: Generators view

Figure 2: Tests view

Figure 3: Successful finished view

Interpretation of the Output of the Program

Finally, the output of the program is stored in text files. For each test selected by the user, two files are created: results and states. The file result.txt includes the p-values obtained by the test; in other words, the results of the test(s) are p-values, from which the user can decide which PRNG is more suitable for an application. The file states.txt contains information provided by each test; the information stored in this file is specific to each test. FinalAnalysisReport.txt holds brief, summarized information about the tests. The result files can be interpreted according to what is described in the previous sections (see Section 3).

Conclusion

In this paper, we have used the C and Java programming languages to produce an efficient application program for cryptography purposes. This program consists of 9 mathematical algorithms that produce pseudo-random numbers and 15 statistical tests proposed by the U.S. NIST. These generators and tests were introduced by NIST using the ANSI C programming language. The graphical user interface has been written by means of the Java programming language, using the JNI technique. One of the features of this program is that it can be run on Windows operating systems with a high degree of user-friendliness.

References

[1] A. Menezes, P. van Oorschot, and S. Vanstone, Handbook of Applied Cryptography, CRC Press, Chapter 5, pages 169-190, Chapter 9, pages 321-348, June 1997.
[2] A. Rukhin, J. Soto, J. Nechvatal, M. Smid, E. Barker, S. Leigh, M. Levenson, M. Vangel, D. Banks, A. Heckert, J. Dray, and S. Vo, A Statistical Test Suite for Random and Pseudorandom Number Generators for Cryptographic Applications: Reports on Computer Systems Technology, NIST, U.S. (April 2010), available from: www.csrc.nist.gov.
[3] J.L. Massey and S. Serconek, A Fourier transform approach to the linear complexity of nonlinearly filtered sequences, Advances in Cryptology - CRYPTO, Lecture Notes in Computer Science (1994), 332-340.
[4] J. Stern, Secret linear congruential generators are not cryptographically secure, Proceedings of the 28th Annual IEEE Symposium on Foundations of Computer Science (1987), 421-426.
[5] H. Krawczyk, How to predict congruential generators, Journal of Algorithms (1992), 527-545.

[6] S.M. Hong, S.Y. Oh, and H. Yoon, New modular multiplication algorithms for fast modular exponentiation, Advances in Cryptology - EUROCRYPT (1996), 166-177.
[7] S. Liang, The Java Native Interface: Programmer's Guide and Specification, Addison-Wesley, June 1999.
[8] H.M. Deitel and P.J. Deitel, Java: How to Program, Prentice Hall, August 2004.
[9] www.java.sun.com.
[10] I.J. Good, The serial test for sampling numbers and other tests for randomness, Proceedings of the Cambridge Philosophical Society (1953), 276-284.
[11] I.N. Kovalenko, Distribution of the linear rank of a random matrix, Theory of Probability and its Applications 17 (1972), 342-346.
[12] K.L. Chung and F. AitSahlia, Elementary Probability Theory: with Stochastic Processes and an Introduction to Mathematical Finance, Springer-Verlag, New York, February 2003.
[13] M. Abramowitz and I.A. Stegun, Handbook of Mathematical Functions: with Formulas, Graphs, and Mathematical Tables, NBS Applied Mathematics Series 55, December 1972 (available from: http://people.math.sfu.ca/~cbm/aands/toc.htm).
[14] A.P. Godbole and S.G. Papastavridis, Runs and patterns in probability: Selected papers (1994).
[15] A. Rukhin, Approximate entropy for testing randomness, Journal of Applied Probability 37 (2000).
[16] M. Baron and A.L. Rukhin, Distribution of the number of visits for a random walk: Communications in Statistics, Stochastic Models 15 (1999), 593-597.
[17] J. Eichenauer-Herrmann and E. Herrmann, Compound cubic congruential pseudorandom numbers, Computing 59 (1997), 85-90.
[18] ANSI X9.30 (Part 2), Public Key Cryptography Using Irreversible Algorithms for the Financial Services Industry: The Secure Hash Algorithm 1 (SHA-1), ASC X9 Secretariat, American Bankers Association (1993).

An Empirical Evaluation of Hybrid Neural Networks for Customer Churn Prediction

Razieh Qiasi, University of Qom, Qom, Iran, Department of Information Technology, raziehghiasi@gmail.com
Zahra Roozbahani, University of Shahid Beheshti, Tehran, Iran, Department of Computer Science, roozbahani2@gmail.com
Behrooz Minaei-Bidgoli, University of Science and Technology, Tehran, Iran, Department of Computer Engineering, minaeibi@cse.mcu.ed

Abstract: Customer churn has become a critical issue, especially in the competitive and mature telecommunication industry. From an economic and risk management perspective, it is important to understand customer characteristics in order to retain customers. However, few studies have used hybrid modeling for churn prediction. The main contribution of this paper is the use of hybrid neural networks for churn prediction. The experimental results show that the hybrid model performs better than a single neural network model.

Keywords: churn; customer retention; hybrid data mining; neural networks.

Introduction

As new markets develop, competition between companies increases sharply. As the competition hardens and telecommunication becomes a commodity product, companies are forced to minimize costs, add value to their services, and guarantee differentiation. Customers can now choose their service providers, so companies pay attention to customer care in order to keep their position in the market. Under these hard competitive conditions, companies try to focus on customers' behaviour. Based on the needs of customers, telecommunication companies decide their service offers, shape their communication networks and, in addition, change their organizational structures [1]. If a customer stops doing business with a provider and joins another one, the customer is called a churner. Churn is a major problem for companies with many customers, like credit card providers or insurance companies. In the telecommunication industry, the sharp increase of competition makes customer churn a great concern for providers [2]; in the wireless telephone industry, annual churn rates have been reported to range from 23.4% [3] to 46% [4]. Churn is closely related to the retention concept, representing the opposite effect: churn = 1 - retention. While the focus of retention investigation is to find out why customers stay, churn focuses on the reasons a customer may leave. In order to effectively manage customer churn, it is important for companies to build a more effective and accurate customer churn prediction model. Statistical and data mining techniques are useful for creating such prediction models, and this paper accordingly focuses on the use of data mining to predict customer churn. Customer churn prediction models aim to detect customers with a high propensity to attrite. An accurate segmentation of the customer base allows a company to target the customers that are most likely to churn in a retention marketing campaign, which improves the efficient use of the limited resources for such a campaign. Many studies have examined different data mining techniques to

predict customer churn. Researchers have shown that hybrid data mining models can improve on the performance of single clustering or classification techniques used individually; in particular, such models are composed of two learning stages [5]. Nevertheless, few studies examine the performance of hybrid data mining techniques for customer churn prediction. Therefore, this paper uses a hybrid neural network in order to improve the accuracy of prediction models. The rest of the paper is organized as follows. The definition of churn and a summary of related studies are introduced in Section 2. The data used in the research is described in Section 3, and the modeling process based on neural networks is presented in Section 4. The conclusion of this paper is presented in Section 5.

Literature Review

Many highly competitive organizations have understood that retaining existing and valuable customers is their core managerial strategy for surviving in their industry. This leads to the importance of churn management. Customer churn means that customers intend to move their custom to a competing service provider. Many studies have discussed customer churn management in various industries, especially in mobile telecommunications. In order to understand how related work constructs its prediction models, this paper reviews some of the current related studies. Shin-Yuan Hung et al. (2006) [6] used decision tree and neural network techniques for predicting wireless service churn. They found that both decision tree and neural network techniques can deliver accurate churn prediction models. John Hadden et al. (2007) [7] reviewed some of the most popular technologies that have been identified for the development of a customer churn management platform. Kristof Coussement and Dirk Van den Poel (2008) [8] compared three classification techniques, Logistic Regression, Support Vector Machines and Random Forests, to distinguish churners from non-churners. Their review shows that Random Forests are a viable opportunity to improve prediction performance compared to Support Vector Machines and Logistic Regression, which both exhibit equal performance. Elen Lima et al. (2009) [9] show how domain knowledge can be incorporated in the data mining process for churn prediction, viz. through the evaluation of coefficient signs in a logistic regression model and, secondly, by analyzing the decision table (DT) extracted from a decision tree or rule-based classifier. Dulijana Popović and Bojana Dalbelo Bašić (2009) [10] presented a model based on fuzzy methods for churn prediction in retail banking. In order to improve the prediction rates of churn prediction in the land-line telecommunication service field, B.Q. Huang et al. (2010) [11] proposed a new set of features with three new input window techniques. For evaluating these new features and window techniques, three modeling techniques (decision trees, multilayer perceptron neural networks and support vector machines) were selected as predictors. Their results show that the new features with the new window techniques are efficient for churn prediction in land-line telecommunication service fields. Afaq Alam Khan et al. (2010) [12] identified the best churn predictors on the one hand and evaluated the accuracy of different data mining techniques on the other in the ISP industry in I.R. Iran. Clustering users by their usage features and incorporating cluster membership information in classification models is another aspect addressed in that study. V. Vijaya Saradhi and Girish Keshav Palshikar (2011) [13] reviewed different methods proposed to predict customer churn and provided a predictive model for the employee churn problem. Pinar Kisioglu and Y. Ilker Topcu (2011) [1] constructed a model using a Bayesian Belief Network to identify the behaviours of customers with a propensity to churn in the telecommunication industry. According to the results of the Bayesian Belief Network, average minutes of calls, average billing amount, the frequency of calls to people on different providers and tariff type are the most important variables that explain customer churn. Guangli Nie et al. (2011) [14] provided a model to predict customer churn by applying two techniques (logistic regression and decision tree) to credit card data. Their test results show that regression performs a little better than decision tree.

3 Data

3.1 Reactive Agents

In this paper we used a CRM dataset provided by American telecom companies, which focuses on the task of customer churn prediction. The database contained a churn variable signifying whether the customer had left the company two months after observation, and a set of 75 potential predictor variables that have been used in a predictive churn model. For the purpose of this paper, 4,000 records were randomly selected and divided, with a ratio of 9 to 1, into a training data set and a test data set.


3.2 Noise Reduction

Noise is irrelevant information that causes problems for the subsequent processing steps; therefore, noisy data should be removed. Such noise can be removed by finding its location and using correct values to replace it. For instance, correct values are used to replace incorrect ones; missing values identified by NULL or blank spaces can be replaced by neutral values; one copy of any duplicated data is kept and the others are removed; and outliers can be removed with anomaly detection models.

3.3 Normalization

Normalization is changing the scale of the data so that it maps to a small, finite range such as [-1, 1]. Normalization can be done in various ways, such as min-max normalization, Z-score normalization and so on. In this study, the min-max method is used to normalize (see the sketch below).
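For instance, a min-max mapping of each feature into [-1, 1] can be sketched as follows (our illustration of the step described above):

```python
import numpy as np

def min_max_scale(X, lo=-1.0, hi=1.0):
    """Min-max normalization of each column of X into the range [lo, hi].

    Assumes no column is constant (otherwise the denominator is zero).
    """
    x_min, x_max = X.min(axis=0), X.max(axis=0)
    return lo + (X - x_min) * (hi - lo) / (x_max - x_min)
```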

3.3

Ccombined Neural Network Models

Combined neural network models often results in a prediction accuracy that are higher than the individual
models. This construction is based on a straightforward approach that has been termed stacked generalization. The stacked generalization concepts formalized by Wolpert [16] and refer to schemes for feeding information from one set of generalizers to another before
forming the final predicted value (output). The unique
contribution of stacked generalization is that the information fed into the net of generalizers comes from
multiple partitionings of the original learning set.[17],
[18].

Normalization
4.3

Application of Combined Neural

Normalization is changing the scale of data so that they


Network Model
map to a small and finite range such as [-1,1]. Normalization can be done in various ways such as min-max
normalization, Z-Score normalization and so on. In The combined neural network topology used for the
detection of customer churn. The network topology
this study, min-max method is used to normalize.
was the MLPNN with a single hidden layer in first
level and the network topology was the MLPNN with
a two hidden layer in second level. The network had
75 input neurons, equal to the number of feature vec4 Modeling
tors. We trained second level neural network to combine the predictions of the first level networks. The
second level network has 1 input. The target for the
4.1 Multi-Layer Feed Forward Neural second level network was the same as the targets of
the original data. In the first level and second level,
Network (MLFF)
training of neural networks was done in 50 and 100
epochs, respectively. Since the values of mean square
MLFF is one of the most common NN structures, as errors (MSEs) converged to a small constants approxithey are simple and effective, and have found home mately zero in epochs, training of the neural networks
in a wide assortment of machine learning applications. was successful. In the first level and second level analMLFF are feed-forward NN trained with the standard ysis, the Levenberg-Marquardt and RP training algoback-propagation algorithm. Multi-layer feed forward rithms were used respectively.
neural network architecture shown in fig.1 They have
been shown to yield accurate predictions in difficult
problems [15].
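A minimal sketch of this two-level stacking, using scikit-learn's MLPClassifier as a stand-in for the paper's MLP neural networks: the hidden-layer sizes, the synthetic data and the solver are our assumptions (scikit-learn offers neither Levenberg-Marquardt nor RP training), while the 75 inputs, the single second-level input and the 50/100 epochs follow the text.

```python
import numpy as np
from sklearn.neural_network import MLPClassifier

rng = np.random.default_rng(0)
X = rng.normal(size=(400, 75))            # stand-in for the 75 predictor variables
y = (X[:, 0] + X[:, 1] > 0).astype(int)   # stand-in churn labels
X_train, X_test, y_train = X[:360], X[360:], y[:360]

# Level 1: MLP with a single hidden layer on the raw features.
level1 = MLPClassifier(hidden_layer_sizes=(30,), max_iter=50, random_state=0)
level1.fit(X_train, y_train)

# Level 2: MLP with two hidden layers, fed the level-1 churn score
# (one input, as in the paper) and trained against the original targets.
z_train = level1.predict_proba(X_train)[:, [1]]
level2 = MLPClassifier(hidden_layer_sizes=(10, 5), max_iter=100, random_state=0)
level2.fit(z_train, y_train)

churn_pred = level2.predict(level1.predict_proba(X_test)[:, [1]])
```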

Evaluation Model

After building a predictive model, providers will want to use the classification model to predict future behaviour, so it is essential to evaluate the classifier in terms of performance. First, the predictive model is estimated on a training set. Afterwards, this model is validated on an unseen dataset, the test set. It is essential to evaluate the performance on a test set in order to ensure that the trained model is able to generalize well. We can count the number of true positive (TP), true negative (TN), false positive (FP, actually negative but classified as positive) and false negative (FN, actually positive but classified as negative) examples. The accuracy, specificity and sensitivity performance metrics are then given by the following expressions [19]:

$$\text{Accuracy} = \frac{TP + TN}{TP + FN + TN + FP}$$
$$\text{Specificity} = \frac{TN}{TN + FP}$$
$$\text{Sensitivity} = \frac{TP}{TP + FN}$$
$$\text{Misclassification error} = 1 - \text{Accuracy}$$
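These expressions translate directly into code; a small sketch of ours for illustration:

```python
def confusion_metrics(tp, tn, fp, fn):
    """Accuracy, specificity, sensitivity and misclassification error from counts."""
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    specificity = tn / (tn + fp)
    sensitivity = tp / (tp + fn)
    return accuracy, specificity, sensitivity, 1.0 - accuracy
```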

The prediction performance of the proposed model is shown in Fig. 2. As the results show, the hybrid model performs better than the single neural network model.

Figure 2: Prediction performance for the prediction model.

Conclusion

In this study, we developed and used hybrid neural networks for predicting potential churn in wireless telecommunication services. We tested our hybrid neural network model and compared it with a single neural network model. The results of our experiments indicate that the hybrid neural networks perform better than the single neural network model, but are computationally expensive. However, successful churn management must also include effective retention actions: managers need to develop attractive retention programs to satisfy those customers. Furthermore, integrating the churn score with customer segments and applying customer value will also help managers to design the right strategies to retain valuable customers.

References

[1] P. Kisioglu and Y.I. Topcu, Bayesian belief network approach to customer churn analysis: A case study on the telecom industry of Turkey, Expert Systems with Applications 37 (2011), 7151-7157.
[2] M. Richeldi and A. Perrucci, Churn analysis case study: Telecom Italia Lab report, Torino, Italy (2002).
[3] They Love Me, They Love Me Not, 17(21) (2000), 38-42.
[4] Standing By Your Carrier, available from: http://currentissue.telophonyonline.com/ (2002).
[5] M. Lenard, G.R. Madey, and P. Alam, The design and validation of a hybrid information system for the auditor's going concern decision, Journal of Management Information Systems 14(4) (1998), 219-237.
[6] S.Y. Hung, D.C. Yen, and H.Y. Wang, Applying data mining to telecom churn management, Expert Systems with Applications 31(5) (2006), 1552.
[7] J. Hadden, A. Tiwari, R. Roy, and D. Ruta, Assisted customer churn management: State-of-the-art and future trends, Computers & Operations Research 34(10) (2007), 2902-2917.
[8] K. Coussement and D. Van den Poel, Improving customer attrition prediction by integrating emotions from client/company interaction emails and evaluating multiple classifiers, Expert Systems with Applications 36(3) (2009), 6127-6134.
[9] E. Lima, C. Mues, and B. Baesens, Domain knowledge integration in data mining using decision tables: Case studies in churn prediction, Journal of the Operational Research Society 60(8) (2009), 1096-1106.
[10] D. Popovic and B.D. Basic, Churn prediction model in retail banking using fuzzy C-means algorithm, Informatica 33 (2009), 243-247.
[11] B.Q. Huang, T.M. Kechadi, B. Buckley, G. Kiernan, E. Keogh, and T. Rashid, New feature set with new window techniques for customer churn prediction in land-line telecommunications, Expert Systems with Applications 37 (2010), 3657-3665.
[12] A. Alam Khan, S. Jamwal, and M.M. Sepehri, Applying data mining to customer churn prediction in an Internet service provider, International Journal of Computer Applications 9 (2010), no. 7, 8-14.
[13] V. Vijaya Saradhi and G. Keshav Palshikar, Employee churn prediction, Expert Systems with Applications 38 (2011), no. 3, 1999-2006.
[14] G. Nie, W. Rowe, L. Zhang, Y. Tian, and Y. Shi, Credit card churn forecasting by logistic regression and decision tree, Expert Systems with Applications 38(12) (2011), 15273-15285.
[15] D.E. Rumelhart, G.E. Hinton, and R.J. Williams, Learning internal representations by error propagation, in Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, Cambridge, MA (1986).
[16] D.H. Wolpert, Stacked generalization, Neural Networks 5(2) (1992), 241-259.
[17] D. West and V. West, Improving diagnostic accuracy using a hierarchical neural network to model decision subtasks, International Journal of Medical Informatics 57 (2000), no. 1, 41-55.
[18] E.D. Ubeyli and I. Guler, Improving medical diagnostic accuracy of ultrasound Doppler signals by combining neural network models, Computers in Biology and Medicine 35(6) (2005), 533-554.
[19] J. Han and M. Kamber, Data Mining: Concepts and Techniques, The Morgan Kaufmann Series in Data Management Systems, Morgan Kaufmann Publishers, San Francisco, CA, 2006.

A Clustering Based Model for Class Responsibility Assignment Problem

Hamid Masoud, Tarbiat Modares University (TMU), Electrical and Computer Engineering Faculty, H.Masoud@Modares.ac.ir
Saeed Jalili, Tarbiat Modares University (TMU), Electrical and Computer Engineering Faculty, Sjalili@Modares.ac.ir
S.M. Hossein Hasheminejad, Tarbiat Modares University (TMU), Electrical and Computer Engineering Faculty, SMH.Hasheminejad@Modares.ac.ir

Abstract: Assigning responsibilities to classes is a vital and critical task in the object-oriented software design process and directly affects the maintainability, reusability and performance of the software system. In this paper we propose a clustering-based model for solving the Class Responsibility Assignment (CRA) problem. The proposed model is independent of any specific clustering method and is highly extensible to cover new features of object-oriented software design. The input of the model is the collaboration diagrams of the analysis phase, and its output is a class diagram with high cohesion and low coupling. To evaluate the proposed model we use four different clustering methods: X-means, Expectation Maximization (EM), K-means and Hierarchical Clustering (HC). Comparing the results obtained by the clustering methods with the expert design reveals that the clustering methods yield promising results.

Keywords: Object-oriented analysis and design; Class responsibility assignment (CRA); Clustering.

Introduction

The object-oriented software design process involves several steps, each of which has its own activities. Class Responsibility Assignment (CRA) is one of the important and complex activities in Object-Oriented Analysis and Design (OOAD). Its main goal is to find the optimal assignment of responsibilities (where responsibilities are expressed in terms of methods and attributes) to classes with regard to various aspects of coupling and cohesion, thus leading to a more maintainable and reusable model [1]. CRA is vital not only during the analysis and design phases but also during maintenance.
There are many methodologies that help recognize the responsibilities of a system [2] as well as assign them to classes [3], but all of them depend greatly on human judgment. On the other hand, the emergence of new responsibilities or changes to existing responsibilities (e.g., ones removed or moved to other classes) cause the model to change; hence the need to reallocate responsibilities is essential. CRA is an onerous task; therefore, an automated method for it can provide enormous help to designers.
All existing research solves the CRA problem using metaheuristic methods, and there is no method based on clustering techniques. In this paper we address CRA as a clustering problem, making it fit for the application of clustering methods. For this purpose, we first extract some features from the input collaboration diagrams and then use clustering methods to cluster them and generate the class diagram. Comparing the obtained results of four different clustering methods (X-means, Expectation Maximization (EM), K-means and Hierarchical
Author, P. O. Box 14115-143, T: (+98) 2182883521

118

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

Clustering (HC)) with the model designed by expert


reveals that the clustering methods yield promising results.
The rest of this paper is organized as follows: Section
2, discusses CRA as a clustering problem. In section
3, the proposed model is described in detail. The used
case study to evaluate the proposed model is described
in section 4. Section 5 presents experimental results.
Finally, conclusions and future work are drawn in Section 6.

Related Works

In recent years there has been a dramatic increase in work on Search Based Software Engineering (SBSE). SBSE is an approach to software engineering in which search based optimization algorithms are used to address problems in software engineering [4]. The focus of most research in this area is on software testing [4]; however, there is also much research in the software design area [5]. Recently, some studies have used metaheuristic optimization algorithms to solve the CRA problem. O'Keeffe and Ó Cinnéide [6, 7] use a Simulated Annealing (SA) algorithm to automatically improve the structure of an existing inheritance hierarchy. Bowman et al. [8] study the use of a Multi-Objective Genetic Algorithm (MOGA) in solving the CRA problem; the objective is to optimize the class structure of a system through the placement of methods and attributes, and the Strength Pareto Approach (SPEA2) is used. Glavas and Fertalj [9] use four different metaheuristic optimization algorithms (simple Genetic Algorithm (GA), hill climbing, SA, and particle swarm optimization) to solve the CRA problem; they use a responsibility dependency graph as input for the optimization algorithms and coupling and cohesion metrics for evaluation. Seng et al. [10] use a GA to automatically determine potential refactorings of a class structure and inheritance hierarchy; the mutation and crossover operators used are moving a method from one class to another and moving methods/attributes up/down in an inheritance hierarchy.

CRA as a Clustering Problem

Bowman et al. [8] define the CRA problem as follows: CRA is about deciding where responsibilities, in the form of class operations (as well as the attributes they manipulate), belong and how objects should interact (by using those operations).

The CRA problem can simply be mapped to a clustering problem. To show this, we first define the clustering problem. Consider a set of $N$ $d$-dimensional data objects $O = \{O_1, O_2, \ldots, O_N\}$, where $O_i = (o_{i1}, o_{i2}, \ldots, o_{id}) \in \mathbb{R}^d$. Each $o_{ij}$ is called a feature (attribute, variable, or dimension) and represents the value of data object $i$ at dimension $j$. Given the set of data objects $O$, the goal of partitional clustering is to divide the data objects into $K$ clusters $\{C_1, C_2, \ldots, C_K\}$ that satisfy the following conditions:

a) $C_i \neq \emptyset$, $i = 1, \ldots, K$
b) $\bigcup_{i=1}^{K} C_i = O$
c) $C_i \cap C_j = \emptyset$, $i, j = 1, \ldots, K$ and $i \neq j$

In the CRA problem we have a set of methods and attributes (representing responsibilities) that must be divided among $K$ classes. If we consider each class as a cluster and each method or attribute as a data object, the CRA problem is converted into a clustering problem. In fact, the main objective of clustering algorithms is to maximize between-cluster separation (i.e., minimize coupling between clusters) and minimize within-cluster scatter (i.e., maximize cohesion within clusters); as mentioned above, coupling and cohesion are the two main goals of the CRA problem.

Proposed Model

As mentioned in Section 1, all existing methods for solving the CRA problem are based on metaheuristic algorithms, and there is no method based on clustering techniques. In this paper we propose a clustering based model for solving the CRA problem. Compared with metaheuristic based methods, the proposed model has several advantages:

- Easy to extend: by extracting new features according to the application and user priorities, new aspects of OOAD can easily be imported into the design of the class diagram. In metaheuristic based methods, such an extension causes large changes in the structure of the population member/solution encoding, the fitness function and the operators.

- A variety of new and efficient clustering methods can be used without changing the model.

- No need to design a specific fitness function: in metaheuristic based methods, defining a specific fitness function for the CRA problem and weighting its elements is a complex and onerous task.


Figure 1 shows our model for solving the CRA problem. The proposed model has three main steps: (1) extracting features and generating the data set, (2) clustering the data set, and (3) processing the clustering results and generating the class diagram. These steps are described in the following subsections.

Figure 1: The proposed model for the CRA problem.

4.1 Extracting Features

Our model is based on clustering methods, so it is necessary to generate a data set that clustering methods can process. For this purpose we first extract the features shown in Table 1. These features are defined based on the dependencies between responsibilities. In fact, there are two types of dependency: data and functional dependencies. A data dependency is a dependency between a method and an attribute; a functional dependency is a dependency between two methods. Dependencies between two methods can be of four different types [9]: simple call dependency (the source method simply starts the destination method); parameterized call dependency (the source method starts the destination method and sends data); simple call waiting for result (the source method starts the destination method and uses its result); and parameterized call waiting for result (the source method starts the destination method, sends data and uses its result). In this paper these dependencies are denoted by S, SP, SR and SPR, respectively. Dependencies between a method and an attribute can be of three different types: simple use dependency (the method uses the value of the attribute); modify dependency (the method modifies the value of the attribute); and use-and-modify dependency (the method both uses and modifies the value of the attribute). These dependencies are denoted by U, M and UM, respectively.

Table 1: Extracted features from inputs

Acronym   Definition
MAR       Method-Attribute Relation
MMR       Method-Method Relation
RA        Related Attributes
RM        Related Methods
AC        Attribute Complexity
MC        Method Complexity

MAR and MMR capture the data and functional dependencies, respectively. MAR and MMR are $M \times A$ and $M \times M$ matrices, where $M$ is the number of methods and $A$ is the number of attributes. These features are defined as follows:

$mar_{ij} \in \{U, M, UM\}$, $i = 1, \ldots, M$ and $j = 1, \ldots, A$  (1)

$mmr_{ij} \in \{S, SP, SR, SPR\}$, $i, j = 1, \ldots, M$ and $i \neq j$  (2)

RA and RM capture semantically related attributes and methods, respectively. RA and RM are $A \times A$ and $M \times M$ matrices, defined as follows:

$ra_{ij} = \begin{cases} 1 & \text{if attributes } i \text{ and } j \text{ are semantically related,} \\ 0 & \text{otherwise,} \end{cases} \quad i, j = 1, \ldots, A$  (3)

$rm_{ij} = \begin{cases} 1 & \text{if methods } i \text{ and } j \text{ are semantically related,} \\ 0 & \text{otherwise,} \end{cases} \quad i, j = 1, \ldots, M$  (4)

AC and MC are vectors of length $A$ and $M$ that capture the complexity of attributes and methods, respectively. These features are defined as:

$ac_j = AFanIn_j$, $j = 1, \ldots, A$  (5)

$mc_i = MFanOut_i + MFanIn_i$, $i = 1, \ldots, M$  (6)

where $MFanOut_i$ is the number of methods called by $M_i$, $MFanIn_i$ is the number of methods that call $M_i$, and $AFanIn_j$ is the number of methods that use or modify $A_j$.

After extracting the values of these features, they must be processed to generate the final data set. For this purpose, for each method/attribute in the MAR, MMR, RA and RM matrices, its degree of similarity with the other elements of the corresponding matrix is calculated and placed in the final data set as a feature. In this paper, the Jaccard [11] binary similarity function is used to calculate the similarity degree of elements.

4.2 Clustering

In this step, the data set generated in the previous step is clustered. The clustering algorithms preferably used for this purpose are dynamic clustering methods. Non-dynamic clustering methods can also be used, in which case the number of clusters should be determined by an expert.

4.3 Processing of Clustering Results

After clustering the data set, each cluster is considered as a class, and the relationships between classes are determined according to the dependencies of their contents. For example, suppose method $M_i$ from Class1 calls method $M_j$ from Class2; in this case there is a relationship between Class1 and Class2.
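To make steps (1) and (2) concrete, the following minimal Python sketch (our own illustration, not the authors' implementation; scikit-learn is assumed to be available, and the toy matrix is a binarized MAR) computes, for each method, a vector of Jaccard similarities to all other methods and feeds these vectors to K-means. For binary vectors x and y, the Jaccard similarity is |x AND y| / |x OR y|.

```python
import numpy as np
from sklearn.cluster import KMeans

def jaccard_row_similarity(B):
    """Pairwise Jaccard similarity between the rows of a binary matrix B
    (defined as 0 when both rows are all-zero)."""
    inter = (B @ B.T).astype(float)                # |x AND y| for all row pairs
    row_sums = B.sum(axis=1)
    union = row_sums[:, None] + row_sums[None, :] - inter   # |x OR y|
    out = np.zeros_like(inter)
    np.divide(inter, union, out=out, where=union > 0)
    return out

# Toy example: 5 methods x 3 attributes; entry 1 means method i uses/modifies
# attribute j (a binarized MAR matrix).
MAR = np.array([[1, 1, 0],
                [1, 0, 0],
                [0, 1, 1],
                [0, 0, 1],
                [1, 1, 1]])

features = jaccard_row_similarity(MAR)   # one similarity vector per method
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(features)
print(labels)  # methods with the same label form one candidate class
```

In the full model, one such similarity vector would be computed per method/attribute from each of the MAR, MMR, RA and RM matrices and concatenated; a dynamic method such as X-means would remove the need to fix the number of clusters in advance.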


Case Study

In order to validate our model, we performed a case study using the iCoot system [12]. iCoot is a car rental and reservation system; its analysis-level class diagram, designed by an expert, is shown in Figure 2. The iCoot case study consists of 18 classes and 75 responsibilities (38 methods and 37 attributes). The maximal possible number of classes in a solution has to equal the number of responsibilities, which makes the size of the search space considerably large ($75^{75}$).

Figure 2: The class diagram designed by the expert.

Experimental Results

We used four different clustering methods, X-means, EM, K-means and HC, for clustering the data set. X-means and EM are dynamic clustering methods and automatically find the number of clusters. K-means and HC are static clustering algorithms, for which the number of clusters should be set by an expert before running the algorithm. We tested 13, 14, 15 and 18 as the number of clusters and found that K-means with 13 clusters achieved better results than the other methods. In this experiment X-means and EM found 14 and 17 clusters, respectively. Figure 3 shows the best design generated by the proposed model; this design is the result of K-means with 13 clusters.

Figure 3: The best design generated by the proposed model.

The quality of a software design has mostly been measured with cohesion and coupling, which largely conform to the quality factors of efficiency and modifiability [5]. Cohesion is a measure of the extent to which the various functions performed by an entity are related to one another [13]. Coupling is the degree of interaction between classes [13]. There are many cohesion and coupling metrics [13]; low coupling and high cohesion are desirable. In this paper we use Method-Method Coupling (MMC) [8] and Method-Attribute Coupling (MAC) [8] to measure coupling, and the Ratio of Cohesive Interactions (RCI) [8] and Tight Class Cohesion (TCC) [8] to measure cohesion. Table 2 shows the values of these metrics for the class diagrams designed by the clustering methods and by the expert. Based on the results shown in Table 2, K-means with 13 clusters obtains the best coupling and cohesion values. Also, compared with the expert design, the results of the other clustering methods are good and have better coupling and cohesion values.

In order to evaluate the effectiveness of the clustering methods, we compared their performance with the single-objective GA proposed by Bowman et al. [8]. For this purpose we ran each method 10 times and report the best and average values obtained. The results provided by the clustering methods and by the GA are shown in Table 3. Based on these results, the clustering methods and the GA obtain the same best values for the coupling and cohesion metrics, but the average values obtained by the GA are worse than those of the clustering methods. Also, the computational time of the clustering methods is better than that of the GA.

Table 2: Coupling and cohesion values for the clustering methods and the expert design (coupling: MAC, MMC; cohesion: RCI, TCC)
Algorithm   #Classes   MAC   MMC   RCI     TCC
X-means     14         22    29    0.137   0.102
EM          17         25    29    0.125   0.102
K-means     18         27    23    0.079   0.272
K-means     15         22    29    0.128   0.201
K-means     14         18    29    0.139   0.35
K-means     13         13    24    0.149   0.35
HC          18         27    27    0.097   0.173
HC          15         25    29    0.125   0.098
HC          14         22    29    0.120   0.35
HC          13         18    29    0.129   0.35
Expert      18         27    29    0.005   0.109
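As a rough illustration of how coupling counts of the MAC/MMC flavor in Table 2 can be obtained from a clustering result (the precise metric definitions are those of [8]; this sketch, under hypothetical names, simply counts dependencies whose endpoints fall in different classes):

```python
import numpy as np

def cross_class_count(dep, row_labels, col_labels):
    """Count dependencies in a binary matrix whose two endpoints were
    assigned to different classes (clusters)."""
    rows, cols = np.nonzero(dep)
    return int(sum(row_labels[r] != col_labels[c] for r, c in zip(rows, cols)))

# Toy example: 3 methods, 2 attributes; a MAC-style count over binarized MAR.
MAR = np.array([[1, 0],
                [1, 1],
                [0, 1]])
method_cls = np.array([0, 0, 1])   # class label of each method
attr_cls = np.array([0, 1])        # class label of each attribute
print(cross_class_count(MAR, method_cls, attr_cls))  # -> 1 crossing dependency
```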

Table 3: Results obtained by the clustering methods and the Genetic Algorithm (SD: standard deviation)

Method               Metric                Value
Genetic Algorithm    Coupling (Avg ± SD)   38.8 ± 1.4
                     Coupling (Best)       37
                     Cohesion (Avg ± SD)   0.420 ± 0.09
                     Cohesion (Best)       0.499
                     Time                  41 s
Clustering Methods   Coupling (Avg ± SD)   37.3 ± 0.9
                     Coupling (Best)       37
                     Cohesion (Avg ± SD)   0.472 ± 0.08
                     Cohesion (Best)       0.499
                     Time                  1 ± 0.5 s

Conclusions and Future Works

Class Responsibility Assignment (CRA) is an important and complex activity in object oriented analysis and design. In this paper we addressed CRA as a clustering problem and proposed a clustering based model (Figure 1) for solving it. The proposed model has three main steps: (1) extracting features and generating the data set, (2) clustering the data set, and (3) processing the clustering results and generating the class diagram. Four different clustering methods (X-means, EM, K-means and HC) were used to evaluate the proposed model. Comparing the results of the clustering methods with the expert design reveals that the clustering methods yield promising results. Moreover, comparing the results of the clustering methods with the single-objective Genetic Algorithm reveals that the clustering methods have lower computational time and better average values for the coupling and cohesion metrics.

In future work, we intend to use powerful dynamic clustering methods and extend the feature set to support new aspects of software design.

References

[1] L.C. Briand, J. Daly, and J. Wuest, A Unified Framework for Cohesion Measurement in Object-Oriented Systems, Empirical Software Engineering 3 (1998), 65-117.
[2] C. Larman, Applying UML and Patterns: An Introduction to Object-Oriented Analysis and Design and Iterative Development, Prentice Hall, 2004.
[3] B. Bruegge and A.H. Dutoit, Object-Oriented Software Engineering, Prentice Hall, 2004.
[4] M. Harman, S.A. Mansouri, and Y. Zhang, Search Based Software Engineering: A Comprehensive Analysis and Review of Trends, Techniques and Applications, King's College London, Technical Report TR-09-03 (2009).
[5] O. Raiha, A Survey on Search-Based Software Design, Computer Science Review 4 (2010), 203-249.
[6] M. O'Keeffe and M. Ó Cinnéide, Towards Automated Design Improvement through Combinatorial Optimization, Proceedings of the Workshop on Directions in Software Engineering Environments (2004).
[7] M. O'Keeffe and M. Ó Cinnéide, Search-Based Refactoring for Software Maintenance, Journal of Systems and Software 81 (2008), 502-516.
[8] M. Bowman, L.C. Briand, and Y. Labiche, Solving the Class Responsibility Assignment Problem in Object-Oriented Analysis with Multi-Objective Genetic Algorithms, IEEE Transactions on Software Engineering 36 (2010), 817-837.
[9] G. Glavas and K. Fertalj, Metaheuristic Approach to Class Responsibility Assignment Problem, Proceedings of the International Conference on Information Technology Interfaces (ITI) (2011), 591-596.
[10] O. Seng, J. Stammel, and D. Burkhart, Search-Based Determination of Refactorings for Improving the Class Structure of Object-Oriented Systems, Proceedings of the 8th Annual Conference on Genetic and Evolutionary Computation (2006), 1909-1916.
[11] S. Choi, S. Cha, and C.C. Tappert, A Survey of Binary Similarity and Distance Measures, Journal of Systemics, Cybernetics and Informatics 8 (2010), 43-48.
[12] M. Docherty, Object-Oriented Analysis and Design, John Wiley & Sons Ltd, 2005.
[13] G. Gui and P.D. Scott, Coupling and Cohesion Measures for Evaluation of Component Reusability, Proceedings of the International Workshop on Mining Software Repositories (2006), 18-21.

A Power-Aware Multi-Constrained Routing Protocol for Wireless Multimedia Sensor Networks

Babak Namazi
Amirkabir University of Technology
Department of Electrical Engineering
b namazi@aut.ac.ir

Karim Faez
Amirkabir University of Technology
Department of Electrical Engineering
kfaez@aut.ac.ir

Abstract: Energy efficiency and quality of service (QoS) assurance are challenging tasks in wireless multimedia sensor networks (WMSNs). In this paper we propose a new power-aware routing protocol for WMSNs that supports multi-constrained QoS requirements using localized information. For real-time communication we consider both the delay at the sender nodes and the queuing delay at the receiver. In order to achieve the reliability requirements and energy efficiency, each node dynamically adjusts its transmission power and chooses nodes that have fewer remaining hops towards the sink. A load balancing approach is used to increase lifetime and avoid congestion. Simulation results show that our protocol can support QoS with less energy consumption.

Keywords: WMSN; Routing; Quality of Service; Power Control; Energy Efficiency.

Introduction

Recent advances in CMOS technology have led to a new derivative of sensor-based networks, namely wireless multimedia sensor networks (WMSNs) [1, 2]. WMSNs consist of a large number of sensor nodes equipped with multimedia sensors, capable of retrieving multimedia information from the environment. Due to this ability, WMSNs are gaining great potential in military situations and other video surveillance systems.

Compared with traditional WSNs, designing a routing protocol in a WMSN is more challenging, considering the energy limitations in transmitting such bandwidth-demanding data [3]. In addition, multimedia content needs certain quality of service (QoS) guarantees, such as reliability and real-time delivery. Traffic is also diverse, and different flows may have different requirements. In this work we propose a novel QoS routing protocol with energy consideration, based on traffic differentiation and localized information.

Many researchers have worked on real-time routing protocols for WSNs. For example, SPEED [4] attempts to choose paths that ensure a fixed speed, considering the delay at the sender node. One drawback of SPEED is that it does not support different latency requirements; in addition, the delay at the receiver is not taken into account. MMSPEED [5] defines multi-speed routing with different routing layers, each supporting a different speed. However, energy efficiency is not addressed directly in either of these protocols. RPAR [6] pioneered incorporating energy consumption into real-time communication; it achieves the required end-to-end delay at low power by dynamically adjusting transmission power. None of the above mentioned protocols consider the hop count for minimizing latency. In our protocol we consider both transmission and queuing delay and use a power control approach to guarantee the required end-to-end latency over the fewest hops towards the sink.

In order to support QoS in the reliability domain, different approaches have been used in the literature. One of them is to send duplicated packets towards different nodes; MMSPEED uses this approach to achieve a higher packet reception ratio. For more energy efficiency, EARQ [7] selects just two paths and sends the duplicated packet over the alternative path. To avoid congestion near the sink, LOCALMOR [8] uses a single-path multi-sink approach and sends a copy of the packet to a secondary sink. All of these methods decrease the lifetime of the network. REP [9] instead uses a power allocation protocol to guarantee the needed reliability, since increasing transmission power results in a higher SINR. It divides the area into many concentric coronas, randomly chooses a node from the corona nearer to the sink, and increases transmission power until the requirement is met. We use a novel HELLO message approach to find the exact remaining hops to the sink and select nodes which have better link quality, needing less increase in transmission power.

The rest of this paper is organized as follows: Section 2 gives the network model and assumptions. The proposed protocol is described in Section 3, and its performance is evaluated in Section 4. Finally, Section 5 concludes the paper.

Corresponding author: T: (+98) 912 5491108

System Model

In this paper we adopt a WMSN formed by a large number of multimedia sensors randomly deployed in an environment to collect information. All of the sensor nodes have the same specifications, except for the sink node, which has no energy limits. Upon receiving each data packet, nodes can calculate the bit error rate from the SINR of the signal, based on the modulation used [10]. We also assume that all of the nodes in the network know the current geographic coordinates of themselves and of the sink node, and are stationary during the network lifetime.

Like LOCALMOR, we define four classes of packets based on two criteria: whether they are high or low priority (reliability), and whether or not they are real-time. High priority packets can stand for I-frame packets and low priority packets for P-frame packets in a video stream. The type of the packet is specified at the application layer.

Most transceivers support different transmission powers. To calculate energy consumption we use the model proposed in [9]. The energy consumed by a transmitter sending data of size $f$ is:

$E^{tx}(P^{tx}) = \frac{8f}{R}\left(P^{cir} + \frac{P^{tx}}{\eta}\right)$  (1)

where $P^{tx}$ is the transmission power, $P^{cir}$ is the circuit power, $R$ is the bandwidth and $\eta$ is the conversion efficiency of the power amplifier. The energy consumption of the receiver is:

$E^{rx} = \frac{8f}{R} P^{rx}$  (2)

in which $P^{rx}$ is the received power. We assume that all of the nodes are capable of changing their transmission power in order to achieve the network QoS requirements.

Protocol Overview

The protocol uses local information and selects the forwarding nodes according to their ability to fulfill a flow's latency and reliability requirements. It has four components: (1) neighbor management, which is responsible for gathering neighboring information and managing the routing table; (2) latency estimation, which calculates the latency of a forwarding node; (3) reliability estimation, which estimates the quality of the available links and finds the best transmission power; and (4) geographic forwarding, which defines the forwarding policy.

3.1 Neighbor Management

Each node should be aware of its neighboring nodes' status, including their position, remaining energy, link quality, remaining hops to the sink (level) and queue state. Like other state-of-the-art localized routing protocols, we use HELLO packets to exchange this information; however, instead of all nodes sending HELLO messages simultaneously, the sink initializes the HELLO message containing its information, labeled as level zero. Upon receiving the first HELLO packet, each node labels itself with the next level and broadcasts its own information. Using this method, all of the nodes know their level and announce it to their neighbors. The sink node does this at fixed intervals, and after each reception nodes add an entry to their routing table, including: the node's distance to the sink, level, remaining energy, speed and required transmission power. We discuss the last two in more detail later in this section.

3.2 Latency Estimation

Two types of delay may occur in these networks: delay at the sender and delay at the receiver. Delay at the sending nodes mostly depends on the MAC parameters used; at the receiving nodes, delay is due to queuing. Propagation delay is usually ignored. In order to estimate the transmission delay, each node timestamps the packet after receiving it from the application layer or from other nodes, and again after receiving the ACK packet at the MAC layer. The transmission delay is computed as follows:

$d^{tr} = t_{rec} - t_{sent} - t_{ack}$  (3)

in which $t_{rec}$ is the time the packet is received at the routing layer, $t_{sent}$ is the time the node receives the ACK packet and $t_{ack}$ is the time consumed for transmitting an ACK packet at the receiver.

The transmission delay may vary because of changes in network parameters; one reason might be variations in the transmission power level. In order to account for past delays, we use the EWMA method for estimating the transmission delay. The queuing delay, $d^{q}$, is computed at the receiver and exchanged between nodes via HELLO packets; we use the moving average approach for this delay too. Having the different kinds of delay, each node can estimate the velocity of its neighbors and compare it to the required velocity of the packet to be transmitted. The velocity of a neighboring node $j$ is computed as:

$v_j = \frac{dis_{id} - dis_{jd}}{d^{tr} + d^{q}}$  (4)

in which $dis_{id}$ is the distance between the node itself and the sink node and $dis_{jd}$ is the distance between the neighboring node and the destination. The required velocity is:

$v_{req} = \frac{dis_{id} - dis_{jd}}{t_{dl} - t_e}$  (5)

where $t_{dl}$ denotes the deadline of the packet and $t_e$ is the delay the packet has experienced upon reception at this node; it is a part of the packet's header. Each node keeps a set of neighbors fulfilling the latency requirement, called fast neighbors, consisting of the nodes meeting the condition $v_j > v_{req}$. The forwarding node is chosen from this set.

3.3 Reliability Estimation

Considering reliability requirements, each flow can tolerate a certain bit error rate (BER). Increasing transmission power is the solution we use to achieve the required BER. Due to the wireless nature of the medium, the transmitted signal can be received by all nodes in the range of the sender, and this signal counts as interference at nodes that are not the packet's destination. On the other hand, increasing transmission power yields a higher SINR value, resulting in lower BERs. So finding the most efficient transmission power is very important in our protocol.

In order to find the best transmission power level satisfying our reliability needs, we assume all nodes start from an initial power level, at which they begin transmitting HELLO packets. Each node has a table containing the BER of the packets received from a specific source and the power at which each packet was transmitted (the transmission power of a packet is part of its header). This table is updated after receiving a packet and is disseminated in the HELLO packets. After the HELLO packets, each node knows the BER of the packets it sent to a specific node and at which power level. If the BER value is more than the desired value, the node increases its transmission power; otherwise it decreases it in order to have better energy efficiency. During the next interval, packets are transmitted to that specific node at this level.

The required BER is estimated locally based on the hop count of the packet and the remaining hops to the sink node:

$BER_{ij} = \sqrt[h]{BER_{req}}$  (6)

in which $BER_{ij}$ is the BER of the packet transmitted from node $i$ to node $j$, and $BER_{req}$ is the required end-to-end BER. $h$ is the total hop count from the source to the destination and is computed by adding the hop count of the packet and the level of the node. By this method the needed reliability is met locally, and it is expected that after a few HELLO packet intervals, all of the nodes know the required transmission power for both high priority and low priority packets, for all of the nodes in their fast neighbors set.

3.4 Geographic Forwarding

After finding the eligible nodes satisfying the latency requirement and their corresponding transmission powers, each node selects the most energy efficient node as the next hop. To do that, we consider both the energy cost of the forwarding nodes and their residual energy. In addition, selecting just one path may cause energy depletion of the nodes along that path. To avoid this, we give a score to all of the eligible nodes based on their residual energy, energy cost and remaining hops to the sink. The nodes having a lower level are selected first, and if their number is less than two, nodes having the same level are added to them. These nodes are sorted and chosen based on their score, i.e. a higher probability is given to the best neighbors. The energy cost is computed as:

$cost = \frac{E^{tx}}{dis_{id} - dis_{jd}}$  (7)

If the energy cost is small, the forwarding node is more energy efficient. Using the proposed forwarding policy, the load is distributed over the nodes satisfying the QoS needs; therefore the lifetime of the network is increased and less congestion occurs. Another advantage is that nodes having higher scores have higher priority and are selected more often.
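A minimal Python sketch of the resulting forwarding decision (our own illustration, not the authors' implementation; field names such as d_tr, d_q and E_tx are shorthand for the quantities in eqs. (4), (5) and (7)): equations (4) and (5) filter the fast neighbors, and equation (7) ranks them by energy cost with the node level as the primary key. The residual-energy part of the score and the probabilistic selection are omitted for brevity.

```python
def choose_next_hop(node, neighbors, deadline, elapsed):
    """Pick a next hop among 'fast' neighbors (v_j > v_req), preferring a
    lower level (fewer remaining hops) and a lower energy cost (eq. (7))."""
    candidates = []
    for nb in neighbors:
        progress = node["dist_to_sink"] - nb["dist_to_sink"]
        if progress <= 0:
            continue                                  # no progress towards sink
        v_j = progress / (nb["d_tr"] + nb["d_q"])     # eq. (4)
        v_req = progress / (deadline - elapsed)       # eq. (5)
        if v_j > v_req:                               # fast-neighbor condition
            candidates.append((nb["level"], nb["E_tx"] / progress, nb))
    if not candidates:
        return None
    candidates.sort(key=lambda c: (c[0], c[1]))
    return candidates[0][2]

# Toy usage with two hypothetical neighbors (distances in m, delays in s):
node = {"dist_to_sink": 70.0}
neighbors = [
    {"dist_to_sink": 40.0, "d_tr": 0.02, "d_q": 0.01, "E_tx": 0.8, "level": 2},
    {"dist_to_sink": 55.0, "d_tr": 0.01, "d_q": 0.01, "E_tx": 0.5, "level": 3},
]
print(choose_next_hop(node, neighbors, deadline=0.3, elapsed=0.05))
```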

Simulation Results

To evaluate the performance of the proposed protocol we used the Castalia-3.2 simulator [11]. Castalia is a discrete event simulator designed for simulating wireless sensor networks. The simulation configuration consists of 36 nodes randomly deployed in a 100 x 100 m^2 terrain. The 802.11 MAC protocol (with RTS/CTS packets) is used, and a node is selected randomly to send its packets to the sink node. The traffic consists of all four packet classes, and the simulation time is 600 seconds.

The performance metrics used are average energy consumption, average end-to-end delay and BER. We compare our protocol, hereafter called PMCR (Power-aware Multi-Constrained Routing), with the LOCALMOR protocol on these metrics.

Figure 1: Delay and energy consumption with $T_{dl} = 0.3$ s ((a) end-to-end delay (ms) and (b) energy consumption, versus the required BER).

Fig. 1(a) shows the average end-to-end delay under different BER requirements for high priority packets and $T_{dl} = 0.3$ s. The average energy consumption for this situation is shown in Fig. 1(b). It can be seen that our protocol uses less energy than the LOCALMOR protocol.

Figure 2: BER and energy consumption with $BER_{req} = 0.1$ ((a) packet BER and (b) energy consumption, for different time deadlines).

The packet BER for different time deadlines is shown in Fig. 2(a), and the corresponding energy consumption is shown in Fig. 2(b). It is obvious that the proposed protocol provides more reliability with less energy consumption.

Conclusion

We proposed a novel localized routing protocol for WMSNs. The protocol takes into account the QoS and traffic diversity needed for transmitting multimedia data in such resource limited networks. Simulation results show that our protocol outperforms other protocols such as LOCALMOR in terms of reliability, end-to-end delay and energy consumption.

References

[1] I.F. Akyildiz, T. Melodia, and K.R. Chowdhury, A Survey on Wireless Multimedia Sensor Networks, Computer Networks (Elsevier) 51, no. 4 (2007), 921-960.
[2] S. Misra, M. Reisslein, and G. Xue, A Survey of Multimedia Streaming in Wireless Sensor Networks, IEEE Communications Surveys and Tutorials 10 (2008), 18-39.
[3] S. Ehsan and B. Hamdaoui, A Survey on Energy-Efficient Routing Techniques with QoS Assurances for Wireless Multimedia Sensor Networks, IEEE Communications Surveys and Tutorials (early access) (2011).
[4] T. He, J.A. Stankovic, C. Lu, and T.F. Abdelzaher, A Spatiotemporal Communication Protocol for Wireless Sensor Networks, IEEE Transactions on Parallel and Distributed Systems 16, no. 10 (2005), 995-1006.
[5] E. Felemban, C. Lee, and E. Ekici, MMSPEED: Multipath Multi-SPEED Protocol for QoS Guarantee of Reliability and Timeliness in Wireless Sensor Networks, IEEE Transactions on Mobile Computing 5, no. 6 (2006), 738-754.
[6] O. Chipara, Z. He, G. Xing, Q. Chen, X. Wang, C. Lu, J.A. Stankovic, and T.F. Abdelzaher, Real-time Power-aware Routing for Sensor Networks, in Proc. 14th IEEE International Workshop on Quality of Service (IWQoS 2006), New Haven, CT (June 2006).
[7] J. Heo, J. Hong, and Y. Cho, EARQ: Energy Aware Routing for Real-time and Reliable Communication in Wireless Industrial Sensor Networks, IEEE Transactions on Industrial Informatics 5 (2009), 3-11.
[8] D. Djenouri and I. Balasingham, Traffic-Differentiation-Based Modular QoS Localized Routing for Wireless Sensor Networks, IEEE Transactions on Mobile Computing 10 (2011), 797-809.
[9] K. Lin and M. Chen, Reliable Routing Based on Energy Prediction for Wireless Multimedia Sensor Networks, IEEE GLOBECOM (2010), 1-5.
[10] A.F. Molisch, Wireless Communications, John Wiley and Sons, 2011.
[11] Castalia User Manual, http://castalia.npc.nicta.com.au/, 2011.


Mobile Learning: Features, Approaches and Opportunities

Faranak Fotouhi-Ghazvini
Department of Computer Engineering
University of Qom
faranak fotouhi@hotmail.com

Ali Moeini
Faculty of Engineering
Tehran University
moeini@ut.ac.ir

Abstract: Mobile learning is a new paradigm of learning that takes place in a meaningful context, involves exploration and investigation, and includes opportunities for social dialogue and interaction where learners have access to appropriate resources. The learning process can be supported by the use of the mobile phone in a responsive manner, by means of context aware hardware and technologies that facilitate interaction and conversation. This mode of learning can enhance and improve learning, teaching and assessment. In this article we discuss the distinctive features of mobile learning, different approaches to mobile learning on different continents, the advancement of portable devices and its implications, and mobile learning in Iran.

Keywords: Mobile Learning; Mobile Games; Game-Based Learning; Augmented Reality.

Introduction

During the last decade, mobile learning (m-learning), a new kind of e-learning, has been introduced, in which the power of wireless technologies is used in an educational context. Compared to traditional e-learning it is more personal, always connected to communication tools, portable, cheap and available to the public. M-learning consumers are mobile, so learning can take place ubiquitously, anywhere. Education protocols which employ this method have aided many aspects of learning, such as motivation [1], autonomy [2], interaction and collaboration [3, 4], self-esteem [5], social skills [6], accessibility [7], and language acquisition [7]. It has been especially effective in the teaching of disadvantaged students in developing countries [8, 9].

Distinct Features of M-Learning

M-learning has three main characteristics: (1) mobility, (2) context awareness and (3) the ability to communicate. Sharples [10] defines mobility as: (a) mobility in physical space, since learning is not bound to a classroom; (b) mobile hardware, such as Bluetooth, GPS, camera and WiFi, all integrated in a compact portable device; (c) mobility in social space, since a learner can form different ad hoc groups during the day for collaborative learning; and (d) mobility in time, since learning can be distributed over different times according to the learner's preferences.

Another characteristic of mobile learning is context awareness: it can collect environmental data simultaneously or at the learner's command, to help him/her better analyze and apprehend educational material that depends on the physical world. The data is usually collected using devices such as GPS, compass, Bluetooth, camera, accelerometer and gyroscope.

The next characteristic of m-learning is that it always has communication tools available, such as phone calls, SMS, MMS and mobile internet. These features facilitate the process of learning between students and teachers when they are located in different physical spaces.

M-learning across Continents

Europe and Japan are far ahead of other countries in taking advantage of mobile phone features. They have used SMS in mobile commerce, thus forming a rich communication ecosystem with clients. Many m-learning research projects have taken place in Europe [11-15]. These projects have played a major role in shaping and developing mobile learning theories and techniques, and the homogeneous mobile communication system in Europe has provided each project with a big market. In North America, on the other hand, the lack of homogeneity in the implementation of third generation mobile communication systems caused the late blooming of m-learning; at present, m-learning applications include game simulation environments that incorporate technologies such as GPS, WiFi and Bluetooth [16].

In Asia, the spread of cheap mobile phones with a high quality of service is steadily rising, and m-learning has proven to be a cost effective method of teaching and learning. In Bangkok, text messaging has been used to participate in examinations [17]. In Japan, the mobile web has been used for English language learning [18]. In the Philippines, text messaging has been used to teach English, Mathematics and Science [19]. In Taiwan, PDAs have been used for collaborative learning [20] on field trips. In Hong Kong, mobile web 2.0 has introduced new opportunities for teaching and learning [21]. Ambient Insight [22] has predicted that by 2015 the leading consumers of m-learning applications will be America, China, India, Indonesia and Brazil.

In Africa, different projects have been carried out with the aid of British universities. For example, [23] and [24] are two projects that have been led by the UK Open University; in these projects PDAs have been used for studying electronic books and playing educational films and audio clips. Text messaging has been used for assessing Kenyan students; this was devised and implemented by Wolverhampton University [25].

In Australia, m-learning has had a slower pace compared to European countries; currently two projects [26], [27] are designing educational applications for tablets that could be used in schools.

M-learning in the Middle East has had limited accomplishments. However, it is moving towards the use of Java applications and online electronic materials [28].

Current Market Changes of Mobile Devices

Most new mobile phones take advantage of high speed third generation mobile systems for video conferencing and web surfing. The storage capacity, processing power and memory of devices have increased considerably. Electronic components such as GPS, accelerometers, compass, high resolution camera and proximity sensors, all integrated in a single device, have increased mobile phones' capabilities compared to PCs.

According to Gartner's predictions [29], in the next few years the Android operating system will become the most popular mobile phone operating system and will dominate over half of the market; iPhone and Nokia Windows will take second and third place, respectively.

These hardware and software advances have given birth to Augmented Reality (AR), which combines and incorporates real world data in virtual environments simulated on mobile devices. AR adds a layer of information in the form of text, graphics and voice when the mobile application accesses the mobile phone's camera. AR is capable of transforming educational spaces that have no connection to the subject being taught into a dynamic space that simulates an authentic learning environment and increases the learner's motivation [30]. Game based learning has grown in recent years, and different studies have shown that such games are suitable for learning [31]. With recent mobile phone advances, many teachers have used mobile games as an important teaching tool in the classroom to increase learners' interest and problem solving [31].

There are other mobile devices on the market, such as the Palm PC, iPod, iPad, ebook readers, Nintendo DS and PlayStation. Forrester's analysts believe that tablet PCs, with their small size, consisting only of a flat touch screen with a display of less than 9 inches, are the main players [32] amongst the wide spectrum of mobile devices. The display screen of tablets, compared to mobile phones, will provide a more complete learning experience, and universities could use tablets instead of printed books as part of their green projects.

Corresponding author: P. O. 37161466119, T: (+98) 09127506309

Implementation of Mobile Learning in Iran

There are many challenges facing mobile learning in Iran. M-learning has not been officially recognized as a mode of learning by the Ministry of Education. Research has been carried out on a limited scale at universities, mainly theoretical or as MSc projects; without secure financial support, however, it has not been implemented on a large scale. Iranian users often have mobile phones with lower capabilities compared to European users; furthermore, Iran uses the second generation of mobile communication systems, which is slower than the most recent generations. Nevertheless, mobile games seem to be a more practical solution because of their ease of implementation and lower dependency on infrastructure.

A few educational games have been implemented and tested in Iran. They are as follows:

(1) Adventure quiz games [36]: this game consisted of an attractive environment and fun characters. Players had a chance to win or lose according to a series of questions asked by the game characters. This game did not result in any cognitive changes, mainly because the questions did not relate to the game story and were considered annoying [36].

(2) MOBO City [37]: an adventure game set in the fantasy world of a computer's motherboard. Electronic components were depicted as different buildings in MOBO City. The main character was a bus that carried data from a starting location, often a computer's electronic port, to a certain location such as the computer's monitor. The player had to protect the data from computer viruses flying in a spaceship. During the game, when players passed a certain location or achieved a certain goal, they were presented with an appropriate Technical English vocabulary item. This game was effective in teaching vocabulary meaning [37].

(3) Detective Alavi: the first AR game implemented in Iran [38]. In this game, university students constantly moved between real and virtual spaces with the help of the mobile phone's graphical interface, two-dimensional Quick Response (QR) codes, Bluetooth and the camera. The game took place in a university, under the supervision of the lecturer, and its virtual space simulated a computer lab. The game experience was considered highly rich and motivating by students learning Technical English vocabulary. It also assisted the lecturer in teaching by presenting the material, involving the students in high level cognitive processes and assessing the students' work using an advanced scoring system [38].

Conclusion

Recent advances in mobile technologies and mobile phone capabilities provide a bright future for mobile learning. Current research has shown that combining game based learning and augmented reality can be very effective in mobile education. These games can result in a more fulfilling experience by placing students in an authentic environment where they can apply their knowledge in an unthreatening setting. Incorporating them in the classroom helps teachers present and assess the educational material in less time and in a more organized manner.
References

[1] J.L. Shih, C.W. Chuang, and G.J. Hwang, An Inquiry-based Mobile Learning Approach to Enhancing Social Science Learning Effectiveness, Educational Technology and Society 13/4 (2010), 50-62.
[2] C. White, Learner Autonomy and New Learning Environments, Language Learning and Technology 15/3 (2011), 1-3.
[3] J. Attewell, From Research and Development to Mobile Learning: Tools for Education and Training Providers and their Learners, Proceedings of mLearn 2005 (2005), available from: http://www.mlearn.org.za/CD/papers/Attewell.pdf.
[4] D. Corlett and M. Sharples, Tablet Technology for Informal Collaboration in Higher Education, Proceedings of MLEARN 2004: Mobile Learning Anytime Everywhere, London, UK: Learning and Skills Development Agency (2004), 59-62.
[5] M. Hansen, G. Oosthuizen, J. Windsor, I. Doherty, S. Greig, K. McHardy, and L. McCann, Enhancement of Medical Interns' Levels of Clinical Skills Competence and Self-Confidence Levels via Video iPods: Pilot Randomized Controlled Trial, Journal of Medical Internet Research 13/1 (2011), e29.
[6] M. Joseph, C. Branch, C. March, and S. Lerman, Key Factors Mediating the Use of a Mobile Technology Tool Designed to Develop Social and Life Skills in Children with Autistic Spectrum Disorders, Computers and Education 58/1 (2011), 53-62.
[7] F. Fotouhi-Ghazvini, R.A. Earnshaw, A. Moeini, D. Robison, and P.S. Excell, From E-Learning to M-Learning: The Use of Mixed Reality Games as a New Educational Paradigm, International Journal of Interactive Mobile Technologies (IJIM) 5/2 (2011), 17-25.
[8] A. Kumar, A. Tewari, G. Shroff, D. Chittamuru, M. Kam, and J. Canny, An Exploratory Study of Unsupervised Mobile Learning in Rural India, Proceedings of the 28th International Conference on Human Factors in Computing Systems (CHI '10) (2010), 743-752.
[9] S.S. Nash, Blended Mobile Learning in Developing Nations and Environments with Variable Access: Three Cases, Mobile Information Communication Technologies Adoption in Developing Countries: Effects and Implications (2010), 91-102.
[10] M. Sharples, M. Milrad, I. Arnedillo, and G. Vavoula, Mobile Learning: Small Devices, Big Issues, Technology-Enhanced Learning: Principles and Products, Springer Netherlands (2009), 233-249.
[11] Handler, http://www.eee.bham.ac.uk/handler/default.asp.
[12] MLearn, www.mobilearn.org/mlearn2004/presentations.htm.
[13] M-learning, http://www.m-learning.org/.
[14] Leonardo, http://www.leonardo.org.uk/.
[15] Molenet, http://www.molenet.org.uk/.
[16] E. Klopfer, Augmented Learning: Research and Design of Mobile Educational Games, The MIT Press, 2008.
[17] K. Whattananarong, An Experiment in the Use of Mobile Phones for Testing at King Mongkut's Institute of Technology (2005), http://seameo.org/vl/krismant/mobile04.pdf.
[18] P. Thornton and C. Houser, Using Mobile Phones in Education, 2nd IEEE International Workshop on Wireless and Mobile Technologies in Education (2004).
[19] A. Ramos, J. Trinona, and D. Lambert, Viability of SMS Technologies for Non-formal Distance Education, Information and Communication Technology for Social Development (2006), 69-80.
[20] P.P. Luo, C.H. Lai, and D. Lambert, Mobile Technology Supported Collaborative Learning in a Fieldtrip Activity, Technology Enhanced Learning (2009).
[21] 2011 International Conference on ICT in Teaching and Learning (15th HK Web Symposium), 11-13 July 2011, Hong Kong SAR, http://ict2011.com/page.
[22] S.S. Adkins, The Worldwide Market for Mobile Learning Products and Services: 2010-2015 Forecast and Analysis (2010), 1-21, available from: http://www.ambientinsight.com/Resources/Documents/Ambient-Insight-2010-2015-US-Mobile-Learning-Market-Executive-Overview.pdf.
[23] http://www.open.ac.uk/deep.
[24] http://www.bridges.org/ipaq competition.
[25] www.wlv.ac.uk/.
[26] http://delphian.com.au.
[27] http://www.apac.studywiz.com/.
[28] R. Belwal and S. Belwal, Mobile Phone Usage Behavior of University Students in Oman, New Trends in Information and Service Science (2009), 954-962.
[29] P. Christy and H. Stevens, Gartner Says Android to Command Nearly Half of Worldwide Smartphone Operating System Market by Year-End 2012 (2011), http://www.gartner.com/it/page.jsp?id=1622614.
[30] H. Tarumi, Y. Tsujimoto, T. Daikoku, F. Kusunoki, S. Inagaki, M. Takenaka, and T. Hayashi, Balancing Virtual and Real Interactions in Mobile Learning, International Journal of Mobile Learning and Organisation 5/1 (2011), 28-45.
[31] C.L. Holden and J.M. Sykes, Leveraging Mobile Games for Place-Based Language Learning, International Journal of Game-Based Learning 1/2 (2011), 1-18.
[32] J. Johnson, Tablets To Overtake Desktop Sales By 2015, Laptops Will Still Reign (2010), http://www.inquisitr.com/76157/tablets-to-overtake-desktop-sales-by-2015-laptops-will-still-reign.
[33] S. Papert, The Children's Machine: Rethinking School in the Age of the Computer, Basic Books, New York, 1993.
[34] C.N. Quinn and R. Klein, Engaging Learning: Designing e-Learning Simulation Games, Pfeiffer: John Wiley and Sons, Inc., 2005.
[35] G.A. Gunter, R.F. Kenny, and E.H. Vick, Taking Educational Games Seriously: Using the RETAIN Model to Design Endogenous Fantasy into Standalone Educational Games, Educational Technology Research and Development 56/5 (2008), 511-537.
[36] F. Fotouhi-Ghazvini, A. Moeini, D. Robison, R.A. Earnshaw, and P.S. Excell, A Design Methodology for Game-based Second Language Learning Software on Mobile Phones, Proceedings of Internet Technologies and Applications, Wrexham, North Wales (2009), 609-618.
[37] F. Fotouhi-Ghazvini, R.A. Earnshaw, D. Robison, and P.S. Excell, The MOBO City: A Mobile Game Package for Technical Language Learning, International Journal of Interactive Mobile Technologies 3/2 (2009), 19-24.
[38] F. Fotouhi-Ghazvini, R.A. Earnshaw, D. Robison, A. Moeini, and P.S. Excell, Using a Conversational Framework in Mobile Game based Learning: Assessment and Evaluation, Communications in Computer and Information Science, Springer-Verlag Berlin Heidelberg 177 (2011), 200-213.

Predicting Crude Oil Price Using a Particle Swarm Optimization (PSO) Based Method

Zahra Salahshoor Mottaghi
Faculty of Engineering
Department of Computer Engineering
zsalahshoor@msc.guilan.ac.ir

Ahmad Bagheri
Faculty of Engineering
Department of Mechanical Engineering
bagheri@guilan.ac.ir

Mehrgan Mahdavi
Faculty of Engineering
Department of Computer Engineering
mahdavi@guilan.ac.ir

Abstract: Oil is a strategic commodity in the entire world. The oil price is always changing; the change is rapid and predicting it is difficult. How to predict the future price of oil is therefore one of the major issues in this industry. In this paper, a Particle Swarm Optimization (PSO) based method is proposed to predict the price of oil for the upcoming 4 months. PSO is a population-based optimization method that was inspired by the flocking behavior of birds and human social interactions. The proposed equation has 13 dimensions and 4 variables; these variables are the prices of petroleum in the past 4 months. The experimental results indicate that the proposed approach can predict the monthly petroleum price with a 3.5 dollar difference on average.

Keywords: Crude Oil Price; Particle Swarm Optimization; Predicting; Forecasting.

Introduction

Prediction is an estimate, or a number of quantitative estimates, about the likelihood of future events, developed using current and past data. Predictions are used as a guide for public and private policies, because decision making is not possible without predictive knowledge. For thousands of years oil has had an important role in people's lives. Not only is it the main source of the world's energy, but it is also very hard to find a product that does not need oil in its production or distribution. Hence, predicting the oil price is considered a hot topic in this industry. In this paper, the oil price is predicted by a PSO based method.

PSO is one of the intelligent algorithms and is well suited to optimization. Kennedy and Eberhart were inspired by the life of birds and fish [1]. The algorithm has good speed and accuracy and can solve engineering problems well. Here, a PSO based method is used to predict the oil price; the results show that this method has good ability in forecasting the medium-term crude oil price.

Many studies have predicted the oil price, such as integrating text mining and neural networks in forecasting the oil price [2]; Junyou proposed a method for forecasting stock prices using PSO-trained neural networks [3]; and Abolhassani introduced a method for forecasting stock prices using PSO-SVM [4].

This paper is organized as follows: the PSO algorithm is described in Section 2. Section 3 presents the PSO based method for predicting the oil price, the evaluation results are given in Section 4, and Section 5 concludes.

Corresponding author: P.O. Box 3756, F: (+98) 131 669-0271, T: (+98) 131 669-0270

Particle Swarm Optimization

Particle swarm optimization is a population-based evolutionary algorithm and is similar to other population-based evolutionary algorithms. PSO is motivated by the simulation of social behavior rather than by survival of the fittest [1].

In PSO, each candidate solution is associated with a velocity [5]. The candidate solutions are called particles, and the position of each particle is changed according to its own experience and that of its neighbors (its velocity). It is expected that the particles will move toward better solution areas. Mathematically, the particles are manipulated according to the following equations:

v_i(t+1) = w v_i(t) + C1 r1 (x_pbest_i - x_i(t)) + C2 r2 (x_gbest - x_i(t))    (1)

x_i(t+1) = x_i(t) + v_i(t+1)    (2)

where x_i(t) and v_i(t) denote the position and velocity of particle i at time step t, and r1, r2 are random values between zero and one. C1 is the cognitive learning factor and represents the attraction that a particle has toward its own success. C2 is the social learning factor and represents the attraction that a particle has toward the success of the entire swarm. w is the inertia weight, which is employed to control the impact of the previous history of velocities on the current velocity of a given particle. The personal best position of particle i is x_pbest_i, and x_gbest is the position of the best particle of the entire swarm. Here, w is 0.4 and C1, C2 are both 2.

The Proposed Method

Identifying and applying the various parameters that influence the oil price, from its past and present status, can be very effective in making accurate predictions. Parameters such as the dollar price and inflation in America can affect the quantity of interest.

In this paper, the monthly oil price from past years has been used to predict the next 4 months. These data are divided into two parts, training and testing data, with the test data split into three 4-month periods. Data normalization was then performed on the data with formula (3), so that all values lie between zero and one. The fitness function of the PSO algorithm is the total squared error between the actual and predicted values, shown in equation (4):

x_n = (x_R - x_min) / (x_max - x_min)    (3)

where x_n is the normalized value, x_max and x_min are the maximum and minimum of the data, and x_R is the data value that must be normalized.

F(x) = Σ_{i=1}^{n} (E_actual - E_predicted)^2    (4)

where E_actual is the real value of the oil price, E_predicted is the predicted oil price, and n is the number of data points. Formula (5) estimates the predicted value for the first future month, and formulas (6), (7) and (8) estimate the second, third and fourth future months, respectively; they are the same except for their last term. In training mode, the past 4 months of oil prices are used to learn the model, but in test mode the fixed prices of the previous (or past) 4 months are used to calculate each of the next 4 months. The proposed method is shown in Figure 1.

E_predicted,1st month = w1 x_{i+3}^{w2} + w3 x_{i+2}^{w4} + w5 x_{i+1}^{w6} + w7 x_i^{w8} + w9 x_{i+3} x_{i+2} + w10 x_{i+3} x_{i+1} + w11 x_{i+3} x_i + w12 x_{i+2} x_{i+1} + w13 x_{i+3}^4 x_{i+1}^6    (5)

E_predicted,2nd month = w1 x_{i+3}^{w2} + w3 x_{i+2}^{w4} + w5 x_{i+1}^{w6} + w7 x_i^{w8} + w9 x_{i+3} x_{i+2} + w10 x_{i+3} x_{i+1} + w11 x_{i+3} x_i + w12 x_{i+2} x_{i+1} + w13 x_{i+3}^4    (6)

E_predicted,3rd month = w1 x_{i+3}^{w2} + w3 x_{i+2}^{w4} + w5 x_{i+1}^{w6} + w7 x_i^{w8} + w9 x_{i+3} x_{i+2} + w10 x_{i+3} x_{i+1} + w11 x_{i+3} x_i + w12 x_{i+2} x_{i+1} + w13 x_{i+3}^4 x_{i+1}    (7)

E_predicted,4th month = w1 x_{i+3}^{w2} + w3 x_{i+2}^{w4} + w5 x_{i+1}^{w6} + w7 x_i^{w8} + w9 x_{i+3} x_{i+2} + w10 x_{i+3} x_{i+1} + w11 x_{i+3} x_i + w12 x_{i+2} x_{i+1} + w13 x_{i+3} x_{i+2} x_{i+1}^{1.9}    (8)

Here the weights w_k are the dimensions of a particle, obtained by the PSO algorithm in the training phase, and i indexes the data. The algorithm is run for 100 iterations at each stage, with 36 particles. The oil price is predicted by the use of equations (5), (6), (7) and (8) in the three periods.
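As an illustration of how equations (1)-(5) fit together, the following minimal Python sketch runs PSO over the 13 weights of the first-month model (5) with the squared-error fitness (4) on normalized data (3). The swarm size (36), iteration count (100), w = 0.4 and C1 = C2 = 2 follow the text; the function names and the positive weight-initialization range are illustrative assumptions, not the authors' code.

```python
import numpy as np

def normalize(x):
    # Equation (3): map raw prices into [0, 1]
    return (x - x.min()) / (x.max() - x.min())

def predict_first_month(w, x):
    # Equation (5); x = [x_i, x_{i+1}, x_{i+2}, x_{i+3}], normalized
    return (w[0] * x[3] ** w[1] + w[2] * x[2] ** w[3] + w[4] * x[1] ** w[5]
            + w[6] * x[0] ** w[7] + w[8] * x[3] * x[2] + w[9] * x[3] * x[1]
            + w[10] * x[3] * x[0] + w[11] * x[2] * x[1]
            + w[12] * x[3] ** 4 * x[1] ** 6)

def fitness(w, windows, targets):
    # Equation (4): total squared error over the training windows
    preds = np.array([predict_first_month(w, x) for x in windows])
    return np.sum((targets - preds) ** 2)

def pso(windows, targets, dim=13, particles=36, iters=100,
        w_inertia=0.4, c1=2.0, c2=2.0, seed=0):
    rng = np.random.default_rng(seed)
    pos = rng.uniform(0.1, 2.0, (particles, dim))  # positive start keeps powers real
    vel = np.zeros_like(pos)
    pbest = pos.copy()
    pbest_f = np.array([fitness(p, windows, targets) for p in pos])
    gbest = pbest[pbest_f.argmin()].copy()
    for _ in range(iters):
        r1, r2 = rng.random((2, particles, 1))
        # Equations (1) and (2): velocity and position updates
        vel = w_inertia * vel + c1 * r1 * (pbest - pos) + c2 * r2 * (gbest - pos)
        pos = pos + vel
        f = np.array([fitness(p, windows, targets) for p in pos])
        improved = f < pbest_f
        pbest[improved], pbest_f[improved] = pos[improved], f[improved]
        gbest = pbest[pbest_f.argmin()].copy()
    return gbest
```

Models (6)-(8) differ only in the last term of predict_first_month, so one run of the same loop per horizon yields the four predictors.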


Table 1: Actual and Predicted Values of Test Data for the Proposed Method.

Figure 1: The Proposed PSO-Based Method for Forecasting the Oil Price (n is the number of training data).

Experimental Results

In order to evaluate the proposed method, it was applied to an oil data set containing 351 records from 1982 to 2011, which are available at www.ioga.com. MATLAB 2011 was used to implement the method. The numbers of training data for the first, second and third periods are 339, 343 and 347, respectively. The experimental results for the three periods are shown in Figure 2 and Table 1. The proposed method predicts the monthly oil price with a 3.5 dollar difference on average. Since the initial population in PSO is selected randomly, the averages of 4 applications of the algorithm per month are used in these results.

Discussion and Future Work

Oil is important in the international economy, so the forecasting of oil prices is essential in a country's planning. In this paper, the monthly price of petroleum is forecast with a new method based on Particle Swarm Optimization. The results revealed acceptable performance of this method. The method can also be used to predict other prices, such as the gold price.

Figure 2: The Forecasting Result of the Proposed Method in three periods (actual and predicted oil price, in dollars per barrel, by month).

References
[1] J. Kennedy and R.C. Eberhart, Particle swarm optimization, In Proceedings of the IEEE International Conference on Neural Networks, Perth, Australia, IV (1995), 1942-1948.
[2] Sh. Wang, L. Yu, and K.K. Lai, A novel hybrid AI system framework for crude oil price forecasting, Lecture Notes in Computer Science 3327 (2004), 233-242.
[3] B. Junyou, Stock forecasting using PSO-trained neural networks, In Proceedings of the Congress on Evolutionary Computation (2007), 2879-2885.
[4] A.M. Toliyat Abolhassani and M. Yaghobbi, Stock price forecasting using PSOSVM, 3rd International Conference on Advanced Computer Theory and Engineering (ICACTE) (2010), 352-356.
[5] R.C. Eberhart, R. Dobbins, and P.K. Simpson, Computational Intelligence PC Tools, Morgan Kaufmann Publishers (1996), 233-242.

Image Steganalysis Based On Color Channels Correlation In Homogeneous Areas In Color Images

SeyyedMohammadAli Javadi
Shahed University, Tehran, Iran
sm.javadi@shahed.ac.ir

Maryam Hasanzadeh
Shahed University, Tehran, Iran
hasanzadeh@shahed.ac.ir

*Corresponding Author, T: (+98) 915 755-5288

Abstract: Steganography is the art of hiding information. Whereas the goal of steganography is the avoidance of suspicion about hidden messages in other data, steganalysis aims to discover and render useless such covert messages. In this article, we propose a new method for steganalysis based on the correlation of the color channels in adjacent pixels while omitting the heterogeneous areas in color images. This method is designed independently of the steganography method. The results show that the proposed method has high accuracy in steganalysis; it also does better than the well-known WS, SP and RS steganalysis methods at low embedding rates.

Keywords: steganography, steganalysis, color channels correlation, homogeneous and heterogeneous areas

Introduction

Steganography is the art of hiding information. Unlike cryptography, which protects the content of information from being read, steganography techniques are used to hide the very existence of messages. Since the main goal of steganography is to communicate securely in a completely undetectable manner, an adversary should not be able to distinguish in any sense between cover-objects (objects not containing any secret message) and stego-objects (objects containing a secret message). In this context, steganalysis refers to the body of techniques conceived to distinguish between cover-objects and stego-objects [1],[2].
Digital images have a high degree of redundancy in representation and pervasive applications in daily life, and are thus appealing for hiding data. As a result, the past decade has seen growing interest in research on image steganography and image steganalysis. Some of the earliest work in this regard was reported by Johnson and Jajodia [3],[4]. They mainly look at the palette tables in GIF images and the anomalies caused therein by common stego-tools. A more principled approach to LSB steganalysis was presented in [5] by Westfeld and Pfitzmann.

They identify Pairs of Values (PoVs), which consist of pixel values that get mapped to one another on LSB flipping. Fridrich, Du and Long [6] define pixels that are close in color intensity as differing by not more than one count in any of the three color planes. They then show that the ratio of close colors to the total number of unique colors increases significantly when a new message of a selected length is embedded in a cover image, as opposed to when the same message is embedded in a stego-image. A more sophisticated technique that provides remarkable detection accuracy for LSB embedding, even for short messages, was presented by Fridrich et al. in [7] and called the RS method. Moreover, other steganalysis methods have been presented, such as WS [8] by Fridrich and Goljan, and Sample Pair (SP) [9] by Dumitrescu, Xiaolin and Wang.
Most recent steganalysis methods for color images are based on processing each color channel independently. In this article, we propose a new steganalysis method for detecting stego-images that focuses on the correlation between the color channels in the homogeneous areas of color images.
This paper is structured as follows. In Section 2, we introduce the principles and basics of the proposed method. In Section 3, we present our experimental results. Finally, Section 4 concludes the paper.

2 Proposed Method

The proposed method is based on color channel correlation and the omission of heterogeneous areas in the color image, and it is designed independently of the steganography method. The basic idea of the feature extraction in RGB space is based on [10]. The features are extracted as follows. In the first step, for all pixels in the color image, we compute the differences between the pixel intensity and the intensities of its neighboring pixels in four directions, 0, 45, 90 and 135 degrees (i.e., we compute the differences for the three channels Red, Green and Blue and produce the vector V = [ΔR ΔG ΔB]^T). Fig. 1 shows a pixel P and its neighbors in these four directions.

Figure 1: Directions of changes at pixel P.

In the second step, the signs of the three components of V are calculated and then summed; the sum lies between +3 and -3. We call this value, for the supposed pixel P, Sign_v(P). Because natural images have correlation between their color channels, we expect the signs of the vector components to be the same at each pixel, so Sign_v must be +3, -3 or 0 for most of the pixels. We use this fact as a feature for distinguishing a stego-image from a clean image. Indeed, the components of V in a clean image have the same sign, since the correlation between neighboring pixels causes the color intensities to increase or decrease together in one direction. By embedding a message, this correlation diminishes and we expect some vectors with components of different signs; this is because the embedding process disregards the color channel correlation. Hence, the basic quantity CF is defined as the ratio of the pixels with Sign_v(P) = +3, -3 or 0 to the total number of pixels in the image:

CF = #{P | Sign_v(P) = -3 or 0 or +3} / #TotalImagePixels    (1)

In the third step, the feature attained in the previous steps is improved. To do so, the heterogeneous areas are computed using the following formula (Equation 2) and do not take part in calculating CF; in other words, these pixels have no effect when calculating the features. We expect no correlation in heterogeneous areas, so the accuracy of the steganalysis method is increased by omitting these pixels from the Sign_v matrix:

Sign_v(P) = (ΔR > Thr) & (ΔG > Thr) & (ΔB > Thr)    (2)

In the above formula, the threshold is selected adaptively such that n% of the image pixels belong to the heterogeneous area. We set the parameter n to 5 experimentally; this means that the 5% of image pixels that have the least correlation with their neighboring pixels do not take part in computing Sign_v. In the proposed method, four features based on the mentioned correlation are extracted from the image. First we calculate CF in the four directions 0, 45, 90 and 135 degrees:

Diff = [CF_0, CF_45, CF_90, CF_135]    (3)

Then the mean and the variance of the Diff vector are computed:

Feature1 = Mean(Diff)    (4)

Feature2 = Variance(Diff)    (5)

These two values constitute the first and second features. In the above relations, the Mean is expected to have a greater value in a clean image than in a stego-image; the opposite holds for the Variance. In the next step we embed a random message in the image using LSB replacement and repeat the above operations; in this situation, the image certainly carries a message. The third feature is computed by subtracting the variances of the Diff and Diff' vectors (the vectors obtained before and after embedding):

Diff' = [CF'_0, CF'_45, CF'_90, CF'_135]    (6)

Feature3 = |Variance(Diff) - Variance(Diff')|    (7)

Finally, we embed a random message in the image using LSB replacement once more, repeat the above operations, and calculate the fourth feature as below (a small value E is added to the denominator to avoid division by zero):

Diff'' = [CF''_0, CF''_45, CF''_90, CF''_135]    (8)

Feature4 = ||Mean(Diff'') - Mean(Diff')|| / ||Mean(Diff') - Mean(Diff) + E||    (9)


If an input image has already been tampered with a message, embedding another message in it will not modify the features much. So we expect Feature3 to be close to zero and Feature4 to be close to 1. After the extraction of the features, a key factor is choosing a classifier. In this article we used a support vector machine (SVM) with a polynomial kernel.
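To make the first-step feature extraction concrete, the following minimal numpy sketch (illustrative, not the authors' implementation) computes the per-direction CF of Equation (1) while masking heterogeneous pixels; the percentile-based threshold is a simplification of the adaptive n% = 5% rule, and all names are assumptions.

```python
import numpy as np

SHIFTS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}  # four directions

def cf_feature(img, angle, hetero_pct=5):
    """CF of Equation (1) for one direction, omitting heterogeneous pixels.

    img: H x W x 3 float array (RGB). Returns the ratio of pixels whose
    per-channel difference signs sum to +3, -3 or 0.
    """
    dy, dx = SHIFTS[angle]
    diff = img - np.roll(img, (dy, dx), axis=(0, 1))   # V = [dR, dG, dB] per pixel
    sign_sum = np.sign(diff).sum(axis=2)               # Sign_v(P) in {-3, ..., +3}
    # Adaptive threshold approximating Equation (2): mark the least-correlated pixels
    mag = np.abs(diff)
    thr = np.percentile(mag, 100 - hetero_pct)
    hetero = (mag > thr).all(axis=2)                   # all three channels exceed Thr
    keep = ~hetero
    ok = np.isin(sign_sum[keep], (-3, 0, 3))
    return ok.sum() / keep.sum()

def diff_vector(img):
    # Equation (3): CF in the four directions 0, 45, 90 and 135 degrees
    return np.array([cf_feature(img, a) for a in SHIFTS])

def first_two_features(img):
    d = diff_vector(img)
    return d.mean(), d.var()   # Equations (4) and (5)
```

Features 3 and 4 follow by re-embedding a random LSB message, recomputing diff_vector, and combining the results as in Equations (6)-(9).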

Experimental Result

In this section we display the experimental results of our proposed method; the obtained outcomes are compared with the three well-known WS, SP and RS steganalysis methods and with the suggested method in [10]. We downloaded 100 images for the training set, and 50% of these images were embedded using LSB replacement with embedding rates of 10%, 20%, 30%, ..., 100%. To determine the accuracy of the steganalysis methods on the test set, 60 various images were chosen. In this article the training and test processes were done under the same conditions for the three well-known WS, SP and RS steganalysis methods, the suggested method in [10], and our proposed method. The steganalysis assessment is based on the confusion matrix [11].

The proposed steganalysis method does better than the other methods at the low embedding rates (10%, 20%, 30%), at which detection is harder. The proposed method also has suitable performance at high embedding rates, and in all cases it does better than the SP steganalysis method. There is little variation in the proposed method's results as the embedding rate changes, while the other methods vary a lot; in other words, some other steganalysis methods fluctuate considerably across embedding rates, whereas in the proposed method the total detection rate improves steadily from the low embedding rates to the high ones.

Figure 2: Confusion matrix.

TP Rate = TPs / (TPs + FNs)    (10)

FP Rate = FPs / (TNs + FPs)    (11)

Accuracy Rate = (TPs + TNs) / (TPs + FNs + TNs + FPs)    (12)

Precision Rate = TPs / (TPs + FPs)    (13)

In this article, we draw the charts of Equations 10, 11, 12 and 13 for the three well-known WS, SP and RS steganalysis methods, the suggested method in [10], and our proposed method (Figs. 3-6). The conclusions drawn from these charts (Figs. 3-6) are those discussed above.

Figure 3: TP Rate.

Figure 4: FP Rate.

Figure 5: Accuracy Rate.
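Reading the four rates off the confusion-matrix counts is mechanical; a small helper (hypothetical naming) might look like:

```python
def steganalysis_rates(tp, fp, tn, fn):
    # Equations (10)-(13)
    return {
        "tp_rate": tp / (tp + fn),
        "fp_rate": fp / (tn + fp),
        "accuracy": (tp + tn) / (tp + fn + tn + fp),
        "precision": tp / (tp + fp),
    }
```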

Figure 6: Precision Rate.

Conclusion

In this paper, we have proposed a new steganalysis technique based on color channel correlation and the omission of the heterogeneous areas in a color image. We demonstrated the effectiveness of the proposed approach against LSB replacement. It is shown that our method detects the hidden message very accurately even at low embedding rates. The results show that our proposed method has high accuracy in steganalysis and, at lower rates, it also does better than the well-known WS, SP and RS steganalysis methods and the suggested method in [10].

References

[1] J.D. Boissonnat and C. Delage, Essentials of Image Steganalysis Measures, Journal of Theoretical and Applied Information Technology (2010).
[2] T. Morkel, J.H.P. Eloff, and M.S. Olivier, An Overview of Image Steganography, Proceedings of the Fifth Annual Information Security South Africa Conference (ISSA2005), Sandton, South Africa (June/July 2005).
[3] N.F. Johnson and S. Jajodia, Steganalysis: The Investigation of Hidden Information, IEEE Information Technology Conference, Syracuse, USA (1998).
[4] B.P. Battula and R. Satya Prasad, Steganalysis of Images Created Using Current Steganography Software, in: David Aucsmith (Ed.), Information Hiding, LNCS 1525, Springer-Verlag, Berlin Heidelberg (1998), 32-47.
[5] A. Westfeld and A. Pfitzmann, Attacks on Steganographic Systems, Information Hiding, LNCS 1768, Springer-Verlag, Heidelberg (1999), 61-76.
[6] J. Fridrich, R. Du, and M. Long, Steganalysis of LSB Encoding in Color Images, Proceedings of ICME 2000, New York, USA (2000).
[7] J. Fridrich, M. Goljan, and R. Du, Reliable Detection of LSB Steganography in Color and Grayscale Images, Proc. of the ACM Workshop on Multimedia and Security, Ottawa, CA, October 5 (2001), 27-30.
[8] J. Fridrich and M. Goljan, On estimation of secret message length in LSB steganography in spatial domain, Security, Steganography, and Watermarking of Multimedia Contents VI, E.J. Delp III and P.W. Wong, eds., Proc. SPIE 5306 (2004), 23-34.
[9] S. Dumitrescu, Xiaolin Wu, and Zhe Wang, Detection of LSB Steganography via Sample Pair Analysis, Springer-Verlag, LNCS, New York, USA (2003), 355-372.
[10] N. Yousefi, Steganalysis of 24-bit color images, M.Sc. Thesis, Engineering Department, Shahed University (in Persian) (2011).
[11] Bin Li, Junhui He, Jiwu Huang, and Yun Qing Shi, A Survey on Image Steganography and Steganalysis, Journal of Information Hiding and Multimedia Signal Processing, Volume 2 (April 2011), 142-172.

Online Prediction of Deadlocks in Concurrent Processes

Seyed Morteza Babamir
University of Kashan
Department of Computer Engineering
Kashan, Iran
Babamir@kashanu.ac.ir

Elmira Hasanzade
University of Kashan
Department of Computer Engineering
Kashan, Iran
elm.hasanzade@grad.kashanu.ac.ir

*Corresponding Author, P.O. Box 87317-51167, F: (+98) 3615559930, T: (+98) 913 163 5211

Abstract: This study addresses an approach to predicting deadlocks in concurrent processes, where the processes are threads of a multithread program. A deadlock occurs when two processes each need some resource held by the other; accordingly, both of them will wait for the other forever. Based on the past behavior of the threads of a multithread program, the deadlock possibility in the future behavior of the threads can be guessed. Predicting future behavior based on past behavior stimulates us to use a mathematical model, because multithread programs have uncertain behavior. To this end, we consider the past behavior of the threads in terms of time series indicating a sequence of time points. Then, we use the past time points in Artificial Neural Networks (ANNs) to predict future time points. The efficiency and elasticity of ANNs in predicting complex behavioral patterns was our stimulation for using ANNs. In fact, using ANNs in predicting and improving the safety of multithread program behavior is the contribution of this study. To show the effectiveness of our model, we applied it to some Java multithread programs that were prone to deadlock. Compared with the actual execution of the programs, about 74% of the deadlock predictions proved correct.

Keywords: Multithread program, Deadlock detection, Artificial Neural Networks, Time series

Introduction

The prevalence of multi-core processors is widely encouraging programmers to use concurrent programming. However, applying concurrency introduces lots of challenges, and among them deadlock is one of the most common problems. The origin of deadlock is in sharing exclusive resources between processes or threads. Locking mechanisms have been used to share these resources between processes or threads. Locking is a task that is done by the programmer; because of this, it is an error-prone technique with the potential to cause deadlocks.

Recovering from deadlock is not a cost-efficient solution. Solutions like (1) restarting the system, (2) killing several processes or threads until the deadlock is obviated, and (3) extorting some resources from processes are the most common ways of deadlock recovery. Using any one of these approaches is not cost efficient and in many cases causes performance loss. In addition, using these methods poses serious risks to system integrity. It seems that preventing programs from becoming trapped in deadlock is much more suitable. For this reason, some policies have been devised to keep a concurrent system from getting trapped in deadlock; Deadlock Prevention and Deadlock Avoidance are examples of such policies. However, these types of approaches impose many limitations on concurrency and in many cases cause other concurrency problems like starvation.

Online deadlock detection at runtime has received attention in recent years because it does not have the limitations of the previous approaches. In general, such techniques allow the system to proceed normally without any limitation. While the program is running, one or more monitors observe the execution of the program and try to find out about the possibility of deadlock in the future.

The prediction of deadlock possibility at runtime is another choice, which we use as the basis of our proposed approach. This paper is organized as follows: Section 2 overviews the related works and the technologies used in this work. The proposed model is discussed in Section 3; the problem definition is in Subsection 3.1, and the model architecture and its components are shown in Subsection 3.2. The model implementation and evaluation results are discussed in Section 4. We conclude this paper in Section 5.

Related Works

Deadlocks are among the best-known problems in concurrent systems. Deadlocks are threats to system safety. Safety says that something bad never happens, and deadlock is an instance of something bad that usually occurs in concurrent systems. One early attempt at dealing with the deadlock problem was to let the program become trapped in deadlock and then try to recover from it. Recovering from deadlock is not cost efficient and in many cases causes performance loss. In addition, using these methods poses serious risks to system integrity. When a deadlock has occurred, its side effects may be manifested in the next states of the program and then cause serious problems. A deadlock can disable the recovery methods, so the system cannot roll back to a safe state automatically.

In recent years, many techniques have been developed to detect potential deadlocks before their real occurrence. Obviously, if we can detect a potential deadlock earlier, we can make decisions about preventing the system from becoming trapped in the deadlock, or about the policy for recovering from it. In general, we can divide potential deadlock detection techniques into two categories: offline techniques and dynamic (online) techniques. Offline techniques analyze a simple model of the program, in most cases doing model checking, or analyze the source code of the program, which is an annotation-based task. In the case of model checking, since it analyzes the model and not the real program, it is possible that there are some differences between the model and the implementation. The most important disadvantage of model checking is the state explosion problem, which is a considerable problem in multithread programs. Annotation-based techniques need the programmer's effort to inject the knowledge into the source code. This technique is not useful for legacy code; also, it depends on a specific language and is not beneficial for languages which are not typeable [1].

In turn, online techniques are mostly not language dependent and do not need programmer effort. These techniques can be applied to legacy code with minimum effort. These advantages make online techniques a proper choice for potential deadlock detection.

The reason for deadlocks is in requesting shared resources in a nested way; that is, a thread or process requests an exclusive shared resource while it holds other resources. These requests are also blocking, which means that if the requested resource is in use by another thread, the requester thread or process will stop working and wait until the thread which holds the requested resource releases it. To represent these requests and held resources, we can draw a graph which also represents the threads that stop proceeding because of other threads. There is a deadlock in the system if there is a circle in this graph. Most approaches use this graph to reason about potential deadlock. However, this graph can also be drawn in the form of a lock graph. A lock is a constraint for each process or thread when it requests a shared resource; this constraint is for mutual exclusion. In [2] a controller draws an online system lock graph and, using some algorithms, finds specific paths named not-guarded SCCs (strongly connected components). These paths have a strong possibility of changing to one or more circles. The idea is to raise the probability of manifesting really existing deadlocks in a not-guarded SCC by injecting noise. This approach is similar to the Goodlock algorithm [3]. The difference is that Goodlock looks within the scope of one process run, which means that when a cycle in the graph is caused by lock sequences from two different runs, Goodlock cannot detect it [2]. Some extensions of the Goodlock algorithm have been made, such as the one discussed in [4], which is another form of Goodlock named iGoodlock, or informative Goodlock. iGoodlock reports the potential cycles in a multithread program based on lock locations in the program. DEADLOCKFUZZER is another technique, which combines iGoodlock with a randomized thread scheduler to create real deadlocks with high probability.

In [5] the deadlock immunity concept has been introduced. It means the ability of a system to, in some way, avoid all the deadlocks that happened in the past. When a deadlock happens for the first time, its information is kept in a concept named a context, in order to avoid similar contexts in future runs. In this approach, immunity is achieved against the corresponding deadlocks. To avoid deadlocks with already-seen contexts, the scheduling of threads is changed. Deadlock contexts accumulate in the system; therefore, it can avoid a wider range of deadlocks. However, if a deadlock does not have a pattern similar to an already encountered one, this approach will not avoid it.


Obviously, all online approaches pre-run some portion of the program and, using some techniques like noise injection or rescheduling the threads' runs, check whether it is possible to encounter a deadlock in the future or not [6]. In all of them, using the fact that a multithread program has a nondeterministic nature, and some other reasoning, they select a portion of the state space and pre-search it to find out the possibility of deadlock. However, when this portion is large enough, searching it at runtime is not a trivial task in either time or space [7]. To address these issues, it is suitable to use process behavior prediction techniques to predict those parts of the process behavior that are related to deadlock occurrence. Indeed, in this way the overhead of detecting a potential deadlock at runtime will be a linear function of the cost of the prediction technique.

2.1 Process Behavior Prediction Techniques

In some applications, it is useful to predict the future behavior of applications. In order to apply such techniques, it is necessary to know the application's past behavior and to predict its future behavior. A process behavior can be represented by its execution pattern; this pattern is also known as the process access pattern [8].

To predict the behavior of a process or thread, the execution trace must be converted into a representative time series. A time series is a set of observations from the past until the present, denoted by s(t - i) where 0 < i < P and P is the number of observations. Time series prediction is to estimate the future observations s(t + i), i = 1...N, where N is the size of the prediction window [9].

The observed behaviors could be the sequence of events performed by a process or thread (for example, disk I/O, CPU activity, network transmissions, grabbing a lock and so on); the equivalent time series then represents these events. In cases where these time series are sequential, the future members of a time series (equivalent to future application events) can be easily determined. Most applications, however, follow complex rules, therefore requiring different approaches for prediction, such as statistical evaluation or artificial neural network techniques [10].

In general, time series prediction techniques can be classified into two categories: statistical techniques and techniques based on advanced tools such as neural networks. Statistical prediction techniques are based on linear or nonlinear predictors. Among the linear ones we find the Autoregressive (AR), Moving Average (MA) and combined AR and MA (ARIMA) models [11]. These techniques have some limitations, such as inefficiency for real-world problems, which are mostly complex and nonlinear; they assume that a time series is generated by a linear process. In turn, statistical techniques based on nonlinear predictors, like the threshold, exponential, polynomial and bilinear predictors, were proposed to add more precision to prediction. However, selecting a suitable nonlinear model and computing its parameters is a difficult task for a practical problem where there is no prior knowledge about the time series under consideration. Moreover, it has been shown that the capability of the nonlinear model is limited, because it is unable to provide a long-term prediction [9].

In recent years, artificial intelligence tools have been extensively used for time series prediction [12,13]. In particular, artificial neural networks are frequently exploited for time series prediction problems. A neural network is an information processing system that is capable of treating complex problems of pattern recognition and of dynamic and nonlinear processes; in particular, it can be an efficient tool for prediction applications. The advantage of neural networks compared to statistical approaches is their ability to learn and then to generalize from their knowledge [14]. Neural networks are based on training, and in many cases their prediction results are precise even if the training set contains considerable noise [10]. These approaches are much more suitable for real-world problems that do not obey specific rules.

Process behavior prediction techniques have mostly been used in applications to improve performance utilization algorithms in distributed and concurrent systems. This work is usually done in four steps. In the first step, the application execution is observed using an analysis tool. In the second step, the obtained application behavior is converted into time series. In the third step, the converted behavior (time series) is used to predict some next behaviors of the processes. The final step consists of the quantification of the predicted behavior; that is, the predicted future behavior is used in load balancing, caching and prefetching, process migration, thread scheduling and failure prediction algorithms [15]. For example, if we want to use the predicted behavior in a load balancing algorithm, and the prediction affirms that a process will undergo a transition from execution state S1, characterized by excessive CPU utilization, to state S2, which requires heavy network traffic, the model allows predicting the best course of action for such an operation.

Proposed Model

In this section we summarize our model for detecting potential deadlocks in a multithread program using an artificial neural network. In multithread programs, there are some shared resources that are used by the threads. When a thread needs a shared resource, it requests it and wants to lock it; if the resource is available, the thread takes it, and otherwise it stops proceeding until it can take the resource. Also, when a thread does not need an owned resource any more, it releases the resource. Activities like requesting and releasing a resource, issued by the threads or processes in a concurrent system, cause deadlocks. Therefore, these types of information are valuable for determining the possibility of deadlock. The order of these requisitions and releases has a direct effect on deadlock occurrence. Thus, if one can predict the future order of the requisitions and releases that will be issued by each thread or process, she/he can determine the possibility of deadlock. The precision of predicting future requests or releases has a direct effect on the precision of our approach in detecting potential deadlocks.

In the following subsection we define the problem that we deal with, and in the next subsection we discuss our model.

3.1 Problem Definition

To detect potential deadlocks, as mentioned, the future order of the requisitions and releases that will be issued by each thread or process should be predicted. That is, for every process or thread, we should predict the type of action which it is going to perform on each shared resource. For example, if the system is in the t-th period of time, for the (t+1)-th period our prediction should be something like: thread_i is going to request resource_j (and, as we know, if it is not available, thread_i will stop proceeding until it can take resource_j), or thread_k is going to release resource_z that it took previously. These types of information are necessary for reasoning about deadlock possibility. To refer to these types of behaviors we introduce the concept of deadlock-wise behavior. The other behaviors of the threads or processes are not relevant to detecting potential deadlocks. For example, the deadlock-wise behavior of thread_i between times 0 and t in an execution trace could be something like:

thread_i[0-t] = {Request(resource_j), Request(resource_k), Release(resource_j), Request(resource_l), Request(resource_p), Release(resource_l)}

This type of information can be easily converted into univariate time series, each of which represents a dedicated thread's behavior toward a dedicated resource in a time interval. Such a time series can be shown in the form of a two-element tuple like (thread_i, resource_j) = {nothing, request, nothing, nothing, nothing, release, nothing, request}, which means that in the first period of time thread_i requests resource_j, in the next period of time it has nothing to do with resource_j (and likewise in the next two periods), in the sixth period of time thread_i releases resource_j, and in the eighth period this thread requests resource_j again. This set can be written for any thread and any resource, which together make a two-element tuple. Each member of this set can take one of three values: {release, request, nothing}. This set is a univariate time series which can be used for predicting the thread's behavior in the (t+1)-th period of time. Therefore, we will have n x r time series, where n is the number of threads and r is the number of shared resources or locks.

Actually, what we are trying to do is to extract these deadlock-wise behaviors and somehow predict the future deadlock-wise behaviors of the processes or threads.
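A minimal sketch of this conversion, assuming a numeric encoding 0/1/2 for {nothing, request, release} (the paper does not fix specific codes) and an event trace of (period, thread, resource, action) tuples, might look like:

```python
from collections import defaultdict

# Assumed numeric codes for the three possible values of a period
NOTHING, REQUEST, RELEASE = 0, 1, 2

def build_series(trace, n_threads, n_resources, n_periods):
    """trace: iterable of (period, thread_id, resource_id, action) tuples,
    action in {"request", "release"}. Returns one univariate series per
    (thread, resource) pair, as described in Section 3.1."""
    series = defaultdict(lambda: [NOTHING] * n_periods)
    code = {"request": REQUEST, "release": RELEASE}
    for period, tid, rid, action in trace:
        series[(tid, rid)][period] = code[action]
    # n x r series in total, one per thread/resource pair
    return {(t, r): series[(t, r)]
            for t in range(n_threads) for r in range(n_resources)}
```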

3.2 Model Architecture and Components

For the online prediction of potential deadlocks in multithread programs, we propose a model consisting of four components. Each component has a dedicated task. The architecture of the proposed model is shown in Figure 1. Each component's task is discussed in the following.

Figure 1: Proposed model architecture

Behavior Extraction & Time Series Generation Component: this component has two main parts. The first one (the behavior extractor) is responsible for observing the application execution and extracting the deadlock-wise behaviors online from the observed execution. It sends the extracted deadlock-wise behaviors to the Runtime Lock Tracker Component, which will be explained later. Once a target behavior has been extracted, the second part (the Time Series Generator) converts this behavior into a member of the time series that this behavior belongs to. The result of this part is injected into the Predictor Component.

Runtime Lock Tracker Component: this component takes the online extracted deadlock-wise behaviors and draws and keeps an online lock graph which represents the current lock and thread or process states in the system.

Predictor Component: this component is responsible for predicting the future members of the time series that come from the Behavior Extraction & Time Series Generation Component. As we discussed earlier, considering the complex and nonlinear nature of the behaviors of concurrent processes or threads, the most suitable way to predict the future members of the time series, which are another representation of these threads' or processes' behaviors, is artificial neural network prediction techniques. At dedicated time intervals, this component takes the time series from the Behavior Extraction & Time Series Generation Component and predicts the future members of the time series.

Decision Maker Component: the last component is the Decision Maker Component. This component takes the predicted time series and the current lock graph and composes the two together. After that, using cycle detection algorithms, it concludes about the deadlock possibility. For example, suppose the current system lock graph is like Figure 2.

Figure 2: Real System Lock Graph

This graph means:

1. thread1 holds resource c and requests resource r; it is waiting for thread4... rather, for thread3, which holds resource r.
2. thread2 holds resource a and resource d and requests resource b; it is waiting for thread4, which holds resource b.

Suppose further that the prediction results coming from the Predictor Component are such that, according to these predictions, the Decision Maker Component concludes:

1. thread2 is going to wait for thread1
2. thread3 is going to wait for thread2
3. thread2 is going to wait for thread4

This component, based on what it receives from the Runtime Lock Tracker Component and the Predictor Component, draws a virtual lock graph which composes the information gained from those two components. The composition in our example is a virtual graph like Figure 3.

Figure 3: The Composite Graph

In our example, the composition of the real system lock graph and the predicted future events results in a virtual graph representing that in the next state or period of time a deadlock is possible. Therefore, this component reports that a deadlock in the next state of the system is predicted.

All of these components are linked together in a way that lets them cooperate at runtime. For any program that acquires locks for mutual exclusion in a multithread program, we can use this model to predict the deadlock possibility at runtime.
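The Decision Maker's check therefore reduces to adding the predicted wait-for edges to the tracked graph and testing for a cycle. The following sketch (illustrative names; a plain DFS rather than the paper's unspecified cycle detection algorithm) reproduces the example above, where composing the two graphs closes the cycle thread1 -> thread3 -> thread2 -> thread1:

```python
def has_cycle(edges):
    """edges: dict mapping a thread to the set of threads it waits for.
    Returns True if the wait-for graph contains a cycle (deadlock possible)."""
    WHITE, GRAY, BLACK = 0, 1, 2
    color = {t: WHITE for t in edges}
    def dfs(t):
        color[t] = GRAY
        for u in edges.get(t, ()):
            if color.get(u, WHITE) == GRAY:
                return True               # back edge -> cycle
            if color.get(u, WHITE) == WHITE and dfs(u):
                return True
        color[t] = BLACK
        return False
    return any(color[t] == WHITE and dfs(t) for t in list(edges))

# Compose the tracked graph with the predicted wait-for edges, as the
# Decision Maker Component does, then test for a cycle:
current = {"thread1": {"thread3"}, "thread2": {"thread4"}}
predicted = [("thread2", "thread1"), ("thread3", "thread2")]
for a, b in predicted:
    current.setdefault(a, set()).add(b)
print(has_cycle(current))   # True: thread1 -> thread3 -> thread2 -> thread1
```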

In the next section we discuss the implementation and evaluation of our model on a Java multithread program in which all threads behave randomly.

Implementation and Evaluation Results

4.1 Implementation

We implemented each component separately and then linked the components together. In the following, each component is discussed in detail.

The Behavior Extraction & Time Series Generation Component was implemented using Java and the AspectJ compiler. This component takes a Java-written multithread program and instruments it using AspectJ. What it instruments in the target code is the logic of extracting the deadlock-wise behaviors and converting them into time series. After doing this, any time the targeted multithread code is executed, the behaviors that we are interested in are extracted at runtime and converted into time series.

The second component, the runtime lock tracker, was implemented in Java. It takes the online extracted deadlock-wise behaviors from the first component and draws a lock graph.

The third component was implemented using MATLAB's default Time Series Tools, located in the Neural Network Toolbox of MATLAB. We used the Nonlinear Autoregressive (NAR) predictor network in our work. This network predicts each member of a time series using the d past values of that series; that is, y(t) = f(y(t-1), ..., y(t-d)). This is a simple network consisting of 3 layers named the input, output and hidden layers. In addition to the d parameter, the number of nodes in the hidden layer is another important factor in the network configuration which affects the quality of the predictions. The hidden-layer nodes are responsible for the main part of the prediction task, and the proper number of these nodes depends on the type of time series to be predicted. We used n x r of these networks (n is the number of threads and r is the number of shared resources) to predict all the future members of the time series. This is a simple network and its computational complexity is low.
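The paper's predictor is MATLAB's NAR tool; purely to illustrate the same y(t) = f(y(t-1), ..., y(t-d)) scheme outside MATLAB, a hedged Python sketch with a small feed-forward regressor and the selected configuration (d = 3, 10 hidden nodes) could be:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

def make_lagged(series, d=3):
    # Turn a series into (y(t-1..t-d), y(t)) training pairs
    X = np.array([series[t - d:t] for t in range(d, len(series))])
    y = np.array(series[d:])
    return X, y

def train_nar_like(series, d=3, hidden=10):
    X, y = make_lagged(series, d)
    net = MLPRegressor(hidden_layer_sizes=(hidden,), max_iter=2000, random_state=0)
    net.fit(X, y)
    return net

def predict_next(net, series, d=3):
    # One-step-ahead prediction from the last d observed values
    return net.predict(np.array(series[-d:]).reshape(1, -1))[0]
```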

The last component is again a Java program that implements a two-graph composition algorithm and another algorithm that can find cycles in the resulting composed graph. It receives the Predictor Component's and the online lock tracker component's results and reasons about the possibility of deadlock in the future.

4.2 Evaluation Results

We applied our implementation to the online prediction of potential deadlocks for a multithread program written in Java. This program is a deadlock-prone multithread program which consists of 20 threads that share 10 resources. In our test-suite multithread program, every thread requests the resources in a random way, and also releases each one after a random time. Our evaluation consists of two phases. First we evaluate multiple network configurations by training and testing each configuration. Second we evaluate our approach in detecting potential deadlocks.

4.2.1 Selecting network configuration parameters and training the networks

We deployed 200 NAR (Nonlinear Autoregressive) networks for prediction, because there are 20 threads and 10 resources. For the first evaluation phase, we ran our program 250 times and used the information from these runs to train and test the networks. In this part of the work, we examined the networks using different values of d, the number of past values of the series, and also different numbers of nodes in the hidden layer. In this way, we selected the best configuration of the networks to apply in the proposed deadlock prediction model. The result of each configuration is shown in Table 1. The overall result of the networks in the case where d is 3 and the number of nodes in the hidden layer is 10 is the best, so we selected this configuration to be placed in the predictor component.

4.2.2 Evaluation of the proposed approach in detecting potential deadlocks

After training the networks with the selected configuration, we started the second evaluation phase and tested our model. The obtained results are shown in Figure 4.


Table 1: The Results of Different Network Configurations

Total number of runs | d {y(t) = f(y(t-1), ..., y(t-d))} | Number of hidden layer nodes | Test set MSE over 200 networks
250 | 1 | 5  | 9.0e-1
250 | 2 | 5  | 4.058e-1
250 | 3 | 5  | 1.81e-1
250 | 4 | 8  | 1.08e-0
250 | 1 | 10 | 4.001e-1
250 | 2 | 10 | 2.089e-0
250 | 3 | 10 | 1.052e-0

Figure 4: The failure rate of potential deadlock prediction.

Figure 4 represents the failures in predicting the deadlock possibility using our approach. In our test, we ran the target multithread program 500 times; during these runs deadlock occurred 17 times. Our approach reported 13 of them before their actual occurrence and missed 4 of them. Also, in 3 cases it reported a false positive. This is a considerably good result for a program that behaves completely randomly.

Conclusion

Techniques for detecting potential deadlocks in a multithread program can be divided into two major categories, online techniques and offline techniques. Online detection techniques have some advantages over static techniques. For example, offline techniques suffer from state explosion or require programmer effort to inject knowledge into the code to detect potential deadlocks. Therefore, in recent years online techniques have received a lot of attention. Still, the most important weakness of currently used online detection techniques is that they are not cost efficient, in either time or space.

In this work, we introduced a novel online approach to detect potential deadlocks. This approach is more cost efficient in comparison with other online techniques. In addition, it does not impose the limitations of offline or traditional deadlock detection techniques, like the Banker's algorithm.

The contribution of this work is in using process behavior prediction techniques to reason about deadlock possibility. We first convert the process execution behavior to multiple time series, and next predict the future members of these series. The predicted members are retranslated into behaviors; therefore, we obtain the future behaviors of the threads. Using these predicted behaviors we conclude about the deadlock possibility in the future. The rate of true detection of deadlock occurrences depends on the correctness of the predicted behaviors. In the proposed approach, the prediction is done using neural networks, which are a powerful technique for predicting complex and nonlinear time series.

We used the NAR network, which is a time series predictor network. We trained and evaluated these networks using the information gathered from test runs. The obtained results showed an applicable performance. In most multithread programs, each thread's behavior depends on its past behaviors; therefore NAR networks, which use past information to reason about the future, can be a proper network to predict the future behaviors of threads. The results shown in Table 1 emphasize this claim. Finally, the network configuration which gave the best results was selected and used in the predictor component of our model. The final results obtained from the model showed that 74% of deadlocks were predicted correctly before occurrence. As we saw in Figure 4, except for a few cases, the model correctly concludes about the possibility of deadlock occurrence. Considering the completely random behavior of the threads, this result is satisfactory. Also, this model does not add any runtime overhead to the program: except for a little instrumentation that we inject into the code for extracting the deadlock-wise behaviors, this model runs completely separately. Also, none of the components has a cost-intensive task, not even the predictor component, which is implemented using a neural network.

References

[1] D. Engler and K. Ashcraft, RacerX: Effective, Static Detection of Race Conditions and Deadlocks, SOSP (2003).
[2] Y. Nir-Buchbinder, R. Tzoref, and S. Ur, Deadlocks: From Exhibiting to Healing, Runtime Verification: 8th International Workshop, RV, Budapest, Hungary (2008).
[3] S. Bensalem, J. Fernandez, K. Havelund, and L. Mounier, Confirmation of deadlock potentials detected by runtime analysis, Workshop on Parallel and Distributed Systems: Testing and Debugging (2006).
[4] P. Joshi, C. Park, K. Sen, and M. Naik, A randomized dynamic program analysis technique for detecting real deadlocks, ACM SIGPLAN Conference on Programming Language Design and Implementation, Dublin, Ireland (2009).
[5] H. Jula and G. Candea, A Scalable, Sound, Eventually-Complete Algorithm for Deadlock Immunity, 8th International Workshop, RV, Budapest, Hungary (2008).
[6] F. Chen and G. Rosu, Predictive Runtime Analysis of Multithread Programs, supported by the joint NSF/NASA.
[7] C. Wang, S. Kundu, M. Ganai, and A. Gupta, Symbolic Predictive Analysis for Concurrent Programs.
[8] E. Dodonov and R. F. d. Mello, A Model for Automatic On-Line Process Behaviour Extraction, Classification and Prediction in Heterogeneous Distributed Systems, Seventh IEEE International Symposium on Cluster Computing and the Grid (CCGrid'07) (2007).
[9] N. Baccour, H. Kaaniche, M. Chtourou, and M. B. Jemaa, Recurrent neural network based time series prediction: Particular design problems, International Conference on Smart Systems and Devices, Hammamet, Tunisia (2007).
[10] R. Zemouri, D. Racoceanu, and N. Zerhouni, Recurrent radial basis function network for time-series prediction, Engineering Applications of Artificial Intelligence (2003), 453-463.
[11] O. Voitcu and Y. Wong, On the construction of a non-linear recursive predictor, Journal of Computational and Applied Mathematics (2004).
[12] Y. Chen and A. Abraham, Time-series forecasting using flexible neural tree model, (2004), 219-235.
[13] C.J. Lin and Y.J. Xu, A self-adaptive neural fuzzy network with group-based symbiotic evolution and its prediction applications, Fuzzy Sets and Systems (2005).
[14] R. Zemouri and P. Ciprian Patic, Recurrent Radial Basis Function Network for Failure Time Series Prediction, World Academy of Science, Engineering and Technology 72 (2010).
[15] E. Dodonov and R. F. d. Mello, A Novel Approach For Distributed Application Scheduling Based on Prediction of Communication Events, Future Generation Computer Systems 26.

Fisher Based Eigenvector Selection in Spectral Clustering Using Google's Page Rank Procedure

Amin Allahyar
Ferdowsi University of Mashhad
Department of Computer Engineering
Amin.Allahyar@stu-mail.um.ac.ir

Hadi Sadoghi Yazdi
Ferdowsi University of Mashhad
Department of Computer Engineering
H-Sadoghi@um.ac.ir

Soheila Ashkezari Toussi
Ferdowsi University of Mashhad
Department of Computer Engineering
Soheila.Ashkezari@stu-mail.um.ac.ir

*Corresponding Author, P.O. BOX: 91779-48974, F: (+98) 511 421 5067, T: (+98) 511 421-5071

Abstract: The Ng-Jordan-Weiss (NJW) approach is one of the most widely used spectral clustering algorithms. It uses the eigenvectors of the normalized affinity matrix derived from the input data. These eigenvectors are treated as new features of the input data: they preserve the structure of the high-dimensional input data while representing it in a lower dimension, so the transformed data can easily be used in regular clustering algorithms. The NJW method uses the eigenvectors with the highest corresponding eigenvalues. However, these eigenvectors are not always the best selection to reveal the structure of the data. In this paper, we aim to use Google's page rank algorithm to replace the unsupervised problem with an approximated supervised problem; we then utilize the Fisher criterion to select the most representative eigenvectors. The experimental results demonstrate the effectiveness of selecting the relevant eigenvectors using the proposed method.

Keywords: Feature/Eigenvector Selection, Fisher Criterion, Spectral Clustering, Google's Page Rank.

Introduction

Spectral clustering techniques [1] originate from spectral graph theory [2] and make use of the spectrum of the similarity matrix of the data to apply dimensionality reduction for clustering. Hence, the basic idea is to construct a weighted graph from the input data in such a way that the vertices of the graph are the data points, and each weighted edge represents the degree of similarity between the corresponding pair of vertices. The Scott and Longuet-Higgins algorithm [3], the Perona and Freeman algorithm [4], Normalized Cut [5] and NJW [6] are such spectral techniques.
Spectral clustering methods use the eigenvectors of the normalized affinity matrix obtained from the data to carry out the data partitioning. In most of these techniques, the value of the corresponding eigenvalue determines the priority of the eigenvectors. For example, to partition the data into K clusters, NJW uses the eigenvectors corresponding to the largest K eigenvalues of the normalized Laplacian matrix of the input data. However, this order does not guarantee the selection of the best features to represent the input data [7][8][9]. In this paper, with an inspiration from Google's page rank algorithm, the problem is converted to an approximated supervised problem. Then the Fisher criterion is applied. Using the score obtained from the Fisher criterion, we propose a new eigenvector selection method to find the relevant eigenvectors that describe the natural groups of the input data.
The rest of the paper is organized as follows: Section 2 contains a brief review of spectral clustering and one of its most popular algorithms, i.e., NJW. Section 3 is dedicated to the related works about eigenvector selection;

furthermore, as a requirement for proposing our method, we introduce the Fisher criterion and Google's page rank algorithm. In Section 4, we propose our new eigenvector selection approach. Section 5 contains the empirical results and Section 6 concludes this article.

2 Preliminaries

2.1 Spectral Clustering

Spectral clustering [1] has a strong connection with spectral graph theory [2]. It usually refers to graph partitioning based on the eigenvalues and eigenvectors of the adjacency (or affinity) matrix of a graph. Given a set of N points in d-dimensional space, X = {x_1, x_2, ..., x_N} in R^d, we can build a complete weighted undirected graph G(V, A) whose nodes V = {v_1, v_2, ..., v_N} correspond to the N patterns, and whose edges, defined through the adjacency matrix A, encode the similarity between each pair of sample points. The adjacency between two data points can be defined as (1):

A_ij = e^(-d^2(x_i, x_j) / σ^2)    (1)

where d measures the dissimilarity or distance between patterns and the scaling parameter σ controls how rapidly the affinity falls off as the distance between x_i and x_j increases. The selection of the tuning parameter σ greatly affects the spectral clustering result. The tuning method proposed by Zelnik and Perona [10] introduces local scaling by selecting a σ_i for each data point x_i instead of the fixed scale parameter σ. The selection is done using the distance between point x_i and its p-th nearest neighbor. In this way, the similarity matrix is defined as h(x_i, x_j) = e^(-d^2(x_i, x_j) / (σ_i σ_j)), where σ_i = d(x_i, x_p) and x_p is the p-th nearest neighbor of x_i. It should be noted that when using this method the result depends on the choice of the parameter p.

2.2 Ng-Jordan-Weiss (NJW) Method

The NJW algorithm [6] aims to find a new representation on the first eigenvectors of the Laplacian matrix using the following steps:

1. Form the affinity matrix A by (1).
2. Compute the degree matrix D and the normalized affinity matrix L_N = D^(-1/2) A D^(-1/2). The degree matrix D is a diagonal matrix whose element D_ii = Σ_{j=1}^{N} a_ij is the degree of the point x_i.
3. Compute the first K eigenvectors v_1, v_2, ..., v_K corresponding to the first K largest eigenvalues λ_1, λ_2, ..., λ_K of L_N and form the column-wise matrix V = [v_1, v_2, ..., v_K] in R^(N x K).
4. Renormalize V and form the matrix Y such that all rows have unit length: Y_ij = V_ij / (Σ_j V_ij^2)^(1/2).
5. Cluster the represented data matrix Y into K clusters via K-means.
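A compact numpy/scikit-learn rendering of steps 1-5 (with the fixed-σ affinity of Equation (1); the names are illustrative, not the authors' code) may clarify the pipeline:

```python
import numpy as np
from sklearn.cluster import KMeans

def njw_cluster(X, K, sigma=1.0):
    # Step 1: affinity matrix, Equation (1), with zero diagonal
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
    A = np.exp(-d2 / sigma ** 2)
    np.fill_diagonal(A, 0.0)
    # Step 2: normalized affinity L_N = D^(-1/2) A D^(-1/2)
    d_inv_sqrt = 1.0 / np.sqrt(A.sum(1))
    L = A * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    # Step 3: first K eigenvectors of L_N (largest eigenvalues)
    vals, vecs = np.linalg.eigh(L)
    V = vecs[:, np.argsort(vals)[::-1][:K]]
    # Step 4: renormalize rows of V to unit length
    Y = V / np.linalg.norm(V, axis=1, keepdims=True)
    # Step 5: cluster the rows of Y with K-means
    return KMeans(n_clusters=K, n_init=10, random_state=0).fit_predict(Y)
```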

scatter is variance of each classs. We use this measure clusters has more connection compared to data reside
in the boundary. Using the Googles page rank one
for evaluating eigenvectors relevance as (2):
can detect which data has most connections. Because
K
of the intrinsic disjunction of these data (as they reside
i,j=1 kxi xj k
(2)
fscore = K
in center of each cluster and focused on a spot) reprei=1 kvar(xi )k
sented in Figure.2 it is very easy to cluster them into
where xi and var(xi ) is the mean and variance of class i separate groups using a popular clustering algorithm
respectively. Whatever the value of this index is higher, such as K-means and it converge in one or two iterthe data points are better separated in classes.
ations. These data is then labeled according to their
clusters. So the problem is converted from an unsupervised feature selection to a supervised one. After this
step we can use a regular fisher criterion to score each
eigenvector individually. After this phase K number
eigenvectors with the highest score is selected for last
phase of spectral clustering procedure. Block diagram
of the proposed approach is shown in Figure.3

Figure 1: Fisher criterion: set W1 of features provide


a better separation in compare to set W2 [16].

3.3

Googles Page Rank Algorithm

Page ranking [17] is an essential task in the Google search engine. Important pages should have many links to/from other pages, while those links should themselves come from other important pages. Google uses a specific spectral page ranking which utilizes the first eigenvector of the affinity matrix calculated from the graph of all web pages to find the important pages [18]. Using this algorithm, Google can rank millions of web pages in a few hours. The algorithm is briefly described as follows. Let A be the affinity matrix, where A_ij is 1 if page i has a link to page j. Let D be the diagonal matrix of degrees, where D_ii = Σ_j A_ij. Let S be the scaled matrix defined as S = D^{-1} A, and let π be the eigenvector of S corresponding to the largest eigenvalue. Then π has dimensionality 1 × n, and the rank of page x is π_x. In other words, if the value of π_x is high (compared to the others), page x is linked to many pages.
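A compact power-iteration sketch of this ranking (the damping factor of the published PageRank is omitted here to match the description above, so convergence assumes a well-connected graph):

    # Rank nodes by the leading left eigenvector of S = D^{-1} A.
    import numpy as np

    def pagerank_scores(A, n_iter=100):
        d = A.sum(axis=1).astype(float)
        d[d == 0] = 1.0                    # guard against isolated nodes
        S = A / d[:, None]                 # row-stochastic scaled matrix
        pi = np.full(A.shape[0], 1.0 / A.shape[0])
        for _ in range(n_iter):
            pi = pi @ S                    # power iteration on the left
            pi /= pi.sum()
        return pi                          # pi[x] is the rank of node x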

The Proposed Algorithm

The adjacency matrix A is basically a graph which represents the connections between data points. It can be treated as a web-page connection graph and fed to the Google ranker so that the data points (web pages) that have many neighbors (links) can be determined. Typically, data residing in the center of a cluster have more connections than data residing on the boundary, so Google's page rank can detect which points are most connected. Because of the intrinsic separation of these points (they reside in the center of each cluster, concentrated on a small region), as shown in Figure 2, it is very easy to group them using a popular clustering algorithm such as K-means, which converges in one or two iterations. These points are then labeled according to their clusters, so the problem is converted from unsupervised feature selection into a supervised one. After this step, a regular Fisher criterion can score each eigenvector individually, and the K eigenvectors with the highest scores are selected for the last phase of the spectral clustering procedure. The block diagram of the proposed approach is shown in Figure 3.

Figure 2: Demonstration of the 20 data points selected from the maximum values of the first eigenvector of the IRIS dataset. Part A shows the 2nd and 3rd eigenvectors; Part B shows the same data with the 2nd, 3rd and 9th eigenvectors.

Figure 3: Block diagram of the proposed algorithm.
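Putting the pieces together, here is a sketch of the whole selection procedure, reusing the pagerank_scores and fisher_score helpers above; the number of core points (20, as in Figure 2) is a tunable assumption:

    # End-to-end sketch of the proposed eigenvector selection.
    import numpy as np
    from sklearn.cluster import KMeans

    def select_eigenvectors(A, eigvecs, K, n_core=20):
        # 1. Rank the data points by connectivity and keep the densest ones.
        core = np.argsort(pagerank_scores(A))[-n_core:]
        # 2. The core points separate easily; label them with k-means.
        labels = KMeans(n_clusters=K, n_init=10).fit_predict(eigvecs[core])
        # 3. Score every eigenvector with the now-supervised Fisher criterion.
        scores = [fisher_score(eigvecs[core, j], labels)
                  for j in range(eigvecs.shape[1])]
        # 4. Keep the K highest-scoring eigenvectors for the final clustering.
        return np.argsort(scores)[-K:]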


Experimental Result

To investigate the capability of the proposed algorithm, a number of datasets from the UCI repository as well as the MNIST handwritten digit database are used. The comparison is based on NMI (normalized mutual information), a standard measure of clustering quality. Properties of the considered datasets are reported in Table 1, and the NMI results of NJW and the proposed method are reported in Table 2.


To form the affinity matrix, we utilized the method proposed in [10] using the 7th nearest neighbor. By comparing the NMI results (Table 2 and Figure 4), it can be seen that the proposed method achieves a higher NMI except on two datasets, Image and Glass. By analyzing these datasets it can be observed that their input data have strongly mixed clusters. This suggests that the first K eigenvectors, those with the largest eigenvalues, are more appropriate when the clusters are heavily mixed together.

Table 1: Properties of the selected UCI and MNIST datasets.

    Name          Instances   Features   Classes
    Iris              150          4        3
    Wine              178         13        3
    Ionosphere        351         34        2
    Breast-w          683          9        2
    Soybeans           47         35        4
    Glass             214          9        6
    Liver             345          6        2
    Image             210         19        7
    Mnist 58          400        784        2
    Mnist 89          400        784        2
    Mnist 038         400        784        3
    Mnist 1234        400        784        4
Table 2: NMI comparison of NJW and the proposed method.

    Name          NJW      Proposed
    Iris           91.33     96.67
    Wine           97.19     98.91
    Ionosphere     69.80     71.35
    Breast-w       61.93     74.13
    Soybeans      100.00    100.00
    Glass          49.53     48.52
    Liver          56.81     67.42
    Image          64.29     63.13
    Mnist 58       84.00     86.31
    Mnist 89       86.00     86.55
    Mnist 038      82.67     83.02
    Mnist 1234     81.13     88.49

Conclusion

In this paper a new approach for selecting relevant eigenvectors in spectral clustering is proposed. The approach utilizes Google's page rank algorithm to identify the densely connected core data points; in the next step, by exploiting their innate separability, these points receive approximate labels, and in the last step the relevant eigenvectors are selected using the Fisher criterion. For future work, we aim to investigate more indexes for the pairwise and individual evaluation of eigenvectors.

Figure 4: Minimum and maximum NMI achieved during 50 runs. The blue columns are NJW and the red columns are the proposed method.

References

[1] N. Cristianini, J. Shawe-Taylor, and J. Kandola, Spectral kernel methods for clustering, Advances in Neural Information Processing Systems 14 (2002), 649-655.
[2] F.R.K. Chung, Spectral graph theory, American Mathematical Society, 1997.
[3] G.L. Scott and H.C. Longuet-Higgins, Feature grouping by relocalisation of eigenvectors of the proximity matrix, Proc. British Machine Vision Conference, 1990, pp. 103-108.
[4] P. Perona and W. Freeman, A factorization approach to grouping, Computer Vision - ECCV'98 (1998), 655-670.
[5] T. Shi, M. Belkin, and B. Yu, Data spectroscopy: Eigenspaces of convolution operators and clustering.
[6] A.Y. Ng, M.I. Jordan, and Y. Weiss, On spectral clustering: Analysis and an algorithm, Advances in Neural Information Processing Systems 14 (2002), 849-856.
[7] N. Rebagliati and A. Verri, Spectral clustering with more than K eigenvectors, Neurocomputing (2011).
[8] T. Xiang and S. Gong, Spectral clustering with eigenvector selection, Pattern Recognition 41 (2008), no. 3, 1012-1029.
[9] F. Zhao, L. Jiao, and H. Liu, Spectral clustering with eigenvector selection based on entropy ranking, Neurocomputing 73 (2010), no. 10, 1704-1717.
[10] L. Zelnik-Manor and P. Perona, Self-tuning spectral clustering, Advances in Neural Information Processing Systems 17 (2004), 1601-1608.
[11] T. Shi, M. Belkin, and B. Yu, Data spectroscopy: Eigenspaces of convolution operators and clustering, The Annals of Statistics 37 (2009), no. 6B, 3960-3984.
[12] Y. Wang, L. Li, and Ni, Feature selection using tabu search with long-term memories and probabilistic neural networks, Pattern Recognition Letters 30 (2009), no. 7, 661-670.
[13] M.A. Hall, Correlation-based feature selection for machine learning, The University of Waikato, 1999.
[14] S.C. Yusta, Different metaheuristic strategies to solve the feature selection problem, Pattern Recognition Letters 30 (2009), no. 5, 525-534.
[15] X. He, D. Cai, and P. Niyogi, Laplacian score for feature selection, Advances in Neural Information Processing Systems 18 (2006), 507.
[16] R.O. Duda, P.E. Hart, and D.G. Stork, Pattern Classification and Scene Analysis, 2nd ed. (1995).


[17] L. Page, S. Brin, and R. Motwani, The PageRank citation ranking: Bringing order to the web (1999).
[18] A.N. Langville and C.D. Meyer, Google's PageRank and Beyond, Princeton University Press, 2006.

Imperialist Competitive Algorithm for Neighbor Selection in


Peer-to-Peer Networks
Shabnam Ebadi

Abolfazl Toroghi Haghighat

Islamic Azad University, Qazvin Branch, Qazvin, Iran

Islamic Azad University, Qazvin Branch, Qazvin, Iran

Department of Information Technology

Department of Information Technology

Shabnam_ebadi@yahoo.com

at_haghighat@yahoo.com

Abstract: Peer-to-peer (P2P) topology has a significant influence on the performance, search efficiency, functionality and scalability of an application. In this paper, we propose an Imperialist Competitive Algorithm (ICA) approach to the problem of Neighbor Selection (NS) in P2P networks. Each country encodes the upper half of the peer-connection matrix of the undirected graph, which reduces the dimension of the search space. The results indicate that ICA usually requires a shorter time to obtain better results than PSO (Particle Swarm Optimization), especially for large scale problems.

Keywords: Neighbor Selection, Imperialist Competitive Algorithm, Peer-to-Peer Networks.

Introduction

Peer-to-peer computing has attracted great interest and attention from the computing industry and has gained popularity among computer users and their networked virtual communities [1]. All participants in a peer-to-peer system act as both clients and servers to one another, thereby surpassing the conventional client/server model and bringing all participating computers together for the purpose of sharing resources such as content, bandwidth and CPU cycles. P2P is no longer just used for sharing music files over the Internet: many P2P systems have been built for new purposes and are in use, and an increasing number of them are deployed in corporate networks or for public welfare [2].

A recent survey states that computer users are increasingly downloading large-volume contents such as movies and software, that 24 percent of Internet users have downloaded a feature-length film online at least once, and that there exists a large demand for this category of P2P applications. A new generation of P2P applications serves this purpose, with the top priority of effectively distributing the content rather than merely locating it. Examples include BitTorrent, which has seen a significant increase in usage in terms of network traffic and number of users [3]. These applications are well suited to distributing large-volume contents partly because they divide the content into many small pieces and allow peers to exchange those pieces instead of the complete file. Such a mechanism has been demonstrated to improve the efficiency of P2P exchanges: when content is broken into pieces, it takes a shorter time before a peer can begin to upload to its neighbors while simultaneously downloading from the community. In practical P2P systems, peers often keep a large set of potential neighbors but only simultaneously upload to or download from a small subset of them, which we call active neighbors, to avoid excessive connection overhead [4].

The important process that improves the efficiency of distribution is referred to as neighbor selection (NS). NS is the process whereby one or more entities in the P2P network police the system by determining, for each peer, the neighbors it will connect to for obtaining and/or distributing the content.

It is intuitive that the mechanism adopted to decide the neighbors has a strong influence on the distribution efficiency.

P2P comprises peers and the connections between these peers. These connections may be directed, may have different weights, and are comparable to a graph with nodes and edges connecting these nodes. Defining how these nodes are connected affects many properties of an architecture based on a P2P topology, and significantly influences the performance, search efficiency, functionality and scalability of a system. A common difficulty in current P2P systems is caused by the dynamic membership of peer hosts, which results in a constant reorganization of the topology [5].

Koulouris et al. [6] presented a framework and an implementation technique for flexible management of peer-to-peer overlays. The framework provides means for self-organization to yield enhanced flexibility in instantiating control architectures in dynamic environments, which is regarded as essential for P2P services such as access, routing, topology forming, and application layer resource management. In these P2P applications, a central tracker decides which peer becomes a neighbor of which other peers. Koo et al. [7] investigated the neighbor-selection process in P2P networks and proposed an efficient single-objective neighbor-selection strategy based on the Genetic Algorithm (GA). Sun et al. [8] proposed a PSO algorithm for neighbor selection in P2P networks, and Abraham et al. [9] proposed multi-swarms for neighbor selection in P2P overlay networks.

In this paper, we propose the Imperialist Competitive Algorithm (ICA) approach to the problem of Neighbor Selection (NS) in P2P networks [10].

The Imperialist Competitive Algorithm

The Imperialist Competitive Algorithm is inspired by the socio-political process of imperialism and imperialistic competition. Like many optimization algorithms, it starts with an initial population (the countries of the world). Each individual of the population is called a country. Some of the best countries, those with the minimum cost, are selected to be the imperialists, and the rest form the colonies of these imperialists. All the colonies of the initial population are divided among the imperialists based on their power; the power of a country, the counterpart of the fitness value in GA, is inversely proportional to its cost. After dividing all colonies among the imperialists, the colonies start moving toward their relevant imperialist country [10].

The total power of an empire depends on both the power of the imperialist country and the power of its colonies. We model this fact by defining the total power of an empire as the power of the imperialist country plus a percentage of the mean power of its colonies.

Then the imperialistic competition begins among all the empires. Any empire that is not able to succeed in this competition and cannot increase its power (or at least prevent it from decreasing) will be eliminated from the competition [10]. The imperialistic competition gradually results in an increase in the power of powerful empires and a decrease in the power of weaker ones. Weak empires lose their power and ultimately collapse. The movement of colonies toward their relevant imperialists, along with the competition among empires and the collapse mechanism, will hopefully cause all countries to converge to a state in which there exists just one empire in the world and all the other countries are its colonies. In this ideal new world, colonies have the same position and power as the imperialist [10].

Neighbor-Selection Problem in P2P Networks

Koo et al. model the neighbor-selection problem using an undirected graph and attempt to determine the connections between the peers [7]. Given a fixed number of N peers, we use a graph G = (V, E) to denote an overlay network, where the set of vertices V = {v_1, ..., v_N} represents the N peers and the set of edges E = {e_ij ∈ {0, 1}, i, j = 1, ..., N} represents their connectivities:


e_ij = 1 if peers i and j are connected, and e_ij = 0 otherwise. We have e_ij = e_ji for all i ≠ j, and e_ij = 0 when i = j. Let C be the entire collection of content pieces, and denote by {c_i ⊆ C, i = 1, ..., N} the collection of content pieces each peer i has. We further assume that each peer i will be connected to a maximum of d_i neighbors, where d_i < N. The disjointness of contents from peer i to peer j is denoted by c_i \ c_j, which can be calculated as:

    c_i \ c_j = c_i − (c_i ∩ c_j)    (1)

where ∩ denotes the intersection operation on sets. This disjointness can be interpreted as the collection of content pieces that peer i has but peer j does not; in other words, it denotes the pieces that peer i can upload to peer j. Moreover, the disjointness operation is not commutative, i.e., c_i \ c_j ≠ c_j \ c_i. We also denote by |c_i \ c_j| the cardinality of c_i \ c_j, which is the number of content pieces peer i can contribute to peer j. In order to maximize the disjointness of content, we want to maximize the number of content pieces each peer can contribute to its neighbors by determining the connections e_ij. Define the sets ε_ij such that ε_ij = C if e_ij = 1, and ε_ij = ∅ (the null set) otherwise. Therefore we have the following optimization problem [7]:

    max_B  Σ_{j=1}^{N} | ∪_{i=1}^{N} (c_i \ c_j) ∩ ε_ij |    (2)

subject to:

    Σ_{j=1}^{N} e_ij ≤ d_i,  for all i
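Our reading of Problem (2) can be made concrete with a short Python sketch that evaluates a candidate connection matrix; the set-based representation of the content collections is an assumption for illustration:

    # Objective and feasibility check for the neighbor-selection problem.
    import numpy as np

    def ns_objective(e, contents):
        # contents[i]: set of piece ids held by peer i; e: symmetric N x N 0/1 array.
        N = len(contents)
        total = 0
        for j in range(N):
            incoming = set()
            for i in range(N):
                if e[i, j]:
                    incoming |= contents[i] - contents[j]   # c_i \ c_j
            total += len(incoming)   # size of the union of pieces peer j can receive
        return total

    def feasible(e, d):
        # e_ij = e_ji, zero diagonal, and at most d_i neighbors per peer.
        return ((e == e.T).all() and (np.diag(e) == 0).all()
                and (e.sum(axis=1) <= d).all())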

The Imperialist Competitive Algorithm for NS

In this paper, an optimization algorithm based on modeling the imperialistic competition is used for the NS problem. Each individual of the population is called a country. The population is divided into two groups, colonies and imperialist states. The competition among imperialists to take possession of one another's colonies forms the core of this algorithm and hopefully results in the convergence of the countries to the global optimum of the problem. In this competition the weak empires collapse gradually, until finally there is only one imperialist and all the other countries are its colonies.

First, the random matrix of the whole country set is initialized. In this paper, we propose two methods for NS. In the first method (ICAm), each country is an encoded matrix of symbols representing a solution, which may be feasible or infeasible; in particular, a country in our problem is a matrix of bits, which corresponds to the solution values of the e_ij in Problem (2). The country is encoded to map each dimension to one directed connection between peers, i.e. the dimension is N × N. The domain of each dimension is limited to 0 or 1, but only the upper half of the matrix is searched, and by updating the values in both halves the optimal solution can be reached. The algorithm randomly generates countries in which peer i is connected to at most d_i neighbors, so the initial population is already somewhat improved, which makes this method faster than other algorithms in reaching the optimal response. In the second method (ICA), each country is an encoded string of bits which corresponds to the solution values of the e_ij in Problem (2). The country again maps each dimension to one directed connection between peers; but since the neighbor topology in P2P networks is an undirected graph, i.e. e_ij = e_ji for all i ≠ j and e_ij = 0 for all i = j, we set up a search space of dimension D = N(N − 1)/2, with the domain of each dimension limited to 0 or 1. For small numbers of peers the results of the two methods are similar, but as the number of peers increases, the first method gives better results than the second.


Initialize parameters
Initialize random countries(N)
Calculate fitness of countries
Initialize the empires
for i = 1 to D do
    Assimilate()
    Revolution()
    Competition()
    Calculate fitness of empires
    if the end of decades is met or there is just one empire then
        stop and output the best solution and its fitness
    else
        go to Assimilate()
    end
end

Algorithm 1: Neighbor Selection Algorithm Based on ICA (N, D)

The main steps of the algorithm are summarized in the pseudocode shown in Algorithm 1. In the algorithm, N is the number of peers and D is the total number of iterations used to solve the NS problem. After initializing the parameters and the random countries based on Problem (2), the initial countries are evaluated. After initializing the empires, the assimilation step moves the colonies toward their relevant imperialists: the algorithm finds the bits in which an imperialist and its colony differ, counts them, randomly selects some of the bits that must change, and then updates the colony with the new values. After revolution and evaluation of the empires using Problem (2), all the empires except the most powerful one will eventually collapse, and all the colonies will come under the control of this unique empire. The algorithm stops when the final decade is reached or there is just one empire.
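The following Python sketch condenses Algorithm 1 and the assimilation move just described into runnable form; the revolution and imperialistic-competition steps are omitted for brevity, and all parameter names are illustrative assumptions rather than the authors' code:

    # Compact ICA sketch for a bit-string encoding (illustrative only).
    import numpy as np
    rng = np.random.default_rng(0)

    def assimilate(colony, imperialist, rate=0.5):
        # Flip a random subset of the bits where colony and imperialist differ.
        diff = np.flatnonzero(colony != imperialist)
        moved = colony.copy()
        if len(diff):
            pick = rng.choice(diff, size=max(1, int(rate * len(diff))), replace=False)
            moved[pick] = imperialist[pick]
        return moved

    def ica(fitness, n_bits, n_countries=50, n_imp=5, decades=50):
        pop = rng.integers(0, 2, size=(n_countries, n_bits))
        for _ in range(decades):
            pop = pop[np.argsort([-fitness(c) for c in pop])]   # best first
            imps, cols = pop[:n_imp], pop[n_imp:]
            # Every colony moves toward the imperialist of its empire.
            cols = np.array([assimilate(c, imps[k % n_imp])
                             for k, c in enumerate(cols)])
            pop = np.vstack([imps, cols])
        return pop[0]   # best country found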

Experimental Studies

This section analyzes and compares the simulation results of PSO and ICA. Given a P2P state S = (N, C, M), N is the number of peers, C is the entire collection of content pieces, and M is the maximum number of peers to which each peer can connect steadily in the session.

Figure 1 illustrates the ICA, ICAm and PSO performance during the search process for the NS problem versus iteration for the problem (25, 1400, 12). The specific parameter settings of the algorithms are described in Table 1. As is evident, the ICA methods obtained better results much faster than PSO.

Figure 1: Performance for the NS (25, 1400, 12)

Table 1: Parameter settings for the algorithms.

    ICA, ICAm   NumOfCountries = 50, NumOfInitialImperialists = 5,
                NumOfDecades = 50, RevolutionRate = 0.3,
                AssimilationCoefficient = 2
    PSO         C1 = 1.5, C2 = 4 - C1, NumofParticles = 50,
                MaxIterations = 50

Figure 2 illustrates the ICA, ICAm and PSO performance for the problem (30, 1400, 15), with the parameter settings described in Table 2. The simulation results of ICA and ICAm are almost identical and, as is evident, the ICA methods obtained better results much faster than PSO.

Figure 2: Performance for the NS (30, 1400, 15)

Table 2: Parameter settings for the algorithms.

    ICA, ICAm   NumOfCountries = 80, NumOfInitialImperialists = 8,
                NumOfDecades = 50, RevolutionRate = 0.3,
                AssimilationCoefficient = 2
    PSO         C1 = 1.5, C2 = 4 - C1, NumofParticles = 80,
                MaxIterations = 50

Figure 3 illustrates the ICA, ICAm and PSO performance for the problem (40, 1400, 20), again with the parameter settings of Table 2. As is evident, the ICA methods obtained better results much faster than PSO.

Figure 3: Performance for the NS (40, 1400, 20)


Conclusions

In this paper, we investigated the problem of neighbor selection in peer-to-peer networks using the Imperialist Competitive Algorithm. In the proposed approach, the country encodes the upper half of the peer-connection matrix of the undirected graph, which reduces the dimension of the search space. We evaluated the performance of the ICA against PSO. The results indicate that ICA usually requires a shorter time to obtain better results than PSO, especially for large scale problems. The proposed algorithm could be an ideal approach for solving the NS problem.

References

[1] S. Kwok, P2P searching trends: 2002-2004, Information Processing and Management 42 (2006), 237-247.
[2] T. Idris, J. Altmann, and P. Smyth, A market-managed topology formation algorithm for peer-to-peer file sharing networks, Lecture Notes in Computer Science 4033 (2006), 61-77.
[3] R. Xia and K. Muppala, A survey of BitTorrent performance, IEEE Communications Surveys & Tutorials 187 (2010), 119.
[4] H. Zhang and Z. Shao, Optimal neighbour selection in BitTorrent-like peer-to-peer networks, Proceedings of the ACM (2011).
[5] S. Surana, B. Godfrey, K. Lakshminarayanan, R. Karp, and I. Stoica, Load balancing in dynamic structured peer-to-peer systems, Performance Evaluation 63 (2006), 217-240.
[6] T. Koulouris, R. Henjes, K. Tutschku, and H. de Meer, Implementation of adaptive control for P2P overlays, 8IWAN 12 (2003), 1229-1252.
[7] S.G.M. Koo, K. Kannan, and C.S.G. Lee, On neighbour-selection strategy in hybrid peer-to-peer networks, Future Generation Computer Systems 22 (2006), 732-741.
[8] S. Sun, A. Abraham, G. Zhang, and H. Liu, A particle swarm optimization algorithm for neighbour selection in peer-to-peer networks, 6th International Conference on Computer Information Systems and Industrial Management Applications 1 (2007), 166-172.
[9] A. Abraham, H. Liu, and A.E. Hassanien, Multi swarms for neighbour selection in peer-to-peer overlay networks, 2010.


[10] E. Atashpaz-Gargari and C. Lucas, Imperialist Competitive Algorithm: An algorithm for optimization inspired by imperialistic competition, IEEE Congress on Evolutionary Computation (2007), 4661-4667.

Different Approaches For Multi Step Ahead Traffic Prediction Based


on Modified ANFIS
Shiva Rahimipour

Mahnaz Agha-Mohaqeq

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

Rahimipour@aut.ac.ir

m.mohaqeq@aut.ac.ir

Seyyed Mehdi Tashakkori Hashemi


Amirkabir University of Technology
Department of Mathematics and Computer Science
Hashemi@aut.ac.ir

Abstract: In the last two decades, short term prediction of traffic parameters has led to a vast number of prediction algorithms. Short term traffic prediction systems that operate in real time are necessary but not sufficient: a prediction system should be able to generate accurate and reliable multi-step-ahead predictions besides the single-step-ahead ones. Multi-step-ahead predictions should provide information about future traffic states with acceptable accuracy in cases of system failure. This paper presents a comparative study of three different approaches to multistep ahead forecasting. After a brief discussion of each approach, we apply them to data gathered from Tehran highways by modifying the structure of the Adaptive Neuro-Fuzzy Inference System (ANFIS). Finally the results of the comparative study are summarized.

Keywords: short term prediction, multistep ahead, neuro fuzzy.

Introduction

Traffic prediction has many uses in planning, design and many other operations in the field of Intelligent Transportation Systems (ITS). Accurate predictions can reduce the impact of traffic congestion, which is a common problem all over the world. The effectiveness of short term prediction systems that operate in real time depends on predicting traffic information in a timely manner [1]. This means that, besides the traffic conditions met in real time, a prediction system should not only be able to generate accurate single-step-ahead predictions, but also produce reliable multi-step-ahead predictions in cases of data collection failure. Therefore one crucial issue is the development of models that provide forecasts more than one step ahead. Multiple-step-ahead forecasting is of utmost importance: single-interval prediction algorithms cannot support any operational decision making mechanisms, as they cannot provide a reliable representation of the way traffic might evolve in the following minutes. The literature shows that researchers prefer non-conventional statistical approaches such as neural networks and nonparametric regression to produce accurate forecasts for several steps ahead [2]. Multistep ahead prediction efforts have been made using different methods such as neural networks [3], [4], [5], statistical approaches like ATHENA [6] and ARIMA [7], and state-space models [8]. Smith and Demetsky [9] used non-parametric regression to provide forecasts for 4 h ahead in 30-min intervals, Chen et al. [10] predicted traffic conditions 12 steps ahead at 15-min intervals, and Dia [11] predicted travel time 45 steps ahead at 20-s intervals.
Multi-step-ahead predictions provide the means to generate information on traffic's anticipated state with acceptable accuracy over a significant time horizon in cases of system failure. The present paper focuses on providing a comparative study of multi-step-ahead prediction approaches applied to traffic parameters collected from Tehran highways. The remainder of the paper is structured as follows: the main characteristics of two families of prediction strategies, the Single-Output strategy and the Multi-Output strategy, are presented in the following section. Next, we review ANFIS and its modified structures for implementing the approaches. Finally, the paper ends with the results of the comparative study.

2 Multi step ahead prediction approaches

A common problem with traffic forecasting models is the low accuracy of long term forecasts. The estimated value of a parameter may be reasonably reliable for the short term future, but for the longer term the estimate is likely to become less accurate. There are several possible reasons for this increasing inaccuracy. One reason is that the environment in which the model was developed has changed over time, so that the input valid at a given time interval does not in fact have an influence on the output relevant for a time interval quite some distance away in the future. Another reason is that the model itself was not well developed: the inaccuracy arises from immature training or a lack of appropriate data for training. The trained model may cover the surrounding neighborhood of the data but fail to model cyclic changes of trend or seasonal patterns [12].

Most prediction systems are dependent on data transmission, which means that a continuous flow of volume and occupancy data is necessary for them to operate efficiently. However, it is common for real-time traffic data collection systems to experience failures [13]. For this reason, a real-time prediction system should be able to generate predictions for multiple steps ahead to ensure its operation in cases of data collection failure.

This section introduces the main characteristics of two families of multiple-step-ahead prediction strategies: the Single-Output strategy, which relies on the estimation of a single-output predictor, and the Multiple-Output strategy, which learns from data a multiple-input multiple-output dependency between a window of past observations and a window of future values.

2.1 Multi-input Single-output (MISO) approach

Conventional approaches to multi-step-ahead prediction, such as the iterated and direct methods, belong to this family, since both model from historical data a multiple-input single-output mapping. Given a time series of a variable, for example volume V(t), V(t-1), ..., their difference resides in the considered output variable: V(t+1) in the iterated case and the variables V(t+h), h ∈ {1, ..., H}, in the direct case [14].

2.1.1 Iterated method

In this method, once a one-step-ahead prediction V̂(t) is computed at time t, the value is fed back as an input for the following step at t+1:

    V̂(t+1) = f(V̂(t), V(t), V(t-1), ...)

In iterated methods, an H-step-ahead prediction problem is thus tackled by iterating a one-step-ahead predictor H times. Iterated methods may suffer from low performance on long horizon tasks. This is because they are essentially models tuned with a one-step-ahead criterion, and therefore they do not take the temporal behavior appropriately into account. Moreover, the predictor takes approximated values as inputs instead of actual observations, which leads to the propagation of the prediction error [12].
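A minimal sketch of this recursion in Python, assuming any fitted one-step-ahead regressor with a scikit-learn-style predict (the ANFIS model of Section 3 would play this role):

    # Iterated multi-step forecasting: feed each prediction back as an input.
    import numpy as np

    def iterated_forecast(model, history, horizon, lag=3):
        window = list(history[-lag:])           # most recent observations
        preds = []
        for _ in range(horizon):
            x = np.array(window[-lag:]).reshape(1, -1)
            y = float(model.predict(x)[0])
            preds.append(y)
            window.append(y)                    # predicted value becomes an input
        return preds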

2.1.2 Direct method

The direct method is an alternative method for long-term prediction. It learns H single-output models, each returning a direct forecast of V(t+h):

    V̂(t+h) = f_h(V(t), V(t-1), ...),    h ∈ {1, ..., H}

In fact it transforms the problem into H distinct parallel problems. This method does not propagate the prediction errors, but the fact that the H models are learned independently induces a conditional independence of the H estimators V̂(t+h). This prevents the technique from considering complex dependencies between the variables V(t+h) and consequently biases the prediction accuracy. Direct methods also often require higher functional complexity than iterated ones in order to model the stochastic dependency between two series values at two distant instants [12].

The reliability of direct prediction models is suspect because the model is forced to predict further ahead [15]; this is the main argument for using iterative models in multiple-step-ahead prediction. On the other hand, iterative predictions have the disadvantage of using predicted values as inputs, which are probably corrupted [16]. A possible way to overcome these shortcomings is to move from single-output to multiple-output modeling.

2.2 Multi-input Multi-output (MIMO) approach

Both aforementioned cases used multi-input single-output techniques to implement the predictors. Single-output approaches face some limits when the predictor is expected to return a long series of future values. Another possible way for multistep ahead prediction is to move from the modeling of a single-output mapping to the modeling of multi-output dependencies. This requires the adoption of a multi-output technique where the predicted value is no longer a scalar quantity but a vector of future values of the time series. This approach replaces the H models of the direct approach by one multiple-output model [14]:

    {V̂(t+H), ..., V̂(t+1)} = F(V(t), V(t-1), ...)

The MIMO method constrains all the horizons to be predicted with the same model structure, for instance with the same set of inputs and the same learning procedure. This constraint greatly reduces the flexibility and the variability of the single-output approaches, and it can produce the negative effect of biasing the returned model [14].
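In practice the strategies differ mostly in how the training targets are built. A small sketch, under the assumption of a fixed lag window:

    # Building (input, target) pairs for direct and MIMO training.
    import numpy as np

    def make_dataset(series, lag, horizon):
        X, Y = [], []
        for t in range(lag, len(series) - horizon + 1):
            X.append(series[t - lag:t])          # inputs V(t-lag), ..., V(t-1)
            Y.append(series[t:t + horizon])      # targets V(t), ..., V(t+horizon-1)
        return np.array(X), np.array(Y)

    # direct: fit one single-output model per target column Y[:, h];
    # MIMO:   fit one multi-output model mapping X to the whole vector Y.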

3 Adaptive Neuro Fuzzy Inference System

A neuro-fuzzy system combines the advantages of two intelligent methods, neural networks and fuzzy logic. Neural networks are capable of self-learning and have had many successful applications in the traffic prediction field, while fuzzy logic is well known for its strong prediction capability. The common structure of the Adaptive Neuro-Fuzzy Inference System is shown in Fig. 1.

Figure 1: A common ANFIS structure

An ANFIS network is organized in two parts, like fuzzy systems: the first part is the antecedent part and the second is the conclusion part, and the two are connected to each other by rules in network form. The ANFIS structure, laid out in five layers, can be described as a multi-layered neural network. The first layer executes the fuzzification process, the second layer executes the fuzzy AND of the antecedent part of the fuzzy rules, the third layer normalizes the membership functions (MFs), the fourth layer executes the consequent part of the fuzzy rules, and finally the last layer computes the output of the fuzzy system by summing up the outputs of the fourth layer.

We use this network to apply the iterative method to our data. For the direct approach we have to train H single-output ANFIS models, each returning a direct forecast of V(t+h) with h ∈ {1, ..., H}; when we place as many ANFIS models side by side as required, the structure is called MANFIS (Multiple ANFIS). Here, each ANFIS has an independent set of fuzzy rules, which makes it difficult to capture possible correlations between outputs. MANFIS is used to implement the direct approach. Another structure, used here for the MIMO approach, is called CANFIS (Coactive ANFIS). CANFIS extends the notion of the single-output ANFIS to produce multiple outputs; in short, its fuzzy rules are constructed with shared membership values to express correlations between outputs [17].


Data

The data (traffic speed/density) employed in this study were collected from certain data collection points every minute, from 7 a.m. to 11 a.m. Our data collection tool was the Aimsun simulator, which reproduces traffic behavior in Tehran. All data were collected from a five-lane, 400-meter section along the East Hemmat highway. The data samples used for training and testing the models are normalized to values between zero and one. For all models, 80% of the data is used for training and the remaining 20% for testing.


Consider the series of traffic data varying as a function of time, V(t). Using the models described in the previous section, the input-output relations for multistep ahead prediction are as follows (our goal is to produce one and three step ahead predictions):

Iterative:
    V̂(t+1) = ANFIS(V(t), V(t-1), V(t-2), ...)
    V̂(t+2) = ANFIS(V̂(t+1), V(t), V(t-1), ...)
    V̂(t+3) = ANFIS(V̂(t+2), V̂(t+1), V(t), ...)

Direct:
    V̂(t+1) = ANFIS(V(t), V(t-1), V(t-2), ...)
    V̂(t+3) = ANFIS(V(t), V(t-1), V(t-2), ...)
which is in fact:
    {V̂(t+1), V̂(t+3)} = MANFIS(V(t), V(t-1), V(t-2), ...)

MIMO:
    {V̂(t+1), V̂(t+3)} = CANFIS(V(t), V(t-1), V(t-2), ...)

We repeat these predictions for two traffic parameters, speed and density.
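For concreteness, the sketch below wires the helpers from Section 2 together on a toy series; linear models stand in for the ANFIS/MANFIS/CANFIS structures, which we cannot reproduce here:

    # Toy end-to-end run of the three strategies (illustrative stand-ins).
    import numpy as np
    from sklearn.linear_model import LinearRegression

    speed = np.sin(np.linspace(0, 20, 240))              # toy stand-in series
    X, Y = make_dataset(speed, lag=3, horizon=3)
    direct_models = [LinearRegression().fit(X, Y[:, h]) for h in range(3)]
    mimo_model = LinearRegression().fit(X, Y)            # one multi-output model
    one_step = LinearRegression().fit(X, Y[:, 0])        # for the iterated scheme
    print(iterated_forecast(one_step, speed, horizon=3, lag=3))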

Results

After the training process, the performance of the models was checked using the test data. The final results are summarized in Table 1:

Table 1: MSE in 3 step ahead prediction.

    Parameter   Direct    Iterative   MIMO
    Speed       0.0189    0.0191      1.847e-004
    Density     0.0245    0.0370      7.716e-004

As shown in the table, both parameters achieve their minimum error with the MIMO approach. The error increases when the multiple ANFIS model is used for three step ahead prediction (the direct approach), and finally the iterative approach has the maximum error among all approaches. Figures 2-4 show the time series of the actual versus the predicted speed for three steps ahead using all three strategies; solid lines are the real values and dashed lines the predicted ones.

Figure 2: Actual vs predicted speed - direct approach

Figure 3: Actual vs predicted speed - iterated approach

Figure 4: Actual vs predicted speed - MIMO approach

Conclusion

In this paper, we implemented three different approaches for multistep ahead prediction based on ANFIS. The data used to train and check the models were acquired by Aimsun simulation. The results show that all testing errors are low enough to be acceptable, but the MIMO approach implemented by CANFIS clearly shows the best performance in terms of simplicity, precision and stability. It can be used in practical projects as an applied short-time prediction model for urban roads.

References

[1] B.L. Smith and R.K. Oswald, Meeting real-time requirements with imprecise computations: A case study in traffic flow forecasting, Computer Aided Civil and Infrastructure Engineering 18/3 (2003), 201-213.
[2] E.I. Vlahogianni, J.C. Golias, and M.G. Karlaftis, Short term traffic forecasting: Overview of objectives and methods, Transport Reviews 24/5 (2004), 533-557.
[3] M.S. Dougherty and M.R. Cobbet, Short-term inter-urban traffic forecasts using neural networks, International Journal of Forecasting 13 (1997), 21-31.
[4] B. Abdulhai, H. Porwal, and W. Recker, Short-term freeway traffic flow prediction using genetically-optimized time-delay-based neural networks, UCB, UCB-ITS-PWP-99-1, Berkeley, CA (1999).
[5] S. Innamaa, Short-term prediction of traffic situation using MLP-neural networks, Proceedings of the 7th World Congress on Intelligent Transportation Systems, Turin, Italy (2000).
[6] M. Danech-Pajouh and M. Aron, ATHENA: a method for short-term inter-urban motorway traffic forecasting, Recherche Transport Securite 6 (1991), 11-16.
[7] H. Kirby, M. Dougherty, and S. Watson, Should we use neural networks or statistical models for short term motorway forecasting?, International Journal of Forecasting 13 (1997), 45-50.
[8] J. Whittaker, S. Garside, and K. Lindeveld, Tracking and predicting a network traffic process, International Journal of Forecasting 13 (1997), 51-61.
[9] B.L. Smith and M.J. Demetsky, Multiple-interval freeway traffic flow forecasting, Transportation Research Record 1554 (1996), 136-141.
[10] H. Chen, S. Grant-Muller, L. Mussone, and F. Montgomery, A study of hybrid neural network approaches and the effects of missing data on traffic forecasting, Neural Computing and Applications 10 (2001), 277-286.
[11] H. Dia, An object-oriented neural network approach to short-term traffic forecasting, European Journal of Operational Research 131 (2001), 253-261.
[12] H.H. Nguyen and C.W. Chan, Multiple neural networks for a long term time series forecast, Neural Computing and Applications 13 (2004), 90-98.
[13] A. Stathopoulos and M.G. Karlaftis, A multivariate state-space approach for urban traffic flow modelling and prediction, Transportation Research Part C 11/2 (2003), 121-135.
[14] S.B. Taieb, A. Sorjamaa, and G. Bontempi, Multiple-output modelling for multi-step-ahead time series forecasting, Neurocomputing 73 (2009), 1950-1957.
[15] A.S. Weigend and N.A. Gershenfeld, Time Series Prediction: Forecasting the Future and Understanding the Past, Santa Fe Institute Studies in the Science of Complexity (1993).
[16] E.I. Vlahogianni and M.G. Karlaftis, Local and global iterative algorithms for real-time short-term traffic flow prediction, in Urban Transport and Hybrid Vehicles, p. 192, 2010.
[17] J.S.R. Jang, C.T. Sun, and E. Mizutani, Neuro-Fuzzy and Soft Computing: A Computational Approach to Learning and Machine Intelligence, Prentice-Hall, 1997.

E-service Quality Management in B2B e-Commerce Environment


Parvaneh Hajinazari

Abbass Asosheh

Tarbiat Modares University

Tarbiat Modares University

Department of Information Technology Engineering

Department of Information Technology Engineering

Tehran, Iran

Tehran, Iran

p.hajinazari@gmail.com

Asosheh@modares.ac.ir

Abstract: The service oriented architecture (SOA) and its most common implementation, services, enable enterprises to increase their agility in the face of change, improve their operating efficiency, and greatly reduce the cost of doing business in e-commerce environments. However, in order to conduct business reliably, the behavior of the services should be guaranteed, and these guarantees can be specified by Service Level Agreements (SLAs). In this regard, we present a model to express SLAs and utilize the business services' performance requirements, specified as key indicators (KPIs and KQIs), to define SLA parameters. This model can help automate the process of SLA negotiation and monitoring, and of taking actions in case of violations.

Keywords: E-Commerce; Service Oriented Architecture (SOA); Service Level Agreements (SLAs); Key Performance Indicators (KPIs); Key Quality Indicators (KQIs).

Introduction

Nowadays, in the modern global markets of e-Commerce, contracts are made for short-period strategies that may last some days or even less. In such a dynamic environment, enterprises need to respond more effectively and quickly to opportunities in order to remain competitive in the global markets. In this regard, the SOA paradigm is known as the best practice for enterprises because, at a bare minimum, it has the potential to increase agility and transparency and to decrease development and maintenance costs. In this approach, services can be easily assembled to form a collection of autonomous and loosely coupled business processes [1]. Services are unassociated and loosely coupled units of functionality that have no calls to each other; a service performs an action such as filling out an online application for an account. In this context, it is important to ensure that services are executed as expected. For this purpose, each involved actor should be described by both functional and trustworthiness capabilities so that qualified services can be aggregated to fulfill the business goal. In addition, the service provider has to specify non-functional as well as functional properties, in particular technical quality of service characteristics such as response time, throughput and availability. In this regard, our research is based on quality management of business services and utilizes the concepts of Key Indicators (KIs) and Service Level Agreements (SLAs).

The increasing role of SLAs in B2B systems is considerable. In a B2B system, the structure and role of SLAs allow for virtualization of the provider's resources, because B2B collaboration demands not only discovery of application resources based on existing metadata, but also discovery of these resources in the context of agreed contracts like SLAs. In B2B environments, the SLA mechanism helps service provider and consumer manage risks and expectations and establish trusted business relationships. Effective SLAs are very important to guarantee business continuity, customer satisfaction and trust. The metrics used to measure and manage performance compliance with SLA commitments are the heart of a successful agreement and a critical long term success factor. The categorization of SLA contents with a particular focus on SLA metrics facilitates design decisions and helps to identify responsibilities for critical IT processes in disruption management during the execution of SLAs. However, SLA definition is not straightforward from the business goals of the enterprise [2]. In this regard, this research uses the KI (KPI and KQI) concept as a good solution to identify the essential metrics [3]. In our work, KPIs are used as business services' performance indicators in order to map to SLA parameters; in this way, target values for KPIs can be specified in SLAs.

In our work the SLA concept is oriented to the service relationship between service consumer and service provider, in which a set of metrics is used for describing levels of quality of service, and mechanisms are utilized in order to guarantee these levels. This is in conformity with the definition contained in the SLA Management Handbook of the TeleManagement Forum.

The remainder of the paper is organized as follows: section 2 describes the basic concepts in order to facilitate the understanding of our approach. The proposed model is presented in section 3. Finally, conclusions and an outlook on our future work are discussed in section 4.

Background

From a business perspective, a generalized statement of business goals relevant to the scope of the project is decomposed into sub-goals that must be met in order for the higher-level goals to be met. This hierarchical decomposition of the goals leads to identifying the services that will help in fulfilling the sub-goals. It is also necessary to identify key indicators in order to provide an objective basis for evaluating the degree to which a goal has been achieved. Key indicators and target values identified during the process are used to measure, monitor, and quantify the success of the SOA solution in fulfilling business needs [4]. In this regard, the Telecommunication Management Forum (TM Forum) utilizes KPIs and KQIs for managing service quality. KPIs are quantifiable measurements that reflect the critical success or failure factors of a particular service. KPIs represent performance and thus cannot completely represent end-to-end service quality; therefore the TM Forum proposed KQIs, which are indicators that provide measurements of a specific aspect of the performance of the product, product components (services) or service elements, and derive their data from a number of sources including the KPIs [5]. The TM Forum also defined a hierarchy among KPIs, KQIs, and SLAs (Fig. 1). SLAs, in which provider and consumer define the expected service behavior and its quality, can be defined in terms of service KQIs; service KQIs in turn use service KPIs as metrics for reporting the performance of the services (target values for KPIs must be reached in a certain period). An example of the relationship between KPIs and KQIs is shown in Table 1.

Figure 1: KQI, KPI and SLA relationship [5]

Proposed Model

In this study, we assume the service provider offers services with the desired functionality, so we focus on the services' non-functional requirements. The correct management of such requirements directly impacts the success of organizations participating in e-commerce and also the growth of e-commerce itself. To achieve these objectives, a number of research questions need to be explored. First, mechanisms are necessary to map business services' KPIs onto the SLA metrics of the web services that compose the business process. Then, a good theoretical SLA model is necessary to formally specify and represent SLAs. Finally, approaches need to be developed to monitor and manage SLAs. In this regard we model SLA metrics based on performance information of the services that compose the business process. Our approach relies on the use of ontologies to represent the SLA model; ontology-based approaches have been suggested as a solution for semantically describing SLA specifications in a machine-understandable format. An autonomic SLA monitoring system can then be developed using SWRL rules and the Jess rule engine. The purpose of this system is to detect SLA violations. In this way, it is possible to assure traceability between the KPIs defined over business services and their target values established in SLAs.


Table 1: KQI and KPI relationship.

    Service            Service KQI              Service KPI
    Video Conference   Availability             MTBF, MTBR, Loss of Service
                       Speech/Visual Quality    MOS, Loss, Jitter, Delay,
                                                CustomerSatisfaction
                       Response Time            Response Time
                       Round Trip Delay         OWD, RTT
                       Delay                    OWD, RTT
                       Confidentiality          PhysicalAccessViolations
                       Non-repudiation          PhysicalAccessViolations
                       Interoperability         InteroperabilityComplaints
                       Connect Time             Connect Time


In this section, we illustrate the ontological model for representing SLAs based on KIs; we adopt UML for representing the model. First, as shown in Figure 2, an SLA is a contract between a customer and a service provider over a specific service, so our ontology-based SLA should contain all of these entities. An SLA also has constraints that the service provider has to guarantee. Therefore, SLA is the main class and aggregates three main classes: Parties, Services and Obligations. The Parties class describes the consumer and provider involved in an SLA; it aggregates other classes used to store information such as name, phone number, address, email and other related data. The Services class represents information about the offered service, such as the validity period of the SLA and the service level, which is specified through service parameters and their corresponding KPIs. The Obligations class represents the conditions that must be respected by the provider with respect to the offered service. These conditions are expressed in terms of KQIs, which define the terms under which the offered service will be monitored and evaluated. Preferentially, KQIs are expressed as functions of KPIs; availability is an example of a KQI that can be defined in terms of the MTBR and MTBF KPIs for a helpdesk service. The Obligations class also defines the penalties to be applied when the expressed conditions are violated. We define how a KQI is calculated in terms of the associated KPIs using SWRL rules [6].

Based on our model, we have developed an ontology for representing SLA specifications and presented a Web Ontology Language (OWL) based knowledge base that can be used in an autonomous SLA management system. In our study, Protégé, a free open-source ontology editor [7], together with related plug-ins such as the SWRL tab, is employed. To implement our ontology, we built the SLA OWL, and then the SWRL rules were added for inferring hidden relationships among KIs or between an SLA and a KI, calculating KQIs from KPIs, and detecting SLA violations. For modeling metric dependencies between services and processes, we focus on metrics which can be measured or calculated at runtime; examples of such metrics are response time and availability [8].

Figure 2: SLA class diagram
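As a toy illustration of the KQI-from-KPI derivation and the violation check (not the actual OWL/SWRL implementation; the availability formula below is one common definition, with MTBR playing the repair-time role, and the 0.99 target is an assumed SLA value):

    # Deriving a KQI from KPIs and checking it against an SLA target.
    def availability(mtbf_hours, mtbr_hours):
        # Uptime fraction over one failure/repair cycle.
        return mtbf_hours / (mtbf_hours + mtbr_hours)

    def violates_sla(kpis, target=0.99):
        kqi = availability(kpis["MTBF"], kpis["MTBR"])
        return kqi < target

    # Example: availability of about 0.980 misses the 0.99 target -> violation.
    print(violates_sla({"MTBF": 400.0, "MTBR": 8.0}))    # True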

Conclusion and Future Work

SOA enables the integration of services from various organizations such that organizations can easily use the services of other organizations based on specified standards, setting out contracts under those same standards. However, some external providers may offer services that do not meet the quality attribute requirements of the service consumer organization; therefore defining a Service Level Agreement and establishing SLA management mechanisms are important factors when specifying the quality requirements for achieving the mission goals of business and service-oriented environments [9]. The level of service can be specified as a target and a minimum, which allows customers to be informed of what to expect (the minimum) while providing a measurable (average) target value that shows the level of the organization's performance. SLA management allows enterprises to identify and solve performance-related problems before the business is influenced by these problems.
The KI is a key instrument for evaluating the performance of business services and detecting the state of current and completed processes. In our methodology, KIs are used for mapping business services' performance indicators to SLA parameters. With this method one can find the suitable services that satisfy business process performance requirements.
In general, being able to characterize SLA parameters has some advantages for enterprises. First, it allows for a more efficient translation of the enterprise's vision into its business processes, since these can be designed according to service specifications. Second, it allows the selection and execution of web services based on business process requirements, to better fulfill customer expectations. Third, it makes possible the monitoring of business processes based on SLAs. In order to achieve these purposes, we introduced a model to manage business services with SLAs that guarantee a certain quality of performance. In this regard, we have investigated the business services KPI hierarchy based on [5] and proposed ontology-based SLAs and SWRL rules that are used for inferring hidden relationships among KPIs and SLAs.

There are a number of restrictions and open areas in this work, explained in the following. We considered and classified almost all KPIs based on [5]; however, it remains to be verified whether this reflects the actual criteria of consumers and service providers. Another restriction concerns violations. Neither party wants the SLA to be violated: consumers want a high level of service for their key business processes, not a payment for an SLA violation, which will never compensate for the loss of business; similarly, the provider does not want to suffer the loss of market trust and credibility, which may affect many more accounts than the one affected by the SLA violation. Therefore the SLA must be in place to foster a cooperative business approach to common goals [10]. Hence, forecasting SLA violations is more appropriate than just detecting them; this is left for future work. In spite of the above-mentioned restrictions, our ontology-based SLA has some advantages: it is very easy to extend due to its use of ontologies, and the SWRL rules used for reasoning can be defined and modified dynamically without affecting other aspects of the code. One of our major research aims for the future is finding a suitable way to forecast SLA violations.

References

[1] M.P. Papazoglou and W.J. Heuvel, Service oriented architectures: approaches, technologies and research issues, The VLDB Journal 16 (2007), 389-415.
[2] G. Frankova, M. Sguran, F. Gilcher, S. Trabelsi, J. Dorflinger, and M. Aiello, Deriving business processes with service level agreements from early requirements, Journal of Systems and Software 84 (2011), 1351-1363.
[3] E. Toktar, G. Pujolle, E. Jamhour, M. Penna, and M. Fonseca, An XML model for SLA definition with key indicators, IP Operations and Management, Springer 4786 (2007), 196-199.
[4] A. Arsanjani, S. Ghosh, A. Allam, T. Abdollah, S. Ganapathy, and K. Holley, SOMA: A method for developing service-oriented solutions, IBM Systems Journal 47 (2008), 377-396.



[5] The TeleManagement Forum and The Open Group, SLA Management Handbook, Enterprise Perspective 4 (2004).
[6] I. Horrocks, P.F. Patel-Schneider, H. Boley, S. Tabet, B. Grosof, and M. Dean, SWRL: A Semantic Web Rule Language combining OWL and RuleML, W3C (2004).
[7] The Protégé ontology editor and knowledge acquisition system, available at http://www.hut.fi/ vkarpijo/netsec00/.
[8] S. Kalepu, S. Krishnaswamy, and S.W. Loke, Verity: a QoS metric for selecting web services and providers, 4th International Conference on Web Information Systems Engineering Workshops (2003).
[9] P. Bianco, G.A. Lewis, and P. Merson, Service level agreements in service-oriented architecture environments, Technical Report CMU/SEI-2008-TN-021, Carnegie Mellon (2008).
[10] B. Mitchell and P. McKee, SLAs: A key commercial tool, in P. Cunningham and M. Cunningham (eds.), Exploiting the Knowledge Economy: Issues, Applications, Case Studies, Proc. eChallenges e-2006 Conference, IOS Press 3 (2006).


Calibration of METANET Model for Real-Time Coordinated and


Integrated Highway Traffic Control using Genetic Algorithm: Tehran
Case Study
Mahnaz Aghamohaqeqi

Shiva Rahimipour

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

m.mohaqeq@aut.ac.ir

rahimipour@aut.ac.ir

Masoud Safilian

S.Mehdi Tashakori Hashemi

Amirkabir University of Technology

Amirkabir University of Technology

Department of Mathematics and Computer Science

Department of Mathematics and Computer Science

m.safilian@aut.ac.ir

hashemi@aut.ac.ir

Abstract: This paper employs a previously developed model predictive control (MPC) approach to optimally coordinate variable speed limits and ramp metering along a 2 km section of the Hemmat highway to deal with the problem of rush hour congestion. To predict the evolution of the traffic situation in this zone, an adapted version of the METANET model that takes the variable speed limits into account is used. Before using this traffic model to predict the evolution of the traffic situation, it should be calibrated in order to bring the state variables of the model into good agreement with the real values. To do this, we use a genetic algorithm. Simulation results show that the genetic algorithm is able to find optimal values for the model parameter set, so that the MPC approach yields less congestion, a higher outflow and a lower total time spent in the controlled areas.

Keywords: Model predictive control (MPC); METANET model; calibration; ramp metering; variable speed limit control; genetic algorithm.

Introduction

The notoriously increasing number of vehicles that use the provided network capacity has led to severe congestion problems, which result in serious economic and environmental damage, negative impacts on the quality of life and an increased probability of accidents. Two complementary approaches for solving the problems caused by motorway congestion are possible without diverting demand to other modes of transportation. The first one is to construct new motorways, i.e. to address the problem by providing additional capacity to the network. Land availability issues, especially in and around large metropolitan areas, and environmental considerations render this approach unattractive. The second approach is based on the fact that the capacity provided by the existing infrastructure is practically underutilized, i.e. it is not fully exploited [1]. Thus, before building new infrastructure, the full exploitation of the existing infrastructure by means of dynamic traffic management measures such as ramp metering, reversible lanes, speed limits and route guidance should be ensured.

Ramp metering is the most common way to control traffic conditions on highway networks by regulating the input flow from the on-ramps to the highway mainstream. A good overview of the different ramp


metering algorithms is found in [2]. However, the effectiveness of this method is reduced when the demand from the on-ramp is high and the traffic in the upstream mainline is getting dense [3]. In such circumstances, a ramp meter cannot prevent or even alleviate the congestion by itself, because even a small flow from the on-ramp can cause a breakdown, after which congestion forms, especially where the capacity of the on-ramp is limited. That is because ramp metering only controls the inflow from the on-ramp into the mainline; the collective behavior of the drivers on the mainline of the highway is not controlled by it. This is why ramp metering alone cannot appropriately control highway traffic in practice, and employing other control strategies such as variable speed limits is needed.

Variable speed limit control is a particular dynamic traffic management measure that aims to simultaneously improve both the traffic safety and the traffic performance (e.g., minimizing the total time spent) of a highway network, by dynamically computing an optimal set of speeds for the controlled segments and displaying those variable speed limits on variable message signs (VMSs). Variable speed limit control attempts to influence the collective vehicle speed, i.e. the driver behavior, on the mainline, and in this regard is complementary to ramp metering [4]. On the other hand, as shown in [3], placing speed limits just before the on-ramp can help reduce the outflow of the controlled segments, so that there is some space left to accommodate the traffic from the on-ramp; in this way, traffic breakdown can be prevented or delayed. These are the motivations for using different control strategies in a coordinated scheme. References [5-8] are examples of works that consider both variable speed limits and ramp metering, which are believed to be the two key tools influencing conditions on congested highways.

As noted above, for a given traffic network a combination of various traffic control strategies has the potential to achieve better performance than when they are implemented separately. Besides, the latest advances in computer and communication technologies have made it feasible and financially viable to implement these automatic control tools in a coordinated scheme to improve real-world traffic conditions [8]. Modern optimal control techniques such as MPC, a model-based optimization control strategy, seem appropriate for this purpose [3]. Thus, we employ the previously developed model predictive control (MPC) approach to find the control settings for a group of controllers in the 2 km section of the Hemmat highway, consisting of the combination of ramp meters and variable speed limit signs, in order to minimize the total time spent (TTS) by all vehicles in this site.

One of the major difficulties in implementing a model-based optimization control strategy is that the model parameters are difficult to calibrate. To address this issue, a genetic algorithm is used to tune the model parameters.

The arrangement of this article is as follows. In Section 2, the basics of the MPC scheme are introduced. In Section 3, the traffic flow model (prediction model) is introduced. The tuning process of the model parameters based on the genetic algorithm is explained in Section 4. In Section 5, the introduced method is applied to the 2-km section of the eastbound Hemmat highway selected as the study network. Section 6 summarizes the main conclusions.

Model Predictive Control

We consider the problem of finding the best control settings for a group of controllers in the study network, consisting of ramp meters and a set of variable speed limit signs. The control objective is to minimize the total time spent (TTS) by all vehicles in the study network. To do this, we use the previously proposed model predictive control (MPC) approach.

The core idea of MPC is its use of a dynamic model to predict the future behavior of the system at each optimization step, in order to avoid making myopic control decisions. In this paper, we have utilized MPC as an online method to optimally control the traffic flow in a part of the Hemmat highway, with the system states being predicted by a macroscopic traffic flow model [8]. We assume that the reader is familiar with the basic ingredients of the MPC approach. Nevertheless, the following paragraphs provide a brief description of the MPC framework introduced in [9].

Consider a traffic network with $N$ controllers over a specific time horizon. The time horizon is divided into $P$ large control intervals, each subdivided into $M$ small intervals (called system simulation steps). It is assumed that over each control interval the control variables are kept the same, whereas the system state changes at every simulation step. Let $k_c$ be the index for the large intervals ($k_c = 1, 2, \ldots, P$) and $k$ the index for all the subintervals ($k = 1, 2, \ldots, MP$) [8]. The transition of the system state can be expressed as follows:

$$x(k+1) = f(x(k), u(k), d(k))$$

where $x(k)$, $u(k)$, and $d(k)$ are vectors representing


the system state, the control decisions, and the disturbance at time $k$. At each control step $k_c$, a new optimization is performed to compute the optimal control decisions $u(k_c)$ for the time period $[k_c, \ldots, k_c + P - 1]$, in which $P$ is the prediction horizon, e.g.,

$$\begin{pmatrix} u_1(k_c) & u_1(k_c+1) & \cdots & u_1(k_c+P-1) \\ \vdots & \vdots & \ddots & \vdots \\ u_N(k_c) & u_N(k_c+1) & \cdots & u_N(k_c+P-1) \end{pmatrix}$$

To reduce the computational complexity, a control horizon $C$ ($C < P$) is usually defined to represent the time horizon beyond which the control signal is considered to be fixed, i.e., $u(k_c) = u(C-1)$ for $k_c > C$. Therefore, for $N$ controllers, the $N \times C$ matrix of optimal controls $u(k_c)$ would be

$$\begin{pmatrix} u_1(k_c) & u_1(k_c+1) & \cdots & u_1(k_c+C-1) \\ \vdots & \vdots & \ddots & \vdots \\ u_N(k_c) & u_N(k_c+1) & \cdots & u_N(k_c+C-1) \end{pmatrix}$$

Only the first optimal control signal $u_i(k_c)$, $i = 1, 2, \ldots, N$ (the first column) is applied to the real system; after shifting the prediction and control horizons one step forward and feeding the current observed states of the real system to the model, the process is repeated. This feedback is necessary to correct for the prediction errors and system disturbances that may make the system deviate from the model prediction. Since we have to work with a nonlinear system (the traffic model), in each control time step $k_c$ a nonlinear program has to be solved to find the $N \times C$ optimal solutions before the next control time step ($k_c + 1$) is reached [8]. For more information about the MPC approach see [10] and the references therein.
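To make the receding-horizon mechanics above concrete, the following minimal Python sketch mirrors the loop just described. The model f, the stage cost, the disturbance predictions and the use of SciPy's general-purpose minimizer as a stand-in for the nonlinear program are all illustrative assumptions, not the authors' implementation.

import numpy as np
from scipy.optimize import minimize

def mpc_step(f, x0, d_pred, u_prev, N, C, M, cost):
    """One MPC optimization at control step k_c (illustrative sketch).

    f      : state transition x(k+1) = f(x, u, d)
    x0     : current observed state
    d_pred : predicted disturbances, one per simulation step (M*P entries)
    cost   : stage cost, e.g. a total-time-spent term
    """
    def objective(u_flat):
        u_seq = u_flat.reshape(N, C)          # one row per controller
        x, J = x0, 0.0
        for j in range(len(d_pred)):
            jc = min(j // M, C - 1)           # hold u fixed beyond control horizon C
            x = f(x, u_seq[:, jc], d_pred[j])
            J += cost(x)
        return J

    res = minimize(objective, u_prev.flatten(), method="Nelder-Mead")
    u_opt = res.x.reshape(N, C)
    return u_opt[:, 0], u_opt                 # apply only the first column

At run time, only the first column is sent to the ramp meters and speed-limit signs; the rest of the plan is discarded once the new measurements arrive and the optimization is repeated.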

Prediction model

The traffic flow model used here to predict the future behavior of the traffic system is the extended version of the METANET model for speed limits. METANET is a macroscopic traffic model that is discrete in both space and time. The model represents the network by a directed graph whose links correspond to highway stretches. A highway link $m$ is divided into $N_m$ segments (indicated by the index $i$) of length $l_{m,i}$, with $n_m$ lanes. Each segment $i$ of link $m$ at the time instant $t = kT$, $k = 0, \ldots, K$, is macroscopically characterized by the traffic density $\rho_{m,i}(k)$ (veh/lane/km), the mean speed $v_{m,i}(k)$ (km/h) and the traffic volume $q_{m,i}(k)$ (veh/h). Each link has uniform characteristics, i.e. no on-ramp or off-ramp and no major changes in geometry. The nodes of the graph are placed between links where major changes in the road geometry, such as on-ramps and off-ramps, occur. The time step used for simulation is denoted by $T$ [11]. Table 1 describes the notation of the METANET model [11]. The traffic stream models that capture the evolution of traffic on each segment at each time step are shown in Table 2 [11].


Table 1: Notation used in the METANET model

m: link index
i: segment index
T: simulation step size
k: time step counter
ρ_{m,i}(k): density of segment i of highway link m
v_{m,i}(k): speed of segment i of highway link m
q_{m,i}(k): flow of segment i of highway link m
N_m: number of segments in link m
n_m: number of lanes in link m
l_{m,i}: length of segment i in link m
τ: time constant of the speed relaxation term
κ: speed anticipation parameter (veh/km/lane)
ν: speed anticipation parameter (km²/h)
a_m: parameter of the fundamental diagram
ρ_crit,m: critical density of link m
V(ρ_{m,i}(k)): speed of segment i of link m on a homogeneous highway as a function of ρ_{m,i}(k)
ρ_max,m: maximum density of link m
v_free,m: free-flow speed of link m
w_o(k): length of the queue on on-ramp o at time step k
q_o(k): flow that enters the highway at time step k
d_o(k): traffic demand at origin o at time step k
r_o(k): ramp metering rate of on-ramp o at time step k
Q_o: on-ramp capacity
δ: parameter of the speed drop term caused by merging at an on-ramp
v_control,m,i: speed limit applied in segment i of link m
α: parameter expressing the disobedience of drivers with the displayed speed limits


Table 2: Link equations and descriptions

Flow-density equation:
$$q_{m,i}(k) = \rho_{m,i}(k)\, v_{m,i}(k)\, n_m$$

Conservation of vehicles:
$$\rho_{m,i}(k+1) = \rho_{m,i}(k) + \frac{T}{l_{m,i} n_m}\left[q_{m,i-1}(k) - q_{m,i}(k)\right]$$

Speed dynamics (relaxation term: drivers try to achieve the desired speed V(.); convection term: speed decrease (increase) caused by the inflow of vehicles; anticipation term: speed decrease (increase) as drivers experience a density increase (decrease) downstream):
$$v_{m,i}(k+1) = v_{m,i}(k) + \frac{T}{\tau}\left(V[\rho_{m,i}(k)] - v_{m,i}(k)\right) + \frac{T}{l_{m,i}}\, v_{m,i}(k)\left[v_{m,i-1}(k) - v_{m,i}(k)\right] - \frac{\nu T}{\tau\, l_{m,i}}\, \frac{\rho_{m,i+1}(k) - \rho_{m,i}(k)}{\rho_{m,i}(k) + \kappa}$$

Speed-density relation (fundamental diagram):
$$V[\rho_{m,i}(k)] = v_{free,m}\, \exp\left(-\frac{1}{a_m}\left(\frac{\rho_{m,i}(k)}{\rho_{crit,m}}\right)^{a_m}\right)$$

Origin queuing model:
$$w_o(k+1) = w_o(k) + T\left[d_o(k) - q_o(k)\right]$$

Ramp outflow equation (the outflow depends on the traffic condition in the mainstream and on the metering rate $r_o(k) \in [0,1]$):
$$q_o(k) = \min\left[d_o(k) + \frac{w_o(k)}{T},\; Q_o\, r_o(k),\; Q_o\, \frac{\rho_{max,m} - \rho_{m,1}(k)}{\rho_{max,m} - \rho_{crit,m}}\right]$$

Speed limit model (the desired speed is the minimum of the speed determined by the fundamental diagram and the speed limit displayed on the variable message sign (VMS)):
$$V[\rho_{m,i}(k)] = \min\left[v_{free,m}\, \exp\left(-\frac{1}{a_m}\left(\frac{\rho_{m,i}(k)}{\rho_{crit,m}}\right)^{a_m}\right),\; (1+\alpha)\, v_{control,m,i}(k)\right]$$

Speed drop caused by merging (if there is an on-ramp, this term must be added to the speed dynamics):
$$-\frac{\delta\, T\, q_o(k)\, v_{m,1}(k)}{l_{m,1}\, n_m\, \left(\rho_{m,1}(k) + \kappa\right)}$$
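As an illustration of how the equations in Table 2 advance the state of one segment, the following Python sketch implements a single METANET update for an interior segment. The surrounding data structures and parameter dictionary are assumptions made for the example, not the paper's code.

import math

def metanet_segment_step(rho, v, rho_down, v_up, q_in, p, T):
    """One METANET update for an interior segment (illustrative sketch).

    rho, v   : current density (veh/km/lane) and speed (km/h) of the segment
    rho_down : density of the downstream segment
    v_up     : speed of the upstream segment
    q_in     : inflow from the upstream segment (veh/h)
    p        : parameters: tau, kappa, nu, a, rho_crit, v_free, length, lanes
    T        : simulation time step (h)
    """
    # Fundamental diagram: desired speed at the current density
    V = p["v_free"] * math.exp(-(1.0 / p["a"]) * (rho / p["rho_crit"]) ** p["a"])

    # Speed dynamics: relaxation + convection + anticipation terms
    v_next = (v
              + (T / p["tau"]) * (V - v)
              + (T / p["length"]) * v * (v_up - v)
              - (p["nu"] * T / (p["tau"] * p["length"]))
                * (rho_down - rho) / (rho + p["kappa"]))

    # Outflow of this segment and conservation of vehicles
    q_out = rho * v * p["lanes"]
    rho_next = rho + (T / (p["length"] * p["lanes"])) * (q_in - q_out)
    return rho_next, max(v_next, 0.0), q_out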

Calibration of the METANET model parameters

The model calibration procedure aims at enabling the model to represent traffic conditions with sufficient accuracy. The macroscopic model presented in Section 3 includes a number of parameters that reflect particular characteristics of a given highway stretch and depend upon the highway geometry, vehicle characteristics, drivers' behavior, etc. These parameters should be calibrated to fit a representative set of real data with the maximum possible accuracy. For this purpose the traffic simulator Aimsun is used: data derived from this simulator serve as real-world data to be compared with the model data. The purpose of the calibration is to minimize the difference between the real data and the model data. To this end, the genetic algorithm toolbox implemented in Matlab is employed.

A genetic algorithm starts with an initial set of random solutions called a population. Each individual in the population is called a chromosome, representing a solution to the calibration problem. The evolution operation simulates the process of Darwinian evolution, creating populations from generation to generation by selection, crossover and mutation operations. The success of the genetic algorithm is founded in its ability to keep the existing parts of a solution which have a positive effect on the outcome [12].

The seven parameters of the METANET model (v_free, τ, κ, ν, δ, a_m, ρ_crit) are tuned by the genetic algorithm. To compromise between computation time and precision, 30 individuals are used. After creating a new population, the fitness value has to be calculated for each member of the population, and the members are then ranked by fitness. The genetic algorithm selects parents from the current population using a selection probability. Then the reproduction

of children from the selected parents occurs by recombination and mutation. The cycle of evaluation, selection and reproduction terminates when the convergence criterion is met [11].

4.1 Fitness Function

The calibration is an optimization procedure that minimizes the difference between the real data coming from Aimsun and the data coming from the METANET model. In particular, we try to minimize the following objective function:

$$\sum_{h=0}^{N_{samp}} \sum_{(m,i) \in I_{all}} \left(q_{m,i}^{model}(h) - q_{m,i}^{sim}(h)\right)^2 + \left(v_{m,i}^{model}(h) - v_{m,i}^{sim}(h)\right)^2 + \left(\rho_{m,i}^{model}(h) - \rho_{m,i}^{sim}(h)\right)^2 \quad (1)$$

where $N_{samp}$ is the number of simulation time steps in the entire simulation period and $I_{all}$ is the set of indexes of all pairs of links and segments.
4.2 Results of Model Calibration

For the calibration procedure, one measurement set, corresponding to one weekday from 7 a.m. to 11 a.m., was available from the study site; our data collecting tool was the Aimsun simulator. These data provided flow, speed and density measurements on a ten-second-by-ten-second basis. The genetic algorithm produced a set of optimal parameters; the summarized outcome of this effort is presented in Table 3.

Table 3: Parameter set for the Hemmat highway
ρ_crit,m = 32.1646, v_free = 92.1957 km/h, δ = 0.08649, τ = 13.839 s, ν = 31.6307 km²/h, κ = 56.0935 veh/km, a_m = 2.425.

Based on the set of parameters shown in Table 3, Fig. 1 depicts the speed, density and flow trajectories determined by the calibrated model, compared with the actual measurements. As can be seen in Fig. 1, after calibrating the model parameters the model is properly able to predict the network traffic conditions.

Figure 1: Segment 2, measured versus predicted flow, speed and density; qualitative validation.

Case Study

A 2-km section of the eastbound Hemmat highway was selected as the study network. The Hemmat highway serves a large volume of commuter traffic in both the morning and evening peak periods, leading to heavy recurrent congestion. For these reasons, we consider this 2-km section an ideal study site on which to apply the control framework presented above in order to alleviate serious congestion problems. The network topology and the location of the control equipment and sensors can be seen in Fig. 2.

Figure 2: Candidate traffic network.

The objective function used in this paper is the TTS spent by all vehicles, extended with penalty terms, as defined in

$$J = T \sum_{j=k}^{k+P-1} \left[\sum_{m,i} \rho_{m,i}(j)\, l_{m,i}\, n_m + \sum_{o} w_o(j)\right] + \lambda_{ramp} \sum_{o \in O_{ramp}} \sum_{j=k}^{k+P-1} \left(r_o(j) - r_o(j-1)\right)^2 + \lambda_{speed} \sum_{i \in I_{speed}} \left(\frac{v_i(j) - v_i(j-1)}{v_{free}}\right)^2 + \lambda_{queue} \sum_{o \in O_{ramp}} \left(\max(w_o - w_{max}, 0)\right)^2 \quad (2)$$

For the MPC system, the optimal prediction and control horizons were found to be approximately 60 and 48 steps, corresponding to 10 and 8 min, respectively.


The time step for control updates was set to 1 min, which means that every minute the optimal control must be computed and applied to the traffic system. The simulation results, from 7 a.m. to 9 a.m., for the no-control and MPC cases are shown in Figs. 3 and 4. The TTS in the no-control case was 2482.1 veh.h. The TTS in the controlled case was 2192.7 veh.h, which showed an 11.6584% improvement compared with the no-control case.
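For concreteness, a minimal sketch of how the TTS term of Eq. (2) can be evaluated from simulated trajectories is given below; the array shapes are assumptions made for the example.

import numpy as np

def total_time_spent(rho, w, lengths, lanes, T):
    """TTS over a horizon: T * sum_j [ sum_{m,i} rho*l*n + sum_o w ]  (veh.h).

    rho     : density trajectories, shape (steps, segments), veh/km/lane
    w       : on-ramp queue lengths, shape (steps, ramps), veh
    lengths : segment lengths in km, shape (segments,)
    lanes   : lane counts per segment, shape (segments,)
    T       : simulation step in hours
    """
    mainline = (rho * lengths * lanes).sum(axis=1)  # vehicles on the mainline per step
    queues = w.sum(axis=1)                          # vehicles queued per step
    return T * (mainline + queues).sum()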

Conclusions and Future Work

In this paper, the model predictive control approach has been used to address the problem of congestion control in the selected part of the Hemmat highway. The METANET model, which is used in the prediction step of MPC, was calibrated with a genetic algorithm to enable the model to represent traffic conditions with sufficient accuracy. Based on the simulation results, the MPC approach yields less congestion, a higher outflow and a lower total time spent in the controlled areas. For future work, we will focus on testing the MPC approach on a larger part of Tehran's traffic network, including more traffic controllers, to investigate the proficiency of this method as the number of traffic controllers increases.

Figure 3: The simulation results for the no-control case: segment traffic density, segment traffic speed, segment traffic flow and origin queue length.

Figure 4: Simulation results for the control case: segment traffic density, segment traffic speed, segment traffic flow, origin queue length, optimal ramp metering rates and optimal speed limit values.

References

[1] A. Kotsialos, M. Papageorgiou, C. Diakaki, Y. Pavlis, and F. Middelham, Traffic flow modeling of large-scale motorway networks using the macroscopic modeling tool METANET, IEEE Transactions on Intelligent Transportation Systems 3 (2002), 282-292.
[2] A. Kotsialos and M. Papageorgiou, Ramp metering: An overview, IEEE Transactions on Intelligent Transportation Systems (2002), 271-281.
[3] A. Hegyi, B. De Schutter, and H. Hellendoorn, Model predictive control for optimal coordination of ramp metering and variable speed limits, Transport. Res. C 13 (2005), 185-209.
[4] X. Lu, T. Qiu, P. Varaiya, R. Horowitz, and S. E. Shladover, Combining variable speed limits with ramp metering for freeway traffic control, American Control Conference (2010).
[5] A. Alessandri, A. Di Febbraro, A. Ferrara, and E. Punta, Optimal control of freeways via speed signaling and ramp metering, Control Engineering Practice 6 (1998), 771-780.
[6] C. Caligaris, S. Sacone, and S. Siri, Optimal ramp metering and variable speed signs for multiclass freeway traffic, Proc. of European Control Conference, Kos, Greece (2007).
[7] I. Papamichail, K. Kampitaki, M. Papageorgiou, and A. Messmer, Integrated ramp metering and variable speed limit control of motorway traffic flow, 17th IFAC World Congress, Seoul, Korea (2008).
[8] A. Ghods, L. Fu, and A. Rahimi Kian, An efficient optimization approach to real-time coordinated and integrated freeway traffic control, IEEE Transactions on Intelligent Transportation Systems (2010).
[9] A. Hegyi, Model predictive control for integrating traffic control measures (2002).
[10] A. Kotsialos, M. Papageorgiou, and A. Messmer, Integrated optimal control of motorway traffic networks, American Control Conference (ACC), San Diego (1999), 2183-2187.
[11] A. Ghods, A. Rahimi Kian, and M. Tabibi, Adaptive freeway ramp metering and variable speed limit control: A genetic-fuzzy approach, IEEE Intelligent Transportation Systems Magazine (2009).
[12] D. Goldberg, Genetic Algorithms in Search, Optimization and Machine Learning (1989).


Designing An Expert System To Diagnose And Propose About Therapy Of Leukemia

Armin Ghasem Azar (a.ghasemazar@iasbs.ac.ir) and Zohreh Mohammad Alizadeh Bakhshmandi (z.alizadeh@iasbs.ac.ir)

Department of Computer and Information Sciences, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran

Corresponding author, P. O. Box 45195-1159, M: (+98) 914 306-0594, T: (+98) 241 415-5056

Abstract: Expert systems are designed for non-expert individuals with the aim of providing the skills of qualified personnel. These programs simulate the pattern of thinking and the manner in which a human operates, making the operation of an expert system close to that of a human expert. A variety of expert systems have already been offered in the field of medical science, which is in this respect one of the leading application areas. Leukemia is a very common and serious cancer that starts in blood tissue such as the bone marrow. It causes large numbers of abnormal blood cells to be produced and enter the blood. Speed is always effective in the diagnosis and treatment of Leukemia and in the recovery of patients, but sometimes patients have no access to specialists; for this reason, designing a system with specialist knowledge that offers a diagnosis and appropriate treatment provides timely treatment of patients. In this paper an expert system for the diagnosis of Leukemia, built with the VP-Expert shell, is presented.

Keywords: Expert System of Leukemia; Diagnosis; Therapy.

Introduction

With the expanding application of information technology, decision making systems, or generally decisions based on computers, have become very important. In this regard expert systems, as one of the fields attributed to artificial intelligence, play the main role. All kinds of decisions in expert systems are taken with the help of computers. Expert systems are knowledge-based systems, and knowledge is their most important part. In these systems, knowledge is transferred from the experts of a science to the computer. Expert systems have been used extensively in various sciences. So far, various expert systems have been designed and presented in areas such as industry, space travel, financial decision making, etc. The use of expert systems has also found its way into the medical world [1]. DENDRAL was presented in 1965 to describe and explain molecular structure [2], MYCIN was presented in 1976 to diagnose bacterial infections [3], and other expert systems to detect acid and electrolyte disorders, train in the management of anaesthesia, or diagnose diseases of internal medicine are of this category [6].

The purpose of this article is to present an expert system to diagnose Leukemia and propose practices for its therapy. The issue will be discussed in more detail, then the stages of the system's construction and its components will be expressed, and finally the operation of the designed system will be described with an applied example.

A medical expert system is a computer program that offers effective aid in making decisions about the diagnosis of diseases and suggestions on the treatment method. Diagnosis of the disease and prediction of complications is done after the program receives the patient's information. This information is usually transmitted through the patient to the physician. Medical expert systems have features that distinguish them from other medical applications. One aspect of this difference is that these systems mimic the arguments of an expert physician, step by step, in order to achieve accurate results. In most cases, the specialist using this software is aware


of these sequential arguments.

Leukemia is one of the most important cancers that human society has been involved with. There is usually no definite sign of Leukemia, and when symptoms appear they are very ambiguous and complex and are too similar to the symptoms of flu. An expert system can be designed that diagnoses Leukemia in view of the above symptoms and suggests specific treatments. Using expert software systems has advantages such as:

Individuals have fleeting and transient expertise: a person may change his job, become sick, etc., but the computer has permanent expertise;
A person does not have stable expertise: an expert may have holidays, recreation programs, etc., all of which adversely affect his normal function, but computers are stable and, under the same conditions, offer the same outputs;
Expert systems have the ability to be upgraded.

Some other advantages that expert systems can create include:

High performance;
Full and fast performance time;
Good reliability;
Being understandable;
Flexibility;
Risk reduction;
Durability and survival;
Existence of multiple specialities.

The aim of the project leading to this article is to take advantage of a software system in order to achieve all the benefits of an expert system for diagnosing the disease and proposing how to treat Leukemia [6, 8].

Survey Method

The VP-EXPERT expert shell has been used to design the mentioned expert system. This software was presented in 1993 by the World Tech Systems Company in America as a tool for developing rule-based expert systems. Its features include [7, 14]:


Ability to create a knowledge base file with a simple table;
Chaining capability to link together multiple knowledge bases;
Automatic generation of the questions whose answers are needed to reach a result;
A relatively diverse set of mathematical functions;
Instructions that ask the expert system to explain its activities during a consultation.

2.1 Stages of system construction

Prototyping is one of the most common design methods used by builders of expert systems. In this method, systems that are not yet ready to be formally delivered are provided to users to obtain the necessary feedback, and the necessary modifications are made to the system. This method involves three stages, analysis, design and implementation, which are repeated together [13]. The prototype method is also used in this article. Therefore, the purposes and objectives of the expert system are first defined, and then the related research and the identification of hardware, software and related experience are gradually carried out. Next, the environment of the expert system is described and the conceptual analysis and design of the system is done; in fact, a kind of feasibility study is performed. In the next stage, the components of the expert system are determined, and the software that can support these components is surveyed and selected. Finally, the system is built and the components are put together.

2.2 Components Of Expert System

The expert system for diagnosing and proposing about Leukemia, like any expert system, is composed of three main components:

Knowledge base management subsystem;
Interface management subsystem;
Inference engine subsystem.

Figure 1: the relationship between various components in expert systems [6]

The schematic view of the components of an expert system is shown in Figure 1. In the following, all three components of the designed system are described [6, 7, 13].

2.3 Knowledge Base Subsystem

The block and Mockler diagrams are used to build the knowledge base of the mentioned system. Block diagrams are graphs in which the main tasks of the system are determined, and they are very suitable for expressing the relationship between agents and targets. The block diagram for the diagnosis of Leukemia at the first level is composed of three parts: blood test, symptoms of the disease and time of disease onset. Block diagrams do not help in writing the rules, because they do not have the necessary details for this work. In this regard, a diagram is necessary that specifies the relationship between the factors affecting the goal by specifying the questions, rules and recommendations. The first level of the Mockler diagram for the diagnosis of Leukemia is shown in Figure 2.

Figure 2: Mockler diagram related to diagnosis of Leukemia [6]

As shown, the questions about the duration of the disease are placed on the straight line, and the options related to the questions can be seen under the same line. After the questions, and the options with which the user should answer every question, are determined by drawing the Mockler diagram, the results and the various situations that the user may bring about in response to any question can be determined. For this purpose, three decision tables are used: one to identify the patient, one to deduce the type of blood test result, and one to deduce the type of symptoms.

2.4 Inference Engine Subsystem

In rule-based systems, the inference engine works by selecting a rule for testing and checking whether or not the conditions of this rule hold. These conditions may be assessed by questioning the user or may be derived from facts obtained during earlier interviews. When the conditions of a rule hold, the conclusion of that rule is taken to be correct: the rule is activated, and its result is added to the knowledge base.

2.5 The User Interface Subsystem

The user interface of an expert system should normally have a high degree of interactivity, so that the information exchange takes the form of a conversation between an applicant and a human expert [8]. The VP-Expert shell has a user interface in which questions are asked of the user based on the rules of the system's knowledge base; based on the answers the user gives the system, the necessary conclusions are drawn and, at the end, an answer is offered to the user. In the next section, the working process of the expert system is described with a practical example.

2.6 Implementation

Consider a man who suddenly develops vomiting, headache, anemia and splenomegaly, and whose blood test shows that the PLT is 19,000, the WBC 3,000 units, the RBC 5, the HCT 0.30 and the amount of hemoglobin 11 units. This person plans to investigate his disease status (or lack of it) and its kind with the designed expert system. A view of the user interface and the answer of the designed system is shown in Figure 3. After the diagnosis of the disease, the system provides ways of treating it.
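To illustrate the rule-selection cycle of Sections 2.4-2.6 outside the VP-Expert shell, here is a minimal forward-chaining sketch in Python. The rules and thresholds are simplified placeholders, not the system's actual knowledge base.

# Minimal forward-chaining engine in the spirit of Section 2.4 (illustrative only;
# the real knowledge base lives in VP-Expert and is far more detailed).
RULES = [
    # (condition over facts, (variable, value) concluded)
    (lambda f: f.get("PLT", 1e9) < 150_000 and f.get("WBC", 1e9) < 4_000,
     ("blood_test", "abnormal")),
    (lambda f: {"vomiting", "headache", "anemia", "splenomegaly"} <= f.get("symptoms", set()),
     ("symptom_pattern", "suspicious")),
    (lambda f: f.get("blood_test") == "abnormal" and f.get("symptom_pattern") == "suspicious",
     ("diagnosis", "possible leukemia: refer for marrow examination")),
]

def forward_chain(facts):
    """Fire rules until no new conclusion can be added to the fact base."""
    changed = True
    while changed:
        changed = False
        for condition, (var, value) in RULES:
            if var not in facts and condition(facts):
                facts[var] = value          # the activated rule adds its result
                changed = True
    return facts

# The consultation of Section 2.6, expressed as an initial fact base
facts = {"PLT": 19_000, "WBC": 3_000, "RBC": 5, "HCT": 0.30, "Hb": 11,
         "symptoms": {"vomiting", "headache", "anemia", "splenomegaly"}}
print(forward_chain(facts)["diagnosis"])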


Figure 3: The question asked by the VP-Expert system about the vomiting symptom

Discussion And Conclusion

In this article, an expert system to diagnose Leukemia and recommend a treatment method was presented. For this purpose, the objectives and targets of the expert system were first defined; then the review of relevant research and the identification of hardware, software and related experience were carried out, and the environment of the expert system was described. Then the conceptual design and system analysis, in fact a kind of feasibility study, was conducted. In the next step, the components of the expert system were determined, and the VP-Expert shell was chosen as software that can support those components.

It is noteworthy that one should try to provide systems that can simulate the behaviour of expert people, but this is not always possible. One defect of the designed system is that clinical evaluation is not possible: the system acts only on the user's responses and cannot verify the correctness of the responses received from the user.

References

[1] Durkin J., Expert Systems: Design and Development, Prentice Hall, New York, 1994.
[2] Buchanan B. G. and Feigenbaum E. A., DENDRAL and Meta-DENDRAL: Roots of knowledge systems and expert system applications, Artificial Intelligence 59 (1993), 233-240.
[3] Shortliffe E. H., Computer-based Medical Consultations: MYCIN, Elsevier Science Publishers, New York (1976).
[4] Siyadat M and Soltanianzadeh H, Hippocampus location in the human brain in the MRI process by expert systems, Journal of Engineering, Faculty of Tehran University 341 (2001), 923.
[5] Hatzilygeroudis P, Vassilakos J, and Tsakalidis A, XBONE: A hybrid expert system supporting diagnosis of bone diseases, Medical Informatics Europe '97, Proceedings, London (1997).
[6] Ghazanfari M and Kazemi Z, Expert Systems, Elmo Sanat, Tehran, 2004.
[7] Elahi Sh and Rajabzadeh A, Expert Systems: Intelligent Decision Making Pattern, Bazargani, Tehran, 2004.
[8] Darlington K and Motameni H (translator), Expert Systems, Olomeh Rayaneh, Tehran, 2003.
[9] Babamohammadi H, Internal Surgery Nursing, Boshra, Tehran, 2009.
[10] Bahadori M, Robbins Pathology, Andisheh Rafi, Tehran, 2006.
[11] Robbins SL, Historical Specificity, Andisheh Rafi, Tehran, 1998.
[12] Shahbazi K, What is a Cancer, Elmo Sanat, Tehran, 2010.
[13] Turban E, Aronson JE, and Liang TP, Decision Support Systems and Intelligent Systems, 7th ed., Prentice Hall, New York, 2005.
[14] Simonovic SP, User Manual of VP Expert: Rule based expert system development tool, Word Tech System, London, 1993.


A Basic Proof Method For The Verification, Validation And Evaluation Of Expert Systems

Armin Ghasem Azar (a.ghasemazar@iasbs.ac.ir) and Zohreh Mohammad Alizadeh Bakhshmandi (z.alizadeh@iasbs.ac.ir)

Department of Computer and Information Sciences, Institute for Advanced Studies in Basic Sciences, Zanjan, Iran

Corresponding author, P. O. Box 45195-1159, M: (+98) 914 306-0594, T: (+98) 241 415-5056

Abstract: In the present paper, a basic proof method is provided for representing the verification, validation and evaluation of expert systems. The result provides an overview of the basic method of formal proof: partition larger systems into small systems, prove correctness of the small systems by non-recursive means, and prove that the correctness of all subsystems implies the correctness of the entire system.

Keywords: Expert System; Partition; Non-recursive.

Introduction

An expert system is correct when it is complete, consistent, and satisfies the requirements that express expert knowledge about how the system should behave.
For real-world knowledge bases containing hundreds of
rules, however, these aspects of correctness are hard to
establish. There may be millions of distinct computational paths through an expert system, and each must
be dealt with through testing or formal proof to establish correctness.
To reduce the size of the tests and proofs, one useful
approach for some knowledge bases is to partition them
into two or more interrelated knowledge bases. In this
way the VV&E problem can be minimized [1].

Overview of proofs using partitions

The basic method of proving each of these aspects of correctness is essentially the same. If the system is small, a technique designed for proving the correctness of small systems should be used. If the system is large, a technique for partitioning the expert system must be applied, the required conditions for applying the partition to the system as a whole should be proven, and the correctness of any subsystem required by the partition must be ensured. Once this has been accomplished, this basic proof method should be applied recursively to the sub-expert systems.

Once the top level structure of the Knowledge base has been validated, the following criteria must be met to show the correctness of the expert system [6]:

Show that the Knowledge base and inference engine implement the top level structure;
Prove any required relationships among sub-expert systems or parts of the top level Knowledge representation;
Prove any required properties of the sub-Knowledge bases.

2.1 A simple example

To illustrate the basic proof method, Knowledge Base 1, shown in Table 1, will be proved correct, although this Knowledge base is small enough to verify by inspection.



Table 1: Knowledge Base 1 [7]

Rule 1: If Risk tolerance = high AND Discretionary income exists = yes, then Investment = stocks.
Rule 2: If Risk tolerance = low OR Discretionary income exists = no, then Investment = bank account.
Rule 3: If Do you buy lottery tickets = yes OR Do you currently own stocks = yes, then Risk tolerance = high.
Rule 4: If Do you buy lottery tickets = no AND Do you currently own stocks = no, then Risk tolerance = low.
Rule 5: If Do you own a boat = yes OR Do you own a luxury car = yes, then Discretionary income exists = yes.
Rule 6: If Do you own a boat = no AND Do you own a luxury car = no, then Discretionary income exists = no.


2.1.1 Knowledge Base 1

Knowledge Base 1 (KB1) has six rules. There are seven variables, each of which can take two possible values. It is therefore a seven-dimensional, binary problem [5].

2.1.2 Illustrations of Knowledge Base 1

Let us focus on Rule 3 to understand the illustrations of KB1. It has two hypotheses and one conclusion. The hypotheses are "Do you buy lottery tickets? = yes" and "Do you currently own stock? = yes". They are associated with the logical operator "or". The consequent is "Risk tolerance = high". This is illustrated in Figure 1. For the two variables of the hypotheses in Rule 3, there are two possible values: yes or no. The number of possible combinations of values for the variables is four. These four combinations appear in Figure 1 as four square regions defined by the closed boundary (defining the domain of the variables) and the line boundaries separating the possible values for each variable. Each square is a Hoffman region. In two dimensions, a Hoffman region is a surface, as shown in this example; in three dimensions, it would be a volume.

If the variable "Do you buy lottery tickets" is assigned the value yes, then two of the four regions are relevant; in Figure 1.a, they are shown hatched. The two regions corresponding to the hypothesis "Do you currently own stock? = yes" are hatched in Figure 1.b.

Figure 1: Knowledge Base 1 [7]

The logical operators are "and", "or" and "not". When two hypotheses are combined with an "and" operator, the intersection of the two sets of Hoffman regions is taken; this is shown in Figure 2.a, where the intersection is a unique Hoffman region. In Rule 3, an "or" operator connects the two hypotheses; in this case, the union of the two sets of Hoffman regions is taken, as shown in Figure 2.b.

Figure 2: Knowledge Base 1 [7]


Next, the region defined by the logical expression of the hypotheses is labelled with its rule. For Rule 3, the three Hoffman regions are labelled with a circled 3, as shown in Figure 3.a. The consequent of the rule is linked to the label of the region of the hypotheses: in Figure 3.b, an arrow starts at the circled 3 and ends at the value low of the variable Risk tolerance.

Figure 3: Knowledge Base 1 [7]

2.2 Step 1-Determine Knowledge Base structure

To prove the correctness of Knowledge Base 1 (KB1), the expert can determine that the system represents a 2-step process [3]:

Find the values of some important intermediate variables, such as risk tolerance and discretionary income;
Use these values to assign a type of investment.

KB1 was built using this Knowledge; therefore, it can be partitioned into the following pieces:

A subsystem to find risk tolerance (part of Step 1);
A subsystem to find discretionary income (part of Step 1);
A subsystem to find the type of investment given this information (part of Step 2).

2.3 Step 2-Find Knowledge Base partitions

To find each of the three subsystems of KB1, an iterative procedure can be followed (a sketch in code is given after this list):

1. Start with the variables that are goals for the subsystem, e.g., risk tolerance for the risk tolerance subsystem;
2. Include all the rules that set subsystem variables in their conclusions; for the risk tolerance subsystem, Rules 3 and 4 are included;
3. Include all variables that appear in rules already in the subsystem and are not goals of another subsystem; for the risk tolerance subsystem, include "Do you buy lottery tickets" and "Do you currently own stocks";
4. Quit if all rules setting subsystem variables are in the subsystem, or else go to Step 2; for the risk tolerance subsystem, there are no more rules to be added.

Figure 4 shows the partitioning of KB1 using this method.

Figure 4: Knowledge Base 1 [3]
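A compact rendering of this iterative partitioning procedure is sketched below; the rule encoding (rule number, condition variables, concluded variable) is an assumed representation of Table 1, not the authors' data structure.

# KB1 from Table 1 as (rule number, condition variables, concluded variable)
KB1 = [
    (1, {"risk tolerance", "discretionary income"}, "investment"),
    (2, {"risk tolerance", "discretionary income"}, "investment"),
    (3, {"lottery tickets", "own stocks"}, "risk tolerance"),
    (4, {"lottery tickets", "own stocks"}, "risk tolerance"),
    (5, {"boat", "luxury car"}, "discretionary income"),
    (6, {"boat", "luxury car"}, "discretionary income"),
]

def find_partition(goal, rules, other_goals):
    """Iterative procedure of Section 2.3: grow a subsystem around one goal variable."""
    variables, subsystem = {goal}, set()
    while True:
        # Step 2: include all rules concluding a subsystem variable
        new_rules = {n for n, cond, concl in rules if concl in variables}
        # Step 3: include their condition variables, unless they are goals elsewhere
        for n, cond, concl in rules:
            if n in new_rules:
                variables |= {v for v in cond if v not in other_goals}
        if new_rules == subsystem:       # Step 4: quit when no rule was added
            return subsystem, variables
        subsystem = new_rules

print(find_partition("risk tolerance", KB1, {"discretionary income", "investment"}))
# e.g. ({3, 4}, {'risk tolerance', 'lottery tickets', 'own stocks'})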

2.4 Step 3-Completeness of expert systems

2.4.1 Completeness Step 1-Completeness of subsystems

The first step in proving the completeness of the entire expert system is to prove the completeness of each subsystem. To this end it must be shown that for all possible inputs there is an output, i.e., that the goal variables of the subsystem are set. This can be done by showing that the OR of the hypotheses of the rules that assign to a goal variable is true [7].
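For a subsystem as small as the risk tolerance one, this exhaustive check that some rule fires for every input can be written directly; the encoding below is again an illustrative assumption.

from itertools import product

# Hypotheses of the rules that assign to the goal "risk tolerance" (Rules 3 and 4)
rule3 = lambda lottery, stocks: lottery == "yes" or stocks == "yes"
rule4 = lambda lottery, stocks: lottery == "no" and stocks == "no"

# Completeness: the OR of the hypotheses must hold for all possible inputs
complete = all(rule3(l, s) or rule4(l, s)
               for l, s in product(["yes", "no"], repeat=2))
print("risk tolerance subsystem complete:", complete)       # True

# Consistency: no input may satisfy two rules with contradictory conclusions
consistent = all(not (rule3(l, s) and rule4(l, s))
                 for l, s in product(["yes", "no"], repeat=2))
print("risk tolerance subsystem consistent:", consistent)   # True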

177

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

2.4.2 Completeness Step 2-Completeness of the entire system

The results of subsystem completeness are used to establish the completeness of the entire system. The basic argument is to use the results on subsystems to prove that successively larger subsystems are complete. At each stage of the proof there are some subsystems known to be complete; initially, this is the subsystem that concludes the overall goals of the expert system. At each stage of the proof, a subsystem that concludes some of the input variables of the currently-proved-complete subsystem is added to the currently complete subsystem. After a number of steps equal to the number of subsystems, the entire system can be shown to be complete.

2.5 Step 4-Consistency of the entire system

The first step in proving the consistency of the entire expert system is to prove the consistency of each subsystem. To do this, the user must show that for all possible inputs the outputs are consistent, i.e., that the AND of the conclusions can be satisfied. For example, if an expert system concludes "temperature > 0" and "temperature < 100", the AND of these conclusions can be satisfied. However, if the system concludes "temperature < 0" and "temperature > 100", the AND of these two conclusions has to be false. It is clear that, for the input that produced these two conclusions, it is not possible for all of the system's conclusions to be true at the same time, and thus the system producing these conclusions is inconsistent.

2.5.1 Consistency Step 1-Find the mutually inconsistent conclusions

The first step in proving consistency is to identify the sets of mutually inconsistent conclusions for each of the subsystems identified in the "find partitions" step above. Some sets of conclusions are mathematically inconsistent [2]: for example, if a system describes temperature, the set {temperature < 0, temperature > 100} is mathematically inconsistent. Because some sets of conclusions are inconsistent only because of domain expertise, finding all sets of inconsistent conclusions generally requires expert Knowledge. Note that if there are no mutually inconsistent conclusions in the expert system as a whole, then consistency holds by default, and no further consistency proof is necessary.

2.5.2 Consistency Step 2-Prove consistency of subsystems

If there are inconsistent conclusions in the Knowledge base as a whole, then the next step in proving consistency is to prove the subsystems consistent. This can be done by showing that no set of inputs to a subsystem can result in any of the sets of inconsistent conclusions.

2.5.3 Consistency Step 3-Consistency of the entire system

The results of subsystem consistency are used to establish the consistency of the entire system. The basic argument is to use the results on subsystems to prove that successively larger subsystems are consistent. At each stage of the proof there are some subsystems known to be consistent; initially, this is the subsystem that concludes the goals of the expert system as a whole. At each stage of the proof, a subsystem that concludes some of the input variables of the currently-proved-consistent subsystem is added to the currently consistent subsystem. After a number of steps equal to the number of subsystems, the entire system can be shown to be consistent [2].

2.6 Step 5-Specification satisfaction

In order to prove that KB1 satisfies its specifications, the user must actually know what its specifications are. This is a special case of the general truth that, in order to verify and validate, the user must know what a system is supposed to do. Specifications should be defined in the planning stage of an expert system project [4]. To illustrate the proof of specifications, it will be assumed that KB1 is supposed to satisfy: "A financial advisor should only recommend investments that an investor can afford." As with many other aspects of verification and validation, expert Knowledge must be brought to bear on the proof process. For KB1, an expert might say that anyone can afford a savings account; therefore, the user only has to look at the conditions under which stocks are recommended. However, that same expert would probably say that just having discretionary income does not mean that the user can afford stocks; that judgement should be made on more than one variable. Therefore, it would be reasonable to conclude that KB1 does not satisfy the above specification.


Conclusion

This paper has argued that V&V techniques are an essential part of the Knowledge engineering process, because they offer the only way to judge the success (or otherwise) of a KBS development project. This is equally true in the context of Knowledge management, where V&V techniques tell us whether or not the KBS can be relied upon to accurately embody the Knowledge of the human experts that supplied it.

However, examination of the known studies on the effectiveness of existing KBS VV&E techniques has shown that the state of Knowledge in this area is sparse. The way to improve this situation would be to systematically gather data from a representative set of KBS projects and V&V techniques. Without such a study, Knowledge engineering will remain very much an art and, by extension, so will the use of KBS technology in Knowledge management.

It is difficult to generalise our results to all Knowledge based systems and, of course, further evaluations of other applications are necessary to confirm (or challenge) our conclusions. However, since the method we have used minimises the need for experts' interpretation of the faults, we can reasonably conclude that if we used an application of similar size and complexity to GIBUS, we would expect to obtain similar results.


Consequently, since our application has a size and a


complexity which is representative of actual practice,
we would expect that consistency and completeness
checking, in addition to testing, would be an effective
combination of methods to validate many of the Knowledge based systems actually under development.

References

[1] Ayel M and Laurent J-P, Two different ways of verifying Knowledge-based systems, in: Validation, Verification and Test of Knowledge-Based Systems, Wiley, New York (1991), 63-76.
[2] Bendou A, A constraint-based test data generator, EUROVAV-95, Saint Badolph, France (1995), 19-29.
[3] Ginsberg A, Knowledge-base reduction: A new approach to checking Knowledge bases for inconsistency & redundancy, AAAI-88 2 (1988), 585-589.
[4] Kirani S, Zualkernan I. A, and Tsai W. T., Comparative Evaluation of Expert System Testing Methods, Computer Science Department, University of Minnesota, Minneapolis 2 (1992), 9230.
[5] Laurent J-P, Proposals for a valid terminology in KBS validation, ECAI-92, Wiley, New York 2 (1992), 829-834.
[6] Lounis R and Ayel M, Completeness of KBS, EUROVAV-95, Saint Badolph, France 2 (1995), 31-46.
[7] O'Leary D, Design, development and validation of expert systems: A survey of developers, Vol. 2, 1991.

Point set embedding of some graphs with small number of bends

Maryam Tahmasbi (m tahmasi@sbu.ac.ir) and Zahra Abdi reyhan (z.abdi@mail.sbu.ac.ir)

Department of Computer Science, Shahid Beheshti University, G.C., Tehran, Iran

Corresponding author, T: (+98) 21 299-03004

Abstract: In this paper we study the problem of point-set embedding. We assume that G is a planar graph with n vertices and S is a set of n points in general position in the plane. The problem is to find a planar drawing of G such that each vertex is mapped to one of the points in S, each edge is mapped to a polygonal chain, and the drawing has a small number of bends. In this paper we prove that (1) every wheel has a point set embedding with no bends on a set of points in non-convex position; moreover, if the points are in convex position, the wheel has a point set embedding with at most one bend; (2) every θ-graph has a point set embedding with at most six bends on a set of points in general position, such that one of its cycles is drawn with straight lines; (3) every k-path graph has a point set embedding on a set of points in general position with at most 2k - 2 bends.

Keywords: point set embedding; wheel; θ-graph; planar drawing; bend; convex hull; k-path graph.

Introduction

The problem of computing a planar drawing of a graph on a given set of points in the plane is a classical subject both in graph drawing and in computational geometry [1]. Let G be a planar graph with n vertices and S a set of n points in the plane; a point set embedding of G on S is a planar drawing of G such that each vertex is mapped to a distinct point of S and each edge is drawn as a polygonal chain. A point set embedding with no bends is called a straight line point set embedding. There are two versions of the problem: point set embedding with mapping and without mapping. We study the problem for the case in which there is no predefined mapping between the vertices of the graph and the points of the set.

Given a planar graph G, deciding whether G has a straight line point set embedding is NP-complete [2]. It is proved that any outerplanar graph has a planar straight line point set embedding [3],[4] and that any planar graph has a planar embedding on every point set in the plane with at most two bends per edge [5]. The problem has also been studied for planar triangulations [5]. Among the other versions of the problem, one of the most studied is the one in which a partial drawing of the graph is given [1],[6],[7]. Given a planar graph G = (V, E) and a planar straight-line partial drawing D of G, it was shown that it is NP-hard to decide whether G admits a planar straight-line drawing including D [7].

Wheels


A wheel Wn is a graph consisting of a cycle with n vertices and a vertex, called the center, that is adjacent to all vertices of the cycle. In this section we study the problem of embedding a wheel on a point set S in general position. We study the two cases where the points of S are in convex and in non-convex position separately.

2.1 Points in non-convex position

In this section we suppose that S is a set of n + 1 points in general, non-convex position.

Theorem 2.1: The graph Wn admits a straight line point set embedding on S.

Proof. Let CH(S) be the convex hull of S and let p_l and p_r be the leftmost and rightmost points of CH(S), respectively. Let p_l = p_1, p_2, ..., p_k be the clockwise ordering of the points on CH(S), and let q_1, ..., q_h be the remaining points from right to left. Choose a point q_i as the center. We can draw straight lines from q_i to all other points. Starting from p_1, in each convex region, connect all points except q_i in order of point distance. Fig. 1 shows the steps.

Figure 1: (a) The wheel W4. (b) A set of five points in non-convex position. (c) Convex hull of S. (d) Straight line point set embedding of G on S.

2.2 Points in convex position

In this section, we describe how to compute a point set embedding of Wn on a set S of n + 1 points in convex (and general) position with at most one bend in total.

Theorem 2.2: The wheel Wn admits a point set embedding on S with exactly one bend.

Proof. Let CH(S) be the convex hull of S and p_l the leftmost point of CH(S). Let p_l = p_1, ..., p_{n+1} be the points on CH(S) in clockwise order. Choose an arbitrary point p_i as the center. Since the points are in convex position, we can draw diagonals from p_i to all other points of S. Now we need to connect the points before and after p_i on CH(S). For this purpose we place a dummy vertex outside the convex hull, near p_i; from this dummy vertex we draw a straight line segment to each of the points before and after p_i. Finally, we replace this dummy vertex with a bend. Fig. 2 shows these steps.

Figure 2: (a) A set S of 5 points in convex position. (b) Convex hull of S. (c) A point set embedding of W4 on S with one bend in total.
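A computational sketch of the non-convex case (Theorem 2.1) is given below using SciPy's convex hull. Choosing an interior point as the center and ordering the remaining points by angle is a simple stand-in for the convex-region sweep used in the proof, so this is an illustrative rendering, not the authors' algorithm.

import numpy as np
from scipy.spatial import ConvexHull

def wheel_embedding_nonconvex(points):
    """Embed W_n on n+1 points in general, non-convex position (Theorem 2.1 sketch).

    Returns the index of the chosen center and the list of straight-line edges
    (spokes plus the cycle through the remaining points).
    """
    pts = np.asarray(points, dtype=float)
    hull = ConvexHull(pts)
    hull_idx = list(hull.vertices)                 # hull points, in order
    interior = [i for i in range(len(pts)) if i not in hull_idx]
    if not interior:
        raise ValueError("points are in convex position; see Theorem 2.2")
    center = interior[0]                           # any interior point as center

    others = hull_idx + [i for i in interior if i != center]
    spokes = [(center, i) for i in others]         # straight spokes from the center
    # Cycle through all non-center points, ordered by angle around the center
    # (a stand-in for the distance-ordered sweep of each convex region).
    ang = np.arctan2(*(pts[others] - pts[center]).T[::-1])
    ring = [others[i] for i in np.argsort(ang)]
    cycle = [(ring[i], ring[(i + 1) % len(ring)]) for i in range(len(ring))]
    return center, spokes + cycle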
θ-graph

A θ-graph consists of two vertices p and q and three vertex-disjoint paths P1, P2 and P3 that connect p and q. In this section we present an algorithm that draws a θ-graph G on a set of points in general position with at most six bends in total, such that one of the cycles is drawn with straight lines.

Let Γ be a drawing of a θ-graph G and v a vertex of G; the vertex v is visible from below in Γ if the open vertical half-line below v does not intersect Γ, and it is visible from above in Γ if the open vertical half-line above v does not intersect Γ [1].

Theorem 3.1: Let G be a θ-graph and S a set of n points in general position. The graph G admits a point set embedding on S in which one of the cycles is drawn with straight line segments and there are at most six bends in the drawing.

Proof. Let p and q be the degree-three vertices of G, and let P1, P2 and P3 be the three paths connecting them in G. Let C1 be the cycle consisting of P1 and P2 and

C2 be the cycle consisting of P2 and P3 . Suppose that


C2 has n2 vertices. Let S 0 be set of the first n2 vertices
of S from below in lexicographic order. We map p to
the left most point in S 0 , d. Then we map all vertices
of C2 to points of S 0 using the algorithm in [1]. Now we
need to map all vertices of P1 from p to q, except p and
q, to the points of S 00 = S S 0 from left to right. Suppose that q is mapped to a point c in S 0 . It is enough
to connect d to d0 , the left most vertex of S 00 , and c to
c0 , the rightmost vertex of S 00 . Let B 0 and B 00 be the
bounding boxes of S 0 and S 00 respectively. In order to
connect d to d0 , we need two bends: one is the top left
corner of B 0 and the other is the bottom left corner of
B 00 . For connecting c to c0 , we distinguish two cases:

k-path graph

A k-path graph consists of two vertices p and q and


k 3 vertex disjoint paths P1 , P2 , . . . , Pk that connect
p and q. In this section we present an algorithm that
draws k-path graph G on a set of points in general
position with at most 2k 2 bends in total.
Theorem 4.1: Let G be a k-path graph and S be a
set of n points in general position. The graph G admits
a point set embedding on S with at most 2k 2 bends
in total.

(1) if the point c is visible from above, in order to


connect c to c0 , we use a path with three bends locating
at the following positions: the projection of c on upper
edge of B 0 , the top right corner of B 0 , and the bottom
right corner of B 00 .

Proof. Let P1 , P2 , P3 , . . . , Pk be the paths of G from p


to q in counter clockwise order around p. For 2 i k
let ni be the number of vertices in Pi except p and q
and n1 be the number of vertices in P1 .

(2) if the point c is visible from below, in order to


connect c to c0 , we use four bends locating at the following positions: the projection of c on the lower edge
of B 0 , the bottom right and top right corners of B 0 and
the bottom right corner of B 00 .

Let S1 be set of the first n1 vertices of S from below in lexicographic order. We map p to the left most
point in S1 , h1l and q to the right most point in S1 ,
h1r . suppose
Pi1 that Si be the set of the first ni vertices
of S j=1 Sj from below in lexicographic order,for
2 i k.
We map all vertices of P1 from p to q, to the points
of S1 from left to right. Now we need to map all vertices of Pi from p to q, except p and q to the Si from
left to right. let Bi be the bounding boxe of Si . It is
enough to connect p to the left most point in Si , hil
and q to the right most point in Si , hir , for 2 i k.

Figure 3 shows the mapping.

In order to connect p to hil , we use a path with


one bend locating at the following position: we draw a
line with maximum positive slope from hil such that it
doesnt cross boxes Bj , for 1 j i, This line is called
Li and draw a horizontal line from vertex p in box B1
along the left side. Similarly, we draw a line below
L1i1 from vertex p in box B1 in very short distance
along with left side which is called L1i . intersection of
Li and L1i is named qi , Now we can connect vertex p
to hi1 using dummy vertex qi .
The rightmost point of the box Bi (hir ) is connected
to q in similar way, except for the line corresponding
to line Li is considered as with the minimum negative
slope, and the line corresponding to L1i is considered
along with the right side. Figure 4 shows these connections.
Figure 3: (a) -graph G. (b) straight line point set emFinally, we replace each dummy vertices with one
bedding of C2 on S 0 .(c) and (d) point set embedding bend. Thus we can draw k-path graph with maximum
of G on S with at most 6 bend
2k 2 bends.
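The grouping step of this proof is easy to make concrete. The following Python sketch (ours, not from the paper) splits a point set into the groups S1, ..., Sk used above; interpreting "from below in lexicographic order" as sorting by y and then by x is our assumption.

    def partition_for_k_paths(points, path_sizes):
        """points: list of (x, y) pairs; path_sizes: [n1, ..., nk]."""
        # "from below in lexicographic order": lowest points first (assumed key)
        rest = sorted(points, key=lambda p: (p[1], p[0]))
        groups = []
        for n_i in path_sizes:
            s_i, rest = rest[:n_i], rest[n_i:]
            # within each S_i the path vertices are mapped left to right
            groups.append(sorted(s_i))
        return groups

    # Example: three paths with two inner vertices each, on six points
    S = [(0, 0), (2, 1), (1, 2), (3, 3), (0, 4), (4, 5)]
    S1, S2, S3 = partition_for_k_paths(S, [2, 2, 2])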

Figure 4: (a) Point set embedding of P1, P2 and P3. (b) Point set embedding of G on S with at most 6 bends.

5 Conclusions and Future Works

In this paper we studied the problem of point set embedding of wheels, Θ-graphs and k-path graphs without mapping. We proved that every wheel has a point set embedding with no bends on a set of points in non-convex position. In case the points are in general position, the wheel has a point set embedding with at most one bend. For every Θ-graph the number of bends in the point set embedding is at most six, and one of the cycles is drawn with no bends in the resulting drawing. Then we extended the results to k-path graphs and presented an algorithm that computes a point set embedding on a set of points in general position with at most 2k − 2 bends.

Constrained point set embedding of graphs on a set of points has recently been investigated, where it is required to draw a subgraph with straight lines and the remaining parts with a small number of bends. In future work we are going to examine constrained point set embedding with several subgraphs, in such a way that the subgraphs are drawn with straight lines and the other parts with a small number of bends.

References

[1] E. Di Giacomo, W. Didimo, G. Liotta, H. Meijer, and S. Wismath, Constrained point-set embedding of planar graphs, LNCS, GD'07 proceedings 5417 (2008), 360-371.
[2] S. Cabello, Planar embeddability of the vertices of a graph using a fixed point set is NP-hard, J. Graph Algorithms Appl. 10 (2) (2006), 353-363.
[3] N. Castaneda and J. Urrutia, Straight line embeddings of planar graphs on point sets, 8th Canadian Conference on Computational Geometry 9(6) (1996), 312-318.
[4] P. Gritzmann, B. Mohar, J. Pach, and R. Pollack, Embedding a planar triangulation with vertices at specified points, Amer. Math. Monthly 98 (2) (1991), 165-166.
[5] M. Kaufmann and R. Wiese, Embedding vertices at points: Few bends suffice for planar graphs, J. Graph Algorithms Appl. 6 (1) (2002), 115-129.
[6] E. Di Giacomo, W. Didimo, G. Liotta, H. Meijer, and S. Wismath, Point set embeddings of trees with given partial drawings, Comput. Geom. 42 (6-7) (2009), 664-676.
[7] M. Patrignani, On extending a partial straight-line drawing, Internat. J. Found. Comput. Sci. (Special issue on Graph Drawing) 17 (5) (2006), 1061-1069.

On The Pairwise Sums


Keivan Borna

Zahra Jalalian

Faculty of Mathematical Sciences and Computer

Faculty of Engineering

Kharazmi University

Kharazmi University

borna@tmu.ac.ir

jalalian@tmu.ac.ir

Abstract: The aim of this paper is to study two open problems and provide faster algorithms for them. More precisely, for two sets X and Y of numbers of size n and m, we first present an O(nm) algorithm to sort the set X + Y = {x + y | x ∈ X, y ∈ Y} of pairwise sums. Then we offer another O(nm) algorithm for finding all pairs (x, y) and (x′, y′) from X + Y for which x + y = x′ + y′. In particular, if X and Y are both of size n, this latter algorithm enables us to know when the set X + Y has n² unique elements.

Keywords: Lower Bounds; Linear Hash Function; Sorting Pairwise Sums.

1 Introduction

Given two sets of numbers, each of size n, how quickly can the set of all pairwise sums be sorted? In other words, given two sets X and Y, our goal is to sort the set X + Y = {x + y | x ∈ X, y ∈ Y}, cf. [3, Problem 41]. There are several motivations for the problem of finding the required number of comparisons for sorting a set when a partial order on the input set is given. Many authors, including [1, 2], described several geometric problems that are Sorting-(X+Y)-hard. It is known that there is a subquadratic-time transformation from sorting X + Y to each of the following problems: computing the Minkowski sum of two orthogonal-convex polygons, determining whether one monotone polygon can be translated to fit inside another, determining whether one convex polygon can be rotated to fit inside another, sorting the vertices of a line arrangement, or sorting the interpoint distances between n points in R^d. In addition, there is an immediate application to multiplying sparse polynomials [4].

In [4] the author presented an algorithm that can sort X + Y using only 8n log n + 2n² comparisons, but the algorithm needs exponential time to choose which comparisons to perform. This exponential overhead was reduced to polynomial time by Kahn and Kim [5], and then to O(n² log n) by Lambert [6] and Steiger and Streinu [7]. These results imply that no superquadratic lower bound is possible in the full linear decision tree model. One motivation of this paper is to present an O(n²) algorithm for sorting X + Y. As a matter of fact, for two sets X, Y of numbers of size n, m, in Section 2 we present an O(nm) algorithm for sorting X + Y. The decision version of this problem is also interesting: does the set X + Y have n² unique elements? This problem, which will be discussed in Section 3, provides another motivation for this paper.

The organization of this paper is as follows. For two sets X, Y of numbers of size n, m, in Section 2 we present an O(nm) algorithm for sorting X + Y. In Section 3 our O(nm) algorithm for finding all pairs (x, y) and (x′, y′) from X + Y for which x + y = x′ + y′ is presented. Section 4 is devoted to conclusions.

2 Sorting X + Y

For two sorted sets X, Y of numbers of size n = SizeX, m = SizeY, we first find the array Z = X + Y using an O(nm) algorithm.

* Corresponding Author, P. O. Box 45195-1159, F: (+98) 26 3455-0899, T: (+98) 26 3457-9600


Data: Two sorted sets X, Y of numbers of size n = SizeX, m = SizeY
Result: The array Z = X + Y
LowAmountZ = X[0] + Y[0];
HighAmountZ = X[n−1] + Y[m−1];
AlphaZ = −LowAmountZ;
SizeZ = HighAmountZ − LowAmountZ + 1;
for i = 0, ..., SizeZ − 1 do
    Z[i] = X[0] + Y[0] − 1;
end
for i = 0, ..., n − 1 do
    for j = 0, ..., m − 1 do
        indexZ = X[i] + Y[j] + AlphaZ;
        Z[indexZ] = X[i] + Y[j];
    end
end
Algorithm 1: An O(nm) algorithm for computing the array Z = X + Y.

In this algorithm the first (smallest) value of X + Y is X[0] + Y[0] and the last (largest) value is X[n−1] + Y[m−1]. Therefore, the index of X[0] + Y[0] in X + Y will be zero. To determine the addresses of the other pairwise sums, we have to define a hash function that finds the index of each element of the sum. Since the first pairwise sum should go to index zero, our hash function will be h(x + y) = x + y + α, where α = −(X[0] + Y[0]). The hash function takes a pairwise sum and produces the index in X + Y where the corresponding sum should be inserted. We define an array Z and initialize it with X[0] + Y[0] − 1. Finally, we have to delete the elements of X + Y whose value equals X[0] + Y[0] − 1, applying a shift-to-left operation in each step. For this, let c be the number of distinct elements of Z = X + Y and let S be an array of size c. The following easy algorithm fills S and thus finds the sorted set X + Y.

Data: The array Z of size SizeZ generated in Algorithm 1
Result: The sorted set S = X + Y
j = 0;
for i = 0, ..., SizeZ − 1 do
    if Z[i] ≥ Z[0] then
        S[j] = Z[i];
        j = j + 1;
    end
end
Algorithm 2: The algorithm for obtaining the elements of the set Z generated as an array in Algorithm 1.

For example, if X = {−27, 9, 42} and Y = {−28, −17, 15, 16}, the array Z = X + Y will be as follows (cells not listed keep the initial value X[0] + Y[0] − 1):

Z[0] = −55, Z[11] = −44, Z[36] = −19, Z[43] = −12, Z[44] = −11, Z[47] = −8, Z[69] = 14, Z[79] = 24, Z[80] = 25, Z[112] = 57, Z[113] = 58

Furthermore, S, the set representation of Z, is

S = {−55, −44, −19, −12, −11, −8, 14, 24, 25, 57, 58}.

Our discussion in this section proves the following theorem.

Theorem 1: For two sets X, Y of numbers of size n and m, using Algorithms 1 and 2 one can sort X + Y in O(nm).
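For illustration, Algorithms 1 and 2 can be transcribed into a few lines of Python. This is a sketch under the paper's assumption of sorted integer inputs (the function name is ours); note that, as in the paper's set representation, duplicate sums collapse into one table cell:

    def sort_pairwise_sums(X, Y):
        """Sort X + Y by direct addressing (Algorithms 1 and 2)."""
        low = X[0] + Y[0]            # smallest possible sum
        high = X[-1] + Y[-1]         # largest possible sum
        alpha = -low                 # hash: h(x + y) = x + y + alpha
        sentinel = low - 1           # marks "no sum hashed here yet"
        Z = [sentinel] * (high - low + 1)
        for x in X:                  # O(nm) filling of the table
            for y in Y:
                Z[x + y + alpha] = x + y
        # Algorithm 2: compact the table, skipping sentinel cells
        return [z for z in Z if z >= low]

    # The paper's example (minus signs restored from the index arithmetic):
    X = [-27, 9, 42]
    Y = [-28, -17, 15, 16]
    print(sort_pairwise_sums(X, Y))
    # [-55, -44, -19, -12, -11, -8, 14, 24, 25, 57, 58]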


3 When x + y = x′ + y′?

For a while, assume that each of X and Y is of the same size n. The following problem is related to sorting the set X + Y:

Does the set X + Y have n² unique elements?

This problem is essentially equivalent to the following problem:

For which x, x′ ∈ X and y, y′ ∈ Y do we have x + y = x′ + y′?

In this section we present a new algorithm to find all pairs with equal sum amount. We first create two arrays Z1, Z2 of the same size (X[n−1] + Y[n−1]) − (X[0] + Y[0]) + 1. The arrays Z1 and Z2 are initialized with X[0] + Y[0] − 1 and −1, respectively. For each pairwise sum from X + Y, the algorithm uses a hash function to place the sum value into the corresponding address in Z1 and puts the x's index in the set X at the same address of Z2. But before the sum value is located in its cell, the algorithm checks the cell's content, and if this cell's amount is greater than X[0] + Y[0] − 1, it means that two pairwise sums with equal sum amount have been found.

Data: Two sorted sets X, Y of numbers of size n = SizeX, m = SizeY
Result: All pairs (x, y) and (x′, y′) from X + Y for which x + y = x′ + y′
1. LowAmountZ = X[0] + Y[0];
2. HighAmountZ = X[n−1] + Y[m−1];
3. AlphaZ = −LowAmountZ;
4. SizeZ = HighAmountZ − LowAmountZ + 1;
5. for i = 0, ..., SizeZ − 1 do
    5.1. Z1[i] = X[0] + Y[0] − 1;
    5.2. Z2[i] = −1;
   end
6. for i = 0, ..., n − 1 do
7.    for j = 0, ..., m − 1 do
        7.1. indexZ = X[i] + Y[j] + AlphaZ;
        7.2. if Z1[indexZ] > (X[0] + Y[0] − 1) then
            7.2.1. print X[i] + Y[j];
            7.2.2. print X[Z2[indexZ]] + (Z1[indexZ] − X[Z2[indexZ]]);
        end
        7.3. Z1[indexZ] = X[i] + Y[j];
        7.4. Z2[indexZ] = i;
      end
   end
Algorithm 3: An O(nm) algorithm for finding all pairs (x, y) and (x′, y′) from X + Y for which x + y = x′ + y′.

In lines 1 and 2, we first determine the minimum and maximum amounts of the set Z := X + Y (and call them LowAmountZ and HighAmountZ) and assign HighAmountZ − LowAmountZ + 1 to its size. Then, in line 3, we establish a constant for the hash function. In commands 5.1 and 5.2, Z1 and Z2 are initialized with X[0] + Y[0] − 1 and −1, respectively, because Z2 will be used to keep the x address of each pairwise sum. In line 7.1 we sum X[i] and Y[j] and put the value in Z1. But first, using our hash function, the algorithm appoints the cell address of Z in which the sum X[i] + Y[j] has to be placed. Then it compares the content of the obtained address with the value of X[0] + Y[0] − 1. If the cell's content is equal to X[0] + Y[0] − 1, it means that there is no such sum in X + Y yet; in that case the pairwise sum amount goes to Z1[indexZ] and its x index, which is i, is located at Z2[indexZ], lines 7.3 and 7.4 respectively. Once the cell's content is not equal to X[0] + Y[0] − 1, it means that there is another x + y sum with the same value, and therefore we have to find x′ and y′. For this reason, the algorithm goes to Z2[indexZ], which has the address of x′, and with the obtained index it can retrieve x′. Since y′ = (x + y) − x′, Z1[indexZ] − X[Z2[indexZ]] will be the value of y′.

For example, if X = {−17, −13, −12, 5, 19} and Y = {−9, −6, −2, 7, 11, 16}, the outputs of our algorithm are as follows:

−13 + (−6) = −17 + (−2)
−13 + 7 = −17 + 11
−12 + 11 = −17 + 16
5 + (−6) = −12 + 11
5 + (−2) = −13 + 16

Since the complexity of Algorithm 3 is obviously Θ(mn), we obtain the following theorem.

Theorem 2: For two sets X, Y of numbers of size n and m, Algorithm 3 finds all pairs (x, y) and (x′, y′) from X + Y for which x + y = x′ + y′ in O(nm).
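A Python transcription of Algorithm 3 (a sketch; the function name is ours) reproduces the example output above:

    def equal_pairwise_sums(X, Y):
        """Report all collisions x + y = x' + y' (Algorithm 3)."""
        low = X[0] + Y[0]
        alpha = -low
        sentinel = low - 1
        size = (X[-1] + Y[-1]) - low + 1
        Z1 = [sentinel] * size        # last sum hashed to each cell
        Z2 = [-1] * size              # index in X of that sum's x part
        pairs = []
        for i, x in enumerate(X):
            for y in Y:
                idx = x + y + alpha
                if Z1[idx] > sentinel:            # cell already occupied
                    x_prev = X[Z2[idx]]
                    y_prev = Z1[idx] - x_prev     # y' = (x + y) - x'
                    pairs.append(((x, y), (x_prev, y_prev)))
                Z1[idx] = x + y
                Z2[idx] = i
        return pairs

    X = [-17, -13, -12, 5, 19]
    Y = [-9, -6, -2, 7, 11, 16]
    for (x, y), (xp, yp) in equal_pairwise_sums(X, Y):
        print(f"{x} + {y} = {xp} + {yp}")   # the five collisions listed above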

4 Discussion and Future Works

For two sets X and Y of numbers of size n and m, we presented an O(nm) algorithm for sorting the set X + Y = {x + y | x ∈ X, y ∈ Y} of pairwise sums. We also presented another O(nm) algorithm for finding all pairs (x, y) and (x′, y′) from X + Y for which x + y = x′ + y′. Constructing faster algorithms for these problems is the subject of further progress.

References

[1] A. Hernandez, Finding an O(n² log n) algorithm is sometimes hard, Proc. 8th Canad. Conf. Comput. Geom., Carleton University Press, Ottawa, Canada (1996), 289-294.
[2] G. Barequet and S. Har-Peled, Polygon containment and translational min-Hausdorff-distance between segment sets are 3SUM-hard, Internat. J. Comput. Geom. Appl. 11 (2001), 465-474.
[3] E. D. Demaine, J. S. B. Mitchell, and J. O'Rourke, The Open Problems Project (2010), 191.
[4] M. L. Fredman, How good is the information theory bound in sorting?, Theoret. Comput. Sci. 1 (1976), 355-361.
[5] J. Kahn and J. Han Kim, Entropy and sorting, J. Comput. Sys. Sci. 51 (1995), 390-399.
[6] J. L. Lambert, Sorting the sums (xi + yj) in O(n²) comparisons, Theoret. Comput. Sci. 103 (1992), 137-141.
[7] W. Steiger and I. Streinu, A pseudo-algorithmic separation of lines from pseudo-lines, Inform. Process. Lett. 53 (1995), 295-299.
[8] J. Erickson, Lower bounds for fundamental geometric problems, PhD thesis, University of California at Berkeley, 1996.


Hyperbolic Voronoi Diagram: A Fast Method


Zahra Nilforoushan

Ali Mohades

Department of Computer Engineering

Faculty of Mathematics and Computer Science

Kharazmi University, Tehran, Iran

Amirkabir University of Technology, Tehran, Iran

shadi.nilforoushan@gmail.com

mohades@aut.ac.ir

Amin Gheibi

Sina Khakabi

School of Computer Science

School of Computing Science

Carleton University, Ottawa, Canada

Simon Fraser University, Burnaby, BC, Canada

amin-gheibi@carleton.ca

sinakhm.cs84@aut.ac.ir

Abstract: Voronoi diagrams have useful applications in various fields and are one of the most fundamental concepts in computational geometry. Although Voronoi diagrams in the plane have been studied extensively, using different notions of sites and metrics, little is known for other geometric spaces. In this paper, we present a simple method to construct the Voronoi diagram of a set of points in the Poincaré hyperbolic disk, which is a 2-dimensional manifold with negative curvature. Our trick is to define and use some well-formed geometric maps which take care of the connection between the Euclidean plane and the Poincaré hyperbolic disk. Finally, we give a brief report of our implementation.

Keywords: Computational geometry, Hyperbolic space, Geodesic, Voronoi diagrams.

1 Introduction

Voronoi diagrams for point sets in d-dimensional Euclidean space E^d have been studied by a number of people, in their original as well as in generalized settings. For a finite set M ⊂ E^d, the (closest-point) Voronoi diagram of M associates each p ∈ M with the convex region R(p) of all points closer to p than to any other point in M. More formally, R(p) = {x ∈ E^d | d(x, p) < d(x, q), ∀q ∈ M \ {p}}, where d denotes the Euclidean distance function. Voronoi diagrams are of importance in a variety of areas other than computer science, whose enumeration exceeds the scope of this paper (see for instance Aurenhammer's survey [3] or the book by Okabe, Boots, Sugihara and Chiu [18]). Shamos and Hoey [21] were the first to introduce the planar diagram to computational geometry and also demonstrated how to construct it efficiently. Using a dual correspondence to convex hulls discovered by Brown [5], its higher-dimensional analogues can be obtained using methods in Seidel [20].

As the variety of applications of the Voronoi diagram was recognized, people soon became aware of the fact that many practical situations are better described by some modification than by the original diagram. For example, diagrams under more general metrics [15, 16], for more general objects than points [9, 13], and of higher order [10, 14, 21] have been investigated.

The interesting properties of Voronoi diagrams attracted our attention to the natural question of whether they are preserved in other spaces, especially on hyperbolic surfaces. Hyperbolic surfaces are characterized by negative curvature, and cosmologists have suffered from a persistent misconception that a negatively curved universe must be the finite 3-D hyperbolic space [23]. Although we do not see hyperbolic surfaces around us often, nature nevertheless does possess a few.

* Corresponding Author, Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran, T: (+98) 26 34550002
† Algorithm and Computational Geometry Research Group, Amirkabir University of Technology, Tehran, Iran.

For example, lettuce leaves and marine flatworms exhibit hyperbolic geometry. There is an interesting idea about the hyperbolic plane by W. P. Thurston: if we move away from a point in the hyperbolic plane, the space around that point expands exponentially [22]. Hyperbolic geometry has found applications in fields of mathematics, physics, and engineering. For example in physics, until we figure out whether or not the expansion of the universe is decelerating, hyperbolic geometry could be the most accurate way to define the geometries of fields. Einstein invented his special theory of relativity based on hyperbolic geometry.

Now we switch to some applications of the Voronoi diagram in hyperbolic spaces. In [19] the authors deal with Voronoi diagrams in simply connected complete manifolds with non-positive curvature, called Hadamard manifolds. They proved that the facets of the Voronoi diagram can be characterized by the hyperbolic Voronoi diagram. They considered that these Voronoi diagrams and their dual structure, the Delaunay triangulation, can be used for mesh generation, computer graphics and color space [6]. Another application of Voronoi diagrams in hyperbolic models is triangulating a saddle surface, which is a part of the triangulation of a general surface. On a general surface, some parts have positive curvature, other parts have negative curvature, and other parts are near zero. In such cases, one can divide the surface into some parts and make a triangulation of each part according to its curvature.

A further application of the Voronoi diagram in hyperbolic spaces is devoted to the Farey tessellation, which is studied in [1]. The Teichmüller space for T² is the hyperbolic plane H² = {z = x + iy ∈ C | y > 0}: T²_z can be thought of as the quotient space of R² over the lattice {m.1 + n.z | m, n ∈ Z} ⊂ C. Let X ⊂ H² be the set of all parameters z corresponding to the tori with three equally short shortest geodesics (i.e., tori glued from a regular hexagon). Then the Farey tessellation is nothing but the Voronoi diagram of H² with respect to X.

Such applications motivated us to study Voronoi diagrams on hyperbolic spaces. In [17], the first two authors of this paper studied the Voronoi diagram in the Poincaré hyperbolic disk, where the running time of the proposed algorithm was O(n²). In this paper, we present a new method to compute the Voronoi diagram in the Poincaré hyperbolic disk whose expected worst case running time is O(n log n).

This paper is organized as follows. In Section 2, a brief introduction to the Poincaré hyperbolic disk is given. Section 3 briefly reports the required maps we use to transfer the Poincaré hyperbolic disk to the Euclidean plane R², compute the Voronoi diagram in R², and then transfer it back. Section 4 is devoted to some implementations.

2 Poincaré hyperbolic disk

The Poincaré hyperbolic disk is a two-dimensional model for hyperbolic geometry. Therefore it has negative curvature and is defined as the disk D² = {(x, y) ∈ R² | x² + y² < 1}, with hyperbolic metric ds² = (dx² + dy²)/(1 − x² − y²)². See [2] and [12] for details.

The Poincaré disk is a model for hyperbolic geometry in which a geodesic (which is like a line in Euclidean geometry) is represented as an arc of a circle whose ends are perpendicular to the disk's boundary (and diameters are also permitted). Two arcs which do not meet correspond to parallel rays, arcs which meet orthogonally correspond to perpendicular lines, and arcs which meet on the boundary are a pair of limit rays (see Fig. 1).

Figure 1: Poincaré disk and some of its geodesics

The equation of a geodesic of D² is expressed as either

x² + y² − 2ax − 2by + 1 = 0, with a² + b² > 1,

or

ax = by.

Geodesics are the basic building blocks for computational geometry on the Poincaré disk. The distance between two points is naturally induced from the metric of D²; consider two points z1(x1, y1), z2(x2, y2) ∈ D². The distance between z1 and z2, denoted by d(z1, z2), can be expressed as

d(z1, z2) = ∫_{the geodesic connecting z1 and z2} ds = tanh⁻¹(| (z2 − z1) / (1 − z̄1 z2) |).
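In code, this distance is a one-liner. The following Python sketch (ours, not the authors' C++ code) evaluates the closed form above with complex arithmetic:

    import math

    def poincare_distance(z1: complex, z2: complex) -> float:
        """d(z1, z2) = arctanh(|z2 - z1| / |1 - conj(z1) * z2|)."""
        return math.atanh(abs((z2 - z1) / (1 - z1.conjugate() * z2)))

    print(poincare_distance(0.0 + 0.0j, 0.5 + 0.0j))  # distance from the center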


3 Our method

Suppose we are given a set S of n points (representing sites) in D². To construct the Voronoi diagram, we use a combination of four maps to transfer these sites into the Euclidean plane. The maps are defined between four hyperbolic models and the Euclidean plane, denoted by D², S², K², H² and R², respectively. In [7], Cannon et al. have an elegant discussion about these hyperbolic models:

1. D² = {(x, y) : x² + y² < 1},  ds²_{D²} = (dx² + dy²)/(1 − x² − y²)²

2. S² = {(x, y, z) : x² + y² + z² = 1, z > 0},  ds²_{S²} = (dx² + dy² + dz²)/z²

3. K² = {(x, y) : x² + y² < 1},  ds²_{K²} = 4(dx² + dy²)/(1 − x² − y²)²

4. H² = {(x, y, z) : z² − x² − y² = 1, z > 0},  ds²_{H²} = dx² + dy² − dz²

The list of maps that we defined and used is given in the following:

(a) A central projection map from the point (0, 0, −1), f1 : D² → S², with
(x, y) ↦ ( 2x/(1 + x² + y²), 2y/(1 + x² + y²), (1 − x² − y²)/(1 + x² + y²) ).

(b) A lifting map f2 : S² → K², with
(x, y, z) ↦ (x, y, 1).

(c) A central projection map from the point (0, 0, 0), f3 : K² → H², with
(x, y, 1) ↦ ( x/√(1 − x² − y²), y/√(1 − x² − y²), 1/√(1 − x² − y²) ).

(d) A central projection map from the point (0, 0, −2), f4 : H² → R², with
(x, y, z) ↦ ( 2x/(z + 2), 2y/(z + 2) ).

Fig. 2 is an illustration of the above mentioned spaces and the connecting maps.

Figure 2: An illustration of the combination maps between D² and R²

Now, by using any algorithm in [4] for constructing the Voronoi diagram of the transferred sites in R², which has worst case running time complexity O(n log n), the combination of the inverses of the fi allows us to obtain the Voronoi diagram in D². This combination is robust, as the subsequent theorem verifies.

Theorem 1: Let z1 and z2 be two points in R² and J be their bisector. Then f(J) is the bisector of f(z1) and f(z2) in D², where f = f1⁻¹ ∘ f2⁻¹ ∘ f3⁻¹ ∘ f4⁻¹ and the fi (i = 1, 2, 3, 4) are the above mentioned maps.

Proof: Since we use the geodesics in each hyperbolic model and in the Euclidean plane R², by using the corresponding metrics ds² we obtain that the bisector of two given points z1 and z2 in R² will be mapped to the bisector of f(z1) and f(z2) in D², and vice-versa.

As the complexity of the mentioned maps is linear, we conclude that the complexity of our method to compute the Voronoi diagram of a set of sites in D² is O(n log n), using any algorithm with complexity O(n log n) to compute the Voronoi diagram in R² for the sites transferred from D². This yields the following consequence:

The hyperbolic Voronoi diagram can be constructed with an O(n log n) time complexity algorithm.

4 Implementation

In this section we present our implementation and discuss its performance in a series of experiments designed to test different aspects of our algorithm and implementation. Our code has been written in C++, and for visualization we have used MATLAB. Our implementation in C++ has three main steps.

In the first step we transfer our points (sites) from the Poincaré disk to R². In this step the program reads the coordinates of the points from a file and then uses some methods and functions to transfer them to R². In the second step we work on the transferred points, use Fortune's algorithm, and draw the Voronoi diagram of the points. Source code for Fortune's algorithm is available in [11, 24]. The output is the set of end points of the Voronoi edges in R². In the third step we transfer the end points back to the Poincaré disk, using the inverses of the maps defined in the first step. The output is the set of end points of the Voronoi edges in the Poincaré disk. Since we have the formula for a geodesic in the Poincaré disk, we can draw the Voronoi edges easily.

We have used Visual C++ in Microsoft Visual Studio .NET 2005 with .NET Framework 2.0 and MATLAB Ra 2006. All experiments were run on an ASUS Notebook Z53j series with a 2.0 GHz Core 2 Duo CPU and 2 GB DDR2 RAM.

In Fig. 3 the result of our implemented method for the five random sites situation is given.
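The first step of the pipeline is just the composition f4 ∘ f3 ∘ f2 ∘ f1. The following Python sketch (ours, based on our reading of the formulas in Section 3, not the authors' C++ code) transfers one site from D² to R²; the inverse transfer in the third step is analogous:

    import math

    def disk_to_plane(x, y):
        """Compose f4 ∘ f3 ∘ f2 ∘ f1 to send a Poincare-disk site to R^2."""
        r2 = x * x + y * y                 # must satisfy r2 < 1
        # f1: D^2 -> S^2, central projection from (0, 0, -1)
        sx = 2 * x / (1 + r2)
        sy = 2 * y / (1 + r2)
        # f2: S^2 -> K^2, lift onto the plane z = 1 (drops the z coordinate)
        kx, ky = sx, sy
        # f3: K^2 -> H^2, central projection from the origin
        w = math.sqrt(1 - kx * kx - ky * ky)
        hx, hy, hz = kx / w, ky / w, 1 / w
        # f4: H^2 -> R^2, central projection from (0, 0, -2)
        return 2 * hx / (hz + 2), 2 * hy / (hz + 2)

    print(disk_to_plane(0.3, -0.2))   # one transferred site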

Figure 3: Result for the nine random sites in R² and D²

Acknowledgments

We would like to thank Professor Dr. R. Klein for reading the first version of the manuscript.

References

[1] S. Anisov, Geometrical spines of lens manifolds, Department of Mathematics, Utrecht University, 2005.
[2] J. W. Anderson, Hyperbolic Geometry, Springer-Verlag, New York, 1999.
[3] F. Aurenhammer, Voronoi Diagrams: a Survey of a Fundamental Geometric Data Structure, ACM Computing Surveys 23(3) (1991), 345-405.
[4] F. Aurenhammer and R. Klein, Voronoi Diagrams, Handbook of Computational Geometry, J. Sack and G. Urrutia, editors, Elsevier Science Publishers, B.V. North-Holland, Chapter 5, pages 201-290, 2000.
[5] K. Q. Brown, Voronoi diagrams from convex hulls, Inform. Process. Lett. 9 (1979), 223-228.
[6] H. Brettel, F. Vienat, and J. D. Mollonl, Computerized simulation of color appearance for dichromats, Journal of the Optical Society of America 14(10) (1997), 2647-2655.
[7] J. W. Cannon, W. J. Floyd, R. Kenyon, and W. R. Parry, Flavors of Geometry, MSRI Publications 31 (1997), 59-115.
[8] The CGAL User and Reference Manual: All Parts, Release 3.3, 2007.
[9] R. L. Drysdale and D. T. Lee, Generalization of Voronoi diagrams in the plane, SIAM J. Comput. 10 (1981), 73-87.
[10] H. Edelsbrunner, J. O'Rourke, and R. Seidel, Constructing arrangements of lines and hyperplanes with applications, Proc. 20th Ann. IEEE Symp. FOCS (1983), 83-91.
[11] S. Fortune, http://cm.bell-labs.com/who/sjf/index.html.
[12] C. Goodman-Strauss, Compass and Straightedge in the Poincaré Disk, Amer. Math. Monthly 108 (2001), 33-49.
[13] D. G. Kirkpatrick, Efficient computation of continuous skeletons, Proc. 20th Ann. IEEE Symp. FOCS (1979), 18-27.
[14] D. T. Lee, On k-nearest neighbor Voronoi diagrams in the plane, IEEE Trans. Comp. C-31, 6 (1982), 478-487.
[15] D. T. Lee, Two-dimensional Voronoi diagrams in the Lp metric, JACM 27(4) (1980), 604-618.
[16] D. T. Lee and C. K. Wong, Voronoi diagrams in L1 (L∞) metrics with two-dimensional storage applications, SIAM J. Comput. 9 (1980), 200-211.
[17] Z. Nilforoushan and A. Mohades, Hyperbolic Voronoi Diagram, ICCSA 2006, LNCS 3984 (2006), 735-742.
[18] A. Okabe, B. Boots, K. Sugihara, and N. Chiu, Spatial tessellations: concepts and applications of Voronoi diagrams, Wiley Series in Probability and Statistics, 2000.
[19] K. Onishi and J. Itoh, Voronoi diagram in simply connected complete manifold, IEICE Trans. Fundamentals E85-A, 5 (2002), 944-948.
[20] R. Seidel, A convex hull algorithm optimal for point sets in even dimensions, M.S. thesis, Rep. 81-14, Dept. Computer Science, Univ. of British Columbia, 1981.
[21] M. I. Shamos and D. Hoey, Closest-Point Problems, Proceedings 16th IEEE Symposium on Foundations of Computer Science (1975), 151-162.
[22] W. P. Thurston, Three dimensional Geometry and Topology, Princeton University Press, 1997.
[23] J. R. Weeks, The Shape of Space, CRC, 2nd edition, 2001.
[24] Voronoi Resources, http://www.skynet.ie/~sos/mapviewer/voronoi.php.


Solving Systems of Nonlinear Equations Using The Cuckoo


Optimization Algorithm
Mahdi Abdollahi

Shahriar Lotfi

Aras International Campus, University of Tabriz

University of Tabriz

Department of Computer Sciences

Department of Computer Sciences

m.abdollahi89@ms.tabrizu.ac.ir

shahriar_lotfi@tabrizu.ac.ir

Davoud Abdollahi
University of Tabriz
Department of Mathematics
d_abdollahi@tabrizu.ac.ir

Abstract: Systems of nonlinear equations arise in a diverse range of sciences such as economics, engineering, chemistry, mechanics, medicine and robotics. For solving systems of nonlinear equations there are several methods, such as Newton-type methods, the Particle Swarm Optimization algorithm (PSO), and the Conjugate Direction method (CD), each of which has its own strengths and weaknesses. The most widely used algorithms are Newton-type methods, though their convergence and effective performance can be highly sensitive to the initial guess of the solution supplied to the methods. This paper introduces a novel evolutionary algorithm called the Cuckoo Optimization Algorithm, and some well-known problems are presented to demonstrate the efficiency and better performance of this new robust optimization algorithm. In most instances the solutions have been significantly improved, which proves its capability to deal with difficult optimization problems.
Keywords: Systems of Nonlinear Equations; Optimization; Cuckoo Optimization Algorithms; Evolutionary Algorithm.

1 Introduction

Solving systems of nonlinear equations has always been important in science. Most scientific problems are related to systems of nonlinear equations. As is known, there are two types of systems of equations: the first type is linear and the second type is called nonlinear. There are several methods for the first type, but there are few methods for the second type, and the solution often comes with approximation.

So far, several methods have been presented for solving systems of nonlinear equations. Existing methods have tried to solve such problems in less time and with higher accuracy. The genetic algorithm is used in [1], and the particle swarm algorithm has been improved in [3] for solving systems of nonlinear equations. Among mathematical methods we can point to the Filled Function methods [4].

In this paper, we introduce the Cuckoo Optimization Algorithm (COA) for solving systems of nonlinear equations. The results of the cuckoo optimization algorithm are compared with other methods found in [1], [3] and [4] to illustrate the power and high efficiency of this algorithm.

In Section 2, we briefly overview the COA. In Section 3, how to apply the cuckoo algorithm for solving systems of nonlinear equations is explained. In Section 4, the obtained numerical results are presented as a comparison, and finally in Section 5 we have the conclusions and future works.

* Corresponding Author, P. O. Box 51586-49456, F: (+98) 411 669-6012, T: (+98) 914 116-2612

2 The Cuckoo Optimization Algorithm (COA)

Like other evolutionary algorithms, the proposed algorithm starts with an initial population of cuckoos. These initial cuckoos have some eggs to lay in some host birds' nests. Some of these eggs, which are more similar to the host bird's eggs, have the opportunity to grow up and become mature cuckoos. The other eggs are detected by the host birds and are killed. The grown eggs reveal the suitability of the nests in that area. The more eggs survive in an area, the more profit is gained in that area. So the position in which more eggs survive will be the term that COA is going to optimize [2].

To start the optimization algorithm, a candidate habitat matrix of size Npop × Nvar is generated. Then some randomly produced number of eggs is supposed for each of these initial cuckoo habitats. In nature, each cuckoo lays from 5 to 20 eggs. These values are used as the upper and lower limits of egg dedication to each cuckoo at different iterations. Another habit of real cuckoos is that they lay eggs within a maximum distance from their habitat. From now on, this maximum range will be called the Egg Laying Radius (ELR). In an optimization problem with upper limit var_hi and lower limit var_low for the variables, each cuckoo has an egg laying radius (ELR) which is proportional to the total number of eggs, the number of the current cuckoo's eggs, and the variable limits var_hi and var_low. So ELR is defined as:

ELR = α × (Number of current cuckoo's eggs / Total number of eggs) × (var_hi − var_low)    (3)

where α is an integer, supposed to handle the maximum value of ELR.

Each cuckoo starts laying eggs randomly in some other host birds' nests within her ELR. After the egg laying process, p% of all eggs (usually 10%), with lower profit values, will be killed.

When young cuckoos grow and become mature, they immigrate to new and better habitats with more similarity of eggs to the host birds' and with more food for the new youngsters. To recognize which cuckoo belongs to which group, the K-means clustering method is used (a k of 3-5 seems to be sufficient in simulations).

When each cuckoo moves toward the goal point, it only flies a part of the way and also has a deviation. Each cuckoo only flies λ% of the distance toward the goal habitat and also has a deviation of φ radians. For each cuckoo, λ and φ are defined as follows:

λ ~ U(0, 1),    φ ~ U(−π/6, π/6)    (4)

Due to the fact that there is always an equilibrium in birds' populations, a number Nmax controls and limits the maximum number of live cuckoos in the environment.

After some iterations, all the cuckoo population moves to one best habitat, with maximum similarity of eggs to the host birds' and with the maximum food resources. There will be the least egg losses in this best habitat. Convergence of more than 95% of all cuckoos to the same habitat puts an end to the Cuckoo Optimization Algorithm (COA).
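For concreteness, equations (3) and (4) translate directly into code. The following Python sketch is ours, not the authors' implementation; α and the variable limits are inputs:

    import math
    import random

    def egg_laying_radius(alpha, own_eggs, total_eggs, var_hi, var_low):
        # ELR = alpha * (current cuckoo's eggs / total eggs) * (var_hi - var_low)
        return alpha * (own_eggs / total_eggs) * (var_hi - var_low)

    def motion_parameters():
        lam = random.uniform(0.0, 1.0)                   # fraction of the distance flown
        phi = random.uniform(-math.pi / 6, math.pi / 6)  # deviation in radians
        return lam, phi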

3 Solving Systems of Nonlinear Equations With COA

Let the form of systems of nonlinear equations be:

f1(x1, x2, ..., xn) = 0
f2(x1, x2, ..., xn) = 0
...
fn(x1, x2, ..., xn) = 0        (1)

In order to transform (1) into an optimization problem, we use the auxiliary function

min f(x) = Σ_{i=1}^{n} f_i²(habitat),    (2)

habitat = (x1, x2, ..., xn).

In order to solve systems of nonlinear equations, it is necessary that the values of the problem variables be formed as an array. In GA and PSO terminologies this array is called "Chromosome" and "Particle Position", respectively; here in COA it is called "habitat". In an Nvar-dimensional optimization problem, a habitat is an array of size 1 × Nvar, representing the current living position of a cuckoo. The profit of a habitat is obtained by evaluating the fitness function f(x) in equation (2). We should mention that COA maximizes a profit function; to use COA in cost minimization problems, one can simply multiply the profit function by minus one.
4 Evaluation and Experimental Results

In this section, the results of applying this algorithm to solve the following problems are offered:

Problem 1 [1]:

cos(2x1) − cos(2x2) − 0.4 = 0
2(x2 − x1) + sin(2x2) − sin(2x1) − 1.2 = 0

−2 ≤ x1 ≤ 2, −2 ≤ x2 ≤ 2

Problem 2 [1]:

f1(x1, x2) = e^{x1} + x1 x2 − 1 = 0
f2(x1, x2) = sin(x1 x2) + x1 + x2 − 1 = 0

−2 ≤ x1 ≤ 2, −2 ≤ x2 ≤ 2

Problem 3 [1]:

0 = x1 − 0.25428722 − 0.18324757 x4 x3 x9
0 = x2 − 0.37842197 − 0.16275449 x1 x10 x6
0 = x3 − 0.27162577 − 0.16955071 x1 x2 x10
0 = x4 − 0.19807914 − 0.15585316 x7 x1 x6
0 = x5 − 0.44166728 − 0.19950920 x7 x6 x3
0 = x6 − 0.14654113 − 0.18922793 x8 x5 x10
0 = x7 − 0.42937161 − 0.21180486 x2 x5 x8
0 = x8 − 0.07056438 − 0.17081208 x1 x7 x6
0 = x9 − 0.34504906 − 0.19612740 x10 x6 x8
0 = x10 − 0.42651102 − 0.21466544 x4 x8 x1

−10 ≤ xi ≤ 10, i = 1 to 10

Problem 4 [3]:

x1³ − 3x1 x2² − 1 = 0
3x1² x2 − x2³ + 1 = 0

−1 ≤ x1 ≤ 2, −1 ≤ x2 ≤ 2

Problem 5 (Neurophysiology Application) [1]:

x1² + x3² = 1
x2² + x4² = 1
x5 x3³ + x6 x4³ = 0
x5 x1³ + x6 x2³ = 0
x5 x1 x3² + x6 x4² x2 = 0
x5 x1² x3 + x6 x2² x4 = 0

|xi| ≤ 10

Problem 6 [4]:

0.5 sin(x1 x2) − 0.25 x2/π − 0.5 x1 = 0
(1 − 0.25/π)(exp(2x1) − e) + e x2/π − 2e x1 = 0

0.25 ≤ x1 ≤ 1, 1.5 ≤ x2 ≤ 2π

The parameters used in the cuckoo algorithm for the problems are listed in Table 1.

Table 1: Used parameters in the cuckoo algorithm for the problems

Parameter                      | P1    | P2    | P3    | P4    | P5    | P6
Initial pop.                   | 20    | 40    | 5     | 5     | 5     | 5
Range of eggs for each cuckoo  | [2,4] | [2,4] | [2,4] | [2,4] | [2,4] | [2,4]
Number of iterations           | 150   | 200   | 300   | 30    | 300   | 300
Maximum of cuckoos             | 250   | 300   | 500   | 1000  | 1000  | 300
Number of clusters             | 50    | 50    | 50    | 50    | 50    | 50
Egg laying radius              | -     | -     | -     | -     | -     | -
Pop. variance                  | 0.001 | 0.001 | 0.001 | 0.001 | 0.001 | 0.001

Table 2 shows the solution obtained for problem 1. To study the other compared methods in Table 2, see [1], [5], [6] and [7].

Table 2: Results of problem 1

Method       | (x1, x2)             | (f1, f2)
Newton's     | (0.15, 0.49)         | (−0.00168, 0.01497)
Secant       | (0.15, 0.49)         | (−0.00168, 0.1497)
Broyden's    | (0.15, 0.49)         | (−0.00168, 0.1497)
Effati's     | (0.1575, 0.4970)     | (0.005455, 0.00739)
Evolutionary | (0.15772, 0.49458)   | (0.001264, 0.000969)
COA          | (0.1563, 0.4931)     | (−3.2559e−004, 1.2562e−006)

Table 3 shows the solution obtained for problem 2.

Table 3: Results of problem 2

Method       | (x1, x2)             | (f1, f2)
Effati's     | (0.0096, 0.9976)     | (0.019223, 0.016776)
Evolutionary | (−0.00138, 1.0027)   | (−0.00276, −0.0000637)
COA          | (−0.00003, 1.00009)  | (−0.0000745, 0.0000174)

As Tables 2 and 3 show, the solutions obtained for problems 1 and 2 are better and have higher accuracy than the solutions obtained by the other methods. Table 4 shows the solution obtained for problem 3.

Table 4: Results of problem 3

Evolutionary:
(x1, ..., x10) = (0.1224819761, 0.1826200685, 0.2356779803, −0.0371150470, 0.3748181856, 0.2213311341, 0.0697813035, 0.0768058043, −0.0312153867, 0.1452667120)
(f1, ..., f10) = (0.1318552790, 0.1964428361, 0.0364987069, 0.2354890155, 0.0675753064, 0.0739986588, 0.3607038292, 0.0059182979, 0.3767487763, 0.2811693568)

COA:
(x1, ..., x10) = (0.2482000000, 0.3869000000, 0.2772000000, 0.1908000000, 0.4453000000, 0.1487000000, 0.4266000000, 0.0647000000, 0.3467000000, 0.4119000000)
(f1, ..., f10) = (−0.0094474086, 0.0060038145, −0.0011322079, −0.0097329967, 0.0001244906, −0.0000867383, −0.0051325862, −0.0085537600, 0.0008737175, −0.0152687483)

Table 5 shows the solution obtained for problem 4.

Table 5: Results of problem 4

Method | (x1, x2)                                | (f1, f2)
PSO    | (1.08421508149135, −0.29051455550725)   | (−9.99200722162e−016, 6.77236045021e−015)
COA    | (1.08421508149135, −0.29051455550725)   | (−9.99200722162e−016, 6.77236045021e−015)

The solution obtained for problem 5 is seen in Table 6, and Figure 1 shows the convergence graph.

As Table 6 shows, the results of COA are better than those of the evolutionary algorithm.

Table 6: Results of problem 5

Evolutionary:
(x1, ..., x6) = (−0.8078668904, −0.9560562726, 0.5850998782, −0.2219439027, 0.0620152964, −0.0057942792)
(f1, ..., f6) = (0.0050092197, 0.0366973076, 0.0124852708, 0.0276342907, 0.0168784849, 0.0248569233)

COA:
(x1, ..., x6) = (−1.0000000000, −1.0000000000, −0.0137000000, −0.0138000000, 0.5209000000, −0.5207000000)
(f1, ..., f6) = (1.8769e−004, 1.9044e−004, 2.9019e−008, −2.0000e−004, 1.3944e−006, 4.9330e−005)

Figure 1: The convergence chart of problem 5 (cost value versus cuckoo iteration; current cost = 1.1393e−007 at iteration 300)

Results obtained for problem 6 are in Table 7.

Table 7: Results of problem 6

Method          | (x1, x2)                  | (f1, f2)
Filled Function | (0.50043285, 3.14186317)  | (−0.00023852, 0.00014159)
COA             | (0.29930000, 2.83660000)  | (−0.000071289, 0.000026644)

Figure 2 indicates the stability diagram of problem 6 over 30 runs. We achieved an acceptable mean and standard deviation of 4.84e−07 and 5.0426e−07, respectively.

Figure 2: The stability chart of problem 6 (fitness function versus run number, 30 runs)

5 Conclusion and Future Works

In this paper, we used the cuckoo optimization algorithm for solving systems of nonlinear equations. Some well-known problems were presented to demonstrate the efficiency of finding the best solution using the COA. The proposed method had very good performance and was able to achieve better results, as shown in Tables 2-7. In this algorithm, a gradual evolution in reaching the answer was quite visible; Figure 1 reveals this fact. According to Figure 2, the results have been stable. Therefore, we can say that this algorithm has high performance for solving systems of nonlinear equations and is effective at finding the optimum solutions with high accuracy.

As future work, we are planning to extend COA to solving boundary value problems such as Harmonic and Biharmonic equations. We can also use the normal distribution instead of the uniform distribution to achieve better results. It is noteworthy that the convergence speed could be raised by the use of chaos theory [8].

References

[1] C. Grosan and A. Abraham, A New Approach for Solving Nonlinear Equations Systems, IEEE Transactions on Systems, Man, and Cybernetics, Part A: Systems and Humans 38 (3) (May 2008), 698-714.
[2] R. Rajabioun, Cuckoo Optimization Algorithm, Applied Soft Computing 11 (2011), 5508-5518.
[3] M. Jaberipour, E. Khorram, and B. Karimi, Particle Swarm Algorithm for Solving Systems of Nonlinear Equations, Comput. Math. Appl. 62 (2011), 566-576.
[4] C. Wang, R. Luo, K. Wu, and B. Han, A New Filled Function Method for an Unconstrained Nonlinear Equation, J. Comput. Appl. Math. 235 (2011), 1689-1699.
[5] C. G. Broyden, A Class of Methods for Solving Nonlinear Simultaneous Equations, Math. Comput. 19 (92) (Oct. 1965), 577-593.
[6] W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, Numerical Recipes in C: The Art of Scientific Computing, Cambridge Univ. Press, Cambridge, U.K., 2002.
[7] S. Effati and A. R. Nazemi, A new method for solving a system of the nonlinear equations, Appl. Math. Comput. 168 (2) (2005), 877-894.
[8] H. Bahrami, K. Faez, and M. Abdechiri, Imperialistic Competitive Algorithm Using Chaos Theory for Optimization, 12th International Conference on Computer Modelling and Simulation (2010).

A Novel Model-Based Slicing Approach For Adaptive Softwares


Sanaz Sheikhi

Seyed Morteza Babamir

University of Kashan

University of Kashan

Department of Computer Engineering

Department of Computer Engineering

sheikhi@grad.kashanu.ac.ir

babamir@kashanu.ac.ir

Abstract: Dynamic changes in the operational environments of software and in users' requirements have caused software communities to develop adaptive software. The inherent dynamism of adaptive software makes it complex and error prone, so accomplishing many tasks, such as understanding, testing, and analyzing the cohesion and coupling of an adaptive software system, is a difficult and costly labor. We present a novel approach for slicing an adaptive system whose result can be used for fulfilling these tasks more easily and at less cost. The approach uses the Techne model of an adaptive software system. Being model-based gives the approach the chance of not being involved in the software code and of working at an abstract level.
Keywords: slicing; adaptive software; model-based; Techne model.

1 Introduction

Dynamic environments and stakeholders' needs can be perfectly managed with the help of adaptive software. Adaptive software modifies its behavior or structure in response to changes of the environment or users' requirements. This level of flexibility is accompanied by the risk of more errors and complexity, so the costs of many activities, like testing, integration, debugging, and cohesion and coupling analysis, increase and these activities get more difficult.

Slicing techniques can be useful in these cases. Based on their policy, they choose some statements (commands) of a software program which affect a predefined set of desirable variables (usually output variables), called the criterion. The result, called a slice, comprises the statements affecting the criterion, with irrelevant statements removed. It is simpler, and therefore cost effective and easier to analyze.

There are slicing techniques for both the source code and the model of software [1]. In this paper we propose a new approach for slicing adaptive software, and to avoid the problems of complex code and application dependence we use the Techne model of an adaptive software system. The approach also uses Techne model properties such as preferences and conflicts between the model elements and optionality. In this way the produced slice is also the best way of satisfying the slicing criterion.

The rest of this paper is organized as follows. The next section gives a brief statement of the problem. An introduction to adaptive software and the Techne model is given in Section 3. Concepts of slicing are clarified in Section 4. Section 5 introduces the proposed approach for slicing an adaptive software system. The approach is applied to a case study in Section 6. Related works are briefly reviewed in Section 7. Section 8 includes the conclusion and future works.

* Corresponding Author

2 Problem Statement

Adaptive software systems are inherently complex, and using the usual software engineering approaches for applications like understanding, testing, and cohesion and coupling analysis of them is difficult, has high cost, and is error prone. So new techniques should be used to optimize adaptive software development in order to take advantage of such systems. Slicing is a reducing technique that can solve the problem.

3 ADAPTIVE SOFTWARE

Software systems operate in open, changing and unpredictable environments. So, to be robust, they should be able to adapt to environmental changes as well as to their internal changes and stakeholders' various requirements [2]. There are many languages to model these systems. Structural or object oriented ones specify a system from its developer's point of view and pay no attention to the stakeholders of the system; instead, goal based languages are closer to the stakeholders' view and are easier to understand [3, 4].

Techne [5, 6] is a goal based modeling language for adaptive systems. In addition to the general properties of other languages, Techne has unique properties distinguishing it from other languages. The model is in the form of a directed graph. Its nodes represent propositions related to the environment or the stakeholders of the software, such as:

- Goal (g): a stakeholder's eligible condition that must be satisfied.
- Quality constraint (q): it limits the value of a measurable and well-defined characteristic of the system.
- Soft goal (s): it is like a quality constraint, but it limits the value of an ill-defined characteristic of the system.
- Domain assumption (k): a proposition that should always be true about the environment of the system, regardless of the system.
- Task (t): whatever must be done for the satisfaction of goals, quality constraints and soft goals.

Edges of the graph determine the relations between the nodes. The relations are of four types:

- Inference (I): it conveys the conjunction of a set of propositions to satisfy another proposition.
- Conflict (C): whenever two nodes are related via a conflict relation, they cannot be satisfied together.
- Preference (P): when a proposition is preferred to another one, the former is linked to the latter via a preference relation.
- Optional (O): it shows optional propositions.

To clarify the Techne model, we consider the problem of scheduling a meeting. Scheduling can be done automatically, by the use of email and web forms, or manually. Web forms are designed to acquire participants' calendar constraints and to submit requests to modify the meeting date and location. Web form addresses and the invitations are sent to participants by email. The manual approach organizes the meeting via phone calls. A part of the model of the meeting scheduler is depicted in Figure 1.

Figure 1: Techne model of meeting scheduler

4 SLICING

Program slicing is an approach to choose some pieces of a program affecting a set of desired variables. The result is called a slice. The size of a slice is much less than the size of a program, so a slice is a cost effective choice for program testing, program maintenance, cohesion and coupling analysis, comprehension of a program, and a lot more usages. Almost all slicing methods use a dependency graph of the program. It is a directed graph; its nodes are the statements (commands) of the program and its edges show either data or control dependency between the statements [7]. Slicing methods work based on a slicing criterion (V, n). V is a set of variables to be analyzed that reside before line number n in the program source code. The methods choose all the statements of the program residing before line n that directly or indirectly change the values of the variables of V. They recognize the dependencies and relations between the statements from the program dependency graph.
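For a classical program, this criterion-based selection amounts to a reachability computation over the dependency graph. The following minimal Python sketch (ours, with a hypothetical deps encoding) illustrates the idea:

    def backward_slice(deps, n):
        """deps: statement line -> lines it data/control-depends on."""
        keep, stack = set(), [n]
        while stack:
            line = stack.pop()
            if line <= n and line not in keep:
                keep.add(line)
                stack.extend(deps.get(line, ()))
        return sorted(keep)

    # 1: x = 1; 2: y = 2; 3: z = x + 1; 4: print(y)  -> slice on line 3
    print(backward_slice({3: [1], 4: [2]}, 3))  # [1, 3]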

5 PROPOSED APPROACH

For slicing a program, either its source code or its model can be used. As the source code of an adaptive software system is complicated, we use the model of the adaptive software instead of its code for slicing. We consider the Techne modeling language attributes (preference, optionality) in our slicing policy. The Techne model is in the form of a dependency graph, so we use it as the software dependency graph.

We use the Techne model depicted in Figure 1 to show our slicing approach. In the Techne model, all the goals (gi), soft goals (si), domain assumptions (ki), quality constraints (qi) and tasks (ti) are taken as the graph nodes. All the inference (Ii), conflict (ci) and preference (pi) relations between the nodes are taken as the dependency edges. For slicing, one of the goals, soft goals or quality constraints is picked as the slice criterion, and the following sets should be defined:

- concepts is the set of all the graph nodes.
- A is a subset of concepts and includes all the goals, soft goals and quality constraints.
- Decomposed(ai) ⊆ {B1, ..., Bs}, ai ∈ A, s = |concepts|, where Bt = (b1, ..., bt) with bn ∈ concepts, t = 1, ..., s. There may be different paths for the satisfaction of ai. In fact, each Bt is a path for the satisfaction of ai and is called a solution.
- Conflict(ai): the set of all 2-tuples (bp, bq), where (bp and bq ∈ Ba) or (bp ∈ Ba and bq ∈ Bb, a ≠ b), Ba and Bb ∈ Decomposed(ai), and bp and bq are in conflict with each other. In the graph, bp and bq are connected to each other via an edge with a conflict relation.
- Preference(ai): the set of all 2-tuples (bp, bq), where bp ∈ Ba and bq ∈ Bb, a ≠ b, and Ba and Bb ∈ Decomposed(ai). In the graph, bp is preferred to bq if there is a preference relation from bp to bq.

After preparing the sets, we perform the following steps:

1. Choose one of the members of set A as the slicing criterion, called ai. Set an empty list called the checked-together-list. Set the two flags, prefk and prefj, to zero.
2. Determine Decomposed(ai), Conflict(ai), Preference(ai).
3. For each Bj ∈ Decomposed(ai) and each bp and bq ∈ Bj: if (bp, bq) ∈ Conflict(ai), then remove Bj from Decomposed(ai).
4. If Decomposed(ai) is empty, then there is no path for the satisfaction of ai; return ∅.
5. If Decomposed(ai) has only one member, called BF, then pick it and go to step 11.
6. If every two members of Decomposed(ai) are in the checked-together-list, then go to step 11. If Decomposed(ai) has more than one member not yet in the checked-together-list, then choose two of them, call them Bk and Bj, and:
7. For each bp ∈ Bk, bq ∈ Bj: if (bp, bq) is in Preference(ai), then increase prefk by one unit; if (bq, bp) is in Preference(ai), then increase prefj by one unit.
8. For each member of Bk having the optional label, increase prefk by one unit. For each member of Bj having the optional label, increase prefj by one unit.
9. Insert the names of Bj and Bk into the checked-together-list.
10. If prefj < prefk, then remove Bj from Decomposed(ai). If prefj > prefk, then remove Bk from Decomposed(ai). Go to step 4.
11. Each member of Decomposed(ai) can be a path for the satisfaction of ai. It depends on the designer's or programmer's strategy to choose one of them, called BF.
12. Slice(ai) = Slice(b1) ∪ ... ∪ Slice(bF), where BF = (b1, ..., bF).

At first, the approach chooses a slice criterion (ai), initializes an empty list called the checked-together-list, and sets to zero the two flags (prefk and prefj) that represent the worthiness of two different solutions when comparing them. In the second step, the different solutions of ai become the members of Decomposed(ai), and the conflicts between concepts and the priorities of concepts are determined. In the third step, any solution containing conflicting propositions is removed from Decomposed(ai). In step four, if the Decomposed(ai) set is empty, then the algorithm returns the empty set, meaning that there is no path to satisfy ai. On the other hand, if there is more than one solution, they are compared with respect to the number of optional concepts they cover and the preferences existing among their concepts.

Finally, one of the remaining solutions in Decomposed(ai), called BF, is chosen. Eq. (1) determines the slice:

Slice(ai) = Slice(b1) ∪ ... ∪ Slice(bF),  BF = (b1, ..., bF)    (1)

Slice(ai) contains the most effective elements for the satisfaction of ai and is free of irrelevant elements. So it is a suitable choice for analyzing cohesion and coupling, generating test cases, and many other applications, easily and at less cost.
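To make the steps concrete, the following Python sketch (ours) runs the whole procedure on a toy graph shaped like the case study of the next section; the data encoding is a hypothetical simplification, with the sub-slices of already-resolved concepts given directly, and ties in step 10 resolved arbitrarily rather than deferred:

    def slice_criterion(criterion, decomposed, conflicts, preferences,
                        optional, subslices):
        # Step 3: drop solutions containing a conflicting pair
        solutions = [B for B in decomposed[criterion]
                     if not any((p, q) in conflicts for p in B for q in B)]
        if not solutions:                 # step 4: no satisfying path
            return set()
        # Steps 6-10: pairwise tournament on preference and optionality
        while len(solutions) > 1:
            Bj, Bk = solutions[0], solutions[1]
            pref_j = sum((q, p) in preferences for p in Bk for q in Bj)
            pref_k = sum((p, q) in preferences for p in Bk for q in Bj)
            pref_j += sum(b in optional for b in Bj)
            pref_k += sum(b in optional for b in Bk)
            solutions.remove(Bj if pref_j < pref_k else Bk)
        BF = solutions[0]                 # step 11
        # Step 12: union of the sub-slices, plus the criterion itself
        result = {criterion}
        for b in BF:
            result |= subslices.get(b, {b})
        return result

    # Toy data shaped like Figure 1 (hypothetical encoding, not the paper's):
    decomposed = {"S1": [("G1", "Q1"), ("G1", "Q2")]}
    preferences = {("Q2", "Q1")}
    subslices = {"G1": {"G1", "T1", "T2"}, "Q2": {"Q2"}}
    print(slice_criterion("S1", decomposed, set(), preferences, set(), subslices))
    # -> {'S1', 'Q2', 'G1', 'T1', 'T2'}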

6 Case Study

Applying the slicing approach to the Techne model of the meeting scheduler depicted in Figure 1 with the slicing criterion "accommodate late changes" leads to the creation of the slice depicted in Figure 2. Because of space limitations, we use the following notation for the model elements in this section:

S1: Accommodate late changes.
G1: Obtain change requests via web form.
Q1: Change requests can be set up to 6h prior.
Q2: Change requests can be set up to 3h prior.
T1: Leave web form open up to 3h prior.
T2: Implement web form for change reqs.
T3: Leave web form open up to 6h prior.

The approach steps are:

1. criterion = S1.
2. Decomposed(S1) = {(G1, Q1), (G1, Q2)}; Conflict(S1) = ∅; Preference(S1) = {(Q2, Q1)}.
3.-5. (no changes)
6. Bj = (G1, Q1), Bk = (G1, Q2)
7. prefk++
8. (no optional elements)
9. Insert (Bj, Bk) into the checked-together-list.
10. prefk > prefj, so Decomposed(S1) = {(G1, Q2)}; go to 11.
11. BF = (G1, Q2)
12. slice(S1) = slice(G1) ∪ slice(Q2); slice(G1) = slice(T1) ∪ slice(T2); slice(S1) = {S1, Q2, G1, T1, T2}

The result is a slice which is much smaller, simpler and more cost effective to analyze for different applications in comparison with the whole Techne model. It has fewer nodes and only the relations relevant to the way of satisfying the criterion, and it contains all the elements that affect the satisfaction of the criterion.

Figure 2: Slice for the criterion "accommodate late changes"

7 Related Work

As far as we have studied, no research has been dedicated to slicing of adaptive system models. But there are some researches on slicing of software models, specially UML models. Lizzhang [8] considered the class diagram and does the slicing to extract test cases based on a black box method. Ray [9] used condition slicing of class diagrams, but it is not suitable, as class diagrams are static and do not show the system's behavior with regard to the data dependencies. To compensate for this handicap, Samuel [1] benefited from the sequence diagram, which is dynamic and shows the system behavior, for slicing and test case generation. He proposes a formula for slicing criterion adequacy and claims that it covers the slicing criterion with the least number of test cases. Bertolino [10] focused on message passing between sequence diagram components and tries to generate test cases for covering all the predicates and interactions existing in the sequence diagram.

cases for covering all the predicates and interactions


existing in the sequence diagram.

Discussion and Future Works

We present a novel approach for slicing of an adaptive


system, the result of the approach is specially applicable for testing, verifying coupling and cohesion or understanding of a complex adaptive software. As adaptive softwares are usually complicated, our approach
avoids involving their source codes and instead uses
the adaptive softwares models. In this way the work
proceeds in an abstract level and gets rid of complexity
of codes details.
The approach is designed based on Techne model of an
adaptive software. It benefits the model architecture,
which is in the form of a graph, instead of creating a
dependency graph. Also it is based on the priority and
being optional properties of the Techne modeling language. Therefore, the result of the approach can be
considered as the best solution to satisfy the slice criterion. The produced slice contains exactly those parts
of the model affecting the satisfaction of the criterion
and hence a huge part of the software irrelevant to the
criterion gets eliminated and the analysis cost considerably is reduced.
Naturally there is much to be done in this field. Large
number of comparisons in the approach increases the
cost, and one of our future work is to alter the approach in such a way to solve this problem. And also
we are planning to slice the adaptive system model
with a dynamic method which probably may reduce
the complexity of the approach and size of the slice.

199

Refrences

[1] P. Samuel and R. Mall, A Novel Test Case Design Technique Using Dynamic Slicing of UML Sequence Diagrams,
e-Informatica Software Engineering Journal 2/1 (2008),
367378.
[2] A. G. Ganek and T. A. Corbi, The dawning of the autonomic computing era, IBM Systems Journal 2/1 (2003),
71-92.
[3] E. Nitto, C. Ghezzi, A. Metzger, M. Papazoglou, and K.
Pohl, A journey to highly dynamic, self-adaptive servicebased applications, Automated Software Engineering Journal/USA 15/3 (2008), 313-317.
[4] Q. Zhu, L. Lin, H. M. Kienle, and H. A. Muller: Characterizing maintainability concerns in autonomic element design,
software maintenance ICSM/Beijing (2008), 197-206.
[5] A. Borgida, N. Ernest, I.J. Jureta, A. Lapouchnian, S.
liaskos, and J Mylopoulos, Techne (another) Requirements
Modeling Language, University of Toronto (2009).
[6] I.J. Jureta, A Borgida, N. Ernest, and J Mylopoulos,
Techne: Towards a New Generation of Requirements Modeling Languages with Goals, Preferences, and Inconsistency
Handling, Proceeding of IEEE International Conferance on
Requirement Engineering,sydney,NSW (2010), 115-124.
[7] D. Binkley, S. Danicic, T. Gyimothy, M. Harman, A. Kiss,
and B. Korel, Theoretical foundations of dynamic program
slicing, Theoretical Computer Science 360/23-41 (2006).
[8] W. Linzhang, Y. Jiesong, Y. Xiaofeng, H. Jun, L. Xuandong,
and Z. Guoliang, Generating test cases from UML activity
diagrams based on gray-box method, Proceedings of the 11th
Asia- Pacific Software Engineering Conference/Washington,
DC, USA (2004).
[9] M. Ray, S. S. Barpanda, and D.P. Mohapatra, Test Case
Design Using Conditioned Slicing of Activity Diagram, International Journal of Recent Trends in Engineering 1/2
(2009), 117-120.
[10] A. Bertolino and F. Basanieri, A practical approach to UML
based derivation of integration tests, Proceedings of 4th International Software Quality Week Europe (2000).

A novel approach to multiple resource discoveries in grid environment


Leyli Mohammad khanli

Saeed Kargar

University of Tabriz

Islamic Azad University,Tabriz Branch

Department of Computer Sciences

Department of Computer

Tabriz, Iran

Tabriz, Iran

l-khanli@tabrizu.ac.ir

saeed.kargar@gmail.com

Hossein Kargar
Islamic Azad University, Science and Research Branch
Department of Computer
Hamedan, Iran
h.kargar.ir@gmail.com

Abstract: In this paper, we proposed the method of discovering resource in grid environment
which is able to discover the required combinational resources of users apart from single resources.
In this method, the idea of combination of colors was used for saving and discovering resources.
This method uses combination of colors for illustrating characteristics of resources and the users
use combination of colors or their equivalent codes for requesting their necessary resources. This
method is able to establish the users required resources with low traffic and discover them by a direct
path and diagnose the changes which were occurred in the system and update the environment.
This method is simulated in environments with different sizes and the results show that this method
established lower traffic in environment comparing the other methods and so it is more effective.

Keywords: Facility Location; Voronoi Diagram; Reactive Agent; Computational Geometry; Artificial Intelligence.

Introduction

Computational grid is a virtual distributed computing


environment aimed at establishing an environment for
sharing resources in a wide geographical range. The
resources which are being shared in grid maybe heterogeneous, be in different geographical places and belong
to different domains and etc. It is certain that finding
a resource or combination of resources for executing a
special program is very complicated and difficult. The
grid system should support a mechanism in order to
be implemented so that this mechanism can discover
the required resources of users in a widespread environment with establishing low traffic and give it to the
users. These mechanisms are known as resource discovering mechanisms.
Corresponding

The traditional resource discovering mechanisms


use methods such as centralized methods for discovering resources [16]. These methods are very effective and efficient in discovering the resources in small
environments. Although all available resources in an
environment are managed by a central server, and this
server is able to support well the sent requests, when
the size of environment was spread, these systems confronts with problem. This means that, the information
size which is saved and managed in a server, the size
of requests which are sent to discover the resources to
server, will increase significantly. This led to creating
a bottleneck in server and reduces the efficiency of system. The researchers decided to establish systems for
solving this problem which dont belong to a central
server. These systems are known as distributed systems. In distributed systems, a central server is not
responsible for all resources and the system is man-

Author, F: (+98) 411 3347480, T: (+98) 411 3304122

200

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

aged as distributed. Such systems are systems which


All aforementioned methods except the last method
were presented recently and use resource discovering were methods which discover a resource for user and
tree for discovering the resource [712]. These systems were not able to discover multi-resources simultanecan provide better efficiency than previous methods.
ously to the user. The most important difference between proposed method in this paper with [11] and
In this paper, we propose a method which is able other methods is that the proposed method is able to
to discover multi-resources to a user simultaneously. do simultaneously multi-resource on trees with differWe designated a color to each of the characteristics ent and desirable children (no merely on binary tree).
of resources for saving the information of each of the In this work, we used combination of colors for the first
resources and used combination of their constituting time in this work.
colors for combinational resources. The results of our
simulated method show the efficiency of this method
in grid systems with different sizes.

Our resource discovery and


update methods

The remainder of this paper is organized as follows.


Section 2 presents an overview of related work. Section
3 explains the proposed algorithm. Section 4 shows the
simulation results, and Section 5 concludes the paper In this section, first we explain the manner of designatand outlines some future research directions.
ing colors to each of the resources and then we introduce the method of resource discovering and update.

Related work

Resource discovering problem is one of the most important problems which researchers are trying to solve
it and they have proposed different methods for solving
this problem and we explain some of these methods in
brief in this section.

In this work, for producing the colors we use Color


Schemer Studio 2 [14]. We use three main colors including red, green and blue that each of them are represented on three bands which are darkened from one
direction to the other. Such that the colors near to
255 are lighter and the colors near to 0 are darker colors. We can create many colors from combination of
these three colors. Any desirable color can be presented
by three number of x, y, z(x, y, z {1, 2, 3, ..., 256})
that each of them represented the concentration of red,
green, blue colors respectively. So white is represented
with code (255.255.255) and black is represented with
code (0.0.0). Our imaginative grid environment which
is on a tree structure is shown in Figure 1.

Among the other resource discovering methods,


we can point to the proposed method in [13] which
uses routing tables for this reason. In this method,
the whole environment is considered as a combination
of routers and resources and resource discovering is
As you see in this figure, combinational resources
done by routing tables including the number of routers
phases to each of the type of available resources in the are written in each of nodes. These nodes are imagined
as the grid sites that each of them has resources.
environment.
The other method is the proposed method of Chang
et al which uses resource discovering tree for finding the
required resources of user [7]. This method is also a distributed method and each node is responsible for itself
and its childrens nodes. This method can improve the
previous methods for many parts.
In our previous work [8], we used the weighted resource discovering tree for discovering the resource in
grid environment. This method can reduce the re- Figure 1: An example of typical grid environment on
source discovering cost comparing the previous meth- a Resource discovery tree
ods.
In this example, we imagined two types of CPU,
The proposed method in [11] is a multi-resource discovering method which is done the resource discovering three types of HDD and two types of RAM in environment. As we told, we represented any type of resource
on a binary tree.

201

The Third International Conference on Contemporary Issues in Computer and Information Sciences

with a color. In Figure 2, all imagined resources in our


environment were attributed to a unique color. The
nodes will save the similar color with due to the available resource in a table called Color Table which will
be introduced. If nodes have combination of resources,
they will save combination of the colors of resources.

Figure 4: A sample of Color Table (belong to node 1)

3.1

Resource discovery

Now imagine that, a user needs resource CPU 3.8 GHz


& HDD 2TB. This user delivers the related color to a
node which is nearer (here for example node 7).
Figure 2: Existing resources in our grid environment

For example, if one node has the combination resource CPU 3.8 GHz & HDD 2TB, it will use the resulted color from combination of (255.102.0) (related
to CPU 3.8) and (34.0.204) (related to HDD 2T). For
obtaining combination of colors, it is enough to calculate the integral number of the average of numbers:
([(255+34)/2].[(102+0)/2].[(0+204)/2])=(144.51.102)
Figure 5: A sample of resource discovery in our method
In Figure 3, three samples of combination resources
were represented together with color and code. In FigHaving received this request, the node 7 compares
ure 4, the color table related to node 1 was shown. As
it
first
with its local color and then with available colors
you see, number of the rows of color table of each node
in
its
table.
is for the number of the children of that node. In each
row, the color related to available resources was written
Since there is no conformity, so delivers it to its
in related child.
parent (that means node 4). The node 4 delivers the
request to node 1 in the same way. As it is shown
in Figure 5, node 1 finds a conformity in row 1 which
is related to node 2 and delivers the request to that
node and the node 2 acts in the same way and sends
the request to nodes 5 and 6 and at the end, the desired resources was discovered in two nodes to the user
(multi reservation).

Figure 3: An example of combinational resources

202

As it was seen, the proposed method can discover


the required resource of the user by a direct path on

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

trees with different sizes and desired number of chil- ferent methods. We supposed 300 users that everyone
dren.
requested different number of resources. In Figures 7
and 8, the results were shown.

Simulation Results

As it can be seen in all tests, having increased the


size of environment and having increased the number
of request resources of users, our method have better
efficiency.

We did the simulation in MATLAB environment and


show the results in graphs. We consider the varied simulations for simulating the size of environment; also,
we supposed that the users request different number
of resources at any time. Having these assumptions,
we obtained the results for tree method with height
of 4 [7], FRDT [8], MMO [15, 16] and flooding-based
which are one resource methods. One-resource methods mean the methods which are able to discover just
one resource at any time to the user. In these methods,
we supposed that the current methods send the users
request separately and then discover these resources
for the user [11]. Also the proposed method in [11]
which is a multi-resource method on binary tree is one
of the methods which we compared with our proposed
Figure 7: The number of the visited nodes by the users
method.
requests during the resource discovery that the users
In the first test, the average number of nodes to request two resources
whom the requests are sent, shown for the state in
which every user requests one resource. Here, we compared our method with other methods. As show in
Figure 6, the average number of visited nodes in our
method is lower than other methods and is equal to
FRDT. It is because that first, in this experiment every user requested just one resource and second, since
our method passes just a direct path, like FRDT, so,
both visit the same number of nodes. In this test we
supposed 300 requests.

Figure 8: The number of the visited nodes by the users


requests during the resource discovery that the users
request four resources

Conclusions and future work

Figure 6: Average number of nodes that requests


are forwarded in resource discovery using different ap- This paper presents a method for discovering the disproaches
tributed and scalable resource which supported from
discovering multi-resource in grid dynamic environment. In this method, the idea of combination of colIn next simulations, we established 300 requests in ors was used for saving and discovering resources. The
the environment. These tests show the number of vis- simulation results show that this method is an effective
ited nodes in different size of grid environments for dif- and efficient method in grid environment.

203

The Third International Conference on Contemporary Issues in Computer and Information Sciences

In future, if we can reduce the size of saved data in


tables, we can improve the method for most part.

[15] Ye Zhu, Junzhou Luo, and Teng Ma, Dividing Grid Service Discovery into 2-stage matchmaking, ISPA 2004, LNCS
3358 (2004), 372-381.
[16] Sanya Tangpongprasit, Takahiro Katagiri, Hiroki Honda,
and Toshitsugu Yuba, A time-to-live based reservation algorithm on fully decentralized Resource Discovery in Grid
computing, Parallel Computing 31 (2005).

Refrences
[1] I. Foster, C. Kesselman, and Globus, A meta-computing infrastructure tool-kit, Int. J. High Perform, Comput. Appl
2 (1997), 115-128.

[17] Muthucumaru Maheswaran, Klaus Krauter, and Teng Ma,


A parameter-based approach to Resource Discovery in Grid
Computing Systems, GRID (2000).

[2] M. Mutka and M. Livny, Scheduling remote processing capacity in a workstation processing bank computing system,
Proc. of ICDCS (1987).

[18] K.I. Karaoglanoglou, H.D. Karatza, and Teng Ma, Resource


Discovery in a dynamical grid based on Re-routing Tables,
Simulation Modelling Practice and Theory 16 (2008), 704720.

[3] C. Germain, V. Neri, G. Fedak, and F. Cappello,


XtremWeb: Building an experimental platform for global
computing, Proc. of IEEE/ACM Grid (2000).
[4] A. Chien, B. Calder, S. Elbert, and K. Bhatia, Entropia:
Architecture and performance of an enterprise desktop grid
system, J. Parallel Distrib. Comput 63 (2003), no. 5.
[5] F. Berman, Adaptive computing on the grid using AppLeS,
TPDS 14 (2003), no. 4.

[19] Simone A. Ludwig and S.M.S. Reyhani, Introduction of semantic matchmaking to Grid computing, J. Parallel Distrib.
Comput 65 (2005), 15331541.
[20] Juan Li and Son Vuong, Grid resource discovery using semantic communities, Proceedings of the 4th International
Conference on Grid and Cooperative Computing, Beijing,
China (2005).

[6] M.O. Neary, S.P. Brydon, P. Kmiec, S. Rollins, P. Capello,


and JavelinCC, Scalability issues in global computing, Future Gener. Comput. Syst. J 15 (1999), no. 56, 659-674.

[21] juan Li and Son Vuong, Semantic overlay network for Grid
Resource Discovery, Grid Computing Workshop (2005).

[7] R-.S Chang and M-.S .Hu, A resource discovery tree using
bitmap for grids, Future Generation Computer Systems 26
(2010), 2937.

[22] Cheng Zhu, Zhong Liu, Weiming Zhang, Weidong Xiao,


Zhenning Xu, and Dongsheng Yang, Decentralized Grid Resource Discovery based on Resource Information Community, Journal of Grid Computing (2005).

[8] L.M Khanli and S. Kargar, FRDT: Footprint Resource


Discovery Tree for grids, Future Gener. Comput. Syst 27
(2011), 148-156.
[9] L.M Khanli, A. Kazemi Niari, and S. Kargar, An Efficient
Resource Discovery Mechanism Based on Tree Structure,
The 16th International Symposium on Computer Science
and Software Engineering (CSSE 2011) (2011), 4853.
[10] Leyli Mohammad Khanli, Saeed Kargar, and Ali Kazemi
Niari, Using Matrix indexes for Resource Discovery in Grid
Environment, The 2011 International Conference on Grid
Computing and Applications (GCA11), Las Vegas, Nevada,
USA (2011), 3843.
[11] Leyli Mohammad Khanli, Ali Kazemi Niari, and Saeed Kargar, A binary tree based approach to discover multiple types
of resources in grid computing, International journal of computer science & Emerging Technology, Sprinter Global Publication E-ISSN: 2044-6004 (2010).
[12] leyli Mohammad Khanli, Ali Kazemi Niari, and Saeed Kargar, Efficient Method for Multiple Resource Discoveries in
Grid Environment, The 2011 International Conference on
High Performance Computing & Simulation (HPCS 2011)
(2011).
[13] R. Raman, M. Livny, and M. Solomon, Matchmaking: distributed resource management for high throughput computing, hpdc, Seventh IEEE International Symposium on High
Performance Distributed Computing (HPDC-798), (1998),
140.
[14] Rajesh. Raman, Matchmaking Frameworks for Distributed
Resource Management, Wisconsin-Maddison, 2001.

204

[23] Fawad Nazir, Hazif Farooq Ahmad, Hamid Abbas Burki,


Tallat Hussain Tarar, Arshad Ali, and Hiroki Suguri, A resource monitoring and management middleware infrastructure for Semantic Resource Grid, SAG 2004, LNCS 3458
(2005), 188-196.
[24] Thamarai Selvi Somasundaram, R.A. Balachandar, Vijayakumar Kandasamy, Rajkumar Buyya, Rajagopalan Raman, N. Mohanram, and S. Varun, Semantic based Grid Resource Discovery and its integration with the Grid Service
Broker, Proceedings of 14th 4 International Conference on
Advanced Computing & Communications, ADCOM (2006),
84-89.
[25] J. Li and S. Vuong, A scalable semantic routing architecture
for Grid resource discovery, 11th Int. Conf. on Parallel and
Distributed Systems, ICPADS05 1 (2005), 29-35.
[26] K Karaoglanoglou and H Karatza, Resource discovery in
a dynamical grid system based on re-routing tables, Simulation Modelling Practice and Theory, Elsevier 16 (2008),
no. 6, 704-720.
[27] Color Schemer Studio 2: http://www.colorschemer.com/.
[28] M. Marzolla, M. Mordacchini, and S. Orlando, Resource discovery in a dynamic environment, Proceedings of the 16th
International Workshop on Database and Expert Systems
Applications, DEXA05 (2005), 356-360.
[29] M.Marzolla, M.Mordacchini, and S.Orlando, Peer-to-peer
systems for discovering resources in a dynamic grid, Parallel Comput 33 (2007), no. 45, 339-358.

HTML5 Security: Offline Web Application


Abdolmajid Shahgholi

HamidReza Barzegar

Jawaharlal Nehru Technological University

Jawaharlal Nehru Technological University

School of Information and Technology

School of Information and Technology

Hyderabad, India

Hyderabad, India

Shahgholi a@hotmail.com

Hr.barzegar@gmail.com

G.Praveen Babu
Jawaharlal Nehru Technological University
School of Information and Technology
Hyderabad, India
pravbob@jntu.ac.in

Abstract: Offline Web Application [7]: Web applications are able through using HTML5 Offline
Web Application to make them working offline. A web application can send an instruction which
causes the UA to save the relevant information into the Offline Web Application cache. Afterwards
the application can be used offline without needing access to the Internet. Whether the user is asked
if a website is allowed to store data for offline use or not depends on the UA. For example, Firefox
3.6.12 asks the user for permission but Chrome 7.0.517.44 does not ask the user for permission to
store data in the application cache. In this case the data will be stored in the UA cache without
the user realizing it.

Keywords: Offline Web Application, User Agent, Cache Poisoning

Introduction

< htmlmanif est =0 /cache.manif est0 >


< body >

Creating web applications which can be used offline


was difficult to realize prior to HTML5. Some manufacturers developed complex work around to make their
web applications work offline. This was mainly realized
with UA add-ons the user had to install. HTML5 introduces the concept of Offline Web Applications. A
web application can send information to the UA which
files are needed for working offline. Once loaded the
application can be used offline. The UA recognizes the
offline mode and loads the data from the cache. To tell
the UA that it should store some files for offline use
the new HTML attribute manifest in the < html >
tag has to be used:

The attribute manifest refers to the manifest file


which defines the resources, such as HTML and CSS
files, that should be stored for offline use. The manifest file has several sections for defining the list of files
which should be cached and stored offline, which files
should never be cached and which files should be loaded
in the case of an error. This manifest file can be named
and located anywhere on the server; it only has to end
with .manifest and returned by the web server with
the content-type text/cache-manifest. Otherwise the
UA will not use the content of the file for offline web
application cache.

<!DOCT Y P EHT M L >

User Agent (UA): The UA represents a web application consumer which requests a resource from a web

Corresponding

Author, P. O. Box 1447653148, F: (+98) 021 8827-1350, T: (+91) 8686184291

205

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

application provider. This resource is processes by the


UA and, depending on the resource, is rendered and
displayed by the UA to the end-user. The UA has
the capability to establish Hypertext Transfer Protocol (HTTP) [6] connections to a web server, to render
HTML / CSS and execute JavaScript code correctly.

is possible as well. This mainly breaks the requirement


of UA protection. But breaking this security requirement all other security requirements are endangered
implicitly as well. E.g., if the security requirement secure caching can be broken, an attacker can include any
content into the Offline Web Application cache and use
this code for breaking the other security requirements
Further, the UA has implemented the HTML 4.01 as well.
and HTML 5 standard and its corresponding capabilities such as the Geolocation API or Web Storage .
Web application: The web application is a generic
term of the entity providing web resources and is Composed out of the following three main parts:
Website: The website is composed out of several
single web resources and is accessible via its URI.
Web server: The web server is hosting at least
one website. The HTTP(S) connection is established between the UA and the web server.
Besides hosting websites additional resources are
also provided by the web server. Other connections, such as Web Socket API connections, are
also established between the UA and the web
server.
Database: The database stores any kind of data
needed for the web application such as personal
information about their users.
Motivation
As seen many attacks against web applications exist (in
2010) and the need for security in the Internet grows.
Beside the comfort the web provides, security concerns
are critical points to be considered. This applies to
current web applications but also for future web applications. The threats to web applications described in
this section need to be kept in mind when considering
HTML5 security issues.

Vulnerabilities

With the introduction of Offline Web Applications the


security boundaries are moved. In web applications
prior to HTML5 access control decisions for accessing
data and functions were only done on server side. With
the introduction of Offline Web Applications parts of
these permission checks are moved towards the UA.
Therefore, implementing protections of web applications solely on server side is no longer sufficient if Offline Web Applications are used. The target of attacking web application is not limited to the server-side; attacking the client-side part of Offline Web Application

Threats and attack scenarios

Spoofing the cache with malicious data has been a


problematic security issue already prior to HTML5.
Cache poisoning was possible with already existing
HTML4 cache directives for JavaScript files or other
resources. However, UA cache poisoning attacks were
limited. With HTML5 offline application this cache
poising attacks are more powerful. The following
threats are made worse in HTML5:
Cache Poisoning: It is possible to cache the root
directory of a website. Caching of HTTP as well
as HTTPS pages is possible. This breaks the security requirement of UA protection and Secure
caching.
Persistent attack vectors: The Offline application cache stays on the UA until either the
server sends an update (which will not happen
for spoofed contents) or the user deletes the cache
manually. However, a similar problem as for Web
Storage exists in this case. The UA manufacturers have a different behavior if the recent history is deleted. This breaks the security requirement of UA protection.
User Tracking: Storing Offline Web Application
details can be used for user tracking. Web applications can include unique identifiers in the
cached files and use these for user tracking and
correlation. This breaks the security requirement
of Confidentiality. When the offline application
cache is deleted depends on the UA manufacturers.
As already mentioned, cache poisoning is the most critical security issue for offline web applications. Therefore, possible cache poisoning attack scenario is given
in this section which is motivated on the ideas of an
article from [8]. Figure 1 shows a sequence diagram
which illustrates how an attacker can poison the cache
of a victims UA. The victim goes online through an unsecure malicious network and accesses whichever page
(the page to be poisoned does not have to be accessed

206

The Third International Conference on Contemporary Issues in Computer and Information Sciences

necessarily). The malicious network manipulates the


data sent to the client and poisons the cache of the UA.
Afterwards, the victim goes online through a trusted
network and accesses the poisoned website. Then the
actual attack happens and the victim loads the poisoned content from the cache.

9 The JavaScript performs the login request to


www.filebox-solution.com (From here the steps
are optional; theyre performed to hide the actual attack from the user).
10 The Login request is sent to www.filebox- solution.com.
11 Login successful (The user does not notice the
attack performed).

One may argue that a similar kind of attack was


possible also with standard HTML cache features.
That is correct but the offline application attack has
two advantages:

Caching of the root directory is possible: If the


user opens the poisoned website, the UA will not
make any request to the network and loads the
poisoned content from the cache. If the root directory is cached using HTML4 cache directives,
a request to the server is sent as soon the user
clicks refresh (Either the server sends a HTTP
304 not modified or an HTTP 200 OK or the page
is loaded from the server and not from cache).

Figure 1:
1 Victim access any.domain.com through a malicious access point (e.g. public wireless).
2 The HTTP GET Request is sent through the malicious access point to any.domain.com.

SSL-Resources can be cached as well: In HTML4


Man-in-the-middle attacks were possible but
then the user had to access the website through
the unsecured network. With offline application
caching of the root of an HTTPS website can be
cached; the user does not have to open the website. The user may accept an insecure connection
(certificate warning) in an unsecured network because he does not send any sensitive data. The
real attack happens if the user is back in his secured network, feels safe and logs in to the poisoned application.

3 Any.domain.com returns the response.


4 The access point manipulates the response
from any.domain.com: A hidden Iframe with
src=http://www.filebox-solution.com is added to
the response which is sent to the UA.
5 This hidden Iframe causes the UA to send a request to www.filebox-solution.com in the background (the user will not notice this request).
6 The request to www.filebox-solution.com is intercepted by the malicious access point and returns
a faked login page including malicious JavaScript.
The HTML page contains the cache manifest
declaration. The cache. Manifest file is configured to cache the root directory of www.fileboxsolution.com (the cache. Manifest file itself is
returned with HTTP cache header to expire late
in the future).
7 The victim opens his UA in a trusted network
and enters www.filebox-solution.com in the address bar. Because of the offline application cache
the UA loads the page from the cache including
the malicious JavaScript. No request is sent to
www.filebox- solution.com.

Countermeasures

The threats Persistent attack vectors and Cache poisoning cannot be avoided by web application providers.
The threats are defined in the HTML5 specification.
To come around this problem is to train the users to
clear their UA cache whenever they have visited the Internet through an unsecured network respectively be8 After the user has entered the login credentials to fore they want to access a page to which sensitive data
the faked login form (offline application), it posts are transmitted. Further, the user needs to learn to
the credentials to an attacker controlled server understand the meaning of the security warning and
(JavaScript code execution).
only accept Offline Web Applications of trusted sites.

207

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

Conclusion

Applications such as e-mail clients, word processing


or image manipulation applications will have the capabilities to run completely in the browser. Making
use of HTML5 running these application completely
offline in the browser will also be possible. This provides new ways for malware. Everything the user needs
to run HTML5 web application is a HTML5 supporting browser. This is an ideal target for a malware for
write-once, run everywhere - HTML5 is platform independent. Malware only making use of JavaScript and
HTML5 features may be seen numerous with the initiation of HTML5. It will be new that the targets of
HTML malware will no longer be limited to web application servers but move to the UA as well (beside the
problematic of exploiting browser vulnerabilities) because HTML5 provides feature rich capabilities to the
UA; they can even be persisted without exploiting UA
vulnerabilities, e.g. in the Web Storage. Overall it can
be said that making web applications secure solely with
technological solutions is a very complex task and cannot be done by all web application providers. Therefore, the end-user is highly responsible for using web
applications carefully and only providing personal and
sensitive data if a strong trust relationship exists.

Refrences
[1] World
Wide
Web
Consortium
(W3C),
HTML
4.01
Specification,
and
W3C
Recommendation,
http://www.w3.org/TR/1999/REC-html401-19991224/
(1999).
[2] The World Wide Web Consortium (W3C) and XHTML
1.0 The Extensible HyperText Markup Language,
http://www.w3.org/TR/xhtml1/ (2000).
[3] The World Wide Web Consortium (W3C), HTML5 - A vocabulary and associated APIs for HTM and XHTML, and E.
Jamhour, http://www.w3.org/TR/html5/ 4786 (2007), 196199.
[4] M. Pilgrim, HTML5: Up and Running, Sebastopol: OReilly
Media, 2010.
[5] Web
Hypertext
Application
Technology
Working
Group
(WHATWG),
What
is
the
WHATWG?:
http://wiki.whatwg.org/wiki/FAQ (2011).
[6] Internet Engineering Task Force, The Internet Society:
Hypertext Transfer Protocol HTTP/1.1,
http://www.ietf.org/rfc/rfc2616.txt (1999).
[7] The World Wide Web Consortium (W3C), Offline Web Applications: http://www.w3.org/TR/offline-webapps/ (1999).
[8] Lavakumar Kuppan and Attack and Defense Labs, Chrome
and Safari users open to stealth HTML5 AppCache
attack: http://blog.andlabs.org/2010/06/chrome-and-safariusers-open-to-Stealth.Html (2010).

208

Earthquake Prediction by Study on Vital Signs of Animals in


Wireless Sensor Network by using Multi Agent System
Media Aminian

Amin Moradi

Islamic Azad University

Institute for Advance Studies in Basic Sciences

Scienc and Research branch of Kerman, Iran

Department of Physics

Department of Computer

amin.moradi@iasbs.ac.ir

media.aminian@yahoo.com

Hamid Reza Naji


International Center for Science and High Technology,Kerman,Iran
Department of Computer
hamidnaji@ieee.org

Abstract: We use a multi agent system architecture approach in a wireless sensor network (WSN)
for prediction the occurrence earthquake by study on vital signs animals. This system uses several
agents with different functionalities. CBR methods were applied to analyze and compare the similarity in animal vital signs just before an occurred earthquake with real time to reduce false alarm.
The presented architecture consists of two layers, including interface layer and regional layer. At
the interface layer the interface agents interact with users and at the regional layer, the cluster
agents communicate with each other and packing the information.

Keywords: Earthquake prediction;WSN;Multi Agent System;CBR.

Introduction

efficiency data collection.

Every year more than 13,000 earthquakes with a magnitude greater than 4.5 occurred around the world that
hundreds of them are destructive and too many people
lost their lives[1]. If we could predict them, we will be
able to save many lives. Before the earthquake Earths
crust break and gases such as argon and radon are released into the air[2]. Animals are sensitive to these
gases and their behavior and vital signs in response
to these gases will be changed[3]. So we can detect
stress in animals by measuring the vital signs. A WSN
comprises numerous sensor devices commonly known
as motes which can contain several sensors to monitor the vital signs such as temperature, heart rate, etc.
The sensor motes are spatially scattered over a large
area Since, Data collection is difficult in this network.
So, we presented a multi layer agent system to increase
Corresponding

Author, T: (+98) 937 459-4169

209

The Proposed architecture

The present environment of collaborative agents in


WSN is described by three entities, as shown in Figure
1. These entities include Web browser, software agents
and sensor nodes[4]. The web browser is the gateway
for the user to receive results in the appropriate format.
Agents are the intelligent entities that are able to respond to the user needs and relieve the user from being
the controller of the system[5]. Sensor nodes are physical entities that are able to read temperature, heart
rate, etc from the environment.
The proposed layered system architecture consists of
two layers: interface layer and regional layer. The

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

agents in each layer collaborate with each other to


achieve their goals. The agents on each layer coordinate with their upper layers to transmit information.At
the regional layer, the cluster agents collect the sensory data from the sensors. The clusters agents process and repackage them and finally pull the packets to
Figure 2: A case base for animal vital signs
the interface layer. At the interface layer, the interface
agents interacts with the sensor network and receives
the packets from the cluster agents and shows them
in the appropriate format(text or graphic) and CBR 2.3 CBR
agent measures the similarity coefficient.
In this project, CBR methods and algorithms were
used. CBR (Case Based Reasoning) systems resolve
new problems by recovering similar resolved problems
in the case base and reapplying their solutions to the
new problem[6]. The first step of CBR system is case
representation. A case must contain the problem composition, its solution and its output. In this perspective
a CBR system can be defined by three iterative steps:
1. Retrieve the most similar cases to new problem
from case base.
2. Reuse the solutions of these recovered cases. If
necessary, adapt their solutions to resolve the
new problem by creating a suitable solution to
him.

Figure 1: Multi Agent System architecture

2.1

Regional Layer

Cluster agents operate in regional layer. In the WSN,


we have several sensors that are attached on the animal body and report animal vital signs. The attached
sensors on animal body, sends their readings to the
cluster head. Each cluster head transmits its information to the cluster agents that are locate on regional
layer. These agents receive information from the cluster head, process and repackage them.

2.2

Interface Layer

The interface layer is operated by the interface and


CBR agent. CBR agent maintains a case base that has
implemented by SQL (Figure2). This agent using CBR
methods to measures the occurrence of earthquakes.
Then, it sends all the information to the interface agent
for graphical display.

3. Keep the new solution in the case base in order


to use in the future.

2.4

Similarity Coefficients

Various similarity coefficients are proposed by the researchers in several domains. A similarity coefficient
indicates the degree of similarity between object pairs.
Methods are shown in figure 1[5]. The variables will be
transformed as a the number of property being located
in the two cases, the new case and every recorded case
in the case base; b the number of property being located in the new case; c the number of property being
located in the recorded case. The steps of a CBR system by an analytical approach can be ordered in five
steps:
1. The abnormal vital signs enter as a new case to
the system.
2. Thanks to the interviews made with the experts,
weights have already attributed to every try case.

210

The Third International Conference on Contemporary Issues in Computer and Information Sciences

Weight
0.16
0.15
0.14
0.13
0.11
0.9
0.8
0.7
0.6
1

3. For every recorded case in the base similarity coefficient (Sij) between old case and the new case
is calculated.

4. Cases which have a similarity coefficient under


similarity limit are eliminated.

Degree(1-9)
9
8
7
6
5
4
3
2
1
55

Property
Heart rate
shaking
Temperature
Breath rate
Blood glucose
Urine volume
Calcium
Proteins
Enzymes

Table 1: Importance degrees and weights belonging to


the properties
5. Cases which are very similar to the new case are
retrieved from the case base.
Effects of stress
Increase
Increase
Decrease
Increase
Increase
Increase
Decrease

Normal range
60-120 per minute
37-40 C
20-23 per minute
53-59 mg per cc
89-109 mg per cc
11-14 mg per cc

Property
Heart rate
shaking
Temperature
Breath rate
Blood glucose
Urine volume
Calcium

Table 2: The values of cat vital signs

In this project, a calculation formula of similarity


coefficient is developed by jaccard model:
n
n
X
X
ci = (
Wi ai )/( (Wi ai + Wi bi + Wi ci ))
i=1

(1)

i=1

In this formula, a represent the properties found in


both of the cases, the new case and every registered
Figure 3: Different ways for calculating the similarity case in the case base;b the properties found only in the
new case; c the properties found only in the registered
coefficient in literature
case.

3
2.5

Conclusion

Writing Algorithm

The program compares the similarity of cases with the


new case to find the best solution. For that reason, we
need a method to calculate similarity coefficients. So,
we have to determine weight for the important properties such as heart rate, temperature, calcium, breath
rate, etc. the properties have a given value in normal cases. Table 2 indicates importance degrees and
weights belonging to the properties. The values and
the effective of stress on cats have represented in table
3[7].

211

We presented a layered system architecture using


agents for wireless sensor networks that can be useful
to predict earthquake. The network consists of several sensor nodes that able to sense the vital sign of
animals. Since the changes in animal vital signs may
be due to other factor such as noise or entrance of an
alien animal to their territory, CBR methods employed
to increases confidence coefficient. The proposed system has some disadvantages such as the probability of
detach sensors from the animal body or sensors damage
along animals activity. The other disatvantage is limitation of the animals. Because for each animal there

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

is a corresponding cluster agent in the regional layer.


But we can design a multi-hop WSN to resolve this
problem in the future.

[3] E. Buskirk, Unusual animal behavior before earthquakes:


A review of possible sensory mechanisms, REVIEWS OF
GEOPHYSICS 19 (1981), 247270.
[4] S. Hussain, Collaborative Agents for Data Dissemination in
Wireless Sensor Network . High-Performance Computing in
an Advanced Collaborative Environment (2006), 1623.

Refrences
[1] United
States
Geological
http://earthquake.usgs.gov .

Reading RG6 6BB, UK , Atmospheric and Solar-Terrestrial


Physics 72 (2009), 376-381.

Surveys

USGS,

[2] R. Harrison and K. Aplinb, Atmospheric electricity coupling


between earthquake regions and the ionosphere. a Department of Meteorology, University of Reading, Earley Gate,

[5] P. katia and, multiagent system, AI magazine 19 (1998).


[6] B. Gulcin, Intelligent system applications in electronic
tourism, Elsevier 38 (2010), 6586-6598.
[7] M. Cynthia and R. Klein, MERCK veterinary manual,
MERCK publishing, Chapter 5, pages: 201290, 2006.

212

Availability analysis and improvement with Software Rejuvenation


Zahra Rahmani Ghobadi

Baharak Shakeri Aski

Samangan Institute of Higher Education,Amol

Ramsar Azad University

Department of Computer

Department of Computer

m.rah62@gmail.com

baharakshakeriaski@yahoo.com

Abstract: Today, almost everyone in the world is directly or indirectly affected by computer
systems. Therefore, there is great need for looking at ways to increase and improve the reliability
and availability of computer systems. Software fault tolerance techniques improve these capabilities.
One of Software fault tolerance techniques is Software rejuvenation, which counteracts software
aging. In this paper, we address this technique for the application with one and two and three
software versions then extend model for n versions and show that with more software versions can
greatly improve availability of application.

Keywords: software rejuvenation; Reliability; Availability; continuous-time Markov process.

Introduction

When software applications run continuously, error


conditions are accumulated and the result is a degradation of the computer system or even a crash failure.
This phenomenon has been reported as software aging.
A proactive method in order to counteract this phenomenon is software rejuvenation.
The causes of software aging are memory leaking, unreleased file locks, file descriptor leaking, data corruption
in the operating environment of system resources, etc.
software aging will affect the performance of the application and eventually cause the application to fail. The
software rejuvenation technique terminates the program when its performance declines to a certain degree,
then restarts to clean the inner state and the software
performance will be restored.
Software rejuvenation, first reported by Huang et al.[1]
It has now been performed in various systems that software aging has been observed such as billing applications , process restart in Apache[3]. Furthermore, software rejuvenation has been proposed as an action that
will increase availability of a two node clustered computer systems[4] or service reliability in VoIP server[5]
and moreover to counteract intruders attacks[6].
Many researchers have been concentrated in studying
software rejuvenation under different circumstances.
The research effort varies as far as the system stud Corresponding

Author, T: (+98) 935 822-4714

213

ied or the kind of modeling that is used concerns.


Huang et al.[1] uses a continuous time Markov chain
to model software rejuvenation. Vaidyanathan et al.[7]
use stochastic reward nets (SRNs) to model and analyze cluster systems, which employ software rejuvenation. Park and Kim[8] use semi-Markov process to
model software rejuvenation in order to improve the
availability of personal computer-based active/standby
cluster systems. In[9] both check pointing and rejuvenation are used together to further reduce the expected completion time of a program. In[10] Dohi et al.
formulate software rejuvenation models via the semiMarkov processes, and derive analytically for respective cases the optimal software rejuvenation schedules,
which maximize system availability. Furthermore, they
develop nonparametric statistical algorithms to estimate the optimal software rejuvenation schedules, provided that the statistical complete (uncensored) sample data. In[7] Vaidyanathan et al. construct a semiMarkov reward model based on workload and resource
usage data collected from the UNIX operating system
to model software rejuvenation. Trivedi et al. discuss
stochastic models to evaluate the effectiveness of proactive fault management in operational software systems
and determine optimal times to perform rejuvenation,
for different scenarios in[12]. Two software rejuvenation policies of cluster server systems under varying
workload, called fixed rejuvenation and delayed reju-

CICIS12, IASBS, Zanjan, Iran, May 29-31, 2012

venation are presented and in order to achieve a higher


average throughput one of them is proposed by Xie et
al. in[13]. Okamura et al. in[14] deal with dependability analysis of a client/server software system with rejuvenation. Liu et al. in[15] use software rejuvenation as
a proactive system maintenance technique deployed in
a CMTS (Cable Modem Termination System) cluster
system, study different rejuvenation policies evaluate
these policies by stochastic reward net models solved
by SPNP (Stochastic Petri Net Package). In[16] the
optimal software rejuvenation policy maximizing the
interval reliability in the general semi-Markov framework is considered by Suzuki et al. Furthermore, Bobbio et al. in[17] use fine-grained software degradation
models for optimal rejuvenation policies.

Software Rejuvenation

Figure 1: Software rejuvenation model of single application


The system has three states: the working state 0 (denoted as H), the failure state 1 (denoted as F) and the
rejuvenation state 2 (denoted as R). In the beginning,
the application stays in the working state 0. With system performance degrades over time, a failure may occur.If system failure occurs before triggering software
rejuvenation, the application changes from the working
state 0 to system failure state 1 and then the system
recovery operation is started immediately. Otherwise,
the application changes from the working state 0 to the
software rejuvenation state 2 and later the software rejuvenation is carried out. After completing the system
repair or rejuvenation, the application becomes as good
as new and changes to the beginning working state 0
again. We define the time interval from the beginning
of the system working to the next one as one cycle.
According to the model described above, at any time
t the application can be in any one of three states:
up and available for service (working state 0), recovering from a failure (the failure state 1), or undergoing software rejuvenation (the rejuvenation state 2).
To formally describe the software rejuvenation model
of single version application, continuous time Markov
process denoted as z = (zt ; t0) is used, where zt represents the state of application at time t. The transition
probability function of Z is expressed as follows [10]:

Software rejuvenation is a proactive fault management


technique aiming at cleaning up the internal state of
the system to prevent the occurrence of more severe
crash failures in the future. It involves occasionally
terminating an application or a system, cleaning its
internal state and restarting it[18]. Application is unavailable during rejuvenation. Although rejuvenation
may sometimes increase the downtime of an application, those are usually planned and scheduled downtimes. If care is taken to schedule rejuvenation during
the idlest times of an application, then the cost due to
those downtimes is expected to be short. Downtime
costs are the costs incurred due to the unavailability of
pij (t) = p(zt = j|z0 = i)(i, j , t 0)
(3)
the service during downtime of an application [2] Let
Pij (t) be transition probability function of continuous- Where, = 0, 1, 2 is the state space set.
time Markov process and qij be transition rate. Kol- For the software rejuvenation model in Fig.1,1 , 1 , r1 ,
mogorov forward equation is defined as follows:
and R1 represents the failure rates from system working state to failure state, the transition rate to trigger
N
software rejuvenation, the rejuvenation rate from softX
dpij (t)/dt =
pik (t)qkj , i, j = 0, 1, 2
(1) ware rejuvenation state to system working state and
k=0
the recovery rate from system failure state and the recovery rate from system failure state to system working
By Letting P (t) to be the matrix of transition prob- state, respectively. Let Q be the matrix of the tranability function Pij (t)(i, j = 0, 1, 2, ) and Q to be the sition rate function. According to the state transition
matrix of transition rate function qij (t)(i, j = 0, 1, 2, ), relationship of single version application, the transition
formula (1) can be expressed in matrix format as fol- rate matrix for the continuous time Markov process Z
can be easily derived as:
lows:

(1 + 1 ) 1
1
0
p (t) = p(t)Q
(2)
R1
R1
0
Q=
(4)
r1
0
r1
First, we study Software rejuvenation model for the
application with one software version, model based Let P (t) be the matrix of transition probability function Pij (t)(i, j ). According to Kolmog forward
Markov process, as is show in Fig. 1.

214

The Third International Conference on Contemporary Issues in Computer and Information Sciences

Eq.1, transition probability matrix P (t) satisfies:


p0 (t) = p(t)Q
p(0) = I

(5)

Where, I is the unit matrix.


LetPj , j be the instantaneous steady probability
of single version application in state j. According to
the limit distribution theorem,Pj , j is given by:
pj = lim pij (t)(i, j )
t

(6)

Substitute Eq.4 and 6 to Eq.5, the following equation


is derived:
(1 + 1 )P0 + R1 P1 + r1 P2 = 0
R1 P1 + 1 P0 = 0

this model. The assumptions are explained as following:


Assumption 1: Software rejuvenation is not allowed for
both versions to be carried out concurrently.
Assumption 2: At any time t only one version can be
in rejuvenation state.
Assumption 3: if the version be in failure state, other
versions cant transfer to rejuvenation state.
Assumption 4: rejuvenation rate from software rejuvenation state to system working state is faster than
recovery rate from system failure state to system working state.
Also it is assumed that Zt is the state of the version at
time t, 0 = {0, 1, 2 7} is the state space set. Similarly, we use continuous time Markov process, denoted
as z = (zt ; t 0) to describe the software rejuvenation model of two-node application. The transition
probability function of Z is expressed as Eq. 10 and
pj , j is given by[19]:
pj = lim pij (t)(i, j 0 )
t

r1 P2 + 1 P0 = 0

(9)

Correspondingly, the transition probability matrix


(7) P (t) also satisfies the condition in Eq. 5. By subi=0
stitution Eq. 9 and 10 to Eq. 5 the Eq.11 can be
derived[19]:
Where Pi , i = 0, 1, 2 can be obtained by solving the
Eq.7. The application is available for service requests
in working state 0 and application is unavailable for
rejuvenation state 1 and failure state 2, thereafter, the
system availability for single application is given by:
2
X

pi = 1

pA1 = p0

2.1 Software rejuvenation model of two-node application

We extend the software rejuvenation model of the single application to a two-dimension state space and derive the software rejuvenation model of the two-node application shown in Fig. 2. The states of the application are denoted by a 2-tuple $S$, formally defined as $S = \{(i, j) \mid i, j \in \{H, F, R\}\}$, where $i$ is the state of the first version of the application and $j$ is the state of the second version. For the first version of the application, $\lambda_1$, $\mu_1$, $R_1$, and $r_1$ represent the failure rate from the working state to the failure state, the transition rate that triggers software rejuvenation, the rejuvenation rate from the software rejuvenation state to the working state, and the recovery rate from the failure state to the working state, respectively. Correspondingly, for the second version of the application, $\lambda_2$, $\mu_2$, $R_2$, and $r_2$ denote the failure rate, the transition rate to trigger software rejuvenation, the rejuvenation rate, and the recovery rate, respectively.

For simplicity, we limit this model with the following assumptions:

Assumption 1: Software rejuvenation is not allowed to be carried out for both versions concurrently.
Assumption 2: At any time $t$, only one version can be in the rejuvenation state.
Assumption 3: If a version is in the failure state, the other versions cannot transfer to the rejuvenation state.
Assumption 4: The rejuvenation rate from the software rejuvenation state to the working state is faster than the recovery rate from the system failure state to the working state.

It is also assumed that $Z_t$ is the state of the application at time $t$ and $\Omega_0 = \{0, 1, 2, \ldots, 7\}$ is the state space set. Similarly, we use a continuous-time Markov process, denoted $Z = (Z_t;\ t \geq 0)$, to describe the software rejuvenation model of the two-node application; its transition rate matrix is the $8 \times 8$ matrix of Eq. 9. The transition probability function of $Z$ is expressed as in Eq. 10, and $p_j, j \in \Omega_0$ is given by [19]:

$$p_j = \lim_{t \to \infty} p_{ij}(t) \quad (i, j \in \Omega_0) \qquad (10)$$

Correspondingly, the transition probability matrix $P(t)$ also satisfies the condition in Eq. 5. By substituting Eqs. 9 and 10 into Eq. 5, Eq. 11 can be derived [19]:

$$
\begin{aligned}
&-(\lambda_1+\lambda_2+\mu_1+\mu_2)P_0 + R_1 P_1 + R_2 P_2 + r_1 P_4 + r_2 P_5 = 0\\
&-(R_1+\lambda_2)P_1 + \mu_1 P_0 + R_2 P_3 + r_2 P_7 = 0\\
&-(R_2+\lambda_1)P_2 + \mu_2 P_0 + R_1 P_3 + r_1 P_6 = 0\\
&-(R_1+R_2)P_3 + \mu_2 P_1 + \mu_1 P_2 = 0\\
&-(r_1+\mu_2)P_4 + \lambda_1 P_0 + R_2 P_6 = 0\\
&-(r_2+\mu_1)P_5 + \lambda_2 P_0 + R_1 P_7 = 0\\
&-(r_1+R_2)P_6 + \mu_2 P_4 = 0\\
&-(r_2+R_1)P_7 + \mu_1 P_5 = 0\\
&\textstyle\sum_{i=0}^{7} P_i = 1
\end{aligned}
\qquad (11)
$$

By solving the above equations, we can obtain the values of $P_i$, $i = \{0, 1, 2, \ldots, 7\}$. According to the rejuvenation model in Fig. 2, the application is unavailable in the states (F,F), (R,F), and (F,R). The availability of the two-node application is therefore given by:

$$p_{A2} = p_0 + p_1 + p_2 + p_4 + p_5 = 1 - (p_3 + p_6 + p_7) \qquad (12)$$
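As an illustration of "solving the above equations", Eq. 11 can be transcribed directly as a linear system in $P_0, \ldots, P_7$. The sketch below (not the authors' code) encodes the nine equations exactly as reconstructed above, with the same illustrative rates as before; since the Greek letters in the printed equations had to be reconstructed, the coefficient matrix is itself an assumption:

```python
import numpy as np

# Rates (illustrative values, cf. Table 2); l = failure, m = rejuvenation trigger,
# R = rejuvenation, r = recovery, for versions 1 and 2
l1, l2 = 0.002, 0.002
m1, m2 = 0.005, 0.005
R1, R2 = 0.1, 0.1
r1, r2 = 1.0, 1.0

# Coefficient matrix of the nine linear equations of Eq. 11
# (rows = equations, columns = P0..P7)
A = np.array([
    [-(l1+l2+m1+m2), R1,       R2,       0,        r1,       r2,       0,        0       ],
    [m1,             -(R1+l2), 0,        R2,       0,        0,        0,        r2      ],
    [m2,             0,        -(R2+l1), R1,       0,        0,        r1,       0       ],
    [0,              m2,       m1,       -(R1+R2), 0,        0,        0,        0       ],
    [l1,             0,        0,        0,        -(r1+m2), 0,        R2,       0       ],
    [l2,             0,        0,        0,        0,        -(r2+m1), 0,        R1      ],
    [0,              0,        0,        0,        m2,       0,        -(r1+R2), 0       ],
    [0,              0,        0,        0,        0,        m1,       0,        -(r2+R1)],
    [1, 1, 1, 1, 1, 1, 1, 1],            # normalization: sum of P_i equals 1
])
b = np.zeros(9)
b[-1] = 1.0

p, *_ = np.linalg.lstsq(A, b, rcond=None)   # least-squares solution of the system
pA2 = p[0] + p[1] + p[2] + p[4] + p[5]      # availability, Eq. 12
print("P =", np.round(p, 6))
print("p_A2 =", pA2)
```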
2.2 Software rejuvenation model of three-node application

We extend this work to a three-dimension state space and obtain lower unavailability with the software rejuvenation model of the three-node application, shown in Fig. 3. $Q$ is the matrix of the transition rate function as in Eq. 16. By solving the obtained equations, we obtain the values of $P_i$, $i = \{0, 1, 2, \ldots, 19\}$. According to the rejuvenation model in Fig. 3, the application is unavailable in the states (F,F,F), (R,F,F), (F,R,F), and (F,F,R). The system availability of the three-node application is therefore given by [20]:

$$p_{A3} = 1 - (p_7 + p_{17} + p_{18} + p_{19}) \qquad (13)$$
Figure 2: Software rejuvenation model of two applications.


2.3 Software rejuvenation model of n-version application

As considered in the previous parts, we determined the software rejuvenation model for applications with one, two, and three software versions. By carrying out several experiments, we expanded this model to $n$ software versions, obtained a formula for computing the size of the state space set, and introduce a reference matrix that denotes the transition rate function matrix for $n$ software versions. The non-zero elements of the matrix are shown in Table 1. The table has three main columns, which give the row number, the column number, and the value of each non-zero element. We can, therefore, obtain the number of existing states and the transition rate function matrix for any number of versions. After computing this matrix and placing it in Eq. 5, we can obtain the probability of being in every state. From these probabilities, the system availability can be obtained by the following formula:

$$p_A = 1 - (p_{m-1} + p_{m-2} + \cdots + p_{m-n} + p_{2^n - 1}) \qquad (14)$$

Suppose that $n$ software versions are available; the number of states $m$ at any time $t$ is computed with the following formula:

$$m = 3^n - \left[ \binom{n}{2} 2^{n-2} + \binom{n}{3} 2^{n-3} + \binom{n}{4} 2^{n-4} + \cdots + \binom{n}{n-1} 2^{n-(n-1)} + \binom{n}{n} 2^{n-n} \right] \qquad (15)$$

Here $3^n$ is the number of all possible states, and $\binom{n}{2} 2^{n-2}$ is the number of states in which two versions are in the rejuvenation state. According to Assumption 2, at any time $t$ only one version can be in the rejuvenation state; therefore, the states in which more than one version is in the rejuvenation state must be deducted from $3^n$. Likewise, $\binom{n}{3} 2^{n-3}$ is the number of states in which three versions are in the rejuvenation state and, finally, $\binom{n}{n} 2^{n-n}$ is the single state in which all the versions are in the rejuvenation state.
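Eqs. 14 and 15 are easy to sanity-check programmatically. A small sketch (an addition for illustration, not from the paper):

```python
from math import comb

def num_states(n: int) -> int:
    """Number of states m of the n-version model (Eq. 15):
    3^n minus all states with two or more versions rejuvenating."""
    return 3**n - sum(comb(n, k) * 2**(n - k) for k in range(2, n + 1))

def unavailable_states(n: int) -> list[int]:
    """State indices excluded by the availability formula (Eq. 14)."""
    m = num_states(n)
    return [m - i for i in range(1, n + 1)] + [2**n - 1]

print(num_states(2), num_states(3))   # 8 and 20
print(unavailable_states(3))          # [19, 18, 17, 7]
```

For n = 2 and n = 3 this reproduces the 8 and 20 states of the two- and three-node models, and the unavailable-state indices {19, 18, 17, 7} of Eq. 13.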

Table 1: The non-zero elements in the transition rate function matrix for the n-version model

[Table 1 body omitted: each row gives the row number (R#), the column number (C#), and the value (in terms of the rates $\lambda_i$, $\mu_i$, $r_i$, and $R_i$) of one non-zero element of the n-version transition rate function matrix; the tabular data could not be reliably recovered.]

After computing $m$, the transition rate function matrix, which is an $m \times m$ matrix, is determined from the reference matrix. The probability of being in each state ($P_i$, $i = 1, 2, \ldots, m$) is then obtained by setting up and solving the resulting system of equations; by substituting these $P_i$ into Eq. 14, the probability of system availability is obtained. By computing the availability of applications with multiple versions, we reach the conclusion that as the number of versions increases, the system unavailability decreases considerably.
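Table 1 stores the reference matrix in sparse (row number, column number, value) form. The following sketch shows one way such a listing can be assembled into an $m \times m$ generator; the triples used here are hypothetical stand-ins for the single-version case, not Table 1's actual entries:

```python
import numpy as np

def build_Q(m: int, entries: list[tuple[int, int, float]]) -> np.ndarray:
    """Assemble the m x m transition rate matrix from (row, col, value) triples,
    the representation used by Table 1; diagonals are set so each row sums to 0."""
    Q = np.zeros((m, m))
    for i, j, rate in entries:
        Q[i, j] = rate
    np.fill_diagonal(Q, 0.0)
    np.fill_diagonal(Q, -Q.sum(axis=1))   # generator property: zero row sums
    return Q

# Hypothetical triples for the single-version model (n = 1, m = 3), illustration only
lam, mu, r, R = 0.002, 0.005, 1.0, 0.1
Q = build_Q(3, [(0, 1, mu), (0, 2, lam), (1, 0, R), (2, 0, r)])
print(Q)
```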


Figure 3: Software rejuvenation model of three applications.


Table 2: Parameter values

r1 = ... = rn = 1
R1 = ... = Rn = 0.1
μ1 = ... = μn = 0.005
λ1 = ... = λn = 0.002

Table 3: System unavailability for different numbers of versions

Number of versions    Unavailability
1                     0.047528
2                     0.022814
3                     0.022438
4                     0.00108
5                     0.00061

Numerical Results and Analysis

To acquire the availability measure of the application, we perform numerical experiments, taking system unavailability as the evaluation indicator. The default values of the system parameters in the software rejuvenation model are given in Table 2. All the parameter values are selected from experimental experience, for demonstration purposes.

The change in the unavailability of software applications with different numbers of versions and rejuvenation rates is reported in Table 3 and Fig. 2. The number of versions is varied from simplex to multiplex (n = 5); at the same time, we perform software rejuvenation with rates ranging from 0.5 to infinity (rate = 0 corresponds to no rejuvenation). From the results, the decrease in unavailability from simplex to duplex is significant. We can see that the number of versions strongly influences system reliability: as the number of versions increases, the system unavailability reduces rapidly and approaches a steady value.
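The kind of sweep described here can be reproduced for the single-version model, whose closed-form solution follows from Eq. 7. The sketch below is illustrative only (its outputs are not the numbers reported in Table 3), and it varies the rejuvenation rate:

```python
lam, mu, r = 0.002, 0.005, 1.0   # illustrative rates (assumed, cf. Table 2)

def unavailability(R: float) -> float:
    """Single-version unavailability 1 - P0, from the closed form of Eq. 7:
    P1 = (mu/R) P0, P2 = (lam/r) P0, and P0 + P1 + P2 = 1."""
    p0 = 1.0 / (1.0 + mu / R + lam / r)
    return 1.0 - p0

for R in (0.05, 0.1, 0.5, 1.0, 5.0):
    print(f"rejuvenation rate R = {R:4}: unavailability = {unavailability(R):.6f}")
```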

Conclusion

In this paper, we presented a software rejuvenation structure and set up the software rejuvenation model in one-, two-, and three-dimension state spaces for one application. In the model, the system availability formula is derived from a continuous-time Markov process. The numerical experiment results show that the system unavailability decreases greatly when the number of versions increases.

References

[1] S. Yu, Ch. Qi, and H. Xin, Positive software fault-tolerate technique based on time policy, Communication and Computer 4 (2007), 11–14.

[2] Y. Huang, C. Kintala, N. Koletis, and N. Fulton, Software rejuvenation: analysis, module and application, Symposium on Fault Tolerant Computing (1995), 381–390.

[3] L. Li, K. Vaidyanathan, and K. Trivedi, An approach to estimation of software aging in a web server, International Symposium on Empirical Software Engineering (ISESE) (2002), 125–129.

[4] V. P. Koutras and A. N. Platis, Applying software rejuvenation in a two node cluster system for high availability, International Conference on Dependability of Computer Systems (DEPCOS-RELCOMEX'06), Poland 6 (2006), 175–182.

[5] A. Platis and V. P. Koutras, Optimal rejuvenation policy for increasing VoIP service reliability, European Safety and Reliability (ESREL 2006) Conference (2006), 2285–2290.

[6] K. M. Aung and J. S. Park, A framework for software rejuvenation for survivability, 18th International Conference on Advanced Information Networking and Applications 2 (2004).

[7] K. Vaidyanathan and R. Harper, Analysis and implementation of software rejuvenation in cluster systems, ACM SIGMETRICS Performance Evaluation Review 29 (2001).

[8] K. Park and S. Kim, Availability analysis and improvement of active/standby cluster systems using software rejuvenation, Journal of Systems and Software 61 (2002).

[9] S. Garg, Y. Huang, K. Kintala, and K. Trivedi, Minimizing completion time of a program by checkpointing and rejuvenating, ACM SIGMETRICS Conf., Cambridge, MA (1996), 252–261.

[10] T. Dohi, K. Goseva, and K. Trivedi, Statistical non-parametric algorithms to estimate the optimal software rejuvenation schedule, ACM SIGMETRICS Conf., Cambridge, MA (2000).

[11] K. Vaidyanathan and K. Trivedi, A comprehensive model for software rejuvenation, IEEE Transactions on Dependable and Secure Computing 2 (2005).

[12] K. Trivedi and K. Vaidyanathan, Modelling and analysis of software aging and rejuvenation, 33rd IEEE Annual Simulation Symposium (2000).

[13] W. Xie and Y. Hong, Software rejuvenation policies for cluster systems under varying workload, 10th IEEE Pacific Rim International Symposium on Dependable Computing (PRDC'04) 18 (2004), 163–177.

[14] H. Okamura, S. Miyahara, and T. Dohi, Dependability analysis of a client/server software system with rejuvenation, 13th International Symposium on Software Reliability Engineering (2002).

[15] Y. Liu, K. Trivedi, Y. Ma, and H. Levendel, Modelling and analysis of software rejuvenation in cable modem termination system (2002).

[16] H. Suzuki, K. Trivedi, T. Dohi, and N. Kaio, Modelling and analysis of software rejuvenation in cable modem termination system (2003).

[17] A. Bobbio, M. Sereno, and C. Anglano, Fine grained software degradation models for optimal rejuvenation policies 46 (2001).

[18] T. Thein and J. Park, Availability analysis of application servers using software rejuvenation and virtualization, Journal of Computer Science and Technology 24 (2009), 339–346.

[19] Q. Yong, M. Haining, H. Di, and Ch. Ying, A study on software rejuvenation model of application server cluster in two-dimension state space using Markov process, Information Technology Journal 7 (2008), 98–104.

[20] Z. R. Ghobadi and H. Rashidi, Software rejuvenation technique: an improvement in applications with multiple versions, Computers and Intelligent Systems Journal 2 (2010), 22–26.


A fuzzy neuro-chaotic network for storing and retrieving patterns

Nasrin Shourie
Islamic Azad University, Science and Research Branch
Department of Biomedical Engineering
Tehran, Iran
shourie.n@srbiau.ac.ir

Amir Homayoun Jafari
Tehran University
School of Medicine
Tehran, Iran
amir_h_jafari@aut.ac.ir

Corresponding Author: P. O. Box 1388673111, F: (+98) 21 44524165, T: (+98) 21 44520786

Abstract: In this paper, a fuzzy neuro-chaotic network is proposed for retrieving patterns. The activation function of each neuron is a logistic map with a flexible searching area. The bifurcation parameter and searching area of each neuron are determined depending on its desired output; they are obtained using two separate fuzzy systems. At the beginning of the training process, the desired patterns are stored in fixed points by use of the pseudo-inverse matrix learning algorithm. Then the data required for constructing the fuzzy systems are provided. The fuzzy rule bases are designed by use of the look-up table scheme based on the provided data. In the retrieving process, all neurons are initially set to be chaotic. Each neuron searches its state space completely to find its correct periodic points. When this occurs, the neuron is driven to a periodic state of period 2. In this case, the bifurcation parameter and the searching area of the neuron are determined by the two obtained fuzzy systems. When all neurons are driven to the periodic state, the desired pattern is retrieved. Computer simulations demonstrate the remarkable performance of the proposed model in retrieving noisy patterns.

Keywords: Chaotic neural model, Bifurcation, Fuzzy rules, Pattern retrieving.

Introduction

Chaotic behavior exists in many biological systems, especially in the behavior of biological neurons. The observation of chaotic behavior in biological neurons has persuaded many researchers to consider these properties in artificial neural network models, in order to obtain new computational capabilities. Hence, numerous chaotic neural models with the ability to represent chaotic behavior and perform data processing have been proposed to date.
For example, G. Lee and N.H. Farhat proposed a
chaotic pulse coupled neural network as an associative
memory based on a bifurcation neuron which is mathematically equivalent to the sine circle map [3]. In
another study, a bifurcation neuron was suggested by M. Lysetskiy and J.M. Zurada that is constructed with the third iterate of the logistic map. It uses an external
input which shifts its dynamics from chaos to one of
the stable fixed points [4]. L.Zhao et al. [5] presented

a chaotic neural model for pattern recognition that uses periodic and chaotic dynamics: a periodic dynamic represents a retrieved pattern, while a chaotic dynamic corresponds to the searching process. A. Taherkhani et al. [6] designed a chaotic neural network that can be used for storing and retrieving gray-scale and binary patterns. This model contains chaotic neurons with a logistic map as the activation function and an NDRAM network which is applied as a supervisor model for evaluating the neurons of the model.
In this paper, we try to show the advantage of chaotic behavior in artificial neural networks. Chaotic neurons are able to produce various solutions for a problem. Therefore, we propose a fuzzy neuro-chaotic network which is capable of retrieving patterns. In this model, the activation function of each neuron is a logistic map with a flexible searching area. The parameters of the neurons are obtained using two separate fuzzy systems. In the training process, data are stored in memory us-


ing the pseudo-inverse matrix learning algorithm. Then, the data required for designing the fuzzy systems are provided. The fuzzy rule bases are constructed using the look-up table scheme based on the provided data. In the retrieving process, a noisy pattern is presented to the model as the initial conditions of the neurons. All neurons are initially set to be chaotic. Each neuron starts to search its state space to find its proper periodic points. A neuron that finds its correct periodic points is driven to a periodic state of period 2; its bifurcation parameter and its searching area are then determined by the two obtained fuzzy systems. When all neurons are driven to the periodic state, the stored pattern is retrieved.

Model Description

The proposed model consists of chaotic neurons, where the activation function of each one is a logistic map with a flexible searching area, as described below:

$$x_i(k+1) = b_i(t)\, x_i(k)\left(1 - \frac{x_i(k)}{\gamma_i}\right), \quad i = 1, 2, \ldots, N \qquad (1)$$

where $x_i(k)$ is the output of the $i$-th neuron and $b_i(t)$ is the bifurcation parameter of the $i$-th neuron, which determines its dynamical changes. In Eq. (1), the parameter $\gamma_i$ controls the searching area of the $i$-th neuron. Owing to this parameter, it is possible to determine the searching area of each neuron individually, depending on its appropriate output; therefore, the model is able to retrieve multi-valued content patterns. $N$ is the number of neurons, which equals the number of elements in each pattern vector. The bifurcation diagram of the logistic map with $\gamma = 0.5$ is represented in Figure 1.

The proposed model is divided into three stages: the training stage, the fuzzy systems design stage, and the retrieving stage.
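To illustrate Eq. (1), a single neuron can be iterated for different bifurcation parameters. The sketch below is an illustration only; the symbol $\gamma$ as well as the sample values of $b$ are assumptions:

```python
def neuron_step(x: float, b: float, gamma: float) -> float:
    """One iteration of the flexible logistic map of Eq. (1)."""
    return b * x * (1.0 - x / gamma)

def iterate(x0: float, b: float, gamma: float, n: int = 60) -> list[float]:
    xs = [x0]
    for _ in range(n):
        xs.append(neuron_step(xs[-1], b, gamma))
    return xs

gamma = 1.1                                   # initial searching area [0, 1.1]
print(iterate(0.3, b=3.9, gamma=gamma)[-6:])  # b near 4: chaotic wandering
print(iterate(0.3, b=3.2, gamma=gamma)[-6:])  # b = 3.2: settles into a period-2 orbit
```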

2.1 Training Stage

In the training stage, at first the basic patterns are normalized into [0, 1] and then they are stored in fixed points. It is supposed that the matrix $X = \{x_1, x_2, \ldots, x_M\}$ contains $M$ training patterns, each of which includes $N$ elements. All of the $M$ training patterns are stored in fixed points as [2, 5]:

$$WX = X \qquad (2)$$

where $W$ is the connection matrix. It is obtained as below by using the pseudo-inverse matrix learning algorithm [2, 5]:

$$W = X(X^T X)^{-1} X^T \qquad (3)$$
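The storage rule of Eqs. (2) and (3) can be checked in a few lines. The following sketch uses random stand-in patterns (not the patterns of the paper):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical training set: M = 3 patterns of N = 8 elements, normalized to [0, 1],
# stored as the columns of X (an illustrative stand-in for the paper's patterns)
N, M = 8, 3
X = rng.random((N, M))

# Connection matrix of Eq. (3): W = X (X^T X)^(-1) X^T
W = X @ np.linalg.inv(X.T @ X) @ X.T

# Fixed-point storage property of Eq. (2): W X = X
print(np.allclose(W @ X, X))   # True: every stored pattern is a fixed point of W
```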
Then the data required for constructing the fuzzy systems are provided using the training patterns. The training patterns are noisy versions of the basic patterns, normalized into [0, 1]. Each one of the training patterns is applied to the model separately as its initial conditions. At first, all neurons are set to be chaotic in order to search their state spaces completely to find the correct periodic points. As the maximum value of each element of the training pattern is equal to 1, the initial searching area of each neuron is taken as [0, 1.1] and therefore $\gamma_i(0)$ is set to 1.1. The dynamic of each neuron is determined according to its error, which is defined as below:

$$e_i(t) = \left| x_i(k) - \sum_{j=1}^{N} w_{ij} x_j(k) \right|, \quad i = 1, 2, \ldots, N \qquad (4)$$

where $w_{ij}$ is an element of the connection matrix obtained by Eq. (3) and $x_i(k)$ is the output of the $i$-th neuron. As the outputs of neurons in the periodic state alternate with a period of two, $t = k/2$; thus, $e_i$ of each neuron is evaluated every two time units. The bifurcation parameter of each neuron is obtained as:

$$b_i(t) = \begin{cases} A_p(0) & \text{if } e_i(t) = \left| x_i(k) - \sum_{j=1}^{N} w_{ij} x_j(k) \right| < \varepsilon \\ A_c & \text{otherwise} \end{cases} \qquad (5)$$

where $\varepsilon$ is the threshold of error, $A_c$ is a bifurcation parameter corresponding to the chaotic state, and $A_p(0)$ is an initial bifurcation parameter corresponding to a periodic orbit with period 2. If $e_i(t)$ is greater than the threshold, the $i$-th neuron still remains in the chaotic state. Otherwise, it indicates that the neuron approximates its corresponding periodic point. When this occurs, the neuron is driven to the periodic state with period 2 and its initial bifurcation parameter is set as $b_i(0) = A_p(0)$. In this case, the output of the neuron and its error are stored for constructing the fuzzy systems, and then the initial $\gamma_i(0)$ is calculated using Eq. (1): $b_i(0)$ and the output of the neuron are substituted into Eq. (1), and Eq. (1) is then solved for $\gamma_i$ in such a way that one of the periodic points of the logistic map equals the present output of the neuron.
Then the output of the present neuron is substituted into Eq. (1) with $b_i(0)$ and $\gamma_i(0)$, and the output of the logistic map is calculated two times, giving $x_1$ and $x_2$, where $x_2$ corresponds to the proper periodic point. The sign of the difference between $x_2$ and $x_1$ is stored as one of the fuzzy system inputs, too. Subsequently, the parameters of the neuron are adjusted using the following equations, to minimize the difference between the output of the neuron and its corresponding element in the desired pattern:

$$b_i = b_i + 2\eta_1\, \mathrm{sign}(d_i - o_i)(x_2 - x_1) \qquad (6)$$

$$\gamma_i = \gamma_i + 2\eta_2\, \mathrm{sign}(d_i - o_i) \qquad (7)$$

where $d_i$ and $o_i$ are the corresponding element of the desired pattern and the output of the present neuron, respectively, and $\eta_1$ and $\eta_2$ are the learning rate parameters.

2.2 Design of Fuzzy Systems

The desired pattern is not given in the retrieving stage, and therefore the parameters of the neurons cannot be adjusted using Eqs. (6) and (7) in that stage. Each of the neuron's parameters is instead determined by three variables: $o_i$, $e_i$, and the sign of $(x_2 - x_1)$. The two fuzzy systems are designed separately by use of the data stored in the training stage; therefore, Eqs. (6) and (7) can be replaced with the constructed fuzzy systems. The fuzzy rule bases are designed using the look-up table scheme based on the stored input-output pairs. In order to create the look-up table for each of the fuzzy systems, at first a number of fuzzy sets are generated which cover the input-output spaces completely. Then the membership value of each input-output pair in the corresponding fuzzy sets is determined. The fuzzy sets which have the largest membership values for each input-output variable are detected, and thereby a fuzzy IF-THEN rule is generated. Since there can be conflicting rules with the same IF parts and different THEN parts, a degree is assigned to the generated rules and only the rule with the maximum degree in each conflicting group is maintained. The degree of a rule is determined as the product of the maximum membership values of its input-output variables [1]. Finally, the two fuzzy systems are constructed using the product inference engine, singleton fuzzifier, and center average defuzzifier, based on the obtained rule bases.

Figure 1: Bifurcation diagram of the logistic map with $\gamma = 0.5$ and the searching area within [0, 0.5].
By using the updated $b_i$ and $\gamma_i$, the neuron's output is calculated