
Real Time Information Dissemination and
Management in Peer-to-Peer Networks


A Thesis
Submitted in Fulfillment of the Requirements for the Degree of

Doctor of Philosophy
By

Shashi Bhushan
Reg. No.: 2K06NITK-PhD-1095

Under the supervision of

Dr. Mayank Dave
Associate Professor
Dept. of Computer Engineering
NIT, Kurukshetra

Dr. R.B. Patel
Associate Professor
Dept. of Computer Science & Engineering
G. B. Pant Engineering College, Pauri Garhwal (Uttarakhand)

Department of Computer Engineering


National Institute of Technology
Kurukshetra-136119, India
May 2013

Candidate's Declaration
I hereby certify that the work presented in the thesis entitled "Real Time Information Dissemination and Management in Peer-to-Peer Networks", in fulfillment of the requirements for the award of the degree of Doctor of Philosophy in Computer Engineering and submitted in the Department of Computer Engineering of National Institute of Technology, Kurukshetra, Haryana, India, is an authentic record of my own work, carried out during the period from May 2006 to April 2013 under the supervision of Dr. Mayank Dave and Dr. R. B. Patel.
The matter presented in this thesis has not been submitted by me for the award of any other degree of this Institute or any other Institute/University.

Shashi Bhushan

This is to certify that the above statement made by the candidate is correct to the best of
our knowledge.

Date:

Dr. Mayank Dave
Associate Professor
Dept. of Computer Engineering
NIT, Kurukshetra

Dr. R.B. Patel
Associate Professor
Dept. of Computer Science & Engineering
G. B. Pant Engineering College, Pauri Garhwal (Uttarakhand)

Acknowledgement
Success in life is never attained single-handed. First and foremost, I would like to express my sincere gratitude to my supervisor, Dr. Mayank Dave, for his continuous support, encouragement and enthusiasm. I thank him for all the energy and time he has spent on me, discussing everything from research to career choices, reading my papers and guiding my research through the obstacles and setbacks. His professional yet caring approach towards people, his way of working and his passion for living life to the fullest have truly inspired me.
It is extremely difficult for me to express in words my gratitude towards my co-supervisor, Dr. R. B. Patel, who stood by me throughout my research work and guided me not only towards becoming an able researcher but also a good human being. His constant motivation made me believe in myself during this research work. Without his persuasion and interest, it would not have been possible for me to gain the confidence that I have today.

My sincere thanks go to Dr. J. K. Chhabra, Head, Department of Computer Engineering, for his insightful comments and administrative help on various occasions. His hard-working attitude and high expectations of research have inspired me to mature into a better researcher. I would also like to thank my DRC members, Dr. A. Swarup, Dr. A. K. Singh and Dr. S. K. Jain, for their stimulating questions and valuable feedback. I also owe my thanks to the faculty members of the department for their valuable feedback.
I would be nowhere in life if I had not grown up in the most wonderful family one can imagine. I want to thank my parents and brother for their love and for giving me all the happiness and opportunities that most people can only dream of.
I am grateful to my better half, Dr. Anjoo Kamboj, for her immense patience, constant encouragement and support. She was always with me in my difficult times and encouraged me whenever I was down with frustration. Words are not sufficient to express my deepest love for my loving kids, Ashu and Abhi, for their cooperation and for sacrificing the childhood time they might otherwise have enjoyed with their father. They always pray to God to make me successful in my work.


I take this opportunity to express my gratitude to all the dignitaries who have been involved, directly or indirectly, in the successful completion of this work.
Last but not least, I thank God, the Almighty, for giving me the strength, will and wisdom to carry out my work successfully. You have made my life more ample. May your name be exalted, honored and glorified.

Shashi Bhushan


Abstract
In P2P networks, peers are rich in computing resources/services, viz., data files, cache storage, disk space, processing cycles, etc. These peers collectively generate a huge amount of resources and collaboratively perform computing tasks using these available resources. Peers can serve as both clients and servers, eliminating the need for a centralized node. A major drawback of P2P systems is that resources or nodes are only temporarily available: a network element may disappear from the network at a given time and reappear at another locality of the network in an unpredictable pattern.

Under these circumstances, one of the most challenging problems is how to place and access real-time information over the network, because the resources should always be successfully located by the requesters whenever needed, within some bounded delay. This requires management of information under time constraints and under the dynamism of the peers. Multiple challenges must be addressed to implement a Real Time Distributed Database System (RTDDBS) over dynamic P2P networks. In order to enable resource awareness in such a large-scale dynamic distributed environment, a specific management system is required, one that takes into account the following P2P characteristics: reduction in redundant network traffic, data distribution, load balancing, fault tolerance, replica placement/updating/assessment, data consistency, concurrency control, design and maintenance of a logical structure for replicas, etc. In this thesis, we have developed a solution for resource management that supports fault-tolerant operations, shortest path lengths for requested resources, low overhead in network management operations, well-balanced load distribution between the peers and a high probability of successful access from the defined quorums.
In this thesis, we have proposed a self-managed, fault-adaptive and load-adaptive middleware architecture called Statistics Manager and Action Planner (SMAP) for implementing a Real Time Distributed Database System (RTDDBS) over P2P networks. Various algorithms are also proposed to enhance the performance of different modules of SMAP. A Matrix Assisted Technique (MAT) is proposed to partition the database for implementing the RTDDBS. This approach also provides primary security to the database over unreliable peers and easy access to the information over P2P systems. A 3-Tier Execution Model (3-TEM) that integrates MAT for parallel execution is also proposed. 3-TEM enhances the throughput of the P2P system and balances the load among participating peers. A Timestamp based Secure Concurrency Control Algorithm (TSC2A) is also developed, which handles the issues of concurrent execution of transactions in the dynamic environment of P2P networks. This algorithm is capable of providing security to both arriving transactions and data items. An approach called Common Junction Methodology (CJM) is proposed to reduce redundant traffic and improve response time in P2P networks through common junctions in the paths. The quorum acquisition time is reduced through a novel fault-adaptive algorithm called the Logical Adaptive Replica Placement Algorithm (LARPA), which implements a logical structure for dynamic environments. The algorithm efficiently distributes replicas to sites at one-hop distance to improve data availability in an RTDDBS over a P2P system. A self-organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is proposed for improving hierarchical quorums over P2P systems; it improves data availability through a logical arrangement of replicas. We finally conclude and compare the proposed middleware with some existing schemes.


Table of Contents
Candidate's Declaration .......... ii
Acknowledgement .......... iii
Abstract .......... v
Table of Contents .......... vii
List of Figures .......... xii
List of Tables .......... xv
List of Abbreviations .......... xvi
Chapter 1: Introduction .......... 1-10
1.1 What is a Peer-to-Peer Network? .......... 1
1.2 Why P2P Networks? .......... 2
1.3 Applications of P2P Systems .......... 3
1.4 Motivation .......... 3
1.5 Issues in P2P Systems .......... 5
1.6 Research Problem .......... 5
1.7 Work Carried Out .......... 6
1.8 Organization of Thesis .......... 9
1.9 Summary .......... 10
Chapter 2: Literature Review .......... 11-55
2.1 Peer-to-Peer (P2P) Networks .......... 11
2.2 Types of P2P Networks .......... 13
  2.2.1 Structured P2P Networks .......... 13
  2.2.2 Unstructured P2P Networks .......... 14
2.3 File Sharing System .......... 15
2.4 Underlay and Overlay P2P Networks .......... 18
2.5 Challenges in P2P Systems .......... 20
  2.5.1 Challenges in P2P Networks .......... 20
  2.5.2 Challenges for Databases in P2P Networks .......... 25
2.6 Parallelism in Databases .......... 27
  2.6.1 Partitioning Methods .......... 28
2.7 Concurrency Control .......... 30
2.8 Topology Mismatch Problem .......... 31
2.9 Replication for Availability .......... 32
2.10 Quorum Consensus .......... 33
2.11 Databases .......... 36
  2.11.1 Real Time Applications Framework .......... 38
2.12 Some Middlewares .......... 39
2.13 Analysis .......... 53
2.14 Summary .......... 54
Chapter 3: Statistics Manager and Action Planner (SMAP) for P2P Networks .......... 56-66
3.1 Introduction .......... 56
3.2 System Architecture .......... 57
  3.2.1 Interface Layer (IL) .......... 58
  3.2.2 Data Layer (DL) .......... 60
  3.2.3 Replication Layer (RL) .......... 62
  3.2.4 Network Layer (NL) .......... 63
  3.2.5 Control Layer (CL) .......... 64
3.3 Advantages of SMAP .......... 64
3.4 Discussion .......... 65
3.5 Summary .......... 65
Chapter 4: Load Adaptive Data Distribution over P2P Networks .......... 67-97
4.1 Introduction .......... 68
4.2 System Model .......... 69
4.3 3-Tier Execution Model (3-TEM) .......... 70
  4.3.1 Transaction Coordinator (TC) .......... 71
  4.3.2 Transaction Processing Peer (TPP) .......... 73
  4.3.3 Result Coordinator (RC) .......... 74
  4.3.4 Working of 3-TEM .......... 75
4.4 Load Balancing .......... 76
4.5 Database Partitioning .......... 76
  4.5.1 Matrix Assisted Technique (MAT) .......... 77
  4.5.2 Database Partitioning .......... 79
  4.5.3 Algorithm to Access the Partitioned Database .......... 80
  4.5.4 Peer Selection Criterion .......... 83
4.6 Simulation and Performance Study .......... 84
  4.6.1 Assumptions .......... 84
  4.6.2 Simulation Model .......... 84
  4.6.3 Performance Metrics .......... 87
  4.6.4 Simulation Results .......... 89
4.7 Advantages of 3-TEM .......... 94
4.8 Discussion .......... 94
4.9 Summary .......... 95
Chapter 5: Concurrency Control in Distributed Databases over P2P Networks .......... 96-108
5.1 Introduction .......... 96
5.2 System Model .......... 97
5.3 Transaction Model .......... 98
5.4 Serializability of Transactions .......... 99
5.5 A Timestamp based Secure Concurrency Control Algorithm (TSC2A) .......... 100
  5.5.1 Algorithm for Write Operation .......... 100
  5.5.2 Algorithm for Read Operation .......... 101
5.6 Simulation and Performance Study .......... 102
  5.6.1 Performance Metrics .......... 102
  5.6.2 Assumptions .......... 102
  5.6.3 Simulation Results .......... 103
5.7 Discussion .......... 107
5.8 Summary .......... 108
Chapter 6: Topology Adaptive Traffic Controller for P2P Networks .......... 109-127
6.1 Introduction .......... 109
6.2 System Model .......... 112
6.3 System Architecture .......... 113
6.4 Common Junction Methodology (CJM) .......... 114
  6.4.1 Common Junction Methodology Algorithm .......... 114
  6.4.2 System Analysis .......... 116
6.5 Simulation and Performance Study .......... 118
  6.5.1 Simulation Model .......... 119
  6.5.2 Performance Metrics .......... 120
  6.5.3 Simulation Results .......... 122
6.6 Advantages in using CJM .......... 126
6.7 Discussion .......... 126
6.8 Summary .......... 127
Chapter 7: Fault Adaptive Replica Placement over P2P Networks .......... 128-147
7.1 Introduction .......... 128
7.2 System Model .......... 130
7.3 Logical Adaptive Replica Placement Algorithm (LARPA) .......... 131
  7.3.1 LARPA Topology .......... 131
  7.3.2 Identification of Number of Replicas in the System .......... 132
  7.3.3 LARPA Peer Selection Criterion .......... 133
  7.3.4 Algorithm 1: Selection of Best Suited Peers .......... 133
  7.3.5 Algorithm 2: Selection of Suitable Peers with Minimum Distance .......... 134
7.4 Implementation .......... 136
  7.4.1 Replica Leaving from the System .......... 138
  7.4.2 Replica Joining to the System .......... 138
7.5 Simulation and Performance Study .......... 139
  7.5.1 Performance Metrics .......... 139
  7.5.2 Simulation Results .......... 140
7.6 Discussion .......... 145
7.7 Summary .......... 146
Chapter 8: Height Balanced Fault Adaptive Reshuffle Logical Structure for P2P Networks .......... 148-167
8.1 Introduction .......... 148
8.2 System Model .......... 150
8.3 System Architecture .......... 151
8.4 Height Balanced Fault Adaptive Reshuffle (HBFAR) Scheme .......... 153
  8.4.1 Rule Set-I: Rules for Generation of Height Balanced Fault Adaptive Reshuffle (HBFAR) Structure .......... 156
  8.4.2 Rule Set-II: Rules for Replica Leaving from HBFAR .......... 157
  8.4.3 Rule Set-III: Rules for Replica Joining into the Replica Logical Structure .......... 157
  8.4.4 Rule Set-IV: Rules for Acquisition of Read/Write Quorum from HBFAR Logical Tree .......... 158
  8.4.5 Correctness Proof of the Algorithm .......... 160
8.5 Simulation and Performance Study .......... 162
  8.5.1 Performance Metrics .......... 163
  8.5.2 Simulation Results .......... 163
8.6 Discussion .......... 166
8.7 Summary .......... 167
Chapter 9: Conclusion and Future Work .......... 168-173
9.1 Contributions .......... 169
9.2 Future Scope .......... 172
List of Publications .......... 174-175
Bibliography .......... 176-193


List of Figures
2.1 The Basic Architecture of P2P Network .......... 12
2.2 The Basic Client/Server Architecture .......... 12
2.3 Distributed Hash Table (DHT) .......... 14
2.4 Information Retrieval from Hybrid P2P Based System .......... 15
2.5 Classifications of P2P System Networks .......... 16
2.6 Typical Overlay Network .......... 19
2.7 The Architecture of Napster .......... 40
2.8 The Architecture of Gnutella .......... 42
2.9 The Freenet chain mode file discovery mechanism. The query is forwarded from node to node using the routing table, until it reaches the node which has the requested data. The reply is passed back to the original node following the reverse path .......... 43
2.10 The path taken by a message originating from node 67493 destined for node 34567 in a Plaxton mesh using decimal digits of length 5 in Tapestry .......... 46
2.11 Chord identifier circle consisting of the three nodes 0, 1 and 3. In this figure, key 1 is located at node 1, key 2 at node 3 and key 6 at node 0 .......... 48
2.12 (a) Example 2-d [0,1]x[0,1] coordinate space partitioned between 5 CAN nodes. (b) Example 2-d space after node F joins .......... 49
2.13 JXTA Architecture .......... 50
2.14 APPA Architecture .......... 51
3.1 Architecture of Statistics Manager and Action Planner (SMAP) .......... 59
4.1 3-Tier Execution Model (3-TEM) for P2P Systems .......... 70
4.2 System Architecture of 3-Tier Execution Model (3-TEM) .......... 73
4.3 Logical View of Database Partitioning with dfr = 10, dfc = 3 .......... 78
4.4 Simulation Model for 3-TEM .......... 86
4.5 Relationship between Peer Availability vs. Partitions Availability .......... 90
4.6 Relationship between Throughput vs. Mean Transaction Arrival Rate .......... 91
4.7 Relationship between Numbers of Partitions vs. Response Time .......... 91
4.8 Relationship between Mean Transaction Arrival Rate vs. Query Completion Ratio .......... 92
4.9 Relationship between Mean Transaction Arrival Rate vs. Miss Ratio .......... 93
4.10 Relationship between Mean Transaction Arrival Rate vs. Restart Ratio .......... 93
4.11 Relationship between Mean Transaction Arrival Rate vs. Abort Ratio .......... 94
5.1 Comparison between Miss Ratio of Transactions and Mean Transaction Arrival Rate (MTAR) .......... 104
5.2 Comparison between Transaction Restart Ratio and MTAR .......... 105
5.3 Comparison between Transaction Success Ratio and MTAR .......... 106
5.4 Comparison between Transaction Abort Ratio and MTAR .......... 106
5.5 Comparison between Throughput and MTAR .......... 107
6.1 Overlay and Underlay Networks Setup .......... 111
6.2 3-Layer Traffic Management System (3-LTMS) for Overlay Networks .......... 113
6.3 Network Simulation Model for P2P Networks .......... 119
6.4 Average Number of Partitions vs. Underlay Cardinality .......... 122
6.5 Average Path Lengths for Maximum Reachability vs. Underlay Cardinality .......... 123
6.6 Average Path Cost vs. Overlay Cardinality .......... 124
6.7 Average Path Cost vs. Underlay Cardinality .......... 124
6.8 Average Response Time vs. Overlay Hop Count .......... 125
6.9 Average Percentage Reduction in Path Cost vs. Overlay Path (Hop Count) .......... 125
6.10 Average Percentage Reduction in Response Time vs. Overlay Hop Count .......... 126
7.1 Peer Selection and Logical Connection for LARPA Structure .......... 136
7.2 LARPA obtains Logical Structure from the Network shown in Figure 7.1 .......... 136
7.3 LARPA Structure Representing the Replica p14 departing the Network .......... 138
7.4 LARPA Structure Representing the Replica p5 from the Centre departing the Network .......... 138
7.5 Relationship between session time and availability of a peer in P2P Networks .......... 140
7.6 Variations in response time with quorum size .......... 141
7.7 Variations in restart ratio with system workload .......... 142
7.8 Relationship of transaction success ratio with system workload .......... 142
7.9 Variation in throughput with system workload .......... 143
7.10 Relationship between average search time and quorum size .......... 143
7.11 Variation in network traffic with quorum size .......... 144
7.12 Probability to Access Updated Data vs. Peer Availability .......... 144
7.13 Response Time Comparison between LARPA1 and LARPA2 .......... 145
7.14 Message Overhead Comparison between LARPA1 and LARPA2 .......... 145
8.1 7-Layers Transaction Management System (7-LTMS) .......... 152
8.2 The arrangement of peers to make a Height Balanced Fault Adaptive Reshuffle Tree over the peers from the underlay topology of P2P networks. The dotted connectors show the connections between peers in the overlay topology; the dark connectors show the connections between peers in the replica tree topology. P14 is shown as an isolated peer in the network .......... 155
8.3 Replica arrangements in the HBFAR Scheme generated from Figure 8.2. The session time of P1 is greater than that of P2 and P3. The order of the replicas by session time in the HBFAR Scheme is P1, P2, P3, P4, P5, P6, P7, and P8 .......... 155
8.4 Replica arrangements in a HBFAR logical structure. Peer 2, shown by dotted lines, is a peer leaving the network .......... 158
8.5 The HBFAR structure after Peer 2 leaves. Peer 4 takes the position of Peer 2, which has already left the network. All other replicas in the downlink are readjusted accordingly .......... 158
8.6 Reachability of peers under availability in the network .......... 164
8.7 Comparison in accessing stale data under availability of peers .......... 164
8.8 Comparison of average search time to form the quorum from the networks .......... 165
8.9 Comparison of average response time .......... 165
8.10 Comparison of average message transfer to maintain the system .......... 166

List of Tables
2.1	A Comparison of Various P2P Middlewares ................................................55
4.1	Performance Metrics-I ..................................................................................87
4.2	Performance Metrics-II .................................................................................88
4.3	Performance Parameters Setup .....................................................................89
7.1	Effect of Peer Availability on Data Availability in the System ..................132
7.2	Performance Metrics-III ..............................................................................140
9.1	Comparison of a Few Existing Systems with SMAP ..................................173

List of Abbreviations
1-TEM	1-Tier Execution Model
3-LTMS	3-Layer Traffic Management System
3-TEM	3-Tier Execution Model
7-LTMS	7-Layers Transaction Management System
AM	Authenticity Manager
APC	Average Path Cost
APL	Average Path Length
ART	Average Response Time
CCM	Concurrency Control Manager
CJM	Common Junction Methodology
CL	Control Layer
CPU	Central Processing Unit
DA	Data Administrator
DAT	Data Access Tracker
DBA	Database Administrator
DBMS	Database Management System
DCE	Distributed Computing Environment
DD	Data Distributor
DL	Data Layer
DM	Data Manager
DS	Data Scheduler
DSS	Data Storage Space
GCM	Group Communication Manager
HBFAR	Height Balanced Fault Adaptive Reshuffle
HQC	Hierarchical Quorum Consensus
IL	Interface Layer
LA	Load Analyzer
LARPA	Logical Adaptive Replica Placement Algorithm
LD	Local Database
MAT	Matrix Assisted Technique
MTAR	Mean Transaction Arrival Rate
NCM	Network Connection Manager
NL	Network Layer
NM	Network Manager
P2P	Peer-to-Peer
PAL	Peer Allocator
PA	Peer Analyzer
PC	Path Cost
PCS	Path Cost Saved
PL	Path Length
PPQ	Participating Peer Queue
QEE	Query Execution Engine
QI	Query Interface
QM	Quorum Manager
QO	Query Optimizer
QP	Quorum Processor
RA	Resource Allocator
RC	Result Coordinator
RDA	Result Data Administrator
RL	Replication Layer
RSM	Result Manager
RM	Resource Manager
ROM	Replica Overlay Manager
ROWA	Read One Write All
RP	Result Pool
RPB	Resource Publisher
RSM	Replica Search Manager
RT	Response Time
RTDB	Real Time Database
RTDBS	Real Time Database System
RTDDBS	Real Time Distributed Database System
RTM	Replica Topology Manager
RTR	Response Time Reduction
SC	Security Checker
SI	Sub Transaction Interface
SM	Security Manager
SMAP	Statistics Manager and Action Planner
SQSM	Subquery Schedule Manager
SRTDDBS	Secure Real Time Distributed Database System
SS	Schema Scheduler
SSM	Sub Transaction Manager
TAR	Transaction Abort Ratio
TC	Transaction Coordinator
TI	Transaction Interface
TLO	Traffic Load Optimizer
TM	Transaction Manager
TMR	Transaction Miss Ratio
TPP	Transaction Processing Peer
TRR	Transaction Restart Ratio
TSC2A	Timestamp based Secure Concurrency Control Algorithm
TSR	Transaction Success Ratio
TTL	Time to Live
UM	Update Manager

Chapter 1

Introduction
Peer-to-Peer (P2P) networks were developed in the early 1990s and were mostly used
in-house by companies and for limited applications of sharing information between
cooperating researchers. When the Internet began to explode in the mid 1990s, a new
wave of ordinary people began to use it to exchange email, access web pages, and buy
things, which was much different from the initial usage. As intelligent systems become
more pervasive and homes become better connected, a new generation of applications
is being deployed over the Internet [1]. In this scenario, P2P applications become very
attractive because they improve scalability and enhance performance by enabling
direct and real time communication among peers.
The rest of the chapter is organized as follows. P2P networks are introduced in
Section 1.1. Objectives of P2P networks are presented in Section 1.2. Applications of
P2P systems are given in Section 1.3. Section 1.4 discusses the motivation behind this
research. Section 1.5 presents issues in P2P systems. Section 1.6 states the research
problem, and Section 1.7 presents the contributions of this thesis. The organization of
the thesis is given in Section 1.8. Finally, the chapter is summarized in Section 1.9.

1.1 What is a Peer-to-Peer Network?


Peer-to-Peer (P2P) systems provide an environment in which peers (nodes)
collaboratively perform computing tasks and share resources. A P2P system links the
resources of all participating peers in the network and allows those resources to be
shared in a manner that eliminates the need for a central host. Peers can serve as both
clients and servers. P2P systems may also be referred to as P2P networks. They are
computer networks or systems in which peers of equal roles and responsibilities,
often with varying capabilities, exchange information or share resources directly with
each other. Such systems may function without any central administration or
coordination instance. A P2P network differs from conventional client/server or
multitier server networks by allowing direct communication between peers.
P2P architecture enables true distributed computing and creates a network of
computing resources. It allows systems to form temporary associations with each
other for short periods of time and then separate. Moreover, peers are autonomous in
the sense that they can: (i) join the system at any time, (ii) leave without any prior
warning, and (iii) take routing decisions locally in an ad hoc manner [2].
More precisely, a P2P network can be defined as a distributed system consisting
of interconnected nodes that are able to self organize into network topologies with the
purpose of sharing resources such as content, CPU cycles, storage and bandwidth,
capable of adapting to failures and accommodating transient populations of peers
while maintaining acceptable connectivity and performance, without requiring the
intermediation or support of a global centralized server or authority [3].

1.2 Why P2P Networks?


In contrast to the conventional client/server model, P2P systems are characterized by
symmetric roles among peers: every peer in the network acts alike, and processing and
communication are widely distributed among the peers. Unlike conventional
centralized systems, P2P systems offer scalability [4] and fault tolerance [5, 6], and
they are a feasible approach to implementing global scale systems such as the Grid [6].
An important goal in P2P networks is that all clients provide resources, including
bandwidth, storage space, and computing power. Thus, as peers arrive and demand on
the system increases, the total capacity of the system also increases. This is not true
for a traditional client/server architecture with a fixed set of servers, in which adding
more clients can mean slower data transfer for all users. The distributed nature of
P2P networks also increases robustness in case of failures by replicating data over
multiple peers, and, in pure P2P systems, by enabling peers to find the data without
relying on a centralized index server [7]. In the latter case, there is no single point of
failure in the system.

1.3 Applications of P2P Systems


A growing application of P2P technology is harnessing the dormant processing
power of desktop PCs [8]. Because P2P is designed to be completely decentralized
and self organized, the concept paves the way for new types of applications, such as
file swapping applications and collaboration tools over the Internet, that have
attracted tremendous user interest. Using software like Kazaa [9], Gnutella [10, 11]
or the now defunct Napster [12], users access files on other peers and download these
files to their computers. These file swapping communities are commonly used for
sharing media files such as MP3 music files. Kazaa and Gnutella based networks
allowed users to continue to share music files at a rate similar to Napster at its peak.
P2P networks became popular with the development, popularity, and attention given
to Napster [8, 13].
Another application domain of P2P networks is the sharing and aggregation of
large scale geographically distributed processing and storage capacities of idle
computers around the globe to form a virtual supercomputer as the SETI@Home
project did [14]. The P2P technology also allows for peripheral sharing, in which one
peer can access scanners, printers, microphones and other devices that are connected
to another peer.
Medical consultation, agricultural consultation and awareness programmes may be
provided to people in rural areas using P2P technology, which may play a great role
in making India a developed country by 2020 (a vision of Dr. Abdul Kalam, former
President of India). P2P systems may be used to share and exchange information,
which may help provide education. P2P systems may also be used to implement
Enterprise Resource Planning (ERP) systems, which require huge amounts of data
for processing; for such systems P2P may be a cheaper and good option. Any
bus/railway/airways information system may also be implemented over P2P
networks. These are a few P2P applications that may be useful to the community.

1.4 Motivation
In the traditional client/server model, one powerful machine acts as the server, i.e.,
the service provider, and all other attached machines are clients, the service
consumers. But over the last two decades this model has faced new challenges due to
increased demands in computing and data sharing. Capacity enhancement in the
client/server model is very expensive due to the requirement of dedicated, expensive
and powerful hardware. Other challenges are single point of failure, scalability, load
balancing, and bandwidth congestion near the server.
Evolution in computer and communication technologies has also played an
important role in raising expectations. Distributed collaborative applications are
becoming common as a result of research and development in distributed systems;
examples include grid, P2P, cloud and mobile computing. Developments in
communication technologies (3G/4G), the availability of Internet bandwidth at
affordable rates, and better connectivity have also contributed to applications of
distributed system technologies.
The number of computing devices in homes is increasing rapidly with the growth of
technology and the availability of compact computing devices, e.g., laptops and
smartphones apart from PCs. In this era, available hardware is fast, efficient, reliable
and has large storage. This combination of better hardware and connectivity provides
a favorable environment for distributed applications. However, the utilization of
these resources is still limited: most of the time these powerful computing devices
are idle, and huge amounts of computational power and storage remain underutilized
or wasted. A question therefore arises: can we combine and utilize these
underutilized but distributed resources for useful work? The answer is yes, through
available distributed technologies. P2P systems provide methods to combine these
geographically dispersed and otherwise wasted resources.
P2P systems are gaining popularity in various application domains, e.g.,
communication and collaboration, distributed computation, Internet service support,
database systems, and content distribution. The second generation Wiki is an
example of such an application: it works over a P2P network and supports users in
the elaboration and maintenance of shared documents in a collaborative and
asynchronous manner.
Motivated by these challenges, this thesis aims to utilize the geographically
distributed resources freely available on the Internet and provide a real time
distributed database management system by placing data over the resources of P2P
systems.

1.5 Issues in P2P Systems


P2P systems are usually large scale dynamic systems in which nodes are distributed
over a wide geographic area, and resources or peers are only temporarily available. A
network element can disappear from the network at a given time and reappear at
another locality with an unpredictable pattern. Under these circumstances, one of the
most challenging problems in P2P systems is to manage the dynamic, distributed
network so that resources can always be located successfully by their requesters
when needed. Another important issue is partitioning of data to improve data
availability and to provide primary security; the distribution of data over various
peers is a difficult task, and secure distribution of data is a further issue [15, 16].
Participating peers in P2P systems may join or leave the network with or without
informing other peers. To implement databases over P2P systems, issues related both
to P2P networks and to database systems must be addressed. P2P related issues
include churn rate, session time, P2P network traffic, overlay and underlay
topologies, and topology mismatch problems. Database related issues include data
availability, replication handling, concurrency control and security, and accessing
updated data in a dynamic environment.

1.6 Research Problem


There are multiple challenges to be addressed in implementing a Real Time
Distributed Database System (RTDDBS) over dynamic P2P networks. To enable
resource awareness in such a large scale, dynamic, distributed environment, a
specific management system is required that takes into account the following P2P
characteristics: reduction of redundant network traffic, data distribution, load
balancing, fault tolerance, replica placement, update and assessment, data
consistency, concurrency control, design and maintenance of a logical structure for
replicas, and control of the network traffic of overlay and underlay networks.

In this thesis, we seek a self organized system that addresses some of the above
mentioned issues. We therefore develop a solution for resource management that
supports fault tolerant operation, short path lengths for requested resources, low
overhead in network management operations, well balanced load distribution
between peers, and a high probability of successful access from the defined quorums.
The developed system must be decentralized in nature so as to manage P2P
applications and system resources in an integrated way; it monitors the behavior of
P2P applications transparently, obtains accurate resource projections, manages the
connections between peers, and distributes objects (data items/replicas) in response
to user requests under dynamic processing and networking conditions. The system
should also place and disseminate dynamic data intelligently at appropriate peers. To
achieve the desired data availability, data must be replicated over a group of suitable
peers, and the system should maintain consistency among the replicas. It should be
fault tolerant, capable of managing the load at every peer, adaptable to peers joining
and leaving the network, and able to address the database related issues.

1.7 Work Carried Out


To address some of the above issues we have designed the Statistics Manager and
Action Planner (SMAP) system for P2P networks. It is a five layer system, and
various algorithms are proposed to enhance the performance of its layers. The major
contributions of this research work are as follows:

1. SMAP enables fast and cost efficient deployment of information over the P2P
network. It is a self managed P2P system capable of dealing with a high churn
rate of peers in the network. SMAP is fault adaptive and provides load balancing
among participating peers. It permits a true distributed computing environment in
which every peer can use the resources of all other peers participating in the
network. It provides data availability by managing replicas in an efficient logical
structure, and fast response times for transactions with time constraints. It
reduces redundant traffic in P2P networks by shortening the conventional overlay
path, and addresses most of the implementation issues of P2P networks for
RTDDBS.

2. A 3-Tier Execution Model (3-TEM) is developed to enhance the execution
performance of the system. A Matrix Assisted Technique (MAT) is developed to
partition a real time database for P2P networks and is integrated into 3-TEM. It
provides a mechanism to store partitions and access dynamic data over P2P
networks in a dynamic environment. MAT also addresses the primary security
concern for the stored data while improving data availability in the system.
3-TEM splits its functioning into three parts, i.e., the Transaction Coordinator
(TC), Transaction Processing Peers (TPP) and the Result Coordinator (RC),
which are designed to operate in parallel to improve the throughput of the
system. The TC receives and manages the execution of transactions arriving in
the system; it resolves a transaction mapped to the global schema into
subtransactions mapped to the local schemas available at the TPPs. The TPPs
receive subtransactions from the coordinator, execute them in serializable form
and submit partial results to the RC, which compiles the partial results, prepares
them according to the global schema and finally delivers them to the user.
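The flow through the three tiers can be sketched as follows. This is only an illustrative pipeline, not the thesis implementation: the relation names, the schema map, and the string results are all made up, and the real partitioning is done by MAT (Chapter 4).

```python
# Minimal sketch of the 3-TEM pipeline (illustrative names throughout).

def transaction_coordinator(txn, schema_map):
    """TC: split a global transaction into subtransactions per hosting peer."""
    subs = {}
    for rel, action in txn:                 # each operation names a relation
        peer = schema_map[rel]              # local schema lives at this TPP
        subs.setdefault(peer, []).append((rel, action))
    return subs

def processing_peer(ops):
    """TPP: execute its subtransaction serially and return partial results."""
    return [f"done:{rel}:{act}" for rel, act in ops]

def result_coordinator(partials):
    """RC: merge partial results back into a single global answer."""
    return [r for part in partials for r in part]

schema_map = {"orders": "TPP1", "stock": "TPP2"}     # assumed placement
txn = [("orders", "read"), ("stock", "write"), ("orders", "write")]
subs = transaction_coordinator(txn, schema_map)
result = result_coordinator(processing_peer(ops) for ops in subs.values())
print(result)   # ['done:orders:read', 'done:orders:write', 'done:stock:write']
```

In the real system the three roles run on different peers in parallel; here they are sequential functions only to show the data flow.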
3. A Timestamp based Secure Concurrency Control Algorithm (TSC2A) is
developed to handle the issues of concurrent execution of transactions in the
dynamic environment of a P2P network. It maintains the security of data and
time bounded transactions along with controlled concurrency. TSC2A uses
timestamps to resolve conflicts that arise in the system, and uses three security
levels to secure the execution of transactions, which also avoids the covert
channel problem. TSC2A provides serializability in the execution of transactions
at the global as well as the local level. It is implemented in the Data Layer of
SMAP.
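TSC2A itself is presented in Chapter 5; as a rough illustration of the underlying timestamp idea only (the security levels and real time deadlines are omitted here), classical timestamp ordering resolves read/write conflicts by comparing a transaction's timestamp with the read/write timestamps recorded on each data item:

```python
# Sketch of basic timestamp-ordering checks, NOT the TSC2A algorithm.

class DataItem:
    def __init__(self):
        self.read_ts = 0    # largest timestamp that has read this item
        self.write_ts = 0   # largest timestamp that has written this item

def try_read(txn_ts, item):
    """Reject a read arriving 'too late': a younger txn already wrote."""
    if txn_ts < item.write_ts:
        return False                       # abort/restart the transaction
    item.read_ts = max(item.read_ts, txn_ts)
    return True

def try_write(txn_ts, item):
    """Reject a write that a younger txn has already read past or overwritten."""
    if txn_ts < item.read_ts or txn_ts < item.write_ts:
        return False
    item.write_ts = txn_ts
    return True

x = DataItem()
assert try_write(5, x)       # T5 writes x
assert not try_read(3, x)    # older T3 arrives after T5's write: restart T3
assert try_read(7, x)        # younger T7 may read
assert not try_write(6, x)   # T6 conflicts with T7's read: restart T6
```

Restarted transactions are reissued with a fresh (larger) timestamp, which is what guarantees a serializable order equivalent to timestamp order.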

4. A Common Junction Methodology (CJM) reduces the redundant traffic generated
by the topology mismatch problem in P2P networks. CJM finds its own route to
transfer messages from one peer to another. Messages are usually forwarded
from one peer to another in the overlay topology, and a message traverses a
multihop distance in the underlay to realize each overlay hop. These underlay
multihop paths may intersect, and such an intersection point, referred to as a
Common Junction, is used to reroute messages. CJM reduces traffic in the
underlay network without affecting the search scope in the P2P network, and
supports fast response times by reducing the path length at the overlay level.
Thus, the cost of transferring a unit of data from one peer to another is also
reduced. The correctness of CJM is analyzed through a mathematical model as
well as through simulation. It is implemented in the Network Layer of SMAP.
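The core idea can be illustrated with a minimal sketch: intersect the underlay hop sequences of two consecutive overlay links and cut the route at the shared hop. The router identifiers and paths below are made up, and the actual junction identification and rerouting procedure appears in Chapter 6.

```python
def common_junction(path_ab, path_bc):
    """Given underlay hop lists realizing overlay links A->B and B->C,
    find the first hop of A->B that also lies on B->C.  Rerouting at that
    junction skips the detour through overlay peer B."""
    on_bc = set(path_bc)
    for i, hop in enumerate(path_ab):
        if hop in on_bc:
            j = path_bc.index(hop)
            # shortened route: A..junction followed by junction..C
            return path_ab[:i + 1] + path_bc[j + 1:]
    return path_ab + path_bc[1:]   # no junction: keep the full route

# Hypothetical underlay routes (router ids); r4 is overlay peer B's router.
a_to_b = ["r1", "r2", "r3", "r4"]
b_to_c = ["r4", "r3", "r7"]        # r3 is shared: messages can turn there
print(common_junction(a_to_b, b_to_c))   # ['r1', 'r2', 'r3', 'r7']
```

Here the message avoids visiting r4 at all: four underlay hops instead of six, with the same destination reached.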

5. A novel Logical Adaptive Replica Placement Algorithm (LARPA) is developed,
which implements a logical structure for a dynamic environment. The algorithm
is adaptive in nature and tolerates up to n − 1 faults. It efficiently distributes
replicas on sites at one hop distance to improve data availability in an RTDDBS
over a P2P system. LARPA uses a minimum number of peers, identified through
peer selection criteria, to place replicas in the system. All peers are placed at one
hop distance from the centre of LARPA, which is the place from where any
search starts. Depending upon the selection of peers for the logical structure,
LARPA is classified into LARPA1 and LARPA2. LARPA1 uses only the peers
with the highest candidature values, calculated through the peer selection
criteria; in LARPA2 this candidature value is traded off against the distance of
peers from the centre. The effects of peers leaving and joining the system are
also examined. LARPA improves the response time of the system, throughput,
data availability and the degree of intersection between two consecutive
quorums. Reconciliation in LARPA is fast because the system updates itself at a
fast rate. It also reduces network traffic in the P2P network owing to its one hop
logical structure with a minimum number of replicas. It is implemented in the
Replica Management Layer of SMAP.
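The difference between the two variants can be sketched as a ranking problem. Everything below is assumed for illustration: the fields, the weights, and the distance penalty are not the thesis's peer selection criteria, which are defined in Chapter 7.

```python
# Toy candidature ranking (assumed fields and weights, for illustration only).

def candidature(peer, w_session=0.5, w_avail=0.4, w_load=0.1):
    """Score a peer: long sessions and high availability good, load bad."""
    return (w_session * peer["session_time"]
            + w_avail * peer["availability"]
            - w_load * peer["load"])

peers = [
    {"id": "P1", "session_time": 0.9, "availability": 0.95, "load": 0.2, "hops": 3},
    {"id": "P2", "session_time": 0.7, "availability": 0.90, "load": 0.1, "hops": 1},
    {"id": "P3", "session_time": 0.8, "availability": 0.60, "load": 0.5, "hops": 2},
]

# LARPA1: pick replica holders purely by candidature value.
larpa1 = sorted(peers, key=candidature, reverse=True)

# LARPA2: discount the score by distance from the centre peer.
larpa2 = sorted(peers, key=lambda p: candidature(p) - 0.1 * p["hops"],
                reverse=True)

print([p["id"] for p in larpa1])   # ['P1', 'P2', 'P3']
print([p["id"] for p in larpa2])   # ['P2', 'P1', 'P3']
```

P1 has the best raw score, but P2 wins under LARPA2 because it sits closer to the centre, which is exactly the trade-off described above.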

6. A self organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is
developed for improving hierarchical quorums over P2P systems. It arranges all
replicas in a logical tree structure and adapts to peers joining and leaving the
system. It places all updated replicas on the root side of the logical structure; to
access updated data items from this structure, the scheme uses a special access
order, i.e., Top-to-Bottom and Left-to-Right. The HBFAR scheme always selects
updated replicas for quorums from the logical structure. It provides short quorum
acquisition times with a high quorum intersection degree between two
consecutive quorums, which maximizes the overlapped replicas for read/write
quorums. HBFAR improves the response time and the search time of replicas for
quorums.

7. The HBFAR scheme provides high data availability and a high probability of
accessing updated data in a dynamic P2P system. High fault tolerance and low
network traffic are reported by the HBFAR scheme under peer churn. Parallelism
between quorum access and structure maintenance keeps the HBFAR scheme
updated without affecting quorum access time. It is analyzed mathematically as
well as through a simulator, and provides the feature of reading one replica in its
best case. It is implemented in the Replica Management Layer of SMAP.
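The Top-to-Bottom, Left-to-Right access order amounts to a breadth-first scan of the replica tree. The sketch below uses a small illustrative tree; HBFAR's actual maintenance rules (reshuffling the tree under churn so that fresh replicas stay near the root) are covered in Chapter 8.

```python
from collections import deque

# Illustrative replica tree: parent -> children, with the freshest
# replicas kept toward the root by the (not shown) reshuffle rules.
tree = {
    "P1": ["P2", "P3"],
    "P2": ["P4", "P5"],
    "P3": ["P6", "P7"],
}

def quorum(root, size):
    """Draw a quorum of `size` replicas Top-to-Bottom, Left-to-Right."""
    order, queue = [], deque([root])
    while queue and len(order) < size:
        node = queue.popleft()
        order.append(node)                 # top-to-bottom ...
        queue.extend(tree.get(node, []))   # ... left-to-right
    return order

print(quorum("P1", 4))   # ['P1', 'P2', 'P3', 'P4']
```

Because both a read and a subsequent write quorum are drawn from the root end, consecutive quorums share a large prefix, which is the high intersection degree the scheme relies on; in the best case a read quorum of size one (the root) already holds the latest data.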

1.8 Organization of Thesis


The work presented in this thesis is divided into nine chapters. The subsequent
chapters present various techniques pertaining to the requirements, design, and
development of the proposed system architecture for real time data placement and
management.

Chapter 1 briefly defines what a P2P network is and discusses the limitations and
applications of P2P systems. A look is also given at the challenges in existing
systems and in the development of new ones. The objective of this research is
followed by the contributions made in this dissertation. The chapter gives a roadmap
of the dissertation and ends with a summary. Chapter 2 presents the literature review.

Chapter 3 presents a fast, cost efficient and self managed P2P system capable of
dealing with a high churn rate of peers in the network. It gives the architectural view
of the Statistics Manager and Action Planner (SMAP) system and the advantages
behind its development, followed by a summary of the chapter.

Chapter 4 explores the architecture of the 3-Tier Execution Model (3-TEM).
3-TEM executes system events in parallel and provides high throughput. It is
integrated with a Matrix Assisted Technique (MAT) for partitioning a real time
database for P2P networks. This chapter also discusses the simulation study findings,
followed by a summary of the chapter.

Chapter 5 gives the Timestamp based Secure Concurrency Control Algorithm
(TSC2A), which handles the issues of concurrent execution of transactions in the
dynamic environment of a P2P network. This chapter also presents the system model
and system architecture for the implementation of TSC2A, along with simulation
results and findings, followed by a summary of the chapter.

Chapter 6 highlights the Common Junction Methodology (CJM), which reduces
the redundant network traffic generated by the topology mismatch problem in P2P
networks. It also explores the correctness proof, implementation, simulation results
and findings, followed by a summary of the chapter.

Chapter 7 explores the Logical Adaptive Replica Placement Algorithm (LARPA).
It identifies a suitable number of replicas to maintain data availability within an
acceptable range. Two variants of LARPA are presented for maintaining the logical
structure. The chapter also gives the implementation, simulation results and findings,
followed by a summary.

Chapter 8 discusses the self organized Height Balanced Fault Adaptive Reshuffle
(HBFAR) scheme, developed for improving hierarchical quorums over P2P systems.
It also gives the correctness proof of HBFAR, its implementation, simulation results
and findings, followed by a summary of the chapter.

Finally, Chapter 9 concludes the work presented in this thesis and outlines its
future scope.

1.9 Summary
In this chapter we have briefly defined P2P networks, their limitations and
applications. A look has also been given at the challenges in existing systems. The
motivation for this research is followed by the contributions made in this dissertation
and a roadmap of the thesis. The next chapter presents the literature review.


Chapter 2

Literature Review
In recent years the evolution of a new wave of innovative network architectures for
P2P networks has been witnessed [17]. In these networks all peers cooperate with
each other to perform critical functions in a decentralized manner. All peers, i.e.,
both users and resource providers (service providers), can access each other directly
without intermediary agents. Compared with a centralized system, a P2P system
provides an easy way to aggregate large amounts of resources residing at the edge of
the Internet or in ad hoc networks with a low cost of system maintenance.
The rest of the chapter is organized as follows. P2P networks are explored in Section
2.1. Types of P2P networks are presented in Section 2.2. File sharing systems are
given in Section 2.3. Section 2.4 discusses underlay and overlay networks. Section
2.5 presents challenges in P2P systems. Section 2.6 discusses parallelism in
databases. Section 2.7 presents concurrency control. The topology mismatch problem
is explored in Section 2.8. Replication for availability is given in Section 2.9. Section
2.10 explores quorum consensus. Section 2.11 presents databases, and some
middleware systems are presented in Section 2.12. An analysis of the reviewed work
is presented in Section 2.13. Finally, the chapter is summarized in Section 2.14.

2.1 Peer-to-Peer (P2P) Networks


A P2P system is a distributed network architecture composed of participants that
make a portion of their resources, such as processing power, disk storage or network
bandwidth, directly available to other network participants, without the need for
central coordination instances such as servers or stable hosts (see Figure 2.1). A P2P
system assumes equipotency of its participants, organized through the free
cooperation of equals to perform a common task [18].

Figure 2.1 The Basic Architecture of P2P Network

In P2P networks, all peers cooperate with each other to perform critical functions
in a decentralized manner. These peers, both users and resource providers (service
providers), can access each other directly without intermediary agents. Compared
with a centralized system, a P2P system provides an easy way to aggregate large
amounts of resources residing at the edge of the Internet or in ad hoc networks with a
low cost of system maintenance, and such systems attract increasing attention from
researchers. They are characterized by direct access between peer systems rather
than through a centralized server. More simply, a P2P network links the resources of
all the peers on a network and allows those resources to be shared in a manner that
eliminates the need for a central host. In P2P systems, peers of equal roles and
responsibilities, often with varying capabilities, exchange information or share
resources directly with each other, and such systems function without any central
administration or coordination instance. A P2P network differs from conventional
client/server or multitiered server networks: the peers are both suppliers and
consumers of resources, in contrast to the traditional client/server model where only
servers supply and clients consume (see Figure 2.2).

Figure 2.2 The Basic Client/Server Architecture

2.2 Types of P2P Networks


P2P systems can be of various types. File sharing is the dominant P2P application on
the Internet, allowing users to easily contribute, search for and obtain content [19,
20, 21]. P2P file sharing architectures can be classified according to the extent to
which they rely on one or more servers to facilitate the interaction between peers:
P2P systems are categorized as centralized, decentralized structured, or decentralized
unstructured.
An important achievement of P2P networks is that all clients provide resources,
including bandwidth, storage space, and computing power. Thus, as peers arrive and
demand on the system increases, the total capacity of the system also increases. This
is not true for a client/server architecture with a fixed set of servers, in which adding
more clients can mean slower data transfer for all users.
Companies are using the processing capabilities of many smaller, less powerful
computers to replace large and expensive supercomputers [5, 22]. This approach
fulfills the requirements of large computing tasks using existing in house computers
or by accessing computers over the Internet.

2.2.1 Structured P2P Networks


Structured P2P networks employ a globally consistent protocol to ensure that any
peer can efficiently route a search to some peer that has the desired file, even if the
file is extremely rare (see Figure 2.3). Such a guarantee necessitates a more
structured pattern of overlay links [23]. By far the most common type of structured
P2P network is the distributed hash table (DHT) [24, 52], in which a variant of
consistent hashing is used to assign ownership of each file to a particular peer, in a
way analogous to a traditional hash table's assignment of each key to a particular
array slot.
DHTs are a class of decentralized distributed systems that provide a lookup
service similar to a hash table: (key, value) pairs are stored in the DHT, and any
participating peer can efficiently retrieve the value associated with a given key.
Responsibility for maintaining the mapping from keys to values is distributed among
the peers in such a way that a change in the set of participants causes a minimal
amount of disruption. This allows DHTs to scale to extremely large numbers of peers
and to handle continual peer arrivals, departures, and failures.

Figure 2.3 Distributed Hash Table (DHT)

DHTs form an infrastructure that can be used to build P2P networks. DHT based
networks have been widely utilized to accomplish efficient resource discovery for
grid computing systems, as they aid resource management and the scheduling of
applications. Resource discovery involves searching for the resource types that
match the user's application requirements. Recent advances in decentralized resource
discovery have been based on extending existing DHTs with the capability of
multidimensional data organization and query routing.

2.2.2 Unstructured P2P Networks


An unstructured P2P network is formed when the overlay links [25] are established
arbitrarily. Such networks are easy to construct: a new peer that wants to join
the network can copy the existing links of another peer and then form its own links
over time. If a peer wants to find a desired piece of data in an unstructured
network, the query has to be flooded through the network to reach as many peers as
possible that share the data (see Figure 2.4). The main disadvantage of such
networks is that queries may not always be resolved. Popular content is likely to
be available at several peers, and any peer searching for it is likely to find it.
But if a peer is looking for rare data shared by only a few other peers, the search
is unlikely to succeed. Since there is no correlation between a peer and the
content managed by it, there is no guarantee that flooding will find a peer that
has the desired data. Flooding also causes a high amount of signaling traffic in
the network, and hence such networks typically have very poor search efficiency
[26]. Many of the popular P2P networks are unstructured.
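The flooding behaviour just described can be sketched as a TTL-bounded search. The graph, the default `ttl`, and the message counter below are illustrative assumptions, not any particular protocol's wire format; the counter makes the signalling overhead of blind flooding visible.

```python
from collections import deque

def flood_search(adj, start, is_holder, ttl=3):
    """Flood a query to all neighbours until the TTL expires; return the
    first peer found holding the data plus the number of messages sent."""
    messages, seen = 0, {start}
    frontier = deque([(start, ttl)])
    while frontier:
        peer, t = frontier.popleft()
        if is_holder(peer):
            return peer, messages
        if t == 0:
            continue
        for nb in adj[peer]:
            messages += 1                 # every forward costs one message
            if nb not in seen:
                seen.add(nb)
                frontier.append((nb, t - 1))
    return None, messages

# Tiny example topology; peer 3 holds the file.
graph = {0: [1, 2], 1: [0, 2, 3], 2: [0, 1], 3: [1]}
print(flood_search(graph, 0, lambda p: p == 3, ttl=2))
```

Even in this four-peer example the message count exceeds the number of peers, and with too small a TTL the rare item is simply never found, which is exactly the trade-off discussed above.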


In pure P2P networks, peers act as equals, merging the roles of client and server. In
such networks there is no central server managing the network, nor a central
router. Some examples of pure P2P application-layer networks designed for file
sharing are Gnutella and Freenet [27].
There also exist hybrid P2P systems, which divide their participants into two
groups: client peers and overlay peers. Typically, each client is able to act
according to the momentary needs of the network and can become part of the
respective overlay network used to coordinate the P2P structure. This division
between normal and better-provisioned peers addresses the scaling problems of
early pure P2P networks such as Gnutella (version 2.2).

Figure 2.4 Information Retrieval from Hybrid P2P Based System

Another type of hybrid P2P network is a network using on the one hand central
server(s) or bootstrapping mechanisms, on the other hand P2P for their data transfers.
These networks are in general called centralized networks because of their lack of
ability to work without their central server(s), e.g., eDonkey network (eD2k) [28].

2.3 File Sharing System


File sharing is the dominant P2P application on the Internet, allowing users to easily
contribute, search, and obtain content. Such applications were popularized by file
sharing systems like Napster [13]. P2P file sharing networks have inspired new
structures and philosophies in other areas of human interaction. In such social
contexts, P2P as an idea refers to the classless social networking currently
emerging throughout society, enabled in general by Internet technologies.


P2P file sharing architectures can be classified according to the extent to which
they rely on one or more servers to facilitate the interaction between peers. P2P
systems are categorized [29] into centralized, decentralized structured, and
decentralized unstructured, as shown in Figure 2.5.
Centralized: In these systems there is central control over the peers. A server
carries the information regarding the peers, data files, and other resources. Any
peer that wants to communicate with or use the resources of another peer has to
send a request to the server. The server then looks up the location of the
peer/resource in its database/index. After getting this information, the peer
communicates directly with the desired peer. The model is very similar to the
client/server model; Napster, which is very popular for sharing music files, is an
example. Security measures can be implemented thanks to the central server: at the
time a request is sent, the authorization and authentication of the peer can be
checked.

[Figure: classification tree — Peer-to-Peer Systems divide into Centralized (e.g.,
Napster) and Decentralized; Decentralized divides into Structured (e.g., Chord,
CAN) and Unstructured (e.g., Gnutella, Freenet).]
Figure 2.5 Classifications of P2P System Networks

Due to the central server, it is easy to locate and search for an object/peer.
These systems are easy to implement, as the structure is similar to the
client/server model, i.e., their complexity is low.
However, such systems are not scalable due to limits on computational capability,
bandwidth, etc. They have poor fault tolerance, owing to the absence of object
replication [30] and load balancing, and they are unreliable because of
single-point failure, malicious attacks, and network congestion near the server.
They are also the least secure, and the overhead on system performance is high.
Distributed databases may be used in these types of systems.
In centralized P2P systems, resource discovery is done by the central server,
which keeps all the information regarding resources, e.g., Napster [13]. Multiple
servers can be used to enhance performance in centralized systems [31].
Decentralized Structured: Decentralized structured P2P networks (e.g., Chord [31],
CAN [32, 33], Tapestry [34, 35], Pastry [34] and TRIAD [36]) use a logical
structure to organize the peers of the network. These networks use a distributed
hash table-like mechanism to look up files and are efficient in locating objects
quickly, since the logical structure reduces the search space exponentially.
Because of this structure, it is easy to locate and search for an object/peer, and
message traffic is reduced. These systems are scalable thanks to dynamic routing
protocols; they have good performance and are least affected by scale. They are
reliable in nature, supporting failed-peer detection and replication of objects.
However, since they impose tight control over the overlay topology, they are not
robust to peer dynamics: their performance is severely affected if the churn rate
is high, and they are not suitable for ad hoc peers. Database searching is
comparatively more complex than in centralized systems [37].
Decentralized Unstructured: These systems are the actual P2P systems, i.e., closest
to the definition of P2P [38, 39]. There is no central control, and every peer may
act as a server (providing services) as well as a client (consuming services). A
peer that wants to communicate with another peer has to broadcast/flood its
request to all connected peers in order to locate the peer/data object. Only a
peer that has the data responds, sending the data object back along the reverse
path to the requesting peer. The flooding or broadcasting of requests creates
unnecessary traffic on the network, which is the main drawback of these systems.
Much work is ongoing to reduce this traffic, and various techniques have been
proposed, e.g., forwarding-based, cache-based, and overlay optimization [40].


These systems do not impose tight control over the overlay topology, so they
support peer dynamics, and performance is not much affected by a high churn rate.
They are distributed in nature, so there is no single point of failure.
Scalability is poor due to the traffic overhead of discovering an object/peer: as
the system grows beyond a limit, its performance keeps decreasing, and searching
for a resource in an unstructured system is very costly. Flooding is used to
search for resources; to enhance performance, Random Walk [41] and location-aware
topology matching [42] are used. For fault tolerance, self-maintenance and
self-repair techniques are used [17].
For securing information, these systems use PKI [18]. Alliatrust, a reputation
management scheme [19], deals with threats such as free riders and polluted
content.
To cope with query loss and system overloading, a congestion-aware search protocol
may be used [17]. This includes Congestion Aware Forwarding (CAF), Random Early
Stop (RES) and Emergency Signaling (ES). Location-dependent queries use the
Voronoi diagram [43]. Structured and unstructured P2P networks each have their own
advantages and disadvantages, and the file sharing system of a P2P network depends
upon the application deployed on it. For implementing databases over P2P networks,
structured file sharing systems have an advantage over unstructured ones, because
of the multiple communications between peers and the need to reduce the search
time for data in the network.

2.4 Underlay and Overlay P2P Networks


The underlay network comprises the active/passive entities participating in
transferring a message (physically) from source to destination using physical
channels/links. An overlay network is a computer network (see Figure 2.6) that
logically connects the peers and is built on top of the underlay network (IP) [44,
45, 46]. A link between peers in the overlay (logical structure) corresponds to a
path between them through many physical links in the underlying network. For
example, distributed systems such as cloud computing, P2P networks, and
client/server applications are overlay networks because their peers run on top of
the Internet. The Internet itself was built as
an overlay upon the telephone network. Overlay networks have also been proposed as
a way to improve Internet routing, such as through quality of service (QoS)
guarantees to achieve higher quality streaming media [47]. Earlier proposals
such as IntServ, DiffServ, and IP Multicast have not seen wide acceptance, largely
because they require modification of all routers in the network. An overlay
network, on the other hand, may be incrementally deployed on end hosts running the
overlay protocol software, without cooperation from ISPs. The overlay has no
control over how packets are routed in the underlying network between two overlay
peers, but it controls the sequence of overlay peers a message traverses before
reaching its destination.


Figure 2.6 Typical Overlay Network

Such an overlay network might form a structured overlay following a specific
topology, or an unstructured network in which participating entities are connected
in a random or pseudo-random fashion. In weakly structured P2P overlays, peers are
linked depending on a proximity measure, providing more flexibility than
structured overlays and better performance than fully unstructured ones.
Proximity-aware overlays connect participating entities to close neighbors
according to a given proximity metric reflecting some degree of


affinity (computation, interest, etc.) between peers. This approach can be used to
provide algorithmic foundations for large-scale dynamic systems.

2.5 Challenges in P2P Systems


The Internet started out as a fully symmetric P2P network of cooperating users. As
it grew to accommodate the millions of people flocking online, technologies were
put in place that split the network into a system with relatively few servers and
many clients. These phenomena pose challenges and obstacles to P2P applications:
both the network and the applications have to be designed together to work in
concert. Application authors must design robust applications that can function in
the complex Internet environment, and network designers must build in capabilities
to handle new P2P applications. Fortunately, many of these issues are familiar
from the experience of the early Internet; researchers must learn those lessons
and apply them in new system designs. A P2P system has to address challenges
related to the network as well as application-specific ones. The problem addressed
in this thesis is to build a P2P system for real time information dissemination
and management. This problem is twofold, relating to dynamic P2P networks and to
real time databases. Thus, the next section discusses challenges related to
networks and to real time databases.

2.5.1 Challenges in P2P Networks


P2P systems are usually large scale dynamic systems whose peers are distributed
over a wide geographic area. In order to enable resource awareness in such a large
scale dynamic distributed environment, a specific resource management strategy is
required that takes into account the P2P characteristics. Within the scope of this
research, a suitable solution for real time data/resource management in P2P
systems must fulfill the following requirements:
Fault Tolerance [8]: P2P systems are used in situations where a system has to
function properly without any kind of centralized monitoring or management
facility. Because of the dynamic behavior of peers, an appropriate resource
management strategy for P2P systems must support fault tolerance in its operations
[48]. Therefore, automatic self-recovery from failures without seriously affecting
overall performance


becomes extremely important for P2P systems. The term fault tolerance means that a
system can provide its services even in the presence of faults that are caused either by
internal system errors or occur due to some influence of its environment.
Scalability [49] and reliability are defined in traditional distributed system
terms, such as bandwidth usage: how many systems can be reached from one peer, how
many systems can be supported, how many users can be supported, and how much
storage can be used. However, sometimes it is not possible to recover from
a failure. It is then necessary that the system be capable of adequately providing the
services in the presence of such partial failure. In case of a failure a P2P system must
be capable of providing continuous service while necessary repairs are being made. In
other words, operation such as routing between any two peers n1 and n2 must be
completed successfully even when some peers on the way from n1 to n2 fail
unpredictably.
Reliability is related to system and network failures, disconnections,
availability of resources, etc. Given the lack of a strong central authority for
autonomous peers, improving system scalability and reliability is an important
goal. As a result, algorithmic innovation in the area of resource discovery and
search has been a clear area of research, resulting in new algorithms for existing
systems and in the development of new P2P platforms.
Low cost for network maintenance [5, 50, 51]: The management of a peer's insertion
into or deletion from the network, as well as the dissemination and replication of
resources, generates control messages in the network. Control messages are mainly
used to keep the topology-changing network up to date and in a consistent state.
However, since the number of control messages can become very large, even larger
than the number of data packets, it is required to keep the proportion of control
messages to data packets as low as possible. The cost of resource management
should not be higher than the cost of the network resource utilization itself.
Load Balancing [51]: Load distribution is measured by investigating how well the
network management duties are distributed between the peers in the network.
Parameters for assessing this are, for example, the routing table and the location
table at each peer of the system. A suitable resource management strategy for P2P should


ensure a well balanced distribution of the management duties between the peers of the
system [3, 51, 53].
Peer Availability [54]: A peer's lifetime is the time between when it enters the
overlay for the first time and when it leaves the overlay permanently. A peer's
session time is the elapsed time between when it joins the overlay and when it
subsequently leaves. The sum of a peer's session times divided by its lifetime is
defined as its uptime, or availability [55, 56, 57, 58, 59]. The availability of a
P2P management solution defines the probability that a resource is successfully
located in the system. A resource management strategy is said to be highly
available when it enables any existing resource of the system to be found, when
requested, with a probability of almost 100%. This depends on fault-tolerant
routing and on the resource distribution strategies [2, 60].
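The uptime definition above (sum of session times divided by lifetime) can be computed directly; the helper name and sample session intervals below are illustrative.

```python
def availability(sessions, lifetime):
    """Availability (uptime) = sum of session times / lifetime, with each
    session given as a (start, end) pair inside the peer's lifetime."""
    up = sum(end - start for start, end in sessions)
    if not 0 <= up <= lifetime:
        raise ValueError("sessions must fit inside the lifetime")
    return up / lifetime

# A peer alive for 100 hours with three sessions totalling 60 hours.
print(availability([(0, 20), (30, 40), (60, 90)], 100))  # → 0.6
```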
Cost sharing/reduction: Centralized systems that serve many clients typically bear
the majority of the cost of the system. When that cost becomes too large, a P2P
architecture can help spread it over all the peers [1, 18]. For example, in the
file sharing space, the developed system enables the cost of file storage to be
shared and maintains the index required for sharing. Much of the cost sharing is
realized by the utilization and aggregation of otherwise unused resources, which
results both in net marginal cost reductions and in a lower cost for the most
costly system component. Because peers tend to be autonomous, it is important for
costs to be shared reasonably.
Logical Structure: The structure in which replicas/peers are connected plays an
important role in reducing replica search time and network traffic. The messages
propagated to search for replicas generate huge network traffic because of the
topology mismatch problem. Structures should be selected to minimize search time
and network traffic.
Underlay/Overlay Paths [61, 62]: A message travels multiple hops in the underlay
for each one-hop path in the overlay. Each forwarding of a message along an
overlay path adds heavy traffic to the physical network: while messages are
forwarded along the overlay path, peers in the underlay are traversed multiple
times. This causes redundant traffic in the network, and the network may slow down
to the point of choking. Unnecessary message forwarding, at least at the overlay,
should be minimized.
Resource Aggregation and Interoperability [7, 50]: A decentralized approach lends
itself naturally to aggregation of resources. Each peer in the P2P system brings with it
certain resources such as computing power or storage space. Applications that benefit
from huge amounts of these resources, such as compute intensive simulations or
distributed file systems, naturally lean toward a P2P structure to aggregate these
resources to solve the larger problem. Interoperability is also an important
requirement for the aggregation of diverse resources.
Dynamism [6, 51]: P2P systems assume that the computing environment is highly
dynamic. That is, resources, such as compute peers, will be entering and leaving the
system continuously. When an application is intended to support a highly dynamic
environment, the P2P approach is a natural fit. In communication applications, such
as Instant Messaging, so called Buddy Lists are used to inform users when persons
with whom they wish to communicate become available. Without this support, users
would be required to poll for chat partners by sending periodic messages to them.
Dynamic Service Relationships [63, 64]: Dynamic service relationships become an
important issue in P2P systems because those systems are nondeterministic,
dynamic, and self-organizing based on the immediately available resources. A P2P
system is typically loosely coupled; moreover, it is capable of adapting to
changes in the system structure and its environment, viz., the number of peers,
their roles, and the infrastructure. In order to build a loosely coupled system
that is capable of dynamic reconfiguration, several mechanisms are needed.
Data/Peer Discovery: There must be a distributed search mechanism that allows
services and service providers to be found based on certain criteria. The
challenge is to find the right number of lookup services that should be available
in the system. Another challenge is to decide which peer will run a lookup service
in a fully distributed environment; again, a decision-making or voting mechanism
is needed. Running a lookup service requires additional resources, such as power
and memory, from the peer and therefore cannot always be requested from the peer
free of charge.
Thus, the length of the resource lookup path is a benchmark for the effectiveness
of resource management. Any requested resource should be found within an optimal
lookup path length that is as close as possible to the Moore bound
D = log_{δ−1}(N_max(δ − 2) + 2) − log_{δ−1} δ [65, 66], where δ is the peer degree
and D is the diameter of a Moore graph, i.e., the smallest diameter achievable by
a connected graph on N_max peers of degree δ.
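Reading the symbols of the bound as N_max peers of degree δ, it can be evaluated numerically. The function below is a sketch under that reading (it assumes δ > 2 so the logarithms are defined); the complete graph K_5 is used as a sanity check, since it attains diameter 1.

```python
import math

def moore_diameter(n_max, degree):
    """Moore bound on diameter: D = log_{d-1}(n_max*(d-2) + 2) - log_{d-1}(d),
    the smallest diameter a connected graph of this size and degree can reach.
    Assumes degree > 2."""
    base = degree - 1
    return (math.log(n_max * (degree - 2) + 2, base)
            - math.log(degree, base))

# Complete graph K_5: 5 peers of degree 4, diameter exactly 1.
print(moore_diameter(5, 4))
```

The Petersen graph (10 peers of degree 3, diameter 2) also meets the bound, which is a useful second check on the formula.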
Naming/Addressing [6]: In order to identify a resource (peer or service), a unique
identification mechanism or naming concept needs to be introduced into a P2P
system. How is a peer addressed in the global network? Addresses that are normally
used to access a peer in the network (such as an IP address in a TCP/IP network)
do not help much, since the P2P system is heterogeneous; therefore different
addressing protocols can, in theory, be used within one P2P network.
Security [67, 68, 69, 70]: P2P systems face numerous security challenges, starting
with making sure a user of the system is really who it claims to be. In P2P
systems, service and resource consumers might require proof of information about
the provider; otherwise authentication cannot be considered successful. Therefore,
distributed trust establishment mechanisms are needed to decide whether a user is
authenticated to access the system. In centralized systems the user rights are
predefined, and the decision to allow access for a certain user is taken based on
these predefined rights [13]. In P2P systems the requestor is not known a priori,
which leads to a complex decision-making process. Further challenges include
making sure data cannot be accessed by unauthorized parties, making sure it was
not modified on the wire without this being recognized, and proving from whom the
data came, for example with cryptographic signatures, as well as making sure that
actions that have been executed cannot be claimed never to have happened
(non-repudiation). The system must be especially hardened against insider attacks,
because people can easily become insiders.
State and Data Management: P2P systems are characterized by the fact that a single
failing peer must not bring down the system as a whole. Of course, specific services

(those that had lived on the dying peer) might not be available anymore, but the
system still fulfills a useful purpose. In many systems this requires facilities for some
kind of distributed data management [10]. As a consequence, we have to look at the
following challenges: replication [71, 72, 73], caching [63], consistency and
synchronization, and finding the nearest copy.
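As a minimal illustration of the replication and consistency concerns just listed, a write-all/read-one scheme keeps every copy identical, so any single replica (for instance the nearest one) can serve a read. The class and method names are hypothetical, and real systems must additionally handle partial failures during the write.

```python
class ReplicaSet:
    """Write-all/read-one replication: each write updates every copy, so a
    read from any one replica stays consistent with the others."""
    def __init__(self, n_replicas):
        self.replicas = [dict() for _ in range(n_replicas)]

    def write(self, key, value):
        for copy in self.replicas:   # synchronous update of all copies
            copy[key] = value

    def read(self, key, replica=0):
        return self.replicas[replica].get(key)

rs = ReplicaSet(3)
rs.write("item", "v1")
print([rs.read("item", i) for i in range(3)])  # every copy answers "v1"
```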

2.5.2 Challenges for Databases in P2P Networks


A conventional distributed database expects 100% availability of the servers/hosts
on which it is deployed. In P2P networks, however, all peers are free to leave the
network whenever they want, i.e., there is no control over the participating peers
on which the database is placed. This presents a separate set of challenges, a few
of which are as follows.
Peer Selection for Storing the Database [74, 75, 76]: A large number of peers
participate in P2P networks, depending upon their interest in the system. Each
peer connects to the system with some bandwidth and has a session time for which
it remains connected. Each peer acts as a server as well as a client, depending
upon the service/resource provided or consumed. It is very difficult to select a
suitable peer among the many participating peers under a high churn rate. To
select suitable peers that may contribute to system performance, a variety of
parameters affecting performance have to be checked.
Peer Discovery: The network uses distributed discovery algorithms to find suitable
peers having sufficient resources for the system. The selection of peers for
various roles in the system must be addressed.
Network Traffic Balancing: P2P systems generate a huge amount of redundant traffic
over the Internet due to the topology mismatch problem. This traffic increases
further, exponentially, whenever a P2P system deals with a database, raising
Internet traffic to the level of congestion/choking. An efficient system must deal
with this increased traffic; reducing the traffic load on the Internet and
balancing the traffic in case of congestion must be addressed.


Database Partitioning: Databases may be partitioned to maintain availability,
security, peer load, etc. System performance also depends upon how the database is
divided into partitions and how these partitions are accessed for a submitted
query.
Data Availability [77, 78]: The data availability of a system is a measure of how
often the complete data is available for an arriving query. High data availability
increases the performance of the system; hence it is required, and maintaining it
is a challenge for the system.
Security of the Database Partition [79]: The database cannot be placed at
untrusted peers, which could misuse, tamper with, or destroy it. This issue should
be considered while addressing the other challenges.
Query Interface in a Heterogeneous Environment: The peers participating in the
system may be heterogeneous in hardware and platform. Executing an arriving query
on heterogeneous peers is also an important issue in implementing databases over
P2P systems.
Schema Mapping [80]: At the time of data partitioning, a global schema is
partitioned into local schemas. Arriving queries are based on the global schema,
so a mapping is required to execute a global query through the local schemas. The
schema mapping technique also affects system performance.
Peer Mapping: In this operation, the peers used to store the database partitions
are selected. The mechanism used to map data items onto the peers affects system
performance.
Transparency in the Database: Transparency is the hiding of intermediate processes
from the user. This gives the illusion that queries are executed by the
coordinator only, not by the peers storing the database partitions.
Data Consistency [81, 82]: The system deals with a number of replicas, each of
which is accessed for read/write operations. Thus, the data must be identical in
all replicas after the execution of any operation in the system, and an efficient
update mechanism is required to update each replica.
One-Copy Serializability: A P2P system uses distributed execution for a query,
where the query is distributed to a number of participating peers. The system
compiles the partial results from the distributed execution processes. The final
result of the global query must be the same as if it were executed on a single
machine; this is the one-copy serializability of the query.
Locality Awareness: Locality awareness can significantly improve high-level
routing and information exchange in the application layer. It also helps check
whether a peer is still in the network or has quit.
Adaptation: The overlay network should exhibit good adaptability, since nodes can
join and leave at any time. Thus, it has to be able to adapt to changes rapidly.
Fault Tolerance & Robustness: If one or more nodes of the overlay network fail,
the overlay still has to function accurately. Failures have to be rapidly
recognized and corrected. If nodes fail, the logical connections incident to those
nodes also fail; thus the overlay has to seek alternative connections.

2.6 Parallelism in Databases [83, 84]


Parallel execution is referred to as parallelism. It may be implemented on certain
types of Online Transaction Processing (OLTP) and hybrid systems. The idea is to
break down a task so that, instead of one process doing all of the work in a
query, many processes do parts of the work at the same time. Parallelism is
effective in systems having all of the following characteristics: (a) symmetric
multiprocessors (SMP), clusters, or massively parallel systems; (b) sufficient I/O
bandwidth; (c) underutilized or intermittently used CPUs (for example, systems
where CPU usage is typically less than 30%); and (d) sufficient memory to support
additional memory-intensive processes such as sorts, hashing, and I/O buffers. An
example is four processes handling four different tasks at a workplace instead of
one process handling all four tasks by itself; the improvement in performance can
be quite high. In this case, each task would be a partition: a smaller and more
manageable unit of an index or table. The most common use of parallelism is in
Decision Support Systems (DSS) and data warehousing environments, where parallel
execution significantly reduces response time for data-intensive operations on
large databases. Complex queries, such as those involving joins of several tables
or searches of very large tables, are often best executed in parallel. If a system
lacks any of these characteristics, parallelism might not significantly improve
performance.
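The divide-the-work idea can be sketched with a small parallel scan: each worker filters one partition and a coordinator merges the partial results. The function name, sample partitions, and the choice of a thread pool are illustrative assumptions, not any database engine's actual executor.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_scan(partitions, predicate):
    """Each worker scans one partition in parallel; the coordinator
    merges the partial results in partition order."""
    def scan(partition):
        return [row for row in partition if predicate(row)]
    with ThreadPoolExecutor(max_workers=len(partitions)) as pool:
        partials = pool.map(scan, partitions)   # one task per partition
    return [row for partial in partials for row in partial]

parts = [[1, 8, 3], [9, 2], [7, 4]]
print(parallel_scan(parts, lambda x: x > 4))  # → [8, 9, 7]
```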

2.6.1 Partitioning Methods


Database partitioning is the process of dividing a database into a number of parts
according to some criterion. The criterion should be such that the data items from
the various partitions can be compiled to generate the same results as the
non-partitioned database. There are four partitioning methods:

Range Partitioning

Hash Partitioning

List Partitioning

Composite Partitioning

Each partitioning method has different advantages and design considerations. Thus,
each method is more appropriate for a particular situation.
Range Partitioning: Range partitioning maps data to partitions based on ranges of
partitioning-key values established for each partition. It is the most common type
of partitioning and is often used with dates, e.g., to partition sales data into
monthly partitions. Range partitioning maps rows to partitions based on ranges of
column values. It is defined by the partitioning specification for a table or
index, PARTITION BY RANGE (column_list), and by the partitioning specification for
each individual partition, VALUES LESS THAN (value_list), where column_list is an
ordered list of columns that determines the partition to which a row or an index
entry belongs. These columns are called the partitioning columns, and the values
in the partitioning columns of a particular row constitute that row's partitioning
key.
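The VALUES LESS THAN rule can be sketched as a binary search over the partition bounds: a row lands in the first partition whose bound is strictly greater than its key. The bounds below (hypothetical monthly sales partitions) are illustrative only.

```python
from bisect import bisect_right
from datetime import date

# Hypothetical monthly sales partitions; each bound is the
# VALUES LESS THAN limit of the partition with the same index.
BOUNDS = [date(2013, 2, 1), date(2013, 3, 1), date(2013, 4, 1)]

def range_partition(sale_date, bounds=BOUNDS):
    """Return the index of the first partition whose upper bound is
    strictly greater than the row's partitioning key."""
    part = bisect_right(bounds, sale_date)
    if part == len(bounds):
        raise ValueError("key above the highest bound")
    return part

print(range_partition(date(2013, 2, 14)))  # → 1, the February partition
```

Using `bisect_right` makes the bound strict, mirroring VALUES LESS THAN: a key equal to a bound belongs to the next partition.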
Hash Partitioning: Hash partitioning maps data to partitions based on a hashing
algorithm applied to a partitioning key. The hashing algorithm distributes rows
evenly among partitions, giving partitions approximately the same size. Hash
partitioning is the ideal method for distributing data evenly across devices. It
is a good and easy-to-use alternative to range partitioning when the data is not
historical and there is no obvious column or column list on which logical range
partition pruning would be advantageous. Oracle Database uses a linear hashing
algorithm to prevent data from clustering within specific partitions.
List Partitioning: List partitioning enables explicit control over how rows map to partitions, by specifying a list of discrete values for the partitioning column in the description of each partition. This differs from range partitioning, where a range of values is associated with a partition, and from hash partitioning, where the user has no control over the row-to-partition mapping. The advantage of list partitioning is that unordered and unrelated sets of data can be grouped and organized in a natural way.
Composite Partitioning: Composite partitioning combines range partitioning with hash or list partitioning. Oracle Database first distributes data into partitions by range, and then uses a hashing algorithm to further divide the data into subpartitions within each range partition. For range-list partitioning, Oracle divides the data into subpartitions within each range partition based on the explicit list.
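The row-to-partition mapping of the three basic methods can be sketched as follows. The bounds, value lists, and keys used here are illustrative assumptions, not tied to any particular DBMS; in range partitioning the bounds play the role of the exclusive values less than (value_list) limits described above.

```python
import bisect

def range_partition(key, bounds):
    """Return the first partition whose exclusive upper bound exceeds key."""
    return bisect.bisect_right(bounds, key)

def hash_partition(key, n_partitions):
    """Map key to a partition via a hash function, spreading rows evenly."""
    return hash(key) % n_partitions

def list_partition(key, value_lists):
    """Return the partition whose discrete value list contains key."""
    for i, values in enumerate(value_lists):
        if key in values:
            return i
    raise KeyError(f"no partition lists value {key!r}")

# Quarterly sales partitions on month number: p0 holds months < 4,
# p1 months < 7, p2 months < 10; month 7 therefore lands in p2.
print(range_partition(7, [4, 7, 10]))
print(list_partition("IN", [["US", "CA"], ["IN", "CN"]]))
```

Composite partitioning would simply compose two of these mappings, e.g. `hash_partition` applied within the partition chosen by `range_partition`.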
In combination with parallelism, partitioning can improve performance in data warehouses and other systems. Partitioning significantly enhances data access and improves overall application performance. Partitioned tables and indexes facilitate administrative operations by enabling those operations to work on subsets of data; e.g., creating a new partition, reorganizing an existing partition, or dropping a partition causes less than a second of interruption to a read-only application. Partitioned data greatly improves the manageability of very large databases and dramatically reduces the time required for administrative tasks such as backup and restore. Granularity can easily be added to or removed from the partitioning scheme by splitting partitions, and partitions can also be swapped with a table. To improve the performance of databases over dynamic P2P networks, parallelism and database partitioning may both be useful.

2.7 Concurrency Control


To order the execution of transactions, the system assigns each transaction a unique integer, increasing with time, when the transaction begins to execute; this integer is called its timestamp. Timestamp-based concurrency control resolves conflicts according to the order of the timestamps, so that a group of interleaved transactions executes equivalently to the serial sequence given by their timestamps. The aim is to ensure that conflicting read and write operations are executed in timestamp order. In the timestamp method, the system assigns a timestamp TS(Ti) to each transaction Ti. For any data item R, the timestamps of the last read and write operations on R are RTS(R) and WTS(R) respectively. When Ti requests to read R, the timestamp of the read operation is TSR(Ti); when Ti requests to write R, the timestamp is TSW(Ti).
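The basic timestamp-ordering rules, in the RTS(R)/WTS(R) notation above, can be sketched as follows. This is a simplified single-site sketch under the usual convention that a conflicting late transaction is restarted rather than blocked; the class and function names are illustrative.

```python
class TimestampConflict(Exception):
    """Raised when an operation arrives out of timestamp order."""

class DataItem:
    def __init__(self):
        self.rts = 0       # RTS(R): timestamp of the last read
        self.wts = 0       # WTS(R): timestamp of the last write
        self.value = None

def read(item, ts):
    # A read with ts older than the last write would see a "future" value: abort.
    if ts < item.wts:
        raise TimestampConflict("read arrived too late")
    item.rts = max(item.rts, ts)
    return item.value

def write(item, ts, value):
    # A write older than the last read or write violates the serial order: abort.
    if ts < item.rts or ts < item.wts:
        raise TimestampConflict("write arrived too late")
    item.wts = ts
    item.value = value
```

For example, after `write(r, 5, "x")` a `read(r, 7)` succeeds and raises RTS(R) to 7, after which a `write(r, 6, "y")` must be rejected, since it would place T6 before T7 in the serial order that T7 has already observed.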
The Concurrency Control (CC) mechanisms used in an RTDDBS have a significant impact on timeliness, and a large amount of work has been done on the design of CC mechanisms for RTDBSs over the past decades [46, 85, 86, 87, 88]. Locking-based protocols usually combine two-phase locking (2PL) with a priority scheme to detect and resolve conflicts between transactions. However, inherent problems of 2PL, such as the possibility of deadlocks and long blocking times, make it difficult for transactions to meet their deadlines. Optimistic Concurrency Control (OCC) protocols, on the other hand, are non-blocking and deadlock-free, which makes them attractive for RTDDBSs. Conflict resolution between transactions is delayed until a transaction nears completion, so more information is available for resolving conflicts; however, the late conflict detection makes the restart overhead heavy. Some concurrency control protocols based on dynamic adjustment of the serialization order have been developed to avoid unnecessary

restarts [112, 113, 114, 116]. Among these protocols, OCC-TI [112] and OCC-DATI [113], which are based on time intervals, are better than OCC-DA [114], which is based on a single timestamp, since a time interval can capture the partial ordering among transactions more flexibly. OCC-DATI improves on OCC-TI by avoiding some of the latter's unnecessary restarts, but some unnecessary restarts remain with these protocols. A newer version of OCC-DATI is the Timestamp Vector based Optimistic Protocol (OCC-TSV), with which more unnecessary restarts of transactions can be avoided. A Feedback Based Secure Concurrency Control for MLS Distributed Databases, which secures multi-level databases, is presented in [89].
All these conventional protocols are defined for static networks; for dynamic networks such as P2P, modifications have to be made, and the protocols/algorithms must take the constraints of P2P environments into account. The concurrent processes are distributed over unreliable peers, which are prone to leave the network. In such an environment, one-copy-serializability of transaction execution at both global and local levels is hard to achieve. To achieve secure transaction execution over secure data items, with one-copy-serializability of transactions at global and local levels, a secure protocol needs to be identified for the dynamic environment of P2P.

2.8 Topology Mismatch Problem


There are several traditional topology optimization approaches. In [90] the authors describe an approach called End System Multicast, in which a richly connected graph is first constructed and shortest-path spanning trees are built on it. Each tree, rooted at the corresponding source, then uses a routing algorithm for message forwarding. This approach introduces a large overhead for constructing the graph and spanning trees, and it does not consider the dynamic joining and leaving of peers. The overhead of End System Multicast is proportional to the multicast group size, so the approach is not feasible for large-scale P2P systems.
Researchers have also clustered peers that are close based on their IP addresses [91, 92]. This approach has two limitations: first, the mapping accuracy is not guaranteed, and second, it affects the search scope, increasing the volume of network traffic.
In [93], the authors measure the latency between each peer and multiple stable Internet servers called landmarks. The measured latency is used to determine the distance between peers. This measurement is conducted in a global P2P domain and needs the support of additional landmarks. This approach, too, affects the search scope in P2P systems.
GIA [94] introduces a topology adaptation algorithm to ensure that high capacity
peers are the ones with high degree and low capacity peers are within short reach of
high capacity peers. It addresses a different matching problem in overlay networks.
To tackle topology mismatch, Minimum Spanning Tree (MST) based approaches are used in [95, 96]. In these approaches, peers build an overlay MST among the source peer and its neighbors within a certain number of hops, and then optimize connections that are not on the tree. An early attempt at alleviating topology mismatch is Location-aware Topology Matching (LTM) [95], in which each peer issues a detector message in a small region so that the peers receiving the detector can record relative delay information. Based on this delay information, a receiver can detect and cut most of the inefficient and redundant logical links, as well as add closer peers as direct neighbors. The major drawback of LTM is that it needs to synchronize all peering peers and thus requires the support of NTP [97], which is a critical limitation.
In [92] the authors discuss the relationship between message duplication over overlay connections and the number of overlay links. The authors proposed Two Hop Neighbor Comparison and Selection (THANCS) to optimize the overlay network [98], which may change the overlay topology; such a change may not be acceptable in many cases.
The above approaches consider only the overlay network for optimization and ignore problems at the underlay level. The network search space should not be altered, as that changes the overlay structure of the network. Thus, a methodology is required that reduces network traffic at the underlay level without affecting the overlay topology, making the system fast and scalable.

2.9 Replication for Availability [99, 100, 101, 102]


Replication is one of the most important resiliency strategies and has been used to
increase the reliability of services and the availability of data in distributed systems
[103, 104]. By providing multiple identical instances of the same data at different
locations, the data can still be available when part of the system fails or goes offline.


Replication for availability is an especially important principle in end-system-based P2P networks, where peer failures and loss of access happen frequently. The availability of shared data in a P2P system can be improved from 24% to 99.8% with 6 times excess storage for replication [104]. With the help of erasure coding, data can be highly available even when only a small subset of peers is online. Similarly, [105, 106, 107] implement a scalable, distributed file system that logically functions as a centralized file server but is physically built across a set of client desktop computers. The system monitors machine availability and places replicas of files on multiple client desktop machines to maximize effective system availability using different replication algorithms.
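The availability gain from replication can be estimated to first order by assuming peers fail independently: if each peer is online with probability p, at least one of k replicas is reachable with probability 1 - (1 - p)^k. The figures below are illustrative; the 24%-to-99.8% result cited above comes from measured traces and replica placement, not from this simple independence model.

```python
def availability(p_online, k_replicas):
    """Probability that at least one of k independent replicas is reachable."""
    return 1 - (1 - p_online) ** k_replicas

# Even with peers online only 24% of the time, adding replicas pushes
# availability steadily toward 1 (independence assumed, figures illustrative).
for k in (1, 3, 6, 12):
    print(k, round(availability(0.24, k), 3))
```

The diminishing returns visible here are one reason erasure coding, which spreads the same redundancy over more fragments, is attractive in this setting.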
One of the most important problems in replication systems is replica placement. Choosing the right replica placement approach is a non-trivial and non-intuitive exercise. Replica placement techniques include both passive caching [108, 109] and proactive replication [110], and both centralized mechanisms [111] and distributed methods [110].
All the above replication methods are concerned only with replicating data items, but in P2P networks data replication alone is not sufficient: a high probability of accessing updated data from the replicas is required, and data consistency must be maintained among all copies. Some protocols use a large number of replicas in the system, which chokes the network with heavy traffic in a P2P setting. Thus, an efficient logical structure needs to be established that can place a limited number of replicas (as required by the system) at appropriate places and provide a high probability of accessing updated data items in the P2P environment.

2.10 Quorum Consensus


Data replication is a technique to improve the performance of Distributed Database
Systems (DDBS) [11, 112, 113, 114, 115] and make the system fault tolerant [41,
116, 117, 118, 119, 120]. Replication improves the system performance by reducing
latency, increasing throughput, and increasing availability. Moreover, data replication is a basic requirement for DDBSs [178] deployed on networks that are dynamic in nature, for example P2P systems [121]. The churn rate of peers is observed to be high in P2P networks [122, 123, 124, 125]. In such a highly dynamic environment, the probability of accessing stale data from the replicas is higher than in a static environment where peers do not leave the system. Several protocols have been developed to solve the problem of accessing updated data items from replicas in dynamic environments; examples include the single lock, distributed lock, primary copy, majority [126], biased, and quorum consensus protocols [128, 129, 130, 131]. These protocols keep data consistent and provide access to updated data items [124] using the multiple replicas maintained in the distributed system.
A group of replicas is accessed to obtain updated data items; such a group is generally known as a quorum [122] and, depending on the operation, is called a read quorum or a write quorum. To obtain the updated data item, read-write quorums and two consecutive write-write quorums must intersect. The intersection is the set of replicas common to the read-write and two consecutive write-write quorums; it ensures that a read quorum always obtains updated data from the system, which can then be propagated to all other replicas. The degree of intersection of two quorums determines how resilient the system is to the churn rate of the peers.
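The intersection requirements above reduce to two size constraints over n replicas: a read quorum of size r must overlap every write quorum of size w (r + w > n), and any two write quorums must overlap (2w > n). A minimal check, with illustrative sizes:

```python
def quorums_valid(n, r, w):
    """Check the quorum-consensus intersection conditions for n replicas:
    read-write overlap (r + w > n) and write-write overlap (2w > n)."""
    return r + w > n and 2 * w > n

print(quorums_valid(5, 3, 3))  # True: classic majority quorums
print(quorums_valid(5, 2, 3))  # False: a read quorum may miss the last write
print(quorums_valid(5, 1, 5))  # True: ROWA as the extreme case (read one, write all)
```

Note how ROWA appears as the boundary case r = 1, w = n, trading minimum read cost for maximum write cost, exactly as discussed below.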
In the literature, many replication protocols have been suggested, including replica management protocols over a binary balanced tree [63, 132]. The simplest replication protocol is Read One Write All (ROWA) [133]. This protocol suits static networks with fixed, dedicated replication servers: it has the minimum read cost among these protocols and is highly fault tolerant, but it has the maximum communication cost for the write operation, and this cost grows with the number of replicas. In a dynamic system, updating all replicas creates the problem of unbounded waiting. A variation of this technique, Read One Write All Available (ROWAA), performs the write on all currently available replicas, which improves data availability in dynamic environments [134]. A Read-Few Write-Many approach is presented in [136].
The Dynamic Voting protocol [135] and the Majority Consensus protocol [126] perform better than ROWAA in dynamic environments. In both protocols, replicas are accessed in groups. These protocols have good read and write availability but suffer from high read cost, and they have long search times because the replicas are stored randomly in the network.


Rather than storing replicas randomly, logical structures [137, 138, 139, 140] have been proposed for storing replicas over a dynamic network. These reduce the time to search for replicas when forming a quorum and reduce the communication cost. The Multi Level Voting protocol, Adaptive Voting [141], Weighted Voting, the Grid protocol [142], and the Tree Quorum protocol [143] are such replication protocols, each with a different operational process. The Multi Level Voting protocol is based on the Hierarchical Quorum Consensus (HQC) strategy. HQC [132, 144, 145, 146] is a generalization of the majority scheme: in its tree structure, replicas are located only at the leaves, whereas the non-leaf peers of the tree act as logical replicas, which in a way summarize the state of their descendants. The advantage of the tree structure is that it reduces the time to find replicas compared with a random structure. HQC+ [147] is a generalization of other protocols that uses a grid logical structure to form quorums. The tree structure also reduces the number of messages needed to find replicas and hence the network traffic generated in the system. A disadvantage of the Tree Quorum protocol is that the number of replicas grows rapidly with the tree level. In the Adaptive Voting and Weighted Voting protocols, the formed quorum must satisfy two conditions: (a) write and read quorums are always made up of more than half of the replicas, and (b) write and read quorums must intersect with each other. The disadvantage of these protocols is that the quorum size grows with the number of replicas, so the network overhead automatically increases in the system.
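The recursive-majority idea behind HQC can be sketched as follows. The 3x3 tree shape and replica names are illustrative assumptions; this sketch always takes the first majority of children, whereas a real protocol would assemble any available majority.

```python
def hqc_quorum(node):
    """Return a set of leaf replicas forming a hierarchical quorum:
    a quorum at a logical (non-leaf) node is a quorum at a majority
    of its children; a leaf is a physical replica."""
    if isinstance(node, str):          # leaf: a physical replica id
        return {node}
    majority = len(node) // 2 + 1
    quorum = set()
    for child in node[:majority]:      # illustrative: first majority of children
        quorum |= hqc_quorum(child)
    return quorum

# 9 replicas arranged in a 3x3 tree: a hierarchical quorum needs only
# 2 x 2 = 4 replicas, versus 5 for a flat majority over all 9.
tree = [["r1", "r2", "r3"], ["r4", "r5", "r6"], ["r7", "r8", "r9"]]
print(sorted(hqc_quorum(tree)))
```

The smaller quorum size (4 instead of 5 here) is exactly the generalization of the majority scheme that makes HQC attractive, at the price of maintaining the logical tree.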
Bandwidth Hierarchy Replication (BHR), proposed in [148], reduces data access time by avoiding network congestion in a data grid network. In [149] the authors propose a BHR algorithm using a three-level hierarchical structure, addressing both scheduling and replication problems. Two replication algorithms for multi-tier data grids, Simple Bottom Up (SBU) and Aggregate Bottom Up (ABU), are proposed in [150]. These algorithms minimize data access time and network load; in them, replicas of the data are created and spread from the root center to regional centers, or even to national centers. These strategies are applicable only to multi-tier grids. The strategy proposed in [151] creates replicas automatically in a generic decentralized P2P network; the goal of the proposed model is to maintain replica availability with some probabilistic measure. Various replication strategies are discussed in [152], all tested on a hierarchical grid architecture. A different cost model is proposed in [149] to decide on dynamic replication: it weighs the data access gains from creating a replica against the costs of creating and maintaining it. Probabilistic Quorum Systems are presented in [153].
There are several challenges in updating and accessing replicated data items over a dynamic network such as a P2P network. Data consistency [134], the degree of intersection between two consecutive quorums, the time to search for replicas, and fault tolerance are some of the identified problems. New proposals are needed for the dynamic environment of P2P systems that provide low search time, low network traffic, fast recovery from faults, a high degree of quorum intersection, and access to updated data.

2.11 Databases
Database systems are designed to manage large bodies of information. The management of data involves both the definition of structures for the storage of information and the provision of mechanisms for its manipulation. Thus, a database is a collection of objects that satisfy a set of integrity constraints [82, 154].
Centralized Database Systems: these run on a single computer system and do not interact with other computer systems. Such systems span a range from single-user database systems running on personal computers to high-performance database systems running on mainframes [154].
Distributed Database Systems [155]: a distributed database system consists of a collection of sites connected by a communications network. Each site is a database system in its own right, but the sites have agreed to work together so that a user at any site can access data anywhere in the network exactly as if the data were all stored at the user's own site. The distributed database system can thus be regarded as a kind of partnership among the individual local DBMSs at the individual local sites; a new software component at each site, logically an extension of the local DBMS, provides the necessary partnership functions, and it is the combination of this new component with the existing DBMS that constitutes what is usually called the distributed database management system.


Real Time Database (RTDB) Systems [156, 157]: an RTDB system can be viewed as an amalgamation of a conventional Database Management System (DBMS) and a real-time system. Like a DBMS, it has to process transactions and guarantee basic correctness criteria. Furthermore, it has to operate in real time, satisfying timing constraints imposed on transaction commitments and on the temporal validity of data [158].
When the programs that users employ to interact with the database are executed, they generate partially ordered sets of read and write operations; such a set of operations is called a transaction. A transaction is an atomic unit of work, which is either completed in its entirety or not at all. It terminates by executing either a commit or an abort operation. A commit operation implies that the transaction was successful, and hence all its updates should be incorporated into the database permanently. An abort operation indicates that the transaction has failed, requiring the database management system to cancel all its effects on the database. In short, a transaction is an "all or nothing" unit of execution. A transaction that updates the objects of the database must preserve the integrity constraints of the database [159, 160, 161, 162]. For example, in a bank, an integrity constraint can state that an account cannot have a negative balance. The transfer of money from one account to another, reservation of train tickets, filing of tax returns, and entering marks on a student's grade sheet are all examples of transactions.
Many distributed real-time database applications store their data across various sites connected by a communication network. A single transaction may need to process data at several of these sites within a specified period of time. The difficulty is that, because the data are dispersed, the transaction has to execute at various sites in a timely fashion. In such a distributed environment, the transaction could decide to commit at some sites while deciding to abort at others, violating transaction atomicity. To overcome this problem, distributed database systems use a transaction commit protocol, which ensures the uniform commitment of the distributed transaction; that is, it ensures that all the participating sites agree on the final outcome (commit or abort) of the transaction. Most importantly, this guarantee holds even in the presence of site or network failures.
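The core of such a commit protocol, in its two-phase form, can be sketched as follows. Sites are modeled as plain objects with hypothetical `prepare`/`finish` methods; a real protocol additionally forces log records and handles site and network failures and timeouts, which this sketch omits.

```python
def two_phase_commit(participants):
    """Uniform commitment: commit only if every site votes yes in phase 1."""
    # Phase 1 (voting): ask every site to prepare; any "no" aborts globally.
    votes = [p.prepare() for p in participants]
    decision = "commit" if all(votes) else "abort"
    # Phase 2 (decision): propagate the same outcome to every site.
    for p in participants:
        p.finish(decision)
    return decision

class Site:
    """Illustrative participant that votes according to a preset flag."""
    def __init__(self, can_commit):
        self.can_commit = can_commit
        self.state = "active"
    def prepare(self):
        return self.can_commit
    def finish(self, decision):
        self.state = decision

sites = [Site(True), Site(True), Site(False)]
print(two_phase_commit(sites))  # one site could not prepare, so all abort
```

The multiple message rounds and forced log writes mentioned next are exactly what make this coordination expensive in practice.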
Over the last two decades, database researchers have proposed a variety of distributed transaction commit protocols. To achieve their functionality, these commit protocols typically require the exchange of multiple messages, in multiple phases, between the participating sites (where the distributed transaction executes). In addition, several log records are generated, some of which have to be "forced", that is, flushed to disk immediately in a synchronous manner. Because of these costs, commit processing can significantly increase transaction execution times, making the choice of commit protocol an important design decision for distributed database systems [163]. The commit protocols used in conventional database systems cannot be used directly in real-time database systems: conventional commit protocols do not take the real-time nature of transactions into consideration, so they need modifications to cater to the specific requirements of real-time transactions.

2.11.1 Real Time Applications Framework


Real-time applications can be classified into the following three categories, based on how the application is affected by violation of a task's completion deadline [164].
Hard Deadline Real Time Applications: In these applications, the consequences of missing the deadline of even a single task could be catastrophic. Life-critical applications such as flight control systems or missile guidance systems belong to this category. Database systems that efficiently support hard-deadline real-time applications, where all transaction deadlines have to be met, appear infeasible because of the large variance between the average-case and worst-case execution times of a typical database transaction. This variance arises from transactions interacting with the operating system, the I/O subsystem, and each other in unpredictable ways. Guaranteeing completion of all transactions within their deadlines under such circumstances requires an enormous excess of resource capacity to account for the worst possible combination of concurrently executing transactions.
Soft Deadline Real Time Applications: In these applications, tasks are associated with deadlines, but even if a task fails to complete within its deadline, it is allowed to execute to completion. Generally, in these systems a "value function" assigns a value to each task; this value remains constant up to the deadline but starts decreasing after it. Among the questions to be addressed in these applications is how to identify the proper value function, which may be application dependent.
Firm Deadline Real Time Applications: These differ from soft-deadline applications in that tasks which miss their deadline are considered worthless (and may even be harmful if executed to completion) and are discarded from the system immediately. The emphasis is thus on the number of tasks that complete within their deadlines. Our interest in RTDB systems is in applications in the firm-deadline real-time domain [164]. We believe that understanding firm-deadline RTDB systems will provide the necessary insight into RTDB technology for addressing the more complex framework of soft-deadline applications. Therefore, we have carried out our work from the perspective of a firm-deadline real-time database system [163, 154, 158].
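The three deadline classes can be contrasted through illustrative value functions for completing a task at time t with deadline d. The exact shapes (e.g., the linear decay rate for the soft case) are assumptions chosen purely for illustration.

```python
def hard_value(t, d):
    """Hard deadline: missing it is catastrophic, modeled as unbounded loss."""
    return 1.0 if t <= d else float("-inf")

def soft_value(t, d, decay=0.1):
    """Soft deadline: value stays constant up to d, then decays (rate assumed)."""
    return 1.0 if t <= d else max(0.0, 1.0 - decay * (t - d))

def firm_value(t, d):
    """Firm deadline: a late result is simply worthless and is discarded."""
    return 1.0 if t <= d else 0.0
```

Under `firm_value` the natural performance metric is the fraction of tasks completing with value 1.0, i.e. within their deadlines, which is exactly the emphasis described above.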

2.12 Some Middlewares


The P2P architecture is a way of structuring a distributed application so that it consists of many identical software modules, each running on a different computer. The modules communicate with each other to complete the processing required by the distributed application. One could view the P2P architecture as placing both a server module and a client module on each computer, so that each computer can obtain services from the software modules on other computers as well as provide services to them. Each computer needs to know the network addresses of the other computers running the distributed application, or at least of the subset of computers with which it may need to communicate, and propagating changes to the software modules on all the different computers is much harder than in a centralized setting. However, the combined processing power of several large computers can easily surpass that of even the best single computer, so the P2P architecture can result in much more scalable applications.
Napster [12, 19, 20]: Napster is a simple centralized system. We present it here as the simplest model (which was very successful socially) against which to contrast the other systems. It uses a centralized server to create its own flat namespace of host addresses. At startup, a client contacts the central server and reports a list of the files it holds. When the server receives a query from a user, it searches for matches in its index and returns a list of users that hold the matching file. The user then connects directly to the peer that holds the requested file and downloads it, as shown in Figure 2.7. The problems of a centralized server include a single point of failure. Napster does not replicate data; it uses "keepalives" to make sure that its directories are current.
Maintaining a unified view is computationally expensive in Napster, and the design does not scale. The focus of Napster as a music sharing system, in which users must be active in order to participate, made it exceedingly popular. Napster does not use resource sharing, but it does use distributed file management. Regarding routing, it is simply a centralized directory system using Napster servers. The main advantage of Napster and similar systems is that they are simple and locate files quickly and efficiently. The main disadvantage is that such centralized systems are vulnerable to malicious attack and technical failure. Furthermore, these systems are inherently not very scalable, as there are bound to be limits on the size of the server database and its capacity to respond to queries. The system is not reliable, being prone to single-point failure and easily attacked by denial of service. Napster provides communication-level fault tolerance, as any packet dropped due to congestion can be retransmitted, and it provides communication-level security, but it supports neither system-level nor application-level security. The performance of Napster is good under normal load, but it falls sharply when the server is overloaded: the response time increases when the number of peers and requests exceeds the capability of the server.

Figure 2.7 The Architecture of Napster


Gnutella [18, 5]: The Gnutella network originated as a project at Nullsoft, a subsidiary of America Online. Gnutella is one of the earliest completely decentralized P2P file sharing systems. The general architecture of Gnutella is given in Figure 2.8. Like most P2P systems, Gnutella builds, at the application level, a virtual overlay network with its own routing mechanisms. In Gnutella, each peer is identified by its IP address and connected to some other peers. All communication is done over the TCP/IP protocol. To join the network, a new peer needs to know the IP address of one peer that is already in the system. It first broadcasts a join message via that peer to the whole system. Each peer then responds to indicate its IP address, how many files it is sharing, and how much space those files take up; so, on connecting, the new peer immediately knows how much is available on the network to search through. Gnutella uses the file name as the key. To search for a file in such an unstructured system, random searches are the only option, since the peers have no way of guessing where the file may lie. Each peer handles the search query in its own way. To save bandwidth, a peer does not have to respond to a query if it has no matching items, and it may return only a limited result set. After the client peer receives responses from other peers, it uses HTTP to download the files it wants.
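The blind search used by such unstructured systems amounts to flooding a query to all neighbours until its time-to-live (TTL) expires. A minimal sketch, with the overlay modeled as an adjacency dict; the topology, file lists, and function name are illustrative assumptions, not Gnutella's wire protocol.

```python
def flood_search(overlay, files, start, key, ttl):
    """Return the set of peers holding `key` that are reached within `ttl` hops."""
    hits, visited, frontier = set(), {start}, [start]
    for _ in range(ttl):
        next_frontier = []
        for peer in frontier:
            if key in files.get(peer, ()):      # peer answers only on a match
                hits.add(peer)
            for nbr in overlay.get(peer, ()):   # forward to unvisited neighbours
                if nbr not in visited:
                    visited.add(nbr)
                    next_frontier.append(nbr)
        frontier = next_frontier
    return hits

overlay = {"A": ["B", "C"], "B": ["A", "D"], "C": ["A"], "D": ["B"]}
files = {"D": ["song.mp3"], "C": ["song.mp3"]}
print(flood_search(overlay, files, "A", "song.mp3", ttl=2))
```

Raising the TTL widens the search horizon (here, `ttl=3` would also reach peer D) but multiplies the message traffic, which is the O(N) search cost discussed next.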
Gnutella is completely decentralized, but the peers are organized loosely, so the costs of peer joining and searching are O(N), which means that Gnutella cannot grow to a very large scale. It is more reliable than Napster, as there is no single point of failure: objects are replicated in proportion to the square root of their query rate, node failures can be detected by neighbors, and multiple paths exist to reach a peer. Gnutella provides similar functions to Napster; it does not provide resource sharing but uses distributed file management. Gnutella provides fault tolerance at the system level, as a process can be recovered through multi-point execution, and data replication is also provided. It also provides fault tolerance at the communication level, since dropped packets may be recovered by retransmission, but channel-level tolerance is not supported. Gnutella does not support security at any level (system, communication, or application); threats include flooding, malicious content, virus spread, and attacks on queries. Its scalability is a little better than Napster's, but Gnutella cannot grow beyond a limit, as the performance of the system drops sharply as the traffic on the network grows. Furthermore, the response time is greater in Gnutella.

Figure 2.8 The Architecture of Gnutella

Freenet [79]: Freenet is a purely decentralized unstructured system operating as a self-organizing P2P network (see Figure 2.9). It essentially pools unused disk space to create a collaborative virtual file system providing both security and publisher anonymity. Freenet provides a file storage service, rather than the file sharing service provided by Gnutella: in Freenet, files are pushed to other peers for storage, replication, and persistence. Freenet peers maintain their own local data store, which they make available to the network for reading and writing, as well as a dynamic routing table containing the addresses of other peers and the keys they are thought to hold.
Files in Freenet are identified by binary keys. There are three types of keys: keyword-signed keys, signed-subspace keys, and content-hash keys. To search for a file, the user sends a request message specifying the key and a timeout (hops-to-live) value. Joining the Freenet network is simply a matter of discovering the address of one or more existing peers and then starting to send messages.
In order to insert new files to the network, the user must first calculate a binary
file key for it, and then send an insert message to its own peer specifying the proposed
key and a hop to live value. When a peer receives the insert message, it first checks to
see if the key is already taken. If the key is found to be taken, the peer returns the pre
existing file as if a request were made for it. If the keys not found, the peer looks up
the nearest key in its routing table, and forwards the insert message to the
corresponding peer. If the hop to live limit is reached without any key collision, an all
clear result will be propagated back to the original inserter, informing that the insert
was unsuccessful. In the basic model, the request for keys is passed along from peer
to peer through a chain of requests in which each peer makes a local decision about

42

where to send the request next. In this there is no direct connection between requester
and actual data source, anonymity is maintained, and the owners of files cached
cannot be held responsible for the content of their caches (file encryption with
original text names as key is a further measure that is taken).Fig shown the discovery
mechanism in Freenet. Freenet support multi path searching and faulty peer can be
detected by the neighbor peers so Freenet is reliable in nature. Freenet uses the file
storing rather then file sharing. Load balancing, resource sharing is not supported by
Freenet, also it does not support fault tolerance and security at any level. Performance
and scalability of Freenet is not good.

Figure 2.9 The Freenet Chain Mode files discovery mechanism. The query is forwarded from
peer to peer using the routing table, until it reaches the peer which has the requested data. The
reply is passed back to the original peer following the reverse path.
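The chain-mode request of Figure 2.9 can be sketched as a depth-first walk toward the numerically closest known key, backtracking on failure and caching on the reverse path. This is an illustrative Python sketch under simplified assumptions (integer keys, one dict per peer), not Freenet's real protocol or key scheme.

```python
def freenet_request(peers, start, key, htl):
    """Freenet-style chain request: each peer forwards to the neighbour whose
    known key is closest to the requested key, backtracking on failure.

    peers: dict peer -> {'store': {key: data}, 'routing': {key: neighbour}}
    Returns the data, or None when hops-to-live (htl) is exhausted.
    """
    visited = set()

    def visit(peer, htl):
        if key in peers[peer]['store']:
            return peers[peer]['store'][key]
        if htl == 0:
            return None
        visited.add(peer)
        # try neighbours in order of closeness of their known keys to `key`
        candidates = sorted(peers[peer]['routing'].items(),
                            key=lambda kv: abs(kv[0] - key))
        for _, nb in candidates:
            if nb in visited:
                continue
            data = visit(nb, htl - 1)
            if data is not None:
                # cache on the reverse path: popular data gets replicated
                peers[peer]['store'][key] = data
                return data
        return None

    return visit(start, htl)
```

A successful request leaves a copy of the file at every peer on the reply path, which is how Freenet replicates popular data toward requesters.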

TRIAD [165]: TRIAD is not a comprehensive P2P system, but a solution to the
problem of content based routing. Its goal is to reduce the time needed to access
content. Although it is focused on the performance problem, it also offers
improvements in other traits. The core idea in TRIAD is network integrated content
routing. It is an intermediary system between a centralized model and a fully
decentralized model because it relies upon replicated servers, so a client can go
through any of a variety of servers to reach content, as long as each server hosts the
content. The content routers are integrated into the system and act as both IP routers
and name servers. The main idea is that the content routers hold name-to-next-hop
information, so that all routing is done through adjacent servers and each step is on
the path to the data, avoiding some of the back and forth calling of a traditional DNS
(Domain Name Server). TRIAD also explores piggybacking connection setup on the
name lookup, so that immediately upon locating the data the connection is already
established. Reliability is increased because the system topology is structured so that
there are multiple paths to content. TRIAD increases performance by proposing its
name based content routing as a topological enhancement. This reduces a lot of the
overhead of a DNS based system. Its protocols make the system easier to maintain by
using routing aggregates instead of a large number of individual names. The core
ideas in TRIAD relate to P2P because, in such a system, end users' machines can act
as content routers or servers, or both. At minimum this system could replace the
centralized servers of a Napster type system. TRIAD supports a distributed file
management system but does not support resource sharing or load balancing. TRIAD
does not support fault tolerance or security at any level. TRIAD has good scalability.
Pastry: Pastry [55] is a generic P2P content location and routing system based on a
self organizing overlay network of peers connected via the Internet. It is completely
decentralized, scalable and fault resilient, and reliably routes a message to the live
peer whose peerId is numerically closest to the key carried by that message; it
automatically adapts to the arrival, departure and failure of peers.
Each peer in the Pastry P2P overlay network has a unique 128-bit peerId, assigned
randomly when the peer joins the system by computing a cryptographic hash of the
peer's public key or its IP address. With this naming mechanism, Pastry makes the
important assumption that peerIds are generated such that the resulting set of peerIds
is uniformly distributed in the peerId space. Each data item also has a 128-bit key.
This key can be the original key, or generated by a hash function. The data is stored in
the peer whose id is numerically closest to the key.
Each Pastry peer maintains a routing table, a neighborhood set and a leaf set. The
neighborhood set contains the peerIds and IP addresses of the peers closest to the
present peer. The leaf set contains the peerIds and IP addresses of the half of the peers
with numerically closest larger peerIds and the half with numerically closest smaller
peerIds, relative to the present peer's peerId. Given a message, the peer first checks
whether the key falls within the range of peerIds covered by its leaf set. If so, the
message is forwarded directly to the destination peer, namely the peer in the leaf set
whose peerId is closest to the key. If the key is not covered by the leaf set, the routing
table is used and the message is forwarded to a peer that shares a common prefix with
the key by at least one more digit. In certain cases the appropriate entry in the routing
table is empty, or the associated peer is not reachable; the message is then forwarded
to a peer whose peerId shares a prefix with the key at least as long as the present
peer's, and is numerically closer to the key than the present peer's peerId. Such a peer
must exist in the leaf set unless the message has already arrived at the peer with the
numerically closest peerId.
Pastry supports dynamic data object insertion and deletion, but does not explicitly
support mobile objects. Pastry is reliable due to multi-path search and replication of
data objects, and it supports dynamic peer join and departure. Pastry supports
distributed file management and load balancing. It also supports communication level
fault tolerance by maintaining the routing tables and a neighborhood set. Security at
the communication level is supported, as hash functions and cryptography are used in
communication. Pastry has good performance due to its content location mechanism,
and is scalable due to self organization.
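A single Pastry routing step (leaf set first, then routing table, then the numerically-closer fallback) can be sketched as follows. PeerIds are short decimal strings here for readability; the leaf set and routing table shapes are simplified assumptions rather than Pastry's exact data structures.

```python
def shared_prefix_len(a, b):
    """Number of leading digits two id strings have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n

def pastry_next_hop(my_id, key, leaf_set, routing_table):
    """One Pastry routing step; ids are equal-length decimal digit strings.

    leaf_set: list of nearby peerIds; routing_table[l][d] holds a peer
    sharing l digits with my_id and having digit d at position l (or None).
    """
    if key == my_id:
        return my_id
    # 1. Key inside the leaf set range: deliver to the numerically closest.
    if leaf_set and min(leaf_set) <= key <= max(leaf_set):
        return min(leaf_set + [my_id], key=lambda p: abs(int(p) - int(key)))
    # 2. Routing table entry sharing one more digit with the key.
    l = shared_prefix_len(my_id, key)
    entry = routing_table[l][int(key[l])]
    if entry is not None:
        return entry
    # 3. Rare case: any known peer at least as close by prefix and
    #    numerically closer to the key than the present peer.
    for p in leaf_set:
        if (shared_prefix_len(p, key) >= l
                and abs(int(p) - int(key)) < abs(int(my_id) - int(key))):
            return p
    return my_id   # we are the closest peer we know of
```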
Tapestry: Tapestry [55, 47] is an overlay infrastructure designed as the routing and
location layer of OceanStore [32]. Tapestry's mechanisms are modeled after the
Plaxton scheme. Tapestry provides adaptability, fault tolerance against multiple
faults, and introspective optimizations. In Tapestry, each peer has a neighbor map,
which is organized into routing levels; each level contains entries that point to a set of
peers closest in network distance that match the suffix for that level. Each peer also
maintains a back pointer list that points to the peers where it is referred to as a
neighbor. These are used in the peer integration algorithm to generate neighbor maps
for a peer and to integrate it into Tapestry. Tapestry uses a distributed algorithm,
called surrogate routing, to incrementally compute a unique root peer for an object.
Moreover, each object gets multiple root peers: a small, globally constant sequence of
salt values is concatenated to the object ID, and each result is hashed to identify the
appropriate roots. The appropriate root searching is shown in Figure 2.10.
When locating an object, Tapestry performs the hashing process with the target
object ID, generating a set of roots to search. Tapestry stores the locations of all such
replicas to increase semantic flexibility. Only small modifications to the routing
mechanism are needed to improve fault tolerance; e.g., when bad links are
encountered, routing can be continued by jumping to a random neighbor peer.
Tapestry sends publish and delete messages to multiple roots, and provides
explicit support for mobile objects. Node insertion is easily implemented through
populating neighbor maps and neighbor notification; node deletion is less trivial. It is
worth noticing that Tapestry provides two introspective mechanisms that allow it to
adapt to environmental changes. First, in order to adapt to changes in network
distance and connectivity, Tapestry peers tune their neighbor pointers by running a
refresher thread which uses network pings to update the network latency to each
neighbor. Second, Tapestry presents an algorithm that detects query hotspots and
suggests locations where additional copies can significantly improve query response
time. Tapestry is reliable in nature, as it supports multi-path searching, a failed-peer
detection mechanism and data replication. It does not support resource sharing, but
databases are shared between peers. It supports a distributed file management system
and a load balancing mechanism. Tapestry does not support security at any level. The
performance of Tapestry is good due to reduced searching time (additional copies at
hotspots), and it has good scalability due to its neighbor-map population and neighbor
notification techniques.

67493
XXXX7

64267

XXXXX

98747

XX567

XXX67

45567

XXXXX

X4567

64567
34567

34567

XXXXX
XXXXX

Figure 2.10 The path taken by a message originating from peer 67493 destined for peer
34567 in a Plaxton mesh using decimal digits of length 5 in Tapestry.
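The salted multi-root scheme described above can be sketched as follows. This is a hedged illustration: SHA-1, the particular salt values and the closest-ID mapping stand in for Tapestry's actual hash and surrogate routing, which walks the Plaxton mesh digit by digit.

```python
import hashlib

def tapestry_roots(object_id, peers, salts=(b'\x01', b'\x02', b'\x03')):
    """Sketch of Tapestry's multi-root idea: concatenate a small, globally
    constant sequence of salt values to the object ID, hash each result, and
    map it to the peer with the numerically closest ID (a stand-in for
    surrogate routing).

    object_id: bytes; peers: list of integer peer IDs in a 16-bit space.
    """
    roots = []
    for salt in salts:
        digest = hashlib.sha1(object_id + salt).digest()
        point = int.from_bytes(digest[:4], 'big') % (2 ** 16)
        root = min(peers, key=lambda p: abs(p - point))
        roots.append(root)
    return roots
```

Because the salt sequence is globally constant, every peer derives the same set of roots for a given object, so publish and delete messages can be sent to all of them.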

Chord: Chord [166] is a distributed lookup protocol designed at MIT (see Figure
2.11). It supports fast data locating and peer joining/leaving. Each machine is
assigned an m-bit peerID, obtained by hashing its IP address. Each data record (K, V)
has a unique key K; it is also assigned an m-bit ID by hashing the key, P = hash(K).
This ID indicates the location of the data.
All of the N = 2^m possible peerIDs are ordered on a one dimensional circle, and
the machines are mapped to this virtual circle according to their peerIDs. For each
peerID, the first physical machine on its clockwise side is called its successor peer, or
succ(peerID). Each data record (K, V) has an identifier P = hash(K), which indicates
its virtual position on the circle. The data record (K, V) is stored in the first physical
machine clockwise from P, as shown in Figure 2.11; this machine is called the
successor peer of P, or succ(P). To route efficiently, each machine holds part of the
mapping information. In the view of each physical machine, the virtual circle is
partitioned into log N segments of lengths 1, 2, 4, ..., N/2. The machine maintains a
table with log N entries; each entry contains the boundaries of one segment and the
successor of its first virtual peerID. In this way, each machine needs only O(log N)
memory to maintain the topology information, and this information is sufficient for
fast locating/routing.
On a query for a record with key K, the virtual position is first calculated:
P = hash(K). The locating can start from any physical machine. Using the mapping
table, the successor of the segment that contains P is selected as the next router, until
P lies between the start of the segment and the successor (which means the successor
is also P's successor, i.e., the target). The distance between the target and the current
machine at least halves after each hop, so the routing time is O(log N).
For high availability, the data can be replicated using multiple hash functions; the
data can also be replicated at the r machines succeeding its data ID. Chord also
supports a failed-peer detection mechanism, hence the system is reliable. The time
taken by each operation is O(log N). In Chord, machines can join and leave at any
time. For normal peer arrival and departure, the cost is O(log^2 N) with high
probability, but in the worst case the cost is O(N). A peer failure can also be detected
and recovered from automatically if each peer maintains a successor list of its r
nearest successors on the Chord ring. Chord is reliable, as it supports a failed-peer
detection mechanism and data replication. It supports a distributed file management
system, but does not support resource sharing. No security is provided at any level.
The performance of Chord is good due to fast location of objects (a distributed hash
table is used for the purpose) and replication of objects using multiple hash functions.
Chord has good scalability due to its distributed lookup protocol, which supports peer
joining/leaving.

Figure 2.11 Chord identifier circle consisting of the three peers 0, 1 and 3. In this figure, key 1
is located at peer 1, key 2 at peer 3 and key 6 at peer 0.
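The lookup walk described above can be sketched on the ring of Figure 2.11. This is a minimal illustration: the `ChordNode` class and the hand-wired finger tables are assumptions made for the example, and real Chord additionally runs stabilization and maintains successor lists.

```python
class ChordNode:
    """Minimal Chord sketch: m-bit ring, finger[i] is the successor of
    (id + 2**i) mod 2**m. Lookup halves the remaining distance per hop,
    giving O(log N) routing."""
    def __init__(self, node_id, m):
        self.id, self.m = node_id, m
        self.finger = [None] * m   # wired up by the network builder

def in_interval(x, a, b, size):
    """True if x lies in the half-open ring interval (a, b]."""
    return (x - a) % size <= (b - a) % size and x != a

def chord_lookup(node, key, size):
    """Route from `node` toward succ(key); returns (successor, hop count)."""
    hops = 0
    # stop when the key falls between us and our immediate successor
    while not in_interval(key, node.id, node.finger[0].id, size):
        nxt = node
        # closest preceding finger: farthest finger not overshooting the key
        for f in reversed(node.finger):
            if in_interval(f.id, node.id, key, size):
                nxt = f
                break
        if nxt is node:        # no progress possible: deliver here
            break
        node, hops = nxt, hops + 1
    return node.finger[0], hops
```

On the three-peer, 3-bit ring of Figure 2.11, a lookup for key 6 started at peer 1 forwards once (to peer 3) and resolves at peer 3's successor, peer 0, matching the figure.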

CAN: The Content Addressable Network (CAN) [56, 121] is a distributed hash based
infrastructure that provides fast lookup functionality at Internet-like scales. In CAN,
the machines are addressed by their IP addresses. Each data record has a unique key
K. A hash function assigns a d-dimensional vector P = hash(K) to each key, which
corresponds to a point in d-dimensional space. In CAN, this point indicates the virtual
position of the data. CAN maintains a d-dimensional virtual space on a d-torus. The
virtual space is partitioned into many small d-dimensional zones. Each physical
machine corresponds to one zone and stores the data that are mapped to this zone by
the hash function. Zones are divided between a newly joined peer and the previous
peer, as shown in Figures 2.12(a) and 2.12(b). In the d-dimensional space, two peers
are neighbors if their coordinate spans overlap along d-1 dimensions and abut along
one dimension. Each machine knows the zones and IP addresses of its neighbors.
For a given key, the virtual position is calculated; then, starting from any physical
machine, the query message is passed through the neighbors until it finds the IP
address of the target machine. In a d-dimensional space, each peer maintains 2d
neighbors (at most 4d, in fact).
CAN supports data insertion and deletion in (d/4)(n^{1/d}) hops. In CAN, a
machine can also copy its data to one or more of its neighbors. This is very useful for
load balancing and fault tolerance, and peer failure can be detected and recovered
from automatically. CAN also supports replication, which is why CAN is reliable in
nature. CAN supports distributed file management and load balancing, but does not
support resource sharing. It offers good fault tolerance at the system and
communication levels, as a machine can copy its contents to one or more of its
neighbors. No security is provided at any level. The performance of CAN is good due
to its distributed hash based infrastructure, which provides fast lookup of contents.
Due to topology updating, CAN supports dynamic machine joining and leaving: the
average cost for machine joining is (d/4)(n^{1/d}) hops, and for machine leaving and
failure recovery it is constant time. CAN is scalable.

E's neighbor set: {B, E}; after F joins, E's neighbor set: {B, E, F}; F's neighbor set: {E, D}.

Figure 2.12 (a) Example 2-d [0,1]×[0,1] coordinate space partitioned between 5 CAN peers.
(b) Example 2-d space after peer F joins.
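The hashing-to-a-point and zone-splitting behaviour of Figure 2.12 can be sketched as follows. The zone representation (one box of half-open intervals per peer) and the split-along-the-widest-dimension rule are illustrative assumptions; CAN's real join protocol also transfers keys and updates neighbor sets.

```python
import hashlib

def can_point(key, d=2):
    """Hash a string key to a point in the unit d-torus, as CAN does."""
    digest = hashlib.sha1(key.encode()).digest()
    return tuple(int.from_bytes(digest[4 * i:4 * i + 4], 'big') / 2 ** 32
                 for i in range(d))

def owner(zones, point):
    """Return the peer whose zone (a box of [lo, hi) intervals) holds point."""
    for peer, box in zones.items():
        if all(lo <= x < hi for x, (lo, hi) in zip(point, box)):
            return peer
    raise KeyError('point not covered by any zone')

def split_zone(zones, old_peer, new_peer):
    """CAN join sketch: the new peer splits the old peer's zone in half
    along its widest dimension, taking the upper half."""
    box = list(zones[old_peer])
    dim = max(range(len(box)), key=lambda i: box[i][1] - box[i][0])
    lo, hi = box[dim]
    mid = (lo + hi) / 2
    old_box, new_box = box[:], box[:]
    old_box[dim] = (lo, mid)
    new_box[dim] = (mid, hi)
    zones[old_peer] = tuple(old_box)
    zones[new_peer] = tuple(new_box)
```

Starting from one peer owning the whole square, two successive splits reproduce the kind of partition shown in Figure 2.12: each join halves one zone, and lookups only need `owner(zones, can_point(key))`.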

JXTA [167]: The JXTA architecture is organized in three layers, as shown in Figure
2.13: the JXTA core, JXTA services and JXTA applications. The core layer provides
minimal and essential primitives that are common to P2P networking. The services
layer includes network services that may not be absolutely necessary for a P2P
network to operate, but are common or desirable in the P2P environment. The
application layer provides integrated applications that aggregate services, and usually
provides the user interface.
Edutella [168]: Edutella attempts to design and implement a schema based P2P
infrastructure for the semantic web. It uses the W3C standards RDF and RDF Schema
as the schema language to annotate resources on the web, achieving a mark-up for
educational resources. Edutella provides metadata services such as querying and
replication, as well as semantic services such as mapping, mediation and clustering.
Edutella services are built over JXTA [167], a widely used framework for building
P2P applications. The Edutella query service provides the syntax and semantics for
querying both individual RDF repositories and for distributed querying across
repositories. Edutella uses mediators to provide coherent views across data sources
through semantic reconciliation. Edutella was envisioned to provide a platform for
educational institutions to participate in a global information network while retaining
autonomy over learning resources.

Figure 2.13 JXTA Architecture

The same authors have also attempted a super peer based organization of the
Edutella peers to make searching more efficient. The paper [169] describes an
organization of the super peers based on HyperCuP, a structured P2P system based on
the hypercube topology [170]. The super peers maintain metadata for a set of peers,
instead of each peer maintaining its own metadata, and are themselves connected
using the HyperCuP overlay. This makes searching for metadata quite efficient, as
searches are executed only in the super peer overlay. They also use super peer indices
based on schema information to facilitate faster search.
Atlas Peer-to-Peer Architecture (APPA): APPA is a data management system that
provides scalability, availability and performance for advanced P2P applications that
deal with semantically rich data, e.g., XML documents and relational tables, using a
high level SQL-like query language. The replication service is placed in the upper
layer of the APPA architecture, which provides an Application Programming
Interface (API) to make it easy for P2P collaborative applications to take advantage
of data replication. The architecture design also establishes the integration of the
replication service with other APPA services by means of service interfaces. APPA
has a layered service-based architecture, shown in Figure 2.14. Besides the traditional
advantages of using services (encapsulation, reuse, portability, etc.), this enables
APPA to be network-independent, so it can be implemented over different structured
(e.g., DHT) and super-peer P2P networks. The advanced services layer provides
services for semantically rich data sharing, including schema management,
replication [171], query processing [172], security, etc., using the basic services.

Figure 2.14 APPA Architecture

Piazza [173]: Piazza is a peer data management system that facilitates decentralized
sharing of heterogeneous data. Each peer contributes schemas, mappings, data and/or
computation. Piazza provides query answering capabilities over a distributed
collection of local schemas and pairwise mappings between them. It essentially
provides a decentralized schema mediation mechanism for data integration over a
P2P system. Peers in the system contribute stored relations, similar to data sources in
data integration systems. Query reformulation occurs through stored relations, stored
either locally or at other peers. Piazza also addresses the key issue of security, which
enables users to share their data in a controlled manner. Another paper [179] describes
how a single data item is published in protected form using cryptographic techniques:
the owner of the data item encrypts the data and can specify access control rights
declaratively, restricting users to parts of the data.
PIER: P2P Information Exchange and Retrieval (PIER) [43] is a P2P query engine
for query processing in Internet scale distributed systems. PIER provides a
mechanism for scalable sharing and querying of fingerprint information, used in
network monitoring applications such as intrusion detection. PIER uses four guiding
principles in its design. First, it provides relaxed consistency semantics with best
effort results, as achieving ACID properties may be difficult in Internet scale systems
[174]. Second, it assumes organic scaling, meaning that there are no data
centers/warehouses and machines can be added in typical P2P fashion. Third, the
query engine assumes data is available in native file systems and need not necessarily
be loaded into local databases. The fourth principle is that, instead of waiting for
breakthroughs in semantic technologies for data integration, PIER tries to combine
local and reporting mechanisms into a global monitoring facility. PIER is realized
over the CAN P2P system [33].
PeerDB [175]: PeerDB is an object management system that provides sophisticated
searching capabilities. PeerDB is realized over BestPeer [176], which provides P2P
enabling technologies. PeerDB can be viewed as a network of local databases on
peers. It allows data sharing without a global schema by using metadata for each
relation and its attributes. A query proceeds in two phases: in the first phase, relations
that match the user's search are returned by searching on neighbors. After the user
selects the desired relations, the second phase begins, where queries are directed to
the peers containing the selected relations. Mobile agents are dispatched to perform
the queries in both phases.
NADSE: The Neighbor Assisted Distributed and Scalable Environment (NADSE)
[180] enables fast and cost-efficient deployment of self-managed intelligent systems
with low management cost at each peer. NADSE implements a structured P2P
concept which enables efficient resource management in P2P systems even during a
high rate of network churn. It provides a distributed computing environment [177] to
every peer node by grouping the nodes into clusters, deputing one node in each
cluster as cluster head (CH), and treating the whole network as a group of clusters.
Every CH manages the topology of the network and the resources available in its
cluster. NADSE provides a common solution to the fault tolerance and load balancing
problems of P2P systems and gives a true distributed computing environment with
the help of mobile agents (MAs). It also provides fault-tolerant execution of processes
(mobile devices/mobile codes) based on a realistic view of the current status of
mobile process based computing.
In [181] the Local Relational Model (LRM) is presented. In this model the
authors assume that the set of all data in a P2P network consists of local (relational)
databases. Each peer can exchange data and services with a set of other peers, called
acquaintances, and peers are fully autonomous in choosing their acquaintances. In
this model the local relational database is stored at the peer; the complete information
stored at a peer may be a target for an intruder/hacker, and the peer can misuse the
information stored at it. In [37] the Cooperative File System is proposed, but it is a
read-only storage system developed at MIT. This file system provides robustness,
load balancing and scalability, but data entries cannot be updated, i.e., the data is
static in nature. Existing systems thus largely do not consider dynamic data, which
may be updated when required; in the presented model we also address this problem
of the existing systems.

2.13 Analysis
A number of existing P2P systems such as Napster, Gnutella, Kazaa and Overnet are
popular for file sharing over the Internet. Most of these systems deal with static data.
Despite good research in this socially popular and emerging field of P2P networks
and systems, there is still a lot of scope for research. It is identified that most P2P
systems are popular for static data, which does not change while it is shared among
the networks. Little work has been done in the direction of sharing dynamic data, i.e.,
data which changes while it is shared among the networks. This motivates us to use
the resources freely available, and otherwise wasted, in P2P systems for implementing
real time information dissemination.
To achieve the above objective of placing real time data over a P2P environment,
data must be partitioned and replicated over multiple peers for a number of reasons,
e.g., security, data availability, etc. Some mechanism is also required to enhance the
throughput of the system to match the expectations of an RTDBS. A distributed
concurrency control mechanism is also required for the execution of concurrent
processes, which will maintain data consistency, serializability, etc. in the system.
Network traffic is a major issue in P2P networks because of the topology mismatch
problem. A mechanism is required to reduce this heavy traffic, as peers will
communicate with each other to a large extent, which can cause network choking.
Schemes for placing replicas will be required to reduce the replica search time; thus,
logical structures have to be identified in which to place the replicas.
Reliability is another issue which needs more attention from the research
community. Other issues are concurrency control, fault tolerance and load balancing.
Response time and traffic cost need to be measured and compared as performance
measures for the network. In order to enable resource awareness in such a large scale
dynamic distributed environment, specific middleware is required which takes into
account the following P2P characteristics: managing underlay/overlay topologies,
reduction of redundant network traffic, data distribution, load balancing, fault
tolerance, replica placement/updating/assessment, data consistency, concurrency
control, design and maintenance of a logical structure for replicas, controlling the
network traffic of overlay and underlay networks, etc. The architecture of the
proposed middleware should be suitable for dissemination of dynamic information in
P2P networks [41, 116, 117]. In Table 2.1 we present a comparison of various P2P
middleware approaches.

2.14 Summary
In this chapter we have presented P2P networks, types of P2P networks, overlay
networks, overlay P2P networks, limitations of P2P systems, and parallelism in
databases. Concurrency control and the topology mismatch problem are also
discussed, along with replication for availability, quorum consensus, databases and
their requirements in the P2P environment, and some P2P middleware. At the end of
the chapter, an analysis of the literature survey is presented, followed by this
summary.
In the next chapter we propose the Statistics Manager and Action Planner
(SMAP) for P2P systems.


Table 2.1 A Comparison of Various P2P Middlewares


Attributes

Middleware
CAN

Tapestry

Chord

Pastry

Napster

Gnutella

Freenet

APPA

Piazza

PIER

PeerDB

NADSE

Load balancing

Fault tolerant
(communication link)

Fault tolerant (host


level)

Replication

Replication

Reliable

Replication Replication Replication

Replication Replication

Replication

Replication Replication Replication

Resource sharing

Secure

Communication
level

Scalable

Little

Little

Little

Little

Good

Good

Good

Good

Better at
under load

Better at under
load

Better at
under load

Good

Better at under
load

Good

Better at
under load

Good

Poor at
overload

Poor at
overload

Poor at
overload

Performance

Distributed file
Management

Poor at overload

Poor at
overload

Data Partitioning

NA

NA

NA

NA

NA

NA

NA

Traffic Optimize

NA

NA

NA

NA

NA

NA

NA

NA

Concurrency Control

NA

NA

NA

NA

NA

NA

NA

NA

Local

Parallel Execution

NA

NA

NA

NA

NA

NA

NA

Schema Management

NA

NA

NA

NA

NA

NA

NA

Y(Global)

Y(Pairwise)

Y(Global)

Hybrid

Hybrid

Hybrid

Loosely
Structured

Loosely
Structured

File sharing
Degree of
Decentralization

Distributed Distributed Distributed

Distributed

Centralized

Decentralized

Distributed

Network Structure

Structured Structured Structured

Structured

Structured

Unstructured

Loosely Independent Unstructured


Structured

*NA: Not addressed, to the best of our knowledge



Hybrid Super Distributed


Peer Based
Structured

Chapter 3

Statistics Manager and Action Planner


(SMAP) for P2P Networks
P2P technology facilitates sharing the resources of geographically distributed peers
connected to the Internet. P2P applications are attractive because of their scalability
and enhanced performance, and because they substitute for the client/server model,
enabling direct and real-time communication among peers. In placing data over a
dynamic P2P network, maintaining scalability, performance and resource
management under peer churn and heavy traffic are major issues that affect the
performance of a P2P system.
In this chapter we propose the Statistics Manager and Action Planner (SMAP)
for P2P networks, which is an evolutionary approach to P2P systems. SMAP supports
fault tolerance, shortest path length to requested resources, low overhead generation
during network management operations, balanced load distribution between peers
and a high probability of lookup success.
The rest of the chapter is organized as follows. An introduction is given in
Section 3.1. Section 3.2 explores the architecture of SMAP. Section 3.3 gives the
advantages behind the development of SMAP. Section 3.4 presents a discussion.
Finally, the chapter is summarized in Section 3.5.

3.1 Introduction
P2P technology facilitates sharing the resources of geographically distributed peers
connected to the Internet. As technology advances, the computation power, storage
capacity and input/output capability of computing devices keep increasing. A major
part of the CPU ticks and storage space of computing devices is wasted due to the
generally limited requirements of users. Distributed technologies enable the sharing
of data as well as resources. P2P networks share data storage, computation power,
communications and administration among thousands of individual client
workstations. The ability of P2P systems to share data and resources can be utilized
to pool the wasted resources, e.g., CPU ticks and the storage space of participating
peers. Utilizing this pool of storage space to implement an RTDDBS over a P2P
network is a challenging problem. This vision is also supported by the increased
usage and availability of Internet facilities and the popularity of P2P systems. To
address these challenges, a number of issues related to P2P systems, databases and
the real time constraints of databases have to be addressed.
The key issues in implementing RTDDBS over P2P systems is to efficiently
maintain the target data and peer availability in the environment of high node churn,
network traffic, fast response time, high throughput which are also acceptable in real
time environment. Load balancing, fault tolerance, replication are some other issues
without which a system cannot be useful.
We are required to develop a computing/communication P2P systems that fulfills
most of the above challenges. Thus, a system is needed for P2P network which will
increase the availability of the data items, reduce the response time of the system,
provide fast response to update the database system, arrange secure access, distribute
the information over the P2P networks and manage the other dynamic issues in the
database.

3.2 System Architecture


Middleware is considered a necessary layer between the hardware, operating system
and applications. The aim of the middleware layer is to provide appropriate interfaces
to diverse applications, and a runtime environment that supports and coordinates
multiple applications, with mechanisms to achieve adaptive and efficient use of
system resources. Middleware is often used by traditional systems as a bridge
between the operating system and the applications; it makes the development of
distributed applications possible. Traditional distributed middleware (DCOM,
CORBA) is not adequate for the memory and computation requirements of dynamic
P2P networks, and the maintenance of traditional middleware architectures is also not
easy under dynamic P2P network constraints. For these kinds of networks, a
middleware that is simple, light and easy to implement is needed.
To achieve the above, a Statistics Manager and Action Planner (SMAP) {Figure 3.1} is proposed. It is a decentralized management system that manages P2P applications and system resources in an integrated way, monitors the behavior of P2P applications transparently, obtains accurate resource projections, manages connections between peers, and distributes replicas/database objects in response to user requests and changed processing and networking conditions. SMAP is a middleware that supports P2P systems in storing and accessing real-time information. It consists of five layers, which in combination help users store and retrieve their data over P2P networks efficiently. It distributes real-time data in P2P networks and replicates this data to provide acceptable data availability. It also manages the replicas in an efficient overlay topology that provides fast and updated data against any user query. SMAP improves response time and throughput by executing arriving queries in parallel, and minimizes the network traffic generated by data/control information in the system. It also provides fault tolerance and load balancing. These layers are briefly described as follows.

3.2.1 Interface Layer (IL)


It handles the heterogeneity of the participating systems. It receives queries from the outside world and forwards them to the next layer after checking the authenticity of the user/query/data corresponding to the query. The results corresponding to the received queries are returned to the user. IL also maintains a log of resources available and in use. A brief discussion of the various components of this layer follows.

Authenticity Manager (AM): This module looks after the authenticity of a user and checks whether the user is authorized to use the system. The various privileges/permissions of the user, e.g., read/write/execute/update, are also verified by this module. Conventional techniques such as login IDs and code-exchange techniques are used to avoid unauthorized access. To avoid misuse by malware, techniques such as CAPTCHA may be used, so that an unauthorized program cannot tamper with the information or the system.
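As an illustrative sketch only, the privilege check performed by AM could be organized as a lookup against a per-user permission set (the user table, privilege names and function are hypothetical, not part of the thesis):

```python
# Illustrative sketch of AM's privilege check. The user table, privilege
# names and the function are hypothetical examples, not from the thesis.
PRIVILEGES = {
    "alice": {"read", "write", "update"},
    "bob": {"read"},
}

def is_authorized(user: str, operation: str) -> bool:
    """Return True only if a known user holds the requested privilege."""
    return operation in PRIVILEGES.get(user, set())
```

Unknown users fall through to an empty set, so they are denied every operation by default.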

Resource Manager (RM): It manages the resources of the system (viz., uplink/downlink bandwidth, storage space, CPU ticks, etc.) and also controls the participation of peers in the network. RM mainly collects the resources and maintains statistics about them. To simplify its functionality, RM is subdivided into a Resource Allocator (RA) and a Resource Publisher (RPB).

[Figure: layered block diagram of SMAP. The Application Interface Layer (Resource Manager, Authenticity Manager, Query Interface) sits above the Data Handling Layer (Schema Manager, Query Processor, Query Optimizer, Query Execution Engine, Data Scheduler, Data Storage Space, Data Manager), the Replication Management Layer (Replica Search Manager, Replica Topology Manager, Quorum Manager, Replica Update Manager) and the Network Management Layer (Network Manager, Group Communication, Traffic Load Optimizer, Peer Analyzer), coordinated by the Control Layer and connected to the Internet.]

Figure 3.1 Architecture of Statistics Manager and Action Planner (SMAP)

Resource Allocator (RA): allocates and controls the resources for newly subscribed services. Resources are allocated fairly among peers while fulfilling individual peer requirements. RA keeps the global state of the distributed resources consistent among all local resources based on a given coherence strategy.


Resource Publisher (RPB): is responsible for collecting and publishing the resources permitted to be shared in the system.

Security Manager (SM): It provides coordination among all the applications running on any number of peers. Security, trust and privacy are addressed from the very beginning of system design and at all levels, such as hardware, operating system, protocols and architecture. SM has the following roles: (1) protecting channels against unauthorized access or modification, (2) program validation/verification (what an uploaded/downloaded piece of software really does) and trust modeling, (3) sharing fragments of information efficiently in a controlled manner, together with key/certificate management, and (4) handling the implications of a dynamic P2P network (what can be done without trusted servers). SM also looks after the security levels of the data/query/user, etc. at various levels in the system.

Query Interface (QI): It accepts queries from the outside world. Before accepting a query, it forwards it to SM for validation of the user. If the user is authenticated, QI returns the information of the RC from which the user receives the results of the submitted queries. All the required information is exchanged with the user for smooth functioning of the operation. The submitted query is then forwarded to the query analyzer (QA).

3.2.2 Data Layer (DL)


The Data Layer (DL) is responsible for data handling, data integrity and data-flow management in the system. It receives queries from the IL, then subdivides, optimizes and distributes them to the different components as required. DL ensures efficient distribution of data to the different components and supports the concurrent execution of processes in the system. DL implements a Matrix Assisted Technique (MAT) for data distribution and a Timestamp based Secure Concurrency Control Algorithm (TSC2A) for concurrency control. MAT also manages the global/local schemas of the database. DL compiles the partial results corresponding to the local schemas and generates results corresponding to the global schema. MAT and TSC2A are detailed in Chapter 4 and Chapter 5, respectively. A brief discussion of the various DL components that execute the functionality of MAT and TSC2A follows.

Schema Scheduler (SS): It is responsible for handling the global database, which is partitioned horizontally, vertically or both. SS ensures the partitioning and reassembly of data in the system, and helps DL compile the final results from the partial results received from the various peers holding the replicas.

Query Processor (QP): It subdivides a received query into subqueries according to the database schema and distributes them to the corresponding replicas. It uses the global schema, the local schemas and the MAT data partitioning algorithm for subdividing queries. QP also helps DL compile the partial results.

Query Optimizer (QO): It analyzes, resolves and optimizes the received queries and is also responsible for breaking a query into subqueries. QO decides whether a peer is suited for a particular subquery or not.

Query Execution Engine (QEE): It executes the subqueries and produces the corresponding partial results, which are then sent to the peers responsible for compiling them and forwarding them to QP. QEE also ensures parallel execution of the submitted subqueries and manages the various stages of execution. It produces a timely response for each subquery and for the execution of all subqueries of a query. It dispatches a subquery to a suitable peer and gets back the information/partial results.
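A minimal sketch of such parallel subquery execution, assuming a thread pool and a placeholder dispatch function (both illustrative, not the thesis implementation):

```python
# Sketch of QEE-style parallel subquery execution, assuming a thread pool.
# execute_subquery is a placeholder for dispatching a subquery to the peer
# holding the relevant partition; names are illustrative, not from the thesis.
from concurrent.futures import ThreadPoolExecutor

def execute_subquery(subquery: str) -> str:
    # In a real QEE this would send the subquery to a suitable peer and
    # wait for its partial result.
    return f"partial({subquery})"

def execute_all(subqueries):
    """Execute subqueries in parallel; partial results keep input order."""
    with ThreadPoolExecutor() as pool:
        return list(pool.map(execute_subquery, subqueries))
```

`ThreadPoolExecutor.map` preserves the input order of the subqueries, which simplifies the later compilation of partial results.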

Data Scheduler (DS): It maintains the global/local schemas of the database and finds the correlation between them. It checks and distributes the information to the selected peers following a predefined pattern decided by the administrator. DS also gathers information from the peers holding replicas of the data partitions.

Data Storage Space (DSS): maintains the actual database accessed in response to queries. All data items belonging to a database partition are physically stored in DSS, in the shared region permitted by the owner of a peer.

Data Manager (DM): DM is responsible for distributing data across the network. It ensures the integrity of data and correct transfer rights in the system, and grants access to the database stored at a peer in a serializable manner. DM also checks for and manages a timely retransfer of any data tampered with during transfer from one component to another, or from one peer to another participating peer.

3.2.3 Replication Layer (RL)


The Replication Layer (RL) replicates the data and ensures the availability of updated data. To this end, RL arranges the replicas in a logical structure, which also provides fault tolerance to the system. RL controls the space made available by peers and provides updated data in response to a query with reduced network traffic. RL is responsible for all replication issues, i.e., the number of replicas, replica selection, the logical structure in which replicas are arranged, etc. The details of this layer are described in Chapter 7 and Chapter 8. The various components of the layer are as follows.

Replica Topology Manager (RTM): RTM places the replicas in a logical structure that eases access to the information, and is responsible for locating any replica within the group of replicas. It grants read/write quorums. It implements the Logical Adaptive Replica Placement Algorithm (LARPA) and the Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme to reduce the search time for replicas; these also identify the replicas for forming a quorum. RTM maintains the logical structure in which replicas are placed from time to time: whenever a replica leaves the network, it readjusts the remaining replicas by rearranging their addresses in the logical structure. LARPA and the HBFAR scheme are also used to maintain the overlay topology in the system. RTM plays a key role in SMAP.

Replica Update Manager (RUM): The major aim of RUM is to maintain the freshness of data items. It uses LARPA and the HBFAR scheme to maintain the latest information. The probability of accessing stale data from the system is minimized by minimizing the update time of the system.


Quorum Manager (QM): It decides the quorum consensus needed to access a data item. The quorum is chosen so that the system attains the desired availability of the replicas, i.e., the number of replicas to be accessed is increased when the availability of the peers storing the replicas is low. QM decides the number of replicas to be accessed, maintains replica availability at the desired level, and identifies which replicas to access from the logical structure.
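The idea of enlarging the quorum when peer availability is low can be made concrete. Assuming, purely for illustration, that each replica-holding peer is up independently with probability p, the probability that at least one of q contacted replicas is reachable is 1 - (1 - p)^q; the smallest q meeting a target availability can then be computed directly (a sketch, not the thesis' quorum rule):

```python
import math

def quorum_size(peer_availability: float, target: float) -> int:
    """Smallest q such that at least one of q contacted replicas is up
    with probability >= target, i.e. 1 - (1 - p)**q >= target.
    Assumes independent peers, each up with probability peer_availability."""
    if peer_availability >= 1.0:
        return 1
    q = math.log(1.0 - target) / math.log(1.0 - peer_availability)
    return max(1, math.ceil(q))
```

For example, with peers available only half the time, seven replicas must be contacted to reach 99% availability, while highly available peers need far fewer.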

3.2.4 Network Layer (NL)


The Network Layer (NL) maintains logical connections among the peers in the network and provides optimized network paths from one peer to another. NL is responsible for sending and receiving packets on the network. This layer connects to the Internet layer of the TCP/IP model; all information is sent or received through NL only. It is completely responsible for the logical structure, connections between participating peers, network traffic, deciding the paths for transferring messages, etc. The detailed functioning of the components involved in this layer is described in Chapter 6. The various components of this layer are as follows:

Network Manager (NM): NM manages the logical topology in which replicas are arranged and implements it over the P2P network. NM treats the network as a hierarchical Distributed Computing Environment (DCE).

Group Communication Manager (GCM): converts the logical address of each replica into its physical address. It also sends parallel update messages to the group of replicas, routes data and control information, and establishes communication links to the other replicas.

Traffic Load Optimizer (TLO): Huge network traffic is generated in P2P networks; this is the main bottleneck that prevents a P2P network from scaling beyond a limit. TLO reduces traffic by optimizing network paths. It analyzes network traffic and provides statistics to the system, which are used in managing the traffic. TLO implements the Common Junction Methodology (CJM) to minimize network traffic in the system; details are given in Chapter 6.

Peer Analyzer (PA): PA collects statistics about the peers available in the network; peers are selected for storing replicas based on the statistics received. Peers can leave and join the network with or without informing the system. To trace peer behavior, PA keeps track of each peer's leaving and joining times, the bandwidth with which it is connected to the network, its available storage space, its CPU utilization, and other such parameters.
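One hypothetical way to combine these statistics into a single ranking for replica placement is a weighted score; the weights and the reference maxima used for normalisation below are assumed values for illustration, not parameters from the thesis:

```python
# Hypothetical weighted scoring of the statistics PA collects. Weights and
# normalisation maxima are assumptions for illustration only.
def peer_score(session_time_h, bandwidth_mbps, free_storage_gb,
               cpu_free_frac, weights=(0.4, 0.3, 0.2, 0.1)):
    """Combine peer statistics into a single score in [0, 1]."""
    metrics = (
        min(session_time_h / 24.0, 1.0),   # longer sessions suggest low churn
        min(bandwidth_mbps / 100.0, 1.0),
        min(free_storage_gb / 50.0, 1.0),
        cpu_free_frac,
    )
    return sum(w * m for w, m in zip(weights, metrics))
```

Session time is weighted most heavily here because, per the discussion above, churn is the dominant threat to data availability.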

3.2.5 Control Layer (CL)


CL provides interaction between the various layers and the corresponding components of SMAP, and manages the complete working of SMAP. It synchronizes the system to improve performance: CL synchronizes all layers of SMAP, takes decisions based on the statistics received from the various components, and activates the corresponding actions. It also ensures proper information and data flow in the system, and keeps track of the working of the different components to ensure efficient operation.

3.3 Advantages of SMAP


P2P systems in general have a high overall management cost. The presence of SMAP, however, enables fast and cost-efficient deployment of a self-managed P2P system with a low management cost at each peer. It supports the P2P network in storing and retrieving real-time information. It first distributes the database into a number of partitions, which are then stored on groups of highly available replicas. By distributing the database, SMAP provides primary security to it. SMAP enables the P2P network to effectively exploit the underutilized resources of the system and use them for a highly complicated RTDDBS. It uses three-stage parallelism for executing received queries, which enhances the throughput of the system.
SMAP accesses information from the partitions distributed over the network. It supports concurrent execution of processes and concurrent access to the database partitions. It also helps to use P2P networks as a data management system and speeds up the information retrieval process. SMAP makes the system behave like a conventional file management system stored on a static network. SMAP is a highly fault tolerant system, and the availability of data items is also improved through it. SMAP receives both data and queries for processing and management, and avoids any leakage of information from a high security level to a low security level.
SMAP also provides route management in the P2P network. It reduces network traffic on a large scale, providing scalability to the network, and solves the topology mismatch problem faced in P2P networks, which generates heavy redundant traffic. SMAP helps to balance the traffic load over the network.
Using SMAP, one may deploy large-scale intelligent systems without the need for cost-intensive supercomputing infrastructure, whose management is highly complex and requires highly skilled administrators. The approach is evolutionary in the sense that it opens a new way of applying P2P to real-time service scenarios.

3.4 Discussion
SMAP enables fast and cost-efficient deployment of information over the P2P network with high availability of data and peers. It provides a distributed computing environment in which every peer can use the resources of all other peers participating in the network, and utilizes the wasted resources offered by the owners of these peers to implement an RTDDBS. SMAP is a self-managed P2P system with the capability to deal with a high churn rate of peers in the network. It also reduces the redundant network traffic generated by the topology mismatch problem in any P2P system, and provides efficient replica placement in the network, which supports high data availability.

3.5 Summary
In this chapter, we have presented the Statistics Manager and Action Planner (SMAP) for P2P networks. It is an evolutionary approach that manages dynamic information (which can change while shared between peers) over the highly dynamic environment of P2P networks. SMAP enables fast and cost-efficient deployment of a self-managed P2P system with a high overall management cost but a low management cost at each peer. It implements a structured P2P concept that enables efficient resource management in P2P systems even during a high churn rate of peers in the network. SMAP reduces the redundant traffic generated by the topology mismatch problem, and provides a distributed computing environment to every peer participating in the network.
In the next chapter, Data Placement and Execution Model for the RTDDBS will
be discussed.


Chapter 4

Load Adaptive Data Distribution over P2P Networks
Data placement and management is a challenging task when storing an RTDDBS over P2P networks. It involves the problems of data distribution over the Internet, peer/data availability, replication of partitions, congestion-free load balancing, and response time and throughput in the presence of a high churn rate of peer nodes. An efficient data placement procedure plays an important role in enhancing the performance of P2P systems.
In this chapter, a 3-Tier Execution Model (3-TEM) is proposed. It divides the conventional execution process used in P2P systems into three independent stages. 3-TEM implements a Matrix Assisted Technique (MAT) to distribute the database over P2P networks. MAT uses range partitioning with dividing factors to partition the database horizontally, vertically or both, and places each partition on a group of peers as replicas. MAT also provides primary security to the database. 3-TEM requires only small chunks of CPU ticks to execute the subprocesses through multiple stages. It improves throughput, query completion ratio and resource utilization by executing the process in parallel.
The rest of the chapter is organized as follows. Section 4.1 presents the introduction and Section 4.2 the system model. Section 4.3 describes the 3-Tier Execution Model (3-TEM). Load balancing is discussed in Section 4.4 and database partitioning in Section 4.5. The implementation and performance study is given in Section 4.6. Section 4.7 highlights the advantages of 3-TEM, Section 4.8 presents a discussion, and the chapter is summarized in Section 4.9.


4.1 Introduction
A large number of peers participate in P2P networks. P2P systems are dynamic in nature because participating peers may join or leave the network with or without informing the system. The churn rate is the rate at which peers leave and join the system, and each peer has a session time for which it remains connected. It is very difficult to select a suitable peer for a particular task from among the many participating peers, whose varied parameters can affect system performance. P2P systems are popular for their unrestricted sharing of data files, e.g., Napster, Gnutella. In such an environment, processes require only small chunks of CPU ticks to execute. Managing data availability in the presence of churn during the service time is an issue that must be addressed.
For implementation of databases over P2P networks, a system has to address the
challenges related to P2P networks as well as databases. The challenges related to P2P
networks are peer selection, churn rate, session time, network traffic, overlay and
underlay topologies and topology mismatch problems, etc.
The challenges related to databases are data availability, replication, concurrency control, security, real-time access of data, etc. A database may be partitioned to maintain data availability, peer availability, primary security, peer load, etc. System performance also depends upon how the database is divided into partitions and how these partitions are accessed for the execution of a submitted query. A global schema is partitioned into local schemas, and a proper placement of the partitions improves the performance of the system. To execute a global query through local schemas, a mapping between the global and local schemas is required; the schema mapping technique also affects system performance. Another challenge is that a real-time environment expects the execution of a query in bounded time. Such time-bound execution of queries with high system throughput is hard to achieve in P2P networks due to the churn rate of the peers.
To address some of these issues, we have developed a 3-tier execution system which addresses discovery and peer selection, churn rate, data partitioning, data availability, primary security, schema mapping and data consistency.


4.2 System Model


To distribute a database over a P2P network, it may be partitioned into small parts. The partitions are placed on groups of selected peers to maintain data availability at an acceptable level. The peers holding replicas are selected from the network through a predefined selection criterion. These groups are then consulted to obtain updated data from the P2P network. The requester sends its query to the database in the form of transactions with respect to the global schema. These transactions are subdivided into subtransactions (with respect to the local schemas) depending upon the database partitioning. The partial results obtained from the various groups of replicas are compiled into the desired result corresponding to a submitted transaction, which is returned to the requester. The system model is as under:
The set of peers participating in the network is defined as P = {p_1, p_2, p_3, ..., p_np}, where np is the number of peers. It is assumed that the relational database DB has nf fields f_1, f_2, f_3, ..., f_nf and nr records r_1, r_2, r_3, ..., r_nr. The database DB is divided into partitions represented as DB = {Db_1, Db_2, Db_3, ..., Db_p}.

∪_{i=1}^{p} Db_i = DB is the operation which compiles the partial results to produce the final result corresponding to the global schema. Each partition Db_i is stored at a set of peers as replicas, R_Dbi = {p_1^i, p_2^i, p_3^i, ..., p_r^i}. A transaction T^i over the database DB is represented as T^i ⊆ DB. This transaction is further subdivided into subtransactions corresponding to the database partitions, T^i = {ts_1^i, ts_2^i, ts_3^i, ..., ts_t^i}, with ts_j^i ⊆ Db_j, where ts_j^i is the j-th subtransaction of the transaction T^i. Here np is the number of peers participating in the system, p is the number of partitions of the database, and r is the number of peers over which a particular partition is replicated.

The partial results rs_m^i from the replica sets of the partitions are compiled to generate the final result, i.e., Rs^i = rs_1^i ∪ rs_2^i ∪ rs_3^i ∪ ... ∪ rs_p^i.
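The model above can be sketched as code: a toy database is range-partitioned into the Db_i, and the final result is the union of the partial results (function names and the list-based representation are illustrative only):

```python
# Toy version of the model: a "database" is a list of records, split into
# contiguous range partitions Db_i; the final result is the union of the
# partial results. Names are illustrative, not from the thesis.
def partition(db, parts):
    """Split the records into `parts` contiguous range partitions."""
    size = -(-len(db) // parts)              # ceiling division
    return [db[i:i + size] for i in range(0, len(db), size)]

def compile_results(partials):
    """Union of partial results, mirroring  U_i Db_i = DB."""
    result = []
    for p in partials:
        result.extend(p)
    return result
```

Because the partitions are disjoint ranges, concatenating the partial results reconstructs the full database exactly once.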


4.3 3-Tier Execution Model (3-TEM)


In the conventional 1-Tier Execution Model (1-TEM), a head peer accepts a transaction from a user, resolves it and distributes it to the corresponding processing peers for execution. These processing peers execute the subtransactions received from the head peer and return the results. The head peer compiles the received results and returns them to the requester. The head peer becomes heavily overloaded because it is responsible for every event belonging to a transaction, viz., transaction caching, transaction division, result compilation, result caching, result delivery, etc. The head peer is also prone to single-point failure. Another issue in 1-TEM is that a query spends a large amount of time at the head peer; if the head peer fails, the execution time already spent is wasted and the query must be re-executed. This reduces the throughput, because the throughput of the system depends upon the performance of the head peer alone.

[Figure: the user/requester submits a transaction to the Transaction Coordinator (TC), which dispatches subtransactions to the Transaction Processing Peers (TPPs); the TPPs return partial results to the Result Coordinator (RC), which replies to the user.]

Figure 4.1 3-Tier Execution Model (3-TEM) for P2P Systems

Three major subprocesses execute a transaction in a conventional P2P system: transaction coordination, transaction processing at remote peers, and result coordination. To speed up the execution process and to balance the load of the head peer, a 3-Tier Execution Model (3-TEM) is proposed {Figure 4.1}. It comprises three components: Transaction Coordinator (TC), Transaction Processing Peer (TPP) and Result Coordinator (RC). The execution process is decomposed into small subprocesses executed independently by TC, TPP and RC. To improve the response time, these subprocesses are managed by dedicated peers along with the required information. The partial results received from the subprocesses are compiled at RC into final results corresponding to the global schema.
These stages share control information for the execution of the subtransactions of the parent transaction. The three components require only small chunks of CPU to execute their corresponding responsibilities and may run in parallel; this parallelism improves the throughput of the system. A timestamp is used to maintain serializability among the subtransactions.
TC receives the global transactions from the users, and translates and decomposes a transaction T^i against the global schema into (local) subtransactions {ts_1^i, ts_2^i, ts_3^i, ..., ts_t^i} depending upon the partitioning mechanism used. The local subtransactions are then routed to the corresponding TPPs in serializable order for execution; a subtransaction may be executed on a number of TPPs. 3-TEM coordinates among the TPPs during the execution of subtransactions. To improve the response time of 3-TEM, each TPP executes its subtransactions in the order of the timestamps associated with them. RC receives the partial results from the TPPs and compiles them into the final result of the global transaction, which is returned to the owner of the transaction. The details of the different components of 3-TEM are given in Figure 4.2.

4.3.1 Transaction Coordinator (TC)


TC acts as an interface used by the requesting sites to send their requests in the form of transactions (queries). It is also responsible for providing a compatible environment for global transactions coming from heterogeneous peers. When TC receives a transaction from a user, it directs the user to the Result Coordinator (RC) from which the results of the submitted transaction may be obtained. TC checks the authenticity of the arrived transaction and assigns a security level to it. It resolves the transaction into subtransactions with respect to the local schemas and sends them to the corresponding TPPs with the help of the Data Access Tracker (DAT). TC has various components to process the received transactions, as follows.
Transaction Interface (TI): It provides the interface to the user and receives authorized global transactions T^i from the external world. It helps the user obtain the results Rs^i corresponding to the submitted transaction T^i from the RC. An authenticated user always receives a token for its submitted query/transaction.

Security Checker (SC): It authenticates a requester/user; authentication is done through username and password. SC blocks unauthorized access to the data. It also blocks low-security-level transactions from accessing high-security data items and filters out possible malicious attacks through arrived queries. SC allocates security classes/levels Lv(T^i) ∈ Sc to all authorized arriving transactions (ts_r^i) and Lv(x) ∈ Sc to the data items stored in the database, thereby securing both transactions and data in the system.

Transaction Manager (TM): It handles the transactions and data in the system and ensures global serializability. TM resolves the global transaction T^i into subtransactions ts_r^i and assigns the timestamp of the global transaction to all its subtransactions. Each subtransaction is sent to the TPP where the data it requires is available.

Load Analyzer (LA): It analyzes the load at each participating peer and maintains the
statistics of the load. Depending upon the statistics, load distributing mechanism is
activated to balance the load over the peers participating in the system.

Data Administrator (DA): It is responsible for all data and database related activities in the system. DA keeps track of the peers where the partitions are stored in the form of an address table. It also sends an update message to the DAT in the event of any data update.

Data Access Tracker (DAT): It controls, manages and provides the required information in the system. It keeps track of the read/write timestamps associated with each data item. Every time a data item is added, read or updated by a transaction, the corresponding timestamp of the data item is also updated in the DAT. Two types of timestamps are associated with every data item: read and write. The read timestamp is the timestamp of the last global transaction that read the data item; the write timestamp is that of the last global transaction that wrote it. DAT also detects and resolves conflicts between global transactions.
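One plausible reading of these read/write timestamps is basic timestamp ordering: an operation whose transaction is older than the recorded timestamps conflicts and must restart. The sketch below is illustrative, not the thesis' exact algorithm:

```python
# Illustrative timestamp-ordering check over DAT's read/write timestamps.
# A transaction older than the recorded timestamps conflicts and must
# restart. This is an assumed reading, not the thesis' exact algorithm.
timestamps = {}  # data item -> {"read": ts, "write": ts}

def can_read(item, ts):
    rec = timestamps.setdefault(item, {"read": 0, "write": 0})
    if ts < rec["write"]:        # item was overwritten by a newer txn
        return False             # conflict: restart the transaction
    rec["read"] = max(rec["read"], ts)
    return True

def can_write(item, ts):
    rec = timestamps.setdefault(item, {"read": 0, "write": 0})
    if ts < rec["read"] or ts < rec["write"]:
        return False             # a newer txn already read/wrote the item
    rec["write"] = ts
    return True
```

A successful read advances only the read timestamp; a successful write records the writer's timestamp, so later, older operations are rejected.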

Peer Identifier (PI): It keeps track of the addresses of the peers where database partitions are stored. PI also holds the routing information of the network and implements the peer selection procedure.

4.3.2 Transaction Processing Peer (TPP)


Transaction Processing Peer (TPP) is responsible for executing the received subtransactions and maintaining local serializability among them. The components of TPP, shown in Figure 4.2, are as follows.

[Figure: block diagram of 3-TEM. The Transaction Coordinator (TC) receives the transaction/query through the Transaction Interface and Security Checker and contains the Transaction Manager, Load Analyzer, Data Administrator, Peer Identifier and Data Access Tracker (DAT). Each Transaction Processing Peer (TPP) contains the Subtransaction Interface, Subtransaction Manager, Data Manager and Local Database. The Result Coordinator (RC) contains the Result Manager, Result Data Administrator and Result Pool, and returns results to the requester.]

Figure 4.2 System Architecture of 3-Tier Execution Model (3-TEM)

Subtransaction Interface (SI): A subtransaction can be blocked, aborted, restarted or allowed to proceed for execution, depending on the order of the subtransactions and the timestamps associated with them. SI receives the subtransaction ts_r^i and checks it for the prescribed format. It places the subtransactions in a priority queue according to their timestamps. SI also checks the deadlines of subtransactions and aborts those that exceed their deadlines.
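A minimal sketch of SI's timestamp-ordered queue with deadline-based aborts (an illustrative simplification of the description above; class and method names are assumptions):

```python
# Sketch of SI's behaviour: subtransactions are dequeued in timestamp
# order, and any whose deadline has passed is aborted (skipped).
# An illustrative simplification, not the thesis implementation.
import heapq

class SubtransactionQueue:
    def __init__(self):
        self._heap = []

    def submit(self, timestamp, deadline, subtxn):
        heapq.heappush(self._heap, (timestamp, deadline, subtxn))

    def next(self, now):
        """Pop the oldest-timestamp subtransaction still within deadline."""
        while self._heap:
            ts, deadline, subtxn = heapq.heappop(self._heap)
            if now <= deadline:
                return subtxn    # proceed with execution
            # else: deadline exceeded -> abort (skip) it
        return None
```

The heap orders entries by timestamp first, so the oldest pending subtransaction is always examined next; expired entries are silently discarded, matching SI's abort rule.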


Subtransaction Manager (SSM): resolves a subtransaction ts_r^i and determines the data it requires at the TPP. It also checks the feasibility and availability of the requested data items in its local database. The subtransaction is then sent to the Data Manager (DM) for data mapping, which identifies the data items corresponding to the subtransaction in the local database and maintains local serializability.

Data Manager (DM): is responsible for mapping a subtransaction tsri to its required data items in the database available at a TPP. It is responsible for all operations performed on data items by read/write subtransactions and also maintains data consistency.

Local Database (LD): is the actual partition of the global database in which the data items reside. It provides the data items corresponding to read/write subtransactions.

4.3.3 Result Coordinator (RC)


A Result Coordinator (RC) is responsible for compiling the partial results received from TPPs. This compilation maps the partial results from the local schemas to the global schema and generates the final result against the global schema. RC forwards the compiled result to the authenticated user/owner of the corresponding transaction. RC stores results in a result pool, sorted by the deadlines of the transactions. The components of RC are shown in Figure 4.2 and briefly described below.

Result Manager (RSM): is similar to the Transaction Manager (TM) but performs the reverse of TC's work. It ensures global serializability of the final results, identified through the timestamps of the partial results. RSM is responsible for compiling the partial results rsri into the global result Rsi. It also compares the partial results received from the various replicas to find the most up-to-date one. RSM sends a message to the user indicating that the result is ready, and hands over the final result to the user (transaction owner) after checking its authenticity. A user is identified by comparing the token issued against the submitted transaction.

Result Data Administrator (RDA): manages the global as well as local databases and helps in the compilation of partial results.

Result Pool (RP): holds results until they are handed over to their owners. It keeps a log of the peers from which the partial results were received. RP also keeps track of the deadline attached to each transaction and its corresponding result, which is used to discard the result after a certain period of time.

4.3.4 Working of 3-TEM


The requester peer sends its query to the Transaction Coordinator (TC). On receiving the query, TC checks the authenticity of the requester and whether the requester peer is authorized to access the system. The next check is the scope of the query, i.e., whether the result lies within the scope of the system. After authentication and scope checking, TC assigns a timestamp to the received query, and a token number/query id corresponding to the arrived query is provided to the requester. This query id remains with the query and its subqueries throughout their life within the system. The timestamp assigned to the query further determines the sequence of subquery execution in the system. The address of the result coordinator is also provided to the requester, so that after the specified time period the RC may be consulted for the result corresponding to the query id. The partition ids are calculated through the partitioning algorithm, and the addresses of the corresponding TPPs are obtained from the address mapping table. Information about the number of partitions used and the number of replicas of each partition is sent to the RC, which later helps in compiling the final result from the partial results received from the various TPPs. Packets of data and control information for the TPPs are generated and forwarded to the corresponding TPPs.
Each TPP extracts the data and control information from the received packet. The position of the required data is identified from the control information in the packet. The operation specified in the packet is performed on the identified data, and the results generated by the TPP are forwarded to the Result Coordinator (RC).
RC receives all partial results from the corresponding TPPs, analyzes them against the timestamps attached to them, and compiles the result corresponding to the received query (i.e., against the global schema). The final results are stored in the RP. The result is forwarded to the requester after authenticating the requester and matching the token id/timestamp associated with the query.

4.4 Load Balancing


The time required at a peer to execute a transaction is much higher in the conventional system (1-TEM) than in 3-TEM. In the dynamic P2P environment, the session time of a participating peer is a constraint; thus processes that require few CPU ticks gain an advantage in 3-TEM over 1-TEM (the conventional execution model). The heavy load of the head peer is shared between TC and RC in 3-TEM. 3-TEM generates no extra data-transfer overhead on the overlay network, as it follows the same task pattern as 1-TEM. 1-TEM follows the pattern TC-TPP-TC, where TC holds the dual responsibility for transactions and results; in 3-TEM the pattern is TC-TPP-RC, and the load of TC is shared with RC at a tolerable increase in control overhead.

4.5 Database Partitioning


To place real time information over an untrusted P2P environment, a number of issues must be addressed. Efficient utilization of the storage space available on peers, basic security of the placed database, and availability of data whenever required are the primary concerns in a P2P network with a high churn rate. For security reasons, the complete database cannot be placed on any single peer. The idea is to place on untrusted peers only partial information, which is not useful until complete. The database must therefore be partitioned and the partitions placed on multiple peers. Peer availability is another reason for partitioning: to meet the data availability requirement, each database partition is replicated over multiple peers, so that the system keeps working in case of any peer failure.
Database partitioning is thus a primary requirement for placing a database over a P2P environment. In this case vertical partitioning is required along with horizontal partitioning. To achieve both, a simple, fast and efficient mechanism is required that can partition the database as per the requirements of the system. A Matrix Assisted Technique (MAT) which resolves
the transactions according to the database partitions and compiles the partial results received from the partitions after execution is proposed. It is a simple, fast and efficient technique for the P2P environment that addresses the above issues.

4.5.1 Matrix Assisted Technique (MAT)


To improve security and data availability in the dynamic P2P environment, the database must be stored in small partitions over the P2P network. The primary requirements for partitioning are fast execution, simplicity and efficiency. To fulfill these requirements, a Matrix Assisted Technique (MAT) is proposed. It is a simple hashing technique capable of partitioning a relational database horizontally, vertically or both ways, and of providing range partitioning. MAT uses dividing factors to partition the database into small units. It performs all the required tasks efficiently within the time constraints, has low complexity and is easy to implement.
A relational database is considered for the implementation of MAT. Consider a relational database DB with nf fields (columns) f_1, f_2, f_3, ..., f_nf and nr records (rows) r_1, r_2, r_3, ..., r_nr that is to be placed over the P2P network. MAT uses two dividing factors, df_r and df_c, for horizontal and vertical partitioning, respectively. The number of horizontal partitions depends on the quotient obtained by dividing the number of records by df_r; similarly, the number of vertical partitions is obtained by dividing the number of columns by df_c. This decides which field falls in which partition and at what position, and fixes the range of partial data held by each partition.
Figure 4.3 presents a dummy relational database with 30 records, each with 9 columns, partitioned with the factors df_r = 10 and df_c = 3. This generates partitions ranging from [0, 0] to [2, 2]. Each partition contains partial data; e.g., [0, 0] contains only the first 3 columns of the first 10 records, stored as a stream of bytes. The number of partitions depends on the dividing factors: the smaller the factor, the larger the number of partitions, and vice versa. The values of the dividing factors are decided by the database owner/data administrator.

MAT is inspired by the method used to locate an element of a 2-D matrix in memory (RAM). It identifies the partition number and the local record number within that partition for a given (Row No., Column No.) of a record. MAT calculates them as follows: Sr. No. / df_r = (Q_r, R_r), where Q_r and R_r are the record-wise quotient and remainder, respectively; quotient Q_r is the partition id and remainder R_r is the local record id within the partition. Similarly, Column No. / df_c = (Q_c, R_c), where Q_c and R_c are the column-wise quotient and remainder; quotient Q_c is the partition id and remainder R_c is the local column id within the partition. Thus, for the record at (Sr. No., Column No.), partition [Q_r, Q_c] is searched at local position [R_r, R_c]. MAT stores the information as streams of bytes arranged in linked lists, for security reasons and easy access; insertion, deletion and updation in the linked lists are easy.

[Figure 4.3 shows a dummy relational database of 30 records (Sr. No. 0-29) with 9 columns (Sr. No., PAN No., Name, Address, Age, XY, YZ, AB, CD), overlaid with the nine partition boundaries [0, 0] through [2, 2].]

Figure 4.3 Logical View of Database Partitioning with df_r = 10, df_c = 3.

4.5.2 Database Partitioning


Records up to a fixed value of the primary key (decided at the time of partitioning) are placed in each partition, i.e., each partition contains a range of records bounded by fixed primary key values. The data items between the starting and ending primary keys reside in that particular partition and are indexed locally. The database is repartitioned when a partition changes by 25% due to insertion/deletion of records. Each partition of the database is replicated on a suitable number of sites (peers). The PI identifies the suitable peers with the help of the peer selection criterion {Section 4.5.4, Eqn (4.1)}.
The more replicas are used, the better the data availability of the partition. Only the owner of the data has complete information about the database; for security reasons, no other peer has this complete information. The scheme requires mutual coordination mainly between TC, TPPs and RC.

Creation of Partitions Corresponding to the Database


Prior to the creation of partitions, all decisions regarding the number of replicas per partition have to be finalized, and the addresses of the peers that will act as TC, TPPs and RC identified. These peers are selected through the peer selection criteria {Section 4.5.6}. The number of replicas per partition is decided on the basis of the required availability of the partitions (which depends on the required quality of service). The number of partitions may be controlled (increased or decreased) through the values of df_r and df_c. The operations involved in creating the partitions of the database are as follows:

1. The TC is selected among all active peers participating in the system, through the peer selection criteria.
2. Depending upon the environment and the requirements of the system, the number of replicas is decided to maintain the required level of availability.
3. df_r and df_c are decided as per the required number of partitions.
4. The mapping table structure is initialized and prepared simultaneously.
5. The target database (to be partitioned) is prepared to hold all the relevant information used by the system (record-wise), e.g., timestamp, updated, in use, etc.

6. Partitions are generated with the help of df_r and df_c.
7. Each partition of the database is stored on a number of selected peers (depending upon the required level of availability). The TPP module of the system is installed on the selected peers so that they may execute the received subqueries.
8. Addresses of the replicas corresponding to each partition are stored in the mapping table.
9. RC(s) are decided depending upon the availability of peer(s), and the RC module of the system is installed on the selected peer(s) to compile the results received from TPPs and synchronize with all other modules.
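Steps 3 and 6 above (choose df_r and df_c, then generate the partitions) reduce to scattering every cell of the table by the two dividing factors. The following is a minimal sketch under the assumption that the database is a list of equal-length records; the function name and in-memory dict representation are illustrative, not the thesis implementation.

```python
def mat_partition(records, df_r, df_c):
    """Scatter each cell of a table into its MAT partition.

    Returns a dict keyed by partition id (Qr, Qc); each partition maps
    local positions (Rr, Rc) to the stored values.
    """
    partitions = {}
    for i, record in enumerate(records):
        for j, value in enumerate(record):
            q_r, r_r = divmod(i, df_r)   # horizontal dividing factor
            q_c, r_c = divmod(j, df_c)   # vertical dividing factor
            partitions.setdefault((q_r, q_c), {})[(r_r, r_c)] = value
    return partitions

# The Figure 4.3 setup: 30 records x 9 columns with df_r = 10, df_c = 3
# yields nine partitions, [0, 0] through [2, 2].
```

Each resulting partition can then be serialized to a stream of bytes and shipped to the TPPs selected in step 7.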

4.5.3 Algorithm to Access the Partitioned Database


To access any data location in the database (stored in parts on a number of peers), a set of operations has to be performed. These operations depend on how the database is partitioned and stored on the peers. The operations performed at the various components of 3-TEM are as follows.
The global query is received after checking the authenticity of the requester. TC parses the received query and generates subqueries against the local partitions. The addresses of all replicas corresponding to each partition id are identified through the mapping table. The address of the RC is also identified through the peer selection criterion and is shared with the TPPs so that the required information can be sent to it. Each TPP executes the received subqueries and updates its partition according to the information packet received from the TC. RC collects the responses from all available replicas of each participating partition and consolidates the partial results. The result is further shared with the TC, and the various signals required to confirm the completion of the process are exchanged. The following steps are performed by 3-TEM.

Operations to Access the Partition at TC


1. The authenticity and privileges of the requester are analyzed, and the query is received from the authenticated requester. A token number corresponding to each query is provided to the requester, along with the address of the RC from which the results corresponding to the query will be received.

2. TC resolves the incoming global query and identifies the position of the record to be accessed in the global database, i.e., its Sr. No. The primary key of the record may be used to identify the Sr. No. against the global schema; conventional methods may be used to resolve the query (as per the primary key).

3. The record may have to be accessed from partitions stored at various remote locations, so the ids of the partitions from which the record is to be accessed are calculated:
(i) Row check: Sr. No. / df_r = (Q_r, R_r).
Case I. If all columns/fields of the record are to be accessed, then the partitions (Q_r, i), where i = 0 ... n-1, are the partitions to be accessed.
Case II. If a specific column of the specified row is to be accessed, a column check is performed: Column No. / df_c = (Q_c, R_c). The (R_r, R_c)-th position of the (Q_r, Q_c)-th partition holds the record with the required Sr. No.
(ii) The addresses of all available replicas corresponding to each partition (the addresses of the TPPs) are identified from the mapping table.
(iii) Information packets for the TPPs are prepared, i.e., PacketTPP [1,0], PacketTPP [1,1], PacketTPP [1,2], and so on, for all partitions that are actually going to be accessed.
(iv) The information packets are sent to all available replicas (TPPs) of the selected partitions.

(v) A result information packet for the RC(s) is prepared, i.e., PacketRC [], which includes the operation to be performed at the TPPs, the number of replicas of each partition, the partial results/acknowledgments expected from the TPPs, the query ids, timestamps, and information about the requester.
(vi) The result information packets are sent to the RC(s).

Operations at TPP:

After receiving the information packet from TC, the TPP checks the serial number (Sr. No.) of the record to be accessed in its local database, i.e., the position of the record.
(i) The information packet is analyzed for the operation to be performed and the data required for performing it.
(ii) The exact position of the required record within the local database of the replica is calculated from the supplied R_r and R_c.
(iii) The specified operation is performed on the data.
(iv) On completion of the operation, the partial data/acknowledgment is sent to the RC(s) in the form of result packets.
(v) An acknowledgement is sent to TC for successful access to the data items.

Operations at RC:

After receiving the information packets from TC, RC prepares a store to hold the partial results/acknowledgements expected in the form of result packets from the TPPs. The store is prepared to hold all result packets from the multiple replicas of each partition of the database.

(i) The result information packet for RC is analyzed for the operation to be performed and the space required for holding the results/acknowledgements of each replica.
(ii) RC waits until the result packets are received from the TPPs.
(iii) Each partial result/acknowledgement from the TPPs is positioned at its proper place so that the partial results can be compiled into the final result corresponding to the global query.
(iv) RC collects the process completion messages from each TPP and compiles them. It also sends the process completion message to TC.
(v) The compiled result is placed in the result pool.
The result is handed over to the authenticated requester after matching its corresponding token number.

4.5.4 Peer Selection Criterion

The following equation is used to compute the candidature of a peer; the candidature is then used to select the peers that will hold replicas:

Cd_i = AST_i·w1 + FSA_i·w2 + CPA_i·w3 + BD_i·w4 + Cr_i·w5        (4.1)

Where
Cd_i   Candidature of peer P_i to hold a replica.
AST_i  Average Session Time for which peer P_i is active in the system.
FSA_i  Free Space Available at peer P_i.
CPA_i  Computation Power Available at peer P_i.
BD_i   Bandwidth with which peer P_i is connected to the system/network.
Cr_i   Cardinality of the peer (number of connections of peer P_i).
w1, w2, w3, w4, w5 are weights that adjust the importance of the parameters.

Two parameters are considered while selecting a peer for storing replicas. The first is the candidature of the peer, which measures how capable the peer is of storing a replica. The second is the distance of each participating peer in the system, a measure of the cost spent in sending and receiving messages between the peers and the centre peer; for an efficient system this distance should be minimized. A priority queue is used to store the best peers, i.e., those with the largest candidature among all peers. The length of the priority queue is twice the number of peers required by the system (the length may be varied depending upon system requirements). These peers are the ones best suited to store replicas among all participating peers in the overlay.
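Eqn (4.1) and the double-length priority queue described above can be sketched as follows. This is an illustration only: the dictionary field names, weight vector and use of Python's `heapq` are assumptions, not the thesis implementation.

```python
import heapq

def candidature(peer, w):
    """Eqn (4.1): weighted sum of the five peer metrics."""
    return (peer["AST"] * w[0] + peer["FSA"] * w[1] + peer["CPA"] * w[2]
            + peer["BD"] * w[3] + peer["Cr"] * w[4])

def select_replica_peers(peers, weights, n_required):
    """Keep the 2*n_required peers with the largest candidature,
    mirroring the priority queue of double length described above."""
    scored = [(candidature(p, weights), p["id"]) for p in peers]
    return heapq.nlargest(2 * n_required, scored)
```

Distance-based filtering, the second parameter above, could then be applied to this shortlist before the final placement.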

4.6 Simulation and Performance Study


To study the performance of the MAT-integrated 3-TEM, some assumptions are made for the simulation. These assumptions are as follows.

4.6.1 Assumptions

A relational database is considered for the simulation. It is assumed that the database satisfies all the normal forms and is free from anomalies. TC takes care of the ACID properties of the database. All subtransactions carry the same security level as their parent transaction. The serializability of transactions and subtransactions is maintained by the TC and the TPPs. Timestamps are used to compare the freshness of data items; each subtransaction carries the timestamp of its parent transaction. TC performs the data placement, decides the dividing factors and the type of database partitioning (horizontal, vertical, or both), and has complete information about all the fields in the database (DB). TC also manages the concurrent execution of processes.

4.6.2 Simulation Model

To evaluate the performance of 3-TEM, an event-driven simulation model for a firm-deadline real time distributed database system has been developed {Figure 4.4}. This model is an extended version of the model defined in [164]. It consists of an RTDDBS distributed over n peers connected by a secure network. The components of the model are categorized into global and local components, described as follows:

Global Components: The Transaction Generator generates the transaction workload of the system with a specified mean transaction arrival rate and provides timestamps to the arrived transactions. The Transaction Manager models the execution behavior of the
transaction over the network and also resolves it into subtransactions. The Transaction Scheduler schedules the global as well as local subtransactions and makes them ready to dispatch; all global conflicts are resolved by it through timestamps. The Transaction Dispatcher dispatches the ordered transactions to the network, from where they finally reach the ready queue at the local peer. The Network Manager routes all the messages (traffic) among the peers and also keeps track of the addresses of all available peers.

Local Components: In the Ready Queue, all arrived subtransactions that are ready to execute are initially placed according to their priority; subtransactions get CPU ticks one by one in order of priority. The Wait Queue holds subtransactions that are blocked for any reason, e.g., resource conflicts or the concurrent execution of processes, until the corresponding conflicts are resolved. A transaction from the blocked queue gets the CPU once it is ready to execute, i.e., once all its conflicts are resolved.
The Concurrency Control Manager (CCM) implements the Timestamp based Secure Concurrency Control Algorithm (TSC2A) {described in Chapter 5}. It manages the concurrent execution of processes; all resource and process conflicts are resolved with the help of the timestamps associated with them. The Local Scheduler is responsible for managing the locks for subtransactions. Depending on the CCM, it decides whether a lock-requesting subtransaction can be processed, blocked in the wait queue, or restarted. It schedules the subtransactions and controls the processes contending for the CPU. At any given time, the transaction with the highest priority gets the CPU, unless it is blocked by other transactions due to a lock conflict.
In a firm-deadline system, transactions that have missed their deadlines are useless and are aborted from the system. The deadline of each transaction is checked before execution. Ready transactions wait for execution in the ready queue according to their priorities. Since main memory database systems can better support real time applications, it is assumed that the databases reside in main memory. A transaction requests locks on data items before executing on them. A restarted subtransaction releases all its locked resources and is restarted from its beginning; a successfully committed subtransaction also releases all its locked resources. Finally, the Sink collects statistics on the completed transactions from the peers.

[Figure 4.4 block diagram: arriving transactions pass through the Transaction Generator, Transaction Manager, Transaction Scheduler and Transaction Dispatcher at the coordinator, with the Network Manager routing traffic among Peers 1-5; at each local peer, subtransactions move from the Ready Queue through the Local Scheduler to computation, memory and database operations, may be blocked into the Wait Queue under the Concurrency Control Manager, and on commit or termination their statistics are collected by the Sink.]

Figure 4.4 Simulation Model for 3-TEM

The transaction scheduler is responsible for managing the locks for transactions. Depending on the Timestamp based Secure Concurrency Control Algorithm (TSC2A), the transaction scheduler determines whether a lock-requesting transaction can be processed, blocked, or restarted. A restarted transaction releases all locked resources and is restarted from its beginning; a transaction releases all its locked resources after successful commitment. The deadlines of the firm real time transactions are defined based on the execution times of the transactions as:

T_Deadline = T_ArrivalTime + (T_ExecutionTime + SF)        (4.2)

Where:
T_ArrivalTime : Time when a transaction arrives in the system.
SF : Slack factor, a random variable uniformly distributed over the slack range.
T_ExecutionTime = (T_TimeLock + T_TimeProcess + T_TimeUpdate) × No. of Operations        (4.3)

Where:
No. of Operations : Number of operations in the transaction.
T_TimeLock : CPU time required to set a lock.
T_TimeProcess : CPU time required to process an operation.
T_TimeUpdate : CPU time required to update a data object (for write operations).
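Eqns (4.2) and (4.3) can be sketched in code as follows, assuming the slack factor is simply added to the execution time as written above; the parameter names are illustrative and the time units are whatever the simulation uses.

```python
import random

def execution_time(n_ops, t_lock, t_process, t_update):
    """Eqn (4.3): per-operation CPU costs scaled by the operation count."""
    return (t_lock + t_process + t_update) * n_ops

def assign_deadline(arrival, exec_time, slack_range=(5, 20)):
    """Eqn (4.2): firm deadline from arrival time, execution time and a
    slack factor drawn uniformly from the slack range."""
    sf = random.uniform(*slack_range)
    return arrival + (exec_time + sf)
```

In the firm-deadline model, a transaction still active past `assign_deadline(...)` is aborted rather than allowed to finish late.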

4.6.3 Performance Metrics

The performance of the MAT-integrated 3-TEM is evaluated and compared with existing systems through simulation. The simulation uses the performance metrics defined in Table 4.1 and Table 4.2 and the performance parameters defined in Table 4.3.

Table 4.1 Performance Metrics-I

Network Load: the number of messages transferred in the network to propagate an update message through the underlay topology. For an efficient logical structure, the network load should be low; a high network load creates congestion in the P2P network.

Peer Availability: the total up time of an individual peer as a fraction of its total time.

Partition Availability: the total up time of a group of peers as a fraction of its total time.

Table 4.2 Performance Metrics-II

The following performance metrics are used to evaluate the performance.

Transaction Miss Ratio (TMR): the fraction of input transactions that fail to complete before the expiry of their deadlines, over the total number of transactions submitted to the system. TMR = T_Missed / T_Total

Transaction Restart Ratio (TRR): the fraction of transactions that are restarted for any reason, over the total number of transactions submitted to the system. TRR = T_Restart / T_Total

Transaction Success Ratio (TSR): the fraction of transactions that are committed successfully within their deadlines, over the total number of transactions submitted to the system. TSR = T_Success / T_Total

Transaction Abort Ratio (TAR): the fraction of transactions that are aborted for any reason, over the total number of transactions submitted to the system. TAR = T_Abort / T_Total

Throughput: the number of transactions successfully committed before their deadlines per unit time. Logical structures with high throughput can be utilized for high performance databases over P2P networks. Throughput = T_Committed / Total Time

Response Time: the time duration between a transaction being submitted and its first response from the system.
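The ratio metrics above reduce to simple count ratios over one simulation run; a minimal sketch with hypothetical counter names:

```python
def transaction_metrics(total, missed, restarted, success, aborted,
                        committed, total_time):
    """Ratios from Table 4.2, computed from raw counts of one run."""
    return {
        "TMR": missed / total,            # missed deadline / submitted
        "TRR": restarted / total,         # restarted / submitted
        "TSR": success / total,           # committed in deadline / submitted
        "TAR": aborted / total,           # aborted / submitted
        "Throughput": committed / total_time,  # committed per unit time
    }
```

In the simulation, the Sink component would accumulate these counts as transactions leave the system.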

Table 4.3 Performance Parameters Setup

Name of Parameter  | Default Settings             | Description
Num_Peer           | 200                          | Number of peers participating in the network
DB_Size            | 200 data items/database      | Size of the database
Mean_Arrival_Rate  | 0.2-2.0                      | Number of transactions/sec
Ex_Pattern         | Sequential                   | Transaction type (Sequential or Parallel)
Num_CPUs           |                              | Number of processors per peer
SlackFactor        | 5-20, uniformly distributed  | Transaction deadline slack
Min_HF             | 1.2                          | Threshold value of the health factor
Commit_Time        | 40 ms                        | Minimum time for commit processing
Num_Operation      | 3-8, uniformly distributed   | Number of operations in a transaction
U_Carda            | 5-15, uniformly distributed  | Number of connections of a peer in the underlay
O_Carda            | 5-10, uniformly distributed  | Number of connections of a peer in the overlay
Latency            | 5-20, uniformly distributed  | Latency of a connection between peers

4.6.4 Simulation Results

To evaluate the performance of the MAT-integrated 3-TEM, a series of experiments was performed using the performance metrics defined in Table 4.1 and Table 4.2 and the performance parameters defined in Table 4.3.
Figure 4.5 shows that the availability of an individual partition is higher than the overall system availability. It is also observed from the graph that, with 4 replicas per partition, partition availability reaches the acceptable range of approximately 0.7 at a peer availability of 0.35 for an individual partition and 0.55 for the system. To achieve availability at the acceptable level of 0.95, 6-7 peers with more than 0.7 availability may be used for the system.

[Figure 4.5 plots individual and system partition availability (0-1) against peer availability (0-0.9).]

Figure 4.5 Relationship between Peer Availability vs. Partitions Availability

Figure 4.6 shows that the throughput initially increases with the mean transaction arrival rate (MTAR); after reaching its peak it starts decreasing with further increase in MTAR. At the peak of the throughput curve, the system is at its best performance. It is observed that 1-TEM produces its best performance at an MTAR value in the range 1-1.2, whereas 3-TEM peaks at an MTAR value of 1.4: 3-TEM bears extra load and executes more transactions per second than 1-TEM. The throughput of 3-TEM is higher than that of 1-TEM for all values of MTAR, because 3-TEM takes a smaller span of CPU time to execute a transaction.
Figure 4.7 shows that the response times of 1-TEM (the conventional execution model) and 3-TEM are the same. It is also observed that the response time increases with the number of partitions, possibly due to network delay, which grows with the number of partitions.

[Figure 4.6 plots throughput (tps, 0-8) for 1-TEM and 3-TEM against MTAR (0-1.8).]

Figure 4.6 Relationship between Throughput vs. Mean Transaction Arrival Rate

[Figure 4.7 plots response time (ms, 0-70) for 1-TEM and 3-TEM against the number of partitions (0-14).]

Figure 4.7 Relationship between Numbers of Partitions vs. Response Time

Figure 4.8 shows that the Query Completion Ratio is initially high and then decreases with increasing MTAR. For small values of MTAR, the system has sufficient time and resources to execute the small number of queries arriving per second. The Query Completion Ratio of 3-TEM is always higher than that of 1-TEM; it starts decreasing near an MTAR value of 1.4, compared to 1 for 1-TEM. This indicates that 3-TEM completes more transactions than 1-TEM and bears a higher load (more transactions per second).

[Figure 4.8 plots the Query Completion Ratio (0-0.9) for 1-TEM and 3-TEM against MTAR (0-1.8).]

Figure 4.8 Relationship between Mean Transaction Arrival Rate vs. Query Completion Ratio

Figure 4.9 shows that the miss ratio increases with MTAR. It is also observed that beyond a certain value of MTAR the miss ratio rises rapidly; this value is 1.6 for 3-TEM and 0.6 for 1-TEM. This is because, beyond that point, transactions enter the system faster than they can be executed. Most resources are occupied by transactions, and the dependencies among resources also increase, so more transactions are blocked in the wait queue or aborted.
From Figure 4.10 it is observed that the restart ratio of transactions increases with MTAR and starts decreasing after a certain value, which is 1.2 for 1-TEM and 1.4 for 3-TEM. The peak in the restart ratio indicates that, beyond this value, transactions no longer have sufficient time to execute, i.e., the system refuses to allocate resources to transactions in the wait queue because the remaining time before their deadlines is too short. It is also observed that the restart ratio of 3-TEM is much lower than that of 1-TEM, because 3-TEM executes the subtransactions in parallel and the resources freed by each stage after execution are readily available for the next subprocess.

[Graph omitted: miss ratio of 1-TEM and 3-TEM plotted against MTAR (0 to 1.8).]

Figure 4.9 Relationship between Mean Transaction Arrival Rate vs. Miss Ratio

[Graph omitted: restart ratio of 1-TEM and 3-TEM plotted against MTAR (0 to 1.8).]

Figure 4.10 Relationship between Mean Transaction Arrival Rate vs. Restart Ratio

Figure 4.11 shows that the abort ratio increases with MTAR. Beyond a certain value of MTAR the abort ratio rises rapidly; this value is 0.6 for 3-TEM and 0.8 for 1-TEM. The abort ratio of 3-TEM is much lower than that of 1-TEM; thus, resource utilization in 3-TEM is higher than in 1-TEM.

[Graph omitted: abort ratio of 1-TEM and 3-TEM plotted against MTAR (0 to 1.8).]

Figure 4.11 Relationship between Mean Transaction Arrival Rate vs. Abort Ratio

4.7 Advantages of 3-TEM


In 3-TEM the complete conventional execution process is divided into small subprocesses, i.e., TC, TPP and RC, each of which requires fewer CPU ticks to execute. Such short processes are well suited to a dynamic P2P environment, where participating peers have short session times: even peers with short session times can completely execute subprocesses that require only a few CPU ticks. Parallel execution may increase throughput approximately three-fold, and the speedup factor approaches three when a large number of subprocesses is executed. The possibility of wasting work when a participating peer leaves is lower for short processes. The distribution of work reduces the effect of a single point of failure at the TC level, and the dependency on the head peer is reduced because responsibilities are distributed between the TC and the RC.
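The near three-fold speedup claimed above can be illustrated with a toy timing model (an assumption for illustration only, not the thesis's simulator): with three equal-length stages running as a pipeline, the first transaction finishes after three ticks and every later one finishes one tick apart, so the speedup over sequential 1-TEM execution approaches three as the number of transactions grows.

```python
# Toy pipeline model (illustrative assumption: every stage takes 1 tick).

def time_1tem(n: int, stage_ticks: int = 1, stages: int = 3) -> int:
    # 1-TEM: each transaction runs all stages back to back, one at a time.
    return n * stages * stage_ticks

def time_3tem(n: int, stage_ticks: int = 1, stages: int = 3) -> int:
    # 3-TEM pipeline: first result after `stages` ticks, then one per tick.
    return (stages + (n - 1)) * stage_ticks

def speedup(n: int) -> float:
    return time_1tem(n) / time_3tem(n)
```

For a single transaction both models take three ticks, but for 1000 transactions the pipeline finishes in 1002 ticks against 3000 sequentially, a speedup just under three.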

4.8 Discussion
From the simulation results it is observed that partition availability reaches the acceptable range of more than 0.7 when the availability of each individual peer is 0.35. To achieve availability at the acceptable level of 0.95, 6-7 peers each with availability above 0.7 may be recommended for the system. It is observed that the 1-Tier Execution Model (1-TEM) produces its best performance at MTAR values from 1 to 1.2, whereas for 3-TEM this range is 1.2 to 1.4. Thus, 3-TEM can bear extra load, i.e., it can execute more transactions per second than 1-TEM. The throughput of 3-TEM is higher for all values of MTAR, because only a small span of CPU time is required to execute a transaction and 3-TEM executes the subtransactions in parallel.
The response time increases with the number of partitions; higher network delay is responsible for this. The query completion ratio of 3-TEM is always higher than that of 1-TEM, because of parallelism and the availability of resources in the system. Resource wastage is much lower for 3-TEM than for 1-TEM, because its restart ratio and miss ratio are always lower; thus resource utilization is higher for 3-TEM. Resource availability in the system is also higher for 3-TEM because, with small subprocesses, a process holds a resource only for a short duration. The reduced load at the TC also helps to improve the performance of 3-TEM.

4.9 Summary
In this chapter the 3-Tier Execution Model (3-TEM) is presented, which divides the complete execution process into three independent stages. It provides higher throughput and greater load-bearing capability than 1-TEM. It provides a high query completion ratio with comparable response time, and it enhances the resource utilization of the system. The 3-TEM is integrated with MAT. MAT partitions the database horizontally, vertically or both; it places small partitions of the database over the P2P network and supports all operations on the partitioned database. It also provides a first layer of security by dividing the database into small partitions: the possibility of misusing the information is reduced because only small, partial information is stored at each peer.
The next chapter presents a timestamp based concurrency control algorithm for distributed databases over P2P networks.


Chapter 5

Concurrency Control in Distributed


Databases over P2P Networks
A Real Time Distributed Database System (RTDDBS) is one in which a global database is partitioned into a collection of local databases stored at different sites. Distribution of real time data over a network always raises issues of concurrency, security and time-bounded response. In this chapter we present a Timestamp based Secure Concurrency Control Algorithm (TSC2A). It maintains the security of data and time-bounded transactions along with controlled concurrency in the system.
The rest of the chapter is organized as follows. Section 5.1 presents the introduction. Section 5.2 describes the system model. Section 5.3 gives the transaction model. Section 5.4 discusses serializability of transactions. Section 5.5 presents the Timestamp based Secure Concurrency Control Algorithm (TSC2A). Section 5.6 presents the simulation and performance study. The findings are discussed in Section 5.7 and finally the chapter is summarized in Section 5.8.

5.1 Introduction
In a Real Time Distributed Database System (RTDDBS) multiple database sites are linked by a communication system in such a way that the data at any site is available to users at other sites. Such a system has several characteristics: (1) a transparent interface between users and data sites; (2) the ability to locate data; (3) a Database Management System (DBMS) to process queries; (4) distributed concurrency control and recovery procedures over the network; and (5) mediators that translate queries and data between heterogeneous systems.
A Secure Real Time Distributed Database System (SRTDDBS) defines security classes and restricts database operations based on security levels. It secures every transaction and data item in the system. The security level of a transaction represents its clearance and classification levels. Concurrency control is an integral part of a database system; it manages the concurrent execution of different transactions on the same data item without violating consistency.
Communication in a distributed system is complex and rapidly changing. There are many different links, channels, or circuits over which data may travel on the network. In addition, several issues must be considered when transferring data and transactions across the network, viz., illegal information flows through covert channels, security of transactions, concurrency control over concurrent transactions, route establishment time, end-to-end network delay, network bandwidth (transfer rate), fault tolerance and reliability, etc. We concentrate on the covert channel problem, to avoid illegal information flow, and on a timestamp based concurrency control algorithm for executing concurrent transactions.

5.2 System Model


In SRTDDBS a global database is partitioned into a collection of local databases stored at different sites. The system consists of a set of ns sites. Each site ni holds a secure database, which is one partition of the global database scattered over all ns sites. Each peer has an independent processor connected through communication links to the other peers.
A global transaction Tr is generated by a user and submitted to the system for execution. A secure distributed database is defined as a five-tuple <Dt, Op, Ts, Sc, Lv>, where Dt is the set of data items and Op is the set of operations defined for the SRTDDBS, Op = {Op1, Op2, Op3, ..., Opk}, where k is less than the maximum number of operations defined for the transactions. A global transaction Tr can be divided into i subtransactions, Tr = {tr1, tr2, tr3, ..., tri}. The coordinator assigns a timestamp Ts to each transaction at the time of its arrival into the system, and all transactions are ordered in ascending order of their timestamps. A subtransaction carries the timestamp tsi of its parent transaction Tri. Sc is the partially ordered set of security levels with an ordering relation ≤, and Lv is a mapping from Dt ∪ Tr to Sc. A security level Sci is said to dominate a security level Scj iff Scj ≤ Sci. A subtransaction tri precedes a subtransaction trj if the timestamp tsi of tri is smaller than the timestamp tsj of trj, i.e., tsi < tsj.

For every data object x ∈ Dt, Lv(x) ∈ Sc, and for every tri ∈ Tr, Lv(tri) ∈ Sc. Each secure database N is also mapped to an ordered pair of security classes (Lv_min(N), Lv_max(N)), where Lv_min(N), Lv_max(N) ∈ Sc and Lv_min(N) ≤ Lv_max(N). In other words, every secure database in the distributed system has a range of security levels associated with it. A data item x is stored in a secure database N if it satisfies the condition Lv_min(N) ≤ Lv(x) ≤ Lv_max(N). Similarly, a distributed transaction Tr is executed at N if it satisfies the condition Lv_min(N) ≤ Lv(Tr) ≤ Lv_max(N). A site Ni is allowed to communicate with another site Nj iff Lv_max(Ni) = Lv_max(Nj). The security policy used is based on the Bell-La Padula model [190] and enforces the following restrictions:

Simple Security Property: A transaction Tr (subject) is allowed to read a data item (object) x iff Lv(x) ≤ Lv(Tr).

Restricted Property: A transaction Tr is allowed to write a data item x iff Lv(x) = Lv(Tr).

Thus, a transaction can read objects at its level or below, but it can write objects only at its level. A transaction with a low security level is not allowed to write to data objects at a higher security level. This restriction helps preserve database integrity. In addition to these two requirements, a secure system must guard against illegal information flows through covert channels.
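The access rules above can be sketched as simple predicates. This is a minimal illustration, assuming integer security levels ordered by ≤; the function names are illustrative, not from the thesis.

```python
# Bell-La Padula style checks as used in Section 5.2 (levels are integers).

def can_read(tx_level: int, obj_level: int) -> bool:
    # Simple Security Property: read at or below the transaction's level.
    return obj_level <= tx_level

def can_write(tx_level: int, obj_level: int) -> bool:
    # Restricted Property: write only at exactly the transaction's level,
    # which prevents write-downs (a covert-channel risk).
    return obj_level == tx_level

def can_store(db_min: int, db_max: int, obj_level: int) -> bool:
    # A data item x is stored at database N iff Lv_min(N) <= Lv(x) <= Lv_max(N).
    return db_min <= obj_level <= db_max
```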

5.3 Transaction Model


A user at any peer may issue a global transaction against a global schema. This schema may be made accessible to all users by one of the following configurations:
(a) Replicate the global schema on all peers.
(b) Select a number of peers (coordinators) to maintain copies of the global schema and the global transaction manager, and direct requests for the global schema to the nearest coordinator.
(c) Select only one peer (called the coordinator) to maintain the global schema and the global transaction manager, and direct requests for the global schema to that peer.
For the proposed algorithm, the second and third configurations are favored over the first, because it is difficult to maintain a copy of the global schema at every peer, and doing so also hinders the expandability and simplicity of the system. In case (c) the coordinator solves the problem of assigning timestamps, since it is responsible for assigning timestamps to all global transactions. Case (c) is considered for the implementation of TSC2A.
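Configuration (c) can be sketched as follows: a single coordinator hands out globally unique, monotonically increasing timestamps, and every subtransaction inherits its parent transaction's timestamp. The class and method names below are illustrative assumptions, not thesis terminology.

```python
import itertools

class Coordinator:
    """Single coordinator of configuration (c): one timestamp per global
    transaction, inherited by all of its subtransactions."""

    def __init__(self):
        self._clock = itertools.count(1)   # monotonically increasing counter

    def submit(self, subtx_names):
        ts = next(self._clock)             # timestamp assigned on arrival
        # Every subtransaction carries the parent transaction's timestamp.
        return [(name, ts) for name in subtx_names]
```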

5.4 Serializability of Transactions


To handle the concurrent execution of transactions in the system, serializability is enforced at the global and local levels, i.e., by the coordinator and by the TPPs, respectively. To maintain global serializability, the coordinator, with the help of the data manager, uses the timestamp information available in the DAT. All global transactions enforce serializability by default, owing to the timestamp attached to each global transaction. To maintain local serializability at a TPP, the timestamps inherited from the global transactions are used. The coordinator sends subtransactions in order to the TPPs through communication links. When subtransactions are not received in order at a TPP, due to communication delay, path failure, etc., the TPP itself arranges them in order according to their associated timestamps. A subtransaction may be blocked after a subtransaction with a lower timestamp is received; blocked subtransactions are restarted in their turn. Hence, local serializability is also guaranteed by this mechanism.
Let Tr be the set of global transactions to be executed. A transaction from Tr is resolved into subtransactions tr. Subtransactions from tr are executed such that, if a subtransaction tri precedes a subtransaction trj in this ordering, then for every pair of atomic operations Opi and Opj, from tri and trj respectively, Opi precedes Opj in each local schedule. The execution of subtransaction trj can be blocked by the TPP after tri is received, which yields local serializability. Therefore, if the coordinator submits subtransactions to the TPPs in a serializable order, then the TPPs execute the subtransactions in serializable order, which guarantees overall serializability in the system.
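The TPP-side reordering described above can be sketched as a small buffer keyed by timestamp: an arrival whose timestamp is ahead of a missing earlier one is blocked, and blocked subtransactions are released in order once the gap is filled. This is an illustrative sketch under the assumption of consecutive integer timestamps; the names are not from the thesis.

```python
import heapq

class TPP:
    """Executes subtransactions in timestamp order despite out-of-order arrival."""

    def __init__(self):
        self._pending = []   # min-heap of (timestamp, subtransaction)
        self._next_ts = 1    # next timestamp expected in serial order
        self.executed = []   # execution log, in serializable order

    def receive(self, ts: int, subtx: str) -> None:
        heapq.heappush(self._pending, (ts, subtx))   # block until its turn
        # Release every blocked subtransaction that is now in order.
        while self._pending and self._pending[0][0] == self._next_ts:
            _, ready = heapq.heappop(self._pending)
            self.executed.append(ready)
            self._next_ts += 1
```

For example, if timestamp 2 arrives before timestamp 1 (communication delay), it stays blocked; the arrival of timestamp 1 releases both, preserving local serializability.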


5.5 A Timestamp Based Secure Concurrency Control Algorithm (TSC2A)


A timestamp is normally used to manage the sequence of execution of transactions in a distributed system. A global concurrency control algorithm designed using timestamps keeps transactions in a serializable order for execution and achieves authenticated results. The order of transactions in timestamp based concurrency control depends entirely on the read/write timestamps associated with each data item and on the timestamp of the submitted transaction. Serializability plays the major role in sequencing the transactions.

5.5.1 Algorithm for Write Operation


TSC2A prescribes the sequence of operations performed to execute read/write transactions in the system. Before an operation is executed on a data item, the algorithm checks the timestamps attached to the data item and compares them with the timestamp of the requesting transaction, thereby managing concurrent processes in the system. The steps involved in the execution of a write operation are as under:

Algorithm for a write operation on data item x requested by subtransaction Si with timestamp Tsi:

If ( RTs(x) > Tsi )
{
    Abort( Si );
}
Else
{
    If ( WTs(x) > Tsi )
    {
        Ignore( Si );
    }
    Else
    {
        If ( Lv(x) == Lv(Si) )  /* Lv(x) & Lv(Si) are the security levels of
                                   data item x & transaction Si respectively */
        {
            Writelock( x );
            Execute( x );
            WTs(x) = Tsi;
            Update DAT to Tsi;
        }
        Else
        {
            Abort( Si );  /* access denied due to security */
        }
    }
} // end Algorithm

5.5.2 Algorithm for Read Operations


The steps involved in the execution of a read operation are as under:

Algorithm for a read operation on data item x requested by subtransaction Si with timestamp Tsi:

If ( WTs(x) > Tsi )
{
    Abort( Si );
    Rollback( Si );
}
Else
{
    If ( Lv(x) <= Lv(Si) )
    {
        Readlock( x );
        Execute( x );
        RTs(x) = Tsi;
        Update DAT to Tsi;
    }
    Else
    {
        Abort( Si );
        Rollback( Si );
    }
} // end Algorithm
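The two listings above can be rendered as a small executable sketch. This is a hedged illustration of the timestamp and security checks only (locks, rollback and the DAT update are elided); the `DataItem` class and the outcome strings are illustrative assumptions, not thesis API.

```python
class DataItem:
    """A data item with a security level and read/write timestamps."""
    def __init__(self, level: int, rts: int = 0, wts: int = 0):
        self.level, self.rts, self.wts = level, rts, wts

def tsc2a_write(x: DataItem, ts: int, tx_level: int) -> str:
    if x.rts > ts:
        return "abort"        # a younger transaction has already read x
    if x.wts > ts:
        return "ignore"       # obsolete write, safely skipped
    if x.level != tx_level:
        return "abort"        # access denied due to security
    x.wts = ts                # perform the write, update write timestamp
    return "write"

def tsc2a_read(x: DataItem, ts: int, tx_level: int) -> str:
    if x.wts > ts:
        return "abort"        # x was overwritten by a younger transaction
    if x.level > tx_level:
        return "abort"        # simple security property violated
    x.rts = max(x.rts, ts)    # perform the read, update read timestamp
    return "read"
```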

Global transactions are not likely to be rolled back frequently. But when a global subtransaction is rolled back by TSC2A, all subtransactions of the corresponding global transaction are rolled back. TSC2A enhances the execution autonomy of the TPPs by rolling back a global transaction at the coordinator site, before its subtransactions are sent to the relevant TPPs. This job is normally done through the DAT.

5.6 Simulation and Performance Study


To study the characteristics of TSC2A, we have implemented it on the 3-TEM system described in Section 4.4. It is an event-driven simulation model of a firm-deadline real time distributed database system. The model consists of an RTDDBS distributed over m peers connected by a network, with data partitions replicated over multiple peers. The execution model and architecture of 3-TEM are shown in Figure 4.1 and Figure 4.2, respectively. The deadlines of the firm real time transactions are defined on the basis of the execution times of the subtransactions, using eqns (4.2) and (4.3) given in Section 4.6.2.

5.6.1 Performance Metrics

The performance of TSC2A is evaluated through simulation and compared for three cases, viz., low, medium and high security. In the simulation we use the performance parameters defined in Table 4.3 and the performance metrics defined in Table 4.2 (Chapter 4).

5.6.2 Assumptions

The following assumptions are made during the implementation of TSC2A:

- Arrivals of transactions at a peer are independent of the arrivals at other sites.
- Each global transaction is assigned a unique identifier.
- Each global transaction is decomposed into subtransactions to be executed by the TPPs.
- Subtransactions inherit the identifier of the global transaction.
- No site or communication failure is considered.
- Executing a transaction requires the CPU and the data items located at a peer.
- Communication links are used to connect the peers.
- There is no global shared memory in the system; all peers communicate via messages exchanged over the communication links.
- Each transaction is assigned a globally distinct real time priority using a specific priority assignment technique; Earliest Deadline First is used in the simulation.
- The cohorts of a transaction are activated at the corresponding TPPs to perform the operations.
- A distributed real time transaction is said to commit if the coordinator has reached the commit decision before the expiry of the deadline.
- Each cohort makes a series of read and update accesses.
- A transaction that is already in the dependency set of another transaction, or that already has another transaction in its dependency set, cannot permit another incoming transaction to read or update.
- A read access involves a concurrency control request to obtain access, followed by a disk I/O to read the item, followed by CPU usage to process the data item.
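The Earliest Deadline First assignment named in the assumptions can be sketched in a couple of lines: among the waiting transactions, the one whose deadline is nearest gets the highest priority. The representation of a transaction as a (name, deadline) pair is an illustrative assumption.

```python
def edf_order(transactions):
    """Earliest Deadline First: order (name, deadline) pairs by priority,
    nearest deadline first."""
    return [name for name, _ in sorted(transactions, key=lambda t: t[1])]
```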

5.6.3 Simulation Results

To evaluate the performance of TSC2A, a series of simulation experiments was performed. Three security levels are considered in the simulation, i.e., low, medium and high. We report the important results obtained from the simulation experiments.
From Figure 5.1 it is observed that more transactions miss their deadlines as MTAR increases. This increase in the miss ratio occurs because more transactions must wait for their turn to be scheduled on the CPU of a peer. It is also observed that the miss ratio is higher for high security transactions than for transactions at the other two security levels, due to the interaction of high security transactions with low security transactions. This shows that implementing high security transactions in a distributed database system may compromise the miss ratio, and the throughput of these high security transactions is reduced.
From Figure 5.2 it is observed that the transaction restart ratio increases with MTAR. After reaching a point, the restart ratio decreases with further increase in MTAR; this is the point at which the maximum number of transactions is restarted. The decrease in restart ratio occurs because transactions no longer have sufficient remaining time to execute before their deadlines expire; at this point the abort ratio of transactions is high. Hence, the restart ratio decreases beyond this value of MTAR, which differs for each security level. The transaction restart ratio is highest for high security transactions, because these transactions restart while waiting for high security data items.

[Graph omitted: miss ratio of high, medium and low security transactions plotted against MTAR (0 to 1.8).]

Figure 5.1 Comparison between Miss Ratio of Transactions and Mean Transaction Arrival Rate (MTAR)

[Graph omitted: restart ratio of high, medium and low security transactions plotted against MTAR (0 to 1.8).]

Figure 5.2 Comparison between Transaction Restart Ratio and MTAR

It is observed from Figure 5.3 that the success ratio initially increases for all three security levels, and after a certain value of MTAR it decreases. The variation in this value arises from the transaction execution rate, which is higher for low security transactions than for medium and high security ones, and higher for medium than for high security transactions. The system is exhausted at a higher value of MTAR for low security transactions than for medium and high security transactions, because transactions must wait in the queue until resources become available. The success ratio starts decreasing after a certain value of MTAR: up to this point the system has sufficient resources and time to execute transactions at the rate they arrive, but beyond it the wait queue grows because the arrival rate exceeds the completion rate. Under this load, all resources become busy and transactions must wait further for locks on resources. Thus, the success ratio decreases with further increase in MTAR. Low security transactions have the highest success ratio among the three security levels.

[Graph omitted: success ratio of high, medium and low security transactions plotted against MTAR (0 to 1.8).]

Figure 5.3 Comparison between Transaction Success Ratio and MTAR

From Figure 5.4 it is observed that the transaction abort ratio increases with the mean transaction arrival rate in all three cases: high, medium and low security. The abort ratio is highest for high security transactions, because higher priority is given to low security transactions: whenever a data conflict occurs between a high security and a low security transaction, the high security transaction is aborted and restarted after some delay.

[Graph omitted: abort ratio of high, medium and low security transactions plotted against MTAR (0 to 1.8).]

Figure 5.4 Comparison between Transaction Abort Ratio and MTAR

Figure 5.5 shows the transaction throughput as a function of the MTAR per peer. The throughput of TSC2A initially increases with the arrival rate and then decreases with further increase. The peak values represent the transaction arrival workload that the system can bear; this value differs for the three cases and is highest for low security transactions. The overall throughput of high security transactions is always lower than that of medium and low security transactions, and the gap between high and low security throughput widens as the arrival rate increases.
[Graph omitted: throughput (tps) of high, medium and low security transactions plotted against MTAR (0 to 1.8).]

Figure 5.5 Comparison between Throughput and MTAR

5.7 Discussion
From the simulation results it is observed that restarts degrade the performance of the system, both in the time required by the database coordinator to reset the database to its previous state and in the computation time of the individual transactions. This degradation grows with the number of concurrent transactions. Transaction restarts increase because of the time required to obtain permission to access high security data items and because of other conflicts over resource locks. The transaction arrival rate should therefore be managed so that the abort and restart ratios are minimized and the throughput is maximized.
The transaction execution rate is higher for low security transactions than for medium and high security transactions. The system begins to be exhausted at a different value of MTAR in each of the three cases, and this value is highest for low security transactions. The load-bearing capacity of the system also varies with the security levels used for the transactions and data items: in terms of transaction execution rate, it is higher for low security transactions than for medium and high security ones.
It is also observed that the throughput of the system decreases as the security level of the transactions increases, because the probability of successfully executing a transaction decreases; there is a tradeoff between the security level and the throughput of the system.

5.8 Summary
In this chapter, we have presented the Timestamp based Secure Concurrency Control Algorithm (TSC2A). The algorithm takes care of the security of the transactions and of the data items stored at the various peers, and it controls the flow so that high security data items cannot be accessed by low security transactions.
TSC2A secures data items and transactions through security levels and restricts data access across security levels. It also avoids the covert channel problem in accessing the database. TSC2A ensures serializability in the execution of transactions, enforcing the serializability property at the global (TC) and local (TPP) levels of the system.
The next chapter discusses a topology adaptive traffic controller for P2P networks.


Chapter 6

Topology Adaptive Traffic Controller


for P2P Networks
In structured and unstructured P2P systems, the frequent joining and leaving of peers causes a topology mismatch between the P2P logical overlay network and the physical underlying network. When one peer communicates with another over the overlay topology, the exchanged message travels a multi-hop distance in the underlay topology. A large portion of the redundant traffic is caused by this topology mismatch between overlay and underlay, which makes unstructured P2P systems unscalable.
This chapter presents a Common Junction Methodology (CJM) to reduce overlay traffic at the underlay level. It finds a common junction between the available paths and routes the traffic through this common junction, avoiding the conventionally identified paths. Simulation results show that CJM resolves the mismatch problem and significantly reduces redundant network traffic. The methodology works for both structured and unstructured P2P networks. CJM reduces the response time by up to approximately 45%. It does not alter the overlay topology and operates without affecting the search scope of the network.
The rest of the chapter is organized as follows. Section 6.1 presents the introduction. Section 6.2 discusses the system model. Section 6.3 gives the system architecture. Section 6.4 presents CJM. The simulation and performance study are given in Section 6.5. Section 6.6 gives the advantages of CJM. A discussion is provided in Section 6.7 and finally the chapter is summarized in Section 6.8.

6.1. Introduction
A Peer-to-Peer (P2P) network is an abstract, logical network called an overlay network. Instead of strictly decomposing the system into clients (which consume services) and servers (which provide services), each peer in the system elects to provide services as well as consume them. All participating peers form a P2P network over a physical network. The network overlay abstraction provides flexible and extensible application-level management techniques that can be easily and incrementally deployed regardless of the underlying network. When a new peer joins the network, a bootstrapping node provides a list of IP addresses of existing peers in the network. The new peer then tries to connect to these peers; the peers for which the attempts succeed become the new peer's neighbors. Once connected, the new peer periodically pings its network connections and obtains the IP addresses of other peers, which it caches. When a peer leaves the network and later wants to rejoin (i.e., not for the first time), it tries to connect to the peers whose IP addresses it has already cached.
Peers randomly joining and leaving the network cause a topology mismatch between the P2P logical overlay and the physical underlying network, generating a large volume of redundant traffic. The flooding-based routing algorithm generates 330 TB/month in a Gnutella network with only 50,000 nodes [10]. A large portion of the heavy P2P traffic is caused by the topology mismatch between overlay and underlay, which makes unstructured P2P systems unscalable. A message exchanged between peers in the overlay topology travels a multi-hop distance in the underlay topology. To maintain the topology, many data and control messages have to be sent from one peer to another in the overlay network. Generally a flooding technique is used for searching for a peer or data item in P2P overlay networks: the search messages are sent to all connected peers, bounded by a TTL. The message load in the overlay is multiplied several times in the underlay topology, generating heavy redundant traffic in the network.
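The cost of TTL-bounded flooding can be estimated with a back-of-the-envelope sketch. Assuming a regular overlay where every peer has d neighbours (an illustrative assumption; real overlays are irregular), the source sends d copies of a query and each forwarding peer sends d−1 more, so one query injects roughly d + d(d−1) + d(d−1)² + ... messages up to the TTL, before any underlay multiplication.

```python
def flood_messages(d: int, ttl: int) -> int:
    """Overlay messages injected by one TTL-bounded flood in a d-regular
    overlay: the frontier grows by a factor of (d - 1) per hop."""
    total, frontier = 0, d
    for _ in range(ttl):
        total += frontier
        frontier *= (d - 1)
    return total
```

For example, with d = 4 and TTL = 5 a single query already costs 4 + 12 + 36 + 108 + 324 = 484 overlay messages, each of which may traverse several underlay hops.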
In P2P networks, peer nodes rely on one another for services rather than solely on dedicated, often centralized, infrastructure; decentralized data-sharing and discovery algorithms and mechanisms are thus the enabling option for the deployment of P2P networks. The challenge for researchers is to address the topology mismatch problem so as to remove unnecessary redundant traffic from the network. The problem can be introduced more clearly using Figure 6.1.
In Figure 6.1, eight peers (numbered 1-8) participate in the underlay network, of which only four are in the overlay network. We operate in the overlay network, and therefore we have two cases. First, we may deliberately send a message from one peer to another. Second, there may be no option but to send the message from one peer to another via some intermediate peer in the overlay network. Both cases cause heavy redundant traffic on the physical network. In Figure 6.1, suppose we send a query from peer 1 to peer 6 in the overlay; call this Path(1) from (1, 6).

[Figure omitted: an overlay topology of four peers drawn above the underlay topology of eight peers.]

Figure 6.1 Overlay and Underlay Networks Setup

In the underlay this path is an ordered set of peers; for the considered example:

Path(1) = {1, 2, 3, 5, 6} from (1, 6)
Path(2) = {6, 5, 3, 4} from (6, 4)
Path(3) = {4, 3, 7, 8} from (4, 8)

Say a query is sent along peers (1, 6, 4) in the overlay. In the underlay it travels
{1, 2, 3, 5, 6, 5, 3, 4}, so the traffic cost over {3, 5, 6} is paid twice and wasted.
Similarly, if the query is sent three hops in the overlay (to peer 8), the traffic cost
over both {3, 5, 6} and {3, 4} is wasted. This is one of the major reasons for
unwanted heavy traffic in any P2P network, and this wasted traffic is what we seek to
save.
A mechanism is therefore required to remove redundant traffic from the P2P
network. The search scope of the overlay should not change while reducing the
network traffic, and the overlay topology must remain unaltered while shortening the
path. In this chapter a novel Common Junction Methodology (CJM) is proposed to
reduce traffic in P2P networks; it also reduces the response time of the system.

6.2 System Model


To reduce the redundant traffic generated by the topology mismatch problem, the
path used to transfer a message must be shortened. A message traverses a number of
underlay peers while being forwarded in the overlay topology, and some underlay
peers are visited multiple times when messages are forwarded indirectly. This
repeated traversal generates redundant traffic, a large fraction of the total traffic in
the system, which can be removed without affecting system performance. To address
this problem, the following system model is proposed:
The set of peers participating in the system (underlay and overlay) is represented
as P = {p1, p2, p3, ..., pn}.

The set of peers participating in the overlay topology for computing, sharing of
data and services, and forwarding of messages is represented as
PO = {p1, p2, p3, ..., pm}.

All other peers, which are not in the overlay topology, form the set of underlay
peers PU = P − PO.

The path traversed by a message forwarded between two peers a and b in the
overlay, starting from source a to destination b (a direct path), is represented as
Path^i(a, b). This i-th path from a to b is the ordered set of underlay peers traveled
by the message, Path^i(a, b) = {a, px, py, ..., b}.

A message may use two paths (an indirect path), Path^i(a, b) and Path^j(c, d),
while being forwarded from source to destination, i.e., a to b and then c to d, iff
b = c; otherwise forwarding is not possible through the conventional method. This is
represented as Path^{i,j}(a, d) = Path^i(a, b) . Path^j(c, d). Two paths i and j can
be joined only when the destination of path i and the source of path j are the same.
Similarly, the cost of a path i, Cost^i(a, b), is the cost to transfer unit data from
source a to destination b. The cost of two consecutive paths is

Cost^{i,j}(a, d) = Cost^i(a, b) + Cost^j(c, d), if b = c; ∞ otherwise
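The cost rule above can be sketched as follows; `combinedCost` and its parameter names are illustrative, not from the thesis:

```cpp
#include <limits>

// Cost of forwarding over two consecutive overlay paths i and j:
// finite only when the destination of path i (b) equals the source
// of path j (c); otherwise no continuous route exists.
double combinedCost(int b, double costI, int c, double costJ) {
    if (b == c) return costI + costJ; // paths join at b == c
    return std::numeric_limits<double>::infinity();
}
```

Returning infinity mirrors the "otherwise" branch of the cost definition: a query cannot be forwarded between two paths that do not meet.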

6.3 System Architecture


Figure 6.2 presents a 3-Layer Traffic Management System (3-LTMS) for overlay
networks. The first layer of 3-LTMS is under overlay control, i.e., it provides the
interface to the P2P system and receives queries or subqueries.

[Figure: the Application and Overlay Network layers sit above a layer containing the
Query Analyzer, Query Optimizer, Query Execution Engine, and Path Manager; the
Physical Network Manager interfaces to the Underlay Network below]
Figure 6.2: 3-Layer Traffic Management System (3-LTMS) for Overlay Networks

This layer is implemented over the application layer of the network. The next layer
comprises four components. The Query Analyzer accepts queries, resolves them, and
forwards them to the Query Optimizer; it is also responsible for breaking a query
into subqueries. The Query Optimizer decides whether a peer is suitable for a
particular subquery. The Query Execution Engine executes a subquery and produces
the corresponding partial results, which are then sent to the requesting peer or to the
peer responsible for compiling them. The Path Manager implements CJM and is
responsible for reducing paths: it shortens the logical path between peers in the
overlay network, and all paths are checked and reduced (if possible) using the
database of underlay peers. The third layer, the Physical Network Manager, is
responsible for managing the underlay; it utilizes the information from the Path
Manager.

6.4 Common Junction Methodology (CJM)


To reduce unnecessary traffic in the network, queries should be routed through
shorter paths. Every path in the overlay network is assigned a number; let m be the
number of such paths. Let Path^i(a, b) (from peer a to peer b) in the logical topology
be the ordered set of peers in the underlay topology, and let n be the hop count of
the path for which we want to find the shortest path. The problem is to find the
optimum route for a path of length n; the path and its total traffic cost are identified
as follows.

CJM is based on the assumption that a physical path corresponding to the logical
path exists before queries/messages are sent across the network, i.e., the path in the
overlay topology. In this chapter, path cost and traffic cost are used interchangeably.
The assumptions are summarized as follows:
(a) A path exists between source and destination peers participating in the overlay
network.
(b) A peer rejoins at the same position from which it left.
(c) Each edge of the graph carries a traffic cost, the total cost to transfer unit data.
(d) If a path is broken, a conventional strategy (possibly flooding) is used to find a
new path.

6.4.1 Common Junction Methodology Algorithm

To avoid redundant traffic in P2P networks, two physical paths in the underlay may
be checked for any common junction other than the source/destination. This
common junction can be used to reroute traffic and avoid the conventional,
higher-cost path. The following algorithm is used to find the common junction.

Step 1. Initialize variables:
        Let n be the hop count of the path.
        S1..Sn are the source peers of the respective individual paths.
        D1..Dn are the destination peers of the respective individual paths.
        r1..rn are the common peers found in the respective individual paths.
        Start Peer = S1;
        TTC = 0; // TTC is Total Traffic Cost
Step 2. For all values of i from [1...(n − 1)];
        j varies from [n...(i + 1)];
Step 3. While (j > i) go to Step 4;
        Else go to Step 8;
Step 4. Find the common junction:
        Path^i(Si, Di) ∩ Path^j(Sj, Dj) = CJ{};
Step 5. If CJ{} = ∅ then j = j − 1, and go to Step 3;
        Else let ri be the common junction:
        Path^{i,j}(Si, Dj)|CJM = Path^i(Si, ri) . Path^j(ri, Dj);
        where Path^{i,j}(Si, Dj)|CJM is the shortest path identified through CJM;
Step 6. TTC = TTC + TC[Start Peer, ri];
        // TC is the traffic cost between two peers
Step 7. i = j;
        Start Peer = ri;
        go to Step 3;
Step 8. End;
        // Path^{i,j}(Si, Dj)|CJM holds the shortest path;
        // TTC holds the total traffic cost of the shortest path;
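Steps 4 and 5 of the algorithm can be sketched as below, a minimal illustration assuming each path is stored as an ordered vector of peer IDs. The names `commonJunction` and `reroute` are hypothetical, and the junction is chosen as the first common peer along the first path; other selection policies are possible:

```cpp
#include <vector>
#include <algorithm>

using Path = std::vector<int>; // ordered underlay peers, source first

// Step 4: find a common junction of two paths, excluding the trivial
// join point (the destination of p1 / source of p2). Returns -1 if none.
int commonJunction(const Path& p1, const Path& p2) {
    for (int peer : p1) {
        if (peer == p1.back()) continue; // skip the conventional join peer
        if (std::find(p2.begin(), p2.end(), peer) != p2.end()) return peer;
    }
    return -1;
}

// Step 5: splice p1 up to the junction with p2 from the junction onward.
Path reroute(const Path& p1, const Path& p2) {
    int cj = commonJunction(p1, p2);
    if (cj < 0) return {}; // fall back to the conventional path
    Path out(p1.begin(), std::find(p1.begin(), p1.end(), cj));
    out.insert(out.end(), std::find(p2.begin(), p2.end(), cj), p2.end());
    return out;
}
```

For the paths of Figure 6.1, `reroute({1,2,3,5,6}, {6,5,3,4})` yields {1, 2, 3, 4}, matching the worked example.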

CJM finds the common junction between paths and diverts the query from the
regular path to the reduced, shorter one. In Figure 6.1, for the two-hop case the
common junction set is obtained as Path(1) ∩ Path(2) = CJ{3}; here peer 3 is the
common junction from which the query may be diverted to save unwanted traffic
cost. The query then travels the path {1, 2, 3, 4}, whose traffic cost is much less than
that of the conventional path. Similarly, for the three-hop path over Path(1), Path(2),
and Path(3), the common junction peer 3 between the 1st and 3rd paths means the
query travels the path {1, 2, 3, 7, 8}. As a result, CJM saves a large amount of traffic
cost on the identified paths. If a query is sent logically from peer 1 to peer 4 through
peer 6 in the overlay network, it is a two-hop path (Figure 6.1), and the physical path
traversed by the message is [1 → 2 → 3 → 5 → 6 → 5 → 3 → 4].

This is the conventional path, described as Path^{1,2} = (1, 2, 3, 5, 6, 5, 3, 4). It is
the combination of two paths, path 1 and path 2, which under the conventional
methodology can be merged only when the destination of the first path and the
source of the second path are the same peer. At this point we can save traffic cost by
routing the query through the path [1 → 2 → 3 → 4], avoiding the cost of the
segment [3 → 5 → 6 → 5 → 3], which is redundant traffic in the network. Here peer
3 is the common junction between the two paths, and this common junction is the
key idea of CJM. It can be identified as Path^1(1,2,3,5,6) ∩ Path^2(6,5,3,4) =
CJ{3, 6}, where CJ{} is the ordered set of peers common to both paths. Here peers 3
and 6 are the common junctions of the two paths Path(1) and Path(2) from which a
query may be diverted. Diverting through peer 6 yields the conventional path, but
choosing peer 3 saves a large amount of traffic cost. A common junction can
similarly be identified between any two available paths, whether or not they are
continuous.

6.4.2 System Analysis

The objective of CJM is to minimize the cost of existing paths when data is sent
across multiple hops.

Let P^i(k, l) be the i-th path, the ordered set of peers along the path from source
peer k to destination peer l. To find a two-hop path from (k, n) we must find
P^{i,j}(k, n) = P^i(k, l) . P^j(m, n) with l = m, assuming there is no direct path
between the peers.

The traffic cost TC^i(k, l) of the i-th path is the cost to transfer unit data from
peer k to peer l over the conventional path. The cost of the two-hop path
P^{i,j}(k, n) is

TC^{i,j}(k, n) = TC^i(k, l) + TC^j(m, n), if l = m; ∞ otherwise

CJM finds the minimum-cost route between two connected paths P^i(k, l) and
P^j(m, n). It searches for a common junction between the paths using:

P^i(k, l) ∩ P^j(m, n) = CJ{}                                        (6.1)

If CJ{} = ∅ then there is no common junction. If CJ{} ≠ ∅ there exists a common
junction r, and CJM finds the cost TCC^{i,j}(k, n) with respect to r using:

TCC^{i,j}(k, n) = TC^i(k, r) + (TC^j(m, n) − TC^j(m, r))            (6.2)
From eqn (6.2) two cases arise:

Case-I: CJ{} = ∅

In Case-I there is no common junction between the two paths, i.e., there is no
continuous path and no common junction peer is present in the two paths. The costs
are computed as follows.

TC^{i,j}(k, n) = TC^i(k, l) + TC^j(m, n) = ∞                        (6.3)

TCC^{i,j}(k, n) = TC^i(k, r) + (TC^j(m, n) − TC^j(m, r)) = ∞        (6.4)

From eqns (6.3) and (6.4), both costs are infinite, i.e., no path is available from
source to destination.
Case-II: CJ{} ≠ ∅

This means there is a common junction between the two paths: either the path is
continuous or the paths share common junction peers. The traffic costs for these two
sub-cases are computed as follows.

Case-II (a): When l = m the path is continuous, i.e., the destination of the 1st path
equals the source of the 2nd path. The common junction is then that shared peer r,
the last peer of the 1st path and the starting peer of the 2nd path, so TC^j(m, r) = 0.
The traffic cost for the normal (conventional) path and for CJM is computed as:

TC^{i,j}(k, n) = TC^i(k, l) + TC^j(m, n)                            (6.5)

TCC^{i,j}(k, n) = TC^i(k, r) + (TC^j(m, n) − 0) = TC^{i,j}(k, n)    (6.6)

From eqns (6.5) and (6.6) it is observed that the traffic cost is the same in both
cases.

Case-II (b): When l ≠ m, we find a common junction r between the two paths. The
traffic cost of the joint path is:

TCC^{i,j}(k, n) = TC^i(k, r) + (TC^j(m, n) − TC^j(m, r))            (6.7)

Since TC^j(m, n) − TC^j(m, r) ≤ TC^j(m, n), equation (6.7) simplifies to

TCC^{i,j}(k, n) ≤ TC^{i,j}(k, n)                                    (6.8)

From eqns (6.7) and (6.8), it can be concluded that the traffic cost for CJM is no
more than that of the conventional method.

6.5 Simulation and Performance Study


It is assumed that there are 1,000 peers in the underlay, of which 10%-20% are in
the overlay network. The peers in the underlay may be connected in any network
topology (regular/irregular/mesh). The cardinality of a peer in the underlay or
overlay is the maximum number of peers to which that peer may connect. The
cardinality ranges from 3-20 for the underlay network and is drawn as a uniform
random number; the cardinality for the overlay network is similarly a uniform
random number in the range 3-12. The path between two peers in the overlay is the
sequence of peers traversed in the underlay between source and destination, and the
path length is the number of hops traversed from source to destination. Dijkstra's
algorithm is used to find the shortest path in the underlay only. Both structured and
unstructured topologies are implemented in the overlay network. The path cost
measures the cost to transfer a message from source to destination and comprises all
costs, including bandwidth, latency, processing cost, etc.


6.5.1 Simulation Model

To study the behavior of CJM, an event-driven simulation model was developed in
C++ (Figure 6.3). A brief overview of the different components of the model follows.

[Figure: peers P1...Pn connected to the Underlay Topology Manager, Overlay
Topology Manager, Path Manager, Time Scheduler, Network Analyzer, and Network
Manager components]
Figure 6.3 Network Simulation Model for P2P Networks

Peers are the active entities participating in the network. Each peer has its predefined

availability factor which is decided at the time of its generation. Availability factor
decides the availability behavior of a peer in the network.

Time Scheduler schedules all time-based events for the system. It analyzes the session
time and other statistics of peers and the network, and decides when a peer joins or
leaves the network depending on its availability factor.

Underlay Topology Manager binds peers and manages their underlay topology. The
number of connections a peer has is decided by its cardinality; the Underlay
Topology Manager randomly decides the cardinality of a peer (from the range set by
the user) at the time of connection. Other related parameters, viz., the latency of the
communication link, etc., are also decided at connection time.

Overlay Topology Manager manages the topology used to connect the selected peers
in the overlay. It connects the selected peers in a structured or unstructured topology
and uses the overlay cardinality to decide the number of connections a peer has in
the overlay. It also implements the logical topologies used to analyze the structure of
the network.

Network Analyzer keeps track of statistics of the various elements, viz., peers,

network, paths, and cost. It collects information from all other components which
helps in making decisions about the network.

Path Manager manages all the paths connecting peers in the underlay and overlay. It
uses the CJM algorithm to find the shortest path in the underlay and the required
paths in the overlay topology, and keeps track of the underlay path connecting any
two overlay peers. The Path Manager provides multiple underlay paths for any
connection between two overlay peers and updates the underlay paths whenever a
peer leaves the underlay.

6.5.2 Performance Metrics

To study the behavior of a CJM-based P2P network, we consider n distributed peers
connected by communication links in the underlay. The following metrics are
considered:

Response Time (RT) is the time taken by a test message to traverse the maximum hop

count path of the network.

Average Response Time (ART) is the average over all possible paths of every hop
count. ART is computed as follows:

ART(j-Hop Path) = ( Σ_{i=1}^{t} RT of Path(i) ) / t                 (6.9)

ART = ( Σ_{j=1}^{s} ART(j-Hop Path) ) / s                           (6.10)

where t is the number of possible paths of a particular hop count, and s is the
number of all possible hop counts of available paths.

Path Length (PL) of a path is the maximum hop count between source and
destination. Average Path Length (APL) is the average of all PLs in the network and
is computed as follows:

APL = ( Σ_{i=1}^{t} PL[i] × No. of Paths of length PL[i] ) / Total No. of Paths   (6.11)

where t is the number of possible hop counts of available paths.

Path Cost (PC) is the cost spent by the test message traveling over the
communication links between source and destination; it comprises all costs,
including bandwidth, latency, etc. Average Path Cost (APC) is the average cost spent
over all possible paths of every hop count, computed as follows:

APC(j-Hop Path) = ( Σ_{i=1}^{t} PC(i) ) / t                         (6.12)

APC = ( Σ_{j=1}^{s} APC(j-Hop Path) ) / s                           (6.13)

where t is the number of possible paths of a particular hop count, and s is the
number of all possible hop counts of available paths.

Path Cost Saved (PCS):

%age PCS = ( (APC − APC_CJM) / APC ) × 100                          (6.14)

where APC_CJM is the Average Path Cost through CJM.

Response Time Reduction (RTR):

%age RTR = ( (ART − ART_CJM) / ART ) × 100                          (6.15)

where ART_CJM is the Average Response Time through CJM.
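Equations (6.14) and (6.15) share the same form, sketched below with an illustrative function name:

```cpp
// Percentage reduction achieved by CJM relative to the conventional
// average, as in eqns (6.14) (path cost) and (6.15) (response time).
double percentReduction(double avgConventional, double avgCJM) {
    return (avgConventional - avgCJM) / avgConventional * 100.0;
}
```

For example, a conventional average path cost of 400 units reduced to 100 units by CJM corresponds to a 75% PCS.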

6.5.3 Simulation Results


The results obtained from simulation show that even after optimizing the overlay
network, there is scope to reduce traffic in the underlay. A number of simulation
experiments were performed to observe the behavior of CJM and the performance of
P2P systems in its presence. The network consists of 1,000 peers in the underlay,
with 10-20% of them participating in the overlay topology.

From Figure 6.4 it is observed that the number of partitions in the network
decreases with increasing underlay cardinality, becoming approximately constant
beyond a cardinality value of 11.

[Figure: average number of partitions (y-axis, 0-20) vs. underlay cardinality (x-axis, 3-19)]
Figure 6.4 Average Number of Partitions vs. Underlay Cardinality

Figure 6.5 shows that the maximum path length used in the underlay to transfer
data/control messages depends on the underlay cardinality (the number of
connections a peer has in the network). The cardinality in both overlay and underlay
is assumed to be at least 3. The path length is largest at cardinality 3, decreases as
cardinality increases, and stabilizes beyond a value of 13.

[Figure: average path length for maximum reachability (y-axis, 0-20) vs. underlay cardinality (x-axis, 3-19)]
Figure 6.5 Average Path Lengths for Maximum Reachability vs. Underlay Cardinality

From Figure 6.6 it is observed that the average path cost in a P2P system also
depends on the cardinality of the peers participating in the overlay topology. A high
path cost is observed at low overlay cardinality values; it falls as cardinality increases
and stabilizes beyond a cardinality of 6, because the number of connections then
suffices to reach any peer at approximately the same path cost.

Figure 6.7 shows that the path cost is initially approximately constant and then
decreases with increasing underlay cardinality. In all three cases CJM yields the
minimum path cost compared with the conventional path and the path suggested by
the THANCS algorithm, stabilizing once the underlay cardinality reaches 13. The
removal of redundant segments from conventional paths is what reduces the path
cost.

[Figure: average path cost (y-axis, 0-250) vs. overlay cardinality (x-axis, 3-12) for Normal Path, THANCS, and CJM]
Figure 6.6 Average Path Cost vs. Overlay Cardinality

[Figure: average path cost (y-axis, 0-300) vs. underlay cardinality (x-axis, 3-19) for Normal Path, THANCS, and CJM]
Figure 6.7 Average Path Cost vs. Underlay Cardinality

It is observed from Figure 6.8 that the average response time of an overlay path
increases steadily with overlay hop count for the conventional path. The response
times for THANCS and CJM are much lower, with CJM providing the minimum of
the three because it uses the common junction between two paths to route messages.
The reduction in average response time grows with the hop count of the overlay
path, because longer paths are more likely to contain a common junction.

From the results shown in Figure 6.9 it is observed that the average percentage
reduction in path cost for CJM is initially lower than for THANCS, but beyond a
path length of 3 hops it rises sharply above THANCS. The maximum average
reduction in path cost observed is 61% for CJM and 46% for THANCS. The reason
for this reduction is that the actual path traveled by messages (through CJM) shrinks
in the network.
[Figure: average response time in ms (y-axis, 0-450) vs. overlay hop count (x-axis, 1-11) for Normal Path, THANCS, and CJM]
Figure 6.8 Average Response Time vs. Overlay Hop Count

[Figure: %age reduction in path cost (y-axis, 0-70) vs. overlay path hop count for THANCS and CJM]
Figure 6.9 Average %age of reduction in Path Cost vs. Overlay Path (Hop Count)

[Figure: %age reduction in response time (y-axis, 0-50) vs. overlay path hop count (x-axis, 1-11) for THANCS and CJM]
Figure 6.10 Average % age Reduction in Response Time vs. Overlay Hop Count

From Figure 6.10 it is observed that for small overlay hop counts the RTR
percentage for CJM is lower than for THANCS, but beyond a hop count of 3 it
exceeds THANCS and remains higher at larger hop counts. CJM provides an RTR of
approximately up to 45%, giving comparatively fast data transfer in P2P networks.

6.6 Advantages in using CJM


CJM saves traffic cost without modifying the network topology or the search space.
Its further advantages are: first, it saves the traffic cost of the P2P network; second, it
finds an underlay path for two overlay paths even when they are discontinuous in the
overlay; third, it finds the shorter route for a path of any hop count, i.e., CJM is
suitable for any path length. The saved traffic cost grows with the hop count of the
path, as longer paths have a higher probability of containing a common junction.

6.7 Discussion
Simulation results show that the average saving in path cost increases with the hop
count of the path, with a maximum saving of up to 61% at hop count 11 in the best
case. The average saving in path cost initially rises sharply and then levels off,
indicating diminishing additional savings; CJM gives good results up to a hop count
of about 8. CJM also significantly reduces the response time of the network:
approximately 45% of the response time is saved on average. A significant reduction
in response time is observed up to 9 hops, after which the reduction is minor. CJM
shortens the physical path without altering the overlay connections or the search
space of the system, and is applicable to any overlay topology. CJM provides better
results than THANCS for the majority of the performance metrics considered.

6.8 Summary
In this chapter, we have proposed the Common Junction Methodology (CJM). The
technique yields substantial savings in path cost and reductions in the response time
of the network, and it mitigates the topology mismatch problem to a large extent.
CJM works with any overlay topology, centralized or decentralized, and can be
deployed without changing that topology. Other salient features of CJM are its fast
convergence and preserved search scope in the network.

In the next chapter an efficient replica placement algorithm, LARPA, is discussed.


Chapter 7

Fault Adaptive Replica Placement over


P2P Networks
Data replication is widely used to improve the performance of distributed databases.
A logical replica structure reduces the time to find replicas for quorums in P2P
networks and also helps improve the turnaround time of transactions.
In this chapter a Logical Adaptive Replica Placement Algorithm (LARPA) is
proposed. It is adaptive in nature and tolerates up to n − 1 faults. It efficiently stores
replicas on sites (peers) at one hop distance to improve data availability in an
RTDDBS over a P2P system.
The rest of the chapter is organized as follows. Section 7.1 presents the introduction.
Section 7.2 gives the system model. Section 7.3 introduces LARPA. Section 7.4
highlights the implementation. Section 7.5 presents the simulation and performance
study. Section 7.6 discusses the findings, and finally the chapter is summarized in
Section 7.7.

7.1 Introduction
Peer-to-Peer (P2P) networks are low-maintenance, massively distributed computing
systems in which peers (nodes) communicate directly with one another to distribute
tasks, exchange information, or share resources. P2P networks are also known for the
huge amount of network traffic they carry due to the topology mismatch problem: a
large portion of heavy P2P traffic is caused by the mismatch between the overlay and
underlay topologies. A number of P2P systems are currently in operation, e.g.,
Gnutella [67], which constructs an unstructured overlay without rigid constraints on
the search and placement of data items; however, such systems offer no guarantee of
finding an existing data object within a bounded number of hops.
P2P systems are rich in freely available computing power and storage space, and a
Real Time Distributed Database System (RTDDBS) is one application suited to such
resources. However, various issues must be handled before implementation, notably
the time constraints on transaction execution. Depending on the type of application,
real-time transactions fall into three categories: hard, soft, and firm deadline
transactions. In the firm-deadline case, any transaction that misses its deadline is
considered worthless and is immediately discarded from the system.
In replication, data items are copied across a number of peers participating in the
system. Replication is a lifeline in environments where nodes are prone to leave the
system and data availability is the primary challenge. Data replication provides fault
tolerance; it improves the performance and reliability of distributed systems, reduces
response time, and increases data availability relative to conventional distributed
systems.

Logical replica structures further improve system performance by reducing
quorum formation time: quorums are chosen from the structure so as to preserve
data consistency and data availability. A special replica overlay structure is used to
place replicas, and data availability is likewise a primary objective of P2P networks.
Ordinarily, the number of replicas is increased blindly to improve data availability,
but a large number of replicas generates heavy redundant traffic during the system
maintenance and update phases. Maintaining data consistency is also a challenge in a
quorum system: the more replicas in the system, the harder consistency is to
maintain and the longer it takes to update every replica. The network overhead of
the system increases exponentially with the number of replicas. This problem has a
major impact in P2P networks, where network overhead is already very large due to
the topology mismatch problem, because messages pass through a number of
underlay peers that are transparent to the overlay topology.

To implement an RTDDBS over P2P networks, data distribution must be
efficient enough to meet transaction deadlines. To improve data availability and
provide fast data access, replicas are normally placed in an efficient overlay
structure, and the replica overlay topology must be adapted to reduce network traffic
and meet the other challenges of P2P networks. We have considered several of the
above challenges and developed LARPA for P2P networks. It is adaptive in nature
and tolerates up to n − 1 faults. LARPA efficiently stores replicas on sites at one hop
distance to improve data availability in an RTDDBS over a P2P system. A
comparative study is also made with some existing systems.

7.2 System Model


The connectivity structure of a P2P network is represented by an undirected graph
whose vertices are peers and whose edges represent connections among the peers.
The overlay is modeled as an undirected graph G = (P, E), where P is the set of
active peers participating in the network and E is the set of edges (links) between the
peers. Further, P = {p1, p2, p3, ..., pnp} and E = {e1, e2, e3, ..., ene}, where np and
ne are the number of participating peers and the number of edges connecting them,
respectively. Two peers p1 and p2 in a graph are said to be connected if there exists
a series of consecutive edges {e1, e2, e3, ..., ep} such that e1 is incident upon vertex
p1 and ep is incident upon vertex p2. An edge (p1, p2) in E means that p1 knows a
direct way to send a message to p2. Henceforth, we use the terms graph and network
interchangeably; similarly, peer and vertex are used equivalently, as are edge and
connection. The series of edges leading from p1 to p2 is called a path from p1 to
p2, written p1 ⇝ p2 if they are more than one hop apart and p1 → p2 in the case of
one hop distance. The length of a path, Hopl(1,2), is the number of edges in the path
from p1 to p2. The distance is the measure of total cost, including all cost types, to
send unit data from source p1 to destination p2, defined as the shortest distance
Dist(p1, p2) calculated in the underlay topology.

Replicas of the database are stored on peers selected by some criterion from
among the peers in P. This set of replicas is PR = {pr1, pr2, pr3, ..., pri}, where
pri ∈ P and PR ⊆ P. The replicas form a replica overlay topology, which can be
defined by the graph G1, where G1 ⊆ G; PR is the set of vertices of G1, and the
edges of G1 are ER ⊆ {E ∪ New Established Overlay Links}. The one-hop
neighborhood of a peer p1 is N^1(p1) = {p2 | (p1, p2) ∈ E, p2 ∈ P}, and the
two-hop neighborhood of p1 is
N^2(p1) = {p3 | (p3, p2) ∈ E, p3 ∈ P, p2 ∈ N^1(p1)} − N^1(p1) − {p1}.
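The neighbor sets N^1 and N^2 can be sketched as follows, assuming the edge set E is stored as a list of undirected peer pairs (function names illustrative):

```cpp
#include <set>
#include <utility>
#include <vector>

using Edge = std::pair<int, int>; // undirected link between two peers

// One-hop neighbors N^1 of peer p: peers sharing an edge with p.
std::set<int> oneHop(const std::vector<Edge>& E, int p) {
    std::set<int> n;
    for (const auto& e : E) {
        if (e.first == p) n.insert(e.second);
        if (e.second == p) n.insert(e.first);
    }
    return n;
}

// Two-hop neighbors N^2 of peer p: neighbors of N^1,
// excluding N^1 itself and p, per the definition above.
std::set<int> twoHop(const std::vector<Edge>& E, int p) {
    std::set<int> n1 = oneHop(E, p), n2;
    for (int q : n1)
        for (int r : oneHop(E, q))
            if (r != p && !n1.count(r)) n2.insert(r);
    return n2;
}
```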

130

Data items D are defined as a set of tuples <Vri, Dci>, where Vri is the version
number of the data item (the highest version number implies the latest value) and
Dci is the data content stored at a replica. For every committed write query, the
version number at the particular replica is incremented by one.

7.3 Logical Adaptive Replica Placement Algorithm (LARPA)


A logical replica topology improves the performance of a replicated system as compared
to random placement of replicas in P2P networks. To achieve data availability, the
number of replicas is often increased blindly, without considering the effects of network
traffic and data consistency problems in the system; these are major factors affecting the
performance of any system. A logical structure reduces the response time, the distance
between replicas and the replica searching time, and a small set of replicas reduces
the system overhead. LARPA considers network traffic and the data consistency problem
along with the read/write quorum generation time, update time, churn rate of
peers and overall performance of the system.
LARPA places the replicas close to the point from where a search starts. It selects
a suitable peer with the highest candidature value for storing the replica which acts as
the centre of the structure {Figures 7.1 and 7.2}. All other replicas are stored at peers
having the maximum candidature. One hop connections are established between each replica
and the centre peer. The connectivity can be further improved to minimize the effect of
centre failure: new connections are established between the replicas that are at more
than one hop distance and the peers at one hop distance from the centre (its direct
neighbors). These extra connections improve the search performance of the system.

7.3.1 LARPA Topology


In the LARPA overlay, peers are selected for placing replicas on the basis of their
resource availability and the session time during which they participate in the
network. Second, a threshold value of candidature (decided by the DBA, and possibly
different for every situation) is used to select the peers {Section 4.5.6}. This
threshold value is the minimum value at which acceptable results can be expected.
These conditions ensure capable peers, which improves the durability and
efficiency of the system.


For any read/write quorum, a group of replicas is selected from the logical structure
to execute the read/write operations in the system. The time spent in generating the
quorums also affects the system performance. LARPA selects a limited number of
peers for placing replicas and forms the logical structure on the basis of the resource
availability of the peers. The best peer among the identified peers is selected as the
centre peer; this is the point from where a query enters the system for execution
and where quorums are selected for it. All remaining selected peers establish direct
overlay connections with the centre peer. In LARPA, these one hop overlay connections
improve the search time to generate quorums: the time to select a quorum is reduced by
improving the search time for replicas in the replica overlay. These peers may also
establish extra connections with the peers at one hop distance from the centre.

7.3.2 Identification of Number of Replicas in the System


To achieve the desired data availability in the network, data is replicated in the
system. There is a tradeoff between the number of replicas and the system overhead, so
the number of replicas must be selected intelligently. The number of replicas should be
minimal, to reduce the data consistency problem, network overhead, network traffic and
other factors; yet data availability must remain in the acceptable range so that updated
data can be accessed from the system. From the property of parallel systems, the
probability that at least one peer is active is given by:

P = 1 − ∏(i=1 to n) (1 − Pi)

The target data availability to be achieved by the database system is assumed to be 95%.
To achieve this, data is replicated over the P2P network. From Table 7.1, it is observed
that only six replicas with peer availability of 0.4 produce data availability of up to 95%.

Table 7.1 Effect of Peer Availability over Data Availability in the System

Peer                       Number of Replicas in the System
Availability     2        3        4         5         6         7         8         9
0.3            0.51     0.657    0.7599    0.83193   0.882351  0.917646  0.942352  0.959646
0.4            0.64     0.784    0.8704    0.92224   0.953344  0.972006  0.983204  0.989922
0.5            0.75     0.875    0.9375    0.96875   0.984375  0.992188  0.996094  0.998047
0.6            0.84     0.936    0.9744    0.98976   0.995904  0.998362  0.999345  0.999738
0.7            0.91     0.973    0.9919    0.99757   0.999271  0.999781  0.999934  0.99998
0.8            0.96     0.992    0.9984    0.99968   0.999936  0.999987  0.999997  0.999999
0.9            0.99     0.999    0.9999    0.99999   0.999999  1         1         1
1.0            1        1        1         1         1         1         1         1

Data availability increases further as peer availability increases. From these facts it
is concluded that an unnecessarily large number of replicas should be avoided, and the
number of replicas is limited to 7. Hence, if a peer with more than 40% availability is
selected for storing a replica, the data availability will be in the acceptable range.
To guarantee data availability, only peers with more than 0.5 availability are
considered for storing the replicas.
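The parallel-system formula above is easy to check numerically; the following Python sketch (function names are illustrative, not part of the thesis) reproduces the Table 7.1 entries and the choice of six replicas at 0.4 peer availability:

```python
def data_availability(peer_availability: float, n_replicas: int) -> float:
    """P = 1 - (1 - Pi)^n for n identical replicas (parallel-system property)."""
    return 1 - (1 - peer_availability) ** n_replicas

def replicas_needed(p: float, target: float = 0.95) -> int:
    """Smallest replica count whose combined availability meets the target."""
    n = 1
    while data_availability(p, n) < target:
        n += 1
    return n

# Reproduce Table 7.1 entries: six replicas at 0.4 peer availability
# already exceed the 95% target.
assert round(data_availability(0.4, 6), 6) == 0.953344
assert round(data_availability(0.3, 7), 6) == 0.917646
assert replicas_needed(0.4) == 6
assert replicas_needed(0.5) == 5
```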

7.3.3 LARPA Peer Selection Criterion


Two parameters are considered while selecting a peer for storing replicas. The first is
the candidature of the peer {Section 4.5.6}, which is a measure of how capable a peer is
of storing a replica; its value is computed using eqn (4.1) {Section 4.5.6}. The second
is the distance of each participating peer in the system, a measure of the cost spent to
send and receive messages between a peer and the centre peer; for an efficient system
this distance should be minimized. A priority queue is used to store the best peers,
i.e., those having the largest candidature among all peers. The length of the priority
queue is double the number of peers required in the system (the length may vary
depending upon system requirements). In the present case, the number of required
replicas is seven and the length of the priority queue is fourteen. The remaining seven
peers are used as replacements or to add new replicas, if required by the system. All
these peers are the best suited among all participating peers in the overlay to store replicas.

7.3.4 Algorithm 1: Selection of Best Suited Peers


The best i peers are identified among the np peers of set P. A peer with a candidature
value greater than the threshold value qualifies for set PR, but only a predefined
number i of the best qualified peers are placed in PR. The number i may vary depending
on the requirements of the system. The qualified peers are arranged in descending order
of candidature. On the basis of peer availability, LARPA keeps double the number of
peers required in the system. The number of replicas required for the present system is
7, so the first 7 peers among 14 are selected from PR. The steps of the algorithm are as follows.


Algorithm 1: LARPA1 (selecting best suited peers)

1. ∀ pi ∈ P, calculate the candidature Cdi.
2. Put the first ns peers having the maximum candidature, greater than the
   threshold value, in a priority queue sorted in descending order. This is
   the set PR.
3. Take the first element from the priority queue, having the highest
   candidature among all participating peers. Consider this peer the centre
   peer pc.
4. Take the next ms peers from set PR, where ms < ns. Put all ms peers in a
   queue, the Participating Peer Queue (PPQ).
5. ∀ pk ∈ PPQ, pk ≠ pc: find Dist(pk, pc), the shortest path from the
   selected peer pk to the centre peer pc, and establish the direct
   connection pk → pc.
6. Establish overlay connections among all selected peers (except pc), such
   that each peer connects with at least two other peers from the queue
   PPQ: ∀ pk ∈ PPQ, pk → pk+1; pLast → p1.
7. End
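The steps of Algorithm 1 can be sketched in Python as follows; this is a minimal illustration, with data structures (dicts for candidature values and neighbour sets) chosen for brevity rather than taken from the thesis:

```python
def larpa1(peers, candidature, threshold, ns, ms, neighbours):
    """Sketch of LARPA1: pick the ns best peers above the candidature
    threshold, make the best one the centre, and wire the next ms peers
    to the centre and to each other in a ring.
    `candidature` maps peer -> Cd value; `neighbours` maps peer -> set of
    overlay neighbours and is mutated in place as connections are made.
    """
    # Steps 1-2: qualified peers, sorted by descending candidature (set PR).
    qualified = sorted((p for p in peers if candidature[p] > threshold),
                       key=lambda p: candidature[p], reverse=True)[:ns]
    # Step 3: centre peer = highest candidature.
    centre = qualified[0]
    # Step 4: next ms peers form the Participating Peer Queue (PPQ).
    ppq = qualified[1:1 + ms]
    # Step 5: direct connection from every PPQ peer to the centre.
    for pk in ppq:
        neighbours.setdefault(pk, set()).add(centre)
        neighbours.setdefault(centre, set()).add(pk)
    # Step 6: ring among PPQ peers so each touches at least two others.
    for i, pk in enumerate(ppq):
        nxt = ppq[(i + 1) % len(ppq)]
        neighbours[pk].add(nxt)
        neighbours[nxt].add(pk)
    return centre, ppq
```

The ring in step 6 realizes the "pk → pk+1; pLast → p1" condition: with three or more PPQ peers, every replica keeps a path to its neighbours even if the centre fails.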

7.3.5 Algorithm 2: Selection of Suitable Peers with Minimum Distance


A set PR of the i best qualified peers is selected from the set P of participating
peers, as in the previous algorithm. From PR, peers are selected to store replicas such
that the distance between each peer and the centre peer is minimum; i.e., peers are
selected from PR on the basis of both distance and candidature. The steps of the
algorithm are as follows.

Algorithm 2: LARPA2 (selecting suitable peers with minimum distance)

1. ∀ pi ∈ P, calculate the candidature Cdi.
2. Put the first ns peers having the maximum candidature, greater than the
   threshold value, in a priority queue sorted in descending order. This is
   the set PR.
3. Take the first element from the priority queue, having the highest
   candidature among all participating peers. Consider this peer the centre
   peer pc.
4. Take the ms peers from set PR, where ms < ns, having the minimum distance
   from the centre among all participating peers, i.e.,
   Min over k=1..m of (pk ∈ PPQ | Dist(pk, pc)), where Dist(pk, pc) is the
   minimum distance from source to destination. Put these ms peers in a
   queue, the Participating Peer Queue (PPQ).
5. ∀ pk ∈ PPQ, pk ≠ pc: find Dist(pk, pc), the shortest path from the
   selected peer pk to the centre peer pc, and establish the direct
   connection pk → pc.
6. Establish overlay connections among all selected peers, such that each
   peer connects with at least two other peers from the queue
   PPQ: ∀ pk ∈ PPQ, pk → pk+1; pLast → p1.
7. End
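Algorithm 2 differs from Algorithm 1 only in how the PPQ is filled (step 4). A minimal Python sketch of that selection, with an assumed `dist(a, b)` shortest-distance function (not defined in the thesis), is:

```python
def larpa2(peers, candidature, dist, threshold, ns, ms):
    """Sketch of LARPA2: as LARPA1, but the ms PPQ peers are the qualified
    peers closest to the centre. `dist(a, b)` returns the shortest underlay
    distance between peers a and b (assumed given).
    Returns the centre peer and the distance-ordered PPQ.
    """
    # Steps 1-2: qualified peers, descending candidature (set PR).
    qualified = sorted((p for p in peers if candidature[p] > threshold),
                       key=lambda p: candidature[p], reverse=True)[:ns]
    # Step 3: centre peer = highest candidature.
    centre = qualified[0]
    # Step 4: among the remaining qualified peers, keep the ms peers
    # with minimum distance to the centre.
    ppq = sorted(qualified[1:], key=lambda p: dist(p, centre))[:ms]
    return centre, ppq
```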
Figure 7.1 shows eighteen peers, p1 to p18, participating in the network. Seven peers
are selected from the eighteen on the basis of candidature, which reflects the
availability of resources and the capability of the peers. The peer with the highest
candidature value among these seven is selected as the centre of the replica overlay;
LARPA selects p5 as the centre peer. The remaining selected peers establish one hop
overlay connections with the centre of the replica. These one hop overlay connections
may or may not already be present; new one hop connections are established where a
connection does not already exist between a selected peer and the centre. LARPA also
connects the selected peers with each other, along with their connection to the centre.
New connections are represented with bold curved connectors, e.g., p12 to p5 and p17 to p5.


Figure 7.1 Peers Selection and Logical Connection for LARPA Structure

In Figure 7.1, the circles filled with gray represent the peers selected for storing
replicas among all the peers; peer p5, filled with dark gray, is selected as the
centre. Connections drawn with bold dark connectors represent existing overlay
connections among replicas, while bold gray curved connectors represent the new
connections established for LARPA. All other dashed connectors represent other overlay
connections and can be utilized in case of any path failure in LARPA. The LARPA
logical structure presented in Figure 7.2 is obtained from the existing and newly
generated connections of the network shown in Figure 7.1.


Figure 7.2 LARPA obtains Logical Structure from the Network shown in Figure 7.1

7.4 Implementation
The arrivals of transactions at a site (peer) are independent of the arrivals at other
sites. The model assumes that each global transaction is assigned a unique identifier.
Each global transaction is decomposed into subtransactions to be executed by remote
sites; subtransactions inherit the identifier of the global transaction. No site or
communication failure is considered. The execution of a transaction requires the use of
the CPU and data items located at remote sites. A trusted communication network
connects the sites. There is no global shared memory in the system, and all sites
communicate via message exchange over the trusted communication channels. The cohorts
of the transaction at the relevant sites are activated to perform the operations. A
distributed real time transaction is said to commit if the master has reached the
commit decision before the expiry of its deadline at the site. Each cohort makes a
series of read and update accesses. A transaction already in the dependency set of
another transaction, or already having another transaction(s) in its own dependency
set, cannot permit another incoming transaction to read or update. A read access
involves a concurrency control request to obtain access, followed by a disk I/O to
read the data item and CPU usage to process it. Peers that vote yes lock their replica
and send back their (global) version number to the requesting peer. Reading or writing
without a quorum jeopardizes consistency. Firm deadlines are used for the transactions,
and the transactions supplied to the system are free from concurrency control conflicts.

LARPA inherits the read/write attributes of the ROWAA protocol [133]. It uses all
active replicas available for the write quorum. LARPA requires quorum formation to
start from the centre of the structure, and the address of this centre is provided to
the authorized owners/users. The replicas are searched through broadcasting from the
centre peer; replicas with active status respond to the received query. Control
messages are exchanged among the replicas in the system to share the active status of
the replicas.

Searching for Read/Write Quorums: Authorized owners/users of the database send queries
to the system, and the system searches for a group of replicas to respond to the
received query. The replicas are searched through broadcasting, starting from the
centre replica; the search takes less time because all other replicas are at one hop
distance. The read quorum is constituted of any one replica, whichever is found first
in the structure; LARPA performs well in small replication systems. The write quorum
is constituted of all available replicas in the structure. A fixed wait time is
allowed for replicas to participate; replicas that do not respond within it are
assumed to be unavailable.
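A minimal sketch of this ROWAA-style quorum formation, with an `is_active` predicate standing in for a replica answering within the fixed wait time (names are illustrative, not from the thesis), might look like:

```python
def form_quorums(centre, replicas, is_active):
    """Sketch of LARPA quorum formation (ROWAA-inherited): search starts at
    the centre and broadcasts to its one-hop replicas. The read quorum is
    the first active replica found; the write quorum is every active
    replica. Replicas that do not answer (is_active False) are treated as
    unavailable.
    """
    # Search order: the centre first, then its one-hop replicas.
    order = [centre] + [r for r in replicas if r != centre]
    active = [r for r in order if is_active(r)]
    read_quorum = active[:1]      # any one replica, whichever is found first
    write_quorum = list(active)   # all available replicas
    return read_quorum, write_quorum
```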


7.4.1 Replica Leaving from the System


For LARPA, a fail stop based system is assumed: if any of the peers fails, it stops
sending all types of messages.
Figure 7.3 LARPA Structure Representing the Replica p14 departing the Network

Figure 7.4 LARPA Structure Representing the Replica p5 from the Centre departing the
Network

Replica Leave: A replica in LARPA can leave with or without informing the system; it
simply stops working and stops forwarding data/control messages. A ping message is
regularly sent to the centre to report the active status of a replica, providing
information on whether the replica is active, working properly and maintaining an
updated copy of the data. A replica leaving the system does not affect the functioning
of the system as long as a single replica remains active {Figure 7.3}.

Centre Leave: In case the centre fails, as shown in Figure 7.4, the next replica
automatically takes charge as the centre and manages the system from its present
location in the structure.

7.4.2 Replica Joining the System


A rejoining replica tries to connect using the addresses of its old neighbors, which
were stored with the replica at the time it left the system. After connecting with its
old neighbor replicas, it updates its data contents from these neighbors; the active
central replica may also be utilized for updating the data items. The replica announces
its active status, through control message passing, after successfully updating its
data items.

Centre Joins: When the centre replica wants to rejoin the system, it first tries to
connect with its old connections (stored in its memory). After connecting with a
replica in the system, the centre updates its data contents by matching version numbers
and contents. The centre replica receives data update acknowledgements from all
available replicas participating in the system and announces its active status, through
a control message, after successfully updating its data items.

Network Traffic: In case of LARPA, network traffic is much lower compared to the
Random, Hierarchical Quorum Consensus and Extended Hierarchical Quorum Consensus
structures [145, 147]. This reduced traffic is due to its logical structure and
placement of replicas. Network traffic due to message passing is limited to the replica
overlay, which further reduces traffic in the underlay topology and on the Internet.

Fault Tolerance: This is a high priority requirement, especially in P2P systems. LARPA
works with even a single last active replica available in the system. It tolerates
ns − 1 faults (where ns is the number of replicas in the system), as a single replica
provides the complete information to the system.

7.5 Simulation and Performance Study


To evaluate LARPA, an event driven simulation model for a firm deadline real time
distributed database system has been used {Figure 4.5}. The model presented in Figure
4.5 is an extended version of the model defined in [164]. It consists of an RTDDBS
distributed over n peers connected by a secure network.

7.5.1 Performance Metrics


The performance of LARPA is evaluated and compared with other existing systems through
simulation. The performance metrics defined in Table 4.2 and Table 7.2 are used to
evaluate the system, and the simulation uses the performance parameters defined in
Table 4.3.


Table 7.2 Performance Metrics-III

Performance metrics used to measure the performance of the system:

Quorum Search Time: the time between the request for a quorum being submitted and the
replicas required for the quorum being found in the system.

Network Overhead: a measure of the number of messages transferred in the network to
propagate an update message through the underlay topology. For an efficient logical
structure, the network load should be low; a higher network load creates congestion in
P2P networks.

7.5.2 Simulation Results


To evaluate the LARPA logical structure, a series of simulation experiments was
performed. Figure 7.5 presents the behavior of the peers: it is observed that the
session time of a peer increases with an increase in its availability. A sufficient
time slot is required for the peers to respond to arriving queries and to increase the
performance of the system. Peers with more than 0.5 availability are suitable for
storing replicas, because such peers have sufficient session time to execute
transactions; below this availability, the replica leave/rejoin overhead is higher and
performance is poor. A system constituted of highly available peers suffers minimum
interference, so the effect of churn rate on the system is minimized.

Figure 7.5 Relationship between session time and its availability of a peer in P2P networks


Figure 7.6 compares the response times of the Random [153], HQC [145] and HQC+ [147]
replication systems with LARPA. LARPA minimizes the response time; this shorter
response time helps fast execution of transactions in the system and reduces its
workload. The lowest response time is due to the placement of only the minimum required
replicas in the LARPA based system, which can therefore bear more workload than the
other considered systems.


Figure 7.6 Variations in response time with quorum size

It is observed from Figure 7.7 that the LARPA structure has the best restart ratio
among all the considered structures. Each logical structure reaches its highest restart
ratio at a particular value of the Mean Transaction Arrival Rate (MTAR): for Random at
MTAR = 0.8, for HQC at MTAR = 1.2, for HQC+ at MTAR = 1.2 and for LARPA at MTAR = 1.4.
This indicates that a system starts exhausting near these values of MTAR. Beyond the
peak, the restart ratio decreases with increasing MTAR, because too little time is left
for a transaction to restart; the time is consumed in communication delays and other
factors affecting performance. The restart ratio for LARPA is the minimum, and LARPA
can bear a transaction load of up to approximately 1.4.
Figure 7.8 presents the relationship between the success ratio and the MTAR in the
system. The success ratio is the number of transactions completed successfully within
their deadline over the number of transactions submitted for execution. The success
ratio is the same for all considered systems at MTAR = 0.2 and decreases for all of
them with increasing MTAR, but beyond MTAR = 0.2 it decreases very sharply for the
random system and much more slowly for the LARPA based system. LARPA executes more
transactions successfully as compared with the other systems.


Figure 7.7 Variations in restart ratio with system workload


Figure 7.8 Relationship of transaction success ratio with system workload

From Figure 7.9 it is observed that the throughput of LARPA is higher compared to the
other selected structures, due to the minimum quorum search time and shorter response
time. The throughput increases in its initial phase and starts decreasing after its
peak; the peak is the maximum rate of transactions that a structure can bear. Beyond
the peak, transaction waiting time, lock time, etc., start increasing and throughput
decreases. The value of MTAR at the peak for LARPA is approximately 1.4, the highest
among the selected logical structures.


Figure 7.9 Variation in throughput with system workload

It is observed from Figure 7.10 that the lower the search time for quorum formation,
the faster the response; this also reduces the time to execute transactions. LARPA
performs better than the other logical structures and takes the minimum search time to
form a quorum.

Figure 7.10 Relationship between average search time with quorum size

Figure 7.11 shows the variation in average message transfer: the random replica
topology generates the highest number of message transfers, whereas LARPA generates
the minimum. This reduced average message transfer is due to the least number of
replicas being activated in the system; it is also observed that LARPA generates the
minimum network load.

It is observed from Figure 7.12 that LARPA has a high probability of accessing updated
data. This is because the time to update the system is minimal, due to the one hop
distance and the reduced response time of the replicas.

Figure 7.11 Variation in network traffic with quorum size


Figure 7.12 Probability to Access Updated Data vs. Peer Availability

Figure 7.13 presents the comparison of response time between LARPA1 and LARPA2. It is
observed that LARPA2 provides a better response time than LARPA1, because LARPA2 takes
less time to travel from the centre to a peer in the LARPA structure, as its peers are
selected on the basis of minimum distance from the centre.
Figure 7.14 compares the network overhead of LARPA1 and LARPA2; it is observed that
LARPA2 generates less network overhead than LARPA1.
Figure 7.13 Response Time Comparison between LARPA1 and LARPA2

Figure 7.14 Messages Overhead Comparison between LARPA1 and LARPA2

7.6 Discussion
LARPA permits a replica to keep its old addresses and to rejoin at the same place from
where it left, which reduces the rejoin overhead of the replica. Peer resume time is
minimal, amounting to little more than checking its old neighbors; thus, the system
reconciliation time is very low compared to the other considered systems. LARPA
exploits the advantages of the overlay topology in establishing connections: overlay
connections can be established or removed without affecting the peers in the underlay
topology. The network overhead is minimized by limiting the number of replicas in the
system. All replicas are placed at one hop distance from the centre peer, from where
every search starts. Data availability of the system is maintained by placing the
replicas on the peers with the maximum candidature values. Fault detection is fast due
to the one hop distance of all the replicas from the centre. LARPA is adaptive in
nature and tolerates up to ns − 1 faults; it allows the system to work as long as the
last replica remains active.

In LARPA based systems, a single replica may be accessed in the best case. LARPA
provides a high probability of accessing updated data from the system, along with the
minimum quorum acquisition time and response time, the minimum transaction restart
ratio, better throughput and a better transaction success ratio. On the basis of the
comparative analysis, it is found that LARPA2 provides better response time and
generates less network overhead. All these features recommend the LARPA2 structure for
dynamic environment applications where high throughput is required.

7.7 Summary
In this chapter we have presented a Logical Adaptive Replica Placement Algorithm
(LARPA). LARPA matches the requirements of an RTDDBS over P2P where fast response is
expected from the system. It uses its own peer selection criterion to maintain the data
availability of the system in the acceptable range, and it efficiently stores replicas
on one hop distance peers to improve data availability for an RTDDBS over a P2P system.

To avoid long waiting times, LARPA inherits the read/write quorum attributes of the
ROWAA protocol. LARPA is adaptive in nature and tolerates up to ns − 1 faults. It shows
the minimum response time, search time to generate read/write quorums, transaction
restart ratio and transaction miss ratio, and it generates the lowest message traffic
for updates in P2P networks. A LARPA based system bears the maximum workload. It is
further observed that algorithm LARPA2 performs slightly better than algorithm LARPA1
due to its shorter distances between the centre and the replicas. LARPA is suitable for
implementing a reliable RTDDBS over P2P networks.


The next chapter presents a hierarchy based quorum consensus scheme for improving data
availability in P2P systems.


Chapter 8

Height Balanced Fault Adaptive Reshuffle Logical Structure for P2P Networks
Distributed databases are known for their improved performance over conventional
databases. Data replication is a technique for enhancing the performance of
distributed databases, in which data is replicated over geographically separated
systems. In highly dynamic environments, the probability of accessing stale data is
comparatively high.

In this chapter, a Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme for
improving hierarchical quorums over P2P systems is developed. It inherits the read one
write all attribute of the ROWA protocol [133]. The HBFAR scheme maximizes the
replicas overlapped between read/write quorums and improves the response and search
times.

The rest of the chapter is organized as follows. Section 8.1 gives the introduction to
the chapter. Section 8.2 presents the system model. Section 8.3 briefs the system
architecture. The HBFAR scheme is explored in Section 8.4. Section 8.5 explains the
simulation and performance study. Section 8.6 gives a look at the findings. Finally,
the chapter is summarized in Section 8.7.

8.1 Introduction
Data replication is one of the techniques used to enhance the performance of
distributed databases. In replication, data is distributed over geographically
separated systems; each copy of the data stored on a peer is generally called a
replica. Multiple replicas are consulted to get fresh data items from the distributed
system, which makes the system reliable and resilient to faults. Data replication is a
fundamental requirement of distributed database systems deployed on networks which are
dynamic in nature, viz., P2P networks. Peers can join or leave the network at any time,
with or without prior information, and it is found in the literature that the churn
rate of peers is high in P2P networks. For such a highly dynamic environment, the
probability of accessing stale data is comparatively high as compared to a static
environment. There are a number of challenges in implementing databases over dynamic
systems like P2P networks. The major ones are: data consistency, one copy
serializability, fault tolerance, availability of data items, response time, churn
rate of peers and network overhead.

A number of protocols and algorithms have been proposed in the literature to implement
and maintain consistency in distributed databases; examples include single lock,
distributed lock, primary copy, majority protocol, biased protocol and quorum
consensus. The availability of replicas in a dynamic P2P network is a major challenge
because of the churn rate of peers; data availability is also affected by peer
availability in the system. To keep data availability in the acceptable range, a
quorum consensus protocol for accessing the replicas is quite a good option. A system
with replicas stored in a logical structure improves the quorum acquisition time.

If a quorum is formed such that it contains the maximum number of updated replicas,
then the probability of accessing stale data is obviously reduced. To improve the
probability of accessing updated data from the set of replicas, the degree of
intersection between two consecutive quorums must be high. To improve the degree of
intersection among consecutive read-write and write-write quorums, the logical
structure needs to be accessed in a special way.

A logical structure of the replicas reduces unnecessary network traffic due to the
multicasting of search messages/queries for the existing replicas. Network traffic can
be reduced by prioritizing the access of logical structures in P2P systems. Self
organization of the logical structures may also improve the performance of the system,
and network traffic is further reduced through optimizing the underlay path.

In order to reduce the search time, we may take advantage of the overlay topology in
the P2P network. If a logical structure is organized in such a way that all updated
replicas are popped up, then the search time reduces drastically and the probability
of accessing an updated data item improves. To address the above challenges we have
developed the Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme for P2P systems.
It is a self organized scheme that arranges replicas in a complete binary tree with
some special attributes. It improves the probability of accessing updated data from
the quorums and provides a high degree of intersection between two consecutive quorums.

8.2 System Model


For the database replication peers are selected on the basis of availability factor or the
session time. A prioritized access is preferred to access the peers. A longer session
time provides better probability to access fresh data items. All replicas are organized
in a binary logical tree structure. The model defined for a system is as follows:
P is the set of peers, defined as P = {p1, p2, p3, ..., pk}, where k is the number of
peers participating in the system. The replica set R is the set of l peers holding
replicas, where R ⊆ P. A replica may be active or inactive according to its
availability in the system.
Ra = {Ra1, Ra2, Ra3, ..., Ras} is the ordered set of all active replicas arranged in the
logical tree structure: Ra1 has a longer session time than Ra2, Ra2 has a longer
session time than Ra3, and so on. Replicas in the logical structure are managed
according to session time; the replica with the longest session time is placed at the
root.
Rd is the set of all dead/inactive replicas at a particular time, with Ra ∩ Rd = ∅, i.e.,
each replica is either active or dead depending upon the present state of the peer. The
replica set R can also be written in terms of Ra and Rd as R = Ra ∪ Rd.
The write quorum is an ordered set of replicas from Ra. Qw = {Qw1, Qw2, Qw3, ...,
Qws} is the set of write quorums at various times. A write quorum starts from the 1st
replica of Ra and extends up to the quorum size decided by the administrator, say n.
Let the write quorum Qwj = {Ra_i^{t_i} | t_i > t_{n+1}, i = 1...n}, where n is the size
of the write quorum and t_i is the session time of the i-th replica, i.e., the period for
which it has been active in the system. Here i is the position of the replica in the
logical structure starting from the root: the root is at position 1, and its left and right
children are at positions 2 and 3, respectively.


Qr = {Qr1, Qr2, Qr3, ..., Qrs} is the set of read quorums at various times. A read
quorum is defined as Qrj = {Ra_i^{t_i} | t_i > t_{m+1}, i = 1...m}, where m is the size
of the read quorum, decided by the administrator. The read quorum must satisfy
m ≤ n and t_i ≥ t_{i+1}; t_m and t_n are the session times of the last replicas involved
in the read and write quorums, respectively. The replicas with the longest session
times are involved in both the read and write quorums. A read quorum is always a
subset of the write quorum, i.e., for all i, Qwi ∈ Qw, Qri ∈ Qr, and Qri ⊆ Qwi. Thus
Qri will always contain updated replicas, because all replicas with greater session
time, including the root of the logical structure, are involved in the quorums. The
system fails only when every replica, including the root, goes down; otherwise every
read quorum holds the updated information in its replicas.
The quorum size depends upon the availability of the peers in the system. The values
of m and n may be increased when the peers holding replicas have low availability,
and decreased when those peers are highly available.

8.3 System Architecture


To study the behavior of the HBFAR scheme, a 7-Layers Transaction Management
System (7-LTMS) is proposed (Figure 8.1). 7-LTMS executes the received queries
and maintains the other necessary functions of the system. Its components are briefly
discussed below:

Query Optimizer (QO): A query is divided into a number of subqueries. The QO
optimizes these subqueries to reduce the execution time of the overall query and
decides their order of execution, using various optimization techniques.

Subquery Schedule Manager (SQSM): A subquery that is ready to execute is
scheduled so as to achieve one-copy serializability of the transaction. The SQSM
rearranges the order of subqueries to achieve this, and also maintains concurrency
and data consistency in the system.

[Figure 8.1 shows the layer stack: Query Optimizer, Subquery Schedule Manager,
Quorum Manager, Replica Search Manager, Update Manager, Replica Overlay
Manager and Network Connection Manager, which interfaces with the P2P network
and the local storage of the partial database.]

Figure 8.1 7-Layers Transaction Management System (7-LTMS)

Quorum Manager (QM): decides the quorum consensus for accessing the data items.
The quorum is chosen so that the system achieves acceptable replica availability: the
number of replicas to be accessed is increased when the availability of the peers
storing replicas is low, and may be reduced when their availability is high, lowering
the overhead on the network. The QM maintains replica availability at the desired
level, identifies the replicas to be accessed from the logical structure, and implements
the read/write quorum algorithms.

Replica Search Manager (RSM): searches for any replica from the group of replicas.
These replicas are arranged into a logical structure maintained by the ROM. The RSM
also facilitates the search for read/write quorums, using replica-search algorithms
such as HQC, HQC+ and HBFASR.

Replica Overlay Manager (ROM): to increase the performance of the system, replicas
are arranged into a logical structure. The ROM places each replica in this structure so
that the replicas can be accessed easily; an efficient overlay logical structure may
reduce the replica search time. The ROM identifies the replicas for forming a quorum
and maintains the logical structure over time. Whenever a replica leaves the network,
the ROM readjusts the remaining replicas by rearranging their addresses in the logical
structure.

Update Manager (UM): implements all the update strategies. The eager methodology
is used to update the prioritized replicas selected for the write quorum; the lazy
methodology is used for the remaining replicas in the system. The update itself is
performed through the ROM. The UM maintains the freshness of the data items held
by the replicas and implements the eager/lazy update algorithms. The algorithm is
selected so that it updates information within minimum time, since update time plays
a key role in system performance: the probability of accessing stale data is minimized
by minimizing the update time of the system.

Network Connection Manager (NCM): resolves every logical replica address into the
physical address of that replica. The NCM routes data and control information and
maintains the connectivity of the replicas.

Local Storage/Database Partition: this is the actual local database to be accessed. All
data items belonging to the database partition are physically stored here, typically in
a region provided by the owner of that peer and shared with the network. Data
encryption technologies may be employed to protect the data from attack or misuse
by unauthorized persons.

8.4 Height Balanced Fault Adaptive Reshuffle (HBFAR) Scheme


The HBFAR scheme improves the replica acquisition time and the probability of
accessing updated data items. It places replicas in a binary logical structure. Initially,
the peers participating in the overlay topology are checked for their average session
time, and the peers with the highest average session times are selected to hold the
replicas. These replicas are arranged in a binary tree such that they form an almost
complete tree. The replica with the longest session time in the system is selected as
the root; similarly, the other replicas are arranged as left and right children according
to their session times. All replicas with higher session times are placed towards the
root of the logical structure. These top replicas participate in every read/write
quorum, and their regular participation keeps their copies of the data items updated,
which increases the probability of accessing an updated copy. The HBFAR scheme
has some special properties: replicas are arranged in the tree such that the session
time of each parent replica is greater than that of its children, and the session time of
the left child is greater than that of the right child of any parent. Every peer
participating in the system also finds an alternate path to each of its ancestors, i.e., to
every parent replica on the path from that leaf to the root.
Replicas with higher session times are given priority over those with smaller session
times when forming a quorum. The quorum is formed by first taking the replica at the
root of the tree and then the replicas on the branches, i.e., from top to bottom. The
branches of a parent node are accessed such that the left child is accessed before the
right child, i.e., from left to right. This access pattern is referred to as Top-to-Bottom
and Left-to-Right. The HBFAR scheme lets each read quorum obtain an updated copy
of the data items.
Accessing replicas from the same positions in the logical structure every time
increases the degree of intersection among consecutive quorums, and these common
replicas increase the probability of accessing updated data items. Since the replicas
are accessed from the upper part of the tree in Top-to-Bottom and Left-to-Right
fashion, the replica search time is minimized. All read-write and write-write quorums
intersect each other; hence every read quorum accesses an updated copy of the data
item. The maximized intersection set of two consecutive read-write and write-write
quorums ensures access to fresh data items.
The problem of finding a path between a pair of source-destination peers in the
overlay is the problem of finding a route between the source and destination peers in
the underlay topology. As shown in Figure 8.2, the path between peers P1 and P2 is
direct, of one hop count, and the path between P1 and P4 is of two hop counts. The
route identified between source and destination in the underlay may or may not be
the shortest path; however, a shortest path in the underlay is advantageous for
reducing communication cost.


The working of the HBFAR scheme is divided into two independent parts: accessing
the group of replicas and maintaining the logical structure. Both parts execute in
parallel to improve the efficiency of the system. Replicas from the root towards the
terminal nodes are included in quorums to maximize the degree of the intersection
set; the level down to which replicas are included depends upon the quorum size. For
example, if replicas 1,2,3,4,5,6,7,8 generate the HBFAR tree shown in Figure 8.3,
then replicas 1,2,3 form a quorum of size three and replicas 1,2,3,4 form a quorum of
size four. As shown in Figure 8.3, the replicas with higher session times are given
priority when forming the quorums.

[Figure 8.2: underlay topology with peers P1-P15; the replica tree is embedded over
a subset of them.]

Figure 8.2 The arrangement of peers to make the Height Balanced Fault Adaptive Reshuffle
tree over the peers from the underlay topology of a P2P network. The dotted connectors show
the connections between peers in the overlay topology; the dark connectors show the
connections between peers in the replica tree topology. P14 is shown as an isolated peer in
the network.

[Figure 8.3: the HBFAR tree has P1 at the root, P2 and P3 as its children, P4 and P5
under P2, P6 and P7 under P3, and P8 under P4. Each peer's ancestor table:]

Peer Id | Parent Id (1 Hop) | 2 Hop | 3 Hop | 4 Hop
P1      | X                 | X     | X     | X
P2      | P1                | X     | X     | X
P3      | P1                | X     | X     | X
P4      | P2                | P1    | X     | X
P5      | P2                | P1    | X     | X
P6      | P3                | P1    | X     | X
P7      | P3                | P1    | X     | X
P8      | P4                | P2    | P1    | X

Figure 8.3 Replica arrangements in the HBFAR scheme generated from Figure 8.2. The
session time of P1 is greater than those of P2 and P3. The order of the replicas according to
session time in the HBFAR scheme is P1, P2, P3, P4, P5, P6, P7, and P8.


The performance of the system remains approximately the same even under a high
churn rate of peers. With maximally overlapped replicas in consecutive read and write
quorums, the scheme ensures access to fresh data items from any number of replicas
in the system. The HBFAR scheme also provides high fault tolerance: with selforganization it tolerates up to n − 1 faults among n replicas in the system.
Multicasting and directional forwarding are used to transfer messages in the system.
The HBFAR scheme triggers a maintenance procedure for every replica that leaves.
The other replicas in the system are adjusted according to their session times: the
replica with the next-higher session time takes the position of the replica that left the
network. By default, replicas with longer session times move upwards with the
passage of time. The scheme works on four rule sets, defined in the following
sections.

8.4.1 Rule Set-I: Rules for Generation of the Height Balanced Fault Adaptive
Reshuffle (HBFAR) Structure


The HBFAR structure is a special type of logical structure, similar to a complete
binary tree, used for the replica overlay. An HBFAR structure of size n is a binary
tree of n nodes which satisfies the following (Rule Set-I):
(i) The binary tree is almost complete, i.e., there is an integer k such that every
leaf of the tree is at level k or k + 1, and if a node has a right descendant at
level k + 1 then that node also has a left descendant at level k + 1.
(ii) The keys in the nodes are arranged such that the content of each node is less
than or equal to the content of its parent, i.e., for each node, Key_i ≤ Key_j,
where j is the parent of node i.
A peer at level k holds the addresses of its connected peers at levels k − 1 and k + 1
in the HBFAR structure, i.e., each peer stores the addresses of its directly connected
children and of its parent. In addition, each peer stores the addresses of all the
ancestor peers on the path from that peer to the root. All peers/replicas follow Rule
Set-I to build this overlay logical structure, or replica overlay. The session time of
each peer/replica is used as the key in the replica overlay; using session time as the
key causes replicas with longer session times to move towards the root. Each newly
joined peer connects at a leaf position in the logical structure, decided according to
Rule Set-I, and also searches for the alternate path to each ancestor up to the root.
Each peer holding a replica transmits a beacon advertising its active status to all its
directly connected peers. These stored addresses and beacons are used to re-establish
connections in case of any failure.
The stored addresses allow peers to be accessed in a particular sequence. HBFAR
reduces the search time for building quorums by using a minimum hop count; this
reduction in search time is even greater than that achieved by HQC and HQC+.

8.4.2 Rule Set-II: Rules for replica leaving from HBFAR

When any replica leaves the system, with or without informing it, the following steps
(Rule Set-II) are taken to maintain the replica logical structure. Assume the replica x
at level k is about to leave the network, e.g., peer 2 leaving the network as shown in
Figure 8.4.
(i) The replicas at level k + 1 that are directly connected with replica x try to
connect with their alive grandparent (whose address is stored at each peer that
joined the system).
(ii) If multiple replicas approach it, the alive grandparent compares their session
times. The replica with the highest session time connects with the active
grandparent and takes the position of the recently departed x in the logical
structure, as shown in Figure 8.5.
(iii) The replicas at level k under the parent at level k − 1 are adjusted according to
the same conditions.

8.4.3 Rule Set-III: Rules for replica joining the replica logical structure
When any replica rejoins the system, the following rules (Rule Set-III) are applied to
maintain the replica overlay topology:
(i) The rejoined peer first searches for its position through ping-pong messages
starting from the root; assume the position is at level k.
(ii) The replica establishes a connection with its parent peer, and both save each
other's addresses.
(iii) The replica updates its data items by comparing them with those of its alive
parent and updates the version numbers of the data items.
(iv) The replica stores the addresses of all its ancestors up to the root, obtained
through a path-find message.
[Figure 8.4: the same tree and ancestor table as Figure 8.3 (P1 at the root; P2 and P3
below it; P4, P5, P6, P7 at the next level; P8 under P4), with peer P2 drawn in dotted
lines.]

Figure 8.4 Replica arrangements in an HBFAR logical tree structure. Peer 2, shown with
dotted lines, is leaving the network.

[Figure 8.5: after P2 leaves, P1 remains the root with children P4 and P3; P8 and P5
are under P4, and P6 and P7 are under P3. The updated ancestor table:]

Peer Id | Parent Id (1 Hop) | 2 Hop | 3 Hop | 4 Hop
P1      | X                 | X     | X     | X
P3      | P1                | X     | X     | X
P4      | P1                | X     | X     | X
P5      | P4                | P1    | X     | X
P6      | P3                | P1    | X     | X
P7      | P3                | P1    | X     | X
P8      | P4                | P1    | X     | X

Figure 8.5 The HBFAR logical tree structure after the departure of Peer 2. Peer 4 takes the
position of Peer 2, which has left the network; all other replicas in the downlink are
readjusted accordingly.

8.4.4 Rule Set-IV: Rules for Acquisition of Read/Write Quorums from the HBFAR
Logical Tree
The following rules (Rule Set-IV) define how the HBFAR logical tree is accessed to
form the read/write quorums:
(i) Sizeof(Qri) ≤ Sizeof(Qwi): the size of the read quorum is always less than or
equal to the size of the write quorum.
(ii) Quorum acquisition always starts from the root, i.e., the root is always included
in the read/write quorum.
(iii) For any integer k, if a replica at level k is in the quorum, then every replica at
level k − 1 must also be in the quorum of the HBFAR logical tree. This rule is
referred to as Top-to-Bottom.
(iv) If a replica from the right descendants of a parent replica is in the quorum, then
a replica from the left descendants must also be in the quorum. This is the
Left-to-Right rule.
These rules are implemented in the HBFAR scheme by combining Top-to-Bottom
and Left-to-Right access to the replicas in the HBFAR logical structure.

Read/Write Quorum: The quorum size depends upon the overall availability of the
replicas and the overhead on the network. The size of the quorums may be increased
when replica availability is low and reduced when it is high. The replicas included in
quorums are selected according to Rule Set-IV. The quorum size also affects network
traffic: the number of messages transferred to maintain the HBFAR logical structure
increases with the quorum size, and therefore so does the network overhead. The
quorum size is directly proportional to the network overhead, i.e., there is a tradeoff
between network overhead and the quorum size.
After considering all the factors affecting the system, the HBFAR scheme uses a
fixed number of replicas in its read/write quorums. The sizes of the read and write
quorums may differ depending upon the requirements of a system. Replicas are
included in sequence from the HBFAR logical structure, according to session time, to
form the quorums. All replicas accessed in a read quorum are compared to find the
updated version of the data items; in the best case only the root needs to be accessed
for updated data.
Write quorums are formed in the same way as read quorums: replicas are selected
from the HBFAR logical structure from top to bottom and left to right. Whenever a
write query is executed in the system, all the replicas in the quorum are updated by
the write-through method, i.e., the write is committed only after acknowledgements
are received from all available replicas in the quorum. The remaining replicas in the
structure are updated by the write-back method. Most queries are thus answered by
the topmost replicas of the HBFAR logical structure, which have the longest session
times. The replicas not used in write quorums are updated through the lazy method,
and the extra links reduce the time for update messages to reach all replicas in the
system.
If the quorum size is four, then the four replicas at the top of the HBFAR logical
structure, taken from the root downwards and from left to right, form the read/write
quorum. Considering the logical structure shown in Figure 8.5, peers P1 and P4 are
used for a quorum of size two; P1, P4 and P3 for a quorum of size three; and P1, P4,
P3 and P8 for a quorum of size four.

8.4.5 Correctness Proof of the Algorithm


We use mathematical induction to prove that the replicas accessed in a read quorum
from the HBFAR logical structure include at least one replica with updated data
items. Assume the replicas are organized in an HBFAR logical structure of height h.

Basis:
(i) Assume the height of the HBFAR tree is 0, i.e., there is only one peer/replica in
the structure (placed at the root). According to Rule Set-IV (ii), every read and
write quorum must include the root peer, so every access obtains the updated
data items from the root. Mathematically,
Qw0 = {P0}, Qr0 = {P0}, Qr0 ⊆ Qw0, Qw0 ∩ Qr0 ≠ ∅;
the read quorum and write quorum intersect each other, i.e., the read quorum
obtains the updated data items. Hence the HBFAR scheme provides updated
data for height 0. {Hence proved}
(ii) Assume the height of the HBFAR logical structure is 1, i.e., the tree has at most
2-3 peers: one replica at the root and 1-2 in the downlink of the root. According
to Rule Set-IV, the size of the write quorum is greater than or equal to the size
of the read quorum, and the replicas in the quorum are selected Top-to-Bottom
and Left-to-Right. The write quorum Qw1 is defined as:

Qw1 = {{P0}, {P0, P1}, {P0, P1, P2}}     (8.1)

The read quorum is defined as:

Qr1 = {{P0}, {P0, P1}, {P0, P1, P2}}     (8.2)

For every possible read quorum and write quorum, the quorums intersect each
other, so a read always obtains the updated information. Using eqns (8.1) and
(8.2) we conclude:

∀ Qr1, Qw1: Qr1 ⊆ Qw1, Qw1 ∩ Qr1 ≠ ∅

Every possible read quorum contains at least one updated replica; since the
intersection is always non-empty, the read quorum always accesses fresh data
items. Thus the HBFAR scheme provides updated data for height 1. {Hence
proved}

Hypothesis:
Assume an HBFAR logical tree of height i, with write and read quorums Qwi and
Qri of sizes l and k, respectively, defined as:

Qwi = {P1, P2, ..., Pk, ..., Pl}     (8.3)

Qri = {P1, P2, P3, ..., Pk}     (8.4)

with l, k ≤ 2^(i+1) − 1 and k ≤ l, where 2^(i+1) − 1 is the total number of replicas up
to level i of the logical structure. Replicas are accessed in Top-to-Bottom and
Left-to-Right fashion as per Rule Set-IV. Assume all the replicas up to Pk lie in the
intersection of the write and read quorums according to Rule Set-IV. From eqns (8.3)
and (8.4):

Qwi ∩ Qri = {P1, P2, P3, ..., Pk}     (8.5)

Therefore each read quorum accesses updated replicas, as the intersection of the write
and read quorums is non-empty.

Inductive Step:
We now prove the claim for an HBFAR logical structure of height i + 1. According
to Rule Set-IV, the write quorum of size n is defined as:

Qwi+1 = {P1, P2, P3, ..., Pk, ..., Pl, ..., Pn}     (8.6)

The read quorum of size m is defined as:

Qri+1 = {P1, P2, P3, ..., Pk, ..., Pm}     (8.7)

where n is the size selected for the write quorum and m the size for the read quorum,
with l ≤ n and k ≤ m. The quorums are generated through Rule Set-IV. From eqns
(8.6) and (8.7):

Qwi ⊆ Qwi+1, Qri ⊆ Qri+1     (8.8)

From eqns (8.5) and (8.8), Qwi+1 ∩ Qri+1 ≠ ∅; the intersection of the two quorums
contains at least {P1, P2, P3, ..., Pk} from eqn (8.5). This proves that every read
quorum intersects the write quorum in the HBFAR scheme; therefore every read
quorum carries the updated information. {Hence proved}

Adaptivity & Fault Tolerance: The HBFAR scheme easily adapts to any peer fault. It
works for both cases, a peer leaving or joining the system, and tolerates up to n − 1
faults among the n replicas participating in the system.

Availability: the probability that at least one replica is available in the system, given
as

A = 1 − ∏_{i=1}^{n} (1 − Pr_i)     (8.9)

where Pr_i is the probability that the i-th replica stays alive and n is the number of
replicas.

8.5 Simulation and Performance Study


For the simulation, a discrete event-driven simulator was developed in C++. It is
assumed that the network consists of 1000 peers in the underlay, with 10%-20% of
the peers used in the overlay topology. Random peer placement, HQC and HBFAR
topologies are implemented for the simulation. The average search time, response
time and average message transfer time are considered as the performance metrics;
these measure the cost of maintaining the system after executing a write quorum.
Dijkstra's algorithm is used to find the shortest path in all experimental setups.
Within a particular group of replicas, it is assumed that all peers are directly
connected with each other and that sending and receiving messages is the only means
of communication between the peers. Each replica stores the addresses of the other
connected replicas, which helps in searching for replicas while forming the read/write
quorums. A peer contains stale information when it rejoins the system.

8.5.1 Performance Metrics


To study the behavior of the HBFAR scheme we use the performance parameters
defined in Table 4.3 (Chapter 4), the performance metrics defined in Table 7.2
(Chapter 7) and a few from Table 4.2 (Chapter 4), along with the metrics defined
below:

Availability: of a peer is calculated as the total active time of the peer over its total
time in the system, including active and down time. This measures the participation
of a peer in the system: the longer a peer's participation time, the greater its
contribution.

Percentage of Stale Data: calculated as the number of accessed replicas with stale
data over the total accesses in the quorum. Any system should keep stale data access
minimal; a system with a lower value of stale data access is considered better.

8.5.2 Simulation Results


It is observed from Figure 8.6 that approximately 80% of peers are reachable at 100%
availability. Peer availability is one of the factors responsible for network
partitioning: failure of any peer may partition the network, and partitions are more
likely with low-availability peers. Peers belonging to one partition of a network with
more than one partition are not reachable from peers in another partition. Another
factor is the cardinality of the peers participating in the network. A connection in the
overlay uses multiple underlay connections, so any loss of an underlay connection
increases the path length needed to reach a peer in the overlay; a peer may even be
unreachable within a predefined Time to Live (TTL). Reachability affects the replica
search time: low reachability increases the path length to the searched peer, which
increases the cost of accessing a replica.


[Figure 8.6: percentage reachability versus availability (0 to 0.9), plotting the
percentage of up peers and of reachable peers.]

Figure 8.6 Reachability of peers under availability in the network

Figure 8.7 shows that the probability of accessing stale data decreases as peer
availability increases. In the HBFAR scheme this probability is much lower than in
Random and HQC. The access of all subqueries is counted against stale data access.
It is also observed that the probability of accessing stale data is in the acceptable
range for replicas with availability greater than 0.7; peers with availability above 0.7
may therefore be given priority for storing replicas, so that the performance of the
system may be improved.

[Figure 8.7: probability of stale data versus availability (0 to 0.9) for the Random,
HQC and HBFAR schemes.]

Figure 8.7 Comparison in accessing stale data under availability of peers

[Figure 8.8: average search time (ms) versus quorum size (up to 20) for the HBFAR,
HQC and Random schemes.]

Figure 8.8 Comparison of average search time to form the quorum from the networks

Figure 8.8 shows that the quorum acquisition time increases with the quorum size.
The average search time for random quorum consensus is considerably higher than
for the HBFAR scheme, which takes less time to find a peer because the peers are
located at known positions in the structure. Random quorum consensus takes longer
because it finds peers through flooding rather than through the structured overlay.
[Figure 8.9: average response time (ms) versus quorum size (up to 20) for the
Random, HQC and HBFAR schemes.]

Figure 8.9 Comparison of average response time

It is observed from Figure 8.9 that the average response time increases with the
quorum size, and in all cases becomes approximately constant beyond a quorum size
of 12. The response time of the HBFAR scheme is the lowest among the considered
schemes; its quorum acquisition time is much lower than those of Random and HQC.

[Figure 8.10: average message transfer versus quorum size (up to 20) for the Random
and Hierarchical schemes.]

Figure 8.10 Comparison of average message transfer to maintain the system

Figure 8.10 shows that the network overhead for random quorum consensus increases
rapidly as the quorum size grows. The network overhead of hierarchical quorum
consensus is small by comparison, because it uses the binary logical structure.

8.6 Discussion
The simulation results show that the average message transfer in the P2P network is
minimized through directional search, compared with random search. The message
transfer time in the hierarchical topology is also lower than in the random topology.
The quorum acquisition time is a major performance factor: a system that takes less
time to search for replicas performs better. The HBFAR scheme takes less search
time than Random and HQC because it fixes the search locations in the logical
structure, whereas Random and HQC search for replicas randomly, which lengthens
quorum formation and degrades system performance.


The HBFAR scheme outperforms HQC in terms of the search time to form quorums,
the response time and the probability of accessing updated data items in a dynamic
network environment. It provides better data availability, maximizes the degree of
intersection among consecutive read-write and write-write quorums, and thereby
provides a better probability of accessing updated data items from the system. The
HBFAR scheme easily adapts to any peer leaving or joining the system, and system
performance does not seriously degrade as the churn rate of the peers increases. It
also continues to work in case of faults, tolerating up to n − 1 of them.

8.7 Summary
In this chapter we have presented the HBFAR logical structure for overlay networks.
The HBFAR logical structure is organized so that all updated replicas are popped
towards the root and only updated replicas participate in any quorum formation:
replicas with long session times sit on the root side, while replicas with shorter
session times are arranged on the branch side. The structure adjusts itself whenever a
replica leaves, so a replica with a long session time is always at the top of the tree.
This reduces the time spent forming a quorum of replicas and improves the response
time of the system.
In the next chapter the work is concluded, with recommendations for future scope.


Chapter 9

Conclusion and Future Work


In P2P networks, peers are rich in computing resources/services, viz., data files,
cache storage, disk space, processing cycles, etc. Collectively, these peers generate a
huge amount of resources and collaboratively perform computing tasks using them.
Peers can serve as both clients and servers, eliminating the need for a centralized
node. However, because nodes can join and leave continually, P2P systems are very
dynamic, with a high rate of churn and an unpredictable topology. A major drawback
of P2P systems is that resources and nodes are only temporarily available: a network
element can disappear from the network at any time and reappear at another locality
with an unpredictable pattern. Under these circumstances, one of the most
challenging problems is to place and access real-time information over the network,
because requesters should always be able to locate the resources they need within
some bounded delay. This requires managing information under time constraints and
peer dynamism. Multiple challenges must be addressed to implement Real Time
Distributed Database Systems (RTDDBS) over dynamic P2P networks. To enable
resource awareness in such a large-scale dynamic distributed environment, a specific
management system is required that takes into account the following P2P
characteristics: reduction of redundant network traffic, data distribution, load
balancing, fault tolerance, replica placement/updating/assessment, data consistency,
concurrency control, and the design and maintenance of a logical structure for
replicas. In this thesis, we have developed a solution for resource management that
supports fault-tolerant operations, shortest path lengths for requested resources, low
overhead in network management operations, well-balanced load distribution
between the peers and a high probability of successful access from the defined
quorums.
The rest of the chapter is organized as follows. The contributions of this dissertation
are summarized in Section 9.1, and the chapter ends with future work in Section 9.2.


9.1 Contributions
Contributions of this dissertation are as follows.

1. We have designed the Statistics Manager and Action Planner (SMAP) system for P2P
   networks, along with algorithms that enhance the performance of its modules.
   SMAP enables fast and cost-efficient deployment of information over the P2P
   network. It is a self-managed P2P system capable of dealing with a high churn
   rate of peers. SMAP is fault adaptive and provides load balancing among the
   participating peers. It permits a true distributed computing environment in
   which every peer node can use the resources of all other peers in the network,
   and it provides data availability by managing replicas in an efficient logical
   structure. To improve throughput, the system divides the execution process into
   three independent sub-processes that can run in parallel. SMAP provides fast
   response times for transactions with time constraints, reduces redundant traffic
   in P2P networks by shortening the conventional overlay path, and addresses most
   of the issues related to implementing an RTDDBS over P2P networks.

2. We have developed a 3-Tier Execution Model (3-TEM) comprising a Transaction
   Coordinator (TC), Transaction Processing Peers (TPP), and a Result Coordinator
   (RC). All three operate in parallel to improve the throughput of the system. The
   model is adaptive in nature and balances excessive load in the system by
   distributing the work of the head peer between the TC and RC. The TC receives
   arriving transactions and manages their execution, resolving each transaction,
   mapped to the global schema, into subtransactions mapped to the local schemas
   available at the TPPs. The partial results received from the TPPs are combined
   according to the global schema and delivered to the requester through the RC.
   TPPs receive subtransactions from the coordinator, execute them in serializable
   form, and submit partial results to the RC. These three stages are independent
   and execute transactions in parallel. A peer selection criterion identifying the
   peers best suited to hold replicas is also presented; peers are selected on the
   basis of multiple parameters, e.g., available resources and session time of
   peers, which improves the performance of the system.
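The three-stage flow of 3-TEM can be sketched as a small pipeline in which the TPP stage runs in parallel. This is an illustrative sketch only: the function names, the transaction format, and the split-by-site rule below are assumptions, not the thesis implementation.

```python
from concurrent.futures import ThreadPoolExecutor

def transaction_coordinator(transaction):
    # TC: resolve a global transaction into subtransactions,
    # grouped here by the peer (local schema) each operation targets.
    subs = {}
    for op in transaction:
        subs.setdefault(op["site"], []).append(op)
    return subs

def transaction_processing_peer(site, ops):
    # TPP: execute its subtransaction serially, return a partial result.
    return site, [f"{op['action']}:{op['item']}" for op in ops]

def result_coordinator(partials):
    # RC: merge partial results back under the global schema.
    return {site: result for site, result in partials}

def run_3tem(transaction):
    subs = transaction_coordinator(transaction)
    # TPPs run in parallel; in the real system TC and RC also
    # overlap with them on successive transactions.
    with ThreadPoolExecutor() as pool:
        partials = pool.map(lambda kv: transaction_processing_peer(*kv),
                            subs.items())
        return result_coordinator(partials)

txn = [
    {"site": "p1", "action": "read",  "item": "x"},
    {"site": "p2", "action": "write", "item": "y"},
    {"site": "p1", "action": "write", "item": "x"},
]
print(run_3tem(txn))  # partial results merged per site
```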

3. A Matrix Assisted Technique (MAT) is developed to partition a real-time database
   for P2P networks. It provides a mechanism to store partitions and access dynamic
   data over P2P networks under the dynamic environment. MAT also addresses the
   primary security concerns of the stored data while simultaneously improving data
   availability in the system.
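One simple way to picture a matrix-assisted allocation is a binary fragment-by-peer matrix that records where each partition's replicas live. The round-robin placement policy and helper names below are assumptions chosen for illustration, not the MAT algorithm itself.

```python
def build_allocation_matrix(num_fragments, peers, replicas_per_fragment):
    # Rows = fragments, columns = peers; matrix[f][p] == 1 means
    # fragment f holds a replica on peer p (round-robin placement).
    matrix = [[0] * len(peers) for _ in range(num_fragments)]
    for f in range(num_fragments):
        for r in range(replicas_per_fragment):
            matrix[f][(f + r) % len(peers)] = 1
    return matrix

def locate(matrix, peers, fragment):
    # Access path: the matrix tells a requester which peers to contact;
    # a single peer holds only a fragment, which limits what it can leak.
    return [peers[p] for p, bit in enumerate(matrix[fragment]) if bit]
```

Replicating every fragment on `replicas_per_fragment` peers is what keeps data available when individual peers churn out of the network.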
4. A Timestamp-based Secure Concurrency Control Algorithm (TSC2A) is developed to
   handle concurrent execution of transactions in the dynamic environment of a P2P
   network. TSC2A maintains the security of data and of time-bounded transactions
   along with controlled concurrency, using timestamps to resolve the conflicts
   that arise in the system. Security is provided for each data item and for the
   transactions accessing it, with three security levels used during transaction
   execution. TSC2A also avoids the covert channel problem and provides
   serializability of transaction execution at both the global and local levels.
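A minimal sketch of the two ingredients can clarify how they combine: basic timestamp ordering for conflicts, plus a three-level lattice check on data access. The level names and the simplified rules below are assumptions for illustration, not the exact TSC2A protocol.

```python
LEVELS = {"public": 0, "confidential": 1, "secret": 2}  # three security levels

class DataItem:
    def __init__(self, level):
        self.level = LEVELS[level]
        self.read_ts = 0   # largest timestamp that has read the item
        self.write_ts = 0  # largest timestamp that has written the item

def can_access(txn_level, item, op):
    # Simplified lattice rule: read only at or below the transaction's
    # clearance, write only at or above it (limits covert channels).
    if op == "read":
        return LEVELS[txn_level] >= item.level
    return LEVELS[txn_level] <= item.level

def timestamp_check(ts, item, op):
    # Basic timestamp ordering: reject operations that arrive "too late",
    # so the equivalent serial order follows transaction timestamps.
    if op == "read":
        if ts < item.write_ts:
            return False          # would read an overwritten version
        item.read_ts = max(item.read_ts, ts)
    else:
        if ts < item.read_ts or ts < item.write_ts:
            return False          # conflicts with a younger transaction
        item.write_ts = ts
    return True
```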

5. A Common Junction Methodology (CJM) reduces network traffic in P2P networks by
   tackling the redundant traffic generated by the topology mismatch problem. CJM
   finds its own route to transfer messages from one peer to another. Messages are
   usually forwarded peer to peer in the overlay topology, and a single overlay hop
   may traverse multiple hops in the underlay. These underlay paths may intersect,
   and such an intersection point, referred to as a Common Junction, is used to
   reroute messages between the two paths. This also reduces traffic in the
   underlay network. CJM reduces traffic without affecting the search scope in the
   P2P network, and the correctness of the proposed CJM is analyzed through both a
   mathematical model and simulation.
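The junction idea can be illustrated on underlay paths represented as node lists: the first node of one overlay hop's underlay path that also lies on the next hop's path is a point where the message can be redirected, skipping the detour through the intermediate overlay peer. The helper names below are hypothetical.

```python
def common_junction(path_ab, path_bc):
    # path_ab, path_bc: underlay node sequences for two consecutive
    # overlay hops (A -> B and B -> C). Return the earliest node of
    # path_ab that also lies on path_bc, with its index in each path.
    on_bc = set(path_bc)
    for i, node in enumerate(path_ab):
        if node in on_bc:
            return node, i, path_bc.index(node)
    return None

def reroute(path_ab, path_bc):
    hit = common_junction(path_ab, path_bc)
    if hit is None:
        # No junction: fall back to the plain overlay route A..B..C.
        return path_ab + path_bc[1:]
    node, i, j = hit
    # Cut over at the junction, avoiding the round trip through peer B.
    return path_ab[:i] + path_bc[j:]
```

In the test below, the plain overlay route costs six underlay hops while the rerouted path costs four, which is the kind of redundant traffic CJM removes.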

6. A novel Logical Adaptive Replica Placement Algorithm (LARPA) is developed that
   implements a logical structure for the dynamic environment. The algorithm is
   adaptive in nature and tolerates up to n-1 faults. It efficiently distributes
   replicas to sites at one-hop distance to improve data availability in an RTDDBS
   over a P2P system, using a minimum number of peers to place replicas. These
   peers are identified through the peer selection criteria, and all are placed at
   one-hop distance from the centre of LARPA, the point from which any search
   starts. Depending on how peers are selected for the logical structure, LARPA is
   classified as LARPA1 or LARPA2: LARPA1 uses only the peers with the highest
   candidature value, calculated through the peer selection criteria, while LARPA2
   trades the candidature value against the distance of the peers from the centre.
   LARPA improves the response time of the system, throughput, data availability,
   and the degree of intersection between two consecutive quorums. It also provides
   a high probability of accessing updated data items from the system and a short
   quorum acquisition time. Reconciliation in LARPA is fast because the system
   updates itself at a fast rate, and the one-hop logical structure with a minimum
   number of replicas also reduces network traffic in the P2P network.
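The difference between the two variants can be sketched as a scoring step. The candidature weights, field names, and distance penalty below are assumptions chosen for illustration; the thesis combines its own set of parameters.

```python
def candidature(peer, w_session=0.6, w_resources=0.4):
    # Illustrative weighted score over the selection parameters
    # (session time, available resources); the weights are assumed.
    return w_session * peer["session_time"] + w_resources * peer["resources"]

def larpa1(peers, k):
    # LARPA1: pick the k peers with the highest candidature value only.
    return sorted(peers, key=candidature, reverse=True)[:k]

def larpa2(peers, k, w_dist=0.5):
    # LARPA2: discount candidature by distance from the centre peer,
    # favouring peers that can sit at one-hop distance.
    score = lambda p: candidature(p) - w_dist * p["hops_from_centre"]
    return sorted(peers, key=score, reverse=True)[:k]
```

With this scoring, a stable far-away peer can win under LARPA1 but lose under LARPA2 to a nearby peer, which is the trade-off the two variants embody.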

7. A self-organized Height Balanced Fault Adaptive Reshuffle (HBFAR) scheme is
   developed to improve hierarchical quorums over P2P systems. It arranges all
   replicas in a logical tree structure and adapts to peers joining and leaving
   the system, placing the updated replicas on the root side of the structure. To
   access updated data items, the scheme uses a special access order, i.e.,
   top-to-bottom and left-to-right, so that updated replicas are always selected
   for quorums from the logical structure. It provides a short quorum acquisition
   time with a high degree of intersection between two consecutive quorums, which
   maximizes the replicas overlapped by read/write quorums. HBFAR improves the
   response time and the search time of replicas for quorums, and in its best case
   provides the read-one property. The scheme offers high data availability and a
   high probability of accessing updated data from the dynamic P2P system, and it
   reports high fault tolerance and low network traffic under peer churn.
   Parallelism between quorum access and structure maintenance keeps the HBFAR
   structure up to date without affecting quorum access time. HBFAR is analyzed
   both mathematically and through a simulator.
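The freshest-replicas-near-the-root idea can be approximated with a max-heap keyed on update time, so that a quorum drawn top-to-bottom, left-to-right is simply the set of freshest replicas. This is a deliberate simplification of the tree scheme, with illustrative names.

```python
import heapq

class HBFARTree:
    # Replicas kept in a max-heap on last-update time, so the freshest
    # replica sits at the root and quorums are drawn from the top.
    def __init__(self):
        self._heap = []  # entries are (-update_time, replica_id)

    def join(self, replica_id, update_time):
        heapq.heappush(self._heap, (-update_time, replica_id))

    def leave(self, replica_id):
        # Reshuffle after a peer leaves: drop it and restore heap order.
        self._heap = [e for e in self._heap if e[1] != replica_id]
        heapq.heapify(self._heap)

    def quorum(self, size):
        # Top-to-bottom, left-to-right: take the `size` freshest replicas.
        return [rid for _, rid in heapq.nsmallest(size, self._heap)]
```

A quorum of size one taken from the root of such a structure is the read-one best case mentioned above.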
A comparative study of SMAP is summarized in Table 9.1, which shows that SMAP
meets most of the existing challenges.


9.2 Future Scope


Information delivery in ad hoc networks is, in any case, a resource-consuming task.
We are heading towards a future of miniaturization and wireless connectivity, and
ad hoc networks have the ability to deliver both at very low cost.

1. For future research, this work can be extended towards secure dissemination of
   information by integrating a security framework, in terms of trust establishment
   and trust management, into the P2P network. Such a system would also be developed
   to explore and solve security issues on open networks.
2. To address unique security concerns, it would be imperative to study adjacent
   technological advances in distributed systems, ubiquitous computing, broadband
   wireless communication, nanofabrication, and bio-systems.
3. Despite good research in this socially popular and emerging field, there is
   still considerable scope for research in P2P networks and systems.
4. We observed that most P2P systems are popular for static data, i.e., data that
   does not change while it is shared across the network. Little work has been done
   on sharing dynamic data among P2P systems, i.e., data that changes while it is
   being shared. We have developed SMAP and tested it through simulation; in the
   future it will be ported to real networks.
5. Reliability is another issue that needs more attention from the research
   community; other open issues include secure concurrency control, secure fault
   tolerance, and secure load balancing.


Table 9.1 Comparison of Few Existing Systems with SMAP

[Table layout lost in extraction; the cell alignment is not recoverable. The table
compares SMAP with the middleware systems CAN, Tapestry, Chord, Pastry, Napster,
Gnutella, Freenet, APPA, Piazza, PIER, PeerDB, and NADSE on the following
attributes: load balancing; fault tolerance (communication link and host level);
replication; resource sharing; security (communication level); scalability;
performance; distributed file management; data partitioning; traffic optimization;
concurrency control; parallel execution; schema management; degree of
decentralization; and network structure.]

*NA: Not addressed to the best of our knowledge

List of Publications
International Journals

1. Shashi Bhushan, M. Dave, R. B. Patel, Self Organized Replica Overlay


Scheme for P2P Networks, International Journal of Computer Network and
Information Security, Vol. 4(10), 13-23, 2012. ISSN: 2074-9090 (Print),
ISSN: 2074-9104 (Online). DOI: 10.5815/ijcnis.2012.10.02
2. Shashi Bhushan, R. B. Patel, Mayank Dave, Height Balanced Reshuffle
Architecture for Improving Hierarchical Quorums over P2P Systems,
International Journal of Information Systems and Communication, Vol. 3(1),
215-219, 2012. ISSN: 0976-8742 (Print) & E-ISSN: 0976-8750 (Online)
Available online at http://www.bioinfo.in/contents.php?id=45
3. Shashi Bhushan, R. B. Patel, M. Dave, Reducing Network Overhead with
   Common Junction Methodology, International Journal of Mobile Computing and
   Multimedia Communication (IJMCMC), IGI Global, Vol. 3(3), pp. 51-61, December
   2011. ISSN: 1937-9412, EISSN: 1937-9404, USA. DOI: 10.4018/jmcmc.2011070104
4. Shashi Bhushan, R. B. Patel, Mayank Dave, A Secure Time-Stamp Based
Concurrency Control Protocol for Distributed Databases, Journal of
Computer Science, 3(7), 561-565, 2007, ISSN: 1549-3636, New York.
DOI:10.3844/jcssp.2007.561.565
5. Shashi Bhushan, R. B. Patel, M. Dave, LARPA - A Logical Adaptive
Replica Placement Algorithm to Improve Performance of Real Time
Distributed Database over P2P Networks, Journal of Network and
Computer Applications (JNCA), Elsevier. {Communicated on October 27,
2012}

Papers in Conference Proceedings


6. Shashi Bhushan, R. B. Patel, M. Dave, Hierarchical Data Distribution
   Scheme for Peer-to-Peer Networks, In Proceedings of International Conference
   on Methods and Models in Science and Technology (ICM2ST-10), December 25-26,
   2010, Chandigarh. Indexed with AIP Conference Proceedings, Vol. 1324, pp.
   332-336 (2010). DOI: 10.1063/1.3526226


7. Shashi Bhushan, R. B. Patel, M. Dave, Adaptive Load Balancing within
   Replicated Databases over Peer-to-Peer Networks, In Proceedings of 1st
   International Conference on Mathematics & Soft Computing (Application in
   Engineering) (ICMSCAE), December 4-5, 2010, NC College of Engineering,
   Israna (Haryana), INDIA.
8. Shashi Bhushan, R. B. Patel, M. Dave, CJM: A Technique to Reduce
   Network Traffic in P2P Systems, In Proceedings of IEEE International
   Conference on Advances in Computer Engineering (ACE 2010), Bangalore, INDIA,
   June 21-22, 2010. DOI: 10.1109/ACE.2010.55. Available with IEEE Xplore.


9. Shashi Bhushan, R. B. Patel, Mayank Dave, A Distributed System For
Placing Data Over P2P Networks, In Proceedings of International
Conference on Soft Computing and Intelligent Systems, Jabalpur Engineering
College Jabalpur, INDIA, 27-29 December, 2007, pp.160-164.
10. Shashi Bhushan, R. B. Patel, M. Dave, Load Balancing within Hierarchical
Data Distribution in Peer-to-Peer Networks, In Proceedings of 4th National
Conference on Machine Intelligence (NCMI-2008), Haryana Engg. College,
Jagadhri (HR) INDIA. August 22-23, 2008, pp. 392-395.


Bibliography
[1]

V.Gorodetsky,

O.Karsaev,

V.Samoylov,

S.Serebryakov,

S.Balandin,

S.Leppanen, M.Turunen, Virtual P2P Environment for Testing and


Evaluation of Mobile P2P Agents Networks, Proceedings of the IEEE
Second International Conference on Mobile Ubiquitous Computing, Systems,
Services and Technologies, pp. 422-429, 2008.
[2]

Javed I. Khan and Adam Wierzbicki, Foundation of Peer-to-Peer


Computing, Special Issue, Computer Communications, Vol. 31(2), pp. 187418, February 2008.

[3]

C. Shirky, What is P2P and What Isnt, Proceedings of The O'Reilly Peer to
Peer and Web Service Conference, Washington D.C., pp. 5-8, November 2001.

[4]

William Sears, Zhen Yu, Yong Guan, An Adaptive Reputation-based Trust


Framework for Peer-to-Peer Applications, Proceedings of the Fourth IEEE
International Symposium on Network Computing and Applications (NCA05),
pp. 1-8, 2005.

[5]

Nikta Dayhim, Amir Masoud Rahmani, Sepideh Nazemi Gelyan, Golbarg


Zarrinzad, Towards a Multi-Agent Framework for Fault Tolerance and QoS
Guarantee in P2P Networks, Proceedings of the Third IEEE International
Conference on Convergence and Hybrid Information Technology, pp. 166-171,
2008.

[6]

Qian Zhang, Yu Sun, Zheng Liu, Xia Zhang Xuezhi Wen, Design of a
Distributed P2P-based Grid Content Management Architecture, Proceedings
of the 3rd IEEE Annual Communication Networks and Services Research
Conference (CNSR05), pp. 1-6, 2005.

[7]

Grokster official homepage. http://www.grokster.com.

[8]

Nouha Oualha, Jean Leneutre, Yves Roudier, Verifying Remote Data


Integrity in Peer-to-Peer Data Storage: A comprehensive survey of protocols,
Peer-to-Peer Networking Application (Springer), Vol. 4, pp. 1-11, October
2011.

[9]

Morpheus official homepage. http://www.morpheus.com.

[10]

Ritter,

Why

Gnutella

Can't

http://www.tch.org/gnutella.html.

176

Scale.

No,

Really,

[11]

J. Holliday, D. Agrawal, A. E. Abbadi, Partial Database Replication using


Epidemic Communication, Proceedings of the 22nd International Conference
on Distributed Computing Systems, IEEE Computer Society, Vienna, Austria,
pp. 485493, 2002.

[12]

A. Gupta, Lalit K. Awasthi, Peer-to-Peer Networks and Computation:


Current Trends and Future Perspectives, International Journal of Computing
and Informatics, Vol. 30(3), pp. 559594, 2011.

[13]

Press Release, Bertelsmann and Napster form Strategic Alliance, Napster,


Inc., Oct 2000, http://www.napster.com/pressroom/pr/001031.html.

[14]

SETI@home: Search for Extraterrestrial Intelligence at Home, Space Science


Laboratory,

University

of

California,

Berkley,

2002,

http://setiathome.ssl.berkeley.edu/
[15]

N. Oualha, Y. Roudier, Securing P2P Storage with a Self Organizing


Payment Scheme, 3rd international workshop on autonomous and
spontaneous security (SETOP 2010), Athens, Greece, September 2010.

[16]

N. Oualha, Security and Cooperation for Peer-to-Peer Data Storage, PhD


Thesis, EURECOM/Telecom ParisTech, June, 2009.

[17]

D.S. Milojicic, V. Kalogeraki, R. Lukose, Peer-to-Peer Computing, Tech


Report: HPL-2002-57, http://www.hpl.hp.com/techreports/2002/HPL-200257.pdf.

[18]

Kai Guo, Zhijng Liu, A New Efficient Hierarchical Distributed P2P


Clustering Algorithm, Proceedings of the IEEE Fifth International
Conference on Fuzzy Systems and Knowledge Discovery, pp. 352-355, 2008.

[19]

Gareth Tyson, Andreas Mauthe, Sebastian Kaune, Mu Mu, Thomas


Plagemann Corelli, A Dynamic Replication Service for Supporting LatencyDependent Content in Community Networks, Proceedings of the 16th
ACM/SPIE Multimedia Computing and Networking Conference (MMCN),
San Jose, CA, 2009.

[20]

Ian Taylor, Triana Generations, Proceedings of the Second IEEE


International Conference on e-Science and Grid Computing (e-Science'06),
pp.1-8, 2006.

[21]

B. Yang, H. Garcia-Molina, Improving Search in Peer-to-Peer Networks,


Proceedings of the 22nd International Conference on Distributed Computing
Systems (ICDCS02), IEEE Computer Society, pp. 5, 2002.
177

[22]

Lijiang Chen, Bin Cui, Hua Lu, Linhao Xu, Quanqing Xu, iSky, Efficient and
Progressive Skyline Computing in a Structured P2P Network, Proceedings of
the IEEE 28th International Conference on Distributed Computing Systems,
pp. 160-169, 2008.

[23]

Lionel M. Ni, Yunhao Liu, Efficient Peer-to-Peer Overlay Construction,


Proceedings of the IEEE International Conference on E-Commerce
Technology for Dynamic E-Business (CEC-East04) 2004.

[24]

Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris Ion


Stoica, Looking Up Data in P2P Systems, Communications of the ACM,
Vol. 46(2), pp. 43-48, February 2003.

[25]

David G. Andersen, Hari Balakrishnan, M. Frans Kaashoek, Robert Moms,


Resilient Overlay Networks, Proceedings of 18th ACM SOSP, Banff,
Canada, October, 2001.

[26]

S. Sen, J. Wang, Analyzing Peer-to-peer Traffic across Large Networks,


Proceedings of ACM SIGCOMM Internet Measurement Workshop, France,
2002.

[27]

Freenet projects Official Website http://freenetproject.org/index.html.

[28]

Yoram Kulbak, Danny Bickson, The eMule Protocol Specification,


Technical Report, DANSS (Distributed Algorithms, Networking and Secure
Systems) Lab, School of Computer Science and Engineering, The Hebrew,
University

of

Jerusalem,

Jerusalem,

pp.

1-67,

January,

2005

http://www.cs.huji.ac.il/labs/danss/presentations/emule.pdf
[29]

Rudiger Schollmeier, A Definition of Peer-to-Peer Networking for the


Classification of Peer-to-Peer Architecture and Applications, Proceedings of
the First International Conference on Peer-to-Peer Computing (P2P.01),
Sweden, pp. 101-102, August 27-29, 2001.

[30]

S. Saroiu, K. P.Gummadi, R. J. Dunn, S. D. Gribble, H. M. Levy, An


Analysis of Internet Content Delivery Systems, Proceedings of the 5th
Symposium on Operating Systems Design and Implementation, Boston,
Massachusetts, USA, 2002.

[31]

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari


Balakrishnan, Chord: A Scalable Peer-to-Peer Lookup Service for Internet
Applications, Proceedings of the ACM SIGCOMM 2001, San Diego, CA, pp.
149-160, August 2001.
178

[32]

HC. Kim, P2P overview, Technical Report, Korea Advanced Institute of


Technology, August 2001.

[33]

S. Ratnasamy, P. Francis, M. Handley, R. Karp, S. Shenker, A Scalable


Content-Addressable Network, Proceedings of the ACM SIGCOMM, pp.
161-172, 2001.

[34]

A. Rowstron, P. Druschel, Pastry: Scalable, Distributed Object Location and


Routing for Large-Scale Peer-to-Peer Systems, Proceedings of, International
Conference on Distributed Systems Platforms (Middleware), pp. 329-350,
2001.

[35]

Ben Y. Zhao, John D. Kubiatowicz, Anthony D. Joseph, Tapestry: An


Infrastructure for Fault-tolerant Wide-area Location and Routing, U. C.
Berkeley Technical Report CB//CSD-01-1141, April 2000.

[36]

X. Shen, H. Yu, J. Buford, M. Akon, Handbook of Peer-to-Peer Networking,


(1st ed.), New York, Springer, pp. 118, 2010, ISBN 0387097503.

[37]

B. Zheng, WC Lee, DL Lee, On semantic caching and Query Scheduling for


Mobile Nearest-Neighbor Search, Wireless Networks, Vol.10(6), pp. 653-664,
2004.

[38]

Kin Wah Kwong, Danny H.K., A Congestion Aware Search Protocol for
Unstructured Peer-to-Peer Networks, Proceedings of the, LNCS 3358, pp.
319-329, 2004.

[39]

Dr. Ing, HiPeer, An Evolutionary Approach to, P2P Systems, PhD Thesis,
Berlin, 2006.

[40]

Lalit Kumar, Manoj Misra, Ramesh Chander Joshi, Low Overhead Optimal
Check Pointing for Mobile Distributed Systems, 19th IEEE International
Conference on Data Engineering, pp. 686 688, March 2003.

[41]

D. Agrawal, A. E. Abbadi, An Efficient and Fault-Tolerant Solution for


Distributed Mutual Exclusion, ACM Transactions on Computer Systems, Vol.
9(1), pp. 120, 1991.

[42]

Aaron J. Elmore, Sudipto Das, Divyakant Agrawal, Amr El Abbadi,


InfoPuzzle: Exploring Group Decision Making in Mobile Peer-to-Peer
Databases, PVLDB, Vol. 5(12), 1998-2001, 2012.

[43]

Ryan Huebsch, Joseph M. Hellerstein, Nick Lanham, Boon Thau Loo, Scott
Shenker, Ion Stoica. Querying the Internet with PIER, Proceedings of the
29th VLDB Conference, Berlin, Germany, 2003.
179

[44]

Choudhary Suryakant, Dincturk Mustafa Emre, V. Bochmann Gregor, Jourdan


Guy-Vincent, Onut Iosif-Viorel Viorel, Ionescu Paul, Solving Some
Modeling Challenges when Testing Rich Internet Applications for Security,
Proceedings of Fifth International Conference on Software Testing,
Verification and Validation (ICST), University of Ottawa, ON, Canada, pp.
850 857, 17-21, April 2012.

[45]

K. Ramesh, T. Ramesh, Domain-Specific Modeling and Synthesis of


Distributed Networked Systems, International Journal of Computer Science
and Communication Vol. 2(2), pp. 485-495, July-December 2011.

[46]

S. Lin, Q. Lian, M. Chen, Z. Zhang, A Practical Distributed Mutual


Exclusion Protocol in Dynamic Peer-to-Peer Systems, Proceedings of the 3nd
International Workshop on Peer-to-Peer Systems, 2004.

[47]

G. Koloniari, N. Kremmidas, K. Lillis, P. Skyvalidas, E. Pitoura, Overlay


Networks and Query Processing: A Survey, Technical Report TR2006-08,
Computer Science Department, University of Ioannina, October 2006.

[48]

May Mar Oo, The Soe, Aye Thida, Fault Tolerance by Replication of
Distributed Database in P2P System using Agent Approach, International
Journal of Computers, Vol. 4(1), pp. 9-18, 2010.

[49]

Erwan Le Merrer, Anne-Marie Kermarrec, Laurent Massouli, Peer-to-Peer


Size Estimation in Large and Dynamic Networks: A Comparative Study,
Proceedings of 15th IEEE International Symposium on High Performance
Distributed Computing (HPDC), pp. 7-17, 19-23 June 2006.

[50]

Runfang Zhou, Kai Hwang, Min Cai, Gossip Trust for Fast Reputation
Aggregation in Peer-to-Peer Networks, IEEE Transactions on Knowledge
And Data Engineering, Vol. 20(9), pp. 1282-1295, September, 2008.

[51]

X.Y. Yang, P. Hernandez, F. Cores, Distributed P2P Merging Policy to


Decentralize the Multicasting Delivery, Proceedings of the IEEE
EUROMICRO

Conference

on

Software

Engineering

and

Advanced

Applications (EUROMICRO-SEAA05), pp. 1-8, 2005.


[52]

S. Ktari, M. Zoubert, A. Hecker, H. Labiod, Performance Evaluation of


Replication Strategies in DHTs under Churn, Proceedings 6th International
Conference on Mobile and Ubiquitous Multimedia MUM 07. New York, NY,
USA: ACM, pp. 9097, 2007.

[53]

Napster Website[ EB/ OL] http://www. napster.com


180

[54]

S Saroiu, P. Gummadi, S. Gribble, A Measurement Study of Peer-to-Peer


File Sharing Systems, Proceedings of Multimedia Computing and
Networking (MMCN02), 2002.

[55]

R Bhagwan, S Savage, G Voelker, Understanding Availability, Proceedings


of the 2nd International Workshop on Peer-to-Peer Systems (IPTPS '03),
Berkeley, CA, USA, pp. 1-11, February 2003.

[56]

R Bhagwan, S Savage, G Voelker, Replication Strategies for Highly


Available Peer-to-Peer Storage Systems, Proceedings of FuDiCo: Future
directions in Distributed Computing, June, 2002.

[57]

Osrael J, Froihofer L, Chlaupek N, Goeschka KM, Availability and


Performance of the Adaptive Voting Replication, Proceedings of
International Conference on Availability, Reliability and Security (ARES),
Vienna, Austria, pp. 5360, 2007.

[58]

Najme Mansouri, Gholam Hosein Dastghaibyfard, Ehsan Mansouri,


Combination of Data Replication and Scheduling Algorithm for Improving
Data Availability in Data Grids, Journal of Network and Computer
Applications, Vol. 36(2), pp. 711-722, March 2013.

[59]

R. Kavitha, A. Iamnitchi, I. Foster, Improving Data Availability through


Dynamic Model Driven Replication in Large Peer-to-Peer Communities,
Proceedings of Global and Peer-to-Peer Computing on Large Scale
Distributed Systems Workshop, Berlin, Germany, May 2002.

[60]

Heng Tao Shen, Yanfeng Shu, Bei Yu, Efficient Semantic-Based Content
Search in P2P Network, IEEE Transactions on Knowledge And Data
Engineering, Vol. 16(7), pp. 813- 826, July 2004.

[61]

Hung-Chang Hsiao, Hao Liao, Po-Shen Yeh, A Near-Optimal Algorithm


Attacking the Topology Mismatch Problem in Unstructured Peer-to-Peer
Networks, IEEE Transactions on Parallel and Distributed Systems, Vol. 21(7),
pp. 983-997, July 2010.

[62]

Z. Xu, C. Tang, Z. Zhang, Building Topology-Aware Overlays Using Global


Soft-State, Proceedings of the 23rd International Conference on Distributed
Computing Systems (ICDCS), RI, USA, 2003.

[63]

D Saha, S Rangarajan, SK Tripathi, An Analysis of the Average Message


Overhead in Replica Control Protocols, IEEE Transactions Parallel
Distributed Systems, Vol. 7(10), pp. 10261034, 1996.
181

[64]

Yao-Nan Lien, Hong-Qi Xu, A UDP Based Protocol for Distributed P2P File
Sharing, Eighth International Symposium on Autonomous Decentralized
Systems (ISADS'07), pp. 1-7, 2007.

[65]

D.P. Vidyarthi, B.K. Sarker, A.K. Tripathi, L.T. Yang, Scheduling in


Distributed Computing Systems: Analysis, Design and Models, Book,
Springer-Verlag, ISBN-10: 0387744800, 1st Edition, 2008.

[66]

A. El. Abbadi, S.Toueg, Maintaining Availability in Partitioned Replicated


Databases, ACM Transactions on Database Systems, Vol. 14(2), pp. 264-290,
1989.

[67]

Gnutella Website[ EB/ OL] . http://www.gnutella.com

[68]

Hung-Chang Hsiao, Hao Liao, Cheng-Chyun Huang, Resolving the


Topology Mismatch Problem in Unstructured Peer-to-Peer Networks, IEEE
Transactions on Parallel and Distributed Systems, Vol. 20(11), pp. 1668-1681,
November 2009.

[69]

Jing Tian, Zhi Yang, Yafei Dai, A Data Placement Scheme with TimeRelated Model for P2P Storages, Proceedings of the Seventh IEEE
International Conference on Peer-to-Peer Computing, pp. 151-158, 2007.

[70]

R. Dunaytsev, D. Moltchanov, Y. Koucheryavy, O. Strandberg, H. Flinck A


Survey of P2P Traffic Management Approaches: Best Practices and Future
Directions, Journal of Internet Engineering, Vol. 5(1), June 2012.

[71]

E. Cohen, S. Shenker, Replication Strategies in Unstructured Peer-to-Peer


Networks, Proceedings of ACM SIGCOMM02, Pittsburgh, USA, Aug. 2002.

[72]

H. Lamehamedi, Z. Shentu, B. Szymanski, E. Deelman, Simulation of


Dynamic Data Replication Strategies in Data Grids, Proceedings of the 17th
International Parallel and Distributed Processing Symposium, IEEE Computer
Society, Nice France, 2003.

[73]

Q. Lv, P. Cao, E. Cohen, K. Li, S. Shenker, Search and Replication in


Unstructured Peer-to-Peer Networks, Proceedings of the 16th annual ACM
International Conference on Supercomputing (ICS02), New York, USA, June
2002.

[74]

S. Jamin, C. Jin, T. Kurc, D. Raz, Y. Shavitt, Constrained Mirror Placement


on the Internet, Proceedings of the IEEE INFOCOM Conference, Alaska,
USA, pp. 1369 1382, 2001.

182

[75]

A. Vigneron, L. Gao, M. Golin, G. Italiano, B. Li, An Algorithm for Finding


a k-median in a Directed Tree, Information Processing Letters, Vol. 74(1, 2),
pp. 81-88, April 2000.

[76]

Anna Saro Vijendran, S.Thavamani, Analysis Study on Caching and Replica


Placement Algorithm for Content Distribution in Distributed Computing
Networks, International Journal of Peer to Peer Networks (IJP2P), Vol. 3(6),
pp. 13-21, November 2012.

[77]

Francis Otto, Drake Patrick Mirenbe, A Model for Data Management in Peerto-Peer Systems, International Journal of Computing and ICT Research, Vol.
1(2), pp. 67-73, December 2007.

[78]

Houda Lamehamedi, Boleslaw Szymanski, Zujun Shentu, Ewa Deelman,


Data Replication Strategies in Grid Environments, Proceedings of the 5th
IEEE International Conference on Algorithms and Architecture for Parallel
Processing, ICA3PP'2002, Bejing, China, pp. 378-383, October 2002.

[79]

S. Misra, N. Wicramasinghe, Security of a Mobile Transaction: A Trust


Model, Electronic Commerce Research / Kluwer Academic Publishers, Vol.
4(4), 359-372, 2004.

[80]

P. Shvaiko, J. Euzenat, A Survey of Schema-Based Matching Approaches,


Journal on Data Semantics, Springer, Heidelberg, LNCS, Vol. 3730, pp. 146
171, 2005.

[81]

HK Tripathy, BK Tripathy, K Pradip, An Intelligent Approach of Rough Set


in Knowledge Discovery Databases, International Journal of Computer
Science and Engineering, Vol. 2 (1), pp. 45-48, 2007.

[82]

Alireza Poordavoodi, Mohammadreza Khayyambashi, Jafar Haminusing,


Replicated Data to Reduce Backup Cost in Distributed Databases, Journal of
Theoretical and Applied Information Technology (JATIT), pp. 23-29, 2010.

[83]

Lin Wujuan, Bharadwaj Veeravalli, An Object Replication Algorithm for


Real-Time Distributed Databases, Distributed Parallel Databases, MA, USA,
Vol. 19, pp. 125146, 2006.

[84]

T. Loukopoulo, I. Ahmad, Static and Adaptive Distributed Data Replication


using Genetic Algorithms, Journal of Parallel and Distributed Computing,
Vol. 64(11), pp. 12701285, 2004.

183

[85]

Amita Mittal, M.C. Govil, Concurrency Control Design Protocol in Real


Time Distributed Databases, Proceedings of Emerging Trends in Computing
& Communication (ETCC07), NIT Hamirpur, pp. 155-160, July 2007.

[86]

A. Kumar, A. Segev, Cost and Availability Tradeoffs in Replicated Data


Concurrency Control, ACM Transactions on Database Systems (TODS), Vol.
18(1), pp. 102131, 1993.

[87]

J. Huang, J. A. Stankovic, K. Ramamritham, D. Towsley, On using Priority


Inheritance in Real-Time Databases, Proceedings of the 12th IEEE Real-Time
Systems Symposium, IEEE Computer Society Press, San Antonio. Texas.
USA, pp. 210-221, 1991.

[88]

P. S. Yu, K.-L. Wu, K.-J. Lin, S. H. Son, On Real-Time Databases:


Concurrency Control and Scheduling, Proceedings of the IEEE, Vol. 82(1),
pp. 14-15, January 1994.

[89]

Navdeep Kaur, Rajwinder Singh, Manoj Misra, A. K.Sarje, A Feedback


Based Secure Concurrency Control For MLS Distributed Database,
International Conference on Computational Intelligence and Multimedia
Applications 2007.

[90]

Y. Chu, S. G. Rao, H. Zhang, A Case for End System Multicast,


Proceedings of ACM SIGMETRICS, Santa Clara, California. June, 2000.

[91]

B. Krishnamurthy, J. Wang, Topology Modeling via Cluster Graphs,


Proceedings of the SIGCOMM Internet Measurement Workshop, San
Francisco, USA, November, 2001.

[92] V. N. Padmanabhan, L. Subramanian, An Investigation of Geographic Mapping Techniques for Internet Hosts, Proceedings of ACM SIGCOMM, San Diego, California, USA, 2001.

[93] Z. Xu, C. Tang, Z. Zhang, Building Topology-Aware Overlays Using Global Soft-State, Proceedings of the 23rd International Conference on Distributed Computing Systems (ICDCS), RI, USA, 2003.

[94] Y. Chawathe, S. Ratnasamy, L. Breslau, N. Lanham, S. Shenker, Making Gnutella-Like P2P Systems Scalable, Proceedings of ACM SIGCOMM, Miami, Florida, USA, 2003.

[95] Y. Liu, L. Xiao, L. M. Ni, Building a Scalable Bipartite P2P Overlay Network, IEEE Transactions on Parallel and Distributed Systems (TPDS), Vol. 18(9), pp. 1296-1306, September 2007.

[96] L. Xiao, Y. Liu, L. M. Ni, Improving Unstructured Peer-to-Peer Systems by Adaptive Connection Establishment, IEEE Transactions on Computers, 2005.

[97] NTP: The Network Time Protocol, http://www.ntp.org/, 2007.

[98] Yunhao Liu, A Two-Hop Solution to Solving Topology Mismatch, IEEE Transactions on Parallel and Distributed Systems, Vol. 19(11), pp. 1591-1600, November 2008.

[99] G. Sushant, R. Buyya, Data Replication Strategies in Wide-Area Distributed Systems, Chapter IX of Enterprise Service Computing: From Concept to Deployment, IGI Global, pp. 211-241, 2006.

[100] Ashraf Ahmed, P. D. D. Dominic, Azween Abdullah, Hamidah Ibrahim, A New Optimistic Replication Strategy for Large-Scale Mobile Distributed Database Systems, International Journal of Database Management Systems (IJDMS), Vol. 2(4), pp. 86-105, November 2010.
[101] R. Meersman, Z. Tari et al., An Adaptive Probabilistic Replication Method for Unstructured P2P Networks, Springer-Verlag Berlin Heidelberg, LNCS 4275, pp. 480-497, 2006.
[102] K. Sashi, Antony Selvadoss Thanamani, Dynamic Replica Management for
Data Grid, IACSIT International Journal of Engineering and Technology,
Vol. 2(4), pp. 329-333, August 2010.
[103] Kubiatowicz, J., OceanStore: An Architecture for Global-Scale Persistent Storage, Proceedings of the International Conference on Architectural Support for Programming Languages and Operating Systems (ASPLOS), pp. 190-201, November 2002.
[104] Cuenca-Acuna, F. M., Martin, R. P., Nguyen, T. D., Autonomous Replication for High Availability in Unstructured P2P Systems, 22nd International Symposium on Reliable Distributed Systems (SRDS), 2003.
[105] Adya, A., Bolosky, W. J., Castro, M., Cermak, G., Chaiken, R., Douceur, J. R., Howell, J., Lorch, J. R., Theimer, M., Wattenhofer, R. P., Farsite: Federated, Available, and Reliable Storage for an Incompletely Trusted Environment, SIGOPS Oper. Syst. Rev., Vol. 36, New York, NY, USA, pp. 1-14, ACM Press, 2002.
[106] Adya, A., Bolosky, W. J., Castro, M., Cermak, G., Chaiken, R., Douceur, J. R., Howell, J., Lorch, J. R., Theimer, M., Wattenhofer, R. P., Feasibility of a Serverless Distributed File System Deployed on an Existing Set of Desktop PCs, OSDI, 2002.
[107] Douceur, J. R. and Wattenhofer, R. P., Large-Scale Simulation of Replica
Placement Algorithms for a Serverless Distributed File System, 9th
International Symposium on Modeling, Analysis and Simulation of Computer
and Telecommunication Systems (MASCOTS), 2001.
[108] Korupolu, M., Plaxton, G., Rajaraman, R., Placement Algorithms for Hierarchical Cooperative Caching, Journal of Algorithms, Vol. 38(1), pp. 260-302, January 2001.
[109] Krishnaswamy, V., Walther, D., Bhola, S., Bommaiah, E., Riley, G. F., Topol, B., Ahamad, M., Efficient Implementation of Java Remote Method Invocation (RMI), COOTS, pp. 19-27, 1998.
[110] Ko, B.-J., Rubenstein, D., Distributed, Self-Stabilizing Placement of
Replicated Resources in Emerging Networks, 11th IEEE International
Conference on Network Protocols (ICNP), November 2003.
[111] Qiu, L., Padmanabhan, V. N., Voelker, G. M., On the Placement of Web Server Replicas, Proceedings of INFOCOM'01, pp. 1587-1596, 2001.
[112] T. Loukopoulos, I. Ahmad, Static and Adaptive Distributed Data Replication Using Genetic Algorithms, Journal of Parallel and Distributed Computing, Vol. 64(11), pp. 1270-1285, 2004.
[113] P. Francis, S. Jamin, V. Paxson, L. Zhang, D. Gryniewicz, Y. Jin, An
Architecture for a Global Internet Host Distance Estimation Service, IEEE
INFOCOM '99 Conference, New York, NY, USA, pp. 210-217, 1999.
[114] Chapram Sudhakar, T. Ramesh, An Improved Lazy Release Consistency
Model, Journal of Computer Science, Vol. 5(11), pp. 778-782, 2009.
[115] M. Naor, U. Wieder, Scalable and Dynamic Quorum Systems, Proceedings
of the ACM Symposium on Principles of Distributed Computing, 2003.
[116] T. Loukopoulos, I. Ahmad, Static and Adaptive Data Replication Algorithms for Fast Information Access in Large Distributed Systems, Proceedings of the IEEE International Conference on Distributed Computing Systems, Taipei, Taiwan, pp. 385-392, 2000.
[117] S. Abdul-Wahid, R. Andonie, J. Lemley, J. Schwing, J. Widger, Adaptive Distributed Database Replication through Colonies of Pogo Ants, Parallel and Distributed Processing Symposium, IEEE International, California, USA, pp. 358, 2007.
[118] I. Gashi, P. Popov, L. Strigini, Fault Tolerance via Diversity for Off-the-Shelf Products: A Study with SQL Database Servers, IEEE Transactions on Dependable and Secure Computing, Vol. 4(4), pp. 280-294, 2007.
[119] C. Wang, F. Mueller, C. Engelmann, S. Scott, A Job Pause Service under LAM/MPI+BLCR for Transparent Fault Tolerance, Proceedings of the IEEE International Parallel and Distributed Processing Symposium, California, USA, pp. 1-10, 2007.
[120] Lin Wujuan, Veeravalli Bharadwaj, An Object Replication Algorithm for Real-Time Distributed Databases, Distributed and Parallel Databases, Vol. 19, pp. 125-146, 2006.
[121] A. Bonifati, E. Chang, T. Ho, L. V. Lakshmanan, R. Pottinger, Y. Chung, Schema Mapping and Query Translation in Heterogeneous P2P XML Databases, The VLDB Journal, Vol. 19(2), pp. 231-256, April 2010.
[122] Ricardo Jiménez-Peris, Gustavo Alonso, Bettina Kemme, M. Patiño-Martínez, Are Quorums an Alternative for Data Replication?, ACM Transactions on Database Systems, Vol. 28(3), pp. 257-294, 2003.
[123] J. Kangasharju, K.W. Ross, D. Turner, Optimal Content Replication in P2P
Communities, Manuscript, 2002.
[124] Jiafu Hu, Nong Xiao, Yingjie Zhao, Wei Fu, An Asynchronous Replica Consistency Model in Data Grid, ISPA Workshops, LNCS 3759, Nanjing, China, pp. 475-484, 2005.
[125] Raddad Al King, H. Abdelkader, M. Franck, Query Routing and Processing in Peer-to-Peer Data Sharing Systems, International Journal of Database Management Systems (IJDMS), Vol. 2(2), pp. 116-139, 2010.
[126] R. H. Thomas, A Majority Consensus Approach to Concurrency Control for Multiple Copy Databases, ACM Transactions on Database Systems, Vol. 4(2), pp. 180-209, 1979.
[127] Ananth Rao, Karthik Lakshminarayan, Sonesh Surana, Richard Karp, Ion Stoica, Load Balancing in Structured P2P Systems, Proceedings of ACM, Vol. 63(3), pp. 217-240, 2006.


[128] Ada Wai-Chee Fu, Yat Sheung Wong, Man Hon Wong, Diamond Quorum Consensus for High Capacity and Efficiency in a Replicated Database System, Distributed and Parallel Databases, Vol. 8, pp. 471-492, 2000.
[129] A. Sleit, W. Al Mobaideen, S. Al-Areqi, A. Yahya, A Dynamic Object Fragmentation and Replication Algorithm in Distributed Database Systems, American Journal of Applied Sciences, Vol. 4(8), pp. 613-618, 2007.
[130] D. Agrawal, A. El Abbadi, The Generalized Tree Quorum Protocol: An Efficient Approach for Managing Replicated Data, ACM Transactions on Database Systems, Vol. 17(4), pp. 689-717, 1992.
[131] David Del Vecchio, Sang H. Son, Flexible Update Management in Peer-to-Peer Database Systems, Proceedings of the 9th International Database Engineering and Application Symposium, VA, USA, pp. 435-444, July 2005.
[132] Hidehisa Takamizawa, Kazuhiro Saji, A Replica Management Protocol in a Binary Balanced Tree Structure-Based P2P Network, Journal of Computers, Vol. 4(7), pp. 631-640, 2009.
[133] Ahmad N., Abdalla A. N., Sidek R. M., Data Replication Using Read-One-Write-All Monitoring Synchronization Transaction System in Distributed Environment, Journal of Computer Science, Vol. 6(10), pp. 1066-1069, 2010.


[134] P. A. Bernstein, N. Goodman, An Algorithm for Concurrency Control and Recovery in Replicated Distributed Databases, ACM Transactions on Database Systems, Vol. 9(4), pp. 596-615, 1984.
[135] Jajodia, S., Mutchler, D., Integrating Static and Dynamic Voting Protocols to
Enhance File Availability, Fourth International Conference on Data
Engineering, IEEE, pp. 144-153, New York, 1988.
[136] B. Silaghi, P. Keleher, B. Bhattacharjee, Multi-Dimensional Quorum Sets for
Read-Few Write-Many Replica Control Protocols, Proceedings of the 4th
International Workshop on Global and Peer-to-Peer Computing, 2004.
[137] Latip R., Ibrahim H., Othman M., Sulaiman M. N., Abdullah A., Quorum Based Data Replication in Grid Environment, Rough Sets and Knowledge Technology, LNCS, pp. 379-386, 2008.
[138] Oprea F., Reiter M. K., Minimizing Response Time for Quorum-System Protocols over Wide-Area Networks, International Conference on Dependable Systems and Networks (DSN), pp. 409-418, Edinburgh, UK, 2007.
[139] Sawai Y., Shinohara M., Kanzaki A., Hara T., Nishio S., Consistency Management Among Replicas Using a Quorum System in Ad Hoc Networks, International Conference on Mobile Data Management (MDM), pp. 128-132, Nara, Japan, 2006.
[140] Frain I., Mzoughi A., Bahsoun J. P., How to Achieve High Throughput with Dynamic Tree-Structured Coterie, International Symposium on Parallel and Distributed Computing (ISPDC), pp. 82-89, Timisoara, Romania, 2006.
[141] Osrael J., Froihofer L., Chlaupek N., Goeschka K. M., Availability and Performance of the Adaptive Voting Replication, International Conference on Availability, Reliability and Security (ARES), pp. 53-60, Vienna, Austria, 2007.
[142] S. Cheung, M. Ammar, M. Ahamad, The Grid Protocol: A High Performance Scheme for Maintaining Replicated Data, IEEE Sixth International Conference on Data Engineering, pp. 438-445, Los Angeles, CA, USA, 1990.
[143] Storm C., Theel O., A General Approach to Analyzing Quorum-Based Heterogeneous Dynamic Data Replication Schemes, 10th International Conference on Distributed Computing and Networking, Hyderabad, India, pp. 349-361, 2009.
[144] Kevin Henry, Colleen Swanson, Qi Xie, Khuzaima Daudjee, Efficient Hierarchical Quorums in Unstructured Peer-to-Peer Networks, OTM 2009, Part I, LNCS 5870, pp. 183-200, 2009.
[145] A. Kumar, Hierarchical Quorum Consensus: A New Algorithm for Managing
Replicated Data, IEEE Transactions on Computers, Vol. 40(9), pp. 996-1004,
1991.
[146] Dongming Huang, Zong Hu, Research of Replication Mechanism in P2P Network, WSEAS Transactions on Computers, Vol. 8(12), pp. 1845-1854, December 2009.
[147] Nabor das Chagas Mendonça, Using Extended Hierarchical Quorum Consensus to Control Replicated Data: From Traditional Voting to Logical Structures, Proceedings of the 27th Annual Hawaii International Conference on System Sciences, Minitrack on Parallel and Distributed Databases, Maui, pp. 303-312, 1993.

[148] Sang-Min Park, Jai-Hoon Kim, Young-Bae Ko, Won-Sik Yoon, Dynamic Data Replication Strategy Based on Internet Hierarchy BHR, Springer-Verlag Heidelberg, LNCS 3033, pp. 838-846, 2004.
[149] A. Horri, R. Sepahvand, Gh. Dastghaibyfard, A Hierarchical Scheduling and Replication Strategy, International Journal of Computer Science and Network Security, Vol. 8, pp. 30-35, 2008.
[150] Tang M., Lee B., Tang X., Yeo C., The Impact of Data Replication on Job Scheduling Performance in the Data Grid, Future Generation Computer Systems, Vol. 22(3), 2006.
[151] Kavitha, R., A. Iamnitchi, I. Foster, Improving Data Availability through Dynamic Model Driven Replication in Large Peer-to-Peer Communities, Global and Peer-to-Peer Computing on Large Scale Distributed Systems Workshop, Berlin, Germany, 2002.
[152] K. Ranganathan, I. Foster, Design and Evaluation of Replication Strategies for a High Performance Data Grid, International Conference on Computing in High Energy and Nuclear Physics (CHEP'01), Beijing, China, 2001.
[153] D. Malkhi, M. K. Reiter, A. Wool, Probabilistic Quorum Systems, Information and Computation, Vol. 170(2), pp. 184-206, 2001.
[154] Abraham Silberschatz, Henry F. Korth, S. Sudarshan, Database System
Concepts, McGraw-Hill Computer Science Series, International Student
Edition, 2005.
[155] Udai Shanker, Manoj Misra, Anil K. Sarje, Distributed Real Time Database Systems: Background and Literature Review, Distributed and Parallel Databases, Vol. 23, pp. 127-149, 2008.
[156] Arvind Kumar, Rama Shankar Yadav, Ranvijay, Anjali Jain, Fault Tolerance in Real Time Distributed System, International Journal on Computer Science and Engineering, Vol. 3(2), pp. 933-939, 2011.
[157] Amr El Abbadi, Mohamed F. Mokbel, Social Networks and Mobility in the Cloud, PVLDB, Vol. 5(12), pp. 2034-2035, 2012.
[158] K.Y. Lam, Tei-Wei Kuo, Real-Time Database Systems: Architecture and
Techniques, Kluwer Academic Publishers, 2001.


[159] A. Datta, M. Hauswirth, K. Aberer, Updates in Highly Unreliable, Replicated Peer-to-Peer Systems, Proceedings of the 23rd International Conference on Distributed Computing Systems, 2003.
[160] J. Wang, K. Lam, S. H. Son, A. Mok, An Effective Fixed Priority Co-Scheduling Algorithm for Periodic Update and Application Transactions, Springer, Computing, pp. 184-202, November 2012.
[161] T. Hara, M. Nakadori, W. Uchida, K. Maeda, S. Nishio, Update Propagation Based on Tree Structure in Peer-to-Peer Networks, Proceedings of the ACS/IEEE International Conference on Computer Systems and Applications (AICCSA'05), pp. 40-47, 2005.
[162] A. Datta, M. Hauswirth, K. Aberer, Updates in Highly Unreliable, Replicated Peer-to-Peer Systems, Proceedings of the 23rd International Conference on Distributed Computing Systems (ICDCS'03), pp. 76, 2003.
[163] Gupta, R., Haritsa, J., Ramamritham, K., Seshadri, S., Commit Processing in Distributed Real Time Database Systems, Proceedings of the Real-Time Systems Symposium, Washington DC, IEEE Computer Society Press, San Francisco, 1998.
[164] Jayant R. Haritsa, Michael J. Carey, Miron Livny, Data Access Scheduling in
Firm Real-Time Database Systems, The Journal of Real-Time Systems, Vol.
4(3), 1992.
[165] K.-W. Lam, K.-Y. Lam, S. Hung, Real-Time Optimistic Concurrency
Control Protocol with Dynamic Adjustment of Serialization Order,
Proceedings of IEEE Real-Time Technology and Application Symposium,
Chicago, Illinois, pp. 174-179, May 1995.
[166] S. Gribble, A. Halevy, Z. Ives, M. Rodrig, D. Suciu, What can Databases do
for Peer-to-Peer?, Proceedings of the Fourth International Workshop on the
Web and Databases (WebDB 2001), June 2001.
[167] L. Gong, JXTA: A Network Programming Environment, IEEE Internet Computing, Vol. 5(3), pp. 88-95, 2001.
[168] Wolfgang Nejdl, Wolf Siberski, Michael Sintek, Design Issues and Challenges for RDF- and Schema-Based Peer-to-Peer Systems, ACM SIGMOD Record, Vol. 32(3), pp. 41-46, 2003.
[169] Wolfgang Nejdl, Martin Wolpers, Wolf Siberski, Christoph Schmitz, Mario Schlosser, Ingo Brunkhorst, Alexander Loser, Super-Peer Based Routing and Clustering Strategies for RDF-Based Peer-to-Peer Networks, Proceedings of the 12th International Conference on World Wide Web, New York, NY, USA, pp. 536-543, 2003.
[170] Mario T. Schlosser, Michael Sintek, Stefan Decker, Wolfgang Nejdl, HyperCuP - Hypercubes, Ontologies, and Efficient Search on Peer-to-Peer Networks, Proceedings of the First International Workshop on Agents and Peer-to-Peer Computing, Volume 2530 of LNCS, Springer, pp. 112-124, 2002.
[171] R. Akbarinia, V. Martins, E. Pacitti, P. Valduriez, Global Data Management (Chapter: Design and Implementation of Atlas P2P Architecture), 1st Edition, IOS Press, July 2006.
[172] R. Akbarinia, V. Martins, E. Pacitti, P. Valduriez, Top-k Query Processing in the APPA P2P System, Proceedings of the International Conference on High Performance Computing for Computational Science (VecPar), Rio de Janeiro, Brazil, July 2006.
[173] Igor Tatarinov, Zachary Ives, Jayant Madhavan, Alon Halevy, Dan Suciu, Nilesh Dalvi, Xin Dong, Yana Kadiyska, Gerome Miklau, Peter Mork, The Piazza Peer Data Management Project, ACM SIGMOD Record, September 2003.
[174] Seth Gilbert, Nancy Lynch, Brewer's Conjecture and the Feasibility of Consistent, Available, Partition-Tolerant Web Services, ACM SIGACT News, Vol. 33(2), pp. 51-59, 2002.
[175] Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, Aoying Zhou, PeerDB: A P2P-based System for Distributed Data Sharing, Proceedings of the 19th IEEE International Conference on Data Engineering, IEEE Computer Society, pp. 633-644, 2003.
[176] Wee Siong Ng, Beng Chin Ooi, Kian-Lee Tan, BestPeer: A Self-Configurable Peer-to-Peer System, Proceedings of the 18th IEEE International Conference on Data Engineering, San Jose, CA, IEEE Computer Society, pp. 272, 26 February - 1 March 2002.
[177] B.K. Sarker, A.K. Tripathi, D.P. Vidyarthi, Kuniaki Uehara, A Performance
Study of Task Allocation Algorithms in a Distributed Computing System
(DCS), IEICE Transactions on Information and Systems, Vol. 86(9), pp.
1611-1619, 2003.

[178] Anirban Mondal, Masaru Kitsuregawa, Open Issues for Effective Dynamic
Replication in Wide-Area Network Environments, Peer-to-Peer Networking
and Applications, Vol. 2(3), pp. 230-251, 2009.
[179] Gerome Miklau, Dan Suciu, Controlling Access to Published Data Using Cryptography, Proceedings of the Very Large Databases Conference, USA, pp. 898-909, September 2003.
[180] R. B. Patel, Vishal Garg, Resource Management in Peer-to-Peer Networks: NADSE Network Model, Proceedings of the 2nd International Conference on Methods and Models in Science and Technology (ICM2ST-11), Jaipur, India, pp. 159-164, 19-20 November 2011.
[181] Alfred Loo, Distributed Multiple Selection Algorithm for Peer-to-Peer
Systems, Journal of Systems and Software, Vol. 78, pp. 234-248, 2005.
