
Available online at www.sciencedirect.com

Information and Software Technology 51 (2009) 83–97


www.elsevier.com/locate/infsof

Evaluating the validity of data instances against ontology evolution over the Semantic Web
Li Qin a,*, Vijayalakshmi Atluri b
a Department of Information Systems and Decision Sciences, Fairleigh Dickinson University, 1000 River Road, Teaneck, NJ 07666, USA
b CIMIC and MSIS Department, Rutgers University, 180 University Avenue, Newark, NJ 07102, USA

Received 21 September 2006; received in revised form 16 January 2008; accepted 16 January 2008
Available online 26 January 2008

Abstract

It is natural for ontologies to evolve over time. These changes could be at structural and semantic levels. Due to changes to an ontology, its data instances may become invalid, and as a result, may become non-interpretable. In this paper, we address precisely this problem, validity of data instances due to ontological evolution. Towards this end, we make the following three novel contributions to the area of Semantic Web. First, we propose formal notions of structural validity and semantic validity of data instances, and then present approaches to ensure them. Second, we propose semantic view as part of an ontology, and demonstrate that it is sufficient to validate a data instance against the semantic view rather than the entire ontology. We discuss how the semantic view can be generated through an implication analysis, i.e., how semantic changes to one component imply semantic changes to other components in the ontology. Third, we propose a validity identification approach that employs locally maintaining a hash value of the semantic view at the data instance.
© 2008 Elsevier B.V. All rights reserved.

Keywords: Ontology evolution; Data validity; Semantic Web

* Corresponding author. Tel.: +1 4137821715.
E-mail addresses: liqin@fdu.edu (L. Qin), atluri@cimic.rutgers.edu (V. Atluri).

0950-5849/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.infsof.2008.01.004

1. Introduction

Ontologies are formal, explicit specifications of shared conceptualizations of a given domain of discourse [3]. Generally, an ontology for a domain contains a description of important concepts, properties of each concept as well as restrictions and axioms upon properties. Ontologies provide interpretation to the contents of the Semantic Web data. Specifically, ontologies define and relate concepts used to describe the web data, which are nothing but instances of the concepts in ontologies.

Ontologies evolve over time since the targeted domain changes with time. Changes can also be with respect to how the domain is conceptualized, or the same conceptualization is specified in a different way. Ontology evolution has become an important research area because ontology development is ubiquitous and ontologies have become an integral part of many applications [12]. We classify the changes to ontologies into two levels: structural and semantic. Structural changes include changes abstracted at the data model such as adding/removing concepts, adding/removing restrictions over properties, etc. Note that each structural change is limited to a specific component in the ontology. Semantic changes are changes to the semantics presented by the ontology. While structural changes are visible changes to the ontology, semantic changes are not so obvious. These semantic changes are derived through evaluating visible structural changes. Furthermore, since the semantics of a concept is defined through its relationships with other concepts, semantic changes to other concepts may imply semantic changes to this concept. As a result, in addition to the explicit semantic changes brought by structural changes, implicit semantic changes can be derived based on the semantic dependency among concepts and properties. Research on detecting structural changes includes PromptDiff [14,17] and on detecting semantic changes includes SemDiff [21].

Understanding the semantic changes to an ontology is crucial to the functioning of its data instances and dependent ontologies. Ontologies present semantics for interpreting data instances. If an ontology is substituted by a new version, its data instances may become non-interpretable. This is because the ontology components they comply with may not be available in the new version, or the new version brings an incorrect interpretation to the data instances. Such a mismatch in the compliance may lead to mistaken decisions and actions. Secondly, a change to an ontology has far-reaching effects on its dependent ontologies, and through them on their data instances and their own dependent ontologies. When an ontology imports and extends another ontology for reusability and interoperability, it is semantically dependent on what is defined in the imported ontology. Therefore, if the imported ontology has changed, the dependent ontology may not provide valid meaning for interpreting its data instances. Since changes to an ontology are autonomous, its data instances and dependent ontologies may become invalidated. In this paper, we focus on the validity issue of data instances due to ontological evolution. Similar approaches can be applied to the validity of dependent ontologies as well.

While the author of a data instance needs to ensure the validity of the data instance, the consumers need to be protected from erroneous interpretation caused by ontology evolution. As an example, the state or local department of public health may publish guidelines and regulations on disease control. If these data are presented as semantic data instances, these government agencies will need to ensure that they are valid data instances. On the other hand, hospitals, clinics and doctors who are consumers of these data also need these data to be valid and interpretable in order to comply with the regulations and policies. Existing research is restricted to studying the impact of structural changes on data preservation [12], or the validity of subsumption relations [9]. Moreover, existing work discusses ontology-level compatibility by assuming the entire ontology affects the validity [4,5]. Therefore, there is a compelling need for a close examination of the validity of data instances when their ontologies evolve.

By structural validity against an ontology, we mean that data instances structurally comply with the ontology. A StructValid algorithm is used to evaluate the structural validity of a data instance. Our notion of semantic validity of data instances has structural validity as a prerequisite. Furthermore, the semantic validity of data instances determines whether these instances can be correctly interpreted through the ontology. For instance, if a concept in the ontology becomes more specialized, then instances which are semantically valid for the old version may not be correctly interpreted through the new version; in other words, they may not be semantically valid for the new version. Therefore, the semantic changes between two ontology versions affect the semantic validity of a data instance, and we propose a SemValid algorithm to evaluate it. While the structural validity is evaluated based on what is visible in an ontology, the semantic validity cannot be evaluated in this way. As a result, though the structural validity of a data instance can be evaluated with respect to an ontology, the semantic validity of a data instance is defined as a function of the changes between two versions. Given a semantically valid data instance, we would like to determine whether a set of ontology changes has changed the semantics of the data instance.

Since a data instance instantiates only part of an ontology, not every component in the ontology has an impact on its validity. On the other hand, if a semantic change to a component of the ontology implies semantic changes to the components instantiated by a data instance, then this semantic change will ultimately affect the semantic validity of this data instance. Therefore, in addition to the instantiated components, other components may be significant to the validity of the data instance as well. We propose semantic view as a subset of the ontology, and demonstrate that the semantic view rather than the entire ontology is responsible for the validity of a data instance. Our definition of semantic view is in line with ontology view [16] in that it is 'a portion of an ontology'. However, our proposed semantic view for a specific data instance is a 'view' of the ontology whose changes directly or indirectly affect its semantic validity. We show that the semantic view can be generated based on the implication analysis, which specifies how semantic changes to one component imply semantic changes to other components. We further propose a validity identification approach in which a data instance carries a hash value of its semantic view when it is created. By simply re-computing the hash value of the semantic view and comparing it with that stored by the data instance, one can deduce whether the ontology has changed silently in a way which may affect the semantic validity of the data instance.

This paper is organized as follows. Section 2 presents the preliminaries on ontologies and instances. Section 3 discusses the structural changes and semantic changes to ontologies. In Section 4, we present the implication analysis which is the key to identifying implicit semantic changes from explicit ones and generating the semantic view. Section 5 presents the structural and semantic validity of data instances, and algorithms to evaluate them. In Section 6, the semantic view along with its generation is presented. Implementation of our approach is discussed in Section 7. Related work on detecting changes to ontologies and the consequences of the ontology changes is presented in Section 8. Section 9 presents our conclusions and an insight into our future work.

2. Preliminaries

In this section, we present the preliminaries on ontologies and their instances.

2.1. Ontologies

A well-cited definition for an ontology is given by Gruber [3] as a "specification of a conceptualization". An ontology defined for a domain is a formal specification of important concepts and relationships in the domain that exist for software agents. An ontology, $o$, e.g. in OWL [18], generally consists of the following components. We use the ontology shown in Fig. 1 as an example to explain these components.

(1) Concepts: Each ontology $o$ contains a set of concepts, $C(o) = \{c_1, c_2, \ldots, c_n\}$. Fig. 1 represents an Intensive Care Unit (ICU) ontology, therefore, C(ICU ontology) = {ICU, person, health care personnel, patient, ...}.
(2) Properties: Each ontology contains a set of properties. If a property relates A to B, then A is called the domain of the property and B is called the range of the property. For all the properties with concept $c$ as the domain, $P(c) = DP(c) \cup OP(c)$, where
    • $DP(c) = \{dp_1, dp_2, \ldots, dp_m\}$ are datatype properties, each taking a primitive data type as the range. We use $dp_i(c)$ to represent the datatype property $dp_i$ of concept $c$.
    • $OP(c) = \{op_1, op_2, \ldots, op_n\}$ are object properties, each taking some concept(s) as the range. We use $op_i(c)$ to denote the object property $op_i$ of concept $c$. Object properties can be of two types:
        - class-level properties ($OP_C(c)$) are properties relating two concepts.
        - instance-level properties ($OP_I(c)$) are properties relating instances.
    For example, for concept 'patient' in Fig. 1, DP(patient) = {ID, insurance_type, insurance_plan, billing_address} and OP(patient) = {subclassOf, monitored_by}. subclassOf $\in OP_C$(patient) with subclassOf(patient) = person and monitored_by $\in OP_I$(patient) with monitored_by(patient) = monitoring device.
(3) Restrictions: For each property $p$, there exists a set of restrictions on the range or cardinality of the property, denoted by $R(p) = \{r_1, \ldots, r_w\}$ where $r_k(p)$ is the restriction $r_k$ of property $p$.
    For example, ID $\in$ DP(patient) and R(ID) = {cardinality} with cardinality(ID) = 1.
(4) Axioms: For each property $p$, there exists a set of axioms with each defined by itself or in relation to another property, denoted by $A(p) = \{a_1, a_2, \ldots, a_n\}$. $a_i(p)$ represents the axiom $a_i$ of property $p$.
    In our example, EquivalentProperty(insurance_type) = insurance_plan is an axiom stating that these two properties are semantically equivalent (note that equivalent properties are put in parentheses in the figure).

Different ontology languages may support different types of class-level object properties, restrictions and axioms upon properties. For instance, OWL Full [18] supports at least the following class-level object properties: subClassOf, equivalentClass, intersectionOf, unionOf, complementOf. It supports restrictions such as cardinality, minCardinality, maxCardinality, and axioms such as subPropertyOf, equivalentProperty, TransitiveProperty, SymmetricProperty, FunctionalProperty, InverseFunctionalProperty, inverseOf, and so on. Though our proposed approach does not depend on any specific ontology language, we use OWL constructs for our discussion in this paper.

Fig. 1. Example of an ICU ontology.



Concept hierarchy: Concepts in an ontology $o$ are usually organized into a concept hierarchy. Given two concepts $c_1, c_2 \in C(o)$ where $subclassOf(c_1) = c_2$, we say that $c_1$ is a more specialized concept than $c_2$. We use $eq(c)$ and $sup(c)$ to represent the set of concepts which are equivalent to $c$ and superclass concepts of $c$, respectively. Similarly, hierarchical relationships may exist among properties. A property is a sub-property of another property if it is a specialization of the other. We use $eq(p)$ to represent the set of properties which are equivalent to $p$, and $sup(p)$ to represent the set of properties of which $p$ is a sub-property.

Ontologies share a lot of similarities with enhanced entity-relationship (EER) models in that concepts and properties in ontologies are equivalent to entities and relationships/attributes in EER models. However, they are different in many aspects. For instance, various axioms in ontologies, which EER models do not support, enable automated reasoning over the data instances. Readers can refer to [12] for the similarities and differences between ontology evolution and database schema evolution.

2.2. Ontology instances

Each concept $c$ has a set of instances $I(c) = \{i_1, \ldots, i_n\}$. Note that a data instance of concept $c$ can instantiate any property whose domain is concept $c$, or any of its equivalent or superclass concepts. For a concept $c$, any data instances of its subclass or equivalent concepts are also data instances of $c$.

Each $i_i \in I(c)$ is a 4-tuple, $i_i = \langle URI_i, c, DP(c), OP_I(c) \rangle$, such that

(1) $URI_i$ is a uniform resource identifier (URI) [25] by which $i_i$ can be universally identified and other instances can refer to it.
(2) $c$ is the concept that $i_i$ is asserted to instantiate. When a data instance instantiates more than one concept, we say it instantiates the intersection of these concepts.
(3) $DP(c)$ is a set of datatype property instantiations and each $dp_i \in DP(c)$ takes a specific value $v_j$, denoted as $i_i : dp_i = v_j$, where $v_j$ has a specified primitive data type as its domain.
(4) $OP_I(c)$ is a set of object property instantiations, and each $op_i \in OP_I(c)$ where $op_i(c) = c_j$ takes an instance $i_j$ of concept $c_j$ as its value, denoted as $i_i : op_i = i_j$.

For example, assume http://patients.org/#AW is a data instance of concept 'patient' with several properties instantiated as follows:

'http://patients.org/#AW':name = 'Amy White',
'http://patients.org/#AW':insurance_type = 'NJ Plus',
and so on.

3. Changes to ontologies

Since structural and semantic changes to an ontology have an impact on the validity of data instances, we discuss them in detail in this section.

3.1. Structural changes

Types of structural changes to ontologies. Given two ontology versions $o$ and $o'$, the structural changes between them, $\delta(o, o') = \{\delta_1, \delta_2, \ldots, \delta_n\}$, are defined as operations to transform $o$ to $o'$. Following is a list of possible structural changes.

(1) Structural changes to a concept, $\delta c$: addition of a concept ($\delta c_a$), deletion of a concept ($\delta c_d$), renaming a concept ($\delta c_r$), splitting a concept ($\delta c_s$), and merging of concepts ($\delta c_m$).
(2) Structural changes to a property of a concept, $\delta p$: addition of a property ($\delta p_a$), deletion of a property ($\delta p_d$), moving a property ($\delta p_m$), modifying (the range of) a property ($\delta p_u$), and renaming a property ($\delta p_r$).
(3) Structural changes to a restriction, $\delta r$: addition of a restriction ($\delta r_a$), deletion of a restriction ($\delta r_d$), and modifying a restriction ($\delta r_u$).
(4) Structural changes to an axiom, $\delta a$: addition of an axiom ($\delta a_a$), and deletion of an axiom ($\delta a_d$).

Structural changes including split and merge operations are also used in PromptDiff [13]. Though one may consider modifying a property as a deletion of the property followed by an addition, we prefer describing it as modifying a property when the property involved refers to the same property. Besides, when a property is modified, we mean the range of the property has been modified, not the property itself.

Let the ontology in Fig. 1 be $o$ and that in Fig. 2 be $o'$. The following structural changes have occurred from $o$ to $o'$:

• The range of the 'subclassOf' property for concept 'monitoring device' is modified from concept 'treatment device' to concept 'equipment';
• Restriction 'cardinality = 1' is added to property 'code' of concept 'health care personnel';
• Property 'specialty' with string as its range is added to concept 'doctor';
• Property 'task_performed' is moved from concept 'treatment device' to concept 'equipment'.

Fig. 2. Another version of the ICU ontology in Fig. 1.

3.2. Semantic changes

A structural change such as modifying the range of a property does not by itself reveal the semantic change it causes. We need to further examine the semantic change behind it by determining whether the new property range is more generalized or specialized. Of course, each of the following semantic changes is caused and reflected by certain structural changes we have defined.

Types of semantic changes to ontologies. Given two ontology versions $o$ and $o'$, the set of semantic changes between them, denoted as $\Delta(o, o') = \{\Delta_1, \Delta_2, \ldots, \Delta_n\}$, may include:

(1) Changes to the generalization of a concept, $\Delta c_g$: A concept becomes more generalized ($\Delta c_g^{\uparrow}$) or more specialized ($\Delta c_g^{\downarrow}$) if it moves up or down in the concept hierarchy, respectively. Otherwise, the generalization of a concept becomes incomparable ($\Delta c_g^{\dagger}$), if it has changed, but become neither more generalized nor specialized.
    Ontology changes which cause a concept to be more specialized or incomparable will invalidate its instances due to the semantic change to the concept. On the other hand, if a concept becomes more generalized, though the semantic change to the concept does not invalidate any data instances, there is still a possibility that some of its instances become invalidated as a result of the semantic changes to its properties.
(2) Changes to the descriptiveness of a concept, $\Delta c_d$: A concept becomes more descriptive ($\Delta c_d^{\uparrow}$) or less descriptive ($\Delta c_d^{\downarrow}$) if datatype properties or instance-level object properties are added to or deleted from the concept, respectively.
    Ontology changes which make a concept less descriptive may invalidate some instances. For instance, if a property is deleted from a concept, all the data instances that have instantiated this property will become invalid.
(3) Changes to the generalization of a property, $\Delta p_g$, with its domain or range: A property becomes more generalized ($\Delta p_g^{\uparrow}$) or more specialized ($\Delta p_g^{\downarrow}$) with its domain (or range) if its domain (or range) becomes a more generalized or specialized concept, respectively. Otherwise, its domain or range becomes incomparable ($\Delta p_g^{\dagger}$) if it is modified in a way which is neither more generalized nor specialized. Note that the semantic change to the domain or range of a property can be either explicit (its domain or range has changed to a different concept) or implicit (its domain or range concept has changed implicitly).
    Ontology changes which make a property more specialized or incomparable may invalidate some instances. For instance, suppose the range of a property is changed from concept $c_1$ to $c_2$ where $c_2$ is more specialized than $c_1$. Instances which have an instance of $c_2$ as the property value (any instance of $c_2$ is also an instance of $c_1$) are valid both before and after the change. However, those instances with an instance of $c_1$ (but not $c_2$) as the property value will not be valid after the semantic change.
(4) Changes to restrictiveness of a property, $\Delta p_r$: A property becomes more restrictive ($\Delta p_r^{\uparrow}$) or less restrictive ($\Delta p_r^{\downarrow}$) if changes to the restrictions of the property further restrict or extend its semantics, respectively.
    Ontology changes which make a property more restrictive may invalidate some instances which violate the new restrictions.
(5) Changes to axiom over a property, $\Delta p_a$: The axiom over the property becomes more extended ($\Delta p_a^{\uparrow}$) or more restricted ($\Delta p_a^{\downarrow}$) if changes to its axiom further extend or restrict its semantics, respectively. An example axiom, which extends the meaning of the property, is the declaration of a property to be transitive, which leads to additional reasoning without invalidating the reasoning prior to the declaration. On the other hand, the declaration of a property to be functional will invalidate the instantiations where the value to this property is not unique. It does not introduce any new reasoning results, either. As a result, this change will restrict the meaning of the property.

Note that each of the above semantic changes is the result of one single structural change and they may not be able to represent the semantic change caused by multiple structural changes altogether. For instance, adding a property to a concept makes the concept more descriptive and deleting another property from it makes it less descriptive. Another example is that the domain of a property can become more specialized while its range becomes more generalized. We do not provide a corresponding semantic change to describe the combinational effects of these changes. The reason is that each ontology change as well as its impact on data instances can be examined separately. Using the same example, adding a property to a concept does not invalidate any data instances while deleting another property can cause some data instances to be invalid. We need to look at each change separately during validity evaluation.

For the ontology versions in Figs. 1 and 2, the following explicit semantic changes are detected based on the structural changes: concept 'monitoring device' has become more generalized since its 'subclassOf' property has taken a more generalized concept as its range; property 'code' of 'health care personnel' has become more restrictive; addition of property 'specialty' to concept 'doctor' has made it more descriptive; property 'task_performed' has become more generalized since it is moved to a more generalized concept.

One may argue that concept 'monitoring device' has become more specialized in some sense since it can no longer include any instances of 'treatment device' while previously some instances of 'treatment device' could have been monitoring devices. Our point is that 'monitoring device' potentially allows more instances after the change. For instance, 'monitoring device' can only include some instances of 'treatment device' before the change; after the change, it can include instances of 'equipment' which may not belong to 'treatment device'.

We suggest the changes between ontologies be represented using the same ontology language as the ontologies themselves. Ontologies for annotating these changes, such as that in [8], can be developed. This is out of the scope of this paper.

4. Detection of implicit semantic changes from explicit semantic changes

While the explicit semantic changes can be detected based on the structural changes, implicit semantic changes need to be derived from explicit semantic changes by using implication analysis.

To facilitate the implication analysis, we introduce the notion of isosem. Each isosem is nothing but a cluster of semantically dependent properties of a concept in an ontology.

Definition 1 (Isosem). Given a concept $c \in C(o)$, we define an isosem $S(c)$ as a set of properties, where $S(c) = \{p_1, \ldots, p_k\}$ such that $p_1, \ldots, p_k \in P(c)$ and for each $p_i \in S(c)$, either $p_i$ is the only property in $S(c)$ or $p_j \in S(c)$ if $p_j$ is an equivalent property or sub-property (specialization) of $p_i$ (or vice versa).

Isosems are categorized into definitional isosems and descriptive isosems based on the type of properties they contain, as defined below.

Definition 2 (Definitional isosem). A definitional isosem of concept $c$, denoted as $S^f(c)$, is an isosem consisting of class-level object properties.

Definition 3 (Descriptive isosem). A descriptive isosem of concept $c$, denoted as $S^s(c)$, is an isosem consisting of instance-level object properties or datatype properties.

Fig. 3 shows an example of how isosems of a concept 'patient' in Fig. 1 are abstracted from the properties. This concept has one definitional isosem, $S^f$(patient), and four descriptive isosems, $S^s_1$(patient), $S^s_2$(patient), $S^s_3$(patient) and $S^s_4$(patient), each of which is indicated as an inner circle with its constituent properties enclosed. Note that properties 'insurance_plan' and 'insurance_type' are clustered into the same isosem since they are equivalent properties. If these two properties have any subproperties such as 'primary insurance' and 'secondary insurance', then all these properties will be clustered into the same descriptive isosem.

Fig. 3. Abstraction of isosems from properties.
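Definitions 1 to 3 describe isosems as clusters of properties linked by equivalence or sub-property relations, labelled by the kind of property they hold. One way to read this operationally is as a connected-components computation. The Python sketch below is a minimal illustration under stated assumptions, not the authors' algorithm: the function name `isosems`, the kind labels `"class"`, `"instance"` and `"datatype"`, and the input encoding are all hypothetical, and it assumes that equivalence and sub-property links never connect a class-level property to a property of another kind.

```python
def isosems(properties, links):
    """Cluster the properties of one concept into isosems.

    properties: dict mapping property name -> kind, where kind is
                "class" (class-level object property), "instance"
                (instance-level object property) or "datatype".
    links:      iterable of (p, q) pairs meaning q is an equivalent
                property or a sub-property of p.
    Returns (definitional, descriptive) lists of frozensets.
    """
    parent = {p: p for p in properties}

    def find(p):
        # Union-find lookup with path halving.
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    # Properties related by equivalence or sub-property end up together.
    for p, q in links:
        parent[find(p)] = find(q)

    clusters = {}
    for p in properties:
        clusters.setdefault(find(p), set()).add(p)

    definitional, descriptive = [], []
    for members in clusters.values():
        kinds = {properties[m] for m in members}
        if kinds == {"class"}:
            definitional.append(frozenset(members))   # S^f(c)
        elif "class" not in kinds:
            descriptive.append(frozenset(members))    # S^s(c)
        else:
            raise ValueError("mixed cluster: %s" % sorted(members))
    return definitional, descriptive


# Properties of 'patient' from the running example.
PATIENT_PROPERTIES = {
    "subclassOf": "class",
    "monitored_by": "instance",
    "ID": "datatype",
    "insurance_type": "datatype",
    "insurance_plan": "datatype",
    "billing_address": "datatype",
}
PATIENT_LINKS = [("insurance_type", "insurance_plan")]  # equivalent properties

definitional, descriptive = isosems(PATIENT_PROPERTIES, PATIENT_LINKS)
print(len(definitional), len(descriptive))
```

For the 'patient' input this yields one definitional cluster and four descriptive clusters, matching the counts given in the text for Fig. 3. A union-find structure is used here only because it keeps the clustering short; any connected-components routine would serve equally well.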

4.1. Implication analysis

By conducting an implication analysis, we can deduce whether and how the semantic changes to a property or isosem imply semantic changes to the other properties or isosems. This implication analysis is the basis for detecting implicit semantic changes and generating semantic views, which will be discussed in Section 6.

(1) Intra-isosem Dependency: Two properties $p_i$ and $p_j$ in the same isosem are semantically dependent. That is, $\Delta p_i \Rightarrow \Delta p_j$, where $p_i, p_j \in S_i(c_i)$, as shown in Fig. 4(a). For instance, for concept 'patient', 'insurance_plan' and 'insurance_type' are clustered into the same isosem since they are equivalent properties. Therefore, they are semantically dependent upon one another. However, if one property is the subproperty of another property in the same isosem, then the subproperty will depend on the superproperty. For example, a 'parent' concept can have properties such as 'has_child', 'has_son', etc., where 'has_son' is a subproperty of 'has_child'. Then semantic changes to 'has_child' will affect 'has_son' as well.
(2) Inter-isosem Dependency: If the property within a definitional isosem has changed, the domain of each property in the descriptive isosems of the same concept has changed implicitly. That is, $\Delta p_i \Rightarrow \Delta p_j$, where $p_i \in S^f(c_i)$ and $p_j \in S^s(c_i)$, which is shown in Fig. 4(b).
    For instance, if the definitional isosem of concept 'monitoring device' has changed, it implies a change to the domain of each property in its descriptive isosems, including properties 'monitors' and 'channel_number'.

Fig. 4. Implication analysis of semantic changes: (a) Intra-isosem Dependency; (b) Inter-isosem Dependency; (c) Inter-isosem Independency; (d) Inter-concept Def-Def Dependency; (e) Inter-concept Def-Des Dependency; (f) Inter-concept Des-Des Dependency; (g) Inter-concept Independency.



(3) Inter-isosem Independency: Changes to the properties For example, ‘person’ and ‘monitoring device’ are
within an isosem imply no changes to the other isos- indirectly associated concepts, changes to the descrip-
ems of the same concept, if they belong to the same category. Formally, $\Delta p_i \nRightarrow \Delta p_j$, if $p_i \in S^{f_i}(c_i)$ and $p_j \in S^{f_j}(c_i)$, or $p_i \in S^{s_i}(c_i)$ and $p_j \in S^{s_j}(c_i)$, as indicated in Fig. 4(c).
    The reason for this independency is that properties of these two isosems would have been clustered into the same isosem if semantic dependency existed.
(4) Inter-concept Def–Def Dependency: If the property in a definitional isosem has changed, the property in the definitional isosem of the child concepts has changed implicitly, which is shown in Fig. 4(d). Formally, $\Delta p_i \Rightarrow \Delta p_j$, if $p_i \in S^{f}(c_i)$ and $p_j \in S^{f}(c_j)$, where $c_j$ is a child concept of $c_i$.
    For instance, in the ontology versions shown in Figs. 1 and 2, property ‘subclassOf’ in the definitional isosem of concept ‘monitoring device’ has changed. Concept ‘electrocardiograph’ is a child of ‘monitoring device’. As a result, an implicit change is implied to the definitional isosem of concept ‘electrocardiograph’.
(5) Inter-concept Def–Des Dependency: If the definitional isosem of a concept has changed, the range of each property in the descriptive isosem of another concept has changed implicitly, if this property takes the former concept as its range, and belongs to a descriptive isosem of the latter concept. This is depicted in Fig. 4(e). Formally, $\Delta p_i \Rightarrow \Delta p_j$, if $p_i \in S^{f}(c_i)$ and $p_j \in S^{s}(c_j)$, where $p_j(c_j) = c_i$. For instance, concept ‘patient’ has a descriptive isosem containing property ‘monitored_by’ and taking concept ‘monitoring device’ as its range. Then changes to the definitional isosem of ‘monitoring device’ imply changes to this descriptive isosem of concept ‘patient’.
(6) Inter-concept Des–Des Dependency: If two descriptive isosems contain semantically dependent properties, at least one of the isosems is dependent on the other. That is, $\Delta p_i \Rightarrow \Delta p_j$ where $p_i \in S^{s}(c_i)$, $p_j \in S^{s}(c_j)$, $a_i(p_i) = p_j$ (or $a_j(p_j) = p_i$), which is shown in Fig. 4(f). In Fig. 1, the ‘billing_address’ property of concept ‘patient’ is a sub-property of the ‘address’ property of concept ‘person’, and concept ‘person’ is the parent of concept ‘patient’. Then, changes to the isosem containing ‘address’ property imply changes to the isosem containing ‘billing_address’. As another example, ‘monitored_by’ property of ‘patient’ is inverse of ‘monitors’ property of ‘monitoring device’, each of which constitutes an isosem. Then changes to one imply changes to the other.
(7) Inter-concept Independency: With Inter-concept Des–Des Dependency as the exception, if concept $c_i$ is indirectly associated with $c_l$ through $c_k$, then the semantics of $c_i$ is independent of changes to concept $c_l$ if the isosem between $c_i$ and $c_k$ does not change. Generally, as shown in Fig. 4(g), $\Delta p_i \nRightarrow \Delta p_l$ where $p_i \in S^{s_i}(c_i)$, $p_l \in S^{s_l}(c_l)$, $p_j \in S^{s_j}(c_i)$ and $p_j(c_i) = c_k$, $p_k \in S^{s_k}(c_k)$ and $p_k(c_k) = c_l$.
tive isosems of ‘monitoring device’ do not affect any descriptive isosems of ‘person’.

5. Structural and semantic validity

In this section, we show how changes detected to an ontology can be used in evaluating the structural validity and semantic validity of a data instance.

5.1. Structural validity

Structural changes to an ontology affect the structural validity of its data instances. In this section, we define the structural validity and propose how it can be evaluated.
Given a data instance and an ontology, we can evaluate whether this data instance is structurally valid with respect to this ontology. Obviously, if a data instance is structurally valid against the old version of an ontology but not structurally valid against the new version, then the structural changes to the ontology invalidate the data instance. Alternatively, if a data instance is structurally valid against one version of the ontology, its structural validity against a different ontology version can also be evaluated based on the structural changes between the two versions.

Definition 4 (Structural validity of data instances). A data instance $i \in I(c)$ is structurally valid against an ontology $o$, if the following conditions hold:

(1) Concept $c$ exists in ontology $o$, i.e. $c \in C(o)$;
(2) Each property instantiated by instance $i$ belongs to the instantiated concept $c$ by either explicit specification or inheritance in the ontology;
(3) For each property instantiated by instance $i$, the value taken by the property belongs to the range of the property and satisfies the restrictions of the property.

As an example, if a property $p$ is specified to concept $c_1$ in the old version and moved to concept $c_2$ in the new version, then the data instances of concept $c_1$ with property $p$ instantiated are not structurally valid against the new version of the ontology unless $c_1$ can inherit property $p$ from $c_2$.
Using our example ICU ontology, for an instance of ‘patient’ with property ‘monitored_by’ instantiated, the instance is valid if this property takes a data instance of ‘monitoring device’ or ‘electrocardiograph’ as its value. Note that we only require that the range of the property is asserted to be an instance of concept ‘monitoring device’ or ‘electrocardiograph’. However, whether this property value is structurally or semantically valid does not matter to evaluating the validity of this instance of ‘patient’. Of
course, this data instance of ‘patient’ will not be structurally valid if it takes an instance of concepts such as ‘general equipment’.
Algorithm 1 gives the detailed steps to evaluate the structural validity of a data instance.

Algorithm 1
Evaluating the structural validity of data instances [StructValid(i, o)]

Require: Data instance i ∈ I(c); i instantiates property p, ontology o
StructValid(i, o) = true {default value}
{Evaluating the structural validity of a data instance i against an ontology o}
if c is a concept in ontology o then
  for each p instantiated by instance i do
    if p does not belong to concept c or any of its superclass or equivalent concepts, OR the value taken by p does not belong to the range of p or does not satisfy the restrictions then
      StructValid(i, o) = false
    end if
  end for
else
  StructValid(i, o) = false
end if

Assume a data instance of the concept ‘patient’ in the ontology shown in Fig. 1 has properties ‘ID’, ‘name’, ‘insurance_type’ and ‘monitored_by’ instantiated. If concept ‘patient’ or any of these instantiated properties are deleted from the ontology, then this data instance will not be structurally valid against the new version of the ontology.

5.2. Semantic validity

Data instances can be interpreted correctly only if they are semantically valid with respect to the ontology they comply with. Their semantic validity needs to be evaluated when semantic changes take place to the ontology that they comply with. Given that a data instance is semantically valid to an ontology, we are interested in finding out how semantic changes to this ontology affect the validity of its data instance.

Definition 5 (Semantic validity of data instances). A data instance $i \in I(c)$ where $c \in C(o)$, is semantically valid against $o'$ where $c' \in C(o')$, if the following conditions hold:

(1) Data instance $i$ is structurally valid against $o'$, i.e. $StructValid(i, o') = true$;
(2) Concept $c$ does not get more specialized or incomparable;
(3) For each instantiated property $p$, the axiom over $p$ does not get more restricted.

The structural validity is the prerequisite for the semantic validity. However, the structural validity does not necessarily lead to the semantic validity. For instance, structural validity warrants that the data instance can instantiate a concept. However, if this concept has become more specialized, this data instance may no longer be semantically valid. Though one should be able to determine whether an instance is structurally valid for an ontology irrespective of other ontology versions, the semantic validity of a data instance can be evaluated with the semantic changes between two versions of the ontology given, on the basis of the structural validity.
In Algorithm 2, we show how to evaluate whether a data instance $i$ is semantically valid against a new version of an ontology $o'$, given that it is semantically valid against its old version $o$ and the semantic changes between the two versions, $\Delta(o, o')$.

Algorithm 2
Evaluating the semantic validity of data instances [SemValid(i, o')]

Require: Data instance i ∈ I(c); i instantiates property p, ontology o and o'; Δ(o, o')
SemValid(i, o') = true {default value}
{Evaluating the semantic validity of a data instance i against ontology o'}
if StructValid(i, o') = true then
  if c' has not become more specialized than c or incomparable then
    for each p instantiated by instance i do
      if axiom over p gets more restricted then
        SemValid(i, o') = false
      end if
    end for
  else
    SemValid(i, o') = false
  end if
else
  SemValid(i, o') = false
end if

For the ontology versions shown in Figs. 1 and 2, consider the semantic validity of a data instance of concept ‘monitoring device’. The semantic changes between the ontology versions include that this concept has become more generalized. In this case, the data instance which is semantically valid against the old version will be semantically valid against the new version except that it may become invalidated due to semantic changes to the properties. On the other hand, if a concept becomes more specialized, then instances of this concept, which are semantically valid for the old version, will be determined by the algorithm as semantically invalid for the new version and may be subject to further human analysis.
Instances are inter-related; however, as we mentioned in discussing structural validity, a semantically invalid instance does not invalidate its related instances simply because they take a semantically invalid instance as the property value. Evaluating semantic validity will become impossible otherwise.

6. Validation using semantic view

Note that each data instance only instantiates part of an ontology. Therefore, not all the changes to the ontology have an impact on a particular data instance. Starting with the ontology against which a data instance is semantically valid, we call the part of this ontology whose semantic change affects this data instance the semantic view of the ontology for this data instance.
The significance of the semantic view is that the semantic validity of a data instance depends only on a portion of the ontology, i.e. the semantic view, rather than the entire ontology. As a result, changes to components outside the semantic view will not affect the semantic validity of the data instance and thus can be ignored during the evaluation.
For a specific data instance, its semantic view in the ontology not only includes the components that it instantiates, but also any other components whose changes implicitly affect the instantiated components based on the implication analysis presented in Section 4.1.

6.1. Semantic view

Definition 6 (Semantic view for a data instance). The semantic view of an instance $i_i$ in its ontology $o_i$, denoted as $SV(i_i, o_i)$, consists of all the components in $o_i$ whose changes may lead to $SemValid(i_i, o_i') = false$.

Identifying the semantic view needs the knowledge of implication analysis, i.e. the dependencies and independencies between the components in the ontology. Below is how the semantic view can be generated for a data instance.

6.1.1. Identifying the semantic view for a data instance
Given the instantiated concept, $c$, for a data instance $i$, and all the instantiated properties of $i$, $P(i)$, we perform the following steps.

(1) Find the equivalent concepts of $c$, $eq(c)$, and the superclass concepts of $c$, $sup(c)$. Essentially, we apply the Inter-concept Def–Def Dependency and Inter-isosem Dependency.
    For instance, consider the ontology in Fig. 1 for a data instance of concept ‘patient’ with all the properties of ‘patient’ instantiated. Concept ‘person’ is included in this step for the semantic view. This is because semantic changes to the properties in the definitional isosem of ‘person’ would imply semantic changes to those in the definitional isosem of ‘patient’ based on Inter-concept Def–Def Dependency; based on Inter-isosem Dependency, changes to the properties in the definitional isosem of ‘patient’ imply semantic changes to its datatype properties and instance-level object properties, which are instantiated by this data instance of ‘patient’.
(2) For each object property instantiated by $i$ where $c_k$ is the range of this property, find the equivalent concepts of $c_k$, $eq(c_k)$, and their superclass concepts, $sup(c_k)$. That is, we employ the Inter-concept Def–Des Dependency and Inter-concept Def–Def Dependency.
    Consider the same data instance of ‘patient’. Based on Inter-concept Def–Des Dependency, semantic changes to the properties in the definitional isosem of ‘monitoring device’ imply changes to the isosem consisting of property ‘monitored_by’ of the ‘patient’ concept. Therefore, concept ‘monitoring device’ should be included in the semantic view. Furthermore, changes to the properties in the definitional isosem of the superclass concepts including ‘treatment device’ and ‘equipment’ imply changes to those of ‘monitoring device’ based on Inter-concept Def–Def Dependency. Concepts obtained in this step include concepts ‘monitoring device’, ‘treatment device’, and ‘equipment’.
(3) For each property $p$ instantiated by $i$, find the equivalent properties of $p$, $eq(p)$, their inverse properties, denoted as $inv(p)$, and their superproperties, $sup(p)$. These properties may belong to the same concept as $p$ or any concept in the set obtained in steps 1 and 2. This is based on Intra-isosem Dependency and Inter-concept Des–Des Dependency.
    For our example, the properties obtained in this step include property ‘address’ where subpropertyOf(billing_address) = address and property ‘monitors’ where inverseOf(monitors) = monitored_by. Semantic changes to these properties imply changes to those of ‘patient’ based on Inter-concept Des–Des Dependency.
(4) The semantic view of ontology $o$ for data instance $i$ consists of the concepts obtained in steps 1 and 2 (along with the class-level object properties among them), the properties obtained in step 3 along with their restrictions and axioms.

By applying all the dependencies in our implication analysis, all the ontology components in the semantic view for a specific data instance are relevant to the semantic validity of the data instance. All the other components of the ontology, based on the Inter-isosem Independency and Inter-concept Independency, do not affect the semantic validity of this instance and therefore are not part of the semantic view. This naturally leads to the following theorem:
Theorem 1. [Completeness and minimality of the semantic view] The semantic view for a data instance includes all (completeness) and only (minimality) the ontology components which can affect the semantic validity of the data instance.

Proof. Based on our implication analysis, ontology components in the semantic view are identified through the following dependencies: Intra-isosem Dependency, Inter-isosem Dependency, Inter-concept Def–Def Dependency, Inter-concept Def–Des Dependency, and Inter-concept Des–Des Dependency. Other ontology components are excluded from the semantic view based on the following independencies: Inter-isosem Independency, and Inter-concept Independency.
Consider a data instance $i$ which instantiates concept $c_i$ and property $p$ from ontology $o$ ($c_j$ is the range of property $p$ if $p$ is an object property). □

6.1.2. Proof for minimality

• Based on Inter-isosem Dependency and Inter-concept Def–Def Dependency, $c_i, eq(c_i), sup(c_i) \in SV(i, o)$ (including the class-level object properties among them).
  The Inter-isosem Dependency indicates the definitional isosem of concept $c_i$ affects its descriptive isosems, consisting of instance-level properties instantiated by any data instance of $c_i$. Also, based on Inter-concept Def–Def Dependency, $eq(c_i)$ and $sup(c_i)$ have to be included in the semantic view since changes to these concepts may imply changes to $c_i$.
• Based on Inter-concept Def–Des Dependency and Inter-concept Def–Def Dependency, $c_j, eq(c_j), sup(c_j) \in SV(i, o)$ (including the class-level object properties among them).
  $c_j$ is in the semantic view based on Inter-concept Def–Des Dependency since changes to the definitional isosem of $c_j$ imply changes to $p$. Again, $eq(c_j)$ and $sup(c_j)$ are included in the semantic view based on Inter-concept Def–Def Dependency.
• Based on Intra-isosem Dependency and Inter-concept Des–Des Dependency, $p, eq(p), sup(p), inv(p) \in SV(i, o)$ (including their restrictions and axioms).
  In addition to property $p$, other related properties including $eq(p)$, $sup(p)$, and $inv(p)$ are in the semantic view based on Intra-isosem Dependency if they have concept $c_i$ as the domain or based on Inter-concept Des–Des Dependency if they have a different concept as the domain.

By applying all the dependencies, we have shown that all the ontology components in the semantic view are relevant to the semantic validity of the data instance. In other words, no component that is not relevant has been included in the semantic view, and therefore we prove the minimality.

6.1.3. Proof for completeness

• On the other hand, any other concepts and properties along with their restrictions and axioms $\notin SV(i, o)$. This is because any other isosems of concept $c_i$ or any other concept have no impact on the data instance based on Inter-isosem Independency or Inter-concept Independency.

This means that the semantic view has included all the ontology components relevant to the semantic validity of the data instance, and therefore we prove the completeness of the semantic view.

Fig. 5. The semantic view of the ontology in Fig. 1 for a data instance of concept ‘patient’.
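
Because Definition 6 characterizes the semantic view through SemValid, it may help to see the two checks of Section 5 (Algorithms 1 and 2) in executable form. The following is only a minimal Python sketch under simplifying assumptions: the `Ontology` class, its dictionary layout, the `"string"` datatype of the example properties, and the encoding of the changes $\Delta(o, o')$ as two plain sets (`specialized`, `restricted`) are all hypothetical, and equivalent concepts and property restrictions are not modeled.

```python
from dataclasses import dataclass


@dataclass
class Ontology:
    """Hypothetical, simplified ontology model (not an OWL or RacerPro API)."""
    parents: dict      # concept -> list of direct superclass concepts
    properties: dict   # property -> (domain concept, range concept or datatype)

    def ancestors(self, concept):
        """Return the concept together with all of its transitive superclasses."""
        seen, stack = set(), [concept]
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                stack.extend(self.parents.get(node, ()))
        return seen


def struct_valid(concept, values, onto):
    """Algorithm 1, simplified. `values` maps each instantiated property to the
    concept (or datatype) of the value it takes."""
    if concept not in onto.parents:                     # condition (1)
        return False
    for prop, value_type in values.items():
        if prop not in onto.properties:
            return False
        domain, rng = onto.properties[prop]
        if domain not in onto.ancestors(concept):       # condition (2)
            return False
        if rng not in onto.ancestors(value_type):       # condition (3), range only
            return False
    return True


def sem_valid(concept, values, new_onto, specialized, restricted):
    """Algorithm 2, simplified. `specialized` holds concepts that became more
    specialized or incomparable; `restricted` holds properties whose axioms
    became more restricted (both assumed to be derived from the changes)."""
    if not struct_valid(concept, values, new_onto):
        return False
    if concept in specialized:
        return False
    return not any(prop in restricted for prop in values)


# A fragment of the ICU ontology of Fig. 1, transcribed by hand (datatypes assumed).
icu = Ontology(
    parents={
        "person": [], "patient": ["person"],
        "equipment": [], "general equipment": ["equipment"],
        "treatment device": ["equipment"],
        "monitoring device": ["treatment device"],
        "electrocardiograph": ["monitoring device"],
    },
    properties={
        "name": ("person", "string"),
        "ID": ("patient", "string"),
        "monitored_by": ("patient", "monitoring device"),
    },
)
patient = {"name": "string", "ID": "string", "monitored_by": "electrocardiograph"}
```

With this fragment, `struct_valid("patient", patient, icu)` holds, whereas a ‘patient’ whose ‘monitored_by’ value is an instance of ‘general equipment’ is rejected, which mirrors the example in Section 5.1.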
Fig. 6. The semantic view of the ontology in Fig. 2 for a data instance of concept ‘patient’.
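
The identification steps of Section 6.1.1 can be sketched in a few lines of Python. This is an illustration rather than the implementation described in the paper: the function names and the dictionary-based encoding of the ontology are assumptions, equivalent concepts and properties are left out for brevity, and restrictions and axioms are not collected.

```python
def closure(start, edges):
    """Return `start` plus every node reachable from it through `edges`."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def semantic_view(concept, props, superclass, range_of, superprop, inverse):
    """Collect the concepts and properties of the semantic view (steps 1 to 4).

    superclass: concept -> list of direct superclasses
    range_of:   object property -> range concept (datatype properties absent)
    superprop:  property -> list of direct superproperties
    inverse:    property -> its inverse property
    """
    concepts = closure(concept, superclass)              # step 1: c and sup(c)
    properties = set()
    for p in props:
        if p in range_of:                                # step 2: c_k and sup(c_k)
            concepts |= closure(range_of[p], superclass)
        properties |= closure(p, superprop)              # step 3: p and sup(p)
        if p in inverse:
            properties.add(inverse[p])                   # step 3: inv(p)
    return concepts, properties                          # step 4


# The 'patient' example, with the ontology of Fig. 1 transcribed by hand.
superclass = {
    "patient": ["person"],
    "treatment device": ["equipment"],
    "general equipment": ["equipment"],
    "monitoring device": ["treatment device"],
    "electrocardiograph": ["monitoring device"],
}
concepts, properties = semantic_view(
    "patient",
    ["ID", "insurance_type", "billing_address", "monitored_by"],
    superclass,
    range_of={"monitored_by": "monitoring device"},
    superprop={"billing_address": ["address"]},
    inverse={"monitored_by": "monitors"},
)
```

For this input the sketch collects the concepts ‘patient’, ‘person’, ‘monitoring device’, ‘treatment device’ and ‘equipment’, and adds ‘address’ and ‘monitors’ to the instantiated properties, in line with the worked example in steps 1 to 3. Concepts such as ‘general equipment’ and ‘electrocardiograph’ stay outside the view.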

The semantic view thus obtained for the ‘patient’ instance is shown in Figs. 5 and 6 as the components within the region of the curve. One can see that the semantic view for this data instance is different when the ontology evolves from Fig. 1 to Fig. 2.
Two instances will share the same semantic view if they instantiate the same concept and properties in the same ontology. Also, the semantic view for a data instance will change when the instantiated components of the data instance change. However, changes to the specific values of the properties do not affect the semantic view if the same properties are still instantiated by the data instance.
As a note, the semantic view for a data instance will include only the components relevant to the validity of the data instance. Considering the independencies we presented in Section 4.1, the semantic view will not end up being the entire ontology. So, identifying the semantic view becomes even more significant when the ontology is much larger than the semantic view.

6.2. Change identification using semantic view

Our previous discussion of validity evaluation is based on the assumption that both versions of the ontologies are available. However, in a distributed and decentralized environment such as the Semantic Web, there cannot be such guarantees since the old version of an ontology may be substituted by a new version and no notification can be sent to entities where its data instances reside. In such a case, the authors of data instances are concerned that the semantics of data instances cannot be correctly interpreted based on the new version. Therefore, they would like to identify whether the ontology has silently changed in a way that affects the interpretation of their data instances. On the other hand, from the perspective of the consumers of these data instances, an interpretation which is not intended by the authors of data instances is definitely undesirable. Therefore, the question is, how can the silent changes to an ontology be identified? In this section, we present our approach of using the semantic view to address this problem.
When a data instance is created, it is semantically valid with respect to its ontology. We suggest each data instance carry a hash value of its semantic view at its creation. When the consumers attempt an interpretation of this data instance, they identify its semantic view based on the current version of the ontology, compute its hash value and compare the hash value with that stored locally by the data instance: if these two hash values match, it means that the semantic view has not changed and thus the data instance can be interpreted safely with the ontology. When the hash values do not match, we cannot tell exactly whether the data instance is still valid. The reason is that this will require the knowledge of the changes between the two versions, which can be obtained by finding changes between two ontology versions structurally [14] and semantically [17]. When the ontologies do not change frequently, this hash value can provide a quick assurance for the validity of the data instance. The hash value should be based on an unordered hashing of the RDF triples of its semantic view. Some existing algorithms for computing the digest of an RDF/OWL graph [24,1,10] can be adopted.
The author of a data instance can further specify how the data instance should be interpreted when the computed hash value for the semantic view does not match with the locally stored one. One can specify one of the following two options.
(1) Cascaded: This means the data instance can be interpreted according to the new version of the ontology. If this option is taken, the data instance may need to be modified to remain valid against the new version of the ontology. This option is appropriate if the ontology is from a trusted source and the author of the data instance would like the data instance to change with the ontology.
(2) Rejected: This means the data instance cannot be interpreted based on the new version of the ontology. If this option is taken, the data instance will be unavailable to its consumers. This option is appropriate when the ontology is from an untrusted source.
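
The hash comparison of Section 6.2, together with the two options above, can be illustrated with a short Python sketch. It is not one of the digest algorithms of [24,1,10]: the order-independent digest below (SHA-256 over the sorted per-triple digests), the function names, and the representation of a triple as a tuple of three strings are assumptions made for the example, and blank nodes are not handled.

```python
import hashlib


def view_digest(triples):
    """Order-independent digest of a collection of (subject, predicate, object)
    string triples: hash every triple, sort the digests, hash the result."""
    leaf_digests = sorted(
        hashlib.sha256("\x1f".join(triple).encode("utf-8")).hexdigest()
        for triple in set(triples)
    )
    return hashlib.sha256("".join(leaf_digests).encode("ascii")).hexdigest()


def interpret(stored_digest, current_view, policy):
    """Decide how a consumer may treat a data instance.

    Returns "unchanged" when the stored digest matches the digest of the
    semantic view computed from the current ontology version; otherwise the
    author's policy ("cascaded" or "rejected") applies.
    """
    if policy not in ("cascaded", "rejected"):
        raise ValueError("policy must be 'cascaded' or 'rejected'")
    if view_digest(current_view) == stored_digest:
        return "unchanged"
    return policy


# Hypothetical semantic view, stored as a digest when the instance is created.
view = [
    ("patient", "subClassOf", "person"),
    ("monitored_by", "range", "monitoring device"),
]
stored = view_digest(view)
changed_view = view + [("monitoring device", "subClassOf", "equipment")]
```

Sorting the per-triple digests makes the result independent of triple order at a cost of O(N log N), so `view_digest(view)` equals the digest of the same triples in any other order, while `interpret(stored, changed_view, "rejected")` falls back to the author's policy.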

7. Implementation discussion

Our algorithms themselves do not support reasoning over ontologies, but exploit the reasoning services provided by a reasoner. RacerPro [23], a Semantic Web reasoning system, can act as the back-end inference server for processing queries over OWL Lite and OWL DL. We use RacerPro as a tool in implementing our approaches though RacerPro does not determine validity by itself. The ontologies we use are represented in OWL, which is the most popular ontology language. RacerPro provides Lisp-based and Java-based APIs for building clients to access RacerPro via TCP/IP.
Figs. 7 and 8 depict the architecture for implementing our approach to evaluating the validity and to identifying the semantic view for an instance, respectively. The Structural (Semantic) Validator in Fig. 7 and SemView Generator in Fig. 8 are clients which interact with the reasoning server. As shown in Fig. 7, the ontologies and instances are loaded into the reasoner first. In order to evaluate the validity of a data instance, the Structural (Semantic) Validator needs to interact with a reasoner in which the Validator sends reasoning queries to the reasoner and receives answers back from the reasoner. Similarly, as shown in Fig. 8, the SemView Generator also needs the reasoning services provided by a reasoner to identify the semantic view for a given data instance and ontology. Examples of reasoning queries in our algorithms include: What are the superclass (ancestor) concepts of a concept? What is the inverse property of a given property? and so on. The API functions used in our discussion below belong to the JRacer library (Java-based).

Fig. 7. Implementation architecture for validity evaluation for an instance.

Fig. 8. Implementation architecture for identifying the semantic view for an instance.

More specifically, to evaluate the structural validity of a data instance against an ontology, the concept it instantiates can be identified using a RacerPro API function such as individual-direct-types. API function concept can be used to evaluate whether the concept belongs to the ontology or not. Then, for each property instantiated by the data instance, API function role-domain can be used to test whether the property belongs to the concept or any of its equivalent or superclass concepts. RacerPro can determine the subsumption or equivalence between two concepts through functions concept-subsumes and concept-equivalent, respectively. For an object property of a data instance, similar evaluations can be done for the instance taken as the value of the property. RacerPro can automatically test whether the value to a datatype property belongs to the specified primitive data type. On the basis of the structural validity, the semantic validity of a data instance against a new version of an ontology can be evaluated with the semantic changes provided as input.
When identifying the semantic view for a data instance, the concept of the data instance can be identified through API function individual-direct-types, and its equivalent concepts and superclass concepts can be identified through API functions concept-synonyms and concept-ancestors, respectively. Similarly, the concept as the range of an object property of the data instance can be identified along with its own equivalent concepts and superclass concepts. For each property of the data instance, the semantic view will also include its equivalent properties identified through role-synonyms, its superproperties identified through role-ancestors, and its inverse property identified through role-inverse. We did not find API functions to retrieve all the restrictions or axioms over a specific property; however, RacerPro provides capabilities of querying about the characteristics of a given property, e.g. transitive-p can determine whether the property is transitive or not. These concepts, the taxonomy/associations among them along with their restrictions and axioms are part of the semantic view.

8. Related work

Most related work on detecting changes to ontologies concentrates on finding structural changes between ontology versions. OntoView [6,7] is a web-based ontology versioning management system, in which structural changes are identified by rules. It allows the ontology engineers to compare versions of ontology, and to specify whether (but not how) these structural changes entail any conceptual implications. PromptDiff [14,15] is a tool that utilizes heuristics for matching components of ontology versions. It integrates different heuristic matchers for comparing ontology versions, where the matchers conform to the monotonicity principle, i.e. no matcher retracts the results of previous matchers or its own results from previous runs. The outcome of the algorithm is a structural diff, consisting of frame pairs which are images of each other from different versions or one of them is null. The change types are described as changed, unchanged and isomorphic. Noy and Klein [12] discussed how structural changes in ontologies affect the preservation of their data instances. Research complementary to detecting changes to ontologies can be found in finding similarities between independently developed ontologies for the purposes of merging and aligning ontologies. Examples of such tools include Prompt [13], GLUE [2], etc. Mitra et al. [11] presented an ontology mapping approach using Bayesian Network.
The above work, however, is limited to detecting changes at the structural level, but does not address detecting changes occurring at the semantic level. Detecting changes at the semantic level is crucial since ontologies are considered as a means for presenting semantics. Typically, semantic changes are not clearly distinguishable from structural changes in the existing work. For example, if the range of a property is changed in the new version to a subclass of the concept in the old version, the range of the property, and therefore a certain semantic aspect of the concept owning the property, becomes narrowed. This change should not be described and treated as a structural change only.
Heflin and Hendler [4] presented SHOE, a web-based knowledge representation language that supports multiple versions of ontologies and discussed how the features of SHOE address ontology versioning. More specifically, the authors analyzed scenarios for semantic subsumption at the ontology level. Compatibility between ontology versions was also discussed in [5] by assuming that the data source that conforms to a specific version of the ontology uses the whole ontology. Though it mentioned the possibility that an incomparable revision may not affect the interpretation of a data source which uses the ontology part unaffected by the revision, it lacks a close examination of the implication within the ontology. The compatibility measured at the ontology level [4,5] is too coarse-grained for the diversity of changes which concurrently occur to an ontology, noting that instantiation is done at a fine-grained level of concepts and properties. Even the limited discussion on describing semantic changes is confined to equivalence and subsumption between concepts. Furthermore, changes to an ontology are examined mainly by considering each individual concept in isolation. As a result, this analysis lacks an examination of the semantics of a concept in relation to other associated concepts and therefore is not capable of analyzing the implication of the semantic changes.
Noy and Klein [12] presented a systematic discussion on the differences between ontology evolution and schema evolution. Klein and Noy [8] described a framework integrated with different representations of the changes between ontology versions and the transformation from one representation to another. An ontology of (structural) change operations was also presented. Klein and Stuckenschmidt [9] discussed how the changes in one ontology affect another ontology which uses concepts defined in the changed ontology and reasonings based on the imported concept hierarchy. In particular, it studied how changes to the concept hierarchy of the external ontology affect the validity of the compiled subsumption relations which are added as axioms to the local ontology.
Recently, [21] investigated how the semantic changes to an ontology can be evaluated and detected. It proposes a SemDiff algorithm that first detects explicit semantic changes based on the structural changes and then detects implicit semantic changes implied by other semantic changes. However, it did not address the issue of structural and semantic validity. In this paper, we exploited the types of structural and semantic changes detected to verify whether the instances are structurally and semantically valid when the ontologies change.
We propose using hashing of values of the semantic view in our change identification approach. Hashing an RDF/OWL document can be done very efficiently. The complexity of hashing in [24] is $O(N)$ and that of [1,10] is $O(N \log N)$.
Qin and Atluri [19,22] proposed utilizing ontologies in guiding the change detection to the data instances under the Semantic Web, and [20] proposed a methodology by exploiting the semantic relationships among concepts to specify access control policies such that undesired inference of unauthorized information can be prevented.

9. Conclusions and future work

Ontologies are formal, explicit specifications of shared conceptualizations that evolve over time. Changes to an ontology may occur at two levels: structural and semantic. When an old version of an ontology is substituted by a new one, data instances that comply with the old version of the ontology may become invalidated, and therefore be irretrievable and non-interpretable. Therefore, it is essential to verify if the data instances are still valid under the new version of the ontology.
In this paper, we have proposed approaches to verify both structural validity and semantic validity of data instances. We have also proposed the semantic view of an ontology for a data instance as the portion of an ontology whose changes directly or indirectly affect the validity of the data instance. We have discussed how the semantic view can be identified through implication analysis. Besides, by storing a hash value of the semantic view, consumers can quickly identify whether the ontology has changed, so that they can be prevented from making any erroneous interpretation.
As part of our future work, we intend to address the issue of efficiently propagating ontology changes to its data instances and dependent ontologies.

References

[1] J. Carroll, Signing RDF graphs, Lecture Notes in Computer Science, vol. 2870, Springer-Verlag, 2003.
[9] M. Klein and H. Stuckenschmidt, Evolution management for interconnected ontologies, in: Workshop on Semantic Integration at ISWC 2003, Sanibel Island, Florida, 2003.
[10] S. Melnik, RDF API draft: Cryptographic digests of RDF models and statements, http://www-db.stanford.edu/melnik/rdf/api.html#digest.
[11] Prasenjit Mitra, Natasha N. Noy and A.R. Jaiswal, Ontology mapping discovery with uncertainty. Fourth International Semantic Web Conference (ISWC), Galway, Ireland, November 8–10, 2005.
[12] N.F. Noy, M. Klein, Ontology evolution: not the same as schema evolution, Knowledge and Information Systems 5 (2003).
[13] N.F. Noy and M.A. Musen, PROMPT: algorithm and tool for automated ontology merging and alignment, in: Seventeenth National Conference on Artificial Intelligence (AAAI2000), 2000.
[14] N.F. Noy and M.A. Musen, PromptDiff: a fixed-point algorithm for comparing ontology versions, in: Proceedings of the National Conference on Artificial Intelligence (AAAI), 2002.
[15] N.F. Noy, M.A. Musen, The PROMPT suite: interactive tools for ontology merging and mapping, International Journal of Human-Computer Studies 59 (6) (2003) 983–1024.
[16] N.F. Noy, M.A. Musen, Specifying Ontology Views by Traversal, Third International Semantic Web Conference (ISWC2004), Hiroshima, Japan, 2004.
[2] A. Doan, J. Madhavan, P. Domingos and A. Halevy, Learning to [17] N.F. Noy, S. Kunnatur, M. Klein, M.A. Musen, Tracking changes
map between ontologies on the Semantic Web, in: Proceedings of the during ontology evolution. Third International Semantic Web Con-
World-Wide Web Conference (WWW), 2002. ference (ISWC2004), Hiroshima, Japan, 2004.
[3] T.R. Gruber, A translation approach to portable ontologies, Knowl- [18] OWL Web Ontology Language Guide. Available at http://
edge Acquisition 5 (2) (1993) 199–220. www.w3.org/TR/owl-guide/.
[4] J. Heflin, J. Hendler, Dynamic ontologies on the Web, in: Proceedings [19] L. Qin and V. Atluri, Ontology-guided change detection to the
of the 17th National Conference on Artificial Intelligence (AAAI- Semantic Web data, in: 23rd International Conference on Conceptual
2000), AAAI/MIT Press, Menlo Park, CA, 2000, pp. 443–449. Modeling (ER 2004), pp. 624–638.
[5] M. Klein, D. Fensel, Ontology versioning for the Semantic Web, in: [20] L. Qin and V. Atluri, Concept-level access control for the Semantic
Proceedings of the International Semantic Web Working symposium Web. ACM Workshop on XML Security, held in conjunction with
(SWWS), Stanford University, California, USA, 2001, pp. 75–91. the 10th ACM Conference on Computer and Communications
[6] M. Klein, D. Fensel, A. Kiryakov and D. Ognyanov, Ontology Security, October 2003.
versioning and change detection on the Web, in: 13th International [21] L. Qin, V. Atluri, SemDiff: an approach to detecting semantic changes
Conference on Knowledge Engineering and Knowledge Management to ontologies, International Journal of Semantic Web and Informa-
(EKAW02), 2002. tion Systems 2 (4) (2006) 1–32.
[7] M. Klein, A. Kiryakov, D. Ognyanov and D. Fensel, Finding and [22] L. Qin, V. Atluri, An ontology-guided approach to change detection
characterizing changes in ontologies, in: Proceedings of the 21st of the Semantic Web data, Journal on Data Semantics V (2006) 130–
International Conference on Conceptual Modeling (ER2002), Tam- 157, LNCS 3870.
pere, Finland, 2002, pp. 79–89. [23] RacerPro, http://www.racer-systems.com.
[8] M. Klein and N.F. Noy, A Component-based framework for [24] C. Sayers and A.H. Karp, Computing the Digest of an RDF Graph,
ontology evolution, in: Proceedings of the Workshop on Ontologies Technical Report HPL-2003-235, 2003.
and Distributed Systems, IJCAI’03, Acapulco, Mexico, 2003. [25] http://www.w3.org/Addressing/.
