Received 21 September 2006; received in revised form 16 January 2008; accepted 16 January 2008
Available online 26 January 2008
Abstract
It is natural for ontologies to evolve over time. These changes could be at structural and semantic levels. Due to changes to an ontol-
ogy, its data instances may become invalid, and as a result, may become non-interpretable. In this paper, we address precisely this prob-
lem, validity of data instances due to ontological evolution. Towards this end, we make the following three novel contributions to the area
of Semantic Web. First, we propose formal notions of structural validity and semantic validity of data instances, and then present
approaches to ensure them. Second, we propose semantic view as part of an ontology, and demonstrate that it is sufficient to validate
a data instance against the semantic view rather than the entire ontology. We discuss how the semantic view can be generated through
an implication analysis, i.e., how semantic changes to one component imply semantic changes to other components in the ontology.
Third, we propose a validity identification approach that employs locally maintaining a hash value of the semantic view at the data
instance.
© 2008 Elsevier B.V. All rights reserved.
0950-5849/$ - see front matter © 2008 Elsevier B.V. All rights reserved.
doi:10.1016/j.infsof.2008.01.004
84 L. Qin, V. Atluri / Information and Software Technology 51 (2009) 83–97
among concepts and properties. Research on detecting structural changes includes PromptDiff [14,17], and research on detecting semantic changes includes SemDiff [21].

Understanding the semantic changes to an ontology is crucial to the functioning of its data instances and dependent ontologies. Ontologies present semantics for interpreting data instances. If an ontology is substituted by a new version, its data instances may become non-interpretable. This is because the ontology components they comply with may not be available in the new version, or the new version brings an incorrect interpretation to the data instances. Such a mismatch in the compliance may lead to mistaken decisions and actions. Secondly, the change has far-reaching effects on any dependent ontology of the changed ontology, and in turn on that ontology's data instances and dependent ontologies. When an ontology imports and extends another ontology for reusability and interoperability, it is semantically dependent on what is defined in the imported ontology. Therefore, if the imported ontology has changed, the dependent ontology may not provide valid meaning for interpreting its data instances. Since changes to an ontology are autonomous, its data instances and dependent ontologies may become invalidated. In this paper, we focus on the validity issue of data instances due to ontological evolution. Similar approaches can be applied to the validity of dependent ontologies as well.

While the author of a data instance needs to ensure the validity of the data instance, the consumers need to be protected from erroneous interpretation caused by ontology evolution. As an example, the state or local department of public health may publish guidelines and regulations on disease control. If these data are presented as semantic data instances, these government agencies will need to ensure that they are valid data instances. On the other hand, hospitals, clinics and doctors who are consumers of these data also need these data to be valid and interpretable in order to comply with the regulations and policies. Existing research is restricted to studying the impact of structural changes on data preservation [12], or the validity of subsumption relations [9]. Moreover, existing work discusses ontology-level compatibility by assuming the entire ontology affects the validity [4,5]. Therefore, there is a compelling need for a close examination of the validity of data instances when their ontologies evolve.

By structural validity against an ontology, we mean that data instances structurally comply with the ontology. A StructValid algorithm is used to evaluate the structural validity of a data instance. Our notion of semantic validity of data instances has structural validity as a prerequisite. Furthermore, the semantic validity of data instances determines whether these instances can be correctly interpreted through the ontology. For instance, if a concept in the ontology becomes more specialized, then instances which are semantically valid for the old version cannot be interpreted through the new version; in other words, they may not be semantically valid for the new version. Therefore, the semantic changes between two ontology versions affect the semantic validity of a data instance, and we propose a SemValid algorithm to evaluate it. While the structural validity is evaluated based on what is visible in an ontology, the semantic validity cannot be evaluated in this way. As a result, though the structural validity of a data instance can be evaluated with respect to an ontology, the semantic validity of a data instance is defined as a function of the changes between two versions. Given a semantically valid data instance, we would like to determine whether a set of ontology changes has changed the semantics of the data instance.

Since a data instance instantiates only part of an ontology, not every component in the ontology has an impact on its validity. On the other hand, if a semantic change to a component of the ontology implies semantic changes to the components instantiated by a data instance, then this semantic change will ultimately affect the semantic validity of this data instance. Therefore, in addition to the instantiated components, other components may be significant to the validity of the data instance as well. We propose semantic view as a subset of the ontology, and demonstrate that the semantic view rather than the entire ontology is responsible for the validity of a data instance. Our definition of semantic view is in line with ontology view [16] in that it is ‘a portion of an ontology’. However, our proposed semantic view for a specific data instance is a ‘view’ of the ontology whose changes directly or indirectly affect its semantic validity. We show that the semantic view can be generated based on the implication analysis, which specifies how semantic changes to one component imply semantic changes to other components. We further propose a validity identification approach in which a data instance carries a hash value of its semantic view when it is created. By simply re-computing the hash value of the semantic view and comparing it with that stored by the data instance, one can deduce whether the ontology has changed silently in a way which may affect the semantic validity of the data instance.

This paper is organized as follows. Section 2 presents the preliminaries on ontologies and instances. Section 3 discusses the structural changes and semantic changes to ontologies. In Section 4, we present the implication analysis which is the key to identifying implicit semantic changes from explicit ones and generating the semantic view. Section 5 presents the structural and semantic validity of data instances, and algorithms to evaluate them. In Section 6, the semantic view along with its generation is presented. Implementation of our approach is discussed in Section 7. Related work on detecting changes to ontologies and the consequences of the ontology changes is presented in Section 8. Section 9 presents our conclusions and an insight into our future work.

2. Preliminaries

In this section, we present the preliminaries on ontologies and their instances.
[Fig. 1 (diagram): example ICU ontology, old version. Recoverable labels: ICU, part_of, person (name, age, address), equipment (dimensions), subclassOf, patient, health care personnel, ID, insurance_type (insurance_plan), billing_address, code, salary, task_performed, treatment device, general equipment, monitors, monitored_by, cardinality(ID)=1, doctor, nurse, monitoring device, channel_number, electrocardiograph.]
[Fig. 2 (diagram): example ICU ontology, new version. Recoverable labels: ICU, part_of, person (name, age, task_performed, address), equipment (dimensions), subclassOf, doctor, nurse, specialty, electrocardiograph.]
mining whether the new property range is more generalized or specialized. Of course, each of the following semantic changes is caused and reflected by certain structural changes we have defined.

Types of semantic changes to ontologies. Given two ontology versions $o$ and $o'$, the set of semantic changes between them, denoted as $\Delta(o, o') = \{\Delta_1, \Delta_2, \ldots, \Delta_n\}$, may include:

(1) Changes to the generalization of a concept, $\Delta_{cg}$: A concept becomes more generalized ($\Delta_{cg}^{\uparrow}$) or more specialized ($\Delta_{cg}^{\downarrow}$) if it moves up or down in the concept hierarchy, respectively. Otherwise, the generalization of a concept becomes incomparable ($\Delta_{cg}^{\dagger}$) if it has changed but has become neither more generalized nor more specialized.
Ontology changes which cause a concept to be more specialized or incomparable will invalidate its instances due to the semantic change to the concept. On the other hand, if a concept becomes more generalized, though the semantic change to the concept does not invalidate any data instances, there is still a possibility that some of its instances become invalidated as a result of the semantic changes to its properties.

(2) Changes to the descriptiveness of a concept, $\Delta_{cd}$: A concept becomes more descriptive ($\Delta_{cd}^{\uparrow}$) or less descriptive ($\Delta_{cd}^{\downarrow}$) if datatype properties or instance-level object properties are added to or deleted from the concept, respectively.
Ontology changes which make a concept less descriptive may invalidate some instances. For instance, if a property is deleted from a concept, all the data instances that have instantiated this property will become invalid.

(3) Changes to the generalization of a property, $\Delta_{pg}$, with its domain or range: A property becomes more generalized ($\Delta_{pg}^{\uparrow}$) or more specialized ($\Delta_{pg}^{\downarrow}$) with its domain (or range) if its domain (or range) becomes a more generalized or specialized concept, respectively. Otherwise, its domain or range becomes incomparable ($\Delta_{pg}^{\dagger}$) if it is modified in a way which is neither more generalized nor more specialized. Note that the semantic change to the domain or range of a property can be either explicit (its domain or range has changed to a different concept) or implicit (its domain or range concept has changed implicitly).
Ontology changes which make a property more specialized or incomparable may invalidate some instances. For instance, suppose the range of a property is changed from concept $c_1$ to $c_2$, where $c_2$ is more specialized than $c_1$. Instances which have an instance of $c_2$ as the property value (any instance of $c_2$ is also an instance of $c_1$) are valid both before and after the change. However, those instances with an instance of $c_1$ (but not $c_2$) as the property value will not be valid after the semantic change.

(4) Changes to restrictiveness of a property, $\Delta_{pr}$: A property becomes more restrictive ($\Delta_{pr}^{\uparrow}$) or less restrictive ($\Delta_{pr}^{\downarrow}$) if changes to the restrictions of the property further restrict or extend its semantics, respectively.
Ontology changes which make a property more restrictive may invalidate some instances which violate the new restrictions.

(5) Changes to axiom over a property, $\Delta_{pa}$: The axiom over the property becomes more extended ($\Delta_{pa}^{\uparrow}$) or more restricted ($\Delta_{pa}^{\downarrow}$) if changes to its axiom further extend or restrict its semantics, respectively. An example axiom which extends the meaning of the property is the declaration of a property to be transitive, which leads to additional reasoning without invalidating the reasoning prior to the declaration. On the other hand, the declaration of a property to be functional will invalidate the instantiations where the value of this property is not unique. It does not introduce any new reasoning results, either. As a result, this change will restrict the meaning of the property.

Note that each of the above semantic changes is the result of one single structural change, and they may not be able to represent the semantic change caused by multiple structural changes altogether. For instance, adding a property to a concept makes the concept more descriptive and deleting another property from it makes it less descriptive. Another example is that the domain of a property can become more specialized while its range becomes more generalized. We do not provide a corresponding semantic change to describe the combinational effects of these changes. The reason is that each ontology change, as well as its impact on data instances, can be examined separately. Using the same example, adding a property to a concept does not invalidate any data instances while deleting another property can cause some data instances to be invalid. We need to look at each change separately during validity evaluation.

For the ontology versions in Figs. 1 and 2, the following explicit semantic changes are detected based on the structural changes: concept ‘monitoring device’ has become more generalized since its ‘subclassOf’ property has taken a more generalized concept as its range; property ‘code’ of ‘health care personnel’ has become more restrictive; addition of property ‘specialty’ to concept ‘doctor’ has made it more descriptive; property ‘task_performed’ has become more generalized since it is moved to a more generalized concept.

One may argue that concept ‘monitoring device’ has become more specialized in some sense since it can no longer include any instances of ‘treatment device’ while previously some instances of ‘treatment device’ could have been monitoring devices. Our point is that ‘monitoring device’ potentially allows more instances after the change. For instance, ‘monitoring device’ can only include some instances of ‘treatment device’ before the change; after the change, it can include instances of ‘equipment’ which may not belong to ‘treatment device’.

We suggest the changes between ontologies be represented using the same ontology language as the ontologies themselves. Ontologies for annotating these changes, such as that in [8], can be developed. This is out of the scope of this paper.

need to be derived from explicit semantic changes by using implication analysis.

To facilitate the implication analysis, we introduce the notion of isosem. An isosem is simply a cluster of semantically dependent properties of a concept in an ontology.

Definition 1 (Isosem). Given a concept $c \in C(o)$, we define an isosem $S(c)$ as a set of properties $S(c) = \{p_1, \ldots, p_k\}$ such that $p_1, \ldots, p_k \in P(c)$ and, for each $p_i \in S(c)$, either $p_i$ is the only property in $S(c)$ or $p_j \in S(c)$ if $p_j$ is an equivalent property or sub-property (specialization) of $p_i$ (or vice versa).

Isosems are categorized into definitional isosems and descriptive isosems based on the type of properties they contain, as defined below.

Definition 2 (Definitional isosem). A definitional isosem of concept $c$, denoted as $S^f(c)$, is an isosem consisting of class-level object properties.

Definition 3 (Descriptive isosem). A descriptive isosem of concept $c$, denoted as $S^s(c)$, is an isosem consisting of instance-level object properties or datatype properties.

Fig. 3 shows an example of how isosems of a concept ‘patient’ in Fig. 1 are abstracted from the properties. This concept has one definitional isosem, $S^f$(patient), and four descriptive isosems, $S^s_1$(patient), $S^s_2$(patient), $S^s_3$(patient) and $S^s_4$(patient), each of which is indicated as an inner circle with its constituent properties enclosed. Note that properties ‘insurance_plan’ and ‘insurance_type’ are clustered into the same isosem since they are equivalent properties. If these two properties have any subproperties such as ‘primary insurance’ and ‘secondary insurance’, then all these properties will be clustered into the same descriptive isosem.

[Fig. 3 (diagram): isosems of concept ‘patient’. Recoverable labels: $S^f$ (subclassOf, person); descriptive isosems $S^s_1$, $S^s_3$, $S^s_4$ among the four; properties ID, insurance_plan, insurance_type, billing_address (String-valued) and monitored_by (monitoring device).]
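The clustering described in Definitions 1 to 3 can be illustrated with a short Python sketch. This is not the authors' implementation: the function name `cluster_isosems`, the `kinds` mapping (property name to `"class"` or `"instance"`) and the `related` pair list are hypothetical, and the sketch assumes that only properties of the same kind are ever merged. It uses a union-find structure to group properties linked by equivalence or sub-property relations, then separates definitional from descriptive isosems.

```python
from collections import defaultdict


def cluster_isosems(kinds, related):
    """Cluster the properties of one concept into isosems.

    kinds:   dict mapping property name -> "class" (class-level object
             property) or "instance" (instance-level object or datatype
             property). Hypothetical encoding.
    related: iterable of (p, q) pairs meaning p and q are equivalent or
             one is a sub-property of the other.
    Returns (definitional_isosems, descriptive_isosems) as sorted lists.
    """
    parent = {p: p for p in kinds}

    def find(p):
        # Follow parent links to the cluster representative (path halving).
        while parent[p] != p:
            parent[p] = parent[parent[p]]
            p = parent[p]
        return p

    for p, q in related:
        # Assumption: only properties of the same kind share an isosem.
        if p in parent and q in parent and kinds[p] == kinds[q]:
            parent[find(p)] = find(q)

    groups = defaultdict(list)
    for p in kinds:
        groups[find(p)].append(p)

    definitional = sorted(sorted(g) for g in groups.values() if kinds[g[0]] == "class")
    descriptive = sorted(sorted(g) for g in groups.values() if kinds[g[0]] == "instance")
    return definitional, descriptive


# Properties of 'patient' as named in the running example.
patient_kinds = {
    "subclassOf": "class",
    "ID": "instance",
    "insurance_plan": "instance",
    "insurance_type": "instance",
    "monitored_by": "instance",
    "billing_address": "instance",
}
patient_related = [("insurance_plan", "insurance_type")]  # equivalent properties

definitional, descriptive = cluster_isosems(patient_kinds, patient_related)
print(definitional)
print(descriptive)
```

With this input the sketch yields one definitional isosem and four descriptive isosems, with ‘insurance_plan’ and ‘insurance_type’ sharing a cluster, which matches the count given for ‘patient’ in the text.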
[Fig. 4 (diagram, panels (a)–(g)): dependencies and independencies between the definitional isosems $S^f$ and descriptive isosems $S^s$ of concepts $c_i$, $c_j$, $c_k$, $c_l$, $c_m$, $c_n$ and their properties $p_i$, $p_j$, $p_k$, $p_l$.]
(3) Inter-isosem Independency: Changes to the properties within an isosem imply no changes to the other isosems of the same concept, if they belong to the same category. Formally, $\Delta p_i \nRightarrow \Delta p_j$, if $p_i \in S^f_i(c_i)$ and $p_j \in S^f_j(c_i)$, or $p_i \in S^s_i(c_i)$ and $p_j \in S^s_j(c_i)$, as indicated in Fig. 4(c).
The reason for this independency is that properties of these two isosems would have been clustered into the same isosem if semantic dependency existed.

(4) Inter-concept Def–Def Dependency: If the property in a definitional isosem has changed, the property in the definitional isosem of the child concepts has changed implicitly, which is shown in Fig. 4(d). Formally, $\Delta p_i \Rightarrow \Delta p_j$, if $p_i \in S^f(c_i)$ and $p_j \in S^f(c_j)$, where $c_j$ is a child concept of $c_i$.
For instance, in the ontology versions shown in Figs. 1 and 2, property ‘subclassOf’ in the definitional isosem of concept ‘monitoring device’ has changed. Concept ‘electrocardiograph’ is a child of ‘monitoring device’. As a result, an implicit change is implied to the definitional isosem of concept ‘electrocardiograph’.

(5) Inter-concept Def–Des Dependency: If the definitional isosem of a concept has changed, the range of each property in the descriptive isosem of another concept has changed implicitly, if this property takes the former concept as its range and belongs to a descriptive isosem of the latter concept. This is depicted in Fig. 4(e). Formally, $\Delta p_i \Rightarrow \Delta p_j$, if $p_i \in S^f(c_i)$ and $p_j \in S^s(c_j)$, where $p_j(c_j) = c_i$. For instance, concept ‘patient’ has a descriptive isosem containing property ‘monitored_by’, which takes concept ‘monitoring device’ as its range. Then changes to the definitional isosem of ‘monitoring device’ imply changes to this descriptive isosem of concept ‘patient’.

(6) Inter-concept Des–Des Dependency: If two descriptive isosems contain semantically dependent properties, at least one of the isosems is dependent on the other. That is, $\Delta p_i \Rightarrow \Delta p_j$ where $p_i \in S^s(c_i)$, $p_j \in S^s(c_j)$, $a_i(p_i) = p_j$ (or $a_j(p_j) = p_i$), which is shown in Fig. 4(f).
In Fig. 1, the ‘billing_address’ property of concept ‘patient’ is a sub-property of the ‘address’ property of concept ‘person’, and concept ‘person’ is the parent of concept ‘patient’. Then, changes to the isosem containing the ‘address’ property imply changes to the isosem containing ‘billing_address’. As another example, the ‘monitored_by’ property of ‘patient’ is the inverse of the ‘monitors’ property of ‘monitoring device’, each of which constitutes an isosem. Then changes to one imply changes to the other.

(7) Inter-concept Independency: With Inter-concept Des–Des Dependency as the exception, if concept $c_i$ is indirectly associated with $c_l$ through $c_k$, then the semantics of $c_i$ is independent of changes to concept $c_l$ if the isosem between $c_i$ and $c_k$ does not change. Generally, as shown in Fig. 4(g), $\Delta p_i \nRightarrow \Delta p_l$ where $p_i \in S^s_i(c_i)$, $p_l \in S^s_l(c_l)$, $p_j \in S^s_j(c_i)$ and $p_j(c_i) = c_k$, $p_k \in S^s_k(c_k)$ and $p_k(c_k) = c_l$.
For example, ‘person’ and ‘monitoring device’ are indirectly associated concepts, so changes to the descriptive isosems of ‘monitoring device’ do not affect any descriptive isosems of ‘person’.

5. Structural and semantic validity

In this section, we show how changes detected to an ontology can be used in evaluating the structural validity and semantic validity of a data instance.

5.1. Structural validity

Structural changes to an ontology affect the structural validity of its data instances. In this section, we define the structural validity and propose how it can be evaluated.

Given a data instance and an ontology, we can evaluate whether this data instance is structurally valid with respect to this ontology. Obviously, if a data instance is structurally valid against the old version of an ontology but not structurally valid against the new version, then the structural changes to the ontology invalidate the data instance. Alternatively, if a data instance is structurally valid against one version of the ontology, its structural validity against a different ontology version can also be evaluated based on the structural changes between the two versions.

Definition 4 (Structural validity of data instances). A data instance $i \in I(c)$ is structurally valid against an ontology $o$, if the following conditions hold:

(1) Concept $c$ exists in ontology $o$, i.e. $c \in C(o)$;
(2) Each property instantiated by instance $i$ belongs to the instantiated concept $c$ by either explicit specification or inheritance in the ontology;
(3) For each property instantiated by instance $i$, the value taken by the property belongs to the range of the property and satisfies the restrictions of the property.

As an example, if a property $p$ is specified to concept $c_1$ in the old version and moved to concept $c_2$ in the new version, then the data instances of concept $c_1$ with property $p$ instantiated are not structurally valid against the new version of the ontology unless $c_1$ can inherit property $p$ from $c_2$.

Using our example ICU ontology, for an instance of ‘patient’ with property ‘monitored_by’ instantiated, the instance is valid if this property takes a data instance of ‘monitoring device’ or ‘electrocardiograph’ as its value. Note that we only require that the value of the property is asserted to be an instance of concept ‘monitoring device’ or ‘electrocardiograph’. However, whether this property value is itself structurally or semantically valid does not matter to evaluating the validity of this instance of ‘patient’. Of
course, this data instance of ‘patient’ will not be structurally valid if it takes an instance of concepts such as ‘general equipment’.

Algorithm 1 gives the detailed steps to evaluate the structural validity of a data instance.

Algorithm 1
Evaluating the structural validity of data instances [StructValid($i$, $o$)]
Require: Data instance $i \in I(c)$; $i$ instantiates property $p$, ontology $o$
StructValid($i$, $o$) = true {default value}
{Evaluating the structural validity of a data instance $i$ against an ontology $o$}
if $c$ is a concept in ontology $o$ then
    for each $p$ instantiated by instance $i$ do
        if $p$ does not belong to concept $c$ or any of its superclass or equivalent concepts, OR the value taken by $p$ does not belong to the range of $p$ or does not satisfy the restrictions then
            StructValid($i$, $o$) = false
        end if
    end for
else
    StructValid($i$, $o$) = false
end if

Assume a data instance of the concept ‘patient’ in the ontology shown in Fig. 1 has properties ‘ID’, ‘name’, ‘insurance_type’ and ‘monitored_by’ instantiated. If concept ‘patient’ or any of these instantiated properties is deleted from the ontology, then this data instance will not be structurally valid against the new version of the ontology.

5.2. Semantic validity

Data instances can be interpreted correctly only if they are semantically valid with respect to the ontology they comply with. Their semantic validity needs to be evaluated when semantic changes take place to the ontology that they comply with. Given that a data instance is semantically valid against an ontology, we are interested in finding out how semantic changes to this ontology affect the validity of its data instance.

Definition 5 (Semantic validity of data instances). A data instance $i \in I(c)$ where $c \in C(o)$, is semantically valid against $o'$ where $c' \in C(o')$, if the following conditions hold:

(1) Data instance $i$ is structurally valid against $o'$, i.e. StructValid($i$, $o'$) = true;
(2) Concept $c$ does not get more specialized or incomparable;
(3) For each instantiated property $p$, the axiom over $p$ does not get more restricted.

The structural validity is the prerequisite for the semantic validity. However, the structural validity does not necessarily lead to the semantic validity. For instance, structural validity warrants that the data instance can instantiate a concept. However, if this concept has become more specialized, this data instance may no longer be semantically valid. Though one should be able to determine whether an instance is structurally valid for an ontology irrespective of other ontology versions, the semantic validity of a data instance can only be evaluated given the semantic changes between two versions of the ontology, on the basis of the structural validity.

In Algorithm 2, we show how to evaluate whether a data instance $i$ is semantically valid against a new version of an ontology $o'$, given that it is semantically valid against its old version $o$ and the semantic changes between the two versions, $\Delta(o, o')$.

Algorithm 2
Evaluating the semantic validity of data instances [SemValid($i$, $o'$)]
Require: Data instance $i \in I(c)$; $i$ instantiates property $p$, ontology $o$ and $o'$; $\Delta(o, o')$
SemValid($i$, $o'$) = true {default value}
{Evaluating the semantic validity of a data instance $i$ against ontology $o'$}
if StructValid($i$, $o'$) = true then
    if $c'$ has not become more specialized than $c$ or incomparable then
        for each $p$ instantiated by instance $i$ do
            if axiom over $p$ gets more restricted then
                SemValid($i$, $o'$) = false
            end if
        end for
    else
        SemValid($i$, $o'$) = false
    end if
else
    SemValid($i$, $o'$) = false
end if

For the ontology versions shown in Figs. 1 and 2, consider the semantic validity of a data instance of concept ‘monitoring device’. The semantic changes between the ontology versions include that this concept has become more generalized. In this case, a data instance which is semantically valid against the old version will be semantically valid against the new version, unless it becomes invalidated due to semantic changes to the properties. On the other hand, if a concept becomes more specialized, then instances of this concept which are semantically valid for the old version will be determined by the algorithm as semantically invalid for the new version and may be subject to further human analysis.

Instances are inter-related; however, as we mentioned in discussing structural validity, a semantically invalid instance does not invalidate its related instances simply because they take a semantically invalid instance as the property value. Evaluating semantic validity will become impossible otherwise.

6. Validation using semantic view

Note that each data instance only instantiates part of an ontology. Therefore, not all the changes to the ontology have an impact on a particular data instance. Starting with the ontology against which a data instance is semantically valid, we call the part of this ontology whose semantic change affects this data instance the semantic view of the ontology for this data instance.

The significance of the semantic view is that the semantic validity of a data instance depends only on a portion of the ontology, i.e. the semantic view, rather than the entire ontology. As a result, changes to components outside the semantic view will not affect the semantic validity of the data instance and thus can be ignored during the evaluation.

For a specific data instance, its semantic view in the ontology not only includes the components that it instantiates, but also any other components whose changes implicitly affect the instantiated components based on the implication analysis presented in Section 4.1.

6.1. Semantic view

Definition 6 (Semantic view for a data instance). The semantic view of an instance $i_i$ in its ontology $o_i$, denoted as $SV(i_i, o_i)$, consists of all the components in $o_i$ whose changes may lead to SemValid($i_i$, $o'_i$) = false.

Identifying the semantic view needs the knowledge of implication analysis, i.e. the dependencies and independencies between the components in the ontology. Below is how the semantic view can be generated for a data instance.

6.1.1. Identifying the semantic view for a data instance

Given the instantiated concept, $c$, for a data instance $i$, and all the instantiated properties of $i$, $P(i)$, we perform the following steps.

(1) Find the equivalent concepts of $c$, $eq(c)$, and the superclass concepts of $c$, $sup(c)$. Essentially, we apply the Inter-concept Def–Def Dependency and Inter-isosem Dependency.
For instance, consider the ontology in Fig. 1 for a data instance of concept ‘patient’ with all the properties of ‘patient’ instantiated. Concept ‘person’ is included in this step for the semantic view. This is because semantic changes to the properties in the definitional isosem of ‘person’ would imply semantic changes to those in the definitional isosem of ‘patient’ based on Inter-concept Def–Def Dependency. Based on Inter-isosem Dependency, changes to the properties in the definitional isosem of ‘patient’ imply semantic changes to its datatype properties and instance-level object properties, which are instantiated by this data instance of ‘patient’.

(2) For each object property instantiated by $i$ where $c_k$ is the range of this property, find the equivalent concepts of $c_k$, $eq(c_k)$, and their superclass concepts, $sup(c_k)$. That is, we employ the Inter-concept Def–Des Dependency and Inter-concept Def–Def Dependency.
Consider the same data instance of ‘patient’. Based on Inter-concept Def–Des Dependency, semantic changes to the properties in the definitional isosem of ‘monitoring device’ imply changes to the isosem consisting of property ‘monitored_by’ of the ‘patient’ concept. Therefore, concept ‘monitoring device’ should be included in the semantic view. Furthermore, changes to the properties in the definitional isosem of the superclass concepts including ‘treatment device’ and ‘equipment’ imply changes to those of ‘monitoring device’ based on Inter-concept Def–Def Dependency. Concepts obtained in this step include concepts ‘monitoring device’, ‘treatment device’, and ‘equipment’.

(3) For each property $p$ instantiated by $i$, find the equivalent properties of $p$, $eq(p)$, their inverse properties, denoted as $inv(p)$, and their superproperties, $sup(p)$. These properties may belong to the same concept as $p$ or any concept in the set obtained in steps 1 and 2. This is based on Intra-isosem Dependency and Inter-concept Des–Des Dependency.
For our example, the properties obtained in this step include property ‘address’ where subpropertyOf(billing_address) = address and property ‘monitors’ where inverseOf(monitors) = monitored_by. Semantic changes to these properties imply changes to those of ‘patient’ based on Inter-concept Des–Des Dependency.

(4) The semantic view of ontology $o$ for data instance $i$ consists of the concepts obtained in steps 1 and 2 (along with the class-level object properties among them), and the properties obtained in step 3 along with their restrictions and axioms.

By applying all the dependencies in our implication analysis, all the ontology components in the semantic view for a specific data instance are relevant to the semantic validity of the data instance. All the other components of the ontology, based on the Inter-isosem Independency and Inter-concept Independency, do not affect the semantic validity of this instance and therefore are not part of the semantic view. This naturally leads to Theorem 1.
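The view-generation steps of Section 6.1.1, together with the hash comparison outlined in the introduction, can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the dictionary encoding of the ontology (`superclass`, `props`), the function names, and the use of SHA-256 over a JSON serialization are all hypothetical choices. The example data is loosely based on the ICU ontology of Fig. 1 and covers only the relations named in the text.

```python
import hashlib
import json


def concept_closure(c, superclass, equiv_concepts):
    """Return c, its equivalent concepts, and all their transitive superclasses."""
    out = {c} | set(equiv_concepts.get(c, ()))
    for member in list(out):
        while superclass.get(member) is not None:
            member = superclass[member]
            out.add(member)
    return out


def related_properties(p, props):
    """Return p, its equivalents, and their inverses and superproperties."""
    related = {p} | set(props.get(p, {}).get("equiv", ()))
    for q in list(related):
        info = props.get(q, {})
        if info.get("inverse"):
            related.add(info["inverse"])
        sup = info.get("super")
        while sup is not None:
            related.add(sup)
            sup = props.get(sup, {}).get("super")
    return related


def semantic_view(concept, instantiated, superclass, equiv_concepts, props):
    concepts = concept_closure(concept, superclass, equiv_concepts)      # step 1
    properties = set()
    for p in instantiated:
        rng = props[p].get("range")                                      # step 2
        if rng is not None:
            concepts |= concept_closure(rng, superclass, equiv_concepts)
        properties |= related_properties(p, props)                       # step 3
    return concepts, properties                                          # step 4


def view_hash(concepts, properties, superclass, props):
    """Hash the (simplified) definitions of everything in the view."""
    payload = {
        "concepts": {c: superclass.get(c) for c in concepts},
        "properties": {p: props.get(p, {}) for p in properties},
    }
    data = json.dumps(payload, sort_keys=True)
    return hashlib.sha256(data.encode("utf-8")).hexdigest()


# Hypothetical encoding of part of the ICU example.
superclass = {
    "patient": "person",
    "electrocardiograph": "monitoring device",
    "monitoring device": "treatment device",
    "treatment device": "equipment",
}
props = {
    "ID": {"domain": "patient"},
    "insurance_type": {"domain": "patient", "equiv": ["insurance_plan"]},
    "insurance_plan": {"domain": "patient", "equiv": ["insurance_type"]},
    "monitored_by": {"domain": "patient", "range": "monitoring device",
                     "inverse": "monitors"},
    "monitors": {"domain": "monitoring device", "range": "patient",
                 "inverse": "monitored_by"},
    "billing_address": {"domain": "patient", "super": "address"},
    "address": {"domain": "person"},
    "code": {"domain": "health care personnel"},
}
inst = ["ID", "insurance_type", "monitored_by", "billing_address"]

concepts, properties = semantic_view("patient", inst, superclass, {}, props)
stored_hash = view_hash(concepts, properties, superclass, props)
```

In this sketch a data instance would keep `stored_hash` when it is created; recomputing the view and its hash against a later ontology version and comparing the two values indicates whether anything inside the view has changed. A change to ‘code’, which lies outside the view of this ‘patient’ instance, leaves the hash untouched, while moving ‘monitoring device’ under a different superclass alters it.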
Theorem 1. [Completeness and minimality of the semantic view] The semantic view for a data instance includes all (completeness) and only (minimality) the ontology components which can affect the semantic validity of the data instance.

Proof. Based on our implication analysis, ontology components in the semantic view are identified through the following dependencies: Intra-isosem Dependency, Inter-isosem Dependency, Inter-concept Def–Def Dependency, Inter-concept Def–Des Dependency, and Inter-concept Des–Des Dependency. Other ontology components are excluded from the semantic view based on the following independencies: Inter-isosem Independency and Inter-concept Independency. Consider a data instance $i$ which instantiates concept $c_i$ and property $p$ from ontology $o$ ($c_j$ is the range of property $p$ if $p$ is an object property). □

6.1.2. Proof for minimality

Based on Inter-isosem Dependency and Inter-concept Def–Def Dependency, $c_i, eq(c_i), sup(c_i) \in SV(i, o)$ (including the class-level object properties among them). The Inter-isosem Dependency indicates that the definitional isosem of concept $c_i$ affects its descriptive isosems, consisting of instance-level properties instantiated by any data instance of $c_i$. Also, based on Inter-concept Def–Def Dependency, $eq(c_i)$ and $sup(c_i)$ have to be included in the semantic view since changes to these concepts may imply changes to $c_i$.

Based on Inter-concept Def–Des Dependency and Inter-concept Def–Def Dependency, $c_j, eq(c_j), sup(c_j) \in SV(i, o)$ (including the class-level object properties among them). $c_j$ is in the semantic view based on Inter-concept Def–Des Dependency since changes to the definitional isosem of $c_j$ imply changes to $p$. Again, $eq(c_j)$ and $sup(c_j)$ are included in the semantic view based on Inter-concept Def–Def Dependency.

Based on Intra-isosem Dependency and Inter-concept Des–Des Dependency, $p, eq(p), sup(p), inv(p) \in SV(i, o)$ (including their restrictions and axioms). In addition to property $p$, other related properties including $eq(p)$, $sup(p)$, and $inv(p)$ are in the semantic view based on Intra-isosem Dependency if they have concept $c_i$ as the domain, or based on Inter-concept Des–Des Dependency if they have a different concept as the domain.

By applying all the dependencies, we have shown that all the ontology components in the semantic view are relevant to the semantic validity of the data instance. In other words, no component that is not relevant has been included in the semantic view, and therefore we prove the minimality.

6.1.3. Proof for completeness

On the other hand, any other concepts and properties, along with their restrictions and axioms, $\notin SV(i, o)$. This is because any other isosems of concept $c_i$ or of any other concept have no impact on the data instance, based on Inter-isosem Independency or Inter-concept Independency.

This means that the semantic view has included all the ontology components relevant to the semantic validity of the data instance, and therefore we prove the completeness of the semantic view.
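The identification steps of Section 6.1.1, as summarized by the memberships used in the proof, can be illustrated with a minimal Python sketch. Everything about the data model here is an assumption made for illustration: the `Ontology` class, its dictionary fields, and the treatment of $sup(\cdot)$ as the transitive closure of direct superclass (or superproperty) links are hypothetical, and restrictions, axioms, and class-level object properties are omitted. The toy ontology is only loosely modeled on the ‘patient’ example.

```python
from dataclasses import dataclass, field


@dataclass
class Ontology:
    """Toy ontology; every mapping is an assumed, simplified encoding."""
    equivalent: dict = field(default_factory=dict)  # concept or property -> set of equivalents
    superclass: dict = field(default_factory=dict)  # concept -> set of direct superclasses
    superprop: dict = field(default_factory=dict)   # property -> set of direct superproperties
    inverse: dict = field(default_factory=dict)     # property -> set of inverse properties
    range_of: dict = field(default_factory=dict)    # object property -> range concept


def closure(start, edges):
    """Return every node reachable from `start` by following `edges`."""
    seen, stack = set(), list(edges.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(edges.get(node, ()))
    return seen


def semantic_view(onto, concept, properties):
    """Collect the concepts and properties of the semantic view (sketch)."""
    concepts = {concept}
    props = set(properties)
    # Range concepts c_j of the instantiated object properties.
    for p in properties:
        if p in onto.range_of:
            concepts.add(onto.range_of[p])
    # eq(.) and sup(.) for the instantiated concept and the range concepts.
    for c in list(concepts):
        concepts |= onto.equivalent.get(c, set())
        concepts |= closure(c, onto.superclass)
    # eq(p), sup(p) and inv(p) for every instantiated property.
    for p in properties:
        props |= onto.equivalent.get(p, set())
        props |= closure(p, onto.superprop)
        props |= onto.inverse.get(p, set())
    return concepts, props


# Illustrative data only, not a reproduction of Fig. 1.
onto = Ontology(
    superclass={"patient": {"person"}, "monitoring_device": {"equipment"}},
    superprop={"billing_address": {"address"}},
    inverse={"monitored_by": {"monitors"}},
    range_of={"monitored_by": "monitoring_device"},
)
concepts, props = semantic_view(
    onto, "patient", ["ID", "billing_address", "monitored_by"]
)
print(sorted(concepts))
print(sorted(props))
```

In this sketch, a change to a concept such as a hypothetical ‘nurse’ would not touch the returned sets, which mirrors the role of the independencies in keeping the semantic view smaller than the whole ontology.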
[Figure: ontology diagram with the concepts ICU, person, equipment, patient, health care personnel, treatment device, general equipment, monitoring device and electrocardiograph, linked by part_of, subclassOf, monitors and monitored_by relations.]
Fig. 5. The semantic view of the ontology in Fig. 1 for a data instance of concept ‘patient’.
[Figure: ontology diagram with the concepts ICU, person, equipment, health care personnel, Patient, doctor, nurse, monitoring device, treatment device, general equipment and electrocardiograph, linked by part_of, subclassOf, monitors and monitored_by relations.]
Fig. 6. The semantic view of the ontology in Fig. 2 for a data instance of concept ‘patient’.
The semantic view thus obtained for the ‘patient’ instance is shown in Figs. 5 and 6 as the components within the region of the curve. One can see that the semantic view for this data instance is different when the ontology evolves from Fig. 1 to Fig. 2.

Two instances will share the same semantic view if they instantiate the same concept and properties in the same ontology. Also, the semantic view for a data instance will change when the instantiated components of the data instance change. However, changes to the specific values of the properties do not affect the semantic view if the same properties are still instantiated by the data instance.

As a note, the semantic view for a data instance will include only the components relevant to the validity of the data instance. Considering the independencies we presented in Section 4.1, the semantic view will not end up being the entire ontology. So, identifying the semantic view becomes even more significant when the ontology is much larger than the semantic view.

6.2. Change identification using semantic view

Our previous discussion of validity evaluation is based on the assumption that both versions of the ontologies are available. However, in a distributed and decentralized environment such as the Semantic Web, there cannot be such guarantees since the old version of an ontology may be substituted by a new version and no notification can be sent to entities where its data instances reside. In such a case, the authors of data instances are concerned that the semantics of data instances cannot be correctly interpreted based on the new version. Therefore, they would like to identify whether the ontology has silently changed in a way that affects the interpretation of their data instances. On the other hand, from the perspective of the consumers of these data instances, an interpretation which is not intended by the authors of data instances is definitely undesirable. Therefore, the question is, how can the silent changes to an ontology be identified? In this section, we present our approach of using the semantic view to address this problem.

When a data instance is created, it is semantically valid with respect to its ontology. We suggest each data instance carry a hash value of its semantic view at its creation. When the consumers attempt an interpretation of this data instance, they identify its semantic view based on the current version of the ontology, compute its hash value and compare the hash value with that stored locally by the data instance: if these two hash values match, it means that the semantic view has not changed and thus the data instance can be interpreted safely with the ontology. When the hash values do not match, we cannot tell exactly whether the data instance is still valid. The reason is that this will require the knowledge of the changes between the two versions, which can be obtained by finding changes between two ontology versions structurally [14] and semantically [17]. When the ontologies do not change frequently, this hash value can provide a quick assurance for the validity of the data instance. The hash value should be based on an unordered hashing of the RDF triples of its semantic view. An existing algorithm for computing the digest of an RDF/OWL graph [24,1,10] can be adopted.

The author of a data instance can further specify how the data instance should be interpreted when the computed hash value for the semantic view does not match the locally stored one. One can specify one of the following two options.
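The hash comparison described in this section can be illustrated with a minimal Python sketch. It is not one of the digest algorithms of [24,1,10]: the order independence is obtained here simply by sorting per-triple SHA-256 digests before hashing them together, which is an assumption made for brevity. The sketch also assumes that the semantic view is already available as a collection of ground (subject, predicate, object) string triples without blank nodes, and the function names and example triples are hypothetical.

```python
import hashlib


def triple_digest(triple):
    """SHA-256 digest of one (subject, predicate, object) triple of strings."""
    h = hashlib.sha256()
    for term in triple:
        data = term.encode("utf-8")
        # A length prefix keeps ("ab", "c", ...) distinct from ("a", "bc", ...).
        h.update(len(data).to_bytes(4, "big"))
        h.update(data)
    return h.digest()


def semantic_view_hash(triples):
    """Hash a set of triples so that their order does not matter."""
    digests = sorted({triple_digest(t) for t in triples})
    h = hashlib.sha256()
    for d in digests:
        h.update(d)
    return h.hexdigest()


def check_instance(stored_hash, current_view_triples):
    """Compare the locally stored hash with the hash of the current view.

    A match means the semantic view is unchanged. A mismatch only says
    that something changed, so validity is reported as unknown.
    """
    if semantic_view_hash(current_view_triples) == stored_hash:
        return "unchanged"
    return "unknown"


# Illustrative triples only (assumed vocabulary).
view_v1 = [
    ("patient", "subClassOf", "person"),
    ("billing_address", "subPropertyOf", "address"),
    ("monitors", "inverseOf", "monitored_by"),
]
stored = semantic_view_hash(view_v1)  # kept with the data instance at creation

# Later, the semantic view is recomputed on the current ontology version.
view_v2 = view_v1 + [("patient", "subClassOf", "insured_person")]
print(check_instance(stored, list(reversed(view_v1))))
print(check_instance(stored, view_v2))
```

Sorting makes this sketch O(N log N) in the number of triples; the mismatch branch deliberately returns "unknown" rather than "invalid", since deciding validity would require the changes between the two versions.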
mine whether the property is transitive or not. These concepts, the taxonomy/associations among them along with their restrictions and axioms are part of the semantic view.

8. Related work

Most related work on detecting changes to ontologies concentrates on finding structural changes between ontology versions. OntoView [6,7] is a web-based ontology versioning management system, in which structural changes are identified by rules. It allows the ontology engineers to compare versions of an ontology, and to specify whether (but not how) these structural changes entail any conceptual implications. PromptDiff [14,15] is a tool that utilizes heuristics for matching components of ontology versions. It integrates different heuristic matchers for comparing ontology versions, where the matchers conform to the monotonicity principle, i.e. no matcher retracts the results of previous matchers or its own results from previous runs. The outcome of the algorithm is a structural diff, consisting of frame pairs which are images of each other from different versions or in which one of them is null. The change types are described as changed, unchanged and isomorphic. Noy and Klein [12] discussed how structural changes in ontologies affect the preservation of their data instances. Research complementary to detecting changes to ontologies can be found in finding similarities between independently developed ontologies for the purposes of merging and aligning ontologies. Examples of such tools include Prompt [13], GLUE [2], etc. Mitra et al. [11] presented an ontology mapping approach using Bayesian Network.

The above work, however, is limited to detecting changes at the structural level, and does not address detecting changes occurring at the semantic level. Detecting changes at the semantic level is crucial since ontologies are considered as a means for presenting semantics. Typically, semantic changes are not clearly distinguishable from structural changes in the existing work. For example, if the range of a property is changed in the new version to a subclass of the concept in the old version, the range of the property, and therefore a certain semantic aspect of the concept owning the property, becomes narrowed. This change should not be described and treated as a structural change only.

Heflin and Hendler [4] presented SHOE, a web-based knowledge representation language that supports multiple versions of ontologies, and discussed how the features of SHOE address ontology versioning. More specifically, the authors analyzed scenarios for semantic subsumption at the ontology level. Compatibility between ontology versions was also discussed in [5] by assuming that the data source that conforms to a specific version of the ontology uses the whole ontology. Though it mentioned the possibility that an incomparable revision may not affect the interpretation of a data source which uses the ontology part unaffected by the revision, it lacks a close examination of the implication within the ontology. The compatibility measured at the ontology level [4,5] is too coarse-grained for the diversity of changes which concurrently occur to an ontology, noting that instantiation is done at a fine-grained level of concepts and properties. Even the limited discussion on describing semantic changes is confined to equivalence and subsumption between concepts. Furthermore, changes to an ontology are examined mainly by considering each individual concept in isolation. As a result, this analysis lacks an examination of the semantics of a concept in relation to other associated concepts and therefore is not capable of analyzing the implication of the semantic changes.

Noy and Klein [12] presented a systematic discussion on the differences between ontology evolution and schema evolution. Klein and Noy [8] described a framework integrated with different representations of the changes between ontology versions and the transformation from one representation to another. An ontology of (structural) change operations was also presented. Klein and Stuckenschmidt [9] discussed how the changes in one ontology affect another ontology which uses concepts defined in the changed ontology and reasonings based on the imported concept hierarchy. In particular, it studied how changes to the concept hierarchy of the external ontology affect the validity of the compiled subsumption relations which are added as axioms to the local ontology.

Recently, [21] investigated how the semantic changes to an ontology can be evaluated and detected. It proposes a SemDiff algorithm that first detects explicit semantic changes based on the structural changes and then detects implicit semantic changes implied by other semantic changes. However, it did not address the issue of structural and semantic validity. In this paper, we exploited the types of structural and semantic changes detected to verify whether the instances are structurally and semantically valid when the ontologies change.

We propose using hashing of values of the semantic view in our change identification approach. Hashing an RDF/OWL document can be done very efficiently. The complexity of hashing in [24] is O(N) and that of [1,10] is O(N log N).

Qin and Atluri [19,22] proposed utilizing ontologies in guiding the change detection to the data instances under the Semantic Web, and [20] proposed a methodology by exploiting the semantic relationships among concepts to specify access control policies such that undesired inference of unauthorized information can be prevented.

9. Conclusions and future work

Ontologies are formal, explicit specifications of shared conceptualizations that evolve over time. Changes to an ontology may occur at two levels: structural and semantic. When an old version of an ontology is substituted by a new one, data instances that comply with the old version of the ontology may become invalidated, and therefore be irretrievable and non-interpretable. Therefore, it is essential to verify if the data instances are still valid under the new version of the ontology.
In this paper, we have proposed approaches to verify both structural validity and semantic validity of data instances. We have also proposed the semantic view of an ontology for a data instance as the portion of an ontology whose changes directly or indirectly affect the validity of the data instance. We have discussed how the semantic view can be identified through implication analysis. Besides, by storing a hash value of the semantic view, one can quickly identify whether the ontology has changed, so that they can be prevented from making any erroneous interpretation.

As part of our future work, we intend to address the issue of efficiently propagating ontology changes to its data instances and dependent ontologies.

References

[1] J. Carroll, Signing RDF graphs, Lecture Notes in Computer Science, vol. 2870, Springer-Verlag, 2003.
[2] A. Doan, J. Madhavan, P. Domingos, A. Halevy, Learning to map between ontologies on the Semantic Web, in: Proceedings of the World-Wide Web Conference (WWW), 2002.
[3] T.R. Gruber, A translation approach to portable ontologies, Knowledge Acquisition 5 (2) (1993) 199–220.
[4] J. Heflin, J. Hendler, Dynamic ontologies on the Web, in: Proceedings of the 17th National Conference on Artificial Intelligence (AAAI-2000), AAAI/MIT Press, Menlo Park, CA, 2000, pp. 443–449.
[5] M. Klein, D. Fensel, Ontology versioning for the Semantic Web, in: Proceedings of the International Semantic Web Working symposium (SWWS), Stanford University, California, USA, 2001, pp. 75–91.
[6] M. Klein, D. Fensel, A. Kiryakov, D. Ognyanov, Ontology versioning and change detection on the Web, in: 13th International Conference on Knowledge Engineering and Knowledge Management (EKAW02), 2002.
[7] M. Klein, A. Kiryakov, D. Ognyanov, D. Fensel, Finding and characterizing changes in ontologies, in: Proceedings of the 21st International Conference on Conceptual Modeling (ER2002), Tampere, Finland, 2002, pp. 79–89.
[8] M. Klein, N.F. Noy, A Component-based framework for ontology evolution, in: Proceedings of the Workshop on Ontologies and Distributed Systems, IJCAI’03, Acapulco, Mexico, 2003.
[9] M. Klein, H. Stuckenschmidt, Evolution management for interconnected ontologies, in: Workshop on Semantic Integration at ISWC 2003, Sanibel Island, Florida, 2003.
[10] S. Melnik, RDF API draft: Cryptographic digests of RDF models and statements, http://www-db.stanford.edu/melnik/rdf/api.html#digest.
[11] Prasenjit Mitra, Natasha N. Noy, A.R. Jaiswal, Ontology mapping discovery with uncertainty, in: Fourth International Semantic Web Conference (ISWC), Galway, Ireland, November 8–10, 2005.
[12] N.F. Noy, M. Klein, Ontology evolution: not the same as schema evolution, Knowledge and Information Systems 5 (2003).
[13] N.F. Noy, M.A. Musen, PROMPT: algorithm and tool for automated ontology merging and alignment, in: Seventeenth National Conference on Artificial Intelligence (AAAI2000), 2000.
[14] N.F. Noy, M.A. Musen, PromptDiff: a fixed-point algorithm for comparing ontology versions, in: Proceedings of the National Conference on Artificial Intelligence (AAAI), 2002.
[15] N.F. Noy, M.A. Musen, The PROMPT suite: interactive tools for ontology merging and mapping, International Journal of Human-Computer Studies 59 (6) (2003) 983–1024.
[16] N.F. Noy, M.A. Musen, Specifying Ontology Views by Traversal, in: Third International Semantic Web Conference (ISWC2004), Hiroshima, Japan, 2004.
[17] N.F. Noy, S. Kunnatur, M. Klein, M.A. Musen, Tracking changes during ontology evolution, in: Third International Semantic Web Conference (ISWC2004), Hiroshima, Japan, 2004.
[18] OWL Web Ontology Language Guide. Available at http://www.w3.org/TR/owl-guide/.
[19] L. Qin, V. Atluri, Ontology-guided change detection to the Semantic Web data, in: 23rd International Conference on Conceptual Modeling (ER 2004), pp. 624–638.
[20] L. Qin, V. Atluri, Concept-level access control for the Semantic Web, in: ACM Workshop on XML Security, held in conjunction with the 10th ACM Conference on Computer and Communications Security, October 2003.
[21] L. Qin, V. Atluri, SemDiff: an approach to detecting semantic changes to ontologies, International Journal of Semantic Web and Information Systems 2 (4) (2006) 1–32.
[22] L. Qin, V. Atluri, An ontology-guided approach to change detection of the Semantic Web data, Journal on Data Semantics V (2006) 130–157, LNCS 3870.
[23] RacerPro, http://www.racer-systems.com.
[24] C. Sayers, A.H. Karp, Computing the Digest of an RDF Graph, Technical Report HPL-2003-235, 2003.
[25] http://www.w3.org/Addressing/.