Asset Categorization

Asawin Rajakrom
Course Syllabus
This course describes how power distribution network assets are modeled and categorized into classes, and how relationships are drawn among those classes. The class attributes represent network data that are used to infer asset conditions, costs, and the probability of network failure, as well as the social and environmental factors that influence asset investment decisions. The modeling approach is based on the Common Information Model (CIM), a prominent method for representing real-world objects and the information entities exchanged within the value chain of the electric power industry. Underpinning the CIM knowledge representation are several methods and technologies such as UML, XML, and RDF. The course provides the necessary background on these technologies. In addition, engineering disciplines such as knowledge engineering and ontological engineering, which emphasize knowledge acquisition and ontology development, are also explained. Taken together, these topics equip attendees to model not just power distribution system assets but other areas of knowledge as well.
Course Outline
Categorization principle & terminologies
Unified Modeling Language
eXtensible Markup Language
Resource Description Framework
Common Information Model
Knowledge engineering
Ontological development
Power distribution network asset categorization
Categorization Principle & Terminologies
Categorization Overview
Categorization is the basic cognitive process of arranging things into classes or categories.
It is the process in which ideas and objects are recognized, differentiated, and understood.
Categorization implies that objects are grouped into categories, usually for some specific purpose.
Ideally, a category illuminates a relationship between the subjects and objects of knowledge.
Two principles underlie category systems:
  Function: the task of a category system is to provide maximum information with the least cognitive effort.
  Structure: the perceived world comes as structured information rather than as arbitrary or unpredictable attributes.
Controlled Vocabulary
A way of describing a concept under a single word or phrase
A term may vary in definition and usage when used in different domains
An established list of standardized terms used for both indexing and retrieval of information
The list of terms should be controlled by, and available from, a controlled vocabulary registration authority so that it is unambiguous and non-redundant
At a minimum, the following two rules should be enforced in practice:
  If the same term is commonly used to mean different concepts in different contexts, its name is explicitly qualified to resolve the ambiguity.
  If multiple terms are used to mean the same thing, one of them is identified as the preferred term in the controlled vocabulary and the others are listed as synonyms or aliases.
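The two rules can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a real vocabulary registry: the terms, the contexts, and the `resolve` helper are all hypothetical.

```python
# Hypothetical controlled vocabulary: every name below is illustrative only.

# Rule 1: a term used for different concepts in different contexts
# is stored under an explicitly qualified name.
QUALIFIED_TERMS = {
    ("transformer", "power distribution"): "transformer (electrical equipment)",
    ("transformer", "machine learning"): "transformer (neural network model)",
}

# Rule 2: one preferred term per concept; the other terms are synonyms.
SYNONYMS = {
    "circuit breaker": "breaker",   # synonym -> preferred term
    "cb": "breaker",
    "breaker": "breaker",           # the preferred term maps to itself
}


def resolve(term, context=None):
    """Return the controlled form of a term, or None if it is not registered."""
    key = term.strip().lower()
    if (key, context) in QUALIFIED_TERMS:
        return QUALIFIED_TERMS[(key, context)]
    return SYNONYMS.get(key)
```

Here `resolve("CB")` and `resolve("Circuit Breaker")` both map to the single preferred term, while the ambiguous term is only resolved when a context is supplied.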
Classification
Systematic arrangement in groups or categories according to established criteria
The act or process of putting people or things into a group or class
Establishing the correct class (or category) for an object, where the object needs to be characterized in terms of the class to which it belongs
Classification is an approach to systematically arranging objects into categories according to established criteria. Objects are the physical and conceptual things we find in the universe around us: hardware, software, documents, animals, human beings, and even concepts. Classification allows us to manage things easily by grouping them into categories under specific criteria and then handling each group according to established conditions.
Taxonomy
An orderly classification of plants and animals according to their presumed natural relationships
A hierarchy created according to data internal to the items in that hierarchy
An orderly classification of objects into a hierarchical structure using parent-child relationships
Parent-child relationships used in a taxonomy include whole-part, genus-species, type-instance, and class-subclass
Differs from classification in that a taxonomy arranges entities in a structure according to some relation between them, whereas a classification uses more arbitrary (or external) grounds
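A parent-child hierarchy of this kind can be sketched as a simple mapping from each class to its parent. The asset names below are hypothetical and are not taken from the CIM specification; the sketch only shows how a taxonomy supports "what is above this?" and "what is directly below this?" questions.

```python
# Hypothetical taxonomy: each class maps to its single parent (None = root).
PARENT = {
    "Asset": None,
    "Conductor": "Asset",
    "Overhead line": "Conductor",
    "Underground cable": "Conductor",
    "Transformer": "Asset",
}


def ancestors(name):
    """Return the chain of parents from `name` up to the root."""
    chain = []
    parent = PARENT[name]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def children(name):
    """Return the direct children of `name`, in alphabetical order."""
    return sorted(child for child, parent in PARENT.items() if parent == name)
```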
Taxonomy
Ontology
A branch of metaphysics concerned with the nature and relations of being
A system of concepts used as building blocks of an information processing system
Consists of concepts, a hierarchical (is-a) organization of them, relations among them in addition to is-a and part-of, and axioms that formalize the definitions and relations
An explicit specification of a conceptualization
Taxonomy and ontology are often used interchangeably; however, they are fundamentally different.
A taxonomy:
  classifies objects in a domain in a hierarchical structure
  gives exact names for everything in a specified domain
  shows which things are parts of other things
An ontology offers more by expressing meaningful content within a specified domain of interest:
  it has strict, formal rules (a "grammar") about relationships that let us make meaningful, precise statements about our entities and relationships
  a formal ontology is hence a controlled vocabulary expressed in an ontology representation language
Meta-model
Metadata is data about data
It facilitates the understanding, characterization, management, and use of data
A meta-model is an explicit model of the constructs and rules needed to build specific models within a domain of interest
A valid meta-model is an ontology, but not all ontologies are modeled explicitly as meta-models
A schema is metadata
Power Distribution System Asset Categorization

Provide all key attributes of network assets, either concrete or abstract, together with operational stresses and external environments, for determining asset conditions and failure probability
Provide all key attributes needed to deduce asset costs
Provide all associated social and environmental factors that influence asset investment decisions
This information is modeled into classes, attributes, and class relationships using the Common Information Model (CIM) specification
UML
Unified Modeling Language
Origins of UML
Evolution of object-oriented technology:
  Development and adoption of OOP languages
  Use of OOAD in business process modeling, requirements analysis, and software systems design
UML was designed to bring together the best features of a number of analysis and design techniques and notations to produce an industry standard.
Emergence of UML
What is UML?
UML is a visual language that was originally applied to developing software systems. Its use has since been extended to other areas such as knowledge modeling.
It is a specification language: it has a set of elements and a set of rules that determine how they can be used.
Most UML elements are graphical (lines, rectangles, ovals, and other shapes), and many of these graphical elements are labeled with words that provide additional information.
Why use UML?

The need for modeling: modeling can be as straightforward as drawing a flowchart that lists the steps carried out in a business process.
Readability brings clarity and ease of understanding. This involves knowing what a system is made up of, how it behaves, and so forth.
Reusability is a byproduct of making a system readable. After a system has been modeled to make it easy to understand, we tend to identify similarities or redundancies, be they in functionality, features, or structure.
The bottom line is standardization.
UML Concepts
UML is used to:
Show the main functions and boundaries of a system using use cases and actors
Illustrate use case realizations using interaction diagrams
Represent the static structure of a system using class diagrams
Model object behavior using state diagrams
Show the implementation of the physical architecture using component and deployment diagrams
Extend functionality using stereotypes
UML Diagrams and Elements

Use case diagrams
Static structural diagrams: class, object
Interaction diagrams: sequence, collaboration
State diagrams
Activity diagrams
Implementation diagrams: packages, components, deployment
Use Cases Diagram

Use case diagrams describe the behavior of the target system from an external point of view. Use cases describe "the meat" of the actual requirements.
Use cases: A use case describes a sequence of actions that provide something of measurable value to an actor and is drawn as a horizontal ellipse.
Actors: An actor is a person, organization, or external system that plays a role in one or more interactions with your system. Actors are drawn as stick figures.
Associations: Associations between actors and use cases are indicated by solid lines. An association exists whenever an actor is involved in an interaction described by a use case.
Class Diagram
Class diagrams show the classes of the system, their interrelationships, and the operations and attributes of the classes. They are used to:
  Explore domain concepts in the form of a domain model
  Analyze requirements in the form of a conceptual/analysis model
  Depict the detailed design of object-oriented or object-based software
Class Diagram
General form of a class box:
  Class name
  Attributes: attribute name : type
  Operations: operation name(parameter : type) : result type
Example:
  Person
  - TaxIDNo : String
  - Name : String
  + Income : double
  + TaxPaid : Boolean
  + calcTax()
  + calcTaxBal()
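To connect the notation to code, the Person box could be rendered in Python roughly as follows. The slide gives only names and types, so the behavior of the two operations and the flat `TAX_RATE` are assumptions made purely for illustration.

```python
class Person:
    """Python counterpart of the Person class box.

    "-" (private) attributes are written with a leading underscore and
    "+" (public) members without one. The flat tax rate and the meaning of
    calc_tax_bal are assumptions, not part of the original diagram.
    """

    TAX_RATE = 0.25  # hypothetical flat rate

    def __init__(self, tax_id_no: str, name: str, income: float = 0.0,
                 tax_paid: bool = False):
        self._tax_id_no = tax_id_no   # - TaxIDNo : String
        self._name = name             # - Name : String
        self.income = income          # + Income : double
        self.tax_paid = tax_paid      # + TaxPaid : Boolean

    def calc_tax(self) -> float:      # + calcTax()
        """Tax due on the income at the assumed flat rate."""
        return self.income * self.TAX_RATE

    def calc_tax_bal(self) -> float:  # + calcTaxBal()
        """Outstanding balance: zero once the tax has been paid."""
        return 0.0 if self.tax_paid else self.calc_tax()
```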
Object Diagram
Object diagrams (instance diagrams) are useful for exploring real-world examples of objects and the relationships between them. They show instances instead of classes. They are useful for explaining small pieces with complicated relationships, especially recursive relationships.
Class and Objects

City
  Name : String = default
  Country : String = default
  Population : integer = default
  setName(s : String = default)
  setPopulation(p : integer = default)

<<instanceOf>> London : City
  Name = London
  Country = UK
  Population = 2,324,320
<<instanceOf>> New York : City
  Name = New York
  Country = USA
  Population = 5,734,012
<<instanceOf>> Sydney : City
  Name = Sydney
  Country = Australia
  Population = 3,536,000
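The class/object distinction in this figure can be sketched in Python. The default values ("default" for strings, 0 for the integer) are assumptions, since the slide only writes "default", and the pairing of the last two population figures with New York and Sydney is assumed from their order on the slide.

```python
from dataclasses import dataclass


@dataclass
class City:
    """The class: a template with attributes and default values."""
    name: str = "default"
    country: str = "default"
    population: int = 0

    def set_name(self, s: str = "default") -> None:
        self.name = s

    def set_population(self, p: int = 0) -> None:
        self.population = p


# The objects: three instances of the one class, each with its own values.
london = City("London", "UK", 2_324_320)
new_york = City("New York", "USA")
sydney = City("Sydney", "Australia")

# Populations can also be set after creation (pairing assumed, see above).
new_york.set_population(5_734_012)
sydney.set_population(3_536_000)
```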
Sequence Diagram
Sequence diagrams model the collaboration of objects based on a time sequence. They show how the objects interact with one another in a particular scenario of a use case.
Collaboration Diagram
Collaboration (communication) diagrams are used to model the dynamic behavior of a use case. Compared with a sequence diagram, a communication diagram focuses more on showing the collaboration of objects than on the time sequence.
State Diagram
State diagrams show the different states of an entity and how the entity responds to various events by changing from one state to another. The history of an entity can best be modeled by a finite state diagram.
Activity Diagram
Activity diagrams help to describe the flow of control of the target system, for example when exploring complex business rules and operations or describing a use case or business process. They are the object-oriented equivalent of flow charts and data-flow diagrams (DFDs).
Packages Diagram
Package diagrams simplify complex class diagrams by grouping classes into packages. A package is a collection of logically related UML elements. Packages are depicted as file folders and can be used on any of the UML diagrams.
Components Diagram
Component diagrams show the dependencies among software components, including the classifiers that specify them (for example, implementation classes) and the artifacts that implement them, such as source code files, binary code files, executable files, scripts, and tables.
Deployment Diagram
A deployment diagram depicts a static view of the run-time configuration of hardware nodes and the software components that run on those nodes. Deployment diagrams show the hardware for your system, the software that is installed on that hardware, and the middleware used to connect the disparate machines to one another.
UML Class Diagrams and Relationships

How would you draw a family tree? The steps you would take would be:
Identify the main members of the family
Determine how they are related to each other
Identify the characteristics of each family member
Find relations among family members
Decide the inheritance of personal traits and characters

By definition, a class diagram is a diagram showing a collection of classes and interfaces, along with the collaborations and relationships among classes and interfaces. A class diagram consists of a group of classes and interfaces reflecting important entities of the business domain of the system being modeled, and the relationships between these classes and interfaces. A class diagram is a pictorial representation of the detailed system design.
Elements of a Class Diagram

Name
Attributes
Methods
UML Class Relationships

Relation: Association
Description: When two classes are connected to each other in any way, an association relation is established. For example, "a student studies in a college" is an association between the Student and College classes.

Relation: Multiplicity
Description: An example of this kind of association is many students belonging to the same college. The relation therefore shows a star sign (*) near the Student class (one-to-many, many-to-many, and similar relations).

Relation: Directed Association
Description: Association between classes is bidirectional by default. You can define the flow of the association by using a directed association. The arrowhead identifies the container-contained relationship.

Relation: Reflexive Association
Description: No separate symbol. An example of this kind of relation is when a class has a variety of responsibilities. For example, an employee of a college can be a professor, a housekeeper, or an administrative assistant.

Relation: Aggregation
Description: When a class is formed as a collection of other classes, the relationship between these classes is called an aggregation. It is also called a "has a" relationship.

Relation: Composition
Description: Composition is a variation of the aggregation relationship. Composition connotes that a strong life cycle is associated between the classes.

Relation: Inheritance/Generalization
Description: Also called an "is a" relationship, because the child class is a type of the parent class. Generalization is the basic type of relationship used to define reusable elements in the class diagram. Literally, the child classes "inherit" the common functionality defined in the parent class.

Relation: Realization
Description: In a realization relationship, one entity (normally an interface) defines a set of functionalities as a contract and the other entity (normally a class) "realizes" the contract by implementing the functionality defined in the contract.
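As a rough guide to how these relationships tend to appear in code, the following Python sketch maps each one to a common idiom. The classes (`Payable`, `Employee`, `Department`, and so on) are hypothetical and extend the college example only for illustration; other mappings are possible.

```python
from abc import ABC, abstractmethod


class Payable(ABC):
    """Realization: an interface (contract) that classes implement."""

    @abstractmethod
    def monthly_pay(self) -> float: ...


class Employee(Payable):
    """Realizes Payable by implementing monthly_pay."""

    def __init__(self, name: str, salary: float):
        self.name = name
        self.salary = salary

    def monthly_pay(self) -> float:
        return self.salary / 12


class Professor(Employee):
    """Generalization: a Professor "is a" kind of Employee."""


class Student:
    def __init__(self, name: str):
        self.name = name
        self.college = None  # directed association: Student -> College


class College:
    """Association with multiplicity: one college, many students."""

    def __init__(self, name: str):
        self.name = name
        self.students = []

    def enroll(self, student: Student) -> None:
        self.students.append(student)
        student.college = self


class Department:
    """Aggregation ("has a"): professors are passed in and outlive it."""

    def __init__(self, name: str, professors):
        self.name = name
        self.professors = list(professors)


class Room:
    def __init__(self, number: str):
        self.number = number


class Building:
    """Composition: rooms are created by, and live and die with, the building."""

    def __init__(self, room_numbers):
        self.rooms = [Room(n) for n in room_numbers]
```

The difference between aggregation and composition shows up in who creates the parts: `Department` receives existing professors, whereas `Building` constructs its own rooms.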
Other Terms for Annotations of Class Diagrams

Responsibility of a class: the statement defining what the class is expected to provide.
Stereotype: an extension of the existing UML elements; it allows you to define new elements modeled on the existing UML elements. Only one stereotype per element in a system is allowed.
Vocabulary: the scope of a system is defined as its vocabulary.
Analysis class: a kind of stereotype.
Boundary class: the first type of analysis class. In a system consisting of a boundary class, the users interact with the system through the boundary classes.
Control class: the second type of analysis class. A control class typically does not perform any business functions, but only redirects to the appropriate business function class depending on the function requested by the boundary class or the user.
Entity class: the third type of analysis class. An entity class consists of all the business logic and interactions with databases.
Put Them Together
XML
eXtensible Markup Language
Evolution
SGML (Standard Generalized Markup Language)
  ISO standard, 1986, for data storage and exchange
  Metalanguage for defining languages (through DTDs)
  A famous SGML language: HTML
  Separation of content and display
  Used by the U.S. government and its contractors, large manufacturing companies, technical information publishers, ...
  The SGML reference is 600 pages long
XML (eXtensible Markup Language)
  W3C (World Wide Web Consortium, http://www.w3.org/XML/) recommendation in 1998
  Simple subset (80/20 rule) of SGML: the "ASCII of the Web", Semantic Web
  The XML specification is 26 pages long
Canonical XML
  Normalization and equivalence testing of XML documents
SML (Simple Markup Language)
  "Reduce to the max": no attributes, no processing instructions (PIs), no DTD, no non-character entity references, no CDATA marked sections, support for UTF-8 character encoding only, no optional features
XML Schema
  XML Schema definition language
  Back to complex: Part 0 (Primer), Part 1 (Structures), Part 2 (Datatypes)
What is XML?
XML is a universal format for structured documents and data.
  Can be read with any editor (even an archaic CP/M one)
  Can be parsed easily
  Contains its own structure (the parse tree) in the data
  Allows separation of marked-up content from presentation (style sheets)
  As a self-describing format, it is well suited to archival
XML uses a Document Type Definition (DTD) or an XML Schema to describe the data
XML with a DTD or XML Schema is designed to be self-descriptive
Simple XML Example

<?xml version="1.0" encoding="windows-874"?>
<note>
  <to> Tom </to>
  <from> Jane </from>
  <heading> Reminder </heading>
  <body> Meeting at 9.00 AM</body>
</note>
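As a small illustration of how a program reads such a document, the note can be parsed with Python's standard `xml.etree.ElementTree` module. The XML declaration is left out of the string here for brevity; the variable names are arbitrary.

```python
import xml.etree.ElementTree as ET

# The note from the slide, without the XML declaration.
NOTE = """
<note>
  <to> Tom </to>
  <from> Jane </from>
  <heading> Reminder </heading>
  <body> Meeting at 9.00 AM</body>
</note>
"""

root = ET.fromstring(NOTE)

# Each child element becomes a (tag, text) pair; strip() drops the padding
# spaces that the slide puts around the values.
fields = {child.tag: child.text.strip() for child in root}
```

After parsing, `fields["to"]` holds the recipient and `fields["body"]` the message text.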
Why Is XML Important?

Plain text
  Easy to edit
  Useful for storing small amounts of data
  Possible to efficiently store large amounts of XML data through an XML front end to a database
Data identification
  Tells you what kind of data you have
  Can be used in different ways by different applications
Stylability
  Inherently style-free
  XSL (Extensible Stylesheet Language)
  Different XSL formats can then be used to display the same data in different ways
Inline reusability
  Can be composed from separate entities
  Modularize your documents without resorting to links
Linkability (XLink and XPointer)
  Simple unidirectional hyperlinks
  Two-way links
  Multiple-target links
  Expanding links
Easily processed
  Regular and consistent notation
  Vendor-neutral standard
Hierarchical
  Faster to access
  Easier to rearrange
XML Building Blocks

Element
Delimited by angle brackets
Identify the nature of the content they surround
General format: <element> ... </element>
Empty element: <empty-element/>
Attribute
Name-value pairs that occur inside start tags after the element name, like: <element attribute="value">
XML Building blocks--Prolog

The part of an XML document that precedes the XML data
Includes:
  A declaration: version [, encoding, standalone]
  An optional DTD (Document Type Definition)
Example
<?xml version="1.0" encoding="ISO-8859-1" standalone="yes"?>
XML Syntax
All XML elements must have a closing tag
XML tags are case sensitive
All XML elements must be properly nested
All XML documents must have a root element
Attribute values must always be quoted
With XML, white space is preserved
With XML, a new line is always stored as LF
Comments in XML: <!-- This is a comment -->
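Several of these syntax rules can be checked mechanically, because a conforming parser rejects any document that breaks them. The sketch below uses Python's standard `xml.etree.ElementTree`; the helper name `is_well_formed` and the sample strings are illustrative.

```python
import xml.etree.ElementTree as ET


def is_well_formed(text: str) -> bool:
    """Return True if `text` parses as XML, False on a syntax error."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


checks = {
    "closed and nested": is_well_formed("<a><b>x</b></a>"),
    "missing closing tag": is_well_formed("<a><b>x</a>"),
    "case mismatch": is_well_formed("<Note>x</note>"),
    "improper nesting": is_well_formed("<a><b>x</a></b>"),
    "two root elements": is_well_formed("<a/><b/>"),
    "unquoted attribute": is_well_formed("<a id=1/>"),
}
```

Only the first sample satisfies the rules; each of the others violates exactly one of them and is rejected.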

XML is Based on Markup

Markup indicates structure and semantics:
<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>
Markup is decoupled from presentation.
XML Elements
XML Elements are Extensible
XML documents can be extended to carry more information
XML Elements have Relationships

Elements are related as parents and children
Elements have Content

Elements can have different content types: element content, mixed content, simple content, or empty content and attributes
XML elements must follow the naming rules
XML as Labeled Ordered Trees

<bibliography>
  <paper ...>
    <authors>
      <author>Yannis</author>
      <author>Serge</author>
      ...
    </authors>
    <title>Object Fusion</title>
    ...
  </paper>
</bibliography>
The same document as a labeled ordered tree:
bibliography
  paper
    authors
      author: Yannis
      author: Serge
    fullPaper ...
    title: Object Fusion
  paper ...
Semistructured data: labeled trees/graphs
Can also represent relational and object-oriented data
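The tree view can be made concrete by walking a parsed document. In this sketch the abbreviated slide example is completed into a minimal well-formed document (the elided parts are simply left out), and the hypothetical `outline` function lists each node with its depth in document order.

```python
import xml.etree.ElementTree as ET

# A complete version of the abbreviated slide example (elided parts left out).
DOC = """
<bibliography>
  <paper>
    <authors>
      <author>Yannis</author>
      <author>Serge</author>
    </authors>
    <title>Object Fusion</title>
  </paper>
</bibliography>
"""


def outline(element, depth=0):
    """Return one (depth, label, text) tuple per node, in document order."""
    text = (element.text or "").strip()
    rows = [(depth, element.tag, text)]
    for child in element:          # children keep their document order
        rows.extend(outline(child, depth + 1))
    return rows


rows = outline(ET.fromstring(DOC))
```

Because the tree is ordered, the two author nodes always come back in the order in which they appear in the document.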
Elements and their Content

<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>
Element name: bibliography, paper, authors, author, ...
Element content: <authors> contains only child elements
Empty element: <fullPaper source="fusion"/>
Character content: the text inside <author>, <title>, and <booktitle>
XML Attributes
Located in the start tag of elements
Provide additional information about elements
Often provide information that is not a part of the data
Values must be enclosed in quotes
Should I use an element or an attribute?
  As a rule of thumb, metadata (data about data) should be stored as attributes, and the data itself should be stored as elements
Element Attributes
<bibliography>
  <paper ID="object-fusion">
    <authors>
      <author>Y.Papakonstantinou</author>
      <author>S. Abiteboul</author>
      <author>H. Garcia-Molina</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <booktitle>VLDB 96</booktitle>
  </paper>
</bibliography>
Attribute name: ID (on paper), source (on fullPaper)
Attribute value: "object-fusion", "fusion"
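The split between attributes (metadata) and element content (data) is visible in how a program reads them. A minimal sketch with Python's standard `xml.etree.ElementTree`, using a shortened copy of the paper element; the `year` attribute is a hypothetical example of an attribute that is absent.

```python
import xml.etree.ElementTree as ET

PAPER = """
<paper ID="object-fusion">
  <fullPaper source="fusion"/>
  <title>Object Fusion in Mediator Systems</title>
</paper>
"""

paper = ET.fromstring(PAPER)

paper_id = paper.get("ID")                      # attribute value on <paper>
source = paper.find("fullPaper").get("source")  # attribute on an empty element
title = paper.find("title").text                # element (character) content
missing = paper.get("year", "unknown")          # default for an absent attribute
all_attrs = dict(paper.attrib)                  # every attribute as a dict
```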
XML Validation
"Well formed" XML document
  has correct XML syntax
"Valid" XML document
  is well formed
  conforms to the rules of a DTD (Document Type Definition)
XML DTD
  defines the legal building blocks of an XML document
  can be inline in the XML document or an external reference
XML Schema
  an XML-based alternative to DTD, more powerful
  supports namespaces and data types
Displaying XML
XML documents do not carry information about how to display the data
We can add display information to XML with:
  CSS (Cascading Style Sheets)
  XSL (eXtensible Stylesheet Language), preferred
XML Specification
XML Document Type Definitions (DTDs):
define the structure of "allowed" documents (i.e., documents that are valid with respect to a DTD)
play the role of a database schema: improve query formulation, execution, ...
XML Schema
defines structure and data types allows developers to build their own libraries of interchanged data types
XML Namespaces
identify your vocabulary
Document Type Definitions (DTD)

Define and Constrain Element Names & Structure
Element type declarations:
<!ELEMENT bibliography (paper*)>
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
<!ELEMENT authors (author+)>
<!ELEMENT author (#PCDATA)>
<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>
Attribute list declarations:
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!ATTLIST paper ID ID #REQUIRED>

<!ELEMENT bibliography (paper*)>
  Sequence of 0 or more paper
<!ELEMENT paper (authors, fullPaper?, title, booktitle)>
  authors followed by optional fullPaper, followed by title, followed by booktitle
<!ELEMENT authors (author+)>
  Sequence of 1 or more author
<!ELEMENT author (#PCDATA)>
  Character content
<!ELEMENT fullPaper EMPTY>
<!ELEMENT title (#PCDATA)>
<!ELEMENT booktitle (#PCDATA)>
<!ATTLIST fullPaper source ENTITY #REQUIRED>
<!ATTLIST paper ID ID #REQUIRED>

<person ID="yannis"> Yannis info </person>
<bibliography>
  <paper ID="object-fusion" ROLE="publication">
    <authors>
      <author authorRef="yannis">Y.Papakonstantinou</author>
    </authors>
    <fullPaper source="fusion"/>
    <title>Object Fusion in Mediator Systems</title>
    <related papers="semistructured-data mediators"/>
  </paper>
</bibliography>
ID="object-fusion": object identity attribute (type ID)
ROLE="publication": CDATA (character data)
authorRef="yannis": IDREF, an intradocument reference
source="fusion": reference to an external ENTITY
XML Namespaces
A namespace declaration is a mapping between an element prefix and a URI
cars is the prefix in this example:
<cars:part xmlns:cars="URI">
Namespace URIs are not pointers to information about the namespace; they are just unique identifiers. An XML processor does not need to resolve them.
XML Namespaces
An XML document may reference more than one schema
A namespace specifies which schema defines a given tag
XML, like Java, uses qualified names
  This helps to avoid collisions between names
  Java: myObject.myVariable
  XML: myDTD:myTag
  Note that XML uses a colon (:) rather than a dot (.)
If an XML processor is not namespace-aware, the colon is just part of the name
Namespaces and URIs

A namespace is defined as a unique string
  To guarantee uniqueness, typically a URI (Uniform Resource Identifier) is used, because the author owns the domain
  It doesn't have to be a "real" (resolvable) URI; it just has to be a unique string
  Example: http://www.matuszek.org/ns
There are two ways to use namespaces:
  Declare a default namespace
  Associate a prefix with a namespace, then use the prefix in the XML to refer to the namespace
Namespace Syntax
In any start tag you can use the reserved attribute name xmlns:
  <book xmlns="http://www.matuszek.org/ns">
  This namespace will be used as the default for all elements up to the corresponding end tag
  You can override it with a specific prefix
You can use almost this same form to declare a prefix:
  <book xmlns:dave="http://www.matuszek.org/ns">
  Use this prefix on every tag and attribute you want to use from this namespace, including end tags; it is not a default prefix
  <dave:chapter dave:number="1">To Begin</dave:chapter>
You can use the prefix in the start tag in which it is defined:
  <dave:book xmlns:dave="http://www.matuszek.org/ns">
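A namespace-aware parser replaces each prefix with the URI it is bound to, which is why the choice of prefix does not matter. The following sketch shows this with Python's standard `xml.etree.ElementTree`, which writes qualified names as `{uri}local-name`; the small document is assembled from the slide fragments above.

```python
import xml.etree.ElementTree as ET

NS = "http://www.matuszek.org/ns"   # namespace URI used on the slides

DOC = f"""
<dave:book xmlns:dave="{NS}">
  <dave:chapter dave:number="1">To Begin</dave:chapter>
</dave:book>
"""

book = ET.fromstring(DOC)

# ElementTree expands every prefixed name to "{uri}local-name".
root_tag = book.tag
chapter = book.find("dave:chapter", {"dave": NS})   # prefix -> URI mapping
number = chapter.get(f"{{{NS}}}number")             # the attribute is qualified too

# The prefix itself is arbitrary: a default namespace yields the same name.
same = ET.fromstring(f'<book xmlns="{NS}"/>').tag == root_tag
```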
Namespaces and DTD

Here is a sample namespace specification within a DTD.
<!ELEMENT title ...>
<!ATTLIST title xmlns CDATA #FIXED "http://www.person.com">
<!ELEMENT person:title ...>
<!ATTLIST person:title xmlns:person CDATA #FIXED "http://www.person.com">
XML Schema
People are dissatisfied with DTDs because of:
  A different syntax: you write your XML (instance) document using one syntax and the DTD using another, which is inconsistent
  Limited datatype capability: DTDs support only a very limited capability for specifying datatypes. You can't, for example, express "I want the <elevation> element to hold an integer with a range of 0 to 12,000"
  A desire for a set of datatypes compatible with those found in databases
DTDs support 10 datatypes; XML Schemas support 44+ datatypes
What is XML Schema?

A grammar definition language
  Like DTDs but better
  Uses XML syntax
  Defined by W3C
Primary features
  Datatypes, e.g., integer, float, date
  More powerful content models, e.g., namespace-aware, type derivation
A schema is a collection of:
  type definitions: simple types and complex types (which contain elements or attributes)
  element declarations
Schema Terminology
Schema: a formal description of the structure and allowed content of a set of data (especially in databases)
"XML Schema" is often used for each of the following:
  1. XML Schema, the W3C Recommendation that defines
  2. XML Schema Definition Language (XSDL), an XML-based markup language for expressing ...
  3. schema documents, each of which describes a schema (comparable to a DTD) for a set of XML document instances
Advantages of XSDL
XML syntax
  schema documents are easier for programs to manipulate than the special DTD syntax
Compatibility with namespaces
  can validate documents using declarations from multiple sources
Content datatypes
  44 built-in datatypes (including primitive Java datatypes, SQL datatypes, and XML attribute types)
  mechanisms to derive user-defined datatypes
Independence of element names and content types; compare with:
  DTDs: 1-to-1 correspondence between element type names and their content models
  CFGs: 1-to-1 correspondence between nonterminals and their productions
  For example, one could define titles of people as Mr./Mrs./Ms. and titles of chapters as strings
Support for schema documentation
  element annotation with sub-elements documentation (for human readers) and appInfo (for applications)
Ability to specify uniqueness and keys within selected parts of a document
  for example, that titles of chapters should be unique
Disadvantages of XSDL
Complexity of XSDL (especially of Recommendation Part 1), hence a long learning curve
Possible immaturity of implementations (?)
  The W3C XML Schema Web site mentions a dozen tools or processors (http://www.w3.org/XML/Schema#Tools, March 2002)
  The open-source Apache XML parsers (Xerces C++ 1.7.0 and Xerces Java 1.4.4) seem reasonable implementations, but they also document limitations and problems in their XML Schema support
Highlights of XML Schemas

XML Schemas are a tremendous advancement over DTDs:
  Enhanced datatypes: 44+ versus 10
  Can create your own datatypes. Example: "This is a new type based on the string type and elements of this type must follow this pattern: ddd-dddd, where 'd' represents a digit".
  Written in the same syntax as instance documents: less syntax to remember
  Object-oriented'ish: can extend or restrict a type (derive new type definitions on the basis of old ones)
  Can express sets, i.e., can define the child elements to occur in any order
  Can specify element content as being unique (keys on content) and uniqueness within a region
  Can define multiple elements with the same name but different content
  Can define elements with nil content
  Can define substitutable elements, e.g., the "Book" element is substitutable for the "Publication" element
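To make the idea of a restricted datatype concrete, the two constraints mentioned in these slides (the ddd-dddd pattern, and an integer elevation between 0 and 12,000) can be emulated in plain Python. This is not an XML Schema validator; it only mirrors what a schema processor would check, and the function names are made up for the example.

```python
import re

# Counterpart of a pattern restriction such as \d{3}-\d{4} on a string type.
PHONE_PATTERN = re.compile(r"\d{3}-\d{4}")


def is_phone(value: str) -> bool:
    """True if the whole value matches ddd-dddd."""
    return PHONE_PATTERN.fullmatch(value) is not None


def is_elevation(value: str) -> bool:
    """True if the value is an integer in the range 0 to 12,000."""
    try:
        number = int(value)
    except ValueError:
        return False
    return 0 <= number <= 12_000
```

With a DTD, both values could only be declared as character data; a schema lets the constraint live in the type definition instead of in application code like this.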
Example: DTD
note.dtd
<!ELEMENT note (to, from, heading, body)>
<!ELEMENT to (#PCDATA)>
<!ELEMENT from (#PCDATA)>
<!ELEMENT heading (#PCDATA)>
<!ELEMENT body (#PCDATA)>
Example: XML with DTD
note.xml
<?xml version="1.0"?>
<!DOCTYPE note SYSTEM "http://www.w3schools.com/dtd/note.dtd">
<note>
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
Example: XML Schema
note.xsd
<?xml version="1.0"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"
           targetNamespace="http://www.w3schools.com"
           xmlns="http://www.w3schools.com"
           elementFormDefault="qualified">
  <xs:element name="note">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="to" type="xs:string"/>
        <xs:element name="from" type="xs:string"/>
        <xs:element name="heading" type="xs:string"/>
        <xs:element name="body" type="xs:string"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
Example: XML with XML Schema
note.xml
<?xml version="1.0"?>
<note xmlns="http://www.w3schools.com"
      xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
      xsi:schemaLocation="http://www.w3schools.com note.xsd">
  <to>Tove</to>
  <from>Jani</from>
  <heading>Reminder</heading>
  <body>Don't forget me this weekend!</body>
</note>
RDF
Resource Description Framework
Motivation for RDF

RDF and Metadata
Scenario 1: The library
Lookup system search properties include author, title, subject etc.
Scenario 2: The video store

Lookup system search properties include directors, actors, etc.
The common thread:

Metadata: information about information
Motivation for RDF

What about the Web?
  One big library; you need a "call number" to get things without a search
  Has hardly any metadata (HTML)
  Yahoo: has a metadata-based lookup facility; uses human-generated subject categories and site labels
The library example illustrates the need for metadata
What is RDF?
RDF stands for Resource Description Framework
RDF is a framework for describing resources on the web
RDF provides a model for data, and a syntax so that independent parties can exchange and use it
RDF is designed to be read and understood by computers
RDF is not designed for being displayed to people
RDF is commonly written in XML (RDF/XML)
RDF is a part of the W3C's Semantic Web Activity
RDF is a W3C Recommendation
Describes relationships and attributes of (Internet) resources, i.e., advanced metadata
Based on directed labeled graphs (DLGs) and classical information analysis
Can be represented in XML, N3, and N-Triples
Attributes and relation types may be defined in XML namespaces, e.g., Dublin Core
A general method for decomposing knowledge into small pieces, with some rules about the semantics (meaning) of those pieces
Designed for knowledge, not data, which means RDF is particularly concerned with meaning
RDF and XML

RDF/XML is an application of XML. Why not just use XML?
XML falls apart on the scalability design goal. There are two problems:
The order of elements is significant in XML, which is unnatural for metadata and also expensive in practice
XML documents are represented in memory as trees, which are difficult to manage when large
XML is unequalled as an exchange format on the Web, but it doesn't provide a metadata framework
Uses of RDF
Resource discovery: to provide better search engine capabilities
Cataloging: for describing the content and content relationships
Intelligent software agents: to facilitate knowledge sharing and exchange
Content rating
Describing collections of pages that represent a single logical document
Uses of RDF
Describing intellectual property rights
Expressing the privacy preferences of a user as well as the privacy policies of a Web site
Web of Trust: RDF with digital signatures will be key to building the Web of Trust for electronic commerce, collaboration, and other applications.
RDF Components
Formal data model
Syntax for interchange of data
Schema type system (schema model)
Syntax for machine-understandable schemas
Query and profile protocols
RDF Data Model

Imposes structural constraints on the expression of application data models
for consistent encoding, exchange and processing of metadata
Enables resource description communities to define their own semantics Provides for structural interoperability
RDF Data Model

Directed labelled graphs Model elements
Statement: Resource (Subject) + Property (Predicate) + Value (Object)
Resource: anything that can be identified; identified by a URI
Property: specific aspect, characteristic, attribute, or relation used to describe a resource
URI: verbose name for a Resource; can be http, urn, or tag types
Value
RDF Elements
Subject: source of the relationship. Always a resource
Predicate: labeled arc. Always a resource
Object: the relationship's destination. Resource or literal
Subject and Predicates are first-class objects

Which means they can be used as subjects or objects of other statements
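The statement model can be sketched in a few lines of Python. This is only an illustration of the subject, predicate, object idea, not an RDF library: the Statement type and the objects helper are hypothetical names, and the sample values are borrowed from the Dublin Core example used later in this deck.

```python
from typing import NamedTuple


class Statement(NamedTuple):
    """One RDF statement (triple)."""
    subject: str    # always a resource, identified by a URI
    predicate: str  # always a resource, the labeled arc
    obj: str        # a resource URI or a literal value


# Namespace URI taken from the deck's Dublin Core example.
DC = "http://purl.org/dc/elements/1.0/"

# A graph is simply a set of statements.
graph = {
    Statement("URI:R", DC + "Title", "RDF Presentation"),
    Statement("URI:R", DC + "Creator", "Paul Miller"),
}


def objects(graph, subject, predicate):
    """Return all objects reachable from subject along the given arc."""
    return sorted(s.obj for s in graph
                  if s.subject == subject and s.predicate == predicate)


if __name__ == "__main__":
    print(objects(graph, "URI:R", DC + "Creator"))
```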
RDF Model Primitives
Property
Resource
Value Resource
Statement
RDF Model
Author
Resource
Paul
RDF Syntax
The RDF Model defines formal relationships among resources, properties and values. A syntax is required to...
Store instances of the model into files Communicate files from one application to another
W3C XML eXtensible Markup Language

http://www.w3.org/XML
RDF Model Example
dc: Title
URI:R
dc: Creator
RDF Presentation
Paul Miller
RDF Syntax Example

dc: Title
URI:R
dc: Creator
RDF Presentation
Paul Miller
<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/">
  <Description about="URI:R">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>Paul Miller</dc:Creator>
  </Description>
</RDF>
RDF Model Example

dc: Title
URI:R
dc: Creator
RDF Presentation
Paul Miller URI:PAUL
bib:Aff UKOLN
bib:Name Paul Miller
bib:Email p.miller@ ukoln.ac.uk
URI:UKOLN
RDF Syntax Example

<RDF xmlns="http://www.w3.org/TR/WD-rdf-syntax#"
     xmlns:dc="http://purl.org/dc/elements/1.0/"
     xmlns:bib="http://www.bib.org/persons#">
  <Description about="URI:R">
    <dc:Title>RDF Presentation</dc:Title>
    <dc:Creator>
      <Description>
        <bib:Name>Paul Miller</bib:Name>
        <bib:Email>p.miller@ukoln.ac.uk</bib:Email>
        <bib:Aff resource="http://www.ukoln.ac.uk/"/>
      </Description>
    </dc:Creator>
  </Description>
</RDF>
RDF Schema
RDFS or RDF Schema is an extensible knowledge representation language, providing basic elements for the description of ontologies, otherwise called RDF vocabularies, intended to structure RDF resources. The first version was published by W3C in April 1998, and the final W3C recommendation was released in February 2004. Main RDFS components are included in the more expressive language OWL. RDFS is also written in XML.
RDF Schema
RDF describes resources with classes, properties, and values. In addition, RDF also needs a way to define application-specific classes and properties.
Application-specific classes and properties must be defined using an extension to RDF: RDF Schema
RDF Schema does not provide actual application-specific classes and properties. Instead, RDF Schema provides the framework to describe application-specific classes and properties
Classes in RDF Schema are much like classes in object-oriented programming languages. This allows resources to be defined as instances of classes, and as subclasses of classes
RDF Schema
Basic vocabulary to describe RDF vocabularies
Defines properties of the resources (e.g., title, author, subject, etc.)
Defines kinds of resources being described (books, Web pages, people, etc.)
XML Schema gives specific constraints on the structure of an XML document
RDF Schema provides information about the interpretation of the RDF statements
RDFS / RDF Classes

Class
Resource
Datatype
Container
Literal
Property
List
Statement
Alt
Bag
Seq
XMLLiteral
ContainerMembershipProperty
RDFS / RDF Properties

Element | Domain | Range | Description
rdfs:domain | Property | Class | The domain of the resource
rdfs:range | Property | Class | The range of the resource
rdfs:subPropertyOf | Property | Property | The property is a sub property of a property
rdfs:subClassOf | Class | Class | The resource is a subclass of a class
rdfs:comment | Resource | Literal | The human readable description of the resource
rdfs:label | Resource | Literal | The human readable label (name) of the resource
rdfs:isDefinedBy | Resource | Resource | The definition of the resource
rdfs:seeAlso | Resource | Resource | The additional information about the resource
rdfs:member | Resource | Resource | The member of the resource
rdf:first | List | Resource |
rdf:rest | List | List |
rdf:subject | Statement | Resource | The subject of the resource in an RDF Statement
rdf:predicate | Statement | Resource | The predicate of the resource in an RDF Statement
rdf:object | Statement | Resource | The object of the resource in an RDF Statement
rdf:value | Resource | Resource | The property used for values
rdf:type | Resource | Class | The resource is an instance of a class
RDFS / RDF Attributes

Element | Domain | Range | Description
rdf:about | | | Defines the resource being described
rdf:Description | | | Container for the description of a resource
rdf:resource | | | Defines a resource to identify a property
rdf:datatype | | | Defines the data type of an element
rdf:ID | | | Defines the ID of an element
rdf:li | | | Defines a list
rdf:_n | | | Defines a node
rdf:nodeID | | | Defines the ID of an element node
rdf:parseType | | | Defines how an element should be parsed
rdf:RDF | | | The root of an RDF document
xml:base | | | Defines the XML base
xml:lang | | | Defines the language of the element content
RDF Schema Example

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#">
  <rdfs:Class rdf:ID="Person">
    <rdfs:comment>Person Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Student">
    <rdfs:comment>Student Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="#Person"/>
  </rdfs:Class>
  <rdfs:Class rdf:ID="Teacher">
    <rdfs:comment>Teacher Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="#Person"/>
  </rdfs:Class>
RDF Schema Example (cont.)

  <rdfs:Class rdf:ID="Course">
    <rdfs:comment>Course Class</rdfs:comment>
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#Resource"/>
  </rdfs:Class>
  <rdf:Property rdf:ID="teacher">
    <rdfs:comment>Teacher of a course</rdfs:comment>
    <rdfs:domain rdf:resource="#Course"/>
    <rdfs:range rdf:resource="#Teacher"/>
  </rdf:Property>
  <rdf:Property rdf:ID="students">
    <rdfs:comment>List of Students of a course in alphabetical order</rdfs:comment>
    <rdfs:domain rdf:resource="#Course"/>
    <rdfs:range rdf:resource="http://www.w3.org/1999/02/22-rdf-syntax-ns#Seq"/>
  </rdf:Property>
  <rdf:Property rdf:ID="name">
    <rdfs:comment>Name of a Person or Course</rdfs:comment>
    <rdfs:domain rdf:resource="#Person"/>
    <rdfs:domain rdf:resource="#Course"/>
    <rdfs:range rdf:resource="http://www.w3.org/2000/01/rdf-schema#Literal"/>
  </rdf:Property>
</rdf:RDF>
RDF (corresponding to the previous schema)

<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns="http://www.cs.rpi.edu/~puninj/XMLJ/course_schema.rdf#">
  <Course rdf:ID="csci_2962">
    <name>Programming XML in Java</name>
    <teacher>
      <Teacher rdf:ID="jp">
        <name>John Punin</name>
      </Teacher>
    </teacher>
    <students>
      <rdf:Seq>
        <rdf:li>
          <Student rdf:ID="er">
            <name>Elizabeth Roberts</name>
          </Student>
        </rdf:li>
        <rdf:li>
          <Student rdf:ID="gl">
            <name>George Lucas</name>
          </Student>
        </rdf:li>
        <rdf:li>
          <Student rdf:ID="js">
            <name>John Smith</name>
          </Student>
        </rdf:li>
      </rdf:Seq>
    </students>
  </Course>
</rdf:RDF>
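To illustrate what the subClassOf declarations in the schema buy an application, the following Python sketch hard-codes the class hierarchy and the instance types from the example above and checks class membership by walking up the hierarchy. The dictionaries and function names are assumptions made for illustration; a real application would read these facts from the RDF documents with an RDF toolkit.

```python
# Class hierarchy from the schema example: class -> direct superclass.
SUBCLASS_OF = {
    "Student": "Person",
    "Teacher": "Person",
    "Person": "Resource",
    "Course": "Resource",
}

# Instance types from the instance document: resource ID -> declared class.
TYPE_OF = {
    "csci_2962": "Course",
    "jp": "Teacher",
    "er": "Student",
    "gl": "Student",
    "js": "Student",
}


def superclasses(cls):
    """Return all ancestors of cls, nearest first."""
    chain = []
    while cls in SUBCLASS_OF:
        cls = SUBCLASS_OF[cls]
        chain.append(cls)
    return chain


def is_instance(resource, cls):
    """True if resource is declared as cls or as one of its subclasses."""
    declared = TYPE_OF[resource]
    return cls == declared or cls in superclasses(declared)


if __name__ == "__main__":
    print(superclasses("Student"))
    print(is_instance("er", "Person"))
```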
CIM
Common Information Model
CIM Motivation
Deregulation of the power industry worldwide requires utility companies to share power system data (Energy Management System, EMS)
Exchanging power system data is always problematic due to the use of proprietary formats
An open standard is needed for representing power system components
CIM defines a common model for describing the components in power systems for use in a common EMS
CIM Overview
CIM is an object-oriented information model representing real-world objects found in transmission and distribution operation and management
Enables integration of applications/systems
Provides a common model behind all messages exchanged between systems
Basis for defining information exchange models
CIM provides a comprehensive, logical view of EMS information for:
Transmission network analysis
Generation control
SCADA
Operator training simulation
CIM Overview
Enable data access in a standard way Common language to navigate and access complex data structures in any database
Provides a hierarchical view of data for browsing and access with no knowledge of actual logical schema
Inspiration for logical data schemas (e.g., for an operational data store)
Not tied to a particular application's view of the world
But permits the same model to be used by all applications to facilitate information sharing between applications
Also provides a consistent view of the world to operators regardless of which application user interface they are using
CIM Overview
A data model to enable data transfer or integration in any domain where a common power system model is needed
Model includes Classes, their Attributes, and Relationships to represent utility objects
The Classes (Objects) are abstract and may be used in a wide variety of applications
Useful:
As a foundation for a logical database schema
To define component interfaces
As a common language for data exchange
Sample Power System Model
Role of CIM in Utility Enterprise

Data preparation Provides common set of semantics and data representation regardless of source of data Improves data quality and enables data validation Data exchange Provides common language and format Provides common set of services for sharing data System integration Provides basis for a standards-based integration framework Web services payloads and Service Oriented Architecture (SOA) Enterprise Information Management Part of overall Enterprise Information Model relating to business processes/automation/management
Benefits of Using CIM Approach

Data-model-driven solutions lead to interoperability
Provides common semantics for information exchange between heterogeneous systems
Used for CA to CA communications
NERC mandated use of CIM and RDF Schema version for power system model exchange
Provides for automatic generation of message payloads in XML Ensures common language for all messages defined Avoids proprietary message formats from vendors (based on internal schemas) Eliminates work of creating DTD for each message Alternative to EDI or CSV file formats
Benefits of Using CIM Approach

Uses industry standard modeling notation
UML, XML, RDF
Permits software tool use for:

Defining and maintaining data models Single point of maintenance for changes Documenting data models Automatic generation of information payloads
Automatically generate IDL, Java, C code
CIM Related Standards

EPRI CCAPI: The Electric Power Research Institute (EPRI) proposed an integration framework called the control center application program interface for EMS data sharing
IEC 61970-301: Common Information Model (CIM) base. A semantic model describing the components of a power system at an electrical level and the relationships between each component
IEC 61970-501: Common Information Model Resource Description Framework (CIM RDF) schema
IEC 61968-4: Interfaces for records and asset management
IEC 61968-11: Extends the model to cover the other aspects of power system software data exchange such as asset tracking, work scheduling and customer billing
CIM Representation
CIM is documented as a set of class diagrams using the Unified Modeling Language (UML) UML specifies CIM in an abstract manner that allows for open implementation:
There is no restriction to relational, object oriented or other modeling technologies
The UML is a Diagramming Tool for CIM
An Example of CIM in UML
CIM Packages
CIM consists of a number of packages
CIM - Common Information Model

Needed to make the model easier to design, understand and review Packages are grouped to be handled as a single standard document
CIM Base in UML - IEC 61970 Part 301
CIM Energy Scheduling, Reservations & Financial - IEC 61970 Part 302
CIM SCADA - IEC 61970 Part 303
GID - Generic Interface Definition
CIM Model Exchange Format

CIS - Component Interface Specifications

CIM RDF Schema (UML->RDF) - IEC 61970 Part 501 CIM XML Model data Exchange Format - IEC 61970 Part 552-4
CIM Base Part 301

CIM Base in UML
Package used for the Project
Dashed lines indicate a dependency relationship between packages Arrow points from the dependent package to the package on which it has a dependency The Generation package is divided into two sub packages:
Production GenerationDynamics
Components of Part 301

Core
This package contains the core Naming, PowerSystemResource, EquipmentContainer, and ConductingEquipment entities shared by all applications, plus common collections of those entities
Not all applications require all the Core entities
This package does not depend on any other package, but most of the other packages have associations and generalizations that depend on it
Topology
This package is an extension to the Core package
Specifies the physical definition of how equipment is connected together
In addition, it models Topology, that is, the logical definition of how equipment is connected via closed switches
The Topology definition is independent of the other electrical characteristics
Wires
The Wires package is an extension to the Core and Topology packages
Models information on the electrical characteristics of Transmission and Distribution networks
This package is used by network applications such as State Estimation, Load Flow and Optimal Power Flow
Outage
This package is an extension to the Core and Wires packages
Models information on the current and planned network configuration

Protection
This package is an extension to the Core and Wires packages Models information for protection equipment such as relays
Meas
Describes dynamic measurement data exchanged between applications

LoadModel
Provides models for the system load as curves and associated curve data Used for Load Forecasting and Load Management
Production
Provides models for various types of generators Models production costing information which is used to economically allocate demand among committed units and calculate reserve quantities This information is used by Unit Commitment and Economic Dispatch, Load Forecasting, Automatic Generation Control applications.

Generation Dynamics
Provides models for prime movers This information is used by Unit Modeling for Dynamic Training Simulator applications
Domain
Data dictionary of quantities and units This package contains the definition of datatypes, including units of measure and permissible values
Core Package
Topology Package
Wires Package
Outage Package
Protection Package
Meas Package
LoadModel Package
Production Package
GenerationDynamics Package
Domain Package
CIM XML
A common model exchange format based on the CIM data definition and XML was developed Proposed to NERC and subsequently adopted by their Data Exchange Working Group (DEWG) All major vendors of energy management systems have voiced their support for the format CIM/XML is a language for expressing CIM models in XML The NERC has adopted CIM/XML as the standard for exchanging models between power transmission system operators The CIM/XML format is also going through an IEC international standardization process
CIM XML
Resource Description Framework (RDF) defines a mechanism for describing resources RDF is a general-purpose language for representing information in the Web RDF integrates a variety of applications using XML as an interchange syntax RDF Schema is a standard which describes how to use CIM XML CIM/XML is an RDF application, using RDF and RDF Schema to organize its XML structures
CIM XML RDF Example

The base class of the CIM is the PowerSystemResource class Other more specialized classes such as Substation, Switch, and Breaker are defined as subclasses CIM/XML uses RDF as the language for exchanging specific system models
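The class hierarchy described above can be pictured with a short Python sketch. It is an illustration only and not generated from the CIM: placing Breaker under Switch, the normal_open attribute, and the lineage helper are assumptions made here to show how specialized classes inherit from PowerSystemResource.

```python
class PowerSystemResource:
    """Base class named on the slide; every resource carries a name here."""

    def __init__(self, name):
        self.name = name


class Substation(PowerSystemResource):
    """A specialized resource; no extra attributes in this sketch."""


class Switch(PowerSystemResource):
    """A switching device; normal_open is an illustrative attribute."""

    def __init__(self, name, normal_open=False):
        super().__init__(name)
        self.normal_open = normal_open


class Breaker(Switch):
    """Assumed here to specialize Switch."""


def lineage(resource):
    """List the class of a resource followed by its ancestors."""
    return [cls.__name__ for cls in type(resource).__mro__ if cls is not object]


if __name__ == "__main__":
    print(lineage(Breaker("CB1")))
```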
CIM XML RDF Example
CIM XML RDF Example
KE
knowledge engineering
What is Knowledge?
Data: raw facts that simply exist, captured by sensory devices or organs
Information: the meaning interpreted from data
Knowledge: a collection of information that people use when solving a problem
Data -> Information -> Knowledge
Where Knowledge Resides?
The problem with knowledge, however, is that, unlike information, it typically doesn't reside on paper. Instead, it lives inside people's heads.
Knowledge Management (1)

A strategy, framework or system designed to help organisations create, capture, analyse, apply, and reuse knowledge to achieve competitive advantage.
A key aspect is that knowledge within an organisation is treated as a key asset.

A core aspect is "getting the right knowledge to the right people at the right time in the right format".
Type of Knowledge
Tacit Knowledge
Explicit Knowledge
Nonaka SECI Model
Socialization: tacit to tacit
Externalization: tacit to explicit
Combination: explicit to explicit
Internalization: explicit to tacit
Knowledge Engineering (1)

A field within artificial intelligence that develops knowledge-based systems
Knowledge-based systems are computer programs that contain large amounts of knowledge, rules and reasoning mechanisms to provide solutions to real-world problems
An expert system is designed to emulate the reasoning processes of an expert practitioner

Key KE principles:
Different types of knowledge
Different types of experts and expertise
Different ways of representing knowledge
Different ways of using knowledge
The right approach and technique must be employed to acquire, validate and reuse knowledge

Three types of experts
Academic: Theoretical understanding is prized. Their job is to explicate, clarify and teach others. May be far from day-to-day problem solving.
Practitioner: Engaged in constant day-to-day problem solving. Knowledge is implicit and difficult for them to articulate.
Samurai: Pure performance expert. Equipped with theoretical knowledge and puts it into real problem solving. Comfortable articulating it.

A way is needed to relate the different types of knowledge, experts, representations and tasks in order to perform a knowledge-oriented activity
Do not interview experts about knowledge they cannot articulate, represent it in a form no one understands, and eventually find that it was not really needed
Use structured methods
Knowledge Roles
Figure: knowledge roles and the relations between them
Roles: knowledge manager (defines knowledge strategy, initiates knowledge development projects, facilitates knowledge distribution), knowledge provider/specialist, knowledge engineer/analyst, project manager, knowledge system developer, knowledge user
Relation labels: elicits knowledge from, elicits requirements from, validates, manages, delivers analysis models to, designs & implements (KS), uses (KS)
Classification of Knowledge
Declarative and Procedural Knowledge: Knowing what vs. knowing how
Tacit and Explicit Knowledge: Hard to articulate vs. easy to articulate
Generic and Specific Knowledge: Applying across many situations vs. applying across a few situations
Knowledge Modeling
A way of structuring projects, acquiring and validating knowledge and storing knowledge for future use.
Symbolic character-based languages, such as logic Diagrammatic representations, such as networks and ladders Tabular representations, such as matrices Structured text, such as hypertext
Knowledge Object
The field of logic has also inspired important knowledge types, notably concepts, attributes, values, rules and relationships
Concepts are the things (physical objects, information, people, etc.) that constitute a domain. Each concept is described by its relationships to other concepts in the domain (e.g. in a hierarchy), and by its attributes and values.
An instance is an instantiated class. For example, "my car" is an instance of the concept "car".
Attributes are the generic properties, qualities or features belonging to a class of concepts, e.g. weight, cost, age and ability.
Values are the specific qualities of a concept such as its actual weight or age. Values are associated with a particular attribute and can be numerical (e.g. 120 kg, 6 years old) or categorical (e.g. heavy, young).
Rules are statements of the form "IF... THEN...".
Relationships represent the way knowledge objects (such as concepts and tasks) are related to one another. Important examples include "is a" to show classification, "part of" to show composition,
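The knowledge objects listed above can be sketched as simple Python data structures. This is a hedged illustration, not a knowledge modeling tool: the Concept and Instance classes, the "vehicle" parent concept, and the sample values are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Concept:
    """A concept with its own attributes and an optional 'is a' parent."""
    name: str
    attributes: tuple = ()
    is_a: Optional["Concept"] = None

    def all_attributes(self):
        """Attributes inherited through 'is a', followed by the concept's own."""
        inherited = self.is_a.all_attributes() if self.is_a else ()
        return inherited + self.attributes


@dataclass
class Instance:
    """An instantiated concept: each attribute is bound to a specific value."""
    concept: Concept
    values: dict = field(default_factory=dict)


# Hypothetical example: "my car" is an instance of the concept "car".
vehicle = Concept("vehicle", ("weight", "age"))
car = Concept("car", ("cost",), is_a=vehicle)
my_car = Instance(car, {"weight": "1200 kg", "age": "6 years", "cost": "high"})

if __name__ == "__main__":
    print(car.all_attributes())
    print(my_car.values["age"])
```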
Structured Modeling Techniques

Relational database (RDB) Object oriented database (OODB) eXtensible markup language (XML) Unified modeling language (UML)
Uses of Knowledge Models

Knowledge elicitation (from an expert) Validation (with the same expert) Cross-validation (with another expert) Knowledge publication Maintenance and updating of the knowledge system or publication
Knowledge Acquisition (1)

Generic process
1. Conduct an initial interview with the expert to a) scope what knowledge should be acquired, b) determine to what purpose the knowledge should be put, c) gain some understanding of key terminology, and d) build a rapport with the expert
2. Transcribe the initial interview and analyze the resulting document (called a protocol) to produce a set of questions that cover the essential issues across the domain and that serve the goals of the knowledge acquisition exercise

Generic process
3. Conduct a second interview with the expert using the pre-prepared questions to provide structure and focus. (This is called a semi-structured interview.)
4. Transcribe the semi-structured interview and analyse the resulting protocol, looking for knowledge types: concepts, attributes, values, classes of concepts, relationships between concepts, tasks and rules.
5. Represent these knowledge elements in a number of formats, for example, hierarchies of classes (taxonomies), hierarchies of constitutional elements, grids of concepts and attributes, diagrams, and flow charts. In addition, document, in a structured manner, anecdotes (war stories) and explanations that the expert gives.

Generic process
6. Use the resulting representations and structured documentation with contrived techniques to allow the expert to modify and expand on the knowledge you have already captured.
7. Repeat the analysis, representation-building and acquisition sessions until the expert is happy that the goals of the project have been realised.
8. Validate the knowledge acquired with other experts, and make modifications where necessary.

Issues in Knowledge Acquisition:
Most knowledge is in the heads of experts Experts have vast amounts of knowledge Experts have a lot of tacit knowledge
They don't know all that they know and use Tacit knowledge is hard (impossible) to describe
Experts are very busy and valuable people Each expert doesn't know everything Knowledge has a "shelf life"

Requirements for knowledge acquisition:
Take experts off the job for short time periods Allow non-experts to understand the knowledge Focus on the essential knowledge Can capture tacit knowledge Allow knowledge to be collated from different experts Allow knowledge to be validated and maintained
Knowledge Acquisition Techniques (1)

Interviewing Work observation Commentary Protocol analysis Laddering Concept sorting Repertory grid
Interviewing (1)
Common use for knowledge acquisition Range from completely unstructured to formally planned, structured interview Audio-visual recording is required
Interviewing (2)
Probe code | Question template | Effect
P1 | Why would you do that? | Converts an assertion into a rule
P2 | How would you do that? | Generates lower-order rules
P3 | When would you do that? Is <the rule> always the case? | Reveals the generality of the rule and may generate other rules
P4 | What alternatives to <the prescribed action/decision> are there? | Generates more rules
P5 | What if it were not the case that <currently true condition>? | Generates rules for when current condition does not apply
P6 | Can you tell me more about <any subject already mentioned>? | Used to generate further dialogue if expert dries up
Interviewing (3)
EX: I actually checked the port of the computer
KE: Why did you check the port? (P1)
EX: If it's been lightning recently then it's good to check the port, because lightning tends to damage the ports.
KE: Are there any alternatives to that problem? (P4)
EX: Yes, that ought to be prefaced by saying that if it was several keys with odd effects, not necessarily all of them, but two or more.
KE: Why does it have to be more than two?
EX: Well, if it was only one or two keys doing funny things then the thing to do is check they're closing properly; speed would affect all keys, parity would affect about half the keys.
Interviewing (4)
IF there has been recent lightning THEN check port for damage
IF there are two or fewer malfunction keys THEN check the key contacts
IF about half the keyboard is malfunctioning THEN check the parity
IF the whole keyboard is malfunctioning THEN check the speed
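The four elicited rules can be encoded as condition and action pairs, as in the Python sketch below. The fact names (recent_lightning, bad_keys, total_keys) and the 40 to 60 percent band used for "about half" are assumptions introduced for illustration; the interview itself does not quantify them.

```python
# Each rule is a (condition, action) pair over a dictionary of facts.
# Fact names and the "about half" thresholds are illustrative assumptions.
RULES = [
    (lambda f: f["recent_lightning"],
     "check port for damage"),
    (lambda f: 1 <= f["bad_keys"] <= 2,
     "check the key contacts"),
    (lambda f: 0.4 <= f["bad_keys"] / f["total_keys"] <= 0.6,
     "check the parity"),
    (lambda f: f["bad_keys"] == f["total_keys"],
     "check the speed"),
]


def advise(facts):
    """Return the actions of every rule whose IF part holds for the facts."""
    return [action for condition, action in RULES if condition(facts)]


if __name__ == "__main__":
    print(advise({"recent_lightning": True, "bad_keys": 2, "total_keys": 100}))
```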
Work observation
Simply observing and making notes as the expert performs their daily activities Videotaping task performance can be useful especially if combined with retrospective reporting techniques
Commentary
Think aloud problem-solving
The expert provides a running commentary of their thought processes as they solve a problem
Alternatively, the expert is shown a video of their task behaviour (the protocol) and asked to provide a running commentary on what they were thinking and doing
Protocol Analysis
Used to identify basic knowledge objects within a protocol (transcript)
An interview transcript would be analyzed by highlighting all the concepts that are relevant to the task
Categories of fundamental knowledge such as concepts, attributes, values, tasks and relationships would be extracted
For example, if the transcript concerns the task of diagnosis, then such categories as symptoms, hypotheses and diagnostic techniques would be used for the analysis
Laddering
Involve the creation, reviewing and modification of hierarchical knowledge, often in the form of ladders, i.e. tree diagrams See example
Knowledge-intensive Task Hierarchy

knowledge-intensive task
  analytic task
    classification
    assessment
    diagnosis
    monitoring
    prediction
  synthetic task
    design
      configuration design
    modelling
    planning
    scheduling
    assignment
Analytic versus synthetic tasks

analytic tasks
system pre-exists it is typically not completely "known" input: some data about the system, output: some characterization of the system
synthetic tasks
system does not yet exist input: requirements about system to be constructed output: constructed system description
Structure of template description in catalog

General characterization: typical features of a task
Default method: roles, sub-functions, control structure, inference structure
Typical variations: frequently occurring refinements/changes
Typical domain-knowledge schema: assumptions about underlying domain-knowledge structure
Classification
establish correct class for an object
object should be available for inspection: "natural" objects
examples: rock classification, apple classification
terminology: object, class, attribute, feature
one of the simplest analytic tasks; many methods
other analytic tasks are sometimes reduced to a classification problem, especially diagnosis
Classification: Pruning method

generate all classes to which the object may belong
specify an object attribute
obtain the value of the attribute
remove all classes that are inconsistent with this value
Classification:inference structure
object specify attribute
generate
class
obtain
match
feature
truth value
Classification: method control

while new-solution generate(object -> candidate) do
  candidate-classes := candidate union candidate-classes;
while new-solution specify(candidate-classes -> attribute) and length candidate-classes > 1 do
  obtain(attribute -> new-feature);
  current-feature-set := new-feature union current-feature-set;
  for-each candidate in candidate-classes do
    match(candidate + current-feature-set -> truth-value);
    if truth-value = false; then candidate-classes := candidate-classes subtract candidate;
Classification: method variations

Limited candidate generation Different forms of attribute selection
decision tree information theory user control
Hierarchical search through class structure
Classification: domain schema

object type
has-attribute class-of
2+
object class
1+
attribute
requires value: universal
class constraint
Rock classification
rock
texture grain size colour
1+
mineral
minerals ontology
igneous rock
mineral content
percentage presence
silicate
volcanic rock
plutonic rock mineral content constraint
neso silicate
tecto silicate
syenite
diorite olivine quartz
peridotite
dunite
Nested classification
rock classification
rock sub-task obtain: Quartz percentage contains identify Quartz minerals
mineral classification
Quartz olivine
Rock classification prototype
Assessment
find decision category for a case based on domain-specific norms. typical domains: financial applications (loan application), community service terminology: case, decision, norms some similarities with monitoring
differences:
timing: assessment is more static different output: decision versus discrepancy
Assessment: abstract & match method

Abstract the case data
Specify the norms applicable to the case, e.g. rent-fits-income, correct-household-size
Select a single norm
Compute a truth value for the norm with respect to the case
See whether this leads to a decision
Repeat norm selection and evaluation until a decision is reached
Assessment:inference structure
case
abstract
abstracted case
specify
norms
select
evaluate
norm
decision
match
norm value
Assessment: method control

while new-solution abstract(case-description -> abstracted-case) do
  case-description := abstracted-case;
end while
specify(abstracted-case -> norms);
repeat
  select(norms -> norm);
  evaluate(abstracted-case + norm -> norm-value);
  evaluation-results := norm-value union evaluation-results;
until has-solution match(evaluation-results -> decision);
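A minimal Python sketch of the abstract and match control loop is given below. The two norm names come from the method description, but the case fields, the 30 percent rent ratio, the two-persons-per-room limit and the decision labels are all hypothetical values chosen only to make the loop executable.

```python
def abstract(case):
    """Abstract the raw case data: derive the rent-to-income ratio."""
    abstracted = dict(case)
    abstracted["rent_ratio"] = case["rent"] / case["income"]
    return abstracted


# Norms applicable to a case; thresholds are illustrative assumptions.
NORMS = {
    "rent-fits-income": lambda c: c["rent_ratio"] <= 0.3,
    "correct-household-size": lambda c: c["household_size"] <= 2 * c["rooms"],
}


def match(results):
    """Return a decision if the evaluated norms allow one, else None."""
    if not all(results.values()):
        return "not eligible"
    if len(results) == len(NORMS):
        return "eligible"
    return None


def assess(case):
    abstracted = abstract(case)                    # abstract
    results = {}
    for name, norm in NORMS.items():               # specify, then select
        results[name] = norm(abstracted)           # evaluate
        decision = match(results)                  # match
        if decision is not None:
            return decision
    return None


if __name__ == "__main__":
    print(assess({"rent": 500, "income": 2000, "household_size": 3, "rooms": 2}))
```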
Assessment control: UML notation

[more abstractions] abstract
specify norms [no more abstractions] select norm [match fails no decision] [match succeeds: decision found]
evaluate norm
match decision
Assessment: method variations

norms might be case-specific
cf. housing application
case abstraction may not be needed knowledge-intensive norm selection

random, heuristic, statistical can be key to efficiency sometimes dictated by human expertise
only acceptable if done in a way understandable to experts
Assessment: domain schema

case abstraction rule
case datum
1+
value: universal 1+
has abstraction
case datum
implies
requirement
norm
indicates truth-value: boolean 1+

decision rule
decision
Claim handling for unemployment benefits

claim handling
collect data data entry decide about claim
financial department
:claim
[no right] [right] compute benefit
send notification
prepare payment
Decision rules for claim handling

<norm> WW benefit requirement DEFINES <decision> WW benefit right
<decision rule> benefit decision rule
insured = false DEFINES WW-benefit-right.value = no-right
unemployed = false DEFINES WW-benefit-right.value = no-right
weeks-worked-requirement = false DEFINES WW-benefit-right.value = no-right
insured = true AND unemployed = true AND weeks-worked-requirement = true AND years-worked-requirement = false DEFINES WW-benefit-right.value = short-benefit
insured = true AND unemployed = true AND weeks-worked-requirement = true AND years-worked-requirement = true DEFINES WW-benefit-right.value = long-benefit
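The decision rules above translate almost directly into a small function. The function and parameter names below are assumptions chosen to mirror the rule text; the returned strings are the three decision values used in the rules.

```python
def ww_benefit_right(insured: bool, unemployed: bool,
                     weeks_worked_requirement: bool,
                     years_worked_requirement: bool) -> str:
    """Evaluate the benefit decision rules for one claim."""
    # Any one of the first three norms being false yields no-right.
    if not insured or not unemployed or not weeks_worked_requirement:
        return "no-right"
    # All three hold: the years-worked norm decides between short and long benefit.
    return "long-benefit" if years_worked_requirement else "short-benefit"
```

For example, `ww_benefit_right(True, True, True, False)` gives `"short-benefit"`.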
Diagnosis
find the fault that causes a system to malfunction
example: diagnosis of a copier
terminology: complaint/symptom, hypothesis, differential, finding(s)/evidence, fault
nature of fault varies: state, chain, component
should have some model of system behavior
default method: simple causal model
sometimes reduced to a classification task: direct associations between symptoms and faults
automation feasible in technical domains
Diagnosis: causal covering method

Find candidate causes (hypotheses) for the complaint using a causal network
Select a hypothesis
Specify an observable for this hypothesis and obtain its value
Verify each hypothesis to see whether it is consistent with the new finding
Continue this process until a single hypothesis is left or no more observables are available
Diagnosis: inference structure
[Figure: inference structure for diagnosis. Inferences: cover, select, specify, obtain, verify. Knowledge roles: complaint, hypothesis, observable, finding, result.]
Diagnosis: method control

while new-solution cover(complaint -> hypothesis) do
    differential := hypothesis add differential;
end while
repeat
    select(differential -> hypothesis);
    specify(hypothesis -> observable);
    obtain(observable -> finding);
    evidence := finding add evidence;
    foreach hypothesis in differential do
        verify(hypothesis + evidence -> result);
        if result = false then differential := differential subtract hypothesis
until length differential =< 1 or no observables left
faults := hypothesis;
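A minimal Python sketch of this control loop follows. The copier faults, observables, and values in the causal model are invented for illustration, and the select and specify steps are simplified to walking through a fixed list of observables rather than choosing an observable per hypothesis.

```python
# Hypothetical causal model: fault -> the finding it predicts per observable.
CAUSAL_MODEL = {
    "empty-toner": {"print-quality": "faint", "paper-feed": "ok"},
    "dirty-drum": {"print-quality": "streaked", "paper-feed": "ok"},
    "worn-roller": {"print-quality": "streaked", "paper-feed": "jams"},
}
# Hypothetical covering knowledge: complaint -> candidate faults.
COVERS = {"bad-copies": ["empty-toner", "dirty-drum", "worn-roller"]}
OBSERVABLES = ["print-quality", "paper-feed"]

def diagnose(complaint, obtain):
    """obtain(observable) returns the observed value, e.g. by asking the user."""
    differential = list(COVERS.get(complaint, []))       # cover
    evidence = {}
    remaining = list(OBSERVABLES)
    while len(differential) > 1 and remaining:
        observable = remaining.pop(0)                    # select + specify (simplified)
        evidence[observable] = obtain(observable)        # obtain
        differential = [                                 # verify each hypothesis
            h for h in differential
            if all(CAUSAL_MODEL[h][o] == v for o, v in evidence.items())
        ]
    return differential
```

The function returns the remaining differential, which may hold one fault, several (observables exhausted), or none (no hypothesis is consistent with the evidence).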
Diagnosis: method variations

inclusion of abstractions
simulation methods
see literature on model-based diagnosis
library of Benjamins
Diagnosis: domain schema

[Figure: diagnosis domain schema. Concepts: system feature, system state, system observable, fault. Attributes shown: value: universal, status: universal, prevalence: number[0..1]. Rule type: causal dependency (system state can cause system feature).]
Monitoring
analyze an ongoing process to find out whether it behaves according to expectations
terminology: parameter, norm, discrepancy, historical data
main features: dynamic nature of the system, cyclic task execution
output is "just" a discrepancy => no explanation
often: coupling of monitoring and diagnosis; the output of monitoring is the input of diagnosis
Monitoring: data-driven method
Starts when new findings are received
For a finding, a parameter and a norm value are specified
Comparison of the finding with the norm generates a difference description
This difference is classified as a discrepancy using data from previous monitoring cycles
Monitoring: inference structure

[Figure: inference structure for monitoring. Inferences: receive, select, specify, compare, classify. Knowledge roles: system model, new finding, parameter, norm, difference, discrepancy, historical data.]
Monitoring: method control

receive(new-finding);
select(new-finding -> parameter);
specify(parameter -> norm);
compare(norm + finding -> difference);
classify(difference + historical-data -> discrepancy);
historical-data := finding add historical-data;
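One monitoring cycle can be sketched in Python as follows. The parameter name, the norm range, and the discrepancy labels ("none", "incidental", "persistent") are assumptions invented for this example; the classification rule is just one simple way to use historical data.

```python
from collections import defaultdict

# Hypothetical norms: parameter -> (low, high) acceptable range.
NORMS = {"feeder-load-amps": (0.0, 400.0)}

# Historical data: parameter -> values seen in earlier cycles.
history = defaultdict(list)

def monitor(finding: tuple[str, float]) -> str:
    parameter, value = finding                       # receive + select
    low, high = NORMS[parameter]                     # specify
    if value < low:                                  # compare
        difference = value - low
    elif value > high:
        difference = value - high
    else:
        difference = 0.0
    previous = history[parameter]
    if difference == 0.0:                            # classify
        discrepancy = "none"
    elif previous and not (low <= previous[-1] <= high):
        discrepancy = "persistent"   # previous cycle was also out of range
    else:
        discrepancy = "incidental"
    history[parameter].append(value)                 # update historical data
    return discrepancy
```

Calling `monitor` once per received finding mirrors the cyclic execution of the task; the returned discrepancy could then be passed to a diagnosis task.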
Monitoring: method variations

model-driven monitoring
system has the initiative
typically executed at regular points in time
example: software project management
classification function treated as a task in its own right
apply classification method
add data abstraction inference
Prediction
analytic task with some synthetic features
analyses current system behavior to construct a description of the system state at a future point in time
example: weather forecasting
often a sub-task in diagnosis
also found in knowledge-intensive modules of teaching systems, e.g. for physics
inverse: retrodiction (example: big-bang theory)
Synthesis
Given a set of requirements, construct a system description that fulfills these requirements
requirements (external)
hard requirement: "price lower than $2,000"
soft requirement: "fast system"
constraints & preferences (internal)
constraint: "P166 processor requires 16Mb"
preference: "prefer cheapest component"
Ideal synthesis method

Operationalize requirements: preferences and constraints
Generate all possible system structures
Select sub-set of valid system structures: obey constraints
Order valid system structures: based on preferences
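The generate, select, and sort steps can be illustrated with a toy PC configuration in Python. The component catalogue and prices are invented for the example; the constraint and the requirement types echo the examples given earlier (a P166 processor requires 16Mb, a price limit as hard requirement, speed as soft requirement, cheapest as preference).

```python
from itertools import product

# Hypothetical catalogue entries: (name, price, size or speed).
CPUS = [("P133", 300, 133), ("P166", 400, 166)]
MEMORY = [("8Mb", 100, 8), ("16Mb", 180, 16)]

def synthesize(max_price: int):
    # generate: all possible system structures
    candidates = product(CPUS, MEMORY)
    # select subset: obey the constraint and the hard requirement
    valid = [
        (cpu, mem) for cpu, mem in candidates
        if not (cpu[0] == "P166" and mem[2] < 16)    # constraint
        and cpu[1] + mem[1] < max_price              # hard requirement
    ]
    # sort: faster system first (soft requirement), then cheaper (preference)
    return sorted(valid, key=lambda s: (-s[0][2], s[0][1] + s[1][1]))
```

Exhaustive generation is only workable for tiny catalogues like this one, which is why the method is called ideal rather than practical.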
Synthesis: inference structure
[Figure: inference structure for synthesis. Inferences: operationalize, generate, select subset, sort. Knowledge roles: requirements, hard requirements, soft requirements, system composition knowledge, possible system structures, constraints, valid system structures, preferences, preference ordering knowledge, list of preferred system structures.]
Design
synthetic task
system to be constructed is a physical artifact
example: design of a car
can include creative design of components
creative design is too hard a nut to crack for current knowledge technology
sub-type of design which excludes creative design => configuration design
Configuration design
given predefined components, find an assembly that satisfies requirements and obeys constraints
example: configuration of an elevator or a PC
terminology: component, parameter, constraint, preference, requirement (hard & soft)
form of design that is well suited for automation
computationally demanding
Elevator configuration: knowledge base reuse
Configuration: propose & revise method

Simple basic loop:
Propose a design extension
Verify the new design
If verification fails, revise the design
Specific domain-knowledge requirements: revise strategies
Method can also be used for other synthetic tasks: assignment with backtracking, skeletal planning
Configuration: method decomposition

[Figure: method decomposition for propose & revise. Inferences: operationalize, specify, propose, verify, critique, select, modify. Knowledge roles: requirements, soft requirements, hard requirements, skeletal design, extension, design, truth value, violation, action list, action.]
Configuration: method control

operationalize(requirements -> hard-reqs + soft-reqs);
specify(requirements -> skeletal-design);
while new-solution propose(skeletal-design + design + soft-reqs -> extension) do
    design := extension union design;
    verify(design + hard-reqs -> truth-value + violation);
    if truth-value = false then
        critique(violation + design -> action-list);
        repeat
            select(action-list -> action);
            modify(design + action -> design);
            verify(design + hard-reqs -> truth-value + violation);
        until truth-value = true;
end while
Configuration: method variations

Perform verification plus revision only when a value has been proposed for all design elements
can have a large impact on the competence of the method
Avoid the use of fix knowledge
fixes are search heuristics to navigate the potentially extensive space of alternative designs
alternative: chronological backtracking
Configuration: domain schema

[Figure: configuration domain schema. Labels shown: component (model list: list), has-parameter (0+), parameter (value: universal), design element, constraint, constraint expression, calculation expression (computes), preference (preference rating: universal), preference expression (defines preference), fix, fix action (action type), implies. Cardinalities shown: 1+.]
Types of configuration may require different methods

Parametric design
Assembly is largely fixed
Emphasis on finding parameter values that obey global constraints and adhere to preferences
Example: elevator design
Layout
Component parameters are fixed
Emphasis on constructing assembly (topological relations)
Example: mould configuration
Literature: Motta (1999), Chandrasekaran (1992)
Assignment
create a mapping between two sets of objects
allocation of offices to employees
allocation of airplanes to gates
mapping has to satisfy requirements and be consistent with constraints
terminology: subject, resource, allocation
can be seen as a degenerate form of configuration design
Assignment: method without backtracking

Order the allocation of subjects to resources by first selecting a sub-set of subjects
If necessary: group the subjects into subject groups for joint resource assignment
requires special type of constraints and preferences
Take a subject (or subject group) and assign a resource to it
Repeat this process until all subjects have a resource
Assignment: inference structure
[Figure: inference structure for assignment. Inferences: select subset, group, assign. Knowledge roles: subjects, subject set, subject group, resources, resource, current allocations.]
Assignment: method control
while not empty subjects do
    select-subset(subjects -> subject-set);
    while not empty subject-set do
        group(subject-set -> subject-group);
        assign(subject-group + resources + current-allocations -> resource);
        current-allocations := < subject-group, resource > union current-allocations;
        subject-set := subject-set/subject-group;
        resources := resources/resource;
    end while
    subjects := subjects/subject-set;
end while
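A greedy Python sketch of assignment without backtracking is given below. It simplifies the control structure above by treating every subject as its own subset and group; the `fits` callback is a hypothetical stand-in for the constraint knowledge used by the assign inference.

```python
def assign_all(subjects, resources, fits):
    """Assign each subject the first free resource that satisfies the constraints.

    fits(subject, resource, allocations) -> bool is a hypothetical
    constraint check supplied by the caller.
    """
    allocations = {}
    free = list(resources)
    for subject in subjects:                 # select-subset / group, one subject at a time
        for resource in free:
            if fits(subject, resource, allocations):
                allocations[subject] = resource      # assign
                free.remove(resource)                # resource is no longer available
                break
        else:
            # No backtracking: earlier assignments are never revised.
            raise ValueError(f"no resource left for {subject}")
    return allocations
```

Because nothing is ever undone, the order in which subjects are handled determines whether a complete allocation is found; that is the price of avoiding backtracking.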
Assignment: method variations

Existing allocations: additional input
subject-specific constraints and preferences: see synthesis and configuration design
Planning
shares many features with design
main difference: "system" consists of activities plus time dependencies
examples: travel planning; planning of building activities
automation only feasible if the basic plan elements are predefined
consider use of the general synthesis method (e.g. therapy planning) or the configuration-design method
Planning method
[Figure: inference structure for planning. Inferences: operationalize, generate, select subset, sort. Knowledge roles: plan goal, requirements, hard requirements, soft requirements, plan composition knowledge, possible plans, constraints, valid plans, preferences, preference ordering knowledge, list of preferred plans.]
Scheduling
Given a set of predefined jobs, each of which consists of temporally sequenced activities called units, assign all the units to resources at time slots
example: production scheduling on plant floors
Terminology: job, unit, resource, schedule
Often done after planning (= specification of jobs)
Take care: use of the terms planning and scheduling differs
Scheduling: temporal dispatching method

Specify an initial schedule
Select a candidate unit to be assigned
Select a target resource for this unit
Assign unit to the target resource
Evaluate the current schedule
Modify the schedule, if needed
Scheduling: inference structure

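[Figure: inference structure for scheduling. Inferences: specify, select (candidate unit), select (target resource), assign, verify, modify. Knowledge roles: job, schedule, candidate unit, target resource, truth value.]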
Scheduling: method control

specify(jobs -> schedule);
while new-solution select(schedule -> candidate-unit) do
    select(candidate-unit + schedule -> target-resource);
    assign(candidate-unit + target-resource -> schedule);
    evaluate(schedule -> truth-value);
    if truth-value = false then modify(schedule -> schedule);
end while
Scheduling: method variations

Constructive versus repair method
Refinement often necessary
see scheduling literature
catalog of Hori (IBM Japan)
Scheduling: typical domain schema

[Figure: typical scheduling domain schema. Concepts: schedule, job, job unit, unit, resource. Attributes shown: release-date: time, due-date: time, type: string, start-time: time, end-time: time, start: time, end: time, resource-type: string. Relations: includes {temporally ordered}, is performed at {dynamically linked}. Also shown: preference constraint, resource capacity constraint.]
Modeling
included for completeness
"construction of an abstract description of a system in order to explain or predict certain system properties or phenomena"
examples: construction of a simulation model of a nuclear accident; knowledge modeling itself
seldom automated => creative steps
exception: chip modeling
In applications: typical task combinations

monitoring + diagnosis: production process
monitoring + assessment: nursing task
diagnosis + planning: troubleshooting devices
classification + planning: military applications
Example: apple-pest management

[Figure: activity diagram for apple-pest management. Activities: monitor crop, identify pest, plan measure, execute plan. Guards: [possible threat], [possible pest].]
Comparison with O-O analysis

Reuse of functional descriptions is not common in O-O analysis
notion of functional object
But: see work on design patterns
strategy patterns
templates are patterns of knowledge-intensive tasks
Only real leverage from reuse if the patterns are limited to restricted task types
Ontology Engineering
Ontology Development
What Is An Ontology?
An ontology is an explicit description of a domain:
concepts
properties and attributes of concepts
constraints on properties and attributes
individuals (often, but not always)
An ontology defines
a common vocabulary
a shared understanding
Ontology Examples
Taxonomies on the Web: Yahoo! categories
Catalogs for on-line shopping: Amazon.com product catalog
Domain-specific standard terminology:
Unified Medical Language System (UMLS)
UNSPSC: terminology for products and services
Common Information Model (CIM): a semantic model describing the components of a power system at an electrical level and the relationships between each component
What Is Ontology Engineering?

Defining terms in the domain and relations among them
Defining concepts in the domain (classes)
Arranging the concepts in a hierarchy (subclass-superclass hierarchy)
Defining which attributes and properties (slots) classes can have and constraints on their values
Defining individuals and filling in slot values
Why Develop an Ontology?

To share common understanding of the structure of information
among people
among software agents
To enable reuse of domain knowledge
to avoid re-inventing the wheel
to introduce standards to allow interoperability
Why Develop an Ontology?

To make domain assumptions explicit
easier to change domain assumptions (consider a genetics knowledge base)
easier to understand and update legacy data
To separate domain knowledge from the operational knowledge
re-use domain and operational knowledge separately (e.g., configuration based on constraints)
Backbone of Other systems

Ontologies declare structure for: databases, knowledge bases
Ontologies provide domain description for: software agents, problem-solving methods, domain-independent applications
Ontology Development Process

Determine the domain and scope of the ontology
Consider reusing existing ontologies
Enumerate important terms in the ontology
Define the classes and the class hierarchy
Define the properties of classes (slots)
Define the facets of the slots
Create instances
Ontology Development 101: A Guide to Creating Your First Ontology
Pizza Domain
[Figure: example network for the pizza domain. Nodes: DMRs, The Special, Olive Oil, Onion, Provolone. Links: Offers, Made by, Contains.]
Competency Questions
Which styles should I consider when choosing a pizza?
Is a Sicilian pizza a tomato or olive oil base?
Does tuna go well with pepperoni?
What is the best choice of pizza for a vegetarian?
Which characteristics of a pizza affect its appropriateness for a party?
Does the flavor of an ingredient change with the base?
What were good toppings for a thick crust?
Consider Reuse
Why reuse other ontologies?
to save effort
to interact with the tools that use other ontologies
to use ontologies that have been validated through use in applications
What to Reuse?
Ontology libraries
DAML ontology library (www.daml.org/ontologies)
Ontolingua ontology library (www.ksl.stanford.edu/software/ontolingua/)
Protégé ontology library (protege.stanford.edu/plugins.html)
Upper ontologies
IEEE Standard Upper Ontology (suo.ieee.org)
Cyc (www.cyc.com)
General ontologies
DMOZ (www.dmoz.org)
WordNet (www.cogsci.princeton.edu/~wn/)
Domain-specific ontologies
UMLS Semantic Net
GO (Gene Ontology) (www.geneontology.org)
CIM
Enumerate Important Terms

What are the terms we need to talk about?
What are the properties of these terms?
What do we want to say about the terms?
Define Classes and the Class Hierarchy

A class is a concept in the domain
a class of pizzas
a class of pizza shops
a class of ingredients
A class is a collection of elements with similar properties
Instances of classes
the pizza you will have for lunch
Class Inheritance
Classes usually constitute a taxonomic hierarchy (a subclass-superclass hierarchy)
A class hierarchy is usually an IS-A hierarchy:
an instance of a subclass is an instance of a superclass
If you think of a class as a set of elements, a subclass is a subset
Class Inheritance - Example

Mushroom is a subclass of Topping
Every Mushroom is a Topping
Green-pepper is a subclass of Vegetable
Every Green-pepper is a Vegetable
Provolone is a subclass of Cheese
Every Provolone is a Cheese
What should be the specification? The Kind? The hunk-of?
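The IS-A reading of a class hierarchy can be illustrated with ordinary Python classes. The three subclass links named in the example (Mushroom/Topping, Green-pepper/Vegetable, Provolone/Cheese) come from the slide; treating Vegetable and Cheese as subclasses of Topping is an assumption added here only to make the hierarchy connected.

```python
class Topping:
    """Most general class in this small sketch."""

class Vegetable(Topping):      # assumption: vegetables are used as toppings here
    pass

class Cheese(Topping):         # assumption: cheeses are used as toppings here
    pass

class Mushroom(Topping):
    pass

class GreenPepper(Vegetable):
    pass

class Provolone(Cheese):
    pass

# An instance of a subclass is an instance of every superclass.
slice_of_provolone = Provolone()
print(isinstance(slice_of_provolone, Cheese))    # True
print(isinstance(slice_of_provolone, Topping))   # True (under the assumption above)
print(issubclass(GreenPepper, Vegetable))        # True
```

The subset view from the previous slide corresponds to `issubclass`: every element of the subclass set is also an element of the superclass set.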
Modes of Development
top-down: define the most general concepts first and then specialize them
bottom-up: define the most specific concepts and then organize them in more general classes
combination: define the more salient concepts first and then generalize and specialize them
Documentation
Classes (and slots) usually have documentation
Describing the class in natural language
Listing domain assumptions relevant to the class definition
Listing synonyms
Documenting classes and slots is as important as documenting computer code
Define Properties of Classes: Slots

Slots in a class definition describe attributes of instances of the class and relations to other instances
Each Pizza will have crust, sauce, and toppings.
Necessary conditions? Necessary and sufficient? Sufficient?
Properties (Slots)
Types of properties
intrinsic properties: crust, sauce
extrinsic properties: name, price
parts: ingredients for a pizza
relations to other objects: pizza store, customer
Simple and complex properties
simple properties (attributes): contain primitive values (strings, numbers)
complex properties: contain (or point to) other objects (e.g., a pizza instance)
Slot and Class Inheritance

A subclass inherits all the slots from the superclass
If a topping has a name and a cost, a cheese topping also has a name and a cost
If a class has multiple superclasses, it inherits slots from all of them
Use great care!
Property Constraints
Property constraints (facets) describe or limit the set of possible values for a slot
The name of a pizza is a string
The pizza producer is an instance of PizzaShop
A PizzaShop has exactly one location
Common Facets
Slot cardinality: the number of values a slot has
Slot value type: the type of values a slot has
Minimum and maximum value: a range of values for a numeric slot
Default value: the value a slot has unless explicitly specified otherwise
Common Facets: Slot Cardinality

Cardinality
Cardinality N means that the slot must have N values
Minimum cardinality
Minimum cardinality 1 means that the slot must have a value (required)
Minimum cardinality 0 means that the slot value is optional
Maximum cardinality
Maximum cardinality 1 means that the slot can have at most one value (single-valued slot)
Maximum cardinality greater than 1 means that the slot can have more than one value (multiple-valued slot)
Common Facets: Value Type

String: a string of characters ("The Special")
Number: an integer or a float (15, 4.5)
Boolean: a true/false flag
Enumerated type: a list of allowed values (high, medium, low)
Complex type: an instance of another class
Specify the class to which the instances belong
The Pizza class is the value type for the slot "produces" at the PizzaShop class
Domain and Range of Slot

Domain of a slot: the class (or classes) that have the slot
More precisely: the class (or classes) whose instances can have the slot
Range of a slot: the class (or classes) to which slot values belong
Facets and Class Inheritance

A subclass inherits all the slots from the superclass
A subclass can override the facets to narrow the list of allowed values
Make the cardinality range smaller
Replace a class in the range with a subclass
[Figure: example with the classes Pizza and PizzaShop linked by the producer slot, and The Special and DMRs attached to them by is-a and producer links.]
Create Instances
Create an instance of a class
The class becomes a direct type of the instance
Any superclass of the direct type is a type of the instance
Assign slot values for the instance frame
Slot values should conform to the facet constraints
Knowledge-acquisition tools often check that
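The kind of check a knowledge-acquisition tool performs can be sketched in Python. The `Slot` record, the `PIZZA_SLOTS` list, and `check_instance` are hypothetical names made up for this example; only the cardinality and value-type facets are covered, and the slot definitions follow the pizza examples given for property constraints.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Slot:
    name: str
    value_type: type
    min_cardinality: int = 0
    max_cardinality: Optional[int] = None   # None means no upper bound

class PizzaShop:
    pass

# Hypothetical slot definitions for a Pizza class.
PIZZA_SLOTS = [
    Slot("name", str, min_cardinality=1, max_cardinality=1),
    Slot("producer", PizzaShop, max_cardinality=1),
    Slot("toppings", str),
]

def check_instance(slots, values):
    """Return facet violations for slot values given as name -> list of values."""
    problems = []
    for slot in slots:
        filled = values.get(slot.name, [])
        if len(filled) < slot.min_cardinality:
            problems.append(f"{slot.name}: needs at least {slot.min_cardinality} value(s)")
        if slot.max_cardinality is not None and len(filled) > slot.max_cardinality:
            problems.append(f"{slot.name}: allows at most {slot.max_cardinality} value(s)")
        for value in filled:
            if not isinstance(value, slot.value_type):
                problems.append(f"{slot.name}: {value!r} is not a {slot.value_type.__name__}")
    return problems
```

An empty result means the instance conforms; for example, giving the producer as the plain string "DMRs" instead of a PizzaShop instance would be reported as a value-type violation.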
Power Distribution Network Asset Categorization
Development Process
Defining purpose, domain and scope
Performing competency questioning and informally describing the domain knowledge
Analyzing the description to capture concepts and properties
Considering reuse of an existing ontology, i.e. CIM, and mapping concepts into CIM
Modeling asset classes and relationships
Verifying interchangeability, expressivity, reusability, extensibility and integrability
Purpose, Domain and Scope

The purpose is to facilitate the determination of risks, costs and social factors associated with the implementation of a power distribution network.
The domain encompasses the medium voltage (MV) distribution feeder, including network components, network operation, and operational environment.
The scope is limited to capturing information that aids in determining the risks, costs and social factors involved with the distribution feeder.
Elicitation of Domain Knowledge

Competency questions are formulated and asked.
Human experts are then interviewed and relevant documents are researched to elaborate an informal description of the domain, i.e. the MV distribution feeder.
The following example shows some of the informal domain description elicited from the experts.
Domain Informal Description: Example

What is a power distribution network?

It is a part of the power system.
It distributes electric energy from the main substation to distribution substations and transformers.
It is situated in diverse landscapes and environments.
It runs along public roads.
It also runs through fields and forests.
It can be of overhead or underground construction, or a combination of both.
An overhead power line is placed above ground with appropriate clearance from nearby structures and trees.
An underground power line is placed under ground with some kind of protection.
An underground power line can also be put above ground inside a structure, e.g. a building or a bridge.
Informal Description Analysis

An annotation technique is used to capture the keywords that represent the concepts in the domain.
A concept is characterized by features that differentiate it from other concepts.
For example, the concepts in the power distribution system domain include distribution feeder, overhead line, underground line, and location.
Classes and Relationships

Transform concepts into asset classes, while their characteristics turn into class properties
Employ the CIM specification to constrain the modeling work
Reuse existing CIM models where applicable
Extend or develop new models to suit the application
Verify interchangeability, expressivity, reusability, extensibility and integrability
Asset Classes and Their Relationships

[Figure: class diagram of asset classes and their relationships. Classes shown: PSR, Equipment, EquipmentContainer, ConductingEquipment, Substation, Conductor, WireType, Feeder, Jumper, Fuse, Insulator, OHLine, Location, Switch, Pole, DS, LBS, Cable, Hanger, OHConductor, Joint, Termination, UGLine, Duct.]
Thank you
