
Module 5

Natural Language Processing

Two of the most difficult tasks facing AI researchers are
- developing programs that understand natural language, and
- developing programs that comprehend visual scenes.
Developing programs that can understand natural language is very difficult. Why?
Natural languages are large
They contain a number of different sentences.
New sentences can always be produced.
There is ambiguity in a natural language
Many words have several meanings and sentences can
have several meanings in different contexts

English sentences are incomplete descriptions of the information they are intended to convey.
- Some dogs are outside.

The same expression means different things in different contexts:
- Where is the water?
Advantage: we can communicate about an infinite world using a finite number of symbols.
No natural language program can be complete, because new words, expressions and meanings can be generated quite freely.
There are lots of ways to say the same thing:
- Mary was born on October 11.
- Mary's birthday is on October 11.

Overview of Linguistics
Linguistics: the study of language
Levels of knowledge used in natural language understanding
1. Phonological knowledge
- knowledge that relates sounds to words
- Phoneme: the smallest unit of sound
2. Morphological knowledge
- lexical knowledge about how words are constructed from basic units called morphemes
- Morpheme: the smallest unit of meaning
3. Syntactic knowledge
4. Semantic knowledge
5. Pragmatic knowledge
6. World knowledge

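Morphological analysis
- punctuation is separated from words
Syntactic analysis
Eg: "Boy the go the to store" is rejected as structurally ill-formed.
Semantic analysis
Eg: "Colorless green ideas sleep furiously" is syntactically correct but has no sensible meaning.
Eg: "I dropped my diamond"
Discourse integration
Eg: "John wanted it." What "it" refers to depends on the preceding sentences.
Pragmatic analysis
Eg: "Do you know what time it is?" should be interpreted as a request to be told the time.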

Grammars and Languages
A language can be considered as a set of strings of finite or infinite length.
A string is constructed by concatenating symbols of the alphabet.
Alphabet: the symbols of the language.
Sentences are constructed using a set of rules called a grammar.
The language generated by a grammar G is written L(G).
A grammar G can be defined as G = (Vn, Vt, s, p), where Vn is the set of nonterminal symbols, Vt the set of terminal symbols, s the start symbol and p the set of production rules.
Terminal symbols
- symbols which cannot be decomposed further
eg: adjectives, nouns or verbs in English
Nonterminal symbols
- symbols which can be decomposed further or expanded by rules
eg: noun phrases or verb phrases

The most common way to represent grammars is as a set of production rules:

S → NP VP
NP → ART N
NP → N
VP → V NP
N → boy | popsicle | frog
V → ate | kissed | flew
ART → the | a

With this grammar G, the following sentences can be generated:
- The boy ate a popsicle
- The frog kissed a boy
- A boy ate the frog

Derivation of the first sentence:
S ⇒ NP VP
⇒ ART N VP
⇒ the N VP
⇒ the boy VP
⇒ the boy V NP
⇒ the boy ate NP
⇒ the boy ate ART N
⇒ the boy ate a N
⇒ the boy ate a popsicle
A grammar does not guarantee the generation of meaningful sentences, only that they are structurally correct.
Eg: The popsicle flew a frog
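The production rules above can be turned into a small generator. The following is a minimal sketch, assuming the grammar is stored as a Python dictionary; the names `GRAMMAR` and `expand` are illustrative and not part of the original material. It enumerates every sentence in L(G), which is possible here only because this particular grammar has no recursive rules.

```python
# Production rules of the example grammar; keys are nonterminals.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["ART", "N"], ["N"]],
    "VP": [["V", "NP"]],
    "N": [["boy"], ["popsicle"], ["frog"]],
    "V": [["ate"], ["kissed"], ["flew"]],
    "ART": [["the"], ["a"]],
}


def expand(symbols):
    """Yield every word list derivable from a list of grammar symbols."""
    if not symbols:
        yield []
        return
    first, rest = symbols[0], symbols[1:]
    if first in GRAMMAR:
        # Nonterminal: try each production, leftmost symbol first.
        for rhs in GRAMMAR[first]:
            yield from expand(rhs + rest)
    else:
        # Terminal word: keep it and expand the remainder.
        for tail in expand(rest):
            yield [first] + tail


sentences = {" ".join(words) for words in expand(["S"])}
print(len(sentences))
print("the popsicle flew a frog" in sentences)
```

The second check illustrates the point made above: the grammar accepts structurally correct sentences whether or not they are meaningful.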

Structural Representations
Sentences can be represented as a tree or graph to expose the structure of the constituent parts.

Phrase marker or syntactic tree for "the boy ate a popsicle":

S
  NP
    ART  the
    N    boy
  VP
    V    ate
    NP
      ART  a
      N    popsicle

Basic Parsing Techniques
Parsing is the process of determining the syntactic structure of a sentence: the sentence is taken apart word by word and its structure is determined from its constituent parts and subparts.
The structure of a sentence can be represented with a syntactic tree or a list.

To parse a sentence, it is necessary to find a way in which that sentence could have been generated from the start symbol. This can be done in two ways:
Top-down parsing
- Begin with the start symbol and apply the grammar rules forward until the symbols at the terminals of the tree correspond to the components of the sentence being parsed.
Bottom-up parsing
- Begin with the sentence to be parsed and apply the grammar rules backward until a single tree whose terminals are the words of the sentence and whose top node is the start symbol has been produced.
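Top-down parsing can be sketched as a recursive-descent parser with backtracking. This is only an illustration under stated assumptions: it reuses the boy/popsicle/frog grammar from earlier, and the table `LEXICON` and the functions `parse`, `_parse_seq` and `parse_sentence` are hypothetical names introduced here.

```python
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["ART", "N"], ["N"]],
    "VP": [["V", "NP"]],
}
# Hypothetical word-to-category table for the example grammar.
LEXICON = {
    "boy": "N", "popsicle": "N", "frog": "N",
    "ate": "V", "kissed": "V", "flew": "V",
    "the": "ART", "a": "ART",
}


def parse(symbol, words, pos):
    """Yield (tree, next_pos) for each way `symbol` can start at words[pos]."""
    if symbol not in GRAMMAR:
        # Lexical category: it must match the category of the next word.
        if pos < len(words) and LEXICON.get(words[pos]) == symbol:
            yield (symbol, words[pos]), pos + 1
        return
    for rhs in GRAMMAR[symbol]:
        yield from _parse_seq(symbol, rhs, words, pos, [])


def _parse_seq(symbol, rhs, words, pos, children):
    """Expand the right-hand side left to right, backtracking on failure."""
    if not rhs:
        yield (symbol, *children), pos
        return
    for child, nxt in parse(rhs[0], words, pos):
        yield from _parse_seq(symbol, rhs[1:], words, nxt, children + [child])


def parse_sentence(text):
    """Return the first parse tree that covers the whole sentence, or None."""
    words = text.lower().split()
    for tree, end in parse("S", words, 0):
        if end == len(words):
            return tree
    return None


print(parse_sentence("The boy ate a popsicle"))
```

The parser starts from S and applies rules forward, so a rule such as NP → ART N is abandoned as soon as the next word is not an article.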

Parsing an input to create an output structure (figure):
Input string → Parser (consults the Lexicon) → Output representation structure

Eg: Kathy jumped the horse

Top-down parsing
S
⇒ NP VP
⇒ N VP
⇒ Kathy VP
⇒ Kathy V NP
⇒ Kathy jumped NP
⇒ Kathy jumped ART N
⇒ Kathy jumped the N
⇒ Kathy jumped the horse

Bottom-up parsing
Kathy jumped the horse
N jumped the horse
N V the horse
N V ART horse
N V ART N
NP V ART N
NP V NP
NP VP
S
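The bottom-up direction can be sketched by applying the rules backward until only the start symbol is left. The code below is a simple backtracking search, not an efficient shift-reduce parser; the rule list and the name `reduce_to_start` are assumptions made for this illustration.

```python
# Each rule is (left-hand side, right-hand side); word rules act as the lexicon.
RULES = [
    ("S", ("NP", "VP")),
    ("NP", ("ART", "N")),
    ("NP", ("N",)),
    ("VP", ("V", "NP")),
    ("N", ("kathy",)),
    ("N", ("horse",)),
    ("V", ("jumped",)),
    ("ART", ("the",)),
]


def reduce_to_start(form, start="S", trace=None, seen=None):
    """Return the list of forms from `form` down to (start,), or None."""
    trace = (trace or []) + [form]
    seen = set() if seen is None else seen
    if form == (start,):
        return trace
    if form in seen:
        return None  # this form was already explored without success
    seen.add(form)
    for lhs, rhs in RULES:
        n = len(rhs)
        for i in range(len(form) - n + 1):
            if form[i:i + n] == rhs:
                # Replace the matched right-hand side by its left-hand side.
                reduced = form[:i] + (lhs,) + form[i + n:]
                result = reduce_to_start(reduced, start, trace, seen)
                if result is not None:
                    return result
    return None


steps = reduce_to_start(("kathy", "jumped", "the", "horse"))
for step in steps:
    print(" ".join(step))
```

The order of the printed reductions depends on the order of `RULES` and may differ from the hand derivation above; both end with the single symbol S.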

The Lexicon
- A dictionary of words, where each entry contains some syntactic, semantic and possibly some pragmatic information.
- Usually made up of variable-length data structures such as lists or records arranged in alphabetical order.
- Access to the words may be facilitated by indexing, binary search, hashing, or combinations of these methods.

Typical entries in a lexicon

Word      Type          Features
a         Determiner    {3s}
be        Verb          Trans: intransitive
boy       Noun          {3s}
can       Noun          {1s,2s,3s,1p,2p,3p}
carried   Verb          form: past, past participle
:         :
orange    Adjective
          Noun          {3s}
to        Preposition
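The two access methods mentioned for the lexicon, hashing and binary search, can be compared in a few lines. This is a sketch with a handful of hypothetical entries modelled loosely on the table; the feature values are illustrative only.

```python
import bisect

# Hypothetical lexicon entries: (word, type, features).
ENTRIES = [
    ("to", "preposition", set()),
    ("boy", "noun", {"3s"}),
    ("a", "determiner", {"3s"}),
    ("carried", "verb", {"past", "past participle"}),
    ("orange", "noun", {"3s"}),
]

# Hashing: a dict gives direct access by word.
BY_HASH = {word: (kind, feats) for word, kind, feats in ENTRIES}

# Binary search: keep the entries in alphabetical order and bisect the keys.
SORTED_ENTRIES = sorted(ENTRIES, key=lambda entry: entry[0])
SORTED_WORDS = [entry[0] for entry in SORTED_ENTRIES]


def lookup_hash(word):
    """Return (type, features) via the hash table, or None."""
    return BY_HASH.get(word)


def lookup_binary(word):
    """Return (type, features) via binary search over the sorted words."""
    i = bisect.bisect_left(SORTED_WORDS, word)
    if i < len(SORTED_WORDS) and SORTED_WORDS[i] == word:
        return SORTED_ENTRIES[i][1:]
    return None


print(lookup_hash("boy"))
print(lookup_binary("carried"))
```

A word with several readings (such as "orange" in the table) would need a list of entries per key; the sketch stores one reading per word to stay short.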

Understanding written text is easier than understanding speech.
General approaches to natural language understanding
1. The use of keyword and pattern matching.
2. Syntactic and semantic directed analysis.
3. Comparing and matching the input to real-world situations.
Of these, the second approach is the most popular.

Transformational Grammars
- provide a mechanism to produce a single representation for sentences having the same meaning, through a series of transformations.

Generative Grammars
- produce different structures for sentences having different syntactic forms even though they may have the same semantic content.
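Consider the following sentences:
- Susan printed the file.
- The file was printed by Susan.
Syntactic trees for the two sentences (figure): in the active sentence the subject NP is "Susan" and the VP consists of the verb "printed" and the object NP "the file"; in the passive sentence the subject NP is "the file" and the VP contains "was printed" and the phrase "by Susan".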

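Case Grammars
Grammar rules are written to describe syntactic rather than semantic regularities, but the structures they produce correspond to semantic relations rather than strictly syntactic ones.
Eg: "Susan printed the file" and "The file was printed by Susan" both map to
(printed (agent Susan)
         (object File))
Eg: Mother baked for three hours
(baked (agent Mother)
       (timeperiod 3-hours))
Eg: The pie baked for three hours
(baked (object Pie)
       (timeperiod 3-hours))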

The cases used by case grammars are:
(A) Agent - instigator of the action (animate)
(I) Instrument - cause of the event or object used in causing the event (inanimate)
(D) Dative - entity affected by the action (animate)
(L) Locative - place of the event
(S) Source - place from which something moves
(G) Goal - place to which something moves
(T) Time - time at which the event occurred
(O) Object - entity that is acted upon or that changes
Cases describe relationships between verbs and their arguments.

The process of parsing into a case representation is heavily directed by the lexical entries (case frames) associated with each verb. Cases in parentheses are optional.
open [ _ _ O (I) (A)]
- The door opened.
- John opened the door.
- John opened the door with a chisel.
die [ _ _ D]
- John died.
kill [ _ _ D (I) A]
- Bill killed John.
- Bill killed John with a knife.

Parsing using a case grammar is expectation-driven.
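One way to picture expectation-driven parsing is that, once the verb is known, its case frame says which cases must still be found and which may appear. A minimal sketch, assuming the three frames above are stored as required and optional sets; the names `CASE_FRAMES` and `check_frame` are hypothetical.

```python
# Case frames from the examples: parenthesized cases are optional.
CASE_FRAMES = {
    "open": {"required": {"O"}, "optional": {"I", "A"}},
    "die": {"required": {"D"}, "optional": set()},
    "kill": {"required": {"D", "A"}, "optional": {"I"}},
}


def check_frame(verb, cases):
    """Return (missing, unexpected) case labels for a proposed assignment."""
    frame = CASE_FRAMES[verb]
    filled = set(cases)
    missing = frame["required"] - filled
    unexpected = filled - frame["required"] - frame["optional"]
    return missing, unexpected


# "John opened the door with a chisel."
print(check_frame("open", {"A": "John", "O": "door", "I": "chisel"}))
# "Bill killed John with a knife." with the agent not yet identified.
print(check_frame("kill", {"D": "John", "I": "knife"}))
```

A parser could use the `missing` set as its list of expectations while scanning the rest of the sentence.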

Transition networks
Another popular method used to represent formal and
natural language structures
Based on the application of directed graphs(digraphs) and
finite state automata.
Consists of a number of nodes and labeled arcs.
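A transition network can be sketched as a finite state automaton whose arcs are labeled with word categories. The states `q0` to `q5`, the arc table and the small category dictionary below are all assumptions chosen to accept sentences of the form (ART) N V (ART) N.

```python
# Labeled arcs: (current state, word category) -> next state.
ARCS = {
    ("q0", "ART"): "q1",
    ("q0", "N"): "q2",
    ("q1", "N"): "q2",
    ("q2", "V"): "q3",
    ("q3", "ART"): "q4",
    ("q3", "N"): "q5",
    ("q4", "N"): "q5",
}
FINAL_STATES = {"q5"}

# Hypothetical word categories.
CATEGORY = {
    "the": "ART", "a": "ART",
    "boy": "N", "frog": "N", "popsicle": "N",
    "ate": "V", "kissed": "V",
}


def accepts(sentence):
    """Traverse the network word by word; accept if a final state is reached."""
    state = "q0"
    for word in sentence.lower().split():
        state = ARCS.get((state, CATEGORY.get(word)))
        if state is None:
            return False  # no arc with this label leaves the current node
    return state in FINAL_STATES


print(accepts("The boy ate a popsicle"))
print(accepts("boy the ate"))
```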

Semantic Analysis and Representation Structures
Semantic interpretation is the most difficult stage in the transformation process.
- The domain refers to the knowledge that is part of the world model the system knows about. It includes object descriptions, relationships and other relevant concepts.
- The context relates to previous expressions, the setting and time of the utterances, and the beliefs, desires and intentions of the speakers.
- The task is part of the service the system offers, such as retrieving information from a database, providing expert advice, or performing a language translation.

Lexical semantics approaches
1. based on semantic grammars
2. based on conceptual dependency theory
Semantic grammar
- a context-free grammar in which the choice of nonterminals and production rules is governed by semantic as well as syntactic function.
- there is usually a semantic action associated with each grammar rule.
Eg: Primitive action INGEST with slots ACTOR, OBJECT and TENSE (ACTOR and OBJECT still unfilled)
(INGEST (ACTOR nil)
        (OBJECT nil)
        (TENSE past))
The boy drank a soda
(INGEST (ACTOR (PP (NAME boy) (CLASS PHY-OBJ)
                   (TYPE ANIMATE) (REF DEF)))
        (OBJECT (PP (NAME soda) (CLASS PHY-OBJ)
                    (TYPE INANIMATE) (REF INDEF)))
        (TENSE past))

Compositional semantics approaches
The meaning of an expression is derived from the meanings of the parts of the expression.
- The target knowledge structures constructed in this approach are typically logic expressions such as the formulas of FOPL (first-order predicate logic).
Eg: NL statement: Sample24 contains silicon
Result of parsing:
(S DCL
   (NP (N (Sample24)))
   (AUX (TENSE (PRESENT)))
   (VP (V (contain))
       (NP (N (silicon)))))
Using this structure, the semantic interpreter would produce the following predicate clause:
(CONTAIN sample24 silicon)
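The step from parse structure to predicate clause can be sketched as a small tree walk. This is an illustration only: the parse is written as nested Python tuples (with the extra parentheses around leaf words dropped), and `find`, `head_noun` and `interpret` are hypothetical helper names.

```python
# Parse tree as nested tuples: (label, child, ...); leaves are strings.
PARSE = (
    "S", "DCL",
    ("NP", ("N", "sample24")),
    ("AUX", ("TENSE", "PRESENT")),
    ("VP", ("V", "contain"), ("NP", ("N", "silicon"))),
)


def find(tree, label):
    """Return the first subtree (depth-first) with the given label."""
    if isinstance(tree, tuple):
        if tree[0] == label:
            return tree
        for child in tree[1:]:
            hit = find(child, label)
            if hit is not None:
                return hit
    return None


def head_noun(noun_phrase):
    """Return the noun inside a noun phrase."""
    return find(noun_phrase, "N")[1]


def interpret(tree):
    """Compose (PREDICATE subject object) from the meanings of the parts."""
    subject = next(
        child for child in tree[1:]
        if isinstance(child, tuple) and child[0] == "NP"
    )
    verb_phrase = find(tree, "VP")
    verb = find(verb_phrase, "V")[1]
    obj = find(verb_phrase, "NP")
    return (verb.upper(), head_noun(subject), head_noun(obj))


print(interpret(PARSE))
```

The verb supplies the predicate name and the two noun phrases supply its arguments, which mirrors the clause (CONTAIN sample24 silicon).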

Natural Language Generation
- The exact inverse of language understanding.
- More difficult than understanding, because the system must decide
  - what to say,
  - how the utterances should be stated,
  - which form is better (active or passive),
  - which words and structures best express the intent, and
  - when to say what.

The study of language generation falls naturally into three areas:
1) the determination of content,
2) formulating and developing a text utterance plan, and
3) achieving a realization of the desired utterances.
Content determination
- concerned with what details to include in an explanation, a request, a question or an argument in order to convey the meanings set forth by the goals of the speaker.
Text planning
- the process of organizing the content to be communicated so as to achieve the goals of the speaker.
Realization
- the process of mapping the organized content to actual text.

Natural Language Systems
A few of the more successful natural language understanding systems are
The LUNAR system
The LIFER system
- Language Interface Facility with Ellipsis and Recursion
- developed by Gary Hendrix
The SHRDLU system
- developed by Terry Winograd

Pattern Recognition
Computer pattern recognition
- a process whereby computer programs are used to recognize
various forms of input stimuli such as visual or acoustic(speech)
patterns.
Pattern recognition Systems are used to identify or classify
objects on the basis of their attribute and attribute-relation
values.
Recognition is the process of establishing a close match between
some new stimulus and previously stored stimulus patterns.

Object classification is closely related to recognition.
The ability to classify or group objects according to some commonly shared features is a form of class recognition.
Classification
- is essential for decision making, learning, and many other cognitive acts.
- depends on the ability to discover common patterns among objects.

The recognition and classification process
Step 1
- Stimuli produced by objects are perceived by sensory devices. The more prominent attributes (such as size, shape, color and texture) produce the strongest stimuli. The values of these attributes and their relations are used to characterize an object in the form of a pattern vector X.
- The range of characteristic attribute values is known as the measurement space M.
Step 2
- A subset of attributes whose values provide object grouping or clustering is selected.
- The range of the subset of attribute values is known as the feature space F.
Step 3
- Using the selected attribute values, object or class characterization models are learned by forming generalized prototype descriptions, classification rules, or decision functions.
- The range of the decision function values or classification rules is known as the decision space D.
Step 4
- Recognition of familiar objects is achieved through application of the rules learned in Step 3, by comparison and matching of object features with the stored models.
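The four steps can be sketched end to end with a nearest-prototype classifier. This is one possible reading of "generalized prototype descriptions", not the only one; the training data, the class labels and the function names are made up for the illustration, and `math.dist` requires Python 3.8 or later.

```python
import math


def select_features(pattern, indices):
    """Step 2: project a pattern vector from M into the feature space F."""
    return tuple(pattern[i] for i in indices)


def learn_prototypes(labelled, indices):
    """Step 3: average the selected features of each class into a prototype."""
    sums, counts = {}, {}
    for pattern, label in labelled:
        feats = select_features(pattern, indices)
        acc = sums.setdefault(label, [0.0] * len(feats))
        for k, value in enumerate(feats):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {
        label: tuple(v / counts[label] for v in acc)
        for label, acc in sums.items()
    }


def recognize(pattern, prototypes, indices):
    """Step 4: match the object's features against the stored prototypes."""
    feats = select_features(pattern, indices)
    return min(prototypes, key=lambda lab: math.dist(feats, prototypes[lab]))


# Step 1: hypothetical pattern vectors X = (size, weight, unrelated reading).
TRAINING = [
    ((1.0, 1.0, 9.0), "small"),
    ((1.2, 0.8, 2.0), "small"),
    ((5.0, 5.2, 7.0), "large"),
    ((4.8, 5.0, 1.0), "large"),
]
FEATURES = (0, 1)  # the third attribute does not help grouping, so drop it

prototypes = learn_prototypes(TRAINING, FEATURES)
print(recognize((1.1, 1.0, 5.0), prototypes, FEATURES))
print(recognize((5.1, 4.9, 5.0), prototypes, FEATURES))
```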

The pattern recognition process (figure):
Stimuli → Sensors → Feature selection → Matching → Classification, with a Learning block that produces the classification rules.

There are two basic approaches to the recognition problem:
1) the decision-theoretic approach
2) the syntactic approach

Decision-theoretic classification
- based on the use of decision functions to classify objects.
- A decision function maps pattern vectors X into decision regions of D.
Syntactic classification
- based on the uniqueness of syntactic structure among the object classes.
- a kind of grammar is defined for object descriptions.
- the vocabulary is based on shape primitives.
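For the decision-theoretic approach, the simplest decision function is linear. The sketch below assumes two classes separated by the line x1 + x2 = 6; the weights, the bias and the class names are arbitrary values picked for the example.

```python
def decision(weights, bias, x):
    """Linear decision function d(X) = w . X + bias."""
    return sum(w * xi for w, xi in zip(weights, x)) + bias


def classify(x, weights=(1.0, 1.0), bias=-6.0):
    """Map a pattern vector to one of two decision regions by the sign of d(X)."""
    return "class-1" if decision(weights, bias, x) > 0 else "class-2"


print(classify((5.0, 5.0)))
print(classify((1.0, 1.0)))
```

Patterns on one side of the boundary fall into the first decision region and all others into the second.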

Learning Classification Patterns
Before a system can recognize objects, it must possess knowledge of the characteristic features of those objects.
Learning decision functions, grammars or other rules can be performed in either of two ways:
- supervised learning
- unsupervised learning

Supervised Learning
- accomplished by presenting training examples to a learning unit.
- The examples are labelled beforehand with their correct identities or classes. The attribute values and object labels are used by the learning component to extract and determine pattern criteria for each class. This knowledge is used to adjust parameters in decision functions or grammar rewrite rules.
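Adjusting the parameters of a decision function from labelled examples can be illustrated with the perceptron learning rule. The perceptron is not named in the text above; it is swapped in here as one well-known way of doing this, and the four training examples are invented.

```python
def train_perceptron(examples, epochs=50, rate=1.0):
    """Adjust weights and bias whenever a labelled example is misclassified."""
    n = len(examples[0][0])
    weights, bias = [0.0] * n, 0.0
    for _ in range(epochs):
        errors = 0
        for x, label in examples:  # label is +1 or -1
            activation = sum(w * xi for w, xi in zip(weights, x)) + bias
            if label * activation <= 0:
                # Move the decision boundary toward the correct side.
                weights = [w + rate * label * xi for w, xi in zip(weights, x)]
                bias += rate * label
                errors += 1
        if errors == 0:
            break  # every training example is now classified correctly
    return weights, bias


def predict(weights, bias, x):
    """Return +1 or -1 according to the learned decision function."""
    return 1 if sum(w * xi for w, xi in zip(weights, x)) + bias > 0 else -1


EXAMPLES = [((1, 1), -1), ((2, 1), -1), ((4, 5), 1), ((5, 4), 1)]
weights, bias = train_perceptron(EXAMPLES)
print([predict(weights, bias, x) for x, _ in EXAMPLES])
```

The rule only converges when the classes are linearly separable, which holds for this toy data.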

Unsupervised Learning
- Labelled training examples are not available and little is known beforehand regarding the object population. In such cases, the system must be able to perceive and extract relevant properties from the unknown objects, find common patterns among them, and formulate descriptions or discrimination criteria consistent with the goals of the recognition process.

Learning through Clustering
- Clustering is the process of grouping or classifying objects on the basis of a close association or shared characteristics.
- a discovery learning process in which similar patterns are found among a group of objects.
The clustering problem gives rise to several subproblems:
1. What set of attributes and relations is most relevant, and what weights should be given to each?
2. What representation formalism should be used to characterize the objects?
3. What representation scheme should be used to describe the cluster groupings or classifications?
4. What clustering criteria are most consistent with and effective in achieving the objectives relative to the context or domain?
5. What clustering algorithms can best meet the criteria?
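As one concrete answer to the last two questions, the sketch below uses k-means with Euclidean distance as the clustering criterion. K-means is an assumption of this example, not a method named in the text; the starting centroids are simply the first k points so that the run is deterministic, and `math.dist` requires Python 3.8 or later.

```python
import math


def kmeans(points, k, iterations=10):
    """Group points around k centroids by repeated assign-and-average steps."""
    centroids = list(points[:k])  # assumption: start from the first k points
    groups = []
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda j: math.dist(p, centroids[j]))
            groups[nearest].append(p)
        # Update step: each centroid moves to the mean of its group.
        centroids = [
            tuple(sum(c) / len(g) for c in zip(*g)) if g else centroids[j]
            for j, g in enumerate(groups)
        ]
    return centroids, groups


POINTS = [(1.0, 1.0), (5.0, 5.0), (1.2, 0.8), (5.2, 4.8), (0.8, 1.2), (4.8, 5.2)]
centroids, groups = kmeans(POINTS, 2)
print(groups)
```

No labels are supplied: the two groups emerge only from the distances between the unlabelled points.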

Recognizing and Understanding Speech
The ability to communicate directly with programs offers several advantages:
- It eliminates the need for keyboard entries.
- It speeds up the interchange of information between the user and the system.
- With speech as the communication medium, users are free to perform other tasks concurrently with the computer interchange.
- Untrained personnel would be able to use computers in a variety of applications.

How does speech recognition work?
To convert speech to on-screen text, a computer has to go through several complex steps.
When we speak, we create vibrations in the air. The analog-to-digital converter (ADC) translates this analog wave into digital data that the computer can understand.
To do this, it digitizes the sound by taking precise measurements of the wave at frequent intervals. The system filters the digitized sound to remove unwanted noise, and sometimes to separate it into different bands of frequency.
Next, the signal is divided into small segments, and the program matches these segments to known phonemes in the appropriate language. A phoneme is the smallest element of a language, a representation of the sounds we make and put together to form meaningful expressions. The program examines phonemes in the context of the other phonemes around them. The software language model compares the phonemes to words in its built-in dictionary. The program then determines what the user was probably saying and either outputs it as text or issues a computer command.
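The first two stages of this pipeline, digitizing the wave and dividing it into small segments, can be sketched without any audio hardware. The 440 Hz test tone, the 8000 Hz sampling rate and the function names are assumptions for the example; filtering and phoneme matching are not shown.

```python
import math


def digitize(signal, n_samples, rate, levels=256):
    """Sample signal(t) at `rate` Hz and quantize [-1, 1] to integer levels."""
    samples = []
    for i in range(n_samples):
        amplitude = max(-1.0, min(1.0, signal(i / rate)))
        samples.append(round((amplitude + 1) / 2 * (levels - 1)))
    return samples


def frames(samples, size):
    """Divide the digitized signal into consecutive fixed-size segments."""
    return [samples[i:i + size] for i in range(0, len(samples), size)]


def tone(t):
    """A hypothetical 440 Hz test tone standing in for the microphone input."""
    return math.sin(2 * math.pi * 440 * t)


samples = digitize(tone, n_samples=80, rate=8000)
segments = frames(samples, 20)
print(len(samples), len(segments))
```

In a real recognizer each segment would then be compared with stored phoneme models.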

Expert System Architectures

Expert Systems
- a recent product of AI
- a kind of knowledge-based system
- have proven to be effective in a number of problem domains which require the kind of intelligence possessed by a human expert.
Application domains
Law, aerospace, chemistry, military operations, biology, finance, engineering, banking, medicine, geology, manufacturing
Definition
A set of programs designed to act as an expert in a particular domain.
Not meant to replace experts in that domain, but to assist them.

Characteristic features of expert systems
- Use knowledge rather than data.
- Knowledge is encoded and maintained separately.
- Capable of explaining how a particular conclusion was reached.
- Use symbolic representations for knowledge.
- Can reason with meta-knowledge.

Importance of Expert Systems

Expert System Architectures
1) Rule-based systems or production systems
- use knowledge encoded in the form of production rules, i.e. IF ... THEN ... rules.
- Each rule represents a small piece of knowledge relating to the given domain of expertise.

Components of an expert system (figure): user INPUT and OUTPUT pass through the I/O interface; the other blocks are the explanation module, editor, inference engine, knowledge base, learning module, case history file and working memory.

Knowledge Base
Contains facts and rules about some domain.
Eg:
IF: the patient has a chronic disorder, and
    the age of the patient is less than 30, and
    the patient shows condition A, and
    test B reveals biochemistry condition C
THEN: conclude the patient's diagnosis is autoimmune chronic hepatitis

In PROLOG
conclude(patient, diagnosis, autoimmune_chronic_hepatitis) :-
    same(patient, disorder, chronic),
    lessthan(patient, age, 30),
    same(patient, symptom_A, value_A),
    same(patient, biochemistry, value_C).

The Inference Process
- The inference engine accepts user input queries and responses to questions through the I/O interface and uses this dynamic information together with the static knowledge (the rules and facts) stored in the knowledge base.
The inferring process is carried out recursively in three stages:
1) match
2) select
3) execute

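The production system inference cycle (figure):
Rules in the knowledge base are matched against the contents of working memory to form the conflict set; one rule is selected from the conflict set and executed, and the cycle repeats.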
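The match, select and execute stages can be sketched as a small forward-chaining loop. The diagnosis rule paraphrases the example from the knowledge base section; the `age-check` rule, the fact strings and the "first matching rule wins" selection strategy are assumptions added to show chaining.

```python
# Each rule: (name, conditions that must all hold, fact to conclude).
RULES = [
    ("age-check", {"age is 25"}, "age under 30"),
    ("diagnose",
     {"chronic disorder", "age under 30", "condition A",
      "biochemistry condition C"},
     "autoimmune chronic hepatitis"),
]


def run(initial_facts):
    """Run match-select-execute cycles until no rule can add a new fact."""
    memory = set(initial_facts)  # working memory
    fired = []
    while True:
        # Match: rules whose conditions are all in working memory.
        conflict_set = [
            rule for rule in RULES
            if rule[1] <= memory and rule[2] not in memory
        ]
        if not conflict_set:
            break
        # Select: take the first rule in listing order (a simple strategy).
        name, _, conclusion = conflict_set[0]
        # Execute: add the conclusion to working memory.
        memory.add(conclusion)
        fired.append(name)
    return memory, fired


memory, fired = run(
    {"chronic disorder", "age is 25", "condition A", "biochemistry condition C"}
)
print(fired)
```

Real systems use richer conflict-resolution strategies than listing order, such as rule priority or recency of the matched facts.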

Building a Knowledge Base
- An editor is used by developers to create new rules for addition to the knowledge base, to delete outdated rules, or to modify existing rules in some way.
- The most difficult task in creating and maintaining production systems is building and maintaining a consistent but complete set of rules. This should be done without adding redundant or unnecessary rules.
- Eg of an intelligent editor: TEIRESIAS (developed to work with systems like MYCIN)

I/O Interface
- Permits the user to communicate with the system in a more natural way.
- The system must have special prompts or a specialized vocabulary which encompasses the terminology of the given domain of expertise.
Eg: MYCIN has a vocabulary of some 2000 words.

Learning module and history file


- Not common components of expert
systems
- Used to assist in building and refining the
knowledge base

Non-Production System Architectures
- less common expert system architectures.
- Instead of rules, these systems employ more structured representation schemes like
  - associative or semantic networks
  - frame structures
  - decision trees
  - specialized networks like neural networks

Associative or Semantic Network Architectures
- useful in representing hierarchical knowledge structures, where property inheritance is common.
- not a popular form of representation for standard expert systems.
- can also be used in natural language systems or computer vision systems.
Eg: an expert system based on the use of an associative network representation is CASNET.
CASNET: Causal Associational Network
- used to diagnose and recommend treatment for glaucoma

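Fragment of an associative network (figure): nodes tweety, bird, fly, wings and yellow, joined by links labeled A-KIND-OF, CAN, HAS-PARTS and COLOR.
A second fragment (figure): nodes Bob, Professor, Sandy, House and Bike, joined by links labeled ISA, MARRIED-TO, OWNS and DRIVES.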
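Property inheritance in an associative network can be sketched with a table of labeled links. The link table below is one plausible reading of the tweety fragment (tweety is a kind of bird, birds can fly and have wings, tweety is yellow); that reading and the name `lookup` are assumptions.

```python
# Labeled links: (node, relation) -> target node.
LINKS = {
    ("tweety", "A-KIND-OF"): "bird",
    ("tweety", "COLOR"): "yellow",
    ("bird", "CAN"): "fly",
    ("bird", "HAS-PARTS"): "wings",
}


def lookup(node, relation):
    """Find a property on the node itself or inherit it via A-KIND-OF links."""
    while node is not None:
        if (node, relation) in LINKS:
            return LINKS[(node, relation)]
        node = LINKS.get((node, "A-KIND-OF"))  # climb to the more general node
    return None


print(lookup("tweety", "CAN"))
print(lookup("tweety", "COLOR"))
```

Nothing about flying is stored on tweety itself; the answer is inherited from bird.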

Frame Architectures
Eg of a frame-based expert system: the PIP system
PIP: Present Illness Program
Medical knowledge in PIP is organized in frame structures.

Decision Tree Architectures
- Knowledge for expert systems may be stored in the form of a decision tree when the knowledge can be structured in a top-to-bottom manner.
- The knowledge base can be constructed with a special tree-building editor or with a learning module.

A segment of a decision tree structure (figure): the root node tests attribute 1 (burn test) with branches orange, red and blue; lower nodes, including a solubility test, branch on yes/no; the leaves shown include Compound-38 and Compound-39.
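Consulting a decision-tree knowledge base amounts to following one branch per test result until a leaf is reached. The tree below is hypothetical: it borrows the test and compound names from the figure, but which answers lead to which compound is invented for the example.

```python
# Internal nodes are (test name, {outcome: subtree}); leaves are strings.
TREE = ("burn test", {
    "orange": ("solubility test", {
        "yes": "Compound-38",
        "no": "Compound-39",
    }),
    "red": "unidentified",
    "blue": "unidentified",
})


def classify(tree, observations):
    """Walk from the root to a leaf, choosing the branch for each test result."""
    while isinstance(tree, tuple):
        test, branches = tree
        tree = branches[observations[test]]
    return tree


print(classify(TREE, {"burn test": "orange", "solubility test": "no"}))
```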

Blackboard System Architecture
- a special type of knowledge-based system which uses a form of opportunistic reasoning.
- differs from pure forward or pure backward chaining: either direction may be chosen dynamically at each stage in the problem solution process.
- Blackboard systems are composed of
  - a number of knowledge sources
  - a globally accessible database structure, called a blackboard
  - control information

Jigsaw puzzle
A puzzle consisting of a mass of irregularly shaped pieces of cardboard, plastic, or wood that form a picture when fitted together. Also called a picture puzzle.

Components of blackboard systems
- Blackboard
- Knowledge sources
- Control information

Knowledge sources
- separate and independent sets of coded knowledge.
- may contain knowledge in the form of procedures, rules, or other schemes.
- Each knowledge source may be thought of as a specialist in some limited area needed to solve a given subset of problems.
Blackboard
- contains the current problem state and information needed by the knowledge sources, such as input data, partial solutions, control data, alternatives and final solutions.
- Knowledge sources make changes to the blackboard data.
- Communication and interaction between the knowledge sources takes place solely through the blackboard.

Control information
- may be contained within the sources, on the blackboard, or possibly in a separate module.
- monitors the changes to the blackboard and determines what the immediate focus of attention should be in solving the problem.
- One application of the blackboard system architecture was in the HEARSAY family of projects (speech understanding systems).
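The interplay of the three components can be sketched with a dictionary as the blackboard, functions as knowledge sources and a loop as the control. The two toy knowledge sources, the "k-a-t" input and the control policy (give the floor to the first source able to contribute) are all assumptions for the illustration.

```python
def ks_segment(blackboard):
    """Hypothetical knowledge source: split the raw signal into phonemes."""
    if "signal" in blackboard and "phonemes" not in blackboard:
        blackboard["phonemes"] = blackboard["signal"].split("-")
        return True
    return False


def ks_words(blackboard):
    """Hypothetical knowledge source: assemble phonemes into a word."""
    if "phonemes" in blackboard and "word" not in blackboard:
        blackboard["word"] = "".join(blackboard["phonemes"])
        return True
    return False


def control(blackboard, sources):
    """Keep activating the first source that can contribute something new."""
    progress = True
    while progress:
        # any() stops at the first knowledge source that reports a change.
        progress = any(source(blackboard) for source in sources)
    return blackboard


board = control({"signal": "k-a-t"}, [ks_words, ks_segment])
print(board)
```

The knowledge sources never call each other; each only reads and writes the shared blackboard, and the order in which they act is decided at run time.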

Analogical Reasoning Architectures
- solve new problems as humans do, by finding a similar problem whose solution is known and applying the known solution to the new problem, possibly with some modifications.
- will require a large knowledge base having numerous solutions and other previously encountered situations or episodes.
- The inference mechanism must be able to extend known situations or solutions to fit the current problem and verify that the extended solution is reasonable.

Neural Network Architectures

Artificial Neural Networks
- ANNs are mathematical inventions inspired by observations made in the study of biological systems.
- Loosely based on the actual biology.
- Can be described as mapping an input space to an output space.
- Consist of artificial neurons composed of weights and connections.

Neurons
Neurons are connected to one another.

Modeling Neurons
A simplified model of the neuron (figure): inputs X0, X1, X2 and X3 arrive on connections with weights W0, W1, W2 and W3, and the neuron produces a single output Z.
An artificial neuron can be thought of as a small computing engine that takes in inputs, processes them and then transmits an output:

$Z = f\left(\sum_{i=0}^{n} W_i X_i\right)$

where n = 3 for the neuron shown in the figure.

Neural Network Architecture
Neural networks
- Large networks of simple processing elements or nodes which process information dynamically in response to external inputs.
- The nodes are simplified models of neurons.
- The knowledge in a neural network is distributed throughout the network in the form of internode connections and weighted links which form the inputs to the nodes.
- The link weights serve to enhance or inhibit the input stimuli values, which are then added together at the nodes. If the sum of all the inputs to a node exceeds some threshold value T, the node executes and produces an output which is passed on to other nodes or is used to produce some output response.
- No output is produced if the total input is less than T.
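The threshold behaviour of a single node can be written in a few lines. In this sketch the output is 1 when the weighted sum exceeds T and 0 otherwise; the weights (1, 1) and the threshold 1.5 are chosen so that the node behaves like a logical AND, which is an example picked here and not taken from the text.

```python
def node_output(inputs, weights, threshold):
    """Fire (return 1) only if the weighted input sum exceeds the threshold T."""
    total = sum(w * x for w, x in zip(weights, inputs))
    return 1 if total > threshold else 0


# With weights (1, 1) and T = 1.5 the node fires only when both inputs are on.
for pair in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(pair, node_output(pair, weights=(1, 1), threshold=1.5))
```

A negative weight would inhibit rather than enhance its input, making it harder for the sum to reach T.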

Knowledge System Building Tools
- These tools range from high-level programming languages to intelligent editors.
When evaluating building tools for expert system development, the developer should consider the following features and capabilities:
1. Knowledge representation methods available.
2. Inference and control methods available.
3. User interface characteristics.
4. General system characteristics and support available.

Examples of building tools:
- Personal Consultant Plus
- Radian Rulemaster
- KEE (Knowledge Engineering Environment)
- OPS5 System