DRAFT COPY
January 16, 2003
The contents of this documentation are strictly confidential and are proprietary to Banter Technology
Inc. No part of this documentation may be reproduced, transmitted, or stored, in any form, in whole or
in part, or by any means for any purpose without the prior written consent of Banter Technology Inc.
The software described in this document is furnished under a license agreement and may be used or
copied only in accordance with the terms stipulated therein.
Banter Technology Inc. reserves the right to modify the information contained in this document
without prior notification.
Tel: 1-415-247-2600
Fax: 1-415-247-2626
E-mail: info@banter.com
Knowledge Base Development and RME Processing
Introduction
The Rapport Knowledge Base is a unique, adaptive repository of linguistic and
statistical information that enables Rapport to accurately manage and classify high-
volume customer e-communications. Rapport is a learning system—the Knowledge
Base continuously evolves to model an organization’s current communication
environment.
Working in conjunction with the Rapport Knowledge Base, Rapport’s Relationship
Modeling Engine (RME) analyzes customer communications and takes the most
appropriate action on-the-fly, based on various user specifications, including
Rapport’s broad spectrum of configuration settings.
This white paper examines Rapport’s unique adaptive Knowledge Base, its
development process, and how it enables the RME to accurately process and classify
messages.
2. The NLP engine identifies concepts within the message using linguistic data
stored in the Knowledge Base.
4. The statistic engine compares the message’s concept models with each
category’s existing concept models to determine category relevancy.
Optionally, logical expressions are used to refine or override concept-based
message categorization.
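The comparison step can be pictured with a minimal sketch. It assumes, purely for illustration, that a concept model is a bag of concept counts and that relevancy is measured with cosine similarity; the white paper does not specify Rapport's actual model or metric, and every name below is hypothetical.

```python
import math
from collections import Counter

def concept_model(concepts):
    """Hypothetical concept model: a bag of concept counts."""
    return Counter(concepts)

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_categories(message_model, category_models):
    """Return (category, score) pairs, most relevant first."""
    scores = {name: cosine(message_model, model)
              for name, model in category_models.items()}
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)

# Assumed example data, not taken from a real Knowledge Base.
category_models = {
    "billing": concept_model(["invoice", "charge", "refund", "charge"]),
    "shipping": concept_model(["delivery", "package", "tracking"]),
}
message = concept_model(["refund", "charge", "order"])
ranking = rank_categories(message, category_models)
```

Here the message shares two concepts with the assumed "billing" model and none with "shipping", so "billing" ranks first.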
[Diagram: Rapport Knowledge Base]
Note: Using the Knowledge Base Editor, logical expressions may also be associated
with specified categories at this stage to further refine the classification
process.
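One way to picture a logical expression refining a concept-based result is sketched below. The rule representation (a predicate over the message text, keyed by category) and the behavior of moving a matching category to the front are assumptions made for illustration; they are not the Knowledge Base Editor's actual expression syntax or semantics.

```python
def apply_rules(ranking, text, rules):
    """Refine a concept-based ranking with per-category logical expressions.

    rules maps a category name to a predicate over the message text
    (hypothetical representation). A category whose predicate is true is
    moved to the front of the ranking; other entries keep their order.
    """
    forced = [(c, s) for c, s in ranking if c in rules and rules[c](text)]
    rest = [(c, s) for c, s in ranking if (c, s) not in forced]
    return forced + rest

# Assumed rule: "lawsuit", or "attorney" together with "refund".
rules = {
    "legal": lambda t: "lawsuit" in t.lower()
    or ("attorney" in t.lower() and "refund" in t.lower()),
}
ranking = [("billing", 0.71), ("legal", 0.10), ("shipping", 0.0)]
refined = apply_rules(
    ranking, "My attorney will contact you about the refund.", rules
)
unchanged = apply_rules(ranking, "Where is my package?", rules)
```

In this sketch the rule overrides the statistical ranking only when its expression matches; otherwise the concept-based order is left as it was.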
Learning
Learning is an ongoing automatic process, invisible to the user, that gathers concept
models for each category in the Statistic Knowledge Base (SKB) over time. Concept
models are gathered by collecting feedback from normal message processing activity,
bootstrapping the system for accurate message classification in the future.
For example, when a customer service agent uses the Rapport Message Center
application to compose a reply to a message, the agent may choose from a database of
pre-written responses linked to categories in the Knowledge Base. The act of choosing
a response provides feedback to the system; concept models contained in the message
form the basis of concept models associated with categories linked to the response.
In addition to bootstrapping the system, learning continuously updates and enriches
existing concept models in the SKB during normal Rapport usage. Learning is an
organic process, enabling the Knowledge Base to grow and adapt over time. Concept
models are refined by introducing new information derived from changes that have
occurred in the composition of messages, and from agent activity. As Rapport learns,
it broadens the base of concept models, making the system more precise over time.
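The feedback loop described above can be sketched as follows. The store layout (category name mapped to accumulated concept counts) and the response-to-category link table are hypothetical stand-ins, not Rapport's actual data structures.

```python
from collections import Counter, defaultdict

# Hypothetical SKB stand-in: category name -> accumulated concept counts.
skb = defaultdict(Counter)

# Hypothetical link table: pre-written response id -> linked categories.
response_categories = {"resp-refund-policy": ["billing", "refunds"]}

def learn_from_reply(message_concepts, response_id):
    """Fold a message's concepts into every category linked to the
    response the agent chose for it."""
    for category in response_categories.get(response_id, []):
        skb[category].update(message_concepts)

# Two agent replies using the same response enrich the same categories.
learn_from_reply(["refund", "charge"], "resp-refund-policy")
learn_from_reply(["refund", "invoice"], "resp-refund-policy")
```

After the two replies, both linked categories hold the combined concept counts, which is the sense in which agent activity gradually broadens the models.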
Training
The Training process is an optional but recommended method for gathering models
for categories in the SKB decision tree. Training is implemented offline and involves
analyzing a corpus of sample messages classified into pre-defined categories. These
messages are first processed by Rapport’s Lexical Editor to enrich the Linguistic
Knowledge Base (LKB) with user-specific linguistic data. Then each message in the
corpus is processed individually by the NLP and Statistic engines to populate the
SKB with models used to classify incoming messages.
The sections that follow discuss the Training process in greater detail.
Creating a Corpus
A corpus of sample messages, pre-classified according to categories, provides
source material for NLP and Statistic Training processes that build the Rapport
Knowledge Base.
A corpus is a collection of sample messages gathered by an organization (prior to
using Rapport) that have been pre-classified according to their subject matter. The
corpus provides source data used during Pre-Training, NLP Training, and Statistic
Training (described below).
The corpus may be organized by grouping similar messages in directories or folders
according to category names that represent the messages’ content. Alternatively, each
message may have a field or data identifier that indicates its category (or categories).
For the subsequent Pre-Training and Training processes to be most effective, the
corpus should only contain messages that are accurately classified and free of
extraneous text (unrelated to the message’s category). An ideal corpus consists of
messages that are classified according to well-defined categories (avoiding
redundancies between categories), with textual content that is consistently
representative of the category’s subject. As many messages as possible with similar
message content should be grouped together for each category, because having more
messages per category improves the quality of concept models created during the
Statistic Training process (described below).
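For the directory-per-category layout described above, a corpus could be read and its per-category sizes checked with a sketch like the one below. The folder names, the `.txt` extension, and the helper names are assumptions for illustration; the white paper does not prescribe a file format.

```python
import tempfile
from pathlib import Path

def load_corpus(root):
    """Yield (category, text) pairs from a directory-per-category corpus."""
    for category_dir in sorted(p for p in Path(root).iterdir() if p.is_dir()):
        for message_file in sorted(category_dir.glob("*.txt")):
            yield category_dir.name, message_file.read_text(encoding="utf-8")

def category_sizes(root):
    """Count messages per category, to spot thinly populated categories."""
    counts = {}
    for category, _ in load_corpus(root):
        counts[category] = counts.get(category, 0) + 1
    return counts

# Build a tiny throwaway corpus to demonstrate the layout.
samples = [
    ("billing", "m1.txt", "I was charged twice."),
    ("billing", "m2.txt", "Please refund my last invoice."),
    ("shipping", "m1.txt", "My package has not arrived."),
]
with tempfile.TemporaryDirectory() as root:
    for category, name, text in samples:
        folder = Path(root) / category
        folder.mkdir(exist_ok=True)
        (folder / name).write_text(text, encoding="utf-8")
    sizes = category_sizes(root)
    messages = list(load_corpus(root))
```

A size check of this kind is one simple way to find categories that have too few sample messages before Training begins.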
Note: The Pre-Training process is particularly useful for preparing the RME to
accurately classify and process messages from international sources,
especially messages including frequent misspellings and non-standard
English usage.
[Diagram: the corpus is processed by the NLP Engine, which draws on the Linguistic
Knowledge Base (LKB); concepts pass through the Concept Modeler to the Statistic
Engine.]
NLP Engine: Pre-Processing
♦ Analyzes and processes each message individually
♦ Identifies the portion of text to be processed
♦ Receives data from the Linguistic Knowledge Base
♦ Generates an intermediate representation of concepts
NLP Engine: Processing
♦ Uses morphological rules, word associations, and complex algorithms for
generating concepts, and concepts based on other concepts
♦ Exports concepts to the Concept Modeler
Concept Modeler
♦ Converts concepts into concept models
Statistic Engine
♦ Implements Statistic Training
Note: Statistic Training may also provide feedback (manually) to the NLP
Training process, improving NLP analysis and the determination of
concepts.
[Diagram: concept models, per individual message, enter the Statistic Engine, which
works with the Statistic Knowledge Base (SKB); the Knowledge Base Editor also
appears in the diagram.]
Statistic Engine
♦ Populates decision tree with new concept models based on each message’s concept
models
♦ Updates existing models in the Statistic Knowledge Base
Statistic Knowledge Base (SKB)
♦ Stores concept models for each category in decision trees
Learning
To gather this data, the system can be bootstrapped by an automatic process called
Learning. Learning is ongoing, invisible to the user, and populates the SKB decision
tree with concept models over time during normal Rapport operation. In addition to
bootstrapping the system, learning continuously updates models in the SKB,
improving message classification.
Training
Alternatively, the Rapport Knowledge Base may be built based on a corpus of sample
messages classified according to categories. During Pre-Training, the Lexical Editor is
used to analyze the corpus, identify significant, corpus-specific linguistic data, and
refine the LKB. NLP Training analyzes each message in the corpus individually, and
exports concepts via the Concept Modeler to the statistic engine. The Knowledge Base
Editor application is used to create a skeletal decision tree structure based on corpus
categories. For each message in the corpus, concept models are gathered for
categories in the decision tree, and are stored in the SKB.
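The Training flow summarized above can be sketched end to end. This is a deliberately simplified stand-in: a word-to-concept dictionary plays the role of the LKB, a flat dictionary of categories plays the role of the skeletal decision tree, and all function names are hypothetical rather than part of Rapport.

```python
from collections import Counter

def extract_concepts(text, lexicon):
    """NLP engine stand-in: map known words to concept names using a
    lexicon (a stand-in for the LKB)."""
    words = (w.strip(".,!?").lower() for w in text.split())
    return [lexicon[w] for w in words if w in lexicon]

def train(corpus, lexicon, categories):
    """Populate a skeletal category structure with one concept model per
    corpus message (a stand-in for the SKB decision tree)."""
    skb = {name: [] for name in categories}
    for category, text in corpus:
        model = Counter(extract_concepts(text, lexicon))  # Concept Modeler stand-in
        if model:
            skb[category].append(model)
    return skb

# Assumed example data.
lexicon = {"refund": "REFUND", "refunds": "REFUND",
           "charged": "CHARGE", "package": "PARCEL"}
corpus = [
    ("billing", "I was charged twice, please refund."),
    ("shipping", "My package is late."),
]
skb = train(corpus, lexicon, ["billing", "shipping"])
```

Each corpus message contributes one concept model to its category, mirroring the per-message processing described in the text.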
The following simplified diagrams illustrate the chronological development of the
Rapport Knowledge Base using the Training process.
Creating a Corpus
[Diagram: sample messages are gathered into a corpus classified according to
message content.]
[Diagram: the Lexical Editor application processes the corpus and refines the
Linguistic Knowledge Base.]
[Diagram: the NLP Engine (Pre-Processing and Processing), using the Linguistic
Knowledge Base, processes the corpus; the Concept Modeler exports concept models
to the Statistic Engine.]
[Diagram: the Statistic Engine receives concept models from NLP Training and
stores them in the Statistic Knowledge Base.]
[Diagram: a customer message passes through the NLP Engine (Pre-Processing and
Processing), the Concept Modeler, and the Statistic Engine, which draw on the
Knowledge Base (LKB and SKB); the message is then routed for automatic or
semi-automatic action.]