
Dialog System

A comprehensive understanding
Mr. T
Perception

Dialog System

[Pipeline] Trigger → Speech Recognition → Natural Language Understanding → Dialogue Management → Natural Language Generation → Speech Synthesis

Text Input: "I will go out at weekends, what is the weather?"
Semantic Frame: Ask_weather(date=weekends)

Natural Language Understanding
• Domain identification
• User intent detection
• Slot filling

Dialogue Management (backed by Backend Knowledge Providers)
• Dialogue state tracking
• Dialogue policy optimization

System Action: Request_location
Text Response: "Where will you go?" / "Where do you want to ask for the weather this weekend?"
Trigger Word

"Hey Bot!" / "OK Bot!"

[Pipeline] Wave Sound → Frequency Domain → Convolutional Neural Network (low-level features, pattern extraction) → Recurrent Neural Network (high-level features) → Classifier → Output: "Trigger Word" or "Unknown"

A solution for a trigger word system
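To make the pipeline concrete, here is a minimal sketch in PyTorch (the slide does not name a framework; the layer sizes, mel-spectrogram input, and two-class output are illustrative assumptions, not the presented architecture):

import torch
import torch.nn as nn

class TriggerWordDetector(nn.Module):
    def __init__(self, n_mels=40, hidden=64, n_classes=2):  # classes: trigger / unknown
        super().__init__()
        # CNN extracts low-level patterns from the frequency-domain input
        self.conv = nn.Sequential(
            nn.Conv1d(n_mels, 64, kernel_size=5, stride=2),
            nn.ReLU(),
        )
        # RNN aggregates high-level features over time
        self.rnn = nn.GRU(64, hidden, batch_first=True)
        # Classifier decides: trigger word vs. unknown
        self.fc = nn.Linear(hidden, n_classes)

    def forward(self, spec):            # spec: (batch, n_mels, time)
        x = self.conv(spec)             # (batch, 64, time')
        x = x.transpose(1, 2)           # (batch, time', 64)
        _, h = self.rnn(x)              # h: (1, batch, hidden)
        return self.fc(h[-1])           # logits over {trigger, unknown}

logits = TriggerWordDetector()(torch.randn(1, 40, 100))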

What else…
Speech Recognition

[Pipeline] Speech wave → Pre-processing → Acoustic Features → Decoder → Text

The decoder combines three knowledge sources:

• Acoustic Model
• Acoustic Dictionary (pronunciation model), e.g. for Vietnamese:
  WORD  PRON (IPA)
  vợ    v ə ˨˩ˀ
  quê   w e
• Language Model, scoring n-grams:
  N-GRAM  SCORE
  vợ      2.5
  quê     0.7

Output: "Vợ tôi ở quê rất đẹp" ("My wife back in the countryside is very beautiful")
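A minimal pre-processing sketch, assuming librosa for feature extraction (MFCCs are one common choice of acoustic feature; the slide does not specify a library or feature type):

import librosa

# Load the speech wave (16 kHz is a common ASR sampling rate)
y, sr = librosa.load("speech.wav", sr=16000)

# Acoustic features: 13 MFCCs per frame, shape (13, n_frames)
mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
print(mfcc.shape)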


Speech Synthesis

Ideas of TTS

"PG&E will file schedules on April 20."

Can we just look each word up and concatenate recordings? No!

https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
Pipeline for Text-To-Speech

Text Input → Text Normalization → Phonetic Analysis → Prosodic Analysis → Voice Output

Text Normalization
• Sentence tokenization
• Non-standard words
• Word disambiguation
• Trained by machine learning (SVM, Decision Trees, Logistic Regression)

Phonetic Analysis
• Dictionary look-up
• Spelling of names
• Grapheme-to-phoneme: P̂ = argmax_P Pr(P | S), the most likely phoneme sequence P for a given spelling S

Prosodic Analysis
• Prosody structure
• Prosody prominence
• Tune

https://web.stanford.edu/~jurafsky/slp3/ed3book.pdf
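A toy sketch of the text-normalization step for the example above; real systems learn this with classifiers (SVM, DT, LR), and the rules and dictionaries here are hypothetical stand-ins:

import re

# Hypothetical dictionaries for two kinds of non-standard words
LETTER_SEQUENCES = {"PG&E": "P G and E"}   # abbreviations read letter by letter
ORDINALS = {"20": "twentieth"}             # day numbers become ordinal words

def normalize(text: str) -> str:
    # Expand letter sequences
    for abbr, spoken in LETTER_SEQUENCES.items():
        text = text.replace(abbr, spoken)
    # Expand dates like "April 20" -> "April twentieth"
    return re.sub(r"April (\d+)",
                  lambda m: "April " + ORDINALS.get(m.group(1), m.group(1)),
                  text)

print(normalize("PG&E will file schedules on April 20"))
# -> "P G and E will file schedules on April twentieth"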
The State of the Art

Where GANs come into play.


Natural Language Understanding
Word Embedding

Three ways to turn a word into a vector:

1. One-hot Encoding Vector: each word becomes a sparse vector with a single 1, e.g.
   (1, 0, 0, 0, 0, 0, 0)
   (0, 1, 0, 0, 0, 0, 0)
   (0, 0, 1, 0, 0, 0, 0)
   ...

2. Frequency Based Vector: a terms × documents count matrix; each row is a word vector, each column a document vector:

           Doc1  Doc2  Doc3  Doc4
   Term 1  10    0     1     0
   Term 2  0     0     0     2
   Term 3  4     0     7     0
   Term 4  0     5     0     12

3. Prediction Based Vector: dense embeddings learned by a model, e.g.

   King   (0.12, 0.23, 0.43)
   Queen  (0.14, 0.57, 0.88)
   Man    (0.44, 0.90, 0.11)
   Woman  (0.19, 0.23, 0.53)
   Boy    (0.12, 0.65, 0.42)
   Girl   (0.34, 0.44, 0.68)

[Plot: king/man and queen/woman form parallel directions in the embedding space]
One-hot Encoding Vector

Corpus: co gai, hot girl, xinh dep, truoc day, la, mot, chang trai, dam my
(Vietnamese, roughly: "girl", "hot girl", "beautiful", "previously", "is", "one", "boy", "đam mỹ" i.e. boys' love)

             1  2  3  4  5  6  7  8
co gai       1  0  0  0  0  0  0  0
hot girl     0  1  0  0  0  0  0  0
xinh dep     0  0  1  0  0  0  0  0
truoc day    0  0  0  1  0  0  0  0
la           0  0  0  0  1  0  0  0
mot          0  0  0  0  0  1  0  0
chang trai   0  0  0  0  0  0  1  0
dam my       0  0  0  0  0  0  0  1

Each word gets a 1×8 one-hot vector representation.
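A minimal sketch of building these 1×8 one-hot vectors with NumPy:

import numpy as np

vocab = ["co gai", "hot girl", "xinh dep", "truoc day",
         "la", "mot", "chang trai", "dam my"]
index = {w: i for i, w in enumerate(vocab)}

def one_hot(word: str) -> np.ndarray:
    v = np.zeros(len(vocab))   # 1 x 8 vector of zeros...
    v[index[word]] = 1         # ...with a single 1 at the word's index
    return v

print(one_hot("xinh dep"))     # [0. 0. 1. 0. 0. 0. 0. 0.]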

What’s wrong…
Custom Encoding Vector

Hand-picked semantic features instead of word positions:

             nguoi     ban chat  thoi gian  so dem    nu tinh
             (person)  (nature)  (time)     (number)  (feminine)
co gai       1         0         0          0         1
hot girl     0.7       1         0          0         0.7
xinh dep     0.6       1         0          0         0.5
truoc day    0         0         1          1         0
la           0         0         0          0         0
mot          0         0         0          1         0
chang trai   1         0         0          0         0
dam my       0.7       1         0          0         0

Each word gets a 1×5 vector representation, with better relationships between words.
Count Vector
Let us understand this using a simple example.
• D1: He is a lazy boy. She is also lazy.
• D2: Neeraj is a lazy person.
Dictionary = ['He', 'She', 'lazy', 'boy', 'Neeraj', 'person']
D = 2 (number of documents), N = 6 (number of words in the dictionary)

The count-vector matrix M has size D × N; the vector for "lazy" is [2, 1].

      He  She  lazy  boy  Neeraj  person
D1    1   1    2     1    0       0
D2    0   0    1     0    1       1
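A minimal sketch reproducing the count-vector table (pure Python; tokenization here is a naive whitespace split):

dictionary = ["He", "She", "lazy", "boy", "Neeraj", "person"]
docs = {
    "D1": "He is a lazy boy. She is also lazy.",
    "D2": "Neeraj is a lazy person.",
}

for name, text in docs.items():
    tokens = text.replace(".", "").split()
    row = [tokens.count(term) for term in dictionary]
    print(name, row)
# D1 [1, 1, 2, 1, 0, 0]
# D2 [0, 0, 1, 0, 1, 1]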
TF-IDF Vectorization

TF(t, d) = (number of times term t appears in document d) / (number of terms in document d)

In the worked example, Document 1 has 8 terms and Document 2 has 5 terms, and the word 'This' appears once in each:

TF(This, Document1) = 1/8
TF(This, Document2) = 1/5

IDF(t) = log(N/n), where N is the number of documents and n is the number of documents the term t appears in.

Since 'This' appears in both documents: IDF(This) = log(2/2) = 0.

Let us compute IDF for the word 'Messi', which appears only in Document 1 (4 times out of its 8 terms):

IDF(Messi) = log(2/1) = 0.301

Now compare the TF-IDF of the common word 'This' with that of 'Messi', which seems to be of relevance to Document 1:

TF-IDF(This, Document1) = (1/8) × 0 = 0
TF-IDF(This, Document2) = (1/5) × 0 = 0
TF-IDF(Messi, Document1) = (4/8) × 0.301 = 0.15

TF-IDF penalizes the common word 'This' but assigns greater weight to 'Messi'.
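A minimal sketch of these formulas, using log base 10 to match the worked numbers (log(2/1) ≈ 0.301); since the slide omits the underlying documents, the two documents below are hypothetical stand-ins with the same term counts:

import math

def tf(term, doc):
    return doc.count(term) / len(doc)

def idf(term, docs):
    n = sum(1 for d in docs if term in d)
    return math.log10(len(docs) / n)

def tf_idf(term, doc, docs):
    return tf(term, doc) * idf(term, docs)

# Hypothetical documents matching the counts in the worked example
doc1 = ["This"] + ["Messi"] * 4 + ["filler"] * 3   # 8 terms, 'Messi' four times
doc2 = ["This"] + ["filler"] * 4                   # 5 terms

print(tf_idf("This", doc1, [doc1, doc2]))    # 0.0
print(tf_idf("Messi", doc1, [doc1, doc2]))   # ~0.15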


Co-Occurrence Matrix with a fixed context window

The big idea: similar words tend to occur together and have similar contexts. For example:
"Apple is a fruit. Mango is a fruit."
Apple and mango tend to have a similar context, i.e. fruit.

Not preferred in practice.
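A minimal sketch of co-occurrence counting with a fixed window over the fruit example (the window size of 1 and the whitespace tokenization are illustrative choices):

from collections import defaultdict

tokens = "apple is a fruit mango is a fruit".split()
window = 1
cooc = defaultdict(int)

# Count every (word, context-word) pair within the window
for i, w in enumerate(tokens):
    for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
        if i != j:
            cooc[(w, tokens[j])] += 1

print(cooc[("apple", "is")], cooc[("mango", "is")])  # 1 1 -- shared context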


Prediction Based Vector
• Continuous Bag-of-Words (CBOW) and Skip-Gram models
  CBOW: P(word | context)    Skip-Gram: P(context | word)
• A row of the input weight matrix = a word vector

https://www.analyticsvidhya.com/blog/2017/06/word-embeddings-count-word2veec/
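A minimal sketch with gensim, one common Word2Vec implementation (the parameter is vector_size in gensim 4; older versions call it size):

from gensim.models import Word2Vec

sentences = [["apple", "is", "a", "fruit"], ["mango", "is", "a", "fruit"]]
model = Word2Vec(sentences, vector_size=50, window=2, min_count=1, sg=1)  # sg=1: Skip-Gram

print(model.wv["apple"])                       # the learned word vector
print(model.wv.similarity("apple", "mango"))   # similar contexts -> similar vectors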
Intent and Entities

• Intent = the topic/domain of the utterance
• Entities = the keywords in it

Example: "Go home to have the dinner"
Intent = "Home_activity"
Entities: Location, Action, Object
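The example written out as a semantic frame; the dict layout and the word-to-entity mapping are illustrative, not a specific library's format:

# Semantic frame for the utterance on the slide
frame = {
    "text": "Go home to have the dinner",
    "intent": "Home_activity",
    "entities": {
        "Location": "home",     # hypothetical mapping for illustration
        "Action": "have",
        "Object": "dinner",
    },
}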
Dialogue Management

Stateless vs. stateful: statefulness is the key.
• Follow-up questions
• Pending actions
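A minimal sketch of slot-based state tracking, tying back to the weather example from the first slide (the action names here are made up for illustration):

# Dialogue state: the tracker remembers which slots are still unfilled
state = {"intent": "Ask_weather",
         "slots": {"date": "weekends", "location": None}}

def next_action(state):
    for slot, value in state["slots"].items():
        if value is None:
            return f"Request_{slot}"   # pending action: ask a follow-up
    return "Answer_weather"            # all slots filled -> answer

print(next_action(state))              # -> "Request_location"
print(next_action({"intent": "Ask_weather",
                   "slots": {"date": "weekends", "location": "Hanoi"}}))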
Natural Language Generation

• Fixed responses + slot filling + random selection from a pool

User: Do you know "I'm really quite something"?
Bot: "I'm really quite something" is composed by Son Tung-MTP.

• Using a neural network and a language model: not recommended.
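A minimal sketch of "fixed response + slot filling + random from a pool"; the intent name and templates are made up for illustration:

import random

TEMPLATES = {
    "song_info": [
        '"{song}" is composed by {artist}.',
        '{artist} composed "{song}".',
    ],
}

def respond(intent, **slots):
    # Pick a fixed response at random from the pool and fill its slots
    return random.choice(TEMPLATES[intent]).format(**slots)

print(respond("song_info",
              song="I'm really quite something",
              artist="Son Tung-MTP"))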
Future of End-to-End

Data-driven approaches:
• Seq2Seq
• Reinforcement Learning

https://aclweb.org/anthology/C18-3006
Tips
• Be a script writer
• Give the bot a personality
• Control the dialogue
• APIs save time
• Label intents and entities
• Design the flow
• Keep the design expandable
• Do lots of testing
Applications

[Integration diagram: the bot connects to databases, a data warehouse, cloud services, and web services]

Rasa: open-source conversational AI
PRACTICAL TIME

Use case 1: Health-care

[Architecture]
• Google Virtual Assistant handles speech and text with the user.
• Request text goes to Dialogue Management (REST API).
• NLP extracts intent and entities; Emotion Detection analyzes the text.
• The analyzed text drives the Health-Care System and the Recommendation System, backed by data and logical functions.
Use case 2: HR chatbot

The chatbot sits between HR staff and employees, drawing on the company's knowledge resources, and handles:
• Supporting
• Communicating
• Monitoring
• Training
THANK YOU!