Sie sind auf Seite 1von 15

12/31/13

Chatbot Tutorial - CodeProject

ArticlesGeneralProgrammingAlgorithms&RecipesGeneral

ChatbotTutorial
ByGonzalesCenelia,29May2013
4.67(44votes)

Overview
AstepbystepguidetoimplementyourownArtificialIntelligencechatbot.

Tableofcontents
1. IntroductionChatbotdescription(firstexample) 2. Introducingkeywordsandstimulusresponse 3. Preprocessingtheuser'sinputandrepetitioncontrol 4. Amoreflexiblewayformatchingtheinputs 5. Usingclassesforabetterimplementation 6. Controllingrepetitionmadebytheuser 7. Using"states"torepresentdifferentevents 8. Keywordboundariesconcept 9. UsingSignonmessages 10. "KeywordRanking"concept 11. Keywordequivalenceconcept 12. Transpositionandtemplateresponse 13. Keywordlocationconcept 14. Handlingcontext 15. UsingTextToSpeech 16. Usingaflatfiletostorethedatabase 17. Abetterrepetitionhandlingalgorithm 18. Updatingthedatabasewithnewkeywords 19. SavingtheconversationLogs 20. Learningcapability

Introduction
BasicallyachatterbotisacomputerprogramthatwhenyouprovideitwithsomeinputsinNatural Language(English,French...)respondswithsomethingmeaningfulinthatsamelanguage.Which meansthatthestrengthofachatterbotcouldbedirectlymeasuredbythequalityoftheoutputselected bytheBotinresponsetotheuser.Bythepreviousdescription,wecoulddeducethataverybasic chatterbotcanbewritteninafewlinesofcodeinagivenspecificprogramminglanguage.Letsmake

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

1/15

12/31/13

Chatbot Tutorial - CodeProject

ourfirstchatterbot(noticethatallthecodesthatwillbeusedinthistutorialwillbewritteninC++.Also,it isassumedthatthereaderisfamiliarwiththeSTLlibrary)Thistutorialisalsoavailableinthefollowing languages:Java,VisualBasic,C#,Pascal,PrologandLisp


/ / / / P r o g r a m N a m e : c h a t t e r b o t 1 / / D e s c r i p t i o n : t h i s i s a v e r y b a s i c e x a m p l e o f a c h a t t e r b o t p r o g r a m / / / / A u t h o r : G o n z a l e s C e n e l i a / / # i n c l u d e < i o s t r e a m > # i n c l u d e < s t r i n g > # i n c l u d e < c t i m e > i n t m a i n ( ) { s t d : : s t r i n g R e s p o n s e [ ] = { " I H E A R D Y O U ! " , " S O , Y O U A R E T A L K I N G T O M E . " , " C O N T I N U E , I M L I S T E N I N G . " , " V E R Y I N T E R E S T I N G C O N V E R S A T I O N . " , " T E L L M E M O R E . . . " } s r a n d ( ( u n s i g n e d ) t i m e ( N U L L ) ) s t d : : s t r i n g s I n p u t = " " s t d : : s t r i n g s R e s p o n s e = " " w h i l e ( 1 ) { s t d : : c o u t < < " > " s t d : : g e t l i n e ( s t d : : c i n , s I n p u t ) i n t n S e l e c t i o n = r a n d ( ) % 5 s R e s p o n s e = R e s p o n s e [ n S e l e c t i o n ] s t d : : c o u t < < s R e s p o n s e < < s t d : : e n d l } r e t u r n 0 }

Asyoucansee,itdoesn'ttakealotofcodetowriteaverybasicprogramthatcaninteractwithauser butitwouldprobablybeverydifficulttowriteaprogramthatwouldreallybecapableoftrulyinterpreting whattheuserisactuallysayingandafterthatwouldalsogenerateanappropriateresponsetoit.These havebeenalongtermgoalsincethebeginningandevenbeforetheveryfirstcomputerswerecreated. In1951,theBritishmathematicianAlanTuringhascameupwiththequestionCanmachinesthinkand hehasalsoproposeatestwhichisnowknownastheTuringTest.Inthistest,acomputerprogram andalsoarealpersonissettospeaktoathirdperson(thejudge)andhehastodecidewhichofthem istherealperson.Nowadays,thereisacompetitionthatwasnamedtheLoebnerPrizeandinthis competitionbotsthathassuccessfullyfoolmostofthejudgeforatlist5minuteswouldwinaprizeof 100.000$.Sofarnocomputerprogramwasabletopassthistestsuccessfully.Oneofthemajor reasonsforthisisthatcomputerprogramswrittentocomputeinsuchcontesthavenaturallythe tendencyofcommittingalotoftypo(theyareoftenoutofthecontextoftheconversation).Which meansthatgenerally,itisn'tthatdifficultforajudgetodecidewhetherheisspeakingtoa"computer program"orarealperson.Also,thedirectancestorofallthoseprogramthattriestomimica conversationbetweenrealhumanbeingsisEliza,thefirstversionofthisprogramwaswrittenin1966 byJosephWeizenbaumaprofessorofMIT. ChatbotsingeneralareconsideredtobelongtotheweakAIfield(weakartificialintelligence)as opposedtostronga.iwho'sgoalistocreateprogramsthatareasintelligentashumansormore intelligent.Butitdoesn'tmeanthatchatbotsdonothaveanytruepotential.Beingabletocreatea programthatcouldcommunicatethesamewayhumansdowouldbeagreatadvancefortheAIfield. Chatbotisthispartofartificialintelligencewhichismoreaccessibletohobbyist(itonlytakesome

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

2/15

12/31/13

Chatbot Tutorial - CodeProject

averageprogrammingskilltobeachatbotprogrammer).So,programmersouttherewhowantedto createtrueAIorsomekindofartificialintelligence,writingintelligentchatbotsisagreatplacetostart!

Now,let'sgetbacktoourpreviousprogram, whataretheproblemswithit?
Well,thereisalotofthem.Firstofall,wecanclearlyseethattheprogramisn'treallytryingto understandwhattheuserissayingbutinsteadheisjustselectingarandomresponsefromhis databaseeachtimetheusertypesomesentenceonthekeyboard.Andalso,wecouldnoticethatthe programrepeathimselfveryoften.Oneofthereasonforthisisbecauseofthesizeofthedatabase whichisverysmall(5sentences).Thesecondthingthatwouldexplaintherepetitionsisthatwehaven't implementedanymechanismthatwouldcontrolthisunwantedbehavior.

Howdowemovefromaprogramthatjustselectresponsesrandomlyto whateverinputthattheusermightenteronthekeyboardtoaprogramthat showssomemoreunderstandingoftheinputs?


Theanswertothatquestionisquietsimplewesimplyneedtousekeywords. Akeywordisjustasentence(notnecessarilyacompleteone)orevenawordthattheprogrammight recognizefromtheuser'sinputwhichthenmakesitpossiblefortheprogramtoreacttoit(ex:by printingasentenceonthescreen).Forthenextprogram,wewillwriteaknowledgebaseordatabase, itwillbecomposedofkeywordsandsomeresponsesassociatedtoeachkeyword. so,nowweknowwhattodotoimprove"ourfirstchatterbot"andmakeitmoreintelligent.Letsproceed onwriting"oursecondbot",wewillcallitchatterbot2.
/ / / / P r o g r a m N a m e : c h a t t e r b o t 2 / / D e s c r i p t i o n : t h i s i s a n i m p r o v e d v e r s i o n / / o f t h e p r e v i o u s c h a t t e r b o t p r o g r a m " c h a t t e r b o t 1 " / / t h i s o n e w i l l t r y a l i t t l e b i t m o r e t o u n d e r s t a n d w h a t t h e u s e r i s t r y i n g t o s a y / / / / A u t h o r : G o n z a l e s C e n e l i a / / # p r a g m a w a r n i n g ( d i s a b l e : 4 7 8 6 ) # i n c l u d e < i o s t r e a m > # i n c l u d e < s t r i n g > # i n c l u d e < v e c t o r > # i n c l u d e < c t i m e > c o n s t i n t M A X _ R E S P = 3 t y p e d e f s t d : : v e c t o r < s t d : : s t r i n g > v s t r i n g v s t r i n g f i n d _ m a t c h ( s t d : : s t r i n g i n p u t ) v o i d c o p y ( c h a r * a r r a y [ ] , v s t r i n g & v ) t y p e d e f s t r u c t { c h a r * i n p u t c h a r * r e s p o n s e s [ M A X _ R E S P ] } r e c o r d r e c o r d K n o w l e d g e B a s e [ ] = { { " W H A T I S Y O U R N A M E " ,

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

3/15

12/31/13

Chatbot Tutorial - CodeProject

{ " M Y N A M E I S C H A T T E R B O T 2 . " , " Y O U C A N C A L L M E C H A T T E R B O T 2 . " , " W H Y D O Y O U W A N T T O K N O W M Y N A M E ? " } } , { " H I " , { " H I T H E R E ! " , " H O W A R E Y O U ? " , " H I ! " } } , { " H O W A R E Y O U " , { " I ' M D O I N G F I N E ! " , " I ' M D O I N G W E L L A N D Y O U ? " , " W H Y D O Y O U W A N T T O K N O W H O W A M I D O I N G ? " } } , { " W H O A R E Y O U " , { " I ' M A N A . I P R O G R A M . " , " I T H I N K T H A T Y O U K N O W W H O I ' M . " , " W H Y A R E Y O U A S K I N G ? " } } , { " A R E Y O U I N T E L L I G E N T " , { " Y E S , O F C O R S E . " , " W H A T D O Y O U T H I N K ? " , " A C T U A L Y , I ' M V E R Y I N T E L L I G E N T ! " } } , { " A R E Y O U R E A L " , { " D O E S T H A T Q U E S T I O N R E A L L Y M A T E R S T O Y O U ? " , " W H A T D O Y O U M E A N B Y T H A T ? " , " I ' M A S R E A L A S I C A N B E . " } } } s i z e _ t n K n o w l e d g e B a s e S i z e = s i z e o f ( K n o w l e d g e B a s e ) / s i z e o f ( K n o w l e d g e B a s e [ 0 ] ) i n t m a i n ( ) { s r a n d ( ( u n s i g n e d ) t i m e ( N U L L ) ) s t d : : s t r i n g s I n p u t = " " s t d : : s t r i n g s R e s p o n s e = " " w h i l e ( 1 ) { s t d : : c o u t < < " > " s t d : : g e t l i n e ( s t d : : c i n , s I n p u t ) v s t r i n g r e s p o n s e s = f i n d _ m a t c h ( s I n p u t ) i f ( s I n p u t = = " B Y E " ) { s t d : : c o u t < < " I T W A S N I C E T A L K I N G T O Y O U U S E R , S E E Y O U N E X T T I M E ! " < < s t d : : e n d l b r e a k } e l s e i f ( r e s p o n s e s . s i z e ( ) = = 0 ) { s t d : : c o u t < < " I ' M N O T S U R E I F I U N D E R S T A N D W H A T Y O U A R E T A L K I N G A B O U T . " < < s t d : : e n d l } e l s e { i n t n S e l e c t i o n = r a n d ( ) % M A X _ R E S P s R e s p o n s e = r e s p o n s e s [ n S e l e c t i o n ] s t d : : c o u t < < s R e s p o n s e < < s t d : : e n d l } } r e t u r n 0 } / / m a k e a s e a r c h f o r t h e u s e r ' s i n p u t

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

4/15

12/31/13

Chatbot Tutorial - CodeProject

/ / i n s i d e t h e d a t a b a s e o f t h e p r o g r a m v s t r i n g f i n d _ m a t c h ( s t d : : s t r i n g i n p u t ) { v s t r i n g r e s u l t f o r ( i n t i = 0 i < n K n o w l e d g e B a s e S i z e + + i ) { i f ( s t d : : s t r i n g ( K n o w l e d g e B a s e [ i ] . i n p u t ) = = i n p u t ) { c o p y ( K n o w l e d g e B a s e [ i ] . r e s p o n s e s , r e s u l t ) r e t u r n r e s u l t } } r e t u r n r e s u l t } v o i d c o p y ( c h a r * a r r a y [ ] , v s t r i n g & v ) { f o r ( i n t i = 0 i < M A X _ R E S P + + i ) { v . p u s h _ b a c k ( a r r a y [ i ] ) } }

Now,theprogramcanunderstandsomesentenceslike"whatisyourname","areyouintelligent"etc Andalsohecanchooseanappropriateresponsefromhislistofresponsesforthisgivensentenceand justdisplayitonthescreen.Unlikethepreviousversionoftheprogram(chatterbot1)Chatterbot2is capableofchoosingasuitableresponsetothegivenuserinputwithoutchoosingrandom responsesthatdoesn'ttakeintoaccountwhatactuallytheusertryingtosay. Wevealsoaddedacoupleofnewtechniquestothesesnewprogram:whentheprogramisunableto findamatchingkeywordthecurrentuserinput,itsimplyanswersbysayingthatitdoesn'tunderstand whichisquiethumanlike.

WhatcanweimproveonthesepreviousChatbot tomakeitevenbetter?
Therearequietafewthingsthatwecanimprove,thefirstoneisthatsincethechatterbottendstobe veryrepetitive,wemightcreateamechanismtocontroltheserepetitions.Wecouldsimplystorethe previousresponseofthatChatbotwithinastrings P r e v R e s p o n s e andmakesomecheckingswhen selectingthenextbotresponsetoseeifit'snotequaltothepreviousresponse.Ifitisthecase,wethen selectanewresponsefromtheavailableresponses. Theotherthingthatwecouldimprovewouldbethewaythatthechatbothandlestheusersinputs, currentlyifyouenteraninputthatisinlowercasetheChatbotwouldnotunderstandanythingaboutit eveniftherewouldbeamatchinsidethebot'sdatabaseforthatinput.Alsoiftheinputcontainsextra spacesorpunctuationcharacters(!,.)thisalsowouldpreventtheChatbotfromunderstandingthe input.That'sthereasonwhywewilltrytointroducesomenewmechanismtopreprocesstheusers inputsbeforeitcanbesearchintotheChatbotdatabase.Wecouldhaveafunctiontoputtheusers inputsinuppercasesincethekeywordsinsidethedatabaseareinuppercaseandanotherprocedure tojustremoveallofthepunctuationsandextraspacesthatcouldbefoundwithinusersinput.That said,wenowhaveenoughmaterialtowriteournextchatterbot:"Chattebot3".Viewthecodefor Chatterbot3

Whataretheweaknesseswiththecurrent versionoftheprogram?
Clearlytherearestillmanylimitationswiththisversionoftheprogram.Themostobviousonewouldbe thattheprogramuse"exactsentencematching"tofindaresponsetotheuser'sinput.Thismeansthat ifyouwouldgoandaskhim"whatisyournameagain",theprogramwillsimplynotunderstandwhat

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

5/15

12/31/13

Chatbot Tutorial - CodeProject

youaretryingtosaytohimandthisisbecauseitwasunabletofindamatchforthisinput.Andthis definitelywouldsoundalittlebitsurprisingconsideringthefactthattheprogramcanunderstandthe sentence"whatisyourname".

Howdoweovercomethisproblem?
Thereareatlisttwowaystosolvethisproblem,themostobviousoneistouseaslightlymoreflexible wayformatchingkeywordsinthedatabaseagainsttheuser'sinput.Allwehavetodotomakethis possibleistosimplyaloudkeywordstobefoundwithintheinputssothatwewillnolongerhavethe previouslimitation. Theotherpossibilityismuchmorecomplex,ituse'stheconceptofFuzzyStringSearch .Toapplythis method,itcouldbeusefulatfirsttobreaktheinputsandthecurrentkeywordinseparatewords,after thatwecouldcreatetwodifferentvectors,thefirstonecouldbeusetostorethewordsfortheinputand theotheronewouldstorethewordsforthecurrentkeyword.Oncewehavedonethiswecouldusethe Levenshteindistanceformeasuringthedistancebetweenthetwowordvectors.(Noticethatinorder forthismethodtobeeffectivewewouldalsoneedanextrakeywordthatwouldrepresentthesubjectof thecurrentkeyword). So,thereyouhaveit,twodifferentmethodsforimprovingthechatterbot.Actuallywecouldcombine bothmethodsandjustselectingwhichonetouseoneachsituation. Finally,therearestillanotherproblemthatyoumayhavenoticedwiththepreviouschatterbot,you couldrepeatthesamesentenceoverandoverandtheprogramwouldn'thaveanyreactiontothis.We needalsotocorrectthisproblem. So,wearenowreadytowriteourfourthchatterbot,wewillsimplycallitchatterbot4.Viewthecodefor Chatterbot4 Asyouprobablymayhaveseen,thecodefor"chatterbot4"isverysimilartotheonefor"chatterbot3" butalsotherewassomekeychangesinit.Inparticular,thefunctionforsearchingforkeywordsinside thedatabaseisnowalittlebitmoreflexible.So,whatnext? Dontworrytherearestillalotofthingsto becovered.

Whatcanweimproveinchatterbot4tomakeit better?
Herearesomeideas
sincethecodeforthechatterbotshavestartedtogrow,itwouldbeagoodthingtoencapsulate theimplementationofthenextchatterbotbyusingaclass. alsothedatabaseisstillmuchtoosmalltobecapableofhandlingarealconversationwithusers, sowewillneedtoaddsomemoreentriesinit. itmayhappensometimesthattheuserwillpresstheenterkeywithoutenteringanythingonthe keyboard,weneedtohandlethissituationaswell. theusermightalsotrytotrickthechatterbotbyrepeatinghisprevioussentencewithsomeslight modification,weneedtocountthisasarepetitionfromtheuser. andfinally,prettysoonyouwillalsonoticethatwemightneedawayforrankingkeywordswhen wehavemultiplechoicesofkeywordsforagiveninput,weneedawayforchoosingthebestone amongthem. Thatsaid,wewillnowstarttowritetheimplementationforchatterbot5.DownloadChatterbot5 Beforeproceedingtothenextpartofthistutorial,youareencouragedtotrycompilingandrunningthe

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

6/15

12/31/13

Chatbot Tutorial - CodeProject

codefor"chatterbot5"sothatyoucanunderstandhowitworksandalsotoverifiesthechangesthat havebeenmadeinit.Hasyoumayhaveseen,theimplementationofthe"currentchatterbot",isnow encapsulatedintoaclass,also,therehasbeensomenewfunctionsaddedtothenewversionofthe program.

Wewillnowtrytodiscusstheimplementationof "chatterbot5"
s e l e c t _ r e s p o n s e ( ) : t h i s f u n c t i o n s e l e c t s a r e s p o n s e f r o m a l i s t o f r e s p o n s e s , t h e r e i s a n e w h e l p e r f u n c t i o n t h a t w a s a d d e d t o t h e p r o g r a m s h u f f l e , t h i s n e w f u n c t i o n s h u f f l e s a l i s t o f s t r i n g s r a n d o m l y a f t e r s e e d _ r a n d o m _ g e n e r a t o r ( ) w a s c a l l e d . s a v e _ p r e v _ i n p u t ( ) : t h i s f u n c t i o n s i m p l y s a v e s t h e c u r r e n t u s e r i n p u t i n t o a v a r i a b l e ( m _ s P r e v I n p u t ) b e f o r e g e t t i n g s o m e n e w i n p u t s f r o m t h e u s e r . v o i d s a v e _ p r e v _ r e s p o n s e ( ) : t h e f u n c t i o n s a v e _ p r e v _ r e s p o n s e ( ) s a v e s t h e c u r r e n t r e s p o n s e o f t h e c h a t t e r b o t b e f o r e t h e b o t h a v e s t a r t e d t o s e a r c h r e s p o n s e s f o r t h e c u r r e n t i n p u t , t h e c u r r e n t r e s p o n s e s i s s a v e i n t h e v a r a i b l e ( m _ s P r e v R e s p o n s e ) . v o i d s a v e _ p r e v _ e v e n t ( ) : t h i s f u n c t i o n s i m p l y s a v e s t h e c u r r e n t e v e n t ( m _ s E v e n t ) i n t o t h e v a r i a b l e ( m _ s P r e v E v e n t ) . A n e v e n t c a n b e w h e n t h e p r o g r a m h a s d e t e c t e d a n u l l i n p u t f r o m t h e u s e r a l s o , w h e n t h e u s e r r e p e a t s h i m s e l f o r e v e n w h e n t h e c h a t t e r b o t m a k e s r e p e t i t i o n s h a s w e l l e t c . v o i d s e t _ e v e n t ( s t d : : s t r i n g s t r ) : s e t s t h e c u r r e n t e v e n t ( m _ s E v e n t ) v o i d s a v e _ i n p u t ( ) : m a k e s a b a c k u p o f t h e c u r r e n t i n p u t ( m _ s I n t p u t ) i n t o t h e v a r i a b l e m _ s I n p u t B a c k u p . v o i d s e t _ i n p u t ( s t d : : s t r i n g s t r ) : s e t s t h e c u r r e n t i n p u t ( m _ s I n p u t ) v o i d r e s t o r e _ i n p u t ( ) : r e s t o r e s t h e v a l u e o f t h e c u r r e n t i n p u t ( m _ s I n p u t ) t h a t h a s b e e n s a v e d p r e v i o u s l y i n t o t h e v a r i a b l e m _ s I n p u t B a c k u p . v o i d p r i n t _ r e s p o n s e ( ) : p r i n t s t h e r e s p o n s e t h a t h a s b e e n s e l e c t e d b y t h e c h a t r o b o t o n t h e s c r e e n . v o i d p r e p r o c e s s _ i n p u t ( ) : t h i s f u n c t i o n d o e s s o m e p r e p r o c e s s i n g o n t h e i n p u t l i k e r e m o v i n g p u n c t u a t i o n s , r e d u n d a n t s p a c e s c h a r a c t e s a n d a l s o i t c o n v e r t s t h e i n p u t t o u p p e r c a s e . b o o l b o t _ r e p e a t ( ) : v e r i f i e s i f t h e c h a t t e r b o t h a s s t a r t e d t o r e p e a t h i m s e l f . b o o l u s e r _ r e p e a t ( ) : V e r i f i e s i f t h e u s e r h a s r e p e a t e d h i s s e l f . b o o l b o t _ u n d e r s t a n d ( ) : V e r i f i e s t h a t t h e b o t u n d e r s t a n d t h e c u r r e n t u s e r i n p u t ( m _ s I n p u t ) . b o o l n u l l _ i n p u t ( ) : V e r i f i e s i f t h e c u r r e n t u s e r i n p u t ( m _ s I n p u t ) i s n u l l . b o o l n u l l _ i n p u t _ r e p e t i t i o n ( ) : V e r i f i e s i f t h e u s e r h a s r e p e a t e d s o m e n u l l i n p u t s . b o o l u s e r _ w a n t _ t o _ q u i t ( ) : C h e c k t o s e e i f t h e u s e r w a n t s t o q u i t t h e c u r r e n t s e s s i o n w i t h t h e c h a t t e r b o t .

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

7/15

12/31/13

Chatbot Tutorial - CodeProject

b o o l s a m e _ e v e n t ( ) : V e r i f i e s i f t h e c u r r e n t e v e n t ( m _ s E v e n t ) i s t h e s a m e a s t h e p r e v i o u s o n e ( m _ s P r e v E v e n t ) . b o o l n o _ r e s p o n s e ( ) : C h e c k s t o s e e i f t h e p r o g r a m h a s n o r e s p o n s e f o r t h e c u r r e n t i n p u t . b o o l s a m e _ i n p u t ( ) : V e r i f i e s i f t h e c u r r e n t i n p u t ( m _ s I n p u t ) i s t h e s a m e a s t h e p r e v i o u s o n e ( m _ s P r e v I n p u t ) . b o o l s i m i l a r _ i n p u t ( ) : C h e c k s t o s e e i f t h e c u r r e n t a n d p r e v i o u s i n p u t a r e s i m i l a r , t w o i n p u t s a r e c o n s i d e r e d s i m i l a r i f o n e o f t h e m i s t h e s u b s t r i n g o f t h e o t h e r o n e ( e . g . : h o w a r e y o u a n d h o w a r e y o u d o i n g w o u l d b e c o n s i d e r e d s i m i l a r b e c a u s e h o w a r e y o u i s a s u b s t r i n g o f h o w a r e y o u d o i n g . v o i d g e t _ i n p u t ( ) : G e t s i n p u t s f r o m t h e u s e r . v o i d r e s p o n d ( ) : h a n d l e s a l l r e s p o n s e s o f t h e c h a t r o b o t w h e t h e r i t i s f o r e v e n t s o r s i m p l y t h e c u r r e n t u s e r i n p u t . S o , b a s i c a l l y , t h e s e f u n c t i o n c o n t r o l s t h e b e h a v i o u r o f t h e p r o g r a m . f i n d _ m a t c h ( ) : F i n d s r e s p o n s e s f o r t h e c u r r e n t i n p u t . v o i d h a n d l e _ r e p e t i t i o n ( ) : H a n d l e s r e p e t i t i o n s m a d e b y t h e p r o g r a m . h a n d l e _ u s e r _ r e p e t i t i o n ( ) : H a n d l e s r e p e t i t i o n s m a d e b y t h e u s e r . v o i d h a n d l e _ e v e n t ( s t d : : s t r i n g s t r ) : T h i s f u n c t i o n h a n d l e s e v e n t s i n g e n e r a l .
Youcanclearlyseethat"chatterbot5"havemuchmorefunctionalitiesthan"chatterbot4"andalsoeach functionalitiesisencapsulatedintomethods(functions)oftheclassC B o t butstilltherearealotmore improvementstobemadeonittoo. Chattebot5introducetheconceptof"state",inthesenewversionoftheChatterbot,weassociatea different"state"tosomeoftheeventsthatcanoccurduringaconversation.Ex:whentheuserentersa nullinput,thechatterbotwouldsetitselfintothe"N U L L I N P U T * * "state,whentheuserrepeatthe samesentence,itwouldgointothe"REPETITIONT1**"state,etc. Alsothesenewchatterbotusesabiggerdatabasethanthepreviouschatbotthatwehaveseensofar: chatterbot1,chatterbot2,chatterbot3...Butstill,thisisquietinsignificantduetothefactthatmost chatterbotsinusetoday(theverypopularones)haveadatabaseofatleast10000linesormore.So, thiswoulddefinitelybeoneofthemajorgoalthatwemighttrytoachieveintothenextversionsofthe chatterbot. Buthoweverfornow,wewillconcentratealittleproblemconcerningthecurrentchatterbot.

Whatexactlywouldbethisproblem?
Well,it'sallaboutkeywordboundaries,supposethatuserentersthesentence:"Ithinknot"duringa conversationwiththechatbot,naturallytheprogramwouldlookintohisdatabaseforakeywordthat wouldmatchthesentence,anditmightfoundthekeyword:"Hi",whichisalsoasubstringoftheword "think",clearlythisisanunwantedbehaviour.

Howdoweavoidit?
Simplybyputtingaspacecharacterbeforeandafterthekeywordsthatcanbefoundinsidethe databaseorwecansimplyapplythechangesduringthematchingprocessinsidethe"f i n d _ m a t c h ( ) function".

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

8/15

12/31/13

Chatbot Tutorial - CodeProject

Arethereotherthingsthatwecanimprovein "Chatterbot5"?
Certainlythereis.SofartheChatbotstarta"chattingsession"withtheuserswithoutsayinganythingat thebeginningoftheconversations.Itwouldbegoodifthechatterbotcouldsayanythingatalltostartup theconversations.Thiscaneasilybeachievedbyintroducing"signonmessages"intotheprogram.We cansimplydothisbycreatinganewstateinsidetheChatbot"knowledgebase"andbyaddingsome appropriatemessagethatlinkstoit.Thatnewstatecouldbecall"SIGNON**". DownloadChatterbot6

Introducingtheconceptof"KeywordRanking"
Asyoucansee,oneachnewversionofthechatterbot,weareprogressivelyaddingnewfeaturesin ordertomaketheChabotmorerealistic.Now,inthesesection,wearegoingtointroducetheconcept of'keywordranking'intotheChatterbot.Keywordrankingisawayfortheprogramtoselectthebest keywordsinhisdatabasewhentherearemorethanonekeywordthatmatchtheusersinputs.Ex:ifwe havethecurrentuserinput:Whatisyournameagain,bylookingintohisdatabase,theChatbotwould havealistoftwokeywordsthatmatchthisinput:'WHAT'and'WHATISYOURNAME'.Whichoneisthe best?Well,theanswerisquietsimple,itisobviously:'Whatisyourname'simplybecauseitisthe longestkeyword.Thesenewfeaturehasbeenimplementedinthenewversionoftheprogram: Chatterbot7. DownloadChatterbot7

Equivalentkeywords
WithinallthepreviousChatterbotstherecordforthedatabasealoudustouseonlyonekeywordfor eachsetofresponsesbutsometimesitcouldbeUsefultohavemorethanonekeywordassociatedto eachsetofresponses.Speciallywhenthesekeywordshavethesamemeaning.E.g.:Whatisyour nameandCanyoupleasetellmeyournamehavebothhadthesamemeaning?Sotherewouldbeno needtousedifferentrecordsforthesekeywordsinsteadwecanjustmodifytherecordstructuresothat italoudustohavemorethanonekeywordperrecords.DownloadChatterbot8

Keywordtranspositionandtemplateresponse
Oneofthewellknownmechanismsofchatterbotsisthecapacitytoreformulatetheuser'sinputby doingsomebasicverbconjugation.Example,iftheuserenters:YOUAREAMACHINE,thechatterbot mightrespond:So,youthinkthatI'mamachine. Howdidwearriveatthistransformation?Wemayhavedoneitbyusingtwosteps: Wemakesurethatthechatterbothavealistofresponsetemplatesthatislinkedtothe correspondingkeywords.Responsestemplatesareasortofskeletontobuildnewresponsesfor thechatterbot.usuallyweusedwildcardsintheresponsestoindicatethatitisatemplate.Onthe previousexample,wehaveusedthetemplate:(so,youthinkthat*)toconstructourresponse. Duringthereassemblyprocess,wesimplyreplacethewildcardbysomepartoftheoriginalinput. Inthatsameexample,wehaveused:Youareamachine,whichisactuallythecompleteoriginal inputfromtheuser.Afterreplacingthewildcardbytheuser'sinput,wehavethefollowing sentence:So,youthinkthatyouareamachinebutwecannotusethesesentenceasitis,

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

9/15

12/31/13

Chatbot Tutorial - CodeProject

beforethatweneedtomakesomepronounreversalinit. Theusualtranspositionsthatweusemostlyarethereplacementofpronounofthefirstpersonto pronounofthesecondperson,e.g.:you>me,I'm>youareetc.Inthepreviousexampleby replacing"YOUARE"by"I'M"intheusersinput,Afterapplyingthesechanges,theoriginal sentencebecomes:I'mamachine.Nowwecanreplacethewildcardfromthetemplatebythese newsentencewhichgiveusourfinalresponsefortheChatbot:So,youthinkthatI'mamachine. Noticethatit'snotagoodthingtousetranspositiontoomuchduringaconversation,themechanism wouldbecometooobviousanditcouldcreatesomerepetition. DownloadChatterbot9

Keywordlocationconcept
Somekeywordscanbelocatedanywhereinagiveninput,someotherscanonlybefoundinonlysome specificplacesintheuser'sinputotherwiseitwouldn'tmakeanysense.Akeywordlike:"Whoareyou" canbefoundanywhereontheuser'sinputwithoutcreatinganyproblemswiththemeaningofit. Someexamplesofsentencesusing"WHOAREYOU"wouldbe: 1. Whoareyou? 2. Bytheway,whoareyou? 3. Sotellme,whoareyouexactly? Butakeywordsuchas"whois"canonlybefoundatthebeginningorinthemiddleofagivensentence butitcannotbefoundatendofthesentenceoralone. Examplesofsentencesusingthekeyword:"whois": 1. Whoisyourfavoritesinger? 2. Doyouknowwhoisthegreatestmathematicianofalltime? 3. Tellme,doyouknowwhois?(thisclearlydoesn'tmakeanysense) Howdowemakesurethatthechatterbotwillbeabletodistinguishsuchkeywordsandthespecific placesweretheyarealoudtobefoundonasentence?Wewillsimplyintroducesomenewnotations forkeywords: 1. Keywordsthatcanonlybefoundatthebeginningorinthemiddleofasentencewillbe representedby:_ K E Y W O R D (Ex:_WHOIS) 2. Keywordsthatcanonlybefoundatendorinthemiddleofasentencewillbedenotedby: K E Y W O R D _ (WHATAREYOU_) 3. Keywordsthatshouldonlybefoundaloneinasentencewillberepresentedby:_ K E Y W O R D _ (Ex: _WHAT) 4. Andfinally,keywordsthatcanbefoundanywhereinasentenceorevenalonewouldbesimply representedby:K E Y W O R D (Ex:IUNDERSTAND) Akeywordcanhavedifferentmeaningsdependingonit'spositioninagivensentence. DownloadChatterbot10

HandlingContext
ContextaiswayfortheChatterbottokeepintractwhatithassaidpreviouslyandbeingabletotake

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

10/15

12/31/13

Chatbot Tutorial - CodeProject

thisintoaccountwhenselectinghisnextresponse.Sofar,everyresponseselectedbytheChatbot duringaconversationischosenonlybasedonthecurrentuser'sinput.Butsometimes,wemightneed moredatainordertobeabletorespondproperlytoagiveninput,that'swhenweneedtousecontext. Toillustratetheseconcept,wearegoingtolookatthefollowingconversationlog: USER:Whatisyourfavoritemovie? CHATTERBOT:ITISTERMINATORII. USER:Whydoyoulikethismovie?(Nowhowarewesupposedtoanswerthatquestionifweknew nothingaboutthepreviousresponseoftheChatbot?) Soclearly,someinputsrequirestheusageof"context"inordertobeabletoformulateacorrect answer.Inthepreviousexample,itwouldsimplybe:ITISTERMINATORII.NowtheBotknowswhatit wastalkingaboutpreviously,itcanmoreeasilyformulateagoodanswertotheuser'sinput. Wecannowcontinuethepreviousconversationlog: (Context:ITISTERMINATORII) CHATTERBOT:BECAUSEITISASCIENCEFICTIONMOVIEANDILOVESCIENCEFICTION. ContextalsoaloudustocontrolimproperreactionfromtheChatbot.Example,iftheuserentersthe sentence:"Whydoyoulikethesemovie?"duringaconversationwithouttheChatterboteventalking aboutthesesubject.Itcouldsimplyrespondbysaying:WHATAREYOUTALKINGABOUT? ThecontextfeaturehasbeenimplementedinChatterbot11. DownloadChatterbot11 AnothergreatfeaturethatwouldbeveryinterestingtoimplementintoaChatterbotisthecapacityto anticipatethenextresponseoftheuser,thesewouldmaketheChatbotlooksevenmoresmarter duringaconversation.

UsingTextToSpeech
Wouldn'titbegreatifyourcomputercouldspeakbacktoyouwheneveryouorderittodosomething, we'veaccomplishjustthatin"Chatterbot12"thelatestversionoftheprogram.Nowtheprogramcan speakouteveryanswerthatishasselectedafterexaminingtheuser'sinput.TheSAPIlibraryfrom Microsoftwasusedinordertoaddthe"TextToSpeech"featurewithintheprogram.Forthe implementationpart,threenewfunctionswereaddedtotheprogramtoimplementthe"TextTo Speech"functionality:I n i t i a l i z e _ T T S _ E n g i n e ( ) ,s p e a k ( c o n s t s t d : : s t r i n g t e x t ) , R e l e a s e _ T T S _ E n g i n e ( ) .

I n i t i a l i z e _ T T S _ E n g i n e ( ) :Thesefunctionasthenamesuggestinitializedthe"TextTo
SpeechEngine"thatis,wefirststartbyinitializingthe"COMobjects"sinceSAPIisbuildontopof theATLlibrary.Iftheinitializationwassuccessful,wethencreateaninstanceoftheI S p V o i c e objectthatcontrolledthe"TextToSpeech"mechanismwithintheSAPIlibrarybyusingthe C o C r e a t e I n s t a n c e function.Ifthatalsowassuccessful,itmeansthatour"TextToSpeech Engine"wasinitializedproperlyandwearenowreadyforthenextstage:speakoutthe "responsestring" s p e a k ( c o n s t s t d : : s t r i n g t e x t ) :So,thisisthemainfunctionthatisusedfor implementing"TextToSpeech"withintheprogram,itbasicallytakesthe"responsestring" convertedtowidecharacters(W C H A R )andthenpassittothe"Speakmethod"ofthe "I S p V o i c e "objectwhichthenspeakoutthe"bot'sresponse". R e l e a s e _ T T S _ E n g i n e ( ) :Oncewearedoneusingthe"SAPITextToSpeechEngine",we justreleasealltheresourcesthathasbeenallocatedduringtheprocedure.

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

11/15

12/31/13

Chatbot Tutorial - CodeProject

DownloadChatterbot12

Usingaflatfiletostorethedatabase
Sofarthe,databasewasalwaysbuiltintotheprogramwhichmeanswheneveryoumodifiedthe database,youwouldalsohavetorecompiletheprogram.Thisisnotreallyconvenientbecauseitmight happensometimesthatweonlywanttoeditthedatabaseandkeeptherestoftheprogramasitis.For thesereasonandmanyothers,itcouldbeagoodthingtohaveaseparatefiletostorethedatabase whichthengivesusthecapabilityofjusteditingthedatabasewithouthavingtorecompileallthefilesin theprogram.Tostorethedatabasewecouldbasicallyuseasimpletextfilewithsomespecific notationstodistinguishthedifferentelementsofthedatabase(keywords,response,transpositions, context...).Inthecurrentprogram,wewillusethefollowingnotationsthathasbeenusedbeforesome implementationoftheElizachatbotinPascal. 1. Linesthatstartsby"K"inthedatabasewillrepresentkeywords. 2. Linesthatstartsby"R"willrepresentresponses 3. Linesthatstartsby"S"willrepresentsignonmessages 4. Linesthatstartsby"T"willrepresenttranspositions 5. Linesthatstartsby"E"willrepresentpossiblecorrectionscanbemadeaftertransposingthe user'sinput 6. Linesthatstartsby"N"willrepresentresponsesforemptyinputfromtheuser 7. Linesthatstartsby"X"willrepresentresponsesforwhenthatchatbotdidnotfindanymatching keywordthatmatchthecurrentuserinput. 8. Linesthatstartsby"W"willrepresentresponsesforwhentheuserrepeatitself. 9. Linesthatstartsby"C"willrepresentthecontextofthechatbot'scurrentresponse. 10. Linesthatstartsby"#"willrepresentcomments Wenowhaveacompletearchitectureforthedatabase,wejustneedtoimplementthesesfeaturesinto thenextversionofthechatbot(Chatterbot13). DownloadChatterbot13

Abetterrepetitionhandlingalgorithm
Inanefforttopreventthechatbotfromrepeatingitselftoomuch,previouslywehaveuseaverybasic andsimplealgorithmthatconsistofcomparingthecurrentchatbot'sresponsetothepreviousone.If thecurrentresponseselectionisequaltothepreviousone,wesimplydiscardthatresponseandlook overforthenextresponsecandidateonthelistofavailableresponses.Thisalgorithmisveryefficient whenitcomestocontrolimmediaterepetitionsfromthechatbot.However,it'snotthatgoodtoavoid morelongtermrepetition.Duringachattingsession,thesameresponsecanoccursmanytimes.With thenewalgorithm,wecontrolhowlongittakesforthechatbottoreselectthesameresponse.Actually wemakesurethatithasuseallavailableresponseforthecorrespondingkeywordbeforeitcanrepeat thesameresponse.Thisisinturncanimprovethequalityoftheconversationexchanges.Hereisa decryptiononhowthealgorithmworks:Duringtheconversationbetweenthechatbotandtheuser,we makealistofalltheresponsespreviouslyselectedbythechatrobot.Whenselectinganewresponse, wemakeasearchofthencurrentselectedresponseinsidetheliststartingfromtheend.Ifthecurrent responsecandidatewasfoundduringthatsearchwithinthelist,wethenmakeacomparisonofthat positionthetotalnumberofavailableresponses.ifthepositionplusoneisinferiortothetotalof availableresponses,weconsiderthatitisarepetition,sowehavetodiscardthecurrentresponseand selectanotherone. DownloadChatterbot14

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

12/15

12/31/13

Chatbot Tutorial - CodeProject

Updatingthedatabasewithnewkeywords
Sometimes,whenitcomestoaddnewkeywordstothedatabase,itcouldbedifficulttochoosethose thatarereallyrelevant.However,thereisaverysimplesolutiontothatproblem.Whenchatingwiththe chatrobot,wejustmakesurethatwestoretheuser'sinputinafile(ex:unknown.txt)eachtimethe chatbotwasnotabletofindanymatchingkeywordforthecurrentinput.Lateron,whenweneedto makesomekeywordsupdatesinthedatabase,wejusthavetotakealookatthefilethatwe'veuseto savetheunkownsentencesfoundearlierduringthepreviousconversations.Bycontinuouslyadding newkeywordsusingtheseprocedure,wecouldcreateaverygooddatabase. DownloadChatterbot15

SavingtheConversationLogs
Whysavingtheconversationsbetweentheusersandthechatbot?Becauseitcouldhelpusfindthe weaknessofthechatbotduringagivenconversation.Wemightthendecideonwhichmodificationsto maketothedatabaseinordertomakethefutureconversationsexchangesmorenatural.Wecould basicallysavethetimeandalsothedatetohelpusdeterminetheprogressofthechatbotafternew updateswereappliedtoit.Savingthelogshelpsusdeterminehowhumanlikeistheconversationskill ofthechatbot. DownloadChatterbot16

LearningCapability
Sofar,thechatbotwasnotabletolearnnewdatafromtheuserswhilechatting,itwouldbeveryuseful tohavethisfeaturewithinthechatbot.Itbasicallymeansthatwheneverthechatbotencountersan inputthathasnocorrespondingkeyword,itwouldprompttheuseraboutit.Andinreturntheuser wouldbeabletoaddanewkeywordandthecorrespondingresponsetoitinthedatabaseofthechat robot,doingsocanimprovethedatabaseofthechabotverysignificantly.Hereishowthealgorithm shouldgo: 1. NOKEYWORDWASFOUNDFORTHISINPUT,PLEASEENTERAKEYWORD 2. SOTHEKEYWORDIS:(key) 3. (ifresponseisno)PLEASEREENTERTHEKEYWORD(gobacktostep#2) 4. NORESPONSEWASFOUNDFORTHISKEYWORD:(key),PLEASEENTERARESPONSE 5. SO,THERESPONSEIS:(resp) 6. (ifresponseisno)PLEASEREENTERTHERESPONSE(gobacktostep#4) 7. KEYWORDANDRESPONSELEARNEDSUCCESSFULLY 8. ISTHEREANYOTHERKEYWORDTHATISHOULDLEARN 9. (ifresponseisyes,otherwisecontinuechating ):PLEASEENTERTHEKEYWORD(goback tostep#2) Returntobeginningofthedocument Checktheaiprogramming.blogspot.comwebpageforthelatestupdates

License
Thisarticle,alongwithanyassociatedsourcecodeandfiles,islicensedunderTheCodeProjectOpen License(CPOL)

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

13/15

12/31/13

Chatbot Tutorial - CodeProject

AbouttheAuthor
GonzalesCenelia
Helpdesk/SupportGexelTelecom Canada

IhavebeenprogramminginCandC++formorethanfouryears,thefirsttimethatihadlearn programmingwasin1999incollege.Howeveritwasonlybytheyear2000whenihavebuy myfirstcomputerthatihadtrulystartedtodosomemoreinterestingthingsinprogramming. Asaprogrammer,mymaininterestisA.Iprogramming.Soi'mreallycaptivatedbyallthatis relatedtoN.L.U(NaturalLanguageUnderstanding),N.L.P(NaturalLanguageProcessing), ArtificialNeuralNetworksetc.Currentlyi'mlearningtoprograminPrologandLisp.Also,i'm reallyfascinatedwiththeoriginalchatterbotprogramnamed:Eliza,thatprogramwaswroteby JosephWeizenbaum.Everytimeirunthisprogram,itmakesmereallythinkthatA.Icouldbe solveoneday.AlotofinterestingstuffhasbeenaccomplishinthedomainofArtificial Intelligenceinthepastyears.Averygoodexampleofthoseaccomplishmentsis:Logic Programming,whichmakesitpossibletomanipulatelogicstatementsandalsotomakesome inferencesaboutthosestatements.Aclassicalexamplewouldbe:giventhefactthat"Every manismortal"andthatSocratesisaman,thanlogicallywecandeducethatSocratesis mortal.SuchsimplelogicalstatementscanbewroteinPrologbyusingjustafewlinesofcode: prologcodesample: mortal(X):man(X).%rule man(socrates).%declaringafact theprecedingprologrulecanberead:foreveryvariableX,ifXisamanthanXismortal.these lastPrologcodesamplecanbeeasilyextentedbyaddingmorefactsorrules,example: mortal(X):man(X).%rule mortal(X):woman(X).%rule man(socrates).%fact1 man(adam).%fact2 woman(eve).%fact3 formore,check:aiprogramming.blogspot.com

CommentsandDiscussions
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print 14/15

12/31/13

Chatbot Tutorial - CodeProject

41messageshavebeenpostedforthisarticleVisit http://www.codeproject.com/Articles/36106/ChatbotTutorialtopostandviewcommentsonthis article,orclickheretogetaprintviewwithmessages.


Permalink|Advertise|Privacy|Mobile Web01|2.7.131230.1|LastUpdated29May2013 ArticleCopyright2009byGonzalesCenelia EverythingelseCopyrightCodeProject,19992013 TermsofUse

www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print

15/15

Das könnte Ihnen auch gefallen