Beruflich Dokumente
Kultur Dokumente
ArticlesGeneralProgrammingAlgorithms&RecipesGeneral
ChatbotTutorial
ByGonzalesCenelia,29May2013
4.67(44votes)
Overview
AstepbystepguidetoimplementyourownArtificialIntelligencechatbot.
Tableofcontents
1. IntroductionChatbotdescription(firstexample) 2. Introducingkeywordsandstimulusresponse 3. Preprocessingtheuser'sinputandrepetitioncontrol 4. Amoreflexiblewayformatchingtheinputs 5. Usingclassesforabetterimplementation 6. Controllingrepetitionmadebytheuser 7. Using"states"torepresentdifferentevents 8. Keywordboundariesconcept 9. UsingSignonmessages 10. "KeywordRanking"concept 11. Keywordequivalenceconcept 12. Transpositionandtemplateresponse 13. Keywordlocationconcept 14. Handlingcontext 15. UsingTextToSpeech 16. Usingaflatfiletostorethedatabase 17. Abetterrepetitionhandlingalgorithm 18. Updatingthedatabasewithnewkeywords 19. SavingtheconversationLogs 20. Learningcapability
Introduction
BasicallyachatterbotisacomputerprogramthatwhenyouprovideitwithsomeinputsinNatural Language(English,French...)respondswithsomethingmeaningfulinthatsamelanguage.Which meansthatthestrengthofachatterbotcouldbedirectlymeasuredbythequalityoftheoutputselected bytheBotinresponsetotheuser.Bythepreviousdescription,wecoulddeducethataverybasic chatterbotcanbewritteninafewlinesofcodeinagivenspecificprogramminglanguage.Letsmake
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
1/15
12/31/13
Asyoucansee,itdoesn'ttakealotofcodetowriteaverybasicprogramthatcaninteractwithauser butitwouldprobablybeverydifficulttowriteaprogramthatwouldreallybecapableoftrulyinterpreting whattheuserisactuallysayingandafterthatwouldalsogenerateanappropriateresponsetoit.These havebeenalongtermgoalsincethebeginningandevenbeforetheveryfirstcomputerswerecreated. In1951,theBritishmathematicianAlanTuringhascameupwiththequestionCanmachinesthinkand hehasalsoproposeatestwhichisnowknownastheTuringTest.Inthistest,acomputerprogram andalsoarealpersonissettospeaktoathirdperson(thejudge)andhehastodecidewhichofthem istherealperson.Nowadays,thereisacompetitionthatwasnamedtheLoebnerPrizeandinthis competitionbotsthathassuccessfullyfoolmostofthejudgeforatlist5minuteswouldwinaprizeof 100.000$.Sofarnocomputerprogramwasabletopassthistestsuccessfully.Oneofthemajor reasonsforthisisthatcomputerprogramswrittentocomputeinsuchcontesthavenaturallythe tendencyofcommittingalotoftypo(theyareoftenoutofthecontextoftheconversation).Which meansthatgenerally,itisn'tthatdifficultforajudgetodecidewhetherheisspeakingtoa"computer program"orarealperson.Also,thedirectancestorofallthoseprogramthattriestomimica conversationbetweenrealhumanbeingsisEliza,thefirstversionofthisprogramwaswrittenin1966 byJosephWeizenbaumaprofessorofMIT. ChatbotsingeneralareconsideredtobelongtotheweakAIfield(weakartificialintelligence)as opposedtostronga.iwho'sgoalistocreateprogramsthatareasintelligentashumansormore intelligent.Butitdoesn'tmeanthatchatbotsdonothaveanytruepotential.Beingabletocreatea programthatcouldcommunicatethesamewayhumansdowouldbeagreatadvancefortheAIfield. Chatbotisthispartofartificialintelligencewhichismoreaccessibletohobbyist(itonlytakesome
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
2/15
12/31/13
averageprogrammingskilltobeachatbotprogrammer).So,programmersouttherewhowantedto createtrueAIorsomekindofartificialintelligence,writingintelligentchatbotsisagreatplacetostart!
Now,let'sgetbacktoourpreviousprogram, whataretheproblemswithit?
Well,thereisalotofthem.Firstofall,wecanclearlyseethattheprogramisn'treallytryingto understandwhattheuserissayingbutinsteadheisjustselectingarandomresponsefromhis databaseeachtimetheusertypesomesentenceonthekeyboard.Andalso,wecouldnoticethatthe programrepeathimselfveryoften.Oneofthereasonforthisisbecauseofthesizeofthedatabase whichisverysmall(5sentences).Thesecondthingthatwouldexplaintherepetitionsisthatwehaven't implementedanymechanismthatwouldcontrolthisunwantedbehavior.
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
3/15
12/31/13
{ " M Y N A M E I S C H A T T E R B O T 2 . " , " Y O U C A N C A L L M E C H A T T E R B O T 2 . " , " W H Y D O Y O U W A N T T O K N O W M Y N A M E ? " } } , { " H I " , { " H I T H E R E ! " , " H O W A R E Y O U ? " , " H I ! " } } , { " H O W A R E Y O U " , { " I ' M D O I N G F I N E ! " , " I ' M D O I N G W E L L A N D Y O U ? " , " W H Y D O Y O U W A N T T O K N O W H O W A M I D O I N G ? " } } , { " W H O A R E Y O U " , { " I ' M A N A . I P R O G R A M . " , " I T H I N K T H A T Y O U K N O W W H O I ' M . " , " W H Y A R E Y O U A S K I N G ? " } } , { " A R E Y O U I N T E L L I G E N T " , { " Y E S , O F C O R S E . " , " W H A T D O Y O U T H I N K ? " , " A C T U A L Y , I ' M V E R Y I N T E L L I G E N T ! " } } , { " A R E Y O U R E A L " , { " D O E S T H A T Q U E S T I O N R E A L L Y M A T E R S T O Y O U ? " , " W H A T D O Y O U M E A N B Y T H A T ? " , " I ' M A S R E A L A S I C A N B E . " } } } s i z e _ t n K n o w l e d g e B a s e S i z e = s i z e o f ( K n o w l e d g e B a s e ) / s i z e o f ( K n o w l e d g e B a s e [ 0 ] ) i n t m a i n ( ) { s r a n d ( ( u n s i g n e d ) t i m e ( N U L L ) ) s t d : : s t r i n g s I n p u t = " " s t d : : s t r i n g s R e s p o n s e = " " w h i l e ( 1 ) { s t d : : c o u t < < " > " s t d : : g e t l i n e ( s t d : : c i n , s I n p u t ) v s t r i n g r e s p o n s e s = f i n d _ m a t c h ( s I n p u t ) i f ( s I n p u t = = " B Y E " ) { s t d : : c o u t < < " I T W A S N I C E T A L K I N G T O Y O U U S E R , S E E Y O U N E X T T I M E ! " < < s t d : : e n d l b r e a k } e l s e i f ( r e s p o n s e s . s i z e ( ) = = 0 ) { s t d : : c o u t < < " I ' M N O T S U R E I F I U N D E R S T A N D W H A T Y O U A R E T A L K I N G A B O U T . " < < s t d : : e n d l } e l s e { i n t n S e l e c t i o n = r a n d ( ) % M A X _ R E S P s R e s p o n s e = r e s p o n s e s [ n S e l e c t i o n ] s t d : : c o u t < < s R e s p o n s e < < s t d : : e n d l } } r e t u r n 0 } / / m a k e a s e a r c h f o r t h e u s e r ' s i n p u t
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
4/15
12/31/13
WhatcanweimproveonthesepreviousChatbot tomakeitevenbetter?
Therearequietafewthingsthatwecanimprove,thefirstoneisthatsincethechatterbottendstobe veryrepetitive,wemightcreateamechanismtocontroltheserepetitions.Wecouldsimplystorethe previousresponseofthatChatbotwithinastrings P r e v R e s p o n s e andmakesomecheckingswhen selectingthenextbotresponsetoseeifit'snotequaltothepreviousresponse.Ifitisthecase,wethen selectanewresponsefromtheavailableresponses. Theotherthingthatwecouldimprovewouldbethewaythatthechatbothandlestheusersinputs, currentlyifyouenteraninputthatisinlowercasetheChatbotwouldnotunderstandanythingaboutit eveniftherewouldbeamatchinsidethebot'sdatabaseforthatinput.Alsoiftheinputcontainsextra spacesorpunctuationcharacters(!,.)thisalsowouldpreventtheChatbotfromunderstandingthe input.That'sthereasonwhywewilltrytointroducesomenewmechanismtopreprocesstheusers inputsbeforeitcanbesearchintotheChatbotdatabase.Wecouldhaveafunctiontoputtheusers inputsinuppercasesincethekeywordsinsidethedatabaseareinuppercaseandanotherprocedure tojustremoveallofthepunctuationsandextraspacesthatcouldbefoundwithinusersinput.That said,wenowhaveenoughmaterialtowriteournextchatterbot:"Chattebot3".Viewthecodefor Chatterbot3
Whataretheweaknesseswiththecurrent versionoftheprogram?
Clearlytherearestillmanylimitationswiththisversionoftheprogram.Themostobviousonewouldbe thattheprogramuse"exactsentencematching"tofindaresponsetotheuser'sinput.Thismeansthat ifyouwouldgoandaskhim"whatisyournameagain",theprogramwillsimplynotunderstandwhat
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
5/15
12/31/13
Howdoweovercomethisproblem?
Thereareatlisttwowaystosolvethisproblem,themostobviousoneistouseaslightlymoreflexible wayformatchingkeywordsinthedatabaseagainsttheuser'sinput.Allwehavetodotomakethis possibleistosimplyaloudkeywordstobefoundwithintheinputssothatwewillnolongerhavethe previouslimitation. Theotherpossibilityismuchmorecomplex,ituse'stheconceptofFuzzyStringSearch .Toapplythis method,itcouldbeusefulatfirsttobreaktheinputsandthecurrentkeywordinseparatewords,after thatwecouldcreatetwodifferentvectors,thefirstonecouldbeusetostorethewordsfortheinputand theotheronewouldstorethewordsforthecurrentkeyword.Oncewehavedonethiswecouldusethe Levenshteindistanceformeasuringthedistancebetweenthetwowordvectors.(Noticethatinorder forthismethodtobeeffectivewewouldalsoneedanextrakeywordthatwouldrepresentthesubjectof thecurrentkeyword). So,thereyouhaveit,twodifferentmethodsforimprovingthechatterbot.Actuallywecouldcombine bothmethodsandjustselectingwhichonetouseoneachsituation. Finally,therearestillanotherproblemthatyoumayhavenoticedwiththepreviouschatterbot,you couldrepeatthesamesentenceoverandoverandtheprogramwouldn'thaveanyreactiontothis.We needalsotocorrectthisproblem. So,wearenowreadytowriteourfourthchatterbot,wewillsimplycallitchatterbot4.Viewthecodefor Chatterbot4 Asyouprobablymayhaveseen,thecodefor"chatterbot4"isverysimilartotheonefor"chatterbot3" butalsotherewassomekeychangesinit.Inparticular,thefunctionforsearchingforkeywordsinside thedatabaseisnowalittlebitmoreflexible.So,whatnext? Dontworrytherearestillalotofthingsto becovered.
Whatcanweimproveinchatterbot4tomakeit better?
Herearesomeideas
sincethecodeforthechatterbotshavestartedtogrow,itwouldbeagoodthingtoencapsulate theimplementationofthenextchatterbotbyusingaclass. alsothedatabaseisstillmuchtoosmalltobecapableofhandlingarealconversationwithusers, sowewillneedtoaddsomemoreentriesinit. itmayhappensometimesthattheuserwillpresstheenterkeywithoutenteringanythingonthe keyboard,weneedtohandlethissituationaswell. theusermightalsotrytotrickthechatterbotbyrepeatinghisprevioussentencewithsomeslight modification,weneedtocountthisasarepetitionfromtheuser. andfinally,prettysoonyouwillalsonoticethatwemightneedawayforrankingkeywordswhen wehavemultiplechoicesofkeywordsforagiveninput,weneedawayforchoosingthebestone amongthem. Thatsaid,wewillnowstarttowritetheimplementationforchatterbot5.DownloadChatterbot5 Beforeproceedingtothenextpartofthistutorial,youareencouragedtotrycompilingandrunningthe
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
6/15
12/31/13
Wewillnowtrytodiscusstheimplementationof "chatterbot5"
s e l e c t _ r e s p o n s e ( ) : t h i s f u n c t i o n s e l e c t s a r e s p o n s e f r o m a l i s t o f r e s p o n s e s , t h e r e i s a n e w h e l p e r f u n c t i o n t h a t w a s a d d e d t o t h e p r o g r a m s h u f f l e , t h i s n e w f u n c t i o n s h u f f l e s a l i s t o f s t r i n g s r a n d o m l y a f t e r s e e d _ r a n d o m _ g e n e r a t o r ( ) w a s c a l l e d . s a v e _ p r e v _ i n p u t ( ) : t h i s f u n c t i o n s i m p l y s a v e s t h e c u r r e n t u s e r i n p u t i n t o a v a r i a b l e ( m _ s P r e v I n p u t ) b e f o r e g e t t i n g s o m e n e w i n p u t s f r o m t h e u s e r . v o i d s a v e _ p r e v _ r e s p o n s e ( ) : t h e f u n c t i o n s a v e _ p r e v _ r e s p o n s e ( ) s a v e s t h e c u r r e n t r e s p o n s e o f t h e c h a t t e r b o t b e f o r e t h e b o t h a v e s t a r t e d t o s e a r c h r e s p o n s e s f o r t h e c u r r e n t i n p u t , t h e c u r r e n t r e s p o n s e s i s s a v e i n t h e v a r a i b l e ( m _ s P r e v R e s p o n s e ) . v o i d s a v e _ p r e v _ e v e n t ( ) : t h i s f u n c t i o n s i m p l y s a v e s t h e c u r r e n t e v e n t ( m _ s E v e n t ) i n t o t h e v a r i a b l e ( m _ s P r e v E v e n t ) . A n e v e n t c a n b e w h e n t h e p r o g r a m h a s d e t e c t e d a n u l l i n p u t f r o m t h e u s e r a l s o , w h e n t h e u s e r r e p e a t s h i m s e l f o r e v e n w h e n t h e c h a t t e r b o t m a k e s r e p e t i t i o n s h a s w e l l e t c . v o i d s e t _ e v e n t ( s t d : : s t r i n g s t r ) : s e t s t h e c u r r e n t e v e n t ( m _ s E v e n t ) v o i d s a v e _ i n p u t ( ) : m a k e s a b a c k u p o f t h e c u r r e n t i n p u t ( m _ s I n t p u t ) i n t o t h e v a r i a b l e m _ s I n p u t B a c k u p . v o i d s e t _ i n p u t ( s t d : : s t r i n g s t r ) : s e t s t h e c u r r e n t i n p u t ( m _ s I n p u t ) v o i d r e s t o r e _ i n p u t ( ) : r e s t o r e s t h e v a l u e o f t h e c u r r e n t i n p u t ( m _ s I n p u t ) t h a t h a s b e e n s a v e d p r e v i o u s l y i n t o t h e v a r i a b l e m _ s I n p u t B a c k u p . v o i d p r i n t _ r e s p o n s e ( ) : p r i n t s t h e r e s p o n s e t h a t h a s b e e n s e l e c t e d b y t h e c h a t r o b o t o n t h e s c r e e n . v o i d p r e p r o c e s s _ i n p u t ( ) : t h i s f u n c t i o n d o e s s o m e p r e p r o c e s s i n g o n t h e i n p u t l i k e r e m o v i n g p u n c t u a t i o n s , r e d u n d a n t s p a c e s c h a r a c t e s a n d a l s o i t c o n v e r t s t h e i n p u t t o u p p e r c a s e . b o o l b o t _ r e p e a t ( ) : v e r i f i e s i f t h e c h a t t e r b o t h a s s t a r t e d t o r e p e a t h i m s e l f . b o o l u s e r _ r e p e a t ( ) : V e r i f i e s i f t h e u s e r h a s r e p e a t e d h i s s e l f . b o o l b o t _ u n d e r s t a n d ( ) : V e r i f i e s t h a t t h e b o t u n d e r s t a n d t h e c u r r e n t u s e r i n p u t ( m _ s I n p u t ) . b o o l n u l l _ i n p u t ( ) : V e r i f i e s i f t h e c u r r e n t u s e r i n p u t ( m _ s I n p u t ) i s n u l l . b o o l n u l l _ i n p u t _ r e p e t i t i o n ( ) : V e r i f i e s i f t h e u s e r h a s r e p e a t e d s o m e n u l l i n p u t s . b o o l u s e r _ w a n t _ t o _ q u i t ( ) : C h e c k t o s e e i f t h e u s e r w a n t s t o q u i t t h e c u r r e n t s e s s i o n w i t h t h e c h a t t e r b o t .
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
7/15
12/31/13
b o o l s a m e _ e v e n t ( ) : V e r i f i e s i f t h e c u r r e n t e v e n t ( m _ s E v e n t ) i s t h e s a m e a s t h e p r e v i o u s o n e ( m _ s P r e v E v e n t ) . b o o l n o _ r e s p o n s e ( ) : C h e c k s t o s e e i f t h e p r o g r a m h a s n o r e s p o n s e f o r t h e c u r r e n t i n p u t . b o o l s a m e _ i n p u t ( ) : V e r i f i e s i f t h e c u r r e n t i n p u t ( m _ s I n p u t ) i s t h e s a m e a s t h e p r e v i o u s o n e ( m _ s P r e v I n p u t ) . b o o l s i m i l a r _ i n p u t ( ) : C h e c k s t o s e e i f t h e c u r r e n t a n d p r e v i o u s i n p u t a r e s i m i l a r , t w o i n p u t s a r e c o n s i d e r e d s i m i l a r i f o n e o f t h e m i s t h e s u b s t r i n g o f t h e o t h e r o n e ( e . g . : h o w a r e y o u a n d h o w a r e y o u d o i n g w o u l d b e c o n s i d e r e d s i m i l a r b e c a u s e h o w a r e y o u i s a s u b s t r i n g o f h o w a r e y o u d o i n g . v o i d g e t _ i n p u t ( ) : G e t s i n p u t s f r o m t h e u s e r . v o i d r e s p o n d ( ) : h a n d l e s a l l r e s p o n s e s o f t h e c h a t r o b o t w h e t h e r i t i s f o r e v e n t s o r s i m p l y t h e c u r r e n t u s e r i n p u t . S o , b a s i c a l l y , t h e s e f u n c t i o n c o n t r o l s t h e b e h a v i o u r o f t h e p r o g r a m . f i n d _ m a t c h ( ) : F i n d s r e s p o n s e s f o r t h e c u r r e n t i n p u t . v o i d h a n d l e _ r e p e t i t i o n ( ) : H a n d l e s r e p e t i t i o n s m a d e b y t h e p r o g r a m . h a n d l e _ u s e r _ r e p e t i t i o n ( ) : H a n d l e s r e p e t i t i o n s m a d e b y t h e u s e r . v o i d h a n d l e _ e v e n t ( s t d : : s t r i n g s t r ) : T h i s f u n c t i o n h a n d l e s e v e n t s i n g e n e r a l .
Youcanclearlyseethat"chatterbot5"havemuchmorefunctionalitiesthan"chatterbot4"andalsoeach functionalitiesisencapsulatedintomethods(functions)oftheclassC B o t butstilltherearealotmore improvementstobemadeonittoo. Chattebot5introducetheconceptof"state",inthesenewversionoftheChatterbot,weassociatea different"state"tosomeoftheeventsthatcanoccurduringaconversation.Ex:whentheuserentersa nullinput,thechatterbotwouldsetitselfintothe"N U L L I N P U T * * "state,whentheuserrepeatthe samesentence,itwouldgointothe"REPETITIONT1**"state,etc. Alsothesenewchatterbotusesabiggerdatabasethanthepreviouschatbotthatwehaveseensofar: chatterbot1,chatterbot2,chatterbot3...Butstill,thisisquietinsignificantduetothefactthatmost chatterbotsinusetoday(theverypopularones)haveadatabaseofatleast10000linesormore.So, thiswoulddefinitelybeoneofthemajorgoalthatwemighttrytoachieveintothenextversionsofthe chatterbot. Buthoweverfornow,wewillconcentratealittleproblemconcerningthecurrentchatterbot.
Whatexactlywouldbethisproblem?
Well,it'sallaboutkeywordboundaries,supposethatuserentersthesentence:"Ithinknot"duringa conversationwiththechatbot,naturallytheprogramwouldlookintohisdatabaseforakeywordthat wouldmatchthesentence,anditmightfoundthekeyword:"Hi",whichisalsoasubstringoftheword "think",clearlythisisanunwantedbehaviour.
Howdoweavoidit?
Simplybyputtingaspacecharacterbeforeandafterthekeywordsthatcanbefoundinsidethe databaseorwecansimplyapplythechangesduringthematchingprocessinsidethe"f i n d _ m a t c h ( ) function".
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
8/15
12/31/13
Arethereotherthingsthatwecanimprovein "Chatterbot5"?
Certainlythereis.SofartheChatbotstarta"chattingsession"withtheuserswithoutsayinganythingat thebeginningoftheconversations.Itwouldbegoodifthechatterbotcouldsayanythingatalltostartup theconversations.Thiscaneasilybeachievedbyintroducing"signonmessages"intotheprogram.We cansimplydothisbycreatinganewstateinsidetheChatbot"knowledgebase"andbyaddingsome appropriatemessagethatlinkstoit.Thatnewstatecouldbecall"SIGNON**". DownloadChatterbot6
Introducingtheconceptof"KeywordRanking"
Asyoucansee,oneachnewversionofthechatterbot,weareprogressivelyaddingnewfeaturesin ordertomaketheChabotmorerealistic.Now,inthesesection,wearegoingtointroducetheconcept of'keywordranking'intotheChatterbot.Keywordrankingisawayfortheprogramtoselectthebest keywordsinhisdatabasewhentherearemorethanonekeywordthatmatchtheusersinputs.Ex:ifwe havethecurrentuserinput:Whatisyournameagain,bylookingintohisdatabase,theChatbotwould havealistoftwokeywordsthatmatchthisinput:'WHAT'and'WHATISYOURNAME'.Whichoneisthe best?Well,theanswerisquietsimple,itisobviously:'Whatisyourname'simplybecauseitisthe longestkeyword.Thesenewfeaturehasbeenimplementedinthenewversionoftheprogram: Chatterbot7. DownloadChatterbot7
Equivalentkeywords
WithinallthepreviousChatterbotstherecordforthedatabasealoudustouseonlyonekeywordfor eachsetofresponsesbutsometimesitcouldbeUsefultohavemorethanonekeywordassociatedto eachsetofresponses.Speciallywhenthesekeywordshavethesamemeaning.E.g.:Whatisyour nameandCanyoupleasetellmeyournamehavebothhadthesamemeaning?Sotherewouldbeno needtousedifferentrecordsforthesekeywordsinsteadwecanjustmodifytherecordstructuresothat italoudustohavemorethanonekeywordperrecords.DownloadChatterbot8
Keywordtranspositionandtemplateresponse
Oneofthewellknownmechanismsofchatterbotsisthecapacitytoreformulatetheuser'sinputby doingsomebasicverbconjugation.Example,iftheuserenters:YOUAREAMACHINE,thechatterbot mightrespond:So,youthinkthatI'mamachine. Howdidwearriveatthistransformation?Wemayhavedoneitbyusingtwosteps: Wemakesurethatthechatterbothavealistofresponsetemplatesthatislinkedtothe correspondingkeywords.Responsestemplatesareasortofskeletontobuildnewresponsesfor thechatterbot.usuallyweusedwildcardsintheresponsestoindicatethatitisatemplate.Onthe previousexample,wehaveusedthetemplate:(so,youthinkthat*)toconstructourresponse. Duringthereassemblyprocess,wesimplyreplacethewildcardbysomepartoftheoriginalinput. Inthatsameexample,wehaveused:Youareamachine,whichisactuallythecompleteoriginal inputfromtheuser.Afterreplacingthewildcardbytheuser'sinput,wehavethefollowing sentence:So,youthinkthatyouareamachinebutwecannotusethesesentenceasitis,
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
9/15
12/31/13
Keywordlocationconcept
Somekeywordscanbelocatedanywhereinagiveninput,someotherscanonlybefoundinonlysome specificplacesintheuser'sinputotherwiseitwouldn'tmakeanysense.Akeywordlike:"Whoareyou" canbefoundanywhereontheuser'sinputwithoutcreatinganyproblemswiththemeaningofit. Someexamplesofsentencesusing"WHOAREYOU"wouldbe: 1. Whoareyou? 2. Bytheway,whoareyou? 3. Sotellme,whoareyouexactly? Butakeywordsuchas"whois"canonlybefoundatthebeginningorinthemiddleofagivensentence butitcannotbefoundatendofthesentenceoralone. Examplesofsentencesusingthekeyword:"whois": 1. Whoisyourfavoritesinger? 2. Doyouknowwhoisthegreatestmathematicianofalltime? 3. Tellme,doyouknowwhois?(thisclearlydoesn'tmakeanysense) Howdowemakesurethatthechatterbotwillbeabletodistinguishsuchkeywordsandthespecific placesweretheyarealoudtobefoundonasentence?Wewillsimplyintroducesomenewnotations forkeywords: 1. Keywordsthatcanonlybefoundatthebeginningorinthemiddleofasentencewillbe representedby:_ K E Y W O R D (Ex:_WHOIS) 2. Keywordsthatcanonlybefoundatendorinthemiddleofasentencewillbedenotedby: K E Y W O R D _ (WHATAREYOU_) 3. Keywordsthatshouldonlybefoundaloneinasentencewillberepresentedby:_ K E Y W O R D _ (Ex: _WHAT) 4. Andfinally,keywordsthatcanbefoundanywhereinasentenceorevenalonewouldbesimply representedby:K E Y W O R D (Ex:IUNDERSTAND) Akeywordcanhavedifferentmeaningsdependingonit'spositioninagivensentence. DownloadChatterbot10
HandlingContext
ContextaiswayfortheChatterbottokeepintractwhatithassaidpreviouslyandbeingabletotake
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
10/15
12/31/13
thisintoaccountwhenselectinghisnextresponse.Sofar,everyresponseselectedbytheChatbot duringaconversationischosenonlybasedonthecurrentuser'sinput.Butsometimes,wemightneed moredatainordertobeabletorespondproperlytoagiveninput,that'swhenweneedtousecontext. Toillustratetheseconcept,wearegoingtolookatthefollowingconversationlog: USER:Whatisyourfavoritemovie? CHATTERBOT:ITISTERMINATORII. USER:Whydoyoulikethismovie?(Nowhowarewesupposedtoanswerthatquestionifweknew nothingaboutthepreviousresponseoftheChatbot?) Soclearly,someinputsrequirestheusageof"context"inordertobeabletoformulateacorrect answer.Inthepreviousexample,itwouldsimplybe:ITISTERMINATORII.NowtheBotknowswhatit wastalkingaboutpreviously,itcanmoreeasilyformulateagoodanswertotheuser'sinput. Wecannowcontinuethepreviousconversationlog: (Context:ITISTERMINATORII) CHATTERBOT:BECAUSEITISASCIENCEFICTIONMOVIEANDILOVESCIENCEFICTION. ContextalsoaloudustocontrolimproperreactionfromtheChatbot.Example,iftheuserentersthe sentence:"Whydoyoulikethesemovie?"duringaconversationwithouttheChatterboteventalking aboutthesesubject.Itcouldsimplyrespondbysaying:WHATAREYOUTALKINGABOUT? ThecontextfeaturehasbeenimplementedinChatterbot11. DownloadChatterbot11 AnothergreatfeaturethatwouldbeveryinterestingtoimplementintoaChatterbotisthecapacityto anticipatethenextresponseoftheuser,thesewouldmaketheChatbotlooksevenmoresmarter duringaconversation.
UsingTextToSpeech
Wouldn'titbegreatifyourcomputercouldspeakbacktoyouwheneveryouorderittodosomething, we'veaccomplishjustthatin"Chatterbot12"thelatestversionoftheprogram.Nowtheprogramcan speakouteveryanswerthatishasselectedafterexaminingtheuser'sinput.TheSAPIlibraryfrom Microsoftwasusedinordertoaddthe"TextToSpeech"featurewithintheprogram.Forthe implementationpart,threenewfunctionswereaddedtotheprogramtoimplementthe"TextTo Speech"functionality:I n i t i a l i z e _ T T S _ E n g i n e ( ) ,s p e a k ( c o n s t s t d : : s t r i n g t e x t ) , R e l e a s e _ T T S _ E n g i n e ( ) .
I n i t i a l i z e _ T T S _ E n g i n e ( ) :Thesefunctionasthenamesuggestinitializedthe"TextTo
SpeechEngine"thatis,wefirststartbyinitializingthe"COMobjects"sinceSAPIisbuildontopof theATLlibrary.Iftheinitializationwassuccessful,wethencreateaninstanceoftheI S p V o i c e objectthatcontrolledthe"TextToSpeech"mechanismwithintheSAPIlibrarybyusingthe C o C r e a t e I n s t a n c e function.Ifthatalsowassuccessful,itmeansthatour"TextToSpeech Engine"wasinitializedproperlyandwearenowreadyforthenextstage:speakoutthe "responsestring" s p e a k ( c o n s t s t d : : s t r i n g t e x t ) :So,thisisthemainfunctionthatisusedfor implementing"TextToSpeech"withintheprogram,itbasicallytakesthe"responsestring" convertedtowidecharacters(W C H A R )andthenpassittothe"Speakmethod"ofthe "I S p V o i c e "objectwhichthenspeakoutthe"bot'sresponse". R e l e a s e _ T T S _ E n g i n e ( ) :Oncewearedoneusingthe"SAPITextToSpeechEngine",we justreleasealltheresourcesthathasbeenallocatedduringtheprocedure.
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
11/15
12/31/13
DownloadChatterbot12
Usingaflatfiletostorethedatabase
Sofarthe,databasewasalwaysbuiltintotheprogramwhichmeanswheneveryoumodifiedthe database,youwouldalsohavetorecompiletheprogram.Thisisnotreallyconvenientbecauseitmight happensometimesthatweonlywanttoeditthedatabaseandkeeptherestoftheprogramasitis.For thesereasonandmanyothers,itcouldbeagoodthingtohaveaseparatefiletostorethedatabase whichthengivesusthecapabilityofjusteditingthedatabasewithouthavingtorecompileallthefilesin theprogram.Tostorethedatabasewecouldbasicallyuseasimpletextfilewithsomespecific notationstodistinguishthedifferentelementsofthedatabase(keywords,response,transpositions, context...).Inthecurrentprogram,wewillusethefollowingnotationsthathasbeenusedbeforesome implementationoftheElizachatbotinPascal. 1. Linesthatstartsby"K"inthedatabasewillrepresentkeywords. 2. Linesthatstartsby"R"willrepresentresponses 3. Linesthatstartsby"S"willrepresentsignonmessages 4. Linesthatstartsby"T"willrepresenttranspositions 5. Linesthatstartsby"E"willrepresentpossiblecorrectionscanbemadeaftertransposingthe user'sinput 6. Linesthatstartsby"N"willrepresentresponsesforemptyinputfromtheuser 7. Linesthatstartsby"X"willrepresentresponsesforwhenthatchatbotdidnotfindanymatching keywordthatmatchthecurrentuserinput. 8. Linesthatstartsby"W"willrepresentresponsesforwhentheuserrepeatitself. 9. Linesthatstartsby"C"willrepresentthecontextofthechatbot'scurrentresponse. 10. Linesthatstartsby"#"willrepresentcomments Wenowhaveacompletearchitectureforthedatabase,wejustneedtoimplementthesesfeaturesinto thenextversionofthechatbot(Chatterbot13). DownloadChatterbot13
Abetterrepetitionhandlingalgorithm
Inanefforttopreventthechatbotfromrepeatingitselftoomuch,previouslywehaveuseaverybasic andsimplealgorithmthatconsistofcomparingthecurrentchatbot'sresponsetothepreviousone.If thecurrentresponseselectionisequaltothepreviousone,wesimplydiscardthatresponseandlook overforthenextresponsecandidateonthelistofavailableresponses.Thisalgorithmisveryefficient whenitcomestocontrolimmediaterepetitionsfromthechatbot.However,it'snotthatgoodtoavoid morelongtermrepetition.Duringachattingsession,thesameresponsecanoccursmanytimes.With thenewalgorithm,wecontrolhowlongittakesforthechatbottoreselectthesameresponse.Actually wemakesurethatithasuseallavailableresponseforthecorrespondingkeywordbeforeitcanrepeat thesameresponse.Thisisinturncanimprovethequalityoftheconversationexchanges.Hereisa decryptiononhowthealgorithmworks:Duringtheconversationbetweenthechatbotandtheuser,we makealistofalltheresponsespreviouslyselectedbythechatrobot.Whenselectinganewresponse, wemakeasearchofthencurrentselectedresponseinsidetheliststartingfromtheend.Ifthecurrent responsecandidatewasfoundduringthatsearchwithinthelist,wethenmakeacomparisonofthat positionthetotalnumberofavailableresponses.ifthepositionplusoneisinferiortothetotalof availableresponses,weconsiderthatitisarepetition,sowehavetodiscardthecurrentresponseand selectanotherone. DownloadChatterbot14
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
12/15
12/31/13
Updatingthedatabasewithnewkeywords
Sometimes,whenitcomestoaddnewkeywordstothedatabase,itcouldbedifficulttochoosethose thatarereallyrelevant.However,thereisaverysimplesolutiontothatproblem.Whenchatingwiththe chatrobot,wejustmakesurethatwestoretheuser'sinputinafile(ex:unknown.txt)eachtimethe chatbotwasnotabletofindanymatchingkeywordforthecurrentinput.Lateron,whenweneedto makesomekeywordsupdatesinthedatabase,wejusthavetotakealookatthefilethatwe'veuseto savetheunkownsentencesfoundearlierduringthepreviousconversations.Bycontinuouslyadding newkeywordsusingtheseprocedure,wecouldcreateaverygooddatabase. DownloadChatterbot15
SavingtheConversationLogs
Whysavingtheconversationsbetweentheusersandthechatbot?Becauseitcouldhelpusfindthe weaknessofthechatbotduringagivenconversation.Wemightthendecideonwhichmodificationsto maketothedatabaseinordertomakethefutureconversationsexchangesmorenatural.Wecould basicallysavethetimeandalsothedatetohelpusdeterminetheprogressofthechatbotafternew updateswereappliedtoit.Savingthelogshelpsusdeterminehowhumanlikeistheconversationskill ofthechatbot. DownloadChatterbot16
LearningCapability
Sofar,thechatbotwasnotabletolearnnewdatafromtheuserswhilechatting,itwouldbeveryuseful tohavethisfeaturewithinthechatbot.Itbasicallymeansthatwheneverthechatbotencountersan inputthathasnocorrespondingkeyword,itwouldprompttheuseraboutit.Andinreturntheuser wouldbeabletoaddanewkeywordandthecorrespondingresponsetoitinthedatabaseofthechat robot,doingsocanimprovethedatabaseofthechabotverysignificantly.Hereishowthealgorithm shouldgo: 1. NOKEYWORDWASFOUNDFORTHISINPUT,PLEASEENTERAKEYWORD 2. SOTHEKEYWORDIS:(key) 3. (ifresponseisno)PLEASEREENTERTHEKEYWORD(gobacktostep#2) 4. NORESPONSEWASFOUNDFORTHISKEYWORD:(key),PLEASEENTERARESPONSE 5. SO,THERESPONSEIS:(resp) 6. (ifresponseisno)PLEASEREENTERTHERESPONSE(gobacktostep#4) 7. KEYWORDANDRESPONSELEARNEDSUCCESSFULLY 8. ISTHEREANYOTHERKEYWORDTHATISHOULDLEARN 9. (ifresponseisyes,otherwisecontinuechating ):PLEASEENTERTHEKEYWORD(goback tostep#2) Returntobeginningofthedocument Checktheaiprogramming.blogspot.comwebpageforthelatestupdates
License
Thisarticle,alongwithanyassociatedsourcecodeandfiles,islicensedunderTheCodeProjectOpen License(CPOL)
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
13/15
12/31/13
AbouttheAuthor
GonzalesCenelia
Helpdesk/SupportGexelTelecom Canada
IhavebeenprogramminginCandC++formorethanfouryears,thefirsttimethatihadlearn programmingwasin1999incollege.Howeveritwasonlybytheyear2000whenihavebuy myfirstcomputerthatihadtrulystartedtodosomemoreinterestingthingsinprogramming. Asaprogrammer,mymaininterestisA.Iprogramming.Soi'mreallycaptivatedbyallthatis relatedtoN.L.U(NaturalLanguageUnderstanding),N.L.P(NaturalLanguageProcessing), ArtificialNeuralNetworksetc.Currentlyi'mlearningtoprograminPrologandLisp.Also,i'm reallyfascinatedwiththeoriginalchatterbotprogramnamed:Eliza,thatprogramwaswroteby JosephWeizenbaum.Everytimeirunthisprogram,itmakesmereallythinkthatA.Icouldbe solveoneday.AlotofinterestingstuffhasbeenaccomplishinthedomainofArtificial Intelligenceinthepastyears.Averygoodexampleofthoseaccomplishmentsis:Logic Programming,whichmakesitpossibletomanipulatelogicstatementsandalsotomakesome inferencesaboutthosestatements.Aclassicalexamplewouldbe:giventhefactthat"Every manismortal"andthatSocratesisaman,thanlogicallywecandeducethatSocratesis mortal.SuchsimplelogicalstatementscanbewroteinPrologbyusingjustafewlinesofcode: prologcodesample: mortal(X):man(X).%rule man(socrates).%declaringafact theprecedingprologrulecanberead:foreveryvariableX,ifXisamanthanXismortal.these lastPrologcodesamplecanbeeasilyextentedbyaddingmorefactsorrules,example: mortal(X):man(X).%rule mortal(X):woman(X).%rule man(socrates).%fact1 man(adam).%fact2 woman(eve).%fact3 formore,check:aiprogramming.blogspot.com
CommentsandDiscussions
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print 14/15
12/31/13
www.codeproject.com/Articles/36106/Chatbot-Tutorial?display=Print
15/15