Challenging Domain in Reinforcement Learning for Machine Learning Research

International Journal of Advance Foundation and Research in Computer (IJAFRC)
Volume 1, Issue 7, July 2014. ISSN 2348 - 4853

© 2014, IJAFRC All Rights Reserved www.ijafrc.org
Anil Kumar Yadav (1), A. K. Sachan (2)
(1) PhD Scholar, Department of CSE, IFTM University (U.P.), India
(2) Director, RITS, Bhopal (M.P.), India
aky125@gmail.com (1), sachanak.12@yahoo.com (2)

ABSTRACT
Machine learning is not just a database problem; it is also a part of artificial intelligence. To be intelligent, a system in a changing environment should have the ability to learn. Machine learning plays an important role in Bayesian decision theory, reinforcement learning, supervised learning, decision trees, clustering, hidden Markov models, and combining multiple learners. Machine learning means programming computers to optimize performance using example data or past experience. It also helps us find solutions to many problems in vision, speech recognition, and robotics. In this paper, we examine a challenging issue in reinforcement learning: we cannot build a generalized environment for a self-learning system. Another known issue is that uncertainty cannot be removed permanently in artificial intelligence. The objective is to find a possible solution within a complex problem, or a solution near to the goal. In this study we present the major issues in RL and other learning domains along with their complexity, advantages, and disadvantages. We also describe future research and applications of machine learning systems.

Index Terms: Machine learning, environment, reinforcement learning, artificial intelligence

I. INTRODUCTION

Machine learning is motivated by the fact that machines now have the ability to store and process large amounts of data, as well as to access it from physically separate locations over a computer network. Most data acquisition devices are now digital and record reliable data. Consider, for example, the Walmart chain, which has hundreds of stores all over the country selling thousands of goods to millions of customers. Managing this manually is very difficult, so systems record the details of each transaction: date, customer identification code, goods bought and their amount, and total money spent. This requires gigabytes of data per day. The stored data becomes useful only when it is analyzed and turned into information that we can make use of, for example, to make predictions. We do not know in advance which person will buy a particular product. We may not be able to identify the process completely, but we believe we can construct a good and useful approximation. That approximation may not explain everything, but it may still account for some part of the data. Although identifying the complete process may not be possible, we can still detect certain patterns or regularities. This is the importance of machine learning [1].

Natural language parsing illustrates the same point. The main problem there is to select the most plausible syntactic analysis from the often thousands of possible analyses; probabilities are used to order the analyses or to generate only the most probable parse(s) [5,6]. However, not all natural language processing (NLP) applications require a complete syntactic analysis. A full parse often provides more information than needed, and sometimes less. For example, in Information Retrieval it may be enough to find simple NPs (Noun Phrases) and VPs (Verb Phrases). In Information Extraction, Summary Generation, and Question Answering, we are interested especially in information about specific syntactic-semantic relations such as agent, object, location, time, etc. (basically, who did what to whom, when, where, and why), rather than in elaborate configurational syntactic analyses.
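To make the shallow-parsing idea concrete, the following is a minimal sketch of finding simple noun phrases without a full parse. It assumes the input is already part-of-speech tagged and uses an illustrative pattern (an optional determiner, any number of adjectives, then one or more nouns). The tag names, the pattern, and the function name `np_chunks` are assumptions made for this example and are not taken from the cited work.

```python
import re

# Assumed coarse tag set, mapped to one-letter codes so that a regular
# expression can scan the tag sequence. Any other tag becomes "O".
TAG_CODES = {"DET": "D", "ADJ": "A", "NOUN": "N"}

# Illustrative noun-phrase pattern: optional determiner, adjectives, noun(s).
NP_PATTERN = re.compile(r"D?A*N+")


def np_chunks(tagged):
    """Return noun-phrase chunks from a list of (word, tag) pairs."""
    codes = "".join(TAG_CODES.get(tag, "O") for _, tag in tagged)
    chunks = []
    for match in NP_PATTERN.finditer(codes):
        # One code per token, so match offsets are token indices.
        words = [word for word, _ in tagged[match.start():match.end()]]
        chunks.append(" ".join(words))
    return chunks


sentence = [
    ("the", "DET"), ("old", "ADJ"), ("dog", "NOUN"),
    ("chased", "VERB"), ("a", "DET"), ("cat", "NOUN"),
]
print(np_chunks(sentence))
```

For the sample sentence this yields the two chunks "the old dog" and "a cat", which is the kind of partial structure an Information Retrieval system might need without computing a complete parse tree.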

II. RELATED WORK

Here we briefly summarize the related work.

A. Large Look-Up-Table

In [1, 2], a major issue identified in machine learning techniques, especially in query-based self-learning, is that the learner (agent) requires a large amount of training input over many execution cycles; that is, it requires a large look-up table. Another important problem associated with RL concerns an agent that travels in a virtual world (called a grid world): how can it acquire knowledge efficiently through trial-and-error interaction with its environment, behave intelligently, and reach the goal along the shortest decision path? At each time step, each agent randomly selects one of four actions to perform (move Up, move Down, move Left, or move Right), without communicating with the other agents.
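The look-up-table problem can be sketched with a small tabular learner. The example below uses standard Q-learning with epsilon-greedy action selection, which is a swapped-in technique chosen for illustration and not necessarily the method of the cited papers. The reward values, the hyperparameters, the zero-based coordinates, and the function names `step` and `train` are all assumptions.

```python
import random

# The four moves named in the text, as (row, column) offsets.
ACTIONS = {"up": (-1, 0), "down": (1, 0), "left": (0, -1), "right": (0, 1)}


def step(state, action, size, goal):
    """Apply one move; bumping into a wall leaves the agent in place."""
    d_row, d_col = ACTIONS[action]
    row = min(max(state[0] + d_row, 0), size - 1)
    col = min(max(state[1] + d_col, 0), size - 1)
    nxt = (row, col)
    reward = 1.0 if nxt == goal else -0.01  # assumed reward scheme
    return nxt, reward, nxt == goal


def train(size=10, episodes=500, alpha=0.5, gamma=0.9, epsilon=0.1, seed=0):
    """Tabular Q-learning; returns the look-up table as a dict."""
    rng = random.Random(seed)
    goal = (size - 1, size - 1)
    names = list(ACTIONS)
    # The look-up table: one entry per (state, action) pair.
    q = {((r, c), a): 0.0
         for r in range(size) for c in range(size) for a in names}
    for _ in range(episodes):
        state = (0, 0)
        for _ in range(4 * size * size):  # cap on steps per episode
            if rng.random() < epsilon:
                action = rng.choice(names)  # explore: random move
            else:
                action = max(names, key=lambda a: q[(state, a)])  # exploit
            nxt, reward, done = step(state, action, size, goal)
            best_next = 0.0 if done else max(q[(nxt, a)] for a in names)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
            state = nxt
            if done:
                break
    return q
```

The table holds size x size x 4 entries, so a 10x10 grid needs 400 values while a 100x100 grid already needs 40,000. This growth is the memory cost that the text refers to as a large look-up table.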

B. Learning Agent Framework

In [2], a weakness of the learning agent is that it cannot classify data sets before learning, although classification improves the agent's learning rate and gives the agent reward sooner. The agent therefore requires an NN classifier to classify the data before training, as shown in Figure 1.

Figure 1. Learning Agent Model

C. Generalization Of Environment

In [3, 9], the challenging problem examined is how to generalize the environment, because the learner is environment dependent. Each learner (agent) must learn from its environment, which takes the form of a grid world, a maze problem, etc. An agent trained on a 10x10 environment cannot generalize to another environment, as shown in Figures 2 and 3. The same holds for humans: everybody has to learn from their own environment.

Figure 2. 10x10 Grid World Problem With Agent Training Model

Figure 3. A 10x10 maze problem showing how the agent travels the shortest route during 50 trials over 500 episodes. The agent starts from (1, 1) and reaches the goal state at (C, C) without a priori information.
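The dependence on the training environment can be illustrated with a small sketch. It assumes the agent's knowledge is a table keyed by grid coordinates, which the source does not state explicitly. The helper names `state_space` and `coverage` are hypothetical.

```python
def state_space(size):
    """All (row, column) states of a square grid with the given side."""
    return {(row, col) for row in range(size) for col in range(size)}


def coverage(trained_size, new_size):
    """Fraction of the new grid's states that have a table entry."""
    known = state_space(trained_size)
    new = state_space(new_size)
    return len(known & new) / len(new)


# A table learned on a 10x10 grid covers only a quarter of a 20x20 grid.
print(coverage(10, 20))
```

Even for the states that are covered, the stored values point toward the goal of the original grid, so a table-based agent would typically need retraining when the environment changes.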

D. Learning Rules And Their Exceptions

The work in [7, 8] presents a top-down rule induction system for learning linguistic structures. The initial system is enhanced with additional mechanisms to deal with noisy data. Two types of difficulty arise: significant noise in the data, and the presence of linguistically motivated exceptions. Because linguistically motivated exceptions do occur, they cannot be treated as noise. To address these problems, the first improvement is a refinement mechanism that learns exceptions for each rule that is learned. The second improvement introduces linguistically motivated prior knowledge to improve the efficiency and accuracy of the system. The refinement mechanism is based on the assumption that there is some regularity to the errors in the data; thus, by systematically searching for exceptions, the rule induction system is improved [10]. With the use of prior knowledge, the context of only one element need be taken into account, and the search space is reduced, resulting in a significant reduction in learning time.
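The idea of learning exceptions for each rule can be sketched as follows. This is a toy illustration and not the system described in the cited work: the `Rule` class, the single-element context (the tag of the preceding token), and the `min_count` threshold used to separate regular errors from noise are all assumptions.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Rule:
    """Label a word by the tag of the preceding token, unless excepted."""
    context: str                                    # tag of the previous token
    label: str                                      # label the rule assigns
    exceptions: dict = field(default_factory=dict)  # word -> overriding label

    def apply(self, prev_tag, word):
        if prev_tag != self.context:
            return None  # the rule does not fire in this context
        return self.exceptions.get(word, self.label)


def learn_exceptions(rule, examples, min_count=2):
    """Promote regular errors to exceptions; rare errors count as noise."""
    errors = Counter(
        (word, gold)
        for prev_tag, word, gold in examples
        if prev_tag == rule.context and gold != rule.label
    )
    for (word, gold), count in errors.items():
        if count >= min_count:
            rule.exceptions[word] = gold
    return rule


rule = Rule(context="DET", label="NOUN")
examples = [
    ("DET", "dog", "NOUN"),
    ("DET", "old", "ADJ"), ("DET", "old", "ADJ"),  # regular error
    ("DET", "cat", "VERB"),                        # isolated error: noise
]
learn_exceptions(rule, examples)
print(rule.exceptions)
```

Here "old" is mislabeled twice in the same way and becomes an exception, while the single odd label on "cat" is ignored. Because the rule looks at only one context element, the search for exceptions stays small, which mirrors the reduction in search space described above.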

III. APPLICATION OF MACHINE LEARNING

The application of machine learning methods to large databases is called data mining. In data mining, a large volume of data is processed to construct a simple model with valuable use, for example, one having high predictive accuracy [1,4]. In addition to retail, in finance banks analyze their past data to build models for use in credit applications, fraud detection, and the stock market.

In manufacturing, learning models are used for optimization, control, and troubleshooting. In medicine, learning programs are used for medical diagnosis. In telecommunications, call patterns are analyzed for network optimization and maximizing the quality of service. In science, large amounts of data in physics, astronomy, and biology can only be analyzed fast enough by computers.

IV. CONCLUSION

This paper focuses on a few points. Data memorization is an important consideration for agent training in machine learning, especially in reinforcement learning. In all the learning approaches considered, the agent acquires a large look-up table during the training phase and therefore holds a huge amount of data for learning. We studied reinforcement learning problems addressed by different researchers and summarized the related work in Section II. Further research challenges and the constraints of designing a decision classifier were also elaborated. Finally, we discussed some future research directions, in the hope that this work will help researchers who are working on improving look-up tables and environment generalization for machine intelligence systems.

V. REFERENCES

[1] Ethem Alpaydin, "Introduction to Machine Learning", MIT Press, Cambridge, 2005.

[2] Anil Kumar Yadav and Shailendra Kumar Shrivastav, "Evaluation of Reinforcement Learning Techniques", ACM, vol. 132, 2010, pp. 88-92 (ISBN: 978-1-4503-0408-5).

[3] Hitoshi Iima and Yasuaki Kuroe, "Swarm Reinforcement Learning Algorithms Based on Sarsa Method", IEEE, 2008, pp. 2045-2049.

[4] R. Caruana and D. Freitag, "Greedy attribute selection", in Proceedings of the Eleventh International Conference on Machine Learning, New Brunswick, NJ, USA, Morgan Kaufmann, 1994, pp. 28-36.

[5] H. Dejean, "Learning rules and their exceptions", Journal of Machine Learning Research, 2002.

[6] E. Brill, "Some advances in rule-based part of speech tagging", in Proceedings of the 12th National Conference on Artificial Intelligence (AAAI-94), Seattle, Washington, 1994.

[7] H. Dejean, "Learning rules and their exceptions", Journal of Machine Learning Research, 2002.

[8] E. F. Tjong Kim Sang, "Memory-based shallow parsing", Journal of Machine Learning Research, 2002.

[9] T. Zhang, F. Damerau, and D. Johnson, "Text chunking based on a generalization of Winnow", Journal of Machine Learning Research, 2002.

[10] Keita Hamahata, Tadahiro Taniguchi, et al., "Effective integration of imitation learning and reinforcement learning by generating internal reward", Eighth International Conference on Intelligent Systems Design and Applications, IEEE, 2008, pp. 121-126.
