
COL733 CLOUD COMPUTING

TECHNO.FUNDA.

Guided by Prof. SC Gupta

Assignment 4

MapReduce

Group 11

Abhijit Kawale (2012CS50273)
Prerit Patidar (2012CS50294)
Arvind Bhuria (2012CS50280)
Saurabh Nakra (2012CS50298)
Bhavesh Chauhan (2012CS10221)

Installation Steps:
MapReduce was installed with Hadoop in the previous assignment.
The files that needed to be changed were mapred-site.xml and yarn-site.xml.
mapred-site.xml contains the host and port for the MapReduce job tracker, and
yarn-site.xml contains the properties for the node to work as a YARN node.

Part A: Large text file word count

Running MapReduce:
To run the WordCount program, the code was written in WordCount.java and uploaded to
the master VM.
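
The map and reduce phases of a word count can be sketched in plain Java, without any Hadoop dependency. This is an illustrative sketch, not the WordCount.java from the submission: the class name WordCountSketch and its methods map, reduce and run are hypothetical, and the in-memory grouping step stands in for the shuffle that Hadoop performs between the two phases.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.StringTokenizer;
import java.util.TreeMap;

// Hypothetical stand-in for a word count job; it runs in one JVM without Hadoop.
class WordCountSketch {

    // Map phase: emit a (word, 1) pair for every whitespace-separated token in a line.
    static List<Map.Entry<String, Integer>> map(String line) {
        List<Map.Entry<String, Integer>> pairs = new ArrayList<Map.Entry<String, Integer>>();
        StringTokenizer tokens = new StringTokenizer(line);
        while (tokens.hasMoreTokens()) {
            pairs.add(new AbstractMap.SimpleEntry<String, Integer>(tokens.nextToken(), 1));
        }
        return pairs;
    }

    // Reduce phase: sum all the counts that were emitted for one word.
    static int reduce(String word, List<Integer> counts) {
        int sum = 0;
        for (int c : counts) {
            sum += c;
        }
        return sum;
    }

    // Driver: map every line, group the values by key (the "shuffle"), then reduce each group.
    static Map<String, Integer> run(List<String> lines) {
        Map<String, List<Integer>> grouped = new TreeMap<String, List<Integer>>();
        for (String line : lines) {
            for (Map.Entry<String, Integer> pair : map(line)) {
                if (!grouped.containsKey(pair.getKey())) {
                    grouped.put(pair.getKey(), new ArrayList<Integer>());
                }
                grouped.get(pair.getKey()).add(pair.getValue());
            }
        }
        Map<String, Integer> result = new TreeMap<String, Integer>();
        for (Map.Entry<String, List<Integer>> e : grouped.entrySet()) {
            result.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return result;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "sample file here are the random words",
            "sample file here are the random words");
        System.out.println(run(lines));
    }
}
```

In a real Hadoop job the same two functions would live in Mapper and Reducer subclasses, and the framework would handle the grouping and the distribution across nodes.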

Step 1
A directory named units was created in the home folder to store all the .class files.
Command: mkdir units

Step 2
hadoop-core.jar was needed for the word count program to compile and execute. So,
hadoop-core-1.2.1.jar was downloaded to the master VM.
Command: wget http://mvnrepository.com/artifact/org.apache.hadoop/hadoop-core/1.2.1

Step 3
The following commands are used for compiling the WordCount.java program and
creating a jar for the program.
javac -classpath hadoop-core-1.2.1.jar -d units WordCount.java
jar -cvf units.jar -C units/ .

Step 4
The following command is used to create an input directory in HDFS.
$HADOOP_HOME/bin/hadoop fs -mkdir /user/hadoop/input_dir

Step 5
The following command is used to copy the input file named sample.txt into the input
directory of HDFS.
$HADOOP_HOME/bin/hadoop fs -put /home/hduser/sample.txt /user/hadoop/input_dir

Step 6
After this, the application was run using:
$HADOOP_HOME/bin/hadoop jar units.jar WordCount /user/hadoop/input_dir /user/hadoop/output_dir

Step 7
The following command is used to see the output in the part-00000 file. This file is
written to HDFS by the MapReduce job.
$HADOOP_HOME/bin/hadoop fs -cat /user/hadoop/output_dir/part-00000

Problems Encountered:
When running the application, we got the following error:

Got exception: java.net.ConnectException: Call From
baadaldesktopvm/127.0.1.1 to
baadalservervm.cse.iitd.ernet.in:54310 failed on connection
exception: java.net.ConnectException: Connection refused

We figured out that the problem was in the /etc/hostname file, which contains the VM
hostname. It was baadaldesktopvm by default, but it should have been the same as the
master or slave name added to /etc/hosts. So, we changed /etc/hostname in each of the
VMs and set the hostname to master for the master VM, slave1 for slave1, slave2 for
slave2 and slave3 for slave3. Finally, the problem was fixed and we ran the word count
application successfully.
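
A quick way to see what a VM's own hostname resolves to is a small Java check like the one below. It is only a diagnostic sketch and was not part of the submission; the class name HostnameCheck and the helper isLoopback are hypothetical. The error above shows the local host as 127.0.1.1, which is a loopback address, so the check flags that case.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;

// Hypothetical diagnostic: prints the local hostname and the address it resolves to.
class HostnameCheck {

    // True when the given IP literal is a loopback address (127.0.0.0/8 for IPv4).
    static boolean isLoopback(String ipLiteral) throws UnknownHostException {
        return InetAddress.getByName(ipLiteral).isLoopbackAddress();
    }

    public static void main(String[] args) throws UnknownHostException {
        // The name and address depend on the machine's hostname and hosts configuration.
        InetAddress local = InetAddress.getLocalHost();
        String note = local.isLoopbackAddress() ? " (loopback)" : "";
        System.out.println(local.getHostName() + " -> " + local.getHostAddress() + note);
    }
}
```

Running it on each VM before and after editing /etc/hostname shows whether the name now matches the entry expected in /etc/hosts.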

Here are the screenshots after running:

Results for WORDCOUNT:
The input file used was sample.txt, present in the submission folder. It was a 2.4 MB file
which contains the line "sample file here are the random words" 62856 times.
Results obtained after running the word count program were as follows:

As can be seen, the output of the program was accurate.
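
The expected figures for this input can be worked out independently of Hadoop. The sketch below assumes the repeated line is "sample file here are the random words" in ASCII with a single newline after each line; the class name ExpectedWordCount and its methods are hypothetical. Under those assumptions the file is 38 bytes per line, which gives 2,388,528 bytes (about 2.4 MB), and every one of the seven words should be counted 62856 times.

```java
// Hypothetical helper that derives the expected size and counts for sample.txt.
class ExpectedWordCount {
    static final String LINE = "sample file here are the random words";
    static final int REPEATS = 62856;

    // Size in bytes of the input, assuming ASCII text and one '\n' after each line.
    static long expectedBytes() {
        return (long) (LINE.length() + 1) * REPEATS;
    }

    // Occurrences of a word per line, multiplied by the number of lines.
    static int expectedCount(String word) {
        int perLine = 0;
        for (String w : LINE.split(" ")) {
            if (w.equals(word)) {
                perLine++;
            }
        }
        return perLine * REPEATS;
    }

    public static void main(String[] args) {
        System.out.println(expectedBytes());          // prints 2388528
        System.out.println(expectedCount("sample"));  // prints 62856
    }
}
```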

(b) After shutting down one VM, the results did not change. Here are the
screenshots.

Computing Average grade of the courses using MapReduce

The input file was generated using the Java code Average.java present in the submission folder.
It contains 10,000 rows: 1250 students are distributed among 8 courses.
Average.java was created to calculate the average grade of each of the 8 courses.
For the input file grades_our.txt present in the submission folder, the output was:

Approach Used:
The Mapper returns a key-value pair. The key is the course id, and the value is the
corresponding grade of a student in that course.
The Combiner and Reducer take the input key as Text (course id) and the input values as
an iterator of <FloatWritable>, and output the key as Text (course id) and the value as
FloatWritable (average grade).
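
The approach above can be sketched in plain Java without Hadoop types. This is not the submitted Average.java: the class name AverageGradeSketch, its methods, and the row layout "studentId courseId grade" are all assumptions made for illustration, since the format of grades_our.txt is not described here. The sketch has no combiner; it groups every grade by course and averages once.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical stand-in for the average-grade job; it runs in one JVM without Hadoop.
class AverageGradeSketch {

    // Map phase: turn one row into a (courseId, grade) pair.
    // Assumed row layout: "studentId courseId grade", separated by whitespace.
    static Map.Entry<String, Float> map(String row) {
        String[] fields = row.trim().split("\\s+");
        return new AbstractMap.SimpleEntry<String, Float>(fields[1], Float.parseFloat(fields[2]));
    }

    // Reduce phase: average all the grades collected for one course.
    static float reduce(String courseId, List<Float> grades) {
        float sum = 0f;
        for (float g : grades) {
            sum += g;
        }
        return sum / grades.size();
    }

    // Driver: map every row, group grades by course id, then reduce each group.
    static Map<String, Float> run(List<String> rows) {
        Map<String, List<Float>> grouped = new TreeMap<String, List<Float>>();
        for (String row : rows) {
            Map.Entry<String, Float> pair = map(row);
            if (!grouped.containsKey(pair.getKey())) {
                grouped.put(pair.getKey(), new ArrayList<Float>());
            }
            grouped.get(pair.getKey()).add(pair.getValue());
        }
        Map<String, Float> averages = new TreeMap<String, Float>();
        for (Map.Entry<String, List<Float>> e : grouped.entrySet()) {
            averages.put(e.getKey(), reduce(e.getKey(), e.getValue()));
        }
        return averages;
    }

    public static void main(String[] args) {
        List<String> rows = Arrays.asList("s1 C1 8.0", "s2 C1 6.0", "s1 C2 9.0");
        System.out.println(run(rows));
    }
}
```

One general caveat when a combiner emits partial averages: averaging those partial averages in the reducer is exact only when every partial group has the same number of grades. Passing (sum, count) pairs from the combiner and dividing only in the reducer avoids that dependence.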

Here are the screenshots after running:
