
ACADGILD

BIG DATA & HADOOP ADMINISTRATION
LEARN. DO. EARN

ABOUT ACADGILD
ACADGILD is a technology education startup that
aims to create an ecosystem for skill development,
where people can learn from mentors and from each
other. We believe that software development
requires highly specialized skills, best learned with
guidance from experienced practitioners. Online
videos or classroom formats are poor substitutes for
building real projects with the help of a dedicated
mentor. Our mission is to teach hands-on, job-ready
software programming skills, globally, in small
batches of 8 to 10 students, using industry experts.

ACADGILD OFFERS COURSES IN:

ANDROID DEVELOPMENT
JAVA FOR FRESHERS
DIGITAL MARKETING
BIG DATA & HADOOP ADMINISTRATION
MACHINE LEARNING WITH R
FULL STACK WEB DEVELOPMENT
BIG DATA ANALYSIS
NODE JS
CLOUD COMPUTING
FRONT END DEVELOPMENT (WITH ANGULARJS)

2016 ACADGILD.
ALL RIGHTS RESERVED.
No part of this book may be reproduced, distributed, or transmitted in any form or by any
means, electronic or mechanical, including photocopying, recording, or any information
storage and retrieval system, without permission in writing from ACADGILD.
DISCLAIMER: This material is intended only for learners and is not intended for any
commercial purpose. If you are not the intended recipient, you should not distribute or
copy this material. Please notify the sender immediately.
Published by
ACADGILD,
support@acadgild.com

Become a Big Data & Hadoop Developer

TABLE OF CONTENTS

1. Installing Oracle Virtual Box
2. Downloading CentOS
3. Steps to Install CentOS
4. Adding More Users in CentOS
5. Starting NameNode, DataNode, ResourceManager, NodeManager & JobHistoryServer
   5.1 Starting NameNode
   5.2 Starting DataNode
   5.3 Starting ResourceManager
   5.4 Starting NodeManager
   5.5 Starting JobHistoryServer
6. Checking the Health of Daemons


Single Node Hadoop 2.X Cluster Setup on CentOS

1. Installing Oracle Virtual Box
Download and install Oracle VirtualBox from the link below:
https://drive.google.com/file/d/0Bxr27gVaXO5sRXdxQVpEUmhCZ3c/view?usp=sharing

2. Downloading CentOS
Download CentOS from the link below:
https://drive.google.com/file/d/0Bxr27gVaXO5sRU0yVFVQM0FvLU0/view?usp=sharing

Note: CentOS is downloaded as a compressed file. You need to unzip it using any unzipping
software by right-clicking on the file and selecting the option Extract here.

3. Steps to Install CentOS

1. Click on the New option and then enter the details as shown in the screenshot:
Name: Type in any name for your VM.
Type: Select Linux from the drop-down list.
Version: Select Other Linux (64-bit) from the drop-down list.


2. Click Next.
On clicking Next, a prompt appears to set the RAM size for the VM.
3. Increase the RAM to 2048 MB if the system has 8 GB of RAM, or to 1 GB if the system
has 4 GB of RAM. Before powering on the VM, click on the Settings option
and then increase the RAM size as shown below:

4. Click Next to get the option of selecting the hard disk; choose the third option,
i.e., use an existing virtual hard drive file.

5. Click on the folder icon to browse to the location where the unzipped CentOS file is
present.


6. Select the imported VM and click the Start button to start the VM.

On starting the VM, a prompt appears asking for the credentials.

7. Type the username and password as follows:
User name: tom
Password: tomtom

8. Open the terminal and log in as the root user to get administrator permissions.
Type the password as follows:
Password: tomtom


4. Adding More Users in CentOS

1. Add a user in CentOS using the command adduser acadgild.
2. Set the password of the added user using the command passwd acadgild.
Refer to the screenshot below.

Note: Enter any password for the created user.

3. Disable the firewall in CentOS using the command below:

service iptables stop

4. Add the user acadgild to the sudoers file to give administrator rights to the
created user. Type the command visudo as shown in the screenshot below to add
acadgild to the sudoers file.


5. Add the user acadgild by scrolling the cursor down as shown in the screenshot below:

Note: To type any command in the above file, enter insert mode by pressing I on the
keyboard, add the user to the sudoers file, press Esc to leave insert mode, and then
type :wq to save and exit.
Reboot the machine and then log in as the created user.
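The exact line to add is only visible in the screenshot; a typical sudoers entry, assuming the username acadgild used throughout this guide, looks like the following (placed just below the existing root entry):

```
## Allow root to run any commands anywhere
root      ALL=(ALL)       ALL
acadgild  ALL=(ALL)       ALL
```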

6. Use the link below to download the JDK in the VM, using the browser present in CentOS.

http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html

On clicking the above link, a screen prompts you to select the required version.

7. Select the option marked with the red arrow.


On clicking the above option, the download will start and the file will be saved in the
Downloads folder.

8. Move the downloaded file into the /home/acadgild directory using the mv command, and
then switch to /home/acadgild using the cd command.

9. Untar the JDK and extract the Java files using the command shown in the screenshot.

10. Enter the command ls to see the extracted JDK in the same folder, /home/acadgild.
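The untar command itself only appears in the screenshot; a plausible version, assuming the JDK 8u60 archive name used elsewhere in this guide, is:

```shell
# Archive name assumed from the JDK 8u60 download; adjust to your version.
cd /home/acadgild
tar -xzf jdk-8u60-linux-x64.tar.gz   # extracts to ./jdk1.8.0_60
ls                                   # the jdk1.8.0_60 directory should now appear
```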


11. Download the Hadoop file using the link below and then copy the file from the
Downloads folder to the /home/acadgild directory.

https://archive.apache.org/dist/hadoop/common/hadoop-2.6.0/

On clicking the above link, the screen below will prompt you to select a file.

12. Select the file with the tar.gz extension.

13. Copy the Hadoop tar file from the Downloads directory to the /home/acadgild directory
using the mv command as shown in the previous steps, and then untar the downloaded
Hadoop file using the command below; refer to the screenshot.
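The move and untar commands are only shown in the screenshot; a plausible version, assuming the Hadoop 2.6.0 archive name used later in this guide, is:

```shell
# Filename assumed from the Hadoop 2.6.0 release; adjust to the file you downloaded.
cd /home/acadgild
mv ~/Downloads/hadoop-2.6.0.tar.gz .
tar -xzf hadoop-2.6.0.tar.gz         # extracts to ./hadoop-2.6.0
```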


14. Update the .bashrc file with the required environment variables, including the
Hadoop path. Type the command sudo gedit .bashrc from the home
directory /home/acadgild.

Note: Update the paths to match your system.

15. Type the command source .bashrc to make the environment variables take effect.

Note: The Java path set in .bashrc will vary from system to system; you must give the
path where Java has been downloaded and extracted, i.e., /path-to-extracted-java folder.
Example: /home/acadgild/jdk1.8.0_60
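The exact variables are only visible in the screenshot; a typical set of .bashrc entries for this setup, assuming the extraction paths used in this guide, would be:

```shell
# Append to ~/.bashrc. Paths assume the locations used in this guide;
# update them to where you actually extracted Java and Hadoop.
export JAVA_HOME=/home/acadgild/jdk1.8.0_60
export HADOOP_HOME=/home/acadgild/hadoop-2.6.0
export PATH=$PATH:$JAVA_HOME/bin:$HADOOP_HOME/bin:$HADOOP_HOME/sbin
```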


16. Create two directories to store NameNode metadata and DataNode blocks
as shown below:
mkdir -p $HOME/hadoop/namenode
mkdir -p $HOME/hadoop/datanode
Note: Change the permissions of the directory to 755.
chmod 755 $HOME/hadoop/namenode
chmod 755 $HOME/hadoop/datanode
17. Change the directory to the location where Hadoop is installed.

Note: Update the path present in your system.

18. Open hadoop-env.sh and add the JAVA_HOME and HADOOP_HOME paths in it.


Note: Update the Java version and path for your system; in our case the
version is 1.8 and the location is /usr/lib/jvm/jdk1.8.0_60.
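The line to add is only shown in the screenshot; in hadoop-env.sh it typically replaces the default `export JAVA_HOME=${JAVA_HOME}` entry, using the path stated above:

```shell
# In hadoop-2.6.0/etc/hadoop/hadoop-env.sh; path assumed from this guide.
export JAVA_HOME=/usr/lib/jvm/jdk1.8.0_60
```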

19. Open core-site.xml using the command below, from the path shown in the screenshot.

Note: Update the path present in your system.

20. Add the following properties between the configuration tags of core-site.xml:

<configuration>
<property>
<name>fs.default.name</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>


21. Open hdfs-site.xml and add the following lines between the configuration tags:

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
<property>
<name>dfs.namenode.name.dir</name>
<value>/home/acadgild/hadoop/namenode</value>
</property>
<property>
<name>dfs.datanode.data.dir</name>
<value>/home/acadgild/hadoop/datanode</value>
</property>
</configuration>


22. Open yarn-site.xml and add the following lines between the configuration tags:

<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.nodemanager.aux-services.mapreduce_shuffle.class</name>
<value>org.apache.hadoop.mapred.ShuffleHandler</value>
</property>

23. Copy the mapred-site.xml template into mapred-site.xml and then
add the following properties, as shown in the screenshot:
cp mapred-site.xml.template mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>


24. Log in as the root user and then install the OpenSSH server in CentOS.
Refer to the screenshot below to install the OpenSSH server.

Note: Exit the root account by typing the command exit on the
command line before the next step.

25. Generate an ssh key for the hadoop user using the command:

ssh-keygen -t rsa
Refer to the screenshot below.
Note: Hit Enter after typing the command ssh-keygen -t rsa, and hit Enter once
again when it asks for the file in which to save the key and for the passphrase.
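Pressing Enter at each prompt accepts the default key file (~/.ssh/id_rsa) and an empty passphrase; the non-interactive equivalent, shown here as an alternative, makes those defaults explicit:

```shell
# Generate a passwordless RSA key pair for the current user without prompts.
# -N "" sets an empty passphrase; -f gives the output file explicitly.
ssh-keygen -t rsa -N "" -f ~/.ssh/id_rsa
```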


26. Copy the public key from the .ssh directory into the authorized_keys file.
Change the directory to .ssh and then type the command below to copy the key
into the authorized_keys file.

27. Type the command ls to check whether the authorized_keys file has been created.

28. To confirm that the key has been copied, type the command:
cat authorized_keys
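The copy command itself appears only in the screenshot; the usual form (note that authorized_keys is a file, not a folder) is:

```shell
# Append the public key to authorized_keys so ssh to localhost
# works without a password.
cd ~/.ssh
cat id_rsa.pub >> authorized_keys
ls                    # authorized_keys should now be listed
cat authorized_keys   # should show the copied public key
```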


29. Change the permissions of the authorized_keys file:

chmod 600 .ssh/authorized_keys

30. Start the ssh service by typing the command below:

Command: sudo service sshd start

31. To start all the daemons, follow the steps below:

a. Format the NameNode:
Command: hadoop namenode -format


5. Starting NameNode, DataNode, ResourceManager, NodeManager and JobHistoryServer

Note: Change the directory to the sbin directory of Hadoop before starting the daemons.
5.1 Starting NameNode
Change the directory to the location of Hadoop:
Command: cd hadoop-2.6.0/sbin
Now type the command below to start the NameNode:
Command: ./hadoop-daemon.sh start namenode

5.2 Starting DataNode


Command: ./hadoop-daemon.sh start datanode

5.3 Starting ResourceManager


Command: ./yarn-daemon.sh start resourcemanager


5.4 Starting NodeManager


Command: ./yarn-daemon.sh start nodemanager

5.5 Starting JobHistoryServer


Command: ./mr-jobhistory-daemon.sh start historyserver

6. Checking the Health of Daemons

Check whether all the daemons have started by typing the command jps.
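If all five daemons started successfully, jps (shipped with the JDK) should list each of them; the process IDs will differ on your machine:

```shell
# List the JVM processes of the current user; each Hadoop daemon
# should appear by class name if it started successfully.
jps
# Expected to include: NameNode, DataNode, ResourceManager,
# NodeManager, JobHistoryServer, and Jps itself.
```

If any daemon is missing from the list, check its log file under the Hadoop logs directory before proceeding.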


ACADGILD
Check out these resources to enrich your skills in Big Data and Hadoop.
We hope this eBook has been helpful in understanding the vital steps necessary to build
your career in the Big Data domain.
For a better understanding and in-depth learning of Big Data and Hadoop,
enroll in our Big Data and Hadoop Development course.
Keep visiting our website www.acadgild.com for more posts on Big Data and other technologies.


About the Author

Satyam Kumar is a Big Data professional at AcadGild with 3+ years of experience
and expertise in Big Data technologies such as Hadoop, Spark, NoSQL, and other related
technologies.
He codes in programming languages such as Java and Python and has been responsible
for the development of various projects and blogs related to the Hadoop ecosystem and Spark.
Feel free to contact him at satyam@acadgild.com in case you have any queries.

LEARN. DO. EARN



ACADGILD
We hope this eBook has helped you understand the steps involved in setting up a
single-node Hadoop cluster. If you have any questions, feel free to contact us at support@acadgild.com.
