
Single Node Hadoop Cluster

Installation Guide
Please follow the steps listed below to set up Hadoop in pseudo-distributed mode.

1. Download VMware Player


https://my.vmware.com/web/vmware/free#desktop_end_user_computing/vmware_player/5_0
https://www.vmware.com/tryvmware/?p=player

2. Download Ubuntu VM Image


http://www.momotrade.com/tool/vm/ubuntu1604t.html
4. Install SSH and set up a passwordless SSH connection
sudo apt-get install openssh-server
ssh-keygen -t rsa -P ""

cat $HOME/.ssh/id_rsa.pub >> $HOME/.ssh/authorized_keys
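The key-append step above can be sanity-checked. A minimal sketch, run against temp files (with a dummy key) so it is safe to try anywhere; on the real machine the files live under $HOME/.ssh:

```shell
# Sketch of the key-append idiom from above, demonstrated on temp files.
# The key text below is a dummy; real files live under $HOME/.ssh.
SSH_DIR=$(mktemp -d)
echo "ssh-rsa AAAAB3...dummy user@host" > "$SSH_DIR/id_rsa.pub"
cat "$SSH_DIR/id_rsa.pub" >> "$SSH_DIR/authorized_keys"
# The public key should now appear in authorized_keys exactly once:
grep -c 'ssh-rsa' "$SSH_DIR/authorized_keys"   # prints 1
rm -rf "$SSH_DIR"
```

On the real box, `ssh localhost` should then log in without prompting for a password.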

5. Install the latest Java on Ubuntu

sudo apt-get install openjdk-8-jdk
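The JAVA_HOME value used later in this guide can be derived from the JDK's install path. A minimal sketch; the path below is the one this guide assumes for openjdk-8 on 32-bit Ubuntu, and on your own machine you would substitute the output of `readlink -f "$(command -v java)"`:

```shell
# Derive JAVA_HOME by stripping the binary suffix from the java path.
# JAVA_BIN is the path this guide assumes; substitute the output of
# `readlink -f "$(command -v java)"` on your own machine.
JAVA_BIN=/usr/lib/jvm/java-8-openjdk-i386/jre/bin/java
JAVA_HOME_DIR=$(echo "$JAVA_BIN" | sed 's:/jre/bin/java$::;s:/bin/java$::')
echo "$JAVA_HOME_DIR"   # prints /usr/lib/jvm/java-8-openjdk-i386
```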


6. Download Hadoop
Download the Hadoop software (version 2.7.3 is used in this guide)
http://apache.claz.org/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz

wget http://apache.claz.org/hadoop/common/hadoop-2.7.3/hadoop-2.7.3.tar.gz

7. Extract the Hadoop tarball and update .bashrc


tar xzf hadoop-2.7.3.tar.gz

--Set up JAVA_HOME
Open the .bashrc file and add:

export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386

--Add Hadoop env vars to .bashrc

export HADOOP_HOME=/home/user/hadoop-2.7.3
export HADOOP_MAPRED_HOME=$HADOOP_HOME
export HADOOP_COMMON_HOME=$HADOOP_HOME
export HADOOP_HDFS_HOME=$HADOOP_HOME
export YARN_HOME=$HADOOP_HOME
export HADOOP_COMMON_LIB_NATIVE_DIR=$HADOOP_HOME/lib/native
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
export HADOOP_INSTALL=$HADOOP_HOME
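Rather than pasting the exports by hand, they can be appended idempotently with a here-doc. A sketch, demonstrated on a temp file so it is safe to try; on the real box you would set BASHRC="$HOME/.bashrc" (only two of the exports are shown):

```shell
# Append the Hadoop exports only if .bashrc does not already have them.
# Demonstrated on a temp file; use BASHRC="$HOME/.bashrc" on the real box.
BASHRC=$(mktemp)
if ! grep -q 'HADOOP_HOME' "$BASHRC"; then
  cat >> "$BASHRC" <<'EOF'
export HADOOP_HOME=/home/user/hadoop-2.7.3
export PATH=$PATH:$HADOOP_HOME/sbin:$HADOOP_HOME/bin
EOF
fi
grep -c 'HADOOP_HOME' "$BASHRC"   # prints 2 (both appended lines mention it)
rm -f "$BASHRC"
```

Afterwards, run `source ~/.bashrc` so the new variables take effect in the current shell.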

8. Update the Hadoop configuration files, viz. hadoop-env.sh, core-site.xml, hdfs-site.xml, mapred-site.xml and yarn-site.xml
cd $HADOOP_HOME/etc/hadoop

vi /home/user/hadoop-2.7.3/etc/hadoop/hadoop-env.sh
Add the line below to hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-8-openjdk-i386
--1. Update core-site.xml

vi /home/user/hadoop-2.7.3/etc/hadoop/core-site.xml

Add the property below to core-site.xml

<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
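The same snippet can be written without opening vi, using a here-doc. A sketch, written to a temp directory so it runs anywhere; the real file is $HADOOP_HOME/etc/hadoop/core-site.xml (fs.defaultFS is the current name for the deprecated fs.default.name):

```shell
# Write core-site.xml non-interactively with a here-doc.
# Target a temp dir here; the real path is $HADOOP_HOME/etc/hadoop/.
CONF_DIR=$(mktemp -d)
cat > "$CONF_DIR/core-site.xml" <<'EOF'
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://localhost:9000</value>
</property>
</configuration>
EOF
grep -c '<property>' "$CONF_DIR/core-site.xml"   # prints 1
rm -rf "$CONF_DIR"
```

The same pattern works for hdfs-site.xml, yarn-site.xml and mapred-site.xml below.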
--2. hdfs-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/hdfs-site.xml

Add the property below to hdfs-site.xml

<configuration>
<property>
<name>dfs.replication</name>
<value>1</value>
</property>
</configuration>

--3. yarn-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/yarn-site.xml

<configuration>
<property>
<name>yarn.nodemanager.aux-services</name>
<value>mapreduce_shuffle</value>
</property>
</configuration>
--4. mapred-site.xml
vi /home/user/hadoop-2.7.3/etc/hadoop/mapred-site.xml

<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>

9. Format the Namenode


hdfs namenode -format
10. Start Daemons
Go to the /home/user/hadoop-2.7.3/sbin folder

10.1 Start HDFS daemons

start-dfs.sh

--Use the jps command to list the Hadoop daemons running on the Linux box
jps
ps -eaf | grep 'java'

10.2 Start YARN daemons

start-yarn.sh

10.3 Access Hadoop URLs


HDFS NameNode UI: http://localhost:50070/
YARN RM UI: http://localhost:8088/
