Reference sources
http://www.osboxes.org/debian/
http://howto.gandasoftwarefactory.com/desarrollo-software/2014/como-instalar-apache-hadoop-ubuntu-linux-20141209/
https://www.tutorialspoint.com/es/hadoop/hadoop_hdfs_operations.htm
https://hadoop.apache.org/docs/r2.8.0/hadoop-mapreduce-client/hadoop-mapreduce-client-core/MapReduceTutorial.html
*/
#*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*#
# IT IS RECOMMENDED TO READ THIS DOCUMENT IN NOTEPAD++ WITH THE LANGUAGE SET TO C++ FOR BETTER VISUALIZATION #
# https://notepad-plus-plus.org/download/                                                                    #
#*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*-*#
passwd osboxes
passwd root
# verify IP addressing
# install the SSH server
su
ifconfig
apt-get update
apt-get install openssh-server
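The guide exports JAVA_HOME for OpenJDK 7 below but never shows installing Java. A minimal sketch of that step, assuming the package name matches the java-7-openjdk paths used later:

```shell
# install OpenJDK 7 (assumed: matches the JAVA_HOME paths exported in /etc/bash.bashrc)
apt-get install openjdk-7-jdk
```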
nano /etc/bash.bashrc
# For 32-bit systems
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-i386/
# For 64-bit systems
export JAVA_HOME=/usr/lib/jvm/java-7-openjdk-amd64/
export HADOOP_HOME=/home/osboxes/hadoop-2.6.5
source /etc/bash.bashrc
ls $JAVA_HOME
// As output you should see the directories
// bin docs jre man
cd /home/osboxes
wget http://www.eu.apache.org/dist/hadoop/common/hadoop-2.6.5/hadoop-2.6.5.tar.gz
tar xzf hadoop-2.6.5.tar.gz -C .
# Change permissions
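The permissions command itself is not shown; a hedged sketch, assuming the extracted hadoop directory should belong to the osboxes user:

```shell
# give ownership of the hadoop tree to the regular user (assumption: osboxes runs the daemons)
chown -R osboxes:osboxes /home/osboxes/hadoop-2.6.5
```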
nano /etc/hosts
# add the following to the file
/*
192.168.X.X nodo1
192.168.X.Y nodo2
192.168.X.Z nodo3
*/
nano /etc/hostname
// root@osboxes:/home/osboxes# exit
exit
su
// osboxes@osboxes:~$ su
// Password:
// root@nodo1:/home/osboxes#
nano $HADOOP_HOME/etc/hadoop/core-site.xml
<configuration>
<property>
<name>fs.defaultFS</name>
<value>hdfs://nodo1:9000</value>
</property>
<property>
<name>hadoop.tmp.dir</name>
<value>/home/osboxes/hadoop-2.6.5/tmp</value>
</property>
</configuration>
nano $HADOOP_HOME/etc/hadoop/hdfs-site.xml
<configuration>
<property>
<name>dfs.replication</name>
<value>2</value>
</property>
</configuration>
nano $HADOOP_HOME/etc/hadoop/mapred-site.xml
<configuration>
<property>
<name>mapreduce.framework.name</name>
<value>yarn</value>
</property>
</configuration>
# Configure YARN
nano $HADOOP_HOME/etc/hadoop/yarn-site.xml
<configuration>
<property>
<name>yarn.nodemanager.aux-services</name> // in the video this line has a typo: "sernano ces</name>"
<value>mapreduce_shuffle</value>
</property>
<property>
<name>yarn.resourcemanager.hostname</name>
<value>nodo1</value>
</property>
</configuration>
nano $HADOOP_HOME/etc/hadoop/hadoop-env.sh
export HADOOP_HEAPSIZE=256
nano $HADOOP_HOME/etc/hadoop/slaves
# add the content
/*
nodo1
nodo2
nodo3
*/
# shut down the machine and create the two copies,
# change the hostname and verify the IP addresses
############################################
## ON EACH MACHINE
############################################
# Generate the SSH keys that will allow authentication between the nodes:
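The key-generation command is not shown in the guide; a typical invocation, assuming the default RSA key path used by the scp commands below and an empty passphrase:

```shell
# generate an RSA key pair with no passphrase at the default location (run as osboxes)
ssh-keygen -t rsa -P "" -f /home/osboxes/.ssh/id_rsa
```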
# Add the public keys of all nodes (nodo1, nodo2 and nodo3) to the
# authorized_keys file to allow SSH authentication without prompting for
# credentials.
# from nodo1:
scp /home/osboxes/.ssh/id_rsa.pub osboxes@nodo2:/home/osboxes/.ssh/id_rsa_nodo1.pub
scp /home/osboxes/.ssh/id_rsa.pub osboxes@nodo3:/home/osboxes/.ssh/id_rsa_nodo1.pub
# from nodo2:
# from nodo3:
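The scp commands from nodo2 and nodo3 mirror the ones shown for nodo1. The aggregation into authorized_keys described above could then be sketched as follows on each node, assuming the copied files follow the id_rsa_nodoN.pub naming used earlier:

```shell
# append the local key and every received key to authorized_keys (run as osboxes on each node)
cd /home/osboxes/.ssh
cat id_rsa.pub id_rsa_nodo*.pub >> authorized_keys
chmod 600 authorized_keys
```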
# helper: decompress an LZO-compressed file in HDFS and page through its first 1000 lines
lzohead () {
hadoop fs -cat "$1" | lzop -dc | head -1000 | less
}
export PATH=$PATH:$HADOOP_HOME/bin
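Before the first start, HDFS requires the NameNode to be formatted, a step the guide does not show; a minimal sketch, run once on nodo1:

```shell
# format the HDFS filesystem metadata (one-time step, on the NameNode host only)
$HADOOP_HOME/bin/hdfs namenode -format
```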
# start hadoop
$HADOOP_HOME/sbin/start-all.sh
# Verify services
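One common way to verify the daemons is jps, which lists running Java processes. On nodo1 (master and, per the slaves file, also a worker) you would expect NameNode, SecondaryNameNode, ResourceManager, DataNode and NodeManager; on nodo2/nodo3 only DataNode and NodeManager:

```shell
# list running Java processes; each Hadoop daemon should appear by class name
jps
```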
// end of lab 1
# download a sample file
wget http://textfiles.com/etext/REFERENCE/pge0112.txt
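A hedged sketch of the remaining lab-2 steps implied by the program list below: copy the downloaded file into HDFS and run the wordcount example (the HDFS directory names are assumptions; the jar path is the standard location in the 2.6.5 distribution):

```shell
# create an input directory in HDFS and upload the downloaded text file
$HADOOP_HOME/bin/hdfs dfs -mkdir -p /user/osboxes/input
$HADOOP_HOME/bin/hdfs dfs -put pge0112.txt /user/osboxes/input
# run the wordcount example job over the input directory
$HADOOP_HOME/bin/hadoop jar $HADOOP_HOME/share/hadoop/mapreduce/hadoop-mapreduce-examples-2.6.5.jar wordcount /user/osboxes/input /user/osboxes/output
# inspect the first lines of the result
$HADOOP_HOME/bin/hdfs dfs -cat /user/osboxes/output/part-r-00000 | head
```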
// End of lab 2
/*
Output of running the examples jar with no program name; valid program names are:
aggregatewordcount: An Aggregate based map/reduce program that counts the words
in the input files.
aggregatewordhist: An Aggregate based map/reduce program that computes the
histogram of the words in the input files.
bbp: A map/reduce program that uses Bailey-Borwein-Plouffe to compute exact
digits of Pi.
dbcount: An example job that count the pageview counts from a database.
distbbp: A map/reduce program that uses a BBP-type formula to compute exact bits
of Pi.
grep: A map/reduce program that counts the matches of a regex in the input.
join: A job that effects a join over sorted, equally partitioned datasets
multifilewc: A job that counts words from several files.
pentomino: A map/reduce tile laying program to find solutions to pentomino
problems.
pi: A map/reduce program that estimates Pi using a quasi-Monte Carlo method.
randomtextwriter: A map/reduce program that writes 10GB of random textual data
per node.
randomwriter: A map/reduce program that writes 10GB of random data per node.
secondarysort: An example defining a secondary sort to the reduce.
sort: A map/reduce program that sorts the data written by the random writer.
sudoku: A sudoku solver.
teragen: Generate data for the terasort
terasort: Run the terasort
teravalidate: Checking results of terasort
wordcount: A map/reduce program that counts the words in the input files.
wordmean: A map/reduce program that counts the average length of the words in the
input files.
wordmedian: A map/reduce program that counts the median length of the words in
the input files.
wordstandarddeviation: A map/reduce program that counts the standard deviation of
the length of the words in the input files.
*/