
Configuration

1. Cloudera:
- Basically for testing purposes.
- VMware Player: Free download from
http://www.vmware.com/products/player/
- Cloudera QuickStart VM: Free download from
https://www.cloudera.com/content/support/en/downloads.html (my
Cloudera version is 4.2.0, with Pig version 0.10.0)
- Run the Cloudera VM in VMware Player.
- Install the Cloudera Development Kit to enable full screen and shared
folders between the host machine and the VM. Also available from
https://www.cloudera.com/content/support/en/downloads.html
2. AWS account:
Ask Ripul for the account credentials (access key, secret key) and the
keypair.pem file.
3. SSH tools:
- PuTTY: configure with the AWS credentials and the pem file
- Cygwin terminal: ssh from the folder containing the pem file
- Problem: the connection might keep dropping; switching to another tool may help.
4. S3 Browser:
Download from http://s3browser.com/ and enter the AWS credentials to use it.
Development
1. Cloudera environment
- Compile the Java code in a folder called udf containing NcycloLoader.java;
the classpath needs to include the following jar files:
>>javac -cp /mnt/hgfs/project/lib/pig.jar:/home/cloudera/eclipse/plugins/org.apache.commons.logging_1.0.4.v201101211617.jar:/mnt/hgfs/project/lib/commons-io-2.4.jar:/mnt/hgfs/project/lib/hadoop-core-0.20.2.jar NcycloLoader.java
Then pack the udf folder into a jar:
>>cd ..
>>jar cf udf.jar udf
- There are two modes in Pig: hadoop mode and local mode. Hadoop mode runs
against data in HDFS; local mode runs against local folders, which is better
for testing.
(1) Run Pig in interactive mode:
>>pig -x local
(2) Run Pig with a Pig file:
>>pig -x local file.pig
>>quit
2. Code

Java udf code:
- Use package udf to match the udf folder.
- getNext() reads from the input file line by line, does the format
transformation, and returns a tuple.
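The actual NcycloLoader implementation is not reproduced in these notes; the following is only a minimal LoadFunc sketch of the structure described above, for Pig 0.10 (the comma-split parsing is an assumed placeholder for the real format transformation):

package udf;

import java.io.IOException;
import java.util.ArrayList;

import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.InputFormat;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.RecordReader;
import org.apache.hadoop.mapreduce.lib.input.FileInputFormat;
import org.apache.hadoop.mapreduce.lib.input.TextInputFormat;
import org.apache.pig.LoadFunc;
import org.apache.pig.backend.hadoop.executionengine.mapReduceLayer.PigSplit;
import org.apache.pig.data.Tuple;
import org.apache.pig.data.TupleFactory;

public class NcycloLoader extends LoadFunc {

    private RecordReader reader;
    private final TupleFactory tupleFactory = TupleFactory.getInstance();

    @Override
    public InputFormat getInputFormat() throws IOException {
        // read the input as plain text lines
        return new TextInputFormat();
    }

    @Override
    public void prepareToRead(RecordReader reader, PigSplit split) throws IOException {
        this.reader = reader;
    }

    @Override
    public void setLocation(String location, Job job) throws IOException {
        FileInputFormat.setInputPaths(job, location);
    }

    @Override
    public Tuple getNext() throws IOException {
        try {
            if (!reader.nextKeyValue()) {
                return null;                        // end of input
            }
            Text line = (Text) reader.getCurrentValue();
            // placeholder transformation: split each line on commas into tuple fields
            String[] parts = line.toString().split(",");
            ArrayList<Object> fields = new ArrayList<Object>();
            for (String part : parts) {
                fields.add(part.trim());
            }
            return tupleFactory.newTupleNoCopy(fields);
        } catch (InterruptedException e) {
            throw new IOException(e);
        }
    }
}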

Pig code (a short sketch follows these notes):
Register udf.jar
Use proper schemas
Make sure the output folder path does not already exist
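A minimal Pig script sketch along these lines (the paths, field names, and the filter are placeholders, not the real job):

REGISTER udf.jar;

-- load with the custom loader; this schema is only an assumed example
raw = LOAD 's3://yourbucket/folder/input' USING udf.NcycloLoader()
      AS (server:chararray, field:chararray, value:chararray);

-- trivial example transformation
clean = FILTER raw BY value IS NOT NULL;

-- the output folder must not exist yet
STORE clean INTO 's3://yourbucket/output';

In local mode (pig -x local) the same script would point at local folders instead of s3:// paths.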
Useful references:
http://chimera.labs.oreilly.com/books/1234000001811/index.html
http://pig.apache.org/docs/r0.10.0/basic.html

3. AWS environment
- Create a Hadoop cluster
In the EMR console, create a new job flow:
choose name & type -> select mode / upload script -> select number of
nodes & type / spot price -> keypair / log / keep alive or not -> finish
(a command-line sketch follows)
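If cluster creation is later automated from the command line (see Next Steps), a hedged sketch with the current AWS CLI could look like the following; every value is a placeholder, the flags should be checked against the installed CLI version, and the release label here is newer than the Pig 0.10 setup used above:
>>aws emr create-cluster --name "pig-cluster" --release-label emr-4.7.0 \
    --applications Name=Pig --instance-type m3.xlarge --instance-count 3 \
    --ec2-attributes KeyName=keypair --log-uri s3://yourbucket/logs \
    --use-default-roles --auto-terminate \
    --steps Type=PIG,Name="pig-step",ActionOnFailure=TERMINATE_CLUSTER,Args=[-f,s3://yourbucket/folder/file.pig]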
- SSH commands
(If using Cygwin, cd to the folder containing the pem file:
>>cd /cygdrive/c/Users/qli/Documents
SSH to the micro jumpbox:
>>ssh -i keypair.pem ec2-user@10.1.62.181)
SSH from the jumpbox to the Hadoop cluster:
>>ssh -i keypair.pem hadoop@ec2-54-224-126-81.compute-1.amazonaws.com

- Install s3cmd on the jumpbox or other instances (to access S3 from the
command line)
Download s3cmd:
>>wget http://sourceforge.net/projects/s3tools/files/s3cmd/1.5.0-alpha1/s3cmd-1.5.0-alpha1.tar.gz
Untar it:
>>tar -xvf s3cmd-1.5.0-alpha1.tar.gz
cd into it and install:
>>sudo python setup.py install
Then configure it with your keys:
>>s3cmd --configure
Commands to upload to and download from S3:
>>s3cmd put file s3://yourbucket/folder
>>s3cmd get s3://yourbucket/folder/file
Reference: http://s3tools.org/s3cmd

For large data set transfers, use distcp:
>>hadoop distcp s3://yourbucket/folder hdfs:///yourfolder
Reference: http://hadoop.apache.org/docs/stable/distcp.html
If the pem file is moved through S3, its permissions may need to be fixed
before ssh will accept it:
>>chmod 600 keypair.pem
- CloudWatch (to monitor the cluster)
Search all metrics by job flow id.
Useful metrics include: HDFS read/write, running map/reduce jobs, S3
read/write.

- S3 console: another way to access S3

Workflow
[Beforehand] Upload udf.jar, serverlist.csv, and fieldlist.csv to the S3 bucket
1. S3 browser: upload by dragging files/folders
2. In the AWS EMR console: create the cluster
In CloudWatch: monitor cluster performance
3. Run the script (see the commands sketched after this list)
4. Terminate the cluster
5. Download the output from S3
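One way to run step 3 from the cluster's master node, assuming file.pig and udf.jar were uploaded to the bucket and s3cmd is installed there (paths are placeholders):
>>s3cmd get s3://yourbucket/folder/file.pig
>>s3cmd get s3://yourbucket/folder/udf.jar
>>pig file.pig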
Next Steps
1. Automation through the command line
2. Update fieldlist and serverlist at run time
3. Optimize the code
