3/30/2016

Quick Introduction to Apache Spark | Data Phanatik

Data Phanatik
For the love of Code, Data and Analytics

Quick Introduction to Apache Spark
Posted on March 20, 2016

What is Spark
Spark is a fast, general-purpose framework for cluster computing.
It is written in Scala and has APIs for Scala, Java and Python.
The main data abstraction in Spark is the RDD, or Resilient Distributed Dataset.
With this abstraction, Spark enables data to be distributed and processed among the many nodes of a computing cluster.
It provides both interactive and programmatic interfaces. It consists of the following components:
Spark Core: the foundational classes on which Spark is built. It provides the RDD.
Spark Streaming: a library for processing streaming data
Spark SQL: an API for handling structured data
MLlib: a machine learning library
GraphX: a graph library

Cluster Manager
To operate in production mode, Spark needs a cluster manager that allocates resources across the nodes of the cluster; on top of it, Spark schedules tasks, distributes data and handles fault tolerance.
Three types are supported: Apache Mesos, Hadoop YARN and Spark standalone.
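
As a rough sketch of how the choice of cluster manager shows up in code: an application usually selects it through the master URL in its configuration. The host names and ports below are placeholders I made up, and only the local mode (no cluster manager at all) is actually started. The script assumes Spark is on the classpath and is run on its own rather than inside an existing spark-shell session.

```scala
import org.apache.spark.{SparkConf, SparkContext}

// The master URL tells Spark which cluster manager to connect to.
// Host names and ports are placeholders, not real endpoints.
val masterUrls = Map(
  "local"      -> "local[*]",                 // no cluster manager: run in-process on all cores
  "standalone" -> "spark://master-host:7077", // Spark standalone master
  "mesos"      -> "mesos://mesos-host:5050",  // Apache Mesos master
  "yarn"       -> "yarn"                      // Hadoop YARN (Spark 2.x form; cluster located via the Hadoop config)
)

// Build a configuration for the chosen manager; only local mode is started here.
val conf = new SparkConf()
  .setAppName("cluster-manager-sketch")
  .setMaster(masterUrls("local"))

val sc = SparkContext.getOrCreate(conf)
val master = sc.master   // the master URL the running context was created with
println(master)
sc.stop()
```

Switching to one of the other entries only changes the string passed to setMaster; the rest of the application code stays the same.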
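
Spark Fundamentals
Spark, as previously defined, is a framework for fast, general-purpose cluster computing. Its fundamental abstraction is the RDD, the resilient distributed dataset, which is split into partitions and is therefore inherently parallelizable among the nodes and cores of a computing cluster.
Spark keeps RDD partitions in RAM where it can, for example when an RDD is cached; partitions that do not fit in memory can be recomputed from their lineage or, depending on the storage level, spilled to disk.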
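
To make the idea of a partitioned dataset a little more concrete, here is a minimal sketch, assuming Spark is on the classpath and a local master with four threads. The variable names are mine. It splits ten numbers into four partitions, inspects the split, and asks Spark to keep the partitions in memory with disk as a fallback.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.storage.StorageLevel

val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("partition-sketch").setMaster("local[4]"))

// Split ten numbers into 4 partitions; each partition can be processed
// by a different core or node.
val numbers = sc.parallelize(1 to 10, numSlices = 4)
val partitionCount = numbers.partitions.length

// glom() turns each partition into an array so the split can be inspected.
val layout = numbers.glom().collect().map(_.toList).toList

// Ask Spark to keep the partitions in memory, spilling to disk if they do not fit.
numbers.persist(StorageLevel.MEMORY_AND_DISK)
val total = numbers.sum()   // this action computes the partitions and caches them

println(s"$partitionCount partitions: $layout, sum = $total")
sc.stop()
```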
http://searchdatascience.com/brief-introduction-to-apache-spark/

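When data is read into Spark, it is read into an RDD. Once it is in an RDD it can be operated on. There are two distinct types of operations on RDDs:

1. Transformations
Transformations are used to convert the data in an RDD to another form. The result of a transformation is another RDD. Examples of transformations are:
map(): takes a function as argument and applies it to each element of the RDD.
flatMap(): takes a function as argument and applies it to each element, flattening the results into a single-level collection.
filter(): takes a predicate (a function returning a boolean) and returns an RDD containing only the elements for which it is true, e.g. the lines of a file that contain the string Obama.
countByKey(): given a pair RDD, i.e. one made of key-value pairs, returns the number of elements for each key. Note that in Spark's API countByKey() returns a local Map to the driver rather than another RDD, so strictly speaking it is an action; reduceByKey() is the transformation that yields a pair RDD of per-key aggregates.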
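
The transformations above can be sketched as follows. This assumes Spark is on the classpath and runs in local mode; the sample lines are made up and stand in for the contents of a text file. Per-key counts are computed with reduceByKey() here, because that is the call that returns another RDD (countByKey() hands back a local Map instead).

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("transformations-sketch").setMaster("local[*]"))

// Made-up sample lines standing in for the contents of a text file.
val lines = sc.parallelize(Seq(
  "Obama spoke today",
  "the weather is mild",
  "Obama travels tomorrow"))

// map: one output element per input element (here, the length of each line).
val lengths = lines.map(line => line.length)

// flatMap: each line yields several words, flattened into one RDD of words.
val words = lines.flatMap(line => line.split(" "))

// filter: keep only the lines for which the predicate is true.
val obamaLines = lines.filter(line => line.contains("Obama"))

// Pair RDD of (word, 1), reduced per key to get word counts as another RDD.
val wordCounts = words.map(word => (word, 1)).reduceByKey(_ + _)

// Everything above only describes work; the actions below trigger it.
val lengthsResult = lengths.collect().toList
val obamaResult = obamaLines.collect().toList
val countsResult = wordCounts.collect().toMap
val wordTotal = words.count()
sc.stop()
```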

2. Actions
Actions are operations on an RDD that produce an output which is not an RDD, e.g. a local collection, a single value, or output to the screen or to storage. Examples of actions are:
collect(): evaluates the transformations that define the RDD, then returns all of its elements to the driver as a local collection.
count(): returns the number of elements in the RDD.
reduce(): takes a function of two arguments and repeatedly applies it to the elements of the RDD to produce a single output value.
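
A small sketch of these actions, again assuming Spark on the classpath and a local master. It reuses the numbers from the spark-shell session later in this post; the odd/even keying is just an illustration of mine.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("actions-sketch").setMaster("local[*]"))

val rdd = sc.parallelize(Seq(1, 3, 4, 5, 9))
val squares = rdd.map(x => x * x)          // transformation: still lazy

val all: Array[Int] = squares.collect()    // brings every element back to the driver
val n: Long = squares.count()              // number of elements
val sum: Int = squares.reduce(_ + _)       // combines the elements pairwise into one value

// countByKey is also an action: it returns a local Map, not an RDD.
val parity = rdd.map(x => (if (x % 2 == 0) "even" else "odd", x)).countByKey()

println(s"${all.mkString(", ")} | count = $n | sum = $sum | $parity")
sc.stop()
```

Keep in mind that collect() pulls the whole dataset into the driver's memory, so it is best reserved for small results.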

RDDs and Lazy Evaluation
A fundamental idea in Spark's implementation is lazy evaluation, which applies to all Spark transformation operations.
Thus an RDD is fundamentally a data abstraction, so when we call, say:

scala> val rdd = sc.parallelize(Seq(1,3,4,5,9))
rdd: org.apache.spark.rdd.RDD[Int] = ParallelCollectionRDD[0] at parallelize at <console>:21
scala> val mappedRDD = rdd.map(x => x*x)
mappedRDD: org.apache.spark.rdd.RDD[Int] = MapPartitionsRDD[1] at map at <console>:23

what we're getting as mappedRDD is just an expression that hasn't been evaluated. This expression is essentially a record of the sequence of operations that need to be evaluated, i.e. parallelize -> map.
This expression amounts to what is known as a lineage graph and can be seen as follows:

scala> mappedRDD.toDebugString
res4: String =
(4) MapPartitionsRDD[1] at map at <console>:23 []
 |  ParallelCollectionRDD[0] at parallelize at <console>:21 []
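
One way to see the laziness for yourself is to count how often the function passed to map() actually runs. The sketch below is my own illustration, not part of the session above. It assumes Spark 2.0 or later (for longAccumulator) on the classpath and a local master; the accumulator name is arbitrary.

```scala
import org.apache.spark.{SparkConf, SparkContext}

val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("lazy-sketch").setMaster("local[*]"))

// Counts how many times the map function has actually been invoked.
val calls = sc.longAccumulator("map calls")

val rdd = sc.parallelize(Seq(1, 3, 4, 5, 9))
val mappedRDD = rdd.map { x => calls.add(1); x * x }

// Only the lineage has been recorded so far; the function has not run.
val callsBeforeAction: Long = calls.value
val lineage = mappedRDD.toDebugString

// An action forces evaluation of the whole lineage.
val result = mappedRDD.collect()
val callsAfterAction: Long = calls.value

println(lineage)
println(s"before: $callsBeforeAction, after: $callsAfterAction, result: ${result.mkString(", ")}")
sc.stop()
```

In a clean local run the counter stays at zero until collect() is called and then reflects one call per element. Spark may re-run tasks after failures, so accumulator updates made inside transformations should be treated as a debugging aid rather than an exact count.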

This entry was posted in Big Data and Distributed Systems and tagged apache spark by femibyte. Bookmark the permalink [http://searchdatascience.com/brief-introduction-to-apache-spark/].
