
These are the notes from the Big Data with Spark class.

TODO:
Before we start: was the table created for Spark DataFrame vs. Python vs. SQL?

https://cloudxlab.com/assessment/slide/18/writing-spark-applications?course_id=1
https://cloudxlab.com/assessment/slide/58/spark-project-log-parsing?course_id=1
https://cloudxlab.com/assessment/slide/29/apache-spark-key-value-rdd/1244/project-handling-binary-files?course_id=1

// Read the temperature CSV into an RDD of lines.
val txtRDD = sc.textFile("/data/spark/temps.csv")

// Each line looks like "temp, city, date"; keep (city, temp) pairs.
def cleanRecord(line: String): (String, Int) = {
  val arr = line.split(",")
  (arr(1).trim, arr(0).toInt)
}
val recordsRDD = txtRDD.map(cleanRecord)

// Reduce the temperatures per city down to the maximum.
def max(a: Int, b: Int): Int = if (b > a) b else a
val res = recordsRDD.reduceByKey(max)
res.collect()

(CHICAGO,21), (BLR,23), (SEATLE,25), (NYC,24)
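
(Tying back to the TODO above, a rough sketch of my own, not from the class: the same max-temperature-per-city result with the DataFrame API and with Spark SQL, assuming the spark session from spark-shell and the three-column temp, city, date layout shown in the sample rows below.)

// DataFrame route: read the CSV, name the columns, cast/trim, group by city.
val tempsDF = spark.read.csv("/data/spark/temps.csv")
  .toDF("temp", "city", "date")
  .selectExpr("cast(temp as int) as temp", "trim(city) as city")
tempsDF.groupBy("city").max("temp").show()

// SQL route: register a temp view and query it.
tempsDF.createOrReplaceTempView("temps")
spark.sql("SELECT city, max(temp) AS max_temp FROM temps GROUP BY city").show()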


----

Sample rows from /data/spark/temps.csv:

20, NYC, 2014-01-01
20, NYC, 2015-01-01

val txtRDD = sc.textFile("/data/spark/temps.csv")

// This time keep the date as well: (city, (temp, date)).
def cleanRecord(line: String): (String, (Int, String)) = {
  val arr = line.split(",")
  (arr(1).trim, (arr(0).toInt, arr(2).trim))
}

val recordsRDD = txtRDD.map(cleanRecord)


// recordsRDD now contains pairs like:
[
  ("NYC", (20, "1-1-2018")),
  ("NYC", (21, "2-1-2018")),
  ("SEATLE", (20, "1-1-2017"))
]

==> Grouping
"NYC", [(20, "1-1-2018"), (21, "2-1-2018")]
"SEATLE", [(20, "1-1-2017")]

=> max
"NYC", (21, "2-1-2018")
"SEATLE", (20, "1-1-2017"

// Pick the record with the higher temperature; on a tie, keep the record
// with the (lexicographically) greater date string.
def max(t: (Int, String), t1: (Int, String)): (Int, String) = {
  if (t._1 == t1._1) {
    if (t._2 > t1._2) t else t1
  } else if (t._1 > t1._1) {
    t
  } else {
    t1
  }
}
val res = recordsRDD.reduceByKey(max)
res.collect()
----
def min() = {
  ....
}

val res1 = recordsRDD.reduceByKey(min)

val output = res.union(res1)

def concatenate(min, max) = {
  ...
}
output.reduceByKey(concatenate)
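
A possible completion of the sketch above (my reconstruction, not the instructor's code): reduceByKey needs a (V, V) => V function, so to "concatenate" the max and min records per city they are first turned into strings before the union.

// Minimum-temperature record per city (ties keep the first record seen).
def min(t: (Int, String), t1: (Int, String)): (Int, String) =
  if (t._1 <= t1._1) t else t1

val res1 = recordsRDD.reduceByKey(min)

// After the union every city appears exactly twice (once from res, once
// from res1), so reduceByKey can pair the two records up per city.
val output = res.mapValues(v => "max=" + v).union(res1.mapValues(v => "min=" + v))

def concatenate(a: String, b: String): String = a + ", " + b

output.reduceByKey(concatenate).collect()
// e.g. Array((NYC, "max=(21,2-1-2018), min=(20,1-1-2018)"), ...)

An alternative is res.join(res1), which gives (city, (maxRecord, minRecord)) directly without the string step.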
---

reduceByKey (reduce by city) -> one pair per city, e.g. (nyc, tuple)

sortByKey -> data will be ordered by key
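
A quick illustration of that last point (my addition): sortByKey orders the pairs by the key, here the city name; pass false for descending order.

res.sortByKey().collect()                  // cities in ascending order
res.sortByKey(ascending = false).collect() // cities in descending order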

----

Discuss

----
class Customer {
  // vars in a class body need an initializer; "_" means the default value.
  var name: String = _
  var address: String = _
}

=======
// 100,000 numbers spread over 50 partitions.
val nums = sc.parallelize(1 to 100000, 50)

// mapPartitions gets an iterator over a whole partition and must return an
// iterator; here each partition is collapsed to a single sum.
def mysum(itr: Iterator[Int]): Iterator[Int] = {
  Array(itr.sum).toIterator
}
val partitions = nums.mapPartitions(mysum)

def incrByOne(x: Int): Int = {
  x + 1
}
val partitions1 = partitions.map(incrByOne)
partitions1.collect()

// Cache with the default storage level, then drop the cache again.
partitions1.persist()
partitions1.collect()
partitions1.unpersist()

// Custom storage level: (useDisk, useMemory, useOffHeap, deserialized, replication).
import org.apache.spark.storage.StorageLevel
val mysl = StorageLevel(true, true, false, true, 1)
partitions1.persist(mysl)
partitions1.collect()
:history
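
Side note (mine, not from the class): with those flags the custom level is the same as the built-in MEMORY_AND_DISK level, so an equivalent call is:

partitions1.persist(StorageLevel.MEMORY_AND_DISK)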

Connection to the Database

// In Scala: a class whose instances are shipped to executors must be serializable.
class X extends Serializable {
}
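
A hedged sketch of how this plays out in practice (my illustration, not code from the class): a small config object shipped to the executors extends Serializable, while the JDBC connection itself is created inside foreachPartition on each executor, so it never has to be serialized. The host, table, and credentials are made up.

import java.sql.DriverManager

class DbConfig(val url: String, val user: String, val password: String)
  extends Serializable

val dbConf = new DbConfig("jdbc:mysql://dbhost:3306/mydb", "user", "secret")

recordsRDD.foreachPartition { records =>
  // One connection per partition, opened on the executor.
  val conn = DriverManager.getConnection(dbConf.url, dbConf.user, dbConf.password)
  val stmt = conn.prepareStatement("INSERT INTO temps(city, temp) VALUES (?, ?)")
  records.foreach { case (city, (temp, _)) =>
    stmt.setString(1, city)
    stmt.setInt(2, temp)
    stmt.executeUpdate()
  }
  conn.close()
}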
----

val userData = sc.sequenceFile[UserID, UserInfo]("hdfs://...")
                 // .partitionBy(new HashPartitioner(100))
                 .persist()

// Function called periodically to process a logfile of events in the past 5 minutes;
// we assume that this is a SequenceFile containing (UserID, LinkInfo) pairs.
def processNewLogs(logFileName: String) {
  val events = sc.sequenceFile[UserID, LinkInfo](logFileName)
  val joined = userData.join(events) // RDD of (UserID, (UserInfo, LinkInfo)) pairs
  val offTopicVisits = joined.filter({
    case (userId, (userInfo, linkInfo)) => !userInfo.topics.contains(linkInfo.topic)
  }).count()
  println("Number of visits to non-subscribed topics: " + offTopicVisits)
}

// The same filter written as a named function instead of a case pattern.
def f(t: (UserID, (UserInfo, LinkInfo))): Boolean = {
  val (userId, (userInfo, linkInfo)) = t
  !userInfo.topics.contains(linkInfo.topic)
}
val offTopicVisits = joined.filter(f).count()

userData: [(UserID1, UserInfo1), (UserID2, UserInfo2)]
events:   [(UserID1, LinkInfo1), (UserID2, LinkInfo2)]

res = userData.join(events)
res = [(UserID1, (UserInfo1, LinkInfo1)), (UserID2, (UserInfo2, LinkInfo2))]
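
A short sketch of the partitioned variant hinted at by the commented-out line above (my note): hash-partitioning userData once and persisting it means each later join only shuffles the small events RDD instead of re-shuffling userData every time processNewLogs runs.

import org.apache.spark.HashPartitioner

val userData = sc.sequenceFile[UserID, UserInfo]("hdfs://...")
                 .partitionBy(new HashPartitioner(100))
                 .persist()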
