Yahoo!
Milind Bhandarkar
(milindb@yahoo-inc.com)
About Me
38,000+ Servers
170+ PB of Storage
1000+ Users
[Chart: Hadoop servers and storage, 2006-2010. Left axis: Thousands of Servers (0-90). Right axis: Petabytes (0-250). Series: Hadoop Servers; Hadoop Storage (PB). Timeline labels: Research; Science Impact; Daily Production. Today: 38K Servers, 170 PB Storage, 1M+ Monthly Jobs.]
Hadoop Growth
Hadoop Clusters
Production (30%)
Sample Applications
User Modeling
Computational Advertising
Content
(Web Pages, Blogs, News Articles, Media)
Search Queries
Advertisements
(Display, Search)
User
Major Data Sources
Web Graph Analysis
1+ PB Content
2 Trillion links
Challenge: Scale
Sessionization
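The slides name sessionization without spelling out a procedure. As a rough illustration only, the sketch below groups one user's events into sessions by splitting on a period of inactivity. The function name `sessionize`, the `(user_id, timestamp)` event shape, and the 30-minute gap are all assumptions made for this example, not details from the talk. In a MapReduce setting, grouping by user would correspond to the shuffle and the per-user split to the reduce step.

```python
from collections import defaultdict

# Assumed inactivity threshold (seconds); the slides do not give a value.
SESSION_GAP_SECONDS = 30 * 60


def sessionize(events, gap=SESSION_GAP_SECONDS):
    """Group (user_id, timestamp) events into per-user sessions.

    A new session starts whenever the time since the user's previous
    event exceeds `gap` seconds. Returns {user_id: [[t1, t2, ...], ...]}.
    """
    # Group timestamps by user (the "shuffle" in MapReduce terms).
    by_user = defaultdict(list)
    for user_id, ts in events:
        by_user[user_id].append(ts)

    # Split each user's sorted timestamps on large gaps (the "reduce").
    sessions = {}
    for user_id, stamps in by_user.items():
        stamps.sort()
        user_sessions = [[stamps[0]]]
        for ts in stamps[1:]:
            if ts - user_sessions[-1][-1] > gap:
                user_sessions.append([ts])
            else:
                user_sessions[-1].append(ts)
        sessions[user_id] = user_sessions
    return sessions


if __name__ == "__main__":
    demo = [("u1", 0), ("u1", 600), ("u1", 4000), ("u2", 100)]
    print(sessionize(demo))
```

At the scale the slides describe, the per-user sort would normally be delegated to the framework (for example a secondary sort on timestamp) rather than done in memory as it is here.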
Model Training
User-Actions of Interest
Conversion Events
Incrementally updated
Joining Targets & Features
Regressions
Naive Bayes
Pleasantly parallel
Evaluate metrics
Batch Scoring
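One way to read the "Naive Bayes" and "Pleasantly parallel" bullets is that training reduces to counting, and counts from separate data shards can simply be added together. The sketch below illustrates that idea under stated assumptions: the function names (`count_shard`, `merge`, `score`), the feature and label strings, and the simplified scoring rule (only features present in the example contribute, with add-one smoothing) are hypothetical and not taken from the talk.

```python
import math
from collections import Counter


def count_shard(examples):
    """Map step: count labels and (label, feature) pairs in one shard.

    `examples` is a list of (features, label) pairs, where `features`
    is a list of strings.
    """
    label_counts = Counter()
    feature_counts = Counter()
    for features, label in examples:
        label_counts[label] += 1
        for f in set(features):
            feature_counts[(label, f)] += 1
    return label_counts, feature_counts


def merge(partials):
    """Reduce step: sum the per-shard counts."""
    labels, feats = Counter(), Counter()
    for shard_labels, shard_feats in partials:
        labels.update(shard_labels)
        feats.update(shard_feats)
    return labels, feats


def score(features, label, labels, feats, alpha=1.0):
    """Log-score of a label seen in training, with add-alpha smoothing.

    Simplification: only features present in the example contribute.
    """
    total = sum(labels.values())
    logp = math.log(labels[label] / total)
    for f in set(features):
        logp += math.log((feats[(label, f)] + alpha) / (labels[label] + 2 * alpha))
    return logp


if __name__ == "__main__":
    shards = [
        [(["ad_click", "sports"], "convert"), (["sports"], "no")],
        [(["ad_click"], "convert"), (["news"], "no")],
    ]
    labels, feats = merge(count_shard(s) for s in shards)
    for label in labels:
        print(label, score(["ad_click"], label, labels, feats))
```

Because the merged counts equal the counts over the combined data, each shard can be processed independently. Batch scoring is parallel in the same way, since each example is scored without reference to any other.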
Vijay K Narayanan
Vishwanath Ramarao
Nitin Motgi