April 16, 2013

Yahoo!'s core business
- Make the world's daily habits inspiring and entertaining
- Put brands in the center of people's daily habits

Yahoo! Confidential & Proprietary. 4/18/2013

[Diagram labels: Yahoo!, Users, Adv, Publ]

What problems do we solve?
- Matching content to user
  - Personalized
  - Responsive
- Matching ads to users
  - Maximize yield
  - Maximize return on investment
  - Maintain positive user experience

What is involved?
- Data, lots of it!
  - Metrics
  - Instrumentation
  - Attribution
- Learning
- Econometrics
  - Pricing assets
  - Valuing outcomes
- Mechanism Design
  - Allocation of impressions to advertisers
- Behavioral Sciences
  - Understanding users, what inspires and entertains them

What do we need?
- Science
- Data
- Platforms that analyze data at scale and close the loop
  - On grid solution
  - Horizontally scalable and fault tolerant
  - Interactive
  - Easy to describe sophisticated data mining tasks
  - Quick to prototype, and easy to productionize
  - Few knobs to turn

How do we do this today?

How the AMPLab, Yahoo! Relationship started
- How to cut down ETL, and query on grid directly?
- Inspired by Dremel / enhance with in memory techniques
- Matei's talk on Shark @ Hadoop Summit 2012
- Shark Server
- Further small enhancements and bug fixes
- Meet with Ion and Mike at AMPLab

Why this partnership works
- Mutual goal alignment
- AMPLab's focus on Algorithms, Machines and People
- World leading researchers
- Close collaboration with outstanding PhD students
- Yahoo believes in Open Source, and we contribute back
- Real world applications at truly massive scale
- Many avenues for partnership
  - shared data sets
  - internships
  - collaboration on specific projects of core significance to Yahoo!

Where are we headed?
- End of Q2, Shark will be available on a 50 node cluster (100GB RAM) for advertising analytics.
- One customer facing analytics optimization feature planned on top of Shark.
- Shark/Spark packaged and available to autodeploy on any cluster within Yahoo!
- Mid Q2: start work on 4000 node cluster; productionize YARN patch.
- Bug fixes, memory leak fixes and features like Column Pruning, Map Join, etc. will be checked back into Shark/Spark main branch.
- Upcoming work includes further join optimization, query optimizations specific to analytic workloads, compression, etc.
- Longer term roadmap to enhance on disk performance as well.

Future Architecture

We need your help getting there
- Positions available for internship, full-time & post-docs
- A very active Women In Tech employee resource group
- Labs.yahoo.com for more information about types of Research positions
- Careers.yahoo.com to apply
- Feel free to shoot an email to us
  - Ram Sriharsha, harshars@yahoo-inc.com (for Systems, Audience and Advertising related inquiries)
  - Tarun Bhatia, tarunb@yahoo-inc.com (for Yahoo Labs / Research related inquiries)

SCIENCE-DRIVEN INNOVATION

Yahoo! Labs is HIRING!

We have openings for Research Scientists, Applied Scientists and Research Engineers at our locations in the US, Spain, Israel, India and China. Come join us in solving real world problems at scale, diving into oceans of data, creating new products and experiences, and collaborating in ground-breaking research. We are looking for scientists in many disciplines of Computer Science such as Machine Learning, Natural Language Processing, Statistical Data Analysis, Systems, Computational Advertising, Optimization, Media, Human-Computer Interaction and Mobile Experiences. For more information and to apply please visit http://careers.yahoo.com and search for scientist positions.

Explore the opportunities with Yahoo! Labs

Yahoo! Labs is looking for INTERNS!

Yahoo! Labs is looking for exceptional PhD students to work with us in our intern program for the summer of 2013. We seek world-class graduate students in pursuit of a PhD in Computer Science, Mathematics, Statistics, or a related area. We are particularly interested in students working on Machine Learning, algorithms, Natural Language Processing, Knowledge Representation, HCI, Multimedia, Mobile Innovations, search (systems or algorithms), collaborative filtering, auctions, mechanism design, linear algebra, Systems or analysis of large data. Ideal candidates will have finished at least 2 years of graduate work.

Interns are expected to work with our scientists to perform original research, apply scientific thinking and techniques to improve the performance and effectiveness of our products, and solve problems for our users and advertisers by analyzing mountains of data. They will have the opportunity to publish their work and expand the horizons of web science.

Candidates will need to submit a CV plus a letter of recommendation from their graduate advisor. For more information and to apply, please visit http://y.ahoo.it/uhMpv