
Storm Applied

About the Book

Storm Applied is an example-driven guide
to processing and analyzing real-time data
streams. This immediately useful book
starts by teaching you how to design Storm
solutions the right way. Then, it quickly
dives into real-world case studies that show
you how to scale a high-throughput stream
processor, ensure smooth operation within a
production cluster, and more. Along the way,
you'll learn to use Trident for stateful stream
processing, along with other tools from the
Storm ecosystem.

Authors: Allen, Jankowski, Pathirana, and Montalenti
ISBN: 9789351197980
Pages: 278
Price: ₹ 799/-

Mapping real problems to Storm


Performance tuning and scaling

Practical troubleshooting and


Exactly-once processing with


Storm Applied is a practical guide to using Apache Storm for the real-world tasks associated with processing
and analyzing real-time data streams. This immediately useful book starts by building a solid foundation of
Storm essentials so that you learn how to think about designing Storm solutions the right way from day one.
But it quickly dives into real-world case studies that will bring the novice up to speed with productionizing

About the Authors

Sean Allen, Matthew Jankowski, and Peter Pathirana lead the development
team for a high-volume, search-intensive commercial web application at



Table of Contents
Chapter 1 Introducing Storm
• What is big data?
• How Storm fits into the big data picture
• Why you'd want to use Storm

Chapter 2 Core Storm concepts
• Problem definition: GitHub commit count dashboard
• Basic Storm concepts
• Implementing a GitHub commit count dashboard in Storm

Chapter 3 Topology design
• Approaching topology design
• Problem definition: a social heat map
• Precepts for mapping the solution to Storm
• Initial implementation of the design
• Scaling the topology
• Topology design paradigms

Chapter 4 Creating robust topologies
• Requirements for reliability
• Problem definition: a credit card authorization system
• Basic implementation of the bolts
• Guaranteed message processing
• Replay semantics

Chapter 5 Moving from local to remote topologies
• The Storm cluster
• Fail-fast philosophy for fault tolerance within a Storm
• Installing a Storm cluster
• Getting your topology to run on a Storm cluster
• The Storm UI and its role in the Storm cluster

Chapter 6 Tuning in Storm
• Problem definition: Daily Deals! reborn
• Initial implementation
• Tuning: I wanna go fast
• Latency: when external systems take their time
• Storm's metrics-collecting API

Chapter 7 Resource contention
• Changing the number of worker processes running on a worker node
• Changing the amount of memory allocated to worker processes (JVMs)
• Figuring out which worker nodes/processes a topology is executing on
• Contention for worker processes in a Storm cluster
• Memory contention within a worker process (JVM)
• Memory contention on a worker node
• Worker node CPU contention
• Worker node I/O contention

Chapter 8 Storm internals
• The commit count topology revisited
• Diving into the details of an executor
• Routing and tasks
• Knowing when Storm's internal queues overflow
• Addressing internal Storm buffers overflowing
• Tweaking buffer sizes for performance gain

Chapter 9 Trident
• What is Trident?
• Kafka and its role with Trident
• Problem definition: Internet radio
• Implementing the internet radio design as a Trident
• Accessing the persisted counts through DRPC
• Mapping Trident operations to Storm primitives
• Scaling a Trident topology

Published by:
19-A, Ansari Road, Daryaganj
New Delhi-110 002, INDIA
Tel: +91-11-2324 3463-73, Fax: +91-11-2324 3078

Distributed by:
4435-36/7, Ansari Road, Daryaganj
New Delhi-110 002, INDIA
Tel: +91-11-4363 0000, Fax: +91-11-2327 5895
Regional Offices:
Bangalore: Tel: +91-80-2313 2383, Fax: +91-80-2312 4319, Email:
Mumbai: Tel: +91-22-2788 9263, 2788 9272, Telefax: +91-22-2788 9263, Email: