Sie sind auf Seite 1von 6

1/26/2014

Streams Lab Introduction | streamsdev

IBM developerWorks. Sign in

Lab Introduction | streamsdev IBM developerWorks. Sign in STR EAMSDEV   Docs Downloads Get help

STR EAMSDEV

 
Docs Downloads Get help Events Blog

Docs

Docs Downloads Get help Events Blog

Downloads

Docs Downloads Get help Events Blog

Get help

Docs Downloads Get help Events Blog

Events

Docs Downloads Get help Events Blog

Blog

Search

 
Search  

Labs / Streams Intro Lab

Streams Lab Introduction

  Labs / Streams Intro Lab Streams Lab Introduction KevinJHom / October 26, 2013 / 0

KevinJHom / October 26, 2013 / 0 comments

Lab Introduction KevinJHom / October 26, 2013 / 0 comments InfoSphere Streams Introductory Hands-On Lab, Lab

InfoSphere Streams Introductory Hands-On Lab, Lab Version 1.0 For Use with Streams QuickStart Version 3.2.0.0

Robert Uleman

Introduction – Getting started with InfoSphere Streams

First, if you haven’t done so already, you can download the VM here and the Readme here.

And this is a video walking you through the setup of the QuickStart VM.

a video walking you through the setup of the QuickStart VM.

1/26/2014

Streams Lab Introduction | streamsdev

IBM InfoSphere Streams (“Streams”) enables continuous and fast analysis of massive volumes of moving data to help improve the speed of business insight and decision-making. InfoSphere Streams provides an execution platform and services for user-developed applications that ingest, filter, analyze, and correlate the information in data streams.

InfoSphere Streams includes the following major components:

Streams Runtime Engine – A collection of distributed processes that work together to facilitate the running of stream processing applications on a given set of host computers in a cluster. A single instantiation of these is referred to as a Streams Instance.

Streams Processing Language (SPL) Declarative language and framework for writing stream-processing applications. (This lab does not cover direct SPL programming.)

Streams Studio (“Studio”) – Eclipse-based Development Environment for writing, compiling, running, visualizing, and debugging Streams applications. Development tooling supports graphical (“drag & drop”) design and editing, as well as quick data visualization in a running application. The Instance Graph feature provides different views into the applications running within a Streams Instance, including the display of live data flow metrics and numerous color highlighting schemes for quick understanding and diagnostics of data flows.

InfoSphere Streams Console – The Streams Console is a web-based graphical user interface that is provided by the Streams Web Service (SWS). You can use the Streams Console to monitor and manage your Streams instances and applications from any computer that has HTTPS connectivity to the server that is running SWS. The Streams Console also supports data visualization in charts and tables.

Streamtool – Command-line interface to the Streams Runtime Engine. (This lab does not use the command-line interface.)

1/26/2014

Streams Lab Introduction | streamsdev

Requirements for Virtual Machine Host

64-bit host capable of supporting a 64-bit-guest| streamsdev Requirements for Virtual Machine Host Note In all cases you must enable the BIOS

Note In all cases you must enable the BIOS settings for 64-bit guest support. There will In all cases you must enable the BIOS settings for 64-bit guest support. There will be one or two settings; names vary, but look for “Virtualization Technology” or “VT-x”.

VMware workstation 8.x or newer OR current version of VMware Player or VMware Fusionbut look for “Virtualization Technology” or “VT-x”. Note VMware Player is a free download, so there

Note VMware Player is a free download, so there is no reason not to get the VMware Player is a free download, so there is no reason not to get the latest version. If you are running VMware Workstation 7 or older, you may be able to run this VM by editing the image’s .vmx file

20 GB of free disk (image takes about 10 GB)be able to run this VM by editing the image’s .vmx file 4 GB of RAM

4 GB of RAM (more is better; image set to 2 GB).vmx file 20 GB of free disk (image takes about 10 GB) Internet access (only required

Internet access (only required for the last part of Lab 4) Set VMware network configuration to NAT (VMnet8)about 10 GB) 4 GB of RAM (more is better; image set to 2 GB) Verify

of Lab 4) Set VMware network configuration to NAT (VMnet8) Verify that no firewall gets in

Verify that no firewall gets in the wayof Lab 4) Set VMware network configuration to NAT (VMnet8) 4-core CPU (image set to take

4-core CPU (image set to take 2 CPUs but may be reduced to one if necessary; more is good but not required)to NAT (VMnet8) Verify that no firewall gets in the way Goal and objectives The goal

Goal and objectives

The goal of this lab is to:

Provide an opportunity to interact with many of the components and features of Streams Studio while developing a small sample application. While the underlying SPL code is always accessible, this lab requires no direct programming and will not dwell on syntax and other features of SPL.

The objectives of this lab (along with the accompanying presentation) are to:

Provide a basic understanding of stream computingthis lab (along with the accompanying presentation) are to: Provide a brief overview of the fundamental

Provide a brief overview of the fundamental concepts of InfoSphere Streamsare to: Provide a basic understanding of stream computing Introduce the InfoSphere Streams runtime environment

Introduce the InfoSphere Streams runtime environmentoverview of the fundamental concepts of InfoSphere Streams Introduce InfoSphere Streams Studio A brief introduction to

Introduce InfoSphere Streams Studio A brief introduction to navigating the IDE Create and open projects Submit and cancel jobs View jobs, health, metrics, and data in the Instance GraphStreams Introduce the InfoSphere Streams runtime environment Introduce the Streams Studio graphical editing environment

Introduce the Streams Studio graphical editing environment Apply several built-in operators Design your first SPL application and enhance itView jobs, health, metrics, and data in the Instance Graph Introduce the data visualization capabilities in

Introduce the data visualization capabilities in the Streams Consoleoperators Design your first SPL application and enhance it

the data visualization capabilities in the Streams Console
the data visualization capabilities in the Streams Console
the data visualization capabilities in the Streams Console
the data visualization capabilities in the Streams Console
the data visualization capabilities in the Streams Console
the data visualization capabilities in the Streams Console

1/26/2014

Streams Lab Introduction | streamsdev

Overview

This hands-on lab provides a broad introduction to the components and tools of InfoSphere Streams.

to the components and tools of InfoSphere Streams. Figure 1. InfoSphere Streams lab overview The lab

Figure 1. InfoSphere Streams lab overview

The lab is based on a trivial example from a Smarter Cities automotive scenario: handling vehicle locations and speeds (and other variables). This is used to illustrate a number of techniques and principles, including:

Graphical design of an application graph and using a Properties view to configure program detailsillustrate a number of techniques and principles, including: Runtime visualization of Streams applications and data flow

Runtime visualization of Streams applications and data flow metricsand using a Properties view to configure program details Application integration using importing and exporting of

Application integration using importing and exporting of data streams based on stream properties.visualization of Streams applications and data flow metrics Figure 1 provides a graphical overview of the

Figure 1 provides a graphical overview of the completed lab. The lab environment includes five Eclipse “workspaces” (directories) numbered 1 through 5. Labs build on one another, but each workspace already contains a project with what would be the results of the previous lab. (Workspace 1 is empty; workspace 5 has the final result but there are no instructions for modifying it.) This way, attendees can experiment and get themselves in trouble any way they like in any lab and still go on to the next lab with everything in place, simply by switching workspaces.

The lab is broken into four parts:

Lab 1 A simple Streams app. Start a Streams Instance. Open Streams Studio; explore its views. Create an SPL application project; create an application graph with three operators; configure the operators. Run the application and verify the results.

Lab 2 Enhance the app: add the ability to read multiple files from a given directory and slow down the flow so you can watch things happen. Learn about jobs and PEs. Use the Instance Graph to monitor the stream flows and show data.

Lab 3 Enhance the app: add an operator to compute the average speed every five observations, separately for two cars. Use the Streams Console to visualize results.

Lab 4 Enhance the app: add an operator to check the vehicle ID format and separate records with an unexpected ID structure onto an error stream. Use exported application streams to create a modular application. If possible (internet access required), bring in live data.

If you want some more information about the labs, the following is a really nice Video

1/26/2014

Streams Lab Introduction | streamsdev

1/26/2014 Streams Lab Introduction | streamsdev Start Lab 1. Need Help? Ask the forum! Like 0

Start Lab 1.

Need Help? Ask the forum!

Like 0 Tweet 1 0
Like
0
Tweet
1
0

Leave a comment

1/26/2014

Streams Lab Introduction | streamsdev

You must be logged in to post a comment.

IBM