Chapter 4: Data Description and Visualization

Contents

1 Introduction

2 Description and visualization of business processes

3 Description and visualization of data in customer perspective

4 Basic visualization techniques

5 Reporting

6 Summary & outlook

© W. Grossmann, S. Rinderle-Ma, University of Vienna – Chapter 4: Data Description and Visualization


1 Introduction

- Each BI project starts with an understanding of the data, based on
• Description of data
• Visualization of data
- Visualization also plays an important role in displaying
• analysis models and
• analysis results
• → Visual analytics
- Finally, reporting summarizes and conveys the analysis results to
stakeholders, e.g., the management



1 Introduction

Information needs
- Data description and visualization for business processes:
• Structure and usage
• Production and organization perspective
• Event view
• Static and dynamic
- Data description and visualization for collections of business process
instances:
• Observation over a certain period of time
• Customer perspective
• Cross-sectional and state view
- Data description and visualization for reporting:
• High-level reports on all BI activities
• Put into context of business goals





2 Description and visualization of business processes

Different visualizations depending on process life cycle phase

- Design time: process models; often represented as graphs
- Examples:
• BPMN [1] (standard)
• Event-driven process chains [2]
• YAWL [3]
• WSM Nets [4]
• CPEE Nets [5]

[1] http://www.bpmn.org
[2] https://www.ariscommunity.com/event-driven-process-chain
[3] http://www.yawlfoundation.org/yawlbook/table-of-contents.html
[4] Stefanie Rinderle, Manfred Reichert, Peter Dadam: Correctness criteria for dynamic changes in workflow systems - a survey. Data Knowl. Eng. 50(1): 9-34 (2004)
[5] cpee.org



2 Description and visualization of business processes

Different visualizations depending on process life cycle phase

- Runtime: visualization of process execution data for different
instances
- Process execution logs
- Markings reflecting the instance and activity states
- Or both: logs make it possible to restore the instance and activity states at
any time
- Next slide shows BPMN and Petri Net process models with
markings and logs
- In addition: declarative process models, i.e., models that describe the process
logic based on rules
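
The claim that logs allow instance and activity states to be restored at any time can be illustrated with a small replay. This is only a sketch: the log layout (`LogEntry`), the event names "start" and "complete", and the state names are assumptions made for illustration and do not come from a particular process engine.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LogEntry:
    timestamp: int   # hypothetical logical clock
    instance: str    # process instance id
    activity: str
    event: str       # assumed event types: "start" or "complete"


# Assumed mapping from logged events to activity states.
STATE_AFTER = {"start": "running", "complete": "completed"}


def restore_states(log, instance, until):
    """Replay the log up to `until`; return activity -> state for one instance."""
    states = {}
    for entry in sorted(log, key=lambda e: e.timestamp):
        if entry.instance == instance and entry.timestamp <= until:
            states[entry.activity] = STATE_AFTER[entry.event]
    return states


log = [
    LogEntry(1, "I1", "Check order", "start"),
    LogEntry(2, "I1", "Check order", "complete"),
    LogEntry(3, "I1", "Ship goods", "start"),
    LogEntry(4, "I2", "Check order", "start"),
]
print(restore_states(log, "I1", until=3))
```

Activities that do not appear in the replayed part of the log are simply absent from the result, which corresponds to "not yet activated" in a marking.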



2 Description and visualization of business processes

© 2015 Springer-Verlag Berlin Heidelberg



2 Description and visualization of business processes

Layout features compared across modeling tools (Signavio®, ARIS Express, YAWL4Study, IBM Business Modeler Advanced Version 7.0):
- Vertical and horizontal alignment
- Element alignment
- Horizontal alignment
- Alignment of satellite objects
- Background color
- Background picture
- Element color
- Space between elements
- Import and assignment of user-defined icons



2 Description and visualization of business processes

Further aspects:
- Visualization of process perspectives:
• Control flow: as discussed before (→ production)
• Data flow: data elements and data connectors (→ customer)
• Organizational information (→ organization)
  • Within process models:
    • Organizational elements
    • Swimlanes
  • Outside process model
    • Plus organizational model (organigram)



2 Description and visualization of business processes

➢ BPMN and EPC use data elements and connectors within the models
➢ BPMN uses swimlanes for organizational information
➢ EPC has organizational elements



2 Description and visualization of business processes

Organizational models
- Typical elements
• Roles
• Organizational units
• Actors
- Typical relations
• Actor has role
• Role 1 is specialized with respect to role 2
• Role belongs to organizational unit
• Organizational unit 1 is subordinated to organizational unit 2
- Visualizations as graph, table, list
- With existing approaches, the graph can become quite large
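
The elements and relations above can be written down as plain lookup tables. The following sketch uses invented actors, roles, and units; the direction of the specialization relation (specialized role → more general role) is an assumption.

```python
# Hypothetical organizational model using the four relation types listed above.
actor_has_role = {"alice": {"professor"}, "bob": {"secretary"}}
# Assumed direction: specialized role -> more general role.
role_specializes = {"professor": "lecturer", "lecturer": "employee",
                    "secretary": "employee"}
role_in_unit = {"professor": "Research Group A", "secretary": "Dean's Office"}
unit_subordinated_to = {"Research Group A": "Faculty", "Dean's Office": "Faculty"}


def all_roles(actor):
    """Roles of an actor, including more general roles reached via specialization."""
    roles = set()
    for role in actor_has_role.get(actor, ()):
        while role is not None and role not in roles:
            roles.add(role)
            role = role_specializes.get(role)
    return roles


def unit_chain(role):
    """Organizational unit of a role, followed by all superordinate units."""
    chain = []
    unit = role_in_unit.get(role)
    while unit is not None:
        chain.append(unit)
        unit = unit_subordinated_to.get(unit)
    return chain


print(sorted(all_roles("alice")))
print(unit_chain("professor"))
```

Each dictionary corresponds to one edge type of the organigram graph, so the same data can also be rendered as a table or list.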



2 Description and visualization of business processes

➢ Excerpt of the organigram of a midsized faculty, modeled using ARIS Express



2 Description and visualization of business processes

➢ Find information more quickly

Simone Kriglstein, Juergen Mangler, Stefanie Rinderle-Ma: Who is who: On visualizing organizational models in Collaborative Systems. CollaborateCom 2012: 279-288

2 Description and visualization of business processes

Challenges:
- Wallpaper processes on limited screen size
• Views [6] and abstraction [7]
- Large number of process instances:
• Selection or multimodal approaches, e.g., sonification [8]
- Visualizing change information:
• Change tracking [9] or change trees [10]

[6] Ralph Bobrik, Manfred Reichert, Thomas Bauer: View-Based Process Visualization. BPM 2007: 88-95
[7] Sergey Smirnov, Hajo A. Reijers, Mathias Weske, Thijs Nugteren: Business process model abstraction: a definition, catalog, and survey. Distributed and Parallel Databases 30(1): 63-99 (2012)
[8] Tobias Hildebrandt, Thomas Hermann, Stefanie Rinderle-Ma: Continuous sonification enhances adequacy of interactions in peripheral process monitoring. Int. J. Hum.-Comput. Stud. 95: 54-65 (2016)
[9] Sonja Kabicher, Simone Kriglstein, Stefanie Rinderle-Ma: Visual Change Tracking for Business Process Models. ER 2011: 504-513
[10] Georg Kaes, Stefanie Rinderle-Ma: Mining and Querying Process Change Information Based on Change Trees. ICSOC 2015: 269-284



3 Description and visualization of data in customer perspective

Visualization of a collection of process instances


- Definition of data: defined by a number of variables; one of the following
structures
• Multidimensional tables
• Simple data structures
• Complex data structures
- Mapping of data: aesthetic attributes used for display
- Definition of layers based on
• Statistical information to be displayed
• Geometric object used for displaying statistics
• Aesthetic mapping and position for geometric object
• Coordinate system
- Facet specification: defines small multiples for displaying subsets of the
entire data set



3 Description and visualization of data in customer perspective

Definition of data
- Which process instances? Depending on instance attributes, e.g.,
• time interval of interest
• customer
- Cross-sectional and state view
- Event view → Chapter 7
- Which attributes?
• Depends on analysis question, e.g., cargo temperature in a logistics
process
• Also: data transformations of existing attributes, for example, in the
state view, summary characteristics of the time series from each instance, or
first-order differences
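
As a small illustration of such transformations in the state view, the sketch below derives summary characteristics and first-order differences from one time series per process instance. The temperature values and instance ids are invented.

```python
from statistics import mean


def first_order_differences(series):
    """Differences between consecutive observations of one time series."""
    return [b - a for a, b in zip(series, series[1:])]


def summarize(series):
    """Time-independent summary characteristics of one instance's series."""
    return {"min": min(series), "max": max(series), "mean": mean(series)}


# Hypothetical cargo temperatures (degrees Celsius), one series per instance.
temperatures = {"I1": [4.0, 4.5, 6.0, 5.0], "I2": [3.0, 3.0, 3.5]}
for instance, series in temperatures.items():
    print(instance, summarize(series), first_order_differences(series))
```

The summaries turn each series into a fixed set of columns, so the instances fit into a simple matrix again.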



3 Description and visualization of data in customer perspective

Data structures:
- Multidimensional (pivot) tables, defined by:
• Values of qualitative variables (dimensions)
• Summary attribute for the cells (see also multidimensional data
structures for data warehouses in Chapter 3)
- For process instances:
• Simple matrix with rows representing process instances and columns
representing variable values; possibly nested
• Complex structures for cross-sectional and state view; here the
attributes refer to a sequence of values, together with the temporal
information
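
A pivot table as described above can be sketched in a few lines: two qualitative variables act as dimensions and a summary function fills the cells. The records and the attribute names (`region`, `gender`, `sales`) are assumptions for illustration.

```python
from collections import defaultdict


def pivot(rows, row_dim, col_dim, value, summary=sum):
    """Group rows by two dimensions and summarize `value` within each cell."""
    cells = defaultdict(list)
    for r in rows:
        cells[(r[row_dim], r[col_dim])].append(r[value])
    return {key: summary(vals) for key, vals in cells.items()}


# Hypothetical simple matrix: one row per process instance.
instances = [
    {"region": "North", "gender": "f", "sales": 10},
    {"region": "North", "gender": "m", "sales": 5},
    {"region": "North", "gender": "f", "sales": 7},
    {"region": "South", "gender": "m", "sales": 3},
]
print(pivot(instances, "region", "gender", "sales"))               # cell sums
print(pivot(instances, "region", "gender", "sales", summary=len))  # cell counts
```

Passing `len` as the summary yields the frequency table that bar charts and mosaic plots start from.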



3 Description and visualization of data in customer perspective

Mapping:
- Defines for each variable how it is represented in the graphics
- Basic aesthetic attributes:
• Axis
• Color
• Size
• Shape
- Quantitative variable → axis
- Qualitative variable → shape
- Scale: mapping of variable values to aesthetic attributes
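
A scale can be thought of as a function from variable values to values of an aesthetic attribute. The sketch below shows one possible linear scale for an axis and one discrete scale for shapes; the function names and the shape palette are invented for illustration.

```python
def linear_scale(domain, pixel_range):
    """Map a quantitative value from `domain` linearly onto an axis range."""
    (d0, d1), (r0, r1) = domain, pixel_range

    def scale(value):
        return r0 + (value - d0) / (d1 - d0) * (r1 - r0)

    return scale


def shape_scale(levels, shapes=("circle", "triangle", "square", "cross")):
    """Map each level of a qualitative variable to a plotting shape."""
    mapping = {level: shapes[i % len(shapes)] for i, level in enumerate(levels)}
    return mapping.__getitem__


x = linear_scale(domain=(0, 100), pixel_range=(0, 400))
shape = shape_scale(["f", "m"])
print(x(25), shape("m"))
```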



3 Description and visualization of data in customer perspective

Definition of layers
- Specification of statistical transformations, e.g.,
• Identity transformation: display variable values
• Summary transformation: calculate univariate characteristics, e.g.,
mean, median
• Transformation for histogram: define bins and count observations
• Calculation of regression line
- Transformations are represented using geometric objects, e.g.,
points, lines, intervals, polygons
- Geometric objects are mapped to aesthetic variables
- Avoid overplotting by position specification or jittering
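
As an example of a statistical transformation in a layer, the regression line mentioned above can be computed by ordinary least squares. This is a minimal sketch with invented data; the resulting slope and intercept would then be drawn with a line as geometric object.

```python
from statistics import mean


def regression_line(xs, ys):
    """Ordinary least squares fit; returns (slope, intercept) of y = slope*x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, y_bar - slope * x_bar


# Hypothetical observations lying exactly on y = 2x + 1.
slope, intercept = regression_line([1, 2, 3, 4], [3, 5, 7, 9])
print(slope, intercept)
```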



3 Description and visualization of data in customer perspective

Coordinate system:
- Defines location of points in space
- Examples: Cartesian or polar coordinate systems
Facets:
- Bind together different graphical displays
- Display aspects under different conditions
- Alternative to putting everything in one graphic using different
aesthetic attributes
- See also conditioning plots or trellis plots



3 Description and visualization of data in customer perspective

Interactive and dynamic visualizations:


- Explorative analysis: play around with different graphics
parameters such as aesthetics and geometry
- Goal: get better / different insight into data
- Interactive and dynamic visualization can help through
view specification and manipulation as well as through support for the
process and provenance of the conducted analysis
➢ View specification: interaction between user and
visualization in definition phase (see next slide for
GGobi)



3 Description and visualization of data in customer perspective

Interface for view specification in GGobi:


http://www.ggobi.org



3 Description and visualization of data in customer perspective

Interactive and dynamic visualizations:


- View manipulation: selecting objects with specific features by
• Queries: exact information about data points, e.g., the coordinates
  • Standard method: mouseover
• Selection: specify subsets of interest in an interactive way
  • Filtering the data
  • Selecting and reordering variables
  • Individual selection of objects
  • Selection of ranges for continuous variables
  • Methods: highlighting, brushing, painting
    • Brushing: dynamic selection
    • Painting: permanent selection
• Linking: propagate selections from one facet to the other facets
• Zooming: changes size, but also level of detail

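
Brushing and linking can be reduced to two small operations on object identifiers: a range selection in one view and the propagation of the selected ids to another view. The sketch below is only an illustration of that idea; the point ids and coordinates are invented and no GUI toolkit is involved.

```python
def brush(points, x_range, y_range):
    """Return the ids of all points inside a rectangular brush."""
    (x0, x1), (y0, y1) = x_range, y_range
    return {pid for pid, (x, y) in points.items()
            if x0 <= x <= x1 and y0 <= y <= y1}


def link(selected_ids, other_view_ids):
    """Propagate a selection: flag the same objects in another facet."""
    return {pid: pid in selected_ids for pid in other_view_ids}


# Hypothetical scatter plot: object id -> (x, y)
points = {"c1": (1.0, 2.0), "c2": (5.0, 5.0), "c3": (2.0, 2.5)}
selected = brush(points, x_range=(0, 3), y_range=(0, 3))
print(sorted(selected))
print(link(selected, ["c1", "c2", "c3"]))
```

Linking works because every facet refers to the same object ids, regardless of which variables it displays.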
3 Description and visualization of data in customer perspective

Selection techniques in HighCharts on HEP data


www.highcharts.com



3 Description and visualization of data in customer perspective

Dynamic graphics
- Additional elements:
• Rotating axis
• Dynamic transformation of the axes
• Motion charts: presentation of time series, allow for dynamic presentation of temporal behavior
• Examples can be found at http://www.gapminder.org/



4 Basic visualization techniques

Qualitative Information
- Data structure: pivot table providing frequencies of value combinations
for different attributes (absolute, percentage)
- Bar charts and pie charts
• One variable, absolute: bar chart
• One variable, relative: bar chart, pie chart
• Multiple variables: stacked or clustered bar chart
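
The data behind a bar chart or pie chart is a frequency table. A minimal sketch, with invented grade values, that computes absolute and relative frequencies for one qualitative variable:

```python
from collections import Counter


def frequencies(values):
    """Return level -> (absolute frequency, relative frequency)."""
    counts = Counter(values)
    total = sum(counts.values())
    return {level: (n, n / total) for level, n in counts.items()}


# Hypothetical qualitative variable
grades = ["good", "poor", "good", "average"]
for level, (absolute, relative) in frequencies(grades).items():
    # A text bar stands in for the bar of a bar chart.
    print(f"{level:8s} {'#' * absolute} {relative:.0%}")
```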

R package ggplot2

4 Basic visualization techniques

Qualitative Information
- Mosaic plots
• Two or more variables
• All data are represented as a square
• Horizontal edge is split according to the proportions of the first variable;
the resulting rectangles correspond to relative frequencies
• Then each rectangle is divided according to the conditional probability of the
second variable given the value of the first one
• For further variables, the rectangles are split alternately along the horizontal
and vertical axes based on conditional probabilities
• Result: the area of each rectangle represents the frequency of occurrence of
that particular combination of variable values
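
The construction for two variables can be sketched as follows: widths come from the proportions of the first variable, heights from the conditional proportions of the second. The observations are invented, and placing the levels in sorted order is an arbitrary choice for this example.

```python
from collections import Counter


def mosaic_rectangles(pairs):
    """Return {(a, b): (x, y, width, height)} inside the unit square."""
    n = len(pairs)
    first = Counter(a for a, _ in pairs)   # frequencies of the first variable
    joint = Counter(pairs)                 # joint frequencies
    rects, x = {}, 0.0
    for a in sorted(first):
        width = first[a] / n               # proportion of level a
        y = 0.0
        for b in sorted({b for a2, b in pairs if a2 == a}):
            height = joint[(a, b)] / first[a]   # conditional proportion of b given a
            rects[(a, b)] = (x, y, width, height)
            y += height
        x += width
    return rects


# Hypothetical observations: (gender, grade)
pairs = ([("f", "good")] * 3 + [("f", "poor")] * 1
         + [("m", "good")] * 2 + [("m", "poor")] * 2)
for key, rect in mosaic_rectangles(pairs).items():
    print(key, rect)
```

Width times height of each rectangle equals the relative frequency of the combination, e.g. 0.5 × 0.75 = 3/8 for ("f", "good").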



4 Basic visualization techniques
R graphics

Interpretation of left chart: female students receive "good" or "poor" grades more often than male students



4 Basic visualization techniques

Qualitative Information
- Tree maps
• Represent a nested hierarchy of groups by nested rectangles
• Additional attributes represented by colors

R package treemap

Interpretation:
➢ 21 outlets in 5 regions
➢ Sales in regions and outlets represented by size of rectangles
➢ Example: "dominant" Outlet1_4 in Region1

4 Basic visualization techniques

Quantitative Information
- Histogram
• Value range of the variable is divided into non-overlapping classes, so-called
bins
• Number of observations per bin is counted and displayed by the heights of the
bars per bin
• Absolute
• Relative
• Density: area of the bars corresponds to relative frequency of the bin
• Density estimates: possibly after a transformation, e.g., logarithmic
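
The three variants (absolute, relative, density) differ only in how the bin counts are scaled. A minimal sketch with equal-width bins and invented values; the convention that the upper boundary belongs to the last bin is an assumption.

```python
def histogram(values, lower, upper, n_bins):
    """Equal-width bins on [lower, upper]; returns (counts, relative, density)."""
    width = (upper - lower) / n_bins
    counts = [0] * n_bins
    for v in values:
        if lower <= v <= upper:
            # The upper boundary is assigned to the last bin.
            i = min(int((v - lower) / width), n_bins - 1)
            counts[i] += 1
    n = sum(counts)
    relative = [c / n for c in counts]
    density = [r / width for r in relative]  # bar area = relative frequency
    return counts, relative, density


counts, relative, density = histogram([1, 2, 2, 3, 7, 8, 9, 10], 0, 10, 2)
print(counts, relative, density)
```

With the density scaling, the bar areas sum to 1, which makes histograms with different bin widths comparable.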



4 Basic visualization techniques
R package ggplot2

Interpretation: the assumption of a normal distribution of sales is not justified, mainly due to the number of low sales



4 Basic visualization techniques

Quantitative Information
- Boxplots
• Represent the distribution of a quantitative variable
• Often used for displaying value distributions of different groups, e.g.,
age groups
• 25% and 75% quantiles define the box, which contains the middle 50% of the
observations
• Whiskers mark the area in which the values are expected to lie if they
follow a normal distribution
• Values outside the whiskers are considered outliers and deserve special
attention
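
The numbers behind a boxplot can be computed as sketched below. The slides do not state how the whiskers are determined; this example assumes the common convention of 1.5 times the interquartile range beyond the box, and the quantile method ("inclusive") is likewise an assumption. The data are invented.

```python
from statistics import quantiles


def boxplot_stats(values, whisker_factor=1.5):
    """Quartiles, whisker ends, and outliers (assumed 1.5 * IQR rule)."""
    q1, median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence = q1 - whisker_factor * iqr
    high_fence = q3 + whisker_factor * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = [v for v in values if v < low_fence or v > high_fence]
    return {"q1": q1, "median": median, "q3": q3,
            "whiskers": (min(inside), max(inside)), "outliers": outliers}


stats = boxplot_stats([1, 2, 3, 4, 5, 6, 7, 8, 9, 30])
print(stats)
```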



4 Basic visualization techniques
R graphics

Interpretation: female and male customers consume Service 3 with almost the same distribution, but there are more outliers in the female group, i.e., customers consuming Service 3 exceptionally often



4 Basic visualization techniques
Quantitative Information
- QQ Plots
• Goal: compare distribution of the variable with normal distribution, based on the quantiles

R graphics

Interpretation: the deviation from the normal distribution due to the low sales is confirmed by the QQ plots

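
The points of a normal QQ plot pair each ordered observation with the corresponding quantile of a normal distribution. In the sketch below the reference distribution is fitted with the sample mean and standard deviation, and the plotting positions (i + 0.5)/n are one common choice; both are assumptions, and the data are invented.

```python
from statistics import NormalDist, mean, stdev


def qq_points(values):
    """Pairs (theoretical normal quantile, observed value) for a QQ plot."""
    n = len(values)
    reference = NormalDist(mean(values), stdev(values))
    ordered = sorted(values)
    # Assumed plotting positions: (i + 0.5) / n for i = 0, ..., n - 1
    return [(reference.inv_cdf((i + 0.5) / n), x) for i, x in enumerate(ordered)]


qq = qq_points([2.0, 4.0, 6.0, 8.0, 10.0])
for theoretical, observed in qq:
    print(f"{theoretical:6.2f} {observed:6.2f}")
```

If the variable is approximately normal, the points lie close to the diagonal; systematic deviations at one end indicate skewness or heavy tails.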
4 Basic visualization techniques
Quantitative Information
- Contour Plots
• Display the joint distribution of two variables based on two-dimensional densities

R package ggplot2

Interpretation: peak of the density is around the values Sales = 5 and Average Sales = 0.5; some kind of stability of customer behavior over time.

4 Basic visualization techniques
Relationships
- Correlation and Heat Maps
• Correlation coefficient between values of two variables (absolute value <0.2 no, between 0.2 and 0.5 weak, between 0.5 and 0.8 medium, >0.8 strong correlation)
• Displayed by color-coding in heat map

R package corrplot

Interpretation: strong correlation (dark blue) between, e.g., helpfulness and friendliness

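
The coefficient and the verbal classification can be sketched as follows. The use of the Pearson coefficient and of its absolute value for the thresholds are assumptions; the slide only gives the threshold values. The ratings are invented.

```python
from math import sqrt
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    return sxy / sqrt(sxx * syy)


def strength(r):
    """Verbal classification using the thresholds from the slide (on |r|)."""
    a = abs(r)
    if a < 0.2:
        return "no"
    if a < 0.5:
        return "weak"
    if a <= 0.8:
        return "medium"
    return "strong"


# Hypothetical ratings of two survey items
helpfulness = [1, 2, 3, 4, 5]
friendliness = [2, 2, 3, 5, 5]
r = pearson(helpfulness, friendliness)
print(round(r, 2), strength(r))
```

A heat map applies this computation to every pair of variables and color-codes the resulting matrix.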
4 Basic visualization techniques

Relationships
- Scatter plots
• Represent the relationships between k variables based on
k(k − 1)/2 plots in a scatter plot matrix
• Additional layers:
• Smoothing curves showing the relationship between the variables
(→ Chapter 5)
• If qualitative variable is used for grouping, colors can represent the
different groups
• Frequency distributions in the diagonal of the matrix



4 Basic visualization techniques
R package car

Interpretation:
➢ all frequency distributions are skewed to the right
➢ for all plots: linear trend line and a smoothed trend line
➢ positive relationship between average sales and actual sales
➢ relationship is rather scattered for larger sales
➢ almost no correlation between the duration of the customer relationship and sales
➢ for larger average sales there seems to be almost no relationship



4 Basic visualization techniques

Relationships
- Projections and Principal Components:
• Representation of multivariate data in two or three dimensions
• Given variables X1, ..., Xk, for each Xi a principal component PCi is defined as
follows:
• $PC_i := \sum_{j=1}^{k} \alpha_{ij} X_j$, i.e., PCi is a linear combination of X1, ..., Xk
• For the first principal component, the coefficients are determined in such a way that PC1 explains
as much as possible of the overall variance of the observations.
• Given PC1, ..., PCi, PCi+1 is defined orthogonal to PC1, ..., PCi and explains as much as
possible of the remaining variance of the observations
• Often, PC1 and PC2 together already represent a large share of the variability in the data, e.g., around 80%
• Scatter plot of PC1, PC2
• Biplot: displays the observation points as well as the variables in the coordinate
system defined by the first two principal components.
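
To make the definition of PC1 concrete, the sketch below computes the coefficient vector of the first principal component by power iteration on the sample covariance matrix. Power iteration is just one simple way to obtain the dominant eigenvector and is chosen here only because it needs no external library; the data, the start vector, and the number of iterations are assumptions.

```python
from math import sqrt
from statistics import mean


def covariance_matrix(rows):
    """Sample covariance matrix of observations given as equal-length tuples."""
    k, n = len(rows[0]), len(rows)
    means = [mean(r[j] for r in rows) for j in range(k)]
    return [[sum((r[a] - means[a]) * (r[b] - means[b]) for r in rows) / (n - 1)
             for b in range(k)] for a in range(k)]


def first_principal_component(rows, iterations=200):
    """Return (coefficients of PC1, share of total variance explained by PC1)."""
    cov = covariance_matrix(rows)
    k = len(cov)
    v = [1.0 / sqrt(k)] * k  # assumed start vector, not orthogonal to PC1 here
    for _ in range(iterations):
        w = [sum(cov[a][b] * v[b] for b in range(k)) for a in range(k)]
        norm = sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    explained = sum(v[a] * cov[a][b] * v[b] for a in range(k) for b in range(k))
    total = sum(cov[a][a] for a in range(k))
    return v, explained / total


# Hypothetical observations of two variables
data = [(2.0, 1.0), (3.0, 5.0), (4.0, 3.0), (5.0, 7.0), (6.0, 9.0)]
coefficients, share = first_principal_component(data)
print([round(c, 3) for c in coefficients], round(share, 3))
```

Further components would be obtained by removing the direction already found (deflation) and repeating the iteration; statistical software typically uses an eigen or singular value decomposition instead.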



4 Basic visualization techniques
R statistics

Interpretation:
➢ The first principal component accounts for 74% of the variability, the second one for 12%
➢ Helpfulness and Competence are evaluated similarly
➢ Eco-friendliness is evaluated differently



4 Basic visualization techniques

Temporal data
- Use time-independent summaries such as mean and display them using
visualization techniques as discussed before, e.g., boxplots
- Or visualize the state variable for each process instance as a function of
time

R package lattice

Interpretation:
• One curve per woman
• For women going to hospital, proteinuria increased steeply between days 48 and 100
• Different from women not going to hospital

4 Basic visualization techniques

Interactive and dynamic visualization


- Parallel coordinates
• Displaying high-dimensional data
• Variables are placed on the x-axis, equally spaced
• Values of the variables form the ordinates and are connected by polygonal lines
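
The geometry of a parallel coordinates display can be sketched without any plotting library: each variable gets an equally spaced x position, each value is rescaled to [0, 1] on its own axis, and every observation becomes one polygonal line. The min-max rescaling per variable and the sample rows are assumptions for this example.

```python
def parallel_coordinates(rows, variables):
    """Return one polyline [(x, y), ...] per row; needs at least two variables."""
    lo = {v: min(r[v] for r in rows) for v in variables}
    hi = {v: max(r[v] for r in rows) for v in variables}

    def y(v, value):
        span = hi[v] - lo[v]
        return 0.5 if span == 0 else (value - lo[v]) / span

    step = 1 / (len(variables) - 1)   # equal spacing of the axes
    return [[(i * step, y(v, r[v])) for i, v in enumerate(variables)]
            for r in rows]


# Hypothetical customer data
customers = [
    {"sales": 0, "age": 20, "visits": 1},
    {"sales": 5, "age": 25, "visits": 3},
    {"sales": 10, "age": 40, "visits": 2},
]
lines = parallel_coordinates(customers, ["sales", "age", "visits"])
for line in lines:
    print(line)
```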



4 Basic visualization techniques

GGobi



5 Reporting

Metadata and Data Quality
- Important: visualization of missing values

R package VIM

0 represents "no missing value"; 1739 cases have no missing values; in the matrix, red cells represent missing values
Interpretation: Perf shows the highest number of missing values over all combinations, but only the second highest number alone

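
The two kinds of counts referred to in such a display, missing values per variable and frequencies of missingness combinations, can be sketched as follows. The records are invented, missing values are represented by `None`, and the variable names other than Perf are hypothetical.

```python
from collections import Counter


def missing_per_variable(rows, variables):
    """Number of missing values for each variable."""
    return {v: sum(r.get(v) is None for r in rows) for v in variables}


def missing_patterns(rows, variables):
    """Frequency of each combination (1 = missing, 0 = observed)."""
    return Counter(tuple(int(r.get(v) is None) for v in variables) for r in rows)


# Hypothetical records; None marks a missing value
records = [
    {"Perf": None, "Age": 30},
    {"Perf": 1.2, "Age": None},
    {"Perf": None, "Age": None},
    {"Perf": 2.0, "Age": 25},
]
variables = ["Perf", "Age"]
print(missing_per_variable(records, variables))
print(missing_patterns(records, variables))
```

The pattern `(0, 0)` counts the complete cases, which corresponds to the row of zeros in the matrix display.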
5 Reporting

Metadata and Data Quality


- Radar plot: quality criteria such as accuracy, completeness, and
timeliness, each with a value scale, e.g., from 0 to 10, arranged as a spider
web

Excel radar plot

Interpretation:
• Dataset1 seems to be reliable, relevant, and consistent, but less complete, and even less accurate and timely.
• Dataset2 shows the opposite picture: it is seemingly complete, accurate, and timely, but lacks relevance and consistency
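
The spider-web arrangement amounts to placing one axis per criterion at equal angles and plotting each score as a distance from the center. The sketch below computes the vertex coordinates; the scores are invented and are not the values of Dataset1 or Dataset2, and starting at the top and proceeding clockwise is an arbitrary convention.

```python
from math import cos, pi, sin


def radar_vertices(scores, max_value=10):
    """Return criterion -> (x, y) on a unit radar chart."""
    n = len(scores)
    vertices = {}
    for i, (criterion, value) in enumerate(scores.items()):
        angle = pi / 2 - 2 * pi * i / n   # first axis points up, then clockwise
        radius = value / max_value
        vertices[criterion] = (radius * cos(angle), radius * sin(angle))
    return vertices


# Hypothetical quality scores on a scale from 0 to 10
scores = {"accuracy": 10, "completeness": 5, "timeliness": 0, "relevance": 5}
for criterion, (x, y) in radar_vertices(scores).items():
    print(f"{criterion:12s} {x:5.2f} {y:5.2f}")
```

Connecting the vertices in order gives the polygon of one data set; several data sets are compared by overlaying their polygons.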



5 Reporting

High-level reporting
- Dashboards and Business Cockpits
- (Graphical) summaries for non-experts

HighCharts

Interpretation: Interactive dashboard on student performance

5 Reporting

High-level reporting
- Balanced scorecard
• Components:
• Destination statement describes the organization at present and at
a defined point in the future (mid-term planning) in the four
perspectives: financial and stakeholder expectations, customer and
external relationships, process activities, and organization and
culture.
• Strategic linkage model contains strategic objectives with respect
to outcome and activities, together with hypothesized causal
relationships between these strategic objectives.
• Definitions of strategic objectives
• For each strategic objective measures are defined, together with
their targets.

5 Reporting

Infographics
- Designed to convey possibly complex information
- Goals:
• Appeal: An infographic should engage the intended audience.
• Comprehension: The viewer of an infographic should understand the
information easily.
• Retention: The information provided by an infographic should be
remembered by the viewer.
- Example: maps for public transport, Pinterest
- Tools: Piktochart (open source), ManyEyes, Tableau Public,
Gapminder.



6 Summary & outlook

- Representation and visualization play an important
role in all phases of the BI project
- Data understanding is supported
- Analysis results can be conveyed
- Support of different stakeholders
- Interactive visualizations and representations
increase understanding
