Online Analytical Processing (OLAP)

Suppose a company sells four products (Nuts, Bolts, Washers, and Screws)
in the East, West, and Central regions.
To find out how many washers were sold in each sales region and compare
that figure with the projected sales, OLAP is needed.
OLAP supports multidimensional data analysis, enabling users to view
the same data in different ways using multiple dimensions.
Each aspect of information (products, pricing, cost, region) represents a
different dimension.
OLAP enables users to obtain online answers to ad hoc questions fairly
rapidly.
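The washer question above can be sketched in a few lines of Python. The sales records and projected figures below are hypothetical, invented only for illustration; a real OLAP tool would read them from a cube rather than from in-memory lists.

```python
from collections import defaultdict

# Hypothetical sales records: (product, region, units_sold).
sales = [
    ("Washers", "East", 120),
    ("Washers", "West", 95),
    ("Washers", "Central", 140),
    ("Bolts", "East", 300),
    ("Washers", "East", 30),
]

# Hypothetical projected washer sales per region.
projected = {"East": 160, "West": 100, "Central": 130}

def units_by_region(records, product):
    """Sum units sold for one product, grouped by region."""
    totals = defaultdict(int)
    for prod, region, units in records:
        if prod == product:
            totals[region] += units
    return dict(totals)

actual = units_by_region(sales, "Washers")
# A positive variance means sales beat the projection.
variance = {region: actual.get(region, 0) - projected[region] for region in projected}
```

Here product and region act as two dimensions, and units sold is the value being analyzed.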
Overview of OLAP systems
At the core of any OLAP system is an OLAP cube (also called a
'multidimensional cube' or a hypercube).
It consists of numeric facts called measures, which are categorized
by dimensions. The measures are placed at the intersections of the
hypercube, which is spanned by the dimensions as a vector space.
The usual interface for manipulating an OLAP cube is a matrix interface,
like pivot tables in a spreadsheet program, which performs projection
operations along the dimensions, such as aggregation or averaging.
OLAP Cube Definition
An OLAP Cube is a data structure that allows fast analysis of data
according to the multiple Dimensions that define a business problem.
A multidimensional cube for reporting sales might be, for example,
composed of five Dimensions (Salesperson, Region, Product, Month, Year),
with Sales Amount as the measure.
OLAP Cube Advantages
The arrangement of data into Cubes overcomes a limitation of relational
databases, which are not well suited for near-instantaneous analysis and
display of large amounts of data.
Instead, they are better suited for creating records from a series of
transactions, known as OLTP or On-Line Transaction Processing.
Although many report-writing tools exist for relational databases, these
are slow when the whole database must be summarized, and present
great difficulties when users wish to re-orient reports or analyses
according to different, multidimensional perspectives, also known as Slices.
The use of Cubes facilitates this kind of fast end-user interaction with data.
An OLAP Cube can be thought of as an extension of the modeling structure
provided by a spreadsheet, which accommodates data in rows and
columns, i.e., a two-dimensional array of data.
A Cube can accommodate any number of arrays, or Dimensions, though
designers of OLAP Cubes will try to build models that balance user needs
and logical model limitations.
Operations
Conceiving data as a cube with hierarchical dimensions leads to
conceptually straightforward operations to facilitate analysis. Aligning the
data content with a familiar visualization enhances analyst learning and
productivity.
The user-initiated process of navigating by calling for page displays
interactively, through the specification of slices via rotations and drill
down/up, is sometimes called "slice and dice". Common operations include
slice and dice, drill down, roll up, and pivot.
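As a rough illustration of these operations, the sketch below models a cube as a Python dictionary keyed by dimension coordinates. The cell values, the `DIMS` ordering, and the helper names are all assumptions made for this example; they are not taken from any particular OLAP product.

```python
from collections import defaultdict

# Hypothetical cube cells keyed by (product, region, month) -> units sold.
cube = {
    ("Nuts", "East", "Jan"): 10,
    ("Nuts", "West", "Jan"): 20,
    ("Bolts", "East", "Jan"): 5,
    ("Bolts", "East", "Feb"): 7,
    ("Nuts", "East", "Feb"): 3,
}
DIMS = ("product", "region", "month")

def slice_cube(cells, dim, value):
    """Slice: fix one dimension to a single value."""
    i = DIMS.index(dim)
    return {k: v for k, v in cells.items() if k[i] == value}

def dice_cube(cells, allowed):
    """Dice: keep cells whose coordinates fall in the allowed value sets."""
    return {
        k: v for k, v in cells.items()
        if all(k[DIMS.index(d)] in vals for d, vals in allowed.items())
    }

def roll_up(cells, keep):
    """Roll up: aggregate away every dimension not listed in `keep`."""
    idx = [DIMS.index(d) for d in keep]
    out = defaultdict(int)
    for k, v in cells.items():
        out[tuple(k[i] for i in idx)] += v
    return dict(out)

january = slice_cube(cube, "month", "Jan")
east_only = dice_cube(cube, {"region": {"East"}, "product": {"Nuts", "Bolts"}})
by_region = roll_up(cube, ["region"])
# Listing the kept dimensions in a different order pivots the result axes.
by_region_product = roll_up(cube, ["region", "product"])
```

Drilling down is the reverse of rolling up: moving from `by_region` to `by_region_product` exposes one more level of detail.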
The cube metadata is typically created from a star schema, snowflake
schema, or fact constellation of tables in a relational database.
Measures are derived from the records in the fact table and dimensions
are derived from the dimension tables.
Each measure can be thought of as having a set of labels, or metadata,
associated with it. A dimension is what describes these labels; it provides
information about the measure.
A simple example would be a cube that contains a store's sales as
a measure, and Date/Time as a dimension. Each Sale has a
Date/Time label that describes more about that sale.
Multidimensional databases
Multidimensional structure is defined as "a variation of the relational
model that uses multidimensional structures to organize data and express
the relationships between data".
The structure is broken into cubes and the cubes are able to store and
access data within the confines of each cube. "Each cell within a
multidimensional structure contains aggregated data related to elements
along each of its dimensions".
Even when data is manipulated it remains easy to access and continues
to constitute a compact database format. The data still remains
interrelated. Multidimensional structure is quite popular for analytical
databases that use online analytical processing (OLAP) applications.
Analytical databases use these structures because of their ability to
deliver answers to complex business queries swiftly. Data can be viewed
from different angles, which gives a broader perspective of a problem
than other models do.
Aggregations
It has been claimed that for complex queries OLAP cubes can produce an
answer in around 0.1% of the time required for the same query
on OLTP relational data.
The most important mechanism in OLAP which allows it to achieve such
performance is the use of Aggregations.
Aggregations are built from the fact table by changing the granularity on
specific dimensions and aggregating up data along these dimensions.
The number of possible aggregations is determined by every possible
combination of dimension granularities.
The combination of all possible aggregations and the base data contains
the answers to every query which can be answered from the data.
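To see how quickly the number of possible aggregations grows, the sketch below enumerates every combination of granularities for three dimensions. The dimension names and their levels are hypothetical; "ALL" stands for a dimension that has been aggregated away completely.

```python
from itertools import product

# Hypothetical granularity levels per dimension, finest first.
levels = {
    "time": ["day", "month", "year", "ALL"],
    "product": ["item", "category", "ALL"],
    "region": ["store", "region", "ALL"],
}

combinations = list(product(*levels.values()))
# One combination (the finest level on every dimension) is the base data itself,
# so it is not counted as an aggregation.
num_aggregations = len(combinations) - 1
```

With 4, 3, and 3 levels there are 4 x 3 x 3 = 36 combinations, so 35 aggregations besides the base data. Adding dimensions or levels multiplies this count.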
In its simplest form an Aggregate is a summary table that can be
derived by performing a GROUP BY SQL query. A more common use of
aggregates is to take a dimension and change the granularity of this
dimension.
When changing the granularity of the dimension the fact table has to be
partially summarized to fit the new grain of the new dimension, thus
creating new dimensional and fact tables, fitting this new level of grain.
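A minimal sketch of such an aggregate, using Python's built-in sqlite3 module. The table names, columns, and rows are hypothetical; the point is only that changing the time grain from day to month is a GROUP BY over the fact table whose result is stored as a new, smaller table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE fact_sales (sale_date TEXT, product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO fact_sales VALUES (?, ?, ?)",
    [
        ("2024-01-05", "Nuts", 10.0),
        ("2024-01-20", "Nuts", 15.0),
        ("2024-02-03", "Nuts", 7.0),
        ("2024-01-09", "Bolts", 4.0),
    ],
)

# Aggregate: change the time grain from day to month and store the summary.
conn.execute(
    """
    CREATE TABLE agg_sales_month AS
    SELECT substr(sale_date, 1, 7) AS month, product, SUM(amount) AS amount
    FROM fact_sales
    GROUP BY substr(sale_date, 1, 7), product
    """
)
monthly = conn.execute(
    "SELECT month, product, amount FROM agg_sales_month ORDER BY month, product"
).fetchall()
```

Queries at month level can now read `agg_sales_month` instead of scanning every daily row.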
Because usually there are many aggregations that can be calculated,
often only a predetermined number are fully calculated; the remainder are
solved on demand.
The problem of deciding which aggregations (views) to calculate
is known as the view selection problem. View selection can be
constrained by the total size of the selected set of aggregations, the time
to update them from changes in the base data, or both.
The objective of view selection is typically to minimize the
average time to answer OLAP queries, although some studies also
minimize the update time. View selection is NP-complete. Many
approaches to the problem have been explored, including greedy
algorithms, randomized search, genetic algorithms and A* search.
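One greedy approach can be sketched as follows. This is a simplified illustration under stated assumptions, not a definitive algorithm: each view is identified by the set of dimensions it groups by, a view can answer any query over a subset of its dimensions, the cost of a query is the row count of the smallest selected view that can answer it, and the constraint is a fixed number of views `k`. The view sizes are hypothetical.

```python
def greedy_view_selection(sizes, base, k):
    """Pick up to k views to materialize besides the base view.

    sizes maps each view (a frozenset of grouped dimensions) to its row count.
    A view v can answer a query on view w when w is a subset of v.
    """
    # Initially every query is answered from the base data.
    cost = {w: sizes[base] for w in sizes}
    selected = []
    for _ in range(k):
        best_view, best_benefit = None, 0
        for v in sizes:
            if v == base or v in selected:
                continue
            # Benefit: total cost reduction over all queries v can answer.
            benefit = sum(max(0, cost[w] - sizes[v]) for w in sizes if w <= v)
            if benefit > best_benefit:
                best_view, best_benefit = v, benefit
        if best_view is None:
            break  # no remaining view reduces query cost
        selected.append(best_view)
        for w in sizes:
            if w <= best_view:
                cost[w] = min(cost[w], sizes[best_view])
    return selected, cost

# Hypothetical lattice over dimensions p (product), r (region), m (month).
sizes = {
    frozenset("prm"): 1000,
    frozenset("pr"): 200,
    frozenset("pm"): 800,
    frozenset("rm"): 300,
    frozenset("p"): 50,
    frozenset("r"): 5,
    frozenset("m"): 12,
    frozenset(): 1,
}
selected, cost = greedy_view_selection(sizes, frozenset("prm"), k=2)
```

With these sizes the first pick is the product-region view and the second is the region-month view, because those two choices reduce the summed query cost the most at each step.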
Data Warehouse
A Data Warehouse is a relational database that is designed for
query and analysis rather than for transaction processing.
It usually contains historical data derived from transaction data, but it can
include data from other sources. It separates analysis workload from
transaction workload and enables an organization to consolidate data from
several sources.
In addition to a relational database, a Data Warehouse environment
includes:
an ETL (Extraction, Transportation, Transformation, and Loading) solution,
an Online Analytical Processing (OLAP) engine,
client analysis tools,
and other applications that manage the process of gathering data and
delivering it to business users.
A common way of introducing data warehousing is to refer to the
characteristics of a Data Warehouse:
Subject Oriented
Integrated
Nonvolatile
Time Variant
Subject Oriented
Data warehouses are designed to help you analyze data. For example, to
learn more about your company's sales data, you can build a warehouse
that concentrates on sales.
Using this warehouse, you can answer questions like "Who was our best
customer for this item last year?"
This ability to define a data warehouse by subject matter, sales in this
case, makes the data warehouse subject oriented.
Integrated
Integration is closely related to subject orientation.
Data warehouses must put data from disparate sources into a consistent
format.
They must resolve such problems as naming conflicts and inconsistencies
among units of measure.
When they achieve this, they are said to be integrated.
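A small sketch of what resolving such conflicts can look like. The two sources, their field names, and the target format (`customer_id`, `weight_kg`) are hypothetical and chosen only to show one naming conflict and one unit conflict.

```python
# Hypothetical source records with conflicting field names and units.
source_a = [{"cust_id": 1, "weight_lb": 2.0}]
source_b = [{"customer": 2, "weight_kg": 1.5}]

LB_TO_KG = 0.45359237  # definition of the international pound in kilograms

def from_source_a(row):
    """Rename cust_id and convert pounds to kilograms."""
    return {"customer_id": row["cust_id"], "weight_kg": row["weight_lb"] * LB_TO_KG}

def from_source_b(row):
    """Rename customer; the weight is already in kilograms."""
    return {"customer_id": row["customer"], "weight_kg": row["weight_kg"]}

# Both sources now share one consistent format.
integrated = [from_source_a(r) for r in source_a] + [from_source_b(r) for r in source_b]
```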
Nonvolatile
Nonvolatile means that, once entered into the warehouse, data should not
change.
This is logical because the purpose of a warehouse is to enable you to
analyze what has occurred.
Time Variant
In order to discover trends in business, analysts need large amounts of
data.
This is very much in contrast to Online Transaction Processing (OLTP)
systems, where performance requirements demand that historical data be
moved to an archive.
A data warehouse's focus on change over time is what is meant by the
term time variant.
Differences between typical Data Warehouses and OLTP systems
Data warehouses and OLTP systems have very different requirements.
Here are some examples of differences between typical data warehouses
and OLTP systems:
Workload
Data warehouses are designed to accommodate ad hoc queries. You might
not know the workload of your data warehouse in advance, so a data
warehouse should be optimized to perform well for a wide variety of
possible query operations.
OLTP systems support only predefined operations. Your applications might
be specifically tuned or designed to support only these operations.
Data modifications
A data warehouse is updated on a regular basis by the ETL process (run
nightly or weekly) using bulk data modification techniques. The end users
of a data warehouse do not directly update the data warehouse.
In OLTP systems, end users routinely issue individual data modification
statements to the database. The OLTP database is always up to date, and
reflects the current state of each business transaction.
Schema design
Data warehouses often use denormalized or partially denormalized
schemas (such as a star schema) to optimize query performance.
OLTP systems often use fully normalized schemas to optimize
update/insert/delete performance, and to guarantee data consistency.
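For illustration, a very small star schema might look like the sketch below, built with Python's sqlite3 module. The table and column names are hypothetical. The fact table sits in the middle and refers to each dimension table by key; the product dimension is denormalized in that the category name is stored directly on each product row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables (denormalized: category is stored with the product).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT, category TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);
    -- Fact table: one row per sale, with a key into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        units      INTEGER
    );
    INSERT INTO dim_product VALUES (1, 'Nuts', 'Fasteners'), (2, 'Washers', 'Fasteners');
    INSERT INTO dim_region  VALUES (1, 'East'), (2, 'West');
    INSERT INTO fact_sales  VALUES (1, 1, 10), (2, 1, 4), (2, 2, 6), (1, 2, 3);
    """
)

# An analytical query needs only one join per dimension it uses.
units_by_region = conn.execute(
    """
    SELECT r.name, SUM(f.units)
    FROM fact_sales AS f
    JOIN dim_region AS r ON r.region_id = f.region_id
    GROUP BY r.name
    ORDER BY r.name
    """
).fetchall()
```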
Typical operations
A typical data warehouse query scans thousands or millions of rows. For
example, "Find the total sales for all customers last month."
A typical OLTP operation accesses only a handful of records. For example,
"Retrieve the current order for this customer."
Historical data
Data warehouses usually store many months or years of data. This is to
support historical analysis.
OLTP systems usually store data from only a few weeks or months. The
OLTP system stores only historical data as needed to successfully meet the
requirements of the current transaction.
Normalization
Independent entities and relationships in the source data should not be
grouped together in the same relation in the database schema.
In particular, source-specific schema elements should not be grouped with
overlapping schema elements, if the grouping co-locates independent
entities or relationships.
Example of two Schema Integrations
Suppose we want a mediated (database) schema to integrate two travel
databases, Go-travel and Ok-travel.
Go-travel has two relations:
Go-flight(f-num, time, meal(yes/no))
Go-price(f-num, date, price)
(f-num being the flight number)
Ok-travel has just one relation:
Ok-flight(f-num, date, time, price, nonstop(yes/no))
The overlapping information in Ok-travel's and Go-travel's schemas could
be represented in a mediated schema:
Flight(f-num, date, time, price)
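The mapping into the mediated Flight relation can be sketched as below. The rows are hypothetical, field names follow the relations above with `f_num` standing in for f-num, and joining Go-flight to Go-price on the flight number is an assumption about how the two Go-travel relations relate.

```python
# Hypothetical rows; field names follow the relations in the text.
go_flight = [{"f_num": "G1", "time": "09:00", "meal": True}]
go_price = [{"f_num": "G1", "date": "2024-07-01", "price": 120.0}]
ok_flight = [
    {"f_num": "K7", "date": "2024-07-01", "time": "14:30", "price": 99.0, "nonstop": False}
]

def go_to_mediated(flights, prices):
    """Join Go-flight and Go-price on f_num, dropping the source-specific meal field."""
    time_by_flight = {f["f_num"]: f["time"] for f in flights}
    return [
        {"f_num": p["f_num"], "date": p["date"],
         "time": time_by_flight[p["f_num"]], "price": p["price"]}
        for p in prices
        if p["f_num"] in time_by_flight
    ]

def ok_to_mediated(flights):
    """Project Ok-flight onto the shared attributes, dropping nonstop."""
    return [{k: f[k] for k in ("f_num", "date", "time", "price")} for f in flights]

flight = go_to_mediated(go_flight, go_price) + ok_to_mediated(ok_flight)
```

The source-specific attributes (meal, nonstop) stay out of the mediated relation, in line with the normalization guideline above.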
- OLTP (On-line Transaction Processing) is characterized by a large
number of short on-line transactions (INSERT, UPDATE, DELETE). The main
emphasis for OLTP systems is on very fast query processing,
maintaining data integrity in multi-access environments, and
effectiveness measured by the number of transactions per second. An OLTP
database holds detailed and current data, and the schema used to store
transactional data is the entity model (usually 3NF).
- OLAP (On-line Analytical Processing) is characterized by a relatively
low volume of transactions. Queries are often very complex and involve
aggregations. For OLAP systems, response time is the effectiveness
measure. OLAP applications are widely used by Data Mining techniques. An
OLAP database holds aggregated, historical data, stored in
multidimensional schemas (usually a star schema).
Data Warehouse Architecture
Different data warehousing systems have different structures.
Some may have an ODS (operational data store),
while some may have multiple data marts.
There are different layers of a data warehouse architecture.
In general, all data warehouse systems have the following layers:
Data Source Layer
Data Extraction Layer
Staging Area
ETL Layer
Data Storage Layer
Data Logic Layer
Data Presentation Layer
Metadata Layer
System Operations Layer
Data Source Layer
This represents the different data sources that feed data into the data
warehouse. The data source can be of any format: a plain text file, a
relational database, another type of database, an Excel file, and so on
can all act as a data source.
Many different types of data can be a data source:
Operations -- such as sales data, HR data, product data, inventory data,
marketing data, systems data.
Web server logs with user browsing data.
Internal market research data.
Third-party data, such as census data, demographics data, or survey data.
All these data sources together form the Data Source Layer.
Data Extraction Layer
Data gets pulled from the data source into the data warehouse system.
There is likely some minimal data cleansing, but there is unlikely to be
any major data transformation.
Staging Area
This is where data sits prior to being scrubbed and transformed into a data
warehouse / data mart. Having one common area makes it easier for
subsequent data processing / integration.
In data staging, the data stored in the sources should be extracted,
cleansed to remove inconsistencies and fill gaps, and integrated to merge
heterogeneous sources into one common schema.
The so-called Extraction, Transformation, and Loading (ETL) tools can
merge heterogeneous schemata and extract, transform, cleanse, validate,
filter, and load source data into a data warehouse.
Technologically speaking, this stage deals with problems that are typical
of distributed information systems, such as inconsistent data
management and incompatible data structures.
ETL Layer
This is where data gains its "intelligence", as logic is applied to transform
the data from a transactional nature to an analytical nature. This layer is
also where data cleansing happens. The ETL design phase is often the
most time-consuming phase in a data warehousing project, and an ETL
tool is often used in this layer.
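A toy version of the three ETL steps is sketched below using only the Python standard library. The raw export, the column names, and the cleansing rules are hypothetical; in particular, dropping rows with a missing amount is just one possible policy, and a real pipeline might fill such gaps instead.

```python
import csv
import io
import sqlite3

# Hypothetical transactional export with inconsistent formatting and a gap.
RAW = """order_id,region,amount
1, east ,10.50
2,WEST,
3,East,4.25
"""

def extract(text):
    """Extract: read raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cleanse values and drop rows with a missing amount."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue
        clean.append((int(row["order_id"]), row["region"].strip().title(), float(amount)))
    return clean

def load(conn, rows):
    """Load: bulk-insert the cleansed rows into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (order_id INTEGER, region TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)

conn = sqlite3.connect(":memory:")
load(conn, transform(extract(RAW)))
totals = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
```

After loading, the region names are consistent, so the analytical query groups both remaining orders under a single "East" row.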
Data Storage Layer
This is where the transformed and cleansed data sit. Based on scope and
functionality, 3 types of entities can be found here:
data warehouse,
data mart, and
operational data store (ODS).
In any given system, you may have just one of the three, two of the three,
or all three types.
Data Logic Layer
This is where business rules are stored. Business rules stored here do not
affect the underlying data transformation rules, but do affect what the
report looks like.
Data Presentation Layer
This refers to the information that reaches the users.
This can be in the form of a tabular / graphical report in a browser, an
emailed report that gets automatically generated and sent every day, or an
alert that warns users of exceptions, among others.
Usually an OLAP tool and/or a reporting tool is used in this layer.
Metadata Layer
This is where information about the data stored in the data warehouse
system is stored.
A logical data model would be an example of something that's in the
metadata layer.
A metadata tool is often used to manage metadata.
System Operations Layer
This layer includes information on how the data warehouse system
operates, such as ETL job status, system performance, and user access
history.
A Data Mart is the access layer of the Data warehouse environment that
is used to get data out to the users.
The data mart is a subset of the data warehouse that is usually oriented to
a specific business line or team.
Data marts are small slices of the data warehouse. Whereas data
warehouses have an enterprise-wide depth, the information in data marts
pertains to a single department.
In some deployments, each department or business unit is considered
the owner of its data mart, including all the hardware, software and data.
This enables each department to use, manipulate and develop its data
any way it sees fit, without altering information inside other data marts
or the data warehouse.
In other deployments where conformed dimensions are used, this
business unit ownership will not hold true for shared dimensions like
customer, product, etc.
A data mart is basically a condensed and more focused version of a data
warehouse that reflects the regulations and process specifications of each
business unit within an organization.
Each data mart is dedicated to a specific business function or region. This
subset of data may span many or all of an enterprise's functional
subject areas.
It is common for multiple data marts to be used in order to serve the
needs of each individual business unit (different data marts can be used to
obtain specific information for various enterprise departments, such as
accounting, marketing, sales, etc.).
Data mart vs Data warehouse
Data warehouse:
Holds multiple subject areas
Holds very detailed information
Works to integrate all data sources
Does not necessarily use a dimensional model
Data mart:
Often holds only one subject area, for example, Finance or Sales
May hold more summarized data (although many hold full detail)
Concentrates on integrating information from a given subject area or set
of source systems
Is built around a dimensional model using a star schema
Reasons for creating a Data Mart
Easy access to frequently needed data
Creates collective view by a group of users
Improves end-user response time
Ease of creation
Lower cost than implementing a full data warehouse
Potential users are more clearly defined than in a full data warehouse
Contains only business essential data and is less cluttered.

Design schemas
Star schema - fairly popular design choice; enables a relational
database to emulate the analytical functionality of a multidimensional
database
Snowflake schema
Decision Support System
DSS are a natural progression from information reporting systems and
transaction processing systems.
DSS are interactive, computer-based information systems that use
decision models and specialized databases to assist the decision-making
processes of managerial end users.
They provide managerial end users with information in an interactive
session on an ad hoc basis.
A DSS provides managers with analytical modeling, simulation, data
retrieval and information presentation capabilities.
Managers generate the information they need for more unstructured types
of decisions in an interactive, simulation-based process.
For example, electronic spreadsheets allow a manager to pose a series of
what-if questions and receive interactive responses to such ad hoc
requests for information.
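The what-if questioning described above can be imitated with a small model function and a set of scenarios. The profit formula, the baseline figures, and the scenario names are hypothetical; they stand in for whatever decision model a real DSS or spreadsheet would hold.

```python
def projected_profit(units, price, unit_cost, fixed_cost):
    """Profit under one set of assumptions (all inputs hypothetical)."""
    return units * (price - unit_cost) - fixed_cost

baseline = {"units": 1000, "price": 5.0, "unit_cost": 3.0, "fixed_cost": 1500.0}

# Each scenario overrides one or more baseline assumptions.
scenarios = {
    "baseline": {},
    "price +10%": {"price": 5.5},
    "volume -20%": {"units": 800},
}

results = {
    name: projected_profit(**{**baseline, **changes})
    for name, changes in scenarios.items()
}
```

Each entry in `results` answers one what-if question, so a manager can compare the outcomes side by side and then pose further scenarios.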
Decision Support System
A Decision Support System (DSS) is a computer-based Information
System that supports business or organizational decision-making activities.
DSSs serve the management, operations, and planning levels of an
organization (usually mid and higher management).
They help to make decisions about problems that may be rapidly changing
and not easily specified in advance (unstructured and semi-structured
decision problems).
Decision support systems can be fully computerized, human-powered, or a
combination of both.
DSS by its characteristics
DSS tends to be aimed at the less well structured,
underspecified problems that upper-level managers typically face;
DSS attempts to combine the use of models or analytic techniques with
traditional data access and retrieval functions;
DSS specifically focuses on features that make it easy to use by
non-computer-proficient people in an interactive mode; and
DSS emphasizes flexibility and adaptability to accommodate changes in
the environment and the decision-making approach of the user.
DSSs include knowledge-based systems. A properly designed DSS is an
interactive software-based system intended to help decision makers
compile useful information from a combination of raw data, documents,
personal knowledge, and business models to identify and solve
problems and make decisions.
Typical information that a decision support application might gather and
present includes:
Inventories of information assets (including legacy and relational data
sources, cubes, data warehouses, and data marts),
Comparative sales figures between one period and the next,
Projected revenue figures based on product sales assumptions.
Components
Design of Decision Support System
Three fundamental components of a DSS architecture are:
1. The database (or knowledge base),
2. The model (i.e., the decision context and user criteria), and
3. The user interface.
The users themselves are also an important component of the architecture.

Development frameworks
DSS systems are not entirely different from other systems and require a
structured approach. Such a framework includes people, technology, and
the development approach.
The Early Framework of Decision Support System consists of four phases:
1. Intelligence: searching for conditions that call for a decision.
2. Design: developing and analyzing possible alternative courses of action.
3. Choice: selecting a course of action among those.
4. Implementation: adopting the selected course of action in the decision
situation.
DSS Technology Levels (of hardware and software) may include:
The actual application that will be used by the user. This is the part of
the application that allows the decision maker to make decisions in a
particular problem area. The user can act upon that particular problem.
The generator, which contains the hardware/software environment that
allows people to easily develop specific DSS applications. This level makes
use of CASE tools or systems such as Crystal, Analytica and iThink.
Tools, which include lower-level hardware/software from which DSS
generators are built, including special languages, function libraries and
linking modules.
An iterative developmental approach allows for the DSS to be changed
and redesigned at various intervals. Once the system is designed, it will
need to be tested and revised where necessary for the desired outcome.
Classification
There are several ways to classify DSS applications. Not every DSS fits
neatly into one of the categories; some are a mix of two or more
architectures.
DSS is classified into the following six frameworks:
1. text-oriented DSS,
2. database-oriented DSS,
3. spreadsheet-oriented DSS,
4. solver-oriented DSS,
5. rule-oriented DSS,
6. compound DSS.
A compound DSS is the most popular classification for a DSS. It is a hybrid
system that includes two or more of the five basic structures described
above.
The support given by DSS can be separated into three distinct, interrelated
categories:
Personal Support,
Group Support, and
Organizational Support.
DSS components may be classified as:
Inputs: Factors, numbers, and characteristics to analyze
User Knowledge and Expertise: Inputs requiring manual analysis by
the user
Outputs: Transformed data from which DSS "decisions" are generated
Decisions: Results generated by the DSS based on user criteria
DSSs which perform selected cognitive decision-making functions and are
based on artificial intelligence or intelligent agent technologies are
called Intelligent Decision Support Systems (IDSS).
The nascent field of Decision engineering treats the decision itself as an
engineered object, and applies engineering principles such
as Design and Quality assurance to an explicit representation of the
elements that make up a decision.
Group Decision Support System
Time/Place Framework
Same Time/Same Place: decision room
Same Time/Different Place: telephone conferencing, video conferencing
Different Time/Same Place: project/team rooms, shared offices
Different Time/Different Place: email, workflow management systems
Group Decision Support Systems (GDSS)
Group Support Systems (GSS)
Electronic Meeting Systems
Collaborative Computing
Evolved as information technology researchers recognized that technology
could be developed for supporting meeting activities:
Idea generation
Consensus building
Anonymous ranking
Voting, etc.
Important Characteristics of a GDSS
Specially Designed Information System
Goal of Supporting Groups of Decision Makers
Easy to Learn and Use
May be designed for one type of problem or for many organizational
decisions
Designed to encourage group activities
Attempts to minimize process losses

Three Levels of GDSS Support
Based on DeSanctis and Gallupe
Level 1: Process Support
Level 2: Decision-making Support
Level 3: Rules of order
Level 1: Process Support
Supports the basic communication process between participants
electronic messaging
network linking the PCs
public screen
anonymous input of votes and ideas
solicitation of ideas or votes
summary and display of ideas and opinions
format for an agenda
Level 2: Decision-Making Support
Decision Modeling and Group Decision Techniques aimed at reducing
uncertainty and noise that occur in the group decision process
adds capabilities for modeling and decision analysis
planning and financial models
decision trees
probability assessment models
resource allocation models
Level 3: Rules of Order
Rules of order ensure that the group involved in the group meeting can
conduct its business in a way that is both fair and effective.
Characterized by machine-induced group communication patterns
Control the pattern, timing, or content of information exchange
Special software containing rules of order is added
rules determining the sequence of speaking, the appropriate
response, or voting rules
Groupware Technologies
Groupware is defined as any software that enables group collaboration
over a network.
These technologies have the potential to increase collaboration at a
distance while reducing the cost of travel and the time knowledge workers
waste in transit.
Groupware provides
flexible communication structures (connecting people in new ways),
increased communication speed,
increased work performance and productivity,
organizational memory capability, etc.
Examples of Groupware Technologies include:
Shared authoring tools such as MS Office applications (Word, Excel,
etc.), which include common word processing programs, graphics programs
and sound-editing facilities. Many stand-alone applications can be
considered groupware if they can access and modify a document on the
web or a common server.
E-mail systems such as MS Outlook Express support multiple text-based
communications and are the most often used groupware.
Online forums are real-time, text-based systems that allow group posting and
response to text messages. They are self-archiving, in that the sequence
of text-based conversations involving dozens or even hundreds of
contributors is maintained for review by others.
Instant messaging, such as AOL messenger, is a growing form of
groupware that allows knowledge workers working away from their desks
to exchange short items of information.
Screen sharing allows a user with the appropriate access privileges to
connect to and take control of a remote PC. It is popular in training and
troubleshooting situations where a support person can show the trainee at
a remote site how to perform an operation and then watch as the trainee
attempts to do the operation.
Electronic whiteboards provide a virtual whiteboard drawing space that
enables multiple collaborators to take turns authoring and modifying
hand-drawn graphics or simply posting a slide for a presentation. They
are used in conjunction with other products, such as videoconferencing,
which is the real-time, multi-way broadcasting of video and audio.
Videoconferencing, such as Skype conferences, allows real-time, multi-way
broadcasting of video and audio, using telephone lines for audio and
the Internet or other networks for the video channels.
Multimodal conferencing supports real-time group sharing of an
electronic whiteboard, a text forum, audio, and multiple-channel video and
audio.
What is Groupware?
Tools (hardware, software, processes) that support person-to-person
collaboration
This can include e-mail, bulletin boards, conferencing systems, decision
support systems, video and workflow systems, etc.
Some common groupware acronyms:
Group Support Systems (GSS)
Group Decision Support Systems (GDSS)
Electronic Meeting Systems (EMS)
Bulletin Board Systems (BBS)
Group Collaboration Systems (GCS)
Computer-Supported Cooperative Work (CSCW) systems
Groupware and Levels of Collaboration
Groupware can be divided into three categories depending on the level
of collaboration:
1. Communication can be thought of as unstructured interchange of
information. A phone call or an IM chat discussion are examples of this.
2. Conferencing (or collaboration level, as it is called in the academic
papers that discuss these levels) refers to interactive work toward a
shared goal. Brainstorming or voting are examples of this.
3. Co-ordination refers to complex interdependent work toward a shared
goal. A good metaphor for understanding this is to think about a sports
team; everyone has to contribute the right play at the right time as well as
adjust their play to the unfolding situation - but everyone is doing
something different - in order for the team to win. That is complex
interdependent work toward a shared goal: collaborative management.
Electronic Communication Tools
Electronic communication tools send messages, files, data, or documents
between people and hence facilitate the sharing of information. Examples
include:
Synchronous conferencing
Asynchronous conferencing
E-mail
Faxing
Voice mail
Wikis
Web publishing
Revision control
Electronic Conferencing Tools
Electronic conferencing tools facilitate the sharing of information, but in a
more interactive way. Examples include:
Internet forums (also known as message boards or discussion boards): a
virtual discussion platform to facilitate and manage online text messages
Online chat: a virtual discussion platform to facilitate and manage
real-time text messages
Instant Messaging
Telephony: telephones allow users to interact
Videoconferencing: networked PCs share video and audio signals
Data conferencing: networked PCs share a common whiteboard that
each user can modify
Application sharing: users can access a shared document or application
from their respective computers simultaneously in real time
Electronic meeting systems (EMS): originally these were described as
"electronic meeting systems," and they were built into meeting rooms.
These special purpose rooms usually contained video projectors
interlinked with numerous PCs; however, electronic meeting systems have
evolved into web-based, any time, any place systems that will
accommodate "distributed" meeting participants who may be dispersed in
several locations.
Collaborative Management (coordination) Tools
Collaborative management tools facilitate and manage group activities.
Examples include:
Electronic calendars (also called time management software):
schedule events and automatically notify and remind group members
Project management systems: schedule, track, and chart the steps in
a project as it is being completed
Online proofing: share, review, approve, and reject web proofs,
artwork, photos, or videos between designers, customers, and clients
Workflow systems: collaborative management of tasks and documents
within a knowledge-based business process
Knowledge Management Systems: collect, organize, manage, and
share various forms of information
Enterprise Bookmarking: collaborative bookmarking engine to tag,
organize, share, and search enterprise data
Prediction Markets: let a group of people predict together the
outcome of future events
Extranet Systems (sometimes also known as 'project extranets'):
collect, organize, manage and share information associated with the
delivery of a project (e.g., the construction of a building)
Social Software Systems: organize social relations of groups
Online Spreadsheets: collaborate and share structured data and
information
Client Portals: interact and share information with your clients in a
private online environment
Benefits of GDSS
supports parallel generation of ideas
supports larger groups
rapid and easy access to external information
parallel computer discussion
anonymous input
automatic documentation of the group meetings

Groupware
(Collaborative software)
Collaboration, with respect to information technology, seems to have
several definitions. Some are defensible, but others are so broad they lose
any meaningful application.
Understanding the differences in human interactions is necessary to ensure
the appropriate technologies are employed to meet interaction needs.
Collaborative Software
Collaborative software helps an action-oriented team work together over
geographic distances by providing tools that aid communication,
collaboration, and problem solving, giving the team a common means for
communicating ideas and brainstorming.
Additionally, collaborative software may support project management
functions, such as task assignments, time management with deadlines, and
shared calendars.
The artifacts, the tangible evidence of the problem-solving process,
including the final outcome of the collaborative effort, typically require
documentation and archiving of the process itself, and may involve archiving
project plans, deadlines, and deliverables.
The primary ways in which humans interact in an organization
Conversational interaction is an exchange of information between two or
more participants where the primary purpose of the interaction is discovery
or relationship building. There is no central entity around which the
interaction revolves; rather, it is a free exchange of information with no
defined constraints, generally focused on personal experiences.
Communication technologies such as telephones, instant messaging, and
e-mail are generally sufficient for conversational interactions.
Transactional interaction involves the exchange of transaction entities,
where a major function of the transaction entity is to alter the
relationship between participants. The transaction entity is in a relatively
stable form and constrains or defines the new relationship. For example,
one participant exchanges money for goods and becomes a customer.
Transactional interactions are most effectively handled by transactional
systems that manage state and commit records for persistent storage.
In collaborative interactions, the main function of the participants'
relationship is to alter a collaboration entity (i.e., the converse of
transactional). The collaboration entity is in a relatively unstable form.
Examples include the development of an idea, the creation of a design, and
the achievement of a shared goal. Therefore, real collaboration technologies
deliver the functionality for many participants to augment a common
deliverable.
Record or document management, threaded discussions, audit history, and
other mechanisms designed to capture the efforts of many into a managed
content environment are typical of collaboration technologies.
By method used we can divide collaborative software into:
Web-based collaborative tools
Software collaborative tools

By area served we can divide collaborative software into:
Knowledge management tools
Knowledge creation tools
Information sharing tools
Collaborative project management tools

Collaborative Project Management Tools
Collaborative project management tools (CPMT) are very similar to
collaborative management tools (CMT), except that CMT may only facilitate
and manage certain group activities for a part of a bigger project or task,
while CPMT covers all detailed aspects of collaboration activities and
management of the overall project and its related knowledge areas.
Another major difference is that CMT may include social software, a
Document Management System (DMS), and Unified Communication (UC), while
CPMT mostly considers business or corporate goals with some kind of social
boundaries, most commonly used for project management.
CPMT facilitate and manage social or group project-based activities.
Examples include:
Electronic calendars
Project management systems
Resource Management
Workflow systems
Knowledge management
Prediction markets
Extranet systems
Social software
Online spreadsheets
Online artwork proofing, feedback, review and approval tool
In addition to most CPMT examples, CMT also includes:
HR and equipment management
Time and cost management
Online chat
Instant messaging
Telephony
Videoconferencing
Web conferencing
Data conferencing
Application sharing
Electronic Meeting Systems (EMS)
Synchronous conferencing
E-mail
Faxing
voice mail
Wikis
Web publishing
Revision control
Charting
Document-centric collaboration
Document retention
Document sharing
Document repository
Evaluation and survey
Group Decision Making
Many of the decisions in today's workplace are made by groups of
individuals.
Groups bring many advantages to the choice process:
Multiple sources of knowledge and experience
A wider variety of perspectives
Potential synergy associated with collaborative activity
Sometimes, too many decision makers result in either a bad decision or no
decision at all.
A group, in terms of decision making, can be defined as a collective entity
that is independent of the properties of its members.
Multiparticipant decision maker (MDM): An activity conducted by a collective
entity composed of two or more individuals and characterised in terms of
both the properties of the collective entity and of its individual members.
Classification of Multi-participant
Decision-Making Structures
Decision structure, two types:
Collaborative
Group decision structure: formal participants and multiple decision makers
Negotiation decisions
Majority decisions
Noncollaborative
Team decision structure: formal participants and a single decision maker
Negotiation decisions
Majority decisions
Individual decision structure
Communication Networks
The structure of an MDM is primarily based on the interaction and flow of
communication among the various members.
Communication can be thought of as any means by which information is
transmitted to one or more members of the MDM.
Basic Types of Networks Structures

1. Wheel Network
2. Chain Network
3. Circle Network
4. Completely Connected Network
Classification of networks according to centrality
Highly Centralised
They are efficient for routine and recurring decisions.
They tend to strengthen the leadership position of the central members.
They tend to result in a stable set of interactions among the participants.
They tend to produce lower average levels of satisfaction among the
participants.
Highly Decentralised
They tend to produce higher average levels of satisfaction among
participants.
They facilitate nonroutine or nonrecurring decisions.
They promote innovation and creative solutions.
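The four network structures and the idea of centrality can be made concrete with a small sketch. The code below is illustrative only. It assumes that the "wheel" has a single hub member through whom all communication passes, and it measures centrality with a degree-based centralization index (0 when every member has the same number of links, 1 when one hub holds all links). Both the function names and the choice of index are assumptions made for this example, not definitions taken from these notes.

```python
from typing import Dict, List, Set

# A network maps each member to the members they communicate with directly.
Network = Dict[str, Set[str]]


def _empty(members: List[str]) -> Network:
    return {m: set() for m in members}


def _link(net: Network, a: str, b: str) -> None:
    """Add a two-way communication channel between a and b."""
    net[a].add(b)
    net[b].add(a)


def wheel(members: List[str]) -> Network:
    """Assumed shape: the first member is the hub; all others talk only to it."""
    net = _empty(members)
    for m in members[1:]:
        _link(net, members[0], m)
    return net


def chain(members: List[str]) -> Network:
    """Each member talks only to the next one in line."""
    net = _empty(members)
    for a, b in zip(members, members[1:]):
        _link(net, a, b)
    return net


def circle(members: List[str]) -> Network:
    """A chain whose two ends are also linked."""
    net = chain(members)
    _link(net, members[-1], members[0])
    return net


def completely_connected(members: List[str]) -> Network:
    """Every member talks to every other member."""
    net = _empty(members)
    for i, a in enumerate(members):
        for b in members[i + 1:]:
            _link(net, a, b)
    return net


def centralization(net: Network) -> float:
    """Degree centralization: 0.0 = fully decentralised, 1.0 = one hub."""
    n = len(net)
    if n < 3:
        raise ValueError("need at least three members")
    degrees = [len(peers) for peers in net.values()]
    top = max(degrees)
    return sum(top - d for d in degrees) / ((n - 1) * (n - 2))


if __name__ == "__main__":
    team = ["A", "B", "C", "D", "E"]
    builders = [wheel, chain, circle, completely_connected]
    for build in builders:
        print(f"{build.__name__:22s} {centralization(build(team)):.2f}")
```

Under this index the wheel is the most centralised structure, while the circle and the completely connected network are fully decentralised; the chain falls in between.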
Factors used in determining Decision Structure

1. The importance of the quality of the decision.
2. The extent to which the decision maker possesses the knowledge and
expertise to make the decision.
3. The extent to which potential participants have the necessary information.
4. The degree of structuredness of the problem context.
5. The degree to which the acceptance or commitment is critical to
successful implementation.
6. The probability of acceptance of an autocratic decision.
7. The degree of motivation among the participants to achieve the
organisational goals.
8. The degree of potential conflicts among the participants over a preferred
solution.
Problems with Group Decisions
1. Size
The most widely studied and consequential component of group decision
making.
Studies show that as the size of a group increases, individual satisfaction
tends to decrease.
As the size increases, the less active members tend to become noticeably
less productive.
Logic suggests that the management of an MDM requiring consensus or
majority is easier when the size is small.
Problems with Group Decisions: Size (continued)
Member cohesiveness decreases as MDM size increases. When membership is
large, subgroups and internal coalitions tend to form that serve to redirect
the focus of the participants away from the common goal.
Certain members of large MDMs are more likely to feel threatened and
reluctant to participate, because the size magnifies the impersonal nature of
the problem context.
Despite the disadvantages of increasing MDM size, in certain situations,
such as quantitative judgment in statistics, the larger the membership of
the MDM, the more likely it is that the judgment will be accurate.
Effects related to MDM size
Participant interaction tends to decrease as size increases.
Affective or emotional relationships tend to decrease as size increases.
Central, dominant leadership tends to increase as size increases.
Conflict is resolved with political rather than analytical solutions as
size increases.
Problems with Group Decisions

2. Groupthink: a mode of thinking that people engage in when they are
deeply involved in a cohesive in-group.
The more friendly and cooperative the members of a group, the greater the
likelihood that independent critical thinking will be suspended in deference
to group norms.
Unfavourable outcomes associated with Groupthink
Tends to prevent a complete, open-minded analysis of opportunities in the
development of objectives.
Holds back a meaningful search for information and tends to bias any
searches toward a self-fulfilling selectivity.
Limits the participants' ability to appraise possibilities associated with
the cost of failure.
Tends to eliminate the formulation of contingency or fallback positions.
3. Other social issues
Conflict
The desire to be viewed as a good member and to be accepted by the other
participants often leads to conflict avoidance.
Natural group dynamics such as power struggles can result in some form of
conflict.
Anonymity
One common method used to control sources of potential conflict and to
support other MDM processes is participant anonymity, e.g. anonymous
voting.
In many cases anonymity results in the generation of more and better
information.
MDM Support Technologies
Tools used in an MDM environment to support the processes and activities
related to the decision-making process.
Usual group meeting description .. (Gray 1981).
New technologies and telecommunications
MDM support technologies can be classified based on decision-maker styles.
The four basic levels of MDM technology:
1. Organisational Decision Support System (ODSS): A complex system of
computer-based technologies, including those that facilitate communication,
that provides support for decision makers.
2. Group Support System (GSS): A collective of computer-based technologies
used to aid MDM in identifying and addressing problems, opportunities, and
issues.
3. Group Decision Support System (GDSS): A collective of computer-based
technologies designed to support the activities and processes related to
MDM.
4. Decision Support System (DSS): A computer program under the control of
one or more persons that provides staff within organisations with support
tools capable of enhancing the results of the decision-making process.
Gains and Losses Associated with MDM Activities
Some of the Gains
1. The collective has greater knowledge than a single participant.
2. Allows for synergistic results.
3. Interaction stimulates the generation of knowledge.
4. Participants can improve individual performance through learning from
others.
Some of the Losses
1. Can block the production of ideas.
2. Can produce information overload much faster.
3. The relative share of speaking time per participant is reduced as MDM
size increases.
4. Increases opportunities for socialising at the expense of goal focus.
Types by features offered in support of the multi-participant
decision-making activities:
1. Reduce communication barriers.
2. Reduce uncertainty and noise.
3. Organize decision process.
Types by technology used:
1. Electronic boardroom.
2. Teleconference room.
3. Group network.
4. Information centre.
5. Collaboration laboratory.
6. Decision room.
Collaborative Support Technologies
Groupware: A particular type of MDM support technology specifically
focused on issues related to collaborative processes among people. You
can think of it as a tool that, when deployed and used appropriately,
positively affects the way people communicate with each other, resulting
in an improvement in the way people work.
Current market leaders of Groupware:
Lotus Notes
Microsoft Exchange
Oracle Office
GroupWise
Team Office
Groupware refers to programs that help people work together collectively
while located remotely from each other. Programs that enable real-time
collaboration are called synchronous groupware.
Groupware services can include the sharing of calendars, collective writing,
e-mail handling, shared database access, electronic meetings with each
person able to see and display information to others, and other activities.
Sometimes called collaborative software, groupware is an integral component
of a field of study known as Computer-Supported Cooperative Work, or CSCW.
Groupware is often broken down into categories describing whether or not
work group members collaborate in real time (synchronous groupware and
asynchronous groupware).
Some product examples of groupware include Lotus Notes and Microsoft
Exchange, both of which facilitate calendar sharing, e-mail handling, and
the replication of files across a distributed system so that all users can
view the same information.
Electronic "face-to-face" meetings are facilitated by CU-SeeMe and
Microsoft NetMeeting.
Five Basic group processes
Dynamic Group Interaction model
Basic Principles
The effectiveness of a group can be expressed in terms of three types of
outcomes, i.e. (quality and quantity of the) products, individual rewards,
and vitality of the social relations.
Effectiveness depends on the quality of the individual performance and six
group processes, which have to match.
The quality of the group processes depends on the support of six
conditions, and on the interaction with the environment.
The six aspects of the context-of-use have to fit each other.
Groups develop, and tools become adopted and adapted, through interaction
processes and feedback.
SUPPORT MATCH ADAPTATION
Lessons learned (1)
1. Groupware is part of a social system. Design not for a tool as such
but for a new socio-technical setting.
2. Design for several levels of interaction, i.e. for user-friendly
human-computer interaction, adequate interpersonal communication, group
cooperation, and organisational functioning.
3. Design in a participative way, i.e. users and possibly other
stakeholders should be part of the design process from the beginning.
4. Analyse carefully the situation of the users. Success of collaboration
technology depends on the use and the users, not on the technology.
Introduction should match their skills and abilities, and also their
attitudes, otherwise resistance is inevitable.
5. Analyse carefully the context, since success of collaboration technology
depends on the fit to that context. The more a new setting deviates from the
existing one, the more time, energy, and other resources should be mobilised
to make it a success.
Lessons learned (2)
6. Introduce the new system carefully. Apply proper project management,
find a champion, try a pilot, inform people intensively
7. Train and support end-users extensively
8. Measure success conditions and success criteria before, during and
after the development process. Only in this way you can learn for future
developments.
9. Plan for a long process of introduction, incorporation, evaluation and
adaptation. Groupware is not a quick fix.
10. Despite careful preparations groupware is appropriated and adapted in
unforeseen ways. Keep options open for new ways of working with the
groupware, because this may result in creative and innovative processes.
Expert Systems
Expert Systems are computer programs that are derived from a branch
of computer science research called Artificial Intelligence (AI).
AI's scientific goal is to understand intelligence by building computer
programs that exhibit intelligent behavior.
It is concerned with the concepts and methods of symbolic inference, or
reasoning, by a computer, and how the knowledge used to make those
inferences will be represented inside the machine.
AI programs that achieve expert-level competence in solving problems in
task areas by bringing to bear a body of knowledge about specific tasks
are called knowledge-based or expert systems.
Often, the term expert systems is reserved for programs whose knowledge
base contains the knowledge used by human experts, in contrast to
knowledge gathered from textbooks or non-experts.
More often than not, the two terms, expert systems (ES) and knowledge-based
systems (KBS), are used synonymously. Taken together, they represent the
most widespread type of AI application. The area of human intellectual
endeavor to be captured in an expert system is called the task domain.
Task refers to some goal-oriented, problem-solving activity. Domain refers
to the area within which the task is being performed. Typical tasks are
diagnosis, planning, scheduling, configuration, and design. An example of a
task domain is aircraft crew scheduling.
The Building Blocks of Expert Systems
Every expert system consists of two principal parts: the knowledge base,
and the reasoning, or inference, engine.
The knowledge base of expert systems contains both factual and heuristic
knowledge. Factual knowledge is that knowledge of the task domain that is
widely shared, typically found in textbooks or journals, and commonly agreed
upon by those knowledgeable in the particular field.
Heuristic knowledge is the less rigorous, more experiential, more
judgmental knowledge of performance. In contrast to factual knowledge,
heuristic knowledge is rarely discussed, and is largely individualistic. It
is the knowledge of good practice, good judgment, and plausible reasoning
in the field. It is the knowledge that underlies the "art of good guessing."
Knowledge representation formalizes and organizes the knowledge.
One widely used representation is the production rule, or simply rule.
A rule consists of an IF part and a THEN part (also called a condition and
an action).
The IF part lists a set of conditions in some logical combination.
The piece of knowledge represented by the production rule is relevant to
the line of reasoning being developed if the IF part of the rule is
satisfied; consequently, the THEN part can be concluded, or its
problem-solving action taken.
Expert systems whose knowledge is represented in rule form are called
rule-based systems.
Another widely used representation, called the unit (also known as frame,
schema, or list structure), is based upon a more passive view of knowledge.
The unit is an assemblage of associated symbolic knowledge about an entity
to be represented. Typically, a unit consists of a list of properties of
the entity and associated values for those properties.
Since every task domain consists of many entities that stand in various
relations, the properties can also be used to specify relations, and the
values of these properties are the names of other units that are linked
according to the relations.
One unit can also represent knowledge that is a "special case" of another
unit, or some units can be "parts of" another unit.
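The unit (frame) representation can be sketched in a few lines. The example below is a minimal illustration, not a standard frame language: the class name `Unit`, the `is_a` and `part_of` links, and the aircraft entities are all hypothetical. It additionally assumes that a "special case" unit inherits property values from its parent unit, a common convention in frame systems that the text above does not itself state.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class Unit:
    """A unit: a named entity with a list of properties and their values."""
    name: str
    properties: Dict[str, Any] = field(default_factory=dict)
    is_a: Optional["Unit"] = None      # "special case of" relation
    part_of: Optional["Unit"] = None   # "part of" relation

    def get(self, prop: str) -> Any:
        """Look up a property, falling back to more general units.

        Inheritance along is_a is an assumption of this sketch.
        """
        unit: Optional[Unit] = self
        while unit is not None:
            if prop in unit.properties:
                return unit.properties[prop]
            unit = unit.is_a
        raise KeyError(f"{self.name} has no property {prop!r}")


# Hypothetical entities from a crew-scheduling style domain.
aircraft = Unit("Aircraft", {"needs_crew": True})
jet = Unit("Jet", {"engines": 2}, is_a=aircraft)          # special case
engine = Unit("Engine", {"kind": "turbofan"}, part_of=jet)  # part of

if __name__ == "__main__":
    print(jet.get("engines"))      # stored on the unit itself
    print(jet.get("needs_crew"))   # found on the more general unit
    print(engine.part_of.name)     # relation to another unit
```

Storing relations as references to other units mirrors the point that property values can be the names of linked units.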
The problem-solving model, or paradigm, organizes and controls the steps
taken to solve the problem.
One common but powerful paradigm involves chaining of IF-THEN rules to form
a line of reasoning.
If the chaining starts from a set of conditions and moves toward some
conclusion, the method is called forward chaining.
If the conclusion is known (for example, a goal to be achieved) but the
path to that conclusion is not known, then reasoning backwards is called
for, and the method is backward chaining.
These problem-solving methods are built into program modules called
inference engines or inference procedures that manipulate and use knowledge
in the knowledge base to form a line of reasoning.
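Forward and backward chaining over IF-THEN rules can be illustrated with a very small inference engine. This is a sketch under simplifying assumptions: each rule's IF part is a plain conjunction of facts, the THEN part asserts a single new fact, and the diagnostic rules themselves are invented for the example.

```python
from typing import FrozenSet, Iterable, List, Set, Tuple

# A rule is (IF conditions, THEN conclusion); conditions are AND-ed together.
Rule = Tuple[FrozenSet[str], str]

# Hypothetical diagnosis rules, used only for illustration.
RULES: List[Rule] = [
    (frozenset({"engine_cranks_slowly", "lights_dim"}), "battery_weak"),
    (frozenset({"battery_weak"}), "recommend_charge_battery"),
    (frozenset({"no_fuel"}), "recommend_refuel"),
]


def forward_chain(facts: Iterable[str], rules: List[Rule]) -> Set[str]:
    """Start from known conditions and fire rules until nothing new appears."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= known and conclusion not in known:
                known.add(conclusion)
                changed = True
    return known


def backward_chain(goal: str, facts: Set[str], rules: List[Rule],
                   _pending: FrozenSet[str] = frozenset()) -> bool:
    """Start from a goal and work backwards to facts that would support it."""
    if goal in facts:
        return True
    if goal in _pending:          # already trying to prove this goal: avoid loops
        return False
    pending = _pending | {goal}
    return any(
        all(backward_chain(c, facts, rules, pending) for c in conditions)
        for conditions, conclusion in rules
        if conclusion == goal
    )


if __name__ == "__main__":
    observed = {"engine_cranks_slowly", "lights_dim"}
    print(sorted(forward_chain(observed, RULES)))
    print(backward_chain("recommend_charge_battery", observed, RULES))
    print(backward_chain("recommend_refuel", observed, RULES))
```

Forward chaining derives everything that follows from the observations, while backward chaining examines only the rules relevant to the stated goal.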
In artificial intelligence, an expert system is a computer system that
emulates the decision-making ability of a human expert.
Expert systems are designed to solve complex problems by reasoning about
knowledge, represented primarily as if-then rules rather than through
conventional procedural code.
The first expert systems were created in the 1970s and then proliferated
in the 1980s.
Expert systems were among the first truly successful forms of AI software.
An expert system is divided into two sub-systems:
The Inference Engine, which applies the rules to the known facts to deduce
new facts. Inference engines can also include explanation and debugging
capabilities.
The Knowledge Base, which represents facts and rules.
Components of an Expert System
Expert System
As Expert Systems evolved, many new techniques were incorporated into
various types of Inference Engines. Some of the most important of these
were:
1. Truth Maintenance. Truth maintenance systems record the dependencies in
a knowledge base so that when facts are altered, dependent knowledge can be
altered accordingly. For example, if the system learns that Socrates is no
longer known to be living, it will revoke the assertion that Socrates is
mortal.
2. Hypothetical Reasoning. In hypothetical reasoning, the Knowledge Base
can be divided up into many possible views, also known as worlds. This
allows the Inference Engine to explore multiple possibilities in parallel.
In the Socrates example, the system may want to explore the consequences of
both assertions: what will be true if Socrates is living, and what will be
true if he is not?
3. Fuzzy Logic. One of the first extensions of simply using rules to
represent knowledge was also to associate a probability with each rule. So,
not to assert that Socrates is mortal, but to assert that Socrates may be
mortal with some probability value. Simple probabilities were extended in
some systems with sophisticated mechanisms for uncertain reasoning and
combination of probabilities.
4. Ontology Classification. With the addition of Object classes to the
Knowledge Base, a new type of reasoning was possible. Rather than reason
simply about the values of the Objects, the system could also reason
about the structure of the objects. In this simple example, Man can
represent an Object Class and R1 can be redefined as a rule that defines
the class of all men. These types of special-purpose Inference Engines are
known as Classifiers. Although they were not highly used in Expert Systems,
Classifiers are very powerful for unstructured, volatile domains and are a
key technology for the Internet and the emerging Semantic Web.
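The truth maintenance idea in item 1 can be sketched as a store that remembers which facts each derived fact depends on. This is a deliberately simplified illustration: the class name `TruthMaintenance` and its methods are hypothetical, and each derived fact is assumed to have a single set of supporting facts, whereas real truth maintenance systems track multiple alternative justifications.

```python
from typing import Dict, Iterable, Set


class TruthMaintenance:
    """Records which facts each derived fact depends on."""

    def __init__(self) -> None:
        self.facts: Set[str] = set()
        # derived fact -> the facts that justify it (one justification only,
        # a simplifying assumption of this sketch)
        self.supports: Dict[str, Set[str]] = {}

    def assert_fact(self, fact: str, depends_on: Iterable[str] = ()) -> None:
        deps = set(depends_on)
        missing = deps - self.facts
        if missing:
            raise ValueError(f"unknown supporting facts: {sorted(missing)}")
        self.facts.add(fact)
        if deps:
            self.supports[fact] = deps

    def retract(self, fact: str) -> None:
        """Remove a fact and, recursively, everything that depended on it."""
        if fact not in self.facts:
            return
        self.facts.discard(fact)
        self.supports.pop(fact, None)
        dependents = [f for f, deps in self.supports.items() if fact in deps]
        for dependent in dependents:
            self.retract(dependent)


if __name__ == "__main__":
    tms = TruthMaintenance()
    tms.assert_fact("living(Socrates)")
    tms.assert_fact("mortal(Socrates)", depends_on=["living(Socrates)"])
    tms.retract("living(Socrates)")
    print(sorted(tms.facts))
```

Retracting the premise removes the dependent conclusion as well, which mirrors the Socrates example in the list above.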

