
DECISION SUPPORT SYSTEM.

DEFINITION.

Decision support systems constitute a class of computer-based information systems, including knowledge-based systems, that support decision-making activities.

OVERVIEW.
A Decision Support System (DSS) is a class of information systems (including but not limited to computerized systems) that support business and organizational decision-making activities. A properly designed DSS is an interactive software-based system intended to help decision makers compile useful information from a combination of raw data, documents, personal knowledge, or business models to identify and solve problems and make decisions.

Typical information that a decision support application might gather and present includes: an inventory of all of your current information assets (including legacy and relational data sources, cubes, data warehouses, and data marts); comparative sales figures between one week and the next; and projected revenue figures based on new product sales assumptions.

HISTORY.

According to Keen (1978), the concept of decision support has evolved from two main areas of research: The theoretical studies of organizational decision making done at the Carnegie Institute of Technology during the late 1950s and early 1960s, and the technical work on interactive computer systems, mainly carried out at the Massachusetts Institute of Technology in the 1960s.

In 1987 Texas Instruments completed development of the Gate Assignment Display System (GADS) for United Airlines. Beginning in about 1990, data warehousing and on-line analytical processing (OLAP) began broadening the realm of DSS. The advent of better and better reporting technologies has seen DSS start to emerge as a critical component of management design.

In the 1970s a DSS was described as "a computer based system to aid decision making". In the late 1970s the DSS movement shifted its focus to "interactive computer-based systems which help decision-makers utilize data bases and models to solve ill-structured problems". In the 1980s a DSS was expected to provide systems "using suitable and available technology to improve effectiveness of managerial and professional activities", and by the end of the 1980s DSS faced a new challenge: the design of intelligent workstations.

TAXONOMIES
Taxonomy is the practice and science of classification. Haettenschwiler differentiates passive, active, and cooperative DSS. Another taxonomy for DSS has been created by Daniel Power. Power differentiates communication-driven DSS, data-driven DSS, document-driven DSS, knowledge-driven DSS, and model-driven DSS.

A communication-driven DSS supports more than one person working on a shared task; examples include integrated tools like Microsoft's NetMeeting or Groove. A data-driven DSS or data-oriented DSS emphasizes access to and manipulation of a time series of internal company data and, sometimes, external data. A document-driven DSS manages, retrieves, and manipulates unstructured information in a variety of electronic formats.

A knowledge-driven DSS provides specialized problem-solving expertise stored as facts, rules, procedures, or in similar structures. A model-driven DSS emphasizes access to and manipulation of a statistical, financial, optimization, or simulation model. Model-driven DSS use data and parameters provided by users to assist decision makers in analyzing a situation; they are not necessarily data-intensive. Dicodess is an example of an open-source model-driven DSS generator. Using scope as the criterion, Power differentiates enterprise-wide DSS and desktop DSS.
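As an illustration of the model-driven idea, the following minimal Python sketch lets a decision maker supply parameters to a simple what-if revenue model and compare scenarios. All names and figures here are hypothetical assumptions and are not drawn from any particular DSS product.

# Minimal sketch of a model-driven DSS: a what-if revenue model driven
# by user-supplied parameters rather than by large volumes of data.

def projected_revenue(units_sold, unit_price, growth_rate, periods):
    """Project revenue per period under a constant-growth assumption."""
    revenue = []
    units = units_sold
    for _ in range(periods):
        revenue.append(units * unit_price)
        units *= (1 + growth_rate)
    return revenue

# Two hypothetical scenarios the decision maker wants to compare.
scenarios = {
    "conservative": {"units_sold": 1000, "unit_price": 25.0, "growth_rate": 0.02, "periods": 4},
    "optimistic":   {"units_sold": 1000, "unit_price": 25.0, "growth_rate": 0.10, "periods": 4},
}

for name, params in scenarios.items():
    quarterly = projected_revenue(**params)
    print(name, [round(q, 2) for q in quarterly], "total:", round(sum(quarterly), 2))

Changing a single assumption (here, the growth rate) and re-running the model is the kind of interactive "what-if" analysis a model-driven DSS is meant to support.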

ARCHITECTURE
Three fundamental components of a DSS architecture are:
i. The database (or knowledge base).
ii. The model (the decision context and user criteria).
iii. The user interface.

DSS technology levels (of hardware and software) may include:
i. The actual application.
ii. The DSS generator.
iii. Tools.

An iterative developmental approach allows the DSS to be changed and redesigned at various intervals. Once the system is designed, it will need to be tested and revised until it produces the desired outcome.
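To make the three components concrete, here is a minimal Python sketch in which a knowledge base, a model, and a user interface are kept as separate pieces. The class names and the inventory reorder rule are purely illustrative assumptions, not a prescribed DSS design.

# Sketch of the three DSS components: database, model, and user interface.

class KnowledgeBase:
    """Database layer: holds the facts the DSS reasons over."""
    def __init__(self, stock_levels):
        self.stock_levels = stock_levels   # e.g. {"widget": 40, "gadget": 5}

class ReorderModel:
    """Model layer: encodes the decision context and user criteria."""
    def __init__(self, reorder_threshold):
        self.reorder_threshold = reorder_threshold

    def recommend(self, knowledge_base):
        return [item for item, qty in knowledge_base.stock_levels.items()
                if qty < self.reorder_threshold]

def user_interface():
    """User interface layer: gathers criteria and presents the recommendation."""
    kb = KnowledgeBase({"widget": 40, "gadget": 5, "gizmo": 12})
    threshold = int(input("Reorder when stock falls below: "))
    model = ReorderModel(threshold)
    print("Items to reorder:", model.recommend(kb))

if __name__ == "__main__":
    user_interface()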

CLASSIFICATION
There are several ways to classify DSS applications. Not every DSS fits neatly into one category; many combine two or more architectures in one. Holsapple and Whinston classify DSS into the following six frameworks:
i. Text-oriented DSS.
ii. Database-oriented DSS.
iii. Spreadsheet-oriented DSS.
iv. Solver-oriented DSS.
v. Rule-oriented DSS.
vi. Compound DSS.

DSS components may be classified as:
i. Inputs: factors, numbers, and characteristics to analyze.
ii. User knowledge and expertise: inputs requiring manual analysis by the user.
iii. Outputs: transformed data from which DSS "decisions" are generated.
iv. Decisions: results generated by the DSS based on user criteria.

APPLICATIONS
Clinical decision support systems for medical diagnosis.
Business and management.
Agricultural production.
Forest management.
National railway.

BENEFITS

Improves personal efficiency.
Expedites problem solving (speeds up the process of problem solving in an organization).
Facilitates interpersonal communication.
Promotes learning or training.
Increases organizational control.
Generates new evidence in support of a decision.
Creates a competitive advantage over the competition.
Encourages exploration and discovery on the part of the decision maker.
Reveals new approaches to thinking about the problem space.
Helps automate managerial processes.

DATA MINING
Data mining is the process of extracting patterns from data. As the volume of collected data grows, data mining is becoming an increasingly important tool for transforming those data into information. It is commonly used in a wide range of profiling practices, such as marketing, surveillance, fraud detection and scientific discovery. The term data mining has also been used to describe data dredging and data snooping; dredging and snooping can, however, serve as exploratory tools when developing and clarifying hypotheses.

BACKGROUND

Early methods of identifying patterns in data include Bayes' theorem (1700s) and regression analysis (1800s). As data sets have grown in size and complexity, direct hands-on data analysis has increasingly been augmented with indirect, automatic data processing. This has been aided by other discoveries in computer science, such as:
i. Neural networks.
ii. Clustering.
iii. Genetic algorithms (1950s).
iv. Decision trees (1960s).
v. Support vector machines (1980s).

Data mining is the process of applying these methods to data with the intention of uncovering hidden patterns. A primary reason for using data mining is to assist in the analysis of collections of observations of behavior. There have been some efforts to define standards for data mining, for example the 1999 European Cross Industry Standard Process for Data Mining (CRISP-DM 1.0) and the 2004 Java Data Mining standard (JDM 1.0). Independent of these standardization efforts, freely available open-source software systems such as the R Project, Weka, KNIME, RapidMiner and others have become an informal standard for defining data-mining processes.

PROCESS
Knowledge Discovery in Databases (KDD) is the name coined by Gregory Piatetsky-Shapiro in 1989 to describe the process of finding interesting, interpretable, useful and novel patterns in data. There are many nuances to this process, but roughly the steps are:
i. Pre-processing.
ii. Data mining (classification, clustering, regression, association rule learning).
iii. Results validation.
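A compressed sketch of these three steps is shown below, assuming the scikit-learn library and its bundled iris sample data are available; a real project would substitute its own data, mining method, and validation strategy.

# Sketch of the KDD steps: (i) pre-processing, (ii) data mining via
# classification, and (iii) results validation on held-out data.
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)

# (i) Pre-processing: split into training/validation sets and scale features.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
scaler = StandardScaler().fit(X_train)
X_train, X_test = scaler.transform(X_train), scaler.transform(X_test)

# (ii) Data mining: fit a decision tree classifier to uncover patterns.
model = DecisionTreeClassifier(random_state=0).fit(X_train, y_train)

# (iii) Results validation: check the learned patterns against unseen data.
print("Hold-out accuracy:", accuracy_score(y_test, model.predict(X_test)))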

APPLICATIONS

Games.
Business.
Science and engineering.
Spatial data mining.
Surveillance.

PRIVACY CONCERNS AND ETHICS.


Data mining requires data preparation which can uncover information or patterns which may compromise confidentiality and privacy obligations. A common way for this to occur is through data aggregation. The threat to an individual's privacy comes into play when the data, once compiled, cause the data miner, or anyone who has access to the newly-compiled data set, to be able to identify specific individuals, especially when originally the data were anonymous.

It is recommended that an individual is made aware of the following before data are collected: the purpose of the data collection and any data mining projects, how the data will be used, who will be able to mine the data and use them, the security surrounding access to the data, and in addition, how collected data can be updated.

DATA WAREHOUSE.
A data warehouse is a repository of an organization's electronically stored data. Data warehouses are designed to facilitate reporting and analysis. An expanded definition for data warehousing includes business intelligence tools, tools to extract, transform, and load data into the repository, and tools to manage and retrieve metadata.

HISTORY.
The concept of data warehousing dates back to the late 1980s when IBM researchers Barry Devlin and Paul Murphy developed the "business data warehouse". In essence, the data warehousing concept was intended to provide an architectural model for the flow of data from operational systems to decision support environments.

In larger corporations it was typical for multiple decision support environments to operate independently. Each environment served different users but often required much of the same stored data. Moreover, the operational systems were frequently reexamined as new decision support requirements emerged. Often new requirements necessitated gathering, cleaning and integrating new data from "data marts" that were tailored for ready access by users.

1960s: General Mills and Dartmouth College, in a joint research project, develop the terms dimensions and facts.
1970s: ACNielsen and IRI provide dimensional data marts for retail sales.
1983: Teradata introduces a database management system specifically designed for decision support.
1988: Barry Devlin and Paul Murphy publish the article "An architecture for a business and information system" in the IBM Systems Journal, where they introduce the term "business data warehouse".
1990: Red Brick Systems introduces Red Brick Warehouse, a database management system specifically for data warehousing.
1991: Prism Solutions introduces Prism Warehouse Manager, software for developing a data warehouse.
1991: Bill Inmon publishes the book Building the Data Warehouse.
1995: The Data Warehousing Institute, a for-profit organization that promotes data warehousing, is founded.
1996: Ralph Kimball publishes the book The Data Warehouse Toolkit.
1997: Oracle 8, with support for star queries, is released.
1998: Microsoft releases Microsoft Analysis Services (then OLAP Services), heavily utilising data warehousing schemas.

ARCHITECTURE.
Architecture, in the context of an organization's data warehousing efforts, is a conceptualization of how the data warehouse is built. The worthiness of the architecture can be judged from how the conceptualization aids in the building, maintenance, and usage of the data warehouse.

Operational database layer: the source data for the data warehouse. An organization's Enterprise Resource Planning systems fall into this layer.
Data access layer: the interface between the operational and informational access layers. Tools to extract, transform, and load data into the warehouse fall into this layer (see the sketch after this list).
Metadata layer: the data directory. This is usually more detailed than an operational system data directory.
Informational access layer: the data accessed for reporting and analysis, and the tools for reporting and analyzing data. Business intelligence tools fall into this layer.
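To make the data access layer concrete, the following toy extract-transform-load pass uses Python's built-in sqlite3 module. The table names and the cents-to-decimal transformation are invented for illustration only.

# Toy ETL pass for the data access layer: extract rows from an operational
# store, transform them, and load them into a warehouse table.
import sqlite3

operational = sqlite3.connect(":memory:")   # stand-in for the operational system
warehouse = sqlite3.connect(":memory:")     # stand-in for the data warehouse

operational.execute("CREATE TABLE orders (id INTEGER, amount_cents INTEGER, region TEXT)")
operational.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                        [(1, 1250, "north"), (2, 3499, "south")])

warehouse.execute("CREATE TABLE fact_orders (id INTEGER, amount REAL, region TEXT)")

# Extract from the operational layer...
for order_id, amount_cents, region in operational.execute(
        "SELECT id, amount_cents, region FROM orders"):
    # ...transform (cents to a decimal amount, region upper-cased)...
    row = (order_id, amount_cents / 100.0, region.upper())
    # ...and load into the warehouse.
    warehouse.execute("INSERT INTO fact_orders VALUES (?, ?, ?)", row)

warehouse.commit()
print(warehouse.execute("SELECT * FROM fact_orders").fetchall())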

NORMALIZED VERSUS DIMENSIONAL APPROACH FOR STORAGE OF DATA.


In a dimensional approach, transaction data are partitioned into either "facts", which are generally numeric transaction data, or "dimensions", which are the reference information that gives context to the facts. In the normalized approach, the data in the data warehouse are stored following, to a degree, database normalization rules.
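A tiny illustration of this split, with invented product and sales figures: numeric measurements live in a fact table, while the descriptive context lives in dimension tables that the facts reference by key.

# Sketch of the dimensional approach: numeric facts reference
# descriptive dimensions by key (a minimal star schema in plain Python).

dim_product = {                      # dimension: context about products
    1: {"name": "widget", "category": "hardware"},
    2: {"name": "ebook",  "category": "digital"},
}
dim_date = {                         # dimension: context about calendar dates
    20240105: {"year": 2024, "quarter": "Q1"},
}

fact_sales = [                       # facts: numeric transaction measurements
    {"product_key": 1, "date_key": 20240105, "units": 3, "revenue": 75.0},
    {"product_key": 2, "date_key": 20240105, "units": 1, "revenue": 9.99},
]

# A typical analytical query: revenue per product category.
revenue_by_category = {}
for fact in fact_sales:
    category = dim_product[fact["product_key"]]["category"]
    revenue_by_category[category] = revenue_by_category.get(category, 0.0) + fact["revenue"]

print(revenue_by_category)   # {'hardware': 75.0, 'digital': 9.99}

In a relational warehouse the same structure is typically implemented as a fact table joined to several dimension tables.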

The main disadvantages of the dimensional approach are: (1) in order to maintain the integrity of facts and dimensions, loading the data warehouse with data from different operational systems is complicated, and (2) it is difficult to modify the data warehouse structure if the organization adopting the dimensional approach changes the way in which it does business.

A disadvantage of the normalized approach is that, because of the number of tables involved, it can be difficult for users both to join data from different sources into meaningful information and to access that information without a precise understanding of the sources of data and of the data structure of the data warehouse.

CONFORMING INFORMATION.
Another important factor in designing a data warehouse is deciding which data to conform and how to conform the data. Typically, extract, transform, load tools are used in this work. Master Data Management has the aim of conforming data that could be considered "dimensions".

Bottom-up design
In the so-called bottom-up approach, data marts are first created to provide reporting and analytical capabilities for specific business processes. The combination of data marts is managed through the implementation of what Kimball calls "a data warehouse bus architecture". The most important management task is making sure dimensions among data marts are consistent. In Kimball's words, this means that the dimensions "conform".

Top-down design
Bill Inmon, one of the first authors on the subject of data warehousing, has defined a data warehouse as a centralized repository for the entire enterprise. In the Inmon vision the data warehouse is at the center of the "Corporate Information Factory" (CIF), which provides a logical framework for delivering business intelligence (BI) and business management capabilities.

Inmon states that the data warehouse is:

Subject-oriented: The data in the data warehouse is organized so that all the data elements relating to the same real-world event or object are linked together.
Non-volatile: Data in the data warehouse is never over-written or deleted; once committed, the data is static, read-only, and retained for future reporting.
Integrated: The data warehouse contains data from most or all of an organization's operational systems, and this data is made consistent.

Hybrid design
Over time it has become apparent to proponents of bottom-up and top-down data warehouse design that both methodologies have benefits and risks. Hybrid methodologies have evolved to take advantage of the fast turn-around time of bottom-up design and the enterprise-wide data consistency of top-down design.

DATA WAREHOUSES VERSUS OPERATIONAL SYSTEMS

Operational systems are optimized for preservation of data integrity and speed of recording of business transactions through use of database normalization and an entity-relationship model. The databases have very fast insert/update performance because only a small amount of data in those tables is affected each time a transaction is processed. Finally, in order to improve performance, older data are usually periodically purged from operational systems.

Data warehouses are optimized for speed of data analysis. Frequently data in data warehouses are denormalised via a dimension-based model. To speed data retrieval, data warehouse data are often stored multiple times. Data warehouse data are gathered from the operational systems and held in the data warehouse even after the data has been purged from the operational systems.

EVOLUTION IN ORGANIZATION USE

Organizations generally start off with relatively simple use of data warehousing. Over time, more sophisticated use of data warehousing evolves. The following general stages of use of the data warehouse can be distinguished:
i. Offline operational database: Data warehouses in this initial stage are developed by simply copying the data of an operational system to another server where the processing load of reporting against the copied data does not impact the operational system's performance.
ii. Offline data warehouse: Data warehouses at this stage are updated from data in the operational systems on a regular basis, and the data warehouse data is stored in a data structure designed to facilitate reporting.
iii. Real-time data warehouse: Data warehouses at this stage are updated every time an operational system performs a transaction (e.g. an order, a delivery, or a booking).
iv. Integrated data warehouse: Data warehouses at this stage are updated every time an operational system performs a transaction. The data warehouses then generate transactions that are passed back into the operational systems.

BENEFITS

A data warehouse provides a common data model for all data of interest regardless of the data's source. This makes it easier to report and analyze information than it would be if multiple data models were used to retrieve information such as sales invoices, order receipts, general ledger charges, etc.
Prior to loading data into the data warehouse, inconsistencies are identified and resolved. This greatly simplifies reporting and analysis.
Information in the data warehouse is under the control of data warehouse users so that, even if the source system data is purged over time, the information in the warehouse can be stored safely for extended periods of time.
Because they are separate from operational systems, data warehouses provide retrieval of data without slowing down operational systems.
Data warehouses can work in conjunction with and, hence, enhance the value of operational business applications, notably customer relationship management (CRM) systems.
Data warehouses facilitate decision support system applications such as trend reports (e.g., the items with the most sales in a particular area within the last two years), exception reports, and reports that show actual performance versus goals.
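As a sketch of the trend-report idea mentioned in the last item above, a warehouse-style query can rank items by sales in a given area over a recent window; the sales records below are made up purely for illustration.

# Sketch of a simple trend report: top-selling items in one area
# over the last two years, computed from warehouse-style records.
from collections import Counter

sales = [  # (year, area, item, units) -- illustrative data only
    (2023, "north", "widget", 120),
    (2024, "north", "gadget", 300),
    (2024, "north", "widget", 180),
    (2022, "north", "gadget", 500),   # outside the two-year window
    (2024, "south", "widget", 999),   # different area
]

def top_items(records, area, since_year):
    totals = Counter()
    for year, region, item, units in records:
        if region == area and year >= since_year:
            totals[item] += units
    return totals.most_common()

print(top_items(sales, area="north", since_year=2023))
# [('widget', 300), ('gadget', 300)] -- ties keep first-seen order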

DISADVANTAGES

Data warehouses are not the optimal environment for unstructured data.
Because data must be extracted, transformed and loaded into the warehouse, there is an element of latency in data warehouse data.
Over their life, data warehouses can have high costs.
Data warehouses can get outdated relatively quickly.
There is a cost of delivering suboptimal information to the organization.
There is often a fine line between data warehouses and operational systems: duplicate, expensive functionality may be developed, or functionality may be developed in the data warehouse that, in retrospect, should have been developed in the operational systems, and vice versa.

APPLICATIONS

Credit card churn analysis.
Insurance fraud analysis.
Call record analysis.
Logistics management.

Thank You
