
CHAPTER 4

Data Warehousing, Access, Analysis, Mining, and Visualization



        

MSS foundation
Many new concepts
Object-oriented databases
Intelligent databases
Data warehouse
Data mining
Online analytical processing
Multidimensionality
Internet / Intranet / Web

Data Warehousing, Access, Analysis, and Visualization


What to do with all the data that organizations collect, store, and use? (Information overload!)

Solution
     

Data warehousing
Data access
Data mining
Online analytical processing (OLAP)
Data visualization
Data sources

The Nature and Sources of Data




Data: Raw
Information: Data organized to convey meaning
Knowledge: Data items organized and processed to convey understanding, experience, accumulated learning, and expertise

DSS Data Items


     

Documents
Pictures
Maps
Sound
Animation
Video

Can be hard or soft

Data Sources
  

Internal
External
Personal

Data Collection, Problems, and Quality




Problems (Table 4.1)
Quality: determines usefulness of data


  

Intrinsic data quality
Accessibility data quality
Representation data quality

Data Quality Issues in Data Warehousing


    

Uniformity
Version
Completeness check
Conformity check
Genealogy check (drill down)
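The slide names these checks without defining them. The sketch below is one possible reading, assuming that a uniformity check tests whether a captured value lies within specified limits and that a completeness check tests whether every value needed for a summary is present. The records, field names, and limits are hypothetical.

```python
# Hypothetical sales records; field names and limits are illustrative assumptions.
records = [
    {"id": 1, "region": "East", "amount": 120.0},
    {"id": 2, "region": "West", "amount": None},
    {"id": 3, "region": "East", "amount": -5.0},
]

def uniformity_check(record, low=0.0, high=10_000.0):
    """Assumed meaning: a captured value lies within specified limits."""
    amount = record["amount"]
    return amount is not None and low <= amount <= high

def completeness_check(rows, required=("id", "region", "amount")):
    """Assumed meaning: return ids of rows missing a value a summary needs."""
    return [r["id"] for r in rows
            if any(r.get(field) is None for field in required)]

# Rows that cannot be summarized because a required value is missing.
incomplete_ids = completeness_check(records)

# Rows whose amount is present but falls outside the assumed limits.
out_of_range_ids = [r["id"] for r in records
                    if r["amount"] is not None and not uniformity_check(r)]

print(incomplete_ids, out_of_range_ids)
```

Flagged ids could then be traced back to their source system, which is roughly what a genealogy (drill-down) check is for.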


The Internet and Commercial Database Services


For external data
The Internet: major supplier of external data


Commercial Data Banks: sell access to specialized databases

Can add external data to the MSS in a timely manner and at a reasonable cost

Decision Support Systems and Intelligent Systems, Efraim Turban and Jay E. Aronson, 6th edition Copyright 2001, Prentice Hall, Upper Saddle River, NJ


The Internet and Commercial Database Servers


Use Web Browsers to


 

Give employees and customers access to vital information
Implement executive information systems
Implement group support systems (GSS)
Database management systems provide data in HTML, on Web servers directly


Database Management Systems in DSS




DBMS: Software program for entering (or adding) information into a database; updating, deleting, manipulating, storing, and retrieving information
A DBMS + modeling language to develop DSS
DBMS to handle LARGE amounts of information
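The operations listed in the DBMS definition (entering, updating, deleting, retrieving) can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in DBMS; the customers table and its rows are hypothetical and not taken from the slides.

```python
import sqlite3

# In-memory database; the table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)

# Entering (adding) information
conn.executemany(
    "INSERT INTO customers (id, name, city) VALUES (?, ?, ?)",
    [(1, "Ada", "Boston"), (2, "Grace", "Austin"), (3, "Edgar", "Denver")],
)

# Updating
conn.execute("UPDATE customers SET city = ? WHERE id = ?", ("Chicago", 2))

# Deleting
conn.execute("DELETE FROM customers WHERE id = ?", (3,))

# Retrieving
rows = conn.execute(
    "SELECT id, name, city FROM customers ORDER BY id"
).fetchall()
conn.close()

print(rows)
```

A production DSS would sit on a server-based DBMS sized for large volumes; SQLite is used here only because it needs no setup.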


Database Organization and Structure


      

Relational databases
Hierarchical databases
Network databases
Object-oriented databases
Multimedia-based databases
Document-based databases
Intelligent databases


Data Warehousing


   

Physical separation of operational and decision support environments
Purpose: to establish a data repository making operational data accessible
Transforms operational data to relational form
Only data needed for decision support come from the TPS
Data are transformed and integrated into a consistent structure
Data warehousing (information warehousing): solves the data access problem
End users perform ad hoc query, reporting, analysis, and visualization
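The transform-and-integrate step can be sketched as follows. This is a minimal illustration, assuming two hypothetical operational extracts (an orders system and a billing system) that represent the same facts differently. All field names and values are invented for the example.

```python
# Two hypothetical operational (TPS) extracts with inconsistent representations.
orders_system = [
    {"cust": "C001", "total_usd": "120.50", "date": "2001-03-05", "clerk": "jm"},
    {"cust": "C002", "total_usd": "80.00", "date": "2001-03-06", "clerk": "kt"},
]
billing_system = [
    {"customer_id": 1, "amount_cents": 4500, "billed": "03/07/2001", "terminal": 9},
]

def from_orders(row):
    """Keep only decision-support fields; 'clerk' is dropped."""
    return {
        "customer_id": int(row["cust"][1:]),   # "C001" -> 1
        "amount": float(row["total_usd"]),     # text -> number
        "date": row["date"],                   # already YYYY-MM-DD
    }

def from_billing(row):
    """Convert cents to dollars and MM/DD/YYYY to YYYY-MM-DD."""
    month, day, year = row["billed"].split("/")
    return {
        "customer_id": row["customer_id"],
        "amount": row["amount_cents"] / 100,
        "date": f"{year}-{month}-{day}",
    }

# The integrated, consistently structured warehouse table.
warehouse = [from_orders(r) for r in orders_system] + [
    from_billing(r) for r in billing_system
]

for record in warehouse:
    print(record)
```

Because every record now shares one key format, one unit, and one date format, end users can query the result without knowing how each source system stores its data.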


Data Warehousing Benefits


Increase in knowledge worker productivity
Supports all decision makers' data requirements
Provides ready access to critical data
Insulates operational databases from ad hoc processing
Provides high-level summary information
Provides drill-down capabilities
Yields
  Improved business knowledge
  Competitive advantage
  Enhances customer service and satisfaction
  Facilitates decision making
  Helps streamline business processes

Data Warehouse Architecture and Process


 

Two-tier architecture
Three-tier architecture


Data Warehouse Components


   

Large physical database
Logical data warehouse
Data mart
Decision support systems (DSS) and executive information system (EIS)
Can feed OLAP


Data Marts


DW Suitability
For organizations where
   

Data are in different systems
Information-based approach to management in use
Large, diverse customer base
Same data have different representations in different systems
Highly technical, messy data formats


Characteristics of Data Warehousing


1. Data organized by detailed subject with information relevant for decision support
2. Integrated data
3. Time-variant data
4. Non-volatile data


OLAP: Data Access and Mining, Querying, and Analysis


Online analytical processing (OLAP)


DSS and EIS computing done by end-users in online systems
Versus online transaction processing (OLTP)


OLAP Activities


Generating queries
Requesting ad hoc reports
Conducting statistical and other analyses
Developing multimedia applications


OLAP uses the data warehouse and a set of tools, usually with multidimensional capabilities


Query tools
Spreadsheets
Data mining tools
Data visualization tools


Using SQL for Querying




SQL (Structured Query Language)
Data language
English-like, nonprocedural, very user-friendly language
Free format
Example:
  SELECT Name, Salary
  FROM Employees
  WHERE Salary > 2000
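The SELECT ... FROM ... WHERE example above can be tried out with a short sketch. It uses Python's built-in sqlite3 module. The three employee rows are hypothetical, and the ORDER BY clause is an addition (not in the slide) so that the result order is predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Employees (Name TEXT, Salary INTEGER)")

# Hypothetical rows, used only to give the query something to filter.
conn.executemany(
    "INSERT INTO Employees (Name, Salary) VALUES (?, ?)",
    [("Alice", 2500), ("Bob", 1800), ("Carol", 3200)],
)

# The slide's query, plus ORDER BY for a deterministic result order.
query = """
    SELECT Name, Salary
    FROM Employees
    WHERE Salary > 2000
    ORDER BY Name
"""
high_earners = conn.execute(query).fetchall()
conn.close()

print(high_earners)
```

The query states what is wanted (names and salaries above 2000) rather than how to find it, which is the sense in which SQL is nonprocedural.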


Data Mining for


      

Knowledge discovery in databases
Knowledge extraction
Data archeology
Data exploration
Data pattern processing
Data dredging
Information harvesting


Major Data Mining Characteristics and Objectives


  

  

Data are often buried deep
Client/server architecture
Sophisticated new tools, including advanced visualization tools, help to remove the information ore
The miner is often an end user with little or no programming skills, empowered by data drills and other power query tools
Often involves finding unexpected results
Tools are easily combined with spreadsheets, etc.
Parallel processing for data mining


Data Mining Application Areas


           

Marketing
Banking
Retailing and sales
Manufacturing and production
Brokerage and securities trading
Insurance
Computer hardware and software
Government and defense
Airlines
Health care
Broadcasting
Law enforcement


Intelligent Data Mining




Use intelligent search to discover information within data warehouses that queries and reports cannot effectively reveal
Find patterns in the data and infer rules from them
Use patterns and rules to guide decision making and forecasting
Five common types of information that can be yielded by data mining: 1) association, 2) sequences, 3) classifications, 4) clusters, and 5) forecasting
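As an illustration of the first type, association, the sketch below counts which items occur together and keeps the rules whose confidence meets a threshold. This is a simple co-occurrence count, not the method of any particular data mining product. The transactions and the 0.8 threshold are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical market-basket transactions (illustrative data only).
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]

# How often each item, and each unordered pair of items, appears.
item_counts = Counter(item for t in transactions for item in t)
pair_counts = Counter(
    pair for t in transactions for pair in combinations(sorted(t), 2)
)

def support(a, b):
    """Fraction of all transactions that contain both items."""
    return pair_counts[tuple(sorted((a, b)))] / len(transactions)

def confidence(antecedent, consequent):
    """Fraction of transactions with the antecedent that also hold the consequent."""
    both = pair_counts[tuple(sorted((antecedent, consequent)))]
    return both / item_counts[antecedent]

# Infer rules "a -> b" in both directions and keep the confident ones.
MIN_CONFIDENCE = 0.8  # assumed threshold
rules = sorted(
    (a, b)
    for pair in pair_counts
    for a, b in (pair, pair[::-1])
    if confidence(a, b) >= MIN_CONFIDENCE
)

print(rules)
```

With this data the only rule that survives is "butter implies bread": every basket containing butter also contains bread, while the reverse holds in only two of three bread baskets.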

Main Tools Used in Intelligent Data Mining




Case-based Reasoning
Neural Computing
Intelligent Agents
Other Tools
  Decision trees
  Rule induction
  Data visualization


Data Visualization and Multidimensionality


Data Visualization Technologies
       

Digital images
Geographic information systems
Graphical user interfaces
Multidimensions
Tables and graphs
Virtual reality
Presentations
Animation


Multidimensionality
  

3-D + Spreadsheets (OLAP has this)
Data can be organized the way managers like to see them, rather than the way that the system analysts do
Different presentations of the same data can be arranged easily and quickly
Dimensions: products, salespeople, market segments, business units, geographical locations, distribution channels, country, or industry
Measures: money, sales volume, head count, inventory profit, actual versus forecast
Time: daily, weekly, monthly, quarterly, or yearly
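The idea of arranging different presentations of the same data can be sketched with a small roll-up function. This is a minimal illustration, assuming a hypothetical fact table with three dimensions (product, region, quarter) and one measure (sales). Real OLAP tools offer far more than this.

```python
from collections import defaultdict

# Hypothetical fact rows: (product, region, quarter, sales); illustrative data.
facts = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 200),
    ("Widget", "East", "Q2", 120),
]

# Position of each dimension inside a fact row; the measure is at index 3.
DIMENSIONS = {"product": 0, "region": 1, "quarter": 2}

def roll_up(rows, *dims):
    """Sum the sales measure grouped by the chosen dimensions."""
    totals = defaultdict(int)
    for row in rows:
        key = tuple(row[DIMENSIONS[d]] for d in dims)
        totals[key] += row[3]
    return dict(totals)

# Two presentations of the same data, each obtained with one call.
by_product = roll_up(facts, "product")
by_region_quarter = roll_up(facts, "region", "quarter")

print(by_product)
print(by_region_quarter)
```

Switching from a product view to a region-by-quarter view changes only the arguments, not the underlying data, which is the convenience the slide attributes to multidimensional tools.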

 

Multidimensionality Limitations
   

Extra storage requirements
Higher cost
Extra system resource and time consumption
More complex interfaces and maintenance
Multidimensionality is especially popular in executive information and support systems


Geographic Information Systems (GIS)




 

 

A computer-based system for capturing, storing, checking, integrating, manipulating, and displaying data using digitized maps
Spatially oriented databases
Useful in marketing, sales, voting estimation, planned product distribution
Available via the Web
Can use with GPS
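One kind of question a spatially oriented database answers is "which location is nearest to this point?". The sketch below is a simplified illustration of that query using the haversine great-circle formula. It is not how any specific GIS product works, and the store list and coordinates are hypothetical.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = radians(lat2 - lat1)
    dlam = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Hypothetical store locations: (name, latitude, longitude).
stores = [
    ("Boston", 42.36, -71.06),
    ("Chicago", 41.88, -87.63),
    ("Denver", 39.74, -104.99),
]

def nearest_store(lat, lon):
    """Name of the store closest to the given point, e.g. a GPS fix."""
    return min(stores, key=lambda s: haversine_km(lat, lon, s[1], s[2]))[0]

# A customer located in New York City (approximate coordinates).
print(nearest_store(40.71, -74.01))
```

A marketing or distribution application could run the same lookup for every customer to assign each one to a sales territory.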


Virtual Reality


   

An environment and/or technology that provides artificially generated sensory cues sufficient to engender in the user some willing suspension of disbelief
Can share data and interact
Can analyze data by creating a landscape
Useful in marketing, prototyping aircraft designs
VR over the Internet through VRML


Business Intelligence on the Web


 

Can capture and analyze data from the Web
Tools deployed on the Web


Summary
 

    

Data for decision making come from internal and external sources
The database management system is one of the major components of most management support systems
Familiarity with the latest developments is critical
Data contain a gold mine of information if organizations can dig it out
Organizations are warehousing and mining data
Multidimensional analysis tools and new enterprisewide system architectures are useful
OLAP tools are also useful

Summary (cont'd.)


 

New data formats for multimedia DBMS
Internet and intranets via Web browser interfaces for DBMS access
Built-in artificial intelligence methods in DBMS

