
Sourav Dutta, CSE 3rd yr., Roll No. 17

A data warehouse (DW) is a physical repository in which corporate data is specially organized into an enterprise-wide collection that supports management's decision-making process.
It is a large database where corporate data is stored securely for archival, analysis, and reporting. Its specific purpose is to support business decisions, not business operations.

Subject-oriented: It is organized around the major subjects of the enterprise, so it stores decision-support data rather than application-oriented data.
Integrated: Because the data comes from different sources and enterprise-wide applications, it must be made consistent to maintain the integrity of the system.
Time-variant: The source data in the warehouse is accurate and valid only at some point in time or over some time interval.
Non-volatile: Data is refreshed on a regular basis. New data is always added as a supplement to the database rather than as a replacement. The database continually absorbs this new data, incrementally integrating it with previous data.
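The time-variant and non-volatile properties can be illustrated with a minimal sketch. It assumes an in-memory SQLite database and a hypothetical table named customer_history; neither comes from the slides. Rows are only ever inserted, and each row records the date from which it is valid, so the state at any past date can still be queried.

```python
import sqlite3


def make_warehouse():
    """Create an in-memory database with a hypothetical history table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer_history ("
        " customer_id INTEGER, city TEXT, valid_from TEXT)"
    )
    return conn


def record_snapshot(conn, customer_id, city, valid_from):
    """Non-volatile: new data is appended, existing rows are never changed."""
    conn.execute(
        "INSERT INTO customer_history VALUES (?, ?, ?)",
        (customer_id, city, valid_from),
    )


def city_as_of(conn, customer_id, as_of):
    """Time-variant: return the value that was valid on the given ISO date."""
    row = conn.execute(
        "SELECT city FROM customer_history"
        " WHERE customer_id = ? AND valid_from <= ?"
        " ORDER BY valid_from DESC LIMIT 1",
        (customer_id, as_of),
    ).fetchone()
    return row[0] if row else None
```

Dates are stored as ISO strings (YYYY-MM-DD) so that plain string comparison orders them correctly; a real warehouse would more likely use a proper date type.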

1. Data selection.
2. Data pre-processing: data is checked for integrity and inconsistencies are removed.
3. Data transformation to specific formats.
4. Loading data into the warehouse.
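The four steps above can be sketched as a small pipeline. This is only an illustration under stated assumptions: the sample rows, the sales table, and all function names are hypothetical, and an in-memory SQLite database stands in for the warehouse.

```python
import sqlite3

# Hypothetical raw rows, as they might arrive from an operational source.
SOURCE_ROWS = [
    {"id": "1", "region": " east ", "amount": "100.50"},
    {"id": "2", "region": "WEST", "amount": "abc"},      # non-numeric amount
    {"id": "1", "region": "East", "amount": "100.50"},   # duplicate id
    {"id": "3", "region": "west", "amount": "40"},
]


def select_rows(rows):
    """Step 1: keep only rows that carry the fields we need."""
    return [r for r in rows if r.get("id") and r.get("amount")]


def preprocess(rows):
    """Step 2: drop rows with non-numeric amounts or repeated ids."""
    seen, clean = set(), []
    for r in rows:
        try:
            float(r["amount"])
        except ValueError:
            continue
        if r["id"] in seen:
            continue
        seen.add(r["id"])
        clean.append(r)
    return clean


def transform(rows):
    """Step 3: convert to the target types and a uniform region format."""
    return [
        (int(r["id"]), r["region"].strip().upper(), float(r["amount"]))
        for r in rows
    ]


def load(conn, records):
    """Step 4: write the transformed records into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales"
        " (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", records)
    conn.commit()


def run_etl(conn, rows):
    """Run selection, pre-processing, transformation, and loading in order."""
    load(conn, transform(preprocess(select_rows(rows))))
```

Calling run_etl(sqlite3.connect(":memory:"), SOURCE_ROWS) loads only the two rows that survive the checks.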

[Figure: Operators 1-4 and Databases 1-4 feed the Data Warehouse through the Extraction, Transformation, and Loading stages.]

[Figure: The Data Warehouse feeds an Analysis & Reporting tool, which produces a Report (sales against time) for the Manager. Other labels in the figure: Expansion, Improvement.]

A data warehouse tool must support these tasks:

1. Monitoring data from multiple sources.
2. Data quality and integrity checks.
3. Good performance, to ensure efficient query response times and resource utilization.
4. Efficient data storage management.
5. A good backup and recovery facility.
6. Security management.
7. Compatibility with other platforms.
8. A good user interface, at least for data entry and reporting.
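As an illustration of the data quality and integrity checks in the list above, here is a minimal sketch. The function name check_quality and the row layout are assumptions made for this example, not part of any particular warehouse tool.

```python
def check_quality(rows, required, key):
    """Return a list of (row_index, problem) pairs found in the rows.

    rows     -- list of dicts, one per record
    required -- field names that must be present and non-empty
    key      -- field name that must be unique across all rows
    """
    problems = []
    seen_keys = set()
    for i, row in enumerate(rows):
        # Quality check: every required field must hold a value.
        for field in required:
            if row.get(field) in (None, ""):
                problems.append((i, f"missing {field}"))
        # Integrity check: the key must not repeat.
        k = row.get(key)
        if k is not None and k in seen_keys:
            problems.append((i, f"duplicate {key}"))
        seen_keys.add(k)
    return problems
```

A tool could run such a check before loading and either reject the batch or route the flagged rows to a review queue.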

Software from Hardware OEM from

Microsoft -- SQL Server 2008 R2
Teradata -- Data Warehouse Appliance 2650
Oracle Warehouse Builder
SAS Institute -- SAS
Sybase -- SQL Server, IQ, MPP
Hewlett-Packard -- Allbase/SQL

Microsoft Plato
Informatica xData-server
Prism Warehouse Manager
Oracle Express Server

VB Applications
Oracle BIEE
Oracle -- Discoverer2000
Greenplum GPDB
MS Excel Pivot Chart

CPU:
Data warehousing hardware makes extensive use of multicore chips and pipelining. Multicore and pipelined CPU architectures enable much greater parallelization. For example, Oracle Exadata V2 uses a pair of Intel Xeon quad-core CPUs in every server, which in turn lets it handle massive data volumes, say 1 terabyte per query.

Memory:
Fast, high-capacity memory is required to operate on massive data; capacities range up to several gigabytes. The Oracle Exadata Database Server Grid offers 72 gigabytes of memory in each database server.

I/O:
A fast CPU and fast memory achieve nothing if fast I/O devices and interfaces are not provided. The I/O system must therefore be fast enough for massive data delivery and loading.

Storage:
High storage density and transfer speed are required to boost data warehousing performance. High-speed (7000-12000 RPM) SATA/SAS disks with capacities of several terabytes are used to reach total storage capacities of up to several petabytes. The use of SSDs (solid-state disks) is a recent trend.


Sourav Dutta
CSE 3rd yr., St. Thomas College of Engineering & Technology

Contact : me.i.sourav@gmail.com
