Beruflich Dokumente
Kultur Dokumente
Authors:
HARISH GUPTA.GOPU,
03761A1216, Information Technology,
LAKIREDDY BALIREDDY COLLEGE OF ENGINEERING,
MYLAVARAM.
Email: harishgupt@gmail.com
Contact: 9346963676.
&
K.H.S.CHAITANYA,
03761A1215, Information Technology,
LAKIREDDY BALIREDDY COLLEGE OF ENGINEERING,
MYLAVARAM.
Email: khs_chaithanya@yahoo.com
Contact: 0866-2415584
managers.
• Pre-Data Warehouse:
The pre-Data Warehouse zone provides the data for data warehousing. Data
Warehouse designers determine which data contains business value for insertion.
OLTP databases are where operational data are stored. OLTP databases can reside
in transactional software applications such as Enterprise Resource Management
(ERP), Supply Chain, Point of Sale, and Customer Serving Software. OLTP are
design for transaction speed and accuracy.
Metadata ensures the sanctity and accuracy of data entering into the data lifecycle
process. Meta-data ensures that data has the right format and relevancy.
Organizations can take preventive action in reducing cost for the ETL stage by
having a sound Metadata policy. The commonly used terminology to describe
metadata is "data about data".
• Data Cleansing:
Before data enters the data warehouse, the extraction, transformation and
cleaning (ETL) process ensures that the data passes the data quality threshold.
ETLS are also responsible for running scheduled tasks that extract data from
OLTPS.
• Data Repositories:
The Data Warehouse repository is the database that stores active data of
business value for an organization. The Data Warehouse modeling design is
optimized for data analysis. Data Warehouses collects data and is the repository
for historical data. Hence it is not always efficient for providing up-to-date
analysis. This is where ODS, Operational Data Stores, come in. ODS are used to
hold recent data before migration to the Data Warehouse.
• Front-End Analysis:
The last and most critical portion of the Data-Warehouse overview is the
front-end applications that business users will use to interact with data stored in
the repositories. Data Mining is the discovery of useful patterns in data. Data
mining are used for prediction analysis and classification - e.g. what is the
likelihood that a customer will migrate to a competitor. OLAP, Online Analytical
Processing, is used to analyze historical data and slice the business information
required. Reporting tools are used to provide reports on the data. Data are
displayed to show relevancy to the business and keep track of key performance
indicators (KPI). Data Visualization tools are used to display data from the data
repository. Often data visualization is combined with Data Mining and OLAP
Tools. Data visualization can allow the user to manipulate data to show relevancy
and patterns.
Data Warehouse Options:
There are perhaps as many ways to develop data warehouses as there
are organizations. Moreover, there are a number of different dimensions that
need to be considered:
• Scope of the data warehouse
• Data redundancy
• Type of end-user
Figure 2 shows a two-dimensional grid for analyzing the basic options, with the
horizontal dimension indicating the scope of the warehouse, and the vertical
dimension showing the amount of redundant data that must be stored and
maintained.
• Data Redundancy:
There are essentially three levels of data redundancy that enterprises should think
about when considering their data warehouse options:
• "Virtual" or "Point-to-Point" Data Warehouses
• Central Data Warehouses
• Distributed Data Warehouses
There is no one best approach. Each option fits a specific set of requirements, and
a data warehousing strategy may ultimately include all three options.
• Type of End-user:
In the same sense that there are lots of different ways to organize a
data warehouse, it is important to note that there is an increasingly wide range of
end-users as well. In general we tend to think in terms of three broad categories of
end-users:
• Executives and managers
• "Power" users (business and financial analysts, engineers, etc.)
• Support users (clerical, administrative, etc.)
Each of these different categories of user has its own set of requirements for data,
access, flexibility and ease of use.
Future Enhancements:
Data Warehousing is such a new field that it is difficult to estimate what
new developments is likely to most affect it. Clearly, the development of parallel
DB servers with improved query engines is likely to be one of the most important.
Parallel servers will make it possible to access huge databases in much less time.
Another new technology is data warehouses that allow for the mixing of
traditional numbers, text and multi-media. The availability of improved tools for
data visualization (business intelligence) will allow users to see things that could
never be seen before. The motivation behind our idea of extreme Data
Warehousing (X-DW) is to consider where the industry trends will take us a few
generations ahead of where we are now. What will happen when organizations
begin to deploy data warehouses with scalability and service-level requirements
10 times beyond any system currently in existence?
A future involving X-DW raises many interesting technical and social issues. We
believe that technical concerns related to X-DW deployment are well within the
bounds of our future engineering capabilities. The social implications, on the
other hand, will be more difficult to address. Business intelligence plays a critical
role in the real-time enterprise to ensure that decisions are not just fast, but
"intelligent" as well. Gartner predicts that any participant in a competitive
marketplace that has not embraced real-time enterprise capability by 2006 will
have difficulty sustaining a leadership position. Organizations such as Dell,
Vodafone, 3M, Wal-Mart, DHL and others are pursuing worldwide data
acquisition and delivery from their data warehouses. How will the uses of
information change with the evolution of X-DW? Take a look at some case
studies that share what the future might hold.
Conclusion:
Data Warehousing is not a new phenomenon. All large organizations
already have data warehouses, but they are just not managing them. Over the next
few years, the growth of data warehousing is going to be enormous with new
products and technologies coming out frequently. In order to get the most out of
this period, it is going to be important that data warehouse planners and
developers have a clear idea of what they are looking for and then choose
strategies and methods that will provide them with performance today and
flexibility for tomorrow.
Reference:
The Federated Data Warehouse by colin white published
in DM review
March.