
Q1. Define the term business intelligence tools?

Briefly explain how the data from one end gets transformed into information at the other end?
The various tools of this suite are:

Data Integration Tools: These tools extract, transform and load the data from the source databases to the target database. There are two categories: Data Integrator and Rapid Marts. Data Integrator is an ETL tool with a GUI. Rapid Marts is a packaged ETL with pre-built data models for reporting and query analysis that makes initial prototype development easy and fast for ERP applications.
BI Platform: This platform provides a set of common services to deploy, use and manage the tools and applications. These services include providing the security, broadcasting, collaboration, metadata and developer services.
Reporting Tools and Query & Analysis Tools: These tools provide the facility for standard reports generation, ad hoc queries and data analysis.
Performance Management Tools: These tools help in managing the performance of a business by analyzing and tracking key metrics and goals.

Business intelligence tools are a type of application software designed to help in making better business decisions. These tools aid in the analysis and presentation of data in a more meaningful way and so play a key role in the strategic planning process of an organization. They support business intelligence in areas such as market research and segmentation, customer profiling, customer support, profitability, and inventory and distribution analysis, to name a few. BI systems come in various types, including Decision Support Systems, Executive Information Systems (EIS), Multidimensional Analysis software or OLAP (On-Line Analytical Processing) tools, and data mining tools. Whatever the type, the core capability of a BI system is to let its users slice and dice the information from their organization's numerous databases without having to wait for the IT department to develop complex queries and return answers. Although it is possible to build BI systems without the benefit of a data warehouse, in practice most BI systems are an integral part of the user-facing end of the data warehouse. In fact, it is hard to think of building a data warehouse without BI systems. That is why the terms data warehousing and business intelligence are sometimes used interchangeably.

Q2. What do you mean by a data warehouse? What are the major concepts and terminology used in the study of data warehousing?
A data warehouse is a central repository of data that is created by integrating data from one or more disparate sources and is used to produce trend reports for management reporting and decision making. A data warehouse is a part of the data warehousing system. It provides a consolidated, accessible and flexible collection of data for end-user analysis and reporting. This may include sales figures, market performance, accounts payable, and leave details of employees.

Major concepts of data warehousing:
Subject-Oriented: A data warehouse is subject-oriented because the data gives information about a particular subject instead of about a company's ongoing operations.
Integrated: A data warehouse is integrated because the data is gathered from a variety of sources into the data warehouse and merged into a coherent whole.
Time-Variant: A data warehouse is time-variant because all the data in it is identified with a particular time period.
Non-Volatile: Data is stable in a data warehouse. More data is added, but data is never removed. Thus, management can gain a consistent picture of the business. Hence the data warehouse is non-volatile (long-term storage).

Terminology used in the study of data warehousing:
Data Warehouse: A data structure that is optimized for distribution. It collects and stores integrated sets of historical data from multiple operational systems and feeds them to one or more data marts. It may also provide end-user access to support enterprise views of data.

Data Mart: A data structure that is optimized for access. It is designed to facilitate end-user analysis of data. It typically supports a single analytic application used by a distinct set of workers.
Staging Area: Any data store that is designed primarily to receive data into a warehousing environment.
Operational Data Store: A collection of data that addresses the operational needs of various operational units. It is not a component of a data warehousing architecture, but a solution to operational needs.
Multidimensional Analysis: The ability to manipulate information by a variety of relevant categories or dimensions to facilitate analysis and understanding of the underlying data. It is also sometimes referred to as drilling down, drilling across, and slicing and dicing.
Hypercube: A means of visually representing multidimensional data.
Star Schema: A means of aggregating data based on a set of known dimensions. It stores multidimensional data in a two-dimensional Relational Database Management System (RDBMS), such as Oracle.
Snowflake Schema: An extension of the star schema that applies additional dimensions to the dimensions of a star schema in a relational environment.
Multidimensional Database: Also known as MDDB or MDDBS. A class of proprietary, non-relational database management tools that store and manage data in a multidimensional manner, as opposed to the two dimensions associated with traditional relational database management systems.
OLAP Tools: A set of software products that attempt to facilitate multidimensional analysis. They can incorporate data acquisition, data access, data manipulation, or any combination thereof.
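The star schema and the slicing described above can be sketched in a few lines. This is only an illustration under stated assumptions: it uses Python's built-in sqlite3 module rather than Oracle, and the table names (dim_product, dim_region, fact_sales), the columns, and the sample rows are all made up for the example.

```python
import sqlite3

# In-memory relational database standing in for the warehouse RDBMS.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: one row per member of each dimension (hypothetical).
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, name TEXT);

    -- Fact table at the centre of the star: keys into the dimensions
    -- plus a numeric value to aggregate.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        region_id  INTEGER REFERENCES dim_region(region_id),
        year       INTEGER,
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO dim_region  VALUES (1, 'North'), (2, 'South');
    INSERT INTO fact_sales VALUES
        (1, 1, 2023, 100.0),
        (1, 2, 2023, 150.0),
        (2, 1, 2023, 200.0),
        (2, 2, 2024, 50.0);
""")


def sales_by_region(year):
    """Slice the facts on one year, then total the amount per region."""
    rows = conn.execute(
        """
        SELECT r.name, SUM(f.amount)
        FROM fact_sales AS f
        JOIN dim_region AS r ON r.region_id = f.region_id
        WHERE f.year = ?
        GROUP BY r.name
        ORDER BY r.name
        """,
        (year,),
    ).fetchall()
    return dict(rows)


print(sales_by_region(2023))  # {'North': 300.0, 'South': 150.0}
```

Fixing the year is the "slice"; grouping by a different dimension table (for example dim_product) would "dice" the same facts another way.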

Q3. What are the data modeling techniques used in a data warehousing environment?
Two data modeling techniques are relevant in a data warehousing environment: ER modeling and dimensional modeling.

A) ER Modeling
An ER model is represented by an ER diagram, which uses three basic graphic symbols to conceptualize the data: entity, relationship, and attribute.
Entity: An entity is defined to be a person, place, thing, or event of interest to the business or the organization. An entity represents a class of objects, which are things in the real world that can be observed and classified by their properties and characteristics. In the detailed ER model, defining a unique identifier of an entity is the most critical task. These unique identifiers are called candidate keys. From them we select the key that is most commonly used to identify the entity. It is called the primary key.
Relationship: A relationship is represented with lines drawn between entities. It depicts the structural interaction and association among the entities in a model. A relationship is designated grammatically by a verb, such as owns, belongs, and has. The relationship between two entities can be defined in terms of its cardinality. This is the maximum number of instances of one entity that are related to a single instance of another entity, and vice versa. The possible cardinalities are: one-to-one (1:1), one-to-many (1:M), and many-to-many (M:M).
Attributes: Attributes describe the characteristics or properties of the entities. For example, Product ID, Description, and Picture are attributes of the PRODUCT entity. An attribute name should be unique within an entity and should be self-explanatory.

B) Dimensional Modeling
Dimensional modeling is a technique for conceptualizing and visualizing data models as a set of measures that are described by common aspects of the business. It is especially useful for summarizing and rearranging the data and presenting views of the data to support data analysis.
Fact: A fact is a collection of related data items, consisting of measures and context data. Each fact typically represents a business item, a business transaction, or an event that can be used in analyzing the business or business processes.
Dimension: A dimension is a collection of members or units of the same type of views. In a diagram, a dimension is usually represented by an axis. Dimensions are the parameters over which we want to perform Online Analytical Processing (OLAP). For example, in a database for analyzing all sales of products, common dimensions could be Time, Location/Region, Customers and Salespersons.
Measure: A measure is a numeric attribute of a fact, representing the performance or behavior of the business relative to the dimensions. The actual numbers are called variables. For example, measures are the sales in money, the sales volume, the quantity supplied, the supply cost, the transaction amount, and so forth.
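To make the fact, dimension and measure vocabulary concrete, here is a minimal sketch in plain Python. The SaleFact class, its field names, the sample rows and the summarize helper are hypothetical and chosen only for illustration; a real warehouse would hold these in tables rather than in memory.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class SaleFact:
    # Context data: one member from each dimension.
    time: str
    region: str
    customer: str
    # Measures: numeric attributes of the fact.
    sales_amount: float
    quantity: int


# Made-up sample facts, one per business transaction.
FACTS = [
    SaleFact("2024-Q1", "East", "Acme", 1200.0, 10),
    SaleFact("2024-Q1", "West", "Bolt", 800.0, 8),
    SaleFact("2024-Q2", "East", "Acme", 500.0, 4),
    SaleFact("2024-Q2", "East", "Bolt", 300.0, 3),
]


def summarize(facts, dimension, measure):
    """Total one measure for every member of one dimension."""
    totals = defaultdict(float)
    for fact in facts:
        totals[getattr(fact, dimension)] += getattr(fact, measure)
    return dict(totals)


print(summarize(FACTS, "region", "sales_amount"))  # {'East': 2000.0, 'West': 800.0}
print(summarize(FACTS, "time", "quantity"))        # {'2024-Q1': 18.0, '2024-Q2': 7.0}
```

The same facts are summarized along different dimensions without changing the data, which is the rearranging of views that dimensional modeling is meant to support.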

Q4. Discuss the categories into which data is divided before structuring it into a data warehouse?
Populating is the process of getting the source data from operational and external systems into the data warehouse and data marts. The data is captured from the operational and external systems, transformed into a usable format for the data warehouse, and finally loaded into the data warehouse or the data mart.

Capture
- Capture is the process of collecting the source data from the operational systems and other external sources.
- Source data extraction provides a static snapshot of source data as of a specific point in time. It is sufficient to support a temporal data model that does not have a requirement for a continuous history. Source data extraction can produce extract files, tables, or image copies.
Transform
- The transform process converts the captured source data into a format and structure suitable for loading into the data warehouse. The mapping characteristics used to transform the source data are captured and stored as metadata.
- Transformation of data can occur at the record level or at the attribute level. The basic techniques include structural transformation, content transformation, and functional transformation.
Apply
- The apply process uses the files or tables created in the transform process and applies them to the relevant data warehouse or data mart.
- There are four basic techniques for applying data: load, append, constructive merge, and destructive merge.
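The four apply techniques are only named above, so the sketch below rests on an assumed, commonly used reading of them: load replaces the target, append adds to it, destructive merge overwrites records with a matching key, and constructive merge keeps the old record and marks the incoming one as current. The function names, the "id" key and the "current" flag are hypothetical.

```python
def load(target, incoming):
    """Load: replace whatever the target holds with the incoming records."""
    return [dict(r) for r in incoming]


def append(target, incoming):
    """Append: keep the existing records and add the incoming ones after them."""
    return [dict(r) for r in target] + [dict(r) for r in incoming]


def destructive_merge(target, incoming, key="id"):
    """Overwrite records whose key already exists; add the rest as new."""
    merged = {r[key]: dict(r) for r in target}
    for r in incoming:
        merged[r[key]] = dict(r)
    return list(merged.values())


def constructive_merge(target, incoming, key="id"):
    """Keep old records, add incoming ones, and flag which version is current."""
    incoming_keys = {r[key] for r in incoming}
    kept = [
        {**r, "current": r.get("current", True) and r[key] not in incoming_keys}
        for r in target
    ]
    added = [{**r, "current": True} for r in incoming]
    return kept + added


# Made-up sample data: record 2 changes, record 3 is new.
TARGET = [{"id": 1, "city": "Pune"}, {"id": 2, "city": "Delhi"}]
INCOMING = [{"id": 2, "city": "Mumbai"}, {"id": 3, "city": "Goa"}]

print(len(load(TARGET, INCOMING)))               # 2
print(len(append(TARGET, INCOMING)))             # 4
print(len(destructive_merge(TARGET, INCOMING)))  # 3
print(len(constructive_merge(TARGET, INCOMING))) # 4
```

Under this reading, the two merges differ in whether history survives: the destructive merge loses the old value of record 2, while the constructive merge keeps it and marks it as superseded.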

Q5. Discuss the purpose of executive information system in an organization?


The primary purpose of an Executive Information System (EIS) is to support managerial learning about an organization, its work processes, and its interaction with the external environment. Informed managers can ask better questions and make better decisions.

A secondary purpose of an EIS is to allow timely access to information. All of the information contained in an EIS can typically be obtained by a manager through traditional methods. However, the resources and time required to manually compile information in a wide variety of formats, and in response to ever-changing and ever more specific questions, usually inhibit managers from obtaining this information.

An EIS has a powerful ability to direct management attention to specific areas of the organization or specific business problems. Some managers see this as an opportunity to discipline subordinates. Some subordinates fear the directive nature of the system and spend a great deal of time trying to outwit or discredit it. Neither of these behaviors is appropriate or productive. Rather, managers and subordinates can work together to determine the root causes of issues highlighted by the EIS.

In a nutshell, the advantages an EIS offers are:
- Easy for upper-level executives to use; extensive computer experience is not required to operate it
- Provides timely delivery of company summary information
- Allows management to make decisions promptly
- Improves the tracking of information
- Offers efficiency to decision makers

Q6. Discuss the challenges involved in the data integration and coordination process?
In general, most of the data that the warehouse receives is extracted from a combination of legacy mainframe systems, old minicomputer applications, and some client/server systems. These source systems do not conform to the same set of business rules, so they often follow different naming conventions and varied standards for data representation. The process of data integration and consolidation therefore plays a vital role. Here, data integration means combining all relevant operational data into coherent data structures so that they are ready for loading into the data warehouse. Some of the challenges involved in the data integration and consolidation process are as follows.

Identification of an Entity
- Suppose three legacy applications are in use in your organization: an order entry system, a customer service support system, and a marketing system.
- Each of these systems might have its own customer file. As you need to keep a single record for each customer in the data warehouse, you need to get the transactions of each customer from the various source systems and then match them up to load into the data warehouse. This is an entity identification problem, in which you do not know which of the customer records relate to the same customer.
- This problem is prevalent wherever multiple sources exist for the same entities. Other entities prone to this type of problem include vendors, suppliers, employees, and the various products manufactured by a company.
- In the case of three customer files, you have to design complex algorithms to match records from all three files and form groups of matching records. This is a difficult exercise. If the matching criterion is too tight, some records might escape the groups. Similarly, if the matching criterion is too loose, a particular group may include records of more than one customer.

Existence of Multiple Sources
- Another major challenge in the area of data integration and consolidation results from a single data element having more than one source. For instance, cost values are calculated and updated at specific intervals in the standard costing application. Similarly, your order processing application also carries the unit costs for all products.
- Thus there are two sources available to obtain the unit cost of a product, and there could be a slight variation in their values. Which of these systems should be used to store the unit cost in the data warehouse becomes an important question. One easy way of handling this situation is to prioritize the two sources; alternatively, you may select the source on the basis of the last update date.

Implementation of Transformation
- The implementation of data transformation is a complex exercise. You may have to go beyond the manual methods, that is, the usual methods of writing conversion programs while deploying the operational systems.

Transformation for Dimension Attributes
- Now we consider the updating of the dimension tables. The dimension tables are more stable in nature, and so they are less volatile than the fact tables. The fact tables change through an increase in the number of rows, but the dimension tables change through changes to their attributes.
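Two of the challenges above, entity identification and multiple sources, can be sketched as follows. This is a simplified illustration, not a production matching algorithm. It assumes customer records can be compared by name alone using the standard library's difflib.SequenceMatcher. The function names, the threshold values and the sample data are hypothetical.

```python
from datetime import date
from difflib import SequenceMatcher


def similarity(a, b):
    """Case-insensitive similarity between two names, from 0.0 to 1.0."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def group_customers(names, threshold):
    """Greedy grouping: a name joins the first group whose first member
    is at least `threshold` similar; otherwise it starts a new group."""
    groups = []
    for name in names:
        for group in groups:
            if similarity(name, group[0]) >= threshold:
                group.append(name)
                break
        else:
            groups.append([name])
    return groups


def pick_unit_cost(candidates, priority=None):
    """Choose one (source, unit_cost, last_updated) tuple.

    With a priority list, the source listed earliest wins;
    without one, the most recently updated value wins.
    """
    if priority is not None:
        return min(candidates, key=lambda c: priority.index(c[0]))
    return max(candidates, key=lambda c: c[2])


# Made-up customer names as they might appear in three legacy files.
NAMES = ["John Smith", "john smith", "Jon Smith", "Mary Jones"]

# Made-up unit costs for one product from two source systems.
COSTS = [
    ("standard_costing", 10.50, date(2024, 1, 1)),
    ("order_processing", 10.75, date(2024, 3, 1)),
]

print(group_customers(NAMES, 1.0))  # too tight: "Jon Smith" escapes its group
print(group_customers(NAMES, 0.0))  # too loose: every customer in one group
print(pick_unit_cost(COSTS)[0])     # order_processing (latest update)
print(pick_unit_cost(COSTS, ["standard_costing", "order_processing"])[0])  # standard_costing
```

The two extreme thresholds reproduce the failure modes described in the text. A workable criterion lies somewhere in between and usually compares more than one field, such as address or phone number.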
