
MSBI (Microsoft Business Intelligence)

Data: Information in raw or unorganized form (such as letters, numbers, or symbols) that refers to, or represents, conditions, ideas, or objects. Ex: 100

Information: Data that has been verified to be accurate and timely, i.e. meaningful data. Ex: Product cost is Rs 500

Database: A database is a collection of information that is organized so that it can easily be accessed, managed, and updated. In one view, databases can be classified according to types of content: bibliographic, full-text, numeric, and images.

DBMS: A collection of programs that enables you to store, modify, and extract information from a database. Ex: MySQL, PostgreSQL, Microsoft Access

RDBMS: A DBMS in which data is stored in tables and the relationships among the data are also stored in tables.
Or
A relational database management system (RDBMS) is a database management system (DBMS) that is based on the relational model as introduced by E. F. Codd.
Or
A type of database management system (DBMS) that stores data in the form of related tables.

Table: The data in an RDBMS is stored in database objects called tables. A table is a collection of related data entries and consists of columns and rows.

BI Materials


Field: Every table is broken up into smaller entities called fields. The fields in the CUSTOMERS table consist of ID, NAME, AGE, ADDRESS and SALARY. A field is a column in a table that is designed to maintain specific information about every record in the table.

Null Value: A NULL value in a table is a value in a field that appears to be blank; that is, a field with a NULL value is a field with no value. It is very important to understand that a NULL value is different from a zero value or a field that contains spaces. A field with a NULL value is one that has been left blank during record creation.

SQL Constraints: Constraints are rules enforced on the data columns of a table. They limit the type of data that can go into a table, which ensures the accuracy and reliability of the data in the database. Constraints can be column level or table level: column-level constraints apply to a single column, whereas table-level constraints apply to the whole table. The following constraints are commonly used in SQL.

NOT NULL Constraint: By default, a column can hold NULL values. If you do not want a column to hold NULL values, define a NOT NULL constraint on that column. A NULL is not the same as no data; rather, it represents unknown data.
Ex:
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL,
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2),
        PRIMARY KEY (ID)
    );

DEFAULT Constraint:


The DEFAULT constraint provides a default value for a column when the INSERT INTO statement does not provide a specific value.
Ex:
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL,
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2) DEFAULT 5000.00,
        PRIMARY KEY (ID)
    );

UNIQUE Constraint: The UNIQUE constraint prevents two records from having identical values in a particular column. In the CUSTOMERS table, for example, you might want to prevent two or more people from having an identical age.
Ex:
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL UNIQUE,
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2),
        PRIMARY KEY (ID)
    );

Primary Key: A primary key is a field in a table that uniquely identifies each row/record in a database table. Primary keys must contain unique values, and a primary key column cannot have NULL values. A table can have only one primary key, which may consist of a single field or multiple fields. When multiple fields are used as a primary key, they are called a composite key. If a table has a primary key defined on any field(s), then you cannot have two records with the same value in that field(s).
Note: You would use these concepts while creating database tables.
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL,
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2),
        PRIMARY KEY (ID)
    );
To create a PRIMARY KEY constraint on the "ID" column when the CUSTOMERS table already exists, use the following SQL syntax:
    ALTER TABLE CUSTOMERS ADD PRIMARY KEY (ID);

Delete Primary Key: You can remove the primary key constraint from the table with the following syntax (this form is MySQL syntax):
    ALTER TABLE CUSTOMERS DROP PRIMARY KEY;

Foreign Key: A foreign key is a key used to link two tables together. It is sometimes called a referencing key. You take the primary key field from one table and insert it into the other table, where it becomes a foreign key; i.e., a foreign key is a column or a combination of columns whose values match a primary key in a different table. The relationship between two tables matches the primary key in one of the tables with a foreign key in the second table.
Ex:
CUSTOMERS table:
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL,
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2),
        PRIMARY KEY (ID)
    );
ORDERS table:
    CREATE TABLE ORDERS (
        ID          INT NOT NULL,
        DATE        DATETIME,
        CUSTOMER_ID INT REFERENCES CUSTOMERS (ID),
        AMOUNT      DOUBLE,
        PRIMARY KEY (ID)
    );
If the ORDERS table has already been created and the foreign key has not yet been set, specify the foreign key by altering the table:
    ALTER TABLE ORDERS ADD FOREIGN KEY (CUSTOMER_ID) REFERENCES CUSTOMERS (ID);

DROP a FOREIGN KEY Constraint: To drop a FOREIGN KEY constraint, use the following SQL (MySQL syntax), where fk_name is a placeholder for the name of the foreign key constraint:
    ALTER TABLE ORDERS DROP FOREIGN KEY fk_name;

CHECK Constraint: The CHECK constraint enables a condition to check the value being entered into a record. If the condition evaluates to false, the record violates the constraint and isn't entered into the table.

Ex:
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL CHECK (AGE >= 18),
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2),
        PRIMARY KEY (ID)
    );

INDEX: An index is used to locate and retrieve data from the database very quickly. An index can be created on a single column or a group of columns in a table. When an index is created, each row is assigned a ROWID before the data is sorted. Proper indexes are good for performance in large databases, but you need to be careful when creating them: the choice of fields depends on what you use in your SQL queries. The table below is the one to be indexed; the index itself is created with a separate CREATE INDEX statement that names the table and the column(s).
Ex:
    CREATE TABLE CUSTOMERS(
        ID      INT            NOT NULL,
        NAME    VARCHAR (20)   NOT NULL,
        AGE     INT            NOT NULL,
        ADDRESS CHAR (25),
        SALARY  DECIMAL (18, 2),
        PRIMARY KEY (ID)
    );
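The constraints and the index described above can be tried out with a small, hedged sketch. It uses Python's built-in sqlite3 module rather than MySQL, so the SQL dialect differs slightly from the examples in these notes. The EMAIL column, the index name idx_customers_age and the helper names rejected and constraint_demo are assumptions made for illustration only.

```python
import sqlite3

# Schema modelled on the CUSTOMERS/ORDERS examples. The EMAIL column and the
# index name idx_customers_age are illustrative assumptions.
SCHEMA = """
CREATE TABLE CUSTOMERS (
    ID      INTEGER       NOT NULL,
    NAME    VARCHAR(20)   NOT NULL,
    AGE     INTEGER       NOT NULL CHECK (AGE >= 18),
    EMAIL   VARCHAR(50)   UNIQUE,
    SALARY  DECIMAL(18,2) DEFAULT 5000.00,
    PRIMARY KEY (ID)
);
CREATE TABLE ORDERS (
    ID          INTEGER NOT NULL,
    CUSTOMER_ID INTEGER REFERENCES CUSTOMERS (ID),
    AMOUNT      REAL,
    PRIMARY KEY (ID)
);
CREATE INDEX idx_customers_age ON CUSTOMERS (AGE);
"""

INSERT_CUSTOMER = "INSERT INTO CUSTOMERS (ID, NAME, AGE, EMAIL) VALUES (?, ?, ?, ?)"
INSERT_ORDER = "INSERT INTO ORDERS (ID, CUSTOMER_ID, AMOUNT) VALUES (?, ?, ?)"


def rejected(conn, sql, params):
    """Return True if a constraint rejects the statement, False if it succeeds."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


def constraint_demo():
    conn = sqlite3.connect(":memory:")
    # SQLite enforces foreign keys only when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)

    # Each value is True when the insert was rejected by a constraint.
    results = {
        "valid_customer": rejected(conn, INSERT_CUSTOMER, (1, "Asha", 30, "asha@example.com")),
        "null_name": rejected(conn, INSERT_CUSTOMER, (2, None, 30, "b@example.com")),
        "duplicate_email": rejected(conn, INSERT_CUSTOMER, (3, "Ravi", 41, "asha@example.com")),
        "under_age": rejected(conn, INSERT_CUSTOMER, (4, "Kiran", 15, "k@example.com")),
        "duplicate_id": rejected(conn, INSERT_CUSTOMER, (1, "Meena", 52, "m@example.com")),
        "valid_order": rejected(conn, INSERT_ORDER, (100, 1, 250.0)),
        "orphan_order": rejected(conn, INSERT_ORDER, (101, 99, 10.0)),
    }
    # DEFAULT: SALARY was omitted on insert, so the default value is stored.
    results["default_salary"] = conn.execute(
        "SELECT SALARY FROM CUSTOMERS WHERE ID = 1"
    ).fetchone()[0]
    # Index: ask SQLite how it would run a query that filters on AGE.
    plan = conn.execute(
        "EXPLAIN QUERY PLAN SELECT NAME FROM CUSTOMERS WHERE AGE = 30"
    ).fetchall()
    results["plan"] = " ".join(str(row[-1]) for row in plan)
    return results
```

Calling constraint_demo() returns a dictionary in which the two valid inserts are accepted and every violating insert is rejected; the plan entry holds SQLite's own description of how it would run the AGE query, which should mention the index.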


Data warehouse: A data warehouse is a central repository for all or significant parts of the data that an enterprise's various business systems collect.
Or
A data warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision-making process.

Subject-Oriented: A data warehouse can be used to analyze a particular subject area. For example, "sales" can be a particular subject.

Integrated: A data warehouse integrates data from multiple data sources. For example, source A and source B may have different ways of identifying a product, but in a data warehouse there will be only a single way of identifying a product.

Time-Variant: Historical data is kept in a data warehouse. For example, one can retrieve data from 3 months, 6 months, 12 months, or even longer ago from a data warehouse. This contrasts with a transaction system, where often only the most recent data is kept. For example, a transaction system may hold the most recent address of a customer, whereas a data warehouse can hold all addresses associated with a customer.

Non-volatile: Once data is in the data warehouse, it will not change. So, historical data in a data warehouse should never be altered.

Data mart: A database, or collection of databases, designed to help managers make strategic decisions about their business. Whereas a data warehouse combines databases across an entire enterprise, data marts are usually smaller and focus on a particular subject or department. Some data marts, called dependent data marts, are subsets of larger data warehouses.
Or
A data mart is a subset of the data warehouse. You can also think of a data mart as holding the data of one subject area. For example, consider an organization that has HR, Finance, Communications and Corporate Service divisions. You can create a data mart for each division. In this approach, the historical data is first stored in the data marts and then exported to the data warehouse.
Or
A data mart is a subset of the data warehouse, typically serving a functional area such as marketing or finance, or a particular location of the business (for instance, the mid-Western division).
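The time-variant and non-volatile properties described earlier for the data warehouse can be illustrated with the customer-address example. The sketch below is only an illustration under assumptions: the table names oltp_customer and dw_customer_address, the function names, and the sample addresses are all hypothetical. The transaction table overwrites the address in place, while the warehouse table only ever appends.

```python
import sqlite3


def change_address(conn, customer_id, address, changed_on):
    """Apply one address change to both the transaction system and the warehouse."""
    # Transaction system: keep only the most recent address (overwrite in place).
    conn.execute(
        "INSERT OR REPLACE INTO oltp_customer (id, address) VALUES (?, ?)",
        (customer_id, address),
    )
    # Data warehouse: append a new row; existing history is never altered.
    conn.execute(
        "INSERT INTO dw_customer_address (customer_id, address, changed_on) "
        "VALUES (?, ?, ?)",
        (customer_id, address, changed_on),
    )


def address_demo():
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE oltp_customer (id INTEGER PRIMARY KEY, address TEXT);
        CREATE TABLE dw_customer_address (
            customer_id INTEGER, address TEXT, changed_on TEXT
        );
    """)
    change_address(conn, 1, "12 Lake Road", "2023-01-01")
    change_address(conn, 1, "7 Hill Street", "2024-06-01")

    current = conn.execute("SELECT id, address FROM oltp_customer").fetchall()
    history = conn.execute(
        "SELECT address, changed_on FROM dw_customer_address "
        "WHERE customer_id = 1 ORDER BY changed_on"
    ).fetchall()
    return current, history
```

After two address changes, the transaction table holds a single row with the latest address, while the warehouse table holds both addresses with the dates on which they took effect.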


Business intelligence: Business intelligence usually refers to the information that is available for the enterprise to make decisions on. A data warehousing (or data mart) system is the backend, or infrastructural, component for achieving business intelligence. Business intelligence also includes the insight gained from data mining analysis, as well as unstructured data (thus the need for content management systems). For our purposes here, we will discuss business intelligence in the context of using a data warehouse infrastructure.

ETL: ETL is an abbreviation of the three words Extract, Transform and Load. An ETL process extracts data, mostly from different types of systems, transforms it into a structure that is more appropriate for reporting and analysis, and finally loads it into the database and/or cube(s).

Extract from source: In this step we extract data from different internal and external sources, structured and/or unstructured. Plain queries are sent to the source systems, using native connections, message queuing, ODBC or OLE-DB middleware. The data is put in a so-called Staging Area (SA), usually with the same structure as the source. In some cases we want only the data that is new or has changed; the queries then return only the changes. Some ETL tools can do this automatically, providing a changed data capture (CDC) mechanism.

Transform the Data: Once the data is available in the Staging Area, it is all on one platform and in one database. So we can easily join and union tables, filter and sort the data using specific attributes, pivot to another structure and make business calculations. In this step of the ETL process, we can check data quality and cleanse the data if necessary. After all the data has been prepared, we can choose to implement slowly changing dimensions. In that case we want to keep track, in our analysis and reports, of attributes that change over time, for example when a customer moves from one region to another.

Load into the Data Warehouse: Finally, data is loaded into a data warehouse, usually into fact and dimension tables. From there the data can be combined, aggregated and loaded into data marts or cubes as deemed necessary.
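The three ETL steps above can be sketched in a few lines. This is a minimal illustration, not a real ETL tool: the table names src_sales, stg_sales and dw_sales, the function run_etl, and the simple "updated later than a given date" filter standing in for changed data capture are all assumptions made for the example.

```python
import sqlite3


def run_etl(source_rows, since):
    """Extract rows changed after `since`, cleanse them, and load the warehouse table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE src_sales (id INTEGER, region TEXT, amount TEXT, updated TEXT);
        -- Staging area: same structure as the source.
        CREATE TABLE stg_sales (id INTEGER, region TEXT, amount TEXT, updated TEXT);
        CREATE TABLE dw_sales  (id INTEGER PRIMARY KEY, region TEXT, amount REAL);
    """)
    conn.executemany("INSERT INTO src_sales VALUES (?, ?, ?, ?)", source_rows)

    # Extract: copy only new or changed rows into the staging area.
    conn.execute(
        "INSERT INTO stg_sales SELECT * FROM src_sales WHERE updated > ?",
        (since,),
    )

    # Transform and load: standardize the region, convert the amount to a
    # number, and drop rows that fail a basic data-quality check.
    conn.execute("""
        INSERT INTO dw_sales (id, region, amount)
        SELECT id, UPPER(TRIM(region)), CAST(amount AS REAL)
        FROM stg_sales
        WHERE amount IS NOT NULL AND TRIM(amount) <> ''
    """)
    return conn.execute(
        "SELECT id, region, amount FROM dw_sales ORDER BY id"
    ).fetchall()
```

For example, given three source rows of which one is older than the cut-off date and one has an empty amount, only the remaining row reaches dw_sales, with its region trimmed and upper-cased.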


Or
ETL is short for extract, transform, load: three database functions that are combined into one tool to pull data out of one database and place it into another database. Extract is the process of reading data from a database. Transform is the process of converting the extracted data from its previous form into the form it needs to be in so that it can be placed into another database. Transformation occurs by using rules or lookup tables or by combining the data with other data. Load is the process of writing the data into the target database. ETL is used to migrate data from one database to another, to form data marts and data warehouses, and also to convert databases from one format or type to another.

Dimensions: (Textual Data) Dimensions allow data analysis from various perspectives. For example, a time dimension could show you the breakdown of sales by year, quarter, month, day and hour. A product dimension could help you see which products bring in the most revenue. A supplier dimension could help you choose those business partners who always deliver their goods on time. A customer dimension could help you pick the strategic set of consumers to whom you'd like to extend your very special offers.
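As a rough illustration of the time dimension mentioned above, the sketch below builds one dimension row per day and uses it to break sales down by year and quarter. The function names, the date_key format (YYYYMMDD as an integer) and the sample figures are assumptions made for the example, not a prescribed design.

```python
from datetime import date, timedelta


def build_time_dimension(start, end):
    """Return one row per day with year, quarter, month and day attributes."""
    rows = []
    current = start
    while current <= end:
        rows.append({
            "date_key": int(current.strftime("%Y%m%d")),
            "year": current.year,
            "quarter": (current.month - 1) // 3 + 1,
            "month": current.month,
            "day": current.day,
        })
        current += timedelta(days=1)
    return rows


def sales_by_quarter(sales, time_dim):
    """Sum (date_key, amount) pairs per (year, quarter) using the time dimension."""
    by_key = {row["date_key"]: row for row in time_dim}
    totals = {}
    for date_key, amount in sales:
        dim_row = by_key[date_key]
        group = (dim_row["year"], dim_row["quarter"])
        totals[group] = totals.get(group, 0) + amount
    return totals
```

The same lookup could group by month or by day instead; the sales rows themselves only carry the key, and the dimension supplies the perspective.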


Or
A data warehouse organizes descriptive attributes as columns in dimension tables. For example, a customer dimension's attributes could include first and last name, birth date, gender, etc., and a website dimension would include site name and URL attributes. A dimension table has a primary key column that uniquely identifies each dimension record (row). The dimension table is associated with a fact table using this key. Data in the fact table can be filtered and grouped (sliced and diced) by various combinations of attributes. For example, a Login fact with Customer, Website, and Date dimensions can be queried for the number of males aged 19-25 who logged in to funsportsite.com more than once during the last week of September 2010, grouped by day.

Facts: (Numeric Data) Measures are numeric representations of a set of facts that have occurred. Examples of measures include dollars of sales, number of credit hours, store profit percentage, dollars of operating expenses, number of past-due accounts and so forth.

Dimensional Modeling: Dimensional modeling is the design concept used by many data warehouse designers to build their data warehouse. The dimensional model is the underlying data model used by many of the commercial OLAP products available in the market today. In this model, all data is contained in two types of tables, called the Fact Table and the Dimension Table.

Dimensional Modeling - Fact table: In a dimensional model, the fact table contains the measurements, metrics or facts of business processes. If your business process is Sales, then a measurement of this business process such as "monthly sales number" is captured in the fact table. In addition to the measurements, the only other things a fact table contains are foreign keys to the dimension tables.

Dimensional Modeling - Dimension table: In a dimensional model, the context of the measurements is represented in dimension tables. You can also think of the context of a measurement as its characteristics, such as the who, what, where, when and how of a measurement (subject). In your business process Sales, the characteristics of the "monthly sales number" measurement can be a Location (Where), Time (When) and Product Sold (What).

The Dimension Attributes are the various columns in a dimension table. In the Location dimension, the attributes can be Location Code, State, Country and Zip code. Generally the Dimension Attributes are used in report labels and in query constraints such as where Country='USA'. The dimension attributes also contain one or more hierarchical relationships.

Before designing your data warehouse, you need to decide what this data warehouse contains.
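The relationship between a fact table and a dimension table, including a query constraint such as Country='USA', can be sketched as follows. The table names dim_location and fact_sales, the column names and the sample rows are hypothetical; the fact table holds only a measure and a foreign key to the dimension.

```python
import sqlite3


def sales_by_state(country):
    """Total sales per state for one country, via a fact-to-dimension join."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Dimension table: descriptive attributes plus a primary key.
        CREATE TABLE dim_location (
            location_key INTEGER PRIMARY KEY,
            state        TEXT,
            country      TEXT
        );
        -- Fact table: the measure plus a foreign key to the dimension.
        CREATE TABLE fact_sales (
            location_key INTEGER REFERENCES dim_location (location_key),
            sales_amount REAL
        );
        INSERT INTO dim_location VALUES
            (1, 'Texas', 'USA'), (2, 'Ohio', 'USA'), (3, 'Ontario', 'Canada');
        INSERT INTO fact_sales VALUES
            (1, 100.0), (1, 50.0), (2, 70.0), (3, 999.0);
    """)
    return conn.execute(
        """
        SELECT d.state, SUM(f.sales_amount)
        FROM fact_sales AS f
        JOIN dim_location AS d ON d.location_key = f.location_key
        WHERE d.country = ?
        GROUP BY d.state
        ORDER BY d.state
        """,
        (country,),
    ).fetchall()
```

The dimension attribute country acts as the query constraint and state as the report label, while the numbers being summed come only from the fact table.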


Say you want to build a data warehouse containing monthly sales numbers across multiple store locations, across time and across products. Then your dimensions are:
1. Location
2. Time
3. Product

Each dimension table contains data for one dimension. In the above example you take all your store location information and put it into one single table called Location. Your store location data may be spread across multiple tables in your OLTP system (unlike OLAP), but you need to de-normalize all that data into one single table.

Additive measures are measures that can be added across all dimensions. For example, dollars of sales can be added across all dimensions within a retail store warehouse.

Semi-additive measures are measures that can be added across some, but not all, dimensions. For example, a bank account balance is simply a snapshot in time and cannot be summed over time. However, you could add the balances of multiple accounts of the same customer to get the total balance for that customer.

Non-additive measures are measures that cannot be added across any dimension. A ratio such as store profit percentage is a typical example. Inventory is sometimes given as an example as well, since it is a snapshot in time and cannot be summed over time, and inventory for different products cannot be meaningfully combined; because it can be summed across store locations, though, it is more commonly classified as semi-additive.

Hierarchy defines parent-child relationships among various levels within a single dimension. For instance, in a time dimension the year level is the parent of four quarters, each of which is a parent of three months, which are parents of 28 to 31 days, which are parents of 24 hours. Similarly, in a geography dimension a continent is a parent of countries, a country could be a parent of states, and a state could be a parent of cities.

Level is a column within a dimension table that could be used for aggregating data. For example, a product dimension could have levels of product type (beverage), product category (alcoholic beverage), product class (beer) and product name (miller lite, budlite, corona, etc.).

Member is a value within a dimension level that can be used for aggregating and reporting data. For example, each product category such as beverage, non-consumable, food, clothing, etc. is a member. Each product class such as beer, wine, coke or bottled water would represent a member.
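The ideas of levels, members and additivity described above can be sketched together. In the example below, an additive sales measure is rolled up to any level of a small product dimension, and a semi-additive account balance is summed across accounts but not across time. The dictionary layout, the function names and the sample data (including the product "merlot") are assumptions made purely for illustration.

```python
# A tiny product dimension: each product name maps to its higher levels.
PRODUCT_DIM = {
    "miller lite": {"product_type": "beverage",
                    "product_category": "alcoholic beverage",
                    "product_class": "beer"},
    "corona":      {"product_type": "beverage",
                    "product_category": "alcoholic beverage",
                    "product_class": "beer"},
    "merlot":      {"product_type": "beverage",
                    "product_category": "alcoholic beverage",
                    "product_class": "wine"},
}


def roll_up(sales, level):
    """Sum an additive measure per member of the chosen level."""
    totals = {}
    for product_name, amount in sales:
        member = PRODUCT_DIM[product_name][level]
        totals[member] = totals.get(member, 0) + amount
    return totals


def total_balance_on(snapshots, day):
    """Semi-additive: balances may be summed across accounts for one day."""
    return sum(balance for snap_day, _account, balance in snapshots
               if snap_day == day)


def closing_balance(snapshots, account):
    """Across time, take the latest snapshot instead of summing."""
    day, balance = max((snap_day, bal) for snap_day, acct, bal in snapshots
                       if acct == account)
    return balance
```

Rolling the same sales up to product_class gives one total per member (beer, wine), while rolling up to product_category collapses them into a single member. For balances, summing the monthly snapshots of one account would produce a meaningless figure, so the sketch returns the most recent snapshot instead.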
