Course 10056
Table of Contents
A data warehouse is a relational database that contains tables and relationships that provide an integrated view of the data within a business. Compared to transactional systems, however, the tables in a data warehouse are more denormalized to optimize them for querying. Because the data warehouse holds a separate copy of the data found on transactional systems, reporting and data analysis can take place without affecting the performance of the transactional systems. A data warehouse can consist of one or more data marts. A data mart is a set of interrelated tables that contains its own fact table and a number of dimension tables.
OLAP
OLAP refers to the multidimensional analysis of data. SSAS consumes information from a data warehouse and stores it in cubes within an OLAP database. Cubes can store detailed data; however, their power lies in creating preaggregated data that persists within the cube. You can arrange the aggregated data so that it is intersected by dimensions that provide contextual information for the aggregated data. Dimensions can include contextual information about customers, employees or orders. This allows analysis to be performed far more efficiently than it could be against the same data in a transactional system. An OLAP database can store one or more cubes.
Data mining
Data mining adds an exciting aspect to SSAS in that it uses mathematical algorithms to analyze the data in either a cube or a relational table. This analysis can involve trend analysis, data classification or clustering, and sequence analysis. Data mining allows you to explore your data and find patterns that may not have been immediately evident. SSAS provides the ability to create data mining structures that allow you to pass the source data through data mining algorithms known as data mining models. It also provides validation tools that you can use to check the accuracy of the results that are returned by the data mining structures. Microsoft Excel 2007 also has a data mining add-in that enables you to expose data mining structures through a familiar client tool.
Dashboards and scorecards
Key performance indicators (KPIs) are a feature of SSAS. They are stored in an OLAP database but can be exposed through client tools such as SSRS, Microsoft Excel and Microsoft Office SharePoint Server. Providing key business metrics through these visual indicators is a powerful feature that enables you to build scorecards and digital dashboards in the client tools. Storing the KPIs on SSAS helps you to manage them centrally.
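As a rough illustration of what a KPI encapsulates, the following Python sketch compares a measured value with a goal and returns a status that a client tool could render as a visual indicator. It is not SSAS code. The function name, the 90 percent warning threshold and the -1/0/1 status scale are all assumptions made for this example.

```python
def kpi_status(value, goal, warning_ratio=0.9):
    """Return 1 (on target), 0 (warning) or -1 (off target).

    The thresholds and the -1/0/1 scale are hypothetical choices for
    this sketch; a real KPI defines its own status logic.
    """
    if goal <= 0:
        raise ValueError("goal must be positive")
    ratio = value / goal
    if ratio >= 1.0:
        return 1
    if ratio >= warning_ratio:
        return 0
    return -1


# Hypothetical sales figures compared against a goal of 100 units.
statuses = [kpi_status(v, 100) for v in (120, 95, 50)]
```

Because the status logic lives in one place, every client that displays the indicator shows the same result, which is the benefit of central management described above.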
Reporting
Reports are the objects of the SQL Server BI stack that users interact with most. Using SSRS, you can provide flexible standard reports that contain interactive features, such as parameters and drill-through capabilities, that involve the user in the report. You can also automate the delivery of reports through subscription mechanisms such as e-mail. Furthermore, SSRS enables power users to create their own reports for ad hoc requirements.
You can use Business Intelligence Development Studio to create reports. Report Designer can be made available to end users so that they can create their own reports. Report Manager is a Web-based front end that enables users to view reports and enables report administrators to manage reports.
Microsoft Office SharePoint
Although Microsoft Office SharePoint Server (MOSS) is not a part of the SQL Server BI stack, it would be remiss not to acknowledge the growing importance of MOSS in delivering BI to the end user. MOSS can act as a central repository of business information within an organization. This can include the storage of documents and images. You can also set up calendars and newsgroups that an organization can use to exchange information. You can integrate SSRS with MOSS during the installation of Reporting Services, which enables you to host and store reports within MOSS. Furthermore, Microsoft announced in January 2009 that Microsoft Office PerformancePoint Server would be integrated into SharePoint. This would provide greater capabilities for delivering digital dashboards and scorecards within MOSS.
Microsoft Excel
Microsoft Excel is a popular client tool for BI solutions. You can export SSRS reports to Excel. However, its power is evident in the way that it integrates with SSAS. You can connect to SSAS cubes by using Excel and create pivot table reports. You can also download a number of Excel add-ins, such as the Analysis Services Add-in for Excel and the Data Mining Add-in for Excel. These add-ins provide more sophisticated integration with Analysis Services than the pivot table alone.
Dimension Tables
Dimension tables contain columns that hold information about a business entity. Examples of dimension tables include time, region and customers. The columns contain information that is specific to the entity. In the case of the customers dimension table, columns can include Firstname and Lastname. You can define as many columns as you need to describe the customer entity precisely. However, adding unnecessary columns increases the volume of data and the time it takes to populate the dimension table by using SSIS.
Dimension tables should include two columns that are important in loading the data warehouse. The first is typically a dimension key known as a surrogate key. This is usually a primary key column that uses the IDENTITY property to populate the column value automatically. It also provides the relationship to the fact table. The second is referred to as an application key column. This column holds the key value of the record in the original transactional system.
You may also wish to add columns that track changes to the records within the dimension table. For example, you can add a ModifiedDate column that records the date when a record changed. To keep historical records of changed data, you could include StartDate and EndDate columns that help you to determine how long a record was in a particular state.
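The column roles described above can be sketched in a few lines of Python. The example uses the standard library sqlite3 module as a stand-in for the data warehouse, so the SQL dialect differs from Transact-SQL (AUTOINCREMENT plays the role of an identity column here). The table name CustomerDim comes from the course text. The column names CustomerKey and CustomerAppKey, the function load_customer and the sample data are assumptions for illustration only.

```python
import sqlite3

# In-memory SQLite database standing in for the data warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE CustomerDim (
        CustomerKey    INTEGER PRIMARY KEY AUTOINCREMENT, -- surrogate key
        CustomerAppKey TEXT NOT NULL,  -- application key from the source system
        Firstname      TEXT,
        Lastname       TEXT,
        StartDate      TEXT NOT NULL,  -- when this version became current
        EndDate        TEXT            -- NULL marks the current version
    )
""")


def load_customer(app_key, firstname, lastname, load_date):
    """Add a customer version, closing the previous version if it changed."""
    current = conn.execute(
        "SELECT CustomerKey, Firstname, Lastname FROM CustomerDim "
        "WHERE CustomerAppKey = ? AND EndDate IS NULL",
        (app_key,),
    ).fetchone()
    if current and current[1:] == (firstname, lastname):
        return current[0]  # nothing changed, keep the current row
    if current:
        # Close the old version so its history is preserved.
        conn.execute(
            "UPDATE CustomerDim SET EndDate = ? WHERE CustomerKey = ?",
            (load_date, current[0]),
        )
    cursor = conn.execute(
        "INSERT INTO CustomerDim (CustomerAppKey, Firstname, Lastname, StartDate) "
        "VALUES (?, ?, ?, ?)",
        (app_key, firstname, lastname, load_date),
    )
    return cursor.lastrowid


first_key = load_customer("C-100", "Ann", "Smith", "2008-01-01")
second_key = load_customer("C-100", "Ann", "Jones", "2009-06-01")
history = conn.execute(
    "SELECT CustomerKey, Lastname, StartDate, EndDate "
    "FROM CustomerDim ORDER BY CustomerKey"
).fetchall()
```

The same source record (application key C-100) ends up with two surrogate keys, one per version, and the StartDate and EndDate columns show how long each version was current.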
Fact Tables
The fact table is at the heart of the data mart and typically consists of three groups of columns. The first is a primary key column that maintains the integrity of the fact table itself. The second group consists of foreign keys. Each of these columns relates to a dimension table primary key to provide a relationship between the fact table and the contextual information provided by the dimension table. Imagine that the image represents a Sales data mart, and that the three smaller tables represent dimension tables called CustomerDim, TimeDim and RegionDim. The fact table, which would be the big table in the image, would contain three foreign key columns, one relating directly to each dimension table.
The fact table also contains a third group of columns known as measures. These columns hold business metrics such as sales units, sales amounts and cost amounts. As a result, the fact table ideally consists of business information represented by integer values.
On occasion, a fourth group of columns may be seen within the fact table. This may be one or more columns that are referred to as degenerate dimensions. A degenerate dimension is information about a business entity that is stored within the fact table itself rather than in a separate dimension table. In the example, the CustomerDim, TimeDim and RegionDim dimension tables provide contextual information about customers, time and regions. A central fact table holds foreign key references to the dimension tables and includes measures such as SalesUnits and OrderQuantity. We can also include a SalesOrderNumber for each sale record. Rather than creating a separate dimension table, which would require a join just to return one piece of contextual information, it is more efficient to store this value within the fact table itself.
Fact tables are created like any other table in SQL Server. You also define the primary and foreign key relationships within the database engine. Therefore, each row is subject to the 8-KB limit on the amount of information that can be stored in a row within a table.
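The following Python sketch builds the Sales data mart described above as a small star schema and queries it. It uses the standard library sqlite3 module in place of the SQL Server database engine, so the syntax is SQLite rather than Transact-SQL. The dimension table names and the SalesUnits and SalesOrderNumber columns come from the text. The SalesFact table name, the key column names and the sample rows are assumptions for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustomerDim (CustomerKey INTEGER PRIMARY KEY, Lastname TEXT);
    CREATE TABLE TimeDim     (TimeKey INTEGER PRIMARY KEY, CalendarYear INTEGER);
    CREATE TABLE RegionDim   (RegionKey INTEGER PRIMARY KEY, RegionName TEXT);

    CREATE TABLE SalesFact (
        SalesKey         INTEGER PRIMARY KEY,   -- group 1: primary key
        CustomerKey      INTEGER NOT NULL REFERENCES CustomerDim(CustomerKey),
        TimeKey          INTEGER NOT NULL REFERENCES TimeDim(TimeKey),
        RegionKey        INTEGER NOT NULL REFERENCES RegionDim(RegionKey),
        SalesUnits       INTEGER NOT NULL,      -- group 3: measures
        SalesAmount      INTEGER NOT NULL,
        SalesOrderNumber TEXT NOT NULL          -- degenerate dimension
    );

    INSERT INTO CustomerDim VALUES (1, 'Smith'), (2, 'Jones');
    INSERT INTO TimeDim     VALUES (1, 2008), (2, 2009);
    INSERT INTO RegionDim   VALUES (1, 'North'), (2, 'South');
    INSERT INTO SalesFact VALUES
        (1, 1, 1, 1, 5, 500, 'SO1001'),
        (2, 2, 1, 1, 3, 300, 'SO1002'),
        (3, 1, 2, 2, 4, 400, 'SO1003');
""")

# Measures are aggregated; dimensions supply the context for each total.
totals = conn.execute("""
    SELECT r.RegionName, t.CalendarYear,
           SUM(f.SalesUnits), SUM(f.SalesAmount)
    FROM SalesFact AS f
    JOIN RegionDim AS r ON r.RegionKey = f.RegionKey
    JOIN TimeDim   AS t ON t.TimeKey   = f.TimeKey
    GROUP BY r.RegionName, t.CalendarYear
    ORDER BY r.RegionName, t.CalendarYear
""").fetchall()

# The degenerate dimension is read straight from the fact table, no join needed.
order_numbers = [row[0] for row in conn.execute(
    "SELECT SalesOrderNumber FROM SalesFact ORDER BY SalesKey")]
```

Returning the order numbers requires no join at all, which is the efficiency argument for keeping a degenerate dimension in the fact table.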
Dimension tables may also be shared across different data marts to provide data consistency.
Pre-installation checks
Security Considerations
Follow security best practice. Best practices for Microsoft SQL Server 2008 security should be followed in the following areas:
Physical security
o Ensure that the SQL Server computer and backup devices are physically secure.
Firewall configuration
o Ensure that only the relevant ports are open. Inside a Windows domain, configure interior firewalls to permit Windows Authentication.
Service isolation
o Avoid installing SQL Server on a domain controller, and use a different service account for each SQL Server component.
Disable NetBIOS and Server Message Block
o Disable NetBIOS and Server Message Block to mitigate security threats in environments that use only Domain Name System (DNS).
1. In Business Intelligence Development Studio, create or open an Integration Services project.
2. In Solution Explorer, right-click the SSIS Packages node, and then click Upgrade All Packages to upgrade all the packages under this node.
To run the wizard from SQL Server Management Studio, perform the following step:
In SQL Server Management Studio, connect to Integration Services, expand the Stored Packages node, right-click the File System or MSDB node, and then click Upgrade Packages.
To run the wizard at the command prompt, perform the following step:
At the command prompt, run the SSISUpgrade.exe file from the C:\Program Files\Microsoft SQL Server\100\DTS\Binn folder.
When you run the SSIS Package Upgrade Wizard from Business Intelligence Development Studio, the wizard automatically stores both the original packages and upgraded packages in the same folder in the file system. You can configure the wizard to back up the original packages. When you run the SSIS Package Upgrade Wizard from SQL Server Management Studio or at the command prompt, you can specify different storage locations for the original and upgraded packages. To back up the original packages, make sure to specify that both the original and upgraded packages are stored in the same folder in the file system. If you specify any other storage options, the wizard will not be able to back up the original packages.
When the wizard backs up the original packages, the wizard stores a copy of the original packages in an SSISBackupFolder folder. The wizard creates this SSISBackupFolder folder as a subfolder to the folder that contains the original packages and the upgraded packages. To back up the original packages in SQL Server Management Studio or at the command prompt, perform the following steps:
1. Save the original packages to a location on the file system.
2. In SQL Server Management Studio or at the command prompt, run the SSIS Package Upgrade Wizard.
3. On the Select Source Location page of the wizard, set the Package source property to File System.
4. On the Select Destination Location page of the wizard, select Save to source location to save the upgraded packages to the same location as the original packages.
5. On the Select Package Management Options page of the wizard, select the Backup original packages option.
To back up the original packages in Business Intelligence Development Studio, perform the following steps:
1. Save the original packages to a location on the file system.
2. In Business Intelligence Development Studio, run the SSIS Package Upgrade Wizard.
3. On the Select Package Management Options page of the wizard, select the Backup original packages option.
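The folder layout that results from the backup option described above can be pictured with a short Python sketch. This is not how the wizard is implemented and it performs no upgrade. It only copies package files into an SSISBackupFolder subfolder of the folder that holds them. The function name back_up_packages and the sample package name are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def back_up_packages(package_folder):
    """Copy each .dtsx package into an SSISBackupFolder subfolder.

    Illustrative only: mimics the folder layout the text describes,
    not the wizard's actual behavior.
    """
    source = Path(package_folder)
    backup = source / "SSISBackupFolder"
    backup.mkdir(exist_ok=True)
    copies = []
    # glob is not recursive, so files already in the backup folder are skipped.
    for package in sorted(source.glob("*.dtsx")):
        copies.append(Path(shutil.copy2(package, backup / package.name)))
    return copies


# Demonstrate against a temporary folder containing one placeholder package.
with tempfile.TemporaryDirectory() as folder:
    (Path(folder) / "LoadCustomers.dtsx").write_text("<package/>")
    backed_up = [copy.name for copy in back_up_packages(folder)]
    backup_exists = (Path(folder) / "SSISBackupFolder" / "LoadCustomers.dtsx").exists()
    original_exists = (Path(folder) / "LoadCustomers.dtsx").exists()
```

The original and the backup copy share one parent folder, which is why the wizard can only back up packages when the source and destination are the same file system location.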
SSAS
SQL Server setup can be used to upgrade an instance of SQL Server 2000 or SQL Server 2005 Analysis Services to SQL Server 2008. However, you must consider that some functionality may be lost on completing the upgrade because of differences between the versions of Analysis Services. You can run the SQL Server 2008 Upgrade Advisor, which provides a summary of the functionality that may be lost. For example, linked cubes that were available in Analysis Services 2000 are no longer available and are replaced by linked measure groups.
The Analysis Services Migration Wizard can be used to upgrade SQL Server 2000 Analysis Services databases to SQL Server 2008 Analysis Services. The wizard copies SQL Server 2000 Analysis Services database objects and then re-creates them on an instance of SQL Server 2008 Analysis Services. The source databases remain intact, and you can manually delete the old databases. You can start the Migration Wizard from the Analysis Services server node in Object Explorer in SQL Server Management Studio, or by running the program MigrationWizard.exe from the command prompt.
Upgrading Business Intelligence Components
Migrate existing Analysis Services databases by using the Migration Wizard
To migrate the existing Analysis Services databases by using the Migration Wizard, perform the following steps:
1. In SQL Server Management Studio, in Object Explorer, start the Migration Wizard from an Analysis Services server node. You can also start the wizard at the command prompt, by running the program MigrationWizard.exe.
2. The wizard starts and displays the Welcome to the Analysis Services Migration Wizard page. Read the introductory message and then click Next to continue.
3. On the Specify Source and Destination page, identify the source SQL Server 2000 Analysis Services server and instance name, and then identify the destination SQL Server 2008 Analysis Services server and instance name. Instead of designating a destination server, you can also decide to save the database schema to a script file and complete the migration later. You can do this by using an Analysis Services Execute DDL task in a SQL Server 2008 Integration Services (SSIS) package.
4. On the Select Databases to Migrate page, select the check boxes next to the databases that you want to migrate. You can specify different names for the destination databases if you want.
5. On the Validating Databases page, the wizard analyzes the databases to be migrated and reports any issues that it discovers.
6. On the Migrating Databases page, the wizard reports its progress as it performs the database migration.
7. On the Completing the Wizard page, the wizard reports the results of migration. Click Finish to complete the wizard.
8. After you migrate a database, you must process the database from the original data source before you can query the database.
9. After migration, you might want to review Migration Considerations (Analysis Services) to understand the differences between Analysis Services database versions.
SSRS
The most notable aspect of migrating to SQL Server 2008 Reporting Services from SQL Server 2000 or 2005 Reporting Services is that there is no longer a dependency on Internet Information Services (IIS). Interoperability issues between IIS 6.0/7.0 and Reporting Services occur when IIS Web sites have virtual directory names that are identical to those used by Reporting Services. This problem can be prevalent when you run side-by-side deployments of different versions of Reporting Services, because both versions create the same Reports and ReportServer URL reservations. If this occurs, use the Files-only mode installation option for SQL Server 2008 Reporting Services. You can then use the Reporting Services Configuration tool to define different Web site names and avoid the interoperability issue.
SQL Server 2008 Reporting Services is now managed as a single service that defines the security context under which Report Manager, the report server Web service and a background processing application run. The report server service account can be defined during the installation and can also be changed by using the Reporting Services Configuration tool.
The upgrade can be performed through SQL Server setup, and the Upgrade Advisor can be used to determine whether there are issues that may prevent a successful upgrade. You can also install a new instance of SQL Server 2008 Reporting Services and then migrate the report server databases from the old instance to the new instance through the following steps:
Back up the database, the application and the configuration files.
Back up the encryption key.
Install a new instance of SQL Server 2008. If you are using the same hardware, you can install SQL Server 2008 side-by-side with your existing SQL Server 2000 or 2005 installation. However, if you do this, you might need to install SQL Server 2008 as a named instance.
Move the report server database and other application files from your SQL Server 2000 or 2005 installation to your new SQL Server 2008 installation.
Move any custom application files to the new installation.
Configure the report server.
Edit RSReportServer.config to include any custom settings from your previous installation.
Optionally, configure custom Access Control Lists (ACLs) for the new Reporting Services Windows service group.
Test your installation.
Remove unused applications and tools after you have confirmed that the new instance is fully operational.
Data Formatting transformations to change the format of the data.
Column transformations to copy, export or import columns.
Multiple Data Flow transformations to merge, split or join data together.
Custom transformations using .NET and ActiveX to perform custom transformations.
Slowly Changing Dimension to manage dimension tables in a data warehouse.
Data Analysis transformations such as data mining and Pivot for data analysis.
Data Sampling transformations to perform row counts and sampling.
Audit transformations to audit data changes.
Fuzzy transformations to apply fuzzy logic to data within a transform to standardize data.
Term transformations to extract specific data from text.
Additionally, Integration Services provides paths that connect the output of one component to the input of another component. Paths define the sequence of components and let you add annotations to the data flow or view the source of the column.
Event handlers
An event handler is a component that executes tasks in response to an event that occurs at the package, container or task level at run time. The tasks that can be performed within an event handler are the same tasks that are available within the control flow. The events that can be handled include:
OnError. This event is raised by an executable when an error occurs.
OnExecStatusChanged. This event is raised by an executable when its execution status changes.
OnInformation. This event is raised during the validation and execution of an executable to report information. This event conveys information only, no errors or warnings.
OnPostExecute. This event is raised by an executable immediately after it has finished running.
OnPostValidate. This event is raised by an executable when its validation is finished.
OnPreExecute. This event is raised by an executable immediately before it runs.
OnPreValidate. This event is raised by an executable when its validation starts.
OnProgress. This event is raised by an executable when measurable progress is made by the executable.
OnQueryCancel. This event is raised by an executable to determine whether it should stop running.
OnTaskFailed. This event is raised by a task when it fails.
OnVariableValueChanged. This event is raised by an executable when the value of a variable changes. The event is raised by the executable on which the variable is defined.
OnWarning. This event is raised by an executable when a warning occurs.
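The event model listed above can be sketched as a toy Python class. This is not the Integration Services object model. The Executable class, its methods and the sample task are hypothetical, and only four of the events are raised. The sketch also assumes that an event raised by a task is passed up to its parent container, which is how a package-level handler can react to a task-level error.

```python
from collections import defaultdict


class Executable:
    """Toy stand-in for a package, container or task that raises events."""

    def __init__(self, name, work, parent=None):
        self.name = name
        self.work = work          # callable that does the executable's job
        self.parent = parent      # enclosing container or package, if any
        self.handlers = defaultdict(list)

    def on(self, event, handler):
        """Attach a handler for one named event."""
        self.handlers[event].append(handler)

    def raise_event(self, event, source, detail=None):
        """Run local handlers, then pass the event up to the parent."""
        for handler in self.handlers[event]:
            handler(source, detail)
        if self.parent is not None:
            self.parent.raise_event(event, source, detail)

    def execute(self):
        self.raise_event("OnPreExecute", self.name)
        try:
            self.work()
        except Exception as exc:
            self.raise_event("OnError", self.name, str(exc))
            self.raise_event("OnTaskFailed", self.name)
        finally:
            self.raise_event("OnPostExecute", self.name)


def failing_work():
    raise RuntimeError("file not found")


log = []
package = Executable("Package", work=lambda: None)
task = Executable("LoadFile", work=failing_work, parent=package)

# A package-level handler notifies someone; a task-level handler cleans up.
package.on("OnError", lambda source, detail: log.append(("notify", source, detail)))
task.on("OnPostExecute", lambda source, detail: log.append(("cleanup", source, detail)))

task.execute()
```

After the task runs, the log holds the notification raised through the package-level OnError handler followed by the cleanup entry from the task-level OnPostExecute handler.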
Event handlers can perform the following tasks:
Clean up temporary data storage when a package or task finishes running.
Retrieve system information to assess resource availability before a package runs.
Refresh data in a table when a lookup in a reference table fails.
Send an e-mail message when an error or a warning occurs or when a task fails.
For example, an OnError event is raised when an error occurs. You can create custom event handlers for these events to extend package functionality and make packages easier to manage at run time.
Variables
Variables provide additional functionality to SSIS packages. SSIS provides a host of system variables that can be used to customize a package. For example, you could use the system variables to write package information to a table as part of custom logging. SSIS also allows you to create user-defined variables. The names of user-defined and system variables are case-sensitive. These variables can hold a value that is used elsewhere in a package. For example, you may use the GETDATE() function to populate a variable. This variable may then be used in the WHERE clause of an Execute SQL Task to return the rows where the ModifiedDate column of a table is equal to the value in the variable. In this respect, variables are valuable in providing package logic.
Package configurations
Package configurations are a useful way to set package properties as the package executes. They help in scenarios where it is difficult to anticipate property values before the package is deployed. For example, you may have a data source destination within a Data Flow task, but you do not know the name of the server. You can use a package configuration to populate the data source destination's ServerName property with the environment variable ComputerName. When the package executes, it uses the environment variable ComputerName to populate the ServerName property at run time. Package configurations can be provided from one of five sources:
Environment variables
Registry
SQL Server
XML configuration file
Parent package variable
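The environment variable example above amounts to overwriting a design-time property with a value read at run time. A minimal Python sketch of that idea follows. It is not the SSIS configuration mechanism. The Connection class, the function apply_environment_configuration and the server name DEVSERVER are hypothetical, and the environment is passed in as a dictionary so the example behaves the same on any machine.

```python
import os


class Connection:
    """Hypothetical stand-in for a data source destination."""

    def __init__(self, server_name, database):
        self.ServerName = server_name   # design-time value
        self.database = database


def apply_environment_configuration(target, property_name, variable,
                                    environ=os.environ):
    """Overwrite one property from an environment variable, if it is set."""
    value = environ.get(variable)
    if value is not None:
        setattr(target, property_name, value)
    return target


# The package was designed against a development server.
destination = Connection(server_name="DEVSERVER", database="AdventureWorksDWDev")

# At run time the configuration supplies the real server name.
apply_environment_configuration(
    destination, "ServerName", "COMPUTERNAME",
    environ={"COMPUTERNAME": "MIAMI"},
)

# If the variable is missing, the design-time value is left in place.
untouched = apply_environment_configuration(
    Connection("DEVSERVER", "AdventureWorksDWDev"),
    "ServerName", "COMPUTERNAME", environ={},
)
```

Leaving the design-time value in place when the variable is missing is a design choice of this sketch, not a statement about how SSIS behaves.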
Data source views
Data source views are objects based on a data source that provide an abstraction of the subset of tables, columns and relationships that you require from the data source for your Analysis Services solution. This abstraction layer holds metadata about the objects and allows you to develop an Analysis Services solution without a permanent connection to the data source. A data source view must be created before you can run the Cube Wizard or the Data Mining Wizard. The type of information that is defined within a data source view includes:
A data source view name.
A definition of any subset of the schema retrieved from one or more data sources including:
o Tables
o Columns
o Relationships
Annotations, which can be added to provide friendly names for the tables and columns to improve readability for the user.
Cubes
An OLAP database can hold one or more cubes. A cube is the fundamental unit of multidimensional analysis. A cube consists of dimensions and measures. The dimensions form the axes of the cube and provide contextual information for the data that resides in the cube. That data is known as measures.
Dimensions
Dimensions provide the contextual information for the data that resides in a cube. A dimension in a cube typically maps to a dimension table within the data warehouse. Because dimensions describe a business entity, they consist of attributes that provide specific information about that entity. These attributes typically map to specific columns within the dimension table in the data warehouse. To improve readability, hierarchies can be introduced within the dimension. For example, within a time dimension, a hierarchy named CalendarYear can be created containing three levels: Year, Quarter and Month. These levels allow the user to drill down easily into specific time periods.
Measures
Measures provide the numeric information that resides inside the cube. They typically map to the measure columns in the fact table of a data warehouse. As well as including the measures from the fact table, SSAS also creates pre-aggregated versions of the measures. For example, a fact table may contain a measure named OrderQuantity. As well as including the detail information that is provided by the fact table, Analysis Services creates pre-aggregated OrderQuantity totals by year, quarter or month. Because this data is pre-aggregated, queries for this information are answered more efficiently.
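The idea of pre-aggregating a measure along the Year, Quarter and Month levels can be sketched in plain Python. This only illustrates the concept and says nothing about how Analysis Services stores aggregations. The function name preaggregate and the sample fact rows are assumptions for this example. OrderQuantity is the measure named in the text.

```python
from collections import defaultdict

# Detail rows as they might appear in a fact table joined to a time dimension:
# (Year, Quarter, Month, OrderQuantity). The figures are made up.
fact_rows = [
    (2008, 1, 1, 10),
    (2008, 1, 2, 5),
    (2008, 2, 4, 7),
    (2009, 1, 1, 3),
]


def preaggregate(rows):
    """Compute OrderQuantity totals once for every level of the hierarchy."""
    totals = defaultdict(int)
    for year, quarter, month, quantity in rows:
        totals[(year,)] += quantity                  # Year level
        totals[(year, quarter)] += quantity          # Quarter level
        totals[(year, quarter, month)] += quantity   # Month level
    return dict(totals)


aggregates = preaggregate(fact_rows)

# Answering "total for 2008" is now a single lookup instead of a scan.
total_2008 = aggregates[(2008,)]
total_2008_q1 = aggregates[(2008, 1)]
```

The cost of scanning the detail rows is paid once when the totals are built, so each later question about a year, quarter or month becomes a lookup.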
Slice and dice
Slicing and dicing is a technique for interrogating the data within a cube. This technique is implemented in products such as Excel 2007, where you can use the pivot table feature to slice and dice the data to retrieve specific data. Business Intelligence Development Studio provides a browser window that allows you to slice and dice the data to test the results that are returned before the cube is deployed into a production environment. You can also employ this technique within Reporting Services as the basis for providing data for the reports that are deployed through SSRS.
Data mining
Data mining is a feature in SSAS that enables you to extrapolate trends and patterns in the data. You can create data mining structures to explore the data by using the algorithms that are provided. There is also a Data Mining Add-in for Excel 2007 that allows you to explore the results that are returned by a data mining structure. You can also use the add-in to create your own data mining structures.
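The slice and dice technique described above can be illustrated on a tiny in-memory cube. In this Python sketch a cube is just a dictionary keyed by dimension members. The dimension names, the sample cells and the functions slice_cube and dice_cube are hypothetical and are not part of any SSAS or Excel API.

```python
# Order of the dimensions in each cell key.
DIMENSIONS = ("Year", "Region", "Product")

# Cell key -> SalesUnits. The figures are made up for the example.
cells = {
    (2008, "North", "Bikes"): 8,
    (2008, "South", "Bikes"): 4,
    (2008, "North", "Helmets"): 6,
    (2009, "North", "Bikes"): 5,
}


def slice_cube(cube, dimension, member):
    """Fix one dimension to a single member and drop it from the key."""
    i = DIMENSIONS.index(dimension)
    return {key[:i] + key[i + 1:]: value
            for key, value in cube.items() if key[i] == member}


def dice_cube(cube, **allowed):
    """Keep cells whose members fall within the allowed set per dimension."""
    def keep(key):
        return all(key[DIMENSIONS.index(name)] in members
                   for name, members in allowed.items())
    return {key: value for key, value in cube.items() if keep(key)}


# Slice: everything for 2008, leaving a two-dimensional view.
sales_2008 = slice_cube(cells, "Year", 2008)

# Dice: a sub-cube restricted to the North region across both years.
north = dice_cube(cells, Year={2008, 2009}, Region={"North"})
north_units = sum(north.values())
```

A slice reduces the number of dimensions by one, while a dice keeps all dimensions and narrows the members, which is the distinction a pivot table hides behind its filter and field lists.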
Gauge
A new type of data region that has been introduced is the Tablix data region. Tablix combines the benefits of both the table and matrix data regions, which gives greater flexibility in the design of the report. The Tablix data region is not explicitly found in any of the report authoring tools. Instead, its functionality has been added to the table and matrix data region properties in Report Designer. You can also define multiple data regions within a single report.
Toolbox. Provides the ability to add data regions and other report items to the designer canvas.
Report canvas designer. Located in the center of the Report Designer application, it allows you to develop the report in its entirety.
Grouping pane. Located at the bottom of Report Designer, it provides a quick way of managing groups of data in the canvas designer without the need to open dialog boxes.
Properties pane. A context-sensitive pane that changes as items on the designer canvas are selected, providing rich formatting and report body options. The Properties pane provides a wide array of options that can greatly improve the visualization of reports.
Report Builder
Report Builder is an application that enables power users to create their own ad hoc reports. Report Builder is available in two versions:
Report Builder version 1 is a ClickOnce .NET application that allows the creation of unplanned reports. Before power users can use Report Builder version 1, a report model must be created that provides an abstraction layer over the underlying data in a data source. For this reason, the report model has a separate project template in Business Intelligence Development Studio.
Report Builder version 2 has now been introduced and does not require a report model. Report Builder version 2 has the same layout as the Report Designer that is available in Business Intelligence Development Studio. The application has an Office 2007 style interface with ribbons that provide shortcuts to common tasks. With Report Builder version 2, you can use data sources directly, which makes it a more flexible tool than the previous version.
Report Manager
Report Manager is the Web site that hosts the reports that are deployed to the report server. End users browse to a predefined URL that is the home page of Reporting Services. From here, users can browse and export reports. If they have permission, they can also subscribe to reports so that the reports are delivered by e-mail. Report Manager also acts as an administrative Web site for the reports. Report server administrators have the same capabilities as end users; however, they can also manage report execution, report history and security.
SQL Server Management Studio
SQL Server Management Studio provides administrative capabilities for the report server. Within the Reporting Services server type, a report server administrator can create shared schedules, create system and item-level security roles and view the jobs that are running on the report server. You can also manage the report server databases within the database engine, including backing up the databases.
Command prompt utilities
The following command prompt utilities can be used to manage Reporting Services:
RSConfig.exe. Can be used to perform post-installation tasks on the report server. The options available within RSConfig.exe are also available within the Reporting Services Configuration Manager.
RSKeymgmt.exe. Allows you to back up and restore the report server encryption keys. This feature is also available within the Reporting Services Configuration Manager. Encryption keys protect the sensitive data that is stored in the report server, such as data source credentials and passwords. A symmetric key is used to encrypt and decrypt the sensitive data. The Report Server service is used to create and unlock the key. If you change the identity of the Report Server service, or if you migrate the report server to a new computer, the private key of the Report Server service will no longer be able to unlock the symmetric key. To restore access, the symmetric key must be re-encrypted by using the private key of the new Report Server service identity. Restoring the symmetric key is the process by which this re-encryption occurs.
Note: A word from the classroom: a common mistake that organizations make is that they do not back up the report server encryption keys. This has caused inconvenience for some of my customers.
RS utility. The RS utility is a script host that you can use to perform scripted operations. Use this tool to run Microsoft Visual Basic scripts that copy data between report server databases, publish reports, create items in a report server database and more.
Business Practices
Set a clear vision for the purpose and role of BI in your organization.
Understand the reporting requirements of the users within the business.
Document the data sources that will be used for the BI solution.
Consider creating a data warehouse to centrally store the data from all the data sources.
Consider installing BI components on separate servers to reduce the load on a single server.
Make use of Business Intelligence Development Studio solutions to organize the work that will create the BI solution.
Use SSIS for the controlled movement of source data into the data warehouse.
Use SSAS to centrally store reporting data for improved reporting data access.
Use SSRS to centrally manage and distribute reports for the organization.
Consider managing the creation of the BI solution as a business project.
Scenario
You are a database administrator for Adventure Works, a manufacturing company that sells bicycles and bicycle components through the Internet and a reseller distribution network. Senior management has identified the need to implement BI within the company and has decided to use SQL Server 2008 to meet this requirement. The first step is to install the BI components. A SQL Server named MIAMI already exists on the network and has the database engine installed. The requirement is to add Analysis Services, Integration Services and Reporting Services to this existing instance. On completion of the installation, you will run a script that has been provided to create the tables required to build a development data warehouse named AdventureWorksDWDev. You will then create a BI solution named AdventureWorksBI, which will contain three projects: an Analysis Services project named AW_SSAS, an Integration Services project named AW_SSIS and a Reporting Services project named AW_SSRS. To verify that the BI components are operational, you will use the Import and Export wizard to export the Firstname, Lastname and EmailAddress of records in the Person.Contact table to a text file named Employees.txt stored in the D:\Labfiles\Starter folder. You should use the Import and Export wizard to create an SSIS package that can be imported into the AW_SSIS project. You will create a simple report by using the Report Wizard in the AW_SSRS project.
Exercise Information
Exercise 1: Installing Business Intelligence Components in SQL Server 2008
In this exercise, you will install Analysis Services, Integration Services and Reporting Services. You will verify that the installation of these components has completed successfully. You will then review and run a Transact-SQL script file in SQL Server Management Studio to create a database named AdventureWorksDWDev that serves as a simple data warehouse for testing purposes.
Exercise 2: Creating a Business Intelligence Solution in SQL Server 2008
In this exercise, you will use Business Intelligence Development Studio to create a BI solution named AdventureWorksBI. The solution will comprise three projects: an Analysis Services project named AW_SSAS, an Integration Services project named AW_SSIS and a Reporting Services project named AW_SSRS.
Exercise 3: Use the Import/Export Wizard to Create an SSIS Package
In this exercise, you will use the Import and Export wizard to create a simple data transfer from tables in SQL Server, including Person.Contact, to a text file named Employees.txt located in D:\Labfiles\Starter. You will use the Import and Export wizard to save the work as an SSIS package named Customers. On completion of the wizard, you will verify that the text file has been created. You will also import the package saved in the Import and Export wizard into the AW_SSIS project.
Exercise 4: Using the Report Wizard in SQL Server Reporting Services 2008
In this exercise, you will use the Report Wizard to create a simple report. While using the wizard, you will be exposed to key components of a report, including data sources, data sets and data regions. On completion of the Report Wizard, you will preview the report in Business Intelligence Development Studio.
Log on to the MIAMI server. To log on to the MIAMI server, press CTRL+ALT+DELETE.
Run setup.exe.
Task 3: Perform an installation of Business Intelligence components to an existing instance of SQL Server
Task 4: Review and install the setup support files, and add new features to an existing instance of SQL Server 2008
Review the setup support rule and install the support files.
Task 5: Select Analysis Services, Reporting Services and Integration Services features on the Feature Selection page to install Business Intelligence components in SQL Server 2008
Task 6: Review the disk space requirements
Review the disk space requirements.
Task 7: Specify the Server configuration
Configure the service accounts for SQL Server Analysis Services, SQL Server Reporting Services and SQL Server Integration Services so that all three use the MI4SQLS3rv1ce account.
Confirm the collation settings for SQL Server Analysis Services.
Task 8: Specify which user has administrative permissions for Analysis Services
Configure Analysis Services to include MIAMI\Student as an administrator.
Task 9: Configure Reporting Services to Install, but do not configure the report server
Configure Reporting Services to Install, but do not configure the report server.
Task 10: Review the installation rules and install SQL Server
Review the installation rules and then install SQL Server.
Task 11: Connect to the SQL Server Integration Services, SQL Server Analysis Services and SQL Server Reporting Services in SQL Server Management Studio
1. Open the Microsoft SQL Server Management Studio console and connect to SQL Server Integration Services.
2. Connect to SQL Server Analysis Services in Microsoft SQL Server Management Studio.
3. Connect to SQL Server Reporting Services in Microsoft SQL Server Management Studio.
Task 12: Configure SQL Server Reporting Services to create the Report Server databases on the MIAMI server and create the Report Manager Web site on the MIAMI server
1. Open the Reporting Services Configuration Manager and connect to the MSSQLSERVER instance on the MIAMI server.
2. Set the Web Service URL to ReportServer.
3. Set the Database to ReportServer.
4. Ensure the Report Manager URL virtual directory is set to Reports.
Task 13: Verify that Reporting Services has been installed
1. Log on to the MIAMI server.
2. Open the Microsoft SQL Server Management Studio console and connect to the Database Engine.
3. In the Microsoft SQL Server Management Studio console, connect to SQL Server Reporting Services.
Task 14: You have completed all tasks in this exercise
A successful completion of this exercise results in the following outcomes:
You have installed SQL Server Integration Services, SQL Server Analysis Services and SQL Server Reporting Services.
You have configured SQL Server Reporting Services using Reporting Services Configuration Manager.
You can connect successfully to SQL Server Integration Services, SQL Server Analysis Services and SQL Server Reporting Services.
Exercise 2: Creating a Business Intelligence Solution in Microsoft SQL Server 2008
Exercise Overview
In this exercise, you will review and run Transact-SQL script files in SQL Server Management Studio to create a database named AdventureWorksDWDev that serves as a simple data warehouse for testing purposes. You will then use Business Intelligence Development Studio to create a Business Intelligence solution named AdventureWorksBI. The solution will comprise three projects: an Analysis Services project named AW_SSAS, an Integration Services project named AW_SSIS and a Reporting Services project named AW_SSRS.
Task 1: Open SQL Server Management Studio and open the SQL Server Solution file AW2008DWDev solution located in the D:\Labfiles\Starter\AW2008DWDev folder
1. Open the Microsoft SQL Server Management Studio console and connect to the SQL Server Database Engine.
2. Open the AW2008DWDev solution file in the D:\Labfiles\Starter\AW2008DWDev folder.
Task 2: Create a database named AdventureWorksDWDev using the CreateAWDWDevDB.sql in the AW2008DWDev solution
1. In the Microsoft SQL Server Management Studio, open the CreateAWDWDevDB.sql script file.
2. Execute the CreateAWDWDevDB.sql script to create the AdventureWorksDWDev database.
Task 3: Create the staging tables and log tables in the AdventureWorksDWDev database
1. In the Microsoft SQL Server Management Studio, open the CreateProductStage.sql script file.
2. Execute the CreateProductStage.sql script to create the StageProduct table in the AdventureWorksDWDev database.
3. Execute the CreateResellerStage.sql script to create the StageReseller table in the AdventureWorksDWDev database.
4. Execute the ExtractLog.sql script to create the ExtractLog table in the AdventureWorksDWDev database.
Task 4: Create the dimension tables in the AdventureWorksDWDev database
1. Execute the CreateResellerDim.sql script to create the DimReseller dimension table in the AdventureWorksDWDev database.
2. Execute the CreateProductDim.sql script to create the DimProduct dimension table in the AdventureWorksDWDev database.
3. Execute the CreateTimeDim.sql script to create the DimTime dimension table in the AdventureWorksDWDev database.
Task 5: Create the FactSales fact table in the AdventureWorksDWDev database
Execute the CreateSalesFact.sql script to create the FactSales fact table in the AdventureWorksDWDev database.
Task 6: Create two database diagrams: one named AW2008DWDevStage that contains the staging tables and the ExtractLog table, and one named AW2008DWDev that contains the dimension tables and the FactSales table
1. Create a database diagram named AW2008DWDevStage that contains the StageProduct, StageReseller and ExtractLog tables.
2. Save the database diagram and name it AW2008DWDevStage. Close the database diagram.
3. Create a database diagram named AW2008DWDev that contains the FactSales, DimProduct, DimReseller and DimTime tables.
4. Save the database diagram and name it AW2008DWDev. Close the database diagram and then close down SQL Server Management Studio.
Task 7: Create a Business Intelligence Development Studio Solution named AW_BI with three projects named AW_SSIS, AW_SSAS and AW_SSRS
1. Open the Business Intelligence Development Studio console and connect to SQL Server Integration Services.
2. Create a Business Intelligence Development Studio solution named AW_BI that contains a SQL Server Integration Services project named AW_SSIS in the D:\Labfiles\Starter folder.
3. Create a Business Intelligence Development Studio project named AW_SSAS that is added to the AW_BI solution in the D:\Labfiles\Starter folder.
4. Create a Business Intelligence Development Studio project named AW_SSRS that is added to the AW_BI solution in the D:\Labfiles\Starter folder. Close down Business Intelligence Development Studio.
Task 8: You have completed all tasks in this exercise
You have created a database named AdventureWorksDWDev that hosts a development data warehouse.
You have created the tables that will be used in the AdventureWorksDWDev database.
You have created a Business Intelligence Development Studio solution named AW_BI that holds a SQL Server Integration Services project named AW_SSIS, a SQL Server Analysis Services project named AW_SSAS and a SQL Server Reporting Services project named AW_SSRS.
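The dimension and fact tables created in this exercise form a star schema: FactSales holds the measures and points at DimProduct, DimReseller and DimTime through keys. The lab's actual scripts are not reproduced here, so the following is only a minimal sketch of that shape. It uses Python's built-in sqlite3 module as a stand-in for SQL Server, and every column name and sample row is a hypothetical assumption chosen for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for AdventureWorksDWDev.
# Assumption: the real lab builds these tables on SQL Server from the provided scripts;
# the columns below are invented for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE DimProduct  (ProductKey  INTEGER PRIMARY KEY, ProductName  TEXT);
CREATE TABLE DimReseller (ResellerKey INTEGER PRIMARY KEY, ResellerName TEXT);
CREATE TABLE DimTime     (TimeKey     INTEGER PRIMARY KEY, CalendarYear INTEGER);
CREATE TABLE FactSales (
    ProductKey  INTEGER REFERENCES DimProduct(ProductKey),
    ResellerKey INTEGER REFERENCES DimReseller(ResellerKey),
    TimeKey     INTEGER REFERENCES DimTime(TimeKey),
    SalesAmount REAL
);
""")

# Hypothetical sample rows.
conn.executemany("INSERT INTO DimProduct VALUES (?, ?)", [(1, "Road Bike"), (2, "Helmet")])
conn.executemany("INSERT INTO DimReseller VALUES (?, ?)", [(1, "Bike World")])
conn.executemany("INSERT INTO DimTime VALUES (?, ?)", [(20080101, 2008)])
conn.executemany(
    "INSERT INTO FactSales VALUES (?, ?, ?, ?)",
    [(1, 1, 20080101, 1200.0), (1, 1, 20080101, 800.0), (2, 1, 20080101, 50.0)],
)


def sales_by_product(connection):
    """Join the fact table to one dimension and total the measure per product."""
    return connection.execute("""
        SELECT p.ProductName, SUM(f.SalesAmount)
        FROM FactSales AS f
        JOIN DimProduct AS p ON p.ProductKey = f.ProductKey
        GROUP BY p.ProductName
        ORDER BY p.ProductName
    """).fetchall()


print(sales_by_product(conn))
```

The query shows why the fact table stores only keys and numeric measures: descriptive context such as the product name comes from the dimension table at query time.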
Exercise 3: Using the Import/Export Wizard to Create an SSIS Package
Exercise Overview
In this exercise, you will use the Import/Export wizard to create a simple data transfer from the Sales.vStoreWithAddresses view in the AdventureWorks2008 database to a text file named Resellers.txt located in D:\Labfiles\Starter. You will use the Import/Export wizard to save the work as an SSIS package named ResellerText. On completion of the wizard, you will verify that the text file has been created. You will also import the package saved in the Import/Export wizard into the AW_SSIS project.
Task 1: Run the Import/Export wizard
Start the Import and Export wizard, selecting the AdventureWorks2008 database as the source and D:\Labfiles\Starter\Resellers.txt as the destination.
Task 2: Write a query in the Import and Export wizard that returns all columns for reseller customers from the Sales.vStoreWithAddresses view and configure the flat file destination
1. In the Import and Export wizard, write a Transact-SQL statement that will return Resellers to the Import/Export wizard.
2. Configure the flat file destination with the default setting.
Task 3: Run the Import and Export wizard to save the SSIS package to the D:\Labfiles\Starter folder
1. Configure the Import and Export wizard to run and save the SSIS package.
2. Name the SSIS package ResellerText and save the package to the D:\Labfiles\Starter folder as ResellerText.dtsx.
Task 4: Verify that the Resellers.txt text file and the ResellerText.dtsx package have been created
1. Verify that the Resellers.txt text file has been created in the D:\Labfiles\Starter folder.
2. Verify that the ResellerText.dtsx SSIS package has been created in the D:\Labfiles\Starter folder.
Task 5: You have completed all tasks in this exercise
You have used the Import and Export wizard.
The Import and Export wizard has created a text file named Resellers.txt.
The Import and Export wizard has created an SSIS package named ResellerText.
You have verified the existence of the Resellers.txt file and the ResellerText.dtsx SSIS package in the D:\Labfiles\Starter folder.
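What the wizard does in this exercise, running a query against a source and writing the rows to a delimited flat file, can be pictured in a few lines of code. The following is a rough sketch only, not the wizard's or SSIS's own mechanism. It uses Python's sqlite3 and csv modules, a stand-in table with two hypothetical columns (Name, City) in place of the real Sales.vStoreWithAddresses view, and a temporary folder in place of D:\Labfiles\Starter.

```python
import csv
import os
import sqlite3
import tempfile


def export_query_to_text(connection, query, path):
    """Run the query and write its column names and rows to a comma-delimited text file."""
    cursor = connection.execute(query)
    header = [column[0] for column in cursor.description]
    rows_written = 0
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for row in cursor:
            writer.writerow(row)
            rows_written += 1
    return rows_written


# Stand-in source. Assumption: the table and its two columns are hypothetical;
# the real lab reads from a view in the AdventureWorks2008 database.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE vStoreWithAddresses (Name TEXT, City TEXT)")
conn.executemany(
    "INSERT INTO vStoreWithAddresses VALUES (?, ?)",
    [("Bike World", "Seattle"), ("Cycle Hub", "Portland")],
)

# A temporary folder is used here in place of the lab's Starter folder.
out_path = os.path.join(tempfile.mkdtemp(), "Resellers.txt")
rows_written = export_query_to_text(
    conn, "SELECT Name, City FROM vStoreWithAddresses ORDER BY Name", out_path
)
print(rows_written, "rows written to", out_path)
```

Checking that the file exists and contains a header plus one line per row mirrors the verification step in Task 4.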
Exercise 4: Using the Report Wizard in SQL Server Reporting Services 2008
Exercise Overview
In this exercise, you will use the Report Wizard to create a simple report. While using the wizard, you will be exposed to key components of a report, including data sources, data sets and data regions. On completion of the Report Wizard, you will preview the report in Business Intelligence Development Studio.
Task 1: Open the AW_BI Business Intelligence solution in Business Intelligence Development Studio
Open Business Intelligence Development Studio and open the AW_BI solution located in the D:\Labfiles\Starter\AW_BI folder.
Task 2: Run the Report Wizard in the AW_SSRS project in Business Intelligence Development Studio to create a report that returns the FirstName and LastName columns from the Person.Person table in the AdventureWorks2008 database
1. In the Report Wizard, define a data source named AW2008DS that points to the AdventureWorks2008 database on the MIAMI server.
2. In the Report Wizard, design a query that returns FirstName and LastName from the Person.Person table in the AdventureWorks2008 database.
3. In the Report Wizard, set the data region to a table and specify the formatting.
You have used the Report Wizard.
You have previewed the report in Business Intelligence Development Studio.
Lab Review
In this lab, you installed BI components in SQL Server 2008 for a company named Adventure Works. On completion of the installation, you set up a BI solution that included Integration Services, Analysis Services and Reporting Services projects. You used the Import and Export wizard to transfer data and create a package that was then imported into the Integration Services project. You then used the Report Wizard to create a simple report.
What type of database is an SSAS cube stored in?
Cubes are stored within an OLAP database. This database stores data in a multidimensional format that is optimized for querying data.
What is the difference between a dimension table and a dimension?
A dimension table is a table that exists within a data warehouse that is stored in the SQL Server database engine. A dimension is an object within a cube that forms the axes of the cube and holds contextual information that users use to slice and dice the data within a cube. Typically, the data that is stored in a dimension table acts as the basis for the data that is stored in the dimension of a cube.
What are the two core components used in an SSIS package?
The two core components that are used in an SSIS package are the control flow and the data flow components. The control flow components are used to create the overall logic of an SSIS package. The control flow contains tasks that modify data and tasks that interact with other technologies such as FTP sites, e-mail systems and .NET components. The data flow components provide a focus for the ETL of data within the Data Flow task. The data flow allows you to focus specifically on the movement of data.
What is the best tool to perform a simple transfer of data between two SQL Server tables?
The best tool to use to perform a simple transfer of data between two SQL Server tables is the Import and Export wizard.
What is the difference between a data set and a data region in SSRS?
Within SSRS, a data set is the query that is used to return specific data from the data source. The data region is the area of the report that holds the results returned by the data set. A data region can be a table, a matrix, or a combination of both these data regions known as a Tablix. There are also list, chart and gauge data regions.
What is the purpose of a Business Intelligence Development Studio solution?
The purpose of the Business Intelligence Development Studio solution is to provide a tool that enables you to organize your Business Intelligence work. You have the ability to define many projects within a single solution. A project consists of a discrete area of Business Intelligence, including Integration Services, Analysis Services and Reporting Services projects. Holding these projects in a single solution enables you to centralize the storage and management of your business intelligence files.
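The split between control flow and data flow described in the review can be pictured with a small analogy in code. This is not the SSIS object model or any real SSIS API. It is a plain Python sketch in which all function names, task names and sample rows are hypothetical: the outer function plays the part of the control flow (ordered tasks), and one of its tasks wraps a data flow that extracts, transforms and loads rows.

```python
def extract():
    """Source: yields raw rows (hypothetical sample data)."""
    yield from [{"name": " Bike World "}, {"name": "cycle hub"}]


def transform(rows):
    """Transformation: trims whitespace and standardizes capitalization."""
    for row in rows:
        yield {"name": row["name"].strip().title()}


def load(rows, destination):
    """Destination: appends each row and returns the number of rows loaded."""
    loaded = 0
    for row in rows:
        destination.append(row)
        loaded += 1
    return loaded


def data_flow_task(destination):
    """The data flow: rows stream from source through transformation to destination."""
    return load(transform(extract()), destination)


def run_package():
    """The control flow: tasks run in a defined order; one of them is the data flow."""
    log = []
    destination = []
    log.append("prepare destination")            # stand-in for a preparation task
    loaded = data_flow_task(destination)         # stand-in for the Data Flow task
    log.append(f"data flow loaded {loaded} rows")
    log.append("send notification")              # stand-in for an e-mail task
    return log, destination


log, destination = run_package()
print(log)
print(destination)
```

The point of the analogy is only the division of labor: the outer sequence decides what runs and when, while the inner pipeline is concerned solely with moving and cleaning rows.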
Module Summary
Overview of Business Intelligence
In this lesson, you have learned the following key points:
The role that BI can play within an organization and how SQL Server 2008 can be used to facilitate BI.
The purpose of ETL operations in BI solutions and how they are used to cleanse and standardize the data through SSIS.
The purpose of a data warehouse and that it is implemented within the database engine in SQL Server 2008.
The role of OLAP systems and data mining in a BI solution and that they are implemented with SSAS.
The importance of reporting to a business and how SSRS facilitates reporting.
SQL Server technologies work together in a BI solution.
Additional Microsoft applications that can be used within BI include SharePoint Server 2007 and Excel 2007.
Glossary
.NET Framework
An integral Windows component that supports building, deploying and running the next generation of applications and Web services. It provides a highly productive, standards-based, multilanguage environment for integrating existing investments with next generation applications and services, as well as the agility to solve the challenges of deployment and operation of Internet-scale applications. The .NET Framework consists of three main parts: the common language runtime, a hierarchical set of unified class libraries, and a componentized version of ASP called ASP.NET.
ad hoc report
An .rdl report created with report builder that accesses report models.
aggregation
A table or structure that contains precalculated data for a cube.
aggregation design
In Analysis Services, the process of defining how an aggregation is created.
aggregation prefix
A string that is combined with a system-defined ID to create a unique name for a partition's aggregation table.
ancestor
A member in a superior level in a dimension hierarchy that is related through lineage to the current member within the dimension hierarchy.
attribute
The building block of dimensions and their hierarchies that corresponds to a single column in a dimension table.
attribute relationship
The hierarchy associated with an attribute containing a single level based on the corresponding column in a dimension table.
axis
A set of tuples. Each tuple is a vector of members. A set of axes defines the coordinates of a multidimensional data set.
ActiveX Data Objects
Component Object Model objects that provide access to data sources. This API provides a layer between OLE DB and programming languages such as Visual Basic, Visual Basic for Applications, Active Server Pages and Microsoft Internet Explorer Visual Basic Scripting.
ActiveX Data Objects (Multidimensional)
A high-level, language-independent set of object-based data access interfaces optimized for multidimensional data applications.
ActiveX Data Objects MultiDimensional.NET
A managed data provider used to communicate with multidimensional data sources.
ADO MD
See Other Term: ActiveX Data Objects (Multidimensional)
ADOMD.NET
See Other Term: ActiveX Data Objects MultiDimensional.NET
AMO
See Other Term: Analysis Management Objects
Analysis Management Objects
The complete library of programmatically accessed objects that let an application manage a running instance of Analysis Services.
balanced hierarchy
A dimension hierarchy in which all leaf nodes are the same distance from the root node.
calculated column
A column in a table that displays the result of an expression instead of stored data.
calculated field
A field, defined in a query, that displays the result of an expression instead of stored data.
calculated member
A member of a dimension whose value is calculated at run time by using an expression.
calculation condition
An MDX logical expression that is used to determine whether a calculation formula will be applied against a cell in a calculation subcube.
calculation formula
An MDX expression used to supply a value for cells in a calculation subcube, subject to the application of a calculation condition.
calculation pass
A stage of calculation in a multidimensional cube in which applicable calculations are evaluated.
calculation subcube
The set of multidimensional cube cells that is used to create a calculated cells definition. The set of cells is defined by a combination of MDX set expressions.
case
In data mining, a case is an abstract view of data characterized by attributes and relations to other cases.
case key
In data mining, the element of a case by which the case is referenced within a case set.
case set
In data mining, a set of cases.
cell
In a cube, the set of properties, including a value, specified by the intersection when one member is selected from each dimension.
cellset
In ADO MD, an object that contains a collection of cells selected from cubes or other cellsets by a multidimensional query.
changing dimension
A dimension that has a flexible member structure, and is designed to support frequent changes to structure and data.
chart data region
A report item on a report layout that displays data in a graphical format.
child
A member in the next lower level in a hierarchy that is directly related to the current member.
clickthrough report
A report that displays related report model data when you click data within a rendered report builder report.
clustering
A data mining technique that analyzes data to group records together according to their location within the multidimensional attribute space.
collation
A set of rules that determines how data is compared, ordered and presented.
column-level collation
Supporting multiple collations in a single instance.
composite key
A key composed of two or more columns.
concatenation
The combining of two or more character strings or expressions into a single character string or expression, or of two or more binary strings or expressions into a single binary string or expression.
concurrency
A process that allows multiple users to access and change shared data at the same time. SQL Server uses locking to allow multiple users to access and change shared data at the same time without conflicting with each other.
conditional split
In Integration Services, a transformation that routes data rows to different outputs depending on the conditions that each row meets.
config file
See Other Term: configuration file
configuration
In reference to a single microcomputer, the sum of a system's internal and external components, including memory, disk drives, keyboard, video and generally less critical add-on hardware, such as a mouse, modem or printer.
configuration file
A file that contains machine-readable operating specifications for a piece of hardware or software, or that contains information about another file or about a specific user.
configurations
In Integration Services, a name or value pair that updates the value of package objects when the package is loaded.
connection
An interprocess communication (IPC) linkage established between a SQL Server application and an instance of SQL Server.
connection manager
In Integration Services, a logical representation of a run-time connection to a data source.
constant
A group of symbols that represent a specific data value.
container
A control flow element that provides package structure.
control flow
The ordered workflow in an Integration Services package that performs tasks.
control-break report
A report that summarizes data in user-defined groups or breaks. A new group is triggered when different data is encountered.
cube
A set of data that is organized and summarized into a multidimensional structure defined by a set of dimensions and measures.
cube role
A collection of users and groups with the same access to a cube.
custom rollup
An aggregation calculation that is customized for a dimension level or member, and that overrides the aggregate functions of a cube's measures.
custom rule
In a role, a specification that limits the dimension members or cube cells that users in the role are permitted to access.
custom variable
An aggregation calculation that is customized for a dimension level or member and overrides the aggregate functions of a cube's measures.
data dictionary
A set of system tables, stored in a catalog, that includes definitions of database structures and related information, such as permissions.
data explosion
The exponential growth in size of a multidimensional structure, such as a cube, due to the storage of aggregated data.
data flow
The ordered workflow in an Integration Services package that extracts, transforms and loads data.
data flow engine
An engine that executes the data flow in a package.
data flow task
Encapsulates the data flow engine that moves data between sources and destinations, providing the facility to transform, clean and modify data as it is moved.
data integrity
A state in which all the data values stored in the database are correct.
data manipulation language
The subset of SQL statements that is used to retrieve and manipulate data.
data mart
A subset of the contents of a data warehouse.
data member
A child member associated with a parent member in a parent-child hierarchy.
data mining
The process of analyzing data to identify patterns or relationships.
data processing extension
A component in Reporting Services that is used to retrieve report data from an external data source.
data region
A report item that displays repeated rows of data from an underlying dataset in a table, matrix, list or chart.
data scrubbing
Part of the process of building a data warehouse out of data coming from multiple (OLTP) systems.
data source
In ADO and OLE DB, the location of a source of data exposed by an OLE DB provider. The source of data for an object such as a cube or dimension. It is also the specification of the information necessary to access source data. It sometimes refers to an object of ClassType clsDataSource. In Reporting Services, a specified data source type, connection string and credentials, which can be saved separately to a report server and shared among report projects or embedded in a .rdl file.
data source name
The name assigned to an ODBC data source.
data source view
A named selection of database objects that defines the schema referenced by OLAP and data mining objects in an Analysis Services database.
data warehouse
A database specifically structured for query and analysis.
database role
A collection of users and groups with the same access to an Analysis Services database.
data-driven subscription
A subscription in Reporting Services that uses a query to retrieve subscription data from an external data source at run time.
datareader
A stream of data that is returned by an ADO.NET query.
dataset
In OLE DB for OLAP, the set of multidimensional data that is the result of running an MDX SELECT statement. In Reporting Services, a named specification that includes a data source definition, a query definition and options.
decision support
Systems designed to support the complex analytic analysis required to discover business trends.
decision tree
A treelike model of data produced by certain data mining methods.
default member
The dimension member used in a query when no member is specified for the dimension.
delimited identifier
An object in a database that requires the use of special characters (delimiters) because the object name does not comply with the formatting rules of regular identifiers.
delivery channel type
The protocol for a delivery channel, such as Simple Mail Transfer Protocol (SMTP) or File.
delivery extension
A component in Reporting Services that is used to distribute a report to specific devices or target locations.
density
In an index, the frequency of duplicate values. In a data file, a percentage that indicates how full a data page is. In Analysis Services, the percentage of cells that contain data in a multidimensional structure.
dependencies
Objects that depend on other objects in the same database.
derived column
A transformation that creates new column values by applying expressions to transformation input columns.
descendant
A member in a dimension hierarchy that is related to a member of a higher level within the same dimension.
destination
An Integration Services data flow component that writes the data from the data flow into a data source or creates an in-memory dataset.
destination adapter
A data flow component that loads data into a data store.
dimension
A structural attribute of a cube, which is an organized hierarchy of categories (levels) that describe data in the fact table.
dimension granularity
The lowest level available to a particular dimension in relation to a particular measure group.
dimension table
A table in a data warehouse whose entries describe data in a fact table. Dimension tables contain the data from which dimensions are created.
discretized column
A column that represents finite, counted data.
document map
A navigation pane in a report arranged in a hierarchy of links to report sections and groups.
drill down/drill up
To navigate through levels of data ranging from the most summarized (up) to the most detailed (down).
drill through
In Analysis Services, to retrieve the detailed data from which the data in a cube cell was summarized. In Reporting Services, to open related reports by clicking hyperlinks in the main drillthrough report.
drilldown/drillup
A technique for navigating through levels of data ranging from the most summarized (up) to the most detailed (down).
drillthrough
In Analysis Services, a technique to retrieve the detailed data from which the data in a cube cell was summarized. In Reporting Services, a way to open related reports by clicking hyperlinks in the main drillthrough report.
drillthrough report
A report with the 'enable drilldown' option selected. Drillthrough reports contain hyperlinks to related reports.
dynamic connection string
In Reporting Services, an expression that you build into the report, allowing the user to select which data source to use at run time. You must build the expression and data source selection list into the report when you create it.
Data Mining Model Training
The process a data mining model uses to estimate model parameters by evaluating a set of known and predictable data.
entity
In Reporting Services, an entity is a logical collection of model items, including source fields, roles, folders and expressions, presented in familiar business terms.
executable
In Integration Services, a package, Foreach Loop, For Loop, Sequence or task.
execution tree
The path of data in the data flow of a SQL Server 2008 Integration Services package from sources through transformations to destinations.
expression
In SQL, a combination of symbols and operators that evaluate to a single data value. In Integration Services, a combination of literals, constants, functions and operators that evaluate to a single data value.
ETL
Extraction, transformation and loading. The complex process of copying and cleaning data from heterogeneous sources.
fact
A row in a fact table in a data warehouse. A fact contains values that define a data event such as a sales transaction.
fact dimension
A relationship between a dimension and a measure group in which the dimension main table is the same as the measure group table.
fact table
A central table in a data warehouse schema that contains numerical measures and keys relating facts to dimension tables.
field length
In bulk copy, the maximum number of characters needed to represent a data item in a bulk copy character format data file.
field terminator
In bulk copy, one or more characters marking the end of a field or row, separating one field or row in the data file from the next.
filter expression
An expression used for filtering data in the Filter operator.
flat file
A file consisting of records of a single record type, in which there is no embedded structure information governing relationships between records.
flattened rowset
A multidimensional data set presented as a two-dimensional rowset in which unique combinations of elements of multiple dimensions are combined on an axis.
folder hierarchy
A bounded namespace that uniquely identifies all reports, folders, shared data source items and resources that are stored in and managed by a report server.
format file
A file containing meta information (such as data type and column size) that is used to interpret data when being read from or written to a data file.
File connection manager
In Integration Services, a logical representation of a connection that enables a package to reference an existing file or folder or to create a file or folder at run time.
For Loop container
In Integration Services, a container that runs a control flow repeatedly by testing a condition.
Foreach Loop container In Integration Services, a container that runs a control flow repeatedly by using an enumerator. Fuzzy Grouping In Integration Services, a data cleaning methodology that examines values in a dataset and identifies groups of related data rows and the one data row that is the canonical representation of the group. global assembly cache A machine-wide code cache that stores assemblies specifically installed to be shared by many applications on the computer. grant To apply permissions to a user account, which allows the account to perform an activity or work with data. granularity The degree of specificity of information that is contained in a data element. granularity attribute The single attribute that is used to specify the level of granularity for a given dimension in relation to a given measure group. grid A view type that displays data in a table. grouping A set of data that is grouped together in a report. hierarchy A logical tree structure that organizes the members of a dimension such that each member has one parent member and zero or more child members. hybrid OLAP A storage mode that uses a combination of multidimensional data structures and relational database tables to store multidimensional data.
HTML Viewer A UI component consisting of a report toolbar and other navigation elements used to work with a report. input member A member whose value is loaded directly from the data source instead of being calculated from other data. input set The set of data provided to an MDX value expression upon which the expression operates. isolation level The property of a transaction that controls the degree to which data is isolated for use by one process, and is guarded against interference from other processes. Setting the isolation level defines the default locking behavior for all SELECT statements in your SQL Server session. item-level role assignment A security policy that applies to an item in the report server folder namespace. item-level role definition A security template that defines a role used to control access to or interaction with an item in the report server folder namespace. key A column or group of columns that uniquely identifies a row (primary key), defines the relationship between two tables (foreign key) or is used to build an index. key attribute The attribute of a dimension that links the non-key attributes in the dimension to related measures. key column In an Analysis Services dimension, an attribute property that uniquely identifies the attribute members. In an Analysis Services mining model, a data mining column that uniquely identifies each case in a case table. key performance indicator
A quantifiable, standardized metric that reflects a critical business variable (for instance, market share), measured over time. KPI See Other Term: key performance indicator latency The amount of time that elapses between when a data change is completed at one server and when that change appears at another server. leaf In a tree structure, an element that has no subordinate elements. leaf level The bottom level of a clustered or nonclustered index. leaf member A dimension member without descendants. level The name of a set of members in a dimension hierarchy such that all members of the set are at the same distance from the root of the hierarchy. lift chart In Analysis Services, a chart that compares the accuracy of the predictions of each data mining model in the comparison set. linked dimension In Analysis Services, a reference in a cube to a dimension in a different cube. linked measure group In Analysis Services, a reference in a cube to a measure group in a different cube. linked report A report that references an existing report definition by using a different set of parameter values or properties. list data region A report item on a report layout that displays data in a list format.
local cube A cube created and stored with the extension .cub on a local computer using PivotTable Service. lookup table In Integration Services, a reference table for comparing, matching or extracting data. many-to-many dimension A relationship between a dimension and a measure group in which a single fact may be associated with many dimension members and a single dimension member may be associated with many facts. matrix data region A report item on a report layout that displays data in a variable columnar format. measure In a cube, a set of values that are usually numeric and are based on a column in the fact table of the cube. Measures are the central values that are aggregated and analyzed. measure group All the measures in a cube that derive from a single fact table in a data source view. member An item in a dimension representing one or more occurrences of data. member property Information about an attribute member, for example, the gender of a customer member or the color of a product member. mining structure A data mining object that defines the data domain from which the mining models are built. multidimensional OLAP A storage mode that uses a proprietary multidimensional structure to store a partition's facts and aggregations or a dimension. multidimensional structure
A database paradigm that treats data as cubes that contain dimensions and measures in cells.
MDX A syntax used for defining multidimensional objects and querying and manipulating multidimensional data. Mining Model An object that contains the definition of a data mining process and the results of the training activity. Multidimensional Expression A syntax used for defining multidimensional objects and querying and manipulating multidimensional data. named set A set of dimension members or a set expression that is created for reuse, for example, in MDX queries. natural hierarchy A hierarchy in which at every level there is a one-to-many relationship between members in that level and members in the next lower level. nested table A data mining model configuration in which a column of a table contains a table. nonleaf In a tree structure, an element that has one or more subordinate elements. In Analysis Services, a dimension member that has one or more descendants. In SQL Server indexes, an intermediate index node that points to other intermediate nodes or leaf nodes. nonleaf member A member with one or more descendants. normalization rules A set of database design rules that minimizes data redundancy and results in a database in which the Database Engine and application software can easily enforce integrity. Non-scalable EM A Microsoft Clustering algorithm method that uses a probabilistic method to determine the probability that a data point exists in a cluster.
Non-scalable K-means A Microsoft Clustering algorithm method that uses a distance measure to assign a data point to its closest cluster. object identifier A unique name given to an object. In Metadata Services, a unique identifier constructed from a globally unique identifier (GUID) and an internal identifier. online analytical processing A technology that uses multidimensional structures to provide rapid access to data for analysis. online transaction processing A data processing system designed to record all of the business transactions of an organization as they occur. An OLTP system is characterized by many concurrent users actively adding and modifying data. overfitting The characteristic of some data mining algorithms that assigns importance to random variations in data by viewing them as important patterns. ODBC data source The location of a set of data that can be accessed using an ODBC driver. A stored definition that contains all of the connection information an ODBC application requires to connect to the data source. ODBC driver A dynamic-link library (DLL) that an ODBC-enabled application, such as Excel, can use to access an ODBC data source. OLAP See Other Term: online analytical processing OLE DB A COM-based API for accessing data. OLE DB supports accessing data stored in any format for which an OLE DB provider is available.
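The distance-based assignment mentioned in the Non-scalable K-means entry can be illustrated with a minimal sketch. It assumes Euclidean distance and fixed centroids; the function name `closest_cluster` and the sample points are hypothetical, and this is not the Microsoft Clustering algorithm's actual implementation, only the general idea of assigning a point to its nearest cluster.

```python
import math

def closest_cluster(point, centroids):
    """Return the index of the centroid nearest to point (Euclidean distance)."""
    distances = [math.dist(point, centroid) for centroid in centroids]
    return distances.index(min(distances))

# Hypothetical sample data: two cluster centers and three data points.
centroids = [(0.0, 0.0), (10.0, 10.0)]
points = [(1.0, 2.0), (9.0, 8.0), (4.0, 4.0)]

# Assign every point to its closest cluster.
assignments = [closest_cluster(p, centroids) for p in points]
print(assignments)
```

A full K-means run would repeat this assignment step and recompute each centroid as the mean of its assigned points until the assignments stop changing.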
OLE DB for OLAP Formerly, the separate specification that addressed OLAP extensions to OLE DB. Beginning with OLE DB 2.0, OLAP extensions are incorporated into the OLE DB specification. package A collection of control flow and data flow elements that runs as a unit. padding A string, typically added when the last plaintext block is short. The space allotted in a cell to create or maintain a specific size. parameterized report A published report that accepts input values through parameters. parent A member in the next higher level in a hierarchy that is directly related to the current member. partition In replication, a subset of rows from a published table, created with a static row filter or a parameterized row filter. In Analysis Services, one of the storage containers for data and aggregations of a cube. Every cube contains one or more partitions. For a cube with multiple partitions, each partition can be stored separately in a different physical location. Each partition can be based on a different data source. Partitions are not visible to users; the cube appears to be a single object. In the Database Engine, a unit of a partitioned table or index. partition function A function that defines how the rows of a partitioned table or index are spread across a set of partitions based on the values of certain columns, called partitioning columns. partition scheme A database object that maps the partitions of a partition function to a set of filegroups. partitioned index
An index built on a partition scheme, and whose data is horizontally divided into units which may be spread across more than one filegroup in a database. partitioned snapshot In merge replication, a snapshot that includes only the data from a single partition. partitioned table A table built on a partition scheme, and whose data is horizontally divided into units which may be spread across more than one filegroup in a database. partitioning The process of replacing a table with multiple smaller tables. partitioning column The column of a table or index that a partition function uses to partition a table or index. perspective A user-defined subset of a cube. pivot To rotate rows to columns, and columns to rows, in a crosstabular data browser. To choose dimensions from the set of available dimensions in a multidimensional data structure for display in the rows and columns of a crosstabular structure. polling query A polling query is typically a singleton query that returns a value Analysis Services can use to determine if changes have been made to a table or other relational object. precedence constraint A control flow element that connects tasks and containers into a sequenced workflow. predictable column A data mining column that the algorithm will build a model around based on values of the input columns. prediction
A data mining technique that analyses existing data and uses the results to predict values of attributes for new records or missing attributes in existing records.
proactive caching A system that manages data obsolescence in a cube by which objects in MOLAP storage are automatically updated and processed in cache while queries are redirected to ROLAP storage. process In a cube, to populate a cube with data and aggregations. In a data mining model, to populate a data mining model with data mining content. profit chart In Analysis Services, a chart that displays the theoretical increase in profit that is associated with using each model. properties page A dialog box that displays information about an object in the interface. property A named attribute of a control, field or database object that you set to define one of the object's characteristics, such as size, color or screen location; or an aspect of its behavior, such as whether it is hidden. property mapping A mapping between a variable and a property of a package element. property page A tabbed dialog box where you can identify the characteristics of tables, relationships, indexes, constraints and keys. protection In Integration Services, determines the protection method, the password or user key and the scope of package protection. ragged hierarchy See Other Term: unbalanced hierarchy raw file In Integration Services, a native format for fast reading and writing of data to files.
recursive hierarchy A hierarchy of data in which all parent-child relationships are represented in the data. reference dimension A relationship between a dimension and a measure group in which the dimension is coupled to the measure group through another dimension. This behaves like a snowflake dimension, except that attributes are not shared between the two dimensions. reference table The source table to use in fuzzy lookups. refresh data The series of operations that clears data from a cube, loads the cube with new data from the data warehouse and calculates aggregations. relational database A database or database management system that stores information in tables as rows and columns of data, and conducts searches by using the data in specified columns of one table to find additional data in another table. relational database management system A system that organises data into related rows and columns. relational OLAP A storage mode that uses tables in a relational database to store multidimensional structures. rendered report A fully processed report that contains both data and layout information, in a format suitable for viewing. rendering A component in Reporting Services that is used to process the output format of a report. rendering extension(s) A plug-in that renders reports to a specific format.
rendering object model Report object model used by rendering extensions. replay In SQL Server Profiler, the ability to open a saved trace and play it again. report definition The blueprint for a report before the report is processed or rendered. A report definition contains information about the query and layout for the report. report execution snapshot A report snapshot that is cached. report history A collection of report snapshots that are created and saved over time. report history snapshot A report snapshot that appears in report history. report intermediate format A static report history that contains data captured at a specific point in time. report item Any object, such as a text box, graphical element or data region, that exists on a report layout. report layout In report designer, the placement of fields, text and graphics within a report. In report builder, the placement of fields and entities within a report, plus applied formatting styles. report layout template A predesigned table, matrix or chart report template in report builder. report link A URL to a hyperlinked report. report model
A metadata description of business data used for creating ad hoc reports in report builder.
report processing extension A component in Reporting Services that is used to extend the report processing logic. report rendering The action of combining the report layout with the data from the data source for the purpose of viewing the report. report server database A database that provides internal storage for a report server. report server execution account The account under which the Report Server Web service and Report Server Windows service run. report server folder namespace A hierarchy that contains predefined and user-defined folders. The namespace uniquely identifies reports and other items that are stored in a report server. It provides an addressing scheme for specifying reports in a URL. report snapshot A static report that contains data captured at a specific point in time. report-specific schedule Schedule defined inline with a report. resource Any item in a report server database that is not a report, folder or shared data source item. role A SQL Server security account that is a collection of other security accounts that can be treated as a single unit when managing permissions. A role can contain SQL Server logins, other roles, and Windows logins or groups. In Analysis Services, a role uses Windows security accounts to limit scope of access and permissions when users access databases, cubes, dimensions and data mining models.
In a database mirroring session, the principal server and mirror server perform complementary principal and mirror roles. Optionally, the role of witness is performed by a third server instance.
role assignment Definition of user access rights to an item. In Reporting Services, a security policy that determines whether a user or group can access a specific item and perform an operation. role definition A collection of tasks performed by a user (e.g. browser, administrator). In Reporting Services, a named collection of tasks that defines the operations a user can perform on a report server. role-playing dimension A single database dimension joined to the fact table on different foreign keys to produce multiple cube dimensions. RDBMS See Other Term: relational database management system RDL See Other Term: Report Definition Language Report Definition Language A set of instructions that describe layout and query information for a report. Report Server service A Windows service that contains all the processing and management capabilities of a report server. Report Server Web service A Web service that hosts, processes and delivers reports. ReportViewer controls A Web server control and Windows Form control that provides embedded report processing in ASP.NET and Windows Forms applications. scalar A single-value field, as opposed to an aggregate.
scalar aggregate An aggregate function, such as MIN(), MAX() or AVG(), that is specified in a SELECT statement column list that contains only aggregate functions. scale bar The line on a linear gauge on which tick marks are drawn analogous to an axis on a chart. scope An extent to which a variable can be referenced in a DTS package. script A collection of Transact-SQL statements used to perform an operation. security extension A component in Reporting Services that authenticates a user or group to a report server. semiadditive A measure that can be summed along one or more, but not all, dimensions in a cube. serializable The highest transaction isolation level. Serializable transactions lock all rows they read or modify to ensure the transaction is completely isolated from other tasks. server A location on the network where report builder is launched from and a report is saved, managed and published. server admin A user with elevated privileges who can access all settings and content of a report server. server aggregate An aggregate value that is calculated on the data source server and included in a result set by the data provider. shared data source item
shared dimension A dimension created within a database that can be used by any cube in the database. shared schedule Schedule information that can be referenced by multiple items. sibling A member in a dimension hierarchy that is a child of the same parent as a specified member. slice A subset of the data in a cube, specified by limiting one or more dimensions by members of the dimension. smart tag A smart tag exposes key configurations directly on the design surface to enhance overall design-time productivity in Visual Studio 2005. snowflake schema An extension of a star schema such that one or more dimensions are defined by multiple tables. source An Integration Services data flow component that extracts data from a data store, such as files and databases. source control A way of storing and managing different versions of source code files and other files used in software development projects. Also known as configuration management and revision control. source cube The cube on which a linked cube is based. source database In data warehousing, the database from which data is extracted for use in the data warehouse.
A database on the Publisher from which data and database objects are marked for replication as part of a publication that is propagated to Subscribers.
source object The single object to which all objects in a particular collection are connected by way of relationships that are all of the same relationship type. source partition An Analysis Services partition that is merged into another and is deleted automatically at the end of the merger process. sparsity The relative percentage of a multidimensional structure's cells that do not contain data. star join A join between a fact table (typically a large fact table) and at least two dimension tables. star query A star query joins a fact table and a number of dimension tables. star schema A relational database structure in which data is maintained in a single fact table at the center of the schema with additional dimension data stored in dimension tables. subreport A report contained within another report. subscribing server A server running an instance of Analysis Services that stores a linked cube. subscription A request for a copy of a publication to be delivered to a Subscriber. subscription database A database at the Subscriber that receives data and database objects published by a Publisher. subscription event rule A rule that processes information for event-driven subscriptions. subscription scheduled rule
One or more Transact-SQL statements that process information for scheduled subscriptions. Secure Sockets Layer (SSL) A proposed open standard for establishing a secure communications channel to prevent the interception of critical information, such as credit card numbers. Primarily, it enables secure electronic financial transactions on the World Wide Web, although it is designed to work on other Internet services as well. Semantic Model Definition Language A set of instructions that describe layout and query information for reports created in report builder. Sequence container Defines a control flow that is a subset of the package control flow. table data region A report item on a report layout that displays data in a columnar format. tablix A Reporting Services RDL data region that contains rows and columns resembling a table or matrix, possibly sharing characteristics of both. target partition An Analysis Services partition into which another is merged, and which contains the data of both partitions after the merger. temporary stored procedure A procedure placed in the temporary database, tempdb, and erased at the end of the session. time dimension A dimension that breaks time down into levels such as Year, Quarter, Month and Day. In Analysis Services, a special type of dimension created from a date/time column. transformation In data warehousing, the process of changing data extracted from source data systems into arrangements and formats consistent with the schema of the data warehouse.
In Integration Services, a data flow component that aggregates, merges, distributes and modifies column data and rowsets.
transformation error output Information about a transformation error. transformation input Data that is contained in a column, which is used during a join or lookup process, to modify or aggregate data in the table to which it is joined. transformation output Data that is returned as a result of a transformation procedure. tuple Uniquely identifies a cell, based on a combination of attribute members from every attribute hierarchy in the cube. two-phase commit A process that ensures transactions that apply to more than one server are completed on all servers or on none. unbalanced hierarchy A hierarchy in which one or more levels do not contain members in one or more branches of the hierarchy. unknown member A member of a dimension for which no key is found during processing of a cube that contains the dimension. unpivot In Integration Services, the process of creating a more normalized dataset by expanding data columns in a single record into multiple records. value An expression in MDX that returns a value. Value expressions can operate on sets, tuples, members, levels, numbers or strings. variable interval An option on a Reporting Services chart that can be specified to automatically calculate the optimal number of labels that can be placed on an axis, based on the chart width or height.
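The unpivot entry above can be pictured with a small sketch in plain Python. This is only an illustration of the idea of expanding data columns in one record into several records; it does not use Integration Services, and the function name `unpivot`, the column names and the sample data are all hypothetical.

```python
def unpivot(records, key_field, value_fields,
            name_col="attribute", value_col="value"):
    """Expand each wide record into one narrow record per value field."""
    rows = []
    for record in records:
        for field in value_fields:
            rows.append({
                key_field: record[key_field],  # carried over unchanged
                name_col: field,               # former column name
                value_col: record[field],      # former column value
            })
    return rows

# Hypothetical wide dataset: one record per customer, one column per quarter.
wide = [
    {"customer": "A", "q1_sales": 10, "q2_sales": 15},
    {"customer": "B", "q1_sales": 7, "q2_sales": 0},
]

narrow = unpivot(wide, "customer", ["q1_sales", "q2_sales"])
for row in narrow:
    print(row)
```

Each input record yields one output record per data column, so two wide records with two quarterly columns become four narrow records.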
vertical partitioning To segment a single table into multiple tables based on selected columns. very large database A database that has become large enough to be a management challenge, requiring extra attention to people and processes. visual total A displayed, aggregated cell value for a dimension member that is consistent with the displayed cell values for its displayed children. VLDB See Other Term: very large database write back To update a cube cell value, member or member property value. write enable To change a cube or dimension so that users in cube roles with read/write access to the cube or dimension can change its data. writeback In SQL Server, the update of a cube cell value, member or member property value. Web service In Reporting Services, a service that uses Simple Object Access Protocol (SOAP) over HTTP and acts as a communications interface between client programs and the report server. XML for Analysis A specification that describes an open standard that supports data access to data sources that reside on the World Wide Web. XMLA See Other Term: XML for Analysis