Chapter 4: Functional and Nonfunctional Requirements

It's important to define functional and nonfunctional requirements when building a data warehouse system to make sure that the system we build helps users achieve the intended business objectives. Functional requirements define what the system does. They contain the features that the data warehouse (DW) should have. Nonfunctional requirements guide and constrain the DW architecture.

We will be using the Amadeus Entertainment case study to show how to gather functional and nonfunctional requirements.

1. First we will find out about Amadeus Entertainment's business issues and challenges.
2. Then we will find out which business areas will benefit from a data warehouse system.
3. We will dive into each of these areas and learn about the business operations within that area. By this we mean the purpose, the processes, the roles, the exceptions, and the issues.
4. After we understand the business operations, we will define the functional requirements, that is, the things that define:
   a. What the data warehouse system does, such as the questions or issues that the data warehouse will be able to answer
   b. The data that will be stored in the DW
   c. The type of analysis that users will be performing
5. We will define the nonfunctional requirements, such as security, availability, and performance. An example of a nonfunctional requirement (in this case, availability) is that the data warehouse is expected to be up and running with downtime of less than one hour a month.
6. We will then investigate the operational systems (Jupiter, WebTower9, and Jade) and conduct a feasibility study to find out whether the data that we need is available in the source systems and whether it is accessible.
7. We will research the amount of data, the condition of the data, and the quality of the data, and try to identify any potential risks in the project that could prevent us from delivering a data warehouse system that satisfies those functional and nonfunctional requirements.
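The availability example above (downtime of less than one hour a month) can be restated as an uptime percentage. The following is a minimal sketch of that arithmetic. The 30-day month and the function names are assumptions made for illustration and are not part of the case study.

```python
HOURS_PER_DAY = 24


def availability_pct(downtime_hours: float, days_in_month: int = 30) -> float:
    """Return the percentage of the month during which the system is up.

    days_in_month defaults to 30, which is an assumption for illustration.
    """
    total_hours = days_in_month * HOURS_PER_DAY
    return 100.0 * (total_hours - downtime_hours) / total_hours


def meets_requirement(downtime_hours: float, max_downtime_hours: float = 1.0) -> bool:
    """Check the requirement from the text: downtime of less than one hour a month."""
    return downtime_hours < max_downtime_hours


if __name__ == "__main__":
    # One hour of downtime in a 30-day (720-hour) month.
    print(f"{availability_pct(1.0):.2f}%")  # prints 99.86%
    print(meets_requirement(0.5))           # prints True
```

Under these assumptions, "less than one hour a month" corresponds to an availability of slightly more than 99.86 percent.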
b. The DWA (data warehouse architect) = David
c. The PM (project manager) = Natalie

Grace, David, and Natalie have met with various managers and directors from various parts of the business and talked about business operations and the business issues that the managers are facing.

a. They then visited several physical stores.
b. They talked to the store managers and employees.
c. They studied the details of how the business works in the stores.
d. They also visited the online stores and looked at the product categories and the sales order process.
e. They also created a few orders in WebTower9 (the online store system) and followed the orders through to the back end of the system.

Let's say that from their survey they found the following business challenges and issues:

1. Because Amadeus Entertainment spans several countries and operates on different systems, it has difficulty aggregating worldwide sales (including returns) at any time. The company also needs to analyze sales by product and geographic area.
2. The company needs to evaluate supplier performance based on delivery lead time and promptness, order volume and quantity, trading and payment history, and store stock fluctuation records, regardless of brand, country, and transaction system. Historically this activity has been laborious.
3. The company wants to be able to select certain customers based on demographic attributes, order history, and communication permissions, and send them e-mail newsletters containing promotional offers, regardless of which systems they are on. It also wants to record the customer responses, such as opening e-mails and visiting the company web sites.

They then grouped these issues and challenges into three business areas: sales, purchasing, and CRM.
A level is a quantitative measurement of an object at a particular point in time, such as an account balance, an inventory level, or the number of customers. These quantitative measures change from time to time.

Roles are the who, whom, and what involved in an event. For example, the roles in the purchase order event are product, account manager, and supplier: the account manager raises a purchase order with a supplier for a particular product. The roles in the subscribe event are customer, package, and store. In other words, a customer subscribes to a package in a store.
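The definitions of levels and roles above can be sketched as simple data structures. This is only an illustration of the concepts. The class names, field names, identifiers, dates, and quantities below are hypothetical and do not come from the case study.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Event:
    """A business event together with the roles involved in it."""
    name: str
    occurred_at: datetime
    roles: dict[str, str]  # role name -> participant identifier


@dataclass(frozen=True)
class Level:
    """A quantitative measurement of an object at a particular point in time."""
    name: str
    measured_object: str
    as_of: date
    value: float


# Purchase order event: the account manager raises a purchase order
# with a supplier for a particular product. All identifiers are made up.
purchase_order = Event(
    name="purchase order",
    occurred_at=datetime(2024, 1, 15, 9, 30),
    roles={"product": "P-001", "account manager": "AM-07", "supplier": "S-112"},
)

# Subscribe event: a customer subscribes to a package in a store.
subscribe = Event(
    name="subscribe",
    occurred_at=datetime(2024, 1, 16, 14, 0),
    roles={"customer": "C-5001", "package": "PKG-3", "store": "ST-12"},
)

# Inventory level: the same object can have a different value on another date.
inventory = Level(
    name="inventory level",
    measured_object="P-001",
    as_of=date(2024, 1, 15),
    value=120.0,
)

if __name__ == "__main__":
    print(sorted(purchase_order.roles))
    print(sorted(subscribe.roles))
    print(inventory.value)
```

Keeping the measurement date on the level makes the "changes from time to time" property explicit: each new measurement is a new record rather than an update to an old one.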
After talking to business users about these three areas, Grace and David found out that the events, statuses, and levels in Amadeus Entertainment were as follows:

a. sales event
b. browsing event
c. subscribe event
d. customer class
e. customer band
f. purchase order event
g. campaign sent event
h. campaign opened event
i. campaign click-through event
j. inventory level
k. package revenues
l. package costs

They then worked with the business people to get all the roles and attributes associated with each event and process. In this activity, they tried to understand the business terminology and the business principles in the corporation.

Grace and David then sat down for a few hours with the DBAs for the following database platforms: SQL Server, DB/2, and Informix, and conducted a brief walk-through of the database of each system.
1. Understanding the data means finding out where the data is located for each functional requirement
2. Understanding the meaning of the data
3. Understanding the data quality rules

The risks are identified by finding out whether there are gaps between the requirements and the data, that is, whether for each requirement the data is available and accessible.

The purpose of doing a data feasibility study is to get an idea about whether there are any data risks that could cause the project to fail. Data risks are project risks associated with data availability and data accessibility. Data availability risks are the risks of not being able to satisfy a requirement because the data is not available. Data accessibility risks are the risks of not being able to satisfy a requirement because the ETL system cannot extract the data and bring it into the warehouse. For example, we may not be able to identify updates in a table because there is no last-updated timestamp, because we are not allowed to install a trigger on the source table, and because the table is too big to extract in full every time.

For risk 1, is it possible to extract data from Jupiter within the one-hour limitation? We cannot answer that question until we know what data we need from Jupiter, and for that we need to model the data first. But we know it's going to be inventory data that we need from Jupiter (and sales data from WebTower9 and Jade). We can build a simple SQL Server Integration Services (SSIS) package to extract one week of data from the Jupiter inventory tables.

For risk 2, is there any table that is not available or is difficult to get? Speak to the business project manager or somebody who knows the front end of the source systems very well. Ask them whether, to their knowledge, there is data required by the business requirements that is unavailable in Jupiter, Jade, or WebTower9.
Data such as the company fiscal calendar, performance targets, product costs, holiday data, customer classification, and overhead costs may not be available in the source systems. Discuss with the business project manager whether someone can arrange for these data items to be supplied to you in text files or spreadsheets.

For risk 3, we need to find out whether we can get to the data. We need to understand what RDBMS the source systems are running on. Most companies use a popular RDBMS such as Oracle, Informix, DB/2, Ingres, Progress, MySQL, Sybase, or SQL Server, and these RDBMSs support ADO.NET, ODBC, JDBC, or OLE DB, so we can get to the data using SSIS, assuming we have cleared the security permissions.

For risk 4, some data in the source systems may require cleansing. We can find this out by querying the source systems. Some of the data quality issues are incomplete data, incorrect or inaccurate data, duplicate records (because no hard database constraints, such as primary keys or unique constraints, are applied), data in free-text format, columns that exist but are mostly null, orphan records, and so on.

For risk 5, unfortunately, without designing the DDS and NDS, we would not be able to understand the size of the data stores.
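For risk 4, the kind of queries used to look for data quality issues in a source system can be sketched as follows. This is a minimal illustration, not the method used in the case study: it uses Python with an in-memory SQLite database instead of the real source RDBMS, and the table names, column names, and sample rows are all hypothetical.

```python
import sqlite3

# Hypothetical source tables with no primary key or foreign key constraints,
# loaded with made-up rows so that each check has something to find.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (customer_id INTEGER, name TEXT, email TEXT);
CREATE TABLE sales_order (order_id INTEGER, customer_id INTEGER);
INSERT INTO customer VALUES
    (1, 'Ann', 'ann@example.com'),
    (2, 'Bob', NULL),
    (2, 'Bob', NULL),
    (3, 'Cy',  NULL);
INSERT INTO sales_order VALUES (10, 1), (11, 2), (12, 99);
""")


def profile(db: sqlite3.Connection) -> dict:
    """Run three simple data quality checks and return the findings."""
    # Mostly-null column: the fraction of customer rows with no e-mail address.
    null_ratio = db.execute(
        "SELECT AVG(email IS NULL) FROM customer"
    ).fetchone()[0]

    # Duplicate records: customer_id values that occur more than once.
    duplicates = db.execute(
        "SELECT customer_id, COUNT(*) FROM customer "
        "GROUP BY customer_id HAVING COUNT(*) > 1"
    ).fetchall()

    # Orphan records: orders whose customer_id has no matching customer row.
    orphans = db.execute(
        "SELECT order_id FROM sales_order o "
        "WHERE NOT EXISTS (SELECT 1 FROM customer c "
        "                  WHERE c.customer_id = o.customer_id)"
    ).fetchall()

    return {"email_null_ratio": null_ratio,
            "duplicate_customer_ids": duplicates,
            "orphan_order_ids": orphans}


if __name__ == "__main__":
    for check, result in profile(conn).items():
        print(check, result)
```

Against a real source system the same queries would be run through whichever driver the RDBMS supports, and the SQL may need small dialect changes (for example, the `AVG(email IS NULL)` shortcut relies on SQLite treating a comparison as 0 or 1).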