
NAME: IDNO:

BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI


II SEMESTER 2008-2009
SS G515 DATA WAREHOUSING
Comprehensive Examination - PART A (CLOSED BOOK)
05th May 2009 Weightage: 20% Time: 3 hours
Multiple-Choice Questions (40*0.5=20)
Points to note:
• Answer the multiple-choice questions on the question paper itself
• Some questions may have more than one correct option. You will get credit only if you mark all the
correct options
• There is NO NEGATIVE marking
• ENCIRCLE the correct option(s) using ink
1. List the DW architectural component(s) which are both a source of data to the DW and also draw data from
the DW
(a) Data marts
(b) Operational data store
(c) Supermarts
(d) All of the above
2. Operational systems work at what granularity level:
(a) Coarsest granularity
(b) Finest granularity
(c) Can work at any granularity level
(d) Concept of granularity does not apply to operational systems
3. A DW is:
(a) A synchronous replication of data
(b) An asynchronous replication of data
(c) Not a replication of data
(d) Hot backup of operational systems
4. Relationship between a dimension table and a fact table could be:
(a) 1:N
(b) M:N
(c) N:1
(d) 1:1
5. Two dimension tables are connected through:
(a) Bridge table
(b) Helper table
(c) Fact table
(d) Aggregated fact table
6. Most useful dimension tables are:
(a) Deep
(b) Wide
(c) Deep & wide
(d) Deep & narrow
7. Dimensional modeling is more restrictive than ER modeling because:
(a) Data is classified as either fact or dimension
(b) Dimension tables must have single field PK
(c) No direct relationship between two dimension tables
(d) Composite FKs are not allowed
8. In dimensional modeling, a bridge table can exist between:
(a) Two dimension tables
(b) Two fact tables
(c) Dimension and fact table
(d) Any two tables
9. Composite keys can be there in:
(a) Fact tables
(b) Dimension tables
(c) Bridge tables
(d) Snowflaked dimension tables
10. ETL tool sits between:
(a) Source systems & DW
(b) Source systems and data mart
(c) DW and data marts
(d) DW & user access tools
11. Which of the following can cause a distortion of the classical star schema:
(a) Aggregates
(b) Multi-valued dimensions
(c) Partitions
(d) All of the above
12. Which of the following can cause a distortion of the classical star schema:
(a) Bridge table
(b) Mini-dimensions
(c) Outriggers
(d) Snowflaked dimensions
13. Any distortion of the classical star schema can be hidden under:
(a) Materialized view
(b) View
(c) Snapshot
(d) Synonyms
14. From the ETL point of view, it is simplest to handle:
(a) Highly aggregated data
(b) Lightly aggregated data
(c) Finest granularity data
(d) Medium granularity data
15. Multidimensional databases are most suitable for:
(a) Finest granularity data
(b) Aggregated data
(c) Multidimensional data
(d) Data in dimension tables
16. MDDBs are still not used extensively in data warehouse architectures because:
(a) Their scalability is not established yet
(b) Their support for aggregates is not as strong as RDBMS
(c) They make the architecture more complex
(d) Their handling of dimension tables is weak
17. MDDBs store dimension table data in:
(a) Multidimensional arrays
(b) One dimensional arrays
(c) Structures
(d) Program variables
18. Proper subset(s) of the data warehouse:
(a) Data mart
(b) Supermart
(c) ODS
(d) OLTP system
19. Granularity of a data warehouse could be:
(a) Coarser than that of the operational system
(b) Finer than that of the operational system
(c) Coarser than that of the operational data store
(d) Finer than that of the operational data store
20. The concept of granularity applies to:
(a) Fact tables
(b) Dimension tables
(c) Both fact & dimension tables
(d) Neither fact nor dimension tables
21. Granularity of the operational systems is definitely captured in:
(a) Data warehouse
(b) Super mart
(c) Operational data store
(d) Data mart
22. Online aggregation:
(a) Improves query performance
(b) Provides early trends
(c) Uses non-blocking algorithms for evaluating relational operators
(d) Uses blocking algorithms for evaluating relational operators
23. Query performance enhancing technique(s):
(a) Aggregation
(b) Data cleaning
(c) Data de-duplication
(d) Indexes
24. Most inexpensive operation:
(a) Natural Join
(b) Left outer join
(c) Right outer join
(d) Full outer join
25. In the grocery store data mart, the sales fact table contains:
(a) Products on promotion that were sold
(b) Products on promotion that were not sold
(c) Products that were sold
(d) Products that were not sold
26. Most common type of queries in a DW environment:
(a) Inside-out queries
(b) Outside-in queries
(c) Dimension focused queries
(d) Fact focused queries
27. The number of cuboids in the lattice of a cube that has 7 dimensions and three levels of hierarchy along
each dimension:
(a) 3^7
(b) 2^7
(c) 4^7
(d) None of the above
28. Events that did not happen can be recorded in:
(a) Fact table
(b) Factless fact table
(c) Coverage tables
(d) All of the above
29. Most visible component of a DW system:
(a) Data Warehouse
(b) ODS
(c) OLAP tool
(d) ETL tool
30. Building a data mart for a business process/department that is very critical for your organization is a
______________ project:
(a) High risk high reward
(b) High risk low reward
(c) Low risk low reward
(d) Low risk high reward
31. Examples of multi-valued dimensions:
(a) Patient having many diagnoses
(b) Customer buying many products
(c) Different sales persons on a given day in a daily grain sales FT
(d) Customers having multiple accounts with a bank
32. Space savings is achieved by using surrogate keys in:
(a) Dimension tables
(b) Shrunken dimension tables
(c) Fact tables
(d) Mini-dimension tables
33. Example(s) of semi-additive facts:
(a) Sales units
(b) Customer count
(c) Account balance
(d) Ratios
34. What kind of constraints can be put on data in a DW:
(a) Primary key
(b) Foreign key
(c) Check constraints
(d) Functional dependencies
35. Which partitioning is suitable for partitioning with respect to the location dimension:
(a) Range
(b) List
(c) Hash
(d) Composite
36. Query performance can be enhanced using:
(a) Horizontal partitioning
(b) Vertical partitioning
(c) Data partitioning
(d) Hardware partitioning
37. Advantages of views include:
(a) Logical data independence
(b) Security
(c) Macros
(d) Hiding distortions to the classical star schema
38. Highest normalized structures in a DW could be:
(a) Dimension tables
(b) Mini-dimensions
(c) Fact tables
(d) Outriggers
39. Bridge/helper tables are used to handle:
(a) Multi-valued dimensions
(b) Outriggers
(c) Aggregated dimension tables
(d) Variable depth hierarchy
40. Generation of surrogate keys:
(a) Can be done in parallel for different dimensions
(b) Cannot be done in parallel for different dimensions
(c) Can be done only serially, scheduling important dimensions first
(d) Can be done only serially, scheduling important mini-dimensions first

End of Part A
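
As a worked illustration for question 27 of Part A: a commonly used counting rule says the number of cuboids in a lattice is the product, over all dimensions, of (number of hierarchy levels + 1), where the extra 1 stands for the virtual top level "all". The sketch below applies that rule; the function name `cuboid_count` is hypothetical, and the rule assumes the stated levels exclude the "all" level.

```c
#include <stddef.h>
#include <stdint.h>

/* Number of cuboids = product over dimensions of (levels[i] + 1).
   ASSUMPTION: levels[i] does not include the virtual top level "all",
   which is what the "+ 1" accounts for. */
uint64_t cuboid_count(const unsigned levels[], size_t n_dims)
{
    uint64_t total = 1;
    for (size_t i = 0; i < n_dims; i++)
        total *= (uint64_t)levels[i] + 1;
    return total;
}
```

With 7 dimensions and 3 levels along each, the product is 4 × 4 × 4 × 4 × 4 × 4 × 4 = 16384.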
BIRLA INSTITUTE OF TECHNOLOGY & SCIENCE, PILANI
II SEMESTER 2008-2009
SS G515 DATA WAREHOUSING
Comprehensive Examination - PART B (OPEN BOOK)
05th May 2009 Weightage: 20% Time: 3 hours

1. Explain in detail what RDBMS vendors are doing to make their products
more suitable for data warehousing and OLAP.
[3]
2. Give a fact table that contains only integer values. It must have a fully-additive and a
semi-additive fact.
[4]
3. There is a chain of 500 grocery stores in India. Each store has 60000 SKUs. On
average, 10% of SKUs are sold from each store each day. We want to store 5 years'
sales data in our data warehouse. Estimate the size of the fact table in gigabytes, given
that we have 4 facts (each 4 bytes long). Assume that there are four dimensions,
namely Product, Time, Location, and Promotion, and that surrogate keys are used in all
dimension tables. It is also given that for each SKU we store only one record per day
from each store. What could be the size of the 4-way aggregate?
[4]
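
The arithmetic behind question 3 can be sketched as follows. This is one possible estimate, not a model answer: it assumes 365 days per year and 4-byte surrogate keys (the question does not give the key width), and it ignores index and storage overhead. The function names are hypothetical. The 4-way aggregate is left aside.

```c
#include <stdint.h>

/* Rows in the daily-grain sales fact table: one row per SKU sold,
   per store, per day. */
uint64_t fact_rows(uint64_t stores, uint64_t skus_per_store,
                   uint64_t percent_sold_daily, uint64_t years,
                   uint64_t days_per_year)
{
    uint64_t skus_sold_per_store_day =
        skus_per_store * percent_sold_daily / 100;
    return stores * skus_sold_per_store_day * years * days_per_year;
}

/* Bytes per row: one surrogate-key column per dimension plus the facts. */
uint64_t row_bytes(uint64_t dims, uint64_t key_bytes,
                   uint64_t facts, uint64_t fact_bytes)
{
    return dims * key_bytes + facts * fact_bytes;
}

/* Estimate for the figures given in the question.
   ASSUMPTIONS: 365 days per year, 4-byte surrogate keys. */
uint64_t fact_table_bytes(void)
{
    return fact_rows(500, 60000, 10, 5, 365) * row_bytes(4, 4, 4, 4);
}
```

Under these assumptions the table holds 5,475,000,000 rows of 32 bytes each, or 175,200,000,000 bytes, which is roughly 175 GB in decimal units.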
4. In a university data warehouse, how would you model the grades of students, as a fact
or as a dimension? Give a detailed justification in support of your answer.
(Assume BITS education model)
[4]
5. Suppose a data warehouse consists of three dimensions (time, doctor, and patient) and
two measures, count and charge, where charge is the fee that a doctor charges a
patient for a visit. Store the above information in an MDDB in two different ways. Show
the cubes and the corresponding array declarations (in C). Name the data structures
that you would need to implement an MDDB.
[5]
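
One possible reading of "two different ways" in question 5 is sketched below: a single cube whose cells carry both measures, versus one cube per measure. The dimension sizes, type names, and helper functions are hypothetical and chosen only for illustration. Dense arrays are assumed, although a real MDDB would normally need some form of sparse storage.

```c
#include <stddef.h>

/* Hypothetical dimension sizes, kept small for illustration. */
enum { N_TIME = 12, N_DOCTOR = 5, N_PATIENT = 20 };

/* Way 1: a single cube; each cell holds both measures. */
struct cell {
    int    count;   /* number of visits */
    double charge;  /* total fee charged for those visits */
};
static struct cell visit_cube[N_TIME][N_DOCTOR][N_PATIENT];

/* Way 2: one cube per measure, indexed identically. */
static int    count_cube[N_TIME][N_DOCTOR][N_PATIENT];
static double charge_cube[N_TIME][N_DOCTOR][N_PATIENT];

/* Record one visit in both layouts. */
void record_visit(size_t t, size_t d, size_t p, double fee)
{
    visit_cube[t][d][p].count  += 1;
    visit_cube[t][d][p].charge += fee;
    count_cube[t][d][p]  += 1;
    charge_cube[t][d][p] += fee;
}

/* Roll up over time and patient using Way 1: visits for one doctor. */
int doctor_visit_count(size_t d)
{
    int total = 0;
    for (size_t t = 0; t < N_TIME; t++)
        for (size_t p = 0; p < N_PATIENT; p++)
            total += visit_cube[t][d][p].count;
    return total;
}

/* Roll up over time and patient using Way 2: total charge for one doctor. */
double doctor_total_charge(size_t d)
{
    double total = 0.0;
    for (size_t t = 0; t < N_TIME; t++)
        for (size_t p = 0; p < N_PATIENT; p++)
            total += charge_cube[t][d][p];
    return total;
}
```

The first layout keeps both measures of a cell adjacent in memory, while the second lets a query that touches only one measure scan a smaller array.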