Slice: A slice is a subset of a multi-dimensional array corresponding to a single value for one or
more members of the dimensions not in the subset.[6] The picture shows a slicing operation: The
sales figures of all sales regions and all product categories of the company in the year 2004 are
"sliced" out of the data cube.
Dice: The dice operation is a slice on more than two dimensions of a data cube (or more than two
consecutive slices).[7] The picture shows a dicing operation: The new cube shows the sales
figures of a limited number of product categories, while the time and region dimensions cover
the same range as before.
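
As a rough illustration of the slice and dice operations described above, the following Python sketch stores a tiny cube as a dictionary keyed by (year, region, category). The sales figures, region names, and the helper names slice_cube and dice_cube are invented for this example and do not come from the pictures; dicing is modelled here simply as restricting chosen dimensions to a set of allowed values.

```python
# Hypothetical sales cube: (year, region, category) -> sales figure.
cube = {
    (2003, "North", "Tents"): 120,
    (2003, "South", "Boots"): 80,
    (2004, "North", "Tents"): 150,
    (2004, "North", "Boots"): 90,
    (2004, "South", "Tents"): 60,
    (2004, "South", "Boots"): 110,
}


def slice_cube(cube, axis, value):
    """Fix one dimension to a single value and drop it from the key."""
    return {
        key[:axis] + key[axis + 1:]: measure
        for key, measure in cube.items()
        if key[axis] == value
    }


def dice_cube(cube, allowed):
    """Keep cells whose coordinate on each listed axis is an allowed value.

    `allowed` maps an axis index to a set of permitted values; axes that
    are not listed keep their full range.
    """
    return {
        key: measure
        for key, measure in cube.items()
        if all(key[axis] in values for axis, values in allowed.items())
    }


# Slice: all regions and categories for the year 2004 (axis 0 is the year).
sales_2004 = slice_cube(cube, axis=0, value=2004)

# Dice: limit the category dimension, leave time and region untouched.
tents_only = dice_cube(cube, {2: {"Tents"}})
```

The slice result has one dimension fewer than the cube, while the diced cube keeps all three dimensions and only loses cells.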
Drill Down/Up: Drilling down or up is a specific analytical technique whereby the user navigates
among levels of data ranging from the most summarized (up) to the most detailed (down).[6] The
picture shows a drilling operation: The sales figures of the product category
"Outdoor-Schutzausrüstung" (outdoor protective equipment) can be understood better, since the
sales figures for the individual products of this category are now visible.
Roll-up: A roll-up involves computing all of the data relationships for one or more dimensions.
To do this, a computational relationship or formula might be defined.[6]
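
A minimal sketch of moving between the summarized and the detailed level, assuming a two-level hierarchy of category and product. The products, figures, and the function names roll_up and drill_down are hypothetical, and summation is only one possible choice for the computational relationship mentioned above.

```python
from collections import defaultdict

# Hypothetical detail data: (category, product) -> sales figure.
detail = {
    ("Outdoor protection", "Helmet"): 40,
    ("Outdoor protection", "Gloves"): 25,
    ("Camping", "Tent"): 70,
}


def roll_up(detail):
    """Summarize product-level sales to category level by summing."""
    totals = defaultdict(int)
    for (category, _product), sales in detail.items():
        totals[category] += sales
    return dict(totals)


def drill_down(detail, category):
    """Return the product-level figures behind one category total."""
    return {
        product: sales
        for (cat, product), sales in detail.items()
        if cat == category
    }


summary = roll_up(detail)                            # most summarized view
products = drill_down(detail, "Outdoor protection")  # most detailed view
```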
Pivot: This operation is also called the rotate operation. It rotates the data in order to provide
an alternative presentation of the data: the report or page display takes a different dimensional
orientation.[6] The picture shows a pivoting operation: The whole cube is rotated, giving another
perspective on the data.
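
For a two-dimensional view, pivoting amounts to swapping the row and column dimensions. The sketch below assumes a nested dictionary as the view; the regions, categories, and figures are made up for illustration.

```python
# Hypothetical 2-D view: rows are regions, columns are product categories.
view = {
    "North": {"Tents": 150, "Boots": 90},
    "South": {"Tents": 60, "Boots": 110},
}


def pivot(view):
    """Swap the row and column dimensions; cell values are unchanged."""
    rotated = {}
    for row, cells in view.items():
        for col, value in cells.items():
            rotated.setdefault(col, {})[row] = value
    return rotated


# Rows are now categories, columns are regions.
by_category = pivot(view)
```

Pivoting twice returns the original orientation, which shows that the operation changes only the presentation and not the data.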
Ans.- Users of decision support systems often see data in the form of data cubes. The cube is
used to represent data along some measure of interest. Although called a "cube", it can be 2-
dimensional, 3-dimensional, or higher-dimensional. Each dimension represents some attribute in
the database and the cells in the data cube represent the measure of interest. For example, they
could contain a count for the number of times that attribute combination occurs in the database,
or the minimum, maximum, sum or average value of some attribute. Queries are performed on
the cube to retrieve decision support information.
Example: We have a database that contains
transaction information relating company sales of a part to a customer at a store location. The
data cube formed from this database is a 3-dimensional representation, with each cell (p,c,s) of
the cube representing a combination of values from part, customer and store-location. A sample
data cube for this combination is shown in Figure 1. The content of each cell is the count of the
number of times that specific combination of values occurs together in the database. Cells that
appear blank in fact have a value of zero. The cube can then be used to retrieve information
within the database about, for example, which store should be given a certain part to sell in order
to make the greatest sales.
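
The count cube from this example can be sketched with a Counter keyed by (part, customer, store-location). The transactions below are invented and are not the contents of Figure 1, and best_store_for is a hypothetical helper that answers the kind of query mentioned above.

```python
from collections import Counter

# Hypothetical transactions: (part, customer, store_location).
transactions = [
    ("P1", "C1", "Vancouver"),
    ("P1", "C2", "Vancouver"),
    ("P1", "C1", "Vancouver"),
    ("P2", "C1", "Toronto"),
    ("P1", "C3", "Toronto"),
]

# Each cell (p, c, s) holds the number of times that combination occurs.
# A Counter returns 0 for missing keys, matching the "blank means zero" rule.
cube = Counter(transactions)


def best_store_for(cube, part):
    """Return the store with the highest count for `part`, or None."""
    per_store = Counter()
    for (p, _customer, store), count in cube.items():
        if p == part:
            per_store[store] += count
    return per_store.most_common(1)[0][0] if per_store else None


top_store = best_store_for(cube, "P1")
```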
Rollup or summarization of the data cube can be done by traversing upwards through a concept
hierarchy. A concept hierarchy maps a set of low level concepts to higher level, more general
concepts. It can be used to summarize information in the data cube. As the values are combined,
cardinalities shrink and the cube gets smaller. Generalizing can be thought of as computing some
of the summary total cells that contain ANYs, and storing those in favour of the original cells.
To reduce the size of the data cube, we can summarize the data by computing the cube at a higher
level in the concept hierarchy. A non-summarized cube would be computed at the lowest level,
for example, the province level in Figure 2(a). If we compute the cube at the second level, there
are only six categories, B.C., Prairies, Ont., Que., Maritimes and Nfld., and the data cube will be
much smaller. Figure 3 shows a sample generalization of the Province attribute for those
provinces that can be grouped under the concept Prairies and those that can be grouped under the
concept Maritimes. For example, for Sask., the province, or location name, changes to Prairies,
but the other attribute values remain unchanged because they are not summarized at this point.
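
The generalization step can be sketched as a lookup in a concept hierarchy followed by merging cells that now share a key. The hierarchy below covers only the Prairies and Maritimes groupings; apart from Sask., the member provinces are filled in from general geography rather than from Figure 3, and the counts and the name generalize are hypothetical.

```python
from collections import Counter

# Concept hierarchy: province -> more general concept.
# Provinces that are not listed are kept as they are.
HIERARCHY = {
    "Alta.": "Prairies", "Sask.": "Prairies", "Man.": "Prairies",
    "N.B.": "Maritimes", "N.S.": "Maritimes", "P.E.I.": "Maritimes",
}

# Hypothetical base cube at the province level: (part, province) -> count.
base = {
    ("P1", "Sask."): 4,
    ("P1", "Man."): 3,
    ("P1", "B.C."): 5,
    ("P2", "N.S."): 2,
    ("P2", "N.B."): 1,
}


def generalize(cube, hierarchy):
    """Replace each province by its parent concept and merge the counts.

    Only the location attribute changes; the part attribute is untouched.
    """
    rolled = Counter()
    for (part, province), count in cube.items():
        rolled[(part, hierarchy.get(province, province))] += count
    return dict(rolled)


summary = generalize(base, HIERARCHY)
```

In this sketch five cells collapse into three while the total count stays the same, which is the size reduction the text describes.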
Ans.- Data cleansing, data cleaning, or data scrubbing is the process of detecting and
correcting (or removing) corrupt or inaccurate records from a record set, table, or database. Used
mainly in databases, the term refers to identifying incomplete, incorrect, inaccurate, or irrelevant
parts of the data and then replacing, modifying, or deleting this dirty data. After cleansing,
a data set will be consistent with other similar data sets in the system. The inconsistencies
detected or removed may have been originally caused by user entry errors, by corruption in
transmission or storage, or by different data dictionary definitions of similar entities in different
stores. Data cleansing differs from data validation in that validation almost invariably means data
is rejected from the system at entry and is performed at entry time, rather than on batches of data.
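
The contrast between validation and cleansing can be sketched as follows. The record layout (customer, quantity) and the rules are assumptions chosen for illustration: validate rejects a single record at entry time, while cleanse works on a whole batch, repairing what it can and dropping what it cannot.

```python
def validate(record):
    """Entry-time check: raise so the bad record never enters the system."""
    quantity = record.get("quantity")
    if (
        not record.get("customer")
        or not isinstance(quantity, int)
        or quantity <= 0
    ):
        raise ValueError(f"rejected at entry: {record!r}")
    return record


def cleanse(batch):
    """Batch pass over stored data: repair what can be repaired, drop the rest."""
    cleaned = []
    for record in batch:
        # Repair: trim stray whitespace and normalize capitalization.
        customer = (record.get("customer") or "").strip().title()
        quantity = record.get("quantity")
        if not customer or not isinstance(quantity, int) or quantity <= 0:
            continue  # incomplete or incorrect beyond repair: remove it
        cleaned.append({"customer": customer, "quantity": quantity})
    return cleaned


# Hypothetical dirty batch that is already stored in the database.
batch = [
    {"customer": "  alice smith ", "quantity": 2},
    {"customer": "", "quantity": 1},
    {"customer": "Bob Jones", "quantity": -5},
]
cleaned = cleanse(batch)
```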