Behavior
Four possible combinations are noted in the MicroStrategy Document Creation Guide
product manual.
Scenario: Case 1: Same attributes, same result elements
Defining characteristics: Datasets are at the same level (dimensionality), and the unique set of attribute elements for each dataset is the same (for instance, if the same filter is used in each).
Join behavior: The join in these cases (Cases 1 and 2) behaves exactly like a database full outer join. All attribute elements will be preserved, with null values for metrics where attribute elements in one dataset do not match another.

Scenario: Case 2: Same attributes, different result elements
Defining characteristics: Datasets are at the same level (dimensionality), but some attribute elements in one or more datasets cannot be found in other datasets. The filters may be different, or the reports may be using different fact tables with sparse data.
Join behavior: Same as Case 1.

Scenario: Case 3: Dataset with a superset of attributes in another dataset
Defining characteristics: Datasets have different dimensionalities, and one dimensionality completely overlaps the other: e.g., Category and Region in one dataset versus Region in the other. {Category, Region} is a superset of {Region}. Useful for percent-to-total calculations.
Join behavior: This case also behaves like a database full outer join. Multiple rows of the lower-level dimensionality match a single row in the higher; so one metric value from the single row will be replicated across the multiple rows.

Scenario: Case 4: Different attributes
Defining characteristics: Considering two datasets, each dataset has attributes not present in the other; e.g., Category and Region in one dataset versus Quarter and Region in the other. Neither dataset's dimensionality is a superset of the other's.
Join behavior: Compound join, following the general methodology described later in this document.
Cases 1-3 are straightforward and should be familiar to users of relational databases.
Case 4 can return results that are not expected.
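To make the Case 3 behavior concrete, here is a minimal sketch in plain Python of a full outer join between a {Region, Category} dataset and a {Region} dataset. The sample values, the column name "Region Revenue", and the helper `full_outer_join` are all hypothetical and chosen for illustration; this is not MicroStrategy code.

```python
# Hypothetical sample data: one dataset at {Region, Category}, one at {Region}.
category_rows = [
    {"Region": "Northeast", "Category": "Books", "Revenue": 100},
    {"Region": "Northeast", "Category": "Music", "Revenue": 300},
    {"Region": "Mid-Atlantic", "Category": "Books", "Revenue": 50},
]
region_rows = [
    {"Region": "Northeast", "Region Revenue": 400},
    {"Region": "South", "Region Revenue": 900},
]


def full_outer_join(detail, summary, key):
    """Full outer join on one shared attribute; unmatched columns become None."""
    detail_cols = list(detail[0])
    summary_cols = list(summary[0])
    summary_by_key = {row[key]: row for row in summary}
    matched_keys = set()
    joined = []
    for row in detail:
        match = summary_by_key.get(row[key], {})
        if match:
            matched_keys.add(row[key])
        # Start from all-None summary columns, then overlay the match and the row.
        joined.append({**dict.fromkeys(summary_cols), **match, **row})
    for row in summary:
        if row[key] not in matched_keys:
            # Summary rows with no detail match keep null detail columns.
            joined.append({**dict.fromkeys(detail_cols), **row})
    return joined


joined = full_outer_join(category_rows, region_rows, "Region")

# Percent-to-total: the region-level value is replicated on every category row.
shares = [
    row["Revenue"] / row["Region Revenue"]
    for row in joined
    if row["Revenue"] is not None and row["Region Revenue"] is not None
]
```

In this sketch the single Northeast row from the higher-level dataset is replicated across both Northeast category rows, which is what makes the percent-to-total division possible, while Mid-Atlantic and South survive with null values on the side that has no match.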
In a cross join, the relationship among the attribute elements can be considered
arbitrary: when every element of one attribute is matched to every element of the
other, no particular meaning can be inferred from the fact that element 1 of attribute A
appears alongside element 2 of attribute B.
A cross join will produce a final result with many more rows than either source table has
individually. This is an undesirable outcome for Report Services documents, because
repeated elements for a grouping attribute would result in entire sections of the
document being repeated. For this reason, an algorithm was chosen that would
eliminate redundancies in the joined attribute element set.
A Case 4 compound join between two datasets (A and B) takes place according to the
following general methodology:
1. If the two datasets have any common attributes, common elements will be matched.
2. Once the rows are paired up, they are no longer considered for future matches.
3. If one dataset has more rows than the other, the remaining rows will be added to the compound join result with null values for metrics and/or attributes coming from the smaller dataset. This holds true for:
Dataset at the level {Region, Category}:

Region        Category     Revenue  Profit
Northeast     Books           9093    2412
Northeast     Electronics  1550784  423535
Northeast     Movies        387667   95900
Northeast     Music         387320   42732
Mid-Atlantic  Books          13578    3630
Mid-Atlantic  Electronics  2281847  623124
Mid-Atlantic  Movies        557250  137923
Mid-Atlantic  Music         560665   61692

Dataset at the level {Region, Quarter}:

Region     Quarter  Units Sold
Northeast  2003 Q1        6444
Northeast  2003 Q2        8590
Northeast  2003 Q3        5083
Northeast  2003 Q4        9251
Northeast  2004 Q1        4221
Northeast  2004 Q2        8070
Northeast  2004 Q3        5147
Northeast  2004 Q4        9942

When these datasets are used in a Report Services document, the results are as follows:
For the common attribute, Region, the Northeast and Mid-Atlantic elements are
not mixed in the result.
There are more Quarters than Categories; thus some of the Quarters do not have
corresponding Category values.
The Mid-Atlantic Region does not have any corresponding values in the {Region,
Quarter} dataset; thus it has no Quarter or Units Sold values.
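The three-step methodology can be sketched in plain Python using the two example datasets. This is only an illustration under stated assumptions: the function name `compound_join` is hypothetical, and pairing rows in their listed order within each Region is an assumption, since the text does not say which Category row is paired with which Quarter row.

```python
from itertools import zip_longest

revenue_rows = [  # dataset at {Region, Category}
    {"Region": "Northeast", "Category": "Books", "Revenue": 9093, "Profit": 2412},
    {"Region": "Northeast", "Category": "Electronics", "Revenue": 1550784, "Profit": 423535},
    {"Region": "Northeast", "Category": "Movies", "Revenue": 387667, "Profit": 95900},
    {"Region": "Northeast", "Category": "Music", "Revenue": 387320, "Profit": 42732},
    {"Region": "Mid-Atlantic", "Category": "Books", "Revenue": 13578, "Profit": 3630},
    {"Region": "Mid-Atlantic", "Category": "Electronics", "Revenue": 2281847, "Profit": 623124},
    {"Region": "Mid-Atlantic", "Category": "Movies", "Revenue": 557250, "Profit": 137923},
    {"Region": "Mid-Atlantic", "Category": "Music", "Revenue": 560665, "Profit": 61692},
]
units_rows = [  # dataset at {Region, Quarter}
    {"Region": "Northeast", "Quarter": q, "Units Sold": u}
    for q, u in [
        ("2003 Q1", 6444), ("2003 Q2", 8590), ("2003 Q3", 5083), ("2003 Q4", 9251),
        ("2004 Q1", 4221), ("2004 Q2", 8070), ("2004 Q3", 5147), ("2004 Q4", 9942),
    ]
]


def compound_join(a_rows, b_rows, common):
    """Sketch of a Case 4 join; assumes both inputs are non-empty lists of dicts."""
    def group(rows):
        groups = {}
        for row in rows:
            groups.setdefault(tuple(row[c] for c in common), []).append(row)
        return groups

    a_groups, b_groups = group(a_rows), group(b_rows)
    a_only = [c for c in a_rows[0] if c not in common]
    b_only = [c for c in b_rows[0] if c not in common]
    keys = list(a_groups) + [k for k in b_groups if k not in a_groups]

    result = []
    for key in keys:
        # Step 1: match on common elements. Step 2: each row is paired at most once
        # (positional pairing is an assumption). Step 3: leftovers get null values.
        for a_row, b_row in zip_longest(a_groups.get(key, []), b_groups.get(key, [])):
            merged = dict(zip(common, key))
            merged.update({c: a_row[c] if a_row else None for c in a_only})
            merged.update({c: b_row[c] if b_row else None for c in b_only})
            result.append(merged)
    return result


result = compound_join(revenue_rows, units_rows, ["Region"])
```

Under these assumptions the sketch yields 12 rows: eight for Northeast, of which the last four have no Category, Revenue, or Profit, and four for Mid-Atlantic with no Quarter or Units Sold.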
This result has the smallest number of rows to capture the data from the two
datasets. By contrast, if a cross join were used to combine the unrelated attributes
Category and Quarter, the Northeast region would have 32 rows (4 × 8); each Revenue
value would be repeated eight times and each Units Sold value would be repeated four
times.
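The cross join arithmetic can be checked with a short Python sketch over the Northeast rows of the two example datasets. The variable names are illustrative only.

```python
from collections import Counter
from itertools import product

# Northeast rows only, taken from the two example datasets.
northeast_revenue = {
    "Books": 9093, "Electronics": 1550784, "Movies": 387667, "Music": 387320,
}
northeast_units = {
    "2003 Q1": 6444, "2003 Q2": 8590, "2003 Q3": 5083, "2003 Q4": 9251,
    "2004 Q1": 4221, "2004 Q2": 8070, "2004 Q3": 5147, "2004 Q4": 9942,
}

# Cross join: every Category row is combined with every Quarter row.
cross = [
    (category, quarter, revenue, units)
    for (category, revenue), (quarter, units)
    in product(northeast_revenue.items(), northeast_units.items())
]

# Count how often each metric value appears in the cross-joined result.
revenue_repeats = Counter(revenue for _, _, revenue, _ in cross)
units_repeats = Counter(units for _, _, _, units in cross)
```

Here `len(cross)` is 32, every Revenue value appears eight times, and every Units Sold value appears four times, matching the counts in the text.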
If a cross join were used, the user could infer nothing from the fact that Books
sits alongside 2003 Q1. The same is true of the compound join.
Note:
In Case 4, there may be relationships in the schema between attributes that are not in
common between the two datasets, but those attribute relationships are not considered
when resolving the compound join. The data relationships must be present in the
datasets as given to the Report Services document. For example, in the case of datasets
at the level of {Region, Category} and {Region, Subcategory}, where Category is a
parent of Subcategory, the only way to preserve the category-subcategory data
relationship is to run an additional query against the warehouse. By design, this is not
part of the Report Services execution flow. (If the datasets come from different data
sources, there is no relationship table to poll.)
Best Practices
The Report Services compound join functions best when the following conditions are
met:
The primary dataset is at the lowest level dimensionality (that is, the finest data
granularity).
The primary dataset's dimensionality is the same as, or is a superset of, every
other dataset's dimensionality.
The other datasets in the document do not introduce attributes that are not
present in the primary dataset.
Under these conditions, every dataset join will fall into Cases 1-3.
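These conditions can be checked before building a document by comparing attribute sets. The sketch below is a hypothetical helper, not part of any MicroStrategy API; `classify_join` and `follows_best_practices` are illustrative names. It works from attribute lists alone, so it cannot tell Case 1 from Case 2, which differ only in their attribute elements.

```python
def classify_join(attrs_a, attrs_b):
    """Classify a pair of datasets by their attribute sets alone."""
    a, b = set(attrs_a), set(attrs_b)
    if a == b:
        return "Case 1 or 2"  # same level; element overlap decides which
    if a > b or b > a:
        return "Case 3"       # one dimensionality is a superset of the other
    return "Case 4"           # each side has attributes the other lacks


def follows_best_practices(primary_attrs, other_datasets):
    """True if every other dataset uses only attributes found in the primary."""
    primary = set(primary_attrs)
    return all(set(attrs) <= primary for attrs in other_datasets)


# Example: the {Region, Category} and {Region, Quarter} datasets from the text.
example_case = classify_join(["Region", "Category"], ["Region", "Quarter"])
example_ok = follows_best_practices(["Region", "Category"], [["Region", "Quarter"]])
```

For the example datasets, `example_case` is "Case 4" and `example_ok` is False, because Quarter is not present in the {Region, Category} dataset.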