Cubes Python Online Analytical Processing Framework

Cubes Documentation
Release 0.7.0
Stefan Urbanek
October 04, 2011
CONTENTS

1 Introduction
2 Installation
    2.1 From sources
3 model Logical Model
4 Logical Model description
    4.1 Load a model
    4.2 Model components
    4.3 Dimensions
    4.4 Attributes
5 Physical Mapping
    5.1 Attribute Mappings
    5.2 Joins
6 Model validation
    6.1 Errors
    6.2 Warnings
7 Aggregations and Aggregation Browsing
8 Creating Cubes
    8.1 Relational Database (SQL)
    8.2 Mongo Backend
    8.3 API
9 Localization
    9.1 Metadata Localization
    9.2 Data Localization
    9.3 Localized Reporting
10 OLAP Web Service
    10.1 API
    10.2 Reports
    10.3 Running and Deployment
11 slicer - Command Line Tool
    11.1 serve
    11.2 model validate
    11.3 model json
    11.4 model extract_locale
    11.5 model translate
12 Cubes API
    12.1 OLAP Cubes
    12.2 Aggregation Browsers
    12.3 Utility functions
13 Development Notes
    13.1 Fact Table
14 Contact and Getting Help
15 Indices and tables
Python Module Index
Index
Cubes is a framework for Online Analytical Processing (OLAP), multidimensional analysis and aggregated cube computation. It is part of Data Brewery. Contents:
CHAPTER
ONE
INTRODUCTION
Focus on data analysis, not on physical data structure.

Cubes is a framework for:
- Online Analytical Processing (OLAP), mostly relational database based (ROLAP)
- multidimensional analysis
- star and snowflake schema denormalisation
- cube computation (see Creating Cubes)

Features:
- Logical Model - description of how data are being analysed and reported, independent of physical data implementation
- hierarchical dimensions (attributes that have hierarchical dependencies, such as category-subcategory or country-region)
- localizable metadata and data (see Localization)

The framework has a modular nature and supports multiple database backends, different ways of cube computation and different ways of browsing aggregated data:
- relational databases with SQL through SQLAlchemy
- document-based database in MongoDB
CHAPTER
TWO
INSTALLATION
Optional requirements:
- SQLAlchemy for SQL backend
- Werkzeug for Slicer server

To install cubes, you can use easy_install (from setuptools):
easy_install cubes
or pip:
pip install cubes
Main project repository at Github: https://github.com/Stiivi/cubes Bitbucket copy for mercurial users: https://bitbucket.org/Stiivi/cubes (might be lagging a little bit behind github).
2.1 From sources

Download from Github:
git clone git://github.com/Stiivi/cubes.git
Install:
cd cubes
python setup.py install
CHAPTER
THREE
MODEL LOGICAL MODEL

The logical model describes the data from the user's or analyst's perspective: how the data are being measured, aggregated and reported. The model is independent of the physical implementation of the data. This physical independence makes it easier to focus on the data instead of on ways of getting the data into an understandable form.

In short, the logical model enables users to:
- refer to dimension attributes by name regardless of storage (which table)
- specify hierarchical dependencies of attributes, such as:
    product category > product > subcategory > product
    country > region > county > town
- specify attribute labels to be displayed in the end-user application
- use the same attribute name for all localizations, therefore write only one query for all report translations

Analysts or report writers do not have to know where the name of an organisation or category is stored, nor do they have to care whether customer data is stored in a single table or spread across multiple tables (customer, customer types, ...). They just ask for customer.name or category.code.

In addition to abstraction over the physical model, localization abstraction is included. When working in a multi-lingual environment, only one version of a report/query has to be written and locales can be switched as desired. If requesting a contract type name, the analyst just writes contract_type.name and the Cubes framework takes care of the appropriate localisation of the value.

Example: An analyst wants to report contract amounts by geography, which has two levels: country level and region level. In the original physical database, the geography information is normalised and stored in two separate tables, one for countries and another for regions. The analyst does not have to know where the data are stored; he just queries for geography.country and/or geography.region and will get the proper data. How it is done is depicted in the following image:

The logical model describes dimension geography, in which the default hierarchy has two levels: country and region. Each level can have more attributes, such as code, name, population... In our example report we are interested only in geographical names, that is: country.name and region.name.

The Cubes framework has to know where those logical (reported) attributes are physically stored. This is done in two ways: default mapping and explicit mapping. Default mapping is discussed in another section; in short, in most cases for structures normalized by dimension, the attributes are looked up in the table with the same name as the dimension and in the column with the same name as the attribute. The other way to map attributes to the physical implementation is by explicitly mentioning the physical table and column name (in a relational database).

With the logical model, the Cubes framework knows where to find the data, therefore analysts can focus on reporting and keep their way of looking at data.
Figure 3.1: Mapping from logical model to physical data.
CHAPTER
FOUR
LOGICAL MODEL DESCRIPTION

The logical model can be either constructed programmatically or provided as JSON. The model entities and their structure are depicted in the following figure:
4.1 Load a model

4.2 Model components

class cubes.model.Model(name=None, label=None, description=None, cubes=None, dimensions=None, locale=None, **kwargs)
    Logical Model represents the analyst's point of view on data. The model dictionary contains the main model description. The structure is:

    {
        "name": "public_procurements",
        "label": "Public Procurements of Slovakia",
        "description": "Contracts of public procurement winners in Slovakia",
        "cubes": {...},
        "dimensions": {...}
    }

    Attributes:
    - name - model name
    - label - human readable name - can be used in an application
    - description - longer human-readable description of the model
    - cubes - dictionary of cube descriptions (see below)
    - dimensions - dictionary of dimension descriptions (see below)
    - locale - locale code of the model

    When initializing the Model object, cubes and dimensions might be dictionaries with descriptions. See Cube and Dimension for more information.

    add_cube(cube)
        Adds cube to the model and also assigns the model to the cube. If the cube has a model assigned and it is not this model, then an error is raised.
Figure 4.1: The logical model entities and relationships.
        The cube's dimensions are collected to the model. If the cube has a dimension with the same name as one of the existing model's dimensions, but with a different structure, an exception is raised. Dimensions in a cube should be the same as in the model.

    add_dimension(dimension)
        Add dimension to model. Replace dimension with same name.

    cube(cube)
        Get a cube by name, or coalesce the given object to a cube.

    dimension(obj)
        Get dimension by name or by object.

    is_valid(strict=False)
        Check whether model is valid. Model is considered valid if there are no validation errors. If you want to be sure that there are no warnings as well, set strict to True.

        Args:
            strict: If False only errors are considered fatal, if True also warnings will make model invalid.

        Returns: boolean flag whether model is valid or not.

    localizable_dictionary()
        Get model locale dictionary - localizable parts of the model.

    localize(translation)
        Return localized version of model.

    remove_cube(cube)
        Removes cube from the model.

    remove_dimension(dimension)
        Remove a dimension from receiver.

    to_dict(**options)
        Return dictionary representation of the model. All object references within the dictionary are name based.

        Options:
        - expand_dimensions - if set to True then fully expand dimension information in cubes
        - full_attribute_names - if set to True then attribute names will be written as dimension_name.attribute_name

    validate()
        Validate the model, check for model consistency. Validation result is an array of tuples in form: (validation_result, message) where validation_result can be warning or error.

        Returns: array of tuples

class cubes.model.Cube(name=None, model=None, label=None, measures=None, details=None, dimensions=None, mappings=None, joins=None, fact=None, key=None, description=None, **kwargs)
    Create a new cube.

    Args:
        name (str): cube name
        desc (dict): dict object containing keys label, description, dimensions, ...

    add_dimension(dimension)
        Add dimension to cube. Replace dimension with same name.

    dimension(obj)
        Get dimension object. If obj is a string, then the dimension with the given name is returned, otherwise the dimension object is returned if it belongs to the cube.

    remove_dimension(dimension)
        Remove a dimension from receiver. dimension can be either dimension name or dimension object.

    to_dict(expand_dimensions=False, with_mappings=True, **options)
        Convert to dictionary.

        Options:
        - expand_dimensions - if set to True then fully expand dimension information

    validate()
        Validate cube. See Model.validate() for more information.
4.3 Dimensions
Dimension descriptions are stored in the model dictionary under the key dimensions. The dimension description contains keys:

Key          Description
name         dimension name
label        human readable name - can be used in an application
levels       dictionary of hierarchy levels
attributes   dictionary of dimension attributes
hierarchies  dictionary of dimension hierarchies
hierarchy    if dimension has only one hierarchy, you can specify it here

Example:

{
    "name": "date",
    "label": "Dátum",
    "levels": { ... },
    "attributes": [ ... ],
    "hierarchies": { ... }
}
Use either hierarchies or hierarchy; using both results in an error.

Hierarchy levels are described by:

Key              Description
label            human readable name - can be used in an application
key              key field of the level (customer number for customer level, region code for region level, year-month for month level). The key will be used as a grouping field for aggregations. The key should be unique within the level.
label_attribute  name of attribute containing label to be displayed (customer name for customer level, region name for region level, month name for month level)
attributes       list of other additional attributes that are related to the level. The attributes are not used for aggregations, they provide additional useful information.

Example of month level of date dimension:

"month": {
    "label": "Mesiac",
    "key": "month",
    "label_attribute": "month_name",
    "attributes": ["month", "month_name", "month_sname"]
},

Figure 4.2: Dimension description - attributes.
Example of supplier level of supplier dimension:

"supplier": {
    "label": "Dodávateľ",
    "key": "ico",
    "label_attribute": "name",
    "attributes": ["ico", "name", "address", "date_start", "date_end",
                   "legal_form", "ownership"]
}
Hierarchies are described by:

Key     Description
label   human readable name - can be used in an application
levels  ordered list of level names from top to bottom - from least detailed to most detailed (for example: from year to day, from country to city)
Example:

"hierarchies": {
    "default": {
        "levels": ["year", "month"]
    },
    "ymd": {
        "levels": ["year", "month", "day"]
    },
    "yqmd": {
        "levels": ["year", "quarter", "month", "day"]
    }
}
4.4 Attributes
Measures and dimension level attributes can be specified either as rich metadata or simply as strings. If only a string is specified, then all attribute metadata will have default values and the label will be equal to the attribute name.

Key      Description
name     attribute name, used in reports
label    human readable name - can be used in an application, localizable
order    natural order of the attribute (optional), can be asc or desc
locales  list of locales in which the attribute values are available (optional)
The optional order is used in aggregation browsing and reporting. If specified, then all queries will have results sorted by this field in the specified direction. Level hierarchy is used to order ordered attributes. Only one ordered attribute should be specified per dimension level, otherwise the behaviour is unpredictable. This natural (or default) order can be later overridden in reports by explicitly specifying another ordering direction or attribute. Explicit order takes precedence over natural order.

For example, you might want to specify that all dates should be ordered by default:

"attributes": [
    {"name": "year", "order": "asc"}
]
Locales is a list of locale names. Say we have a CPV dimension (common procurement vocabulary - EU procurement subject hierarchy) and we are reporting in Slovak, English and Hungarian. The attributes will therefore be specified as:

"attributes": [
    {"name": "group_code"},
    {"name": "group_name", "order": "asc", "locales": ["sk", "en", "hu"]}
]
The group name is localized, but the group code is not. You can also see that the result will always be sorted by group name alphabetically in ascending order. See Attribute Mappings for more information about how logical attributes are mapped to the physical sources.

In reports you do not specify a locale for each localized attribute; you specify the locale for the whole report or browsing session. Report queries remain the same for all languages.
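The "string or rich metadata" rule for attributes can be sketched as a small normalization step. This is only an illustration on plain dictionaries, not the Cubes implementation: the helper name expand_attribute is hypothetical, and the defaults chosen for order (None) and locales (empty list) are assumptions, since the text only states that the label defaults to the attribute name.

```python
def expand_attribute(spec):
    """Expand a string or dictionary attribute description into full metadata.

    Hypothetical helper for illustration; not part of the Cubes API.
    """
    if isinstance(spec, str):
        spec = {"name": spec}
    name = spec["name"]
    return {
        "name": name,
        "label": spec.get("label", name),    # label defaults to the name
        "order": spec.get("order"),          # assumed default: no natural order
        "locales": spec.get("locales", []),  # assumed default: not localized
    }


attributes = [
    "group_code",
    {"name": "group_name", "order": "asc", "locales": ["sk", "en", "hu"]},
]
expanded = [expand_attribute(attribute) for attribute in attributes]

for attribute in expanded:
    print(attribute["name"], attribute["label"], attribute["order"], attribute["locales"])
```

After this step every attribute has the same shape, so later code does not need to distinguish between the two notations.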
CHAPTER
FIVE
PHYSICAL MAPPING
In addition to the logical model definition, the model description might contain a physical mapping. The mapping is optional and can be used when the backend defaults are not sufficient. It serves mostly for better customisation of the logical to physical mapping.

Key       Description
fact      name of a fact table (or collection or dataset, depending on backend)
mappings  dictionary of mapping of logical attribute to physical attribute
joins     list of join specifications
5.1 Attribute Mappings

Mappings is a dictionary with logical attributes as keys and physical attributes (columns, fields) as values. The logical attributes are referenced as dimension_name.attribute_name, for example: geography.country_name or category.code. The physical attributes are backend-specific, for example in a relational database (SQL) it can be table_name.column_name.

The default mapping is the identity mapping - the physical attribute is the same as the logical attribute. For example, if you have dimension category with attribute code, then Cubes looks in the table named category and the column code.

Localizable attributes are those attributes that have locales specified in their definition. To map logical attributes which are localizable, use a locale suffix for each locale. For example, if attribute name in dimension category has two locales, Slovak (sk) and English (en), the mapping for such an attribute will look like:

...
"category.name.sk": "dm_categories.name_sk",
"category.name.en": "dm_categories.name_en",
...
Note: The current implementation of the Cubes framework requires a star or snowflake schema that can be joined into a fully denormalized form. Therefore all localized attributes have to be stored in their own columns. You have to denormalize the data before using them in Cubes.
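The lookup rules of this section (explicit mapping, locale suffix, identity mapping as the default) can be sketched as follows. The function physical_reference is a hypothetical helper written for illustration, not a Cubes function, and the fallback from a missing localized key to the unlocalized key is an assumption.

```python
def physical_reference(mappings, dimension, attribute, locale=None):
    """Return the physical attribute for a logical dimension.attribute.

    Hypothetical helper for illustration; not part of the Cubes API.
    """
    logical = "%s.%s" % (dimension, attribute)
    if locale:
        localized = "%s.%s" % (logical, locale)
        if localized in mappings:
            return mappings[localized]
    # Explicit mapping if present, otherwise identity mapping:
    # table named after the dimension, column named after the attribute.
    return mappings.get(logical, logical)


mappings = {
    "category.name.sk": "dm_categories.name_sk",
    "category.name.en": "dm_categories.name_en",
}

print(physical_reference(mappings, "category", "name", locale="sk"))
print(physical_reference(mappings, "category", "name", locale="en"))
print(physical_reference(mappings, "category", "code"))
```

The last call has no explicit mapping, so it falls through to the identity mapping and yields category.code.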
5.2 Joins
If you are using a star or snowflake schema in a relational database, Cubes requires information on how to join the tables into the star/snowflake. Tables are joined by matching single-column keys.

Say we have a fact table named fact_contracts and a dimension table with categories named dm_categories. To join them we define the following join specification:

"joins": [
    {
        "master": "fact_contracts.category_id",
        "detail": "dm_categories.id"
    }
]
There might be situations when you need to join one detail table more than once. An example of such a situation is a dimension with a list of organisations where the fact table has two organisational references, such as receiver and donor. In this case you specify an alias for the detail table:

"joins": [
    {
        "master": "fact_contracts.receiver_id",
        "detail": "dm_organisation.id",
        "alias": "dm_receiver"
    },
    {
        "master": "fact_contracts.donor_id",
        "detail": "dm_organisation.id",
        "alias": "dm_donor"
    }
]
Note that the order of joins matters: if you have a snowflake and would like to join a deeper detail, then all required tables have to be joined (and properly aliased, if necessary) already. In mappings you refer to table aliases if you joined with an alias.
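To make the role of master, detail and alias concrete, here is a minimal sketch that turns a join specification into SQL JOIN clauses. It is an illustration only: the function join_clauses is hypothetical and the generated SQL is an assumption about what such a join could look like, not the statement produced by the Cubes SQL backend.

```python
def join_clauses(fact_table, joins):
    """Build FROM/JOIN clauses from a list of join specifications.

    Hypothetical helper for illustration; joins are applied in list order.
    """
    lines = ["FROM %s" % fact_table]
    for join in joins:
        detail_table, detail_column = join["detail"].split(".")
        # Without an alias the detail table is referenced by its own name.
        alias = join.get("alias", detail_table)
        lines.append("JOIN %s AS %s ON %s = %s.%s"
                     % (detail_table, alias, join["master"], alias, detail_column))
    return "\n".join(lines)


joins = [
    {"master": "fact_contracts.receiver_id",
     "detail": "dm_organisation.id",
     "alias": "dm_receiver"},
    {"master": "fact_contracts.donor_id",
     "detail": "dm_organisation.id",
     "alias": "dm_donor"},
]

sql = join_clauses("fact_contracts", joins)
print(sql)
```

Because the same detail table appears twice, the aliases are what keep the two joins apart; a mapping would then refer to dm_receiver or dm_donor rather than to dm_organisation.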
CHAPTER
SIX
MODEL VALIDATION
To validate a model do:
results = model.validate()
This will return a list of tuples (result, message) where result might be warning or error. If the validation result contains errors, the model cannot be used without resulting in failure. If there are warnings, some functionality might fail or might not work as expected.

You can validate a model from the command line:
slicer model validate /path/to/model
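The list of (result, message) tuples can be split into errors and warnings before deciding whether to continue. The sketch below works on a hand-written sample list so that it runs without a model; the messages are examples taken from the tables in this chapter, and the summarize helper is hypothetical. Its strict flag follows the behaviour described for Model.is_valid().

```python
# Sample data standing in for the return value of model.validate()
results = [
    ("warning", "No fact specified for cube contracts"),
    ("error", "No levels in dimension date"),
]


def summarize(results, strict=False):
    """Split validation results and decide validity.

    Hypothetical helper: with strict=True warnings also make the model invalid.
    """
    errors = [message for result, message in results if result == "error"]
    warnings = [message for result, message in results if result == "warning"]
    valid = not errors and not (strict and warnings)
    return errors, warnings, valid


errors, warnings, valid = summarize(results)
for message in errors:
    print("ERROR: %s" % message)
for message in warnings:
    print("WARNING: %s" % message)
print("model is valid: %s" % valid)
```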
6.1 Errors
Error: No mappings for cube a cube
    Resolution: Provide mappings dictionary for cube.

Error: No mapping for measure a measure in cube a cube
    Resolution: Add mapping for a measure into mappings dictionary.

Error: No levels in dimension a dimension
    Resolution: Define at least one dimension level.

Error: No hierarchies in dimension a dimension
    Resolution: Define at least one hierarchy.

Error: No default hierarchy specified, there is more than one hierarchy in dimension a dimension
    Resolution: Specify a default hierarchy name or name one hierarchy as default.

Error: Level a level in dimension a dimension has no attributes
    Resolution: Provide level attributes. At least one - the level key.

Error: Key a key in level a level in dimension a dimension is not in attribute list
    Resolution: Add key attribute into attribute list or check the key name.

Error: Dimension a dimension is not a subclass of Dimension class
    Resolution: This might happen when the model was constructed programmatically. Check your model construction code.
6.2 Warnings
Warning: No fact specified for cube a cube (factless cubes are not yet supported, using fact as default dataset/table name)
    Resolution: Specify a fact table/dataset, otherwise a table with name fact will be used. View builder will fail if such a table does not exist.

Warning: No mapping for dimension a dimension attribute an attribute in cube a cube (using default mapping)
    Resolution: Provide mapping for dimension, otherwise identity mapping will be used (dimension.attribute).

Warning: No default hierarchy name specified in dimension a dimension, using some autodetect default name
    Resolution: Provide default_hierarchy_name. If there is only one hierarchy for the dimension, that one will be used. If there are more hierarchies, the one with name default will be used.

Warning: Default hierarchy a hierarchy does not exist in dimension a dimension
    Resolution: Check that default_hierarchy refers to an existing hierarchy within that dimension.

Warning: Level a level in dimension a dimension has no key attribute specified, first attribute will be used: first attribute name
    Resolution: Specify key attribute in the dimension level.

Warning: No cubes defined
    Resolution: Define at least one cube.
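As an illustration of how rules like the ones above produce (result, message) tuples, the following sketch checks a single level description given as a plain dictionary. It is not the Cubes validation code: the function validate_level is hypothetical, it covers only three of the listed rules, and it assumes that level attributes are given as strings.

```python
def validate_level(dimension_name, level_name, level):
    """Return a list of (result, message) tuples for one level description.

    Hypothetical helper for illustration; attributes are assumed to be strings.
    """
    results = []
    attributes = level.get("attributes", [])
    key = level.get("key")
    if not attributes:
        results.append(("error", "Level %s in dimension %s has no attributes"
                        % (level_name, dimension_name)))
    elif key is None:
        results.append(("warning", "Level %s in dimension %s has no key attribute "
                        "specified, first attribute will be used: %s"
                        % (level_name, dimension_name, attributes[0])))
    elif key not in attributes:
        results.append(("error", "Key %s in level %s in dimension %s is not in "
                        "attribute list" % (key, level_name, dimension_name)))
    return results


good = {"key": "month", "attributes": ["month", "month_name", "month_sname"]}
no_key = {"attributes": ["month", "month_name"]}
bad_key = {"key": "ico", "attributes": ["name", "address"]}

for level in (good, no_key, bad_key):
    print(validate_level("example", "level", level))
```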
CHAPTER
SEVEN
AGGREGATIONS AND AGGREGATION BROWSING

Warning: This information is obsolete. Cell is no longer used for browsing; you use only Browser and pass a cell as the first argument to get aggregated or other results. The Mongo backend is no longer maintained; only cubes.backends.sql.SQLBrowser is available.

This is a MongoDB example, other systems coming soon.

First you have to prepare the logical model and the cube. In a relational database:
import cubes

# connection is SQLAlchemy database connection

# Create aggregation browser
browser = cubes.backends.SQLBrowser(cube, connection, "mft_contracts")
To browse localized data, just pass locale to the browser and all results will contain localized values for localizable attributes:
browser = cubes.backends.SQLBrowser(cube, connection, "mft_contracts", locale = "sk")
To browse pre-aggregated mongo data:

import cubes
import pymongo

# Create MongoDB database connection
connection = pymongo.Connection()
database = connection["wdmmg_dev"]

# Load model and get cube
model_path = "wdmmg_model.json"
model = cubes.model_from_path(model_path)
cube = model.cubes["wdmmg"]
Prepare aggregation browser:

browser = cubes.browse.MongoSimpleCubeBrowser(cube=cube,
                                              collection="cube",
                                              database=database)

# Get the whole cube
full_cube = browser.full_cube()
Following aggregation code is backend-independent. Aggregate all data for year 2009:
cuboid = full_cube.slice("date", [2009])
results = cuboid.aggregate()
Results will contain one aggregated record. Drill down through a dimension:
results_cofog = cuboid.aggregate(drill_down="cofog")
results_date = cuboid.aggregate(drill_down="date")
results_cofog will contain all aggregations for cofog dimension at level 1 within year 2009. results_date will contain all aggregations for month within year 2009. Drilling-down and aggregating through single dimension. Following function will print aggregations at each level of given dimension.
def expand_drill_down(dimension_name, path=None):
    # Start at the top of the hierarchy when no path is given
    path = path or []
    dimension = cube.dimension(dimension_name)
    hierarchy = dimension.default_hierarchy

    # We are at last level, nothing to drill-down
    if hierarchy.path_is_base(path):
        return

    # Construct cuboid of our interest
    full_cube = browser.full_cube()
    cuboid = full_cube.slice("date", [2009])
    cuboid = cuboid.slice(dimension_name, path)

    # Perform aggregation
    cells = cuboid.aggregate(drill_down=dimension_name)

    # Print results, indented by the depth of the current path
    prefix = " " * len(path)
    for cell in cells:
        cell_path = cell["_cell"][dimension_name]
        current = cell_path[-1]
        print("%s%s: %.1f %d" % (prefix, current,
                                 cell["amount_sum"], cell["record_count"]))
        # Recurse into the next level below this cell
        expand_drill_down(dimension_name, cell_path)
The internal key _cell contains a dictionary with the aggregated cell reference in the form {dimension: path}, like {"date": [2010, 1]}.
Note: The output record from aggregations will change into an object instead of a dictionary in the future. The equivalent of the _cell key will be provided as an object attribute.

Assume we have two levels of date hierarchy: year, month. To get all time-based drill down:
expand_drill_down("date")
Possible output would be:
2008: 1200.0 60
 1: 100.0 10
 2: 200.0 5
 3: 50.0 1
 ...
2009: 2000.0 10
 1: 20.0 10
 ...
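What slicing, aggregating and drilling down compute can be shown on a handful of in-memory facts. The sketch below is backend-free and is not the Cubes browser: the aggregate function and the sample facts are hypothetical, and only the result keys amount_sum and record_count are borrowed from the example above. A path is a list of level keys, here [year, month].

```python
from collections import defaultdict

# Hypothetical facts: a "date" path [year, month] and a measure "amount"
facts = [
    {"date": [2008, 1], "amount": 100.0},
    {"date": [2008, 2], "amount": 200.0},
    {"date": [2009, 1], "amount": 20.0},
    {"date": [2009, 1], "amount": 30.0},
]


def aggregate(facts, dimension, path=(), drill_down=False):
    """Aggregate facts inside the slice `path` of one dimension.

    Without drill_down one cell is returned for the slice itself; with
    drill_down one cell is returned per value of the next level.
    """
    path = list(path)
    depth = len(path) + 1 if drill_down else len(path)
    cells = defaultdict(lambda: {"amount_sum": 0.0, "record_count": 0})
    for fact in facts:
        fact_path = fact[dimension]
        if fact_path[:len(path)] != path:
            continue  # fact lies outside of the slice
        cell = cells[tuple(fact_path[:depth])]
        cell["amount_sum"] += fact["amount"]
        cell["record_count"] += 1
    return dict(cells)


print(aggregate(facts, "date", [2009]))
print(aggregate(facts, "date", [2008], drill_down=True))
```

The first call corresponds to aggregating all data for one year, the second to drilling down from that year to its months.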
CHAPTER
EIGHT
CREATING CUBES
The Cubes framework provides functionality for denormalisation and for cube pre-computation. Currently the SQL backend supports denormalisation only and the Mongo backend supports cube pre-computation.
8.1 Relational Database (SQL)

The following code will create a denormalized view (implemented as a table) from a model and a star/snowflake relational schema:
import sqlalchemy
import cubes

model = cubes.model_from_path("/path/to/model")

# common.staging_dburl is the database URL, defined elsewhere in the application
engine = sqlalchemy.create_engine(common.staging_dburl)
connection = engine.connect()

cube = model.cube("contracts")
builder = cubes.backends.SQLDenormalizer(cube, connection)
builder.create_materialized_view("mft_contracts")
connection.close()
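To show what a denormalized view amounts to, here is a self-contained sketch using the sqlite3 module from the standard library instead of Cubes. The table names follow the examples in this documentation, but the columns, the sample rows and the naming of the output columns (category_code, category_name) are assumptions made for the illustration; the statement generated by SQLDenormalizer may differ.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.executescript("""
    CREATE TABLE dm_categories (id INTEGER PRIMARY KEY, code TEXT, name TEXT);
    CREATE TABLE fact_contracts (id INTEGER PRIMARY KEY,
                                 category_id INTEGER, amount REAL);

    INSERT INTO dm_categories VALUES (1, 'A', 'Goods');
    INSERT INTO dm_categories VALUES (2, 'B', 'Services');
    INSERT INTO fact_contracts VALUES (1, 1, 100.0);
    INSERT INTO fact_contracts VALUES (2, 2, 250.0);
    INSERT INTO fact_contracts VALUES (3, 1, 50.0);

    -- One wide table: fact columns plus the joined dimension attributes
    CREATE TABLE mft_contracts AS
    SELECT f.id AS id,
           f.amount AS amount,
           c.code AS category_code,
           c.name AS category_name
    FROM fact_contracts AS f
    JOIN dm_categories AS c ON f.category_id = c.id;
""")

# Aggregation now needs no joins at all
rows = connection.execute(
    "SELECT category_name, SUM(amount) FROM mft_contracts "
    "GROUP BY category_name ORDER BY category_name").fetchall()
print(rows)
connection.close()
```

Once the joins are paid for at build time, every aggregation query reads a single table.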
8.2 Mongo Backend

Warning: The Mongo backend is not up-to-date with the current model implementation. It might, but does not have to, work correctly.

Example of cube pre-computation for Where Does My Money Go in a MongoDB database. The source is a single database collection containing facts with multiple dimensions and a single measure, amount. There is one dimension that is required for all aggregations: date (not listed, as it is required by default). See the simplified wdmmg logical model example (wdmmg_model.json) for cube metadata (dimensions, levels, hierarchies, ...).
import cubes
import pymongo

# Create MongoDB database connection
connection = pymongo.Connection()
database = connection["wdmmg_dev"]

# Load model and get cube
model_path = "wdmmg_model.json"
model = cubes.model_from_path(model_path)
cube = model.cubes["wdmmg"]

# Create cube builder: facts are read from collection named "entry", aggregations
# are inserted into collection named "cube"
builder = cubes.builders.MongoSimpleCubeBuilder(cube, database,
                                                fact_collection="entry",
                                                cube_collection="cube")

# Compute the cube!
builder.compute()
8.3 API
See Also:
    Module cubes.backends - more information about cube builders in different database environments.
    Module cubes - logical model description, required for pre-aggregated cube computation.
CHAPTER
NINE
LOCALIZATION
Having its origin in multi-lingual Europe, one of the main features of the Cubes framework is the ability to provide localizable results. There are three levels of localization in each analytical application:
1. Application level - such as buttons or menus
2. Metadata level - such as table header labels
3. Data level - table contents, such as names of categories or procurement types
Figure 9.1: Localization levels.

The application level is out of scope of this framework and is covered by internationalization (i18n) libraries, such as gettext. What is covered in Cubes is the metadata and data level.

Localization in cubes is very simple:
1. Create the master model definition and specify the locale the model is in
2. Specify attributes that are localized (see Attribute Mappings)
3. Create model translations for each required language
4. Make a cubes function or a tool create translated versions of the master model

To create a localized report, just specify the locale to the browser and create reports as if the model was not localized. See Localized Reporting.
9.1 Metadata Localization

The metadata are used to display report labels or provide attribute descriptions. Localizable metadata are mostly label and description metadata attributes, such as a dimension label or an attribute description.

Say we have three locales: Slovak, English and Hungarian, with Slovak being the main language. The master model is described using the Slovak language and we have to provide two model translation specifications: one for English and another for Hungarian.

The model translation file has the same structure as the model definition file, but everything except localizable metadata attributes is ignored. That is, only label and description keys are considered in most cases. You cannot change the structure of the model in a translation file. If the structure does not match you will get a warning or an error, depending on the severity of the structure change.

There is one major difference between the master model file and model translations: all attribute lists, such as cube measures, cube details or dimension level attributes, are dictionaries, not arrays. Keys are attribute names, values are metadata translations. Therefore in the master model file you will have:
"attributes": [
    { "name": "name", "label": "Name" },
    { "name": "cat", "label": "Category" }
]
and in the translation file you will have:

"attributes": {
    "name": {"label": "Meno"},
    "cat": {"label": "Kategoria"}
}
If a translation of a metadata attribute is missing, then the one in the master model description is used. In our case we have the following files:
procurements.json
procurements_en.json
procurements_hu.json
Figure 9.2: Localization master model and translation files.

To load a model:
import cubes

model_sk = cubes.load_model("procurements.json",
                            translations={
                                "en": "procurements_en.json",
                                "hu": "procurements_hu.json",
                            })
To get translated version of a model:

model_en = model_sk.translate("en")
model_hu = model_sk.translate("hu")
Or you can get translated version of the model by directly passing translation dictionary:
import json

# Read the translation dictionary and close the file afterwards
with open("procurements_en.json") as handle:
    trans = json.load(handle)

model_en = model_sk.translate("en", trans)
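The relationship between the attribute list in the master model and the attribute dictionary in a translation, including the fallback to the master value, can be sketched on plain data structures. The function translate_attributes is a hypothetical helper and not the implementation behind translate(); it only considers the label and description keys, as described in this section.

```python
def translate_attributes(master, translation):
    """Apply a translation dictionary to a master attribute list.

    Hypothetical helper: missing translations fall back to the master value.
    """
    translated = []
    for attribute in master:
        merged = dict(attribute)
        overrides = translation.get(attribute["name"], {})
        for key in ("label", "description"):
            if key in overrides:
                merged[key] = overrides[key]
        translated.append(merged)
    return translated


master = [
    {"name": "name", "label": "Name"},
    {"name": "cat", "label": "Category"},
]
# Translation with one attribute deliberately left out
translation = {"name": {"label": "Meno"}}

for attribute in translate_attributes(master, translation):
    print(attribute["name"], "->", attribute["label"])
```

The attribute cat has no entry in the translation, so it keeps the label from the master model.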
9.2 Data Localization

If you have attributes that need to be localized, specify the locales (languages) in the attribute definition in Attribute Mappings.

Note: Data localization is implemented only for the Relational/SQL backend.
9.3 Localized Reporting

Main point of localized reporting is: Create query once, reuse for any language. Provide translated model and desired locale to the aggregation browser and you are set. The browser takes care of appropriate value selection. Aggregating, drilling, getting list of facts - all methods return localized data based on locale provided to the browser. If you want to get multiple languages at the same time, you have to create one browser for each language you are reporting.
CHAPTER
TEN
OLAP WEB SERVICE

The Cubes framework provides an easy-to-install WSGI web service with an API that covers most of the Cubes logical model metadata and aggregation browsing functionality. The server requires the werkzeug framework.
10.1 API
10.1.1 Model
GET /model
    Get model metadata as JSON.

GET /model/dimension/<name>
    Get dimension metadata as JSON.

GET /model/dimension/<name>/levels
    Get list of level metadata from the default hierarchy of the requested dimension.
10.1.2 Cube
Cube API calls have format: /cube/<cube_name>/<browser_action> where the browser action might be aggregate, facts, fact, dimension and report. GET /cube/<cube>/aggregate Return aggregation result as JSON. The result will contain keys: summary and drilldown. The summary contains one row and represents aggregation of whole cuboid specied in the cut. The drilldown contains rows for each value of drilled-down dimension. If no arguments are given, then whole cube is aggregated. Paramteres cut - specication of cuboid, for example: cut=date:2004,1|category=2|entity=12345 drilldown - dimension to be drilled down. For example drilldown=date will give rows for each value of next level of dimension date. You can explicitly specify level to drill down in form: dimension:level, such as: drilldown=date:month page - page number for paginated results pagesize - size of a page for paginated results order - list of attributes to be ordered by limit - limit number of results in form limit=5:received_amount_sum:asc limit[,measure[,order_direction]]:
    Reply:
        summary - dictionary of fields/values for the summary aggregation
        drilldown - list of drilled-down cells
        remainder - summary of remaining cells (not in drilldown), if limit is specified. Not implemented yet
        total_cell_count - total number of cells in the drilldown (after limit, before pagination)

    If pagination is used, then drilldown will not contain more than pagesize cells. Note that not all backends might implement total_cell_count, and providing this information can be configurable, so it might be disabled (for example for performance reasons).

GET /cube/<cube>/facts
    Return all facts (details) within the cuboid.

    Parameters:
        cut - see /aggregate
        page, pagesize - paginate results
        order - order results
        format - result format: json (default; see note below), csv
        fields - comma-separated list of fact fields, by default all fields are returned

    Note: The number of facts in JSON is limited to the configuration value of json_record_limit, which is 1000 by default. To get more records, either use pages with a size less than the record limit or use an alternate result format, such as csv.

GET /cube/<cube>/fact/<id>
    Get a single fact with the specified id. For example: /fact/1024

GET /cube/<cube>/dimension/<dimension>
    Get values for attributes of a dimension.

    Parameters:
        depth - specify depth (number of levels) to retrieve. If not specified, then all levels are returned
        cut - see /aggregate
        page, pagesize - paginate results
        order - order results

POST /cube/<cube>/report
    Process multiple requests within one API call. The POST data should be JSON containing a report specification where keys are names of queries and values are dictionaries describing the queries. report expects the Content-type header to be set to application/json. See Reports for more information.

GET /cube/<cube>/search/dimension/<dimension>/<query>
    Search values of dimensions for query. If dimension is _all then all dimensions are searched.
    Returns search results as a list of dictionaries with the following attributes:
        dimension - dimension name
        level - level name
        depth - level depth
        level_key - value of the key attribute for the level
        attribute - dimension attribute name where the searched value was found
        value - value of the dimension attribute that matches the search query
        path - dimension hierarchy path to the found value
        level_label - label for the dimension level (value of label_attribute for the level)

    Warning: Not yet fully implemented, just a proposal.

GET /cube/<cube>/drilldown/<dimension>/<path>
    Aggregate the next level of a dimension. This is similar to /aggregate with the drilldown=<dimension> parameter. It does not result in an error when the path has the largest possible length; it returns empty results and a result count of 0 instead. If <path> is specified, it replaces any path specified in the cut= parameter for the given dimension. If <path> is not specified, it is taken from cut, where it should be represented as a point (not a range or a set).

    In addition to the /aggregate result, the following is returned:
        is_leaf - flag determining whether the path refers to a leaf or not. For example, this flag can be used to decide whether to create links (path is not a leaf) or not (path is a leaf)
        dimension - name of the drilled dimension
        path - path passed to drilldown

    In addition to this, each returned cell contains additional attributes:
        _path - path to the cell, can be used for constructing further browsable links

    Note: Not yet implemented.

Parameters that can be used in any request:
    prettyprint - if set to true, formatting spaces are added to the JSON output
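The request URLs described in this section can be assembled programmatically. The following is a minimal sketch, not part of Cubes: the helper name aggregate_url and the host http://localhost:5000 are assumptions made for illustration, and only the /cube/<cube>/aggregate path and the cut and drilldown parameters come from the description above.

```python
from urllib.parse import urlencode


def aggregate_url(base_url, cube, cuts=None, drilldown=None, **params):
    """Build a GET URL for /cube/<cube>/aggregate (hypothetical helper)."""
    query = {}
    if cuts:
        # Each cut is "dimension:path"; path values are joined by ","
        # and individual cuts are joined by "|".
        query["cut"] = "|".join(
            "%s:%s" % (dimension, ",".join(str(value) for value in path))
            for dimension, path in cuts
        )
    if drilldown:
        query["drilldown"] = drilldown
    # Any other documented parameter: page, pagesize, order, prettyprint...
    query.update(params)

    url = "%s/cube/%s/aggregate" % (base_url.rstrip("/"), cube)
    if query:
        # Keep the cut separators readable instead of percent-encoding them.
        url += "?" + urlencode(query, safe=":,|")
    return url


url = aggregate_url("http://localhost:5000", "contracts",
                    cuts=[("date", [2004, 1])], drilldown="date:month")
print(url)
```

Keeping the cuts as a list of (dimension, path) pairs until the last moment avoids string manipulation in application code; the URL syntax is produced in one place only.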
10.1.3 Cuts in URLs

The cuboid, the part of the cube we are aggregating or are interested in, is specified by cuts. The cuts are given in the URL as a single parameter cut. Examples:
date:2004
date:2004,1
date:2004,1|class:5
date:2004,1,1|category:5,10,12|class:5
The dimension name is followed by a colon (:), dimension cuts are separated by a pipe (|), and the values of the path through dimension levels are separated by commas (,). More formally, here is the BNF for the cut:
<list>      ::= <cut> | <cut> "|" <list>
<cut>       ::= <dimension> ":" <path>
<dimension> ::= <identifier>
<path>      ::= <value> | <value> "," <path>
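The grammar is small enough to be parsed with plain string operations. The sketch below is only an illustration of the grammar and is not the parser used by the server; the function name parse_cuts is hypothetical.

```python
def parse_cuts(cut_string):
    """Parse "dim:v1,v2|dim2:v3" into a list of (dimension, path) tuples."""
    cuts = []
    for cut in cut_string.split("|"):          # <list> ::= <cut> "|" <list>
        dimension, colon, path = cut.partition(":")
        if not colon or not dimension or not path:
            raise ValueError("Invalid cut: %r" % cut)
        cuts.append((dimension, path.split(",")))  # <path> values
    return cuts


cuts = parse_cuts("date:2004,1,1|category:5,10,12|class:5")
for dimension, path in cuts:
    print(dimension, path)
```

Path values are kept as strings here; converting them to the types of the level keys would need the model and is left out.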
Why are dimension names not URL parameters? This prevents conflicts with other frequently used URL parameters that might modify page content or the API result, such as type, form, source.
The following image contains examples of cuts in URLs and how they change while browsing cube aggregates:
Figure 10.1: Example of how cuts in URL work and how they should be used in application view templates.
10.2 Reports
Report queries are done either by specifying a report name in the request URL or by using an HTTP POST request where the posted data is JSON with the report specification. If a report name is specified in a GET request instead, then the server should have a repository of named report specifications.

Keys:
    queries - dictionary of named queries

Query specification:
    query - query type: aggregate, details (list of facts), values for dimension values, facts or fact for multiple or single fact respectively

Note that you have to set the content type to application/json.

The result is a dictionary where keys are the query names specified in the report specification and values are the result values from each query call.

Example report.json:
{
    "summary": {
        "query": "aggregate"
    },
    "by_year": {
        "query": "aggregate",
        "drilldown": ["date"],
        "rollup": "date"
    }
}
Request:
curl -H "Content-Type: application/json" --data-binary "@report.json" \
    "http://localhost:5000/cube/contracts/report?prettyprint=true&cut=date:2004"
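The same request can also be prepared from Python with the standard library. This is a sketch under assumptions: the helper name report_request is hypothetical, the host and cube name are taken from the curl example, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def report_request(base_url, cube, report, cut=None):
    """Prepare a POST request for /cube/<cube>/report (hypothetical helper)."""
    url = "%s/cube/%s/report" % (base_url.rstrip("/"), cube)
    if cut:
        url += "?cut=" + cut
    body = json.dumps(report).encode("utf-8")
    # Supplying data makes urllib issue a POST; the server expects JSON.
    return Request(url, data=body,
                   headers={"Content-Type": "application/json"})


report = {
    "summary": {"query": "aggregate"},
    "by_year": {"query": "aggregate", "drilldown": ["date"], "rollup": "date"},
}
request = report_request("http://localhost:5000", "contracts", report,
                         cut="date:2004")
print(request.get_method(), request.full_url)
# To send it: urllib.request.urlopen(request).read()
```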
Reply:
{
    "by_year": {
        "total_cell_count": 6,
        "drilldown": [
            {
                "record_count": 4390,
                "requested_amount_sum": 2394804837.56,
                "received_amount_sum": 399136450.0,
                "date.year": "2004"
            },
            ...
            {
                "record_count": 265,
                "requested_amount_sum": 17963333.75,
                "received_amount_sum": 6901530.0,
                "date.year": "2010"
            }
        ],
        "remainder": {},
        "summary": {
            "record_count": 33038,
            "requested_amount_sum": 2412768171.31,
            "received_amount_sum": 2166280591.0
        }
    },
    "summary": {
        "total_cell_count": null,
        "drilldown": {},
        "remainder": {},
        "summary": {
            "date.year": "2004",
            "requested_amount_sum": 2394804837.56,
            "received_amount_sum": 399136450.0,
            "record_count": 4390
        }
    }
}
10.2.1 Roll-up
Report queries might contain a rollup specification which will result in rolling up one or more dimensions to the desired level. This functionality is provided for cases when you would like to report at a higher level of aggregation than the cell you provided is in. It works in a similar way to drill-down in the server's aggregate request, but in the opposite direction (it is like cd .. in a UNIX shell).

Example: You are reporting for year 2010, but you want to have a bar chart with all years. You specify rollup:
... "rollup": "date", ...
Roll-up can be:
    a string - single dimension to be rolled up one level
    an array - list of dimension names to be rolled up one level
    a dictionary - keys are dimension names and values are levels to be rolled up to
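The three accepted forms can be reduced to a single shape before processing. The following sketch only illustrates the forms listed above and is not taken from the Cubes sources; normalize_rollup is a hypothetical name, and None is used here to mean "one level up".

```python
def normalize_rollup(rollup):
    """Return {dimension: level}; a level of None means "one level up"."""
    if isinstance(rollup, str):
        # a string: single dimension rolled up one level
        return {rollup: None}
    if isinstance(rollup, dict):
        # a dictionary: dimension name -> level to roll up to
        return dict(rollup)
    # an array: several dimensions, each rolled up one level
    return {dimension: None for dimension in rollup}


print(normalize_rollup("date"))
print(normalize_rollup(["date", "region"]))
print(normalize_rollup({"date": "year"}))
```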
10.3 Running and Deployment

10.3.1 Local Server
To run your local server, prepare the server configuration file grants_config.json:
{
    "model": "grants_model.json",
    "cube": "grants",
    "view": "mft_grants",
    "connection": "postgres://localhost/mydata"
}
Run the server using the Slicer tool (see slicer - Command Line Tool):
slicer serve grants_config.json
10.3.2 Apache mod_wsgi deployment

Deploying the Cubes OLAP web service server (for the analytical API) can be done in four very simple steps:
    1. Create a server configuration file
    2. Create a WSGI script
    3. Prepare the Apache site configuration
    4. Reload the Apache configuration

Create the server configuration file procurements.ini:

[model]
path: /path/to/model.json

[db]
view_prefix: mft_
schema: datamarts
connection: postgres://localhost/transparency

[translations]
en: /path/to/model-en.json
hu: /path/to/model-hu.json
Place the file in the same directory as the following WSGI script (for convenience).

Create a WSGI script /var/www/wsgi/olap/procurements.wsgi:
import sys
import os.path
import ConfigParser

CURRENT_DIR = os.path.dirname(os.path.abspath(__file__))
CONFIG_PATH = os.path.join(CURRENT_DIR, "procurements.ini")

try:
    config = ConfigParser.SafeConfigParser()
    config.read(CONFIG_PATH)
except Exception as e:
    raise Exception("Unable to load configuration: %s" % e)

import cubes.server
application = cubes.server.Slicer(config)
Apache site configuration (for example in /etc/apache2/sites-enabled/):

<VirtualHost *:80>
    ServerName olap.democracyfarm.org

    WSGIScriptAlias /vvo /var/www/wsgi/olap/procurements.wsgi

    <Directory /var/www/wsgi/olap>
        WSGIProcessGroup olap
        WSGIApplicationGroup %{GLOBAL}
        Order deny,allow
        Allow from all
    </Directory>

    ErrorLog /var/log/apache2/olap.democracyfarm.org.error.log
    CustomLog /var/log/apache2/olap.democracyfarm.org.log combined
</VirtualHost>
Reload the Apache configuration:

sudo /etc/init.d/apache2 reload
And you are done.
10.3.3 Server requests

Example server request to get the aggregate for the year 2004:
$ curl http://localhost:5000/cube/procurements/aggregate?cut=date:2004
Reply:
{
    "drilldown": {},
    "remainder": {},
    "summary": {
        "date.year": "2004",
        "received_amount_sum": 399136450.0,
        "requested_amount_sum": 2394804837.56,
        "record_count": 4390
    }
}
10.3.4 Configuration
Server configuration is stored in .ini files with the following sections:

[server] - server related configuration, such as host, port
    host - host where the server runs, defaults to localhost
    port - port on which the server listens, defaults to 5000
    log - path to a log file
    log_level - level of log details, from least to most: error, warn, info, debug
    json_record_limit - number of rows to limit when generating JSON output with iterable objects, such as facts. Default is 1000. It is recommended to use an alternate response format, such as CSV, to get more records.

[model] - model and cube configuration
    path - path to the model .json file
    locales - comma-separated list of locales the model is provided in. Currently this variable is optional and it is used only by the experimental sphinx search backend.

[db] - relational database configuration
    url - database URL in the form: adapter://user:password@host:port/database
    schema - schema containing denormalized views for relational DB cubes
    view_prefix, view_suffix - prefix and suffix for the view or table containing cube facts; the name is constructed by concatenating prefix + cube name + suffix

[translations] - model translation files; option keys in this section are locale names and values are paths to model translation files. See Localization for more information.

Example configuration file:
[server]
host: localhost
port: 5001
reload: yes
log: /var/log/cubes.log
log_level: info

[db]
url: postgresql://localhost/data
view: contracts
schema: cubes

[model]
path: ~/models/contracts_model.json
cube: contracts
locales: en,sk

[translations]
sk: ~/models/contracts_model-sk.json
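As an illustration of how such a file can be read, here is a small sketch using Python 3's configparser (the WSGI script earlier in this chapter uses the Python 2 ConfigParser module). It is not the server's own loading code: the view_name helper and the view_prefix value in the sample text are assumptions made for the example, while the defaults and the prefix + cube name + suffix rule come from the option descriptions above.

```python
import configparser

CONFIG_TEXT = """
[server]
host: localhost
port: 5001
log_level: info

[db]
url: postgresql://localhost/data
schema: cubes
view_prefix: mft_

[model]
path: ~/models/contracts_model.json
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# Fall back to the documented defaults when an option is missing.
host = config.get("server", "host", fallback="localhost")
port = config.getint("server", "port", fallback=5000)
record_limit = config.getint("server", "json_record_limit", fallback=1000)


def view_name(config, cube_name):
    """Construct the fact view name as prefix + cube name + suffix."""
    prefix = config.get("db", "view_prefix", fallback="")
    suffix = config.get("db", "view_suffix", fallback="")
    return prefix + cube_name + suffix


print(host, port, record_limit, view_name(config, "contracts"))
```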
CHAPTER
ELEVEN
SLICER - COMMAND LINE TOOL

Cubes comes with a command line tool that can:
    run OLAP server
    build and compute cubes
    validate and translate models

Usage:
slicer command [command_options]
or:
slicer command sub_command [sub_command_options]
Commands are:

Command           Description
serve             Start OLAP server
model validate    Validates logical model for OLAP cubes
model json        Create JSON representation of a model (can be used when model is a directory)
build             Build OLAP cube from source data using model
11.1 serve
Run the Cubes OLAP HTTP server.

Example server configuration file config.json:

{
    "port": 5000,
    "model": "contracts.json",
    "cube": "contracts",
    "view": "ft_contracts",
    "connection": "postgres://localhost/contracts"
}
Note: Currently the connection can only be an SQL database connection. Access to other existing backends from this tool will be added in the future.

To run a local server:
slicer serve config.json
For more information about the OLAP HTTP server, see OLAP Web Service.
11.2 model validate

Usage:
slicer model validate /path/to/model/directory
slicer model validate model.json
slicer model validate http://somesite.com/model.json
For more information see Model validation.

Example output:

loading model wdmmg_model.json
------------------------
cubes: 1
    wdmmg
dimensions: 5
    date
    pog
    region
    cofog
    from
------------------------
found 3 issues
validation results:
warning: No hierarchies in dimension date, flat level year will be used
warning: Level year in dimension date has no key attribute specified
warning: Level from in dimension from has no key attribute specified
0 errors, 3 warnings
11.3 model json

Produce a reusable JSON model for any given input model.
11.4 model extract_locale

Extract localizable parts of the model. Use this before you start translating the model to get translation template.
11.5 model translate

Translate the model using a translation file.
CHAPTER
TWELVE
CUBES API
Contents:
12.1 OLAP Cubes

12.1.1 Logical Model
cubes.load_model(resource, translations=None)
    Load logical model from object reference. resource can be a URL, a local file path or a file-like object.

    The path might be:
        JSON file with a dictionary describing the model
        URL with a JSON dictionary

class cubes.Model(name=None, label=None, description=None, cubes=None, dimensions=None, locale=None, **kwargs)
    Logical Model represents the analyst's point of view on data.

    The model dictionary contains the main model description. The structure is:
{
    "name": "public_procurements",
    "label": "Public Procurements of Slovakia",
    "description": "Contracts of public procurement winners in Slovakia",
    "cubes": {...},
    "dimensions": {...}
}
    Attributes:
        name - model name
        label - human readable name, can be used in an application
        description - longer human-readable description of the model
        cubes - dictionary of cube descriptions (see below)
        dimensions - dictionary of dimension descriptions (see below)
        locale - locale code of the model

    When initializing the Model object, cubes and dimensions might be dictionaries with descriptions. See Cube and Dimension for more information.
    add_cube(cube)
        Adds cube to the model and also assigns the model to the cube. If the cube has a model assigned and it is not this model, then an error is raised. The cube's dimensions are collected to the model. If the cube has a dimension with the same name as one of the existing model's dimensions, but with a different structure, an exception is raised. Dimensions in the cube should be the same as in the model.

    add_dimension(dimension)
        Add dimension to the model. Replaces a dimension with the same name.

    cube(cube)
        Get a cube by name or coalesce the object to a cube.

    dimension(obj)
        Get dimension by name or by object.

    is_valid(strict=False)
        Check whether the model is valid. The model is considered valid if there are no validation errors. If you want to be sure that there are no warnings as well, set strict to True.

        Args:
            strict: If False only errors are considered fatal, if True also warnings will make the model invalid.

        Returns: boolean flag whether the model is valid or not.

    localizable_dictionary()
        Get model locale dictionary: the localizable parts of the model.

    localize(translation)
        Return localized version of the model.

    remove_cube(cube)
        Removes cube from the model.

    remove_dimension(dimension)
        Remove a dimension from the receiver.

    to_dict(**options)
        Return dictionary representation of the model. All object references within the dictionary are name based.

        Options:
            expand_dimensions - if set to True then fully expand dimension information in cubes
            full_attribute_names - if set to True then attribute names will be written as dimension_name.attribute_name

    validate()
        Validate the model, check for model consistency. The validation result is an array of tuples in the form (validation_result, message), where validation_result can be warning or error.

        Returns: array of tuples

class cubes.Dimension(name=None, label=None, levels=None, attributes=None, hierarchy=None, description=None, **desc)
    Create a new dimension.

    default_hierarchy
        Get the default hierarchy specified by default_hierarchy_name; if the variable is not set then get a hierarchy with the name default.

    flat_hierarchy(level)
        Return the only hierarchy for the only level.
    has_details
        Returns True when each level has only one attribute, usually key.

    is_flat
        Return True if the dimension has only one level.

    level(obj)
        Get level by name.

    levels
        Get list of all dimension levels. Order is undefined.

    to_dict(**options)
        Return dictionary representation of the dimension.

    validate()
        Validate dimension. See Model.validate() for more information.

class cubes.Hierarchy(name=None, levels=None, label=None, dimension=None)
    Dimension hierarchy.

    Attributes:
        name: hierarchy name
        label: human readable name
        levels: ordered list of levels from dimension

    levels_for_path(path, drilldown=False)
        Returns levels for the given path. If the path is longer than the hierarchy levels, an exception is raised.

    next_level(level)
        Returns the next level in the hierarchy after level. If level is the last level, returns None.

    path_is_base(path)
        Returns True if path is a base path for the hierarchy. A base path is a path where there are no more levels to be added, so no drill down is possible.

    previous_level(level)
        Returns the previous level in the hierarchy before level. If level is the first level, returns None.

    rollup(path, level=None)
        Rolls up the path to the level. If level is None then the path is rolled up only one level. If level is deeper than the last level of path, an exception is raised. If level is the same as the path level, nothing happens.

    to_dict(**options)
        Convert to dictionary.

class cubes.Level(name=None, key=None, attributes=None, null_value=None, label=None, label_attribute=None, dimension=None)
    Hierarchy level.

    Attributes:
        name: level name
        label: human readable label
        key: key field of the level (customer number for customer level, region code for region level, year-month for month level). key will be used as a grouping field for aggregations. Key should be unique within level.
        label_attribute: name of attribute containing label to be displayed (customer_name for customer level, region_name for region level, month_name for month level)
        attributes: list of other additional attributes that are related to the level. The attributes are not used for aggregations; they provide additional useful information.

    to_dict(full_attribute_names=False, **options)
        Convert to dictionary.

class cubes.Attribute(name, label=None, locales=None, order=None, description=None, **kwargs)
    Create an attribute.

    Attributes:
        name - attribute name, used as identifier
        label - attribute label displayed to a user
        locales - list of locales that the attribute is localized to
        order - default order of this attribute. If not specified, then the order is undefined. Possible values are: asc/ascending or desc/descending. It is recommended and safe to use Attribute.ASC and Attribute.DESC

    full_name(dimension, locale=None)
        Return the full name of an attribute as if it was part of dimension. Append locale if it is one of the attribute's locales, otherwise raise an error. If no locale is specified and the attribute is localized, then the first locale from the list of locales is used.

cubes.attribute_list(attributes)
    Create a list of attributes from a list of strings or dictionaries.
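The path operations described for Hierarchy (next_level, path_is_base, rollup) can be illustrated on plain lists. The functions below are a sketch written for this documentation, not the Cubes implementation: levels is assumed to be an ordered list of level names and path a list of level keys.

```python
def next_level(levels, level):
    """Return the level following `level`, or None for the last level."""
    index = levels.index(level)
    return levels[index + 1] if index + 1 < len(levels) else None


def path_is_base(levels, path):
    """A base path has a key for every level: no drill down possible."""
    return len(path) == len(levels)


def rollup_path(levels, path, level=None):
    """Roll the path up one level, or up to the named level."""
    if level is None:
        return path[:-1]
    depth = levels.index(level) + 1
    if depth > len(path):
        raise ValueError("Level %r is deeper than the path %r" % (level, path))
    return path[:depth]


levels = ["year", "month", "day"]
print(rollup_path(levels, [2010, 10, 5]))          # one level up
print(rollup_path(levels, [2010, 10, 5], "year"))  # up to the year level
```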
12.1.2 Aggregate browsing

class cubes.Cell(cube=None, cuts=[])
    Part of a cube determined by slicing dimensions. Immutable object.

    cut_for_dimension(dimension)
        Return the first found cut for the given dimension.

    multi_slice(cuts)
        Create another cell by slicing through multiple slices. cuts can be a list or a dictionary. If it is a list, it should be a list of two-item tuples where the first item is a dimension and the second item is a dimension cut path. If cuts is a dictionary, then keys are dimensions and values are cut paths. See Cell.slice() for more information about slicing.

    rollup(rollup)
        Rolls up the cell: goes one or more levels up through the dimension hierarchy. It works in a similar way to drill down in AggregationBrowser.aggregate() but in the opposite direction (it is like cd .. in a UNIX shell).

        Roll-up can be:
            a string - single dimension to be rolled up one level
            an array - list of dimension names to be rolled up one level
            a dictionary - keys are dimension names and values are levels to be rolled up to

        Note: Only the default hierarchy is currently supported.

    slice(dimension, path)
        Create another cell by slicing the receiving cell through dimension at path. The receiving object is not modified. If a cut with the dimension exists, it is replaced with the new one. If path is an empty list or is None, then the cut for the given dimension is removed.

        Example:

            full_cube = Cell(cube)
            contracts_2010 = full_cube.slice("date", [2010])

        Returns: new derived cell object.

class cubes.PointCut(dimension, path)
    Object describing a way of slicing a cube (cell) through a point in a dimension.

class cubes.AggregationBrowser(cube)
    Class for browsing data cube aggregations.

    Attributes:
        cube - cube for browsing

    aggregate(cell, measures=None, drilldown=None, **options)
        Return aggregate of a cell. Subclasses of aggregation browser should implement this method.

        Attributes:
            drilldown - dimensions and levels through which to drill down, default None
            measures - list of measures to be aggregated. By default all measures are aggregated.

        Drill down can be specified in two ways: as a list of dimensions or as a dictionary. If it is specified as a list of dimensions, then the cell is going to be drilled down on the next level of each specified dimension. Say you have a cell for year 2010 and you want to drill down by months, then you specify drilldown = ["date"].

        If drilldown is a dictionary, then the key is a dimension or dimension name and the value is the last level to be drilled down by. If the cell is at year level and drill down is { "date": "day" }, then both month and day levels are added.

        If there are no more levels to be drilled down, an exception is raised. Say your model has three levels of the date dimension (year, month, day), the cell is already at the day level and you try to drill down by date, then ValueError will be raised.

        Returns an AggregationResult object.

    dimension_object(dimension)
        Helper function to return the proper dimension object as a subclass of Dimension.

        Warning: Deprecated. Use cubes.Cube.dimension()

        Arguments:
            dimension - a dimension object or a string; if it is a string, then the dimension object is retrieved from the cube

    fact(key)
        Returns a single fact from the cube specified by fact key key.

    facts(cell, **options)
        Return an iterable object with all facts within the cell.

    report(cell, report)
        Creates multiple outputs specified in the report.
        report is a dictionary with multiple aggregation browser queries. Keys are custom names of queries which the requestor can later use to retrieve the respective query result. Values are dictionaries specifying the arguments of a single query. Each query should contain at least one required value, query, which contains the name of the query function: aggregate, facts, fact or values. The rest of the values are function specific; please refer to the respective function documentation for more information.

        The result is a dictionary where keys are the query names specified in the report specification and values are the result values from each query call.

        This method provides a convenient way to perform multiple common queries at once. For example, you might always want to have on a page: total transaction count, total transaction amount, drill-down by year and drill-down by transaction type.

        Roll-up

        Report queries might contain a rollup specification which will result in rolling up one or more dimensions to the desired level. This functionality is provided for cases when you would like to report at a higher level of aggregation than the cell you provided is in. It works in a similar way to drill down in AggregationBrowser.aggregate() but in the opposite direction (it is like cd .. in a UNIX shell).

        Example: You are reporting for year 2010, but you want to have a bar chart with all years. You specify rollup:
... "rollup": "date", ...
        Roll-up can be:
            a string - single dimension to be rolled up one level
            an array - list of dimension names to be rolled up one level
            a dictionary - keys are dimension names and values are levels to be rolled up to

        Future

        In the future, optimisations might be added to this method so that it becomes faster than a series of separate requests. Also, when used with the Slicer OLAP service server, the overhead of multiple HTTP calls is reduced.

    values(cell, dimension, depth=None, paths=None, **options)
        Return values for dimension with level depth depth. If depth is None, all levels are returned.

        Note: Currently only the default hierarchy is used.
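The dispatching behaviour of report() can be pictured with a few lines of Python. This is a simplified sketch and not the Cubes implementation: StubBrowser is a stand-in that returns canned values, run_report is a hypothetical name, only the aggregate and facts query types are handled, and roll-up processing is left out.

```python
class StubBrowser:
    """Stand-in for an aggregation browser; returns canned values."""

    def aggregate(self, cell, drilldown=None, **options):
        return {"summary": {"record_count": 3}, "drilldown": drilldown}

    def facts(self, cell, **options):
        return [{"id": 1}, {"id": 2}, {"id": 3}]


def run_report(browser, cell, report):
    """Run each named query and collect results under the same names."""
    results = {}
    for name, spec in report.items():
        args = dict(spec)
        query = args.pop("query")  # the only required value
        if query not in ("aggregate", "facts"):
            raise ValueError("Unsupported query type in sketch: %s" % query)
        # The remaining values are passed as arguments of the query.
        results[name] = getattr(browser, query)(cell, **args)
    return results


report = {
    "total": {"query": "aggregate"},
    "by_year": {"query": "aggregate", "drilldown": ["date"]},
    "details": {"query": "facts"},
}
results = run_report(StubBrowser(), None, report)
print(sorted(results))
```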
12.2 Aggregation Browsers

Classes and methods for browsing aggregated data.

class cubes.backends.SQLDenormalizer(cube, connection=None, schema=None, dimension_table_prefix=None)
    Creates a simple SQL view builder.

    Parameters:
        cube - cube from the logical model
        connection - database connection, default None if you want only to create a SELECT statement
        dimension_table_prefix - default prefix for dimension tables, used if there is no mapping for a dimension attribute. Say you have dimension supplier and field name and dimension table prefix dm_, then the default physical mapping for that field would be: dm_supplier.name

    create_view(view_name, schema=None, index=False, materialize=True)
        Creates a view.

        Arguments:
            view_name - name of a view or a table to be created
            schema - target database schema
            index - create indexes on level key columns if True, default False
            materialize - create materialized view (currently as table) if True (default)

    denormalized_view()
        Returns an SQLAlchemy expression representing a select from the denormalized view.

    split_field(field)
        Split field into table and field name: the part before the first . is the table name, everything else is the field name. If there is no ., then the table name is None.

    table(table_name)
        Get a table with name table_name. If the table was not yet collected (while collecting joins) then raise an exception. If an alias is specified, then the table will be registered as known under that alias.

class cubes.backends.SQLBrowser(cube, connection=None, view_name=None, schema=None, view=None, locale=None)
    Create a browser.

    Attributes:
        cube - cube object to be browsed
        connection - SQLAlchemy database connection object
        view_name - name of denormalized view (might be VIEW or TABLE)
        view - SQLAlchemy view/table object
        locale - locale to be used for localized attributes

    To initialize the SQL browser you should provide either a connection, view_name and optionally schema, or a view.

    aggregate(cell, measures=None, drilldown=None, order=None, **options)
        See cubes.browsers.cell.aggregate().

    fact(key)
        Fetch a single row based on fact key.

    facts(cell, order=None, **options)
        Returns an iterable object with facts.

    values(cell, dimension, depth=None, order=None, **options)
        Get values for dimension at given path within cell.

class cubes.backends.MongoSimpleCubeBrowser(cube, collection, database=None, aggregate_flag_field='_is_aggregate')
    Create a browser.

    Attributes:
        cube - cube object to be browsed
        collection - MongoDB collection object or name of a collection
        database - MongoDB database. Has to be specified if collection is a name
        aggregate_flag_field - field to identify aggregated records. By default it is _is_aggregate

    The collection is generated by cubes.build.MongoSimpleCubeBuilder.

    aggregate(cell, measures=None, drill_down=None)
        See cubes.browsers.cell.aggregate().

    selector_object(cell, drill_dimension=None)
        Return a dictionary object for finding the specified cell. If drill_dimension is set, then a selector for all descendants of the cell through the drill dimension is returned.

class cubes.backends.MongoSimpleCubeBuilder(cube, database, fact_collection, cube_collection=None, measures=None, aggregate_flag_field='_is_aggregate', required_dimensions=['date'])
    Creates a simple cube builder in mongo. See MongoSimpleCubeBuilder.compute() for more information about the computation algorithm.

    Attributes:
        cube - description of a cube from the logical model
        fact_collection - either name or mongo collection containing facts (should correspond to the cube definition)
        cube_collection - name or mongo collection where computed cell aggregates will be stored. By default it is the same collection as the fact collection. Make sure to properly set aggregate_flag_field.
        measures - list of attributes that are going to be aggregated. By default it is [amount]
        aggregate_flag_field - name of the field (key) that distinguishes fact records from aggregated records. Should be used when the fact collection and the cube collection are the same. By default it is _is_aggregate.
        required_dimensions - dimensions that are required for all cells. By default: [date]

    compute()
        Compute a multidimensional cube. Computed aggregations for cells can be stored either in a separate collection or in the same source (fact) collection. The attribute aggregate_flag_field is used to distinguish between facts and aggregated cells.

        Algorithm:
            1. Compute all dimension combinations (for all levels if there are any hierarchies). Each combination is called a selector and is represented by a list of tuples: (dimension, levels). For more information see: cubes.util.compute_dimension_cell_selectors().
            2. Compute aggregations for each point within the dimension selector. Use the MongoDB group function (alternative to map-reduce).
            3. Each record for an aggregated cell is stored in the target collection (see above).

        This is a naive, non-optimized method of cube computation: no aggregations are reused for computation.

    compute_cell(selector)
        Compute aggregation for the cell specified by selector. The cell is computed using the MongoDB aggregate function. Computed records are inserted into cube_collection and they contain:
            key fields used for grouping
            aggregated measures suffixed with _sum, for example: amount_sum
            record count in record_count
            cell selector as _selector (configurable) with dimension names as keys and current dimension levels as values, for example: {date: [year, month]}
            cell reference as _cell (configurable) with dimension names as keys and level keys forming dimension paths as values, for example: {date: [2010, 10]}

        Arguments:
            selector - a list of tuples: (dimension, level_names)

        Note: Only sum aggregation is being computed. Other aggregations might be implemented in the future, such as average, min, max, rank, ...

class cubes.backends.SlicerBrowser(url, cube)
    Create a browser.

    Attributes:
        cube - name of a cube
        url - base URL of the Cubes Slicer OLAP server

    aggregate(cell, measures=None, drilldown=None)
        See cubes.browsers.Cell.aggregate().

    fact(key)
        Fetch a single row based on fact key.
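To make the shape of the computed records concrete, the following sketch performs the same kind of sum aggregation over an in-memory list instead of a MongoDB collection. It is an illustration only and not the builder's code: the function name compute_cell_records is hypothetical, facts are assumed to be nested dictionaries such as {"date": {"year": 2010, "month": 10}, "amount": 5}, and the grouping key fields are omitted from the output for brevity.

```python
from collections import defaultdict


def compute_cell_records(facts, selector, measure="amount"):
    """Group facts by the selector's level keys and sum one measure.

    selector is a list of (dimension, level_names) tuples.
    """
    groups = defaultdict(lambda: {"sum": 0, "count": 0})
    for fact in facts:
        # One path (tuple of level keys) per dimension in the selector.
        key = tuple(
            tuple(fact[dimension][level] for level in levels)
            for dimension, levels in selector
        )
        groups[key]["sum"] += fact[measure]
        groups[key]["count"] += 1

    records = []
    for key, aggregate in groups.items():
        records.append({
            measure + "_sum": aggregate["sum"],
            "record_count": aggregate["count"],
            "_selector": {dim: list(levels) for dim, levels in selector},
            "_cell": {dim: list(path)
                      for (dim, _levels), path in zip(selector, key)},
        })
    return records


facts = [
    {"date": {"year": 2010, "month": 10}, "amount": 5},
    {"date": {"year": 2010, "month": 10}, "amount": 7},
    {"date": {"year": 2010, "month": 11}, "amount": 1},
]
records = compute_cell_records(facts, [("date", ["year", "month"])])
for record in records:
    print(record)
```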
12.3 Utility functions

Utility functions for computing combinations of dimensions and hierarchy levels.

class cubes.util.IgnoringDictionary
    Simple dictionary extension that will ignore any keys whose values are empty (None/False).

    setnoempty(key, value)
        Set value in a dictionary if value is not null.

cubes.util.all_cuboids(dimensions, required=[])
    Create cuboids for all possible combinations of dimensions for each level in hierarchical order.

    Returns a list of dimension selectors. Each dimension selector is a list of tuples where the first element is a dimension and the second element is a list of levels. The order of selectors and also of dimensions within a selector is undefined.

    Example 1:

    If there are no hierarchies (dimensions are flat), then this method returns all combinations of all dimensions. If there are dimensions A, B, C with single levels a, b, c, respectively, the output will be:

        (A, (a))
        (B, (b))
        (C, (c))
        (A, (a)), (B, (b))
        (A, (a)), (C, (c))
        (B, (b)), (C, (c))
        (A, (a)), (B, (b)), (C, (c))
    Example 2:

    Take the dimensions from example 1 and add a requirement for dimension A (usually this might be date). Then the output will contain dimension A in each returned tuple. Tuples without dimension A will be omitted. Output:

        (A, (a))
        (A, (a)), (B, (b))
        (A, (a)), (C, (c))
        (A, (a)), (B, (b)), (C, (c))
    Example 3:

    If there are multiple hierarchies, then all levels are combined. Say we have D with d1, d2, B with b1, b2, and C with c. D (as date) is required. Output:

        (D, (d1))
        (D, (d1, d2))
        (D, (d1)), (B, (b1))
        (D, (d1, d2)), (B, (b1))
        (D, (d1)), (B, (b1, b2))
        (D, (d1, d2)), (B, (b1, b2))
        (D, (d1)), (B, (b1)), (C, (c))
        (D, (d1, d2)), (B, (b1)), (C, (c))
        (D, (d1)), (B, (b1, b2)), (C, (c))
        (D, (d1, d2)), (B, (b1, b2)), (C, (c))
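The combinations in Examples 1 and 2 can be reproduced with itertools. The function below is a sketch of the idea, not the code of cubes.util.all_cuboids: its name is hypothetical, and dimensions are assumed to be given as a dictionary mapping a dimension name to its ordered list of level names. For hierarchical dimensions it combines every prefix of every hierarchy, so for the data of Example 3 it also yields the combinations of D with C alone, which the listing above does not show.

```python
from itertools import combinations, product


def sketch_all_cuboids(dimensions, required=()):
    """Return selectors: lists of (dimension, levels) tuples."""
    names = list(dimensions)
    selectors = []
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            # Skip combinations that lack a required dimension.
            if not set(required) <= set(combo):
                continue
            # For each dimension, every prefix of its ordered levels.
            prefixes = [
                [(dim, tuple(dimensions[dim][:depth + 1]))
                 for depth in range(len(dimensions[dim]))]
                for dim in combo
            ]
            selectors.extend(list(item) for item in product(*prefixes))
    return selectors


flat = {"A": ["a"], "B": ["b"], "C": ["c"]}
for selector in sketch_all_cuboids(flat, required=["A"]):
    print(selector)
```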
cubes.util.combine_node_levels(nodes)
    Get all possible combinations between each level from each node. It is a cartesian product of the first node's levels and all combinations of the rest of the levels.

cubes.util.combine_nodes(all_nodes, required_nodes=[])
    Create all combinations of nodes; if required_nodes are specified, make them present in each combination.

cubes.util.expand_dictionary(record, separator='.')
    Return expanded dictionary: treat keys as paths separated by separator, create sub-dictionaries as necessary.

cubes.util.get_localizable_attributes(obj)
    Returns a dictionary with localizable attributes of obj.

cubes.util.localize_attributes(attribs, translations)
    Localize list of attributes. translations should be a dictionary with attribute names as keys; values are dictionaries with localizable attribute metadata, such as label or description.

cubes.util.localize_common(obj, trans)
    Localize common attributes: label and description.

cubes.util.node_level_points(node)
    Get all level points within the given node. A node is described as a tuple (object, levels), where levels is a list or a tuple.
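The key expansion described for expand_dictionary can be sketched as follows. This is an independent illustration of the described behaviour, not the library source; the name expand_keys is hypothetical.

```python
def expand_keys(record, separator="."):
    """Turn {"date.year": 2004} into {"date": {"year": 2004}}."""
    result = {}
    for key, value in record.items():
        *parents, leaf = key.split(separator)
        current = result
        for part in parents:
            # Create sub-dictionaries along the path as necessary.
            current = current.setdefault(part, {})
        current[leaf] = value
    return result


row = {"date.year": "2004", "date.month": "1", "record_count": 4390}
print(expand_keys(row))
```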
CHAPTER
THIRTEEN
DEVELOPMENT NOTES
This chapter contains notes related to Cubes development, such as:

* unresolved design decisions
* suggestions
* proposals for changes
* explanations of certain design decisions

I have included this document as part of the documentation to get more feedback and to help readers understand why certain things are currently done the way they are.
13.1 Fact Table

Currently all models are required to specify a fact table. The fact table could instead be discovered from the model and the model mapping, or from the database schema itself.
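One possible discovery heuristic, sketched here purely as an illustration of the idea and not as a planned implementation: treat the fact table as the only table that appears as a join master but never as a join detail. The sketch assumes joins in the form used in the Physical Mapping chapter ("master" and "detail" keys holding "table.column" strings); the function name guess_fact_table is hypothetical.

```python
def guess_fact_table(joins):
    """Hypothetical heuristic: the fact table is the single table that is
    a join master but never a join detail. Returns None if ambiguous."""
    masters = {join["master"].split(".")[0] for join in joins}
    details = {join["detail"].split(".")[0] for join in joins}
    candidates = masters - details
    return candidates.pop() if len(candidates) == 1 else None

joins = [
    {"master": "fact_contracts.category_id", "detail": "dm_categories.id"},
    # hypothetical second join, added only for illustration
    {"master": "fact_contracts.supplier_id", "detail": "dm_suppliers.id"},
]
fact_table = guess_fact_table(joins)
```

A model without joins, or one with several unconnected master tables, gives no unique answer, so an explicit fact table setting would still be needed as a fallback.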
CHAPTER
FOURTEEN
CONTACT AND GETTING HELP

If you have questions, problems or suggestions, you can send a message to the Google group or write to me (Stefan Urbanek, the author). Report bugs in the GitHub issue tracker.
CHAPTER
FIFTEEN
INDICES AND TABLES

genindex modindex search
PYTHON MODULE INDEX
c
cubes
cubes.backends
cubes.util

m
model
INDEX
A
add_cube() (cubes.Model method)
add_cube() (cubes.model.Model method)
add_dimension() (cubes.Model method)
add_dimension() (cubes.model.Cube method)
add_dimension() (cubes.model.Model method)
aggregate() (cubes.AggregationBrowser method)
aggregate() (cubes.backends.MongoSimpleCubeBrowser method)
aggregate() (cubes.backends.SlicerBrowser method)
aggregate() (cubes.backends.SQLBrowser method)
AggregationBrowser (class in cubes)
all_cuboids() (in module cubes.util)
Attribute (class in cubes)
attribute_list() (in module cubes)

C
Cell (class in cubes)
combine_node_levels() (in module cubes.util)
combine_nodes() (in module cubes.util)
compute() (cubes.backends.MongoSimpleCubeBuilder method)
compute_cell() (cubes.backends.MongoSimpleCubeBuilder method)
create_view() (cubes.backends.SQLDenormalizer method)
Cube (class in cubes.model)
cube() (cubes.Model method)
cube() (cubes.model.Model method)
cubes (module)
cubes.backends (module)
cubes.util (module)
cut_for_dimension() (cubes.Cell method)

D
default_hierarchy (cubes.Dimension attribute)
denormalized_view() (cubes.backends.SQLDenormalizer method)
Dimension (class in cubes)
dimension() (cubes.Model method)
dimension() (cubes.model.Cube method)
dimension() (cubes.model.Model method)
dimension_object() (cubes.AggregationBrowser method)

E
expand_dictionary() (in module cubes.util)

F
fact() (cubes.AggregationBrowser method)
fact() (cubes.backends.SlicerBrowser method)
fact() (cubes.backends.SQLBrowser method)
facts() (cubes.AggregationBrowser method)
facts() (cubes.backends.SQLBrowser method)
flat_hierarchy() (cubes.Dimension method)
full_name() (cubes.Attribute method)

G
get_localizable_attributes() (in module cubes.util)

H
has_details (cubes.Dimension attribute)
Hierarchy (class in cubes)

I
IgnoringDictionary (class in cubes.util)
is_flat (cubes.Dimension attribute)
is_valid() (cubes.Model method)
is_valid() (cubes.model.Model method)

L
Level (class in cubes)
level() (cubes.Dimension method)
levels (cubes.Dimension attribute)
levels_for_path() (cubes.Hierarchy method)
load_model() (in module cubes)
localizable_dictionary() (cubes.Model method)
localizable_dictionary() (cubes.model.Model method)
localize() (cubes.Model method)
localize() (cubes.model.Model method)
localize_attributes() (in module cubes.util)
localize_common() (in module cubes.util)

M
Model (class in cubes)
Model (class in cubes.model)
model (module)
MongoSimpleCubeBrowser (class in cubes.backends)
MongoSimpleCubeBuilder (class in cubes.backends)
multi_slice() (cubes.Cell method)

N
next_level() (cubes.Hierarchy method)
node_level_points() (in module cubes.util)

P
path_is_base() (cubes.Hierarchy method)
PointCut (class in cubes)
previous_level() (cubes.Hierarchy method)

R
remove_cube() (cubes.Model method)
remove_cube() (cubes.model.Model method)
remove_dimension() (cubes.Model method)
remove_dimension() (cubes.model.Cube method)
remove_dimension() (cubes.model.Model method)
report() (cubes.AggregationBrowser method)
rollup() (cubes.Cell method)
rollup() (cubes.Hierarchy method)

S
selector_object() (cubes.backends.MongoSimpleCubeBrowser method)
setnoempty() (cubes.util.IgnoringDictionary method)
slice() (cubes.Cell method)
SlicerBrowser (class in cubes.backends)
split_field() (cubes.backends.SQLDenormalizer method)
SQLBrowser (class in cubes.backends)
SQLDenormalizer (class in cubes.backends)

T
table() (cubes.backends.SQLDenormalizer method)
to_dict() (cubes.Dimension method)
to_dict() (cubes.Hierarchy method)
to_dict() (cubes.Level method)
to_dict() (cubes.Model method)
to_dict() (cubes.model.Cube method)
to_dict() (cubes.model.Model method)

V
validate() (cubes.Dimension method)
validate() (cubes.Model method)
validate() (cubes.model.Cube method)
validate() (cubes.model.Model method)
values() (cubes.AggregationBrowser method)
values() (cubes.backends.SQLBrowser method)
