Beruflich Dokumente
Kultur Dokumente
The definitions will be oriented according to the three dimensions of semiotics (the theory of
signs), i.e., syntax, semantics, and pragmatics.
―There is, in general, no known way to distinguish knowledge from information or data on a
purely representational basis.
The relation among signs, i.e., the syntax of ‗sentences‘does not relate signs to anything
in the real world and thus, can be called one-dimensional. In our eyes, signs only viewed
under this dimension equal data.
The relation between signs and their meaning, i.e., their semantics adds a second
dimension to signs. It is only through this relation between sign and meaning, that data
becomes information.
The third dimension is added by relating signs and their ‗users‘. If this third dimension
including users, who pursue goals which require performing actions, is introduced, we
call the patterns, or signs knowledge This relationship, which is called pragmatics,
defines an important characteristic of knowledge, i.e., only knowledge enables actions
and decisions, performed by actors
Fundamental of Database
A database is a collection of tables, with related data. Table 6.1 Provides Example of
Employee table with 2 columns and 3 rows
A table is a matrix with data. A table in a database looks like a simple spreadsheet. The
example above has 3 rows and 2 columns in employee table. One column (data element)
contains data of one and the same kind, for example the column Emp_Number. A row (=
tuple, entry) is a group of related data, for example the data of one employee detail.
Normalization in a relational database is the process of removing redundancy in data by
separating the data into multiple tables. The process of removing redundancy in data by
separating the data into multiple tables
Armed with the correct information, you can create an effective and efficient database from a
logical data model. The first step in transforming a logical data model into a physical model is to
perform a simple translation from logical terms to physical objects. Of course, this simple
transformation will not result in a complete and correct physical database design – it is simply
the first step. The transformation consists of the following:
To support the mapping of attributes to table columns you will need to map each logical domain
of the attribute to a physical data type and perhaps additional constraints. In a physical database,
each column must be assigned a data type. Certain data types require a maximum length to be
specified. For example a character data type could be specified as VARCHAR(10), indicating
that up to 10 characters can be stored for the column. You may need to apply a length to other
data types as well, such as graphic, floating point, and decimal (which require a length and scale)
types
In making selection for the database/hardware platform, there are several items that need to be
carefully considered:
Scalability: How can the system grow as your data storage needs grow? Which RDBMS and
hardware platform can handle large sets of data most efficiently? To get an idea of this, one
needs to determine the approximate amount of data that is to be kept in the data warehouse
system once it‘s mature, and base any testing numbers from there.
Parallel Processing Support: The days of multi-million dollar supercomputers with one single
CPU are gone, and nowadays the most powerful computers all use multiple CPUs, where each
processor can perform a part of the task, all at the same time. Massively parallel computers in
1993,so that it would be the best way for any large computations to be done within 5 years.
OLAP is implemented in a multi-user client/server mode and offers consistently rapid response
to queries, regardless of database size and complexity. OLAP helps the user synthesize enterprise
OLAP Server
A relational database is designed for a specific purpose. Because the purpose of a data warehouse
differs from that of an On Line Transaction Processing (OLTP), the design characteristics of a
relational database that supports a data warehouse differ from the design characteristics of an
OLTP database. An OLTP system is designed to handle a large volume of small-predefined
transactions.
A data warehouse supports an OLTP system by providing a place for the OLTP database to
offload data as it accumulates, and by providing services that would complicate and degrade
OLTP operations if they were performed in the OLTP database. Without a data warehouse to
hold historical information, data is archived to static media such as magnetic tape, or allowed to
accumulate in the OLTP database. If data is simply archived for preservation, it is not available
or organized for use by analysts and decision makers. If data is allowed to accumulate in the
OLTP so it can be used for analysis, the OLTP database continues to grow in size and requires
more indexes to service analytical and report queries. These queries access and process large
portions of the continually growing historical data and add a substantial load to the database. The
large indexes needed to support these queries also tax the OLTP transactions with additional
index maintenance. These queries can also be complicated to develop due to the typically
complex OLTP database schema. A data warehouse offloads the historical data from the OLTP,
allowing the OLTP to operate at peak transaction efficiency. High volume analytical and
reporting queries are handled by the data warehouse and do not load the OLTP, which does not
need additional indexes for their support. As data is moved to the data warehouse, it is also
reorganized and consolidated so that analytical queries are simpler and more efficient.
OLAP is not designed to store large volumes of text or binary data, nor is it designed to support
high volume update transactions. The inherent stability and consistency of historical data in a
data warehouse enables OLAP to provide its remarkable performance in rapidly summarizing
information for analytical queries.
In SQL server 2000, Analysis Services provides tools for developing OLAP applications and a
server specifically designed to service OLAP queries. Other tolls are informatica, business
objects and data stage.
Data mining is a technology that applies sophisticated and complex algorithms to analyze data
and expose interesting information for analysis by decision makers. Whereas OLAP organizes
data in a model suited for exploration by analysts, data mining performs analysis on data and
provides the results to decision makers. Thus, OLAP supports model-driven analysis and data
mining supports data-driven analysis.
Data mining has traditionally operated only on raw data in the data warehouse database or, more
commonly, text files of data extracted from the data warehouse database. In SQL server 2000,
analysis services provides data mining technology that can analyze data in OLAP cubes, as well
as data in the relational data warehouse database. In addition, data mining results can be