
Section A Question 1:

ERD based question. You will have to draw one with entities, relationships and attributes and keys.

Section A Question 2:
Basic SQL. Does not require functions or triggers.

Section A Question 3:
T-SQL (3 sub-questions):
Recognise whether given objects are functions, triggers or stored procedures
Write an execution statement
Read two objects, explain what tasks they perform and point out errors

Section B Questions 4, 5, 6:
Three BI questions, based on the Snow Abode case study. All BI solutions are very important.

Section C Questions 7, 8, 9:

Information Modelling
Ignoring time-variant data (current data versus historic data)
-most data changes over time
-updating data does not have to mean overwriting it
-more accurate information comes from time-stamping: add an extra row, with its time stamp, for every change over time
Fan trap problem
-arises when one entity has two 1:M relationships to other entities, so the association between those other entities is not expressed in the model
-avoid it by setting the correct hierarchy for the data
Redundant relationships
-a relationship that only duplicates a path already available through other relationships serves no purpose and should be discarded
Use of surrogate keys
-when the natural key is not suitable as a primary key: too long, made up of multiple data types, or likely to change over time
Use of enterprise keys
-primary keys that are unique across the whole database, not just within one relation, so the same key can be used in more than one relation
-corresponds to the concept of the object identifier in object-oriented systems
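
A minimal T-SQL sketch of time-stamped rows combined with a surrogate key. The table, column names and sample prices are hypothetical and only illustrate the idea; they are not taken from the case study.

```sql
-- Hypothetical price history: a change adds a row instead of overwriting one.
DECLARE @ProductPrice TABLE (
    PriceID   INT IDENTITY(1, 1) PRIMARY KEY,  -- surrogate key
    ProductNo VARCHAR(10) NOT NULL,            -- natural key of the product
    Price     MONEY       NOT NULL,
    ValidFrom DATE        NOT NULL,
    ValidTo   DATE        NULL                 -- NULL marks the current row
);

INSERT INTO @ProductPrice (ProductNo, Price, ValidFrom, ValidTo)
VALUES ('SKI-01', 250.00, '2023-01-01', '2023-06-30'),
       ('SKI-01', 275.00, '2023-07-01', NULL);

-- Current price: the row that has not been closed yet.
SELECT ProductNo, Price
FROM @ProductPrice
WHERE ValidTo IS NULL;

-- Historic price: the row whose validity period covers a past date.
DECLARE @asOf DATE = '2023-03-15';
SELECT ProductNo, Price
FROM @ProductPrice
WHERE ValidFrom <= @asOf
  AND (ValidTo IS NULL OR ValidTo >= @asOf);
```

The surrogate key PriceID stays short and stable even though the same ProductNo now appears on several rows.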

Denormalization

Benefits: can improve performance by reducing the number of table lookups (fewer joins are needed)

Costs (data duplication): wasted storage space, data integrity/consistency threats

Common denormalization opportunities: one-to-one relationship; many-to-many relationship with non-key attributes (associative entity); reference data (1:N relationship where the 1-side has data not used in any other relationship)

Efficiency: records used together are grouped

Inconsistent access speed: slow retrievals across
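
As a rough illustration of the reference-data case, the sketch below copies a region name onto each store row so that the join is no longer needed. All table and column names are hypothetical.

```sql
-- Normalized form: reference data sits in its own table, so a join is needed.
DECLARE @Region TABLE (RegionID INT PRIMARY KEY, RegionName VARCHAR(30) NOT NULL);
DECLARE @Store  TABLE (StoreID INT PRIMARY KEY, StoreName VARCHAR(30) NOT NULL,
                       RegionID INT NOT NULL);

INSERT INTO @Region (RegionID, RegionName) VALUES (1, 'North'), (2, 'South');
INSERT INTO @Store (StoreID, StoreName, RegionID)
VALUES (10, 'Alpine', 1), (11, 'Harbour', 2), (12, 'Summit', 1);

SELECT s.StoreName, r.RegionName
FROM @Store AS s
JOIN @Region AS r ON r.RegionID = s.RegionID;

-- Denormalized form: RegionName is duplicated on every store row.
DECLARE @StoreDenorm TABLE (StoreID INT PRIMARY KEY, StoreName VARCHAR(30) NOT NULL,
                            RegionName VARCHAR(30) NOT NULL);

INSERT INTO @StoreDenorm (StoreID, StoreName, RegionName)
SELECT s.StoreID, s.StoreName, r.RegionName
FROM @Store AS s
JOIN @Region AS r ON r.RegionID = s.RegionID;

-- Same answer with one table lookup, at the cost of storing 'North' twice.
SELECT StoreName, RegionName
FROM @StoreDenorm;
```

If a region is ever renamed, every copy has to be updated, which is the consistency threat listed under costs.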

Relational model
The data stored in a database are worthless unless one can interact with them to get useful information.
In relational algebra (the theoretical approach), queries are specified using a collection of operators.
Such queries are said to be expressed in an operational manner. This style is procedural.
In relational calculus, a query describes the desired answer without specifying how the
answer is to be computed. It is used heavily in QBE (Query By Example). This non-procedural
style of querying is said to be declarative.

SQL Basic concepts


SQL can be used for five main purposes: to retrieve, define, manipulate and control data, and to control transactions.
A statement is sent to the RDBMS server and the results are returned as a set of tuples (rows).
NULL values (Unknown)
Tuples in SQL relations can have NULL as a value for one or more components.
Missing value: we know a value exists but we don't know what it is
Inapplicable: the attribute does not apply to this tuple, so no value makes sense
NULL can also be used to initialize a variable at the beginning of a program (SET @variable = NULL)
Aggregate functions used with GROUP BY perform calculations such as SUM, AVG, COUNT, MAX and MIN. After the GROUP BY
clause, HAVING filters groups in the way WHERE filters rows; WHERE cannot contain aggregate functions, but HAVING can.
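
A short sketch of NULL handling and of WHERE versus HAVING. The @Sale table variable and its rows are made up for the example.

```sql
DECLARE @Sale TABLE (
    SaleID   INT PRIMARY KEY,
    Customer VARCHAR(20) NOT NULL,
    Amount   MONEY NULL              -- NULL means the amount is unknown
);

INSERT INTO @Sale (SaleID, Customer, Amount)
VALUES (1, 'Ann', 100), (2, 'Ann', 50),
       (3, 'Bob', 20),  (4, 'Bob', NULL),
       (5, 'Cy',  NULL);

-- NULL is tested with IS NULL, not with the = operator.
SELECT SaleID
FROM @Sale
WHERE Amount IS NULL;

-- WHERE filters rows before grouping; HAVING filters groups after aggregation.
-- COUNT(*) counts every row, COUNT(Amount) and SUM(Amount) skip NULLs.
SELECT Customer,
       COUNT(*)      AS NumRows,
       COUNT(Amount) AS NumKnownAmounts,
       SUM(Amount)   AS Total
FROM @Sale
WHERE SaleID > 0
GROUP BY Customer
HAVING SUM(Amount) > 50;
```

With this sample data only Ann's group passes the HAVING test: Bob's total is 20, and Cy's total is NULL, which never satisfies a comparison.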
View
Views are used for:
-simplifying complex joins
-reformatting retrieved data for efficiency of the query
-filtering unwanted data
-including calculated fields
To change the definition of a view, drop the view and then create a new one (SQL Server also offers ALTER VIEW).
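
The sketch below shows one view covering all four uses. It assumes a scratch database; the Demo tables and the view name are hypothetical. GO is the batch separator used by SQL Server Management Studio and sqlcmd, needed because CREATE VIEW must be the first statement in its batch.

```sql
CREATE TABLE dbo.DemoCustomer (
    CustNo   INT PRIMARY KEY,
    CustName VARCHAR(40) NOT NULL
);
CREATE TABLE dbo.DemoOrder (
    OrderNo   INT PRIMARY KEY,
    CustNo    INT   NOT NULL REFERENCES dbo.DemoCustomer (CustNo),
    Qty       INT   NOT NULL,
    UnitPrice MONEY NOT NULL,
    Cancelled BIT   NOT NULL DEFAULT 0
);
GO

-- The view hides the join, filters cancelled orders and adds a calculated field.
CREATE VIEW dbo.vw_DemoActiveOrders
AS
SELECT o.OrderNo,
       c.CustName,
       o.Qty * o.UnitPrice AS LineTotal
FROM dbo.DemoOrder AS o
JOIN dbo.DemoCustomer AS c ON c.CustNo = o.CustNo
WHERE o.Cancelled = 0;
GO

-- Callers query the view like a table.
SELECT OrderNo, CustName, LineTotal
FROM dbo.vw_DemoActiveOrders;
GO

-- Changing the view: drop it, then issue a new CREATE VIEW.
DROP VIEW dbo.vw_DemoActiveOrders;
```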

ERD generalisation concepts


All subtypes inherit the common attributes of the supertype; each subtype also has its own
attributes.

Generalization: the process of defining a more general entity type from a set of more
specialized entity types. Aggregate entities to a superclass entity type by identifying their
common characteristics.

Specialization: the process of defining one or more subtypes of the supertype and forming
supertype/subtype relationships. Identifying subclasses, and their distinguishing
characteristics.

EER (extended entity-relationship) diagrams are difficult to read when there are too many entities and
relationships, so group entities and relationships into entity clusters. An entity cluster is a set of one or more entity types
and associated relationships grouped into a single abstract entity type.
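
One common way to carry a supertype/subtype structure into tables is sketched below, assuming a hypothetical EMPLOYEE supertype with hourly and salaried subtypes. Other mappings exist (for example a single table for everything), so treat this as one option only.

```sql
-- Supertype: attributes common to every employee, plus a subtype discriminator.
CREATE TABLE dbo.DemoEmployee (
    EmpNo   INT PRIMARY KEY,
    EmpName VARCHAR(40) NOT NULL,
    EmpType CHAR(1)     NOT NULL CHECK (EmpType IN ('H', 'S'))
);

-- Subtypes: share the supertype's key and add their own attributes.
CREATE TABLE dbo.DemoHourlyEmployee (
    EmpNo      INT PRIMARY KEY REFERENCES dbo.DemoEmployee (EmpNo),
    HourlyRate MONEY NOT NULL
);
CREATE TABLE dbo.DemoSalariedEmployee (
    EmpNo        INT PRIMARY KEY REFERENCES dbo.DemoEmployee (EmpNo),
    AnnualSalary MONEY NOT NULL
);

INSERT INTO dbo.DemoEmployee (EmpNo, EmpName, EmpType) VALUES (1, 'Ann', 'H');
INSERT INTO dbo.DemoHourlyEmployee (EmpNo, HourlyRate) VALUES (1, 32.50);

-- A subtype row "inherits" the common attributes through the shared key.
SELECT e.EmpNo, e.EmpName, h.HourlyRate
FROM dbo.DemoEmployee AS e
JOIN dbo.DemoHourlyEmployee AS h ON h.EmpNo = e.EmpNo;
```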

Programming with T-SQL


Variable

It is a named object that stores a value within a program.

By default it does not persist outside the program (unlike values stored in tables and views in the
database). These variables are known as local variables (prefix @).

Names prefixed with @@ (for example @@FETCH_STATUS and @@ROWCOUNT) are often called global variables.
In SQL Server they are system-supplied, read-only values: a program can read them but cannot declare its own.

Once a variable is declared with a data type, it cannot be undeclared within the scope of the program.

It is NULL when first declared, so assign a value with SET (one variable per statement) or SELECT (one or more variables in a single statement).

GetDate(): returns the current date and time

DatePart(): returns part of a date (for example the day, the day of the week or the month)
DatePart(dw, GetDate()): returns the current day of the week as a number (1 for Sunday and 2 for Monday under the default US English setting, DATEFIRST 7)

BEGIN & END define a block of code. Use them in IF and WHILE (looping) statements. Without BEGIN and END, SQL Server
treats only the next statement as part of the IF (or ELSE, or WHILE): if the
condition is not true, that one statement is skipped and execution carries on with whatever follows.
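
A small sketch that ties these points together: local variables, SET versus SELECT assignment, DATEPART and an IF ... ELSE with BEGIN and END blocks. The variable names are arbitrary, and the weekend test assumes the default DATEFIRST 7 setting.

```sql
DECLARE @today   DATETIME,
        @dayNo   INT,
        @message VARCHAR(30);   -- all three are NULL at this point

SET @today = GETDATE();                       -- SET assigns one variable

SELECT @dayNo   = DATEPART(dw, @today),       -- SELECT can assign several
       @message = 'unset';

-- Assumes DATEFIRST 7, where 1 = Sunday and 7 = Saturday.
IF @dayNo IN (1, 7)
BEGIN
    -- Both statements belong to the IF because of BEGIN ... END.
    SET @message = 'Weekend';
    PRINT 'No batch jobs today';
END
ELSE
BEGIN
    SET @message = 'Weekday';
END

SELECT @today AS Today, @dayNo AS DayOfWeekNo, @message AS DayKind;
```

Removing the first BEGIN and END pair would be a syntax error here, because the PRINT would then sit between the single-statement IF and its ELSE.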

Stored procedures
Stored procedures are batched collections of SQL statements saved for future use. The statements
are bundled together and run as one unit, sequentially, in the order written within the procedure.

Support data integrity: the same report will always give the same results if conditions are kept
static

Prevent errors, maintain consistency and security, improve performance and
reduce workload

Independence of business logic and data

In short, they offer simplicity, operational efficiency, performance and security.
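
A minimal sketch of a stored procedure with an OUTPUT parameter, followed by an execution statement of the kind the exam outline mentions. The procedure name and parameters are hypothetical.

```sql
CREATE PROCEDURE dbo.usp_DemoLineTotal
    @qty       INT,
    @unitPrice MONEY,
    @lineTotal MONEY OUTPUT      -- value handed back to the caller
AS
BEGIN
    SET NOCOUNT ON;
    SET @lineTotal = @qty * @unitPrice;
END
GO

-- Execution statement: the OUTPUT keyword is required on the call as well.
DECLARE @result MONEY;
EXECUTE dbo.usp_DemoLineTotal
    @qty       = 3,
    @unitPrice = 19.50,
    @lineTotal = @result OUTPUT;

SELECT @result AS LineTotal;     -- 3 * 19.50 = 58.50
```

Parameters can also be passed by position, as in EXECUTE dbo.usp_DemoLineTotal 3, 19.50, @result OUTPUT.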

Cursors
A cursor points to one record at a time, allowing operations to be performed on the tuple that it is
pointing to. It is a database query stored in SQL Server: not the SELECT statement itself but the relational set
resulting from that statement. We can think of it as a pointer.
Cursors are created using DECLARE, the same statement used to declare a variable. DECLARE names the
cursor and takes a SELECT statement that defines the cursor. To remove an unused cursor use the
DEALLOCATE statement.
Once the cursor is opened (OPEN), the data can be accessed by stepping through the relation with the
FETCH statement. Each FETCH NEXT moves to the following row, one row at a time.

FETCH options (options other than NEXT need a scrollable cursor, for example one declared with SCROLL)

FETCH PRIOR: to retrieve the previous row of a relation

FETCH FIRST: to retrieve the first row of a relation

FETCH LAST: to retrieve the last row of a relation

FETCH ABSOLUTE n: to retrieve a specific row of a relation, counting from the top

FETCH RELATIVE n: to retrieve a specific row of a relation, counting from the current row
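
The FETCH options can be tried out on a scrollable cursor. This is a sketch over a made-up four-row table variable; the comments give the value each FETCH leaves in @n.

```sql
DECLARE @Demo TABLE (n INT PRIMARY KEY);
INSERT INTO @Demo (n) VALUES (10), (20), (30), (40);

DECLARE @n INT;

-- SCROLL makes every FETCH option available, not just NEXT.
DECLARE Demo_Cursor SCROLL CURSOR
FOR
    SELECT n FROM @Demo ORDER BY n;

OPEN Demo_Cursor;

FETCH LAST       FROM Demo_Cursor INTO @n;   -- @n = 40 (row 4)
FETCH PRIOR      FROM Demo_Cursor INTO @n;   -- @n = 30 (row 3)
FETCH ABSOLUTE 2 FROM Demo_Cursor INTO @n;   -- @n = 20 (row 2 from the top)
FETCH RELATIVE 2 FROM Demo_Cursor INTO @n;   -- @n = 40 (two rows after row 2)
FETCH FIRST      FROM Demo_Cursor INTO @n;   -- @n = 10 (row 1)

SELECT @n AS LastFetchedValue;

CLOSE Demo_Cursor;
DEALLOCATE Demo_Cursor;
```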

-- Add up the totals of all customer orders, one order at a time.
DECLARE @orderNo INT;
DECLARE @orderTotal MONEY;
DECLARE @grandTotal MONEY;
SET @grandTotal = 0;

-- The cursor is defined by a SELECT statement.
DECLARE CustOrders_Cursor CURSOR
FOR
    SELECT CustOrdNo FROM CustOrder
    ORDER BY CustOrdNo;

OPEN CustOrders_Cursor;
FETCH NEXT FROM CustOrders_Cursor INTO @orderNo;   -- first row

-- @@FETCH_STATUS is 0 while the last FETCH returned a row.
WHILE @@FETCH_STATUS = 0
BEGIN
    EXECUTE Order_TOTAL @orderNo, 1, @orderTotal OUTPUT;
    SET @grandTotal = @grandTotal + @orderTotal;
    FETCH NEXT FROM CustOrders_Cursor INTO @orderNo;   -- next row
END

CLOSE CustOrders_Cursor;
DEALLOCATE CustOrders_Cursor;
SELECT @grandTotal AS GrandTotal;

Triggers
Triggers are used to create an audit trail of what happens in a database and also to ensure data integrity
and consistency.
1. Specify a unique name for the trigger
2. Specify the table on which the trigger will act
3. Specify which database event the trigger should respond to: insert, update or delete
Example: AFTER INSERT
4. If you want the trigger to run after more than one such event, list all of them
Example: AFTER INSERT, UPDATE
5. To delete a trigger use: DROP TRIGGER <trigger name>;
6. To change a trigger use: ALTER TRIGGER <trigger name> followed by the new definition, or DROP and CREATE the trigger again
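
The steps above can be sketched as an audit-trail trigger. The table and trigger names are hypothetical, and the code assumes a scratch database; inserted is the pseudo-table SQL Server supplies inside a trigger to hold the new row versions.

```sql
CREATE TABLE dbo.DemoAccount (
    AccountNo INT PRIMARY KEY,
    Balance   MONEY NOT NULL
);
CREATE TABLE dbo.DemoAccountAudit (
    AuditID    INT IDENTITY(1, 1) PRIMARY KEY,
    AccountNo  INT      NOT NULL,
    NewBalance MONEY    NOT NULL,
    ChangedAt  DATETIME NOT NULL DEFAULT GETDATE()
);
GO

CREATE TRIGGER dbo.trg_DemoAccount_Audit   -- step 1: unique name
ON dbo.DemoAccount                         -- step 2: table it acts on
AFTER INSERT, UPDATE                       -- steps 3 and 4: events
AS
BEGIN
    SET NOCOUNT ON;
    -- One audit row for every row that was inserted or updated.
    INSERT INTO dbo.DemoAccountAudit (AccountNo, NewBalance)
    SELECT AccountNo, Balance
    FROM inserted;
END
GO

INSERT INTO dbo.DemoAccount (AccountNo, Balance) VALUES (1, 100);
UPDATE dbo.DemoAccount SET Balance = 80 WHERE AccountNo = 1;

-- Two audit rows: one from the INSERT, one from the UPDATE.
SELECT AuditID, AccountNo, NewBalance, ChangedAt
FROM dbo.DemoAccountAudit
ORDER BY AuditID;
GO

DROP TRIGGER dbo.trg_DemoAccount_Audit;    -- step 5
```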
Identity
IDENTITY is a column property in SQL Server that can be set when creating a table, similar
to using a sequence in Oracle. Only one identity column is allowed per table, generally the PK.
CREATE TABLE Test1 (Test1_ID INT NOT NULL IDENTITY(1, 1))
The first 1 is the value to start with and the second 1 is the value to increment by.

Functions
A function is a programmable object used to do one or more calculations and return a value to a calling
application, or to integrate values into a result set. It can manipulate data and return a value, but it
cannot modify any data within the repositories.
Built-in functions are global functions available to all developers of SQL Server applications. Even though they are
not portable, it is better to use them than to write new, more portable functions, because they perform better within
SQL Server.

User-defined functions (can be created only to manipulate data)
The following statements are allowed inside a user-defined function:

-DECLARE statements, to define data variables and cursors local to the function

-SET statements, to assign values to scalar and table local variables

-cursor operations, including FETCH statements that assign values to local variables using the
INTO clause

-SELECT statements containing select lists with expressions that assign values to variables that
are local to the function

-UPDATE, INSERT and DELETE statements modifying table variables that are local to the function

-EXECUTE statements calling an extended stored procedure
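
A minimal scalar user-defined function, using only DECLARE, SET and RETURN from the list of allowed statements. The function name and the tax example are hypothetical.

```sql
CREATE FUNCTION dbo.fn_DemoAddTax (@amount MONEY, @ratePercent DECIMAL(5, 2))
RETURNS MONEY
AS
BEGIN
    DECLARE @result MONEY;                                   -- local variable
    SET @result = @amount + (@amount * @ratePercent / 100);  -- calculation only
    RETURN @result;                                          -- must return a value
END
GO

-- A scalar function is called with its schema name and can sit
-- directly inside a SELECT list or a WHERE clause.
SELECT dbo.fn_DemoAddTax(200, 10) AS AmountWithTax;   -- 200 plus 10 percent = 220
```

Trying to UPDATE a permanent table inside the function body is rejected when the function is created, which matches the rule that functions cannot modify stored data.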

User-defined function | Stored procedure

Must return a value (a single result set) | Can return a value, or even multiple result sets

Returns table variables | Cannot return a table variable, although it can create a table

Can be used directly in SELECT, ORDER BY, WHERE and FROM clauses | Cannot be used in SELECT

Cannot change server environment variables | Can change server environment variables

Always stops execution when an error occurs | Consistency, if proper error handling code is used

Data Warehouse
It is an integrated, subject-oriented, time-variant and non-volatile database that provides support for
management decision making in an organization. The need for it comes from two directions: an integrated, company-wide view of
high-quality information drawn from disparate databases, and the separation of operational and informational
systems and data for improved performance.
Operational or Transaction Processing system
Substitutes computer-based processing for manual procedures
Deals with well-structured routine processes
Data warehouse: motivation
Massive amounts of data from business transactions
Improvements in IT
Intense competition for customers' attention
Operational data | Decision support (DS) data

Timespan: represent current transactions | tend to cover a long time frame

Granularity: represent specific transactions that occur at a given time | presented at different levels of aggregation

Dimensionality: focuses on representing atomic transactions | can be analyzed from multiple dimensions

When we create a data warehouse, the source is operational data, both external and internal. We
extract, filter, transform, integrate, classify, aggregate and summarise it, and then load the result into the data
warehouse so that it is integrated, subject-oriented, time-variant and non-volatile.

Star schema
Facts, dimensions, attributes, attribute hierarchies
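
A tiny star schema sketch: one fact table surrounded by two dimension tables, queried along the dimension attributes. All names and figures are invented for illustration and do not come from the Snow Abode case study.

```sql
-- Dimensions: descriptive attributes, some of which form hierarchies
-- (product -> category, month -> year).
DECLARE @DimProduct TABLE (ProductKey INT PRIMARY KEY,
                           ProductName VARCHAR(30) NOT NULL,
                           Category    VARCHAR(20) NOT NULL);
DECLARE @DimDate TABLE (DateKey  INT PRIMARY KEY,
                        CalYear  INT NOT NULL,
                        CalMonth INT NOT NULL);

-- Fact: numeric measures, keyed by the dimensions.
DECLARE @FactSales TABLE (ProductKey  INT   NOT NULL,
                          DateKey     INT   NOT NULL,
                          Quantity    INT   NOT NULL,
                          SalesAmount MONEY NOT NULL,
                          PRIMARY KEY (ProductKey, DateKey));

INSERT INTO @DimProduct VALUES (1, 'Downhill ski', 'Skis'),
                               (2, 'Touring ski',  'Skis'),
                               (3, 'Helmet',       'Safety');
INSERT INTO @DimDate VALUES (202301, 2023, 1), (202302, 2023, 2);
INSERT INTO @FactSales VALUES (1, 202301, 2, 500),
                              (2, 202301, 1, 300),
                              (3, 202302, 4, 200);

-- Aggregate the fact up the product hierarchy (category) and date hierarchy (year).
SELECT p.Category,
       d.CalYear,
       SUM(f.Quantity)    AS TotalQuantity,
       SUM(f.SalesAmount) AS TotalSales
FROM @FactSales AS f
JOIN @DimProduct AS p ON p.ProductKey = f.ProductKey
JOIN @DimDate    AS d ON d.DateKey    = f.DateKey
GROUP BY p.Category, d.CalYear;
```

Grouping by ProductName or CalMonth instead drills down to a finer level of the same hierarchies.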

Data mining

Steps in data reconciliation
capture - scrub - transform - load & index

Data warehouse pitfalls
-getting reconciled metadata for the enterprise data model
-getting clean data values
-communication problems
-lack of technical expertise
-poor planning

Database architecture

Database engine
-is the core service for storing, processing and securing data
-provides controlled access and rapid transaction processing to meet requirements, as well as rich
support for sustaining high availability
-is used to create relational databases for OLTP or OLAP data
-includes creating tables and other database objects for viewing, managing and securing data
-SQL Server Management Studio can be used to manage the database objects
-SQL Server Profiler can be used for capturing server events

Analysis Services: multidimensional data

It provides fast, intuitive, top-down analysis of large quantities of data built on a unified data
model, which can be delivered to users in multiple languages and currencies.
It also works with data warehouses, data marts, production databases and operational data stores,
supporting analysis of both historical and real-time data.

OLAP data can be disaggregated and aggregated along a dimension according to its natural
hierarchy.
Issues are query performance & reliability, integration & flexibility, capacity & scalability,
exponential database growth, total cost of ownership and rapid technological changes.

Intelligence density
BI is what is achieved for the decision makers in a business through techniques and tools such as MSS,
data warehousing, OLAP and data mining, either individually or in combination.
