330 Note

Section A Question 1:
ERD-based question. You will have to draw an ERD with entities, relationships, attributes and keys.
Basic SQL. Does not require functions or triggers.
T-SQL (3 sub-questions). Recognise whether the objects are functions, triggers or stored procedures.
Write an execution statement.
Read two objects, explain what tasks they perform and point out any errors.
Section B Questions 4, 5, 6:
Three BI questions, based on the Snow Abode case study. All BI solutions are very important.
Section C Questions 7, 8, 9:
Information Modelling
Ignoring time-variant data (current data versus historic data)
-most data changes with time
-updating data does not have to mean overwriting it
-more accurate information comes from time-stamping the data and adding an extra row with every change over time
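The time-stamping idea above can be sketched as follows. This is a minimal illustration in Python with SQLite standing in for SQL Server; the table product_price, its columns and the helper price_on are hypothetical names chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE product_price (
        product_id INTEGER NOT NULL,
        price      REAL    NOT NULL,
        valid_from TEXT    NOT NULL,   -- time stamp of the change (ISO date)
        PRIMARY KEY (product_id, valid_from)
    )
""")

# Each price change adds a row; earlier rows are never overwritten.
conn.executemany(
    "INSERT INTO product_price VALUES (?, ?, ?)",
    [(1, 100.0, "2023-01-01"), (1, 120.0, "2023-06-01"), (1, 110.0, "2024-01-01")],
)

def price_on(product_id, day):
    """Return the price in force on the given ISO date, or None if none applies."""
    row = conn.execute(
        """SELECT price FROM product_price
           WHERE product_id = ? AND valid_from <= ?
           ORDER BY valid_from DESC LIMIT 1""",
        (product_id, day),
    ).fetchone()
    return row[0] if row else None

current_price = price_on(1, "2024-03-15")
historic_price = price_on(1, "2023-07-01")
```

Because no row is overwritten, both the current price and any historic price can still be answered from the same table.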
Fan trap problem
-set hierarchies for the data
Redundant relationships
-a relationship that adds nothing beyond what other relationships already express is redundant and should be discussed
Use of surrogate keys
-used when the natural key is not suitable as a primary key: it is too long, combines multiple data types, or may change over time
Use of enterprise keys
-primary keys whose values are unique across the whole database rather than within a single relation; this corresponds to the concept of an object identifier in object-oriented systems
Denormalization
Benefits: can improve performance by reducing the number of table lookups (fewer join queries are needed)
Costs (data duplication): wasted storage space, data integrity/consistency threats
Common denormalization opportunities: one-to-one relationships, many-to-many relationships with non-key attributes (associative entity), reference data (1:N relationship where the 1-side has data not used in any other relationship)
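The trade-off listed above can be seen in a small sketch. It uses Python with SQLite as a stand-in for SQL Server; the tables customer, cust_order and cust_order_denorm are hypothetical and only illustrate the reference-data case.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: the customer name is stored once.
    CREATE TABLE customer (cust_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE cust_order (order_id INTEGER PRIMARY KEY, cust_id INTEGER, total REAL);
    INSERT INTO customer VALUES (1, 'Ana');
    INSERT INTO cust_order VALUES (10, 1, 50.0), (11, 1, 75.0);

    -- Denormalized: the name is copied into every order row.
    CREATE TABLE cust_order_denorm (order_id INTEGER PRIMARY KEY, cust_id INTEGER,
                                    cust_name TEXT, total REAL);
    INSERT INTO cust_order_denorm
        SELECT o.order_id, o.cust_id, c.name, o.total
        FROM cust_order o JOIN customer c ON c.cust_id = o.cust_id;
""")

# Benefit: the same answer is available without a join.
joined = conn.execute(
    """SELECT o.order_id, c.name
       FROM cust_order o JOIN customer c ON c.cust_id = o.cust_id
       ORDER BY o.order_id"""
).fetchall()
flat = conn.execute(
    "SELECT order_id, cust_name FROM cust_order_denorm ORDER BY order_id"
).fetchall()

# Cost: updating the source table leaves the duplicated copies stale.
conn.execute("UPDATE customer SET name = 'Ana Maria' WHERE cust_id = 1")
stale = conn.execute("SELECT DISTINCT cust_name FROM cust_order_denorm").fetchall()
```

After the update, the denormalized table still holds the old name, which is the consistency threat named under Costs.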
Efficiency: records used together are grouped
Inconsistent access speed: slow retrievals across
Relational model
The data stored in a database is worthless unless one can interact with it to get useful information.
In relational algebra (the theoretical approach), queries are specified using a collection of operators. Such queries are expressed in an operational manner: they state how the answer is computed. This is procedural.
In relational calculus, a query describes the desired answer without specifying how the answer is to be computed. It is used heavily in QBE (query by example). This non-procedural style of querying is said to be declarative.
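The difference between the two styles can be sketched in Python. The helper functions select and project below are hypothetical stand-ins for the algebra operators, and the employee data is invented for the example; the SQL query run through SQLite plays the declarative role.

```python
import sqlite3

employees = [
    {"name": "Ana", "dept": "IT", "salary": 5000},
    {"name": "Ben", "dept": "HR", "salary": 4000},
    {"name": "Cy", "dept": "IT", "salary": 3000},
]

# Procedural (relational algebra style): apply operators in a chosen order.
def select(relation, predicate):
    """Selection: keep the tuples that satisfy the predicate."""
    return [t for t in relation if predicate(t)]

def project(relation, attributes):
    """Projection: keep only the named attributes, removing duplicate tuples."""
    seen = []
    for t in relation:
        row = tuple(t[a] for a in attributes)
        if row not in seen:
            seen.append(row)
    return seen

algebra_result = project(select(employees, lambda t: t["dept"] == "IT"), ["name"])

# Declarative: describe the answer and let the DBMS decide how to compute it.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employee VALUES (:name, :dept, :salary)", employees)
sql_result = conn.execute(
    "SELECT DISTINCT name FROM employee WHERE dept = 'IT'"
).fetchall()
```

Both routes give the same set of names; only the first one fixes the order in which the operators are applied.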
SQL Basic concepts

SQL can be used for five main purposes: retrieving, defining, manipulating and controlling data, and controlling transactions.
A statement is sent to the RDBMS server and the results are returned as a set of tuples.
NULL values (unknown)
Tuples in SQL relations can have NULL as the value of one or more components.
Missing value: we know a value exists but we don't know what it is
Inapplicable: no value applies to this tuple
NULL is also used to initialize a variable at the beginning of a program (SET @variable = NULL)
Aggregate functions such as SUM, AVG, COUNT, MAX and MIN perform calculations over the groups formed by GROUP BY. After this clause, HAVING can be used to filter groups in the way WHERE filters rows (WHERE cannot contain aggregate functions).
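A short sketch of NULL handling and of HAVING after GROUP BY, using Python with SQLite in place of SQL Server. The sale table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sale (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("North", 100.0), ("North", 250.0), ("South", 80.0), ("South", None)],
)

# Comparing with = NULL yields unknown, never true, so no row matches.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM sale WHERE amount = NULL"
).fetchone()[0]

# IS NULL is the correct test for a missing value.
is_null = conn.execute(
    "SELECT COUNT(*) FROM sale WHERE amount IS NULL"
).fetchone()[0]

# SUM ignores NULLs; HAVING filters the groups after aggregation.
big_regions = conn.execute(
    """SELECT region, SUM(amount) AS total
       FROM sale
       GROUP BY region
       HAVING SUM(amount) > 100"""
).fetchall()
```

The South group totals 80 (its NULL amount is skipped) and is removed by HAVING, a filter that WHERE could not express because it involves an aggregate.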
View
Simplifying complex joins
Reformatting retrieved data for efficiency of the query
Filtering unwanted data
Including calculated fields
To change a view, drop the view and then create a new one (SQL Server also provides ALTER VIEW)
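The uses of a view listed above can be sketched as follows, with Python and SQLite standing in for SQL Server. The order_line table and the view name order_line_value are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE order_line (order_id INTEGER, product TEXT, qty INTEGER, unit_price REAL);
    INSERT INTO order_line VALUES (1, 'ski', 2, 300.0), (1, 'wax', 1, 10.0), (2, 'pole', 0, 40.0);

    -- The view filters unwanted rows and adds a calculated field.
    CREATE VIEW order_line_value AS
        SELECT order_id, product, qty * unit_price AS line_value
        FROM order_line
        WHERE qty > 0;
""")
rows = conn.execute("SELECT * FROM order_line_value ORDER BY line_value").fetchall()

# Changing the view: drop it and create a new definition under the same name.
conn.executescript("""
    DROP VIEW order_line_value;
    CREATE VIEW order_line_value AS
        SELECT order_id, SUM(qty * unit_price) AS order_value
        FROM order_line
        GROUP BY order_id;
""")
totals = conn.execute("SELECT * FROM order_line_value ORDER BY order_id").fetchall()
```

Queries against the view stay short even though the underlying calculation or filter lives in the view definition.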
ERD generalisation concepts

All subtypes inherit the common attributes of the supertype; each subtype also has its own attributes.
Generalization: the process of defining a more general entity type from a set of more
specialized entity types. Aggregate entities to a superclass entity type by identifying their
common characteristics.
Specialization: the process of defining one or more subtypes of the supertype and forming
supertype/subtype relationships. Identifying subclasses, and their distinguishing
characteristics.
EER (extended entity-relationship) diagrams are difficult to read when there are too many entities and relationships, so group entities and relationships into entity clusters: a set of one or more entity types and associated relationships grouped into a single abstract entity type.
Programming with T-SQL

Variable
It is a named object that stores a value within a program.
By default it does not persist outside the program (unlike values stored in tables and views in the database). These variables are known as local variables (prefix @).
Names prefixed with @@ (for example @@FETCH_STATUS) are often called global variables; in SQL Server they are system-supplied and cannot be declared by the user.
Once it is declared with data type, it cannot be undeclared within the scope of the program.
It is NULL when first declared, so use SET to assign a single variable and SELECT to assign several at once.
GetDate(): returns the current date and time
DatePart(): returns part of a date (for example the day, the day of the week or the month)
DatePart(dw, GetDate()): returns the current day of the week (1 for Sunday and 2 for Monday under the default US English setting)
BEGIN and END define a block of code. Use them in IF and WHILE (looping) statements. Without BEGIN...END, SQL Server treats only the next statement as part of the IF (or ELSE), so only that one statement is skipped when the condition is not true.
Stored procedures
Stored procedures are batched collections of SQL statements saved for future use. The statements are grouped together and run as one unit, sequentially in the order written within the procedure.
Data integrity: the same report will always give the same results if conditions are kept static
Preventing errors and maintaining consistency, improving security and performance, and reducing workload
Independence of business logic and data
Benefits: simplicity, operational efficiency, performance and security.
Cursors
A cursor points to one record at a time, allowing operations to be performed on the tuple it is pointing to. It is a database query stored in SQL Server: not the SELECT statement itself but the relational set resulting from that statement. We can think of it as a pointer.
Cursors are created using the DECLARE statement, the same statement used to declare a variable. DECLARE names the cursor and takes a SELECT statement that defines it. To remove an unused cursor, use the DEALLOCATE statement.
Once the cursor is opened, the data can be accessed by stepping through the relation with the FETCH statement; each FETCH NEXT moves to the following row.
FETCH options
FETCH PRIOR: retrieves the previous row of the relation
FETCH FIRST: retrieves the first row of the relation
FETCH LAST: retrieves the last row of the relation
FETCH ABSOLUTE n: retrieves a specific row of the relation, counting from the top
FETCH RELATIVE n: retrieves a specific row of the relation, counting from the current row
-- Sum the totals of all customer orders, one order at a time.
DECLARE @orderNo INT;
DECLARE @orderTotal MONEY;
DECLARE @grandTotal MONEY;
SET @grandTotal = 0;
DECLARE CustOrders_Cursor CURSOR
FOR
    SELECT CustOrdNo FROM CustOrder
    ORDER BY CustOrdNo;
OPEN CustOrders_Cursor;
-- Fetch the first row before testing @@FETCH_STATUS.
FETCH NEXT FROM CustOrders_Cursor INTO @orderNo;
WHILE @@FETCH_STATUS = 0   -- 0 means the last FETCH succeeded
BEGIN
    -- Order_TOTAL returns the order's total through its OUTPUT parameter.
    EXECUTE Order_TOTAL @orderNo, 1, @orderTotal OUTPUT;
    SET @grandTotal = @grandTotal + @orderTotal;
    FETCH NEXT FROM CustOrders_Cursor INTO @orderNo;
END
CLOSE CustOrders_Cursor;
DEALLOCATE CustOrders_Cursor;
SELECT @grandTotal AS GrandTotal;
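For comparison, the same row-by-row pattern can be sketched in Python with SQLite, where fetchone() plays the role of FETCH NEXT and a None result plays the role of a non-zero @@FETCH_STATUS. The OrderLine table and the order_total function are hypothetical stand-ins for whatever the Order_TOTAL procedure reads and computes, which the notes do not show.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE CustOrder (CustOrdNo INTEGER PRIMARY KEY);
    CREATE TABLE OrderLine (CustOrdNo INTEGER, Amount REAL);  -- hypothetical detail table
    INSERT INTO CustOrder VALUES (1), (2);
    INSERT INTO OrderLine VALUES (1, 10.0), (1, 5.5), (2, 20.0);
""")

def order_total(order_no):
    """Hypothetical stand-in for the Order_TOTAL procedure: sum one order's lines."""
    return conn.execute(
        "SELECT COALESCE(SUM(Amount), 0) FROM OrderLine WHERE CustOrdNo = ?",
        (order_no,),
    ).fetchone()[0]

grand_total = 0.0
# Roughly DECLARE ... CURSOR FOR ... plus OPEN.
cursor = conn.execute("SELECT CustOrdNo FROM CustOrder ORDER BY CustOrdNo")
row = cursor.fetchone()            # first FETCH NEXT
while row is not None:             # like WHILE @@FETCH_STATUS = 0
    grand_total += order_total(row[0])
    row = cursor.fetchone()        # FETCH NEXT inside the loop
cursor.close()                     # like CLOSE and DEALLOCATE
```

The fetch before the loop and the fetch at the end of the loop body mirror the two FETCH NEXT statements in the T-SQL version.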
Triggers
Triggers are used to create an audit trail of what happens in a database and also to ensure data integrity and consistency.
1. Specify a unique name for the trigger
2. Specify the table on which the trigger will act
3. Specify which database event the trigger should respond to: INSERT, UPDATE or DELETE
Example: AFTER INSERT
4. If you want the trigger to run after more than one such event, specify them all
Example: AFTER INSERT, UPDATE
5. To delete a trigger use: DROP TRIGGER <trigger name>;
6. To change a trigger use: ALTER TRIGGER <trigger name> followed by the new definition, or DROP and CREATE the trigger again
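The steps above can be sketched with an audit-trail trigger. The example runs on SQLite through Python, so the syntax differs from T-SQL: SQLite needs one trigger per event and uses NEW.column, whereas T-SQL can list AFTER INSERT, UPDATE on one trigger and reads the inserted table. The account and account_audit tables and the trigger names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE account (acc_id INTEGER PRIMARY KEY, balance REAL);
    CREATE TABLE account_audit (acc_id INTEGER, action TEXT, new_balance REAL);

    -- Steps 1 to 3: unique name, target table, event.
    CREATE TRIGGER trg_account_insert AFTER INSERT ON account
    BEGIN
        INSERT INTO account_audit VALUES (NEW.acc_id, 'INSERT', NEW.balance);
    END;

    -- Step 4: a second event (a separate trigger in SQLite).
    CREATE TRIGGER trg_account_update AFTER UPDATE ON account
    BEGIN
        INSERT INTO account_audit VALUES (NEW.acc_id, 'UPDATE', NEW.balance);
    END;
""")

conn.execute("INSERT INTO account VALUES (1, 100.0)")
conn.execute("UPDATE account SET balance = 60.0 WHERE acc_id = 1")
audit_trail = conn.execute("SELECT * FROM account_audit ORDER BY rowid").fetchall()

# Step 5: after DROP TRIGGER, updates are no longer audited.
conn.execute("DROP TRIGGER trg_account_update")
conn.execute("UPDATE account SET balance = 10.0 WHERE acc_id = 1")
audit_count_after_drop = conn.execute(
    "SELECT COUNT(*) FROM account_audit"
).fetchone()[0]
```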
Identity
It is a column property in SQL Server that can be specified when creating a table, similar to creating a sequence in Oracle. Only one identity column is allowed per table, generally the PK.
CREATE TABLE Test1 (Test1_ID INT NOT NULL IDENTITY(1, 1)); -- IDENTITY(seed, increment): start with 1, increment by 1
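The behaviour of an automatically generated key can be tried out with a rough analogue. SQLite has no IDENTITY property, so this sketch assumes INTEGER PRIMARY KEY AUTOINCREMENT as the closest equivalent; the label column is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# AUTOINCREMENT here plays the role of IDENTITY(1, 1): start at 1, step by 1.
conn.execute(
    "CREATE TABLE Test1 (Test1_ID INTEGER PRIMARY KEY AUTOINCREMENT, label TEXT)"
)

# The key column is omitted from the INSERT; the database supplies it.
first_id = conn.execute("INSERT INTO Test1 (label) VALUES ('a')").lastrowid
second_id = conn.execute("INSERT INTO Test1 (label) VALUES ('b')").lastrowid

# Deleting a row does not make its key value available again.
conn.execute("DELETE FROM Test1 WHERE Test1_ID = ?", (second_id,))
third_id = conn.execute("INSERT INTO Test1 (label) VALUES ('c')").lastrowid
```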
Functions
A function is a programmable object used to do one or more calculations and return a value to a calling application, or to integrate values into a result set. Functions can manipulate data and return a value, but they cannot modify any data within the repositories.
Built-in functions are global functions available to all developers of SQL Server applications. Even though they are not portable, it is better to use them than to write new, more portable functions, because they perform better within SQL Server.
User-defined functions (can be created only to manipulate data). The following statements are allowed inside one:
DECLARE statements that define data variables and cursors local to the function
SET statements that assign values to scalar and table local variables
Cursor operations, including FETCH statements that assign values to local variables using the INTO clause
SELECT statements whose select lists contain expressions that assign values to variables local to the function
UPDATE, INSERT and DELETE statements that modify table variables local to the function
EXECUTE statements that call an extended stored procedure
User-defined function | Stored procedure
Must return a value (a single result set) | Can return a value or even multiple result sets
Returns table variables | Cannot return a table variable, although it can create a table
Can be used directly in SELECT, ORDER BY, WHERE and FROM clauses | Cannot be used in a SELECT statement
Cannot change server environment variables | Can change server environment variables
Always stops execution when an error occurs | Can continue after an error if proper error handling code is used
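The point that a user-defined function can be used directly inside SELECT, WHERE and ORDER BY can be sketched with SQLite's create_function from Python. This is not T-SQL's CREATE FUNCTION; the function with_tax, its 15% rate and the item table are hypothetical, and prices are kept in whole cents to avoid rounding issues.

```python
import sqlite3

def with_tax(amount_cents):
    """Hypothetical scalar function: add 15% tax to an amount held in cents."""
    if amount_cents is None:
        return None
    return amount_cents * 115 // 100

conn = sqlite3.connect(":memory:")
# Register the Python function so SQL statements can call it by name.
conn.create_function("with_tax", 1, with_tax)

conn.execute("CREATE TABLE item (name TEXT, price_cents INTEGER)")
conn.executemany("INSERT INTO item VALUES (?, ?)", [("ski", 20000), ("wax", 1000)])

# The function returns a value and appears in three clauses; it changes no data.
rows = conn.execute(
    """SELECT name, with_tax(price_cents) AS gross_cents
       FROM item
       WHERE with_tax(price_cents) > 10000
       ORDER BY with_tax(price_cents)"""
).fetchall()
```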
Data Warehouse
It is an integrated, subject-oriented, time-variant and non-volatile database that supports management decision making in an organization. It is needed to provide an integrated, company-wide view of high-quality information from disparate databases, and to separate operational and informational systems and data for improved performance.
Operational or Transaction Processing system
Substitutes computer-based processing for manual procedures
Deals with well-structured routine processes
Data warehouse: motivation
Massive amounts of data from business transactions
Improvements in IT
Intense competition for customers' attention
Characteristic | Operational data | DS data
Timespan | represents current transactions | tends to cover a long time frame
Granularity | represents specific transactions that occur at a given time | presented at different levels of aggregation
Dimensionality | focuses on representing atomic transactions | can be analyzed from multiple dimensions
When we create a data warehouse, the operational data comes from both external and internal sources. We extract, filter, transform, integrate, classify, aggregate and summarise it, and then load it into the data warehouse so that it is integrated, subject-oriented, time-variant and non-volatile.
Star schema
Facts, dimensions, attributes, attribute hierarchies
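A star schema with these elements can be sketched as one fact table surrounded by dimension tables. The example uses Python with SQLite; the tables fact_sales, dim_date and dim_product and all their rows are hypothetical. The year, quarter and month columns form an attribute hierarchy along which the facts are rolled up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimensions: descriptive attributes; dim_date holds a year > quarter > month hierarchy.
    CREATE TABLE dim_date (date_id INTEGER PRIMARY KEY, year INTEGER, quarter TEXT, month TEXT);
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, category TEXT, name TEXT);

    -- Facts: numeric measures keyed by the dimensions.
    CREATE TABLE fact_sales (
        date_id    INTEGER REFERENCES dim_date(date_id),
        product_id INTEGER REFERENCES dim_product(product_id),
        units      INTEGER,
        revenue    REAL
    );

    INSERT INTO dim_date VALUES (1, 2024, 'Q1', 'Jan'), (2, 2024, 'Q1', 'Feb'), (3, 2024, 'Q2', 'Apr');
    INSERT INTO dim_product VALUES (1, 'Skis', 'Alpine ski'), (2, 'Boots', 'Snow boot');
    INSERT INTO fact_sales VALUES (1, 1, 2, 600.0), (2, 1, 1, 300.0), (2, 2, 3, 450.0), (3, 2, 1, 150.0);
""")

# Lowest level of the date hierarchy: revenue per month.
by_month = conn.execute(
    """SELECT d.month, SUM(f.revenue)
       FROM fact_sales f JOIN dim_date d ON d.date_id = f.date_id
       GROUP BY d.year, d.quarter, d.month
       ORDER BY MIN(d.date_id)"""
).fetchall()

# Roll up one level: revenue per quarter.
by_quarter = conn.execute(
    """SELECT d.quarter, SUM(f.revenue)
       FROM fact_sales f JOIN dim_date d ON d.date_id = f.date_id
       GROUP BY d.year, d.quarter
       ORDER BY MIN(d.date_id)"""
).fetchall()

# The same facts analysed along a different dimension.
by_category = conn.execute(
    """SELECT p.category, SUM(f.revenue)
       FROM fact_sales f JOIN dim_product p ON p.product_id = f.product_id
       GROUP BY p.category
       ORDER BY p.category"""
).fetchall()
```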
Data mining
Steps in data reconciliation
capture - scrub - transform - load & index
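The four steps can be sketched as a tiny pipeline in Python with SQLite as the target. The source rows, the cleaning rules in scrub, the monthly aggregation in transform and the table monthly_sales are all hypothetical choices for illustration; a real reconciliation process would apply the organisation's own rules.

```python
import sqlite3

# Capture: a snapshot of rows extracted from a source system.
source_rows = [
    {"cust": " ana ", "amount": "100.50", "date": "2024-01-05"},
    {"cust": "BEN", "amount": "n/a", "date": "2024-01-06"},   # dirty value
    {"cust": "Ana", "amount": "20", "date": "2024-02-01"},
]

def scrub(rows):
    """Scrub: drop rows whose amount is not numeric; trim and normalise names."""
    clean = []
    for r in rows:
        try:
            amount = float(r["amount"])
        except ValueError:
            continue
        clean.append({"cust": r["cust"].strip().title(), "amount": amount, "date": r["date"]})
    return clean

def transform(rows):
    """Transform: aggregate to one row per customer and month."""
    totals = {}
    for r in rows:
        key = (r["cust"], r["date"][:7])
        totals[key] = totals.get(key, 0.0) + r["amount"]
    return [(cust, month, total) for (cust, month), total in sorted(totals.items())]

def load_and_index(conn, rows):
    """Load & index: write the rows to the warehouse table and index it."""
    conn.execute("CREATE TABLE monthly_sales (cust TEXT, month TEXT, total REAL)")
    conn.executemany("INSERT INTO monthly_sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_monthly_sales_cust ON monthly_sales (cust)")

conn = sqlite3.connect(":memory:")
load_and_index(conn, transform(scrub(source_rows)))
loaded = conn.execute("SELECT * FROM monthly_sales ORDER BY cust, month").fetchall()
```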
Data warehouse pitfalls
getting reconciled metadata for the enterprise data model
getting clean data values
communication problems
lack of technical expertise
poor planning
Database architecture
Database engine
is the core service for storing, processing and securing data
provides controlled access and rapid transaction processing to meet requirements, as well as rich support for sustaining high availability
is used to create relational databases for OLTP or OLAP data
includes creating tables and other database objects for viewing, managing and securing data
SQL Server Management Studio can be used to manage the database objects
SQL Server Profiler can be used for capturing server events
Analysis Services: multidimensional data
It provides fast, intuitive, top-down analysis of large quantities of data built on a unified data model, which can be delivered to users in multiple languages and currencies.
It also works with data warehouses, data marts, production databases and operational data stores, supporting analysis of both historical and real-time data.
OLAP data can be disaggregated and aggregated along a dimension according to its natural hierarchy.
Issues are query performance and reliability, integration and flexibility, capacity and scalability, exponential database growth, total cost of ownership and rapid technological change.
Intelligence density
BI is what is achieved for the decision makers in a business through techniques and tools such as MSS, data warehousing, OLAP and data mining, used either individually or in a combination of two or more.
