Sandeep Kar
Definition
Database design is the process of producing a detailed data model of a database. Data modeling in software engineering is the process of creating a data model by applying formal data model descriptions using data modeling techniques. Data modeling is a technique for defining business requirements for a database.
Model
Conceptual
Logical
Physical
Modeling techniques
Conceptual Data Model: A conceptual data model identifies the highest-level relationships between the different entities.
Logical Data Modeling uses the Entity-Relationship (E/R) Modeling technique.
Physical Data Modeling uses the Relational Modeling technique.
Logical Data Independence: The ability to change the logical (conceptual) schema without changing the external schema is called logical data independence. E.g., views can shield users from changes in the structure of the real tables.
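The point about views shielding users from table changes can be sketched with Python's built-in sqlite3 module. The table and view names (customer, customer2, customer_v) and the sample row are assumptions made up for illustration; they do not come from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table plus the view that applications query.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, full_name TEXT)")
conn.execute("INSERT INTO customer VALUES (1, 'Ann Lee')")
conn.execute("CREATE VIEW customer_v AS SELECT customer_id, full_name FROM customer")
before = conn.execute("SELECT full_name FROM customer_v").fetchall()

# Logical schema change: the name is now stored as two columns in a new table.
conn.execute("DROP VIEW customer_v")
conn.execute(
    "CREATE TABLE customer2 ("
    "customer_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT)"
)
conn.execute("INSERT INTO customer2 VALUES (1, 'Ann', 'Lee')")
conn.execute("DROP TABLE customer")

# The view is redefined to expose the same columns as before,
# so queries written against customer_v do not need to change.
conn.execute(
    "CREATE VIEW customer_v AS "
    "SELECT customer_id, first_name || ' ' || last_name AS full_name FROM customer2"
)
after = conn.execute("SELECT full_name FROM customer_v").fetchall()
```

The query against customer_v is identical before and after the restructuring, which is the sense in which the external schema stays unchanged.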
Data Modeler
DBA
Data Architect
Business Analyst
---
Data Requirements Use Cases LDM Logical and Physical Data Models
Data Modeler
DBA
---
---
Data Architect
Performs the detailed physical design
Creates the DDL and/or required DML
Implements the model into the database
Reviews the Data Models and assesses the completeness for the physical implementation
Transaction
TRANSACTION IDENTIFIER: Number
CUSTOMER NUMBER: Number (FK)
TRANSACTION AMOUNT: Number
TRANSACTION DATE: Datetime
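As a rough sketch, the Transaction entity above might translate into DDL along the following lines, run here through Python's sqlite3 module. Several things are assumptions: the table is named customer_transaction to avoid the SQL keyword TRANSACTION, the transaction identifier is taken to be the primary key, and the parent customer table is hypothetical because the slide only shows the foreign key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# Hypothetical parent table; the slide shows only the child entity.
conn.execute("CREATE TABLE customer (customer_number INTEGER PRIMARY KEY)")
conn.execute("""
    CREATE TABLE customer_transaction (
        transaction_identifier INTEGER PRIMARY KEY,  -- assumed to be the key
        customer_number        INTEGER NOT NULL
                               REFERENCES customer (customer_number),
        transaction_amount     NUMERIC,
        transaction_date       TEXT  -- SQLite has no native DATETIME type
    )
""")

conn.execute("INSERT INTO customer VALUES (100)")
conn.execute(
    "INSERT INTO customer_transaction VALUES (1, 100, 25.50, '2010-01-15 09:30:00')"
)

# A transaction that points at a non-existent customer should be rejected.
try:
    conn.execute(
        "INSERT INTO customer_transaction VALUES (2, 999, 10, '2010-01-16 10:00:00')"
    )
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customer_transaction").fetchone()[0]
```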
Relationships
Identifying Relationship
Non-Identifying Relationship
Weak Relationship
Binary relationships can be:
One-to-one (1:1)
One-to-many (1:*)
Many-to-many (*:*)
Each of these is modelled differently.
Understanding the 1:* type is particularly important: many real-world relationships are of type 1:*.
Identifying Relationships
An identifying relationship means that the child table cannot be uniquely identified without the parent. For example, this situation arises in the intersection table used to resolve a many-to-many relationship, where the intersection table's primary key is a composite of the primary keys of the left and right (parent) tables.
Example:
Account (AccountID, AccountNum, AccountTypeID)
PersonAccount (AccountID, PersonID, Balance)
Person (PersonID, Name)
The Account to PersonAccount relationship and the Person to PersonAccount relationship are identifying because a child row (PersonAccount) cannot exist without having been defined in both parents (Account and Person). In other words, there is no PersonAccount when there is no Person or when there is no Account.
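A minimal sketch of the Account / PersonAccount / Person example, using Python's sqlite3 module. The table and column names follow the example above; the column types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Person  (PersonID INTEGER PRIMARY KEY, Name TEXT);
    CREATE TABLE Account (AccountID INTEGER PRIMARY KEY, AccountNum TEXT,
                          AccountTypeID INTEGER);
    -- Identifying relationships: both parent keys are part of the child's key.
    CREATE TABLE PersonAccount (
        AccountID INTEGER NOT NULL REFERENCES Account (AccountID),
        PersonID  INTEGER NOT NULL REFERENCES Person (PersonID),
        Balance   NUMERIC,
        PRIMARY KEY (AccountID, PersonID)
    );
    INSERT INTO Person  VALUES (1, 'Ann');
    INSERT INTO Account VALUES (10, 'ACC-10', 1);
    INSERT INTO PersonAccount VALUES (10, 1, 250);
""")

def insert_fails(sql):
    """Return True when the statement violates a constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# No Person with PersonID 2 exists, so this child row cannot exist either.
orphan_rejected = insert_fails("INSERT INTO PersonAccount VALUES (10, 2, 0)")
# The composite key (AccountID, PersonID) identifies each child row uniquely.
duplicate_rejected = insert_fails("INSERT INTO PersonAccount VALUES (10, 1, 0)")

# Column 1 of PRAGMA table_info is the name; column 5 is the position in the PK.
pk_columns = [row[1] for row in conn.execute("PRAGMA table_info(PersonAccount)")
              if row[5] > 0]
```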
Non-Identifying Relationships
A non-identifying relationship is one where the child can be identified independently of the parent (Account - AccountType).
Example:
Account (AccountID, AccountNum, AccountTypeID)
AccountType (AccountTypeID, Code, Name, Description)
The relationship between Account and AccountType is non-identifying because each Account (the child) is identified by AccountID alone; the foreign key AccountTypeID is not part of its primary key.
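For contrast, a sketch of the non-identifying case with sqlite3. The table and column names come from the example above; the types and sample rows are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE AccountType (AccountTypeID INTEGER PRIMARY KEY, Code TEXT,
                              Name TEXT, Description TEXT);
    -- Non-identifying: AccountTypeID is a plain foreign key column,
    -- not part of the child's primary key.
    CREATE TABLE Account (
        AccountID     INTEGER PRIMARY KEY,
        AccountNum    TEXT,
        AccountTypeID INTEGER REFERENCES AccountType (AccountTypeID)
    );
    INSERT INTO AccountType VALUES (1, 'SAV', 'Savings', 'Interest-bearing account');
    INSERT INTO Account VALUES (10, 'ACC-10', 1);
""")

# An Account row is found by AccountID alone, without reference to its type.
row = conn.execute("SELECT AccountNum FROM Account WHERE AccountID = 10").fetchone()

# Column 1 of PRAGMA table_info is the name; column 5 is the position in the PK.
pk_columns = [r[1] for r in conn.execute("PRAGMA table_info(Account)") if r[5] > 0]
```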
Our task: how is a 1:* relationship between two entities at the ER model level represented in a relational model?
We assume that both participating entities are modelled as tables (as explained earlier). Do we make any changes to these tables to reflect the relationship between them?
Yes, we use a foreign key to mark the relationship.
Create a few rows of data in the tables participating in the 1:* relationship and check if the foreign key is acting as a link for information from the child table to the information from the parent table
Example data is always useful in designing foreign keys
Example
Consider the 1:* relationship Oversees between Staff and PropertyForRent:
Staff (0..1) Oversees (0..*) PropertyForRent
In this case, Staff is the parent entity because it is on the one side of the relationship.
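The Oversees relationship could be sketched as follows, with a few rows of example data to check that the foreign key links each child row to its parent. The column names (staff_no, property_no, name, city) and the sample rows are hypothetical; only the two entity names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Staff (staff_no TEXT PRIMARY KEY, name TEXT);
    -- The key of the parent (Staff) is posted into the child (PropertyForRent).
    -- It is nullable because a property is overseen by 0..1 staff members.
    CREATE TABLE PropertyForRent (
        property_no TEXT PRIMARY KEY,
        city        TEXT,
        staff_no    TEXT REFERENCES Staff (staff_no)
    );
    INSERT INTO Staff VALUES ('S1', 'Ann'), ('S2', 'Bob');
    INSERT INTO PropertyForRent VALUES
        ('P1', 'Glasgow', 'S1'),
        ('P2', 'London',  'S1'),
        ('P3', 'Leeds',   NULL);
""")

# Follow the foreign key from each property to the staff member who oversees it.
rows = conn.execute("""
    SELECT p.property_no, s.name
    FROM PropertyForRent AS p
    LEFT JOIN Staff AS s ON s.staff_no = p.staff_no
    ORDER BY p.property_no
""").fetchall()
```

One staff member (S1) appears against several properties, while each property carries at most one staff_no, which is the 1:* shape.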
Here, the *:* relationship between Client and PropertyForRent is directly represented as a new table, Viewing. The primary key of the new table includes the two foreign keys from the two participating entities.
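A sketch of the Viewing table in sqlite3, under the assumption that its primary key consists of exactly the two foreign keys. The column names (client_no, property_no, name, city) and the sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE Client (client_no TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE PropertyForRent (property_no TEXT PRIMARY KEY, city TEXT);
    -- The new table carries one foreign key per participating entity.
    CREATE TABLE Viewing (
        client_no   TEXT NOT NULL REFERENCES Client (client_no),
        property_no TEXT NOT NULL REFERENCES PropertyForRent (property_no),
        PRIMARY KEY (client_no, property_no)
    );
    INSERT INTO Client VALUES ('C1', 'Ann'), ('C2', 'Bob');
    INSERT INTO PropertyForRent VALUES ('P1', 'Glasgow'), ('P2', 'London');
    INSERT INTO Viewing VALUES ('C1', 'P1'), ('C1', 'P2'), ('C2', 'P1');
""")

# The relationship can be navigated in both directions through Viewing.
properties_seen_by_c1 = [r[0] for r in conn.execute(
    "SELECT property_no FROM Viewing WHERE client_no = 'C1' ORDER BY property_no")]
clients_who_saw_p1 = [r[0] for r in conn.execute(
    "SELECT client_no FROM Viewing WHERE property_no = 'P1' ORDER BY client_no")]
```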
2010 Wipro Ltd - Confidential
In the 1:1 case, the max (cardinality) constraints, which are 1:1, do not help to identify the parent table. Therefore we use the min (participation) constraints to identify the parent table. For example, we choose the entity with min value zero as the parent entity if the other participating entity has a min value of one.
Complex Relationships
Complex relationships can also be simplified into 1:1 or 1:* relationships first and then modelled at the logical level. Alternatively, a new table can be created to represent a complex relationship; foreign keys are posted into the new table from all the participating entities.
Complexity
Physical Data Model
Model repositories:
Model Mart (ERWin/AllFusion)
Power Designer Model repository
Our Table
user (name, nickname, phone1, phone2, phone3, cell, pager, address, city, province, postal_code, country, email1, email2, web_url, company, department, picture, notes, email_format)
name          phone1        phone2        email1           email2
Mike Hillyer  403-555-1717  403-555-1919  mike@hoppen.com  mhillyer@mysite.com
Tom Jensen    403-555-1919  403-555-1313  tom@openwin.org  tom@supersite.org
Ray Smith     403-555-1919  403-555-1111  ray@cpma.com
One Solution
user
email
mike@hoppen.com
mhillyer@mysite.com
tom@openwin.org
tom@supersite.org
ray@cpma.com
Multiple rows per user
Emails are associated with only one other phone
Hard to search
Satisfying 1NF
user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, company, department, picture, notes)
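One way to picture the move to 1NF is to take the repeating email1/email2 columns out of the wide table and store one row per address. The sketch below uses the sample names and addresses from the earlier table; the user_wide staging table, the email table, and its columns (email_id, address) are assumptions, since the slide only lists the resulting user table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    -- Un-normalized starting point: repeating email columns.
    CREATE TABLE user_wide (name TEXT, email1 TEXT, email2 TEXT);
    INSERT INTO user_wide VALUES
        ('Mike Hillyer', 'mike@hoppen.com', 'mhillyer@mysite.com'),
        ('Tom Jensen',   'tom@openwin.org', 'tom@supersite.org'),
        ('Ray Smith',    'ray@cpma.com',    NULL);

    -- 1NF target: atomic name columns and one row per email address.
    CREATE TABLE user (user_id INTEGER PRIMARY KEY, first_name TEXT, last_name TEXT);
    CREATE TABLE email (
        email_id INTEGER PRIMARY KEY,
        user_id  INTEGER NOT NULL REFERENCES user (user_id),
        address  TEXT NOT NULL
    );
""")

wide_rows = conn.execute("SELECT name, email1, email2 FROM user_wide").fetchall()
for name, email1, email2 in wide_rows:
    first_name, last_name = name.split(" ", 1)
    cur = conn.execute(
        "INSERT INTO user (first_name, last_name) VALUES (?, ?)",
        (first_name, last_name),
    )
    for address in (email1, email2):
        if address is not None:  # skip empty repeating columns
            conn.execute(
                "INSERT INTO email (user_id, address) VALUES (?, ?)",
                (cur.lastrowid, address),
            )

email_counts = conn.execute("""
    SELECT u.first_name, COUNT(e.email_id)
    FROM user AS u LEFT JOIN email AS e ON e.user_id = u.user_id
    GROUP BY u.user_id ORDER BY u.user_id
""").fetchall()
```

A user can now have any number of addresses without adding columns, and a search for an address touches a single column.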
Forming Relationships
Three Forms
One to (zero or) One
One to (zero or) Many
Many to Many
One to One
Same Table?
One to Many
Place PK of the One in the Many
Many to Many
Create a joining table
Joining Tables
user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes, email_format)
department
Documentation Finance Documentation
user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes, email_format)
Satisfying 2NF
user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes, email_format)
Satisfying 3NF
user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes)
user_phone (user_id [PK, FK1], phone_id [PK, FK2], extension)
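The user_phone joining table can be sketched as below, using the phone numbers from the earlier sample rows, where three users share 403-555-1919. The phone table and its number column are assumptions inferred from the phone_id foreign key, and the user table is cut down to two columns for brevity.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
    CREATE TABLE user  (user_id INTEGER PRIMARY KEY, first_name TEXT);
    CREATE TABLE phone (phone_id INTEGER PRIMARY KEY, number TEXT UNIQUE);
    CREATE TABLE user_phone (
        user_id   INTEGER NOT NULL REFERENCES user (user_id),
        phone_id  INTEGER NOT NULL REFERENCES phone (phone_id),
        extension TEXT,
        PRIMARY KEY (user_id, phone_id)
    );
    INSERT INTO user VALUES (1, 'Mike'), (2, 'Tom'), (3, 'Ray');
    INSERT INTO phone VALUES
        (1, '403-555-1717'), (2, '403-555-1919'),
        (3, '403-555-1313'), (4, '403-555-1111');
    INSERT INTO user_phone (user_id, phone_id) VALUES
        (1, 1), (1, 2), (2, 2), (2, 3), (3, 2), (3, 4);
""")

# The shared number is stored once; the joining table records who uses it.
sharing_users = [r[0] for r in conn.execute("""
    SELECT u.first_name
    FROM user AS u
    JOIN user_phone AS up ON up.user_id = u.user_id
    JOIN phone AS p ON p.phone_id = up.phone_id
    WHERE p.number = '403-555-1919'
    ORDER BY u.user_id
""")]
```

Keeping extension on user_phone rather than on phone lets each user have a different extension on the same shared number.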
Finding Balance
user (user_id [PK], first_name, last_name, nickname, unit, street_number, street_name, street_type, quadrant, web_url, picture, notes, postal_code)
Goal of de-normalization
Only one valid reason exists for de-normalizing a relational design: to enhance performance.
De-normalization is the process of putting one fact in numerous places. This speeds data retrieval at the expense of data modification.
Normalize first, then de-normalize.
Use de-normalization only when you cannot optimize in other ways; first try temp tables, UNIONs, and VIEWs.
Always monitor and periodically re-evaluate all de-normalized applications.
Factors to weigh: I/O saved, CPU saved, complexity of update programming, and the cost of returning to a normalized design.
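The trade-off described above (faster retrieval, costlier modification) can be illustrated with a small sketch. The customer and purchase tables, the redundant customer_name column, and the rename_customer helper are all hypothetical and exist only for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE purchase (
        purchase_id   INTEGER PRIMARY KEY,
        customer_id   INTEGER REFERENCES customer (customer_id),
        customer_name TEXT,     -- de-normalized copy of customer.name
        amount        NUMERIC
    );
    INSERT INTO customer VALUES (1, 'Ann Lee');
    INSERT INTO purchase VALUES (1, 1, 'Ann Lee', 40), (2, 1, 'Ann Lee', 60);
""")

# Retrieval: the name is available without joining to customer.
names_without_join = conn.execute(
    "SELECT customer_name FROM purchase ORDER BY purchase_id").fetchall()

def rename_customer(conn, customer_id, new_name):
    """Modification: every copy of the fact must be updated together."""
    with conn:  # commits both updates as one transaction, rolls back on error
        conn.execute("UPDATE customer SET name = ? WHERE customer_id = ?",
                     (new_name, customer_id))
        conn.execute("UPDATE purchase SET customer_name = ? WHERE customer_id = ?",
                     (new_name, customer_id))

rename_customer(conn, 1, "Ann Smith")

# Consistency check: count redundant copies that disagree with the source.
mismatches = conn.execute("""
    SELECT COUNT(*) FROM purchase AS p
    JOIN customer AS c ON c.customer_id = p.customer_id
    WHERE p.customer_name <> c.name
""").fetchone()[0]
```

The extra UPDATE in rename_customer is the "complexity of update programming" that has to be weighed against the join that was saved.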
Types of De-normalization
Concepts
Surrogate Key: A unique primary key generated by the RDBMS that is not derived from any data in the database and whose only significance is to act as the primary key. A surrogate key is frequently a sequential number (e.g. a Sybase "identity column").
Candidate Key: A combination of attributes that can be used to uniquely identify a database record without any extraneous data. Each table may have one or more candidate keys. One of these candidate keys is selected as the table's primary key.
Vertical slicing: A projection is a vertical slicing of the table. You simply indicate which columns you want.
Horizontal slicing: A selection is a horizontal slicing of the table. The selection defines which records are returned out of all possible records in the table. The selection is defined in the WHERE clause of the SQL statement.
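A short sketch of a surrogate key, a projection, and a selection in sqlite3. The city values are made up for illustration, and AUTOINCREMENT stands in here for the identity-column mechanism mentioned above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (
        user_id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        first_name TEXT,
        city       TEXT
    );
    INSERT INTO user (first_name, city) VALUES
        ('Mike', 'Calgary'), ('Tom', 'Calgary'), ('Ray', 'Edmonton');
""")

# Projection (vertical slice): choose columns in the SELECT list.
projection = conn.execute(
    "SELECT first_name FROM user ORDER BY user_id").fetchall()

# Selection (horizontal slice): choose rows in the WHERE clause.
selection = conn.execute(
    "SELECT * FROM user WHERE city = 'Calgary' ORDER BY user_id").fetchall()
```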
CASE Tool
Compare
Generate DDL
DB Meta-model
Target DB
Plan
Pick the topic for analysis
Create the Conceptual Data Model
Create the LDM
Validate the LDM (formal validation, normalization, etc.)
Perform the transition from LDM to PDM using ERWin
Create the PDM
Perform the physical design (indexes, table spaces, buffer pools, partitioning, etc.)
Validate the PDM (make sure that de-normalization did not change the data integrity; validate the reference implementation through RI)
Perform the model walkthrough (attack the PDM to see if we have any holes)
Implement the OLTP model (enjoy the work of a DBA)
Play with the data model/database to see what we can and can't do (experience SQL against our masterpiece)
Define the requirements for the decision support system (understand what data analytics means and why we have to create a separate data model for it)
Create the DSS model (Star, Snowflake, Constellation, Data Mart, Data Warehouse)
Questions?
Thank You
Sandeep Kar