
Database Design

Sandeep Kar

2010 Wipro Ltd - Confidential

Topics we are going to discuss

Logical Database Design
Physical vs. Logical Database Modeling
Normalization Techniques
De-normalization Techniques
Data Modeling Tool - ERWIN


Definition
Database design is the process of producing a detailed data model of a database. Data modeling in software engineering is the process of creating a data model by applying formal data model descriptions using data modeling techniques. Data modeling is a technique for defining business requirements for a database.


Types of Data Models

Model: Conceptual, Logical, Physical


Modeling techniques
Conceptual Data Model: A conceptual data model identifies the highest-level relationships between the different entities.
Logical Data Modeling uses the Entity-Relationship (E/R) modeling technique.
Physical Data Modeling uses the relational modeling technique.
Logical Data Independence: The ability to change the logical (conceptual) schema without changing the external schema is called logical data independence. E.g. views can shield users from changes in the structure of the real tables.
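The point about views can be illustrated with a small sketch. It uses Python's built-in sqlite3 module; the table customer, the view customer_v and the column names are hypothetical and chosen only for illustration. The base table is restructured, the view is redefined, and a query against the view returns the same result before and after.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical base table plus a view that users query instead of the table.
conn.executescript("""
    CREATE TABLE customer (
        customer_number INTEGER PRIMARY KEY,
        full_name       TEXT NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Mike Hillyer');
    CREATE VIEW customer_v AS
        SELECT customer_number, full_name FROM customer;
""")
before = conn.execute("SELECT full_name FROM customer_v").fetchall()

# Restructure the real table: full_name is split into two columns.
conn.executescript("""
    DROP VIEW customer_v;
    CREATE TABLE customer_new (
        customer_number INTEGER PRIMARY KEY,
        first_name      TEXT NOT NULL,
        last_name       TEXT NOT NULL
    );
    INSERT INTO customer_new VALUES (1, 'Mike', 'Hillyer');
    DROP TABLE customer;
    ALTER TABLE customer_new RENAME TO customer;
    -- Redefine the view so that it still exposes the old shape.
    CREATE VIEW customer_v AS
        SELECT customer_number,
               first_name || ' ' || last_name AS full_name
        FROM customer;
""")
after = conn.execute("SELECT full_name FROM customer_v").fetchall()
```

Queries written against customer_v do not need to change, although the table behind the view has a different structure.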


Roles and responsibilities


[Matrix: what the roles communicate via]
Roles: Business Analyst, Data Modeler, DBA, Data Architect
Artifacts exchanged: Data Requirements, Use Cases, Logical Data Model (LDM), Physical Data Model

Business Analyst: produces Data Requirements and Use Cases
Data Modeler: Logical Data Model, Physical Data Model
DBA: Physical Data Model
Data Architect: Data Requirements, Use Cases, LDM, Logical and Physical Data Models

Performs the detailed physical design; creates the DDL and/or required DML; implements the model into the database
Reviews the Data Models and assesses their completeness for the physical implementation


Conceptual Data Model:


Includes the important entities and the relationships among them.
No attribute is specified.
No primary key is specified.
A conceptual entity-relationship model shows how the business world sees information. It suppresses non-critical details in order to emphasize business rules and user objects. It typically includes only significant entities which have business meaning, along with their relationships.


Logical database design


Derive a logical model from the information represented in the ER model (conceptual model)
Validate the logical model to check that it fulfils the client's data and transaction requirements
We focus on one type of logical model, the relational model
  Logical model = relational model
  Relational model means a collection of connected tables
Mapping from the conceptual model to the logical model mainly involves
  Designing tables with primary keys
  Linking tables with foreign keys
Logical modelling is independent of database type

Logical Data Model


Customer
  CUSTOMER NUMBER: Number
  CUSTOMER FIRST NAME: String
  CUSTOMER LAST NAME: String
  CUSTOMER BIRTH DATE: String

Customer creates Transaction / Transaction is created by a Customer

Transaction
  TRANSACTION IDENTIFIER: Number
  CUSTOMER NUMBER: Number (FK)
  TRANSACTION AMOUNT: Number
  TRANSACTION DATE: Datetime


How to read the cardinalities

Must have one and only one

May have one/Must not exceed one/Must have zero or one

May have zero, one or more

Must have at least one and possibly more


Logical Data Model


Entity is an LDM construct that has a uniform structure and stores data instances in a pre-defined format
Attribute is part of an entity and stores atomic data of uniform structure (e.g. number, string etc.)
Relationship is a modeling object that defines the dependency between two entities


Entities & their Attributes


Individual tables are derived from strong entities (entities with a clear primary key)
Fields in the tables are derived from the attributes associated with the entities
  Define the data types of the fields
Define the primary key of the table
  Unique and Not Null
Foreign keys are decided later, while modelling the relationships
  Not all tables (relations) have foreign keys
  However, the relational model is incomplete without deciding the foreign keys


Physical Data Model


Table is a physical implementation of one or more entities. It has a uniform structure and consists of records
Column is a physical implementation of an attribute; it usually (although not necessarily) stores a single value and has uniform validation rules
Reference is a physical implementation of a relationship that defines basic integrity rules in the model. It can be implemented using the database, a trigger or the application (e.g. a stored procedure or external application code)


Physical Database Modeling


Constraints are also defined, including primary keys, foreign keys, other unique keys, and check constraints. Views can be created from database tables to summarize data or to simply provide the user with another perspective of certain data. When physical modeling occurs, objects are being defined at the schema level. A schema is a group of related objects in a database. A database design effort is normally associated with one schema. Other objects such as indexes and snapshots can also be defined during physical modeling. Physical modeling is when all the pieces come together to complete the process of defining a database for a business.
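As a rough sketch of what these physical objects look like, the Customer and Transaction entities from the logical model could be implemented as follows. The example uses Python's built-in sqlite3 module; the table and column names, the CHECK rule, the index and the view are assumptions made for illustration, not part of the original model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
    CREATE TABLE customer (
        customer_number INTEGER PRIMARY KEY,
        first_name      TEXT NOT NULL,
        last_name       TEXT NOT NULL,
        birth_date      TEXT
    );
    CREATE TABLE customer_transaction (
        transaction_id   INTEGER PRIMARY KEY,
        customer_number  INTEGER NOT NULL
                         REFERENCES customer (customer_number),
        amount           NUMERIC NOT NULL CHECK (amount <> 0),  -- assumed rule
        transaction_date TEXT NOT NULL
    );
    -- An index and a summarizing view, both defined at the physical level.
    CREATE INDEX idx_transaction_customer
        ON customer_transaction (customer_number);
    CREATE VIEW customer_totals AS
        SELECT c.customer_number, SUM(t.amount) AS total_amount
        FROM customer c
        JOIN customer_transaction t
          ON t.customer_number = c.customer_number
        GROUP BY c.customer_number;

    INSERT INTO customer VALUES (1, 'Mike', 'Hillyer', NULL);
    INSERT INTO customer_transaction VALUES (10, 1, 25, '2010-05-01');
    INSERT INTO customer_transaction VALUES (11, 1, 75, '2010-05-02');
""")
totals = conn.execute("SELECT * FROM customer_totals").fetchall()


def violates(sql):
    """Return True if the statement is rejected by a constraint."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


# Unknown customer 99 breaks the foreign key; amount 0 breaks the check.
fk_rejected = violates(
    "INSERT INTO customer_transaction VALUES (12, 99, 5, '2010-05-03')")
check_rejected = violates(
    "INSERT INTO customer_transaction VALUES (13, 1, 0, '2010-05-03')")
```

The constraints live in the schema itself, so every application that writes to these tables is subject to the same rules.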


Logical Vs Physical Database Modeling


Physical modeling involves the actual design of a database according to the requirements that were established during logical modeling. Logical modeling mainly involves gathering the requirements of the business, with the latter part of logical modeling directed toward the goals and requirements of the database. Physical modeling deals with the conversion of the logical, or business model, into a relational database model. During physical modeling, objects such as tables and columns are created based on entities and attributes that were defined during logical modeling.


Relationships

Identifying Relationship
Non-Identifying Relationship
Weak Relationship
Binary relationships can be
  One-to-one (1:1)
  One-to-many (1:*)
  Many-to-many (*:*)
Each of these is modelled differently
Understanding the 1:* type is particularly important
  Many real-world relationships are of type 1:*


Identifying Relationships
An identifying relationship means that the child table cannot be uniquely identified without the parent. For example, this is the situation in the intersection table used to resolve a many-to-many relationship, where the intersection table's primary key is a composite of the primary keys of the left and right (parent) tables.
Example:
  Account (AccountID, AccountNum, AccountTypeID)
  PersonAccount (AccountID, PersonID, Balance)
  Person (PersonID, Name)
The Account to PersonAccount relationship and the Person to PersonAccount relationship are identifying because the child row (PersonAccount) cannot exist without having been defined in the parent (Account or Person). In other words: there is no PersonAccount when there is no Person or when there is no Account.
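A minimal sketch of the example above, using Python's built-in sqlite3 module. The column types and the sample rows are assumptions; the key structure follows the slide: the primary key of PersonAccount is the composite of the two parent keys.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when enabled

conn.executescript("""
    CREATE TABLE Account (
        AccountID     INTEGER PRIMARY KEY,
        AccountNum    TEXT,
        AccountTypeID INTEGER
    );
    CREATE TABLE Person (
        PersonID INTEGER PRIMARY KEY,
        Name     TEXT
    );
    -- The child is identified only by the combination of its parents' keys.
    CREATE TABLE PersonAccount (
        AccountID INTEGER NOT NULL REFERENCES Account (AccountID),
        PersonID  INTEGER NOT NULL REFERENCES Person (PersonID),
        Balance   NUMERIC,
        PRIMARY KEY (AccountID, PersonID)
    );
    INSERT INTO Account VALUES (1, 'ACC-001', 1);
    INSERT INTO Person VALUES (7, 'Mike Hillyer');
    INSERT INTO PersonAccount VALUES (1, 7, 100);
""")

# Person 8 does not exist, so this PersonAccount row cannot be created.
try:
    conn.execute("INSERT INTO PersonAccount VALUES (1, 8, 50)")
    orphan_allowed = True
except sqlite3.IntegrityError:
    orphan_allowed = False

row_count = conn.execute("SELECT COUNT(*) FROM PersonAccount").fetchone()[0]
```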


Non-Identifying Relationships
A non-identifying relationship is one where the child can be identified independently of the parent (Account - AccountType).
Example:
  Account (AccountID, AccountNum, AccountTypeID)
  AccountType (AccountTypeID, Code, Name, Description)
The relationship between Account and AccountType is non-identifying because each Account (the child) is identified by its own AccountID; the AccountTypeID foreign key that points to the parent AccountType is not part of that key.
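For contrast, a sketch of the non-identifying case in Python's sqlite3. The column types and sample values are assumptions. Here the foreign key is an ordinary column, so it can be changed without changing the identity of the Account row.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE AccountType (
        AccountTypeID INTEGER PRIMARY KEY,
        Code          TEXT,
        Name          TEXT,
        Description   TEXT
    );
    -- AccountTypeID is a plain foreign key, not part of the primary key.
    CREATE TABLE Account (
        AccountID     INTEGER PRIMARY KEY,
        AccountNum    TEXT,
        AccountTypeID INTEGER REFERENCES AccountType (AccountTypeID)
    );
    INSERT INTO AccountType VALUES (1, 'SAV', 'Savings', NULL);
    INSERT INTO AccountType VALUES (2, 'CHK', 'Checking', NULL);
    INSERT INTO Account VALUES (1, 'ACC-001', 1);
""")

# Re-pointing the account to another type leaves its key untouched.
conn.execute("UPDATE Account SET AccountTypeID = 2 WHERE AccountID = 1")
account = conn.execute(
    "SELECT AccountID, AccountTypeID FROM Account WHERE AccountID = 1"
).fetchone()
```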


One-to-many (1:*) relationships


These are the most common type of relationship
Also known as a parent:child relationship
  One parent can have many children
  The entity on the one side of the relationship is known as the parent entity
  The entity on the many side is known as the child entity
Our task: how is a 1:* relationship between two entities at the ER model level represented in a relational model?
  We assume that both participating entities are modelled as tables (as explained earlier)
  Do we make any changes to these tables to reflect the relationship between them?
    Yes, we use a foreign key to mark the relationship
We make the foreign key decision while modelling the 1:* relationship


Foreign Key Design


In a 1:* relationship
  The foreign key is designed as a column in the child table (the table on the * side)
  The foreign key references the parent table (the table on the 1 side)
In other words, when you post a foreign key to a table it means
  This table is the child table, and
  For every row in the parent table, this table may have more than one (many) corresponding rows
Create a few rows of data in the tables participating in the 1:* relationship and check that the foreign key links information in the child table to information in the parent table
  Example data is always useful in designing foreign keys


Example

Consider the 1:* relationship Oversees between Staff and PropertyForRent

Staff (0..1) Oversees (0..*) PropertyForRent

In this case,
  Staff is the parent entity
    Because it is on the one side of the relationship
  PropertyForRent is the child entity
    Because it is on the many side of the relationship
When we model this relationship at the relational level
  We assume that Staff and PropertyForRent are modelled as tables, as discussed earlier
  We post a copy of the primary key, StaffNo, from the parent entity, Staff, as a foreign key in the child entity, PropertyForRent
Our final tables are
  Staff (StaffNo, lName, fName, Position)
    Primary key StaffNo
  PropertyForRent (PropertyNo, Street, Town, StaffNo)
    Primary key PropertyNo
    Foreign key StaffNo references Staff (StaffNo)
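Following the earlier advice to create a few rows and check the foreign key, the final tables can be tried out with Python's built-in sqlite3 module. This is only a sketch: the column types and all sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Staff (
        StaffNo  TEXT PRIMARY KEY,
        lName    TEXT,
        fName    TEXT,
        Position TEXT
    );
    -- StaffNo is nullable because a property has 0..1 overseeing staff.
    CREATE TABLE PropertyForRent (
        PropertyNo TEXT PRIMARY KEY,
        Street     TEXT,
        Town       TEXT,
        StaffNo    TEXT REFERENCES Staff (StaffNo)
    );
    INSERT INTO Staff VALUES ('S1', 'Smith', 'Ray', 'Manager');
    INSERT INTO Staff VALUES ('S2', 'Jensen', 'Tom', 'Assistant');
    INSERT INTO PropertyForRent VALUES ('P1', '1 First St', 'Calgary', 'S1');
    INSERT INTO PropertyForRent VALUES ('P2', '2 Second St', 'Calgary', 'S1');
    INSERT INTO PropertyForRent VALUES ('P3', '3 Third St', 'Calgary', NULL);
""")

# Follow the foreign key from child rows back to their parent row.
overseen = conn.execute("""
    SELECT s.StaffNo, COUNT(p.PropertyNo)
    FROM Staff s
    LEFT JOIN PropertyForRent p ON p.StaffNo = s.StaffNo
    GROUP BY s.StaffNo
    ORDER BY s.StaffNo
""").fetchall()
```

One Staff row links to many PropertyForRent rows, while each PropertyForRent row points to at most one Staff row.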


Many-to-many (*:*) Relationships


There are two methods to tackle *:* relationships
Method: Create a new table to represent the relationship
  We assume that the two entities participating in the relationship are already modelled as tables, as explained earlier
  The third table is created to represent the relationship
The methods result in similar solutions
  Three tables, where one of the tables (the relationship table) links both entity tables through foreign keys


Example: Method for modelling *:* relationship

Here, the *:* relationship between Client and PropertyForRent is directly represented as a new table, Viewing
The primary key of the new table includes the two foreign keys from the two participating entities
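A possible shape for the three tables, sketched with Python's built-in sqlite3 module. The key name ClientNo, the reduced column lists and the sample rows are assumptions; only the table names come from the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE Client (ClientNo TEXT PRIMARY KEY);
    CREATE TABLE PropertyForRent (PropertyNo TEXT PRIMARY KEY);
    -- Relationship table: its primary key is made of the two foreign keys.
    CREATE TABLE Viewing (
        ClientNo   TEXT NOT NULL REFERENCES Client (ClientNo),
        PropertyNo TEXT NOT NULL REFERENCES PropertyForRent (PropertyNo),
        PRIMARY KEY (ClientNo, PropertyNo)
    );
    INSERT INTO Client VALUES ('C1'), ('C2');
    INSERT INTO PropertyForRent VALUES ('P1'), ('P2');
    INSERT INTO Viewing VALUES ('C1', 'P1'), ('C1', 'P2'), ('C2', 'P1');
""")

# Many properties per client ...
properties_of_c1 = [r[0] for r in conn.execute(
    "SELECT PropertyNo FROM Viewing WHERE ClientNo = 'C1' ORDER BY PropertyNo")]
# ... and many clients per property.
clients_of_p1 = [r[0] for r in conn.execute(
    "SELECT ClientNo FROM Viewing WHERE PropertyNo = 'P1' ORDER BY ClientNo")]
```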

One-to-one (1:1) relationships


Generally, in relationship modelling we always identify the parent table
  Then post a copy of its primary key as the foreign key in the child table
In the 1:1 case, the max (cardinality) constraints, which are 1:1, do not help to identify the parent table
Therefore we use the min (participation) constraints to identify the parent table
  For example, we choose the entity with min value zero as the parent entity if the other participating entity has a min value of one
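One way to read this rule, sketched with Python's built-in sqlite3 module. The Branch entity and the "manages" relationship are hypothetical and not taken from the slides: a staff member manages zero or one branch (min zero, so Staff is the parent), and every branch has exactly one manager (min one, so Branch is the child and receives the foreign key). A UNIQUE constraint on the foreign key keeps the relationship 1:1.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    -- Parent: optional participation (a staff member may manage no branch).
    CREATE TABLE Staff (StaffNo TEXT PRIMARY KEY);
    -- Child: mandatory participation (every branch has exactly one manager).
    CREATE TABLE Branch (
        BranchNo TEXT PRIMARY KEY,
        StaffNo  TEXT NOT NULL UNIQUE REFERENCES Staff (StaffNo)
    );
    INSERT INTO Staff VALUES ('S1'), ('S2');
    INSERT INTO Branch VALUES ('B1', 'S1');
""")

# S1 already manages B1, so a second branch for S1 is rejected.
try:
    conn.execute("INSERT INTO Branch VALUES ('B2', 'S1')")
    second_branch_allowed = True
except sqlite3.IntegrityError:
    second_branch_allowed = False
```

Placing the foreign key in the table with mandatory participation means the column can be NOT NULL, so no nulls are needed to represent staff who manage nothing.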


Complex Relationships
Complex relationships too can be simplified into simpler 1:1 or 1:* relationships first and then modelled at the logical level Alternatively, a new table can be created to represent a complex relationship. Foreign keys are posted in the new table from all the participating entities


Complexity
[Figure: complexity of the Conceptual, Logical, Physical and Dimensional Data Models]



Classification of Data Modeling tools


Enterprise Data Modeling tools:
  Rational Rose (object-oriented and relational modeling)
Data Modeling tools:
  AllFusion (ERWin): this tool is a market leader (according to GG and Forrester)
  Power Designer: this tool is a major contender
  Embarcadero's ER/Studio
Model repositories:
  Model Mart (ERWin/AllFusion)
  Power Designer Model repository


What Is Database Normalization?


Normalization is the process of efficiently organizing data in a database.
There are two goals of the normalization process:
  Eliminating redundant data
  Ensuring data dependencies make sense (only storing related data in a table)


What are the Benefits of Database Normalization?


Decreased storage requirements by effectively choosing the data type
Faster search performance!
  Smaller file for table scans
  More directed searching
Improved data integrity!


What are the Normal Forms?


First Normal Form (1NF)
Second Normal Form (2NF)
Third Normal Form (3NF)
Boyce-Codd Normal Form (BCNF)
Fourth Normal Form (4NF)
Fifth Normal Form (5NF)


Our Table

user
Columns: name, nickname, phone1, phone2, phone3, cell, pager, address, city, province, postal_code, country, email1, email2, web_url, company, department, picture, notes, email_format

Sample data:
name          phone1        phone2        email1            email2
Mike Hillyer  403-555-1717  403-555-1919  mike@hoppen.com   mhillyer@mysite.com
Tom Jensen    403-555-1919  403-555-1313  tom@openwin.org   tom@supersite.org
Ray Smith     403-555-1919  403-555-1111  ray@cpma.com


First Normal Form


Eliminate duplicative columns from the same table.
Create separate tables for each group of related data and identify each row with a unique column or set of columns (the primary key).
Benefits
  Easier to query/sort the data
  More scalable
  Each row can be identified for updating


One Solution

user
Columns: first_name, last_name, nickname, phone, cell, pager, address, city, province, postal_code, country, web_url, department, picture, notes

Sample data:
first_name  last_name  phone         email
Mike        Hillyer    403-555-1717  mike@hoppen.com
Mike        Hillyer    403-555-1919  mhillyer@mysite.com
Tom         Jensen     403-555-1919  tom@openwin.org
Tom         Jensen     403-555-1313  tom@supersite.org
Ray         Smith      403-555-1919  ray@cpma.com
Ray         Smith      403-555-1111


Multiple rows per user
Emails are associated with only one other phone
Hard to search

Satisfying 1NF

user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, company, department, picture, notes)
phone (phone_id [PK], country_code, number, extension)
email (email_id [PK], address)
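The move from the wide table to separate tables can be sketched in plain Python, using the sample rows from the "Our Table" slide. Carrying user_id in the phone and email rows is an assumption made here so that the rows can be linked back to their user; the slides only introduce that link in the next step.

```python
# Flat rows from the original table: (name, phone1, phone2, email1, email2).
flat_rows = [
    ("Mike Hillyer", "403-555-1717", "403-555-1919",
     "mike@hoppen.com", "mhillyer@mysite.com"),
    ("Tom Jensen", "403-555-1919", "403-555-1313",
     "tom@openwin.org", "tom@supersite.org"),
    ("Ray Smith", "403-555-1919", "403-555-1111",
     "ray@cpma.com", None),
]

users, phones, emails = [], [], []
for user_id, (name, phone1, phone2, email1, email2) in enumerate(flat_rows, 1):
    first_name, last_name = name.split(" ", 1)
    users.append((user_id, first_name, last_name))
    # Each repeated column becomes one row per value; empty values are dropped.
    for number in (phone1, phone2):
        if number:
            phones.append((user_id, number))
    for address in (email1, email2):
        if address:
            emails.append((user_id, address))
```

After the split, a user can have any number of phone numbers or email addresses without adding phone4 or email3 columns.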


Forming Relationships
Three Forms
One to (zero or) One One to (zero or) Many Many to Many

One to One
Same Table?

One to Many
Place PK of the One in the Many

Many to Many
Create a joining table


Joining Tables

user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes, email_format)
user_phone (phone_id [PK, FK1], user_id [PK], type)
phone (phone_id [PK], country_code, number, extension)
email (address [PK], user_id [FK1])


Our User Table

first_name  last_name  company  department
Mike        Hillyer    MySQL    Documentation
Tom         Jensen     CPNS     Finance
Ray         Smith      CPNS     Documentation

user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes, email_format)
user_phone (phone_id [PK, FK1], user_id [PK], type)
phone (phone_id [PK], country_code, number, extension)
email (address [PK], user_id [FK1])


Second Normal Form


Meet all the requirements of the first normal form.
Remove subsets of data that apply to multiple rows of a table and place them in separate tables.
Create relationships between these new tables and their predecessors through the use of foreign keys.
Benefits
  Increased storage efficiency
  Less data repetition


Satisfying 2NF

email (address [PK], type, user_id [FK1])
user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes, email_format)
user_phone (user_id [PK, FK1], phone_id [PK, FK2])
phone (phone_id [PK], country_code, number, extension, type)
user_company (user_id [PK, FK1], company_id [PK, FK2], department)
company (company_id [PK], name)


Third Normal Form


Table must be in Second Normal Form
  If your table is 2NF, there is a good chance it is 3NF
Remove columns that are not dependent upon the primary key.
All attributes that are not part of the key must not depend on any other non-key attributes.
Benefits
  No extraneous data


Satisfying 3NF

user (user_id [PK], first_name, last_name, nickname, address, city, province, postal_code, country, web_url, picture, notes)
user_phone (user_id [PK, FK1], phone_id [PK, FK2], extension)
phone (phone_id [PK], country_code, number, type)
email (address [PK], user_id [FK1], format)
user_company (user_id [PK, FK1], company_id [PK, FK2], department)
company (company_id [PK], name)
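A reduced sketch of the user, user_phone and phone tables from this design, built with Python's built-in sqlite3 module and loaded with the sample phone numbers used earlier. The column types, the UNIQUE constraint on number and the id values are assumptions; most columns are left out to keep the example short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
    CREATE TABLE user (
        user_id    INTEGER PRIMARY KEY,
        first_name TEXT,
        last_name  TEXT
    );
    CREATE TABLE phone (
        phone_id     INTEGER PRIMARY KEY,
        country_code TEXT,
        number       TEXT UNIQUE,   -- assumed: each number is stored once
        type         TEXT
    );
    CREATE TABLE user_phone (
        user_id   INTEGER NOT NULL REFERENCES user (user_id),
        phone_id  INTEGER NOT NULL REFERENCES phone (phone_id),
        extension TEXT,
        PRIMARY KEY (user_id, phone_id)
    );
    INSERT INTO user VALUES (1, 'Mike', 'Hillyer'), (2, 'Tom', 'Jensen'),
                            (3, 'Ray', 'Smith');
    INSERT INTO phone (phone_id, number) VALUES
        (1, '403-555-1717'), (2, '403-555-1919'),
        (3, '403-555-1313'), (4, '403-555-1111');
    INSERT INTO user_phone (user_id, phone_id) VALUES
        (1, 1), (1, 2), (2, 2), (2, 3), (3, 2), (3, 4);
""")

phone_rows = conn.execute("SELECT COUNT(*) FROM phone").fetchone()[0]

# The shared number is stored once and linked to three users.
shared_by = [r[0] for r in conn.execute("""
    SELECT u.first_name
    FROM user u
    JOIN user_phone up ON up.user_id = u.user_id
    JOIN phone p ON p.phone_id = up.phone_id
    WHERE p.number = '403-555-1919'
    ORDER BY u.first_name
""")]
```

The six phone entries of the original table collapse to four phone rows, because 403-555-1919 is shared by all three users.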


Finding Balance

user (user_id [PK], first_name, last_name, nickname, unit, street_number, street_name, street_type, quadrant, web_url, picture, notes, postal_code [FK1])
user_phone (user_id [PK, FK1], phone_id [PK, FK2], extension)
phone (phone_id [PK], type_id [FK1], area_code, NXX, NCX, country_id [FK2])
type (type_id [PK], type)
country (country_id [PK], Name, phone_code)
email (address [PK], user_id [FK1], format)
user_department (user_id [PK, FK1], department_id [PK, FK2])
department (department_id [PK], name, company_id [FK1])
company (company_id [PK], name)
postal_code (postal_code [PK], city_id [FK1])
city (city_id [PK], name, province_id [FK1])
province (province_id [PK], Name, Abbreviation, country_id [FK1])


Goal of de-normalization
Only one valid reason exists for de-normalizing a relational design: to enhance performance.
De-normalization is the process of putting one fact in numerous places. This speeds data retrieval at the expense of data modification.
Normalize first, then de-normalize
Use only when you cannot optimize
  Try temp tables, UNIONs, VIEWs
Always monitor and periodically re-evaluate all de-normalized applications.
  Consider: I/O saved, CPU saved, complexity of update programming, cost of returning to a normalized design
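The trade-off can be made concrete with a small sketch in Python's built-in sqlite3 module. The tables user_norm and user_denorm, their columns and the renamed company are hypothetical. The same fact (a company name) is stored once in the normalized design and once per user in the de-normalized design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Normalized: the company name is stored in exactly one place.
    CREATE TABLE company (company_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE user_norm (
        user_id    INTEGER PRIMARY KEY,
        first_name TEXT,
        company_id INTEGER REFERENCES company (company_id)
    );
    -- De-normalized: the company name is copied into every user row.
    CREATE TABLE user_denorm (
        user_id      INTEGER PRIMARY KEY,
        first_name   TEXT,
        company_name TEXT
    );
    INSERT INTO company VALUES (1, 'MySQL'), (2, 'CPNS');
    INSERT INTO user_norm VALUES (1, 'Mike', 1), (2, 'Tom', 2), (3, 'Ray', 2);
    INSERT INTO user_denorm VALUES
        (1, 'Mike', 'MySQL'), (2, 'Tom', 'CPNS'), (3, 'Ray', 'CPNS');
""")

# Modification: one row changes in the normalized design, two in the other.
norm_updates = conn.execute(
    "UPDATE company SET name = 'CPNS Ltd' WHERE company_id = 2").rowcount
denorm_updates = conn.execute(
    "UPDATE user_denorm SET company_name = 'CPNS Ltd' "
    "WHERE company_name = 'CPNS'").rowcount

# Retrieval: the normalized design needs a join, the other does not.
joined = conn.execute("""
    SELECT u.first_name, c.name
    FROM user_norm u JOIN company c ON c.company_id = u.company_id
    ORDER BY u.user_id
""").fetchall()
direct = conn.execute(
    "SELECT first_name, company_name FROM user_denorm ORDER BY user_id"
).fetchall()
```

Both designs return the same answer, but the de-normalized one has to keep every copy of the fact in step on each update.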


Types of De-normalization


Concepts
Surrogate Key: A unique primary key generated by the RDBMS that is not derived from any data in the database and whose only significance is to act as the primary key. A surrogate key is frequently a sequential number (e.g. a Sybase "identity column").
Candidate Key: A combination of attributes that can be used to uniquely identify a database record without any extraneous data. Each table may have one or more candidate keys. One of these candidate keys is selected as the table's primary key.
Vertical slicing: A projection is a vertical slicing of the table. You simply indicate which columns you want.
Horizontal slicing: A selection is a horizontal slicing of the table. The selection defines which records are returned out of all possible records in the table. The selection is defined in the WHERE clause of the SQL statement.
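These concepts can be tried out with Python's built-in sqlite3 module. The table layout is an assumption; the sample rows reuse names from the earlier slides. SQLite's AUTOINCREMENT stands in for a generated surrogate key, the column list of the SELECT performs the projection, and the WHERE clause performs the selection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE user (
        user_id    INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
        first_name TEXT,
        last_name  TEXT,
        company    TEXT
    );
    -- No user_id is supplied: the database generates it.
    INSERT INTO user (first_name, last_name, company) VALUES
        ('Mike', 'Hillyer', 'MySQL'),
        ('Tom', 'Jensen', 'CPNS'),
        ('Ray', 'Smith', 'CPNS');
""")

surrogate_keys = [r[0] for r in conn.execute(
    "SELECT user_id FROM user ORDER BY user_id")]

# Projection (vertical slice): choose columns, keep all rows.
projection = conn.execute(
    "SELECT first_name, company FROM user ORDER BY user_id").fetchall()

# Selection (horizontal slice): choose rows with WHERE, keep all columns.
selection = conn.execute(
    "SELECT * FROM user WHERE company = 'CPNS' ORDER BY user_id").fetchall()
```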


Using the Data Modeling tools


[Diagram: CASE tool workflow. Elements: Model, Meta-model, CASE tool, CASE tool repository, Compare, Generate DDL, DB meta-model, Target DB]


Data Modeling Tool - ERWIN


The purpose of Model Mart in Erwin is to allow centralized sharing of models.
Automate programming tasks, and forward and reverse engineer databases; this includes the data types, macros and model storage that a data modeler must address.
We can generate the Erwin metamodel to create a data dictionary that stores information about the data structures used in Erwin models:
  Attribute Name: the name as it appears in the logical model.
  Data Type: lists the default data type assigned to the table column. The actual data type used depends on which target server you use to hold the Erwin dictionary.
  Description: indicates the type of information stored by the attribute.
  Valid Values: lists the valid values, the reference table that contains the valid value list, or the validation expression for the attribute. Where no validation rule has been defined, the data must conform to the attribute's data type.
  FK: indicates whether the attribute is a foreign key (FK) attribute. This column is blank for non-foreign key attributes.

Plan
Pick the topic for analysis
Create the Conceptual Data Model
Create LDM
Validate the LDM (formal validation, normalization etc.)
Perform the transition from LDM to PDM using ERWin
Create PDM
Perform the physical design (indexes, table spaces, buffer pools, partitioning etc.)
Validate the PDM (make sure that de-normalization did not change the data integrity, validate reference implementation through RI)
Perform the model walkthrough (attack the PDM to see if we have any holes)
Implement OLTP model (enjoy the work of DBA)
Play with Data Model/Database to see what we can/can't do (experience the SQL against our masterpiece)
Define the requirements for Decision Support System (understand what Data Analytics means and why we have to create a separate data model for this)
Create DSS model (Star, Snowflake, Constellation, Data Mart, Data Warehouse)


Questions?

Thank You
Sandeep Kar

Module Leader sandeep.kar@wipro.com

