
INFO445 Advanced Database Design, Management, and Maintenance

Lecture 1: Course Overview, Review of DB Systems & Design


Professor Melody Y. Ivory-Ndiaye

Intros

Course Objectives
Design and implement a relational database application
Explore the social implications of integrating two databases and create a policy document
Use a range of technologies and tools relevant to database applications, OLAP, and data mining, including MySQL, PHP, and WEKA
Orally present projects
Build a community of scholarship for each other

Course Structure
Lectures
Cover concepts briefly
Practice activities
Project discussions

Project
Hands-on experience
7 deliverables

Oral presentation

Lab Assignments (4)


Build skills
Design and development
First two assignments are to prepare you for the project

Exams (2)
Demonstrate learning

Lectures
Schedule (Tues/Thurs 1:30-3:20pm)
Slides online beforehand; students to print

Activities to reinforce concepts


Distributed in class; online afterwards

Project discussions on Thursdays

Lectures - Topics
Review of DB systems and design
Advanced DB systems (hierarchical, object-oriented, spatial, data warehousing, ...)
Advanced RDBMS (architecture, storage engine, query engine, transaction processing, ...)
Data warehousing & OLAP
Data mining
Social implications (security, privacy, aggregation, ...)

Lab Assignments
All four assignments based on the same dataset
Data about colleges

Assignments
Database Design
Database Implementation and Application (MySQL, PHP)
Online analytical processing (OLAP)
Data mining (WEKA)

Students responsible for hardware/software setup to complete assignments


Nathan can provide a lot of help on this setup and UW resources

Projects
Integrated web-based database application
Student-defined
Will explore possibilities today and Thursday
Database application requirements
Modest size and complexity
Use of transactions and security mechanisms
Store private information about people
Database integration
Social and technical impacts of merging two databases
Other requirements
Oral presentation

Students responsible for hardware/software setup to complete deliverables


Nathan can provide a lot of help on this setup and UW resources

Projects
7 Deliverables
Project Proposal
Discovery and Technology Selection
Database Design
Database Design Optimization
Database Implementation
Database Application
Database Integration Policy Document

Other Course Info


Office Hours:
Tues 11am-12pm, 330C MGH
TA: Nathan SaintClair, TBD

Course Web Site


http://www.ischool.washington.edu/myivory/teach/info445-s03/index.html
Lecture slides and activities, assignment submission, communication, ...

INFO340 Course Web Site


http://www.ischool.washington.edu/myivory/teach/info340-w03/schedule.html

Other Course Info


Textbook
Required (from INFO340):
Ramakrishnan, R. and Gehrke, J. 2003. Database Management Systems. Third Edition. New York: McGraw-Hill.
Extremely difficult to read

May consider:
Connolly, T. M. and Begg, C. E. 2001. Database Systems: A Practical Approach to Design, Implementation, and Management. Third Edition. New York: Addison-Wesley Publishing.
Available online (e.g., Amazon)

Success Tips
In general
Attend lectures
Complete required readings beforehand
Do class activities
Take advantage of office hours

Projects
Start early on deliverables
Set up hardware/software early
Prepare for project discussions
Meet regularly and distribute workload (if working collaboratively)
Students are encouraged to work collaboratively
Choose team members with complementary skills

Lab Assignments
Start assignments early
Set up hardware/software early

In Summary
This course is not a repeat of INFO340!!!
Students are expected to know a lot already
Most things will not be reviewed (at least not in detail)

Students are expected to take initiative more so than in INFO340


Spend time researching unfamiliar topics, prepping for exams, etc.

Students are expected to be comfortable with struggling


Fast pace (multiple assignments due simultaneously)
Difficult assignments with little hand-holding
Difficult book to read
And so on

Be sure that you are committed to working hard in this course


If not,


Admin
Upcoming assignment(s)
This Saturday by noon
Project Deliverable #1: Project Proposal
One-page summary
Brief discussion of potential project topics today
Longer discussion of topics and potential collaborators on Thursday
Feedback this weekend

Next Saturday by noon
Project Deliverable #2: Discovery and Technology Selection
Intended users, use scenarios, workload description
Hardware/software, DBMS comparison, web-based implementation comparison
Lab Assignment #1
Good DB design (ER diagram) for collecting and using data on colleges

Admin
No class next week (at conference)
No office hour, but accessible via email
More time to work on lab assignment and deliverable #2

Review of DB Systems
What is a DB system?
Why would you create a DB system?
When would you create a DB system?
How do you create a DB system?
Where do you create a DB system?

DB Systems
An information system that enables users to store data and query it to fulfill information-seeking needs
Data retrieval (age, income, ...)
Queries over structured records and objects stored in a database
Returns all objects that satisfy a specific query

Databases
Database: (Large) integrated collection of data
Models real-world organization and use
Helps people to keep track of things
e.g., orders, students, phone calls, purchases

We will examine relational databases, which consist of:


Entities (e.g., students, courses)
Relationships between entities (e.g., Madonna is taking INFO340)
Expressed as tables
Use SQL queries

sid | name | login | age | gpa
53666 | Jones | jones@cs | 18 | 3.4
53688 | Smith | smith@eecs | 18 | 3.2
53650 | Smith | smith@math | 19 | 3.8

Enables multi-user and remote access


Concurrency control, transactions, client-server computing
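
To make the transaction idea concrete, here is a minimal sketch. It assumes Python's built-in sqlite3 module rather than the MySQL used in the course, and the Accounts table and the transfer and balances helpers are hypothetical names chosen only for illustration. The point it tries to show is that the two updates either both take effect or neither does.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute(
                "UPDATE Accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # If this debit violates the CHECK constraint, the credit above
            # is rolled back as well.
            conn.execute(
                "UPDATE Accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances(conn):
    """Return {account id: balance} for every account."""
    return dict(conn.execute("SELECT id, balance FROM Accounts"))

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Accounts (id INTEGER PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO Accounts VALUES (?, ?)", [(1, 100), (2, 50)])
conn.commit()
```

Calling transfer(conn, 1, 2, 500) on this data would overdraw account 1, so the function returns False and both balances stay as they were.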

DB Applications
User operations
DB connection (ODBC, JDBC, etc.)

SQL statements

SQL queries
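
The layers on this slide can be sketched in a few lines. This is an illustration only: it assumes Python's sqlite3 module (a DB-API driver standing in for ODBC or JDBC), reuses the Students sample rows from the Databases slide, and the function names connect and students_named are hypothetical.

```python
import sqlite3

def connect():
    """DB connection layer: open the database and load the sample data."""
    conn = sqlite3.connect(":memory:")
    # SQL statements: create and populate the table.
    conn.execute(
        "CREATE TABLE Students (sid TEXT PRIMARY KEY, name TEXT, "
        "login TEXT, age INTEGER, gpa REAL)"
    )
    conn.executemany(
        "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
        [
            ("53666", "Jones", "jones@cs", 18, 3.4),
            ("53688", "Smith", "smith@eecs", 18, 3.2),
            ("53650", "Smith", "smith@math", 19, 3.8),
        ],
    )
    conn.commit()
    return conn

def students_named(conn, name):
    """User operation: look up students by name with a parameterized SQL query."""
    cursor = conn.execute(
        "SELECT sid, login FROM Students WHERE name = ? ORDER BY sid", (name,)
    )
    return cursor.fetchall()
```

Passing the name as a parameter (the `?` placeholder) rather than pasting it into the SQL string keeps user input from being interpreted as SQL.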


DB Design Process
Discovery
Conceptual Design
DBMS Selection
Logical Design
Schema Refinement
Physical Design
Implementation

Discovery
How will data be stored and used
What is the purpose of the database?
What are the detailed requirements for the system?
What types of users and applications will access this database?
What are the plans for the future of this database?

How do you answer these questions?

Conceptual Design
How to represent data such that it can be mapped into a logical design
What is the data model for the database?
Social security number | Person's name | Parking lot space | Date of hire | ID of the department that they work for | Name of the department that they work for | Budget for the department that they work for
123-22-3666 | Attishoo | 12 | 1/1/91 | 51 | Pharmacy | $100K
231-31-5368 | Mary | 1 | 3/3/93 | 60 | Hardware | $75K
131-23-3650 | Sam | 15 | 3/1/92 | 60 | Hardware | $75K

Conceptual Design
Entity-relationship (ER) data model: describe data for an application in terms of objects (entities) and their relationships

What are the entities and relationships in the application?
What information about these entities and relationships should we store in the database?
What are the integrity constraints or business rules that hold?

[ER diagram: Employees (ssn, name, lot), Works_In (since), Department (did, dname, budget)]

Adapted from slide by Raghu Ramakrishnan

DBMS Selection
How do you choose a DBMS for your DB application?
Benchmarking is used to compare different DBMSs objectively
e.g., results available at http://www.tpc.org/
Online transaction processing: TPC-C
Models a warehouse with customers, products, orders, etc.

Queries to support decisions: TPC-H

Caveat: vendors optimize systems for standard benchmarks


Might have to create your own benchmark
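
A home-grown benchmark can start very small. The sketch below is only an illustration under stated assumptions: it uses Python's sqlite3 and time modules, the Orders table and the two queries are a hypothetical workload (not TPC-C or TPC-H), and run_workload is an invented helper name. A real comparison would run the same workload against each candidate DBMS with realistic data volumes.

```python
import sqlite3
import time

def run_workload(conn, workload, repeats=50):
    """Run every (sql, params) pair `repeats` times and time the whole run."""
    executed = 0
    start = time.perf_counter()
    for _ in range(repeats):
        for sql, params in workload:
            conn.execute(sql, params).fetchall()
            executed += 1
    elapsed = time.perf_counter() - start
    return executed, elapsed

# Hypothetical schema and data, loosely in the spirit of an order-entry system.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Orders (oid INTEGER PRIMARY KEY, cust INTEGER, total REAL)")
conn.executemany(
    "INSERT INTO Orders VALUES (?, ?, ?)",
    [(i, i % 100, float(i % 37)) for i in range(5000)],
)
conn.commit()

# The query mix should mirror what your own application is expected to run.
WORKLOAD = [
    ("SELECT total FROM Orders WHERE oid = ?", (1234,)),
    ("SELECT COUNT(*), SUM(total) FROM Orders WHERE cust = ?", (42,)),
]

executed, elapsed = run_workload(conn, WORKLOAD)
print(f"{executed} queries in {elapsed:.4f} s")
```

Dividing executed by elapsed gives a rough throughput figure; timings from a single short run like this are noisy and should be repeated before drawing conclusions.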

Logical Design
Translating ER model (diagram) to relational model
Entity sets -> tables
Creating keys, domains, etc.

Relationship sets -> tables


Specifying constraints (key and participation)

Similar conversions for inheritance, aggregation, etc.

Some ER diagramming tools can generate code to create tables
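
As a sketch of this translation, the Employees / Works_In / Department diagram from the Conceptual Design slide could map to three tables as follows. The column types, the plural table name Departments, and the helpers build and try_insert are assumptions made for illustration, and SQLite (via Python's sqlite3) stands in for the course's MySQL.

```python
import sqlite3

SCHEMA = """
-- Entity sets become tables with a primary key.
CREATE TABLE Employees (
    ssn  CHAR(11) PRIMARY KEY,
    name CHAR(20),
    lot  INTEGER
);
CREATE TABLE Departments (
    did    INTEGER PRIMARY KEY,
    dname  CHAR(20),
    budget REAL
);
-- The relationship set becomes a table whose key combines the keys of the
-- participating entity sets, each declared as a foreign key.
CREATE TABLE Works_In (
    ssn   CHAR(11),
    did   INTEGER,
    since DATE,
    PRIMARY KEY (ssn, did),
    FOREIGN KEY (ssn) REFERENCES Employees(ssn),
    FOREIGN KEY (did) REFERENCES Departments(did)
);
"""

def build(conn):
    """Create the three tables with foreign-key checking switched on."""
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces FKs only when asked
    conn.executescript(SCHEMA)

def try_insert(conn, sql, params):
    """Run an INSERT; return False if a key or foreign-key constraint rejects it."""
    try:
        conn.execute(sql, params)
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
build(conn)
try_insert(conn, "INSERT INTO Employees VALUES (?, ?, ?)", ("123-22-3666", "Attishoo", 12))
try_insert(conn, "INSERT INTO Departments VALUES (?, ?, ?)", (51, "Pharmacy", 100000))
try_insert(conn, "INSERT INTO Works_In VALUES (?, ?, ?)", ("123-22-3666", 51, "1991-01-01"))
```

With these constraints in place, a Works_In row that names an unknown employee or department is rejected instead of silently stored.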


Relation Schema (SQL)


Structured query language (SQL) used for creating, manipulating, and querying relational dbs
e.g., Create Table statement used to create a new table

Students(sid: string, name: string, login: string, age: integer, gpa: real)

CREATE TABLE Students
  (sid CHAR(20),
   name CHAR(20),
   login CHAR(10),
   age INTEGER,
   gpa REAL)
Adapted from slide by Raghu Ramakrishnan


Schema Refinement
Schema refinement: the modification of a schema to improve its design
Tables have simple meaning
Database has less duplication of information
Database has fewer null values
Database has good performance

Normalization is the main approach for schema refinement


Need to understand functional dependency and keys first
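
A small sketch may help connect functional dependencies to normalization. It loads the sample employee rows from the Conceptual Design slide into one flat table, checks that the dependency did -> (dname, budget) holds in that data, and then decomposes the table so department facts are stored once. The table names EmpFlat, Employees, and Departments and the helper functions are hypothetical, budgets are written out as plain numbers, and Python's sqlite3 is assumed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE EmpFlat (ssn TEXT, name TEXT, lot INTEGER, hired TEXT,
                      did INTEGER, dname TEXT, budget INTEGER);
INSERT INTO EmpFlat VALUES
  ('123-22-3666', 'Attishoo', 12, '1/1/91', 51, 'Pharmacy', 100000),
  ('231-31-5368', 'Mary',      1, '3/3/93', 60, 'Hardware',  75000),
  ('131-23-3650', 'Sam',      15, '3/1/92', 60, 'Hardware',  75000);
""")

def fd_holds(conn):
    """True if every did maps to exactly one (dname, budget) pair in EmpFlat."""
    violations = conn.execute("""
        SELECT COUNT(*) FROM (
            SELECT did FROM EmpFlat
            GROUP BY did
            HAVING COUNT(DISTINCT dname || '|' || budget) > 1
        )
    """).fetchone()[0]
    return violations == 0

def decompose(conn):
    """Split EmpFlat so each department's name and budget are stored once."""
    conn.executescript("""
    CREATE TABLE Departments AS
        SELECT DISTINCT did, dname, budget FROM EmpFlat;
    CREATE TABLE Employees AS
        SELECT ssn, name, lot, hired, did FROM EmpFlat;
    """)
```

After decompose(conn), joining Employees to Departments on did reproduces the original rows, while the Hardware name and budget appear in only one place.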


Physical Design
Next level of DB design optimization
Goal: support typical db use patterns (workloads) efficiently
Guided by the nature of data and its intended use
Specified during all the preceding db design steps

Some tools identify workloads or optimize the DB after it is implemented and in use
e.g., SQL Server's Index Tuning Wizard and Profiler

However, it is best to tune the DB before its implementation and use



Tuning DB Performance
Goals
Minimize query response time (latency)
Minimize space utilization (efficiency)
Maximize the number of transactions processed (throughput)

Entails making good decisions about


Index selection and type
Decompositions
Query/view construction
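
The effect of index selection can be observed directly in a query plan. The sketch below is an assumption-laden illustration: it uses SQLite through Python's sqlite3 (other DBMSs expose plans differently), generates synthetic Students rows, and the index name idx_students_login and the plan helper are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students (sid TEXT, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [(str(i), f"name{i}", f"user{i}@cs", 18 + i % 5, 2.0 + (i % 20) / 10)
     for i in range(1000)],
)
conn.commit()

def plan(conn, sql):
    """Return SQLite's description of how it would execute `sql`."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # last column holds the plan text

QUERY = "SELECT name FROM Students WHERE login = 'user42@cs'"

before = plan(conn, QUERY)  # without an index, expected to scan the whole table
conn.execute("CREATE INDEX idx_students_login ON Students(login)")
after = plan(conn, QUERY)   # with the index, expected to search via the index

print(before)
print(after)
```

An index speeds up lookups on login at the cost of extra space and slower inserts and updates, which is why index choices should follow the expected workload.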


Implementation
Creating and populating the DB within the DBMS
Ideally, using SQL statements that can be rerun

Creating interface to DB
Need to support user tasks
Desktop applications
Web-based applications


DB Design Process
Discovery
Conceptual Design
DBMS Selection
Logical Design
Schema Refinement
Physical Design
Implementation

In Summary
DB systems allow you to store and retrieve data to satisfy specific information needs
Need to follow an extensive process to build them correctly


Activity II


Project Discussion
Potential DB applications?


Activity III


Review
Course Overview
Lots of work!!

To Do

Work on finalizing projects and teams (if collaborating)

DB Systems
Manage storage and access to data

DBMS
Make it easy to create and maintain DBs
