Intros
Course Objectives
Design and implement a relational database application
Explore the social implications of integrating two databases and create a policy document
Use a range of technologies and tools relevant to database applications, OLAP, and data mining, including MySQL, PHP, and WEKA
Orally present projects
Build a community of scholarship for each other
Course Structure
Lectures
Cover concepts briefly
Practice activities
Project discussions
Project
Hands-on experience
7 deliverables
Oral presentation
Exams (2)
Demonstrate learning
Lectures
Schedule (Tues/Thurs 1:30-3:20pm)
Slides online beforehand; students to print
Lectures - Topics
Review of DB systems and design
Advanced DB systems (hierarchical, object-oriented, spatial, data warehousing, ...)
Advanced RDBMS (architecture, storage engine, query engine, transaction processing, ...)
Data warehousing & OLAP
Data mining
Social implications (security, privacy, aggregation, ...)
Lab Assignments
All four assignments based on the same dataset
Data about colleges
Assignments
Database Design
Database Implementation and Application (MySQL, PHP)
Online analytical processing (OLAP)
Data mining (WEKA)
Projects
Integrated web-based database application
Student-defined Database application requirements
Will explore possibilities today and Thursday
Modest size and complexity
Use of transactions and security mechanisms
Store private information about people
Projects
7 Deliverables
Project Proposal
Discovery and Technology Selection
Database Design
Database Design Optimization
Database Implementation
Database Application
Database Integration Policy Document
May consider:
Connolly, T. M. and Begg, C. E. 2001. Database Systems: A Practical Approach to Design, Implementation, and Management (3rd Edition). New York: Addison-Wesley Publishing.
Available online (e.g., Amazon)
Success Tips
In general
Attend lectures
Complete required readings beforehand
Do class activities
Take advantage of office hours
Projects
Start early on deliverables
Set up hardware/software early
Prepare for project discussions
Meet regularly and distribute workload (if working collaboratively)
Students are encouraged to work collaboratively
Choose team members with complementary skills
Lab Assignments
Start assignments early
Set up hardware/software early
In Summary
This course is not a repeat of INFO340!!!
Students are expected to know a lot already
Most things will not be reviewed (at least not in detail)
Admin
Upcoming assignment(s)
This Saturday by noon
Project Deliverable #1: Project Proposal
One-page summary
Brief discussion of potential project topics today
Longer discussion of topics and potential collaborators on Thursday
Feedback this weekend
Project Deliverable #2: Discovery and Technology Selection
Intended users, use scenarios, workload description
Hardware/software, DBMS comparison, web-based implementation comparison
Lab Assignment #1
Good DB design (ER diagram) for collecting and using data on colleges
Admin
No class next week (at conference)
No office hour, but accessible via email
More time to work on lab assignment and deliverable #2
Review of DB Systems
What is a DB system?
Why would you create a DB system?
When would you create a DB system?
How do you create a DB system?
Where do you create a DB system?
DB Systems
An information system that enables users to store data and query it to fulfill information-seeking needs
Data retrieval (age, income, ...)
Queries over structured records and objects stored in a database
Returns all objects that satisfy a specific query
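As a rough illustration of data retrieval, the sketch below stores a few structured records and returns every record that satisfies a query on age and income. It uses Python's built-in sqlite3 module as a stand-in for a full DBMS such as MySQL; the People table, its rows, and the find_people helper are all hypothetical and exist only for this example.

```python
import sqlite3

# In-memory SQLite stands in for a full DBMS such as MySQL (assumption).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE People (name TEXT, age INTEGER, income INTEGER)")

# Hypothetical sample records.
conn.executemany(
    "INSERT INTO People VALUES (?, ?, ?)",
    [("Ana", 34, 52000), ("Ben", 19, 18000), ("Cy", 45, 91000)],
)


def find_people(conn, min_age, min_income):
    """Return every record that satisfies both conditions, ordered by name."""
    rows = conn.execute(
        "SELECT name, age, income FROM People "
        "WHERE age >= ? AND income >= ? ORDER BY name",
        (min_age, min_income),
    )
    return rows.fetchall()


matches = find_people(conn, 30, 50000)
```

The query returns all matching records rather than a ranked best guess, which is the property the slide attributes to data retrieval.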
Databases
Database: (Large) integrated collection of data
Models real-world organization and use
Helps people to keep track of things
e.g., orders, students, phone calls, purchases
DB Applications
User operations DB connection (ODBC, JDBC, etc.)
SQL statements
SQL queries
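One way to picture the layers named on this slide is a small application in which each user operation opens onto a database connection and becomes either an SQL statement or an SQL query. The sketch below is an assumption-laden example, not the course's application: Python's sqlite3 module plays the role that ODBC or JDBC would play, and the Orders table and the place_order and list_orders functions are hypothetical.

```python
import sqlite3


def open_connection():
    # sqlite3 plays the role of the ODBC/JDBC connection layer (assumption).
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE Orders (oid INTEGER PRIMARY KEY, item TEXT, qty INTEGER)"
    )
    return conn


def place_order(conn, item, qty):
    """User operation 'place an order' becomes one SQL statement (INSERT)."""
    with conn:  # commits on success, rolls back if an exception is raised
        cur = conn.execute(
            "INSERT INTO Orders (item, qty) VALUES (?, ?)", (item, qty)
        )
    return cur.lastrowid


def list_orders(conn):
    """User operation 'view orders' becomes one SQL query (SELECT)."""
    return conn.execute(
        "SELECT oid, item, qty FROM Orders ORDER BY oid"
    ).fetchall()


conn = open_connection()
first_id = place_order(conn, "notebook", 2)
place_order(conn, "pen", 10)
orders = list_orders(conn)
```

Passing values as parameters (the `?` placeholders) rather than building SQL strings by hand keeps user input out of the statement text.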
DB Design Process
Discovery
Conceptual Design
DBMS Selection
Logical Design
Schema Refinement
Discovery
How will data be stored and used?
What is the purpose of the database?
What are the detailed requirements for the system?
What types of users and applications will access this database?
What are the plans for the future of this database?
Conceptual Design
How to represent data such that it can be mapped into a logical design
What is the data model for the database?
Attributes: Social security number; Person's name; Parking lot space; Date of hire; ID of the department that they work for; Name of the department that they work for; Budget for the department that they work for

Parking lot space   Department ID   Department name   Department budget
12                  51              Pharmacy          $100K
1                   60              Hardware          $75K
15                  60              Hardware          $75K
Conceptual Design
Entity-relationship (ER) data model: describe data for an application in terms of objects (entities) and their relationships
What are the entities and relationships in the application?
What information about these entities and relationships should we store in the database?
What are the integrity constraints or business rules that hold?
[ER diagram: Employees (lot), Works_In, Department (did, budget)]
DBMS Selection
How do you choose a DBMS for your DB application?
Benchmarking is used to compare different DBMSs objectively
e.g., Results available at http://www.tpc.org/
Online transaction processing: TPC-C
Models a warehouse with customers, products, orders, etc.
Logical Design
Translating ER model (diagram) to relational model
Entity sets -> tables
Creating keys, domains, etc.
Students(sid: string, name: string, login: string, age: integer, gpa: real)
CREATE TABLE Students (sid CHAR(20), name CHAR(20), login CHAR(10), age INTEGER, gpa REAL)
Adapted from slide by Raghu Ramakrishnan
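The translation from an ER model to tables can be sketched with the Employees, Works_In, and Department example from the Conceptual Design slide. This is a minimal sketch under stated assumptions: the column names (ssn, name, lot, did, dname, budget) and types are guesses based on the attributes listed earlier, the employee row is invented, and SQLite stands in for MySQL. Each entity set becomes a table with a primary key, and the relationship set becomes a table whose foreign keys point at both entities.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is enabled.
conn.execute("PRAGMA foreign_keys = ON")

# Column names and types are assumptions made for this illustration.
conn.executescript("""
CREATE TABLE Employees (
    ssn  CHAR(11) PRIMARY KEY,
    name CHAR(20),
    lot  INTEGER
);
CREATE TABLE Departments (
    did    INTEGER PRIMARY KEY,
    dname  CHAR(20),
    budget REAL
);
CREATE TABLE Works_In (
    ssn CHAR(11) REFERENCES Employees(ssn),
    did INTEGER  REFERENCES Departments(did),
    PRIMARY KEY (ssn, did)
);
""")

# Hypothetical employee; the department values come from the sample table.
conn.execute("INSERT INTO Employees VALUES ('000-00-0001', 'Example Employee', 12)")
conn.execute("INSERT INTO Departments VALUES (51, 'Pharmacy', 100000)")
conn.commit()


def is_rejected(conn, sql, params):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        with conn:
            conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True


# Department 51 exists, department 99 does not.
ok_rejected = is_rejected(conn, "INSERT INTO Works_In VALUES (?, ?)", ("000-00-0001", 51))
bad_rejected = is_rejected(conn, "INSERT INTO Works_In VALUES (?, ?)", ("000-00-0001", 99))
```

Declaring the keys in the schema lets the DBMS enforce the integrity constraints instead of relying on application code to check them.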
Schema Refinement
Schema refinement: the modification of a schema to improve its design
Tables have simple meaning
Database has less duplication of information
Database has fewer null values
Database has good performance
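A small sketch of what reducing duplication can look like, using the department values from the Conceptual Design sample data. The wide table EmpWide, the employee identifiers, and the choice of SQLite are assumptions for illustration. The wide table repeats a department's name and budget for every employee in it; splitting it into Employees and Departments stores each department once, and a join recovers the original rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- One wide table: department name and budget repeat per employee.
CREATE TABLE EmpWide (
    ssn TEXT PRIMARY KEY, lot INTEGER,
    did INTEGER, dname TEXT, budget INTEGER
);
INSERT INTO EmpWide VALUES ('emp-1', 12, 51, 'Pharmacy', 100000);
INSERT INTO EmpWide VALUES ('emp-2', 1,  60, 'Hardware', 75000);
INSERT INTO EmpWide VALUES ('emp-3', 15, 60, 'Hardware', 75000);

-- Refined schema: each department is stored exactly once.
CREATE TABLE Departments (did INTEGER PRIMARY KEY, dname TEXT, budget INTEGER);
INSERT INTO Departments SELECT DISTINCT did, dname, budget FROM EmpWide;

CREATE TABLE Employees (
    ssn TEXT PRIMARY KEY, lot INTEGER,
    did INTEGER REFERENCES Departments(did)
);
INSERT INTO Employees SELECT ssn, lot, did FROM EmpWide;
""")

copies_before = conn.execute(
    "SELECT COUNT(*) FROM EmpWide WHERE dname = 'Hardware'"
).fetchone()[0]
copies_after = conn.execute(
    "SELECT COUNT(*) FROM Departments WHERE dname = 'Hardware'"
).fetchone()[0]

# Joining the refined tables reproduces the original rows, so nothing is lost.
original = conn.execute(
    "SELECT ssn, lot, did, dname, budget FROM EmpWide ORDER BY ssn"
).fetchall()
rejoined = conn.execute(
    "SELECT e.ssn, e.lot, d.did, d.dname, d.budget "
    "FROM Employees e JOIN Departments d ON e.did = d.did ORDER BY e.ssn"
).fetchall()
```

After the split, changing the Hardware budget touches one row instead of one row per employee.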
Physical Design
Next level of DB design optimization
Goal: support typical db use patterns (workloads) efficiently
Guided by the nature of data and its intended use
Specified during all the preceding db design steps
Some tools identify workloads or optimize the design after the DB is implemented and in use
e.g., SQL Server's Index Tuning Wizard and Profiler
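To make the idea of supporting a workload concrete, the sketch below assumes a workload dominated by lookups of students by login and adds an index for it. Everything here is an assumption for illustration: the generated rows, the index name idx_students_login, and the use of SQLite's EXPLAIN QUERY PLAN in place of a tool such as the Index Tuning Wizard. The exact wording of the plan text varies between SQLite versions, so the example only checks whether the index is mentioned.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Students "
    "(sid TEXT PRIMARY KEY, name TEXT, login TEXT, age INTEGER, gpa REAL)"
)
# Generated rows, purely for illustration.
conn.executemany(
    "INSERT INTO Students VALUES (?, ?, ?, ?, ?)",
    [(str(i), f"name{i}", f"user{i}", 18 + i % 10, 2.0 + (i % 20) / 10)
     for i in range(1000)],
)

# Assumed typical workload: look a student up by login.
QUERY = "SELECT * FROM Students WHERE login = ?"


def plan(conn):
    """Return SQLite's description of how it would execute QUERY."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY, ("user42",)).fetchall()
    # The human-readable detail is the last column of each plan row.
    return " | ".join(row[-1] for row in rows)


plan_before = plan(conn)  # no suitable index yet, so the table is scanned
conn.execute("CREATE INDEX idx_students_login ON Students(login)")
plan_after = plan(conn)   # the plan now refers to the new index

uses_index_before = "idx_students_login" in plan_before
uses_index_after = "idx_students_login" in plan_after
```

The index speeds up this lookup at the cost of extra storage and slower inserts, which is why physical design is driven by the expected workload.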
Tuning DB Performance
Goals
Minimize query response time (latency)
Minimize space utilization (efficiency)
Maximize the number of transactions processed (throughput)
Implementation
Creating and populating the DB within the DBMS
Ideally, using SQL statements that can be rerun
Creating interface to DB
Need to support user tasks
Desktop applications
Web-based applications
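The point about creating and populating the DB with SQL statements that can be rerun might look like the following sketch. The Colleges table, its columns, and its two rows are hypothetical, and SQLite stands in for MySQL. Because the script drops the table before recreating it, running it a second time rebuilds the same state instead of failing or duplicating rows.

```python
import sqlite3

# Hypothetical build script; table and rows are placeholders.
BUILD_SCRIPT = """
DROP TABLE IF EXISTS Colleges;
CREATE TABLE Colleges (
    cid   INTEGER PRIMARY KEY,
    name  TEXT NOT NULL,
    state TEXT
);
INSERT INTO Colleges (cid, name, state) VALUES (1, 'Example College', 'WA');
INSERT INTO Colleges (cid, name, state) VALUES (2, 'Sample University', 'OR');
"""


def build(conn):
    """Create and populate the database from scratch."""
    conn.executescript(BUILD_SCRIPT)


conn = sqlite3.connect(":memory:")
build(conn)
build(conn)  # rerunning rebuilds the same state

rows = conn.execute(
    "SELECT cid, name, state FROM Colleges ORDER BY cid"
).fetchall()
```

Keeping the build in a script also means the database can be recreated on another machine or after a design change without retyping statements.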
DB Design Process
Discovery
Conceptual Design
DBMS Selection
Logical Design
Schema Refinement
In Summary
DB systems allow you to store and retrieve data to satisfy specific information needs
Need to follow an extensive process to build them correctly
Activity II
Project Discussion
Potential DB applications?
Activity III
Review
Course Overview
Lots of work!!
To Do
DB Systems
Manage storage and access to data
DBMS
Make it easy to create and maintain DBs