Information Systems
SSC, Semester 6
Lecture 1
1
Staff
Lecturer:
Heinz Stockinger, Heinz.Stockinger@isb-sib.ch
Office hours: by appointment
TAs:
Gleb Skobeltsyn,
Adriana Budura,
Ali Salehi,
Renault John,
Wojciech Galuba
Students:
Lorenzo Keller,
Jose A. Camacho
Communications
Web page: lsirww.epfl.ch
Lectures will be available here
Homeworks and solutions will be posted here
The project description and resources will be here
Newsgroup:
epfl.ic.cours.IIS
2
Textbook
Main textbook:
Other Texts
Many classic textbooks (each of them will do it)
Database Systems: The Complete Book, Hector Garcia-
Molina,
Jeffrey Ullman, Jennifer Widom
Database Management Systems, Ramakrishnan
Fundamentals of Database Systems, Elmasri, Navathe
Database Systems, Date (7th edition)
Modern Database Management, Hoffer, (4th edition)
Database Systems Concepts, Silverschatz, (4th edition)
3
Material on the Web
SQL Intro
SQL for Web Nerds, by Philip Greenspun,
http://philip.greenspun.com/sql/
Java Technology
java.sun.com
WWW Technology
www.w3c.org
7
The Course
Goal: Teaching RDBMS (standard) with a
strong emphasis on the Web
Fortunately others already did it
Alon Halevy, Dan Suciu, Univ. of Washington
http://www.cs.washington.edu/education/cours
es/cse444/
http://www.acm.org/sigmod/record/issues/0309/
4.AlonLevy.pdf
8
4
Acknowledgement
Build on UoW course
many slides
many exercise
ideas for the project
Main difference
less theory
will use real Web data in the project
Prof. Aberer previously taught this course
9
10
5
11
12
6
Where are RDBMS used ?
Backend for traditional database
applications
EPFL administration
Backend for large Websites
Immosearch
Backend for Web services
Amazon
13
Example of a Traditional
Database Application
Suppose we are building a system
to store the information about:
students
courses
professors
who takes what, who teaches what
14
7
Can we do it without a DBMS?
Sure we can! Start by storing the data in files:
8
Problems without an DBMS...
System crashes: Read students.txt
Read courses.txt
CRASH !
Find&update the record M ary Johnson
Find&update the record CSE444
Write students.txt
Write courses.txt
Enters a DBMS
Two tier system or client-server
connection
(ODBC, JDBC)
Database server
(someone elses
Data files C program) Applications
18
9
Functionality of a DBMS
The programmer sees SQL, which has two
components:
Data Definition Language - DDL
Data Manipulation Language - DML
query language
20
10
How the Programmer Sees the
DBMS - 2
Tables:
Students: Takes:
SSN Name Category SSN CID
123-45-6789 Charles undergrad 123-45-6789 CSE444
234-56-7890 Dan grad 123-45-6789 CSE444
234-56-7890 CSE142
Courses:
CID Name Quarter
CSE444 Databases fall
CSE541 Operating systems winter
Still implemented as files, but behind the scenes
can be quite complex
data independence = separate logical view
from physical implementation 21
Transactions - 1
Enroll Mary Johnson in CSE444:
BEGIN TRANSACTION;
IF everything-went-OK
THEN COMMIT;
ELSE ROLLBACK
22
If system crashes, the transaction is still either committed or aborted
11
Transactions - 2
A transaction = sequence of statements that
either all succeed, or all fail
Transactions have the ACID properties:
A = atomicity (a transaction should be done or undone completely )
C = consistency (a transaction should transform a system from one
consistent state to another consistent state)
I = isolation (each transaction should happen independently of other
transactions )
D = durability (completed transactions should remain permanent)
23
Queries
Find all courses that Mary takes
SELECT C.name
FROM Students S, Takes T, Courses C
WHERE S.name=Mary and
S.ssn = T.ssn and T.cid = C.cid
What happens behind the scene ?
Query processor figures out how to answer the
query efficiently.
24
12
Queries, behind the scene
Declarative SQL query Imperative query execution plan:
sname
SELECT C.name
FROM Students S, Takes T, Courses C
WHERE S.name=Mary and cid=cid
name=Mary
Database Systems
The big commercial database vendors:
Oracle
IBM (with DB2)
Microsoft (SQL Server)
Sybase
Some free database systems (Unix) :
Postgres
MySQL
Predator
26
13
Databases and the Web
Accessing databases through web interfaces
Java programming interface (JDBC)
Embedding into HTML pages (JSP)
Access through HTTP protocol (Web Services)
Using Web document formats for data definition
and manipulation
XML, Xquery, Xpath
XML databases and messaging systems
27
Database Integration
Combining data from different databases
collection of data (wrapping)
combination of data and generation of new views on
the data (mediation)
Problem: heterogeneity
access, representation, content
Example revisited
http://immo.search.ch/
http://www.swissimmo.ch
28
14
Other Trends in Databases
Industrial
Object-relational databases
Main memory database systems
Data warehousing and mining
Research
Peer-to-peer data management
Stream data management
Mobile data management
29
Course Outline
Part I
SQL (Chapter 6)
The relational data model (Chapter 3)
Database design (Chapters 2, 3, 7)
XML, XPath, XQuery
Part II
Indexes (Chapter 13)
Transactions and Recovery (Chapter 17 - 18)
Exam
30
15
Structure
Prerequisites:
Programming courses
Data structures
Work & Grading:
Homeworks (4): 0%
Exam (like homeworks): 40%
Project: 60% (see next) each phase graded
separately - includes discussion
31
The Project
Models the real data management needs of a Web
company
Phase 1: Modelling and Data Acquisition
Phase 2: Data integration and Applications
Phase 3: Services
"One can only start to appreciate database systems
by actually trying to use one" (Halevy)
Any SW/IT company will love you for these skills
32
16
The Project Side Effects
Trains your soft skills
team work
deal with bugs, poor documentation,
produce with limited time resources
project management and reporting
Results useful for you personally
Demo
Project should be fun
33
Practical Concerns
34
17
Schedule
Date Lecture Exercise/Project Deadlines
13 03 2006 Introduction & Basic SQL Project Introduction
20 03 2006 Advanced SQL E1: SQL
27 03 2006 Conceptual Modelling and Schema Design PP1
03 04 2006 Database Programming, JDBC, Reg. Expr. PP1
10 04 2006 Functional Dependencies E2: Func. Dep. & Rel Alg. PP1 due
17 04 2006 Vacation (Easter)
24 04 2006 Relational Algebra PP1 discussion; PP2
01 05 2006 Introduction to XML PP2
08 05 2006 XML and XQuery E3: XML - XQuery
15 05 2006 Web Services PP3 PP2 due
22 05 2006 Transactions PP2 discussion; PP3
29 05 2006 Recovery E4: Transactions, Recovery
05 06 2006 Vacation (Pentecte)
12 06 2006 Database Heterogeneity PP3 PP3 due
19 06 2006 Test exam PP3 discussion
July 2006 Final Exam
http://lsirwww.epfl.ch/courses/iis/2006ss/
35
36
18
Summary
We use a (Relational) Database
Management System:
Mainly as the backend
To store different kinds of data
To allow for concurrent access of many users
To ensure that data is not corrupted
37
19