
CSCI 585 - Database Systems (Summer 2014)

Homework Assignment 1
Prof. Shawn Shamsian
Due: 11:59PM June 8, 2014
The goal of this assignment is to: (1) design a conceptual schema using the ER/EER Data
model; (2) incorporate this schema into a relational database; (3) run SQL queries on this
database; (4) provide solutions to real applications.
1 ER/EER data model (15 pts)
Design a schema that incorporates the specification described below as efficiently as possible.
You should submit the ER/EER diagram of your schema design using the notation given in
class. In this diagram, indicate all the classes, subclasses, relationships (weak & strong),
relationship cardinalities and degrees, total participation, attributes, and primary keys. In
addition, specify whether each attribute is single-valued or multi-valued, stored or derived, and
atomic or composite. In your design, you can make and state reasonable assumptions where
details are not given in the specification.
Design a (simplified) database system for Yelp (http://www.yelp.com). It should store and
manage the following information, which may not be exactly the same as on the real Yelp website.
The database system consists of the following entities:
User
A Yelp user is a person who has a unique Yelp ID. He or she has an e-mail address, a nickname,
a real name (which contains a first name and a last name), gender, date of birth, age, current
location (which consists of current latitude and longitude), a profile picture, and a list of friend
IDs.
Restaurant
A Restaurant belongs to one of the following 6 categories: American, Italian, Chinese, Japanese,
Indian and Korean. It has a unique ID, a name, a location (represented by its latitude and
longitude), open hours (which you may assume to be the same every day), a city, a zip code,
a price level (an integer from 1 to 4) and one wall where users can post reviews.
Wall
A wall has a list of reviews and is associated with exactly one restaurant.
Review
A review has a unique ID and is associated with exactly one wall. A review has an author,
rating (1-5), content (which can be text or a photo), posted date, number of likes and a list
of users who like the review. Note that if a restaurant does not have any reviews, its rating is
0.
Photo
A photo has a unique ID. It may belong to exactly one review, or it may serve as the profile
picture of a user or a restaurant. It has an uploader and a description.
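As a rough illustration of the stored-versus-derived distinction asked for in Part 1, the sketch below computes two values that a design could plausibly treat as derived: a user's age (from the date of birth) and a restaurant's rating (from its review ratings, with the stated rule that a restaurant without reviews has rating 0). Whether these attributes are derived in your schema is a modeling assumption to state in the report. The function names `age_on` and `restaurant_rating` are hypothetical, and taking the rating to be the arithmetic mean is also an assumption.

```python
from datetime import date


def age_on(date_of_birth: date, today: date) -> int:
    """Age in whole years on `today`, derived from the stored date of birth."""
    years = today.year - date_of_birth.year
    # Subtract one year if this year's birthday has not happened yet.
    if (today.month, today.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def restaurant_rating(review_ratings: list) -> float:
    """Mean of the review ratings (each 1-5), or 0 when there are no reviews.

    Using the arithmetic mean is an assumption, not part of the specification.
    """
    if not review_ratings:
        return 0
    return sum(review_ratings) / len(review_ratings)
```

If age were stored instead, it would have to be updated on every birthday, which is one common argument for deriving it.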
2 Converting the ER/EER diagram into a relational database schema (25 pts)
This part of the project consists of the following tasks:
Installing Oracle on your machine (0 pts)
You do not need to submit anything for this task, but it is required to finish the rest of this
project.
Designing tables (5 pts)
Convert your ER/EER conceptual schema into tables. For each table, specify the table name,
its fields and their data types, the primary key and foreign keys (if there are any). For each
foreign key, specify which table it refers to. You are free to choose proper data
types (if not specified in this project specification) and nullability assumptions according to
your own understanding of the real Yelp website. You may want to optimize your table design
in this step using 2NF or 3NF as described in the textbook, since your ER/EER model may not
be optimal, but you can get full credit for this part as long as your tables work properly for
the queries in Part 3.
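To make the vocabulary above concrete (fields, data types, primary key, foreign key, value constraints), here is a minimal sketch for two of the entities. It is only an illustration under loudly stated assumptions: it uses Python's built-in `sqlite3` module as a stand-in for Oracle (Oracle would use types such as NUMBER and VARCHAR2), the table and column names are hypothetical, and several Restaurant attributes are left out for brevity. It is not the expected table design.

```python
import sqlite3

# Hypothetical table and column names; SQLite types stand in for Oracle types.
DDL = """
CREATE TABLE restaurant (
    restaurant_id INTEGER PRIMARY KEY,
    name          TEXT    NOT NULL,
    category      TEXT    NOT NULL CHECK (category IN
                  ('American', 'Italian', 'Chinese', 'Japanese', 'Indian', 'Korean')),
    latitude      REAL    NOT NULL,
    longitude     REAL    NOT NULL,
    price_level   INTEGER NOT NULL CHECK (price_level BETWEEN 1 AND 4)
);
CREATE TABLE wall (
    wall_id       INTEGER PRIMARY KEY,
    -- UNIQUE makes the relationship one wall per restaurant.
    restaurant_id INTEGER NOT NULL UNIQUE REFERENCES restaurant(restaurant_id)
);
"""


def build_schema() -> sqlite3.Connection:
    """Create the two example tables in an in-memory database."""
    conn = sqlite3.connect(":memory:")
    # SQLite only enforces foreign keys when this pragma is switched on.
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(DDL)
    return conn


conn = build_schema()
conn.execute(
    "INSERT INTO restaurant VALUES (1, 'Example Diner', 'American', 34.02, -118.28, 2)"
)
conn.execute("INSERT INTO wall VALUES (10, 1)")
try:
    # Restaurant 99 does not exist, so the foreign key rejects this row.
    conn.execute("INSERT INTO wall VALUES (11, 99)")
except sqlite3.IntegrityError as exc:
    print("rejected:", exc)
```

Putting the category list and the price range into CHECK constraints is one design choice among several; a separate lookup table for categories would be an equally reasonable alternative.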
Creating tables in Oracle (20 pts)
Populate your database with the given test data file (HW1 data.xls). Note that there are
several tabs in the Excel file, each of which may be used for one or more tables in your
database. If any attributes are unavailable in the Excel file, you can make
reasonable assumptions and fill them in yourself.
3 Querying the database (30 pts/6 pts each)
Write SQL statements for the following queries and run them on the database you have created
in Part 2. You may refer to the appendix for the formula for calculating the distance between
two points on the Earth.
1. Find the names and locations of all restaurants which are open at 9:37PM and whose average
rating is no lower than that of every Korean restaurant.
2. Find the name and location of the profile picture of the user who posted the most liked review
for Chinese restaurants in Los Angeles.
3. For each category, find the names of the 2 nearest restaurants to John Green that have an average
rating of at least 3.
4. Find the name of the restaurant which has been bookmarked by the largest number of friends
of John Green.
5. Find the distance (in miles) from James Green to the nearest American restaurant which is
open at 10:30PM.
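Two of the queries above test whether a restaurant is open at a given time. The logic of that test can be sketched as follows, assuming (this is not stated in the specification) that open hours are stored as an opening time and a closing time, that the closing time itself counts as closed, and that a closing time earlier than the opening time means the restaurant stays open past midnight. The function name `is_open` is hypothetical; the same comparison would be written in SQL in your actual solution.

```python
from datetime import time


def is_open(opens: time, closes: time, at: time) -> bool:
    """Return True if `at` falls within the daily open hours [opens, closes)."""
    if opens <= closes:
        # Ordinary case, e.g. 11:00 to 22:00.
        return opens <= at < closes
    # Hours wrap past midnight, e.g. 17:00 to 02:00.
    return at >= opens or at < closes


print(is_open(time(11, 0), time(22, 0), time(21, 37)))  # True
print(is_open(time(17, 0), time(2, 0), time(3, 0)))     # False
```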
4 Data mining and analysis (30 pts/10 pts each)
For each of the following problems, please model it and describe your model in words. Then write
and run one or more SQL queries to solve the problem. Please indicate your assumptions
explicitly. Note that there are no exact solutions to these problems. You may get full or partial
credit depending on the quality of your modeling and solution to the problem.
1. Find out the top 3 potential customers for Panda Express, and send them ads via e-mail.
2. Find out the 3 biggest competitors for restaurant Panda Express.
3. Find out the top 3 potential friends for user John Green.
5 Submission Guidelines
Please submit a compressed folder via DEN named your name hw1.zip that contains the
following files:
report.pdf
In the report, please state all assumptions that you make and write up solutions to Parts 1-4.
For Part 1, you may either scan your diagrams or create them with diagramming software.
For Part 2, you only need to write up your table design; do not include the SQL
for creating and dropping the database. For Part 3 and Part 4, although you need to hand
in .sql files as well, you still need to paste your solutions into your report.
createdb.sql
This SQL file is used to create the tables and insert the test data, as stated in Part 2.
dropdb.sql
This SQL file is used to drop all tables that are created by createdb.sql.
q1.sql - q5.sql
These SQL files are solutions to Questions 1-5 in Part 3, respectively.
m1.sql - m3.sql
These SQL files are the SQL queries for the questions in Part 4.
readme.txt
This text file contains your name, USC ID, Blackboard user name and your email address.
*Please do not submit homework via email. Late submissions will NOT be
accepted.*
6 Appendix
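Here is the formula for calculating geographical distances. Let $\phi_s, \lambda_s$ and
$\phi_f, \lambda_f$ be the latitude and longitude (in radians) of two points $s$ and $f$, and let
$\Delta\phi = |\phi_s - \phi_f|$, $\Delta\lambda = |\lambda_s - \lambda_f|$. The
distance between $s$ and $f$ is given by
$$d(s, f) = R \arccos(\sin\phi_s \sin\phi_f + \cos\phi_s \cos\phi_f \cos\Delta\lambda)$$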
where R = 3959 miles is the radius of the Earth.
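The formula can be checked numerically with a short sketch such as the one below. It assumes the coordinates are given in degrees (so they are converted to radians first) and clamps the cosine to [-1, 1] to guard against floating-point rounding; both choices, and the function name `distance_miles`, are assumptions rather than part of the assignment.

```python
import math

EARTH_RADIUS_MILES = 3959.0  # R from the appendix


def distance_miles(lat_s: float, lon_s: float, lat_f: float, lon_f: float) -> float:
    """Distance in miles between points s and f, with coordinates in degrees."""
    phi_s, phi_f = math.radians(lat_s), math.radians(lat_f)
    delta_lambda = abs(math.radians(lon_s) - math.radians(lon_f))
    cos_angle = (math.sin(phi_s) * math.sin(phi_f)
                 + math.cos(phi_s) * math.cos(phi_f) * math.cos(delta_lambda))
    # Rounding can push the value slightly outside acos's domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return EARTH_RADIUS_MILES * math.acos(cos_angle)


# A quarter of the way around the equator is R * pi / 2 miles.
print(distance_miles(0.0, 0.0, 0.0, 90.0))
```

If the distance is computed inside a SQL query instead, the same expression applies as long as the database provides sine, cosine and arccosine functions.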