
SQL Lecture I:

simple queries, aggregate functions, inner joins

Examples will use the NASA Database


[Figure: ERD of the NASA database]

Note: The entities (i.e. boxes) in this ERD do NOT have either NASA_ or NASA2_ prepended to the name, but the actual tables in PostgreSQL do have NASA_ or NASA2_ prepended to the table names.

Nasa vs Nasa2
We are using the NASA database in this lecture, not the NASA2 database.
These two databases have the same ERD, but different data. The NASA database does NOT include data from the shuttle program, whereas NASA2 does.

Advice from Uncle Raymond


The SQL to CREATE the nasa database is at:
http://www-staff.it.uts.edu.au/~raymond/db/nasa.txt

Load it into PostgreSQL using the instructions in "How to start PostgreSQL in the building 10 labs".

To learn the SQL in this presentation, Raymond recommends that you type in and run all the queries given in this presentation.
Just staring at the PowerPoint slides isn't effective study, but don't just type them in: think about them!!! Maybe even play with simple variations. (HD students should!)

The simplest query: List all the data in a table.


raymond=> select * from nasa_projects;
 project
--------------
 Mercury
 Gemini
 Apollo
 Skylab
 Apollo-Soyuz
(5 rows)

Note again: While the entities (i.e. boxes) in the ERD do NOT have either NASA_ or NASA2_ pre-pended to the name, the actual tables in PostgreSQL do have NASA_ or NASA2_ pre-pended to the table names.

The simplest query: List all the data in a table (2)


raymond=> select * from nasa_spacecraft;
 projectname | missionno | crafttype | craftname
-------------+-----------+-----------+----------------
 Mercury     |         1 | capsule   | Freedom 7
 Mercury     |         2 | capsule   | Liberty Bell 7
 Mercury     |         3 | capsule   | Friendship 7
 Mercury     |         4 | capsule   | Aurora 7
 Mercury     |         5 | capsule   | Sigma 7
 Mercury     |         6 | capsule   | Faith 7
 Gemini      |         3 | capsule   | Molly Brown
 Apollo      |         9 | CSM       | Gum Drop
 etc         | etc       | etc       | etc
 Apollo      |        17 | CSM       | America
 Apollo      |        17 | LM        | Challenger
(25 rows)

SQL keywords (e.g. select, from) are NOT case sensitive (other things can be case sensitive).

All SQL commands end with a semi-colon ;

Choosing a subset of columns:


just name the columns you want
raymond=> select projectname,missionno,crafttype
raymond-> from nasa_spacecraft;
 projectname | missionno | crafttype
-------------+-----------+-----------
 Mercury     |         1 | capsule
 Mercury     |         2 | capsule
 Mercury     |         3 | capsule
 Mercury     |         4 | capsule
 Mercury     |         5 | capsule
 Mercury     |         6 | capsule
 Gemini      |         3 | capsule
 Apollo      |         9 | CSM
 etc         | etc       | etc
 Apollo      |        17 | CSM
 Apollo      |        17 | LM
(25 rows)

Column names are comma separated. We can specify any ordering of columns we want.

Notice the -> prompt when I did not complete the command on one line.

Choosing a subset of rows:


using where
raymond=> select projectname,missionno,crafttype
raymond-> from nasa_spacecraft
raymond-> where projectname = 'Apollo';
 projectname | missionno | crafttype
-------------+-----------+-----------
 Apollo      |         9 | CSM
 Apollo      |         9 | LM
 Apollo      |        10 | CSM
 Apollo      |        10 | LM
 etc         | etc       | etc
 Apollo      |        17 | CSM
 Apollo      |        17 | LM
(18 rows)

SQL uses single quotes for strings, not double quotes.
Specifying what data to look for in the table is case sensitive (e.g. try 'apollo').
Any where (i.e. choosing rows) can be matched with any select (i.e. choosing columns).

A trap for rookie players:


not completing a string
raymond=> select projectname,missionno,crafttype
raymond-> from nasa_spacecraft
raymond-> where projectname = 'Apollo
raymond'>

Missing closing quote.
Notice the '> prompt when I did not complete the string on one line.
It is rare to want to extend a string across a line. When you mistakenly don't complete a string, type Control-C and it will terminate the command. Control-C terminates any command. It can also terminate the output of lots and lots of unwanted rows.

Choosing a subset of rows:


using where and a comparison
raymond=> select *
raymond-> from nasa_astronaut
raymond-> where birth < 1930;
 astrono | astroname        | birth | death
---------+------------------+-------+-------
       5 | Borman, Frank    |  1928 |
       7 | Carpenter, Scott |  1925 |
      13 | Cooper, Gordo    |  1927 |  2004
      20 | Glenn, John      |  1921 |
 etc     | etc              | etc   | etc
      37 | Shepard, Alan    |  1923 |  1998
      38 | Slayton, Deke    |  1924 |  1993
(11 rows)

Comparison Operators
 Operator | Meaning
----------+--------------------------
 =        | Equal to
 <        | Less than
 <=       | Less than or equal to
 >        | Greater than
 >=       | Greater than or equal to
 <>       | Not equal to (note this one!)

Also, WHERE a BETWEEN x AND y is equivalent to WHERE a >= x AND a <= y
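
The BETWEEN equivalence can be checked with a small script. This is only a sketch and not part of the original slides: it assumes Python's built-in sqlite3 module (so SQLite stands in for PostgreSQL) and loads just five of the astronaut rows shown in this lecture.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nasa_astronaut "
    "(astrono INTEGER, astroname TEXT, birth INTEGER, death INTEGER)"
)
# Five rows copied from the query output shown in the slides.
conn.executemany(
    "INSERT INTO nasa_astronaut VALUES (?, ?, ?, ?)",
    [
        (5, "Borman, Frank", 1928, None),
        (13, "Cooper, Gordo", 1927, 2004),
        (20, "Glenn, John", 1921, None),
        (27, "Lovell, James", 1928, None),
        (42, "White, Edward", 1930, 1967),
    ],
)

between_rows = conn.execute(
    "SELECT astroname FROM nasa_astronaut "
    "WHERE birth BETWEEN 1927 AND 1928 ORDER BY astrono"
).fetchall()

range_rows = conn.execute(
    "SELECT astroname FROM nasa_astronaut "
    "WHERE birth >= 1927 AND birth <= 1928 ORDER BY astrono"
).fetchall()

print(between_rows)
print(between_rows == range_rows)  # both forms select the same rows
```

Both end points are included, which is why 1927 and 1928 both appear.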



More about WHERE


The WHERE clause can contain several conditions linked by AND or OR.

In a WHERE containing one or more ANDs, all specified conditions must be true.
Very important for doing natural joins as cross products (see later slides).

In a WHERE containing one or more ORs, at least one of the conditions must be true.

If you mix ANDs and ORs, the ANDs have precedence.
Just as, in arithmetic, multiplication has precedence over addition. And like arithmetic, you can use brackets ( ) to control precedence.
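
The precedence rule is easiest to see by running the same conditions with and without brackets. The sketch below is an illustration added to the slides, not Raymond's own example: it assumes Python's built-in sqlite3 module (SQLite instead of PostgreSQL), five astronaut rows copied from this lecture, and a made-up pair of conditions.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nasa_astronaut "
    "(astrono INTEGER, astroname TEXT, birth INTEGER, death INTEGER)"
)
conn.executemany(
    "INSERT INTO nasa_astronaut VALUES (?, ?, ?, ?)",
    [
        (4, "Bean, Alan", 1932, None),
        (5, "Borman, Frank", 1928, None),
        (20, "Glenn, John", 1921, None),
        (37, "Shepard, Alan", 1923, 1998),
        (40, "Swigert, John", 1931, 1982),
    ],
)

# AND binds tighter than OR, so this is read as:
#   birth < 1925 OR (birth > 1930 AND death IS NOT NULL)
no_brackets = [row[0] for row in conn.execute(
    "SELECT astroname FROM nasa_astronaut "
    "WHERE birth < 1925 OR birth > 1930 AND death IS NOT NULL "
    "ORDER BY astrono"
)]

# Brackets force the OR to be evaluated first.
with_brackets = [row[0] for row in conn.execute(
    "SELECT astroname FROM nasa_astronaut "
    "WHERE (birth < 1925 OR birth > 1930) AND death IS NOT NULL "
    "ORDER BY astrono"
)]

print(no_brackets)    # Glenn, Shepard and Swigert
print(with_brackets)  # Shepard and Swigert only
```

Glenn drops out of the bracketed version because his death column is null, and the AND now applies to him too.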

IS NOT NULL
raymond=> select * from nasa_astronaut
raymond-> where death is not null
raymond-> order by death;
 astrono | astroname       | birth | death
---------+-----------------+-------+-------
      10 | Chaffee, Roger  |  1935 |  1967
      22 | Grissom, Gus    |  1926 |  1967
      42 | White, Edward   |  1930 |  1967
      40 | Swigert, John   |  1931 |  1982
      16 | Eisele, Donn    |  1930 |  1987
      17 | Evans, Ron      |  1933 |  1990
      24 | Irwin, James    |  1930 |  1991
      38 | Slayton, Deke   |  1924 |  1993
      32 | Roosa, Stu      |  1933 |  1994
      37 | Shepard, Alan   |  1923 |  1998
      12 | Conrad, Charles |  1930 |  1999
      13 | Cooper, Gordo   |  1927 |  2004
(12 rows)

We can also use is null

Trap for rookie players: you CANNOT test for null with = NULL or <> NULL. A comparison with NULL is never true, so use IS NULL or IS NOT NULL instead.
This note is NOT on the handout
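
To see the trap in action, the sketch below runs all four forms. It is an added illustration under stated assumptions: Python's built-in sqlite3 module is used (SQLite, not PostgreSQL) with four astronaut rows from this lecture. In SQLite the = NULL and <> NULL queries are accepted without complaint but match nothing.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nasa_astronaut "
    "(astrono INTEGER, astroname TEXT, birth INTEGER, death INTEGER)"
)
conn.executemany(
    "INSERT INTO nasa_astronaut VALUES (?, ?, ?, ?)",
    [
        (5, "Borman, Frank", 1928, None),   # None is stored as NULL
        (13, "Cooper, Gordo", 1927, 2004),
        (20, "Glenn, John", 1921, None),
        (37, "Shepard, Alan", 1923, 1998),
    ],
)


def names(where_clause):
    """Return the astronaut names selected by the given WHERE condition."""
    sql = ("SELECT astroname FROM nasa_astronaut WHERE "
           + where_clause + " ORDER BY astrono")
    return [row[0] for row in conn.execute(sql)]


eq_null = names("death = NULL")       # comparison with NULL is never true
ne_null = names("death <> NULL")      # ... and neither is this one
is_null = names("death IS NULL")
not_null = names("death IS NOT NULL")

print(eq_null, ne_null)
print(is_null)
print(not_null)
```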

Calculating - also as
raymond=> select astroname, death, death - birth as age
raymond-> from nasa_astronaut
raymond-> where death is not null
raymond-> order by age, death;
 astroname       | death | age
-----------------+-------+-----
 Chaffee, Roger  |  1967 |  32
 White, Edward   |  1967 |  37
 Grissom, Gus    |  1967 |  41
 Swigert, John   |  1982 |  51
 Eisele, Donn    |  1987 |  57
 Evans, Ron      |  1990 |  57
 Irwin, James    |  1991 |  61
 Roosa, Stu      |  1994 |  61
 Slayton, Deke   |  1993 |  69
 Conrad, Charles |  1999 |  69
 Shepard, Alan   |  1998 |  75
 Cooper, Gordo   |  2004 |  77
(12 rows)

Note "as age", which names the calculated column.
Note also "order by age, death": rows with the same age are then ordered by death.

IN
raymond=> select *
raymond-> from nasa_astronaut
raymond-> where birth in (1928, 1932);
 astrono | astroname          | birth | death
---------+--------------------+-------+-------
       4 | Bean, Alan         |  1932 |
       5 | Borman, Frank      |  1928 |
       8 | Carr, Gerald       |  1932 |
      14 | Cunningham, Walter |  1932 |
      25 | Kerwin, Joseph     |  1932 |
      27 | Lovell, James      |  1928 |
      36 | Scott, David       |  1932 |
      41 | Weitz, Paul        |  1932 |
      43 | Worden, Alfred     |  1932 |
(9 rows)

IN
raymond=> select *
raymond-> from nasa_astronaut
raymond-> where birth in (1928, 1932)
raymond-> order by birth;
 astrono | astroname          | birth | death
---------+--------------------+-------+-------
       5 | Borman, Frank      |  1928 |
      27 | Lovell, James      |  1928 |
       4 | Bean, Alan         |  1932 |
       8 | Carr, Gerald       |  1932 |
      14 | Cunningham, Walter |  1932 |
      25 | Kerwin, Joseph     |  1932 |
      36 | Scott, David       |  1932 |
      41 | Weitz, Paul        |  1932 |
      43 | Worden, Alfred     |  1932 |
(9 rows)

NOT IN
(also descending)
raymond=> select *
raymond-> from nasa_mission
raymond-> where projectname = 'Apollo'
raymond-> and missionType not in ('LO', 'LL')
raymond-> order by launchYear;
 missionno | missiontype | launchyear | description
-----------+-------------+------------+---------------------------------
         1 | XX          |       1967 | Crew killed in launch test
         7 | EO          |       1968 | First successful Apollo mission
         9 | EO          |       1969 | First test of Lunar Module
        13 | LF          |       1970 | Houston, we have a problem
(4 rows)

Mission types:
 LL | Lunar Landing
 LO | Lunar Orbit
 LF | Lunar Flyby (aborted landing)
 EO | Earth Orbit
 XX | Unclassified / disaster

NOT IN
(also DESC, short for descending)
raymond=> select *
raymond-> from nasa_mission
raymond-> where projectname = 'Apollo'
raymond-> and missionType not in ('LO', 'LL')
raymond-> order by launchYear DESC;
 missionno | missiontype | launchyear | description
-----------+-------------+------------+---------------------------------
        13 | LF          |       1970 | Houston, we have a problem
         9 | EO          |       1969 | First test of Lunar Module
         7 | EO          |       1968 | First successful Apollo mission
         1 | XX          |       1967 | Crew killed in launch test
(4 rows)

Mission types:
 LL | Lunar Landing
 LO | Lunar Orbit
 LF | Lunar Flyby (aborted landing)
 EO | Earth Orbit
 XX | Unclassified / disaster

Built-in aggregate functions


e.g. AVG, SUM, MIN, MAX, and COUNT

To collect or gather into a mass or whole


Built-in aggregate functions


e.g. AVG, short for average
raymond=> select AVG(death - birth)
raymond-> from nasa_astronaut
raymond-> where death is not null;
         avg
---------------------
 57.2500000000000000
(1 row)

All the death - birth values calculated from every row were collected or gathered into a whole; i.e. their average.

Built-in aggregate functions


e.g. AVG, short for average
raymond=> select AVG(death - birth)
raymond-> from nasa_astronaut
raymond-> where death is not null;
         avg
---------------------
 57.2500000000000000
(1 row)

Using round, select ROUND(AVG(death - birth)) will produce

 round
-------
    57
(1 row)

Built-in aggregate functions


e.g. AVG, short for average
raymond=> select AVG(death - birth)
raymond-> from nasa_astronaut
raymond-> where death is not null;
         avg
---------------------
 57.2500000000000000
(1 row)

Using round with a parameter, select ROUND(AVG(death - birth), 2) will produce

 round
-------
 57.25
(1 row)

Built-in aggregate functions


count
COUNT(name of a column) counts the non-null values:

SELECT count(death) FROM NASA_astronaut;
 count
-------
    12
(1 row)

COUNT(*) counts the number of rows:

SELECT count(*) FROM NASA_astronaut;
 count
-------
    44
(1 row)
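
The aggregate slides can be replayed on a cut-down copy of the table. This is a sketch under assumptions, not the real database: Python's built-in sqlite3 module is used (SQLite, not PostgreSQL), and the table holds only the 12 deceased astronauts listed earlier plus 3 living ones, so count(*) gives 15 here rather than 44. SQLite's ROUND also returns a floating point value, so its one-argument form prints 57.0 where PostgreSQL shows 57.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nasa_astronaut "
    "(astrono INTEGER, astroname TEXT, birth INTEGER, death INTEGER)"
)
conn.executemany(
    "INSERT INTO nasa_astronaut VALUES (?, ?, ?, ?)",
    [
        # The 12 rows with a non-null death, as listed in the slides.
        (10, "Chaffee, Roger", 1935, 1967),
        (22, "Grissom, Gus", 1926, 1967),
        (42, "White, Edward", 1930, 1967),
        (40, "Swigert, John", 1931, 1982),
        (16, "Eisele, Donn", 1930, 1987),
        (17, "Evans, Ron", 1933, 1990),
        (24, "Irwin, James", 1930, 1991),
        (38, "Slayton, Deke", 1924, 1993),
        (32, "Roosa, Stu", 1933, 1994),
        (37, "Shepard, Alan", 1923, 1998),
        (12, "Conrad, Charles", 1930, 1999),
        (13, "Cooper, Gordo", 1927, 2004),
        # Three of the rows whose death column is null.
        (5, "Borman, Frank", 1928, None),
        (20, "Glenn, John", 1921, None),
        (27, "Lovell, James", 1928, None),
    ],
)


def one_value(sql):
    """Run a query that returns a single row with a single column."""
    return conn.execute(sql).fetchone()[0]


alive_filter = " FROM nasa_astronaut WHERE death IS NOT NULL"
avg_age = one_value("SELECT AVG(death - birth)" + alive_filter)
rounded = one_value("SELECT ROUND(AVG(death - birth))" + alive_filter)
rounded_2 = one_value("SELECT ROUND(AVG(death - birth), 2)" + alive_filter)

count_death = one_value("SELECT count(death) FROM nasa_astronaut")  # non-null only
count_all = one_value("SELECT count(*) FROM nasa_astronaut")        # every row

print(avg_age, rounded, rounded_2)
print(count_death, count_all)
```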


web links on aggregate functions


This slide is on the single-page handout

Also, but these can quickly get technical ...

http://www.postgresql.org/docs/8.4/static/functions-aggregate.html
http://www.postgresql.org/docs/8.4/static/tutorial-agg.html

Pattern matching - LIKE and %


List all astronauts whose names start with A:

SELECT astroname FROM NASA_astronaut WHERE astroname LIKE 'A%';
 astroname
-----------------
 Aldrin, Buzz
 Anders, William
 Armstrong, Neil
(3 rows)

% matches any sequence of zero or more characters.

List all astronauts whose names contain an A:

SELECT astroname FROM NASA_astronaut WHERE astroname LIKE '%A%';

Also matches
 Bean, Alan
 Shepard, Alan
 Worden, Alfred

Note: Use = for an exact match, and LIKE with % for a pattern match.

Eliminating duplicate rows


DISTINCT
SELECT DISTINCT launchyear FROM NASA_Mission WHERE projectname = 'Apollo';
 launchyear
------------
       1967
       1968
       1969
       1970
       1971
       1972
(6 rows)

SELECT count(*) FROM NASA_Mission WHERE projectname = 'Apollo';
 count
-------
    12
(1 row)

NASA launched 12 Apollo missions in 6 years, so some missions must have flown in the same year. Just how many missions NASA flew in each year we can see with GROUP BY.

GROUP BY
SELECT launchyear, count(*)
FROM NASA_Mission
WHERE projectname = 'Apollo'
GROUP BY launchyear
ORDER BY launchyear;
 launchyear | count
------------+-------
       1967 |     1
       1968 |     2
       1969 |     4
       1970 |     1
       1971 |     2
       1972 |     2
(6 rows)

The one or more columns nominated in the select (outside any aggregate function) must also be nominated in the GROUP BY.

The column names nominated in the select are followed by zero or more aggregate functions (often count).


Revision on missionTypes
SELECT launchyear, count(*)
FROM NASA_Mission
WHERE projectname = 'Apollo'
GROUP BY launchyear
ORDER BY launchyear;
 launchyear | count
------------+-------
       1967 |     1
       1968 |     2
       1969 |     4
       1970 |     1
       1971 |     2
       1972 |     2
(6 rows)

Revision on missionTypes, year by year:
 1967 | Actually XX, Apollo 1 disaster
 1968 | Apollo 7, 8 (first LO, Lunar Orbit)
 1969 | Apollo 9, 10, 11 (first LL), 12
 1970 | Apollo 13 (only LF, Lunar Flyby)
 1971 | Apollo 14, 15
 1972 | Apollo 16, 17

HAVING like WHERE, but after the grouping


This slide is on the single-page handout

select projectname, count(*)
from nasa_mission
group by projectname;
 projectname  | count
--------------+-------
 Gemini       |    10
 Skylab       |     3
 Apollo       |    12
 Mercury      |     6
 Apollo-Soyuz |     1
(5 rows)

If we add having count(*) = 1 we just get

 projectname  | count
--------------+-------
 Apollo-Soyuz |     1
(1 row)
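
GROUP BY and HAVING can be tried together on the Apollo launch years. The sketch below is an added illustration: it assumes Python's built-in sqlite3 module (SQLite, not PostgreSQL), a three-column nasa_mission table filled only with the Apollo mission numbers and years from the revision slide, and a HAVING variation on launchyear that is not in the original slides.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL, and
# the real nasa_mission table has more columns than the three used here.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE nasa_mission "
    "(projectname TEXT, missionno INTEGER, launchyear INTEGER)"
)
apollo_years = {
    1: 1967, 7: 1968, 8: 1968, 9: 1969, 10: 1969, 11: 1969,
    12: 1969, 13: 1970, 14: 1971, 15: 1971, 16: 1972, 17: 1972,
}
conn.executemany(
    "INSERT INTO nasa_mission VALUES ('Apollo', ?, ?)",
    list(apollo_years.items()),
)

# One output row per launch year, with the number of missions in that year.
per_year = conn.execute(
    "SELECT launchyear, count(*) FROM nasa_mission "
    "WHERE projectname = 'Apollo' "
    "GROUP BY launchyear ORDER BY launchyear"
).fetchall()

# HAVING filters the groups after they have been formed.
single_mission_years = conn.execute(
    "SELECT launchyear, count(*) FROM nasa_mission "
    "WHERE projectname = 'Apollo' "
    "GROUP BY launchyear HAVING count(*) = 1 ORDER BY launchyear"
).fetchall()

print(per_year)
print(single_mission_years)
```

WHERE chooses rows before grouping and HAVING chooses groups afterwards, which is why the count(*) test has to go in the HAVING.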

Revision: Foreign keys



A foreign key is a column (or columns) that is a primary key in another table
(projectname, missionno) in spacecraft is a foreign key because (projectname, missionno) is the primary key of entity mission

Foreign keys are used to record a one to many (1:m) relationship


Referential integrity: Either a foreign key is null (e.g. Dennis, who has no girlfriend), or a non-null value for a foreign key must have a matching value in the other table.

The Natural Join (1) natural join


Displays output from two or more tables created by finding matching row values in columns that have the same name.

SELECT astroname, projectname, missionno
FROM NASA_astronaut NATURAL JOIN NASA_Assigned
WHERE astroname = 'Armstrong, Neil';
 astroname       | projectname | missionno
-----------------+-------------+-----------
 Armstrong, Neil | Gemini      |         8
 Armstrong, Neil | Apollo      |        11
(2 rows)

The Natural Join (2) the cross product form


The same thing as using natural join on the previous slide, but using the alternate (cross product) notation

SELECT astroname, projectname, missionno
FROM NASA_astronaut, NASA_Assigned
WHERE astroname = 'Armstrong, Neil'
AND NASA_astronaut.astrono = NASA_Assigned.astrono;

Notice that the words natural join have been replaced with a comma in the FROM, and that the matching of the astrono columns is now written out in the WHERE. Notice the places where a dot appears: a table name, followed by . followed by the name of a column in that table. This disambiguates which column we mean.
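
The claim that the two notations give the same answer can be checked directly. This sketch is an added illustration with loud assumptions: it uses Python's built-in sqlite3 module (SQLite, not PostgreSQL), cut-down versions of the two tables, a placeholder astrono of 3 for Armstrong (his real number is not shown in these slides), and a Borman / Apollo 8 row taken from general spaceflight history rather than from the slides.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nasa_astronaut (astrono INTEGER, astroname TEXT)")
conn.execute(
    "CREATE TABLE nasa_assigned "
    "(astrono INTEGER, projectname TEXT, missionno INTEGER)"
)
conn.executemany(
    "INSERT INTO nasa_astronaut VALUES (?, ?)",
    [
        (3, "Armstrong, Neil"),  # astrono 3 is a placeholder value
        (5, "Borman, Frank"),
    ],
)
conn.executemany(
    "INSERT INTO nasa_assigned VALUES (?, ?, ?)",
    [
        (3, "Gemini", 8),
        (3, "Apollo", 11),
        (5, "Apollo", 8),  # not from the slides; added so there is a non-match
    ],
)

natural = conn.execute(
    "SELECT astroname, projectname, missionno "
    "FROM nasa_astronaut NATURAL JOIN nasa_assigned "
    "WHERE astroname = 'Armstrong, Neil' ORDER BY missionno"
).fetchall()

cross_product = conn.execute(
    "SELECT astroname, projectname, missionno "
    "FROM nasa_astronaut, nasa_assigned "
    "WHERE astroname = 'Armstrong, Neil' "
    "AND nasa_astronaut.astrono = nasa_assigned.astrono ORDER BY missionno"
).fetchall()

print(natural)
print(natural == cross_product)  # the two notations agree
```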

A trap for rookie players:


not using a . in the select
NOTE: If on the previous slide I replace
SELECT astroname, etc
with
SELECT astrono, etc
then which astrono do I mean, the column in the Astronaut table, or the column in the Assigned table? You might reasonably argue that it doesn't matter, as NASA_astronaut.astrono = NASA_Assigned.astrono specifies that the two rows have the same value. Computers aren't that smart; you must use a . when you use an ambiguous column name in the select.
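
Here is what the trap looks like when it is sprung. The sketch is an added illustration, assuming Python's built-in sqlite3 module (SQLite, not PostgreSQL; the wording of PostgreSQL's error message differs), cut-down tables, and a placeholder astrono of 3 for Armstrong.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nasa_astronaut (astrono INTEGER, astroname TEXT)")
conn.execute(
    "CREATE TABLE nasa_assigned "
    "(astrono INTEGER, projectname TEXT, missionno INTEGER)"
)
conn.execute("INSERT INTO nasa_astronaut VALUES (3, 'Armstrong, Neil')")
conn.executemany(
    "INSERT INTO nasa_assigned VALUES (?, ?, ?)",
    [(3, "Gemini", 8), (3, "Apollo", 11)],
)

join_condition = (
    " FROM nasa_astronaut, nasa_assigned"
    " WHERE nasa_astronaut.astrono = nasa_assigned.astrono"
)

# Unqualified astrono: the database refuses to guess which table we mean.
try:
    conn.execute("SELECT astrono, missionno" + join_condition)
    ambiguous_error = None
except sqlite3.OperationalError as exc:
    ambiguous_error = str(exc)

# Qualified with a table name and a dot: accepted.
qualified = conn.execute(
    "SELECT nasa_astronaut.astrono, missionno" + join_condition
    + " ORDER BY missionno"
).fetchall()

print(ambiguous_error)
print(qualified)
```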

Another trap for rookie players:


loads and loads and loads of output from a wrong cross product
Loads and loads and loads of output from a cross product probably means that you haven't specified all the right columns to do the natural join. When that happens, and it will, type Control-C and it will terminate the command. Then figure out what you did wrong. Control-C terminates any command. It can also terminate the output of lots and lots of unwanted rows.
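
Where the extra rows come from is easy to count on a tiny example. This sketch is an added illustration, assuming Python's built-in sqlite3 module (SQLite, not PostgreSQL), cut-down tables, a placeholder astrono of 3 for Armstrong, and a Borman / Apollo 8 row that comes from general history rather than from these slides.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE nasa_astronaut (astrono INTEGER, astroname TEXT)")
conn.execute(
    "CREATE TABLE nasa_assigned "
    "(astrono INTEGER, projectname TEXT, missionno INTEGER)"
)
conn.executemany(
    "INSERT INTO nasa_astronaut VALUES (?, ?)",
    [(3, "Armstrong, Neil"), (5, "Borman, Frank")],
)
conn.executemany(
    "INSERT INTO nasa_assigned VALUES (?, ?, ?)",
    [(3, "Gemini", 8), (3, "Apollo", 11), (5, "Apollo", 8)],
)

# Forgetting the join condition pairs every astronaut with every assignment.
wrong = conn.execute(
    "SELECT astroname, projectname, missionno "
    "FROM nasa_astronaut, nasa_assigned"
).fetchall()

# With the join condition only the matching pairs survive.
right = conn.execute(
    "SELECT astroname, projectname, missionno "
    "FROM nasa_astronaut, nasa_assigned "
    "WHERE nasa_astronaut.astrono = nasa_assigned.astrono"
).fetchall()

print(len(wrong))  # 2 astronauts x 3 assignments
print(len(right))
```

With 2 rows and 3 rows the damage is small; with two tables of a few hundred rows each the wrong version produces tens of thousands of rows.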


Self-Join
Join a table to itself
Usually involves a self-referencing relationship
Useful to find relationships among rows of the same table

[Figure: the emp entity with a self-referencing "supervises" relationship]

Joining a table with itself


emp
 empno | empfname | empsalary | deptname       | bossno
-------+----------+-----------+----------------+--------
     1 | Alice    |     75000 | Management     |
     2 | Ned      |     45000 | Marketing      |      1
     3 | Andrew   |     25000 | Marketing      |      2
     4 | Clare    |     22000 | Marketing      |      2
     5 | Todd     |     38000 | Accounting     |      1
     6 | Nancy    |     22000 | Accounting     |      5
     7 | Brier    |     43000 | Purchasing     |      1
     8 | Sarah    |     56000 | Purchasing     |      7
     9 | Sophie   |     35000 | Personnel & PR |      1

[Figure: two copies of the emp table, linked by the "supervises" relationship]

This slide is on the single-page handout

Querying a recursive relationship


Find the names of employees who earn more than their boss.

SELECT wrk.empfname
FROM emp wrk, emp boss
WHERE wrk.bossno = boss.empno
AND wrk.empsalary > boss.empsalary;

 wrk                                                    | boss
 empno | empfname | empsalary | deptname       | bossno | empno | empfname | empsalary | deptname   | bossno
-------+----------+-----------+----------------+--------+-------+----------+-----------+------------+--------
     2 | Ned      |    45,000 | Marketing      |      1 |     1 | Alice    |    75,000 | Management |
     3 | Andrew   |    25,000 | Marketing      |      2 |     2 | Ned      |    45,000 | Marketing  |      1
     4 | Clare    |    22,000 | Marketing      |      2 |     2 | Ned      |    45,000 | Marketing  |      1
     5 | Todd     |    38,000 | Accounting     |      1 |     1 | Alice    |    75,000 | Management |
     6 | Nancy    |    22,000 | Accounting     |      5 |     5 | Todd     |    38,000 | Accounting |      1
     7 | Brier    |    43,000 | Purchasing     |      1 |     1 | Alice    |    75,000 | Management |
     8 | Sarah    |    56,000 | Purchasing     |      7 |     7 | Brier    |    43,000 | Purchasing |      1
     9 | Sophie   |    35,000 | Personnel & PR |      1 |     1 | Alice    |    75,000 | Management |

 empfname
----------
 Sarah
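
The self-join can be run as written on the emp table from the slides. This is a sketch under one assumption: Python's built-in sqlite3 module is used, so SQLite stands in for PostgreSQL. The data and the query are the ones shown above.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE emp (empno INTEGER, empfname TEXT, empsalary INTEGER, "
    "deptname TEXT, bossno INTEGER)"
)
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?, ?, ?)",
    [
        (1, "Alice", 75000, "Management", None),  # Alice has no boss
        (2, "Ned", 45000, "Marketing", 1),
        (3, "Andrew", 25000, "Marketing", 2),
        (4, "Clare", 22000, "Marketing", 2),
        (5, "Todd", 38000, "Accounting", 1),
        (6, "Nancy", 22000, "Accounting", 5),
        (7, "Brier", 43000, "Purchasing", 1),
        (8, "Sarah", 56000, "Purchasing", 7),
        (9, "Sophie", 35000, "Personnel & PR", 1),
    ],
)

# wrk and boss are two names (aliases) for the same emp table.
earn_more_than_boss = conn.execute(
    "SELECT wrk.empfname "
    "FROM emp wrk, emp boss "
    "WHERE wrk.bossno = boss.empno "
    "AND wrk.empsalary > boss.empsalary"
).fetchall()

print(earn_more_than_boss)
```

Alice never appears on the wrk side of the join because her bossno is null and so matches no empno.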

web links on joins


This slide is on the single-page handout

Also, but these can quickly get technical ...

http://www.postgresql.org/docs/8.4/static/tutorial-join.html
http://www.postgresql.org/docs/8.4/static/queries-table-expressions.html#QUERIES-JOIN

This slide is on the single-page handout

Left Outer join

A natural join plus those rows from t1 not included in the natural join

SELECT * FROM t1 LEFT JOIN t2 USING (id);

t1
 id | col1
----+------
  1 | a
  2 | b
  3 | c

t2
 id | col2
----+------
  1 | x
  3 | y
  5 | z

Result
 t1.id | col1 | t2.id | col2
-------+------+-------+------
     1 | a    |     1 | x
     2 | b    | null  | null
     3 | c    |     3 | y
This slide is on the single-page handout

Right Outer join


SELECT * FROM t1 RIGHT JOIN t2 USING (id);

A natural join plus those rows from t2 not included in the natural join

t1
 id | col1
----+------
  1 | a
  2 | b
  3 | c

t2
 id | col2
----+------
  1 | x
  3 | y
  5 | z

Result
 t1.id | col1 | t2.id | col2
-------+------+-------+------
     1 | a    |     1 | x
     3 | c    |     3 | y
 null  | null |     5 | z
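
Both outer joins can be tried on the little t1 and t2 tables. This sketch is an added illustration, assuming Python's built-in sqlite3 module (SQLite, not PostgreSQL). Two things differ from the slides and are deliberate: older SQLite versions have no RIGHT JOIN, so the right outer join is written as a LEFT JOIN with the tables swapped, and SELECT * with USING returns a single merged id column rather than separate t1.id and t2.id columns.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t1 (id INTEGER, col1 TEXT)")
conn.execute("CREATE TABLE t2 (id INTEGER, col2 TEXT)")
conn.executemany("INSERT INTO t1 VALUES (?, ?)", [(1, "a"), (2, "b"), (3, "c")])
conn.executemany("INSERT INTO t2 VALUES (?, ?)", [(1, "x"), (3, "y"), (5, "z")])

# Left outer join: every t1 row is kept; columns are (id, col1, col2).
left_rows = conn.execute(
    "SELECT * FROM t1 LEFT JOIN t2 USING (id) ORDER BY t1.id"
).fetchall()

# Right outer join of t1 and t2, written as a left join with the tables
# swapped: every t2 row is kept; columns are (id, col2, col1).
right_rows = conn.execute(
    "SELECT * FROM t2 LEFT JOIN t1 USING (id) ORDER BY t2.id"
).fetchall()

print(left_rows)   # None marks the nulls for the unmatched t1 row (id 2)
print(right_rows)  # None marks the nulls for the unmatched t2 row (id 5)
```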

Theta join
A join that may include comparison operators other than just =, such as <> > >= < <=
A theta join is the most general form of join.
Examples of using things other than = follow
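
As one illustration of a join condition that uses > rather than =, the sketch below lists the employees who earn more than Ned. It is an added example, not one of the original slides: it assumes Python's built-in sqlite3 module (SQLite, not PostgreSQL), four rows of the emp table from the self-join slides, and a query made up for this purpose.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE emp (empno INTEGER, empfname TEXT, empsalary INTEGER)")
conn.executemany(
    "INSERT INTO emp VALUES (?, ?, ?)",
    [
        (1, "Alice", 75000),
        (2, "Ned", 45000),
        (7, "Brier", 43000),
        (8, "Sarah", 56000),
    ],
)

# The two copies of emp are linked by a > comparison instead of an = match.
earn_more_than_ned = conn.execute(
    "SELECT other.empfname "
    "FROM emp other, emp ned "
    "WHERE ned.empfname = 'Ned' "
    "AND other.empsalary > ned.empsalary "
    "ORDER BY other.empno"
).fetchall()

print(earn_more_than_ned)
```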
