SQL Query Language SQL versus RA Basic SQL query

SQL: Structured Query Language

Query Language
• SQL is a query language
• Used to examine data in the database
• SQL queries do not change the contents of the database (no side-effects!)
• The result of an SQL query is printed to the screen, not stored in the database

SQL versus RA
• Relational algebra (RA) is the theoretical basis of SQL.
• However, there are many differences:
– RA is over a set data model, SQL over a multiset data model
– RA uses 2-valued logic for conditions and SQL uses 3-valued logic
– SQL has many additional features, which make it Turing-complete
• Understanding RA is an important basis for understanding and optimizing SQL queries!

Basic SQL query structure
SELECT Attributes
FROM relations
WHERE condition

For example:
SELECT sid, sname
FROM students
WHERE sid=1122

Query Components
• A query can contain the following clauses
– select
– from
– where
– group by
– having
– order by
• Only select and from are obligatory
• Order of clauses is always as above

Very Basic SQL Query
SELECT [Distinct] Attributes
FROM relation
• Attributes: The attributes or values which will appear in the query result (For example: id, name).
• DISTINCT: Optional keyword to delete duplicates
• Relation: Relation to perform the query on.

Example:
Select studentID, studentName
From students

Students
StudentID  StudentDept.  StudentName  StudentAge
1123       Math          Moshe        25
2245       Computers     Mickey       26
55611      Math          Menahem      29

Select studentID, studentName
From students

Result:
StudentID  StudentName
1123       Moshe
2245       Mickey
55611      Menahem

Basic SQL Query
SELECT [Distinct] Attributes
FROM relation
WHERE condition
• condition: A Boolean condition (For example: Eid>21, or Ename='Yuval'). Only tuples which return 'true' for this condition will appear in the result

Select studentID, studentName
From students
Where StudentDept='Math'

Result:
StudentID  StudentName
1123       Moshe
55611      Menahem

SQL and relational algebra
SELECT Distinct A1,…,An
FROM R1,…,Rm
WHERE C;

corresponds to the relational algebra expression

$\pi_{A_1,\dots,A_n}(\sigma_C(R_1 \times \dots \times R_m))$

Basic SQL Query
SELECT [Distinct] attributes
FROM relations
WHERE condition;
Important! The evaluation order, conceptually, is:
1. Compute the cross product of the tables in relations.
2. Delete all rows that do not satisfy condition.
3. Delete all columns that do not appear in attributes.
4. If Distinct is specified, eliminate duplicate rows.

Example Tables Used

Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0

Boats
bid  bname   color
101  Nancy   red
103  Gloria  green

Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96

What does this compute?
Select sname
from sailors, reserves
Where sailors.sid=reserves.sid

All sailors who have reserved a boat

Stage 1: Sailors x Reserves
sid  sname   rating  age   sid  bid  day
22   Dustin  7       45.0  22   101  10/10/96
22   Dustin  7       45.0  58   103  11/12/96
31   Lubber  8       55.5  22   101  10/10/96
31   Lubber  8       55.5  58   103  11/12/96
58   Rusty   10      35.0  22   101  10/10/96
58   Rusty   10      35.0  58   103  11/12/96

Stage 2: "where sailors.sid=reserves.sid"
Of the six rows of Sailors x Reserves, only the rows in which both sid values agree are kept:
sid  sname   rating  age   sid  bid  day
22   Dustin  7       45.0  22   101  10/10/96
58   Rusty   10      35.0  58   103  11/12/96

Stage 3: "select sname"
Final answer
sname
Dustin
Rusty
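
The three conceptual stages can be replayed by hand. The following is a minimal sketch, assuming Python's built-in sqlite3 module as a stand-in DBMS; the variable names (stage1, stage2, stage3, direct) are illustrative and not part of the slides.

```python
import sqlite3

# Example tables from the slides, loaded into an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5),(58,'Rusty',10,35.0);
INSERT INTO Reserves VALUES (22,101,'10/10/96'),(58,103,'11/12/96');
""")

# Stage 1: cross product of the relations in the FROM clause.
stage1 = conn.execute("SELECT * FROM Sailors, Reserves").fetchall()
# Stage 2: keep rows satisfying the WHERE condition (columns 0 and 4 hold the two sids).
stage2 = [row for row in stage1 if row[0] == row[4]]
# Stage 3: keep only the SELECT attributes (column 1 is sname).
stage3 = [row[1] for row in stage2]

# Letting the DBMS evaluate the whole query should give the same answer.
direct = [r[0] for r in conn.execute(
    "SELECT sname FROM Sailors, Reserves WHERE Sailors.sid = Reserves.sid")]

print(len(stage1), len(stage2), sorted(stage3), sorted(direct))
```

A real DBMS is free to choose a different, more efficient evaluation plan; the stages only define what the result must be.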

Example Query
SELECT sname, age
FROM Sailors
WHERE rating>7;
Q: What does this compute?

Example Query
SELECT DISTINCT sname
FROM Sailors, Reserves
WHERE Sailors.sid = Reserves.sid and
bid = 103;
Q: What does this compute?

Select sname
From Sailors, Reserves
WHERE Sailors.sid = Reserves.sid and
bid = 103;

Sailors x Reserves
sid  sname   rating  age   sid  bid  day
22   Dustin  7       45.0  22   101  10/10/96
22   Dustin  7       45.0  58   103  11/12/96
31   Lubber  8       55.5  22   101  10/10/96
31   Lubber  8       55.5  58   103  11/12/96
58   Rusty   10      35.0  22   101  10/10/96
58   Rusty   10      35.0  58   103  11/12/96

Only the last row satisfies both parts of the condition, so the result is Rusty.

A Few SELECT Options
• Select all columns:
SELECT *
FROM Sailors;
• Rename selected columns:
SELECT sname AS Sailors_Name
FROM Sailors;
• Applying functions (e.g., Mathematical manipulations)
SELECT (age-5)*2
FROM Sailors;

The WHERE Clause
• Numerical and string comparison:
!=, <>, =, <, >, >=, <=, BETWEEN val1 AND val2
• Logical components: AND, OR
• Null verification: IS NULL, IS NOT NULL
• Checking against a list with IN, NOT IN.

Examples
SELECT sname
FROM Sailors
WHERE age>=40 AND rating IS NOT NULL;

SELECT sid, sname
FROM sailors
WHERE sid IN (1223, 2334, 3344) or
sname BETWEEN 'George' AND 'Paul';

The LIKE Operator
• A pattern matching operator (a restricted form of regular expression)
• Basic format: colname LIKE pattern
– Example:
SELECT sid
FROM Sailors
WHERE sname LIKE 'R_%y';
_ is a single character
% is 0 or more characters

Relation naming
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid and
R.bid = 103;
• Naming relations is good style
• It is necessary if the same relation appears twice in the FROM clause

Example Query
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid and
R.bid != 103;
Q: Does this return the names of sailors who did not reserve boat 103?
A: No! It returns the names of sailors who reserved a boat other than boat 103
Explanation in the next slides

SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid and
R.bid != 103;

Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5

Reserves
sid  bid  day
22   101  10/10/07
22   103  11/12/07
31   104  12/2/07

Sailors x Reserves
sid  sname   rating  age   sid  bid  day
22   Dustin  7       45.0  22   101  10/10/07
22   Dustin  7       45.0  22   103  11/12/07
22   Dustin  7       45.0  31   104  12/2/07
31   Lubber  8       55.5  22   101  10/10/07
31   Lubber  8       55.5  22   103  11/12/07
31   Lubber  8       55.5  31   104  12/2/07

Rows satisfying the WHERE condition
sid  sname   rating  age   sid  bid  day
22   Dustin  7       45.0  22   101  10/10/07
31   Lubber  8       55.5  31   104  12/2/07

Result
sname
Dustin
Lubber

But Dustin did order boat 103!
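
The difference between "reserved a boat other than 103" and "did not reserve boat 103" can be checked on the same data. This is a sketch assuming Python's sqlite3 module; the second query, which uses NOT IN, is only one possible formulation and the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5);
INSERT INTO Reserves VALUES (22,101,'10/10/07'),(22,103,'11/12/07'),(31,104,'12/2/07');
""")

# Query from the slide: sailors with at least one reservation of a boat other than 103.
other_than_103 = sorted({r[0] for r in conn.execute("""
    SELECT S.sname FROM Sailors S, Reserves R
    WHERE S.sid = R.sid AND R.bid != 103""")})

# One way to ask for sailors that have no reservation of boat 103 at all.
never_103 = sorted(r[0] for r in conn.execute("""
    SELECT S.sname FROM Sailors S
    WHERE S.sid NOT IN (SELECT R.sid FROM Reserves R WHERE R.bid = 103)"""))

print(other_than_103)  # includes Dustin, although he did reserve boat 103
print(never_103)
```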

Are any of these the same?
SELECT S.sid
FROM Sailors S, Reserves R
WHERE S.sid = R.sid;

SELECT DISTINCT R.sid
FROM Sailors S, Reserves R
WHERE S.sid = R.sid;

SELECT R.sid
FROM Reserves R

SQL query
Sailors(sid, sname, rating, age)
Reserves(sid, bid, day)

SELECT S.sid
FROM Sailors S, Reserves R
WHERE S.sid = R.sid;
When would adding DISTINCT give a different result?
When there is a sailor who reserved more than a single boat

Example Query
SELECT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid and
R.bid = B.bid and
B.color = 'red'
Q: What does this return?

SQL query
How would you find sailors who have reserved more than one boat?
SELECT S.sname
FROM Sailors S, Reserves R1,
Reserves R2
WHERE S.sid = R1.sid and
R1.sid=R2.sid and R1.bid!=R2.bid;

SQL query
Q: How would you find the colors of boats reserved by Bob?
A:
SELECT distinct B.color
FROM Sailors S, Reserves R, Boats B
WHERE S.sname = 'Bob' and
S.sid = R.sid and R.bid = B.bid

Order Of the Result
• The ORDER BY clause can be used to sort results by one or more columns
• The default sorting, when ORDER BY is used, is in ascending order
• Can specify ASC or DESC
SELECT sname, rating, age
FROM Sailors S
WHERE age > 50
ORDER BY rating ASC, age DESC

What does this return?
SELECT DISTINCT S.sname
FROM Sailors S, Reserves R, Boats B
WHERE S.sid = R.sid and
R.bid = B.bid and
(B.color = 'red' or
B.color='green')
What would happen if we replaced or by and?
We would get no results!
Then how can we find sailors who have reserved both a green and a red boat?

Sailors who've reserved red and green boats
SELECT S.sname
FROM Sailors S, Reserves R1, Reserves R2,
Boats B1, Boats B2
WHERE S.sid = R1.sid and R1.bid = B1.bid
and B1.color = 'red' and
S.sid = R2.sid and R2.bid = B2.bid
and B2.color = 'green';

Other Relational Algebra Operators
• So far, we have seen selection, projection and Cartesian product
• How do we do operators UNION and MINUS?

Three SET Operators
• [Query] UNION [Query]
• [Query] EXCEPT [Query]
• [Query] INTERSECT [Query]
• Note: The operators remove duplicates by default!

Multiset (Bag) Operators
• Union without removing duplicates: UNION ALL
SELECT DISTINCT sname
FROM Sailors S
UNION ALL
SELECT DISTINCT sname
FROM Sailors S

Sailors who've reserved red or green boat
SELECT S.sname
FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid and R.bid = B.bid
and B.color = 'red'
UNION
SELECT S.sname
FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid and R.bid = B.bid
and B.color = 'green';
Would INTERSECT give us sailors who reserved both red and green boats?
Almost, but not quite because sname is not unique…
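
The remark that sname is not unique can be made concrete. The sketch below assumes Python's sqlite3 module and hypothetical data in which two different sailors are both called Bob; the helper names reserved and run are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE Boats (bid INTEGER, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
-- Hypothetical data: two different sailors who share the name Bob.
INSERT INTO Sailors VALUES (1,'Bob'),(2,'Bob');
INSERT INTO Boats VALUES (101,'red'),(102,'green');
INSERT INTO Reserves VALUES (1,101),(2,102);
""")

def reserved(column, color):
    # SQL text: the given column of sailors who reserved a boat of the given color.
    return f"""SELECT S.{column} FROM Sailors S, Boats B, Reserves R
               WHERE S.sid = R.sid AND R.bid = B.bid AND B.color = '{color}'"""

def run(sql):
    return sorted(r[0] for r in conn.execute(sql))

union = run(reserved("sname", "red") + " UNION " + reserved("sname", "green"))
union_all = run(reserved("sname", "red") + " UNION ALL " + reserved("sname", "green"))
by_name = run(reserved("sname", "red") + " INTERSECT " + reserved("sname", "green"))
by_sid = run(reserved("sid", "red") + " INTERSECT " + reserved("sid", "green"))

print(union, union_all)  # UNION removes the duplicate name, UNION ALL keeps it
print(by_name, by_sid)   # intersecting names reports Bob, intersecting sids reports nobody
```

Intersecting on a key such as sid avoids the false positive; the names can then be looked up from the resulting sids.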

Nested Queries
• A query is nested if one of its clauses contains a query
• Queries can be nested in the following clauses:
– Select
– From
– Where
– Having

Nested Queries (cont)
• A sub-query of a nested query is correlated if it refers to relations appearing in the outer portion of the query
• We start by discussing subqueries in the WHERE clause

Remember!
• The WHERE clause is evaluated for each tuple in the Cartesian Product formed by the FROM clause, and a Boolean answer is returned
– The subquery is used to define the Boolean answer!
– Common operators used to correlate are: IN, ANY/ALL, EXISTS

Nested queries in WHERE
Subqueries with multiple results:
SELECT S.sname
FROM Sailors S
WHERE S.sid IN (SELECT R.sid
                FROM Reserves R
                WHERE R.bid = 103);
What would happen if we wrote NOT IN?

In/Not In Format
• The format of these subqueries is always:
– Attribute or value
– In / Not in
– Subquery that returns a single column
• Returns true if the attribute value is in / not in the result of the subquery

What does this produce?
SELECT S.sname
FROM Sailors S
WHERE S.sid NOT IN
  (SELECT R.sid
   FROM Reserves R
   WHERE R.bid IN
     (SELECT B.bid
      FROM Boats B
      WHERE B.color='red'))
Names of sailors who did not reserve a red boat

Any/All Format
• The format of these subqueries is always:
– Attribute or value
– Arithmetic comparison operator
– Any / All
– Subquery that returns a single column
• Returns true if the attribute value satisfies the arithmetic operator with respect to any/all of the query results

Set-Comparison Queries
SELECT *
FROM Sailors S1
WHERE S1.age > ANY (SELECT S2.age
                    FROM Sailors S2);
We can also use op ALL (op is >, <, =, >=, <=, or <>).

Exists/Not Exists Format
• The format of these subqueries is always:
– Exists / Not Exists
– Subquery that returns any number of columns
• Returns true if the subquery returns a nonempty (resp. empty) result

Correlated Nested Queries
SELECT S.sid
FROM Sailors S
WHERE EXISTS (SELECT *
              FROM Reserves R
              WHERE R.bid = 103 and
              S.sid = R.sid);
S is not defined in the subquery; it refers to the outer loop
Sid of sailors who reserved boat 103
Q: What if we wrote NOT EXISTS?
A: We would get sid of sailors who did not reserve boat 103

Exists and Not Exists
• Differs from In and Not In
• Exists:
For every tuple in the outer loop, the inner loop is tested. If the inner loop produces a result, the outer tuple is added to the result.
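
The EXISTS and NOT EXISTS variants of the correlated query can be compared directly. This is a small sketch assuming Python's sqlite3 module and the example tables used earlier; the helper name sids is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5),(58,'Rusty',10,35.0);
INSERT INTO Reserves VALUES (22,101,'10/10/96'),(58,103,'11/12/96');
""")

def sids(keyword):
    # keyword is "EXISTS" or "NOT EXISTS"; the subquery refers to S from the outer query,
    # so it is conceptually re-evaluated for every sailor.
    sql = f"""SELECT S.sid FROM Sailors S
              WHERE {keyword} (SELECT * FROM Reserves R
                               WHERE R.bid = 103 AND S.sid = R.sid)
              ORDER BY S.sid"""
    return [r[0] for r in conn.execute(sql)]

reserved_103 = sids("EXISTS")
not_reserved_103 = sids("NOT EXISTS")
print(reserved_103, not_reserved_103)
```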

How would you find the names of sailors who have reserved a red boat but not a green boat?
SELECT SS.sname
FROM Sailors SS
WHERE SS.sid IN
  ( SELECT R1.sid
    FROM Reserves R1, Boats B1
    WHERE R1.bid=B1.bid and B1.color='red'
    EXCEPT
    SELECT R2.sid
    FROM Reserves R2, Boats B2
    WHERE R2.bid=B2.bid and B2.color='green' );

Rewrite using not in
SELECT SS.sname
FROM Sailors SS
WHERE SS.sid IN
  ( SELECT R1.sid
    FROM Reserves R1, Boats B1
    WHERE R1.bid=B1.bid and B1.color='red' and
    R1.sid NOT IN (
      SELECT R2.sid
      FROM Reserves R2, Boats B2
      WHERE R2.bid=B2.bid and B2.color='green'
    ));

How would you find the sailors who have reserved all boats?

Remember: Algebraic Operator of Division
• Consider: A(X,Y) and B(Y).
Then $A \div B = \{\, x \mid (\forall y \in B)\ \langle x, y \rangle \in A \,\}$
• In general, we require that the set of fields in B be contained in those of A.

Reminder: Suppliers from A who supply All Parts from B

A
sno  pno
S1   P1
S1   P2
S1   P3
S1   P4
S2   P1
S2   P2
S3   P2
S4   P2
S4   P4

(1) B1 has the single pno P2:         A ÷ B1 = {S1, S2, S3, S4}
(2) B2 has the pno values P2, P4:     A ÷ B2 = {S1, S4}
(3) B3 has the pno values P1, P2, P4: A ÷ B3 = {S1}

So what is the result of this?
Reserves(sid,bid) ÷ Boats(bid)
Sailors who Reserved all Boats
Sailor S whose "set of boats reserved" contains the "set of all boats"

What is the strategy for finding sailors who have reserved all boats?
The sailors for which there does not exist a boat which they have not reserved

Sailors who reserved all boats (Division 1)
Sailors for which there does not exist a boat that they did not reserve
SELECT sid
FROM Sailors S
WHERE NOT EXISTS
  (SELECT B.bid
   FROM Boats B
   WHERE B.bid NOT IN
     (SELECT R.bid
      FROM Reserves R
      WHERE R.sid = S.sid));

Sailors who reserved all boats (Division 2)
Sailors for which there does not exist a boat that they did not reserve
SELECT S.sid
FROM Sailors S
WHERE NOT EXISTS((SELECT B.bid
                  FROM Boats B)
                 EXCEPT
                 (SELECT R.bid
                  FROM Reserves R
                  WHERE R.sid = S.sid));

Sailors who reserved all boats (Division 3)
Sailors for which there does not exist a boat for which there is no reservation in Reserves
SELECT S.sid
FROM Sailors S
WHERE NOT EXISTS(
  SELECT B.bid
  FROM Boats B
  WHERE NOT EXISTS(
    SELECT R.bid
    FROM Reserves R
    WHERE R.bid=B.bid and
    R.sid=S.sid))
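
Two of the division formulations can be run side by side to check that they agree. The sketch assumes Python's sqlite3 module and hypothetical reservations in which only sailor 22 has reserved every boat; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT);
CREATE TABLE Boats (bid INTEGER, color TEXT);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER);
INSERT INTO Sailors VALUES (22,'Dustin'),(31,'Lubber'),(58,'Rusty');
INSERT INTO Boats VALUES (101,'red'),(103,'green');
-- Hypothetical reservations: only sailor 22 has reserved every boat.
INSERT INTO Reserves VALUES (22,101),(22,103),(58,103);
""")

# Formulation with NOT EXISTS / NOT IN.
with_not_in = [r[0] for r in conn.execute("""
    SELECT S.sid FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid FROM Boats B
                      WHERE B.bid NOT IN (SELECT R.bid FROM Reserves R
                                          WHERE R.sid = S.sid))""")]

# Formulation with two nested NOT EXISTS.
with_not_exists = [r[0] for r in conn.execute("""
    SELECT S.sid FROM Sailors S
    WHERE NOT EXISTS (SELECT B.bid FROM Boats B
                      WHERE NOT EXISTS (SELECT R.bid FROM Reserves R
                                        WHERE R.bid = B.bid AND R.sid = S.sid))""")]

print(with_not_in, with_not_exists)
```

Sailor 31 has no reservations at all, so every boat is a boat that sailor 31 did not reserve, and both queries exclude that sailor.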

Aggregation

Aggregate Operators
• The aggregate operators available in SQL are:
– COUNT(*)
– COUNT([DISTINCT] A)
– SUM([DISTINCT] A)
– AVG([DISTINCT] A)
– MAX(A)
– MIN(A)

Some Examples
SELECT COUNT(*)
FROM Sailors S

SELECT COUNT(sid)
FROM Sailors S

SELECT AVG(S.age)
FROM Sailors S
WHERE S.rating=10

SELECT COUNT(distinct color)
FROM Boats

Find Average Age for each Rating
• So far, aggregation has been applied to all tuples that passed the WHERE clause test.
• How can we apply aggregation to groups of tuples?

Find Average Age for each Rating
SELECT AVG(age)
FROM Sailors
GROUP BY rating;

Basic SQL Query
SELECT [Distinct] attributes
FROM relation-list
WHERE condition
GROUP BY grouping-attributes
HAVING group-condition;
• attributes: must appear in grouping-attributes or aggregation operators
• group-condition: Constrains groups. Can only constrain attributes appearing in grouping-attributes or in aggregation operators

Evaluation - important!
SELECT [Distinct] attributes
FROM relation-list
WHERE condition
GROUP BY grouping-attributes
HAVING group-condition;
1. Compute cross product of relation-list
2. Tuples failing condition are thrown away
3. Tuples are partitioned into groups by values of grouping-attributes
4. The group-condition is applied to eliminate groups
5. One answer is generated for each group!

SELECT AVG(age)
FROM Sailors
GROUP BY rating;

Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0
63   Fluffy  7       44.0
78   Morley  7       31.0
84   Popeye  10      33.0

Sailors, partitioned by rating
sid  sname   rating  age
22   Dustin  7       45.0
63   Fluffy  7       44.0
78   Morley  7       31.0
31   Lubber  8       55.5
58   Rusty   10      35.0
84   Popeye  10      33.0

Result: 40 (rating 7), 55.5 (rating 8), 34 (rating 10)

SELECT AVG(age)
FROM Sailors
Where age<50
GROUP BY rating
Having count(*)>2;

Step 1: the relation in the FROM clause
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0
63   Fluffy  7       44.0
78   Morley  7       31.0
84   Popeye  10      33.0

Step 2: tuples failing age<50 are thrown away
sid  sname   rating  age
22   Dustin  7       45.0
58   Rusty   10      35.0
63   Fluffy  7       44.0
78   Morley  7       31.0
84   Popeye  10      33.0

Step 3: tuples are partitioned into groups by rating
sid  sname   rating  age
22   Dustin  7       45.0
63   Fluffy  7       44.0
78   Morley  7       31.0
58   Rusty   10      35.0
84   Popeye  10      33.0

Step 4: groups failing count(*)>2 are eliminated
sid  sname   rating  age
22   Dustin  7       45.0
63   Fluffy  7       44.0
78   Morley  7       31.0

Final Answer:
40
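
Both grouping queries can be checked against the six-row Sailors table. A minimal sketch, assuming Python's sqlite3 module; the ORDER BY and the extra rating column in the first query are added here only to make the output easier to read.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES
  (22,'Dustin',7,45.0),(31,'Lubber',8,55.5),(58,'Rusty',10,35.0),
  (63,'Fluffy',7,44.0),(78,'Morley',7,31.0),(84,'Popeye',10,33.0);
""")

# One answer per group: the average age for each rating.
per_rating = conn.execute(
    "SELECT rating, AVG(age) FROM Sailors GROUP BY rating ORDER BY rating").fetchall()

# WHERE filters tuples before grouping; HAVING filters whole groups afterwards.
filtered = conn.execute("""
    SELECT AVG(age) FROM Sailors
    WHERE age < 50
    GROUP BY rating
    HAVING COUNT(*) > 2""").fetchall()

print(per_rating)
print(filtered)
```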

Find name and age of oldest Sailor..?
SELECT S.sname, MAX(S.age)
FROM Sailors S
Wrong! Why?

SELECT S.sname, MAX(S.age)
FROM Sailors S
GROUP BY S.sname
Wrong! Why?

Find name and age of oldest Sailor
SELECT S.sname, S.age
FROM Sailors S
WHERE S.age =
  (SELECT MAX(S2.age)
   FROM Sailors S2)
Right!!
How else can this be done?
HINT: Use ALL

What does this return?
SELECT B.bid, COUNT(*)
FROM Boats B, Reserves R
WHERE R.bid=B.bid and B.color='red'
GROUP BY B.bid
What would happen if we put the condition about the color in the HAVING clause?

SELECT B.bid, COUNT(*)
FROM Boats B, Reserves R
WHERE R.bid=B.bid
GROUP BY B.bid, B.color
HAVING B.color='red'

What does this return?
SELECT bname
FROM Boats B, Reserves R
WHERE R.bid=B.bid
GROUP BY B.bid, bname
HAVING count(DISTINCT day) <= 5
Names of Boats that were not Reserved on more than 5 days
Can we move the condition in the HAVING to the WHERE?
No! Aggregate functions are not allowed in WHERE

The Color for which there are the most boats..?
SELECT color
FROM Boats B
GROUP BY color
HAVING max(count(bid))
What is wrong with this?
How would you fix it?

The Color for which there are the most boats
SELECT color
FROM Boats B
GROUP BY color
HAVING count(bid) >= ALL
  (SELECT count(bid)
   FROM Boats
   GROUP BY Color)

Aggregation Instead of Exists
• Aggregation can take the place of exists.
• What does this return?
SELECT color
FROM Boats B1
WHERE NOT EXISTS(
  SELECT *
  FROM Boats B2
  WHERE B1.bid <> B2.bid
  AND B1.color=B2.color)
The color of boats which have a unique color (no other boats with the same color)

Aggregation Instead of Exists
SELECT color
FROM Boats B1
GROUP BY color
HAVING count(bid) = 1
Somewhat simpler…

Subqueries in the FROM and in the SELECT clauses

A Complex Query
Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0

Result
sid  age   Max-age
22   45.0  55.5
31   55.5  55.5
58   35.0  55.5

What We Want:
• We would like to create a table containing 3 columns:
– Sailor id
– Sailor age
– Age of the oldest Sailor (same value in all rows)
How can this be done?

Attempt 1
SELECT S.sid, S.age, MAX(S.age)
FROM Sailors S;
Why is this wrong?

Attempt 2
SELECT S.sid, S.age, MAX(S.age)
FROM Sailors S
GROUP BY S.sid, S.age;
Why is this wrong?

Solution 1: Subquery in FROM
SELECT S.sid, S.age, M.mx
FROM Sailors S, (SELECT MAX(S2.age) as mx
                 FROM Sailors S2) M;
• We can put a query in the FROM clause instead of a table
• The query in the FROM clause must be renamed with a range variable (M in this case).

Solution 2: Subquery in SELECT
SELECT S.sid, S.age, (SELECT MAX(S2.age)
                      FROM Sailors S2)
FROM Sailors S;
• A query in the SELECT clause must return at most one value for each row returned by the outer query.

Another Example of a Sub-query in SELECT
SELECT S.sid, S.age, (SELECT MAX(S2.age)
                      FROM Sailors S2
                      WHERE S2.age<S.age)
FROM Sailors S;
• What does this query return?
• Note the use of S (defined in the outer query) within the inner query.

Result:
Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0

Result
sid  age   (Select …)
22   45.0  35.0
31   55.5  45.0
58   35.0  null
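
The uncorrelated subquery in FROM and the correlated subquery in SELECT can both be run on the three-row Sailors table. This is a sketch assuming Python's sqlite3 module, which reports SQL null as None; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5),(58,'Rusty',10,35.0);
""")

# Subquery in FROM: computed once, then combined with every sailor.
with_max = conn.execute("""
    SELECT S.sid, S.age, M.mx
    FROM Sailors S, (SELECT MAX(S2.age) AS mx FROM Sailors S2) M
    ORDER BY S.sid""").fetchall()

# Correlated subquery in SELECT: conceptually re-evaluated for each sailor S.
next_oldest = conn.execute("""
    SELECT S.sid, S.age, (SELECT MAX(S2.age) FROM Sailors S2 WHERE S2.age < S.age)
    FROM Sailors S
    ORDER BY S.sid""").fetchall()

print(with_max)
print(next_oldest)  # the youngest sailor gets null, since no sailor is younger
```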

Another Example of a Sub-query in FROM??
SELECT S.sid, S.age, M.mx
FROM Sailors S, (SELECT MAX(S2.age) as mx
                 FROM Sailors S2
                 WHERE S2.age<S.age);
Why is this wrong?

Null Values

What does NULL Mean?
• There are different interpretations to a value of NULL
1. Value Unknown: I know that there is a value that belongs here, but I don't know what it is.
– Example: Birthday attribute
2. Value Inapplicable: There is no value that makes sense here
– Example: Spouse attribute for unmarried person

What does NULL Mean? (cont)
3. Value Withheld: We are not entitled to know this value
– Example: phone number attribute

Null Values in Expressions
• Two important rules:
– When we operate on NULL and any other value (including another NULL), using an arithmetic operator like * or +, the result is always NULL
– When we compare NULL and any other value (including another NULL), using a comparison operator like = or >, the result is UNKNOWN.
• The correct way to determine if an attribute x has value NULL is using x IS NULL or x IS NOT NULL, which will return true or false

3 Valued Logic: True, False, Unknown
A        B        A and B  A or B
True     True     True     True
True     False    False    True
True     Unknown  Unknown  True
False    True     False    True
False    False    False    False
False    Unknown  False    Unknown
Unknown  True     Unknown  True
Unknown  False    False    Unknown
Unknown  Unknown  Unknown  Unknown

A        Not A
True     False
False    True
Unknown  Unknown

Only tuples for which the WHERE clause has value TRUE are used to create tuples in the result
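
A few entries of these rules can be probed directly. The sketch below assumes Python's sqlite3 module; SQLite reports TRUE as 1, FALSE as 0 and UNKNOWN/NULL as None, and the helper name value and the sample rows are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def value(expr):
    # Evaluates a constant SQL expression and returns its single result.
    return conn.execute(f"SELECT {expr}").fetchone()[0]

arithmetic = value("NULL + 1")     # arithmetic on NULL gives NULL
comparison = value("NULL = NULL")  # comparing with NULL gives UNKNOWN
and_false = value("NULL AND 0")    # Unknown and False
and_true = value("NULL AND 1")     # Unknown and True
or_true = value("NULL OR 1")       # Unknown or True
or_false = value("NULL OR 0")      # Unknown or False
is_null = value("NULL IS NULL")    # IS NULL always answers true or false

# Hypothetical rows: sailor 99 has no rating, so the condition is UNKNOWN for that row.
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, rating INTEGER);
INSERT INTO Sailors VALUES (22, 7), (31, 3), (99, NULL);
""")
kept = [r[0] for r in conn.execute(
    "SELECT sid FROM Sailors WHERE rating > 5 OR rating <= 5 ORDER BY sid")]

print(arithmetic, comparison, and_false, and_true, or_true, or_false, is_null, kept)
```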

What does this return?
SELECT *
FROM Sailors
WHERE sname = sname

What does this return?
SELECT *
FROM Sailors
WHERE rating > 5 or rating <= 5

What do these return?
SELECT sname, rating * 0
FROM Sailors

SELECT sname, rating - rating
FROM Sailors

Nulls in Aggregation Functions
• count(*): counts all rows (even rows that are all null values)
• count(A): counts non-null A-s. returns 0 if all As are null
• sum(A), avg(A), min(A), max(A)
– ignore null values of A
– if A only contains null value, the result is null

Distinct and Group By
• Rows are considered identical, for group by and distinct, if they have all the same non-null values and both have null values in the same columns
• Distinct removes duplicates of such rows
• Such rows form a single group when using GROUP BY

Example
R
B  C
1  null
2  null
3  4
3  5

SELECT count(*), count(c), min(c), sum(c)
FROM (SELECT c
      FROM R
      WHERE c IS NULL or c <> NULL
      GROUP BY c)
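
The example query can be evaluated to see how the null rules interact. This is a sketch assuming Python's sqlite3 module, which reports SQL null as None; the variable name result is illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE R (b INTEGER, c INTEGER);
INSERT INTO R VALUES (1, NULL), (2, NULL), (3, 4), (3, 5);
""")

# c <> NULL is UNKNOWN for every row, so only the rows with c IS NULL pass the WHERE.
# GROUP BY then places the two null rows into a single group.
result = conn.execute("""
    SELECT count(*), count(c), min(c), sum(c)
    FROM (SELECT c
          FROM R
          WHERE c IS NULL OR c <> NULL
          GROUP BY c)""").fetchone()

print(result)
```

The inner query yields one row whose c is null: count(*) counts that row, count(c) ignores it, and min and sum over only null values are null.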

Join Operators in the FROM Clause
"Syntactic Sugar"

Shorthand for Conditional Join
SELECT S1.sname, S2.sname
FROM Sailors S1, Sailors S2
WHERE S1.sid != S2.sid and
S1.sname = 'Rusty'

SELECT S1.sname, S2.sname
FROM Sailors S1 JOIN Sailors S2 on
S1.sid != S2.sid
WHERE S1.sname = 'Rusty'

Shorthand for Equi-Join
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid and
S.age > 20

SELECT S.sname
FROM Sailors S JOIN Reserves R USING (sid)
WHERE S.age > 20

Shorthand for Natural Join
SELECT S.sname
FROM Sailors S, Reserves R
WHERE S.sid = R.sid and
S.age > 20

SELECT S.sname
FROM Sailors S NATURAL JOIN Reserves R
WHERE S.age > 20
Requires equality on all common fields

Outer Join

Left Outer Join
• The left outer join of R and S contains:
– all the tuples in the join of R and S
– all the tuples in R that did not join with tuples from S, padded with null values
SELECT Sailors.sid, Reserves.bid
FROM Sailors NATURAL LEFT OUTER JOIN Reserves

Sailors
sid  sname   rating  age
22   Dustin  7       45.0
31   Lubber  8       55.5
58   Rusty   10      35.0

Reserves
sid  bid  day
22   101  10/10/96
58   103  11/12/96

Result
sid  bid
31   null
22   101
58   103

Right Outer Join, Full Outer Join
• The right outer join of R and S contains:
– all the tuples in the join of R and S
– all the tuples in S that did not join with tuples from R, padded with null values
• The full outer join of R and S contains:
– all the tuples in the left outer join of R and S
– all the tuples in the right outer join of R and S
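
The left outer join example can be compared with the plain natural join on the same tables. A minimal sketch, assuming Python's sqlite3 module (null appears as None); the ORDER BY is added here only to fix the output order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
CREATE TABLE Reserves (sid INTEGER, bid INTEGER, day TEXT);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5),(58,'Rusty',10,35.0);
INSERT INTO Reserves VALUES (22,101,'10/10/96'),(58,103,'11/12/96');
""")

# Plain natural join: sailor 31 has no reservation and disappears.
inner = conn.execute("""
    SELECT Sailors.sid, Reserves.bid
    FROM Sailors NATURAL JOIN Reserves
    ORDER BY Sailors.sid""").fetchall()

# Left outer join: sailor 31 is kept and padded with null.
left_outer = conn.execute("""
    SELECT Sailors.sid, Reserves.bid
    FROM Sailors NATURAL LEFT OUTER JOIN Reserves
    ORDER BY Sailors.sid""").fetchall()

print(inner)
print(left_outer)
```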

Express the Left Outer Join in SQL, without the Left Outer Join Operator
• Suppose we have R(A,B) and S(B,C).
• Can you write a query that returns the left outer join of R and S, that does not use the left outer join operator?

ALL and ANY: Special Cases

Query 1: What does this return?
SELECT *
FROM Sailors
WHERE age > ANY (SELECT age
                 FROM Sailors
                 WHERE sname='Joe')

Query 2: What does this return?
SELECT *
FROM Sailors
WHERE age > ALL (SELECT age
                 FROM Sailors
                 WHERE sname='Joe')

Query Q3: What does this return?
SELECT *
FROM Sailors
WHERE age = ANY (SELECT age
                 FROM Sailors
                 WHERE sname='Joe')

Query Containment
• We say that a query Q1 contains query Q2, if for all databases D, the result of applying Q1 to D contains the result of applying Q2 to D.
• Does Query 2 contain Query 1?
• Does Query 1 contain Query 2?
• (the answer to both questions is no – but why?)
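
One way to explore the containment question is to model the two conditions outside SQL. The following is a sketch in plain Python that assumes no null ages; the function names q_any and q_all and the sample ages are hypothetical and stand for Query 1 and Query 2.

```python
def q_any(ages, joe_ages):
    # age > ANY (subquery): the comparison must hold for at least one subquery value.
    return [a for a in ages if any(a > j for j in joe_ages)]

def q_all(ages, joe_ages):
    # age > ALL (subquery): the comparison must hold for every subquery value,
    # which is vacuously true when the subquery is empty.
    return [a for a in ages if all(a > j for j in joe_ages)]

ages = [30, 40, 50, 60]

# Database without any sailor named Joe: the subquery result is empty.
no_joe_any = q_any(ages, [])
no_joe_all = q_all(ages, [])

# Database with two sailors named Joe, aged 30 and 50.
two_joes_any = q_any(ages, [30, 50])
two_joes_all = q_all(ages, [30, 50])

print(no_joe_any, no_joe_all)
print(two_joes_any, two_joes_all)
```

With no Joe, the ALL query returns every sailor while the ANY query returns none; with two Joes, the ANY query returns strictly more than the ALL query. So neither query contains the other on all databases.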

Query Q4: What does this return?
SELECT *
FROM Sailors
WHERE age IN (SELECT age
              FROM Sailors
              WHERE sname='Joe')
Equivalent to Q3

Query Q5: What does this return?
SELECT *
FROM Sailors
WHERE age <> ANY (SELECT age
                  FROM Sailors
                  WHERE sname='Joe')

Query Q6: What does this return?
SELECT *
FROM Sailors
WHERE age NOT IN (SELECT age
                  FROM Sailors
                  WHERE sname='Joe')
Not equivalent to Q5 – why?

Views

What is a View?
• A view is a virtual table
• A view is defined by a query
• The result of the query is the contents of the virtual table
– always up to date with respect to the database
– does not exist, is computed every time referenced
• Changing a table (insert/update/delete) automatically changes the view

Defining a View
• CREATE VIEW <view-name> AS <view-def>;
– Where view-def is an SQL query
• Example:
CREATE VIEW GreatSailors AS
SELECT sid, sname
FROM Sailors
WHERE rating>=9

Defining a View
• Another example:
CREATE VIEW SailorsDates AS
SELECT S.sid, day
FROM Sailors S, Reserves R
WHERE S.sid = R.sid

Querying a View
• Once you have defined a view, you can use it in a query (in the same way that you use a relation)
SELECT sid
FROM GreatSailors
WHERE sname = 'Joe'

Querying a View
• You can use a view and a regular relation together in a query
SELECT bname
FROM GreatSailors G, Reserves R, Boats B
WHERE G.sid = R.sid and R.bid = B.bid

Understanding Queries using Views
• When writing a query with a view it is as if the expression defining the view is a subquery in the FROM clause
SELECT bname
FROM GreatSailors G,
Reserves R, Boats B
WHERE G.sid = R.sid and
R.bid = B.bid

SELECT bname
FROM (SELECT sid
      FROM Sailors
      WHERE rating >=9) G,
Reserves R, Boats B
WHERE G.sid = R.sid and R.bid
= B.bid
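
The claims that a view behaves like a subquery in the FROM clause and that it follows changes to the base table can be checked as follows. This is a sketch assuming Python's sqlite3 module; the inserted sailor (sid 99, Joe) is hypothetical data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age REAL);
INSERT INTO Sailors VALUES (22,'Dustin',7,45.0),(31,'Lubber',8,55.5),(58,'Rusty',10,35.0);
CREATE VIEW GreatSailors AS
  SELECT sid, sname FROM Sailors WHERE rating >= 9;
""")

# Querying the view ...
via_view = conn.execute("SELECT sid FROM GreatSailors ORDER BY sid").fetchall()
# ... gives the same rows as inlining its definition as a subquery in FROM.
via_subquery = conn.execute("""
    SELECT sid
    FROM (SELECT sid, sname FROM Sailors WHERE rating >= 9) G
    ORDER BY sid""").fetchall()

# The view is computed when referenced, so a change to Sailors is visible immediately.
conn.execute("INSERT INTO Sailors VALUES (99, 'Joe', 9, 30.0)")  # hypothetical sailor
after_insert = conn.execute("SELECT sid FROM GreatSailors ORDER BY sid").fetchall()

print(via_view, via_subquery, after_insert)
```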

What are views good for? (1)
• Simplifying complex queries: Here is an example that allows the user to "pretend" that there is a single table in the database
– CREATE VIEW SRB as
SELECT S.sid, sname, rating, age, R.bid, day, bname, color
FROM Sailors S, Boats B, Reserves R
WHERE S.sid = R.sid and R.bid = B.bid

What are views good for? (1)
• Now: Find snames of Sailors who reserved red boats on 1/11/09 using SRB
SELECT sname
FROM SRB
WHERE color = 'red' and day = '1/11/09'

What are views good for? (2)
• Security issues – preventing unauthorized access. Example: hiding the rating value
CREATE VIEW SailorInfo AS
SELECT sname, sid, age
FROM Sailors
grant SELECT on SailorInfo to shimon;

Modifying Views
• Sometimes it is possible to insert into, delete from, or update, a view !!!
• Actually, the user request is translated into a modification of the base tables (the tables used in the view definition)
• Modifications are possible only when the view is updatable

Updatable Views
• There are complex rules determining when a view is updatable
• Basically, updates are possible when the view is defined by selecting (SELECT, not SELECT DISTINCT) from a single relation R such that:
1. The WHERE clause does not involve R in a subquery
2. The FROM clause contains only the one relation R
3. The list in the SELECT clause must include enough attributes such that for every tuple inserted through the view, we can fill the other attributes with NULL or a default value

Inserting Example
CREATE VIEW GreatSailors AS
SELECT sid, sname
FROM Sailors
WHERE rating>=9

INSERT INTO GreatSailors
VALUES(113, 'Sam')

is translated into

INSERT INTO Sailors(sid, sname)
VALUES(113, 'Sam')
Interestingly, we won't see this tuple when we query GreatSailors

IMPORTANT NOTE
• There is no relation GreatSailors
• Insertion actually affects the table over which GreatSailors is defined, i.e., Sailors
• Similarly, deletion and updates will affect the underlying tables…

Deleting Example
CREATE VIEW GreatSailors AS
SELECT sid, sname
FROM Sailors
WHERE rating>=9

DELETE FROM GreatSailors
WHERE sname = 'John'

is translated into

DELETE FROM Sailors
WHERE sname = 'John' and rating>=9
We add the where condition from the view definition to make sure that only tuples appearing in the view are deleted

Updating Example
CREATE VIEW GreatSailors AS
SELECT sid, sname
FROM Sailors
WHERE rating>=9

Update GreatSailors
SET sname = 'Abraham'
WHERE sname = 'John'

is translated into

Update Sailors
SET sname = 'Abraham'
WHERE sname = 'John' and rating>=9

Postgres Support
• Older versions of Postgres (before 9.3) do not support updatable views; since version 9.3, simple views are automatically updatable
• Can achieve the same effect using triggers…

Recursion

Flights
• Flight(airline, from, to)
Airline      Frm          To
El Al        Tel Aviv     New York
Continental  New York     Los Angeles
Air Canada   Los Angeles  Toronto
Air Canada   Los Angeles  Montreal

Can you find?
• How can you find all places that you can get to by a direct flight from Tel Aviv?
SELECT to
FROM Flights
WHERE frm = 'Tel Aviv'

Can you find?
• How can you find all places that you can get to by a flight with one stop-over from Tel Aviv?
SELECT F2.to
FROM Flights F1, Flights F2
WHERE F1.frm = 'Tel Aviv' and F1.to = F2.frm

Can you find?
• How can you find all places that you can get to by a flight with zero or one stop-over from Tel Aviv?
SELECT to
FROM Flights
WHERE frm = 'Tel Aviv'
UNION
SELECT F2.to
FROM Flights F1, Flights F2
WHERE F1.frm = 'Tel Aviv' and F1.to = F2.frm

Why can't you find?
• How can you find all places that you can get to by a flight with any number of stop-overs from Tel Aviv?
• Problem: How many times should Flights appear in the FROM clause?

Recursion in SQL
• The SQL-99 standard allows us to define temporary relations which can be recursive
WITH R AS <definition of R> <query involving R>
Or more generally:
WITH
[RECURSIVE] R1 AS <definition of R1> , …,
[RECURSIVE] Rn AS <definition of Rn>
<query involving R1, .., Rn>

Example
WITH RECURSIVE Reaches(frm,to) AS
(SELECT frm, to FROM Flights)
UNION
(SELECT R1.frm, R2.to
FROM Reaches R1, Reaches R2
WHERE R1.to = R2.frm)
SELECT to FROM Reaches WHERE frm='Tel Aviv'
WITH RECURSIVE Reaches(frm,to) AS
Fix-Point Semantics
(SELECT frm, to FROM Flights)
UNION
• The value of Reaches is derived by
(SELECT R1.frm, R2.to
repeatedly evaluating its definition until no
FROM Reaches R1, Reaches R2
changes are made
WHERE R1.to = R2.frm)
SELECT to FROM Reaches WHERE frm=‘Tel Aviv’
– Before starting evaluation, Reaches is empty
– Then, its definition is repeatedly evaluated, and
Flights
the result defines Reaches
– This continues until no more changes appear
Reaches: Step 1
Airline
Frm
To
Frm
To
El Al
Tel Aviv
New York
Tel Aviv
New York
Los Angeles
New York
Los Angeles
Continental New York
Air Canada
Los Angeles Toronto
Los Angeles Toronto
Air Canada
Los Angeles Montreal
Los Angeles Montreal
153
Reaches: Step 2
Frm          To
Tel Aviv     New York
New York     Los Angeles
Los Angeles  Toronto
Los Angeles  Montreal
Tel Aviv     Los Angeles
New York     Toronto
New York     Montreal
Reaches: Step 3
Frm          To
Tel Aviv     New York
New York     Los Angeles
Los Angeles  Toronto
Los Angeles  Montreal
Tel Aviv     Los Angeles
New York     Toronto
New York     Montreal
Tel Aviv     Toronto
Tel Aviv     Montreal
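The step-by-step evaluation of Reaches can be reproduced with a small sketch of naive fix-point iteration. It models relations as Python sets of (frm, to) pairs and uses the self-join form of the Reaches definition; the names `step` and `history` are illustrative and not from the slides.

```python
# Flights projected onto (frm, to), as in the Flights table of the slides.
flights = {
    ("Tel Aviv", "New York"),
    ("New York", "Los Angeles"),
    ("Los Angeles", "Toronto"),
    ("Los Angeles", "Montreal"),
}

def step(reaches):
    """One evaluation of the definition: Flights UNION (Reaches joined with Reaches)."""
    joined = {
        (frm1, to2)
        for (frm1, to1) in reaches
        for (frm2, to2) in reaches
        if to1 == frm2
    }
    return flights | joined

reaches = set()   # before evaluation starts, Reaches is empty
history = []      # value of Reaches after each step that changed it
while True:
    new_value = step(reaches)
    if new_value == reaches:   # no more changes: fix point reached
        break
    reaches = new_value
    history.append(reaches)

for number, value in enumerate(history, start=1):
    print(f"Step {number}: {len(value)} tuples")
```

On this data the loop records three steps with 4, 7 and 9 tuples, and a fourth evaluation changes nothing.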
Reaches: Final Value
Frm          To
Tel Aviv     New York
New York     Los Angeles
Los Angeles  Toronto
Los Angeles  Montreal
Tel Aviv     Los Angeles
New York     Toronto
New York     Montreal
Tel Aviv     Toronto
Tel Aviv     Montreal
• The query returns the To values of the rows with Frm = Tel Aviv
(shown in red on the slide): New York, Los Angeles, Toronto, Montreal
Mutually Recursive Relations
• We can define several recursive queries, which
can use one another in their definitions.
• A dependency graph has a node for each relation
defined, and an edge from one node to another if
the first uses the second in its definition
– In particular, in the previous example, there would be an
edge from Reaches to itself
• R and S are mutually recursive, if there is a cycle
in the graph involving nodes R and S
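The dependency-graph test for mutual recursion can be sketched as a reachability check. The graph encoding (a dict from each relation to the set of relations its definition uses) and the function names are assumptions made for illustration. The second graph uses hypothetical relations P and Q that each use a base relation R and each other.

```python
def reachable_from(graph, start):
    """Relations reachable from `start` by following one or more edges."""
    seen = set()
    stack = list(graph.get(start, ()))
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen

def mutually_recursive(graph, r, s):
    """True if some cycle in the graph passes through both r and s."""
    return s in reachable_from(graph, r) and r in reachable_from(graph, s)

# Reaches uses Flights and itself, so it has an edge to itself.
reaches_graph = {"Reaches": {"Reaches", "Flights"}}
# Hypothetical P and Q: each uses base relation R and the other one.
pq_graph = {"P": {"R", "Q"}, "Q": {"R", "P"}}

print(mutually_recursive(reaches_graph, "Reaches", "Reaches"))
print(mutually_recursive(pq_graph, "P", "Q"))
print(mutually_recursive(pq_graph, "P", "R"))
```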
Problematic Recursion
• Complicated recursions are allowed
• However, sometimes the result may not be
well defined
– These cases are not allowed by SQL
• Before defining exactly what is not allowed,
we consider an example
Example
WITH
RECURSIVE P(x) AS
(SELECT * FROM R)
EXCEPT
(SELECT * FROM Q),
RECURSIVE Q(x) AS
(SELECT * FROM R)
EXCEPT
(SELECT * FROM P)
SELECT * FROM P
• Dependency graph over R, P and Q: P and Q are
mutually recursive
Is there a Fix Point?
• Recall that the result of a defined relation is
derived by simply evaluating it again and
again until it no longer changes.
• However, what happens if this process never
terminates?
Example
WITH
RECURSIVE P(x) AS
(SELECT * FROM R)
EXCEPT
(SELECT * FROM Q),
RECURSIVE Q(x) AS
(SELECT * FROM R)
EXCEPT
(SELECT * FROM P)
SELECT * FROM P
• What is in P and Q if R has the single tuple (0)?
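One way to explore the question is to simulate the repeated evaluation with Python sets. This sketch assumes that both definitions are re-evaluated simultaneously from the previous round's values, starting from empty P and Q; the helper name `evaluate_once` is illustrative.

```python
R = {0}  # R holds the single tuple (0)

def evaluate_once(p, q):
    """Re-evaluate both definitions from the previous values:
    P = R EXCEPT Q and Q = R EXCEPT P."""
    return R - q, R - p

p, q = set(), set()          # before evaluation, P and Q are empty
states = [(p, q)]
for _ in range(4):           # bounded loop: the values never settle
    p, q = evaluate_once(p, q)
    states.append((p, q))

for round_number, (p_value, q_value) in enumerate(states):
    print(round_number, p_value, q_value)

# Two different assignments both satisfy the definitions.
candidates = [({0}, set()), (set(), {0})]
both_are_fix_points = all(evaluate_once(cp, cq) == (cp, cq) for cp, cq in candidates)
```

Under this evaluation order P and Q flip between empty and {0} forever. The definitions also have two distinct fix points, (P = {0}, Q = empty) and (P = empty, Q = {0}), so even a terminating strategy would have no single well-defined answer.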
Monotonicity Requirement
• R can be defined using a mutually recursive
relation S only if R is monotone in S, i.e.,
– Adding an arbitrary tuple to S might add tuples
to R, or might leave R unchanged, but can never
cause a tuple to be deleted from R
Monotonicity Requirement
• In the previous example, P uses the
mutually recursive relation Q, but P is not
monotonic in Q (adding tuples to Q can
cause tuples to be removed from P)
• Therefore, this type of recursion is not
allowed in SQL
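The difference between a monotone and a non-monotone definition can be seen on a single input, as in the sketch below. It checks one instance only and is not a proof. The functions `define_p` and `reaches_step` are illustrative names that model "R EXCEPT Q" and the Reaches definition over Python sets.

```python
R = {0}

def define_p(q):
    """A definition of the form R EXCEPT Q."""
    return R - q

def reaches_step(flights, reaches):
    """The Reaches definition: Flights UNION (Reaches joined with Reaches)."""
    joined = {
        (frm1, to2)
        for (frm1, to1) in reaches
        for (frm2, to2) in reaches
        if to1 == frm2
    }
    return flights | joined

# EXCEPT: adding the tuple (0) to Q removes it from the result.
p_before = define_p(set())
p_after = define_p({0})
p_lost_tuples = p_before - p_after

# Reaches: adding a tuple to Reaches only adds tuples to the result.
flights = {("Tel Aviv", "New York"), ("New York", "Los Angeles")}
small = set(flights)
larger = small | {("Los Angeles", "Toronto")}
reaches_lost_tuples = reaches_step(flights, small) - reaches_step(flights, larger)

print(p_lost_tuples, reaches_lost_tuples)
```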
Translating RA to SQL
RA is strictly less expressive
than SQL
• Every query in relational algebra can be
equivalently written in SQL
• There are SQL queries that cannot be
expressed using relational algebra
– Examples?
• We now present a procedure for translating
RA to SQL
– Note: This is not the most efficient translation
Assumptions
• To make the presentation simpler, assume
we are translating a relational algebra
expression E into SQL where:
– E does not use the same relation twice
– No two relations have any attributes with the
same names
• Easy to overcome these assumptions using
RA renaming and SQL aliasing.
Translation By Induction on
Structure of E
• Induction on the number of relational algebra
operators appearing in E.
• Base case: 0 operators.
– E is simply a relation R
SELECT DISTINCT *
FROM R
Translation By Induction on
Structure of E
• Induction Step: Assume that for all E with
fewer than k operators, there is an SQL
expression equivalent to E. We show this for
E with k operators
• Must consider several cases, depending on
the "last operator" used in E:
σ, π, ∪, ×, -
Last Operator is σ
• E = σ_C(E1), where C is a Boolean condition
• Let S be an SQL expression equivalent to E1
(there is one by the induction hypothesis)
• E is equivalent to
SELECT DISTINCT *
FROM S
WHERE C
• Note: S is a sub-query in the FROM clause!
Last Operator is π
• E = π_{A1,…,Ak}(E1), where A1,…,Ak are attributes
• Let S be an SQL expression equivalent to E1 (there
is one by the induction hypothesis)
• E is equivalent to
SELECT DISTINCT A1,…,Ak
FROM S
• Note: S is a sub-query in the FROM clause!
Last Operator is ∪
• E = E1 ∪ E2
• Let S1,S2 be SQL expressions equivalent to E1 and
E2
• E is equivalent to
S1
UNION
S2
Last Operator is ×
• E = E1 × E2
• Let S1,S2 be SQL expressions equivalent to E1 and
E2
• E is equivalent to
SELECT *
FROM S1, S2
• Note: S1 and S2 are sub-queries in the FROM clause!
Last Operator is -
• E = E1 - E2
• Let S1,S2 be SQL expressions equivalent to E1 and
E2
• E is equivalent to
S1
EXCEPT
S2
Example
• Translate
– π_{sname,color}(σ_{rating<10}(Sailors × Boats))
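The inductive translation can be sketched as a recursive function over an expression tree, with one case per last operator. Everything about the encoding is an assumption made for illustration: the tuple representation of RA expressions, the function name `to_sql`, the Sailors and Boats schemas, and the sample rows. Sub-queries are wrapped in parentheses in the FROM clause, and the operands of UNION and EXCEPT are wrapped as well so that nested set operations keep their grouping. The last operator between Sailors and Boats is taken to be a Cartesian product.

```python
import sqlite3

def to_sql(e):
    """Translate an RA expression tree to SQL, one case per last operator."""
    op = e[0]
    if op == "rel":        # base case: E is simply a relation R
        return f"SELECT DISTINCT * FROM {e[1]}"
    if op == "select":     # E = sigma_C(E1)
        _, condition, e1 = e
        return f"SELECT DISTINCT * FROM ({to_sql(e1)}) WHERE {condition}"
    if op == "project":    # E = pi_{A1,...,Ak}(E1)
        _, attributes, e1 = e
        return f"SELECT DISTINCT {', '.join(attributes)} FROM ({to_sql(e1)})"
    if op == "product":    # E = E1 x E2
        _, e1, e2 = e
        return f"SELECT * FROM ({to_sql(e1)}), ({to_sql(e2)})"
    if op in ("union", "minus"):   # E = E1 union E2, or E = E1 - E2
        _, e1, e2 = e
        keyword = "UNION" if op == "union" else "EXCEPT"
        return (f"SELECT * FROM ({to_sql(e1)}) {keyword} "
                f"SELECT * FROM ({to_sql(e2)})")
    raise ValueError(f"unknown operator: {op}")

# pi_{sname,color}(sigma_{rating<10}(Sailors x Boats))
expression = ("project", ["sname", "color"],
              ("select", "rating < 10",
               ("product", ("rel", "Sailors"), ("rel", "Boats"))))
sql = to_sql(expression)
print(sql)

# Hypothetical schemas and rows, only to check that the generated SQL runs.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Sailors (sid INTEGER, sname TEXT, rating INTEGER, age INTEGER)")
conn.execute("CREATE TABLE Boats (bid INTEGER, bname TEXT, color TEXT)")
conn.executemany("INSERT INTO Sailors VALUES (?, ?, ?, ?)",
                 [(1, "Dana", 7, 30), (2, "Yossi", 10, 40)])
conn.executemany("INSERT INTO Boats VALUES (?, ?, ?)",
                 [(101, "Interlake", "red"), (102, "Clipper", "green")])
rows = set(conn.execute(sql))
print(sorted(rows))
```

The generated SQL nests one sub-query per operator, which matches the remark that this is not the most efficient translation.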