Week 02 Notes 02 – more SQL

Here is a script to initialize data in the Student, Course, Schedule database from the last notes.

Aggregate functions

Can Select particular fields from particular records, then perform functions to aggregate these results.

Aggregate functions operate on a group of data and produce a single result, e.g., Min, max, sum, count.

SELECT Sum(credits) AS TotalCredits
FROM Course
WHERE name like '%CS%';

Can Perform aggregates on groups of records

Use Group by to group records containing the same value for a field, then perform an aggregate function on each of the groups. 

Notice:

·       the query produces only one record per group when you have an aggregate function in your SELECT clause,

·       without a GROUP BY clause, there is only one group

SELECT crn, Count(crn) AS numberEnrolled
FROM Schedule
WHERE crn<2
GROUP BY crn;

Duplicates in query results

Query results may produce a table containing duplicate rows.

SELECT crn
FROM Schedule
WHERE crn = 1;

 

Removing dups

Keyword Distinct or DistinctRow may follow Select to remove duplicates from the resultant table.

SELECT DISTINCT crn
FROM Schedule
WHERE crn = 1;

Enumerations in where clause

List several values a field might match.

NOTICE: an enumeration has two parts: the <fieldName> whose value we check against the list of values and the parenthesized list of acceptable values.

 

SELECT <fields>

FROM <tables>

WHERE <fieldName> IN (<value1>,<value2>,…)

SELECT *
FROM Schedule
WHERE id IN (1, 2);

Subqueries

2 selects in one statement.  Select data where the enumeration values for an ‘in’ clause are provided by doing another select.

These queries make use of Foreign Keys.  The inner query finds FK values in one table.  The outer query selects fields from records where these values are Primary Key values.

 

SELECT <fields>

FROM <tables>

WHERE

<fieldName1> IN

(SELECT <fieldName2>

From TABLE

WHERE <conditions causing the inner subquery to select the right records>

 

Example: Find all schedule records where the crn is for a 200 level course:

 

SELECT *

FROM Schedule

WHERE crn in

    (Select crn

    from Course

    Where name like '%2_ _');

 

 

Another example using a logical operator and a subquery:

Find all courses that are not being taken by any students:

 

-- Find all course data for courses not being taken

Select *

from Course

where not crn in

          -- find crn for all courses being taken

          (select distinct crn

          from Schedule

          )

 

?How about find all students not taking a course?

Having clause

Having is a keyword that is used after group by to indicate which groups to qualify to be part of the result.  A Having clause is like a where clause for group by.

 

Syntax:

Select …

From …

Where …

Group by <fieldname>

Having <booleanCondition>

(the booleanCondition may contain aggregate functions, and the aggregate functions apply to each group)

 

Example:

Find states that have exactly one supplier.

 

See the ERD for lab 2.

State is in the city table, so I must work with a set of cities.

Supplier holds a foreign key to city, and my ‘base data’ is a list of suppliers.

 

Starting from the inside out:

Start with the list of suppliers, use this list to get a list of cities,

select cityID from Supplier  -- there are 15 suppliers:

7926

7935

7949

7955

7972

7979

9198

9208

10740

10830

10862

10893

10896

11425

11694

 

use this list to get a list of states.

select state from City where CityID in

(select cityID from Supplier)

KY

KY

KY

KY

KY

KY

NE

NE

SC

TN

TN

TN

TN

WA

WY

 

 

Notice the result of this query has multiple rows showing the same state

Make a group for each state.

select state from City where CityID in

(select cityID from Supplier)

Group by state

KY

NE

SC

TN

WA

WY

 

Note: each state shown above represents a set of rows from the previous query.

Count how many records are in each group:

 

select count(state) from City where CityID in

(select cityID from Supplier)

Group by state

6

2

1

4

1

1

 

Limit the states to those that have only one supplier – use having clause.

select state, count(state) from City where CityID in

(select cityID from Supplier)

Group by state

having count(state) = 1

SC, 1

WA, 1

WY, 1

 

These are the states we want, but we don’t need the ‘1’ along with each row.

select state from City where CityID in

(select cityID from Supplier)

Group by state

having count(state) = 1

SC

WA

WY

 

Putting it all together

Develop subqueries ‘inside out’, i.e., do the simplest query first, put this query in parentheses, then build outward using the results of this simple query using FK values to transition from table to table.

 

Example:

Find students taking only 1 course where that course is a CS course.

 

Steps: 

Get student ids for all students who are taking some course

Select id

From Schedule

 

Use this list to get student ids for students taking only one course:

                    (Select id

                    From Schedule

                    Group by id

                    Having Count(*) = 1

                    )

 

Get crns for all CS courses:

                              (Select crn

                              From Course

                              Where name like '%CS%'

                              )

 

Get ids for students taking one course where that course is a CS course

          (Select id

          From Schedule

          Where id in

                    -- get students taking only one course

                    (Select id

                    From Schedule

                    Group by id

                    Having Count(*) = 1

                    )

          and

                    -- course is a CS course

                    (crn in

                              (Select crn

                              From Course

                              Where name like '%CS%'

                              )

                    )

          )

 

Get all data for these student ids

Select *

From Student

Where id in

          -- get students whose course is a CS course

          (Select id

          From Schedule

          Where id in

                    -- get students taking only one course

                    (Select id

                    From Schedule

                    Group by id

                    Having Count(*) = 1

                    )

          and

                    -- course is a CS course

                    (crn in

                              (Select crn

                              From Course

                              Where name like 'CS%'

                              )

                    )

          )

Another approach:

-- Find students taking only 1 course where that course is a CS course.

select id, crn

from Schedule

where id in (

-- find all students taking 1 CS course

select id

from Schedule

where crn in (

          -- find crns of CS courses

select crn from Course

          where name like 'cs%'

)

group by id

having count(*)=1)

group by id

having count(*)=1

 

A Harder Example:

See http://cs.sou.edu/~nordquip/cs360/demos/maxEnrollmentQueriesDemo.sql