- Utilize different ways to filter data in the WHERE clause
- Understand the uses of advanced queries like subqueries and joins
- Be able to normalize a database structure
Not equal - <>
- LIKE - SELECT * FROM students WHERE name LIKE '%';
- DISTINCT - SELECT DISTINCT name FROM students;
- ORDER BY - SELECT * FROM students ORDER BY name DESC;
- COUNT - SELECT count(*) FROM students;
- MAX - SELECT max(age) FROM students;
- MIN - SELECT min(age) FROM students;
- AND - SELECT * from students WHERE name = 'Elie' AND age = 26;
- OR - SELECT * from students WHERE name = 'Elie' OR name ='Mary';
- IN - SELECT * FROM students WHERE name IN ('Bob', 'Tom');
- NOT IN - SELECT * FROM students WHERE name NOT IN ('Bob', 'Tom');
- LIMIT - SELECT * FROM students LIMIT 2;
- OFFSET - SELECT * FROM students OFFSET 1;
- LIMIT + OFFSET - SELECT * FROM students LIMIT 2 OFFSET 1;
- % - SELECT * FROM students WHERE name LIKE '%b';
We use an aggregate function to get the total count of movies in a table.
SELECT COUNT(*) FROM movies;
What about getting the count of something more specific in movies, such as the count of each rating?
SELECT COUNT(rating) FROM movies;
We get the same result. GROUP BY allows you to 'group' the table by a specific attribute, which is then provided to the aggregate function.
SELECT rating, COUNT(rating) FROM movies
GROUP BY rating;
- Produces only the results from both tables that match the join condition.
- Produces all the results from both tables regardless of whether or not there is any row in either table that matches the join condition.
- Produces all results from the left table regardless of whether or not there is a matching row in the right table and only results from the right table that have a matching row from the left table based on the join condition.
- Opposite of a Left Join: Produces all results from the right table regardless of whether or not there is a matching row in the left table and only results from the left table that have a matching row in the right table based on the join condition.
- Produces a cartesian product of both joined tables (all rows in the left table match all rows in the right table, giving NxM results). There is NO join condition for a cross join.
CREATE TABLE authors (
author_id SERIAL PRIMARY KEY,
name VARCHAR(100)
);
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
name VARCHAR(100),
author_id INTEGER
);
INSERT INTO authors (name) VALUES ('Elie'), ('Bob'), ('Mary');
INSERT INTO books (name, author_id) VALUES ('Book 1', 1), ('Book 2', 2);
INSERT INTO books (name) VALUES ('Book 3'), ('Book 4');
SELECT * FROM authors
JOIN books
ON authors.author_id = books.author_id
ORDER BY authors.author_id ASC;
- Replace the 'JOIN' above with a Left/Right/Full/Inner/Cross Join in the query above and see what the results are.
ALTER TABLE books ADD CONSTRAINT author_id
FOREIGN KEY (author_id) REFERENCES authors (author_id)
ON DELETE NO ACTION;
ALTER TABLE books ADD COLUMN year_released INTEGER;
ALTER TABLE books ALTER COLUMN name SET NOT NULL;
DROP TABLE books;
CREATE TABLE books (
book_id SERIAL PRIMARY KEY,
name VARCHAR(100) NOT NULL,
author_id INTEGER REFERENCES authors
);
What if I want to do something very specific, but I need to get groups of results? For example:
- Get titles of movies with the highest ratings.
- Get titles of movies with the lowest ratings.
Let's try it using WHERE
SELECT title FROM movies
WHERE rating = MAX(rating);
That will give us an error, because MAX is an aggregate function and can't be used in WHERE.
Solution: Nested queries.
SELECT MAX(rating) FROM movies;
This will return the maximum rating, which we need to feed into WHERE.
SELECT title FROM movies
WHERE rating = (
SELECT MAX(rating) FROM movies
);
The idea behind normalization is that the data should not be repeated. The rules of normalization are called "Normal Forms". There's technically 6 forms, but the first three are the most important.
- Each column name must be unique.
- Each column value must be a single value.
- Each row must be unique.
- There are no repeating groups.
Additionally:
- Choose a primary key
Reminder:
- A primary key is unique, not null, and unchangeable. A primary key can either be a single column or combination of columns.
Student | Age | Subject |
---|---|---|
Adam | 15 | Biology, Maths |
Alex | 14 | Maths |
Stuart | 17 | Maths |
vs
Student | Age | Subject |
---|---|---|
Adam | 15 | Biology |
Adam | 15 | Maths |
Alex | 14 | Maths |
Stuart | 17 | Maths |
- Table is in 1NF.
- All non-primary-key columns are fully dependent on the primary key.
With our 1NF table from above, if Student is our primary key, Subject does not depend on the Student for its existence. Biology does not require Adam for its existence. In this case, Subject should be moved to a new table.
Student | Age |
---|---|
Adam | 15 |
Alex | 14 |
Stuart | 17 |
And...
Student | Subject |
---|---|
Adam | Biology |
Adam | Maths |
Alex | Maths |
Stuart | Maths |
- Table is in 1NF and 2NF.
- Non-primary-key columns do not depend on other non-primary-key columns.
The number of enrolled students in a course depend on the Subject, not the student.
Student | Subject | Enrolled |
---|---|---|
Adam | Biology | 3 |
Adam | Maths | 5 |
Alex | Maths | 5 |
Stuart | Maths | 5 |
vs
Student | Subject |
---|---|
Adam | Biology |
Adam | Maths |
Alex | Maths |
Stuart | Maths |
And
Subject | Enrolled |
---|---|
Biology | 3 |
Maths | 5 |
Not as important and are more difficult to explain, but basically come as a consequence of thinking logically about your database design. Basically don't repeat data (foreign keys to other tables don't count as that is not the data itself). Think about the relationships between your pieces of data and set up your 1:M/M:1 and M:M relationships with appropriate columns or join tables.
See also: http://www.slideshare.net/kosalgeek/database-normalization-1nf-2nf-3nf-bcnf-4nf-5nf
Aliases are a piece of a SQL query that allows you to temporarily rename a table or column for the current query. This is useful for creating shorthand names for tables when using table prefixes, renaming columns, or differentiating tables when you join the same table more than once in a query (eliminating ambiguity).
SELECT
users.userID AS 'id',
users.username AS 'name'
FROM users;
--
SELECT * FROM authors a
INNER JOIN books b
ON a.author_id = b.author_id
ORDER BY a.author_id ASC;
--
SELECT * FROM crew
LEFT JOIN users photographer
ON crew.fk_photographer = photographer.userID
LEFT JOIN users director
ON crew.fk_director = director.userID
LEFT JOIN users model
ON crew.fk_model = model.userID
ORDER BY crew.crewID ASC;
The CASE statement is used when you want to display different things depending on the data that you've queried from the database. There's two different ways to structure a CASE statement shown below. Note that in the first example you can only compare against single values while in the second example you can use actual expressions for evaluation. Also note that CASE statements require an ELSE statement.
SELECT
CASE users.age
WHEN 0 THEN 'baby'
WHEN 15 THEN 'teen'
ELSE 'adult'
END AS 'age'
FROM users;
--
SELECT
CASE
WHEN users.age < 13 THEN 'preteen'
WHEN users.age < 20 THEN 'teen'
ELSE 'adult'
END AS 'UserAge'
FROM users;
Unions are the compilation of one or more disparate SQL queries that have the same columns. These are helpful when doing data aggregation that requires multiple SQL statements with different sets of joins and where clauses, but that return the same type of data.
Note: Unioned SQL statements MUST have the exact same columns (matching names) in the exact same order.
In the example below, I want all users and the number of photographs taken that they have "been a part of". In my first query I'm selecting photographers. In my second I'm selecting directors. In my third I'm selecting editors. But I want everything to display as just individual users with a 'role' column that I've manually set.
SELECT
users.id,
users.name,
'Photographer' AS 'role',
COUNT(photos.id) AS 'photoCount'
FROM photoShoots
INNER JOIN users
ON photoShoots.fk_photographerUserId = users.id
INNER JOIN photos
ON photoShoots.id = photos.fk_photoShootsId
GROUP BY users.id, users.name
UNION
SELECT
users.id,
users.name,
'Director' AS 'role',
COUNT(photos.id) AS 'photoCount'
FROM photoShoots
INNER JOIN users
ON photoShoots.fk_directorUserId = users.id
INNER JOIN photos
ON photoShoots.id = photos.fk_photoShootsId
GROUP BY users.id, users.name
UNION
SELECT
users.id,
users.name,
'Editor' AS 'role',
COUNT(photos.id) AS 'photoCount'
FROM photoShoots
INNER JOIN users
ON photoShoots.fk_editorUserId = users.id
INNER JOIN photos
ON photoShoots.id = photos.fk_photoShootsId
GROUP BY users.id, users.name;