Using Subqueries in PostgreSQL
An Overview on Subqueries and How to Use Them Effectively
A Subquery is a powerful tool that can be helpful when creating complex queries in PostgreSQL. What are subqueries? And how can we use them efficiently? This blog post will attempt to answer these questions.
I will be using the Microsoft Northwind database for all the examples.
What is a Subquery?
A subquery is a query contained within another SQL query. The subquery or “inner query” is used to return values that will be used in the main query as a condition. A subquery is always enclosed within parentheses. These subqueries can reside in the SELECT, FROM, and WHERE clause.
Why Should You Use a Subquery?
- Subqueries can divide a complex query into isolated logical parts
- It is easier to understand and code maintenance is also easier.
- Subqueries allow you to use the results of another query in the outer query.
- In some cases, subqueries can replace complex joins and unions.
When Should You Use a Subquery?
Ah, the golden question. Unfortunately, the answer is not as definitive as I would like it to be. It all depends on the situation. There will be situations where a subquery would be best practice and other times where a JOIN would be better. It’s like that old saying, “ Just because you can, doesn’t mean you should”. The goal is to break down a difficult query into something more manageable, not to increase its complexity.
Different Types of Subqueries
The inner part of the nested query can return a single value, column, or a derived table. Also, the inner can be independent of its outer query, or they can be correlated.
This post won’t dive any deeper into the topic of independent and correlated subqueries. I will be making another post that will cover that topic in the future. All subqueries used in this post are independent subqueries.
What Do Subqueries Return?
The inner query returns one value i.e. 1 row and 1 column.
Here is the full query where we want to find the orderid, productid, and unitprice where the unitprice is equal to the highest-priced item.
Let’s just run the inner query by itself and see what we get.
The inner query returns one column and multiple rows (>1).
If your subquery returns more than one row, you will not be able to use it in a comparison condition. However, there are four additional operators that you can use to build conditions with these types of subqueries. The operators are IN, ANY, ALL, and EXISTS. I will be using an IN operator for the example below.
In this query, we want to find the orderid, customerid of our customers that were serviced by our employees in the city of Seattle and Tacoma.
The inner query returns multiple columns and multiple rows (>1).
Here we have another query where we want to find the productid, productname, supplierid, and unitsinstock of products that have a discount rate higher than 15 % and come from Tokyo Traders company.
Subqueries can be nested in the SELECT, FROM, and WHERE clauses of a query. It can also be nested in the HAVING, INSERT, UPDATE, DO and SET clauses but these won’t be covered in this blog post.
Subquery in the SELECT clause
Subquery in the SELECT clause can only return a scalar value. The subquery should also have an alias.
Subquery in the FROM clause
Subquery in the FROM clause can return multiple rows and columns. The results returned by such a subquery is referred to as a derived table. A derived table is useful when you want to work with a subset of data from one or more tables without needing to create a view or temporary table. The subquery must have an alias.
Subquery in the WHERE clause
Subquery in the WHERE clause can return a single value or multiple values. Used to compare values or a list of values.
- You must enclose a subquery in parenthesis.
- A subquery must include a SELECT clause and a FROM clause.
- A subquery can include optional WHERE, GROUP BY, and HAVING clauses.
- A subquery can be used in almost any SQL clause.
- If a JOIN seems like a better choice, go for it. JOIN holds certain advantages over subqueries. For example, JOIN executes faster than subqueries and the retrieval time of the query using joins almost always will be faster than that of a subquery.