Syntax Cache
BlogMethodFeaturesHow It WorksBuild a Game
  1. Home
  2. SQL
  3. SQL Subqueries Practice
SQL15 exercises

SQL Subqueries Practice

Master SQL subqueries: IN vs EXISTS performance, NOT IN + NULL pitfall, correlated scalar rewrites, LATERAL for "top-1 per row", and ANY/ALL operators.

Common ErrorsQuick ReferencePractice
Warm-up1 / 2

Can you write this from memory?

Select all products where the price is greater than the average price of all products.

On this page
  1. 1The NOT IN + NULL trap
  2. 2IN vs EXISTS: when does it matter?
  3. The short answer
  4. The detailed answer
  5. The equivalences
  6. 3Correlated subqueries: the performance trap
  7. Rewrite as JOIN + GROUP BY
  8. When correlated subqueries are fine
  9. 4LATERAL: correlated subqueries in FROM
  10. The problem LATERAL solves
  11. The LATERAL solution
  12. LATERAL use cases
  13. 5Derived tables: subqueries in FROM
  14. When to use derived tables
  15. Derived tables vs CTEs
  16. 6ANY and ALL: the operators behind IN
  17. ALL for "greater than every value"
  18. ANY for "matches at least one"
  19. 7Debugging checklist: "Why is my subquery wrong?"
  20. 8References
The NOT IN + NULL trapIN vs EXISTS: when does it matter?Correlated subqueries: the performance trapLATERAL: correlated subqueries in FROMDerived tables: subqueries in FROMANY and ALL: the operators behind INDebugging checklist: "Why is my subquery wrong?"References

A common SQL subquery bug: you write NOT IN (subquery) to find "customers who never ordered," but the subquery returns a NULL—so your query returns zero rows. No error, no warning, just silently wrong results.

SQL uses three-valued logic. NOT IN compares with <> ALL, and x <> NULL is UNKNOWN, making the entire expression UNKNOWN. Use NOT EXISTS for anti-joins unless you can guarantee no NULLs.

These exercises drill subquery patterns until you know when IN beats EXISTS, when to rewrite a correlated subquery as a JOIN, and when LATERAL is the right tool.

Related SQL Topics
SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULLSQL JoinsSQL CTEsSQL Window FunctionsSQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG

If the subquery returns any NULL, NOT IN returns zero rows—silently. Use NOT EXISTS for anti-joins unless you can guarantee no NULLs.

This is the most common subquery bug in production:

-- Suppose orders.customer_id contains NULL values
SELECT * FROM customers
WHERE id NOT IN (SELECT customer_id FROM orders);
-- Returns 0 rows! Even for customers who never ordered.

Why? NOT IN (1, 2, NULL) evaluates as id <> 1 AND id <> 2 AND id <> NULL. The id <> NULL yields UNKNOWN, making the entire expression UNKNOWN. WHERE only includes rows where the condition is TRUE.

Solution: Use NOT EXISTS

SELECT * FROM customers c
WHERE NOT EXISTS (
  SELECT 1 FROM orders o WHERE o.customer_id = c.id
);

NOT EXISTS returns TRUE when the subquery finds no rows—NULL values in the subquery don't affect the result. You can also express this as a LEFT JOIN anti-pattern.

See our filtering guide for a comprehensive explanation of three-valued logic and NULL handling. For a quick-reference card of subquery syntax, see the SQL subqueries cheat sheet.


Ready to practice?

Start practicing SQL Subqueries with spaced repetition

Modern optimizers often generate the same plan for both. Use IN for simple set membership, EXISTS for correlated existence checks, and always NOT EXISTS over NOT IN.

The short answer

Modern query optimizers often transform IN and EXISTS into the same execution plan (semijoin). Choose based on clarity:

Use CaseRecommendation
Filter by a list of valuesIN
Check if related rows existEXISTS
Anti-join (no match)NOT EXISTS (NULL-safe)
Large subquery resultEXISTS (can short-circuit)

The detailed answer

-- These are often optimized identically:

-- IN version
SELECT * FROM customers
WHERE id IN (SELECT customer_id FROM orders WHERE total > 1000);

-- EXISTS version
SELECT * FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_id = c.id AND o.total > 1000
);

When EXISTS is better:

  • Anti-joins: NOT EXISTS is NULL-safe, NOT IN is not
  • Early termination: EXISTS can stop at first match
  • Correlated conditions: EXISTS naturally expresses "for this row, check if..."

When IN is better:

  • Simple set membership: WHERE status IN ('active', 'pending')
  • Readability for non-correlated checks

The equivalences

PostgreSQL documents these equivalences:

  • x IN (subquery) = x = ANY (subquery)
  • x NOT IN (subquery) = x <> ALL (subquery)

Understanding this helps explain the NULL trap: <> ALL means "not equal to every value," and comparing to NULL yields UNKNOWN.


Correlated subqueries can run once per row (nested-loop style). Rewrite as JOIN + GROUP BY when possible—it's usually faster and clearer.

A correlated subquery references the outer query, potentially executing once per row:

-- This can be slow: subquery may run for each customer
SELECT c.id, c.name,
  (SELECT COUNT(*) FROM orders o WHERE o.customer_id = c.id) AS order_count
FROM customers c;

Rewrite as JOIN + GROUP BY

-- Usually faster: single pass through orders
SELECT c.id, c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;

Why LEFT JOIN? To keep customers with zero orders (they'll show NULL for o.id).

Why COUNT(o.id) not COUNT(*)? COUNT(*) counts NULL rows (returns 1 for customers with no orders), COUNT(o.id) correctly returns 0.

When correlated subqueries are fine

  • Small outer result set
  • Subquery is indexed and fast
  • You need a scalar value that's awkward to express as a join
  • The optimizer rewrites it anyway (check EXPLAIN)

LATERAL is a powerful PostgreSQL feature (also in MySQL 8.0.14+, SQL Server as CROSS/OUTER APPLY).

The problem LATERAL solves

"For each customer, get their most recent order."

Without LATERAL, you need window functions:

SELECT c.id, c.name, o.id AS order_id, o.created_at
FROM customers c
LEFT JOIN (
  SELECT *,
    ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
  FROM orders
) o ON o.customer_id = c.id AND o.rn = 1;

The LATERAL solution

SELECT c.id, c.name, last_order.id AS order_id, last_order.created_at
FROM customers c
LEFT JOIN LATERAL (
  SELECT o.*
  FROM orders o
  WHERE o.customer_id = c.id  -- References c from outer query!
  ORDER BY o.created_at DESC
  LIMIT 1
) last_order ON true;

How it works: The LATERAL keyword allows the subquery to reference columns from preceding FROM items. The subquery runs for each row of customers.

LEFT JOIN LATERAL ... ON true: The ON true is required syntax but always matches—the filtering happens inside the LATERAL subquery.

LATERAL use cases

  • Top-N per group (most recent order per customer)
  • Expand arrays/JSON into rows
  • Complex per-row calculations with multiple columns
  • Calling set-returning functions with per-row arguments

A subquery in FROM creates a derived table (also called an inline view):

SELECT country, avg_spend
FROM (
  SELECT country, AVG(total_spend) AS avg_spend
  FROM customers
  GROUP BY country
) t
WHERE avg_spend > 1000;

When to use derived tables

  • Filter aggregated results (alternative to HAVING)
  • Pre-aggregate before joining (prevent row explosion)
  • Compute intermediate results for further processing

Derived tables vs CTEs

-- Derived table
SELECT * FROM (SELECT ... ) t WHERE ...;

-- CTE (equivalent)
WITH t AS (SELECT ...)
SELECT * FROM t WHERE ...;

CTEs are better when:

  • You need to reference the result multiple times
  • The query is complex and benefits from named steps
  • You need recursion (WITH RECURSIVE)

Derived tables are fine when:

  • Single use, simple transformation
  • You want to keep related logic together

Most developers never use ANY/ALL directly, but understanding them clarifies IN/NOT IN behavior:

-- These are equivalent:
WHERE id IN (1, 2, 3)
WHERE id = ANY (ARRAY[1, 2, 3])

-- These are equivalent:
WHERE id NOT IN (1, 2, 3)
WHERE id <> ALL (ARRAY[1, 2, 3])

ALL for "greater than every value"

-- Products more expensive than ALL clearance items
SELECT * FROM products
WHERE price > ALL (
  SELECT price FROM products WHERE category = 'clearance'
);

-- Equivalent to:
WHERE price > (SELECT MAX(price) FROM products WHERE category = 'clearance')

ANY for "matches at least one"

-- Products cheaper than ANY premium item
SELECT * FROM products
WHERE price < ANY (
  SELECT price FROM products WHERE category = 'premium'
);

-- Equivalent to:
WHERE price < (SELECT MAX(price) FROM products WHERE category = 'premium')

  1. Zero rows from NOT IN? Check if the subquery returns any NULL values. Use NOT EXISTS instead.

  2. Error: "subquery returns more than one row"? You used a scalar subquery where multiple rows match. Add LIMIT 1 or use IN/EXISTS instead.

  3. Slow query? Check if a correlated subquery is running per-row. Consider rewriting as JOIN + GROUP BY.

  4. Wrong counts? Scalar subquery counting may differ from JOIN + COUNT due to NULL handling. Use COUNT(column) not COUNT(*) with LEFT JOIN.

  5. LATERAL not working? Check database support: PostgreSQL has LATERAL, MySQL 8.0.14+ has LATERAL, SQL Server uses CROSS APPLY / OUTER APPLY.


  • PostgreSQL: Subquery Expressions
  • PostgreSQL: LATERAL Subqueries
  • PostgreSQL: Row and Array Comparisons (ANY/ALL)
  • MySQL: Optimizing Subqueries with Semijoin
  • MySQL: LATERAL Derived Tables
  • SQL Server: APPLY (CROSS APPLY / OUTER APPLY)

When to Use SQL Subqueries

  • Filter by a derived set of values (IN subquery).
  • Check existence without returning data (EXISTS).
  • Compute a single value per row (scalar subquery).
  • Create a derived table in FROM for intermediate calculations.
  • Access correlated data with LATERAL (top-N per group, row-level lookups).
  • Compare against all/any values in a set (ANY/ALL operators).

Check Your Understanding: SQL Subqueries

Prompt

List customers who have never placed an order.

What a strong answer looks like

Use NOT EXISTS with a correlated subquery. NOT EXISTS is NULL-safe and clearly expresses "there is no matching order." ```sql -- Recommended: NOT EXISTS (NULL-safe, clear intent) SELECT c.* FROM customers c WHERE NOT EXISTS ( SELECT 1 FROM orders o WHERE o.customer_id = c.id ); ``` Alternative: LEFT JOIN anti-pattern (same result, sometimes preferred for readability): ```sql SELECT c.* FROM customers c LEFT JOIN orders o ON o.customer_id = c.id WHERE o.customer_id IS NULL; ``` **Avoid NOT IN** unless you can guarantee the subquery never returns NULL—otherwise the query may return zero rows silently.

What You'll Practice: SQL Subqueries

Scalar subqueries (single value)IN subqueries (set membership)EXISTS and NOT EXISTS (existence checks)Correlated subqueries (per-row evaluation)Derived tables (subqueries in FROM)LATERAL subqueries (FROM with correlation)ANY/ALL operators (set comparisons)Subqueries vs JOINs (when to rewrite)Subqueries vs CTEs (readability vs performance)

Common SQL Subqueries Pitfalls

  • NOT IN + NULL returns no rows—use NOT EXISTS instead
  • Correlated subqueries may run per row (check execution plans)
  • Scalar subquery returning multiple rows causes an error
  • IN/NOT IN can return NULL (unknown), not just true/false
  • Multiple scalar subqueries on same table—rewrite as single JOIN
  • Assuming subquery evaluation order (side effects are unpredictable)

SQL Subqueries FAQ

Why does NOT IN (subquery) return no rows?

If the subquery returns any NULL values, NOT IN returns UNKNOWN for every row (due to SQL's three-valued logic), so nothing matches. Use NOT EXISTS instead—it's NULL-safe. See our [filtering guide](/sql/filtering) for a detailed explanation of the NOT IN + NULL trap.

Is IN faster than EXISTS?

Often equivalent. Modern optimizers transform both into semijoin/antijoin plans. Performance depends on indexes and cardinality, not the keyword. Use EXISTS when you only care about existence (it can stop at first match) and IN when checking set membership. For NOT IN vs NOT EXISTS, prefer NOT EXISTS to avoid NULL issues.

What does SELECT 1 mean in EXISTS?

EXISTS only checks if any row exists—it ignores what you select. SELECT 1, SELECT *, and SELECT column all behave identically inside EXISTS. SELECT 1 is conventional because it signals 'I only care about existence, not the data.'

Will the database evaluate the whole subquery?

Not necessarily. EXISTS can stop after finding the first matching row. IN with a subquery may be executed once and cached, or transformed into a semijoin. PostgreSQL docs warn against relying on full evaluation order—side effects in subqueries are unpredictable.

When should I use a subquery in FROM vs a CTE?

CTEs (WITH clauses) are more readable for complex queries and can be referenced multiple times. Subqueries in FROM are fine for simple derived tables. Some databases materialize CTEs differently—PostgreSQL 12+ inlines most CTEs like subqueries unless you use MATERIALIZED. See our [CTE guide](/sql/ctes) for details.

What's LATERAL and when do I need it?

LATERAL lets a subquery in FROM reference columns from preceding tables—like a correlated subquery but in the FROM clause. Use it for 'top-1 per row' queries: for each customer, get their most recent order. Without LATERAL, you'd need window functions or correlated scalar subqueries.

What are ANY and ALL in SQL?

ANY/ALL compare a value against all values from a subquery. `= ANY(subquery)` is equivalent to IN. `<> ALL(subquery)` is equivalent to NOT IN (with the same NULL pitfall). `> ALL(subquery)` means greater than every value. These are rarely used directly but help understand how IN/NOT IN work internally.

Why is my correlated subquery slow?

Correlated subqueries can execute once per row (nested-loop style). Rewrite as a JOIN with aggregation when possible: instead of SELECT (SELECT COUNT(*) FROM orders WHERE customer_id = c.id), use LEFT JOIN orders GROUP BY customer. The optimizer may rewrite it anyway, but explicit joins are often clearer and faster.

Scalar subquery vs JOIN—which should I use?

Scalar subqueries return one value per row. For simple lookups, they're readable. For aggregations across many rows, JOIN + GROUP BY is usually faster and clearer. If you're doing multiple scalar subqueries on the same table, definitely rewrite as a single JOIN.

SQL Subqueries Syntax Quick Reference

IN subquery (filter by derived set)
SELECT * FROM orders
WHERE customer_id IN (
  SELECT id FROM customers WHERE country = 'USA'
);
EXISTS (check for matching rows)
-- Customers who have placed at least one order
SELECT * FROM customers c
WHERE EXISTS (
  SELECT 1 FROM orders WHERE customer_id = c.id
);
NOT EXISTS (anti-join, NULL-safe)
-- Customers who have never ordered
SELECT c.*
FROM customers c
WHERE NOT EXISTS (
  SELECT 1 FROM orders o
  WHERE o.customer_id = c.id
);
Scalar subquery (single value per row)
-- Order count per customer (can be slow for large tables)
SELECT c.id, c.name,
  (SELECT COUNT(*) FROM orders o
   WHERE o.customer_id = c.id) AS order_count
FROM customers c;
Rewrite scalar as JOIN + GROUP BY
-- Same result, usually faster for large tables
SELECT c.id, c.name, COUNT(o.id) AS order_count
FROM customers c
LEFT JOIN orders o ON o.customer_id = c.id
GROUP BY c.id, c.name;
Derived table (subquery in FROM)
-- Filter aggregated results without HAVING
SELECT country, avg_spend
FROM (
  SELECT country, AVG(total_spend) AS avg_spend
  FROM customers
  GROUP BY country
) t
WHERE avg_spend > 1000;
ANY (equivalent to IN)
-- = ANY is equivalent to IN
SELECT * FROM products
WHERE category = ANY (
  SELECT category FROM featured_categories
);
ALL (compare to every value)
-- Price greater than ALL clearance items
SELECT * FROM products p
WHERE p.price > ALL (
  SELECT price FROM products
  WHERE category = 'clearance'
);
LATERAL (correlated FROM subquery)
-- Most recent order per customer (PostgreSQL)
SELECT c.id, c.name, last_order.id AS order_id, last_order.created_at
FROM customers c
LEFT JOIN LATERAL (
  SELECT o.*
  FROM orders o
  WHERE o.customer_id = c.id
  ORDER BY o.created_at DESC
  LIMIT 1
) last_order ON true;
LATERAL vs window function
-- Alternative with window function (no LATERAL needed)
SELECT c.id, c.name, o.id AS order_id, o.created_at
FROM customers c
LEFT JOIN (
  SELECT *, ROW_NUMBER() OVER (
    PARTITION BY customer_id ORDER BY created_at DESC
  ) AS rn
  FROM orders
) o ON o.customer_id = c.id AND o.rn = 1;

SQL Subqueries Sample Exercises

Example 1Difficulty: 2/5

Write a query that selects all columns from the products table, plus a column showing the average price of all products (name it avg_price).

SELECT *, (SELECT AVG(price) FROM products) AS avg_price FROM products
Example 2Difficulty: 2/5

Fill in the blank to find products priced above average

AVG(price)
Example 3Difficulty: 2/5

If the orders table has orders totaling 100, 200, and 300, what does this subquery return?

300

+ 12 more exercises

Quick Reference
SQL Subqueries Cheat Sheet →

Copy-ready syntax examples for quick lookup

Start practicing SQL Subqueries

Free daily exercises with spaced repetition. No credit card required.

← Back to SQL Syntax Practice
Syntax Cache

Build syntax muscle memory with spaced repetition.

Product

  • Pricing
  • Our Method
  • Daily Practice
  • Design Patterns
  • Interview Prep

Resources

  • Blog
  • Compare
  • Cheat Sheets
  • Vibe Coding
  • Muscle Memory

Languages

  • Python
  • JavaScript
  • TypeScript
  • Rust
  • SQL
  • GDScript

Legal

  • Terms
  • Privacy
  • Contact

© 2026 Syntax Cache

Cancel anytime in 2 clicks. Keep access until the end of your billing period.

No refunds for partial billing periods.