Can you write this from memory?
Select all orders from the orders table where the status equals 'shipped'.
In SQL, = NULL doesn't work. AND evaluates before OR. LIKE is case-sensitive in some databases but not others. BETWEEN is inclusive on both ends—which causes off-by-one bugs with timestamps.
Get filtering wrong and your JOINs, aggregations, and results are all garbage. Worse, incorrect filters often return plausible-looking results that silently drop or include wrong rows.
These exercises drill the WHERE clause patterns until the correct syntax is automatic.
Before running any filtered query in production, verify:
- Start small — Add
LIMIT 20to see sample results before running on full table - Add predicates one at a time — Watch row counts change to catch unexpected filters
- Watch for NULL — Remember that any comparison with NULL yields UNKNOWN
- Parenthesize mixed AND/OR — Never rely on implicit precedence
- Use half-open ranges for dates — Prefer
>= start AND < endover BETWEEN for timestamps
NULL comparisons are one of the most common SQL mistakes—frequently cited as a top source of incorrect query results by C.J. Date and the SQL standard committee. The three-valued logic behavior catches virtually every developer at least once.
-- WRONG: This never matches NULL values
SELECT * FROM users WHERE status = NULL; -- Always 0 rows
SELECT * FROM users WHERE status <> 'active'; -- Excludes NULLs!
-- RIGHT: Use IS NULL / IS NOT NULL
SELECT * FROM users WHERE status IS NULL;
SELECT * FROM users WHERE status IS NOT NULL;
-- Include NULLs in "not active" results
SELECT * FROM users
WHERE status <> 'active' OR status IS NULL;
Why? In SQL's three-valued logic, NULL = NULL returns UNKNOWN (not TRUE). And WHERE only includes rows where the condition is TRUE. This three-valued logic also causes the infamous NOT IN + NULL trap in subqueries.
IS DISTINCT FROM: null-safe comparison
Standard SQL provides null-safe comparison operators:
-- Normal equality fails with NULLs
SELECT * FROM audit_log WHERE old_value = new_value; -- NULLs excluded
-- NULL-safe: treats NULL as a comparable value
SELECT * FROM audit_log
WHERE old_value IS NOT DISTINCT FROM new_value; -- NULLs match NULLs
-- MySQL equivalent
SELECT * FROM audit_log WHERE old_value <=> new_value;
| Expression | Result |
|---|---|
NULL = NULL | UNKNOWN |
NULL IS NOT DISTINCT FROM NULL | TRUE |
1 IS DISTINCT FROM NULL | TRUE |
1 IS DISTINCT FROM 1 | FALSE |
AND binds tighter than OR. This means:
-- What you wrote:
WHERE role = 'admin' OR role = 'staff' AND status = 'active'
-- What SQL evaluates:
WHERE role = 'admin' OR (role = 'staff' AND status = 'active')
-- This includes ALL admins (even inactive ones) plus active staff
Always use explicit parentheses:
-- Clear intent: only active admins or staff
WHERE (role = 'admin' OR role = 'staff')
AND status = 'active';
-- Or use IN for cleaner syntax
WHERE role IN ('admin', 'staff')
AND status = 'active';
This is a classic gotcha that causes silent data loss:
-- Suppose excluded_ids contains: (1, 2, NULL)
SELECT * FROM users
WHERE id NOT IN (SELECT id FROM excluded_ids);
-- Returns 0 rows!
Why? NOT IN (1, 2, NULL) expands to id <> 1 AND id <> 2 AND id <> NULL. The id <> NULL is always UNKNOWN, making the whole expression UNKNOWN.
Solutions:
-- Option 1: Use NOT EXISTS (recommended)
SELECT * FROM users u
WHERE NOT EXISTS (
SELECT 1 FROM excluded_ids e WHERE e.id = u.id
);
-- Option 2: Filter NULLs from the subquery
SELECT * FROM users
WHERE id NOT IN (
SELECT id FROM excluded_ids WHERE id IS NOT NULL
);
-- Option 3: Use LEFT JOIN anti-pattern
SELECT u.*
FROM users u
LEFT JOIN excluded_ids e ON u.id = e.id
WHERE e.id IS NULL;
Rule of thumb: Prefer NOT EXISTS for anti-joins. It's NULL-safe and often performs better.
BETWEEN includes both endpoints:
-- This includes both Jan 1 AND Dec 31
WHERE date BETWEEN '2024-01-01' AND '2024-12-31'
-- Equivalent to:
WHERE date >= '2024-01-01' AND date <= '2024-12-31'
The timestamp trap
For datetime/timestamp columns, BETWEEN causes off-by-one bugs:
-- WRONG: Misses orders on Dec 31 after midnight
-- '2024-12-31' is implicitly '2024-12-31 00:00:00'
WHERE created_at BETWEEN '2024-01-01' AND '2024-12-31';
-- Orders at '2024-12-31 15:30:00' are excluded!
Use half-open ranges instead:
-- RIGHT: All of 2024 (any time on any day)
WHERE created_at >= '2024-01-01'
AND created_at < '2025-01-01';
-- All of January 2024
WHERE created_at >= '2024-01-01'
AND created_at < '2024-02-01';
-- Last 30 days (timestamp-safe)
WHERE created_at >= NOW() - INTERVAL '30 days'
AND created_at < NOW();
Rule: For timestamps, prefer >= start AND < exclusive_end over BETWEEN.
Basic wildcards
| Wildcard | Meaning |
|---|---|
% | Zero or more characters |
_ | Exactly one character |
-- Ends with @gmail.com
WHERE email LIKE '%@gmail.com';
-- Starts with 'test_' followed by exactly 3 characters
WHERE code LIKE 'test____'; -- Four underscores = 4 chars
-- Contains 'error' anywhere
WHERE message LIKE '%error%';
Case sensitivity varies by database
| Database | LIKE behavior | Case-insensitive option |
|---|---|---|
| PostgreSQL | Case-sensitive | Use ILIKE |
| MySQL | Depends on collation | Use LOWER(col) LIKE LOWER(pattern) or set collation |
| SQLite | Case-insensitive for ASCII | PRAGMA case_sensitive_like = ON for sensitive |
| SQL Server | Depends on collation | Use COLLATE clause |
-- PostgreSQL case-insensitive search
SELECT * FROM products WHERE name ILIKE '%phone%';
-- MySQL (works regardless of collation)
SELECT * FROM products WHERE LOWER(name) LIKE '%phone%';
Escaping wildcards
To match literal % or _ characters:
-- Match files containing literal underscore
SELECT * FROM files
WHERE name LIKE '%\_%' ESCAPE '\';
-- Match values containing literal percent
SELECT * FROM discounts
WHERE code LIKE '%\%%' ESCAPE '\';
Performance note: leading wildcards
-- SLOW: Leading wildcard prevents index use
WHERE name LIKE '%phone%';
-- FAST: Prefix match can use index
WHERE name LIKE 'phone%';
Performance note: functions on columns (SARGability)
Wrapping a column in a function disables index usage:
-- SLOW: Evaluates YEAR() for every row, can't use index
WHERE YEAR(created_at) = 2024;
-- SLOW: Same problem with DATE(), LOWER(), etc.
WHERE DATE(timestamp_col) = '2024-01-15';
WHERE LOWER(email) = 'alice@example.com';
-- FAST: Range comparison uses index
WHERE created_at >= '2024-01-01' AND created_at < '2025-01-01';
-- FAST: For case-insensitive, create a functional index or use ILIKE (Postgres)
WHERE email ILIKE 'alice@example.com';
The term "SARGable" (Search ARGument able) describes predicates that can use indexes. A non-SARGable predicate forces a sequential scan—reading every row in the table instead of using an index. On a table with millions of rows, this can mean the difference between milliseconds and seconds. Keep the column "naked"—no functions wrapping it.
For fast substring search, PostgreSQL supports trigram indexes:
-- Enable pg_trgm extension
CREATE EXTENSION IF NOT EXISTS pg_trgm;
-- Create trigram index
CREATE INDEX idx_products_name_trgm ON products
USING gin (name gin_trgm_ops);
-- Now this is fast (uses index)
SELECT * FROM products WHERE name ILIKE '%phone%';
Basic IN syntax
-- Cleaner than multiple ORs
WHERE status IN ('pending', 'processing', 'shipped');
-- Equivalent to:
WHERE status = 'pending'
OR status = 'processing'
OR status = 'shipped';
IN with subqueries
-- Orders from US customers
SELECT * FROM orders
WHERE customer_id IN (
SELECT id FROM customers WHERE country = 'US'
);
IN vs EXISTS
| Use Case | Recommendation |
|---|---|
| Small, known value list | IN |
| Large subquery result | EXISTS (can short-circuit) |
| Anti-join (NOT IN) | NOT EXISTS (NULL-safe) |
| Correlated condition | EXISTS |
-- EXISTS: stops at first match, often faster for large tables
SELECT * FROM customers c
WHERE EXISTS (
SELECT 1 FROM orders
WHERE customer_id = c.id
AND total > 1000
);
Readable multi-condition filters
SELECT *
FROM products
WHERE category = 'electronics'
AND price BETWEEN 100 AND 500
AND (
brand = 'Apple'
OR brand = 'Samsung'
OR brand = 'Sony'
)
AND stock > 0
AND discontinued_at IS NULL;
Using CASE for complex filters
-- Filter with business rules
SELECT *
FROM orders
WHERE CASE
WHEN user_type = 'premium' THEN total > 0
WHEN user_type = 'trial' THEN total > 50 AND created_at > NOW() - INTERVAL '7 days'
ELSE total > 100
END;
Filtering with multiple NULLable columns
-- Find rows where either field changed (handling NULLs)
SELECT *
FROM audit_log
WHERE old_email IS DISTINCT FROM new_email
OR old_phone IS DISTINCT FROM new_phone;
| Feature | PostgreSQL | MySQL | SQLite |
|---|---|---|---|
| Case-insensitive LIKE | ILIKE | Collation-dependent | Default for ASCII |
| NULL-safe equals | IS NOT DISTINCT FROM | <=> | IS (full support) |
| Regex matching | ~, ~* (case-insensitive) | REGEXP | ❌ (needs extension) |
| Trigram search | pg_trgm extension | Full-text search | ❌ |
| ESCAPE in LIKE | ✅ | ✅ | ✅ |
Filtering is the first step in the SQL pipeline — predicates run before GROUP BY and aggregation, and long before ORDER BY and pagination.
-
Check for NULL comparisons — Are you using
= NULLinstead ofIS NULL? -
Check NOT IN subquery for NULLs — Does the subquery return any NULL values?
-
Check AND/OR precedence — Are mixed conditions parenthesized correctly?
-
Check BETWEEN with timestamps — Are you missing time components at boundaries?
-
Check LIKE case sensitivity — Does your database/collation match expectations?
-
Check for implicit type conversion — Is
'123'being compared to integer123? -
Check empty string vs NULL — They're different:
WHERE col = ''vsWHERE col IS NULL -
Check for functions on columns — Is
WHERE YEAR(date) = 2024killing index usage?
When to Use SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL
- Limit results to relevant rows before joining or aggregating.
- Match patterns with LIKE, ILIKE, or regex functions.
- Filter by ranges with BETWEEN or comparison operators.
- Check for missing data with IS NULL / IS NOT NULL.
- Combine conditions with AND, OR, and NOT (with explicit parentheses).
Check Your Understanding: SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL
Find active users who logged in within the last 30 days.
Use WHERE status = 'active' AND last_login >= NOW() - INTERVAL '30 days' (PostgreSQL). Use single quotes for strings. Note: date arithmetic syntax varies—MySQL uses INTERVAL 30 DAY, SQLite uses datetime('now', '-30 days').
What You'll Practice: SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL
Common SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL Pitfalls
- = NULL does not work—use IS NULL
- AND binds before OR—always parenthesize mixed conditions
- NOT IN + NULL can return zero rows—use NOT EXISTS instead
- BETWEEN is inclusive—use half-open ranges for timestamp precision
- LIKE case sensitivity varies by database/collation
- Unescaped wildcards (% _) match when you meant literal characters
- Empty string is not NULL—they need different checks
- Functions on columns disable indexes—WHERE YEAR(date) = 2024 is slow
SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL FAQ
How do I compare to NULL?
Use IS NULL or IS NOT NULL. The expression x = NULL always yields unknown (not true or false), so it never matches. This is SQL's three-valued logic: TRUE, FALSE, UNKNOWN.
What is IS DISTINCT FROM?
It's a null-safe equality check. x IS DISTINCT FROM y returns true if x and y differ, treating NULL as a normal value. x IS NOT DISTINCT FROM y is the null-safe equals. PostgreSQL supports this directly; MySQL uses NULL-safe operator <=>.
How does AND/OR precedence work?
AND binds tighter than OR. So 'a OR b AND c' evaluates as 'a OR (b AND c)', not '(a OR b) AND c'. Always use explicit parentheses to make your intent clear.
Why does NOT IN sometimes return no rows?
If the subquery/list contains NULL, NOT IN can return no rows at all. This is because 'x NOT IN (1, NULL)' becomes 'x != 1 AND x != NULL', and the NULL comparison yields UNKNOWN, making the whole expression UNKNOWN. Use NOT EXISTS or add AND x IS NOT NULL.
Is LIKE case-sensitive?
It depends on the database and collation. PostgreSQL: LIKE is case-sensitive; use ILIKE for case-insensitive. MySQL: depends on column collation (utf8_general_ci is case-insensitive). SQLite: LIKE is case-insensitive for ASCII by default.
Is BETWEEN inclusive?
Yes, BETWEEN is inclusive on both ends: 'x BETWEEN 1 AND 10' includes both 1 and 10. For timestamps, this can cause off-by-one bugs—use half-open ranges instead.
What are half-open date ranges?
Half-open ranges use >= start AND < end (exclusive end). For 'all of January', use created_at >= '2024-01-01' AND created_at < '2024-02-01'. This avoids inclusive BETWEEN issues with timestamps that have time components.
How do I escape wildcards in LIKE?
Use the ESCAPE clause: WHERE col LIKE '%\_%' ESCAPE '\\' matches literal underscores. The escape character makes the next character literal instead of a wildcard.
When should I use IN vs EXISTS?
Use IN for small, known value lists. Use EXISTS for existence checks with correlated subqueries—it can short-circuit and stop at the first match. For anti-joins (NOT IN/NOT EXISTS), prefer NOT EXISTS to avoid the NULL pitfall.
How do I filter for rows where a column matches multiple patterns?
Chain conditions with OR: WHERE col LIKE '%foo%' OR col LIKE '%bar%'. PostgreSQL also supports regex: WHERE col ~ 'foo|bar'. For large pattern lists, consider a join to a patterns table or full-text search.
Why does WHERE YEAR(date) = 2024 run slowly?
Wrapping a column in a function (YEAR, DATE, LOWER, etc.) prevents the database from using indexes—it must evaluate the function for every row. Use range comparisons instead: WHERE date >= '2024-01-01' AND date < '2025-01-01'.
Why can't I use COUNT(*) or SUM() in WHERE?
WHERE filters rows before grouping/aggregation. To filter by aggregate results, use HAVING (which runs after GROUP BY). See our aggregation guide for details on WHERE vs HAVING.
SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL Syntax Quick Reference
SELECT * FROM users
WHERE email LIKE '%@gmail.com';SELECT * FROM products
WHERE name ILIKE '%phone%';SELECT * FROM orders
WHERE status IN ('pending', 'processing', 'shipped');SELECT * FROM events
WHERE event_date BETWEEN '2024-01-01' AND '2024-12-31';-- All of January 2024 (timestamp-safe)
SELECT * FROM orders
WHERE created_at >= '2024-01-01'
AND created_at < '2024-02-01';SELECT * FROM users
WHERE deleted_at IS NULL;-- True even when old_status or new_status is NULL
SELECT * FROM audit_log
WHERE old_status IS DISTINCT FROM new_status;-- Parentheses make the logic clear
SELECT * FROM users
WHERE (role = 'admin' OR role = 'moderator')
AND status = 'active';-- Users with no orders (NULL-safe)
SELECT * FROM users u
WHERE NOT EXISTS (
SELECT 1 FROM orders
WHERE customer_id = u.id
);-- Match literal underscore in filename
SELECT * FROM files
WHERE name LIKE '%\_%' ESCAPE '\';-- PostgreSQL regex alternative to OR LIKE
SELECT * FROM products
WHERE name ~ '(phone|tablet|laptop)';SELECT * FROM products
WHERE category = 'electronics'
AND (price < 100 OR on_sale = true)
AND stock > 0;SQL Filtering: WHERE, LIKE, IN, BETWEEN, NULL Sample Exercises
Select all products from the products table where the category is not 'electronics'.
SELECT * FROM products WHERE category <> 'electronics'Fill in the operator to select products priced at 50 or less
<=If orders table has 3 orders with totals 50, 100, and 150, how many rows does this return?
2+ 22 more exercises
Copy-ready syntax examples for quick lookup