Can you write this from memory?
Count the total number of rows in the orders table.
Aggregation is where SQL goes from "select rows" to "answer questions." And it's where queries silently return wrong numbers if you put a filter in the wrong place.
SQL order of operations
FROM/JOIN → WHERE → GROUP BY → aggregates → HAVING → SELECT → ORDER BY → LIMIT
WHERE filters rows before grouping. HAVING filters groups after aggregation. If your numbers are wrong, check which phase your filter is in.
Understand this order and most aggregation bugs become obvious:
1. FROM / JOIN -- assemble the raw rows
2. WHERE -- filter rows (before grouping)
3. GROUP BY -- form groups
4. aggregates -- compute COUNT, SUM, AVG, etc.
5. HAVING -- filter groups (after aggregation)
6. SELECT -- pick columns and expressions
7. ORDER BY -- sort results
8. LIMIT -- cap output
If your numbers are wrong, ask: "Am I filtering in the right phase?"
Understanding this distinction depends on knowing how WHERE clause filtering works — WHERE filters individual rows before the GROUP BY phase.
| Use WHERE when... | Use HAVING when... |
|---|---|
| Filtering individual rows | Filtering groups |
| Condition doesn't involve aggregates | Condition uses SUM, COUNT, AVG, etc. |
| You want to exclude rows before grouping | You want to exclude groups after aggregation |
Example: "Active products with avg price > $100"
SELECT category, AVG(price) AS avg_price
FROM products
WHERE is_active = true -- row filter: only active products
GROUP BY category
HAVING AVG(price) > 100; -- group filter: only high-value categories
Common mistake: Putting the aggregate condition in WHERE.
-- WRONG: This errors (can't use aggregate in WHERE)
SELECT category, SUM(amount)
FROM orders
WHERE SUM(amount) > 1000 -- ERROR!
GROUP BY category;
-- RIGHT: Use HAVING
SELECT category, SUM(amount)
FROM orders
GROUP BY category
HAVING SUM(amount) > 1000;
This trips up everyone at least once:
-- Sample data
| id | email |
|----|-----------------|
| 1 | alice@test.com |
| 2 | NULL |
| 3 | bob@test.com |
SELECT COUNT(*) AS total_rows, -- 3
COUNT(email) AS emails_present, -- 2 (NULL ignored)
COUNT(id) AS ids_present -- 3 (no NULLs in id)
FROM users;
Rule of thumb:
- COUNT(*) = "how many rows?"
- COUNT(column) = "how many non-NULL values in this column?"
Instead of multiple queries, use CASE inside aggregates:
SELECT
COUNT(*) AS total_orders,
SUM(CASE WHEN status = 'completed' THEN 1 ELSE 0 END) AS completed,
SUM(CASE WHEN status = 'cancelled' THEN 1 ELSE 0 END) AS cancelled,
SUM(CASE WHEN status = 'completed' THEN amount ELSE 0 END) AS completed_revenue
FROM orders;
PostgreSQL/modern SQL alternative: FILTER clause
SELECT
COUNT(*) AS total,
COUNT(*) FILTER (WHERE status = 'completed') AS completed,
SUM(amount) FILTER (WHERE status = 'completed') AS completed_revenue
FROM orders;
-- Wrong: counts all order rows
SELECT category, COUNT(*) AS buyer_count
FROM orders
GROUP BY category;
-- Right: counts unique users per category
SELECT category, COUNT(DISTINCT user_id) AS unique_buyers
FROM orders
GROUP BY category;
You can combine DISTINCT with conditions:
-- Unique users who made a purchase over $100
SELECT COUNT(DISTINCT user_id)
FROM orders
WHERE amount > 100;
"Give me all tags for each post as a comma-separated string":
-- PostgreSQL
SELECT post_id, STRING_AGG(tag, ', ' ORDER BY tag) AS tags
FROM post_tags
GROUP BY post_id;
-- MySQL
SELECT post_id, GROUP_CONCAT(tag ORDER BY tag SEPARATOR ', ') AS tags
FROM post_tags
GROUP BY post_id;
-- Oracle
SELECT post_id, LISTAGG(tag, ', ') WITHIN GROUP (ORDER BY tag) AS tags
FROM post_tags
GROUP BY post_id;
Tips:
- Add
DISTINCTto dedupe:STRING_AGG(DISTINCT tag, ', ') - Use
ORDER BYinside the function to control sort order - Watch out for NULL values—they're typically ignored, but check your database
Raw timestamps create one group per second—not useful.
-- PostgreSQL: DATE_TRUNC for clean buckets
SELECT
DATE_TRUNC('month', created_at) AS month,
SUM(amount) AS revenue
FROM orders
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;
-- MySQL: DATE_FORMAT or YEAR/MONTH functions
SELECT
DATE_FORMAT(created_at, '%Y-%m') AS month,
SUM(amount) AS revenue
FROM orders
GROUP BY DATE_FORMAT(created_at, '%Y-%m');
Common buckets:
DATE_TRUNC('day', ts)— dailyDATE_TRUNC('week', ts)— weekly (starts Monday in Postgres)DATE_TRUNC('month', ts)— monthlyDATE_TRUNC('quarter', ts)— quarterly
JOINs multiply rows. If each order has 3 items, the order row appears 3 times after the join:
-- WRONG: If each order has multiple items, order_amount is counted multiple times
SELECT
o.category,
SUM(o.order_amount) AS revenue -- INFLATED!
FROM orders o
JOIN order_items i ON i.order_id = o.id
GROUP BY o.category;
Solutions:
- Aggregate before joining:
WITH order_totals AS (
SELECT category, SUM(order_amount) AS revenue
FROM orders
GROUP BY category
)
SELECT ot.category, ot.revenue, COUNT(i.id) AS items
FROM order_totals ot
LEFT JOIN orders o ON o.category = ot.category
LEFT JOIN order_items i ON i.order_id = o.id
GROUP BY ot.category, ot.revenue;
- Use COUNT(DISTINCT ...) for counting:
SELECT
o.category,
COUNT(DISTINCT o.id) AS order_count -- Safe: counts unique orders
FROM orders o
JOIN order_items i ON i.order_id = o.id
GROUP BY o.category;
Warning: Don't use SUM(DISTINCT amount) for revenue—if two orders have the same price ($19.99), one gets silently dropped. For sums, always aggregate before joining.
- Rethink what you're measuring: Maybe you want item-level revenue, not order-level.
"Show all categories, even those with no orders":
SELECT
c.name,
COUNT(o.id) AS order_count, -- 0 for empty categories
COALESCE(SUM(o.amount), 0) AS revenue -- NULL → 0
FROM categories c
LEFT JOIN orders o ON o.category_id = c.id
GROUP BY c.name;
Key points:
- Start FROM the dimension table (categories)
- LEFT JOIN to the fact table (orders)
- Critical: Use
COUNT(o.id)notCOUNT(*). When there are no matches, the LEFT JOIN creates one row with all NULLs.COUNT(*)counts that NULL row (returns 1), butCOUNT(o.id)ignores the NULL (returns 0). - Use COALESCE to convert NULL sums to 0
Note: These examples use window functions (OVER), which calculate aggregates without collapsing rows. See our window functions guide for details.
Revenue by month with running total
SELECT
DATE_TRUNC('month', order_date) AS month,
SUM(amount) AS monthly_revenue,
SUM(SUM(amount)) OVER (ORDER BY DATE_TRUNC('month', order_date)) AS running_total
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY DATE_TRUNC('month', order_date);
Conversion funnel
SELECT
COUNT(DISTINCT user_id) AS visitors,
COUNT(DISTINCT CASE WHEN signed_up THEN user_id END) AS signups,
COUNT(DISTINCT CASE WHEN purchased THEN user_id END) AS buyers
FROM user_events;
Percentage breakdown
SELECT
category,
COUNT(*) AS count,
ROUND(100.0 * COUNT(*) / SUM(COUNT(*)) OVER (), 2) AS percentage
FROM products
GROUP BY category;
-
Check your WHERE timing — Is a filter that should be in HAVING in WHERE, or vice versa?
-
Check for JOIN multiplication — Did a join create duplicate rows? Run
SELECT COUNT(*)before and after the join. -
Check COUNT behavior — Are you using COUNT(column) when you meant COUNT(*)? Check for NULLs.
-
Check DISTINCT — Do you need COUNT(DISTINCT x) instead of COUNT(x)?
-
Check date bucketing — Are you grouping by raw timestamp instead of DATE_TRUNC?
-
Check GROUP BY completeness — Are all non-aggregated columns in GROUP BY?
When to Use SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG
- Summarize rows into totals, averages, or counts.
- Group data by category, user, date bucket, or any dimension.
- Filter aggregated results with HAVING (e.g., "only groups with count > 10").
- Build dashboards, KPI queries, and reports (revenue, retention, conversion).
Check Your Understanding: SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG
Find the top 3 categories by revenue in 2024.
Filter by date in WHERE (before grouping), GROUP BY category, SUM(amount) as revenue, ORDER BY revenue DESC, LIMIT 3. The date filter goes in WHERE because you're filtering rows, not groups.
What You'll Practice: SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG
Common SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG Pitfalls
- Selecting non-aggregated columns without GROUP BY (works in old MySQL, fails elsewhere)
- Filtering aggregates in WHERE instead of HAVING (WHERE SUM(...) errors or filters wrong)
- COUNT(column) unexpectedly ignoring NULLs—use COUNT(*) for row counts
- JOIN row-multiplication inflating SUM/COUNT (aggregate before join or use DISTINCT)
- Grouping by raw timestamps instead of DATE_TRUNC—creates one group per second
- Using column aliases in HAVING (works in MySQL, fails in standard SQL)
SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG FAQ
WHERE vs HAVING—when do I use each?
WHERE filters rows *before* grouping (use for date ranges, status filters, etc.). HAVING filters groups *after* aggregation (use for "only groups with SUM > 1000"). If your condition references an aggregate function, it must go in HAVING.
COUNT(*) vs COUNT(column)—why do my counts differ?
COUNT(*) counts all rows in each group. COUNT(column) ignores NULLs in that column. If a column has NULLs, COUNT(column) will be lower. Use COUNT(*) for "how many rows" and COUNT(column) for "how many non-null values."
Why do I need to GROUP BY every non-aggregated column?
Each output row represents a group. If you SELECT a column without aggregating it, SQL needs to know which value to pick. Standard SQL requires it in GROUP BY. MySQL's ONLY_FULL_GROUP_BY mode (now default) enforces this; older MySQL was more permissive and could return arbitrary values.
Why did my sum/count double or triple?
JOINs can multiply rows. If one order has 3 items and you SUM(order_amount), you count the order 3 times. Fix: aggregate in a subquery/CTE before joining, or use COUNT(DISTINCT order_id) to avoid overcounting.
How do I count only rows that match a condition?
Use a conditional aggregate: SUM(CASE WHEN condition THEN 1 ELSE 0 END) or COUNT(CASE WHEN condition THEN 1 END). Some databases support FILTER: COUNT(*) FILTER (WHERE condition).
How do I count distinct values per group?
Use COUNT(DISTINCT column). For example, COUNT(DISTINCT user_id) gives unique users per group, not total rows.
Can I use an aggregate alias in HAVING?
MySQL and some databases allow it: HAVING total > 100. Standard SQL requires you to repeat the expression: HAVING SUM(amount) > 100. PostgreSQL follows the standard.
How do I include groups with zero rows (e.g., categories with no orders)?
Use a LEFT JOIN from the dimension table (categories) to the fact table (orders), then GROUP BY. Groups with no matches will show NULL aggregates or 0 with COALESCE.
How do I get the row with the MAX value (e.g., highest-spending user)?
This is the "greatest-N-per-group" problem. Use a window function: ROW_NUMBER() OVER (PARTITION BY group_col ORDER BY value DESC), then filter to row_number = 1. Or use a correlated subquery. Don't try to SELECT non-aggregated columns with GROUP BY—you'll get arbitrary values.
How do I combine values into a comma-separated list?
Use STRING_AGG(column, ', ') in PostgreSQL, GROUP_CONCAT(column) in MySQL, or LISTAGG(column, ', ') in Oracle. Add DISTINCT inside to dedupe, and ORDER BY to control the order.
SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG Syntax Quick Reference
SELECT category, COUNT(*) AS total
FROM products
GROUP BY category;-- WHERE filters rows, HAVING filters groups
SELECT category, AVG(price) AS avg_price
FROM products
WHERE is_active = true -- filter rows first
GROUP BY category
HAVING AVG(price) > 100; -- then filter groupsSELECT category,
SUM(CASE WHEN in_stock THEN 1 ELSE 0 END) AS in_stock,
COUNT(*) AS total
FROM products
GROUP BY category;SELECT category,
COUNT(DISTINCT user_id) AS unique_buyers
FROM orders
GROUP BY category;SELECT DATE_TRUNC('month', created_at) AS month,
SUM(amount) AS revenue
FROM orders
GROUP BY DATE_TRUNC('month', created_at)
ORDER BY month;-- Top 3 categories by revenue
SELECT category, SUM(amount) AS revenue
FROM orders
WHERE order_date >= '2024-01-01'
GROUP BY category
ORDER BY revenue DESC
LIMIT 3;-- Show all categories, even those with 0 orders
SELECT c.name, COUNT(o.id) AS order_count
FROM categories c
LEFT JOIN orders o ON o.category_id = c.id
GROUP BY c.name;-- Combine tags into comma-separated list
SELECT post_id,
STRING_AGG(tag, ', ' ORDER BY tag) AS tags
FROM post_tags
GROUP BY post_id;SQL Aggregation: GROUP BY, HAVING, COUNT, SUM, AVG Sample Exercises
Fill in the function to calculate the average price
AVGIf users has 5 rows and 2 have NULL email, what does COUNT(email) return?
3Calculate the total order amount for each user_id in the orders table.
SELECT user_id, SUM(total) FROM orders GROUP BY user_id+ 17 more exercises
Copy-ready syntax examples for quick lookup