A clear explanation of finding customers who purchased every product in the catalog using GROUP BY and HAVING with COUNT DISTINCT.
Problem Restatement
We have two tables:
Customer(customer_id, product_key) — records each purchase.
Product(product_key) — all products in the catalog.
We need to return the customer_id of customers who bought all products in the catalog.
Input and Output
| Table | Schema |
|---|---|
Customer | customer_id INT, product_key INT |
Product | product_key INT |
Return a table with one column: customer_id.
Examples
Customer: Product:
+-------------+----------+ +-------------+
| customer_id | product_key | | product_key |
+-------------+----------+ +-------------+
| 1 | 5 | | 5 |
| 2 | 6 | | 6 |
| 3 | 5 | +-------------+
| 3 | 6 |
| 1 | 6 |
+-------------+----------+Customers who bought both 5 and 6: customers 1 and 3.
Answer:
+-------------+
| customer_id |
+-------------+
| 1 |
| 3 |
+-------------+Key Insight
For a customer to have bought all products, the count of distinct products they purchased must equal the total number of products.
Group by customer_id and filter with HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product).
Edge Cases
- Match the aggregate to the question: use raw
COUNTorSUMfor rows/amounts, andCOUNT(DISTINCT ...)only for unique entities. - Tie cases should return every qualifying row unless the statement asks for a single row.
- For outer joins, keep filters on the joined table in the
ONclause when unmatched rows must remain visible.
Common Pitfalls
- Do not put aggregate filters in
WHERE; useHAVINGafterGROUP BY. - When the answer needs all tied winners, avoid
LIMIT 1unless the statement explicitly asks for one row.
Implementation
SELECT customer_id
FROM Customer
GROUP BY customer_id
HAVING COUNT(DISTINCT product_key) = (SELECT COUNT(*) FROM Product);Code Explanation
GROUP BY customer_id aggregates each customer’s purchases.
COUNT(DISTINCT product_key) counts unique products each customer bought.
(SELECT COUNT(*) FROM Product) gives the total number of products.
The HAVING clause keeps only customers who bought every product.