A clear explanation of finding actor-director pairs with at least three collaborations using GROUP BY and HAVING.
Problem Restatement
We have a table ActorDirector(actor_id, director_id, timestamp).
We need to return all (actor_id, director_id) pairs where the actor has cooperated with the director at least three times.
Input and Output
| Table | Schema |
|---|---|
ActorDirector | actor_id INT, director_id INT, timestamp INT |
Return a table with actor_id and director_id.
Examples
ActorDirector:
+----------+-------------+----------+
| actor_id | director_id | timestamp |
+----------+-------------+----------+
| 1 | 1 | 0 |
| 1 | 1 | 1 |
| 1 | 1 | 2 |
| 1 | 2 | 3 |
| 1 | 2 | 4 |
| 2 | 1 | 5 |
+----------+-------------+----------+Actor 1 worked with director 1 three times.
Answer:
+----------+-------------+
| actor_id | director_id |
+----------+-------------+
| 1 | 1 |
+----------+-------------+Key Insight
Group by the (actor_id, director_id) pair and count rows. Filter where count ≥ 3.
Edge Cases
- Match the aggregate to the question: use raw
COUNTorSUMfor rows/amounts, andCOUNT(DISTINCT ...)only for unique entities. - Tie cases should return every qualifying row unless the statement asks for a single row.
- For outer joins, keep filters on the joined table in the
ONclause when unmatched rows must remain visible.
Common Pitfalls
- Do not put aggregate filters in
WHERE; useHAVINGafterGROUP BY. - When the answer needs all tied winners, avoid
LIMIT 1unless the statement explicitly asks for one row.
Implementation
SELECT actor_id, director_id
FROM ActorDirector
GROUP BY actor_id, director_id
HAVING COUNT(*) >= 3;Code Explanation
GROUP BY actor_id, director_id groups each unique pair.
COUNT(*) counts the number of collaborations.
HAVING COUNT(*) >= 3 keeps only pairs with at least three collaborations.