MySQL self join is a query that’s written to join a specific table with itself.
The goal of self join is usually to compare hierarchical data within a single table to create a customized result set.
A self join can be performed with or without a JOIN
clause. You can also use INNER JOIN
, LEFT JOIN
, or RIGHT JOIN
to execute a self join.
The most common syntax for self join would look as follows:
SELECT ...
FROM table_name AS table_one,
table_name AS table_two
WHERE ...
You need to create two alias of the same table in the FORM
clause for self join to work.
Let’s see an example of self join in action. Suppose you have a table named scores
with the following data:
+----+-------+-------+
| id | name | score |
+----+-------+-------+
| 1 | Jack | 9 |
| 2 | Susan | 1 |
| 3 | Steve | 3 |
| 4 | Fany | 8 |
| 5 | Dan | 7 |
| 6 | Molly | 5 |
+----+-------+-------+
Now suppose you want to query all rows that have a score
value smaller than "Dan"
.
You may write a query like this:
SELECT * FROM scores
WHERE score < 7;
Because "Dan"
has the score
value of 7
, the above query would produce the correct result set as follows:
+----+-------+-------+
| id | name | score |
+----+-------+-------+
| 2 | Susan | 1 |
| 3 | Steve | 3 |
| 6 | Molly | 5 |
+----+-------+-------+
But what if suddenly the score
value of "Dan"
changed from 7
to 4
? This will make your query invalid and return the wrong result.
To make the WHERE
condition dynamic and adaptable to the changing value, you need to be able to refer to the "Dan"
’s row and look at its score
value during query execution.
Using the self join technique, you can write the query as follows:
SELECT s1.id, s1.name, s1.score
FROM scores AS s1,
scores AS s2
WHERE s1.score <> s2.score
AND s2.name="Dan";
The above query would compare the scores
table with itself. The second table s2
would return only one row which has the name
value of "Dan"
.
Then, the s1
table score
values will be compared to the s2.score
. Only values smaller than s2.score
will be shown in the output table.
Another way to produce the same output would be to use a subquery in the WHERE
clause as follows:
SELECT * FROM scores
WHERE score < (SELECT score FROM scores WHERE name = "Dan");
Both subquery and self join techniques can compare the table with itself. A join operation is faster than a subquery operation, but a subquery syntax is usually easier to read than a join syntax.
And that’s how you perform a self join in MySQL database server 😉