MySQL self join technique explained

MySQL self join is a query that’s written to join a specific table with itself.

The goal of self join is usually to compare hierarchical data within a single table to create a customized result set.

A self join can be performed with or without a JOIN clause. You can also use INNER JOIN, LEFT JOIN, or RIGHT JOIN to execute a self join.

The most common syntax for self join would look as follows:

SELECT ...
FROM table_name AS table_one,
     table_name AS table_two
WHERE ...

You need to create two alias of the same table in the FORM clause for self join to work.

Let’s see an example of self join in action. Suppose you have a table named scores with the following data:

+----+-------+-------+
| id | name  | score |
+----+-------+-------+
|  1 | Jack  |     9 |
|  2 | Susan |     1 |
|  3 | Steve |     3 |
|  4 | Fany  |     8 |
|  5 | Dan   |     7 |
|  6 | Molly |     5 |
+----+-------+-------+

Now suppose you want to query all rows that have a score value smaller than "Dan".

You may write a query like this:

SELECT * FROM scores
  WHERE score < 7;

Because "Dan" has the score value of 7, the above query would produce the correct result set as follows:

+----+-------+-------+
| id | name  | score |
+----+-------+-------+
|  2 | Susan |     1 |
|  3 | Steve |     3 |
|  6 | Molly |     5 |
+----+-------+-------+

But what if suddenly the score value of "Dan" changed from 7 to 4? This will make your query invalid and return the wrong result.

To make the WHERE condition dynamic and adaptable to the changing value, you need to be able to refer to the "Dan"’s row and look at its score value during query execution.

Using the self join technique, you can write the query as follows:

SELECT s1.id, s1.name, s1.score
  FROM scores AS s1, 
       scores AS s2 
  WHERE s1.score <> s2.score 
  AND s2.name="Dan";

The above query would compare the scores table with itself. The second table s2 would return only one row which has the name value of "Dan".

Then, the s1 table score values will be compared to the s2.score. Only values smaller than s2.score will be shown in the output table.

Another way to produce the same output would be to use a subquery in the WHERE clause as follows:

SELECT * FROM scores 
  WHERE score < (SELECT score FROM scores WHERE name = "Dan");

Both subquery and self join techniques can compare the table with itself. A join operation is faster than a subquery operation, but a subquery syntax is usually easier to read than a join syntax.

And that’s how you perform a self join in MySQL database server 😉

Take your skills to the next level ⚡️

I'm sending out an occasional email with the latest tutorials on programming, web development, and statistics. Drop your email in the box below and I'll send new stuff straight into your inbox!

No spam. Unsubscribe anytime.