To find the median value using a MySQL query, you need to write a subquery that returns the column where you want to find the median value with a number index attached to each row.
The complete solution to find a median value would be similar to the following query:
SET @row_index := -1;
SELECT AVG(subq.score) as median_value
FROM (
SELECT @row_index:=@row_index + 1 AS row_index, score
FROM scores
ORDER BY score
) AS subq
WHERE subq.row_index
IN (FLOOR(@row_index / 2) , CEIL(@row_index / 2));
But it’s quite confusing without understanding what’s going on with the query above, so let’s read this tutorial to learn more about finding the median value using MySQL query.
Introduction
In statistics, the median value is the value of the middle item from a dataset.
When you have an odd number of items, then the median value will be the item in the middle.
But when you have an even number of items, then the median value will be the average of the two middle values.
For example, suppose you have an array of 6
numbers as follows:
[5, 12, 23, 41, 52, 77]
The median value of the array above will be (23 + 41) / 2
which equals 32
.
Since the median represents the value that separates the higher half and the lower half of a data sample, you need to sort an array from the smallest to the largest first before finding the median value.
Suppose you have another array that has an even number of items as follows:
[9, 3, 5, 12, 1]
Then you need to sort the array before finding the median value:
[1, 3, 5, 9, 12]
The median value will be the number 5
.
Calculating the median value is a common task in statistics. Unfortunately, MySQL doesn’t offer a median function that can find the median value for you.
To find the median value, you’ll need to create your own query using the available functions.
The easiest MySQL median query
Let’s create an example table for finding the median value. Suppose you have a table named scores
that contain the following data:
+----+-------+-------+
| id | name | score |
+----+-------+-------+
| 1 | Jack | 2 |
| 2 | Susan | 1 |
| 3 | Steve | 3 |
| 4 | Fany | 8 |
| 5 | Dan | 7 |
| 6 | Molly | 5 |
+----+-------+-------+
To find the median value using MySQL query, you need to be able to attach an index number to each row of your table.
You can create a variable in MySQL using the SET
clause as follows:
SET @row_index := -1;
The above statement initializes the @row_index
variable with the value of -1
. This is required because you need to start counting the index from the number 0
as you will see.
Next, let’s assign the @row_index
value to each row and prepare the score
column by sorting it in ascending order:
SET @row_index := -1;
SELECT @row_index:=@row_index + 1 AS row_index, score
FROM scores
ORDER BY score;
The above SQL query will produce the following result set:
+-----------+-------+
| row_index | score |
+-----------+-------+
| 0 | 1 |
| 1 | 2 |
| 2 | 3 |
| 3 | 5 |
| 4 | 7 |
| 5 | 8 |
+-----------+-------+
Since the @row_index
variable is initialized as -1
, the first index starts from 0
for the first row.
Now that the row_index
and the score
columns are set correctly, you can find the median value by putting the query above as a subquery.
The outer query will filter the result set with a WHERE
condition as follows:
SET @row_index := -1;
SELECT *
FROM (
SELECT @row_index:=@row_index + 1 AS row_index, score
FROM scores
ORDER BY score
) AS subq
WHERE subq.row_index
IN (FLOOR(@row_index / 2) , CEIL(@row_index / 2));
The WHERE
clause above will filter the subquery result to return only the middle index number(s).
When you have an odd number of rows, then the middle row will be returned. When you have an even number of rows, then the two middle rows will be returned.
Here’s the output returned from the query above:
+-----------+-------+
| row_index | score |
+-----------+-------+
| 2 | 3 |
| 3 | 5 |
+-----------+-------+
The last thing you need to do is to find the average of the median value(s) using the AVG()
function.
You need to change the outer SELECT
statement from SELECT *
to SELECT AVG(subq.score)
as follows:
SET @row_index := -1;
SELECT AVG(subq.score) as median_value
FROM (
SELECT @row_index:=@row_index + 1 AS row_index, score
FROM scores
ORDER BY score
) AS subq
WHERE subq.row_index
IN (FLOOR(@row_index / 2) , CEIL(@row_index / 2));
Now the output will be as shown below:
+-----------------+
| AVG(subq.score) |
+-----------------+
| 4 |
+-----------------+
And that’s the easiest way you can find the median value of your data using MySQL SELECT
statement.
Feel free to copy and modify the complete query above for your projects 😉