There are many ways to write a query that counts duplicates in your MySQL database table.
Depending on the result set you wish to generate, there are at least two kinds of SQL queries that you can use to find duplicate values in your database table:
- Generate all rows that have a column with a duplicate value
- Find how many rows have a column with a duplicate value
This tutorial will help you learn how to write queries that produce the right result set.
For example, suppose you have a members
table with the following data. Note that the highlighted rows are the only unique rows in the table:
+----+------------+----------------+
| id | first_name | country |
+----+------------+----------------+
| 1 | Jackson | United States |
| 2 | Sarah | Canada |
| 3 | Akihiko | Japan |
| 4 | Matt | United Kingdom |
| 5 | Peter | Canada |
| 6 | Yuko | Japan |
| 7 | Devon | United States |
| 8 | Sasuke | Japan |
| 9 | Nicky | Netherlands |
+----+------------+----------------+
Besides the fourth and the ninth row, all other rows have a duplicate country
value in the table.
Let’s start with generating all rows that have a column with duplicate values
Generate all rows that have a column with a duplicate value
There are two methods you can use to generate all rows that have a column with a duplicate value:
- Using
INNER JOIN
subquery - Using
WHERE ... IN
subquery
Both methods above require you to write a subquery that returns only the country
value that has appeared more than once.
The subquery will be a SELECT
statement with a GROUP BY
and HAVING
clause as follows:
SELECT country
FROM members
GROUP BY country
HAVING COUNT(country) > 1
The result set of the query above will be as shown below:
+---------------+
| country |
+---------------+
| United States |
| Canada |
| Japan |
+---------------+
Using the INNER JOIN
clause, you can write a SELECT
statement that joins with a temporary table created from a query as follows:
SELECT id, first_name, members.country
FROM members
INNER JOIN(
SELECT country
FROM members
GROUP BY country
HAVING COUNT(country)>1
) AS temp_tbl WHERE members.country = temp_tbl.country;
The query above will produce the following result set:
+----+------------+---------------+
| id | first_name | country |
+----+------------+---------------+
| 1 | Jackson | United States |
| 2 | Sarah | Canada |
| 3 | Akihiko | Japan |
| 5 | Peter | Canada |
| 6 | Yuko | Japan |
| 7 | Devon | United States |
| 8 | Sasuke | Japan |
+----+------------+---------------+
As you can see, all duplicate rows are displayed in the result set above.
Besides using INNER JOIN
clause, you can also use the WHERE ... IN
clause to produce the same result as follows:
SELECT id, first_name, country
FROM members
WHERE country IN (
SELECT country
FROM members
GROUP BY country
HAVING COUNT(country) > 1
);
The WHERE ... IN
clause will filter the rows, causing MySQL to show only rows where the country
column value matches a value inside the IN
clause.
Find how many rows have a column with a duplicate value
You can also find out how many rows in your table has a field with duplicate values.
The following query will display the country
column along with how many times the country
column appears in the table:
SELECT country, COUNT( country ) total_members
FROM members
GROUP BY country
HAVING total_members > 1;
The output will be as shown below:
+---------------+---------------+
| country | total_members |
+---------------+---------------+
| United States | 2 |
| Canada | 2 |
| Japan | 3 |
+---------------+---------------+
And that’s how you can find duplicate values in your MySQL table.
You may need to copy and adjust the example queries above to produce the right result from your database server.