Understanding the difference between COUNT() and COUNT(DISTINCT) in SQL is crucial for accurate data analysis.
COUNT(*) returns the total number of rows that match your query criteria, duplicates included, while COUNT(DISTINCT column_name) returns the number of unique non-NULL values in the specified column, so repeated values are counted only once.
For example, if you have a table of customer orders where a single customer can place multiple orders, COUNT(customer_id) gives you the total number of orders (as long as every order has a customer_id), whereas COUNT(DISTINCT customer_id) tells you how many unique customers have placed orders.
The choice between these functions depends on your specific reporting needs. Use COUNT() when you need the total number of records, such as the number of sales transactions or the total number of website visits.
Use COUNT(DISTINCT) when you need to know unique occurrences, like the number of different products sold or unique visitors to your website. It's also worth noting that COUNT(*) counts every row, even rows whose columns contain NULL, while COUNT(column_name) counts only the rows where that column is not NULL. COUNT(DISTINCT column_name) ignores NULLs as well, so the three forms can return three different numbers for the same table.
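The NULL behavior described above is easiest to see on a table that actually contains NULLs. The following is a minimal sketch, not part of the original example: the guest_orders table, its columns, and its rows are hypothetical and invented for illustration. The multi-row INSERT syntax is assumed to be supported by your database (it is in PostgreSQL, MySQL, and recent SQLite versions; other systems may need one INSERT per row).

```sql
-- Hypothetical table for illustration only:
-- guest checkouts are stored with a NULL customer_id.
CREATE TABLE guest_orders (
    order_id    INTEGER PRIMARY KEY,
    customer_id INTEGER
);

INSERT INTO guest_orders (order_id, customer_id) VALUES
    (1, 1),
    (2, 1),
    (3, 2),
    (4, NULL),
    (5, NULL);

SELECT
    COUNT(*) as all_rows,                           -- 5: every row, NULL or not
    COUNT(customer_id) as rows_with_customer,       -- 3: rows where customer_id is not NULL
    COUNT(DISTINCT customer_id) as unique_customers -- 2: distinct non-NULL values (1 and 2)
FROM guest_orders;
```

If guest orders should count toward the order total, COUNT(*) is the safer choice here; COUNT(customer_id) would silently drop them.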
Example
-- Example table: customer_orders
-- customer_id | order_date | product_id
-- 1 | 2024-01-01 | 100
-- 1 | 2024-01-02 | 101
-- 2 | 2024-01-01 | 100
-- 3 | 2024-01-03 | 102
-- Count all orders
SELECT COUNT(*) as total_orders
FROM customer_orders;
-- Result: 4 (counts all rows)
-- Count unique customers who placed orders
SELECT COUNT(DISTINCT customer_id) as unique_customers
FROM customer_orders;
-- Result: 3 (counts unique customer_ids: 1, 2, 3)
-- Count unique products ordered
SELECT COUNT(DISTINCT product_id) as unique_products
FROM customer_orders;
-- Result: 3 (counts unique product_ids: 100, 101, 102)
-- Compare regular COUNT with COUNT DISTINCT
SELECT
    COUNT(customer_id) as total_orders,
    COUNT(DISTINCT customer_id) as unique_customers
FROM customer_orders;
-- Result: total_orders = 4, unique_customers = 3
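To run the queries above yourself, the example table needs to exist first. The sketch below creates it with the same four rows and then breaks the counts down per customer, which shows where the difference between 4 orders and 3 unique customers comes from. The column types are assumptions (the article only lists the column names), and the GROUP BY query is an added illustration rather than part of the original example.

```sql
-- Assumed column types; adjust to your database.
CREATE TABLE customer_orders (
    customer_id INTEGER,
    order_date  DATE,
    product_id  INTEGER
);

INSERT INTO customer_orders (customer_id, order_date, product_id) VALUES
    (1, '2024-01-01', 100),
    (1, '2024-01-02', 101),
    (2, '2024-01-01', 100),
    (3, '2024-01-03', 102);

-- Per-customer breakdown: orders placed and distinct products bought
SELECT
    customer_id,
    COUNT(*) as orders_placed,
    COUNT(DISTINCT product_id) as unique_products
FROM customer_orders
GROUP BY customer_id
ORDER BY customer_id;
-- Result:
-- customer_id | orders_placed | unique_products
-- 1 | 2 | 2
-- 2 | 1 | 1
-- 3 | 1 | 1
```

Customer 1 accounts for two of the four rows, which is why COUNT(customer_id) returns 4 while COUNT(DISTINCT customer_id) returns 3.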
Walt is a computer scientist, software engineer, startup founder and previous mentor for a coding bootcamp. He has been creating software for the past 20 years.