SQL Group Functions
Introduction
SQL Group Functions (also known as aggregate functions) are powerful tools that operate on sets of rows to calculate and return a single value. Unlike single-row functions that work on individual rows, group functions process multiple rows at once, making them essential for data analysis and reporting.
These functions allow you to answer questions like:
- What is the total sales amount per region?
- What is the average salary by department?
- How many customers are in each country?
Group functions are fundamental to data analysis in SQL and form the backbone of business intelligence reporting.
Understanding Group Functions
Group functions compute a single result from a set of input values. They're typically used with the GROUP BY clause to divide the table into groups of rows, applying the function to each group separately.
Common SQL Group Functions
Here are the most commonly used group functions:
| Function | Description | Example | 
|---|---|---|
| COUNT() | Counts the number of rows or non-NULL values | COUNT(employee_id) | 
| SUM() | Calculates the total of numeric values | SUM(salary) | 
| AVG() | Calculates the average of numeric values | AVG(salary) | 
| MIN() | Finds the minimum value | MIN(hire_date) | 
| MAX() | Finds the maximum value | MAX(price) | 
Basic Syntax
SELECT column(s), GROUP_FUNCTION(column)
FROM table
[WHERE condition]
[GROUP BY column(s)]
[HAVING group_condition]
[ORDER BY column(s)];
Group Functions in Action
Let's explore each function with examples using a fictional employees table:
COUNT() Function
The COUNT() function returns the number of rows in a specified table or the number of non-NULL values in a column.
-- Count all employees
SELECT COUNT(*) AS total_employees
FROM employees;
Output:
total_employees
---------------
         1000
-- Count employees by department
SELECT department_id, COUNT(*) AS employee_count
FROM employees
GROUP BY department_id
ORDER BY employee_count DESC;
Output:
department_id   employee_count
-------------   --------------
           50              45
           80              34
           30              25
           20              20
          ...             ...
SUM() Function
The SUM() function calculates the total of all values in a numeric column.
-- Calculate total salary budget
SELECT SUM(salary) AS total_salary
FROM employees;
Output:
total_salary
------------
    6871850
-- Calculate total salary by department
SELECT department_id, SUM(salary) AS department_salary
FROM employees
GROUP BY department_id
ORDER BY department_salary DESC;
Output:
department_id   department_salary
-------------   -----------------
           80            590550
           50            385750
           30            249500
          ...              ...
AVG() Function
The AVG() function calculates the average value of a numeric column.
-- Calculate average salary
SELECT AVG(salary) AS average_salary
FROM employees;
Output:
average_salary
--------------
        6871.85
-- Calculate average salary by job title
SELECT job_id, ROUND(AVG(salary), 2) AS avg_job_salary
FROM employees
GROUP BY job_id
ORDER BY avg_job_salary DESC;
Output:
job_id          avg_job_salary
--------------  --------------
AD_PRES               24000.00
AD_VP                 17000.00
IT_PROG                5760.00
MK_MAN                13000.00
...                       ...
MIN() and MAX() Functions
The MIN() and MAX() functions return the smallest and largest values in a column, respectively.
-- Find salary range
SELECT 
    MIN(salary) AS lowest_salary,
    MAX(salary) AS highest_salary
FROM employees;
Output:
lowest_salary   highest_salary
-------------   --------------
        2100            24000
-- Find date range of employment by department
SELECT department_id,
    MIN(hire_date) AS earliest_hire,
    MAX(hire_date) AS latest_hire
FROM employees
GROUP BY department_id
ORDER BY department_id;
Output:
department_id   earliest_hire   latest_hire
-------------   -------------   -----------
           10    1987-09-17     1994-01-03
           20    1994-07-01     2005-06-21
           30    1994-12-01     2005-08-15
          ...          ...            ...
Using Group Functions with GROUP BY
The GROUP BY clause organizes rows into groups, and the group function is then applied to each group rather than the entire table.
-- Count employees and average salary by department and job
SELECT 
    department_id,
    job_id,
    COUNT(*) AS employee_count,
    ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department_id, job_id
ORDER BY department_id, job_id;
Output:
department_id   job_id          employee_count   avg_salary
-------------   -------------   --------------   ----------
           10   AD_ASST                     1       4400.00
           20   MK_MAN                      1      13000.00
           20   MK_REP                      1       6000.00
           30   PU_CLERK                    5       2780.00
           30   PU_MAN                      1      11000.00
          ...   ...                       ...          ...
The HAVING Clause
While the WHERE clause filters rows before grouping, the HAVING clause filters groups after they are formed.
-- Find departments with average salary over 8000
SELECT 
    department_id,
    ROUND(AVG(salary), 2) AS avg_salary
FROM employees
GROUP BY department_id
HAVING AVG(salary) > 8000
ORDER BY avg_salary DESC;
Output:
department_id   avg_salary
-------------   ----------
           90      19333.33
           80      10033.33
          100       8600.00
           20       9500.00
          ...          ...
Combining Multiple Group Functions
You can use multiple group functions in a single query.
-- Department salary statistics
SELECT 
    department_id,
    COUNT(*) AS employee_count,
    MIN(salary) AS min_salary,
    ROUND(AVG(salary), 2) AS avg_salary,
    MAX(salary) AS max_salary,
    SUM(salary) AS total_salary
FROM employees
GROUP BY department_id
ORDER BY department_id;
Output:
department_id   employee_count   min_salary   avg_salary   max_salary   total_salary
-------------   --------------   ----------   ----------   ----------   ------------
           10                1        4400       4400.00        4400          4400
           20                2        6000       9500.00       13000         19000
           30                6        2500       4150.00       11000         24900
          ...              ...         ...          ...         ...           ...
NULL Handling in Group Functions
Most group functions ignore NULL values in their calculations, but there are specific behaviors to be aware of:
-- COUNT(*) counts all rows
-- COUNT(column) counts non-NULL values
SELECT 
    COUNT(*) AS total_rows,
    COUNT(commission_pct) AS employees_with_commission
FROM employees;
Output:
total_rows   employees_with_commission
----------   --------------------------
      1000                          35
Real-World Applications
Sales Analysis
-- Monthly sales summary
SELECT 
    TO_CHAR(order_date, 'YYYY-MM') AS month,
    COUNT(*) AS order_count,
    SUM(order_total) AS total_revenue,
    ROUND(AVG(order_total), 2) AS avg_order_value,
    MAX(order_total) AS largest_order
FROM orders
GROUP BY TO_CHAR(order_date, 'YYYY-MM')
ORDER BY month;
Customer Segmentation
-- Customer purchase frequency
SELECT 
    customer_segment,
    COUNT(DISTINCT customer_id) AS customer_count,
    ROUND(AVG(purchase_amount), 2) AS avg_purchase,
    SUM(purchase_amount) AS total_spending
FROM purchases
GROUP BY customer_segment
ORDER BY total_spending DESC;
Inventory Management
-- Inventory levels by category
SELECT 
    category,
    COUNT(*) AS product_count,
    SUM(units_in_stock) AS total_stock,
    MIN(units_in_stock) AS min_stock,
    MAX(units_in_stock) AS max_stock
FROM products
GROUP BY category
HAVING SUM(units_in_stock) < 100
ORDER BY total_stock;
Best Practices
- Always use aliases: Give meaningful names to your calculated columns
- Use ROUND() with AVG(): Average calculations often produce many decimal places
- Filter before grouping: Use WHERE to filter input rows before applying GROUP BY for better performance
- Order results: Include ORDER BY to make results more readable
- Remember NULL behavior: Group functions generally ignore NULL values
- Use HAVING correctly: HAVING filters groups, WHERE filters rows
Advanced Group Function Concepts
DISTINCT with Group Functions
You can use DISTINCT inside group functions to count or calculate based on unique values only:
-- Count distinct job titles in each department
SELECT 
    department_id,
    COUNT(DISTINCT job_id) AS unique_job_count
FROM employees
GROUP BY department_id
ORDER BY unique_job_count DESC;
Group Functions and Joins
Group functions work seamlessly with table joins:
-- Average salary by department name
SELECT 
    d.department_name,
    COUNT(e.employee_id) AS employee_count,
    ROUND(AVG(e.salary), 2) AS avg_salary
FROM employees e
JOIN departments d ON e.department_id = d.department_id
GROUP BY d.department_name
ORDER BY avg_salary DESC;
Common Errors and How to Avoid Them
- 
Mixing group functions with non-grouped columns: -- INCORRECT
 SELECT department_id, employee_name, AVG(salary)
 FROM employees
 GROUP BY department_id;Every column in the SELECT list must either be in the GROUP BY clause or be used with a group function. 
- 
Using WHERE instead of HAVING for group conditions: -- INCORRECT
 SELECT department_id, AVG(salary)
 FROM employees
 WHERE AVG(salary) > 8000
 GROUP BY department_id;
 -- CORRECT
 SELECT department_id, AVG(salary)
 FROM employees
 GROUP BY department_id
 HAVING AVG(salary) > 8000;
Visualizing the Group Function Process
Summary
SQL Group Functions are powerful tools for data analysis, allowing you to:
- Aggregate data across multiple rows
- Perform calculations on groups of records
- Generate summary statistics and reports
- Analyze patterns and trends in your data
By mastering group functions, you can transform raw data into meaningful insights that drive better business decisions.
Practice Exercises
- Write a query to find the number of employees hired each year.
- Calculate the average salary for employees in departments with more than 10 employees.
- Find the department with the highest total salary expense.
- List job titles where the maximum salary is at least twice the minimum salary.
- Calculate the median salary for each department (hint: this requires advanced techniques in most SQL dialects).
Additional Resources
💡 Found a typo or mistake? Click "Edit this page" to suggest a correction. Your feedback is greatly appreciated!