Grouping and aggregating

We already learned about the DISTINCT keyword in a previous lesson. This achieved that there are no duplicates in the result of our query. We can also achieve this behavior with the GROUP BY clause. Data records with the same values in the specified columns are combined.

Query with DISTINCT

SELECT DISTINCT assignee, 
                project
  FROM tasks;

Query with GROUP BY

SELECT assignee, 
       project
  FROM tasks
 GROUP BY assignee,
          project;

The advantage of summarizing with GROUP BY is that calculations can also be performed on the other columns not mentioned in the GROUP BY. For example, we can extend the query above and use the aggregate function SUM( ) to get the sum of the effort per assignee and project:

Example

Further aggregate functions

The following aggregate functions calculate per grouping, that is, per different combination of the columns in the GROUP BY clause:

Aggregate functionDescription
SUM(<column>)The sum of the values in a column
MAX(<column>) The largest value in the column, where the maximum is calculated according to the data type.
MIN(<column>) Analogous to MAX( ) the smallest value
AVG(<column>)The average value, only for numeric columns
COUNT(*)Number of records in the grouping
COUNT(DISTINCT <column>) Number of different values of the column in the grouping
COUNT(<column>)Number of non-NULL values of the column in the grouping
The most important aggregate functions