Welcome to the world of window functions.

A window function is closely related to an aggregate function. However, instead of collapsing all rows into one, it keeps every row and adds a new column holding, for example, a running total, a ranking, or a moving average. The set of rows each calculation looks at is called the ‘window frame.’

If you’re interested in using this powerful tool, keep reading for examples and pictures.


Introduction
Syntax
Setting the scene
Create a running total
RANK rows based on a given criterion


Introduction

Window functions come in three main types. These are:

Aggregate window functions
These apply aggregate functions like SUM, COUNT, MAX, MIN over a set of rows and return a result for every row, rather than a single result for the whole query.

Ranking window functions
These assign a ‘rank’ to each row in a set of rows, using RANK, DENSE_RANK, ROW_NUMBER, or NTILE.

Value window functions
These use LAG, LEAD, FIRST_VALUE, LAST_VALUE to access a value from another row (a previous row, a following row, or the first or last row of the frame) without having to do a self join.
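To see the three types side by side, here is a minimal sketch that runs one query through Python's built-in sqlite3 module. The salesorderheader table and its four rows are made up for illustration (they are not the article's database), and the sketch assumes your SQLite build is 3.25 or newer, since older versions have no window functions.

```python
import sqlite3

# Hypothetical stand-in for the sales order table; the values are made up.
conn = sqlite3.connect(":memory:")  # window functions need SQLite 3.25+
conn.execute(
    "create table salesorderheader (salesorderid integer, orderdate text, subtotal real)"
)
conn.executemany(
    "insert into salesorderheader values (?, ?, ?)",
    [
        (1, "2018-01-01", 100.0),
        (2, "2018-01-01", 250.0),
        (3, "2018-01-02", 250.0),
        (4, "2018-01-03", 75.0),
    ],
)

type_rows = conn.execute(
    """
    select
      salesorderid,
      subtotal,
      sum(subtotal) over ()                       as grand_total,        -- aggregate
      rank() over (order by subtotal desc)        as sales_rank,         -- ranking
      lag(subtotal) over (order by salesorderid)  as previous_subtotal,  -- value
      lead(subtotal) over (order by salesorderid) as next_subtotal       -- value
    from salesorderheader
    order by salesorderid
    """
).fetchall()

for row in type_rows:
    print(row)
```

Every order keeps its own row. LAG returns NULL (None in Python) on the first row and LEAD does the same on the last row, because there is no neighbouring row to read from.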


Syntax

Let’s start off by exploring aggregate window functions and how they work. Using a few real-life examples, I will simplify the syntax and explain the use cases.

You will need:

  • The function you want to perform: AVG, SUM, COUNT
  • An indication you want to use this function over multiple rows: OVER
  • How you want to group your rows: PARTITION BY (optional)
  • How you want to order your rows within each group: ORDER BY (optional)
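The pieces listed above can be sketched in a single query. This is only an illustration on a made-up salesorderheader table, run through Python's sqlite3 module (assuming SQLite 3.25 or newer); it shows the same SUM with an empty OVER, then with a partition, then with a partition and an ordering.

```python
import sqlite3

# Hypothetical sample table; names mirror the article's query, values are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table salesorderheader (salesorderid integer, orderdate text, subtotal real)"
)
conn.executemany(
    "insert into salesorderheader values (?, ?, ?)",
    [
        (1, "2018-01-01", 100.0),
        (2, "2018-01-01", 250.0),
        (3, "2018-01-02", 40.0),
        (4, "2018-01-02", 60.0),
    ],
)

syntax_rows = conn.execute(
    """
    select
      salesorderid,
      -- OVER alone: the window is every row in the result
      sum(subtotal) over () as all_rows,
      -- PARTITION BY: one window per day
      sum(subtotal) over (partition by orderdate) as per_day,
      -- PARTITION BY + ORDER BY: a running total that restarts each day
      sum(subtotal) over (partition by orderdate order by salesorderid) as running_per_day
    from salesorderheader
    order by salesorderid
    """
).fetchall()

for row in syntax_rows:
    print(row)
```

Adding ORDER BY inside OVER is what turns a per-group total into a running total: with an ordering, the default frame runs from the start of the partition up to the current row.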

Setting the Scene

In this case, the sales manager sent us a request. In order to set her team’s targets for next year, she needs historic data. The data is in the sales database, but we need it in a format that is easier to use than the raw table.


The raw sales order table, not of much use to your sales manager

How to create a running total

She would like to be able to see daily sales totals as well as individual order IDs so she can ‘drill down’ as needed.

We could use SUM with GROUP BY to total the orders for each day. However, this collapses the individual orders into one row per day. A window function will allow the sales manager to see each order alongside a running total.
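The difference is easiest to see by counting rows. The sketch below uses an invented four-order table in an in-memory SQLite database (assuming SQLite 3.25 or newer) and runs the same SUM twice, once with GROUP BY and once as a window function.

```python
import sqlite3

# Hypothetical sample table; the values are invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table salesorderheader (salesorderid integer, orderdate text, subtotal real)"
)
conn.executemany(
    "insert into salesorderheader values (?, ?, ?)",
    [
        (1, "2018-01-01", 100.0),
        (2, "2018-01-01", 250.0),
        (3, "2018-01-02", 40.0),
        (4, "2018-01-02", 60.0),
    ],
)

# GROUP BY: one row per day, the order IDs are gone.
grouped = conn.execute(
    """
    select orderdate, sum(subtotal) as day_total
    from salesorderheader
    group by orderdate
    order by orderdate
    """
).fetchall()

# Window function: every order survives, with the day total alongside it.
windowed = conn.execute(
    """
    select
      orderdate,
      salesorderid,
      subtotal,
      sum(subtotal) over (partition by orderdate) as day_total
    from salesorderheader
    order by salesorderid
    """
).fetchall()

print(len(grouped), "grouped rows,", len(windowed), "windowed rows")
```

The grouped query returns two rows for four orders, while the windowed query returns all four, which is what lets the reader drill down to individual orders.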

By using a window function we can see each order for each day and its subtotal, with a running total alongside. Other aggregate functions work the same way, so you can use COUNT, AVG, MIN or MAX instead, or several of them in combination.

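-- Running total of order subtotals within each day.
-- Assumes orderdate holds whole days (no time part).
select
  orderdate,
  salesorderid,
  subtotal,
  sum(subtotal) over(partition by orderdate order by salesorderid) as total_sales
from
  sales.salesorderheader
where
  -- half-open range, so orders placed on 31 December are included
  orderdate >= '2018-01-01' and orderdate < '2019-01-01'
order by orderdate, salesorderid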
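If you want to try the running total without access to the sales database, here is a small sketch on invented data, run through Python's sqlite3 module (assuming SQLite 3.25 or newer). The table is a hypothetical stand-in with the same column names, and the query adds COUNT and MAX over the same window to show aggregates used in combination.

```python
import sqlite3

# Hypothetical stand-in for sales.salesorderheader; the five orders are invented.
conn = sqlite3.connect(":memory:")
conn.execute(
    "create table salesorderheader (salesorderid integer, orderdate text, subtotal real)"
)
conn.executemany(
    "insert into salesorderheader values (?, ?, ?)",
    [
        (1, "2018-01-01", 100.0),
        (2, "2018-01-01", 250.0),
        (3, "2018-01-01", 50.0),
        (4, "2018-01-02", 40.0),
        (5, "2018-01-02", 60.0),
    ],
)

running_rows = conn.execute(
    """
    select
      orderdate,
      salesorderid,
      subtotal,
      sum(subtotal) over (partition by orderdate order by salesorderid) as total_sales,
      count(*)      over (partition by orderdate order by salesorderid) as orders_so_far,
      max(subtotal) over (partition by orderdate order by salesorderid) as largest_so_far
    from salesorderheader
    order by orderdate, salesorderid
    """
).fetchall()

for row in running_rows:
    print(row)
```

The running total climbs through each day and restarts when the date changes. Ordering the window by salesorderid, which is unique here, keeps the result deterministic; rows that tie on the ORDER BY column would share the same running total under the default frame.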


How to RANK rows based on a given criterion

The sales manager has returned.

Although she was happy with the table created for setting targets, she now needs strategies for increasing sales. She wants to see the 2018 sales by customer and dollar value. If she knew which customers placed the biggest orders, and when, she could go knock on their doors again.

Using RANK() in a window function ranks each row, in this case, by subtotal. The sales manager can now decide if she wants to target those customers again in the coming year.

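-- Rank the 2018 orders by value, largest first.
select
  orderdate,
  salesorderid,
  customerid,  -- assumed column, identifies the customer behind each order
  subtotal,
  rank() over(order by subtotal desc) as sales_rank
from
  sales.salesorderheader
where
  -- half-open range, so orders placed on 31 December are included
  orderdate >= '2018-01-01' and orderdate < '2019-01-01'
order by orderdate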


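Rows with identical subtotals receive the same rank, and RANK then skips the positions those rows would have taken. For instance, if three rows tie for first place, all three get rank 1 and the next row gets rank 4, so positions two and three are skipped. DENSE_RANK leaves no gaps: the same three rows still get rank 1, but the next row gets rank 2.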
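To make the tie handling concrete, here is a short sketch comparing RANK, DENSE_RANK and ROW_NUMBER on invented data in which three orders share the top subtotal. It runs through Python's sqlite3 module and assumes SQLite 3.25 or newer; the table is a hypothetical stand-in, not the article's database.

```python
import sqlite3

# Hypothetical orders; the first three deliberately tie on subtotal.
conn = sqlite3.connect(":memory:")
conn.execute("create table salesorderheader (salesorderid integer, subtotal real)")
conn.executemany(
    "insert into salesorderheader values (?, ?)",
    [(1, 500.0), (2, 500.0), (3, 500.0), (4, 300.0), (5, 200.0)],
)

rank_rows = conn.execute(
    """
    select
      salesorderid,
      subtotal,
      rank()       over (order by subtotal desc) as sales_rank,        -- gaps after ties
      dense_rank() over (order by subtotal desc) as dense_sales_rank,  -- no gaps
      -- salesorderid is added as a tie-breaker so the numbering is deterministic
      row_number() over (order by subtotal desc, salesorderid) as row_num
    from salesorderheader
    order by subtotal desc, salesorderid
    """
).fetchall()

for row in rank_rows:
    print(row)
```

RANK produces 1, 1, 1, 4, 5, DENSE_RANK produces 1, 1, 1, 2, 3, and ROW_NUMBER simply counts 1 to 5. Which one to use depends on whether tied orders should share a position and whether gaps in the numbering matter to the reader of the report.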

Check out part two to learn even more and to see how you can use window functions to your advantage.


Photo by Sean Patrick from Pexels