
Why you need SQL Window Functions (part 1)

Welcome to the world of window functions.

Window functions are closely related to aggregate functions. But rather than collapsing the rows we're interested in into one row, we keep them all and add a new column holding a running total, rank or moving average. The set of rows each calculation looks at is its ‘window.’

It becomes clearer with examples and pictures, so if you’re interested in using this powerful tool, keep reading.


Introduction
Syntax
Setting the Scene
Create a running total
RANK rows based on a given criterion


Introduction

Window functions come in three main types. These are:

Aggregate Window Functions
These apply aggregate functions like SUM, COUNT, MAX and MIN over a set of rows and return a result for every row, rather than collapsing the set into a single row.

Ranking Window Functions
These assign a ‘rank’ to each row in a set of rows using RANK, DENSE_RANK, ROW_NUMBER or NTILE.

Value Window Functions
These use LAG, LEAD, FIRST_VALUE and LAST_VALUE to access a value from another row, such as the previous or next one, without having to do a self join.


Syntax

To get started with window functions, let's begin with the syntax for an aggregate window function and how each part works.

You will need:

  • The function you want to perform: AVG, SUM, COUNT
  • An indication you want to use this function over multiple rows: OVER
  • How you want to group your rows – PARTITION BY
  • How you want to order your rows – ORDER BY
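
The four parts above can be put together as in the following minimal sketch. It runs the SQL through Python's built-in sqlite3 module purely so the example is self-contained; it assumes a SQLite build of 3.25 or later (the first with window function support). The `sales` table and its columns are hypothetical, invented for illustration.

```python
import sqlite3

# Hypothetical table and data, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, orderdate TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?, ?)",
    [
        ("North", "2018-01-01", 100.0),
        ("North", "2018-01-02", 50.0),
        ("South", "2018-01-01", 70.0),
        ("South", "2018-01-03", 30.0),
    ],
)

query = """
SELECT region,
       orderdate,
       amount,
       SUM(amount) OVER (          -- the function, applied OVER a window
           PARTITION BY region     -- how the rows are grouped
           ORDER BY orderdate      -- how the rows are ordered in each group
       ) AS region_running_total
FROM sales
ORDER BY region, orderdate
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Every input row is still present in the result; the window function only adds the extra column, and the total restarts for each region.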

To make sense of the syntax and show the use cases I’m going to tackle some real-life problems faced by a Sales Analyst.


Setting the Scene

In this example, the Sales Manager has come to us with a request.

She is setting targets for the team for next year and needs historic data for her decision making. We have data available in the sales database but need it in a format that’s easier to use than the raw table.


The raw Sales Order table, not much use to your Sales Manager

How to Create a Running Total

First, she’d like to see daily sales totals but also wants to be able to see each orderid, so she can ‘drill down’ if need be.

We could tackle this using SUM with GROUP BY to total up the rows on the orders table for each day. But this collapses the order details. A window function will allow us to see each order with a running total for the Sales Manager.
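
As a rough sketch of what such a query could look like, the example below keeps every order, adds the total for its day, and adds a running total across all orders. It is run through Python's built-in sqlite3 module (assuming SQLite 3.25 or later) so it is self-contained. The `orders` table, its columns and its data are assumptions made for illustration, not the real sales database.

```python
import sqlite3

# Hypothetical orders table, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, orderdate TEXT, subtotal REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        (1, "2018-01-01", 100.0),
        (2, "2018-01-01", 200.0),
        (3, "2018-01-02", 50.0),
        (4, "2018-01-02", 150.0),
        (5, "2018-01-03", 300.0),
    ],
)

query = """
SELECT orderid,
       orderdate,
       subtotal,
       -- total of all orders placed on the same day
       SUM(subtotal) OVER (PARTITION BY orderdate) AS daily_total,
       -- running total from the first order up to the current one
       SUM(subtotal) OVER (
           ORDER BY orderdate, orderid
           ROWS BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
       ) AS running_total
FROM orders
ORDER BY orderdate, orderid
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Ordering the window by both orderdate and orderid gives every row a distinct position, so the running total grows one order at a time rather than jumping once per day.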



By using a window function we can see each order for each day and its total, with a running total alongside. Other aggregate functions work the same way, so you can use COUNT, AVG, MIN or MAX, alone or in combination.


How to RANK rows based on a given criterion

The Sales Manager is back.

She was happy with the table created for setting targets but now needs some strategies to increase sales. This time she wants to see sales for 2018 by customer and dollar value. If she can spot when those big sales were in the past, maybe she could go knock on their door again?

To give her the data she needs, we use a window function with RANK.
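
One way such a query might look is sketched below, again run through Python's built-in sqlite3 module (assuming SQLite 3.25 or later). The `orders` table, the customer IDs and the amounts are all made up for illustration; three orders deliberately share the top subtotal so the effect of ties is visible.

```python
import sqlite3

# Hypothetical orders table, invented for this sketch.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (orderid INTEGER, customerid TEXT, orderdate TEXT, subtotal REAL)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?, ?)",
    [
        (1, "C1", "2018-02-10", 5000.0),
        (2, "C2", "2018-03-15", 5000.0),
        (3, "C3", "2018-06-01", 5000.0),
        (4, "C4", "2018-07-20", 3000.0),
        (5, "C5", "2017-12-31", 9000.0),  # outside 2018, filtered out below
        (6, "C1", "2018-11-05", 1000.0),
    ],
)

query = """
SELECT customerid,
       orderdate,
       subtotal,
       RANK() OVER (ORDER BY subtotal DESC) AS sales_rank
FROM orders
WHERE orderdate >= '2018-01-01' AND orderdate < '2019-01-01'
ORDER BY sales_rank, customerid
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The WHERE clause is applied before the window function, so the 2017 order never takes part in the ranking.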



Using RANK() in a window function ranks each row, in this case, by subtotal. The Sales Manager can now decide if she wants to target those customers again in the coming year.


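If rows tie on the ranking value, like in the example above, they all receive the same rank and the places that follow are skipped: three rows tied for first all get rank 1, places 2 and 3 are skipped, and the next row gets rank 4. To avoid the gaps we can use DENSE_RANK, which would have given the next rows ranks 2 and 3 before moving on to fourth place.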
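
The difference is easiest to see side by side. The sketch below computes RANK and DENSE_RANK over the same made-up subtotals, using Python's built-in sqlite3 module (assuming SQLite 3.25 or later); the table and values are hypothetical.

```python
import sqlite3

# Hypothetical data with a three-way tie for the largest subtotal.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (orderid INTEGER, subtotal REAL)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, 5000.0), (2, 5000.0), (3, 5000.0), (4, 3000.0), (5, 1000.0)],
)

query = """
SELECT orderid,
       subtotal,
       RANK()       OVER (ORDER BY subtotal DESC) AS rank_with_gaps,
       DENSE_RANK() OVER (ORDER BY subtotal DESC) AS rank_without_gaps
FROM orders
ORDER BY subtotal DESC, orderid
"""

rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Both functions give the tied rows rank 1; they only disagree on what comes next.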


There’s much more to learn about window functions, and more useful ways to use them, so check out part two in this series.


Photo by Vitor Almeida from Pexels


Helen Anderson

A data analyst, writer, and technical product manager working on new features to help software go faster – currently at Raygun.
