Periscope Data
The world’s fastest visualization platform for data analysts.

The High-Performance SQL Blog

Calculating Year-over-Year Metrics

Calculating Year-over-Year Metrics
October 29, 2015

One of the most common questions analysts ask is “Have our metrics improved over time?”.

This is typically answered with a week-over-week, month-over-month, or year-over-year comparison. For this post, we’ll do a year-over-year comparison.

A few standard calculations are required to do this:

  • Grouping your data by year
  • Calculating percent change
  • Plotting years in parallel

Grouping Data By Year

Let’s use Revenue as the example for our year-over-year comparison. To get revenue by year, we group by year and sum the price.

select
  extract(year from created_at) as year
  , sum(price) as revenue
from
  purchases
group by
  1

This will give you a graph that looks like this:

Calculating Percent Change

To enhance the data grouped by years, we can do a quick percent change calculation.

In Redshift and Postgres, we can use window functions to quickly do the calculation. To do this, we lag the revenue column over the previous year using the lag function. Remember to set the order of the rows to ensure we get the correct row:

lag(revenue) over (order by year)

In our query, we subtract this year’s value over last year’s value from one to get the percent change:

select
  year
  , revenue
  , (revenue / nullif(lag(revenue) 
  over(order by year), 0)) - 1 as percent_change
from
  (
    select
      extract(year from created_at) as year
      , sum(price) as revenue
    from
      purchases
    group by
      1
  )
  annual

We can see both the actual values as well as the percent change with the results:

Plotting Years in Parallel

Sometimes we want to compare exact months or even days between years. In this example, we want to be able to compare January from this year to last year. We will use to_char to get the name of the month for readability. We will then have to order by the number of the month to get our data in the right order.

select
  extract(year from created_at) as year
  , to_char(created_at, 'Mon') as month
  , sum(price) as revenue
from
  purchases
group by
  1
  , 2
  , extract(month from created_at)
order by
  extract(month from created_at)

To take it one step further, we can also compare month-over-month growth between the years. This time, we need a partition by in our lag function to make sure that we calculate percent change between months in the same year:

select 
year
  , month
  , revenue
  , (revenue * 1.0 / lag(revenue) 
  over (partition by year order by month_number)) 
    - 1 as percent_change
from (
select
  extract(year from created_at) as year
  , to_char(created_at, 'Mon') as month
  , extract(month from created_at) as month_number
  , sum(price) as revenue
from
  purchases
group by
  1
  , 2
  , 3
)
order by month_number

With these techniques in hand, you can now calculate period-over-period metrics with your data!

Haven't tried Periscope Data yet?
Start a trial and we’ll send you one of our famous coffee mugs.
Read More
Haven’t tried Periscope Data yet?