Periscope Data
The world’s fastest visualization platform for data analysts.

The High-Performance SQL Blog

Use Subqueries and Window Functions to Compute Running Averages

July 15, 2014

Daily Active Users by Week

By now, we’re all familiar with the ubiquitous Daily Active Users metric. It’s a wonderful, concrete way to get a sense of your engagement.

Here’s a simple query that will do the trick:

select date(created_at), count(distinct user_id)
from activity
group by 1

Easy! Put it into your favorite visualization tool, and you get a graph like this:

Daily Active Users

Holy noise in the data, Batman! It certainly seems to be going up, but it’s hard to identify the trends. If only there were a way to smooth it out.

Running Averages

To smooth this out, let’s look at Daily Active Users by Week, which is a weekly average of your DAUs. Note this is not the same as Weekly Active Users: Both are valid, yet measure different things.

To compute DAU by Week, we’ll put our DAU query in a subquery, and wrap it in a query that takes a simple weekly average:

select date_trunc('week', date), avg(count)
from (
  select date(created_at) date, count(distinct user_id) count
  from activity
  group by 1
) t
group by 1

Here’s the revised graph:

Daily Active Users by Week

Much smoother! Now we can see that our DAUs are generally trending upward, but not explosively so, and not without some substantial dips.

Daily Running Averages

Weekly averages are all well and good, but they are very coarse-grained! There should be a way to show smoothed-out daily data.

For that, we’ll use a Trailing 7-Day Average. For each date on the graph, we’ll plot the average DAUs of the 7 days ending on that date.

To do this in SQL, we’ll turn to our favorite trick, the window function:

select 
  date(created_at) d,
  avg(count(distinct user_id)) over (
    order by d
    rows between 7 preceding and current row
  )
from activity
group by 1

Let’s take our window function piece by piece:

  • avg(count(distinct user_id)) is our basic operation. count(distinct user_id) is the DAU, and avg averages the DAUs.
  • order by d tells the window function in what order to look at the data, which is important because:
  • rows between 7 preceding and current row tells the window function which rows to average. In this case we want the 7 rows before the current row.

Put all of that together, and we get this graph:

Daily Active Users, Trailing 7d

Boom! The lines are smoothed out to show macro trends, but not so much that we can’t see daily variations.

Or Click The “Running Average” Button

You guessed it! If you’re a Periscope user, try the “Line With Average” chart type. You can get the noisy daily data and the smooth running average in the same chart:

Daily Active Users With Running Average

Sign up for Periscope to get your own graphs like this!


Want to discuss this article? Join the Periscope Data Community!
Haven't tried Periscope Data yet?
Start a trial and we’ll send you one of our famous coffee mugs.
Read More
Haven’t tried Periscope Data yet?

Subscribe to our Newsletter