Periscope Data
The world’s fastest visualization platform for data analysts.

The High-Performance SQL Blog

Cohorted Engagement: 7-day and 30-day Metrics

Cohorted Engagement: 7-day and 30-day Metrics
July 2, 2016

When tracking user growth, it is helpful to look at multiple time periods. In a previous post, Cohorted Engagement with Comparable Time Windows in SQL, we covered creating charts of the form: “How many times does a user play our game in the first 30 days, and how has this varied for the users that started in January, February, and so on?”

Today we’ll extend our previous technique. Our goal is to create charts like this one:

The blue line is the number of users we’ve added in the first seven days after a new company account is created. For example, each week in January over a thousand new users were added within the first seven days of a company signing up. The orange line is the first 30 days, and it ends early because the full 30 days hasn’t passed yet - each day the blue, orange, and green lines will extend an extra day.

We can spot several trends in this chart. The week of March 14th and May 16th both have large downward cliffs and warrant further investigation. The d7 users trendline has been consistent throughout the year, but there’s a worrying d90 users trend at the beginning: it could be accounts aren’t growing and expanding as much, or new accounts are from smaller companies.

We won’t know how April and May did on d90 for another couple months but we can cautiously hope that it isn’t getting worse: our d30 trendline, on average, isn’t worse than the low point on March 14th.

Let’s walk through creating this chart. First, we’ll start with a table of users and a table of accounts, where each company has many users.

Users table:

Accounts table:

Let’s create a helper CTE, a with clause in postgres, to pull out some intermediate calculations and join the tables.

with new_users as (
  select
    date_trunc('day', accounts.created_at) 
      as account_start_day,
    date_diff('day', accounts.created_at, user.created_at) 
      as account_age_when_invited,
    date_diff('day', account.created_at, now()) 
      as account_age,
    count(distinct users.id) 
      as users
  from account
  left join users on accounts.id = users.account_id
  group by 1,2,3
)
select * from new_users

Running this query builds our new_users table:

As an example of the two helper fields, if a company that signed up 20 days ago had a user sign up after 5 days, account_age_when_invited would be 5, and account_age would be 20. We’d want to count that user in 7-day metrics because they were invited in the first 7 days of the account, and the account age is more than 7.

With that, we can easily make our trend lines:

with new_users as (
...
)
select
  date_trunc('week', account_start_day),
  sum(case when account_age >= 7 and account_age_when_invited < 7
                   then users else null end) as d7_users,
  sum(case when account_age >= 30 and account_age_when_invited < 30
                   then users else null end) as d30_users,
  sum(case when account_age >= 90 and account_age_when_invited < 90
                   then users else null end) as d90_users,
from new_users
group by 1

We get our resulting table

and produce our final chart!

We aggregate the days up to weeks to smooth out the daily variance. Each time window is a separate sum statement. The account_age check is what causes the lines to end earlier for larger time windows, and the account_age_when_invited restricts the user count to just the users that fit in the N-day window.

That’s the general technique - good luck in slicing your data by multiple cohorted time windows!

Haven't tried Periscope Data yet?
Start a trial and we’ll send you one of our famous coffee mugs.
Read More
Haven’t tried Periscope Data yet?