Using Raspberry Pis to Build Beautiful Wall-Mounted Dashboards

Everyone wants a beautiful, wall-mounted dashboard for their primary KPIs—but setting one up can be challenge. Here's an easy and affordable way to display your dashboards without hiding a desktop computer under a desk.

Read More
Building a Data-Driven Culture Starts With the Right Players

Being a data-driven organization is an ongoing process—you need to invest in the right team, infrastructure, software, and processes that can promote lasting change.

Read More
Using Python to Write a Create Table Statement and Load a CSV into Redshift

For wide tables, it can be a pain to write those long create table statements and load the data into Redshift. Here is some simple Python to get the job done using the 2017 Stack Overflow survey.

Read More
Connecting Python and Periscope Data

Sometimes you need to step away from reporting and analytics and do some data science. You might want to build a new lead scoring model or use a random forest to look for patterns in your churn, and languages such as R or Python offer the power and flexibility to get that done. Periscope Data is an amazing tool to have available while you code.

Read More
3 Ways Communication Can Help Analysts Be More Successful

Being a great analyst is more than SQL, Python or reporting—great analytics is a human exercise, a communications skill and way to grow a data-driven culture. Here are 3 simple techniques will help take your dashboards from simple intrigues to impactful business necessities.

Read More
How to Get Started With AWS Spectrum in Minutes

AWS Redshift Spectrum is Amazon’s newest database technology, allowing exabyte scale data in S3 to be accessed through Redshift. This is not simply file access; Spectrum uses Redshift's brain, deploying workers by the thousands to order, join and aggregate your data before sending the minimum amount of data needed back to your Redshift cluster to finish your query. Learn how to get started!

Read More
SQL Text Analysis with Donald Trump’s Tweets

There were many unique aspects of the 2016 U.S. presidential election, with one of the most interesting being Donald Trump’s use of Twitter. As luck would have it, tweets happen to be very easy to extract and put into relational database format.

Read More
Net Promoter Score and Survey Analysis in SQL

Introduced by Fred Reichheld in 2003, Net Promoter Score (NPS) is a simple method for measuring the likelihood your customers will recommend your product or service. It requires that you survey respondents and ask one simple question: “How likely are you to recommend [Company/Product/Service] to a friend or colleague?”

Read More
SQL Symbol Cheat Sheet

When you are new to programming in SQL, you will come across a lot of hard-to-search-for character operators. If you are prone to forgetting ~ is called tilde, are wondering why there are so many %s in your strings, or have a hard time googling what the (+) symbol in where users.group_id(+) = group.id, this guide is for you.

Read More
The Best Redshift Optimization Blog Posts of 2016

At Periscope Data we are huge fans of AWS Redshift because it empowers our users to run queries blazingly fast. Beyond the success of Redshift the product is the ecosystem that has grown around the platform: dozens of companies helping you deploy and optimize your Redshift clusters and a myriad of resources for maximizing your query speed.

Read More
Building a Better Pokédex with SQL

Pokémon Go has taken over the world. In the moments I’m not exploring the city trying to catch ‘em all, I’ve found myself idly wondering about Pokémon Theory.

Read More
How to Calculate Confidence Intervals in SQL

Imagine you have a small online business. This month 200 users signed up on your website, and 10 of them bought your $800 service. Great! You’ve made $8k of income. How much should you expect to make this year?

Read More
Common Salesforce Reports in SQL

At Periscope Data, we depend on our sales team to bring in revenue from the leads our marketing team generates. Like many sales teams, ours uses Salesforce to manage the processes involved in making a sale.

Read More
Calculating Distance between Data Centers on a Globe

Calculating distance between two points on a flat plane is straightforward thanks to the Pythagorean theorem. However, what if your plane isn’t flat? What if you need to find the distance between two points on a sphere, like the earth?

Read More
Understanding Outliers with Skew and Kurtosis in SQL

When you approach a distribution of data for the first time, it’s often helpful to pull out summary statistics to understand the domain of the data.

Read More
Calculating Money Growth using PL/pgSQL

PL/pgSQL for-loops are a tool that we like to use at Periscope Data. To demonstrate their use, we’re going to determine the real value of $1000 in our savings account after 15 years.

Read More
Coalesce Down using Correlated Subqueries

Filling in gaps in your dataset usually takes one of two forms. You can interpolate, where intermediate points are generated based on their neighbors, or you can carry values forward.

Read More
Getting Started with Regular Expressions

Last time we talked about matching strings in SQL, we covered tactics that work well for strings on the simple side. For strings with more complicated patterns, the regular expressions below are a handy tool to have in your belt.

Read More
Analyzing Your Email with SQL

Ever since I first heard the phrase Inbox Zero I have been aggressively pursuing it with careful inbox management and quick email response times. I wanted to see how I was doing, so I downloaded a data dump of my gmail data. ​

Read More
Rolling Classifications by X Previous Aggregations

Rolling averages are useful for comparing a specific point with those around it. However, in some cases we might want to instead classify and group a period’s data based on its contemporary periods.

Read More
Managing a Six-Figure Ad Budget with SQL

At Periscope Data, one of our main marketing channels is advertising. We have to be very careful with advertising, because costs can quickly add up to outstrip the revenue we receive from our ad campaigns.

Read More
Bigram Frequencies in Pure SQL

Basic text analysis on unigram and bigram frequencies can be helpful when digging into datasets of unstructured text.

Read More
String Parsing in SQL

String parsing is a common task for data analysts and data engineers working with raw data. With the growth of unstructured qualitative data, parsing strings efficiently has become increasingly important for fast analysis.

Read More
Outlier Detection in SQL

Inevitably, the unexpected happens. A historically low-traffic channel brings in 10x the normal amount of users. Or your user login rate drops by half.

Read More
Using CTEs and Unions to Compute Running Totals

A common pattern in SQL analysis is to use multiple groupings to show grouped totals over time.

Read More
How to Really Use SQL Views

Views are virtual tables that can be a great way to optimize your database experience.

Read More
Creating a Histogram in Redshift

One of the more popular questions our customers ask is how to make histograms in SQL.

Read More
Adding Summary Statistics to your SQL Table

At Periscope Data we continuously help people build, diagnose, and update their data. When approaching a table of data for the first time, it helps our understanding to see summarizing statistics.

Read More
Concatenating Rows in Redshift, Postgres, & MySQL

Sometimes it’s helpful to look at an aggregated overview of many rows. With numeric columns it’s easy to sum or average many values, but for string columns we need something different.

Read More
Calculating Year-over-Year Metrics

One of the most common questions analysts ask is “Have our metrics improved over time?”.

Read More
Analyzing Your Olark History with SQL

​ A strong Customer Success team should be highly engaged with customers, which is why we use Olark to enable live chat and email support directly within the product.

Read More
Cash Flow Modelling in SQL

People love financial modelling in Excel. It’s easy, fast and flexible, and Excel’s functions allow you to quickly tweak your financial model. However, when your data volume starts to get very large, a spreadsheet can quickly break down.

Read More
Analyzing Your Salesforce Data With SQL

When it comes to customer relationship management software, Salesforce is the undisputed king, so it is no surprise that there are many folks looking to integrate Salesforce data with other sources for a deeper look into performance.

Read More
International Analytics In SQL

As word of Periscope has spread, our international customer base has grown. In order to capture more international customers, we now run marketing campaigns targeting companies outside the U.S.

Read More
Thinking Critically About Mobile Game Retention

Many of our customers produce mobile games and use Periscope to track their growth metrics. Since a mobile game that doesn’t retain its players will never recoup the costs of getting those players, it’s not surprising that retention is one of their top metrics.

Read More
Comparing IP Addresses in SQL

It’s a common practice to store users’ IP addresses when logging actions to an events table. Most warehouses we’ve seen have chosen to store IPs as varchars.

Read More
The Top SQL Analysis Errors We See

‍When writing SQL for data analysis there are two kinds of errors: errors that prevent the query from running, and errors that merrily return the wrong data. The second kind of error can be hard to spot, especially when your results are within range of your expectations.

Read More
Daily, Weekly and Monthly Charts On The Same Graph

Depending on your data, looking at the same metric with daily, weekly or monthly aggregations can yield startlingly different results.

Read More
Helpful Redshift Admin Queries

It is easy to treat Redshift as a black box — queries go in, answers come out. When something goes wrong, though, you’ll want to open the hood and see what Redshift is actually doing.

Read More
Single Event Tables and Common Analysis Queries

Single denormalized events tables are increasingly common, especially for storing logs data from mobile clients. These single-table implementations often use JSON blobs to store properties.

Read More
Splitting Comma-Separated Values In MySQL

Every once in a while, a quick decision is made to store data in a comma-separated fashion, and the SQL analyst is left to pick up the pieces during analysis.

Read More
Redshift User Defined Functions in Python

Today the Redshift team announced support for User Defined Functions! UDFs allow you to create your own SQL functions, backed by the power and flexibility of Python. Now you can add your favorite functions from other databases, or invent new ones, to make your data analysis much easier.

Read More
SQL for Marketers — Your Most Profitable Channels

Knowing how your customer acquisition costs compare to your customer lifetime value is a vital part of marketing. When split across channels, this data allows you to decide whether to double down on a channel or take resources away.

Read More
Exact Row Counts for All Tables in MySQL and Postgres

Keeping track of your row counts can be helpful for budgeting and capacity planning, yet accurate counts are surprisingly hard to get.

Read More
Reuse Calculations in the Same Query with Lateral Joins

Reusing parts of computations has long been a wart in SQL. Let’s say you want to compute confidence intervals for your signup rate. You’ll have to use the standard error calculation twice: once for the upper bound, and once for the lower bound.

Read More
Aggregating into Strings without String_agg or Array_agg in Amazon Redshift

From time to time, any analyst will want to know the “top n instances” of something. For example, as the holidays approach, a toy store may want to know who the top customers of certain products are, so they can prepare special marketing for those customers.

Read More
Rolling Averages in MySQL and SQL Server

Previously we discussed how to write rolling averages in Postgres. By popular demand we’re showing you how to do the same in MySQL and SQL Server.

Read More
Extrapolating to the End of the Month in SQL

Our customers often come to us for help with advanced SQL queries. Recently, two of them asked us how to do an extrapolation in SQL.

Read More
Medians in SQL

Finding the min, max, and avg price of an order in your purchases table is easy. But what about the median price?

Read More
Cross-Database Joins

It began with the best of intentions: You launched your first web app for your customers, backed by a database full of transactional data to analyze

Read More
Using Row Numbering And Full Joins To Transpose Wide Tables

Many a MySQL or Postgres DBA, with responsibility for serving a fast and responsive website, will design a schema for a survey that looks like this:

Read More
Fun with Window Functions

Window functions are a wonderfully useful SQL technique. They make complex aggregations simple to build.

Read More
Estimating Conversions with Poisson Distributions

Following up on last week’s post on normal distributions, this week we’re going to show off another common probability distribution.

Read More
Beyond Random() — Normal Distributions in SQL

If you’ve ever had to generate sample data in your database, odds are you’ve started with random() or its equivalent. It’s fast and it’s reliable. Unfortunately, it’s also evenly distributed.

Read More
Computing Day-Over-Day Changes With Window Functions

In most sophisticated analysis, the rate of change is at least as important as the raw values. This makes life tough for a SQL analyst, where adding daily deltas to your result set can be difficult.

Read More
Date Parts in SQL Server

The bread and butter of SQL analysis is grouping by a period of time and then counting the rows for each period.

Read More
ASCII Art Charts in the Terminal

At Periscope we love charts. Chart on dashboards. Charts on TVs. Charts in email. And when we’re at the the command line, charts in the terminal!

Read More
How To Calculate Cohort Retention in SQL

‍Losing users sucks. Losing customers really sucks. If you’re a startup you know that Retention is King.

Read More
Hashing Tables to Ensure Consistency in Postgres, Redshift and MySQL

One of the main things Periscope does to speed up queries is maintain a cache of customer data. The cache is optimized to serve certain kinds of queries, and benefits f

Read More
HyperLogLog in Pure SQL

Earlier we showed how to make count distinct 50x faster with subqueries. Using probabilistic counting we’ll make count distinct even faster, trading a little accuracy for the increase in speed.

Read More
How to Format Numbers as Currency in Postgres, MySQL and Redshift

In your venerable orders table, you’re almost certainly storing prices as numbers. Perhaps they’re integer, perhaps they’re numeric, perhaps you’re using Postgres and they’re money, or perhaps you rolled the dice on floating-point rounding errors and went with real.

Read More
Generate Series in Redshift and MySQL

‍A lot of the charts and tables made in Periscope are time series, and the queries behind them are often easier when you can join and aggregate against a list of dates.

Read More
Faster Redshift Queries with Materialized Views — Lifetime Daily ARPU

The best way to make your SQL queries run faster is to have them do less work, and a great way to do less work is to query a materialized view that’s already done the heavy lifting.

Read More
How To Sample Rows in SQL 273X Faster

Sampling is an incredibly powerful tool to speed up analyses at scale. While it’s not appropriate for all datasets or all analyses, when it works, it really works.

Read More
Counting Conditionally in SQL

When preparing a report on the number of customers you have, it can be helpful to split out the premium customers — i.e., those who spend more than $100/month — from the rest of the group.

Read More
How to Calculate Confidence Intervals in SQL

Imagine you have a small online business. This month 200 users signed up on your website, and 10 of them bought your $800 service. Great! You’ve made $8k of income. How much should you expect to make this year?

Read More
Use window functions for time-series percentages

When visualizing gameplays by platform over time — a chart every multiplatform game company knows well — it’s typical to start with a simple query:

Read More
4 Ways to Join Only The First Row in SQL

For today’s daily report, we need a list of users and the most recent widget each user has created.

Read More
2X Your Redshift Speed With Sortkeys and Distkeys

Like a lot of folks in the data community, we’ve been impressed with Redshift, Amazon’s new distributed database. Yet at first, we couldn’t figure out why performance was so variable on seemingly-simple queries.

Read More
Use generate_series to get continuous results

When you’re just starting out, chances are you’ll be asking a very simple question: How many widgets did I sell each day?

Read More
How To Speed Up Where Clauses By 3,000X

Front and center on my Periscope dashboard is a question I ask all the time: How much usage has there been today?

Read More
Use Subqueries to Count Distinct 50X Faster

Count distinct is the bane of SQL analysts, so it was an obvious choice for our first blog post.

Read More
Range joins give you accurate histories

One fine morning, your investor Leo asks you: How many charts are there across all of Periscope? Because he’s an investor, he also wants to know: How fast is that number growing?

Read More