How Joins Work

The SQL join operation is one of the most powerful and commonly used SQL operations, but little attention is paid to how the internal SQL engine breaks down the tasks of join operations. This post will explore the common algorithms that databases use to compute them, including nested loop, hash, and merge joins. Our aim is to act as a resource for SQL users interested in exploring optimizations.

Read More

Interactive Analytics: Redshift vs Snowflake vs BigQuery

Every day customers ask us what the best data warehouse technology is for handling the workload of their growing analytics department: they want to run queries at interactive, real-time speeds at a price-point that fits in their budget. We’ve helped our users with everything from tried-and-true MySQL servers, day-old Postgres boxes, and fifteen year old SQL Server deploys. We help customers work with their production databases and modern cloud analytics solutions like AWS Redshift, Snowflake and Google BigQuery.

Our customers want to know where to get the fastest query times, how much data their data warehouse can handle, how much it will cost them. We want to make sure we provide the best recommendations to our users when they ask for our help navigating the data warehouse space, so we put Redshift, Snowflake, and BigQuery to the test.

After investigating Redshift, Snowflake, and BigQuery, we found that Redshift is the best choice for real-time query speeds on our customers’ typical data volumes. In this post, we will walk through our analysis of these three data warehouse solutions and the compelling use cases we found for each of the technologies.

Read More

Data Warehouse Management: Redshift, Bigquery, and Snowflake

In our post comparing Redshift, BigQuery, and Snowflake on query performance and cost for interactive analytics, we looked at the trade-offs across different data warehouses from a performance perspective.

BigQuery, Snowflake and Redshift all have web based consoles where you control your data, clusters, user management, query logging and system analytics. In this post we will visit the management console of each system and the supporting tools for managing your warehouse deployments.

Read More

Tracking KPIs using SQL for Effective Revenue Growth Analysis

Tracking revenue is crucial to understanding the health of your business. Whether your long-term growth is declining, has plateaued, or is heading upwards helps you define your long term strategy and goals. Many sales cycles are biased towards weekends, beginning-of-month, or end-of-quarter deals, and it can be hard to know day-to-day if your sales numbers are trending on target.

At Periscope we track several KPIs as they progress over the course of the month. In this post, we are going to look at monthly recurring revenue (MRR) historically across the year, and segment each month by its progress up to the current date.

Read More

SQL Symbol Cheat Sheet

When you are new to programming in SQL, you will come across a lot of hard-to-search-for character operators. If you are prone to forgetting ~ is called tilde, are wondering why there are so many %s in your strings, or have a hard time googling what the (+) symbol in where users.group_id(+) =, this guide is for you.

Read More

The Best Redshift Optimization Blog Posts of 2016

At Periscope Data we are huge fans of AWS Redshift because it empowers our users to run queries blazingly fast. Beyond the success of Redshift the product is the ecosystem that has grown around the platform: dozens of companies helping you deploy and optimize your Redshift clusters and a myriad of resources for maximizing your query speed.

Here are our six favorite blog posts on optimizing Redshift performance to come out of 2016.

Read More

Calculating Proportional Values in SQL

SQL is great for aggregating: counting, summing, and averaging numbers are straightforward tasks in SQL. When looking at a distribution, it is often useful to look at them in context and compare the relative size of each part of the proverbial pie.

Read More

Valuing User Subscriptions Over Time with SQL

We recently launched the instant-classic underwater warfare game Scope: Battle for the High Seas, and our investors are excited to see how the game’s subscriptions are growing. We are using a payment processor to handle collections of subscriptions and have access to a few tables. While this makes analyzing the data possible, it is initially not in the best format to do so. Let’s investigate how we can normalize our date ranges to compute the growth of subscriptions since our game launched.

Read More

Safety in San Francisco: A Numbers Game

When I first moved to San Francisco, my mother warned me “not to go looking for trouble”, but this can be difficult when you do not know the city well. Luckily, SF OpenData provides a great public safety dataset, and our good friend Redshift has a few helpful tricks to help make sure trouble doesn’t find us.

Read More

Creating Ranked Position Time Series with SQL

The Los Angeles Lakers were the top NBA team in 2008 but the last in 2016. We wanted to compare their downfall to the rise of other teams in the league, so we built a comparison of each team’s rank in the standings over the past ten years.

Let’s explore how to create this chart of rankings of the teams in the western conference!

Read More