Analyzing Your Salesforce Data With SQL

September 17, 2015

When it comes to customer relationship management software, Salesforce is the undisputed king, so it is no surprise that there are many folks looking to integrate Salesforce data with other sources for a deeper look into performance.

Let’s take a look at the types of queries we can run on Salesforce data and how to integrate Salesforce into your database.

Using SQL To Analyze Salesforce Data

One of the most important sets of metrics for a sales organization is the sales pipeline funnel. Knowing the conversion rate at each stage of the pipeline helps you forecast revenue and figure out where you need to improve.

The new_value field in our _lead_history table lets us count leads by stage, and the team field on the _user table lets us break the numbers out by market segment. Dividing by the total number of leads gives us the conversion rates over the last two weeks:

with total as (select count(1) as total_leads from salesforce._lead_history)
select
  case
    when new_value ilike '%assigned%' then 'Assigned'
    when new_value ilike '%attempt%' then 'Contacted'
    when new_value ilike '%qualified%' then 'Qualified'
    when new_value ilike '%won%' then 'Won' end as stage
  , _user.team
  , count(1)::float / (select total_leads from total) as conversion_rate
from
  salesforce._lead_history
  join salesforce._user on
    _lead_history.created_by_id = _user.id
where
  field = 'Status'
  and created_date >=
    date_trunc('day', getdate() - interval '14 day')::date
group by
  1, 2

Running this query gives us back triplets with the stage, team name, and conversion rate, like (Contacted, Mid-Market, 0.363). We can turn this into a visualization of our pipeline across market segments:

Integrating Other Data Sources

An advantage of downloading your data is being able to join it with data sources outside of Salesforce. Each account has a billing address and is assigned a latitude and longitude, but some accounts cannot be geolocated.

In those cases, we can join their zip codes against our standard zip code table to include them in the geographical analysis:

select
  latitude, longitude
from
  salesforce._account
  join zip_codes on
    left(billing_postal_code, 5) = zip_codes.zip_code
where
  billing_latitude is null
  and billing_postal_code is not null
union
select
  billing_latitude, billing_longitude
from
  salesforce._account
where
  billing_latitude is not null

This gives us the full picture of our U.S. users' locations:

Another important growth metric for a company is its ability to expand existing relationships. We can cross-reference our internal usage statistics to find customers who are ready to move to the next upgrade tier:

select
  opportunities.company as Client
  , date_trunc('week', created_date) as Start_Date
  , client.requests - client.paid_requests as Overage
  , users.name as Rep
from
  opportunities
  join users on
    opportunities.owner_id = users.id
  join client on
    client.site = opportunities.company

We can use the result to grow our relationship with existing customers:

Extracting Data: Setting Up A Connected App In Salesforce

To export data from Salesforce we will use a connected app. From the setup menu in Salesforce, we’ll first navigate to Build -> Apps:

And create a new connected app:

We enter a name and a contact email address, and check the "Enable OAuth Settings" box. We need API access to the data and a valid callback URL; we won't actually be calling it, but it must be a properly formatted URL.

This will give us a Consumer Secret and Consumer Key, which we'll use in conjunction with our login information to access the data.

You'll use the Consumer Secret and Consumer Key with the REST API to extract the schema, and your username and password to access the Bulk API. There are many Salesforce API integrations that will help you pull the data.

At Periscope we use Heroku's Salesforce REST API Ruby gem and Javier Julio's Salesforce Bulk API gem to access the Salesforce APIs.

Configuring Salesforce Data For Your Database

Before loading the data into your database, you'll want to convert the column types to match your database and reformat the schema to mesh with your table and column naming conventions. For Postgres and Redshift databases, we use this conversion table from Salesforce types to Postgres types:

Salesforce     Postgres
id             varchar(255)
reference      varchar(255)
phone          varchar(255)
boolean        boolean
double         double precision
currency       double precision
datetime       timestamp
date           date
int            integer

We defer to varchar for any other types.
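
As an illustration, here's a minimal sketch of the Postgres DDL that mapping produces for a handful of common Account fields. The column list is abbreviated, not the full Salesforce schema:

create table salesforce._account (
  id                     varchar(255)       -- Salesforce id
  , owner_id             varchar(255)       -- reference
  , phone                varchar(255)       -- phone
  , is_deleted           boolean            -- boolean
  , billing_latitude     double precision   -- double
  , annual_revenue       double precision   -- currency
  , created_date         timestamp          -- datetime
  , last_activity_date   date               -- date
  , number_of_employees  integer            -- int
);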

The Salesforce schema naming conventions might need some normalization to match your internal conventions.

At a minimum, we recommend replacing double underscores with a single underscore, removing non-alphanumeric characters, and stripping __c from the end of table names, which Salesforce uses to mark custom tables.
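
Here's a minimal sketch (Postgres) for previewing what those rules do to one table's column names; the _account table name is just an example and the regular expressions only cover the cases above:

select
  column_name
  , regexp_replace(
      regexp_replace(
        regexp_replace(lower(column_name), '__c$', '')  -- strip the custom suffix
      , '[^a-z0-9_]', '', 'g')                          -- drop non-alphanumeric characters
    , '_{2,}', '_', 'g') as normalized_name             -- collapse repeated underscores
from
  information_schema.columns
where
  table_name = '_account';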

Now you can upload the data into your database and start running queries (a minimal load sketch follows below). We hope this post is helpful while integrating Salesforce into your data analytics stack; let us know what steps you took to connect with Salesforce.
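
As an example, assuming each object has been exported to CSV, loading one of them into Postgres from psql might look like this (the file name is a placeholder and your loader may differ):

-- run inside psql; \copy streams the local CSV into the table
\copy salesforce._account from 'account.csv' with (format csv, header true)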

Thanks to Megan Lin and Melody Chan for their help brainstorming this blog post.
