Analyzing Your Salesforce Data With SQL
When it comes to customer relationship management software, Salesforce is the undisputed king, so it is no surprise that there are many folks looking to integrate Salesforce data with other sources for a deeper look into performance.
Let’s take a look at the types of queries we can run on Salesforce data and how to integrate Salesforce into your database.
Using SQL To Analyze Salesforce Data
One of the most important set of metrics for a sales organization is the sales pipeline funnel. Knowing the conversion rates at each stage of the sales pipeline helps your forecast revenue numbers and figure out where you need to improve.
We have a field new_value in our _lead_history table that lets us count the leads by stage, and we can use the users team to track across market segments. By dividing by the total leads, we can look at the conversion rates over the last two weeks:
with total as (select count(1) from salesforce._lead_history)
when new_value ilike '%assigned%' then 'Assigned'
when new_value ilike '%attempt%' then 'Contacted'
when new_value ilike '%qualified%' then 'Qualified'
when new_value ilike '%won%' then 'Won' end
, count(1) / total
join salesforce._user on
_lead_history.created_by_id = _user.id
field = 'Status'
and created_date >=
date_trunc('day', getdate() - interval '14 day')::date
Running this query gives us back triplets with the stage, team name, and count, like (Contacted, Mid-Market, 0.363). We can turn this into a visualization of our pipeline across market segments:
Integrating Other Data Sources
An advantage to downloading your data is being able to join with data sources outside of Salesforce. Users provide a billing address and are assigned a latitude and longitude, but some users cannot be geolocated.
We can use their zip code in this case to join with our standard zipcode map to include these users in geographical analysis.
salesforce._account join zip_codes on
left(billing_postal_code, 5) = zip_codes.zip_code
billing_latitude is null
and billing_postal_code is not null
billing_latitude is null
This gives us the full picture of our U.S. users locations:
Another important growth metric for a company is the ability to grow existing relationships. We can cross-reference our internal usage statistics to find customers ready to move to the next upgrade tier.
opportunities.company as Client
, date_trunc('week', (created_date)) as Start_Date
, (client.requests - client.paid_requests) as Overage
, user.name as Rep
join users on
opportunities.owner_id = users.id
join client on
client.site = opportunities.company
We can use the result to grow our relationship with existing customers:
Extracting Data: Setting Up A Connected App In Salesforce
To export data from Salesforce we will use a connected app. From the setup menu in Salesforce, we’ll first navigate to Build -> Apps:
And create a new connected app:
We put a name, email address, and check the “Enable OAuth Settings” box. We need API access to the data and a valid callback URL - we won’t actually be calling it, but it should be a properly formatted URL.
This will give us a Consumer Secret and Consumer Key which we’ll use in conjunction with out login information to access the data.
You’ll use the Consumer Secret and Consumer Key with the Rest API to extract the schema, and your Username and Password to access the Bulk API. There are many Salesforce API integrations that will help you pull the data.
Configuring Salesforce Data For Your Database
Before loading the data into your database, you’ll want to convert the column types to match your database and reformat the schema to mesh well with your table and column names. For postgres and redshift databases, we use this conversion table from Salesforce type to postgres type.
double double precision
currency double precision
And defer to varchar for any other types.
The Salesforce schema naming conventions might need some normalization to match your internal conventions.
At a minimum, we recommend replacing double underscores with a single underscore, removing non-alphanumeric characters, and stripping __c from the end of tables, which Salesforce uses to mark custom tables.
Now you can upload the data into your database and start running queries! We hope this post is helpful while integrating Salesforce into your data analytics stack, let us know what steps you took to connect with Salesforce.
Thanks to Megan Lin and Melody Chan for their help brainstorming this blog post.