Periscope Data
The world’s fastest visualization platform for data analysts.

The High-Performance SQL Blog

Comparing IP Addresses in SQL

Comparing IP Addresses in SQL
June 24, 2015

It’s a common practice to store users’ IP addresses when logging actions to an events table. Most warehouses we’ve seen have chosen to store IPs as varchars.

While this makes querying with specific IP addresses easy, it can make filtering to ranges of IPs quite difficult depending on your database. Comparing IP addresses as varchars will yield incorrect results:

-- 100 is not less than 2, but is when compared as varchars!
select '100.1.1.1' < '2.1.1.1'
-- => true

Filtering to a range of IP addresses requires being able to compare any two IPs. Once you can compare, it’s trivial to filter records down to those with IPs in the range you care about. Here’s how to do just that in Postgres, MySQL, SQL Server, and Redshift.

Postgres

Postgres makes working with IPs very easy. It supports an inet column type, which provides native support for IPs and IP ranges. If you’re using Postgres and didn’t use the inet column type, don’t worry, you can still cast your varchars to inets to make comparisons easy:

select '192.168.1.1'::inet < '255.1.1.1'::inet

And to check if a single IP is within a range, use the << operator:

select '192.168.1.1'::inet << '192.168.0.0/16'::inet

MySQL 

While MySQL doesn’t have a data type for IP addresses, it’s still easy to compare IPs in MySQL. Using inet_aton you can convert your IP addresses to integers before you compare.

select inet_aton('192.168.1.1')
-- => 3,232,235,777 

IPv4 addresses are just nice ways of writing 32-bit numbers. Each octet of an IPv4 address represents 8 bits. Once the IPs have been converted to integers, they are easy to compare:

select inet_aton('192.168.1.1') < inet_aton('255.1.1.1')

SQL Server

With SQL Server there isn’t a built-in function to convert a varchar IP to an integer, but we can make one! We’ll use the function from Dennis Gobo over at sqlblog.com:

create function dbo.IPAddressToInteger(@ip as varchar(15))
  returns bigint
as
  begin
    return (
      convert(bigint, parsename(@ip, 1)) +
      convert(bigint, parsename(@ip, 2)) * 256 +
      convert(bigint, parsename(@ip, 3)) * 65536 +
      convert(bigint, parsename(@ip, 4)) * 16777216
    )
  end
go

SQL Server doesn’t have split function, but the parsename function works great for our use case. It is normally used to extract parts of object identifiers like dbo.table1.column1. But we can use it to parse IP addresses because they are also dot-delimited:

select dbo.IPAddressToInteger('192.168.1.1')
-- => 3,232,235,777

We’re multiplying the extracted octets by large numbers to form the final 32-bit number representing the IP address. These multiplications and additions are equivalent to bit shifting and binary or’ing the octets together, which we’ll also do in the Redshift section.

Redshift

Like with SQL Server, Redshift requires some gymnastics to convert a varchar IP to an  integer IP. First split the varchar on the dots and convert each octet to an integer. Bitshift each octet with << to it’s position in the 32-bit version of the IP address, and finally use | to binary or the four integers together.

select 
  split_part(user_ip, '.', 1)::integer << 24 |
  split_part(user_ip, '.', 2)::integer << 16 |
  split_part(user_ip, '.', 3)::integer << 8 |
  split_part(user_ip, '.', 4)::integer
from events

Splitting and shifting many varchar IPs can take a while. To speed things up, you can use a CTE or a temporary table to store the results of this conversion before it’s time to compare the IP addresses.

But we can do better - this integer conversion is quite slow. Instead of converting the varchar IP to an integer, we’ll add a bunch of '0' padding to the varchar so that any two IP varchars will be comparable.

-- Octets with under 3 chars get 0-padded
select 
  lpad(split_part(user_ip, '.', 1), 3, '0') ||
  lpad(split_part(user_ip, '.', 2), 3, '0') ||
  lpad(split_part(user_ip, '.', 3), 3, '0') || 
  lpad(split_part(user_ip, '.', 4), 3, '0')
from events
-- '192.168.1.1' => 192168001001

The resulting values aren’t true IP addresses, but they will compare correctly. Be sure to convert the IP range’s bounds to the same format. As a bonus, this varchar version runs 15x faster than converting the IPs to integers.

Fastest Comparisons

You’re best option for performance is to store the IP addresses in a format that can be used natively in comparisons. For Postgres, use the inet type. For everything else, we recommend an integer.

And to simplify querying, store the IP twice - once as a varchar, and once as an integer, and use whichever is most convenient for your query.

Haven't tried Periscope Data yet?
Start a trial and we’ll send you one of our famous coffee mugs.
Read More
Haven’t tried Periscope Data yet?