How Joins Work
The SQL join operation is one of the most powerful and commonly used SQL operations, but little attention is paid to how the internal SQL engine breaks down the tasks of join operations. This post will explore the common algorithms that databases use to compute them, including nested loop, hash, and merge joins. Our aim is to act as a resource for SQL users interested in exploring optimizations.
The examples use Postgres version 126.96.36.199 and the ‘world’ sample database that can be downloaded from pgfoundry.
Nested Loop Join
A join is defined as the cartesian product of two tables followed by a filter on the join predicate. The nested loop algorithm follow directly from the definition. For two tables INNER and OUTER and a join predicate P, nested loop joins concatenate every record in OUTER with every record in INNER and filter each result record on P.
Consider the following query:
JOIN countrylanguage ON
Running ‘explain’ on it returns:
Nested Loop (cost=0.00..2963.53 rows=235176 width=130)
-> Seq Scan on countrylanguage (cost=0.00..15.84 rows=984 width=17)
-> Materialize (cost=0.00..8.59 rows=239 width=113)
-> Seq Scan on country (cost=0.00..7.39 rows=239 width=113)
The query asks for every record in one table combined with every record in another. There’s no opportunity for pruning, so the query planner has no choice but to choose the nested loop join.
For queries with equijoins - or joins that exclusively use the equality operator - nested loops can be slower than necessary. Instead of iterating through each record in an inner table for each record in the outer, hash joins build a temporary hash table to index the values of the field whose equality is being tested. This saves time at the cost of memory. The outer table applies a hash function over the comparison fields and navigates to the matching stored values created in the earlier step, reducing the number of records it must traverse.
To get each city paired with its country, we can perform an equijoin.
JOIN country ON
city.countrycode = country.code
This yields the following query plan:
Hash Join (cost=10.38..139.25 rows=4079 width=20)
Hash Cond: (city.countrycode = country.code)
-> Seq Scan on city (cost=0.00..72.79 rows=4079 width=13)
-> Hash (cost=7.39..7.39 rows=239 width=15)
-> Seq Scan on country (cost=0.00..7.39 rows=239 width=15)
First, the query planner performs a full table scan on the country table to create a hash table for the possible values of country.code. Finally, a full table scan on city uses the hash table to return records whose countrycode equals country.code. Note that the query planner chose to build the hash table from the smaller table. Building it from the larger city table will return the same result but will take slightly longer and, more importantly, use more RAM. Databases optimize for memory accesses. I/O operations are orders of magnitude slower.
As tables get large, nested loop and hash joins can become costly. Large datasets quickly use up RAM and force the query planner to perform many expensive I/O operations to move data in and out of memory. For these cases, query planners will likely use a merge join.
Like hash join, merge join consists of two steps. First, both tables of the join are sorted on the join attribute. This can be done with just two passes through each table via an external merge sort. Finally, the result tuples are generated as the next ordered element is pulled from each table and the join attributes are compared.
To see the merge join in action, we join two large tables by creating a query to count how many pairs of unique cities have the sum of their populations equal to the difference of populations of any two unique cities.
To simplify the size of the SQL, we create two views; one for all the possible sums and one for all the possible differences:
VIEW sum_any2 AS SELECT city1.population + city2.population
city AS city1
, city AS city2
city1.id < city2.id
VIEW diff_any2 AS SELECT city1.population - city2.population
city AS city1
, city AS city2
city1.population > city2.population
These two views generate tables of over 8 million records each. The count query follows:
JOIN diff_any2 ON
sum_any2.pop = diff_any2.pop
Run ‘explain’ to see the query plan. Note, that this may vary depending on your Postgres setup.
Here we see each view is generated with a nested loop join that feed into sort operations that feed into a merge join operation, just as we expected.
Gaining a greater understanding of joins
While the knowledge of how joins occur is not required for most day-to-day work, understanding the process promotes greater appreciation for the algorithms that drive the SQL engine.