HASH JOIN deep-dive

Among the three different types of join operators used by SQL Server, the HASH JOIN does some of the hardest work. It scales pretty well and is very suitable for parallel processing. As such, it can be very powerful in many applications, but hash joins can potentially consume quite a bit of memory, so seeing on in your query plan could be an indicator of a performance tuning issue in your query or data.

Recap: Different join operators

You may recall a previous post on sqlsunday.com that goes through the three different types of joins. The short, executive summary of this article is that there are three types of joins you should know about:

  • The MERGE JOIN, which joins two datasets that are identically ordered and typed by merging them. Extremely efficient, but often hard to accomplish because not all data is sorted the way you’d like it to, and sorting the data first is an expensive operation that more often than not eliminates the performance gains of using MERGE JOIN.
  • Nested loops, LOOP JOIN, used to match rows from a large table to a (typically) small lookup table. This is the only join type that can handle joins that completely lack equijoins (i.e. where a column or expression in the left table is equal to a column or expression in the right table).
  • The HASH JOIN, which does require at least one equijoin, but apart from this, it will handle pretty much most of your queries, particularly the larger ones.

How does a HASH JOIN work?


A hash function is a one-way function that turns a set of data into a hash, like a checksum if you will. This means that there is no way to deterministically reverse a hash function in order to retrieve the original plaintext.

On a side note, a hash function may look like encryption or compression, but it’s not: Encryption and compression are reversible, and the length of an encrypted/compressed text varies with the length of the plaintext. Hash values, however, are generally fixed in length, which implicitly means that multiple plaintexts can generate the same hash values, rendering a hash value non-reversible.

Both encryption, hashing and compression, however, generate values that are very evenly distributed, which is a trait we’re looking for in hash joins.

SQL Server supports a number of different hash functions using the HASHBYTES() function. I’ve provided the following code example to illustrate generally how hashing works, not specifically how SQL Server implements hashing in its joins. The principle, however, is the same. The following example..

SELECT HASHBYTES('SHA1', 'This is a test 1');
SELECT HASHBYTES('SHA1', 'This is a test 2');
SELECT HASHBYTES('SHA1', 'This is a test 3');
SELECT HASHBYTES('SHA1', 'This is a test 4');

… returns the following results:


Notice how, even though the original texts differ by only a single character, the hash values are completely different. Also, the length of the hash is constant (how long it is depends on which algorithm you use). In this example, we’re using SHA1, which yields a 20 byte hash.

Another thing to remember is that hash functions are always deterministic, which means that the same plaintext will yield the same hash value every time.

Hash tables

As a part of the process of JOIN’ing two sets of data, SQL Server calculates the hash values of all the equijoin columns for both sides.

Consider the following example:

FROM tableA AS a
    a.someColumn=b.someColumn AND
    a.important=1 AND
    a.intValue*100=b.intValue AND

An equijoin is a join that use the “=” to join a column or expression on the left side with one on the right. In this example, there are two of those going on; the “someColumn” and “intValue” columns. The other join conditions are residual, which means that they’re performed after the equijoin columns have been successfully joined.

The hash values are stored in two hash tables, known as the probe input and the build input. The build input is normally the smaller data set, and is stored in memory. You could think of them as the build input being the “dimension table”, and the probe table as the “fact table”. The records from the build input are evenly distributed in so-called buckets. This effectively partitions the data into much smaller and more manageable chunks that can be matched together.

Remember, because the hash function, and its subsequent assignment to a bucket are deterministic, matching rows will end up in the same bucket in the build table as in the probe table.


In the last step of the join, the buckets of the build table and the probe table are matched together first. After that, the rows in each corresponding bucket are matched to each other. As you remember, two different records can yield the same hash value and bucket; in the second step, SQL Server actually verifies that the records do match as well.

Join using hashes and bucketsFinally, any residual join conditions are matched. In the example above, this is the “anotherColumn” join, which isn’t an equijoin.

In graphical query plans

Hash matchIn a graphical query plan, the hash join is shown as a “Hash match” icon, joining two sets of data. The build input, i.e. the smaller of the two tables, will be shown on top of these two, and the probe table below.

Different types of hash joins

Hash joins are performed slightly differently depending on the size of the two inputs and the memory allocated to your query. When SQL Server cannot accurately determine the size of the inputs, it’ll start “optimistically” with an in-memory join, and then gradually migrate to a grace join and finally to a recursive join.

In-memory hash join

The build input is hashed and stored in memory in buckets. The probe input is processed row-by-row and matched against the build input. In-memory hash joins are possible when the entire build input fits in the query’s allocated memory.

Grace hash join

A grace hash join happens when the inputs won’t comfortably fit in the allocated space of memory. The input and probe builds are partitioned, using hash values of the hash values, in a process that could be described as partitioning. The two inputs are then matched first by partition, then by bucket, and finally row-for-row.

The “current” partition being used to match the two inputs is in-memory, whereas the remaining partitions are temporarily stored on disk in what’s known as a spill. This can cause a potentially severe performance penalty.

Recursive hash join

Sometimes, even a grace hash join can generate partitions that are too large to work with. This can happen if your inputs are very large, or if the buckets are not evenly populated, resulting in some buckets that are too big to use in-memory. In this scenario, SQL Server can dynamically partition the partitions (recursively).

Query hints

Forcing a hash join

You can force SQL Server to use a hash join, using a join hint:

FROM tableA AS a
    a.someColumn=b.someColumn AND
    a.important=1 AND
    a.intValue*100=b.intValue AND

But as a general rule of thumb, if SQL Server doesn’t suggest a hash join, it probably isn’t the best strategy. I know very few people who’ve ever outsmarted the SQL Server query planner.

Avoiding a hash join

How your inputs tables are indexed, datatyped, as well as how your join is written, is key to which type of join SQL Server selects. You can use a join hint to force a certain type of joins.

Again, in 99% of the cases, a bet against the default query plan is a losing strategy. Instead, think about how you can change or add indexes on the input tables, refresh your statistics, split your queries into parts, perhaps using temp tables, etc.

More reading

There’s a lot more to be read about the inner workings on hash joins. Here are three good articles on the subject: Craig Freedman’s SQL Server blog, Sebastian Meine on hash joins, MSDN: Understanding hash joins

4 thoughts on “HASH JOIN deep-dive

  1. Pingback: Blocking/non-blocking aggregate operators « Sunday morning T-SQL

  2. Pingback: Video: three SQL Server join operators in three minutes | Sunday morning T-SQL

Let me hear your thoughts!

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s