Cross-UPDATEs

A cross-UPDATE is an UPDATE operation that includes one or more other tables, using a JOIN. This is very useful in order to update one table with a value derived from another table.

It’s worth noting that cross UPDATEs, also known as “cross table updates” on some platforms, are not ANSI compliant. Not that anybody cares, and as far as I can tell, quite a few other database platforms have added support for the UPDATE..FROM syntax in recent years. But now you know, just in case you’re an ANSI SQL afficionado.

Example of a cross UPDATE

Here’s an example of two tables with a few rows of data:

CREATE TABLE #a (
    keycol        int NOT NULL,
    val           varchar(10) NULL,
    PRIMARY KEY CLUSTERED (keycol)
);

INSERT INTO #a (keycol)
VALUES (101), (102), (103);

CREATE TABLE #b (
    keycol        int NOT NULL,
    subkeycol     int NOT NULL,
    val           varchar(10) NOT NULL,
    PRIMARY KEY CLUSTERED (keycol, subkeycol)
)

INSERT INTO #b (keycol, subkeycol, val)
VALUES (101, 1, 'A'),
       (102, 1, 'B1'),
       (103, 1, 'C');

And here’s the basic cross UPDATE, where we apply the “val” column of #b to the #a table:

UPDATE a
SET a.val=b.val
FROM #a AS a
INNER JOIN #b AS b ON a.keycol=b.keycol;

Multiple matching rows

Here’s the problem with cross UPDATEs: If multiple rows in the #b table match a row in #a, you won’t deterministically know which value from #b gets applied to #a.

Let’s add some more example data to #b to illustrate this:

INSERT INTO #b (keycol, subkeycol, val)
VALUES (102, 2, 'B2'),
       (102, 3, 'B3');

You can now easily see the problem by joining the two tables in a SELECT statement:

SELECT a.keycol, b.val
FROM #a AS a
INNER JOIN #b AS b ON a.keycol=b.keycol;

The SELECT query will return three records for keycol=102: “B1”, “B2” and “B3”. So, there’s no “correct” way to solve this task. Here’s the UPDATE statement:

UPDATE a
SET a.val=b.val
FROM #a AS a
INNER JOIN #b AS b ON a.keycol=b.keycol;

If you look at the query plan, you’ll actually find a stream aggregate operator. If you examine the query plan XML, the aggregate function is “ANY” – which means just that – the query will match any random value from #b.

The conclusion, in other words, is that it is your responsibility to enforce uniqueness in cross UPDATEs, either by applying correct unique indexes or by building deterministic business logic in your query.

But there’s one more way to accomplish this operation.

Using MERGE to update

The MERGE statement can be used to update records in one table with data from another. The statement is designed to also INSERT or DELETE records, depending on how you write it, but using it just to UDPATE is just as good. The equivalent UPDATE statement using a MERGE will look like this:

MERGE INTO #a AS a
USING #b AS b ON a.keycol=b.keycol

WHEN MATCHED THEN
    UPDATE SET a.val=b.val;

And, as an added benefit, the MERGE till add an “Assert” operator to the query plan – it will actually verify that you are not matching multiple rows. So, with the test data that we have from earlier on, this MERGE operation would return the following error message:

Msg 8672, Level 16, State 1, Line 1
The MERGE statement attempted to UPDATE or DELETE the same row more than once. This happens when a target row matches more than one source row. A MERGE statement cannot UPDATE/DELETE the same row of the target table multiple times. Refine the ON clause to ensure a target row matches at most one source row, or use the GROUP BY clause to group the source rows.

In this specific case, changing the MERGE statement to the following (by adding a condition that makes #b unique) will do the trick:

MERGE INTO #a AS a
USING #b AS b ON a.keycol=b.keycol AND b.subkeycol=1

WHEN MATCHED THEN
    UPDATE SET a.val=b.val;

Oh, and don’t be surprised if the MERGE operation is more efficient than an UPDATE. Check back next week for more!

1 comment

Leave a comment

Your email address will not be published. Required fields are marked *