Working with dependencies

Working with dependencies, particularly recursive dependencies, may not always be entirely intuitive, but it could be critical knowledge in your database development work. This article focuses primarily on different ways of visualizing dependencies and how to loop through them using recursive common table expressions.

Types of dependencies

Dependencies come in many forms, though the most common types you’ll probably encounter are in the form of business rules and dependencies between database objects, particularly in schemabound relations.

For instance, fact rows in a data warehouse are dependent on their respective dimension members, either explicitly by having a foreign key constraint that binds the two, or implicitly – where you haven’t really defined a FK constraint, but your solution won’t work if fact data is orphaned. This means that you can’t delete a dimension member unless you’ve first cleared all its fact data.

Similarly, imagine a parent-child relation within a table, like a dimension. You can’t delete a row unless you first make sure that it doesn’t have any children of its own that would be turned into orphans by the operation.

In this example, we’re going to be working with a simple temp table called #references that we’re populating with the all of the current database’s object dependencies.

CREATE TABLE #references (
    [object]             sysname NOT NULL,
    referenced_object    sysname NOT NULL,
    drop_order           int NULL,
    PRIMARY KEY CLUSTERED ([object], referenced_object)

--- Populate table with object references from the current database:
INSERT INTO #references ([object], referenced_object)
FROM sys.sql_expression_dependencies AS dep
INNER JOIN sys.objects AS obj ON obj.[object_id]=dep.referencing_id
INNER JOIN sys.objects AS ref ON ref.[object_id]=dep.referenced_id
WHERE dep.referenced_database_name IS NULL;

Displaying all relations

First off, here’s a way of constructing a recursive common table expression that loops through all of the relations that exist between any two objects in the #references table.

WITH rcte ([object], referenced_object, [level])
AS (
    --- This is the anchor, one row for each object:
    SELECT DISTINCT [object],
        [object] AS referenced_object,
        0 AS [level]
    FROM #references


    --- This part is the recursion, where we join each referenced
    --- object from the recursive CTE to each of its own
    --- referenced objects in turn:
    SELECT rcte.[object],
    FROM rcte
    INNER JOIN #references AS ref ON
        rcte.referenced_object=ref.[object] AND
        --- Avoid circular (infinite) references:
        rcte.[object]!=ref.referenced_object AND
        --- ... also, skip self-referencing relations:

--- .. and output the result:
SELECT [object], referenced_object, [level]
FROM rcte
WHERE [level]>0;

Note that objects can appear more than once in this view, whenever they have more than one relation to another object. Also, the view does not really accomodate circular references between objects, if you have those, the query will reach a maximum recursion limit.

Calculating a path between objects

A more human-readable way of visualizing relations between objects is by constructing a plaintext “path”, where all the levels of a dependency are clearly readable to the user. We’ll start off with the same recursive CTE as above, but we’ll add the “path” column. In the anchor, the “path” is just the name of the top-level object, and as we recurse through the dependency hierarchy, we’ll add each level’s member to the end of the path.

WITH rcte ([object], referenced_object, [level], [path])
AS (
    --- This is the anchor, one row for each object:
    SELECT DISTINCT [object],
        [object] AS referenced_object,
        0 AS [level],
        --- At the top level, the object is it's own path:
        CAST([object] AS varchar(max)) AS [path]
    FROM #references


    --- This part is the recursion, where we join each referenced
    --- object from the recursive CTE to each of its own
    --- referenced objects in turn:
    SELECT rcte.[object],
        --- Adding the referenced object to the end of the path
        --- with a suitable separator, like a slash, dash, colon,
        --- or in this case, a "<-".
        CAST(rcte.[path]+' <- '+ref.referenced_object AS varchar(max))
    FROM rcte
    INNER JOIN #references AS ref ON
        rcte.referenced_object=ref.[object] AND
        --- Avoid circular (infinite) references:
        rcte.[object]!=ref.referenced_object AND
        --- ... also, skip self-referencing relations:
    WHERE [level]<100)

SELECT [object],  referenced_object,  [path]
FROM rcte
WHERE [level]>0
ORDER BY referenced_object;

Note that recursive CTE’s do not allow implicit datatype conversions in the UNION between the anchor and the recursive section, which means that we explicitly have to CAST the path column.

Dropping objects in the order of their dependencies

Here’s a common challenge when you’re faced with dropping and recreating a large number of database objects. If you’re not careful with your dependencies, particularly schemabound objects, you’ll be faced with the following message:

Msg 3726, Level 16, State 1, Line 1
Could not drop object 'schema.object' because it is referenced by a FOREIGN KEY constraint.

.. or:

Msg 3729, Level 16, State 1, Line 1
Cannot DROP TABLE 'schema.object' because it is being referenced by object 'object'.

What you need to do is figure out which objects aren’t referenced by any other objects, and drop those first. Having dropped those objects will make other objects unreferenced, so you can drop them in turn. And so it goes, until you’ve dropped all of the objects.

We can use T-SQL to calculate this “drop order” of objects. But for the sake of readability, I’ve chosen to update the “drop_order” column in the temp table using a WHILE loop, instead of embedding this stuff in another recursive common table expression.

--- @drop_order is the counter variable that we'll be using:
DECLARE @drop_order int=1;

--- Remove any self-referencing rows here (objects that depend
--- on themselves)
DELETE FROM #references
WHERE [object]=referenced_object;

--- Top-level objects are the ones that no other objects
--- reference. These are "drop order" 1, i.e. the first
--- ones in line:
UPDATE #references
SET [email protected]_order
WHERE [object] NOT IN (
    SELECT referenced_object
    FROM #references);

--- And from here, we loop through the table, incrementing
--- @drop_order by one for every loop. The WHILE .. @@ROWCOUNT
--- construct makes sure we loop through this block until
--- nothing is updated.
    SET @[email protected]_order+1;

    --- Objects that have now become "droppable" are
    --- 1) children of an object of the most recent drop_order, and
    --- 2) not referenced by any to-be-dropped objects.
    UPDATE droppable
    SET [email protected]_order
    FROM #references AS recently_dropped
    INNER JOIN #references AS droppable ON
    WHERE [email protected]_order-1 AND
          droppable.drop_order IS NULL AND
          droppable.[object] NOT IN (
              SELECT referenced_object
              FROM #references
              WHERE drop_order IS NULL);

--- When the loop has completed, display the result:
SELECT DISTINCT ref.[object], drop_order
FROM #references AS ref
ORDER BY drop_order, [object];

Note: The list above does not include objects that are not part of any relation at all. This is intentional, in order to keep the example code readable.

I hope this tutorial has given you a good starting point for working with recursive common tables and dependencies. Let me know if there’s anything missing!

Let me hear your thoughts!

Fill in your details below or click an icon to log in: Logo

You are commenting using your account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

This site uses Akismet to reduce spam. Learn how your comment data is processed.