Working with dependencies, particularly recursive dependencies, may not always be entirely intuitive, but it could be critical knowledge in your database development work. This article focuses primarily on different ways of visualizing dependencies and how to loop through them using recursive common table expressions.
Types of dependencies
Dependencies come in many forms, though the most common types you’ll probably encounter are in the form of business rules and dependencies between database objects, particularly in schemabound relations.
For instance, fact rows in a data warehouse are dependent on their respective dimension members, either explicitly by having a foreign key constraint that binds the two, or implicitly – where you haven’t really defined a FK constraint, but your solution won’t work if fact data is orphaned. This means that you can’t delete a dimension member unless you’ve first cleared all its fact data.
Similarly, imagine a parent-child relation within a table, like a dimension. You can’t delete a row unless you first make sure that it doesn’t have any children of its own that would be turned into orphans by the operation.
In this example, we’re going to be working with a simple temp table called #references that we’re populating with the all of the current database’s object dependencies.
CREATE TABLE #references (
[object] sysname NOT NULL,
referenced_object sysname NOT NULL,
drop_order int NULL,
PRIMARY KEY CLUSTERED ([object], referenced_object)
);
--- Populate table with object references from the current database:
INSERT INTO #references ([object], referenced_object)
SELECT DISTINCT
SCHEMA_NAME(obj.[schema_id])+'.'+obj.[name],
SCHEMA_NAME(ref.[schema_id])+'.'+ref.[name]
FROM sys.sql_expression_dependencies AS dep
INNER JOIN sys.objects AS obj ON obj.[object_id]=dep.referencing_id
INNER JOIN sys.objects AS ref ON ref.[object_id]=dep.referenced_id
WHERE dep.referenced_database_name IS NULL;
Displaying all relations
First off, here’s a way of constructing a recursive common table expression that loops through all of the relations that exist between any two objects in the #references table.
WITH rcte ([object], referenced_object, [level]) AS ( --- This is the anchor, one row for each object: SELECT DISTINCT [object], [object] AS referenced_object, 0 AS [level] FROM #references UNION ALL --- This part is the recursion, where we join each referenced --- object from the recursive CTE to each of its own --- referenced objects in turn: SELECT rcte.[object], ref.referenced_object, rcte.[level]+1 FROM rcte INNER JOIN #references AS ref ON rcte.referenced_object=ref.[object] AND --- Avoid circular (infinite) references: rcte.[object]!=ref.referenced_object AND --- ... also, skip self-referencing relations: ref.[object]!=ref.referenced_object) --- .. and output the result: SELECT [object], referenced_object, [level] FROM rcte WHERE [level]>0;
Note that objects can appear more than once in this view, whenever they have more than one relation to another object. Also, the view does not really accomodate circular references between objects, if you have those, the query will reach a maximum recursion limit.
Calculating a path between objects
A more human-readable way of visualizing relations between objects is by constructing a plaintext “path”, where all the levels of a dependency are clearly readable to the user. We’ll start off with the same recursive CTE as above, but we’ll add the “path” column. In the anchor, the “path” is just the name of the top-level object, and as we recurse through the dependency hierarchy, we’ll add each level’s member to the end of the path.
WITH rcte ([object], referenced_object, [level], [path]) AS ( --- This is the anchor, one row for each object: SELECT DISTINCT [object], [object] AS referenced_object, 0 AS [level], --- At the top level, the object is it's own path: CAST([object] AS varchar(max)) AS [path] FROM #references UNION ALL --- This part is the recursion, where we join each referenced --- object from the recursive CTE to each of its own --- referenced objects in turn: SELECT rcte.[object], ref.referenced_object, rcte.[level]+1, --- Adding the referenced object to the end of the path --- with a suitable separator, like a slash, dash, colon, --- or in this case, a "<-". CAST(rcte.[path]+' <- '+ref.referenced_object AS varchar(max)) FROM rcte INNER JOIN #references AS ref ON rcte.referenced_object=ref.[object] AND --- Avoid circular (infinite) references: rcte.[object]!=ref.referenced_object AND --- ... also, skip self-referencing relations: ref.[object]!=ref.referenced_object WHERE [level]<100) SELECT [object], referenced_object, [path] FROM rcte WHERE [level]>0 ORDER BY referenced_object;
Note that recursive CTE’s do not allow implicit datatype conversions in the UNION between the anchor and the recursive section, which means that we explicitly have to CAST the path column.
Dropping objects in the order of their dependencies
Here’s a common challenge when you’re faced with dropping and recreating a large number of database objects. If you’re not careful with your dependencies, particularly schemabound objects, you’ll be faced with the following message:
Msg 3726, Level 16, State 1, Line 1 Could not drop object 'schema.object' because it is referenced by a FOREIGN KEY constraint.
.. or:
Msg 3729, Level 16, State 1, Line 1 Cannot DROP TABLE 'schema.object' because it is being referenced by object 'object'.
What you need to do is figure out which objects aren’t referenced by any other objects, and drop those first. Having dropped those objects will make other objects unreferenced, so you can drop them in turn. And so it goes, until you’ve dropped all of the objects.
We can use T-SQL to calculate this “drop order” of objects. But for the sake of readability, I’ve chosen to update the “drop_order” column in the temp table using a WHILE loop, instead of embedding this stuff in another recursive common table expression.
--- @drop_order is the counter variable that we'll be using: DECLARE @drop_order int=1; --- Remove any self-referencing rows here (objects that depend --- on themselves) DELETE FROM #references WHERE [object]=referenced_object; --- Top-level objects are the ones that no other objects --- reference. These are "drop order" 1, i.e. the first --- ones in line: UPDATE #references SET drop_order=@drop_order WHERE [object] NOT IN ( SELECT referenced_object FROM #references); --- And from here, we loop through the table, incrementing --- @drop_order by one for every loop. The WHILE .. @@ROWCOUNT --- construct makes sure we loop through this block until --- nothing is updated. WHILE (@@ROWCOUNT!=0) BEGIN; SET @drop_order=@drop_order+1; --- Objects that have now become "droppable" are --- 1) children of an object of the most recent drop_order, and --- 2) not referenced by any to-be-dropped objects. UPDATE droppable SET droppable.drop_order=@drop_order FROM #references AS recently_dropped INNER JOIN #references AS droppable ON recently_dropped.referenced_object=droppable.[object] WHERE recently_dropped.drop_order=@drop_order-1 AND droppable.drop_order IS NULL AND droppable.[object] NOT IN ( SELECT referenced_object FROM #references WHERE drop_order IS NULL); END --- When the loop has completed, display the result: SELECT DISTINCT ref.[object], drop_order FROM #references AS ref ORDER BY drop_order, [object];
Note: The list above does not include objects that are not part of any relation at all. This is intentional, in order to keep the example code readable.
I hope this tutorial has given you a good starting point for working with recursive common tables and dependencies. Let me know if there’s anything missing!