Accumulating values in a parent-child hierarchy

Last week, we looked at how to construct a visual representation of a hierarchy stored as a parent-child table. The obvious next step is to accumulate values stored on those nodes using this hierarchy.

Here’s the same example data that we used for the last article; a corporate balance sheet with a number of levels which together make up a ragged (parent-child) dimension.

DECLARE @balanceSheet TABLE (
    rowID             int NOT NULL,
    parentRowID       int NULL,
    rowDescription    varchar(100) NOT NULL,
    PRIMARY KEY CLUSTERED (rowID)
);

--- Balance sheet example from http://en.wikipedia.org/wiki/Balance_sheet
--- Slightly modified to readability.
INSERT INTO @balanceSheet (rowID, parentRowID, rowDescription)
VALUES (1,  NULL, 'ASSETS'),
       (2,     1, 'Current Assets'),
       (3,     2, 'Cash and Cash Equivalents'),
       (4,     2, 'Accounts Receivable (Debtors)'),
       (5,     4, 'Less : Allowances for Doubtful Accounts'),
       (6,     2, 'Inventories'),
       (7,     2, 'Prepaid Expenses'),
       (8,     2, 'Investment Securities'),
       (9,     2, 'Other Current Assets'),
       (10,    1, 'Non-Current Assets (Fixed Assets)'),
       (11,   10, 'Property, Plant and Equipment'),
       (12,   11, 'Less : Accumulated Depreciation'),
       (13,   10, 'Investment Securities'),
       (14,   10, 'Investments in Associates'),
       (15,   10, 'Intangible Assets'),
       (16,   15, 'Less : Accumulated Amortization'),
       (17,   10, 'Goodwill'),
       (18,   10, 'Other Non-Current Assets'),
       (19, NULL, ''),
       (20, NULL, 'LIABILITIES and SHAREHOLDERS'' EQUITY'),
       (21,   20, 'LIABILITIES'),
       (22,   21, 'Current Liabilities (due within one year)'),
       (23,   22, 'Accounts Payable'),
       (24,   22, 'Current Income Tax Payable'),
       (25,   22, 'Current portion of Loans Payable'),
       (26,   22, 'Short-term Provisions'),
       (27,   22, 'Other Current Liabilities'),
       (28,   21, ''),
       (29,   21, 'Non-Current Liabilities (due after more than one year)'),
       (30,   29, 'Loans Payable'),
       (31,   29, 'Issued Debt Securities'),
       (32,   29, 'Deferred Tax Liabilities'),
       (33,   29, 'Provisions, e.g. Pension Obligations'),
       (34,   29, 'Other Non-Current Liabilities'),
       (35,   20, 'SHAREHOLDERS'' EQUITY'),
       (36,   35, 'Paid-in Capital'),
       (37,   35, 'Share Capital'),
       (38,   35, 'Share Premium'),
       (39,   38, 'Less: Treasury Shares'),
       (40,   35, 'Retained Earnings'),
       (41,   35, 'Revaluation Reserve'),
       (42,   35, 'Accumulated Other Comprehensive Income');

Now, let’s also add a few values that we can aggregate using this hierarchy:

DECLARE @fact TABLE (
    rowID         int NOT NULL,
    amount        money NOT NULL,
    PRIMARY KEY CLUSTERED (rowID)
);

INSERT INTO @fact (rowID, amount)
VALUES ( 3,  1000.00),
       ( 4,   450.00),
       ( 5,   -30.00),
       (11,  3700.00),
       (12,  -370.00),
       (17,    50.00),
       (23,  -190.00),
       (30,  -800.00),
       (34,   -40.00),
       (36, -2200.00),
       (37, -1280.00),
       (40,  -290.00);

Here’s the how the balance sheet dimension looks after we’ve sorted and indented it to a textual/tabular representation:

---- -------------------------------------------------------------
  1  ASSETS
  2     Current Assets
  3        Cash and Cash Equivalents
  4        Accounts Receivable (Debtors)
  5           Less : Allowances for Doubtful Accounts
  6        Inventories
  7        Prepaid Expenses
  8        Investment Securities
  9        Other Current Assets
 10     Non-Current Assets (Fixed Assets)
 11        Property, Plant and Equipment
 12           Less : Accumulated Depreciation
 13        Investment Securities
 14        Investments in Associates
 15        Intangible Assets
 16           Less : Accumulated Amortization
 17        Goodwill
 18        Other Non-Current Assets
 19  
 20  LIABILITIES and SHAREHOLDERS' EQUITY
 21     LIABILITIES
 22        Current Liabilities (due within one year)
 23           Accounts Payable
 24           Current Income Tax Payable
 25           Current portion of Loans Payable
 26           Short-term Provisions
 27           Other Current Liabilities
 28        
 29        Non-Current Liabilities (due after more than one year)
 30           Loans Payable
 31           Issued Debt Securities
 32           Deferred Tax Liabilities
 33           Provisions, e.g. Pension Obligations
 34           Other Non-Current Liabilities
 35     SHAREHOLDERS' EQUITY
 36        Paid-in Capital
 37        Share Capital
 38        Share Premium
 39           Less: Treasury Shares
 40        Retained Earnings
 41        Revaluation Reserve
 42        Accumulated Other Comprehensive Income

You’ll notice that we’ve added values to rows 5, 4 and 3, among others. We need the values on row 5 to be included in the total on row 4, which in turn should be added to the total of row 3. The amount on row 1 should reflect the total of all amounts on rows 2 through 18.

As with the indent/sort example, this is also a recursive problem. Each row has to accumulate to its respective parent row, which in turn also accumulates up to its own parent row, and so on. Here’s how we would construct a recursive CTE to do this:

WITH hierarchy (topRowID, rowID)
AS (
    --- Each row is an anchor. In this case, the anchor is
    --- the "top" or "accumulated" row.
    SELECT rowID, rowID
    FROM @balanceSheet

    UNION ALL

    --- ... and from there on we recurse "down the hierarchy"
    --- by finding children of the current node.
    SELECT h.topRowID, bs.rowID
    FROM @balanceSheet AS bs
    INNER JOIN hierarchy AS h ON bs.parentRowID=h.rowID)

SELECT rowID, topRowID
FROM hierarchy
ORDER BY rowID, topRowID;

You can build this CTE in two ways, depending on if you want to recurse up or down the hierarchy, but I won’t go into the details here.

The output of this query is:

rowID       topRowID
----------- -----------
1           1
2           1
2           2
3           1
3           2
3           3
4           1
4           2
4           4
5           1
5           2
5           4
5           5
6           1
6           2
6           6
...

If you read the data from the top, you’ll see that row 1 is “included” in, i.e. adds up to only row 1. Row 2, however, is “included” in row 2 and row 1, because row 1 is the parent of row 2. As for row 3, it is “included” with rows 1, 2 and 3, because row 2 is its parent and row 1 its ancestor.

Do you see what we’re doing here? With this recordset, we can now join the rowID column to the fact data, but in the SELECT and GROUP BY clauses, we’ll use the topRowID column. This way, we can aggregate all the data so it’s presented at the accumulated level, i.e. grouped for each node in the hierarchy.

Here’s one way to write that query:

WITH hierarchy (topRowID, rowID)
AS (
    --- Each row is an anchor. In this case, the anchor is
    --- the "top" or "accumulated" row.
    SELECT rowID, rowID
    FROM @balanceSheet

    UNION ALL

    --- ... and from there on we recurse "down the hierarchy"
    --- by finding children of the current node.
    SELECT h.topRowID, bs.rowID
    FROM @balanceSheet AS bs
    INNER JOIN hierarchy AS h ON bs.parentRowID=h.rowID)

SELECT h.topRowID AS rowID,
    --- For clarity, this is each row's amount, without the
    --- accumulation:
    SUM((CASE WHEN fact.rowID=h.topRowID
        THEN fact.amount
        ELSE 0.0 END)) AS [amount, this row],
    --- ... and here's the accumulated total, which is what
    --- we're looking for:
    SUM(fact.amount) AS [amount, accumulated]
FROM hierarchy AS h
INNER JOIN @fact AS fact ON fact.rowID=h.rowID
GROUP BY h.topRowID
ORDER BY h.topRowID;

Now all you have to do is combine the accumulation from this post with the indentation and sorting that we worked out in last week’s article, but I’ll leave that to you.

Until next week!

8 thoughts on “Accumulating values in a parent-child hierarchy

  1. Hi Daniel,

    great example, I’d like to adopt it to my needs (a profit-loss calculation). When I execute the queries above I get same amount for “amount, this row” and “amount, accumulated”. In my understanding the “amount, accumulated” represents the sum of the certain row and it’s childs, right?

    Best regards
    Michael

    • Good catch! Turns out, there was a typo in the “amount, this row” column. The correct syntax should be

          --- For clarity, this is each row's amount, without the
          --- accumulation:
          SUM((CASE WHEN fact.rowID=h.topRowID
              THEN fact.amount
              ELSE 0.0 END)) AS [amount, this row],
      

      I’ve updated the code in the post to reflect this. Thanks for your help!

  2. Hi Daniel,

    I tried a lot with your examples, but got stuck again :-(

    As i wrote before, i want to build a profit-loss statement.

    I have one table forProfit-Loss structure like “balancesheet” in your example.

    STRUCTURE
    Id|Parent_Id|Description
    1|NULL|Revenue
    2|1|Revenue Europe
    3|1|Revenue Asia

    Then I have a table with certain accounts. Each account belongs to a certain node in the structure table:

    ACCOUNT
    Id|Structure_Id|Description|Account_No
    1|2|Revenue Germany|501000
    2|2|Revenue Siwss|502000
    3|3|Revenue Japan|510100
    4|3|Revenue Indonesia|510200

    The data is also in a separate table like:

    DATA
    Id|Account_No|Amount|YearPeriod
    1|501000|1000|201701
    2|501000|560|201702
    3|502000|300|201701
    4|502000|600|201702

    How do I have to adjust your example to get it work?

    I hope you can help me, thanks in advance.

    Best
    Michael

    • I would put the “structure” and the “account” data in the same table, which together forms your parent-child hierarchy. In effect, the account is just the leaf level of your P&L or balance sheet. The rest is pretty much copy-paste from my code.

  3. Hi Daniel,

    Great post!

    I been trying to place the description along side the row, so I can see the accumulated value that is associated with it, unsuccessfully. I attempted to create a join between the hierarchy table and the declared @balancesheet with not much success. Do you have any suggestions in how I should approach this?

    Thank you,
    Paul

  4. I am successfully able to apply the above concept and accumulated the sum.
    How do I apply indentation concept from your previous post? Please help me.
    In previous post we loop from parent withe level 0 and the each child with incrementing level.
    How do i apply sort column and level in this code?

Let me hear your thoughts!

This site uses Akismet to reduce spam. Learn how your comment data is processed.