Deleting Nodes

Deleting Nodes#

One fundamental tree operation we haven’t covered yet is deleting nodes.

We saw previously a way to select sub-trees of a tree with a mask. For instance given the following tree:

p1 7 4 1 7 6 3 7 6 1
(⍳≢p) PPV p
┌─────────┐
│    7    │
│   ┌┴───┐│
│   1    4│
│┌──┼──┐ ││
│0  3  9 2│
│   │     │
│   6     │
│  ┌┴┐    │
│  5 8    │
└─────────┘

Say we want to delete node \(3\) and all of its descendants. We can mask all nodes which are node \(3\) or one of its descendants using a method we’ve seen before.

mask3=p I@{3}≡⍳≢p
mask PPV p
0 0 0 1 0 1 1 0 1 0
┌─────────┐
│    0    │
│   ┌┴───┐│
│   0    0│
│┌──┼──┐ ││
│0  1  0 0│
│   │     │
│   1     │
│  ┌┴┐    │
│  1 1    │
└─────────┘

The first step will be to remove node \(3\) and its descendants from the parent vector using this mask.

(~mask)/p
1 7 4 7 7 1

Like so many times before, we’re now left with parent pointers which need correcting. In the original p, a certain number of to-be-deleted nodes may have appeared before each node which was not deleted. After deleting the sub-tree, each node is therefore that many places farther back in the vector, and we need to correct the pointers to reflect that.

For instance, there are \(3\) nodes which we will remove that appear before node \(7\) in p, so when the nodes are removed, node \(7\) will move to index \(7-3=4\), and we need to update parent pointers accordingly.

index:     0 1 2 3 4 5 6 7 8 9
p:         1 7 4 1 7 6 3 7 6 1
mask:      0 0 0 1 0 1 1 0 1 0
                 ↑   ↑ ↑ ↑
  to-be-deleted ─┴───┴─┘ │
                   ┌─────┘
                   ↓
(~mask)/p: 1 7 4 7 7 1

We saw a similar situation when we were extracting trees from forests, where parent pointers were offset too far. In that instance, we found the offsets to correct by with a +\ on the mask of deleted nodes. We can use the same technique here, but for completeness, we’re going to show another way[1].

Firstly, let’s find the indices of all the nodes being deleted.

mask
3 5 6 8

These indices define intervals, within which we know how many to-be-deleted nodes have appeared. For example, we know that \(3\) such nodes appear before node \(7\).

  1+(mask)7    ⍝ 3 to-be-deleted nodes appear before node 7
⍝ ││└───────┴───── in which interval does node 7 appear
⍝ └┴────────────── using ⎕IO←0 means we have to make this adjustment
3

This tells us exactly what to subtract in order to correct each parent pointer.

q(~mask)/p          ⍝ delete unwanted nodes
  1+(mask)q        ⍝ deleted nodes appearing before each parent
q-1+(mask)q        ⍝ correcting the pointers
0 3 1 3 3 0
1 4 3 4 4 1

This correctly removes the whole sub-tree descending from node \(3\).

((~mask)/⍳≢p) PPV q-1+(mask)q
┌─────┐
│  7  │
│ ┌┴─┐│
│ 1  4│
│┌┴┐ ││
│0 9 2│
└─────┘

We can put this all together into a tacit idiom we can easily reuse.

    (mask)(⊢-1+⍸)(~mask)/p
PPV (mask)(⊢-1+⍸)(~mask)/p
1 4 3 4 4 1
┌─────┐
│  ∘  │
│ ┌┴─┐│
│ ∘  ∘│
│┌┴┐ ││
│∘ ∘ ∘│
└─────┘