Deleting Nodes

Deleting Nodes#

One fundamental tree operation we haven’t covered yet is deleting nodes.

We saw previously a way to select sub-trees of a tree with a mask. For instance given the following tree:

p1 7 4 1 7 6 3 7 6 1
(⍳≢p) PPV p
┌─────────┐
│    7    │
│   ┌┴───┐│
│   1    4│
│┌──┼──┐ ││
│0  3  9 2│
│   │     │
│   6     │
│  ┌┴┐    │
│  5 8    │
└─────────┘

Say we want to delete node 3 and all of its descendants. We can mask all nodes which are node 3 or one of its descendants using a method we’ve seen before.

mask3=p I@{3}≡⍳≢p
mask PPV p
0 0 0 1 0 1 1 0 1 0
┌─────────┐
│    0    │
│   ┌┴───┐│
│   0    0│
│┌──┼──┐ ││
│0  1  0 0│
│   │     │
│   1     │
│  ┌┴┐    │
│  1 1    │
└─────────┘

The first step will be to remove node 3 and its descendants from the parent vector using this mask.

(~mask)/p
1 7 4 7 7 1

Like so many times before, we’re now left with parent pointers which need correcting. In the original p, a certain number of to-be-deleted nodes may have appeared before each node which was not deleted. After deleting the sub-tree, each node is therefore that many places farther back in the vector, and we need to correct the pointers to reflect that.

For instance, there are 3 nodes which we will remove that appear before node 7 in p, so when the nodes are removed, node 7 will move to index 73=4, and we need to update parent pointers accordingly.

index:     0 1 2 3 4 5 6 7 8 9
p:         1 7 4 1 7 6 3 7 6 1
mask:      0 0 0 1 0 1 1 0 1 0
                 ↑   ↑ ↑ ↑
  to-be-deleted ─┴───┴─┘ │
                   ┌─────┘
                   ↓
(~mask)/p: 1 7 4 7 7 1

We saw a similar situation when we were extracting trees from forests, where parent pointers were offset too far. In that instance, we found the offsets to correct by with a +\ on the mask of deleted nodes. We can use the same technique here, but for completeness, we’re going to show another way[1].

Firstly, let’s find the indices of all the nodes being deleted.

mask
3 5 6 8

These indices define intervals, within which we know how many to-be-deleted nodes have appeared. For example, we know that 3 such nodes appear before node 7.

  1+(mask)7    ⍝ 3 to-be-deleted nodes appear before node 7
⍝ ││└───────┴───── in which interval does node 7 appear
⍝ └┴────────────── using ⎕IO←0 means we have to make this adjustment
3

This tells us exactly what to subtract in order to correct each parent pointer.

q(~mask)/p          ⍝ delete unwanted nodes
  1+(mask)q        ⍝ deleted nodes appearing before each parent
q-1+(mask)q        ⍝ correcting the pointers
0 3 1 3 3 0
1 4 3 4 4 1

This correctly removes the whole sub-tree descending from node 3.

((~mask)/⍳≢p) PPV q-1+(mask)q
┌─────┐
│  7  │
│ ┌┴─┐│
│ 1  4│
│┌┴┐ ││
│0 9 2│
└─────┘

We can put this all together into a tacit idiom we can easily reuse.

    (mask)(⊢-1+⍸)(~mask)/p
PPV (mask)(⊢-1+⍸)(~mask)/p
1 4 3 4 4 1
┌─────┐
│  ∘  │
│ ┌┴─┐│
│ ∘  ∘│
│┌┴┐ ││
│∘ ∘ ∘│
└─────┘