Deleting Nodes#
One fundamental tree operation we haven’t covered yet is deleting nodes.
We saw previously a way to select sub-trees of a tree with a mask. For instance given the following tree:
p←1 7 4 1 7 6 3 7 6 1
(⍳≢p) PPV p
┌─────────┐ │ 7 │ │ ┌┴───┐│ │ 1 4│ │┌──┼──┐ ││ │0 3 9 2│ │ │ │ │ 6 │ │ ┌┴┐ │ │ 5 8 │ └─────────┘
Say we want to delete node
⊢mask←3=p I@{⍵≠3}⍣≡⍳≢p
mask PPV p
0 0 0 1 0 1 1 0 1 0
┌─────────┐ │ 0 │ │ ┌┴───┐│ │ 0 0│ │┌──┼──┐ ││ │0 1 0 0│ │ │ │ │ 1 │ │ ┌┴┐ │ │ 1 1 │ └─────────┘
The first step will be to remove node
(~mask)/p
1 7 4 7 7 1
Like so many times before, we’re now left with parent pointers which need correcting. In the original p
, a certain number of to-be-deleted nodes may have appeared before each node which was not deleted. After deleting the sub-tree, each node is therefore that many places farther back in the vector, and we need to correct the pointers to reflect that.
For instance, there are p
, so when the nodes are removed, node
index: 0 1 2 3 4 5 6 7 8 9
p: 1 7 4 1 7 6 3 7 6 1
mask: 0 0 0 1 0 1 1 0 1 0
↑ ↑ ↑ ↑
to-be-deleted ─┴───┴─┘ │
┌─────┘
↓
(~mask)/p: 1 7 4 7 7 1
We saw a similar situation when we were extracting trees from forests, where parent pointers were offset too far. In that instance, we found the offsets to correct by with a +\
on the mask of deleted nodes. We can use the same technique here, but for completeness, we’re going to show another way[1].
Firstly, let’s find the indices of all the nodes being deleted.
⍸mask
3 5 6 8
These indices define intervals, within which we know how many to-be-deleted nodes have appeared. For example, we know that
1+(⍸mask)⍸7 ⍝ 3 to-be-deleted nodes appear before node 7
⍝ ││└───────┴───── in which interval does node 7 appear
⍝ └┴────────────── using ⎕IO←0 means we have to make this adjustment
3
This tells us exactly what to subtract in order to correct each parent pointer.
q←(~mask)/p ⍝ delete unwanted nodes
1+(⍸mask)⍸q ⍝ deleted nodes appearing before each parent
q-1+(⍸mask)⍸q ⍝ correcting the pointers
0 3 1 3 3 0
1 4 3 4 4 1
This correctly removes the whole sub-tree descending from node
((~mask)/⍳≢p) PPV q-1+(⍸mask)⍸q
┌─────┐ │ 7 │ │ ┌┴─┐│ │ 1 4│ │┌┴┐ ││ │0 9 2│ └─────┘
We can put this all together into a tacit idiom we can easily reuse.
(⍸mask)(⊢-1+⍸)(~mask)/p
PPV (⍸mask)(⊢-1+⍸)(~mask)/p
1 4 3 4 4 1
┌─────┐ │ ∘ │ │ ┌┴─┐│ │ ∘ ∘│ │┌┴┐ ││ │∘ ∘ ∘│ └─────┘