It's been a long time since i've implemented binary tree traversal functions myself.. But this time I need something to visit all nodes of a tree (in any order.. it does not matter).
For binary trees, does such a tree traversal algorithm exist, that visits all nodes without recursing, and without getting rid of recursion by adding a stack? I don't know the tree-depth, and I don't feel like calling malloc/free in case I run out of memory.
Optimal would be a traversal that implements a next/prev relationship between nodes, so I could call node = tree.next(node) or something like this.
Nils
Visit all nodes in a tree without recursion...
Started by Nils Pipenbrinck, Apr 24 2006 07:23 PM
8 replies to this topic
#1
Posted 24 April 2006 - 07:23 PM
#2
Posted 24 April 2006 - 09:10 PM
Hey Nils! :)
You need some sort of state. This state might be implied by the current node if there's an ordering in the tree, but I don't feel like solving that problem. ;) A viable algorithm might be like this (untested):
You need some sort of state. This state might be implied by the current node if there's an ordering in the tree, but I don't feel like solving that problem. ;) A viable algorithm might be like this (untested):
struct tree
{
tree *left, *right, *parent;
};
void traverse(tree *t)
{
bool bproc = true;
bool bright = true;
while (t)
{
if (bproc)
// process node t
// after processing the node, always try the left side
if (bproc && t->left)
t = t->left;
// if allowed, go down the right side
else if (bright && t->right)
{
// we are going down the hierarchy so process next node
bproc = true;
t = t->right;
}
else
{
// backtrack to parent, don't process the node nor the left branch
bproc = false;
// only process the right branch of the parent if we're not it already
// check if we're backtracking from the root (i.e. end of traversal)
bright = (t->parent && t != t->parent->right);
// perform the backtrack
t = t->parent;
}
}
}
Building an iterator form of that is left as an exercise. :) You can use a single (N+1) state variable (or a last-visited node pointer when backtracking, NULL when going deeper) instead of my two booleans, where N is the number of children in each node, but I used the two booleans for clarity (hah!).
#3
Posted 24 April 2006 - 09:58 PM
Hi Jare (long time no see!). Remember that surprising evening at Rosis Bar in Hamburg, or is the memory faded?
Anyways, thanks for the code. Looks logical and simple.. Also I have no parent pointer at the moment it looks like I can at adapt that idea and use a fixed depth stack for the first 16 levels, then (if nessesary) do the rest recursively. In 99.5% of my calls the tree will be balanced anyways, so traversals will be fast and I don't have a need to realloc. (at the moment I do my entire traversals recursively, and it has a measurable overhead compared to an iterative solution)
I'm working with splay-trees, so parent pointers will have significant speed overhead. Especially since i'm busy deleting and inserting items.. full tree traversals are maybe 15% of my tree-operations.
Nils
Anyways, thanks for the code. Looks logical and simple.. Also I have no parent pointer at the moment it looks like I can at adapt that idea and use a fixed depth stack for the first 16 levels, then (if nessesary) do the rest recursively. In 99.5% of my calls the tree will be balanced anyways, so traversals will be fast and I don't have a need to realloc. (at the moment I do my entire traversals recursively, and it has a measurable overhead compared to an iterative solution)
I'm working with splay-trees, so parent pointers will have significant speed overhead. Especially since i'm busy deleting and inserting items.. full tree traversals are maybe 15% of my tree-operations.
Nils
#4
Posted 24 April 2006 - 10:28 PM
Any sensible tree iterator implementation does exactly that (of course, the only way to achieve this without a stack of visited parents is when every node has a reference to it's parent), like C++'s std::map::iterator.
For an in-order traversal, a node that precedes it's parent is always on the left side. So the first node in the tree can be found by traveling from the root to the left, until you reach a node that doesn't have a left child.
As any node N splits the tree in a left-side that precedes N and a right-side that proceeds N, the former rule implies that the node that follows N is the first node of the right subtree of N (traverse 1 to the right, then go left until you can't go any further).
Of course, if N doesn't have a right child you reached the end of the subtree and you need to go upward. Here, you must track where you come from. If N is on the left-side of it's parent P, you reached the end of the left subtree of P and P is therefore the next node to report. However, if you N is the right child of P, then you've already reported P earlier and you need to go upward even further, repeating this process until you reach a node where the child from whence you came is on it's left side. If you reach the root node when coming from the right, there are no more nodes to report.
When converting this to sourcecode you'll get something similar as Jare's. Keep in mind that you effectively only need a single pointer to the current node to get to the next one (or, say, a next() function that accepts the current node as an argument and returns the next node, or null if there aren't any). Jare is using more variables because he implemented the left-subtree-recursion and the backtracking as single steps in the main loop.
For an in-order traversal, a node that precedes it's parent is always on the left side. So the first node in the tree can be found by traveling from the root to the left, until you reach a node that doesn't have a left child.
As any node N splits the tree in a left-side that precedes N and a right-side that proceeds N, the former rule implies that the node that follows N is the first node of the right subtree of N (traverse 1 to the right, then go left until you can't go any further).
Of course, if N doesn't have a right child you reached the end of the subtree and you need to go upward. Here, you must track where you come from. If N is on the left-side of it's parent P, you reached the end of the left subtree of P and P is therefore the next node to report. However, if you N is the right child of P, then you've already reported P earlier and you need to go upward even further, repeating this process until you reach a node where the child from whence you came is on it's left side. If you reach the root node when coming from the right, there are no more nodes to report.
When converting this to sourcecode you'll get something similar as Jare's. Keep in mind that you effectively only need a single pointer to the current node to get to the next one (or, say, a next() function that accepts the current node as an argument and returns the next node, or null if there aren't any). Jare is using more variables because he implemented the left-subtree-recursion and the backtracking as single steps in the main loop.
C++ addict
-
Currently working on: the 3D engine for Tomb Raider.
-
Currently working on: the 3D engine for Tomb Raider.
#5
Posted 24 April 2006 - 10:52 PM
Hah Nils, beer ensured that memory didn't survive the next morning, but I took enough pics to discredit Marco and us forever. :)
I think once you see the code, you quickly get the idea that you only need the current node to find the next one. Yeah, parent pointers are soooo convenient! If you can't have up-to-date parent pointers after splay operations, but can afford the space in each node, you can build them on the fly as you iterate. In effect, you are allocating space for the recursion stack as part of the nodes, but necessarily wasting space (you allocate N nodes but only need a logN size stack).
Ahh lovely data structures.
I think once you see the code, you quickly get the idea that you only need the current node to find the next one. Yeah, parent pointers are soooo convenient! If you can't have up-to-date parent pointers after splay operations, but can afford the space in each node, you can build them on the fly as you iterate. In effect, you are allocating space for the recursion stack as part of the nodes, but necessarily wasting space (you allocate N nodes but only need a logN size stack).
Ahh lovely data structures.
#6
Posted 25 April 2006 - 09:02 AM
I found a good solution:
I have a 16 entry stack to store where I came from. When I reach this limit, I splay the current note up to the root. This will half the treedepth along the path taken, so it will only trigger in degenerate cases, where a re-balancing might be a good idea anyways. If I do so I just throw away my current result and start all over again.
This clearly only works with up to 2^16 nodes, but since i only have an expect a maximum of 60 nodes, and always less than 2^16 this does not only solve my problem but speeds up further tree operations for free.
I have a 16 entry stack to store where I came from. When I reach this limit, I splay the current note up to the root. This will half the treedepth along the path taken, so it will only trigger in degenerate cases, where a re-balancing might be a good idea anyways. If I do so I just throw away my current result and start all over again.
This clearly only works with up to 2^16 nodes, but since i only have an expect a maximum of 60 nodes, and always less than 2^16 this does not only solve my problem but speeds up further tree operations for free.
#7
Posted 26 April 2006 - 02:53 PM
I'm gonna have a crack at this as well, feel free to laugh in my face. :happy:
I solved the same kind of problem with a separate linked list of node-pointers. This coupled with a counter for the number of nodes in the tree and a pointer in the node to the entry in the linked list.
The list would, naturally, need to be updated with deletes and adds as well as the tree and the node counter.
So, would that be a workable solution for you?
My tree was static, so I actually didn't need the extra info as described above, but it could work, couldn't it?
I solved the same kind of problem with a separate linked list of node-pointers. This coupled with a counter for the number of nodes in the tree and a pointer in the node to the entry in the linked list.
The list would, naturally, need to be updated with deletes and adds as well as the tree and the node counter.
So, would that be a workable solution for you?
My tree was static, so I actually didn't need the extra info as described above, but it could work, couldn't it?
#8
Posted 29 April 2006 - 04:43 PM
mice said:
I solved the same kind of problem with a separate linked list of node-pointers. This coupled with a counter for the number of nodes in the tree and a pointer in the node to the entry in the linked list.
The list would, naturally, need to be updated with deletes and adds as well as the tree and the node counter.
So, would that be a workable solution for you?
My tree was static, so I actually didn't need the extra info as described above, but it could work, couldn't it?
The list would, naturally, need to be updated with deletes and adds as well as the tree and the node counter.
So, would that be a workable solution for you?
My tree was static, so I actually didn't need the extra info as described above, but it could work, couldn't it?
Hi mice,
That was my first idea as well, but my tree is not static (in fact it's highly dynamic, I have more inserts/deletes than traversals or lookups. Adding an additional list would be kinda ok for insert, but degrade the log(n) performance for deletes quite a bit. or I switch to a double linked list, but that would in turn take the size of my nodes over a cacheline border, and that would degrate the performance even more.
That said, I'm currently quite happy with the tree code. I'm using it in a polygon tesselation code, did a hardcore test the day before, and I'm very satisfied with the performance results.
I tesselated a high poly hilbert curve, and my tree holds for each vertex all edges to the left of it. A structure that evolves while I traverse the vertices, so each time I visit a vertex multiple inserts/deletes can happen.
You bet, that's a shitload of inserts/deletes that I have to perform.
Here's a picture of my tesselated hilbert curve. I colored each triangle individually, so one can see the topology (not that it has anything to do with trees, but... it looks so nice).

Nils
#9
Posted 09 May 2006 - 09:25 AM
I believe your best option would be to use a threaded tree, either right-threaded, or full-threaded, depending on whether you need traversal in both directions.
Google Threaded-Trees for more info.
Alternatively there's using parent pointers.
Lastly, if you don't need to maintain the exact tree structure, you can first flatten the tree by performing right rotations every time the left node is non-NULL during traversal, and then rebalance the tree afterwards, which is quite easy once it's fully flattened. Not as efficient as simply using a recursive function, or maintaining an explicit stack, or the options I mention above for that matter, but all the options I mention here can be done without recusrion or an explicit stack.
Google Threaded-Trees for more info.
Alternatively there's using parent pointers.
Lastly, if you don't need to maintain the exact tree structure, you can first flatten the tree by performing right rotations every time the left node is non-NULL during traversal, and then rebalance the tree afterwards, which is quite easy once it's fully flattened. Not as efficient as simply using a recursive function, or maintaining an explicit stack, or the options I mention above for that matter, but all the options I mention here can be done without recusrion or an explicit stack.
1 user(s) are reading this topic
0 members, 1 guests, 0 anonymous users












