dranges.recursive

This modules explores the possibility of 'recursive' or 'branching' ranges, like trees or graphs. It a work-in-progress, as I'm not quite sure of the semantics yet (empty/hasValue..., should filter conserve the entire topology?)

Ranges:

A range defines empty/front/popFront as the standard way to iterate, possibly completed by some other functions. Given an element as front, there no ambiguity as to where to go next: there is only one next element, giving rise to a linear range of values (hence the name). You could instead use other methods, like hasNext/next (next popping the front and returning the successor). You can complete it with front, if you want to cache the current element to access it many times (calling next many times would advance the range).

Recursive (or Branching) Ranges:

A recursive range is somewhat similar, but based on a tree: given an recursive range, the presence of a valid node is given by hasValue. If OK, the current node's value is returned by a method named value. The successors method then returns a range of children, each of these children being itself a recursive range of the same type. This range of children is traditionally an array, but it could be lazy and even infinite. If successors returns an empty range, then the current node is a sink in the graph, or leaf in tree parlance. Note that the classical end condition for algorithms crawling trees or graphs is to test for leaves, not for empty. Indeed, empty is useful only at the very beginning to test for the validity of a recursive range or after filtering a recursive range (see rfilter).

There is no popFront equivalent: there is no natural next element, though depth-first (pre- or post-order) and breadth-first iterations are standard. You will find here traversal functions called depthFirst and breadthFirst, that constructs a linear range from a recursive range. That way, you can use all the standard linear algorithms on a recursive range.

These methods hasValue/value/successors are what defines an input recursive range. There is a corresponding template, isInputRecursiveRange. From now on, I'll use r-range as a shorter term for 'recursive range'.

Given a non-empty r-range, a method isSink is trivial:
bool isSink() { return successors.empty;}.
It's good practice to expose it in a r-range, as it's the terminating condition for many algorithms and is linked to infinite depth (see below). Maybe it should be part of a r-range definition?

Forward Recursive Ranges:

If you can copy the r-range into a variable, to store its state and restore it later, you have the equivalent of a forward range, here named a forward recursive range, with its companion template isForwardRecursiveRange.

Infinity:

These simple functions give rise to two kinds of infinity: infinite breadth if successors returns an infinite range, infinite height if there is no leaf in the range: the range returned by successors always have at least one element. As some part of the range may be finite, while other parts are infinite (a situation not possible for an infinite linear range), this is somewhat complicated. As such, the template hasInfiniteBreadth is global: it alias itself to true iff successors returns an infinite range. This is local at first view, but since the return type of successors is defined in the node type, the infinity of this range is deinfed at creation, and so is the same for all nodes inside the r-range. Use hasInfiniteBreadth before iterating on the children of a node.

As for hasInfiniteHeight, it tests if isSink is statically false.

Length, depth:

There is no simple, generic length equivalent, though some recursive ranges could know the total number of their nodes accessible from the current node, with a method called size. As with linear ranges, a template hasSize tests for its presence, and a walkLength equivalent can be done. A r-range that's infinite in any direction (downward, upward, or sideways) cannot have a size method.

In parallel, a height method can give the max number of successors levels downstream. Its companion template is hasHeight. An infinite depth range cannot have a height method, in the same way than an infinite range has no valid length member.

A breadth method is not required, as it's the length of the successors range, if it has one. So hasBreadth just test for length member in successors' return type. Obviously, a r-range with infinite breadth has no valid breadth member.

Cycles:

Note that successors returns a range of nodes, so the current node can be in the list directly or somewhere downstream, thus creating a cycle. Graphs can contain cycles, but trees must not. I see no way to statically test for a r-range having cycles or not, and thus testing if a r-range is a graph or a tree (there is another condition to be a tree: its predecessors range must have a length of at most 1). This kind of condition must be enforced by factory functions.

Going Up, With Bidirectional Recursive Ranges:

Up to now, there has been no way to go upstream, to the root of the current node. A r-range can define a predecessors method returning a range of predecessors (nodes with the current node in their successors ranges). A node with no predecessor is a source (or root for a tree), tested with isSource/isRoot. The (untestable at compile-time) contract is that if a node n0 is in the ancestors range of a node n1, then n1 is in the successors range of n0.

A r-range defining predecessors is a bidirectional recursive range, its associated template being isBidirectionalRecursiveRange. There is no equivalent to back for a r-range, as there are many (potentially an infinity) of 'extremity values'. On the other hand, it's possible to code a generic leaves/sinks function that returns a range of all leaves accessible from a beginning node. The same for a roots/sources function going upstream: it's simply a filtering on the nodes.

In complement of height with the successors of a node, one can define depth as being the highest number of levels upstream before reaching a source. Also, we can imagine infinite height r-ranges, albeit this situation seems quite uncommon (there is nevertheless a hasInfiniteHeight template).

Bidirectional ranges can be given to retro which reverses the sense of iteration. What could be the name for a r-range? Maybe reverse or upsideDown?

Output Recursive Ranges:

Given a subjacent recursive container, a r-range can be used to update the values in the container, with a put method. For output ranges, put puts the value at the front and advances the range, preparing it for the next value. You cannot do exactly that with a r-range: put puts the value in the current node (in a range-dependant manner) and returns successors. A r-range defining put is an output recursive range, with isOutputRecursiveRange as a testing template.

What's Impossible:

A linear range gives rise to a natural indexing: number the elements, starting from 0 and adding 1 for each call to popFront. It's nicely parallel to the indexing of arrays and some ranges expose indexing capabilities and slicing, tested by the isRandomAccessRange and hasSlicing templates. No such natural indexing can be defined for recursive ranges, so we have no random-access recursive range nor slices on r-ranges. Maybe defining opIndex() could be interesting, as it's becoming a standard way to return a clone of a range.

And, as range do not (standardly) define an opIndex[key] like associative arrays do, there is no associative recursive range, though the idea could be tried, as associative containers are often implemented as trees. So, maybe there is something possibly interesting.

Topology:

All this machinery is for iterating on the r-range, accessing its topology, which is the shape encoded with successors/predecessors. The values accessible through the r-range are another matter: current returns a node, not a value. Typically a node will have a value member. In parallel with ElementType for ranges, we can then distinguish the templates NodeType and ValueType: NodeType alias itself to the node type of the r-range (well, duh) and ValueType is typeof(NodeType.value).

The situation is then, once again, different than for ranges. In a range the topology is implicit, only the values are exposed. That's what gives ranges their interest as abstractions for many different actions. R-ranges are quite another beast and are more 'container-like' than ranges (indeed, I'm still trying to see whether they are a good idea or not). But this separation between the shape and the values permits interesting actions because of the richer information accessible to iterating functions:
  • You can map on the number of children, numbering a node not with its depth/height but with the number of its successors.
  • You can map and conserve the global shape or lazily expose a new topology.
  • You can filter on the topology, keeping only leaves for example.
  • You can test if two r-ranges have the same shape.
  • You can zip together two r-ranges with the same shape.
  • If you collapse the structure using reduce, the result will depend on the order of your iteration.
Nodes:

So, should nodes be value or reference types? In D, structs cannot contain values of the same type, but can contain references to them. Arrays of Nodes inside a Node struct are possible, or pointers to Nodes. So, in the end, it's up to you.

Range of ranges:

Let's compare the situation with range of ranges: they also encode topology (albeit a simpler one), but are not recursive. At each level of a r-range, the node type is the same and the successors also: it's quite homogeneous. For ranges of ranges, each level has a different type. Nevertheless, similar actions can be done: you can zip together ranges of ranges with the same shape, you can project them onto a linear range, etc. (* OK, what else can be said? *).

Comparison with ranges:

The following table compares the functions for ranges and r-ranges.

Ranges Recursive ranges
isInputRange isInputRecursiveRange
isForwardRange isForwardRecursiveRange
isBidirectionalRange isBidirectionalRecursiveRange
isRandomAccessRange -
isOutputRange isOutputRecursiveRange
front value
popFront successors
empty hasValue/isSink
back -
popBack predecessors
- isSource
hasLength hasSize
length size
- height
- depth
- breadth
isInfinite -
- hasInfiniteBreadth
- hasInfiniteDepth
- hasInfiniteHeight




License:
Boost License 1.0.

Authors:
Philippe Sigaud

Distributed under the Boost Software License, Version 1.0. (See accompanying file LICENSE_1_0.txt or copy at http://www.boost.org/LICENSE_1_0.txt)

template isInputRecursiveRange (T)


template isForwardRecursiveRange (T)


template isBidirectionalRecursiveRange (T)


template isOutputRecursiveRange (T,E)


template ValueType (T) if (isInputRecursiveRange!(T))


template SuccessorsType (T) if (isInputRecursiveRange!(T))


template hasSize (T) if (isInputRecursiveRange!(T))


template hasInfiniteBreadth (T) if (isInputRecursiveRange!(T))


template hasInfiniteDepth (T) if (isInputRecursiveRange!(T))


template hasInfiniteHeight (T) if (isBidirectionalRecursiveRange!(T))


struct RMap (alias fun,RR) if (isForwardRecursiveRange!(RR));
template rmap (alias fun)
mapping functions on a r-range

typeof(unaryFun!(onSink)(ValueType!(RR).init)) rreduce (alias onSink, alias fun, RR)(RR rRange);
Reducing a recursive range

T sumTree (T, R)(T a, R b);


size_t heightTree (T)(T a, size_t[] b);


T maxTree (T)(T a, T[] b);


T[] preorder (T)(T a, T[][] b);


T[] postorder (T)(T a, T[][] b);


T[] leaves (T)(T a, T[][] b);


int numNodes (T)(T a, int[] nums);


Page was generated with on Fri Nov 12 11:55:12 2010