Information on Binary Trees

A binary tree is often defined recursively as either (a) being empty, or (b) consisting of a root node together with left and right subtrees, both of which are binary trees. It is this definition that is most useful from the point of view of data structures in computer science. From the mathematical point of view these objects are not trees. Changing (a) from "empty" to "a single vertex" gives a definition equivalent to that of ordered trees in which every node is either a leaf (no children) or has two children. These are sometimes called extended binary trees. In either case, they are usually parameterized by n, the number of applications of rule (b) that are used.

In the extended binary tree illustrated above there are 5 internal nodes (the brown circles, which correspond to applications of rule (b)) and 6 leaves (the blue squares, which correspond to applications of rule (a)). By removing the blue edges and the leaves you obtain the corresponding binary tree.

The number of extended binary trees with n internal nodes is given by the nth Catalan number (2n)!/[(n+1)!n!] which, for n = 1,2,...,15, has the values 1, 2, 5, 14, 42, 132, 429, 1430, 4862, 16796, 58786, 208012, 742900, 2674440, 9694854. This is sequence Anum=A000100"> A000100(M1459) in

The number of extended binary trees with n internal nodes and leftmost leaf at level m is m(2n-m-1)!/[(n-m)!n!]. The table of these numbers, shown below, is known as "Catalan's triangle" if the rows are read backwards. This is sequence Anum=A033184"> A033184 in

m =  1 2 3 4  5 6 7 8
n=1   1
2 1 1
3 2 2 1
4 5 5 3 1
5 14 14 9 4 1
6 42 42 28 14 5 1
7 132 132 90 48 20 6 1
8    429 429 297 165 75 27 7 1

Binary trees with n nodes are often represented as bitstrings b1 b2 ... b2n obtained by traversing the equivalent extended binary tree in preorder, recording a 1 as each internal node is encountered, and a 0 each time a leaf is encountered, except for the final leaf, which is ignored. The position of the leftmost 0 is one greater than the level of the leftmost leaf (with the root at level 0). Thus for the tree illustrated above the bitstring is 1101001010. Every such bitstring corresponds to a well-formed parentheses string, obtained by regarding 1 a left paren and 0 as a right paren. Our bitstring is transformed into (()())()(). If "alternate output" is selected, then the bitstring is transformed into balls, black for an 1 and white for a 0. For our example, the output is .

Gray codes

Swapping a parentheses pair

If Gray code order is selected then each tree sequence differs from its predecessor by the transposition of two bits. It is the position of these two bits that are output if Gray code output is selected.

By a rotation

If generation by rotations is selected, then each successive tree differs by a rotation from its predecessor. A rotation is illustrated in the animation on the left, which alternately rotates the tree about the blue edge; left, right, left, right, etc. The relative order of the three subtrees is maintained but the root of the tree changes as illustrated. Rotations are used to maintain balance in binary search trees. There is an interesting correspondence between rotations and diagonal flips in triangulations of convex polygons.

By swapping adjacent parentheses pairs

Well-formed parentheses cannot be generated by swapping adjacent pairs unless n is even. However, if one or two adjacent swaps are allowed then such generation is possible by using the correspondence between linear extensions of the poset whose cover graph is the 2 by n grid tilted by 45 degrees, and well-formed parentheses with n left and n right parentheses. There is a Gray code for linear extensions in which each successive extension differs by one or two transpositions from its predecessor on the list. More information can be found on the linear extensions information page. Note that two adjacent swaps can result in a string that differs by the tranposition of a pair that are two positions apart.

Ordered Trees

An ordered tree is a rooted tree in which the relative order of subtrees matters. An ordered forest is an ordered collection of ordered trees. There is a one-to-one correspondence between ordered forests with n nodes and binary trees with n nodes. This correspondence is best explained using the well-formed parentheses strings with n left and n right parentheses. If the n left parentheses and n right parentheses are labelled 1 through n from left to right in the string, then there are n pairs of matching left/right pairs of parentheses. These pairs correspond to the nodes of the corresponding ordered forest. Listing the pairs in increasing order of the left parentheses corresponds to a preorder listing of the nodes of the forest and listing the pairs in increasing order of the right parentheses corresponds to a postorder listing of the nodes of the forest.

For the parentheses string mentioned above, (()())()(), the pairs of matching left/right pairs are 1/3, 2/1, 3/2, 4/4, 5/5. The corresponding binary forest is shown below, with the numbers to the left of each node inicating the preorder index and the numbers to the right of each node indicating the postorder index. The parentheses that correspond to each number are also shown to the left/right of each node.

Thus the algorithms for generating binary trees may also be thought of as algorithms for generating ordered forests.

Related Links:

The lex order algorithm was developed independently by several authors and is periodically re-discovered! The Gray code algorithm used is from the paper: F.Ruskey and A.Proskurowski, Generating Binary Trees by Transpositions, J.Algorithms, 11 (1990) 68-84. The algorithm for generating binary trees by rotations is from the paper: J. Lucas, D. Roelants van Baronaigien, and F. Ruskey, On Rotations and the Generation of Binary Trees, J. Algorithms, 15 (1993) 343-366.

Programs available:

It was last modified on .

The animations will not appear on a browser that does not support the GIF89a specification. They should appear on Netscape 2.0b5 and successors.