{ $description "Literal syntax for a crit-bit tree." } ;
HELP:<cb>
{ $values { "tree" cb } }
{ $description "Creates an empty crit-bit tree" } ;
HELP:>cb
{ $values { "assoc"assoc } { "tree" cb } }
{ $description "Converts any " { $link assoc } " into a crit-bit tree. If the input assoc is a " { $link cb } ", the elements are cloned before insertion. The resulting tree is, then, identical to the input, as crit-bit trees are unique for any given content." } ;
HELP:cb
{ $class-description "This is the class for binary crit-bit trees (i.e. discriminating on a single critical bit)." } ;
{ $description "Converts a key, which can be any " { $link object } ", into a " { $link byte-array } ". Standard methods convert strings into its " { $link utf8 } " byte sequences and " { $link float } " values into byte arrays representing machine-specific doubles. Integrals are converted into a byte sequence of at least machine word size in little endian byte order."
"All other objects are serialized using " { $link object>bytes } ". In the standard implementation, this maps " { $link f } " to the byte array " { $snippet "B{ 110 }" } " and " { $link t } " to " { $snippet "B{ 116 }" } ", which is identical to the respective integers." } ;
"The " { $vocab-link "trees.cb" } " vocabulary is a library for binary critical bit trees, a variant of PATRICIA tries. A crit-bit tree stores each element of a non-empty set of keys " { $snippet "K" } " in a leaf node. Each leaf node is attached to the tree of internal split nodes for bit strings " { $snippet "x" } " such that " { $snippet "x0" } " and " { $snippet "x1" } " are prefixes of (serialized byte arrays of) elements in " { $snippet "K" } " and ancestors of other bit strings higher up in the tree. Split nodes store the prefix compressed as two values, the byte number and bit position, in the subset of " { $snippet "K" } " at which the prefixes of all ancestors to the left differ from all ancestors to the right."
"Serialization of keys is implemented using " { $link key>bytes } ". Crit-bit trees can store arbitrary keys and values, even mixed (but see implementation notes to " { $link key>bytes* } "). Due to the nature of crit-bit trees, for any given input key set that shares a common prefix, the tree compresses the common prefix into the split node at the root extending the lookup by one for arbitrary long prefixes."
"Keys are serialized once for every lookup and insertion not adding a new leaf node. Two keys are serialized for every insertion adding a new leaf node to the tree."
"Due to ordering ancestors at split nodes into crit-bit '0' (left) and crit-bit '1' (right), the order of the elements in a crit-bit tree is total allowing efficient suffix searches and minimum searches."
"Crit-bit trees consume 2 * " { $emphasis "n" } " - 1 nodes in total for storing " { $emphasis "n" } " elements; each internal split node consumes two pointers and two fixnums; each leaf node two pointers to the key and value. Their shape is unique for any given set of keys, which also means lookup times are deterministic for a known set of keys regardless of insertion order or the tree having been cloned."
"Compared to hash tables, crit-bit trees provide fast access without being prone to malicious input (but see limitations of the standard implementation of " { $link key>bytes* } ") and also provide ordered operations (e.g. finding minimums). Compared to heaps, they support exact searches and suffix searches in addition. Compared to other ordered trees (AVL, B-), they support the same set of operations while keeping a simpler inner structure."