! Copyright (C) 2020 Michael Raitza ! See http://factorcode.org/license.txt for BSD license. ! ! * Crit-bit trees ! ** Rationale ! Critbit trees are described in [[https://cr.yp.to/critbit.html][djb's crit-bit tree]]. They are an evolution of ! PATRICIA trees showing that fast insertion, deletion, exact searching and suffix ! searching is possible with this data structure. ! The strength of this data structure, according to its author, lies in its ! simple design and its optimisation to be machine parsable using machine ! word-sized operations where possible. Like PATRICIA trees, crit-bit trees are ! prefix-compressed, with internal nodes storing next decision point (the ! critical bit) in a length field (encoded as an integer and a mask) and two ! successor pointers. Arbitrary data objects make up its leaves. USING: accessors arrays assocs fry kernel math namespaces sequences serialize trees trees.private ; IN: trees.cb TUPLE: cb < tree ; : ( -- tree ) cb new-tree ; inline >bits swap >>byte# ; inline : ( byte# bits -- node ) cb-node new-node ; ! -1 = left ! 1 = right : key-side ( bits byte -- side ) bitor 1 + -8 shift 0 = -1 1 ? ; ! Produce a byte with all bits set except the msb from bits*. ! See MAGIC Algorithms for rationale. : msb0 ( bits* -- bits ) dup -1 shift bitor dup -2 shift bitor dup -4 shift bitor dup -1 shift bitnot bitand 255 bitxor ; ! Calculate the direction and the critical bit for the differing byte. : byte-diff ( newbyte oldbyte -- side bits ) swap over bitxor msb0 [ key-side ] keep ; ! For two byte strings, calculate the critical bit, byte and direction of ! difference (0 = left, 1 = right). : bytes-diff ( newbytes oldbytes -- side bits byte# ) 2dup mismatch [ [ '[ _ swap nth ] bi@ byte-diff ] keep ] [ ! Equal prefix over full (shorter) byte sequence. [ 1 255 ] 2dip shorter length 1 - ] if* ; ! Keep the byte sequence of the current key in =key-bytes= and provide a working ! environment for it with =with-key=. SYMBOL: key-bytes SYMBOL: current-key : key>bytes ( key -- bytes ) object>bytes ; : with-key ( key quot -- ) [ { current-key key-bytes } swap dup key>bytes 2array zip ] dip with-variables ; inline ! Extract the critical byte : byte-at ( byte# -- byte/0 ) key-bytes get ?nth [ 0 ] unless* ; ! ** Insertion ! ! Tree insertion must be done by traversing the tree from the root, as it is ! ordered. ! Insert a node at a leaf-node by first inserting a split node. : cb-insert ( value leaf-node -- node created? ) [ current-key get swap ] dip dup key>> key>bytes key-bytes get swap bytes-diff swap ! ( new-leaf old-leaf side cb-node -- ) [ swap [ [ set-node-link ] keep set-node+link ] with-side ] keep t ; DEFER: cb-set ! Insert a node into a non-empty tree. : (cb-set) ( value node -- node created? ) dup cb-node? [ dup [ bits>> ] [ byte#>> ] bi byte-at key-side [ [ node-link cb-set ] keep swap [ [ set-node-link ] keep ] dip ] with-side ] [ cb-insert ] if ; ! Insert a node into the tree : cb-set ( value node -- node created? ) [ (cb-set) ] [ current-key get swap t ] if* ; PRIVATE> M: cb set-at ( value key cb -- ) [ swap [ cb-set ] with-key swap ] change-root swap [ dup inc-count ] when drop ;