@@ -40,10 +40,15 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT;
 
 /// An ordered map based on a [B-Tree].
 ///
-/// A B-tree resembles a [binary search tree], but each leaf (node) contains
-/// an entire array (of unspecified size) of elements, instead of just a single element.
-/// A search first traverses the tree structure to find, in logarithmic time, the correct leaf.
-/// This leaf is then searched linearly, which is very fast on modern hardware.
+/// An ordered map is a map in which the keys are totally ordered.
+/// That means that keys must be of a type that implements the [`Ord`] trait,
+/// such that two keys can always be compared to determine their [`Ordering`].
+/// Examples of totally ordered keys are strings with lexicographical order,
+/// and numbers with their natural order.
+///
+/// Iterators obtained from functions such as [`BTreeMap::iter`], [`BTreeMap::into_iter`], [`BTreeMap::values`], or
+/// [`BTreeMap::keys`] produce their items in key order, and take worst-case logarithmic and
+/// amortized constant time per item returned.
 ///
 /// It is a logic error for a key to be modified in such a way that the key's ordering relative to
 /// any other key, as determined by the [`Ord`] trait, changes while it is in the map. This is
@@ -52,15 +57,6 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT;
 /// `BTreeMap` that observed the logic error and not result in undefined behavior. This could
 /// include panics, incorrect results, aborts, memory leaks, and non-termination.
 ///
-/// Iterators obtained from functions such as [`BTreeMap::iter`], [`BTreeMap::into_iter`], [`BTreeMap::values`], or
-/// [`BTreeMap::keys`] produce their items in order by key, and take worst-case logarithmic and
-/// amortized constant time per item returned.
-///
-/// [B-Tree]: https://en.wikipedia.org/wiki/B-tree
-/// [binary search tree]: https://en.wikipedia.org/wiki/Binary_search_tree
-/// [`Cell`]: core::cell::Cell
-/// [`RefCell`]: core::cell::RefCell
-///
 /// # Examples
 ///
 /// ```
@@ -148,6 +144,42 @@ pub(super) const MIN_LEN: usize = node::MIN_LEN_AFTER_SPLIT;
 /// // modify an entry before an insert with in-place mutation
 /// player_stats.entry("mana").and_modify(|mana| *mana += 200).or_insert(100);
 /// ```
+///
+/// # Background
+///
+/// A B-Tree is like a [binary search tree], but adapted to the granularity at which modern machines prefer to consume data.
+/// This means that each node contains an entire array of elements, instead of just a single element.
+///
+/// B-Trees represent a fundamental compromise between cache-efficiency and actually minimizing
+/// the amount of work performed in a search. In theory, a binary search tree (BST) is the optimal
+/// choice for a sorted map, as a perfectly balanced BST performs the theoretical minimum number of
+/// comparisons necessary to find an element (log<sub>2</sub>n). However, in practice the way this
+/// is done is *very* inefficient for modern computer architectures. In particular, every element
+/// is stored in its own individually heap-allocated node. This means that every single insertion
+/// triggers a heap-allocation, and for every comparison a node needs to be loaded,
+/// which could result in a cache miss. Since both heap-allocations and cache-misses
+/// are notably expensive in practice, we are forced to, at the very least,
+/// reconsider the BST strategy.
+///
+/// A B-Tree instead makes each node contain B-1 to 2B-1 elements in a contiguous array. By doing
+/// this, we reduce the number of allocations by a factor of B, and improve cache efficiency in
+/// searches. However, this does mean that searches will have to do *more* comparisons on average.
+/// The precise number of comparisons depends on the node search strategy used. For optimal cache
+/// efficiency, one could search the nodes linearly. For optimal comparisons, one could search
+/// the node using binary search. As a compromise, one could also perform a linear search
+/// that initially only checks every i<sup>th</sup> element for some choice of i.
+///
+/// Currently, our implementation simply performs naive linear search. This provides excellent
+/// performance on *small* nodes of elements which are cheap to compare. However, in the future we
+/// would like to further explore choosing the optimal search strategy based on the choice of B,
+/// and possibly other factors. Using linear search, searching for a random element is expected
+/// to take B * log(n) comparisons, which is generally worse than a BST. In practice,
+/// however, performance is excellent.
+///
+/// [B-Tree]: https://en.wikipedia.org/wiki/B-tree
+/// [binary search tree]: https://en.wikipedia.org/wiki/Binary_search_tree
+/// [`Cell`]: core::cell::Cell
+/// [`RefCell`]: core::cell::RefCell
 #[stable(feature = "rust1", since = "1.0.0")]
 #[cfg_attr(not(test), rustc_diagnostic_item = "BTreeMap")]
 #[rustc_insignificant_dtor]
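
The new introductory paragraphs say that keys must implement `Ord` and that iterators yield items in key order. A minimal sketch of what that means in practice follows; the helper `keys_in_order` and the sample data are hypothetical and exist only for this illustration, they are not part of the diff.

```rust
use std::collections::BTreeMap;

/// Collects the keys of a map in iteration order.
/// (Hypothetical helper for this sketch only.)
fn keys_in_order(map: &BTreeMap<&'static str, u32>) -> Vec<&'static str> {
    map.keys().copied().collect()
}

fn main() {
    let mut stock = BTreeMap::new();
    // Insert in an order that is deliberately not sorted.
    stock.insert("pear", 3);
    stock.insert("apple", 7);
    stock.insert("fig", 1);

    // Iteration follows the keys' `Ord` order (lexicographic for strings),
    // not the insertion order.
    assert_eq!(keys_in_order(&stock), ["apple", "fig", "pear"]);

    // `values` walks the map in the same key order.
    let values: Vec<u32> = stock.values().copied().collect();
    assert_eq!(values, [7, 1, 3]);
}
```

Because the order comes from `Ord` alone, inserting the same entries in any other sequence should give the same iteration result.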
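
The Background section contrasts linear and binary search within a single node. The sketch below counts comparisons for both strategies on one sorted slice standing in for a node. It is not the standard library's internal node code: the functions `linear_search` and `binary_search`, the `u32` keys, and the node size of 11 (2B-1 with B = 6, chosen for illustration) are all assumptions of this example.

```rust
use std::cmp::Ordering;

/// Linear scan over one sorted node; returns (result, comparisons made).
/// `Ok(i)` means the key is at index `i`; `Err(i)` means it is absent and
/// would belong at index `i` (in a real tree, the child edge to follow).
fn linear_search(node: &[u32], key: u32) -> (Result<usize, usize>, usize) {
    let mut comparisons = 0;
    for (i, k) in node.iter().enumerate() {
        comparisons += 1;
        match key.cmp(k) {
            Ordering::Greater => {}
            Ordering::Equal => return (Ok(i), comparisons),
            Ordering::Less => return (Err(i), comparisons),
        }
    }
    (Err(node.len()), comparisons)
}

/// Binary search over the same node, counting comparisons via the closure.
fn binary_search(node: &[u32], key: u32) -> (Result<usize, usize>, usize) {
    let mut comparisons = 0;
    let result = node.binary_search_by(|probe| {
        comparisons += 1;
        probe.cmp(&key)
    });
    (result, comparisons)
}

fn main() {
    // A hypothetical full node holding 11 keys: 0, 10, ..., 100.
    let node: Vec<u32> = (0..11).map(|i| i * 10).collect();

    // Worst case for the linear scan: the key is the last element.
    let (lin, lin_cmp) = linear_search(&node, 100);
    let (bin, bin_cmp) = binary_search(&node, 100);
    assert_eq!(lin, Ok(10));
    assert_eq!(bin, Ok(10));
    assert_eq!(lin_cmp, 11);
    assert!(bin_cmp <= lin_cmp);
    println!("linear: {lin_cmp} comparisons, binary: {bin_cmp} comparisons");
}
```

The linear scan performs more comparisons in the worst case but touches memory strictly sequentially, which is the trade-off the doc comment describes for small nodes with cheap comparisons.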