DragonFly kernel List (threaded) for 2008-06
Re: HAMMER Update 13-June-2008 - HEADS UP: MEDIA CHANGED AGAIN
:> I am not too familiar with Reiser so I can't really come to that
:> conclusion, but it doesn't seem likely that the reasons are similar.
:They used B+ in earlier versions. In Reiser4 they decided to use
I can never keep all the variations straight, I need a Wiki reference
every time to get the name right :-).
These are all minor variations of B+Trees. The B* variation keeps
nodes 2/3rds full instead of half full. It is an optimization which
is intended to maintain higher fill ratios to make better use of system
A Dancing tree ... hehehehe. Looks like Hans invented that one. I'm
not sure it even counts as a variation of a B+Tree. It doesn't apply
to HAMMER at all because HAMMER makes no modifications to the B-Tree
whatsoever on the front-end. When you create, delete, rename, write,
etc... when you do those operations HAMMER caches them in a
virtualization layer in memory and doesn't make any modifications to
its on-media data structures (or their in-memory representations) at
all until the meta-data is synced to disk.
HAMMER doesn't have to make real-time on-media modifications for any
meta-data change for ANY operation. Not a single one. Not rename, not
hard or soft links, not truncations, nothing.
HAMMER uses a modified B+Tree for its on-disk representation, which is
a B-Tree with only keys at internal nodes and only records at the leafs.
This was done to reduce structural bloat, allow for a leaf->leaf
linking optimization in the future, and for other reasons. The Wiki
definition isn't quite right. The leaf->leaf links are optional (and
HAMMER doesn't implement them, though I've left room to do so). It's
still a B+Tree if you don't have them.
HAMMER's custom modification is to formalize both a left AND a right
bounding element. So a HAMMER B+Tree looks like this:
B B B B B B <---- internal node (e.g. 6 elements w/ 5 leaf ptrs)
L L L L L <---- leaf nodes
Whereas a normal B+Tree looks like this (note the missing right bound):
B B B B B <---- internal node
L L L L L <---- leaf nodes
Note that HAMMER uses a radix of 64, so e.g. 64 internal elements but one
is a right-bound so only 63 pointers to the next level. The leafs then
have 64 elements. So HAMMER's B+Tree recursion is 63x63x63....x64.
HAMMER's internal nodes have a left and right bounding element. A
standard B+Tree only has a left bounding element. By adding a right
bounding element HAMMER can cache pointers into its B+Tree and 'pick up'
searches, insertions, and deletions relative to the cached pointers instead
of having to start at the root of the tree. More importantly, it can
pickup searches, insertions, and deletions at internal nodes, not just
leaf nodes. So I can cache a proximity pointer and if I do a good job
I never have to traverse the B+Tree above that point.
Without the right bound a search cannot be picked up from mid-tree
without potentially going down the wrong branch and then having to
Many of the textbook optimizations listed on e.g. Wiki have serious
issues... you can't make the blanket statement "method A is better
then method B". Balancing optimizations can cause massive issues with
deadlocks, for example, or a need to lock extra nodes, or a need to
traverse a greater portion of the tree. In fact, for that reason, a
lot of people don't even try to balance B-Tree fill ratios in the
critical path, and neither does HAMMER. Putting in better balancing and
fill-ratio algorithms are something I'll probably do in the pruner or
reblocker at some point, but never in the critical path.
I've left the implementation open to additional optimizations, such
as the leaf-linking optimization and better balancing optimizations
such as you get with B*. There are plenty of others as well, and in fact
WHEN you do the optimizations is often more important then WHAT
optimizations you choose to do.
HAMMER does a bunch of other stuff as well. But HAMMER's main
optimization is the addition of the right bounding element and its
ability to cache pointers into the middle of the B+Tree to short-cut