Trees I have known

Toby Lehman (skizix75@gmail.com)

I introduced the T-Tree at the first HPTS in 1985. I was investigating binary trees and B-Trees in the context of main memory database management systems (MMDBMS) for my thesis. I came up with a mix of the two structures that had better overall characteristics as an index structure for an MMDBMS. The T-Tree provided search speeds almost as fast as binary trees, and it provided for memory efficiency that was almost as good as a B+ Tree. For a time, it got a fair amount of attention, and it was chosen as the preferred index structure in several MMDBMS products in the 1990’s.

One of the characteristics of the index structures that I chose for my work in MMDBMS was that the index structures themselves didn’t have any data – they only had references to the memory-resident record. That made them significantly more space efficient on the one hand, but it also made them a bit slower in search execution time, since every data reference required one or more indirections to get to the record data. One of the significant advantages of this approach was that any predicate (i.e. a residual predicate) could be applied to the record once it was located via the index.

The data reference could be a simple pointer (i.e. a memory address), or it could be a data structure reference. A memory address is faster to traverse, but it is more fragile and would not likely survive a machine crash or shutdown. A data structure address is robust, but requires more steps to traverse. Although not used at the time, I learned that a hybrid mechanism (permanent data structure address, with a transient cached physical memory address) could provide faster access, at the expense of extra memory.

Since then, the database community and I have learned many lessons about data structures. The first lesson is that memory efficiency still matters today. When I started my work in the 1980’s, megabyte memories were still fairly expensive. It was somewhat of a fanciful dream to suppose that one day there would be GIGABYTE memories. However, even though we might have machines with terabyte memories today, we also have databases that have billions of records. If each record is included in multiple indexes, then the index overhead in terms of bytes per record is still an issue. Binary trees tend to be the least space-efficient of all the index structures, and yet there are companies today that still use basic, space-wasting binary trees. For example, if each index entry per record requires, say, 64 bytes, then EACH index of a billion record table requires 64 billion bytes. For a large table of small records (common in NOSQL systems), the index memory cost could dwarf the table memory cost. Customers find this interesting when they learn that their 200 gigabytes of data actually requires 10 terabytes of memory.

The second lesson is that the underlying machine architecture has changed from 1 or two processors to 16, 24, 32, 64, 128 or 256, and yet many database products are still using index data structures that are designed for a single processor. New index structures, such as the KISS-Tree, the Adaptive Radix Tree and the BuzzWord Tree, to name a few, are designed for high concurrency for both read and update.

We are on the cusp of a new age. With increased memory densities, especially those brought to market by the new 3D XPoint memory technology, fast inexpensive stable memories in the terabyte sizes will be commonplace. We must keep our collective eye on maintaining memory efficiency, but with 256-core processors and multi-hundred terabyte memories, the emphasis will certainly be on parallel efficiency of the index structures.