B+Trees

CPSC 461: Copyright © 2002 Katrin Becker 1998-2002 Last Modified August 4, 2000 05:38 PM


B+ Trees and Indexed Sequential Files

Indexed Sequential Files attempt to give the best of both: indexed (direct) access and sequential access (physically contiguous records - no seeking)

- random access has been achieved using indices but if we want to access sequentially we end up with one seek per access

- sequential organization makes sequential access efficient but leaves us having to do linear searches (or at best binary searches) for random access; we also end up with a lot of overhead when adding or deleting records

- examples of files that require both: student files; credit card systems

The Sequence Set

- forget the index for now and let's concentrate on keeping the records in order (sequence set)

- forget about sorting and resorting the entire file - it's too expensive

- need to localize the changes

ANSWER: blocking - using one block to hold more than one record

Using Blocks

- insertion can cause overflow just like B-Trees; splitting process is similar but there is no promotion [slide 1]

- deletion can cause underflow (less than half full) [slide 2]

1. If neighbouring block is also half full then concatenate, and free one block for re-use; neighbours are logical rather than physical blocks
2. If neighbouring blocks are more than half full then redistribute
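
The overflow and underflow rules above can be sketched in a few lines. This is only a minimal illustration, not the method from the slides: it assumes each block is a sorted Python list, that the list `blocks` stands in for the logical (linked) order of the blocks, and that `CAPACITY`, `insert_record` and `delete_record` are made-up names. It also concatenates whenever the two neighbours fit into one block, which is a slightly looser test than "also half full".

```python
import bisect

CAPACITY = 4  # assumed maximum number of records per block (hypothetical)

def insert_record(blocks, i, key):
    """Insert key into blocks[i]; split the block in two if it overflows."""
    bisect.insort(blocks[i], key)
    if len(blocks[i]) > CAPACITY:
        mid = len(blocks[i]) // 2
        # No promotion: both halves stay in the sequence set.
        blocks[i:i + 1] = [blocks[i][:mid], blocks[i][mid:]]

def delete_record(blocks, i, key):
    """Delete key from blocks[i]; repair underflow using a logical neighbour."""
    blocks[i].remove(key)
    if len(blocks[i]) * 2 >= CAPACITY or len(blocks) == 1:
        return  # still at least half full, or there is no neighbour
    j = i + 1 if i + 1 < len(blocks) else i - 1
    left, right = min(i, j), max(i, j)
    merged = blocks[left] + blocks[right]
    if len(merged) <= CAPACITY:
        # Concatenate: everything fits in one block, so one block is freed.
        blocks[left:right + 1] = [merged]
    else:
        # Redistribute: share the records evenly between the two blocks.
        mid = len(merged) // 2
        blocks[left], blocks[right] = merged[:mid], merged[mid:]
```

Notice that every change touches at most two blocks, which is the "localize the changes" goal from the sequence set discussion.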

associated costs:

- internal fragmentation means the file takes up more space - this can be reduced somewhat by redistributing before splitting and by using two-to-three splitting.
- Sequential access is only guaranteed within one block

Choice of Block Size:

- Should be big enough so we can hold several blocks in RAM so we can do splitting etc.

- Want to keep overhead for random accesses to a minimum - we need to read in an entire block even if we only want one record

- Remember clusters (the minimum # of sectors allocated at one time); data can be accessed sequentially within one cluster since we only need one seek to get to the entire cluster

- If you get to choose your own cluster sizes: need to consider internal fragmentation as well as RAM limitations

Back to Indices...

Borrow from the B-Trees :

have the index function kind of like the parent key: each index entry holds one key value that stands for a whole block
we want to be able to locate the correct block and then search through the block to find the required record
keep the last record's key in the index; then as long as the search key is greater than the index key we keep 'scanning' (if we kept the first one, we wouldn't know we had the right block until we had passed it) [slide 3]
index must be kept in RAM because as records are added and deleted, the index must be kept current
What if the index is too big to be kept entirely in RAM? - keep the index as a B-Tree, so now we have a B-Tree PLUS a sequence set for the actual records = B+Tree
Given that we aren't actually interested in the keys in the index at all - we only need the index to direct us to the correct block in the sequence set - we can think about the index in a different way. We don't need the actual keys at all; what we want is a kind of road map, so we can use separators instead of the actual keys. [Slide 4]
Need an algorithm to get us the shortest unique separator by comparing the last key of one block and the first key of the next block; with the choices we have made, we go left if the search key is less than the separator and right otherwise
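
One possible way to find the shortest separator is sketched below. It assumes keys are plain strings compared character by character, and the function name `shortest_separator` is made up for illustration. It returns the shortest prefix of the next block's first key that is still greater than the previous block's last key, which fits the "left if less, right otherwise" rule.

```python
def shortest_separator(last_key, first_key):
    """Shortest prefix of first_key that is strictly greater than last_key.

    Assumes last_key < first_key under ordinary string comparison.
    The result S satisfies last_key < S <= first_key.
    """
    for n in range(1, len(first_key) + 1):
        prefix = first_key[:n]
        if prefix > last_key:
            return prefix
    return first_key  # only reached if the precondition does not hold
```

For example, between a block ending in "CAGE" and a block starting with "CAMP", neither "C" nor "CA" is greater than "CAGE", so the separator is "CAM".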


Prefix B+ Tree
To build our first simple Prefix B+Tree - we use the separators as the keys in the B-Tree pages and attach pointers into the sequence set to the leaves [slide 5]

It's called simple prefix because of the algorithm used to determine the separators
since the index set is a B-Tree, a node containing N separators points to N + 1 children
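
A rough sketch of how a single index node sends a search to the right child. It assumes the node is held as two Python lists, N separators and N + 1 block numbers; `find_block` is a made-up name. In a multi-level index set the same step would be repeated at each level until a sequence set block is reached.

```python
import bisect

def find_block(separators, block_numbers, key):
    """Return the child (block number) that key belongs to.

    separators    : sorted list of N separators
    block_numbers : list of N + 1 children, one more than the separators
    Rule: go left if key < separator, right otherwise.
    """
    assert len(block_numbers) == len(separators) + 1
    # bisect_right counts the separators that are <= key, which is
    # exactly the position of the child to follow.
    return block_numbers[bisect.bisect_right(separators, key)]
```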

Maintenance:

modifications involve maintenance in BOTH parts of the structure
remember that insertions and deletions are of the actual records, which may change the range of values to which an index set key refers and thus the key in the index set
insertions and deletions in the index set are never direct - they are always a result of a change in the sequence set
don't change the index set key if you don't have to; as long as it is a valid separator it can stay - even if it no longer resembles the first key of the sequence set


Deletions:

1. Deletions that don't involve redistribution or concatenations can leave the index alone
2. Deletions that involve redistribution may change one index key
3. Deletions that involve concatenation may involve redistribution or concatenation in the index set as well as changes to the index keys

Insertions:

1. Insertions that don't involve redistribution or splitting don't change the index; since we insert on the basis of the existing index set it makes sense that the index set will still be valid
2. Insertions that involve redistribution may change the index in the same way as a deletion that results in redistribution
3. Insertions that result in splitting result in adding a key to the index

Looking at it another way:

all changes work from the bottom up
- changes are first made in the record block and the required information is passed 'upward' to the index set (which is a B-Tree)

What information must be passed to the index set?
- If blocks are split? (A new key to insert)
- If blocks are concatenated? (A key to delete)
- If blocks are redistributed? (A key to change)
We already know how to do the first two in a B-Tree, the third simply involves a change of key (replace key A with value B) and does not affect the structure otherwise
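
The three kinds of information passed upward can be sketched against a one-level index. This is a simplification, assuming the index is just two Python lists where `separators[i]` sits between `block_numbers[i]` and `block_numbers[i + 1]`; the function names are made up, and a real index set would do the insert and delete as B-Tree operations.

```python
def on_split(separators, block_numbers, i, new_separator, new_block):
    """Block at position i was split; new_block holds the upper half."""
    separators.insert(i, new_separator)     # a new key to insert
    block_numbers.insert(i + 1, new_block)

def on_concatenate(separators, block_numbers, i):
    """Blocks at positions i and i + 1 were merged into position i."""
    del separators[i]                       # a key to delete
    del block_numbers[i + 1]

def on_redistribute(separators, i, new_separator):
    """Records moved between positions i and i + 1; structure unchanged."""
    separators[i] = new_separator           # a key to change
```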

Index Set Block Size
physical size of index node usually same as physical size of block in sequence set (call index node an index block (great!))

1. Want good fit between block size, physical characteristics of disk, and available memory
2. Common block size makes for simpler buffering - can implement sequence set blocks as virtual too - we can have a virtual simple prefix B+tree (!!)
3. The whole thing is often in one file so manipulation is simpler if everything has the same size block

Internal Structure of Index Set Blocks:
- so far we've used only B-Trees with a fixed number of keys - the whole point behind using the shortest separator is to save space - we can't if we fix the size of the space for the keys
- if the index set contains keys of varying lengths.... we need a way to tell where keys start and end....
answer - MAKE AN INDEX INTO THE INDEX SET !!!

HOLD ON -- don't try to keep track of all the details all the time -- look at the details when necessary but look at the big picture

- an index into the index tells us where each of the keys begins - we still need a way to tell where they end (maybe a null character or something); we need also to know how many keys are in this block; with that information we can do a binary search into this monstrosity

- now each index set block looks like this:

- separator count (# of keys)
- total length of (all) separators
- separators (the keys themselves)
- index to separators (index to index - tells where keys begin)
- relative block numbers (the blocks of records to which the separators or keys point)
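
To make the layout concrete, here is one hypothetical byte encoding of such a block. The field widths (16-bit counts and offsets, 32-bit block numbers), the little-endian byte order and the ASCII keys are all assumptions chosen for the sketch, as are the function names. Instead of a null character, the end of each separator is taken from the start of the next one (or from the total length for the last one).

```python
import struct

def pack_index_block(separators, block_numbers):
    """Lay out: count, total length, separators, offsets, block numbers."""
    assert len(block_numbers) == len(separators) + 1
    encoded = [s.encode("ascii") for s in separators]
    offsets, pos = [], 0
    for e in encoded:
        offsets.append(pos)      # where this separator begins
        pos += len(e)
    n = len(encoded)
    return (struct.pack("<HH", n, pos)               # count, total length
            + b"".join(encoded)                      # the separators
            + struct.pack(f"<{n}H", *offsets)        # index to separators
            + struct.pack(f"<{n + 1}I", *block_numbers))

def unpack_index_block(data):
    """Recover the separators and block numbers from a packed block."""
    n, total = struct.unpack_from("<HH", data, 0)
    text = data[4:4 + total]
    offsets = struct.unpack_from(f"<{n}H", data, 4 + total)
    blocks = struct.unpack_from(f"<{n + 1}I", data, 4 + total + 2 * n)
    ends = list(offsets[1:]) + [total]   # each key ends where the next begins
    seps = [text[a:b].decode("ascii") for a, b in zip(offsets, ends)]
    return seps, list(blocks)
```

Because the offsets are fixed-width, a binary search can jump straight to the middle separator without reading the ones before it.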

- a block is not just an arbitrary chunk cut out of a homogeneous file; it is more than just a set of records
- these blocks have a sophisticated structure all their own
- this idea is more useful as block size increases
- with very large blocks it is imperative that we have an efficient way of processing all data
- the node within the B-Tree index set here is of variable order
- # of separators is directly limited by block size
- it still has a maximum order and therefore a minimum depth
- since it is variable order, determining when a block is full or half full is no longer simple
- decisions about when to split, concatenate or redistribute are more complicated

Loading a Simple Prefix B+Tree

could be done through successive insertions, but splitting and redistribution are relatively expensive - they're fine for maintenance but not so good for building
begin by sorting records to guarantee that next record we read will be next record to load
with sorted records we can place them into the sequence set block by block
when a block is full we can determine the shortest separator
we collect the separators in an index set block held in RAM until that block is full
gives us the choice of loading a tree with full blocks or setting a degree of utilization based on our knowledge of how it will be used
we end up writing sequence set and index set blocks so that they are physically close on the disk - this creates a degree of spatial proximity, which results in shorter seeks
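
The loading steps can be sketched as follows. This is a simplified illustration only: it assumes string keys that are already sorted, keeps everything in Python lists rather than writing blocks to disk, and gathers all separators into one flat list instead of filling index set blocks one at a time. The names `bulk_load`, `shortest_separator` and `records_per_block` are made up; choosing `records_per_block` below the real block capacity is how a lower degree of utilization would be set.

```python
def shortest_separator(last_key, first_key):
    """Shortest prefix of first_key that is strictly greater than last_key."""
    for n in range(1, len(first_key) + 1):
        if first_key[:n] > last_key:
            return first_key[:n]
    return first_key

def bulk_load(sorted_keys, records_per_block):
    """Fill sequence set blocks in order and collect one separator
    for each boundary between neighbouring blocks."""
    blocks = [sorted_keys[i:i + records_per_block]
              for i in range(0, len(sorted_keys), records_per_block)]
    separators = [shortest_separator(a[-1], b[0])
                  for a, b in zip(blocks, blocks[1:])]
    return blocks, separators
```

No splitting or redistribution happens at any point, which is the reason for sorting first.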

B+Trees
Simple Prefix B+Trees are but one form of B+Trees

plain B+Trees use an actual key as a separator

when to use plain B+Trees rather than simple prefix B+Trees?
1. When the extra cost of managing variable-length separators outweighs the benefits of shorter separators
2. When key sets don't compress

remember other simpler file structures - use them when they will do the trick

can use parts of various structures to build hybrids - such as using a simple index into a blocked sequence set

SUMMARY: B-Trees, B*Trees, B+Trees

all are paged index structures so they bring entire blocks of info into RAM at once
all maintain height balanced trees
all trees grow from the bottom up; balance is maintained through splitting, concatenation and redistribution
all can be implemented as virtual tree structures
all can be adapted for use with variable-length records

IMPORTANT DIFFERENCES:

B-TREES: info is grouped as pairs (key and associated info) distributed over all nodes
B+TREES: all key and record info is contained in a linked sequence set; accessed through an index set containing separators
SIMPLE PREFIX B+TREES: use shortest separators built from the keys