Long(8 bytes) and
String(N)(N bytes). For this project you'll be working with the abstract
DataBoxclass which implements
Comparable<DataBox>. You may find it useful to review how the Comparable interface works for this project.
RecordId. For this project we'll be using record IDs in our leaf nodes as pointers to records in the data pages.
BPlusTree.java- This file contains the class that manages the structure of the B+ tree. Every B+ tree maps keys of a type
DataBox(a single value or "cell" in a table) to values of type
RecordId(identifiers for records on data pages). An example of inserting and a retrieving records using keys can be found in the comments at
BPlusNode.java- A B+ node represents a node in the B+ tree, and contains similar methods to
BPlusNodeis an abstract class and is implemented as either a
BPlusTreeMetadata.java- This file contains a class that stores useful information such as the order and height of the tree. You can access instances of this class using the
this.metadatainstance variables available in all of the classes listed above.
indexdirectory. Many comments contain critical information on how you must implement certain functions. For example,
BPlusNode::putspecifies how to redistribute entries after a split. You are responsible for reading these comments. Here are a few of the most notable points:
2dentries is broken. Note that actual B+ trees do rebalance after deletion, but we will not be implementing rebalancing trees in this project for the sake of simplicity.
LockContext. You do not need to worry too much about these objects right now; they will become more relevant in Project 4.
Optional<T>objects. We recommend reading through the documentation here to get a feel for them. In particular, we use
Optionals for values that may not necessarily be present. For example, a call to
getmay not yield any value for a key that doesn't correspond to a record, in which case an
Optional.empty()would be returned. If the key did correspond to a record, a populated
Optional.of(RecordId(pageNum, entryNum))would be returned instead.
LeafNode. This method reads a
LeafNodefrom a page. For information on how a leaf node is serialized, see
LeafNode::toBytes. For an example on how to read a node from disk, see
InnerNode::fromBytes. Your code should be similar to the inner node version but should account for the differences between how inner nodes and leaf nodes are serialized. You may find the documentation in
fromBytesyou should be passing
fromBytes, you will need to implement the following methods in
BPlusTreemethod starts the call, the
InnerNodemethod is the recursive case, and the
LeafNodemethod is the base case. It's suggested that you work on one method at a time (over all three classes).
InnerNode. The purpose of
sync()is to ensure that representation of a node in our buffers is up-to-date with the representation of the node in program memory. Do not forget to call
sync()when implementing the two mutating methods (
remove); it's easy to forget.
bulkLoadwithin all three of
BPlusTree. Since bulk loading is a mutating operation you will need to call
sync(). Be sure to read the instructions in
BPluNode::bulkLoadcarefully to ensure you split your nodes properly. We've provided a visualization of bulk loading for an order 2 tree with fill factor 0.75 (powerpoint slides here):
BPlusTree. You can add a call to this method in a test to generate a PDF file of your B+ tree.