Fixed Allocators

From Blazegraph
Jump to: navigation, search

The RWStore allocates slots of storage in fixed size amounts. For example: 64, 128, 256, 512 and 1024 bytes big.

A FixedAllocator manages allocations for a specific size, for example, 512 bytes. This would be described as a 512 byte FixedAllocator.

When the RWStore receives a request for an allocation, it identifies a FixedAllocator via a free list of FixedAllocators for each allocation size.

Internally each FixedAllocator manages a list of Allocation Blocks.

Allocation Blocks

An Allocation Block is a contiguous region of the store file reserved for allocations of a specific size.

It has a start address, which encodes the offset in the file, and a bit array to indicate which slots are available for allocation. The size of the bit array determines the size of the contiguous region. By defining a minimum region size, a 32 bit start address can encode a much larger file offset. For example, if the minimum contiguous region is 128K, then a 64 byte allocation block would manage 2048 bits. We can also define a minimum number of slots for an allocation block to manage, and if this is set at 64 bits then an 8K allocation block would require a contiguous region of 64 * 8K = 512K. We will have more to say on this parameterisation when we consider locality, but the most important point is the support of large file offsets, where, at a minimum 128K allocation, 32 bits can encode an offset for a file of 128K * 2G bytes, or 256TB.

The Bit Arrays

Unlike the malloc and free of the standard C library, the allocation algorithms of the Allocation Blocks are transaction, or rather commit-point, aware.

This is achieved with the use of three bit arrays:

  • live
  • transient
  • committed

The committed bit array represents the allocated slots at the last commit.

The live bit array is initialized as a clone of the committed bits, and represents the current allocated and freed slots

The transient bit array represents an ORing of the committed and the live bits. Only slots which are available in the transient bit array can be allocated. This prevents the reallocation of committed storage before the next commit.

On commit, the live bits are cloned to the committed and transient bit arrays.

An example should help with Committed bits, Live bits and Transient bits: Initial state

L-00000000
T-00000000
C-00000000

Make 3 allocations

L-11100000
T-11100000
C-00000000

Commit

L-11100000
T-11100000
C-11100000

Free 2 allocations and make one new one

L-00110000
T-11110000
C-11100000

Commit

L-00110000
T-00110000
C-00110000

Now let's make four new allocations

L-11111100
T-11111100
C-00110000

Now we will free one of the original committed slots and one of the newly allocated slots

L-11100100
T-11110100
C-00110000

Note that we are able to free the newly allocated slot in the transient bit array. This allocation is now available to be re-used before the next commit.

This is the key functionality of the FixedAllocator, to support reallocation between commits while protecting previously committed storage.

Addressing

Every FixedAllocator is given an index when created. And each FixedAllocator is stored in a 1K meta-allocation. Therefore the maximum number of bits that can possibly be addressed is < 8K (8 bits * 1K bytes) So, if 13 bits are all that is required to define the bit offset, this provides 19 bits to specify the FixedAllocator index, giving a maximum of 256K FixedAllocators.

FFFFFFFFFFFFFFFFFFFBBBBBBBBBBBBB

This addressing scheme provides very quick offset encoding and direct access to free current allocations.

It is worth addressing whether the int-based address is an unnecessary restriction. If we make some simple approximations then at an average allocation of 512 bytes, and 6K useful bits per FixedAllocator the RWStore would support 6K * 256K = ~1536M allocations within a 80GB store. If the average allocation is 4K, then the store would support the same number of allocations but in a 640GB store.