Region-based memory management

This document is an attempt to formalize the memory management design that was proposed in the design meeting. If we all agree on this design, it should be used consistently across the codebase any time a producer passes a value to a consumer and the consumer needs to be sure the value will remain valid for as long as it is needed.

Note that this only describes the simplest possible version. I can imagine refinements, such as Cotton’s proposal for implementing regions with strictly nested lifetimes using “checkpoints”, which allow a region to free only the memory allocated since a checkpoint. But I don’t see any need to incorporate such refinements in the first pass.

Design

We consider a situation where one entity—the producer—creates a value and gives it to another entity—the consumer. Here, an entity might be a piece of code being generated, a function, an object, whatever level of abstraction is most useful.

The fundamental invariant we must maintain is this: the passed value must be useable for as long as the consumer requires it. For example, no memory on which the value depends may be freed before the consumer is done making use of the value. Likewise, if the value depends on some resource like a file handle, we must ensure that that resource lives as long as the consumer needs the value; however, this proposal doesn’t address non-memory resources.

First, a summary of relevant pieces of the existing region system.

  • A Region is a reference counted shared pointer to a RegionMemory.
  • A RegionMemory is an arena allocator: it can allocate memory, and it can free all the memory it has allocated. It cannot free some of its memory while preserving the rest. All interaction with a RegionMemory is done through a Region handle, and a RegionMemory is freed if and only if the last Region handle to it has been released.
  • A RegionMemory can have any number of “parent” regions. It holds a Region handle to each of its parents. It releases those handles only when it is freed, thus ensuring that the parent regions live at least as long as itself. Region r1 is registered as a parent of region r2 by calling r2.addReferenceTo(r1).
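To make these relationships concrete, here is a minimal Scala sketch of the pieces described above. The class and method names follow the bullets, but the bodies are simplified stand-ins, not the real implementations, and the actual APIs may differ.

```scala
import scala.collection.mutable.ArrayBuffer

class RegionPool {
  private val free = ArrayBuffer[RegionMemory]()

  def getRegion(): Region = {
    // fast path: reuse a pooled RegionMemory if one is available
    val mem = if (free.nonEmpty) free.remove(free.length - 1) else new RegionMemory(this)
    new Region(mem)
  }

  def reclaim(mem: RegionMemory): Unit = free += mem
}

class RegionMemory(pool: RegionPool) {
  private var refCount: Int = 0
  private var bytesAllocated: Long = 0L
  private val parents = ArrayBuffer[Region]()

  // stand-in for a real bump allocator that would return a pointer into a block
  def allocate(nBytes: Long): Long = {
    val off = bytesAllocated
    bytesAllocated += nBytes
    off
  }

  def acquire(): Unit = refCount += 1

  def release(): Unit = {
    refCount -= 1
    if (refCount == 0) {
      // release handles on parent regions only now, so parents outlive this region
      parents.foreach(_.release())
      parents.clear()
      bytesAllocated = 0L
      pool.reclaim(this)
    }
  }

  def addParent(parent: Region): Unit =
    parents += new Region(parent.memory) // take our own handle on the parent
}

class Region(val memory: RegionMemory) {
  memory.acquire()
  def allocate(nBytes: Long): Long = memory.allocate(nBytes)
  def addReferenceTo(parent: Region): Unit = memory.addParent(parent)
  def release(): Unit = memory.release()
}
```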

The following simple protocol suffices to enforce the main invariant:

  • The consumer must always pass a destination Region to the producer when requesting a value.
  • The producer must ensure that the value it produces remains valid until the destination region is freed. It can do this by making sure each component of the value is either:
    • constructed in the destination region, or
    • in a new region created by the producer, which the producer has registered as a parent of the destination region.
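For concreteness, here is a rough sketch of a producer honoring this protocol, using the simplified classes sketched above (`produceValue` is a made-up name, and the real Region/RegionPool APIs may differ):

```scala
// Hypothetical producer: every component of the result ends up either in `dest`
// itself or in a region registered as a parent of `dest`.
def produceValue(dest: Region, pool: RegionPool): Long = {
  // (a) construct part of the value directly in the destination region
  val resultOffset = dest.allocate(16)

  // (b) construct another part in a fresh region, then tie its lifetime to `dest`
  val owned = pool.getRegion()
  owned.allocate(1024)        // some component of the value
  dest.addReferenceTo(owned)  // `owned` now lives at least as long as `dest`
  owned.release()             // drop our own handle; `dest` keeps it alive

  resultOffset
}
```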

Note that this leaves some choices to the producer. If the producer needs to create some intermediate values, it could

  1. Create those values in the destination region, where they would become junk that is never used again.
  2. Create them in a new region which it frees before returning to the consumer.

With option 1 we waste memory, while with option 2 we perform more region allocations. In general, we will want to use contextual knowledge to choose between these options. For example, when filtering rows of a table, we probably want to free each filtered row before building the next row, but when filtering an array, we might be okay with putting all the values, filtered or not, in one region.
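In code (still using the hypothetical classes sketched earlier), the two options might look like this:

```scala
// Option 1: allocate intermediates directly in the destination region.
// Simple, but the intermediate bytes stay live until `dest` is freed.
def produceWithJunk(dest: Region): Long = {
  dest.allocate(4096)  // scratch space that becomes junk once we return
  dest.allocate(16)    // the actual result
}

// Option 2: allocate intermediates in a temporary region freed before returning.
// No junk survives, at the cost of an extra region acquire/release per call.
def produceWithTempRegion(dest: Region, pool: RegionPool): Long = {
  val tmp = pool.getRegion()
  tmp.allocate(4096)             // scratch space for intermediate values
  val result = dest.allocate(16) // only what we keep goes into `dest`
  tmp.release()                  // all intermediate memory is reclaimed here
  result
}
```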

In the current model (as I understand it), we make this decision by saying “each row of a table is built in a single region, along with all intermediate values”. We can keep this same behavior by introducing a new IR node SingleRegion (better names welcome), and by modifying the earlier protocol to require the consumer to pass a flag singleRegion to the producer. Then all IR nodes except SingleRegion propagate the flag from their parent to their children, we set singleRegion = false at the root of the IR, and SingleRegion always passes singleRegion = true to its children.
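To illustrate the flag propagation (the IR node names and emit signature here are invented for illustration, not a proposal for the actual interfaces):

```scala
// Hypothetical IR and emitter: every node forwards the singleRegion flag it
// received, except SingleRegion, which forces it on for its subtree.
sealed trait IR { def children: Seq[IR] }
case class SingleRegion(child: IR) extends IR { def children = Seq(child) }
case class StreamFilter(child: IR, cond: IR) extends IR { def children = Seq(child, cond) }
case class Ref(name: String) extends IR { def children = Seq() }

def emit(node: IR, singleRegion: Boolean): Unit = node match {
  case SingleRegion(child) => emit(child, singleRegion = true)
  case other               => other.children.foreach(emit(_, singleRegion))
}

// The root of the IR starts with singleRegion = false.
def emitRoot(root: IR): Unit = emit(root, singleRegion = false)
```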

Alternatively, instead of a flag, if we enforce that the only way for a producer to create a new region is through the destination region, then SingleRegion can enforce that its children use a single region by passing a Region for which requesting a new region always returns itself.
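Roughly, that alternative could look like this (`newRegion` and the wrapper classes are hypothetical; the point is only that SingleRegion hands its children a region that refuses to create fresh ones):

```scala
// Hypothetical: producers may only obtain new regions through the destination
// region they were handed.
trait DestRegion {
  def allocate(nBytes: Long): Long
  def newRegion(): DestRegion
}

// Normal behavior: requesting a new region pulls a fresh one from the pool.
class PooledDestRegion(region: Region, pool: RegionPool) extends DestRegion {
  def allocate(nBytes: Long): Long = region.allocate(nBytes)
  def newRegion(): DestRegion = new PooledDestRegion(pool.getRegion(), pool)
}

// What SingleRegion would pass down: requesting a new region returns itself,
// so the whole subtree allocates into one region.
class SingleDestRegion(region: Region) extends DestRegion {
  def allocate(nBytes: Long): Long = region.allocate(nBytes)
  def newRegion(): DestRegion = this
}
```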

This would allow us to easily experiment with different points along the spectrum between “more time allocating” and “higher memory usage” by making localized changes in the table lowerer.


I believe you meant RegionPool in your second bullet point.

EDIT: Nevermind, I can’t read.

Great, thanks Patrick! A few comments:

The fundamental invariant we must maintain is this: the passed value must be useable for as long as the consumer requires it.

Thanks for including this. Said another way, the allocation strategy must guarantee there are no uses after free. This must be true of any allocation strategy, not just the region-based one. I anticipate we will have additional strategies (e.g. reference counting or malloc).

The other thing an allocation strategy needs to protect against is memory leaks. The Region discipline should say what needs to be done to avoid them. I think this is sufficient:

  • All regions allocated by a piece of code must either be released (closed), or must be added to the region passed in by the consumer (if there is one).

I say “if there is one” because I imagine there will be some toplevel driver code that does not take in a region (but does take in, or creates, a region pool).

It might be a good check to have RegionPool verify (if it doesn’t already, possibly only in debug mode) that there are no pending allocations when it is closed. It would clean them up, but I don’t think we want to rely on RegionPool as an allocation strategy.
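A sketch of what that debug check could look like, layered on the simplified pool sketched earlier (the counters and close method here are hypothetical):

```scala
// Hypothetical debug wrapper: when the pool is closed, every region that was
// handed out should already have been released back to it.
class CheckedRegionPool extends RegionPool {
  private var handedOut: Long = 0L
  private var returned: Long = 0L

  override def getRegion(): Region = { handedOut += 1; super.getRegion() }
  override def reclaim(mem: RegionMemory): Unit = { returned += 1; super.reclaim(mem) }

  def close(): Unit = {
    // Fail loudly in debug builds instead of silently cleaning up leaked regions.
    assert(handedOut == returned,
      s"region leak: ${handedOut - returned} regions still allocated at pool close")
    // ... free the pooled RegionMemory blocks here ...
  }
}
```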

I think some of the region code has some legacy naming conventions that it might be useful to clean up:

  • Region => RegionPtr or RegionRef
  • RegionMemory => Region
  • The regions referenced by another region should be its children, not its parents. Parents die before children; children don’t die before parents.

If the producer needs to create some intermediate values, it could …

There is a third option: the producer in turn calls a sub-producer, which can again make its own choices about what to allocate in the passed-in region vs. new regions.

In the current model (as I understand it), we make this decision by saying “each row of a table is built in a single region, along with all intermediate values”.

I don’t think this is true, in fact it can’t be: explode on tables has to consume a row from one table that lives longer than the rows of the table it returns. I also don’t think decisions about region usage should go into the IR. Let’s remove that and leave the decision about when to create regions up to the producer (as long as it honors the invariants above) and start a separate thread to discuss this proposal if you want to push on it.

My concern is that without this, the table lowering will introduce significantly more allocation overhead. For example, both table filter and array filter will get lowered to stream filter, so array filters will now be creating a new region for each element of the array.

But I do agree that we should leave this until after we’ve benchmarked the naive version. And we might want to try your region checkpointing suggestion before adding region logic to the IR, that might eliminate most of the added overhead.

I understand now. I’d be OK with a flag/hint on StreamFilter for now.

creating a new region for each element of the array

Just to clarify, this shouldn’t be happening: it should be obtaining a region from the region pool, which should be extremely lightweight (fast path: check that regions are available and pop one off). This might be comparable to the checkpoint solution. I’m not sure. We should benchmark.

We should also have an analysis that determines whether an IR allocates. Then we can elide region “allocation” or checkpointing if we know there’s nothing to free.
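Something like the following, over a toy IR (the node names are made up; a real analysis would need to be conservative about anything that can allocate indirectly):

```scala
// Hypothetical allocation analysis: decide whether evaluating a node can
// allocate region memory, so we can skip acquiring a region (or taking a
// checkpoint) when there will be nothing to free.
sealed trait IR { def children: Seq[IR] }
case class I64(x: Long) extends IR { def children = Seq() }
case class ApplyBinaryOp(op: String, l: IR, r: IR) extends IR { def children = Seq(l, r) }
case class MakeArray(elems: Seq[IR]) extends IR { def children = elems }

def mayAllocate(node: IR): Boolean = node match {
  case _: MakeArray => true // constructs a new value in a region
  case _            => node.children.exists(mayAllocate)
}
```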

Patrick, are you able to get the benchmarks to run?

After I rebased on master last week, I ran into a bug that I haven’t been able to figure out. I’ve also been working on a redesign of the stream emitter that solves some fundamental limitations of the current design. So I’m on the fence about whether I should spend my time debugging something that won’t be used, in order to see the benchmarks, or work on the new version, which will also handle region management. I’m leaning towards finishing the new version.

Could you describe the new version and the problems it solves? I’ve been thinking about our conversations last week about cleanup hooks, and I’m pretty convinced that we’re not really doing this the “right” way in the TableValue/RVD world, and yet things don’t seem to explode on the regular.

The most fundamental limitation of the current stream emitter is that there is no way for a consumer of a stream to tell the producer that it is done consuming. For example, take may stop consuming a stream before the stream is exhausted, and similarly left join may stop consuming the right stream before it is exhausted. In the current design, a stream frees its resources when it is exhausted, but doesn’t get a chance to clean up when its consumer stops consuming before the stream is exhausted.
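For example, a pull-style stream interface with an explicit close hook would give take and left join a way to signal early termination (all names here are illustrative, not the actual emitter interfaces):

```scala
// Hypothetical stream with an explicit close hook: the consumer must call
// close() when it stops pulling, whether or not the stream was exhausted.
trait Stream[A] {
  def next(): Option[A] // None means the stream is exhausted
  def close(): Unit     // release regions, file handles, and other resources
}

// `take` may stop early, so it forwards close() to let the source clean up.
def take[A](source: Stream[A], n: Int): Stream[A] = new Stream[A] {
  private var remaining = n
  def next(): Option[A] =
    if (remaining <= 0) None
    else { remaining -= 1; source.next() }
  def close(): Unit = source.close()
}
```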

The other issue is that it seems to be difficult to deal with streams of streams. This wasn’t a problem for flatMap, but it does seem to be a problem for things like groupBy.