Our internal API is a headache right now. Given a RegionValue and an offset, how am I supposed to load an int? rv.region.loadInt(rv.offset + offset)? This seems wrong.
I’m starting to think that having RegionValue as an abstraction for a MemoryBuffer + offset is actually worse than just dealing with a buffer + offset all the time – it seems like you’ll always need to go back to the buffer at some point anyway.
I agree. I’ve been struggling to find a consistent interface for extracting data from a RegionValue.
One option could be to copy the TString.loadString method. So for example, we could add a TInt.loadInt(rv: RegionValue): Int, and a TVariant.loadVariant(rv: RegionValue): Variant. Would all the things that need to be read from a RegionValue into a Scala object fit into this pattern?
I’m confused. You should never have an offset within a region value. A region value is an offset with a region. Accessing the contents of that value (e.g. array elements or struct fields) should give you a new offset with respect to the region, not the region value.
So if you want to load the ith field from a struct which is a region value, you do:
region.loadInt(t.loadField(rv, i))
or
region.loadInt(t.loadField(region, offset, i))
if you’re working with a region value that isn’t packaged up in a RegionValue (the common case).
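To make the offset arithmetic concrete, here is a minimal, self-contained sketch. ToyRegion and ToyIntStruct are hypothetical stand-ins invented for illustration, not the real classes: the region is assumed to be a flat byte buffer, and the struct is assumed to hold only 4-byte ints laid out back to back. The point is only that loadField returns a new offset with respect to the region, which is then passed straight to region.loadInt.

```scala
import java.nio.{ByteBuffer, ByteOrder}

// Hypothetical stand-in for a region: a flat byte buffer addressed by Long offsets.
final class ToyRegion(capacity: Int) {
  private val buf = ByteBuffer.allocate(capacity).order(ByteOrder.LITTLE_ENDIAN)
  private var end = 0L

  // Reserve n bytes and return the offset of the first one.
  def allocate(n: Int): Long = {
    require(end + n <= capacity, "region full")
    val off = end
    end += n
    off
  }

  def storeInt(off: Long, v: Int): Unit = buf.putInt(off.toInt, v)
  def loadInt(off: Long): Int = buf.getInt(off.toInt)
}

// Hypothetical struct type: nFields 4-byte ints, no missingness bits, no padding.
final class ToyIntStruct(val nFields: Int) {
  val byteSize: Int = 4 * nFields

  // Returns the field's offset relative to the region, not to the struct.
  def loadField(region: ToyRegion, offset: Long, i: Int): Long = {
    require(i >= 0 && i < nFields, s"field index out of range: $i")
    offset + 4L * i
  }
}

object LoadFieldDemo {
  def run(): Vector[Int] = {
    val region = new ToyRegion(64)
    val t = new ToyIntStruct(3)
    region.allocate(8) // padding so the struct does not start at offset 0
    val offset = region.allocate(t.byteSize)

    // Write fields 0..2, then read them back through the same offsets.
    for (i <- 0 until t.nFields)
      region.storeInt(t.loadField(region, offset, i), 10 * (i + 1))

    Vector.tabulate(t.nFields)(i => region.loadInt(t.loadField(region, offset, i)))
  }
}
```

Note that nothing here ever adds an offset to another value's offset by hand: the type computes the field's position, and the region only ever sees absolute offsets.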
Yes, you should think of a region value as a region + an offset. I created RegionValue mainly to be mutable and to avoid allocation when returning region values (think of the Iterator[RegionValue] in an RDD). Most code has one region flying around – I usually stick it in a variable called region – and then values within that region are just Long offsets.
Does this help?
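As a rough illustration of why a mutable region + offset pair is handy in an iterator, here is a small sketch. ToyIntRegion, ToyRegionValue, and ReuseDemo are hypothetical names made up for this example, and the region is assumed to be a plain array of ints indexed by offset; the real classes will differ. The iterator hands back the same object on every step and only repoints it.

```scala
// Hypothetical stand-in for a region: ints addressed by a Long offset.
final class ToyIntRegion(values: Array[Int]) {
  def loadInt(offset: Long): Int = values(offset.toInt)
}

// Mutable pair of region + offset, meant to be reused across iterator steps.
final class ToyRegionValue(var region: ToyIntRegion, var offset: Long) {
  def set(newRegion: ToyIntRegion, newOffset: Long): Unit = {
    region = newRegion
    offset = newOffset
  }
}

object ReuseDemo {
  // Yields the same ToyRegionValue object every time, repointed at the next offset.
  def values(region: ToyIntRegion, offsets: Seq[Long]): Iterator[ToyRegionValue] = {
    val rv = new ToyRegionValue(region, 0L)
    offsets.iterator.map { off =>
      rv.set(region, off)
      rv
    }
  }

  def sum(): Int = {
    val region = new ToyIntRegion(Array(5, 7, 11))
    var total = 0
    // Each rv must be consumed before the iterator advances.
    for (rv <- values(region, Seq(0L, 1L, 2L)))
      total += rv.region.loadInt(rv.offset)
    total
  }
}
```

The trade-off of this design is that the yielded object is only valid until the next step: collecting the iterator into a list would give several references to one object, all pointing at the last offset.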