I’m having a bit of difficulty finding the right way to extend the ordered join implementation to Table
. Joins on Table
and MatrixTable
should share almost all of their code. It currently seems to me like the most natural way to achieve that is to implement joins on RVD
. The only problem is that RVD
isn’t keyed, so joins don’t make sense.
One option would be to put keys in the parameters of RVD.join
, but that doesn’t feel right. As far as I’m aware, all of our uses of RVD
carry key information already, so it seems natural to add keys to RVD
directly. Tim mentioned on slack that we might want to allow MatrixTable
to be backed by general RVD
, in which case it definitely seems like we’d want RVD
to be keyed.
It seems reasonable to move the OrderedRVDType
member from OrderedRVD
to RVD
(and rename OrderedRVDType
to RVDType
). It’s a little odd to have a partition key for something which isn’t partitioned by keys, but I think the point is that it could be partitioned by keys, and that we want to repartition in order to join with an OrderedRVD
with compatible keys.
If we did this, my join strategy would be:
-
Add a method to
RVD
calledconformToPartitioner
(or something) which takes anOrderedRVDPartitioner
compatible with theRVD
s key schema. If theRVD
is unordered, this will do a shuffle-sort; if it is already ordered it will do a narrow-dependency repartitioning. In both cases, any data falling outside the new partitioner’s partitions get dropped. -
Add a
join
method toRVD
. If both left and rightRVD
s are already ordered, choose a new partitioner (depending on whether the join is left, outer, etc.), repartition both using the previous method so they now have a common partitioner, then usezipPartitions
and the existing iterator joins.If they aren’t both ordered, then we might choose the new partitioner differently (e.g. if
left
is ordered andright
is not, just useleft
s partitioner), and repartitioning the unordered one will require a shuffle, but otherwise the idea is the same.
As an aside, I think that the conformToPartitioner
method should be put into the MatrixIR
and TableIR
. That way the optimizer could bubble repartitioning steps up as early as possible. In effect, this just means using knowledge of the later steps of the pipeline to make smart choices about partitioning in the beginning, to avoid the need to repartition as much as possible. E.g. the inputs to a join should be computed with the same partitioner in the first place, so that no shuffling is needed.
Before I start making fundamental changes, I’d like feedback on:
- Should
RVD
carry a list of key fields? - Should
RVD
also carry a list of partition key fields? (In which caseRVD
andOrderedRVD
have the same type data, which makes things easier). - Is there a better approach to making a common implementation of joins for
Table
andMatrixTable
?