I’m having a bit of difficulty finding the right way to extend the ordered join implementation to
Table. Joins on
MatrixTable should share almost all of their code. It currently seems to me like the most natural way to achieve that is to implement joins on
RVD. The only problem is that
RVD isn’t keyed, so joins don’t make sense.
One option would be to put keys in the parameters of
RVD.join, but that doesn’t feel right. As far as I’m aware, all of our uses of
RVD carry key information already, so it seems natural to add keys to
RVD directly. Tim mentioned on slack that we might want to allow
MatrixTable to be backed by general
RVD, in which case it definitely seems like we’d want
RVD to be keyed.
It seems reasonable to move the
OrderedRVDType member from
RVD (and rename
RVDType). It’s a little odd to have a partition key for something which isn’t partitioned by keys, but I think the point is that it could be partitioned by keys, and that we want to repartition in order to join with an
OrderedRVD with compatible keys.
If we did this, my join strategy would be:
Add a method to
conformToPartitioner(or something) which takes an
OrderedRVDPartitionercompatible with the
RVDs key schema. If the
RVDis unordered, this will do a shuffle-sort; if it is already ordered it will do a narrow-dependency repartitioning. In both cases, any data falling outside the new partitioner’s partitions get dropped.
RVD. If both left and right
RVDs are already ordered, choose a new partitioner (depending on whether the join is left, outer, etc.), repartition both using the previous method so they now have a common partitioner, then use
zipPartitionsand the existing iterator joins.
If they aren’t both ordered, then we might choose the new partitioner differently (e.g. if
leftis ordered and
rightis not, just use
lefts partitioner), and repartitioning the unordered one will require a shuffle, but otherwise the idea is the same.
As an aside, I think that the
conformToPartitioner method should be put into the
TableIR. That way the optimizer could bubble repartitioning steps up as early as possible. In effect, this just means using knowledge of the later steps of the pipeline to make smart choices about partitioning in the beginning, to avoid the need to repartition as much as possible. E.g. the inputs to a join should be computed with the same partitioner in the first place, so that no shuffling is needed.
Before I start making fundamental changes, I’d like feedback on:
RVDcarry a list of key fields?
RVDalso carry a list of partition key fields? (In which case
OrderedRVDhave the same type data, which makes things easier).
- Is there a better approach to making a common implementation of joins for