Things that illegally use MatrixValue right now


#1

Table

to_matrix_table
expand_types
groupByKey
order_by
Repartition
n_partitions
Sample
Head
add_index
Interval join

MatrixTable

annotateRowsTable
annotateColsTable
Repartition
n_partitions
Count (all)
distinct_by_row
distinct_by_col
dropRows
explodeRows
explodeCols
filterCols
Head
UnionCols
UnionRows
ChooseCols
sampleRows
Coalesce
naiveCoalesce
unfilterEntries
trioMatrix
add_row_index
add_col_index
filter_partitions
window_variants
PCA
Regressions
Relatedness methods
rename_duplicates
SplitMulti
VEP
Nirvana
Concordance
MIS (maximal independent set)
Balding-Nichols
LD prune
Filter intervals
Sample QC

Cotton Seed: expand_types can be rewritten in Python.
n_partitions definitely shouldn’t be an IR node.

Cotton Seed: Sample is just filter (once we fix determinism)
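A toy sketch of that idea in plain Python (not Hail code): sampling written as a filter whose predicate draws from a seeded generator, so rerunning with the same seed keeps the same rows. The name sample_rows, the seed parameter, and the list-of-rows representation are assumptions made up for illustration.

```python
import random


def sample_rows(rows, p, seed=0):
    """Keep each row independently with probability p.

    Hypothetical helper: sampling is just a filter whose predicate
    is a random draw. Seeding the generator makes the result
    reproducible for a fixed input order.
    """
    rng = random.Random(seed)
    # One draw per row, in row order, so the same seed and the same
    # input order always select the same rows.
    return [row for row in rows if rng.random() < p]
```

Note the result depends on row order as well as the seed, which is part of why determinism has to be sorted out before treating sample as a plain filter.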

Cotton Seed: add_index is going to become a scan, which will have an IR representation, but one analogous to aggregators at the value level.
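A minimal sketch of what "add_index as a scan" could look like, in plain Python rather than Hail IR. Here a scan is taken to mean an aggregation over the rows strictly before the current one; the helper names scan and add_index, and rows as dicts, are assumptions for illustration only.

```python
def scan(rows, zero, step):
    """Yield (row, state) pairs, where state aggregates only the
    rows seen before the current row."""
    state = zero
    for row in rows:
        yield row, state
        state = step(state, row)


def add_index(rows, name="idx"):
    """Row index expressed as a scan: count the preceding rows."""
    return [
        {**row, name: count}
        for row, count in scan(rows, 0, lambda n, _row: n + 1)
    ]
```

With this shape, the index is just one choice of zero and step; other running aggregations reuse the same scan.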

Cotton Seed: OK, @Patrick Schultz is doing some join work, which I think will unify some of these join operations. In particular: annotateRowsTable, annotateColsTable, Union{Rows, Cols}, and maybe Table.intervalJoin (although I guess that one will have to have some explicit IR representation).

Cotton Seed: MT.sampleRows also becomes filter.

Tim Poterba: dropRows and dropCols too

Cotton Seed: I think we can move SplitMulti fully to Python.

Cotton Seed: Same with add_{col,row}_index and scans.

Cotton Seed: ChooseCols should get removed, I don’t even know why that’s a thing.

Cotton Seed: I think filter_partitions shouldn’t be in the IR either.

Cotton Seed: We can definitely write BN (Balding-Nichols) 100% in Python.
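For a sense of how little is needed, here is a rough standard-library sketch of the Balding-Nichols model itself, not of Hail's implementation. The function name, its parameters, the uniform(0.1, 0.9) ancestral frequency draw, and the uniform population assignment are all assumptions chosen for illustration.

```python
import random


def balding_nichols(n_pops, n_samples, n_variants, fst=0.1, seed=0):
    """Simulate genotypes (0, 1 or 2 alt alleles) under Balding-Nichols.

    Hypothetical toy version: every population shares one fst, and
    samples are assigned to populations uniformly at random.
    """
    if not 0.0 < fst < 1.0:
        raise ValueError("fst must be strictly between 0 and 1")
    rng = random.Random(seed)
    # Assign each sample to a population.
    pops = [rng.randrange(n_pops) for _ in range(n_samples)]
    scale = (1.0 - fst) / fst
    genotypes = []
    for _ in range(n_variants):
        # Ancestral allele frequency (assumed distribution).
        p = rng.uniform(0.1, 0.9)
        # Per-population frequency ~ Beta(p * scale, (1 - p) * scale).
        pop_af = [
            rng.betavariate(p * scale, (1.0 - p) * scale)
            for _ in range(n_pops)
        ]
        # Genotype = number of alt alleles in two independent draws.
        genotypes.append(
            [sum(rng.random() < pop_af[k] for _ in range(2)) for k in pops]
        )
    return pops, genotypes
```

The result is a population label per sample plus a variants-by-samples list of genotype counts.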

Cotton Seed: And Concordance becomes entries join.
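A toy illustration of concordance as an entries join, in plain Python rather than Hail: each dataset is modeled as a dict from (row key, col key) to a call, the join is over the shared keys, and the result counts each (left call, right call) pair. The dict representation and the function name are assumptions. This sketch is an inner join only, so entries present on just one side are dropped instead of being counted.

```python
from collections import Counter


def concordance(left, right):
    """Count (left_call, right_call) pairs over entries in both datasets.

    left and right map (row_key, col_key) -> call. Hypothetical
    simplification: inner join on the entry key only.
    """
    shared_keys = left.keys() & right.keys()
    return Counter((left[key], right[key]) for key in shared_keys)
```

The diagonal of the resulting counts (pairs where both calls agree) gives the concordant entries; everything off the diagonal is discordant.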