Pushing the Limits of Expressions

Robert asked today for a few operations on Expressions:

  • e.show()
  • e.take(n)
  • e.collect()

This was motivated by the lack of vds.sample_ids.

Tim proposes that e.show() work on any expression with one or zero indices (row-indexed, column-indexed, or global), and that it print like this:

In[0]: ds.s.show()
Out[0]:
+--------+
| e      |
+--------+
| String |
+--------+
| NA1232 |
| NA4499 |
|   ...  |
+--------+

e.show() is more or less implemented as:

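if isinstance(e.source, Table):
    # Table expression: select it under a placeholder name and show that table.
    e.source.select(some_name=e).show()
else:
    assert isinstance(e.source, MatrixTable)
    if e.indices == e.source.row_indices:
        # Row-indexed expression: show it from the rows table.
        e.source.rows().select(some_name=e).show()
    elif e.indices == e.source.col_indices:
        # Column-indexed expression: show it from the columns table.
        e.source.cols().select(some_name=e).show()
    else:
        # Neither row- nor column-indexed: not handled yet.
        raise NotImplementedError("unsupported expression indices")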

Thus the distinction between a (Matrix)Table and an expression continues to blur. However, I like this. (Pandas has something similar.)

I think global expressions, which fall into the last case, can just be evaluated directly.
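
To make the dispatch concrete, here is a toy sketch of how take and collect could sit on the same logic, with the global case evaluated directly. Everything here is hypothetical: ToyMatrixTable and ToyExpression are stand-ins invented for illustration, not the real classes, and returning a one-element list for a global expression is just one possible choice.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple


@dataclass
class ToyMatrixTable:
    """Hypothetical stand-in for a MatrixTable: row records, column records, globals."""
    rows: List[Dict[str, Any]]
    cols: List[Dict[str, Any]]
    globals_: Dict[str, Any] = field(default_factory=dict)


@dataclass
class ToyExpression:
    """Hypothetical expression: its source, the axes it is indexed by, and a per-record function."""
    source: ToyMatrixTable
    indices: Tuple[str, ...]  # (), ("row",) or ("col",)
    fn: Callable[[Dict[str, Any]], Any]

    def collect(self) -> List[Any]:
        if self.indices == ("row",):
            records = self.source.rows
        elif self.indices == ("col",):
            records = self.source.cols
        elif self.indices == ():
            # Global expression: there is no axis to iterate, so evaluate it once.
            return [self.fn(self.source.globals_)]
        else:
            raise NotImplementedError(f"unsupported expression indices: {self.indices}")
        return [self.fn(record) for record in records]

    def take(self, n: int) -> List[Any]:
        # Toy version: collect everything, then keep the first n values.
        return self.collect()[:n]


mt = ToyMatrixTable(
    rows=[{"locus": "1:100"}, {"locus": "1:200"}],
    cols=[{"s": "NA1232"}, {"s": "NA4499"}],
    globals_={"n_batches": 3},
)
sample_ids = ToyExpression(mt, ("col",), lambda col: col["s"])
n_batches = ToyExpression(mt, (), lambda g: g["n_batches"])
print(sample_ids.collect())
print(sample_ids.take(1))
print(n_batches.collect())
```

In this toy, collecting the column-indexed sample ID expression plays the role that vds.sample_ids used to play.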

I’ll prototype show/take/collect on the train home today.

We discussed that expressions should probably include their keys (at least if they’re not a subset of the key):

In[1]: ds.s.is_case.show()
Out[1]:
+--------+---------+
| s      | is_case |
+--------+---------+
| String | Boolean |
+--------+---------+
| NA1232 |    True |
| NA4499 |   False |
|   ...  |   ...   |
+--------+---------+
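
A rough sketch of the "include the keys" rule, assuming a simplified model where a column is just a (name, type name, formatted values) tuple. The names Column, columns_to_show and render are made up for this illustration, and the check "is the expression itself one of the keys" is a simplification of the subset-of-the-key condition. Cells are left-justified here, so alignment differs slightly from the mock-up above.

```python
from typing import List, Sequence, Tuple

# Hypothetical column model: (name, type name, already-formatted values).
Column = Tuple[str, str, Sequence[str]]


def columns_to_show(expr: Column, keys: List[Column]) -> List[Column]:
    """Prepend the key columns unless the expression is itself one of the keys."""
    key_names = [key[0] for key in keys]
    if expr[0] in key_names:
        return [expr]
    return keys + [expr]


def render(columns: List[Column]) -> str:
    """Draw the columns as an ASCII table with a name row and a type row."""
    widths = [
        max(len(name), len(typ), *(len(v) for v in values))
        for name, typ, values in columns
    ]
    rule = "+" + "+".join("-" * (w + 2) for w in widths) + "+"

    def line(cells: Sequence[str]) -> str:
        return "| " + " | ".join(c.ljust(w) for c, w in zip(cells, widths)) + " |"

    n_rows = len(columns[0][2])
    out = [
        rule,
        line([name for name, _, _ in columns]),
        rule,
        line([typ for _, typ, _ in columns]),
        rule,
    ]
    for i in range(n_rows):
        out.append(line([values[i] for _, _, values in columns]))
    out.append(rule)
    return "\n".join(out)


s = ("s", "String", ["NA1232", "NA4499"])
is_case = ("is_case", "Boolean", ["True", "False"])
print(render(columns_to_show(is_case, keys=[s])))  # key column comes along
print(render(columns_to_show(s, keys=[s])))        # the key alone, not repeated
```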