Name for method that adds row/col index

view_join_rows is becoming index_rows.

What should index_rows (index on Table) become?

I have suggested with_index / with_row_index; Cotton has suggested annotate_rows_with_row_index / annotate_with_row_index.

I think I like Cotton’s. It’s wordy, but since it’s temporary and going away when scans are in, I don’t think it matters too much. Alternatively, maybe add_row_index, etc.?

I don’t like with_index. Cotton’s will fall nicely under annotate_rows in the docs. How about: annotate_rows_with_index. It’s shorter and just as clear.

Although these do sound like they’re making the index available for annotation within that step, which they’re not.

I like add_index / add_row_index / add_col_index

Seems like

annotate_index / annotate_row_index / annotate_col_index

fits our vocabulary better.

In my mind annotate rows is opening up a context where you annotate the rows of the matrix. These methods that add the index are doing something a bit different.

Ah, I always thought annotating was just adding a field (or some data) to a table; for example, in our old vocabulary, annotate_rows_table. We weren’t opening up a context, we were adding the contents of a table into the rows.

I still very much dislike ‘annotate’

I also like add_index, etc. As long as view_join_rows => index_rows, I don’t care too much because, as Konrad says, this is temporary (and will be marked as experimental) until scans come online.

I still very much dislike ‘annotate’

On that we are all obviously in violent agreement.

Oh man, I’d love to kill ‘annotate’.

How about we replace ‘annotate’ with ‘add’ everywhere and stick to appending rows, cols, and entries where it makes sense:

add
add_index
drop
filter
select
transmute
explode
etc

For the record, I used add_row_index, etc.: https://github.com/hail-is/hail/pull/3057
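
To make the semantics behind the name concrete, here is a minimal sketch of what an add_index-style method does. It is a toy model over a plain list of dicts, not Hail's implementation; the `rows` representation, the `name` parameter, and its default are assumptions made only for illustration.

```python
from typing import Any, Dict, List

Row = Dict[str, Any]


def add_index(rows: List[Row], name: str = "idx") -> List[Row]:
    """Return new rows, each carrying its 0-based position under `name`.

    Hypothetical helper for illustration only: a "table" here is just a
    list of dicts, and the input rows are left unmodified.
    """
    if any(name in row for row in rows):
        raise ValueError(f"field {name!r} already exists")
    # Copy each row and attach its position as an ordinary field.
    return [{**row, name: i} for i, row in enumerate(rows)]


rows = [{"sample": "a"}, {"sample": "b"}, {"sample": "c"}]
indexed = add_index(rows, name="row_idx")
```

The index ends up as an ordinary field on the result, so it is only usable in later steps. That is the behavior "add" describes, and it is why the name does not suggest the index is available inside the same annotation step.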