LowerTableIR and nested CDAs

cseed · July 1, 2020, 1:19pm

Here is another perspective:

I’d call that a subquery (although it is a trivial one), and I’d say it should error because we don’t support subqueries. The user should explicitly use rbind to write a subquery-free pipeline:

hl.rbind(ht1.aggregate(hl.agg.count(), _localize=False),
  lambda ht1count: ht2.aggregate(hl.agg.count() + ht1count))

extracts these nested patterns as RelationalLet IRs

I’m confused why these would be relational lets. I thought relational lets are only needed when pulling out values in a “freestanding” matrix or table IR. The IR handed to lowering is always a value IR at the top level, so subqueries can be pulled up to the value level and bound in a normal let.

Proposal 2

We need to recompute requiredness for each of these nodes. This scales quadratically with the size of an IR.

I don’t understand why this is necessary. Just pass in the requiredness analysis recursively.
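To make the cost difference concrete, here is a toy sketch in Python. It is not Hail code: `Node`, `analyze`, and both `lower_*` functions are hypothetical stand-ins, and it assumes the lowering can look up analysis results for the original nodes by identity. It counts how many nodes the analysis touches when it is rerun at every node versus computed once and passed down the recursion.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class Node:
    """Hypothetical toy IR node (not one of Hail's IR classes)."""
    name: str
    children: List["Node"] = field(default_factory=list)


def analyze(root: Node, counter: List[int]) -> Dict[int, bool]:
    """Toy stand-in for a requiredness analysis: one pass over the subtree.

    counter[0] is incremented once per node visited.
    """
    result: Dict[int, bool] = {}
    stack = [root]
    while stack:
        n = stack.pop()
        counter[0] += 1
        result[id(n)] = True  # placeholder result, keyed by node identity
        stack.extend(n.children)
    return result


def lower_recomputing(node: Node, counter: List[int]) -> None:
    """Reruns the analysis on the subtree at every node."""
    analyze(node, counter)
    for child in node.children:
        lower_recomputing(child, counter)


def lower_sharing(node: Node, analysis: Dict[int, bool]) -> None:
    """Looks up the result computed once at the top and passes it down."""
    _ = analysis[id(node)]
    for child in node.children:
        lower_sharing(child, analysis)


def chain(n: int) -> Node:
    """Builds a linear IR of n nodes, the worst case for recomputation."""
    node = Node("leaf")
    for i in range(n - 1):
        node = Node(f"n{i}", [node])
    return node


def count_recomputing(root: Node) -> int:
    counter = [0]
    lower_recomputing(root, counter)
    return counter[0]


def count_sharing(root: Node) -> int:
    counter = [0]
    analysis = analyze(root, counter)
    lower_sharing(root, analysis)
    return counter[0]
```

On a chain of 100 nodes, `count_recomputing` touches 100 + 99 + ... + 1 = 5050 nodes while `count_sharing` touches 100. The catch in a real lowering pass is that newly created nodes have no entry in the shared result, so this only works if those nodes can reuse or derive results from the originals.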

We can emit CDAs inside of loops, and we currently don’t have a code motion pass to clean this up.

I’m also confused by this. If you have a relational node in a loop, you need to run it on each iteration. Or do you mean in a loop inside e.g. TableFilter? That has to get pulled out.

To implement proposal 2, you need lowering to keep track of and return the set of relational nodes that have been pulled out so they can be placed in globals when you hit the value level again.
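Roughly, the bookkeeping could look like the following toy sketch. None of this is LowerTableIR itself: the node classes, `lower`, and `lower_toplevel` are hypothetical, `Subquery` stands in for any nested relational node, and plain lets stand in for placing the results in globals. The recursive lowering appends each pulled-out subquery to a list, and the top level binds them once it is back at the value level.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Lit:
    value: int


@dataclass(frozen=True)
class Ref:
    name: str


@dataclass(frozen=True)
class Add:
    left: "IR"
    right: "IR"


@dataclass(frozen=True)
class Subquery:
    """Stand-in for a nested relational node, e.g. an aggregate over a table."""
    body: "IR"


@dataclass(frozen=True)
class Let:
    name: str
    value: "IR"
    body: "IR"


IR = Union[Lit, Ref, Add, Subquery, Let]


def lower(ir: IR, pulled: List[Tuple[str, IR]]) -> IR:
    """Lowers ir, recording every pulled-out subquery in `pulled`."""
    if isinstance(ir, Subquery):
        # Lower the body first so nested subqueries are recorded before
        # the subquery that depends on them.
        lowered_body = lower(ir.body, pulled)
        name = f"__sub{len(pulled)}"
        pulled.append((name, lowered_body))
        return Ref(name)
    if isinstance(ir, Add):
        return Add(lower(ir.left, pulled), lower(ir.right, pulled))
    return ir  # Lit and Ref need no lowering in this toy


def lower_toplevel(ir: IR) -> IR:
    """Back at the value level: bind the pulled-out subqueries in order."""
    pulled: List[Tuple[str, IR]] = []
    body = lower(ir, pulled)
    # Wrap from last to first so earlier bindings enclose later ones.
    for name, value in reversed(pulled):
        body = Let(name, value, body)
    return body
```

For example, lowering `Add(Subquery(Lit(1)), Subquery(Add(Lit(2), Subquery(Lit(3)))))` yields three nested lets, with the innermost subquery bound before the one that refers to it.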

These proposals essentially seem the same to me: either you pull things out beforehand, or you pull them out while you do the lowering. I’d probably do proposal 2 since LowerTableIR already has all the infrastructure to do this. You can just throw the lowered subqueries in the globals, right?
