This is a thread for ideas for improving the generated code (for a fixed IR). Compare: Optimization ideas.
-
Right now ExtractAggregators produces two IRs: the aggregation IR and the post-agg IR. The post-agg IR takes the aggregation results as an argument, and they are built using RegionValueBuilder. We should instead pass the array of RegionValueAggregators to the post-agg IR and add an IR node, AggResult(i), to produce the result of the i-th aggregator. I think then MapRows can be all one thing.
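To make the AggResult(i) idea concrete, here is a minimal toy sketch in Python (not the actual Scala code). The classes `SumAggregator`, `Const`, `Add` and the function `evaluate` are hypothetical stand-ins invented for illustration. Only the name AggResult comes from the proposal above. The point is that the post-agg expression reads results straight out of the aggregator array instead of receiving a pre-built value.

```python
from dataclasses import dataclass


class SumAggregator:
    """Hypothetical stand-in for a RegionValueAggregator."""

    def __init__(self):
        self.total = 0

    def seq_op(self, x):
        self.total += x

    def result(self):
        return self.total


@dataclass
class Const:
    value: int


@dataclass
class Add:
    left: object
    right: object


@dataclass
class AggResult:
    # Index into the aggregator array passed to the post-agg IR.
    i: int


def evaluate(ir, aggregators):
    """Evaluate a toy post-agg IR against an array of aggregators."""
    if isinstance(ir, Const):
        return ir.value
    if isinstance(ir, AggResult):
        # No intermediate result struct is built; ask the aggregator directly.
        return aggregators[ir.i].result()
    if isinstance(ir, Add):
        return evaluate(ir.left, aggregators) + evaluate(ir.right, aggregators)
    raise TypeError(f"unknown IR node: {ir!r}")


# Aggregation phase: feed values into two aggregators.
aggs = [SumAggregator(), SumAggregator()]
for x in [1, 2, 3]:
    aggs[0].seq_op(x)
    aggs[1].seq_op(10 * x)

# Post-agg phase: the IR refers to aggregators by index.
post_agg = Add(AggResult(0), Add(AggResult(1), Const(1)))
value = evaluate(post_agg, aggs)
```

In this toy model the post-agg IR and the aggregators live in the same evaluation, which is roughly what "MapRows can be all one thing" would need.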
-
Add RegionValueAggregator.clear, so that MatrixMapRows can reset and reuse its aggregators and avoid additional allocations.
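A rough illustration of the clear idea, as a Python toy rather than the real Scala classes: `SumAggregator`, `instances_created`, and `map_rows` are hypothetical names made up for this sketch. One aggregator is allocated up front and reset per row, instead of a fresh one being created for every row.

```python
class SumAggregator:
    """Hypothetical stand-in for a RegionValueAggregator with a clear method."""

    instances_created = 0  # counts allocations, for illustration only

    def __init__(self):
        SumAggregator.instances_created += 1
        self.total = 0

    def seq_op(self, x):
        self.total += x

    def result(self):
        return self.total

    def clear(self):
        # Reset state so the same object can be reused for the next row.
        self.total = 0


def map_rows(rows):
    """Aggregate each row's entries, reusing a single aggregator."""
    agg = SumAggregator()
    out = []
    for row in rows:
        agg.clear()
        for x in row:
            agg.seq_op(x)
        out.append(agg.result())
    return out


row_sums = map_rows([[1, 2], [3, 4, 5], []])
```

Here three rows are processed with a single allocation. Whether the real aggregators can be reset this cheaply depends on their internal state, which this sketch does not model.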
-
Once dead field elimination is done, the decoder should be modified to only build values with the required fields (huge, huge).
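As a sketch of what a field-pruning decoder could look like, here is a Python toy. The wire format (a 4-byte little-endian length followed by UTF-8 bytes for each field) and the names `encode_row` and `decode_row` are assumptions for illustration only, not the actual encoding. The decoder still walks every field to find its boundaries, but it only materializes the required ones.

```python
import struct


def encode_row(row, field_names):
    """Encode each field as a 4-byte little-endian length plus UTF-8 payload."""
    out = bytearray()
    for name in field_names:
        payload = str(row[name]).encode("utf-8")
        out += struct.pack("<I", len(payload))
        out += payload
    return bytes(out)


def decode_row(buf, field_names, required):
    """Decode only the fields named in `required`; skip over the rest."""
    result = {}
    offset = 0
    for name in field_names:
        (length,) = struct.unpack_from("<I", buf, offset)
        offset += 4
        if name in required:
            # Only required fields are actually built.
            result[name] = buf[offset:offset + length].decode("utf-8")
        # Dead fields are skipped by advancing the offset, with no allocation.
        offset += length
    return result


fields = ["locus", "alleles", "info"]
encoded = encode_row(
    {"locus": "1:100", "alleles": "A/T", "info": "x" * 1000}, fields
)
pruned = decode_row(encoded, fields, required={"locus"})
```

In this example the large `info` payload is never decoded or copied, which is where the savings would come from when most fields are dead.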
-
MakeStruct(…, MakeStruct(…), …) currently builds the inner struct and then copies it into the outer struct (see the TBaseStruct case of addIRIntermediate). The inner struct could instead be built in place. We could maybe think of this as “continuation passing” for StagedRegionValueBuilder. This can also include AggResult.
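The difference between copying and building in place can be sketched with a toy builder in Python. `FlatBuilder` and its methods are hypothetical, and the flat list stands in for a region: the real layout (offsets, missing bits, alignment) is not modeled. In the "continuation passing" variant, the code that produces the inner struct's fields is handed the outer builder and writes directly into it.

```python
class FlatBuilder:
    """Toy stand-in for a staged region value builder over one flat buffer."""

    def __init__(self):
        self.buffer = []
        self.copies = 0  # number of values copied between buffers

    def add_value(self, v):
        self.buffer.append(v)

    def add_struct_by_copy(self, fields):
        # Build the inner struct in its own buffer, then copy it over.
        inner = FlatBuilder()
        for f in fields:
            inner.add_value(f)
        self.buffer.extend(inner.buffer)
        self.copies += len(inner.buffer)

    def add_struct_in_place(self, build_fields):
        # Continuation-passing style: the caller's function receives this
        # builder and writes the inner fields directly into the outer buffer.
        build_fields(self)


def build_inner(builder):
    """Writes the inner struct's fields into whichever builder it is given."""
    builder.add_value(2)
    builder.add_value(3)


by_copy = FlatBuilder()
by_copy.add_value(1)
by_copy.add_struct_by_copy([2, 3])
by_copy.add_value(4)

in_place = FlatBuilder()
in_place.add_value(1)
in_place.add_struct_in_place(build_inner)
in_place.add_value(4)
```

Both builders end up with the same contents, but only the copying version pays for moving the inner fields a second time. An AggResult-style node could plug in the same way, by writing its result into the builder it is handed.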