Dealing with Large Numbers of Fields
In the past, we encountered a Spark bug when the number of columns in a Row was so large that the code generated for equality and ordering between Rows exceeded the Java method size limit.
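We have also occasionally generated JVM methods that are too large to be compiled by the JIT. By default, HotSpot does not JIT-compile methods with more than 8000 bytes of bytecode, so such methods run in the interpreter, which causes a severe performance loss.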
Most recently, we introduced a Hail-specific “Method Code Too Large” bug when a user attempted to rename a table with thousands of fields. This rename used a new code path that triggered IR generation. The IR code literally copied the struct with no modifications. Longer term, we would like rename to be a no-op. However, this revealed a more fundamental issue: transmute, et al. will all fail on tables with large numbers of columns because all of these methods will attempt to generate a struct with thousands of fields. The bytecode to generate that struct will necessarily exceed the JVM method code size limit (65535 bytes per method).
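A back-of-the-envelope calculation makes the limit concrete. The 65535-byte cap comes from the JVM class file format. The per-field costs below are assumed, illustrative figures, not measurements of Hail's generated code, and Python is used only for illustration.

```python
# The JVM class file format requires a method's code to be shorter than 65536 bytes.
JVM_METHOD_CODE_LIMIT = 65535


def max_fields_per_method(bytes_per_field: int, fixed_overhead: int = 0) -> int:
    """Largest field count whose generated code still fits in one JVM method."""
    return (JVM_METHOD_CODE_LIMIT - fixed_overhead) // bytes_per_field


# Hypothetical costs, in bytes of bytecode emitted per struct field.
for cost in (20, 50, 100):
    print(f"{cost} bytes/field -> at most {max_fields_per_method(cost)} fields")
```

Even at an optimistic 20 bytes per field, a single method tops out at a few thousand fields, which is the range where users hit the bug.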
Short Term Fix
Amanda added a safeguard in https://github.com/hail-is/hail/pull/3233#event-1542109178 which uses the annotation path for tables with more than 500 fields.
This buys us time until we are forced to remove the
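The shape of such a safeguard can be sketched as a simple threshold check. Only the 500-field cutoff comes from the text above. The function and constant names are hypothetical and do not reflect the code in the pull request.

```python
# Threshold from the text; all names in this sketch are hypothetical.
MAX_FIELDS_FOR_IR = 500


def choose_code_path(n_fields: int) -> str:
    """Pick the code path for a table operation based on its field count."""
    # Tables with more than 500 fields fall back to the annotation path,
    # which avoids generating one oversized method.
    return "annotation" if n_fields > MAX_FIELDS_FOR_IR else "ir"
```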
Options for Long Term Fixes
Fewer JVM Bytecodes
In certain circumstances, we can avoid generating bytecode linear in the number of fields. For example, if a struct is copied without modification, then we may memcpy its bytes. If only one field of a struct is modified, we can memcpy the bytes to the left and to the right of that field, as long as the new missing bits don’t change the layout. It’s unclear how far we can get with this strategy alone.
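The block-copy idea can be sketched as follows. The layout here is a simplifying assumption (a header with one missing bit per field, followed by fixed-width 8-byte integer fields) and is not Hail's actual region value layout. The sketch also assumes the replaced field is present both before and after the update, so the missing bits are untouched.

```python
import struct

FIELD_SIZE = 8  # assumption: every field is an 8-byte little-endian integer


def missing_bytes(n_fields: int) -> int:
    """Bytes needed to hold one missing bit per field."""
    return (n_fields + 7) // 8


def field_offset(n_fields: int, i: int) -> int:
    """Byte offset of field i, after the missing-bit header."""
    return missing_bytes(n_fields) + i * FIELD_SIZE


def copy_struct(src: bytes) -> bytearray:
    """Unmodified copy: one block copy, independent of the field count."""
    return bytearray(src)


def replace_field(src: bytes, n_fields: int, i: int, value: int) -> bytearray:
    """Copy src with field i replaced: two block copies plus one write."""
    start = field_offset(n_fields, i)
    end = start + FIELD_SIZE
    out = bytearray(len(src))
    out[:start] = src[:start]                  # header and fields left of i
    struct.pack_into("<q", out, start, value)  # the single modified field
    out[end:] = src[end:]                      # fields right of i
    return out


n = 1000
row = bytes(missing_bytes(n)) + struct.pack(f"<{n}q", *range(n))
new_row = replace_field(row, n, 42, -1)
```

The amount of code in `replace_field` is constant regardless of `n`, which is the property we want from the generated bytecode.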
Many JVM Methods
Using class variables to store StagedRegionValueBuilder state, we can break the JVM code into smaller methods that are each below the JVM method size limit. Consider, for example, generating a method for each field of the top-level struct, or for a group of fields. The JVM class file spec uses two bytes to store the number of methods, so we certainly cannot exceed 65535 methods per class. I do not know whether some lower practical limit on method count exists.
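A rough analogue of this splitting, written as a small Python source generator rather than a JVM bytecode emitter, might look like the sketch below. The class and method names are hypothetical, and the instance attribute `self.out` stands in for the builder state that would be held in class variables.

```python
def generate_builder_source(n_fields: int, fields_per_method: int) -> str:
    """Emit source for a class that builds a struct in several small methods."""
    lines = [
        "class GeneratedBuilder:",
        "    def __init__(self):",
        "        self.out = []  # shared builder state lives on the instance",
        "",
    ]
    chunk_names = []
    for start in range(0, n_fields, fields_per_method):
        name = f"add_fields_{start}"
        chunk_names.append(name)
        # One small method per group of fields.
        lines.append(f"    def {name}(self, row):")
        for i in range(start, min(start + fields_per_method, n_fields)):
            lines.append(f"        self.out.append(row[{i}])")
        lines.append("")
    # The driver only calls the per-group methods, so it stays short.
    lines.append("    def build(self, row):")
    for name in chunk_names:
        lines.append(f"        self.{name}(row)")
    lines.append("        return self.out")
    return "\n".join(lines)


source = generate_builder_source(n_fields=1000, fields_per_method=100)
namespace = {}
exec(source, namespace)
builder = namespace["GeneratedBuilder"]()
result = builder.build(list(range(1000)))
```

The driver method still grows with the number of groups, but far more slowly than a single method that touches every field. A further level of grouping could be added if it ever became too large.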
Generate x86 Assembly
This is viable long term, but we will probably remove AST before we can implement this approach.