On the subject of EmitTriplet

OK @pschultz, I’m convinced EmitTriplet should be a COption[PValue]. Concretely:

trait EmitTriplet {
  def apply(missing: () => Code[Nothing], present: (PValue) => Code[Nothing]): Code[Nothing]
}

although I would prefer to write:

case class EmitTriplet(f: (() => Code[Nothing], (PValue) => Code[Nothing]) => Code[Nothing])

to avoid the syntactic overhead of the method declaration and write:

  EmitTriplet { (missing, present) => ... }

The motivation for this I think is most clearly seen in the code generation rule for ArrayRef. Currently:

        val setup = Code(
          codeA.setup,
          xma := codeA.m,
          xa := pArray.defaultValue,
          codeI.setup,
          xmi := codeI.m,
          xi := coerce[Int](defaultValue(TInt32())),
          len := coerce[Int](defaultValue(TInt32())),
          (xmi || xma).mux(
            xmv := const(true),
            Code(
              xa := codeA.pv,
              xi := coerce[Int](codeI.v),
              len := pArray.loadLength(xa.load().tcode[Long]),
              (xi < len && xi >= 0).mux(
                xmv := !pArray.isElementDefined(xa.load().tcode[Long], xi),
                Code._fatal(errorTransformer(
                  const("array index out of bounds: index=")
                    .concat(xi.load().toS)
                    .concat(", length=")
                    .concat(len.load().toS)))))))

        EmitTriplet(setup, xmv, PValue(pt,
          Region.loadIRIntermediate(x.pType)(pArray.elementOffset(xa.load().tcode[Long], len, xi))))

This computes m up front and stores it. Clearly that code is horrendous. It should be three ifs (with two jumps to the common false case) and then the code that uses the element’s value. With this proposal, the code would look like:

        val L = new CodeLabel()
        EmitTriplet { (missing, present) =>
          a(Code(L, missing), { av =>
            i(L.goto, { iv =>
              Code(
                xa := av,
                xi := iv,
                len := pArray.loadLength(xa.load().tcode[Long]),
                (xi < len && xi >= 0).mux(
                  pArray.isElementDefined(xa.load().tcode[Long], xi)
                    .mux(present(
                        PValue(pt, Region.loadIRIntermediate(x.pType)(pArray.elementOffset(xa.load().tcode[Long], len, xi)))),
                         L.goto),
                  Code._fatal(errorTransformer(
                    const("array index out of bounds: index=")
                      .concat(xi.load().toS)
                      .concat(", length=")
                      .concat(len.load().toS))))))))
        }

This could be done incrementally, although switching from the labels to the values means introducing a local temporary which currently requires a MethodBuilder. With some abuse of notation:

class EmitTriplet {
  def apply(setup: Code[Unit], m: Code[Boolean], pv: PValue): EmitTriplet =
    new EmitTriplet {
      def emit(missing: Code[Unit], present: (PValue) => Code[Nothing]): Code[Nothing] = {
        m.mux(missing, present(pv))
      }

  def toValues(mb: MethodBuilder, et: EmitTriplet): EmitTriplet = {
      val L = new CodeLabel()
      val x = mb.newPField(et.pt)
      Code(et(m := true, L.goto), v2 => Code(x := false, v := v2, L.goto), L)
  }
}

However, lir can support local creation without a method builder. So maybe it makes sense to attempt it incrementally after that goes in. Also, there aren’t a ton of uses of it, so someone might be able to do it in one go.

I want to discuss what this looks like in the universe where Code has been split into (imperative) CodeBuilder and a value-carrying (non-Unit) Value[T]. This is what I called it in the other dev post, it is different than the Value[T] I just PR’ed.

A function returning Code[T] in the old world corresponds to a function taking a CodeBuilder and (possibly) returning a Value[T]. So the EmitTriplet becomes:

trait EmitTriplet {
  def apply(cb: CodeBuilder,
    missing: (CodeBuilder) => Unit,
    present: (CodeBuilder, PValue) => Unit): Unit
}

And what are we going to call this thing?

I assume @pschultz is on board with this. Comments, anyone else?

I was being a bit daft last night, and I will remark anything you can do with the new proposed implementation of EmitTriplet can be done with the old one. I rewrote the Emit case for the ArrayRef to generate the correct code in the current setup: https://github.com/hail-is/hail/pull/8245. I am still of the opinion the proposal is ergonomically better, because with it, it would have been hard not to write the correct code, however, this may make it slightly less urgent.

I think I’m convinced of that too.

On ergonomics, what I’d really like is for most uses of EmitTriplet to be written using an interface of things like map, flatMap, addSetup (or Code(setup: Code[Unit], et: EmitTriplet): EmitTriplet), and so be independent of the low level representation.

In that style, and assuming we add PArray.getElement(aoff: PValue, i: PValue): EmitTriplet, the array indexing code would look like:

codeA.flatMap { aoff =>
  codeI.flatMap { i =>
    Code(
      xa := aoff,
      xi := i,
      len := pArray.loadLength(xa.load().tcode[Long]),
      (xi.load() >= len.load() || xi.load() < 0).orEmpty(
        Code._fatal(errorTransformer(
          const("array index out of bounds: index=")
            .concat(xi.load().toS)
            .concat(", length=")
            .concat(len.load().toS)))),
      PArray.getElement(xa.load(), xi.load()))
  }
}

I think this is more readable, and much harder to write the wrong or suboptimal thing. Also, any improvements we make to the code generation for the handful of combinators apply everywhere.

I think I’m convinced of that too.

Great. So we should be finding and fixing bad code generation independent of this proposal.

be written using an interface of things like map , flatMap , addSetup

I think we are in disagreement about two things:

I don’t want code to be functional, I want it to be imperative. I think it makes the control flow of the generated code more transparent and it lets you do inline bindings (see below). But I agree it’s in a half-way state and needs to go one way or another.

Second, I don’t want EmitTriplet to abstract the low-level details unless we have a use case for it. I think it’s more work and I find people have more trouble reasoning about the abstract than the concrete case (myself included), although I realize that may not be your personal experience.

This does not however preclude map/flatMap which I think would have signatures:

class EmitTriplet {
  def map(f: (CodeBuilder, PValue) => PValue): EmitTriplet

  def flatMap(f: (CodeBuilder, PValue) => EmitTriplet): EmitTriplet
}

I also like to write (a, i).flatMap { (av, iv) => body } for

  a.flatMap { av =>
    i.flatMap { i =>
      body
    }
  }

So under my proposal the full code for ArrayRef would be:

        val L = new CodeLabel()
        (a, i).flatMap { (cb, av, iv) =>
          EmitTriplet { (cb, missing, present) =>
            val xa = cb.memoize(av)
            val xi = cb.memoize(iv)
            val len = cb.memoize(xa.loadLength())
            Code(
              (xi < len && xi >= 0).mux(
                xa.isElementDefined(xi)
                  .mux(
                    present(xa.loadElement(len, xi)),
                    missing)
                Code._fatal(errorTransformer(
                  const("array index out of bounds: index=")
                    .concat(xi.load().toS)
                    .concat(", length=")
                    .concat(len.load().toS)))))
      }

where cb.memoize returns a reusable PValue and we might need a cast to PIndexableValue in there somewhere.

I think your code needs an EmitTriplet constructor somewhere for the flatMaps and you don’t test the missingness of the element. And note your version has the definitions of xa, xi and len out of line.

Code(setup: Code[Unit], et: EmitTriplet): EmitTriplet

I’d probably object to this because it looks like a Code constructor but isn’t. But happy with addSetup or prepend or something.

edit: I made some code edits. I think things are roughly internally consistent now.

I’m fine with that. I was writing the example in the style I’m familiar with, but I think there’s a fairly mechanical translation between functional AST interfaces and imperative builder interfaces, so I think most of the discussion is orthogonal to that choice.

I was suggesting a method PContainer.getElement(aoff: PValue, i: PValue): EmitTriplet that does the missingness check.

Yeah, the functional version of memoize would be

def memoize[A](value: Code[_])(body: PValue => Code[A]): Code[A]

or something like that. Not sure if I have the Code/PValue distinction right. I agree the builder version makes for simpler code.

Thanks for sharing where you see the final version of this ending up. It looks good to me.

I was suggesting a method PContainer.getElement(aoff: PValue, i: PValue): EmitTriplet that does the missingness check.

Ah, got it. Yeah, that is better. I will updated my code snippet to use this and orElse (which I just call _if in the Unit case).

I agree the builder version makes for simpler code.

Great, that’s the main motivation.

Thanks for sharing where you see the final version of this ending up. It looks good to me.

You too, this is been a fun discussion! I look forward to continuing the other threads.

I think there’s a fairly mechanical translation between functional AST interfaces and imperative builder interfaces

I agree, to a point. In a sense, the functional setting is more general because the state is explicit and you can fork it or built it out of order and compose it. When you can translate, I’d probably argue you’re better off avoiding the additional generality using the imperative solution.

Side remark, the ArrayRef change failed due to Code[T] reuse issues. (I think we’re generating a ton of duplicate code right now.) I tried to push Value[T] into Emit, but I ran into issues with ParameterPack. So my current plan is:

  • EmitStream1 gets ripped out
  • rip out joinpoint (explicit labels) and ParameterPack (PValue)
  • push through Value[T] and the PValue analog (I will probably rename PValue => PCode, and add PValue as the analog to Value[T]),
  • retry to ArrayRef change
  • switch to lir