Hail expr types in Python


#1

We want a more user-friendly type system. NumPy has interesting infrastructure for creating dtypes, but it leaves a lot to be desired too. Here’s what I’m thinking:

  1. We leave the T-Type system alone, but build a helper layer on top of it so people never need to write TInt32(). Scalar types should be hl.int32, hl.str, etc. However, this presents a problem: the type names clash with existing expr functions (hl.str is the toString expr method). One possibility is that the types define a __call__ method that performs the expr conversion, but this may be confusing. It also doesn’t settle how array and struct types work. Should hl.array construct a type or a value? How about hl.struct?
  2. We write a type parser in Python/Cython that handles type strings similar to the ones we have in Scala. Cotton likes the following syntax:
       struct{x:array<int32>, y: str, z: struct{z2: dict<str, int32>}}
  3. We optionally accept these type strings in most places that require type inputs (hl.import_table, hl.null, etc).
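
To make the parser idea concrete, here is a rough sketch of a recursive-descent parser for the syntax above. The classes Scalar, Array, Dict and Struct are hypothetical stand-ins for the T-types, the set of scalar names is an assumption, and parse_type is an illustrative name, not an existing Hail function.

```python
import re
from dataclasses import dataclass
from typing import Tuple

# Hypothetical stand-ins for the T-types; not Hail's real classes.
@dataclass(frozen=True)
class Scalar:
    name: str

@dataclass(frozen=True)
class Array:
    element: object

@dataclass(frozen=True)
class Dict:
    key: object
    value: object

@dataclass(frozen=True)
class Struct:
    fields: Tuple[Tuple[str, object], ...]

# Assumed set of scalar type names, for illustration only.
SCALARS = {"int32", "int64", "float64", "bool", "str"}
# A token is an identifier or one of the punctuation characters { } < > : ,
TOKEN = re.compile(r"\s*([A-Za-z_][A-Za-z0-9_]*|[{}<>:,])")

def tokenize(s):
    tokens, pos = [], 0
    s = s.rstrip()
    while pos < len(s):
        m = TOKEN.match(s, pos)
        if m is None:
            raise ValueError(f"unexpected input near position {pos}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens

class _Parser:
    def __init__(self, tokens):
        self.tokens = tokens
        self.i = 0

    def peek(self):
        return self.tokens[self.i] if self.i < len(self.tokens) else None

    def next(self):
        tok = self.peek()
        if tok is None:
            raise ValueError("unexpected end of type string")
        self.i += 1
        return tok

    def expect(self, tok):
        got = self.next()
        if got != tok:
            raise ValueError(f"expected {tok!r}, got {got!r}")

    def parse(self):
        tok = self.next()
        if tok in SCALARS:
            return Scalar(tok)
        if tok == "array":
            self.expect("<")
            element = self.parse()
            self.expect(">")
            return Array(element)
        if tok == "dict":
            self.expect("<")
            key = self.parse()
            self.expect(",")
            value = self.parse()
            self.expect(">")
            return Dict(key, value)
        if tok == "struct":
            self.expect("{")
            fields = []
            if self.peek() != "}":
                while True:
                    name = self.next()
                    if not name.isidentifier():
                        raise ValueError(f"bad field name {name!r}")
                    self.expect(":")
                    fields.append((name, self.parse()))
                    if self.peek() != ",":
                        break
                    self.next()  # consume the comma between fields
            self.expect("}")
            return Struct(tuple(fields))
        raise ValueError(f"unknown type {tok!r}")

def parse_type(s):
    """Parse a type string into the stand-in type objects above."""
    parser = _Parser(tokenize(s))
    result = parser.parse()
    if parser.peek() is not None:
        raise ValueError(f"trailing input starting at {parser.peek()!r}")
    return result

t = parse_type("struct{x:array<int32>, y: str, z: struct{z2: dict<str, int32>}}")
```

Functions that take type inputs could then accept either a type object or a string and call something like parse_type on the string form.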

#2

Another option for point 1 could be a naming convention where all types are capitalized and everything else is lower-case. So hl.Str is the type, hl.str is the toString method, hl.Array constructs a type, hl.array constructs a value, etc.
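
A minimal sketch of how that convention could look, using a throwaway namespace in place of the real hl module. ScalarType, ArrayType, to_str and make_array are hypothetical names invented for this illustration; only the hl.Str / hl.str and hl.Array / hl.array spellings come from the proposal.

```python
from dataclasses import dataclass
from types import SimpleNamespace

# Hypothetical type objects; not Hail's real classes.
@dataclass(frozen=True)
class ScalarType:
    name: str

    def __str__(self):
        return self.name

@dataclass(frozen=True)
class ArrayType:
    element: object

    def __str__(self):
        return f"array<{self.element}>"

def to_str(value):
    """Stand-in for the toString expr method (a value-level conversion)."""
    return str(value)

def make_array(values):
    """Stand-in for the array value constructor."""
    return list(values)

# Throwaway namespace playing the role of `hl`:
# capitalized names are types, lower-case names operate on values.
hl = SimpleNamespace(
    Int32=ScalarType("int32"),
    Str=ScalarType("str"),
    Array=ArrayType,
    str=to_str,
    array=make_array,
)

array_type = hl.Array(hl.Int32)   # a type
array_value = hl.array([1, 2, 3]) # a value
text = hl.str(5)                  # a conversion, not the Str type
```

With this split, the case of the first letter alone tells the reader whether a name denotes a type or a value-level function, so no __call__ overloading is needed.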


#3

I’m not sure I like the struct prefix. The curly braces are unambiguous in the type syntax already.


#4

I want to leave space for sum types, so if you axe struct, you need to propose an alternative to union { ... } for sum types.