Grand plan for Python type annotations
I’ve been experimenting with Python 3 type annotations in a few recent PRs, and I really like them. Here’s the PEP. With some work, moving our codebase to use type annotations will provide us with the following:
- Better IDE support for us and our users
- Automatic documentation
- Integrated runtime typechecking
- Static typechecking with mypy
I’ll address these point by point.
Better IDE support
This one is low-hanging fruit. If we add type annotations to our public interfaces, everybody wins. See here for an example. However, for the immediate future, we’ll be living with three kinds of types:
- documented types in docstrings
- typechecked types declared in decorators
- unchecked type annotations
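To make that concrete, here is a rough sketch of one function carrying all three at once. The `typecheck` decorator below is a simplified stand-in written for this example (it is not our real implementation), and `repeat` is a made-up function:

```python
import functools
import inspect


def typecheck(**expected):
    """Simplified stand-in for a decorator-based runtime checker (hypothetical)."""
    def decorate(func):
        sig = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments back to parameter names.
            bound = sig.bind(*args, **kwargs)
            for name, value in bound.arguments.items():
                if name in expected and not isinstance(value, expected[name]):
                    raise TypeError(
                        f"{name}: expected {expected[name].__name__}, "
                        f"got {type(value).__name__}"
                    )
            return func(*args, **kwargs)
        return wrapper
    return decorate


# Kind 2: types declared in the decorator, checked at call time.
@typecheck(a=str, b=int)
# Kind 3: annotations, visible to IDEs but not checked by anything here.
def repeat(a: str, b: int) -> str:
    """Repeat a string.

    Kind 1: types written out in the docstring.

    :param str a: text to repeat
    :param int b: number of repetitions
    """
    return a * b
```

The same two types are written three times, and nothing keeps the three copies in sync.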
This is obviously not ideal. So…
Automatic documentation + Integrated runtime typechecking
Some cursory experiments suggest that we can’t generate documentation for any method that has a decorator. Since pretty much every public Hail method is decorated, we can’t fold the “documented types” and the “typechecked types” into the type annotations one at a time. Instead, we have to go from a 3-type world to a 1-type world in one go.
Here’s one of a few Python 3 modules that do something similar: https://github.com/agronholm/typeguard
I’ve poked around there (and did a bit last year as well) for ideas. I don’t think it’ll be too hard to transform our current system to use a similar approach.
I imagine that instead of methods looking like:
@typecheck(a=str, b=int)
def foo(a, b):
    # body
they will look like:
def foo(a: str, b: int) -> Foo:
    typecheck()
    # body
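As a minimal sketch of how an in-body check could read the annotations, something like the following might work. It is an assumption-laden simplification: this `typecheck` takes the function and its `locals()` explicitly rather than discovering them from the call stack, and it only handles plain classes (generic annotations such as `Union[...]` are skipped).

```python
from typing import Any, Callable, Dict, get_type_hints


def typecheck(func: Callable, arguments: Dict[str, Any]) -> None:
    """Check `arguments` against the annotations on `func` (illustrative only)."""
    hints = get_type_hints(func)
    for name, expected in hints.items():
        if name == "return" or name not in arguments:
            continue
        if not isinstance(expected, type):
            # Union, List[int], etc. need more work; skipped in this sketch.
            continue
        value = arguments[name]
        if not isinstance(value, expected):
            raise TypeError(
                f"{func.__name__}: parameter {name!r} expected "
                f"{expected.__name__}, got {type(value).__name__}"
            )


def foo(a: str, b: int) -> str:
    # Called first, so locals() holds just the arguments.
    typecheck(foo, locals())
    return a * b
```

With this, `foo("ab", 2)` returns normally and `foo("ab", "2")` raises a `TypeError` naming the offending parameter. Because there is no wrapper, `foo` keeps its own signature and docstring.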
There’s one thing we lose here, which is the “transformer” typecheckers (which currently convert values to expr, and convert strs to types and ReferenceGenomes). This seems like an OK price to pay, and will probably make our code easier to maintain. A method that takes a ReferenceGenome will probably need to look something like:
from typing import Union

RG = Union[ReferenceGenome, str]
def locus(contig: str, position: int, rg: RG) -> Locus:
    rg = get_rg(rg)  # normalize to a ReferenceGenome
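For illustration, here is a minimal sketch of what `get_rg` might do. Everything in it is a stand-in invented for the example: the `ReferenceGenome` class with its name registry and the `Locus` tuple are not the real classes, and the lookup-by-name behavior is an assumption about how a str would be resolved.

```python
from typing import Dict, NamedTuple, Union


class ReferenceGenome:
    """Stand-in class: only carries a name and registers itself (hypothetical)."""
    _registry: Dict[str, "ReferenceGenome"] = {}

    def __init__(self, name: str):
        self.name = name
        ReferenceGenome._registry[name] = self


class Locus(NamedTuple):
    """Stand-in for the real Locus type."""
    contig: str
    position: int
    rg: ReferenceGenome


RG = Union[ReferenceGenome, str]


def get_rg(rg: RG) -> ReferenceGenome:
    """Do explicitly what the transformer typechecker used to do implicitly."""
    if isinstance(rg, ReferenceGenome):
        return rg
    if isinstance(rg, str):
        try:
            return ReferenceGenome._registry[rg]
        except KeyError:
            raise ValueError(f"unknown reference genome: {rg!r}") from None
    raise TypeError(f"expected ReferenceGenome or str, got {type(rg).__name__}")


def locus(contig: str, position: int, rg: RG) -> Locus:
    rg = get_rg(rg)
    return Locus(contig, position, rg)
```

The conversion becomes one visible line at the top of each method, at the cost of having to remember to write it.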
Static typechecking with mypy
We looked into this when our Python interface first came out, and concluded it wasn’t very mature. Patrick brought up the idea again before the holidays, and I told him the same thing, though I could have been wrong this time around. I don’t have much to say here, but it’s certainly worth looking into after we’ve sorted out the other items here.
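For a sense of what a static checker would add on top of plain annotations, here is a small made-up example (`repeat` is hypothetical, not one of our methods). Python itself never enforces the annotations, so the second call runs and quietly returns the wrong kind of value. A static checker such as mypy, run over the file without executing it, should flag that call as passing an int where a str is expected.

```python
def repeat(text: str, times: int) -> str:
    return text * times


# Matches the annotations.
ok = repeat("ab", 2)

# Violates the annotation on `text`, but nothing stops it at runtime:
# int * int is valid Python, so this silently produces an int.
unexpected = repeat(3, 2)
```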