Python typecheck design

I had an idea for an alternative to the current typecheck design where the typecheck decorator doesn’t inject frames into the call stack. The downside is that you need a manual call to typecheck() in the body.

You can get the type annotations via inspect, so you might be able to make this work with a subset of Python type annotations. A little googling turned up typeguard, which does exactly this: https://typeguard.readthedocs.io/en/latest/.
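
A minimal sketch of that idea, assuming CPython (for `inspect.currentframe`) and handling only annotations that are plain classes such as `int` or `str`. The `_signatures` registry and this annotation-reading `declare` are hypothetical names for illustration, not existing code:

```python
import inspect

# Hypothetical registry: code object -> {parameter name: expected class}.
_signatures = {}

def declare(f):
    """Record the plain-class annotations of f, keyed by its code object."""
    hints = {}
    for name, param in inspect.signature(f).parameters.items():
        ann = param.annotation
        # Skip unannotated parameters and anything that is not a plain class
        # (for example typing.List[int] or string annotations).
        if ann is not inspect.Parameter.empty and isinstance(ann, type):
            hints[name] = ann
    _signatures[f.__code__] = hints
    return f

def typecheck():
    """Check the calling function's arguments against its recorded annotations."""
    frame = inspect.currentframe().f_back  # the frame of the function being checked
    code = frame.f_code
    for name, expected in _signatures[code].items():
        value = frame.f_locals[name]
        if not isinstance(value, expected):
            raise TypeError(
                f'in call to {code.co_name}: invalid argument for {name}: '
                f'expected {expected.__name__}, got: {value!r} '
                f'of type {type(value).__name__}')

@declare
def f(a: int, *, k: str = 'foo'):
    typecheck()
    return a + len(k)
```

Here `declare` returns `f` unwrapped, so no extra frame sits between the caller and `f`; the only added frame is the `typecheck()` call itself when a check fails.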

This is the stack trace for the code below. With a little hacking, we could probably make the exception appear to be thrown from the call site by omitting the last two stack frames, although it wasn’t obvious how to do that. There’s Exception.with_traceback, but I couldn’t figure out how to cook up the right traceback object.

$ python3 tc.py
Traceback (most recent call last):
  File "tc.py", line 37, in <module>
    g2()
  File "tc.py", line 36, in g2
    f(5, k=5)
  File "tc.py", line 28, in f
    typecheck()
  File "tc.py", line 24, in typecheck
    raise TypeError(f'in call to {code.co_name}: invalid argument for {n}: expected {t.__name__}, got: {v} of type {type(v).__name__}')
__main__.TypeError: in call to f: invalid argument for k: expected str, got: 5 of type int

import sys
import inspect
import traceback

class TypeError(Exception):
    pass

code_signature = {}

def declare(**kwargs):
    def wrap(f):
        code_signature[f.__code__] = kwargs
        return f
    return wrap

def typecheck():
    frame = inspect.currentframe()
    frame = frame.f_back
    code = frame.f_code
    frame_locals = frame.f_locals
    for n, t in code_signature[code].items():
        v = frame_locals[n]
        if not isinstance(v, t):
            raise TypeError(f'in call to {code.co_name}: invalid argument for {n}: expected {t.__name__}, got: {v} of type {type(v).__name__}')

@declare(a=int, k=str)
def f(a, *, k='foo'):
    typecheck()
    return a + len(k)

def g1():
    f(5)
g1()

def g2():
    f(5, k=5)
g2()

I basically wrote this a couple of years ago, then removed it because we didn’t have the impetus to rewrite everything to use it.

Here’s the PR deleting it:

I have retreated from my earlier position that the services code shouldn’t have type annotations. I would adopt typeguard and Python type annotations in the services code. I think services code would just use the function-wrapping decorator.
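
For contrast with the in-body `typecheck()` call, here is a rough standard-library sketch of what a function-wrapping check looks like. It is not typeguard’s implementation; the `checked` decorator and `greet` function are hypothetical, and only plain-class annotations are checked:

```python
import functools
import inspect

def checked(func):
    """Wrap func so annotated arguments are checked before each call."""
    sig = inspect.signature(func)
    skip_kinds = (inspect.Parameter.VAR_POSITIONAL, inspect.Parameter.VAR_KEYWORD)

    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        bound = sig.bind(*args, **kwargs)
        bound.apply_defaults()
        for name, value in bound.arguments.items():
            param = sig.parameters[name]
            expected = param.annotation
            # Only check ordinary parameters annotated with a plain class.
            if param.kind in skip_kinds:
                continue
            if expected is inspect.Parameter.empty or not isinstance(expected, type):
                continue
            if not isinstance(value, expected):
                raise TypeError(
                    f'in call to {func.__name__}: invalid argument for {name}: '
                    f'expected {expected.__name__}, got: {value!r} '
                    f'of type {type(value).__name__}')
        return func(*args, **kwargs)

    return wrapper

@checked
def greet(name: str, times: int = 1) -> str:
    return ', '.join([f'hello {name}'] * times)
```

The function body needs no manual call, at the cost of the `wrapper` frame appearing in every stack trace that passes through a checked function.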

I’m game for a hackathon day where we convert the entire code-base.

I feel like this is the way to go. There are tons of changes in flight in the codebase, and this can coexist with typecheck(1). Any reason we don’t push this forward incrementally?

I think we should also investigate mypy or other Python static checking tools.

I also think we should start linting the Hail Python code. This is maybe the top priority. So:

  1. Lint the Hail Python code.
  2. Add mypy.
  3. Switch to typecheck2/type annotations.

Sound right?

Shall I PR typecheck2 back?

Enabling linters would be great, but someone has to invest a lot in fixing the issues. There are over 6000 flake8 issues and 14k pylint issues (this is using our existing pylintrc and batch’s setup.cfg). This is why I gave up on enabling them back when I enabled linting for batch.

I can’t find the original motivation for using a custom typecheck rather than typeguard. Can y’all remind me?

I’m definitely game for getting mypy working. Scorecard would be a good test project; it’s pretty small.