Here’s a proposed interface and spec for how something like split_multi_hts
should be checked in 0.2:
    @handle_py4j
    @typecheck(...)
    def split_multi_hts(dataset, ...args):
        """docstring"""
        dataset._require_fields('split_multi_hts',
                                GT=TCall(),
                                AD=TArray(TInt32()),
                                DP=TInt32(),
                                GQ=TInt32(),
                                PL=TArray(TInt32()))
        ...rest of the method...
If you pass a dataset without the GT field, you’d get an error like:
    ValueError: 'split_multi_hts': required field 'GT' is missing
    Expected a dataset with entry fields:
    'GT': Call
    'AD': Array[Int32]
    'DP': Int32
    'GQ': Int32
    'PL': Array[Int32]
If you pass a dataset with a GT field, but it’s row-indexed, you’d get:
    ValueError: 'split_multi_hts': field 'GT': expected entry field, found row field
    Expected a dataset with entry fields:
    'GT': Call
    'AD': Array[Int32]
    'DP': Int32
    'GQ': Int32
    'PL': Array[Int32]
If you pass a dataset with a GT field of type TInt32, you’d get:
    ValueError: 'split_multi_hts': field 'GT': expected type Call, found type Int32
    Expected a dataset with entry fields:
    'GT': Call
    'AD': Array[Int32]
    'DP': Int32
    'GQ': Int32
    'PL': Array[Int32]
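To make the three cases concrete, here is a rough sketch of how such a check could be structured. It is not the actual implementation: `FakeDataset` and the standalone `require_fields` function are hypothetical stand-ins, types are represented as plain strings rather than `TCall()`, `TInt32()`, and so on, and the check order (missing, then indexing, then type) is an assumption.

```python
class FakeDataset:
    """Hypothetical stand-in for a dataset: field name -> (indexing, type name)."""

    def __init__(self, fields):
        self.fields = dict(fields)


def require_fields(dataset, caller, **required):
    """Raise ValueError unless every required field is an entry field of the right type."""
    # The same "Expected ..." footer is appended to every error message.
    expected = "\nExpected a dataset with entry fields:" + "".join(
        "\n'{}': {}".format(name, typ) for name, typ in required.items())
    for name, typ in required.items():
        # Case 1: the field does not exist at all.
        if name not in dataset.fields:
            raise ValueError(
                "'{}': required field '{}' is missing{}".format(caller, name, expected))
        indexing, found = dataset.fields[name]
        # Case 2: the field exists but is not entry-indexed.
        if indexing != 'entry':
            raise ValueError(
                "'{}': field '{}': expected entry field, found {} field{}".format(
                    caller, name, indexing, expected))
        # Case 3: the field is entry-indexed but has the wrong type.
        if found != typ:
            raise ValueError(
                "'{}': field '{}': expected type {}, found type {}{}".format(
                    caller, name, typ, found, expected))


# Usage: GT is present and entry-indexed, but has the wrong type.
ds = FakeDataset({'GT': ('entry', 'Int32'), 'DP': ('entry', 'Int32')})
try:
    require_fields(ds, 'split_multi_hts', GT='Call', DP='Int32')
except ValueError as err:
    message = str(err)
    print(message)
```

Passing the caller name as the first argument keeps the message prefix consistent across methods, and building the "Expected ..." footer once means every failure mode reports the full requirement, not just the first field that failed.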