Pavlos brought up an interesting question on the user forum.

The problem as I see it: given a list of phenotypes *Y*, each of which has some number *N* of samples with a defined value (*N* can differ from one phenotype to the next), compute a list of lists of phenotypes, that is, a grouping, such that the number of samples included in each group, *G*, falls short of *N* by no more than a small relative error term *E* for any phenotype in the group:

*G/N > 1 - E*
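
To make the constraint concrete, here is a minimal sketch of a check for a single group. It assumes that a sample is included in a group only if every phenotype in that group is defined for it, so *G* is the size of the intersection. The names `defined`, `group_is_valid` and `max_loss` are hypothetical, introduced only for illustration.

```python
def group_is_valid(defined, group, max_loss):
    """Check G/N > 1 - E for every phenotype in one group.

    defined  -- dict mapping phenotype name -> set of sample ids with a
                defined value (hypothetical input format)
    group    -- non-empty list of phenotype names
    max_loss -- the error term E
    """
    # Assumption: a sample counts toward the group only if all of the
    # group's phenotypes are defined for it.
    included = set.intersection(*(defined[p] for p in group))
    g = len(included)
    # G/N > 1 - E, written without division so an empty phenotype
    # simply fails the check instead of raising ZeroDivisionError.
    return all(g > (1 - max_loss) * len(defined[p]) for p in group)


defined = {
    "a": set(range(1, 11)),        # samples 1..10
    "b": set(range(1, 10)) | {11}, # samples 1..9 and 11
    "c": {1, 2, 3},
}
print(group_is_valid(defined, ["a", "b"], 0.2))   # G = 9, N = 10 for both
print(group_is_valid(defined, ["a", "c"], 0.2))   # G = 3, but N = 10 for "a"
```

With *E* = 0.2, grouping `a` with `b` keeps 9 of 10 samples for each phenotype, while grouping `a` with `c` would discard most of the samples of `a`.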

We want to minimize the number of groups in order to take best advantage of the BLAS3 (matrix-matrix) optimizations in `linear_regression_rows`.
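
One possible starting point is a greedy first-fit heuristic, sketched below. This is not a known solution to the problem and it does not guarantee the minimum number of groups; it only shows the shape of an approach. It assumes that a sample is included in a group only if every phenotype in the group is defined for it, and the names `greedy_groups`, `defined` and `max_loss` are hypothetical.

```python
def greedy_groups(defined, max_loss):
    """Greedy first-fit grouping (a heuristic, not guaranteed minimal).

    defined  -- dict mapping phenotype name -> set of sample ids with a
                defined value (hypothetical input format)
    max_loss -- the error term E
    """
    # Visit phenotypes from most to fewest defined samples, so that
    # phenotypes of similar size tend to meet each other first.
    order = sorted(defined, key=lambda p: (-len(defined[p]), str(p)))
    groups = []    # phenotype names in each group
    included = []  # samples kept by each group (assumed: the intersection)
    for p in order:
        for i, members in enumerate(groups):
            candidate = included[i] & defined[p]
            g = len(candidate)
            # Adding p must keep G/N > 1 - E for p and all current members.
            if all(g > (1 - max_loss) * len(defined[q]) for q in members + [p]):
                members.append(p)
                included[i] = candidate
                break
        else:
            # No existing group can take p, so it starts a new one.
            groups.append([p])
            included.append(set(defined[p]))
    return groups


defined = {
    "a": set(range(1, 11)),        # samples 1..10
    "b": set(range(1, 10)) | {11}, # samples 1..9 and 11
    "c": {1, 2, 3},
    "d": {1, 2, 3},
}
print(greedy_groups(defined, 0.2))
```

Because each phenotype joins the first group that accepts it, the result depends on the visiting order, and a different order or an exact search could produce fewer groups.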