GPU Support for Hail

Based on a chat with @tpoterba on Gitter, I learned that @jbloom is working on linalg so that Hail can leverage GPUs.

As @tpoterba suggested, I am creating this thread to keep track of linalg development on this topic.

linalg docs stub:

@jbloom, I would like to start contributing to enabling Hail to leverage GPUs.
Is there a design doc, or can you share some ideas on linalg, so that I can work alongside you on this?

I have worked with Spark in Scala and have some experience with Python. I am ready to learn whatever is required to contribute to linalg. Could you please share the topics I need to get a handle on in order to start taking up feature development?

Is there any update on this thread?
Does Hail support GPUs for computation?

No GPU support yet, no.

I have seen the Spark 3 release, which supports GPU computation. Do you think a Spark 3 backend could run a Hail task on a GPU?

No. That’s a GPU code generator for Spark-SQL, and Hail doesn’t use any Spark-SQL interfaces, only the RDD interfaces.

I came across the spark-rapids library, which enables GPU usage for the Apache Spark 3 backend.

It requires enabling the following plugin:

${SPARK_HOME}/bin/spark-shell --jars 'rapids-4-spark_2.12-21.08.0.jar,cudf-21.08.2-cuda11.jar' \
--conf spark.plugins=com.nvidia.spark.SQLPlugin \
--conf spark.rapids.sql.incompatibleOps.enabled=true

So, as you said, Hail doesn't use any Spark-SQL interfaces, and therefore this library is not useful in my case.
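
For anyone who wants to script the spark-rapids launch command quoted above, here is a minimal sketch in Python that assembles the same flags as an argument list. The jar names and `--conf` keys are the ones from the command; the helper name `build_spark_shell_command`, the `/opt/spark` install path, and the choice of `spark-shell` as the launcher are assumptions made for illustration. It only builds the command and does not make Hail use a GPU.

```python
import shlex


def build_spark_shell_command(spark_home, jars, conf):
    """Assemble a spark-shell invocation as an argument list (hypothetical helper)."""
    # --jars takes a single comma-separated list of jar paths.
    cmd = [f"{spark_home}/bin/spark-shell", "--jars", ",".join(jars)]
    # Each Spark property is passed as its own --conf key=value pair.
    for key, value in conf.items():
        cmd += ["--conf", f"{key}={value}"]
    return cmd


# Jar names and properties copied from the command in this thread.
RAPIDS_JARS = ["rapids-4-spark_2.12-21.08.0.jar", "cudf-21.08.2-cuda11.jar"]
RAPIDS_CONF = {
    "spark.plugins": "com.nvidia.spark.SQLPlugin",
    "spark.rapids.sql.incompatibleOps.enabled": "true",
}

# "/opt/spark" is an assumed SPARK_HOME, not something stated in the thread.
command = build_spark_shell_command("/opt/spark", RAPIDS_JARS, RAPIDS_CONF)
print(shlex.join(command))
```

Keeping the command as a list means it can be handed straight to `subprocess.run(command)` without worrying about shell quoting.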