I changed the Google Cloud Storage connector's write buffer size when starting up an experimental Dataproc cluster with
--properties 'core:fs.gs.io.buffersize.write=1048576'
(the core: prefix tells Dataproc to put the property into core-site.xml, so it surfaces in the Hadoop configuration). I found these commands (adapted from a Stack Overflow post) useful for verifying that such changes actually took effect. You'll need access to the Hail context, hc:
from hail.utils.java import Env
To print the Spark config:
# dump the full Spark config as (key, value) pairs
print(Env.hc().sc.getConf().getAll())
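To spot-check a single property instead of scanning the whole dump, PySpark's SparkConf.get works too (the key below is just an illustration):
print(Env.hc().sc.getConf().get('spark.master'))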
To print the Hadoop config:
hadoopConf = {}
# walk the underlying Java Configuration object via py4j
iterator = Env.hc().sc._jsc.hadoopConfiguration().iterator()
while iterator.hasNext():
    prop = iterator.next()
    hadoopConf[prop.getKey()] = prop.getValue()
for item in sorted(hadoopConf.items()):
    print(item)
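Since only one property changed here, it's quicker to look it up directly; if the core: property was applied, this should print the value set at cluster creation (1048576 above):
print(Env.hc().sc._jsc.hadoopConfiguration().get('fs.gs.io.buffersize.write'))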
To print the environment variables:
import os
for item in sorted(os.environ.items()):
    print(item)
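If you want the driver JVM's actual system properties rather than environment variables, here's a sketch that goes through PySpark's private _jvm py4j gateway (an internal attribute, so it may change across versions):
# read java.lang.System properties from the driver JVM via py4j
props = Env.hc().sc._jvm.java.lang.System.getProperties()
for key in sorted(props.stringPropertyNames()):
    print(key, '=', props.getProperty(key))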
GCP defaults are here: https://github.com/GoogleCloudPlatform/bigdata-interop/blob/master/gcs/conf/gcs-core-default.xml