See "Auto-scaling scikit-learn with Apache Spark": https://databricks.com/blog/2016/02/08/auto-scaling-scikit-learn-with-apache-spark.html