Apache DataFu™

Apache DataFu-Spark 1.8.0 Released

Eyal Allweil

I'd like to announce the release of Apache DataFu-Spark 1.8.0.

Many thanks to Arpit Bhardwaj and Shaked Aharon, who worked on this version.


  • The dedupWithCombiner method now accepts a list of columns for its order-by and group-by parameters (DATAFU-171)
  • The Scala-Python bridge now uses a secure gateway (DATAFU-167)
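
To illustrate what a list of group-by and order-by columns means for deduplication, here is a minimal plain-Python sketch. It is not DataFu's API and does not use Spark. The function name dedup_by_columns, its parameters, and the sample rows are hypothetical, and it assumes that deduplication keeps one row per group, chosen by comparing the order-by columns from left to right.

```python
from typing import Any, Dict, Iterable, List, Sequence, Tuple

Row = Dict[str, Any]


def dedup_by_columns(
    rows: Iterable[Row],
    group_cols: Sequence[str],
    order_cols: Sequence[str],
    desc: bool = True,
) -> List[Row]:
    """Keep one row per distinct combination of group_cols.

    Hypothetical helper, not part of DataFu. Within each group the row
    with the largest order_cols tuple is kept (smallest if desc=False).
    """
    best: Dict[Tuple[Any, ...], Tuple[Tuple[Any, ...], Row]] = {}
    for row in rows:
        key = tuple(row[c] for c in group_cols)    # composite group key
        rank = tuple(row[c] for c in order_cols)   # composite ordering key
        if key not in best:
            best[key] = (rank, row)
            continue
        current_rank, _ = best[key]
        is_better = rank > current_rank if desc else rank < current_rank
        if is_better:
            best[key] = (rank, row)
    # Groups come back in the order they were first seen.
    return [row for _, row in best.values()]


# Made-up sample data for illustration only.
rows = [
    {"user": "a", "country": "US", "day": 1, "version": 2, "value": "x"},
    {"user": "a", "country": "US", "day": 2, "version": 1, "value": "y"},
    {"user": "a", "country": "US", "day": 2, "version": 3, "value": "z"},
    {"user": "b", "country": "IL", "day": 1, "version": 1, "value": "w"},
]

latest = dedup_by_columns(rows, ["user", "country"], ["day", "version"])
earliest = dedup_by_columns(rows, ["user", "country"], ["day", "version"], desc=False)
```

With two order-by columns, ties on day are broken by version, which a single order-by column cannot express. For the actual method signature, consult the DataFu-Spark documentation.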

Breaking changes

  • Spark 2.2.0, 2.2.1, and 2.3.0 are no longer supported

The source release can be obtained from:


Artifacts for DataFu are published in Apache's Maven Repository:


Please visit the Download page for instructions on building from source or retrieving the artifacts in your build system.