Size of fat jars #340
Comments
Flink 1.2.0 bundles breeze 0.12. Spark 2.1.0 bundles breeze 0.12 as well, but with some exclusions. I am not sure whether those are available in the classpath when submitting a job against a running cluster.
I can access Breeze in the Spark REPL, but not in the Flink REPL. I have a few questions:
Breeze is not in the Flink REPL because it's not a top-level dependency in Flink (it's only listed in Flink's ML library).
So MLlib and Flink-ML have a dependency on Breeze, and Breeze has a dependency on Shapeless. The newest version of Breeze depends on the newest version of Shapeless, but MLlib and Flink-ML reference older versions of Breeze.
This is a general discussion question regarding the size of the fat jars produced by the `emma-spark-examples` and `emma-flink-examples` modules.
Running
in the project root shows the following output
The `emma-flink-examples` and `emma-spark-examples` jars are ~65M each, which is also indicative of the expected size of any client jar binding `emma-language` and one of `emma-flink` or `emma-spark` in the future.
A closer look in `emma-spark-examples` reveals the root causes (output is similar for the other one).
The list looks as follows.
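A breakdown of this kind can be reproduced with a short script. The following is a minimal sketch, not the command used for the listing above: it treats the fat jar as a zip archive and sums the compressed size of its entries per package prefix. The function names `package_sizes` and `report`, the `depth` and `top` parameters, and the jar path in the usage note are hypothetical.

```python
import zipfile
from collections import Counter


def package_sizes(jar_path, depth=2):
    """Sum compressed entry sizes per package prefix of the given depth."""
    sizes = Counter()
    with zipfile.ZipFile(jar_path) as jar:
        for info in jar.infolist():
            if info.is_dir():
                continue
            # Drop the file name and keep only the leading package segments.
            parts = info.filename.split("/")[:-1]
            prefix = "/".join(parts[:depth]) or "(root)"
            sizes[prefix] += info.compress_size
    return sizes


def report(jar_path, depth=2, top=10):
    """Print the `top` largest package prefixes with their size in MB."""
    for prefix, size in package_sizes(jar_path, depth).most_common(top):
        print(f"{size / 1e6:8.2f} MB  {prefix}")
```

Calling `report("path/to/emma-spark-examples.jar")` (placeholder path) prints the largest prefixes first, which should make heavy transitive dependencies such as `breeze` or `scalaz` easy to spot.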
It might be better to rely on the breeze version shipped with the dataflow engine rather than bundling our own. @ParkL could you check the versions bundled with Spark 2.1.0 and Flink 1.2.1?
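One way to check what an engine distribution already ships is to scan its jar directory for archives that contain classes under a given package. The sketch below assumes the distribution keeps its jars in a single directory (for example `jars/` in a Spark 2.x install or `lib/` in a Flink install; both paths are assumptions here), and the helper name `jars_containing` is hypothetical. It only reports presence; the version has to be read from the jar file name or the distribution's dependency listing.

```python
import zipfile
from pathlib import Path


def jars_containing(lib_dir, prefix="breeze/"):
    """Return names of jars in lib_dir that hold entries starting with prefix."""
    hits = []
    for jar_path in sorted(Path(lib_dir).glob("*.jar")):
        with zipfile.ZipFile(jar_path) as jar:
            if any(name.startswith(prefix) for name in jar.namelist()):
                hits.append(jar_path.name)
    return hits
```

If a library shows up here, marking it as a provided dependency in the client build rather than bundling it is one option for shrinking the fat jar, subject to the classpath caveats raised in the comments.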
I am not sure what to do with `scalaz`. It seems that we're only using it due to `quiver`, and I am not aware of any alternative which has a smaller footprint or, say, relies on `cats`.
I am open to suggestions.