Build spark-py image for Kubernetes
```bash
# Retrieve the code
git clone https://github.com/astrolabsoftware/k8s-spark-py
cd k8s-spark-py
```
Optionally, edit conf.sh to fine-tune the configuration.
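As an illustration of what such fine-tuning might look like, the snippet below shows the kind of variables a configuration script like this typically exposes; the variable names and values are assumptions, not the actual contents of conf.sh.

```bash
# Hypothetical example only: the real variable names in conf.sh may differ.
SPARK_VERSION="3.1.3"                                          # Spark release fetched by prereq-install.sh
REGISTRY="gitlab-registry.in2p3.fr"                            # target container registry
IMAGE_TAG="${REGISTRY}/<namespace>/spark-py:${SPARK_VERSION}"  # image name used by the build/push scripts
```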
```bash
# Download and unzip Spark binaries
./prereq-install.sh
# Build the container images
./build.sh
```
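After the build, a quick sanity check is to list the local images; the names reported depend on the configuration, so the pattern below is only an assumption based on the upstream Spark image naming (spark and spark-py).

```bash
# Verify that the build produced local images (names depend on conf.sh settings).
docker images | grep -E 'spark(-py)?'
```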
```bash
# Log in to the IN2P3 registry
docker login gitlab-registry.in2p3.fr
# Push the images to the registry
./push-image.sh
```
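In a non-interactive context (for example a script), docker login can read the credentials from stdin instead of prompting; the username and token file below are placeholders for your own IN2P3 credentials.

```bash
# Non-interactive login using a GitLab deploy or personal access token.
# <username> and the token file path are placeholders.
cat ~/.secrets/in2p3-registry-token | docker login -u <username> --password-stdin gitlab-registry.in2p3.fr
```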
The goal is to stay as close as possible to the standard build procedure documented here: https://spark.apache.org/docs/latest/running-on-kubernetes.html#docker-images
However, it is possible to customize the build by adding files inside the custom/ directory. These files will be copied to SPARK_HOME just before building the container images. Currently, the avro, hbase and kafka Java libraries are added in order to support fink-broker.
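As a sketch of this mechanism, an extra library can be dropped into custom/ before rebuilding; the sub-directory layout (jars/) and the jar name below are assumptions for illustration, not part of the repository.

```bash
# Hypothetical example: ship an additional jar inside the image.
# The jars/ sub-directory and the jar name are placeholders.
mkdir -p custom/jars
cp /path/to/extra-connector.jar custom/jars/
./build.sh   # rebuild so the file is copied into SPARK_HOME and baked into the image
```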
The CI automatically builds and pushes the spark-py container image to the IN2P3 registry for each commit to the git repository.
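The resulting image can then be pulled directly from the registry; the repository path and tag below are placeholders, as the exact names depend on the registry namespace and the CI configuration.

```bash
# Pull a CI-built image (repository path and tag are placeholders).
docker pull gitlab-registry.in2p3.fr/<namespace>/k8s-spark-py/spark-py:<tag>
```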