Daf-Kylo for PDND (Piattaforma Digitale Nazionale Dati), previously DAF (Data & Analytics Framework)
To install and use this repo, you may deploy all the components onto a Cloudera shared edge node.
Daf-Kylo is a data lake platform built on Apache Hadoop and Spark. Daf-Kylo provides a data lake solution enabling self-service data ingest, data preparation, and data discovery. Kylo integrates best practices around metadata capture, security, and data quality. Apache NiFi provides a flexible data processing framework for building batch or streaming pipeline templates, and for enabling self-service features.
PDND stands for "Piattaforma Digitale Nazionale Dati" (Italian Digital Data Platform), previously known as Data & Analytics Framework (DAF).
In brief, it is an attempt to establish a central Chief Data Officer (CDO) for the Government and Public Administration. Its main goal is to promote data exchange among Italian Public Administrations (PAs), to support the diffusion of open data, and to enable data-driven policies. You can find more about the PDND on the official Digital Transformation Team website.
The Daf-Kylo repository contains the set of components used to deploy and manage the PDND data ingestion process.
Folder /docker contains all the Docker files for building the images of the daf-kylo components.
Folder /kubernetes contains all the YAML files for deploying pods and services on Kubernetes.
Folder /kylo contains the Kylo material, such as the API documentation for the integration with the PDND Portal, Kylo templates, and Kylo patches.
Folder /nifi contains all the NiFi templates and customized processors used in the ingestion process.
Folder /scripts contains utility scripts for managing pods, logs, and other Kubernetes resources.
Project dependencies can be found by clicking on this link.
The Daf-Kylo project depends on the following components.
- ActiveMQ version 5.15.1, available here;
- Elasticsearch version 5.6.4, available here;
- MariaDB version 10.3, available here;
- Spark version 2.2.0, available here;
- Kylo-Services version 9.1.0, available here;
- Kylo-UI version 9.1.0, available here;
- NiFi version 1.7.0, available here.
Installing Daf-Kylo on Unix-like systems requires a package manager such as Homebrew. You can download and install Homebrew by following the instructions on the official Homebrew website. Once Homebrew is installed, a few more steps complete the setup. The first step is to install Homebrew cask. Open a terminal and type the following command:
brew tap caskroom/cask
Then, update all formulas and Homebrew itself by typing
brew update
Last, install kubectl (the Kubernetes command-line client), RPM, make and Git by typing
brew install kubectl rpm make git
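Before moving on, it can help to confirm that the tools installed above are actually on the PATH. The following is a minimal sketch; the function name check_tools is hypothetical and not part of this repo, and the binary names are assumed to match the formula names.

```shell
# check_tools NAME...
# Reports which of the given commands are missing from PATH.
# Returns 0 when all are present, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if command -v "$tool" >/dev/null 2>&1; then
            echo "ok: $tool"
        else
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Example (assumes the binaries carry the same names as the formulas):
# check_tools kubectl rpm make git
```

Missing tools are reported on standard error, so the function can be used in a condition such as `check_tools kubectl rpm make git || exit 1`.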
Building most of the Docker images requires the Kylo code, both source and compiled. To get it, download and compile it with the Makefile by typing the following commands (production and test environments):
make -f Makefile daf-kylo
make -f Makefile build-kylo
make -f Makefile.test daf-kylo
make -f Makefile.test build-kylo
docker login nexus.daf.teamdigitale.it
Once this is completed, you can build every image (production and test environments) by typing the following commands:
make activemq
make mysql
make kylo-services
make kylo-ui
make nifi
make -f Makefile.test activemq
make -f Makefile.test mysql
make -f Makefile.test kylo-services
make -f Makefile.test kylo-ui
make -f Makefile.test nifi
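The per-component commands above can also be run in a loop. This is only a sketch: build_all is a hypothetical helper, not something shipped in this repo, and the environment names prod and test are borrowed from the scripts used later in this document. Setting DRY_RUN=1 prints the make commands instead of running them.

```shell
# Components taken from the make targets listed above.
COMPONENTS="activemq mysql kylo-services kylo-ui nifi"

# build_all ENV
# Builds every image for ENV ("prod" or "test").
# With DRY_RUN=1 the make commands are printed instead of executed.
build_all() {
    case "$1" in
        prod) cmd="make" ;;
        test) cmd="make -f Makefile.test" ;;
        *) echo "usage: build_all prod|test" >&2; return 2 ;;
    esac
    for component in $COMPONENTS; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$cmd $component"
        else
            # $cmd is left unquoted on purpose so it splits into words.
            $cmd "$component" || return 1
        fi
    done
}

# Example:
# DRY_RUN=1 build_all test
```

The loop stops at the first failing build, so a broken image does not go unnoticed among the later ones.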
Before pushing, make sure the Docker client has been configured and the image has been tagged correctly. Instructions can be found in:
TeamDigitale onboarding 'Setup Docker'
TeamDigitale onboarding 'Push Docker Image'
Once configuration and tagging are done, push the image by typing:
docker push [repositoryurl:repositoryport/artifact:version]
for instance, using the nexus_push.sh script:
./nexus_push.sh prod [namespace]
./nexus_push.sh test [namespace]
The [namespace] argument is optional.
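For a manual push, the repositoryurl:repositoryport/artifact:version form shown above can be assembled with a small helper. This is a sketch under stated assumptions: image_ref is a hypothetical function, and the port, artifact name and version in the commented example are placeholders rather than values taken from this repo.

```shell
# image_ref HOST PORT ARTIFACT VERSION
# Prints a reference in the repositoryurl:repositoryport/artifact:version form.
image_ref() {
    if [ "$#" -ne 4 ]; then
        echo "usage: image_ref HOST PORT ARTIFACT VERSION" >&2
        return 2
    fi
    printf '%s:%s/%s:%s\n' "$1" "$2" "$3" "$4"
}

# Example with placeholder port and version (assumptions, not real values):
# ref=$(image_ref nexus.daf.teamdigitale.it 5000 kylo-ui 1.0.0)
# docker tag kylo-ui:latest "$ref"
# docker push "$ref"
```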
Before deploying, make sure kubectl has been configured. Instructions can be found in TeamDigitale onboarding, 'Setup Kubernetes'.
Once configuration is done, deploy to the Kubernetes cluster by typing ./playbook.sh [environment] [component].
For example, ./playbook.sh prod activemq [namespace].
To delete a pod, type ./cleanup.sh [environment] [component].
For example,
./cleanup.sh prod activemq [namespace]
For the test environment, deploy with, for instance:
./playbook.sh test activemq [namespace]
or delete with ./cleanup.sh [environment] [component], for instance:
./cleanup.sh test activemq [namespace]
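When a component has to be removed and deployed again, the two scripts above can be chained. The sketch below assumes it is run from the directory that contains cleanup.sh and playbook.sh, and that both accept the same arguments as in the examples above. The redeploy function itself is hypothetical. With DRY_RUN=1 it only prints the commands.

```shell
# redeploy ENV COMPONENT [NAMESPACE]
# Runs cleanup.sh and then playbook.sh with the same arguments.
# With DRY_RUN=1 the commands are printed instead of executed.
redeploy() {
    if [ "$#" -lt 2 ] || [ "$#" -gt 3 ]; then
        echo "usage: redeploy ENV COMPONENT [NAMESPACE]" >&2
        return 2
    fi
    for script in ./cleanup.sh ./playbook.sh; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            echo "$script $*"
        else
            "$script" "$@" || return 1
        fi
    done
}

# Example:
# DRY_RUN=1 redeploy test activemq
```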
By default, the kylo database is not created in the MySQL container, so you have to create it yourself.
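One possible way to create the database is to run the mysql client inside the database pod through kubectl exec. The sketch below rests on several assumptions: the pod name is whatever your cluster assigned (mysql-0 in the example is a placeholder), the database is named kylo, and the root account is used. The create_kylo_db function is hypothetical. With DRY_RUN=1 the command is printed rather than executed.

```shell
# create_kylo_db POD [DATABASE]
# Creates DATABASE (default: kylo, an assumed name) inside the given pod.
# With DRY_RUN=1 the kubectl command is printed instead of executed.
create_kylo_db() {
    if [ "$#" -lt 1 ]; then
        echo "usage: create_kylo_db POD [DATABASE]" >&2
        return 2
    fi
    pod=$1
    db=${2:-kylo}
    sql="CREATE DATABASE IF NOT EXISTS $db;"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "kubectl exec -it $pod -- mysql -u root -p -e \"$sql\""
    else
        # -p makes the mysql client prompt for the root password.
        kubectl exec -it "$pod" -- mysql -u root -p -e "$sql"
    fi
}

# Example (mysql-0 is a placeholder pod name):
# DRY_RUN=1 create_kylo_db mysql-0
```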
To configure LDAP authentication:
Edit the config maps kylo-services.yaml and kylo-ui.yaml as follows:
config-map/kylo-services.yaml
should contain, for the production environment:
security.auth.ldap.server.uri=ldap://idm.daf.gov.it:389/cn=users,cn=accounts,dc=daf,dc=gov,dc=it
security.auth.ldap.server.authDn=uid=admin,cn=users,cn=accounts,dc=daf,dc=gov,dc=it
security.auth.ldap.server.password=xxxxxx
and, for the test environment:
security.auth.ldap.server.uri=ldap://idm.teamdigitale.test:389/cn=users,cn=accounts,dc=daf,dc=gov,dc=it
security.auth.ldap.server.authDn=uid=application,cn=users,cn=accounts,dc=daf,dc=gov,dc=it
security.auth.ldap.server.password=xxxxxx
After these two changes redeploy as follows:
kubectl delete -f config-map/kylo-services.yaml
kubectl delete -f config-map/kylo-ui.yaml
kubectl apply -f config-map/kylo-services.yaml
kubectl apply -f config-map/kylo-ui.yaml
The example above does not take the [namespace] into account.
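If the config maps live in a specific namespace, the same four commands can be given kubectl's -n flag. The following is a sketch: reapply_config_maps is a hypothetical helper, and it assumes it is run from the directory that contains config-map/. With DRY_RUN=1 the commands are printed instead of executed.

```shell
# reapply_config_maps [NAMESPACE]
# Deletes and re-applies the two Kylo config maps, optionally in NAMESPACE.
# With DRY_RUN=1 the kubectl commands are printed instead of executed.
reapply_config_maps() {
    namespace=${1:-}
    for action in delete apply; do
        for file in config-map/kylo-services.yaml config-map/kylo-ui.yaml; do
            cmd="kubectl $action -f $file"
            if [ -n "$namespace" ]; then
                cmd="$cmd -n $namespace"
            fi
            if [ "${DRY_RUN:-0}" = "1" ]; then
                echo "$cmd"
            else
                # $cmd is left unquoted on purpose so it splits into words.
                $cmd || return 1
            fi
        done
    done
}

# Example (daf is a placeholder namespace):
# DRY_RUN=1 reapply_config_maps daf
```

The order matches the commands above: both config maps are deleted first, then both are applied.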
- Go to idm.teamdigitale.test and create a user such as dladmin with the password "password"
After this, you can log in to the Kylo UI.
As pointed out above, once this is done the LDAP login is replaced by the default login, which lets you log in with the default user dladmin/thinkbig. This step is needed to create users with the same names as those that exist in LDAP, so that permissions can be granted to them (the same functionality for groups is currently being fixed by R&D). Once the users (or groups) have been created, change config-map/kylo-services.yaml and config-map/kylo-ui.yaml back and redeploy again. LDAP is now good to go.
When Kylo starts for the first time, it needs Liquibase to create the Kylo DB. Make sure the following is set in application.properties in the kylo-services config map:
liquibase.enabled=true
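A quick way to confirm the setting before deploying is to search the config map file for it. This is a minimal sketch; liquibase_enabled is a hypothetical helper, and the pattern tolerates leading whitespace on the assumption that the properties are embedded as an indented block inside the YAML file.

```shell
# liquibase_enabled FILE
# Succeeds when FILE contains a line setting liquibase.enabled to true.
# Leading and trailing whitespace and spaces around "=" are tolerated.
liquibase_enabled() {
    grep -Eq '^[[:space:]]*liquibase\.enabled[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}

# Example:
# liquibase_enabled config-map/kylo-services.yaml || echo "liquibase is not enabled" >&2
```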
Here you can find additional information about custom processors created for the DAF.
Contributions are welcome. Feel free to open issues and submit a pull request at any time, but please read our handbook first.
Copyright (c) 2019 Presidenza del Consiglio dei Ministri
This program is free software: you can redistribute it and/or modify it under the terms of the GNU Affero General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Affero General Public License for more details.
You should have received a copy of the GNU Affero General Public License along with this program. If not, see https://www.gnu.org/licenses/.