# Distributed Deep Learning

This repo contains the code for an end-to-end distributed deep learning pipeline.

The pipeline runs in seven steps (sketches of each step follow the list):

  1. Real-time streaming or batch data is captured from source databases with Debezium (change data capture).
  2. The captured changes are published to Apache Kafka topics through Kafka Connect connectors (see the connector registration sketch below).
  3. Apache Flink performs ETL on the resulting streams (see the PyFlink sketch below).
  4. Predictions on the streaming/batch data are obtained from models deployed with TensorFlow Serving on Docker (see the REST call below).
  5. Frequently requested predictions are cached with RocksDB (see the caching sketch below).
  6. Once the required predictions are made, all the data is pushed into Apache Druid, where further processing takes place.
  7. The enriched data in Druid can then power personalized predictions, cancellation-probability estimates, time-series forecasting, and similar analytics (see the Druid SQL sketch below).
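For steps 1 and 2, a Debezium source connector is typically registered with the Kafka Connect REST API (`POST /connectors`). A minimal sketch assuming a MySQL source and a Connect worker at `localhost:8083`; the connector name, credentials, and table list are placeholders, not values from this repo, and the `topic.prefix` key follows Debezium 2.x naming (older releases use `database.server.name`):

```python
import requests

# Register a Debezium MySQL source connector with the Kafka Connect
# REST API. All connection details below are hypothetical placeholders.
connector = {
    "name": "bookings-cdc",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "dbz",
        "database.server.id": "184054",
        "topic.prefix": "bookings",            # topics: bookings.<db>.<table>
        "table.include.list": "app.reservations",
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json()["name"], "registered")
```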
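For step 3, Flink can consume the Debezium-formatted Kafka topics and run SQL-based ETL. A minimal PyFlink Table API sketch; the topic name, schema, and broker address are illustrative assumptions, and the Flink Kafka connector JAR must be on the classpath at runtime:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming TableEnvironment for SQL-based ETL.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Declare the Kafka topic produced by Debezium as a source table.
# Topic name, fields, and broker address are hypothetical.
t_env.execute_sql("""
    CREATE TABLE reservations (
        id BIGINT,
        guest_name STRING,
        amount DOUBLE,
        status STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'bookings.app.reservations',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'debezium-json'
    )
""")

# A simple transformation: keep confirmed reservations only.
confirmed = t_env.sql_query(
    "SELECT id, guest_name, amount FROM reservations WHERE status = 'CONFIRMED'"
)
confirmed.execute().print()
```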
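For step 4, TensorFlow Serving exposes a REST predict endpoint (port 8501 by default) once the model container is running. A sketch of the request; the model name `demand_model` and the feature values are assumptions:

```python
import requests

# Serving container (shell), for reference:
#   docker run -p 8501:8501 \
#     -v /models/demand_model:/models/demand_model \
#     -e MODEL_NAME=demand_model tensorflow/serving
#
# REST predict endpoint: POST /v1/models/<model_name>:predict
features = [[0.3, 1.0, 12.5, 0.0]]   # hypothetical feature vector
resp = requests.post(
    "http://localhost:8501/v1/models/demand_model:predict",
    json={"instances": features},
)
resp.raise_for_status()
print(resp.json()["predictions"])
```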
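For step 5, predictions can be memoized in an embedded RocksDB store so repeated keys skip the model round trip. A sketch using the `python-rocksdb` bindings (an assumption; the pipeline could equally use RocksDB through Flink state or the Java API):

```python
import json
import rocksdb

# Open (or create) a local RocksDB instance used as a prediction cache.
db = rocksdb.DB("prediction_cache.db", rocksdb.Options(create_if_missing=True))

def cached_predict(key: str, features, predict_fn):
    """Return the cached prediction for `key`, calling `predict_fn` on a miss."""
    hit = db.get(key.encode())
    if hit is not None:
        return json.loads(hit)
    prediction = predict_fn(features)   # e.g. the TF Serving call above
    db.put(key.encode(), json.dumps(prediction).encode())
    return prediction
```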
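For steps 6 and 7, once the enriched records land in Druid, downstream services can query them with Druid SQL over the broker's HTTP API (`POST /druid/v2/sql`, port 8082 by default). A sketch; the `predictions` datasource and its columns are assumptions:

```python
import requests

# Top guests by average cancellation probability over the last week.
# Datasource and column names are hypothetical; __time is Druid's
# built-in timestamp column.
query = """
    SELECT guest_name, AVG(cancellation_probability) AS avg_cancel_prob
    FROM predictions
    WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
    GROUP BY guest_name
    ORDER BY avg_cancel_prob DESC
    LIMIT 10
"""
resp = requests.post("http://localhost:8082/druid/v2/sql", json={"query": query})
resp.raise_for_status()
for row in resp.json():
    print(row)
```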

Architecture diagram of the whole distributed deep learning pipeline:

*(image: Architecture-DistributedDL)*

Made with ❤️ by Praneet Pabolu