# Distributed Deep Learning

This repo contains the code for an end-to-end distributed deep learning pipeline.

The pipeline runs in seven steps (sketches of each step follow the list):

  1. Real-time streaming or batch data is captured from source databases with Debezium (change data capture).
  2. The captured changes are published to Apache Kafka topics through Kafka Connect connectors (see the connector registration sketch below).
  3. Apache Flink performs ETL on the resulting streams (see the PyFlink sketch below).
  4. Predictions on the streaming/batch data are obtained from models deployed with TensorFlow Serving on Docker (see the REST call below).
  5. Frequently requested predictions are cached with RocksDB (see the caching sketch below).
  6. Once the required predictions are made, all the data is pushed into Apache Druid, where further processing takes place.
  7. The enriched data in Druid can then power personalized predictions, cancellation-probability estimates, time-series forecasting, and similar analytics (see the Druid SQL sketch below).
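For steps 1 and 2, a Debezium source connector is typically registered with the Kafka Connect REST API (`POST /connectors`). A minimal sketch assuming a MySQL source and a Connect worker at `localhost:8083`; the connector name, credentials, and table list are placeholders, not values from this repo, and the `topic.prefix` key follows Debezium 2.x naming (older releases use `database.server.name`):

```python
import requests

# Register a Debezium MySQL source connector with the Kafka Connect
# REST API. All connection details below are hypothetical placeholders.
connector = {
    "name": "bookings-cdc",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "dbz",
        "database.server.id": "184054",
        "topic.prefix": "bookings",            # topics: bookings.<db>.<table>
        "table.include.list": "app.reservations",
    },
}

resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
print(resp.json()["name"], "registered")
```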
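For step 3, Flink can consume the Debezium-formatted Kafka topics and run SQL-based ETL. A minimal PyFlink Table API sketch; the topic name, schema, and broker address are illustrative assumptions, and the Flink Kafka connector JAR must be on the classpath at runtime:

```python
from pyflink.table import EnvironmentSettings, TableEnvironment

# Streaming TableEnvironment for SQL-based ETL.
t_env = TableEnvironment.create(EnvironmentSettings.in_streaming_mode())

# Declare the Kafka topic produced by Debezium as a source table.
# Topic name, fields, and broker address are hypothetical.
t_env.execute_sql("""
    CREATE TABLE reservations (
        id BIGINT,
        guest_name STRING,
        amount DOUBLE,
        status STRING
    ) WITH (
        'connector' = 'kafka',
        'topic' = 'bookings.app.reservations',
        'properties.bootstrap.servers' = 'localhost:9092',
        'scan.startup.mode' = 'earliest-offset',
        'format' = 'debezium-json'
    )
""")

# A simple transformation: keep confirmed reservations only.
confirmed = t_env.sql_query(
    "SELECT id, guest_name, amount FROM reservations WHERE status = 'CONFIRMED'"
)
confirmed.execute().print()
```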
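For step 4, TensorFlow Serving exposes a REST predict endpoint (port 8501 by default) once the model container is running. A sketch of the request; the model name `demand_model` and the feature values are assumptions:

```python
import requests

# Serving container (shell), for reference:
#   docker run -p 8501:8501 \
#     -v /models/demand_model:/models/demand_model \
#     -e MODEL_NAME=demand_model tensorflow/serving
#
# REST predict endpoint: POST /v1/models/<model_name>:predict
features = [[0.3, 1.0, 12.5, 0.0]]   # hypothetical feature vector
resp = requests.post(
    "http://localhost:8501/v1/models/demand_model:predict",
    json={"instances": features},
)
resp.raise_for_status()
print(resp.json()["predictions"])
```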
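For step 5, predictions can be memoized in an embedded RocksDB store so repeated keys skip the model round trip. A sketch using the `python-rocksdb` bindings (an assumption; the pipeline could equally use RocksDB through Flink state or the Java API):

```python
import json
import rocksdb

# Open (or create) a local RocksDB instance used as a prediction cache.
db = rocksdb.DB("prediction_cache.db", rocksdb.Options(create_if_missing=True))

def cached_predict(key: str, features, predict_fn):
    """Return the cached prediction for `key`, calling `predict_fn` on a miss."""
    hit = db.get(key.encode())
    if hit is not None:
        return json.loads(hit)
    prediction = predict_fn(features)   # e.g. the TF Serving call above
    db.put(key.encode(), json.dumps(prediction).encode())
    return prediction
```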
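For steps 6 and 7, once the enriched records land in Druid, downstream services can query them with Druid SQL over the broker's HTTP API (`POST /druid/v2/sql`, port 8082 by default). A sketch; the `predictions` datasource and its columns are assumptions:

```python
import requests

# Top guests by average cancellation probability over the last week.
# Datasource and column names are hypothetical; __time is Druid's
# built-in timestamp column.
query = """
    SELECT guest_name, AVG(cancellation_probability) AS avg_cancel_prob
    FROM predictions
    WHERE __time >= CURRENT_TIMESTAMP - INTERVAL '7' DAY
    GROUP BY guest_name
    ORDER BY avg_cancel_prob DESC
    LIMIT 10
"""
resp = requests.post("http://localhost:8082/druid/v2/sql", json={"query": query})
resp.raise_for_status()
for row in resp.json():
    print(row)
```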

Architecture diagram of the whole distributed deep learning pipeline:

*(image: Architecture-DistributedDL)*

Made with ❤️ by Praneet Pabolu