Rectik is a recommendation system project aimed at building, training, and deploying a multi-stage recommendation system for TikTok-style short videos. This project utilizes the NVIDIA Merlin ecosystem for efficient data processing, feature extraction, and model deployment, while leveraging Metaflow for workflow management.
Rectik’s workflow is divided into three main flows:
- Data Flow: Handles data preprocessing, feature extraction, and transformations.
- Train Flow: Defines and trains the recommendation models, including retrieval and reranking stages.
- Serve Flow: Combines models from the train flow to create an ensemble for deployment.
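The hand-off between the three flows can be sketched in plain Python. This is an illustrative outline only: the real project defines each stage as a Metaflow flow, and every function and dictionary key below is a hypothetical stand-in.

```python
# Hypothetical sketch of how the three flows pass artifacts along.
# The real project uses Metaflow FlowSpec classes, not plain functions.

def data_flow(raw_events):
    """Preprocess raw interactions and split them into train/test sets."""
    cleaned = [e for e in raw_events if e.get("video_id") is not None]
    cutoff = int(len(cleaned) * 0.8)
    return {"train": cleaned[:cutoff], "test": cleaned[cutoff:]}

def train_flow(datasets):
    """Train the retrieval and reranking models on the prepared data."""
    retrieval_model = {"type": "two-tower", "n_train": len(datasets["train"])}
    rerank_model = {"type": "dlrm", "n_train": len(datasets["train"])}
    return {"retrieval": retrieval_model, "rerank": rerank_model}

def serve_flow(models):
    """Bundle the trained models into a deployable ensemble."""
    return {"ensemble": [models["retrieval"], models["rerank"]]}

events = [{"video_id": i, "user_id": i % 3} for i in range(10)]
artifacts = serve_flow(train_flow(data_flow(events)))
print(len(artifacts["ensemble"]))  # → 2
```

Each flow consumes the previous flow's output, which is what lets the stages be developed and rerun independently.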
The Data Flow pipeline is responsible for:
- Data Preprocessing: Preparing raw data for modeling, including handling missing values, feature engineering, and data transformations.
- Feature Extraction: Extracting video features to serve as input for downstream models.
- Data Splitting: Splitting the data into training and testing sets.

Together, these steps ensure that the data is compatible with NVIDIA Merlin models.
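A pure-Python sketch of the preprocessing and splitting steps, under stated assumptions: the real flow runs on NVTabular/cuDF, and the column names used here (`user_id`, `video_id`, `watch_ratio`, `timestamp`) are illustrative, not taken from the project.

```python
# Minimal stand-in for the Data Flow steps; column names are assumptions.

def preprocess(rows):
    """Drop rows with missing IDs and clip the engagement label to [0, 5]."""
    out = []
    for r in rows:
        if r.get("user_id") is None or r.get("video_id") is None:
            continue
        r = dict(r)
        r["watch_ratio"] = min(max(r.get("watch_ratio", 0.0), 0.0), 5.0)
        out.append(r)
    return out

def time_split(rows, test_fraction=0.2):
    """Chronological split: the newest interactions become the test set."""
    rows = sorted(rows, key=lambda r: r["timestamp"])
    cutoff = int(len(rows) * (1 - test_fraction))
    return rows[:cutoff], rows[cutoff:]

rows = [
    {"user_id": 1, "video_id": 10, "watch_ratio": 7.0, "timestamp": 3},
    {"user_id": None, "video_id": 11, "watch_ratio": 1.0, "timestamp": 1},
    {"user_id": 2, "video_id": 12, "watch_ratio": 0.5, "timestamp": 2},
]
train, test = time_split(preprocess(rows), test_fraction=0.5)
```

A time-based split is shown because recommendation models are usually evaluated on interactions that happen after the training window, avoiding leakage from future behavior.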
The Train Flow pipeline defines and trains models for the recommendation system using a multi-stage approach:
- Retrieval (Two-Tower Model): This model narrows the full video catalog down to a manageable set of candidate recommendations.
- Reranking (DLRM): This model ranks the retrieved candidates to find the most relevant videos.

Tools Used:
- NVIDIA Merlin: For model building and training.
- FAISS: For vector similarity search, used to speed up the retrieval of candidates.
- Feast: As a feature store for managing and serving user and item features.
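The retrieve-then-rerank pattern can be illustrated in pure Python. This is a toy sketch: in the real system the item-tower embeddings are indexed with FAISS and the reranker is a trained DLRM, whereas here a brute-force inner-product search stands in for FAISS and the embedding vectors are made up.

```python
# Toy retrieve-then-rerank pipeline; embeddings and IDs are illustrative.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

item_embeddings = {
    "v1": [0.9, 0.1], "v2": [0.1, 0.9], "v3": [0.7, 0.3], "v4": [0.2, 0.8],
}

def retrieve(user_vec, k=2):
    """Top-k candidates by inner product (brute-force stand-in for FAISS)."""
    scored = sorted(item_embeddings.items(),
                    key=lambda kv: dot(user_vec, kv[1]), reverse=True)
    return [vid for vid, _ in scored[:k]]

def rerank(user_vec, candidates):
    """Rescore only the candidates; a DLRM would use richer cross features."""
    return sorted(candidates,
                  key=lambda v: dot(user_vec, item_embeddings[v]),
                  reverse=True)

user = [1.0, 0.0]
print(rerank(user, retrieve(user, k=2)))  # → ['v1', 'v3']
```

The point of the two stages is cost: the cheap retrieval step scans the whole catalog, while the expensive reranker only scores the short candidate list.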
The Serve Flow pipeline handles the following tasks:
- Ensemble Creation: Merging the retrieval and reranking models into a single ensemble.
- Deployment Setup: Preparing the model repository with metadata, workflows, and checkpoints for deployment on Triton Server for efficient inference.
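The deployment setup ultimately produces a Triton model repository: one directory per model, numbered version subdirectories, and a `config.pbtxt` for each. The sketch below builds that layout with the standard library; in the real flow the repository (including the ensemble wiring) is generated by Merlin, and the model names here are illustrative.

```python
# Sketch of the directory layout Triton Server loads models from.
# Model names and the (minimal) config contents are assumptions.
import pathlib
import tempfile

def make_repository(root, models):
    for name in models:
        version_dir = pathlib.Path(root) / name / "1"   # version "1"
        version_dir.mkdir(parents=True, exist_ok=True)
        (pathlib.Path(root) / name / "config.pbtxt").write_text(
            f'name: "{name}"\n')
    return sorted(p.name for p in pathlib.Path(root).iterdir())

repo = tempfile.mkdtemp()
print(make_repository(repo, ["retrieval", "rerank", "ensemble"]))
# → ['ensemble', 'rerank', 'retrieval']
```

Pointing Triton at this directory (`tritonserver --model-repository=<path>`) is what makes the ensemble servable.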
Tech Stack:
- Data Processing:
  - NVIDIA Merlin ecosystem for end-to-end recommendation workflows.
  - PyArrow, cuDF, and Dask for fast data manipulation.
  - DuckDB for SQL-based operations on Parquet files.
- Vector Store: FAISS for fast similarity search over candidate embeddings.
- Feature Store: Feast for managing and serving user and item features.
- Model Deployment: Triton Inference Server for efficient inference of model ensembles.