
RAG LangChain

A production-ready RAG (Retrieval Augmented Generation) system built with FastAPI, LangChain, LangServe, LangSmith, Hugging Face, and Qdrant for document processing and intelligent querying.

Core Features

  • PDF Document Processing with automatic chunking and metadata enrichment
  • Vector Search using Qdrant with the sentence-transformers/all-MiniLM-L6-v2 embedding model for efficient document retrieval
  • Integration with google/flan-t5-base for question answering
  • RESTful API with streaming support
  • Docker containerization with multi-service architecture

Tech Stack

  • FastAPI - HTTP API framework
  • LangChain / LangServe - chain composition and serving
  • LangSmith - tracing and observability
  • Hugging Face - google/flan-t5-base LLM and sentence-transformers/all-MiniLM-L6-v2 embeddings
  • Qdrant - vector database
  • Docker / docker-compose - containerized deployment

Quick Start

  1. Clone the repository:

     git clone https://github.com/hoduy511/rag-langchain.git
     cd rag-langchain

  2. Create and configure your .env file with the required variables:

     cp .env-dev .env

  3. Start the services:

     make up

API Endpoints

Base URLs

Core Endpoints

  • GET /health - Health check endpoint
  • POST /api/v1/upload - Upload and process PDF documents
  • POST /api/v1/query - Query the knowledge base
  • POST /api/v1/search - Perform similarity search
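As a sketch of how a client might call the query endpoint — the base URL and the JSON field names ("question", "top_k") are assumptions for illustration, not the project's actual schema; check the Pydantic models for the real contract:

```python
import json
import urllib.request

# Assumed base URL for a local docker-compose deployment.
BASE_URL = "http://localhost:8000"

def build_query_request(question: str, top_k: int = 4) -> urllib.request.Request:
    """Build (but do not send) a POST request for /api/v1/query."""
    payload = json.dumps({"question": question, "top_k": top_k}).encode("utf-8")
    return urllib.request.Request(
        url=f"{BASE_URL}/api/v1/query",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_query_request("What does the uploaded document say about pricing?")
print(req.full_url)  # http://localhost:8000/api/v1/query
```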

Components

API Layer (src/api/)

  • FastAPI application handling HTTP requests
  • Route definitions for document upload, querying, and search
  • Input validation and response formatting
  • CORS and middleware configuration

Core Services (src/services/)

Document Formatter

  • Text chunking with configurable size and overlap
  • Metadata enrichment
  • UTF-8 encoding handling
  • Content cleaning and normalization
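The chunking step can be sketched as follows — a simplified, dependency-free stand-in for LangChain's text splitters, where each chunk shares `overlap` characters with its predecessor so sentences cut at a boundary survive intact in at least one chunk:

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into overlapping chunks of at most chunk_size characters."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap  # how far the window advances each iteration
    chunks = []
    for start in range(0, len(text), step):
        chunk = text[start:start + chunk_size]
        if chunk:
            chunks.append(chunk)
    return chunks
```

The default sizes here are illustrative; the project exposes them as configurable parameters.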

LLM Service

  • google/flan-t5-base model integration
  • Text generation pipeline configuration
  • Token length management
  • Model parameter optimization

PDF Service

  • PDF document processing
  • Text extraction and cleaning
  • Temporary file management
  • Chunk generation and storage
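Temporary file handling for uploads might look like this sketch — `extract` stands in for the real PDF text-extraction step (e.g. a pypdf-based loader) so the example stays dependency-free; the cleanup-in-`finally` pattern is the point:

```python
import os
import tempfile

def process_upload(data: bytes, extract) -> str:
    """Write uploaded PDF bytes to a temp file, run `extract` on its path,
    and guarantee the file is removed afterwards, even if extraction fails."""
    fd, path = tempfile.mkstemp(suffix=".pdf")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        return extract(path)
    finally:
        os.remove(path)
```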

Vector Store

  • Qdrant vector database integration
  • Document embedding using sentence-transformers
  • Similarity search functionality
  • Collection management and indexing
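Conceptually, the similarity search Qdrant performs boils down to ranking stored vectors by cosine similarity against the query embedding. A dependency-free sketch of that idea (the real service delegates this to Qdrant's index, which scales far beyond a linear scan):

```python
import math

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def similarity_search(query_vec, store, k=2):
    """store: list of (doc_id, vector) pairs. Return top-k ids by score."""
    scored = sorted(store, key=lambda item: cosine(query_vec, item[1]), reverse=True)
    return [doc_id for doc_id, _ in scored[:k]]
```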

RAG Chain (src/chains/)

  • LangChain implementation for question answering
  • Integration with google/flan-t5-base model
  • Prompt management and chain composition
  • Context retrieval and response generation
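The heart of the chain is composing retrieved context with the user's question before calling the model. A sketch of that step — the template text is illustrative, not the project's actual prompt:

```python
def build_prompt(question: str, contexts: list[str]) -> str:
    """Compose a grounded QA prompt: retrieved chunks go ahead of the
    question so the LLM (here, google/flan-t5-base) answers from them."""
    context_block = "\n\n".join(contexts)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context_block}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

In the actual chain, LangChain's prompt templates play this role and the result is piped into the text-generation pipeline.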

Data Models (src/models/)

  • Pydantic schemas for request/response validation
  • Data transfer object definitions
  • Type hints and validation rules
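To illustrate the shape of such a schema — the project uses Pydantic models, but this sketch mirrors one with a stdlib dataclass so it runs without dependencies; the field names are assumptions:

```python
from dataclasses import dataclass

@dataclass
class QueryRequest:
    """Stand-in for a Pydantic request schema with basic validation."""
    question: str
    top_k: int = 4

    def __post_init__(self):
        # Pydantic would enforce these rules declaratively via validators.
        if not self.question.strip():
            raise ValueError("question must not be empty")
        if self.top_k < 1:
            raise ValueError("top_k must be positive")
```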

Development

Available make commands for development:

  • up: Start all services with docker-compose
  • down: Stop all services and remove containers
  • logs: View container logs in follow mode
  • shell: Open interactive shell in app container
  • clean: Remove all containers and volumes, and prune the Docker system
  • test: Run pytest test suite
  • format: Format Python code with autopep8 and isort
  • lint: Run flake8 linter checks

Testing

The project includes comprehensive tests for all components including API endpoints, RAG chain implementation, and various services.

Configuration

Environment Variables

  • API settings
  • Qdrant vector database configuration
  • Model settings (google/flan-t5-base and sentence-transformers/all-MiniLM-L6-v2)
  • LangChain integration parameters
  • LangSmith API keys and project settings
  • Hugging Face API tokens and model configurations
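A settings loader over these variables might be sketched as below — the variable names and defaults are assumptions for illustration; the project's .env-dev file is the source of truth:

```python
import os

def load_settings(env=None) -> dict:
    """Read configuration from environment variables with sensible defaults."""
    env = env if env is not None else os.environ
    return {
        "qdrant_url": env.get("QDRANT_URL", "http://localhost:6333"),
        "llm_model": env.get("LLM_MODEL", "google/flan-t5-base"),
        "embedding_model": env.get(
            "EMBEDDING_MODEL", "sentence-transformers/all-MiniLM-L6-v2"
        ),
        "hf_api_token": env.get("HUGGINGFACEHUB_API_TOKEN", ""),
    }
```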
