Intelligent document Q&A powered by RAG technology - transform your documents into interactive knowledge bases

soheil-mp/DocChat

CC0 License FastAPI React TypeScript Python MongoDB Pinecone OpenAI LangChain Docker Tailwind CSS Node.js Jest WebSocket

Demo · Documentation · Report Bug · Request Feature

DocChat

🤖 An intelligent document Q&A chat interface powered by RAG (Retrieval-Augmented Generation) - transform your documents into interactive knowledge bases.

DocChat Demo

✨ Features

  • 📄 Smart Document Management
    • Multi-format support (PDF, DOCX, TXT)
    • Batch uploads with progress tracking
    • Version control & metadata management
  • 💬 AI-Powered Chat
    • Context-aware responses using RAG
    • Real-time interactions
    • Source citations
    • Conversation history
  • ⚙️ Customization
    • Multiple LLM providers (OpenAI, Anthropic, Cohere)
    • Adjustable generation parameters
    • Custom prompting
    • Flexible output formatting
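
The customization options above could be modeled as a small settings object. The following is a minimal sketch, not DocChat's actual configuration code: the class name `GenerationSettings`, its fields, the default values, and the accepted temperature range are all assumptions made for illustration. Only the three provider names come from the feature list.

```python
from dataclasses import dataclass

# Providers named in the feature list above.
SUPPORTED_PROVIDERS = ("openai", "anthropic", "cohere")


@dataclass(frozen=True)
class GenerationSettings:
    """Hypothetical container for per-request generation options."""

    provider: str = "openai"
    temperature: float = 0.2   # assumed default, not taken from DocChat
    max_tokens: int = 512      # assumed default, not taken from DocChat
    prompt_template: str = (
        "Answer using only the context below.\n\n"
        "Context:\n{context}\n\nQuestion: {question}"
    )

    def __post_init__(self) -> None:
        # Reject values early so a bad setting never reaches the LLM call.
        if self.provider not in SUPPORTED_PROVIDERS:
            raise ValueError(f"unsupported provider: {self.provider!r}")
        if not 0.0 <= self.temperature <= 2.0:  # assumed range
            raise ValueError("temperature must be between 0.0 and 2.0")
        if self.max_tokens <= 0:
            raise ValueError("max_tokens must be positive")

    def render_prompt(self, context: str, question: str) -> str:
        """Fill the custom prompt template with retrieved context."""
        return self.prompt_template.format(context=context, question=question)
```

Validating in `__post_init__` keeps invalid combinations from being constructed at all, which is one simple way to surface configuration mistakes before a request is sent.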

🚀 Quick Start

Prerequisites

  • Node.js 16+
  • Python 3.8+
  • MongoDB
  • Pinecone account
  • OpenAI API key

Installation

  1. Clone the repository
git clone https://github.com/soheil-mp/DocChat.git
cd DocChat
  2. Set up the backend
cd backend
python -m venv venv
source venv/bin/activate  # Windows: .\venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env     # Configure your environment variables
  3. Set up the frontend
cd ../frontend
npm install
cp .env.example .env     # Configure your environment variables

Running Locally

  1. Start the backend server
cd backend
uvicorn app.main:app --reload
  2. Launch the frontend (in a second terminal, from the repository root)
cd frontend
npm start

Visit http://localhost:3000 to see the application.

🏗️ Architecture

graph TD
    A[Client] -->|HTTP/WebSocket| B[FastAPI Backend]
    B -->|Document Storage| C[MongoDB]
    B -->|Vector Storage| D[Pinecone]
    B -->|RAG Pipeline| E[LangChain]
    E -->|LLM Requests| F[OpenAI]
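
To make the RAG step in the diagram concrete, here is a minimal, dependency-free sketch of retrieve-then-prompt. It is not DocChat's pipeline: a toy bag-of-words vector stands in for the OpenAI embeddings, an in-memory list stands in for Pinecone, and the names `Chunk`, `embed`, `retrieve`, and `build_prompt` are hypothetical.

```python
import math
import re
from collections import Counter
from typing import List, NamedTuple


class Chunk(NamedTuple):
    source: str  # file the text came from, used for citations
    text: str


def embed(text: str) -> Counter:
    """Toy bag-of-words vector; a real pipeline would call an embedding model."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question: str, chunks: List[Chunk], k: int = 2) -> List[Chunk]:
    """Return the k chunks most similar to the question (stand-in for a vector-store query)."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, hits: List[Chunk]) -> str:
    """Assemble the augmented prompt, numbering each chunk so the answer can cite it."""
    context = "\n".join(
        f"[{i}] ({c.source}) {c.text}" for i, c in enumerate(hits, start=1)
    )
    return f"Context:\n{context}\n\nQuestion: {question}\nCite sources as [n]."
```

The string returned by `build_prompt` is what would be sent to the LLM. Numbering the chunks is one simple way to support the source citations listed under Features.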

📁 Project Structure

DocChat/
├── backend/              # FastAPI server
│   ├── app/
│   │   ├── api/          # REST endpoints
│   │   ├── core/         # Core utilities
│   │   ├── services/     # Business logic
│   │   └── models/       # Data models
│   └── tests/            # Backend tests
├── frontend/             # React application
│   ├── src/
│   │   ├── components/   # UI components
│   │   ├── features/     # Feature modules
│   │   └── lib/          # Utilities
│   └── tests/            # Frontend tests
└── docs/                 # Documentation

🛠️ Tech Stack

Frontend

  • React 18 with TypeScript
  • TailwindCSS & HeadlessUI
  • React Query & Zustand
  • Jest & Testing Library

Backend

  • FastAPI
  • LangChain & LangGraph
  • MongoDB & Pinecone
  • OpenAI GPT-4

📦 Deployment

Docker Deployment

The application can be deployed using Docker in both development and production environments.

Development Environment

# Start all services with hot-reload
docker-compose -f deploy/docker/docker-compose.dev.yml up --build

# Start specific services
docker-compose -f deploy/docker/docker-compose.dev.yml up backend mongodb
docker-compose -f deploy/docker/docker-compose.dev.yml up frontend

# View logs
docker-compose -f deploy/docker/docker-compose.dev.yml logs -f

Production Environment

# Build and start all services in detached mode
docker-compose -f deploy/docker/docker-compose.yml up --build -d

# Check service status
docker-compose -f deploy/docker/docker-compose.yml ps

# Monitor logs
docker-compose -f deploy/docker/docker-compose.yml logs -f

Container Architecture

  • Backend Container: Python FastAPI application with uvicorn server
  • Frontend Container: Nginx serving built React application
  • MongoDB Container: Database service with persistent storage
  • Volumes:
    • mongodb_data: Persistent database storage
    • uploads: Document storage for processed files

Environment Configuration

  1. Backend Environment (.env)
MONGODB_URL=mongodb://mongodb:27017
MONGODB_DB_NAME=DocChat
OPENAI_API_KEY=your_openai_key
PINECONE_API_KEY=your_pinecone_key
PINECONE_ENV=your_pinecone_environment
JWT_SECRET_KEY=your_jwt_secret
  2. Frontend Environment (.env)
REACT_APP_API_URL=http://localhost:8000
REACT_APP_WS_URL=ws://localhost:8000/ws
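
A missing or placeholder backend variable is easier to diagnose at startup than at the first failed request. The sketch below shows one way to check the backend variables listed above before the server starts. The function name `load_settings` and the rule that values beginning with `your_` count as unset are assumptions for illustration, not part of DocChat.

```python
import os
from typing import Dict, Mapping

# Variable names copied from the backend .env listing above.
REQUIRED_VARS = (
    "MONGODB_URL",
    "MONGODB_DB_NAME",
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_ENV",
    "JWT_SECRET_KEY",
)


def _is_unset(value: str) -> bool:
    """Treat empty values and untouched 'your_...' placeholders as missing (assumed convention)."""
    value = value.strip()
    return not value or value.startswith("your_")


def load_settings(env: Mapping[str, str] = os.environ) -> Dict[str, str]:
    """Return the required settings, or raise an error listing every missing variable."""
    missing = [name for name in REQUIRED_VARS if _is_unset(env.get(name, ""))]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in REQUIRED_VARS}
```

Reporting all missing names in one error avoids the fix-one-rerun-find-another loop when setting up a new environment.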

Health Monitoring

The deployment includes health checks for all services:

  • Backend: HTTP health endpoint at /health
  • Frontend: Nginx status page
  • MongoDB: Connection check
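
The backend check can also be run by hand or from a script. This is a minimal sketch using only the standard library. It assumes the backend listens on `http://localhost:8000` (the value of `REACT_APP_API_URL` above) and that `/health` answers with HTTP 200 when healthy. The README does not specify the response body, so the sketch ignores it. The `opener` parameter is a hypothetical hook that makes the function testable without a running server.

```python
import urllib.request
from typing import Callable


def backend_is_healthy(
    base_url: str = "http://localhost:8000",
    timeout: float = 3.0,
    opener: Callable = urllib.request.urlopen,
) -> bool:
    """Return True when GET {base_url}/health answers with HTTP 200."""
    url = base_url.rstrip("/") + "/health"
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # URLError, HTTPError, refused connections and timeouts are all OSError subclasses.
        return False
```

Calling `backend_is_healthy()` with no arguments probes the local development server and returns `False` rather than raising when nothing is listening.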

Scaling Considerations

  • Backend can be scaled horizontally using Docker Swarm or Kubernetes
  • MongoDB should be configured with replication for production
  • Consider using managed services for databases in production

Cloud Platform Deployment

AWS Deployment

  • EC2 instances for application containers
  • ECS/EKS for container orchestration
  • MongoDB Atlas for database
  • S3 for document storage
  • CloudFront for CDN
  • Route53 for DNS management

Detailed AWS Setup Guide

Google Cloud Platform

  • Google Compute Engine for containers
  • Google Kubernetes Engine for orchestration
  • Cloud Storage for documents
  • Cloud CDN for content delivery
  • Cloud DNS for domain management

Detailed GCP Setup Guide

Microsoft Azure

  • Azure Container Instances
  • AKS for Kubernetes deployment
  • Azure Cosmos DB with MongoDB API
  • Azure Blob Storage for documents
  • Azure CDN for content delivery

Detailed Azure Setup Guide

Security Considerations

  • All containers run as non-root users
  • Environment variables for sensitive data
  • Regular security updates for base images
  • Network isolation between services
  • Rate limiting on API endpoints
  • CORS configuration
  • SSL/TLS encryption

Backup Strategy

  1. Database Backups
# Manual MongoDB backup
docker-compose exec mongodb mongodump --out /backup

# Restore from backup
docker-compose exec mongodb mongorestore /backup
  2. Document Storage Backups
# Backup uploads volume (confirm the backend container name with `docker ps`)
docker run --rm --volumes-from DocChat_backend_1 -v "$(pwd)":/backup \
  alpine tar czvf /backup/uploads.tar.gz /app/uploads

🔒 Security

  • JWT-based authentication
  • Rate limiting
  • Input validation
  • CORS protection
  • Regular security audits
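
The README does not say how rate limiting is implemented. As an illustration of the idea only, here is a minimal in-memory sliding-window limiter. The class name, the per-client key, and the injectable `clock` are assumptions, and a multi-process deployment would need shared storage instead of a local dictionary.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most `limit` requests per `window` seconds for each client key."""

    def __init__(
        self,
        limit: int,
        window: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.limit = limit
        self.window = window
        self.clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str) -> bool:
        """Record a request for `key` and report whether it is within the limit."""
        now = self.clock()
        hits = self._hits[key]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window:
            hits.popleft()
        if len(hits) >= self.limit:
            return False
        hits.append(now)
        return True
```

A request handler would call `allow()` with the client's IP address or user ID and respond with HTTP 429 when it returns `False`.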

🀝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/AmazingFeature)
  3. Commit changes (git commit -m 'Add AmazingFeature')
  4. Push to branch (git push origin feature/AmazingFeature)
  5. Open a Pull Request

📝 License

Distributed under the MIT License. See LICENSE for more information.

🔧 Development

Code Style

# Backend
pip install black isort flake8
black .
isort .
flake8

# Frontend
npm run lint
npm run format

Testing

# Backend
pytest
pytest --cov=app tests/

# Frontend
npm run test
npm run test:coverage

🐳 Docker Support

Development

# Development with hot-reload
docker-compose -f docker-compose.dev.yml up

# Production build
docker-compose -f docker-compose.prod.yml up

Production

# Build images
# Docker image names must be lowercase
docker build -t docchat-backend -f backend/Dockerfile.prod backend/
docker build -t docchat-frontend -f frontend/Dockerfile.prod frontend/

# Run containers
docker-compose -f docker-compose.prod.yml up -d

🔍 Troubleshooting

Common Issues

Backend Issues

  1. MongoDB Connection Errors

    # Check MongoDB status
    mongosh
    # Verify connection string in .env
  2. Pinecone API Issues

    • Verify API key and environment
    • Check index name and dimension
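
For the MongoDB connection errors in item 1, it can help to separate "the host is unreachable" from "the connection string is wrong". The sketch below parses a single-host URL such as the `MONGODB_URL` value shown earlier and attempts a plain TCP connection. The helper names are hypothetical, and multi-host replica-set URLs are not handled.

```python
import socket
from typing import Tuple
from urllib.parse import urlsplit


def host_and_port(url: str, default_port: int) -> Tuple[str, int]:
    """Extract host and port from a single-host URL such as mongodb://mongodb:27017."""
    parts = urlsplit(url)
    if not parts.hostname:
        raise ValueError(f"no host in URL: {url!r}")
    return parts.hostname, parts.port or default_port


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

For example, `can_connect(*host_and_port("mongodb://localhost:27017", 27017))` returning `False` points at the database process or network rather than at credentials. Note that the hostname `mongodb` only resolves inside the Docker Compose network.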

Frontend Issues

  1. WebSocket Connection Failed

    • Verify backend is running
    • Check CORS settings
    • Confirm WebSocket URL
  2. Build Failures

    # Clear node modules and reinstall
    rm -rf node_modules
    npm install

📈 Performance

Optimizations

  • Document chunking strategy
  • Vector store indexing
  • Response streaming
  • Frontend caching
  • API rate limiting
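
The README does not describe which chunking strategy DocChat uses. One common approach is fixed-size windows with overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The sketch below assumes character-based windows and uses a hypothetical `chunk_text` helper with illustrative defaults.

```python
from typing import List


def chunk_text(text: str, size: int = 500, overlap: int = 50) -> List[str]:
    """Split text into windows of `size` characters that overlap by `overlap`."""
    if size <= 0:
        raise ValueError("size must be positive")
    if not 0 <= overlap < size:
        raise ValueError("overlap must satisfy 0 <= overlap < size")
    step = size - overlap
    chunks: List[str] = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks
```

Larger chunks give the model more context per hit but make retrieval less precise. The overlap trades extra storage for fewer answers lost at chunk boundaries.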

Monitoring

  • Prometheus metrics
  • Grafana dashboards
  • Error tracking
  • Usage analytics

🔄 Updates & Migration

Version History

  • v1.0.0 - Initial release
  • v1.1.0 - Added streaming support
  • v1.2.0 - Multiple document handling
  • v2.0.0 - New UI and improved RAG

Migration Guides
