Demo · Documentation · Report Bug · Request Feature
An intelligent document Q&A chat interface powered by RAG (Retrieval-Augmented Generation): transform your documents into interactive knowledge bases.
- Features
- Quick Start
- Architecture
- Project Structure
- Tech Stack
- API Documentation
- Configuration
- Deployment
- Security
- Contributing
- Troubleshooting
- License
- Contact
- Smart Document Management
  - Multi-format support (PDF, DOCX, TXT)
  - Batch uploads with progress tracking
  - Version control & metadata management
- AI-Powered Chat
  - Context-aware responses using RAG
  - Real-time interactions
  - Source citations
  - Conversation history
- Customization
  - Multiple LLM providers (OpenAI, Anthropic, Cohere)
  - Adjustable generation parameters
  - Custom prompting
  - Flexible output formatting
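To make "context-aware responses using RAG" and "source citations" more concrete, here is a minimal sketch of the retrieval step. It is not DocChat's actual implementation: the bag-of-words `embed` function is a toy stand-in for a real embedding model and vector store, and `Chunk`, `retrieve`, and `build_prompt` are hypothetical names chosen for illustration.

```python
import math
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Chunk:
    source: str  # document name, reused as the citation label
    text: str


def embed(text: str) -> Counter:
    # Toy stand-in for an embedding model: lowercase word counts.
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    # Cosine similarity between two sparse word-count vectors.
    dot = sum(a[token] * b[token] for token in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: List[Chunk], k: int = 2) -> List[Chunk]:
    # Rank chunks by similarity to the question and keep the top k.
    query = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(query, embed(c.text)), reverse=True)
    return ranked[:k]


def build_prompt(question: str, hits: List[Chunk]) -> str:
    # Number each retrieved chunk so the model can cite it as [n].
    context = "\n".join(
        f"[{i}] ({chunk.source}) {chunk.text}" for i, chunk in enumerate(hits, 1)
    )
    return (
        "Answer using only the numbered context and cite sources as [n].\n\n"
        f"{context}\n\nQuestion: {question}"
    )
```

The string returned by `build_prompt` would then be sent to the configured LLM provider; numbering the chunks is one simple way to let the answer point back to its sources.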
- Node.js 16+
- Python 3.8+
- MongoDB
- Pinecone account
- OpenAI API key
- Clone the repository
git clone https://github.com/yourusername/DocChat.git
cd DocChat
- Set up the backend
cd backend
python -m venv venv
source venv/bin/activate # Windows: .\venv\Scripts\activate
pip install -r requirements.txt
cp .env.example .env # Configure your environment variables
- Set up the frontend (from the repository root)
cd frontend
npm install
cp .env.example .env # Configure your environment variables
- Start the backend server (from the repository root)
cd backend
uvicorn app.main:app --reload
- Launch the frontend (from the repository root)
cd frontend
npm start
Visit http://localhost:3000 to see the application.
graph TD
A[Client] -->|HTTP/WebSocket| B[FastAPI Backend]
B -->|Document Storage| C[MongoDB]
B -->|Vector Storage| D[Pinecone]
B -->|RAG Pipeline| E[LangChain]
E -->|LLM Requests| F[OpenAI]
DocChat/
├── backend/                # FastAPI server
│   ├── app/
│   │   ├── api/            # REST endpoints
│   │   ├── core/           # Core utilities
│   │   ├── services/       # Business logic
│   │   └── models/         # Data models
│   └── tests/              # Backend tests
├── frontend/               # React application
│   ├── src/
│   │   ├── components/     # UI components
│   │   ├── features/       # Feature modules
│   │   └── lib/            # Utilities
│   └── tests/              # Frontend tests
└── docs/                   # Documentation
- React 18 with TypeScript
- TailwindCSS & HeadlessUI
- React Query & Zustand
- Jest & Testing Library
- FastAPI
- LangChain & LangGraph
- MongoDB & Pinecone
- OpenAI GPT-4
The application can be deployed using Docker in both development and production environments.
# Start all services with hot-reload
docker-compose -f deploy/docker/docker-compose.dev.yml up --build
# Start specific services
docker-compose -f deploy/docker/docker-compose.dev.yml up backend mongodb
docker-compose -f deploy/docker/docker-compose.dev.yml up frontend
# View logs
docker-compose -f deploy/docker/docker-compose.dev.yml logs -f
# Build and start all services in detached mode
docker-compose -f deploy/docker/docker-compose.yml up --build -d
# Check service status
docker-compose -f deploy/docker/docker-compose.yml ps
# Monitor logs
docker-compose -f deploy/docker/docker-compose.yml logs -f
- Backend Container: Python FastAPI application with uvicorn server
- Frontend Container: Nginx serving built React application
- MongoDB Container: Database service with persistent storage
- Volumes:
  - mongodb_data: Persistent database storage
  - uploads: Document storage for processed files
- Backend Environment (.env)
MONGODB_URL=mongodb://mongodb:27017
MONGODB_DB_NAME=DocChat
OPENAI_API_KEY=your_openai_key
PINECONE_API_KEY=your_pinecone_key
PINECONE_ENV=your_pinecone_environment
JWT_SECRET_KEY=your_jwt_secret
- Frontend Environment
REACT_APP_API_URL=http://localhost:8000
REACT_APP_WS_URL=ws://localhost:8000/ws
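As an illustration of how the backend variables listed above might be read and validated at startup, here is a minimal sketch. It assumes the variables are already present in the process environment (for example, injected by Docker Compose); the `Settings` class and `load_settings` function are hypothetical names, not part of the DocChat codebase.

```python
import os
from dataclasses import dataclass
from typing import Mapping

# Variable names taken from the backend .env listing.
REQUIRED = (
    "MONGODB_URL",
    "MONGODB_DB_NAME",
    "OPENAI_API_KEY",
    "PINECONE_API_KEY",
    "PINECONE_ENV",
    "JWT_SECRET_KEY",
)


@dataclass(frozen=True)
class Settings:
    mongodb_url: str
    mongodb_db_name: str
    openai_api_key: str
    pinecone_api_key: str
    pinecone_env: str
    jwt_secret_key: str


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    # Fail fast with one message that lists every missing or empty variable.
    missing = [name for name in REQUIRED if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return Settings(**{name.lower(): env[name] for name in REQUIRED})
```

Failing at startup with the full list of missing names tends to be easier to debug than a connection error surfacing later in a request handler.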
The deployment includes health checks for all services:
- Backend: HTTP health endpoint at /health
- Frontend: Nginx status page
- MongoDB: Connection check
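The backend health endpoint can also be probed from a script, for example in a deployment smoke test. The following is a small sketch using only the Python standard library; it assumes the endpoint answers with a 2xx status when the service is healthy, and the function name `is_healthy` is illustrative.

```python
import urllib.error
import urllib.request


def is_healthy(base_url: str, timeout: float = 3.0) -> bool:
    """Return True if GET <base_url>/health answers with a 2xx status."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Covers connection errors, timeouts, and 4xx/5xx responses
        # (urlopen raises HTTPError, a URLError subclass, for those).
        return False
```

With the backend running locally, `is_healthy("http://localhost:8000")` would be a typical call, matching the API URL used in the frontend environment.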
- Backend can be scaled horizontally using Docker Swarm or Kubernetes
- MongoDB should be configured with replication for production
- Consider using managed services for databases in production
- EC2 instances for application containers
- ECS/EKS for container orchestration
- MongoDB Atlas for database
- S3 for document storage
- CloudFront for CDN
- Route53 for DNS management
- Google Compute Engine for containers
- Google Kubernetes Engine for orchestration
- Cloud Storage for documents
- Cloud CDN for content delivery
- Cloud DNS for domain management
- Azure Container Instances
- AKS for Kubernetes deployment
- Azure Cosmos DB with MongoDB API
- Azure Blob Storage for documents
- Azure CDN for content delivery
- All containers run as non-root users
- Environment variables for sensitive data
- Regular security updates for base images
- Network isolation between services
- Rate limiting on API endpoints
- CORS configuration
- SSL/TLS encryption
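Rate limiting on API endpoints is often implemented with a token bucket. The sketch below shows the idea in plain Python; it is an illustration of the technique, not necessarily how DocChat enforces its limits, and the `TokenBucket` class is a hypothetical name.

```python
import time
from typing import Callable


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(
        self,
        rate: float,
        capacity: int,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)  # start with a full bucket
        self.updated = clock()

    def allow(self) -> bool:
        # Refill according to elapsed time, capped at the bucket capacity.
        now = self.clock()
        self.tokens = min(self.capacity, self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In a web backend, one bucket would typically be kept per client (for example, per API key or IP address), and a request for which `allow()` returns `False` would be rejected with HTTP 429.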
- Database Backups
# Manual MongoDB backup
docker-compose exec mongodb mongodump --out /backup
# Restore from backup
docker-compose exec mongodb mongorestore /backup
- Document Storage Backups
# Backup uploads volume (replace DocChat_backend_1 with the backend container name shown by docker ps)
docker run --rm --volumes-from DocChat_backend_1 -v "$(pwd)":/backup \
  alpine tar czvf /backup/uploads.tar.gz /app/uploads
- JWT-based authentication
- Rate limiting
- Input validation
- CORS protection
- Regular security audits
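To illustrate what JWT-based authentication involves, here is a standard-library sketch of signing and verifying an HS256 token with a shared secret such as `JWT_SECRET_KEY`. It is a teaching example under stated assumptions: the functions `sign_token` and `verify_token` are hypothetical, the verifier always checks with HS256 and ignores the header's `alg` field, and a maintained library such as PyJWT is usually the better choice in production.

```python
import base64
import hashlib
import hmac
import json
import time
from typing import Optional


def _b64url(data: bytes) -> str:
    # JWTs use URL-safe base64 without padding.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode("ascii")


def _b64url_decode(text: str) -> bytes:
    return base64.urlsafe_b64decode(text + "=" * (-len(text) % 4))


def _signature(signing_input: str, secret: str) -> bytes:
    return hmac.new(secret.encode(), signing_input.encode("ascii"), hashlib.sha256).digest()


def sign_token(claims: dict, secret: str) -> str:
    header = _b64url(json.dumps({"alg": "HS256", "typ": "JWT"}, separators=(",", ":")).encode())
    payload = _b64url(json.dumps(claims, separators=(",", ":")).encode())
    signing_input = f"{header}.{payload}"
    return f"{signing_input}.{_b64url(_signature(signing_input, secret))}"


def verify_token(token: str, secret: str, now: Optional[float] = None) -> Optional[dict]:
    """Return the claims if the signature is valid and the token is not expired, else None."""
    try:
        header, payload, signature = token.split(".")
        expected = _signature(f"{header}.{payload}", secret)
        # Constant-time comparison avoids leaking signature bytes via timing.
        if not hmac.compare_digest(expected, _b64url_decode(signature)):
            return None
        claims = json.loads(_b64url_decode(payload))
    except (ValueError, UnicodeError):
        return None
    if not isinstance(claims, dict):
        return None
    current = time.time() if now is None else now
    if "exp" in claims and current >= claims["exp"]:
        return None
    return claims
```

A login endpoint would call `sign_token` with a subject and an `exp` timestamp, and protected endpoints would call `verify_token` on the bearer token before serving the request.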
- Fork the repository
- Create your feature branch (git checkout -b feature/AmazingFeature)
- Commit changes (git commit -m 'Add AmazingFeature')
- Push to branch (git push origin feature/AmazingFeature)
- Open a Pull Request
Distributed under the MIT License. See LICENSE for more information.
# Backend
pip install black isort flake8
black .
isort .
flake8
# Frontend
npm run lint
npm run format
# Backend
pytest
pytest --cov=app tests/
# Frontend
npm run test
npm run test:coverage
# Development with hot-reload
docker-compose -f docker-compose.dev.yml up
# Production build
docker-compose -f docker-compose.prod.yml up
# Build images (Docker image names must be lowercase)
docker build -t docchat-backend -f backend/Dockerfile.prod backend/
docker build -t docchat-frontend -f frontend/Dockerfile.prod frontend/
# Run containers
docker-compose -f docker-compose.prod.yml up -d
Common Issues
- MongoDB Connection Errors
# Check MongoDB status
mongosh
# Verify connection string in .env
- Pinecone API Issues
  - Verify API key and environment
  - Check index name and dimension
- WebSocket Connection Failed
  - Verify backend is running
  - Check CORS settings
  - Confirm WebSocket URL
- Build Failures
# Clear node modules and reinstall
rm -rf node_modules
npm install
- Document chunking strategy
- Vector store indexing
- Response streaming
- Frontend caching
- API rate limiting
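The document chunking strategy has a direct effect on retrieval quality and indexing cost. As a rough sketch of one common approach (fixed-size word windows with overlap), consider the function below; the default sizes are arbitrary assumptions for illustration, and `chunk_text` is a hypothetical helper rather than DocChat's own splitter.

```python
from typing import List


def chunk_text(text: str, chunk_size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into windows of `chunk_size` words that share `overlap` words."""
    if chunk_size <= 0 or not 0 <= overlap < chunk_size:
        raise ValueError("need chunk_size > 0 and 0 <= overlap < chunk_size")
    words = text.split()
    step = chunk_size - overlap  # how far each window advances
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the text
    return chunks
```

Larger chunks give the model more context per hit but make embeddings less specific; the overlap reduces the chance that a sentence relevant to a question is split across two chunks.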
- Prometheus metrics
- Grafana dashboards
- Error tracking
- Usage analytics
- v1.0.0 - Initial release
- v1.1.0 - Added streaming support
- v1.2.0 - Multiple document handling
- v2.0.0 - New UI and improved RAG