This Streamlit application demonstrates how to combine ChatGroq (serving a Llama3 model), OpenAIEmbeddings, and FAISS for document embedding and retrieval. Users enter a question, and the app retrieves the relevant documents and generates an answer grounded in that context.
- Document Embedding: Embed documents using OpenAI embeddings and store them using FAISS.
- Question Answering: Answer user questions based on embedded documents using ChatGroq's Llama3 model.
- Document Similarity Search: Display similar documents related to the user's query.
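To make the similarity search feature concrete, here is a toy, pure-Python sketch of nearest-neighbour lookup over embedding vectors. It is not the app's code: FAISS performs this kind of search far more efficiently, and the file names and 3-dimensional vectors below are hypothetical stand-ins for real OpenAI embeddings.

```python
import math


def l2_distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query_vec, doc_vecs, k=2):
    """Return the ids of the k documents closest to the query, nearest first."""
    ranked = sorted(doc_vecs, key=lambda doc_id: l2_distance(query_vec, doc_vecs[doc_id]))
    return ranked[:k]


# Hypothetical toy "embeddings"; real embeddings have many more dimensions.
toy_index = {
    "refund_policy.pdf": [0.9, 0.1, 0.0],
    "shipping_times.pdf": [0.1, 0.9, 0.0],
    "privacy_notice.pdf": [0.0, 0.1, 0.9],
}

query = [0.8, 0.2, 0.0]  # stands in for the embedded user question
print(top_k(query, toy_index, k=2))
```

The question is embedded with the same model as the documents, so that a small distance between vectors indicates related content.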
To run this application locally, follow these steps:
- Prerequisites
- Clone the Repository
  git clone https://github.com/Tobsky/DocuQuery
  cd DocuQuery
- Set Up Environment Variables
  Create a .env file in the root directory of the project and add your OpenAI and Groq API keys:
  OPENAI_API_KEY=your_openai_api_key
  GROQ_API_KEY=your_groq_api_key
- Install Dependencies
  pip install -r requirements.txt
- Run the Application
  streamlit run app.py
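Before launching, it can help to confirm that the setup steps above took effect. The following is a minimal sketch of such a check, not part of the repository: the `preflight` function and `REQUIRED_KEYS` names are hypothetical, and the check reads `os.environ`, so it assumes the keys have already been exported or loaded from .env.

```python
import os
from pathlib import Path

# Key names taken from the .env example above.
REQUIRED_KEYS = ("OPENAI_API_KEY", "GROQ_API_KEY")


def preflight(env=None, docs_dir="./PDFdocs"):
    """Return a list of human-readable problems; an empty list means the setup looks usable."""
    env = os.environ if env is None else env
    problems = []
    for key in REQUIRED_KEYS:
        # Treat a missing key and a blank value the same way.
        if not env.get(key, "").strip():
            problems.append(f"{key} is missing or empty")
    folder = Path(docs_dir)
    if not folder.is_dir():
        problems.append(f"{docs_dir} does not exist")
    elif not any(p.suffix.lower() == ".pdf" for p in folder.iterdir()):
        problems.append(f"{docs_dir} contains no .pdf files")
    return problems


if __name__ == "__main__":
    for problem in preflight():
        print("Setup problem:", problem)
```

The check only confirms that the keys are present and that PDF files exist; it does not verify that the keys are valid or that the PDFs can be parsed.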
- Embed Documents: Click the "Embed Documents" button to process and embed the documents located in the ./PDFdocs directory.
- Ask a Question: Enter your question in the text input field and press Enter. The app will retrieve relevant documents and provide an answer based on the context.
- View Similar Documents: Expand the "Document Similarity Search" section to view similar documents related to your query.
- Environment Setup: Load API keys from the .env file using dotenv.
- Document Embedding: Embed documents using OpenAI embeddings and store them with FAISS.
- Question Answering: Use ChatGroq's Llama3 model to answer questions based on the provided context.
- Streamlit Interface: Provide a user interface to embed documents, ask questions, and view similar documents.
- vector_embedding(): Handles document embedding and vector store creation.
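- create_stuff_documents_chain(): Builds a chain that inserts ("stuffs") the retrieved documents into the prompt as context for the LLM.
- create_retrieval_chain(): Builds a chain that retrieves documents relevant to the user's query and passes them to the document chain to produce an answer.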
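The "stuff" step can be illustrated without any LangChain dependency. The sketch below is a simplified, hypothetical rendering of the idea: retrieved document texts are joined and placed into one prompt alongside the question. The prompt wording and the helper names `stuff_documents` and `build_prompt` are assumptions made for illustration, not the app's actual prompt or LangChain's implementation.

```python
def stuff_documents(docs, separator="\n\n"):
    """Join retrieved document texts into a single context string."""
    return separator.join(doc.strip() for doc in docs)


# Hypothetical prompt; the placeholders mirror the "context" and "input"
# variables that a retrieval chain fills in.
PROMPT_TEMPLATE = (
    "Answer the question based only on the provided context.\n"
    "<context>\n{context}\n</context>\n"
    "Question: {input}"
)


def build_prompt(question, retrieved_docs):
    """Produce the final prompt text that would be sent to the LLM."""
    return PROMPT_TEMPLATE.format(context=stuff_documents(retrieved_docs), input=question)


if __name__ == "__main__":
    docs = ["Refunds are issued within 14 days. ", "Shipping takes 3 to 5 business days."]
    print(build_prompt("How long do refunds take?", docs))
```

Because every retrieved document is placed in the prompt at once, this approach suits a small number of short chunks; long or numerous documents can exceed the model's context window.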
- Rate Limit Error: If you exceed the API quota, consider upgrading your OpenAI plan or reducing the number of API calls.
- Environment Variable Errors: Ensure your .env file is correctly set up with valid API keys.
- Document Loading Issues: Verify that the document directory (./PDFdocs) exists and contains valid PDF files.
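For the rate-limit case above, one common way to reduce failed calls is to retry with exponential backoff. The sketch below is a generic pattern rather than code from this project: `call_with_backoff` is a hypothetical helper, and the caller supplies `is_rate_limit` to recognise whichever exception their client library raises.

```python
import random
import time


def call_with_backoff(fn, is_rate_limit, max_attempts=5, base_delay=1.0, sleep=time.sleep):
    """Call fn(), retrying with exponentially growing delays on rate-limit errors."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception as exc:
            # Re-raise anything that is not a rate limit, or the final failure.
            if not is_rate_limit(exc) or attempt == max_attempts - 1:
                raise
            # Delay doubles each attempt; jitter avoids synchronized retries.
            delay = base_delay * (2 ** attempt) + random.uniform(0, base_delay)
            sleep(delay)
```

A call site would wrap the embedding or chat request in a zero-argument function and pass it as `fn`. Retrying helps with short bursts only; an exhausted quota still requires a plan change or fewer calls.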
If you would like to contribute to this project, please fork the repository and submit a pull request with your changes.