Chat with your Resume, an advanced RAG system

A tool for chatting with your resume, as simple as that.

FrontEnd technologies

Used React for a chat interface

Models Used

Language Model (LLM): Leveraged Groq API with llama3-70b-8192.
Embedding Model: Utilized nomic-embed-text-v1 using HuggingFace API.
Reranker: Utilized rerank-english-v2.0 using Cohere API.

System Overview

The RAG system consists of the following components:

Chunking and Embedding:

Text data is chunked into manageable pieces, with an additional summary of the resume. Each chunk is embedded using a model from HuggingFace. Embeddings are stored in a vector database (ChromaDB).

Caching

The system is able to cache previous questions and their answers, to retrieve the content automatically when the question is repeated.

Retrieval and Reranking:

Relevant chunks are retrieved from ChromaDB based on the query. Retrieved chunks are reranked using the Cohere API to ensure the most relevant chunks are prioritized.

Response Generation:

The top-ranked chunks are passed to the Llama model (via Groq API) to generate a coherent and relevant response.

How to start

Clone the repository

git clone https://github.com/AnasAber/RAG_in_CPU.git

Install the dependencies

pip install -r requirements.txt

Rename .env.example to .env, and Set up the environment variables:

export GROQ_API_KEY="your_groq_key"
export COHERE_API_KEY="your_cohere_key"
export HUGGINGFACE_API_KEY="your_huggingFace_key"

Run the app.py file

python app.py

To install and use Redis, Create a new terminal and follow this tutorial:

You'll first need to enable WSL2 (Windows Subsystem for Linux): https://learn.microsoft.com/en-us/windows/wsl/install

Next, type this command:

sudo service redis-server start

Move to frontend folder:

npm install
npm start

This project is not deployed yet as I'm still working on the frontend, and the cache mechanism needs to get optimized.

This project's RAG uses semantic search using ChromaDB and FAISS, I'll work on doing a combination of Hybrid Search and a HyDE following the best practices of RAG mentioned in the following paper: link

If you encounter an error just hit me up, make a pull request, or report an issue, and I'll happily respond.

Disadvantages

For cohere API, it's free for testing and unlimited, but not for production use as it's paid

Next goals

Optimize the caching mechanism
Create a better frontend.
See if there's a fast and good alternative to cohere api
Evaluating the performance of this RAG pipeline
Implement a combination of Hybrid Search and HyDE
Add Repacking after Reranking, and before giving the prompt back to the model

Name		Name	Last commit message	Last commit date
Latest commit History 25 Commits
.vscode		.vscode
data		data
frontend		frontend
src		src
.env.example		.env.example
.gitignore		.gitignore
README.md		README.md
answer.txt		answer.txt
app.py		app.py
package-lock.json		package-lock.json
package.json		package.json
requirements.txt		requirements.txt
setup.py		setup.py
structure.txt		structure.txt
tree.py		tree.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Chat with your Resume, an advanced RAG system

FrontEnd technologies

Models Used

System Overview

Chunking and Embedding:

Caching

Retrieval and Reranking:

Response Generation:

How to start

Disadvantages

Next goals

About

Releases

Packages

Languages

AnasAber/RAG_in_CPU

Folders and files

Latest commit

History

Repository files navigation

Chat with your Resume, an advanced RAG system

FrontEnd technologies

Models Used

System Overview

Chunking and Embedding:

Caching

Retrieval and Reranking:

Response Generation:

How to start

Disadvantages

Next goals

About

Topics

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages