Credits: The Gale Encylopedia of Medicine
GaleMed Insights is an advanced medical chatbot built using a Retrieval-Augmented Generation (RAG) framework, leveraging the wealth of medical knowledge contained in The Gale Encyclopedia of Medicine, which spans over 637 pages. By combining PDF processing, semantic embedding, and vector databases, GaleMed Insights offers accurate, detailed, and user-friendly responses to medical inquiries. This makes it a valuable and accessible resource for individuals seeking trustworthy health information.
With its foundation in The Gale Encyclopedia of Medicine, GaleMed Insights ensures users can access well-researched, vetted content on a wide range of medical topics, from conditions and treatments to best healthcare practices. The chatbot empowers users to easily navigate complex medical concepts and obtain clear, relevant answers.
Link for the book: The Gale Encylopedia of Medicine
Using a Retrieval-Augmented Generation (RAG) methodology offers significant advantages over traditional large language models (LLMs) alone:
GaleMed Insights retrieves factual information directly from The Gale Encyclopedia of Medicine, ensuring that responses are factually accurate and relevant. This drastically reduces the risk of generating incorrect information (known as "hallucinations"), which is a common issue when using LLMs on their own.
Instead of relying on broad datasets, GaleMed Insights focuses on a specific medical knowledge base, ensuring that users receive expert, specialized responses. This contrasts with general LLMs, which may provide superficial or irrelevant answers to health-related queries.
Through chunking and semantic embedding, GaleMed Insights retrieves the most relevant information while maintaining the necessary context, which leads to clearer and more coherent responses tailored to the user's needs.
Dividing the text into chunks avoids running into token limits of LLMs, allowing the model to process large documents effectively without losing important details from the encyclopedia.
With precise, well-formatted responses, GaleMed Insights enhances the readability and accessibility of complex medical information, making it easier for users to understand and act on the information provided.
The GaleMed Insights project is designed to address the growing need for reliable, accurate, and accessible medical information. In a world where misinformation spreads easily, especially on health-related topics, GaleMed Insights offers a trusted source of knowledge from The Gale Encyclopedia of Medicine. Key reasons for developing this project include:
GaleMed Insights empowers individuals to make informed decisions about their health by providing clear and factual medical information.
The chatbot serves as a quick reference tool for healthcare providers, giving them rapid access to accurate medical information to assist in patient care.
As medical knowledge evolves, GaleMed Insights can easily be updated with new data, ensuring the chatbot remains current and provides the latest medical insights.
Credits: EagerWorks
The project utilizes the pyPDF library to read and extract data from The Gale Encyclopedia of Medicine. This library efficiently handles large documents, facilitating the structured extraction of valuable medical information.
def load_pdf(data):
loader = DirectoryLoader(data,
glob="*.pdf",
loader_cls=PyPDFLoader)
documents = loader.load()
return documents
After extracting the data, it is crucial to divide the information into manageable chunks. Chunking helps to:
-
Improve retrieval efficiency by enabling quick access to relevant sections.
-
Enhance context provided in responses, allowing the model to generate more meaningful answers.
-
Mitigate issues related to the maximum token limits of language models by ensuring that only concise and relevant information is processed at any given time.
We also use overlap to keep in the context of these chunks
def text_split(extracted_data):
text_splitter = RecursiveCharacterTextSplitter(chunk_size = 500, chunk_overlap = 20)
text_chunks = text_splitter.split_documents(extracted_data)
return text_chunks
- To facilitate meaningful queries and responses, we employ the HuggingFaceEmbeddings model (sentence-transformers/all-MiniLM-L6-v2).
Credits: Hugging Face
- This model generates semantic embeddings, which capture the context and meaning of the text beyond surface-level words. These embeddings enable GaleMed to comprehend user queries more effectively and retrieve the most relevant information.
def download_hugging_face_embeddings():
embeddings = HuggingFaceEmbeddings(model_name="sentence-transformers/all-MiniLM-L6-v2")
return embeddings
Credits:Pinecone
-
The next step involves creating a vector database using Pinecone.
-
Then we load the chunked and embedded data into Pinecone, establishing a knowledge base that can efficiently manage user queries by identifying similar vectors and retrieving pertinent information.
#Initializing index name and the Pinecone
os.environ["PINECONE_API_KEY"] = "Your-Pinecone-API-Key"
index_name="medical-vector"
# Initialize Pinecone with optional parameters
try:
pc = Pinecone(
api_key=os.environ.get("PINECONE_API_KEY"),
proxy_url=None, # Example optional parameter
proxy_headers=None, # Example optional parameter
ssl_ca_certs=None, # Example optional parameter
ssl_verify=True, # Example optional parameter, usually set to True
)
# Check if the index exists
indexes = pc.list_indexes() # List of index names
index_names = indexes.names() # Get only the names of the indexes
if index_name not in index_names:
print(f'{index_name} does not exist')
# Change the following line to create the index of your choice
pc.create_index(
name=index_name,
dimension=384,
metric="cosine",
spec=ServerlessSpec(
cloud="aws",
region="us-east-1"
)
)
else:
print(f'{index_name} exists.')
# Connect to the existing index
index = pc.Index(index_name)
except Exception as e:
print(f"An error occurred while checking indexes: {e}")
# Embedding the text chunks and storing them in Pinecone
try:
docsearch = LangchainPinecone.from_texts(
texts=[t.page_content for t in text_chunks], # Assuming `text_chunks` is a list of text splits
embedding=embeddings, # Embedding model instance
index_name=index_name
)
except Exception as e:
print(f"An error occurred while creating embeddings: {e}")
Once the knowledge base is established, we conduct test queries to evaluate the performance of GaleMed. This testing phase ensures that the chatbot retrieves accurate and relevant information effectively, confirming the effectiveness of the RAG approach.
As you can see it does fetch relevant information for the first three except the fourth which we will handle as well
Credits: Medium
To enhance the clarity and readability of the information returned to users, we utilize a large language model (LLM) based on the LLaMA architecture (CTransformers). This offline version processes the extracted data and formats it into coherent responses, making complex medical information accessible and understandable.
-
First we use an offline model Llama-2-7b
-
We then realised it takes a lot of time for us to get a response from these LLMs since not everyone usually have dedicated GPUs that these LLMs require
-
Thus, it would be better to use LLM API
-
For that we use any LLM Api but we went ahead and used the one at Groq
-
We utilized llama3-groq-70b-8192-tool-use-preview
-
The results were quick and effective
-
The response for "I have pain in my head" was
- It also ensure that it gives us a response only from the Gale Encylopedia mentioning "GB20" and the "yintang" points as well
-
Clearly our model with the Gale Encylopedia offers key inferences we wouldn't have gotten from general LLM models like ChatGPT
-
Further we also handled cases where the query and the results got from the vector DB are not relevant.
Credits: Flask
Finally, we aim to develop a Flask web application that serves as the interface for GaleMed. This application will allow users to interact with the chatbot seamlessly, posing questions and receiving comprehensive answers based on the medical encyclopedia.
This RAG-based approach is not only useful in healthcare but also has real-world applications in various fields, such as:
Legal research chatbots can use RAG to pull up relevant laws, precedents, and case summaries from specific legal databases, providing accurate and contextual legal advice.
Companies can use RAG to power customer service bots that retrieve relevant product information and troubleshooting steps from their internal knowledge bases, enhancing customer experience with quicker and more accurate responses.
RAG-based educational bots can assist students by providing targeted explanations and detailed answers based on textbooks, enhancing their learning experiences.
Tech companies can use RAG to help engineers and developers quickly retrieve information from large technical manuals or knowledge bases to solve problems efficiently.
By using a RAG framework such as GaleMed Insights, we can enable and ensure that users get precise, relevant information from a trusted source, improving the overall accuracy and reliability of responses across various industries.
Clone the repository
Project repo: https://github.com/pramodkondur/GaleMed-Insights-GenAI-RAG
conda create -n medicalbot python=3.11 -y
conda activate medicalbot
pip install -r requirements.txt
Create Pinecone, Groq account and API
Create a '.env; file in the root directory and add your Pinecone & openai credentials as follows:
PINECONE_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
OPENAI_API_KEY = "xxxxxxxxxxxxxxxxxxxxxxxxxxxxx"
# run the following command to store embeddings to pinecone
python store_index.py
# Finally run the following command
python app.py
Now,
open up localhost: