---Ingestion
- Load the Medium blog post (TextLoader)
- Split the blog post into smaller chunks (TextSplitter)
- Embed the chunks to get vectors (OpenAIEmbeddings)
- Store the embeddings in the Pinecone vector store (PineconeVectorStore)
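The splitting step can be illustrated with a minimal, dependency-free sketch. This is not LangChain's actual splitting logic: `split_text`, the sample text, and the `chunk_size` and `chunk_overlap` values are assumptions chosen for illustration. It only shows the general idea of cutting a long text into overlapping chunks before embedding.

```python
def split_text(text: str, chunk_size: int = 100, chunk_overlap: int = 20) -> list[str]:
    """Cut text into character chunks of at most chunk_size that overlap by chunk_overlap."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= chunk_overlap < chunk_size:
        raise ValueError("chunk_overlap must be in [0, chunk_size)")
    step = chunk_size - chunk_overlap  # how far the window advances each time
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


# Hypothetical stand-in for the loaded blog post.
blog_text = "Pinecone is a managed vector database. " * 10
chunks = split_text(blog_text, chunk_size=100, chunk_overlap=20)
print(len(chunks), "chunks")
```

The overlap means the end of one chunk is repeated at the start of the next, so a sentence that straddles a boundary still appears whole in at least one chunk.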
---Retrieval
- Embed the user query and compare it against the embeddings stored in the Pinecone vector store
- Semantic search (find the most relevant vectors)
- Prompt Augmentation
- Generation
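The semantic search step above can be sketched without any external service. The example below is only an illustration: the three-dimensional vectors are made up (real embeddings have far more dimensions), and `cosine_similarity` and `top_k` are hypothetical helpers, not Pinecone or LangChain functions. It ranks stored chunks by cosine similarity to the query vector and keeps the best matches.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as matching nothing
    return dot / (norm_a * norm_b)


def top_k(query_vector, stored, k=2):
    """Return the texts of the k stored chunks most similar to the query."""
    scored = [(cosine_similarity(query_vector, vec), text) for text, vec in stored]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [text for _, text in scored[:k]]


# Toy (text, vector) pairs standing in for the vector store contents.
stored_chunks = [
    ("Pinecone is a vector database.", [0.9, 0.1, 0.0]),
    ("Bananas are rich in potassium.", [0.0, 0.2, 0.9]),
    ("Vector search finds similar embeddings.", [0.8, 0.3, 0.1]),
]
query_vector = [1.0, 0.2, 0.0]  # pretend embedding of the user query
results = top_k(query_vector, stored_chunks, k=2)
print(results)
```

The retrieved texts are what later gets placed into the prompt during the augmentation step.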
This script answers a given query using a language model and a vector store. It combines several chains to retrieve relevant documents and generate a concise answer from them.
- Description: Formats a list of documents into a single string, separating each document by two newline characters.
- Parameters: docs, the list of documents to be formatted.
- Returns: A string containing the formatted documents.
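A helper matching this description could look like the sketch below. The function name `format_docs` and the small `Document` dataclass are assumptions made so the example runs on its own. In the real script the documents would come from the retriever, and the attribute holding the text is assumed here to be `page_content`.

```python
from dataclasses import dataclass


@dataclass
class Document:
    """Minimal stand-in for a retrieved document (assumed shape)."""
    page_content: str


def format_docs(docs):
    """Join the text of each document, separated by two newline characters."""
    return "\n\n".join(doc.page_content for doc in docs)


docs = [Document("First chunk."), Document("Second chunk.")]
formatted = format_docs(docs)
print(formatted)
```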
Retrieving Information:
- The script begins by initializing language models and loading the necessary environment variables.
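Since the script depends on environment variables, it can help to check them before initializing anything. The sketch below uses only the standard library. The variable names `OPENAI_API_KEY`, `PINECONE_API_KEY`, and `INDEX_NAME` are assumptions and may differ from the ones the script actually reads, and `missing_env_vars` is a hypothetical helper.

```python
import os

# Assumed variable names; adjust to whatever the script actually expects.
REQUIRED_VARS = ("OPENAI_API_KEY", "PINECONE_API_KEY", "INDEX_NAME")


def missing_env_vars(required=REQUIRED_VARS, environ=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"OPENAI_API_KEY": "sk-example", "INDEX_NAME": ""}
missing = missing_env_vars(environ=example_env)
print("Missing variables:", missing)
```

Failing early with a clear list of missing names is usually easier to debug than an authentication error raised later by a client library.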
Creating Chains:
- Embeddings and Language Models: OpenAI embeddings and the ChatOpenAI language model are initialized.
- Prompt Template: A prompt template is created from the query "What is Pinecone in Machine Learning?" using the PromptTemplate class.
- Vector Store: A Pinecone vector store is initialized with the specified index name and embeddings.
- Retrieval Chains: Retrieval chains are created using the create_stuff_documents_chain and create_retrieval_chain functions, which combine language models and vector stores to retrieve relevant documents based on the query.
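To make the data flow between the two chains concrete, here is a plain-Python sketch of how they fit together. It does not call LangChain: `make_stuff_documents_chain`, `make_retrieval_chain`, `fake_retriever`, and `fake_llm` are hypothetical stand-ins, and the template text is an assumption. The output dictionary mirrors the `input`, `context`, and `answer` keys that a retrieval chain returns.

```python
def make_stuff_documents_chain(llm, prompt_template):
    """Stand-in for a 'stuff' chain: put all documents into one prompt."""
    def chain(inputs):
        context = "\n\n".join(inputs["context"])
        prompt = prompt_template.format(context=context, input=inputs["input"])
        return llm(prompt)
    return chain


def make_retrieval_chain(retriever, combine_docs_chain):
    """Stand-in for a retrieval chain: fetch documents, then answer from them."""
    def chain(inputs):
        docs = retriever(inputs["input"])
        answer = combine_docs_chain({"input": inputs["input"], "context": docs})
        return {"input": inputs["input"], "context": docs, "answer": answer}
    return chain


def fake_retriever(query):
    # A real retriever would query the vector store; this returns fixed text.
    return ["Pinecone is a managed vector database.", "It stores embeddings."]


def fake_llm(prompt):
    # A real model would generate text; this echoes the prompt it received.
    return "LLM saw: " + prompt


TEMPLATE = "Answer using the context.\n\nContext:\n{context}\n\nQuestion: {input}"

combine_chain = make_stuff_documents_chain(fake_llm, TEMPLATE)
retrieval_chain = make_retrieval_chain(fake_retriever, combine_chain)
result = retrieval_chain({"input": "What is Pinecone in Machine Learning?"})
print(result["answer"])
```

Keeping retrieval and document combination as separate pieces means either one can be swapped (a different vector store, a different prompt) without touching the other.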
Invoking Chains:
- The retrieval chain is invoked with the query input, which retrieves relevant documents.
- A custom RAG prompt template is created to provide helpful answers to the question.
- The RAG chain is invoked with the query to generate a response using the retrieved documents and the specified question.
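The custom RAG chain can be pictured as a small pipeline: retrieve context, fill the prompt template, call the model. The sketch below illustrates that shape in plain Python. The wording of `RAG_TEMPLATE`, the `pipe` helper, the fixed documents, and `stub_model` are all assumptions for illustration, not the script's actual prompt or model call.

```python
from functools import reduce

# Assumed wording; the script's real RAG prompt may differ.
RAG_TEMPLATE = (
    "Use the following context to answer the question. "
    "If you don't know the answer, say so. Keep the answer concise.\n\n"
    "{context}\n\nQuestion: {question}\n\nHelpful Answer:"
)


def pipe(*steps):
    """Compose steps left to right: the output of one feeds the next."""
    return lambda value: reduce(lambda acc, step: step(acc), steps, value)


def retrieve_step(question):
    # Fixed documents stand in for the vector store lookup.
    docs = ["Pinecone is a managed vector database.", "It stores embeddings."]
    return {"context": "\n\n".join(docs), "question": question}


def prompt_step(inputs):
    # Prompt augmentation: insert retrieved context and the question.
    return RAG_TEMPLATE.format(**inputs)


def stub_model(prompt):
    # Placeholder for the language model call.
    return f"[stub answer; prompt had {len(prompt)} characters]"


rag_chain = pipe(retrieve_step, prompt_step, stub_model)
response = rag_chain("What is Pinecone in Machine Learning?")
print(response)
```

Stopping the pipeline after `prompt_step` is a simple way to inspect exactly what text the model would receive.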
Output:
- The retrieved documents and the response generated by the RAG chain are printed to the console.