Commit

doc: add/improve
Signed-off-by: hmoazzem <moazzem@edgeflare.io>
hmoazzem committed Oct 21, 2024
1 parent 8ca611e commit 7936a59
Showing 10 changed files with 243 additions and 91 deletions.
12 changes: 10 additions & 2 deletions docs/docker-compose.yaml
@@ -2,7 +2,14 @@ version: '3.8'

services:
postgres:
image: docker.io/edgeflare/postgresql:16.4.0-wal-g
image: docker.io/edgeflare/postgresql:16 # bitnami/postgresql:16 + wal-g
# image: docker.io/edgeflare/postgresql:16-wal-g # WAL archiving, Point-in-time restore
# image: docker.io/edgeflare/postgresql:16-pgvector # Vector extension
# image: docker.io/edgeflare/postgresql:16-apache-age # Graph extension
# image: docker.io/edgeflare/postgresql:16-wal2json # WAL as JSON (prefer pgoutput instead)
# image: docker.io/timescale/timescaledb-ha:pg16 # Timeseries extension. includes pgvector, pgvectorscale
# image: docker.io/bitnami/postgresql:16 # includes PostGIS
# image: docker.io/postgres:16
environment:
POSTGRES_HOST_AUTH_METHOD: md5
POSTGRES_USER: postgres
@@ -14,7 +21,7 @@ services:
volumes:
- "$PWD/postgresql:/bitnami/postgresql"
emqx:
image: emqx:5.8
image: docker.io/emqx:5.8
environment:
EMQX_DASHBOARD__DEFAULT_PASSWORD: public
EMQX_DASHBOARD__DEFAULT_USERNAME: admin
@@ -36,3 +43,4 @@ services:
# ports:
# - "6433:6432"
# - "9930:9930"

@@ -3,6 +3,7 @@
1. Start Postgres and EMQX MQTT broker

```shell
curl -OL https://raw.githubusercontent.com/edgeflare/pgo/refs/heads/main/docs/docker-compose.yaml
docker compose up -d
```

82 changes: 82 additions & 0 deletions docs/rag.md
@@ -0,0 +1,82 @@
# Retrieval-Augmented Generation (RAG) on PostgreSQL with PGVector

Retrieval-Augmented Generation (RAG) combines large language models (LLMs) with up-to-date, private, or internal information to produce more accurate, contextually relevant responses. It works by:

1. Retrieving relevant information from a knowledge base in response to a query.
2. Augmenting the original query with this retrieved information (context).
3. Using a language model to generate a response based on both the query and the retrieved context.

<img src="./rag.mmd.svg" alt="Retrieval-Augmented Generation (RAG)" style="max-height: 30rem; width: 100%;">

## Key Concepts

1. **Embeddings**: Vector representations of text or other data.
2. **Vector Database**: A database optimized for storing and querying vector embeddings (e.g., PostgreSQL with the pgvector extension).
3. **Similarity Search**: Finding the most similar vectors to a given query vector.
4. **RAG (Retrieval-Augmented Generation)**: A technique that enhances language models by retrieving relevant information from a knowledge base.

## PGVector Basics

PGVector is a PostgreSQL extension that adds support for vector operations and similarity search, enabling efficient storage and querying of embeddings.

### Data Type
- `vector(n)`: Represents an n-dimensional vector.

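### Distance Operators
- `<->`: Euclidean (L2) distance
- `<#>`: Negative inner product (the inner product multiplied by -1, so smaller still means more similar)
- `<=>`: Cosine distance (1 minus cosine similarity)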

### Creating a Table with Vector Column

```sql
CREATE TABLE IF NOT EXISTS embeddings (
  id BIGINT PRIMARY KEY GENERATED BY DEFAULT AS IDENTITY,
  content TEXT,
  embedding vector(1536)
);
```
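
One way to write rows into this table from Go is to send the embedding in pgvector's text format (`[x1,x2,...]`) and cast it to `vector` in SQL. The sketch below only builds the literal and the statement; `VectorLiteral` and `insertSQL` are hypothetical names for illustration, not helpers from `pkg/rag`, and the cast-from-text approach is an assumption about how you choose to bind the parameter.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// insertSQL is an illustrative statement for the `embeddings` table above.
// The second parameter is passed as text and cast to the vector type.
const insertSQL = `INSERT INTO embeddings (content, embedding) VALUES ($1, $2::vector)`

// VectorLiteral renders a slice in pgvector's text input format, e.g. "[0.1,0.25,-3]".
func VectorLiteral(v []float32) string {
	parts := make([]string, len(v))
	for i, x := range v {
		// bitSize 32 yields the shortest string that round-trips as a float32.
		parts[i] = strconv.FormatFloat(float64(x), 'f', -1, 32)
	}
	return "[" + strings.Join(parts, ",") + "]"
}

func main() {
	fmt.Println(insertSQL)
	fmt.Println(VectorLiteral([]float32{0.1, 0.25, -3}))
}
```

The slice length must equal the column's declared dimension (1536 in the table above), otherwise PostgreSQL rejects the insert.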

### Indexing

```sql
-- IVFFlat index (faster build, less memory, lower query performance; create it after the table has data)
CREATE INDEX ON embeddings USING ivfflat (embedding vector_l2_ops) WITH (lists = 100);
-- HNSW index (slower build, more memory, better speed-recall tradeoff)
CREATE INDEX ON embeddings USING hnsw (embedding vector_cosine_ops) WITH (m = 16, ef_construction = 64);
```
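
The `lists` value for IVFFlat is a tuning knob rather than a constant. As a rough starting point, the pgvector documentation suggests `rows / 1000` for up to 1M rows and `sqrt(rows)` beyond that, with `ivfflat.probes` starting near `sqrt(lists)`. A small Go sketch of that heuristic follows; `SuggestLists` and `SuggestProbes` are hypothetical helper names, and the results are starting points to benchmark, not recommendations for any particular dataset.

```go
package main

import (
	"fmt"
	"math"
)

// SuggestLists applies the starting-point heuristic for IVFFlat `lists`.
func SuggestLists(rows int) int {
	var lists int
	if rows <= 1_000_000 {
		lists = rows / 1000
	} else {
		lists = int(math.Sqrt(float64(rows)))
	}
	if lists < 1 {
		lists = 1 // an index needs at least one list
	}
	return lists
}

// SuggestProbes applies the starting-point heuristic for `ivfflat.probes`.
func SuggestProbes(lists int) int {
	probes := int(math.Sqrt(float64(lists)))
	if probes < 1 {
		probes = 1
	}
	return probes
}

func main() {
	rows := 250_000 // assumed table size, for illustration only
	lists := SuggestLists(rows)
	fmt.Printf("CREATE INDEX ON embeddings USING ivfflat (embedding vector_l2_ops) WITH (lists = %d);\n", lists)
	fmt.Printf("SET ivfflat.probes = %d;\n", SuggestProbes(lists))
}
```

More probes improve recall at the cost of query speed, so the value is worth measuring against your own queries.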

## RAG Implementation Steps

1. **Prepare Knowledge Base**:
   - Collect and preprocess documents
   - Generate embeddings for documents
   - Store documents and embeddings in the vector database

2. **Query Processing**:
   - Generate embedding for the user query
   - Perform similarity search to retrieve relevant documents

3. **Context Augmentation**:
   - Combine retrieved documents with the original query

4. **Generation**:
   - Feed the augmented context to the language model
   - Generate the response

## Best Practices

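1. Match the `vector(n)` dimension to your embedding model's output (commonly 768, 1536, or 3072). Note that pgvector indexes `vector` columns only up to 2,000 dimensions, so larger embeddings need `halfvec` or dimensionality reduction to be indexed.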
2. Experiment with different indexing methods for optimal performance.
3. Use batching for efficient embedding generation and database operations.
4. Implement caching mechanisms to reduce redundant computations.
5. Regularly update and maintain your knowledge base for accurate retrievals.
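
As an illustration of the batching advice, the generic helper below splits any slice into fixed-size groups, so that embedding requests and inserts can be issued per batch rather than per row. `Batches` is a hypothetical helper, not part of this repository, and the batch size of 2 is arbitrary.

```go
package main

import "fmt"

// Batches splits items into consecutive groups of at most size elements.
// The returned sub-slices share memory with items.
func Batches[T any](items []T, size int) [][]T {
	if size <= 0 {
		size = 1 // guard against an infinite loop on invalid sizes
	}
	var out [][]T
	for start := 0; start < len(items); start += size {
		end := start + size
		if end > len(items) {
			end = len(items)
		}
		out = append(out, items[start:end])
	}
	return out
}

func main() {
	docs := []string{"doc1", "doc2", "doc3", "doc4", "doc5"}
	for i, batch := range Batches(docs, 2) {
		// In a real pipeline, embed and insert each batch in one round trip.
		fmt.Printf("batch %d: %v\n", i, batch)
	}
}
```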

## Useful PGVector Functions

- `vector_dims(vector)`: Returns the dimension of a vector
- `vector_norm(vector)`: Calculates the Euclidean norm of a vector
- `vector + vector`: Adds two vectors element-wise (operator form)
- `vector - vector`: Subtracts one vector from another element-wise (operator form)

See the implementation in [examples/rag/main.go](../examples/rag/main.go).
36 changes: 19 additions & 17 deletions docs/rag.mmd
@@ -1,24 +1,26 @@
graph TD
A[Document_Collection]
B[Document_Chunks]
C[Vector_Database]
subgraph Knowledge_Base
D[Document_Collection]
C[Document_Chunks]
D --> |1a. Chunk & Process| C
end

A --> |1a. Chunk & Process| B
B -->|1b. Generate Embeddings| C
DB[Vector_Database]
Knowledge_Base -->|1b. Generate Embeddings| DB

D[User_Query]
Q[User_Query]
E[Query_Embedding]
D -->|2a. Generate Embedding| E
E -->|2b. Vector Similarity Search| C
Q -->|2a. Generate_Embedding| E
E -->|2b. Vector Similarity Search| DB

F[Retrieved_Context]
C -->|3a. Retrieve Relevant Chunks| F
CTX[Retrieved_Context]
DB -->|3 Retrieve Relevant Chunks| CTX

G[Augmented_Prompt]
D -->|"4a. Combine"| G
F -->|"4b. Combine"| G
P[Augmented_Prompt]
Q -->|"4a. Combine"| P
CTX -->|"4b. Combine"| P

H[Language_Model]
I[Final_Response]
G -->|5a. Send to LLM| H
H -->|6a. Generate Response| I
LLM[Language_Model]
R[Final_Response]
P -->|5 Send to LLM| LLM
LLM -->|6 Generate Response| R
1 change: 1 addition & 0 deletions docs/rag.mmd.svg
1 change: 0 additions & 1 deletion docs/rag.svg

This file was deleted.

2 changes: 2 additions & 0 deletions examples/postgres-cdc-mqtt/main.go
@@ -1,5 +1,7 @@
package main

// See [docs/pgcdc-mqtt.md](../../docs/pgcdc-mqtt.md) for more information.

import (
"context"
"encoding/json"
9 changes: 7 additions & 2 deletions examples/rag101/main.go → examples/rag/main.go
@@ -1,5 +1,7 @@
package main

// See [docs/rag.md](../../docs/rag.md) for more information.

import (
"context"
"fmt"
@@ -21,13 +23,16 @@ func main() {

// Create a new RAG client
client, err := rag.NewClient(conn, rag.DefaultConfig())
client.Config.TableName = "example_table"
client.Config.TableName = "lms.courses"
// TODO: fix primary key data type from primary key column in contentSelectQuery

if err != nil {
log.Fatalf("Failed to create RAG client: %v", err)
}

err = client.CreateEmbedding(ctx, "")
err = client.CreateEmbedding(ctx, "SELECT id, CONCAT('title:', title, ', summary:', summary) AS content FROM lms.courses")
// err = client.CreateEmbedding(ctx, "") // CreateEmbedding constructs content by concatenating colname:value of other columns for each row
// err = client.CreateEmbedding(ctx) // Assumes the table has a column named `content` that contains the content for which embedding will be created
if err != nil {
log.Fatalf("Failed to create embeddings: %v", err)
}
5 changes: 3 additions & 2 deletions pkg/rag/client.go
@@ -10,7 +10,7 @@ import (
"go.uber.org/zap"
)

// Config holds the configuration for the RAG package
// Config holds the configuration for the RAG Client
type Config struct {
TableName string
TablePrimaryKeyCol string
@@ -52,7 +52,8 @@ func NewClient(conn *pgx.Conn, config Config, loggers ...*zap.Logger) (*Client,
logger = loggers[0]
} else {
var err error
logger, err = zap.NewDevelopment()
// logger, err = zap.NewDevelopment()
logger, err = zap.NewProduction()
if err != nil {
return nil, fmt.Errorf("failed to create logger: %w", err)
}