
Minimal example of running Llama with Kubernetes

Uses the ollama/ollama image from Docker Hub.

Prerequisites

  • A running Kubernetes cluster
  • kubectl installed and configured
  • helm installed
  • ingress-nginx installed in the cluster
  • A local container registry running at http://localhost:5000
  • Python 3.8 or later

You can find an example using kind here: Kubernetes and You

Execute the demo

  • Apply the Helm chart:
helm upgrade --install ollama ./ollama -f ./ollama/values.yaml -n ollama
  • Install the dependencies and execute the example Python script to start a Flask server (a minimal sketch of such an app follows this list):
pip install -r example-app/requirements.txt
python ./example-app/app.py
  • Visit http://localhost:8999 in your browser to receive a motivational llama message.
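
The repository's example-app/app.py is not reproduced on this page, but a minimal sketch of what such a server could look like is below, assuming the in-cluster Ollama API is reachable at http://localhost:11434 (for example via kubectl port-forward or the nginx ingress) and that a model has already been pulled. The OLLAMA_URL and MODEL values are illustrative assumptions, not taken from the repository.

# Hypothetical sketch, not the repository's actual app.py.
# Assumes the Ollama API is reachable at OLLAMA_URL and that MODEL is pulled.
import requests
from flask import Flask

OLLAMA_URL = "http://localhost:11434"  # assumption: where the Ollama service is exposed
MODEL = "llama3"                       # assumption: any model pulled into Ollama works

app = Flask(__name__)

@app.route("/")
def motivate():
    # /api/generate is Ollama's completion endpoint; "stream": False returns
    # a single JSON object whose "response" field holds the generated text.
    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={
            "model": MODEL,
            "prompt": "Write a short motivational message from a llama.",
            "stream": False,
        },
        timeout=120,
    )
    resp.raise_for_status()
    return resp.json()["response"]

if __name__ == "__main__":
    app.run(port=8999)  # matches the URL used in the last step above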

RAG Pipeline

An example real-time RAG pipeline is provided. We use Redis as a vector database and document cache.

Prerequisites

Assuming you have the provided kind cluster running locally, you can use the following to install the Redis Helm chart:

helm -n redis install redis oci://registry-1.docker.io/bitnamicharts/redis --create-namespace
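
The pipeline code itself isn't shown on this page, but the core flow — embed documents with Ollama, cache the vectors in Redis, and retrieve nearest neighbours for a question — can be sketched roughly as below. Note that vector search requires the RediSearch module (Redis Stack) rather than core Redis, and the index name, key prefix, embedding model, and connection details here are illustrative assumptions, not values from the repository.

# Rough sketch of a Redis-backed RAG retrieval step -- names and settings are
# assumptions, not the repository's actual pipeline. Requires
# `pip install redis numpy requests` and a Redis server with RediSearch loaded.
import numpy as np
import redis
import requests
from redis.commands.search.field import TextField, VectorField
from redis.commands.search.indexDefinition import IndexDefinition, IndexType
from redis.commands.search.query import Query

OLLAMA_URL = "http://localhost:11434"  # assumption: Ollama API endpoint
EMBED_MODEL = "nomic-embed-text"       # assumption: must be pulled; 768-dim vectors
DIM = 768

# Assumption: Redis is port-forwarded locally, e.g.
#   kubectl -n redis port-forward svc/redis-master 6379 6379
# The Bitnami chart generates a password; read it from the release's secret.
r = redis.Redis(host="localhost", port=6379, password="CHANGE_ME")

def embed(text: str) -> np.ndarray:
    """Fetch an embedding vector for `text` from Ollama's /api/embeddings."""
    resp = requests.post(
        f"{OLLAMA_URL}/api/embeddings",
        json={"model": EMBED_MODEL, "prompt": text},
        timeout=60,
    )
    resp.raise_for_status()
    return np.array(resp.json()["embedding"], dtype=np.float32)

# Create a small vector index over hashes stored under the "doc:" key prefix.
r.ft("docs").create_index(
    fields=[
        TextField("text"),
        VectorField("embedding", "FLAT",
                    {"TYPE": "FLOAT32", "DIM": DIM, "DISTANCE_METRIC": "COSINE"}),
    ],
    definition=IndexDefinition(prefix=["doc:"], index_type=IndexType.HASH),
)

# Cache documents and their embeddings in Redis.
docs = ["Llamas are hardy pack animals.", "Kubernetes schedules containers."]
for i, text in enumerate(docs):
    r.hset(f"doc:{i}", mapping={"text": text, "embedding": embed(text).tobytes()})

# Retrieve the nearest document for a question (KNN over the vector field).
question = "What animal carries loads in the Andes?"
q = (
    Query("*=>[KNN 1 @embedding $vec AS score]")
    .sort_by("score")
    .return_fields("text", "score")
    .dialect(2)
)
hits = r.ft("docs").search(q, query_params={"vec": embed(question).tobytes()})
for hit in hits.docs:
    print(hit.text, hit.score)  # context you would feed into the LLM prompt

The retrieved text would then be interpolated into the prompt sent to the Llama model, which forms the generation half of the pipeline.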
