The scripts in this repository set up and maintain one or more Kubernetes clusters consisting of dedicated Hetzner servers. Each cluster is also provisioned to operate as a node in the THORChain network.
Executing the scripts, in combination with some manual procedures, will get you highly available, secure clusters on bare metal with the following features:
- Kubespray-based provisioning
- Internal NVMe storage (Ceph/Rook)
- Virtual LAN, also across multiple locations (Calico)
- Load balancing (MetalLB)
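Once a cluster is provisioned, these components can be spot-checked with kubectl. A sketch, assuming the default namespaces and labels used by the respective projects:

```sh
# Assumes default namespaces/labels; adjust if your deployment differs
kubectl -n kube-system get pods -l k8s-app=calico-node  # Calico (virtual LAN)
kubectl -n rook-ceph get pods                           # Rook/Ceph (NVMe storage)
kubectl -n metallb-system get pods                      # MetalLB (load balancing)
```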
Acquire a couple of servers as the basis for a cluster (AX41-NVMEs work well, for instance). Visit the admin panel and name the servers appropriately:
```
tc-k8s-node1
tc-k8s-node2
tc-k8s-node3
...

tc-k8s-master1
tc-k8s-master2
tc-k8s-worker1
tc-k8s-worker2
tc-k8s-worker3
...
```
Refer to the reset procedure below to properly initialize them.
Create a vSwitch and order an appropriate subnet (it may take a while to show up after the order). Give the vSwitch a name (e.g. `tc-k8s-net`) and assign it to the servers. Check out the docs for help.
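For orientation, a node's vSwitch attachment ends up as a VLAN interface on the host. Below is a minimal netplan sketch, assuming a hypothetical NIC name `enp0s31f6`, VLAN ID 4000, and the `10.10.10.0/24` range used later in this README; Hetzner vSwitches use VLAN IDs in the 4000-4091 range and require an MTU of 1400. The playbooks may generate an equivalent configuration for you.

```yaml
# Sketch only: NIC name, VLAN ID and addressing are assumptions
network:
  version: 2
  vlans:
    enp0s31f6.4000:
      id: 4000
      link: enp0s31f6            # hypothetical physical NIC
      mtu: 1400                  # required for Hetzner vSwitch traffic
      addresses: [10.10.10.11/24]
```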
Clone this repository, `cd` into it, and download Kubespray:
```sh
git submodule init && git submodule update
```
Create a Python virtual environment or similar.
```sh
# Optional
virtualenv -p python3 venv
```
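If you created the virtual environment, activate it so the following installs land inside it (standard virtualenv usage):

```sh
source venv/bin/activate
```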
Install the dependencies required by Python and Ansible Galaxy:
```sh
pip install -r requirements.python.txt
ansible-galaxy install -r requirements.ansible.yml
```
Note: Mitogen does not work with Ansible collections, so the strategy must be changed (i.e. `strategy: linear`).
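The strategy can be changed globally in `ansible.cfg` (`strategy = linear` under `[defaults]`) or per play; a minimal per-play sketch:

```yaml
# Minimal play showing where the strategy key lives
- hosts: all
  strategy: linear
  tasks: []
```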
Create a deployment environment inventory file for each cluster you want to manage.
```sh
cp hosts.example inventory/production.yml
cp hosts.example inventory/test.yml
cp hosts.example inventory/environment.yml
...

cp hosts.example inventory/production-01.yml
cp hosts.example inventory/production-02.yml
...

cp hosts.example inventory/production-helsinki.yml
cp hosts.example inventory/whatever.yml
```
Edit the inventory file with your servers' IPs and network information, and customize everything to your needs.
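For orientation, a Kubespray-style inventory pairs each host's public IP with its private vSwitch IP. A rough single-node sketch with placeholder values (`hosts.example` is the authoritative template):

```yaml
all:
  hosts:
    node1:
      ansible_host: 11.22.33.44   # public IP (placeholder)
      ip: 10.10.10.11             # private vSwitch IP (placeholder)
      etcd_member_name: node1
  children:
    kube-master:
      hosts:
        node1: {}
    kube-node:
      hosts:
        node1: {}
    etcd:
      hosts:
        node1: {}
```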
```sh
# Manage a cluster
ansible-playbook cluster.init.yml -i inventory/environment.yml
ansible-playbook --become --become-user=root kubespray/cluster.yml -i inventory/environment.yml
ansible-playbook cluster.finish.yml -i inventory/environment.yml

# Run custom playbooks
ansible-playbook private-cluster.yml -i inventory/environment.yml
ansible-playbook private-test-cluster.yml -i inventory/environment.yml
ansible-playbook private-whatever-cluster.yml -i inventory/environment.yml
```
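Once the playbooks have finished, cluster health can be verified with standard kubectl commands (assuming your kubeconfig points at the new cluster):

```sh
kubectl get nodes -o wide  # every node should report Ready
kubectl get pods -A        # core components should be Running
```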
Check this out for more playbooks on cluster management.
In order for the cluster to operate as a node in the THORChain network, deploy as instructed here. You can also refer to the node-launcher repository, if necessary, or the THORChain documentation as a whole.
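As a hedged sketch of that step (the exact workflow is defined by the node-launcher repository and may have changed; its README is authoritative):

```sh
# Fetch the THORChain node-launcher tooling and follow its README
git clone https://gitlab.com/thorchain/devops/node-launcher.git
cd node-launcher
# deploy the node per the node-launcher README (make targets vary by version)
```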
Visit the console and put each server of the cluster into rescue mode. Then execute the following script:

```sh
installimage -a -r no -i images/Ubuntu-2004-focal-64-minimal.tar.gz -p /:ext4:all -d nvme0n1 -f yes -t yes -n hostname
```

This installs Ubuntu 20.04 on only one of the two internal NVMe drives; the unused drive is used later for persistent storage with Ceph/Rook. You can check the internal drive setup with `lsblk` and adjust the drive argument (`-d nvme0n1`) in the command above when necessary. Replace `hostname` with the server's actual name.
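For illustration, a typical two-drive layout after installation might look like this (device names and sizes are assumptions and will vary):

```sh
lsblk
# NAME        MAJ:MIN RM   SIZE RO TYPE MOUNTPOINT
# nvme0n1     259:0    0 476.9G  0 disk
# └─nvme0n1p1 259:1    0 476.9G  0 part /
# nvme1n1     259:2    0 476.9G  0 disk    <- left raw for Ceph/Rook
```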
Create a pristine state by running the playbooks in sequence:

```sh
ansible-playbook server.rescue.yml -i inventory/environment.yml
ansible-playbook server.bootstrap.yml -i inventory/environment.yml
```
Instantiate the servers:

```sh
ansible-playbook server.instantiate.yml -i inventory/environment.yml
```
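Before moving on, it can be worth confirming that Ansible can reach every server in the inventory; the standard ping module works for this:

```sh
ansible all -m ping -i inventory/environment.yml
```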
Register the new node(s) in your existing inventory file and run the scale command. Keep your network limitations in mind: a `/29` subnet supports a maximum of 4 nodes. Prepare accordingly if necessary.
```yaml
node4:
  ansible_host: 44.55.66.77
  etcd_member_name: node4
  ip: 10.10.10.14
node5:
  ansible_host: 55.66.77.88
  etcd_member_name: node5
  ip: 10.10.10.15

kube-node:
  hosts:
    node1: {}
    node2: {}
    node3: {}
    node4: {}
    node5: {}
```
```sh
# Scale the cluster
ansible-playbook --become --become-user=root kubespray/scale.yml -i inventory/environment.yml

# Remove a node (node4 in this example)
ansible-playbook --become --become-user=root -e node=node4 kubespray/remove-node.yml -i inventory/environment.yml
```
Deploy, use, and remove the Rook Toolbox:

```sh
ansible-playbook cluster.toolbox.yml -i inventory/environment.yml
kubectl -n rook-ceph exec -it $(kubectl -n rook-ceph get pod -l "app=rook-ceph-tools" -o jsonpath='{.items[0].metadata.name}') bash
ansible-playbook cluster.toolbox.yml -e state=absent -i inventory/environment.yml
```
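Once inside the toolbox pod, standard Ceph commands can be used to inspect cluster health, e.g.:

```sh
ceph status      # overall health summary
ceph osd status  # state of each OSD
ceph df          # storage utilization
```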