docs(README.md): add information about various aspects of the infrastructure
dr460nf1r3 committed Sep 9, 2023
1 parent d037005 commit c43509c
Showing 1 changed file with 67 additions and 17 deletions.
84 changes: 67 additions & 17 deletions README.md
@@ -1,19 +1,19 @@
# Garuda Linux server configurations

[![built with nix](https://img.shields.io/static/v1?logo=nixos&logoColor=white&label=&message=Built%20with%20Nix&color=41439a)](https://builtwithnix.org) [![nix flake check](https://github.com/garuda-linux/infrastructure-nix/actions/workflows/flake_check.yml/badge.svg)](https://github.com/garuda-linux/infrastructure-nix/actions/workflows/flake_check.yml)
[![built with nix](https://img.shields.io/static/v1?logo=nixos&logoColor=white&label=&message=Built%20with%20Nix&color=41439a)](https://builtwithnix.org) [![run nix flake check](https://github.com/garuda-linux/infrastructure-nix/actions/workflows/flake_check.yml/badge.svg?branch=main)](https://github.com/garuda-linux/infrastructure-nix/actions/workflows/flake_check.yml)

## General information

- Our current infrastructure is hosted in one of [these](https://www.hetzner.com/dedicated-rootserver/ax102).
- The only other server not being contained in this dedicated server is our mail server.
- Both servers are being backed up to Hetzner storage boxes via [Borg](https://www.borgbackup.org/).
- After multiple different setups, we settled on NixOS as our main OS as it provides reproducible and atomically updated system states
- Most (sub)domains are protected by Cloudflare while also making use of its caching feature. Exemptions are services such as our mailserver and parts violating Cloudflares rules such as proxying Piped content.
- Most (sub)domains are protected by Cloudflare while also making use of its caching feature. Exemptions are services such as our mail server and parts violating Cloudflare's rules such as proxying Piped content.

## Devshell and tooling

This NixOS flake provides a [devshell](https://github.com/numtide/devshell) which contains all deployment tools as well as handy aliases for common tasks.
The only requirement for using it is having the Nix packge manager available and having flakes enabled. It can be installed on various distributions via:
The only requirement for using it is having the Nix package manager available and having flakes enabled. It can be installed on various distributions via:

```
sh <(curl -L https://nixos.org/nix/install) --daemon
@@ -22,11 +22,11 @@ sh <(curl -L https://nixos.org/nix/install) --daemon
After that, the shell can be invoked as follows:

```
nix-shell # assuming flakes are not enabled, this bootstraps the needed files and sets up the pre-commit hook
nix develop # the intended way to use the devshell, contains all the aliases
nix-shell # Assuming flakes are not enabled, this bootstraps the needed files and sets up the pre-commit hook
nix develop # The intended way to use the devshell, contains all the aliases
```

To enable flakes and the direct usage of `nix develop` follow this [wiki article](https://nixos.wiki/wiki/Flakes#Other_Distros:_Without_Home-Manager). After running nix develop, new commands are available to perform the following actions:
To enable flakes and the direct usage of `nix develop`, follow this [wiki article](https://nixos.wiki/wiki/Flakes#Other_Distros:_Without_Home-Manager). After running `nix develop`, new commands are available to perform the following actions:

- `apply` - applies NixOS configuration by executing `nixos-rebuild switch`, mostly used after using `deploy`
- `buildiso` - spawns a buildiso shell on our `iso-runner` container
@@ -35,11 +35,43 @@ To enable flakes and the direct usage of `nix develop` follow this [wiki article
- `update` - runs a full infrastructure update including a flake.lock bump
- `update-forum` - updates the Discourse container by running `./launcher rebuild app`
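
For the flakes requirement mentioned above, a minimal sketch of enabling them for a single user could look like the following. The function name `enable_flakes` is hypothetical and not part of this repository; the snippet assumes a per-user `nix.conf` under `~/.config/nix` and the `experimental-features` setting described in the linked wiki article.

```shell
# Hypothetical helper (not part of this repo): enables flakes for the current user.
# Assumes the per-user Nix config lives in ~/.config/nix/nix.conf unless a directory is passed.
enable_flakes() {
  conf_dir="${1:-$HOME/.config/nix}"
  line='experimental-features = nix-command flakes'
  mkdir -p "$conf_dir"
  # Append only if the exact line is not present yet, so rerunning is harmless
  grep -qxF "$line" "$conf_dir/nix.conf" 2>/dev/null ||
    printf '%s\n' "$line" >> "$conf_dir/nix.conf"
}
```

Calling `enable_flakes` without arguments writes to the default location; afterwards `nix develop` should work from the repository root.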

## General structure

A general overview of the folder structure can be found below:

```
├── assets
├── devshell
├── docker-compose
│   ├── all-in-one
│   ├── github-runner
│   └── proxied
├── home-manager
├── host_vars
│   ├── garuda-build
│   ├── garuda-mail
│   └── immortalis
├── nixos
│   ├── hosts
│   │   ├── garuda-build
│   │   ├── garuda-mail
│   │   └── immortalis
│   ├── modules
│   │   └── static
│   └── services
│   ├── chaotic
│   ├── docker-compose-runner
│   └── monitoring
├── playbooks
├── scripts
└── secrets
```

## Immortalis (Hetzner dedicated)

### General

This utilizes a NixOS host which uses [nixos-containers](https://nixos.wiki/wiki/NixOS_Containers) to build declarative `systemd-nspawn` machines for different purposes. To make the best use of the available resources, common directories are shared between containers. This includes `/home` (home-manager / NixOS configurations writing to home are generated by the host and disabled for the containers), Pacman and Chaotic cache, the `/nix` directory, and a few others. Further details can be found in the [Nix expression](https://gitlab.com/garuda-linux/infra-nix/-/blob/main/nix/immortalis.nix?ref_type=heads) for the host.
This system utilizes a NixOS host which uses [nixos-containers](https://nixos.wiki/wiki/NixOS_Containers) to build declarative `systemd-nspawn` machines for different purposes. To make the best use of the available resources, common directories are shared between containers. This includes `/home` (home-manager / NixOS configurations writing to home are generated by the host and disabled for the containers), Pacman and Chaotic cache, the `/nix` directory, and a few others. Further details can be found in the [Nix expression](https://gitlab.com/garuda-linux/infra-nix/-/blob/main/nixos/hosts/immortalis/containers.nix) of the host.

All directories containing important data were mapped to `/data_1` and `/data_2` in order to have them all in one place. The first mostly contains web services' files, the latter only builds related directories such as the Pacman cache.

@@ -107,7 +139,8 @@ Further information may be obtained by clicking `chaotic seen above`. The corres

### Squid proxy

Squid is being installed on the host machine to proxy outgoing requests via random IPv6 addresses of the /64 subnet Hetzner provides for services that need it, eg. Piped, the Chaotic-AUR builders, and other services that are getting rate limited quickly. The process is not entirely automated, which means that we currently have a pool of IPv6 addresses active and need to switch them whenever those are getting rate limited again.
Squid is installed on the host machine to proxy outgoing requests via random IPv6 addresses of the /64 subnet Hetzner provides for services that need it, e.g. Piped, the Chaotic-AUR builders, and other services that get rate-limited quickly. The process is not entirely automated, which means that we currently keep a pool of IPv6 addresses active and need to switch them whenever they get rate-limited again.
Since we supply an invalid IPv4 address to force outgoing IPv6, the log files were somewhat cluttered by (expected) errors. Systemd unit logging has been set to `LogLevelMax=1` to unclutter the journal and needs to be increased again if debugging is required.

### Backups

@@ -130,6 +163,19 @@ ansible-vault encrypt secrets/pathtofile
Further information on `ansible-vault` can be found in its [documentation](https://docs.ansible.com/ansible/latest/vault_guide/index.html).
It is important to keep the `secrets` directory in the latest state before deploying a new configuration as misconfigurations might happen otherwise.

## CI tooling

We use pull/push-based mirroring for this Git repository. This allows easy access to Renovate without having to run a custom instance mirroring changes to both GitHub and GitLab. The following tasks have been automated as of now:

- `nix flake check` runs for every labeled PR and commit on main.
- [Renovate](https://renovatebot.com/) periodically checks `docker-compose.yml` and other supported files for version updates. It has a [dependency dashboard](https://github.com/garuda-linux/infrastructure-nix/issues/5) as well as the [developer interface](https://developer.mend.io/github/garuda-linux/infrastructure-nix) to check logs of individual runs. Minor updates appear as grouped PRs while major updates are separated from those. Note that this only applies to the GitHub side.

## Monitoring

Our current monitoring stack mostly relies on Netdata to provide insight into current system loads and trends. The major reason for using it was that it provides the most vital metrics and alerts out of the box without having to create in-depth configurations. We might switch to a Prometheus/Grafana/Loki stack in the future. We used to set up children -> parent streaming in the past, though after transitioning to one big host this didn't really make sense anymore. Instead, up to 10GB of data gets stored on individual hosts. While Netdata agents do have their own dashboard, the [Dashboard provided by Netdata](https://app.netdata.cloud/spaces/garuda-infra/rooms/all-nodes) is far superior and allows better insight, e.g. by offering the functions feature. Additional services like Squid or Nginx have been configured to be monitored by Netdata plugins as well. Further information can be found in its [documentation](https://learn.netdata.cloud/).

To access the dashboard (linked above), use `team@garudalinux.org` as the login. The login is completed by opening the link that is sent to this address.

## Common maintenance tasks

### Rebuilding / updating the forum container
@@ -147,7 +193,7 @@ sudo ./launcher rebuild app
To build Garuda ISO, one needs to connect to the `iso-runner` container and execute the `buildiso` command, which opens a shell containing the needed environment:

```
ssh -p 227 $user@116.202.208.112
ssh -p 227 $user@116.202.208.112 # if one ran nix develop before, this can be skipped
buildiso
buildiso -i # updates the iso-profiles repo
buildiso -p dr460nized
@@ -174,17 +220,18 @@ deployiso -FSRd # oneliner for the above-given commands
One needs to have the [infra-nix](https://gitlab.com/garuda-linux/infra-nix) repo cloned locally. Then proceed by updating the `flake.lock` file, pushing it to the server & building the configurations:

```
cd nix # Nix files are not in the main directory due to Ansible still being used to deploy files
nix flake update
cd ..
ansible-playbook garuda.yml -l $servername # eg. immortalis for the Hetzner host
ansible-playbook garuda.yml -l $servername # E.g. immortalis for the Hetzner host
deploy # Use this instead of the above command in case nix develop was used
```

Then you can either apply it via Ansible or connect to the host to view more details about the process while it runs:

```
ansible-playbook apply.yml -l $servername # Ansible
apply # Nix develop shell
ssh -p 666 $user@116.202.208.112 # Manually, examplary on immortalis
sudo nixos-rebuild switch
```
@@ -196,14 +243,17 @@ Keep in mind that this will restart every service whose files changed since the
Most system configurations are contained in individual Nix files in the `nix` directory of this repo. This means changing anything must not be done manually but by editing the corresponding file and pushing/applying the configuration afterward.

```
ansible-playbook garuda.yml -l $servername # eg. immortalis for the Hetzner host
ansible-playbook garuda.yml -l $servername # E.g. immortalis for the Hetzner host
deploy # In case nix develop is used
```

As with the system update, one can either apply via Ansible or manually:

```
ansible-playbook apply.yml -l $servername # Ansible
apply # Nix develop shell
ssh -p 666 $user@116.202.208.112 # Manually, exemplary on immortalis
sudo nixos-rebuild switch
```
@@ -214,7 +264,7 @@ If configurations of services running in Docker containers need to be altered, o

### Updating Docker containers

Docker containers generally use the `latest` tag in case of generic services and versioned ones for critical ones such as databases. Most containers using the `latest` tag are automatically updated via [watchtower](https://containrrr.dev/watchtower/) on a daily basis. The remaining ones can be updated by connecting to the correct [nixos-container](https://nixos.wiki/wiki/NixOS_Containers) either via SSH or `nixos-container login`. Then proceed as follows:
Docker containers sometimes use the `latest` tag in case no current tag is available or in case of services like Piped and Searx, where it is often crucial to have the latest build to bypass Google's restrictions. Containers using the `latest` tag are automatically updated via [watchtower](https://containrrr.dev/watchtower/) on a daily basis. The remaining ones can be updated by changing their version in the corresponding `docker-compose.yml` and then running `deploy` & `apply`. If containers are to be updated manually, this can be achieved by connecting to the host, running `nixos-container root-login $containername` and executing:

```
cd /var/garuda/docker-compose-runner/$name/ # replace $name with the actual docker-compose.yml or autocomplete via tab
@@ -226,7 +276,7 @@ The updated containers will be pulled and automatically recreated using the new

### Rotating IPv6

Sometimes it is needed to rotate the available IPv6 addresses to solve the current ones being rate limited for outgoing requests of Piped, Searx, etc. This can be achieved by editing the hosts Nix file `immortalis.nix`, replacing the existing values of the `networking.interfaces."eth0".ipv6.addresses` keys seen [here](https://gitlab.com/garuda-linux/infra-nix/-/blob/main/nix/immortalis.nix?ref_type=heads#L74). Then, proceed doing the same with the [squid configuration](https://gitlab.com/garuda-linux/infra-nix/-/blob/main/nix/immortalis.nix?ref_type=heads#L594). Possible IPv6 addresses need to be generated from our available /64 subnet space and can't be chosen completely random.
Sometimes it is necessary to rotate the available IPv6 addresses because the current ones are being rate-limited for outgoing requests of Piped, Searx, etc. This can be achieved by editing the host's Nix file `immortalis.nix`, replacing the existing values of the `networking.interfaces."eth0".ipv6.addresses` keys seen [here](https://gitlab.com/garuda-linux/infra-nix/-/blob/main/nixos/hosts/immortalis.nix?ref_type=heads#L30). Then, proceed by doing the same with the [squid configuration](https://gitlab.com/garuda-linux/infra-nix/-/blob/main/nixos/hosts/immortalis.nix?ref_type=heads#L219). Possible IPv6 addresses need to be generated from our available /64 subnet space and can't be chosen completely at random.
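
Candidate addresses for such a rotation could be generated along these lines. This is only a sketch: the helper name `random_ipv6_in_64` is hypothetical, and the prefix `2001:db8:1234:5678` is a documentation placeholder rather than our real /64. It keeps the first four groups (the /64 prefix) fixed and fills the remaining 64 bits from `/dev/urandom`.

```shell
# Hypothetical helper (not part of this repo): prints random addresses inside a /64.
# $1: the /64 prefix as its first four groups, e.g. 2001:db8:1234:5678 (placeholder value)
# $2: how many addresses to print (defaults to 1)
random_ipv6_in_64() {
  prefix="$1"
  count="${2:-1}"
  i=0
  while [ "$i" -lt "$count" ]; do
    # 8 random bytes form the 64-bit host part, printed as four 16-bit hex groups
    suffix="$(od -An -N8 -tx2 /dev/urandom | tr -s ' \n' ':' | sed 's/^://; s/:$//')"
    printf '%s:%s\n' "$prefix" "$suffix"
    i=$((i + 1))
  done
}

# Example: three candidates for the placeholder prefix
random_ipv6_in_64 2001:db8:1234:5678 3
```

The printed values would then replace the existing entries in both the interface addresses and the Squid configuration.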

### Checking whether backups were successful

@@ -236,7 +286,7 @@ To check whether backups to Hetzner are still working as expected, connect to th
systemctl status borgbackup-job-backupToHetzner
```

This should yield a successful unit state. The only exception is having an exit code != 0 due to files having changed during the run.
This should yield a successful unit state. The only exception is having an exit code != `0` due to files having changed during the run.
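
The check above could also be scripted, as in the following sketch. Both function names are hypothetical. The mapping assumes Borg's documented return codes (`0` success, `1` warning, anything else an error), with the "files changed during the run" case reported as a warning.

```shell
# Hypothetical helpers (not part of this repo) for interpreting the backup result.

# Prints the unit result and the exit code of the last run.
backup_state() {
  unit="${1:-borgbackup-job-backupToHetzner}"
  systemctl show "$unit" --property=Result --property=ExecMainStatus
}

# Maps an exit code to a short description, assuming Borg's return code convention.
describe_borg_exit() {
  case "$1" in
    0) echo "success" ;;
    1) echo "warning (e.g. files changed during the run)" ;;
    *) echo "error" ;;
  esac
}
```

Running `backup_state` on the host shows the raw values; passing the reported `ExecMainStatus` number to `describe_borg_exit` tells whether a non-zero code is the harmless case described above.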

### Updating the website content or Chaotic-AUR toolbox

Expand All @@ -248,7 +298,7 @@ nix flake lock --update-input src-garuda-website # website
nix flake lock --update-input src-chaotic-toolbox # toolbox
```

After that deploy as usual. The commit and corresponding hash will be updated and NixOS will use it to build the website or toolbox using the new revision automatically.
After that, deploy as usual (by running `deploy`). The commit and corresponding hash will be updated and NixOS will use it to build the website or toolbox using the new revision automatically.

### Updating the Garuda startpage content
