Skip to content

Commit

Permalink
feat(monitoring): instrument code using Prometheus
Browse files Browse the repository at this point in the history
  • Loading branch information
massix committed Jul 3, 2024
1 parent 54e9fad commit 49b6582
Show file tree
Hide file tree
Showing 10 changed files with 566 additions and 13 deletions.
34 changes: 34 additions & 0 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -183,6 +183,40 @@ The value is not case-sensitive.

Invalid or empty values will make ChaosMonkey default to the `info` level.

## Observability
The Chaos Monkey exposes some metrics using the [Prometheus](https://prometheus.io/) library and format.

This is an _evolving_ list of metrics currently exposed, for more details please take a look
in the code under the corresponding service (all the services in the [watcher folder](./internal/watcher/) expose
some sort of metrics).

All the events use the prefix `chaos_monkey` which, for readability issues, is not repeated in the
table below.

| Name | Description | Type |
|----------------------------------------|-----------------------------------------|-----------|
| nswatcher_events | events handled by the nswatcher | Counter |
| nswatcher_event_duration | duration of each event in microseconds | Histogram |
| nswatcher_cmc_spawned | crd services spawned | Counter |
| nswatcher_cmc_active | currently active crd | Gauge |
| nswatcher_restarts | timeouts happened from K8S APIs | Counter |
| crdwatcher_events | events handled by the crd watcher | Counter |
| crdwatcher_pw_spawned | PodWatchers spawned | Counter |
| crdwatcher_pw_active | PodWatchers currently active | Gauge |
| crdwatcher_dw_spawned | DeploymentWatchers spawned | Counter |
| crdwatcher_dw_active | DeploymentWatchers active | Gauge |
| crdwatcher_restarts | timeouts happened from K8S APIs | Counter |
| podwatcher_pods_added | Pods having been added to the list | Counter |
| podwatcher_pods_removed | Pods having been removed from the list | Counter |
| podwatcher_pods_killed | Pods having been killed | Counter |
| podwatcher_pods_active | Pods currently being targeted | Gauge |
| podwatcher_restarts | timeouts happened from K8S APIs | Counter |
| deploymentwatcher_deployments_rescaled | deployments having been rescaled | Counter |
| deploymentwatcher_random_distribution | random distribution of deployments | Histogram |
| deploymentwatcher_last_scale | last value used to scale the deployment | Gauge |



## Development
All contributions are welcome, of course. Feel free to open an issue or submit a
pull request. If you want to develop and test locally, you need to install:
Expand Down
20 changes: 20 additions & 0 deletions cmd/chaosmonkey/main.go
Original file line number Diff line number Diff line change
Expand Up @@ -2,6 +2,7 @@ package main

import (
"context"
"net/http"
"os"
"os/signal"
"sync"
Expand All @@ -10,6 +11,7 @@ import (
"github.com/massix/chaos-monkey/internal/apis/clientset/versioned"
"github.com/massix/chaos-monkey/internal/configuration"
"github.com/massix/chaos-monkey/internal/watcher"
"github.com/prometheus/client_golang/prometheus/promhttp"
"github.com/sirupsen/logrus"
"k8s.io/client-go/kubernetes"
"k8s.io/client-go/rest"
Expand Down Expand Up @@ -63,9 +65,27 @@ func main() {
}
}()

// Spawn the HTTP Server for Prometheus in background
srv := &http.Server{
Handler: promhttp.Handler(),
Addr: "0.0.0.0:9000",
}

wg.Add(1)
go func() {
defer wg.Done()
if err := srv.ListenAndServe(); err != nil {
log.Warnf("Could not spawn Prometheus handler: %s", err)
}
}()

// Wait for a signal to arrive
<-s

if err := srv.Shutdown(context.Background()); err != nil {
log.Warnf("Could not shutdown Prometheus handler: %s", err)
}

log.Info("Shutting down...")
cancel()

Expand Down
8 changes: 7 additions & 1 deletion go.mod
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@ module github.com/massix/chaos-monkey
go 1.22.3

require (
github.com/prometheus/client_golang v1.19.1
github.com/sirupsen/logrus v1.9.3
k8s.io/api v0.30.2
k8s.io/apimachinery v0.30.2
Expand All @@ -11,6 +12,8 @@ require (
)

require (
github.com/beorn7/perks v1.0.1 // indirect
github.com/cespare/xxhash/v2 v2.2.0 // indirect
github.com/davecgh/go-spew v1.1.1 // indirect
github.com/emicklei/go-restful/v3 v3.11.0 // indirect
github.com/evanphx/json-patch v4.12.0+incompatible // indirect
Expand All @@ -32,10 +35,13 @@ require (
github.com/modern-go/reflect2 v1.0.2 // indirect
github.com/munnerz/goautoneg v0.0.0-20191010083416-a7dc8b61c822 // indirect
github.com/pkg/errors v0.9.1 // indirect
github.com/prometheus/client_model v0.5.0 // indirect
github.com/prometheus/common v0.48.0 // indirect
github.com/prometheus/procfs v0.12.0 // indirect
github.com/spf13/pflag v1.0.5 // indirect
golang.org/x/mod v0.15.0 // indirect
golang.org/x/net v0.23.0 // indirect
golang.org/x/oauth2 v0.10.0 // indirect
golang.org/x/oauth2 v0.16.0 // indirect
golang.org/x/sys v0.18.0 // indirect
golang.org/x/term v0.18.0 // indirect
golang.org/x/text v0.14.0 // indirect
Expand Down
16 changes: 14 additions & 2 deletions go.sum
Original file line number Diff line number Diff line change
@@ -1,3 +1,7 @@
github.com/beorn7/perks v1.0.1 h1:VlbKKnNfV8bJzeqoa4cOKqO6bYr3WgKZxO8Z16+hsOM=
github.com/beorn7/perks v1.0.1/go.mod h1:G2ZrVWU2WbWT9wwq4/hrbKbnv/1ERSJQ0ibhJ6rlkpw=
github.com/cespare/xxhash/v2 v2.2.0 h1:DC2CZ1Ep5Y4k3ZQ899DldepgrayRUGE6BBZ/cd9Cj44=
github.com/cespare/xxhash/v2 v2.2.0/go.mod h1:VGX0DQ3Q6kWi7AoAeZDth3/j3BFtOZR5XLFGgcrjCOs=
github.com/creack/pty v1.1.9/go.mod h1:oKZEueFk5CKHvIhNR5MUki03XCEU+Q6VDXinZuGJ33E=
github.com/davecgh/go-spew v1.1.0/go.mod h1:J7Y8YcW2NihsgmVo/mv3lAwl/skON4iLHjSsI+c5H38=
github.com/davecgh/go-spew v1.1.1 h1:vj9j/u1bqnvCEfJOwUhtlOARqs3+rkHYY13jYWTU97c=
Expand Down Expand Up @@ -65,6 +69,14 @@ github.com/pkg/errors v0.9.1 h1:FEBLx1zS214owpjy7qsBeixbURkuhQAwrK5UwLGTwt4=
github.com/pkg/errors v0.9.1/go.mod h1:bwawxfHBFNV+L2hUp1rHADufV3IMtnDRdf1r5NINEl0=
github.com/pmezard/go-difflib v1.0.0 h1:4DBwDE0NGyQoBHbLQYPwSUPoCMWR5BEzIk/f1lZbAQM=
github.com/pmezard/go-difflib v1.0.0/go.mod h1:iKH77koFhYxTK1pcRnkKkqfTogsbg7gZNVY4sRDYZ/4=
github.com/prometheus/client_golang v1.19.1 h1:wZWJDwK+NameRJuPGDhlnFgx8e8HN3XHQeLaYJFJBOE=
github.com/prometheus/client_golang v1.19.1/go.mod h1:mP78NwGzrVks5S2H6ab8+ZZGJLZUq1hoULYBAYBw1Ho=
github.com/prometheus/client_model v0.5.0 h1:VQw1hfvPvk3Uv6Qf29VrPF32JB6rtbgI6cYPYQjL0Qw=
github.com/prometheus/client_model v0.5.0/go.mod h1:dTiFglRmd66nLR9Pv9f0mZi7B7fk5Pm3gvsjB5tr+kI=
github.com/prometheus/common v0.48.0 h1:QO8U2CdOzSn1BBsmXJXduaaW+dY/5QLjfB8svtSzKKE=
github.com/prometheus/common v0.48.0/go.mod h1:0/KsvlIEfPQCQ5I2iNSAWKPZziNCvRs5EC6ILDTlAPc=
github.com/prometheus/procfs v0.12.0 h1:jluTpSng7V9hY0O2R9DzzJHYb2xULk9VTR1V1R/k6Bo=
github.com/prometheus/procfs v0.12.0/go.mod h1:pcuDEFsWDnvcgNzo4EEweacyhjeA9Zk3cnaOZAZEfOo=
github.com/rogpeppe/go-internal v1.10.0 h1:TMyTOH3F/DB16zRVcYyreMH6GnZZrwQVAoYjRBZyWFQ=
github.com/rogpeppe/go-internal v1.10.0/go.mod h1:UQnix2H7Ngw/k4C5ijL5+65zddjncjaFoBhdsK/akog=
github.com/sirupsen/logrus v1.9.3 h1:dueUQJ1C2q9oE3F7wvmSGAaVtTmUizReu6fjN8uqzbQ=
Expand Down Expand Up @@ -97,8 +109,8 @@ golang.org/x/net v0.0.0-20200226121028-0de0cce0169b/go.mod h1:z5CRVTTTmAJ677TzLL
golang.org/x/net v0.0.0-20201021035429-f5854403a974/go.mod h1:sp8m0HH+o8qH0wwXwYZr8TS3Oi6o0r6Gce1SSxlDquU=
golang.org/x/net v0.23.0 h1:7EYJ93RZ9vYSZAIb2x3lnuvqO5zneoD6IvWjuhfxjTs=
golang.org/x/net v0.23.0/go.mod h1:JKghWKKOSdJwpW2GEx0Ja7fmaKnMsbu+MWVZTokSYmg=
golang.org/x/oauth2 v0.10.0 h1:zHCpF2Khkwy4mMB4bv0U37YtJdTGW8jI0glAApi0Kh8=
golang.org/x/oauth2 v0.10.0/go.mod h1:kTpgurOux7LqtuxjuyZa4Gj2gdezIt/jQtGnNFfypQI=
golang.org/x/oauth2 v0.16.0 h1:aDkGMBSYxElaoP81NpoUoz2oo2R2wHdZpGToUxfyQrQ=
golang.org/x/oauth2 v0.16.0/go.mod h1:hqZ+0LWXsiVoZpeld6jVt06P3adbS2Uu911W1SsJv2o=
golang.org/x/sync v0.0.0-20190423024810-112230192c58/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20190911185100-cd5d95a43a6e/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
golang.org/x/sync v0.0.0-20201020160332-67f06af15bc9/go.mod h1:RxMgew5VJxzue5/jJTE5uejpjVlOe/izrB70Jof72aM=
Expand Down
Loading

0 comments on commit 49b6582

Please sign in to comment.