Shortly describe projects in related work

timebertt · Feb 14, 2024 · commit efcaf63 · 1 parent c3df4c9

Showing 6 changed files with 39 additions and 14 deletions.
diff --git a/.vscode/cspell.json b/.vscode/cspell.json
@@ -41,6 +41,7 @@
         "Kubeflow",
         "kubelet",
         "kubernetes",
+        "Kustomize",
         "Metacontroller",
         "newpage",
         "pandoc",

diff --git a/content/02-abstract.md b/content/02-abstract.md
@@ -12,7 +12,7 @@ This thesis bridges this gap by proposing an approach to achieve horizontal scal
 The design builds upon proven mechanisms from distributed databases to distribute the responsibility for API objects across a ring of controller instances, removing the scalability limitations inherent in traditional leader election setups.
 Key features include dynamic membership and failure detection for automatic failovers and rebalancing, a consistent hashing algorithm for ensuring a balanced distribution of API objects, label-based coordination for transparent object assignments without client interaction, and a dedicated handover protocol for preventing concurrent reconciliations.
 
-This thesis presents a reusable implementation that allows for easy integration of the mechanism into arbitrary controllers, opening the potential for adoption and collaboration within the open-source community.
+This thesis presents a reusable implementation that allows for easy integration of the mechanism into arbitrary controllers including built-in controllers, opening the potential for adoption and collaboration within the open-source community.
 Systematic evaluation using load test experiments demonstrates that all identified requirements are met.
 The mechanism showcases minimal overhead compared to singleton controller setups and an almost linear increase of the controller's load capacity with every added controller instance.
 This work contributes to advancing the scalability and efficiency of Kubernetes controllers, offering promising prospects for the future development and usage of Kubernetes and controller-based applications and platforms.

diff --git a/content/10-motivation.md b/content/10-motivation.md
@@ -19,7 +19,7 @@ The Kubernetes community has extensively picked up the operator pattern, and man
 - streaming & messaging: strimzi-kafka-operator [@strimzi], Koperator [@koperator]
 - storage and backup: Rook [@rook], Velero [@velero]
 - machine learning: Kubeflow [@kubeflow]
-- networking: Knative [@knative], Istio [@istio]
+- serverless and service mesh: Knative [@knative], Istio [@istio]
 - infrastructure and application management: Crossplane [@crossplane], Argo CD [@argocd], Flux [@flux], KubeVela [@kubevela]
 - cluster management: Gardener [@gardenerdocs], Cluster API [@clusterapi]
 - cloud infrastructure: Yaook [@yaook], IronCore [@ironcore]
@@ -47,7 +47,7 @@ It limits the maximum number of objects and the maximum object churn rate to the
 [@bondi2000characteristics]
 
 To address the demand for facilitating large-scale deployments, several of the mentioned open-source projects feature sharding mechanisms that distribute reconciliation work across multiple controller instances [@argocddocs; @kubevela].
-However, the mechanisms are specific to the individual projects and cannot be reused in other controllers.
+However, the mechanisms are specific to the individual projects and cannot be reused in other custom controllers or Kubernetes core controllers.
 Many of these sharding implementations still need to be fully matured and face similar challenges, e.g., the mechanism requires clients to be sharding-aware and manually assign API objects to shards, or the implementation does not facilitate automatic failover and rebalancing [@flux].
 Furthermore, many other projects also consider sharding mechanisms for achieving higher scalability[^sharding-issues].
 The problem is that no standard design or implementation exists that can be applied to arbitrary controllers for scaling them horizontally.

diff --git a/content/30-related-work.md b/content/30-related-work.md
@@ -78,12 +78,15 @@ The assignments per object also require many sharder reconciliations and API req
 
 ![Study project memory usage by pod [@studyproject]](../assets/study-project-memory.pdf){#fig:study-project-memory}
 
-## knative
+## Knative
 
-In knative [@knative], controllers also use leader election but not for global locking[^knative-issue].
+Knative is a platform for running serverless and event-driven workloads on Kubernetes.
+Users package their applications as container images and deploy them without managing infrastructure, networking, autoscaling, revision tracking, and other cross-cutting concerns [@knative].
+
+In Knative, controllers also use leader election but not for global locking[^knative-issue].
 Instead, the controllers perform leader election per reconciler[^reconciler] and bucket.
 When running multiple instances of the controllers, each instance might acquire a subset of all leases and run only the corresponding reconcilers.
-Some of knative's reconcilers are leader-aware and run in all instances but behave differently according to the leadership status.
+Some of Knative's reconcilers are leader-aware and run in all instances but behave differently according to the leadership status.
 For example, the webhook components also use reconcilers for building up indices.
 The reconcilers also run in non-leader instances but only perform writing actions in the leader instance.
 Additionally, the keys of all objects are split into a configurable number of buckets.
@@ -92,7 +95,7 @@ Before reconciling an object, the reconciler checks if its instance is responsib
 Only if it is responsible can it continue with the usual reconciliation.
 [@mooresharding]
 
-![Failover with leader election per controller and bucket in knative [@mooresharding]](../assets/reconciler-buckets.pdf)
+![Failover with leader election per controller and bucket in Knative [@mooresharding]](../assets/reconciler-buckets.pdf)
 
 To realize these mechanisms, all controller instances run all informers.
 I.e., they watch all objects regardless of whether they need to reconcile them.
@@ -102,7 +105,7 @@ Furthermore, the mechanisms do not guarantee an even distribution of objects acr
 Users need to configure a higher number of buckets to achieve an even distribution.
 This, in turn, increases the additional API request volume for `Lease` objects even further.
 
-The described sharding mechanisms in knative achieve fast failovers as informers are warmed in all controller instances.
+The described sharding mechanisms in Knative achieve fast failovers as informers are warmed in all controller instances.
 However, the system's scalability is still limited as the watch caches' resource impact is duplicated and not distributed.
 Applying the described concepts to other controllers is complex and requires notable changes to the controller implementation.
 To summarize, the system benefits from these mechanisms in terms of availability but not in terms of scalability.
@@ -112,7 +115,10 @@ To summarize, the system benefits from these mechanisms in terms of availability
 
 ## Flux
 
-The Flux controllers offer a command line option `--watch-label-selector` that filters the controllers' watch caches using a label selector.
+Flux [@flux] is a tool for facilitating continuous delivery of Kubernetes-based applications.
+It is comprised of multiple components that pull application configuration from sources like Git repositories and deploy them using Kustomize [@kustomizedocs] and Helm [@helmdocs].
+
+The Flux components offer a command line option `--watch-label-selector` that filters the controllers' watch caches using a label selector.
 This can be used to scale out Flux controllers horizontally using a sharding strategy.
 For this, users deploy multiple instances of the same controller with unique label selectors used as the sharding key[^flux-sharding].
 Then, users assign objects to shards by adding the shard key label to the respective manifests ([@lst:flux-sharding]).
@@ -157,7 +163,8 @@ The sharding strategy is limited to a static number of instances and does not al
 
 ## Argo CD
 
-In Argo CD [@argocddocs], the application controller is the central component that deploys manifests pulled from Git repositories to Kubernetes.
+Argo CD [@argocddocs] is a continuous delivery tool for Kubernetes similar to Flux.
+In Argo CD, the application controller is the central component that deploys manifests pulled from Git repositories to Kubernetes.
 It works with one or more clusters configured via `Secrets` that contain credentials for the cluster.
 The application reconciliation process can become memory-intensive depending on the number and size of the deployed manifests.
 
@@ -180,7 +187,8 @@ However, this is specific to Argo CD's application controller, and the mechanism
 
 ## KubeVela
 
-KubeVela also allows running multiple instances of its core controller responsible for deploying applications to support large-scale use cases.
+KubeVela is a platform for delivery and management of Kubernetes-based applications.
+The project also allows running multiple instances of its core controller responsible for deploying applications to support large-scale use cases.
 For this, users deploy multiple instances of vela-core – one in master mode (primary) and the others in slave mode (shards).
 The primary instance runs all controllers and webhooks and schedules applications to one of the available shard instances.
 On the other hand, the shard instances are labeled with a unique `shard-id` label and only run the application controller.

diff --git a/content/70-conclusion.md b/content/70-conclusion.md
@@ -32,7 +32,7 @@ To conclude, the systematic evaluation has shown that all identified requirement
 As the mechanism can be easily applied to existing controllers, it opens opportunities for adopting the presented work, discussion, and collaboration in the open-source community.
 Further development is simplified because the implementation does not depend on a specific Kubernetes version.
 
-As future work on horizontally scalable Kubernetes controllers, the design and implementation from this thesis should be further evaluated through usage in real-world controllers.
+As future work on horizontally scalable Kubernetes controllers, the design and implementation from this thesis should be further evaluated through usage in real-world controllers including built-in and custom controllers.
 The implementation's performance during rolling updates, automatic scaling, chaos engineering experiments [@chaos2016], and more scenarios should be investigated and enhanced if necessary.
 For this, feedback from the community on the presented development needs to be collected.
 New requirements shall be collected and explored if certain use cases cannot adopt the presented work.

diff --git a/content/bibliography.bib b/content/bibliography.bib
@@ -560,10 +560,10 @@ @misc{argocd
 }
 
 @misc{flux,
-  title   = {{Flux}},
+  title   = {{Flux Documentation}},
   author  = {{The Flux Authors}},
   date    = {2024},
-  url     = {https://fluxcd.io/},
+  url     = {https://fluxcd.io/flux/},
   urldate = {2024-02-07}
 }
 
@@ -607,6 +607,22 @@ @misc{argocddocs
   urldate = {2024-02-07}
 }
 
+@misc{kustomizedocs,
+  title   = {{Kustomize Documentation}},
+  author  = {{The Kubernetes Authors}},
+  date    = {2024},
+  url     = {https://kustomize.io/},
+  urldate = {2024-02-14}
+}
+
+@misc{helmdocs,
+  title   = {{Helm: The package manager for Kubernetes}},
+  author  = {{The Helm Authors}},
+  date    = {2024},
+  url     = {https://helm.sh/},
+  urldate = {2024-02-14}
+}
+
 @article{argoaws,
   author  = {Andrew Lee and Christina Andonov and Carlos Santana and Nima Kaviani},
   journal = {AWS Open Source Blog},