This repository has been archived by the owner on Oct 18, 2018. It is now read-only.

Support for shard group isolation instead of managing multiple clusters #237

Open
robskillington opened this issue Aug 30, 2018 · 2 comments


@robskillington
Contributor

robskillington commented Aug 30, 2018

Discussed with @cw9 why we had originally wanted to shard metrics across multiple clusters, and concluded that it would be preferable to get the same benefits within a single logical cluster.

The main benefits seen for sharding metrics across N clusters are:

  • Each cluster can have 1 node down, which makes it safer to run very large deployments (think thousands of nodes)
  • Each node doesn't have to be connected to every other node, so health checks aren't n^2
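To put rough numbers on the second point, here is a small sketch that counts node pairs for one fully connected cluster versus the same nodes split into isolated sets. The function names and the 1000-node, 10-way split are illustrative assumptions, not figures from this issue.

```python
def full_mesh_links(nodes: int) -> int:
    """Number of undirected node pairs when every node talks to every other node."""
    return nodes * (nodes - 1) // 2


def grouped_links(nodes: int, groups: int) -> int:
    """Node pairs when nodes are split as evenly as possible into isolated sets."""
    base, extra = divmod(nodes, groups)
    # `extra` sets get one additional node so that all nodes are accounted for.
    sizes = [base + 1] * extra + [base] * (groups - extra)
    return sum(full_mesh_links(size) for size in sizes)


print(full_mesh_links(1000))    # 499500
print(grouped_links(1000, 10))  # 49500
```

Splitting into k equal sets cuts the pair count by roughly a factor of k, which is the effect being described for both separate clusters and groups within one cluster.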

We concluded that we can get the same benefits by grouping a topology's shards and assigning instances to a specific group.

Hence we would get groups of machines in a single cluster, where the machines in a group all share the same shards but do not share any shards with any other group in the cluster.

This has the benefit that each group of machines can tolerate a single node down, and a node does not need to be connected to every other node in the entire cluster (with respect to bootstrapping, health checking, etc.).
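A minimal sketch of the idea, to make the proposal concrete. Everything here is hypothetical: `Group`, `build_groups`, `place_replicas` and `peers` are made-up names for illustration and not an existing placement API, and the replica count of 3 and the round-robin shard split are assumptions the issue does not specify.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Group:
    name: str
    instances: list[str]
    shards: list[int] = field(default_factory=list)


def build_groups(instances: list[str], group_size: int, num_shards: int) -> list[Group]:
    """Split instances into fixed-size groups and give each group a disjoint shard set."""
    groups = [
        Group(name=f"group-{i // group_size}", instances=instances[i:i + group_size])
        for i in range(0, len(instances), group_size)
    ]
    # Round-robin split (an assumption): each shard belongs to exactly one group.
    for shard in range(num_shards):
        groups[shard % len(groups)].shards.append(shard)
    return groups


def place_replicas(group: Group, replicas: int) -> dict[int, list[str]]:
    """Map each of the group's shards to `replicas` distinct instances in that group."""
    n = len(group.instances)
    if replicas > n:
        raise ValueError("group has fewer instances than replicas")
    return {
        shard: [group.instances[(idx + r) % n] for r in range(replicas)]
        for idx, shard in enumerate(group.shards)
    }


def peers(groups: list[Group], instance: str) -> list[str]:
    """Instances this one needs to reach for bootstrapping and health checks."""
    for group in groups:
        if instance in group.instances:
            return [other for other in group.instances if other != instance]
    raise KeyError(instance)


instances = [f"node-{i}" for i in range(6)]
groups = build_groups(instances, group_size=3, num_shards=8)
print(peers(groups, "node-4"))  # ['node-3', 'node-5']
```

With these assumptions, taking down one node in each group at the same time still leaves every shard with two of its three replicas, and no node ever needs a connection outside its own group.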

@richardartoul
Contributor

@robskillington Could we do sharding across these groups? I.e., instead of replicating data inside a group, you replicate it between groups. That way you could lose an entire group (or deploy an entire group) at once and not lose any data.

@robskillington
Contributor Author

@richardartoul Replicas exist for exactly the purpose you describe. This is more for suppressing cross talk and allowing multiple groups to be deployed at the same time.

It's essentially a slice in another dimension, orthogonal to the existing replicas dimension.
