Support for shard group isolation instead of managing multiple clusters #237

robskillington · 2018-08-30T20:09:49Z

Discussed with @cw9 why we had originally wanted to be able to shard metrics across multiple clusters, and concluded that it would be better to support the same benefits in a single logical cluster.

The main benefits seen for sharding metrics across N clusters are:

Each cluster can have 1 node down, making it safer to run very large deployments (think thousands of nodes)
Each node doesn't have to be connected to every other node, so health checks aren't n^2
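
To put rough numbers on the n^2 point, here is a minimal sketch that counts node pairs in a full mesh, first for one large cluster and then for the same nodes split evenly into independent clusters. The function names and the even-split assumption are illustrative only and are not taken from any existing code.

```python
def peer_links(nodes: int) -> int:
    """Number of undirected node pairs in a full mesh of `nodes` nodes."""
    return nodes * (nodes - 1) // 2


def links_when_split(total_nodes: int, clusters: int) -> int:
    """Total node pairs when the nodes form `clusters` independent full meshes.

    Assumption for this sketch: the nodes divide evenly across clusters.
    """
    if total_nodes % clusters != 0:
        raise ValueError("this sketch assumes an even split")
    return clusters * peer_links(total_nodes // clusters)
```

Under these assumptions, 1000 nodes in one mesh give 499500 pairs, while 10 clusters of 100 nodes give 49500 pairs in total.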

We concluded that we can get the same benefits by introducing grouping of topology shards, and assigning instances to a specific group.

Hence we would get groups of machines in a single cluster where the machines in a group all share the same shards, but do not share any shards with any other group in the cluster.

This has the benefit that each group of machines can tolerate a single node down, and a node does not need to be connected to every other node in the entire cluster (with respect to bootstrapping, health checking, etc).
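
A minimal sketch of the grouping idea, assuming shards are split into contiguous, equally sized ranges and each instance carries a group number. `Instance`, `shard_group`, `shards_for_group` and `peers_of` are hypothetical names for illustration, not existing topology APIs, and how shards are placed on instances within a group is left out.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Instance:
    id: str
    group: int  # hypothetical field: the shard group this instance is assigned to


def shard_group(shard: int, num_shards: int, num_groups: int) -> int:
    """Map a shard to a group using contiguous, equally sized shard ranges."""
    if num_shards % num_groups != 0:
        raise ValueError("this sketch assumes shards divide evenly into groups")
    return shard // (num_shards // num_groups)


def shards_for_group(group: int, num_shards: int, num_groups: int) -> Set[int]:
    """All shards owned by a group; no other group owns any of them."""
    return {
        s for s in range(num_shards)
        if shard_group(s, num_shards, num_groups) == group
    }


def peers_of(instance: Instance, instances: List[Instance]) -> List[Instance]:
    """Instances this one would need to connect to: only those in its own group."""
    return [
        other for other in instances
        if other.group == instance.group and other.id != instance.id
    ]
```

With 8 shards and 2 groups, group 0 covers shards 0 to 3 and group 1 covers shards 4 to 7, and an instance in group 0 only sees the other group 0 instances as peers.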

richardartoul · 2018-10-04T22:47:22Z

@robskillington Could we do sharding across these groups? I.e. instead of replicating data inside a group, you replicate it between groups. That way you could lose an entire group (or deploy an entire group) at once and not lose any data.

robskillington · 2018-10-05T12:59:50Z

@richardartoul Replicas exist for exactly the purpose you describe. This is more for suppressing cross talk and allowing multiple groups to be deployed at the same time.

It's essentially a slice along another dimension, separate from the existing replicas dimension.
