Scalability Test and Cluster Management Doc #511

Merged
merged 7 commits into from
Aug 14, 2024

Conversation

hanahmily
Contributor

Signed-off-by: Gao Hongtao <hanahmily@gmail.com>
Signed-off-by: Gao Hongtao <hanahmily@gmail.com>
Signed-off-by: Gao Hongtao <hanahmily@gmail.com>
@hanahmily added the documentation and testing labels on Aug 14, 2024
@hanahmily added this to the 0.7.0 milestone on Aug 14, 2024

The cluster's availability also improves as the number of data nodes increases, because the active data nodes take on a smaller additional workload when some data nodes become unavailable. For example, if one node out of 2 is unavailable, 50% of the load is redistributed to the single remaining node, resulting in a 100% per-node workload increase. If one node out of 10 is unavailable, 10% of the load is redistributed across the 9 remaining nodes, resulting in only about an 11% per-node workload increase.
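
The arithmetic in the example above can be checked with a minimal sketch. It assumes the load is spread evenly across data nodes and that the failed nodes' share is split evenly among the survivors; the function name `per_node_increase` is illustrative only and is not part of any project API.

```python
def per_node_increase(total_nodes: int, failed_nodes: int = 1) -> float:
    """Fractional extra workload on each surviving node.

    Assumes an even load distribution before and after the failure.
    """
    if not 0 <= failed_nodes < total_nodes:
        raise ValueError("failed_nodes must be in the range [0, total_nodes)")
    surviving = total_nodes - failed_nodes
    # Each node originally carries 1/total of the load. The failed share
    # (failed/total) is split across the survivors, so the relative
    # increase per survivor is (failed/total) / surviving / (1/total).
    return failed_nodes / surviving


if __name__ == "__main__":
    for nodes in (2, 10):
        increase = per_node_increase(nodes)
        print(f"{nodes} nodes, 1 down: +{increase:.0%} per surviving node")
```

With one failed node the increase is simply 1 / (N - 1), which is why it shrinks quickly as data nodes are added.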

Increasing the number of etcd nodes can increase the cluster's metadata capacity and improve the cluster's metadata query performance. It can also improve the cluster's metadata availability, as the metadata is replicated across all the etcd nodes. However, the number of etcd nodes should be odd: etcd relies on a majority quorum to avoid split-brain situations, and an even-sized cluster tolerates no more failures than the next smaller odd-sized one.
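
The odd-size recommendation above can be illustrated with majority-quorum arithmetic. This is a rough sketch that assumes the usual rule for Raft-based stores such as etcd, where a write commits only when a majority of members agree; the helper names `quorum` and `fault_tolerance` are hypothetical and are not etcd APIs.

```python
def quorum(members: int) -> int:
    """Smallest majority of a cluster with the given member count."""
    if members < 1:
        raise ValueError("a cluster needs at least one member")
    return members // 2 + 1


def fault_tolerance(members: int) -> int:
    """How many members can fail while a majority is still reachable."""
    return members - quorum(members)


if __name__ == "__main__":
    for size in range(1, 8):
        print(
            f"members={size} quorum={quorum(size)} "
            f"tolerated_failures={fault_tolerance(size)}"
        )
```

Under this rule, going from 3 to 4 members raises the quorum from 2 to 3 without raising the number of tolerated failures, so the extra node adds cost but no availability.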
Member

I believe etcd could become a risk when we run larger-scale deployments.

Contributor Author

Absolutely.

During the test, I used 10 data nodes and only 1 etcd node in a medium-sized cluster. Moving forward, we need to include more extensive and complex scenarios in the scale testing.

Member

Yes, and when we meet the Lenovo team next month, we need to verify the scale with them.

@wu-sheng
Member

The rest looks good. Please fix the menu structure for the operation docs.

Signed-off-by: Gao Hongtao <hanahmily@gmail.com>
@wu-sheng
Member

Why is your YAML so different?

Signed-off-by: Gao Hongtao <hanahmily@gmail.com>
@wu-sheng merged commit c27d562 into main on Aug 14, 2024
15 checks passed
@wu-sheng deleted the cluster branch on August 14, 2024 13:39
3 participants