Many pull request templates, including Shopify core's, have a "This is safe to rollback" checkbox. This helps people during incidents to assess the risk of rolling back vs fixing forward. With our larger deploy batch sizes, and separate deploy batches for canary and production, it is becoming very hard for a human to assess rollback safety. This has resulted in situations where people have decided to fix forward rather than roll back, which significantly prolongs the impact of a disruption. As one example, this came up during the RCA discussion for https://github.com/Shopify/service-disruptions/issues/1037.
Idea: explicitly model "safe to rollback" in Shipit rather than via a PR template checkbox. Perhaps it is exposed via HCTW as an extra field next to "Add to Merge queue". Shipit can then quickly determine if a batch of changes is "unsafe to rollback" and indicate this in the Shipit UI. This would not be foolproof, but it would help a human responder more quickly discover cases where rollback is not safe.
This would also help us collect a list of cases that are not safe that we can work towards eliminating, with the goal of eventually never shipping changes that are not "safe to rollback".
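To make the idea concrete, here is a rough sketch of the batch-level check, written in Python for illustration only. It is not Shipit code: the `Change` type, its fields, and both function names are hypothetical. It also assumes that a change whose author never answered the question is treated as unsafe, which is a policy choice this issue does not settle.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Change:
    """Hypothetical record for one merged change in a deploy batch."""
    pr_number: int
    # True/False as declared by the author; None means the field was left unset.
    safe_to_rollback: Optional[bool]


def unsafe_changes(batch: List[Change]) -> List[Change]:
    """Return the changes that block a rollback.

    Assumption: an unset flag (None) is treated the same as an explicit False.
    """
    return [c for c in batch if c.safe_to_rollback is not True]


def batch_is_safe_to_rollback(batch: List[Change]) -> bool:
    """A batch is safe to roll back only if every change in it is flagged safe."""
    return not unsafe_changes(batch)
```

The list returned by `unsafe_changes` could serve both purposes described above: showing a responder which changes make a batch unsafe, and accumulating the set of unsafe cases to work on eliminating over time.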