bug-1928847: Update architecture diagram for GCP #6799
I have a couple of thoughts:
We should have a data flow diagram for crash data--covered in bug 1769978--but that should be separate from the architecture diagram. What does everyone else think? Does the new diagram help you understand the Socorro architecture?
I don't know how to materially turn this feedback into diagram changes. I agree, but I also thought that about the old diagram. I could hand this off so someone else can give it a shot instead? I feel like part of the problem is that almost every component accesses both gcs raw/processed and pubsub, and showing those connections creates a lot of overlapping lines. Maybe I could separate those and put them in a box with a note that says "used by every component"?
I was under the impression the arrows already indicated data flow. Were the arrows supposed to be like
I updated the diagram to apply my own suggestions from my last comment and switched back to orthogonal lines. Lmk if that helps?
@biancadanforth, @smarnach: ^^^ what do you two think?
I only looked at the latest version of the diagram, and I like it. There's only so much you can encode in a single diagram, and I think this diagram strikes a nice balance. The only thing I found slightly confusing is the arrow pointing back to the load balancer from the stage submitter, since it looks kind of circular, while in reality that arrow goes from the stage submitter running in prod to the stage load balancer. I don't know how to improve this, though – maybe we could just add some comment to that arrow? More generally, there are different kinds of architecture diagrams for software. The goal here appears to be visualizing the infrastructure of Socorro, which I think is useful. This kind of diagram usually visualizes the data flow as well, since there just isn't too big a difference between "component A accesses component B" and "data flows from component A to component B". Either way, I think the current version of the diagram is useful.
It actually lives in stage but reads from prod, so I think I'll change it to have that connecting from "off-screen", indicating it's reading prod crashes and the pubsub standard queue, and also add
Thanks! I like the new version with the arrow pointing off-screen better.
Having the arrow point "off-screen" sounds good to me as well. I like this diagram. 💯
Looks good! Thank you!
The stage submitter now reads from gcs and pubsub, and writes to the collector's load balancer. The webapp and crontabber now show that they read processed crashes from gcs. I split up gcs into raw and processed crashes to limit the number of crossing lines, but kept a single pubsub to also limit crossing lines.
With more crossing lines I couldn't visually track orthogonal lines anymore, so I switched to straight lines and reorganized things as needed, with a secondary goal of data flowing left to right and top to bottom where possible (e.g. data still flows up for telemetry to reduce line crossing).
Based on the legend, the stage submitter writes to crash-reports and the processor "writes" to the version string api. If I switch the arrow so that the processor reads the api, it looks like the load balancer is routing version string api requests to the processor.
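
For anyone who finds it easier to check the connections in text form, here is a rough sketch of the edges described above as a Python edge list. This is not generated from the diagram source. The node names are made up for illustration, and the choice of the raw bucket as the one the stage submitter reads is an assumption, since the description only says "gcs". Arrows follow the legend convention, so an edge points from where data comes from to where it is written.

```python
from collections import defaultdict

# Hypothetical node names; each pair is (source, target), meaning data
# moves from source to target, matching the arrow direction in the legend.
EDGES = [
    ("gcs_raw", "stage_submitter"),  # assumption: raw bucket; the text only says "gcs"
    ("pubsub", "stage_submitter"),
    ("stage_submitter", "collector_load_balancer"),
    ("gcs_processed", "webapp"),
    ("gcs_processed", "crontabber"),
    ("processor", "version_string_api"),  # drawn as a "write" to avoid a misleading arrow
]


def build_index(edges):
    """Return (outgoing, incoming) adjacency maps for the edge list."""
    outgoing = defaultdict(set)
    incoming = defaultdict(set)
    for source, target in edges:
        outgoing[source].add(target)
        incoming[target].add(source)
    return outgoing, incoming


def readers_of(node, edges=EDGES):
    """Components that an arrow from `node` points to, sorted by name."""
    outgoing, _ = build_index(edges)
    return sorted(outgoing[node])


def inputs_of(node, edges=EDGES):
    """Components with an arrow pointing into `node`, sorted by name."""
    _, incoming = build_index(edges)
    return sorted(incoming[node])


if __name__ == "__main__":
    print(readers_of("gcs_processed"))
    print(inputs_of("stage_submitter"))
```

Listing the edges this way also makes it easy to see why pubsub and gcs attract so many lines: they show up as a source for several components.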