-
Notifications
You must be signed in to change notification settings - Fork 31
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(reports): sidecar container report generation #779
Conversation
3f4d7be
to
80dde0e
Compare
f45db29
to
add2644
Compare
a5802d9
to
aabfdcd
Compare
src/main/java/io/cryostat/net/reports/SubprocessReportGenerator.java
Outdated
Show resolved
Hide resolved
A general question before I start reviewing the actual code changes: if I'm understanding this PR correctly, by delegating report generation to a sidecar container used strictly for that purpose, there is more memory available for report generation (compared to a subprocess inside the Cryostat instance container which must also handle every other function of Cryostat), meaning OOM errors should not occur? Edit: And this process of requesting report generation and receiving the generated report is done strictly through HTTP calls between the Cryostat container and the sidecar container? |
There isn't necessarily more memory available, since that depends on the container limits set by the container runtime platform. I explained a bit of the resource requirements rationale in this comment: https://github.com/cryostatio/cryostat/pull/779#discussion_r766053568 The main problem is that report generation is going to be a relatively less-frequently used feature, but it requires by far the most resources. This leads to a classic resource provisioning problem where an end user would need to provision the Cryostat container with enough resources to be able to meet peak resource demands (during report generation), but the majority of the time those resources are not needed and end up being wasted. In a cloud environment, overprovisioned resources are a very direct waste of money at the end of the day. Splitting out report generation into a sidecar does mean that report generation should never cause Cryostat itself to OOM (well, it's possible, but the memory usage on a reports request is now just whatever is required for the HTTP request plus enough to hold the HTML document in memory, basically - some number of KB). The cryostat-reports sidecar could still OOM, but this is isolated away from the main Cryostat container and should just turn up a 500 response. By splitting the report generation into a separate container then it becomes much easier to provision resources appropriately. The main Cryostat container only needs enough resources to run its webserver and perform various relatively lightweight operations over HTTP and JMX, so its footprint can be small and therefore running it and keeping it deployed can be cheap. Users who only occasionally use automated reports can provision just a single container for it and only allocate it a small amount of resources, letting it fail if those resources are insufficient - or just continue using subprocess generation if they're really, low priority. The more you care about reports and the more often you want to see them, the more resources you can provision to the report container and/or the more replicas of the container you can spin up. In the future I hope to have cryostat-reports built and runnable as a Quarkus native image as well, which would further reduce the resource footprint and also open the door to running it serverless, meaning that the cryostat-reports container would actually be started up on-demand when a reports generation request is made, and stopped after the response has been completed. This should make Cryostat as a whole just about the smallest and cheapest to run that it can be.
Correct, the sidecar container has exactly one HTTP endpoint that expects a JFR binary to be POSTed to it and it responds with the automated analysis HTML document. So when you make a request to Cryostat for a report, then behind the scenes it actually delegates out to cryostat-reports with another HTTP request, and when it gets a response it wraps that up and sends you back another HTTP response. |
cd75c34
to
4a1d574
Compare
src/test/java/io/cryostat/net/reports/ActiveRecordingReportCacheTest.java
Outdated
Show resolved
Hide resolved
4a1d574
to
ee97ed6
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested it out, looks good.
src/main/java/io/cryostat/net/reports/RemoteReportGenerator.java
Outdated
Show resolved
Hide resolved
ee97ed6
to
d361a9e
Compare
After pulling your latest changes, I got a dependency error looking for the 2.4.0 cryostat-core jar. When I looked at cryostat-core, I see the versioning changed from 2.3.1 to 2.5.0-SNAPSHOT. Your branch builds fine when I change |
You'll have to check out and build the upstream That goes for anything after #731 |
I have a quick/dirty setup of this in OpenShift, but hit a reflection problem with the reports container:
Should I be testing it in JVM mode for now? |
extract subprocess report generation variable extract platform configuration variables extract max WS connections variable extract CORS origin variable extract DISABLE_SSL variable extract webserver configuration variables use Variables extract JMX connection config variables finish extracting variables fix missed import fix extracted variables in tests
request ordering is handled deeper in the stack by the SubprocessReportGenerator synchronizing lock, or is handled by the configured sidecar report generator setup and its potential load balancer
allows Cryostat too inspect the report generator for profiling and analysis
d1c22ac
to
5c330fc
Compare
Rebased, just one very minor conflict in MessagingModule since I moved the MAX_CONNECTIONS env var out of there, but Janelle added the PRUNER constant. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changes look good to me after the rebase.
The todo bot should detect and create issues for any todo comments, right?
The todo bot is not active anymore, actually. I can't remember anymore why. |
No worries. It would be nice, although not necessary, if the todos were tracked as issues somehow. None of the todos are high priority anyway. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good!
Related to cryostatio/cryostat#8
Fixes #324 (no longer necessary)
This abstracts report generation to reduce the coupling of the report caches and report handlers to the SubprocessReportGenerator, and then introduces an alternate report generator implementation that performs HTTP calls out to a separate sidecar container (https://github.com/cryostatio/cryostat-reports). The user selects which report generation strategy is used by setting the environment variable
CRYOSTAT_REPORT_GENERATOR
. If this env var is unset or empty then Cryostat will default to forking a subprocess to generate the report as usual. If it is set then it is expected to be the URL to acryostat-reports
instance (or a loadbalanced service in front of some replicas), and Cryostat will delegate report generation off to the sidecar(s).