feat(reports): sidecar container report generation #779

andrewazores · 2021-12-07T16:47:30Z

Related to cryostatio/cryostat#8
Fixes #324 (no longer necessary)

This abstracts report generation to reduce the coupling of the report caches and report handlers to the SubprocessReportGenerator, and then introduces an alternate report generator implementation that makes HTTP calls to a separate sidecar container (https://github.com/cryostatio/cryostat-reports). The user selects which report generation strategy is used by setting the environment variable CRYOSTAT_REPORT_GENERATOR. If this env var is unset or empty, Cryostat defaults to forking a subprocess to generate the report as usual. If it is set, it is expected to be the URL of a cryostat-reports instance (or of a load-balanced service in front of some replicas), and Cryostat will delegate report generation to the sidecar(s).
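To illustrate the selection rule described above, here is a minimal sketch, not the code from this PR. Only the env var name CRYOSTAT_REPORT_GENERATOR comes from the description; the class name, method names, and the example URL are hypothetical.

```java
import java.net.URI;
import java.util.Map;
import java.util.Optional;

// Hypothetical helper: decides between subprocess and sidecar report generation.
final class ReportGeneratorSelection {
    // Env var name taken from the PR description.
    static final String ENV_VAR = "CRYOSTAT_REPORT_GENERATOR";

    // Returns the sidecar URL if one is configured, or empty to mean "fork a subprocess".
    static Optional<URI> sidecarUri(Map<String, String> env) {
        String value = env.get(ENV_VAR);
        if (value == null || value.isBlank()) {
            return Optional.empty();
        }
        return Optional.of(URI.create(value.trim()));
    }

    // Human-readable summary of the chosen strategy.
    static String describe(Map<String, String> env) {
        return sidecarUri(env).map(uri -> "remote:" + uri).orElse("subprocess");
    }

    public static void main(String[] args) {
        System.out.println(describe(System.getenv()));
    }
}
```

Taking the environment as a Map argument rather than calling System.getenv() directly keeps the rule easy to unit test.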

andrewazores · 2021-12-07T20:13:00Z

Test failure: https://github.com/cryostatio/cryostat/runs/4448618311?check_suite_focus=true#step:9:1165
#666

src/main/java/io/cryostat/net/reports/ReportsModule.java

src/main/java/io/cryostat/net/reports/SubprocessReportGenerator.java

smoketest.sh

hareetd · 2021-12-09T18:45:49Z

A general question before I start reviewing the actual code changes: if I'm understanding this PR correctly, by delegating report generation to a sidecar container used strictly for that purpose, there is more memory available for report generation (compared to a subprocess inside the Cryostat instance container which must also handle every other function of Cryostat), meaning OOM errors should not occur?

Edit: And this process of requesting report generation and receiving the generated report is done strictly through HTTP calls between the Cryostat container and the sidecar container?

andrewazores · 2021-12-09T18:59:49Z

> A general question before I start reviewing the actual code changes: if I'm understanding this PR correctly, by delegating report generation to a sidecar container used strictly for that purpose, there is more memory available for report generation (compared to the Cryostat instance container which must also handle every other function of Cryostat), meaning OOM errors should not occur?

There isn't necessarily more memory available, since that depends on the container limits set by the container runtime platform.

I explained a bit of the resource requirements rationale in this comment: https://github.com/cryostatio/cryostat/pull/779#discussion_r766053568

The main problem is that report generation is going to be a relatively less-frequently used feature, but it requires by far the most resources. This leads to a classic resource provisioning problem where an end user would need to provision the Cryostat container with enough resources to be able to meet peak resource demands (during report generation), but the majority of the time those resources are not needed and end up being wasted. In a cloud environment, overprovisioned resources are a very direct waste of money at the end of the day.

Splitting out report generation into a sidecar does mean that report generation should never cause Cryostat itself to OOM (well, it's possible, but the memory usage on a reports request is now just whatever is required for the HTTP request plus enough to hold the HTML document in memory, basically - some number of KB). The cryostat-reports sidecar could still OOM, but this is isolated away from the main Cryostat container and should just turn up a 500 response.

Splitting report generation into a separate container makes it much easier to provision resources appropriately. The main Cryostat container only needs enough resources to run its webserver and perform various relatively lightweight operations over HTTP and JMX, so its footprint can be small and it can be cheap to run and keep deployed. Users who only occasionally use automated reports can provision just a single container for report generation and allocate it only a small amount of resources, letting it fail if those resources are insufficient, or just continue using subprocess generation if reports are really low priority. The more you care about reports and the more often you want to see them, the more resources you can provision to the report container and/or the more replicas of the container you can spin up.

In the future I hope to have cryostat-reports built and runnable as a Quarkus native image as well, which would further reduce the resource footprint and also open the door to running it serverless, meaning that the cryostat-reports container would actually be started up on-demand when a reports generation request is made, and stopped after the response has been completed. This should make Cryostat as a whole just about the smallest and cheapest to run that it can be.

> Edit: And this process of requesting report generation and receiving the generated report is done strictly through HTTP calls between the Cryostat container and the sidecar container?

Correct, the sidecar container has exactly one HTTP endpoint that expects a JFR binary to be POSTed to it and it responds with the automated analysis HTML document. So when you make a request to Cryostat for a report, then behind the scenes it actually delegates out to cryostat-reports with another HTTP request, and when it gets a response it wraps that up and sends you back another HTTP response.
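As a rough sketch of that delegation using the JDK's built-in HTTP client (Java 11+): the class and method names are hypothetical, the /report path is taken from the sidecar error log quoted later in this thread, and sending the recording as a raw request body is an assumption, since the real endpoint may expect a different encoding such as a multipart form.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.time.Duration;

// Hypothetical client for a cryostat-reports sidecar; not the PR's RemoteReportGenerator.
final class RemoteReportSketch {
    // Builds the POST carrying the JFR bytes. The body encoding is an assumption.
    static HttpRequest buildRequest(URI sidecarBase, byte[] jfrBytes, Duration timeout) {
        return HttpRequest.newBuilder(sidecarBase.resolve("/report"))
                .timeout(timeout) // do not hang forever if the sidecar is stuck
                .POST(HttpRequest.BodyPublishers.ofByteArray(jfrBytes))
                .build();
    }

    // Sends the recording and returns the HTML report, or throws on a non-200 response.
    static String generate(HttpClient client, URI sidecarBase, byte[] jfrBytes)
            throws IOException, InterruptedException {
        HttpRequest request = buildRequest(sidecarBase, jfrBytes, Duration.ofSeconds(30));
        HttpResponse<String> response =
                client.send(request, HttpResponse.BodyHandlers.ofString());
        if (response.statusCode() != 200) {
            // For example a 500 if the sidecar ran out of memory during analysis.
            throw new IOException("Report generation failed: HTTP " + response.statusCode());
        }
        return response.body();
    }
}
```

A caller would pass HttpClient.newHttpClient(), the configured sidecar URL, and the recording bytes; the 30 second timeout is an arbitrary illustrative value.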

src/main/java/io/cryostat/net/reports/ActiveRecordingReportCache.java

src/test/java/io/cryostat/net/reports/ActiveRecordingReportCacheTest.java

hareetd

Tested it out, looks good.

andrewazores · 2021-12-10T18:56:27Z

It's a fairly large change, not just in line count/patch size but in functionality being moved around, so I'll wait until @jan-law gets another chance to take a look and @ebaron gets some eyes on this before merging.

src/main/java/io/cryostat/net/reports/RemoteReportGenerator.java

found deficiency

jan-law · 2021-12-13T15:44:40Z

After pulling your latest changes, I got a dependency error looking for the 2.4.0 cryostat-core jar.
Could not resolve dependencies for project io.cryostat:cryostat:jar:2.1.0-SNAPSHOT: Failure to find io.cryostat:cryostat-core:jar:2.4.0

When I looked at cryostat-core, I see the versioning changed from 2.3.1 to 2.5.0-SNAPSHOT. Your branch builds fine when I change <io.cryostat.core.version>2.4.0</io.cryostat.core.version> to 2.5.0-SNAPSHOT. Where is the 2.4.0 -core version from?

andrewazores · 2021-12-13T17:17:41Z

> After pulling your latest changes, I got a dependency error looking for the 2.4.0 cryostat-core jar. Could not resolve dependencies for project io.cryostat:cryostat:jar:2.1.0-SNAPSHOT: Failure to find io.cryostat:cryostat-core:jar:2.4.0

> When I looked at cryostat-core, I see the versioning changed from 2.3.1 to 2.5.0-SNAPSHOT. Your branch builds fine when I change <io.cryostat.core.version>2.4.0</io.cryostat.core.version> to 2.5.0-SNAPSHOT. Where is the 2.4.0 -core version from?

You'll have to check out and build the upstream v2 branch: https://github.com/cryostatio/cryostat-core/blob/v2/pom.xml#L8

That goes for anything after #731

ebaron · 2021-12-16T22:32:11Z

I have a quick/dirty setup of this in OpenShift, but hit a reflection problem with the reports container:

2021-12-16 22:18:19,814 ERROR [io.qua.ver.htt.run.QuarkusErrorHandler] (executor-thread-0) HTTP Request to /report failed, error id: 64f46a06-c1e9-4822-932e-c8cd734f5498-1: java.lang.RuntimeException: java.lang.InstantiationException: Type `org.openjdk.jmc.flightrecorder.internal.parser.v1.StructTypes.JfrOldObject` can not be instantiated reflectively as it does not have a no-parameter constructor or the no-parameter constructor has not been added explicitly to the native image.
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.ValueReaders$ReflectiveReader.read(ValueReaders.java:526)
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.TypeManager$TypeEntry.readConstant(TypeManager.java:294)
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.TypeManager.readConstants(TypeManager.java:412)
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.ChunkLoaderV1.readConstantPoolEvent(ChunkLoaderV1.java:110)
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.ChunkLoaderV1.call(ChunkLoaderV1.java:77)
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.ChunkLoaderV1.call(ChunkLoaderV1.java:47)
	at java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
	at java.util.concurrent.FutureTask.run(FutureTask.java:264)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
	at java.lang.Thread.run(Thread.java:829)
	at com.oracle.svm.core.thread.JavaThreads.threadStartRoutine(JavaThreads.java:567)
	at com.oracle.svm.core.posix.thread.PosixJavaThreads.pthreadStartRoutine(PosixJavaThreads.java:192)
Caused by: java.lang.InstantiationException: Type `org.openjdk.jmc.flightrecorder.internal.parser.v1.StructTypes.JfrOldObject` can not be instantiated reflectively as it does not have a no-parameter constructor or the no-parameter constructor has not been added explicitly to the native image.
	at java.lang.Class.newInstance(DynamicHub.java:911)
	at org.openjdk.jmc.flightrecorder.internal.parser.v1.ValueReaders$ReflectiveReader.read(ValueReaders.java:516)
	... 13 more

Should I be testing it in JVM mode for now?

extract subprocess report generation variable
extract platform configuration variables
extract max WS connections variable
extract CORS origin variable
extract DISABLE_SSL variable
extract webserver configuration variables
use Variables
extract JMX connection config variables
finish extracting variables
fix missed import
fix extracted variables in tests

andrewazores · 2022-01-13T19:16:38Z

Rebased, just one very minor conflict in MessagingModule since I moved the MAX_CONNECTIONS env var out of there, but Janelle added the PRUNER constant.

jan-law

Changes look good to me after the rebase.

The todo bot should detect and create issues for any todo comments, right?

andrewazores · 2022-01-13T20:12:07Z

The todo bot is not active anymore, actually. I can't remember why.

jan-law · 2022-01-13T20:18:22Z

No worries. It would be nice, although not necessary, if the todos were tracked as issues somehow. None of the todos are high priority anyway.

hareetd

Looks good!

andrewazores added the feat (New feature or request) label Dec 7, 2021

andrewazores force-pushed the sidecar-reports branch from 3f4d7be to 80dde0e on December 7, 2021 19:36

andrewazores marked this pull request as ready for review December 8, 2021 16:25

andrewazores force-pushed the sidecar-reports branch from f45db29 to add2644 on December 8, 2021 16:26

andrewazores marked this pull request as draft December 8, 2021 16:26

andrewazores force-pushed the sidecar-reports branch 2 times, most recently from a5802d9 to aabfdcd on December 8, 2021 17:04

andrewazores marked this pull request as ready for review December 8, 2021 17:14

andrewazores requested review from hareetd, ebaron and jan-law December 8, 2021 17:15

jan-law reviewed Dec 9, 2021

src/main/java/io/cryostat/net/reports/ReportsModule.java

src/main/java/io/cryostat/net/reports/SubprocessReportGenerator.java (outdated)

smoketest.sh (outdated)

smoketest.sh (outdated)

andrewazores force-pushed the sidecar-reports branch from cd75c34 to 4a1d574 on December 9, 2021 23:14

hareetd reviewed Dec 9, 2021

src/main/java/io/cryostat/net/reports/ActiveRecordingReportCache.java

hareetd reviewed Dec 10, 2021

src/test/java/io/cryostat/net/reports/ActiveRecordingReportCacheTest.java (outdated)

andrewazores force-pushed the sidecar-reports branch from 4a1d574 to ee97ed6 on December 10, 2021 18:01

hareetd previously approved these changes Dec 10, 2021

jan-law reviewed Dec 10, 2021

src/main/java/io/cryostat/net/reports/RemoteReportGenerator.java (outdated)

hareetd self-requested a review December 10, 2021 20:59

andrewazores force-pushed the sidecar-reports branch from ee97ed6 to d361a9e on December 10, 2021 22:01

This was referenced Dec 10, 2021

V1 RecordingsPostHandler does not delete improperly-named files #784

Closed

Deleting archived recording does not remove its cached report #785

Closed

andrewazores added 22 commits January 13, 2022 14:15

extract some env var names that are referenced in multiple places

186d9e6

apply report generation locking only in subprocess mode

3197ff3

default to subprocess generation if env var unset

664ab88

apply limits to report generator

a9bbcbd

error handling cleanup

6349165

test(health): include reports status in expected output

1b329ca

fixup! default to subprocess generation if env var unset

8c4cae3

apply spotless formatting

67790bd

fix(reports): throw expected exception type when recording not found in target

9cd39ab

fix(reports): attempt to open local file stream before remote recording stream

cf7a60b

suppress spurious warning

b6ee6b3

change cryostat-reports namespace after repo transfer

0ee531c

remove unnecessary comment

f3337f5

use xpath host var

ca1f8f6

fixup! change cryostat-reports namespace after repo transfer

3eae455

remove unused class

c991b65

fix(reports): apply timeout when generating reports to avoid hanging requests

d054763

fix(reports): reports handlers should be unordered

f242afd

request ordering is handled deeper in the stack by the SubprocessReportGenerator synchronizing lock, or is handled by the configured sidecar report generator setup and its potential load balancer
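A minimal sketch of the idea in this commit note, assuming a single flag distinguishes the two modes: generation is serialized behind a lock only when a subprocess is forked, and runs unguarded when a sidecar (and any load balancer in front of it) handles ordering. The class and method names are hypothetical and this is not the PR's actual locking code.

```java
import java.util.concurrent.locks.ReentrantLock;
import java.util.function.Supplier;

// Hypothetical wrapper illustrating "lock only in subprocess mode".
final class GenerationLockSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final boolean subprocessMode;

    GenerationLockSketch(boolean subprocessMode) {
        this.subprocessMode = subprocessMode;
    }

    // Runs the work under the lock in subprocess mode, and directly otherwise.
    <T> T generate(Supplier<T> work) {
        if (!subprocessMode) {
            return work.get(); // the sidecar setup is responsible for ordering
        }
        lock.lock();
        try {
            return work.get(); // at most one subprocess generation at a time
        } finally {
            lock.unlock();
        }
    }

    // True only while the calling thread is inside a locked generate() call.
    boolean isLockHeld() {
        return lock.isHeldByCurrentThread();
    }
}
```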

fix unit tests

e65d0e8

use latest cryostat-reports and configure for small footprint

d9820d3

enable JDP and JMX on cryostat-reports sidecar

5c330fc

allows Cryostat to inspect the report generator for profiling and analysis

andrewazores force-pushed the sidecar-reports branch from d1c22ac to 5c330fc on January 13, 2022 19:16

jan-law approved these changes Jan 13, 2022

hareetd approved these changes Jan 13, 2022

andrewazores merged commit 8347e65 into cryostatio:main Jan 13, 2022

andrewazores deleted the sidecar-reports branch January 13, 2022 21:52
