Feature Request: Provide explicit timings to the Performance Monitoring API #941

leops · 2024-01-30T13:26:01Z

Exposing a lower-level API for manually creating transactions and spans with explicit timing values (instead of automatically reading the system clock when spans are started or finished) would be useful in two cases: capturing timing information at a higher resolution than the default millisecond, which is often not precise enough for native code, where microseconds or nanoseconds may be needed; and building up a transaction from timing data coming from an external source (e.g. performance counters from the GPU). I could easily open a PR implementing this, but it does raise the question of what the naming convention and general API design for these explicit transaction management functions should be.
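To make the request concrete, here is a minimal sketch of what recording spans with caller-supplied timings could look like. Every name in it (`explicit_tx_t`, `explicit_span_t`, `explicit_tx_add_span`, `explicit_span_duration_us`) is hypothetical and not part of the SDK; it only illustrates the idea of passing start and end timestamps in, rather than having the library read the system clock.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical types for illustration; not part of any existing SDK API. */
#define EXPLICIT_TX_MAX_SPANS 16

typedef struct {
    const char *op;    /* operation name, e.g. "gpu.draw" */
    int parent;        /* index of the parent span, or -1 for a root span */
    uint64_t start_us; /* caller-supplied start, microseconds since the Unix epoch */
    uint64_t end_us;   /* caller-supplied end, same unit */
} explicit_span_t;

typedef struct {
    explicit_span_t spans[EXPLICIT_TX_MAX_SPANS];
    size_t count;
} explicit_tx_t;

/* Adds a span whose timings come from the caller, not from the system clock.
 * Returns the index of the new span, or -1 if the transaction is full or the
 * parent index does not refer to an already recorded span. */
static int
explicit_tx_add_span(explicit_tx_t *tx, const char *op, int parent,
    uint64_t start_us, uint64_t end_us)
{
    assert(end_us >= start_us);
    if (tx->count >= EXPLICIT_TX_MAX_SPANS) {
        return -1;
    }
    if (parent < -1 || parent >= (int)tx->count) {
        return -1;
    }
    explicit_span_t *span = &tx->spans[tx->count];
    span->op = op;
    span->parent = parent;
    span->start_us = start_us;
    span->end_us = end_us;
    return (int)tx->count++;
}

/* Duration of a recorded span in microseconds. */
static uint64_t
explicit_span_duration_us(const explicit_span_t *span)
{
    return span->end_us - span->start_us;
}
```

A caller would add a root span first and then pass its index as `parent` for child spans, which preserves the hierarchy without the library ever touching a clock.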

supervacuus · 2024-01-31T12:24:04Z

Hi @leops, as far as I know, microseconds are the current limit in the backend for performance/tracing events. Anything smaller than this would be truncated (or rounded), so opening the API to arbitrary scales doesn't make much sense (without considerable changes in the backend, and that discussion would have to happen in another issue tracker). Changing our timestamping to microseconds might be sensible, though.
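As an illustration of what microsecond timestamping involves, the sketch below converts a C11 `struct timespec` to whole microseconds, truncating anything finer. The helper names (`timespec_to_us`, `now_us`, `format_us`) are made up for this example, and the seconds-with-six-decimals text format is an assumption, not necessarily what the SDK or the backend uses on the wire.

```c
#include <assert.h>
#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

/* Converts a timespec to whole microseconds since the epoch.
 * The sub-microsecond remainder of tv_nsec is truncated, not rounded. */
static uint64_t
timespec_to_us(const struct timespec *ts)
{
    assert(ts->tv_nsec >= 0 && ts->tv_nsec < 1000000000L);
    return (uint64_t)ts->tv_sec * 1000000u + (uint64_t)ts->tv_nsec / 1000u;
}

/* Current wall-clock time in microseconds, using C11 timespec_get.
 * Returns 0 if the clock cannot be read. */
static uint64_t
now_us(void)
{
    struct timespec ts;
    if (timespec_get(&ts, TIME_UTC) != TIME_UTC) {
        return 0;
    }
    return timespec_to_us(&ts);
}

/* Writes the timestamp as seconds with six fractional digits,
 * e.g. "1706621161.000123". Returns the snprintf result. */
static int
format_us(uint64_t us, char *buf, size_t len)
{
    return snprintf(buf, len, "%" PRIu64 ".%06" PRIu64,
        us / 1000000u, us % 1000000u);
}
```

Keeping the value as an integer count of microseconds avoids the precision loss that a `double` of seconds would eventually introduce.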

I am unsure whether the tracing API is the proper interface for instrumentation at that scale level because we can gather a maximum of 1000 spans before a transaction must flush. Transaction flushing is a costly operation (the transport happens in a backend thread, but preparing the envelope for sending happens in the calling thread and includes serialization, something you might not want to have in a tight loop).
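One way to keep serialization out of a tight loop is to collect spans into a bounded buffer and let the caller decide when and where to flush. The sketch below is only an assumption about how such a buffer might look: `span_buffer_t` and `span_buffer_push` are hypothetical names, and the only figure taken from this thread is the limit of 1000 spans.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_SPANS 1000 /* span limit per transaction quoted in this thread */

typedef struct {
    uint64_t start_us;
    uint64_t end_us;
} timed_span_t;

typedef struct {
    timed_span_t spans[MAX_SPANS];
    size_t count;   /* spans currently stored */
    size_t dropped; /* spans rejected because the buffer was full */
} span_buffer_t;

/* Appends a span. Once the limit is reached it returns false and counts the
 * drop instead of flushing inline, so the hot path never pays for
 * serialization. */
static bool
span_buffer_push(span_buffer_t *buf, uint64_t start_us, uint64_t end_us)
{
    assert(end_us >= start_us);
    if (buf->count >= MAX_SPANS) {
        buf->dropped++;
        return false;
    }
    buf->spans[buf->count].start_us = start_us;
    buf->spans[buf->count].end_us = end_us;
    buf->count++;
    return true;
}
```

Whether a full buffer should drop spans, as here, or hand the buffer to another thread for flushing is a design choice this sketch leaves open.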

This sounds like the perfect use case for developer metrics, where you have client-side aggregation of arbitrary measurements (you can use it for profiling code at that low level) over time. This is implemented only in the backend and in the Python SDK (?) in a very early phase and only rolled out to some customers. However, a discussion regarding the needs of Native SDK users might already make sense.

CC: @kahest.

leops · 2024-02-01T08:47:52Z

Setting the resolution limit to microseconds sounds sensible; I think collecting timings at nanosecond precision is bound to have a lot of measurement noise for one-shot captures anyway. The cost of transaction flushing doesn't sound like that much of a problem, though, as an explicit API would allow the envelope to be built up asynchronously in a background thread. (This would necessarily be the case for collecting GPU timings, for instance, with timestamp queries being processed asynchronously on the CPU between frames.) I agree, however, that this kind of high-resolution transaction should still avoid getting anywhere near the 1000-span limit and only keep track of operations at a macroscopic level within that scope, since performing too many measurements would once again add a lot of noise and overhead. I think the metrics feature is potentially interesting, as would profiling be, but there are still use cases where you want a clear hierarchical view of what's going on when a given operation is slower than expected, and profiling isn't supported in the Native SDK (and, once again, even less so on GPUs).

supervacuus · 2024-02-02T17:35:28Z

Hi @leops. That makes a lot of sense. So, to quickly summarize:

- microseconds would be a fine enough resolution for now
- you'd prefer to have hierarchical tracing rather than generic metrics (the latter aren't currently implemented anyway, but even if they were, performance/tracing would still be the better match for your use case)
- you would run the transaction flush asynchronously. I mainly mention this because people use the tracing API synchronously in screen-refresh loops or UI threads, which is different from the scenario for which it was initially implemented.
- if I got this right, you would still prefer to pass explicit timestamps or construct the span brackets yourself rather than defining them explicitly in the code, not only due to different timing resolutions but also because the measurements you want to take are not expressible in the host code. To be clear, you would still need to align these measurements with our wall clock to make them interpretable in traces.
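The wall-clock alignment mentioned in the last point can be sketched as follows, assuming the external source (for example a GPU timestamp query) reports ticks at a known, constant period and that one calibration sample pairs a tick value with a wall-clock reading taken at the same instant. `clock_calibration_t` and `device_ticks_to_wall_us` are hypothetical names, and clock drift is ignored.

```c
#include <assert.h>
#include <stdint.h>

/* One calibration sample: the same instant read on both clocks. */
typedef struct {
    uint64_t device_ticks; /* external counter value, e.g. a GPU timestamp */
    uint64_t wall_us;      /* wall clock, microseconds since the Unix epoch */
    double ns_per_tick;    /* length of one device tick in nanoseconds */
} clock_calibration_t;

/* Maps a device tick value onto the wall clock, in microseconds.
 * Handles ticks taken before or after the calibration sample; the fractional
 * part of a microsecond is truncated. */
static uint64_t
device_ticks_to_wall_us(const clock_calibration_t *cal, uint64_t ticks)
{
    assert(cal->ns_per_tick > 0.0);
    if (ticks >= cal->device_ticks) {
        double delta_us
            = (double)(ticks - cal->device_ticks) * cal->ns_per_tick / 1000.0;
        return cal->wall_us + (uint64_t)delta_us;
    }
    double delta_us
        = (double)(cal->device_ticks - ticks) * cal->ns_per_tick / 1000.0;
    return cal->wall_us - (uint64_t)delta_us;
}
```

In practice the calibration sample would need to be refreshed periodically, since the two clocks can drift apart over a long-running session.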

Do you think this makes sense? I am asking to understand what kind of interface would be sufficient.

leops · 2024-02-05T13:33:43Z

Yes, I think this matches what I had in mind.

supervacuus added enhancement New feature or request area: api area: core labels Jan 31, 2024

supervacuus removed the Waiting for: Product Owner label Feb 7, 2024

kahest moved this from Needs Discussion to Backlog in Mobile & Cross Platform SDK Mar 7, 2024

supervacuus mentioned this issue May 24, 2024: feat: change the timestamp resolution to microseconds #995 (Merged)

kahest added the Platform: Native label Aug 29, 2024
