Get Sentry to use the debug symbols that are compiled in to the executable? #938

Open
jfriesne opened this issue Jan 17, 2024 · 11 comments

@jfriesne

Environment

SaaS (https://sentry.io/)

What are you trying to accomplish?

I'm evaluating Sentry-native as a possible replacement for an existing (hand-crafted) crash-handling mechanism in my C++/Qt app.

Under Windows, my crash-handler implementation would use _try/_except to catch the crash, and call MiniDumpWriteDump() to write a file containing a stack trace (etc) to the user's local disk. We would then have to rely on the user to send the file to us for analysis.
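For readers unfamiliar with that Windows mechanism, here is a minimal sketch of it. This is not the actual handler from my app: the names `write_minidump` and `run_guarded` and the output file `crash.dmp` are hypothetical, and it assumes MSVC (using the `__try`/`__except` spellings) and linking against Dbghelp. The non-Windows branch is only a pass-through so the snippet compiles elsewhere.

```cpp
#ifdef _WIN32
#include <windows.h>
#include <dbghelp.h>
#pragma comment(lib, "Dbghelp.lib")  // MiniDumpWriteDump is exported by Dbghelp

// SEH filter: writes a minidump of the current process, then tells SEH to
// run the __except block. "crash.dmp" is a hypothetical output path.
static LONG write_minidump(EXCEPTION_POINTERS* info) {
    HANDLE file = CreateFileW(L"crash.dmp", GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file != INVALID_HANDLE_VALUE) {
        MINIDUMP_EXCEPTION_INFORMATION mei;
        mei.ThreadId = GetCurrentThreadId();
        mei.ExceptionPointers = info;
        mei.ClientPointers = FALSE;
        MiniDumpWriteDump(GetCurrentProcess(), GetCurrentProcessId(), file,
                          MiniDumpNormal, &mei, nullptr, nullptr);
        CloseHandle(file);
    }
    return EXCEPTION_EXECUTE_HANDLER;
}

// Runs app_main under an SEH guard; returns 1 if a crash was caught.
int run_guarded(int (*app_main)()) {
    __try {
        return app_main();
    } __except (write_minidump(GetExceptionInformation())) {
        return 1;
    }
}
#else
// Pass-through on other platforms (no SEH, no minidump).
int run_guarded(int (*app_main)()) { return app_main(); }
#endif
```

The resulting `.dmp` file still has to reach the developers somehow, which is the manual step a crash-reporting service would replace.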

Under MacOS/X, we just relied on the default MacOS crash-handler, which would do something similar (and present a nice crash-dialog containing the stack trace, etc.).

In order for the generated stack traces to be easily readable, we shipped our executables with debug-symbols included. However, it seems that Sentry ignores the debug-symbols that are built in to the executable, and wants us to upload debug-symbols separately instead.

My question is, is there any way to convince Sentry to use/upload the debug-symbols that are present in the crashed executable file? That would be easier for us to work with than having to store and track separate symbols-files for every build. (I'm aware that it makes our executables larger, and potentially easier to reverse-engineer, and that's okay)

Thanks,
Jeremy

How are you getting stuck?

Not sure if Sentry supports the behavior I'd like to have, or not.

Where in the product are you?

Issues - Source Maps

Link

https://sentry.io/organizations/meyer-sound-b5217be22/projects/cuestation/?project=4506146063122432

DSN

https://3f24d5875acfd318421a573605556030@o4505915185823744.ingest.sentry.io/4506146063122432

Version

No response

@kerenkhatiwada transferred this issue from getsentry/sentry on Jan 17, 2024
@supervacuus
Collaborator

With our crashpad fork, you can enable client stack traces in the build (pass -DCRASHPAD_ENABLE_STACKTRACE=On at the CMake configuration stage) when using the crashpad backend. As mentioned in the readme, the feature is considered experimental, but it should serve your use case on macOS, Linux, and Windows.

This will only attach a symbolicated stack trace to the uploaded minidump in your crash event. That means it will have no line numbers (without uploading corresponding debug files).
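To illustrate why such a trace carries function names but no line numbers: resolving a frame against a symbol table only finds the nearest symbol at or below the instruction address. The sketch below is a simplified model with hypothetical types and names (`Symbol`, `ResolvedFrame`, `resolve`); it is not the crashpad or libunwind implementation.

```cpp
#include <algorithm>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified symbol-table entry: start address and name.
struct Symbol {
    std::uint64_t start;
    std::string name;
};

// Result of resolving one instruction address.
struct ResolvedFrame {
    bool found;            // false if the address precedes every symbol
    std::string name;      // name of the enclosing symbol
    std::uint64_t offset;  // byte offset of the address within that symbol
};

// Finds the symbol with the greatest start address <= addr.
// `symbols` must be sorted by ascending start address.
ResolvedFrame resolve(const std::vector<Symbol>& symbols, std::uint64_t addr) {
    std::vector<Symbol>::const_iterator it = std::upper_bound(
        symbols.begin(), symbols.end(), addr,
        [](std::uint64_t a, const Symbol& s) { return a < s.start; });
    if (it == symbols.begin()) {
        return ResolvedFrame{false, std::string(), 0};
    }
    --it;  // step back to the last symbol starting at or before addr
    return ResolvedFrame{true, it->name, addr - it->start};
}
```

The lookup yields something like "function + offset"; turning that offset into a file and line needs the line information in the debug files.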

@jfriesne
Author

Thanks for the quick and helpful response! I've added the -DCRASHPAD_ENABLE_STACKTRACE=On flag to my cmake line, and I see that the Sentry report now contains a stack trace with some function names.

However, at least on MacOS, Sentry's stack trace appears to be inaccurate -- a few of the entries look correct, but the majority of them are wrong or missing. As an example, see the attached screenshot comparing the stack trace generated by MacOS's built-in crash reporter vs the Sentry stack trace, both for the same fault.

Is that a known issue, or is there some other step I might need to take in order to get accurate stack traces from Sentry this way?

Thanks,
Jeremy

[Screenshot: sentry_stack_trace_comparison]

@supervacuus
Collaborator

> However, at least on MacOS, Sentry's stack trace appears to be inaccurate -- a few of the entries look correct, but the majority of them are wrong or missing. As an example, see the attached screenshot comparing the stack trace generated by MacOS's built-in crash reporter vs the Sentry stack trace, both for the same fault.
>
> Is that a known issue, or is there some other step I might need to take in order to get accurate stack traces from Sentry this way?

No, this is not a known issue. It seems like the first frame is already corrupted. I would expect the stack trace in a Qt application to look like this:

[Screenshot: grafik]

which I created from a test Qt project with 0.7.0 on macOS.

Can you provide more information on your setup (architecture, macOS version, etc.) or maybe send a link to the event to karl.struggl@sentry.io so we can inspect the event further?

cc @Swatinem, have you seen something like this when using libunwind on macOS? It seems like the symbol lookup is corrupted; could this be from a stripped binary?

@jfriesne
Author

AFAIK the binary is not stripped; e.g. with Sentry disabled, the MacOS stack reporter shows a stack trace with the expected human-readable function names. (Also if I do run strip on the executable, I see that the executable size is significantly reduced, which suggests to me that there was debug information in it beforehand)

FWIW I have attached a file containing a couple of lines of output from my app's build log; one for compiling a .cpp file, and the second line is the link line. Perhaps there is a flag being used there that Sentry wasn't expecting?

compile_and_link_lines.txt

@supervacuus
Collaborator

Thanks for the update and the build parameters. I don't see any parameters that should cause an issue, but this feature uses a curious subset of the LLVM-distributed libunwind. It is curious because the "remote unwinding" feature was explicitly removed from LLVM's libunwind and we use an implementation from an older version, so this is largely unmaintained code, even upstream.

It is also hard to debug the issue without access to the concrete artifacts. Could you produce a more minimal repro that shows the same behavior?

@Swatinem, could you take a quick look and see whether anything in the stack trace, the event, or the build parameters immediately comes to mind regarding the remote unwind feature we're using?

@jfriesne
Author

@supervacuus I spent about 3 hours playing "how about now?" with my app and Sentry (removing and adding code to see what would trigger the fault and what would not) and unfortunately I wasn't able to come up with a clear cause-and-effect scenario that I could use to form a minimal reproducible case. The problem seems to occur whenever certain symbols are present in the executable, even if those functions are never executed and are implemented as no-ops; however, I don't think it's anything special about those symbols; more likely their presence just happens to tickle the fault in the right way. My suspicion is that newer Apple tools have modified or extended their object-code format in some way that the older version of libunwind wasn't expecting, and that is distorting libunwind's results.

@supervacuus
Collaborator

> I spent about 3 hours playing "how about now?" with my app and Sentry (removing and adding code to see what would trigger the fault and what would not) and unfortunately I wasn't able to come up with a clear cause-and-effect scenario that I could use to form a minimal reproducible case.

Thanks for the effort, @jfriesne!

> My suspicion is that newer Apple tools have modified or extended their object-code format in some way that the older version of libunwind wasn't expecting, and that is distorting libunwind's results.

That might be the case, although I wonder why I can get a relatively sensible stack trace on my M1 with a similar toolchain and target-os version. I could imagine two things that you can still try out:

  • turning off optimization (compiling with -O0)
  • changing -mmacosx-version-min=10.13 to something more recent (e.g., the same version as your build-machine)

I guess you are not using Rosetta 2 and are running the x86-64 program on an Intel Mac, right?
Are you manually (i.e., via dlopen/dlsym) loading any of the libraries at runtime?
Can you send karl.struggl@sentry.io a minidump (which you can find in the database-path under the completed directory)? This way we can ensure that the backend reproduces the client stack trace and that nothing happens in between.

@jfriesne
Author

@supervacuus yes, I am compiling/running on an Intel Mac Mini running Sonoma 14.1.2 (with Xcode 15.0.1). I've zipped up my database folder (including just one crash report with the fault) and attached it.

I recompiled with -O0 to see if that would make a difference; with that build the stack trace is a bit different, but still wrong (e.g. as shown in the attached screenshot, with the expected stack trace on the left and Sentry's reported stack trace on the right).

I then also removed the -mmacosx-version-min=10.13 argument from the Makefile and recompiled; that yielded a different stack trace, but still wrong in the same general way (see second attached screenshot).

jaf_sentry_crash_log.zip

[Screenshots: two attached images]

@supervacuus
Collaborator

Thanks for checking @jfriesne! I will check the dump tomorrow.

@supervacuus
Collaborator

I am very sorry for the delay @jfriesne, there is definitely something wrong.

I had a look into that minidump you posted and the symbols are similarly corrupted (and not demangled, since that happens in the backend). I wanted to be sure that there is no corruption in the streamwriter or in the processing of the minidump in the backend. It seems the issue is solely in the remote unwinder we added to the snapshotting in the crashpad_handler.
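For reference, demangling turns a compiler-encoded symbol back into a readable C++ signature. Below is a minimal sketch using `abi::__cxa_demangle` from the Itanium C++ ABI runtime (available with GCC and Clang); the wrapper name `demangle` is hypothetical, and this is not how the Sentry backend does it.

```cpp
#include <cxxabi.h>

#include <cstdlib>
#include <string>

// Returns the demangled form of an Itanium-ABI mangled name, or the input
// unchanged if the runtime cannot demangle it.
std::string demangle(const std::string& mangled) {
    int status = 0;
    // With null buffer arguments, __cxa_demangle allocates the result with
    // malloc; the caller must free it.
    char* out = abi::__cxa_demangle(mangled.c_str(), nullptr, nullptr, &status);
    if (status != 0 || out == nullptr) {
        return mangled;
    }
    std::string result(out);
    std::free(out);
    return result;
}
```

For example, `demangle("_ZN3foo3barEv")` yields `foo::bar()`.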

I have some good news, though: interestingly, it seems that frames from libqcocoa and QtWidgets always produce legible symbols, while QtCore and QtGui do not (in addition to the frames from your module CueStation 8). I looked at the stack trace from my own testing, and it seems that frames from the latter modules also have issues there. That means I can at least partially reproduce the issue locally and try to figure out what leads to the failed symbol lookup.

Last but not least, I cannot guarantee you that we can immediately focus on this issue since it is an experimental feature, but I will keep you in the loop. Thanks for your understanding.

cc: @kahest sync agenda

@supervacuus added the "bug" and "backend: crashpad" labels and removed the "question" label on Feb 7, 2024
@kahest moved this from Needs Discussion to Needs Investigation in Mobile & Cross Platform SDK on Feb 15, 2024
@kahest changed the title from "Is there a way to get Sentry to use the debug symbols that are compiled in to the executable?" to "Get Sentry to use the debug symbols that are compiled in to the executable?" on Jun 13, 2024