Mention reproducibility #8

Open
joebonrichie opened this issue Oct 22, 2024 · 2 comments

@joebonrichie
Contributor

Distributions often won't build their packages with PGO because it can hurt reproducibility. This is mainly due to the profiling results potentially differing from run to run for the same workload, which in turn changes the optimized build. This is less of an issue with FDO, as the profile provided is static.
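
As an illustration of what "reproducible" means in practice, here is a minimal sketch of a check that two independently built artifacts are bit-for-bit identical. The function names and the artifact paths in the usage note are hypothetical and not taken from any distribution's tooling; real reproducibility checks usually compare whole packages and normalize metadata first.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large binaries do not need to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_reproducible(artifact_a, artifact_b):
    """True if both build outputs have identical contents."""
    return sha256_of(artifact_a) == sha256_of(artifact_b)
```

For example, `is_reproducible("build1/app", "build2/app")` would compare the results of two PGO builds made from the same sources and the same training workload.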

For distributions like Arch Linux this is especially problematic, as the binary packages are currently compiled on the maintainers' personal machines and then uploaded to the repository server, AFAIU. Therefore the reproducibility requirements are high.

GCC has a flag to mitigate this: https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fprofile-reproducible. E.g. the GCC auto PGO build enables -fprofile-reproducible=parallel-runs by default.
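
To make the flag concrete, here is a rough sketch of the two GCC command lines for an instrumented PGO build with a reproducibility mode set. The options `-fprofile-generate`, `-fprofile-use`, and `-fprofile-reproducible` are documented by GCC; the helper name `gcc_pgo_commands`, the file `app.c`, and the directory `pgo-data` are hypothetical placeholders. Per the GCC docs, the profile is only reproducible if the training workload itself behaves the same on every run.

```python
import shlex

# Values documented for GCC's -fprofile-reproducible option.
REPRODUCIBLE_MODES = ("serial", "parallel-runs", "multithreaded")


def gcc_pgo_commands(source, output, profile_dir="pgo-data", mode="parallel-runs"):
    """Build the two GCC command lines of an instrumented PGO build.

    All file and directory names here are hypothetical placeholders.
    """
    if mode not in REPRODUCIBLE_MODES:
        raise ValueError(f"unknown -fprofile-reproducible mode: {mode!r}")

    # Phase 1: instrumented build; the reproducibility level is chosen here.
    instrument = [
        "gcc", "-O2",
        f"-fprofile-generate={profile_dir}",
        f"-fprofile-reproducible={mode}",
        source, "-o", f"{output}-instrumented",
    ]
    # Phase 2 (after the training run): optimized build that reads the profile.
    optimize = [
        "gcc", "-O2",
        f"-fprofile-use={profile_dir}",
        source, "-o", output,
    ]
    return instrument, optimize


if __name__ == "__main__":
    for cmd in gcc_pgo_commands("app.c", "app"):
        print(shlex.join(cmd))
```

Between the two commands, the instrumented binary is run on the training workload so that the profile data lands in the profile directory.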

I currently don't know if Clang provides any flags that affect profile reproducibility.

Any toolchain resources compiled here mentioning reproducible profiles would be handy.

@zamazan4ik
Owner

Yeah, I know the pain of PGO and reproducibility - it's probably the main blocker for distributions enabling PGO for their packages (as far as I can see from multiple discussions with OS maintainers).

> This is less of an issue with FDO as the profile provided is static.

Could you please clarify what you mean by "FDO" here? Sampling PGO (Clang docs) or something else?

> For distributions like Arch Linux this is especially problematic, as the binary packages are currently compiled on the maintainers' personal machines and then uploaded to the repository server, AFAIU. Therefore the reproducibility requirements are high.

Wow, I didn't know about such a, khm-khm, tricky detail of Arch Linux packaging. I agree that in such cases reproducibility matters even more.

> GCC has a flag to mitigate this: https://gcc.gnu.org/onlinedocs/gcc/Instrumentation-Options.html#index-fprofile-reproducible. E.g. the GCC auto PGO build enables -fprofile-reproducible=parallel-runs by default.

I didn't know about this flag in GCC - thank you! I definitely need to spend more time with the GCC toolchain, huh. Most of my experience is with LLVM, for various reasons (I have nothing against GCC - it's just a personal preference, nothing more).

> I currently don't know if Clang provides any flags that affect profile reproducibility.

That's a good question! I don't know of such a flag or any alternative in LLVM either. Clang has a similar-looking flag, -fprofile-update=<option> (docs), but I don't know whether it guarantees "full" profile reproducibility. I will ask this question on the LLVM forums and update the article accordingly.

> Any toolchain resources compiled here mentioning reproducible profiles would be handy.

Sure! I think I will extend the current "Reproducibility" section in the article with this information.

If you have more information to share about PGO, you are welcome to! I really appreciate contributions on this topic.

@zamazan4ik
Owner

The question about profile reproducibility in LLVM: https://discourse.llvm.org/t/pgo-profile-reproducibility/82861
