Failed to locate managed application c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 #109611

Open
omajid opened this issue Nov 7, 2024 · 18 comments

@omajid
Member

omajid commented Nov 7, 2024

Description

I am seeing variants of this error on Fedora ELN and CentOS Stream 10, with .NET 8 and .NET 9.

With .NET 8, I see this when building the VMR:

             ILCompiler.Diagnostics -> dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/ILCompiler.Diagnostics/x64/Release/ILCompiler.Diagnostics.dll
             ILCompiler.ReadyToRun -> dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/ILCompiler.ReadyToRun/x64/Release/ILCompiler.ReadyToRun.dll
             crossgen2_publish -> dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/crossgen2.dll
             Optimizing assemblies for size. This process might take a while.
             crossgen2_publish -> dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/
             Microsoft.NETCore.App.Crossgen2 -> 
             Failed to locate managed application [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2]
           dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/installer/pkg/sfx/Microsoft.NETCore.App/Microsoft.NETCore.App.Crossgen2.sfxproj(67,5): error MSB3073: The command "dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/crossgen2 dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll --out dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net8.0/fedora.42-x64/S.P.C.tmp" exited with code 146.

Trying to trace this shows:

COREHOST_TRACE=1 dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/crossgen2 dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll --out dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net8.0/fedora.42-x64/S.P.C.tmp
Tracing enabled @ Thu Nov  7 13:41:40 2024 GMT
--- Invoked apphost [version: 8.0.10 @Commit: 81cabf2857a01351e5ab578947c7403a5b128ad1] main = {
dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/crossgen2
dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll
--out
dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net8.0/fedora.42-x64/S.P.C.tmp
}
The managed DLL bound to this executable is: 'c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2'
Detected Single-File app bundle
Using internal fxr
Invoking fx resolver [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/] hostfxr_main_bundle_startupinfo
Host path: [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/crossgen2]
Dotnet path: [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/]
App path: [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2]
Bundle Header Offset: [99398693]
--- Invoked hostfxr_main_bundle_startupinfo [version: 8.0.10 @Commit: 81cabf2857a01351e5ab578947c7403a5b128ad1]
Mapped application bundle
Unmapped application bundle
Single-File bundle details:
DepsJson Offset:[5ec8f80] Size[24a5]
RuntimeConfigJson Offset:[4f4f5a8] Size[5b4]
.net core 3 compatibility mode: [No]
--- Executing in a native executable mode...
Using dotnet root path [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/]
App runtimeconfig.json from [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2]
Runtime config is cfg=dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.json dev=dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.dev.json
Attempting to read dev runtime config: dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.dev.json
Attempting to read runtime config: dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.json
Mapped bundle for [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.json]
Unmapped application bundle
Runtime config [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.json] is valid=[1]
Executing as a self-contained app as per config file [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2.runtimeconfig.json]
Using internal hostpolicy
Reading from host interface version: [0x16041101:248] to initialize policy version: [0x16041101:248]
Mapped application bundle
Unmapped application bundle
--- Invoked hostpolicy [version: 8.0.10 @Commit: 81cabf2857a01351e5ab578947c7403a5b128ad1] corehost_main = {
dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/crossgen2
dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/IL/System.Private.CoreLib.dll
--out
dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Crossgen2/Release/net8.0/fedora.42-x64/S.P.C.tmp
}
Mode: apphost
Deps file: 
Managed application [c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2] not found in single-file bundle
Failed to locate managed application [dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2]

On CentOS Stream 10, the VMR is built successfully, but the produced SDK fails to run with similar symptoms.

Reproduction Steps

On Fedora:

$ fedpkg clone -a dotnet8.0
$ cd dotnet8.0
$ fedpkg --namespace rpms --name dotnet8.0 --release eln mockbuild --no-cleanup-after

Expected behavior

.NET builds

Actual behavior

VMR fails to build, or the generated SDK is broken.

Regression?

Yes.

I don't see this behaviour on Fedora 41, or CentOS Stream 8 or 9.

Known Workarounds

No response

Configuration

$ clang --version
clang version 19.1.0 (Fedora 19.1.0-1.eln143)
Target: x86_64-redhat-linux-gnu
Thread model: posix
InstalledDir: /usr/bin
Configuration file: /etc/clang/x86_64-redhat-linux-gnu-clang.cfg

Other information

I had to port #109198 first, to build against clang 19. But clang 19 is also used on Fedora 41, and that doesn't show the same issue.

Contributor

Tagging subscribers to this area: @vitek-karas, @agocke, @VSadov
See info in area-owners.md if you want to be subscribed.

@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Nov 7, 2024
@elinor-fung
Member

The managed DLL bound to this executable is: 'c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2'

That is the placeholder that should get replaced by the SDK during build:

#define EMBED_HASH_HI_PART_UTF8 "c3ab8ff13720e8ad9047dd39466b3c89" // SHA-256 of "foobar" in UTF-8
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"

// Re-write the destination apphost with the proper contents.
BinaryUtils.SearchAndReplace(accessor, AppBinaryPathPlaceholderSearchValue, appPathBytes);

We do explicitly check whether the placeholder value is the same though, so I am confused as to why we don't hit this check:

// Check if the value is the same as the placeholder
// Since the single static string is replaced by editing the executable, a reference string is needed to do the compare.
// So use two parts of the string that will be unaffected by the edit.
size_t hi_len = (sizeof(hi_part) / sizeof(hi_part[0])) - 1;
size_t lo_len = (sizeof(lo_part) / sizeof(lo_part[0])) - 1;
if (binding.size() >= (hi_len + lo_len)
&& binding.compare(0, hi_len, &hi_part[0]) == 0
&& binding.compare(hi_len, lo_len, &lo_part[0]) == 0)
{
trace::error(_X("This executable is not bound to a managed DLL to execute. The binding value is: '%s'"), app_dll->c_str());
return false;
}

It seems like on those platforms, the crossgen2 executable is somehow (reported as) successfully produced, but without actually updating the placeholder? Do you have a binlog for the build/publish of crossgen2 itself?
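For readers following along, the placeholder check quoted above can be mirrored in a few lines of Python. This is only an illustrative sketch, not the host's code: `is_bound` is a hypothetical name, and the two constants are copied from the `#define` lines quoted earlier. The demo also confirms the comment that the placeholder is the SHA-256 of "foobar".

```python
import hashlib

# The two halves of the placeholder, as in the #define lines quoted above.
HI_PART = b"c3ab8ff13720e8ad9047dd39466b3c89"
LO_PART = b"74e592c2fa383d4a3960714caef0c4f2"


def is_bound(binding: bytes) -> bool:
    """Return False when the embedded value still starts with the placeholder.

    Hypothetical helper mirroring the C++ check: the reference value is kept
    as two halves so the comparison does not depend on the single embedded
    string that gets rewritten in the executable.
    """
    still_placeholder = (
        len(binding) >= len(HI_PART) + len(LO_PART)
        and binding[: len(HI_PART)] == HI_PART
        and binding[len(HI_PART) : len(HI_PART) + len(LO_PART)] == LO_PART
    )
    return not still_placeholder


if __name__ == "__main__":
    placeholder = hashlib.sha256(b"foobar").hexdigest().encode("ascii")
    print(placeholder == HI_PART + LO_PART)  # placeholder is SHA-256("foobar")
    print(is_bound(placeholder))             # unmodified host: not bound
    print(is_bound(b"crossgen2.dll"))        # rewritten host: bound
```

In the failing trace the host reports the placeholder as the bound DLL and still proceeds, which is what makes the behaviour surprising: a check of this shape would be expected to reject that value.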

@omajid
Member Author

omajid commented Nov 20, 2024

It seems like on those platforms, the crossgen2 executable is somehow (reported as) successfully produced, but without actually updating the placeholder?

Yes, I think so. This is a VMR build, so looking for the placeholder value shows apphost and singlefilehost (which are expected), and also crossgen2 (unexpected). There's also /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/crossgen2_publish/x64/Release/singlefilehost which, if I am reading the binlog right, is the intermediate (?) apphost for crossgen2.

# grep -rF c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Host/Release/net8.0/fedora.42-x64/output/packs/Microsoft.NETCore.App.Host.fedora.42-x64/8.0.10/runtimes/fedora.42-x64/native/apphost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/Microsoft.NETCore.App.Host/Release/net8.0/fedora.42-x64/output/packs/Microsoft.NETCore.App.Host.fedora.42-x64/8.0.10/runtimes/fedora.42-x64/native/singlefilehost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/crossgen2_publish/x64/Release/singlefilehost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/singlefilehost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/fedora.42-x64.Release/apphost/standalone/CMakeFiles/apphost.dir/__/__/corehost.cpp.o: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/fedora.42-x64.Release/apphost/standalone/apphost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/corehost/singlefilehost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/fedora.42-x64.Release/corehost/singlefilehost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/fedora.42-x64.Release/corehost/apphost: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/Microsoft.NETCore.App.Crossgen2/Release/net8.0/fedora.42-x64/crossgen2: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/crossgen2: binary file matches
grep: ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/crossgen2_publish/x64/Release/fedora.42-x64/publish/crossgen2: binary file matches

Do you have a binlog for the build/publish of crossgen2 itself?

Here's the binlog of the VMR. The VMR produces this broken crossgen2 that then fails to run: https://people.redhat.com/~omajid/scratch/crossgen2-placeholder-sourcebuild.binlog

This was built using an SDK that was itself source-built. That SDK can build and run a new console app without problems: mkdir foobar && cd foobar && dotnet new console && dotnet run && bin/Debug/net8.0/foobar

@agocke agocke added this to the 10.0.0 milestone Nov 21, 2024
@agocke agocke removed the untriaged New issue has not been triaged by the area owner label Nov 21, 2024
@elinor-fung
Member

I can't open that binlog for some reason. I get System.IO.InvalidDataException: Found invalid data while decoding when trying to open it in the latest MSBuild log viewer.

There's also /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/crossgen2_publish/x64/Release/singlefilehost which, if I am reading the binlog right, is the intermediate (?) apphost for crossgen2.

Yes, that is the intermediate apphost for crossgen2. That is what should have the app name instead of the placeholder in the binary (it just gets copied to the output directory). In the correct/working case, we should have already updated the bytes in our memory mapped view of the original apphost before writing it to that obj/ location:

// Transform the host file in-memory.
RewriteAppHost(memoryMappedViewAccessor);
// Save the transformed host.
using (FileStream fileStream = new FileStream(appHostDestinationFilePath, FileMode.Create))

I would expect some failure or exception to be thrown if we failed to find/update the placeholder in the memory mapped view, though.

Is there anything different with access/permissions for the crossgen2 intermediate output (./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/crossgen2_publish/x64/Release/) versus other folders?

Does the source-built SDK itself produce a self-contained single-file app (dotnet publish --sc /p:PublishSelfContained=true) that runs fine?

I don't see this behaviour on Fedora 41, or CentOS Stream 8 or 9.

I wonder if something changed in how files are memory mapped that breaks what we are doing. The sequence is 'memory map the original, search/replace in a private memory mapped view, then write to output', and the memory mapping step seems like the part that might be affected by an OS upgrade.
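To make that sequence concrete, here is a minimal Python sketch of the same three steps. It is an illustration under stated assumptions, not the SDK's HostWriter: `create_app_host` is a hypothetical function, and the NUL padding is modelled on the `Pad0` call visible in the BinaryUtils diff later in this thread. Python's `mmap.ACCESS_COPY` stands in for the private (copy-on-write) view.

```python
import mmap
import os
import tempfile

PLACEHOLDER = b"c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2"


def create_app_host(src: str, dst: str, app_name: bytes) -> None:
    """Write a copy of src to dst with the first placeholder replaced."""
    if len(app_name) > len(PLACEHOLDER):
        raise ValueError("app name is longer than the placeholder")
    with open(src, "rb") as f:
        # ACCESS_COPY gives a private, copy-on-write view: edits stay in
        # memory and never reach the source file.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_COPY) as view:
            pos = view.find(PLACEHOLDER)
            if pos < 0:
                raise ValueError("placeholder not found")
            # Overwrite the placeholder, padding the remainder with NULs.
            padded = app_name.ljust(len(PLACEHOLDER), b"\0")
            view[pos : pos + len(PLACEHOLDER)] = padded
            # Save the transformed bytes to the destination file.
            with open(dst, "wb") as out:
                out.write(view[:])


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "singlefilehost")
        dst = os.path.join(tmp, "app")
        with open(src, "wb") as f:
            f.write(b"stand-in-host:" + PLACEHOLDER + b"\0tail")
        create_app_host(src, dst, b"crossgen2.dll")
        with open(src, "rb") as f:
            print(PLACEHOLDER in f.read())  # source file is left untouched
        with open(dst, "rb") as f:
            print(PLACEHOLDER in f.read())  # destination was rewritten
```

Note that this sketch, like the sequence described above, only replaces the first match it finds.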

@omajid
Member Author

omajid commented Nov 25, 2024

I can't open that binlog for some reason

Sorry about that. That server seems to mess with compression in the HTTP response, and since binlogs are compressed files, the decompression breaks the binlog.

Can you try this version: https://people.redhat.com/~omajid/scratch/crossgen2-placeholder-sourcebuild.binlog.tar.gz ? It should extract to a sourcebuild.binlog with a sha256sum of 00ae8439efa285a6edfc93bfb007947b3562691137214f5d9cfe6185356062fa

Does the source-built SDK itself produce a self-contained single-file app (dotnet publish --sc /p:PublishSelfContained=true) that runs fine?

Yes:

# dotnet new console
The template "Console App" was created successfully.

Processing post-creation actions...
Restoring /builddir/hello/hello.csproj:
  Determining projects to restore...
  Restored /builddir/hello/hello.csproj (in 123 ms).
Restore succeeded.

# dotnet publish --sc /p:PublishSelfContained=true
MSBuild version 17.8.5+b5265ef37 for .NET
  Determining projects to restore...
  Restored /builddir/hello/hello.csproj (in 140 ms).
  hello -> /builddir/hello/bin/Release/net8.0/linux-x64/hello.dll
  hello -> /builddir/hello/bin/Release/net8.0/linux-x64/publish/
# ./bin/Release/net8.0/linux-x64/publish/hello 
Hello, World!

I was able to narrow it to something to do with the singlefile host itself. In a dotnet new console project, I could reproduce the problem with this project file:

<Project Sdk="Microsoft.NET.Sdk">

  <PropertyGroup>
    <OutputType>Exe</OutputType>
    <TargetFramework>net8.0</TargetFramework>
    <ImplicitUsings>enable</ImplicitUsings>
    <Nullable>enable</Nullable>
  </PropertyGroup>

  <UsingTask TaskName="CreateAppHost" AssemblyFile="/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/previously-built-dotnet/sdk/8.0.108/Sdks/Microsoft.NET.Sdk/targets/../tools/net8.0/Microsoft.NET.Build.Tasks.dll" />

  <Target Name="Hack" AfterTargets="Build">
    <!-- Produces broken applications
         AppHostSourcePath="/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/corehost/singlefilehost"
    -->
    <!-- Produces working executable
         AppHostSourcePath="/usr/lib64/dotnet/packs/Microsoft.NETCore.App.Host.fedora.42-x64/8.0.8/runtimes/fedora.42-x64/native/singlefilehost"
    -->
    <CreateAppHost
        AppHostSourcePath="singlefilehost"
        AppHostDestinationPath="app"
        AppBinaryName="app"
        IntermediateAssembly="crossgen2.dll"
        />
  </Target>

</Project>

Building this with the apphost from the current build of the VMR produces a broken application:

# cp /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/corehost/singlefilehost .
# ls -alh singlefilehost 
-rwxr-xr-x. 1 root root 80M Nov 25 12:03 singlefilehost
# id
uid=0(root) gid=0(root) groups=0(root)
# ls
Program.cs  app  bin  crossgen2.dll  foobar.csproj  obj  singlefilehost
# grep c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 singlefilehost 
grep: singlefilehost: binary file matches
# rm -rf bin obj app
# dotnet build
MSBuild version 17.8.5+b5265ef37 for .NET
  Determining projects to restore...
  Restored /builddir/foobar/foobar.csproj (in 354 ms).
  foobar -> /builddir/foobar/bin/Debug/net8.0/foobar.dll

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:00:02.36
# grep c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 app 
grep: app: binary file matches

But using the singlefilehost from the SDK itself produces a working application:

# rm -rf bin obj app
# cp /usr/lib64/dotnet/packs/Microsoft.NETCore.App.Host.fedora.42-x64/8.0.8/runtimes/fedora.42-x64/native/singlefilehost .
# ls
Program.cs  crossgen2.dll  foobar.csproj  singlefilehost
# dotnet build
MSBuild version 17.8.5+b5265ef37 for .NET
  Determining projects to restore...
  Restored /builddir/foobar/foobar.csproj (in 120 ms).
  foobar -> /builddir/foobar/bin/Debug/net8.0/foobar.dll

Build succeeded.
    0 Warning(s)
    0 Error(s)

Time Elapsed 00:00:00.90
# grep c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 app 
# 

@elinor-fung
Member

Thanks - I am able to open it now.

I was able to narrow it to something to do with the singlefile host itself

Oh, that is definitely interesting (also confusing). So same Microsoft.NET.Build.Tasks.dll (and Microsoft.NET.HostModel.dll) from the SDK itself, but the current VMR singlefilehost versus the SDK singlefilehost is the difference.

From the binlog, crossgen2 is the only one using the current VMR singlefilehost (ilc uses the one from the previously built SDK: /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/previously-built-dotnet/packs/Microsoft.NETCore.App.Host.fedora.42-x64/8.0.8/runtimes/fedora.42-x64/native/singlefilehost), so it is the only one affected here.

Would you be able to share the broken/working singlefilehosts from the current VMR build/previously source-built SDK?

I did also try to get myself a repro. I am admittedly completely unfamiliar with fedpkg/mockbuild - where are the outputs after doing the build? After running fedpkg --namespace rpms --name dotnet8.0 --release eln mockbuild --no-cleanup-after (I am on WSL Ubuntu 22.04 running Fedora in a docker container), I'm not sure where the artifacts are. I assumed the --no-cleanup-after flag meant I would still have the intermediate build artifacts after the failure - I have an empty dotnet8.0-8.0.111-build folder and a results_dotnet8.0 folder that just has *.log files.

@omajid
Member Author

omajid commented Nov 26, 2024

Would you be able to share the broken/working singlefilehosts from the current VMR build/previously source-built SDK?

https://people.redhat.com/~omajid/scratch/hosts.tar.gz

$ sha256sum singlefilehost.*
30466e81774e18426dc7b392806e5b009b5d1841585d9bb7fd70292706850996  singlefilehost.broken  # builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/bin/coreclr/linux.x64.Release/corehost/singlefilehost
9010a4452687a60770cfd51c548c15105c4d9ae7c95bfcbaab75724973cb606b  singlefilehost.working # /usr/lib64/dotnet/packs/Microsoft.NETCore.App.Host.fedora.42-x64/8.0.8/runtimes/fedora.42-x64/native/singlefilehost  

I am admittedly completely unfamiliar with fedpkg/mockbuild

Sorry about the lack of guidance on this earlier.

mock is essentially a docker container, invented before containers were formalized. Mock uses a single container instance for each os/release/arch combination with a name computed based on that. In our case, it's fedora-eln-x86_64.

I use mock --shell to get into the "container" used to run the build. I also enable networking so I can restore packages and run dotnet. So that would look like this: mock --enable-network -r fedora-eln-x86_64 --shell.

When I need additional tools, I run mock -r fedora-eln-x86_64 --install strace (and replace strace with the package name of the tool I want). That will install the tool into the container.

There is a mock --copyin and mock --copyout command to get files into and out of the container. I generally skip that and append the path inside the container to the well-known path of the container. In this case, the prefix is /var/lib/mock/fedora-eln-x86_64/root. We can join it to a path like /usr/bin/echo inside the container, which makes it /var/lib/mock/fedora-eln-x86_64/root/usr/bin/echo on the host (outside the container).

Many mock commands will wipe the "container", so I try and be a bit careful, and use the --no-cleanup flag as much as possible.

@omajid
Member Author

omajid commented Nov 26, 2024

I added some instrumentation to BinaryUtils.cs and then extracted it to a standalone tool to run.

diff --git a/src/runtime/src/installer/managed/Microsoft.NET.HostModel/AppHost/BinaryUtils.cs b/src/runtime/src/installer/managed/Microsoft.NET.HostModel/AppHost/BinaryUtils.cs
index 62597baeae..5aa4a003d0 100644
--- a/src/runtime/src/installer/managed/Microsoft.NET.HostModel/AppHost/BinaryUtils.cs
+++ b/src/runtime/src/installer/managed/Microsoft.NET.HostModel/AppHost/BinaryUtils.cs
@@ -38,6 +38,12 @@ internal static unsafe void SearchAndReplace(
                 {
                     Pad0(searchPattern, patternToReplace, bytes, position);
                 }
+
+                position = KMPSearch(searchPattern, bytes, accessor.Capacity);
+                if (position > 0)
+                {
+                    throw new IOException("replacing pattern didn't work!");
+                }
             }
             finally
             {

In the working scenario:

  • First KMPSearch finds a position of 11067888
  • Second KMPSearch finds a position of -1

In the broken scenario:

  • First KMPSearch finds a position of 8756096
  • Second KMPSearch finds a position of 11092256

Now I am wondering: did the compiler embed multiple copies of the pattern in the singlefilehost binary? Does the HostWriter replace the wrong one?

Edit: Yeah:

# grep -c c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 singlefilehost
2
# grep --byte-offset --only-matching --text  c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 singlefilehost
8756096:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
11092256:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
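The effect of a duplicated placeholder can be reproduced on synthetic data. The sketch below is illustrative only: `find_all` and `replace_first` are hypothetical helpers, and the blob is a made-up stand-in for the host binary. It lists every offset of the pattern, replaces just the first match (padding with NULs, as the instrumented code does via `Pad0`), and shows that the second copy survives.

```python
PLACEHOLDER = b"c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2"


def find_all(data: bytes, pattern: bytes) -> list:
    """Return the byte offset of every occurrence of pattern in data."""
    offsets = []
    pos = data.find(pattern)
    while pos != -1:
        offsets.append(pos)
        pos = data.find(pattern, pos + 1)
    return offsets


def replace_first(data: bytes, pattern: bytes, value: bytes) -> bytes:
    """Replace only the first occurrence, NUL-padding value to pattern's size."""
    if len(value) > len(pattern):
        raise ValueError("value is longer than the pattern")
    return data.replace(pattern, value.ljust(len(pattern), b"\0"), 1)


if __name__ == "__main__":
    # Synthetic stand-in for a host binary that contains two copies.
    blob = b"A" * 10 + PLACEHOLDER + b"B" * 5 + PLACEHOLDER + b"C" * 3
    print(find_all(blob, PLACEHOLDER))     # both copies are found
    patched = replace_first(blob, PLACEHOLDER, b"crossgen2.dll")
    print(find_all(patched, PLACEHOLDER))  # the second copy is still there
```

Running `find_all` over the bytes of a real singlefilehost should report the same offsets that `grep --byte-offset` does.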

@elinor-fung
Member

Having the placeholder in there twice would definitely break assumptions made when creating the apphost.

I can see that the second occurrence at 11092256 / 0xa94120 maps to the symbol we expect, but I don't know where the first one at 8756096 / 0x859b80 is coming from:

# nm -n --demangle singlefilehost.broken | grep a94120
0000000000a94120 d _ZZ28is_exe_enabled_for_executionPNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEEE5embed
# nm -n --demangle singlefilehost.broken | grep 859b80
#

One thing that also stood out to me is just how much bigger the broken singlefilehost is: 81 MB vs 11 MB for the working one.

@omajid
Member Author

omajid commented Nov 26, 2024

I think the file size is just because the broken singlefilehost hasn't been stripped of debuginfo?

$ file *
singlefilehost.broken:  ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, BuildID[sha1]=21d23a8646767e34ba5140ba7e2ed121b00ae8ba, for GNU/Linux 3.2.0, with debug_info, not stripped
singlefilehost.working: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib64/ld-linux-x86-64.so.2, for GNU/Linux 3.2.0, BuildID[sha1]=ba829cfe6c54b0b45e47d50791958faa23d9f423, stripped

@elinor-fung
Member

Ah, yeah, that makes sense.

I can also see that the placeholder is in corehost.cpp.o twice:

# grep --byte-offset --only-matching --text  c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 corehost.cpp.o
96:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
4432:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2

This is the way we're defining that placeholder:

#define EMBED_HASH_HI_PART_UTF8 "c3ab8ff13720e8ad9047dd39466b3c89" // SHA-256 of "foobar" in UTF-8
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"
#define EMBED_HASH_FULL_UTF8 (EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8) // NUL terminated

static char embed[EMBED_MAX] = EMBED_HASH_FULL_UTF8; // series of NULs followed by embed hash string

I'm just not sure how/why that gets a copy for the actual embed symbol and also a copy for something else.

@omajid
Member Author

omajid commented Nov 26, 2024

Good catch with corehost.cpp.o! It seems to be down to the -march=x86-64-v3 flag. With different -march flags, I get a different number of placeholders in corehost.cpp.o.

With -march=x86-64-v3:

# pushd /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static                                                                                                                                                      
# /usr/bin/clang++ -DCOMPILER_SUPPORTS_W_RESERVED_IDENTIFIER -DCORECLR_EMBEDDED '-DCURRENT_ARCH_NAME="x64"' '-DCURRENT_OS_NAME="linux"' -DDISABLE_CONTRACTS '-DFALLBACK_HOST_OS="fedora.42"' -DFEATURE_APPHOST=1 -DFEATURE_STATIC_HOST=1 -DGSS_SHIM -DHOSTPOLICY_EMBEDDED -DHOST_64BIT -DHOST_AMD64 '-DHOST_POLICY_PKG_NAME="static"' '-DHOST_POLICY_PKG_REL_DIR="static"' -DHOST_UNIX -DNATIVE_LIBS_EMBEDDED -DNDEBUG '-DREPO_COMMIT_HASH="static"' -DTARGET_64BIT -DTARGET_AMD64 -DTARGET_LINUX -DTARGET_UNIX -DURTBLDENV_FRIENDLY=Retail -D_FILE_OFFSET_BITS=64 -D_NO_ASYNCRTIMP -D_NO_PPLXIMP -Dsinglefilehost_EXPORTS -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/inc -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/apphost/static/.. -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/apphost/static/../../json -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/libs/System.IO.Compression.Native -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/libs/Common -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/fxr/../json -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/fxr/../fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostpolicy/../fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostpolicy/../json 
-I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostcommon/../fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/apphost/static -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/.. -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostmisc -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/external/libunwind/include -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/external/libunwind/include/tdep -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/external/libunwind/include -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/external/libunwind/include/tdep -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config=/usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64-v3 -mtune=generic -fasynchronous-unwind-tables -fcf-protection 
-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Wno-used-but-marked-unused -O3 -DNDEBUG -std=gnu++11 -fPIE -fPIE -O3 -g -Wall -Wno-null-conversion -fno-omit-frame-pointer -fms-extensions -fwrapv -fstack-protector-strong -Wno-unused-variable -Wno-unused-value -Wno-unused-function -Wno-tautological-compare -Wno-unknown-pragmas -Wimplicit-fallthrough -Wno-invalid-offsetof -Wno-unused-but-set-variable -ffp-contract=off -fno-rtti -Wno-unknown-warning-option -ferror-limit=4096 -Wno-unused-private-field -Wno-constant-logical-operand -Wno-pragma-pack -Wno-incompatible-ms-struct -Wno-reserved-identifier -Wno-unsafe-buffer-usage -Wno-single-bit-bitfield-constant-conversion -Wno-cast-function-type-strict -Wno-switch-default -fsigned-char -fvisibility=hidden -ffunction-sections -MD -MT Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o -MF CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.d -o CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.v3.omajid -c /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp
# popd
# grep --byte-offset --only-matching --text  c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.v3.omajid
96:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
4432:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2

With -march=x86-64:

# pushd /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static
# /usr/bin/clang++ -DCOMPILER_SUPPORTS_W_RESERVED_IDENTIFIER -DCORECLR_EMBEDDED '-DCURRENT_ARCH_NAME="x64"' '-DCURRENT_OS_NAME="linux"' -DDISABLE_CONTRACTS '-DFALLBACK_HOST_OS="fedora.42"' -DFEATURE_APPHOST=1 -DFEATURE_STATIC_HOST=1 -DGSS_SHIM -DHOSTPOLICY_EMBEDDED -DHOST_64BIT -DHOST_AMD64 '-DHOST_POLICY_PKG_NAME="static"' '-DHOST_POLICY_PKG_REL_DIR="static"' -DHOST_UNIX -DNATIVE_LIBS_EMBEDDED -DNDEBUG '-DREPO_COMMIT_HASH="static"' -DTARGET_64BIT -DTARGET_AMD64 -DTARGET_LINUX -DTARGET_UNIX -DURTBLDENV_FRIENDLY=Retail -D_FILE_OFFSET_BITS=64 -D_NO_ASYNCRTIMP -D_NO_PPLXIMP -Dsinglefilehost_EXPORTS -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/inc -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/apphost/static/.. -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/apphost/static/../../json -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/libs/System.IO.Compression.Native -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/libs/Common -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/fxr/../json -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/fxr/../fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostpolicy/../fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostpolicy/../json 
-I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostcommon/../fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/apphost/static -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/.. -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/hostmisc -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/fxr -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/external/libunwind/include -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/external/libunwind/include/tdep -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/external/libunwind/include -I/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/external/libunwind/include/tdep -O2 -fexceptions -g -grecord-gcc-switches -pipe -Wall -Werror=format-security -Wp,-U_FORTIFY_SOURCE,-D_FORTIFY_SOURCE=3 -Wp,-D_GLIBCXX_ASSERTIONS --config=/usr/lib/rpm/redhat/redhat-hardened-clang.cfg -fstack-protector-strong -m64 -march=x86-64 -mtune=generic -fasynchronous-unwind-tables -fcf-protection 
-fno-omit-frame-pointer -mno-omit-leaf-frame-pointer -Wno-used-but-marked-unused -O3 -DNDEBUG -std=gnu++11 -fPIE -fPIE -O3 -g -Wall -Wno-null-conversion -fno-omit-frame-pointer -fms-extensions -fwrapv -fstack-protector-strong -Wno-unused-variable -Wno-unused-value -Wno-unused-function -Wno-tautological-compare -Wno-unknown-pragmas -Wimplicit-fallthrough -Wno-invalid-offsetof -Wno-unused-but-set-variable -ffp-contract=off -fno-rtti -Wno-unknown-warning-option -ferror-limit=4096 -Wno-unused-private-field -Wno-constant-logical-operand -Wno-pragma-pack -Wno-incompatible-ms-struct -Wno-reserved-identifier -Wno-unsafe-buffer-usage -Wno-single-bit-bitfield-constant-conversion -Wno-cast-function-type-strict -Wno-switch-default -fsigned-char -fvisibility=hidden -ffunction-sections -MD -MT Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o -MF CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.d -o CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.omajid -c /builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp
# popd
# grep --byte-offset --only-matching --text  c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.omajid
4448:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2

@omajid
Member Author

omajid commented Nov 27, 2024

I am looking at where the constants are stored in the ELF object.

# grep --byte-offset --only-matching --text  c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.omajid
4448:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2      
# printf '%x\n' 4448
1160
# readelf --wide -S ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.omajid
...
  [30] .text.__clang_call_terminate PROGBITS        0000000000000000 001140 000012 00 AXG  0   0 16
  [31] .rela.text.__clang_call_terminate RELA            0000000000000000 016308 000030 18   G 56  30  8
  [32] .data             PROGBITS        0000000000000000 001160 000401 00  WA  0   0 16
  [33] .rodata           PROGBITS        0000000000000000 001570 000051 00   A  0   0 16
  [34] .rodata.str1.1    PROGBITS        0000000000000000 0015c1 000357 01 AMS  0   0  1
...

So the constant sits at file offset 0x1160, which is the start of the .data section.
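The offset-to-section lookup done by hand above can be sketched in a few lines. This is only an illustration: `Section`, `section_for_offset`, `baseline_sections`, and `baseline_example` are hypothetical names, and the offsets and sizes are copied from the readelf rows shown for the -march=x86-64 object.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// One row of `readelf --wide -S` output: section name, file offset, size.
struct Section {
    std::string name;
    std::uint64_t offset;
    std::uint64_t size;
};

// Returns the name of the section whose file range [offset, offset + size)
// contains file_offset, or an empty string if no listed section contains it.
std::string section_for_offset(const std::vector<Section>& sections,
                               std::uint64_t file_offset) {
    for (const Section& s : sections) {
        if (file_offset >= s.offset && file_offset - s.offset < s.size) {
            return s.name;
        }
    }
    return "";
}

// Section rows copied from the readelf output for the -march=x86-64 object.
std::vector<Section> baseline_sections() {
    return {
        {".text.__clang_call_terminate", 0x1140, 0x12},
        {".data", 0x1160, 0x401},
        {".rodata", 0x1570, 0x51},
        {".rodata.str1.1", 0x15c1, 0x357},
    };
}

// grep reported byte offset 4448 (0x1160) for the placeholder.
void baseline_example() {
    assert(section_for_offset(baseline_sections(), 4448) == ".data");
}
```

The same lookup works for any other object file once its own section rows are filled in.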

# grep --byte-offset --only-matching --text  c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2 ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.v3.omajid 
96:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
4432:c3ab8ff13720e8ad9047dd39466b3c8974e592c2fa383d4a3960714caef0c4f2
# printf '%x\n' 96 4432
60
1150
# readelf --wide -S ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.v3.omajid
...
  [ 1] .strtab           STRTAB          0000000000000000 01df8c 000b0d 00      0   0  1
  [ 2] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  4
  [ 3] .note.gnu.property NOTE            0000000000000000 000040 000020 00   A  0   0  8
  [ 4] .rodata.cst32     PROGBITS        0000000000000000 000060 000040 20  AM  0   0 32
...
  [30] .text.__clang_call_terminate PROGBITS        0000000000000000 001130 000012 00 AXG  0   0 16
  [31] .rela.text.__clang_call_terminate RELA            0000000000000000 016290 000030 18   G 56  30  8
  [32] .data             PROGBITS        0000000000000000 001150 000401 00  WA  0   0 16
  [33] .rodata           PROGBITS        0000000000000000 001560 000051 00   A  0   0 16
  [34] .rodata.str1.1    PROGBITS        0000000000000000 0015b1 000357 01 AMS  0   0  1

In this case, the constant appears at 0x60, in .rodata.cst32, and also at 0x1150, in the .data section. There's some info about the .rodata.cst32 section at https://patchwork.kernel.org/project/linux-crypto/patch/20170119213304.18140-1-dvlasenk@redhat.com/. For our purposes, it seems like it's used to store readonly data that can be merged across compilation units by the linker. It seems like a safe value to overwrite too, along with .data.

Shall I create a PR to make HostWriter overwrite all instances of the value?
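A rough sketch of what "overwrite all instances" could look like, written in C++ for illustration only. It is not the HostWriter code: `patch_all` is a hypothetical helper, and the NUL padding and the requirement that the new value be no longer than the placeholder are simplifying assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Overwrites every occurrence of `placeholder` in `image` with `value`,
// padding with NUL bytes up to the placeholder length (an assumption of
// this sketch). Returns the number of occurrences that were patched.
std::size_t patch_all(std::vector<char>& image,
                      const std::string& placeholder,
                      const std::string& value) {
    assert(!placeholder.empty());
    // Simplification: only values that fit inside the placeholder are handled.
    assert(value.size() <= placeholder.size());

    std::size_t count = 0;
    auto it = image.begin();
    while (true) {
        it = std::search(it, image.end(), placeholder.begin(), placeholder.end());
        if (it == image.end()) {
            break;
        }
        auto end = it + static_cast<std::ptrdiff_t>(placeholder.size());
        std::fill(it, end, '\0');                   // clear the old placeholder
        std::copy(value.begin(), value.end(), it);  // write the new value
        it = end;                                   // continue after this match
        ++count;
    }
    return count;
}
```

Returning the count lets the caller treat zero matches as an error, and it also makes the case seen here (more than one copy of the placeholder) visible.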

@elinor-fung
Member

Thanks for all the investigation and links.

For our purposes, it seems like it's used to store readonly data that can be merged across compilation units by the linker. It seems like a safe value to overwrite too, along with .data.

Shall I create a PR to make HostWriter overwrite all instances of the value?

That sounds reasonable to me. A PR would be much appreciated, thank you!

@fweimer-rh

fweimer-rh commented Nov 27, 2024

I am looking at where the constants are stored in the ELF object.

# readelf --wide -S ./build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/artifacts/obj/coreclr/linux.x64.Release/Corehost.Static/CMakeFiles/singlefilehost.dir/builddir/build/BUILD/dotnet8.0-8.0.110-build/dotnet-8.0.10/src/runtime/artifacts/source-build/self/src/src/native/corehost/corehost.cpp.o.v3.omajid
...
  [ 1] .strtab           STRTAB          0000000000000000 01df8c 000b0d 00      0   0  1
  [ 2] .text             PROGBITS        0000000000000000 000040 000000 00  AX  0   0  4
  [ 3] .note.gnu.property NOTE            0000000000000000 000040 000020 00   A  0   0  8
  [ 4] .rodata.cst32     PROGBITS        0000000000000000 000060 000040 20  AM  0   0 32
...
  [30] .text.__clang_call_terminate PROGBITS        0000000000000000 001130 000012 00 AXG  0   0 16
  [31] .rela.text.__clang_call_terminate RELA            0000000000000000 016290 000030 18   G 56  30  8
  [32] .data             PROGBITS        0000000000000000 001150 000401 00  WA  0   0 16
  [33] .rodata           PROGBITS        0000000000000000 001560 000051 00   A  0   0 16
  [34] .rodata.str1.1    PROGBITS        0000000000000000 0015b1 000357 01 AMS  0   0  1

@omajid I expect that you are seeing this because with -march=x86-64-v3, initialization uses 256-bit vector instructions (AVX2). LLVM decided to put constant data for that into its own 32-byte aligned section, .rodata.cst32, to reduce cache-line crossing and fragmentation. Previously, all the constant data needed during initialization probably was in .rodata because that's already 16-byte aligned due to the alignment required in the x86-64 psABI for sufficiently large arrays (this time for 128-bit SSE2 vector instructions).

@jkotas
Member

jkotas commented Nov 27, 2024

initialization uses 256-bit vector instructions (AVX2). LLVM decided to put constant data for that into its own 32-byte aligned section

We should make this data volatile to prevent optimizations like this from interfering with the single-file binary patching scheme. In addition to creating copies of the data, compilers can rearrange the data as well (e.g. encode the data in an instruction immediate), which would break the binary patching even more.
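A minimal sketch of what a volatile placeholder could look like, assuming the macros quoted earlier in the thread. The value 1025 for `EMBED_MAX` is an assumption chosen to match the 0x401-byte .data section in the readelf output, and `read_embedded_value`, `placeholder_is_unpatched`, and `volatile_example` are hypothetical helpers. Whether this is sufficient on every toolchain is not something the sketch establishes.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

#define EMBED_HASH_HI_PART_UTF8 "c3ab8ff13720e8ad9047dd39466b3c89" // SHA-256 of "foobar" in UTF-8
#define EMBED_HASH_LO_PART_UTF8 "74e592c2fa383d4a3960714caef0c4f2"

// Assumption: 1025 bytes, matching the 0x401-byte .data section seen above.
constexpr std::size_t EMBED_MAX = 1025;

// volatile: every read must go through memory, so the compiler may not
// assume the bytes still hold the initializer when the program runs.
static volatile char embed[EMBED_MAX] =
    EMBED_HASH_HI_PART_UTF8 EMBED_HASH_LO_PART_UTF8;

// Copies the NUL-terminated value out of the volatile buffer byte by byte.
std::string read_embedded_value() {
    std::string out;
    for (std::size_t i = 0; i < EMBED_MAX; ++i) {
        const char c = embed[i];  // volatile read
        if (c == '\0') {
            break;
        }
        out.push_back(c);
    }
    return out;
}

// True while the buffer still holds the placeholder, i.e. no patching tool
// has written an application path into the binary.
bool placeholder_is_unpatched() {
    const std::string expected =
        std::string(EMBED_HASH_HI_PART_UTF8) + EMBED_HASH_LO_PART_UTF8;
    return read_embedded_value() == expected;
}

// In a freshly built, unpatched binary the placeholder is still present.
void volatile_example() {
    assert(placeholder_is_unpatched());
}
```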

@fweimer-rh

@jkotas A different approach would be to avoid copy-initialization altogether: have all the constant data in .data, and only initialize explicitly the parts that are not known at link time. The toolchain should already do this if the initialization aggregate is constant.

@jkotas
Member

jkotas commented Nov 27, 2024

have all the constant data in .data, and only initialize explicitly the parts that are not known at link time. The toolchain should already do this if the initialization aggregate is constant.

The problem is that the C/C++ toolchain sees some data, observes that the data appears to be constant throughout the program, and then does extra optimization based on this observation. We need to tell the C/C++ toolchain: do not make assumptions about this data. volatile is the way to do that.

I do not see how your suggestion solves the problem. It does not prevent the C/C++ compiler from making observations about the data that can lead to invalid optimizations. Global optimizers in C/C++ compilers can do wonders these days.
