Doubts regarding compiling CppInterOp using emscripten #1402

Open
anutosh491 opened this issue Oct 29, 2024 · 3 comments

@anutosh491
Collaborator

Context for anyone interested.

@argentite was able to run clang-repl in the browser, and this is how he accomplished it:

  1. He built llvm (clang, lld) for emscripten-wasm32.
  2. He then wrote a CompilerModule.cpp (some simple code calling clang-repl's Parse & Execute).
  3. He wrote a CMakeLists.txt file with all the configuration:
cmake_minimum_required(VERSION 3.20.0)
project(ClangWASMREPL)

find_package(LLVM REQUIRED CONFIG)
find_package(Clang REQUIRED CONFIG)
find_package(LLD REQUIRED CONFIG)

message(STATUS "Using ClangConfig.cmake in: ${Clang_DIR}")
message(STATUS "Using LLVMConfig.cmake in: ${LLVM_DIR}")
message(STATUS "Found LLVM ${LLVM_PACKAGE_VERSION}")

set (CMAKE_CXX_STANDARD 17)
add_compile_options(-Wall -pedantic -fPIC)

include_directories(include)

include_directories(${LLVM_INCLUDE_DIRS})
include_directories(${CLANG_INCLUDE_DIRS})
include_directories(${LLD_INCLUDE_DIRS})
separate_arguments(LLVM_DEFINITIONS_LIST NATIVE_COMMAND ${LLVM_DEFINITIONS})
add_definitions(${LLVM_DEFINITIONS_LIST})

add_executable(Compiler CompilerModule.cpp)

llvm_map_components_to_libnames(llvm_libs Core MC Support TargetParser WebAssembly)

# Link against LLVM libraries
target_link_libraries(Compiler embind)
target_link_libraries(Compiler lldWasm)
target_link_libraries(Compiler clangInterpreter)

target_link_options(Compiler PRIVATE
	-O1
	-sMODULARIZE
	-sEXPORT_ES6=1
	-sASSERTIONS
	-sALLOW_MEMORY_GROWTH=1
	-sINITIAL_MEMORY=128MB
	-sTOTAL_STACK=32MB
	-sMAIN_MODULE=1
	-sUSE_SDL=2
	-sEXPORTED_RUNTIME_METHODS=ccall,cwrap,stringToNewUTF8,getValue,setValue
	-sEXPORTED_FUNCTIONS=_malloc,_free,__ZTIN10emscripten3valE
	--preload-file ${EMSCRIPTEN_SYSROOT}/@/
	#--preload-file /media/hdstorage/builds/llvm-wasm-ems/lib/clang/19/include@/lib/clang/19/include
	#--preload-file /opt/emscripten-llvm/lib@/lib
)

For my use case, we have just one more layer of abstraction than the two above: xeus-cpp, CppInterOp and llvm. In our case:

  1. We have a very clean/efficient recipe on emscripten-forge (leaking no symbols, generating all required libraries like libclanginterpreter.a and liblldwasm.a), so this part is done.
  2. The abstract "Compiler" is CppInterOp, whose API provides Parse and Execute, so this part is done.
  3. We now have to correctly compile CppInterOp to wasm (and host it on emscripten-forge) so that it can be used by xeus-cpp, and then we should be done. Hence only this part remains.
@anutosh491
Collaborator Author

anutosh491 commented Oct 29, 2024

Starting with the doubts

  1. While building xeus-cpp, a lot of undefined symbols show up (see "Regarding lot of undefined symbols while building xeus-cpp against emscripten", compiler-research/CppInterOp#334). These symbols are basically being leaked by libclangInterpreter.a.

On xeus-cpp's master we currently have

target_link_libraries(xeus-cpp PUBLIC xeus-cpp-static clangCppInterOp pugixml argparse::argparse)

But installing llvm from emscripten-forge and additionally linking against clanginterpreter resolves the undefined symbols

target_link_libraries(xeus-cpp PUBLIC xeus-cpp-static clangCppInterOp pugixml argparse::argparse clanginterpreter)

Ideally, though, we want these symbols to be exposed by clangCppInterOp itself, so that we wouldn't have to install llvm. To build clangCppInterOp, the file of interest is https://github.com/compiler-research/CppInterOp/blob/main/lib/Interpreter/CMakeLists.txt

This file contains the following

set(link_libs
  ${cling_clang_interp}
  clangAST
  clangBasic
  clangFrontend
  clangLex
  clangSema
  )
  
  add_llvm_library(clangCppInterOp
  DISABLE_LLVM_LINK_LLVM_DYLIB
  CppInterOp.cpp
  ${DLM}
  LINK_LIBS
  ${link_libs}
 )

As can be seen in the CMakeLists.txt file above, we also need to link against lldWasm and provide a few flags, hence I tried adding a very rough/unpolished patch on top of this file through #1387.

What I notice is that linking clangInterpreter and the other libraries in link_libs into clangCppInterOp doesn't work as expected, so the undefined symbols are still present.
The generated link.txt file shows this

/Users/anutosh491/work/recipes/output/bld/rattler-build_cppinterop_1730188329/build_env/opt/emsdk/upstream/emscripten/emar qc libclangCppInterOp.a CMakeFiles/clangCppInterOp.dir/CppInterOp.cpp.o CMakeFiles/clangCppInterOp.dir/DynamicLibraryManager.cpp.o CMakeFiles/clangCppInterOp.dir/DynamicLibraryManagerSymbol.cpp.o CMakeFiles/clangCppInterOp.dir/Paths.cpp.o
/Users/anutosh491/work/recipes/output/bld/rattler-build_cppinterop_1730188329/build_env/opt/emsdk/upstream/emscripten/emranlib libclangCppInterOp.a

It doesn't reference the libraries or the Emscripten-based flags that were passed.

Building with --trace-expand, I do see that add_llvm_library is basically doing

add_library(clangCppInterOp STATIC CppInterOp.cpp;DynamicLibraryManager.cpp;DynamicLibraryManagerSymbol.cpp;Paths.cpp;/Users/anutosh491/work/recipes/output/bld/rattler-build_cppinterop_1730189487/work/lib/Interpreter/Compatibility.h;/Users/anutosh491/work/recipes/output/bld/rattler-build_cppinterop_1730189487/work/lib/Interpreter/CppInterOpInterpreter.h;/Users/anutosh491/work/recipes/output/bld/rattler-build_cppinterop_1730189487/work/lib/Interpreter/DynamicLibraryManager.h;/Users/anutosh491/work/recipes/output/bld/rattler-build_cppinterop_1730189487/work/lib/Interpreter/Paths.h )

target_link_libraries(clangCppInterOp PUBLIC clangInterpreter;clangAST;clangBasic;clangFrontend;clangLex;clangSema;lldWasm;LLVMWebAssemblyCodeGen;LLVMWebAssemblyAsmParser;LLVMWebAssemblyDesc;LLVMWebAssemblyDisassembler;LLVMWebAssemblyInfo;LLVMWebAssemblyUtils;LLVMBinaryFormat;LLVMCore;LLVMObject;LLVMOrcJIT;LLVMSupport;LLVMMC;LLVMTargetParser;LLVMWebAssemblyCodeGen;LLVMWebAssemblyAsmParser;LLVMWebAssemblyDesc;LLVMWebAssemblyDisassembler;LLVMWebAssemblyInfo;LLVMWebAssemblyUtils;LLVMFrontendDriver;LLVMOrcDebugging;dl  LLVMWebAssemblyCodeGen;LLVMWebAssemblyAsmParser;LLVMWebAssemblyDesc;LLVMWebAssemblyDisassembler;LLVMWebAssemblyInfo;LLVMWebAssemblyUtils;LLVMBinaryFormat;LLVMCore;LLVMObject;LLVMOrcJIT;LLVMSupport;LLVMMC;LLVMTargetParser;LLVMWebAssemblyCodeGen;LLVMWebAssemblyAsmParser;LLVMWebAssemblyDesc;LLVMWebAssemblyDisassembler;LLVMWebAssemblyInfo;LLVMWebAssemblyUtils;LLVMFrontendDriver;LLVMOrcDebugging )

target_link_options(clangCppInterOp PRIVATE -O1 -sMODULARIZE -sEXPORT_ES6=1 -sASSERTIONS -sALLOW_MEMORY_GROWTH=1 -sINITIAL_MEMORY=128MB -sTOTAL_STACK=32MB -sMAIN_MODULE=1 -sEXPORTED_RUNTIME_METHODS=ccall,cwrap,stringToNewUTF8,getValue,setValue -sEXPORTED_FUNCTIONS=_malloc,_free,__ZTIN10emscripten3valE --preload-file $BUILD_PREFIX/opt/emsdk/upstream/emscripten/cache/sysroot/@/ )

So under the hood we are trying to replicate argentite's script, which basically links against lldWasm and clangInterpreter and provides the flags, but the generated link.txt doesn't reflect those changes.
Hence we end up with a libclangCppInterOp.a that can't provide the missing symbols.
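
One possible reading of the link.txt above, offered as an assumption rather than a confirmed diagnosis: for a STATIC target, CMake only runs the archiver (emar/emranlib here), so target_link_libraries is recorded as a usage requirement for whoever links against the archive later, and linker flags have nothing to act on. A minimal sketch of that behaviour follows; the target names core, wrapper and app and the source files are hypothetical and assumed to exist.

```cmake
# Minimal sketch; "core", "wrapper", "app" and the .cpp files are hypothetical.
cmake_minimum_required(VERSION 3.20)
project(StaticLinkDemo CXX)

add_library(core STATIC core.cpp)
add_library(wrapper STATIC wrapper.cpp)

# For a STATIC target this only records a usage requirement.
# The archiver step that produces libwrapper.a does not copy core's objects into it.
target_link_libraries(wrapper PUBLIC core)

# Linker flags cannot affect an archive. With INTERFACE they are
# applied when a consumer is finally linked against wrapper.
target_link_options(wrapper INTERFACE -sALLOW_MEMORY_GROWTH=1)

# Only here does a real link step run; app's link line gets
# libwrapper.a, libcore.a and the interface link options.
add_executable(app main.cpp)
target_link_libraries(app PRIVATE wrapper)
```

Under this reading, a consumer that links clangCppInterOp through its CMake target would pick up the recorded libraries, while a consumer that only sees the bare libclangCppInterOp.a file would not.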

@anutosh491
Collaborator Author

anutosh491 commented Oct 29, 2024

P.S. I also removed add_llvm_library and simply did the following as a check, though I might not be linking it the correct way.

add_library(clangCppInterOp STATIC 
 CppInterOp.cpp 
 DynamicLibraryManager.cpp 
 DynamicLibraryManagerSymbol.cpp 
 Paths.cpp
)

target_link_libraries(clangCppInterOp PUBLIC
 clangInterpreter 
 clangAST 
 clangBasic 
 clangFrontend 
 clangLex 
 clangSema 
 lldWasm
 embind
)

target_link_options(clangCppInterOp PRIVATE
 -O1
 -sMODULARIZE
 -sEXPORT_ES6=1
 -sASSERTIONS
 -sALLOW_MEMORY_GROWTH=1
 -sINITIAL_MEMORY=128MB
 -sTOTAL_STACK=32MB
 -sMAIN_MODULE=1
 -sEXPORTED_RUNTIME_METHODS=ccall,cwrap,stringToNewUTF8,getValue,setValue
 -sEXPORTED_FUNCTIONS=_malloc,_free,__ZTIN10emscripten3valE
 --preload-file ${EMSCRIPTEN_SYSROOT}/@/
)

Taking any undefined symbol from the list, we can check whether the link actually happened

(xeus-cpp-wasm-host) anutosh491@Anutoshs-MacBook-Air lib % nm libclangInterpreter.a | grep '__clang_Interpreter_SetValueNoAlloc' 
00005cf4 T __clang_Interpreter_SetValueNoAlloc
(xeus-cpp-wasm-host) anutosh491@Anutoshs-MacBook-Air lib % nm libclangCppInterOp.a | grep '__clang_Interpreter_SetValueNoAlloc'      # fails as of now
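
Alongside the nm check, it may help to print what CMake itself recorded on the target at configure time. The snippet below is a small sketch using only get_target_property and message; it assumes it is placed after the clangCppInterOp target has been defined. A property that was never set prints as value-NOTFOUND.

```cmake
# Sketch: dump the link-related properties of the clangCppInterOp target.
# Place this after the target is created (e.g. at the end of lib/Interpreter/CMakeLists.txt).
foreach(prop TYPE LINK_LIBRARIES INTERFACE_LINK_LIBRARIES LINK_OPTIONS INTERFACE_LINK_OPTIONS)
  get_target_property(value clangCppInterOp ${prop})
  message(STATUS "clangCppInterOp ${prop}: ${value}")
endforeach()
```

If TYPE reports STATIC_LIBRARY and the libraries appear under INTERFACE_LINK_LIBRARIES, the information was recorded on the target even though the archive itself does not contain those symbols.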

@anutosh491
Collaborator Author

That being said, the exact same operation through add_llvm_library is done when building for, say, osx or ubuntu, and I don't think there is any issue with the linking there.
