Dynamically Linked Library in CPP #11439
@soumiiow thanks for looking into this. Out of curiosity, why doesn't this work on macOS?
velox/common/CMakeLists.txt
@@ -15,6 +15,7 @@ add_subdirectory(base)
add_subdirectory(caching)
add_subdirectory(compression)
add_subdirectory(config)
add_subdirectory(dynamicRegistry)
We use snake case for directory names: "dynamic_registry".
Very cool! A few small comments, but overall looks good.
#include <dlfcn.h>
#include <iostream>
#include "velox/common/base/Exceptions.h"
namespace facebook::velox {
nit: new line before namespace definition.
VELOX_USER_FAIL("Couldn't find Velox registry symbol: {}", error);
}
registryItem();
std::cout << "LOADED DYLLIB 1" << std::endl;
For consistency, could you use LOG(INFO) and print the file name / path of the library loaded?
static constexpr const char* kSymbolName = "registry";

void loadDynamicLibraryFunctions(const char* fileName) {
we can probably omit the "Functions" from the name, and this can be used to really load anything, as long as you provide the registration functions. Let's name it loadDynamicLibrary()
### 1. Create a cpp file for your dynamic library
For dynamically loaded function registration, the format followed is mirrored of that of built-in function registration with some noted differences. Using [MyDynamicTestFunction.cpp](tests/MyDynamicTestFunction.cpp) as an example, the function uses the extern "C" keyword to protect against name mangling. A registry() function call is also necessary here.

### 2. Register functions dynamically by creating .dylib or .so shared libraries and dropping them in a plugin directory
nit: the titles are too long; maybe just add the docs as a regular numbered list?
I pushed it out without the title formatting but does this look a bit cluttered now?
auto signaturesBefore = getFunctionSignatures().size();

// Function does not exist yet.
EXPECT_THROW(dynamicFunction(0), VeloxUserError);
can you use VELOX_ASSERT_THROW() instead to validate the right exception is being thrown?
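To illustrate why checking the message matters and not only the exception type, here is a minimal self-contained sketch. `throwsWithMessage` is a hypothetical helper written for this note; it is not the Velox macro and only assumes that such an assertion checks both the exception type and a message substring.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a "throws with message" assertion; it is not the
// Velox macro. Returns true only if `body` throws ExceptionType and the
// exception message contains `expected`.
template <typename ExceptionType>
bool throwsWithMessage(
    const std::function<void()>& body,
    const std::string& expected) {
  try {
    body();
  } catch (const ExceptionType& e) {
    return std::string(e.what()).find(expected) != std::string::npos;
  } catch (...) {
    // A different exception type was thrown.
    return false;
  }
  // Nothing was thrown.
  return false;
}
```

A plain type-only check would also pass if the call failed for an unrelated reason that happens to raise the same exception type; matching on the message narrows that down.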
# `MyDynamicFunction.cpp` as a small .so library, and use the
# MY_DYNAMIC_FUNCTION_LIBRARY_PATH macro to locate the .so binary.
add_compile_definitions(
MY_DYNAMIC_FUNCTION_LIBRARY_PATH="${CMAKE_CURRENT_BINARY_DIR}")
please vendor the macro. Maybe something like VELOX_TEST_DYNAMIC_LIBRARY_PATH
 * limitations under the License.
 */

#include "velox/common/dynamicRegistry/DynamicLibraryLoader.h"
this header include is not required.
// Dynamically load the library.
std::string libraryPath = MY_DYNAMIC_FUNCTION_LIBRARY_PATH;
libraryPath += "/libvelox_function_my_dynamic.so";
Use CMAKE_SHARED_LIBRARY_SUFFIX and CMAKE_SHARED_LIBRARY_PREFIX to support MacOS.
https://stackoverflow.com/questions/32445070/how-does-cmake-know-which-prefixes-and-suffixes-to-add-to-shared-libraries
https://cmake.org/cmake/help/v3.0/variable/CMAKE_SHARED_LIBRARY_PREFIX.html
https://cmake.org/cmake/help/v3.0/variable/CMAKE_SHARED_LIBRARY_SUFFIX.html
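On the C++ side, the test could then build the path from those values instead of hard-coding ".so". A minimal sketch follows; the macro names `TEST_SHARED_LIBRARY_PREFIX` and `TEST_SHARED_LIBRARY_SUFFIX` are hypothetical, and the fallback values exist only so the snippet compiles on its own when the build does not define them.

```cpp
#include <cassert>
#include <string>

// Hypothetical macros: a build could define these from
// CMAKE_SHARED_LIBRARY_PREFIX and CMAKE_SHARED_LIBRARY_SUFFIX. The fallbacks
// below only keep this sketch compilable without a build system.
#ifndef TEST_SHARED_LIBRARY_PREFIX
#define TEST_SHARED_LIBRARY_PREFIX "lib"
#endif
#ifndef TEST_SHARED_LIBRARY_SUFFIX
#ifdef __APPLE__
#define TEST_SHARED_LIBRARY_SUFFIX ".dylib"
#else
#define TEST_SHARED_LIBRARY_SUFFIX ".so"
#endif
#endif

// Builds "<directory>/<prefix><name><suffix>" without hard-coding ".so".
std::string sharedLibraryPath(
    const std::string& directory,
    const std::string& name) {
  return directory + "/" + TEST_SHARED_LIBRARY_PREFIX + name +
      TEST_SHARED_LIBRARY_SUFFIX;
}
```

With this shape the test body stays the same on Linux and macOS; only the values supplied by the build differ.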
What else is an issue for MacOS?
789da39 to c5cbea2
c5cbea2 to f8afbc0
${GMock}
${GTEST_BOTH_LIBRARIES})
please use the GTest:: targets
# To test functions being added by dynamically linked libraries, we compile
# `MyDynamicFunction.cpp` as a small .so library, and use the
# VELOX_TEST_DYNAMIC_LIBRARY_PATH macro to locate the .so binary.
add_compile_definitions(
Please use target_compile_definitions() on the relevant target instead.
if(${VELOX_BUILD_TESTING})
add_subdirectory(tests)
endif()
velox_add_library(velox_dynamic_function_loader DynamicLibraryLoader.cpp)
velox_add_library(velox_dynamic_function_loader DynamicLibraryLoader.cpp)
velox_add_library(velox_dynamic_function_loader DynamicLibraryLoader.cpp)
velox_link_libraries(velox_dynamic_function_loader PRIVATE velox_exception)
Thanks for adding docs!
# VELOX_TEST_DYNAMIC_LIBRARY_PATH macro to locate the .so binary.
add_compile_definitions(
VELOX_TEST_DYNAMIC_LIBRARY_PATH="${CMAKE_CURRENT_BINARY_DIR}")
add_library(velox_function_my_dynamic SHARED MyDynamicFunction.cpp)
Add a new line before and after.
# `MyDynamicFunction.cpp` as a small .so library, and use the
# VELOX_TEST_DYNAMIC_LIBRARY_PATH macro to locate the .so binary.
add_compile_definitions(
VELOX_TEST_DYNAMIC_LIBRARY_PATH="${CMAKE_CURRENT_BINARY_DIR}")
MacOS support is still missing. You can create the full library path here based on the CMake options I shared earlier.
@@ -0,0 +1,30 @@
#include "velox/functions/Udf.h"
Missing license header here.
@@ -0,0 +1,22 @@
# Velox: Dynamically Loading Registry Libraries in C++
"Dynamic Loading of Velox Extensions" is probably a better title.
@@ -0,0 +1,22 @@
# Velox: Dynamically Loading Registry Libraries in C++

This library adds the ability to load User Defined Functions (UDFs), connectors, or types without having to fork and build Prestissimo, through the use of shared libraries that a Prestissimo worker can access. These are to be loaded on launch of the Presto server. The Presto server searches for any .so or .dylib files and loads them using this library.
Prestissimo -> Velox. Remaining paragraph as well.
target_link_libraries(name_of_dynamic_fn PRIVATE xsimd fmt::fmt velox_expression)
```

3. In the Prestissimo worker's config.properties file, set the plugin.dir property
Move this to Prestissimo.
```
plugin.dir="User\Test\Path\plugin"
```
4. When the worker or the sidecar process starts, it will scan the plugin directory and attempt to dynamically load all shared libraries
Move this to Prestissimo.
namespace facebook::velox {

/// Dynamically opens and registers functions defined in a shared library (.so)
Remove (.so) and add a full stop.
/// Dynamically opens and registers functions defined in a shared library (.so)
///
/// Given a shared library name (.so), this function will open it using dlopen,
Opens a shared library using dlopen, looks for the symbol registry, and invokes it.
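For reference, that open / look up / invoke sequence can be sketched with plain POSIX calls. This is only an illustration: it throws `std::runtime_error` where the PR uses `VELOX_USER_FAIL`, the wording of the open-failure message is an assumption, and it is not the code under review.

```cpp
#include <dlfcn.h>

#include <cassert>
#include <stdexcept>
#include <string>

// Illustrative stand-in for the loader; it throws std::runtime_error where
// the PR uses VELOX_USER_FAIL.
void loadDynamicLibrary(const char* fileName) {
  // Open the shared library, resolving all of its symbols immediately.
  void* handle = dlopen(fileName, RTLD_NOW);
  if (handle == nullptr) {
    const char* openError = dlerror();
    throw std::runtime_error(
        std::string("Error while loading shared library: ") +
        (openError != nullptr ? openError : "unknown error"));
  }

  // Clear any stale error state, then look up the entry point by name.
  dlerror();
  void* symbol = dlsym(handle, "registry");
  const char* error = dlerror();
  if (error != nullptr || symbol == nullptr) {
    throw std::runtime_error(
        std::string("Couldn't find Velox registry symbol: ") +
        (error != nullptr ? error : "symbol is null"));
  }

  // Invoke the registry function. dlsym carries no type information, so this
  // cast simply trusts that the symbol is `void registry()`.
  auto entryPoint = reinterpret_cast<void (*)()>(symbol);
  entryPoint();
}
```

The handle is deliberately never passed to `dlclose` in this sketch, since anything the entry point registered would point into the library's code.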
// Lookup the symbol.
void* registrySymbol = dlsym(handler, kSymbolName);
auto registryItem = reinterpret_cast<void (*)()>(registrySymbol);
How do we ensure the signature is void registry()? What happens if the return type is different or there are arguments?
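Since dlsym returns an untyped `void*`, the loader itself cannot verify the signature at run time. One possible mitigation, offered here as an assumption rather than something the PR does, is a compile-time check in the plugin source. The sketch relies on mainstream compilers (GCC, Clang) treating pointers to `extern "C"` functions and to C++ functions as the same type, which the standard does not strictly guarantee.

```cpp
#include <cassert>
#include <type_traits>

// Counts invocations so the sketch has an observable effect.
static int registryCalls = 0;

extern "C" {
// Entry point that a loader would look up by name.
void registry() {
  ++registryCalls;
}
}

// Compile-time check inside the plugin: the build fails if registry() ever
// gains a parameter or a non-void return type.
static_assert(
    std::is_same<decltype(&registry), void (*)()>::value,
    "registry must have the signature void registry()");
```

This only protects plugins that opt in to the check; a library built without it can still export a mismatched `registry`, and calling it through the cast would be undefined behavior.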
Thanks @soumiiow. Had a bunch of minor comments, except for a bigger one around testing.
@@ -0,0 +1,4 @@
if(${VELOX_BUILD_TESTING})
Nit: I don't think it's a convention, but most CMakeLists files have the velox_add_library function calls before the subdirectory-related functions/macros.
// Lookup the symbol.
void* registrySymbol = dlsym(handler, kSymbolName);
auto registryItem = reinterpret_cast<void (*)()>(registrySymbol);
Nit: rename to registryFunction
Hey! I got some previous feedback to stay away from "registryFunction" in the naming, so that it doesn't seem like this library is to be used exclusively for functions, and to move away from our initial design, which was made with only function loading in mind. Would there be a better name for this variable than the word "item"? I can only really think of registryItem or registryPtr, but would love to hear your suggestions too.
@soumiiow: To me this is almost like the "main" function in an executable program. How about "loadLibrary", "loadUserLibrary", or "enterUserLibrary"? There could be code beyond registration here as well.
if (error != nullptr) {
VELOX_USER_FAIL("Couldn't find Velox registry symbol: {}", error);
}
registryItem();
Add a comment "Invoke the registry function"
void registry() {
facebook::velox::registerFunction<
facebook::velox::common::dynamicRegistry::Dynamic123Function,
This assumes that certain types/functions are already available at this point. Those types and functions depend on the registrations that a service like Prestissimo has done a priori, before the "registry" function here is invoked. For this test, since it's inherited from FunctionTestBase, all the Velox registrations are available.
So in general, this function assumes some context setup has been done already. It might be better to explicitly describe those assumptions here.
Or else change the test to not assume anything and do all the registrations within its own code.
std::string libraryPath = VELOX_TEST_DYNAMIC_LIBRARY_PATH;
libraryPath += "/libvelox_function_my_dynamic.so";

loadDynamicLibrary(libraryPath.data());
Let's add some more test cases:
i) One that loads 2 different libraries and checks that the number of function signatures increments at each point.
ii) One that loads the same library again and validates the behavior. We are not likely to load the same library again in the service, but it's better to make that assumption explicit. In any case, it's possible to have 2 libraries that do the exact same thing loaded one after another, so we should be explicit about the behavior in that case.
iii) An error case with an incorrect implementation of the registry function signature.
iv) Generally when adding functions, we want to add them to a catalog, so they have a namespace. Prestissimo definitely has namespaces. How do you incorporate this in the logic? It would be good for your test to demo a function added to a non-default namespace.
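The count-before / count-after pattern behind cases (i) and (ii) can be sketched without Velox at all. Everything below is hypothetical: `functionRegistry` is an in-memory stand-in for the real function registry, the two `registryOfLibrary*` functions stand in for entry points of two shared libraries, and the overwrite-on-duplicate behavior is an assumption of the sketch, not a statement about how Velox behaves.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <map>
#include <string>

// Hypothetical in-memory registry standing in for the real function registry.
std::map<std::string, std::function<long(long)>>& functionRegistry() {
  static std::map<std::string, std::function<long(long)>> registry;
  return registry;
}

// Stand-ins for the registry() entry points of two different libraries.
void registryOfLibraryA() {
  functionRegistry()["dynamic_a"] = [](long x) { return x + 1; };
}

void registryOfLibraryB() {
  functionRegistry()["dynamic_b"] = [](long x) { return x * 2; };
}

// Runs an entry point and reports how many signatures it added.
std::size_t signaturesAddedBy(const std::function<void()>& entryPoint) {
  const std::size_t before = functionRegistry().size();
  entryPoint();
  return functionRegistry().size() - before;
}
```

In this stand-in, loading A and then B adds one signature each, and running A's entry point a second time adds none because the map entry is overwritten. The real tests would need to pin down whichever behavior the actual registry has.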
@@ -0,0 +1,30 @@
#include "velox/functions/Udf.h"
Why does this file not have the copyright header? Is it intentional, so that we ensure we don't trigger the build rules for it?
BTW, our clients might use any copyright header they want. So we should ensure our builds can handle that.
extern "C" {

void registry() {
Do we need the entire function definition inside the extern block, or is it possible to just declare the function here and have the definition elsewhere? Just asking, as it's possible the registration is a big function or we want it to call other functions.
e.g. In this folder we have a bunch of window functions we want to expose for users to register. It might be better to use this kind of file structure, as it's more realistic: https://github.com/facebookincubator/velox/blob/main/velox/functions/prestosql/window/WindowFunctionsRegistration.cpp#L30
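As far as the language is concerned, a declaration inside the `extern "C"` block is enough: a later definition with the same signature keeps C linkage, and it can call ordinary C++ functions. A minimal sketch, in which `registeredNames` and `registerWindowFunctions` are hypothetical helpers invented for illustration:

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical registration helpers; in a real plugin these could live in
// other translation units.
std::vector<std::string>& registeredNames() {
  static std::vector<std::string> names;
  return names;
}

void registerWindowFunctions(const std::string& prefix) {
  registeredNames().push_back(prefix + "rank");
  registeredNames().push_back(prefix + "row_number");
}

// Only the declaration sits inside the extern "C" block.
extern "C" {
void registry();
}

// The definition keeps C linkage because it matches the declaration above,
// so the exported symbol is still the unmangled name `registry`.
void registry() {
  registerWindowFunctions("my_");
}
```

This keeps the `extern "C"` surface to a single line while the registration logic stays in regular C++ code.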
target_link_libraries(name_of_dynamic_fn PRIVATE xsimd fmt::fmt velox_expression)
```

3. In the Prestissimo worker's config.properties file, set the plugin.dir property
This is not relevant in Velox. Also, since it's not used anywhere in the current code, it's hard to put this in the picture.
@@ -0,0 +1,22 @@
# Velox: Dynamically Loading Registry Libraries in C++

This library adds the ability to load User Defined Functions (UDFs), connectors, or types without having to fork and build Prestissimo, through the use of shared libraries that a Prestissimo worker can access. These are to be loaded on launch of the Presto server. The Presto server searches for any .so or .dylib files and loads them using this library.
It might be good to not talk about Prestissimo in this README.
This is a generic utility for dynamically loading a "registry" function from a library. It's sufficient to just say that this is for "Extensibility" features that add custom user code, which could include new Velox types, functions, operators and connectors.
Related to prestodb/presto#23634 in the Prestissimo space and based on the following PR: https://github.com/facebookincubator/velox/pull/1005/files
These changes will allow users to dynamically load functions in Prestissimo using C++. The Presto server will use this library to dynamically load User Defined Functions (UDFs), connectors, or types.
An example of dynamically registering a function is also provided for reference, along with a unit test.
Currently, this library works on Linux machines but not macOS.