[C++][Python] Protobuf error building Arrow on macOS #44868

Open
stemillington-flock opened this issue Nov 27, 2024 · 11 comments

Comments

@stemillington-flock

Describe the bug, including details regarding any error messages, version, and platform.

I am trying to build Arrow following the instructions here.

I created the conda environment and installed all the requirements, but when I run the command

cmake -S arrow/cpp -B arrow/cpp/build \
        -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
        --preset ninja-release-python

I get the error

CMake Error at /opt/homebrew/lib/cmake/protobuf/protobuf-targets.cmake:71 (set_target_properties):
  The link interface of target "protobuf::libprotobuf" contains:

    absl::absl_check

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /opt/homebrew/lib/cmake/protobuf/protobuf-config.cmake:16 (include)
  cmake_modules/FindProtobufAlt.cmake:31 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:313 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:1962 (resolve_dependency)
  CMakeLists.txt:546 (include)

This is on a MacBook Pro with an M2 chip.

Component(s)

Python

@raulcd
Member

raulcd commented Nov 27, 2024

This seems to be the same issue as the one reported in #41331.

  • Could you try uninstalling protobuf from Homebrew, as suggested in that issue?
  • Could you add a bit more log context? Is this when building orc_ep?

@stemillington-flock
Author

I tried uninstalling protobuf using brew uninstall --ignore-dependencies protobuf

and now instead get the error

-- Could NOT find protobuf (missing: protobuf_DIR)
-- Could NOT find Protobuf (missing: Protobuf_LIBRARIES Protobuf_INCLUDE_DIR) 
CMake Error at cmake_modules/ThirdpartyToolchain.cmake:315 (if):
  if given arguments:

    "VERSION_LESS" "3.0.0"

  Unknown arguments specified
Call Stack (most recent call first):
  cmake_modules/ThirdpartyToolchain.cmake:1962 (resolve_dependency)
  CMakeLists.txt:546 (include)

Here's a little more of the original error

-- Providing CMake module for FindorcAlt as part of Arrow CMake package
-- Found ORC static library: /opt/homebrew/Caskroom/miniconda/base/envs/pyarrow-dev/lib/liborc.dylib
-- Found ORC headers: /opt/homebrew/Caskroom/miniconda/base/envs/pyarrow-dev/include
-- All bundled static libraries: substrait;mimalloc::mimalloc
-- CMAKE_C_FLAGS: -ftree-vectorize -fPIC -fstack-protector-strong -O2 -pipe -isystem /opt/homebrew/Caskroom/miniconda/base/envs/pyarrow-dev/include -Qunused-arguments  -Wall -Wno-unknown-warning-option -Wno-pass-failed -march=armv8-a 
-- CMAKE_CXX_FLAGS:  -fno-aligned-new -ftree-vectorize -fPIC -fstack-protector-strong -O2 -pipe -stdlib=libc++ -fvisibility-inlines-hidden -fmessage-length=0 -isystem /opt/homebrew/Caskroom/miniconda/base/envs/pyarrow-dev/include -Qunused-arguments -fcolor-diagnostics  -Wall -Wno-unknown-warning-option -Wno-pass-failed -march=armv8-a 
-- CMAKE_C_FLAGS_RELEASE: -O3 -DNDEBUG -O2 
-- CMAKE_CXX_FLAGS_RELEASE: -O3 -DNDEBUG -O2 
CMake Warning (dev) at src/arrow/CMakeLists.txt:1096 (install):
  Policy CMP0177 is not set: install() DESTINATION paths are normalized.  Run
  "cmake --help-policy CMP0177" for policy details.  Use the cmake_policy
  command to set the policy and suppress this warning.
This warning is for project developers.  Use -Wno-dev to suppress it.

-- ---------------------------------------------------------------------
-- Arrow version:                                 19.0.0-SNAPSHOT
-- 
-- Build configuration summary:
--   Generator: Ninja
--   Build type: RELEASE
--   Source directory: /Users/stephen.millington/Code/arrow/cpp
--   Install prefix: 
--   Compile commands: /Users/stephen.millington/Code/arrow/cpp/build/compile_commands.json
-- 
-- Compile and link options:
-- 
--   ARROW_CXXFLAGS="" [default=""]
--       Compiler flags to append when compiling Arrow
--   ARROW_BUILD_STATIC=OFF [default=ON]
--       Build static libraries
--   ARROW_BUILD_SHARED=ON [default=ON]
--       Build shared libraries
--   ARROW_PACKAGE_KIND="" [default=""]
--       Arbitrary string that identifies the kind of package
--       (for informational purposes)
--   ARROW_GIT_ID=26b08a3246799d65937d9764784156a0d301ea42 [default=""]
--       The Arrow git commit id (if any)
--   ARROW_GIT_DESCRIPTION=apache-arrow-19.0.0.dev-138-g26b08a324 [default=""]
--       The Arrow git commit description (if any)
--   ARROW_POSITION_INDEPENDENT_CODE=ON [default=ON]
--       Whether to create position-independent target
--   ARROW_USE_CCACHE=ON [default=ON]
--       Use ccache when compiling (if available)
--   ARROW_USE_SCCACHE=ON [default=ON]
--       Use sccache when compiling (if available),
--       takes precedence over ccache if a storage backend is configured
--   ARROW_USE_LD_GOLD=OFF [default=OFF]
--       Use ld.gold for linking on Linux (if available)
--   ARROW_USE_LLD=OFF [default=OFF]
--       Use the LLVM lld for linking (if available)
--   ARROW_USE_MOLD=OFF [default=OFF]
--       Use mold for linking on Linux (if available)
--   ARROW_USE_PRECOMPILED_HEADERS=OFF [default=OFF]
--       Use precompiled headers when compiling
--   ARROW_SIMD_LEVEL=NEON [default=DEFAULT|NONE|SSE4_2|AVX2|AVX512|NEON|SVE|SVE128|SVE256|SVE512]
--       Compile-time SIMD optimization level
--   ARROW_RUNTIME_SIMD_LEVEL=MAX [default=MAX|NONE|SSE4_2|AVX2|AVX512]
--       Max runtime SIMD optimization level
--   ARROW_ALTIVEC=ON [default=ON]
--       Build with Altivec if compiler has support
--   ARROW_RPATH_ORIGIN=OFF [default=OFF]
--       Build Arrow libraries with RATH set to $ORIGIN
--   ARROW_INSTALL_NAME_RPATH=ON [default=ON]
--       Build Arrow libraries with install_name set to @rpath
--   ARROW_GGDB_DEBUG=ON [default=ON]
--       Pass -ggdb flag to debug builds
--   ARROW_WITH_MUSL=OFF [default=OFF]
--       Whether the system libc is musl or not
--   ARROW_ENABLE_THREADING=ON [default=ON]
--       Enable threading in Arrow core
-- 
-- Test and benchmark options:
-- 
--   ARROW_BUILD_EXAMPLES=OFF [default=OFF]
--       Build the Arrow examples
--   ARROW_BUILD_TESTS=OFF [default=OFF]
--       Build the Arrow googletest unit tests
--   ARROW_ENABLE_TIMING_TESTS=ON [default=ON]
--       Enable timing-sensitive tests
--   ARROW_BUILD_INTEGRATION=OFF [default=OFF]
--       Build the Arrow integration test executables
--   ARROW_BUILD_BENCHMARKS=OFF [default=OFF]
--       Build the Arrow micro benchmarks
--   ARROW_BUILD_BENCHMARKS_REFERENCE=OFF [default=OFF]
--       Build the Arrow micro reference benchmarks
--   ARROW_BUILD_OPENMP_BENCHMARKS=OFF [default=OFF]
--       Build the Arrow benchmarks that rely on OpenMP
--   ARROW_BUILD_DETAILED_BENCHMARKS=OFF [default=OFF]
--       Build benchmarks that do a longer exploration of performance
--   ARROW_TEST_LINKAGE=shared [default=shared|static]
--       Linkage of Arrow libraries with unit tests executables.
--   ARROW_FUZZING=OFF [default=OFF]
--       Build Arrow Fuzzing executables
--   ARROW_LARGE_MEMORY_TESTS=OFF [default=OFF]
--       Enable unit tests which use large memory
-- 
-- Lint options:
-- 
--   ARROW_ONLY_LINT=OFF [default=OFF]
--       Only define the lint and check-format targets
--   ARROW_VERBOSE_LINT=OFF [default=OFF]
--       If off, 'quiet' flags will be passed to linting tools
--   ARROW_GENERATE_COVERAGE=OFF [default=OFF]
--       Build with C++ code coverage enabled
-- 
-- Checks options:
-- 
--   ARROW_TEST_MEMCHECK=OFF [default=OFF]
--       Run the test suite using valgrind --tool=memcheck
--   ARROW_USE_ASAN=OFF [default=OFF]
--       Enable Address Sanitizer checks
--   ARROW_USE_TSAN=OFF [default=OFF]
--       Enable Thread Sanitizer checks
--   ARROW_USE_UBSAN=OFF [default=OFF]
--       Enable Undefined Behavior sanitizer checks
-- 
-- Project component options:
-- 
--   ARROW_ACERO=ON [default=OFF]
--       Build the Arrow Acero Engine Module
--   ARROW_AZURE=OFF [default=OFF]
--       Build Arrow with Azure support (requires the Azure SDK for C++)
--   ARROW_BUILD_UTILITIES=OFF [default=OFF]
--       Build Arrow commandline utilities
--   ARROW_COMPUTE=ON [default=OFF]
--       Build all Arrow Compute kernels
--   ARROW_CSV=ON [default=OFF]
--       Build the Arrow CSV Parser Module
--   ARROW_CUDA=OFF [default=OFF]
--       Build the Arrow CUDA extensions (requires CUDA toolkit)
--   ARROW_DATASET=ON [default=OFF]
--       Build the Arrow Dataset Modules
--   ARROW_FILESYSTEM=ON [default=OFF]
--       Build the Arrow Filesystem Layer
--   ARROW_FLIGHT=OFF [default=OFF]
--       Build the Arrow Flight RPC System (requires GRPC, Protocol Buffers)
--   ARROW_FLIGHT_SQL=OFF [default=OFF]
--       Build the Arrow Flight SQL extension
--   ARROW_GANDIVA=OFF [default=OFF]
--       Build the Gandiva libraries
--   ARROW_GCS=OFF [default=OFF]
--       Build Arrow with GCS support (requires the GCloud SDK for C++)
--   ARROW_HDFS=OFF [default=OFF]
--       Build the Arrow HDFS bridge
--   ARROW_IPC=ON [default=ON]
--       Build the Arrow IPC extensions
--   ARROW_JEMALLOC=OFF [default=OFF]
--       Build the Arrow jemalloc-based allocator
--   ARROW_JSON=ON [default=OFF]
--       Build Arrow with JSON support (requires RapidJSON)
--   ARROW_MIMALLOC=ON [default=OFF]
--       Build the Arrow mimalloc-based allocator
--   ARROW_PARQUET=ON [default=OFF]
--       Build the Parquet libraries
--   ARROW_ORC=ON [default=OFF]
--       Build the Arrow ORC adapter
--   ARROW_PYTHON=OFF [default=OFF]
--       Build some components needed by PyArrow.
--       (This is a deprecated option. Use CMake presets instead.)
--   ARROW_S3=OFF [default=OFF]
--       Build Arrow with S3 support (requires the AWS SDK for C++)
--   ARROW_SKYHOOK=OFF [default=OFF]
--       Build the Skyhook libraries
--   ARROW_SUBSTRAIT=ON [default=OFF]
--       Build the Arrow Substrait Consumer Module
--   ARROW_TENSORFLOW=OFF [default=OFF]
--       Build Arrow with TensorFlow support enabled
--   ARROW_TESTING=OFF [default=OFF]
--       Build the Arrow testing libraries
-- 
-- Thirdparty toolchain options:
-- 
--   ARROW_DEPENDENCY_SOURCE=CONDA [default=AUTO|BUNDLED|SYSTEM|CONDA|VCPKG|BREW]
--       Method to use for acquiring arrow's build dependencies
--   ARROW_VERBOSE_THIRDPARTY_BUILD=OFF [default=OFF]
--       Show output from ExternalProjects rather than just logging to files
--   ARROW_DEPENDENCY_USE_SHARED=ON [default=ON]
--       Link to shared libraries
--   ARROW_BOOST_USE_SHARED=ON [default=ON]
--       Rely on Boost shared libraries where relevant
--   ARROW_BROTLI_USE_SHARED=ON [default=ON]
--       Rely on Brotli shared libraries where relevant
--   ARROW_BZ2_USE_SHARED=ON [default=ON]
--       Rely on Bz2 shared libraries where relevant
--   ARROW_GFLAGS_USE_SHARED=ON [default=ON]
--       Rely on GFlags shared libraries where relevant
--   ARROW_GRPC_USE_SHARED=ON [default=ON]
--       Rely on gRPC shared libraries where relevant
--   ARROW_JEMALLOC_USE_SHARED=ON [default=ON]
--       Rely on jemalloc shared libraries where relevant
--   ARROW_LLVM_USE_SHARED=ON [default=ON]
--       Rely on LLVM shared libraries where relevant
--   ARROW_LZ4_USE_SHARED=ON [default=ON]
--       Rely on lz4 shared libraries where relevant
--   ARROW_OPENSSL_USE_SHARED=ON [default=ON]
--       Rely on OpenSSL shared libraries where relevant
--   ARROW_PROTOBUF_USE_SHARED=ON [default=ON]
--       Rely on Protocol Buffers shared libraries where relevant
--   ARROW_SNAPPY_USE_SHARED=ON [default=ON]
--       Rely on snappy shared libraries where relevant
--   ARROW_THRIFT_USE_SHARED=ON [default=ON]
--       Rely on thrift shared libraries where relevant
--   ARROW_UTF8PROC_USE_SHARED=ON [default=ON]
--       Rely on utf8proc shared libraries where relevant
--   ARROW_ZSTD_USE_SHARED=ON [default=ON]
--       Rely on zstd shared libraries where relevant
--   ARROW_USE_GLOG=OFF [default=OFF]
--       Build libraries with glog support for pluggable logging
--   ARROW_WITH_BACKTRACE=ON [default=ON]
--       Build with backtrace support
--   ARROW_WITH_OPENTELEMETRY=OFF [default=OFF]
--       Build libraries with OpenTelemetry support for distributed tracing
--   ARROW_WITH_BROTLI=ON [default=OFF]
--       Build with Brotli compression
--   ARROW_WITH_BZ2=ON [default=OFF]
--       Build with BZ2 compression
--   ARROW_WITH_LZ4=ON [default=OFF]
--       Build with lz4 compression
--   ARROW_WITH_SNAPPY=ON [default=OFF]
--       Build with Snappy compression
--   ARROW_WITH_ZLIB=ON [default=OFF]
--       Build with zlib compression
--   ARROW_WITH_ZSTD=ON [default=OFF]
--       Build with zstd compression
--   ARROW_WITH_UCX=OFF [default=OFF]
--       Build with UCX transport for Arrow Flight
--       (only used if ARROW_FLIGHT is ON)
--   ARROW_WITH_UTF8PROC=ON [default=ON]
--       Build with support for Unicode properties using the utf8proc library
--       (only used if ARROW_COMPUTE is ON or ARROW_GANDIVA is ON)
--   ARROW_WITH_RE2=ON [default=ON]
--       Build with support for regular expressions using the re2 library
--       (only used if ARROW_COMPUTE or ARROW_GANDIVA is ON)
-- 
-- Parquet options:
-- 
--   PARQUET_MINIMAL_DEPENDENCY=OFF [default=OFF]
--       Depend only on Thirdparty headers to build libparquet.
--       Always OFF if building binaries
--   PARQUET_BUILD_EXECUTABLES=OFF [default=OFF]
--       Build the Parquet executable CLI tools. Requires static libraries to be built.
--   PARQUET_BUILD_EXAMPLES=OFF [default=OFF]
--       Build the Parquet examples. Requires static libraries to be built.
--   PARQUET_REQUIRE_ENCRYPTION=OFF [default=OFF]
--       Build support for encryption. Fail if OpenSSL is not found
-- 
-- Gandiva options:
-- 
--   ARROW_GANDIVA_STATIC_LIBSTDCPP=OFF [default=OFF]
--       Include -static-libstdc++ -static-libgcc when linking with
--       Gandiva static libraries
--   ARROW_GANDIVA_PC_CXX_FLAGS="" [default=""]
--       Compiler flags to append when pre-compiling Gandiva operations
-- 
-- Cross compiling options:
-- 
--   ARROW_GRPC_CPP_PLUGIN="" [default=""]
--       grpc_cpp_plugin path to be used
-- 
-- Advanced developer options:
-- 
--   ARROW_EXTRA_ERROR_CONTEXT=OFF [default=OFF]
--       Compile with extra error context (line numbers, code)
--   ARROW_OPTIONAL_INSTALL=OFF [default=OFF]
--       If enabled install ONLY targets that have already been built. Please be
--       advised that if this is enabled 'install' will fail silently on components
--       that have not been built
--   ARROW_GDB_INSTALL_DIR="" [default=""]
--       Use a custom install directory for GDB plugin.
--       In general, you don't need to specify this because the default
--       (CMAKE_INSTALL_FULL_BINDIR on Windows, CMAKE_INSTALL_FULL_LIBDIR otherwise)
--       is reasonable.
--   Outputting build configuration summary to /Users/stephen.millington/Code/arrow/cpp/build/cmake_summary.json
-- Configuring done (0.4s)
CMake Error at /opt/homebrew/lib/cmake/protobuf/protobuf-targets.cmake:71 (set_target_properties):
  The link interface of target "protobuf::libprotobuf" contains:

    absl::absl_check

  but the target was not found.  Possible reasons include:

    * There is a typo in the target name.
    * A find_package call is missing for an IMPORTED target.
    * An ALIAS target is missing.

Call Stack (most recent call first):
  /opt/homebrew/lib/cmake/protobuf/protobuf-config.cmake:16 (include)
  cmake_modules/FindProtobufAlt.cmake:31 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:313 (find_package)
  cmake_modules/ThirdpartyToolchain.cmake:1962 (resolve_dependency)
  CMakeLists.txt:546 (include)

@raulcd changed the title from "Error Building Arrow" to "[C++][Python] Protobuf error building Arrow on macOS" on Nov 27, 2024
@amoeba
Member

amoeba commented Nov 27, 2024

I can reproduce this issue on my machine. I see the same output and the line that concerns me is this one:

CMake Error at /opt/homebrew/lib/cmake/protobuf/protobuf-targets.cmake:71 (set_target_properties):

I don't think the build should be looking at Homebrew's protobuf if we're packaging with conda. I tried to use bundled Protobuf (-DProtobuf_SOURCE=BUNDLED) but got zlib linking errors. I checked the lib/cmake folder in $CONDA_PREFIX for (lib)protobuf CMake files and don't see any, so maybe the build falls back to Homebrew and then runs into trouble?
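The check described above can be sketched as a small shell helper. This is only an illustration: has_protobuf_cmake is a hypothetical name, and it assumes a prefix that ships protobuf's CMake package files places them under lib/cmake/protobuf, as the Homebrew path in the error message does.

```shell
# Hypothetical helper: report whether a prefix ships protobuf's CMake config file.
# Assumes the lib/cmake/protobuf layout seen in the Homebrew path from the error.
has_protobuf_cmake() {
  prefix="$1"
  if [ -f "$prefix/lib/cmake/protobuf/protobuf-config.cmake" ]; then
    echo "found: $prefix/lib/cmake/protobuf"
    return 0
  fi
  echo "missing: $prefix/lib/cmake/protobuf"
  return 1
}

# Compare the active conda environment with the Homebrew prefix.
# "|| true" keeps the script going when a prefix has no config file.
has_protobuf_cmake "${CONDA_PREFIX:-/nonexistent}" || true
has_protobuf_cmake /opt/homebrew || true
```

If the conda prefix reports "missing" while /opt/homebrew reports "found", that would be consistent with CMake picking up the Homebrew package instead.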

@amoeba
Member

amoeba commented Nov 27, 2024

This is a bit of a conda usage question for me now, but why do I get libprotobuf 3.21.12 and not a more recent one when I run conda install -c conda-forge libprotobuf?

...>8...
libprotobuf        conda-forge/osx-arm64::libprotobuf-3.21.12-ha614eb4_2
...>8...

It looks like the most recent version for my system is osx-arm64/libprotobuf-5.28.3-h8f0b736_0.conda.

Edit: I'm guessing another package is causing conda to solve for a much older libprotobuf.
Edit 2: And the newer libprotobuf does ship CMake files.

@amoeba
Member

amoeba commented Nov 27, 2024

I was able to adjust the conda package versions and get this build working correctly, and I think the fix here is to update conda_env_cpp.

The issue was our pin of grpc-cpp<1.50.1. I removed grpc-cpp, orc, and libprotobuf (as they're entangled) from the environment and reinstalled them, which brought in newer versions. I was then able to build.

@raulcd the pin showed up in #35089. What do you think about me submitting a PR and doing some testing in the PR to see what breaks?

@raulcd
Member

raulcd commented Nov 28, 2024

I think we can try removing the pin for grpc-cpp. If I understood the solution to the original conda issue correctly, the pin might be unnecessary with the new abseil version: conda-forge/grpc-cpp-feedstock#281
The macOS verification jobs used to fail, so we can check those to see whether the gRPC failure is reproducible.
@stemillington-flock, if you change the pin that @amoeba pointed to (https://github.com/apache/arrow/blob/main/ci/conda_env_cpp.txt#L34) to just grpc-cpp and create a new conda environment, is the issue still reproducible?

@stemillington-flock
Author

I took the following steps

  • deleted the conda environment
  • changed grpc-cpp<=1.50.1 to grpc-cpp in conda_env_cpp.txt
  • recreated the conda environment
  • activated the new environment and re-ran the cmake command

I now get this error

CMake Error at /opt/homebrew/lib/cmake/protobuf/protobuf-targets.cmake:71 (set_target_properties):
  The link interface of target "protobuf::libprotobuf" contains:

    absl::if_constexpr

  but the target was not found.  Possible reasons include:

Let me know if you need more detail from the error
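
For reference, the pin change in the steps above can be scripted. The sketch below is a minimal illustration, not something taken from the Arrow repository: unpin_grpc_cpp is a hypothetical helper name, and it assumes the pin sits on its own line starting with grpc-cpp followed by a version operator.

```shell
# Hypothetical helper: drop any version constraint on grpc-cpp in a conda env file.
unpin_grpc_cpp() {
  env_file="$1"
  # Rewrite a line such as "grpc-cpp<=1.50.1" to plain "grpc-cpp".
  # All other lines pass through unchanged. Writing to a temporary file
  # avoids the differing "sed -i" syntax on macOS and Linux.
  sed -E 's/^grpc-cpp[<>=!].*$/grpc-cpp/' "$env_file" > "$env_file.tmp" &&
    mv "$env_file.tmp" "$env_file"
}

# Usage (path as used in this thread):
#   unpin_grpc_cpp arrow/ci/conda_env_cpp.txt
```

The environment still has to be deleted and recreated afterwards for the change to take effect.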

@amoeba
Member

amoeba commented Nov 28, 2024

Thanks for testing @stemillington-flock. Can you run conda list when inside the environment and report out which libprotobuf version you have? And can you also re-create the build directory from scratch before re-running cmake? You usually don't have to do this but I find I often do when troubleshooting issues like this.
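
A minimal sketch of the version check, assuming conda is on PATH and the environment is active. pkg_version is a hypothetical helper that reads `conda list`-style output on standard input and prints the version column for one package. The sample line is the one reported later in this thread.

```shell
# Hypothetical helper: print the version column for a package from
# `conda list`-style output read on stdin (columns: name, version, build, channel).
pkg_version() {
  awk -v pkg="$1" '$1 == pkg { print $2 }'
}

# Example using the line reported in this thread:
printf '%s\n' 'libprotobuf               3.21.12              ha614eb4_2    conda-forge' |
  pkg_version libprotobuf
# prints 3.21.12

# In a live environment:
#   conda list | pkg_version libprotobuf
```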

@stemillington-flock
Author

No worries. I deleted the build folder and recreated it, but I still get the same error. For libprotobuf I have

libprotobuf               3.21.12              ha614eb4_2    conda-forge

Could this be an issue with the priority of the conda channels? I have

--add channels 'defaults'   # lowest priority
--add channels 'conda-forge'   # highest priority

@stemillington-flock
Author

I flipped the channel priorities in my conda config and tried again from scratch: same version of libprotobuf and same error. All the libraries are installed from conda-forge.

@amoeba
Member

amoeba commented Nov 29, 2024

Hi again @stemillington-flock, I think the steps you took might not have installed the right versions.

I tested again and was able to get configuration (and a build) to succeed with the following steps:

  1. Create the environment with the latest versions of the conda env files:

    conda create -n pyarrow-dev -c conda-forge \
      --file arrow/ci/conda_env_unix.txt \
      --file arrow/ci/conda_env_cpp.txt \
      --file arrow/ci/conda_env_python.txt \
      --file arrow/ci/conda_env_gandiva.txt \
      compilers \
      python=3.10 \
      pandas
    conda activate pyarrow-dev

  2. Remove grpc-cpp entirely from the environment:

    conda uninstall grpc-cpp

  3. Install a more recent libprotobuf into the environment:

    conda install -c conda-forge libprotobuf==5.28.2

  4. Configure and build:

    export ARROW_HOME="$CONDA_PREFIX"
    cmake -S arrow/cpp \
      -B arrow/cpp/build \
      -DCMAKE_INSTALL_PREFIX=$ARROW_HOME \
      --preset ninja-release-python
    cmake --build arrow/cpp/build

Let us know if that doesn't work for you.
