feat: add a module to read all Substrait plan formats #45
Draft: EpsilonPrime wants to merge 35 commits into substrait-io:main from EpsilonPrime:textformat_for_python
Commits
306ea96 Starting. (EpsilonPrime)
42cee0e A working version but for some reason tests aren't working properly. (EpsilonPrime)
684d0ac Now passes with hand-built library. (EpsilonPrime)
7287773 Corrected build instructions. (EpsilonPrime)
0f8074e Remove some comments. (EpsilonPrime)
18b0bd8 Experiment adding the planloader to the build. (EpsilonPrime)
0f5922b Now handles errors consistently. (EpsilonPrime)
bba04a3 Remove accidentally added test output file. (EpsilonPrime)
b06464c Updated based on review. (EpsilonPrime)
e207987 Updated workflow. (EpsilonPrime)
50010a3 Added missing requirements to test workflow. (EpsilonPrime)
e91bb92 See if we can add a C++ library dependency. (EpsilonPrime)
7b716f5 Not that way. (EpsilonPrime)
b44d311 Another attempt at the C++ library dependency. (EpsilonPrime)
e2618b4 Not that way either. (EpsilonPrime)
09b2f30 Correct updated size type. (EpsilonPrime)
85e55a4 Always free the returned SerializedPlan. (EpsilonPrime)
006da91 Be consistent with signed int32. (EpsilonPrime)
51da6b0 Now point at latest head. (EpsilonPrime)
154e5e3 Switch to turning submodules to true following documents to see if th… (EpsilonPrime)
75178ac Another attempt at recursing into submodules. (EpsilonPrime)
791fc14 Use pending version of substrait-cpp which uses updated substrait repo. (EpsilonPrime)
5c52ff8 Updated substrait-cpp dependency. (EpsilonPrime)
2b5a510 Add cmake dependency to rules. (EpsilonPrime)
595f7da Switch back to the version of substrait-cpp that matches the current … (EpsilonPrime)
e32a782 See if we can add a protobuf-c dependency. (EpsilonPrime)
09bef75 Try using an apt-get action. (EpsilonPrime)
e89780c Change package name. (EpsilonPrime)
f0f21c0 Switch to using apt-get directly (Linux only). (EpsilonPrime)
82340d1 Update substrait-cpp (provides an external protobuf dependency. (EpsilonPrime)
9ff593b Remove apt-get call. (EpsilonPrime)
7f5c023 See if we can add curl to the environment. (EpsilonPrime)
10a91dd See if we can add curl to the environment. (EpsilonPrime)
3044b7c add more environment dependencies (EpsilonPrime)
b42276c Adding curl somewhere else. (EpsilonPrime)
@@ -22,10 +22,22 @@ jobs:
        uses: actions/checkout@v4
        with:
          submodules: recursive
      - name: Pull submodules
        run: |
          git submodule update --init --recursive
          git submodule update --recursive
      - name: Set up Python
        uses: actions/setup-python@v5
        with:
          python-version: ${{ matrix.python }}
      - name: Build Substrait planloader library
        run: |
          cd ${{ github.workspace }}/third_party/substrait-cpp
          make release
      - name: Install Substrait planloader library
        run: |
          cd ${{ github.workspace }}/third_party/substrait-cpp/build-Release/export/planloader
          make install
Review comment on lines +33 to +40: We'll want to handle compilation via the pyproject.toml file, I'll leave some notes there.
      - name: Install package and test dependencies
        run: |
          python -m pip install --upgrade pip
@@ -1,3 +1,6 @@
[submodule "third_party/substrait"]
    path = third_party/substrait
    url = https://github.com/substrait-io/substrait
[submodule "third_party/substrait-cpp"]
    path = third_party/substrait-cpp
    url = git@github.com:substrait-io/substrait-cpp.git
@@ -11,3 +11,4 @@ dependencies:
  - python >= 3.8.1
  - setuptools >= 61.0.0
  - setuptools_scm >= 6.2.0
  - libcurl4-openssl-dev
@@ -0,0 +1,84 @@
# SPDX-License-Identifier: Apache-2.0
"""Routines for loading and saving Substrait plans."""
import ctypes
import ctypes.util as ctutil
import enum
import substrait.gen.proto.plan_pb2 as plan_pb2
import sys


class PlanFileFormat(enum.Enum):
    BINARY = ctypes.c_int32(0)
    JSON = ctypes.c_int32(1)
    PROTOTEXT = ctypes.c_int32(2)
    TEXT = ctypes.c_int32(3)


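Because the enum values here are `ctypes.c_int32` instances, `PlanFileFormat.TEXT.value` is a ctypes object rather than a Python integer, and ctypes objects compare by identity. As a point of comparison only, here is a minimal sketch of a plain-integer variant that converts at the ctypes boundary. The names `PlanFileFormatInt` and `as_c_int32` are hypothetical and not part of this PR; the numeric values are copied from the enum above.

```python
import ctypes
import enum


class PlanFileFormatInt(enum.IntEnum):
    """Hypothetical plain-integer variant of the format enum."""
    BINARY = 0
    JSON = 1
    PROTOTEXT = 2
    TEXT = 3


def as_c_int32(file_format: PlanFileFormatInt) -> ctypes.c_int32:
    """Convert a member to the signed 32-bit integer type used by the C signature."""
    return ctypes.c_int32(int(file_format))
```

With a variant like this, a member can be looked up from a raw integer (`PlanFileFormatInt(1)`) and compared by value, while the C call still receives a signed 32-bit integer.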
class PlanFileException(Exception):
    pass


class SerializedPlan(ctypes.Structure):
    pass


SerializedPlan._fields_ = [
    ("buffer", ctypes.POINTER(ctypes.c_byte)),
    ("size", ctypes.c_int32),
    ("errorMessage", ctypes.c_char_p),
]


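# Load the C++ library.
planloader_path = ctutil.find_library("planloader")
if planloader_path is None:
    # find_library() returns None when nothing is found, so check the path
    # before handing it to ctypes.CDLL (which never returns None itself).
    print('Failed to find planloader library', file=sys.stderr)
    sys.exit(1)
planloader_lib = ctypes.CDLL(planloader_path)
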
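`ctypes.util.find_library` returns `None` when no matching library is found, so its result is worth checking before it reaches `ctypes.CDLL`. On POSIX systems `ctypes.CDLL(None)` yields a handle to the running process rather than failing. The following is a minimal sketch of that check which raises instead of exiting the interpreter; the helper name `load_shared_library` is hypothetical and not part of this PR.

```python
import ctypes
import ctypes.util


def load_shared_library(name: str) -> ctypes.CDLL:
    """Hypothetical helper: locate a shared library by short name and load it."""
    path = ctypes.util.find_library(name)
    if path is None:
        # Raising lets the importing code decide how to react.
        raise ImportError(f"Failed to find {name} library")
    return ctypes.CDLL(path)
```

Raising `ImportError` rather than calling `sys.exit` is a design choice in this sketch: callers that treat the native library as optional can catch the exception.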
# Declare the function signatures for the external functions.
external_load_substrait_plan = planloader_lib.load_substrait_plan
external_load_substrait_plan.argtypes = [ctypes.c_char_p]
external_load_substrait_plan.restype = ctypes.POINTER(SerializedPlan)

external_free_substrait_plan = planloader_lib.free_substrait_plan
external_free_substrait_plan.argtypes = [ctypes.POINTER(SerializedPlan)]
external_free_substrait_plan.restype = None

external_save_substrait_plan = planloader_lib.save_substrait_plan
external_save_substrait_plan.argtypes = [ctypes.c_void_p, ctypes.c_int32, ctypes.c_char_p, ctypes.c_int32]
external_save_substrait_plan.restype = ctypes.c_char_p


def load_substrait_plan(filename: str) -> plan_pb2.Plan:
    """
    Loads a Substrait plan (in any format) from disk.

    Returns:
        A Plan protobuf object if successful.
    Raises:
        PlanFileException if an exception occurs while converting or reading from disk.
    """
    result = external_load_substrait_plan(filename.encode('UTF-8'))
    try:
        if result.contents.errorMessage:
            # errorMessage arrives as bytes; decode it for the exception text.
            raise PlanFileException(result.contents.errorMessage.decode('UTF-8'))
        data = ctypes.string_at(result.contents.buffer, result.contents.size)
        plan = plan_pb2.Plan()
        plan.ParseFromString(data)
    finally:
        # Always free the returned SerializedPlan, including on error.
        external_free_substrait_plan(result)
    return plan


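The load path reads a `buffer`/`size`/`errorMessage` struct through a pointer and copies the payload out with `ctypes.string_at` before the struct is freed. Here is a minimal sketch of that read pattern that runs without the native library. `make_result` is a hypothetical stand-in that builds the struct in Python-owned memory, and `RuntimeError` stands in for `PlanFileException`; neither is part of this PR.

```python
import ctypes
from typing import Optional


class SerializedPlan(ctypes.Structure):
    # Same field layout as the struct declared in the module above.
    _fields_ = [
        ("buffer", ctypes.POINTER(ctypes.c_byte)),
        ("size", ctypes.c_int32),
        ("errorMessage", ctypes.c_char_p),
    ]


def make_result(payload: bytes, error: Optional[bytes] = None):
    """Hypothetical stand-in for the native loader."""
    storage = ctypes.create_string_buffer(payload, len(payload))
    result = SerializedPlan(
        ctypes.cast(storage, ctypes.POINTER(ctypes.c_byte)),
        len(payload),
        error,
    )
    # Return the backing storage too so the caller keeps it alive.
    return ctypes.pointer(result), storage


def read_payload(result_ptr) -> bytes:
    """Copy the payload out of the struct, or raise if an error is set."""
    result = result_ptr.contents
    if result.errorMessage:
        raise RuntimeError(result.errorMessage.decode("utf-8"))
    # An explicit size copies exactly that many bytes, embedded NULs included.
    return ctypes.string_at(result.buffer, result.size)
```

Passing the size explicitly matters for serialized protobuf data, which can contain zero bytes that a NUL-terminated read would stop at.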
def save_substrait_plan(plan: plan_pb2.Plan, filename: str, file_format: PlanFileFormat):
    """
    Saves the given plan to disk in the specified file format.

    Raises:
        PlanFileException if an exception occurs while converting or writing to disk.
    """
    data = plan.SerializeToString()
    err = external_save_substrait_plan(data, len(data), filename.encode('UTF-8'), file_format.value)
    if err:
        # The error message arrives as bytes; decode it for the exception text.
        raise PlanFileException(err.decode('UTF-8'))
@@ -0,0 +1,9 @@
# SPDX-License-Identifier: Apache-2.0


from substrait.planloader import planloader


def test_main():
    testplan = planloader.load_substrait_plan('tests/tpch-plan01.json')
    planloader.save_substrait_plan(testplan, 'myoutfile.splan', planloader.PlanFileFormat.TEXT)
EpsilonPrime marked the review conversations on both calls as resolved.
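The test above calls both functions without asserting on the result, and it writes its output into the working directory. One possible tightening, sketched here under assumptions, is to save into a temporary directory, reload, and compare. The helper `check_round_trip` is hypothetical; it takes the loader and saver as parameters so the sketch runs without the native library, and it assumes the chosen format round-trips losslessly, which this PR does not state.

```python
import os
import tempfile


def check_round_trip(load, save, source_path, file_format):
    """Hypothetical helper: load, save to a temp file, reload, and compare."""
    original = load(source_path)
    with tempfile.TemporaryDirectory() as tmp_dir:
        out_path = os.path.join(tmp_dir, "roundtrip.splan")
        save(original, out_path, file_format)
        reloaded = load(out_path)
    # Fails if the saved file does not reproduce the original object.
    assert reloaded == original
    return reloaded
```

In the real test, `load` and `save` would presumably be `planloader.load_substrait_plan` and `planloader.save_substrait_plan`, with a `PlanFileFormat` member passed as `file_format`.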
Review comment on the workflow's planloader build and install steps: We'll want to handle compilation via the pyproject.toml file, I'll leave some notes there.