Errors with pyarrow ds.write_dataset() using 0.6.0 #171
Comments
@jorisvandenbossche FYI, I will post this as a pyarrow issue as well.
@ldacey I suspect this is an
@ldacey and @jorisvandenbossche, I think the issue here is related to an API change in 0.6.0 that was made to align the adlfs API with Python's os.mkdir(). See referenced here. Can anyone comment on whether s3fs works?
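For reference, the os.mkdir() behaviour mentioned above can be illustrated with the standard library alone. This is only a sketch of local-filesystem semantics, not of adlfs internals; the directory name and the helper `try_mkdir` are made up for the example.

```python
import os
import tempfile


def try_mkdir(path):
    """Return "created" if os.mkdir() succeeds, "exists" if the path is already there."""
    try:
        os.mkdir(path)
        return "created"
    except FileExistsError:
        # os.mkdir() refuses to create a directory that already exists.
        return "exists"


with tempfile.TemporaryDirectory() as root:
    target = os.path.join(root, "dev")  # hypothetical directory name
    first = try_mkdir(target)   # directory does not exist yet
    second = try_mkdir(target)  # same call again, directory now exists
    # os.makedirs(..., exist_ok=True) tolerates an existing directory.
    os.makedirs(target, exist_ok=True)
    tolerant_ok = os.path.isdir(target)
```

A caller that expects "create if missing" semantics would hit the second branch whenever the target already exists, which is consistent with the "already exists" error reported in this issue, although the exact adlfs code path is not shown here.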
Is this a change that will need to be made in pyarrow? I have no access to S3.
This created the 0.6 folder within the "dev" container, but "test" is a 0-byte file:
After upgrading to 0.6.2, running the command a single time works, but it does not create a folder; "test3" is an empty file. Running the command twice raises an error:
pyarrow 3.0.0
Can you try this branch? It should fix the issue.
No errors running
Implemented in release v0.6.3
I was able to write a table with asynchronous=False and then read it as a pyarrow table
Reading a large dataset (23,000 fragments) did not have any issues, and fs.find() did not show any empty blobs.
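A check like the fs.find() one described above could look roughly like the sketch below. It assumes an fsspec-style filesystem whose `find(path, detail=True)` returns a mapping of path to an info dict with a "size" entry. The helper `find_empty_blobs`, the `FakeFS` stand-in, and its paths are hypothetical and exist only so the example runs without Azure credentials.

```python
def find_empty_blobs(fs, path):
    """Return the sorted paths under `path` whose reported size is 0.

    `fs` is assumed to be an fsspec-style filesystem where
    find(path, detail=True) returns a {path: info} mapping.
    """
    listing = fs.find(path, detail=True)
    return sorted(p for p, info in listing.items() if info.get("size") == 0)


class FakeFS:
    """Stand-in for a real filesystem, used only to demonstrate the helper."""

    def find(self, path, detail=False):
        # Made-up listing: one real data file and one 0-byte blob.
        entries = {
            "dev/test/part-0.parquet": {"size": 1024, "type": "file"},
            "dev/test3": {"size": 0, "type": "file"},
        }
        matches = {p: i for p, i in entries.items() if p.startswith(path)}
        return matches if detail else sorted(matches)


empty = find_empty_blobs(FakeFS(), "dev")
```

With a real adlfs filesystem object, the same helper would be called as `find_empty_blobs(fs, "dev")`; an empty list would correspond to the "no empty blobs" result reported above.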
What happened:
I run into "The specified blob already exists" errors when trying to save a pyarrow dataset while adlfs 0.6.0 is installed. Reverting to 0.5.9 fixes this issue.
What you expected to happen:
Writing the dataset should not raise an error because the container already exists; the container is expected to exist beforehand.
Minimal Complete Verifiable Example:
ds.write_dataset("dev/example", filesystem=fs, partitioning=partitioning)
Pinning adlfs at 0.5.9 works:
Same code fails when adlfs 0.6.0 is installed:
Anything else we need to know?:
I narrowed this down when I downgraded pyarrow to 2.0 and noticed that adlfs 0.5.9 had also been installed and there were no errors. I then installed pyarrow 3.0 with adlfs pinned at 0.5.9, and it worked.
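To make the version narrowing above easier to repeat, here is a small sketch that flags adlfs versions in the range discussed in this thread. It treats 0.6.0 through 0.6.2 as affected and 0.6.3 as fixed; 0.6.1 is not mentioned in the thread, so including it is an assumption. The helper names are hypothetical, and only plain numeric versions such as "0.6.2" are handled.

```python
from importlib import metadata

# Assumption: every release from 0.6.0 up to (not including) 0.6.3 is affected.
FIRST_AFFECTED = (0, 6, 0)
FIRST_FIXED = (0, 6, 3)


def parse_version(text):
    """Turn "0.6.2" into (0, 6, 2); only plain numeric versions are handled."""
    return tuple(int(part) for part in text.split(".")[:3])


def is_affected(version_text):
    """True if the version falls in the range reported as problematic."""
    return FIRST_AFFECTED <= parse_version(version_text) < FIRST_FIXED


def installed_adlfs_version():
    """Return the installed adlfs version string, or None if it is not installed."""
    try:
        return metadata.version("adlfs")
    except metadata.PackageNotFoundError:
        return None
```

A quick check would be `v = installed_adlfs_version()` followed by `v is not None and is_affected(v)`.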
Environment: