Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WIP: Remove nested async by copying code #581

Merged
merged 3 commits into from
Mar 25, 2021

Conversation

martindurant
Copy link
Member

Major change to async code!

All nested calls to sync removed, by copying code such as walk() from spec to asyn. All async method only call async methods, and an attempt to do otherwise raises NotImplemented. In fact, _run_until_complete is gone. This is how it should have been written from the start (except that some copied code could clearly be factored out).

cc @hayesgb , this surely affects azure
s3fs and gcsfs already have conforming code locally I will push when we agree on this change.

@martindurant
Copy link
Member Author

Documentation will need to be very clear about about to write an async implementation.

cf fsspec/gcsfs#363 , which still fails due to a nested sync call somewhere.

cc @mrocklin - this should make calling async things much more robust.

@hayesgb
Copy link
Contributor

hayesgb commented Mar 23, 2021

@martindurant -- Want to be sure I'm clear. No calls from async code to sync code, correct?

@TomAugspurger
Copy link
Collaborator

I'm happy to see this :) @hayesgb LMK if you need a hand on the adlfs side of things.

@martindurant
Copy link
Member Author

No calls from async code to sync code

Specifically: async code cannot call sync code that calls async code.
For example, previously get was a sync function created by sync(_get), so that you can call it from normal code. However, it internally calls _expand_path which, in the spec implementation calls find, which also is sync(_find). That no longer happens, _expand_path's code is copied into AsyncFileSystem and calls await _find.

In other news...
#572 moved the loop into a threadlocal variable, so that you have one event loop per thread instead of a single event loop in its own thread. This required some objects deeply tied to the loop, like http session objects, to also be threadlocal. I don't know if this affects azure.

@martindurant
Copy link
Member Author

@hayesgb @TomAugspurger - probably should include adlfs in the list of things that get tested here, as s3fs and gcsfs already are. That would be a separate PR.

@hayesgb
Copy link
Contributor

hayesgb commented Mar 24, 2021 via email

@TomAugspurger
Copy link
Collaborator

Sounds good. I'll take a look at this today.

@TomAugspurger
Copy link
Collaborator

Locally, I see three failures from the adlfs test suite, but those three failures are also present on fsspec master. At a glance, some look related to "directory" handling, which I think is discussed in #562.

If needed I can bisect fsspec to the commit where things started failing (they pass with 0.8.7) but I'm guessing it's already known.

@martindurant
Copy link
Member Author

Sounds good, @TomAugspurger - let's cover that in the separate issue, then.

@martindurant martindurant merged commit 0b28f04 into fsspec:master Mar 25, 2021
@martindurant martindurant deleted the no_nesting branch March 25, 2021 19:18
@hayesgb
Copy link
Contributor

hayesgb commented Mar 28, 2021

I’m going to execute a release of adlfs. Based on this discussion, I’ll pin fsspec<=0.8.7, with the intention to remove the pin once this is resolved.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants