-
Notifications
You must be signed in to change notification settings - Fork 766
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add a pycall!()
macro
#4503
base: main
Are you sure you want to change the base?
Add a pycall!()
macro
#4503
Conversation
This macro allows you to call Python objects with the most convenient syntax and maximum performance. The implementation is... complicated. The performance requirements mean that we do a lot of type juggling, including clever tricks. This is a draft; it likely contains bugs and it isn't the most readable.
12a5926
to
ebbb88c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR! As the implementation is huge and is stated to be draft, I have only skimmed it, so let's focus on the high level points first and can iterate:
- It looks like there are some stylistic choices like
"named_argument"=value
rather thannamed_argument=value
, and also(*)args
and(**)kwargs
. Can you elaborate on the choices here: are they draft syntaxes, are they due to limitations in macro parsing, or are they to prevent ambiguities? I think what I'd like to see is an exact mirror of Python call syntax, but we also have the problem of*
being the deref operator in Rust, which creates at least one complication. - As per the thread in RFC:
py_call!
macro #4414 there are maintainers among us who rightly question whether a macro is necessary. Can we use builder functions to achieve the same effect (or indeed just regular functions)? If so, how ergonomic can we make those? My intuition is that because kwargs need to be appended to the args slice, and kwarg names separated out into a tuple, the function forms are non-trivial / high overhead. But it would be good to rule that out rather than proceed based on my hunch. - The number of combinations of helpers are enormous. Do you think there are natural points where this PR could be broken up so we could review in chunks and possibly extend the design incrementally? We could potentially merge a limited implementation behind an
experimental-pycall-macro
feature, for example, and extend functionality as we go through the PRs.
The syntax is The syntax for unpacking include parentheses because of the deref ambiguity you mentioned. There are alternatives; for example, we can say if you want to deref you have to parenthesize the value (i.e.
A builder is possible, but:
Yes. We can provide only the most general combinations (which are unpacking with any iterator or any python iterable/mapping), and provide more efficient specializations as we go. From convenience POV, this will be the same as the last iteration, while from performance POV we can add things progressively. There is a question of how much to "predict the future" (i.e. whether to add things to a PR when they will only be needed for future combinators), but I guess we can deal with that in time. As a last thing, I was noted on Stack Overflow that some of my work is useless ;P While I hope it will become useful with future enhancements to Python, it currently cannot provide any speedup and maybe even slow things down a bit. So we'll have to decide what to do with that too. |
Wow, this is a huge amount of work... and a huge PR. 👍 I for am +1 for a As for syntax, I think this hits the sweet spot of being close enough to Python syntax, while resolving the unfortunate clash with It it was up to me, I would probably make one change - add a sigil like |
FWIW, the linked future enhancements to Python discussion is more about consistent naming for the existing CPython call API, than about performance or behaviour. |
@encukou From what I understand it's also about allowing you to call with tuple and dict without a copy in case you are already holding one, and not providing vectorcall where it's slower (python/cpython#123372). Which are the things I care about. And maybe also having a faster way to retrieve a non-bound method, which means I can check for vectorcall-ability of a method. |
Can the "performance" talk please be backed up by benchmarks? It's hard to judge if there actually is any performance benefit or, for that matter, whether the extra performance is worth the added complexity1... From where I am standing the current complexity1 of this PR is not worth any extra performance. Even if you managed to pare it down to ~2k LOC it would not just have to be faster, it would have to be significantly faster to justify the added complexity1. But back to the PR; I think this PR is not the best way forward:
Instead, I'd prefer the following approach:
Footnotes
|
This is a draft. The code isn't the most readable in the worlds and it likely contains bugs. It also won't pass CI, as I coded for most recent Python and Rust versions. I am opening this PR to gauge opinions.
I chose the performance over code safety or conciseness. This choice can be reverted, fully or partially. While this PR is very large, I believe we can get into reasonable size if we are ready to give up some of the performance.
The whole new
pycall
module is a safety boundary. Things aren't marked unsafe even when they should be. Part of this could be fixed, but part is hard because of macro constraints. Users should not be able to use the stuff defined in this module anyway.This macro allows you to call Python objects with the most convenient syntax and maximum performance.
The implementation is... complicated. The performance requirements mean that we do a lot of type juggling, including clever tricks.
Fixes #4414.