Add a `pycall!()` macro #4503

ChayimFriedman2 · 2024-08-30T09:51:02Z

This is a draft. The code isn't the most readable in the worlds and it likely contains bugs. It also won't pass CI, as I coded for most recent Python and Rust versions. I am opening this PR to gauge opinions.

I chose the performance over code safety or conciseness. This choice can be reverted, fully or partially. While this PR is very large, I believe we can get into reasonable size if we are ready to give up some of the performance.

The whole new pycall module is a safety boundary. Things aren't marked unsafe even when they should be. Part of this could be fixed, but part is hard because of macro constraints. Users should not be able to use the stuff defined in this module anyway.

This macro allows you to call Python objects with the most convenient syntax and maximum performance.

The implementation is... complicated. The performance requirements mean that we do a lot of type juggling, including clever tricks.

Fixes #4414.

This macro allows you to call Python objects with the most convenient syntax and maximum performance. The implementation is... complicated. The performance requirements mean that we do a lot of type juggling, including clever tricks. This is a draft; it likely contains bugs and it isn't the most readable.

davidhewitt

Thanks for the PR! As the implementation is huge and is stated to be draft, I have only skimmed it, so let's focus on the high level points first and can iterate:

It looks like there are some stylistic choices like "named_argument"=value rather than named_argument=value, and also (*)args and (**)kwargs. Can you elaborate on the choices here: are they draft syntaxes, are they due to limitations in macro parsing, or are they to prevent ambiguities? I think what I'd like to see is an exact mirror of Python call syntax, but we also have the problem of * being the deref operator in Rust, which creates at least one complication.
As per the thread in RFC: py_call! macro #4414 there are maintainers among us who rightly question whether a macro is necessary. Can we use builder functions to achieve the same effect (or indeed just regular functions)? If so, how ergonomic can we make those? My intuition is that because kwargs need to be appended to the args slice, and kwarg names separated out into a tuple, the function forms are non-trivial / high overhead. But it would be good to rule that out rather than proceed based on my hunch.
The number of combinations of helpers are enormous. Do you think there are natural points where this PR could be broken up so we could review in chunks and possibly extend the design incrementally? We could potentially merge a limited implementation behind an experimental-pycall-macro feature, for example, and extend functionality as we go through the PRs.

ChayimFriedman2 · 2024-08-31T18:47:22Z

It looks like there are some stylistic choices like "named_argument"=value rather than named_argument=value, and also (*)args and (**)kwargs. Can you elaborate on the choices here: are they draft syntaxes, are they due to limitations in macro parsing, or are they to prevent ambiguities? I think what I'd like to see is an exact mirror of Python call syntax, but we also have the problem of * being the deref operator in Rust, which creates at least one complication.

The syntax is named_argument=value, just like in Python. There is another alternative ("named_argument")=value (which is less efficient), for when you want to provide a single argument whose name is only known at runtime (that is, the expression within the parentheses is evaluated at runtime instead of a single identifier string). This is essentially a shortcut for (**)(("named_argument", value),) (i.e. unpacking a single-item tuple of key and value) that will also have the same performance. Of course, we can drop this option if we like.

The syntax for unpacking include parentheses because of the deref ambiguity you mentioned. There are alternatives; for example, we can say if you want to deref you have to parenthesize the value (i.e. (*value)) but I see that as confusing. But in general, the syntax is kind of a placeholder; I focused more on the implementation (since this was more fun).

As per the thread in RFC: py_call! macro #4414 there are maintainers among us who rightly question whether a macro is necessary. Can we use builder functions to achieve the same effect (or indeed just regular functions)? If so, how ergonomic can we make those? My intuition is that because kwargs need to be appended to the args slice, and kwarg names separated out into a tuple, the function forms are non-trivial / high overhead. But it would be good to rule that out rather than proceed based on my hunch.

A builder is possible, but:

It cannot provide the same level of convenience, and
It cannot provide the same level of efficiency. The macro relies on being a macro for two things: validating that keyword argument names are disjoint (when all are known at compile time), and autoderef specialization to choose the most performant implementation. The latter can be replaced by multiple functions (but this will be both less convenient and risk people will accidentally choose the less efficient implementation), but the former cannot. Which means passing kwargs with all-knwon-at-compile-time names will have to be penalized.

The number of combinations of helpers are enormous. Do you think there are natural points where this PR could be broken up so we could review in chunks and possibly extend the design incrementally? We could potentially merge a limited implementation behind an experimental-pycall-macro feature, for example, and extend functionality as we go through the PRs.

Yes. We can provide only the most general combinations (which are unpacking with any iterator or any python iterable/mapping), and provide more efficient specializations as we go. From convenience POV, this will be the same as the last iteration, while from performance POV we can add things progressively. There is a question of how much to "predict the future" (i.e. whether to add things to a PR when they will only be needed for future combinators), but I guess we can deal with that in time.

As a last thing, I was noted on Stack Overflow that some of my work is useless ;P While I hope it will become useful with future enhancements to Python, it currently cannot provide any speedup and maybe even slow things down a bit. So we'll have to decide what to do with that too.

birkenfeld · 2024-09-01T09:05:42Z

Wow, this is a huge amount of work... and a huge PR. 👍

I for am +1 for a pycall! macro with nice syntax and potential efficiency, but I agree that it should be scaled down initially, and then expanded over time depending on the review bandwidth and user feedback.

As for syntax, I think this hits the sweet spot of being close enough to Python syntax, while resolving the unfortunate clash with * in a fine way.

It it was up to me, I would probably make one change - add a sigil like = to the "computed names" in brackets. I.e. name=value or (=name)=value and the same for obj.method() and obj.(=method)() - to make it obvious that something is being computed here. (name) and name just are a bit too similar, and it might be hard to remember at a glance that the parens are significant.

encukou · 2024-09-09T15:42:58Z

FWIW, the linked future enhancements to Python discussion is more about consistent naming for the existing CPython call API, than about performance or behaviour.
If there's some calling style that CPython could add to make calls from Rust faster, let's try to add it now :)

ChayimFriedman2 · 2024-09-09T16:01:26Z

@encukou From what I understand it's also about allowing you to call with tuple and dict without a copy in case you are already holding one, and not providing vectorcall where it's slower (python/cpython#123372). Which are the things I care about.

And maybe also having a faster way to retrieve a non-bound method, which means I can check for vectorcall-ability of a method.

mejrs · 2024-09-09T18:37:50Z

Can the "performance" talk please be backed up by benchmarks? It's hard to judge if there actually is any performance benefit or, for that matter, whether the extra performance is worth the added complexity¹...

From where I am standing the current complexity¹ of this PR is not worth any extra performance. Even if you managed to pare it down to ~2k LOC it would not just have to be faster, it would have to be significantly faster to justify the added complexity¹.

But back to the PR; I think this PR is not the best way forward:

it's huge, and looks like it took a lot of effort. We definitely appreciate your enthusiasm (👍 ) but big PRs out of nowhere are bad for several reasons, mostly because they're hard to review and require a big time/effort investment on your part.
it looks like there's some kind of arrayvec impl in here; please just use that instead of reinventing the wheel.

Instead, I'd prefer the following approach:

create a set of benchmarks of realistic scenarios where you can get a significant benefit by using pyo3_ffi api rather than PyAnyMethods. If there is a big difference then we'd like to see improvements, and it would motivate more work. If the difference is small or non existent, you've just saved yourself a lot of work.
change or add to PyAnyMethods to make it better.
if the above api gets big and/or confusing, add a macro to forward calls to it

I touched on this slightly in https://github.com/PyO3/pyo3/issues/4414#issuecomment-2267158235 . With "complexity", I don't just mean "this PR is a pain to review". Complexity also involves: Can someone (who is not you) confidently make changes or improvements to it? How easy is it to understand as a user of the library? How much documentation is required? We happen to have a lot of documentation, so adding even more should not be done lightly. ↩ ↩² ↩³

ChayimFriedman2 force-pushed the pycall-macro branch from 12a5926 to ebbb88c Compare August 30, 2024 10:01

davidhewitt reviewed Aug 31, 2024

View reviewed changes

ChayimFriedman2 closed this Aug 31, 2024

ChayimFriedman2 reopened this Aug 31, 2024

davidhewitt mentioned this pull request Oct 19, 2024

Vectorcall support #684

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a `pycall!()` macro #4503

Add a `pycall!()` macro #4503

ChayimFriedman2 commented Aug 30, 2024 •

edited

Loading

davidhewitt left a comment •

edited

Loading

ChayimFriedman2 commented Aug 31, 2024 •

edited

Loading

birkenfeld commented Sep 1, 2024

encukou commented Sep 9, 2024

ChayimFriedman2 commented Sep 9, 2024 •

edited

Loading

mejrs commented Sep 9, 2024 •

edited

Loading

Add a pycall!() macro #4503

Are you sure you want to change the base?

Add a pycall!() macro #4503

Conversation

ChayimFriedman2 commented Aug 30, 2024 • edited Loading

davidhewitt left a comment • edited Loading

Choose a reason for hiding this comment

ChayimFriedman2 commented Aug 31, 2024 • edited Loading

birkenfeld commented Sep 1, 2024

encukou commented Sep 9, 2024

ChayimFriedman2 commented Sep 9, 2024 • edited Loading

mejrs commented Sep 9, 2024 • edited Loading

Footnotes

Add a `pycall!()` macro #4503

Add a `pycall!()` macro #4503

ChayimFriedman2 commented Aug 30, 2024 •

edited

Loading

davidhewitt left a comment •

edited

Loading

ChayimFriedman2 commented Aug 31, 2024 •

edited

Loading

ChayimFriedman2 commented Sep 9, 2024 •

edited

Loading

mejrs commented Sep 9, 2024 •

edited

Loading