-
Notifications
You must be signed in to change notification settings - Fork 112
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Please consider clarifying use of unsafe in README example #433
Comments
I did some experimenting. The function #[pyfn(m)]
fn mult_safe<'py>(a: f64, mut x: PyReadwriteArrayDyn<'py, f64>) {
let x = x.as_array_mut();
mult(a, x);
} But this seems to come at a slight cost in terms of performance (release mode): % python3 -m timeit -s 'import rust_python as rp; import numpy as np; a = np.identity(3)' 'rp.mult(3, a)'
1000000 loops, best of 5: 289 nsec per loop
% python3 -m timeit -s 'import rust_python as rp; import numpy as np; a = np.identity(3)' 'rp.mult_safe(3, a)'
500000 loops, best of 5: 406 nsec per loop Is this the reason why the variant using I wonder about the reason for the performance difference. After all the checking that prevents mutable aliasing with #[pyfn(m)]
fn mult_bad<'py>(a: f64, mut x: PyReadwriteArrayDyn<'py, f64>) {
let y = x.as_array_mut();
let x = x.as_array_mut();
mult(a, x);
mult(a, y);
} While the following does: #[pyfn(m)]
fn mult_dangerous<'py>(a: f64, x: &Bound<'py, PyArrayDyn<f64>>) {
let y = unsafe { x.as_array_mut() };
let x = unsafe { x.as_array_mut() };
mult(a, x);
mult(a, y);
} |
I think I finally understand now, but this took me many hours of studying PyO3/pyo3#2885 and the like. Would a patch that introduces some hints into the documentation (notably the README file) be welcome? Here is my current understanding (I am grateful for any corrections): So, if the above But, and this is a huge but, this borrow checking API (to my knowledge) is not used by other extensions, most notably it's not used by NumPy itself. So it is possible to simultaneously mutate a NumPy array using just rust-numpy and pure Python/Numpy. (Just like it's possible to simultaneously mutate a NumPy array using exclusively pure Python without rust-numpy.) Rust-numpy takes the view that this is the most that it can do:
I think that's OK if we can be sure that this will never trigger undefined behavior. Are we? |
In summary, unless my understanding is flawed, the following changes seem appropriate: (1) let x = unsafe { x.as_array_mut() }; by let mut x = x.readwrite();
let x = x.as_array_mut(); (or by a function that takes (2) |
Note that the It is however possible to produce multiple
The current situation w.r.t. the soundness of rust-numpy's safe API is certainly precarious and to be honest, I am somewhat surprised about the lack of backlash and disagreement when I made the choice to introduce dynamic borrow checking and consider Python code trusted. Meaning that I would definitely welcome improvements and clarifications. I am also somewhat grateful that you did all the work of following up on your questions and reading the existing documentation and code in detail. I think it would be great if this would result in improving the situation for anybody having the same questions as you had.
Well, I am not aware of any way to break aliasing discipline into NumPy arrays using only safe Rust code. But there is no formal proof for that (or even a formal description of the problem). Currently, this fully depends on the construction of the And one can certainly trigger data races using e.g. Python code and safe Rust code, but as you said yourself, this is no worse than pure Python code which can also trigger data races and thereby undefined behaviour. This is why
I explicitly decided against making all of the example use the safe API to show case that the escape hatch exists, but for performance reasons and because the dynamic borrow checking is not exact (just like the Rust compiler's borrow checking which is also an approximation). But this would certainly justify a safety comment. I would be grateful if you could add one referencing aliasing discipline for the array interior and note on the escape hatch character of the API w.r.t. runtime overhead and interleaved array views.
Actually, in common situations, the mutual exclusion is a property of the combined Python and Rust code and human review is sufficient to ensure aliasing discipline. For example, functions taking a number of read-only array views and producing a new owned array as output that are called from a single Python thread are very common in scientific code. This pattern has long been used to write extensions before dynamic borrow checking was added and it working most of the time was the reason why dynamic borrow checking is really only checking and for example not locking. I also think the safety doc comments on these methods are correct, but of course, they are extremely terse and I would certainly be glad if they were extended. Especially towards discussing that aliasing discipline has to be respected across the Rust-Python language boundary as the current wording could easily be read as being limited to the Rust side of things. Also note that breaking aliasing discipline does not require threads, i.e. having a function fn add_in_place(x: PyReadwriteArrayDyn<f64>, y: PyReadonlyArray<f64>) {
x.as_array_mut() += y.as_array();
} must also fail when called as |
Thanks for providing this amazing library!
The code example given in this project's README file demonstrates the use of PyO3. I guess that it should also serve as a pedagogical model of how PyO3 should be used. After all, this is often going to be the first bit of PyO3-using code that prospective users are going to see.
Now that example features a short
unsafe
block without any explanation. But the usage ofunsafe
in Rust code means that the safety of the featured block has been carefully verified and the compiler should trust that.In practice the usage of
unsafe
is quite often accompanied by a comment that explains why it is safe. This should certainly be the case in pedagogical code. I think that such a comment would be very helpful here.Perhaps this particular use of unsafe is obvious to seasoned PyO3 users, but it certainly isn't to newbies for whom this code will often be the first contact with PyO3. Here are some question answers that may come up:
unsafe
safe after all?PyReadonlyArrayDyn<'py, f64>
arguments. It seems that the second function could similarly take aPyReadwriteArrayDyn<'py, f64>
argument (which provides a safeas_array_mut
method), but it takes&Bound<'py, PyArrayDyn<f64>>
instead. Is this indeed a choice and what is the reason behind it?I'm not suggesting that a long discussion of this issue should be added to the README. Probably a short comment and a few pointers into the documentation would be enough.
The text was updated successfully, but these errors were encountered: