Re: [PATCH v5 6/6] rust: use strict provenance APIs
From: Benno Lossin
Date: Wed Mar 19 2025 - 10:29:43 EST
On Wed Mar 19, 2025 at 1:21 PM CET, Alice Ryhl wrote:
> On Wed, Mar 19, 2025 at 12:23:44AM +0000, Benno Lossin wrote:
>> On Tue Mar 18, 2025 at 1:29 PM CET, Alice Ryhl wrote:
>> > On Mon, Mar 17, 2025 at 10:23:56AM -0400, Tamir Duberstein wrote:
>> >> Throughout the tree, use the strict provenance APIs stabilized in Rust
>> >> 1.84.0[1]. Retain backwards-compatibility by introducing forwarding
>> >> functions at the `kernel` crate root along with polyfills for rustc <
>> >> 1.84.0.
>> >>
>> >> Use `#[allow(clippy::incompatible_msrv)]` to avoid warnings on rustc <
>> >> 1.84.0 as our MSRV is 1.78.0.
>> >>
>> >> In the `kernel` crate, enable the strict provenance lints on rustc >=
>> >> 1.84.0; do this in `lib.rs` rather than `Makefile` to avoid introducing
>> >> compiler flags that are dependent on the rustc version in use.
>> >>
>> >> Link: https://blog.rust-lang.org/2025/01/09/Rust-1.84.0.html#strict-provenance-apis [1]
>> >> Suggested-by: Benno Lossin <benno.lossin@xxxxxxxxx>
>> >> Link: https://lore.kernel.org/all/D8EIXDMRXMJP.36TFCGWZBRS3Y@xxxxxxxxx/
>> >> Signed-off-by: Tamir Duberstein <tamird@xxxxxxxxx>
>> >
>> > I'm not convinced that the pros of this change outweigh the cons. I
>> > think this is going to be too confusing for the C developers who look at
>> > this code.
>>
>> 1) I think we should eliminate all possible `as` conversions. They are
>> non-descriptive (since they can do may *very* different things) and
>> ptr2int conversions are part of that.
>> 2) At some point we will have to move to the provenance API, since
>> that's what Rust chose to do. I don't think that doing it at a later
>> point is doing anyone a favor.
>
> We don't *have* to do anything. Sure, most `as` conversions can be
> removed now that we have fixed the integer type mappings, but I'm still
> not convinced by this case.
>
> Like, sure, use it for that one case in `kernel::str` where it uses
> integers for pointers for some reason. But most other cases, provenance
> isn't useful.
I disagree, it's only going to get more painful in the long run to
change this.
>> 3) I don't understand the argument that this is confusing to C devs.
>> They are just normal functions that are well-documented (and if
>> that's not the case, we can just improve them upstream). And
>> functions are much easier to learn about than `as` casts (those are
>> IMO much more difficult to figure out than then strict provenance
>> functions).
>
> I really don't think that's true, no matter how good the docs are. If
> you see `addr as *mut c_void` as a C dev, you are going to immediately
> understand what that means. If you see with_exposed_provenance(addr),
> you're not going to understand what that means from the name - you have
> to interrupt your reading and look up the function with the weird name.
I see this as a double edged sword, yes `addr as *mut c_void` might seem
more easily digestible on the first encounter, but that might also lead
them to never look up what it exactly does.
And I don't think that we should optimize these functions for C readers.
They aren't used commonly (or supposed to IMO) and there are several
other functions that are similarly confusing if not more already in our
codebase.
> And those docs probably spend a long time talking about stuff that
> doesn't matter for your pointer, since it's probably a userspace pointer
> or similar.
For userspace pointers, see below.
>> Thus I think we should keep this patch (with Boqun's improvement).
>>
>> >> diff --git a/rust/kernel/uaccess.rs b/rust/kernel/uaccess.rs
>> >> index 719b0a48ff55..96393bcf6bd7 100644
>> >> --- a/rust/kernel/uaccess.rs
>> >> +++ b/rust/kernel/uaccess.rs
>> >> @@ -226,7 +226,9 @@ pub fn read_raw(&mut self, out: &mut [MaybeUninit<u8>]) -> Result {
>> >> }
>> >> // SAFETY: `out_ptr` points into a mutable slice of length `len`, so we may write
>> >> // that many bytes to it.
>> >> - let res = unsafe { bindings::copy_from_user(out_ptr, self.ptr as *const c_void, len) };
>> >> + let res = unsafe {
>> >> + bindings::copy_from_user(out_ptr, crate::with_exposed_provenance(self.ptr), len)
>> >> + };
>> >> if res != 0 {
>> >> return Err(EFAULT);
>> >> }
>> >> @@ -264,7 +266,7 @@ pub fn read<T: FromBytes>(&mut self) -> Result<T> {
>> >> let res = unsafe {
>> >> bindings::_copy_from_user(
>> >> out.as_mut_ptr().cast::<c_void>(),
>> >> - self.ptr as *const c_void,
>> >> + crate::with_exposed_provenance(self.ptr),
>> >> len,
>> >> )
>> >> };
>> >
>> > That's especially true for cases like this. These are userspace pointers
>> > that are never dereferenced. It's not useful to care about provenance
>> > here.
>>
>> I agree for this case, but I think we shouldn't be using raw pointers
>> for this to begin with. I'd think that a newtype wrapping `usize` is a
>> much better fit. It can then also back the `IoRaw` type. AFAIU user
>> space pointers don't have provenance, right? (if they do, then we should
>> use this API :)
>
> We're doing that to the fullest extent possible already. We only convert
> them to pointers when calling C FFI functions that take user pointers as
> a raw pointer.
We should make bindgen use that type in those interfaces already.
---
Cheers,
Benno