Re: [PATCH v6 04/26] rust: alloc: implement `Allocator` for `Kmalloc`

From: Danilo Krummrich
Date: Tue Sep 10 2024 - 09:37:56 EST


On Tue, Sep 10, 2024 at 01:11:35PM +0000, Benno Lossin wrote:
> On 03.09.24 13:48, Danilo Krummrich wrote:
> > On Fri, Aug 30, 2024 at 02:45:35PM +0000, Benno Lossin wrote:
> >> On 30.08.24 00:04, Danilo Krummrich wrote:
> >>> On Thu, Aug 29, 2024 at 06:32:42PM +0000, Benno Lossin wrote:
> >>>> On 16.08.24 02:10, Danilo Krummrich wrote:
> >>>>> +///
> >>>>> +/// For more details see [self].
> >>>>> +pub struct Kmalloc;
> >>>>>
> >>>>> /// Returns a proper size to alloc a new object aligned to `new_layout`'s alignment.
> >>>>> fn aligned_size(new_layout: Layout) -> usize {
> >>>>> @@ -36,6 +52,60 @@ pub(crate) unsafe fn krealloc_aligned(ptr: *mut u8, new_layout: Layout, flags: F
> >>>>> unsafe { bindings::krealloc(ptr as *const core::ffi::c_void, size, flags.0) as *mut u8 }
> >>>>> }
> >>>>>
> >>>>> +/// # Invariants
> >>>>> +///
> >>>>> +/// One of the following `krealloc`, `vrealloc`, `kvrealloc`.
> >>>>> +struct ReallocFunc(
> >>>>> + unsafe extern "C" fn(*const core::ffi::c_void, usize, u32) -> *mut core::ffi::c_void,
> >>>>> +);
> >>>>> +
> >>>>> +impl ReallocFunc {
> >>>>> + // INVARIANT: `krealloc` satisfies the type invariants.
> >>>>> + const KREALLOC: Self = Self(bindings::krealloc);
> >>>>> +
> >>>>> + /// # Safety
> >>>>> + ///
> >>>>> + /// This method has the same safety requirements as [`Allocator::realloc`].
> >>>>> + unsafe fn call(
> >>>>> + &self,
> >>>>> + ptr: Option<NonNull<u8>>,
> >>>>> + layout: Layout,
> >>>>> + flags: Flags,
> >>>>> + ) -> Result<NonNull<[u8]>, AllocError> {
> >>>>> + let size = aligned_size(layout);
> >>>>> + let ptr = match ptr {
> >>>>> + Some(ptr) => ptr.as_ptr(),
> >>>>> + None => ptr::null(),
> >>>>> + };
> >>>>> +
> >>>>> + // SAFETY: `ptr` is either NULL or valid by the safety requirements of this function.
> >>>>
> >>>> You need some justification as to why calling the three allowed
> >>>> functions is OK here.
> >>>
> >>> What kind of justification do I need? Can you please share some more details on
> >>> what you think is missing here?
> >>
> >> So, you are calling a function pointer to an `unsafe` function. This
> >> means that through some invariant you have to know what the safety
> >> requirements are (otherwise how can you guarantee that this is OK?). You
> >> have the invariant that the pointer points at one of the three functions
> >> mentioned above. What are the safety requirements of those functions? I
> >> would assume that the only one is that `ptr` is valid. So you can use:
> >>
> >> // SAFETY:
> >> // - `self.0` is one of `krealloc`, `vrealloc`, `kvrealloc` and thus only requires that `ptr` is
> >> // NULL or valid.
> >
> > I'm fine adding it, but I'd like to understand why you think it's required in
> > the safety comment here? Isn't this implicit by being the type invariant?
>
> You are calling a function pointer to an `unsafe` function that takes a
> raw pointer. Without this comment it is not clear what the function
> pointer's safety requirements are for the raw pointer parameter.

That's my point: isn't this already implied by the type invariant? If it is
needed, shouldn't it be:

// INVARIANT:
// - `self.0` is one of [...]
//
// SAFETY:
// - `ptr` is either NULL or [...]
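
To make the comparison concrete, here is a rough userspace sketch of that
layout (not the patch itself). `fake_krealloc`, `HDR` and `FAKE_KREALLOC` are
hypothetical stand-ins made up for illustration, `call` takes a plain size
instead of a `Layout`, and the std allocator replaces the kernel one:

```rust
use std::alloc::{alloc, dealloc, realloc, Layout};
use std::ffi::c_void;
use std::ptr::{self, NonNull};

/// Bytes reserved in front of each block to remember its size (sketch-only detail).
const HDR: usize = 16;

/// Hypothetical userspace stand-in for `bindings::krealloc`; `_flags` is ignored.
///
/// # Safety
///
/// `ptr` must be NULL or a live allocation previously returned by this function.
unsafe extern "C" fn fake_krealloc(ptr: *const c_void, size: usize, _flags: u32) -> *mut c_void {
    let new_layout = match Layout::from_size_align(HDR.saturating_add(size), HDR) {
        Ok(layout) => layout,
        Err(_) => return ptr::null_mut(),
    };
    let base = if ptr.is_null() {
        if size == 0 {
            return ptr::null_mut();
        }
        // SAFETY: `new_layout` has a non-zero size.
        unsafe { alloc(new_layout) }
    } else {
        // SAFETY: A non-NULL `ptr` came from this function, so a size header sits `HDR`
        // bytes in front of it and describes the layout the block was allocated with.
        unsafe {
            let old_base = (ptr as *mut u8).sub(HDR);
            let old_size = (old_base as *const usize).read();
            let old_layout = Layout::from_size_align_unchecked(HDR + old_size, HDR);
            if size == 0 {
                // Like `krealloc`: `size == 0` with `ptr != NULL` frees the block.
                dealloc(old_base, old_layout);
                return ptr::null_mut();
            }
            realloc(old_base, old_layout, new_layout.size())
        }
    };
    if base.is_null() {
        return ptr::null_mut();
    }
    // SAFETY: `base` points to at least `HDR` writable bytes aligned to `HDR`.
    unsafe {
        (base as *mut usize).write(size);
        base.add(HDR) as *mut c_void
    }
}

/// # Invariants
///
/// The wrapped function pointer is `fake_krealloc`.
struct ReallocFunc(unsafe extern "C" fn(*const c_void, usize, u32) -> *mut c_void);

impl ReallocFunc {
    // INVARIANT: `fake_krealloc` satisfies the type invariants.
    const FAKE_KREALLOC: Self = Self(fake_krealloc);

    /// # Safety
    ///
    /// `ptr` must be `None` or a live allocation obtained from this method.
    unsafe fn call(&self, ptr: Option<NonNull<u8>>, size: usize, flags: u32) -> Option<NonNull<u8>> {
        let raw = ptr.map_or(ptr::null_mut(), |p| p.as_ptr());

        // INVARIANT:
        // - `self.0` is `fake_krealloc`, which only requires `ptr` to be NULL or valid.
        //
        // SAFETY:
        // - `raw` is either NULL or valid by the safety requirements of this function.
        let new = unsafe { (self.0)(raw as *const c_void, size, flags) };

        if size == 0 {
            // Nothing is allocated for a zero size; hand out a dangling pointer instead.
            Some(NonNull::dangling())
        } else {
            NonNull::new(new as *mut u8)
        }
    }
}
```

Your variant would simply fold the first bullet into the SAFETY list; the
call itself stays the same either way.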

>
> >> // - `ptr` is either NULL or valid by the safety requirements of this function.
> >
> > This is the part I already have.
>
> I kept it to ensure that you also keep it.
>
> >>>>> + let raw_ptr = unsafe {
> >>>>> + // If `size == 0` and `ptr != NULL` the memory behind the pointer is freed.
> >>>>> + self.0(ptr.cast(), size, flags.0).cast()
> >>>>> + };
> >>>>> +
> >>>>> + let ptr = if size == 0 {
> >>>>> + NonNull::dangling()
> >>>>> + } else {
> >>>>> + NonNull::new(raw_ptr).ok_or(AllocError)?
> >>>>> + };
> >>>>> +
> >>>>> + Ok(NonNull::slice_from_raw_parts(ptr, size))
> >>>>> + }
> >>>>> +}
> >>>>> +
> >>>>> +unsafe impl Allocator for Kmalloc {
> >>>>
> >>>> Missing SAFETY comment.
> >>>
> >>> Yeah, I think we came across this in an earlier version of the series. I asked
> >>> you about the content and usefulness of a comment here, since I'd just end up
> >>> re-iterating what the `Allocator` trait documentation says.
> >>>
> >>> IIRC, you replied that you want to think of something that'd make sense to add
> >>> here.
> >>
> >> Oh yeah, sorry I forgot about that.
> >>
> >>> What do you think should be written here?
> >>
> >> I think the best way to do it, would be to push this question down into
> >> `ReallocFunc::call`. So we would put this on the trait:
> >>
> >> // SAFETY: `realloc` delegates to `ReallocFunc::call`, which guarantees that
> >> // - memory remains valid until it is explicitly freed,
> >> // - passing a pointer to a valid memory allocation is OK,
> >> // - `realloc` satisfies the guarantees, since `ReallocFunc::call` has the same.
> >
> > So, we'd also need the same for:
> > - `unsafe impl Allocator for Vmalloc`
> > - `unsafe impl Allocator for KVmalloc`
>
> Yes.
>
> >> We then need to put this on `ReallocFunc::call`:
> >>
> >> /// # Guarantees
> >> ///
> >> /// This method has the same guarantees as `Allocator::realloc`. Additionally
> >> /// - it accepts any pointer to a valid memory allocation allocated by this function.
> >
> > You propose this, since for `Allocator::realloc` memory allocated with
> > `Allocator::alloc` would be fine too I guess.
> >
> > But if e.g. `Kmalloc` wouldn't use the default `Allocator::alloc`, this would be
> > valid too.
>
> So if `Kmalloc` were to implement `alloc` by not calling
> `ReallocFunc::call`, then we couldn't use this comment. Do you think that
> such a change might be required at some point?

I don't think so, this was purely hypothetical. Let's stick to your proposal.

>
> > We could instead write something like:
> >
> > "it accepts any pointer to a valid memory allocation allocated with the same
> > kernel allocator."
>
> It would be better if we can keep it simpler (i.e. only `realloc` is
> implemented).
>
> >> /// - memory allocated by this function remains valid until it is passed to this function.
> >
> > Same here, `Kmalloc` could implement its own `Allocator::free`.
> >
> > Maybe just "...until it is explicitly freed.".
>
> I don't really like that, since by that any other function could be
> meant. Do you need to override the `free` function? If not then it would
> be better.
>
> > Anyway, I'm fine with both, since none of the kernel allocators uses anything
> > other than `ReallocFunc::call` to allocate and free memory.
> >
> >>
> >> Finally, we need a `GUARANTEE` comment (just above the return [^1]
> >> value) that establishes these guarantees:
> >>
> >> // GUARANTEE: Since we called `self.0` with `size` above and by the type invariants of `Self`,
> >> // `self.0` is one of `krealloc`, `vrealloc`, `kvrealloc`. Those functions provide the guarantees of
> >> // this function.
> >>
> >> I am not really happy with the last sentence, but I also don't think
> >> that there is value in listing out all the guarantees, only to then say
> >> "all of this is guaranteed by us calling one of these three functions".
> >>
> >>
> >> [^1]: I am not sure that this is the right place. If you have any
> >> suggestions, feel free to share them.
> >
> > Either way, I'm fine with this proposal.
> >
> >>
> >>
> >>>>> + #[inline]
> >>>>> + unsafe fn realloc(
> >>>>> + ptr: Option<NonNull<u8>>,
> >>>>> + layout: Layout,
> >>>>> + flags: Flags,
> >>>>> + ) -> Result<NonNull<[u8]>, AllocError> {
> >>>>> + // SAFETY: `ReallocFunc::call` has the same safety requirements as `Allocator::realloc`.
> >>>>> + unsafe { ReallocFunc::KREALLOC.call(ptr, layout, flags) }
> >>>>> + }
> >>>>> +}
> >>
> >> Oh one more thing, I know that you already have a lot of patches in this
> >> series, but could you split this one into two? So the first one should
> >> introduce `ReallocFunc` and the second one add the impl for `Kmalloc`?
> >> I managed to confuse myself twice because of that :)
> >
> > Generally, I'm fine with that, but I'm not sure if I can avoid an intermediate
> > compiler warning about unused code doing that.
>
> You can just use `#[expect(dead_code)]` for that in the intermediate
> patches.

I usually try to avoid that, because it can be misleading when bisecting things.

If the temporarily unused code contains a bug, your bisection doesn't end up at
this patch, but at some other patch that starts using it.

>
> ---
> Cheers,
> Benno
>