Re: [PATCH 01/20] rust: alloc: add `Allocator` trait

From: Danilo Krummrich
Date: Mon Jul 08 2024 - 19:12:34 EST


On Mon, Jul 08, 2024 at 08:12:34AM +0000, Benno Lossin wrote:
> On 06.07.24 20:47, Danilo Krummrich wrote:
> > On Sat, Jul 06, 2024 at 05:08:26PM +0000, Benno Lossin wrote:
> >> On 06.07.24 17:11, Danilo Krummrich wrote:
> >>> On Sat, Jul 06, 2024 at 01:17:19PM +0000, Benno Lossin wrote:
> >>>> On 06.07.24 13:05, Danilo Krummrich wrote:
> >>>>>>> + layout: Layout,
> >>>>>>> + flags: Flags,
> >>>>>>> + ) -> Result<NonNull<[u8]>, AllocError>;
> >>>>>>> +
> >>>>>>> + /// Free an existing memory allocation.
> >>>>>>> + ///
> >>>>>>> + /// # Safety
> >>>>>>> + ///
> >>>>>>> + /// `ptr` must point to an existing and valid memory allocation created by this `Allocator`
> >>>>>>> + /// instance.
> >>>>>>> + unsafe fn free(&self, ptr: *mut u8) {
> >>>>>>
> >>>>>> `ptr` should be `NonNull<u8>`.
> >>>>>
> >>>>> Creating a `NonNull` from a raw pointer is an extra operation for any user of
> >>>>> `free` and given that all `free` functions in the kernel accept a NULL pointer,
> >>>>> I think there is not much value in making this `NonNull`.
> >>>>
> >>>> I don't think that this argument holds for Rust though. For example,
> >>>> `KBox` contains a `Unique` that contains a `NonNull`, so freeing could
> >>>> just be done with `free(self.0.0)`.
> >>>
> >>> Agreed, we can indeed make it a `&NonNull<u8>`. However, I find this a bit
> >>
> >> I think you mean `NonNull<u8>`, right?
> >
> > Yes, but I still don't see how that improves things, e.g. in `Drop` of
> > `KVec`:
> >
> > `A::free(self.ptr.to_non_null().cast())`
> >
> > vs.
> >
> > `A::free(self.as_mut_ptr().cast())`
> >
> > I'm not against this change, but I don't see how this makes things better.
>
> Ah you still need to convert the `Unique<T>` to a pointer...
> But we could have a trait that allows that conversion. Additionally, we
> could get rid of the cast if we made the function generic.

I'm still not convinced of the `NonNull`, since technically it's not required,
but using a generic could indeed get us rid of all the `cast` calls - same for
`realloc` I guess.

>
> >>> inconsistent with the signature of `realloc`.
> >>>
> >>> Should we go with separate `shrink` / `grow`, `free` could be implemented as
> >>> shrinking to zero and allowing a NULL pointer makes not much sense.
> >>>
> >>> But as mentioned, I'm not yet seeing the benefit of having `realloc` split into
> >>> `grow` and `shrink`.
> >>
> >> I would not split it into grow/shrink. I am not sure what exactly would
> >> be best here, but here is what I am trying to achieve:
> >> - people should strongly prefer alloc/free over realloc,
> >
> > I agree; the functions for that are there: `Allocator::alloc` and
> > `Allocator::free`.
> >
> > `KBox` uses both of them, `KVec` instead, for obvious reasons, uses
> > `Allocator::realloc` directly to grow from zero and `Allocator::free`.
> >
> >> - calling realloc with zero size should not signify freeing the memory,
> >> but rather resizing the allocation to 0. E.g. because a buffer now
> >> decides to hold zero elements (in this case, the size should be a
> >> variable that just happens to be zero).
> >
> > If a buffer is forced to a new size of zero, isn't that effectively a free?
>
> I would argue that they are different, since you get a pointer back that
> points to an allocation of zero size. A vector of size zero still keeps
> around a (dangling) pointer.
> You also can free a zero sized allocation (it's a no-op), but you must
> not free an allocation twice.

I don't think this is contradictive. For instance, if you call krealloc() with
a size of zero, it will free the existing allocation and return ZERO_SIZE_PTR,
which, effectively, can be seen as zero sized allocation. You can also pass
ZERO_SIZE_PTR to free(), but, of course, it doens't do anything, hence, as you
said, it's a no-op.

>
> > At least that's exactly what the kernel does, if we ask krealloc() to resize to
> > zero it will free the memory and return ZERO_SIZE_PTR.
>
> Not every allocator behaves like krealloc, in your patch, both vmalloc
> and kvmalloc are implemented with `if`s to check for the various special
> cases.

Well, for `Vmalloc` there simply is no vrealloc() on the C side...

For `KVmalloc` there is indeed a kvrealloc(), but it behaves different than
krealloc(), which seems inconsistant. Also note that kvrealloc() has a hand full
of users only.

My plan was to stick with the logic that krealloc() uses (because I think that's
what it should be) and, once we land this series, align kvrealloc() and get a
vrealloc() on the C side in place. We could do this in the scope of this series
as well, but I think this series is huge enough and separating this out makes
things much easier.

So, my goal would be that `Vmalloc` and `KVmalloc` look exactly like `Kmalloc`
looks right now in the end.

>
> > So, what exactly would you want `realloc` to do when a size of zero is passed
> > in?
>
> I don't want to change the behavior, I want to prevent people from using
> it unnecessarily.

I'm not overly worried about this. In fact, the C side proves that krealloc()
isn't really prone to be used where kmalloc() or free() can be used instead. And
sometimes it makes sense, `KVec` is a good example for that.

Besides that, who do we expect to use this API? In Rust most use cases should be
covered by `KBox` and `KVec` anyways. I don't expect much direct users of this
API.

>
> >> - calling realloc with a null pointer should not be necessary, since
> >> `alloc` exists.
> >
> > But `alloc` calls `realloc` with a NULL pointer to allocate new memory.
> >
> > Let's take `Kmalloc` as example, surely I could implement `alloc` by calling
> > into kmalloc() instead. But then we'd have to implement `alloc` for all
> > allocators, instead of having a generic `alloc`.
>
> My intuition is telling me that I don't like that you can pass null to
> realloc. I can't put my finger on exactly why that is, maybe because
> there isn't actually any argument here or maybe there is. I'd like to
> hear other people's opinion.
>
> > And I wonder what's the point given that `realloc` with a NULL pointer already
> > does this naturally? Besides that, it comes in handy when we want to allocate
> > memory for data structures that grow from zero, such as `KVec`.
>
> You can just `alloc` with size zero and then call `realloc` with the
> pointer that you got. I don't see how this would be a problem.

In contrast to that, I don't see why calling two functions where just one would
be sufficient makes anything better.

>
> >> This is to improve readability of code, or do you find
> >>
> >> realloc(ptr, 0, Layout::new::<()>(), Flags(0))
> >>
> >> more readable than
> >>
> >> free(ptr)
> >
> > No, but that's not what users of allocators would do. They'd just call `free`,
> > as I do in `KBox` and `KVec`.
>
> I agree that we have to free the memory when supplying a zero size, but
> I don't like adding additional features to `realloc`.

So, if you agree that we have to free the memory when supplying a zero size, why
would it be an additional feature then?

Besides that, as mentioned above, krealloc() already does that and kvrealloc()
should be fixed to be consistent with that.

>
> Conceptually, I see an allocator like this:
>
> pub trait Allocator {
> type Flags;

Agreed.

> type Allocation;

What do we need this for?

I already thought about having some kind of `Allocation` type, also with a
drop() trait freeing the memory etc. But, I came to the conclusion that this is
likely to be overkill. We have `KBox` and `KVec` to take care of managing their
memory.

> type Error;

I'm not worried about adding this, but maybe it makes sense to wait until we
actually implement somthing that needs this.

>
> fn alloc(layout: Layout, flags: Self::Flags) -> Result<Self::Allocation, Self::Error>;
>
> fn realloc(
> alloc: Self::Allocation,
> old: Layout,

We only really care about the size here. `Alignment` would be wasted. Are we
keen taking this downside for a bit more consistency?

> new: Layout,
> flags: Self::Flags,
> ) -> Result<Self::Allocation, (Self::Allocation, Self::Error)>;
>
> fn free(alloc: Self::Allocation);
> }
>
> I.e. to reallocate something, you first have to have something
> allocated.

See the discussion above for this point.

>
> For some reason if we use `Option<NonNull<u8>>` instead of `*mut u8`, I
> have a better feeling, but that might be worse for ergonomics...

As argument for `realloc`?

I get your point here. And for any other API I'd probably agree instantly.

Generally, `Allocator` is a bit special to me. It's not supposed to be used by a
lot of users, other than types like `KBox` or `KVec`, that are supposed to
manage the memory and the interface is pretty unsafe by definition. To me
keeping a rather "raw" access to krealloc() and friends seems to be desirable
here.

>
> ---
> Cheers,
> Benno
>