Re: [RFC PATCH 03/11] rust: block: introduce `kernel::block::mq` module
From: Andreas Hindborg (Samsung)
Date: Mon Jan 29 2024 - 09:28:00 EST
Benno Lossin <benno.lossin@xxxxxxxxx> writes:
> On 23.01.24 19:39, Andreas Hindborg (Samsung) wrote:
>>>>>> +/// A generic block device
>>>>>> +///
>>>>>> +/// # Invariants
>>>>>> +///
>>>>>> +/// - `gendisk` must always point to an initialized and valid `struct gendisk`.
>>>>>> +pub struct GenDisk<T: Operations> {
>>>>>> + _tagset: Arc<TagSet<T>>,
>>>>>> + gendisk: *mut bindings::gendisk,
>>>>>
>>>>> Why are these two fields not embedded? Shouldn't the user decide where
>>>>> to allocate?
>>>>
>>>> The `TagSet` can be shared between multiple `GenDisk`. Using an `Arc`
>>>> seems resonable?
>>>>
>>>> For the `gendisk` field, the allocation is done by C and the address
>>>> must be stable. We are owning the pointee and must drop it when it goes out
>>>> of scope. I could do this:
>>>>
>>>> #[repr(transparent)]
>>>> struct GenDisk(Opaque<bindings::gendisk>);
>>>>
>>>> struct UniqueGenDiskRef {
>>>> _tagset: Arc<TagSet<T>>,
>>>> gendisk: Pin<&'static mut GenDisk>,
>>>>
>>>> }
>>>>
>>>> but it seems pointless. `struct GenDisk` would not be pub in that case. What do you think?
>>>
>>> Hmm, I am a bit confused as to how you usually use a `struct gendisk`.
>>> You said that a `TagSet` might be shared between multiple `GenDisk`s,
>>> but that is not facilitated by the C side?
>>>
>>> Is it the case that on the C side you create a struct containing a
>>> tagset and a gendisk for every block device you want to represent?
>>
>> Yes, but the `struct tag_set` can be shared between multiple `struct
>> gendisk`.
>>
>> Let me try to elaborate:
>>
>> In C you would first allocate a `struct tag_set` and partially
>> initialize it. The allocation can be dynamic, static or part of existing
>> allocation. You would then partially initialize the structure and finish
>> the initialization by calling `blk_mq_alloc_tag_set()`. This populates
>> the rest of the structure which includes more dynamic allocations.
>>
>> You then allocate a `struct gendisk` by calling `blk_mq_alloc_disk()`,
>> passing in a pointer to the `struct tag_set` you just created. This
>> function will return a pointer to a `struct gendisk` on success.
>>
>> In the Rust abstractions, we allocate the `TagSet`:
>>
>> #[pin_data(PinnedDrop)]
>> #[repr(transparent)]
>> pub struct TagSet<T: Operations> {
>> #[pin]
>> inner: Opaque<bindings::blk_mq_tag_set>,
>> _p: PhantomData<T>,
>> }
>>
>> with `PinInit` [^1]. The initializer will partially initialize the struct and
>> finish the initialization like C does by calling
>> `blk_mq_alloc_tag_set()`. We now need a place to point the initializer.
>> `Arc::pin_init()` is that place for now. It allows us to pass the
>> `TagSet` reference to multiple `GenDisk` if required. Maybe we could be
>> generic over `Deref<TagSet>` in the future. Bottom line is that we need
>> to hold on to that `TagSet` reference until the `GenDisk` is dropped.
>
> I see, thanks for the elaborate explanation! I now think that using `Arc`
> makes sense.
>
>> `struct tag_set` is not reference counted on the C side. C
>> implementations just take care to keep it alive, for instance by storing
>> it next to a pointer to `struct gendisk` that it is servicing.
>
> This is interesting, is this also done in the case where it is shared
> among multiple `struct gendisk`s?
Yes. The architecture is really quite flexible. For instance C NVMe uses
one tag set for the admin queue of a nvme controller, and one tag set
shared for all IO queues of a controller. The admin queue tag set is not
actually attached to a `struct gendisk` and does not appear as a block
device to the user. The IO queue tag set serves a number `struct
gendisk`, one for each name space of the controller.
> Does this have some deeper reason? Or am I right to assume that creating
> `Gendisk`/`TagSet` is done rarely (i.e. only at initialization of the
> driver)?
I am not sure. It could probably be reference counted on C side. Perhaps
nobody felt the need for it, since the lifetime of it is not that
complex. And yes, it is relatively rare as it is only as part of
initialization and tear down that you would create or destroy this
structure.
Best regards,
Andreas